generateSitemapTask

Generate a sitemap.xml for search engine indexing, with clean URLs and sensible default exclusions.


When to Use

✅ Use generateSitemapTask when you want a sitemap.xml built from the .html files your pipeline has already generated.

This task should run last (or near-last) to capture all generated pages.


Quick Start

generateSitemapTask({
  scanDir: 'public',
  outDir: 'public',
  siteUrl: 'https://example.com',
})

Output: public/sitemap.xml


Configuration

Option    Type      Required  Description
scanDir   string              Directory to scan for generated .html files
outDir    string              Directory to write sitemap.xml into
siteUrl   string              Site root URL for absolute <loc> tags
excludes  string[]            Glob patterns to exclude (merged with defaults)
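
To illustrate how siteUrl might combine with page paths to form absolute <loc> entries, here is a minimal sketch. The helper names escapeXml and buildSitemapXml are hypothetical and not part of the task's API, and the task's real output may carry additional fields. Only the urlset/url/loc structure from the sitemaps.org protocol is assumed.

```javascript
// Hypothetical helper: escape characters that are special in XML text.
function escapeXml(value) {
  return value
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Hypothetical helper: build a minimal sitemap document from a site root
// and a list of URL paths such as '/' or '/about'.
function buildSitemapXml(siteUrl, urlPaths) {
  // Strip trailing slashes so root + path never produces '//'.
  const root = siteUrl.replace(/\/+$/, '');
  const entries = urlPaths.map(
    (urlPath) => `  <url><loc>${escapeXml(root + urlPath)}</loc></url>`
  );
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...entries,
    '</urlset>',
    '',
  ].join('\n');
}

console.log(buildSitemapXml('https://example.com', ['/', '/about']));
```

Normalizing the trailing slash on siteUrl in one place means callers can pass either 'https://example.com' or 'https://example.com/' and get the same result.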

URL Cleaning

All URLs are automatically cleaned for best SEO practice:

File             Sitemap URL
index.html       /
about.html       /about
blog/index.html  /blog/
contact.html     /contact

No .html extensions appear in the generated sitemap.
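
The mapping above can be sketched as a small pure function. This is an illustration only: toCleanUrlPath is a hypothetical name, and the task's actual implementation may differ. It assumes the input is a file path relative to scanDir.

```javascript
// Hypothetical helper: map a file path (relative to scanDir) to a clean URL path.
function toCleanUrlPath(relativeFile) {
  // Normalize Windows separators and drop any leading slash.
  const normalized = relativeFile.replace(/\\/g, '/').replace(/^\/+/, '');

  // The root index page maps to the site root.
  if (normalized === 'index.html') return '/';

  // Directory indexes keep the trailing slash: blog/index.html -> /blog/
  if (normalized.endsWith('/index.html')) {
    return '/' + normalized.slice(0, -'index.html'.length);
  }

  // Regular pages drop the extension: about.html -> /about
  return '/' + normalized.replace(/\.html$/, '');
}

for (const file of ['index.html', 'about.html', 'blog/index.html', 'contact.html']) {
  console.log(file, '->', toCleanUrlPath(file));
}
```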


Exclusions

Default Excludes

A built-in set of patterns is always excluded, with no configuration needed.

Custom Excludes

Add your own patterns via the excludes option. They're merged with the defaults:

generateSitemapTask({
  scanDir: 'public',
  outDir: 'public',
  siteUrl: 'https://example.com',
  excludes: ['admin/**', 'drafts/**'],
})
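
As a rough sketch of what "merged with the defaults" could mean in practice, the snippet below concatenates a default list with user patterns and reports the first pattern that matches a file. Everything here is an assumption for illustration: ASSUMED_DEFAULTS is a placeholder (the task's real default list is not reproduced), globToRegExp is a deliberately simplified matcher that only understands * and **, and the task itself may rely on a full glob library.

```javascript
// ASSUMPTION: placeholder list, not the task's actual defaults.
const ASSUMED_DEFAULTS = ['404.html'];

// Simplified glob support: '**' crosses directories, '*' stays in one segment.
function globToRegExp(pattern) {
  // Escape regex metacharacters except '*', which is translated below.
  const escaped = pattern.replace(/[.+^${}()|[\]\\?]/g, '\\$&');
  const source = escaped
    .replace(/\*\*/g, '\u0000') // temporary marker for '**'
    .replace(/\*/g, '[^/]*')
    .replace(/\u0000/g, '.*');
  return new RegExp(`^${source}$`);
}

// Returns the first matching pattern, or null if the file should be kept.
function findExcludeMatch(relativeFile, userExcludes = []) {
  const patterns = [...ASSUMED_DEFAULTS, ...userExcludes];
  return patterns.find((p) => globToRegExp(p).test(relativeFile)) ?? null;
}

const excludes = ['admin/**', 'drafts/**'];
console.log(findExcludeMatch('admin/index.html', excludes));
console.log(findExcludeMatch('about.html', excludes));
```

Because user patterns are appended rather than substituted, passing excludes never removes a default from the list.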

Logging

Excluded files are logged at info level for visibility:

ℹ Sitemap: excluded /404.html (matched pattern: 404.html)
ℹ Sitemap: excluded /admin/index.html (matched pattern: admin/**)
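
For readers who want to parse or reproduce these lines, the message shape can be sketched as follows. formatExclusionLog is a hypothetical helper, and the ℹ prefix is assumed to come from the logger rather than from the message itself.

```javascript
// Hypothetical helper: build the message text shown in the log sample.
function formatExclusionLog(relativeFile, pattern) {
  // Normalize to forward slashes and ensure a single leading slash.
  const filePath = '/' + relativeFile.replace(/\\/g, '/').replace(/^\/+/, '');
  return `Sitemap: excluded ${filePath} (matched pattern: ${pattern})`;
}

console.log(formatExclusionLog('admin/index.html', 'admin/**'));
```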

What's Included

The sitemap automatically includes every .html file in the scan directory that isn't excluded.


Pipeline Position

Place after all page-generating tasks:

export default [
  prepareOutputTask({ /* ... */ }),
  generateItemsTask({ /* ... */ }),
  generatePagesTask({ /* ... */ }),
  generateFeedTask({ /* ... */ }),
  generateSitemapTask({ scanDir: 'public', outDir: 'public', siteUrl: 'https://example.com' }),  // ← After all page-generating tasks
  copyStaticTask({ /* ... */ }),  // Assets don't affect sitemap
];

Related Tasks