Add a sitemap.xml so crawlers can discover every page

A valid sitemap.xml at /sitemap.xml lists every page worth crawling, with last-modified dates.

Worth up to 10 points

Why it matters

Both classic search bots and AI crawlers use sitemap.xml as a starting point for discovery. Without one, deep pages with few internal links may never be found. With one, and with an accurate lastmod date on each entry, crawlers can prioritize recrawling the pages that actually changed.

How to fix it

  1. Generate dynamically, not by hand

    Hand-maintained sitemap.xml files go stale the moment you publish a new page. Generate the sitemap at request time from your route tree and database, so new pages appear without a manual edit.

  2. Include every public, indexable URL

    Include the static routes from your file tree, plus one entry per database row for dynamic routes (blog posts, products, public reports). Skip admin, auth, and 404 pages.

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>https://yourdomain.com/</loc>
        <changefreq>weekly</changefreq>
        <priority>1.0</priority>
      </url>
      <url>
        <loc>https://yourdomain.com/blog/launch</loc>
        <lastmod>2026-05-12T10:00:00Z</lastmod>
      </url>
    </urlset>

  3. Reference it from robots.txt

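    Add `Sitemap: https://yourdomain.com/sitemap.xml` to robots.txt so crawlers discover it on first visit. The URL must be absolute. The directive is independent of any User-agent group, so it can go anywhere in the file; the bottom is the usual convention.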

  4. Submit to Search Console (optional)

    For Google's classic index, submit the sitemap URL at search.google.com/search-console. AI crawlers don't need a submission step; they find it via robots.txt.
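
Steps 1 and 2 can be sketched in a few lines of Python. This is a minimal illustration, not a drop-in implementation: the `Page` record, its `indexable` flag, and the `build_sitemap` function are hypothetical names, and it assumes your route tree and database can be flattened into a list of such records with timezone-aware `lastmod` values.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, Optional

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


@dataclass
class Page:
    """Hypothetical record for one route; adapt to your own data model."""
    path: str                           # e.g. "/blog/launch"
    lastmod: Optional[datetime] = None  # assumed timezone-aware when set
    indexable: bool = True              # False for admin, auth, and 404 pages


def build_sitemap(base_url: str, pages: Iterable[Page]) -> bytes:
    """Serialize the indexable pages as a sitemap.xml document."""
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for page in pages:
        if not page.indexable:
            continue  # keep non-public pages out of the sitemap
        url = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as "&" in the URL for us.
        ET.SubElement(url, "loc").text = base_url.rstrip("/") + page.path
        if page.lastmod is not None:
            stamp = page.lastmod.astimezone(timezone.utc)
            ET.SubElement(url, "lastmod").text = stamp.strftime("%Y-%m-%dT%H:%M:%SZ")
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)
```

A request handler for /sitemap.xml would call `build_sitemap` with freshly queried pages and return the bytes with an `application/xml` content type, so the output always reflects what is currently published.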

FAQ

What if I have more than 50,000 URLs?
Split into multiple sitemaps and list them in a sitemap index file. The sitemap protocol caps each file at 50,000 URLs and 50 MB uncompressed. Most sites never hit this limit.
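
As a rough sketch of that split, the Python below chunks a URL list at the 50,000-entry limit and builds the index that points at each chunk. The `/sitemaps/<n>.xml` path scheme and both function names are assumptions made for illustration; use whatever paths your site actually serves.

```python
import xml.etree.ElementTree as ET
from typing import List, Sequence

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000  # per-file limit from the sitemap protocol


def chunk_urls(urls: Sequence[str], size: int = MAX_URLS_PER_SITEMAP) -> List[List[str]]:
    """Split the full URL list into groups small enough for one sitemap each."""
    return [list(urls[i:i + size]) for i in range(0, len(urls), size)]


def build_sitemap_index(base_url: str, chunk_count: int) -> bytes:
    """Build an index with one <sitemap> entry per chunk.

    The /sitemaps/<n>.xml layout is a hypothetical naming scheme.
    """
    index = ET.Element("sitemapindex", {"xmlns": SITEMAP_NS})
    for n in range(1, chunk_count + 1):
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = f"{base_url.rstrip('/')}/sitemaps/{n}.xml"
    return ET.tostring(index, encoding="UTF-8", xml_declaration=True)
```

The index would then be served at /sitemap.xml, with each numbered file holding one chunk as an ordinary urlset document.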
