
Allow AI crawlers (GPTBot, ClaudeBot, PerplexityBot) in robots.txt

If GPTBot or ClaudeBot is blocked in robots.txt, your site is invisible to ChatGPT and Claude.

Worth up to 25 points

Why it matters

AI crawlers that honor robots.txt follow it literally. A blanket `Disallow: /` under `User-agent: *`, with no bot-specific group that allows access, means the crawlers behind ChatGPT, Claude, Perplexity and Google's AI overviews will skip every page on your site. A missing allowlist does not block anything by itself, but it leaves your intent ambiguous. This check is worth up to 25 points (the most of any check) because nothing else matters if crawlers can't get in.
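To see the effect of a blanket disallow concretely, here is a minimal sketch using Python's standard-library `urllib.robotparser`. The helper name `blocked_bots` and the three-bot list are illustrative assumptions, and real crawlers may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Illustrative subset of AI crawler names (taken from the guide's title).
AI_BOTS = ["GPTBot", "ClaudeBot", "PerplexityBot"]


def blocked_bots(robots_txt: str, path: str = "/") -> list[str]:
    """Return the bots in AI_BOTS that may NOT fetch `path` under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in AI_BOTS if not parser.can_fetch(bot, path)]


# A wildcard disallow with no bot-specific groups blocks every bot.
BLANKET_BLOCK = "User-agent: *\nDisallow: /\n"

# A bot-specific group takes precedence over the wildcard group for that bot.
GPTBOT_ONLY = "User-agent: GPTBot\nAllow: /\n\nUser-agent: *\nDisallow: /\n"

print(blocked_bots(BLANKET_BLOCK))
print(blocked_bots(GPTBOT_ONLY))
```

The second call shows why an explicit per-bot group matters: it lets one crawler through even when the wildcard rule blocks everyone else.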

How to fix it

  1. Open or create /robots.txt at the root of your site

    It must live at the root, not in a subdirectory. Most frameworks serve it from a static-assets folder: `public/robots.txt` (Next, Vite, Astro, TanStack Start) or `static/robots.txt` (SvelteKit). If you don't have one, create it now.

  2. Explicitly allow the AI bots that matter

    Add a block per bot you want to allow. Even if your wildcard rule is `Allow: /`, an explicit allow makes intent unambiguous and survives audits.

    User-agent: GPTBot
    Allow: /
    
    User-agent: ChatGPT-User
    Allow: /
    
    User-agent: OAI-SearchBot
    Allow: /
    
    User-agent: ClaudeBot
    Allow: /
    
    User-agent: anthropic-ai
    Allow: /
    
    User-agent: PerplexityBot
    Allow: /
    
    User-agent: Google-Extended
    Allow: /
    
    User-agent: *
    Allow: /
    
    Sitemap: https://yourdomain.com/sitemap.xml

  3. Don't paywall robots.txt

    If your CDN or auth middleware blocks /robots.txt for unauthenticated requests, bots see a 401/403 and give up. The file must return 200 to any anonymous user-agent.

  4. Re-scan and verify

    Re-run a Crawlable scan. The AI crawler access check should turn green and your score should jump.
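Before re-scanning, you can run a rough local check of steps 2 and 3. The sketch below is not how the Crawlable scanner works; it is an assumption-laden approximation using only the Python standard library. The function names `fetch_robots` and `report` are hypothetical, and the rule "anything other than 200 counts as blocked" simply mirrors step 3 of this guide.

```python
import urllib.error
import urllib.request
from urllib.robotparser import RobotFileParser

# Bot names copied from the robots.txt example in step 2.
BOTS = ["GPTBot", "ChatGPT-User", "OAI-SearchBot", "ClaudeBot",
        "anthropic-ai", "PerplexityBot", "Google-Extended"]


def fetch_robots(site: str) -> tuple[int, str]:
    """Fetch <site>/robots.txt anonymously and return (HTTP status, body)."""
    url = site.rstrip("/") + "/robots.txt"
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # A 401/403 here is the "paywalled robots.txt" case from step 3.
        return err.code, ""


def report(status: int, body: str) -> dict[str, bool]:
    """Map each bot to True only if the file came back 200 and allows '/'."""
    if status != 200:
        return {bot: False for bot in BOTS}
    parser = RobotFileParser()
    parser.parse(body.splitlines())
    return {bot: parser.can_fetch(bot, "/") for bot in BOTS}
```

A typical use would be `status, body = fetch_robots("https://yourdomain.com")` followed by `print(report(status, body))`. Note that some CDNs treat Python's default user-agent differently from real crawlers, so a failure here is a hint to investigate rather than proof of a problem.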

FAQ

Should I allow every AI bot?
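If discovery in ChatGPT and Claude matters more to you than keeping your content out of training data, yes: allow them all. If you want search visibility without training use, allow OpenAI's search and user-fetch agents (ChatGPT-User, OAI-SearchBot) and block GPTBot, which is the crawler OpenAI uses to collect training data. Note that Google-Extended is a training control token rather than a search crawler, so under a no-training policy you would block it as well; doing so does not affect your inclusion in Google Search.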
What about CCBot and Bytespider?
CCBot is the crawler for Common Crawl, whose public datasets many AI products use as input. Bytespider is ByteDance's crawler. Allow them if you want maximum reach; block them if you've had bandwidth issues.

See your own score

Run a free Crawlable scan to find every check that needs fixing on your site — not just this one.
