Allow AI crawlers (GPTBot, ClaudeBot, PerplexityBot) in robots.txt
If GPTBot or ClaudeBot is blocked in robots.txt, your site is invisible to ChatGPT and Claude.
Why it matters
The major AI crawlers honor robots.txt rules. A blanket `Disallow: /` under `User-agent: *`, with no explicit allow groups for AI bots to override it, means ChatGPT, Claude, Perplexity and Google's AI Overviews will skip every page on your site. This check is worth 25 points (the most of any check) because nothing else matters if crawlers can't get in.
How to fix it
1. Open or create /robots.txt at the root of your site
It must live at the root, not in a subdirectory. Most frameworks serve it from a static directory: `public/robots.txt` (Next.js, Vite, Astro, TanStack Start) or `static/robots.txt` (SvelteKit). If you don't have one, create it now.
2. Explicitly allow the AI bots that matter
Add a block per bot you want to allow. Even if your wildcard rule is `Allow: /`, an explicit allow makes intent unambiguous and survives audits.
    User-agent: GPTBot
    Allow: /

    User-agent: ChatGPT-User
    Allow: /

    User-agent: OAI-SearchBot
    Allow: /

    User-agent: ClaudeBot
    Allow: /

    User-agent: anthropic-ai
    Allow: /

    User-agent: PerplexityBot
    Allow: /

    User-agent: Google-Extended
    Allow: /

    User-agent: *
    Allow: /

    Sitemap: https://yourdomain.com/sitemap.xml

3. Don't paywall robots.txt
If your CDN or auth middleware blocks /robots.txt for unauthenticated requests, bots get a 401/403 instead of your rules, and some give up on the site entirely. The file must return 200 to any anonymous user-agent.
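One way to check this yourself is to request the file with no cookies or credentials and look at the status code. The following is a minimal sketch using only the Python standard library; the helper names `robots_url` and `robots_status` are hypothetical, and the `GPTBot` user-agent string is just a label sent for illustration, not the crawler's real header.

```python
import urllib.error
import urllib.request
from urllib.parse import urljoin


def robots_url(site: str) -> str:
    """Build the root-level robots.txt URL from any URL on the site."""
    # A leading slash makes urljoin discard whatever path the input had.
    return urljoin(site, "/robots.txt")


def robots_status(site: str, user_agent: str = "GPTBot") -> int:
    """Return the HTTP status of an anonymous GET for robots.txt."""
    # No cookies or auth headers are attached, so this mimics an anonymous bot.
    request = urllib.request.Request(
        robots_url(site), headers={"User-Agent": user_agent}
    )
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urlopen raises on 4xx/5xx; a 401 or 403 ends up here.
        return error.code
```

Calling `robots_status("https://yourdomain.com")` should return 200. A 401 or 403 suggests the CDN or middleware is intercepting the request before the file is served.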
4. Re-scan and verify
Re-run a Crawlable scan. The AI crawler access check should turn green and your score should jump.
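Before re-scanning, you can run a rough local check of the rules. The sketch below uses Python's standard `urllib.robotparser` to list which bots a given robots.txt blocks; `AI_BOTS` and `blocked_bots` are hypothetical names chosen for this example. Python's parser may not match every crawler's own matching logic exactly, so treat the result as a sanity check rather than a guarantee.

```python
from urllib.robotparser import RobotFileParser

# Bot names taken from the allowlist in step 2.
AI_BOTS = [
    "GPTBot",
    "ChatGPT-User",
    "OAI-SearchBot",
    "ClaudeBot",
    "anthropic-ai",
    "PerplexityBot",
    "Google-Extended",
]


def blocked_bots(robots_txt: str, bots=AI_BOTS, path: str = "/") -> list[str]:
    """Return the bots that are not allowed to fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in bots if not parser.can_fetch(bot, path)]


# A blanket disallow with no per-bot groups blocks every bot in the list.
print(blocked_bots("User-agent: *\nDisallow: /"))
```

Paste the contents of your own robots.txt into `blocked_bots(...)`; an empty list means none of the listed bots is blocked from the site root.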
FAQ
- Should I allow every AI bot?
- If discovery in ChatGPT and Claude matters more to you than training-data scraping, yes: allow them all. If you only want search indexing and not training, allow OpenAI's search and browsing agents (ChatGPT-User, OAI-SearchBot) but leave GPTBot, the training crawler, blocked. Google-Extended is also a training control rather than a search crawler, so block it too in that case.
- What about CCBot and Bytespider?
- CCBot powers Common Crawl, which many AI products use as input. Bytespider is ByteDance's crawler. Allow them if you want maximum reach; block them if you've had bandwidth issues.
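The search-only option from the first FAQ answer can be sketched for OpenAI's agents as follows. The rules in `SEARCH_ONLY` are an illustrative assumption, not a recommended file for every site, and the check again relies on Python's standard `urllib.robotparser`, which may differ from each crawler's own parser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: allow search and browsing agents, block the training crawler.
SEARCH_ONLY = """\
User-agent: ChatGPT-User
Allow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(SEARCH_ONLY.splitlines())

# Map each agent to whether it may fetch the site root.
access = {
    bot: parser.can_fetch(bot, "/")
    for bot in ("ChatGPT-User", "OAI-SearchBot", "GPTBot")
}
print(access)
# {'ChatGPT-User': True, 'OAI-SearchBot': True, 'GPTBot': False}
```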
See your own score
Run a free Crawlable scan to find every check that needs fixing on your site — not just this one.