Free SEO & GEO Tool

AI Bot & Robots.txt Checker

Check which AI bots and crawlers are allowed or blocked by robots.txt rules, then verify with live server checks using each bot's real User-Agent.

Last updated: April 2026

How this AI bot and robots.txt checker works

Most crawler checkers only test Googlebot. But in 2026, GPTBot, ClaudeBot, PerplexityBot, and Google-Extended matter just as much for AI search visibility. This AI bot and robots.txt checker tests 37 crawlers — including Googlebot, OpenAI's new OAI-AdsBot (ChatGPT ads validator), and every major AI bot — then fires a live request with each one's actual User-Agent string.

That second check is the one people miss. A bot can be "allowed" in robots.txt but still get a 403 from a Cloudflare rule or a CDN bot-protection layer. This tool shows both layers: what your robots.txt says, and what actually gets through. If GPTBot is blocked at the server level, no amount of content optimization will get you into AI answers.
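
Both layers reduce to two checks you can reproduce in a few lines. Here's a minimal Python sketch, assuming a hypothetical example.com and using the bare GPTBot token where the real crawler sends a longer User-Agent string:

```python
import urllib.error
import urllib.request
from urllib import robotparser

SITE = "https://example.com"   # placeholder; point this at your own site
BOT = "GPTBot"                 # UA token only; real bots send a fuller string

# Layer 1: what robots.txt says about this bot.
rp = robotparser.RobotFileParser(SITE + "/robots.txt")
rp.read()
robots_allows = rp.can_fetch(BOT, SITE + "/")

# Layer 2: what the server actually returns when this UA shows up.
req = urllib.request.Request(SITE + "/", headers={"User-Agent": BOT}, method="HEAD")
try:
    status = urllib.request.urlopen(req, timeout=10).status
except urllib.error.HTTPError as e:
    status = e.code  # e.g. 403 from a WAF or CDN bot rule

print(f"robots.txt allows: {robots_allows} / live response: {status}")
```

If the first value is True and the second is 403, the block lives in a firewall or CDN rule, not in robots.txt.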

Common robots.txt errors this checker finds

Most mistakes are dumb, not subtle. A missing line break between User-agent and Disallow. A trailing-slash typo: Disallow: /folder blocks every path that merely starts with /folder (including /folder-sale), while Disallow: /folder/ blocks only the directory. Case-sensitivity quirks (User-Agent vs user-agent), which the spec tolerates but some parsers don't. A robots.txt file that doesn't exist at all, which leaves every URL crawlable by default. And the classic: a Disallow: / left over from a staging environment that nobody removed before going live.
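
The trailing-slash and staging mistakes are easy to reproduce with Python's standard-library parser. A quick sketch with inline rules (example.com is a placeholder):

```python
from urllib import robotparser

def allows(robots_txt: str, path: str) -> bool:
    """True if Googlebot may fetch `path` under these rules."""
    rp = robotparser.RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return rp.can_fetch("Googlebot", "https://example.com" + path)

# Trailing-slash typo: Disallow matches by prefix, not by directory.
print(allows("User-agent: *\nDisallow: /folder", "/folder-sale"))   # False: blocked!
print(allows("User-agent: *\nDisallow: /folder/", "/folder-sale"))  # True

# Staging leftover: Disallow: / blocks every URL on the site.
print(allows("User-agent: *\nDisallow: /", "/any/page"))            # False
```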

Is my robots.txt blocking search engines?

Run this checker against your live URL. If Googlebot is blocked, you'll see it in the first result row. Most unintended blocks come from a wildcard pattern (Disallow: /*.php) that was meant for one path but matches thousands. Fix the pattern and wait 24-48 hours for Google to re-crawl.
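
Wildcard rules use Google's pattern syntax: * matches any run of characters and $ anchors the end of the path. Python's standard-library parser doesn't implement this, so here's a hedged sketch that translates a pattern to a regex by hand, enough to see why /*.php over-matches:

```python
import re

def google_pattern_to_regex(pattern: str) -> re.Pattern:
    # Escape regex metacharacters, then restore the two robots.txt ones:
    # '*' means any run of characters; a trailing '$' anchors the end.
    regex = re.escape(pattern).replace(r"\*", ".*")
    if regex.endswith(r"\$"):
        regex = regex[:-2] + "$"
    return re.compile(regex)

rule = google_pattern_to_regex("/*.php")
for path in ["/index.php", "/blog/post.php?id=1", "/downloads/report.pdf"]:
    print(path, "blocked" if rule.match(path) else "allowed")
```

Rules match against the path plus query string, so /blog/post.php?id=1 gets caught too: a pattern written for one legacy script ends up matching every PHP URL on the site.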

robots.txt tester vs. robots.txt checker

Google retired the robots.txt Tester inside Search Console in late 2023. The replacement is a short report under Settings → robots.txt that only shows the currently fetched file. It doesn't let you test custom user-agents or preview rule changes. A standalone checker like this one fills that gap: you can test any User-Agent or URL path, and preview changes to your robots.txt before pushing it live.
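
Previewing a change is the same trick: parse the proposed file as a string and test your important URLs before deploying. A minimal sketch, with one caveat: the standard-library parser resolves rules in file order (first match wins) rather than by Google's longest-match precedence, so the Allow line has to come before the broader Disallow:

```python
from urllib import robotparser

# Hypothetical edit, tested before it ships.
PROPOSED = """\
User-agent: *
Allow: /api/docs/
Disallow: /api/
"""

rp = robotparser.RobotFileParser()
rp.parse(PROPOSED.splitlines())

for path in ["/", "/api/docs/quickstart", "/api/v1/users"]:
    verdict = rp.can_fetch("Googlebot", "https://example.com" + path)
    print(f"{path}: {'allowed' if verdict else 'blocked'}")
```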

FAQ

What is a robots.txt file?
A text file at your site's root (example.com/robots.txt) that tells crawlers which pages they can or cannot access. It uses User-Agent, Allow, and Disallow directives to control crawling behavior per bot.
What does "rules" mean vs "BLOCKED"?+
"Rules" means the bot can crawl your site but certain paths are restricted (e.g. /admin/, /api/). This is normal and healthy. "BLOCKED" means Disallow: /, so the bot cannot access any page at all.
Why does the server check show 403 for some bots?
A 403 means the server actively rejects that bot regardless of robots.txt rules. This is typically done via firewall rules, CDN settings, or server-side bot detection. The bot cannot access your site even if robots.txt allows it.
What happens without a robots.txt?
All crawlers assume they have permission to access every page. This tool shows all 37 bots as "allowed" in that case. You can still control access via server-side rules (which the live check reveals).
What is Crawl-Delay?
A robots.txt directive that tells bots to wait a set number of seconds between requests, limiting crawl speed to reduce server load. Google ignores Crawl-Delay entirely (and retired Search Console's crawl-rate limiter in early 2024), so it only affects bots that choose to honor it, as the sketch below shows.
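
For bots that do honor it, the standard-library parser can read the effective delay back. A tiny sketch with an inline robots.txt:

```python
from urllib import robotparser

ROBOTS = """\
User-agent: Bingbot
Crawl-delay: 10
"""

rp = robotparser.RobotFileParser()
rp.parse(ROBOTS.splitlines())
print(rp.crawl_delay("Bingbot"))    # 10 -> wait ten seconds between requests
print(rp.crawl_delay("Googlebot"))  # None -> no rule applies to this bot
```
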
AI crawler check on every page

Lumina shows robots.txt rules, X-Robots-Tag, and AI traffic sources — automatically, for free.

Add Lumina to Chrome — Free