AI answer engines (ChatGPT, Perplexity, Google AI Overviews, Claude) are turning into a real referral source. But they can only cite pages they can actually crawl and parse. So I built a tiny, open-source checker and ran it against the homepages of 136 well-known companies across 7 industries to see how AI-ready the web really is.
The results weren't what I expected.
What I measured
Each homepage was scored 0-100 across six things AI crawlers and answer engines rely on:
- AI-crawler access - does robots.txt allow GPTBot, OAI-SearchBot, PerplexityBot, Google-Extended, ClaudeBot?
- Structured data - JSON-LD (Organization, WebSite, FAQPage)
- Title + meta description
- Open Graph tags
- XML sitemap
- llms.txt
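
To make the simpler checks concrete, here is a minimal Python sketch of how the title, meta description, and Open Graph signals could be detected in a page's HTML. It is not the checker's actual code: the `HeadSignals` class, the `head_signals` helper, and the sample markup are all hypothetical names made up for illustration, and it only uses the standard library.

```python
from html.parser import HTMLParser


class HeadSignals(HTMLParser):
    """Collects basic <head> signals: title, meta description, Open Graph tags."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.description = None
        self.og = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self.in_title = True
        elif tag == "meta":
            content = attrs.get("content") or ""
            if (attrs.get("name") or "").lower() == "description":
                self.description = content
            prop = attrs.get("property") or ""
            if prop.startswith("og:"):
                self.og[prop] = content

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def head_signals(html):
    """Return which of the three head signals are present (hypothetical helper)."""
    parser = HeadSignals()
    parser.feed(html)
    return {
        "title": bool(parser.title.strip()),
        "meta_description": bool(parser.description),
        "open_graph": bool(parser.og),
    }


# Hypothetical sample page, not taken from any of the sites I scanned.
sample = """<html><head>
<title>Example Co</title>
<meta name="description" content="We make examples.">
<meta property="og:title" content="Example Co">
</head><body></body></html>"""

print(head_signals(sample))
```

A real check would fetch the live homepage first and would probably look at more than presence (length, duplicates, and so on); this only shows the shape of the test.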
The findings
Average score by industry (higher = easier for AI to read):
| Industry | Avg score |
|---|---|
| Marketing agencies | 92 |
| SaaS | 88 |
| Dev tools | 86 |
| E-commerce | 85 |
| AI startups | 83 |
| Fintech | 74 |
| Healthtech | 63 |
What jumped out:
- Structured data is the #1 gap. A large share of sites ship no Organization or FAQPage JSON-LD - which is exactly the format AI answers like to quote.
- Plenty of well-known tech companies score in the C/D range, almost always because of missing schema rather than anything hard to fix.
- The bar is low. A deliberate, clean setup puts a tiny site ahead of companies a thousand times its size in how readable it is to AI search.
How to fix yours (copy-paste)
1. Let the AI crawlers in - robots.txt:
```
User-agent: GPTBot
Allow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: *
Allow: /

Sitemap: https://yourdomain.com/sitemap.xml
```
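
If you want to sanity-check a robots.txt before deploying it, one option is Python's standard-library `urllib.robotparser`. The sketch below is only an illustration under my own assumptions, not how the checker works: `allowed_agents` is a hypothetical helper, and the two robots.txt strings are made-up examples. It parses the text directly, so nothing is fetched over the network.

```python
from urllib.robotparser import RobotFileParser

# The crawler tokens named in the "What I measured" list.
AI_AGENTS = ["GPTBot", "OAI-SearchBot", "PerplexityBot", "Google-Extended", "ClaudeBot"]


def allowed_agents(robots_txt, url="https://yourdomain.com/"):
    """Map each AI crawler token to whether robots_txt lets it fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in AI_AGENTS}


# An open policy: every agent falls through to an Allow rule.
open_policy = """\
User-agent: GPTBot
Allow: /

User-agent: *
Allow: /
"""

# A policy that singles out GPTBot and blocks it.
blocking_policy = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

print(allowed_agents(open_policy))
print(allowed_agents(blocking_policy))
```

With the second policy, only GPTBot should come back as blocked; the other tokens have no group of their own, so they fall back to the `*` group.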
2. Tell engines who you are - Organization JSON-LD in your <head>:
```
<script type="application/ld+json">
{"@context":"https://schema.org","@type":"Organization","name":"YOUR COMPANY","url":"https://yourdomain.com","sameAs":["https://www.linkedin.com/company/yourco","https://x.com/yourco"]}
</script>
```
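
To confirm the tag actually made it into your rendered HTML, a rough Python sketch like the one below can list the `@type` values found in a page's JSON-LD blocks. The `JsonLdCollector` class and `jsonld_types` function are hypothetical names for illustration; this is a simplified stand-in, not the checker's implementation, and it ignores nested structures such as `@graph`.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Gathers the parsed contents of every application/ld+json script tag."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # malformed JSON-LD is skipped, which means "not found"


def jsonld_types(html):
    """Return the set of top-level @type values declared in the page."""
    collector = JsonLdCollector()
    collector.feed(html)
    types = set()
    for block in collector.blocks:
        items = block if isinstance(block, list) else [block]
        for item in items:
            if not isinstance(item, dict):
                continue
            declared = item.get("@type")
            if isinstance(declared, list):
                types.update(declared)
            elif declared:
                types.add(declared)
    return types


# The Organization snippet from step 2, wrapped in a minimal page.
page = """<html><head>
<script type="application/ld+json">
{"@context":"https://schema.org","@type":"Organization","name":"YOUR COMPANY","url":"https://yourdomain.com"}
</script>
</head><body></body></html>"""

print(jsonld_types(page))
```

If the set comes back empty on your real homepage, the tag is either missing, malformed, or injected by client-side JavaScript after the initial HTML is served.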
3. Add FAQPage JSON-LD to any page with Q&A - it's the structure AI answers quote most.
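
For reference, FAQPage markup on schema.org is a `mainEntity` list of `Question` items, each with an `acceptedAnswer` of type `Answer`. Here is a small Python sketch that builds the tag from question/answer pairs. The function names and the sample Q&A are hypothetical; adapt them to however your pages are generated.

```python
import json


def faq_jsonld(pairs):
    """Build a schema.org FAQPage object from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


def faq_script_tag(pairs):
    """Wrap the FAQPage object in a script tag ready to paste into a page."""
    body = json.dumps(faq_jsonld(pairs), indent=2)
    # Escape "</" so text inside an answer cannot close the script element early.
    body = body.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


# Hypothetical Q&A content for illustration only.
faqs = [
    ("Do you offer a free plan?", "Yes, the free plan covers one project."),
    ("Can I cancel anytime?", "Yes, cancel from your account settings."),
]

print(faq_script_tag(faqs))
```

Keep the questions and answers in the markup identical to what is visible on the page.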
Check your own site (free, open source)
I open-sourced the checker. No install needed:
```
npx github:epistemedeus/ai-readiness https://yourdomain.com
```
It prints a 0-100 score plus the exact gaps and fixes. Browser version (no install): https://samedaydesk.com/tools/ai-readiness
If you'd rather grab every template ready to paste (robots.txt, all the JSON-LD, sitemap, meta/OG) in one file, I bundled them into a $9 kit: https://buy.stripe.com/9B66oI1BEdTV6116oieZ20j - but the free checker plus the snippets above honestly get most sites 80% of the way there.
What does your site score? The thing that surprised me most: a bunch of AI companies can't be read by AI. 🤔













