AI Crawler Governance Score
Rules version 2.4.1 · Rules last updated 2026-06-04 · Updated quarterly
This page describes how the AI Crawler Governance Score is computed. The computation is deterministic and versioned: the same robots.txt, llms.txt and AI-policy signals always produce the same score. Two numbers describe a site: a classic governance score and a profile-fit score.
The two scores
Classic governance score (0 to 100). The general hygiene of your crawler governance, independent of any chosen intent. A site can score high here and still not match the intent you picked.
Profile fit (0 to 100). How closely the site matches the intent profile you selected. A site can score 70 on the classic score and 100 on profile fit, or 100 on the classic score and 64 on profile fit. Profile fit is the number the downloadable configuration targets.
Scope boundaries
This audit covers crawler access governance and declared AI-use governance. Browser-agent operability, including accessibility tree quality, WebMCP and visual stability, belongs to a separate layer such as Lighthouse Agentic Browsing.
The six governance blocks
| Block | Weight | Rules |
|---|---|---|
| Presence and validity | 10 | R1, R2, R3 |
| Search engine posture | 10 | R4, R5, R6, R7 |
| AI crawler posture | 25 | R8, R9, R10, R11, R12, R13, R14, R15 |
| AI policy and interpretive governance | 25 | R16, R17, R18, R19, R20, R21, R31 |
| Crawl budget hygiene | 15 | R22, R23, R24, R25 |
| Resources, social and monetisation | 15 | R26, R27, R28, R29, R30 |
| Total | 100 | |
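The block weights can be turned into a 0 to 100 number in several ways, and this section does not state the exact aggregation. The sketch below is one possible reading, assuming each block earns its weight in proportion to the share of its rules that pass. The function name `classic_score` and the pass/fail treatment of rules are assumptions for illustration, not the documented method.

```python
# Block weights and rule IDs copied from the table above.
BLOCKS = {
    "Presence and validity": (10, ["R1", "R2", "R3"]),
    "Search engine posture": (10, ["R4", "R5", "R6", "R7"]),
    "AI crawler posture": (25, ["R8", "R9", "R10", "R11", "R12", "R13", "R14", "R15"]),
    "AI policy and interpretive governance": (25, ["R16", "R17", "R18", "R19", "R20", "R21", "R31"]),
    "Crawl budget hygiene": (15, ["R22", "R23", "R24", "R25"]),
    "Resources, social and monetisation": (15, ["R26", "R27", "R28", "R29", "R30"]),
}


def classic_score(passed_rules):
    """Hypothetical aggregation: each block contributes weight * (passed / total).

    ASSUMPTION: rules are pass/fail and count equally within a block.
    The real scoring may grade rules differently.
    """
    passed = set(passed_rules)
    total = 0.0
    for weight, rules in BLOCKS.values():
        hits = sum(1 for rule in rules if rule in passed)
        total += weight * hits / len(rules)
    return round(total, 1)


if __name__ == "__main__":
    all_rules = [rule for _, rules in BLOCKS.values() for rule in rules]
    print(classic_score(all_rules))
    print(classic_score(["R1", "R2", "R3"]))
```

Because the function depends only on its input set and the fixed table, the same inputs always give the same result, which matches the deterministic behaviour described for the score.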
The five intent profiles
Each profile expresses an intent in plain language. Profile fit is the weighted combination of four sub-scores; each profile weights them differently.
AI search open, training restricted
Allow AI search bots (ChatGPT, Claude, Perplexity), block training crawlers.
- Search visibility readiness (30%)
- Training control clarity (30%)
- Governance consistency (25%)
- Hygiene (15%)
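As an illustration of the intent behind this profile, the sketch below holds a small robots.txt in a Python string and checks it with the standard library's `urllib.robotparser`. The crawler tokens (`OAI-SearchBot`, `PerplexityBot`, `GPTBot`, `CCBot`) are names published by their vendors at the time of writing and are assumptions here; this section does not list the bots the score actually checks, so verify tokens against each vendor's current documentation. The standard-library parser is also not the score's own matcher and is used only to show the allow/block split.

```python
from urllib.robotparser import RobotFileParser

# Illustrative policy only: search-oriented AI bots allowed, training crawlers blocked.
ROBOTS_TXT = """\
# AI search crawlers: allowed
User-agent: OAI-SearchBot
User-agent: PerplexityBot
Allow: /

# Training crawlers: blocked
User-agent: GPTBot
User-agent: CCBot
Disallow: /
"""


def build_parser(robots_txt):
    """Parse robots.txt text with Python's standard-library parser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


if __name__ == "__main__":
    parser = build_parser(ROBOTS_TXT)
    url = "https://example.com/article"
    for bot in ("OAI-SearchBot", "PerplexityBot", "GPTBot", "CCBot"):
        print(bot, parser.can_fetch(bot, url))
```

Crawlers not named in either group, such as traditional search engine bots, remain unrestricted under this file because no `User-agent: *` group is declared.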
Maximum AI visibility
Allow all legitimate AI bots for maximum citation; aggressive scrapers handled elsewhere.
- Search visibility readiness (40%)
- Training control clarity (10%)
- Governance consistency (25%)
- Hygiene (25%)
Publisher protection
Strict training control with selective AI search visibility; for media and editorial sites.
- Search visibility readiness (20%)
- Training control clarity (40%)
- Governance consistency (30%)
- Hygiene (10%)
WordPress safe default
Recommended baseline for typical WordPress sites: AI search OK, training restricted.
- Search visibility readiness (30%)
- Training control clarity (25%)
- Governance consistency (25%)
- Hygiene (20%)
Strict crawler restriction
Block all AI bots; keep only minimal traditional search visibility.
- Search visibility readiness (15%)
- Training control clarity (45%)
- Governance consistency (30%)
- Hygiene (10%)
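Across all five profiles, profile fit is a weighted combination of the same four sub-scores. A minimal sketch of that arithmetic follows, using the percentages listed above. It assumes each sub-score is already on a 0 to 100 scale; the dictionary keys and the function name `profile_fit` are hypothetical, and how the sub-scores themselves are derived is not covered here.

```python
# Weights (in percent) copied from the five profile lists above.
PROFILES = {
    "AI search open, training restricted": {
        "search_visibility": 30, "training_control": 30,
        "governance_consistency": 25, "hygiene": 15,
    },
    "Maximum AI visibility": {
        "search_visibility": 40, "training_control": 10,
        "governance_consistency": 25, "hygiene": 25,
    },
    "Publisher protection": {
        "search_visibility": 20, "training_control": 40,
        "governance_consistency": 30, "hygiene": 10,
    },
    "WordPress safe default": {
        "search_visibility": 30, "training_control": 25,
        "governance_consistency": 25, "hygiene": 20,
    },
    "Strict crawler restriction": {
        "search_visibility": 15, "training_control": 45,
        "governance_consistency": 30, "hygiene": 10,
    },
}


def profile_fit(profile, sub_scores):
    """Weighted average of the four sub-scores (ASSUMED to be on a 0-100 scale)."""
    weights = PROFILES[profile]
    return sum(weights[name] * sub_scores[name] for name in weights) / 100


if __name__ == "__main__":
    # Made-up sub-scores for one site, for illustration only.
    site = {
        "search_visibility": 100, "training_control": 40,
        "governance_consistency": 80, "hygiene": 60,
    }
    for name in PROFILES:
        print(name, profile_fit(name, site))
```

With these made-up sub-scores, the same site fits "Maximum AI visibility" at 79.0 but "Strict crawler restriction" at 63.0, which shows why one site can have different fit values depending on the intent selected.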
Normative basis
The robots.txt matching follows RFC 9309 (user-agent group selection, the most specific, that is longest, matching rule wins, the * and $ special characters, path scope). The sub-scores and profiles are Pagup / InferensLab interpretive governance built on top of that matcher.
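The "most specific rule wins" behaviour from RFC 9309 can be sketched in a few lines. This is a simplified illustration, not the audit's matcher: it assumes the rules of one already-selected user-agent group are given as `(allow, pattern)` pairs, it skips percent-encoding normalisation, and the helper names are hypothetical. It follows the RFC in three respects: the matching rule with the most octets wins, an allow rule is preferred when an allow and a disallow rule tie, and a path that matches no rule is allowed.

```python
import re


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern: '*' matches any run, trailing '$' anchors the end."""
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = "".join(".*" if ch == "*" else re.escape(ch) for ch in body)
    return re.compile(regex + (r"\Z" if anchored else ""))


def is_allowed(rules, path):
    """Return True if `path` may be crawled under `rules` (list of (allow, pattern))."""
    best = None  # (pattern length in octets, allow flag)
    for allow, pattern in rules:
        if not pattern:
            continue  # an empty pattern matches nothing
        if pattern_to_regex(pattern).match(path):
            candidate = (len(pattern.encode("utf-8")), allow)
            # Longer pattern wins; on equal length, True (allow) sorts above False.
            if best is None or candidate > best:
                best = candidate
    return True if best is None else best[1]


if __name__ == "__main__":
    rules = [
        (False, "/private/"),
        (True, "/private/press/"),
        (False, "/*.pdf$"),
    ]
    for path in ("/private/a", "/private/press/kit", "/docs/file.pdf", "/docs/file.pdf?x=1"):
        print(path, is_allowed(rules, path))
```

Here `/private/press/kit` is allowed because the allow pattern is longer than the disallow pattern that also matches it, while `/docs/file.pdf?x=1` is allowed because the `$` anchor stops the PDF rule from matching a URL with a query string.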