AI Crawler Governance Score

Rules version 2.4.1 · Rules last updated 2026-06-04 · Updated quarterly

This page describes how the AI Crawler Governance Score is computed. The computation is deterministic and versioned: under a given rules version, the same robots.txt, llms.txt and AI-policy signals always produce the same score. Two numbers describe a site: a classic governance score and a profile-fit score.

The two scores

Classic governance score (0 to 100). The general hygiene of your crawler governance, independent of any chosen intent. A site can score high here and still not match the intent you picked.

Profile fit (0 to 100). How closely the site matches the intent profile you selected. The two numbers move independently: a site can score 70 on the classic score and 100 on profile fit, or 100 on the classic score and 64 on profile fit. Profile fit is the number the downloadable configuration targets.

Scope boundaries

This audit covers crawler access governance and declared AI-use governance. Browser-agent operability, including accessibility tree quality, WebMCP and visual stability, belongs to a separate layer such as Lighthouse Agentic Browsing.

The six governance blocks

Block	Weight	Rules
Presence and validity	10	R1, R2, R3
Search engine posture	10	R4, R5, R6, R7
AI crawler posture	25	R8, R9, R10, R11, R12, R13, R14, R15
AI policy and interpretive governance	25	R16, R17, R18, R19, R20, R21, R31
Crawl budget hygiene	15	R22, R23, R24, R25
Resources, social and monetisation	15	R26, R27, R28, R29, R30
Total	100
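
As an illustration of how the block weights above could roll up into the classic governance score, here is a minimal Python sketch. The weights are copied from the table. The aggregation itself is an assumption, not the published method: it supposes each block earns a fraction between 0.0 and 1.0 of its weight, and the function name classic_score and the block_fractions input are hypothetical.

```python
# Block weights copied from the table above.
BLOCK_WEIGHTS = {
    "Presence and validity": 10,
    "Search engine posture": 10,
    "AI crawler posture": 25,
    "AI policy and interpretive governance": 25,
    "Crawl budget hygiene": 15,
    "Resources, social and monetisation": 15,
}


def classic_score(block_fractions: dict[str, float]) -> float:
    """Assumed aggregation: each block contributes weight * fraction earned.

    block_fractions maps every block name to a value in [0.0, 1.0].
    How rules inside a block produce that fraction is not modelled here.
    """
    if set(block_fractions) != set(BLOCK_WEIGHTS):
        raise ValueError("expected exactly one fraction per block")
    total = 0.0
    for block, weight in BLOCK_WEIGHTS.items():
        fraction = block_fractions[block]
        if not 0.0 <= fraction <= 1.0:
            raise ValueError(f"fraction for {block!r} must be in [0, 1]")
        total += weight * fraction
    return round(total, 1)


if __name__ == "__main__":
    perfect = {block: 1.0 for block in BLOCK_WEIGHTS}
    print(classic_score(perfect))
```

Because the weights sum to 100, a site that earns the full fraction in every block lands on the top of the 0 to 100 scale under this assumption.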

The five intent profiles

Each profile expresses an intent in plain language. Profile fit is the weighted combination of four sub-scores; each profile weights them differently.

AI search open, training restricted

Allow AI search bots (ChatGPT, Claude, Perplexity) and block training crawlers.

Search visibility readiness (30%)
Training control clarity (30%)
Governance consistency (25%)
Hygiene (15%)

Maximum AI visibility

Allow all legitimate AI bots for maximum citation; aggressive scrapers are handled elsewhere.

Search visibility readiness (40%)
Training control clarity (10%)
Governance consistency (25%)
Hygiene (25%)

Publisher protection

Strict training control with selective AI search visibility; for media and editorial sites.

Search visibility readiness (20%)
Training control clarity (40%)
Governance consistency (30%)
Hygiene (10%)

WordPress safe default

Recommended baseline for typical WordPress sites: AI search OK, training restricted.

Search visibility readiness (30%)
Training control clarity (25%)
Governance consistency (25%)
Hygiene (20%)

Strict crawler restriction

Block all AI bots; keep only minimal traditional search visibility.

Search visibility readiness (15%)
Training control clarity (45%)
Governance consistency (30%)
Hygiene (10%)
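
The weighted combination behind profile fit can be sketched in a few lines of Python. The percentages below are copied from the five profiles above. Everything else is an assumption made for illustration: the sub-scores are taken to be on a 0 to 100 scale, the short keys in SUB_SCORES and the function name profile_fit are hypothetical, and the sample sub-score values are made up.

```python
# Order of the four sub-scores used in every weight tuple below.
SUB_SCORES = (
    "search_visibility",       # Search visibility readiness
    "training_control",        # Training control clarity
    "governance_consistency",  # Governance consistency
    "hygiene",                 # Hygiene
)

# Percent weights per profile, copied from the profile lists above.
PROFILE_WEIGHTS = {
    "AI search open, training restricted": (30, 30, 25, 15),
    "Maximum AI visibility": (40, 10, 25, 25),
    "Publisher protection": (20, 40, 30, 10),
    "WordPress safe default": (30, 25, 25, 20),
    "Strict crawler restriction": (15, 45, 30, 10),
}


def profile_fit(profile: str, sub_scores: dict[str, float]) -> float:
    """Weighted average of the four sub-scores (assumed to be 0 to 100 each)."""
    weights = PROFILE_WEIGHTS[profile]
    weighted = sum(w * sub_scores[name] for w, name in zip(weights, SUB_SCORES))
    return round(weighted / 100, 1)


if __name__ == "__main__":
    # Hypothetical site: open to AI search, weak on training control.
    site = {
        "search_visibility": 100,
        "training_control": 40,
        "governance_consistency": 80,
        "hygiene": 90,
    }
    print(profile_fit("Maximum AI visibility", site))   # 86.5
    print(profile_fit("Publisher protection", site))    # 69.0
```

The same sub-scores yield different fits because each profile weights them differently: the sample site fits "Maximum AI visibility" better than "Publisher protection", where training control carries 40% of the weight.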

Normative basis

The robots.txt matching follows RFC 9309: group selection, most-specific-rule precedence, special characters and path scope. The sub-scores and profiles are Pagup / InferensLab interpretive governance built on top of that matcher.
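
To make the "most specific rule wins" behaviour concrete, here is a small Python sketch of RFC 9309 path matching within a single, already selected group. It is not the audit's actual matcher. The function names are hypothetical, and the sketch simplifies in three ways that are worth stating: it does not perform group selection by user agent, it does not normalise percent-encoding, and it measures specificity by pattern length in characters rather than octets.

```python
import re


def _pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex.

    '*' matches any run of characters; a trailing '$' anchors the end of the path.
    """
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(regex + ("$" if anchored else ""))


def is_allowed(rules: list[tuple[str, str]], path: str) -> bool:
    """Decide access for a path given ("allow" | "disallow", pattern) pairs.

    The longest matching pattern wins; on a tie, allow wins.
    A path that matches no rule is allowed.
    """
    best_len = -1
    best_allow = True
    for kind, pattern in rules:
        if not pattern:
            continue  # an empty pattern matches nothing
        if _pattern_to_regex(pattern).match(path) is None:
            continue
        allow = kind == "allow"
        if len(pattern) > best_len or (len(pattern) == best_len and allow):
            best_len, best_allow = len(pattern), allow
    return best_allow


if __name__ == "__main__":
    rules = [
        ("disallow", "/private/"),
        ("allow", "/private/press/"),
        ("disallow", "/*.pdf$"),
    ]
    for path in ("/private/notes.html", "/private/press/launch.html", "/docs/report.pdf"):
        print(path, is_allowed(rules, path))
```

In this example /private/press/launch.html is allowed even though /private/ is disallowed, because the allow pattern is longer and therefore more specific.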