Google-Extended checker

Audit this bot. Run the Better Robots.txt checker to see whether Google-Extended is explicitly addressed in your robots.txt file.

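Google-Extended is a Google product token that lets publishers control downstream AI use of content Google has already crawled. Google describes it as a standalone product token with no separate HTTP user agent: crawling is done with Google's existing user agents, and the token acts only as a control in robots.txt. Google also says it does not affect inclusion or ranking in Google Search.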

The important audit question is not simply “is Google-Extended blocked?” The better question is: does the site express the intended posture for Google-Extended without accidentally harming search visibility, AI search visibility, or WordPress crawl hygiene?

What the checker looks for

| Signal | Why it matters |
| --- | --- |
| Explicit user-agent group | Shows that the site is not silent toward `Google-Extended`. |
| Allow or Disallow directive | Expresses the practical crawl posture. |
| Separation from wildcard rules | Avoids hiding the policy inside generic `User-agent: *` behavior. |
| Consistency with related bots | Prevents conflating training, search, and user-triggered access. |
| WordPress safety | Ensures the `Google-Extended` policy does not break public assets or search crawlers. |
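
The first and third signals can be approximated with a small parser. The following is a minimal sketch, not the checker's actual implementation: `parse_groups` and `explicit_rules` are hypothetical helper names, and the parser handles only `User-agent`, `Allow`, and `Disallow` lines.

```python
def parse_groups(robots_txt: str) -> list[dict]:
    """Split robots.txt text into groups of user agents and their rules."""
    groups = []
    current = None
    seen_rule = False  # True once the current group has at least one rule
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            # Consecutive User-agent lines share one group; a User-agent
            # line that follows a rule starts a new group.
            if current is None or seen_rule:
                current = {"agents": [], "rules": []}
                groups.append(current)
                seen_rule = False
            current["agents"].append(value.lower())
        elif field in ("allow", "disallow") and current is not None:
            current["rules"].append((field, value))
            seen_rule = True
    return groups


def explicit_rules(robots_txt: str, token: str = "Google-Extended"):
    """Return the rules of every group naming `token`, or None if none does."""
    matches = [g for g in parse_groups(robots_txt) if token.lower() in g["agents"]]
    if not matches:
        return None  # the token would fall back to `User-agent: *`
    return [rule for g in matches for rule in g["rules"]]
```

A return value of `None` means the policy lives only in wildcard behavior, which is the situation the "Separation from wildcard rules" signal is meant to flag.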

Example robots.txt rule

```txt
User-agent: Google-Extended
Disallow: /
```

or, for an open posture:

```txt
User-agent: Google-Extended
Allow: /
```

These are examples, not universal recommendations. The right choice depends on whether the site wants visibility, training restriction, user retrieval, or a conservative publisher stance.
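
To see how a rule like the ones above evaluates next to a wildcard group, Python's standard `urllib.robotparser` can be used as a rough local check. This is a sketch under stated assumptions: the `ROBOTS_TXT` sample and the `allowed` helper are illustrative, and `robotparser` applies the first matching rule in a group, which can differ from Google's most-specific-match behavior on more complex files. Because Google-Extended has no crawler of its own, the result only shows which rules match the token.

```python
from urllib.robotparser import RobotFileParser

# Illustrative file: open to everything by default, closed to Google-Extended.
ROBOTS_TXT = """\
User-agent: *
Allow: /

User-agent: Google-Extended
Disallow: /
"""


def allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Report whether `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for agent in ("Google-Extended", "Googlebot"):
        print(agent, allowed(ROBOTS_TXT, agent, "https://example.com/post/"))
```

In this sample the restriction applies only to `Google-Extended`; `Googlebot` falls through to the wildcard group and stays allowed.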

How to interpret the audit result

| Result | Interpretation |
| --- | --- |
| Explicitly allowed | The site permits the uses this token governs, at least at the robots.txt level. |
| Explicitly blocked | The site opts out of the uses this token governs; robots.txt binds only clients that honor it. |
| Missing | No group names the token, so it inherits `User-agent: *` behavior and the intended posture is unclear. |
| Contradictory | Multiple rules or policy surfaces express incompatible intent. |
| Not evaluated | The scanner could not verify the signal with enough confidence. |
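
The result labels can be read as a simple decision procedure. The sketch below is an assumption about how such a classification might work, not the scanner's real logic: `classify` is a hypothetical function, it looks only at site-wide rules (`/`), and it sends anything more nuanced to "not evaluated" for human review.

```python
from typing import Optional

Rule = tuple[str, str]  # (directive, path), for example ("disallow", "/")


def classify(rules: Optional[list[Rule]], fetched_ok: bool = True) -> str:
    """Map the token's explicit rules to one of the audit result labels.

    `rules` is None when no group names the token at all.
    """
    if not fetched_ok:
        return "not evaluated"  # robots.txt could not be retrieved or parsed
    if rules is None:
        return "missing"  # the token inherits wildcard behavior
    blocks_root = any(d == "disallow" and p == "/" for d, p in rules)
    # An empty Disallow value means "nothing is disallowed".
    allows_root = any(
        (d == "allow" and p == "/") or (d == "disallow" and p == "")
        for d, p in rules
    )
    if blocks_root and allows_root:
        return "contradictory"
    if blocks_root:
        return "explicitly blocked"
    if allows_root:
        return "explicitly allowed"
    # Partial path rules need human judgment in this simplified sketch.
    return "not evaluated"
```

For example, `classify([("disallow", "/")])` yields "explicitly blocked", while `classify(None)` yields "missing".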

Related crawler distinctions

A mature policy should compare Google-Extended to adjacent agents from the same or similar ecosystem. The mistake is to treat all AI-related access as identical.

For example, a site may decide to allow AI search crawlers while restricting training-related crawlers. Another site may decide that user-triggered retrieval is acceptable but automated training collection is not. The checker exists to make those distinctions visible.

Source reference

The bot classification on this page is based on the public documentation available from Google: https://developers.google.com/search/docs/crawling-indexing/overview-google-crawlers. Always review the current vendor documentation before deploying a strict policy, because crawler names and purposes can evolve.

WordPress implementation

If the site runs on WordPress, Better Robots.txt gives you a safer configuration workflow than manually editing root files. You can configure crawler families, preview the generated output, publish the change, then re-run the audit.


Implementation checklist

Use the audit as an implementation sequence, not as a decorative score.

1. Confirm the audited origin: protocol, host, and subdomain must match the site you actually want to govern.
2. Preserve search access unless the site is intentionally private.
3. Decide whether the goal is maximum AI visibility, training restriction, conservative publishing, or strict privacy.
4. Configure crawler families by purpose rather than by emotion.
5. Publish policy context only when it is coherent with the active rules.
6. Re-scan after changes because a generated WordPress robots.txt file can be modified by plugins, cache, server rules, or edge middleware.
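
The first step matters because a robots.txt file applies only to the exact protocol, host, and port it is served from. As a minimal illustration, the hypothetical helper `robots_url` below derives the robots.txt location that governs a given page URL, so that `https://example.com` and `https://www.example.com` are visibly treated as different origins.

```python
from urllib.parse import urlsplit


def robots_url(site_url: str) -> str:
    """Return the robots.txt URL for the origin that serves `site_url`."""
    parts = urlsplit(site_url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not an absolute http(s) URL: {site_url!r}")
    # Path, query, and fragment are dropped: robots.txt lives at the origin root.
    return f"{parts.scheme}://{parts.netloc.lower()}/robots.txt"
```

Running it on the URL you intend to audit and on the URL visitors actually land on is a quick way to catch a protocol or subdomain mismatch before changing any rules.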

Manual spot check

A technical reviewer can validate the audit manually by requesting these URLs:

```txt
/robots.txt
/llms.txt
/ai-manifest.json
/.well-known/ai-governance.json
/.well-known/llm-policy.json
```

Then compare the result with the public pages, sitemap, and WordPress configuration. The important question is not only whether each file exists. It is whether those files express the same intent. A robots.txt block, a permissive llms.txt, and a contradictory AI policy create a weak governance layer even if each file loads successfully.
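
The spot check can be scripted with the standard library. This is a sketch only: `policy_urls` and `fetch_status` are hypothetical names, the `User-Agent` string is a placeholder, and a successful status code says nothing about whether the files agree with each other, which still needs a human read.

```python
from urllib.error import HTTPError
from urllib.parse import urljoin
from urllib.request import Request, urlopen

# The policy surfaces listed above, relative to the site origin.
POLICY_PATHS = [
    "/robots.txt",
    "/llms.txt",
    "/ai-manifest.json",
    "/.well-known/ai-governance.json",
    "/.well-known/llm-policy.json",
]


def policy_urls(origin: str) -> list[str]:
    """Build the absolute URL of each policy file for `origin`."""
    return [urljoin(origin, path) for path in POLICY_PATHS]


def fetch_status(url: str, timeout: float = 10.0):
    """Return the HTTP status code for `url`, or None if the request failed."""
    request = Request(url, headers={"User-Agent": "policy-spot-check/0.1"})
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.status
    except HTTPError as error:
        return error.code  # for example 404 when the file is absent
    except OSError:
        return None  # DNS failure, refused connection, or timeout


# Example usage (performs network requests):
# for url in policy_urls("https://example.com"):
#     print(fetch_status(url), url)
```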

Conversion path for WordPress

If the site is WordPress, the practical next step is not a spreadsheet of recommendations. It is a configuration pass inside Better Robots.txt: choose the closest preset, adjust crawler families, preview the output, publish, and re-run the external scan. That is what turns the audit from education into proof.
