Agentic readiness checklist for WordPress

This checklist is designed for WordPress sites that want to move beyond “AI SEO” slogans and into practical agentic readiness.

Use it to separate three questions that are often confused:

  1. Can machines crawl and discover the site?
  2. Can AI-oriented systems understand the site’s source pages and policy posture?
  3. Can browsing agents interact with the site without misreading buttons, forms, layouts, or workflow state?

Better Robots.txt helps with the first two. The third also needs accessibility, front-end, UX, and QA work.

Readiness levels

Level | Description | Typical risk
Initial | The site has a basic robots.txt, but crawler purposes, llms.txt, source pages, and policy signals are unclear | Accidental blocking, crawler confusion, weak source routing
Governed | The site separates crawler categories, publishes useful machine guidance, and maintains policy coherence | Still may be weak for interaction, forms, visual stability, or accessibility
Agentic-ready | The site is crawlable, governable, source-rich, accessible, stable, and testable by agents | Requires continuous QA, logs, and change management

Block 1: Search crawl baseline

A site cannot be agent-ready if it breaks basic search discovery.

Check:

  • robots.txt exists at the root.
  • Search crawlers are not blocked accidentally.
  • XML sitemaps are declared and reachable.
  • Canonical pages are indexable.
  • Important CSS and JavaScript are not blocked blindly.
  • Internal links route to the real source pages.
  • Low-value WordPress archives are controlled where appropriate.
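
Some of these checks can be scripted. The sketch below uses Python's standard urllib.robotparser to test whether search crawlers can reach key URLs and whether a sitemap is declared. The robots.txt body, the example.com URLs, the agent list, and the function name audit_search_baseline are assumptions made for illustration; they are not output from Better Robots.txt. The standard-library parser only approximates how real crawlers match rules, so treat the result as a first signal.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body, used only for illustration.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /wp-admin/

Sitemap: https://example.com/sitemap_index.xml
"""

# Assumed list of search crawlers to test; extend as needed.
SEARCH_AGENTS = ["Googlebot", "Bingbot"]


def audit_search_baseline(robots_text, urls, agents=SEARCH_AGENTS):
    """Return (blocked, sitemaps).

    blocked  -- list of (agent, url) pairs the rules would refuse
    sitemaps -- list of Sitemap URLs declared in the file
    """
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    blocked = [(a, u) for a in agents for u in urls if not parser.can_fetch(a, u)]
    return blocked, parser.site_maps() or []


if __name__ == "__main__":
    urls = [
        "https://example.com/",
        "https://example.com/wp-content/themes/site/style.css",
    ]
    blocked, sitemaps = audit_search_baseline(SAMPLE_ROBOTS, urls)
    print("blocked:", blocked)
    print("sitemaps:", sitemaps)
```

An empty blocked list and at least one sitemap entry cover the first three checks. Indexability and internal linking still need a separate crawl.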

Better Robots.txt support:

Block 2: AI crawler segmentation

A flat “allow AI” or “block AI” posture is usually too crude.

Check:

  • Search crawlers are separated from training crawlers.
  • User-triggered retrieval is not confused with background crawling.
  • Archive bots are evaluated separately.
  • SEO tool bots are evaluated separately.
  • Abusive bots are not treated like legitimate AI search crawlers.
  • GPTBot, OAI-SearchBot, ChatGPT-User, ClaudeBot, Claude-SearchBot, Claude-User, Google-Extended, PerplexityBot, and Applebot-Extended are not collapsed into one rule by accident.
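
One way to keep these agents from collapsing into a single rule is to decide by purpose first and generate the groups from that decision. The sketch below is a minimal illustration in Python. The purpose labels reflect one reading of each vendor's public crawler documentation and should be verified before use. The POLICY values and the function name render_groups are hypothetical, and none of this represents how Better Robots.txt builds its rules.

```python
# Purpose labels are this sketch's reading of vendor documentation.
# Verify each one before relying on the mapping.
CRAWLER_PURPOSE = {
    "GPTBot": "training",
    "OAI-SearchBot": "ai_search",
    "ChatGPT-User": "user_triggered",
    "ClaudeBot": "training",
    "Claude-SearchBot": "ai_search",
    "Claude-User": "user_triggered",
    "Google-Extended": "training",
    "PerplexityBot": "ai_search",
    "Applebot-Extended": "training",
}

# Hypothetical site policy: one decision per purpose, not per vendor.
POLICY = {
    "training": "Disallow: /",
    "ai_search": "Allow: /",
    "user_triggered": "Allow: /",
}


def render_groups(purposes=CRAWLER_PURPOSE, policy=POLICY):
    """Render one robots.txt group per purpose, listing its user agents."""
    groups = []
    for purpose, rule in policy.items():
        agents = sorted(a for a, p in purposes.items() if p == purpose)
        if not agents:
            continue  # no known agent has this purpose
        lines = [f"# {purpose}"]
        lines += [f"User-agent: {agent}" for agent in agents]
        lines.append(rule)
        groups.append("\n".join(lines))
    return "\n\n".join(groups) + "\n"


if __name__ == "__main__":
    print(render_groups())
```

Changing one value in POLICY then changes every agent with that purpose at once, which makes an accidental mismatch between, say, GPTBot and OAI-SearchBot easier to spot in review.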

Better Robots.txt support:

Block 3: llms.txt and machine summary

llms.txt should orient machines. It should not pretend to control them.

Check:

  • The file exists only if it is useful and maintained.
  • It is available at the expected root location.
  • It summarizes the site accurately.
  • It links to priority pages, not every page.
  • It includes product, support, policy, and governance surfaces where appropriate.
  • It does not contradict robots.txt.
  • It does not claim ranking impact, crawler obedience, or legal enforcement.
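
Three of these checks lend themselves to a quick script: the file has a title, it links to a short priority list, and it does not point at pages that robots.txt disallows. The sketch below assumes the llms.txt file follows the common proposal format (a Markdown H1 on the first line, then Markdown links). The max_links threshold of 30, the function name check_llms_txt, and the sample content are arbitrary assumptions for illustration.

```python
import re
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser

# Matches Markdown links of the form [label](https://...).
LINK_RE = re.compile(r"\[([^\]]+)\]\((https?://[^)\s]+)\)")


def check_llms_txt(llms_text, robots_text, max_links=30, agent="*"):
    """Return a list of findings; an empty list means no issue was found."""
    findings = []

    # Check 1: first non-empty line is an H1 title.
    lines = [ln for ln in llms_text.splitlines() if ln.strip()]
    if not lines or not lines[0].startswith("# "):
        findings.append("missing H1 title on the first line")

    # Check 2: the file is a priority list, not a full sitemap.
    links = LINK_RE.findall(llms_text)
    if len(links) > max_links:
        findings.append(f"{len(links)} links; expected a short priority list")

    # Check 3: no linked page is disallowed by robots.txt.
    robots = RobotFileParser()
    robots.parse(robots_text.splitlines())
    for _label, url in links:
        if not robots.can_fetch(agent, url):
            findings.append(f"robots.txt disallows linked page: {urlparse(url).path}")
    return findings


if __name__ == "__main__":
    llms = (
        "# Example Site\n\n> Short summary.\n\n## Docs\n\n"
        "- [Setup](https://example.com/docs/setup/)\n"
        "- [Notes](https://example.com/private/notes/)\n"
    )
    robots = "User-agent: *\nDisallow: /private/\n"
    for finding in check_llms_txt(llms, robots):
        print(finding)
```

Accuracy of the summary and unsupported claims about ranking or enforcement still need a human read; a script cannot judge those.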

Better Robots.txt support:

Block 4: AI governance files and policy coherence

The site should not publish isolated files that point in different directions.

Check:

  • AI usage policy exists when the site needs one.
  • Machine-readable governance files are discoverable.
  • Source precedence is clear.
  • Policy signals are not described as hard enforcement.
  • Product capabilities are edition-qualified when needed.
  • Unsupported claims are not widened by generic language.
  • Public policy pages and machine files point to each other.

Better Robots.txt support:

Block 5: Source-page architecture

Machines need stable pages to retrieve, cite, and compare.

Check:

  • The site has canonical definition pages.
  • The site has comparison pages where users ask comparison questions.
  • Product or service pages are not only promotional.
  • Documentation pages answer practical implementation questions.
  • Case studies include enough evidence to be useful.
  • FAQs are not the only place where important claims appear.
  • Internal links connect source pages together.

Better Robots.txt support:

Block 6: WordPress route hygiene

WordPress generates many public routes that may not deserve equal machine attention.

Check:

  • Tag pages are not flooding the crawl surface.
  • Date archives are useful or constrained.
  • Author archives are useful or constrained.
  • Attachment pages are controlled.
  • Internal search pages are not indexable.
  • WooCommerce cart, account, and checkout routes are treated correctly.
  • Staging, demo, and utility paths are not exposed accidentally.
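
To see how much of a crawl or sitemap export falls into these route types, URLs can be bucketed by pattern. The sketch below assumes WordPress and WooCommerce default slugs (/tag/, /author/, /cart/, /checkout/, /my-account/, ?s=, ?attachment_id=). Sites with custom permalink bases need their own patterns, and the category names and the function name classify_route are hypothetical.

```python
import re
from urllib.parse import urlparse, parse_qs

# Patterns assume default WordPress and WooCommerce slugs.
ROUTE_PATTERNS = [
    ("tag_archive", re.compile(r"^/tag/")),
    ("author_archive", re.compile(r"^/author/")),
    # /2024/, /2024/05/ or /2024/05/12/ with nothing after the date.
    ("date_archive", re.compile(r"^/\d{4}/(\d{2}/)?(\d{2}/)?$")),
    ("woocommerce_private", re.compile(r"^/(cart|checkout|my-account)(/|$)")),
]


def classify_route(url):
    """Return a route category for a URL, or "content" if none matches."""
    parts = urlparse(url)
    query = parse_qs(parts.query, keep_blank_values=True)
    if "s" in query:
        return "internal_search"
    if "attachment_id" in query:
        return "attachment_page"
    for name, pattern in ROUTE_PATTERNS:
        if pattern.search(parts.path):
            return name
    return "content"


if __name__ == "__main__":
    for url in [
        "https://example.com/?s=robots",
        "https://example.com/tag/seo/",
        "https://example.com/2024/05/",
        "https://example.com/2024/05/12/hello-world/",
        "https://example.com/checkout/",
    ]:
        print(classify_route(url), url)
```

Counting the categories over a full URL list shows whether archives outnumber real content, which is the signal this block is looking for.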

Better Robots.txt support:

Block 7: Accessibility tree and semantic controls

Agents often rely on structured representations of the page, not only visual screenshots.

Check:

  • Buttons have meaningful labels.
  • Links describe their destination.
  • Form fields have labels.
  • Error messages are programmatically connected to fields.
  • Main navigation is understandable.
  • Accordions, tabs, modals, and menus expose state clearly.
  • Decorative content does not create semantic noise.
  • Primary actions are not hidden behind visual-only cues.
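
The form-label check can be approximated with Python's standard html.parser. The sketch below flags input, select, and textarea elements that have neither a matching label for attribute nor an aria-label or aria-labelledby attribute. It is deliberately narrow: it does not detect controls wrapped inside a label element, and it is no replacement for a real accessibility audit. The class name LabelAudit and the sample markup are assumptions.

```python
from html.parser import HTMLParser


class LabelAudit(HTMLParser):
    """Collect form controls that have no programmatic label."""

    SKIPPED_TYPES = ("hidden", "submit", "button")

    def __init__(self):
        super().__init__()
        self.label_targets = set()  # ids referenced by <label for="...">
        self.controls = []          # (tag, id, has_aria_label)

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "label" and attributes.get("for"):
            self.label_targets.add(attributes["for"])
        elif tag in ("input", "select", "textarea"):
            if attributes.get("type") in self.SKIPPED_TYPES:
                return
            has_aria = bool(
                attributes.get("aria-label") or attributes.get("aria-labelledby")
            )
            self.controls.append((tag, attributes.get("id"), has_aria))

    def unlabeled(self):
        """Return (tag, id) for each control with no label and no ARIA name."""
        return [
            (tag, control_id)
            for tag, control_id, has_aria in self.controls
            if not has_aria and control_id not in self.label_targets
        ]


if __name__ == "__main__":
    audit = LabelAudit()
    audit.feed(
        '<form><label for="email">Email</label>'
        '<input id="email" type="email">'
        '<input id="phone" type="tel">'
        '<textarea aria-label="Message"></textarea></form>'
    )
    print(audit.unlabeled())
```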

Better Robots.txt support: limited. This block belongs mostly to front-end, accessibility, and UX remediation.

Block 8: Visual stability and dynamic behavior

An agent can misclick or misinterpret a page if the interface moves after the target is identified.

Check:

  • Layout shifts are minimized.
  • Images and embeds have dimensions.
  • Font loading does not create severe shifts.
  • Ads and consent banners do not move critical actions unpredictably.
  • Lazy-loaded blocks do not change the meaning of the page.
  • Sticky headers do not hide target elements.
  • JavaScript-rendered content is still understandable.
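
The "images and embeds have dimensions" check is easy to script against rendered HTML. The sketch below uses Python's standard html.parser to list img, iframe, and video elements that lack a width or height attribute. It only inspects attributes, so elements sized through CSS (for example with aspect-ratio) will be reported even though they may be stable. The class name DimensionAudit and the sample markup are assumptions.

```python
from html.parser import HTMLParser


class DimensionAudit(HTMLParser):
    """Collect media elements without explicit width and height attributes."""

    MEDIA_TAGS = ("img", "iframe", "video")

    def __init__(self):
        super().__init__()
        self.missing = []  # (tag, src) for each element lacking a dimension

    def handle_starttag(self, tag, attrs):
        if tag not in self.MEDIA_TAGS:
            return
        attributes = dict(attrs)
        if not (attributes.get("width") and attributes.get("height")):
            self.missing.append((tag, attributes.get("src", "")))


if __name__ == "__main__":
    audit = DimensionAudit()
    audit.feed(
        '<img src="a.png" width="600" height="400">'
        '<img src="b.png">'
        '<iframe src="https://example.com/embed" width="560"></iframe>'
    )
    for tag, src in audit.missing:
        print(tag, src)
```

Measured layout shift still needs a browser-based tool; this only catches one common cause before it ships.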

Better Robots.txt support: indirect. It can help avoid blocking required resources, but it does not fix layout stability.

Block 9: Forms, checkout, and contact workflows

Agentic readiness becomes real when a system has to act.

Check:

  • Contact forms expose required fields clearly.
  • Confirmation messages are visible and semantically clear.
  • Anti-spam layers do not create unexplained dead ends.
  • Checkout steps are stable and labelled.
  • Errors can be understood without visual guessing.
  • Consent banners do not block essential paths without accessible controls.
  • The user can distinguish “submitted”, “failed”, and “needs action”.

Better Robots.txt support: none directly. This block requires form, UX, accessibility, and QA work.

Block 10: Logs, evidence, and change management

Readiness is not a one-time artifact.

Check:

  • Crawler logs are reviewed.
  • Important bots are identifiable.
  • User-triggered fetchers are separated from background crawlers where possible.
  • Changes to robots.txt, llms.txt, policies, sitemaps, and source pages are tracked.
  • AI visibility tests are repeated under consistent conditions.
  • Claims about AI recommendation or citation are documented with dates and surfaces.
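
A first pass at log review can be a simple count of requests per crawler category. The sketch below assumes access logs in Combined Log Format, where the user agent is the last quoted field, and matches on user-agent substrings. The category labels are assumptions, and a user-agent string can be spoofed, so this does not verify that a request really came from the named bot.

```python
import re
from collections import Counter

# Combined Log Format: the user agent is the last quoted field on the line.
UA_RE = re.compile(r'"([^"]*)"\s*$')

# User-agent substring -> category. Labels are this sketch's assumption.
BOT_CATEGORIES = {
    "Googlebot": "search",
    "bingbot": "search",
    "GPTBot": "training",
    "ClaudeBot": "training",
    "OAI-SearchBot": "ai_search",
    "PerplexityBot": "ai_search",
    "ChatGPT-User": "user_triggered",
    "Claude-User": "user_triggered",
}


def categorize(line):
    """Return the crawler category for one log line, or "other"."""
    match = UA_RE.search(line)
    user_agent = match.group(1).lower() if match else ""
    for token, category in BOT_CATEGORIES.items():
        if token.lower() in user_agent:
            return category
    return "other"


def summarize(lines):
    """Count log lines per crawler category."""
    return Counter(categorize(line) for line in lines)


if __name__ == "__main__":
    sample = [
        '203.0.113.5 - - [10/May/2025:10:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" '
        '"Mozilla/5.0 (compatible; GPTBot/1.2; +https://openai.com/gptbot)"',
        '203.0.113.9 - - [10/May/2025:10:00:05 +0000] "GET /docs/ HTTP/1.1" 200 900 "-" '
        '"Mozilla/5.0 (Windows NT 10.0; Win64; x64)"',
    ]
    print(summarize(sample))
```

Running the same summary before and after a robots.txt change gives a dated record of how crawler traffic responded, which supports the tracking items above.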

Better Robots.txt support:

What Better Robots.txt covers best

Use Better Robots.txt for:

  • robots.txt generation;
  • AI crawler segmentation;
  • WordPress crawl hygiene;
  • llms.txt publication and checking;
  • AI governance file awareness;
  • crawl-policy review;
  • audit-to-action workflows.

Do not use it as a substitute for:

  • accessibility remediation;
  • front-end performance engineering;
  • structured data strategy;
  • content architecture;
  • legal review;
  • WAF enforcement;
  • signed-agent verification;
  • complete browser-agent testing.

30-minute implementation pass

A fast first pass looks like this:

  1. Run the free audit.
  2. Fix accidental search crawler blocks.
  3. Select or adjust a Better Robots.txt preset.
  4. Review AI crawler rules by purpose.
  5. Add or improve llms.txt only if the file can be meaningful.
  6. Check whether governance files exist and are coherent.
  7. Identify 5 priority source pages.
  8. Run Lighthouse Agentic Browsing.
  9. Note accessibility, CLS, and interaction risks.
  10. Re-audit after publishing.