Agentic readiness checklist for WordPress
This checklist is designed for WordPress sites that want to move beyond “AI SEO” slogans and into practical agentic readiness.
Use it to separate three questions that are often confused:
- Can machines crawl and discover the site?
- Can AI-oriented systems understand the site’s source pages and policy posture?
- Can browsing agents interact with the site without misreading buttons, forms, layouts, or workflow state?
Better Robots.txt helps with the first two. The third also needs accessibility, front-end, UX, and QA work.
Readiness levels
| Level | Description | Typical risk |
|---|---|---|
| Initial | The site has a basic robots.txt, but crawler purposes, llms.txt, source pages, and policy signals are unclear | Accidental blocking, crawler confusion, weak source routing |
| Governed | The site separates crawler categories, publishes useful machine guidance, and maintains policy coherence | Still may be weak for interaction, forms, visual stability, or accessibility |
| Agentic-ready | The site is crawlable, governable, source-rich, accessible, stable, and testable by agents | Requires continuous QA, logs, and change management |
Block 1: Search crawl baseline
A site cannot be agent-ready if it breaks basic search discovery.
Check:
- robots.txt exists at the root.
- Search crawlers are not blocked accidentally.
- XML sitemaps are declared and reachable.
- Canonical pages are indexable.
- Important CSS and JavaScript are not blocked blindly.
- Internal links route to the real source pages.
- Low-value WordPress archives are controlled where appropriate.
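Several of the checks above can be approximated offline with Python's standard `urllib.robotparser`. The sketch below is illustrative only: the robots.txt content, the `example.com` URLs, the path list, and the choice of `Googlebot` and `Bingbot` as representative search crawlers are all assumptions, not output from Better Robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used only for illustration.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-content/themes/

Sitemap: https://example.com/sitemap_index.xml
"""

# Assumed set of search crawlers to test; extend as needed.
SEARCH_AGENTS = ["Googlebot", "Bingbot"]


def baseline_report(robots_text, site, must_fetch):
    """Return declared sitemaps and any (agent, path) pairs that are blocked."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    blocked = [
        (agent, path)
        for agent in SEARCH_AGENTS
        for path in must_fetch
        if not parser.can_fetch(agent, site + path)
    ]
    # site_maps() returns None when no Sitemap line is present.
    return {"sitemaps": parser.site_maps() or [], "blocked": blocked}


report = baseline_report(
    SAMPLE_ROBOTS,
    "https://example.com",
    ["/", "/pricing/", "/wp-content/themes/demo/style.css"],
)
print(report)
```

With this sample, the theme stylesheet is reported as blocked for both agents, which is the kind of blind CSS block the checklist warns about. Python's parser applies the first matching rule in file order, which can differ from the longest-match evaluation some crawlers use, so treat the result as a hint rather than a verdict.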
Better Robots.txt support:
- Robots.txt checker for AI crawlers
- Robots.txt presence and validity check
- Search crawler access check
- WordPress crawl hygiene check
Block 2: AI crawler segmentation
A flat “allow AI” or “block AI” posture is usually too crude.
Check:
- Search crawlers are separated from training crawlers.
- User-triggered retrieval is not confused with background crawling.
- Archive bots are evaluated separately.
- SEO tool bots are evaluated separately.
- Abusive bots are not treated like legitimate AI search crawlers.
- GPTBot, OAI-SearchBot, ChatGPT-User, ClaudeBot, Claude-SearchBot, Claude-User, Google-Extended, PerplexityBot, and Applebot-Extended are not collapsed into one rule by accident.
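One way to keep these tokens from collapsing into a single rule is to decide per purpose and generate the groups from a mapping. The sketch below assumes a purpose label for each token and a hypothetical posture (block training, allow search and user-triggered retrieval); both should be checked against each vendor's current documentation before use.

```python
from urllib.robotparser import RobotFileParser

# Assumed purpose labels; vendors can change what a token means.
# Google-Extended and Applebot-Extended are treated here as training controls.
CRAWLER_PURPOSE = {
    "GPTBot": "training",
    "ClaudeBot": "training",
    "Google-Extended": "training",
    "Applebot-Extended": "training",
    "OAI-SearchBot": "search",
    "Claude-SearchBot": "search",
    "PerplexityBot": "search",
    "ChatGPT-User": "user-triggered",
    "Claude-User": "user-triggered",
}

# Hypothetical posture: one decision per purpose, not per vendor.
POSTURE = {
    "training": "Disallow: /",
    "search": "Allow: /",
    "user-triggered": "Allow: /",
}


def build_groups(purposes, posture):
    """Render one robots.txt group per purpose."""
    groups = []
    for purpose, rule in posture.items():
        agents = sorted(a for a, p in purposes.items() if p == purpose)
        lines = [f"# {purpose}"] + [f"User-agent: {a}" for a in agents] + [rule]
        groups.append("\n".join(lines))
    return "\n\n".join(groups) + "\n"


robots_text = build_groups(CRAWLER_PURPOSE, POSTURE)

# Re-parse the generated text to confirm each token gets the intended decision.
parser = RobotFileParser()
parser.parse(robots_text.splitlines())
decisions = {
    agent: parser.can_fetch(agent, "https://example.com/docs/")
    for agent in CRAWLER_PURPOSE
}
print(robots_text)
print(decisions)
```

Re-parsing the generated file is the useful part: it catches the case where an edit to one group silently changes the decision for a token in another category.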
Better Robots.txt support:
- AI crawler coverage check
- AI training vs AI search crawlers
- Block AI training without blocking AI search
Block 3: llms.txt and machine summary
llms.txt should orient machines. It should not pretend to control them.
Check:
- The file exists only if it is useful and maintained.
- It is available at the expected root location.
- It summarizes the site accurately.
- It links to priority pages, not every page.
- It includes product, support, policy, and governance surfaces where appropriate.
- It does not contradict robots.txt.
- It does not claim ranking impact, crawler obedience, or legal enforcement.
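A few of these checks lend themselves to a rough automated review. The sketch below assumes llms.txt uses markdown-style links, and it uses a hypothetical link budget and an illustrative list of overclaiming phrases; none of these thresholds come from a standard or from Better Robots.txt.

```python
import re
from urllib.robotparser import RobotFileParser

# Matches markdown links such as [Title](https://example.com/page/).
LINK_RE = re.compile(r"\[([^\]]+)\]\((https?://[^)\s]+)\)")

# Illustrative phrases only; adapt to the claims you want to rule out.
OVERCLAIM_PHRASES = ("guarantees ranking", "crawlers must", "legally binding")


def review_llms_txt(llms_text, robots_text, agent="*", max_links=25):
    """Flag link sprawl, robots.txt contradictions, and overclaiming wording."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    links = LINK_RE.findall(llms_text)
    contradicted = [url for _, url in links if not parser.can_fetch(agent, url)]
    lowered = llms_text.lower()
    return {
        "link_count": len(links),
        "too_many_links": len(links) > max_links,
        "contradicts_robots": contradicted,
        "overclaims": [p for p in OVERCLAIM_PHRASES if p in lowered],
    }


# Hypothetical files for illustration.
SAMPLE_LLMS = """\
# Example Site

> Plugin documentation and policy pages for Example Site.

## Docs
- [Getting started](https://example.com/docs/start/): installation steps
- [Internal notes](https://example.com/private/notes/): draft material
"""
SAMPLE_ROBOTS = "User-agent: *\nDisallow: /private/\n"

result = review_llms_txt(SAMPLE_LLMS, SAMPLE_ROBOTS)
print(result)
```

Here the second link is flagged because llms.txt points machines at a path that robots.txt disallows. Whether the summary is accurate still needs a human reviewer.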
Better Robots.txt support:
Block 4: AI governance files and policy coherence
The site should not publish isolated files that point in different directions.
Check:
- AI usage policy exists when the site needs one.
- Machine-readable governance files are discoverable.
- Source precedence is clear.
- Policy signals are not described as hard enforcement.
- Product capabilities are edition-qualified when needed.
- Unsupported claims are not widened by generic language.
- Public policy pages and machine files point to each other.
Better Robots.txt support:
- AI governance files checker
- What AI usage signals can and cannot do
- Signal vs enforcement for AI crawlers
- The machine governance file stack
Block 5: Source-page architecture
Machines need stable pages to retrieve, cite, and compare.
Check:
- The site has canonical definition pages.
- The site has comparison pages where users ask comparison questions.
- Product or service pages are not only promotional.
- Documentation pages answer practical implementation questions.
- Case studies include enough evidence to be useful.
- FAQs are not the only place where important claims appear.
- Internal links connect source pages together.
Better Robots.txt support:
Block 6: WordPress route hygiene
WordPress generates many public routes that may not deserve equal machine attention.
Check:
- Tag pages are not flooding the crawl surface.
- Date archives are useful or constrained.
- Author archives are useful or constrained.
- Attachment pages are controlled.
- Internal search pages are not indexable.
- WooCommerce cart, account, and checkout routes are treated correctly.
- Staging, demo, and utility paths are not exposed accidentally.
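To see how much of a crawl surface falls into these route families, URLs from a sitemap or log export can be bucketed by pattern. The sketch below assumes default WordPress and WooCommerce slugs (`/tag/`, `/author/`, `/cart/`, and so on); sites with custom permalink structures or translated slugs will need different patterns.

```python
import re
from urllib.parse import urlparse, parse_qs

# Assumed default slugs; adjust for custom permalinks or translated routes.
ROUTE_PATTERNS = [
    ("internal-search", re.compile(r"^/search/")),
    ("tag-archive", re.compile(r"^/tag/")),
    ("author-archive", re.compile(r"^/author/")),
    ("date-archive", re.compile(r"^/\d{4}/(\d{2}/)?(\d{2}/)?$")),
    ("woocommerce-private", re.compile(r"^/(cart|checkout|my-account)(/|$)")),
    ("staging-or-utility", re.compile(r"^/(staging|demo|wp-admin)(/|$)")),
]


def classify_route(url):
    """Return a route family label for a URL, or 'content' if none matches."""
    parsed = urlparse(url)
    # WordPress internal search uses the ?s= query parameter by default.
    if "s" in parse_qs(parsed.query, keep_blank_values=True):
        return "internal-search"
    for label, pattern in ROUTE_PATTERNS:
        if pattern.search(parsed.path):
            return label
    return "content"


# Hypothetical URLs for illustration.
urls = [
    "https://example.com/?s=robots",
    "https://example.com/tag/seo/",
    "https://example.com/2024/05/",
    "https://example.com/2024/05/my-post/",
    "https://example.com/checkout/",
]
labels = {url: classify_route(url) for url in urls}
for url, label in labels.items():
    print(f"{label:22} {url}")
```

The labels only describe where a URL sits. Whether a family should be disallowed in robots.txt, marked noindex, or left alone is a separate decision, since blocking crawling does not by itself remove a URL from an index.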
Better Robots.txt support:
- WordPress crawl hygiene check
- Robots.txt for WooCommerce
- Robots.txt for publishers
- Robots.txt for multilingual sites
Block 7: Accessibility tree and semantic controls
Agents often rely on structured representations of the page, not only visual screenshots.
Check:
- Buttons have meaningful labels.
- Links describe their destination.
- Form fields have labels.
- Error messages are programmatically connected to fields.
- Main navigation is understandable.
- Accordions, tabs, modals, and menus expose state clearly.
- Decorative content does not create semantic noise.
- Primary actions are not hidden behind visual-only cues.
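A static pass over the HTML can surface the most obvious naming gaps before a full accessibility audit. The sketch below uses only Python's standard `html.parser` and a deliberately simplified notion of "accessible name" (visible text, `alt` text on a nested image, `aria-label`, `aria-labelledby`, or an associated `<label>`). It is an approximation under those assumptions, not an implementation of the full accessible name computation, and the sample markup is hypothetical.

```python
from html.parser import HTMLParser

FIELD_TAGS = {"input", "select", "textarea"}
SKIPPED_INPUT_TYPES = {"hidden", "submit", "button", "reset", "image"}


class NameAudit(HTMLParser):
    """Collects controls that appear to lack an accessible name."""

    def __init__(self):
        super().__init__()
        self.label_targets = set()   # ids referenced by <label for="...">
        self.fields = []             # (tag, id, named_without_label_for)
        self.unnamed_controls = []   # buttons or links with no name found
        self._open = []              # stack of [tag, has_name] for open controls
        self._label_depth = 0        # > 0 while inside a wrapping <label>

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        named = bool(a.get("aria-label") or a.get("aria-labelledby"))
        if tag == "label":
            self._label_depth += 1
            if a.get("for"):
                self.label_targets.add(a["for"])
        elif tag in FIELD_TAGS:
            if (a.get("type") or "text").lower() not in SKIPPED_INPUT_TYPES:
                wrapped = self._label_depth > 0
                self.fields.append((tag, a.get("id"), named or wrapped))
        elif tag in ("button", "a"):
            self._open.append([tag, named])
        elif tag == "img" and a.get("alt") and self._open:
            self._open[-1][1] = True

    def handle_data(self, data):
        # Any visible text inside an open button or link counts as a name.
        if data.strip() and self._open:
            self._open[-1][1] = True

    def handle_endtag(self, tag):
        if tag == "label" and self._label_depth:
            self._label_depth -= 1
        elif self._open and self._open[-1][0] == tag:
            _, named = self._open.pop()
            if not named:
                self.unnamed_controls.append(tag)

    def unlabeled_fields(self):
        return [(tag, field_id) for tag, field_id, named in self.fields
                if not named and field_id not in self.label_targets]


# Hypothetical markup for illustration.
SAMPLE_HTML = """
<form>
  <label for="email">Email</label><input id="email" type="email">
  <input id="phone" type="tel" placeholder="Phone">
  <button type="submit"><svg></svg></button>
  <button type="submit">Send</button>
  <a href="/pricing/">See pricing</a>
</form>
"""

audit = NameAudit()
audit.feed(SAMPLE_HTML)
print("Unlabeled fields:", audit.unlabeled_fields())
print("Unnamed controls:", audit.unnamed_controls)
```

In this sample the phone field (placeholder only) and the icon-only button are flagged. A check like this runs on static HTML, so state exposed by JavaScript for accordions, tabs, and modals still needs testing in a real browser.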
Better Robots.txt support: limited. This block belongs mostly to front-end, accessibility, and UX remediation.
Block 8: Visual stability and dynamic behavior
An agent can misclick or misinterpret a page if the interface moves after the target is identified.
Check:
- Layout shifts are minimized.
- Images and embeds have dimensions.
- Font loading does not create severe shifts.
- Ads and consent banners do not move critical actions unpredictably.
- Lazy-loaded blocks do not change the meaning of the page.
- Sticky headers do not hide target elements.
- JavaScript-rendered content is still understandable.
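The "images and embeds have dimensions" check is easy to approximate statically. The sketch below flags `img`, `iframe`, and `video` elements that declare neither `width` and `height` attributes nor an inline `aspect-ratio`. It is a minimal heuristic: it cannot see space reserved by external stylesheets, and the sample markup is hypothetical.

```python
from html.parser import HTMLParser

SIZED_TAGS = ("img", "iframe", "video")


class DimensionAudit(HTMLParser):
    """Collects media elements with no declared size in the markup itself."""

    def __init__(self):
        super().__init__()
        self.missing = []  # (tag, src) pairs without reserved dimensions

    def handle_starttag(self, tag, attrs):
        if tag not in SIZED_TAGS:
            return
        a = dict(attrs)
        has_size = bool(a.get("width") and a.get("height"))
        # Inline aspect-ratio also lets the browser reserve space.
        has_ratio = "aspect-ratio" in (a.get("style") or "")
        if not (has_size or has_ratio):
            self.missing.append((tag, a.get("src") or ""))


# Hypothetical markup for illustration.
SAMPLE_HTML = (
    '<img src="/hero.jpg" width="1200" height="600">'
    '<img src="/logo.png">'
    '<iframe src="https://example.com/embed" style="aspect-ratio: 16 / 9"></iframe>'
)

dims = DimensionAudit()
dims.feed(SAMPLE_HTML)
print(dims.missing)
```

Only the logo is flagged here. Measured layout shift from a browser-based tool remains the better signal; this kind of scan just narrows down which templates to look at first.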
Better Robots.txt support: indirect. It can help avoid blocking required resources, but it does not fix layout stability.
Block 9: Forms, checkout, and contact workflows
Agentic readiness becomes real when a system has to act.
Check:
- Contact forms expose required fields clearly.
- Confirmation messages are visible and semantically clear.
- Anti-spam layers do not create unexplained dead ends.
- Checkout steps are stable and labelled.
- Errors can be understood without visual guessing.
- Consent banners do not block essential paths without accessible controls.
- The user can distinguish “submitted”, “failed”, and “needs action”.
Better Robots.txt support: none directly. This block requires form, UX, accessibility, and QA work.
Block 10: Logs, evidence, and change management
Readiness is not a one-time artifact.
Check:
- Crawler logs are reviewed.
- Important bots are identifiable.
- User-triggered fetchers are separated from background crawlers where possible.
- Changes to robots.txt, llms.txt, policies, sitemaps, and source pages are tracked.
- AI visibility tests are repeated under consistent conditions.
- Claims about AI recommendation or citation are documented with dates and surfaces.
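For the log review items, a first pass can group requests by the purpose of the declared user agent. The sketch below assumes the common "combined" access log format and reuses assumed purpose labels for a handful of tokens. The log lines and user-agent strings are simplified, made-up examples. Google-Extended and Applebot-Extended are left out on the assumption that they act as robots.txt control tokens rather than request user agents.

```python
import re
from collections import Counter

# Combined log format: ip, identity, user, [time], "request", status, size,
# "referer", "user agent".
LOG_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Assumed purpose labels; verify against each vendor's documentation.
AGENT_PURPOSE = {
    "GPTBot": "training",
    "ClaudeBot": "training",
    "OAI-SearchBot": "search",
    "Claude-SearchBot": "search",
    "PerplexityBot": "search",
    "ChatGPT-User": "user-triggered",
    "Claude-User": "user-triggered",
}


def summarize(lines):
    """Count requests per declared crawler purpose."""
    counts = Counter()
    for line in lines:
        match = LOG_RE.match(line)
        if not match:
            counts["unparsed"] += 1
            continue
        agent = match.group("agent").lower()
        purpose = next(
            (p for token, p in AGENT_PURPOSE.items() if token.lower() in agent),
            "other",
        )
        counts[purpose] += 1
    return counts


# Made-up log lines with simplified user-agent strings.
SAMPLE_LOG = [
    '203.0.113.5 - - [10/May/2025:10:00:00 +0000] "GET /docs/ HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '203.0.113.6 - - [10/May/2025:10:00:05 +0000] "GET /pricing/ HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (compatible; ChatGPT-User/1.0)"',
    '203.0.113.7 - - [10/May/2025:10:00:09 +0000] "GET / HTTP/1.1" 200 1024 "-" "Mozilla/5.0 (compatible; Claude-SearchBot/1.0)"',
    '198.51.100.9 - - [10/May/2025:10:01:00 +0000] "GET / HTTP/1.1" 200 1024 "-" "Mozilla/5.0 (X11; Linux x86_64) Firefox/126.0"',
]

counts = summarize(SAMPLE_LOG)
print(dict(counts))
```

A declared user agent is only a claim. Before recording a bot as identified, confirm the source address against whatever verification method the vendor publishes, and store the date and log source alongside the count so later comparisons run under consistent conditions.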
Better Robots.txt support:
- Read crawl logs and identify bots
- Verify AI agents in logs
- AI recommendation protocol
- Cross-AI recommendation case
What Better Robots.txt covers best
Use Better Robots.txt for:
- robots.txt generation;
- AI crawler segmentation;
- WordPress crawl hygiene;
- llms.txt publication and checking;
- AI governance file awareness;
- crawl-policy review;
- audit-to-action workflows.
Do not use it as a substitute for:
- accessibility remediation;
- front-end performance engineering;
- structured data strategy;
- content architecture;
- legal review;
- WAF enforcement;
- signed-agent verification;
- complete browser-agent testing.
30-minute implementation pass
A fast first pass looks like this:
- Run the free audit.
- Fix accidental search crawler blocks.
- Select or adjust a Better Robots.txt preset.
- Review AI crawler rules by purpose.
- Add or improve llms.txt only if the file can be meaningful.
- Check whether governance files exist and are coherent.
- Identify 5 priority source pages.
- Run Lighthouse Agentic Browsing.
- Note accessibility, CLS, and interaction risks.
- Re-audit after publishing.