Spam, Feeds & Crawl Traps
Location: Step 7 — Spam, Feeds & Crawl Traps
This step groups five cleanup toggles that target low-value or high-noise URL patterns common on WordPress sites. The goal is to reduce crawl waste without removing useful discovery.
What this step controls
Five toggles:
- Block Feed Crawlers: disallows feed URLs that no longer add discoverability value.
- Block Author Archives: disallows /author/* paths that often duplicate content already indexed elsewhere.
- Block Comment Spam Params: disallows query strings that comment spam tooling tends to generate.
- Block WordPress Search URLs: disallows the internal ?s= search URL pattern that creates infinite low-value crawl paths.
- Block Common Trap Parameters: disallows the most common parameter-based crawl traps (faceted-style URL variations).
When enabled, each toggle emits Disallow rules under User-agent: *.
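As a rough illustration of what the toggles produce, the sketch below builds a robots.txt group from a set of enabled toggles. The toggle keys and every Disallow pattern in it are assumptions chosen for illustration (typical WordPress paths and parameters); the rules the tool actually emits may differ.

```python
# Hypothetical mapping from toggle name to Disallow patterns.
# These patterns are illustrative assumptions, not the tool's actual rule list.
TOGGLE_RULES = {
    "block_feed_crawlers": ["/feed/", "/*/feed/"],
    "block_author_archives": ["/author/"],
    "block_comment_spam_params": ["/*?replytocom="],
    "block_wp_search_urls": ["/*?s="],
    "block_common_trap_params": ["/*?orderby=", "/*?filter="],
}


def build_robots_group(enabled):
    """Return a robots.txt group with one Disallow line per pattern of each enabled toggle."""
    lines = ["User-agent: *"]
    for toggle, patterns in TOGGLE_RULES.items():
        if toggle in enabled:
            lines.extend(f"Disallow: {p}" for p in patterns)
    return "\n".join(lines)


print(build_robots_group({"block_author_archives", "block_wp_search_urls"}))
# User-agent: *
# Disallow: /author/
# Disallow: /*?s=
```

A toggle that is off contributes no lines, so with every toggle disabled the group contains only the User-agent line.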
How to decide
Use the default cleanup (most toggles on) when:
- the site is a standard WordPress install with no specific reason to expose feeds or author archives;
- crawl-log analysis shows wasted budget on duplicate or low-value URLs.
Leave Block Feed Crawlers off when:
- the site actively publishes RSS or Atom feeds for discovery (newsrooms, podcasts, syndication partners).
Leave Block Author Archives off when:
- author pages are part of the editorial brand and have unique long-form content;
- author archives are intentionally indexed for E-E-A-T signals.
Leave the search and trap parameter blocks off only if a specific campaign or integration depends on those URLs being crawlable.
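The decision rules above can be summarized as a small function. This is only a sketch: the SiteProfile flags and toggle keys are hypothetical names invented for the example, and it assumes the default cleanup starts with every toggle on.

```python
from dataclasses import dataclass


@dataclass
class SiteProfile:
    # Hypothetical flags summarizing the questions in this section.
    publishes_feeds_for_discovery: bool = False
    author_pages_are_editorial: bool = False
    integration_needs_search_urls: bool = False
    integration_needs_param_urls: bool = False


def recommend_toggles(site):
    """Start from the default cleanup (all on) and switch off the documented exceptions."""
    return {
        "block_feed_crawlers": not site.publishes_feeds_for_discovery,
        "block_author_archives": not site.author_pages_are_editorial,
        # The section lists no exception for this toggle, so it stays on.
        "block_comment_spam_params": True,
        "block_wp_search_urls": not site.integration_needs_search_urls,
        "block_common_trap_params": not site.integration_needs_param_urls,
    }


# Example: a podcast site that relies on its RSS feed.
print(recommend_toggles(SiteProfile(publishes_feeds_for_discovery=True)))
```

For the podcast example, only the feed toggle comes back off; the other four stay enabled.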
What this step does not do
This step does not:
- prevent comment spam itself: a moderation layer (Akismet, manual review, custom firewall rules) handles content spam;
- block bots that ignore robots.txt;
- guarantee that all parameter-based crawl traps are covered. The list is curated; an unusual stack may still need custom rules.
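Because the curated list cannot cover every stack, it can help to test URLs from your own crawl logs against the rules in effect. The sketch below is a simplified matcher for the * and $ wildcards described in RFC 9309. The sample rules and URLs are hypothetical, and the matcher deliberately ignores Allow rules and longest-match precedence.

```python
import re


def rule_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end of the URL.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def uncovered(urls, disallow_patterns):
    """Return the URL paths (with query string) that no Disallow pattern matches."""
    regexes = [rule_to_regex(p) for p in disallow_patterns]
    return [u for u in urls if not any(r.match(u) for r in regexes)]


# Hypothetical rules and log samples, for illustration only.
rules = ["/author/", "/*?s="]
sample = ["/author/jane/", "/?s=shoes", "/shop/?color=red&size=m", "/blog/post/"]
print(uncovered(sample, rules))
# ['/shop/?color=red&size=m', '/blog/post/']
```

Not every uncovered URL is a trap: here /blog/post/ is ordinary content, while the faceted /shop/ URL is the kind of pattern that might call for a custom rule.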
Plan tier
- Free: the core toggles are available.
- Pro / Premium: advanced spam and crawl-trap protection covers a broader pattern list and edge cases.