This is the observation we arrived at after completing more than a thousand website audits. The problem mainly concerns sites running WordPress, although we have observed it on other, less accessible CMS platforms as well.
In fact, the term “blocked” is a bit strong: it is really a recurring problem that limits a site's ability to get the most out of its content in search engines (you can run the test a little further down in this article).
And when we talk about a recurring problem, there is no need to blame WordPress. The culprit is rather a systematic lack of configuration, caused either by ignorance of, or by simply forgetting, a feature specific to every website, yet one whose impact on a site's ability to appear in search results is more than significant. It is a matter of search engine optimization (SEO).
Read this article: “How to optimize my robots.txt”
So what is it about?
Concretely, it is a simple file called robots.txt, located at the “root” of the site (on the hosting server). It contains instructions for crawlers (search engine bots), specifying which pages or sections of the site they may or may not index, and it happens to be the very first file search engine bots read when they visit your site.
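As an illustration, here is what a minimal robots.txt can look like (a hypothetical example; the paths and the sitemap URL are placeholders, not a recommended configuration for your site):

```
User-agent: *
Allow: /wp-admin/admin-ajax.php
Disallow: /wp-admin/

Sitemap: https://www.example.com/sitemap.xml
```

The User-agent line states which crawlers the rules apply to (* means all of them), each Allow/Disallow line grants or refuses access to a path, and the Sitemap line tells the bots where to find the list of indexable content.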
At this point, you are probably saying that you have never heard of this file and, above all, that you do not understand why it could be fundamentally important. If that is the case, which would be entirely normal, you are part of the 95% (see the title).
To understand what this is all about, you have to accept the fact that the visibility of your website depends strictly on what search engines have understood of your content. And, paradoxically, there is a common belief that because you have created a new page on your site or added a new product to your shop, it will automatically appear in the search engine results pages (SERPs). Nothing could be further from the truth!
In fact, for your new content to appear in search results, it must first have been indexed by search engines. And before any actual indexing takes place, those search engines need to be aware that new content exists, and they need permission to read it!
We are far beyond the basic optimizations of tags, META data, content, etc. that you can read about here and there on the Web, which can be executed to perfection but would be absolutely useless if search engines were unable to read your pages!
That is why your website (in fact, every website) should have a file, the robots.txt, whose one and only mission is to communicate with search engine bots (crawlers).
And today, more than 95% of WordPress websites are in one of the following situations:
- They do not have a robots.txt at all;
- The robots.txt is active in its most “basic” form and is useless;
- The robots.txt is incomplete (see below);
- The robots.txt prevents any indexing of content by search engines (the worst situation).
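To make the two middle situations concrete, here are two hypothetical examples. The first is the “basic” form (essentially the virtual file WordPress generates when no physical robots.txt exists): technically valid, but it tells search engines nothing about your sitemap or your site's specifics. The second is the worst-case situation: two lines that forbid every crawler from reading the entire site:

```
# "Basic" form — valid, but adds nothing for SEO
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
```

```
# Worst case — blocks ALL crawlers from the ENTIRE site
User-agent: *
Disallow: /
```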
So what does your Robots.txt look like?
Please copy and paste the URL of your website (with http:// or https://) into the field below; this will tell you, first, whether you have a robots.txt at all and, if so, what it contains:
For comparison, here is ours: https://better-robots.com/robots.txt. Do you see the difference?
What is the purpose of Robots.txt?
As mentioned above, the robots.txt is probably the most important file on your website in terms of SEO (Search Engine Optimization). Its role is to:
- Allow (or deny) search engines access to your website (which is no small matter)
- Allow or prohibit access to certain parts of your website (to avoid publishing sensitive internal information)
- Announce the presence of new “indexable” content (via the sitemap)
- Define the crawl budget of search engine bots on your website (to avoid overloading your server)
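To see how a crawler actually applies these rules, you can sketch the process with Python's standard-library robots.txt parser (`urllib.robotparser`). The file content and the example.com URLs below are hypothetical, purely for illustration:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (example.com is a placeholder domain)
robots_txt = """\
User-agent: *
Allow: /wp-admin/admin-ajax.php
Disallow: /wp-admin/
Disallow: /private/

Sitemap: https://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# Regular content is crawlable...
print(parser.can_fetch("*", "https://www.example.com/blog/new-post/"))           # True
# ...but the disallowed sections are not
print(parser.can_fetch("*", "https://www.example.com/private/report.pdf"))       # False
# The Allow line carves out an exception inside /wp-admin/
print(parser.can_fetch("*", "https://www.example.com/wp-admin/admin-ajax.php"))  # True
```

On Python 3.8+, `parser.site_maps()` also returns the Sitemap URLs declared in the file, which is how the announcement of new indexable content reaches the bots.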
Do you understand how important it is?
How to solve this problem?
There are two possible answers to this question: either you are a bit of a geek and can optimize your robots.txt file yourself (being careful not to make the situation worse), or you can use the “Better Robots.txt” plugin, developed by our team, which performs this optimization in a few clicks and in a safe way.
At first glance, you may feel like we are steering you toward the second option. Perhaps that is true, but it is above all to keep you from complicating your situation further and making your current organic ranking even worse than it is. As we said, this file is very important because it communicates directly with search engines, and a single typo is enough to block all access to your content.
That’s why we designed two versions of this plugin.
- A free (and limited) version, accessible and downloadable from your WordPress dashboard, allowing you to test the plugin and assess its effectiveness (go to “Plugins”, then “Add New”; in the search bar, enter “Better Robots.txt” and click “Search”. Once found, click “Install”, then “Activate”. You will then be redirected to the plugin's main page, where you can define your settings.)
- A premium version ($48.99), accessible only from this website: https://better-robots.com/product/better-robots-txt-pro/, which gives you access to all optimization features (Bad Bot blocker, sitemap integration, etc.).
To find out how to optimize Better Robots.txt, read this article.
How did we get here?
It was both experience and daily practice that led us to wonder how we could solve this problem without having to log into every website individually.
PAGUP, the entity behind Better Robots.txt, is an SEO agency (Canada) 100% specialized in “on-page” optimization for search engines. After completing a very large number of SEO website audits, we noticed this “sub-optimization” recurring on most websites, even those built by established agencies or internationally renowned companies.
It was within the framework of our optimization services that we saw the almost immediate impact an optimal configuration of this file could generate. Our optimization process, initially dedicated to WordPress websites, consisted of two steps: a first, more technical part at the beginning of the mandate, consisting of correcting recurring technical errors (which included the robots.txt and the sitemap), and a second part, a little later (2-3 weeks), strictly content-oriented. And it is by monitoring the organic ranking of these websites daily that we observed that, in the vast majority of cases, a few weeks after the first part of the mandate (and without touching the second stage, i.e. the content), these websites had gained considerably more ranking (more positions occupied in the SERPs), with, for some, a 150% increase in ranked keywords.
After more than 250 similar observations under the same circumstances, we came to the conclusion that a well-configured, optimized robots.txt can have a massive and significant impact on organic performance (SEO). And because, at the time, there was no solution available on WordPress to simplify this optimization process for as many people as possible, we decided to create Better Robots.txt.