
Protect your data and block malicious robots

Did you know?

Almost 50% of the world’s Web traffic is not human! When it is not simply fake traffic, it consists essentially of robots visiting your website. So, on average, one visit in two to your website has absolutely nothing human about it.

But what are they?

According to a recent study carried out by Incapsula, a significant share of Web traffic (48.2%) is generated by robots of all types (thousands of them): crawlers, monitoring bots, commercial robots, scrapers, Trojans, and more, each with a very specific role.

In fact, there are two categories of robots:

  • the positives (“the good ones”)
  • the negatives (“the bad ones”)

Regarding the positives: these are essentially search engines (like Google, Bing, etc.) that read your content by following your links, in order to understand the information and match it with organic search queries. There are also “mobile UX” bots, which transfer content from websites to mobile or web apps. They can also be crawling robots, which scan the code of your site to extract information for study purposes (such as performance), or commercial robots, which analyze content, images, text, or prices for comparison purposes.

From https://www.incapsula.com/blog/bot-traffic-report-2016.html

As such, you may not know it, but if you run a reasonably popular online store with thousands of products, it is likely to be frequently “analyzed” by robots other than Google or Bing. They retrieve your information, such as the names and prices of your products, for comparison or price-adjustment purposes on behalf of big brands like Amazon (or even a competitor). The idea is to obtain a large amount of information about your products very quickly and on a frequent basis (rather than manually, page by page), in order to implement adapted marketing strategies (based on your new product offering) or a more aggressive pricing policy.

At this level, whatever one may think of this last practice, the robots behind these maneuvers are not considered bad, because they only analyze and extract information from your site (without republishing it) while following certain minimum ethical rules.

Concerning the negatives: these are essentially “scrapers”, robots intended to extract your information and data in order to republish them on another website (without your permission), and “impersonators”, robots that use fake identities to bypass website security. There are also “spammers”, which inject spam links into forums, articles, and reviews to create backlinks. And finally, there are “scavengers”, which probe sites to identify security vulnerabilities and exploit them.

What you need to know is that, on average, 95% of websites have already been “attacked” by robots.

So what do we do with this information?

Protect yourself, yes, but without becoming paranoid! At present, it is virtually impossible to be protected from all the malicious robots roaming the Web every day. Even the world’s largest Web operators, despite a heightened level of protection, are attacked, copied, and otherwise harmed. The idea here is to become aware of this and to start somewhere. That’s why Better Robots.txt offers an advanced feature to protect your data by blocking a selection of the most popular malicious robots (essentially scrapers).

Better Robots.txt relies on Distil Networks’ analysis results to keep its database and its plugin up to date, giving you a first level of protection against some of these robots (30 of them).
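To give an idea of what blocking a robot through robots.txt looks like, here is a minimal Python sketch. It is not the plugin’s actual code: the bot names (`ExampleScraperBot`, `ExampleImpersonator`) and the helper functions are hypothetical and only serve as an illustration. It builds one blocking rule per robot, then checks the result with Python’s standard `urllib.robotparser` module.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical user-agent names, for illustration only.
# They are NOT the list of robots blocked by the plugin.
BLOCKED_BOTS = ["ExampleScraperBot", "ExampleImpersonator"]


def build_block_rules(bots):
    """Return robots.txt text that disallows the whole site for each bot."""
    groups = [f"User-agent: {bot}\nDisallow: /" for bot in bots]
    # Every other robot stays allowed everywhere (an empty Disallow).
    groups.append("User-agent: *\nDisallow:")
    return "\n\n".join(groups) + "\n"


def is_allowed(robots_txt, user_agent, url):
    """Check a URL against robots.txt text with the standard-library parser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


rules = build_block_rules(BLOCKED_BOTS)
print(rules)
print(is_allowed(rules, "ExampleScraperBot", "https://example.com/shop/"))
print(is_allowed(rules, "Googlebot", "https://example.com/shop/"))
```

Keep in mind that robots.txt is a set of instructions that robots choose to follow. A robot that ignores the file will not be stopped by it, which is why this is only a first level of protection.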


So what does your Robots.txt look like?

Please copy and paste the URL of your website (with http:// or https://) into the field below. It will first check whether you have a robots.txt file and, if so, show you its content:

[robotschecker]
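If you prefer to check this yourself, the same idea can be sketched in a few lines of Python. This is an illustration under simple assumptions, not the code behind the checker above: the function names `robots_url` and `fetch_robots` are hypothetical, and a missing file is assumed to be reported by the server as a 404 error.

```python
from urllib.error import HTTPError
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen


def robots_url(site_url):
    """Build the robots.txt URL for a site; the file lives at the site root."""
    parts = urlsplit(site_url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError("URL must start with http:// or https://")
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def fetch_robots(site_url, timeout=10):
    """Return the robots.txt content as text, or None if the site has none."""
    try:
        with urlopen(robots_url(site_url), timeout=timeout) as response:
            return response.read().decode("utf-8", errors="replace")
    except HTTPError as error:
        if error.code == 404:
            return None  # assumption: 404 means there is no robots.txt
        raise
```

For example, `fetch_robots("https://example.com")` would request `https://example.com/robots.txt` and return its text, or `None` when the server answers with a 404.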
