Google sending incorrect Property Health Warnings about robots.txt
If you have recently had a property health warning in Google Search Console (formerly known as Webmaster Tools), you may be concerned that your home page is being blocked by your robots.txt file. Don’t worry: it probably isn’t.
Yesterday, 18/02/16, we saw a number of alerts:
Further inspection suggested that robots.txt might be blocking an important page:
In every domain’s case, it was the home page.
However, in every case the robots.txt had not been amended for some time, and the sites did not contain the de-indexing directive:
And in every case, the home page was present and correct in the SERPs.
So, initial panic over, but why were we receiving these alarming alerts?
When using the robots.txt tester within Search Console, the tool suggested that wildcard directives using the * character were the culprits:
However, using Screaming Frog, I was able to confirm that the robots.txt was being adhered to as expected. This was confirmed, to a degree, by all the right pages being present when checking each domain with the “site:” search operator in Google.
So I started a discussion on Google+ to get to the bottom of it. The general consensus was that any directive should really start with a /, which makes sense (even though it still works without). Luckily for me, John Mueller got involved:
I wasn’t having it, though, so I followed up with more testing and provided some further insight, which he acknowledged:
Which would make sense – if Google follows the directives in the actual robots.txt, you would expect the same behaviour in the testing tool.
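To illustrate why a leading / should not change what gets blocked, here is a minimal Python sketch of wildcard matching. It is a simplified approximation of the behaviour Google documents (* matches any run of characters, a trailing $ anchors the end of the URL, everything else is a prefix match), not Google’s actual parser. The rule `*?sort=` and the example paths are hypothetical and not taken from any of our sites.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Return True if a robots.txt path pattern matches a URL path.

    Simplified model: '*' matches any sequence of characters, a trailing
    '$' anchors the end of the path, and otherwise the pattern only needs
    to match from the start of the path (prefix match).
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    return re.match(regex, path) is not None


# Hypothetical paths: every URL path starts with '/', so the two forms
# of the rule should agree on all of them.
for path in ["/", "/shoes?sort=price", "/shoes/red", "/basket/checkout"]:
    without_slash = rule_matches("*?sort=", path)
    with_slash = rule_matches("/*?sort=", path)
    print(path, without_slash, with_slash)
```

Under this simplified model, both forms of the rule match the same paths, and neither matches the home page (/), which is consistent with the home page still being present in the SERPs.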
The tech SEO takeaway from all this is:
If you do use wildcard directives in robots.txt, start them with a / where applicable.
This is the course of action we will be taking across all our sites that have this warning.
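If you have a lot of sites to update, the change can be scripted. The following is a rough Python sketch, assuming the only lines that need touching are Allow and Disallow directives whose path begins with *. The function name `add_leading_slash` and the sample file contents are made up for illustration, so check the output by hand before deploying it.

```python
def add_leading_slash(robots_txt: str) -> str:
    """Prefix '/' to Allow/Disallow paths that start with a '*' wildcard.

    All other lines (User-agent, Sitemap, comments, blank lines, and
    directives that already start with '/') are returned unchanged.
    """
    fixed_lines = []
    for line in robots_txt.splitlines():
        # Split on the first ':' only, so URLs in Sitemap lines stay intact.
        field, sep, value = line.partition(":")
        path = value.strip()
        is_path_rule = field.strip().lower() in ("allow", "disallow")
        if sep and is_path_rule and path.startswith("*"):
            line = f"{field.strip()}: /{path}"
        fixed_lines.append(line)
    return "\n".join(fixed_lines)


# Hypothetical robots.txt content, for illustration only.
sample = "User-agent: *\nDisallow: *?sort=\nDisallow: /private/"
print(add_leading_slash(sample))
```

Only the second line of the sample is rewritten, to `Disallow: /*?sort=`. The `User-agent: *` line is left alone because its * is not a path pattern.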
But, as I said, it’s no real cause for concern: your home page is not being blocked; it’s just a bug.
And it’s not the only one at the moment: yesterday there was also a spike in “incorrect hreflang implementation” notifications sent out.
Looks to me like someone at Search Console HQ broke something yesterday.