How to fix robots.txt file error in my GWT?

google-search-console robots.txt crawl-errors

7,843

Solution 1

If you have a message saying that "Google couldn't crawl your site because we were unable to access the robots.txt file". Then it is not the contents of the robots.txt file that is in question, it is that Google simply couldn't access the file. And when Google can't access a robots.txt file then it won't crawl the site.

Using fetch as Googlebot in Webmaster Tools is a good idea. If your robots.txt file fetches successfully then it could be a past issue. If not, then you obviously need to look further to ensure Googlebot access.

Solution 2

There is no such official command as Allow in robots.txt. By default, everything is allowed. (However, it is possible to use Allow to give exceptions when you are disallowing many directory paths in one route. Often, there is no requirement for this though).

Not that I would expect it to cause an issue however.

There is no reason to specify the Mediapartners-Google user agent either as this too, is just saying allow the crawling of everything.

All your robots.txt needs from the above is the following:-

User-Agent: *
Disallow: /search/

User-agent: Mediapartners-Google
Disallow: /

Sitemap: http://latest-seo-news-updates.blogspot.com/feeds/posts/default?orderby=UPDATED

Google Webmaster Tools will report a warning to say X amount of URL's on your site were blocked by your robots.txt if you are disallowing bots to any part of your site, in which case you are at /search/. You can expand this notice to view specifically what URL's were blocked and you may well find that it is only the ones you want disallowed that Google Webmaster Tools is warning about.

You can also run an application such as Xenu to crawl your site and establish what URL's can be crawled specifically. You can also fetch as Googlebot and test your robots.txt file from within Google Webmaster Tools that will alert you to any further issues or at least complete details of any issues.

Edit: Upon further clarification, added Disallow directive for UA Mediapartners-Google.

7,843

Sathiya Kumar V M

Sathiya Kumar V M - An Enthusiastic Senior Database Developer especially in MS SQL Server, BI Developer, Specialist in SSIS, SSRS having 9+ years IT Experience. Started my career in 2012 @ContempoTech(Search Engine Genie) a startup company gave me a good platform for my SEO career. With the 1 year experience I joined Sulekha New Media Pvt ltd(Sulekha.com) where I learned SEO as well as SEM. Later I got a chance to join Servion Global Solutions as Database Engineer where I worked for nearly 7 years as Senior Database Developer as well as Reporting Developer. Currently working with BajajFinServ as Senior Technical Specialist who is working in Contact Center IT. I have my own blogs: http://sqlservertutorialspoint.blogspot.com/ - SQL Server Tutorials Point Blog http://latest-seo-news-updates.blogspot.com/ - Info about SEO http://mytamilkavithaigal.blogspot.in/ - Tamil Language Poets and Quotes Reach out me through my following profiles: Wikipedia - https://en.wikipedia.org/wiki/User:SathiyaKumarVM Twitter - http://twitter.com/SathiyaKumarseo Facebook - https://www.facebook.com/sathiyakumarseo Google+ - https://plus.google.com/+SathiyaKumarSEO About.me - https://www.about.me/sathiya.kumar Quora - http://www.quora.com/Sathiya-Kumar

Updated on September 18, 2022

Comments

Sathiya Kumar V M over 1 year
In my blog’s Webmaster Tools, there is a notification in Crawl Errors section, that is Google couldn't crawl your site because we were unable to access the robots.txt file.

My blog’s robots.txt file is:
```
User-agent: Mediapartners-Google
Disallow: 
User-agent: *
Disallow: /search
Allow: /

Sitemap: http://example.blogspot.com/feeds/posts/default?orderby=UPDATED
```
I don’t think the above file details are wrong but I don’t understand why I received such dangerous notification.
- How can I fix this issue?
- Marian Popovych almost 11 years
  
  1. "User-agent: Mediapartners-Google Disallow:" What does this section mean? 2. "Google couldn't crawl your site because we were unable to access the robots.txt file." Probably, it means your server did not return your robots.txt file while Google had tried to get it.
- Sathiya Kumar V M almost 11 years
  
  It only disallow the media-partners of Google like adsence.. Does it is wrong?
- Marian Popovych almost 11 years
  
  As for me, you should write User-agent: Mediapartners-Google* Disallow: But it is not a thing. If I were in your shoes, I would check server logs and look for Goolgebot queries to /robots.txt
- Zistoloen almost 11 years
  
  Did you check encoding of your file? UTF-8 is recommended by Google.
- MrWhite almost 11 years
  
  @Marian: You should not include an asterisk on the end of the user agent (this is not a wildcard operator in this context - the lone * is a special case). John: Disallow: by itself (without a path element) does not disallow the robot - this directive will be ignored, so it will in fact allow the robot!
MrWhite almost 11 years

Sorry, I deleted my comment! ... It would seem from the comments above that the OP does actually want to disallow the Mediapartners-Google robot, so the Disallow: directive (without a path element) in the question would seem to be incorrect. It should read Disallow: / (with a slash).
zigojacko almost 11 years

Agreed - answer updated. Thanks for pointing out clarification as per comments.