Google Retires The Googlebot-News Bot

Today, Google announced that they will no longer be crawling news sites with Googlebot-News and instead will crawl news sites with Googlebot, the same bot that crawls sites for web search. However, you can still block your content from being indexed in Google News by disallowing Googlebot-News in robots.txt or using a meta robots tag.

Blocking Content From Google News

Seem confusing? On the one hand, it’s not at all.

If you want Google to index your content in both web search and News (if you are a Google News publisher), then you don’t need to do anything. Google will keep crawling as it always has, but if you look at your server logs, you’ll only see entries for Googlebot rather than entries for both Googlebot and Googlebot-News.

If you want to keep your content out of Google News, you can keep using the Disallow directive in robots.txt (or a meta robots tag) to block Googlebot-News. Even though Google will now crawl as Googlebot rather than Googlebot-News, they’ll still respect the Googlebot-News robots.txt directive.
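As a rough illustration of what such a rule looks like, the sketch below feeds a hypothetical robots.txt to Python's standard-library `urllib.robotparser` and asks whether each user agent may fetch a URL. The robots.txt contents and the example.com URL are assumptions made up for this example, and the result shows how Python's parser reads the file, which is not necessarily how Google evaluates it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep the whole site out of Google News
# while leaving crawling for web search unrestricted.
ROBOTS_TXT = """\
User-agent: Googlebot-News
Disallow: /

User-agent: *
Disallow:
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
url = "http://www.example.com/news/story.html"  # placeholder URL

# The Googlebot-News group disallows everything; Googlebot falls
# through to the catch-all group, which disallows nothing.
news_allowed = parser.can_fetch("Googlebot-News", url)
web_allowed = parser.can_fetch("Googlebot", url)
print(news_allowed, web_allowed)
```

A check like this can help catch a rule that blocks Googlebot-News by accident before the file goes live.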

However, you can no longer disallow Googlebot while allowing Googlebot-News, as you could before this change and as you still can for other specialized Googlebots.

Gathering Data About How Your Site Is Crawled

On the other hand, this change makes things a lot more confusing if you’re using data to understand how your site is crawled and make improvements.

For instance, if you notice that your news articles aren’t being indexed in Google News and you check the news-specific crawl errors in Google Webmaster Tools and don’t see any problems, you can no longer check your server logs to see if those articles are being crawled for the news index. You can see if the pages are being crawled generally, but this less granular insight makes it tougher to troubleshoot problems.

In this example, you may be generating a news-specific Sitemap and that generation process may be missing specific URLs. You used to be able to review your server logs, see that Googlebot-News was crawling particular URLs but not others, and then check to see if the URLs that hadn’t been crawled were in the Sitemap. Now, all the server logs will tell you is whether Google is crawling the URLs at all. If they are being crawled for web search but not News, that detail is now lost.
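The check that is still possible, comparing the URLs in a Sitemap against the URLs Googlebot requested, might be sketched as follows. This assumes Combined Log Format access logs. The function names, sample log lines, IP addresses, and paths are all hypothetical and exist only for illustration.

```python
import re

# Combined Log Format: the request line and the user agent are quoted fields.
LOG_PATTERN = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_paths(log_lines):
    """Return the set of paths requested by a user agent containing 'Googlebot'."""
    paths = set()
    for line in log_lines:
        match = LOG_PATTERN.search(line)
        if match and "Googlebot" in match.group("agent"):
            paths.add(match.group("path"))
    return paths


def uncrawled(sitemap_paths, log_lines):
    """Return Sitemap paths with no Googlebot request in the logs, sorted."""
    return sorted(set(sitemap_paths) - googlebot_paths(log_lines))


# Hypothetical sample data, for illustration only.
SAMPLE_LOG = [
    '192.0.2.10 - - [01/Jan/2011:10:00:00 +0000] "GET /news/a.html HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; '
    '+http://www.google.com/bot.html)"',
    '203.0.113.9 - - [01/Jan/2011:10:00:05 +0000] "GET /news/b.html HTTP/1.1" '
    '200 4096 "-" "Mozilla/5.0 (Windows NT 6.1)"',
]
SITEMAP_PATHS = ["/news/a.html", "/news/b.html", "/news/c.html"]

missing = uncrawled(SITEMAP_PATHS, SAMPLE_LOG)
print(missing)
```

Because every request now arrives as Googlebot, a path missing from this list was not crawled at all, but a path present in the logs may have been crawled for web search, for News, or for both.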

You lose granular insight for web search as well. If you are tracking down why particular pages on your site aren’t indexed, you could previously review your server logs to see if they were being crawled, but now it will appear as though they are, even if they are only being crawled for Google News.

You can still get News-specific and web-specific crawl errors from Google Webmaster Tools, so some insight is still available. In terms of granularity, Google tells me that the “URLs restricted by robots.txt” report in Google Webmaster Tools includes only the pages blocked from web search and not URLs blocked from Google News.

However, it doesn’t sound like you can currently see a list of URLs Google tried to crawl but didn’t because Googlebot-News was blocked, and unfortunately the robots.txt analysis tool in Google Webmaster Tools doesn’t let you test URLs blocked in Google News separately from web search. So it would be tough to determine if you were accidentally blocking URLs from indexing in Google News.

This change seems like a bit of a step backward to me. When Google News was first launched, Googlebot crawled for both web search and News, and news publishers asked for a news-specific bot. Certainly, the most important reason for this was the ability to block and allow content from Google News separately from web search, and that functionality remains. However, the granular insight available was useful as well, and it’s unfortunate that it will now be lost.

Opinions expressed in the article are those of the guest author and not necessarily Search Engine Land.


About The Author: Vanessa Fox is a Contributing Editor at Search Engine Land. She built Google Webmaster Central and went on to found software and consulting company Nine By Blue and create Blueprint Search Analytics, which she later sold. Her book, Marketing in the Age of Google (updated edition, May 2012), provides a foundation for incorporating search strategy into organizations of all levels. Follow her on Twitter at @vanessafox.

  • ludwig coenen

    would be interesting to know, if this means that the behaviour / capabilities of the bot have changed as well.

    the google news crawling process seemed always to be focused on speed and not a precise DC detection for instance. maybe that changes, when google bot does the crawling.

    or is that not very likely to happen?
