Subscribe Via Web Feed Subscribe with Google Add to My Yahoo! Subscribe with Bloglines Add to netvibes Subscribe with Live.com

« Google Defines IP Delivery, Geolocation, & Cloaking | Main | Microsoft Wants To Replace "404" Error With Live Search Results »

Jun. 3, 2008 at 12:11pm Eastern by Vanessa Fox

Yahoo!, Google, Microsoft Clarify Robots.txt Support

Today, Google, Yahoo!, and Microsoft have come together to post details of how each of them support robots.txt and the robots meta tag. While their posts use terms like "collaboration" and "working together," they haven't joined together to implement a new standard (as they did with sitemaps.org). Rather, they are simply making a joint stand in messaging that robots.txt is the standard way of blocking search engine robot access to web sites. They have identified a core set of robots.txt and robots meta tag directives that all three engines support:

Google and Yahoo! already supported and documented each of the core directives, and Microsoft supported most of them before this announcement. In their posts, they also list the directives they support that may not be supported by the other engines.

For robots.txt, they all support:


  • Disallow

  • Allow

  • Use of wildcards

  • Sitemap location

For robots meta tags, they all support:


  • noindex

  • nofollow

  • noarchive

  • nosnippet
  • noodpt

With this announcement, Microsoft appears to be adding support for the use of * wildcards (which will go live later this month) and the Allow directive. The biggest discrepancy is with the crawl-delay directive. Yahoo! and Microsoft support it, while Google does not (although Google does support control of crawl speed via Webmaster Tools ).



This isn't the first time the major search engines have come together for an announcement regarding how they support publishers. In late 2006, all three joined together to support XML Sitemaps and launched sitemaps.org, followed in April 2007 with support for Sitemaps autodiscovery in robots.txt, and in February 2008 with more support for more flexible storage locations of Sitemap files. In early 2005, the engines declared support for the nofollow attribute on links (in an effort to combat comment spam).



Why are the search engines coming together to talk about their varied support for traditional methods for blocking access to web content? A Microsoft spokesperson told me that while robots.txt has been the de facto standard for some time, the search engines had never come together to detail how they support it and said the aim is to "make REP more intuitive and friendly to even more publishers on the web." Google similarly said that "doing a joint post allows webmasters to see how we all honor REP directives, the majority of which are identical, but we also call out those that are not used by all of us."

Yahoo! told me:

Our goal is to come out with clear information about the actual support around REP for all engines. We have all separately at different times reported our support and this creates a long trail hard for anyone to put together. Posting the same spec at the same time provides a sync point for everyone as to the actual similarities or differences between our implementations for all engines. We are trying to address the latent concerns around differences across the engines.

Of course, each engine has provided documentation in their respective help centers for some time, and Google and Microsoft provide robots.txt analysis tools that detail how they interpret a file in their webmaster tools, so while they haven't documented their support jointly, the documentation itself isn't new.



This move may be an effort to show a consolidated front in light of the ongoing publisher attempts to create new search engine access standards with ACAP. This direction reflects the ongoing direction of the messaging the search engines have had about ACAP. For instance, Rob Jonas, Google's head of media and publishing partnerships in Europe, said in March that "the general view is that the robots.txt protocol provides everything that most publishers need to do."



For more information, see each engine's blog posts (updated as their posts go live):


Like The Story? Vote For It On Yahoo Buzz!
Subscribe To Our Daily Search News Recap!
Your Email:
Send me the monthly search newsletter too! (Learn more about our newsletters and feeds)
Subscribe To Our Search Feed!
Subscribe Via Web FeedSubscribe with GoogleAdd to My Yahoo!Subscribe with BloglinesAdd to netvibes
Subscribe with Live.comSubscribe in NewsGator OnlineSubscribe in RojoAdd to My AOL
Share & Bookmark This Story!
By Vanessa Fox Permalink Jump To Comments See Related Stories In: SEO: Blocking Spiders



Reader Comments

Search:

Search Marketing Expo

Save the date for:
SMX China (Nanjing) - Sept. 23-24
SMX Stockholm - Sept. 23-24: See who's speaking or register now.
SMX East (New York City) - Oct. 6-8: See the agenda or register today and save!
SMX London - Nov. 4-5: Pre-agenda rate now available. Click here.

Search Marketing Now

Learn more about search marketing through free online webcasts and webinars from our sister site Search Marketing Now.

Upcoming Webcasts:

Most Recent News Posts

About Search Engine Land

Stay Updated!

Get Our Search Newsletters:
Email:
Daily Monthly

Get Our Search Feed:
Subscribe Via Web FeedSubscribe with Google
Add to My Yahoo!Subscribe with Bloglines
Add to netvibesSubscribe with Live.com
Subscribe in NewsGator OnlineSubscribe in Rojo
Add to My AOL
More About Our Feeds & Newsletters

Add to Technorati Favorites

Track Us Socially:
Facebook: Our Search News App
Facebook: Search Engine Land Page
Facebook: Search Engine Land Group
Flickr: Search Engine Land
LinkedIn: Search Engine Land Group
Twitter: Search Engine Land Feed

Bragroll