Yahoo!, Google, Microsoft Clarify Robots.txt Support


Today, Google, Yahoo!, and Microsoft have come together to post details of how each of them support robots.txt and the robots meta tag. While their posts use terms like “collaboration” and “working together,” they haven’t joined together to implement a new standard (as they did with sitemaps.org). Rather, they are simply making a joint stand in messaging that robots.txt is the standard way of blocking search engine robot access to web sites. They have identified a core set of robots.txt and robots meta tag directives that all three engines support:

Google and Yahoo! already supported and documented each of the core directives, and Microsoft supported most of them before this announcement. In their posts, they also list the directives they support that may not be supported by the other engines.

For robots.txt, they all support:

  • Disallow
  • Allow
  • Use of wildcards
  • Sitemap location

For robots meta tags, they all support:

  • noindex
  • nofollow
  • noarchive
  • nosnippet
  • noodpt

With this announcement, Microsoft appears to be adding support for the use of * wildcards (which will go live later this month) and the Allow directive. The biggest discrepancy is with the crawl-delay directive. Yahoo! and Microsoft support it, while Google does not (although Google does support control of crawl speed via Webmaster Tools ).

This isn’t the first time the major search engines have come together for an announcement regarding how they support publishers. In late 2006, all three joined together to support XML Sitemaps and launched sitemaps.org, followed in April 2007 with support for Sitemaps autodiscovery in robots.txt, and in February 2008 with more support for more flexible storage locations of Sitemap files. In early 2005, the engines declared support for the nofollow attribute on links (in an effort to combat comment spam).

Why are the search engines coming together to talk about their varied support for traditional methods for blocking access to web content? A Microsoft spokesperson told me that while robots.txt has been the de facto standard for some time, the search engines had never come together to detail how they support it and said the aim is to “make REP more intuitive and friendly to even more publishers on the web.” Google similarly said that “doing a joint post allows webmasters to see how we all honor REP directives, the majority of which are identical, but we also call out those that are not used by all of us.”

Yahoo! told me:

Our goal is to come out with clear information about the actual support around REP for all engines. We have all separately at different times reported our support and this creates a long trail hard for anyone to put together. Posting the same spec at the same time provides a sync point for everyone as to the actual similarities or differences between our implementations for all engines. We are trying to address the latent concerns around differences across the engines.

Of course, each engine has provided documentation in their respective help centers for some time, and Google and Microsoft provide robots.txt analysis tools that detail how they interpret a file in their webmaster tools, so while they haven’t documented their support jointly, the documentation itself isn’t new.

This move may be an effort to show a consolidated front in light of the ongoing publisher attempts to create new search engine access standards with ACAP. This direction reflects the ongoing direction of the messaging the search engines have had about ACAP. For instance, Rob Jonas, Google’s head of media and publishing partnerships in Europe, said in March that “the general view is that the robots.txt protocol provides everything that most publishers need to do.”

For more information, see each engine’s blog posts (updated as their posts go live):



Vanessa Fox is a Contributing Editor at Search Engine Land. Called a “cyberspace visionary” by Seattle Business Monthly, she is an expert in understanding customer acquisition from organic search. She shares her perspective on how this impacts marketing and user experience at ninebyblue.com and provides authoritative search-friendly design patterns for developers at janeandrobot.com.

See more articles by Vanessa Fox >


Share, Bookmark & Discuss This Article
More:


Keep Updated: News Via Email | News Via RSS Feed | News Via Twitter


See more stories like this in the Members Library! Check out the SEO: Blocking Spiders sections of the Members Library where this story is filed. Members also get access to exclusive video content, a members-only weekly & monthly newsletter, plus more. Check out all the benefits!

Comments are closed.


RECENT COMMENTS

  • KevinSpence said " The AP & other news companies forget how much of their content is syndicated. So alright, maybe "
  • Avintrue said " So I would have to say that principle of pointing a finger with three pointing back at you is a good"
  • fonegigsblog said " Wow! Looks like Google will truly be the one stop shop for digital advertising in the next few years"

See All »


FREE DAILY SEARCH NEWS RECAP!

Stay on top of all the search news with our daily summary, the SearchCap newsletter. View a sample ›

STAY CURRENT THROUGHOUT THE DAY

RSS Feeds

The Search Engine Land feed keeps you informed as news happens. SEE ALL FEEDS »

Upcoming Search Engine Land Conferences

Advertise With Us »

Search Engine Land produces SMX, the Search Marketing Expo conference series. SMX events deliver the most comprehensive educational and networking experiences - whether you're just starting in search marketing or you're a seasoned expert.


SMX Web Site » | SMX Difference » | SMX News »


Join us at an upcoming SMX event:

Search Marketing Now Learn more about search marketing with our free online webcasts and webinars from our sister site, Search Marketing Now. Upcoming online events include:


See more webcast topics »

TRACK US SOCIALLY
Upcoming Search Engine Land Conferences

Get Your Search Engine Land
Premium Membership!

Become a premium member today and receive:

  • Express commenting privileges & photo.
  • Exclusive videos & newsletters.
  • Discounts to our SMX conferences.
  • Access to "How To" & Other Archives.

Learn More

Upcoming Search Engine Land Conferences
Add to GoogleAdd to My Yahoo!Add to BloglinesAdd to NetvibesAdd to Windows Live