Robots.txt Study Shows Webmasters Favor Google; BotSeer Robots.txt Search Engine Released


Pennsylvania State University has conducted a study showing that webmasters favor Google over other search engines when it comes to allowing crawlers access to their web sites. Alongside the study, it released BotSeer, a search engine for searching across a collection of robots.txt files.

The study looked at which robots or crawlers were named in each web site’s robots.txt file, and Google was named more often than any other search engine. The paper, Determining Bias to Search Engines from Robots.txt (PDF), has some interesting details.

The most commonly used user agent is the “universal robot” (written as User-agent: * in a robots.txt file): 93.8 percent of sites with robots.txt files have a rule allowing any crawler to access the site. In addition, 72.4 percent of the robots.txt files mentioned specific robots by name.
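To make the measurement concrete, here is a minimal Python sketch of how one might count which user agents are named across a set of robots.txt files. It is not the study's actual method, which the paper describes in detail; the function names and the three sample files are hypothetical and made up purely for illustration.

```python
from collections import Counter


def named_user_agents(robots_txt):
    """Return the lowercased user-agent names declared in one robots.txt file."""
    agents = set()
    for raw_line in robots_txt.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = line.split(":", 1)
        if field.strip().lower() == "user-agent" and value.strip():
            agents.add(value.strip().lower())
    return agents


def tally_agents(files):
    """Count, per user agent, how many files name it at least once."""
    counts = Counter()
    for text in files:
        counts.update(named_user_agents(text))
    return counts


# Hypothetical sample data, not taken from the study.
SAMPLE_FILES = [
    "User-agent: *\nDisallow: /private/\n",
    "User-agent: Googlebot\nDisallow:\n\nUser-agent: *\nDisallow: /\n",
    "User-agent: googlebot\nDisallow: /tmp/\n\nUser-agent: msnbot\nDisallow: /\n",
]

counts = tally_agents(SAMPLE_FILES)
for agent, n in counts.most_common():
    print(agent, n)
```

Each file counts at most once per agent, so the tally reflects how many sites name a robot rather than how many rules mention it. Note that counting names alone says nothing about whether the named robot is allowed or blocked.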

The chart below shows that Google’s robot, GoogleBot, is named more often than any other search engine:

[Chart: Robots.txt Study]

The chart below compares search engine market share to robot bias:

[Chart: Robots.txt Study]

The study also presents historical data on webmasters’ growing use of the robots.txt file. It is definitely worth downloading and reading.

One more note: this morning I mentioned a quote from Eytan of Live Search:

One thing that we noticed for example while mining our logs is that there are still a fair number of sites that specifically only allow Googlebot and do not allow MSNBot.

This study confirms Eytan’s statement.
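For readers who have not seen such a file, the sketch below shows one way a robots.txt can admit Googlebot while shutting out every other crawler, MSNBot included, and checks the result with Python's standard urllib.robotparser module. The file contents, the example.com URL, and the allowed() helper are assumptions for illustration, not taken from the study or from Live Search's logs.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: an empty Disallow lets Googlebot fetch everything,
# while the universal "*" record blocks all other crawlers.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent, url="http://example.com/index.html"):
    """Report whether the given crawler name may fetch the URL."""
    return parser.can_fetch(agent, url)


for agent in ("Googlebot", "msnbot"):
    print(agent, allowed(agent))
```

Under these rules Googlebot is allowed and msnbot falls through to the universal record and is blocked, which is the pattern Eytan describes.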

Postscript From Danny: I skimmed the report and hope to look more later. However, concluding that Google is most favored by checking whether Googlebot is named with allow statements isn’t conclusive. For example, Googlebot might include things like the Google AdSense crawler, and allowing that while banning other spiders still might be banning Google itself. That said, I have no doubt site owners think more about Google than other search engines when crafting their files.



Barry Schwartz is Search Engine Land's News Editor and owns RustyBrick, a NY-based web consulting firm. He also runs Search Engine Roundtable, a popular search blog on very advanced SEM topics. Barry's personal blog is named Cartoon Barry, and he can be followed on Twitter.




