
Google To Newspapers: Robots.Txt You


Danny Sullivan on July 15, 2009 at 3:35 pm | Reading time: 4 minutes


Newspaper attacks on Google for supposedly overstepping fair use and stealing their material keep escalating, with a European-led “Hamburg Declaration” coming out last week. Now Google has blogged a response that basically says if you want out of Google, it’s easily done with a robots.txt file.

That, of course, is what search engine savvy people have been telling the newspapers all along. But that’s not what the papers want. They want to be listed in Google and also paid for the right. Google’s blog post doesn’t hold out much hope on that front.
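For anyone who hasn't seen one, here is a minimal sketch of the kind of opt-out being discussed. The site name (news.example.com), the sample rules and the helper function are all hypothetical, made up for illustration rather than taken from any publisher's real file. The rules are checked with Python's standard `urllib.robotparser`, which only approximates how a real crawler reads them.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an imaginary newspaper site.
# It shuts out Google's crawler and leaves every other crawler alone.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Allow: /
"""


def crawler_may_fetch(agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical story URL on the imaginary site.
STORY_URL = "https://news.example.com/story.html"

googlebot_allowed = crawler_may_fetch("Googlebot", STORY_URL)
other_allowed = crawler_may_fetch("SomeOtherBot", STORY_URL)
```

Under these assumed rules, `googlebot_allowed` comes back `False` and `other_allowed` comes back `True`: the two lines naming Googlebot are the whole opt-out.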

Then again, the double-secret negotiations between Google and the Associated Press continue. Google will buy off big publishers that make a lot of noise and lawsuit threats, as we’ve seen with the AP and the AFP. So the noises, I expect, will continue.

Interestingly, the ACAP alternative to robots.txt, which has largely gone nowhere (some publishers use it; no search engines support it), might be getting a lifeline. ACAP project director Mark Bide posted today to the Read20 mailing list:

While ACAP has been making quiet technical progress since it was launched 18 months ago, this isn’t where our attention has been most closely focused. A perfectly formed specification is of no value unless it is implemented, so our activity has switched to evangelism for implementation. In one respect, we have been markedly successful. We now have around 1250 publishers who have undertaken very simple ACAP implementations on their websites – mostly newspapers, but including a fair number of book publishers. A full list of sites which have implemented ACAP can be found on our website (www.the-acap.org).

However, these remain symbolic, because for the time being none of the major aggregators has agreed to implement ACAP. This is, of course, a stumbling block to the debugging process…

However, a series of recent events has revitalised our dialogue with the major search engines; as a result, I have renewed optimism that we will achieve a breakthrough before the end of this year.

ACAP has always been about the principle of establishing the technologies which will allow copyright holders to make choices about the reuse of their content in the online environment in the same way as they can in the physical world. It is not about a particular technical implementation, and we remain entirely flexible about technical directions.

If there’s been a breakthrough, it’s interesting Google’s not mentioning it (the post is also up on their Public Policy blog). Indeed, they seem to say the opposite:

Some proposals we’ve seen from news publishers are well-intentioned, but would fundamentally change — for the worse — the way the web works. Our guiding principle is that whatever technical standards we introduce must work for the whole web (big publishers and small), not just for one subset or field. There’s a simple reason behind this. The Internet has opened up enormous possibilities for education, learning, and commerce so it’s important that search engines makes it easy for those who want to share their content to do so — while also providing robust controls for those who want to limit access.

Meanwhile, another rival to robots.txt and rights management emerged last week, with AP adopting it (AP also backs ACAP).

Confused? Yeah, so am I. My post “How The AP Fails To Get Search & SEO (Again)” on my personal blog gets into the new ACAP rival in more detail.

Other recent writings on my personal blog on related issues:

Justice Richard Posner’s Copyright Law No One Can Talk About (Or Link To): (Yes, let’s outlaw linking and paraphrasing)
Garlic For The Google Vampire: (tells the Wall Street Journal that if Google’s really sucking them dry, there’s this thing called a robots.txt file they can use to block them)
No, Newspapers Don’t Need A License To Collude To Survive: (That’s what LA Times columnist Tim Rutten seemed to think, because if one blocks Google, the others get a leg up)

There’s also more in my newspapers archive.

See also related discussion developing on Techmeme.

