YADAC: Yet Another Debate About Cloaking Happens Again

Sigh. Double sigh. Triple sigh. I guess now that the SEO industry has had the required twice-yearly debate about the reputation of SEO, it’s time to do the go-round about cloaking once again. A quick word about cloaking has Google’s Matt Cutts trying to clarify concerns that Philipp Lenssen of Google Blogoscoped has been raising about WebmasterWorld. The comments are now up over 100, as people rehash things that have been hashed, mashed and rebaked so many times before. Below, some cloaking history plus an honest plea about trying to get past this stupid, stupid issue.

Definition Time

Let’s do the definition, first:

Cloaking is when you show a search engine content that is different than what a human being sees.

Got it? That’s my definition, and Matt says virtually the same thing in his post today:

Cloaking is serving different content to users than to search engines.

So simple. What’s to debate? Well, is it cloaking if…

  • A spider coming from a US IP address sees a different page than a user from a UK IP address?
  • A spider sees content that a user sees, but only if they do free registration?
  • A spider sees content that a user sees, but only if they do paid registration?
  • A spider sees content in text that represents what a user sees in Flash?
  • A spider sees content that’s slightly different than what a user sees when their browser renders JavaScript?

Pick one of those above — pick something else (see our Good Cloaking, Evil Cloaking & Detection column from last week) — and people can, will and have pointed at something someone is doing, then yelled "cloaking" and screamed for a ban to happen. A ban? Well, as you know, all search engines hate cloaking. Actually, that’s always been a confused point. Here starts the lesson.
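Before the history lesson, it helps to see the mechanism everyone is arguing about. Here is a minimal sketch, in Python, of user-agent and IP-based content delivery; the bot tokens, the IP range and the page bodies are my own illustrative assumptions, not anything the search engines publish or endorse.

    # Minimal sketch of user-agent / IP-based content delivery, the mechanism
    # behind most cloaking debates. Bot tokens, the IP range and the page
    # bodies are illustrative assumptions only.
    from ipaddress import ip_address, ip_network

    KNOWN_BOT_TOKENS = ("googlebot", "slurp", "msnbot")   # assumed spider signatures
    UK_RANGE = ip_network("203.0.113.0/24")               # placeholder "UK" range

    def select_content(user_agent: str, remote_ip: str) -> str:
        """Return a page body based on who appears to be requesting it."""
        ua = user_agent.lower()
        if any(token in ua for token in KNOWN_BOT_TOKENS):
            # Spiders get a crawler-friendly version: plain text, no registration wall.
            return "<html>Full text transcript for indexing</html>"
        if ip_address(remote_ip) in UK_RANGE:
            # Human visitors from a particular region get localised content.
            return "<html>UK edition of the page</html>"
        return "<html>Standard page: Flash player plus registration prompt</html>"

    if __name__ == "__main__":
        print(select_content("Mozilla/5.0 (compatible; Googlebot/2.1)", "66.249.66.1"))
        print(select_content("Mozilla/5.0 (Windows NT 6.0)", "203.0.113.42"))

Every scenario in the list above is some variation of that branch. The debate is never about the code; it is about whether the intent behind the branch is acceptable.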

History Time: Tactics Versus Intent

Back in January 2003, Alan Perkins wrote this big Cloaking Is Always A Bad Idea article, telling us that search engines always said cloaking was bad. I was never a proponent of cloaking. I was, however, well aware that NOT all the guidelines were against cloaking. In addition, with paid inclusion, I argued some cloaking was actually allowed. All this went into my Ending The Debate Over Cloaking article, which came out in reaction to Alan’s in February 2003.

Since I knew that all the search engines had allowed some types of cloaking, my advice to marketers was this, with the stress on avoiding "unapproved cloaking:"

Cloaking is getting a search engine to record content for a URL that is different than what a searcher will ultimately see, often intentionally. It can be done in many technical ways. Several search engines have explicit bans against unapproved cloaking, of which Google is the most notable one. Some people cloak without approval and never have problems. Some even may cloak accidentally. However, if you cloak intentionally without approval — and if you deliver content to a search engine that is substantially different from what a search engine records — then you stand a much larger chance of being penalized by search engines with penalties against unapproved cloaking. If in doubt, ask the search engine if it has a problem with what you intend to do, assuming you can’t get a clear answer from written guidelines that are provided. If you are working with a third-party search engine marketer, ask them for proof that what they intend to do is approved. Otherwise, be prepared for any adverse consequences.

The suggestion to avoid "unapproved cloaking" infuriated Doug Heil over at the iHelpYou forums, who could not (and to this day still cannot) get over the idea that cloaking MUST equal spamming.

My response back then remains the same today. There’s a difference between tactics and intent. Many of the things that might cause penalties with search engines are tactics (hidden text, gibberish pages, cloaking) that are closely aligned with the intent of trying to mislead or game the search algorithms. But in some cases, what’s a bad tactic (or technical implementation) might have a good intent, as agreed by the search engines. So they’ll allow it, either turning a blind eye to it or giving it some official endorsement.

That difference is important because, back then, if the search engines had gotten behind the "unauthorized" versus "authorized" suggestion, we wouldn’t be having today’s wasteful argument. But let’s carry on.

NPR, Google Scholar & Approved Cloaking

In May 2004, I looked at how National Public Radio was, in my view, cloaking text transcripts of audio to search engines while only letting human visitors buy those transcripts. At the time, Google had a guideline against cloaking that read:

The term "cloaking" is used to describe a website that returns altered webpages to search engines crawling the site. In other words, the webserver is programmed to return different content to Google than it returns to regular users, usually in an attempt to distort search engine rankings. This can mislead users about what they’ll find when they click on a search result. To preserve the accuracy and quality of our search results, Google may permanently ban from our index any sites or site authors that engage in cloaking to distort their search rankings.

I argued that this was an example of "good cloaking" and that the real issue I had with it was that other marketers were supposedly banned from doing it:

As a searcher, I’m actually glad the method is being used. It does mean I’m more likely to find audio content of interest. Moreover, I can listen to that for free via the NPR site.

As a search engine marketer, I’m not so thrilled. I’m well aware that many other companies would like the ability to feed Google content in this manner. In addition, they have just as compelling arguments as NPR about having good content that isn’t adequately indexed by the Google crawler. Unfortunately, they’re denied the privilege of feeding relevant material just to Google’s crawler.

What about Yahoo? Anyone can enjoy the same benefits that NPR has, the ability to cloak content when relevant, through Yahoo’s content acquisition program. Non-profit organizations are offered this for free. Commercial organizations have to pay, making use of Yahoo’s trusted feed program.

By November 2004, I was writing about cloaking again. Now Google had an officially approved program that, in my view, allowed cloaking. This was Google Scholar. As I wrote:

This system may lead to problems for some searchers. In the example above, not only could I NOT read the paper, as I didn’t have a subscription, but I also could not read even an abstract. Instead, a password-prompt continued to appear, even when I cancelled it, making it extremely difficult to finally close the window (and that’s why I haven’t linked to the actual paper, to save other people the problem).

This situation is probably unusual, however. One of Google’s requirements for inclusion in Google Scholar is that publishers at least show abstracts to searchers.

The special access for publishers flies in the face of Google’s anti-cloaking policy. Google is being shown material that regular users wouldn’t normally see, its own definition of cloaking. This is a GOOD thing for searchers, but the company needs to amend its cloaking policy so as not to be hypocritical.

Indeed, that’s long overdue. This has been a problem since I first reported about a similar issue earlier this year. A sidebar piece … looks at the latest case and suggests some fixes for Google, including finally moving forward with formalizing such programs for ALL publishers.

Plea For Google To Change The Cloaking Guidelines

In the sidebar to that article, and again in a follow-up piece a few days later, I urged Google to alter its cloaking policy to something stressing that cloaking was bad only if not approved:

The term "cloaking" is used to describe a website that returns altered webpages to search engines crawling the site without permission. In other words, the webserver is programmed to return different content to Google than it returns to regular users, usually in an attempt to distort search engine rankings. This can mislead users about what they’ll find when they click on a search result. To preserve the accuracy and quality of our search results, Google may permanently ban from our index any sites or site authors that engage in cloaking without our permission, if we feel it is harmful to our search rankings.

BMW, WebmasterWorld & New York Times Cloaking Accusations

In March 2005, there was great amusement in some quarters when Google’s policy against cloaking caused it to ban itself, after text apparently designed to help internal Google searching also made it onto external pages in versions seen only by Google spiders, rather than humans (see here, here and here for more).

By the end of 2005, WebmasterWorld came under accusations of cloaking. Since it is one of the most important forums about search engines around — frequented by official Google reps — it sort of became a poster child of "why can they do it but others can’t" for some.

Far bigger news came in 2006. In February, BMW got banned on Google for using hidden text — in particular, a "poor-man’s" version of cloaking that used JavaScript to show different content to users. It got back in a few days later. Then in June, an article about how the New York Times optimizes content for search engines sparked a new cloaking debate when it seemed that the major search engines were allowing cloaked content. In particular, the New York Times was allowing search spiders to read content that was only accessible to humans if you registered for free or, in some cases, paid for access.

Marshall Simmonds, the NYTimes & Acceptable Cloaking was a giant discussion that came out of that article, which had me originally arguing that this was cloaking, since search spiders were seeing something different than most humans could see. But I was convinced to change my mind. Since the spiders were indeed shown what anyone could ultimately see, this wasn’t cloaking. I commented:

Until now, I would have considered feeding a search engine a page that people couldn’t see unless they registered to have been unapproved cloaking, since most users were seeing something different than the spider saw.

But sure, I’ll buy into the "eventually you’ll see the same thing" argument as this not being cloaking. Why not? Google’s allowed this in approved cases for about two years now and never wants to go on the record as this being approved cloaking. So don’t call it cloaking and everyone’s happy. Google’s not officially allowing something it says not to do, and content owners can do this without fear.

In fact, I expect Fantomaster, Beyond Engineering and anyone with IP delivery lists now can have some new-found respect from people who previously slammed them as helping cloakers. Here they’ve been saying it’s all about content delivery systems, and now they’re right, at least in some situations. Because after all, some people aren’t going to want to depend on user agent delivery to feed content this way.

Of course, I’m still not going to do this yet with my own members-only content. Despite agreeing with you, Phil — despite seeing others allowed to do this – I’m still fearful Google might arbitrarily decide to call it cloaking anyway when they choose. But maybe I’ll be braver down the line, and why not? It’s like a whole new world.

I did have several off-the-record conversations with Google about this. The main thing that came out that I can report was that Google really felt most users should see what their spiders saw WITHOUT having to register or pay for access. That’s similar to what Matt wrote about WebmasterWorld content today, when covering the latest criticisms that it might be cloaked:

I consider the issue in a much better state now, in that most (all?) Google searchers get the identical page to what Googlebot saw. But I still consider Philipp’s February posts open for investigation, and I will get to them, in the same way that I tackled Philipp’s first two posts about this.

FYI, to understand a bit more about the WebmasterWorld situation, see Matt’s post from March 2006, How to sign up for WebmasterWorld. It explains how, in some cases, trying to access a thread that you come across from a search result can trigger a registration request. In today’s post at Matt’s, WebmasterWorld’s Brett Tabke explains and points at more information on how this has changed.

The comments in Matt’s post also get into an apparent return of some New York Times content requiring registration, as well as more complaints about Google Scholar content (added with the full cooperation of Google) also being annoying, in that you can’t read it without paying when clicking from search results.

Solution 1: Allow Registered Content

Enough history. Time for some solutions. Both Google and Yahoo have programs that allow people with free registration or fee-based content to show up in search results — and I mean mixed in with regular search results, not segregated out like the Yahoo Subscriptions product launched in June 2005. Yahoo’s happens mainly through paid inclusion, and not that often. Google’s happens primarily through the aforementioned Google Scholar plus the First Click Free system used for Google News, which may also apply to some web search content.
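Since First Click Free keeps coming up in these debates, here is a minimal sketch of the general idea, purely my own illustration rather than Google's published implementation: serve the full article when the visitor appears to arrive from a search results page, and show the registration or payment wall otherwise. The referrer hosts and page bodies are assumptions for the example.

    # Minimal sketch of a "first click free" style check. The referrer domains
    # and page bodies are illustrative assumptions, not a search engine's rules.
    from typing import Optional
    from urllib.parse import urlparse

    SEARCH_REFERRER_HOSTS = {"www.google.com", "news.google.com", "search.yahoo.com"}

    def serve_article(referrer: Optional[str], full_text: str) -> str:
        """Decide whether to show the full article or a registration wall."""
        if referrer:
            host = urlparse(referrer).netloc.lower()
            if host in SEARCH_REFERRER_HOSTS:
                # First click from a search result: show what the spider indexed.
                return full_text
        # Direct visits (or clicks beyond the first) hit the registration wall.
        return "<html>Please register (free) or subscribe to keep reading.</html>"

    if __name__ == "__main__":
        article = "<html>Full article text, as indexed by the spider</html>"
        print(serve_article("https://www.google.com/search?q=cloaking", article))
        print(serve_article(None, article))

The point is that the spider and the searcher clicking through see the same page, which is exactly the argument that persuaded people to stop calling this cloaking.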

None of this content is labeled in any way. Back to that discussion on cloaking and the New York Times, I commented:

It is annoying to hit one of these links and not know that payment or registration is required, however. That problem’s going to get worse as more and more people decide this isn’t cloaking and give it a go. Google and the others should look to establish a way for site owners to better flag premium or registration only content….

I don’t agree having paid content in regular search results is bad. I have a Wall Street Journal paid subscription. They have lots of great content. If I’m doing a search, and they’ve got a good match, I want to know that.

And over at Yahoo, as I explained way way back above, I can do that. I can choose specifically to have this content revealed to me. It doesn’t make my results bad at all.

It is a bad user experience if you constantly get back results that you can’t actually view, of course, without paying. We simply aren’t going to subscribe to everything.

The solution is easy. Give users the option. Let me choose to see content that requires payment or not. Or similarly, let me choose to see content that might require free registration. We just need Google and the others to graduate from a 1999 mentality and better accommodate web sites with this type of content. It’s easily done, if they want to do it. And getting a more formalized program for publishers, as well as options for searchers, will help.

So enough already. Enough with the special programs that only some publishers get to do. I want Google — which leads the charge in scaring people about cloaking — to fast-track a system to let anyone with registration-based or fee-based content be in their search results.

As for usability, either flag the URLs so users know to expect a charge or registration request or make it possible for users to exclude this information. Or experiment with both. But do something so that publishers don’t take matters into their own hands and the SEO industry has to have yet another one of these debates over whether it’s cloaking.

Solution 2: Allow For Approved Cloaking

Remember that suggested revision for Google’s cloaking policy I gave above? Well, sometime in 2006, Google dropped its definition of cloaking entirely. Instead, we were just left with a shorter definition of cloaking here:

Make pages for users, not for search engines. Don’t deceive your users or present different content to search engines than you display to users, which is commonly referred to as "cloaking."

And a warning against it here:

However, certain actions such as cloaking, writing text in such a way that it can be seen by search engines but not by users, or setting up pages/links with the sole purpose of fooling search engines may result in removal from our index.

That warning, like the old warning, uses the word "may" in terms of removal for cloaking. Will cloaking get you banned? Maybe. Maybe if it is noticed. Maybe if it is deemed harmful to the users. Maybe if after a closer look, it can’t be pigeonholed in some other definition.

For me, it would be clearer to go back to the old definition and stress that unless approved, cloaking might result in a ban. Roll that out along with a program making it easier for people to feed in registered content. That gives Google flexibility, helps publishers plus stops this insane focus on technical/tactical implementations and refocuses concern where it belongs — the user experience.

Did what you do help or harm the search results, in Google’s opinion? If you were harming search results, Google’s always reserved the right to boot you out. And if you were technically violating guidelines but not in a harmful way, Google’s always reserved the right to turn a blind eye. Or rather, an approving eye, an eye knowing that it’s intent that matters, not some technicality.






  • http://www.michaelvisser.com.au Michael Visser

    Sweet overview of this issue; this pit in the SEM landscape has once again ruptured, and more and more consultants may be willing to try these tactics out again. As a poor searcher I’m against cloaking for paid content but support presenting localised content based on a spider’s originating IP.

    Passing this article onto peers to chew on. :D

    Typo in paragraph 1, “WebmasterWord”.

  • http://www.SEOcritique.com SEOcritique.com

    Danny, I think that the concept of getting approval is fine on the surface. (And how does one get approval?) But as technologies for serving web pages progress, eventually there are going to be more web sites using cloaking technologies than can be reviewed manually.

    Google should set clear, logical and frequently updated guidelines for cloaking and when it is or is not appropriate. This is not an issue that can be set in stone for all time. Usage and methods will change over time. I think that just as Google must chase quality in the rankings by constantly improving their algorithms, they must reconsider their guidelines about cloaking as necessary.

    This is an excellent example of where more frequent and more detailed communication with web site administrators is a skill set that Google must develop. Right now the general perception is that you have to be a Danny Sullivan or a Rand Fishkin to speak to Google employees, and then it only happens in conference back rooms and smoke-filled bars. Obviously there are far more webmasters than Google can speak to over the telephone, so detailed guidelines that make things clear and unambiguous are probably the best answer… at least until a better answer makes itself known.

  • http://www.seo-theory.com/ Michael Martinez

    Google should not be telling people how to run their Web sites, and it’s good that they have toned down some of their once blatantly hostile language.

    But if they are going to ask Webmasters to respect their guidelines for inclusion (which they have every right to determine and enforce), then they owe it to the surfing community to level the playing field.

    Danny, I endorse your recommendations about the registration only sites. I don’t mind their being included in the search results as long as I know when I click on a link it will take me to content that I can read, unless I choose to click on a link that I know will take me to a registration or login page.

    Google should be informing its users of what types of content they are seeing in the search results.

  • http://www.cumbrowski.com Carsten Cumbrowski

    Hi Danny. Same question to you as for Stephan (which has remained unanswered so far):

    Serving different content to spiders than to users. What do you mean by “content”? The actual written text or the whole HTML source code?

    If I serve the user a fully designed page (a badly designed one from an SEO point of view, with a lot of junk code, too much crap before the actual text, and wrong HTML tags), and serve the spider the same text and everything but in short, clean and proper HTML, then is it not cloaking?

    Is leaving out, for the spider, parts of the navigation that are shown to the user (for example, a CSS drop-down nav) considered different content? I mean, it’s duplicate stuff anyway, which would most likely be filtered out by the search engine.

    What is your take on that kind of “cloaking”? Ethical? Gets you banned?

    Also, good question by SEOcritique.com. The same question was forming in my head after reading the words “approved cloaking” multiple times. I am not aware of any process where you can request to get cloaking approved, just to make it official and be sure that the search engines don’t see a problem with it, like with the cloaking mentioned in my first question.

    Thanks, I appreciate it.
    Carsten

  • http://dreadpiraterobert.blogspot.com Dread Pirate Robert

    Wow, thanks for the great overview. I’ve been following the conversation between Philipp and Matt and really appreciate the historical perspective!

    Now if only I’d read your post BEFORE I wrote that article on cloaking… :D

  • Dogger

    I work for a large site and we are in the process of choosing an A/B / multivariate testing solution provider. One of my requirements before we go ahead with a solution is to implement IP delivery to make sure the spider always sees the same content. We obviously aren’t intending to do anything deceitful, but some of the tests we will run may change the site layout / flow pretty drastically. The potential to be flagged for cloaking has me pretty concerned. Any opinions as to where A/B testing falls within this debate?


