Is cloaking evil? It’s one of the most heavily debated topics in the SEO industry – and people often can’t even agree on what defines cloaking. In this column, I want to look at an example of what even the search engines might consider "good" cloaking, the middle-ground territory that page testing introduces, and how to detect when "evil" old-school page cloaking is happening.

Back in December 2005, the four major engines went on record at Search Engine Strategies Chicago to define the line between cloaking for good and for evil. From the audience, I asked the panelists if it was acceptable to — selectively for spiders — replace search engine unfriendly links (such as those with session IDs and superfluous parameters) with search engine friendly versions. All four panelists responded "No problem." Charles Martin from Google even jumped in again with an enthusiastic, "Please do that!"

URL Rewriting? Not Cloaking!

My understanding is that their positions haven’t changed on this. Cloaking – by its standard definition of serving up different content to your users than to the search engines — is naughty and should be avoided. Cloaking where all you’re doing is cleaning up spider-unfriendly URLs, well that’s A-OK. In fact, Google engineers have told me in individual conversations that they don’t even consider it to be cloaking.

Because search engines are happy to have you simplify your URLs for their spiders — eliminating session IDs, user IDs, superfluous flags, stop characters and so on — it may make sense to do that only for spiders and not for humans. That could be because rewriting the URLs for everyone is too difficult, costly or time intensive to implement. Or more likely, it could be that certain functionality requires these parameters, but that functionality is not of any use to a search engine spider — such as putting stuff in your shopping cart or wish list or keeping track of your click path in order to customize the breadcrumb navigation.
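
To make that concrete, here is a minimal sketch in JavaScript of the first piece you would need: recognizing that a request comes from a spider at all. The function name and user-agent patterns are my own illustration, not an official list; a real implementation would want a maintained set of crawler signatures.

// Minimal sketch: guess whether a request comes from a major search spider
// based solely on its User-Agent header. The patterns are illustrative only.
function isSearchSpider(userAgent) {
  if (!userAgent) return false;
  return /googlebot|slurp|msnbot|teoma/i.test(userAgent);
}

// Example usage inside a hypothetical request handler:
// const spider = isSearchSpider(req.headers['user-agent']);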

Many web marketers like to track which link was clicked when there are multiple links to the same location on a page. They add tracking tags to the URL, like "source=topnav" or "source=sidebar." The problem is that this creates duplicate pages for the search engine spiders to explore and index, which dilutes link gain or PageRank: the votes you are passing on to that page get split across the different URLs. Ouch.

How about instead you employ "good cloaking" and strip out those tracking codes solely for spiders? Sounds like a good plan to me. Keep your analytics-obsessed web marketers happy, and the search engines too.
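
If you go that route, the server-side logic can be as simple as stripping a known set of human-only parameters whenever the requester looks like a spider. Below is a minimal sketch in JavaScript (Node.js); the parameter names and the isSearchSpider() helper from the earlier sketch are assumptions for illustration, not a definitive list.

// Minimal sketch: strip parameters that only matter to humans (session IDs,
// click-tracking tags) before a link is rendered for a spider.
const SPIDER_ONLY_STRIP = ['sessionid', 'sid', 'source', 'vcat'];

function cleanUrlForSpider(rawUrl) {
  const url = new URL(rawUrl);
  for (const param of SPIDER_ONLY_STRIP) {
    url.searchParams.delete(param);
  }
  return url.toString();
}

// Humans keep the tracking URL; spiders get the clean one:
// const href = isSearchSpider(req.headers['user-agent'])
//   ? cleanUrlForSpider(trackedHref)
//   : trackedHref;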

I should mention that you don’t have to cloak your pages to simplify your URLs for spiders. There is another option: you could use JavaScript to append your various tracking parameters to the URL upon the click. For example, REI.com used to append a "vcat=" parameter to all brand links on their Shop By Brand page through JavaScript. Thus, none of their vcat-containing URLs made it into Google.
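
Here is a rough sketch of that client-side approach, assuming modern browser APIs rather than whatever REI actually ran at the time: keep the clean URL in the markup (which is what spiders index) and bolt the tracking parameter on only when a human clicks. The "vcat" name and the data-track attribute are illustrative.

// Minimal sketch: append a tracking parameter at click time, so the markup
// itself only ever contains the clean, spider-friendly URL.
document.addEventListener('click', function (event) {
  const link = event.target.closest('a[data-track]');
  if (!link) return;
  const url = new URL(link.href, window.location.href);
  url.searchParams.set('vcat', link.dataset.track);
  link.href = url.toString(); // the browser then navigates to the tagged URL
});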

Is Testing Bad Cloaking?

Is multivariate testing a form of bad cloaking? This is where services like Offermatica or even Google’s own Website Optimizer show different users different versions of the same URL. That could be considered cloaking, because human visitors and search engines are getting different content. Spiders can’t participate in the test group, so the test content is invisible to them; the test platform requires AJAX, JavaScript, DHTML and/or cookies to function in the user’s browser. Google engineers have told me that they want Googlebot to be part of the test set. Therein lies the rub: the technology isn’t built to support that.
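
To see why the crawler gets left out, consider a stripped-down, generic version of what a test platform does on the client (this is my own illustration, not Offermatica’s or Website Optimizer’s actual code): the variation only exists after JavaScript runs, so a crawler that doesn’t execute JavaScript only ever sees the default markup.

// Minimal generic sketch of client-side variation swapping. The element id
// and the alternate copy are invented for illustration.
window.addEventListener('DOMContentLoaded', function () {
  const variant = Math.random() < 0.5 ? 'A' : 'B';
  const headline = document.getElementById('headline');
  if (variant === 'B' && headline) {
    headline.textContent =
      'Alternate headline shown only to half of human visitors';
  }
});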

Uncovering User Agent Based Cloaking

The "bad" cloaking from a search engine point of view is that deliberate showing to a spider content that might be entirely different than what humans see. Those doing this often try to cover their tracks by making it difficult to examine the version meant only for spiders. They do this with a "noarchive" command embedded within the meta tags. Googlebot and other major spiders will obey that directive and not archive the page, which then causes the "Cached" link in that page’s search listing to disappear.

So getting a view behind the curtain to see what is being served to the spider can be a bit tricky. If the type of cloaking is solely user agent based, you can use the User Agent Switcher extension for Firefox. Just create a user-agent of:

Googlebot/2.1 (+http://www.googlebot.com/bot.html)

under Tools > User Agent Switcher > Options > Options > User Agents in the menu. Then switch to that user agent and have fun surfing as Googlebot in disguise.
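
If you would rather script it than surf it, the same idea works outside the browser. Here is a minimal Node.js sketch (assuming a recent Node version with the built-in fetch API); as the next section explains, this only defeats user-agent-based cloaking, not IP-based cloaking.

// Minimal sketch: request a page while presenting Googlebot's User-Agent.
const GOOGLEBOT_UA = 'Googlebot/2.1 (+http://www.googlebot.com/bot.html)';

async function fetchAsGooglebot(url) {
  const response = await fetch(url, {
    headers: { 'User-Agent': GOOGLEBOT_UA },
  });
  return response.text();
}

// fetchAsGooglebot('http://www.example.com/').then(html => console.log(html));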

Uncovering IP Based Cloaking

But hard-core cloakers are too clever for this trick. They’ll feed content to a spider based on known IP addresses. Unless you’re within a search engine, using one of those known IP addresses, you can’t see the cloaked page if it has also been kept out of the search engine’s cache.
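
For clarity, here is a sketch of what the IP-delivery decision looks like on the cloaker’s side; the prefix list is purely illustrative. Because the check never consults the User-Agent header, spoofing it gets you nowhere.

// Minimal sketch of IP-based delivery: the choice of content keys off the
// requesting IP address. The prefix list is illustrative, not a real list.
const SPIDER_IP_PREFIXES = ['66.249.'];

function looksLikeSpiderIp(remoteIp) {
  return SPIDER_IP_PREFIXES.some(prefix => remoteIp.startsWith(prefix));
}

// const html = looksLikeSpiderIp(req.ip) ? spiderVersion : humanVersion;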

Actually, there’s still a chance. Sometimes Google Translate can be used to view the cloaked content, because many cloakers don’t bother to differentiate between a spider coming in to translate the page and one coming in to crawl it. Either way, the request comes from the same range of Google IP addresses, so a cloaker doing IP delivery tends to serve the Googlebot-only version of the page to the Translate tool as well. This loophole can be plugged, but many cloakers miss it.

And I bet you didn’t know that you can actually set the translation language to English even if the source document is in English! You simply set it in the URL, like so:

http://translate.google.com/translate?hl=en&sl=en&u=URLGOESHERE&sa=X&oi=translate&resnum=9&ct=result

In the URL above, replace the URLGOESHERE placeholder with the actual URL of the page you want to view. That way, when you are reviewing someone’s cloaked page, you can see it in English instead of in a foreign language. You can also sometimes use this trick to view paid content, if you’re too cheap to pay for a subscription.
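
If you do this often, a one-liner saves some copy-and-paste. The sketch below simply builds the Translate URL shown above for any target page, URL-encoding the target so its own query parameters survive; the function name is mine.

// Minimal sketch: build the Google Translate viewing URL for a target page.
function translateViewUrl(targetUrl) {
  return 'http://translate.google.com/translate?hl=en&sl=en' +
    '&u=' + encodeURIComponent(targetUrl) +
    '&sa=X&oi=translate&resnum=9&ct=result';
}

// translateViewUrl('http://www.example.com/page.html');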

Many SEOs dismiss cloaking out-of-hand as an evil tactic, but in my mind, there is a time and a place for it (the URL simplifying variety, not the content differing variety), even if you are a pearly white hat SEO.

Stephan Spencer is founder and president of Netconcepts, a 12-year-old web agency specializing in search engine optimized ecommerce. He writes for several publications plus blogs at StephanSpencer.com and Natural Search Blog. The 100% Organic column appears Thursdays at Search Engine Land.

Opinions expressed in the article are those of the guest author and not necessarily Search Engine Land.





  • http://www.fathomseo.com Mike Murray

    Thanks for the cloaking update and detailed perspective. Engines may frown on cloaking, but I’m not convinced they’re going to catch it or deal with it as much as people may think. In a way, it’s like duplicate content. I see plenty of that going on even with national brands. And the sites still rank well. With cloaking, I do wonder if it makes sense to offer different versions to honor regulatory requirements. But I imagine site design could address that.

  • http://www.cumbrowski.com Carsten Cumbrowski

    Serving different content to spiders than to users. What do you mean by “content”? The actual written text or the whole HTML source code?

    If I serve the user a fully designed page (a badly designed one from an SEO point of view, with a lot of junk code, too much crap before the actual text and wrong HTML tags) and serve the spider the same text and everything, but in short, clean and proper HTML, then is it not cloaking?

    Is leaving parts of the navigation out for the spider which is shown to the user (for example CSS drop down nav) considered different content? I mean, it’s duplicate stuff anyway which would most likely be filtered out by the search engine.

    What is your take on that kind of “cloaking”? Ethical? Gets you banned?

  • http://www.thinkseer.com/blog wilreynolds

    While hiding copy behind flash isn’t traditionally seen as cloaking, in the spirit of the definition it is basically showing the search engines one thing and the user another. I wrote about SAAB doing this a while back…what is your take?

    http://www.brandweek.com/bw/search/article_display.jsp?vnu_content_id=1003538541

  • http://www.linkedin.com/in/leevikokko Leevi Kokko

    Hiding copy behind flash should be totally OK for sites such as mtv.com, because both HTML and flash use the same content source.

    This snippet from http://www.simplebits.com/work/mtv/

    “The shiny new version of the site required the latest-and-greatest Flash plugin, and MTV.com found it important to leave no one behind. Readers who haven’t upgraded to the latest version of Flash would still receive the site’s content, identical thanks to transforming the same XML that drives the Flash version into nicely formatted XHTML/CSS templates. In addition, search engines would better index the site’s content and Flash-less browsers and devices would benefit as well.”

    So essentially, while the search spider indeed sees a different version of the site compared to an average user, it differs only in terms of user experience, not the actual content.

    I think this particular case is a beautiful example of how proper use of standards and industry best practices can boost your online business.

  • http://www.stephanspencer.com Stephan Spencer

    Carsten,
    When I say “content” I am referring to the copy, title tags, alt text, etc. — basically anywhere a spammer could insert keyword-rich gibberish.

    In regards to your hypothetical scenario of moving HTML around and swapping in and out different HTML tags, I consider that cloaking.

    In your second hypothetical scenario I would also consider leaving out parts of navigation only for the spider to be cloaking as well.

    There is a distinction between operating in non-ethical territory and being at risk of a ban. Just because you are being ethical doesn’t mean you won’t inadvertently trip an automated algorithm. I think moving HTML around on the page or replacing content within various HTML containers is dangerous, even though it could be ethical from the standpoint that you are making visible to the spiders content that might be trapped within JavaScript or Flash.

    Wil,
    You bring up a really good point that Flash-based content or navigation really does need to have an alternative version that is accessible to the spiders. Whether that would be considered cloaking by the search engines, I think depends on the implementation. Are graceful degradation and progressive enhancement cloaking with regard to the use of Flash with SWFObject and DIV tags? Some people are quick to label them as such regardless of use or intent. While they can be used for cloaking purposes, if used conservatively and with their intended purpose in mind, then I don’t consider them to be cloaking. But at the end of the day it is not really what I think that matters. It is what Google and the other engines think.

    I know Google doesn’t like Flash. They don’t see it as a very accessible technology. It goes against their philosophy of making the world’s information universally accessible because it is not friendly to the visually impaired, people on antiquated browsers and handhelds and so forth. So I think that progressive enhancement of Flash would be seen as a good thing.

    Flash isn’t going away any time soon so there needs to be a workaround and I see progressive enhancement as the best workaround that we have available. It is my understanding that engines use human review to determine intent for the use of SWFObject and perhaps other forms of progressive enhancement. If the search engines thought that progressive enhancement was inherently evil, they wouldn’t bother with human review; they’d simply nuke it every time.

  • http://scottj.info/ Scott Johnson

    It’s good to air the facts on cloaking. There is a proper place for such technologies on the web. To forbid them outright is ridiculous. I enjoy reading articles like this one–articles that challenge the status quo with cold hard facts.

 
