Google Features Volkswagen, Which Happens To Be Search Spamming
The Google Enterprise Blog recently featured the Volkswagen web site for using Google Enterprise search to power a new feature on the site. The Volkswagen home page now has a huge search box in the middle of the page. Cool, right?
Danny and I think so. As Danny was explaining the news on the Daily Search Cast today, he noticed that the site loads the box up in Flash. Looking at the source code, he discovered hidden text! Yes, hidden text on a page that was featured by an official Google blog.
Here is the text, which is clearly not visible on the page. It’s kept invisible using a special style class called “invisibleContent”:
<div class="invisibleContent">Volkswagen of America presents U.S. vehicle information, pricing, incentives, deals, comparisons on Eos, GTI, Jetta, New Beetle, New Beetle Convertible, Passat, Passat Wagon, Touareg, Rabbit, R32 and the GLI with links to VW dealers, owner information, Volkswagen merchandise, and VW accessories. homepage, volkswagen, volkswagon, vw.com, home, landing, top, volkswagen.com, home page, home, top, back, VWofAmerica, Volkswagen of America, Volkswagon of America, VWoA, VWofA, volkswagon.com</div>
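If you want to check a page for this kind of block yourself, here is a minimal sketch in Python, using only the standard library. It pulls out the text of any element carrying a given class. The class name defaults to the “invisibleContent” seen above; the `ClassTextExtractor` and `find_class_text` names are my own, made up for illustration. Keep in mind that a class name alone does not prove the text is hidden. You would still need to look at the stylesheet to see what the class actually does.

```python
from html.parser import HTMLParser


class ClassTextExtractor(HTMLParser):
    """Collect the text inside every element carrying a given CSS class."""

    # Void elements never get a closing tag, so they must not affect depth.
    VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.depth = 0     # greater than 0 while inside a matching element
        self.chunks = []   # text fragments of the element being read
        self.found = []    # finished text of each matching element

    def handle_starttag(self, tag, attrs):
        if tag in self.VOID_TAGS:
            return
        if self.depth > 0:
            # Nested tag inside a matching element: just track the nesting.
            self.depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.class_name in classes:
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth == 0 or tag in self.VOID_TAGS:
            return
        self.depth -= 1
        if self.depth == 0:
            # Matching element closed: normalize whitespace and store the text.
            self.found.append(" ".join("".join(self.chunks).split()))
            self.chunks = []

    def handle_data(self, data):
        if self.depth > 0:
            self.chunks.append(data)


def find_class_text(html, class_name="invisibleContent"):
    """Return a list with the text of each element that has class_name."""
    parser = ClassTextExtractor(class_name)
    parser.feed(html)
    parser.close()
    return parser.found
```

Feeding it a page's source, for example `find_class_text(page_source)`, returns one string per matching element, which you can then compare against what a visitor actually sees.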
Google has guidelines against using hidden text. In fact, such use got a different car maker, BMW, banned briefly from Google last year. “YADAC: Yet Another Debate About Cloaking Happens Again” covers both of these points.
Even Google has violated its own rules. Back in 2005, text meant for internal indexing was showing up on public pages, causing one part of Google to file a reinclusion request with another part of Google. Here is what Google said at the time in a WebmasterWorld discussion:
Those pages were primarily intended for the Google Search Appliances that do site search on individual help center pages. For example, http://adwords.google.com/support has a search box, and that search is powered by a Google Search Appliance. In order to help the Google Search Appliance find answers to questions, the user support system checked for the user agent of “Googlebot” (the Google Search Appliance uses “Googlebot” as a user agent), and if it found it, it added additional information from the user support database into the title.
The issue is that in addition to being accessed via the internal site-search at each help center, these pages can be accessed by static links via the web. When the web-crawl Googlebot visits, the user support system thinks that it’s the Google Search Appliance (the code only checks for “Googlebot”) and adds these additional keywords.
That’s the background, so let me talk about what we’re doing. To be consistent with our guidelines, we’re removing these pages from our index. I think the pages are already gone from most of our data centers–a search like [site:google.com/support] didn’t return any of these pages when I checked. Once the pages are fully changed, people will have to follow the same procedure that anyone else would (email webmaster at google.com with the subject “Reinclusion request” to explain the situation).
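To see why that check misfired, here is a rough Python sketch of the logic as the quote describes it. This is not Google's actual code, which was never published; the function names, the sample title, and the appliance user agent string are all hypothetical. The only thing taken from the quote is the behavior: look for “Googlebot” in the user agent and, if it is there, add extra keywords to the title.

```python
def looks_like_search_appliance(user_agent):
    """Naive check as described in the quote: any "Googlebot" agent matches."""
    return "Googlebot" in user_agent


def build_title(base_title, extra_keywords, user_agent):
    """Return the page title, padded with keywords for matching agents."""
    if looks_like_search_appliance(user_agent):
        return base_title + " " + " ".join(extra_keywords)
    return base_title


# Hypothetical appliance agent string, used only for illustration.
APPLIANCE_UA = "Googlebot (site search appliance)"
# The web crawler also identifies itself with "Googlebot" in its user agent.
WEB_CRAWLER_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
# An ordinary visitor's browser does not.
BROWSER_UA = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"
```

With these values, `build_title("Help Center", ["billing", "refund"], WEB_CRAWLER_UA)` gives the same keyword-padded title that the appliance gets, while the browser gets the plain title. Because both crawlers match the same substring, the public index ended up with content that ordinary visitors never saw.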
Postscript: The Google Enterprise blog updated us with a post saying that they contacted the Volkswagen team, and that Volkswagen removed the hidden text from the page and placed it in the page’s meta description instead.