Remember how Google said recently that it might crack down on listings pages that are simply search results themselves? Reader Michael Nguyen dropped an email today to point out how, ironically, Google is now listing pages from its own Google Product Search service exactly as it has warned others not to do.
OK, settle down back there, those of you having a chuckle. Embarrassing? Yes! Intentional? Almost certainly not. Let’s take a look.
Try a search for snake light, and you’ll get this:
See down there at the bottom? Two pages from Google Product Search showing up in the top results:
I can’t resist. Let’s dig the hole for Google just a bit deeper before I throw down a ladder, so it can climb out.
Google’s recent warning about cracking down on search results showing up in its search results IS understandable. I mean, if you’ve done a search for beard trimmer, who wants to get a bunch of pages that just lead you to shopping sites, like this:
See all the results I’ve highlighted in red? Click on any of those, and you end up not at pages giving you information about beard trimmers or a particular beard trimmer product. Instead, you just get shopping search results, pages from shopping search engines listing a variety of beard trimmers and prices from merchants across the web.
For example, click on the number nine listing, and you get:
Oh, yeah, um — shopping results from across the web, courtesy of Google Product Search.
So how did this listing for Google Product Search:
Wind up being listed by Google itself in Google’s ordinary results? Is this a new conspiracy to compel searchers to try Google Product Search?
Nah. After all, Google already uses a product OneBox to push people to Google Product Search whenever it wants. With the beard trimmer search, you can see this at the top of the page:
Ah, but people might skip past this, so it’s better to be in the "real" results. Perhaps this is Google trying to slip a change like this past us.
Heh. Google doesn’t need to slip that type of thing past anyone. Google has already started preparing publicly for a move like this. Remember, just last week it stopped using the OneBox display for news results, putting news into regular results (for more, see here and here; I’ve yet to see this change myself). This is expected to happen to other specialized Google search results, as well.
The explanation is easy. Google almost certainly forgot to block crawling of Google Product Search results by itself and other crawlers.
Look here at the Google Product Search home page:
See those links below the search box? Those are recent queries people have done (reload the page, and they change to different examples). Click on a link, and it generates search results. Click on one as a search spider, and you’ll index search results, unless you’ve been blocked.
To block those spiders, Google would need the right entries in its robots.txt file. Let’s check!
Oops. No entries. Google does have these:
Disallow: /froogle?
Disallow: /froogle_
Those were in there to stop queries from Google’s Froogle shopping search engine from being indexed. Unfortunately, Google didn’t update these entries to reflect how Froogle was renamed Google Product Search earlier this month.
That renaming shifted product results to this new URL:
As you can see in this URL for [beard trimmer]:
As soon as Google blocks the /products area via the robots.txt file, those 151,000 or so product pages that have been indexed will go away.
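To make the gap concrete, here is a minimal sketch of how simple prefix-style robots.txt matching would treat the old and new URLs. It assumes plain prefix matching only (no wildcards or Allow lines), and the query strings and the `is_blocked` helper are my own illustrations, not anything taken from Google's actual file or crawler.

```python
# Simplified robots.txt check: a path counts as blocked when it starts
# with any Disallow prefix. Real crawlers support more (wildcards,
# Allow lines, per-agent groups); this sketch deliberately ignores those.

# The two Froogle-era entries quoted earlier in this post.
FROOGLE_ERA_RULES = ["/froogle?", "/froogle_"]


def is_blocked(path, disallow_rules):
    """Return True if `path` begins with any non-empty Disallow prefix."""
    return any(path.startswith(rule) for rule in disallow_rules if rule)


# Hypothetical example paths: the query strings are assumptions for illustration.
old_style = "/froogle?q=beard+trimmer"
new_style = "/products?q=beard+trimmer"

# The old Froogle URLs are covered by the existing entries...
print("old URL blocked:", is_blocked(old_style, FROOGLE_ERA_RULES))

# ...but nothing in those entries matches the renamed /products area.
print("new URL blocked:", is_blocked(new_style, FROOGLE_ERA_RULES))

# Adding a /products entry (the fix this post expects) closes the gap.
updated_rules = FROOGLE_ERA_RULES + ["/products"]
print("new URL blocked after update:", is_blocked(new_style, updated_rules))
```

In other words, the two old entries only ever matched paths beginning with /froogle, so a rename to /products quietly left the new search results open to any spider, Google's own included.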
I am pinging Google about this, but I know exactly what they’ll say. This was an oversight, and the robots.txt file will be updated soon. So go on, you’ve had your laugh for the night!