Is Google Gearing Up To Drop The Supplemental Result Label?


Matt McGee reported last week that a method webmasters use to see which pages Google has placed in its supplemental index appears to be disappearing from some of Google’s data centers. Now he’s spotted a comment at SEOmoz from Google’s Matt Cutts which suggests that Google might do away with the supplemental index altogether.

Here is what Matt said:

As I mentioned at SMX Seattle, my personal preference would be to drop the “Supplemental Result” tag altogether because those results are 1) getting fresher and fresher, and 2) starting to show up more and more often for regular web searches.

Is this a sign that Google is actually going to remove the supplemental results tag from its web results?

Matt said he is afraid SEOs might be “fixated on Supplemental Results and focus on them to the exclusion of other aspects of SEO.” He compares the supplemental results fixation to that of PageRank:

We saw that happen with the toolbar PageRank bar and ended up slowing the update rate on the visible toolbar PageRank to every 3-4 months so that people didn’t spend too much of their time concentrating on PageRank and less on other parts of good SEO

Google has been saying for a while that the supplemental results are not a bad thing, but most SEOs are skeptical of that claim. If supplemental results are equal to normal Google results in every way, then I am all for removing the supplemental tag. But I personally doubt they are equal in every way, and therefore I am all for moving the supplemental status check into Google Webmaster Tools.

By the way, the method to check for supplemental results works like this. Do a site: search for your site, followed by **** and then -asssdsd. Here’s how it works for Search Engine Land:

site:searchengineland.com **** -asssdsd
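
For anyone who wants to run the same check against several sites, here is a minimal Python sketch that builds the query above for any domain. The helper names and the use of the standard `https://www.google.com/search?q=` URL form are my own assumptions for illustration. The sketch only assembles the URL; it says nothing about whether a given data center still honors the trick.

```python
from urllib.parse import urlencode

# Assumption: the ordinary Google search endpoint with a "q" parameter.
GOOGLE_SEARCH = "https://www.google.com/search"


def supplemental_check_query(domain):
    """Return the query text described above: site:, four asterisks, -asssdsd."""
    return f"site:{domain} **** -asssdsd"


def supplemental_check_url(domain):
    """Return a search URL with the query text safely URL-encoded."""
    return GOOGLE_SEARCH + "?" + urlencode({"q": supplemental_check_query(domain)})


url = supplemental_check_url("searchengineland.com")
print(url)
```

Paste the printed URL into a browser to run the search.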



Barry Schwartz is Search Engine Land's News Editor and owns RustyBrick, a NY based web consulting firm. He also runs Search Engine Roundtable, a popular search blog on very advanced SEM topics. Barry's personal blog is named Cartoon Barry and he can be followed on Twitter here.


13 COMMENTS ON Is Google Gearing Up To Drop The Supplemental Result Label?

Halfdeck,

They might be getting fresher but they still draw very little traffic compared to pages that rank for competitive terms. I had a site with ~25,000 supplemental pages pull 150 uniques a day, compared to 3K/day fully indexed on Yahoo.



Michael Martinez,

Matt Cutts confirmed at SMX Advanced 2007 that Supplemental Results pages are not parsed the same way as pages in the Main Web Index. So if Google only “does away” with the “Supplemental Results” label, they will be playing a game of smoke and mirrors with both users and Webmasters.

Either Google should start fully parsing Supplemental Results pages so that they have a fair chance of ranking in search results, or they should keep the label so that Webmasters can see they still need to get more inbound links to move those pages into the Main Web Index.

Google should not under any circumstances pretend to be combining the two indexes. Either Supplemental Results go away completely or they stay. This is not a “freshness” issue. This is a “Google will not fully parse and index Supplemental Pages” issue.

The stupid query that people have been passing around for “checking” Supplemental Results never worked properly. Google is doing Webmasters a favor by disabling that disability.



Hamlet Batista,

“Matt Cutts confirmed at SMX Advanced 2007 that Supplemental Results pages are not parsed the same way as pages in the Main Web Index.”

Michael – I think you mean Google doesn’t weight or trust pages in the supplemental index the same as the ones in the main one.

Parsing is a programming term that describes the process of breaking down a structured text into its logical parts. This is something that has to be done for all HTML pages no matter in what index they are going to end up.



Halfdeck,

“Michael – I think you mean Google doesn’t weight or trust pages in the supplemental index the same as the ones in the main one.”

Nope, Hamlet.

Here’s a direct quote from a video taken at SMX Seattle:

“We parse pages and we index pages differently when they’re in the supplemental index. Think of it almost as if it’s sort of a compressed summary. So we’ll index some of the words in different ways on pages from the supplemental index, but not necessarily every single word in every single phrase relationship…”



Hamlet Batista,

Halfdeck – thanks for the quote.

Please note a subtle difference.

He uses “parse and index” in the first sentence and only “index” when explaining in detail.

Michael on the other hand only uses “parse”.

What I understand from the quote is that Google indexes fewer words for supplemental pages than they index for regular ones. This clearly explains why you don’t get as much traffic when the pages are in the supplemental index.



Halfdeck,

“Please note a subtle difference.

He uses ‘parse and index’ in the first sentence and only ‘index’ when explaining in detail.

Michael on the other hand only uses ‘parse’.”

Jeez, what’s so hard about admitting it when you’re wrong? :)

A guy named Tom at a gas pump says “I drink pepsi and eat a big mac on sundays.”

Bob says “that guy Tom drinks pepsi on sundays.”

Ok so Bob only mentioned pepsi, not big mac. Does that mean Tom doesn’t drink pepsi on Sunday? Nope. That statement is still true. Follow me?

Now, Tom continues: “The big mac I eat got extra pickles and onions. I always ask for ‘em so they give me a fresh burger instead of giving me one that’s been sitting around for two hours.”

Since Tom doesn’t mention pepsi anymore, does that mean he doesn’t drink pepsi on Sundays?

Matt Cutts said:

“We parse pages and we index pages differently when they’re in the supplemental index”

He is telling you:

1. Google parses supplemental pages differently than pages in the main index.

2. Google indexes supplemental pages differently than pages in the main index.

Michael Martinez wrote:

“Supplemental Results pages are not parsed the same way as pages in the Main Web Index”

which is exactly what Matt Cutts said.



Hamlet Batista,

Halfdeck,

I am very humble and I have absolutely no problems at all admitting when I am wrong.

I appreciate the time you took to create this nice analogy. However, I am a programmer, I have created parsers myself, and I know exactly what they do. “Parse” is not the right word, at least not alone. He needed to use parse+index or index only, the same way Matt did.

Again, instead of “parse”, I’d have written: Google “extracts and indexes” fewer words for pages in the supplemental index.

Please bring this to Matt’s attention. He can say whether I am making sense or not. I did a Google search for “define: parse”:

Definitions of parse on the Web:

* To break down a string of information such as a command or file into its constituent parts.
http://www.sabc.co.za/manual/ibm/9agloss.htm

* Analysis of the grammar and structure of a computer language (like SQL).
http://www.orafaq.org/glossary/faqglosp.htm

* President’s Award for Research and Scholarly Excellence.
http://www.athabascau.ca/misc/glossary.html

* While traditionally a concept of syntax and grammar validation, when used in relation to mark-up languages, this term refers to a process of validating files by checking that tags are applied legally according to a pre-defined structure. This structure is typically defined by the Document Type Definition (DTD). Common terms used in mark-up validation are “parser” (a piece of software that validates) and “parsed”.
http://www.dclab.com/DCLTP.ASP

* To interpret a network address or command in order to do something with it. For example, to translate a FidoNet address into a form which can be understood by machines on the Internet, it is necessary to break it into its constituent parts (user’s name, zone, network, node, and point) and put the parts in an order which Internet mail transport mechanisms understand. …
associate.com/camsoc/ctt/gloss-p.html

* processing of a text file to extract desired data. Linguistic parsing may recognize words and phrases in text, and even recognize parts of speech.
philip.pristine.net/glit/en/terms.html



Halfdeck,

“I am a programmer, I have created parsers myself, and I know exactly what they do.”

I’ve written a full-blown web crawler, which involves HTML scraping, parsing, 301 redirect detection, and robots.txt parsing, so I know a little bit about parsing webpages.

“He needed to use parse+index or index only. The same way Matt did.”

Ok, so parse+index is ok, index only is ok, but parse only is not ok. That makes perfect sense /sarcasm :)

Anyway, you’re moving far away from your original objection:

“Michael – I think you mean Google doesn’t weight or trust pages in the supplemental index the same as the ones in the main one.”

Let me put the answer in a way that might be easier for you to digest:

Google doesn’t parse and index pages in the supplemental index the same as the ones in the main index.

During parsing, Google may use less elaborate regexps that produce smaller result sets. And during indexing, Google may ignore keywords or phrases that produce over 1,000 results from the main index.



Hamlet Batista,

Halfdeck,

My point is that parsing is not the right word in this case. As I said, please bring it up with Matt Cutts. I still don’t think this makes sense:

“Either Google should start fully parsing Supplemental Results pages so that they have a fair chance of ranking in search results”

I am glad that you are a programmer as well, so that I can go into more detail.

HTML Parsing and HTML scraping are not the same.

You can use regular expressions (scraping) to extract specific elements such as email addresses, etc. but extracting (meaning) the body text, position information, font information, URLs, etc. requires a DOM, SAX or Pull HTML parser. The software needs to break the full page into its parts. This is something you don’t do partially.

Here is the source code of my basic multi-threaded Perl crawler. I posted it a few days ago. Note that I use a full HTML parser, instead of using regular expressions (scraping).
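
The crawler source itself is not reproduced here. As a rough illustration of the event-driven (SAX-style) approach described in this comment, here is a minimal Python sketch built on the standard library's `html.parser`. The class name, the sample markup, and the choice to skip `<script>` and `<style>` content are assumptions made for the example. It is not the Perl crawler, and it says nothing about how Google processes pages.

```python
from html.parser import HTMLParser


class LinkAndTextExtractor(HTMLParser):
    """Collects link URLs and visible text as the parser walks the whole page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text_parts = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a":
            # attrs is a list of (name, value) pairs; quoting style does not matter
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep non-empty text that is not inside a skipped element
        if not self._skip_depth and data.strip():
            self.text_parts.append(data.strip())


# Hypothetical page used only for this demonstration
SAMPLE_HTML = """
<html><head><title>Demo</title><style>p { color: red; }</style></head>
<body><p>Hello <a href="/about">about us</a></p>
<a class="nav" href='https://example.com/'>Home</a></body></html>
"""

extractor = LinkAndTextExtractor()
extractor.feed(SAMPLE_HTML)
extractor.close()
print(extractor.links)
print(extractor.text_parts)
```

The parser reads every tag on the page and decides afterwards what to keep. That is the sense in which parsing is not done partially, even when only some of the extracted pieces end up being used.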



Halfdeck,

Whether you crawl the DOM tree or parse HTML using regular expressions, you are extracting information from a page. Look under the hood of code that traverses DOM trees and you’ll still find regexp code.

“but extracting (meaning) the body text, position information, font information, URLs, etc. requires a DOM, SAX or Pull HTML parser.”

No, you can extract URLs and other stuff just fine with regexp – it’s just lower-level coding. I’ve done both (extracting information from XML using AJAX and using regexp to extract content from websites), so I know.

For example, on a page like this:

http://www.newegg.com/Product/Product.aspx?Item=N82E16820145034

my code would pull:

Product price
Brand name
Capacity
Voltage

etc.
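
To make that concrete, here is a minimal Python sketch of regexp-based field extraction of the kind described above. The HTML fragment, the brand name, and both patterns are invented for illustration. They are not the markup of the Newegg page linked above, and this is not the commenter's actual code.

```python
import re

# Hypothetical product-page fragment; real pages will differ.
SAMPLE_HTML = """
<h1 class="product-title">ExampleBrand 2GB DDR2 800 Memory</h1>
<span class="price">$49.99</span>
<li><b>Brand</b>: ExampleBrand</li>
<li><b>Capacity</b>: 2GB (2 x 1GB)</li>
<li><b>Voltage</b>: 1.8V</li>
"""

# Patterns are tied to the exact markup above and break if it changes.
PRICE_RE = re.compile(r'<span class="price">\s*\$([\d,]+\.\d{2})\s*</span>')
SPEC_RE = re.compile(r"<li><b>(\w+)</b>:\s*(.*?)</li>")


def scrape_product(html):
    """Pull the price and the name/value spec rows out of the raw HTML text."""
    fields = {}
    price_match = PRICE_RE.search(html)
    if price_match:
        fields["Price"] = price_match.group(1)
    for name, value in SPEC_RE.findall(html):
        fields[name] = value.strip()
    return fields


product = scrape_product(SAMPLE_HTML)
print(product)
```

This works without building any document tree, but it only sees the patterns it was written for. A change in attribute order or quoting on the page would require new patterns.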



Michael Martinez,

Hamlet, I’m a programmer too, have been programming longer than most people in the SEO field have been alive, and I used the term “parse” correctly.

You need to lighten up and just accept the fact that Google doesn’t parse Supplemental Pages the same way it parses pages in the Main Web Index.

The issue here is not how to use the word “parse”. It’s the fact that if Google removes the “Supplemental Results” label from their search results without changing their processing of Supplemental Results pages so that they are handled exactly the same way as pages in the Main Web Index, they will be misleading people.

As things stand right now, most Supplemental Results pages have absolutely no chance of ranking in search results for reasonable queries because Google deliberately refrains from parsing their contents. Dropping “Supplemental Results” labels from SERPs won’t change that.

They need to change the way they handle the data if they are going to change the way they present it. They have established the expectation that partially parsed or unparsed pages will be denoted as Supplemental. Pretending that such pages are just like the content in the Main Web Index will fool naive Webmasters who don’t stay on top of SEO theory, but it won’t fool everyone.



Hamlet Batista,

@halfdeck:

I respect your position but I still do not agree with it. What you described doing for that page is scraping. I’d like to see examples of full HTML, XML, C, Java, C++, Python, Perl, etc. parsers built with regular expressions. That would be very interesting to see.

Most parsers are built with parser generators, not with regular expressions.

@Michael:

I am glad that you are a programmer as well. Although I do not agree with your use of ‘parse’, I do understand what you mean and I fully agree with you. Removing the supplemental labels and yet partially indexing those pages is definitely deceptive.



Halfdeck,

Hamlet, I’m just gonna let you have the last word :)



