Google & Human Quality Reviews: Old News Returns
A quote from Google’s director of research Peter Norvig in an article
about how human reviewers assess Google’s search quality is starting to pick up
play on the web and will no doubt grow. Problem is, it’s not news. Still, I
suppose it’s interesting to those who are new to the space or missed the
attention it got back in 2005. Some history follows.
Back in June 2005, Henk van Ess
put out information about how Google uses temporary employees to review
search results and rate the quality of them. He picked up a little hype by
calling it the “Google Secret Evaluation Lab,” while Google itself called it the
Google Rater Hub and the Quality Rater program. Calling it secret was overkill:
Google advertised for quality raters publicly. Here’s an
example from 2004.
The program wasn’t secret — just little known.
A human review program certainly wasn’t unprecedented. WebCrawler (remember
WebCrawler?) had a similar program running back in the 1990s, and all the major
search engines — to my last knowledge — have used human reviewers to assess
the quality of what they produce.
Ah — but shouldn’t Google be so smart as not to need humans at all? That’s
sort of the suggestion of a rival engineer that inspired the
New York Times to look at the program briefly today, and it got
ReadWriteWeb wondering as well. In a word, no.
If you go back to one of Google’s original algorithms,
PageRank, it was
designed to help Google automatically find the best pages by doing what a human
would do. It was meant to mimic human relevance assessments.
Over time, Google’s algorithms (like those of the other major search engines)
have continued to change and be refined. But the one unifying element to them
all is that they’re trying to do what humans would do. Would humans like pages
that rank well because they are stuffed full of nonsensical keywords? No. Enter
a filter for that. Would humans like fresh news content getting to the top of
the results for search queries that spike in reaction to a news event? Yes —
and enter a change for that.
Next, let’s talk about “hand manipulation.” To me, the anonymous search
engineer from a Google rival cited in the New York Times article is suggesting
that Google is actually picking the best sites on a query-by-query basis. For
example, if you were to search for cars, then Google had people behind the
scenes picking out the top sites.
I’ve heard this countless times from many people who have no evidence to back
it up. In contrast, I can offer several good reasons to believe it doesn’t happen:
- Google Loves Algorithms: I mean, they really love them. Googlers
would rather spend hours figuring out how to make lights automatically come on
and off when people enter a room than just flip a switch. They have that type
of mentality. They will let obvious spam and crud sit in their results for
days or weeks while they figure out an “algorithmic” solution. They just
aren’t into hand-editing results.
- Hand Editing Would Be Better: If Google were hand editing results,
the results would be much better than what you actually get. Search for anything,
any popular term, and there will be some site there that just isn’t a good fit.
Those wearing tinfoil on their heads will say that’s to disguise the good hand
editing that they’re doing. Please.
There is some hand editing that happens, but it is not done on a
query-by-query basis. This is when a site or pages from a site are penalized.
Google has actually acted much quicker than in the past in these cases, where
a penalty is applied that impacts a site across a range of queries, but not any
particular query.
In my years covering Google, the closest I’ve seen them come to any
query-by-query manipulation is in the case of Googlebombing.
Google Says Stephen
Colbert Is No Longer The Greatest Living American covers how Google’s
algorithmic fix to Googlebombing initially failed to work in that case. The
explanation that later came never felt that convincing, I’m afraid.
By the way, wondering if you have been visited by a quality rater? Barry
Schwartz did a post
on this last year, covering what you might see in your logs. And want some more
history on true human-generated results?
Mahalo Launches With
Human-Crafted Search Results from me last May covers that, including some of
the reasons humans are a good thing.
By the way, Mahalo — and even more so
Search Wikia from
Jimmy Wales — are responsible for the current wave of “we have humans too”
messaging that Google’s been putting out in recent months, to the degree I kind
of want to puke. Google is clearly sensitive to the claim made by competitors that it seems like some Cylon or Borg (pick your favorite SF series) machine that lacks all humanity. As a result, we’ve had a stream of “we use
human linking patterns” or “we’ve got human reviewers” or “humans personalize
their own results.”
All of these things are actually true, but they are a world apart from the
human involvement currently at Mahalo or what Search Wikia is planning. And
that’s not to say Google is weaker for lacking proactive human involvement, either.
The jury’s still largely out on this. But it is interesting to see them react to
much of the pressure.
Postscript (July 14, 2010): History keeps repeating itself. The Financial Times ran an article about how Google might combat content farms, which contains this section:
Google’s Mr Singhal calls this the problem of “brand recognition”: where companies whose standing is based on their success in one area use this to “venture out into another class of information which they may not be as rich at”. Google uses human raters to assess the quality of individual sites in order to counter this effect, he adds.
That’s been interpreted at ZDNet as if Google is hand editing results:
I’ve known about this for several years but wasn’t able to get anyone from Google on the record. These Google employees have the power to promote or even completely erase a site from the Google index.
This admission is potentially a very large problem for Google because it has maintained that its index rankings are unbiased and are computed from a natural pecking order derived from how other sites find a specific site important.
As I’ve explained above, this is incorrect. The human raters cannot promote or remove anything. They simply rate the quality of the web pages they review as an additional feedback mechanism, which Google then uses to try to create a better search algorithm.
Google also confirms to us that human raters still operate as I explained above.