BusinessWeek Dives Deep Into Google’s Search Quality
In a series of interviews and articles recently published online, BusinessWeek magazine tries to open up the curtains on Google’s search quality team — the ways team members evaluate Google’s search rankings and their decision-making process when changes are being made.
Perhaps the place to start is with BusinessWeek’s interview of Google CEO Eric Schmidt, where one of the main talking points about search quality comes down to one word: data. Schmidt says Google’s biggest strength is having “so much scale in terms of the data we can bring to bear,” but admits that data is a stumbling block where some real-time and social activities are concerned.
“If we can’t get the data, it’s very, very difficult for us to rank it. Facebook has chosen to keep much of its data behind a wall, that’s what it has decided to do. We favor openness, because we think that works best for the users.
Twitter is a good example of something that is very hard to rank. With real-time, we should over time find a proper way to rank them.”
The series gets more detailed in an interview with Google VP Udi Manber, head of the search quality group. He reveals some stats about the search quality team’s ongoing evaluation of search results.
“We ran over 5,000 experiments last year. Probably 10 experiments for every successful launch. We launch on the order of 100 to 120 a quarter. We have dozens of people working just on the measurement part. We have statisticians who know how to analyze data, we have engineers to build the tools. We have at least five or 10 tools where I can go and see here are five bad things that happened. Like this particular query got bad results because it didn’t find something or the pages were slow or we didn’t get some spell correction.”
Manber explains that Google’s search quality team is able to try any idea it has for improving results, and the infrastructure allows the team to learn “sometimes in a day” if the change is good or bad. He says Google will bring more tools to try solving the challenge of real-time search: “If something is written on the Web that is important, we should bring it back to you in seconds. Right now we’re in minutes. Five years ago, it was once a month.” There are several questions (and answers) on this topic, and I strongly suspect the conversation reveals much of what the Google Caffeine update is about.
In a separate interview, Amit Singhal (who runs Google’s core ranking team) reveals that he’s “dismantled” the original ranking system that Larry Page and Sergey Brin developed. Singhal also talks in depth about the search quality team’s regular meetings, which often involve discussing and evaluating “outlandish” ideas.
Scott Huffman leads the Google team that evaluates search results, and he tells BusinessWeek about how Google uses both human evaluators and automated tools to measure how good its search results are. He goes into detail about both, and the explanation of Google’s automated measurement system is particularly interesting.
“On a continuous basis in every one of our data centers, a large set of queries are being run in the background, and we’re looking at the results, looking up our evaluations of them and making sure that all of our quality metrics are within tolerance.
These are queries that we have used as ongoing tests, sort of a sample of queries that we have scored results for; our evaluators have given scores to them. So we’re constantly running these across dozens of locales. Both broad query sets and navigational query sets, like ‘San Francisco bike shop’ to the more mundane, like: Here’s every U.S. state and they have a home page and we better get that home page in the top results, and if we don’t … then literally somebody’s pager goes off.”
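To make Huffman's description a little more concrete, here is a minimal sketch of that kind of background check, written in Python. It is not Google's system: the `TestQuery` structure, the `max_rank` tolerance, the `fake_search` function and the sample URLs are all hypothetical stand-ins, chosen only to illustrate the idea of running known queries and raising an alert when the expected page falls out of the top results.

```python
from dataclasses import dataclass


@dataclass
class TestQuery:
    query: str
    expected_url: str  # the page evaluators decided should rank well (assumed)
    max_rank: int = 3  # hypothetical tolerance: must appear in the top 3


def find_rank(results, expected_url):
    """Return the 1-based position of expected_url in results, or None."""
    for position, url in enumerate(results, start=1):
        if url == expected_url:
            return position
    return None


def check_queries(test_queries, search):
    """Run every test query and collect the ones that are out of tolerance."""
    alerts = []
    for tq in test_queries:
        rank = find_rank(search(tq.query), tq.expected_url)
        if rank is None or rank > tq.max_rank:
            alerts.append((tq.query, rank))  # in real life: page someone
    return alerts


def fake_search(query):
    """Stand-in for a search backend; returns a ranked list of URLs."""
    index = {
        "california": ["https://www.ca.gov", "https://example.com/ca"],
        "texas": [
            "https://example.com/a",
            "https://example.com/b",
            "https://example.com/c",
            "https://www.texas.gov",
        ],
    }
    return index.get(query, [])


SAMPLE_QUERIES = [
    TestQuery("california", "https://www.ca.gov"),
    TestQuery("texas", "https://www.texas.gov"),
    TestQuery("ohio", "https://ohio.gov"),
]

ALERTS = check_queries(SAMPLE_QUERIES, fake_search)
```

In this toy run the "texas" query is flagged because the expected home page sits at position four, and "ohio" is flagged because the page is missing entirely. A real system would presumably track many more quality metrics than a single rank threshold.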
Matt Cutts, Google’s head of Web spam, gets the last interview in the series. Search Engine Land readers are probably most familiar with what Matt’s team does, but he does go into detail on how Google tries to distinguish when spam is intentional versus when it’s the result of a site being hacked.
“If you’ve got a longstanding site and then all of a sudden a brand-new directory pops up and it’s got a bunch of spammy terms like online casinos and debt consolidation, pills, and you’ve seen a bunch of weird links from other sites show, then you think maybe this part of the site has been hacked. So let’s not show this directory of sites to people for a little while until we know whether it’s spam or malware—or maybe scan those other 80 pages for malware as well.”
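The signals Cutts lists can be combined into a simple rule of thumb. The Python sketch below is only an illustration of that reasoning, not how Google actually does it: the function name `looks_hacked`, the term list and every threshold are assumptions made up for the example.

```python
# Hypothetical list of spammy phrases, taken from the examples in the quote.
SPAMMY_TERMS = ("online casino", "debt consolidation", "pills")


def looks_hacked(
    site_age_days,
    directory_age_days,
    directory_text,
    suspicious_inbound_links,
    min_site_age_days=365,      # assumed cutoff for a "longstanding" site
    max_directory_age_days=14,  # assumed cutoff for a "brand-new" directory
    min_term_hits=2,            # assumed number of spammy terms required
    min_links=5,                # assumed number of "weird" links required
):
    """Return True when an old site's new directory shows all the warning signs."""
    text = directory_text.lower()
    term_hits = sum(1 for term in SPAMMY_TERMS if term in text)
    return (
        site_age_days >= min_site_age_days
        and directory_age_days <= max_directory_age_days
        and term_hits >= min_term_hits
        and suspicious_inbound_links >= min_links
    )
```

For example, `looks_hacked(2000, 3, "Best online casinos and cheap pills", 12)` returns `True`, while the same directory on a ten-day-old site returns `False`, since a new site full of spam is more likely to be intentional than hacked.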
The overall impression I’m left with after reading the whole set of interviews is that, as Schmidt says initially, Google is relentless in its use of data to manage search quality — but it’s also quite willing, especially in recent years, to use human intervention to make sure the automated systems are doing what they should.