13,000 Precision Evaluations: Schmidt’s Testimony Reveals How Google Tests Algorithm Changes
Precision evaluations. Side-by-side experiments with panels of human testers. Click evaluations. All of it led to more than 500 algorithm changes in 2010.
Eric Schmidt is speaking today at a Senate subcommittee hearing in Washington, DC, answering claims that Google’s business practices are anti-competitive. As all witnesses do, Schmidt prepared written remarks that were submitted to the subcommittee before his appearance. The subcommittee has released that document (PDF download), which includes a section revealing more details about how Google tests its algorithm changes.
Schmidt explains that “potential refinements to the algorithm go through a rigorous testing process, from conception to initial testing in Google’s internal ‘sandbox’ to focused testing to final approval.” He then details how Google uses both human review and internal analysis to develop what amounted to more than 500 algorithmic changes in 2010:
To give you a sense of the scale of the changes that Google considers, in 2010 we conducted 13,311 precision evaluations to see whether proposed algorithm changes improved the quality of its search results, 8,157 side-by-side experiments where it presented two sets of search results to a panel of human testers and had the evaluators rank which set of results was better, and 2,800 click evaluations to see how a small sample of real-life Google users responded to the change.
Ultimately, the process resulted in 516 changes that were determined to be useful to users based on the data and, therefore, were made to Google’s algorithm. Most of these changes are imperceptible to users and affect a very small percentage of websites, but each one of them is implemented only if we believe the change will benefit our users.
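To put those figures in proportion, here is a rough back-of-the-envelope sketch in Python. The four counts come straight from Schmidt's testimony; the variable names and the ratios are our own illustration. The testimony does not say how evaluations map to individual changes (one proposed change may well go through several evaluations of each kind), so treat the ratios as a sense of scale rather than a launch rate.

```python
# Counts quoted in Schmidt's written testimony for 2010.
precision_evaluations = 13_311
side_by_side_experiments = 8_157
click_evaluations = 2_800
launched_changes = 516

# Total number of evaluations of all three kinds.
total_evaluations = (
    precision_evaluations + side_by_side_experiments + click_evaluations
)

# Assumption (ours, not Google's): compare the totals directly, even though
# the testimony does not say how many evaluations each change went through.
evaluations_per_change = total_evaluations / launched_changes
changes_per_100_evaluations = 100 * launched_changes / total_evaluations

print(f"Total evaluations: {total_evaluations:,}")
print(f"Evaluations per launched change: {evaluations_per_change:.1f}")
print(f"Launched changes per 100 evaluations: {changes_per_100_evaluations:.1f}")
```

By this rough count, Google ran dozens of evaluations for every change that actually made it into the algorithm.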
It’s no secret that Google uses human “quality raters”; we’ve written about that in past years, and others have, too. A Google Quality Raters Handbook, possibly somewhat outdated by now, was posted online in 2008. It has since been removed, but our article recapping the handbook is still around.