Earlier this week, Google Fellow Amit Singhal gave the opening keynote at SMX London. Although Matt Cutts has always been the public face of all parts of Google’s unpaid search, his realm is primarily web spam. Singhal has been speaking publicly more often (notably when Panda launched) and oversees search quality. Or, as he described in his talk, when he came to Google in 2000, he took a look at Sergey Brin’s code and entirely rewrote Google’s ranking algorithms.
Near the end of the talk, someone asked whether the amount of money Google stands to make is factored into decisions about changes to Google’s (unpaid search) algorithms. Singhal was adamant: “no revenue measurement is included in our evaluation of a rankings change.” Listening to him explain how excited he gets about search improvements and how changes are evaluated, you realize there’s no spin here. He’s absolutely telling the truth. And he would know. Chris Sherman asked if anyone at Google really understands how the whole thing works and he replied that while no one knows how everything works (all of unpaid search, AdWords, Android, etc.), he has a pretty good idea of how all of unpaid search works. Not many can make that claim.
Core to Singhal’s talk was a focus on what Google does look at when improving unpaid search algorithms. The key is always relevance.
Singhal talked about the evolution of Google’s unpaid search algorithms. In 2003, they worked on stemming and synonyms. This meant that those searching for [watch buffy the vampire slayer], [watching buffy the vampire slayer], and [view buffy the vampire slayer] would likely all see the same results. In 2007 came universal search, which was a big step forward in understanding searcher intent. (Searchers typing in [i have a dream] are not only looking for Martin Luther King Jr.’s speech, but would also like to see a video of it.)
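To make the stemming and synonym idea concrete, here is a toy sketch in Python. It is not Google’s implementation: the suffix rule, the `SYNONYMS` table, and the function names are all assumptions invented for illustration. It only shows how three differently worded queries could collapse to one normalized form.

```python
# Toy illustration only: the suffix rule and synonym table are assumptions,
# not Google's actual stemming or synonym data.
SYNONYMS = {"view": "watch"}


def stem(word):
    """Strip a trailing 'ing' from longer words (a deliberately crude rule)."""
    if word.endswith("ing") and len(word) > 5:
        return word[:-3]
    return word


def normalize(query):
    """Lowercase the query, stem each word, and map synonyms to one term."""
    terms = []
    for word in query.lower().split():
        stemmed = stem(word)
        terms.append(SYNONYMS.get(stemmed, stemmed))
    return tuple(terms)


queries = [
    "watch buffy the vampire slayer",
    "watching buffy the vampire slayer",
    "view buffy the vampire slayer",
]

# A set keeps only distinct normalized forms.
normalized = {normalize(q) for q in queries}
print(normalized)
```

In this sketch all three queries reduce to the same tuple of terms, which is why they could be served the same results.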
Ten years ago, search results were keyword-based, but Google is now moving towards understanding the intent behind the words. Singhal talked about Google’s acquisition of Metaweb, the company behind Freebase, which has done substantial work on understanding phrases as entities rather than strings. “Mount Everest” isn’t just two words; it’s also a mountain, with a height, in a location, and so on. (Shortly after the talk, Google launched its Knowledge Graph, which is the next step in this understanding.) Combine intent with speech recognition and mobile devices and you almost end up with what Singhal first glimpsed years ago on Star Trek. We do, indeed, live in the future (almost).
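The difference between a string and an entity can be sketched with a small data structure. This is a hypothetical illustration, not Freebase’s or Google’s schema: the `Entity` class, the `ENTITIES` store, and the `interpret` function are invented names, and the height figure is approximate.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """A thing with a type and properties, rather than a bag of words."""
    name: str
    kind: str
    properties: dict = field(default_factory=dict)


# Hypothetical in-memory store keyed by the lowercase phrase.
ENTITIES = {
    "mount everest": Entity(
        name="Mount Everest",
        kind="mountain",
        properties={"height_m": 8849, "location": "Nepal/China border"},
    ),
}


def interpret(query):
    """Return a known entity for the whole phrase, else fall back to keywords."""
    entity = ENTITIES.get(query.strip().lower())
    if entity is not None:
        return entity
    return query.lower().split()


print(interpret("Mount Everest"))   # an Entity with a kind and properties
print(interpret("tall mountains"))  # plain keyword list
```

A keyword-only system sees two unrelated words in the first query. An entity-aware one can answer follow-up questions about height or location.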
In 2012, Google took a big step (whether or not that step was forward is up for debate) towards greater personalization with Search Plus Your World, which began incorporating Google+ into search results for those logged in. Singhal explained that Google+ integration was not the point; it was just a proof of concept. The point was a foundation for a wider world of (more secure) searching over everything: both what’s public in the world and what’s private to each searcher. Perhaps one day Google will in fact be able to find your car keys.
Singhal said that searcher click behavior shows that searchers are happy with this integration. But he acknowledged there’s work to be done. When asked when it would launch in Europe, he said that based on feedback, it’s undergoing improvements first.
Relevance and Data: How Changes Are Evaluated
Search Plus Your World is built and evaluated the way all ranking algorithm changes are: build, evaluate, launch, learn, improve, repeat. Relevance is key to every measurement. Singhal stepped through the process:
- An engineer at Google has an idea of a signal (one of over 200) that might be introduced or tweaked to improve overall relevance.
- That algorithm change is run on a test set of data and if all looks good, human raters look at before and after results for a wide set of queries (a kind of manual A/B test). The human raters don’t know which is the before and which is the after. The raters report what percentage of queries got better (more relevant) and what percentage got worse (less relevant).
- This process gets looped several times as the algorithm is tweaked to better serve results for the queries in the “worse” set.
- Once the overall manual ratings show that the algorithm tweak makes results better overall, it’s all tested again. This time, a data center (one of many that contain Google’s index and serve results to searchers) is loaded with the new algorithm and a very small slice of searchers (typically 1%) see the modified result set. Are those searchers happier than the ones seeing the version of results without the tweak? Singhal says they compare where searchers click. Clicks on higher ranked pages mean results at the top are likely more relevant, and searchers are happier. (He didn’t say so, but they may look at other data, such as click and back behavior.)
- An independent analyst compiles the results and provides a statistical analysis, which is presented at a search quality meeting, where engineers look at the data and debate the change. If they decide this tweak improves the quality of search results overall (and is good for the web and doesn’t overly tax internal systems), the change goes out.
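The blind rating step and the small live slice described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Google’s tooling: the function names are hypothetical, and hashing a user ID into a bucket is a common way to pick a stable 1% slice, whereas the talk described loading the change onto one data center.

```python
import hashlib
import random
from collections import Counter


def blind_pair(before, after, rng):
    """Randomize which side shows the new results so raters can't tell.

    Returns (left, right, after_is_left).
    """
    after_is_left = rng.random() < 0.5
    return (after, before, True) if after_is_left else (before, after, False)


def tally(judgments):
    """Summarize rater judgments as percentages.

    judgments: iterable of (preferred_side, after_is_left), where
    preferred_side is 'left', 'right', or 'same'.
    """
    counts = Counter()
    total = 0
    for preferred, after_is_left in judgments:
        total += 1
        if preferred == "same":
            counts["same"] += 1
        elif (preferred == "left") == after_is_left:
            counts["better"] += 1  # rater preferred the new results
        else:
            counts["worse"] += 1   # rater preferred the old results
    if total == 0:
        raise ValueError("no judgments to tally")
    return {k: 100.0 * counts[k] / total for k in ("better", "worse", "same")}


def in_experiment(user_id, percent=1.0):
    """Deterministically place roughly `percent`% of users in the test slice."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % 10000 < percent * 100


rng = random.Random(0)
left, right, after_is_left = blind_pair(["old"], ["new"], rng)
summary = tally([("left", True), ("right", True), ("same", False), ("right", False)])
print(summary)
```

The queries that land in the “worse” share are the ones an engineer would revisit before the next round of ratings.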
This process is happening all of the time with lots of different proposed tweaks and tests. 525 algorithm changes were launched in 2011. That may seem like a lot, but earlier this year Singhal noted that many more changes were tested.
“Concurrently we have approximately 100 ideas floating around that people are testing – we test thousands in a year. Last year we ran around 20,000 experiments. Clearly they don’t all make it out there but we run the process very scientifically.”
Aggregated data from millions of searchers typing millions of queries provides clear patterns. Singhal said that not only do those who get better results more quickly click higher in the search results, but they also search more. (We’ve heard this before from Google. Marissa Mayer, for instance, has noted that a half-second delay in rendering search results resulted in 20% fewer searches.)
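One simple way to express “clicking higher” as a number is the average rank of the clicked result, compared between searchers who saw the old results and those who saw the new ones. The sketch below is only an illustration: the click ranks are made-up sample data, the function name is hypothetical, and a real analysis would add a statistical significance test.

```python
from statistics import mean


def compare_click_ranks(control, experiment):
    """Compare average clicked position (1 = top result) between two groups.

    A lower average means searchers found what they wanted nearer the top.
    """
    control_mean = mean(control)
    experiment_mean = mean(experiment)
    return {
        "control": control_mean,
        "experiment": experiment_mean,
        "improved": experiment_mean < control_mean,
    }


# Made-up sample data: the rank of the result each searcher clicked.
control_clicks = [1, 3, 2, 4, 2]
experiment_clicks = [1, 2, 1, 3, 1]

report = compare_click_ranks(control_clicks, experiment_clicks)
print(report)
```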
Singhal noted that the kind of personalization platform envisioned with Search Plus Your World is harder to test. Human evaluation looks at relevance, but personal relevance is unique for each searcher. All Google really has to go on is click behavior. Singhal talked with Danny Sullivan about this dilemma a few weeks after Search Plus Your World launched:
“Every time a real user is getting those results, they really are delighted. Given how personal this product is, you can only judge it based on personal experiences or by aggregate numbers you can observe through click-through.”
All of this gets complicated by varied screen size. The user interface becomes more important as increased use of mobile devices and tablets shrinks screen real estate.
If these changes are all about increased relevance, why is only Google+ represented in Search Plus Your World? Why not Facebook and Twitter? Singhal explained that most personally useful Facebook data is locked behind a login, and Twitter produces content at a rate that is too massive for Google to crawl quickly and comprehensively. Or, they could, but it would probably take down the Twitter servers. Twitter has also had some technical issues that have made crawling difficult, although those are being fixed.
What About Panda and Penguin?
Singhal said that Google’s algorithms aren’t perfect (hence the 20,000 experiments a year). He looks at bad queries every day (and encouraged the audience to let him know about them! So, add them to the comments on this post and we’ll forward them along). But when asked specifically about Panda and Penguin, two of the latest high-profile algorithm changes, he said that data has shown they significantly increased the number of high-quality sites being returned in results. Google’s engineers are not only refining which signals they use in ranking, but are also improving how they gather and tune the signals themselves (so signal quality is higher). They are constantly looking for aberrations in signals.
At the end of the day, he said, site owners need to take a hard look at what value their sites are providing. What is the additional value the visitor gets from that site beyond just a skeleton answer? Ultimately, it’s those sites that provide that something extra that Google wants to showcase on the first page of search results.