Peeking Into the World Of Google’s Algorithm Changes With Google Search Quality Head Amit Singhal

Earlier this week, Google Fellow Amit Singhal gave the opening keynote at SMX London. Although Matt Cutts has always been the public face of all parts of Google’s unpaid search, his realm is primarily web spam. Singhal has been speaking publicly more often (notably when Panda launched) and oversees search quality. Or, as he described in his talk, when he came to Google in 2000, he took a look at Sergey Brin’s code and entirely rewrote Google’s ranking algorithms.

Near the end of the talk, someone asked whether how much money Google will make is factored into decisions about changes to Google’s (unpaid search) algorithms. Singhal was adamant: “no revenue measurement is included in our evaluation of a rankings change.” Listening to him explain how excited he gets about search improvements and how changes are evaluated, you realize there’s no spin here. He’s absolutely telling the truth. And he would know. Chris Sherman asked if anyone at Google really understands how the whole thing works and he replied that while no one knows how everything works (all of unpaid search, AdWords, Android, etc.), he has a pretty good idea of how all of unpaid search works. Not many can make that claim.

Core to Singhal’s talk was a focus on what Google does look at when improving unpaid search algorithms. The key is always relevance.

Singhal talked about the evolution of Google’s unpaid search algorithms. In 2003, they worked on stemming and synonyms. This meant that those searching for [watch buffy the vampire slayer], [watching buffy the vampire slayer] and [view buffy the vampire slayer] would likely all see the same results. In 2007 came universal search, which was a big step forward in understanding searcher intent. (Searchers typing in [i have a dream] not only are looking for Martin Luther King Jr.’s speech, but would like to see a video of it.)
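To make the stemming and synonym idea concrete, here is a minimal sketch of how the three Buffy queries could collapse to one normalized form. The suffix list, the synonym table and the function names are all hypothetical and hand-written for this illustration; the talk did not describe how Google actually implements either step.

```python
# Hypothetical, hand-written tables for illustration only.
# A real search engine would derive these from large amounts of data.
SUFFIXES = ("ing", "es", "s")
SYNONYMS = {"view": "watch"}


def stem(word: str) -> str:
    """Strip one common suffix (a crude stand-in for a real stemmer)."""
    for suffix in SUFFIXES:
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def normalize(query: str) -> tuple:
    """Lowercase, stem each word, then map synonyms to one canonical term."""
    stemmed = [stem(word) for word in query.lower().split()]
    return tuple(SYNONYMS.get(word, word) for word in stemmed)


queries = [
    "watch buffy the vampire slayer",
    "watching buffy the vampire slayer",
    "view buffy the vampire slayer",
]
# All three queries reduce to the same normalized form.
print({normalize(q) for q in queries})
```

Because the three queries share one normalized form, a lookup keyed on that form would return the same results for all of them.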

Understanding Intent

Ten years ago, search results were keyword-based, but Google is now moving towards understanding the intent behind the words. Singhal talked about Google’s acquisition of Metaweb, the company behind Freebase, which has done substantial work on understanding phrases as entities rather than strings. “Mount Everest” isn’t just two words, it’s also a mountain, with a height, in a location, and so on. (Shortly after the talk, Google launched their Knowledge Graph, which is the next step in this understanding.) Combine intent with speech recognition and mobile devices and you almost end up with what Singhal first glimpsed years ago on Star Trek. We do indeed live in the future (almost).
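The difference between a string and an entity can be shown with a small data structure. This is only a sketch: the `Entity` class, the `ENTITY_INDEX` table and the `resolve` function are invented for this example and say nothing about how Freebase or the Knowledge Graph store data. The attribute values are rounded and illustrative.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Entity:
    """A thing with a type and properties, not just a sequence of words."""
    name: str
    kind: str
    attributes: dict = field(default_factory=dict)


# Hypothetical one-entry index keyed on the lowercased phrase.
ENTITY_INDEX = {
    "mount everest": Entity(
        name="Mount Everest",
        kind="mountain",
        attributes={"height_m": 8849, "location": "Nepal/China border"},
    ),
}


def resolve(phrase: str) -> Optional[Entity]:
    """Return the entity a phrase refers to, or None if it is just words."""
    return ENTITY_INDEX.get(phrase.strip().lower())


everest = resolve("Mount Everest")
if everest is not None:
    print(everest.kind, everest.attributes["height_m"])
```

Once a phrase resolves to an entity, questions such as "how tall is it" become attribute lookups rather than keyword matches.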

Personalization

In 2012, Google took a big step (whether or not that step was forward is up for debate) towards greater personalization with Search Plus Your World, which began incorporating Google+ into search results for those logged in. Singhal explained that Google+ integration was not the point; it was just a proof of concept. The point was a foundation for a wider world of (more secure) searching over everything: both what’s public in the world and what’s private to each searcher. Perhaps one day Google will in fact be able to find your car keys.

Singhal said that searcher click behavior shows that searchers are happy with this integration. But he acknowledged there’s work to be done. When asked when it would launch in Europe, he said that based on feedback, it’s undergoing improvements first.

Relevance and Data: How Changes Are Evaluated

Search Plus Your World is built and evaluated the way all ranking algorithm changes are: build, evaluate, launch, learn, improve, repeat. Relevance is key to every measurement. Singhal stepped through the process:

  1. An engineer at Google has an idea of a signal (one of over 200) that might be introduced or tweaked to improve overall relevance.
  2. That algorithm change is run on a test set of data and if all looks good, human raters look at before and after results for a wide set of queries (a kind of manual A/B test). The human raters don’t know which is the before and which is the after. The raters report what percentage of queries got better (more relevant) and what percentage got worse (less relevant).
  3. This process gets looped several times as the algorithm is tweaked to better serve results for the queries in the “worse” set.
  4. Once the overall manual ratings show that the algorithm tweak makes results better overall, it’s all tested again. This time, a data center (one of many that contains Google’s index and serves results to searchers) is loaded with the new algorithm and a very small slice of searchers (typically 1%) see the modified result set. Are those searchers happier than the ones seeing the version of results without the tweak? Singhal says they compare where searchers click. Clicks on higher ranked pages mean results at the top are likely more relevant, and searchers are happier. (He didn’t say so, but they may look at other data, such as click and back behavior.)
  5. An independent analyst compiles the results and provides a statistical analysis, which is presented at a search quality meeting, where engineers look at the data and debate the change. If they decide this tweak improves the quality of search results overall (and is good for the web and doesn’t overly tax internal systems), the change goes out.
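The blind side-by-side rating in step 2 can be sketched roughly as follows. The function names and the "left"/"right"/"same" verdict labels are assumptions made for this example; Singhal only said that raters do not know which result set is which and that they report the percentage of queries that got better or worse.

```python
import random
from collections import Counter


def blind_pair(before, after, rng):
    """Randomly place the two result lists so the rater cannot tell them apart.

    Returns (left, right, after_is_left).
    """
    after_is_left = rng.random() < 0.5
    if after_is_left:
        return after, before, True
    return before, after, False


def unblind(preference, after_is_left):
    """Translate a 'left'/'right'/'same' verdict into better/worse/same."""
    if preference == "same":
        return "same"
    picked_after = (preference == "left") == after_is_left
    return "better" if picked_after else "worse"


def summarize(verdicts):
    """Percentage of rated queries that got better, worse, or stayed the same."""
    if not verdicts:
        return {"better": 0.0, "worse": 0.0, "same": 0.0}
    counts = Counter(verdicts)
    return {
        label: 100.0 * counts[label] / len(verdicts)
        for label in ("better", "worse", "same")
    }


rng = random.Random()
left, right, flag = blind_pair(["old result"], ["new result"], rng)
print(summarize([unblind("left", flag), unblind("same", flag)]))
```

The queries that land in the "worse" bucket are the ones that feed the loop described in step 3.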

This process is happening all of the time with lots of different proposed tweaks and tests. 525 algorithm changes were launched in 2011. That may seem like a lot, but earlier this year Singhal noted that many more changes were tested.

“Concurrently we have approximately 100 ideas floating around that people are testing – we test thousands in a year. Last year we ran around 20,000 experiments. Clearly they don’t all make it out there but we run the process very scientifically.”
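The live test in step 4 above boils down to showing the change to a small slice of searchers and comparing where each group clicks. The sketch below is an illustration under stated assumptions: hashing a searcher ID into buckets is a common way to pick a stable slice, but it is not something Singhal described (he mentioned loading a data center with the new algorithm), and the function names and the "lower mean click position is better" rule are simplifications invented here.

```python
import hashlib
from statistics import mean

EXPERIMENT_PERCENT = 1  # "typically 1%" of searchers, per the talk


def in_experiment(searcher_id: str, experiment: str) -> bool:
    """Deterministically place roughly 1% of searchers in the experiment.

    Hash-based bucketing is an assumption for this sketch.
    """
    digest = hashlib.sha256(f"{experiment}:{searcher_id}".encode()).hexdigest()
    return int(digest, 16) % 100 < EXPERIMENT_PERCENT


def change_looks_better(control_clicks, experiment_clicks) -> bool:
    """Compare mean clicked rank (1 = top result); lower suggests more relevance."""
    return mean(experiment_clicks) < mean(control_clicks)


# Hypothetical clicked positions collected from each group.
control = [3, 2, 4, 1, 5]
experiment = [1, 2, 1, 3, 1]
print(change_looks_better(control, experiment))
```

A real analysis would also need a significance test before anyone concluded the change helped, which is presumably part of the statistical analysis the independent analyst provides in step 5.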

Aggregated data from millions of searchers typing millions of queries provides clear patterns. Singhal said that not only do those who get better results more quickly click higher in the search results, but they also search more. (We’ve heard this before from Google. Marissa Mayer, for instance, has noted that a half-second delay in rendering search results resulted in 20% fewer searches.)

Singhal noted that the kind of personalization platform envisioned with Search Plus Your World is harder to test. Human evaluation looks at relevance, but personal relevance is unique for each searcher. All Google really has to go on is click behavior. Singhal talked with Danny Sullivan about this dilemma a few weeks after Search Plus Your World launched:

“Every time a real user is getting those results, they really are delighted. Given how personal this product is, you can only judge it based on personal experiences or by aggregate numbers you can observe through click-through.”

All of this gets complicated by varied screen size. The user interface becomes more important as increased use of mobile devices and tablets shrinks screen real estate.

If these changes are all about increased relevance, why is only Google+ represented in Search Plus Your World? Why not Facebook and Twitter? Singhal explained that most personally useful Facebook data is locked behind a login, and Twitter produces content at a rate that is too massive for Google to crawl quickly and comprehensively. Or, they could, but it would probably take down the Twitter servers. Twitter has also had some technical issues that have made crawling difficult, although these are being fixed.

What About Panda and Penguin?

Singhal said that Google’s algorithms aren’t perfect (hence the 20,000 experiments a year). He looks at bad queries every day (and encouraged the audience to let him know about them! So, add them in the comments on this post and we’ll forward them along). But when asked specifically about Panda and Penguin, two of the latest high profile algorithm changes, he said that data has shown they significantly improved the number of high quality sites being returned in results. They are not only refining what signals they use in ranking, but are improving how they gather and tune the signals themselves (so signal quality is higher). They are constantly looking for aberrations in signals.

At the end of the day, he said, site owners need to take a hard look at what value their sites are providing. What is the additional value the visitor gets from that site beyond just a skeleton answer? Ultimately, it’s those sites that provide that something extra that Google wants to showcase on the first page of search results.

Opinions expressed in the article are those of the guest author and not necessarily those of Search Engine Land.

About The Author: Vanessa Fox is a Contributing Editor at Search Engine Land. She built Google Webmaster Central and went on to found software and consulting company Nine By Blue and create Blueprint Search Analytics, which she later sold. Her book, Marketing in the Age of Google (updated edition, May 2012), provides a foundation for incorporating search strategy into organizations of all levels. Follow her on Twitter at @vanessafox.

  • davidquaid

    Google is doing great things with Panda and Penguin – and the fact that so many bad SEOs are complaining means it’s working! But they haven’t gone far enough. There are so many spammers out there that Google needs to further tighten its anti-spam algorithms.

  • Dani Alcalá

    OK, here’s one terrible result in google.es when searching for: poemas cortos (which means short poems in Spanish). The first 2 results are spam. Poemascortos.org has only ads above the fold, thin content and keyword stuffing. Poemascortos.info is much the same, and largely a copy of poemascortos.org. I assume Google thinks that people who search for “poemas cortos” are looking for these sites because of the domain name, but this is not true: people are looking for poems, not for these sites. As you can guess, this is a very popular search in Spanish, not just a marginal one, so it is a very disappointing SERP.

  • http://twitter.com/johnjmcdonald John McDonald

    That sums up the main problem I’m seeing as well:  a whole lot of thin sites with keyword matching domains are suddenly dominating the search results.  This has always been a bit of a flaw in Google’s algorithm, but Penguin has destroyed a lot of more developed domains and brought these scrap pages to the top.

  • http://www.fashionox.com/ fashionox

    Great post again Vanessa. I have read a few of your posts now and I must say you always deliver… great references… What I have seen most with this new Penguin update is that the title tag plays a major part… My site was ranking for a few competitive keywords… womens fashion online… My meta tag was something like fashion online, fashion dresses by fashionox, and we ranked 12 in Google, so I thought I could do better if I put womens in my title tag. We dropped several pages, to something like 80 in the SERP :( I have changed the title tag back but have yet to regain ranking.

  • http://www.gg2.net/ Garavi Gujarat

    Good post, Vanessa. Google is doing the right thing with Panda and Penguin. It will punish the SEO companies that use spam to get their websites ranked #1 in Google.

  • newyorker_1

    I suspect more and more that relevancy and quality are 2 completely different terms for Google. We never got a definition of a high quality site. What is it? To me, a high quality site must be highly relevant to the search, and in a minute I can give you 10 search terms where highly irrelevant sites are in the top 10. If they are in the top 10 then they must be high quality. Definition please?

  • http://www.ninebyblue.com Vanessa Fox

    Relevant and high quality are in fact two different things to Google. Quality is evaluated as an independent factor and relevance is specific to the query. Google provided a checklist of how it evaluates quality here:
    http://googlewebmastercentral.blogspot.fr/2011/05/more-guidance-on-building-high-quality.html

    Part of the point of my article was that lots of queries do in fact have irrelevant results; that’s part of why the algorithms are constantly changing — in order to keep improving those cases. As I mentioned, you can list queries that return bad results in the comments. Amit would like to add them to his list.

  • Durant Imboden

    I wonder how personalization influences Google’s perception of “search quality” for a given user or in a given type of situation?

    Let’s say that Joe User is looking for advice on Venice, Italy airport transportation. At one end of the spectrum, he’ll find a site with a collection of articles about airport boats, buses, etc. with plenty of nitty-gritty information (right down to “how to get from the arrivals terminal to the boat pier” with photos and a satellite map). Or, if he doesn’t like to read and just wants a Cliff’s Notes version, he can find a page on another site that merely lists the transportation options in a series of bullet points. 

    In other words, the question of which site is higher in quality will depend largely on Joe User’s tastes. It could also be influenced by where Joe is when he’s performing the query, and on what type of device he’s using. (If he’s researching a trip at home with a laptop, he may be inclined to read everything he can find; if he’s standing in the airport arrivals hall with a low-resolution phone, he may prefer a stripped-down page on a mobile site.)

    Does Google personalization take such factors into account? Not just for Joe User specifically, but also in a more general way (e.g., “people who clearly enjoy in-depth information will prefer this type of site,” or “people with phones may prefer this type of page, while people with desktop computers may prefer this type of page”)?

  • http://twitter.com/AgentsOfValue AgentsOfValue

    Five years ago, I was having a hard time researching content. When I typed a keyword in the search bar, almost none of the results were worth reading. But with the inception of Panda and now Penguin, I notice I’m saving a lot of time finding relevant, spam-free content.

    It doesn’t matter if these algorithm updates are imperfect. Google will not stop looking for improvements. Whatever others are saying, it can’t be denied that Panda and Penguin make internet marketing a fair playground for everyone.

  • ChristopherSkyi

    This hits the nail on the head:

    “At the end of the day, he said, site owners need to take a hard look at what value their sites are providing. What is the additional value the visitor gets from that site beyond just a skeleton answer? Ultimately, it’s those sites that provide that something extra that Google wants to showcase on the first page of search results.”

    Now, can Google really detect ‘that something extra?’ Probably not, but website owners will do well to work over the horizon where Google (& Bing) are intending to go and resist current real, but short sighted, opportunities to still game the search engines. Moreover, real reform will come to the SEO industry when Google (and the other SE’s) stop rewarding bad behavior.  Panda and Penguin are the first serious steps I’ve seen by Google to tackle spam in their index head on.

  • http://twitter.com/SingFreePress Singapore Free Press

    Yeah! I have also encountered this with my client’s site (http://jrcredit.com.sg/), which I am doing SEO for right now. This site is about loans and it was really badly affected by the updates. Kindly check the site to see if it is good. Thanks

  • http://twitter.com/Ydraw Yinc

    I think the last sentence pretty much sums it up and gives us something to go after: “Ultimately, it’s those sites that provide that something extra that Google wants to showcase on the first page of search results.” This week I will try and give my site a little extra http://www.ydraw.com

  • CADRAN LLC

    here is a good one about a site with zero content, a similar name to the official site and just a forward to that site
    search: spybubble
    second result: spybubble.net

  • http://deuci.com/ Jeffrey Lin

    As Wil Reynolds’ presentation at the SEOmoz Meetup drove home, it’s more and more about “real world *stuff*” rather than programming your way to the top. Sure, consumers are more tech savvy, but in large part because Apple makes OSes and UIs that a baby, a great grandma, and anyone in between can use. Search has to match that “human” aspect…

  • http://profile.yahoo.com/PFDODI2UWICAF7A6JNQIMICYXI Rob Lance

    Want to see a bad query? Type bad credit loan. One of the top 10 results is a questionpro survey page that immediately redirects to more spam. This page also ranks for bad credit personal loan, personal loans for bad credit, and many more. This is just one example of horrible, dangerous results for consumers.

  • Durant Imboden

    “What is the additional value the visitor gets from that site beyond just a skeleton answer?”

    I wonder if having too much information on a topic could be a negative from a Panda/Penguin point of view. Lately, we’ve been seeing a greater increase in Google referrals on our weakest topics than on our strongest, and some of our rankings for topics where we’re clearly “best of breed” are weaker than they were before Panda was rolled out in February and April of 2011.

    There may be a fine line between “comprehensive resource” and “content farm,” and if there is, Google may be having trouble figuring out where that line lies.
