How Google Instant’s Autocomplete Suggestions Work

It’s a well known feature of Google. Start typing in a search, and Google offers suggestions before you’ve even finished typing. But how does Google come up with those suggestions? When does Google remove some suggestions? When does Google decide not to interfere? Come along for some answers.

Google & Search Suggestions

Google was not the first search engine to offer search suggestions, nor it is the only one. But being the most popular search engine has caused many to look at Google’s suggestions more closely.

Google has been offering “Google Suggest” or “Autocomplete” on the Google web site since 2008 (and as an experimental feature back since 2004). So suggestions — or “predictions” as Google calls them — aren’t new.

What Google suggests for searches gained new attention after Google Instant Search was launched last year. Google Instant is a feature that automatically loads results and changes those results. That interactivity caused many to take a second look at suggestions, including an attempt to list all blocked suggestions.

Suggestions Based On Real Searches

The suggestions that Google offers all come from how people actually search. For example, type in the word “coupons,” and Google suggests:

  • coupons for walmart
  • coupons online
  • coupons for target
  • coupons for knotts scary farm

These are all real searches that have been done by other people. Popularity is a factor in what Google shows. If lots of people who start typing in “coupons” then go on to type “coupons for walmart,” that can help make “coupons for walmart” appear as a suggestion.

Google says other factors are also used to determine what to show beyond popularity. However, anything that’s suggested comes from real search activity by Google users, the company says.

Suggestions Can Vary By Region & Language

Not everyone sees the same suggestions. For example, above in the list is “coupons for knotts scary farm.” I see that, because I live near the Knott’s Berry Farm amusement park in Orange County, California, which holds a popular “Knott’s Scary Farm” event each year.

If I manually change my location to tell Google that I’m in Des Moines, Iowa, that particular suggestion goes away and is replaced by “coupons for best buy.”

Similarly, if I go to Google UK, I get suggestions like:

  • coupons uk
  • coupons and vouchers
  • coupons for tesco

Tesco is a major UK supermarket chain, just one reflection of how localized those suggestions are.

This is also why something like the Google Instant Alphabet or The United States of Autocomplete (shown below) — while clever — aren’t accurate and never can be, unless you’re talking about the suggestions shown in a particular region.

In short, location is important. The country you’re in, the state or province, even the city, all can produce different suggestions.

Language also has an impact. Different suggestions will appear if you’ve told Google that you prefer to search in a particular language, or based on the language Google assumes you use, as determined by your browser’s settings.

Previously Searched Suggestions

Google’s suggestions may also contain things you’ve searched for before, if you make use of Google’s web history feature.

For example, when I search for “rollerblade,” my suggestions look like this:

  • rollerblade parts
  • Rollerblade 2009 Speedmachine 110
  • rollerblades
  • rollerblade wheels
  • rollerblade

The first two come from my search history. That’s why they have the little “Remove” option next to them.

Personalized suggestion like these have been offered since May 2009. The only change with Google Instant was that they were made to look different, shown in purple similar to how links look at some web sites, to indicate if you’ve clicked on them before.

How Suggestions Are Ranked

How are the suggestions shown ranked? Are the more popular searches listed above others? No.

Popularity is a factor, but some less popular searches might be shown above more popular ones, if Google deems them more relevant, the company says. Personalized searches will always come before others.

Deduplicating & Spelling Corrections

There a small degree of deduplicating and spelling correction that happens in the final suggestions that show, Google says.

For example, if some people are typing in “LadyGaga” as a single word, all those searches still influence “Lady Gaga” being suggested — and suggested as two words.

Similarly, words that should have punctuation can get consolidated. Type “ben and je…” and it will be “ben and jerry’s” that gets suggested, even if many people leave off the apostrophe.

Freshness Matters

Google Autocomplete also has what the company calls a “freshness layer.” If there are terms that suddenly spike in popularity in the short term, these can appear as suggestions, even if they haven’t gained long-term popularity.

A good example of this was when actress Anna Paquin was married. “Anna Paquin wedding” started appearing as a suggestion just before her big day, Google says. That was useful to suggest, because many people were starting to search for that.

If Google had relied solely on long-term data, then the suggestion wouldn’t have made it. And today, it no longer appears, as it didn’t maintain long-term popularity (though “anna paquin married” has stuck).

How short-term is short-term? Google won’t get into specifics. But suggestions have been spotted appearing within hours after some search trend has taken off.

Why & How Suggestions Get Removed

As I said earlier, Google’s predictions have been offered for years, but when they were coupled with Google Instant, that sparked a renewed interest in what was suggested and what wasn’t. Were things being removed?

Yes, and for these specific reasons, Google says:

  • Hate or violence related suggestions
  • Personally identifiable information in suggestions
  • Porn & adult-content related suggestions
  • Legally mandated removals
  • Piracy-related suggestions

Automated filters may be used to block any suggestion that’s against Google’s policies and guidelines from appearing, the company says. For example, the filters work to keep things that seem like phone numbers and social security numbers from showing up.

Since the filters aren’t perfect, some suggestions may get kicked over for a human review, Google says.

Hate Speech & Protected Groups

In terms of blocking hate and violence suggestions, it’s not that everything possibly hateful gets blocked as a suggestion.

For example, “i hate my mom” and “i hate my dad” are both suggestions that come up if you type in “i hate my.” Similarly, “hate gl” brings up both “hate glee” and “hate glenn beck.”

Instead, hate suggestions are removed if they are against a “protected” group. So what’s a protected group?

Google doesn’t actually define this on its Autocomplete help page. However, a Google AdWords help page has a rundown on what Google’s long-considered to be protected groups:

  • race or ethnic origin
  • color
  • national origin
  • religion
  • disability
  • sex
  • age
  • veteran status
  • sexual orientation or gender identity

Even “majority” groups such as whites get covered by this, under the “color” category. That seems to be why “i hate white” doesn’t prompt a suggestion for “i hate whites,” just as “i hate black” doesn’t suggest “i hate blacks.”

However, in both cases, other hate references do get through (“i hate white girls” and “i hate black girls” both appear). This is where a human review may happen, if the reference is noticed.

Legal Cases & Removals

Google blocks some suggestions for legal reasons. For example, last year, Google lost two cases in France involving Google Autocomplete.

In the first, Google was ordered to remove the word “arnaque” — which means scam — from coming up as a suggestion for when someone typed in the name of a distance learning company.

Google appears to have done this, when I checked today. Google would not say if it is appealing the case or whether this applies to preventing the word “arnaque” from appearing next to any company’s name.

From some limited testing, I think Google is preventing from “arnaque” from appearing after any company name but not before (“arnaque paypay” and “arnaque groupon” are suggestions).

In the second French autocomplete case, a plantiff — whose conviction was on appeal — sued and won a symbolic 1 euro payment in damages over having the words “rapist” and “satanist” appearing next to his name.

The plantiff’s name wasn’t given in the case, so I can’t check that the terms were removed as ordered. Last year, Google said it would appeal the ruling. The company gave me no update on things when I asked for this article. It doesn’t seem likely that this has caused Google to drop having such terms appear next to the names of other people.

Yesterday, news broke about Google losing a case in Italy involving suggestions. Here, a man sued over having the Italian words for conman and fraud appearing next to his name.

I can’t check if Google has complied with the ruling, because the man’s name was never given — nor does his lawyer make clear if Google has complied. It’s also unclear if this ruling is causing such terms to be dropped in relation to anyone’s name (this seems unlikely).

I asked Google about this but was only given a standard statement:

We are disappointed with the decision from the Court of Milan.  We believe that Google should not be held liable for terms that appear in Autocomplete as these are predicted by computer algorithms based on searches from previous users, not by Google itself.  We are currently reviewing our options.

In the US, Google won a case last month waged by a woman unhappy that the words “levitra” and “cialis” appeared near her name. That case largely involved arguments about commercial infringement, rather than taking a libel stance.

Postscript: Sean Carlos of Antezeta has more on the Milan case here.

Controversial Cases

Aside from legal cases, Google’s suggestions have occasionally become news controversies. Typically, Google responds to these with a standard answer, which goes like this: the predictions are based on how people search, not by any particular “agenda” that the company is trying to push.

Google tells me it doesn’t typically comment more in these cases, because it doesn’t want to be in the position of having to issue a detailed response for any oddity that someone spots. Still, Google did open up about two examples of strange suggestions that have come up in the past.

One involved the suggestion “climategate,” which oddly disappeared shortly after appearing. My Climategate: Just How Popular Is It, According To Google? story from December 2009 has more about this.

Blame that aforementioned freshness layer, says Google. Back when this all happened, the freshness layer had a gap that allowed spiking queries to appear for a short period of time, then disappear unless they gained more long term popularity.

That gap has since been reduced. Spiking queries stay around longer, then drop unless they gain long-term traction. The “climategate” suggestion didn’t catch on and so disappeared. It wasn’t manually removed, as some assumed, Google said.

Interestingly, looking today, “climategate” still hasn’t gained enough long term popularity to come up as a suggestion at Google. But over at Bing — which, of course, uses its own unique suggestion system — it is offered.

In another case, a search for “islam is” was producing no suggestions while searches for other religions were — including negative ones. Our Islam Is … Blocked By ‘Bug’ In Google Suggest story from January 2010 has more about this.

As it turned out, there was a human error involved, Google told me.

Those suggestions had been escalated for human review as possibly being hate-related. A block was placed, because someone assumed that Islam as a religion met the protected group criteria.

But in fact, Google Autocomplete does not consider religions to be protected groups (I’ll get back to this). So other religions didn’t have a filter established for them.

Today, “islam is” brings back some negative suggestions, just as is the case with other religions.

Nationalities Briefly Protected; Religions Not

Feeling confused about who get protected, at this point? So am I.

Remember when I listed what a protected group was, according to Google, above? That included religions, but that’s the definition that Google AdWords uses, not Google Autocomplete.

Similarly, Google’s YouTube has its own definition of protected groups:

Protected groups include race or ethnic origin, religion, disability, gender, age, veteran status, and sexual orientation/gender identity.

National origin isn’t on that list. Indeed, it wasn’t on the unpublished list that Google Autocomplete uses until last May, when Google began to filter suggestions related to nationality. Search for “americans are,” for example, and you got nothing.

To me, it’s kind of crazy. Why protect nationalities but not religions? And why wouldn’t suggestions like “jews are cheap” or “jews are racist” be considered against a protected group, in terms of a race or ethnic group?

Google gave me this statement on the topic (the brackets aren’t me removing words but instead how Google indicates a search term):

Simply put, nationalities refer to individuals, religions do not. Our hate policy is designed to remove content aimed at specific groups of individuals. So [islamics are] and [jews are] or [whites are] would possibly be filtered, while queries such as [islam is] and [judaism is] would not because the suggestions are directed at other entities, not people.

Sorry, I’m not convinced by this. Worse, when I did some double-checking today, the previously established nationality filter — which the statement defends — appears to be turned off. Yes, Americans are again fat, lazy and ignorant, as Google’s “predictions” suggest, and the French are lazy cowards.

Can You Request Removals?

As you can imagine, some people would like to have negative suggestions removed. However, as explained, Google only does this in very specific instances. The company doesn’t even have a form to request this (though there is a help page on the topic, that suggests leaving comments in Google’s support forums).

Should businesses be allowed to request removal of suggestions? It’s not something that Google wants to arbitrate. Jonathan Effrat, a product manager at Google who works on Google Instant, told me:

Unfortunately, we won’t do a removal in those situations. A lot of times people are searching for it, and there’s a legitimate reason. I had a friend who used to work for a company, and the company name plus “sucks” was a suggestion, and that was the reality. It’s not really our place to say you shouldn’t be searching for that.

There are signs that Google has been pulling back by suggesting “scam” along with company names, but despite these reports, you can still find examples where this still happens. Google hasn’t commented if it’s actually made any change like this, by the way.

What About Piracy?

Of course, Google recently did decide that people shouldn’t be searching for things, in the case of online piracy, when it began blocking terms it deemed to be piracy-related in January.

That took out — and continues to take out — suggestions for some sites that may also be used for legitimate reasons. To be clear, suggestions were removed, not the sites themselves.

Want to read the Wikileaks files directly? BitTorrent or uTorrent have software that will allow you to do this. But today, Google won’t auto-suggest their names as you begin to type, deeming them too piracy related.

Aside from taking out some potentially innocent parties, the whole thing feels kind of hypocritical. Why does Google feel it needs to go over-and-above to protect searchers piracy-related suggestions when there are a range of other potential harmful ones out there?

The answer, in my view, is that this is a PR battle Google wants to win as studios and networks accuse it of supporting piracy and seek to enlist the aid of the US Congress. Dropping piracy suggestions is an easy gift, especially when Google’s not proactively removing the real issue, sites that host pirated content in its own results. It’s also a gift that might help it get network blocking of Google TV lifted.

And Fake Queries?

Meanwhile, another issue has gained fresh attention — the ability for people to “manufacture” suggestions. In particular, Amazon’s Mechanical Turk is a well-known venue where people can request that others do searches. When enough searches happen, then suggestions start appearing.

Brent Payne is probably one of the most notable examples of someone deliberately doing this “above the radar,” so to speak. He ran a series of experiments where he hired people on Mechanical Turk to do searches, which (until Google removed them) caused suggestions to appear:

Tempted to try it? Aside from potentially violating Mechanical Turk’s terms, Google says doing so is something it deems spam and will take corrective action against, if spotted.

What action? So far, that seems to be limited to removing the manufactured suggestions.

Postscript: Payne’s study has apparently gone away, but another study done in March 2012 showed that using Mechanical Turk can still have an impact.

A Suggestion For Google’s Suggestions

As I said, Google Instant prompted renewed attention about Google’s suggestions — along with debate about whether Google should be offering suggestions at all, given the reputation nightmare they can bring to some companies and individuals, as well as offense they bring to other groups. On the flipside, there’s the usefulness of them.

Here’s a case that illustrates the balancing act. Last year, a skydiving company contacted me, concerned that searches for its name brought up a suggestion of its name plus the words “death” or “accident.” Yes, the company had someone who died in a jump.

That’s something harmful to the company, even if true. Skydiving is by its nature an extremely dangerous sport, and the suggestion gives no guidance about whether the company was somehow at fault. It just immediately suggests there’s something wrong with the company.

However, it’s also incredibly useful for searchers, as a way for them to refine their queries in ways they might not expect.

Still, I think the balancing act should tip back toward not offering up anything negative about any person, company or group. No nonsense about “protected groups.” Just kill the negative suggestions, period.

This is a suggestion for all the major search engines, by the way. Enough singling out Google, when these types of examples can be found easily on Bing and Yahoo, also.

If there are negative things that people want to discover about a person, company or group, those will come out in the search results themselves, and mixed in with more context overall — good, bad or perhaps indifferent.

Yes, many Americans know they’re stereotypically seen as fat. Other nationalities and religious groups also know that there are many hurtful stereotypes about them. But who wants Google seeming to tell them that?

Yes, Google’s correct in saying that the suggestions it shows reflect what many people are searching for — and thus think.

Still, parroting harmful thoughts “searched” by others doesn’t make those things any less hurtful or harmful. And by repeating these things, there’s an argument that search engines simply makes the situation worse.

Related Topics: Channel: SEO | Features: General | Google: Instant | Google: Suggest | Legal: Censorship | Legal: Privacy | Search & Society: General | Top News

Sponsored


About The Author: is a Founding Editor of Search Engine Land. He’s a widely cited authority on search engines and search marketing issues who has covered the space since 1996. Danny also serves as Chief Content Officer for Third Door Media, which publishes Search Engine Land and produces the SMX: Search Marketing Expo conference series. He has a personal blog called Daggle (and keeps his disclosures page there). He can be found on Facebook, Google + and microblogs on Twitter as @dannysullivan.

Connect with the author via: Email | Twitter | Google+ | LinkedIn



SearchCap:

Get all the top search stories emailed daily!  

Share

Other ways to share:
 

Read before commenting! We welcome constructive comments and allow any that meet our common sense criteria. This means being respectful and polite to others. It means providing helpful information that contributes to a story or discussion. It means leaving links only that substantially add further to a discussion. Comments using foul language, being disrespectful to others or otherwise violating what we believe are common sense standards of discussion will be deleted. Comments may also be removed if they are posted from anonymous accounts. You can read more about our comments policy here.
  • http://www.redmudmedia.com Ralph

    Points well made in the conclusion. There is no that this will be gamed to death. Not that I intend to do so myself, but one of my clients is seeing a significant amount of traffic coming from users entering part of their brand name which suggests (and it is the case) that instant search is influencing this.

    So your Brent Payne example suggests to me that if enough people search for your primary search terms + the brand name then this will start to become a suggestion. Doing a search for “cheap flights” or “cheap hotels” doesn’t suggest any brand names to me yet, but I don’t think it will be long because as with links, it will be very difficult for Google to differentiate between genuine and faked searches… surely? It’s no secret that the travel and finance industries are riddled with naughty link buying and Google seems helpless to do anything about it.

    I guess Google could factor in click through rate as well so that it’s just a little harder (not impossible) to game the system.

    I’d hazard a guess that the suggestions also lend themselves to trending topics in Twitter and other social media outlets. Especially the fresh stuff that hasn’t had time to get indexed and ranked.

  • http://www.redmudmedia.com Ralph

    I meant to say there is no doubt that this will be gamed to death.

  • http://mainspring.tv/ M.V.

    Wow thanks for your research on this. I had no idea how much went into the autosearch function. I just thought it went by popularity. Very interesting!

  • http://www.rob-millard.com Rob Millard

    Nice post!

    Made a quick tool here a while back to export suggestions to a CSV:
    http://www.rob-millard.com/keyword-expander/

    You can run up to 100 keywords at once, which should give you ~1000 back :)

  • http://mohanarun.com Mohan Arun

    Regardin’ the ‘Personalized Autocomplete search suggestions’: I want to view this as a big picture’ There are two parts. First is the collection. Second is the delivery. Collection means all the methods used to collect what you already liked, (past), what you like now (present search) and predictably what you may like (future). It involves collection of data I like from across mutiple Google properties… – if I “+1′d” a website then it means I like the keywords used in the website, if I commented on a blog post using the ‘Google account’ option, then I am likely to be interested in the keywords used in the post, if I stored a bookmark in Google bookmarks then I am likely to be interested in the keywords used in the page title of the bookmark, if I shared a page in Google buzz, then I chose to talk about a web page, amplify it socially, so I probably like the keywords used there… So these are all ‘keywords I am probably interested in’. Now comes the ‘delivery’ part… How do I make use of the collected keywords? Displayin’ them in search suggestions (autocomplete) is just one way. What if you could use the same data for targetin adwords ads that are displayed when you browse the web?

  • http://rosmarin-search-marketing.com Myron Rosmarin

    Another excellent post Danny – as always. One thing that I hoped you would cover is the topic of if and how Google Suggest has altered searcher behavior. I don’t have many examples of this but here’s an interesting comparison of searches of the words horoscope and horoscopes as tracked by Google Insights.
    http://www.google.com/insights/search/#q=horoscope%2Choroscopes&geo=US&cmpt=q
    Notice how the two track together until 2008 when the singular version suddenly increases in popularity and the plural version falls. I have contended that this change in behavior was driven by Google Suggest showing the singular version first and users simply selecting that when previously they might have opted for the plural. There must be many more examples of this. Clearly there is a great deal to be gained if Google and/or 3rd parties can influence which keywords get recommended first.

  • http://www.ydeveloper.com/e-smart-ecommerce-suite.html eCommerce

    I can see nationalities in google suggestion, it’s not protected.

  • Melissa Serrano

    What I would like to know is how to remove that from any browser. It’s annoying to open your browser and click on the search engine and have all those suggestions pop open, It’s an invasion of privacy and or lack of privacy.  It’s not like before when you had your computer out in the open and were searching for something with no worries of what was going to pop open under your empty search box. Now every time I search I have to make sure I’m alone so no one can see what i’m searching. Not because I’m searching inappropriate information, it’s because it’s no ones business what I’m searching. They should have at least put an option to remove it or shut it down. I don’t find it helpful at all. I find it just intrusive and lack of privacy.

  • zahra hatami

    hi..please say me…who disable auto complate in sarech google?

Get Our News, Everywhere!

Daily Email:

Follow Search Engine Land on Twitter @sengineland Like Search Engine Land on Facebook Follow Search Engine Land on Google+ Get the Search Engine Land Feed Connect with Search Engine Land on LinkedIn Check out our Tumblr! See us on Pinterest

 
 

Click to watch SMX conference video

Join us at one of our SMX or MarTech events:

United States

Europe

Australia & China

Learn more about: SMX | MarTech


Free Daily Search News Recap!

SearchCap is a once-per-day newsletter update - sign up below and get the news delivered to you!

 


 

Search Engine Land Periodic Table of SEO Success Factors

Get Your Copy
Read The Full SEO Guide