Is Google’s Synonym Matching Increasing? How Searchers & Brands Can Be Both Helped & Hurt By Evolving Understanding Of Intent
In the beginning, Google matched the words in a searcher’s query to the words on a web page and ranked those pages (roughly) based on how many external links each had. Over the years, Google’s algorithms have evolved in numerous ways, including in how Google figures out what the searcher is really looking for. Now, for instance, when you search for [U2], not only does Search Engine Land Executive News Editor Matt McGee’s site appear at the top of the results (go Matt!), but so do images and video from the band even though the words “images” or “videos” weren’t in the query.
Google uses all kinds of signals to determine that you want the Beautiful Day video when you type in the letters “U2”, but certainly one strong signal is what previous searchers have wanted. Google has data on all of the millions of times searchers have looked for U2 in the past and knows how many of them paired that search with (or later refined it with) “video” or “beautiful day”. It also knows how often U2 searchers clicked on videos shown in the search results. If Google started showing video, but searchers never clicked on those results, it might stop showing them; if every searcher clicked the video over a web page, it might start showing more video in results.
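To make that feedback loop concrete, here is a minimal sketch in Python. It is purely illustrative: the function name, the click-through thresholds, and the idea of a fixed number of "video slots" are all my assumptions, not anything Google has described.

```python
def adjust_video_slots(impressions, clicks, current_slots,
                       low_ctr=0.02, high_ctr=0.30, max_slots=5):
    """Toy feedback rule: show fewer videos when nobody clicks them, more when many do.

    All thresholds are hypothetical values chosen for illustration.
    """
    if impressions == 0:
        # No data yet, so leave the layout alone.
        return current_slots
    ctr = clicks / impressions
    if ctr < low_ctr:
        return max(current_slots - 1, 0)
    if ctr > high_ctr:
        return min(current_slots + 1, max_slots)
    return current_slots


# Searchers almost never click the videos: one slot is dropped.
print(adjust_video_slots(impressions=1000, clicks=5, current_slots=2))    # 1
# Searchers click the videos heavily: one slot is added.
print(adjust_video_slots(impressions=1000, clicks=400, current_slots=2))  # 3
```

A real system would weigh far more than raw click-through rate, but the shape of the loop (show, measure, adjust) is the point here.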
Google has many patents around this idea. Query Revision Using Known Highly Ranked Queries describes “a system and method use session-based user data to more correctly capture a user’s potential information need based on analysis of strings of queries other users have formed in the past. To accomplish this, revised queries are provided based on data collected from many individual user sessions. For example, such data may include click data, explicit user data, or hover data.”
In other words, Google can look at what previous searchers have typed, clicked on, and hovered over to determine what a particular searcher might want and incorporate those signals into what pages are ranked.
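As a rough sketch of what session-based query revision could look like, the Python below counts which follow-up queries earned a click after an earlier query in the same session. The session log, the function names, and the "followed by a clicked query" rule are hypothetical simplifications of mine; the patent describes the general idea, not this code.

```python
from collections import Counter, defaultdict

# Hypothetical session log: each session is an ordered list of
# (query, searcher_clicked_a_result) pairs.
SESSIONS = [
    [("u2", False), ("u2 video", True)],
    [("u2", False), ("u2 beautiful day", True)],
    [("u2", False), ("u2 video", True)],
    [("u2", True)],
]


def build_revision_counts(sessions):
    """Count how often a query was followed by a different query that earned a click."""
    revisions = defaultdict(Counter)
    for session in sessions:
        # Walk consecutive pairs of queries within one session.
        for (query, _), (next_query, next_clicked) in zip(session, session[1:]):
            if next_clicked and next_query != query:
                revisions[query][next_query] += 1
    return revisions


def suggest_revisions(revisions, query, top_n=2):
    """Return the most common successful follow-up queries, most frequent first."""
    return [revised for revised, _ in revisions.get(query, Counter()).most_common(top_n)]


revisions = build_revision_counts(SESSIONS)
print(suggest_revisions(revisions, "u2"))  # ['u2 video', 'u2 beautiful day']
```

With enough sessions, the most frequent successful revisions start to look like a description of what people searching for [U2] actually want.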
Google’s spelling correction is another application of the same idea. Google can load dictionaries of how words should be spelled and commonly misspelled variations of those words, and it can look at how searchers correct searches and when they click on different variations. It can use this data not only to suggest a query with a different spelling but also to treat the misspelling as a synonym behind the scenes and rank the correctly spelled matches.
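Here is a small Python sketch of that idea: learn corrections from how searchers refine their own queries, then quietly expand a misspelled query term with its likely correction. The refinement log, the thresholds, and the function names are assumptions for illustration only.

```python
from collections import Counter

# Hypothetical log of (term as typed, term the searcher changed it to).
REFINEMENTS = (
    [("britny", "britney")] * 8
    + [("britny", "brittany")] * 2
    + [("stationary", "stationery")] * 1
)


def learn_corrections(refinements, min_count=5, min_share=0.6):
    """Keep a correction only when enough searchers agree on it.

    min_count and min_share are made-up thresholds for this sketch.
    """
    by_typed = {}
    for typed, fixed in refinements:
        by_typed.setdefault(typed, Counter())[fixed] += 1

    corrections = {}
    for typed, counts in by_typed.items():
        fixed, count = counts.most_common(1)[0]
        total = sum(counts.values())
        if count >= min_count and count / total >= min_share:
            corrections[typed] = fixed
    return corrections


def expand_query(query, corrections):
    """Return, for each query word, the list of terms allowed to match."""
    return [
        [word, corrections[word]] if word in corrections else [word]
        for word in query.lower().split()
    ]


corrections = learn_corrections(REFINEMENTS)
print(expand_query("Britny Spears", corrections))
# [['britny', 'britney'], ['spears']]
```

Note that the single "stationary" refinement is ignored here because it falls below the minimum count. Where that bar sits is a judgment call for any system like this.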
Back in 2009, I wrote about how this is typically a good thing for both searchers and site owners, but for low-volume, uncommonly spelled queries, the right results may be pushed out. The key query I focused on in that post (for Dr. Robon) has since been adjusted by Google. And that’s part of the point. Google continues refining these algorithms over time. Since all of these synonyms and typos are determined by machines, mismatches that would be obvious to the human eye may seem perfectly matched to the algorithm. As human eyes (searchers) review the matches and skip or refine the ones that are out of place (by either clicking the shown results or not), the machines can learn and adjust.
Google has moved well beyond word matching and onto matching based on intent. The exact words in the query don’t have to be anywhere on a page for it to rank. Google’s Matt Cutts talked about this in a recent interview:
“Keyphrases don’t have to be in their original form. We do a lot of synonyms work so that we can find good pages that don’t happen to use the same words as the user typed.”
You can see this in action with a search for pet adoption. In the search results, both pets and cats are bolded and “Meow Cat Rescue” is ranking as a match. (Interestingly, dogs don’t appear to be treated as a synonym on the organic side, but do seem to be triggering broad match (while cats aren’t) on the paid side.)
You can see this in combination with intent. Stationary and stationery are both valid words but they mean very different things. Only the second word means “paper supplies”. Sadly, many of us type the first word rather than the second when we mean paper rather than standing still. For just the query [stationary], Google doesn’t treat “stationery” as a synonym (although you’ll see that several pages rank that use the “ary” spelling but mean the paper products). However, for the query [stationary supplies], Google does treat “stationery” as a synonym, since few people are likely to be looking for supplies to help with remaining in one place.
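One way to picture a context-dependent synonym is as a rule that only fires when certain other words appear in the query. The Python sketch below hard-codes one such rule; the rule table and its context words are invented for the example, and a real engine would presumably learn them from data rather than list them by hand.

```python
# Hypothetical rule table: a term maps to (context words, synonym) pairs.
# The synonym applies only if at least one context word is also in the query.
CONTEXTUAL_SYNONYMS = {
    "stationary": [({"supplies", "paper", "envelopes"}, "stationery")],
}


def expand_with_context(query):
    """Return, for each query word, the terms allowed to match given the other words."""
    words = query.lower().split()
    expanded = []
    for i, word in enumerate(words):
        alternatives = [word]
        other_words = set(words[:i] + words[i + 1:])
        for context, synonym in CONTEXTUAL_SYNONYMS.get(word, []):
            if other_words & context:
                alternatives.append(synonym)
        expanded.append(alternatives)
    return expanded


print(expand_with_context("stationary"))
# [['stationary']]
print(expand_with_context("stationary supplies"))
# [['stationary', 'stationery'], ['supplies']]
```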
It’s obvious in this case that using synonyms to simply rank pages about “stationery supplies” when someone searches for “stationary supplies” is likely to improve the usefulness of search results. But, even this can go awry. Look at the searches being done around “stationary supplies”. They include:
- office supplies
- paper supplies
In fact, many searches are for combinations of staples and office supplies (“staples stationary”, “staples stationary supplies”…). If someone searching for “stationary supplies” actually wants “stationery supplies” or “paper supplies”, then does this mean that when someone searches for stationary (or stationery) supplies, they actually want staples?
The case of hhgregg
That’s exactly what happened with electronics store h. h. gregg, which reader Artem Russakovskii of Android Police tipped us off to. Presumably, lots of people were searching for h. h. gregg in conjunction with things like laptops, TVs, and printers. But lots more people were searching for laptops, TVs, and printers in conjunction with Best Buy. So when people searched for [hhgregg site], Google ranked hhgregg.com first, but ranked bestbuy.com second.
In fact, the other five results on the page were bestbuy.com as well.
On the surface (to the human eye), this seems like a serious bug, but notice that “best buy” is bolded in the results. This was simply synonym matching based on how we search. In fact, many of us are apparently typing queries with both hhgregg and best buy in them (although it appears most are looking for comparisons between the two).
By the way, Google told us about the above search:
We try to identify synonyms automatically, and there are times we miss and synonymize terms that seem to be closely associated on the web but don’t in fact have the same meaning — in this case it’s an issue with how we handle multi-word terms. We’ve been working on an algorithmic solution to these types of cases, so we appreciate the feedback.
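To see how two competitors could end up looking like synonyms, consider a deliberately naive Python sketch that compares the words searchers pair with each brand. This is my own illustration of the failure mode, using an invented query log, a simple Jaccard overlap, and an arbitrary threshold; it is not Google's method.

```python
# Hypothetical query log.
QUERY_LOG = [
    "hhgregg laptops", "hhgregg tvs", "hhgregg printers",
    "best buy laptops", "best buy tvs", "best buy printers", "best buy phones",
]


def context_words(brand, log):
    """Collect the words that searchers type after a brand name."""
    found = set()
    for query in log:
        if query.startswith(brand + " "):
            found.update(query[len(brand):].split())
    return found


def jaccard(a, b):
    """Share of context words the two sets have in common."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def looks_like_synonyms(x, y, log, threshold=0.7):
    """Naive rule: terms used in very similar contexts are treated as synonyms."""
    return jaccard(context_words(x, log), context_words(y, log)) >= threshold


print(looks_like_synonyms("hhgregg", "best buy", QUERY_LOG))  # True
```

The two brands share three of four context words, so the naive rule merges them even though any human would say they are rivals, not synonyms. That is the kind of miss that searcher behavior (or a forum post) eventually corrects.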
Bing works in a similar way. It matches “stationery” pages for a “stationary” search (without flagging it) and matches “Britney” pages for a “Britny” search (and flags it as a spelling correction).
What’s a Brand To Do?
Where does this leave us? Don’t worry so much about including on your page all of the various ways someone might misspell your topics. It hurts your credibility and likely won’t make a difference in how you rank for those misspellings.
Don’t worry about “SEO copywriting” — including specific phrases in specific combinations a particular number of times on a page.
Use keyword research not for that but to figure out intent — what is your audience really looking for? What tasks are they trying to accomplish? Use the highest volume variation in your title tag and then otherwise, just make sure your page is the most relevant, useful result for that intent. (I’ve given several talks and webinars about this process and I describe a methodology for it in my book.)
And this is yet another example of the importance of brand authority. If audiences associate you with particular topics and start to search for you in conjunction with those topics, your pages are more likely to rank for searches for those topics without your brand. Offline advertising, social media engagement, and direct engagement resulting in return visitors to your site can all help with this.
And if your competitor starts ranking for searches for your brand, you might want to post about it in Google’s discussion forum rather than wait for searcher behavior to adjust things algorithmically.
Opinions expressed in this article are those of the guest author and not necessarily Search Engine Land.