Why Results Quality Is So Important to Search Engines
Every single search engine has, at the heart of it, a dynamic tension that must be respected. It needs to balance user experience with revenue opportunities. Getting this balance right is incredibly difficult, as Ask, Yahoo and other engines that have seen their market shares precipitously fall can attest to. There was a time when Yahoo (and MSN, in a previous incarnation) used to routinely show 4, 5 or even more results in the critical “North” ad position (top of page, above organic results).
As these engines boosted revenue (because showing more ads will almost always boost revenue) they also saw their market shares decline. Google has generally respected the 3 ad maximum in the top spot. Over time, Microsoft and Yahoo have also come to abide by this upper limit on ads. So, why is 3 the magic number? Or, is it always the magic number? I’ll come back to this in a moment.
There are user experience dynamics at work in this relatively small piece of real estate that are interesting to explore, because they help us understand the search interaction in greater detail. All the major engines now spend a lot of time thinking about how to maintain top of page relevancy.
First, let’s revisit how we scan a search results page. We have developed a pretty efficient F-shaped scanning strategy, starting in the upper left of the search results page. Why the upper left? It’s because our brains automatically create shortcuts based on probability of success. Engines, more often than not, show the most relevant result in the upper left. If this piece of real estate typically delivers what we’re looking for, after a while we’ll just naturally start there without thinking.
You might argue that in our culture, we read top to bottom, left to right, so that would lead to an upper left bias. This contributes, but if Google and Bing suddenly started showing the most relevant result halfway down the page, we’d soon adjust our scanning strategy to make that our entry point. Search scanning patterns are much more about probability of success than they are about more arbitrary cultural guidelines.
Scanning a search page is really a hunt for the most useful information on the page, and as Pirolli and Card discovered in the early 90’s at PARC, we have adapted evolutionary mechanisms to assist us there. We use the same basic strategies that we developed to hunt for food.
We look for cues, or, to use the analogous term, “scent”. And efficiency is the rule. We’re none-too-patient in our hunt for usefulness. We start at the top left, scan down the left hand side of the results in a vertical path (the upright leg of the “F”) and, if we find something of interest, scan laterally on the title (the horizontal arms of the “F”).
This creates a triangle-shaped scan pattern, which I christened the “Golden Triangle.” In our many studies over the years, we’ve found the typical session time, from first scan to first click, to be in the 10 to 12 second range. And in that time, we scan approximately 4 to 7 listings. This provides a clue as to why 3 ads at the top of the page seems to be the upper limit that users will tolerate.
Our Relevancy Consideration Set
When comparing options, humans have built-in limits on the number of alternatives we can hold in our working memory. Earlier research put that limit at about 7 discrete pieces of information, plus or minus two; more recent research shows it’s probably closer to 3 or 4. This seems to hold true on the results page, given the number of listings we consider (in the 3 to 5 range, based on our observational studies).
We like to compare a small set of potential results, and then pick our top contender from that set. If nothing in the set seems to be a good candidate, we start moving down the page (we’ll come back to this in a moment).
So, what are we looking for in that first consideration set? We certainly want relevance, but what is our definition of relevance? Here we come to some fairly nuanced biases on the part of the user.
Most users seem to know that there’s a difference between the top results, bounded by a shaded box, and the results that appear below that. The visual boundary between the two sends a cue to the user that they should be considered separately. And although that user may not be familiar with the specifics of a search engine algorithm, let alone the intricacies of an advertising quality score, the majority still know that the top results are more commercial in nature and the lower results less so.
In the user’s mind, less commercial equals more trustworthy, so the top organic result becomes a sort of usefulness baseline. Users use it to compare other results against. And this is where we see why too many ads on top starts to erode user confidence.
If an engine puts 4 or more ads on top to be considered, they haven’t left an available memory slot for the baseline listing in the top organic spot. They’ve forced the user to either consider only ads or to break their natural scanning pattern and skip further down the page (which the brain hates to do). This will almost always generate more clicks in the sponsored ads (at least, it will generate more first clicks) but the engine will pay a price.
By not allowing the user to follow their natural inclination to compare the ads against the top organic listing, they start to erode user confidence, which will eventually erode market share. This is the lesson Ask, Yahoo and MSN all learned the hard way.
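The memory-slot argument above can be sketched as a toy model. This is only an illustration, not a description of how any engine or any of our studies worked: the function names are hypothetical, and the consideration set size of 4 is an assumption picked from the 3 to 5 range mentioned earlier.

```python
def first_consideration_set(num_top_ads, set_size=4):
    """Listings a user weighs first when scanning down from the top of the page.

    set_size=4 is an assumed working-memory limit, not a measured value.
    """
    page = [f"ad {i + 1}" for i in range(num_top_ads)] + ["top organic result"]
    return page[:set_size]


def baseline_in_view(num_top_ads, set_size=4):
    """True if the top organic listing still fits in the first consideration set."""
    return "top organic result" in first_consideration_set(num_top_ads, set_size)


for ads in (2, 3, 4, 5):
    print(ads, "ads on top -> organic baseline in set:", baseline_in_view(ads))
```

Under that assumption, three ads still leave room for the organic baseline, while a fourth ad pushes it out of the first comparison set.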
Quality Is Not Just About The Numbers
It’s not just the number of ads that factors into user confidence. The quality of ads is also important. Ideally, for a commercial query, the ads should be more useful results than the organic ones. This is why Google has put such stringent quality thresholds on ads that appear in the North position.
This point was driven forcefully home in a study we did. We took a group of users and split them in two. We showed both groups a page of hypothetical search results from a generic engine and asked them to pick the most relevant result. The two pages of results were identical except for one thing – the ad at the very top of the results. The first group was shown a highly relevant ad and the second group was shown a marginally relevant ad. After their search interaction, we asked both groups four questions:
- Would you use the search engine again for a similar type of search?
- Would you use it again for any type of search?
- Would you make it your preferred search engine?
- Would you recommend this engine to a friend?
The difference between the two groups was astounding.
- For the first question – would you use the engine for a similar search – only 5% of the group shown the less relevant ad said yes, while 75% of the group shown the relevant ad said they would use the engine again.
- For the second question – would you use this engine for any type of query – 17% of the less relevant ad group would give it a second chance, compared to 68% of the more relevant ad group.
- For the third question – would this engine become your preferred engine – only 5% of the less relevant ad group would add it to their favorites, with 31% of the relevant ad group prepared to make the commitment.
- And finally, the acid test of user satisfaction – would you recommend this engine to a friend? 18% of the poor ad group would spread the word, compared to 53% of the relevant ad group.
These numbers are pretty amazing, considering the only thing different between the two groups was the quality of the first ad. Everything else on the page, the other 20 sponsored and organic results, was identical. It brings into stark relief the importance of quality in that critical first slot.
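To put the four gaps side by side, here is a small sketch that takes the percentages reported above and works out the difference in percentage points and the ratio between the two groups. The short question labels and the `compare` helper are mine, added for illustration; only the percentages come from the study.

```python
# Percentages reported in the study: (less relevant ad group, relevant ad group).
results = {
    "use again for a similar search": (5, 75),
    "use again for any search": (17, 68),
    "make it the preferred engine": (5, 31),
    "recommend to a friend": (18, 53),
}


def compare(low, high):
    """Return (gap in percentage points, ratio of relevant to less relevant group)."""
    return high - low, high / low


summary = {question: compare(low, high) for question, (low, high) in results.items()}

for question, (gap, ratio) in summary.items():
    print(f"{question}: +{gap} points, {ratio:.1f}x")
```

On these figures the largest gap is on the first question, where the relevant ad group was 15 times as likely to say yes, and the smallest ratio is on the recommendation question, at just under 3 times.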
So, if ads are this important, it speaks to the importance of a huge ad inventory. In the early stages of the Microsoft/Yahoo partnership, Steve Ballmer mentioned something that was critically important:
“The fundamental basis for doing the search deal with Yahoo has to do with critical mass in the advertising marketplace. It doesn’t have to do with technology, or any of these other things, it really is a market phenomenon. Together we would have more advertisers… which means we’d have more relevant ads on our page.”
The study I talked about previously showed the importance of highly relevant ads at the top of the results. But quality is not just about relevance. Quality is also determined by the appearance of trusted brands. In another study we conducted, we found that in markets where major brands have been slow to adopt search, the top sponsored results quickly become the equivalent of a commercial slum.
Major retailers in some markets (including my own home country of Canada) have been regrettably late in coming to the search marketing party. For many searches on retail consumer products, the only sponsored results you’ll find come from online sellers and aggregators. In most cases, the brands are totally foreign to the average consumer. The major retail brands that dominate the bricks and mortar world are conspicuously absent from the top of the sponsored search results.
In the study I mentioned, we again simulated a search experience for users from a well-established international consumer market, this time on Google, and tested click through rates when we showed ads from the market’s biggest retail chains. These chains had never run search campaigns (nor had any of their main competitors) in our test market, so the top of the search results was typically dominated by lesser-known rivals.
Our baseline group was shown real results straight from Google. Our test group was shown the same results page, but with an ad from the top retailer inserted at the top of the page. We then tested click throughs across the two groups for a number of searches, including common consumer queries such as types of clothing and popular children’s toys.
In the baseline group, we saw anemic click through rates of 1 to 3% in the top sponsored ads. Not really surprising, considering the obscure nature of the advertisers. But even we were amazed by the click through rates we saw from our test group, who were shown ads from the familiar retailer. For generic product queries, our test ads captured click through rates that exceeded 30 to 35%. One out of every three visitors clicked on an ad from a retailer they recognized and trusted.
In time, with more competition on the page, we would expect to see these click through rates drop down to a more sustainable level (typically in the 8 to 10% range), but this shows how search categories where recognized brands have been slow to play offer a significant early adopter advantage to the ones willing to test the waters.
One last comment on the power of quality and relevance in ads. During one test looking at search behaviors in a group of participants shopping for laptop computers, we noticed a fairly high percentage of them quickly scanning the top sponsored and organic results, then, without clicking, looking over at the top sponsored ads on the right hand side. This is relatively unusual. These ads are usually only referred to by 30 to 35% of visitors, and then it’s typically later in the scanning session. But in this study, a high percentage of our subjects were checking out these ads within the first 5 seconds.
When our research coordinator finally asked one of the participants why they were looking at those ads, the response was:
“I was looking for the ad from Dell. I don’t see them anywhere in the results and I expected to.”
Sure enough, as we were mocking up the results page, we didn’t include Dell as a test brand. At the time of the study, Dell dominated laptop sales, with over 1/3 of the market. When our users didn’t see a brand they expected to in the results set, their confidence in the quality of the results, and, by extension, in the engine itself, started to erode.
One search executive recently said to me, “I have a whole team of people looking at top of page quality, trying to get the balance just right.” No wonder. When even one irrelevant ad can erode user confidence to the extent we saw in our studies, top of page quality is a matter of life and death for engines.
Opinions expressed in this article are those of the guest author and not necessarily Search Engine Land.