# The Long Tail Of Search

What is the “long tail” of paid search, and why does it matter?

Chris Anderson coined the long tail concept in a 2004 Wired magazine article. Anderson’s original argument applied to online merchandising. Because web-only merchants (think Netflix) should have lower inventory carrying costs than traditional retailers (think WalMart), web-only merchants can afford to offer a broader catalog of items. Anderson describes a graph where the x-axis represents SKU sales rank and the y-axis represents the corresponding sales by SKU. Best-selling items appear on the left side of the distribution. These are hit products with large sales, the products carried by traditional retailers. The graph drops off steeply, then heads rightward in a nearly flat long tail. The tail holds the niche products, individually unimportant but collectively significant, and that is the big insight from Anderson’s article.

The shape of Anderson’s curve is a power law. Power laws pop up in natural science, chaos theory, computer networks, and languages. In linguistics, Zipf’s “Law” states that word frequencies in a text follow a power law: a word’s frequency is roughly inversely proportional to its frequency rank.

Let’s check out Zipf’s Law for ourselves. If you take the text of Moby Dick and count each word, you get this graph:

You can see the most prevalent word by far is “the”, at about 15k occurrences in Melville’s novel. The second most popular word, “of”, is far less popular: under 7k occurrences. Note how the graph drops very rapidly and basically levels out. (The graph continues far far far rightward, not shown.) This plot shows that Moby Dick—like any text—contains a tiny number of super-high frequency words, and a huge number of words with very low frequencies.

Of course, Moby Dick isn’t a novel about “the”. In aggregate, the long tail of low-frequency words makes up most of the text.
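If you want to try the word count yourself, here is a minimal Python sketch. The function names `word_frequencies` and `head_share` are made up for this illustration, and the one-line `sample` text is only a stand-in; you would swap in the full text of the novel to see the real curve.

```python
import re
from collections import Counter

def word_frequencies(text):
    """Return (word, count) pairs sorted from most to least frequent."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(words).most_common()

def head_share(ranked, top_n):
    """Fraction of all word occurrences covered by the top_n ranked words."""
    total = sum(count for _, count in ranked)
    head = sum(count for _, count in ranked[:top_n])
    return head / total

# Tiny stand-in text (an assumption for illustration, not the novel).
sample = "the whale and the sea and the ship of the whale"
ranked = word_frequencies(sample)

# Print the first few ranks, the data behind a rank-versus-count plot.
for rank, (word, count) in enumerate(ranked[:3], start=1):
    print(rank, word, count)

# Share of the text taken up by the single most frequent word.
print(round(head_share(ranked, 1), 2))
```

Plotting rank against count for a full-length text should give the steep drop and long flat tail described above, and `head_share` lets you check how much of the text the head actually covers.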

Also, note that the 21st most frequent word in Moby Dick is “whale”. “Whale” isn’t a particularly common word in normal English, but it is highly prevalent in this whaling novel. This reminds us that notions of “common” and “uncommon” depend highly on context.

By now, you might be wondering if any of this matters to paid search marketers.

It does.

If you take the phrases in a search portfolio and order them by decreasing clicks or cost, you’ll quickly see a tiny handful of terms comprise most of your clicks. Try it out. The distribution of click frequencies in a well-designed paid search phrase list approximately follows a power law.
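Here is one way to try it, sketched in Python. The `portfolio` dictionary is invented sample data and `cumulative_click_share` is a hypothetical helper name; in practice you would load phrase-level click counts from your own reports.

```python
def cumulative_click_share(clicks_by_phrase):
    """Rank phrases by clicks (descending) and track the running share of clicks."""
    ranked = sorted(clicks_by_phrase.items(), key=lambda kv: kv[1], reverse=True)
    total = sum(clicks for _, clicks in ranked)
    running = 0
    rows = []
    for rank, (phrase, clicks) in enumerate(ranked, start=1):
        running += clicks
        rows.append((rank, phrase, clicks, running / total))
    return rows

# Invented click counts for illustration only.
portfolio = {
    "widgets": 6400,
    "titanium widgets": 1500,
    "buy widgets online": 1200,
    "oscillating widget": 500,
    "45mm widget": 300,
    "extra strength widget": 100,
}

rows = cumulative_click_share(portfolio)
for rank, phrase, clicks, share in rows:
    print(f"{rank:>2}  {phrase:<24} {clicks:>6}  {share:.0%}")
```

The last column is the cumulative share of clicks. In a real portfolio, watch how few rows it takes for that column to pass 90%.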

I pulled sample data for one of our clients, an online specialty retailer ranked in the Internet Retailer 100. We are running about 45K active phrases for this client on each major search engine.

These 45K terms aren’t created equal. The top 10 phrases account for 64% (!) of the client’s clicks, and phrases 11 through 40 account for another 27%. That is, the first 40 terms (less than one tenth of one percent of the total term list) account for 91% of clicks! Phrases 41 through 45K, which make up 99.9% of the search phrases, collectively account for a mere 9% of clicks. Wow.

So is the long tail important to this client? You bet. Phrases 41 through 45K generate 16% of the client’s total PPC sales. And because these phrases have, on average, lower CPCs than the high-traffic head terms, they contribute 19% of the client’s total PPC profit.
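The arithmetic behind these figures is easy to check. The short Python sketch below uses only the rounded percentages quoted above, so treat the derived ratio as rough; the variable names are just for illustration.

```python
# Figures quoted in the article (rounded percentages).
total_phrases = 45_000
head_phrases = 40
head_click_share = 0.64 + 0.27           # top 10 plus phrases 11 through 40
tail_click_share = 1 - head_click_share
tail_sales_share = 0.16
head_sales_share = 1 - tail_sales_share

# What fraction of the phrase list is the head?
head_phrase_fraction = head_phrases / total_phrases

# Share of sales earned per share of clicks, tail versus head.
tail_sales_per_click = tail_sales_share / tail_click_share
head_sales_per_click = head_sales_share / head_click_share
relative = tail_sales_per_click / head_sales_per_click

print(f"head is {head_phrase_fraction:.3%} of the phrase list")
print(f"tail sales per click vs head: {relative:.2f}x")
```

On these rounded inputs, the 40 head phrases are about 0.09% of the list, and a tail click is associated with roughly 1.9 times the sales of a head click, which is part of why the tail punches above its traffic.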

The lazy search marketer might look at these numbers and reason: “Hmmm…. I can get 91% of the traffic with less than 1% of the effort. I’ll run 40 phrases, get most of the bang for not that much effort, and head out early for the day to do some golfing.”

The savvy search marketer views the situation differently: “Hmmm…. running good ads on 45K terms will take a boatload of work, getting all those bids and match types and landing pages and ad copy done right. Ouch. But not doing this would mean walking away from 16% of revenue and 19% of profit… I guess it is time to roll up my sleeves and dig in.”

Is the long tail right for all search advertisers?

Perhaps not. Big campaigns with tens of thousands or hundreds of thousands of terms require much more care and attention than smaller campaigns of a few hundred top terms. Big campaigns offer more dark nooks for inefficiencies and errors to hide. You need good systems to generate and maintain relevant and targeted ad copy. You need advanced statistical approaches to cluster ads to compute bids intelligently (because almost every phrase in the tail lacks sufficient traffic to assess its conversion individually with any degree of statistical certainty). In short, embracing very large term lists offers both opportunity and risk. No doubt: a small campaign executed well will out-perform a large campaign executed poorly.
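To make the clustering idea concrete, here is a minimal Python sketch of one common approach: pool the data for a group of similar phrases, then pull each phrase’s estimated conversion rate toward the group rate. This is an illustration, not a description of any particular agency’s system. The cluster membership, the sample numbers, and the `prior_clicks` setting are all assumptions.

```python
def cluster_rate(phrases):
    """Pooled conversion rate across every phrase in a cluster."""
    clicks = sum(p["clicks"] for p in phrases)
    conversions = sum(p["conversions"] for p in phrases)
    return conversions / clicks if clicks else 0.0

def shrunk_rate(clicks, conversions, pooled_rate, prior_clicks=200):
    """Blend a phrase's own rate with its cluster's rate.

    prior_clicks is an assumed tuning knob: the cluster rate is treated
    as if it were backed by that many extra clicks, so phrases with
    little traffic stay close to the cluster rate.
    """
    return (conversions + prior_clicks * pooled_rate) / (clicks + prior_clicks)

# Invented low-traffic phrases, assumed to belong to one cluster.
cluster = [
    {"phrase": "titanium widget 45mm", "clicks": 12, "conversions": 1},
    {"phrase": "oscillating titanium widget", "clicks": 8, "conversions": 0},
    {"phrase": "extra strength widget", "clicks": 30, "conversions": 0},
]

pooled = cluster_rate(cluster)
for p in cluster:
    raw = p["conversions"] / p["clicks"]
    est = shrunk_rate(p["clicks"], p["conversions"], pooled)
    print(f'{p["phrase"]:<30} raw {raw:.3f}  blended {est:.3f}')
```

The raw rates swing between 0% and more than 8% on a handful of clicks, while the blended estimates stay near the cluster’s 2%. Bids built on the blended rate are far less likely to chase noise.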

Executed well, the long tail terms offer modest gains in sales and earnings (say, low double digit percentage increases). Managed poorly, the long tail can easily stab you in the back, generating significant cost overruns and earnings losses (say, high double digit percentage decreases).

Eight suggestions regarding large search phrase portfolios:

• Don’t apply sweeping simplistic bid rules (for example: “Any phrase with fewer than 10 clicks / month, bid \$0.50”) to the long tail. This is a sure-fire way to torpedo your efficiency.
• Don’t use generic ad copy with title substitution. These shortcuts hurt performance. Write targeted, relevant ad copy that matches the actual phrase.
• Don’t use generic landing pages. Match each phrase with a highly specific, targeted, and relevant deep-linked destination URL.
• Don’t confuse quality with quantity. It is easy to bulk up phrase lists in dumb ways. We’ve seen agencies seeking quantity over quality add useless additional words to phrases (prepending “buy” to every phrase doubles your list!) or run extremely long detailed phrases which will never see impressions or clicks (for example, full SKU names like “Extra Strength Oscillating 45mm Titanium Widget”).
• Don’t manage campaigns of more than a few hundred phrases “by hand” using spreadsheets.
• Don’t manage all phrases identically. The left and the right sides of the distribution require different strategies. Success in the head is about copy testing, day-parting, and match-type optimization. Success in the tail is about aggregation and smart copy management.
• Don’t update bids on all terms with the same frequency. Be nice to Google (and save yourself API fees) by bidding long tail phrases only as necessary.
• Don’t fall back to a “just run the winners” approach, turning off long tail phrases prematurely. Most long tail phrases generate zero sales in a typical month. The low-frequency phrases which generate sales this month won’t, on average, be the low-frequency phrases which generate sales next month. Use strong statistical portfolio clustering approaches to bid low-frequency phrases smartly.
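The last suggestion follows from simple probability. The Python sketch below assumes a hypothetical tail phrase with 5 clicks a month and a true conversion rate of 2%, and it assumes clicks and months are independent. None of these numbers come from the client data above.

```python
def prob_zero_sales(clicks, conversion_rate):
    """Chance that `clicks` independent clicks produce no sale at all."""
    return (1 - conversion_rate) ** clicks

# Hypothetical tail phrase: 5 clicks a month, 2% true conversion rate.
p_zero = prob_zero_sales(clicks=5, conversion_rate=0.02)
p_sale = 1 - p_zero

# If months are independent, selling this month does not change the odds
# for next month, so a "winner" repeats with probability p_sale.
p_repeat = p_sale

print(f"zero sales in a month: {p_zero:.1%}")
print(f"this month's winner sells again next month: {p_repeat:.1%}")
```

Under these assumptions, about 90% of such phrases show zero sales in any given month, and a phrase that did sell has only about a one in ten chance of selling again next month, the same odds as its neighbors that sold nothing. Pausing the “losers” would discard phrases that are statistically identical to the “winners”.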

The long tail. Done well, a great place to find reasonable performance bumps in PPC campaigns.

Alan Rimm-Kaufman leads the Rimm-Kaufman Group, a direct marketing services and consulting firm founded in 2003. The Paid Search column appears Tuesdays at Search Engine Land.

Opinions expressed in the article are those of the guest author and not necessarily Search Engine Land.

