
The Pitfalls Of A/B Ad Split Testing, Part 1

Most search advertisers have no doubt that testing ads is a good thing and can lead to much higher-performing campaigns. But is it possible to over-test and over-optimize, and actually end up with worse results? The answer may surprise you.

Matt Van Wagner on March 8, 2010 at 4:36 pm | Reading time: 6 minutes

While channel surfing a while back, I came across the “National Dog Show,” a TV program where dogs of all shapes and sizes are paraded around by their handlers, and then poked and prodded by judges who select one of these purebreds to win the top prize.

It was great fun to see these beautiful canines compete to become top dog, but there’s an unfortunate downside to this sort of competition. Evidence suggests that selective breeding of dogs for looks alone can lead to a thinning of their genetic diversity and the emergence of genetically-inherited defects. The purest of the purebreds are often less healthy and don’t age as well as their more genetically diverse cousin, the less attractive, but very lovable, mutt.

As I turned off the TV and got back to writing ad copy for PPC A/B testing, a question popped into my mind. Is it possible that we are overbreeding our text ads, or in our PPC vernacular, over-optimizing them, to the detriment of the overall health of our campaigns?

The answer, surprisingly enough, is yes! A/B testing, taken to its extreme, can actually cause PPC campaign performance to degrade.

Creating best of breed ads through testing

A/B ad testing seems simple enough.

You run two ads in an ad group and let them rotate evenly, so impressions are split equally between them. After a while, you evaluate the results and declare a winner based on the highest click-through rate (CTR), the highest conversion rate (CVR), or the product of the two, CTR times CVR, which works out to conversions per impression. The losing ad is tossed out and the winning ad moves on to the next round of testing.
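To make the three yardsticks concrete, here is a minimal Python sketch of one round of testing. The ad names and all of the numbers are invented for illustration, and the `AdStats` class and `pick_winner` helper are hypothetical, not part of any ad platform. The sketch also skips any check for statistical significance, which a real test would need.

```python
from dataclasses import dataclass


@dataclass
class AdStats:
    """Raw counts for one ad in an evenly rotated A/B test (illustrative only)."""
    name: str
    impressions: int
    clicks: int
    conversions: int

    @property
    def ctr(self) -> float:
        # Click-through rate: clicks per impression.
        return self.clicks / self.impressions if self.impressions else 0.0

    @property
    def cvr(self) -> float:
        # Conversion rate: conversions per click.
        return self.conversions / self.clicks if self.clicks else 0.0

    @property
    def blended(self) -> float:
        # CTR times CVR, which equals conversions per impression.
        return self.ctr * self.cvr


def pick_winner(a: AdStats, b: AdStats, metric: str = "blended") -> AdStats:
    """Return the ad with the higher value of the chosen metric (ties go to a)."""
    return a if getattr(a, metric) >= getattr(b, metric) else b


# Made-up results for two hypothetical ads with equal impressions.
ad_a = AdStats("Save 20%", impressions=10_000, clicks=300, conversions=12)
ad_b = AdStats("Eco-friendly", impressions=10_000, clicks=250, conversions=15)

for metric in ("ctr", "cvr", "blended"):
    print(metric, "->", pick_winner(ad_a, ad_b, metric).name)
```

With these invented numbers, the first ad wins on CTR while the second wins on CVR and on the blended metric, so the choice of yardstick decides which ad survives the round.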

The next round in the A/B ad testing process looks a lot like the first, except that it involves testing the current champion ad against a new challenger ad, letting those two ads battle it out for the best CTR and CVR, until a winner emerges. This process is repeated, ad infinitum, if you’ll pardon the pun, with the hope that eventually you will end up with the best PPC ad ever written in the history of online advertising. In reality, what usually happens is that you get bored with the test or run out of copywriting ideas, and so you just end the test, set the champion ad as the default, and then move on to optimize other parts of your PPC campaigns.

No one will deny that A/B ad testing is valuable and that it enables a rational, scientific approach to campaign optimization that can yield practical improvements in your PPC campaigns, especially early on in the testing cycle.

A/B testing is so simple and easy to understand that it’s hard not to like it and use it all the time. It has become something of a sacred cow in that regard. But as with many other PPC campaign optimization tactics, A/B ad split testing can be misapplied. What’s worse, it can actually degrade your campaign performance over time. Why?

Simply put, over-optimizing your ads can lead to a decline in the performance of the ad group as a whole.

To understand how and why A/B testing can cause performance declines, let’s assume you own the Blue Widgets store down on Main Street and a customer walks in the front door. You know nothing about the customer except the fact that they just walked into your store.

Here are the unique selling features and benefits of your store:

Selection: Blue widgets available in any shade of blue
Savings: Save 20% on blue widgets this month
Quality: Eco-friendly blue widgets
Availability: Blue widgets in stock for fast delivery
Brand: We carry ACME blue widgets

Of these benefits, you know that saving money is on most people’s minds, and generally appeals to a wide cross-section of people.

So the key question is this: will you have a better chance of selling to this person if you help them understand all of your store’s unique selling features and benefits, or if you tell them about the 20% off sale five times over?

Subjectively, I think most of us would say that you’d have a better chance of connecting with a customer if you used all five benefits, since you would have five chances, rather than one, of hitting one of your potential customer’s hot buttons.

In a PPC campaign an ad is your sales pitch. It is what brings customers in the door. The challenge for PPC ads, however, is that you can’t fit all of your top benefits into a single ad. Instead, you break the messages up individually, or group a few together, and then start the process of A/B testing (or multivariate testing if you are advanced) to determine the best message.

After a few rounds of testing, you find out that the “Save 20%” ad is your best performer, so you pause all the other lower-performing ads and congratulate yourself on optimizing your campaign.

However, this is where A/B ad testing can lead to erroneous conclusions. The tests do not optimize your ad groups; they simply identify your best ad. While A/B testing can do a great job of identifying your best ad, narrowing your ad groups to a single winner can be a huge mistake. The most effective ad group sometimes needs two, three or more ads to perform at its best, not just the champ from your A/B testing.

Here’s why. When people are looking for products and services, they make multiple searches. Sometimes they refine their queries, but often they type the same or a nearly identical search query over and over again.

What this means is that someone who searches on “blue widgets” five times will see the “Save 20%” ad five times. If they search ten times, they’ll see it ten times. The only thing they will learn about your blue widgets is that buying them from you will save money.

Just as in our example above of a person walking into your store and hearing five great selling benefits, PPC ad groups that offer more than one great ad should outperform ad groups that are limited to a single “best of breed” ad.
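The intuition can be sketched with a toy model in Python. This is not the math promised for the follow-up article; it is a simplified illustration built on loudly stated assumptions: every searcher responds to exactly one of the five benefits, the shares in `HOT_BUTTON_SHARE` are invented, each ad carries a single benefit, and each search shows one ad picked uniformly at random from the ad group. The function `reach_probability` is a hypothetical helper.

```python
# Assumed share of searchers whose "hot button" is each benefit (invented numbers).
HOT_BUTTON_SHARE = {
    "Selection": 0.15,
    "Savings": 0.40,
    "Quality": 0.15,
    "Availability": 0.15,
    "Brand": 0.15,
}


def reach_probability(shares, ads_in_rotation, searches):
    """Chance that a searcher sees the ad matching their hot button at least once.

    Assumes one ad per benefit and one uniformly random ad shown per search.
    """
    p_show = 1 / len(ads_in_rotation)  # chance a given ad appears on one search
    reach = 0.0
    for benefit, share in shares.items():
        if benefit in ads_in_rotation:
            # 1 minus the chance of missing the matching ad on every search.
            reach += share * (1 - (1 - p_show) ** searches)
    return reach


single_winner = ["Savings"]            # the A/B "champion" running alone
full_rotation = list(HOT_BUTTON_SHARE)  # all five ads rotating evenly

for n in (1, 2, 3, 5, 10):
    single = reach_probability(HOT_BUTTON_SHARE, single_winner, n)
    rotation = reach_probability(HOT_BUTTON_SHARE, full_rotation, n)
    print(f"{n:>2} searches: single ad {single:.3f}, five ads {rotation:.3f}")
```

Under these assumptions the lone champion reaches 40% of searchers no matter how often they search, while the five-ad rotation starts lower on the first search and overtakes it from the third search onward. Different shares or a different rotation rule would move the crossover point, which is consistent with the claim that multiple ads sometimes, not always, do better.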

When it comes to optimizing ad groups, give me five lovable mutts over one blue ribbon winner anytime.

Next month, I’ll take a look at the math behind A/B testing and demonstrate how an ad group can outperform the best single ad in that ad group that emerges from A/B tests. I’ll also take a look at the challenges of A/B testing at the ad group level, and how to maintain healthy message diversity in your ad campaigns.

