Google Website Optimizer, Pensive Riffs: A Meeting of the Minds
Part one of this article is here.
“Big money goes around the world” – Rush
When we left off, I had shared the results of a first test for the home page of a small business ecommerce site. My colleagues and I went through a systematic process to work with the business owners (like us, aficionados of the complex riffs of guitar virtuoso Alex Lifeson) to identify the most important areas of the home page to test for potentially increased sales conversions. This included contributing design elements, headline copy, and body copy, as well as an accelerated yet (we think!) sophisticated usability analysis.
Beware the sounds of salesmen
I hope this account serves as a realistic but optimistic antidote to the breathless “we increased conversion rates 500%!!” case studies so often seen out there, the ones that don’t tell you they did so by removing obvious nonperforming keywords from a paid search account or by fixing obvious shopping cart problems. Starting from a low base certainly makes any improvement story sound more impressive, but I assume most of you are looking to pick the middle-hanging fruit, not that easy-to-reach low-hanging variety.
Recap of the first test
On this first test, the learnings about what worked and what didn’t were fairly clear. In this case, different persuasive elements did not improve results significantly, although removing persuasion (going with shorter copy) did mess things up. Overall, as the test wore on, some results converged. In other words, the different “page elements” in the multivariate test of 24 page permutations (3X2X2X2) did sometimes show a winning variation, but other times, something that raced off to an early lead wound up close to tied in the end.
Google’s Tom Leung recently noted that we need to be careful not to pay undue attention to page elements alone, in isolation from combination (or variable interaction) effects. In other words, ideally, you would test your page long enough for the winning combination out of the 24 to be statistically significant. There is indeed something a bit eerie about seeing, say, “recipe #4” lead all the others consistently, no matter how long you run the test. It’s like the numbers are speaking to you.
We also learned that the texture and complexity of user interaction exceeded what we expected, even though we were prepared for it and had run a fairly systematic test intended to give clear, unambiguous answers to our questions about user interface and persuasive elements.
The first test ran for many weeks, and the numbers were somewhat conclusive, and sobering. For three of our four test elements, the new variations performed no better than the original, and sometimes worse.
- “Short copy” was the biggest loser. The original converted much better.
- Eliminating three cluttery product promo boxes above the body text did not hurt conversions, but it didn’t help, either. It seems that “clean design” has its limits.
- Two new headlines we tried were also pretty big losers. The longer (original) one was the winner. We believe this was because it contained the shipping offer, which is a real driver of conversions and of user behavior throughout the process of filling the cart. At this point the belief that I am smarter than the client is starting to wane. I’m thinking “any escape might help to smooth the unattractive truth.”
- A decision to remove a “rotating special” promo box in the upper right-hand margin was a good one. Given what a small part this played in the overall page layout, the modest improvement in conversions here was proof of our hypothesis. This eliminated clutter, but also moved (yet another) mention of the shipping offer up into view.
We settled on the winning combination (not exactly the same as the winner of each variable showdown, but close) as our base new home page. However, the case is definitely not closed at this stage. Only one of our page element tests showed significant improvement. But some of the losing causes taught us enough to want to run a new test. At this point, the winning combination was only 33.7% better in terms of conversion rates than the original, with only a 36% chance of beating all combinations. An OK result, but not too conclusive.
So far, I can tell you’re underwhelmed. In the first test, the winning combination eked out only a slight victory over the original. Many of the contending combinations were so much worse than the original that the ensuing graph of competing combos showed so much “red” it was embarrassing. (Or as Geddy Lee sang in 1984, “I see red, and it hurts my head.”)
What we did for test #2
Test #2 was much better. We took our slight victory, established it as the default new “original page,” and parlayed it into a major win as we tested additional changes. I’m sitting here looking at the Website Optimizer report, and instead of seeing a sliver of green and a bunch of red bars, all of the combos in competition with “original, phase two” are performing better – so the bar chart shows mostly green. Woohoo! And one combo, as so often seems to happen, is starting to pull away.
In debating test #2, we had to shelve several test ideas, or simply implement additional changes arbitrarily, because too many combinations hinder the effort to reach statistical significance in a reasonable time frame. We wanted to limit it to 8 or 12 page permutations this time. You don’t want to be testing for months on end.
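To see why the number of combinations matters so much, here is a rough back-of-the-envelope sketch in Python. It is not how Website Optimizer computes its estimates (the article does not describe that). It uses a textbook two-proportion sample-size approximation, and the daily traffic figure, the 50% detectable lift, and the function names are all assumptions chosen for illustration. Only the 1.8% baseline comes from the article.

```python
from math import ceil
from statistics import NormalDist


def visitors_per_combo(base_rate, relative_lift, alpha=0.05, power=0.80):
    """Approximate visitors each combination needs so that a two-sided
    two-proportion z-test can detect the given relative lift.
    Textbook approximation; ignores multiple-comparison corrections."""
    p1 = base_rate
    p2 = base_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


def days_to_finish(n_combos, daily_visitors, base_rate, relative_lift):
    """Days needed when traffic is split evenly across all combinations."""
    per_combo = visitors_per_combo(base_rate, relative_lift)
    return ceil(n_combos * per_combo / daily_visitors)


# Hypothetical inputs: 500 visitors a day, 1.8% baseline, 50% lift to detect.
for combos in (8, 12, 24):
    days = days_to_finish(combos, daily_visitors=500,
                          base_rate=0.018, relative_lift=0.5)
    print(f"{combos} combinations: roughly {days} days")
```

Because traffic is split across every combination, the required test length grows in direct proportion to the number of combinations, which is the practical reason for capping a test at 8 or 12.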
I carried over some nagging concerns about the winners and losers from before. I felt that the “short copy” I’d contributed wasn’t implemented too well, and in combination with some other page elements, simply looked funny. I decided the solution would be to write “medium length copy” while continuing to attempt to improve the messaging. I also tried to exercise some control over visual layout. Finally, I experimented with signing it more personably – the Mayflower Hanging Baskets Family (not really the name of the company), as the family-owned business aspect is touted in other parts of the messaging but for some reason that signature said “staff” rather than “family.”
I also thought the “original” headline still wasn’t optimal. But rather than tinker with it again, I tested a font change.
Finally, we decided to learn more about the “clutter-boxes” above the text. Conversion rates were about the same when we eliminated them. What if we added three more of the pesky little critters? My wife looked at this in action before we ran the test and exclaimed “yikes, that’s just wrong. I’d leave the site immediately.” Her concern proved to be more or less warranted.
Testing copy variations, headline font, and clutter-boxes gave us eight combos this time around. A relatively simple 2X2X2 test.
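For anyone who wants to check the combination counts, the two test designs can be enumerated in a few lines of Python. The variation labels below are paraphrased from this write-up, and the choice of the headline as the three-level element in test #1 is an inference (the original plus two new headlines), not something the Website Optimizer report states.

```python
from itertools import product

# Test #1: one element with three variations, three with two (3X2X2X2).
test_one = {
    "headline": ["original", "new headline A", "new headline B"],
    "body copy": ["original", "short"],
    "promo boxes": ["three boxes", "none"],
    "rotating special": ["shown", "removed"],
}

# Test #2: three elements with two variations each (2X2X2).
test_two = {
    "body copy": ["phase-two original", "medium"],
    "headline font": ["original", "new font and color"],
    "promo boxes": ["three boxes", "six boxes"],
}


def combinations(factors):
    """Return every full-factorial combination as a dict of element -> variation."""
    names = list(factors)
    return [dict(zip(names, levels)) for levels in product(*factors.values())]


print(len(combinations(test_one)))  # 24
print(len(combinations(test_two)))  # 8
```

Adding a single two-way element to the second test would have doubled it to 16 combinations, which is why ideas had to be shelved.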
The consumer response
As I write this, our test has reached statistical significance ahead of the expected schedule. Google Website Optimizer (rather than the familiar “you have an estimated 29 days to go…”) is shouting “Congratulations! Combination #4 is the winner!” Ah, the payoff.
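Website Optimizer does the statistics for you, and the article does not say which method it uses. As a minimal sketch of what "statistically significant" can mean when comparing two pages, here is a standard two-proportion z-test in Python. The visit and conversion counts are hypothetical, picked only to match the rough 1.8% and 3.6% rates discussed later.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, visits_a, conv_b, visits_b):
    """Two-sided z-test for a difference between two conversion rates.
    Returns (z statistic, p-value). Uses the pooled standard error."""
    rate_a = conv_a / visits_a
    rate_b = conv_b / visits_b
    pooled = (conv_a + conv_b) / (visits_a + visits_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / visits_a + 1 / visits_b))
    z = (rate_b - rate_a) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts: 36 of 2,000 visitors convert on the old page,
# 72 of 2,000 on the challenger.
z, p = two_proportion_z_test(36, 2000, 72, 2000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value says a gap this large would be unlikely if the two pages really converted at the same rate. With eight combinations in play, a careful analysis would also adjust for the number of comparisons being made.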
Conversion rates and sales volumes across the board took an expected dip in August, and they’ll be moving sharply up for the client’s back-to-school rush. Still, the controlled nature of these tests makes them reasonably reliable even with seasonality.
So what exactly happened here?
My persistence in trying to improve the sales copy was rewarded. The first time out, we didn’t prove that “short copy worked” but we also didn’t prove the original copy was optimal. As it turns out, my medium copy worked well with other page elements. It included a revamp of the H1 level headings and a change of the heading fonts, but also another extensive copy rewrite. This was a big winner, proving that clarity and persuasion are integral to the customer experience.
The new, crisper font and color for the headline grabbed an early lead, but convergence set in and as we speak, it’s not significantly better. Bear in mind, different *wordings* for the headlines had been clear losers in test #1. People were reading and noticing what was being said, the first time around.
And what about the clutter-boxes? “3 Boxes” above the text wasn’t a negative, but 6 boxes was a slight conversion killer. A no-brainer? Maybe so. But now we know. I doubt we’ll be going to 9 boxes.
At this stage, the winning combination is converting 106% better than the winning combo from last time, and the second-place and third-place combos, 63% and 51% better. After two tests, we’re clearly doubling conversion rates from the original. Controlling for seasonality, let’s say this upped conversion rates on average from 1.8% to 3.6% — and consider that a lot of that is stumble-in organic traffic. It’s an aggregate figure that covers a lot of different scenarios. (If you’re scoring at the office of a larger company, just add a zero or two to the sales figures you see below to see what this might be doing for your results.)
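The arithmetic behind these lift figures is simple enough to sketch in Python. The helper names are made up for illustration. Note that the two reported point estimates (33.7% in test #1, 106% in test #2) multiply out to more than a doubling, so the 1.8% to 3.6% figure above is the rounder, seasonality-adjusted number, and each estimate carries its own uncertainty.

```python
def relative_lift(old_rate, new_rate):
    """Relative improvement of new_rate over old_rate (1.0 means +100%)."""
    return (new_rate - old_rate) / old_rate


def compound(*lifts):
    """Combine successive relative lifts, each measured against the
    previous winner, into one lift measured against the starting point."""
    factor = 1.0
    for lift in lifts:
        factor *= 1 + lift
    return factor - 1


# The article's rounded figures: 1.8% to 3.6% is a 100% lift, a doubling.
print(f"{relative_lift(0.018, 0.036):.0%}")

# Chaining the reported point estimates: 1.337 * 2.06 is about 2.75x.
print(f"{compound(0.337, 1.06):.0%}")
```

Lifts from sequential tests multiply rather than add, because the second test's 106% was measured against the first test's winner, not against the original page.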
The changes pursued here stop well short of a full site redesign, multichannel branding campaign, or business plan changes, which could have a more dramatic effect.
But a doubling of conversions from mostly organic referral traffic is no small victory. It should lead to a tangible improvement in profitability.
It’s fascinating to be part of a small piece of ecommerce history unfolding: the spread of conversion science to the small to midsized business. Significant bottom-line improvement to conversion rates, without massive site overhauls or stacking the deck with huge offline spending, is available at an accessible cost to many more businesses today, because the tools are inexpensive or free, and expert help is often affordable from an ROI standpoint (although it’s not free). That expert help is important in reducing myriad options to the most probable areas for improvement, and then correctly implementing creative, communications, and navigational elements… if not the first time, then the second.
Despite the increasing accessibility of complex testing to the average business, the multivariate nature of the interactions makes this process more complex than one-dimensional, haphazard hackers and tweakers often let on. As for practitioners of the “old ways” of testing response – to close out with another Neil Peart lyric – can they “face the knowledge that the truth is not the truth”? Can you say “obsolete”?
Opinions expressed in this article are those of the guest author and not necessarily Search Engine Land. Staff authors are listed here.