Search Engine Land

Sometimes You Just Need To Go With Your Gut In Data Analysis

Siddharth Shah on June 21, 2012 at 12:52 pm

It is difficult to understand why statisticians commonly limit their inquiries to Averages, and do not revel in more comprehensive views. Their souls seem as dull to the charm of variety as that of the native of one of our flat English counties, whose retrospect of Switzerland was that, if its mountains could be thrown into its lakes, two nuisances would be got rid of at once. An Average is but a solitary fact, whereas if a single other fact be added to it, an entire Normal Scheme, which nearly corresponds to the observed one, starts potentially into existence.

Some people hate the very name of statistics, but I find them full of beauty and interest. Whenever they are not brutalised, but delicately handled by the higher methods, and are warily interpreted, their power of dealing with complicated phenomena is extraordinary. They are the only tools by which an opening can be cut through the formidable thicket of difficulties that bars the path of those who pursue the Science of man.

— Sir Francis Galton
Natural Inheritance
(1889), 62-3

Data analysis can be overhyped. We are taught to believe that with the right data analysis, we can understand everything around us and make rational decisions in our best interests. We are even shown how to do this in our schools and universities where we analyze datasets and are asked to infer insights.

The trouble is that the datasets we are taught on are unrealistic. Unlike toy datasets, real marketing data is often incomplete, sparse and sometimes even wrongly entered. So while it would be possible to come up with crystal clear conclusions if the data were robust, we are often forced to make decisions with partial or small datasets.

In these situations, we must combine the analysis with some heuristics and gut checks to come up with our conclusions. This is best understood with an example.

You are asked to find the quarter-on-quarter change in spend for advertisers based on a small sample. (The data has been randomized and is not representative of any real advertiser.)

The simplest approach is to total the spend across all advertisers for each quarter and measure the change on that total. The change in this case is -63%. So is this reflective of the marketplace? Let us explore this data a bit further.
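As a rough illustration of the "change on the total" calculation, here is a minimal Python sketch. The five advertisers and their spend figures are hypothetical values invented for this example, not the sample discussed in the article, and the name `total_change` is an assumption.

```python
# Hypothetical (Q4, Q1) spend per advertiser; illustrative values only,
# not the article's sample.
spend = {
    "A": (100.0, 90.0),
    "B": (200.0, 150.0),
    "C": (50.0, 60.0),
    "D": (1000.0, 200.0),
    "E": (150.0, 120.0),
}


def total_change(rows):
    """Relative change between the summed Q4 spend and the summed Q1 spend."""
    q4 = sum(prev for prev, _ in rows)
    q1 = sum(curr for _, curr in rows)
    return (q1 - q4) / q4


overall = total_change(spend.values())
print(f"Change on the total: {overall:.1%}")
```

Because the calculation works on totals, the largest advertiser carries the most weight, which is why a single account can dominate the result.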

Step 1: Directional Trends

Are we sure whether the overall trend is positive or negative? If we break down trends by advertiser, we get this chart:

Note that 9 out of 11 advertisers show a drop in spend. Further, we know from our knowledge of the retail vertical that spend usually drops in Q1 relative to the Q4 holiday season. So we are confident that the spend trend is negative.
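The directional check can be sketched by computing each advertiser's own change and counting how many declined. The spend figures below are hypothetical and only illustrate the mechanics.

```python
# Hypothetical (Q4, Q1) spend per advertiser; illustrative values only.
spend = {
    "A": (100.0, 90.0),
    "B": (200.0, 150.0),
    "C": (50.0, 60.0),
    "D": (1000.0, 200.0),
    "E": (150.0, 120.0),
}

# Relative change for each advertiser: (Q1 - Q4) / Q4.
changes = {name: (q1 - q4) / q4 for name, (q4, q1) in spend.items()}

# Advertisers whose spend went down quarter on quarter.
decliners = [name for name, change in changes.items() if change < 0]

print(f"{len(decliners)} of {len(changes)} advertisers reduced spend")
```

A count like this ignores the size of each advertiser, so it answers only the direction question, not the magnitude question.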

Step 2: Identification & Treatment Of Outliers

A closer look at the sample reveals that in Q4, over 50% of spend came from advertiser 9, and that same advertiser dropped spend by 83%. In such cases, it is useful to measure the median change across the sample. Further, we can also measure the change without advertiser 9.

Statistic                    Change
Mean without advertiser 9    -40%
Median change                -30%
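A minimal sketch of the outlier treatment, again on hypothetical spend figures: find the advertiser with the largest share of Q4 spend, take the median of the per-advertiser changes, and recompute the change on the total with that advertiser excluded.

```python
from statistics import median

# Hypothetical (Q4, Q1) spend per advertiser; illustrative values only.
spend = {
    "A": (100.0, 90.0),
    "B": (200.0, 150.0),
    "C": (50.0, 60.0),
    "D": (1000.0, 200.0),
    "E": (150.0, 120.0),
}

changes = {name: (q1 - q4) / q4 for name, (q4, q1) in spend.items()}

# Share of Q4 spend per advertiser; a very large share flags a dominant account.
total_q4 = sum(q4 for q4, _ in spend.values())
shares = {name: q4 / total_q4 for name, (q4, _) in spend.items()}
dominant = max(shares, key=shares.get)

# The median of per-advertiser changes is insensitive to one extreme account.
median_change = median(changes.values())

# Change on the total with the dominant advertiser left out.
rest = [v for name, v in spend.items() if name != dominant]
rest_q4 = sum(q4 for q4, _ in rest)
rest_q1 = sum(q1 for _, q1 in rest)
change_without_dominant = (rest_q1 - rest_q4) / rest_q4

print(f"Dominant advertiser: {dominant} ({shares[dominant]:.0%} of Q4 spend)")
print(f"Median change: {median_change:.0%}")
print(f"Change without {dominant}: {change_without_dominant:.0%}")
```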

Clearly, advertiser 9 is biasing our results. Another way to check the impact of individual advertisers is to measure the average drop with each advertiser removed from the sample in turn. So we measure the mean drop in spend without advertiser 1, then without advertiser 2, and so on.

Note how stable the estimates are for all but advertiser 9. Most estimates hover around -60%, but without advertiser 9 the estimated decline shrinks to -31%. Clearly, advertiser 9 is biasing the results negatively.
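This leave-one-out check can be sketched as follows. The data is hypothetical, and `total_change` and `most_influential` are names chosen for this example.

```python
# Hypothetical (Q4, Q1) spend per advertiser; illustrative values only.
spend = {
    "A": (100.0, 90.0),
    "B": (200.0, 150.0),
    "C": (50.0, 60.0),
    "D": (1000.0, 200.0),
    "E": (150.0, 120.0),
}


def total_change(rows):
    """Relative change between the summed Q4 spend and the summed Q1 spend."""
    q4 = sum(prev for prev, _ in rows)
    q1 = sum(curr for _, curr in rows)
    return (q1 - q4) / q4


overall = total_change(spend.values())

# Recompute the change with each advertiser removed in turn.
leave_one_out = {
    left_out: total_change([v for name, v in spend.items() if name != left_out])
    for left_out in spend
}

# The advertiser whose removal moves the estimate furthest from the overall figure.
most_influential = max(leave_one_out, key=lambda n: abs(leave_one_out[n] - overall))

for name, change in leave_one_out.items():
    print(f"Without {name}: {change:.0%}")
print(f"Most influential advertiser: {most_influential}")
```

If the estimates stay close to each other for every removal except one, that one advertiser is the likely source of bias.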

Step 3: Cross Checking The Data

If possible, we should try to arrive at our estimate by other means. So we try another approach. If we order advertisers by spend and then calculate the change in spend cumulatively, what trend do we see?

Two things become very apparent:

  1. Without advertiser 9, the sample represents 77% of spend in Q1 and the change in spend between Q4 and Q1 is -40%.
  2. As we add larger advertisers, the change in spend generally becomes increasingly negative. This indicates that larger advertisers dropped spend more than smaller advertisers in general, something that we might want to investigate further.
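The cumulative cross-check can be sketched like this, assuming advertisers are ordered from smallest to largest Q4 spend (the article does not state which quarter it sorts on, so that choice is an assumption). The spend figures are hypothetical.

```python
# Hypothetical (Q4, Q1) spend per advertiser; illustrative values only.
spend = {
    "A": (100.0, 90.0),
    "B": (200.0, 150.0),
    "C": (50.0, 60.0),
    "D": (1000.0, 200.0),
    "E": (150.0, 120.0),
}

# Order advertisers from smallest to largest Q4 spend (an assumed sort key).
ordered = sorted(spend.items(), key=lambda item: item[1][0])

cumulative = []
run_q4 = run_q1 = 0.0
for name, (q4, q1) in ordered:
    run_q4 += q4
    run_q1 += q1
    # Change on the running total after adding this advertiser.
    cumulative.append((name, (run_q1 - run_q4) / run_q4))

for name, change in cumulative:
    print(f"Up to and including {name}: {change:.0%}")
```

A sharp jump at the last step, as with advertiser D here, shows how much of the overall figure depends on the largest account.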

Step 4: Coming Up With The Estimate

This is the hardest and most controversial part. Here are the estimates that we have:

Method                            Estimate
Mean (all)                        -60%
Mean (without advertiser 9)       -40%
Median                            -30%
Mean without top 2 advertisers    -20%

Given all these estimates, I would feel fairly comfortable saying that spend declined by between 25% and 35% from Q4 to Q1. Of course, I am mixing the estimates with my gut feeling, and this is where it gets subjective.
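To see how the four estimates spread out, one can summarize them in a few lines of Python. This only describes the numbers in the table above; the 25% to 35% range remains a judgment call and is not computed by the code.

```python
from statistics import median

# Estimates taken from the table in this article, as fractions.
estimates = {
    "Mean (all)": -0.60,
    "Mean (without advertiser 9)": -0.40,
    "Median": -0.30,
    "Mean without top 2 advertisers": -0.20,
}

low = min(estimates.values())    # most pessimistic estimate
high = max(estimates.values())   # most optimistic estimate
mid = median(estimates.values()) # middle of the four estimates

print(f"Estimates range from {low:.0%} to {high:.0%}, median {mid:.0%}")
```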

In conclusion, when working with partial or sparse data, try to see the data from different angles rather than just the overall average.

Further, gut check your analysis with your domain knowledge. This is hard when your analysis shows something unexpected. If it is an unexpected number, it might really be a new insight or it might be a wrong conclusion.

At these times, your gut can really lead you to the right path or completely astray.

Hence, check your conclusions in different ways. The difference between an average analyst and the best ones is that the best analysts know when to trust their instinct and when not to. That cannot be taught; it comes with experience.


Opinions expressed in this article are those of the guest author and not necessarily Search Engine Land. Staff authors are listed here.



About The Author

Siddharth Shah
Siddharth Shah is head of web analytics, digital strategy and insights at Adobe. He leads a global team that manages the performance of over $2 billion of ad spend on search, social and display media at Adobe.
