• http://www.biketoworkbarb.blogspot.com Barb Chamberlain

    Interesting post. You start out by talking about the need to put statistics into context for decision makers. But at the end, referring to the boxplot, you say, “An analyst reading the chart will draw the right inferences.”

    Yes, an analyst will–but that’s not the decision-maker audience you described initially. I wonder if the boxplot is too unfamiliar to convey meaning to many people. (Blame PowerPoint for the proliferation of bar and pie charts.)

    Another way of reading that chart at a glance–without reference to labels–is to say that Campaign 2 must be the “biggest” campaign because it occupies the most real estate on the chart.

    Bigger box size is a positive indicator in the “size does matter” world, not an indicator of variance. To the non-analyst, it also has those nice long whiskers, which must mean that it’s reaching more, right?

    I wouldn’t show the boxplot to my boss, as it requires too much explanation. For your own insights it would work fine.

    Not an analyst with a better solution, just an Edward Tufte fan.

  • http://blog.efrontier.com sidshah

    Hi Barb,
    Thanks for your feedback. I agree with much of what you say. However…

    In my experience, box plots need to be explained once, but once explained they are very intuitive to decision makers. Box plots have several “Tuftian” characteristics. They are compact and compress a lot of information into little real estate. They don’t waste much ink either (you don’t have to shade the boxes). And they are quite intuitive once explained (I know you disagree with me on this, but try explaining them to your boss once).

    The one BIG virtue they have over classic Tufte plots is that they are not so heavily customized that they become impossible to reproduce, like most of Tufte’s famous charts (can you program the Napoleon chart into a computer?). Even his bump charts are a pain to reproduce in Excel. Sparklines seem useful, but in real business situations I haven’t found them as useful as I thought. There is far too much information compression and you often miss out on short-term trends.

    I like Bill Cleveland’s (http://www.amazon.com/Visualizing-Data-William-S-Cleveland/dp/0963488406) chart drawing philosophy more than Tufte’s because his charts can be produced easily and scale naturally to multivariate data. Moreover, they have most of the positive characteristics of Tufte’s plots. The *amazing* R plotting packages lattice and ggplot are based on Cleveland’s work.

    A final point. The average is the most commonly used statistic because it is easy to understand. But as a descriptor of a data set’s characteristics it is, well… average :)
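
    To make the point about averages concrete, here is a minimal sketch in Python. The two campaign series are made-up numbers chosen for illustration (they are not from the post), and `five_number_summary` is a hypothetical helper name. It computes the five values a box plot draws and shows that two series can share the same average while having very different spreads.

```python
import statistics


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), the five values a box plot draws."""
    ordered = sorted(values)
    # Quartiles computed with the "inclusive" method; other conventions
    # give slightly different Q1 and Q3 on small samples.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]


# Hypothetical daily results for two campaigns (illustrative numbers only).
campaign_a = [18, 19, 20, 20, 20, 21, 22]
campaign_b = [5, 8, 12, 20, 25, 30, 40]

for name, data in [("A", campaign_a), ("B", campaign_b)]:
    print(name, "mean:", statistics.mean(data),
          "summary:", five_number_summary(data))
```

    Both series have the same mean, so a report of averages alone would treat them as equivalent. The summaries differ sharply, which is the extra information the box and whiskers carry.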

  • http://www.outsidethecurve.com Ryan Bruss

    “An analyst reading the chart will draw the right inferences and analyze the data to confirm her hypothesis.”

    I’m not so sure about that. These graphs give no information about why Campaign 2 increased performance over time or why it was the only one with a few exceptional days. If there is some property of Campaign 2 that makes it more likely than the others to have exceptional days or increased performance over time, that may change your conclusions. Also, I see no measures of statistical significance, which makes it impossible to say anything about a hypothesis.

  • http://www.rimmkaufman.com George Michie

    Good discussion!

    This is one of the reasons we believe in the fundamental importance of smart analysts having access to and the ability to manipulate raw data. Canned reports limit the number of ways you can look at data.

    The box charts are really cool, but I agree with Barb that some folks will not grok them. While I love Tufte’s work, I agree with Sid: it’s just impossible for us mortals to reproduce that stuff. Great if you have a team of graphic artists at your disposal, but something easier to produce in Excel or R is the most useful, even if it’s less visually appealing.