I fully agree that appeals to authority should be ignored. My profile has nothing to do with how right or wrong I might be.
Ad hominem arguments, however, should also be ignored. Saying that I'm unqualified doesn't prove anything and adds nothing to the discussion.
If you have specific criticisms of my reasoning, I'm more than happy to listen. If all you have are personal insults, however, well, enjoy the rest of your day.
I naively assumed that ten years of teaching data visualization at NASA, Yale, Visa, the U.N., UofT, etc. would qualify me to write about something like this, but I guess not. Thanks for setting me straight :-)
BTW, what terminology was I using incorrectly?
To clarify, I (article author) am not aware of any scenarios in which a box plot would be a better choice than simpler chart types, even for very sophisticated audiences. From the article:
"Other reviewers suggested that the conclusion [of this article] should be that box plots are a useful chart type, but only for statistically savvy audiences. Again, I’m going a step further, suggesting that even those audiences would be better served by other chart types in virtually all situations."
As the author of the original Nightingale article that kicked off this (wild) thread, maybe I can clarify a few things:
My fundamental concern with box plots is that no one has ever shown me a single scenario in which a given insight was clearer in a box plot than it would be in a simpler chart type (i.e., strip plot, distribution heatmap, or stacked histograms). If someone can show me even a hand-crafted, cherry-picked scenario with the same data shown as a (well-designed) box plot AND a strip plot, distribution heatmap and stacked histograms, and in which a potentially useful insight is clearer in the box plot than in the other chart types, I’ll happily change my opinion. I’m still waiting for someone to show me such a scenario, though.
In the meantime, I’m not sure why one would use box plots when simpler chart types are available that say the same thing about the data or, in many cases, say more about the data (show gaps, multi-modal distributions, etc.). Even if the audience is very used to reading box plots, they’ll still find strip plots, distribution heatmaps and stacked histograms to be simpler to read (and will actually see gaps, clusters, etc.).
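A minimal sketch of the point about gaps and clusters, using made-up numbers rather than anything from the article. It assumes Python's standard `statistics.quantiles` with the "inclusive" method as the quartile definition (plotting tools may use a different one), and the two datasets and both helper names are hypothetical, chosen by hand for illustration.

```python
from collections import Counter
from statistics import quantiles


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), the numbers a box plot draws."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return (min(values), q1, median, q3, max(values))


def value_counts(values):
    """Count occurrences of each distinct value, like a one-unit-bin histogram."""
    return dict(sorted(Counter(values).items()))


# Hypothetical data, hand-picked for illustration only.
evenly_spread = [1, 2, 3, 4, 5, 6, 7, 8, 9]
two_clusters = [1, 3, 3, 3, 5, 7, 7, 7, 9]

for name, data in [("evenly_spread", evenly_spread), ("two_clusters", two_clusters)]:
    print(name, five_number_summary(data))
    # Text histogram: one row per distinct value, one star per observation.
    for value, count in value_counts(data).items():
        print(f"  {value}: {'*' * count}")
```

Under these assumptions both datasets produce the same five-number summary, so their box plots would be drawn identically, while the per-value counts show the clusters at 3 and 7 in the second dataset.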
How do I know that other distribution chart types are simpler to read than box plots? Because I’ve taught these chart types to literally thousands of people of all skill levels all over the world. Quartiles are just inherently less intuitive than bins, and strip plots have no delimiters to understand at all.
Like I said, if someone can show me a scenario like the one that I described above, though, I’ll happily change my mind…
Before people jump all over me, I should clarify what I mean by a “potentially useful insight.” For example, “showing the interquartile range” is not an “insight” in this context; it’s an “observation,” because it doesn’t point to any kind of action or conclusion in and of itself. A potentially useful insight would be something like, “The employee salaries in Company A are generally higher than those in Company B,” or “Most people make close to $80K in Company A, but the salaries are much more spread out in Company B.” Basically, an “insight” in this context is a piece of information that would point directly to some kind of action or conclusion.