Skepticism about science and medicine

In search of disinterested science

Correlations: Plausible or implausible, NONE prove causation

Posted by Henry Bauer on 2014/05/18

My critique of confusing correlations with causes (60 MINUTES on aging — correlations or causes?) brought a number of comments, including a link to a boingboing piece, “Spurious correlations: an engine for head-scratching coincidences”  with an illustration of a correlation that is obviously not a cause-effect relationship:


This came from a website with software that can generate correlations on request from a large database. Many other examples are offered of correlations that are obviously meaningless, for instance:


Far too many people and institutions perpetually fall into the trap of taking correlations as causation. The error is pervasive in statements and publications about medical science and practice from official agencies and from doctors and researchers, and the media perpetually fail to debunk such statements.

So the Spurious Correlations website is a valuable tool
for reminding people never to assume that a correlation has a causal basis.

But I would like to add a couple of comments.

1. Our intuition about what a “correlation” is differs from how numbers like the 0.947091 in the example above are calculated. As the website points out in its “About this page” (whose link is anything but prominent), “there are better ways to calculate correlation than I do here”.
The website uses what is perhaps the most common formula, “a simple Pearson’s correlation coefficient”. That’s also used in the Microsoft Excel CORREL formula. I first realized how different the result of that can be from an intuitive sense of correlation when an article claimed a geographical correlation between HIV and AIDS for which the actual data seemed to me to show “obviously” no correlation (pp. 110-2 in The Origin, Persistence and Failings of HIV/AIDS Theory).
An informal tutorial from my friend Jack Good set me straight. Everyone should beware that what might seem like quantitatively very good correlations, with numbers like 0.75 or more, may not signify what our intuition says about how good or bad the correlation is.
(And everyone should beware of the accuracy implied by numbers like 0.947091. All too many publications show such numbers, copied from a computer, that imply an accuracy to 6 significant digits, 1 part in a million. Rarely are more than two figures warranted, in this case 0.95.)
2. The most important caveat, though, is that the Spurious Correlations website features correlations that are obviously absurd and do not reflect any causal relationship. In the real world, however, considerable real damage is done all the time because

correlations that look plausibly reflective of a causal relation
are mistakenly taken to reflect actual causation

That happens pervasively in medicine. Correlations between blood pressure and heart attacks, for example, led to designating blood pressure as a “risk factor” for heart attacks. That label was then mistakenly interpreted to mean that high blood pressure is an actual cause of heart attacks, and medication is used to lower blood pressure even though, in actual fact, there is no evidence that high blood pressure causes heart attacks (or strokes); see “Evidence-based medicine? Wishful thinking” and “Seeking Immortality? Challenging the drug-based medical paradigm”.
All sorts of shibboleths about HIV/AIDS are treated as fact by media and public just because they seem plausible for a sexually transmitted infection, yet the evidence is plain that neither “HIV” nor “AIDS” is infectious and that they are not correlated with one another. (Correlation never proves causation; but lack of correlation is strong evidence against causation and places a heavy burden of proof on anyone claiming causation.)
So too with “global warming”. Given all the doubts about human-caused global warming, for instance that carbon dioxide in the atmosphere has continued to increase dramatically during the last ~15 years without discernible increase in temperature, global-warming and environmentalist activists have succeeded in making the dogma one of (unfalsifiable) “climate change” instead of warming. Pundits galore hold forth about how “climate change” has brought more extreme events, and increasingly extreme ones. That seems so plausible, until you realize that it is mere speculation and not a reflection of known historical events, and that one could just as plausibly speculate that, as temperature rises, ocean and air currents become stronger and will tend to even things out and decrease the likelihood of extreme events.
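
As a rough illustration of the first comment above, here is a minimal sketch of how a simple Pearson coefficient can come out high for two series that merely share an upward trend. The data are made up for this example, the function names pearson_r and first_differences are hypothetical, and comparing step-to-step changes is only one possible sanity check; none of this is taken from the Spurious Correlations website or from Excel's CORREL.

```python
import math


def pearson_r(xs, ys):
    """Pearson's product-moment correlation for two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def first_differences(values):
    """Change from each point to the next (removes a shared steady trend)."""
    return [b - a for a, b in zip(values, values[1:])]


# Made-up numbers for illustration only: both series drift upward over time.
series_a = [10, 12, 11, 14, 13, 16, 15, 18, 17, 20]
series_b = [5.0, 5.4, 6.1, 6.0, 6.9, 7.1, 7.8, 8.0, 8.9, 9.0]

r_levels = pearson_r(series_a, series_b)
r_changes = pearson_r(first_differences(series_a), first_differences(series_b))

# Report two significant figures rather than everything the computer prints.
print("levels: ", format(r_levels, ".2g"))
print("changes:", format(r_changes, ".2g"))
print(format(0.947091, ".2g"))  # prints 0.95
```

With these invented numbers the coefficient for the raw series is high and positive, while the coefficient for the step-to-step changes is negative: the shared trend alone drives the first figure. That is one way in which a number like 0.9 can say something quite different from what intuition suggests.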

Correlations never prove causation
and that needs to be emphasized over and over again,
all the more so the more plausibly causative a given correlation appears to be.


6 Responses to “Correlations: Plausible or implausible, NONE prove causation”

  1. Mark said

    Hmmm… I’m usually cautious about using a word like “NEVER.” In fact, I think that I read, somewhere, that some philosophers believe that the only thing that can ever be proven is correlation. According to them, causality cannot be proven in the absolute sense, at least. Well, whatever the case may be with those philosophers, I understand the caution with assuming causality from correlation, but I worry that you might be going too far by saying NEVER.


    • Henry Bauer said

      Correlation never PROVES causation is the point. I didn’t say that there may never be a cause associated with a correlation. The presence of a correlation may well be a reason to look for a possible causative relationship, but correlation in itself is no proof of a causative relationship, EVER!


      • Loránd-Levente Pálfi said

        Very interesting information. And actually very funny. Indeed. Thanks a lot, Mr. Bauer (or: whatever the right, more polite, more suitable, more academic expression would be, if one would speak proper English, which I unfortunately don’t, yet). I found the above shown diagrams (or whatever they are called in English; I’m referring to those website excerpts) hilarious, I could have watched 50 more of them. I would have lied flat on the ground with tears of joy. Is there a whole book about this topic? I mean: What I find here is a short contribution (I’m not even sure, if one could call it an article), is there/are there someone or more people who have dealt with this, exactly this and possibly only this topic in a whole book? Obviously I wouldn’t mind, if in such a book also related problems and related topics would be discussed. What I’m trying to avoid is a book, that has one chapter about this, but otherwise deals with completely other things such as statistics on high level which I probably wouldn’t understand. Oh: It should be a book, one could understand not being a statistician, not being a mathematician.


      • Henry Bauer said

        L-L P:
        The author of the Spurious Correlations website has also published a book about spurious correlations.


      • Loránd-Levente Pálfi said

        Thank you. I will probably fetch that book on Amazon once I am done reading “Dogmatism in Science and Medicine” (already half-way through!). This reminds me, in “Science or Pseudoscience”, which I have read about one year ago or maybe more, at one place you quote from a book about statistics; it sounded like a very interesting book written in a style approachable by non-statisticians, it was a book of the sort “a statistician telling the world how many problems there are in statistics and why the world shouldn’t trust statisticians”. I tried to find the title in the bibliography of “Science or Pseudoscience”, but couldn’t and cannot. Don’t remember the title, and from what I see when browsing the bibliography nothing falls in my eyes. Can you remember what’s the title and who’s the author? Cause that book is definitely on my wish list too, if only I know the bibliographic details.


      • Henry Bauer said

        L-L P:
        It was Huff, Darrell. 1954. How to Lie with Statistics. New York: W. W. Norton.

        Elsewhere I’ve cited
        Best, J. 2001. Damned Lies and Statistics: Untangling Numbers from the Media, Politicians, and Activists. Berkeley (CA): University of California Press.
        Best, J. 2004. More Damned Lies and Statistics: How Numbers Confuse Public Issues. Berkeley (CA): University of California Press.

