I've been thinking about the age-old caution that "correlation does not imply causation." A classic example is the ice cream / shark attack correlation. Whenever ice cream sales rise, so do shark attacks. Not realizing that correlation does not imply causation, one might conclude that ice cream sales cause shark attacks.
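To make the ice cream and shark example concrete, here is a minimal simulation sketch. Everything in it is an assumption made up for illustration: the temperature range, the coefficients, the noise levels, and the names `simulate_summer` and `pearson` are hypothetical and not drawn from any real data. Temperature drives both quantities and neither affects the other, yet the two end up strongly correlated. Restricting to days with nearly the same temperature makes most of that correlation disappear.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate_summer(days=5000, seed=0):
    """Toy model (all numbers invented): temperature is the only cause."""
    rng = random.Random(seed)
    temps, sales, attacks = [], [], []
    for _ in range(days):
        t = rng.uniform(10, 35)                      # daily temperature
        temps.append(t)
        sales.append(20 * t + rng.gauss(0, 50))      # ice cream sales depend only on t
        attacks.append(0.1 * t + rng.gauss(0, 0.5))  # shark attack index depends only on t
    return temps, sales, attacks


temps, sales, attacks = simulate_summer()

# Correlation across all days: strong, even though sales never touch attacks.
r_all = pearson(sales, attacks)

# Hold the common cause roughly fixed by looking at a narrow temperature band.
band = [(s, a) for t, s, a in zip(temps, sales, attacks) if 24 <= t < 25]
r_band = pearson([s for s, _ in band], [a for _, a in band])

print(f"all days:         r = {r_all:.2f}")
print(f"24 to 25 degrees: r = {r_band:.2f}")
```

The second number should come out close to zero, which is what you would expect if the shared cause, rather than either variable, is doing the work.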
What has been bugging me about this is that you can't really prove causation. You can run an experiment a million times and get the same predictable result, but all you have really shown is a strong correlation between your experiment's conditions and its results; there is always a chance that one day you might get a different result. One day, an apple might fall from a tree and stop, suspended in mid-air. Extremely unlikely given everything we have observed about the universe, but still possible.
This is what Einstein meant when he said, "No amount of experimentation can ever prove me right; a single experiment can prove me wrong." It would take an infinite number of predictable experiments (or correlations) to prove that one thing causes another, but it only takes one counterexample to disprove a causal relationship.
Therefore I'm a bit confused when someone says "correlation does not imply causation," because I'd like to know what does. Nothing can prove causation. You can only disprove causation. And when you want to gather support for causation, correlations are all you've got.
The trouble is with the word "imply." To a mathematician, statistician, or logician, to "imply" means "to be a sufficient condition for." Read that way, the phrase really means "correlation is not enough, on its own, to establish causation." In the colloquial sense, "imply" simply means to "suggest." In this sense, correlation very much does imply causation. They are linked. Indeed, you can't have a causation without correlation.
So the next time someone mentions two correlated phenomena to you ("I get sleepy when I eat turkey!"), I would encourage you to think of a counter-example instead of pulling out the old, worn-out correlation/causation warning. If you reach for the warning, the person pointing out the correlation will be offering a lot more support for causation than you will be detracting from it with a somewhat misleading statement about the rules of logic.
A lot of the ideas in this post came from these two articles:
Wikipedia: Correlation does not imply causation
Daily Meh: Correlation implies causation
Comic: xkcd


Good thoughts.
What researchers use as the gold standard to detect causation is the randomized controlled trial. It is not perfect because the randomization does not always work, but it controls for confounders much better than a simple correlation test. I would be hesitant to say that correlation even hints at causation because the vast majority of correlated things have nothing to do with causation.
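A rough sketch of why randomization helps with confounders, under invented assumptions: the treatment below has no real effect at all, the outcome depends only on a hidden "baseline health" variable, and the helper `outcome_gap` and both assignment rules are hypothetical names made up for this illustration. When healthier people choose the treatment themselves, the treated group looks much better off. When a coin flip decides, the gap shrinks to roughly zero.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def outcome_gap(assign, n=20000, seed=1):
    """Mean outcome of the treated group minus the untreated group.

    Toy model: the treatment has NO effect. The outcome depends only on
    a hidden confounder (baseline health) plus noise.
    """
    rng = random.Random(seed)
    treated, untreated = [], []
    for _ in range(n):
        health = rng.gauss(0, 1)            # hidden confounder
        outcome = health + rng.gauss(0, 1)  # treatment does not appear here
        if assign(health, rng):
            treated.append(outcome)
        else:
            untreated.append(outcome)
    return mean(treated) - mean(untreated)


# Observational study: healthier people are more likely to opt in.
observational = outcome_gap(lambda health, rng: health + rng.gauss(0, 1) > 0)

# Randomized trial: a coin flip decides, regardless of health.
randomized = outcome_gap(lambda health, rng: rng.random() < 0.5)

print(f"observational gap: {observational:.2f}")
print(f"randomized gap:    {randomized:.2f}")
```

The observational gap is produced entirely by who ends up in each group, which is the kind of confounding a simple correlation test cannot rule out.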
The way I always heard it in grad school was that numbers can show correlation, but you need logic to prove causation.
Yeah, the Nationals have won every game I've been to! I wonder what would happen if I got season tickets...
The statement:
"They are linked. Indeed, you can't have a causation without correlation"
isn't any variation of "correlation implies causation" (strong or weak). It is, in fact, "causation implies correlation," which is always, always true.
Your weak form, "correlation suggests causation," again isn't true. Correlation suggests a common cause or, to put it in slightly better terms:
Correlation evidences a common cause