The point I was trying to make (and will elaborate here) is that the usual mantra, "Correlation does not imply causation," is true only in a trivial sense, so we need to think about it more carefully. And as regular readers might expect, I'll take a Bayesian approach.

It is true that correlation doesn't imply causation in the mathematical sense of "imply;" that is, finding a correlation between A and B does not prove that A causes B. However, it does provide evidence that A causes B. It also provides evidence that B causes A, and if there is a hypothetical C that might cause A and B, the correlation is evidence for that hypothesis, too.

In Bayesian terms, a dataset, D, is evidence for a hypothesis, H, if the probability of H is higher after seeing D. That is, if P(H|D) > P(H).

For any two variables, A and B, we should consider four hypotheses:

A: A causes B

B: B causes A

C: C causes A and B

N: there are no causal relationships among A, B, and C

And there might be multiple versions of C, for different hypothetical factors. If I have no prior evidence of any causal relationships among these variables, I would assign a high probability (in the sense of a subjective degree of belief) to the null hypothesis, N, and low probabilities to the others. If I have background information that makes A, B, or C more plausible, I might assign prior probabilities accordingly. Otherwise I would assign them equal priors.

Now suppose I find a correlation between A and B, with p-value=0.01. I would compute the likelihood of this result under each hypothesis:

L(D|A) ≈ 1: If A causes B, the chance of finding a correlation is probably high, depending on the noisiness of the relationship and the size of the dataset.

L(D|B) ≈ 1, for the same reason.

L(D|C) ≈ 1, or possibly a bit lower than the previous likelihoods, because any noise in the two causal relationships would be additive.

L(D|N) = 0.01. The probability of seeing a correlation with the observed strength, or more, under the null hypothesis, is the computed p-value, 0.01.

When we multiply the prior probabilities by the likelihoods, the probability assigned to N drops by a factor of 100; the other probabilities are almost unchanged. When we renormalize, the other probabilities go up.

In other words, the update takes most of the probability mass away from N and redistributes it to the other hypotheses. The result of the redistribution depends on the priors, but for all of the alternative hypotheses, the posterior is greater than the prior. That is

P(A|D) > P(A)

P(B|D) > P(B)

P(C|D) > P(C)

Thus, the correlation is evidence in favor of A, B and C. In this example, the Bayes factor for all three is about 100:1, maybe a bit lower for C. So the correlation alone does not discriminate much, if at all, between the alternative hypotheses.
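The update described above can be sketched numerically. In the following Python sketch, the priors and likelihoods are illustrative assumptions (a high prior on N, the likelihoods from the discussion above), not computed values:

```python
# Bayesian update over the four hypotheses.
# Priors and likelihoods are illustrative assumptions, not computed values.

hypotheses = ["A causes B", "B causes A", "C causes both", "no causal relation"]
priors = [0.1, 0.1, 0.1, 0.7]        # high prior on the null hypothesis, N
likelihoods = [1.0, 1.0, 0.9, 0.01]  # P(D|H): ~1 for A, B, C; the p-value for N

# Multiply priors by likelihoods, then renormalize.
unnormalized = [p * lk for p, lk in zip(priors, likelihoods)]
total = sum(unnormalized)
posteriors = [u / total for u in unnormalized]

for h, prior, post in zip(hypotheses, priors, posteriors):
    print(f"{h:20s} prior={prior:.2f} posterior={post:.3f}")
```

Running this shows the pattern in the text: the posterior for N collapses to a few percent, while the posteriors for A, B, and C all rise above their priors.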

If there is a good reason to think that A is more plausible than B and C, that would be reflected in the priors. In that case the posterior probability might be substantially higher for A than for B and C.

And if the resulting posterior, P(A|D), were sufficiently high, I would be willing to say that the observed correlation implies causation, with the qualification that I am using "imply" in the sense of strong empirical evidence, not a mathematical proof.

People who have internalized the mantra that correlation does not imply causation might be surprised by my casual (not causal) blasphemy. But I am not alone. This article from Slate makes a similar point, but without the Bayesian mumbo-jumbo.

And the Wikipedia page on "Correlation does not imply causation" includes this discussion of correlation as scientific evidence:

Much of scientific evidence is based upon a correlation of variables – they are observed to occur together. Scientists are careful to point out that correlation does not necessarily mean causation. The assumption that A causes B simply because A correlates with B is often not accepted as a legitimate form of argument. However, sometimes people commit the opposite fallacy – dismissing correlation entirely, as if it does not suggest causation. This would dismiss a large swath of important scientific evidence.

In conclusion, correlation is a valuable type of scientific evidence in fields such as medicine, psychology, and sociology. But first correlations must be confirmed as real, and then every possible causative relationship must be systematically explored. In the end correlation can be used as powerful evidence for a cause-and-effect relationship between a treatment and benefit, a risk factor and a disease, or a social or economic factor and various outcomes. But it is also one of the most abused types of evidence, because it is easy and even tempting to come to premature conclusions based upon the preliminary appearance of a correlation.

I think this is a reasonable conclusion, and hopefully not too shocking to my colleagues in the back of the room.

UPDATE February 21, 2014: There is a varied and lively discussion of this article on reddit/r/statistics.

One of the objections raised there is that I treat the hypotheses A, B, C, and N as mutually exclusive, when in fact they are not. For example, it's possible that A causes B *and* B causes A. This is a valid objection, but we can address it by adding additional hypotheses for A&B, B&C, A&C, etc. The rest of my argument still holds. Finding a correlation between A and B is evidence for all of these hypotheses, and evidence against N.
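The expanded hypothesis space can be enumerated mechanically: each subset of the possible causal links is a distinct hypothesis, with the empty subset playing the role of N. A small sketch (the link labels are just names I've made up for illustration):

```python
# Enumerate the non-exclusive hypothesis space: every subset of the
# three causal links is one hypothesis; the empty subset is N.
from itertools import combinations

links = ["A->B", "B->A", "C->A,B"]
hypotheses = [combo for r in range(len(links) + 1)
              for combo in combinations(links, r)]

print(len(hypotheses))  # 8 hypotheses, including the empty set (N)
```

The update then proceeds exactly as before, just over eight hypotheses instead of four: the correlation is evidence for every hypothesis that contains at least one link, and evidence against the empty one.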

One of my anonymous correspondents on reddit added this comment, which gives examples where correlation alone might be used, in the absence of better evidence, to guide practical decisions:

This [meaning my article] is not too different from the standard view in medicine, though usually phrased in more of a discrete "levels of evidence" sense than a Bayesian sense. While direct causal evidence is the gold standard in medicine, correlational studies are still taken as providing some evidence that is sometimes worth acting on, in the absence of better evidence. For example, correlations with negative health outcomes are sometimes taken as reasons to issue recommendations to avoid certain behaviors/drugs/foods (pending further data), and unexpected correlations are often taken as good justification for funding further studies into a relationship.

In general, one of the nice things about Bayesian analysis is that it provides useful inputs for decision analysis, especially when we have to make decisions in the absence of conclusive evidence.

When someone in an argument unthinkingly repeats the mantra, "Correlation does not imply causation," I like to reply, "Maybe not, but it correlates with causation." That's usually a conversation stopper.

That's good. Thanks!

I use that too.

Nicely put. Of course, if correlation was really no evidence for causation, then why on Earth would we ever bother measuring it, and how on Earth would we ever find out anything about causation?

(Was it Fisher who originated this mantra? I believe it was also Fisher who argued that there was no compelling evidence for a causal link between smoking and lung cancer.)

What's interesting, beyond this, is that by measuring some third variable, beyond A and B, it's often possible to draw inferences about the type of causal hierarchy that A and B sit in, as my recent post on Berkson's paradox argues.

Allen, on that topic, is there a book on graph theory in the pipeline for your ever-growing (and much admired) 'Think...' series?

Hi Tom. I couldn't find an original source for the mantra, but yes, Fisher worked as a consultant for tobacco companies and used the "correlation does not imply causation" argument to deny that smoking causes lung cancer.

Thanks for the link to your post. I will check it out.

And yes, there is a little graph theory in the Think X series. Think Complexity has chapters on small world graphs and scale free networks, and presents basic graph algorithms like BFS and Dijkstra's algorithm.

Fisher, I believe, still argued for a causal connection, but the opposite one: lung cancer caused smoking. Apparently smoking can provide some relief from the symptoms of lung cancer, so those with lung cancer are more likely to be smokers. And I think he also argued a bit for a genetic connection, so C causing A. I believe this is discussed in The Theory That Would Not Die.

http://oyhus.no/CorrelationAndCausation.html contains the simplest proof of the matter, I think. http://oyhus.no/AbsenceOfEvidence.html also contains a great and simple proof for the phrase "absence of evidence is evidence of absence" that also tends to trip up people whose critical thinking exposure only took them through deductive logic.

Thanks for both of those links -- they look good. Maybe I will write about "absence of evidence" some time. When my appetite for arguing with people on the Internet returns :)

Greetings, Professor. Thank you for your kind responses on a different thread. I couldn't help but comment on this one, too.

I am mostly interested in statistics, coming at it from the point of view of investing in the stock market. There are many factors that an investor could choose from when selecting a share to buy, and of course one must try to winnow these factors out.

Although "Efficient Market" theory has been around for some time, there has been emerging interest in the psychology of people's investing decisions. The dominant faction is what we might call the "behaviouralists". I call these the "glass half empty" guys, as their central thesis is that human beings are inherently irrational. So they are subject to such things as "hindsight bias", "confirmation bias", "base rate fallacies", and many, many more. Particularly relevant here, though, are "conservatism" (they underweigh new sample evidence when compared to Bayesian belief-revision) and conflating correlation with causation. So, if you believe this school of thought, then human beings are big bags of irrationality.

On the other hand, the psychologist Gigerenzer has studied the use of bounded rationality and heuristics in decision making. His work seems almost diametrically opposed to the behaviouralists'. He is a "glass half full" guy, and he makes a good demonstration of how humans are actually capable of making good decisions under uncertainty. He showed that under some circumstances, simple heuristics can beat statistical methods, often because the latter tend to over-fit to training data.

Your post neatly highlights the contrast between the two camps, and demonstrates how an apparent irrationality can actually be rational.

In a way, it's quite remarkable when you think about it: Mother Nature has tried to endow humans with good survival decision-making skills despite the lack of carefully tabulated statistical tables. I wonder just how much of irrational human behaviour will later turn out to be a good adaptive fit to our environment.

I hope this has been interesting.

Very interesting. Thanks for this comment!
