Correlation Matters Regardless of Causation

If you ever post a chart on the internet, people will show up to tell you that “correlation does not equal causation.”

Much in the way that people like to tell you night follows day or that when it rains things get wet. It’s true, and about equally useful.

The phrase has become a way to simply dismiss data as irrelevant.

But cor­re­la­tion mat­ters. It mat­ters a lot. Because it points you in a direc­tion. Gives you some­where to look for addi­tional data.

If there’s a correlation there may be a causation. Or there may be a third-party causation. Or it could be genuine coincidence. But even that tells you something.
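
To make the "third party" case concrete, here is a minimal Python sketch. Everything in it is made up for illustration: the hidden factor `third`, the two measured series `a` and `b`, the noise levels, and the helper names `pearson` and `residuals` are assumptions, not anything taken from a real dataset. Neither `a` nor `b` influences the other, yet they correlate because both depend on `third`. Once the part explained by `third` is subtracted out, the correlation should mostly disappear.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def residuals(ys, zs):
    """What is left of ys after removing a straight-line fit on zs."""
    n = len(ys)
    mean_y = sum(ys) / n
    mean_z = sum(zs) / n
    slope = (
        sum((z - mean_z) * (y - mean_y) for z, y in zip(zs, ys))
        / sum((z - mean_z) ** 2 for z in zs)
    )
    return [y - mean_y - slope * (z - mean_z) for z, y in zip(zs, ys)]


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical hidden factor that drives both measurements.
third = [rng.gauss(0, 1) for _ in range(5000)]

# a and b never influence each other; each is the hidden factor plus its own noise.
a = [t + rng.gauss(0, 1) for t in third]
b = [t + rng.gauss(0, 1) for t in third]

raw_r = pearson(a, b)
adjusted_r = pearson(residuals(a, third), residuals(b, third))

print(f"correlation of a and b:             {raw_r:.2f}")
print(f"after accounting for third factor:  {adjusted_r:.2f}")
```

With these settings the raw correlation should land somewhere near 0.5 and the adjusted one close to zero. That gap is the kind of thing an "additional factor" suggestion can expose, which a bare "correlation is not causation" cannot.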

So stop prat­tling about cau­sa­tion. You’re not bring­ing any­thing to the party we haven’t heard a mil­lion times before. If you want to offer feed­back… sug­gest an addi­tional factor.


32 thoughts on “Correlation Matters Regardless of Causation”

  1. January 5, 2013 at 12:53

    Cor­re­la­tion isn’t cau­sa­tion… except when it is.

    That’s what we call a true-ism.

  2. January 5, 2013 at 12:55

    Good point: correlation is probably the largest indicator of causation, because the presence of a causal relationship is always accompanied by a coincident correlation of some sort.

  3. January 5, 2013 at 12:57

    Correlation is only important when there is a testable connection between the data sets, and sometimes even when there is a causal link people draw the link in the wrong direction.

    Cor­re­la­tion is, in other words, only a start and is use­less with­out other data to sup­port what­ever con­clu­sion is drawn.

  4. January 5, 2013 at 12:59

    It’s hard to say it’s useless. If you want to efficiently apply resources to determining the cause of something, one would think you’d first search for correlated events and then apply your experimental resources to seeking a mechanism of causation.

  5. January 5, 2013 at 13:02

    I don’t think it’s that clear cut. There was an old saying… Studies showed that men who stayed married 30 or more years tended to be bald. So does correlation equal causation? If so, then marriage causes baldness… (of course, right?) But the study finally showed no causation between the two factors. Men who were married 30 or more years were an older population, and baldness showed a relation to the age and not the marriage itself. The two factors had a common thread… years… but one did not cause the other.

  6. January 5, 2013 at 13:03

    Use­less with­out other data. With­out other data you may draw a con­clu­sion in the wrong direc­tion, or choose an appeal­ing look­ing but acausal cor­re­la­tion. I can find sim­i­lar look­ing trends in totally unre­lated areas (length of best­selling nov­els vs. drilling rig count in the Con­ti­nen­tal US perhaps).

  7. January 5, 2013 at 13:05

    But the point is inves­ti­ga­tion even­tu­ally leads to the answer +Bar­bie Black.

  8. January 5, 2013 at 13:06

    And what scenario are you envisaging where there either wouldn’t be other data or it couldn’t be obtained +Daniel Taylor?

    The whole point of my post is to stop using that phrase as an argu­ment to just ignore the data completely.

  9. January 5, 2013 at 13:09

    There is a strong ten­dency in cer­tain cir­cles (mainly polit­i­cal, all tribes of) to show a cor­re­la­tion as if it were the proof of cau­sa­tion itself, which is where the “cor­re­la­tion is not cau­sa­tion” argu­ment is most often used.

    If the per­son mak­ing the claim then comes up with sup­port­ing data show­ing the exis­tence and direc­tion of cau­sa­tion and it is ignored, they have a legit­i­mate gripe.

    Too often this is not the case how­ever, and the pretty graph is left to stand alone while peo­ple try to defend it purely on the basis that it shows a correlation.

  10. January 5, 2013 at 13:10

    And by say­ing “cor­re­la­tion is not cau­sa­tion” you are not in any way advanc­ing the debate +Daniel Tay­lor. Thus my sug­ges­tion that peo­ple instead actu­ally sug­gest an alter­nate factor.

  11. January 5, 2013 at 13:12

    It doesn’t lead to the answer, it can create a false conclusion. There was a very interesting study about crime reduction and abortion done some years ago. It highlighted why the police were wrong in their assessment of enforcement. Crime was falling and police said it was due to their increased presence and crackdown: less crime was because of more police. But an interesting (and controversial) study showed that the two were not related at all. It showed that crime fell at the exact 20-year mark of legalized abortion. People were having fewer unwanted children, fewer babies they could not support, and fewer teenage pregnancies, and the crime reduction held across the board in the decades that followed, regardless of the police presence. In this instance, regardless of what you think about abortion, it showed the complete uselessness of the relationship between additional cops and crime reduction. That is the exact opposite of what you propose, that a correlation sheds light on a relationship.

  12. January 5, 2013 at 13:13

    +Bar­bie Black, I read about that in Freako­nom­ics.

  13. January 5, 2013 at 13:14

    I’m actu­ally watch­ing this idiocy going on right now in one of +Mike Elgan’s threads. Despite Elgan even stat­ing in the post that it’s a cor­re­la­tion and ask­ing if there’s a cau­sa­tion… genius after genius feels the need to post that mean­ing­less phrase.

  14. January 5, 2013 at 13:15

    +Bar­bie Black Read what I posted again. I cov­ered all those possibilities.

  15. January 5, 2013 at 13:17

    But to say correlation points in a direction is the problem I’m having with your statement. It does not always point to a relationship. Sometimes it does the exact opposite, it creates a false relationship with no actual connecting factors.

  16. January 5, 2013 at 13:18

    Then again… you’re not read­ing what I wrote +Bar­bie Black 

    If there’s a correlation there may be a causation. Or there may be a third-party causation. Or it could be genuine coincidence. But even that tells you something.

    There is value in the cor­re­la­tion even if it turns out there isn’t a causation.

  17. January 5, 2013 at 13:18

    Well. If you overlay parts of two graphs that seemingly correlate and construct a causation from only that, calling that out is imho totally reasonable.

    Much of such “data” brought up in dis­cus­sions on the inter­net is right­fully dis­missed as a way to back up an argu­ment with­out any real evidence.

    Sure. If you say: “This suggests there might be a causal relationship, we should look into that.” that’s fine. That’s using the correlation as a first, very helpful step as it should be. Going beyond that and letting it stand on its own? Not so good.

  18. January 5, 2013 at 13:21

    Again +Ste­fan Hacker peo­ple are say­ing just that and they’re still get­ting that stu­pid phrase thrown at them.

    And even if they aren’t… the correct response is to show how the claimed causation is flawed.

    The phrase is useless.

  19. January 5, 2013 at 13:23

    Sometimes the phrase is useless, and sometimes the data really is irrelevant and the person using the phrase isn’t competent or inclined to education.

  20. January 5, 2013 at 13:23

    I think +Eoghann Irv­ing has a good point here.

    Before it was tested, the observation that contracting cowpox seemed to prevent smallpox infection was simply an old wives’ tale. Jenner made an assumption after a bit of interviewing and jumped right into experimentation, but if you consider for a second that he might not have done so, correlation becomes very important.

    If Jenner had taken his work interviewing milkmaids and created a real statistical study, he’d have noted a marked difference in the rate of smallpox mortality for people exposed to cattle, particularly those with cowpox. Cowpox exposure and smallpox mortality were clearly dependent variables, implying some level of correlation. Without real germ theory, this was not obviously causal. Even with successful experiments to use weakened cowpox material to inoculate against smallpox, there was little understanding of the causal relationship. It was still just strong correlation until solid germ theory came about.

    If it had been soundly dis­missed as “merely cor­re­la­tion”, Jen­ner and oth­ers may not have risked poten­tially deadly exper­i­ments and uti­lized their lim­ited resources to study the idea… and many more peo­ple would have died.

  21. January 5, 2013 at 13:24

    Well said +Ste­fan Hacker

  22. January 5, 2013 at 13:24

    Fine. Twinkies are becoming increasingly hard to find, at the same time there has been an increase in gun sales. What does this correlating downturn/upswing tell me? That the fewer Twinkies we have the more guns we wish to own? (I know I sound like a smartass, but I’m trying to make a point)

  23. January 5, 2013 at 13:29

    Those things are prob­a­bly not strongly cor­re­lated though, because length­en­ing the scope of a study on avail­abil­ity would show that the two vari­ables aren’t overly depen­dent on one another.  (Probably)

    The fact that Twinkies are becom­ing scarce now while gun sales have spiked over the last five years may not hold true at all if we look at the data over 10 or 15 years.  It might be a US only phe­nom­e­non too, and widen­ing our Gun Sales or Twinkie Avail­abil­ity num­bers geo­graph­i­cally might show the rela­tion­ship to be a geo­graphic artifact.

  24. January 5, 2013 at 13:31

    Fur­ther study required. Exactly +Gabriel Cooper. The orig­i­nal study is not worth­less. It’s the start­ing point.

  25. January 5, 2013 at 13:32

    In the sec­ond case +Daniel Tay­lor the phrase is still use­less because the per­son shouldn’t have wasted everyone’s time com­ment­ing in the first place. They added nothing.

  26. January 5, 2013 at 13:32

    +Eoghann Irving The phrase is useless if the one you are telling it to has already internalized its message. Otherwise it’s very helpful for concisely stating a very important message everyone should know.

    Imho bothering to refute every shady-looking graph on the internet with a well-researched essay on why current research doesn’t support the presented data is a fool’s errand.

    Imho it’s the poster’s job to provide such evidence if correct critique, like the sentence you dislike so much, is brought up.

    Sure. If you have the data in hand or a real stake in the dis­cus­sion you can take the time. But said sen­tence, when appro­pri­ate, will be part of a proper response.

  27. January 5, 2013 at 14:05

    So what of data/numbers? They’re really only useful for experts. There are so many ways to frame and steer them that they really become useless outside of those who can truly dissect and understand them.

  28. January 5, 2013 at 14:29

    I don’t agree +Ste­fan Hacker. It’s a phrase that doesn’t change anyone’s mind because it doesn’t give them a rea­son to change their mind.

  29. January 5, 2013 at 14:46

    Then we have to agree to disagree ;) I think it gives a point to ponder that can change their mind. Even if it doesn’t: while I love to make people see my point in internet discussions, a lot of times it just won’t happen. In such cases I’m content with showing that not everyone trusts/agrees with what is posted, so other users approach it with more caution. Bonus points if I can do it with a to-the-point objection.

  30. January 5, 2013 at 14:49

    Then why not say some­thing mean­ing­ful… like point­ing out the actual issue with the data pre­sented +Ste­fan Hacker?

    You might just as well be say­ing… fail.

  31. January 5, 2013 at 15:34

    If the graph simply shows a general trend for two things to increase or decrease over time, then that is too weak to present as ‘support’ and I’d have no problem throwing that line out. It might not add anything new, but nor does the initial graph, as there’s too much that follows that pattern. With a time-based correlation I want to see at least one corresponding trough or peak that might point to which way causation may flow or what other factors or events may contribute.
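
The point about shared trends can be sketched in a few lines of Python. This is an illustration with invented numbers, not the commenter's method: `series_a` and `series_b` are hypothetical, independent series that both happen to rise over time, and comparing period-to-period changes (first differences) is a swapped-in, simple way to take the trend out, not the peak-and-trough check described above.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def first_differences(xs):
    """Change from each period to the next."""
    return [later - earlier for earlier, later in zip(xs, xs[1:])]


rng = random.Random(1)  # fixed seed so the run is repeatable
periods = 2000

# Two made-up series that share nothing except an upward drift over time.
series_a = [0.5 * t + rng.gauss(0, 5) for t in range(periods)]
series_b = [0.3 * t + rng.gauss(0, 5) for t in range(periods)]

level_r = pearson(series_a, series_b)
change_r = pearson(first_differences(series_a), first_differences(series_b))

print(f"correlation of the raw levels:       {level_r:.2f}")
print(f"correlation of the period changes:   {change_r:.2f}")
```

The raw levels should correlate very strongly simply because both lines go up, while the period-to-period changes should show almost no relationship. A graph that only survives the first comparison is showing time passing, not much else.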

  32. January 6, 2013 at 01:08

    I don’t know the post or sub­ject that inspired this and you might be right that the state­ment gets overused.  How­ever, when a con­clu­sion is drawn based on flawed logic then it must be said.
