If you ever post a chart on the internet, people will show up to tell you that correlation does not equal causation.
Much in the way that people like to tell you night follows day or that when it rains things get wet. It’s true, and about equally useful.
Because it’s become a way to simply dismiss data as irrelevant.
But correlation matters. It matters a lot. Because it points you in a direction. Gives you somewhere to look for additional data.
If there’s a correlation there may be a causation. Or there may be a third party causation. Or, it could be genuine coincidence. But even that tells you something.
So stop prattling about causation. You’re not bringing anything to the party we haven’t heard a million times before. If you want to offer feedback… suggest an additional factor.
Correlation isn’t causation… except when it is.
That’s what we call a truism.
Good point: correlation is probably the largest indicator of causation because the presence of a causal relationship is always accompanied by a coincident correlation of some sort.
Correlation is only important when there is a testable connection between the data sets, and sometimes even when there is a causal link people draw the link the wrong direction.
Correlation is, in other words, only a start and is useless without other data to support whatever conclusion is drawn.
It’s hard to say it’s useless. If you want to apply resources efficiently to determining the cause of something, one would think you’d first search for correlated events and then spend your experimental resources looking for a mechanism of causation.
I don’t think it’s that clear cut. There was an old saying… Studies showed that men who stayed married 30 or more years tended to be bald. So does correlation equal causation? If so, then marriage causes baldness… (of course, right?) But the study finally showed no causation between the two factors. Men who were married 30 or more years were an older population, and baldness was related to age and not to the marriage itself. The two factors had a common thread… Years… But one did not cause the other.
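To make the marriage and baldness example concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the sample size, the age range, and the formulas tying years married and baldness to age are assumptions, not figures from any study. Age drives both variables and neither affects the other, yet they correlate across the whole population. Within a narrow age band the correlation mostly goes away.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population: age is the only real driver of both variables.
ages = [rng.randint(30, 80) for _ in range(20000)]
# Assumption: years married grows with age, plus some noise.
years_married = [max(0.0, a - 25 + rng.gauss(0, 5)) for a in ages]
# Assumption: chance of baldness grows with age and ignores marriage entirely.
bald = [1 if rng.random() < (a - 20) / 80 else 0 for a in ages]

# Correlation across everyone: marriage length "predicts" baldness.
overall = pearson(years_married, bald)

# Hold age roughly constant by looking only at 50 to 52 year olds.
band = [(m, b) for a, m, b in zip(ages, years_married, bald) if 50 <= a <= 52]
within = pearson([m for m, _ in band], [b for _, b in band])

print("all ages:", round(overall, 3))
print("ages 50-52 only:", round(within, 3))
```

Comparing within a narrow age band is a crude way of controlling for the third factor. It is also one concrete form of the "suggest an additional factor" response the post asks for.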
Useless without other data. Without other data you may draw a conclusion in the wrong direction, or choose an appealing looking but acausal correlation. I can find similar looking trends in totally unrelated areas (length of bestselling novels vs. drilling rig count in the Continental US perhaps).
But the point is investigation eventually leads to the answer +Barbie Black.
And what scenario are you envisaging where there either wouldn’t be other data or it couldn’t be obtained +Daniel Taylor?
The whole point of my post is to stop using that phrase as an argument to just ignore the data completely.
There is a strong tendency in certain circles (mainly political, all tribes of) to show a correlation as if it were the proof of causation itself, which is where the “correlation is not causation” argument is most often used.
If the person making the claim then comes up with supporting data showing the existence and direction of causation and it is ignored, they have a legitimate gripe.
Too often this is not the case however, and the pretty graph is left to stand alone while people try to defend it purely on the basis that it shows a correlation.
And by saying “correlation is not causation” you are not in any way advancing the debate +Daniel Taylor. Thus my suggestion that people instead actually suggest an alternate factor.
It doesn’t lead to the answer; it can create a false conclusion. There was a very interesting study about crime reduction and abortion done some years ago. It highlighted why the police were wrong in their assessment of enforcement. Crime was falling and police said it was due to their increased presence and crackdown: less crime because of more police… But an interesting (and controversial) study showed that the two were not related at all. It showed that crime fell at the 20-year mark after abortion was legalized. People were having fewer unwanted children, fewer babies they could not support, fewer teenage pregnancies, and the crime reduction held across the board in the decades that followed, regardless of the police presence. In this instance… Regardless of what you think about abortion, it showed the complete uselessness of the relationship between additional cops and crime reduction. That is the exact opposite of what you propose, that a correlation sheds light on a causal relationship…
+Barbie Black, I read about that in Freakonomics.
I’m actually watching this idiocy going on right now in one of +Mike Elgan’s threads. Despite Elgan even stating in the post that it’s a correlation and asking if there’s a causation… genius after genius feels the need to post that meaningless phrase.
+Barbie Black Read what I posted again. I covered all those possibilities.
But to say correlation points in a direction is the problem I’m having with your statement. It does not always point to a relationship. Sometimes it does the exact opposite: it creates a false relationship with no actual connecting factors.
Then again… you’re not reading what I wrote +Barbie Black
If there’s a correlation there may be a causation. Or there may be a third party causation. Or, it could be genuine coincidence. But even that tells you something.
There is value in the correlation even if it turns out there isn’t a causation.
Well. If you overlay parts of two graphs that seemingly correlate and construct a causation from only that, calling that out is imho totally reasonable.
Much of such “data” brought up in discussions on the internet is rightfully dismissed as a way to back up an argument without any real evidence.
Sure. If you say: “This suggests there might be a causal relationship, we should look into that,” that’s fine. That’s using the correlation as a first, very helpful step, as it should be. Going beyond that and letting it stand on its own? Not so good.
Again +Stefan Hacker people are saying just that and they’re still getting that stupid phrase thrown at them.
And even if they aren’t… the correct response is to show how the claimed causation is flawed.
The phrase is useless.
Sometimes the phrase is useless, and sometimes the data really is irrelevant and the person using the phrase isn’t competent or inclined to education.
I think +Eoghann Irving has a good point here.
Before it was tested, the observation that contracting cowpox seemed to prevent smallpox infection was simply an old wives’ tale. Jenner made an assumption after a bit of interviewing and jumped right into experimentation, but if you consider for a second that he might not have done so, correlation becomes very important.
If Jenner had taken his work interviewing milkmaids and created a real statistical study, he’d have noted a marked difference in the rate of smallpox mortality for people exposed to cattle, particularly those with cowpox. Cowpox exposure and smallpox mortality were clearly dependent variables, implying some level of correlation. Without real germ theory, this was not obviously causal. Even with successful experiments to use weakened cowpox material to inoculate against smallpox, there was little understanding of the causal relationship. It was still just strong correlation until solid germ theory came about.
If it had been soundly dismissed as “merely correlation”, Jenner and others may not have risked potentially deadly experiments and utilized their limited resources to study the idea… and many more people would have died.
Well said +Stefan Hacker
Fine. Twinkies are becoming increasingly hard to find, and at the same time there has been an increase in gun sales. What does this correlating downturn/upswing tell me? That the fewer Twinkies we have the more guns we wish to own? (I know I sound like a smartass, but I’m trying to make a point)
Those things are probably not strongly correlated though, because lengthening the scope of a study on availability would show that the two variables aren’t overly dependent on one another. (Probably)
The fact that Twinkies are becoming scarce now while gun sales have spiked over the last five years may not hold true at all if we look at the data over 10 or 15 years. It might be a US only phenomenon too, and widening our Gun Sales or Twinkie Availability numbers geographically might show the relationship to be a geographic artifact.
Further study required. Exactly +Gabriel Cooper. The original study is not worthless. It’s the starting point.
In the second case +Daniel Taylor the phrase is still useless because the person shouldn’t have wasted everyone’s time commenting in the first place. They added nothing.
+Eoghann Irving The phrase is useless if the one you are telling it to has already internalized its message. Otherwise it’s very helpful for concisely stating a very important message everyone should know.
Imho bothering to refute every shady looking graph on the internet with a well researched essay on why current research doesn’t support the presented data is a fool’s errand.
Imho it’s the posters job to provide such evidence if correct critique, like the sentence you dislike so much, is brought up.
Sure. If you have the data in hand or a real stake in the discussion you can take the time. But said sentence, when appropriate, will be part of a proper response.
So what of data/numbers? It’s really only useful for experts. There are so many ways to frame and steer it that it really becomes useless outside of those who can truly dissect and understand it.
I don’t agree +Stefan Hacker. It’s a phrase that doesn’t change anyone’s mind because it doesn’t give them a reason to change their mind.
Then we have to agree to disagree
I think it gives a point to ponder that can change their mind. Even if it doesn’t. While I love to make people see my point in internet discussions, a lot of times it just won’t happen. In such cases I’m content with showing that not everyone trusts/agrees with what is posted so other users approach it with more caution. Bonus points if I can do it with a to-the-point objection.
Then why not say something meaningful… like pointing out the actual issue with the data presented +Stefan Hacker?
You might just as well be saying… fail.
If the graph simply shows a general trend for two things to increase or decrease over time, then that is too weak to present as ‘support’ and I’d have no problem throwing that line out. It might not add anything new but nor does the initial graph as there’s too much that follows that pattern. With a time based correlation I want to see at least one corresponding trough or peak that might point to which way causation may flow or what other factors or events may contribute.
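A rough sketch of why a shared trend over time is such weak support, using invented numbers rather than real data. The two series below are generated independently and have nothing to do with each other except that both drift upward. Their raw values correlate strongly. Their period-to-period changes, which is where matching peaks and troughs would show up, do not. The slopes, noise levels, and series length are all assumptions picked for illustration.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

rng = random.Random(1)  # fixed seed so the run is repeatable
n = 500

# Two hypothetical, unrelated series that both happen to trend upward.
series_a = [0.5 * t + rng.gauss(0, 3) for t in range(n)]
series_b = [0.3 * t + rng.gauss(0, 3) for t in range(n)]

# Correlation of the raw levels is dominated by the shared upward drift.
levels = pearson(series_a, series_b)

# Differencing removes the trend and keeps only the period-to-period wiggles.
diff_a = [series_a[i + 1] - series_a[i] for i in range(n - 1)]
diff_b = [series_b[i + 1] - series_b[i] for i in range(n - 1)]
changes = pearson(diff_a, diff_b)

print("levels:", round(levels, 3))
print("changes:", round(changes, 3))
```

If two series really are linked, some correlation would usually survive in the changes too (possibly at a lag, which this sketch does not check). That is another way of asking for a corresponding peak or trough.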
I don’t know the post or subject that inspired this and you might be right that the statement gets overused. However, when a conclusion is drawn based on flawed logic then it must be said.