In perfect timing for our Bayes discussion, one of our Senior Science Advisors at work sent along this article by Gelman and Loken (see here: http://www.americanscientist.org/issues/feature/2014/6/the-statistical-crisis-in-science/2), questioning the value of the statistically significant comparisons many of us researchers fall into making. Calling it the "statistical crisis in science," Gelman and Loken delve into several published papers that report statistical significance, and they question how much of that significance is shaped by the researchers' predetermined predictions and by the particular data set analyzed. They discuss the practice dubbed "p-hacking," in which researchers search across many relationships or comparisons until something turns out significant; but they argue the problem arises even for researchers who look only at the hypothesis at hand and never hunt for the relationship with the best p-value, because their analysis choices are still shaped by the data. As the authors put it, "the mistake is in thinking that, if the particular path that was chosen yields statistical significance, this is strong evidence in favor of the hypothesis."
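To make the multiple-comparisons intuition concrete, here's a minimal simulation sketch. It assumes a pure-null world (no real effects anywhere) and a hypothetical researcher who examines 20 relationships per study; under the null, each p-value is uniform on (0, 1), so we can simulate them directly. The study count and comparison count are my own illustrative choices, not numbers from the article.

```python
import random

random.seed(0)

N_STUDIES = 10_000      # simulated studies (illustrative choice)
N_COMPARISONS = 20      # relationships examined per study (illustrative choice)
ALPHA = 0.05

# Under a true null hypothesis, p-values are uniformly distributed,
# so drawing random.random() stands in for running each test.
false_positive_studies = sum(
    any(random.random() < ALPHA for _ in range(N_COMPARISONS))
    for _ in range(N_STUDIES)
)

rate = false_positive_studies / N_STUDIES
print(f"Chance of at least one 'significant' result: {rate:.2f}")
# Theory predicts 1 - (1 - 0.05)**20, i.e. roughly 0.64
```

Even with no real effects at all, a majority of these simulated studies can report something "significant" at the 0.05 level, which is the trap the garden-of-forking-paths argument describes.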
While I agree we need to consider all possible relationships or hypotheses so that we don't miss the true importance of what is going on, how often can we realistically examine every possibility that could drive us to a result? The very basis of most of our studies is a driving question, which focuses the relationship we analyze.
The role of statistical significance in our analyses is still clearly a matter of debate in the community. A coworker of mine published a paper in which her co-author, a statistician, demanded that she include p-values in the analysis, even though she felt the p-values were not relevant to the analysis being discussed, nor did they make much sense when included.
It raises the point Jarrett made in class yesterday: it all really comes down to what questions we're trying to answer. Really, we can find a relationship (whether it's strong or weak) with anything. But depending on the story we're trying to tell, the tests we use to support our results must be relevant to our study.