Thursday, April 11, 2013

Test statistics and the publication game

It is well known that journals do not like replications or confirmations of hypotheses. They are looking for empirical results that contradict popular wisdom, and this must be influencing how researchers look for test results. To increase your chances of success, you want to mention only highly significant results and ignore the so-so ones.

Abel Brodeur, Mathias Lé, Marc Sangnier and Yanos Zylberberg look at the distribution of p-values in articles published in the top three economics journals. I am not quite sure exactly what the distribution of p-values would be if the publication process were unbiased, but p-values are uniformly distributed when the null hypothesis is true and concentrated near zero when there is a genuine effect, so a mixture of the two should at least be smooth and monotonically decreasing, with no jumps at conventional thresholds. What the authors find does not look at all like this. There is a distinct lack of test results that just miss the 5% or 10% significance thresholds, and distinctly more that just pass them, making the distribution bimodal. Interestingly, this problem is less pronounced when stars are not used to highlight significance or when the authors are tenured.
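To get a feel for the contrast, here is a minimal simulation sketch. The specific numbers are illustrative assumptions, not from the paper: a 30% share of genuinely false null hypotheses, true effects of two standard errors, and selective reporting modeled as rerunning an insignificant study up to three times and publishing the first significant result.

```python
import math
import random

random.seed(0)

def two_sided_p(z):
    """Two-sided p-value for a standard-normal test statistic z."""
    return 2.0 * (1.0 - 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0))))

def one_study(share_true=0.3, effect=2.0):
    """One study's p-value: the effect is real with probability share_true."""
    mu = effect if random.random() < share_true else 0.0
    return two_sided_p(random.gauss(mu, 1.0))

N = 100_000

# Unbiased reporting: every study's p-value is published as drawn.
honest = [one_study() for _ in range(N)]

# Selective reporting: rerun an insignificant study up to k times and
# publish the first significant p-value (or the last attempt if none is).
def published_p(k=3):
    for attempt in range(k):
        p = one_study()
        if p < 0.05 or attempt == k - 1:
            return p

selected = [published_p() for _ in range(N)]

def bin_counts(ps):
    """Counts in narrow bins just below and just above the 5% threshold."""
    below = sum(0.04 <= p < 0.05 for p in ps)
    above = sum(0.05 <= p < 0.06 for p in ps)
    return below, above

just_below, just_above = bin_counts(honest)
below, above = bin_counts(selected)
print("honest:  ", just_below, just_above)  # roughly comparable counts
print("selected:", below, above)            # a jump just below 0.05
```

In the unbiased run the histogram declines smoothly through 0.05; under selective reporting a discontinuity appears at the threshold, with far more p-values just below 0.05 than just above it, which is the qualitative pattern the authors document.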

These results indicate that there is more than a selection bias at work. There is an inflation bias, whereby the researcher presents only the most significant results, obtained by hunting for the specification that passes the magic significance thresholds. I do not think this is ethical, but the publishing game makes it nearly unavoidable, so the profession is apparently fine with it. I guess we have to tolerate this and take it into account when reading papers, much like we account for grade inflation when reading transcripts or for the similar inflation in recommendation letters.

PS: This paper is a strong candidate for the best paper title of the year. Bravo!

PS2: What is really unethical is claiming results are significant when they are not. The case of Ulrich Lichtenthaler comes to mind: he added "significance stars" to results that did not warrant them. The fact that he still managed to publish widely is an indictment of the quality of research in business journals, too.


Bobby said...

Interesting paper, but the abnormal distribution of p-values is not new. Uri Simonsohn, for example, has published work on p-value distributions in journals that is very similar to this working paper (and that, oddly, it does not cite). Andrew Gelman has done a ton of work on it. I'm not saying it's not great work, but we've known this for a long while....

Bobby said...

p.s. One of Simonsohn's papers addresses exactly the question of what the p-value distribution would look like if the publication process were unbiased.