P value ban: small step for a journal, giant leap for science

Editors reject flawed system of null hypothesis testing

Imagine, if you dare, a world without P values.

Perhaps you’re already among the lucky participants in the human race who don’t know what a P value is. Trust me, you don’t want to. P stands for pernicious, and P values are at the root of all (well, most) scientific evil.

Of course, I don’t mean evil in the sense of James Bond’s villains. It’s an unintentional evil, but nevertheless a diabolical conspiracy of ignorance that litters the scientific literature with erroneous results. P values are supposed to help scientists decide whether an apparently meaningful experimental result is really just a fluke. But in fact, P values confuse more than they clarify. They are misused, misunderstood and misrepresented.

But now somebody is finally trying to do something about it.

Last month a scientific journal, Basic and Applied Social Psychology, announced that it won’t publish papers that mention the unmentionable P value. No longer will the journal permit published papers to report the P value’s use in the process of “null hypothesis testing,” which psychologists and scientists in many other fields routinely rely on. Anyone embarking on a research career soon gets infected with this method. When you want to test whether a food additive causes cancer, or a medicine cures a disease, you assume that it doesn’t (that’s the null hypothesis) and then do an experiment comparing the additive or medicine with a placebo, or another drug, or whatever. If more people survive with the medicine than with the placebo, maybe the medicine works. Or maybe that result was a fluke, the luck of the draw. P values supposedly tell you whether the difference you saw was luck or reality.

Except that they don’t. P value calculations tell you only the probability of seeing a result at least as big as what you saw if there is no real effect. (In other words, the P value calculation assumes the null hypothesis is true.) A small P value — low probability of the data you measured — might mean the null hypothesis is wrong, or it might mean that you just saw some unusual data. You don’t know which. And if there is a real effect, your calculation of a P value is rendered meaningless, because that calculation assumed that there wasn’t a real effect.
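To make that calculation concrete, here is a minimal sketch in Python. The scenario is invented for illustration and does not come from any study: 20 patients take a medicine, 15 recover, and the null hypothesis says each patient has a 50 percent chance of recovering regardless. The function name `binomial_p_value` and all the numbers are assumptions of this example.

```python
from math import comb


def binomial_p_value(successes: int, trials: int, null_rate: float = 0.5) -> float:
    """One-sided P value: the probability of at least `successes` recoveries
    in `trials` patients, computed as if the null hypothesis were true."""
    return sum(
        comb(trials, k) * null_rate**k * (1 - null_rate) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical data: 15 recoveries out of 20 patients.
p = binomial_p_value(15, 20)
print(f"P value: {p:.4f}")  # prints "P value: 0.0207"
```

Notice what the number is: the chance of data this extreme given that there is no effect. Nothing in the calculation asks how likely it is that there is no effect.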

Nevertheless, the scientific establishment — the peer-reviewed journals that supposedly police scientific standards and decide what research gets published — has largely insisted on P values as a measure of publication worthiness. But now the editors of Basic and Applied Social Psychology have gone rogue.

“The [P value] fails to provide the probability of the null hypothesis, which is needed to provide a strong case for rejecting it,” David Trafimow and Michael Marks of New Mexico State University write in the journal’s editorial announcing the P value ban.
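A rough sketch shows why the probability of the null hypothesis can't be read off the P value alone. It uses Bayes' rule with made-up inputs: the share of tested hypotheses that are real effects (`prior_real`), the chance that a test detects a real effect (`power`), and the significance cutoff (`alpha`). None of these numbers come from the editorial; they are assumptions chosen only to show how much the answer moves.

```python
def share_of_false_alarms(prior_real: float, power: float, alpha: float) -> float:
    """Among results that pass the cutoff, the fraction for which the
    null hypothesis is actually true (Bayes' rule)."""
    false_alarms = alpha * (1 - prior_real)  # null true, yet test "significant"
    true_hits = power * prior_real           # real effect, test "significant"
    return false_alarms / (false_alarms + true_hits)


# Hypothetical scenarios: real effects are common, uncommon, or rare.
for prior_real in (0.5, 0.1, 0.01):
    share = share_of_false_alarms(prior_real, power=0.8, alpha=0.05)
    print(f"real effects in {prior_real:.0%} of tests: "
          f"null true in {share:.0%} of significant results")
```

With the same 0.05 cutoff, the share of significant results where the null hypothesis is true runs from about 6 percent to about 86 percent in these invented scenarios. The P value is identical in each case; what changes is information the P value doesn't contain.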

It’s no great shock that some of the world’s statistical organizations have reacted a bit negatively. In a statement, the American Statistical Association expressed concern that the P value ban “may have its own negative consequences.” More than two dozen “distinguished statistical professionals” are developing a statement for the association “to appear later this year” that will “highlight the issues and competing viewpoints.” Composing such a statement was a very good idea, 50 years ago.

And in fact, for decades, many distinguished statistical professionals and others have been harping on the intellectual bankruptcy of P values and null hypothesis testing. “Despite the awesome pre-eminence this method has attained … it is based upon a fundamental misunderstanding of the nature of rational inference, and is seldom if ever appropriate to the aims of scientific research,” the philosopher of science William Rozeboom wrote — in 1960. Later he called it “surely the most bone-headedly misguided procedure ever institutionalized in the rote training of science students.”

Many others since Rozeboom have argued just as forcefully that P values are pathological. Their widespread use in scientific research renders many if not most scientific papers guilty of reporting a finding that will later turn out to be wrong. P values pose a serious problem that has plagued the scientific process for nearly a century.

Yet they remain persistently misunderstood. In an account of the Basic and Applied Social Psychology ban, a prestigious international scientific journal stated that “the closer to zero the P value gets, the greater the chance that the null hypothesis is false.” That’s utterly wrong, but it is often how P values get explained and understood. And perhaps that’s the best reason to get rid of them.


Tom Siegfried is a contributing correspondent. He was editor in chief of Science News from 2007 to 2012 and managing editor from 2014 to 2017.
