Context
Science past and present
Tom Siegfried

Doctors flunk quiz on screening-test math

Of 61 physicians, hospital staff and medical students asked, "If a test to detect a disease whose prevalence is 1 out of 1,000 has a false positive rate of 5 percent, what is the chance that a person found to have a positive result actually has the disease?" only 14 gave the correct answer: about 2 percent.

Imagine a hypothetical baseball player. Call him Alex. He fails a drug test that is known to be 95 percent accurate. How likely is it that he is really guilty?

If you said 95 percent, you’re wrong. But don’t feel bad. It puts you in the company of a lot of highly educated doctors.

OK, it’s kind of a trick question. You can’t really answer it without knowing some other things, such as how many baseball players actually are guilty drug users in the first place. So let’s assume Alex plays in a league with 400 players. Past investigations indicate that 5 percent of all players (20 of them) use the illegal drug. If you tested them all, you’d catch 19 of them (because the test is only 95 percent accurate) and one of them would look clean. On the other hand, 19 clean players would look guilty, since 5 percent of the 380 innocent players would be mistakenly identified as users.

So of all the players, a positive (guilty) test result would occur 38 times — 19 truly guilty, and 19 perfectly innocent. Therefore Alex’s positive test means there’s a 50-50 chance that he is actually guilty.
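For anyone who wants to check that arithmetic, here's a minimal sketch in Python, assuming (as in the scenario above) that the test's 95 percent accuracy applies equally to guilty and innocent players:

```python
# A sketch of the baseball example, assuming the test's 95 percent
# "accuracy" is both its sensitivity (catching users) and its
# specificity (clearing non-users).
players = 400
users = 20                 # 5 percent of players actually use the drug
clean = players - users    # 380 players are innocent
accuracy = 0.95

true_positives = users * accuracy           # 19 guilty players flagged
false_positives = clean * (1 - accuracy)    # 19 innocent players flagged

ppv = true_positives / (true_positives + false_positives)
print(f"Chance a flagged player is guilty: {ppv:.0%}")  # prints 50%
```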

Now if these mathematical machinations mattered only for baseball, it wouldn’t be worth blogging about. But the same principles apply to medical screening tests. For decades, doctors and advocacy groups have promoted screening tests for all sorts of diseases without putting much thought into the math for interpreting the test results.

Way back in 1978, one study found that many doctors don’t understand the relationship between the accuracy of a test and the probability of disease. But those doctors probably went to med school in the ’60s and weren’t paying much attention. And there were no blogs or Twitter to disseminate important medical information back then. So some Harvard Medical School researchers recently decided to repeat the study. They posed the question in exactly the same way: “If a test to detect a disease whose prevalence is 1 out of 1,000 has a false positive rate of 5 percent, what is the chance that a person found to have a positive result actually has the disease?” The researchers emphasized that this was just for screening tests, where doctors had no knowledge of anyone’s symptoms.

Of 10 medical students given the quiz, only two got the right answer. So we can hope that the other eight will flunk medical school and never treat any patients. But of 25 “attending physicians” given the question, only six got the right answer. Other hospital staff (such as interns and residents) didn’t do any better.

Among all the participants, the most common answer was 95 percent, the test’s accuracy rate. But as with the hypothetical drug test, the test’s actual “positive predictive value” for how many people really had the disease was much lower, in this case only about 2 percent. (If only 1 in a 1,000 people have the disease, testing 1,000 people would produce about 50 false positives and 1 correct identification; 1 divided by 51 is 1.96 percent.)
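In code, the quiz scenario looks like this (a sketch assuming, as the question implies, that the test catches essentially every true case; the helper function is mine, not the study's):

```python
def positive_predictive_value(prevalence, false_positive_rate, sensitivity=1.0):
    """Fraction of positive results that are true positives (Bayes' rule)."""
    true_pos = prevalence * sensitivity                 # real cases flagged
    false_pos = (1 - prevalence) * false_positive_rate  # healthy people flagged
    return true_pos / (true_pos + false_pos)

# Disease prevalence 1 in 1,000, false positive rate 5 percent:
print(f"{positive_predictive_value(1/1000, 0.05):.1%}")  # about 2.0%
```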

“Our results show that the majority of respondents in this single-hospital study could not assess PPV in the described scenario,” Arjun Manrai and collaborators wrote in a research letter published April 21 in JAMA Internal Medicine. “Moreover, the most common error was a large overestimation…, an error that could have considerable impact on the course of diagnosis and treatment.”

Really, you don’t want a doctor who tells you it’s 95 percent likely that you’re toast when the actual probability is merely 2 percent.

It’s natural to wonder, though, whether these hypothetical exercises ever apply to real life. Well, they do, in everything from prostate cancer screening to mammography guidelines. They also apply to news reports about new diagnostic tests, an area in which the media are generally not very savvy.

One recent example involved Alzheimer’s disease. In March, the journal Nature Medicine published a report on a blood lipid test that predicted the imminent arrival (within two to three years) of Alzheimer’s disease or mild cognitive impairment with over 90 percent accuracy. News reports heralded the test’s 90 percent accuracy as though it were a big deal. But more astute commentary pointed out that such a 90 percent accurate test would in fact be wrong 92 percent of the time.

That’s based on an Alzheimer’s prevalence in the population of 1 percent. If you test only people over age 60, the prevalence rate goes up to 5 percent. In that case a positive result with a 90 percent accurate test is correct 32 percent of the time. “So two-thirds of positive tests are still wrong,” pharmacologist David Colquhoun of University College London writes in a blog post in which he works out the math for evaluating such screening tests in detail.
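Colquhoun’s figures follow from the same bookkeeping. A quick sketch, assuming “90 percent accurate” means 90 percent sensitivity and 90 percent specificity:

```python
def ppv(prevalence, sensitivity, specificity):
    """Fraction of positive results that are true positives."""
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * (1 - specificity)
    return true_pos / (true_pos + false_pos)

print(f"1 percent prevalence: {ppv(0.01, 0.9, 0.9):.0%} of positives correct")  # 8%
print(f"5 percent prevalence: {ppv(0.05, 0.9, 0.9):.0%} of positives correct")  # 32%
```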

Neither the scientific paper nor media reports pointed out the fallacy in the 90 percent accuracy claim, Colquhoun noted. There seems to be “a conspiracy of silence about the deficiencies of screening tests,” he comments. He suggests that researchers seeking funding are motivated to hype their results and omit mention of how bad their tests are, and that journals seeking headlines don’t want to “pour cold water” on a good story. “Is it that people are incapable of doing the calculations? Surely not,” he concludes.

But many doctors, journal editors and journalists surely aren’t capable of doing the calculations (or at least haven’t tried to). As the Harvard researchers point out in JAMA Internal Medicine, efforts to train doctors in statistics need improvement.

“We advocate increased training on evaluating diagnostics in general,” they write. “Specifically, we favor revising premedical education standards to incorporate training in statistics in favor of calculus, which is seldom used in clinical practice.”

Statistical training would also be a good idea for undergraduate journalism programs. And it should be mandatory in graduate-level science journalism programs. It’s not — perhaps because many students in such programs already have an advanced science degree. But that doesn’t mean they actually understand statistics, even if they’ve taken a statistics course. Science journalists really need a course in statistical inference and evaluating evidence, designed specifically for reporting on scientific studies. Maybe in my spare time I could put something like that together. But I’d probably have to hype it to get funding.

Follow me on Twitter: @tom_siegfried

Editor's Note: This story was updated on April 23, 2014, to correct the title and affiliation of David Colquhoun. 
