An effort to reproduce findings of five prominent cancer studies has produced a mixed bag of results.
In a series of papers published January 19 in eLife, researchers from the Reproducibility Project: Cancer Biology report that none of five prominent cancer studies they sought to duplicate were completely reproducible. Replicators could not confirm any of the findings of one study. In other cases, replicators saw results similar to the original study’s, but statistical analyses could not rule out that the findings were a fluke. Problems with mice or cells used in two experiments prevented the replicators from confirming the findings.
“Reproducibility is hard,” says Brian Nosek, executive director of the Center for Open Science in Charlottesville, Va., an organization that aims to increase the reliability of science. It’s too early to draw any conclusions about the overall dependability of cancer studies, Nosek says, but he hopes the repeat experiments will be “a process of uncertainty reduction” that may ultimately help researchers increase confidence in their results.
The cancer reproducibility project is a collaboration between Nosek’s center and Science Exchange, a network of labs that conduct replication experiments for a fee. Replicators working on the project selected 50 highly cited and downloaded papers in cancer biology published from 2010 to 2012. Teams then attempted to copy each study’s methods, often consulting with the original researchers for tips and materials. The five published in eLife are just the first batch. Eventually, all of the studies will be evaluated as a group to determine the factors that lead to failed replications.
In 2011, researchers predicted an antiulcer drug called cimetidine could fight a type of lung cancer. In studies with mice, the researchers found that the drug reduced the size of tumors (top graph). The drug doxorubicin, already known to work against lung cancer tumors, was used as a comparison. A repeat of the drug tests (bottom graph) tested one dosage of cimetidine. It also appeared to reduce tumor size, but an initial statistical analysis indicated the result might be a fluke. Combining the two tests’ results, though, supported the initial finding of the drug’s potential effectiveness.
Critics charge that the first batch of replication studies did not accurately copy the originals, producing skewed results. “They didn’t do any troubleshooting. That’s my main complaint,” says cancer biologist Erkki Ruoslahti of Sanford Burnham Prebys Medical Discovery Institute in La Jolla, Calif.
Ruoslahti and colleagues reported in 2010 in Science that a peptide called iRGD helps chemotherapy drugs penetrate tumors and increases the drugs’ efficacy. In the replication study, the researchers could not confirm those findings. “I felt that their experimental design was set up to make us look maximally bad,” Ruoslahti says.
Replicators aren’t out to make anyone look bad, says cancer biologist Tim Errington of the Center for Open Science. The teams published the experimental designs before they began the work and reported all of their findings. What Ruoslahti calls troubleshooting, Errington calls fishing for a particular result. Errington acknowledges that technical problems may have hampered replication efforts, but says such problems are themselves valuable data for determining why independent researchers often can’t reproduce published results. Identifying weaknesses will enable scientists to design better experiments and conduct research more efficiently, he argues.
Other researchers took issue with the replicators’ statistical analyses. One study sought to reproduce results from a 2011 Science Translational Medicine report. In the original study, Atul Butte, a computational biologist at the University of California, San Francisco, and colleagues developed a computer program for predicting how existing drugs might be repurposed to treat other diseases. The program predicted that an ulcer-fighting drug called cimetidine could treat a type of lung cancer. Butte and colleagues tested the drug in mice and found that it reduced the size of lung tumors. The replication attempt got very similar results with the drug test. But after adjusting the statistical analysis to account for multiple variables, the replication study could no longer rule out a fluke result. “If they want a headline that says ‘It didn’t replicate,’ they just created one,” Butte says. Errington says the corrections were necessary and were not intended to invalidate the original result. And when replication researchers analyzed the original and replication studies together, the results once again appeared to be statistically sound.
A failure to replicate should not be viewed as an indication that the original finding wasn’t correct, says Oswald Steward, a neuroscientist at the University of California, Irvine, who has conducted replication studies of prominent neuroscience papers but was not involved in the cancer replication studies. “A failure to replicate is simply a call to attention,” Steward says. Especially when scientists are building a research program or trying to create new therapies, it is necessary to make sure that the original findings are rock solid, he says. “We scientists have to really own this problem.”
Editor’s note: This story was updated January 26, 2017, to correct the starting point of the x-axis in the first graph.