The modern scientific process is guided by peer review. A group of scientists in the field reads and scrutinizes a scientific manuscript before it is published. Committees of experts review plans for studies and determine whether they should be funded. The National Institutes of Health, the National Science Foundation and the European Research Council distributed nearly $40 billion for scientific research in 2014, based almost entirely on peer review.
Do fellow scientists really pinpoint the best research plans to fund? Some scientists have worried that peer review pushes funding toward high-profile institutions and previously successful researchers rather than toward potentially innovative research. A new study attempts to address this concern and finds that peer review does identify influential research. But whether peer review is the best way to identify the science most worth funding is a much harder question to answer.
In the United States, some basic biomedical and clinical research is funded by private companies and investors. But much of it is funded by the NIH with government money. That raises the question of whether that money is being spent on research that is worthwhile for the population as a whole, whether it finds cures and uncovers the causes of disease. “Is the NIH still funding the path-breaking research that is likely to be influential?” asks Leila Agha, an economist at Boston University and a coauthor on the study. “In high-level applications, can [peer review] distinguish the best research?”
For all that science is a data-based enterprise, scientists don’t have the data on whether peer review really does end up funding the best research. “If you were a congressperson or a taxpayer, you might say ‘show me some data that peer review is good at picking things that turn out to be important,’” says Jeremy Berg, a biochemist at the University of Pittsburgh. “But until this study was done, the answer was ‘we believe it but we can’t prove it.’ As scientists, that’s kind of embarrassing.”
When a scientist wants to get NIH funding for a study, she writes up a grant proposal that reports results from preliminary studies, gives goals for the project, outlines the future experiments and estimates the time and resources they will require. The researcher submits her grant, and it’s assigned to a study section of 20 to 30 researchers who work in disciplines closely related to that of the grant proposal.
Within the study section, the grant is assigned to three reviewers, two of whom provide detailed comments, and a reader, who provides additional comments. The reviewers give the grant an overall score based on five criteria: significance, scientific approach, potential innovation, the proposing scientist's skills and whether the researcher's university has the resources to support the work. About 40 to 50 percent of grants are "triaged" at this stage. The rest go to the study section as a whole. After about 10 to 15 minutes of discussion, the grants receive final rankings by priority, with the lowest scores being the best and most likely to be funded.
These days, that means most grants, even those that score well, will not get funding. NIH's current annual budget is around $30 billion, but that number has not kept pace with the growing number of scientists applying for research money. In 2014, only about 16 percent of new applications were funded. This makes applying for grants more competitive, and thus makes it even more important that peer review select the research with the highest potential payoff.
Agha and Danielle Li, an economist at Harvard University, wanted to determine whether peer review could successfully predict the influence of the subsequent research. They examined the funding scores for a total of 137,215 peer-reviewed grants funded between 1980 and 2008. For each of the grants, they hunted down how many published scientific studies or patents the grant yielded within five years of the grant’s success. Li and Agha also looked at how many citations the scientific studies for each grant had accrued.
As they assessed the scores and the grants' success rates, the researchers tried to factor out the scientists' institutions, previous funding, previous work and field of study. The results showed that grants with higher scores did, in fact, tend to have more patents and more highly cited publications associated with them. For each 10-point drop in score, a grant was 19 percent less likely to produce a high-impact publication and 14 percent less likely to produce a patent. The economists report their findings April 23 in Science.
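The per-10-point figures lend themselves to a quick back-of-the-envelope calculation. The sketch below is purely illustrative and is not the economists' actual statistical model: it assumes the reported 19 percent drop in high-impact-publication likelihood per 10 score points compounds multiplicatively across larger score gaps, which is our simplifying assumption, not the study's.

```python
# Illustrative only: extrapolate the study's reported 19 percent drop per
# 10-point score gap, assuming (our assumption) multiplicative compounding.
def relative_likelihood(score_gap_points, drop_per_10=0.19):
    """Likelihood of a high-impact publication for a grant scored
    `score_gap_points` worse than a reference grant, relative to that
    reference grant (1.0 = same likelihood)."""
    return (1 - drop_per_10) ** (score_gap_points / 10)

print(round(relative_likelihood(10), 3))  # → 0.81 (the reported effect)
print(round(relative_likelihood(30), 3))  # → 0.531 (0.81 compounded 3x)
```

Under this rough assumption, a grant scored 30 points worse than another would be roughly half as likely to yield a high-impact publication, which gives a feel for how strongly scores track outcomes on average.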
“It’s good news. It’s suggesting that [grant reviewers] do on average have a clue,” says Lars Lefgren, an economist at Brigham Young University in Provo, Utah. “Some people complain that the NIH may be biased in terms of awarding grants to people with big names or established track records but who don’t have the most exciting or novel research. This study suggests those types of concerns may not happen on average.”
While the results do suggest that peer review can identify and fund impressive grant applications, Berg says that there are lots of hits and misses. Some very well-scored grants have relatively low publication records, and some grants that barely made the cut turned out to be superstars. “There’s a lot of scatter at every score,” he notes. But that’s to be expected. “It’s about predicting the future,” he explains. “A lot of scores are given for experiments that haven’t been performed yet. It’s like picking stocks: You do what you can do, but you don’t expect 100 percent accuracy.” And when money is tight and funding levels are extremely competitive, a grant proposal that just barely got funded is probably not actually a better proposal than the one that just barely missed the cut.
The study included a large percentage of grants that were competitive renewals. These are grants that have already been funded but are not complete and need a further five years of work. These grants tend to receive much higher scores and are much more likely to be funded than brand new research applications. “The more mature the research, the easier the task of figuring out if it will be successful,” says Agha. “But even with [new] grants, the relationship between peer review score and outcome is very much there.” Of course, many new grants have extensive preliminary data, often funded by previously successful grants, and so the previous success of the scientist is always going to come into play in scoring.
In addition, the study doesn’t show — and indeed can’t show — what path-breaking grant applications the peer review process may have missed. You can’t study the results of research that was never funded and potentially never performed. Berg says that one way to get at that question would be to calculate what might happen if the funding cutoff were more restrictive than it actually is. Researchers could then assess what grants would not have been funded and what research would have been lost.
And of course, Agha notes that their results can’t compare the success of peer review as a system with other ways of choosing what research to fund, because no alternative systems are available for comparison. “We’re not trying to make the case that the system is perfect or infallible,” she says.
But the study shows that peer review is adding something. Reviewers are not simply rewarding their high-profile colleagues; they are choosing studies that are associated with future productivity. “Everyone realizes [peer review] has flaws and biases, but it’s relatively transparent and it has done a good job,” Berg notes. “There have been important discoveries.” It’s like what they say about democracy, he says: “It’s the worst system, except for all the others.”