Web edition: March 8, 2009
Scientific plagiarism is morally reprehensible. But it can also prove dangerous, especially in biomedical arenas. That point was made, if obliquely, last week by the author of a copycatted journal article, who was quoted in a new policy paper in Science.
Harold Garner’s investigative team at the University of Texas Southwestern Medical Center had presented the author with evidence indicating that his research data had been copied by others and republished. Complained the copycatted author: “[M]y major concern is that false data will lead to changes in surgical practice.”
And here’s a hypothetical example of how that might come about.
Suppose author #1 reported that in a small experimental trial, drug therapy worked as well as surgery, even though the data showed this success in only three of the five patients treated. Still, the drug therapy cost only 2 percent as much as surgery and reduced recovery times by 90 percent.
But what if there was something unusual about the patients in whom the drug therapy had worked? Perhaps they all were post-menopausal, or were men, or shared the same rare genetic mutation, or had a coexisting condition such as diabetes that altered their metabolism of the drug. Because the trait these patients shared had not yet been teased out, the drug therapy’s success rate would ultimately prove anomalous, and substituting it for surgery dangerous for most other groups of patients.
Because the drug therapy had worked only three times, any reader would recognize that the initial paper’s findings were preliminary.
Now what if another team reported comparing the same drug vs. surgery in a group of 200 patients — and described the same roughly 60 percent success rate for the drug? The second paper, in this instance a plagiarized report based on the first journal article (but containing bogus data), not only would lend credence to the initial finding but also would boost the drug treatment’s apparent statistical strength, since the therapy now seemed successful in a far larger population.
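To see why the larger (fabricated) sample matters statistically, here is a minimal sketch using a normal-approximation confidence interval — a deliberate simplification, not a method attributed to either paper. The same 60 percent success rate looks far more convincing at n=200 than at n=5, because the interval around it shrinks dramatically:

```python
import math

def approx_ci(successes, n, z=1.96):
    """Normal-approximation 95% confidence interval for a success rate.
    (A toy illustration; real trials would use more careful methods.)"""
    p = successes / n
    se = math.sqrt(p * (1 - p) / n)
    return p - z * se, p + z * se

# Original (honest) report: 3 successes in 5 patients.
lo5, hi5 = approx_ci(3, 5)
# Plagiarized "replication": same 60% rate, but claimed in 200 patients.
lo200, hi200 = approx_ci(120, 200)

print(f"n=5:   60% success, 95% CI ({lo5:.2f}, {hi5:.2f})")
print(f"n=200: 60% success, 95% CI ({lo200:.2f}, {hi200:.2f})")
```

With five patients the interval is so wide it barely rules anything out; with 200 it narrows to a few percentage points either side of 60 percent — which is exactly how a copycat paper with invented patients would lend false statistical weight to a preliminary result.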
Garner witnessed something along these lines in the studies his team investigated for potential plagiarism. In one instance, the second paper in a pair “exactly doubled everything” in the first paper. Where there were data from 102 patients in the first paper, the second had 204. Tables describing particular attributes of each patient were identical for the first 102 patients, Garner told me, and the remainder “seems to have been synthesized.”
The data-mining software that Garner’s team developed to uncover such similarities between different papers is currently available free for anyone to use, including manuscript reviewers and journal editors. Yet some have expressed reluctance to use the product because it isn’t yet commercial, and therefore might not ensure confidentiality for all materials run through it.
“We don’t look at papers or the submissions, and we have a secure site,” Garner says. Proving that submitted materials remain fully confidential and secure, however, “is not at the level that journals would like. So we’re thinking about developing such a commercial product,” he told me.
Of course, for the current product to work, there has to be a substantial match between the abstracts of two papers. The reason: MEDLINE, the database against which a new paper is compared, contains only abstracts. Only when abstracts exhibit substantial similarity does Garner’s team retrieve the full texts of the matching papers and run them through its program to scout for additional similarities.
If the abstracts of two papers are quite different, then, the system has no way to detect virtually verbatim text or data lurking in the body of the paper.
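The article doesn’t describe how Garner’s software measures similarity, so the following is only a toy stand-in — word-trigram overlap, one common approach to near-duplicate text detection — to illustrate why similar abstracts would trigger a full-text comparison while different ones sail through:

```python
def word_ngrams(text, n=3):
    """Set of overlapping n-word sequences from a text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def similarity(a, b, n=3):
    """Jaccard similarity over word trigrams -- a hypothetical stand-in,
    not the actual algorithm used by Garner's team."""
    ga, gb = word_ngrams(a, n), word_ngrams(b, n)
    if not ga or not gb:
        return 0.0
    return len(ga & gb) / len(ga | gb)

# Invented example abstracts, for illustration only.
original  = "drug therapy was successful in three of five patients treated"
copied    = "drug therapy was successful in three of five patients we treated"
unrelated = "surgical outcomes in a pediatric cohort with congenital defects"

print(similarity(original, copied))     # high overlap: flag for full-text check
print(similarity(original, unrelated))  # near zero: pair is never examined
```

The limitation described above falls out directly: if a plagiarist rewrites only the abstract, the abstract-level score stays low, and the verbatim body text is never pulled for comparison.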
Long, T.C., . . . and H.R. Garner. 2009. Responding to Possible Plagiarism. Science 323(March 6):1293.