Old-fashioned gene hunting wasn’t terribly efficient. Geneticists typically pursued one gene at a time, armed only with guesses—usually wrong—about which chunks of genetic code might be linked to human disease.
Geneticists managed to bag a few trophies anyway—genes for Huntington’s chorea and cystic fibrosis, for example—mostly in rare diseases caused by a problem in a single, high-powered gene. Unfortunately, most of the more common diseases, such as type II diabetes, are instead controlled by a whole crowd of gene variants, each playing a small and often subtle role in the path to disease.
To spot these quiet genes lying in the genomic underbrush, disease geneticists realized they’d better try a new tack. In the mid-1990s, the most foresighted among them asked, “What if someday we could take a bunch of unrelated people and compare their genetic blueprint in lots of different places, all at once? Could it revolutionize the study of human disease?” International partnerships soon formed to figure out if this was even possible.
It was. Now, new technology, buttressed by new analytical methods and enhanced knowledge of the genome, allows scientists to do just that: Researchers can test up to a million of the most important spots across the entire genome at one time. These “genome-wide association” studies excel at detecting the subtle effects from common versions or variants of genes that went unnoticed before. Researchers can now put their guesswork aside and watch as a single study hauls in thousands of potential gene suspects.
Not surprisingly, geneticists are cheered by the prospect of leaving behind their days of hapless gene-hunt bumbling. “After years as ‘Keystone Cops,’ complex-trait geneticists can now find culprits not previously suspected and establish guilt beyond a reasonable doubt,” geneticists David Altshuler and Mark Daly of the Broad Institute in Cambridge, Mass., wrote last July in Nature Genetics.
In the past two years alone, genome-wide association studies have found about 100 new genetic variants linked to 40 common diseases, including type II diabetes, prostate cancer and heart disease. These studies point to genes that researchers never suspected of being involved with certain diseases, or to uncharted regions known as “gene deserts” where genes are not known—at least yet—to exist.
Researchers hope the new studies will help explain how common diseases develop and also will help guide the search for new treatments and drugs. “I think it is absolutely clear that we have learned a tremendous amount about a whole range of complex, common, genetic diseases in the human population, and we have much greater knowledge than we did just a very short time ago,” says biostatistician Michael Boehnke of the University of Michigan in Ann Arbor. But for all the promise and hype, finding genes with the new methods may not prove as easy as shooting DNA ducks in a genomic pond. Genome-wide association studies come booby-trapped with potential pitfalls.
Ironically, some of these problems stem from the studies’ biggest strength: an unprecedented avalanche of data. Other challenges arise from the lingering genetic effects of migrations out of Africa 60,000 years ago. Ignoring these issues might cause scientists to waste valuable time investigating innocent suspects, while the truly significant genes slip away unnoticed.
As results from huge new studies roll in, these challenges are attracting more attention. The National Institutes of Health held a special meeting in March to discuss how to translate genome-wide association data into clinical research and practice. And scientific journals are publishing special papers instructing researchers in the art of interpreting genome-wide association studies.
“First you have to go through a sifting process, filtering the true signals from the false signals and making sure you don’t miss any,” says epidemiologist Muin Khoury of the Centers for Disease Control and Prevention in Atlanta. “I think this is as much of an art as a science right now.”
Deluge of data
For all the apparent variation among people, the human genetic code is actually 99.5 percent identical from person to person. That remaining individualistic half percent can help explain how diseases develop in some people and not others. Genome-wide association study researchers rely on results from two big projects to guide them to these crucial areas.
The first project, the government-sponsored Human Genome Project, analyzed a human genome archetype and spelled out the 3.2 billion nucleotide "genetic letters" that make up human DNA. Using this framework, the nonprofit International HapMap Project is pinpointing the 11 million specific sites along the genome where genetic information differs by a single letter. About 4 million sites have been cataloged so far.
Usually these one-letter sites, called single nucleotide polymorphisms, or SNPs (pronounced “snips”), do not themselves cause disease. But the SNPs often lie near important genes that can. So the SNPs serve as convenient signposts—pointing researchers to important disease-related genes in the neighborhood.
To find SNPs linked with a certain disease, the simplest approach is to compare groups of volunteers side by side. Researchers recruit a group of breast cancer patients, say, and a group of similar people who are breast cancer-free. The researchers use “SNP chips”—microchips that test up to 1 million selected SNPs at once—and record the versions of each SNP that each person possesses.
Then researchers statistically compare SNPs in the groups. If most breast cancer patients had two “T” versions of the SNP known as ESR1002, for example, and most disease-free volunteers had two “G” versions, then researchers would flag ESR1002 as a possible breast cancer suspect. Further investigation might then point to an important gene nearby.
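The statistical comparison at each SNP amounts to a simple two-by-two test. Here is a minimal sketch in Python; the genotype counts are invented for illustration, and ESR1002 is the hypothetical SNP from the example above:

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Made-up counts for one SNP (say, ESR1002): how many cases vs. controls
# carry the "T/T" version versus any other version.
cases_tt, cases_other = 300, 200
controls_tt, controls_other = 180, 320

stat = chi_square_2x2(cases_tt, cases_other, controls_tt, controls_other)
# Anything well above 3.84 (the traditional 5 percent cutoff for one
# degree of freedom) would flag this SNP for a closer look.
```

A genome-wide scan simply repeats a test like this at up to a million SNPs, which is where the multiple-testing trouble described below comes from.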
Yet a million SNPs on a chip still means a million potential suspects to sift through—most of which are ultimately not related to the disease. The flood of information is potentially overwhelming. As the title of a New England Journal of Medicine editorial last summer described it, genome-wide association studies are like “drinking from the fire hose.”
In fact, most statistical methods were built to deal with data scarcity, not to handle a data deluge. So when a genome scan delivers its data—four to five thousand times more information than in traditional epidemiology studies—standard statistical methods can easily choke.
For example, at the traditional cutoff, a genome-wide association study of 1 million SNPs will flag about 50,000 SNPs as significant. But most will be false alarms, indistinguishable from real results. Worse yet, truly interesting SNPs may be ignored and never get flagged in the first place.
The problem lies in how results get flagged. Statistical methods essentially set a cutoff value that any result must surmount before being flagged as significant—a statistical hurdle, in a sense. Traditional hurdles do a good job of separating true results from bogus ones when there aren’t many competitors in the race. But in a million-SNP blitz, too many false results manage to scramble over the hurdle just by random luck.
“There have been problems in the past when people have declared victory prematurely,” says geneticist Joel Hirschhorn of the Broad Institute, by declaring SNPs to be significant based only on the traditional statistical hurdles. “It was hard to convince people that [the old level] was not an appropriate threshold. People are starting to accept that now.”
The simplest solution is just raising the hurdle. Traditionally, researchers have permitted a bogus result to sneak through about 1 time in 20. With the new genome-wide scans, it’s now usually no more than 5 in 100 million. This raises the bar considerably, Hirschhorn says.
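The arithmetic behind those two figures (the roughly 50,000 chance hits mentioned earlier, and the new 5-in-100-million cutoff) is a simple Bonferroni-style correction, sketched here:

```python
n_snps = 1_000_000   # SNPs tested on one chip
alpha = 0.05         # traditional 1-in-20 false-positive rate

# At the old hurdle, a scan with no real signals at all would still
# flag about alpha * n_snps SNPs purely by chance:
expected_false_hits = alpha * n_snps   # roughly 50,000

# Dividing the old rate by the number of tests raises the hurdle to
# the genome-wide cutoff:
genome_wide_cutoff = alpha / n_snps    # 5 in 100 million
```

The correction keeps the chance of even one false alarm across the whole scan near the old 1-in-20 level, which is exactly why genuine but subtle signals now need much bigger studies to clear the bar.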
But higher hurdles require bigger studies. That’s because much of the muscle power behind these studies depends on how many participants are included. New studies need an extra boost of muscle to hoist the important SNPs over the now-higher bar—otherwise no SNP might get flagged. So researchers have to scramble to find money and volunteers. Typical sample sizes for genetic association studies can now run in the tens of thousands.
Even then, added muscle power might not be enough. So researchers are turning to multistage studies, too. In such studies scientists first scan the full genome, then try to replicate the strongest findings with new subjects in subsequent studies. “That really leads to a new type of epidemiology, because basically no one study is remotely definitive,” says epidemiologist David Hunter of the Harvard School of Public Health in Boston. “We have to put together consortia and large-scale collaborations.”
Still, data-sharing is becoming easier. For example, researchers who conduct genome-wide association studies funded through the National Institutes of Health must now deposit their data into a common database for immediate access.
Surprisingly, the earlier that researchers collaborate, the better—at least from a statistical point of view, Hunter says. Researchers originally thought that if small, independent groups each did their own study and then compiled a running list of all the important SNPs from their results, everything would be fine.
Not so, Hunter says. It turns out that the running list of results will still miss important SNPs. It’s better for those groups to pool all their subjects at the beginning and run one big scan, he says. The final list from the pooled study will be more accurate and complete than the running list from independent studies.
The reason, Hunter says, is that subtle genetic effects (such as those likely to contribute to diseases) can be picked up only with a sufficiently large sample size—in the same way that a larger magnifying glass is needed to spot the smaller bugs in the undergrowth. “So this will be a long-running story for common diseases,” he says, “because as we put together more and more scans, we’ll find more and more truly associated variants.”
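The magnifying-glass effect can be made concrete with a back-of-the-envelope power calculation. Everything below is assumed for illustration: a tiny standardized effect, round sample sizes, and a z threshold of about 5.45, which roughly corresponds to the 5-in-100-million genome-wide cutoff.

```python
import math

def normal_cdf(x):
    """Standard normal cumulative distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def power(effect, n, z_crit=5.45):
    """Chance that a one-sided z-test at threshold z_crit detects the
    effect; z_crit ~ 5.45 roughly matches the genome-wide cutoff."""
    return normal_cdf(effect * math.sqrt(n) - z_crit)

small_effect = 0.05  # assumed tiny per-person standardized effect

one_study = power(small_effect, n=2_000)   # a single modest scan
pooled = power(small_effect, n=20_000)     # ten such scans pooled

# The single study almost never clears the genome-wide hurdle;
# the pooled scan almost always does.
```

Under these assumptions the lone study detects the effect well under 1 percent of the time, while the pooled sample succeeds over 90 percent of the time, which is why no running list of small-study winners can substitute for one big combined scan.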
Also afflicting these multistage studies is the peculiar “winner’s curse” phenomenon, in which top results in small initial studies don’t always pan out in later studies. This is a close cousin of the “Sports Illustrated curse,” in which star rookies featured on the magazine’s cover end up with a crash-and-burn second season.
There’s a simple statistical explanation, says epidemiologist Teri Manolio of the National Human Genome Research Institute of the National Institutes of Health in Bethesda, Md. Researchers will naturally try to replicate the most extreme top-scoring results in an initial study. But these huge effects probably owe their super-high ranking in part to a true effect and in part to sheer random luck. Small follow-up studies—designed to look for these big effects—will miss the more subtle, true effects, Manolio says.
Thus initial studies may appear flawed, even if they aren’t. The solutions—increasing sample sizes and recognizing that extreme initial results are likely overinflated—are beginning to take hold. “It’s happening,” Manolio says, “but it’s happening slowly.”
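A quick simulation shows the curse at work. Assume, purely for illustration, hundreds of SNPs that all share the same modest true effect; the one with the top-scoring estimate still looks far stronger than it really is:

```python
import random

random.seed(42)  # reproducible illustration

def winners_bias(n_snps=500, true_effect=0.2, noise_sd=1.0, trials=200):
    """Average amount by which the top-scoring SNP's estimated effect
    exceeds its own true effect: the winner's curse."""
    total = 0.0
    for _ in range(trials):
        # Every SNP has the same modest true effect, so any spread in
        # the estimates is pure sampling noise.
        estimates = (true_effect + random.gauss(0.0, noise_sd)
                     for _ in range(n_snps))
        total += max(estimates) - true_effect
    return total / trials

bias = winners_bias()
# The winner's estimate overshoots its true effect by a wide margin,
# which is why top hits shrink in follow-up studies.
```

A follow-up study sized to replicate the inflated estimate, rather than the true one, will usually come up short, just as Manolio describes.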
The trouble with ancestry
Complications from race and ancestry can also play a role in genome-wide association studies. That’s because people with European, Asian and African ancestries have different genetic patterns. These patterns can be misleading. “There is a big debate about this in the genetics community,” says geneticist Eric Jorgenson of the University of California, San Francisco. “Does race matter? Individual genotypes are what matter, but at the same time, race is correlated with genotype.”
Take a simplified example: Suppose most people of European ancestry in a sample had blue eyes and also happened to have disease X, while most people of Asian ancestry were brown-eyed and disease-free. A naïve analysis might conclude that the blue-eyes SNP is responsible for disease X, even if eye color and disease are completely unrelated.
That is, the methods are likely to nab the wrong SNP suspects, simply because these innocent SNPs tend to show up in the same situations as truly guilty SNPs. This genetic-mixing issue shows up in other kinds of studies, too. But it’s a particular problem for studies of the entire genome because of the huge number of ancestry-related SNPs being tested.
Traditionally, researchers have addressed this genetic-mixing problem largely by balancing the number of study volunteers belonging to different racial groups. But this strategy goes only so far, Jorgenson says. Genetic heritage is more complicated than skin color or grandparents’ birthplace, and the ancestral variation in the gene pool can’t be conveyed with a simple check-off survey box.
Some nifty statistical tricks, however, can help researchers spot and fix this problem in their analyses, Jorgenson says. For example, one method comes up with a mathematical summary of every volunteer’s personal genetic ancestry and incorporates that into the analysis. This effectively allows researchers to “strip” each volunteer of his or her genetic ancestry and simply investigate the important genetic patterns that are left over.
Ancestry can cause other problems. Waves of migration out of Africa—starting about 60,000 years ago—were led by a relatively small number of people, resulting in a narrower gene pool in the new communities. Plus, as populations spread throughout Europe, Asia and the Americas, settlers faced limited mating choices, further reducing genetic variability.
These conditions—founder effects and bottleneck populations—meant that new emigrant groups had less genetic diversity than the original African population. Over time, these effects became more pronounced. People with recent African ancestry now have more variability across their genome than do people with European and Asian ancestry.
Problems arise when people with different genetic ancestries are included in one study, Jorgenson says. Scanning a group with greater genetic variability requires more refined tools. “If you’re applying genome-wide association studies to a bottleneck population with less variability, you can use a wider-tooth comb,” he says. “Populations with more variability need a finer-tooth comb.” Current methods may miss disease-linked SNPs in African-Americans, especially if the SNPs are associated with rare gene variants.
The ideal solution would be to sequence every letter of volunteers’ genomes—thus providing the finest-toothed comb possible. Cost and logistics are still prohibitive for this approach, however. Still, the more SNPs that manufacturers can squeeze onto their SNP chips, the more likely that important SNPs will be caught, Jorgenson says. And some manufacturers are already starting to design chips that incorporate sets of SNPs suitable for different genetic ancestries.
Genome-wide association studies might indeed prove to be a bonanza for modern gene hunters. But in all the excitement, researchers shouldn’t forget the value of good old-fashioned study design, Khoury warns. “I think people are being lulled into a zone of comfort,” he says, as some researchers rely on million-SNP chips, large sample sizes and multiple replication studies to cover up study flaws.
And there’s still a nagging question: After you’ve bagged your gene, what do you do? “To me, this is the biggest stumbling block,” Khoury says. “You still have to work out the biology of that hit…. That’s actually where the hard work begins.”
It’s clear that clinical applications are still years away, Manolio says. Some companies are starting to sell personalized genetic tests based on results from genome-wide association studies. But researchers hardly know what the study results mean themselves; any immediate translation into personalized medicine will naturally be problematic.
“There is a lot of missing heritability in our results right now,” Boehnke says. “If your goal [with these studies] is personalized medicine and developing your own personal genetic report card, we’re definitely not there yet. I don’t know whether we ever will be.”
Regina Nuzzo is a freelance writer based in Washington, D.C.