DNA databases are too white, so genetics doesn’t help everyone. How do we fix that?

DNA databases need diversity for genetic research to help all people

illustration of a woman walking across a bridge of DNA

A lack of diversity in genetic databases is making precision medicine ineffective for many people.

Delphine Lee

It’s been two decades since the Human Genome Project first unveiled a rough draft of our genetic instruction book. The promise of that medical moon shot was that doctors would soon be able to look at an individual’s DNA and prescribe the right medicines for that person’s illness or even prevent certain diseases.

That promise, known as precision medicine, has yet to be fulfilled in any widespread way. True, researchers are getting clues about some genetic variants linked to certain conditions and some that affect how drugs work in the body. But many of those advances have benefited just one group: people whose ancestral roots stem from Europe. In other words, white people.

Instead of a truly human genome that represents everyone, “what we have is essentially a European genome,” says Constance Hilliard, an evolutionary historian at the University of North Texas in Denton. “That data doesn’t work for anybody apart from people of European ancestry.”

She’s talking about more than the Human Genome Project’s reference genome. That database is just one of many that researchers are using to develop precision medicine strategies. Often those genetic databases draw on data mainly from white participants. But race isn’t the issue. The problem is that collectively, those data add up to a catalog of genetic variants that don’t represent the full range of human genetic diversity.

When people of African, Asian, Native American or Pacific Island ancestry get a DNA test to determine if they inherited a variant that may cause cancer or if a particular drug will work for them, they’re often left with more questions than answers. The results often reveal “variants of uncertain significance,” leaving doctors with too little useful information. This happens less often for people of European descent. That disparity could change if genetic studies included a more diverse group of participants, researchers agree (SN: 9/17/16, p. 8).

One solution is to make customized reference genomes for populations whose members die from cancer or heart disease at higher rates than other groups, for example, or who face other worse health outcomes, Hilliard suggests.

And the more specific the better. For instance, African Americans who descended from enslaved people have geographic and ecological origins as well as evolutionary and social histories distinct from those of recent African immigrants to the United States. Those histories have left stamps in the DNA that can make a difference in people’s health today. The same goes for Indigenous people from various parts of the world and Latino people from Mexico versus the Caribbean or Central or South America.

Researchers have made efforts to boost diversity among participants in genetic studies, but there is still a long way to go. How to involve more people of diverse backgrounds — which goes beyond race and ethnicity to include geographic, social and economic diversity — in genetic research is fraught with thorny ethical questions.

To bring the public into the conversation, Science News posed some core questions to readers who watched a short video of Hilliard explaining her views.

Again and again, respondents to our unscientific survey said that genetic research is important for improving medical care. But our mostly white respondents had mixed feelings about whether the solution is customized projects such as Hilliard proposes or a more generalized effort to add variants to the existing human reference genome. Many people were concerned that pointing out genetic differences may reinforce mistaken concepts of racial inferiority and superiority, and lead to more discrimination.

Why is genetics so white?

Some of our readers asked how genetic research got to this state in the first place. Why is genetic research so white and what do we do about it?

Let’s start with the project that makes precision medicine even a possibility: the Human Genome Project, which produced the human reference genome, a sort of master blueprint of the genetic makeup of humans. The reference genome was built initially from the DNA of people who answered an ad in the Buffalo News in 1997.

Although many people think the reference genome is mostly white, it’s not, says Valerie Schneider, a staff scientist at the U.S. National Library of Medicine and a member of the Genome Reference Consortium, the group charged with maintaining the reference genome. The database is a mishmash of more than 60 people’s DNA.

An African American man, dubbed RP11, contributed 70 percent of the DNA in the reference genome. About half of his DNA was inherited from European ancestors, and half from ancestors from sub-Saharan Africa. Another 10 people, including at least one East Asian person and seven of European descent, together contributed about 23 percent of the DNA. And more than 50 people’s DNA is represented in the remaining 7 percent of the reference, Schneider says. The racial and ethnic backgrounds of most of the contributors are unknown, she says.

All humans have basically the same DNA. Any two people are 99.9 percent genetically identical. That’s why having a reference genome makes sense. But the 0.1 percent difference between individuals — all the spelling variations, typos, insertions and deletions sprinkled throughout the text of the human instruction book — contributes to differences in health and disease.

Much of what is known about how that 0.1 percent genetic difference affects health comes from a type of research called genome-wide association studies, or GWAS. In such studies, scientists compare DNA from people with a particular disease with DNA from those who don’t have the disease. The aim is to uncover common genetic variants that might explain why one person is susceptible to that illness while another isn’t.
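
As a rough illustration of the comparison described above, here is a minimal Python sketch of a case-control test for one variant at a time. The variant names and allele counts are hypothetical, not drawn from any study, and the chi-square test is only one simple way to make the comparison. Real GWAS analyses also adjust for ancestry and relatedness and correct for testing millions of variants at once, all of which this sketch ignores.

```python
import math


def allelic_association(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 table of allele counts.

    Rows are cases and controls; columns are the variant (alt) allele and the
    reference allele. Assumes every row and column total is greater than zero.
    """
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    total = case_alt + case_ref + ctrl_alt + ctrl_ref
    row_sums = [case_alt + case_ref, ctrl_alt + ctrl_ref]
    col_sums = [case_alt + ctrl_alt, case_ref + ctrl_ref]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # For 1 degree of freedom, the chi-square tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical allele counts: (case_alt, case_ref, ctrl_alt, ctrl_ref)
variants = {
    "variant_A": (300, 700, 200, 800),  # more common in people with the disease
    "variant_B": (250, 750, 250, 750),  # equally common in both groups
}

results = {name: allelic_association(*counts) for name, counts in variants.items()}
for name, (chi2, p_value) in results.items():
    print(f"{name}: chi2={chi2:.2f}, p={p_value:.3g}")
```

A variant that is rare or absent in the people sampled contributes almost nothing to a table like this, so a study drawn mostly from one ancestry has little power to detect variants that matter in other groups.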

In 2018, people of European ancestry made up more than 78 percent of GWAS participants, researchers reported in Cell in 2019. That’s an improvement from 2009, when 96 percent of participants had European ancestors, researchers reported in Nature.

Most of the research funded by the major supporter of U.S. biomedical research, the National Institutes of Health, is done by scientists who identify as white, says Sam Oh, an epidemiologist at the University of California, San Francisco. Black and Hispanic researchers collectively receive about 6 percent of research project grants, according to NIH data.

“Generally, the participants who are easier to recruit are people who look like the scientists themselves — people who share similar language, similar culture. It’s easier to establish a rapport and you may already have inroads into communities you’re trying to recruit,” Oh says.

When origins matter

Hilliard’s hypothesis is that precision medicine, which tailors treatments based on a person’s genetic data, lifestyle, environment and physiology, is more likely to succeed when researchers consider the histories of groups that have worse health outcomes. For instance, Black Americans descended from enslaved people have higher rates of kidney disease and high blood pressure, and higher death rates from certain cancers than other U.S. racial and ethnic groups.

In her work as an evolutionary historian studying the people and cultures of West Africa, Hilliard may have uncovered one reason that African Americans descended from enslaved people die from certain types of breast and prostate cancers at higher rates than white people, but have lower rates of the brittle-bone disease osteoporosis. African Americans have a variant of a gene called TRPV6 that helps their cells take up calcium. Overactive TRPV6 is also a hallmark of those breast and prostate cancers that disproportionately kill Black people in the United States.

The variant can be traced back to the ancestors of some African Americans: Niger-Congo–speaking West Africans. In that part of West Africa, the tsetse fly kills cattle, making dairy farming unsustainable. Those ancestral people typically consumed a scant 200 to 400 milligrams of calcium per day. The calcium-absorbing version of TRPV6 helped the body meet its calcium needs, Hilliard hypothesizes. Today, descendants of some of those people still carry the more absorbent version of the gene, but consume more than 800 milligrams of calcium each day.

Assuming that African American women have the same dietary need for calcium as women of European descent may lead doctors to recommend higher calcium intake, which may inadvertently encourage growth of breast and prostate cancers, Hilliard reported in the Journal of Cancer Research & Therapy in 2018.

“Nobody is connecting the dots,” Hilliard says, because most research has focused on the European version of TRPV6.

One size doesn’t fit all

Some doctors and researchers advocate for racialized medicine in which race is used as a proxy for a patient’s genetic makeup, and treatments are tailored accordingly. But racialized medicine can backfire. Take the blood thinner clopidogrel, sold under the brand name Plavix. It is prescribed to people at risk of heart attack or stroke. An enzyme called CYP2C19 converts the drug to its active form in the liver.

Some versions of the enzyme don’t convert the drug to its active form very well, if at all. “If you have the enzyme gene variant that will not convert [the drug], you’re essentially taking a placebo, and you’re paying 10 times more for something that will not do what something else — aspirin — will do,” Oh says.

The inactive versions are more common among Asians and Pacific Islanders than among people of African or European ancestry. But just saying that the drug won’t work for someone who ticked the Pacific Islander box on a medical history form is too simplistic. About 60 to 70 percent of people from the Melanesian island nation of Vanuatu carry the inactive forms. But only about 4 percent of fellow Pacific Islanders from Fiji and the Polynesian islands of Samoa, Tonga and the Cook Islands, and 8 percent of New Zealand’s Maori people have the inactive forms.

Assuming that someone has a poorly performing enzyme based on their ethnicity is unhelpful, according to Nuala Helsby of the University of Auckland in New Zealand. These examples “reiterate the importance of assessing the individual patient rather than relying on inappropriate ethnicity-based assumptions for drug dosing decisions,” she wrote in the British Journal of Clinical Pharmacology in 2016.
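
To see why a single “Pacific Islander” figure misleads, consider this small Python sketch built on the carrier frequencies quoted above. Two inputs are assumptions made only for illustration: the Vanuatu value uses 0.65, the midpoint of the reported 60 to 70 percent range, and the sample sizes are hypothetical equal-sized groups.

```python
# Approximate share of people carrying inactive CYP2C19 forms, from the figures above.
carrier_freq = {
    "Vanuatu": 0.65,  # assumption: midpoint of the reported 60 to 70 percent
    "Fiji, Samoa, Tonga, Cook Islands": 0.04,
    "Maori (New Zealand)": 0.08,
}

# Hypothetical sample sizes; a different mix would shift the pooled figure.
sample_sizes = {group: 1000 for group in carrier_freq}

# Pooled frequency if all three groups are lumped under one label.
total_people = sum(sample_sizes.values())
pooled = sum(carrier_freq[g] * sample_sizes[g] for g in carrier_freq) / total_people

# How far the pooled figure sits from each group's own frequency.
errors = {g: abs(freq - pooled) for g, freq in carrier_freq.items()}

print(f"Pooled estimate: {pooled:.1%}")
for group, err in errors.items():
    print(f"{group}: own rate {carrier_freq[group]:.0%}, off by {err:.1%}")
```

Under these assumptions the pooled estimate lands far from every group it is meant to describe, and it still says nothing about whether a particular patient carries an inactive form. Only testing that patient can answer that.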

A far better approach than either assuming that ethnicity indicates genetic makeup or that everyone is like Europeans is to analyze a person’s DNA and have a precise reference genome to compare it against, Hilliard says. Deciding which genomes to create should be based on known health disparities.

“We have to stop talking about race, and we have to stop talking about color blindness.” Instead, researchers need to consider the very particular circumstances and environments that a person’s ancestors adapted to, Hilliard stresses.

What is diversity in genetics?

Recruiting people from all over the world to participate in genetic research might seem like the way to increase diversity, but that’s a fallacy, Hilliard says. If you really want genetic diversity, look to Africa, she says.

Humans originated in Africa, and the continent is home to the most genetically diverse people in the world. Ancestors of Europeans, Asians, Native Americans and Pacific Islanders carry only part of that diversity, so sequencing genomes from geographically dispersed people won’t capture the full range of variants. But sequencing genomes of 3 million people in Africa could accomplish that task, medical geneticist Ambroise Wonkam of the University of Cape Town in South Africa proposed February 10 in Nature (SN Online: 2/22/21).

Wonkam is a leader in H3Africa, or Human Heredity and Health in Africa. That project has cataloged genetic diversity in sub-Saharan Africa by deciphering the genomes of 426 people representing 50 groups on the continent. The team found more than 3 million genetic variants that had never been seen before, the researchers reported October 28 in Nature. “What we found is that populations that are not well represented in current databases are where we got the most bang for the buck; you see so much more variation there,” says Neil Hanchard, a geneticist and physician at Baylor College of Medicine in Houston.

What’s more, groups living side by side can be genetically distinct. For instance, the Berom of Nigeria, a large ethnic population of about 2 million people, has a genetic profile more similar to East African groups than to neighboring West African groups. In many genetic studies, scientists use another large Nigerian group, the Yoruba, “as the go-to for Africa. But that’s probably not representative of Nigeria, let alone Africa,” Hanchard says.

That’s why Hilliard argues for separate reference genomes or similar tools for groups with health problems that may be linked to their genetic and localized geographic ancestry. For West Africa, for example, this might mean different reference datasets for groups from the coast and those from more inland regions, the birthplace of many African Americans’ ancestors.

Some countries have begun building specialized reference genomes. China compiled a reference of the world’s largest ethnic group, Han Chinese. A recent analysis indicates that Han Chinese people can be divided into six subgroups hailing from different parts of the country. China’s genome project is also compiling data on nine ethnic minorities within its borders. Denmark, Japan and South Korea also are creating country-specific reference genomes and cataloging genetic variants that might contribute to health problems that their populations face. Whether this approach will improve medical care remains to be seen.

People often have the notion that human groups exist as discrete, isolated populations, says Alice Popejoy, a public health geneticist and computational biologist at Stanford University. “But we really have, as a human species, been moving around and mixing and mingling for hundreds of thousands of years,” she says. “It gets very complicated when you start talking about different reference genomes for different groups.” There are no easy dividing lines. Even if separate reference genomes were built, it’s not clear how a doctor would decide which reference is appropriate for an individual patient.

Discrimination worries

One big drawback to Hilliard’s proposal may be social rather than scientific, according to some Science News readers.

Many respondents to our survey expressed concern that even well-intentioned scientists might do research that ultimately increases bias and discrimination toward certain groups. As one reader put it, “The idea of diversity is being stretched into an arena where racial differences will be emphasized and commonalities minimized. This is truly the entry to a racist philosophy.”

Another reader commented, “The fear is that any differences that are found would be exploited by those who want to denigrate others.” Another added, “The idea that there are large genetic differences between populations is a can of worms, isn’t it?”

Indeed, the Chinese government has come under fire for using DNA to identify members of the Uighur Muslim ethnic group, singling them out for surveillance and sending some to “reeducation camps.”

People need a better understanding of what it means when geneticists talk about human diversity, says Charles Rotimi, a genetic epidemiologist and director of the Center for Research on Genomics and Global Health at the U.S. National Human Genome Research Institute, or NHGRI, in Bethesda, Md. He suggests beginning with “our common ancestry, where we all started before we went to different environments.” Because the human genome is able to adapt to different environments, humans carry signatures of some of the geographic locations where their ancestors settled. “We need to understand how this influenced our biology and our history,” Rotimi says.

illustration of a DNA strand made of people
Expanding DNA databases to include a broader mix of people may reveal more variants relevant to some common diseases. Delphine Lee

Researchers can work to understand the genetic diversity within our genome “without invoking old prejudices, without putting our own social constructs on it,” he says. “I don’t think the problem is the genome. I think the problem is humanity.”

Lawrence Brody, director of NHGRI’s Division of Genomics and Society, agrees: “The scientists of today have to own the discrimination that happened in the generations before, like the Tuskegee experiment, even though we’re very far removed from that.” During the infamous Tuskegee experiment, African American men with syphilis were not given treatment that could have cured the infection.

“We want the fruits of genetic research to be shared by everyone,” Brody says. It’s important to determine when genetic differences contribute to disease and when they don’t. Especially for common diseases, such as heart disease and diabetes, genetics may turn out to take a back seat to social and economic factors, such as access to health care and fresh foods, for example, or excessive stress, racism and racial biases in medical care. The only way to know what’s at play is to collect the data, and that includes making sure the data are as diverse as possible. “The ethical issue is to make sure you do it,” Brody says.

Hilliard says that the argument that minorities become more vulnerable when they open themselves to genetic research is valid. “Genomics, like nuclear fusion, can be weaponized and dangerous,” she says in response to readers’ concerns. “Minorities can choose to be left out of the genomic revolution or they can make full use of it,” by adding their genetic data to the mix.

Different priorities

Certain groups are choosing to steer clear, even as scientists try to recruit them into genetic studies. The promise that the communities that donate their DNA will reap the benefits someday can be a hard sell.

“We’re telling these communities that this is going to reduce health disparities,” says Keolu Fox, a Native Hawaiian and human geneticist at the University of California, San Diego. But so far, precision medicine has not produced drugs or led to health benefits for communities of color, he pointed out last July in the New England Journal of Medicine. “I’m really not seeing the impact on [Native Hawaiians], the Navajo Nation, on Cheyenne River, Standing Rock. In the Black and brown communities, the least, the last, the looked over, we’re not seeing the … impact,” Fox says.

That’s because, “we have a real basic infrastructure problem in this country.” Millions of people don’t have health care. “We have people on reservations that don’t have access to clean water, that don’t have the … internet,” he says. Improving infrastructure and access to health care would do much more to erase health disparities than any genetics project could right now, he says.

Many Native American tribes have opted out of genetic research. “People ask, ‘How do we get Indigenous peoples comfortable with engaging with genomics?’ ” says Krystal Tsosie, a member of the Navajo (Diné) Nation, geneticist at Vanderbilt University in Nashville, and cofounder of the Native Biodata Consortium. “That should never be the question. It sounds coercive, and there’s always an intent in mind when you frame the question that way.” Instead, she says, researchers should be asking how to protect tribes that choose to engage in genetic research.

What would you like to tell the scientists working in this area? Send your thoughts to feedback@sciencenews.org.

And issues of privacy become a big deal for small groups, such as the 574 recognized Native American tribal nations in the United States, or isolated religious or cultural groups such as the Amish or Hutterites. If one member of such a group decides to give DNA to a genetic project, that submission may paint a genetic portrait of every member of the group. Such decisions shouldn’t be left in individual hands, Tsosie says; they should be made by the community.

Hilliard says minorities’ resistance to participating in genetic research is about more than a fear of being singled out; it’s the result of being experimented on but seeing medical breakthroughs benefit only white people.

“Medical researchers just need to accomplish something that benefits somebody other than Europeans,” she says. “If Blacks or Native Americans or other underrepresented groups saw even a single example of someone of their ethnicity actually being cured of the many [common] chronic diseases and specific cancers for which they are at high risk, that paranoia would evaporate overnight.”
