Figuring out how many genes are in the human genetic instruction manual, or genome, isn’t as easy as scientists once thought. The very definition of a gene has changed since the completion of the Human Genome Project more than 15 years ago.
Genes used to be defined as stretches of DNA that contain instructions that are copied into RNA and then turned into proteins. Researchers still don’t entirely agree on how many of these protein-coding genes there are. Estimates range from 19,901 to a new count of 21,306 published August 20 in BMC Biology.
But in the last decade, researchers have learned that not all genes produce proteins. Many scientists have expanded the definition of a gene to include ones that make RNAs that, instead of being turned into proteins, have other functions in the cell.
Tallies of RNA-producing genes (also called noncoding genes) are even more up in the air than those of protein-coding genes, says Steven Salzberg, a biostatistician at Johns Hopkins University who headed the new count. His team has already found more of these RNA genes than protein-coding ones: 25,525, including 18,484 long noncoding RNA, or lncRNA, genes (SN: 12/17/11, p. 22). And that figure doesn’t include microRNAs and other recently discovered small RNAs.
Even without the small RNAs, Salzberg’s new total of human genes comes to at least 46,831: 21,306 protein-coding genes plus 25,525 noncoding ones. Other scientists have debated the estimate, and Salzberg says, “I will not be surprised if 10 years from now, we still don’t have an agreed-upon number.”