As recently as the 1980s, scientists collected genetic data by laboriously tracking the diffusion of DNA molecules through slabs of gel. Now researchers stand by as machines gush billions of letters of DNA code a day, and struggle to cope with the data deluge.
In the past two decades, the speed of sequencing has leapt from around 10,000 bases per day per machine to more than 1 billion (above, blue). Since the introduction of high-throughput machines in the mid-2000s, costs have plummeted (above, orange) and the price of sequencing a whole human genome (above, gold) has tumbled almost to the long-anticipated figure of $1,000.
The biggest expense in sequencing a human genome now is the cost of storing it, says Scott Kahn, chief information officer of Illumina, a San Diego biotech company specializing in high-throughput sequencing. Someday soon, he says, it may be cheaper to resequence a person’s genome each time the data are needed than to store the information in silico.
First bacterial genome sequenced (Haemophilus influenzae)
Size: 1.8 megabases
First fungus genome sequenced (Saccharomyces cerevisiae: yeast)
Size: 12.1 megabases
First animal genome sequenced (Caenorhabditis elegans: nematode)
Size: 97 megabases
First plant genome sequenced (Arabidopsis thaliana: weed related to mustard)
Size: 115.4 megabases
First human genome draft sequence (Homo sapiens)
Size: 3.2 gigabases
The Human Genome Project is declared complete.
The ENCODE project begins. Funded by the NIH, it is designed to find all of the functional elements of the human genome.
Final version of the human genome sequence published.
Human Microbiome Project begins. The project surveys the genomes of microbes living throughout the human body.
Human Microbiome Project completed
Size: 3.5 terabases