By mixing soapy water, oil and the theory of information, a physicist has found a possible clue to the origin of the genetic code, as well as to the structure of other biochemical languages.
Life’s workhorse molecules are made from only 20 different types of amino acids, encoded in the chemical makeup of DNA. In principle, DNA could code for about three times that many: its three-letter code words allow 64 possible combinations. Comparing the genetic code with the physics of soapy water suggests an explanation for why nature chose 20 as an optimal number, Tsvi Tlusty of the Weizmann Institute of Science in Rehovot, Israel, reports in an upcoming issue of the Proceedings of the National Academy of Sciences.
Genes are segments of DNA that encode instructions for constructing the molecules, primarily proteins, needed to build and operate cells. Each gene is a long sequence of “letters” — A, C, T and G — symbols for the chemical bases adenine, cytosine, thymine, and guanine. Each three-letter combination specifies an amino acid. But the code is redundant, meaning that sometimes different triplets represent the same amino acid — for example, CAA and CAG both represent glutamine.
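The counting behind these numbers can be checked with a short Python sketch. It assumes the standard genetic code table written in DNA letters; the names `BASES`, `AMINO_ACIDS` and `CODON_TABLE` are illustrative choices, not anything from Tlusty's work. In this table, three of the 64 triplets act as stop signals (marked `*`) rather than naming an amino acid.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one symbol per triplet, with triplets listed in
# TCAG order (TTT, TTC, TTA, TTG, TCT, ...). "*" marks a stop signal.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map every three-letter combination to the one-letter amino acid symbol.
CODON_TABLE = {
    "".join(triplet): meaning
    for triplet, meaning in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Amino acids actually used, ignoring the stop signal.
distinct_amino_acids = set(CODON_TABLE.values()) - {"*"}

print(len(CODON_TABLE))                        # 4 * 4 * 4 = 64 triplets
print(len(distinct_amino_acids))               # 20 amino acids
print(CODON_TABLE["CAA"], CODON_TABLE["CAG"])  # Q Q (both glutamine)
```

Because 64 triplets map onto only 20 amino acids plus a stop signal, most amino acids are necessarily spelled by more than one triplet.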
The genetic code presumably evolved from the diverse and chaotic chemistry of the Earth’s primordial broth. Before settling on the 20 standard amino acids, the developing code faced opposing pressures. Organisms with a more complex molecular language, one using more than 20 types of amino acids, could have deployed a wider range of chemical combinations to adapt to environmental changes. But organisms with simpler chemistry required less molecular machinery and energy, Tlusty explains. And using fewer amino acids lessens the effect of random errors in copying genetic information: If several triplets have the same meaning, there’s a good chance that changing one letter will have no consequences.
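The last point, that a one-letter change often leaves the meaning intact, can be made concrete with a rough Python sketch. It assumes the standard genetic code table and treats every single-letter substitution as equally likely, which is a simplification; the function name `count_silent_changes` is hypothetical and this is not Tlusty's own calculation.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG triplet order; "*" marks a stop signal.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(triplet): meaning
    for triplet, meaning in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_silent_changes(table):
    """Count single-letter substitutions that leave a triplet's meaning unchanged.

    Returns (silent, total), where total covers every triplet, every
    position, and every alternative letter at that position.
    """
    silent = 0
    total = 0
    for codon, meaning in table.items():
        for position in range(3):
            for base in BASES:
                if base == codon[position]:
                    continue  # not a change
                mutant = codon[:position] + base + codon[position + 1:]
                total += 1
                if table[mutant] == meaning:
                    silent += 1
    return silent, total


silent, total = count_silent_changes(CODON_TABLE)
print(f"{silent} of {total} single-letter changes keep the same meaning")
```

A hypothetical code that gave each of the 64 triplets its own meaning would score zero silent changes on the same count, which is the trade-off the paragraph describes.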