DNA and RNA are long linear polymers, called nucleic acids, that carry information in a form that can be passed from one generation to the next. These macromolecules consist of a large number of linked nucleotides, each composed of a sugar, a phosphate, and a base. Sugars linked by phosphates form a common backbone, whereas the bases vary among four kinds. Genetic information is stored in the sequence of bases along a nucleic acid chain. The bases have an additional special property: they form specific pairs with one another that are stabilized by hydrogen bonds. The base pairing results in the formation of a double helix, a helical structure consisting of two strands. These base pairs provide a mechanism for copying the genetic information in an existing nucleic acid chain to form a new chain. Although RNA probably functioned as the genetic material very early in evolutionary history, the genes of all modern cells and many viruses are made of DNA. DNA is replicated by the action of DNA polymerase enzymes. These exquisitely specific enzymes copy sequences from nucleic acid templates with an error rate of less than 1 in 100 million nucleotides.
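The copying mechanism that base pairing provides can be illustrated with a short Python sketch. It assumes the standard DNA pairing rules (A with T, G with C), which the paragraph above does not spell out, and it ignores strand orientation for simplicity. The names `PAIRS` and `complementary_strand` are hypothetical, chosen only for this illustration.

```python
# Standard DNA base-pairing rules (assumed here): A pairs with T, G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complementary_strand(template: str) -> str:
    """Return the bases that pair, position by position, with `template`.

    Simplification: strand orientation is ignored, so the result lists
    the partner of each base in the same left-to-right order.
    """
    return "".join(PAIRS[base] for base in template.upper())


template = "ATGCCGTA"
new_strand = complementary_strand(template)

# Copying the copy regenerates the original sequence, which is why
# base pairing can carry information from one chain to the next.
regenerated = complementary_strand(new_strand)
```

Because each base has exactly one partner, the sequence of one strand fully determines the other, and a second round of copying recovers the original.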
Genes specify the kinds of proteins that are made by cells, but DNA is not the direct template for protein synthesis. Rather, the templates for protein synthesis are RNA (ribonucleic acid) molecules. In particular, a class of RNA molecules called messenger RNA (mRNA) serves as the information-carrying intermediate in protein synthesis. Other RNA molecules, such as transfer RNA (tRNA) and ribosomal RNA (rRNA), are part of the protein-synthesizing machinery. All forms of cellular RNA are synthesized by RNA polymerases that take instructions from DNA templates. This process of transcription is followed by translation, the synthesis of proteins according to instructions given by mRNA templates. Thus, the flow of genetic information, or gene expression, in normal cells is:
DNA → (transcription) → RNA → (translation) → protein
This flow of information is dependent on the genetic code, which defines the relation between the sequence of bases in DNA (or its mRNA transcript) and the sequence of amino acids in a protein. The code is nearly the same in all organisms: a sequence of three bases, called a codon, specifies an amino acid. Codons in mRNA are read sequentially by tRNA molecules, which serve as adaptors in protein synthesis. Protein synthesis takes place on ribosomes, which are complex assemblies of rRNAs and more than 50 kinds of proteins.
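As a rough illustration of how codons are read, the following Python sketch transcribes a short DNA sequence and translates the resulting mRNA three bases at a time. It rests on several assumptions: the input is the coding strand of the DNA (so transcription amounts to replacing T with U), reading starts at the first base, and `CODON_TABLE` holds only a small subset of the 64 codons of the standard genetic code. The function names are hypothetical.

```python
# A small subset of the standard genetic code (illustrative, not complete).
CODON_TABLE = {
    "AUG": "Met",
    "UUU": "Phe", "UUC": "Phe",
    "GGU": "Gly", "GGC": "Gly", "GGA": "Gly", "GGG": "Gly",
    "AAA": "Lys", "AAG": "Lys",
    "UGG": "Trp",
    "UAA": "Stop", "UAG": "Stop", "UGA": "Stop",
}


def transcribe(coding_dna: str) -> str:
    """Return the mRNA matching a DNA coding strand (T becomes U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> list[str]:
    """Read codons sequentially from the first base until a stop codon."""
    peptide = []
    # Step through the message three bases at a time; a trailing
    # incomplete codon is ignored.
    for i in range(0, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]  # KeyError if codon not in subset
        if residue == "Stop":
            break
        peptide.append(residue)
    return peptide


mrna = transcribe("ATGTTTGGCAAATAA")
peptide = translate(mrna)
```

Here `mrna` is `"AUGUUUGGCAAAUAA"` and `peptide` is `["Met", "Phe", "Gly", "Lys"]`. In the cell the lookup is performed by tRNA adaptors on the ribosome; the dictionary merely stands in for that machinery.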
The last theme to be considered is the interrupted character of most eukaryotic genes, which are mosaics of nucleic acid sequences called introns and exons. Both are transcribed, but introns are cut out of newly synthesized RNA molecules, leaving mature RNA molecules with continuous exons. The existence of introns and exons has crucial implications for the evolution of proteins.
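The removal of introns can be pictured as joining the exon segments of a newly synthesized transcript. The Python sketch below is a toy model under stated assumptions: the exon coordinates are given in advance as half-open (start, end) index pairs, and the sequences are invented for illustration. It does not model how the cell recognizes intron boundaries, and `splice` is a hypothetical helper name.

```python
def splice(primary_transcript: str, exons: list[tuple[int, int]]) -> str:
    """Join the exon segments of a transcript, discarding everything else.

    `exons` holds half-open (start, end) index pairs in transcript order.
    """
    return "".join(primary_transcript[start:end] for start, end in exons)


# Invented example: two exons separated by a single 12-base intron.
exon1, intron, exon2 = "AUGGCC", "GUAAGUUUUCAG", "AAAUAA"
primary = exon1 + intron + exon2

# Exon 1 occupies positions 0-5 and exon 2 occupies positions 18-23.
mature = splice(primary, [(0, 6), (18, 24)])
```

The mature RNA, `"AUGGCCAAAUAA"`, contains the two exons as one continuous sequence, whereas the primary transcript contains both exons and the intron.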