This series of posts is intended as a basic introduction to the science of evolution for non-specialists. You can see the introduction to this series here. In this post we examine how variation arises and is passed on in populations through DNA copying errors.
How organisms reproduce “after their own kind” (to borrow the language from Genesis) is a longstanding question in biology. A closely related question arises from the observation that within a “kind,” not all individuals are the same: variation exists within populations of the same species. For many years, the mechanism that could explain both the observed constancy of a species (faithful reproduction of the form of an organism) and variation (not all members of a species are identical) remained a mystery. To shed some light on these important issues for evolutionary biology, we need to take some time to explore the “nuts and bolts” of how two important biological molecules work, and how they relate to one another: deoxyribonucleic acid (DNA) and proteins.
Molecular Genetics 101: Proteins and DNA
You might be surprised to learn that early work in exploring the molecular basis for genetics favored proteins as the hereditary molecule instead of DNA. It was suspected that whatever was acting as a hereditary molecule would be large and complex, and proteins were both. Proteins can be very long, since they are a polymer of smaller, repeating components (monomers). We can use children’s interlocking bricks to illustrate what we mean. For bricks, each individual piece is a monomer, and when they’re snapped together, they form a polymer:
Proteins are built in much the same way. For proteins, the monomers are a group of compounds called amino acids (each amino acid is one monomer). Like the bricks in our analogy, they have features in common that allow them to be “snapped together” into a long chain. They also have significant differences, analogous to the different colors in the diagram: some amino acids are hydrophobic (i.e., they are repelled by water), others are hydrophilic (i.e., attracted to water). Some are large and bulky, others are comparatively small, and so on. Unlike the rigid bricks in our analogy, proteins are marvelously flexible, and fold up into a three-dimensional shape, as directed by the properties of the monomers.
There are 20 different amino acids that are used to make proteins, and they can be combined in any sequence to produce a protein with specific properties. Those properties arise from the combination and specific order of amino acids, and the final shape they give to the protein. This diversity in monomers means that there are many, many different possibilities for protein sequences (and thus shapes, and functions): even a polymer only two monomers in length has 400 possible sequences (i.e., 20², or 20 × 20), and proteins can be thousands of amino acids long. It was this possibility for large-scale complexity that suggested that proteins might have enough “storage capacity” to hold hereditary information and pass it on to the next generation.
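To get a feel for how quickly these possibilities grow, here is a small sketch in Python. The function name is our own invention for illustration; the only fact it relies on is the count of 20 amino acids given above.

```python
# Count how many different sequences a polymer of a given length can have.
AMINO_ACIDS = 20  # number of different monomers available for proteins


def sequence_count(alphabet_size, length):
    """Return the number of possible sequences: alphabet_size ** length."""
    return alphabet_size ** length


print(sequence_count(AMINO_ACIDS, 2))  # 400, the 20 x 20 case from the text
print(sequence_count(AMINO_ACIDS, 3))  # 8000

# For a modest protein of 100 amino acids, print how many digits the count has.
digits = len(str(sequence_count(AMINO_ACIDS, 100)))
print(digits)  # 131
```

Each extra monomer multiplies the number of possible sequences by 20, so even a short protein of 100 amino acids has a number of possible sequences that is 131 digits long.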
Beginning in the late 1920s, however, research began to point away from proteins and towards DNA as the hereditary molecule. DNA, like proteins, is a polymer formed from a set of monomers (in this case, nucleotides). In contrast to the 20 monomers found in proteins, DNA has only four monomers: compounds abbreviated as A, C, G and T. It was for this reason that researchers were initially skeptical that such a “simple” polymer could act as a source of hereditary information.
Despite this skepticism, evidence continued to mount that DNA was in fact the physical basis for hereditary information. Once this evidence convinced the majority of scientists, the race was on to understand exactly how DNA accomplished this remarkable task. Soon, it became clear that understanding the structure of DNA was crucial to understanding its function, and several research groups famously competed to be the first to decipher it.
Determining the structure of DNA did indeed shed light on its function. Though it has only four monomers, the structure of DNA revealed how it can easily replicate and pass information on: not only is DNA a long polymer, it is a polymer that can specify its own replication through interactions between its monomers. Perhaps a picture would help explain. Imagine bricks that now have “partners” they are attracted to. We’ll represent that attraction, which is a type of chemical bond called a hydrogen bond, with a black dot. The “A” and “T” monomers are attracted with two hydrogen bonds, and the “C” and “G” monomers with three:
These “attraction pairings” between monomers are important: they allow one DNA polymer to act as a template for a second, “complementary” DNA polymer. Imagine a DNA sequence as follows:
As the second DNA polymer is made, monomers are selected, one at a time, to match their “partners” in the first polymer:
These two polymers are held together by the alignment of many hydrogen bonds, and you are likely familiar with them as the “two strands” of the DNA double helix:
While this more realistic model of DNA shows the precise details of its molecular structure, the important features are summarized by our simple “toy brick” model. DNA is a pair of long polymers that can be separated and used to make new copies that are faithful to the original.
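The toy-brick model translates almost directly into a few lines of Python. The sketch below is only an illustration: the example sequence is made up, the function names are our own, and the two strands are simply lined up position by position, which ignores the fact that the strands of real DNA run in opposite directions.

```python
# Pairing rules from the toy-brick model: A pairs with T, and C pairs with G.
PARTNER = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Hydrogen bonds per pair, as described above: two for A:T, three for C:G.
BONDS = {"A": 2, "T": 2, "C": 3, "G": 3}


def complementary_strand(template):
    """Build the partner strand by choosing each monomer's partner in turn."""
    return "".join(PARTNER[base] for base in template)


def hydrogen_bonds(strand):
    """Count the hydrogen bonds holding a strand to its partner strand."""
    return sum(BONDS[base] for base in strand)


def replicate(top, bottom):
    """Separate the two strands and use each one as a template for a new partner."""
    return [
        (top, complementary_strand(top)),        # old top strand, new bottom strand
        (complementary_strand(bottom), bottom),  # new top strand, old bottom strand
    ]


top = "ATGCCGTA"  # hypothetical example sequence
bottom = complementary_strand(top)
print(top)                   # ATGCCGTA
print(bottom)                # TACGGCAT
print(hydrogen_bonds(top))   # 20

copies = replicate(top, bottom)
print(copies[0] == copies[1] == (top, bottom))  # True: both copies match the original
```

Because each monomer has exactly one partner, either strand carries enough information to rebuild the other, and that is why both copies come out identical to the original pair.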
While these features of DNA readily explain how it is faithfully copied, recall that we also need to explain variation. Variation, in the most basic terms, means there is sometimes imperfection in the copying process. If DNA is indeed the hereditary molecule, and if DNA copying were 100% accurate, then variation would never arise, and all offspring would be genetically identical to their parents. Without variation, recombination would have no effect (since there would be no variation to mix into new combinations).
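A small simulation can make this point concrete. The sketch below is a deliberately simplified model of our own: each individual is just one short, made-up sequence, every individual leaves two copies per generation, and the error rate used in the second run is hugely exaggerated compared with real DNA copying so that the effect shows up in a tiny example.

```python
import random

BASES = "ACGT"


def copy_sequence(sequence, error_rate, rng):
    """Copy a sequence; each position is miscopied with probability error_rate."""
    copied = []
    for base in sequence:
        if rng.random() < error_rate:
            # A copying error: substitute one of the three other monomers.
            copied.append(rng.choice([b for b in BASES if b != base]))
        else:
            copied.append(base)
    return "".join(copied)


def distinct_variants(ancestor, generations, error_rate, seed=0):
    """Return how many different sequences exist after the given generations."""
    rng = random.Random(seed)  # fixed seed so that a run can be repeated
    population = [ancestor]
    for _ in range(generations):
        # Every individual leaves two offspring copies.
        population = [
            copy_sequence(parent, error_rate, rng)
            for parent in population
            for _offspring in range(2)
        ]
    return len(set(population))


ancestor = "ATGCCGTAATGCCGTAATGCCGTA"  # hypothetical starting sequence
print(distinct_variants(ancestor, 8, 0.0))   # 1: perfect copying, no variation
print(distinct_variants(ancestor, 8, 0.05))  # more than 1: copying errors create variants
```

With an error rate of zero, all 256 descendants are identical to the ancestor. Any nonzero error rate eventually produces variants, and lower rates simply take more copies to do so.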
There are many ways that variation can enter during the DNA copying process, and in a future post we will examine several of them. One way that we will consider now is simple “mispairing” of monomers during replication. At a certain (very low) frequency, inappropriate monomers are paired together. The arrow in the figure below shows one such mismatched pair, where a red monomer (G) on the bottom strand was incorrectly paired with a yellow monomer (T) when the top strand was made. When this set is replicated, both the top and bottom strands are copied, but now the correct partners for each monomer are found. The result is two different outcomes: one copy now has the original, correct C:G pair (on the left), and the other has a new variant, with an A:T pair (on the right). This change will be faithfully copied from here on, since later copies don’t “know” what the original sequence was. The result is a new variant in the population.
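The same sequence of events can be traced in a short Python sketch. As before, this is an illustration under our own assumptions: the sequence is made up, the function names are hypothetical, and the position of the mistake is chosen by hand rather than arising at random.

```python
# Pairing rules from the toy-brick model: A pairs with T, and C pairs with G.
PARTNER = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complementary_strand(template):
    """Build the partner strand using the correct pairing rules."""
    return "".join(PARTNER[base] for base in template)


def copy_with_mispair(template, position, wrong_base):
    """Copy a template, but place wrong_base opposite the given position."""
    new_strand = list(complementary_strand(template))
    new_strand[position] = wrong_base
    return "".join(new_strand)


def replicate(top, bottom):
    """Copy both strands accurately, giving two (top, bottom) pairs."""
    return [
        (complementary_strand(bottom), bottom),  # copy made from the bottom strand
        (top, complementary_strand(top)),        # copy made from the top strand
    ]


bottom = "TACGGCAT"  # hypothetical template; position 3 (counting from 0) is a G
top = copy_with_mispair(bottom, 3, "T")  # a T is wrongly paired with that G
print(top)     # ATGTCGTA
print(bottom)  # TACGGCAT

left, right = replicate(top, bottom)
print(left)   # ('ATGCCGTA', 'TACGGCAT'): the original C:G pair at position 3
print(right)  # ('ATGTCGTA', 'TACAGCAT'): a new T:A pair at position 3
```

Note that the second round of copying makes no mistakes at all. It faithfully copies the strand carrying the wrong monomer, and that is what turns a one-time mismatch into a lasting variant.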
Taken together, the properties of DNA match what we observe in nature: faithful reproduction of form, but not perfect reproduction of form. At its base, constancy and heritable variation in biological populations trace back to how DNA functions.
What about proteins?
While the properties of DNA make it a great hereditary molecule (that nonetheless allows for variation to arise), DNA itself is not capable of performing the day-to-day functions that organisms need (enzyme functions, structural functions, and so on). For these functions, the vast structural diversity of proteins is required. In the next post in this series, we’ll discuss how the hereditary information in DNA is transferred to protein structure and function, and how variation in DNA can cause variation at the protein level.