This series of posts is intended as a basic introduction to the science of evolution for non-specialists. You can see the introduction to this series here [https://biologos.org/blog/evolution-basics-a-new-introductory-course-on-evolutionary-biology]. In this post we examine how variation at the DNA level translates into variation in protein structure and expression.
Yesterday, we discussed how DNA replication is readily facilitated by its structure, since one half of the DNA double helix can serve as a template for making the other half. We also discussed how DNA, though well-suited for its hereditary role, is not at all suited to performing cellular functions—but that proteins fill these roles. With these details covered, we’re now ready to discuss how the hereditary information in DNA is converted to the functional diversity that we see in proteins—and how variation plays a part in this process. The first step in this discussion requires us to look into how chromosomes and genes work.
Molecular Genetics 102: Chromosomes and Genes
Humans have 46 chromosomes in each of their cells, and they come in pairs. We receive one of each pair as a set of 23 chromosomes from each parent: eggs contain 22 non-sex chromosomes plus an X chromosome, and sperm contain 22 non-sex chromosomes plus either an X or a Y chromosome. Each chromosome is one long DNA double helix, with millions of DNA base pairs. Our largest chromosomes have about 250 million base pairs, and our smallest about 50 million. Taken together, the human genome has about 3 billion DNA base pairs in each set of 23 chromosomes, or a total of about 6 billion if you count both sets.
Distributed on these 23 chromosome pairs are genes, the units of biological function encoded within our DNA. Like all good concepts in biology, exactly what constitutes a “gene” is “fuzzy,” but for our purposes we will define a gene as a sequence of chromosomal DNA base pairs that is used to make a functional, non-DNA product. Humans have about 20,000 genes, and they can be quite spread out on chromosomes, with a lot of non-gene DNA in between them. If we represent a chromosome as a solid black line (as is common in many genetics textbooks), we can “zoom in” to see the features of one of its many genes. In this case, this is a gene that makes a protein product:
First off, we can see that the parts of the gene that are used to specify the protein amino acid sequence (the blue boxes) are only one part of the whole. Other sequences (such as those represented by the light blue lines and the red boxes) direct certain cell types to make this protein, and determine how much of it to make. All of the sequences represented as boxes are made into what is called “messenger RNA”, or “mRNA”, which is a sort of single-stranded version of DNA. The mRNA is only as long as the gene sequence, and the sequences that intersperse the sections coding for the protein structure (so-called “introns”, which can be seen in the above figure) are often spliced out of it. This mRNA “working copy” of the gene is then used to direct the synthesis of the protein through a process called translation.
If this all seems a little complex, don’t worry: for our purposes here, it’s enough to recognize that a gene (a) is a small section of a much longer DNA molecule (i.e. a chromosome), (b) has some sequences that determine the sequence of the protein it encodes (i.e. the order of its amino acids), and (c) has other regulatory sequences that are not part of the protein code itself, but function instead as signals that tell cells when and where the protein should be made, or “expressed.”
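For readers who like to see ideas in code, here is a minimal Python sketch of the "working copy" idea. Everything in it is made up for illustration: the segment labels, the tiny sequences, and the function name `spliced_mrna` are assumptions, and real genes are thousands of base pairs long. It also simplifies by leaving the regulatory DNA out of the mRNA entirely and by reading the mRNA straight off one DNA strand, swapping T for U.

```python
# Toy gene layout. Every sequence and label below is hypothetical.
gene = [
    ("regulatory", "TATAAA"),      # signal DNA, not part of the protein code
    ("exon", "ATGGCC"),            # protein-coding section
    ("intron", "GTAAGTTTTCAG"),    # interspersed section that gets spliced out
    ("exon", "AAATGA"),            # protein-coding section
]


def spliced_mrna(segments):
    """Join the exons in order and write them in RNA letters (U instead of T)."""
    coding_dna = "".join(seq for kind, seq in segments if kind == "exon")
    return coding_dna.replace("T", "U")


mrna = spliced_mrna(gene)
gene_length = sum(len(seq) for _, seq in gene)

print(mrna)                    # AUGGCCAAAUGA
print(len(mrna), gene_length)  # 12 30
```

The point to notice is simply that the working copy is shorter than the stretch of chromosome it came from: the intron and the regulatory signal are part of the gene, but not part of the message that gets translated.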
With these details in mind, now consider how variation at the DNA level can affect chromosome structure. As we saw yesterday, when chromosomes are copied, DNA copying errors may occur. Not surprisingly, many types of mutation events can also impact the function of genes, and ultimately the characteristics of the organism:
Single base pair mutations: mispairing of bases during DNA copying can lead to chromosome copies that differ from the original by one base pair (as we saw yesterday). These so-called “point mutations” can occur inside genes (in either regulatory DNA or protein-coding DNA) or in the sequences between genes. Single base pair changes in protein-coding DNA may have no effect on the protein at all (since there are often different DNA sequences that produce the same sequence of amino acids, a feature known as “redundancy” of the genetic code). Other changes may alter the amino acid sequence by substituting one amino acid for another, but still have no effect on the function of the protein (since many protein functions can be accomplished by slightly different protein sequences). Other changes might reduce or even remove protein function. Still other changes might improve protein function, for example by giving the protein better enzymatic activity.
Changes in regulatory DNA are also possible, and the effects of these changes can similarly be neutral, harmful or beneficial. What is interesting about regulatory DNA is that small changes can have quite large effects on where and when a protein is made, and changes that alter key genes that function early in development can have significant downstream effects on the organism as a whole. We’ll examine this in some detail in future posts in this series.
Deletion events: sometimes, stretches of DNA can be lost during chromosome replication due to breakage and rejoining events. Some deletions affect only a few base pairs, but others can span thousands of base pairs. Parts of genes, or even entire genes, can be lost, and genes flanking the deletion are brought closer together. As we have seen for point mutations, deletions can have no effect, a detrimental effect, or even a beneficial effect depending on the specific event. For example, sometimes a deletion removes a regulatory sequence that shuts down gene expression in certain cells. Removing this sequence allows the gene to be expressed where it was not expressed before, which again could be neutral, harmful or beneficial, depending on the circumstances.
Duplication events: this is the opposite of a deletion, where a portion of a chromosome’s sequence is doubled and the two copies end up side-by-side. As with deletions, duplications can be small, or thousands of base pairs long, spanning numerous genes, and they can similarly be neutral, harmful or beneficial.
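The “redundancy” of the genetic code mentioned in the list above is easy to see in a small Python sketch. The codon table built here is the standard genetic code; the 15-letter example gene, the chosen mutation positions, and the helper names `translate` and `point_mutation` are assumptions made up for illustration. Note that the code can only tell us whether the amino acid sequence changes. Whether a changed protein works better, worse or the same is something the sequence alone does not reveal.

```python
from itertools import product

# Standard genetic code, written compactly: codons are enumerated in
# T, C, A, G order and paired with one-letter amino acid codes ("*" = stop).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Read the DNA three letters at a time until a stop codon is reached."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def point_mutation(dna, position, new_base):
    """Return a copy of dna with the single base at `position` replaced."""
    return dna[:position] + new_base + dna[position + 1:]


original = "ATGGCCAAAGGTTGA"                 # hypothetical mini-gene
silent = point_mutation(original, 5, "T")    # GCC -> GCT, both code for A
changed = point_mutation(original, 6, "G")   # AAA -> GAA, K becomes E

print(translate(original))  # MAKG
print(translate(silent))    # MAKG
print(translate(changed))   # MAEG
```

Both mutant sequences differ from the original by exactly one base pair, yet only one of them alters the protein.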
One common mechanism that produces duplications and deletions simultaneously occurs during recombination in the cells that lead to eggs or sperm. You might recall that “crossing over” is the term used to describe the physical breakage and rejoining of chromosomes to “mix and match” sequences between chromosome pairs during the cell divisions that lead to gametes (i.e. meiosis). Normally, chromosomes pair up for this exchange by lining up their (nearly identical) sequences, followed by precise breakage and rejoining:
What can happen, at a low frequency, is that chromosome pairs don’t align their sequences correctly. The alignment is based on the same sequences on each chromosome finding each other and binding to each other. Mistakes can happen because of repetitive sequences between genes—sequences that “trick” the chromosomes into thinking they’ve found their correct sequence alignment, when in fact there are two loops of unpaired sequence, one on each chromosome. If a crossover occurs between these loops, the result is one chromosome with a duplication, and the other with a deletion:
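The misalignment just described can be mimicked with a few lines of Python. This is only a sketch under simplifying assumptions: each chromosome is a list of labelled blocks rather than a DNA sequence, the two members of the pair are treated as identical, and the names `crossover`, `gene1`, `repeat` and so on are invented for the example.

```python
def crossover(chrom_a, chrom_b, break_a, break_b):
    """Break each chromosome after the given index and swap the tails."""
    recombinant_1 = chrom_a[:break_a + 1] + chrom_b[break_b + 1:]
    recombinant_2 = chrom_b[:break_b + 1] + chrom_a[break_a + 1:]
    return recombinant_1, recombinant_2


# A toy chromosome: three genes separated by two copies of a repetitive sequence.
chromosome = ["gene1", "repeat", "gene2", "repeat", "gene3"]

# Correct alignment: the first repeat pairs with the first repeat,
# so both chromosomes break at the same place.
normal = crossover(chromosome, chromosome, 1, 1)

# Misalignment: the first repeat on one chromosome pairs with the
# second repeat on the other, so the break points no longer match.
with_deletion, with_duplication = crossover(chromosome, chromosome, 1, 3)

print(normal[0])         # ['gene1', 'repeat', 'gene2', 'repeat', 'gene3']
print(with_deletion)     # ['gene1', 'repeat', 'gene3']
print(with_duplication)  # ['gene1', 'repeat', 'gene2', 'repeat', 'gene2', 'repeat', 'gene3']
```

With correct alignment nothing is gained or lost. With the repeats mispaired, one product has lost gene2 and the other carries two copies of it, which is the simultaneous deletion and duplication described above.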
Of course, this list of mutation types is not exhaustive (for example, we have seen how autonomous, parasitic DNA elements called transposons can insert into chromosomes, disrupting functions, or contributing to new ones).
Summing up: constancy and change
Taken together, these mechanisms introduce variation into populations, and since that variation is in DNA, the variation is heritable. Variation at the chromosome level may influence the function of genes, and ultimately traits at the level of the organism. Changes at the DNA level that do cause meaningful variation at the organismal level are available for natural selection to act on—and we have already seen certain examples of selected mutations, such as the duplication of amylase genes in humans and in dogs. Other mutations, of course, are selected against, and may be removed from populations over time. The properties of DNA as both an agent of constancy and heritable change mean that populations are not entirely genetically stable: they can change over time, though the features of DNA that make it a largely accurate transmitter of information ensure that those changes will likely be subtle ones at the level of the organism.
As we will see in the next post in this series, this genetic instability can put separate populations of the same species on different trajectories, and allow differences to accrue that ultimately lead to new species forming.