This series of posts is intended as a basic introduction to the science of evolution for non-specialists. You can see the introduction to this series here. In this post we compare gene sequences between humans and other mammals to test the hypothesis that they are modified copies derived from an ancestral genome.
In the last post, we saw that at a large scale of organization, the human genome has the features it should have if indeed we share ancestry with other great apes. Continuing with our “book” analogy, we now turn to comparing these texts at a finer level of detail – that of sentences and words.
Comparing genomes at the “sentence” level
In a previous post, we compared the DNA sequences of a gene found in a number of species of Drosophila. Such comparisons are also possible using DNA sequences from mammals (including humans and other primates), and the pattern they produce is by now familiar:
As we saw for the Drosophila sequences, this gene is nearly identical across a number of species. Specifically, the human sequence and the sequences of three other primates (chimpanzees, gorillas and orangutans) differ by only a handful of DNA monomers (at the most, 4 of the 90 are different). Also, as we have seen before, there is no biological need for these sequences to be this identical – in fact, even for this small region of this gene, there are over 53 million (!) different ways to code for the exact same amino acid sequence. Most of those sequences are much more different from the human sequence than the nearly identical sequences we observe in other primates. To carry the point further, there is also no particular biological need for this gene to have the exact amino acid sequence we see shared among primates. In other organisms (such as dogs and wolves) a slightly different sequence performs the same task equally well.
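The number of alternative codings comes from the redundancy of the genetic code: most amino acids can be specified by more than one codon, so the number of DNA sequences that encode a given protein segment is the product of the codon counts for each of its amino acids. The sketch below illustrates that arithmetic in Python. The function name and the example peptide are hypothetical, chosen only for illustration. The peptide is not the gene segment discussed above, so its result is not expected to match the 53 million figure.

```python
from math import prod

# Number of codons that specify each amino acid in the standard genetic code
# (one-letter amino acid symbols; the three stop codons are excluded).
CODONS_PER_AMINO_ACID = {
    "L": 6, "S": 6, "R": 6,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "I": 3,
    "C": 2, "D": 2, "E": 2, "F": 2, "H": 2,
    "K": 2, "N": 2, "Q": 2, "Y": 2,
    "M": 1, "W": 1,
}


def count_synonymous_codings(peptide: str) -> int:
    """Return how many distinct DNA sequences encode the given peptide."""
    return prod(CODONS_PER_AMINO_ACID[aa] for aa in peptide.upper())


# Hypothetical 10-amino-acid peptide (30 DNA monomers), for illustration only.
example_peptide = "MKTLSGWAVD"
print(count_synonymous_codings(example_peptide))
```

Even this short, made-up peptide can be coded 36,864 different ways. Because the count multiplies with every added amino acid, a 30-amino-acid segment (90 DNA monomers) easily reaches the millions.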
Of course it is not possible to show DNA alignments of large swaths of DNA sequence in this format. This small gene segment, however, is representative of genes (and even whole genomes) among primates. A detailed comparison of all gene sequences between humans and chimpanzees, for example, reveals that they are 99.4% identical across 1.85 x 10^7 (roughly 18 million) DNA monomers. Note that the regions of the genome that code for genes are a tiny minority of the whole: humans and chimpanzees each have over 3.0 x 10^9 (3 billion) DNA monomers in their genomes. Of these 3 billion monomers, 2.7 billion align with each other with only a 1.23% difference between them.
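To get a feel for what these percentages mean in absolute terms, the short Python sketch below converts them into approximate counts of differing monomers. It simply takes the figures quoted above at face value; the function and variable names are hypothetical, introduced only for this illustration.

```python
def expected_differences(aligned_monomers: int, percent_different: float) -> int:
    """Approximate count of differing monomers among the aligned positions."""
    return round(aligned_monomers * percent_different / 100)


# Figures quoted in the text above.
gene_monomers = 18_500_000        # 1.85 x 10^7 monomers of gene sequence
gene_percent_different = 0.6      # 100% minus 99.4% identity
genome_monomers = 2_700_000_000   # 2.7 billion aligned monomers
genome_percent_different = 1.23

gene_diffs = expected_differences(gene_monomers, gene_percent_different)
genome_diffs = expected_differences(genome_monomers, genome_percent_different)

print(f"gene sequences: about {gene_diffs:,} differing monomers")
print(f"aligned genome: about {genome_diffs:,} differing monomers")
```

On these numbers, the gene sequences differ at roughly 111,000 positions and the aligned genome at roughly 33 million positions, against billions of positions that match.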
In short, when comparing DNA sequences between humans and other primates, we see exactly the pattern we would predict based on shared ancestry – a pattern consistent with slight modifications to an ancestral genome.
Looking for typos
In a previous post in this series, we discussed how DNA replication is a highly accurate process, but not a perfect one. These two features of DNA replication mean that mutations can occur in genes when they are copied, and that future copies made from a mutated template will faithfully pass that mutation on (at least, until a second mutation occurs at the same location to change things once again). What this means is that gene sequences can persist in genomes for a long time after they are mutated to a non-functional sequence if there is no selective disadvantage to losing the function in question. (If a mutation does result in a disadvantage, then natural selection will tend to remove it from its population, as we have discussed previously.)
One such example involves a gene that codes for an enzyme (L-gulonolactone oxidase, or “GULO”) that is required for the synthesis of vitamin C in mammals. Most mammals make their own vitamin C from other compounds in their diet, and the GULO gene is necessary for the last step in the process that converts a vitamin C precursor to the final product. As we have seen for other genes, the sequence for this gene is conserved between mammals – it has a nearly identical sequence that is maintained through natural selection. For example, a portion of this gene in cows, dogs and rats has the following sequence (with differences from the cow sequence outlined in black):
In all three of these species, this gene is functional, and all three can make their own vitamin C without obtaining it directly from their diet.
Humans, of course, cannot make their own vitamin C – we get scurvy if we do not obtain vitamin C from our diet. This atypical situation (for a mammal) is shared by other great apes, and for the same reason. Though these species have some of the DNA sequence for the GULO gene, it has numerous mutations in it that render the gene unable to make a functional enzyme product. The same region of the GULO gene shown in the above figure has the following sequences in humans, chimpanzees and orangutans (now with differences from the human sequence outlined in black):
Once again we notice that the primate sequences are nearly identical to one another. One new feature to note here, however, is that these three copies of the GULO gene are non-functional in part because they have a deletion mutation – the removal of one DNA monomer (highlighted in yellow in the primate sequences). This deletion mutation is identical in all three species, providing evidence that it is a “shared typo” copied from a prior text – or, in biological terms, a deletion mutation that happened once in the common ancestor of humans, chimpanzees and orangutans, and was then inherited by all three species. Dogs, cows and rats, however, branched off from the lineage leading to primates before this deletion event occurred:
The loss of GULO function does not seem to have been a selective disadvantage for primates at the time – likely because they had a diet rich in vitamin C. Indeed, even for humans, this loss is not a serious problem unless one finds oneself without a source of vitamin C for a prolonged period of time.
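The “shared typo” reasoning above can be sketched in a few lines of Python: given aligned sequences, look for positions where every member of one group has a missing monomer (written as `-`) and no species outside the group does. This is only a minimal illustration. The sequences below are invented toy data, not the real GULO sequences, and the function name is hypothetical.

```python
def shared_gap_columns(alignment: dict[str, str], group: set[str]) -> list[int]:
    """Return columns where every species in `group` has a gap ('-')
    and no species outside `group` does."""
    length = len(next(iter(alignment.values())))
    columns = []
    for col in range(length):
        with_gap = {name for name, seq in alignment.items() if seq[col] == "-"}
        if with_gap == group:
            columns.append(col)
    return columns


# Invented toy alignment for illustration only (NOT the real GULO sequences).
toy_alignment = {
    "human":      "ACGT-GCTA",
    "chimpanzee": "ACGT-GCTA",
    "orangutan":  "ACGT-GTTA",
    "cow":        "ACGTAGCTA",
    "dog":        "ACGTAGCCA",
    "rat":        "ACATAGCTA",
}
primates = {"human", "chimpanzee", "orangutan"}

print(shared_gap_columns(toy_alignment, primates))
```

In this toy alignment the only such position is column 4 (counting from zero), where all three primates are missing a monomer that the cow, dog and rat retain. A single deletion in a common ancestor accounts for that pattern with one event rather than three independent, identical ones.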
The nose knows
As interesting as the GULO example is (and it is an example I have discussed in more detail in another context), it is but one of many examples of shared, identical mutations found in the human genome and other primate genomes. One study that examined shared primate mutations in detail investigated mutations in genes devoted to the sense of smell. These genes code for olfactory receptors, which are proteins found on the membranes of cells in the nasal epithelium in mammals. Olfactory receptors do their job by binding to compounds in the air, changing shape in the process, and signaling that change in shape to the nervous system in what we perceive as the sense of smell. The combined action of numerous olfactory receptors acting in concert is what gives any given smell its distinctive features. Mammals dedicate a disproportionate amount of their genome to olfactory receptor genes, most likely because such genes are so useful for finding food, finding mates, and in general perceiving one’s environment. Despite their usefulness, these genes can also be mutated and lost – and indeed, the human genome shows that our species has lost several due to mutation. As with the GULO gene, however, these defective olfactory gene sequences persist in recognizable form. What is more important for our purposes is the pattern these mutated genes form when compared to other primate genomes. As we first introduced with our copied book analogy, we expect to find some typos that are shared between texts, and other typos that are unique to one edition. For defective olfactory genes, we observe precisely these two categories – shared mutations, and unique mutations:
As you can see from the diagram above, humans share the most identical olfactory gene mutations with chimpanzees, fewer with gorillas, and fewer still with orangutans. Of the 12 mutations that are identical between humans and chimpanzees, 9 are also identical with gorillas, and 6 with orangutans. These shared mutations and the pattern we find them in are easily explained through shared ancestry, as indicated in red on the diagram above. The mutations unique to a given species are also easily explained as arising after populations separate (in blue).
It’s also important to note what we do not see when comparing these mutations between primates. We do not observe identical mutations between humans and gorillas, for example, unless we also see the exact same mutation in chimpanzees. This makes perfect sense if the common ancestral population of humans and gorillas is also the common ancestral population of humans, gorillas and chimpanzees. Likewise, if we observe identical mutations shared between humans and orangutans, we can predict with confidence that we will observe these exact mutations in gorillas and chimpanzees – and in fact we do. This pattern of shared mutations is precisely what one would predict if it were in fact produced by shared ancestry – with nothing out of place.
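One simple way to express “nothing out of place” is to record, for each shared mutation, the set of species that carry it, and then check that any two such sets are either nested (one inside the other) or completely separate. The Python sketch below applies that check to the shared-mutation counts given above (12 shared by humans and chimpanzees, 9 of those also in gorillas, 6 also in orangutans). The function and variable names are hypothetical, and the human-gorilla mutation added at the end is an invented counterexample, not something observed.

```python
from itertools import combinations


def is_tree_compatible(carrier_sets: list[frozenset[str]]) -> bool:
    """True if every pair of carrier sets is nested or disjoint."""
    return all(
        a <= b or b <= a or a.isdisjoint(b)
        for a, b in combinations(carrier_sets, 2)
    )


HC = frozenset({"human", "chimpanzee"})
HCG = HC | {"gorilla"}
HCGO = HCG | {"orangutan"}

# Shared mutations as described in the text: 6 found in all four species,
# 3 more in humans, chimpanzees and gorillas, and 3 more in humans and
# chimpanzees only (12 in total shared by humans and chimpanzees).
observed = [HCGO] * 6 + [HCG] * 3 + [HC] * 3

shared_with_chimp = sum("chimpanzee" in s for s in observed)
shared_with_gorilla = sum("gorilla" in s for s in observed)
shared_with_orangutan = sum("orangutan" in s for s in observed)
print(shared_with_chimp, shared_with_gorilla, shared_with_orangutan)
print(is_tree_compatible(observed))

# Invented counterexample: a mutation shared by humans and gorillas but
# absent from chimpanzees would break the nested pattern.
out_of_place = observed + [frozenset({"human", "gorilla"})]
print(is_tree_compatible(out_of_place))
```

The observed sets pass the check, while the invented human-gorilla mutation fails it. That is the sense in which the pattern is a real prediction: shared ancestry would be called into question by mutations of the second kind, and they are not what we find.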
Multiple lines of evidence, one conclusion
In the next post in this series, we’ll circle back and discuss how the multiple lines of genomics evidence for human evolution that we’ve examined cohere into a mutually supportive pattern.