Evolution Basics: Genomes as Ancient Texts, Part 2
Note: This series of posts is intended as a basic introduction to the science of evolution for non-specialists. You can see the introduction to this series here. In this post we examine the DNA sequences of several fruit fly species to test the hypothesis that they arose through speciation events.
In yesterday’s post, we discussed some features of what we would expect when comparing genomes between similar species, if indeed those species descended from a common ancestral population. Returning to our “book” analogy for genomes, we made the point that the first thing to look for would be the overall structure of two genomes proposed to be descended from a common ancestral genome: are the “chapters,” “paragraphs,” etc., in the same order? Do they use the same “sentences”? And so on. In other words, do the genomes of any existing species look like they are slightly modified copies of each other?
The answer to this question from modern comparative genomics is an emphatic yes. What we see when we compare genomes of species that we suspect to be relatives is very much that they do indeed look like copies of each other. In some cases, the match between two genomes can be in excess of 95%, DNA monomer for DNA monomer, for the bulk of the two genomes. Not only do they have the same genes, but they have them in the same order on their chromosomes, with each chromosome in the two species having a match in the other species. Imagine finding a book that was 95% identical to another manuscript, with chapters, paragraphs and sentences all in the same order, with only small differences between them – that is the sort of impression one gets when comparing genomes between some species.
One group that has been analyzed in some detail is the fruit flies of the genus Drosophila (pronounced “Dro-SOF-i-la”). Scientists have now determined the complete genome sequences for twelve Drosophila species and compared them with one another. Some species have nearly identical genomes, exactly as one would predict if they were once the same species with a common genome. While it’s not possible to show large amounts of DNA sequence here, let’s examine a small fragment of one “sentence” (i.e. gene) in three species of fruit flies (Drosophila yakuba, Drosophila simulans, and Drosophila sechellia), all known to be distinct fly species:
The first impression one gets when looking over the sequences is that they are nearly identical. This is not unusual for these three species – indeed, this pattern applies to every gene they have (and they all have the same genes). The second thing one notices, however, is that there are rare differences – in this small snippet, two of the species (D. simulans and D. sechellia) have an “a” in the fourth position of this gene, where D. yakuba has a “t”.
Think back to the analogy that we used yesterday – that of book printings and typos shared between them. These sequences in these three species are very much like the hypothetical printings in our analogy, and we can make sense of the pattern we observe in much the same way:
(Now, astute readers will also note that the other possibility is that the “a” at the fourth position is the original text, that the “t” is the typo, and that an (a to t) mutation happened once on the lineage leading to D. yakuba. If you’re wondering about this option, well done. The trick to deciding what the original text was is to look at as many copies as possible, and in this case, when we look at a large number of additional Drosophila species, we see a “t” in the fourth position in most other species, and an “a” only in D. simulans and D. sechellia. This means the most parsimonious choice is that the “t” is the original, and the “a” is a mutation.)
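For readers who like to see the bookkeeping, the parsimony reasoning above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the branching order written into `tree` is assumed for the example, and the two `other_species` entries are placeholders standing in for the additional Drosophila species (they are not real data). The function `min_changes` is a name invented here; it counts the fewest mutations needed to explain the observed letters for each possible original letter.

```python
# Letters observed at the fourth position, as described in the text.
# The two "other_species" entries are PLACEHOLDERS (not real data) standing
# in for the additional Drosophila species that carry a "t".
site = {
    "D. simulans": "a",
    "D. sechellia": "a",
    "D. yakuba": "t",
    "other_species_1": "t",
    "other_species_2": "t",
}

# ASSUMED branching order, written as nested pairs (illustration only).
tree = (((("D. simulans", "D. sechellia"), "D. yakuba"),
         "other_species_1"), "other_species_2")


def min_changes(node, leaf_states, alphabet="acgt"):
    """For each letter, return the fewest mutations needed below `node`
    if the ancestor at `node` carried that letter."""
    if isinstance(node, str):  # a present-day species
        return {s: (0 if s == leaf_states[node] else float("inf"))
                for s in alphabet}
    costs = {s: 0 for s in alphabet}
    for child in node:
        child_costs = min_changes(child, leaf_states, alphabet)
        for s in alphabet:
            # A change along the branch to the child costs one mutation.
            costs[s] += min(child_costs[c] + (0 if c == s else 1)
                            for c in alphabet)
    return costs


costs = min_changes(tree, site)
print("mutations needed if the original was 't':", costs["t"])
print("mutations needed if the original was 'a':", costs["a"])
```

With these placeholder inputs, an original “t” needs fewer mutations than an original “a”, which is what “most parsimonious” means here. Adding more species that carry a “t” only widens the gap.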
While this is only a small example, it illustrates what scientists observe when comparing the genomes of species they suspect to be relatives based on other criteria (such as morphology). What they see is precisely what one would expect if indeed speciation events had occurred to produce the species in question: nearly identical genomes, with small changes shared between some species.
Identity beyond what’s necessary for function at the DNA level
A further observation that supports the hypothesis that these sequences are copies of an ancestral sequence is that the level of identity (matching of sequence) between them is greater than it needs to be, even when the function of the gene is considered. Let’s return to the gene fragment that we were just examining. This sequence, as the start of a gene, codes for a protein with the same function in all three species. (If you need a refresher on how genes are made up of DNA monomers that are eventually translated into a sequence of amino acid monomers, you can refer to two prior posts in this series, here and here.) In these species this protein has the following sequence for the first eight monomers (amino acids):
As you can see, the second amino acid is different between the two sequences, but the other amino acids are identical. What is important for our purposes here is to note that there are many, many ways to write this “sentence” and arrive at the same meaning (sequence of amino acids). This is possible because for most amino acids, there are several DNA monomer combinations (of three nucleotides) that produce the same amino acid when translated. For example, the sequence in D. yakuba could also be written as follows (among many other options):
This sequence is quite different from what we see in the other two species:
In this case, only 14 of the 27 DNA monomers match, an identity of only about 52%. What we observe between these species, however, is that 26 out of the 27 monomers match (over 96% identity). In other words, it would be possible for these two genes to be much less identical at the DNA level, and still have the same amino acid sequences that we observe in the two species. Yet what we see when we compare the two genes is that they match at the DNA level much more than they need to in order to have the same amino acid sequence. A simple explanation is that the two sequences match because they are copied from the same original sequence.
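The arithmetic behind these percentages, and the idea that one protein “sentence” can be spelled in different ways, can be sketched in Python. Note the assumptions: the two DNA strings below are made up for illustration (they are not the fly sequences discussed above), and the helper names `translate` and `percent_identity` are invented for this example. The codon table is the standard genetic code.

```python
# Standard genetic code, built from the usual T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate DNA into one-letter amino acid codes, three bases at a time."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore an incomplete final codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def percent_identity(seq_a, seq_b):
    """Percentage of positions at which two equal-length sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be the same length")
    matches = sum(x == y for x, y in zip(seq_a, seq_b))
    return 100 * matches / len(seq_a)


# MADE-UP 27-letter sequences (not real fly data): two different spellings
# of the same nine amino acids.
version_1 = "atgtcagctaaggttgaacctgatcgc"
version_2 = "atgagcgccaaagtcgagcccgacaga"

print(translate(version_1) == translate(version_2))  # same protein "sentence"
print(f"{percent_identity(version_1, version_2):.1f}% identical at the DNA level")

# The figures quoted in the text:
print(f"14 of 27 matching: {100 * 14 / 27:.1f}%")
print(f"26 of 27 matching: {100 * 26 / 27:.1f}%")
```

The made-up pair encodes exactly the same amino acids while differing at many DNA positions, which is the point of the comparison: identical function does not require near-identical DNA.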
Identity beyond what’s necessary for function at the amino acid level
A second observation that supports the hypothesis that the D. yakuba, D. simulans and D. sechellia gene sequences are in fact copies follows from examining other fly species that are less closely related to these three. All Drosophila species examined to date have this gene, but in more distant relatives the sequence can be somewhat different. For example, this gene in D. mojavensis has the following DNA sequence:
Again, some of the sequence remains identical in all four species (supporting the hypothesis that this sequence is also a copy, but with more changes), but now we see greater differences. Despite these differences, however, the D. mojavensis version of the gene is perfectly functional and does the exact same job as the gene in the other three more closely related species.
So, these observations indicate that there is no biological need for nearly identical genes at the amino acid level, or even at the DNA level, in different species. Numerous amino acid sequences, and even numerous DNA sequences, are equally capable of performing the same function. Yet, what we see time and again (across whole genomes!) are nearly identical genes, with a few (often shared) differences – exactly what speciation events would be expected to produce.
What about “common design”?
One question I am frequently asked when presenting this sort of data is that of “common design” as an alternative explanation. In other words, could these sorts of patterns be explained as separately created species that do not share ancestry, but rather were designed (created) to have the same (or similar) genes because those genes need to have the same (or similar) functions?
We have already seen the basic problems with this line of argument – that genes (and entire genomes) of similar species match much more than they need to – and that the differences we see in closely related species are arranged in exactly the pattern one would predict if speciation events had produced them.
Of course, most Christians don’t lose sleep over the possibility that numerous fruit fly species arose over time through multiple speciation events. As we have mentioned previously, even most Young Earth Creationists accept speciation events such as these. What is more contentious, of course, is the question of whether the pattern of shared ancestry extends to our own species. In the next post in this series, we’ll examine this question by comparing the human genome to the genomes of our proposed nearest living relatives – the great apes.