Speciation and Incomplete Lineage Sorting

October 14, 2011 Tags: History of Life

Today's entry was written by Dennis Venema.

One of the challenges for discussing evolution within evangelical Christian circles is that there is widespread confusion about how evolution actually works. In this (intermittent) series, I discuss aspects of evolution that are commonly misunderstood in the Christian community. In the two previous posts, we examined how speciation is something that happens to populations. In this post, we explore why individual gene histories may not match species histories as populations diverge, and look at how these results have been misinterpreted by some members of the ID movement.

Populations and genetic diversity

One consequence of speciation being a population event is that populations have genetic diversity – not all members of the population are genetically identical. For any particular gene, then, a population may have several slightly different forms present within it. These different forms are called alleles. A fairly well-known example in humans is the set of alleles that controls ABO blood type: one allele gives rise to the A type, another to the B type, and a third to the O type. Individuals may be blood type A (either two A alleles or A + O), blood type B (either two B alleles or B + O), type AB (one A allele + one B allele) or type O (two O alleles). Any one individual can have only two alleles of this gene (one from mom, the other from dad), but as a population we collectively maintain all three. Other human genes have many more than three alleles (for example, some genes of the immune system have hundreds) despite the fact that any given individual can have at most two. The larger a population is, the more alleles of a given gene it can maintain. Smaller populations are at greater risk of losing alleles by chance (a process called genetic drift).
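
The effect of population size on allele loss can be illustrated with a toy simulation. The sketch below is a minimal Wright-Fisher-style model in Python, offered as an illustration rather than a method used in this post: each generation is formed by drawing gene copies at random, with replacement, from the previous generation. The function name `simulate_drift`, the starting allele counts, and the fixed random seed are all assumptions chosen for the example.

```python
import random


def simulate_drift(allele_counts, generations, seed=0):
    """Return the surviving alleles (and their copy numbers) after drift.

    allele_counts: dict mapping allele name -> number of gene copies.
    Each generation resamples the same number of gene copies, with
    replacement, from the previous generation (a Wright-Fisher-style
    assumption; real populations are messier).
    """
    rng = random.Random(seed)  # fixed seed so a run is repeatable
    pool = [allele for allele, n in allele_counts.items() for _ in range(n)]
    for _ in range(generations):
        pool = rng.choices(pool, k=len(pool))
    return {allele: pool.count(allele) for allele in allele_counts if allele in pool}


if __name__ == "__main__":
    small = simulate_drift({"A": 5, "B": 5, "O": 5}, generations=100)
    large = simulate_drift({"A": 5000, "B": 5000, "O": 5000}, generations=100)
    print("small population keeps:", sorted(small))
    print("large population keeps:", sorted(large))
```

In runs of this kind the small population typically ends up with a single allele, while the large one usually still carries all three; the exact outcome depends on the seed.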

Genetic diversity and speciation

The fact that populations maintain genetic diversity is important to remember when considering speciation. Speciation events are commonly represented with branching tree diagrams (“phylogenies”, or “species trees”) such as this one:

Here we see that Species 1 and Species 2 are more closely related to each other than they are to Species 3. What this says is that Species 1 and Species 2 shared a common ancestral population more recently with each other than either did with Species 3. So far, so good – but this does not mean that comparing gene sequences between these species will always group 1 & 2 together as more similar to each other than to 3. While this will be true most of the time, it is expected that some of the time this pattern will not hold. The reason is something called incomplete lineage sorting, and it has to do with the fact that populations going their separate ways carry genetic diversity with them. Let’s walk through what is going on here.

Imagine that the ancestral population of all three species (the 1,2,3 common ancestor) has four alleles of a certain gene (represented by different colors in the diagram). Each new allele originally arose from a single mutational difference introduced during DNA copying. Once there is a difference in place, two alleles can go on to acquire other differences over time, again through copying errors. As a result, alleles can be compared to each other, just like species. Alleles that separated recently will have more in common, and alleles that have been separate for longer will have acquired more differences. In this example, the blue and green alleles are more similar to each other than either is to red or orange, and vice versa. The blue and green alleles arose from one common ancestral allele, and the red and orange alleles arose from another. Further back in time, these two ancestral alleles themselves arose from one common starting allele. All four alleles will have a great deal in common (nucleotide sequences inherited from the single ancestral allele), as well as differences (for example, the red and orange alleles will share all changes that occurred between the time they split off from the blue/green lineage and when they themselves separated into two distinct alleles, and the blue and green alleles will lack those changes).

Now consider the time when the (1,2,3 common ancestor) population divides to become the (1,2 common ancestor) species and the Species 3 ancestor (the first branch in the diagram). As this population divides into two species, it is not guaranteed that all four alleles will be present in the founding population of each new species, simply by chance. Each founding population is a sample of the original population, but any given sample may omit certain alleles:

In the example above, we see that the red allele has been lost from the (1,2 common ancestor) species, and that the Species 3 ancestor has lost the blue and orange alleles. What this means is that the founding population of the (1,2 common ancestor) species didn’t have any individuals that carried the red allele, and that the Species 3 ancestor founding population didn’t have any individuals that had the blue or orange alleles. Both events happened simply by chance, because the founding populations are not representative samples of the original population.
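
How easily a founding population can miss an allele can be estimated with a back-of-the-envelope calculation. The Python sketch below assumes each gene copy in the founding group is drawn independently from a large parent population, so an allele at frequency `freq` is absent from a sample of `n_copies` gene copies with probability (1 - freq)^n_copies. The function name and the example frequency are hypothetical, not figures from any study discussed here.

```python
def prob_allele_missing(freq, n_copies):
    """Chance that an allele at frequency `freq` is absent from a founding
    sample of `n_copies` gene copies, assuming independent draws."""
    if not 0.0 <= freq <= 1.0:
        raise ValueError("freq must be between 0 and 1")
    if n_copies < 0:
        raise ValueError("n_copies must be non-negative")
    return (1.0 - freq) ** n_copies


if __name__ == "__main__":
    # Hypothetical example: an allele carried by 10% of gene copies.
    for n in (10, 50, 200):
        print(n, "gene copies ->", round(prob_allele_missing(0.10, n), 4))
```

Rare alleles and small founding groups make loss much more likely, which is the kind of situation described for the red, blue and orange alleles above.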

Later, as the (1,2 common ancestor) species separates again into Species 1 and Species 2, the same issues arise. The two founding populations may not transmit all of the genetic diversity of the (1,2 common ancestor) population:

In this case, the founding population leading to Species 1 did not include a member with the green allele, and the founding population leading to Species 2 did not include any members with either blue or orange alleles. Also, the green allele has been lost in the lineage leading to Species 3 (it became rare and was eventually not passed on due to chance).

In the present day, examining the alleles of the three modern species will reveal different levels of similarity. The blue allele is now only found in Species 1, and it is most similar to the green allele in Species 2, and less similar to the red allele in Species 3. This pattern matches the overall “species tree” pattern for these three species:

The orange allele in Species 1, however, tells a different story: it is most similar to the red allele in Species 3, and less similar to the green and blue alleles. If we knew only about the orange allele in Species 1, we might conclude that Species 1 and Species 3 are the closest relatives. This is because the “gene tree” for these alleles places orange closest with red, even though the true “species tree” reveals an overall pattern of speciation that is different:

The orange allele thus has a gene phylogeny that is said to be “discordant” with the overall species phylogeny.
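
The comparison behind these gene trees can be sketched with toy sequences. In the Python example below, the ten-letter sequences are invented for illustration (they are not real data): blue and green share one early change, red and orange share another, and each allele then carries one change of its own. Counting differences between alleles recovers the two stories described above.

```python
# Hypothetical toy sequences, invented for this illustration.
ALLELES = {
    "blue":   "CAGAAAAAAA",
    "green":  "CAATAAAAAA",
    "red":    "ACAAGAAAAA",
    "orange": "ACAAATAAAA",
}


def differences(a, b):
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length")
    return sum(x != y for x, y in zip(a, b))


def closest_allele(query, candidates):
    """Return the candidate allele name with the fewest differences from `query`."""
    return min(candidates, key=lambda name: differences(ALLELES[query], ALLELES[name]))


if __name__ == "__main__":
    # Species 1 carries blue and orange; Species 2 carries green; Species 3 carries red.
    for allele in ("blue", "orange"):
        print(allele, "is closest to", closest_allele(allele, ["green", "red"]))
```

Starting from the blue allele, Species 1 groups with Species 2; starting from the orange allele, it groups with Species 3, which is the discordant gene tree.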

How do biologists assemble species trees if gene trees can be discordant?

It might seem from the above discussion that assembling a species phylogeny from gene phylogenies is a hopeless task: after all, if any individual gene tree might be misleading, how can we be certain we have the correct species tree?

The solution is to realize that while any individual gene tree might be discordant, gene trees that match the species tree will be the most common category. In our example above, Species 1 and Species 2 share a common ancestral population for some time after the (1,2 common ancestor) and the Species 3 common ancestor populations diverge. This means that any event that happens to this population (loss of an allele, for example) will be reflected in all descendant species (in our example, Species 1 and Species 2). This common history favors gene trees that match the species tree. For a discordant tree, the ancestral (1,2) population needs to maintain two alleles, and these alleles cannot sort equally into Species 1 and 2. This can happen, but it is less likely.

What this means in practice is that biologists expect a certain pattern of gene trees when comparing related organisms. Using our three species as an example, most gene trees should match the species tree. The less likely outcome is a gene tree where an allele from Species 1 is more similar to the allele in Species 3. We can be confident we have the correct species tree because the majority of the gene trees favor one species tree over the alternatives.
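
This expectation can be made quantitative. The Python sketch below uses a standard result from coalescent theory for three species, which is not derived in this post: if the (1,2) ancestral population lasted T coalescent units (generations divided by twice the effective population size, for a diploid species), the gene tree matches the species tree with probability 1 - (2/3)e^(-T), and each of the two discordant trees appears with probability (1/3)e^(-T). The function names and the values of T are assumptions for illustration; the small simulation simply checks the formula under the same idealized model.

```python
import math
import random


def gene_tree_probabilities(t):
    """Return (P(concordant), P(each discordant tree)) for branch length t
    in coalescent units, under the idealized three-species coalescent model."""
    if t < 0:
        raise ValueError("t must be non-negative")
    each_discordant = math.exp(-t) / 3.0
    return 1.0 - 2.0 * each_discordant, each_discordant


def simulate_concordance(t, trials=20000, seed=0):
    """Estimate P(concordant) by simulation under the same model."""
    rng = random.Random(seed)
    concordant = 0
    for _ in range(trials):
        # Lineages from Species 1 and 2 coalesce at rate 1 in their shared ancestor.
        if rng.expovariate(1.0) < t:
            concordant += 1
        # Otherwise three lineages reach the older population, and each of the
        # three possible pairs is equally likely to coalesce first.
        elif rng.randrange(3) == 0:
            concordant += 1
    return concordant / trials


if __name__ == "__main__":
    for t in (0.1, 0.5, 1.0, 3.0):
        p_match, p_each = gene_tree_probabilities(t)
        print(f"T={t}: match={p_match:.3f}, each discordant={p_each:.3f}, "
              f"simulated match={simulate_concordance(t):.3f}")
```

Under this model, for any T greater than zero the matching tree is the single most common outcome, and the longer the shared history of Species 1 and 2, the rarer the discordant trees become.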

A problem for common descent?

The fact that gene phylogenies/trees and species phylogenies/trees don’t always match is not something that surprises scientists, since it is a well-known phenomenon and the mechanisms underlying it are understood: species arise from genetically diverse populations, and that diversity does not always sort completely down to every descendant species. Discordant phylogenies, however, are commonly used among Christians as a means to cast doubt on common ancestry and/or evolutionary biology as a whole. One example from the Intelligent Design movement will serve as an illustration. In a blog post discussing discordant trees found when comparing the human genome to that of other primates, Casey Luskin argues:

Since humans are typically said to be most closely related to chimps, this data conflicts with the standard supposed tree … the basic problem is that one gene (or portion of the genome) gives you one version of the tree, while another gene (or portion of the genome) gives you a very different version of the tree. This leads to discrepancies between molecule-based trees, wherein DNA data fails to provide a consistent picture of common ancestry.

In the end, molecular trees are based upon the sheer assumption that the degree of genetic similarity reflects the degree of evolutionary relatedness … Clearly this assumption fails when different genes paint contradictory pictures of evolutionary relationships.

As we have seen, these differences are the natural, expected consequence of genetic diversity from an ancestral population sorting itself incompletely into different descendant species. The data set Casey is concerned about is primate evolution, where the species tree for humans, chimpanzees, gorillas and orangutans is as follows:

In the article linked above, Casey is discussing a recent comparison of the newly-completed orangutan genome with the human genome. The availability of the orangutan genome allowed researchers to scan the human genome for locations where humans are more similar to orangutans than to chimps. These regions are rare in the human genome, and very short in length. Indeed, the researchers found a pattern: chromosome segments in humans most often match chimpanzees, and do so for thousands of nucleotide base pairs at a time, on average. Those regions that match orangutans are tiny (on average less than 100 base pairs) and rare. This is exactly what one expects from the species tree: humans and chimps are much more likely to have gene trees in common, since they more recently shared a common ancestral population (around 4-5 million years ago). Humans and orangutans, on the other hand, haven’t shared a common ancestral population in about 10 million years or more, meaning that it is much less likely for any given human allele to more closely match an orangutan allele. It is certainly possible, however, and in scanning over the entire genome rare sites that have this pattern can be found. Indeed, the authors of the paper above used previously-determined speciation times and population size estimates to predict what fraction of the human genome would be expected to match more closely with orangutans. Based on these parameters obtained in other studies, they predicted 0.9% of the human genome would have a human : orangutan gene tree. Their observed value was 0.8% - a result that provides additional support for the population size estimates and speciation times from other studies.

Why is this data interesting?

Aside from its misinterpretation by the ID movement, this sort of data actually provides us with information about the population sizes of the ancestral species that went on to give rise to orangutans, gorillas, chimpanzees and humans, as well as times for the various speciation events. I have discussed similar data for the (gorilla/chimpanzee/human) and (chimpanzee/human) common ancestor populations elsewhere; this new data confirms previous estimates of the population sizes of the various ancestral groups, and extends the analysis back to the (orangutan/gorilla/chimpanzee/human) common ancestor population with greater precision. As before, these results continue to strongly support the hypothesis that the population of the human lineage has never been as small as two individuals at any point in our evolutionary history. Indeed, these new results confirm that the human : chimp common ancestor population was large (about 50,000 members). As Darrel Falk and I have discussed here on BioLogos in the past, all methods used to date (numerous approaches, all using independent assumptions) would have to be wildly wrong (by several orders of magnitude) if indeed our species arose from just two individuals.


Dennis Venema is Fellow of Biology for The BioLogos Foundation and associate professor of biology at Trinity Western University in Langley, British Columbia. His research is focused on the genetics of pattern formation and signalling.


This article is now closed for new comments. The archived comments are shown below.

Terrance - #65509

October 14th 2011

The DI were not the only anti-evolution organisation to report on these findings. See these articles from RTB -
http://www.reasons.org/will-real-human-ancestor-please-stand
http://www.reasons.org/skunk-any-other-name
 
And this podcast - http://www.podtrac.com/pts/redirect.mp3/c450913.r13.cf2.rackcdn.com/snf20110919pf.mp3


Dennis Venema - #65564

October 17th 2011

Thanks for those links, Terrance. Yes, this material is misinterpreted by many antievolutionary groups - all the more reason to try to educate Christians on the genuine science.


beaglelady - #65521

October 14th 2011

This is off-topic (but not completely): there’s a free webcast for a symposium on human evolution at 4:30 Eastern Standard Time

http://www.nescent.org/media/NABTSymposium2011.php

Four different speakers are participating.  I’m posting it because it is relevant to many of the essays we’ve seen here on BioLogos.   Hopefully it will be available on-demand also.

Enjoy!


felsenst - #65615

October 20th 2011

(From Joe Felsenstein—for some reason my name doesn’t seem to get attached automatically to my user name)  Good post, Dennis.   As you see, coalescent trees of copies of genes predict some loci will have trees of gene copies that conflict with the species tree.  The trees that show human most closely related to orang are a small minority of loci.  We don’t have a gorilla genome sequence yet (as far as I know) but there have been studies with many individual loci examined, and they show that about 1/6 of loci have human closest to gorilla, 1/6 have chimp and gorilla as closest, and the rest, 2/3 of all loci have human closest to chimp.   It makes a fairly self-consistent picture—when the gorilla genome gets done, it should confirm that picture.  [By the way, I made a comment to this effect some days ago here.  Apologies if this is a repeat.  But when I went to look at it after it was submitted, it did not show up among the comments—but the count of the number of comments had increased.  Spooky.]


beaglelady - #65620

October 20th 2011

I noticed that the comment count is always one more than the number of visible comments.  Maybe the Intelligent Designer of the DI is hiding them.


Dennis Venema - #65631

October 20th 2011

Thanks for the comment Joe. I too noticed that the count was higher than those displayed - thanks for re-posting this. 


Menno van Barneveld - #65758

October 26th 2011

I missed in the article what misinterpretation of the descent tree was made by the ID movement. I think that the only difference with BioLogos is that ID supposes that God is at the steering wheel all the time and the Holy Spirit is making the changes whenever species have to be differentiated. This excludes evolution by chance. This last option has my preference.


beaglelady - #65759

October 26th 2011

So God is either running a dating service for critters, or doing genetic engineering on the sly?


Amused1 - #66070

November 14th 2011

One of the hypotheses is speciation by random drift. The problem is that evidence for random drift may be difficult to obtain. There is some hope in so-called “junk DNA”, which is more visible than other mutations and appears to coincide with the origin of species:


http://www.biology-direct.com/content/6/1/44

Science, for obvious reasons, must use random models as a null hypothesis. Miracles cannot be any part of the scientific explanation because science has nothing to say about whether they occur or not (they cannot be reproduced). Miracles, by definition, are rare or unique. Monod extensively commented on the relationship between science and unique events (see Chance and Necessity). Science often attempts to put unique events in the context of random models. For example, to explain the origin of our Universe that appears to “anticipate” life (anthropic principle), science contemplates a model of a randomly generated set of universes (called the multiverse). It is not obvious whether the multiverse hypothesis is falsifiable or not. Only falsifiable hypotheses can be discussed and presented as scientific. The process of falsification can help to organize scientific evidence even if the hypothesis is rejected in the end (e.g. the theory of phlogiston or aether). I am puzzled by scientists expressing “scientific” opinions about religions (Dawkins) or religious people trying to discuss science from the point of view of unfalsifiable hypotheses.


Cal King - #82731

September 28th 2013

So, the data suggests that the common ancestor of humans and chimps had a population size of 50,000.  That is a reasonable figure. However, we do have 2 fewer chromosomes than the apes, and that means the population of our ancestor at some point must have dwindled to a much smaller size, in order for the chromosomal mutation to occur and spread. That is because a reduction in chromosome number means an individual with the mutation will produce infertile young unless he or she is mated to an individual with the same mutation. That is possible only if the population is small and there is much inbreeding.

