Dennis Venema
 on May 16, 2013

From Variation to Speciation

…in small populations, drift can have a large impact on allele frequencies from one generation to the next. In large populations, natural selection predominates...

Part 6 of 22 in Evolution Basics

In the last post in this series, we examined how DNA variation arises as chance events, such as base-pair mismatches, duplications and deletions. In order to understand how this variation may (eventually) contribute to a speciation event, we need to discuss how variation spreads within a population. First, we need a small amount of vocabulary to facilitate the discussion: specifically, we need to explain the distinction between a gene and an allele.

As a geneticist, I pull my hair out at times when reading popular media reports of scientific matters. One of my biggest pet peeves is the use of the word “gene” in the sense of saying that an individual “has the gene” for a specific trait. We have already described a gene as a section of DNA sequence on a chromosome that contributes to a function of some kind, usually by coding for a protein product. In this sense, humans all have the same genes (or very nearly so) – the 20,000 or so sequences that make us who we are biologically. What we don’t have, however, are identical genes – there are differences that arise through the copying errors that we have discussed. These differences are called alleles. You can think of an allele as a “version” or “flavor” of a gene. Mutation events don’t usually create new genes (though they can through duplication). Usually, new alleles are created. In a prior post, we used children’s toy bricks to illustrate how a new variant could arise through a DNA base-pairing mistake during chromosome replication:

In this example, we have a sequence that, through a copying error, becomes two slightly different versions of what is (almost) the same sequence.  These differences would be called two distinct alleles, and if they affect the function of a gene, they might have a noticeable effect at the level of the whole organism. When the media talks about the “gene” for this or that trait, what they actually mean is the allele for a given trait—the specific variant of a gene that is correlated with a specific medical condition, for example.

Selection and drift

So, DNA variation is all about the production of new alleles—but what happens to these alleles over time within a population? Obviously, when a new allele arises, it is present in only one individual. If it is to make an impact on the population as a whole, it needs to spread to other individuals by being passed on to offspring. In this way, variation can enter a population and then become more common over time. There are a number of factors that can influence this process. If the population size is small, then chance alone can increase (or decrease) the frequency of an allele in a population—an effect known as genetic drift. Since drift can be a major player in how allele frequencies change over time in a population, it’s worth taking some time to discuss it in some detail.

Drift is essentially a non-representative sampling event. Consider a small population of sexually reproducing organisms that we can represent as rectangles, each containing two alleles (the squares, with the colors representing allele differences). We can represent their “reproduction” as a 50:50 chance of passing on either allele to their offspring in the next generation. (Note: each “passing” event is independent of any other – for example, there is no mechanism to guarantee that any individual would pass down both of their alleles if they reproduced twice.) For the breeding pair on the left, one parent has two yellow alleles, and the other has a blue allele and a yellow allele. When they reproduce, by chance the parent with both alleles passes on only the yellow allele to both offspring. For the breeding pair on the right, the parent with both alleles also passes on the yellow allele twice, and their blue allele not at all. These chance events shift the frequency of the two alleles quite significantly within one generation:

Now imagine that the offspring pair up to mate, and that once again we have, by chance, a slightly non-representative sampling to form the next generation:

The point is that this small population is prone to large fluctuations in the frequencies of the blue or yellow alleles because it is so small. This small size means that chance events within even one breeding pair have a large impact on the population as a whole. In the next generation, for example, the blue allele might be lost completely—and once it has disappeared, it will be absent until it either arises again through a new mutation event, or enters the population through an individual migrating in from a population where it is still present.

For large populations, however, the situation is rather different. Imagine a population with 1000 individuals, with a total of 1000 yellow alleles and 1000 blue alleles randomly distributed among the individuals. When this population reproduces, it is very unlikely to vary much from this 50:50 ratio from one generation to the next. In this large population, chance events within one breeding pair are a very small proportion of the population as a whole, and on average, the population will reflect the 50:50 probability of any allele being passed on.
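The contrast between small and large populations can be sketched with a short simulation. What follows is a minimal model of our own choosing (random sampling of alleles each generation, often called a Wright-Fisher model), not a reconstruction of the diagrams above; the function names, the population sizes of 4 and 1000, and the random seed are all illustrative assumptions.

```python
import math
import random


def next_generation(p, n_individuals, rng):
    """Draw 2N alleles at random from parents whose blue-allele frequency is p."""
    n_alleles = 2 * n_individuals
    blue = sum(1 for _ in range(n_alleles) if rng.random() < p)
    return blue / n_alleles


def simulate_drift(p0, n_individuals, generations, seed=0):
    """Return the blue-allele frequency in every generation, starting from p0."""
    rng = random.Random(seed)  # seeded so that a run can be repeated
    freqs = [p0]
    for _ in range(generations):
        freqs.append(next_generation(freqs[-1], n_individuals, rng))
    return freqs


def expected_step(p, n_individuals):
    """Typical (standard deviation) one-generation change under random sampling."""
    return math.sqrt(p * (1 - p) / (2 * n_individuals))


if __name__ == "__main__":
    for n in (4, 1000):  # hypothetical small and large populations
        trajectory = simulate_drift(0.5, n, generations=20)
        print(n, round(expected_step(0.5, n), 3), [round(f, 2) for f in trajectory])
```

Starting from a 50:50 ratio, the typical one-generation swing under this model works out to about 0.18 for 4 individuals and about 0.011 for 1000. Note also that once a frequency reaches 0 or 1 in the simulation it stays there, which mirrors the point that a lost allele returns only through new mutation or migration.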

So, how can allele frequencies change in large populations, when drift is largely impotent? We have already seen one mechanism that can accomplish this: natural selection. Natural selection is simply the effect that individuals who possess a certain allele reproduce more frequently than individuals who do not have that allele. Over time, this skewing of the probability of reproducing increases the frequency of the selected allele in the population. For dogs early in the domestication process, duplication of amylase genes happened as a one-time, chance mutation event. Dogs that carried the duplicated amylase allele reproduced at a slightly greater frequency than dogs without it, since the duplication allele allowed dogs to derive more nutrition from the food they were receiving from their new environment (i.e., human sources). Over time, the duplicated allele became so frequent in the dog population that the ancestral, non-duplicated allele was lost altogether. At this point the new allele was “fixed” in the population: it had a frequency of 100%.
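To see how even a slight reproductive edge adds up, here is a minimal sketch under simplifying assumptions of our own: each individual is treated as carrying a single copy of the gene, and carriers of the favored allele leave (1 + s) offspring for every one left by non-carriers. The advantage s = 0.05 and the starting frequency are hypothetical values chosen for illustration, not measurements from dogs.

```python
def select(p, s):
    """One generation of selection on an allele at frequency p.

    Carriers reproduce (1 + s) times as often as non-carriers.
    """
    return p * (1 + s) / (1 + p * s)


def generations_to_reach(p0, s, target):
    """Count the generations needed for the favored allele to reach `target`."""
    if not (0 < p0 < 1 and 0 < target < 1 and s > 0):
        raise ValueError("need 0 < p0 < 1, 0 < target < 1 and s > 0")
    p, count = p0, 0
    while p < target:
        p = select(p, s)
        count += 1
    return count


if __name__ == "__main__":
    # Hypothetical: a rare allele (1 in 1000) with a 5% reproductive advantage.
    print(generations_to_reach(0.001, 0.05, 0.99))
```

In this idealized model the frequency climbs toward 100% but never quite arrives; in a real, finite population, chance eventually removes the last copies of the ancestral allele.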

To sum up: in small populations, drift can have a large impact on allele frequencies from one generation to the next. In large populations, natural selection predominates, and drift has little impact. Both of these mechanisms can contribute to changing allele frequencies over time within populations, and as such both can be factors that contribute to speciation events.

Changing allele frequencies and speciation

Speciation is the production of two species from a common ancestral population. (Now, we have already discussed how defining “species” is a fuzzy concept, and it is the fact that they arise slowly and incrementally that makes them challenging to define.)

One way to understand how speciation starts is to consider two populations of the same species that, for whatever reason, stop interbreeding with each other – perhaps through geographic isolation. While a geographic “barrier” has nothing at all to do with genetic differences or reproductive compatibility, if such a barrier is in place, then alleles that arise in one population will not be transferred to the other population. Additionally, if two populations are not exchanging alleles, then allele frequencies in the two populations are no longer tied to each other and averaged between them. This means that drift and selection will now act independently on the two populations. Once uncoupled, the two populations may then follow different trajectories – one population may start out small, and be dominated by drift until it increases in size. The other population may remain large, and be subject to natural selection in ways the other population is not. Over a long period of time, the two populations may become genetically different enough that they form two distinct species. The key, of course, is the nature of the barrier preventing exchange of alleles between populations. In the next post in this series, we’ll examine how such barriers can form between populations.


Previously, we made a number of points worth summarizing here:

New alleles arise as unique events in individuals, but may become common in their population through various processes, including genetic drift and natural selection.

New alleles, should they become common in a population, may shift the average characteristics of that population.

If exchange of alleles between two populations of the same species is blocked or reduced, then average characteristics of the two populations may diverge from each other.

Given enough time, these processes may lead to differences between the two groups that are significant enough to establish them as distinct species.

With these points in hand, we are now ready to have a closer look at the various ways that genetic exchange between populations can be reduced or eliminated. We’ll start by looking at the simplest case, that of complete geographic isolation.

Geographic barriers

Geographic separation of two populations of the same species is a rapid and effective way of stopping the exchange of alleles between them. At the point of separation, the two populations are, of course, fully capable of interbreeding biologically, but prevented from doing so by physical separation. One example of geographic isolation leading to speciation that we have discussed already is the various species of finches that Darwin observed on the Galapagos Islands off the coast of South America. The original finch population of the Galapagos was founded by a small group of birds that arrived on the islands from the South American mainland, most likely blown there during a storm. These birds, as a population, were biologically cut off from the source population on the mainland, since the Galapagos are hundreds of miles offshore. Once separated from the larger population, the smaller “founding” group no longer received new alleles from it, nor passed new alleles that arose back to it. Despite being two populations of the same species, they were now genetically sealed off from one another, and differences in allele frequencies began to accrue between them. These differences led to changes in average characteristics over time, and ultimately to the formation of new species.

The founder effect

In many cases, this process of accumulating differences gets a head start right at the point of separation, due to a phenomenon known as the “founder effect.” A small founding population is very often a non-representative sample of the genetic diversity of the source population. For example, consider a hypothetical population with 36 individuals. Each individual carries two alleles of a given gene, and there are four different alleles of this gene in the population (represented by the four colors):

Note that the yellow allele is the most common, followed by the blue allele. The purple and red alleles are comparatively uncommon in this population. In fact, their rarity means that it would be unlikely for this population to have an individual with two red alleles in the next generation, for example. In order to have such an individual, two parents who were “carriers” of the red allele would have to mate, and both pass the red allele on to their offspring. This is not impossible, but in this population it would be unlikely.

Now suppose that a few members of this population start a new population on an isolated island. Only six individuals start the new population, and the alleles that they carry are not a perfect representation of the allele frequencies of the larger, source population (approximate frequencies are shown for the source population and the new “founding” population):

We can see that for the common alleles, the yellow allele has increased in frequency, and the blue allele has decreased in frequency. Despite these differences, the common alleles are reasonably similar in frequency to the source population. The rare alleles, however, have had larger shifts: the red allele is now much more common in the newly founded population, and the purple allele has been lost entirely.
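The sampling behind the founder effect can be sketched in a few lines. The allele counts below are hypothetical stand-ins for a source population of 36 individuals (72 alleles); they are not read off the diagram, and the helper names and the random seed are likewise our own illustrative choices.

```python
import random
from collections import Counter

# Hypothetical allele counts for 36 individuals (72 alleles); not taken from the figure.
SOURCE_COUNTS = {"yellow": 36, "blue": 24, "purple": 6, "red": 6}


def build_population(counts, rng):
    """Shuffle the allele pool and pair alleles off into two-allele individuals."""
    pool = [allele for allele, n in counts.items() for _ in range(n)]
    rng.shuffle(pool)
    return [(pool[i], pool[i + 1]) for i in range(0, len(pool), 2)]


def allele_frequencies(individuals):
    """Return each allele's share of all allele copies carried by the group.

    Alleles that no one in the group carries are simply absent from the result.
    """
    tally = Counter(allele for pair in individuals for allele in pair)
    total = sum(tally.values())
    return {allele: n / total for allele, n in tally.items()}


def found_colony(individuals, n_founders, rng):
    """Pick the founders at random, without replacement."""
    return rng.sample(individuals, n_founders)


rng = random.Random(2013)  # arbitrary seed, so that a run can be repeated
source = build_population(SOURCE_COUNTS, rng)
founders = found_colony(source, 6, rng)
print("source:  ", allele_frequencies(source))
print("founders:", allele_frequencies(founders))
```

Running this with different seeds gives a feel for how often a rare allele is lost outright, or jumps in frequency, when only six individuals (twelve allele copies) start the new population.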

Now, these changes are subtle, and changes for one gene would not likely be enough to precipitate a speciation event between the two populations. These sorts of changes could, however, be significant in the long run. Consider the red allele in the newly founded population. As this population increases in number, it will be much more likely to have individuals with two red alleles crop up in this population than in the original source population. If this genetic combination has a selective advantage, then natural selection will be able to act on it in the new population. In the source population, however, this genetic combination is much less likely, largely preventing natural selection from acting on this allele combination. Over time, the red allele could come to dominate the new population, but remain rare in the source population. Additionally, it is likely that the environment will be somewhat different for these two populations, leading to differences in natural selection. What might be a winning allele combination for the mainland might not be as suitable for the island environment, and vice versa. A second issue is that the newly founded population, like any small population, is much more subject to genetic drift than the larger source population. The red allele might increase in frequency in the new population simply due to chance alone, and not due to the action of natural selection.
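The point about individuals with two red alleles can be made concrete with a little arithmetic. If mates pair at random (an assumption), the chance that an offspring receives the red allele from both parents is the red allele's frequency multiplied by itself. The two frequencies below are hypothetical, chosen only to show the size of the effect.

```python
def homozygote_probability(q):
    """Chance that an offspring of random mating carries two copies of an
    allele whose frequency in the population is q."""
    return q * q


source_q = 0.05   # hypothetical red-allele frequency in the source population
founder_q = 0.25  # hypothetical red-allele frequency in the founded population

print(homozygote_probability(source_q))
print(homozygote_probability(founder_q))
print(homozygote_probability(founder_q) / homozygote_probability(source_q))
```

With these numbers, a fivefold rise in the allele's frequency makes the two-red combination roughly 25 times more likely, which is why selection has far more to work with in the new population.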

Taken together, these mechanisms can put the two populations onto different trajectories, and, over time, lead to significant differences between them. Given enough time, the differences that accrue may be enough to keep the populations separate even if they should come into contact again. If so, most biologists would classify the two populations as distinct species. While this is easier to do for species that have been separated for a long time and have accumulated significant differences (and as such no longer interbreed, or interbreed only rarely), it is notoriously difficult for more recently separated populations that are not yet fully reproductively isolated. As such, what constitutes a “true species” instead of merely a “subspecies” or “variety” is often a subject for discussion and debate between scientists, and indeed was a topic that Darwin devoted much time to in his works. The ambiguity arises out of the mechanism of slow, gradual divergence of species from a common ancestral population.

Not just differences

Given the foregoing conversation, you might be under the impression that the differences between species are the main issue. Certainly differences are vital, since ultimately it will be the accumulation of differences that will lead to new species being formed. It is important to remember, however, that for closely related species, these differences will be small in number compared to the features that remain unchanged for both groups. At the genetic level, we can illustrate this by considering a gene for which there is only one allele in the source population – perhaps an allele that has been under natural selection and has displaced all other alleles. The newly founded population will inherit only this allele, despite the small sample size of the founding group, since there are no other variants in the population. The result is that the island population will be identical to the mainland population for this trait until a mutation event (in either population) even allows for the possibility of change. For most traits, mutations will not arise, since the DNA copying mechanism is highly accurate. This will keep most traits between the two populations constant. The pattern we expect for recently diverged species, then, is one of mostly identical characteristics overlaid with only a smattering of differences. You might recall that it was exactly this pattern in its biogeographical context that caused Darwin to reflect on the possibility that species may not be stable:

“The most striking and important fact for us in regard to the inhabitants of islands, is their affinity to those of the nearest mainland, without being actually the same species. Numerous instances could be given of this fact. I will give only one, that of the Galapagos Archipelago, situated under the equator, between 500 and 600 miles from the shores of South America. Here almost every product of the land and water bears the unmistakeable stamp of the American continent. There are twenty-six land birds, and twenty-five of these are ranked by Mr. Gould as distinct species, supposed to have been created here; yet the close affinity of most of these birds to American species in every character, in their habits, gestures, and tones of voice, was manifest.”

Note that it was the combined pattern of overwhelming “affinities” (distinctive features in common) with subtle, but significant differences that Darwin observed. The birds in question were distinct species, but they retained the “unmistakeable stamp” of their heritage. It was these observations that led Darwin to hypothesize that these finch species were the product of a speciation event brought on through geographic isolation.

While geographic isolation is a straightforward situation that can lead to genetic barriers and the formation of new species, speciation can also occur without full separation. In the next post in this series, we’ll examine a case of speciation with only a partial geographic (and genetic) barrier – a case that will also demonstrate the “fuzziness” of what exactly constitutes a species.

Earlier, we examined the relatively simple case of geographic separation of populations (such as a new population being founded on an island). Geographic separation is an effective barrier to what biologists call “gene flow” between populations – an effect more properly described as “allele flow”. As new alleles arise in separate populations, lack of interbreeding keeps each allele in the population where it arises. These new alleles may contribute to speciation over time if they affect the characteristics of the organism.  If, on the other extreme, new alleles can pass freely between two populations, then they will not contribute to a speciation event, since they will not make the two populations become more different over time.

What goes around, comes around

While these two extremes (geographically separated populations and fully continuous populations) are straightforward to understand, it is possible to find situations that are shades of gray between them. For example, consider two populations (we’ll call them “A” and “B”) that are members of the same species. They are able to exchange alleles between them, but at a reduced rate compared to sharing within each population. This effect can arise due to the geographic shape of their habitat – if it is long and narrow, then the two populations may abut each other only along a small portion of their range. This means that, on average, an individual from population A is more likely to find a mate within population A than to mate with a member of population B in their small area of overlap. We can represent this with boxes representing the two populations, abutting each other along one of their narrow sides:

This arrangement thus restricts, but does not completely abolish, allele flow between the two populations. In effect, this is a partial barrier to allele flow. Populations A and B are members of the same species, but the two populations are not genetically identical. As new alleles arise in population A, they are not shared across to population B as often as they are shared within population A, and vice versa. As such, populations A and B may have different frequencies of any given allele, and may even have some alleles that the other population lacks altogether. It is also possible that the two populations may experience differences in natural selection (since their environments are not identical), and/or differences in genetic drift, depending on the population size for each. The net result is a balance of forces acting on the two populations – some favoring differences (selection and/or drift) and another favoring similarities (limited flow of alleles through interbreeding).
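The "favoring similarities" side of this balance can be sketched on its own. In the toy model below, each generation a fraction m of each population's alleles comes from the other population; the rate m, the starting frequencies, and the function names are hypothetical, and selection and drift (the forces pushing the other way) are deliberately left out.

```python
def exchange(p_a, p_b, m):
    """One generation of allele flow: each population draws a fraction m of
    its alleles from the other population."""
    return (1 - m) * p_a + m * p_b, (1 - m) * p_b + m * p_a


def difference_after(p_a, p_b, m, generations):
    """Gap in allele frequency between A and B after repeated exchange."""
    for _ in range(generations):
        p_a, p_b = exchange(p_a, p_b, m)
    return abs(p_a - p_b)


if __name__ == "__main__":
    # Hypothetical start: the allele is common in A (90%) and rare in B (10%).
    for m in (0.0, 0.01, 0.5):
        print(m, round(difference_after(0.9, 0.1, m, 50), 3))
```

In this model the gap shrinks by a factor of (1 - 2m) every generation: it never closes with a full barrier (m = 0), closes slowly with a partial barrier (m = 0.01), and vanishes at once with free mixing (m = 0.5). Differences between A and B persist only to the extent that selection or drift regenerates them faster than allele flow erases them.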

In nature, this effect can extend to multiple populations in a “string” spread out across a ribbon of suitable habitat. Let’s add three more populations (C, D and E) to the above example to illustrate:

Once populations become spread out over a wide geographic area, the differences between the populations at the extremities (populations A and E in our diagram) can become quite significant.  In some cases, interestingly enough, the populations on the ends of the string can be different enough that they do not recognize each other as members of the same species, despite the fact that they are genetically connected through a series of intermediate populations. In some cases, scientists need to bring members of the extreme populations together to see if they are able to interbreed (i.e. employing the biological species concept as a definition of species). In other cases, the topography of the habitat brings them together in nature, allowing the populations at the extremes of the string to meet each other around a ring, but with a natural barrier in the middle (such as a mountain or a valley of unsuitable habitat). The result is what are known as “ring species”:

You can see the inherent difficulty in defining which populations are separate species (if indeed any are at all). There is allele flow between all populations, but only around the ring. The two populations at the (overlapping) ends, despite encountering each other in the same habitat, are different enough that they do not interbreed. Defining these populations as separate species (or not) is a fruitless attempt to draw a line of demarcation on a gradient. For those interested in a real-life example of a ring species, the subspecies of the salamander Ensatina eschscholtzii on the west coast of North America are both a textbook case and a subject of ongoing research.

Now, if we encountered the populations at the extremities in the wild without the intermediate, “bridging” populations, we would not hesitate to classify them as distinct species. It is also easy to see what would follow if any of the bridging populations were lost, or if changes in habitat severed the connection between any of them – the result would be a break in the chain of allele flow, cutting the terminal populations off from one another. What ring species illustrate is that though speciation is a slow process of accumulating differences between populations, it is possible even without a full barrier to allele flow.

Speciation without geographic separation

While ring species illustrate how species can form by partitioning variation out over a wide geographic area, it is also possible for barriers to allele flow to arise within a population in a more geographically compact location. All that is needed is a bias that promotes allele exchange within a subgroup of the population at the expense of exchange with the wider population – and as we have seen with ring species, this barrier need not be absolute to allow two subpopulations to accumulate differences and diverge from one another over time. One way for this to occur is for subpopulations to begin to exploit resources within a common geographic area differently – an effect known as resource partitioning.  As subpopulations begin to specialize into slightly different “manners of life”, as Darwin put it, they become more likely to interbreed within their subpopulation than with the population as a whole. Since preferential breeding is a (partial) barrier to allele flow, this can place the two subpopulations on a genetic trajectory that reinforces their differences and leads to a speciation event. Resource partitioning is the likely mechanism that drives multiple, rapid speciation events that occur when a founding population reaches a new habitat where competitors are largely absent. The colonization of volcanic islands, a topic we have discussed previously, can lead to adaptive radiation. One example is the numerous species of Darwin’s finches on the Galapagos Islands that descend from one species of finch that originally colonized the archipelago, and subsequently diversified into numerous species that specialize in different food sources. In the absence of other birds on the islands, many “manners of life” (what we would now call niches) were available for different subpopulations of birds to occupy.

Summing up – speciation starts as barriers to allele flow

Full geographic separation, the partial geographic separation seen with ring species, and resource partitioning of subpopulations are all barriers to allele flow between (what starts as) members of the same species. This provides the opportunity for new alleles to arise that are not shared between two populations, and shift the average characteristics of the two groups away from each other.  Next, we’ll examine some of the traits that such alleles contribute to – traits that improve barriers to allele flow and thus promote speciation events.

Previously, we introduced the idea that species can form in the same geographic location based on resource partitioning – where the two populations become increasingly suited, over time, to exploit different niches. In this post, we’ll explore this phenomenon in detail, using an example of nascent species that have formed in the very recent past, and under human observation: diversification within hawthorn flies, Rhagoletis pomonella. These flies are attracted to the unripened fruits of hawthorns, a wild relative of domestic apples (i.e. something resembling a small crabapple). Hawthorn fruit is also where hawthorn flies find their mates and lay their eggs, to allow the larvae to feed on the fruit (and cause it to spoil and fall early, with the larvae along for the ride). Hawthorn flies produce only one generation per year, and survive the winter buried as pupae. Moreover, they have a short adult lifespan, giving them only a short period to find a mate, breed, and for the females to lay eggs. This crucial period, of course, is set by the life cycle of the hawthorn – when its fruit is available for the flies to use as a food source and meeting location. As such, natural selection (exerted by the hawthorn life cycle) acts on genetic variation relevant to hatching time in hawthorn fly populations. The timing of hatching shows heritable variation, and flies that happen to hatch near the fringes of when hawthorn fruit is available (or worse, when there is no fruit available at all) do not reproduce as successfully as do flies that hatch when hawthorn fruit is abundant. Not surprisingly, the result is that we observe populations of hawthorn flies that are well-timed with their host plants, with most members of any fly population hatching in concert with the height of fruit availability:

Hatching time is an example of a continuous trait, in contrast to a discontinuous trait. Discontinuous traits are traits that have distinct categories: black versus blue eyes,  or red versus white flowers, and so on. Many traits cannot be “binned” into such categories, but rather form a distribution in populations. Traits such as height and weight are examples of continuous traits, and the timing of hawthorn fly hatching is another. The effect that the hawthorn tree has on the hawthorn fly is an example of stabilizing selection—fruit availability is selecting against flies that fall outside the boundaries on either side (i.e. flies hatching too early, or flies hatching too late). The overall effect is to keep fly hatching matched to fruit availability, generation after generation.

Tempted by an apple

Something happened to upset this stable, balanced interaction, however: the introduction of domestic apples to North America by European colonists. As we noted above, hawthorns and apples are related plants, with somewhat similar fruits. One difference, however, was the timing of fruit development in apples compared to hawthorns: domestic apples produce fruit some weeks earlier than do hawthorns. The introduction of apples into the hawthorn fly habitat thus provided a potential food source for flies that happened to hatch on the “early” end of the spectrum:

For those “early” flies that were attracted to this new, but somewhat similar fruit in their environment, the result would be twofold: (a) finding a food source with reduced competition from members of their own species, and (b) finding a mate with similar tendencies of attraction to apples. What was previously a “losing” genetic combination (hatching too early, without sufficient food or reasonable prospects for a mate) was now a “winning” combination. As a result, “early” variants could now reproduce much more effectively than they could before, and thus increase in number over successive generations:

In other words, once apples were present, the environment was no longer selecting fly populations in a stabilizing way, but rather acting to shape variation into two subpopulations. The selection had now switched to being diversifying selection. Importantly, these two subpopulations were not diversifying only with respect to hatching time and food preference, but also (given the nature of their biology) with respect to mating preference. As the “apple” variants increased in number, they naturally bred more frequently with other “apple” variants, since they encountered their mates on apple trees. The result was a partial barrier to allele flow that would reinforce the nascent differences between the two groups over time.
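The switch from stabilizing to diversifying selection can be illustrated with a toy calculation. Everything numerical here is a hypothetical assumption: the bell-shaped fruit availability curves, the peak days (apples on day 15, hawthorns on day 40), and the starting spread of hatch days. The sketch applies a single round of selection and does not model inheritance or mate choice.

```python
import math


def bell(day, peak, width):
    """Relative fruit availability on a given day (bell-shaped, illustrative only)."""
    return math.exp(-((day - peak) ** 2) / (2 * width ** 2))


def after_selection(hatch_freqs, fitness):
    """Weight each hatch day's frequency by its fitness, then renormalize."""
    weighted = {day: f * fitness(day) for day, f in hatch_freqs.items()}
    total = sum(weighted.values())
    return {day: w / total for day, w in weighted.items()}


def early_share(freqs, cutoff=25):
    """Fraction of flies hatching before the cutoff day."""
    return sum(f for day, f in freqs.items() if day < cutoff)


HAWTHORN_PEAK, APPLE_PEAK, WIDTH = 40, 15, 5  # hypothetical days and spread


def hawthorn_only(day):
    return bell(day, HAWTHORN_PEAK, WIDTH)


def with_apples(day):
    return bell(day, HAWTHORN_PEAK, WIDTH) + bell(day, APPLE_PEAK, WIDTH)


# Start from a broad spread of hatch days centred on the hawthorn peak.
raw = {day: bell(day, HAWTHORN_PEAK, 12) for day in range(0, 61)}
raw_total = sum(raw.values())
flies = {day: v / raw_total for day, v in raw.items()}

stabilized = after_selection(flies, hawthorn_only)
split = after_selection(flies, with_apples)
print(early_share(stabilized), early_share(split))
```

With hawthorns alone, almost no early hatchers remain after selection; once the apple curve is added, a second, smaller cluster of early hatchers survives alongside the main hawthorn-timed group.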

While the hawthorn and apple “species” of Rhagoletis pomonella have been the subject of human interest for centuries (mostly owing to the economic impact of the apple species as a pest), geneticists are just starting to get a handle on the allele differences that were the targets of selection during the separation process. Not surprisingly, genes known from prior research to affect hatch timing show up as having different alleles in the two groups. Other candidate genes include the receptor proteins the flies use to detect odors from their target fruits – with certain alleles more tuned to apple odors, and other alleles tuned to hawthorn odors. What started out as variation within one population has now been partitioned by selection into allele combinations suited to distinct niches – and given the short timeframe in which the switch to apples occurred, it is likely that new mutations did not play a role. Rather, recombination and segregation of existing alleles of numerous genes were enough to provide genetic differences that suited some members of the original population to exploit the new opportunity. The net effect was the shifting of a few continuous traits (hatch timing, fruit odor preference) to match a new environmental niche and precipitate a barrier to allele flow.

Selection for the few

Having considered the genes (and their alleles) that were under selection during this speciation event, there are a few points to make. The number of genes under selection (and thus with different alleles in the two new species) will be relatively small. Only alleles that affect traits relevant to adaptation to the new niche will be affected. Most genes will remain identical between the two populations, since they were not under diversifying selection, but continued to be under stabilizing selection for their (identical) role in both species. For example, consider genes required for cellular energy conversion or wing development – processes that both species still need to do in the exact same way. These genes will have the same alleles (or perhaps only one allele) in both populations, since the function of these genes was not relevant to adapting to the new niche. In short, the overall pattern that speciation produces will be a small smattering of differences in alleles for the genes under selection (or genes that happened to experience drift by chance) against a backdrop of the large majority of identical genes that were not subject to selection (or drift).

Indeed, one reason we can be confident that the hawthorn and apple “specialists” of Rhagoletis pomonella are in fact the products of a recent speciation event (aside from the fact that farmers observed them as they arose) is the overwhelming identity between their genomes – they have only tiny differences in a handful of genes. Biologically, it’s an open question whether they are in fact truly separate species, since they do continue to exchange alleles, albeit at a greatly reduced rate compared to sharing alleles within their respective populations. As we have seen for ring species, this example shows us that it is possible to observe in the present day the precise features we would predict for an ongoing, “in process” speciation event. Additionally, it shows that only a small handful of differences, derived from variation already existing within a population, can start two subpopulations on a trajectory that gradually improves the barrier to allele flow between them. Over time, these effects can lead to the formation of closely related species.

In the long run

The production of closely-related species from a common ancestral population is hardly controversial among evangelical Christians, though the mechanisms underlying such events are not commonly appreciated. What is more controversial for many, however, is the suggestion that these mechanisms also produce widely diverged species over greater spans of time. In the next post in this series, we’ll turn to some lines of evidence that support the hypothesis that highly diverse modern species are indeed derived from common ancestral populations deep in the past.

About the author

Dennis Venema

Dennis Venema is professor of biology at Trinity Western University in Langley, British Columbia. He holds a B.Sc. (with Honors) from the University of British Columbia (1996), and received his Ph.D. from the University of British Columbia in 2003. His research is focused on the genetics of pattern formation and signaling using the common fruit fly Drosophila melanogaster as a model organism. Dennis is a gifted thinker and writer on matters of science and faith, but also an award-winning biology teacher—he won the 2008 College Biology Teaching Award from the National Association of Biology Teachers. He and his family enjoy numerous outdoor activities that the Canadian Pacific coast region has to offer.