Earlier this year, I received an email from Dr. Richard Buggs, who is a plant genome biologist working in the UK. Dr. Buggs had been reading Adam and the Genome, the book that I co-authored with New Testament scholar Scot McKnight that came out in February. Now I’m typically running behind on my email inbox at the best of times, and a reply to Dr. Buggs was clearly not going to be a note that I could dash off in a spare few minutes. And so I left the email unanswered – sorry Richard, if I may call you that – and after a while I forgot about it. Not surprisingly, and completely understandably, Dr. Buggs assumed I wasn’t going to respond, and posted the email on his webpage as an open letter. He provides the context as follows:
A few months ago, I was reading a new book by Dennis Venema and Scot McKnight entitled Adam and the Genome. I was surprised to find a claim within the book that the past effective population size of humans has definitely never dropped below 10,000 individuals and that this is a fact of comparable scientific certainty to heliocentrism. I emailed Dennis Venema, the biologist author of the book, to query this. Unfortunately, he has not yet responded. I therefore remain unconvinced that it is a scientific impossibility for human beings to have all descended from a single couple. If I am wrong, though, I would like to know.
Buggs’s concerns were then picked up by the Intelligent Design community: they were featured by the Discovery Institute, shared by Stephen Meyer on Facebook, and thereafter picked up by a number of evangelical news agencies. Suddenly a lot of people wanted to know what I had to say about this. Buggs was also clear about his motivation for contacting me. He is concerned that I might be overstating the scientific case for a large human ancestral population size to the detriment of my Christian audience:
I wanted to let you know that in my view you seem to be on very shaky ground here, and in danger of alienating Christians from science on the basis of a wrong interpretation of the current literature… I would encourage you to step back a bit from the strong claims you are making that a two-person bottleneck is disproven.
So, is there a genetic case to be made for Adam and Eve as the sole genetic progenitors of all of humanity? Have I overstated the scientific evidence to exclude this possibility? Note well: the question is not “were Adam and Eve historical individuals?” That is a question that science is not equipped to answer, as I discuss in the book. Science can tell us about our past population sizes, but it cannot weigh in on the historicity of any individuals within that population. Within the BioLogos tent, there is a range of views on Adam and Eve. What Buggs is asking here is whether Adam and Eve could have been the sole genetic progenitors of the entire human race.
To address this question, let’s take a close look at what Dr. Buggs discusses in his email (and for those who have not read it yet, I suggest reading it in its entirety before continuing here). I also note that I quite appreciate the gracious tone in which the letter was written. I don’t agree with Buggs, but I will endeavor to answer in a similarly constructive way.
The first issue I’d like to tackle is a minor one in some ways, but a significant one in others. I agree with Buggs that science has not “disproven” that humans could have descended uniquely from just two people (and also I’m acutely aware that this last sentence is particularly ripe for selective quoting). In the same breath, I also don’t think that science has “disproven” geocentrism – the idea that the earth is the immobile center of the universe. I actually spend a good deal of time in Adam and the Genome discussing what science is, and how it works as a powerful, yet limited, way of knowing. Science offers us converging lines of evidence for ideas about the natural world we have not yet rejected through repeated experimentation – but everything in science is held, at least in some sense, tentatively. Even the things we consider the most certain in science could be rejected in light of new evidence – even a sun-centered solar system. I put it as follows in Adam and the Genome, after a lengthy discussion of the evidence for common ancestry of humans and other species, and a large ancestral human population:
As our methodology becomes more sophisticated and more data are examined, we will likely further refine our estimates in the future. That said, we can be confident that finding evidence that we were created independently of other animals or that we descend from only two people just isn’t going to happen. Some ideas in science are so well supported that it is highly unlikely new evidence will substantially modify them, and these are among them: The sun is at the center of our solar system, humans evolved, and we evolved as a population (55).
In other words, we have multiple, interlocking, converging lines of evidence for each of these three claims, and we can have great confidence that new scientific evidence will not substantially change our views. (Also, note that I do not claim this certainty for the oft-cited ~10,000 figure, as Buggs seems to imply, since future estimates could possibly shift this value a bit. What I’m saying is that new evidence isn’t going to get us from a population to a pair). Is it proven? No, proof is for alcohol and mathematics, as the saying goes. Can you take it to the bank? Absolutely. It’s a subtle difference, but an important one.
With that out of the way, we can now attend to the specific points that Buggs presents as a challenge to our confidence that the ancestral human population was a large one.
Heterozygosity and population bottlenecks
One objection raised by Buggs centers around the concept of heterozygosity. In many organisms, such as humans, genes have two copies – one we receive from our mothers, and one from our fathers. The fact that we have two copies of each gene means that those copies can be slightly different. Different variants of a gene are called “alleles” – and if, for a given gene, someone has two different alleles, they are said to be heterozygous for that gene. Heterozygosity is just a measurement of how many individuals in a population are heterozygous for a particular gene. It is also possible to estimate the average heterozygosity of a population by looking at heterozygosity at multiple genes, or even across the entire genome. It is this measurement that Buggs has in mind when he offers the following critique:
To get more specific, I think you are mistaken when you say this:
“If a species were formed through such an event [by a single ancestral breeding pair] or if a species were reduced in numbers to a single breeding pair at some point in its history, it would leave a telltale mark on its genome that would persist for hundreds of thousands of years— a severe reduction in genetic variability for the species as a whole.”
It is easy to have misleading intuitions about the population genetic effects of a short, sudden bottleneck… a single pair of individuals can carry a great deal of heterozygosity with them through a bottleneck, if they come from an ancestral population with high diversity, and they will pass that on to the population they found, so long as it grows rapidly.
Buggs is correct that a population passing through an extreme bottleneck – and a bottleneck of two is as extreme as it gets for a sexually reproducing species – will on average retain a sizeable proportion of its heterozygosity (perhaps 75%, on average, if conditions are right). This is not the same thing, however, as retaining a significant proportion of the population’s genetic diversity. A bottleneck (and especially one as extreme as a reduction to two individuals) would greatly reduce genetic diversity, even if heterozygosity is not hugely impacted. The reason for the apparent discrepancy is that heterozygosity is a very limited way to measure genetic diversity. Let’s use an example to help us understand things.
Imagine a fictional population (population one) where there are only two alleles present for a particular gene – the “dee” gene. Let’s call the two alleles “d1” and “d2” as a way to distinguish them. In this population, half of the individuals are heterozygous – “d1d2” – they have one copy of each of the two variants. The other half of the population is split between the two ways of being homozygous, or having two identical alleles: one quarter of the population is “d1d1”, and one quarter is “d2d2”. This population thus has a heterozygosity value of 0.5 for this gene, since half of the individuals are d1d2.
Now imagine a second population (population two). In this population, there are 10 different alleles of the “dee” gene: d1, d2, d3, etc – all the way up to d10. Depending on how rare (or common) each of these alleles is in the population, this population could also have a heterozygosity value of 0.5 for this gene. In this case, the heterozygosity value would be the sum of all the different heterozygous individuals (and there would be a large number of different possibilities). For example, we would now have individuals that could be d1d2, d1d3, d1d4, and so on for all the different combinations. Only homozygous individuals would be excluded from the heterozygosity score – d1d1, d2d2, d3d3, and so on. Based on the frequencies of the different alleles, it’s entirely possible to have the heterozygosity value for the “dee” gene in population two also equal 0.5.
And here we see the challenge of using heterozygosity scores to estimate genetic diversity. Despite their equal heterozygosity scores, population two is much more genetically diverse than population one. It has 10 alleles where population one has only 2. By using techniques that capture this diversity, we would correctly infer that population two has a much larger ancestral population size than population one does. All of those different alleles ultimately trace back to different ancestors that had mutation events to produce them.
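The comparison between the two populations can be checked with a few lines of arithmetic. This is a minimal sketch that assumes random mating, so that the expected fraction of heterozygotes is one minus the sum of the squared allele frequencies. The frequencies for population two are hypothetical values picked so that the numbers work out; they do not come from any real data set.

```python
from fractions import Fraction

def expected_heterozygosity(freqs):
    """Expected fraction of heterozygotes under random mating: 1 - sum(p_i^2)."""
    assert sum(freqs) == 1, "allele frequencies must sum to 1"
    return 1 - sum(p * p for p in freqs)

# Population one: two alleles, d1 and d2, at equal frequency.
population_one = [Fraction(1, 2), Fraction(1, 2)]

# Population two: ten alleles. Hypothetical frequencies for illustration:
# one common allele (d1) at 70%, nine rarer alleles (d2..d10) at 1/30 each.
population_two = [Fraction(7, 10)] + [Fraction(1, 30)] * 9

h_one = expected_heterozygosity(population_one)
h_two = expected_heterozygosity(population_two)

print(f"population one: {len(population_one)} alleles, heterozygosity {float(h_one)}")
print(f"population two: {len(population_two)} alleles, heterozygosity {float(h_two)}")
```

Both populations come out at 0.5, even though one carries five times as many alleles as the other, which is why a heterozygosity score alone says little about allele diversity.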
Let’s carry this thought experiment a little further. Now imagine that population two undergoes an extreme genetic bottleneck – a reduction to only one breeding pair. For argument’s sake, let’s say these two individuals happened to be genetically d1d2 and d1d2 (I’ll choose these for convenience, but the same point could be made with any two individuals selected at random). This population would expand after the bottleneck, but now alleles d3 through d10 have been lost. The new population could also easily have a heterozygosity value of 0.5, even though it has had a severe reduction in genetic diversity in the form of lost alleles. Note well: a population can pass through a bottleneck with little or no effect on heterozygosity, but with a dramatic reduction in genetic diversity. So, Buggs’s noting of the former – that populations can retain heterozygosity – does not establish that humans could have passed through a bottleneck of two with their genetic diversity only marginally affected. Our lineage would have been affected by losing alleles.
The key here is that one individual can only have at most two alleles of any gene. A population reduction to one breeding pair would mean that at most, four alleles of a given gene could pass through the bottleneck – in the case where both individuals are heterozygous, and heterozygous for different alleles. The population would then have to wait for new mutation events to produce new alleles of this gene – a process that will take a significant amount of time. Since this would happen to all genes in the genome at the same time – a reduction to a maximum of four alleles – we would notice this effect for a long time thereafter as genetic diversity was slowly rebuilt across the genome as a whole.
So, a bottleneck to two individuals would leave an enduring mark on our genomes – and one part of that mark would be a severe reduction in the number of alleles we have – down to a maximum of four alleles at any given gene. Humans, however, have a large number of alleles for many genes – famously, there are hundreds of alleles for some genes involved in immune system function. These alleles take time to generate, because the mutation rate in humans is very low. This high allele diversity is thus the first indication that we did not pass through a severe population bottleneck, but rather a relatively mild one (estimated, as we have discussed, at about 10,000 individuals by current methods).
Another effect that a bottleneck to two individuals would produce is that there would be no rare alleles after the bottleneck. All alleles would have a frequency of at least 25%. As the population expanded after such an event, those alleles would stay common, and only new mutations would produce less common alleles. What we observe in humans in the present day is that many alleles are rare – even exceedingly rare. The distribution of alleles in present-day humans looks like it comes from an old, large population – not one that passed through an extreme bottleneck within the last few hundred thousand years, which is when our species is found in the fossil record. Thus the observation that we have many alleles of certain genes and the distribution of allele frequencies both support the hypothesis that humans come from a population, rather than a pair.
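The arithmetic behind these bottleneck effects can be sketched as follows. The sketch assumes the textbook approximation that one generation at size N retains a fraction 1 - 1/(2N) of heterozygosity, and it assumes the two founders are drawn at random from a hypothetical ten-allele population (one allele at 70%, nine at 1/30 each). None of these numbers are measurements; they are only there to show the shape of the argument.

```python
from fractions import Fraction

FOUNDERS = 2                 # one breeding pair
GENE_COPIES = 2 * FOUNDERS   # each founder carries two copies of the gene

# Hypothetical pre-bottleneck population: ten alleles, heterozygosity 0.5.
freqs = [Fraction(7, 10)] + [Fraction(1, 30)] * 9

# Expected share of heterozygosity kept after one generation at size N.
retained = 1 - Fraction(1, 2 * FOUNDERS)

# Expected number of distinct alleles among the founders' gene copies,
# assuming the founders are a random draw from the population above.
expected_alleles = sum(1 - (1 - p) ** GENE_COPIES for p in freqs)

# Any allele the pair carries is at least one copy out of four.
min_frequency = Fraction(1, GENE_COPIES)

print(f"heterozygosity retained: {float(retained):.0%}")
print(f"alleles surviving: at most {GENE_COPIES}, "
      f"about {float(expected_alleles):.2f} expected out of {len(freqs)}")
print(f"lowest possible allele frequency after the bottleneck: {float(min_frequency):.0%}")
```

Under these assumptions most of the heterozygosity survives, while most of the ten alleles do not, and nothing that survives can be rare.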
In the next part of this reply, we’ll look at a method that looks beyond alleles at a single gene to multiple genes simultaneously. Such a method can detect bottlenecks with great power, because it examines patterns of alleles in groups. As we will see, this method also fails to find a bottleneck below about 10,000 individuals for our species, adding to the evidence that our lineage was never reduced from a population to a pair.
In the first part of this article, I tackled the concern that Dr. Buggs raised about the ability of heterozygosity to survive a population bottleneck. I intended to continue on to his concerns about linkage disequilibrium (LD) models in this post. However, following on from that post, dialogue between myself, Dr. Buggs, and others on the BioLogos Forum made it clear to me that an additional discussion about allele-based methods for estimating population sizes might be useful.
A non-technical summary
Unfortunately for a lay audience, this conversation gets pretty technical in a hurry–especially since Dr. Buggs is a biologist and is critiquing my work in Adam and the Genome at a high level of detail. A detailed critique is fair game, of course. I’ve often said that in many ways peer review starts rather than ends with publication—and I’ve certainly offered technical, critical reviews of books from other perspectives. Sauce for the goose is sauce for the gander.
Part of Dr. Buggs’s critique, as I understand it, is his doubt that a bottleneck to two people has ever been explicitly tested by population genetics methods. Moreover, he argues that a sudden reduction to two people followed by a rapid population expansion could in fact be missed by the methods I base my conclusions on. If so, then my confidence that humans have never dipped down to a population of two would be overstated, and there might be room for reasonable doubt that the science is as settled as I claim it to be.
In the first part of my reply, we discussed heterozygosity and saw that it can be maintained even as many alleles are lost during a bottleneck, reducing the genetic variation within a population. In this part of my reply, I’ll move on to discussing how allele-based methods are used to estimate population sizes, and sketch a brief history of how they have been applied to humans. In the process we’ll see that Dr. Buggs’s hypothesis—a severe bottleneck in human history—has indeed been tested and rejected by scientists. We’ll also learn about coalescence, which will not only help us understand allele-based methods in general, but also prepare us for a later discussion of a particular method of estimating population sizes that is of concern to Dr. Buggs: the pairwise sequentially Markovian coalescent (PSMC) method.
But, let’s walk before we run. We’ll start with an introduction to the concept of coalescence, and methods that use it to estimate population sizes.
Coalescent-based methods: a primer
When we examine a given gene (or a DNA region in between genes) in present-day humans, we often note different DNA sequence variants that are present – i.e., different alleles. When we compare any two alleles to each other, we can see how many differences there are between them. Usually these changes are one or more single DNA letter differences, or perhaps insertions or deletions of one or more DNA letters. Sometimes the changes are more extreme – but in either case, we can deduce how many DNA mutation events separate any two alleles. Working back in time in a population, we will eventually work back through each of the mutation events that caused the changes, and the two alleles will become the same – i.e., they will coalesce.
Taking into account estimates of mutation frequency, it is possible to estimate how long ago any two present-day alleles coalesce with each other. This is called the Time to the Most Recent Common Ancestor, or TMRCA, for any pair of alleles.
When looking at a number of alleles in a present-day population, we will see a range of TMRCA values—some pairs of alleles will be quite similar to each other, meaning that they are separated by one or perhaps only a few mutations—and thus coalesce with each other recently in the past. Other allele pairs will have many more differences, and coalesce further back in time. For a gene or DNA region with many alleles in present-day humans, we can do pairings of all the different possible alleles, and produce a range of TMRCA values for what we observe in the present day. In this way we account for all the alleles we observe in the present and compare them to each other in a pairwise manner.
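A pairwise comparison of this kind can be sketched in a few lines. This is a simplified molecular-clock estimate, not the full statistical machinery used in the published studies: it assumes mutations accumulate at a constant rate on both lineages, so the expected number of differences is 2 × rate × length × time. The allele sequences, the mutation rate, and the region length below are all placeholder values chosen for illustration.

```python
from itertools import combinations

MUTATION_RATE = 1.2e-8   # assumed: mutations per DNA letter per generation
REGION_LENGTH = 10_000   # assumed: length of the sequenced region in DNA letters

# Hypothetical alleles, showing only the variable positions in the region.
alleles = {
    "a1": "ACGTA",
    "a2": "ACGTT",
    "a3": "TCGAT",
}

def differences(seq_a, seq_b):
    """Count the positions at which two aligned alleles differ."""
    return sum(1 for x, y in zip(seq_a, seq_b) if x != y)

def tmrca_generations(n_diff):
    """Estimated generations back to the common ancestor of two alleles.

    Mutations accumulate on both lineages, hence the factor of 2.
    """
    return n_diff / (2 * MUTATION_RATE * REGION_LENGTH)

# Compare every allele with every other allele, pair by pair.
pairwise = {
    (a, b): tmrca_generations(differences(alleles[a], alleles[b]))
    for a, b in combinations(sorted(alleles), 2)
}

for (a, b), generations in pairwise.items():
    print(f"{a} vs {b}: roughly {generations:,.0f} generations to coalescence")
```

Pairs with more differences get older estimated coalescence times, which is the range of TMRCA values described above.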
One thing that TMRCA values can be used for is to estimate the ancestral size of the population in question at different points in its past. The probability that any two alleles will coalesce in a given previous generation (again, working back from the present) is directly related to how large the population is. Remember that mutation events are rare, because the mutation rate is very low. Thus the probability of coalescence is mostly due to the probability of alleles being inherited from the same ancestor (even as rare mutations are also occurring during this process). As the number of ancestors drops–i.e. the population size decreases—the probability of coalescence increases. As the population size increases, the probability of coalescence drops. Thus, when a bottleneck occurs—a reduction in the population size—the probability of coalescence increases, and it increases for all genes in the genome.
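The link between population size and coalescence can be made concrete with the standard idealized model, in which two gene copies in a population of N individuals (2N copies in total) pick the same parent copy with probability 1/(2N) in any one generation. This is a sketch under that simplifying assumption, with population sizes picked only for illustration.

```python
def coalescence_probability(n_individuals):
    """Chance that two gene copies share a parent copy one generation back."""
    return 1 / (2 * n_individuals)

def expected_generations_to_coalescence(n_individuals):
    """Mean waiting time in generations at constant size: 1 / probability."""
    return 2 * n_individuals

for n in (2, 100, 10_000):
    p = coalescence_probability(n)
    t = expected_generations_to_coalescence(n)
    print(f"N = {n:>6}: P(coalesce per generation) = {p:.5f}, "
          f"expected wait = {t} generations")
```

With only two individuals the per-generation probability is one in four, whereas at 10,000 individuals it is one in twenty thousand.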
Coalescence and bottlenecks
In the case of an extreme bottleneck, where the population size drops to just two individuals, many alleles will be lost. At most four alleles will make it through the bottleneck (as we discussed in the first part of my reply). What we didn’t mention then, but which is also relevant, is that the surviving alleles also have a good probability of being lost in the generations following the bottleneck. In small populations, such as one expanding after an extreme bottleneck, the probability of losing some alleles just by chance is high. Recall that in the first part of my reply we discussed heterozygosity – the case when there are at least two alleles for a given gene in a population. We saw that even in the most favorable conditions after a bottleneck, heterozygosity is preserved only about 75% of the time. This means that about 25% of the time, heterozygosity is lost, and only one allele remains in the population for a given gene. If only one allele is present, then this is a coalescence point for that gene: going forward, we will have to wait for mutations to produce new alleles, and those new alleles will coalesce back to the single ancestral allele that survived the bottleneck. In the future, as new alleles are produced from the surviving allele through mutation, the new alleles will all coalesce within a few generations of the bottleneck. Their TMRCA values will thus be almost identical.
After a bottleneck, then, we will see a range of TMRCA values across the genome, once we account for all of the alleles we can find in the present day. For some genes, multiple alleles will survive the bottleneck, and their TMRCA values will thus precede it. Other genes will coalesce at the bottleneck. Other genes will coalesce after the bottleneck, since coalescence happens occasionally even without a bottleneck to increase its probability. The way bottlenecks leave a detectable mark on a genome is this: they give alleles at numerous genes the same coalescence time. Even if it is only on average around 25% of genes that coalesce at this time, this is still a large number of genes in absolute terms. The fact that they all coalesce at the same time in the past is the indication that coalescence was highly probable then – because population size was small.
Coalescent-based methods are thus an excellent way to detect bottlenecks – even really brief ones, if they are severe enough. Even a brief, severe bottleneck will still greatly increase the chances of alleles being lost, and will leave the telltale signature of numerous genes that coalesce within a short time frame. A rapid expansion of population after a severe bottleneck can reduce this effect, but never eliminate it. On average, about 25% of genes will coalesce, and this is more than enough genes to reveal a bottleneck – even if the bottleneck was only one generation long, and followed by rapid expansion.
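To see how a one-generation bottleneck followed by rapid expansion would still cluster coalescence times, here is a toy calculation for a single pair of gene copies traced back through time. It assumes the idealized per-generation coalescence probability of 1/(2N), and both population histories are hypothetical: one holds steady at 10,000 individuals, the other doubles every generation after a single pair. It is a sketch of the logic, not a reproduction of any method used in the studies discussed here.

```python
def coalescence_time_distribution(sizes_back):
    """P(two gene copies first coalesce t generations back), t = 1..len(sizes_back).

    sizes_back[0] is the population size one generation before the present.
    """
    probs = []
    not_yet = 1.0  # probability the two lineages are still distinct
    for n in sizes_back:
        p = 1 / (2 * n)
        probs.append(not_yet * p)
        not_yet *= 1 - p
    return probs

LARGE = 10_000   # assumed large population size
RECENT = 1_000   # assumed generations spent at large size since the expansion

# Going back in time: large population, the expansion in reverse
# (8192, 4096, ..., 4), the single pair, then a large population again.
expansion = [2 ** k for k in range(13, 1, -1)]
bottleneck_history = [LARGE] * RECENT + expansion + [2] + [LARGE] * RECENT
constant_history = [LARGE] * len(bottleneck_history)

# Positions of the expansion generations plus the bottleneck generation.
window = range(RECENT, RECENT + len(expansion) + 1)
pair_generation = RECENT + len(expansion)

with_bottleneck = coalescence_time_distribution(bottleneck_history)
without_bottleneck = coalescence_time_distribution(constant_history)

mass_with = sum(with_bottleneck[i] for i in window)
mass_without = sum(without_bottleneck[i] for i in window)

print(f"share coalescing in the {len(window)}-generation window, "
      f"bottleneck history: {mass_with:.1%}")
print(f"share coalescing in the same window, constant history: {mass_without:.2%}")
print(f"share coalescing in the pair generation itself: "
      f"{with_bottleneck[pair_generation]:.1%}")
```

In the constant-size history almost nothing coalesces in that narrow window, while in the bottleneck history a large share of lineage pairs does. Across thousands of genes, that pile-up of coalescence times in one short interval is the signature described above.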
Since coalescence-based methods were developed, they have been used widely for investigating ancestral human population sizes. As we sequenced the human genome, we applied these methods to an increasingly large data set of allele variation from around the world. The results of these studies have consistently indicated that humans descend from a substantial population and not merely from a pair. Let’s trace a snapshot of that experimental history next.
A brief history of coalescent methods and human population size estimates
Early studies on human variation, prior to the Human Genome Project (HGP), were restricted to working with alleles of single “genes”, or more properly, short stretches of DNA that included a gene but also some DNA around it. These studies depended on the researchers actually going out and sequencing this specific region in a large number of people, and then making sense of the allele diversity they found.
For example, this early paper looks at a few such genes for which data was available at the time and concludes this (from the abstract, with my emphases):
Genetic variation at most loci examined in human populations indicates that the (effective) population size has been approximately 10^4 (i.e., 10,000) for the past 1 Myr and that individuals have been genetically united rather tightly. Also suggested is that the population size has never dropped to a few individuals, even in a single generation. These impose important requirements for the hypotheses for the origin of modern humans: a relatively large population size and frequent migration if populations were geographically subdivided. Any hypothesis that assumes a small number of founding individuals throughout the late Pleistocene can be rejected.
What is interesting to note is that at this time in the scientific literature, a severe bottleneck in the ancestral human population was considered a possibility to be tested – even the idea that a very sudden bottleneck might have taken place. In these early days of human population genetics research, it was an open question whether humans came from a very small founding population. In this paper, however, the authors conclude that such a bottleneck is ruled out by the evidence: there is simply too much variation in present-day populations that coalesces further back than one million years ago (1 Myr).
Later pre-HGP papers were in agreement with these early results. For example, this paper looked at allele diversity at the PDHA1 gene, and reported a human effective population size of ~18,000. Similarly, studies of allelic diversity at the beta-globin gene found it to indicate an ancestral effective population size of ~11,000, and concluded that “There is no evidence for an exponential expansion out of a bottlenecked founding population, and an effective population size of approximately 10,000 has been maintained.” They also state that the allelic diversity they are working with cannot be explained by the recent population expansion that characterizes our species – the alleles are too old (i.e. they have TMRCA values that are too large) to be that recent.
It is in this timeframe that a coalescence study investigating allelic diversity of a different kind was published. It looks at allelic diversity of Alu insertions. Alu elements are transposons—short snippets of autonomous, mobile DNA that can replicate and move within genomes—that generate “alleles” where they insert into a chromosome. Generally, if an Alu is present, that’s an allele, compared to when an Alu is absent (the alternative allele). This study also acts as an independent check of other coalescence-based studies because it does not depend on a forward nucleotide substitution rate—i.e., the standard DNA mutation rate, since Alu alleles are not produced by nucleotide substitutions. This paper concludes that the human effective population size is ~18,000. They also state (my emphases):
The disagreement between the two figures suggests a mild hourglass constriction of human effective size during the last interglacial since 6000 is very different from 18,000. On the other hand our results also deny the hypothesis that there was a severe hourglass contraction in the number of our ancestors in the late middle and upper Pleistocene. If humans were descended from some small group of survivors of a catastrophic loss of population, then the distribution of ascertained Alu polymorphisms would show a preponderance of high frequency insertions (unpublished simulation results). Instead the suggestion is that our ancestors were not part of a world network of gene flow among archaic human populations but were instead effectively a separate species with effective size of 10,000-20,000 throughout the Pleistocene.
In the late 1990s and early 2000s, we start to get into what are really human genome project papers but are focused studies on worldwide allele variation for small DNA regions, rather than genome-wide variation. For example, one paper of this type looked at allele diversity for a small region of chromosome 1. This study employs a variety of estimates of population size for this region, and concludes the following (my emphases). Note that once again, the hypothesis of a severe bottleneck is tested and rejected:
An average estimate of 12,600 for the long-term effective population size was obtained using various methods; the estimate was not far from the commonly used value of 10,000. Fu and Li’s tests rejected the assumption of an equilibrium neutral Wright-Fisher population, largely owing to the high proportion of low-frequency variants. The age of the most recent common ancestor of the sequences in our sample was estimated to be more than 1 Myr. Allowing for some unrealistic assumptions in the model, this estimate would still suggest an age of more than 500,000 years, providing further evidence for a genetic history of humans much more ancient than the emergence of modern humans. The fact that many unique variants exist in Europe and Asia also suggests a fairly long genetic history outside of Africa and argues against a complete replacement of all indigenous populations in Europe and Asia by a small Africa stock. Moreover, the ancient genetic history of humans indicates no severe bottleneck during the evolution of humans in the last half million years; otherwise, much of the ancient genetic history would have been lost during a severe bottleneck.
Accordingly, we can see from these studies that Dr. Buggs’s hypothesis – that present-day human allelic variation could be consistent with a brief bottleneck to just two individuals – has indeed been tested in the literature and rejected. No such bottleneck is supported within the last 500,000 years or more, far longer than our species has existed in the fossil record. Present-day humans have too many alleles that are more ancient than a severe bottleneck during this timeframe would allow. Moreover, no evidence of a severe bottleneck as revealed by grouped TMRCA values (i.e. a number of genes showing clustered coalescence times) is present in human DNA over this same period. Humans dip down to a population size of about 10,000, not 2.