Linguistics and the Question of Common Ancestry

By Dennis Venema on Letters to the Duchess

Previously, we discussed a method for estimating human ancestral population sizes over the span of our history. Anatomically modern humans first appear in the fossil record about 200,000 years ago, and the SNP-based methods we examined covered a similar time span. These studies, as we saw, indicate that we descend from a population that never dropped below about 10,000 individuals from that point in our history through to the present day. This result, however, often causes confusion about the source of these 10,000 ancestors. As we have discussed, we are used to thinking, erroneously, in discontinuous terms where species have a sudden start – and as such, it’s common for folks to wonder where these 10,000 people suddenly appeared from.

Think back to our analogy to language evolution, and the origins of the present-day English language. Though we can trace the historical path of English back to the Anglo-Saxons at around 900 A.D., we understand that this linguistic group did not suddenly appear on the landscape. Rather, the Anglo-Saxons were the descendants of an earlier group, changed incrementally, generation by generation. We traced that earlier group to a population living on the European continent at around 400 A.D., the common ancestors of both the Anglo-Saxons and the speakers of present-day West Frisian. Pushing further back in time, one could trace the linguistic history of this group and chart its connections to other related languages.

Human evolution can be understood in much the same way. The population of at least 10,000 ancestors did not suddenly appear out of nowhere: they too had an ancestral population that produced them, and so on, deep into the past. Just as “English” slowly emerged over tens of generations, so too our lineage slowly came to be biologically human over thousands (or tens of thousands, or hundreds of thousands) of generations.

Genomes as linguistic histories

In previous posts in this series, we have examined two lines of evidence for language change over time. The first was change within a language lineage, illustrated with translations of John 1:29 from the time of the Anglo-Saxons to the present day. The second was a comparison of distinct modern languages with each other, as we illustrated with a pair of sentences, one in Modern English and one in West Frisian, that are pronounced the same:

English: Butter, bread, and green cheese is good English and good Frise.
West Frisian: Bûter, brea, en griene tsiis is goed Ingelsk en goed Frysk.

The reason for these lines of evidence is straightforward: as languages are handed down over time, they accumulate changes. Within a lineage, then, we expect to see evidence of shifts over generations. The same process, however, also would be expected to produce new languages if a common population were separated into two isolated groups.
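As a rough illustration of what "shared similarity" between two descendant languages looks like, here is a minimal sketch that lines up the two sentences above word by word and scores how alike each pair is. It uses spelling as a stand-in for pronunciation, which is only a crude proxy, and the helper names `words` and `similarity` are made up for this example.

```python
from difflib import SequenceMatcher

english = "Butter, bread, and green cheese is good English and good Frise."
frisian = "Bûter, brea, en griene tsiis is goed Ingelsk en goed Frysk."

def words(sentence):
    """Split a sentence into lowercase words with punctuation stripped."""
    return [w.strip(".,").lower() for w in sentence.split()]

def similarity(a, b):
    """Spelling similarity of two words, from 0.0 (nothing shared) to 1.0 (identical)."""
    return SequenceMatcher(None, a, b).ratio()

# Both sentences have the same word order, so position-by-position pairing works.
pairs = list(zip(words(english), words(frisian)))
for en, fy in pairs:
    print(f"{en:>8}  {fy:<8}  {similarity(en, fy):.2f}")
```

Most pairs score well above zero even though no one would mistake one language for the other, which is the pattern we expect when two languages have inherited their words from a common source and then changed separately.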

For biological evolution, then, we would expect equivalent lines of evidence – evidence of change within a specific lineage leading to a present-day species, and evidence that lineages may have split over time to produce new species. For languages, we naturally look to texts; for species, we can look to genomes. Interestingly, genomes have certain features that make these investigations even easier to perform than with languages.

For a language, meaning is required for transmission to the next generation: nonsensical text will be eliminated. Genomes, on the other hand, are quite adept at transmitting nonsensical text. Genomes are copied by DNA-replicating enzymes that are not “aware” of the “meaning” of the DNA sequence they are copying. “Meaning” in this context is biological function, something that the DNA copying enzymes do not have the ability to determine. Whereas a human scribe may fix a misspelled word in a copied manuscript, DNA “scribes” cannot see the intended meaning and fix a previous copying error. As such, DNA mutations (copying errors), once introduced into a population, may be passed on to descendants – and, given enough time, become the only version of a given sequence in a population.

One example of this in the human genome is a set of DNA copying errors we all share: mutations that destroy the function of an enzyme used to make vitamin C. All humans lack this enzyme, called L-gulonolactone oxidase (abbreviated GULO). Though all of us lack this enzyme function, we also all retain much of its DNA sequence in our genomes. Our DNA-copying enzymes keep on transmitting this defective sequence as faithfully as they can, unaware that the GULO enzyme no longer works (and that we must consume a diet rich in vitamin C in order to compensate). As such, our genomes tell us of a time when our lineage was able to synthesize its own vitamin C as most other mammals do, and illustrate that our lineage has changed over time.
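The idea that a copying error can spread until it is the only version left in a population can be sketched with a small simulation. What follows is a toy model of my own choosing, not the method behind any result discussed in this series: it assumes the new variant has no effect on survival, that the population stays a fixed size, and that each copy in the next generation is drawn at random from the current one. The population of 50 gene copies and the function name `generations_until_settled` are illustrative assumptions.

```python
import random

def generations_until_settled(pop_size, start_copies, rng):
    """Follow a variant until it is lost (0 copies) or is the only version left.

    Returns (became_only_version, generations_elapsed).
    """
    copies = start_copies
    generations = 0
    while 0 < copies < pop_size:
        freq = copies / pop_size
        # Each copy in the next generation is drawn at random from this one.
        copies = sum(1 for _ in range(pop_size) if rng.random() < freq)
        generations += 1
    return copies == pop_size, generations

rng = random.Random(1)   # fixed seed so the run is repeatable
pop_size = 50            # ASSUMPTION: tiny population, for speed only
trials = 2000            # each trial starts with one brand-new copying error

fixed = sum(generations_until_settled(pop_size, 1, rng)[0] for _ in range(trials))
print(f"{fixed} of {trials} new variants became the only version")
```

Under these assumptions most new variants vanish within a few generations, but a small fraction spread through the whole population by chance alone, with no "scribe" ever checking whether the text still makes sense.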

The second line of evidence we expect from biological evolution is that a lineage may separate, and the two populations go on to form distinct – but closely related – species. The evidence, as we have seen from our analogy using modern West Frisian and modern English, is shared similarities between the two resulting groups. For species, similarities between genomes can be evaluated to look for evidence of shared ancestry. To return to the GULO example, it has long been known that, in addition to humans, other primates also lack the ability to synthesize their own vitamin C. DNA sequencing of other primate genomes reveals that they too have the remains of the GULO gene sequence, as we do. Further comparison reveals that many of the mutations that remove the function of this enzyme in humans are shared with other primates. One example is a single DNA-letter deletion that removes the function of the gene (highlighted in yellow):

[Figure: GULO sequence comparison, with the shared single-letter deletion highlighted in yellow]

When faced with evidence such as this, there are two competing hypotheses that biologists consider. The first hypothesis is that this shared similarity (a deletion that removes the function of a gene) occurred in the common ancestral population of present-day humans, chimpanzees, and orangutans before this population separated into the lineages that led to the present-day species. The alternative, less likely hypothesis is that this precise mutation occurred three times, once in each of three separate lineages. While less likely, this option remains a possibility, and one that a geneticist would take seriously. What is needed, of course, is more evidence from other regions of the genome – while any one shared similarity might be attributed to chance, the combined force of many such similarities would provide a compelling case.
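To see why one shared mutation is suggestive while many are compelling, here is a toy calculation. The probability `p` is an invented round number chosen purely for illustration (no such figure has been given in this series), and the sketch assumes that separate mutations are independent of one another. It shows only the shape of the argument, not a real estimate.

```python
from fractions import Fraction

# ASSUMPTION: illustrative chance that one specific mutation arises and
# spreads through a single lineage. This is a made-up number.
p = Fraction(1, 1000)

def chance_single_origin(p):
    """The mutation happens once, in the common ancestral population."""
    return p

def chance_independent(p, lineages=3):
    """The same precise mutation happens separately in every lineage."""
    return p ** lineages

# How many times more probable the single-origin explanation is.
ratio = chance_single_origin(p) / chance_independent(p)
print(f"One shared mutation: single origin favoured {ratio}-fold")

# If the shared mutations are independent, their ratios multiply.
for sites in (1, 2, 5):
    print(f"{sites} shared mutation(s): favoured {ratio ** sites}-fold")
```

Whatever value is plugged in for `p`, the pattern is the same: each additional shared mutation multiplies the case for a single origin, which is why the next step is to look at more of the genome.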

Next, we’ll examine a larger data set to see if the hypothesis of shared ancestry remains supported, and begin to explore how shared ancestry can inform us about our population dynamics as we became human.




Venema, Dennis. "Linguistics and the Question of Common Ancestry" N.p., 29 Jan. 2015. Web. 18 January 2019.



About the Author

Dennis Venema

Dennis Venema is professor of biology at Trinity Western University in Langley, British Columbia. He holds a B.Sc. (with Honors) from the University of British Columbia (1996), and received his Ph.D. from the University of British Columbia in 2003. His research is focused on the genetics of pattern formation and signaling using the common fruit fly Drosophila melanogaster as a model organism. Dennis is a gifted thinker and writer on matters of science and faith, but also an award-winning biology teacher—he won the 2008 College Biology Teaching Award from the National Association of Biology Teachers. He and his family enjoy numerous outdoor activities that the Canadian Pacific coast region has to offer. 
