If your heart is right, then every creature is a mirror of life to you and a book of holy learning, for there is no creature - no matter how tiny or how lowly - that does not reveal God’s goodness.
Thomas a Kempis - Of the Imitation of Christ (c.1420)
Paralogs, Synteny and WGD
I begin by restating that which I wrote as this series was initiated:
One prominent antievolutionary argument put forward by the Intelligent Design Movement (IDM) is that significant amounts of biological information cannot be created through evolutionary mechanisms – processes such as random mutation and natural selection. ID proponent and structural biologist Doug Axe frames the argument this way (his comments begin at approx. 15:19 in the video):
“Basically every gene, every new protein fold… there is nothing of significance that we can show [that] can be had in that gradualistic way. It’s all a mirage. None of it happens that way.”
The importance of this line of argumentation for the IDM can be seen clearly in Stephen Meyer’s book, Signature in the Cell (published in 2009). In this book, Meyer claims that an intelligent agent is responsible for the information we observe in DNA because, in his words, natural mechanisms “will not suffice” to explain it:
Since the case for intelligent design as the best explanation for the origin of biological information necessary to build novel forms of life depends, in part, upon the claim that functional (information-rich) genes and proteins cannot be explained by random mutation and natural selection, this design hypothesis implies that selection and mutation will not suffice to produce genetic information … (p. 495)
It’s hard to overstate the importance of this argument for Meyer in Signature, and for the IDM as a whole. In the conclusion to a pivotal chapter entitled “The Best Explanation” Meyer presents the following summary of his case:
Since the intelligent-design hypothesis meets both the causal-adequacy and causal-existence criteria of a best explanation, and since no other competing explanation meets these conditions as well –or at all–it follows that the design hypothesis provides the best, most causally adequate explanation of the origin of the information necessary to produce the first life on earth. Indeed, our uniform experience affirms that specified information … always arises from an intelligent source, from a mind, and not a strictly material process. So the discovery of the specified digital information in the DNA molecule provides strong grounds for inferring that intelligence played a role in the origin of DNA. Indeed, whenever we find specified information and we know the causal story of how that information arose, we always find that it arose from an intelligent source. It follows that the best, most causally adequate explanation for the origin of the specified, digitally encoded information in DNA is that it too had an intelligent source. (p. 347)
Put more simply, Meyer claims that if we see specified information, we infer design, since we know of no mechanism that can produce specified information through an unintelligent, natural process. As a logical argument, Meyer’s position works only if (and this is a big if) his premises are correct.
So that’s how the series began. In each part of this series, I have shown why Meyer is wrong. Natural mechanisms can explain the origin of new information, and they can do so on a grand scale. Step by step and post by post, I have shown how new information has arisen in the history of life and how this occurs through natural means. I have also emphasized that this does not in any manner exclude God from the processes. The natural laws that bring about this increase in information are a reflection of God’s ongoing natural activity. Without that activity all would disintegrate into nothingness. What science has not demonstrated (in contrast to what Meyer has proposed) is that God’s supernatural activity is also necessary. Furthermore, there is nothing in Scripture which mandates that we should expect God’s supernatural activity to be necessary to bring about an increase in information. This removes the matter from the realm of a religious question and causes us to evaluate Meyer’s proposal on the basis of the quality of the science itself.
In the present post, I demonstrate the evidence that two huge increases (doublings) in information content likely occurred in the evolution of vertebrates (organisms with a backbone) about 450 million years ago. Indeed it is likely that these doublings served as a key prelude to the huge array of vertebrate organisms (including ourselves) that subsequently arose in the history of creation.
Biologists have known for a long time that genes exist in families where each gene within a family is clearly related to other members of the same family. Individual genes for building hemoglobin, for example, are members of a single family. It is known that each member of the family arose through a series of duplication events, all of which trace back to a single ancestral gene. After each duplication event, different mutations accumulate in the two members of the duplicate pair, and the copies diverge from identity. As gene family members change, their functions can change as well. Genes like this (those which are clearly related to each other historically through a duplication event) are called paralogs. Every time a gene duplicates, it provides an opportunity for new information (complex specified information, to use Meyer’s term) to arise. This has happened routinely in the history of life and it occurs through natural mechanisms that are well understood.
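For readers who like to see the idea in miniature, here is a toy sketch of duplication followed by divergence. It is only an illustration under simplifying assumptions: the sequence is random, the only mutations are point substitutions, and selection and function are not modeled at all. The names mutate and identity are made up for this example.

```python
import random

BASES = "ACGT"

def mutate(seq, n_mutations, rng):
    """Apply n point substitutions at randomly chosen positions."""
    seq = list(seq)
    for _ in range(n_mutations):
        pos = rng.randrange(len(seq))
        # Always substitute a base that differs from the current one.
        seq[pos] = rng.choice([b for b in BASES if b != seq[pos]])
    return "".join(seq)

def identity(a, b):
    """Fraction of positions at which two equal-length sequences match."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

rng = random.Random(0)
# A made-up 60-base "ancestral gene" (not a real sequence).
ancestral = "".join(rng.choice(BASES) for _ in range(60))

# Duplication: two identical copies of the same gene.
copy_1, copy_2 = ancestral, ancestral

# Divergence: each copy accumulates its own, independent mutations.
copy_1 = mutate(copy_1, 10, rng)
copy_2 = mutate(copy_2, 10, rng)

print(f"identity between the two paralogs: {identity(copy_1, copy_2):.2f}")
```

The two copies start out identical and drift apart because their mutations are independent, which is why paralogs remain recognizably related while no longer being the same.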
Make mine a double - double
In addition to the widespread occurrence of duplication of individual genes and subsequent divergence, there has long been speculation that early in vertebrate evolution there was a time when the entire genome (the complete collection of genes) was doubled in a vertebrate ancestor. This is called a whole-genome duplication (WGD) event. Moreover, some evidence suggested that perhaps there was not just one, but rather two sequential WGD events in the early vertebrate lineage. WGD events provide a wealth of raw material for evolutionary innovation, since genes dedicated to one function now have a copy that is no longer constrained by natural selection to perform that role (since the other copy can do that function). In many cases, the new gene copies are lost before they neofunctionalize (i.e. become different enough to gain a new function) – but in some instances, the copies are maintained and acquire new functions to become paralogs (such as we saw for steroid hormone receptors in detail in Part 3 of this series).
While there was some evidence to suggest that vertebrates had undergone one, or perhaps two, rounds of WGD early in their evolution, it was not, until recently, possible to distinguish the signature of a true vertebrate WGD event from smaller gene duplication events that accrued over time. Two key pieces of information were needed: first, the complete genome sequence of an organism closely related to vertebrates, and secondly, knowledge of the precise arrangement of the genes thought to be involved in the WGD events.
The first item, a complete genome sequence of a close relative of vertebrates, was needed in order to determine which genes would have been present at the time of the proposed WGD event(s). Genes present both in a modern close relative of vertebrates (the tunicate Ciona intestinalis) and in modern vertebrates (such as humans and fish) indicate which subset of modern vertebrate genes was present in the single ancestral species from which both tunicates like Ciona and vertebrates like us were derived. With the Ciona genome in hand, and a vast array of vertebrate genomes, researchers were able to determine this set of genes, and distinguish them from genes that cropped up later in vertebrate evolution, or independently in either group.
The next question was a relatively simple one: now that the subset of genes present at the proposed time for WGD was identified, how could the hypothesis of WGD be tested? The answer is in what is known as synteny: the physical arrangement of genes in a specific order on chromosomes. Darrel Falk and I have previously discussed synteny when examining human – chimpanzee common ancestry, and readers who have not read that post might find it helpful. In this case, when testing the hypothesis of WGD events, the researchers knew that this process would produce a specific pattern of gene arrangements in modern genomes – a pattern that accounts for both duplication and loss of genes. Additionally, it would resolve the 1x WGD versus 2x WGD debate, since these two possibilities would produce different genomic patterns.
Signature in the synteny, redux
Consider a hypothetical genome that has only nine genes (represented by the colored boxes, redrawn from Figure 1 in Dehal and Boore, 2005) in a specific order on one chromosome pair. For simplicity’s sake, we’ll only show one copy of the chromosome. This chromosome is first duplicated in the WGD event, followed by large-scale loss of many redundant genes. Some genes, however, persist as paralogs (gene copies that pick up new functions and thus are not eliminated).
The point is that this event would retain the spatial pattern of the original gene set prior to duplication, even as some copies are lost. The new genome of this organism would now have two chromosome pairs, each copied from an original single chromosome, with some genes lost, and some genes now present as paralogs. Now imagine if a second WGD event occurred:
This event would produce an organism that now has four chromosome pairs. As before, rapid gene loss (of redundant copies) and some neofunctionalization (to produce paralogs) would be expected.
The final pattern would have groupings within the same genome (groups of synteny) where paralogs are arranged in the same spatial pattern four times over. The four groupings of synteny would not be expected to be identical, since gene loss might leave as few as one copy of a given gene, or two, three or four persisting paralogs. While the hypothesis of WGD could not be resolved by looking at single paralog families, the overall pattern of four-fold synteny would be distinctive and unmistakable when investigators examined whole genomes. Tunicates serve as the control: although they are closely related to vertebrates, the two hypothesized duplication events occurred in ancestral vertebrate species that are not part of the tunicate “family tree.”
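The thought experiment above can be played out in a few lines of code. What follows is a rough sketch, not the method used by Dehal and Boore: the nine genes are just the letters A through I, gene loss is modeled as a coin flip with an assumed loss rate of 0.5, and the one rule imposed is that the last remaining copy of a gene is never lost. The function names are invented for this example.

```python
import random

def whole_genome_duplication(chromosomes):
    """Return a genome in which every chromosome is present twice."""
    return [list(chrom) for chrom in chromosomes for _ in range(2)]

def lose_redundant_genes(chromosomes, loss_rate, rng):
    """Delete gene copies at random, but never the last copy of a gene."""
    copies = {}
    for chrom in chromosomes:
        for gene in chrom:
            copies[gene] = copies.get(gene, 0) + 1
    result = []
    for chrom in chromosomes:
        kept = []
        for gene in chrom:
            if copies[gene] > 1 and rng.random() < loss_rate:
                copies[gene] -= 1   # this redundant copy is lost
            else:
                kept.append(gene)   # this copy persists, in its original position
        result.append(kept)
    return result

def two_rounds(ancestor, loss_rate=0.5, seed=1):
    """Two successive rounds of WGD, each followed by gene loss."""
    rng = random.Random(seed)
    genome = [list(ancestor)]
    for _ in range(2):
        genome = whole_genome_duplication(genome)
        genome = lose_redundant_genes(genome, loss_rate, rng)
    return genome

ancestor = list("ABCDEFGHI")  # nine hypothetical genes, in chromosomal order
genome = two_rounds(ancestor)

# Print each chromosome aligned to the ancestral order; "-" marks a lost copy.
for chrom in genome:
    print("".join(gene if gene in chrom else "-" for gene in ancestor))
```

Each printed row is one of the four resulting chromosomes. Reading down a column shows how many paralogs of that gene persist (between one and four), while reading across a row shows that the surviving genes keep their ancestral order. That preserved order is the four-fold synteny pattern the text describes.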
Testing the WGD hypothesis, and implications for ID
The research group first identified 3,753 genes that are present as single copies in Ciona but present as multiple paralogs in fish and humans. These are the genes that may still show a WGD or 2x WGD synteny signature in modern vertebrates (although chromosomal rearrangements may erase the synteny signature over time). Significantly, a large percentage of these modern paralogs are still present in four-fold synteny groups that span about 25% of the human genome (and thus have persisted as blocks of synteny for approximately 450 million years). This evidence is a strong indication that the modern vertebrate genome went through two rounds of WGD early in its evolution, and that these events provided substantial “raw material” for the acquisition of new information through gene divergence and neofunctionalization.
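The first filtering step, finding genes that are single-copy in Ciona but multi-copy in vertebrates, is simple to express in code. The sketch below uses an invented copy-number table with four made-up gene families; it illustrates the logic of the filter only and does not reproduce the study's data or pipeline.

```python
# Hypothetical copy-number table: gene family -> copies found in each genome.
# All names and numbers are invented for illustration.
copy_counts = {
    "famA": {"ciona": 1, "human": 4, "fish": 3},
    "famB": {"ciona": 1, "human": 1, "fish": 1},
    "famC": {"ciona": 2, "human": 3, "fish": 2},
    "famD": {"ciona": 1, "human": 2, "fish": 2},
}

def candidate_families(counts, outgroup="ciona"):
    """Families with one copy in the outgroup and several in every vertebrate."""
    selected = []
    for family, per_genome in counts.items():
        single_in_outgroup = per_genome[outgroup] == 1
        multiple_in_vertebrates = all(
            n > 1 for genome, n in per_genome.items() if genome != outgroup
        )
        if single_in_outgroup and multiple_in_vertebrates:
            selected.append(family)
    return sorted(selected)

print(candidate_families(copy_counts))
```

In this toy table only famA and famD pass the filter: famB never duplicated in vertebrates, and famC is already duplicated in the tunicate, so neither can tell us anything about a vertebrate-specific doubling.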
In other words, gene duplication and divergence to produce new CSI appears to be commonplace in evolution, including the evolution of our own species. Far from being rare exceptions, multiple lines of genomic evidence point to new structures, functions and information being produced through natural means. If the Intelligent Design Movement wishes to maintain that natural mechanisms cannot produce new information, it needs to address this widespread and compelling pattern.
In the next installment of this series, we’ll return to a topic of great interest to evangelical Christians: are the marked differences between humans and chimpanzees explainable via new information arising through evolutionary mechanisms?
Dehal, P., and Boore, J. (2005). Two Rounds of Whole Genome Duplication in the Ancestral Vertebrate. PLoS Biol 3(10): e314