In the last post in this series, we discussed how genuinely “new” features seldom arise through evolution. Instead, what we observe is descent with modification: evolution works by modifying existing features into “new” ones. To return to our language analogy, modern English in one sense is not “new”: it is a (highly) modified descendant of Anglo-Saxon. Of course, in another sense, it is very much “new” in that it has changed so significantly over the last thousand years. So too with evolution: whether one is willing to call evolutionary changes “new” or not, substantial change can accumulate over time in a lineage. To be heritable, that change ultimately needs to occur at the DNA level, in the “information state” of a species over time.
As we have already discussed, such changes often arise from subtle modification of preexisting genes: changes in regulatory DNA directing where and when genes are transcribed and translated, duplications of genes allowing the duplicates to pick up different functions, and so on. While these mechanisms appear to provide the bulk of incremental change over time, other more dramatic events are also possible. Sometimes, genes are formed “de novo” (literally, “from new”), and as more and more genomes are sequenced, biologists are finding evidence for an ever-increasing number of such newly formed genes. Such genes can be identified by comparing the protein products of genes in a group of related organisms. If one species has a protein that its relatives do not, it may be that the gene for this protein was formed after its lineage separated from the other species closest to it.
Genes like this are actually easy to find these days, since they can be identified by simply comparing protein databases of species known to be closely related to one another. In a sea of similarities, these proteins stick out like sore thumbs. They’re new, or at least newer than the products of the other mechanisms we have looked at, all of which are incremental modifications of previously existing protein coding genes. Cell biologists often call protein genes “ORFs,” short for “Open Reading Frames,” meaning sequences of DNA letters that can be read off, or translated, into a protein. These “lonely” protein genes therefore picked up the nickname “ORFan” genes. They are also known as “taxonomically restricted genes,” which is just a fancy way of saying that they show up only in some lineages, but not in closely related lineages.
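For readers who like to see the idea in code, here is a minimal Python sketch of the database comparison described above. Everything in it is an assumption made for illustration: the species names, the protein sequences, and the 0.5 cutoff are invented, and the similarity score comes from Python's general-purpose difflib module rather than from the alignment tools biologists actually use for this work.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Crude 0-to-1 similarity score; a stand-in for a real alignment score."""
    return SequenceMatcher(None, a, b).ratio()


def find_orfans(target: dict, relatives: dict, threshold: float = 0.5) -> list:
    """Return names of target proteins with no similar protein in any relative.

    target:    {protein_name: amino_acid_sequence} for the species of interest
    relatives: {species_name: {protein_name: amino_acid_sequence}}
    threshold: hypothetical cutoff, chosen only for this toy example
    """
    orfans = []
    for name, seq in target.items():
        has_match = any(
            similarity(seq, other_seq) >= threshold
            for proteome in relatives.values()
            for other_seq in proteome.values()
        )
        if not has_match:
            orfans.append(name)
    return orfans


# Invented toy data: protein_1 has near-identical relatives, protein_2 has none.
species_a = {
    "protein_1": "MKTAYIAKQRQISFVKSHFSRQ",
    "protein_2": "MLLWWCCPPGGHHNNDDEE",
}
relatives = {
    "species_b": {"protein_1b": "MKTAYIAKQRQISFVKSHFARQ"},
    "species_c": {"protein_1c": "MKTAYIVKQRQISFVKSHFSRQ"},
}

print(find_orfans(species_a, relatives))  # ['protein_2']
```

The protein that has no counterpart in either relative is the one flagged as a candidate ORFan. A real analysis would also have to rule out gaps in the databases before concluding that a gene is truly restricted to one lineage.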
ID and de novo genes
As we have seen in previous posts, those in the Intelligent Design (ID) movement argue that natural processes cannot explain the ultimate origin of the information we see in biological systems (and as we have seen, that argument is seriously undermined by the evidence that the DNA code was formed, at least in part, by chemical affinities). In addition to this argument, the ID movement also claims that mutation and natural selection are unable to produce new information either from previously existing genes, or through de novo gene formation. For ID writers such as Stephen Meyer, the existence of ORFan genes is a sure sign that design was required for their production—where “design” means “apart from a natural process.” If evolution cannot account for them, Meyer argues, then design can be inferred.
This issue—the ability of natural mechanisms to produce “new” biological information—was part of the famous Kitzmiller v. Dover trial that tested the constitutionality of teaching ID in US public schools. At the trial, one expert witness for the plaintiffs, cell biologist Kenneth Miller, presented evidence that natural mechanisms were capable of generating new biological information. In the following exchange, Miller is being questioned by his own legal counsel under direct examination:
Q. Now, has there been scientific research done on this proposition of whether or not there are natural explanations for new biological information?
Miller: Yes, there has, in fact, a great deal.
Q. And could I direct your attention to Plaintiffs' Exhibit 245. Do you recognize this exhibit?
Miller: Yes, I do. This is a review article that was written in a very prestigious journal, Nature Reviews Genetics, and it's written by Manyuan Long and several other people. And the title of the article is, “The Origin of New Genes, Glimpses From the Young and the Old.” It's an article that I read immediately, as many scientists did when it came out, because it describes a number of mechanisms by which new genetic information is developed by the processes of evolution.
One of the mechanisms that Miller notes in his testimony is de novo gene origination, since it is discussed in the review paper authored by Long and colleagues that Miller offers as a summary of the relevant evidence. Because this paper (and the genetic mechanisms for producing new biological information it discusses) featured prominently in the trial, it captured the attention of ID writers in the wake of their defeat in Kitzmiller. Stephen Meyer, for example, discusses the Long paper in his 2013 book Darwin’s Doubt as follows (page 211):
During the 2005 Kitzmiller v. Dover trial… biologist Kenneth Miller cited Long’s paper in his testimony. He said that it shows how new genetic information evolves… But do evolutionary biologists really know this?
After discussing mechanisms that reshape existing genes – and claiming them to be inadequate – Meyer turns to the issue of de novo gene origination (page 219):
Long does cite at least one type of mutation that does not presuppose existing genetic information, the de novo origination of new genes.
Curiously, Meyer claims that de novo gene origination is based on an unknown mechanism (page 221, emphasis his):
Indeed, evolutionary biologists typically use the term “de novo origination” to described unexplained increases in genetic information; it does not refer to any known mutational process.
Taking stock, then, many of the mutational processes that Long cites either: (1) beg the question as to the origin of the specified information contained in genes or parts of genes, or (2) evoke completely unexplained de novo jumps – essentially evolutionary creation ex nihilo (“from nothing”).
When I first read this section of Darwin’s Doubt, I was very surprised, and for a straightforward reason: the de novo origin of genes is not at all an unexplained mystery, nor does it rely on unknown mutational processes. Let’s examine how de novo genes actually form.
New? Yes; but modified nonetheless
The error in Meyer’s argument seems to be that he thinks that de novo, ORFan genes come out of nowhere in some unexplained fashion (Darwin’s Doubt, page 216):
Thus, even if it could be assumed that similar gene sequences always point to a common ancestor gene, these ORFan genes cannot be explained using the kind of scenarios that Long’s article cites. Since ORFans lack sequence similarity to any known gene – that is, they have no homologs in even distantly related species – it is impossible to posit a common ancestral gene from which a particular ORFan and its homolog might have evolved. Remember: ORFans, by definition, have no homologs. These genes are unique —one of a kind—a fact tacitly acknowledged by the increasing number of evolutionary biologists who attempt to ‘explain’ the origin of such genes through de novo (“out of nowhere”) origination.
It all makes for a good argument on the surface, but unfortunately for Meyer, the argument does not hold up once one learns a bit more about ORFan genes. Yes, they are new protein coding genes, but they are not formed from brand-new DNA sequences that arise through some unknown mechanism. They are, as we expect for the products of evolution, slightly modified descendants of very similar DNA sequences, derived from a common ancestor. Meyer is technically correct that ORFans do not have homologous genes in other organisms, but he is apparently unaware that they do have homologous DNA sequences in closely related organisms. Though these sequences are not protein-coding genes, they are often very close to becoming gene sequences. Moreover, these DNA sequences are often in the same relative location in the genome as the new ORFan gene.
As we have discussed previously, genes have a number of features that allow them to be transcribed into mRNA and translated (by the ribosome) into a functional protein. They need an open reading frame—a DNA sequence that can code for a protein—as well as regulatory DNA that directs when and where the gene is transcribed and translated. In a large genome, these sorts of sequences can be assembled by chance—forming a “proto gene” as it were, that is merely one or two mutations away from becoming a protein coding gene. These genes, of course, cannot be essential genes when they are first produced by random chance, since prior to their coming into being the organism did not need them. Interestingly, though, we observe a reasonable number of ORFan genes that are now subject to natural selection. So, far from being the result of an “out of nowhere, unexplained” process, ORFan genes are evidence that new protein genes can form readily, and that some of them go on to pick up essential roles after they form.
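The “one or two mutations away” idea can be illustrated with a short Python sketch. The DNA sequence below is invented for the purpose and is not taken from any real genome. The sketch looks only at the open reading frame and ignores the regulatory DNA that a real gene also needs. It shows how a single-letter substitution that removes an early stop codon turns a very short reading frame into a much longer one.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(dna: str) -> str:
    """Return the longest stretch that starts with ATG and ends at the first
    in-frame stop codon (forward strand only). Returns "" if there is none."""
    best = ""
    for start in range(len(dna) - 2):
        if dna[start:start + 3] != "ATG":
            continue
        # Walk codon by codon from this start until a stop codon appears.
        for pos in range(start, len(dna) - 2, 3):
            if dna[pos:pos + 3] in STOP_CODONS:
                candidate = dna[start:pos + 3]
                if len(candidate) > len(best):
                    best = candidate
                break
    return best


# Hypothetical "proto gene": an early TGA stop cuts the reading frame short.
codons = ["ATG", "GCT", "TGA", "GCT", "AAA", "GGT", "CCG", "TTT", "TAA"]
proto_gene = "CC" + "".join(codons) + "CC"

# One substitution (A to G at index 10) turns the early stop TGA into TGG.
derived = proto_gene[:10] + "G" + proto_gene[11:]

print(len(longest_orf(proto_gene)))  # 9  (ATG GCT TGA)
print(len(longest_orf(derived)))     # 27 (reads through to the final TAA)
```

The two sequences differ by a single letter and would be recognized as homologous DNA, yet only the second contains a reading frame long enough to encode more than a couple of amino acids.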
Revisiting nylonase – an ORFan poster child
One example of an ORFan gene that I have discussed previously, and one of my favorites due to its dramatic nature, is nylonase. Nylonase is a bacterial enzyme that breaks down the synthetic polymer nylon. This enzyme arose in a de novo fashion in a population of bacteria living in a chemical wastewater pond, and when it arose, it was of significant benefit since it allowed the bacteria to use nylon as a food source. Interestingly, nylonase arose through a mutation in another protein coding gene: an insertion of a single DNA letter. This single insertion created a “stop codon” (three DNA letters that tell the ribosome to stop adding amino acids to the protein chain) and simultaneously created a new “start codon” (three DNA letters that tell the ribosome to start making a protein chain from that point in the mRNA code). The new start codon, however, was not aligned with the old gene’s codons: the new reading frame was shifted over by one DNA letter. Thus, the resulting protein was brand new, and completely unlike the previous one. The new protein acted as a weak, inefficient nylonase, but it was better than nothing. Later, this de novo protein gene was duplicated, and the duplicate gene picked up a few mutations that greatly improved its ability to break down nylon.
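To see how one inserted letter can do both jobs at once, consider the following toy Python sketch. The sequence is invented and is not the real nylonase gene; it is simply arranged so that inserting a single T creates the four letters ATGA, which read as a stop codon (TGA) in the old reading frame and as a start codon (ATG) in a new, shifted frame. The codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, with codons ordered by the bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_from(dna: str, start: int) -> str:
    """Translate codon by codon from `start` until a stop codon or the end."""
    protein = []
    for pos in range(start, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[pos:pos + 3]]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical original gene (ATG ... TAA) plus a few downstream letters.
original = "ATGCCAGACGTTCGAACTGGCTAA" + "GTGA"

# Insert a single T after the sixth letter: ...CCA GAC... becomes ...CCA TGA...
mutant = original[:6] + "T" + original[6:]
new_start = mutant.find("ATG", 1)  # the start codon created by the insertion

print(translate_from(original, 0))         # MPDVRTG
print(translate_from(mutant, 0))           # MP (old frame now stops early)
print(translate_from(mutant, new_start))   # MTFELAK (new, shifted frame)
```

In this toy case the old protein is truncated after two amino acids, while the new reading frame produces a chain with an entirely different sequence, even though it is read from almost exactly the same DNA letters.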
Nylonase is thus an excellent example of new biological information—with an important new function—forming through de novo gene origination. What Meyer misunderstands, and thinks of as a challenge to evolution, is actually an excellent example of what he thinks is impossible for evolution to accomplish.
Know when to hold ’em, and when to fold ’em
In Darwin’s Doubt, Meyer makes a second claim—that the basic elements of functional protein genes, known as protein folds, cannot be formed by natural processes and thus must be the products of design. In the next post in this series, we’ll learn more about how proteins fold up into stable structures as we investigate this claim.