
Signature in the Pseudogenes, Part 1


May 10, 2010 Tags: Genetics

Today's entry was written by Dennis Venema and Darrel Falk. Please note the views expressed here are those of the authors, not necessarily of BioLogos. You can read more about what we believe here.

In our previous post, we likened comparing genomes of related organisms to reading alternative history novels. We noted that before two species diverge, they share the same “backstory” but then go on to accumulate changes after separation.

One interesting feature of looking at genomes is that often we can find the mutated remains of once-functional genes. These are called pseudogenes, or “false genes.” Pseudogenes might be part of a shared backstory for two species, or they might crop up independently after two species go their separate ways. Either way, they are easy to spot at the molecular level because they retain a lot of similarity. For example, here are the DNA sequences for the start of one particular gene in several species (for our purposes, its function is not important).

As you examine the sequence of letters above, note that DNA contains a four letter code. This string of “letters” is made up of the molecules adenine, guanine, cytosine, and thymine strung together within the large super-molecule, DNA. Our cells read the encoded instructions and, interpreting the code, build each of the different proteins required for the maintenance of life.

Note that the instructions have changed a little since these five species had a common “backstory” (ancestor). Despite the changes, for the dog, mouse and chicken, the protein is fully functional. This is not so, however, for the chimp and human. The “dot” (highlighted by the red arrow) means that one single letter of the instructions has been deleted. This change would be like finding this sentence in the first edition of a book:


But, in the second edition of the same book, we find this instead:


The sentence has no meaning anymore, but, as we compared the first and second versions of the book, we would be able to tell exactly what had happened: the letter “I” had been deleted from the sentence, and everything following would be messed up. A single deletion throws off the whole code from that point on. Thus, for chimps and humans, the instructions become gibberish, and the protein molecules produced according to that gene’s instructions are now badly mangled and unable to function.
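The effect of a single deletion on a code read three letters at a time can be sketched in a few lines of Python. This is only an illustration: the sentence below is a hypothetical stand-in made of three-letter words, not the sentence from the original figures, and the function names are invented for this example.

```python
def read_in_triplets(text: str) -> str:
    """Split a string of letters into three-letter "words", ignoring spaces."""
    letters = text.replace(" ", "")
    return " ".join(letters[i:i + 3] for i in range(0, len(letters), 3))


def delete_letter(text: str, index: int) -> str:
    """Return the text with the single letter at the given position removed."""
    return text[:index] + text[index + 1:]


# Hypothetical "first edition" sentence: every word has three letters,
# just as every codon in a gene has three DNA letters.
first_edition = "THE BIG RAT ATE THE RED BUG"
letters = first_edition.replace(" ", "")

# "Second edition": the letter "I" has been deleted.
second_edition = delete_letter(letters, letters.index("I"))

print(read_in_triplets(first_edition))   # THE BIG RAT ATE THE RED BUG
print(read_in_triplets(second_edition))  # THE BGR ATA TET HER EDB UG
```

Everything before the deletion still reads correctly; everything after it is shifted by one letter, which is why one missing letter garbles the rest of the instructions.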

As you go back and examine the sequence in the human/chimp pseudogene, notice that both species carry the exact same deletion. This suggests that the deletion happened once, in one individual: a common ancestor with whom both species have a shared backstory.

Let’s return to our book analogy. Presumably all copies of the second edition had the exact same non-functional sentence about the BIG RAT. If someone were to examine two second edition copies of the book, each of which was missing that same letter, “I,” it would be unthinkable to propose that the exact same mistake occurred independently in the printing of each of the two books. Similarly, it would be incorrect to propose that the new incoherent sentence had some important meaning which literary scholars will discover some day. We would know, plain and simple, that a mistake had occurred. Anything other than that would be highly contrived.

Today both chimps and humans carry the exact same mutation because they both have the same backstory. However, it is even more poignant than that. There are 20,000 pseudogenes in the human genome. Each has its own unique backstory. Each can be traced out in the same manner we have just done for this one.

The hypothesis of common ancestry makes precise predictions about how pseudogenes will be distributed in related species. Once a gene has been mutated into a pseudogene in a certain species, that pseudogene with its specific inactivating mutation will be passed on to all descendants of that species.

The figure below demonstrates this for a specific pseudogene, which we will term pseudogene “y.” Note that in a very specific individual at a very specific time, gene “y” underwent a change in its code—it mutated. That altered code was passed on to the subsequent generations and ended up in two daughter species, Species A and Species B.

Now consider a second gene, which we call “x.” It also underwent a mutation, but did so earlier in the lineage. Let’s call the new mutated form of this gene pseudogene “x.” This is shown in the next figure. Since this mutation occurred earlier in the lineage in an organism that was a common ancestor to Species A, B, and C, all three of these species carry the abnormal, non-functional version of “x.” The lineage to species D, however, had already broken away. It does not carry the mutated version of “x.”

Finally, consider another gene, which we’ll call “z.” This gene is perfectly functional in Species A, B, and D. However, when you examine its code in Species C, guess what? There it is a non-functional pseudogene. What do you think has happened here? This is a recent change, so recent that it occurred in an individual whose descendants went on to become only Species C. Here is a summary figure which illustrates the time at which each of the three mutations occurred and the ramifications of each change.

In this example, since gene “x” is mutated to a pseudogene in the common ancestor of species A, B and C, we would expect to find this pseudogene, with the same exact inactivating mutation, in these three species. Similarly, the pseudogene version of gene “y” with exactly the same code-change should be found only in species A and B. Finally, there are many cases in which a pseudogene is found only within one species, or, at most, a couple of closely related sister species. Pseudogene “z” is our example of that.
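The prediction described above can be written out as a small Python sketch. The tree shape follows the text (D branches off first, then C, then A and B split); the node names such as `ancestor_AB` and the function names are hypothetical labels chosen for this illustration.

```python
# Each entry maps a node to its parent; "root" has no parent.
PARENT = {
    "A": "ancestor_AB",
    "B": "ancestor_AB",
    "ancestor_AB": "ancestor_ABC",
    "C": "ancestor_ABC",
    "ancestor_ABC": "root",
    "D": "root",
}

# Where each inactivating mutation arose, named by the node
# at the lower end of the branch on which it occurred.
MUTATION_ORIGIN = {"x": "ancestor_ABC", "y": "ancestor_AB", "z": "C"}

SPECIES = ["A", "B", "C", "D"]


def lineage(node):
    """Return the node followed by all of its ancestors back to the root."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def predicted_carriers(gene):
    """Species whose lineage passes through the point where the mutation arose."""
    origin = MUTATION_ORIGIN[gene]
    return [species for species in SPECIES if origin in lineage(species)]


for gene in ("x", "y", "z"):
    print(gene, predicted_carriers(gene))
# x ['A', 'B', 'C']
# y ['A', 'B']
# z ['C']
```

A species carries a given pseudogene only if the mutation arose somewhere on its own line of descent, so the carriers of each pseudogene always form a nested group on the tree.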

If life really does have a backstory of this sort, then you can see the power of this technique for tracing the lineage. It allows us to trace the history of life, species by species. Interestingly though, there have long been other—non-genetic—ways of tracing life’s history. Biologists have been using these alternative methods for many decades. For example, by examining fossils (paleontology) and tracing changes in body structure (comparative anatomy), the history of life had already been pretty much worked out before DNA sequencing data ever came into the picture.

For the most part, the data which are emerging from DNA sequencing projects simply verify that which biologists have known for years through these other methods of exploring life’s history. Still, the results are extremely gratifying in their consistency. In science, one looks for corroborating evidence. If the DNA data had suggested totally different lineages, then there would have been good reason to doubt the common descent hypothesis. Such is not the case though. The supporting data keep piling up; there is no longer any doubt.

Remember how science works. If there are multiple lines of evidence—each internally consistent with the central overarching principle—a consensus is reached. The theory is judged to be correct and the scientists move on to further explore its ramifications.

If the theory of common descent is true, then it also makes predictions about what we would not expect to find at the genetic level. We go on to explore this topic in our next post.

Dennis Venema is professor of biology at Trinity Western University in Langley, British Columbia. He holds a B.Sc. (with Honors) from the University of British Columbia (1996), and received his Ph.D. from the University of British Columbia in 2003. His research is focused on the genetics of pattern formation and signaling using the common fruit fly Drosophila melanogaster as a model organism. Dennis is a gifted thinker and writer on matters of science and faith, but also an award-winning biology teacher—he won the 2008 College Biology Teaching Award from the National Association of Biology Teachers. He and his family enjoy numerous outdoor activities that the Canadian Pacific coast region has to offer. Dennis writes regularly for the BioLogos Forum about the biological evidence for evolution.
Darrel Falk is former president of BioLogos and currently serves as BioLogos' Senior Advisor for Dialog. He is Professor of Biology, Emeritus at Point Loma Nazarene University and serves as Senior Fellow at The Colossian Forum. Falk is the author of Coming to Peace with Science.



This article is now closed for new comments. The archived comments are shown below.

John VanZwieten - #13099

May 10th 2010

Very helpful explanation of pseudo-genes here.

I wonder, do the defective/useless proteins continue to be manufactured?  Or is there a feedback mechanism so that pseudo-genes no longer produce anything at all?

Glen Davidson - #13100

May 10th 2010

It seems that the most frequent reply now to important facts like these is that this is a “theological matter,” that it purports to know the mind of God (well, obviously they’re not sticking with the “it could be aliens” story very well).

Well, no, it needn’t be at its most basic.  In fact, at its narrowest the argument is merely that any “design hypothesis” ought to produce results different from those expected from evolution, which we don’t find.  What does design even mean if it is limited to using the material of heredity and lacks the leaps and foresight possible with intelligence?

But of course the argument generally turns toward theology, mainly because ID is little other than theistic apologia.  How does one answer the claim that life is “so complex that it must have been designed by God” (God is implicit when not explicit) except to ask why God would make things appear to have evolved, and to lack the marks of intelligence? 

Glen Davidson

HornSpiel - #13102

May 10th 2010

“ID is little other than theistic apologia.” Good point, Glen, and embarrassingly bad apologia at that.

Brian in NZ - #13103

May 10th 2010

How many of the 20000 pseudogenes in the human DNA are also common in chimps (or other species)?

Charlie - #13110

May 10th 2010

A common example is our inability to make vitamin C.  Chimps also have this pseudogene; we lost the enzyme needed to synthesize ascorbic acid, which is vitamin C.

Charlie - #13111

May 10th 2010


As far as humans and chimps go, I bet we share a large percentage of the 20,000 pseudogenes.  I don’t know the percentage, but if chimps and humans shared 99.9% of the pseudogenes, wouldn’t you be more inclined to think the two species had a common ancestor?

Chris Massey - #13112

May 10th 2010


I have the same question as Brian in NZ. Do we know how many of our 20,000 pseudogenes can be found in chimps (with identical responsible mutations)?

Also, how do geneticists determine that a pseudogene is indeed a corrupted version of a functional gene found in other species? Is it based on the functional gene and the pseudogene being found in identical locations on the two respective genomes or on the high degree of correspondence between the unmutated portions of the respective nucleotide sequences? Or is it both? Hope that makes sense.

Chris Massey - #13113

May 10th 2010

Sorry, I meant to ask “Dennis and Darrel”

Darrel Falk - #13121

May 10th 2010

Brian and Chris

In one recent study (http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2687790/) of one particular class of genes (ribosomal protein genes), of 1,462 pseudogenes found in chimpanzee, 1,282 are also found in humans, virtually always at the same position.  So you can extrapolate from there to get a rough estimate of what it would be for the complete collection of 20,000.
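As a rough illustration of that extrapolation, here is the arithmetic in Python. It rests on a strong assumption that is not established here: that the shared fraction seen for ribosomal protein pseudogenes carries over to the whole collection. The variable names are invented for this sketch.

```python
# Figures quoted from the study mentioned above.
shared_with_human = 1282
chimp_pseudogenes = 1462

# Approximate total quoted in the article.
total_human_pseudogenes = 20_000

# Assumption: the same shared fraction applies to all pseudogenes.
shared_fraction = shared_with_human / chimp_pseudogenes
estimated_shared = round(shared_fraction * total_human_pseudogenes)

print(f"{shared_fraction:.1%}")  # 87.7%
print(estimated_shared)          # 17538
```

Under that assumption, something on the order of 17,500 of the 20,000 would be shared; the true figure could differ considerably for other classes of pseudogenes.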

In answer to your question about how geneticists identify a pseudo-gene: it is based upon both position and the identification of specific mutational abnormalities which would render the gene non-functional.  There are other factors as well.

Young pseudogenes will often still produce product, albeit defective product.  As time goes by they may accumulate more mutations which block product-formation altogether.  Some pseudogenes (a special class called processed pseudogenes) would never produce a protein product.  There is an interesting reason for that, but it’s a story in its own right.


Darrel Falk - #13123

May 10th 2010


Perhaps I should define “young”: roughly a couple of hundred thousand years or so.


Jeffrey L Vaughn - #13131

May 10th 2010


I remember reading a few years ago of experiments to repair the GULO (vitamin C) pseudogene in monkeys.  Or possibly, it was replacing the non-working GULO gene with a working “glow” gene.

When the working mouse GULO gene was replaced by a jellyfish glow gene, glow-in-the-dark mice were created.

When the non-working GULO gene in monkeys was replaced, the monkeys all died.  I can’t find the references.  Assuming I remember correctly, wouldn’t this likely imply that the GULO pseudogene DNA actually codes another gene that is absolutely necessary for monkeys and higher primates, rather than merely being a useless piece of code?


Chris Massey - #13134

May 10th 2010


Thanks for pointing me to that study. Can you help clarify something for me? When they find “1,282 human-chimp pseudogene pairs found in syntenic regions” does that mean that in each case the gene is pseudogenized in the same way in both the human and chimp version? In the example you give in your article there is a common point mutation pseudogenizing both genes. Is that the case for all 1,282 of the pairs in this study - an identical mutation responsible for rendering the gene non-functional? Or would this total include some genes that just happen to be pseudogenized in both humans and chimps but for different reasons (and therefore through independent mutation events)?

Darrel Falk - #13139

May 10th 2010


There are primarily two different types of pseudogenes. Dennis and I have written about the first.  This class involves the inactivation of a previously active gene.

The second class, referred to as processed pseudogenes, is made in a totally different manner.  In this case, an mRNA molecule is copied backwards into a DNA molecule and inserted somewhat randomly at one spot in the genome.  Once inserted, it gets passed on through the generations.  (The paper to which I referred you focuses on this second class.)

Processed pseudogenes have certain trademarks or “signatures” enabling us to recognize their origin.  For example, mRNAs end in a long string of A’s (adenines).  Similarly, the processed pseudogene frequently ends in a long string of A’s.  They have other trademarks as well.

In general members of this class of pseudogenes are never functional.  They go into the genome somewhat randomly and they seldom go into a spot adjacent to an appropriate “on” switch.


Chris Massey - #13140

May 10th 2010


Okay, thanks. That makes sense. It sounds a lot like the process by which endogenous retroviruses find their way into our genome.

Darrel Falk - #13144

May 10th 2010


I would have to see the experimental details before I could comment on the death of the monkeys. If you find that paper, please let me know.


MAS - #13203

May 11th 2010

This comment has been removed by the moderator.

Charlie - #13209

May 11th 2010


I’d be interested in that article as well.  Let me know if you find it.

Glen Davidson - #13210

May 11th 2010

Glen you are extremely quick to claim that ID is ‘little other than theistic apologia’

I’m not quick to do so at all.  I and many others (Biologos) have documented it voluminously, and your lack of regard for such scholarship and failure to even acknowledge it do not speak well for you.

You are quick to make naked and false assertions about people who have dealt with these matters carefully and with evidence (the Wedge itself is about all anyone would need).

however in all debates and point/counterpoint articles I have seen/read the ‘Idiots’ and their sympathizers seem to be far more scientifically literate and philosophically astute than the Darwinists.

Considering your lack of regard for evidence and standards of truth, your “witness” to such absurdities counts for very little indeed.  The fact is that you have done nothing but attempt to defame your betters, which is about all that ID ever does.  Indeed, its apologetics consists largely in false charges against actual scientists and scholars, for you all have no chance of winning a fair fight.

Hence the lack of any hint of fairness in your attack.

Glen Davidson

Mike Gene - #13220

May 11th 2010

If life really does have a backstory of this sort, then you can see the power of this technique for tracing the lineage. It allows us to trace the history of life, species by species. Interestingly though, there have long been other—non-genetic—ways of tracing life’s history. Biologists have been using these alternative methods for many decades.

Nice essay.  Easy to read and very powerful.  There is no doubt in my mind that pseudogenes signal common descent, but there is a deeper level of analysis available for those who embrace evolution, yet also remain open to the possibility that it is influenced by design, linked here.

Bilbo - #13234

May 11th 2010

Yes, I agree that Dennis and Darrel have provided a very clear, powerful case for common descent.  What isn’t clear is whether the evolutionary journey has been one of strictly random mutations or if non-random mutations were also necessary, as Behe argues. 

I’m wondering if there is a way to tell, all in the interests of “theistic apologia,” of course.
