ENCODE and “Junk DNA,” Part 2: Function: What’s in a Word?

September 26, 2012 Tags: Genetics

Today's entry was written by Dennis Venema.

Image courtesy of Flickr user prettywar-stl.

On Monday, I introduced “function” as a particularly useful concept in biology, but also cautioned that—like all good concepts—it has “fuzzy” edges. Indeed, it has lots of similarly “fuzzy” peers in the language of biology: for example, asking a molecular biologist “What is a gene?” or asking an ecologist “What is a species?” is not advisable unless you have an hour or more to devote to the conversation. A discussion of biological “function” could generate a similar conversation.

For most biologists, something biological has function if it contributes to the characteristics of an organism in such a way as to favor its reproduction (usually by favoring its survival). Conversely, a biologist who claims that some feature of an organism is non-functional is claiming that this feature does not contribute to or favor survival or reproduction. To return to our historically interesting example from Monday’s post, the wild-type allele of the gene for the enzyme responsible for making purple pigment in pea flowers (the “P” allele) is functional, since it has an observable effect on the characteristics of the organism that favors its reproduction (attracting pollinators, perhaps). When a mutation arose in this gene to prematurely terminate the synthesis of the protein enzyme, the recessive “p” allele resulted. Biologists would not hesitate to label this allele as a loss-of-function allele, because the function they have in mind is that of making purple pigment. The fact that this allele still produces an mRNA and even a partial protein product would not faze them in the least, since the known biological function of the gene has been disrupted. On the other hand, the ENCODE definition of “function” as “any detectable biological activity” presents things differently: by that standard we would not be able to discern any difference between these two alleles, despite the evidence that one is functional (in the sense above) and the other is not.

What this means is that the ENCODE definition of “function” is specific to a context: detecting (any) biochemical activity for a segment of DNA in the genome. As I mentioned in my first post, looking for biochemical activity is a useful and interesting undertaking, and the ENCODE project is impressive in its scope. What it does not do, however, is define “function” in the usual biological sense we have just discussed: that of meaningful contribution to survival and reproduction. In fact, biologists would expect that many DNA sequences that are non-functional in the traditional sense would be detected as “functional” using the ENCODE definition. One such example is that of transposon-derived sequences, which make up nearly 50% of the human genome.

Transposons and ENCODE

We previously examined transposons in our series on “Junk DNA.” In brief, these are parasitic DNA sequences that serve to replicate themselves and spread within genomes. They carry sequences that recruit host enzymes for making mRNA, and they encode a protein enzyme that copies and/or moves the transposon to a new chromosome location. These entities are veritable beehives of biochemical activity, but biologists consider them non-functional (with respect to their hosts) even if they are highly functional (with respect to the transposon). In many cases, however, transposon sequences in mammals are defective: they have picked up mutations such that they no longer make the enzyme they need for movement, or perhaps a mutation has ruined one of the DNA sites the enzyme binds to. As before, these sequences are non-functional with respect to their mammalian host, making no contribution to the host organism at all, and they are non-functional even to themselves (since the transposon can no longer replicate). Even such doubly non-functional sequences, however, will retain detectable biochemical activity. Host DNA-binding proteins will still bind to these sequences, mRNA may be produced, and even the transposon enzyme might be partially made as a non-functional protein. These biochemical activities may persist for thousands of generations before additional mutations silence them, so these sequences would still be identified as “functional” according to the ENCODE criteria. Since almost half of the human genome is made up of such repetitive sequences, it’s not surprising that ENCODE found so much “function.” Yes, these sequences have detectable biochemical activity, but that is exactly what we would expect given what we know about transposons. Nor does such activity demonstrate that these sequences are functional in the stricter sense. Indeed, lines of evidence from comparative genomics strongly suggest they are not.

Consider the onion

One such line of evidence is that closely related species can vary widely in the amount of DNA they contain, yet have the same number of genes. For example, some species in the genus Allium (onion, garlic and related plants) can have over five times as much DNA as other species within the same group. The difference is largely in repetitive DNA sequences, such as transposons and transposon fragments. Such observations are challenging to square with the hypothesis that the species with the larger amounts require all of it for function in the strict sense, since the species in the group are all almost exactly the same structurally. If Onion Species B has five times as much DNA as Onion Species A, it does not mean that all of it is necessary to build the body form of Species B. No, the developmental process for building Species B involves laying down the very same structures that we find in Species A, with only slight modifications. So even if all of the “extra” DNA in Species B is doing something biochemically, it doesn’t mean that it is all necessary to build or maintain the body form. Furthermore, we might notice that the onion has over five times as much DNA as humans. Do we really think that it takes five times more functionally necessary DNA to build an onion than it does to make a human being? No. Much of the extra DNA, put simply, may be “functioning” in some way (i.e. biochemically active), but it is highly unlikely that it is functionally necessary. This observation led evolutionary and genome biologist T. Ryan Gregory to propose the “onion test” as a mental check against proposed universal functions for non-coding DNA (using “function” in the strict sense):

“The onion test is a simple reality check for anyone who thinks they have come up with a universal function for non-coding DNA. Whatever your proposed function, ask yourself this question: Can I explain why an onion needs about five times more non-coding DNA for this function than a human?”

The “vitellogenin test”

Whereas the onion test of meaningful function is a broad look across the genomes of a group of related organisms, a complementary strategy is to examine specific cases of DNA that have been widely accepted as being non-functional (i.e. not necessary for the building and maintenance of the body). Indeed, if the argument against the very idea of “non-functional DNA” is to be convincing to most biologists, it needs to address cases where the accumulated evidence for the standard definition of non-functionality is strong. So, with a tip of the hat to Gregory’s “onion test,” I’d also like to propose a test to be used for the claim that “junk DNA” has been shown to be non-existent. Simply put, the test asks: does the claim address the features we observe in the human Vitellogenin 1 pseudogene?

Since this is a pseudogene that may already be familiar to readers from my previous discussions of “junk DNA,” it will serve as a useful example to explore further. For those who have not yet encountered this example, however, I will summarize its relevant features before going on to re-evaluate it in light of ENCODE.

In egg-laying animals, including some mammals like the platypus, the Vitellogenin 1 (Vit 1) gene produces a protein that is used in the formation of egg yolk. Yolk serves as a source of nutrients for the developing embryo once it is cut off from the maternal supply when the eggshell is formed. Placental mammals, like humans, retain a link to their mothers throughout their embryonic development through the placenta, and therefore do not need egg yolk in the same way that egg-laying organisms do.

Several years ago, a group of researchers went looking for remains of Vit 1 gene sequences in humans and other mammals. According to evolutionary theory, all mammals are the descendants of egg-laying ancestors, meaning that, if traced back far enough, placental mammals and modern egg-laying organisms such as birds once were the same ancestral species, with a common genome. Working with this knowledge, the researchers located the Vit 1 gene sequence in chickens, and took note of the sequences on either side of it (for convenience we’ll call them “Gene A” and “Gene B”). They then located these sequences in the human genome, where they also sit side-by-side. Examining the sequence between Gene A and Gene B in the human genome revealed that the mutated remains of the Vit 1 gene were still present in the human genome, in the exact spot that an expectation of common ancestry (in this case, conservation of genome structure, or shared synteny) would predict:

Also, when comparing the Vit 1 pseudogene between various placental mammals, we observe that several of the inactivating mutations (deletions) are common to all, indicating that they occurred in the last common ancestor of these species, and were subsequently inherited:

To sum up, what we observe in the mammalian Vit 1 pseudogenes is as follows:

  1. The function of the Vit 1 gene in egg-laying organisms is well known and well understood.
  2. Placental mammals, including humans, do not require a functional Vit 1 protein product, yet have a Vit 1 sequence that cannot, due to many mutations, perform its known function as a protein involved in yolk formation. In other words, in placental mammals, the Vit 1 gene has suffered a loss of function that renders it a pseudogene.
  3. A Vit 1 pseudogene in placental mammals can be located using predictions based on shared synteny with egg-laying organisms such as chicken.
  4. Placental mammals, including humans, share a number of identical mutations within their Vit 1 pseudogene, indicating that these mutations happened once in a common ancestor, and were inherited from that common ancestor.
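
The synteny-based search in point 3 can be sketched in a few lines of Python. This is only a toy illustration of the idea, not the procedure Brawand and colleagues actually ran: the gene names follow the “Gene A” and “Gene B” labels used above, the function names are invented for this sketch, and every coordinate is made up for the example.

```python
from typing import NamedTuple, Optional


class Gene(NamedTuple):
    name: str
    start: int  # hypothetical coordinate on the chromosome
    end: int


def interval_between(genes, left, right) -> Optional[tuple]:
    """Return the (start, end) stretch lying between two named flanking genes.

    Returns None if either gene is missing or if the two genes overlap.
    """
    by_name = {g.name: g for g in genes}
    if left not in by_name or right not in by_name:
        return None
    first, second = sorted((by_name[left], by_name[right]), key=lambda g: g.start)
    if first.end >= second.start:
        return None
    return (first.end + 1, second.start - 1)


def genes_within(genes, interval):
    """List the names of annotated genes lying entirely inside the interval."""
    lo, hi = interval
    return [g.name for g in genes if g.start >= lo and g.end <= hi]


# Made-up annotations: in the chicken, Vit 1 sits between Gene A and Gene B.
chicken = [Gene("GeneA", 1000, 5000), Gene("VIT1", 7000, 30000), Gene("GeneB", 33000, 40000)]
# In the human annotation there is no functional gene listed between them.
human = [Gene("GeneA", 2000, 6500), Gene("GeneB", 60000, 71000)]

chicken_gap = interval_between(chicken, "GeneA", "GeneB")
human_gap = interval_between(human, "GeneA", "GeneB")

print(genes_within(chicken, chicken_gap))  # the gene found between the flanks in chicken
print(human_gap)                           # the stretch of human DNA to search for Vit 1 remains
```

The point of the sketch is simply that the flanking genes tell you where to look: the human interval is the place to align against the chicken Vit 1 sequence, and that is where the mutated remains turn up.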

Taken together, these lines of evidence strongly support the conclusion that the Vit 1 gene we observe in placental mammals is non-functional in the strict sense – that it does not contribute to reproduction or survival. The possibility that the Vit 1 sequence in placental mammals might retain some residual biochemical activity (it once was a functional gene, after all) would not change these lines of evidence or the conclusions drawn from them. Moreover, the (however slight) possibility that certain parts of any given pseudogene might have gained an important new function - a process called exaptation that we have discussed previously - does not affect the conclusions drawn from the whole study of Vit 1 as to its origins as a previously functional but now non-functional gene.

Taking the test

Though I have presented much of this evidence about the Vit 1 pseudogene here on BioLogos in the past, I am not yet aware of any other science/faith organization that has addressed this evidence. Web searches for terms such as “junk DNA” or “pseudogene” at various such sites produce a significant number of articles addressing the topic, and all sites examined had at least one page addressing the human GULO / GLO pseudogene as a specific example. Similar searches, including searches for the more generic term “yolk” failed to reveal any discussion of this pseudogene on any of the websites listed. I would invite these groups, all of whom have recently posted on the ENCODE project to suggest that “junk DNA” is no longer a tenable idea, to “take the test” and offer an explanation for the features we observe in the human Vitellogenin 1 pseudogene.

References cited

Vit 1 pseudogene sequences were assembled using the NCBI BLAST site (http://blast.ncbi.nlm.nih.gov/Blast) and from figure S2 of Brawand et al., 2008 (see below).

Brawand, D., Wahli, W., and Kaessmann, H. (2008). Loss of Egg Yolk Genes in Mammals and the Origin of Lactation and Placentation. PLoS Biology 6, 0507-0517.

Dennis Venema is professor of biology at Trinity Western University in Langley, British Columbia. He holds a B.Sc. (with Honors) from the University of British Columbia (1996), and received his Ph.D. from the University of British Columbia in 2003. His research is focused on the genetics of pattern formation and signaling using the common fruit fly Drosophila melanogaster as a model organism. Dennis is a gifted thinker and writer on matters of science and faith, but also an award-winning biology teacher—he won the 2008 College Biology Teaching Award from the National Association of Biology Teachers. He and his family enjoy numerous outdoor activities that the Canadian Pacific coast region has to offer. Dennis writes regularly for the BioLogos Forum about the biological evidence for evolution.

This article is now closed for new comments. The archived comments are shown below.

Tim - #73097

September 26th 2012


Great article!

Regarding your search for creationist organizations that have addressed the vitellogenin pseudogene, you could take a look here:


The author doesn’t seem to grasp that vitellogenin’s role is specifically to serve as a carrier for mass yolk protein transfers, however, which seems to seriously undercut his argument.

Tim - #73099

September 26th 2012

...sorry, should have been “mass yolk transfer.” Don’t know why I thought to put protein in there.

Tim - #73102

September 26th 2012

OK, now that I looked over again what I have on Vitellogenin, it has become clear that I have no clear idea what it does. :(

I thought its role was in bulk transport for yolk components, but some of the literature out there talks about it being a yolk precursor. So basically I have no idea.

Help Dennis!

Dennis Venema - #73109

September 26th 2012

Hi Tim,

The short answer is that it is both - it is a carrier of some components (lipids, sugars) and the protein itself is cleaved to form several yolk components. So, it’s a major carrier, and itself becomes a major portion of yolk. Obviously, this function (storing up large amounts of yolk before eggshell deposition) is not something placentals like us need to do.

Dennis Venema - #73110

September 26th 2012

Oh, and thanks for the link to the web site trying to explain away the human Vit 1 pseudogene - I haven’t found anything else that even attempts to deal with it. The response, as you indicated, falls short because the author doesn’t understand what Vit 1 actually does.

Tim - #73112

September 26th 2012

No problem, Dennis. Out of curiosity, are you aware of the mechanism that does produce the small amount of yolk in a placental mammalian egg (e.g., human egg)?

Dennis Venema - #73114

September 26th 2012

No, I’m not - I know that it is a Vit 1-independent process, but that is all. I’m not sure if much is known about it, but I haven’t looked into it deeply.

Tim - #73116

September 26th 2012

Oh, OK.  I tried looking into it myself, but couldn’t find anything. 

I guarantee you though that, as you saw in the creationist article I linked, that small amount of yolk present in the eggs of placental mammals will rear its head whenever anyone argues that this pseudogene is in fact non-functional.

Without being able to clearly articulate where this yolk comes from, the argument that it is a Vit 1-independent process may not carry much persuasive power.

If you ever do find out what this process is, please post it sometime.  Thanks Dennis!

Dennis Venema - #73118

September 26th 2012

Sure, will do.

Don’t forget that we know what Vit 1 is, and we know how it works (as a protein for bulk production of yolk via its role as both carrier and yolk protein). This function is well understood. We also know that in its present form in placental mammals, Vit 1 cannot be doing that role - it has multiple inactivating mutations that block translation into a functional protein product. The fact that a second pathway produces a tiny amount of yolk in a Vit 1-independent manner is not really an issue. All mammals are amniotes, but only egg-laying ones have functional Vit 1.

Tim - #73125

September 26th 2012

Thanks.  I do get what you’re saying, and I agree of course.

It’s just that within a creationist framework, the Vit 1 pseudogene wouldn’t be thought of as a pseudogene. It would be thought of as a fully functional sequence responsible for some as-yet-undetermined purpose - and given the tiny amount of yolk in placental mammalian eggs, creationists would likely nominate that tiny production of yolk as related to its purpose. If we can’t say exactly how that yolk got there, it would be hard to persuasively argue that the sequence you and I call the Vit 1 pseudogene had no role.

Roger A. Sawtelle - #73111

September 26th 2012

Is there any question today that the genetic code is a code, a language, that tells different cells how to work together to be an organism? 

Joriss - #73147

September 27th 2012


I have a question. The Vitellogenin gene is a pseudogene in humans and other mammals, but why? Why is a gene called a pseudogene? Is it clear from the sequence itself that it doesn’t help build the organism, and therefore - in the strict sense - doesn’t function? Or is it just because it doesn’t function compared with the gene of the ancestor, and is therefore considered to have lost its function? I mean, if you looked at the gene between “gene A” and “gene B” in mammals without knowing that there is a similar gene in egg-laying ancestors, and so without anything to compare it with, would it still be obvious that we are dealing with a pseudogene?

Dennis Venema - #73155

September 27th 2012

Hi Joriss,

The Vit 1 gene is a pseudogene in placental mammals because it has many, many mutations that prevent it being properly expressed as mRNA or translated into a protein product. We would recognize it as a pseudogene due to these features, along with its features that indicate it was once a gene - an open reading frame, a promoter, splice sequences and other features. If we knew nothing about Vit 1 in other organisms we would not have a good idea of its normal function, but we would certainly recognize it as a heavily-mutated gene remnant regardless. Hope this answers your questions.

Tim - #73159

September 27th 2012


Do you have any journal article or resource you could direct us to for this?  I’d be very interested to wrap my head around how we truly know pseudogenes such as Vit 1 fail in production of protein products.

Dennis Venema - #73161

September 27th 2012

Hi Tim - the paper I cite in the post would be a good start - it is also open access.

Tim - #73164

September 27th 2012


Thanks. I’ve read through the article, but with respect to the deactivation of the Vit genes, it only mentions that they are inactivated by insertions / deletions.

However, we know that these gene remnants may still be partially transcribed. What I was interested in knowing was how we determine that no protein product with a viable biological function can possibly arise. I fully accept that it doesn’t. But I am trying to put on my creationist hat, and this is precisely the suspicion that a creationist is likely to have. How do we know? Has this been observationally confirmed in any way? Or is the genetic mechanism of protein production so well known that we can determine this just from an analysis of the sequence? And what reference would we be able to refer to to substantiate this?

Dennis Venema - #73166

September 27th 2012

Hi Tim,

Insertion/deletion mutations (“indels”) change what is known as the “reading frame” when the cell attempts to translate an mRNA into protein. Reading frame is required for getting the right nucleotide triplets (called codons) in a sequence, and thus the right amino acids in a sequence (specified by the codons). After an indel that is not a multiple of three (e.g. a single nucleotide deletion) the frame shifts, sort of like this (to use an English sentence as an analogy):

THE FAT CAT ATE THE RAT  (now, let’s delete the “F” in “FAT”). The cell would still try to “read” in groups of three, but get gibberish:

THE ATC ATA TET HER AT

Since we understand the codon (triplet) code, we can read right off the DNA sequence that things are seriously messed up. Hopefully that helps.

Dennis Venema - #73167

September 27th 2012

Another point to make is that a shift in frame often generates a triplet that codes for “stop making protein now”. The usual outcome of a frameshift mutation (indel) is a run of gibberish amino acids and then a premature stop to the protein.
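
For readers who like to see this run, the frameshift and premature-stop behaviour described in the last two comments can be sketched in Python. Treat it as a rough illustration under stated assumptions rather than anything from the paper: the codon table is the standard genetic code, but the short “toy gene” below is invented for the example and is not Vit 1 sequence, and the helper names are made up for this sketch.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G order.
# "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def triplets(seq):
    """Split a sequence into consecutive groups of three, starting at position 0."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def delete_at(seq, index):
    """Return the sequence with the single character at `index` removed."""
    return seq[:index] + seq[index + 1:]


def translate(dna):
    """Translate codon by codon, halting at the first stop codon."""
    protein = []
    for codon in triplets(dna):
        if len(codon) < 3:  # ignore an incomplete trailing codon
            break
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# The English-sentence analogy: delete the "F" and keep reading in threes.
sentence = "THEFATCATATETHERAT"
print(" ".join(triplets(sentence)))                # THE FAT CAT ATE THE RAT
print(" ".join(triplets(delete_at(sentence, 3))))  # THE ATC ATA TET HER AT

# Hypothetical toy gene (invented for illustration, NOT Vit 1 sequence).
toy_gene = "ATGGCATTGACTGGAAAGCCCTAA"
mutant = delete_at(toy_gene, 4)  # a single-nucleotide deletion

print(translate(toy_gene))  # MALTGKP
print(translate(mutant))    # MD
```

In the toy mutant the shifted frame runs into a TGA stop codon at the third codon, so translation ends after two amino acids even though nearly all of the original sequence is still sitting there downstream.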

Tim - #73169

September 27th 2012


Thanks. I recognize that indels wreak havoc on genes resulting in, as you put it, gibberish output. However, the challenge is demonstrating to the creationist that the output is in fact gibberish, rather than the precise code God intended to accomplish some as-yet-undefined purpose. So how do we demonstrate that it is gibberish?

If we compare the Vit 1 pseudogene in human or other mammalian DNA to the Vit 1 gene in avian species such as the chicken, we can clearly articulate how these indels have shifted frames and prematurely stopped protein production. However, the creationist will be resistant to accepting this comparison.

They may point out, as you noted in your last post, that a run of amino acids is produced, and they will ask, “How do we know these amino acids don’t do something biologically useful, such as contributing to the nutritive value of the small amount of yolk granules present in the human oocyte?”

So it would be of interest to demonstrate that these amino acids are in fact gibberish and biologically useless.

Dennis Venema - #73222

September 29th 2012

Hi Tim,

I’ve done a bit more work to address your question. Look at the figure above showing the Vit 1 sequences. This is a region very early in the protein code for Vit 1, which is a very large protein (over 1,800 amino acids). The first deletion mutation in the mammalian Vit 1 pseudogenes shifts the reading frame, and only eight nucleotides later a stop codon is generated (TGA) in the new reading frame. The result is a protein only 27 amino acids long, with 24 of those from the Vit 1 code. Recall that Vit 1 functions by being a large carrier protein (for lipids and sugars) which is then cleaved into large protein fragments to form the major components of yolk. The tiny fragment left in the mammalian Vit 1 pseudogene does not retain the regions of the protein that do these jobs.

Dennis Venema - #73223

September 29th 2012

To continue, anyone who wishes to claim that this tiny protein of 27 amino acids is a deliberately designed sequence will have to address the fact that there remains a high degree of similarity to functional Vit 1 *after* the stop codon.

Tim - #73241

September 29th 2012


Thank you for looking into this.  This really helps.

I agree that a protein fragment of only 27 amino acids would have no hope of even remotely accomplishing the carrier function or precursor function of the type of yolk we see in our egg-laying evolutionary cousins.

And of course the striking and quite obvious shared synteny and sequence homology between the Vit 1 avian gene and the Vit 1 pseudogene in humans and other placental mammals should be enough to establish that any argument proposing a sort of Vit 1 ‘light’ function to the (pseudo)gene would be speculative and strained at best.

The best remaining argument I can see for the creationist at this point would be to note that the yolk constitution in the human oocyte is nowhere near as developed as the yolk in an avian egg, with yolk granules far finer and, I would gather, much more basic / simple in their composition.

So if we could establish that a truncated protein of just 27 amino acids would be too short or otherwise unsuitable to contribute compositionally to these basic yolk granules, then I think we would clinch this.

I’ve tried searching online for this type of compositional information, but all I came away with was some mention of fatty and albuminous substances.

If you find out anything (not that I’d want to burden you further), please let us know.

Thanks Dennis!

Tim - #73242

September 29th 2012

*After reading my post, I realized I didn’t make my point clearly*

What I was getting at was that the shared synteny & sequence homology between the Vit 1 avian gene and the Vit 1 pseudogene in humans and other placental mammals would suggest that if the Vit 1 pseudogene was in fact a fully functioning gene, its function ought to be related to the sort we see in the Vit 1 avian gene. Maybe a “Vit 1 ‘light’.” However, the heavily truncated protein fragments such as the 27 amino acid chain referenced render implausible any type of Vit 1-like function. So if there was a function, it would have to be something very different from what we see Vit 1 accomplish. And so the commonalities in shared synteny & sequence homology would be too coincidental to ignore. And arguments against pseudogene status from that point on would take a heavily ad-hoc path.

But even that ad-hoc reasoning would take a big hit if we could demonstrate that this truncated protein of just 27 amino acids would be too short or unsuitable to contribute compositionally to the basic yolk granules in the human oocyte.  So to establish that would be icing on the cake.

OK, just wanted to clear that up; I hope it makes some kind of sense.

Joriss - #73182

September 28th 2012

Yes, you answered my questions, thank you so much. I had some other questions in addition to these, but Tim, putting on a creationist hat, asked them already. Thank you, Tim. So is there watertight evidence - or nearly watertight - that these amino acids are really gibberish? So now I will put on an evolutionist hat - smiley - and listen to your answers to him.

PNG - #73200

September 28th 2012

A somewhat whimsical note to the shared pseudogene argument. Tonight I watched a newsmagazine segment on one of the networks covering an attempted murder case. The key bit of evidence incriminating the suspect was that in a note soliciting a murder for hire, there was a street name which was misspelled in the same way that the suspect was known to have misspelled the same street name in a letter that she sent. This evidence, along with a plausible motive, was apparently considered strong enough by the suspect’s lawyer that she accepted a plea bargain and prison sentence rather than let a jury see it. The argument for common descent from shared complex mutations is essentially the same, although much stronger because there is not one, or a few, but millions of them shared between primate (or mammalian) genomes.

PNG - #73201

September 28th 2012

See the following reference for more shared inactivating mutations in pseudogenes. http://www.ncbi.nlm.nih.gov/pubmed/18085818

beaglelady - #73336

October 4th 2012

Great news! 

coursera.org will be offering a free online course called Introduction to Genetics and Evolution,  taught by Mohamed Noor of Duke University.  It starts on Oct 10.  And, again, it’s free!

No prior coursework is assumed, and students all over the world take these courses.  

Check it out—and sign up.  coursera.org is excellent!



Janine - #76507

February 11th 2013

What do biologists call the process when allele frequencies in a population of a species change over time due to chance?
