In 2003, under the leadership of BioLogos founder Francis Collins, the Human Genome Project sequenced the full human genome, showing us for the first time the order of the 3.2 billion chemical “bases” that make up the rungs of DNA’s double helix structure. The project identified and mapped 23,000 genes that code for proteins, far fewer than originally predicted given the complexity of humans, and those genes make up less than 2% of the total sequence. While many non-coding sequences were identified as having function as well, there were still vast swaths of the genome that had no obvious function. In fact, what was known about certain classes of sequences suggested that they had no functional role for humans; the sequences identified as either transposons or transposon fragments, which make up nearly half of our genome, are one example. These sorts of sequences seemed to fit into what was popularly known as the “junk DNA” category.
With the complete genome sequence in hand, we knew the sequence and location of our genes, but we didn’t know how all those genes are regulated: how do the trillions of cells in our bodies know when to turn all those genes on or off? How do the hundreds of distinct cell types develop and function together, when they are all running on the same DNA “operating system”?
That’s where the ENCODE (short for Encyclopedia of DNA Elements) project comes in. Launched in September 2003, shortly after the announced completion of the Human Genome Project, the goal of the ENCODE project is “to build a comprehensive parts list of functional elements in the human genome, including elements that act at the protein and RNA levels, and regulatory elements that control cells and circumstances in which a gene is active.” In other words, the project seeks to understand how the genome “works.”
Early this month, researchers from ENCODE released more than thirty papers presenting their findings. During a Science magazine online chat, the project’s data coordinator, Ewan Birney, explained the outcome:
The ENCODE project aimed to start our understanding of how the human genome works. We know that (nearly) all the information that determines a human is in the genome, as we all start off as single cell with this DNA. However, we had a patchy understanding of how it works, in particular away from protein coding genes.
To work out how the genome works, we used the fact there are many tiny machines (proteins and RNA - RNA is very like DNA) in each of our cells which know how to "read" parts of the genome. By monitoring where these little molecular machines are on the genome, or how parts of the DNA are copied into RNA (there are quite a few different types of RNA as well), we start to gain some insight into the genome.
We did many such experiments, across different cell types (eg, one cell type was very similar to a liver cell type; another was very similar to a white blood cell). This way not only can we see what is similar, we can also see differences between these cell types.
There is a lot more to get to know and understand here - this is definitely closer to the start than the end. But it is a substantial amount of data, and analysis, to start on this journey.
According to the abstract of one of the lead papers from Nature, this extraordinary glut of data “enabled us to assign biochemical functions for 80% of the genome, in particular outside of the well-studied protein-coding regions.” Only 2% of the genome codes for proteins, but 80% or more has some biochemical function. As a Science news article put it, these 30 papers “sound the death knell for the idea that our DNA is mostly littered with useless bases.”
The pro-Intelligent Design organization The Discovery Institute has heralded the discovery as the “demise of junk DNA.” Casey Luskin writes for their blog Evolution News:
Let's simply observe that it provides a stunning vindication of the prediction of intelligent design that the genome will turn out to have mass functionality for so-called "junk" DNA. ENCODE researchers use words like "surprising" or "unprecedented." They talk about of how "human DNA is a lot more active than we expected." But under an intelligent design paradigm, none of this is surprising. In fact, it is exactly what ID predicted.
The extent to which the ENCODE project has been able to identify function has been surprising, even exhilarating, though scientists have for some time been getting glimpses of the many ways in which segments of DNA can be “active.” Even in 1970 biologists knew that some non-coding DNA had function, and by 2003 there was a large body of work demonstrating that many non-coding elements acted as promoters, enhancers, insulators, and so on. Indeed, in recent years many have come to appreciate the fact that “junk” was never really an appropriate metaphor in the first place. Still, because sequencing of multiple genomes has shed such extraordinary light on key evolutionary mechanisms, many geneticists have focused on function primarily in terms of which regions do or do not contribute to the evolutionary fitness of their host, rather than in terms of whether those regions are merely “doing something” biochemically. What the impressive ENCODE project has done is open a treasure trove of new information that can only accelerate the pace at which researchers are able to explore the incredible subtlety and complexity of DNA, and refine the very concept of “functionality.”
So with all this in mind, is ENCODE a stunning victory for ID, as Luskin believes? Bryan College biologist Todd Wood thinks not. He writes, “I don't think that function equates to design, nor do I think that design requires or predicts function. They're not the same thing… my understanding of function does not require me to hypothesize God (or an anonymous designer, if you must) as the proximal cause.”
We agree. Indeed we would go on to say that evolution and design are not mutually exclusive. So while finding function is not sufficient to prove design, recognizing that function has arisen by way of evolution does not indicate that God was not at work. We at BioLogos believe God providentially works out his purposes—his designs—through the elegant processes of evolution, not in opposition to them.
Amazing as the new data are, they only strengthen and enhance our evidence for evolution. While much of the genome is “doing something” biochemically, it is still likely that the majority of the sequence is evolutionarily neutral (Senior Fellow Dennis Venema discusses the evidence for this “neutrality” in a post on our site, including a striking comparison between 29 different mammal genomes and the human genome). In fact, another ENCODE researcher participating in the Science magazine chat, John A. Stamatoyannopoulos of the University of Washington School of Medicine, thinks the findings align beautifully with evolutionary theory:
ENCODE's data provide a unique and powerful window through which to view evolutionary change. We can see those changes directly by lining up the genome sequences of many different organisms -- these line-ups have revealed millions of regions where all the genomes agree, indicating sequences that have been specially preserved by evolution while others have decayed away (ie freely changed their letter codes). We now see that a large proportion of these 'conserved' regions are lighted up by ENCODE annotations, indicating that they are marking spots in the genome that contain important instructions for cell function.
We’ve discussed “junk” DNA previously, including a multi-part series by Dennis Venema, and we’ve received many emails over the past few days asking for our comments on the ENCODE findings. On Monday and Tuesday, Dr. Venema will begin to offer his own thoughts on ENCODE.
A special thanks goes to Darrel Falk, Mark Sprinkle, Kathryn Applegate, Dennis Venema, and Tom Burnett for their contributions to this post.