Stephen Freeland on July 29, 2013

The Evolutionary Origins of Genetic Information

To address a major Intelligent Design critique of evolutionary theory, Stephen Freeland discusses the progress mainstream science has made towards understanding the origin of genetic information.



Any living branch of science achieves progress by testing new ideas. The results of these tests determine whether each new idea is accepted as a change to what we thought we knew, dismissed as incorrect or simply stagnates owing to a lack of clear evidence. For evolutionary theory, one such proposition is that some features of genetic information cannot evolve through natural processes unless we allow a role for an intelligent designer. This proposition claims testability by defining information in a way that is usually reserved for human creations, such as computer programming code. The underlying idea is that we know intelligent beings create computer-code, so if similar features occur within genetic information then perhaps genetic information derives from an intelligent agency? However, many biologists perceive that they are able to understand exactly where life’s genetic information comes from (the local environment) by thinking in terms of more fundamental and well-established definitions of information that do not involve Intelligent Design.

Current science does not have a detailed, widely-accepted description for how a genetic information system evolved in the first place. Intelligent Design proponents suggest that this is a key weakness of existing evolutionary theory, consistent with the need for an intelligent designer. I describe the progress that mainstream science has made towards understanding the origin of genetic information since the molecular basis of genetic information was first understood, encouraging readers to reach their own conclusions.

Biological evolution describes a natural process that transfers information from a local environment into the chemical known as DNA. Something similar happens when gravity causes raindrops to form a puddle, and the shape of the ground beneath becomes reflected in the underside of the water.

This unusual definition of evolution seeks to clarify an ambiguity in traditional alternatives, such as “biological evolution is a natural process of change in genetic material over time.” The phrase “change in genetic material” describes and confines exactly what scientists measure and test to develop their evolutionary theory. However, any description of this sort omits two aspects of a living science. One is the group of all propositions that have been revealed as incorrect through tests (such as recapitulation – the claim that embryos re-enact their evolutionary history as they develop from a single fertilized egg cell).2 Let us call these incorrect propositions Category 1 omissions. Knowing about them can help scientists avoid wasted time spent repeating previous errors.

The second element missing from a classic definition comprises all propositions for which science has yet to find clear evidence, for or against. We may refer to these as Category 2 omissions. Propositions in this second category are especially important to science because all suggestions to change existing scientific understanding start here. In other words, Category 2 propositions can gather supporting evidence until they become accepted as scientific truth. These successful challenges to established science will alter what we previously thought we understood, perhaps even requiring a change in definition of that science. (It is both humbling and inspiring to remember that scientific knowledge is incomplete in ways that are actively misleading us at present.) However, many Category 2 propositions follow a different trajectory as careful application of the scientific method reveals that they are incorrect, re-classifying these ideas as Category 1 propositions. A third fate is possible for Category 2 propositions. If they do not generate sufficient evidence to make a clear case, whether it be for or against, then they will stagnate. A proposition often ends in stagnation if it fails to generate clear, testable hypotheses that have the power to transform established theory.

Intelligent Design has already started its life in Category 2 by suggesting that current evolutionary theory cannot adequately explain the origin of new genetic information. The unusual definition of evolution written above hints why many scientists, including Christians such as myself, think this is an incorrect (Category 1) proposition. What follows seeks to explain why in greater detail – and to equip you to judge for yourself.

Evaluating suggestions for changes to evolutionary theory.

Start by imagining a line that describes every conceivable degree of genetic difference that could separate any two living organisms (Figure 1). In fact, we don’t have to rely on imagination – such differences can be measured precisely, due to life’s shared biochemistry of DNA and proteins (see Box 1 below). Most criticisms of evolution are, upon careful inspection, claims that evolutionary theory is incomplete. They suggest that evolutionary theory can only explain differences up to a specific point on this line. For example, older versions of creationism claim natural processes cannot change anything more than the frequency (number of copies) of genetic material already present within a species. In effect, this defines a point X on the line shown in Figure 1. To the right of X lie larger differences in genetic material, such as those that separate different species. Under creationism, these differences are considered too large for natural processes to explain, and are therefore explained by divine intervention.

A growing weight of detailed evidence shows that new species form by the accumulation of changing gene frequencies within a population.3,4 This evidence has led many contemporary versions of creationism to increase the acceptable limit for evolution, moving point X on the line in Figure 1 to point Y. One explanation offered is that God created fundamental kinds of animals and plants, so that the formation of new species within these kinds is a legitimate outcome of natural processes.5 Accepting this interpretation, it is now the larger degrees of genetic difference lying to the right of Y that require supernatural explanation.


Figure 1: Any two or more organisms can be compared for genetic similarity (e.g. in terms of differences in DNA sequence), and thus plotted as a point on a line that runs from “complete genetic similarity” (clones or identical twins) to “very little genetic similarity”, such as a human and an E. coli bacterium.
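One way to make the line in Figure 1 concrete is to count the fraction of positions at which two aligned DNA sequences differ. The following is a minimal sketch, assuming the sequences have already been aligned to equal length; the function name and the toy sequences are hypothetical illustrations rather than real data, and real comparisons use far longer sequences and more careful alignment.

```python
def genetic_difference(seq_a: str, seq_b: str) -> float:
    """Fraction of aligned positions at which two DNA sequences differ.

    0.0 corresponds to "complete genetic similarity"; values approaching
    1.0 correspond to "very little genetic similarity".
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    mismatches = sum(a != b for a, b in zip(seq_a, seq_b))
    return mismatches / len(seq_a)


# Hypothetical toy sequences, invented purely for illustration.
reference = "ATGGCCATTGTA"
clone = "ATGGCCATTGTA"             # identical copy
close_relative = "ATGGCTATTGTA"    # one substitution
distant_relative = "ATGACTCTAGTT"  # several substitutions

for name, seq in [("clone", clone),
                  ("close relative", close_relative),
                  ("distant relative", distant_relative)]:
    print(f"{name}: {genetic_difference(reference, seq):.2f}")
```

Each pair of organisms then occupies a single point on the line, with the clone at the far left and increasingly distant relatives further to the right.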

For our purposes, what matters is that different versions of creationism all accept some degree of evolution but place a cut-off on the extent of change that evolution can produce, explaining anything above that point by divine intervention. Wherever the cut-off is perceived, the same terminology is used: micro-evolution (anything to the left of the acceptable limit) is attributable to natural processes, but macro-evolution (anything to the right of this point) requires a new explanation – direct creation by God.

The terms “micro-evolution” and “macro-evolution” come originally from similar suggestions made within secular science during the early development of evolutionary theory.6 Biologists working early in the 20th century were learning how to cause genetic mutations in a laboratory setting. These mutations could, in a single generation, produce large changes in an organism’s appearance. Some pioneers of this new science (genetics) thought their discoveries changed evolutionary theory. Darwin had previously described a process of evolution by natural selection, and this process could be observed changing the frequencies of genes within populations over one or more generations. However, subtle differences in the genetic makeup of a population seemed too small to connect with the large jumps being witnessed in laboratories, and the latter seemed more relevant to the formation of new species. A typical evolutionary debate from this time also defined a point somewhere near X on the line shown in Figure 1. Everyone agreed that Darwin’s process could explain changes to the left of this point (micro-evolution), but some now argued that a fundamentally new phenomenon called genetic mutation, or macro-mutation, was responsible for the larger-scale differences to the right (macro-evolution.)

At first sight, both macro-mutationism and creationism seem similar. Both propose a cut-off point for the degree of genetic change that evolutionary theory can explain, and both propose a new cause must be added to explain genetic differences beyond this cut-off. Where the two propositions differ for science is in their potential for tests.  Supernatural causes (literally, those that come from beyond nature) cannot be tested directly from within the natural universe. Science can get no nearer than searching for indirect evidence, such as natural phenomena that cannot be explained by any known, natural cause. Evidence of this kind is unlikely to carry creationist propositions from Category 2 suggestions into accepted science. In part, this is because specific data used to justify un-natural causes tend to find equal or better explanation in terms of the natural causes measured by science as new data becomes available.7 Mostly, however, the problem is that un-natural phenomena can never be more than consistent with a supernatural cause. Even where specific claims for un-natural phenomena have not been refuted, it remains equally possible that science has yet to understand natural causation, and science keeps growing its understanding in ways that support evolution.8

In contrast to creationism, the work of the early geneticists referred to strictly natural phenomena (i.e. those occurring within the observable, natural universe). This focus allowed for direct evaluation by science. Through a series of hypotheses and tests, geneticists revealed that early examples of laboratory-induced macro-mutation were, in fact, large-scale genetic damage caused by powerful doses of radiation and chemicals. Meanwhile, other tests clarified that within nature, genetic mutations of far greater subtlety do indeed account for the minor differences between members of a species (micro-evolution). Further evidence indicated that micro-evolution accumulates over time to account for all larger degrees of evolutionary diversification (macro-evolution). In other words, science not only failed to find supporting evidence for the idea that macro-mutations are responsible for the emergence of new species, it also undermined the observation that had led to this hypothesis in the first place.  Science refuted the claim that macro-mutations filled a gap within evolutionary theory by discovering that there was no gap to fill. Macro-mutationist ideas for the origin of new species have therefore moved from Category 2 (ideas for which the evidence is unclear) to Category 1 (ideas that are incorrect), and are no longer actively researched by evolutionary biologists.9

Over the years, secular science has proposed many other novel factors that evolutionary theory should absorb to better explain biological diversity. So far, all have gone the way of macro-mutationism.10 However, cutting-edge research is, by definition, constantly probing for evidence to support new insights. For example, one recent claim is that without adding any new causal factors, enough biological evolution will ultimately produce something like our own sentient species.11 Contrary to popular belief, this outcome is not predicted by current evolutionary science.12 The new claim of inevitable outcomes has not been refuted by science, nor has the supporting evidence become overwhelming. In fact, scientists still don’t know quite how to weigh the evidence – how to measure inevitability when it comes to evolution. As a result, inevitable outcomes remains a Category 2 idea, a topic of active debate and research until scientists gather a clear majority of evidence to reject or accept it into science.13 If such evidence is not forthcoming, the idea will likely atrophy.

These three propositions, creationism, macro-mutationism and inevitable outcomes, provide context for discussing another idea that has arisen in Category 2: the idea that evolutionary theory would be improved by allowing a role for a guiding intelligence. Nothing is inherently unscientific about this suggestion so long as it can find appropriate evidence (through tests) to help scientists decide, one way or the other. One idea for a test is to ask whether we can identify properties of genetic information that resemble human-created information. The idea is that we are intelligent, so if genetic material looks like the sort of thing we would make then it might be better explained as the product of Intelligent Design, especially if science can identify features of genetic information inexplicable by known evolutionary processes.14 Intelligent Design names one of these features Specified Complexity – a type of information content that aims to measure the semantic content of information (the amount of meaning within a piece of information). According to the concept of Specified Complexity, natural processes that lack a guiding intelligence cannot produce new genetic information nor can they explain the origin of genetic information because this implies an increase in Specified Complexity. Each of these claims warrants careful consideration.

Unless life began in greater quantity than it now exists, evolution requires that natural processes have, over time, increased the total quantity of genetic material (DNA) present on our planet.

This is one way in which science currently believes genetic information has increased over time: a natural process has increased the number of copies of DNA molecules without any need for guidance by an intelligent agent. This kind of increase in genetic information is exactly what we see whenever a natural population grows (e.g. bacteria during an infection). Clearly, this type of information-increase is not at issue. Indeed, Intelligent Design refers to this as a flow of information, rather than creation of new information.15

Along similar lines, unless life originated containing more DNA than the most genetically complex organism alive today, then some lineages must have increased the quantity of DNA they contain through evolution.16 Established science knows ways to observe and measure this kind of increase in genetic information. For example, genome-sequencing technology has revealed small variations in the length of genetic material carried by different individuals within every natural population, including our own species.17 Indels (short for “insertion/deletion mutations”) are one of the fundamental types of mutation recognized by geneticists, and they represent micro-evolution. Why could insertions not accumulate faster than deletions over time, causing genetic material to grow in size? This is exactly what we would expect if micro-evolution adds up to produce macro-evolution. Again, Intelligent Design agrees with mainstream science that this is entirely within the realm of causation by existing theory and that a focus on quantities of DNA is misleading. Genetic differences between a human and an amoeba are only partly attributable to the different quantity of genetic material present in each. For example, the Amoeba proteus genome contains 100-fold more DNA than a human genome; other species of Amoebae contain both much larger and much smaller quantities of genetic information.18

More important than the quantity of DNA present in each species is the different order in which nucleotides are linked together to spell out genetic messages. DNA has the unusual property of being aperiodic. This means that the sequence of nucleotides within a DNA molecule is not constrained to any kind of repeating pattern (see Box 1 below). It is precisely this property that allows DNA or anything with similar properties to carry a large amount of information. For example, written English is an aperiodic sequence built from relatively few symbols. Everything ever written in English can be copied using one simple keyboard. The trick is to arrange these building-block symbols into particular aperiodic sequences. The major difference between this article and Harry Potter lies not in the quantity of letters and punctuation used but in the sequence in which these symbols have been assembled. Where current evolutionary science disagrees with Intelligent Design is in the suggestion that some sequences of genetic information can only be generated by a guiding intelligence.1 Intelligent Design asserts that natural processes cannot produce changes in genetic information if these changes correspond to an increase in specified complexity. Specified complexity is defined in a way that tries to capture the difference that separates this essay from Harry Potter. More accurately, specified complexity is the information that distinguishes any random sequence of symbols from orderings that have meaning.19

The idea that some sequences of DNA cannot be produced by natural processes owing to the information they contain has no empirical support from modern genetics. In fact, quite the reverse. Genetic information is stored in sequences of nucleotides that have been chemically linked together to form a molecule of DNA. Genetics, bioinformatics, biochemistry and molecular biology all agree that natural processes can cause any nucleotide to become the neighbor of any other within a DNA sequence. Mutations that interconvert each of the four nucleotides have been observed within natural populations and within the laboratory, as have insertions, deletions and trans-locations of mini-sequences from one region of the DNA sequence to another. These elementary components of modern genetics are, in principle, more than sufficient to produce any DNA sequence from any other. Try this for yourself by listing a series of mutations that convert the word “evolution” into “creation” with the restriction that each mutation must either change a single letter, insert or delete one or more letters, or move the position of any sub-group of letters. There are many ways to reach the outcome, and this remains true for any two words that you can choose.20
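The word exercise above can also be worked mechanically. The following is a minimal sketch, assuming only three of the permitted mutation types (changing, inserting, or deleting a single letter); it uses a standard edit-distance table to find one shortest series of such mutations and lists every intermediate word. The function name is a hypothetical choice for illustration, and the path it returns is one of many possible routes, not a privileged one.

```python
def mutation_path(source: str, target: str) -> list[str]:
    """List the words visited while mutating source into target,
    one single-letter change, insertion, or deletion at a time."""
    n, m = len(source), len(target)
    # cost[i][j] = fewest mutations turning source[:i] into target[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        cost[i][0] = i
    for j in range(m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            same = source[i - 1] == target[j - 1]
            cost[i][j] = min(
                cost[i - 1][j] + 1,                       # delete a letter
                cost[i][j - 1] + 1,                       # insert a letter
                cost[i - 1][j - 1] + (0 if same else 1),  # keep or change
            )

    # Trace back from the end of both words. Edits are applied right to
    # left, so `current` always equals source[:i] + target[j:].
    current = list(source)
    path = [source]
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and source[i - 1] == target[j - 1]
                and cost[i][j] == cost[i - 1][j - 1]):
            i, j = i - 1, j - 1        # letters already agree: no mutation
            continue
        if i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + 1:
            current[i - 1] = target[j - 1]     # change one letter
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            del current[i - 1]                 # delete one letter
            i -= 1
        else:
            current.insert(i, target[j - 1])   # insert one letter
            j -= 1
        path.append("".join(current))
    return path


for step in mutation_path("evolution", "creation"):
    print(step)
```

Allowing multi-letter insertions, deletions, or translocations, as the exercise permits, would only add further routes between the same two words.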

The biochemistry that describes how genetic information is stored, replicated, corrected and translated into proteins is fascinating but requires no novel concepts regarding semantic information. The question is whether the addition of this latter concept can reveal insights, such as limitations too subtle to observe with empirical science? As the companion article by Isaac explains, science recognizes several types of information. One of the most fundamental types is thermodynamic information, a fundamental parameter of physics that reflects all that could be different about the universe. If evolutionary theory implies an increase or loss in thermodynamic information then it would be in conflict with established ideas belonging to another branch of science. This is not the case. Nothing about biological evolution ever involves an increase (or decrease) in the thermodynamic information present within the universe. Indeed, evolution can be described precisely in terms of thermodynamic processes by which sources of energy bring into being particular states of information within a DNA molecule. The opening definition of this essay tries to emphasize this point: “Biological evolution describes a natural process that transfers information from a local environment into the chemical known as DNA.” To understand why this causes many biologists to doubt whether additional concepts regarding information are necessary or helpful, one must return to Darwin’s original insight.

Within a population of individuals that vary from one another, those that best match their environment will, on average, leave behind the most offspring. Wherever the match is genetically programmed, the version of the genetic program associated with the best match will tend to increase in frequency over time by leaving behind more copies of itself. As these advantageous versions are copied from one generation to the next, they will mix with new variations that either increase or decrease the match. All the while, the environment keeps changing and mutations keep occurring so the matching process continues. Repeating this process over and over will create a pool of genetic programs that have accumulated variations maximizing the overall match between organism and environment (quite simply because those that didn’t match so well left behind fewer copies of themselves).21 Through this process, genetic material will evolve to mirror some of the information presented by the environment in which it is copying itself. This information might include patterns in time and space by which ambient temperatures vary, or patterns of chemical resources found in the environment. Things get especially interesting when we realize that some of the most significant information about an organism’s environment is specified by other organisms. The color of leaf on which an organism feeds may become reflected in its genetic material if this type of genetic programming helps the herbivore to hide from predators; conversely, genetic material may evolve to program colorations that contrast with the background of other organisms in an environment where finding and attracting mates is the strategy that leaves behind the most copies. Each reflection originates in physical parameters but these collide, transfer information and start new emanations as they become reflected in organisms’ genetic material. 
No matter how complex these rebounding, mixing reflections of the environment become, they will never create new information (any more than your image in a reflection of a reflection of a reflection contains more information than you do.)22 Viewed in this light, biological evolution is a natural process that distills thermodynamic information from a highly complex environment into molecules of DNA.23
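The matching process described above can be sketched as a toy simulation. This is a deliberately crude illustration under loudly stated assumptions: the "environment" is reduced to a hypothetical string of symbols, "match" is simply the number of positions at which a genome agrees with it, and the better-matching half of each generation leaves all of the offspring. None of these choices comes from the research literature; they only show how copying, variation, and differential reproduction let a population come to mirror its surroundings.

```python
import random

ALPHABET = "ACGT"


def match(genome: str, environment: str) -> int:
    """Number of positions at which a genome mirrors the environment."""
    return sum(g == e for g, e in zip(genome, environment))


def mutate(genome: str, rate: float, rng: random.Random) -> str:
    """Copy a genome, replacing each symbol with a random one at the given rate."""
    return "".join(
        rng.choice(ALPHABET) if rng.random() < rate else base
        for base in genome
    )


def evolve(environment: str, pop_size: int = 100, generations: int = 60,
           rate: float = 0.01, seed: int = 0) -> list[float]:
    """Return the mean match per generation, starting from random genomes."""
    rng = random.Random(seed)
    population = [
        "".join(rng.choice(ALPHABET) for _ in environment)
        for _ in range(pop_size)
    ]
    history = []
    for _ in range(generations):
        history.append(sum(match(g, environment) for g in population) / pop_size)
        # Assumption: the better-matching half leaves all the offspring.
        ranked = sorted(population, key=lambda g: match(g, environment),
                        reverse=True)
        survivors = ranked[: pop_size // 2]
        population = [mutate(rng.choice(survivors), rate, rng)
                      for _ in range(pop_size)]
    history.append(sum(match(g, environment) for g in population) / pop_size)
    return history


# Hypothetical environmental pattern, invented for illustration.
environment = "ACGTTGCAACGTTGCAACGT"
history = evolve(environment)
print(f"mean match at start: {history[0]:.1f} of {len(environment)}")
print(f"mean match at end:   {history[-1]:.1f} of {len(environment)}")
```

No individual genome is ever edited toward the pattern; the pattern enters the population only because better-matching copies leave more copies of themselves.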

Evolution is to DNA what gravity is to a puddle of water: in both cases it is possible to isolate elements of the whole that carry impressively complex information (species really do contain lots of complex genetic programs written out in DNA, as does the shape produced when a body of H2O perfectly matches some of the information inherent to the collection of rocks and debris beneath.) If we considered only the water, we might be tempted to think that some sort of intelligence had sculpted such a complex and accurate reflection of the environment. We might even measure this information content to demonstrate its improbability of arising by chance. But step back far enough to see the whole picture, and we realize that evidence consistent with design can be better understood as a result of natural processes (gravity and a pre-existing, information-rich environment.) In the case of biological evolution, evolution and DNA take the place of gravity and water. Gravity and evolution not only permit the transfer of environmental information into a chemical medium, but inevitably and inexorably lead to this information transfer. Given this understanding, it is hard to see what evolutionary science would gain by accepting other concepts of semantic information that create a problem to be solved by invoking an indeterminate intelligent designer.

The description of evolution given above applies once the world contains a genetic material that can influence its own rate of copying by reflecting the environment. In living systems, these remarkable properties are produced by the Central Dogma of molecular biology (see Box 1 below). Perhaps a stronger argument for Intelligent Design is that no natural process could create such a versatile system in the first place?

It is true that at present, evolutionary science does not have a clear, detailed and well-accepted explanation for how the Central Dogma of molecular biology emerged. But does that mean it is time to embrace Intelligent Design as a better approach? By analogy, current medical science has not found the cure for cancer. Taken in isolation, this sound-bite could lead to the misleading view that existing research directions, developed for decades, are best written off as a failure. This would miss an important context. Many aspects of cancer are now being treated with far greater effectiveness than ever before as a result of ongoing research. However, these cures are not robust (all-encompassing) enough to be summarized into the statement “we have found the cure for cancer.” This status is typical of big questions within science: failure to reach the sound-bite goal should not be mistaken for evidence that the research program has failed. Scientific progress is measured by the insights that research produces, and their implications for where we might usefully look next. These insights may even open up new awareness of just how much we do not understand, but characterizing the past few decades of cancer research as an exhaustive search that has ended in failure would be more than premature: it would be actively misleading. This final section of the article offers context to help the reader judge whether a similar situation holds for current research into natural processes that explain the origin of genetic information.

Let us start by making entirely clear what scientists are looking for. As the previous section explains, the challenge is not to find a natural process that can create enough information for a simple genetic system. The universe is replete with information capacity and syntax – from the positions of stars within our galaxy (and billions of others) to the arrangement of atoms in a single grain of sand. Within living systems, most of this information is ignored – so the question is not “where did the information come from” (unless we wish to talk cosmology – a very different subject) but rather “how does nature create systems that focus on some of this natural information?” Put another way, the challenge for understanding the origin of genetic systems is to find how natural processes can simplify a large amount of thermodynamic information into a syntax that displays only the disciplined chemical semantics of a self-replicator.

The exact details of life’s genetic information system came into focus during the middle of the 20th century.24 In 1953 Watson and Crick published the structure of DNA,25 revealing the innate capacity of this molecule to replicate and evolve indefinitely. Thirteen years later, a consortium of scientists published the details of the genetic code by which the information carried by DNA is translated into specific protein sequences.26 The system was so fundamental to understanding life, yet so simple and easy to explain that it has become known as the Central Dogma of molecular biology (Box 1). However, it was puzzling from an evolutionary perspective. Protein catalysts supervise the construction of individual nucleotides (the building-blocks for making DNA and RNA). Other proteins link these nucleotides into DNA or RNA sequences, depending on their type (deoxyribonucleotides into DNA, and ribonucleotides into RNA). Proteins can perform these roles because each one has just the right chemical properties to catalyze a specific chemical reaction (such as linking a molecule of the nucleotide “A” to T, G or C to start building a genetic message).27 Each protein is a long chain of amino acids (typically several hundred) that have been chemically linked together. The function and shape of a protein emerge spontaneously according to the sequence of these amino acids – just as the meaning of a word is carried (for us) by a sequence of letters drawn from the English alphabet.28 The only way to reliably build the right sequence(s) of amino acids to make the proteins of metabolism is to follow genetic instructions, one code-word (codon) at a time. In other words, for more than three thousand million years, everything living has needed proteins to make genetic information – and needed genetic information to specify how these proteins are to be made.

At the time of discovery, this system looked like an example of what proponents of Intelligent Design might call irreducible complexity: a complex system that cannot evolve from simpler precursors, because any simplification would lose the entire functional value of the system. This perception of an un-evolvable code was further enhanced by the discovery that the exact same genetic code is at work in organisms as different as human beings and E. coli bacteria (refer back to Figure 1: this is about as genetically different as living organisms can be!). Scientists of the time came to think that one genetic code was universal for all living systems on our planet. This led Francis Crick to propose that the genetic code is a “Frozen accident” of evolution,29 universal across life precisely because once it had formed (by some unknown event), it was so fundamental to all biochemistry that it could never change again. Specifically, he pointed out that any change to the rules of genetic coding would be equivalent to a simultaneous mutation in every single gene in the organism (Box 1).30 While evolutionary theory requires that occasional small mutations produce a better fit to the environment, the simultaneous mutation of thousands of genes seems extreme even by the standards of macro-mutationism. However, subsequent science has developed at least three major lines of research that undermine the concept of a frozen accident (and irreducible complexity) for genetic coding.31

First, it has been discovered that the genetic code is not universal. Around a dozen or so minor variations exist.32 These variations are mostly codes in which one or more genetic codons have altered their amino acid “meanings.” Some involve a more significant change – the addition of a 21st or 22nd amino acid.33 Everything indicates that these genetic codes evolved from the standard genetic code during the past few hundred million years, and continue to evolve today. Arguments for the evolvability of the code are strengthened by the finding that amino acids are assigned to genetic code-words non-randomly. In particular, codons are assigned to amino acids in such a pattern that common mutations produce minor variations as proteins are decoded. A growing body of evidence connects this feature of the code to the idea that considerable evolution by natural selection has gone into shaping this system.34 Everything suggests that the genetic code is evolved and evolvable after all.35
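One simple way to glimpse the non-random layout of the standard genetic code is to ask how often a single-nucleotide mutation leaves the encoded amino acid unchanged, depending on which of the three codon positions is hit. The following is a rough sketch: it builds the standard code from its conventional compact listing (bases in the order T, C, A, G, with `*` marking stop codons), and it treats "same amino acid" as a crude stand-in for "minor variation", which is a simplifying assumption of this example. Published analyses also weigh how chemically similar the swapped amino acids are. The function name is hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in the conventional TCAG ordering; '*' = stop codon.
AMINO_ACIDS = ("FFLLSSSSYY**CC*W"
               "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR"
               "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_fraction_by_position() -> list[float]:
    """For each codon position, the fraction of single-base changes to
    amino-acid-coding codons that leave the amino acid unchanged."""
    unchanged = [0, 0, 0]
    total = [0, 0, 0]
    for codon, amino_acid in CODON_TABLE.items():
        if amino_acid == "*":
            continue  # only mutations starting from coding codons are counted
        for position in range(3):
            for base in BASES:
                if base == codon[position]:
                    continue
                mutant = codon[:position] + base + codon[position + 1:]
                total[position] += 1
                if CODON_TABLE[mutant] == amino_acid:
                    unchanged[position] += 1
    return [u / t for u, t in zip(unchanged, total)]


for position, fraction in enumerate(synonymous_fraction_by_position(), start=1):
    print(f"codon position {position}: {fraction:.0%} of changes are silent")
```

Comparing the same statistic against randomly shuffled codon assignments is one way such a pattern could be tested, though that comparison is beyond this sketch.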

The second major insight into the origins of genetic coding is that multiple, independent lines of evidence suggest the standard amino acid alphabet of 20 building-blocks grew from a smaller earlier alphabet corresponding to an earlier stage in genetic code evolution. Many variations have been proposed.36 Most derive their views by considering only one or two types of evidence: sophisticated calculations of the amino acid sequences of truly ancient proteins, the repertoire of amino acids found in meteorites, simulations of an early, pre-biological planet Earth, and so on. What is interesting is an unlooked-for match between the broad findings of these different approaches. In particular, different approaches end up dividing the 20 amino acids of modern organisms into 10 that were around in the earliest systems, and 10 that arrived later, as by-products of early biological evolution. The members of each group are remarkably consistent,37 hinting directly at the process by which the genetic code evolved, growing more complex over time from simpler beginnings. Recent findings are also starting to make sense of why natural selection created this particular alphabet of building blocks.38

The third line of insight takes us backwards to the possible origins of genetic coding. Some scientists have used the SELEX approach described in a companion paper by Watts to define mini-sequences of RNA that specifically bind to a particular amino acid.39 Although results have been patchy, some amino acids seem to associate with surprising choosiness to the code-words assigned to them in the standard genetic code. This association suggests that the earliest steps in genetic coding may have been nothing more than simple physical affinities between two types of chemical.

Between them, these insights represent significant progress from the impossibly self-referential system viewed by Crick and those around him just 50 years ago. This half-century of research indicates that the standard genetic code at work in modern cells may be a product of substantial evolution that had taken place by around 3 billion years ago. But perhaps the most interesting progress is that few scientists still regard the emergence of life’s Central Dogma as the origin for genetic information.

The observation that RNA sequences can bind amino acids hints at something very important: proteins are not the only type of molecule that can spontaneously fold into shapes with interesting properties. As described in the companion paper by Watts, sequences of RNA can exhibit protein-like behavior. Technologies first developed in the 1980s and 1990s have been used to lab-evolve a wide variety of molecules, dubbed ribozymes in deference to the previously known class of protein catalysts known as enzymes. These ribozymes now cover most steps of fundamental biochemistry (such as linking together carbon atoms to make important biological molecules). Proteins are much less necessary for life than they seemed a couple of decades ago. This observation finds unlooked-for synergy with another line of scientific discoveries. In modern living systems, not all RNA performs the simple role of carrying genetic information from DNA to be decoded into proteins. A handful of the genes that are faithfully copied from DNA into RNA fold up into complex three-dimensional shapes that act as if they were proteins. Interestingly, these natural ribozymes tend to occur in the most ancient metabolic pathways–those shared by bacteria, humans and everything else alive today. Aspects of biology that have not changed much in billions of years of evolution are likely still with us because they have been doing their job very well throughout this period. In other words, this type of RNA behaving like a protein is exactly what one might expect to see if the ribozymes produced by SELEX resemble a stage of our truly ancient evolutionary past when genetic coding of proteins was far less important (if it was present at all). Oddly enough, Crick (of the Frozen Accident) had suggested something similar to this concept of molecular fossils when he looked at how genetic de-coding works.
He noticed that the adaptor molecules responsible for decoding individual genetic code-words into specific amino acids are nothing more than folded-up RNA. He also noticed that the biggest and most complex molecular machine involved with genetic de-coding (the ribosome) seemed to be made of RNA with a few proteins thrown in for good measure. Three decades later, new technology allowed researchers enough precision in their study of the ribosome’s structure to confirm this is correct: although proteins are embedded within the tangled, folded RNA, they appear to offer little more than structural enhancements.40 At its core, the ribosome is a ribozyme. It seems likely that a primitive ribosome could function without any encoded proteins: exactly what we would expect if genetically encoded proteins emerged from a simpler, earlier world in which only RNA existed.

Of equal interest, everything points towards DNA being the last arrival out of the three fundamental biomolecules: DNA, RNA and protein.41 DNA is made by complex, genetically encoded protein enzymes without a ribozyme in sight. The individual building-blocks of DNA (deoxynucleotides) are made by taking and modifying a nucleotide of RNA. Again, all this is exactly what we would expect if DNA evolved from RNA, after genetically encoded proteins had already entered the picture. Indeed, DNA is a more chemically inert version of RNA – better for safe storage of genetic information, worse for folding up into a catalyst. This is what you might expect if it emerged after RNA had already handed off the job of catalysis to genetically encoded protein enzymes. The RNA would get left sandwiched in the middle of DNA and proteins, just where we find it today (see Box 1 below).

Observations that expand on all of these themes continue to accumulate and are beginning to sketch a framework that was completely unknown in the mid-1960s. At its best, this “RNA-world” hypothesis solves much of the puzzle for the origin of living systems. One molecule, RNA, is its own catalyst and information carrier. However, many puzzles remain. For instance, the universe seems quite good at making amino acids without life. They have been found in meteorites, formed in simulations of the conditions of interstellar space and turn up reliably in just about every possible simulation of our planet’s early conditions. For nucleotides, the building-blocks of RNA, the exact reverse is true. It seems relatively simple to make the nucleobases (such as adenine and guanine)–but these must be chemically linked to a ribose sugar and a phosphate in order to make a single nucleotide, in processes that are antagonistic to those in which the bases form. There are real chemical difficulties in forming the individual nucleotide building blocks, and even bigger difficulties in linking them together into sequences that do not also contain all sorts of unwanted molecular garbage.42 If RNA came first then why is it so much easier to make amino acids than RNA from non-biological scratch?

Scientists are relatively confident that an RNA-protein world preceded ours in which DNA genes are copied into mRNA transcripts en route to protein translation. Every clue that we can find supports this conclusion. What is much less certain is how the RNA-protein world itself emerged. One broad class of ideas asserts that we have simply failed to discover some set of conditions that encourages sequences of RNA to form spontaneously. Mineral surfaces are often mentioned here, as they can catalyze many chemical reactions. For example, in 2004, the mineral borate was shown to catalyze the notoriously difficult synthesis of ribose – an essential component of the chemical structure of every single nucleotide.43 Perhaps other minerals will be found to help other steps in nucleotide synthesis, and in linking nucleotides into sequences. Certainly chemists, geologists and biologists are talking more than ever before as they seek to add up their knowledge of the ways in which life, chemistry and the planet interact. Among them, increasing attention is coming to focus on hydrothermal vents as a good place to look next in the search for the origin of life.44 Here, hot water full of interesting chemicals is forced to flow over richly diverse minerals. This can produce a slew of chemical reactions, most of which are still poorly understood.

Another view is that searching for non-biological origins for RNA is looking in the wrong place. Instead, genetic information, at least in the form that we think of it (polymerized nucleotide sequences), was itself an evolutionary invention of an earlier metabolism, a pre-RNA world. Perhaps significantly, proponents here are also drawn to minerals and to hydrothermal vents because the same conditions that might aid nucleotide synthesis produce a slew of interesting and newly discovered chemical reactions.45

It might even be that these two views meet up one day. Since the mid-1960s, a scientist by the name of Graham Cairns-Smith has been proposing that minerals were the original genetic information.46 Crystalline minerals show the interesting property of harnessing energy from the environment to grow by making copies of themselves. As they do this, they are creating chemical order from chaos. That is exactly what a salt crystal is doing as you watch saltwater evaporate in a glass or a rock-pool. They might also catalyze specific chemical reactions on their surface according to their exact atomic composition.47 In effect, they might carry simple genetic information that starts to trap the energy flowing through the system into a chemical reflection of the environment. But by now we are talking about one of the swarm of competing ideas at the edge of Category 2. Here they will compete and rise or fall according to the evidence that can be gathered through careful and ingenious tests.


Evolutionary theory, like any other branch of science, achieves progress by testing new ideas. Some of these ideas will go on to change what we thought we knew, others will be found incorrect, and some will stagnate as they fail to gather clear evidence, for or against. For evolutionary theory, many suggestions have been made for new causal factors that are required to explain how genetic diversity has arisen. Intelligent Design, for example, proposes that some types of genetic information cannot evolve through natural processes unless we admit a role for an intelligent designer. This proposition claims testability by using a definition of information that usually refers to creation by an intelligent agent. Meanwhile, many biologists perceive that they are able to understand exactly where life’s genetic information comes from (the local environment) by thinking in terms of more fundamental and well-established definitions of information that do not involve Intelligent Design. A related suggestion is that current evolutionary theory cannot explain how natural processes could produce a genetic information system in the first place. I agree that we are far from a full understanding, but choose to outline some major themes in the scientific progress made since the discovery of life’s Central Dogma in 1966 to provide a context for the reader to judge for themselves whether it is time to conclude that this search has failed.

It would be remiss to finish an article in this journal without some comment on the theology of all this. If we accept the evolutionary explanations sketched above, then science is taking major steps towards understanding the mechanism by which life came into the universe. Some famous advocates of this science claim it presents a logical connection to an atheistic world-view.48 Many others (myself included) perceive that any connection between evolution and spirituality is an act of faith–and faith in atheism is only one of many options.49 For my part, I find excitement and challenge in the search to unravel this marvelous mystery. I choose to associate that inspiration with a loving, creator God whose universe I am exploring. I agree with Dawkins (and Darwin) that from a human standpoint, the suffering and death implicit in natural selection raise questions for my faith–and I am grateful that scientists and theologians are able to discuss such issues in forums such as this,50 where I can read, learn and grow my relationship with God through an exploration of science.

About the author

Stephen Freeland

Stephen Freeland is an astrobiologist and the Director of Interdisciplinary Studies at UMBC in Baltimore. Building from a bachelor’s degree in zoology (Oxford), a master’s degree in computer science (University of York), and a Ph.D. in genetics (Cambridge University), his personal research came to focus upon the earliest evolution of life on our planet. After a postdoctoral fellowship at Princeton, Steve worked for eight years as a biology professor at UMBC before leaving to serve for four years as the project manager for the University of Hawaii node of the NASA Astrobiology Institute (where he worked to facilitate scientists from diverse disciplines working together to derive insights into the origin, distribution and evolution of life in the universe). In 2013, he returned to UMBC in order to run one of the oldest interdisciplinary studies programs in the country (where he works to support students in creating and executing unique undergraduate degree programs that combine two or more traditional academic disciplines). Raised Methodist, Steve explored many different denominations before coming to land at St. Bartholomew’s Episcopal church in Baltimore (Saint Bee’s). He is the proud husband and father in a blended family that comprises and three daughters who bring such joy and energy that only his amazing wife can rescue him to go for a quiet walk with the dog.