The Evolutionary Origins of Genetic Information, Part 2

By Stephen Freeland

This and next week on the BioLogos Forum, join astrobiologist Stephen Freeland for a look into the nature of information and the origins of life on earth. (These posts were originally published as a paper in the ASA’s academic journal, PSCF, and are reprinted here with permission.) 

Stephen was raised in a Christian family, and his father is a Methodist minister in England whose passion for natural history and science provided a rich environment in which to explore the relationship between science and faith. During Stephen’s teenage years, he explored various denominations, from Catholic to charismatic non-denominational churches, and most recently, life in Baltimore has led Stephen to a deep and rewarding connection with St. Bartholomew's Episcopal Church, where he enjoys the Christ-centered meeting point of spiritual substance, social justice, inclusive grace, and rich traditions of liturgy and music.

Because of Stephen’s commitment to deepening his faith through conversations with other Christians, which helps to deepen our corporate understanding of God’s grace and processes, and because this material concerns the rather controversial subject of first life and evolution, Stephen will be participating in the online conversation at the bottom of each post in this series. At the end of each post, you’ll find a few discussion questions, which we encourage you to use as starting points for commenting (but you are of course welcome to ask him questions of your own and add your own observations to the dialogue).

Note: For more on the topic of genetics and evolution, please see BioLogos Fellow Dennis Venema’s current series.

Can natural processes generate new genetic information?

Unless life began in greater quantity than it now exists, evolution requires that natural processes have, over time, increased the total quantity of genetic material (DNA) present on our planet.

This is one way in which science currently believes genetic information has increased over time: a natural process has increased the number of copies of DNA molecules without any need for guidance by an intelligent agent. This kind of increase in genetic information is exactly what we see whenever a natural population grows (e.g., bacteria during an infection). Clearly, this type of information-increase is not at issue. Indeed, Intelligent Design refers to this as a flow of information, rather than creation of new information.1

Along similar lines, unless life originated containing more DNA than the most genetically complex organism alive today, then some lineages must have increased the quantity of DNA they contain through evolution.2 Established science knows ways to observe and measure this kind of increase in genetic information. For example, genome-sequencing technology has revealed small variations in the length of genetic material carried by different individuals within every natural population, including our own species.3 One of the fundamental types of mutation recognized by geneticists is the indel (short for “insertion/deletion mutation”), and indels represent micro-evolution. Why could insertions not accumulate faster than deletions over time, causing genetic material to grow in size? This is exactly what we would expect if micro-evolution adds up to produce macro-evolution. Again, Intelligent Design agrees with mainstream science that this is entirely within the realm of causation by existing theory and that a focus on quantities of DNA is misleading. Genetic differences between a human and an amoeba are only partly attributable to the different quantity of genetic material present in each. For example, the Amoeba proteus genome contains 100-fold more DNA than a human genome; other species of Amoebae contain both much larger and much smaller quantities of genetic information.4

More important than the quantity of DNA present in each species is the different order in which nucleotides are linked together to spell out genetic messages. DNA has the unusual property of being aperiodic. This means that the sequence of nucleotides within a DNA molecule is not constrained to any kind of repeating pattern (see Box 1). It is precisely this property that allows DNA or anything with similar properties to carry a large amount of information. For example, written English is an aperiodic sequence built from relatively few symbols. Everything ever written in English can be copied using one simple keyboard. The trick is to arrange these building-block symbols into particular aperiodic sequences. The major difference between this article and Harry Potter lies not in the quantity of letters and punctuation used but in the sequence in which these symbols have been assembled. Where current evolutionary science disagrees with Intelligent Design is in the suggestion that some sequences of genetic information can only be generated by a guiding intelligence.1 Intelligent Design asserts that natural processes cannot produce changes in genetic information if these changes correspond to an increase in specified complexity. Specified complexity is defined in a way that tries to capture the difference that separates this essay from Harry Potter. More accurately, specified complexity is the information that distinguishes any random sequence of symbols from orderings that have meaning.5
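The point about aperiodic sequences can be illustrated with a rough sketch. It assumes that compressed size is an acceptable stand-in for how much a sequence can carry, which is only a proxy and says nothing about meaning. The sequence length, the seed and the helper name `compressed_size` are illustrative choices, not part of the original argument.

```python
import random
import zlib

LENGTH = 1200  # nucleotides in each toy sequence (arbitrary choice)
rng = random.Random(0)  # fixed seed so the run is repeatable

# A strictly repeating sequence versus one with no repeating pattern.
periodic = "ACGT" * (LENGTH // 4)
aperiodic = "".join(rng.choice("ACGT") for _ in range(LENGTH))


def compressed_size(sequence: str) -> int:
    """Bytes zlib needs to store the sequence (a rough proxy only)."""
    return len(zlib.compress(sequence.encode("ascii"), 9))


print("periodic :", compressed_size(periodic), "bytes")
print("aperiodic:", compressed_size(aperiodic), "bytes")
```

The repeating sequence collapses to a short description ("ACGT, many times"), while the aperiodic one does not. Note that the aperiodic sequence here is random, so it has capacity without meaning, which is exactly the distinction the paragraph above draws.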

The idea that some sequences of DNA cannot be produced by natural processes owing to the information they contain has no empirical support from modern genetics. In fact, quite the reverse. Genetic information is stored in sequences of nucleotides that have been chemically linked together to form a molecule of DNA. Genetics, bioinformatics, biochemistry and molecular biology all agree that natural processes can cause any nucleotide to become the neighbor of any other within a DNA sequence. Mutations that interconvert each of the four nucleotides have been observed within natural populations and within the laboratory, as have insertions, deletions and translocations of mini-sequences from one region of the DNA sequence to another. These elementary components of modern genetics are, in principle, more than sufficient to produce any DNA sequence from any other. Try this for yourself by listing a series of mutations that convert the word “evolution” into “creation” with the restriction that each mutation must either change a single letter, insert or delete one or more letters, or move the position of any sub-group of letters. There are many ways to reach the outcome, and this remains true for any two words that you can choose.6
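For readers who would rather let a computer do the word exercise, here is a minimal sketch. It assumes a simplified mutation set (change, insert or delete one letter at a time, with no moves of letter groups) and uses the textbook edit-distance table to find one shortest route. The function name `edit_path` is illustrative; many other routes exist.

```python
def edit_path(source: str, target: str) -> list[str]:
    """Return the words visited on one shortest single-letter edit route."""
    m, n = len(source), len(target)
    # dist[i][j] = fewest edits that turn source[:i] into target[:j]
    dist = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dist[i][0] = i
    for j in range(n + 1):
        dist[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if source[i - 1] == target[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j] + 1,         # delete a letter
                             dist[i][j - 1] + 1,         # insert a letter
                             dist[i - 1][j - 1] + cost)  # keep or change

    # Walk back through the table, applying edits from right to left so
    # that the untouched left-hand part of the word keeps its positions.
    word = list(source)
    path = [source]
    i, j = m, n
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and source[i - 1] == target[j - 1]
                and dist[i][j] == dist[i - 1][j - 1]):
            i, j = i - 1, j - 1              # letters already agree
            continue
        if i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + 1:
            word[i - 1] = target[j - 1]      # point change
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            del word[i - 1]                  # deletion
            i -= 1
        else:
            word.insert(i, target[j - 1])    # insertion
            j -= 1
        path.append("".join(word))
    return path


for step, word in enumerate(edit_path("evolution", "creation")):
    print(step, word)
```

Each printed line differs from the one before it by a single letter-level mutation. Allowing multi-letter insertions, deletions or moves, as the exercise does, only adds further routes.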

The biochemistry that describes how genetic information is stored, replicated, corrected and translated into proteins is fascinating but requires no novel concepts regarding semantic information. The question is whether the addition of this latter concept can reveal insights, such as limitations too subtle to observe with empirical science. As the companion article by Isaac explains, science recognizes several types of information. One of the most fundamental types is thermodynamic information, a basic parameter of physics that reflects all that could be different about the universe. If evolutionary theory implied an increase or loss in thermodynamic information, then it would be in conflict with established ideas belonging to another branch of science. This is not the case. Nothing about biological evolution ever involves an increase (or decrease) in the thermodynamic information present within the universe. Indeed, evolution can be described precisely in terms of thermodynamic processes by which sources of energy bring into being particular states of information within a DNA molecule. The opening definition of this essay tries to emphasize this point: “Biological evolution describes a natural process that transfers information from a local environment into the chemical known as DNA.” To understand why this causes many biologists to doubt whether additional concepts regarding information are necessary or helpful, one must return to Darwin’s original insight.

Within a population of individuals that vary from one another, those that best match their environment will, on average, leave behind the most offspring. Wherever the match is genetically programmed, the version of the genetic program associated with the best match will tend to increase in frequency over time by leaving behind more copies of itself. As these advantageous versions are copied from one generation to the next, they will mix with new variations that either increase or decrease the match. All the while, the environment keeps changing and mutations keep occurring so the matching process continues. Repeating this process over and over will create a pool of genetic programs that have accumulated variations maximizing the overall match between organism and environment (quite simply because those that didn’t match so well left behind fewer copies of themselves).7 Through this process, genetic material will evolve to mirror some of the information presented by the environment in which it is copying itself. This information might include patterns in time and space by which ambient temperatures vary, or patterns of chemical resources found in the environment. Things get especially interesting when we realize that some of the most significant information about an organism’s environment is specified by other organisms. The color of leaf on which an organism feeds may become reflected in its genetic material if this type of genetic programming helps the herbivore to hide from predators; conversely, genetic material may evolve to program colorations that contrast with the background of other organisms in an environment where finding and attracting mates is the strategy that leaves behind the most copies. Each reflection originates in physical parameters but these collide, transfer information and start new emanations as they become reflected in organisms’ genetic material. 
No matter how complex these rebounding, mixing reflections of the environment become, they will never create new information (any more than your image in a reflection of a reflection of a reflection contains more information than you do).8 Viewed in this light, biological evolution is a natural process that distills thermodynamic information from a highly complex environment into molecules of DNA.9
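The matching process described above can be sketched as a toy simulation. Everything in it is an assumption made for illustration: the "environment" is reduced to a fixed string of letters, a genome's match is the number of positions that agree with it, and each extra matching position doubles the expected number of offspring. Real environments change over time and real fitness is far less tidy.

```python
import random

ALPHABET = "ACGT"


def match(genome: str, environment: str) -> int:
    """Number of positions at which the genome mirrors the environment."""
    return sum(g == e for g, e in zip(genome, environment))


def mutate(genome: str, rate: float, rng: random.Random) -> str:
    """Copy a genome, replacing each letter with a random one at the given rate."""
    return "".join(rng.choice(ALPHABET) if rng.random() < rate else base
                   for base in genome)


def mean_match(population: list[str], environment: str) -> float:
    return sum(match(g, environment) for g in population) / len(population)


def evolve(environment: str, pop_size: int = 50, generations: int = 200,
           rate: float = 0.02, seed: int = 0) -> list[float]:
    """Return the population's mean match before and after each generation."""
    rng = random.Random(seed)
    population = ["".join(rng.choice(ALPHABET) for _ in environment)
                  for _ in range(pop_size)]
    history = [mean_match(population, environment)]
    for _ in range(generations):
        # Better-matched genomes leave more copies on average (assumed rule:
        # each extra matching position doubles the chance of being copied).
        weights = [2 ** match(g, environment) for g in population]
        parents = rng.choices(population, weights=weights, k=pop_size)
        # Copies are imperfect, so variation keeps arising.
        population = [mutate(p, rate, rng) for p in parents]
        history.append(mean_match(population, environment))
    return history


environment = "GATTACAGATTACAGATTAC"  # hypothetical stand-in for local conditions
history = evolve(environment)
print("mean match at start:", history[0], "of", len(environment))
print("mean match at end  :", history[-1], "of", len(environment))
```

The genomes start as random letters. The only place the final sequences could have acquired their pattern is the `environment` string, which is the sense in which the process transfers information rather than creating it.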

Evolution is to DNA what gravity is to a puddle of water: in both cases it is possible to isolate elements of the whole that carry impressively complex information (species really do contain lots of complex genetic programs written out in DNA, as does the shape produced when a body of H2O perfectly matches some of the information inherent to the collection of rocks and debris beneath). If we considered only the water, we might be tempted to think that some sort of intelligence had sculpted such a complex and accurate reflection of the environment. We might even measure this information content to demonstrate its improbability of arising by chance. But step back far enough to see the whole picture, and we realize that evidence consistent with design can be better understood as a result of natural processes (gravity and a pre-existing, information-rich environment). In the case of biological evolution, evolution and DNA take the place of gravity and water. Gravity and evolution not only permit the transfer of environmental information into a chemical medium, but inevitably and inexorably lead to this information transfer. Given this understanding, it is hard to see what evolutionary science would gain by accepting other concepts of semantic information that create a problem to be solved by invoking an indeterminate intelligent designer.

NOTE: Please join us tomorrow as we continue the discussion on whether natural processes can account for the origin of genetic information!

Box 1.    An Introduction to Biological Coding and the Central Dogma of Molecular Biology

A code is a system of rules for converting information of one representation into another. For example, Morse code describes the conversion of information represented by a simple alphabet of dots and dashes to another, more complex alphabet of letters, numbers and punctuation. The code itself is the system of rules that connects these two representations. Genetic coding involves much the same principles, and it is remarkably uniform throughout life (Figure 2): genetic information is stored in the form of nucleic acid (DNA and RNA), but organisms are built by (and to a large extent from) interacting networks of proteins. Proteins and nucleic acids are utterly different types of molecule; thus it is only by decoding genes into proteins that self-replicating organisms come into being, exposing genetic material to evolution. The decoding process occurs in two distinct stages. During transcription, local portions of the DNA double-helix are unwound to expose individual genes as templates from which temporary copies are made (transcribed) in the chemical sister language RNA. These messenger RNA molecules (mRNAs) are then translated into protein.

The language-based terminology reflects the fact that both genes and proteins are essentially 1-dimensional arrays of chemical letters. However, the nucleic acid alphabet comprises just 4 chemical letters (the 4 nucleotides are often abbreviated to ‘A’, ‘C’, ‘G’ and ‘T’ – but see footnote 27), whereas proteins are built from 20 different amino acids. Clearly, no 1:1 mapping can connect nucleotides to amino acids. Instead, nucleotides are translated as non-overlapping triplets known as codons. With 4 chemical letters grouped into codons of length 3, there are 4 × 4 × 4 = 64 possible codons. Each of these 64 codons is assigned to exactly one of 21 meanings (20 amino acids and a ‘stop translation’ signal found at the end of every gene). The genetic code is quite simply the mapping of codons to amino acid meanings (Figure 2a). One consequence of this mapping is that most of the amino acids are specified by more than one codon: this is commonly referred to as the redundancy of the code.
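The codon arithmetic above can be checked with a short sketch. It uses the standard genetic code written as a compact 64-letter string of one-letter amino acid abbreviations (with `*` for "stop"). The names `GENETIC_CODE` and `translate` are illustrative. As a simplification, it reads codons straight from the DNA letters and skips the intermediate mRNA copy.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code: one letter per codon, codons listed in TCAG order
# (TTT, TTC, TTA, TTG, TCT, ...). '*' marks the 'stop translation' signal.
MEANINGS = ("FFLLSSSSYY**CC*W"
            "LLLLPPPPHHQQRRRR"
            "IIIMTTTTNNKKSSRR"
            "VVVVAAAADDEEGGGG")

CODONS = ["".join(triplet) for triplet in product(BASES, repeat=3)]
GENETIC_CODE = dict(zip(CODONS, MEANINGS))


def translate(dna: str) -> str:
    """Read non-overlapping codons and return the amino acid sequence."""
    protein = []
    for start in range(0, len(dna) - 2, 3):
        meaning = GENETIC_CODE[dna[start:start + 3]]
        if meaning == "*":  # stop signal ends translation
            break
        protein.append(meaning)
    return "".join(protein)


print(len(GENETIC_CODE), "codons")
print(len(set(GENETIC_CODE.values())), "meanings")
print(Counter(GENETIC_CODE.values()).most_common())  # redundancy per meaning
print(translate("ATGGCCTAA"))
```

The per-meaning counts make the redundancy visible: some amino acids are reached by six different codons, while others have only one.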

Although the molecular machinery that produces genetic coding is complex (and indeed, less than perfectly understood), the most essential elements for this discussion are the tRNAs and the ribosome. Each organism uses a set of slightly different tRNAs that each bind a specific amino acid at one end, and recognize a specific codon or subset of codons at the other. As translation of a gene proceeds, appropriate tRNAs bind to successive codons, bringing the desired sequence of amino acids into close, linear proximity where they are chemically linked to form a protein translation product. In this sense, tRNAs are adaptors and translators – between them, they represent the molecular basis of genetic coding. The ribosome is a much larger molecule, comprising both RNA and various proteins, which supervises the whole process of translation. It contains a tunnel through which the ribbon of messenger RNA feeds; somewhere near to the center of the ribosome, a window exposes just enough genetic material for tRNAs to compete with each other to bind the exposed codons.


Q1: “Genetic differences between a human and an amoeba are only partly attributable to the different quantity of genetic material present in each. For example, the Amoeba proteus genome contains 100-fold more DNA than a human genome; other species of Amoebae contain both much larger and much smaller quantities of genetic information.” It surprised a lot of people to learn that Homo sapiens, while more complex in many ways than our animal cousins, doesn’t necessarily possess more DNA than other animal species. What do you make of this? Do you think this inequality should be a cause, as some have suggested, for humans to regard themselves less highly, or at least less ‘naturally’ superior to other biological forms?

Q2: “Within a population of individuals that vary from one another, those that best match their environment will, on average, leave behind the most offspring. Wherever the match is genetically programmed, the version of the genetic program associated with the best match will tend to increase in frequency over time by leaving behind more copies of itself… Through this process, genetic material will evolve to mirror some of the information presented by the environment in which it is copying itself.” If evolution mirrors its environment, how much do you feel humans are in charge of our own evolution, now that we can control, through technology, certain aspects of our environment? If you accept the historic evolution of the human species, do you believe that, even now, we continue to evolve? What do you think is the “next step” for human beings, biologically?

Q3: “The major difference between this article and Harry Potter lies not in the quantity of letters and punctuation used but in the sequence in which these symbols have been assembled.” Do you think it is possible and/or helpful to seek a particular measurement scale that can reflect accurately and objectively how much information is present in a given work of literature? Or will such measurements always reflect the unique focus of a particular study (e.g. to establish the factual content of a given work, or its poetic attributes)? What does this have to say about the possibility for science to measure the superiority (or otherwise) of our species relative to others?

Q4: In this section, Stephen describes three main points about genetic information: 1) that there is no chemical limitation on how letters of DNA can assemble, 2) that thermodynamic information is unrelated to DNA processes, and—most importantly— 3) that the semantic-level information in DNA ultimately comes from the complexities of the environment. What do you think of these arguments?


1.  “Intelligent Design as a Theory of Information,” William Dembski (1998).  Web material copyrighted to William Dembski, available at:

2. Though nothing in evolutionary theory suggests that there must be an increase in the length or complexity of a DNA molecule over time: for example, many bacteria and viruses appear to have undergone extensive natural selection to reduce the size of their genetic material as a specific adaptation to make copies of themselves faster than their competitors. For a recent example, see: Nikoh N, Hosokawa T, Oshima K, Hattori M, Fukatsu T.  “Reductive Evolution of Bacterial Genome in Insect Gut Environment.” Genome Biology and Evolution (2011)

3. R. Redon, S. Ishikawa, K. R. Fitch, L. Feuk, G. H. Perry, T. D. Andrews, H. Fiegler, M. H. Shapero, A. R. Carson, W. Chen, E. K. Cho, S. Dallaire, J. L. Freeman, J. R. González, M. Gratacòs, J. Huang, D. Kalaitzopoulos, D. Komura, J. R. MacDonald, C. R. Marshall, R. Mei, L. Montgomery, K. Nishimura, K. Okamura, F. Shen, M. J. Somerville, J. Tchinda, A. Valsesia, C. Woodwark, F. Yang, J. Zhang, T. Zerjal, J. Zhang, L. Armengol, D. F. Conrad, X. Estivill, C. Tyler-Smith, N. P. Carter, H. Aburatani, C. Lee, K. W. Jones, S. W. Scherer and M. E. Hurles “Global variation in copy number in the human genome”; Nature (2006) 444: 444-454.

4. For an excellent, evolving review of the interesting topic of genome sizes, see Gregory, T.R. (2005). Animal Genome Size Database. Available at:

5. If you would like to consider the implications of combinatorial language in greater detail without any formal mathematics, try reading Jorge Luis Borges’ famous short story entitled “The library of Babel.”  Available in English translation as pages 51-59 of “Labyrinths: selected stories and other writings” (1964, New Directions/Penguin, New York)

6. In fact, what is harder to deduce is which of the many routes is most likely, if you assign slightly different probabilities to each different type of step. This is why the past couple of decades have seen considerable research effort go into developing computer algorithms that estimate the most likely series of mutation-steps that separate two versions of genetic material. To understand the level of complexity here, consider some different routes by which a series of letter-mutations could transform the word “evolution” into “creation”, and then scale that challenge upwards to do something similar for two sentences, two paragraphs, two novels. A good, recent overview is given in: Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S., “MEGA5: Molecular Evolutionary Genetics Analysis using Maximum Likelihood, Evolutionary Distance, and Maximum Parsimony Methods.” Mol Biol Evol (2011) 28:2731-9.

7. For an accessible, eloquent discussion of where this can lead, see Chapter 3 (“Accumulating Small Change” pp. 43-77), from Richard Dawkins’ book “The Blind Watchmaker” (1986 New York: W. W. Norton & Company)

8. On a different note, it is interesting to see how this same line of thought parallels theological examination of the famous biblical text that humanity was created in the image of God (Genesis 1:27). If each of us is built in the image of God, and each of us is different, then it follows that each of us is capable of developing a different relationship with God based on the unique perspective granted us. This observation provides a logical check to any theologies that assert necessary submission to a single, all-embracing interpretation of God’s revealed truth. Within the Gospels, Jesus’ personal encounters show a consistent emphasis on the unique point of connection between an individual’s perspective and God’s greater truth (e.g. compare John 3:1-7, John 4:1-29, Mark 10:17-22, Matthew 8:5-13, Luke 23:33-43), together with a consistent wariness towards group ideologies (e.g. Mark 12:18-27, Matthew 12:1-9, Matthew 15:1-11).

9. This reductionist description of evolution contains little that is new (scientifically) precisely because the aim of this essay is to explain how classic Neo-Darwinian orthodoxy addresses the issue of the origin of (new) genetic information. This view of evolution is probably best known through the popular works of writers such as Dawkins, and everything written here is in true alignment with insights expressed in books such as The Selfish Gene, The Blind Watchmaker and (most relevant to criticisms of reductionism) The Extended Phenotype. Behind these works lies an extensive primary research literature that has developed these ideas, before and after, with respect to genomics, genetics, biological development (“embryology”), animal behavior, morphology, life history strategies and so on. This reductionist view does not overlook the existence of phenotype as the filter through which the environment passes its information into DNA - this is why The Extended Phenotype is the most relevant popular work to discuss in this context - but as Dawkins explains so clearly in The Selfish Gene, environmental pressures that do not create a corresponding “match” within DNA are irrelevant to evolution precisely because heritability is one of the 3 tenets (variation, heritability and competition to reproduce) that lead to Darwin’s inescapable conclusion: heritable variations which increase the reproductive success of a lineage will, over time, accumulate.


About the Author

Stephen Freeland

Stephen Freeland is currently the Director for the Interdisciplinary Studies program at UMBC. His academic background (a bachelor’s degree in zoology from Oxford, a master’s in biological computation from York University, and a doctorate in genetics from Cambridge) has led him to spend the past twenty years researching the evolution of genetic coding. Steve’s current research explores the evolution of the amino acid “alphabet”: the set of twenty building blocks with which life has been making the proteins of metabolism for more than three billion years. Underlying this research is a growing interest in the cosmological question, “To what degree is life on Earth (or elsewhere) a result of chance?” As the son of a biology teacher who retrained as a Methodist minister, Steve has been blessed with an encouraging environment with which to explore the interface of science and faith since childhood.

