The question before us is this: how can natural laws and random events, over time, assemble the kind of information content we see in living organisms? In the first two posts of this series I described several systems where just a few simple pieces can combine into a vast space of possibilities, and where random events can cause those pieces to explore portions of that space, assembling more complex objects and a rich variety of environments. But that’s not the whole story. The earliest living organisms on Earth were much simpler than today’s organisms. Greater complexity implies greater information. To understand how that increase happened, we need to add two more chapters to the story: evolutionary adaptation and co-option.
Consider Cog, a robot developed at the Massachusetts Institute of Technology to help test theories about human learning. Cog learns about its environment by interacting with it. It has many interconnected computer processors working simultaneously that process sensory information, control body movements, and then coordinate sensory information with body motion so that Cog can learn to perform physical tasks.
One task which Cog can learn through repeated trials is pointing its arm at a distant object. During the first few trials, Cog flails its arm and points randomly. Then, error-correcting routines take over and, after repeated trials, Cog gets better and better at pointing. Now consider the end result of many trials. There are many variables in Cog’s distributed memory which allow it to point successfully. These variables control the sequence, timing, and amplitude of various motions in Cog’s neck, shoulder, elbow and wrist joints. If you reset Cog’s memory to where it was before it started learning, and have Cog re-learn the task, it will re-learn the task in about the same amount of time, with the same success, but with a different final set of variables in its memory. The final set of variables differs from one learning run to the next because Cog’s learning starts out with random arm-flailing, which provides data for subsequent error-correction and improvement.
After Cog has learned a task, there is a great deal of information in its distributed memory. Out of all possible sets of variables in Cog’s memory, only a tiny subset of variables allows it to perform the specified task of pointing at distant objects. (This is somewhat analogous to the fact that out of all possible DNA sequences, only a tiny subset of DNA sequences can produce a living organism.) Once Cog has learned to point, there is new information in Cog’s memory necessary for the task. Where did that information come from?
To help answer this question, imagine a simple computer program designed to learn how to navigate mazes. This program reads an instruction string of zeros and ones; each pair of bits in the instruction string tells it to move left, right, up, or down the screen. The instruction string starts out as ten randomly chosen bits, enough to move five steps. The program enters the maze and simply follows the instruction string from the beginning. If the program hits a wall in the maze before it gets to the end of its instruction string, it stops following the instruction string and generates an error signal. If the program gets an error signal, it randomly flips one bit and tries again from the beginning of the maze.
Eventually, the program will hit upon an instruction string which it can follow to the end without getting an error signal. When this happens, the program increases the length of the instruction string by duplicating a random ten-bit piece of its instruction string and appending it to the end. Now whenever it generates an error signal, it randomly flips only one of the last ten bits of its instruction string. After repeated trials, it will once again hit upon an instruction string which it can follow to the end without error. And once again it lengthens its instruction string further. This continues until the program finally finds the exit of the maze. If the maze is large, the instruction string it discovered by trial-and-error can be much longer than the simple program which reads the instruction string. There will be a lot of new information in that instruction string telling the program how to navigate the maze. Where did that information come from?
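For readers who like to tinker, here is a minimal Python sketch of the maze-learning procedure just described. It is an illustration written under several assumptions of my own, not a record of any particular program: the bit-pair-to-direction mapping, the small example maze, the choice to stop as soon as any step lands on the exit, and the names `run`, `learn_maze`, and `max_trials` are all hypothetical.

```python
import random

# Assumed encoding (the text does not specify one): each bit pair is one move.
MOVES = {
    (0, 0): (0, -1),  # left
    (0, 1): (0, 1),   # right
    (1, 0): (-1, 0),  # up
    (1, 1): (1, 0),   # down
}

# Hypothetical example maze: '#' is a wall, 'S' the start, 'E' the exit.
# The border is solid wall, so a move can never leave the grid.
MAZE = [
    "#######",
    "#S  # #",
    "# #   #",
    "# # #E#",
    "#######",
]


def find(maze, ch):
    """Return the (row, column) of the first cell containing ch."""
    for r, row in enumerate(maze):
        c = row.find(ch)
        if c != -1:
            return r, c
    raise ValueError(f"{ch!r} not found in maze")


def run(maze, bits):
    """Follow the instruction string from the start of the maze.

    Returns (status, steps): "error" if a wall was hit, "exit" if the
    exit was reached, or "done" if every instruction was followed.
    """
    r, c = find(maze, "S")
    for i in range(0, len(bits), 2):
        dr, dc = MOVES[(bits[i], bits[i + 1])]
        r, c = r + dr, c + dc
        if maze[r][c] == "#":
            return "error", i // 2
        if maze[r][c] == "E":
            return "exit", i // 2 + 1
    return "done", len(bits) // 2


def learn_maze(maze, rng, chunk=10, max_trials=1_000_000):
    """Trial-and-error search; returns (instruction bits, trials used)."""
    bits = [rng.randint(0, 1) for _ in range(chunk)]
    for trial in range(1, max_trials + 1):
        status, steps = run(maze, bits)
        if status == "exit":
            return bits[: 2 * steps], trial
        if status == "error":
            # Error signal: flip one random bit among the last `chunk` bits.
            i = len(bits) - 1 - rng.randrange(chunk)
            bits[i] ^= 1
        else:
            # Success so far: duplicate a random ten-bit piece and append it.
            start = rng.randrange(len(bits) - chunk + 1)
            bits.extend(bits[start : start + chunk])
    return None, max_trials  # gave up (safety cap, not part of the description)


if __name__ == "__main__":
    solution, trials = learn_maze(MAZE, random.Random(0))
    print(f"Found a {len(solution)}-bit instruction string after {trials} trials")
```

Notice that nothing in the code "knows" the layout of the maze in advance. The only feedback the search receives is the error signal, yet the finished instruction string describes a route through this particular maze.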
Both Cog and the maze-navigating program illustrate how information about the environment can be duplicated into the instruction strings (or “genomes”) of a self-replicating and evolving system. Biological evolution does the same thing. The genomes of organisms contain information about how to survive and thrive in a particular environment. Often, there is redundancy in that information. The same amount of information can be encoded in many different ways. When a mutation that leads to greater reproductive success occurs in one member of a population, and that mutation spreads through the population, the organisms become better adapted to their environment, and their genomes contain still more information about how to survive and thrive in that environment.1
For the past few years, I’ve been working with collaborators and students to build another computer model of this. We call it Pykaryotes.2 The digital organisms in our computer model “live” and move in an environment with a distribution of several different kinds of food. Each organism has a sort of genome: a string of codons which tells it what food to gather and when, when and in what direction to move, and what proteins to make. After a certain number of genome reading steps, an organism’s fitness is calculated based on how many food chemicals it has gathered, and its fitness determines the average number of offspring it produces.
During reproduction, our digital organisms might experience various types of mutations inspired by real-world biological mutations, including point-mutations, deletions, genome copying, and horizontal gene transfer. When we run the Pykaryotes program with these mutation rates set neither too high nor too low and with adequate rewards for fitness, the simple starting organisms almost always become more fit and more complex as the generations pass, and the information content of their genomes grows. When we run this program under other conditions—for example, when the mutation rate is too high or too low, or when there are only weak rewards for increased fitness—the simple starting organisms do not evolve to greater and greater fitness, and the information content of their genomes does not grow. The program was designed to behave this way in order to mimic real biological evolution.
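To make the four kinds of mutation concrete, here is a toy Python sketch of how each one might act on a genome stored as a list of codons. This is not the Pykaryotes source code. The codon alphabet, the function names, the `max_len` and `rate` parameters, and the rule of applying each mutation type independently are all assumptions made for illustration.

```python
import random

CODONS = "ABCDEFGH"  # hypothetical codon alphabet, for illustration only


def point_mutation(genome, rng):
    """Replace one randomly chosen codon with a random codon."""
    g = list(genome)
    if g:
        g[rng.randrange(len(g))] = rng.choice(CODONS)
    return g


def deletion(genome, rng, max_len=3):
    """Remove a short random stretch, always leaving at least one codon."""
    g = list(genome)
    if len(g) <= 1:
        return g
    n = rng.randint(1, min(max_len, len(g) - 1))
    start = rng.randrange(len(g) - n + 1)
    return g[:start] + g[start + n:]


def duplication(genome, rng, max_len=3):
    """Copy a short random stretch and insert the copy at a random position."""
    g = list(genome)
    if not g:
        return g
    n = rng.randint(1, min(max_len, len(g)))
    start = rng.randrange(len(g) - n + 1)
    pos = rng.randrange(len(g) + 1)
    return g[:pos] + g[start:start + n] + g[pos:]


def horizontal_transfer(genome, donor, rng, max_len=3):
    """Insert a short stretch copied from another organism's genome."""
    g = list(genome)
    if not donor:
        return g
    n = rng.randint(1, min(max_len, len(donor)))
    start = rng.randrange(len(donor) - n + 1)
    pos = rng.randrange(len(g) + 1)
    return g[:pos] + list(donor[start:start + n]) + g[pos:]


def mutate(genome, donor, rng, rate=0.1):
    """Apply each mutation type independently with probability `rate`."""
    g = list(genome)
    if rng.random() < rate:
        g = point_mutation(g, rng)
    if rng.random() < rate:
        g = deletion(g, rng)
    if rng.random() < rate:
        g = duplication(g, rng)
    if rng.random() < rate:
        g = horizontal_transfer(g, donor, rng)
    return g


if __name__ == "__main__":
    rng = random.Random(1)
    parent = list("ABCABC")
    neighbor = list("HGFEDC")
    print("".join(mutate(parent, neighbor, rng, rate=0.5)))
```

In a sketch like this, the single `rate` parameter plays the role of the mutation rate discussed above: at zero, offspring are exact copies and nothing new is ever tried, while at a high value offspring lose most of what their parents had accumulated.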
Pykaryotes illustrates yet another way in which information and complexity can evolve. In our digital organisms, proteins which initially have one function can gain new functions (sometimes without losing their old functions) through co-option. Protein co-option could start with gene duplication, followed by mutation of one copy of the gene (while the other retains its original function) until the protein has changed enough that it begins interacting in new ways with other proteins in the cell. A second way for a protein to become co-opted is when mutations happen, not in the gene for the protein itself, but elsewhere in the genome, creating changes in other proteins in the cell, and causing new interactions. A third way for co-option to happen is not with changes in the genome at all, but with changes in the environment. When the environment changes, a protein which already performs one function can continue to do so while beginning also to perform a new function in the new environment.
Over time, our digital organisms evolve complexes of 2 to 5 bound proteins in which the complexes as a whole have food-gathering functions but each component protein has no independent function. Our computer model allows us to study how the rate at which such protein complexes evolve depends on things like mutation rates, the strength of natural selection, and the frequency of changes in the environment. Whenever conditions are right, these digital organisms evolve ever larger functional protein complexes, each of which requires ever larger amounts of information to describe how the organisms gather each type of food. Through gene duplication, mutation, and co-option, the information content of their genomes grows as well.
Other articles on the BioLogos website describe real biological examples of co-option creating new biological information and increasing complexity (check out the further reading section for some examples). Cog the robot, the maze navigation program, and Pykaryotes are all examples of “evolutionary algorithms.” They are human-designed computational systems inspired by God-designed biological systems. As evolutionary systems adapt to their environment through a process of trial-and-error (a better term would be “trial-and-greater-success”), they accumulate more and more information about the environment, and encode that information inside themselves.
I’m fascinated by natural systems which become more complex over time via the interplay of law and chance. I believe that God designed the laws and random processes of the natural world so that, under certain conditions, physical and biological systems evolve greater complexity and create information naturally. We humans have been inspired by God’s handiwork and created many games, mathematical systems, and technologies which model some of those natural processes. I don’t claim to have proved that the complex biochemical machinery of modern cells evolved from simpler beginnings. I’m making the following more limited claim: information poses no barrier to the evolution of complexity.
And the next time you see snow falling, pause to be amazed by this fact: just the simple water molecule can combine with other copies of itself, through the interplay of law and chance, to form billions of billions of snowflakes—each one unique.