EDITOR’S NOTE: A scientific glossary is provided below this article for those who are unfamiliar with the scientific terms and concepts.
Is “biological information” merely an analogy of convenience for biologists, or does the cell contain information in the sense of a language or code? In this series we explore the science behind this question and its implications for the arguments of the Intelligent Design movement.
In the previous post in this series, we explored how DNA functions as a carrier of information. However, it lacks the ability to perform cellular activities, given its inability to form the complex shapes needed to act as enzymes, structural components of the cell, and so on. In contrast, we saw that proteins perform these roles admirably, while themselves lacking the ability to act as a template for their own replication (as DNA does). Proteins are, in a sense, “disposable” entities: once made, they function until they are damaged and recycled by the cell – broken down for the amino acid monomers they contain. They cannot replicate, nor can their structure be used as a template for replication like the structure of DNA. DNA and proteins are thus complementary in function: DNA supplies the information needed to determine the sequence of amino acids to make functional proteins, and proteins do the cellular work that DNA on its own cannot do.
Interposed between these two systems is a third set of large molecules: ribonucleic acids, or RNA. One way to think about RNA is as a single-stranded version of DNA. This is not entirely correct, since RNA uses one monomer unique to itself (uracil, or “U”) in place of the thymidine (or “T”) in DNA. Still, it’s not a bad way of visualizing it. DNA has two strands that wind around each other in the famous “double helix”. As we saw in the last post, the attraction between the two strands arises from the alignment of atoms that participate in weak attractive bonds called hydrogen bonds.
RNA, since it is single-stranded, does not have an opposing strand with which to form hydrogen bonds. This allows an RNA molecule to form hydrogen bonds between its own monomers– bending and flexing to line up monomers for pairing. The result is a molecule that has both information-bearing properties like DNA, but the ability to take on an array of functional three-dimensional shapes, like proteins. RNA molecules can be unfolded to use their sequence of monomers to specify a copy, and can fold up to do important functions in the cell.
A short, single stranded RNA molecule folded up on itself can form an array of functional 3-D shapes, while retaining its ability to carry information in the sequence of its monomers. The paired sections, which look something like a double helix, are in fact paired sections within one RNA strand.
As it happens, three distinct classes of RNA molecules are essential for transferring the information of DNA into proteins: “messenger RNA”, “transfer RNA” and “ribosomal RNA.” The overall process is straightforward: the sequence of DNA monomers (nucleotides) needs to be transferred to its corresponding sequence of protein monomers (amino acids). Let’s examine the roles of each of these RNA classes in this process in turn.
Messenger RNA: from DNA to working copy
Each chromosome is a very long DNA double helix. This large entity is not suited to easily moving around within the cell; moreover, it also contains the DNA sequence of many proteins (the regions of a chromosome that have sequences for proteins are called protein-coding genes, a term you are likely familiar with). When A cell needs to convert the DNA of a protein-coding gene into a protein, it is copied into a single-stranded RNA version that spans only this one gene. A single-stranded RNA copy of an individual gene is called a messenger RNAs, or “mRNA”. To make the copy, enzymes spread apart the two strands of the chromosomal DNA such that a section of it is now two single-stranded regions, and one of the strands is used as a template to make the strand of RNA. This provides the cell with a “working copy” of a single gene, in RNA form, that can easily move around within the cell. Specifically, mRNAs need to leave the cell nucleus – the internal compartment where chromosomes are found – and be transported out into the cell cytoplasm – where further processing can take place. While mRNAs do have some 3-D structure that is important for their function, they are mainly used to take a copy of the DNA sequence to the place where it can be converted to an amino acid sequence – a large enzyme complex called the ribosome. Since the mRNA and the DNA from which it was copied are both nucleotide sequences, scientists called this process “transcription” when it was discovered. Transcription, as the name implies, is copying a text to make a duplicate in the same language. The process of converting the nucleotide sequence, or “language” into an amino acid sequence, is correspondingly called translation. The next two RNA types are required for this process.
ABOVE: This 3D animation shows how proteins are made in the cell from the information in the DNA code.
Cracking the code
One of the challenges of translation that long puzzled biologists was trying to understand how a sequence of DNA monomers – with four monomer options – was converted into a sequence of amino acids, with 20 options. Indeed, the few monomer types found in DNA lead early biologists to conclude that it was far too simple to act as a repository of the vast amounts of information needed by a cell. Later experimental evidence in favor of DNA as the hereditary molecule would have to swim against this current of suspicion.
Once the DNA double helix structure was worked out in the early 1950s, there followed something of a race to elucidate exactly how such a simple structure could specify the complex sequences of amino acids. The structure of DNA immediately answered the “how does it replicate with high accuracy?” question, but failed to reveal how the precise amino acid sequences of proteins were specified.
Since DNA has only four monomers, scientists quickly realized that a simple one-to-one correspondence between one nucleotide and one amino acid would not be the answer – since such a system could only allow for four amino acids in proteins, when 20 were known. Alternative hypotheses were then explored – one of which was that groups of nucleotides could be used to specify a single amino acid. Pairs of nucleotides would thus have 16 possible states (four options for the first and four options for the second, giving 16 total possibilities). Since this is also less than 20, the idea that three nucleotides might specify a single amino acid was investigated. This system allows for 4x4x4 combinations, or 64 in total. While this exceeded the number needed, this hypothesis proved fruitful. Over time, biologists worked out that three nucleotides were indeed used to “code” for a single amino acid. The fact that there were more nucleotide combinations than the 20 required was also explained in time – many amino acids could be coded for by several combinations of nucleotides. For example, the amino acid glycine was found to be coded for by the DNA nucleotide triplets GGA, GGC, GGG, and GGT. The “code” was in fact partially redundant.
A code by any other name
It was also, unsurprisingly, at this time that the “code” analogy for these correspondences entered the biologist’s vocabulary. The race to figure out the links between nucleotide triplets and their resulting amino acid was discussed by scientists as “cracking the biological code” and similar phrases. That this work took place in the 1960s, following on from the successes of Bletchley Park and Ultra in World War Two, and in the midst of the espionage and counterespionage of the Cold War, lent further weight to the metaphor. So apt was this analogy, that many scientists, to say nothing of the media and the public, often did not qualify that this was in fact an analogy. The name given for the nucleotide triplets was “codon”, and to this day biology textbooks speak of the codon table as the “genetic code.”
Code or chemistry?
Though the biologists who did this work used “code” language as an analogy for the complex chemistry they were discovering, it is important to remember that they did not view it as an actual code in the sense of a symbolic system designed by an intelligence. In contrast, the Intelligent Design (ID) movement does view the “genetic code” and its associated chemistry in this way - primarily because they claim that natural processes are not sufficient to explain its origin. Once we’ve examined how this intricate system works, we’ll be in a better position to understand and evaluate that claim.
In tomorrow’s post, we’ll continue to explore this system by at the roles of transfer and ribosomal RNA molecules in the translation process.