Behe, Lenski and the “Edge” of Evolution, Part 2: Gaining a New Function

October 23, 2012 Tags: Genetics

Today's entry was written by Dennis Venema. You can read more about what we believe here.

Note: In this series, we reexamine the claim made by Intelligent Design proponent Michael Behe to have found a limit to “Darwinian” evolution in light of recent results from the laboratory of Richard Lenski.

Climbing Mount Citrate

As we discussed yesterday, the most dramatic innovation yet observed in the E. coli Long Term Evolution Experiment (LTEE) was the ability, acquired by one of the twelve cultures, to use citrate as a carbon source under aerobic conditions. When we last discussed the LTEE in 2011, we noted what was known then about the mutations that eventually combined to produce the Cit+ trait:

Tracking down the nature of this dramatic change led to some interesting findings. The ability to use citrate as a food source did not arise in a single step, but rather as a series of steps, some of which are separated by thousands of generations:

  1. The first step is a mutation that arose at around generation 20,000. This mutation on its own does not allow the bacteria to use citrate, but without this mutation in place, later generations cannot evolve the ability to use citrate. Lenski and colleagues were careful to determine that this mutation is not simply a mutation that increases the background mutation rate. In other words, a portion of what later becomes “specified information for using citrate” arises thousands of generations before citrate is ever used.
  2. The earliest mutants that can use citrate as a food source do so very, very poorly – once they use up the available glucose, they take a long time to switch over to using citrate. These “early adopters” are a tiny fraction of the overall population. The “specified information for using citrate” at this stage is pretty poor.
  3. Once the (poor) ability to use citrate shows up, other mutations arise that greatly improve this new ability. Soon, bacteria that use citrate dominate the population. The “specified information for using citrate” has now been honed by further mutation and natural selection.
  4. Despite the “takeover”, a fraction of the population unable to use citrate persists as a minority. These cells eke out a living by being “glucose specialists” – they are better at using up glucose rapidly and then going into stasis before the slightly slower citrate-eaters catch up. So, new “specified information to get the glucose quickly before those pesky citrate-eaters do” allows these bacteria to survive. As such, the two lineages in this population have partitioned the available resources and now occupy two different ecological niches in the same environment. As such, they are well on their way to becoming different bacterial species.

Here, then, are three distinct steps observed by the Lenski group: steps they call potentiation, actualization, and refinement. Potentiation mutations do not themselves confer the ability to use citrate under aerobic conditions, but they are necessary for it to appear later. Actualization is the mutation that first brings about the Cit+ trait, though, as we noted, this step produced only a very weak Cit+ effect. This nascent ability then undergoes refinement through additional mutations and selection to give the final, robust Cit+ trait observed in the culture.

While some things were known about these steps when the Lenski group last published on this topic (in 2008), the precise details remained unclear. What was needed was a complete characterization of the Cit+ bacteria through whole-genome sequencing to help identify the changes. These long-awaited results are now available in a new paper published last month by the Lenski group, and they shed light on all three stages of the process.

Lights, camera, actualization

The key step - and the one of greatest interest - is of course actualization: the mutation that converted a Cit- cell to a Cit+ one. This is also one of the easiest steps to study, since the mutation provides the cell with a new feature that can be detected experimentally. Though E. coli cannot use citrate as a carbon source in the presence of oxygen, they are capable of using citrate in anoxic conditions (i.e. when oxygen is absent). To do so, they employ a protein that imports citrate into the cell while at the same time exporting a compound called succinate. Since the gene for this protein is already present in the E. coli genome, it was long suspected that a genetic regulatory change that turned on its production in the presence of oxygen could be the key innovation that produced the first Cit+ bacterium in the culture. As we discussed yesterday, Behe notes that this change could result from a loss-of-FCT or a gain-of-FCT mutation:

“If the phenotype of the Lenski Cit+ strain is caused by the loss of the activity of a normal genetic regulatory element, such as a repressor binding site or other FCT, it will, of course, be a loss-of-FCT mutation, despite its highly adaptive effects in the presence of citrate. If the phenotype is due to one or more mutations that result in, for example, the addition of a novel genetic regulatory element, gene duplication with sequence divergence, or the gain of a new binding site, then it will be a noteworthy gain-of-FCT mutation.”

Interestingly, the actualization mutation was indeed a change in the regulation of the anoxic citrate / succinate transporter, and it arose through a gain-of-FCT mutation. The mutation turned out to be a side-by-side duplication of the citrate / succinate transporter gene, along with portions of the two genes flanking it. This imprecise duplication placed a partial fusion of these flanking genes next door to one of the copies of the citrate / succinate transporter gene, bringing that copy under the control of promoter sequences derived from one of its neighbors, a gene that is active when oxygen is present. The result was a copy of the citrate / succinate transporter gene that was now very weakly expressed in aerobic conditions. Since this mutation duplicates a gene and simultaneously creates a new regulatory element for it (causing significant sequence divergence), it is a clear-cut example of a gain-of-FCT mutation.

Responding to the data

While Behe has not yet, to my knowledge, commented on this particular development within the LTEE, one of his colleagues in the Intelligent Design Movement (IDM), microbiologist Ann Gauger, has offered her thoughts. Two themes emerge in her commentary: that the Cit+ trait is “not new”, and that the number of mutations it required falls within the bounds set out by Behe and another member of the IDM, structural biologist Douglas Axe:

When is an innovation not an innovation? If by innovation you mean the evolution of something new, a feature not present before, then it would be stretching it to call the trait described by Blount et al. in "Genomic analysis of a key innovation in an experimental Escherichia coli population" an innovation [...]

The total number of mutations postulated for this adaptation is two or three, within the limits proposed for complex adaptations by Axe (2010) and Behe in Edge of Evolution. Because the enabling pre-adaptive mutations could not be identified, though, we don't know whether this was one mutation, a simple step-wise series of adaptive mutations, or a complex adaptation requiring one or two pre-adaptations before the big event.

But does this adaptation constitute a genuine innovation? That depends on the definition of innovation you use. It certainly is an example of reusing existing information in a new context, thus producing a new niche for E. coli in lab cultures. But if the definition of innovation is something genuinely new, such as a new transport molecule or a new enzyme, then no, this adaptation falls short as an innovation. And no one should be surprised.

While Gauger does not address the tension between her description of the Cit+ mutation as “not genuinely new” and Behe’s criteria, under which it should be classified as a gain-of-FCT mutation, it is clear that she views this event as within Behe’s “edge” – i.e. within the bounds of “what Darwinism can do.” Additionally, she sees it as falling within the scope of what is evolutionarily possible as proposed by Axe’s work. In the next installment of this series, we’ll revisit, with this new evidence in hand, how Behe defines his (claimed) limit on what evolutionary processes can accomplish. A careful examination of the potentiation and refinement phases of the Cit+ transition will prove informative for that discussion.

For further reading:

Blount, Z.D., Barrick, J.E., Davidson, C.J. and Lenski, R.E. (2012). Genomic analysis of a key innovation in an experimental Escherichia coli population. Nature 489: 513-518.

Michael J. Behe, The Edge of Evolution: The Search for the Limits of Darwinism (New York: Free Press, 2007).

Michael J. Behe (2010). Experimental evolution, loss-of-function mutations, and “The first rule of adaptive evolution”. The Quarterly Review of Biology 85(4): 419-445.


Dennis Venema is professor of biology at Trinity Western University in Langley, British Columbia. He holds a B.Sc. (with Honors) from the University of British Columbia (1996), and received his Ph.D. from the University of British Columbia in 2003. His research is focused on the genetics of pattern formation and signaling using the common fruit fly Drosophila melanogaster as a model organism. Dennis is a gifted thinker and writer on matters of science and faith, but also an award-winning biology teacher—he won the 2008 College Biology Teaching Award from the National Association of Biology Teachers. He and his family enjoy numerous outdoor activities that the Canadian Pacific coast region has to offer. Dennis writes regularly for the BioLogos Forum about the biological evidence for evolution.

This article is now closed for new comments. The archived comments are shown below.

Jon Garvey - #74283

November 10th 2012

Dennis

I’m not sure I fully understand your thoughts

That of course is why my post was a layman’s view. Your post does clarify your earlier statement for us punters, though, so thanks.

The problem with all these issues, it seems to me, is the difficulty of pinning down what “useful information” actually is. I think the bioinformatics people are to be congratulated for trying to pin that down, because there clearly is a difference between a random sequence and one with sophisticated function (like a functional algorithm). That’s why I’m more interested in the conceptual issue than the metric.

(As an aside, at the back of my mind here is the thought not just of simple protein coding, but of the roles of specific sequences in control networks, and also the observation that most “genes”, we’re told, are transcribed in multiple contexts, all of which creates new constraints/possibilities/complexities. Since all that requires detailed knowledge of the whole function of a particular gene it can’t really be applied to the examples you’ve been dealing with here, and would be even harder to quantify.)

But, again from a lay perspective, the question is what proportion of de-novo inactive, non-selected sequences actually are a mutation or two away from function (a) in absolute terms and (b) in cells.

If random sequences, unlike say computer code, can usually be tweaked to produce a functional protein, then there’s no controversy - Darwinian point mutations can achieve anything. But if cells contain a disproportionate number of near-functional (new) sequences compared to the random, then their accretion over time becomes part of the question to be answered: the explanation of the origin of instant coffee isn’t in the boiling water you add.

Anyway, I hope Kirk will get back here with some fits.


Dennis Venema - #74289

November 10th 2012

Thanks, Jon. Glad that what I wrote clarified what you were looking for.

On your question, one thing to keep in mind is that we have some examples of new genes / proteins arising with only a very weak function - the early Cit+ mutation in the LTEE is one example, and the original mutation that produced nylonase is another. When these genes arose, they barely did anything - but it was enough to allow for natural selection to act on variants of them, and to eventually hone their functions over time. So, there doesn’t seem to be a need for the new sequences to have a strong function, just a function (however weak) that can come under selection. 

Thanks for the good questions. Yes, hopefully Kirk will weigh in again. 


Kirk Durston - #74313

November 11th 2012

Dennis, my apologies for taking so long to comment. With regard to the possibility of shared ancestry between humans and other life, given the lack of details provided, I think there is a range of possible scenarios that are compatible with the idea that ‘God formed man of dust from the ground’.  That being said, on scientific grounds I can no longer seriously entertain a neo-Darwinian process to fully explain the disparity and diversity of life and the origin of man. I simply do not have sufficient information to take a position at this time. At best (from an evolutionary perspective) the diversification of life would have to be guided, but that is not even remotely close to the neo-Darwinian proposal.

I think Hazen’s M(Ex)/N probability (see post 73990) is very important. The ratio M(Ex)/N gives us a target size for that protein family … the total number of functional sequences compared to the total area of sequence space for a protein of that length. This will tell us how difficult it will be for an evolutionary process to ‘locate’ that area of functional sequence space (pre-determined by physics). It is also a way to objectively evaluate an hypothesis that a particular protein arose via a random walk.

Unfortunately, Hazen’s equation is not workable for functional proteins since M(Ex) is unknown. The method given in my earlier paper (1) provides a way to estimate one of the unknowns (functional information or FI). One can then use Hazen’s equation to solve for M(Ex)/N. Dennis is correct that my method cannot estimate FI if there is only a single instance of the sequence available. Typically, I like to have at least 500 unique sequences for the probability distribution of each amino acid at each site to begin to level out. A lone functional sequence still has FI; I just cannot estimate it with a sample size of only 1.
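
For reference, the relationship Hazen defines between functional information I(Ex) and this target-size ratio can be written as

    I(E_x) = -\log_2\!\left(\frac{M(E_x)}{N}\right), \qquad\text{so}\qquad \frac{M(E_x)}{N} = 2^{-I(E_x)},

which is why an estimate of FI for a protein family converts directly into an estimate of M(Ex)/N.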

M(Ex)/N can be interpreted as the probability of achieving a functional sequence in a single search, but the real probability is orders of magnitude smaller if a sequence is slowly and randomly assembled via a random walk until a function with a fitness advantage is actualized, followed by hill-climbing refinement. Assembled slowly, one mutation at a time, the probability is drastically reduced to 0.215^N × (N!/N^N).
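
To get a sense of scale, that expression can be evaluated in log space, since the value underflows ordinary floating point for sequences of this length. This is only an illustrative sketch; the constant 0.215 and the formula are simply taken from the expression above:

    import math

    def log10_slow_assembly_prob(n):
        # log10( 0.215^n * n!/n^n ), computed in log space to avoid underflow
        log10_factorial = math.lgamma(n + 1) / math.log(10)
        return n * math.log10(0.215) + log10_factorial - n * math.log10(n)

    print(log10_slow_assembly_prob(80))   # about -86.8, i.e. roughly 10^-87
    print(log10_slow_assembly_prob(100))  # about -108.8

Even granting site independence, the 80 aa case works out to roughly 10^-87.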

It gets worse. The method published in my earlier paper assumes that each site can be mutated independent of all other sites. We know that in reality, there is a high degree of inter-site dependence, which will further reduce the probability by many orders of magnitude. My latest paper presents a method to detect those interrelated sites (2). So the point to make is that having a protein suddenly pop into existence is the most probable scenario, not the least probable.

Now let us consider Prot-Hyp, a hypothetical sequence of 80 amino acids (aa) in a Chimpanzee that has a high degree of similarity to a 100 aa de novo gene in a human, as described by Wu. We have three possible explanations:

1. It is an example of a future gene that is currently non-functional and being assembled slowly by random mutations.

2. It is the remnant of a once-functional gene, still found in humans, that has been deactivated in chimps as a result of one or more deletional events in the past.

3. It is neither (1) nor (2) but already serves an important function that we have yet to discover.

Consider (1): Wouldn’t it be bizarre to observe a sequence of numbers slowly being randomly assembled that turns out to be the combination for the local bank vault? I think (1) is equally as bizarre for Prot-Hyp. The probability of getting the first 80 aa correct is .215^80(80!/80^80) ….. a ridiculously small probability, and that is assuming site independence.

Consider (2): We observe deletional events all the time. I would say that the probability of seeing such an event in one of 20,000 genes is pretty high, very close to 1.

Consider (3): What with ENCODE results, this would not surprise me in the least. I’d say the probability is pretty good especially in light of the fact that the sequence is preserved in both Chimps and Humans; it has to be doing something.

Looking at (1), (2) and (3), given that (1) is wildly improbable and that (2) and (3) seem quite probable, I would argue that the most rational position is to reject (1) and choose either (2) or (3) or to suspend judgment between those two pending further research.

References:

1. http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2217542/

2. http://bsb.eurasipjournals.com/content/pdf/1687-4153-2012-8.pdf 


Dennis Venema - #74612

November 21st 2012

Hi Kirk, sorry for the delay. My responses are below, but first I’ll quote the section I’m addressing:

—-

1. It is an example of a future gene that is currently non-functional and being assembled slowly by random mutations.

2. It is the remnant of a once-functional gene, still found in humans, that has been deactivated in chimps as a result of one or more deletional events in the past.

3. It is neither (1) nor (2) but already serves an important function that we have yet to discover.

Consider (1): Wouldn’t it be bizarre to observe a sequence of numbers slowly being randomly assembled that turns out to be the combination for the local bank vault? I think (1) is equally as bizarre for Prot-Hyp. The probability of getting the first 80 aa correct is .215^80(80!/80^80) ….. a ridiculously small probability, and that is assuming site independence.

Consider (2): We observe deletional events all the time. I would say that the probability of seeing such an event in one of 20,000 genes is pretty high, very close to 1.

Consider (3): What with ENCODE results, this would not surprise me in the least. I’d say the probability is pretty good especially in light of the fact that the sequence is preserved in both Chimps and Humans; it has to be doing something.

—-

The researchers carefully examined other primate genomes to ensure that the non-coding version of the sequence observed in chimps is indeed the ancestral state. This strongly argues against (2). Your appeal to an unknown function (3) is also a bit strained - we observe the function in humans as a coding sequence, yet the ancestral state is non-coding. Thus the function we now observe in humans and the presumed function in chimps cannot be the same function. Yes, it is formally possible (however unlikely) that the sequence has a non-coding function in chimps, but even granting that possibility does not detract from the fact that the present-day function in humans cannot be the same as that non-coding function. As for (1) - the process is not assembling a sequence for a defined target (e.g. your analogy to a bank vault). There is no coding function needed before the sequence becomes coding, and only if the sequence is useful when it becomes coding will it be preserved. Any genome of any size can be predicted to have these sorts of events happen. Primates have moderately sized genomes (around 3.2 billion nucleotides). If there is a non-zero mutation rate, new coding sequences will arise - there is nothing that can stop that, and it is possible to predict the frequency of such events. 
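
As a rough, back-of-the-envelope illustration of that last point (the per-site mutation rate and population size below are assumed, typical textbook-scale values, not figures from the paper):

    # Back-of-envelope: new mutations entering a primate population each generation.
    GENOME_SIZE = 3.2e9        # haploid genome size in nucleotides (as above)
    MUTATION_RATE = 1.2e-8     # assumed per-site, per-generation mutation rate
    POPULATION = 10_000        # assumed effective population size

    new_per_gamete = GENOME_SIZE * MUTATION_RATE           # ~38 new mutations per gamete
    new_per_generation = 2 * POPULATION * new_per_gamete   # across all diploid individuals

    print(round(new_per_gamete))          # ~38
    print(f"{new_per_generation:.1e}")    # ~7.7e+05 new mutations per generation

With that steady input of new variants, the appearance of occasional new coding sequences becomes a question of frequency rather than possibility.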


Kirk Durston - #74658

November 23rd 2012

Dennis, thank you for that response. This discussion is getting to the heart of how completely novel proteins (novel protein families) were formed. As your four-part series was not designed to address this issue specifically, I would like to see a Biologos write-up on how a novel gene family can be achieved through a testable evolutionary method. In the meantime, here are some of my thoughts with regard to your comments.

Ancestral sequences: I took another look at the Wu et al. paper to see what their method was for determining if a sequence was ancestral. I may have missed it, but it seems that they merely assumed that the shorter, non-coding sequences in the Chimpanzee and Orangutan were ancestral. They cannot actually be ancestral, since no one assumes we descended from either, so the only way they could be ancestral is if they were assembled in the last common ancestor. How probable is the appearance of at least 60 sequences in the last common species (or maybe genus) that will later turn out to be key components in 60 human genes? Science must be testable, so the hypothesis that 60 later-coding sequences can appear in non-coding regions within the last common ancestor needs to be tested against what we actually know about proteins, using the data available to us in Pfam and other similar databases. That is part of what I have looked at. The data reveals something about biological proteins that says it will not happen before the last star runs out of fuel. Those that say it can, cannot be looking at the actual protein data; it gives us a very good idea of M(Ex)/N that directly falsifies evolutionary scenarios. Something else must be going on.

Degenerate genes: The 60 ‘ancestral’ sequences either lacked a start codon (and had to be less than 80% the length of the human gene) or had a frame-shift-induced premature stop codon. We observe that frame-shift mutations and/or deletions can occur that could degenerate a gene in this way, so we already have an observable, verifiable method for that, which is highly probable. Whether that is the case for all 60 (or any of them) is still a question in my mind, but it is eminently preferable in the face of the wildly improbable alternative … the slow assembly of 60 sequences in the last common ancestor that just so happen to later code for 60 stable, folding, functional proteins.

Functional orthologous sequences: Wu appears to be assuming that the last common ancestor lived 5 to 6 million years ago (roughly 400,000 generations ago).  To have these 60 sequences conserved in both the Chimpanzee and Orangutan for 400,000 generations suggests that they may have a function, especially in light of recent findings that seem to indicate a net deletional bias over the generations. I agree with you that these sequences must have a different function from the protein coding orthologous sequences in humans, but that is no problem to my way of thinking as clarified in the next section.

Design: From a design perspective, I would want to have as many functions and 3-D protein structures coded for by the same sequence or very similar sequences within reach of the very limited capabilities of an evolutionary search. This is a beautiful way of compressing information. We already know that many protein sequences have more than one function and can have more than one structure with very small modification. With no significant change in information, multiple functions can be achieved, but one would have to be a super-intelligence to know the physics well enough to actually write the code for this sort of compression, as the M(Ex)/N ratio is incredibly small.


Eddie - #75108

December 9th 2012

Dennis and Kirk:

Thanks for this constructive exchange.  Meaty in substance, and polite in conduct.  The focus was the science, and bashing of the other person’s theology and accusations of extra-scientific motives were scrupulously avoided.  If only all ID-TE exchanges could be like this!

I hope the management at BioLogos will look at this exchange and realize that this is what the Christian public wants to see in ID-TE discussions.  Open, frank discussion of the issues, whether scientific or theological, without pulling any punches when it comes to data, facts or reasoning, but avoiding all hits below the belt and all tribal shibboleths.  I think the evangelical world deserves conversations like this from TE and ID leaders, which are unfortunately all too rare.

I have another suggestion.  Just as Bill Dembski was invited to write a substantive column here in the Southern Baptist series, other ID proponents could be invited to write substantive columns as well.  For example, since both Behe and Meyer have been repeatedly criticized here, they could be offered a substantial column here to respond to criticisms.  This would create the same impression of balance that was created by the Southern Baptist series.  Perhaps this idea could be implemented in the new year.

Congratulations to BioLogos for allowing Kirk to air his substantive disagreements here.  I hope we will hear from Kirk again.  And congratulations to Dennis for responding to Kirk in a non-defensive, constructive way.


HornSpiel - #75122

December 10th 2012

From a design perspective, I would want to have as many functions and 3-D protein structures coded for by the same sequence or very similar sequences within reach of the very limited capabilities of an evolutionary search.

Kirk,

I would love to know from your perspective what it is that ID is trying to achieve, both generally and specifically.

Obviously from the quote above you accept evolution to some extent. I would assume you accept common descent as the normal way that species develop. Yet at the same time you question the current models as sufficiently explanatory. Are your objections in this case intended  to be 1) generalizable to all speciation events, or 2) specific to the human speciation event?

Based on what we now know of neanderthal genetics, would you expect similar questions to arise at the common boundary of our humanoid species or sub-species ancestors?

Finally, how do you interpret your M(Ex)/N ratio findings? You say “I would like to see a Biologos write-up on how a novel gene family can be achieved through a testable evolutionary method.” It sounds like you are skeptical that they could. Do you think that your findings conclusively point to “a super-intelligence” that “wrote the code?” Or are you more cautious, simply stating that more research needs to be done?

Also, am I correct in my understanding that an M(Ex)/N ratio is a measure that allows one to infer if a design event has occurred or not? I wonder if you could present a layman’s explanation of that calculation.

Like Eddie, I appreciate your discussion here with Dennis.
