Behe’s Bad Arithmetic and Worse Science

I would like to celebrate the official release date of Michael Behe’s new book, The Edge of Evolution, by pointing out a grotesque factual error in its very heart.

A major theme, if not the major theme, of Behe’s diatribe is the disease known as malaria. He wants to prove that evolution by natural means of any but the most trivial features of living things is completely impossible, and he uses the parasites which cause malaria as his test case. Unfortunately for him, but happily for the decades of scientific discovery which he tries to condemn, the mistakes in his argument tear it asunder.

Since Behe’s argument is built upon malaria, and in particular the evolution of the parasite Plasmodium falciparum to be resistant to the drug chloroquine, we should look at what modern biology has learned about how chloroquine resistance evolved. As chloroquine was “the best and most affordable antimalarial drug” in the history of antimalarial drugs, the evolution of resistance to it — and the spread of that resistance throughout the world — poses a real problem, which scientists are naturally eager to investigate.

Digging down into the literature brings us Xin-zhuan Su et al.’s 1997 paper in Cell, which reports on chloroquine-resistant P. falciparum. Su et al. note that chloroquine resistance appears to have arisen independently in South America and Southeast Asia. Historical evidence indicates this, and the genetic study backs it up:

CQR [ChloroQuine-Resistant] P. falciparum parasites spread steadily from two foci that originated 40 years ago in South America and Southeast Asia after the massive use of chloroquine for nearly a decade. The African continent was spared for a time, until CQR parasites entered East Africa in the 1970s and subsequently swept across the continent.

Later research, by Mehlotra et al. (2001), identified another focus in Papua New Guinea, and pointed out that the Papua New Guinea strain was different enough from the Southeast Asian ones to have been another, independent development of resistance. In fact, South America also offered two distinct foci, one in Venezuela and the other in Colombia.

Today, CQR malaria is present in nearly all malarious regions except certain areas of the Middle East, Central America, and the Caribbean. This steady and inexorable march of chloroquine resistance from two foci is in contrast to the expansion of pyrimethamine-resistant P. falciparum strains, which contain simple point mutations in dihydrofolate reductase–thymidylate synthase that have been selected many times ([42]). The rare events of chloroquine resistance therefore suggest that the genesis of CQR P. falciparum was complex, requiring a special combination of multiple mutations.

In other words, chloroquine resistance is a trickier problem than evolving resistance to other antimalarial drugs. A lucky parasite can, by sheer accident, find that one “letter” in its DNA sequence has been altered, waking up to realize it is now immune to pyrimethamine. Of course, the details of the story are pretty complicated: the long half-life that the sulfadoxine-pyrimethamine drug combination has in the body plays a role in how resistant microbes can prosper, and overmedication, poor absorption, improper prescriptions and other factors have all been implicated in SP resistance. For one thing, all these complications mean that it would be hard to figure out how likely the DNA is to mutate by looking at the population of infected people. The number of “resistance events” which occur can be much larger than the number of events researchers can measure: some resistant parasites are destroyed by the host’s immune system, some are knocked out by other drugs, and others just might not become prevalent enough in the population to be noticed. (Moreover, we can expect these factors to vary in severity among different diseases and different environments.) Working back from the percentage of resistant parasites in the population to the chance of DNA mutating is not an easy problem.

Su et al. suggest that the trick which brings about chloroquine resistance is a mutation in a stretch of DNA they call cg2, converting it to an allele known as Dd2. (The cg2 sequence is about 36,000 bases long, enough to fill about fourteen sheets of single-spaced paper.) But upon examining the parasites sensitive to chloroquine, they find that the cg2 block exists in a variety of different forms. “CQS” (S for sensitive) parasites from Africa and Asia exhibit many different features in their cg2 sequences, and most of these different kinds of cg2 contain features also found in the CQR version. They don’t contain all of the CQR gene’s distinctive characters, but by “recombining” selections from the different CQS alleles, one can make a CQR version of cg2. This is exactly how Su et al. say that chloroquine resistance could well have come about:

The particular combination of polymorphisms in the Dd2 cg2 gene may therefore have come together by recombination to form an exact structure necessary for chloroquine resistance.

What we’re seeing here is not the sudden corruption of two “letters” in a DNA “sentence,” but rather the gradual accumulation of mutations, eventually gathering together to confer chloroquine resistance upon the lucky winner.

Later, attention shifted from cg2 to a nearby gene on the same chromosome, called pfcrt. However, the same story of accumulating mutations plays itself out. Talisuna et al. (2004) provide a summary:

Current evidence from transfection studies (71, 187) strongly suggests that the mechanism of P. falciparum resistance to CQ is linked to mutations in the pfcrt gene, especially the substitution of threonine for lysine at position 76. However, other mutations in the pfcrt gene at positions 72 to 78, 97, 220, 271, 326, 356, and 371, as well as mutations in other genes such as pfmdr1, might be involved in the modulation of resistance (173, 223). CQ resistance seems to involve a progressive accumulation of mutations in the pfcrt gene, and the mutation at position 76 seems to be the last in the long process leading to CQ clinical failure (53, 92).

Hastings, Bray and Ward (2002), writing in the journal Science, go into a little more detail. They note that sequential accumulation of mutations is the best explanation for chloroquine resistance:

The first mutations spread because they confer increased tolerance to CQ on parasites, enabling them to infect humans sooner after drug treatment — for example, mutation 4 allows parasites to infect people 6 days after treatment rather than 7 days. The relatively rapid elimination of CQ means that these are rather weak selective forces (6) and that the spread of these first mutations will be slow. Eventually, mutation 8 arises, which allows the parasite to survive therapeutic levels of CQ. Once above this threshold, the selective advantage conferred by this mutation becomes enormous and the pfcrt haplotype (now containing several sequentially acquired mutations) spreads rapidly across geographic regions where CQ is in common use. This appears to have occurred four times for CQ resistance: twice in South America, once in southeast Asia, and once in Papua New Guinea (see the viewpoint by Wellems on page 124) (10).

Whew! Got all that?

Now comes the time to ask what all this has to do with Michael Behe and his book, The Edge of Evolution.

PZ Myers sets the stage:

He invents a new metric, the CCC, or “chloroquine complexity cluster”. This is the probability of evolving a fairly simple trait in the malaria parasite, resistance to a compound called chloroquine. Malaria that is resistant to chloroquine has two specific changes to a protein pump called PfCRT; one amino acid at position 76 and another at position 220 are changed from the more common form. By a couple of arguments, from the probability of getting two independent changes in the sequence and the observed frequency of evolution of chloroquinone resistance in the population of infected people, he comes up with a number: the odds of acquiring this specific pair of mutations is one in 10^20.

Uh-oh.

First, we already know that CQR doesn’t appear via “independent changes,” but that the evidence strongly suggests it can arise through successive steps, each one conferring a little benefit until the final version can blow the competition away. Second, Chen et al. (2005) showed that the mutation at position 220, changing alanine to serine, is not necessary for chloroquine resistance. It’s common in CQR parasites around the world, but some P. falciparum caught in the Philippines show no mutation at position 220, and other mutations at positions 144 or 160 instead. This alone triples the probability of CQR arising by “independent changes.”

Of course, Behe is also already heading into the “there be dragons here” territory of logical fallacies. Suppose, for a moment, that his figure of one success in 10^20 mutations is entirely accurate; that is, let’s grant him that these specific changes at positions 76 and 220 in the PfCRT protein happen only once in 10^20 tries.

Why should that figure apply to anything other than chloroquine resistance in malaria? What possible reason could there be for this one number to constrain the entire flourishing diversity of the Earth’s biosphere? As Mark Chu-Carroll has already pointed out, point mutations (the accidental modification of single locations in a gene sequence, one spot at a time) do not even cover all the different ways that mutations can happen. On top of that, we know that this particular change, the mutation of positions 76 and 220, is not the only way chloroquine resistance can appear.

Let’s look at Behe’s first argument for the 10^20 figure. A little arithmetic shows it to be invalid! Saurabh, a graduate student in molecular biology, did a back-of-the-envelope calculation with some reasonable figures for how likely individual locations in a gene are to mutate (the sort of numbers molecular biologists and biochemists have memorized) and found that Behe’s probability was too low, far too low. Steve Reuland, a post-doc at the University of Colorado Health Sciences Center, soon arrived at the same conclusion (independently, as far as I can tell). A conservative ballpark estimate of the mutation rate would be about 10^-9 per base pair per generation; that is, in one out of every 10^9 cellular reproductive cycles, a chosen base pair will mutate. Because Behe is interested in two independent point mutations, the probability of seeing both is 1 in 10^18, already two orders of magnitude higher than his figure. Next, note that a single infected person can be host to 10^11 or 10^12 parasites, meaning that on average, we’d see a CQR parasite in one out of every million to ten million people. Clearly, something interesting is going on here. A rough estimate might differ from the carefully measured value by a factor of two or five, but an error this big is just too much.
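For anyone who wants to check the back-of-the-envelope arithmetic, here is a minimal Python sketch. It assumes exactly the ballpark inputs quoted above (a mutation rate of 10^-9 per base pair per replication, two independent point mutations, and 10^11 to 10^12 parasites per infected person); the function and constant names are made up for illustration, and none of this is a careful population-genetics model.

```python
from fractions import Fraction

# Assumed ballpark inputs, taken from the figures quoted in the post.
MUTATION_RATE = Fraction(1, 10**9)     # per base pair per replication
BEHE_ODDS = Fraction(1, 10**20)        # Behe's claimed odds for the pair
PARASITES_PER_HOST = (10**11, 10**12)  # low and high estimates


def double_mutation_probability(rate):
    """Chance that two specific, independent point mutations both occur."""
    return rate * rate


def hosts_per_double_mutant(rate, parasites_per_host):
    """Average number of infected people per double-mutant parasite."""
    return 1 / (double_mutation_probability(rate) * parasites_per_host)


p = double_mutation_probability(MUTATION_RATE)
print(f"two independent mutations: 1 in {float(1 / p):.0e}")
print(f"ratio to Behe's figure: {p / BEHE_ODDS}")
for n in PARASITES_PER_HOST:
    hosts = hosts_per_double_mutant(MUTATION_RATE, n)
    print(f"{n:.0e} parasites per host: one double mutant per {float(hosts):.0e} hosts")
```

Exact fractions are used rather than floats so that the orders of magnitude come out without rounding noise.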

Where, then, did Behe get his number? This is the story he tells:

…resistance to chloroquine has appeared fewer than ten times in the whole world in the past half century. Nicholas White of Mahidol University in Thailand points out that if you multiply the number of parasites in a person who is very ill with malaria times the number of people who get malaria per year times the number of years since the introduction of chloroquine, then you can estimate that the odds of a parasite developing resistance is roughly one in a hundred billion billion.

It took less than thirty seconds with Google Scholar to find Nicholas White’s paper, “Antimalarial drug resistance” (2004), in the Journal of Clinical Investigation. The relevant sentence is as follows:

Resistance to chloroquine in P. falciparum has arisen spontaneously less than ten times in the past fifty years (14). This suggests that the per-parasite probability of developing resistance de novo is on the order of 1 in 10^20 parasite multiplications.

But footnote 14 doesn’t actually support that claim! It points — surprise, surprise — back to Su et al. (1997). What that paper says is that CQR has arisen independently in at least two — or, as we now know, four — places. That’s not the same thing as the DNA sequence mutating two or four times in fifty years to give the CQR sequence.

First, as we discussed earlier, the number of “resistance events” seen by researchers is almost certain to underestimate the chance that the DNA will mutate to produce resistance. Second, CQR arises from a sequence of mutations, each conferring a greater benefit than the last, with the change at position 220 one of at least three possibilities for the last step. Third, there’s no reason to assume a priori that CQR is the model for all mutations. Why should the probability of seeing the last step in this particular chain of genetic changes have anything to do with the probability of other mutations, many of which don’t even involve the pointwise substitution of one amino acid for another? This is a good time to recall Mark Chu-Carroll’s observation that Behe completely ignores gene duplication, frameshifting and all other types of genetic change besides point mutation.
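The first of these points can be made concrete with a small sketch. Suppose researchers only ever notice some fraction of the resistance events that actually occur. The detected fractions below are purely hypothetical placeholders (nobody knows the real value), and `adjusted_odds` is just an illustrative name. The only figure taken from the text is the 1 in 10^20 estimate.

```python
from fractions import Fraction

# The per-parasite estimate quoted from White (2004), based on observed events.
OBSERVED_ODDS = Fraction(1, 10**20)


def adjusted_odds(observed_odds, detected_fraction):
    """Per-parasite probability if only `detected_fraction` of all
    resistance events were ever seen by researchers."""
    if not 0 < detected_fraction <= 1:
        raise ValueError("detected_fraction must be in (0, 1]")
    # observed events = actual events * detected_fraction, so divide it out.
    return observed_odds / detected_fraction


# Hypothetical detected fractions, chosen only to show the scaling.
for fraction in (Fraction(1), Fraction(1, 100), Fraction(1, 10**4)):
    odds = adjusted_odds(OBSERVED_ODDS, fraction)
    print(f"detected fraction {float(fraction):g}: 1 in {float(1 / odds):.0e}")
```

Whatever the true fraction is, the correction only ever pushes the probability up from the observed figure, never down.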

Behe assumes, or rather declares by fiat, that chloroquine resistance arises by two point mutations, and he takes a bogus measurement of how likely that process is to occur and applies it to the entire history of evolution. As Nick Matzke notes, it’s his “central measuring stick throughout the book”.

Frankly, this still dazzles me. Behe ignores the facts which are known about a subject, takes a guesstimate of one number, interprets it as something else, plugs that into a bad model which covers only a fraction of the ways that mutations can happen, and then trumpets the “discovery” that evolution is impossible. (And he gets paid for it, too.) The mixture of gall and negligence, the sheer brazen quality of this ignorance, is a wonder to behold.

(My hat is off to Nick Matzke for this one. He promises a write-up by Steve Reuland on the Panda’s Thumb website today or tomorrow.)

UPDATE (7 June 2007): Welcome, fellow Pharyngulans! If you’re absolutely starved for reading material, I have a few other things you might like, including this bit of happy fun-time optimism, my rant about “quantum mind” nonsense and of course LOLCreationists.

14 thoughts on “Behe’s Bad Arithmetic and Worse Science”

  1. It’s even worse than that. In the original Fidock et al. paper (2000, Molec. Cell) identifying mutations in the pfcrt gene that correlate with CQR, they identify a sensitive isolate from the Sudan that differs from nearly all the resistant isolates from Africa and Southeast Asia by only one change (at position 76). This sensitive isolate already has a mutation at position 220, suggesting that the various CQR isolates from that part of the world arose by single point mutation at position 76 from the sensitive Sudanese strain.

    This was supported by their finding that imposing selection for drug resistance on this sensitive strain resulted in CQR forms, which, you guessed it, are now mutant at position 76. The upshot is that it’s possible to go from a sensitive strain found in the wild to a resistant strain, also found in the wild, by a single point mutation.

  2. Hang on. Behe multiplies “the number of parasites in a person who is very ill with malaria times the number of people who get malaria per year times the number of years since the introduction of chloroquine” and divides by ten to get his figure of 10^20, right? And then he argues that 10^20 is such a ridiculously large number that it could never have happened and therefore evolution is bunk?

    What am I missing here? There must be something, because it seems to me that Behe’s argument is self-destroying; if you take his odds of the mutation occurring, then multiply it by “the number of parasites in a person who is very ill with malaria times the number of people who get malaria per year times the number of years since the introduction of chloroquine” then you’ll find that we would have expected around about ten mutations to have occurred in that time…

    Seriously, somebody explain this to me, my head hurts and there’s no way Behe could be that stupid, surely?

  3. This is not a new problem for Behe. See, e.g., http://www.proteinscience.org/cgi/reprint/14/9/2217 . From the abstract: “A recent paper [co-authored by Behe] in this journal has challenged the idea that complex adaptive features of proteins can be explained by known molecular, genetic, and evolutionary mechanisms. It is shown here that the conclusions of this prior work are an artifact of unwarranted biological assumptions, inappropriate mathematical modeling, and faulty logic.”

    Sound familiar?

  4. Manigen: I’m not the best person to respond to this, but I’ve seen similar comments elsewhere which have not been replied to, so here goes. From what I understand (from PZM and other reviewers, I have not read and do not expect to read Behe), Behe accepts that CQR malaria did evolve naturally, and cites a probability of that event based on an estimate that it occurred roughly ten times in fifty years within a population of a certain size. He then applies that probability to try to rule out “macroevolution”.

    At first glance, the reviewers such as our host who note that four, rather than ten, events are known in the literature, seem to be helping his case, by making the empirical probability even smaller. However, they go on to explain that nobody knows how many times the event has actually occurred, and that theoretical considerations indicate it is much more likely than Behe estimates.

  5. Your estimated frequency of mutation into a CQR variant seems too low in terms of infected humans. If the specified two-base-pair mutation occurs once in 10**18 cell reproductive cycles, the proper divisor (to find infected humans before the mutation is expected) includes the number of times that P. falciparum multiplies inside each host. I would expect this to be hundreds or thousands, bringing the numbers down by several more orders of magnitude. (Not that I’m a biologist of any sort by training — perhaps I am wrong about factoring in intra-host reproductive cycles.)

  6. To all, many thanks!

    Michael Poole:

    I am also not a biologist (notice how the tagline of this blag says “math and physics” — and don’t expect me to be infallible even there!). However, I think this part includes your point:

    Next, note that a single infected person can be host to 10^11 or 10^12 parasites, meaning that on average, we’d see a CQR parasite in one out of every million to ten million people.

    I believe the 10^12 figure includes the reproductions; that is, it tallies up all the parasites living over time, not just those alive at a particular instant. I may, however, be mistaken (IANAB), in which case, as you say, the actual number has an additional factor bringing up the net probability.

    Again, many thanks to everybody!

  7. Two notes:

    1) That Behe “Protein Science” article was a huge embarrassment to the journal, and was later positively destroyed by Michael Lynch of Indiana University in the same journal (I guess as a sort of mea culpa). The article is available for free here:
    http://www.proteinscience.org/cgi/content/full/14/9/2217

    2) To Michael Poole, as Blake suggests, what’s important to consider is not the number of generations (which is probably no more than several dozens in a typical malaria infection, since Plasmodium doubles every 24 hours), but the total number of division events, since every single one can potentially result in Behe’s given two-substitutions. This can be approximated by roughly counting the number of parasites present at peak infection.

  8. Jeez, you scientists… Your problem is that just because something occurred at least 10 times in the past, you just a priori assume that means that it’s not impossible. You guys just need to be open-minded and consider that maybe those things that happened really can’t happen.

    (I AM KIDDING!)

  9. Reverse-engineering mathematics? Is that valid in any way, shape or form?

    “This happened, so let’s figure out the odds of it happening…”

    I have a doomsday book (I’m a collector) that describes the odds of former Soviet Premier Gorbachev NOT being the Anti-Christ as 36,000,000,000 to 1.

    With this in mind, Behe’s work looks awfully familiar.

  10. Behe is ignorant of the history of malaria, as well. He writes:

    The fierce malarial parasite — the same evolutionary dynamo that shrugs off humanity’s drugs — has an Achilles’ heel: It won’t develop in its mosquito host unless temperatures are at the very least balmy, so it’s restricted mainly to the tropics.

    Malaria used to be endemic in Northern Europe. It wasn’t restricted to the tropics until after the advent of antimalarial drugs.

    If the parasite could develop at lower temperatures it could spread more widely. But despite tens of thousands of years and a huge population size, much larger than that of Antarctic fish, it has not done so. Why can fish evolve ways to live at subfreezing temperatures while malaria can’t manage to live even at merely cool temperatures?

    Another evolutionary mystery: why haven’t those Antarctic fish found out a way to parasitize the human circulatory system?

  11. windy:

    As the kids these days say, “LOL!”

    Indeed, the Italian word mal’aria (“bad air”) was introduced into English to describe a disease which afflicted Rome, a “horrid thing called mal’aria, that comes to Rome every summer” (Horace Walpole, 1740). Rome isn’t exactly sub-Saharan Africa.
