The Obligate Scientist: Math, Computers and Intelligent Design Pseudoscience

Here's a recently published paper (PDF reprint) by intelligent design (creationism) proponent William Dembski. Strong criticisms of the paper (and how it's being misused by other intelligent design creationism proponents) have already popped up on blogs in some posts like these, and even in Dembski's own blog - which he promptly put a stop to by disabling comments.

Instead of focusing on the paper itself, I wanted to illustrate how it (and other mathematical or computational results) can be misused in promoting ID. Before I begin, feel free to read through the blog posts and skim the paper.

After that, we can start in on this blurb from the Discovery Institute as an example of this sort of misuse by asking "So does this paper support intelligent design creationism??"

The paper itself specifies (in the abstract) what they did, and how they applied it (here and below, I have used a bold font for emphasis):

This paper develops a methodology based on these information measures to gauge the effectiveness with which problem-specific information facilitates successful search. It then applies this methodology to various search tools widely used in evolutionary search.

I happen to know a thing or two about using mathematical models in science, and this paper is a fantastic example of what I consider mathematical equivocation - using the power and complexity of mathematics as a logical tool to try to back a claim that the math doesn't actually support. It's significant because, unlike typical rhetorical arguments, the math obscures the assumptions and logical arguments being made, and can at times require a graduate-level background to decipher - so the equivocation is a bit harder (if not impossible, for some) to actually notice.

While the paper says nothing about intelligent design creationism, others (including Dembski himself) claim that it applies. Let's start with the title of the Discovery Institute piece. First, they claim this is a pro-ID publication. Second, we have our first logical fallacy: the appeal to authority. It seems that the holy grail of ID creationist efforts is to have some science-cred to wave around, and a "peer-reviewed scientific article" is exactly that. So what about the pro-ID claim?

In this blog post we get the Discovery Institute's take on what the paper is really about:

A new article titled "Conservation of Information in Search: Measuring the Cost of Success," in the journal IEEE Transactions on Systems, Man and Cybernetics A: Systems & Humans by William A. Dembski and Robert J. Marks II uses computer simulations and information theory to challenge the ability of Darwinian processes to create new functional genetic information.

Understanding why the paper has absolutely nothing to do with real functional genetic information (despite this claim) requires knowing a bit about information theory (an unfamiliar topic to the vast majority of people, scientists included) and recognizing that the kind of "information" discussed in the paper is very different from genetic information in a biological context. Indeed, the word genetic only appears on the first page of the article when mentioning "genetic algorithms", and there's no mention of "functional genetic" anything in the paper.

Unfortunately, intelligent design creationists frequently misuse or improperly apply the concept of "information" (which can be defined in a number of ways). Demanding a clear definition is always a good way to keep on track with what's actually being said.

Here's the closest we come to definitions for interpreting information in this paper:

endogenous information, which measures the difficulty of finding a target using random search;
exogenous information, which measures the difficulty that remains in finding a target once a search takes advantage of problem specific information; and
active information, which, as the difference between endogenous and exogenous information, measures the contribution of problem-specific information for successfully finding a target.
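To make these three definitions concrete, here is a minimal sketch of how I read them. It assumes p is the probability that a blind random search hits the target and q is the probability that the informed search does, with each quantity measured in bits as a negative base-2 logarithm. The function names and the 16-slot toy problem are mine, purely for illustration, and are not taken from the paper.

```python
import math


def endogenous_information(p: float) -> float:
    """Difficulty of the problem for blind random search, in bits (assumed: -log2 p)."""
    return -math.log2(p)


def exogenous_information(q: float) -> float:
    """Difficulty that remains for the informed search, in bits (assumed: -log2 q)."""
    return -math.log2(q)


def active_information(p: float, q: float) -> float:
    """Endogenous minus exogenous information, i.e. log2(q / p)."""
    return endogenous_information(p) - exogenous_information(q)


# Hypothetical toy problem: one target hidden among 16 equally likely slots,
# and a single guess is allowed.
p = 1 / 16  # blind guess
q = 1 / 4   # guess made knowing the target sits in one particular group of 4 slots

print(endogenous_information(p))  # 4.0 bits
print(exogenous_information(q))   # 2.0 bits
print(active_information(p, q))   # 2.0 bits
```

Note that nothing in this bookkeeping refers to genes, proteins, or biological function. It is just a comparison of two success probabilities on a log scale.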

Getting back to the Discovery Institute's blog post: besides the equivocation, this blurb is also largely fueled by another common logical no-no found in many pro-ID arguments - the combined false dichotomy and straw man fallacies, whereby (in a debate context) you misrepresent your opponent's position as something you can refute, pretend you have refuted your opponent's true position, and then assert that their being wrong makes your position correct.

Here's the Discovery Institute spinning the article against evolution "unguided" by an intelligent designer:

Darwinian evolution is, at [it's] heart, a search algorithm that uses a trial and error process of random mutation and unguided natural selection to find genotypes (i.e. DNA sequences) that lead to phenotypes (i.e. biomolecules and body plans) that have high fitness (i.e. foster survival and reproduction). Dembski and Marks' article explains that unless you start off with some information indicating where peaks in a fitness landscape may lie, any search — including a Darwinian one — is on average no better than a random search.

Note that the implication (at least to me) is that evolution requires "some information" (e.g. an intelligent designer?) if it's going to work "better than a random search." Also note the false dichotomy at play here, as the blog post seems to imply that it's a "pro-ID" publication because it allegedly refutes evolution.

Just to clarify the straw man: evolution by natural selection is equated with evolutionary algorithms, which are then criticized, and it's all put forth as being "pro-ID".

So far, I don't see how any of this supports intelligent design creationism or refutes evolution by natural selection. Feel free to correct me if I'm wrong here!

So what's really the take home message from this paper?

Returning to the matter of equivocation, MarkCC's blog post helps to clarify the meaning of "information" in the paper, although only implicitly:

In terms of information theory, you can look at a search algorithm and how it's shaped for its search space, and describe how much information is contained in the search algorithm about the structure of the space it's going to search.
What D&M do in this paper is work out a formalism for doing that - for quantifying the amount of information encoded in a search algorithm, and then show how it applies to a series of different kinds of search algorithms.

In terms of evolution, the paper is really just asking how much of the information about the space (i.e. the fitness landscape) is tied up in the algorithm (i.e. natural selection). Given that natural selection is all about the relative number of offspring contributed to the next generation, these results really don't seem surprising or problematic.

The conclusions of the paper seem to be awkwardly stated and kind of amusing:

CONCLUSION
Endogenous information represents the inherent difficulty of a search problem in relation to a random-search baseline. If any search algorithm is to perform better than random search, active information must be resident. If the active information is inaccurate (negative), the search can perform worse than random...

Hmm... So in order for evolution by natural selection (as an algorithm) to work better than random, it needs to include correct information from genetics, developmental biology, ecology, and so on? Got it. If we get it wrong, it'll perform worse than some other random hypotheses? Got it.
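The "better than random" versus "worse than random" language in that conclusion comes down to the sign of a logarithm, which a toy calculation makes plain. The sketch below is my own illustration, not anything from the paper: it assumes one target among 64 slots, 8 guesses without repeats, and a hypothetical "hint" that confines the search to 16 slots and is correct with probability hint_accuracy. All function and parameter names are made up for this example.

```python
import math


def active_information(p: float, q: float) -> float:
    """log2(q / p): positive if the informed search beats blind search."""
    return math.log2(q / p)


def blind_success(n_slots: int, n_queries: int) -> float:
    """Chance that n_queries distinct blind guesses hit the single target."""
    return n_queries / n_slots


def hinted_success(subset_size: int, n_queries: int, hint_accuracy: float) -> float:
    """Chance of success when every guess is spent inside a hinted subset
    that actually contains the target with probability hint_accuracy."""
    return hint_accuracy * min(1.0, n_queries / subset_size)


p = blind_success(n_slots=64, n_queries=8)  # 0.125

for hint_accuracy in (1.0, 0.25, 0.0625):
    q = hinted_success(subset_size=16, n_queries=8, hint_accuracy=hint_accuracy)
    print(hint_accuracy, active_information(p, q))
# 1.0 2.0       (accurate hint: 2 bits better than blind search)
# 0.25 0.0      (hint no better than chance: same as blind search)
# 0.0625 -2.0   (misleading hint: 2 bits worse than blind search)
```

A good hint helps, a useless hint does nothing, and a bad hint hurts. That is all the conclusion is saying, and it says it about search procedures in general rather than about biology.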

This section also has some either very ironic or very well chosen wording...

... Accordingly, attempts to characterize evolutionary algorithms as creators of novel information are inappropriate. To have integrity, search algorithms, particularly computer simulations of evolutionary search, should explicitly state as follows: 1) a numerical measure of the difficulty of the problem to be solved, i.e., the endogenous information, and 2) a numerical measure of the amount of problem-specific information resident in the search algorithm, i.e., the active information.

Nice - "attempts to characterize" are inappropriate because we're not using your particular definition of information? Gee, now THAT sounds familiar.

To be honest, I have no real expertise in algorithms, but from what I could dig up I don't think this use of "integrity" means anything out of the ordinary. I'll run this by some computer science friends of mine and any insights will appear below.

Until then, my best response to the question "So does this paper support intelligent design creationism??" is, decidedly, no.