

The Fallacies of Intelligent Design Theory
The information content of a sequence is defined as the number of bits required to transmit that sequence using an optimal encoding. Suppose we roll a die with eight faces. Since eight is a power of two, the optimal code for a uniform probability distribution needs exactly log₂ 8 = 3 bits per throw. Information theory provides a formula (see below) for determining the number of bits required in an optimal code even when we do not know the code. Consider uniform probability distributions where the number of possible outcomes is not a power of two. With a fair six-sided die, the number of bits required to transmit one throw is log₂ 6 ≈ 2.58. Though a single throw cannot be transmitted in fewer than 3 bits, a long sequence of such throws can be transmitted at an average cost approaching 2.58 bits per throw. One way to transmit fewer than 3 bits per throw is to consider the throws three at a time. The number of possible three-throw sequences is 6³ = 216. Using 8 bits, we can encode a number between 0 and 255, so a three-throw sequence can be encoded in 8 bits with a little to spare, which works out to about 2.67 bits per throw.
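The arithmetic above can be checked with a short Python sketch. The function names (bits_per_outcome, pack_throws, unpack_throws) and the base-6 packing scheme are illustrative assumptions, not part of the original article; any one-to-one mapping of the 216 sequences onto the numbers 0 to 215 would serve equally well.

```python
import math


def bits_per_outcome(n):
    """Bits needed per outcome for a uniform distribution over n outcomes."""
    return math.log2(n)


def pack_throws(a, b, c):
    """Pack three throws (each 1..6) into one integer in 0..215 (fits in 8 bits)."""
    for t in (a, b, c):
        if not 1 <= t <= 6:
            raise ValueError("each throw must be between 1 and 6")
    # Treat the three throws as the digits of a base-6 number.
    return (a - 1) * 36 + (b - 1) * 6 + (c - 1)


def unpack_throws(code):
    """Recover the three throws from a packed value in 0..215."""
    if not 0 <= code < 216:
        raise ValueError("code must be between 0 and 215")
    a, rest = divmod(code, 36)
    b, c = divmod(rest, 6)
    return (a + 1, b + 1, c + 1)


print(bits_per_outcome(8))            # eight-sided die
print(round(bits_per_outcome(6), 2))  # six-sided die
print(pack_throws(6, 6, 6))           # largest packed value
print(8 / 3)                          # bits per throw when packing three throws in 8 bits
```

Packing longer runs of throws into a single number brings the average cost closer to log₂ 6 bits per throw, though never below it.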
In probability terms, each possible value of a six-sided die occurs with equal probability P = 1/6. Information theory tells us that the minimum average number of bits required to encode a throw of the die is -log₂ P = log₂ 6 ≈ 2.58. In the eight-sided die example above, every message had a length exactly equal to -log₂ P bits. For biased (non-uniform) probability distributions, let the variable x range over the values to be encoded and let P(x) denote the probability of that value occurring. The expected number of bits required to encode one value is the weighted average of the number of bits required to encode each possible value, where the weight is the probability of that value:
H = -Σ P(x) log₂ P(x), where the sum runs over all values x.
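The weighted average described above can be sketched in a few lines of Python. The function name shannon_entropy and the loaded-die probabilities are assumptions chosen for illustration; they do not come from the original article.

```python
import math


def shannon_entropy(probs):
    """Expected bits per value: the probability-weighted average of -log2 P(x)."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Values with zero probability contribute nothing to the average.
    return -sum(p * math.log2(p) for p in probs if p > 0)


fair_die = [1 / 6] * 6
# Hypothetical loaded die: one face comes up half the time.
loaded_die = [0.5, 0.1, 0.1, 0.1, 0.1, 0.1]

print(shannon_entropy(fair_die))    # equals log2(6) for the uniform case
print(shannon_entropy(loaded_die))  # smaller, because the outcome is more predictable
```

For a uniform distribution the weighted average reduces to -log₂ P, so the fair die reproduces the 2.58-bit figure, while any bias lowers the result.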
H, the number of bits needed to transmit a signal communicating a configuration, irrespective of the content of the message, is known as Shannon uncertainty. A more conventional definition of information, consistent with the vernacular use of the term, is R = H_before - H_after, the decrease in Shannon uncertainty under the action of some process. If R > 0, information has been gained, there has been an increase in order, and fewer bits are now needed to describe the system. Shannon uncertainty is a measure, applied in communication theory, of the randomness in a signal.
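As a small worked case of the decrease in uncertainty, suppose an eight-sided die is thrown and we then learn only that the result is one of four particular faces. This scenario, and the names used below, are assumptions made for illustration rather than an example from the article.

```python
import math


def shannon_entropy(probs):
    """Shannon uncertainty in bits for a list of probabilities."""
    return sum(p * math.log2(1 / p) for p in probs if p > 0)


# Before: eight equally likely faces.
h_before = shannon_entropy([1 / 8] * 8)
# After: the outcome is known to be one of four equally likely faces.
h_after = shannon_entropy([1 / 4] * 4)

# Information gained is the drop in uncertainty.
information_gained = h_before - h_after
print(h_before, h_after, information_gained)
```

Here the uncertainty falls from 3 bits to 2 bits, so 1 bit of information has been gained: one yes-or-no question about the outcome has been answered.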
One of the measures of information is a quantity called entropy, which quantitatively characterizes the level of disorder in a system, i.e., how much randomness is in a signal. An alternative way to look at this is to consider how much information is carried by the signal. The more information carried by a text, the larger the text's entropy. The total entropy of a text as a whole is proportional to the text's length. As an example, consider some text, encoded as a string of letters, spaces, and punctuation (so the signal is a string of characters). Since some characters are rare, e.g., z, while others are very common, e.g., e, the string of characters is not completely random. On the other hand, since we cannot predict with certainty what the next character will be, it does have some randomness. Entropy is a measure of this randomness. Except for units and a different base for logarithms, Shannon uncertainty and entropy are identical. Shannon referred to his quantity H as entropy, just expressed in bits rather than the joules per kelvin of conventional physics.
Consider a string of the same letter (like A) repeated over and over: AAAAAAA...etc. This meaningless text is perfectly ordered, and thus its entropy is essentially zero. Now consider an extremely long text (string) obtained by randomly drawing letters of the alphabet (and a space) from a container, writing down the letter drawn, and returning it to the container. This string will almost always be gibberish. There is little or no order in the random string, and the entropy of the meaningless information carried by that string is large. Meaningful texts are located somewhere in the middle of the entropy scale, with their entropy calculated to be about 1 bit per character.
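The two extremes can be illustrated with a rough Python sketch that estimates entropy per character from single-character frequencies alone. This is a simplifying assumption: an estimate of this kind ignores the dependence of each character on those before it, so applied to meaningful English text it would come out well above the roughly 1 bit per character quoted above, which takes that context into account. The function name char_entropy and the fixed random seed are illustrative choices.

```python
import math
import random
import string
from collections import Counter


def char_entropy(text):
    """Bits per character, estimated from single-character frequencies only."""
    counts = Counter(text)
    n = len(text)
    return sum((c / n) * math.log2(n / c) for c in counts.values())


# Perfectly ordered text: the same letter repeated.
repeated = "A" * 1000

# Random text: 26 letters plus a space, drawn with replacement.
rng = random.Random(0)  # fixed seed so the run is repeatable
alphabet = string.ascii_uppercase + " "
random_text = "".join(rng.choice(alphabet) for _ in range(100_000))

print(char_entropy(repeated))     # no uncertainty about the next character
print(char_entropy(random_text))  # close to the maximum for 27 symbols
print(math.log2(27))              # that maximum, for comparison
```

The repeated string scores zero, while the random draw sits close to the ceiling of log₂ 27 bits per character for a 27-symbol alphabet.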
Next we consider how information is transmitted by natural processes.
Generation of Information by Natural Processes
The following is from an article, "How to Evolve Specified Complexity by Natural Means," by Matt Young, Adjunct Professor of Physics, Colorado School of Mines (Young: 2002). The question of interest is, can natural processes create large quantities of information?
Consider a machine that tosses coins, five at a time. The coins are assumed to be fair and the machine not so precise that its tosses are predictable. Each coin may turn up heads or tails with 50% probability. There are 32 combinations:
HHHHH
HHHHT
HHHTH
HHHTT
and so forth.
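The full set of outcomes can be enumerated with a brief Python sketch. The variable names are illustrative assumptions; itertools.product is used here simply as a convenient way to list every combination.

```python
import itertools
import math

# Every way five coins can land, each coin showing H or T.
combinations = ["".join(faces) for faces in itertools.product("HT", repeat=5)]

for combo in combinations:
    print(combo)

# With fair coins every combination is equally likely.
probability_each = 1 / len(combinations)
# Bits needed to specify one particular combination.
bits_each = -math.log2(probability_each)
print(len(combinations), probability_each, bits_each)
```

Since the coins are fair, each of the 32 combinations has probability 1/32, and specifying any one of them takes 5 bits, one per coin.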