While working at Bell Labs, Claude Shannon formally marked the launch of information theory with his 1948 paper “A Mathematical Theory of Communication”, republished in 1949 as the book “The Mathematical Theory of Communication”. It came at a time when biochemistry was blazing trails into genetics, so it was natural that information theory played a huge role in forming the discourse around the genetic code.

Information is a conceptual entity of the universe that is gaining status with scientists in many areas. Physicists and computer scientists are understandably enamored with the idea that information is a physical reality. Information as a universal entity is a mostly mathematical construct, but information in our universe really does seem to have a degree of independence from its temporal physical embodiment. In this way it appears to possess features like mass or energy. Just as potential energy is easily translated into kinetic energy, information can be translated from one form to another. We practice this principle when we read and write, or speak to each other. In this case information is encoded for travel in vehicles that we call languages, and languages are at the heart of information translation of all sorts. Languages are codes of communication.

In a general sense, information travels in any system that has a finite number of discrete choices, and it is quantified by a metric known as a bit. This is a bit confusing (pun intended) because the word ‘bit’ is also used to describe a binary digit. Binary digits are merely symbols; they are symbolic vessels that carry different amounts of information. The actual amount of information carried by one binary digit will depend on a number of things. Consider a thermometer. It is an instrument or vessel used to measure temperature, and binary digits are like the markings on the thermometer. The value announced by the markings on a thermometer stands for a quantity of heat, but there is another measure to be applied to the information provided by the instrument. The value of knowing the measurement and its precision is different from the markings of a scale. Heat is heat, but we can have various amounts of information about it. For instance, just knowing whether something is hot or cold, when both are equally likely, provides one bit of information. Knowing the heat in a system with four equally likely temperatures provides two bits of information.
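The thermometer arithmetic can be checked with a short sketch. It assumes that every possible reading is equally likely, in which case learning the reading is worth log2(N) bits for N possible readings. The function name `bits_for_choices` is hypothetical and used only for illustration.

```python
import math

def bits_for_choices(n_choices: int) -> float:
    """Bits gained by learning one of n_choices equally likely outcomes (assumed uniform)."""
    if n_choices < 1:
        raise ValueError("need at least one possible choice")
    return math.log2(n_choices)

hot_or_cold = bits_for_choices(2)        # hot vs. cold: 1 bit
four_temperatures = bits_for_choices(4)  # four possible temperatures: 2 bits
print(hot_or_cold, four_temperatures)
```

If the readings are not equally likely, this simple count overstates the information, which is where the probability-weighted measure discussed later comes in.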

It is true that a random binary digit contains one bit of information, but it is not true that a binary digit and one bit of information are synonymous. Anyone familiar with digital compression will recognize this difference. For instance, a digital image stored in one million bits of computer memory might easily yield to compression, and the information required to visually describe the exact same image in two different systems might be compressed into, say, one thousand bits of data. This means that each bit in the original image contains only about 1/1000 of a bit of visual information. Of course, realizing this compression windfall requires knowledge of an encoding schema. It also requires that we find patterns in the data. Most significantly, it requires awareness of the system making and displaying the various potential forms of information contained by the original bits.
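The gap between stored binary digits and bits of information can be seen with any general-purpose compressor. The sketch below is only an illustration under stated assumptions: the "image" is a hypothetical one-million-bit buffer filled with a simple repeating pattern, and Python's standard `zlib` module stands in for whatever encoding schema a real imaging system would use. The compressed size is an upper bound on the information content, not a measurement of it.

```python
import zlib

# Hypothetical "image": one million bits (125,000 bytes) of a repeating pattern.
raw = bytes([0, 255]) * 62_500
packed = zlib.compress(raw, 9)  # 9 = highest compression level

raw_bits = len(raw) * 8
packed_bits = len(packed) * 8

# Rough upper bound on the information carried by each stored binary digit.
info_per_stored_bit = packed_bits / raw_bits

# The compression is lossless: the original buffer is fully recoverable.
assert zlib.decompress(packed) == raw
print(raw_bits, packed_bits, info_per_stored_bit)
```

A buffer of truly random bytes would barely shrink at all, which is the sense in which a random binary digit carries a full bit of information.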

There is a big catch. There might be additional dimensions of information beyond just visual data contained in a file storing a digital image. This then requires creative exercises in defining the quantities, forms and dimensions in various information systems. For instance, a secret text message, “attack at dawn!” might be cleverly hidden in the hypothetical digital image just described. In this case, there are two dimensions of information contained in the image, and one of them is in danger of being missed. The actual information content of each bit might decrease significantly if we overlook informative dimensions and use a careless compression scheme. Our ignorance of the hidden text message within the image data very well might cause us to unknowingly throw out valuable information when we compress the original file.

There are several useful techniques we can use to identify the content and form that a quantity of information might take. It is bittersweet that the information identified by many of these techniques has been given the name ‘entropy’ (H), because, of course, entropy has another meaning in physics as well. The two are very similar statistical concepts, and perhaps at a profound level they represent the same concept. My intuition is that they do, but there is a real danger of confusing the entropy of information with the entropy of thermodynamics. Nonetheless, entropy is the name we shall use here, and it will be a useful concept in our examination of genetic information systems. To limit the potential for confusion, I will use the term ‘entropy’ here only in the context of information entropy.

In broad strokes, information entropy is a measure of the value of knowledge. Specifically, it is the value of knowing the precise choice made by a system given a discrete number of choices. For instance, the entropy of knowing the outcome of an honest coin toss is one bit, because an honest coin toss is the epitome of one random bit of information. The coin might land heads or tails with equal probability. Knowledge of the actual outcome is worth one bit of information.
6 bits of information.

However, the uncertainty of an honest coin toss is at a maximum. Conversely, a two-headed coin lands heads every time, so the uncertainty and therefore the entropy of any number of these absolutely rigged coin tosses are reduced to zero. We know the results without even tossing the coin, so the value of knowing them is nil.

Zero bits of information from a 2-headed coin.

Similarly, if the coin is rigged somehow with a probability of 75% heads, 25% tails, then the entropy of knowing outcomes from this coin is calculated to be 0.811 bits per toss. This curious value is derived from the following formula provided us by Shannon, where P(x) stands for the probability that x will occur.

$$H = -\sum_{x} P(x) \log_2 P(x)$$
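The three coin examples can be reproduced with a minimal sketch of Shannon's formula, H = -Σ P(x) log2 P(x). The function name `shannon_entropy` is my own, and outcomes with zero probability are treated as contributing nothing, which is the usual convention.

```python
import math

def shannon_entropy(probabilities):
    """Shannon entropy in bits: H = -sum(p * log2(p)) over outcomes with p > 0."""
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(-p * math.log2(p) for p in probabilities if p > 0)

fair_coin = shannon_entropy([0.5, 0.5])     # honest coin
two_headed = shannon_entropy([1.0, 0.0])    # always heads
rigged = shannon_entropy([0.75, 0.25])      # 75% heads, 25% tails

print(f"{fair_coin:.3f} {two_headed:.3f} {rigged:.3f}")  # prints "1.000 0.000 0.811"
```

For the rigged coin the sum works out as 0.75 × 0.415 + 0.25 × 2 ≈ 0.811 bits per toss.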

Therefore, entropy is embedded in the concept of uncertainty. As previously described, it is no accident that this formula resembles the formula for thermodynamic entropy. Information entropy is the probability-weighted sum of the uncertainty attached to each potential state of a finite state system. As uncertainty changes, or the number of potential states changes, so too changes the entropy of the system. The challenge of measuring this in any system lies in our ability to identify the number of potential states and their probabilities. This is generally how we shall approach our efforts to quantify genetic information, and in order to do this we will rely on some combinatorial properties of discrete mathematics. The following definitions are quoted from the website of Wolfram Research.

“Discrete Mathematics - The branch of mathematics dealing with objects which can assume only certain ‘discrete’ values. Discrete objects can be characterized by integers, whereas continuous objects require real numbers. The study of how discrete objects combine with one another and the probabilities of various outcomes is known as combinatorics.”

“Combinatorics - The branch of mathematics studying the enumeration, combination, and permutation of sets of elements and the mathematical relations which characterize these properties.”

Just like the four-temperature thermometer, or two coin tosses, there are two bits of information per nucleotide in a perfectly random sequence of nucleotides, which can be thought of as a genetic information channel or signal.
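As a quick check on the nucleotide figure, the sketch below assumes four equally likely nucleotides, each chosen independently of its neighbors. Under that assumption one nucleotide carries two bits and a triplet of three nucleotides carries six. The constant name `NUCLEOTIDES` is mine, for illustration only.

```python
import math

NUCLEOTIDES = "ACGU"  # the four RNA nucleotides, assumed equally likely

bits_per_nucleotide = math.log2(len(NUCLEOTIDES))  # log2(4) = 2 bits
codon_count = len(NUCLEOTIDES) ** 3                # 4 * 4 * 4 = 64 triplets
bits_per_codon = math.log2(codon_count)            # log2(64) = 6 bits

print(bits_per_nucleotide, codon_count, bits_per_codon)
```

Any bias in nucleotide frequencies, or any dependence between neighboring positions, would lower these values.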

If we follow the information through translation a little further we find a curious thing: the information content appears to go up and then down.

The metric is in codon units, or 'triplet equivalents' (TE). The surprise for most people is that information content goes up through the tRNA phase of translation. This is because of wobble, ironically, since it introduces new choices at the third nucleotide position. There are 160 anti-codons, whereas there are only 64 codons. Of course, there are typically only 20 amino acids, so the information content appears to fall, but does it really?
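The rise and fall can be sketched numerically. The code below uses only the counts given in the text (64 codons, 160 anti-codons, 20 amino acids) and makes two assumptions that are mine rather than the source's: that every choice at each stage is equally likely, and that one 'triplet equivalent' equals the information in one random codon (6 bits), so that TE is simply bits divided by 6. The function name `triplet_equivalents` is hypothetical.

```python
import math

BITS_PER_CODON = math.log2(64)  # assumed definition: 1 TE = 6 bits

def triplet_equivalents(n_choices: int) -> float:
    """Information in n equally likely choices, expressed in assumed codon units (TE)."""
    return math.log2(n_choices) / BITS_PER_CODON

# Counts as stated in the text; uniform probabilities are an assumption.
stages = [("codons", 64), ("anti-codons", 160), ("amino acids", 20)]

for label, count in stages:
    print(f"{label}: {math.log2(count):.2f} bits, {triplet_equivalents(count):.2f} TE")
# codons: 6.00 bits, 1.00 TE
# anti-codons: 7.32 bits, 1.22 TE
# amino acids: 4.32 bits, 0.72 TE
```

These are ceilings under the uniform assumption, not measured values.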

The actual information content at each of these stages depends on the actual number of each molecular type present during translation, and the probability of each being used. We have yet to get a good handle on tRNA in this regard. In fact, there might be hundreds of tRNA molecules with slight variations in a given cell. How might these tRNA variations affect downstream information?

The key to genetic information at the amino acid stage is not only what is being used but also how it's being used. Sure, leucine is always leucine, but are there different ways to use it in translation? What about a tRNA that puts leucine into a peptide chain rapidly vs. slowly? This distinction, as with the thermometer example above, delivers one bit of information to translation. More importantly, the genetic code is now working in a second dimension. This is what is known as an additional degree of freedom. In sciencespeak we are talking about 'translation kinetics.'

Variation in translation kinetics is an absolutely proven reality: 'codon usage' impacts translation outcomes, and the consequences of this are not trivial. This is the mechanism that theoretically drives conformation changes in protein folding. The important thing to note is that translation has been experimentally proven to operate with more than one degree of freedom. In other words, knowing the amino acid sequence is now not enough to allow us to determine the outcome of protein folding! We must have additional dimensions of information; therefore, it cannot be a one-dimensional code.

Beyond kinetics, what other degrees of freedom might there be? The next most obvious candidate is fidelity. Not every tRNA will be as reliable as the others. As we have just seen, when probabilities change, information changes. So, if two tRNAs deliver leucine at translation, but one does it more reliably than another, then the two deliver different amounts of information.
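One way to make the fidelity point concrete is to treat each delivery as a two-outcome event, correct or incorrect, and compute its entropy. This is only an illustrative model of my own, and the reliability figures of 99% and 90% are hypothetical numbers chosen for the example, not measurements of any real tRNA.

```python
import math

def binary_entropy(p_correct: float) -> float:
    """Entropy in bits of a two-outcome event (correct vs. incorrect delivery)."""
    if not 0.0 <= p_correct <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    outcomes = (p_correct, 1.0 - p_correct)
    return sum(-p * math.log2(p) for p in outcomes if p > 0)

# Hypothetical reliabilities for two tRNAs that both deliver leucine.
reliable_trna = binary_entropy(0.99)
sloppy_trna = binary_entropy(0.90)

print(f"{reliable_trna:.3f} bits vs. {sloppy_trna:.3f} bits of uncertainty per delivery")
```

Under this toy model the less reliable tRNA leaves more uncertainty about what actually ends up in the peptide chain, so the two are not informationally equivalent even though both nominally deliver leucine.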

The most interesting prospect for how one leucine residue might differ from another during translation is related to spatial orientation. Since tRNAs vary in size by up to 40%, it is not unreasonable to expect that they behave differently in a spatial sense at the point where a peptide bond is actually made. If the spatial differences in tRNA impact the nature of the bond that is made, then spatial information is being delivered during translation. The simplest but most significant case would be in making a distinction between the formation of a cis-peptide bond versus a trans-peptide bond. Beyond that it is not obvious how many choices might be made with respect to bond angle rotations. This absolutely is a plausible hypothesis relating to the mechanisms of genetic translation, and there is no empirical evidence that it doesn't work this way. In fact, common sense predicts that it does, and the observed results of translation support common sense. What is the most plausible structure for efficiently handling this spatial information?

I am told that the technology is not yet at a level where peptide bonds can be measured at the point of translation, but that time is coming. When it is finally proven that all bonds are not created equal, there will be a renewed interest in the genetic code. The downstream effects are real, and their study is the next great frontier.

You heard it here first.

Material on this Website is copyright Rafiki, Inc. 2003 ©

Last updated
September 11, 2003 7:25 AM