A Mathematical Theory of Communication
The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point.
The move that started everything
Shannon's first act is an act of refusal. He sets meaning aside.
Frequently the messages have meaning [...]. These semantic aspects of communication are irrelevant to the engineering problem.
By declining to ask what a message means, Shannon makes the quantity of information measurable. A message is a selection from a set of possible messages. The more uncertainty resolved by a selection, the more information it carries.
Entropy
For a source emitting symbols with probabilities p_i, the information
is the entropy:
H = - Σ p_i log₂ p_i (bits per symbol)
Entropy is maximized when outcomes are equiprobable and collapses to zero when one outcome is certain. It is, exactly, the average surprise of the source, and the floor below which no lossless code can compress.
The channel and its capacity
Every real channel adds noise. Shannon's deepest result, the noisy channel coding theorem, says something that sounded impossible in 1948: reliable communication over a noisy channel is achievable up to a finite rate, the channel capacity, with an error probability as small as you like, provided you encode over long enough blocks.
| Concept | What it bounds |
|---|---|
| Entropy H | Best possible lossless compression |
| Capacity C | Best possible reliable transmission rate |
Below capacity: arbitrarily good. Above capacity: arbitrarily bad. The boundary is sharp.
Why it belongs in every archive
Almost everything digital descends from this paper, modems, compression, cryptography, error-correcting codes in every disk and deep-space probe. Shannon did not improve communication. He drew its outer wall and proved where it stood.
A working note tracing the idea of attention from early seq2seq work through the Transformer, connecting to information-theoretic roots.
- Claude E. Shannonbuilder
Founded information theory. Invented the bit. Proved that reliable communication over a noisy channel is always achievable up to a finite rate.