Introduction

"The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point."

"The semantic aspects are irrelevant to the engineering problem."

What is significant is that the actual message is selected from a *set of possible messages*, and the system must be ready to select any member of that set.

If the set of possible messages is finite, then this number, or any monotonic function of it, can serve as a "measure of the information produced when one message is chosen from the set", all choices being equally probable. By "measuring information" we mean measuring the *amount* of information.
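As a quick numeric sketch of this measure: with M equally probable messages, choosing one conveys log₂M bits, the number of yes/no questions needed to pin the choice down.

```python
import math

# With M equally probable messages, selecting one of them produces
# log2(M) bits of information.
M = 16
info_bits = math.log2(M)
print(info_bits)  # 4.0 -- four yes/no questions identify one of 16 messages
```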

The most natural choice is the logarithmic function, for three reasons: it is practically more useful (engineering parameters such as time, bandwidth or number of relays tend to vary linearly with the logarithm of the number of possibilities); it is nearer to our intuitive feeling about a proper measure (two identical storage devices should hold twice the information of one); and it is mathematically more suitable (many limiting operations are simpler in terms of the logarithm).

We can choose any base for the logarithm, and this choice determines the unit for measuring information. For example, if we use base 2, the unit is called a *bit*, a contraction of *binary digit* suggested by J. W. Tukey.

A device with two stable states, like a relay or a flip-flop circuit, can store one bit of information. N such devices can store N bits: the total number of possible states is 2ᴺ, and the amount of stored information is log₂2ᴺ = N·log₂2 = N, i.e. N bits.
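A minimal check of this count:

```python
import math

# N two-state devices (relays, flip-flops) can be in 2**N distinct
# states, so together they store log2(2**N) = N bits.
N = 8
states = 2 ** N
stored_bits = math.log2(states)
print(states, stored_bits)  # 256 8.0
```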

If we use base 10, the unit of information is called a decimal digit. Bases 2 and 10 are related by

log₂M = log₁₀M / log₁₀2 ≃ 3.32·log₁₀M ,

so a decimal digit carries about 3.32 times the information of a binary digit (bit). Your intuition might have suggested a factor of 5, since a decimal digit distinguishes 10 symbols and a bit only 2, but information grows with the *logarithm* of the number of symbols, not with the number itself. So every time you see a calculator or a number display, remember that each digit there has a storage capacity of one decimal digit, about 3.32 bits.
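A quick sketch of the 3.32 factor and the change-of-base identity above:

```python
import math

# Capacity of one decimal digit in bits: log2(10).
decimal_digit_bits = math.log2(10)
print(decimal_digit_bits)  # ≈ 3.3219

# Change of base: log2(M) = log10(M) / log10(2).
M = 1000
assert math.isclose(math.log2(M), math.log10(M) / math.log10(2))

# A 10-digit display therefore holds about 33.2 bits.
print(10 * decimal_digit_bits)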

Natural logarithms, those with base e, are useful when integration and differentiation are required. The resulting units will be called natural units.

In general, to change from base b to base a we need to multiply by logₐb.
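This general rule can be wrapped in a tiny helper; `change_units` below is a hypothetical name for illustration, not anything from the paper:

```python
import math

def change_units(amount, a, b):
    """An amount of information measured in base-b units, re-expressed
    in base-a units: multiply by log_a(b)."""
    return amount * math.log(b, a)  # math.log(x, base) == log_base(x)

# 1 decimal digit expressed in bits:
bits_per_decimal_digit = change_units(1, 2, 10)
print(bits_per_decimal_digit)  # ≈ 3.3219

# 1 bit expressed in natural units (nats):
nats_per_bit = change_units(1, math.e, 2)
print(nats_per_bit)  # ≈ 0.6931
```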

A *communication system* follows this schematic diagram:

 info source     transmitter                     receiver      destination
 +++++++++++    ++++++++++++                    +++++++++++    ++++++++++++
 |         |    |          |                    |         |    |          |
 |         |    |          |       +---+        |         |    |          |
 |         |--->|          |------>|   |------->|         |--->|          |
 |         |mes |          |signal +-∧-+received|         |mes |          |
 |         |sage|          |         |    signal|         |sage|          |
 +++++++++++    ++++++++++++     +---|--+       +++++++++++    ++++++++++++
                                 |      |
                                 +------+
                                   noise
                                  source

The main parts are:

1) An information source. It produces a message or a sequence of messages. The message may be, for example, a sequence of letters (telegraphy), a single function of time f(t) (radio, telephony), a function of time and other variables (black-and-white television), or several such functions (colour television, multichannel sound).

2) A transmitter. It operates on the message to produce a signal suitable for transmission over the channel. In telephony it changes sound pressure into a proportional electric current. It may also sample, compress, quantise, encode, modulate or otherwise process the signal.

3) The channel is the medium used to transmit the signal from transmitter to receiver: a wire, a coaxial cable... Here the author says "a beam of light, a band of radio frequencies...", but I wonder whether in that case the channel is really "empty space", since that is the medium the electromagnetic waves travel through. To me, the beam of light would be the signal, not the channel, unless we consider the channel to be both the wave and the medium it travels through, and the signal only the information being carried.

4) The receiver usually performs the inverse operations of those performed by the transmitter. It reconstructs the message from the signal.

5) Destination. It is the person or thing for whom the message is intended.
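Some of the transmitter operations mentioned above (sampling, quantising) can be sketched in a few lines. The function below is a hypothetical illustration of uniform quantisation, not anything from the paper:

```python
import math

def quantise(samples, bits, lo=-1.0, hi=1.0):
    """Map each sample in [lo, hi] to one of 2**bits level indices
    (a toy version of the quantising step in a transmitter)."""
    levels = 2 ** bits
    step = (hi - lo) / levels
    out = []
    for x in samples:
        # clamp to the range, then map to a level index in 0..levels-1
        x = min(max(x, lo), hi - 1e-12)
        out.append(int((x - lo) / step))
    return out

# One period of a sine wave, sampled at 8 points and quantised to 3 bits.
samples = [math.sin(2 * math.pi * t / 8) for t in range(8)]
codes = quantise(samples, bits=3)
print(codes)  # [4, 6, 7, 6, 4, 1, 0, 1]
```

Each continuous sample becomes one of 8 discrete symbols, which is exactly the sense in which a transmitter can turn a continuous message into a discrete signal.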

In order to consider general problems involving communication systems, we need to idealise these elements as pure mathematical entities.

We classify communication systems into three main categories: discrete, continuous and mixed. In a discrete system, both the message and the signal are sequences of discrete symbols. Telegraphy, where both message and signal are sequences of dots, dashes and spaces, is an example of a discrete system. A continuous system deals with a continuous message and signal, as in radio or television. A mixed system involves both discrete and continuous variables.

We first focus on discrete systems. They are extremely relevant, since computing machines exchange discrete sequences all the time. Moreover, the discrete case provides the foundation for the continuous and mixed cases.