Dick Pountain/16 September 2002/12:15/Idealog 98
In my last column I took a swipe at the
misuse of the word Information in buzz phrases like 'The Information Society'.
I'd like to carry on that particular line of thought a little further, because
it can unravel a lot more woolly thinking about computers and computing. For
example, you might expect that I'd also object to the phrase 'Information
Technology', but not so. On the contrary, IT is a perfectly fine description of
what computers do because they do indeed manipulate information - and nothing
but information. The problem is that people believe (or wish) that computers
can manipulate much more.
Let's start from basics by remembering
precisely what the great Claude Shannon discovered about Information. Shannon
was an engineer and mathematician working at Bell Labs in New York during and
after World War II, in a department whose main aim was to find ways to make
telephone and radio communications more reliable. The theory and practice of
error-correcting codes was their hot topic. In a remarkable paper of 1948,
which ought to rank with those of Einstein and Alan Turing as seminal moments
of 20th century thought, Shannon put this whole endeavour onto a scientific
footing, with his insight that information can be treated as a mathematical
quantity like any other, measured in units of binary digits, for which he
adopted the word BIT, suggested by his colleague John Tukey (and where would we
be without it?). In effect, a bit is the
answer to a single yes/no question and the amount of information in a message
is its degree of uniqueness - what distinguishes it from all other possible messages
of the same length - which can be established by asking a sufficient number of
yes/no questions. Shannon discovered an elegant equation that relates
Information to entropy or disorder, one form of which is:
I = log₂ N
where I is the number of bits of
information in a message and N is the number of possible different messages.
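One way to read that equation (a sketch of the counting argument, not Shannon's own derivation): each yes/no answer can at best halve the field of candidate messages, so k questions separate at most 2^k equally likely messages, and singling out one message among N therefore needs the smallest k for which

2^k ≥ N, i.e. k ≈ log₂ N = I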
As an example, using just the letters of
the English alphabet you can construct 11,881,376 different five-letter
word-strings like 'world' or 'xfnxg', regardless of whether they mean anything.
Hence the information content of a five-letter word is log₂(11,881,376),
or close to 24 bits. You could measure this by asking of each letter in turn,
'Is it before M? Is it after G?' and so on. The ASCII representation of a
five-letter word is 40 bits long, which leaves plenty of scope for compression.
However, encoding a five-letter word using fewer than 24 bits must necessarily
throw away some information, which can never be recovered. Such losses are
often acceptable and useful - when you compress a digital picture (which is
just a long string of bits) using the lossy JPEG algorithm some information is
lost, but not so much as to make the picture unrecognizable. Shannon's
Information Theory can be applied to any field where information is encoded,
for example genetics where DNA sequences are considered as messages, or
particle physics with particle spins as messages.
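For anyone who wants to check the arithmetic, here is a minimal Python sketch of the five-letter word example above (nothing here is from Shannon's paper; it just reproduces the numbers in the text):

import math

ALPHABET = 26        # letters available for each position
WORD_LENGTH = 5      # 'world', 'xfnxg', ...

# Number of possible five-letter strings, meaningful or not
n_messages = ALPHABET ** WORD_LENGTH
print(n_messages)                # 11881376

# Shannon information content: I = log2(N)
bits = math.log2(n_messages)
print(round(bits, 1))            # 23.5, hence 'close to 24 bits'

# A plain ASCII encoding spends 8 bits per letter
ascii_bits = 8 * WORD_LENGTH
print(ascii_bits)                # 40 bits, so roughly 16 bits of slack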
A most important feature of Information
Theory, though, is that it says nothing at all about meaning. Indeed, meaning is
probably something that doesn't exist at all outside of the human mind. A
particular string of bits encodes a photograph of your mother; to me it's a
picture of a nameless woman, while to a bee it's a pattern of colours that is
not a flower. To a computer it just means that string of bits and no other.
This leads to all sorts of paradoxes that have yet to be resolved by either
philosophers or scientists. It would seem desirable to have some
human-oriented definition of Information in which a list of travel directions
written in Italian conveys more information to an Italian speaker than to a
non-Italian speaker, but no satisfactory one exists. What's more, these problems
of semantics, on further investigation, seem to be radically resistant to
computerization.
We can digitize and store all sorts of
representations of the real world - pictures, sounds, movements, even smells -
but extracting meaning from these representations remains beyond digital
computers. Some state-of-the-art security software can now pick faces out of a
crowd, so it might recognize your mother, but that doesn't go very far toward
capturing what your mother 'means'. The temptation among gung-ho artificial
intelligence enthusiasts would be to say, 'But that's only a matter of degree -
we need to store more information about your relationship with her'. That just
doesn't wash though. We each live slightly off the centre of a vast web of
relationships with other people and social institutions that constitutes our
'context' (and we also have an inner life of emotions, desires and phantasies
that connect to, distort and modify this web.)
The
complexity of such a context is so staggeringly huge that trying to isolate a
few thousand relevant variables and stuff them into an array will just not work.
It's been tried - 'semantic nets', 'frames' or whatever - and the traversal
algorithms instantly run into the combinatorial explosion of NP-complete
search. A few million years of
evolution has honed the human brain into a dedicated network processor for
attempting to traverse such contexts. It contains around 3 billion synapses,
which could in principle make something like 2^(3×10^9) (two to the power three billion) connections among themselves, or
rather more than there are particles in the universe, probably. It does a
reasonably good job, creating dozens of human civilizations, Shakespeare,
Beethoven, Posh 'n Becks, the Big Mac and the Pentium 4, but I think we'd all
agree that it still has some way to go before it can be signed off out of
beta...
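As a back-of-envelope check on the particles-in-the-universe claim above - taking the column's figure of 3 billion synapses at face value, and the usual ballpark of about 10^80 particles in the observable universe - a quick Python sketch comparing orders of magnitude:

import math

SYNAPSES = 3_000_000_000   # the column's figure, taken at face value

# 2^(3x10^9) is far too big to write out, so count its decimal digits:
# a number 2^n has about n * log10(2) of them
digits = SYNAPSES * math.log10(2)
print(f"2^(3e9) runs to about {digits:.2e} decimal digits")  # ~9.03e+08
print("10^80 has a mere 81 digits")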