Or think of the problem of recognizing a particular person's face.
By long in-group convention, the hypothetical face we are talking about is assumed to belong to the grandmother of the distinguished neurobiologist J.
Lettvin, but substitute any face you know, or indeed any object you can recognize.
We are not concerned here with subjective consciousness, with the philosophically hard problem of what it means to be aware of your grandmother's face.
Just a cell in the brain which fires if and only if the grandmother's face appears on the retina will do nicely for a start, and it is very difficult to arrange.
It would be easy if we could assume that the face would always fall exactly on a particular part of the retina.
There could be a keyhole arrangement, with a grandmother-shaped region of cells on the retina wired up to a grandmother-signalling cell in the brain.
Other cells - members of the 'anti-keyhole' - would have to be wired up in inhibitory fashion, otherwise the central grandmother-signalling cell would respond to a white sheet just as strongly as to the grandmother's face which - together with all other conceivable images - it would necessarily 'contain'.
The essence of responding to a key image is to avoid responding to everything else.
The keyhole strategy is ruled out by sheer force of numbers.
Even if Lettvin needed to recognize nothing but his grandmother, how could he cope when her image falls on a different part of the retina? How cope with her image's changing size and shape as she approaches or recedes, as she turns sideways, or cants to the rear, as she smiles or as she frowns? If we add up all possible combinations of keyholes and anti-keyholes, the number enters the astronomical range. When you realize that Lettvin can recognize not only his grandmother's face but hundreds of other faces, the other bits of his grandmother and of other people, all the letters of the alphabet, all the thousands of objects to which a normal person can instantly give a name, in all possible orientations and
apparent sizes, the explosion of triggering cells gets rapidly out of hand. The American psychologist Fred Attneave, who had come up with the same general idea as Barlow, dramatized the point by the following calculation: if there were just one brain cell to cope, keyhole fashion, with each image that we can distinguish in all its presentations, the volume of the brain would have to be measured in cubic light years.
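Attneave's point is easy to dramatize with a rough calculation of one's own. The Python sketch below uses entirely made-up but conservative numbers - a tiny 30 by 30 binary patch of 'retina' and a nominal neuron volume - rather than any figure from Attneave himself, and still lands in absurd territory:

```python
import math

# Back-of-envelope sketch of the keyhole explosion. All numbers are
# illustrative assumptions, not figures from Attneave or the text.
retina_cells = 30 * 30                         # a tiny hypothetical binary "retina" patch
log10_images = retina_cells * math.log10(2)    # distinguishable patterns = 2**900

neuron_volume_m3 = 1e-15                       # assume a neuron roughly (10 micrometres)**3
cubic_light_year_m3 = (9.46e15) ** 3           # one light year is about 9.46e15 metres

# One keyhole cell per distinguishable pattern: total volume, in cubic light years.
log10_brain_m3 = log10_images + math.log10(neuron_volume_m3)
log10_brain_ly3 = log10_brain_m3 - math.log10(cubic_light_year_m3)

print(f"one cell per pattern needs roughly 10^{log10_brain_ly3:.0f} cubic light years")
```

Even this toy retina, far cruder than a real one, demands a 'brain' some two hundred orders of magnitude larger than a cubic light year.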
How then, with a brain capacity measured only in hundreds of cubic centimetres, do we do it? The answer was proposed in the 1950s by Barlow and Attneave independently. They suggested that nervous systems exploit the massive redundancy in all sensory information. Redundancy is jargon from the world of information theory, originally developed by engineers concerned with the economics of telephone line capacity. Information, in the technical sense, is surprise value, measured as the inverse of expected probability. Redundancy is the opposite of information, a measure of unsurprisingness, of old-hatitude. Redundant messages or parts of messages are not informative because the receiver, in some sense, already knows what is coming. Newspapers do not carry headlines saying, 'The sun rose this morning'. That would convey almost zero information. But if a morning came when the sun did not rise, headline writers would, if any survived, make much of it. The information content would be high, measured as the surprise value of the message. Much of spoken and written language is redundant - hence the possibility of condensing it into telegraphese: redundancy lost, information preserved.
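The technical sense of 'information as surprise value' fits in one line: an event of probability p carries -log2(p) bits. A minimal sketch, with the probabilities chosen purely for illustration:

```python
import math

def information_bits(p):
    """Shannon's surprise value: the information, in bits, of an event of probability p."""
    return -math.log2(p)

# 'The sun rose this morning' is almost certain, so it carries almost no information...
print(information_bits(0.999999))   # ~0.0000014 bits
# ...whereas a morning on which it failed to rise would be front-page news.
print(information_bits(0.000001))   # ~19.9 bits
```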
Everything that we know about the world outside our skulls comes to us via nerve cells whose impulses chatter like machine guns. What passes along a nerve cell is a volleying of 'spikes', impulses whose voltage is
fixed (or at least irrelevant) but whose rate of arriving varies meaningfully. Now let's think about coding principles. How would you translate information from the outside world, say, the sound of an oboe or the temperature of a bath, into a pulse code? A first thought is a simple rate code: the hotter the bath, the faster the machine gun should fire. The brain, in other words, would have a thermometer calibrated in pulse rates. Actually, this is not a good code because it is uneconomical with pulses. By exploiting redundancy, it is possible to devise codes that convey the same information at a cost of fewer pulses. Temperatures in the world mostly stay the same for long periods at a time. To signal 'It is hot, it is hot, it is still hot. . . ' by a continuously high rate of machine-gun pulses is wasteful; it is better to say, 'It has suddenly become hot' (now you can assume that it will stay the same until further notice).
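A toy comparison makes the economy concrete. This is not a model of real neurons, just the two coding schemes run over an invented temperature record that, like most temperatures, stays constant for long stretches:

```python
# Hypothetical bath-temperature readings, degrees C: long constant stretches.
temps = [20] * 5 + [40] * 20 + [20] * 10

# Rate code: pulses in proportion to the current temperature, at every time step.
rate_pulses = sum(temps)

# Change code: pulses only when the temperature actually changes.
change_pulses = sum(abs(t1 - t0) for t0, t1 in zip(temps, temps[1:]))

print(rate_pulses, change_pulses)   # 1100 versus 40: the change code is far cheaper
```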
And, satisfyingly, this is what nerve cells mostly do, not just for signalling temperature but for signalling almost everything about the world. Most nerve cells are biased to signal changes in the world. If a trumpet plays a long sustained note, a typical nerve cell telling the brain
about it would show the following pattern of impulses: Before the trumpet starts, low firing rate; immediately after the trumpet starts, high firing rate; as the trumpet carries on sustaining its note, the firing rate dies away to an infrequent mutter; at the moment when the trumpet stops, high firing rate, dying away to a resting mutter again. Or there might be one class of nerve cells that fire only at the onset of sounds and a different class of cells that fire only when sounds go off. Similar exploitation of redundancy - screening out of the sameness in the world - goes on in cells that tell the brain about changes in light, changes in temperature, changes in pressure. Everything about the world is signalled as change, and this is a major economy.
But you and I don't seem to hear the trumpet die away. To us the trumpet seems to carry on at the same volume and then to stop abruptly. Yes, of course. That's what you'd expect because the coding system is ingenious. It doesn't throw away information, it only throws away redundancy. The brain is told only about changes, and it is then in a position to reconstruct the rest. Barlow doesn't put it like this, but we could say that the brain constructs a virtual sound, using the messages supplied by the nerves coming from the ears. The reconstructed virtual sound is complete and unabridged, even though the messages
themselves are economically stripped down to information about changes. The system works because the state of the world at a given time is
usually not greatly different from the preceding second. Only if the world changed capriciously, randomly and frequently, would it be economical for sense organs to signal continuously the state of the world. As it is, sense organs are set up to signal, economically, the discontinuities in the world, and the brain, assuming correctly that the world doesn't change capriciously and at random, uses the information to construct an
internal virtual reality in which the continuity is restored.
The world presents an equivalent kind of redundancy in space, and the nervous system uses the corresponding trick. Sense organs tell the brain about edges and the brain fills in the boring bits between. Suppose you are looking at a black rectangle on a white background. The whole scene is projected on to your retina - you can think of the retina as a screen covered with a dense carpet of tiny photocells, the rods and cones. In theory, each photocell could report to the brain the exact state of the light falling upon it. But the scene we are looking at is massively redundant. Cells registering black are overwhelmingly likely to be surrounded by other cells registering black. Cells registering white are nearly all surrounded by other white-signalling cells. The important exceptions are cells on edges. Those on the white side of an edge signal white themselves and so do their neighbours that sit further into the white area. But their neighbours on the other side are in the black area. The brain can theoretically reconstruct the whole scene if just the retinal
cells on edges fire. If this could be achieved there would be massive savings in nerve impulses. Once again, redundancy is removed and only information gets through.
Elegantly, the economy is achieved in practice by the mechanism known as 'lateral inhibition'. Here's a simplified version of the principle, using our analogy of the screen of photocells. Each photocell sends one long wire to the central computer (brain) and also short wires to its immediate neighbours in the photocell screen. The short connections to the neighbours inhibit them, that is, turn down their firing rate. It is easy to see that maximal firing will come only from cells that lie along edges, for they are inhibited from one side only. Lateral inhibition of this kind is common among the low-level units of both vertebrate and invertebrate eyes.
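A toy, one-dimensional version of the principle can be written in a few lines. The inhibition weight and the scene are arbitrary; the point is only that cells deep inside a uniform patch are inhibited from both sides and fall silent, while cells at an edge are inhibited from one side only and keep firing:

```python
def lateral_inhibition(cells, weight=0.5):
    """Each cell's output is its own signal minus `weight` times each neighbour's signal."""
    n = len(cells)
    out = []
    for i, c in enumerate(cells):
        inhibition = 0.0
        if i > 0:
            inhibition += weight * cells[i - 1]
        if i < n - 1:
            inhibition += weight * cells[i + 1]
        out.append(c - inhibition)
    return out

# A white field (1s) next to a black field (0s): a single edge in the middle.
scene = [1] * 5 + [0] * 5
print(lateral_inhibition(scene))
# Interior cells cancel to 0; only the cells either side of the edge
# (and, inevitably, the ends of the array) report anything at all.
```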
Once again, we could say that the brain constructs a virtual world which is more complete than the picture relayed to it by the senses. The information which the senses supply to the brain is mostly information about edges. But the model in the brain is able to reconstruct the bits between the edges. As in the case of discontinuities in time, an economy is achieved by the elimination - and later reconstruction in the brain - of redundancy. This economy is possible only because uniform patches exist in the world. If the shades and colours in the world were randomly dotted about, no economical remodeling would be possible.
Another kind of redundancy stems from the fact that many lines in the real world are straight, or curved in smooth and therefore predictable (or mathematically reconstructable), ways. If the ends of a line are specified, the middle can be filled in using a simple rule that the brain already 'knows'. Among the nerve cells that have been discovered in the brains of mammals are the so-called 'line-detectors', neurones that fire whenever a straight line, aligned in a particular direction, falls on a particular place in the retina, the so-called 'retinal field' of the brain cell. Each of these line-detector cells has its own preferred direction. In the cat brain, there are only two preferred directions, horizontal and vertical, with an approximately equal number favouring each direction; however, in monkeys other angles are accommodated. From the point of view of the redundancy argument, what is going on here is as follows. In the retina, all the cells along a straight line fire and most of these impulses are redundant. The nervous system economizes by using a single cell to register the line, labelled with its angle. Straight lines are economically specified by their position and direction alone, or by their ends, not by the light value of every point along their length. The brain reweaves a virtual line in which the points along the line are reconstructed.
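The idea of a labelled line-detector can be caricatured as template matching. In the sketch below the 'cells' are simply 3 by 3 templates, one per orientation, and a cell's response is the dot product between its template and the patch of retina it watches; the templates and the patch are invented for illustration, not drawn from real receptive fields:

```python
# Toy "line-detector" cells: one template per preferred orientation.
templates = {
    "horizontal": [[0, 0, 0],
                   [1, 1, 1],
                   [0, 0, 0]],
    "vertical":   [[0, 1, 0],
                   [0, 1, 0],
                   [0, 1, 0]],
}

def response(template, patch):
    """Dot product of template and retinal patch: the cell's firing strength."""
    return sum(t * p for trow, prow in zip(template, patch) for t, p in zip(trow, prow))

patch = [[0, 1, 0],
         [0, 1, 0],
         [0, 1, 0]]          # a short vertical line falling on this bit of retina

best = max(templates, key=lambda name: response(templates[name], patch))
print(best)                  # -> 'vertical': one labelled cell stands in for the whole line
```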
However, if a part of a scene suddenly detaches itself from the rest and starts to crawl over the background, it is news and should be signalled. Biologists have indeed discovered nerve cells that are silent until something moves against a still background. These cells don't respond when the entire scene moves - that would correspond to the sort of apparent movement the animal would see when it itself moves. But movement of a small object against a still background is information-rich and there are nerve cells tuned to detect it. The most famous of these are the so-called 'bug-detectors' discovered in frogs by Lettvin (he of the grandmother) and his colleagues. A bug-detector is a cell which is apparently blind to everything except the movement of small objects against their background. As soon as an insect moves in the field covered by a bug-detector, the cell immediately initiates massive signalling and the frog's tongue is likely to shoot out to catch the insect. To a sufficiently sophisticated nervous system, though, even the movement of a bug is redundant if it is movement in a straight line. Once you've been told that a bug is moving steadily in a northerly direction, you can assume that it will continue to move in this direction until further notice. Carrying the logic a step further, we should expect to find higher-order movement detector cells in the brain that are especially sensitive to change in movement, say, change in direction or change in speed. Lettvin and his colleagues found a cell that seems to do this, again in the frog. In their paper in Sensory Communication (1961) they describe a particular experiment as follows:
Let us begin with an empty gray hemisphere for the visual field. There is usually no response of the cell to turning on and off the illumination. It is silent. We bring in a small dark object, say 1 to 2 degrees in diameter, and at a certain point in its travel, almost anywhere in the field, the cell suddenly 'notices' it. Thereafter, wherever that object is moved it is tracked by the cell. Every time it moves, with even the faintest jerk, there is a burst of impulses that dies down to a mutter that continues as long as the object is visible. If the object is kept moving, the bursts signal discontinuities in the movement, such as the turning of corners, reversals, and so forth, and these bursts occur against a continuous background mutter that tells us the object is visible to the cell. . .
To summarize, it is as if the nervous system is tuned at successive hierarchical levels to respond strongly to the unexpected, weakly or not at all to the expected. What happens at successively higher levels is that the definition of that which is expected becomes progressively more sophisticated. At the lowest level, every spot of light is news. At the next level up, only edges are 'news'. At a higher level still, since so many edges are straight, only the ends of edges are news. Higher again, only movement is news. Then only changes in rate or direction of movement. In Barlow's terms, derived from the theory of codes, we could say that the
nervous system uses short, economical words for messages that occur frequently and are expected; long, expensive words for messages that occur rarely and are not expected. It is a bit like language, in which (the generalization is called Zipf's Law) the shortest words in the dictionary are the ones most often used in speech. To push the idea to an extreme, most of the time the brain does not need to be told anything because what is going on is the norm. The message would be redundant. The brain is protected from redundancy by a hierarchy of filters, each filter tuned to remove expected features of a certain kind.
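The coding principle itself is standard. Any variable-length code built from message frequencies will behave as Barlow describes; a Huffman code is the textbook construction (my choice of example here, not Barlow's), and the frequencies below are invented simply to stand for 'mostly nothing happens, occasionally there is news':

```python
import heapq

def huffman_code(freqs):
    """Build a prefix code in which frequent messages get short codewords."""
    # Heap entries: (total frequency, tie-breaker, {message: codeword-so-far})
    heap = [(f, i, {msg: ""}) for i, (msg, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        merged = {m: "0" + w for m, w in c1.items()}
        merged.update({m: "1" + w for m, w in c2.items()})
        heapq.heappush(heap, (f1 + f2, counter, merged))
        counter += 1
    return heap[0][2]

# Hypothetical message frequencies: the expected vastly outnumbers the surprising.
freqs = {"nothing changed": 90, "edge": 6, "movement": 3, "change of direction": 1}
print(huffman_code(freqs))
# The commonest message gets a one-bit codeword; the rare news gets the long ones.
```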
It follows that the set of nervous filters constitutes a kind of summary description of the norm, of the statistical properties of the world in which the animal lives. It is the nervous equivalent of our insight of the previous chapter: that the genes of a species come to constitute a statistical description of the worlds in which its ancestors were naturally selected. Now we see that the sensory coding units with which the brain confronts the environment also constitute a statistical description of that environment. They are tuned to discount the common and emphasize the rare. Our hypothetical zoologist of the future should therefore be able, by inspecting the nervous system of an unknown animal and measuring the statistical biases in its tuning, to reconstruct the statistical properties of the world in which the animal lived, to read off what is common and what rare in the animal's world.
The inference would be indirect, in the same way as for the case of the genes. We would not be reading the animal's world as a direct description. Rather, we'd infer things about the animal's world by inspecting the glossary of abbreviations that its brain used to describe it. Civil servants love acronyms like CAP (Common Agricultural Policy) and HEFCE
(Higher Education Funding Council for England); fledgling bureaucrats surely need a glossary of such abbreviations, a codebook. If you find
such a codebook dropped in the street, you could work out which ministry it came from by seeing which phrases have been granted abbreviations, presumably because they are commonly used in that ministry. An intercepted codebook is not a particular message about the world, but it is a statistical summary of the kind of world which this code was designed to describe economically.
We can think of each brain as equipped with a store cupboard of basic images, useful for modelling important or common features of the animal's world. Although, following Barlow, I have emphasized learning as the means by which the store cupboard is stocked, there is no reason why natural selection itself, working on genes, should not do some of the work of filling up the cupboard. In this case, following the logic of the previous chapter, we should say that the store cupboard in the brain contains images from the ancestral past of the species. We could call it a
collective unconscious, if the phrase had not become tarnished by association.
But the biases of the image kit in the cupboard will not only reflect what is statistically unexpected in the world. Natural selection will ensure that the repertoire of virtual representations is also well endowed with images that are of particular salience or importance in the life of the particular kind of animal and in the world of its ancestors, even if these are not especially common. An animal may need only once in its life to recognize a complicated pattern, say the shape of a female of its species, but on that occasion it is vitally important to get it right, and do so without delay. For humans, faces are of special importance, as well as being common in our world. The same is true of social monkeys. Monkey brains have been found to possess a special class of cells which fire at full strength only when presented with a complete face. We've already seen that humans with particular kinds of localized brain damage experience a very peculiar, and revealing, kind of selective blindness. They can't recognize faces. They can see everything else, apparently normally, and they can see that a face has a shape, with features. They can describe the nose, the eyes and the mouth. But they can't recognize the face even of the person they love best in all the world.
Normal people not only recognize faces. We seem to have an almost indecent eagerness to see faces, whether they are really there or not. We see faces in damp patches on the ceiling, in the contours of a hillside, in clouds or in Martian rocks. Generations of moongazers have been led, by the most unpromising of raw materials, to invent a face in the pattern of craters on the moon. The Daily Express (London) of 15 January 1998 bestowed most of a page, complete with banner headline, on the story that an Irish cleaning woman saw the face of Jesus in her duster: 'Now a stream of pilgrims is expected at her semi-detached home . . . The woman's parish priest said, "I've never seen anything like it before in my 34 years in the priesthood."' The accompanying photograph shows a pattern of dirty polish on a cloth which slightly resembles a face of some kind: there is a faint suggestion of an eye on one side of what could be a nose; there is also a sloping eyebrow on the other side which gives it a look of Harold Macmillan, although I suppose even Harold Macmillan might look like Jesus to a suitably prepared mind. The Express reminds us of similar stories, including the 'nun bun' served up in a Nashville cafe, which 'resembled the face of Mother Teresa, 86' and caused great excitement until 'the aged nun wrote to the cafe demanding the bun be removed'.
The eagerness of the brain to construct a face, when offered the slightest encouragement, fosters a remarkable illusion. Get an ordinary mask of a
human face - President Clinton's face, or whatever is on sale for fancy dress parties. Stand it up in a good light and look at it from the far side of the room. If you look at it the normal way round, not surprisingly it looks solid. But now turn the mask so that it is facing away from you and look at the hollow side from across the room. Most people see the illusion immediately. If you don't, try adjusting the light. It may help if you shut one eye, but it is by no means necessary. The illusion is that the hollow side of the mask looks solid. The nose, brows and mouth stick out towards you and seem nearer than the ears. It is even more striking if you move from side to side, or up and down. The apparently solid face seems to turn with you, in an odd, almost magical way. I'm not talking about the ordinary experience we have when the eyes of a good portrait seem to follow you around the room. The hollow mask illusion is far more spooky. It seems to hover, luminously, in space. The face really really seems to turn. I have a mask of Einstein's face mounted in my room, hollow side out, and visitors gasp when they glimpse it. The illusion is most strikingly displayed if you set the mask on a slowly rotating turntable. As the solid side turns before you, you'll see it move in a sensible 'normal reality' way. Now the hollow side comes into view and something extraordinary happens. You see another solid face, but it is rotating in the opposite direction. Because one face (say, the real solid face) is turning clockwise while the other, pseudo-solid face appears to be turning anticlockwise, the face that is rotating into view seems to swallow up the face that is rotating away from view. As the turning continues, you then see the really hollow but apparently solid face rotating firmly in the wrong direction for a while, before the really solid face reappears and swallows up the virtual face. The whole experience of watching the illusion is quite unsettling and it remains so no matter how long you go on watching it. You don't get used to it and don't lose the illusion.
What is happening? We can take the answer in two stages. First, why do we see the hollow mask as solid? And second, why does it seem to rotate in the wrong direction? We've already agreed that the brain is very good at - and very keen on - constructing faces in its internal simulation room. The information that the eyes are feeding to the brain is of course compatible with the mask's being hollow, but it is also compatible - just - with an alternative hypothesis, that it is solid. And the brain, in its simulation, goes for the second alternative, presumably because of its eagerness to see faces. So it overrules the messages from the eyes that say, 'This is hollow'; instead, it listens to the messages that say, 'This is a face, this is a face, face, face, face. ' Faces are always solid. So the brain takes a face model out of its cupboard which is, by its nature, solid.
But having constructed its apparently solid face model, the brain is caught in a contradiction when the mask starts to rotate. To simplify the
explanation, suppose that the mask is that of Oliver Cromwell and that his famous warts are visible from both sides of the mask. When looking
at the hollow interior of the nose, which is really pointing away from the viewer, the eye looks straight across to the right side of the nose where there is a prominent wart. But the constructed virtual nose is apparently pointing towards the viewer, not away, and the wart is on what, from the virtual Cromwell's point of view, would be his left side, as if we were looking at Cromwell's mirror image. As the mask rotates, if the face were really solid, our eye would see more of the side that it expected to see more of and less of the side that it expected to see less of.
But because the mask is actually hollow, the reverse happens. The relative
proportions of the retinal image change in the way the brain would
expect if the face were solid but rotating in the opposite direction. And that is the illusion that we see. The brain resolves the inevitable contradiction, as one side gives way to the other, in the only way possible, given its stubborn insistence on the mask's being a solid face: it
simulates a virtual model of one face swallowing up the other face.
The rare brain disorder that destroys our ability to recognize faces is called prosopagnosia. It is caused by injury to specific parts of the brain. This very fact supports the importance of a 'face cupboard' in the brain. I don't know, but I'd bet that prosopagnosics wouldn't see the hollow mask illusion. Francis Crick discusses prosopagnosia in his book The Astonishing Hypothesis (1994), together with other revealing clinical conditions. For instance, one patient found the following condition very frightening which, as Crick observes, is not surprising:
. . . objects or persons she saw in one place suddenly appeared in another without her being aware they were moving. This was particularly distressing if she wanted to cross a road, since a car that at first seemed far away would suddenly be very close . . . She experienced the world rather as some of us might see the dance floor in the strobe lighting of a discotheque.
This woman had a mental cupboard full of images for assembling her virtual world, just as we all do. The images themselves were probably perfectly good. But something had gone wrong with her software for deploying them in a smoothly changing virtual world. Other patients have lost their ability to construct virtual depth. They see the world as though it was made of flat, cardboard cut-outs. Yet other patients can recognize objects only if they are presented from a familiar angle. The rest of us, having seen, say, a saucepan from the side, can effortlessly recognize it from above. These patients have presumably lost some ability to manipulate virtual images and turn them around. The technology of virtual reality gives us a language to think about such skills, and this will be my next topic.
I shall not dwell on the details of today's virtual reality which is certain,
in any case, to become obsolete. The technology changes as rapidly as everything else in the world of computers. Essentially what happens is as follows. You don a headset which presents to each of your eyes a miniature computer screen. The images on the two screens are nearly
the same as each other, but offset to give the stereo illusion of three dimensions. The scene is whatever has been programmed into the computer: the Parthenon, perhaps, intact and in its original garish colours; an imagined landscape on Mars; the inside of a cell, hugely magnified. So far, I might have been describing an ordinary 3-D movie. But the virtual reality machine provides a two-way street. The computer doesn't just present you with scenes, it responds to you. The headset is wired up to register all turnings of your head, and other body movements, which would, in the normal course of events, affect your viewpoint. The computer is continuously informed of all such movements and - here is the cunning part - it is programmed to change the scene presented to the eyes, in exactly the way it would change if you were really moving your head. As you turn your head, the pillars of the Parthenon, say, swing round and you find yourself looking at a statue which, previously, had been 'behind' you.
A more advanced system might have you in a body stocking, laced with strain gauges to monitor the positions of all your limbs. The computer can now tell whenever you take a step, whenever you sit down, stand up, or wave your arms. You can now walk from one end of the Parthenon to the other, watching the pillars pass by as the computer changes the images in sympathy with your steps. Tread carefully because, remember, you are not really in the Parthenon but in a cluttered computer room. Present day virtual reality systems, indeed, are likely to tether you to the computer by a complicated umbilicus of cables, so let's postulate a future tangle-free radio link, or infrared data beam. Now you can walk freely in an empty real world and explore the fantasy virtual world that has been programmed for you. Since the computer knows where your body stocking is, there is no reason why it shouldn't represent you to yourself as a complete human form, an avatar, allowing you to look down at your 'legs', which might be very different from your real legs. You can watch your avatar's hands as they move in imitation of your real hands. If you use these hands to pick up a virtual object, say a Grecian urn, the urn will seem to rise into the air as you 'lift' it.
If somebody else, who could be in another country, dons another set of kit hooked up to the same computer, in principle you should be able to see their avatar and even shake hands - though with present day technology you might find yourself passing through each other like ghosts. The technicians and programmers are still working on how to
create the illusion of texture and the 'feel' of solid resistance. When I visited England's leading virtual reality company, they told me they get many letters from people wanting a virtual sexual partner. Perhaps in the future, lovers separated by the Atlantic will caress each other over the Internet, albeit incommoded by the need to wear gloves and a body stocking wired up with strain gauges and pressure pads.
Now let's take virtual reality a shade away from dreams and closer to practical usefulness. Present day doctors have recourse to the ingenious endoscope, a sophisticated tube that is inserted into a patient's body through, say, the mouth or the rectum and used for diagnosis and even surgical intervention. By the equivalent of pulling wires, the surgeon steers the long tube round the bends of the intestine. The tube itself has a tiny television camera lens at its tip and a light pipe to illuminate the way. The tip of the tube may also be furnished with various remote-control instruments which the surgeon can control, such as micro-scalpels and forceps.
In conventional endoscopy, the surgeon sees what he is doing using an ordinary television screen, and he operates the remote controls using his fingers. But as various people have realized (not least Jaron Lanier, who coined the phrase 'virtual reality' itself) it is in principle possible to give the surgeon the illusion of being shrunk and actually inside the patient's body. This idea is in the research stage, so I shall resort to a fantasy of how the technique might work in the next century. The surgeon of the future has no need to scrub up, for she need not go near her patient. She stands in a wide open area, connected by radio to the endoscope inside the patient's intestine. The miniature screens in front of her two eyes present a magnified stereo image of the interior of the patient
immediately in front of the tip of the endoscope. When she moves her head to the left, the computer automatically swivels the tip of the endoscope to the left. The angle of view of the camera inside the intestine faithfully moves to follow the surgeon's head movements in all three planes. She drives the endoscope forward along the intestine by her footsteps. Slowly, slowly, for fear of damaging the patient, the computer pushes the endoscope forwards, its direction always controlled by the direction in which, in a completely different room, the surgeon is walking. It feels to her as though she is actually walking through the intestine. It doesn't even feel claustrophobic. Following present day endoscopic practice, the gut has been carefully inflated with air, otherwise the walls would press in upon the surgeon and force her to crawl rather than walk.
When she finds what she is looking for, say a malignant tumour, the surgeon selects an instrument from her virtual toolbag. Perhaps it is most convenient to model it as a chainsaw, whose image is generated in the computer. Looking through the stereo screens in her helmet at the
enlarged 3-D tumour, the surgeon sees the virtual chainsaw in her virtual hands and goes to work, excising the tumour, as though it were a tree stump needing to be removed from the garden. Inside the real patient, the mirrored equivalent of the chainsaw is an ultrafine laser beam. As if by a pantograph, the gross movements of the surgeon's whole arm as she hefts the chainsaw are geared down, by the computer, to equivalent tiny movements of the laser gun in the tip of the endoscope.
For my purposes I need say only that it is theoretically possible to create the illusion of walking through somebody's intestine using the techniques of virtual reality. I do not know whether it will actually help surgeons. I suspect that it will, although a present day hospital consultant whom I have asked is a little sceptical. This same surgeon refers to himself and his fellow gastroenterologists as glorified plumbers. Plumbers themselves sometimes use larger-scale versions of endoscopes for exploring pipes and in America they even send down mechanical 'pigs' to eat their way through blockages in drains. Obviously the methods I imagined for a surgeon would work for a plumber. The plumber could 'tramp' (or 'swim'?) down the virtual water pipe with a virtual miner's lamp on his helmet and a virtual pickaxe in his hand for clearing blockages.
The Parthenon of my first example existed nowhere but in the computer. The computer could as well have introduced you to angels, harpies or winged unicorns. My hypothetical endoscopist and plumber, on the other hand, were walking through a virtual world that was constrained to resemble a mapped portion of reality, the real interior of a drain or a patient's intestine. The virtual world that was presented to the surgeon on her stereo screens was admittedly constructed in a computer, but it was constructed in a disciplined way. There was a real laser gun being controlled, albeit represented as a chainsaw because this would feel like a natural tool to excise a tumour whose apparent size was comparable to the surgeon's own body. The shape of the virtual construction reflected, in the way most convenient to the surgeon's operation, a detail of the real world inside the patient. Such constrained virtual reality is pivotal in this chapter. I believe that every species that has a nervous system uses it to construct a model of its own particular world, constrained by continuous updating through the sense organs. The nature of the model may depend upon how the species concerned is going to use it, at least as much as upon what we might think of as the nature of the world itself.
Think of a gliding gull adroitly riding the winds off a sea cliff. It may not be flapping its wings, but this doesn't mean that its wing muscles are idle. They and the tail muscles are constantly making tiny adjustments, sensitively fine-tuning the bird's flight surfaces to every eddy, every nuance of the air around it. If we fed information about the state of all
the nerves controlling these muscles into a computer, from moment to moment, the computer could in principle reconstruct every detail of the air currents through which the bird was gliding. It would do this by assuming that the bird was well designed to stay aloft and on that assumption construct a continuously updated model of the air around it. It would be a dynamic model, like a weather forecaster's model of the world's weather system, which is continuously revised by new data supplied by weather ships, satellites and ground stations and can be extrapolated to predict the future. The weather model advises us about tomorrow's weather; the gull model is theoretically capable of 'advising' the bird on the anticipatory adjustments that it should make to its wing and tail muscles in order to glide on into the next second.
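The flavour of such a continuously updated, extrapolating model can be caught in a few lines. This is nothing like a real flight-control system; it is just a running estimate that blends each new reading with its own prediction and uses the leftover surprise to look one step ahead (the readings and the weights are arbitrary):

```python
# Hypothetical gust readings arriving moment to moment.
readings = [1.0, 1.1, 1.3, 1.2, 0.8, 0.7, 0.9]

estimate, trend, alpha = readings[0], 0.0, 0.5
for new in readings[1:]:
    predicted = estimate + trend              # extrapolate: what the model expects next
    error = new - predicted                   # the "news": the unexpected part of the reading
    estimate = predicted + alpha * error      # correct the model with that news
    trend += 0.3 * error                      # and update its sense of where things are heading
    print(f"expect next ~ {estimate + trend:.2f} (surprise was {error:+.2f})")
```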
The point we are working towards, of course, is that although no human programmer has yet constructed a computer model to advise gulls on how to adjust their wing and tail muscles, just such a model is surely being run continuously in the brain of our gull and of every other bird in flight. Similar models, preprogrammed in outline by genes and past experience, and continuously updated by new sense data from millisecond to millisecond, are running inside the skull of every swimming fish, every galloping horse, every echo-ranging bat.
That ingenious inventor Paul MacCready is best known for his superbly economical flying machines, the man-powered Gossamer Condor and Gossamer Albatross and the sun-powered Solar Challenger. He also, in 1985, constructed a half-sized flying replica of the giant Cretaceous pterosaur Quetzalcoatlus. This huge flying reptile, with a wingspan comparable to that of a light aircraft, had almost no tail and was therefore highly unstable in the air. John Maynard Smith, who trained as an aero-engineer before switching to zoology, pointed out that this would have given advantages of manoeuvrability, but it demands accurate moment-to-moment control of the flight surfaces. Without a fast computer to adjust its trim continuously, MacCready's replica would have crashed. The real Quetzalcoatlus must have had an equivalent computer in its head, for the same reason. Earlier pterosaurs had long tails, in some cases terminated by what looks like a ping-pong bat, which would have given great stability, at a cost in manoeuvrability. It seems that, in the evolution of late, almost tailless pterosaurs like Quetzalcoatlus, there was a shift from stable but unmanoeuvrable to manoeuvrable but unstable. The same trend can be seen in the evolution of manmade aeroplanes. In both cases, the trend is made possible only by increasing computer power. As in the case of the seagull, the pterosaur's on-board computer inside its skull must have run a simulation model of the animal and the air through which it flew.
You and I, we humans, we mammals, we animals, inhabit a virtual world, constructed from elements that are, at successively higher levels, useful for representing the real world. Of course, we feel as if we are firmly placed in the real world - which is exactly as it should be if our constrained virtual reality software is any good. It is very good, and the only time we notice it at all is on the rare occasions when it gets something wrong. When this happens we experience an illusion or a hallucination, like the hollow mask illusion we talked about earlier.
The British psychologist Richard Gregory has paid special attention to visual illusions as a means of studying how the brain works. In his book Eye and Brain (fifth edition 1998), he regards seeing as an active process in which the brain sets up hypotheses about what is going on out there, then tests those hypotheses against the data coming in from the sense organs. One of the most familiar of all visual illusions is the Necker cube. This is a simple line drawing of a hollow cube, like a cube made of steel rods. The drawing is a two-dimensional pattern of ink on paper. Yet a normal human sees it as a cube. The brain has made a three-dimensional model based upon the two-dimensional pattern on the paper. This is, indeed, the kind of thing the brain does almost every time you look at a picture. The flat pattern of ink on paper is equally compatible with two alternative three-dimensional brain models. Stare at the
drawing for some seconds and you will see it flip. The facet that had previously seemed nearest to you will now appear farthest. Carry on looking, and it will flip back to the original cube. The brain could have been designed to stick, arbitrarily, to one of the two cube models, say the first of the two that it hit upon, even though the other model would have been equally compatible with the information from the retinas. But in fact the brain takes the other option of running each model, or hypothesis, alternately for a few seconds at a time. Hence the apparent cube alternates, which gives the game away. Our brain constructs a three-dimensional model. It is virtual reality in the head.
When we are looking at an actual wooden box, our simulation software is provided with additional information, which enables it to arrive at a clear preference for one of the two internal models. We therefore see the box in one way only, and there is no alternation. But this does not diminish the truth of the general lesson we learn from the Necker cube. Whenever we look at anything, there is a sense in which what our brain actually
makes use of is a model of that thing in the brain. The model in the brain, like the virtual Parthenon of my earlier example, is constructed. But, unlike the Parthenon (and perhaps the visions we see in dreams), it is, like the surgeon's computer model of the inside of her patient, not entirely invented: it is constrained by information fed in from the outside world.
A more powerful illusion of solidity is conveyed by stereoscopy, the slight discrepancy between the two images seen by the left and the right eyes. It is this that is exploited by the two screens in a virtual reality helmet. Hold up your right hand, with the thumb towards you, about one foot in front of your face, and look at some distant object, say a tree, with both eyes open. You'll see two hands. These correspond to the images seen by your two eyes. You can quickly find out which is which by first shutting one, then the other, eye. The two hands appear to be in slightly different places because your two eyes converge from different angles and the images on the two retinas are correspondingly, and tellingly, different. The two eyes get a slightly different view of the hand, too. The left eye sees a bit more of the palm, the right eye sees a bit more of the back of the hand.
Now, instead of looking at the distant tree, look at your hand, again with both eyes open. Instead of two hands in the foreground and one tree in the background, you'll see one solid-looking hand and two trees. Yet the hand image is still falling on different places on your two retinas. What this means is that your simulation software has constructed a single model of the hand, a model in 3-D. What's more, the single three-dimensional model has made use of information from both eyes. The brain subtly amalgamates both sets of information and puts together a useful model of a single, three-dimensional, solid hand. Incidentally, all retinal images of course are upside down, but this doesn't matter because the brain constructs its simulation model in the way that best suits its purpose and defines this model as the right way up.
The computational tricks used by the brain to construct a three-dimensional model from two two-dimensional images are astonishingly sophisticated, and are the basis of perhaps the most impressive of all illusions. These date back to a discovery by the Hungarian psychologist Bela Julesz in 1959. A normal stereoscope presents the same photograph to the left and the right eye but taken from suitably different angles. The brain puts the two together and sees an impressively three-dimensional scene. Julesz did the same thing, except that his pictures were random pepper and salt dots. The left and the right eye were shown the same random pattern, but with a crucial difference. In a typical Julesz experiment, an area of the pattern, say, a square, has its random dots displaced to one side, the appropriate distance to create the stereoscopic illusion. And the brain sees the illusion - a square patch stands out - even though there is not the smallest trace of a square in either of the two pictures. The square is present only in the discrepancy between the two pictures. The square looks very real to the viewer, but it really is nowhere but in the brain. The Julesz Effect is the basis of the 'Magic Eye' illusions so popular today. In a tour de force of the explainer's art, Steven Pinker devotes a small section of How the Mind Works (1998) to the
principle underlying these pictures. I won't even try to better his explanation.
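The construction of a Julesz-style pair is itself simple enough to sketch. The image size, the position of the hidden square and the disparity below are arbitrary choices; the essential trick is that the square exists only as a sideways shift between two otherwise random images:

```python
import random

random.seed(0)
SIZE, DISPARITY = 20, 2

# Left image: pure random dots. Right image: a copy in which the dots inside a
# central square are shifted sideways by DISPARITY positions, and the strip
# they uncover is refilled with fresh random dots. Neither image alone contains
# any visible square; it lives only in the discrepancy between the two.
left = [[random.randint(0, 1) for _ in range(SIZE)] for _ in range(SIZE)]
right = [row[:] for row in left]

for r in range(6, 14):                           # rows of the hidden square
    for c in range(6, 14):                       # columns of the hidden square
        right[r][c - DISPARITY] = left[r][c]     # shift the square's dots
    for c in range(14 - DISPARITY, 14):
        right[r][c] = random.randint(0, 1)       # refill the uncovered strip

for l_row, r_row in zip(left, right):
    print("".join(map(str, l_row)), "".join(map(str, r_row)))
```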
There is an easy way to demonstrate that the brain works as a sophisticated virtual reality computer. First, look about you by moving your eyes. As you swivel your eyes, the images on your retinas move as if you were in an earthquake. But you don't see an earthquake. To you, the scene seems as steady as a rock. I am leading up, of course, to saying that the virtual model in your brain is constructed to remain steady. But there is more to the demonstration, because there's another way to make the image on your retina move. Gently poke your eyeball through the skin of the eyelid. The retinal image will move in the same kind of way as before. Indeed you could, given sufficient skill with your finger, mimic the effect of shifting your gaze. But now you really will think you see the earth move. The whole scene shifts, as if you were witnessing an earthquake.
What is the difference between these two cases? It is that the brain computer has been set up to take account of normal eye movements and make allowance for them in constructing its computed model of the world. Apparently the brain model makes use of information, not only from the eyes, but also from the instructions to move the eyes. Whenever the brain issues an order to the eye muscles to move the eye, a copy of that order is sent to the part of the brain that is constructing the internal model of the world. Then, when the eyes move, the virtual reality software of the brain is warned to expect the retinal images to move just the right amount, and it makes the model compensate. So the constructed model of the world is seen to stay still, although it may be viewed from another angle. If the earth moves at any time other than when the model is told to expect movement, the virtual model moves accordingly. This is fine, because there really might be an earthquake. Except that you can fool the system by poking your eyeball. As the final demonstration using yourself as guinea pig, make yourself giddy by spinning round and round. Now stand still and look fixedly at the world. It will appear to spin even though your reason tells you that it is not getting anywhere in its rotation. Your retinal images are not moving, but the accelerometers in your ears (which work by detecting the movements of fluid in the so-called semicircular canals) are telling the brain that you are spinning. The brain instructs the virtual reality software to expect to see the world spinning. When the images on the retina do not spin, therefore, the model registers the discrepancy and spins itself in the opposite direction. To put it in subjective language, the virtual reality software says to itself, 'I know I'm spinning from what the ears are telling me; therefore, in order to hold the model still, it will be necessary to put the opposite spin on the model, relative to the data that the eyes are sending in. ' But the retinas actually report no spin, so the compensating
spin of the model in the head is what you seem to see. In Barlow's terms, it is the unexpected, it is 'news', and that is why we see it.
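The bookkeeping in that last paragraph can be compressed into a single line of arithmetic: what you seem to see is the retinal motion minus the motion the brain told itself to expect. A toy sketch, with the three demonstrations as test cases (all of the numbers are arbitrary):

```python
def perceived_world_motion(retinal_image_motion, expected_motion):
    """Perceived motion of the world = retinal motion minus the motion the brain expects."""
    return retinal_image_motion - expected_motion

# 1. Normal eye movement: the copy of the motor command predicts the image shift exactly.
print(perceived_world_motion(retinal_image_motion=+5, expected_motion=+5))   # 0: the world stays still

# 2. Poking the eyeball: the image shifts but no command was issued.
print(perceived_world_motion(retinal_image_motion=+5, expected_motion=0))    # +5: the world seems to lurch

# 3. Dizzy after spinning: the semicircular canals predict spin, the retina reports none.
print(perceived_world_motion(retinal_image_motion=0, expected_motion=+5))    # -5: the world seems to counter-spin
```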
Birds have an additional problem which humans ordinarily are spared. A bird perched on a tree branch is constantly being blown up and down, to and fro, and its retinal images seesaw accordingly. It is like living through a permanent earthquake.
The keyhole strategy is ruled out by sheer force of numbers.
Even if Lettvin needed to recognize nothing but his grandmother, how could he cope when her image falls on a different part of the retina? How cope with her image's changing size and shape as she approaches or recedes, as she turns sideways, or cants to the rear, as she smiles or as she frowns? If we add up all possible combinations of keyholes and anti- keyholes, the number enters the astronomical range. When you realize that Lettvin can recognize not only his grandmother's face but hundreds of other faces, the other bits of his grandmother and of other people, all the letters of the alphabet, all the thousands of objects to which a normal person can instantly give a name, in all possible orientations and
apparent sizes, the explosion of triggering cells gets rapidly out of hand. The American psychologist Fred Attneave, who had come up with the same general idea as Barlow, dramatized the point by the following calculation: if there were just one brain cell to cope, keyhole fashion, with each image that we can distinguish in all its presentations, the volume of the brain would have to be measured in cubic light years.
How then, with a brain capacity measured only in hundreds of cubic centimetres, do we do it? The answer was proposed in the 1950s by Barlow and Attneave independently. They suggested that nervous systems exploit the massive redundancy in all sensory information. Redundancy is jargon from the world of information theory, originally developed by engineers concerned with the economics of telephone line capacity. Information, in the technical sense, is surprise value, measured as the inverse of expected probability. Redundancy is the opposite of information, a measure of unsurprisingness, of old-hatitude. Redundant messages or parts of messages are not informative because the receiver, in some sense, already knows what is coming. Newspapers do not carry headlines saying, 'The sun rose this morning'. That would convey almost zero information. But if a morning came when the sun did not rise, headline writers would, if any survived, make much of it. The information content would be high, measured as the surprise value of the message. Much of spoken and written language is redundant - hence possible condense telegraphese: redundancy lost, information preserved.
Everything that we know about the world outside our skulls comes to us via nerve cells whose impulses chatter like machine guns. What passes along a nerve cell is a volleying of 'spikes', impulses whose voltage is
fixed (or at least irrelevant) but whose rate of arriving varies meaningfully. Now let's think about coding principles. How would you translate information from the outside world, say, the sound of an oboe or the temperature of a bath, into a pulse code? A first thought is a simple rate code: the hotter the bath, the faster the machine gun should fire. The brain, in other words, would have a thermometer calibrated in pulse rates. Actually, this is not a good code because it is uneconomical with pulses. By exploiting redundancy, it is possible to devise codes that convey the same information at a cost of fewer pulses. Temperatures in the world mostly stay the same for long periods at a time. To signal 'It is hot, it is hot, it is still hot. . . ' by a continuously high rate of machine-gun pulses is wasteful; it is better to say, 'It has suddenly become hot' (now you can assume that it will stay the same until further notice).
And, satisfyingly, this is what nerve cells mostly do, not just for signalling temperature but for signalling almost everything about the world. Most nerve cells are biased to signal changes in the world. If a trumpet plays a long sustained note, a typical nerve cell telling the brain
about it would show the following pattern of impulses: Before the trumpet starts, low firing rate; immediately after the trumpet starts, high firing rate; as the trumpet carries on sustaining its note, the firing rate dies away to an infrequent mutter; at the moment when the trumpet stops, high firing rate, dying away to a resting mutter again. Or there might be one class of nerve cells that fire only at the onset of sounds and a different class of cells that fire only when sounds go off. Similar exploitation of redundancy - screening out of the sameness in the world - goes on in cells that tell the brain about changes in light, changes in temperature, changes in pressure. Everything about the world is signalled as change, and this is a major economy.
But you and I don't seem to hear the trumpet die away. To us the trumpet seems to carry on at the same volume and then to stop abruptly. Yes, of course. That's what you'd expect because the coding system is ingenious. It doesn't throw away information, it only throws away redundancy. The brain is told only about changes, and it is then in a position to reconstruct the rest. Barlow doesn't put it like this, but we could say that the brain constructs a virtual sound, using the messages supplied by the nerves coming from the ears. The reconstructed virtual sound is complete and unabridged, even though the messages
themselves are economically stripped down to information about changes. The system works because the state of the world at a given time is
usually not greatly different from the preceding second. Only if the world changed capriciously, randomly and frequently, would it be economical for sense organs to signal continuously the state of the world. As it is, sense organs are set up to signal, economically, the discontinuities in the worlds and the brain, assuming correctly that the world doesn't change capriciously and at random, uses the information to construct an
internal virtual reality in which the continuity is restored.
The world presents an equivalent kind of redundancy in space, and the nervous system uses the corresponding trick. Sense organs tell the brain about edges and the brain fills in the boring bits between. Suppose you are looking at a black rectangle on a white background. The whole scene is projected on to your retina - you can think of the retina as a screen covered with a dense carpet of tiny photocells, the rods and cones. In theory, each photocell could report to the brain the exact state of the light falling upon it. But the scene we are looking at is massively redundant. Cells registering black are overwhelmingly likely to be surrounded by other cells registering black. Cells registering white are nearly all surrounded by other white-signalling cells. The important exceptions are cells on edges. Those on the white side of an edge signal white themselves and so do their neighbours that sit further into the white area. But their neighbours on the other side are in the black area. The brain can theoretically reconstruct the whole scene if just the retinal
cells on edges fire. If this could be achieved there would be massive savings in nerve impulses. Once again, redundancy is removed and only information gets through.
Elegantly, the economy is achieved in practice by the mechanism known as lateral inhibition'. Here's a simplified version of the principle, using our analogy of the screen of photocells. Each photocell sends one long wire to the central computer (brain) and also short wires to its immediate neighbours in the photocell screen. The short connections to the neighbours inhibit them, that is, turn down their firing rate. It is easy to see that maximal firing will come only from cells that lie along edges, for they are inhibited from one side only. Lateral inhibition of this kind is common among the low-level units of both vertebrate and invertebrate eyes.
Once again, we could say that the brain constructs a virtual world which is more complete than the picture relayed to it by the senses. The information which the senses supply to the brain is mostly information about edges. But the model in the brain is able to reconstruct the bits between the edges. As in the case of discontinuities in time, an economy is achieved by the elimination - and later reconstruction in the brain - of redundancy. This economy is possible only because uniform patches exist in the world. If the shades and colours in the world were randomly dotted about, no economical remodeling would be possible.
Another kind of redundancy stems from the fact that many lines in the real world are straight, or curved in smooth and therefore predictable (or mathematically reconstructable), ways. If the ends of a line are specified, the middle can be filled in using a simple rule that the brain already 'knows'. Among the nerve cells that have been discovered in the brains of mammals are the so-called 'line-detectors', neurones that fire whenever a straight line, aligned in a particular direction, falls on a particular place in the retina, the so-called 'retinal field' of the brain cell. Each of these line-detector cells has its own preferred direction. In the cat brain, there are only two preferred directions, horizontal and vertical, with an approximately equal number favouring each direction; however, in monkeys other angles are accommodated. From the point of view of the redundancy argument, what is going on here is as follows. In the retina, all the cells along a straight line fire and most of these impulses are redundant. The nervous system economizes by using a single cell to register the line, labelled with its angle. Straight lines are economically specified by their position and direction alone, or by their ends, not by the light value of every point along their length. The brain reweaves a virtual line in which the points along the line are reconstructed.
However, if a part of a scene suddenly detaches itself from the rest and starts to crawl over the background, it is news and should be signalled. Biologists have indeed discovered nerve cells that are silent until something moves against a still background. These cells don't respond when the entire scene moves - that would correspond to the sort of apparent movement the animal would see when it itself moves. But movement of a small object against a still background is information-rich and there are nerve cells tuned to detect it. The most famous of these are the so-called 'bug-detectors' discovered in frogs by Lettvin (he of the grandmother) and his colleagues. A bug-detector is a cell which is apparently blind to everything except the movement of small objects against their background. As soon as an insect moves in the field covered by a bug-detector, the cell immediately initiates massive signalling and the frog's tongue is likely to shoot out to catch the insect. To a sufficiently sophisticated nervous system, though, even the movement of a bug is redundant if it is movement in a straight line. Once you've been told that a bug is moving steadily in a northerly direction, you can assume that it will continue to move in this direction until further notice. Carrying the logic a step further, we should expect to find higher-order movement detector cells in the brain that are especially sensitive to change in movement, say, change in direction or change in speed. Lettvin and his colleagues found a cell that seems to do this, again in the frog. In their paper in Sensory Communication (1961) they describe a particular experiment as follows:
Let us begin with an empty gray hemisphere for the visual field. There is usually no response of the cell to turning on and off the illumination. It is silent. We bring in a small dark object, say 1 to 2 degrees in diameter, and at a certain point in its travel, almost anywhere in the field, the cell suddenly 'notices' it. Thereafter, wherever that object is moved it is tracked by the cell. Every time it moves, with even the faintest jerk, there is a burst of impulses that dies down to a mutter that continues as long as the object is visible. If the object is kept moving, the bursts signal discontinuities in the movement, such as the turning of corners, reversals, and so forth, and these bursts occur against a continuous background mutter that tells us the object is visible to the cell. . .
To summarize, it is as if the nervous system is tuned at successive hierarchical levels to respond strongly to the unexpected, weakly or not at all to the expected. What happens at successively higher levels is that the definition of that which is expected becomes progressively more sophisticated. At the lowest level, every spot of light is news. At the next level up, only edges are 'news'. At a higher level still, since so many edges are straight, only the ends of edges are news. Higher again, only movement is news. Then only changes in rate or direction of movement. In Barlow's terms derived from the theory of codes, we could say that the
nervous system uses short, economical words for messages that occur frequently and are expected; long, expensive words for messages that occur rarely and are not expected. It is a bit like language, in which (the generalization is called Zipf's Law) the shortest words in the dictionary are the ones most often used in speech. To push the idea to an extreme, most of the time the brain does not need to be told anything because what is going on is the norm. The message would be redundant. The brain is protected from redundancy by a hierarchy of filters, each filter tuned to remove expected features of a certain kind.
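The analogy with language can be made exact. Huffman coding, a standard technique from information theory, builds a dictionary in which the commonest messages get the shortest code words; the sketch below uses hypothetical message frequencies of my own invention, not anything measured from a nervous system.

```python
# Huffman coding: frequent messages get short code words, rare messages long
# ones - the 'economical dictionary' of the redundancy argument.
import heapq
from itertools import count

def huffman(freqs):
    tiebreak = count()
    heap = [(f, next(tiebreak), sym) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, a = heapq.heappop(heap)
        f2, _, b = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (a, b)))
    codes = {}
    def walk(node, prefix=""):
        if isinstance(node, tuple):        # internal node: descend both branches
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                              # leaf: record the finished code word
            codes[node] = prefix or "0"
    walk(heap[0][2])
    return codes

# Hypothetical message frequencies: 'nothing new' is common, 'sudden movement' rare.
freqs = {"nothing new": 90, "edge": 6, "straight line": 3, "sudden movement": 1}
for message, code in sorted(huffman(freqs).items(), key=lambda kv: len(kv[1])):
    print(f"{message:16s} -> {code}")
# The commonest message ends up with a one-bit code, the rarest with three bits.
```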
It follows that the set of nervous filters constitutes a kind of summary description of the norm, of the statistical properties of the world in which the animal lives. It is the nervous equivalent of our insight of the previous chapter: that the genes of a species come to constitute a statistical description of the worlds in which its ancestors were naturally selected. Now we see that the sensory coding units with which the brain confronts the environment also constitute a statistical description of that environment. They are tuned to discount the common and emphasize the rare. Our hypothetical zoologist of the future should therefore be able, by inspecting the nervous system of an unknown animal and measuring the statistical biases in its tuning, to reconstruct the statistical properties of the world in which the animal lived, to read off what is common and what rare in the animal's world.
The inference would be indirect, in the same way as for the case of the genes. We would not be reading the animal's world as a direct description. Rather, we'd infer things about the animal's world by inspecting the glossary of abbreviations that its brain used to describe it. Civil servants love acronyms like CAP (Common Agricultural Policy) and HEFCE
(Higher Education Funding Council for England); fledgling bureaucrats surely need a glossary of such abbreviations, a codebook. If you find
such a codebook dropped in the street, you could work out which ministry it came from by seeing which phrases have been granted abbreviations, presumably because they are commonly used in that ministry. An intercepted codebook is not a particular message about the world, but it is a statistical summary of the kind of world which this code was designed to describe economically.
We can think of each brain as equipped with a store cupboard of basic images, useful for modelling important or common features of the animal's world. Although, following Barlow, I have emphasized learning as the means by which the store cupboard is stocked, there is no reason why natural selection itself, working on genes, should not do some of the work of filling up the cupboard. In this case, following the logic of the previous chapter, we should say that the store cupboard in the brain contains images from the ancestral past of the species. We could call it a
collective unconscious, if the phrase had not become tarnished by association.
But the biases of the image kit in the cupboard will not only reflect what is statistically unexpected in the world. Natural selection will ensure that the repertoire of virtual representations is also well endowed with images that are of particular salience or importance in the life of the particular kind of animal and in the world of its ancestors, even if these are not especially common. An animal may need only once in its life to recognize a complicated pattern, say the shape of a female of its species, but on that occasion it is vitally important to get it right, and do so without delay. For humans, faces are of special importance, as well as being common in our world. The same is true of social monkeys. Monkey brains have been found to possess a special class of cells which fire at full strength only when presented with a complete face. We've already seen that humans with particular kinds of localized brain damage experience a very peculiar, and revealing, kind of selective blindness. They can't recognize faces. They can see everything else, apparently normally, and they can see that a face has a shape, with features. They can describe the nose, the eyes and the mouth. But they can't recognize the face even of the person they love best in all the world.
Normal people not only recognize faces. We seem to have an almost indecent eagerness to see faces, whether they are really there or not. We see faces in damp patches on the ceiling, in the contours of a hillside, in clouds or in Martian rocks. Generations of moongazers have been led, by the most unpromising of raw materials, to invent a face in the pattern of craters on the moon. The Daily Express (London) of 15 January 1998 bestowed most of a page, complete with banner headline, on the story that an Irish cleaning woman saw the face of Jesus in her duster: 'Now a stream of pilgrims is expected at her semi-detached home . . . The woman's parish priest said, "I've never seen anything like it before in my 34 years in the priesthood."' The accompanying photograph shows a pattern of dirty polish on a cloth which slightly resembles a face of some kind: there is a faint suggestion of an eye on one side of what could be a nose; there is also a sloping eyebrow on the other side which gives it a look of Harold Macmillan, although I suppose even Harold Macmillan might look like Jesus to a suitably prepared mind. The Express reminds us of similar stories, including the 'nun bun' served up in a Nashville cafe, which 'resembled the face of Mother Teresa, 86' and caused great excitement until 'the aged nun wrote to the cafe demanding the bun be removed'.
The eagerness of the brain to construct a face, when offered the slightest encouragement, fosters a remarkable illusion. Get an ordinary mask of a
human face - President Clinton's face, or whatever is on sale for fancy dress parties. Stand it up in a good light and look at it from the far side of the room. If you look at it the normal way round, not surprisingly it looks solid. But now turn the mask so that it is facing away from you and look at the hollow side from across the room. Most people see the illusion immediately. If you don't, try adjusting the light. It may help if you shut one eye, but it is by no means necessary. The illusion is that the hollow side of the mask looks solid. The nose, brows and mouth stick out towards you and seem nearer than the ears. It is even more striking if you move from side to side, or up and down. The apparently solid face seems to turn with you, in an odd, almost magical way. I'm not talking about the ordinary experience we have when the eyes of a good portrait seem to follow you around the room. The hollow mask illusion is far more spooky. It seems to hover, luminously, in space. The face really really seems to turn. I have a mask of Einstein's face mounted in my room, hollow side out, and visitors gasp when they glimpse it. The illusion is most strikingly displayed if you set the mask on a slowly rotating turntable. As the solid side turns before you, you'll see it move in a sensible 'normal reality' way. Now the hollow side comes into view and something extraordinary happens. You see another solid face, but it is rotating in the opposite direction. Because one face (say, the real solid face) is turning clockwise while the other, pseudo-solid face appears to be turning anticlockwise, the face that is rotating into view seems to swallow up the face that is rotating away from view. As the turning continues, you then see the really hollow but apparently solid face rotating firmly in the wrong direction for a while, before the really solid face reappears and swallows up the virtual face. The whole experience of watching the illusion is quite unsettling and it remains so no matter how long you go on watching it. You don't get used to it and don't lose the illusion.
What is happening? We can take the answer in two stages. First, why do we see the hollow mask as solid? And second, why does it seem to rotate in the wrong direction? We've already agreed that the brain is very good at - and very keen on - constructing faces in its internal simulation room. The information that the eyes are feeding to the brain is of course compatible with the mask's being hollow, but it is also compatible - just - with an alternative hypothesis, that it is solid. And the brain, in its simulation, goes for the second alternative, presumably because of its eagerness to see faces. So it overrules the messages from the eyes that say, 'This is hollow'; instead, it listens to the messages that say, 'This is a face, this is a face, face, face, face. ' Faces are always solid. So the brain takes a face model out of its cupboard which is, by its nature, solid.
But having constructed its apparently solid face model, the brain is caught in a contradiction when the mask starts to rotate. To simplify the
explanation, suppose that the mask is that of Oliver Cromwell and that his famous warts are visible from both sides of the mask. When looking
at the hollow interior of the nose, which is really pointing away from the viewer, the eye looks straight across to the right side of the nose where there is a prominent wart. But the constructed virtual nose is apparently pointing towards the viewer, not away, and the wart is on what, from the virtual Cromwell's point of view, would be his left side, as if we were looking at Cromwell's mirror image. As the mask rotates, if the face were really solid, our eye would see more of the side that it expected to see more of and less of the side that it expected to see less of.
But because the mask is actually hollow, the reverse happens. The relative
proportions of the retinal image change in the way the brain would
expect if the face were solid but rotating in the opposite direction. And that is the illusion that we see. The brain resolves the inevitable contradiction, as one side gives way to the other, in the only way possible, given its stubborn insistence on the mask's being a solid face: it
simulates a virtual model of one face swallowing up the other face.
The rare brain disorder that destroys our ability to recognize faces is called prosopagnosia. It is caused by injury to specific parts of the brain. This very fact supports the importance of a 'face cupboard' in the brain. I don't know, but I'd bet that prosopagnosics wouldn't see the hollow mask illusion. Francis Crick discusses prosopagnosia in his book The Astonishing Hypothesis (1994), together with other revealing clinical conditions. For instance, one patient found the following condition very frightening, which, as Crick observes, is not surprising:
. . . objects or persons she saw in one place suddenly appeared in another without her being aware they were moving. This was particularly distressing if she wanted to cross a road, since a car that at first seemed far away would suddenly be very close . . . She experienced the world rather as some of us might see the dance floor in the strobe lighting of a discotheque.
This woman had a mental cupboard full of images for assembling her virtual world, just as we all do. The images themselves were probably perfectly good. But something had gone wrong with her software for deploying them in a smoothly changing virtual world. Other patients have lost their ability to construct virtual depth. They see the world as though it was made of flat, cardboard cut-outs. Yet other patients can recognize objects only if they are presented from a familiar angle. The rest of us, having seen, say, a saucepan from the side, can effortlessly recognize it from above. These patients have presumably lost some ability to manipulate virtual images and turn them around. The technology of virtual reality gives us a language to think about such skills, and this will be my next topic.
I shall not dwell on the details of today's virtual reality, which is certain,
in any case, to become obsolete. The technology changes as rapidly as everything else in the world of computers. Essentially what happens is as follows. You don a headset which presents to each of your eyes a miniature computer screen. The images on the two screens are nearly
the same as each other, but offset to give the stereo illusion of three dimensions. The scene is whatever has been programmed into the computer: the Parthenon, perhaps, intact and in its original garish colours; an imagined landscape on Mars; the inside of a cell, hugely magnified. So far, I might have been describing an ordinary 3-D movie. But the virtual reality machine provides a two-way street. The computer doesn't just present you with scenes, it responds to you. The headset is wired up to register all turnings of your head, and other body movements, which would, in the normal course of events, affect your viewpoint. The computer is continuously informed of all such movements and - here is the cunning part - it is programmed to change the scene presented to the eyes, in exactly the way it would change if you were really moving your head. As you turn your head, the pillars of the Parthenon, say, swing round and you find yourself looking at a statue which, previously, had been 'behind' you.
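The feedback loop can be sketched in a few lines (a toy of my own, with made-up landmark names, not a description of any real virtual reality system): each frame, the headset reports the current head angle, and the landmarks of the scene are redrawn in head-centred coordinates, so turning your head brings a previously 'behind' object into view.

```python
# A bare-bones head-tracking loop: the scene is redrawn each frame from the
# current head orientation, which is what keeps the virtual world apparently
# fixed in place while you turn.
import math

# Hypothetical landmarks in the virtual scene, as (x, y) positions: x = right, y = ahead.
SCENE = {"pillar": (0.0, 5.0), "statue": (-5.0, 0.0)}

def view_from(yaw_degrees):
    """Return each landmark's position in head-centred coordinates
    for the given head yaw (rotation about the vertical axis)."""
    yaw = math.radians(yaw_degrees)
    cos_y, sin_y = math.cos(yaw), math.sin(yaw)
    view = {}
    for name, (x, y) in SCENE.items():
        # Rotate the world by -yaw: equivalent to turning the head by +yaw.
        view[name] = (x * cos_y + y * sin_y, -x * sin_y + y * cos_y)
    return view

for yaw in (0, 90):          # head facing forward, then turned 90 degrees to the left
    print(yaw, {k: (round(x, 1), round(y, 1)) for k, (x, y) in view_from(yaw).items()})
# At 0 degrees the pillar is dead ahead; after the turn the statue is.
```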
A more advanced system might have you in a body stocking, laced with strain gauges to monitor the positions of all your limbs. The computer can now tell whenever you take a step, whenever you sit down, stand up, or wave your arms. You can now walk from one end of the Parthenon to the other, watching the pillars pass by as the computer changes the images in sympathy with your steps. Tread carefully because, remember, you are not really in the Parthenon but in a cluttered computer room. Present day virtual reality systems, indeed, are likely to tether you to the computer by a complicated umbilicus of cables, so let's postulate a future tangle-free radio link, or infrared data beam. Now you can walk freely in an empty real world and explore the fantasy virtual world that has been programmed for you. Since the computer knows where your body stocking is, there is no reason why it shouldn't represent you to yourself as a complete human form, an avatar, allowing you to look down at your 'legs', which might be very different from your real legs. You can watch your avatar's hands as they move in imitation of your real hands. If you use these hands to pick up a virtual object, say a Grecian urn, the urn will seem to rise into the air as you 'lift' it.
If somebody else, who could be in another country, dons another set of kit hooked up to the same computer, in principle you should be able to see their avatar and even shake hands - though with present day technology you might find yourself passing through each other like ghosts. The technicians and programmers are still working on how to
create the illusion of texture and the 'feel' of solid resistance. When I visited England's leading virtual reality company, they told me they get many letters from people wanting a virtual sexual partner. Perhaps in the future, lovers separated by the Atlantic will caress each other over the Internet, albeit incommoded by the need to wear gloves and a body stocking wired up with strain gauges and pressure pads.
Now let's take virtual reality a shade away from dreams and closer to practical usefulness. Present day doctors have recourse to the ingenious endoscope, a sophisticated tube that is inserted into a patient's body through, say, the mouth or the rectum and used for diagnosis and even surgical intervention. By the equivalent of pulling wires, the surgeon steers the long tube round the bends of the intestine. The tube itself has a tiny television camera lens at its tip and a light pipe to illuminate the way. The tip of the tube may also be furnished with various remote-control instruments which the surgeon can control, such as micro-scalpels and forceps.
In conventional endoscopy, the surgeon sees what he is doing using an ordinary television screen, and he operates the remote controls using his fingers. But as various people have realized (not least Jaron Lanier, who coined the phrase 'virtual reality' itself) it is in principle possible to give the surgeon the illusion of being shrunk and actually inside the patient's body. This idea is in the research stage, so I shall resort to a fantasy of how the technique might work in the next century. The surgeon of the future has no need to scrub up, for she need not go near her patient. She stands in a wide open area, connected by radio to the endoscope inside the patient's intestine. The miniature screens in front of her two eyes present a magnified stereo image of the interior of the patient
immediately in front of the tip of the endoscope. When she moves her head to the left, the computer automatically swivels the tip of the endoscope to the left. The angle of view of the camera inside the intestine faithfully moves to follow the surgeon's head movements in all three planes. She drives the endoscope forward along the intestine by her footsteps. Slowly, slowly, for fear of damaging the patient, the computer pushes the endoscope forwards, its direction always controlled by the direction in which, in a completely different room, the surgeon is walking. It feels to her as though she is actually walking through the intestine. It doesn't even feel claustrophobic. Following present day endoscopic practice, the gut has been carefully inflated with air, otherwise the walls would press in upon the surgeon and force her to crawl rather than walk.
When she finds what she is looking for, say a malignant tumour, the surgeon selects an instrument from her virtual toolbag. Perhaps it is most convenient to model it as a chainsaw, whose image is generated in the computer. Looking through the stereo screens in her helmet at the
enlarged 3-D tumour, the surgeon sees the virtual chainsaw in her virtual hands and goes to work, excising the tumour, as though it were a tree stump needing to be removed from the garden. Inside the real patient, the mirrored equivalent of the chainsaw is an ultrafine laser beam. As if by a pantograph, the gross movements of the surgeon's whole arm as she hefts the chainsaw are geared down, by the computer, to equivalent tiny movements of the laser gun in the tip of the endoscope.
For my purposes I need say only that it is theoretically possible to create the illusion of walking through somebody's intestine using the techniques of virtual reality. I do not know whether it will actually help surgeons. I suspect that it will, although a present day hospital consultant whom I have asked is a little sceptical. This same surgeon refers to himself and his fellow gastroenterologists as glorified plumbers. Plumbers themselves sometimes use larger-scale versions of endoscopes for exploring pipes and in America they even send down mechanical 'pigs' to eat their way through blockages in drains. Obviously the methods I imagined for a surgeon would work for a plumber. The plumber could 'tramp' (or 'swim'? ) down the virtual water pipe with a virtual miner's lamp on his helmet and a virtual pickaxe in his hand for clearing blockages.
The Parthenon of my first example existed nowhere but in the computer. The computer could as well have introduced you to angels, harpies or winged unicorns. My hypothetical endoscopist and plumber, on the other hand, were walking through a virtual world that was constrained to resemble a mapped portion of reality, the real interior of a drain or a patient's intestine. The virtual world that was presented to the surgeon on her stereo screens was admittedly constructed in a computer, but it was constructed in a disciplined way. There was a real laser gun being controlled, albeit represented as a chainsaw because this would feel like a natural tool to excise a tumour whose apparent size was comparable to the surgeon's own body. The shape of the virtual construction reflected, in the way most convenient to the surgeon's operation, a detail of the real world inside the patient. Such constrained virtual reality is pivotal in this chapter. I believe that every species that has a nervous system uses it to construct a model of its own particular world, constrained by continuous updating through the sense organs. The nature of the model may depend upon how the species concerned is going to use it, at least as much as upon what we might think of as the nature of the world itself.
Think of a gliding gull adroitly riding the winds off a sea cliff. It may not be flapping its wings, but this doesn't mean that its wing muscles are idle. They and the tail muscles are constantly making tiny adjustments, sensitively fine-tuning the bird's flight surfaces to every eddy, every nuance of the air around it. If we fed information about the state of all
the nerves controlling these muscles into a computer, from moment to moment, the computer could in principle reconstruct every detail of the air currents through which the bird was gliding. It would do this by assuming that the bird was well designed to stay aloft and on that assumption construct a continuously updated model of the air around it. It would be a dynamic model, like a weather forecaster's model of the world's weather system, which is continuously revised by new data supplied by weather ships, satellites and ground stations and can be extrapolated to predict the future. The weather model advises us about tomorrow's weather; the gull model is theoretically capable of 'advising' the bird on the anticipatory adjustments that it should make to its wing and tail muscles in order to glide on into the next second.
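A caricature of such a continuously updated model can be written in a few lines of Python (my own sketch, not anybody's model of gull flight or of weather forecasting): at each moment the model extrapolates its current estimate forward, compares the prediction with the new reading, and nudges both its estimate and its trend towards the data.

```python
# A minimal predict-and-correct loop: the model extrapolates its current
# estimate forward, then nudges it towards each new sensor reading.

def run_model(readings, gain=0.5):
    estimate, trend = readings[0], 0.0
    track = []
    for measured in readings[1:]:
        predicted = estimate + trend              # extrapolate: advise what comes next
        error = measured - predicted              # surprise: the informative part
        estimate = predicted + gain * error       # correct towards the new data
        trend = trend + gain * error              # and update the trend
        track.append(round(estimate, 2))
    return track

# Hypothetical gust-speed readings arriving moment to moment.
print(run_model([0.0, 1.0, 2.0, 3.1, 2.9, 3.0]))
```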
The point we are working towards, of course, is that although no human programmer has yet constructed a computer model to advise gulls on how to adjust their wing and tail muscles, just such a model is surely being run continuously in the brain of our gull and of every other bird in flight. Similar models, preprogrammed in outline by genes and past experience, and continuously updated by new sense data from millisecond to millisecond, are running inside the skull of every swimming fish, every galloping horse, every echo-ranging bat.
That ingenious inventor Paul MacCready is best known for his superbly economical flying machines, the man-powered Gossamer Condor and Gossamer Albatross and the sun-powered Solar Challenger. He also, in 1985, constructed a half-sized flying replica of the giant Cretaceous pterosaur Quetzalcoatlus. This huge flying reptile, with a wingspan comparable to that of a light aircraft, had almost no tail and was therefore highly unstable in the air. John Maynard Smith, who trained as an aero-engineer before switching to zoology, pointed out that this would have given advantages of manoeuvrability, but it demands accurate moment-to-moment control of the flight surfaces. Without a fast computer to adjust its trim continuously, MacCready's replica would have crashed. The real Quetzalcoatlus must have had an equivalent computer in its head, for the same reason. Earlier pterosaurs had long tails, in some cases terminated by what looks like a ping-pong bat, which would have given great stability, at a cost in manoeuvrability. It seems that, in the evolution of late, almost tailless pterosaurs like Quetzalcoatlus, there was a shift from stable but unmanoeuvrable to manoeuvrable but unstable. The same trend can be seen in the evolution of manmade aeroplanes. In both cases, the trend is made possible only by increasing computer power. As in the case of the seagull, the pterosaur's on-board computer inside its skull must have run a simulation model of the animal and the air through which it flew.
You and I, we humans, we mammals, we animals, inhabit a virtual world, constructed from elements that are, at successively higher levels, useful for representing the real world. Of course, we feel as if we are firmly placed in the real world - which is exactly as it should be if our constrained virtual reality software is any good. It is very good, and the only time we notice it at all is on the rare occasions when it gets something wrong. When this happens we experience an illusion or a hallucination, like the hollow mask illusion we talked about earlier.
The British psychologist Richard Gregory has paid special attention to visual illusions as a means of studying how the brain works. In his book Eye and Brain (fifth edition 1998), he regards seeing as an active process in which the brain sets up hypotheses about what is going on out there, then tests those hypotheses against the data coming in from the sense organs. One of the most familiar of all visual illusions is the Necker cube. This is a simple line drawing of a hollow cube, like a cube made of steel rods. The drawing is a two-dimensional pattern of ink on paper. Yet a normal human sees it as a cube. The brain has made a three-dimensional model based upon the two-dimensional pattern on the paper. This is, indeed, the kind of thing the brain does almost every time you look at a picture. The flat pattern of ink on paper is equally compatible with two alternative three-dimensional brain models. Stare at the
drawing for some seconds and you will see it flip. The facet that had previously seemed nearest to you will now appear farthest. Carry on looking, and it will flip back to the original cube. The brain could have been designed to stick, arbitrarily, to one of the two cube models, say the first of the two that it hit upon, even though the other model would have been equally compatible with the information from the retinas. But in fact the brain takes the other option of running each model, or hypothesis, alternately for a few seconds at a time. Hence the apparent cube alternates, which gives the game away. Our brain constructs a three-dimensional model. It is virtual reality in the head.
When we are looking at an actual wooden box, our simulation software is provided with additional information, which enables it to arrive at a clear preference for one of the two internal models. We therefore see the box in one way only, and there is no alternation. But this does not diminish the truth of the general lesson we learn from the Necker cube. Whenever we look at anything, there is a sense in which what our brain actually
makes use of is a model of that thing in the brain. The model in the brain, like the virtual Parthenon of my earlier example, is constructed. But, unlike the Parthenon (and perhaps the visions we see in dreams), it is, like the surgeon's computer model of the inside of her patient, not entirely invented: it is constrained by information fed in from the outside world.
A more powerful illusion of solidity is conveyed by stereoscopy, the slight discrepancy between the two images seen by the left and the right eyes. It is this that is exploited by the two screens in a virtual reality helmet. Hold up your right hand, with the thumb towards you, about one foot in front of your face, and look at some distant object, say a tree, with both eyes open. You'll see two hands. These correspond to the images seen by your two eyes. You can quickly find out which is which by first shutting one, then the other, eye. The two hands appear to be in slightly different places because your two eyes converge from different angles and the images on the two retinas are correspondingly, and tellingly, different. The two eyes get a slightly different view of the hand, too. The left eye sees a bit more of the palm, the right eye sees a bit more of the back of the hand.
Now, instead of looking at the distant tree, look at your hand, again with both eyes open. Instead of two hands in the foreground and one tree in the background, you'll see one solid-looking hand and two trees. Yet the hand image is still falling on different places on your two retinas. What this means is that your simulation software has constructed a single model of the hand, a model in 3-D. What's more, the single three- dimensional model has made use of information from both eyes. The brain subtly amalgamates both sets of information and puts together a useful model of a single, three-dimensional, solid hand. Incidentally, all retinal images of course are upside down, but this doesn't matter because the brain constructs its simulation model in the way that best suits its purpose and defines this model as the right way up.
The computational tricks used by the brain to construct a three- dimensional model from two two-dimensional images are astonishingly sophisticated, and are the basis of perhaps the most impressive of all illusions. These date back to a discovery by the Hungarian psychologist Bela Julesz in 1959. A normal stereoscope presents the same photograph to the left and the right eye but taken from suitably different angles. The brain puts the two together and sees an impressively three-dimensional scene. Julesz did the same thing, except that his pictures were random pepper and salt dots. The left and the right eye were shown the same random pattern, but with a crucial difference. In a typical Julesz experiment, an area of the pattern, say, a square, has its random dots displaced to one side, the appropriate distance to create the stereoscopic illusion. And the brain sees the illusion - a square patch stands out - even though there is not the smallest trace of a square in either of the two pictures. The square is present only in the discrepancy between the two pictures. The square looks very real to the viewer, but it really is nowhere but in the brain. The Julesz Effect is the basis of the 'Magic Eye' illusions so popular today. In a tour de force of the explainer's art, Steven Pinker devotes a small section of How the Mind Works (1998) to the
principle underlying these pictures. I won't even try to better his explanation.
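What I can do is sketch how a Julesz pair might be generated (a toy of my own, using the numpy library): the two eyes get identical random dots except that a central square in one image is shifted sideways by a small disparity, with fresh random dots filling the strip the shift uncovers.

```python
# A toy Julesz random-dot pair: identical random dots for the two eyes except
# that a central square in the right eye's image is shifted sideways, so the
# square exists only in the discrepancy between the two images.
import numpy as np

rng = np.random.default_rng(0)
SIZE, DISPARITY = 100, 4

left = rng.integers(0, 2, size=(SIZE, SIZE))       # pepper-and-salt dots
right = left.copy()

top, bottom, lft, rgt = 30, 70, 30, 70              # the hidden square
# Shift the square's dots a few pixels sideways in the right eye's image...
right[top:bottom, lft - DISPARITY:rgt - DISPARITY] = left[top:bottom, lft:rgt]
# ...and fill the strip this uncovers with fresh random dots.
right[top:bottom, rgt - DISPARITY:rgt] = rng.integers(0, 2, size=(bottom - top, DISPARITY))

# Neither image contains a visible square, but fused stereoscopically the
# shifted patch appears to float at a different depth.
print(np.array_equal(left, right))                  # False: they differ only around the shifted patch
```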
There is an easy way to demonstrate that the brain works as a sophisticated virtual reality computer. First, look about you by moving your eyes. As you swivel your eyes, the images on your retinas move as if you were in an earthquake. But you don't see an earthquake. To you, the scene seems as steady as a rock. I am leading up, of course, to saying that the virtual model in your brain is constructed to remain steady. But there is more to the demonstration, because there's another way to make the image on your retina move. Gently poke your eyeball through the skin of the eyelid. The retinal image will move in the same kind of way as before. Indeed you could, given sufficient skill with your finger, mimic the effect of shifting your gaze. But now you really will think you see the earth move. The whole scene shifts, as if you were witnessing an earthquake.
What is the difference between these two cases? It is that the brain computer has been set up to take account of normal eye movements and make allowance for them in constructing its computed model of the world. Apparently the brain model makes use of information, not only from the eyes, but also from the instructions to move the eyes. Whenever the brain issues an order to the eye muscles to move the eye, a copy of that order is sent to the part of the brain that is constructing the internal model of the world. Then, when the eyes move, the virtual reality software of the brain is warned to expect the retinal images to move just the right amount, and it makes the model compensate. So the constructed model of the world is seen to stay still, although it may be viewed from another angle. If the earth moves at any time other than when the model is told to expect movement, the virtual model moves accordingly. This is fine, because there really might be an earthquake. Except that you can fool the system by poking your eyeball. As the final demonstration using yourself as guinea pig, make yourself giddy by spinning round and round. Now stand still and look fixedly at the world. It will appear to spin even though your reason tells you that it is not getting anywhere in its rotation. Your retinal images are not moving, but the accelerometers in your ears (which work by detecting the movements of fluid in the so-called semicircular canals) are telling the brain that you are spinning. The brain instructs the virtual reality software to expect to see the world spinning. When the images on the retina do not spin, therefore, the model registers the discrepancy and spins itself in the opposite direction. To put it in subjective language, the virtual reality software says to itself, 'I know I'm spinning from what the ears are telling me; therefore, in order to hold the model still, it will be necessary to put the opposite spin on the model, relative to the data that the eyes are sending in. ' But the retinas actually report no spin, so the compensating
spin of the model in the head is what you seem to see. In Barlow's terms, it is the unexpected, it is 'news', and that is why we see it.
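The bookkeeping can be caricatured in a few lines (my own cartoon, not a model of any real neural circuitry): the brain subtracts the retinal motion it expects, from its own commands or from the balance organs, from the motion the retina actually reports, and only the residue is perceived as movement of the world.

```python
# A cartoon of the compensation: perceived world motion is the retinal shift
# minus whatever shift the brain itself expects from its own motor commands
# or from the balance organs (units are arbitrary).

def perceived_world_motion(retinal_shift, expected_shift):
    return retinal_shift - expected_shift

# Normal eye movement: the command to move the eyes predicts the retinal shift exactly.
print(perceived_world_motion(retinal_shift=10, expected_shift=10))   # 0: the world seems still

# Poking the eyeball: the retina shifts but no command was issued.
print(perceived_world_motion(retinal_shift=10, expected_shift=0))    # 10: the world seems to lurch

# Dizzy after spinning: the balance organs predict a shift that never arrives.
print(perceived_world_motion(retinal_shift=0, expected_shift=8))     # -8: the world seems to spin the other way
```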
Birds have an additional problem which humans ordinarily are spared. A bird perched on a tree branch is constantly being blown up and down, to and fro, and its retinal images seesaw accordingly. It is like living through a permanent earthquake.
