Richard-Dawkins-Unweaving-the-Rainbow
But another way to look at it has always seemed to me more vivid.
The song is not informing the female but manipulating her.
It is not so much changing what the female knows as directly changing the internal physiological state of her brain.
It is acting like a drug.
There is experimental evidence from measuring the hormone levels of female doves and canaries, as well as their behaviour, that the sexual state of females is directly influenced by the vocalizations of males, the effects being integrated over a period of days. The sounds from a male canary flood through the female's ears into her brain where they have an effect that is indistinguishable from one that an experimenter can procure with a hypodermic syringe. The male's 'drug' enters the female through the portals of her ears rather than through a hypodermic, but this difference does not seem particularly telling.
The idea that birdsong is an auditory drug gains plausibility when you look at how it develops during the individual's lifetime. Typically, a young male songbird teaches himself to sing by practising: matching up fragments of trial song against a 'template' in his brain, a pre-programmed notion of what the song of his species 'ought' to sound like. In some species, such as the American song sparrow, the template is built in, programmed by the genes. In other species, such as the white-crowned sparrow or the European chaffinch, it is derived from a 'recording' of another male's song, made early in the young male's life from listening to an adult. Wherever the template comes from, the young male teaches himself how to sing in such a way as to match it.
That, at least, is one way to talk about what happens when a young bird perfects his song. But think of it another way. The song is ultimately designed to have a strong effect on the nervous system of another member of the species, either a prospective mate or a possible territorial rival who needs to be warned off. But the young bird himself is a member of his own species. His brain is a typical brain from that species. A sound that is effective in arousing his own emotions is likely to be as effective in arousing a female of the same species. Instead of speaking of the young male trying to shape his practice song to 'match' a built-in 'template', we could think of him as practising on himself as a typical member of his species, trying out fragments of song to see whether they excite his own passions, that is, experimenting with his own drugs on himself. And, to complete the circuit, perhaps it is not too surprising that nightingale song should have acted like a drug on the nervous system of John Keats. He was not a nightingale, but he was a vertebrate, and most drugs that work on humans have a comparable effect upon other vertebrates. Man-made drugs are the products of comparatively crude trial and error testing by chemists in the laboratory. Natural selection has had thousands of generations in which to fine-tune its drug technology.
Should we feel indignant on Keats's behalf at such a comparison? I do not believe that Keats himself would have done so - Coleridge even less. The 'Ode to a Nightingale' accepts the implication of the drug analogy, makes it wonderfully real. It is not demeaning to human emotion that we try to analyse and explain it, any more than, to a balanced judge, the rainbow is diminished when a prism unweaves it.
In this chapter and the previous one, I have used the barcode as a symbol of precise analysis, in all its beauty. Mixed light is sorted into its rainbow of component colours and everybody sees beauty. That is a first analysis. Closer detail reveals fine lines and a new elegance, the elegance of detection, of the bringing of order and understanding. Fraunhofer barcodes speak to us of the exact elemental nature of distant stars. A precisely measured pattern of stripes is a coded message from across the parsecs. There is grace in the sheer economy of unweaving intimate details about a star which, one had thought, could be found only through the costly undertaking of a journey lasting 3,000 human lifetimes. On another scale, we find a similar story when we look at the formant stripes in speech, the harmonic barcodes of music. There is elegance, too, in the barcodes of dendrochronology: the stripes across ancient Sequoia wood which tell us precisely in which year BC the tree was seeded, and what the weather was like in every one of the intervening years (for weather conditions are what give tree rings their characteristic widths). Like Fraunhofer's lines transmitted across space, tree rings transmit messages to us across time, and again there is a supple economy. It is the power - the fact that we can learn so much by precise analysis of what seems so little information - that gives these unweavings their beauty. The same is true, perhaps even more dramatically, of sound waves in speech and music - barcodes on the air.
Recently we have been hearing much about another kind of barcode - DNA 'fingerprints', barcodes in the blood. DNA barcodes expose and reconstruct details of human affairs that one might have supposed forever inaccessible even to legendarily great detectives. The main practical use of barcodes in the blood so far is in courts of law, and it is to them - and the benefits that a scientific attitude may bring to them - that we turn in the next chapter.
5
BARCODES AT THE BAR
And he said, Woe unto you also, ye lawyers! for ye lade men with burdens grievous to be borne, and ye yourselves touch not the burdens with one of your fingers. . . . Woe unto you, lawyers! for ye have taken away the key of knowledge: ye entered not in yourselves, and them that were entering in ye hindered.
Luke 11
On the face of it, the law may seem about as far as you can get from poetry or the wonder of science. Perhaps there is poetic beauty in the abstract ideas of justice or fairness, but I doubt if many lawyers are moved by it. In any case, that is not what this chapter is about. I shall be looking at an example of the role of science in the law: at a different aspect of science and its importance in society; a sense in which scientific understanding may become a valuable part of good citizenship. In courts of law, juries are increasingly asked to understand evidence which the lawyers themselves may not fully comprehend. Evidence from the unweaving of DNA - what we shall come to see as barcodes in the blood - is the outstanding example, and it is the main subject of this chapter. But it is not just facts about DNA that scientists can contribute. More importantly, it is the underlying theory of probability and statistics; it is scientific ways of making inferences that need to be brought to bear. Such matters stretch beyond the narrow subject of DNA evidence.
I am told on good authority that defence lawyers in the United States sometimes object to jury candidates on the grounds that they have had a scientific education. What can this mean? I would not question the right of defence lawyers to disallow the selection of particular jurors. A juror may be prejudiced against the race or class to which the defendant belongs. It is obviously undesirable that a raving homophobe should try a case of anti-homosexual violence. It is for this kind of reason that defence lawyers in some countries are allowed to cross-examine potential jurors and strike them off the list. In the USA lawyers can be completely blatant about their criteria for jury selection. A colleague tells me of a time when he was up for selection to a jury, on an injury litigation case. The lawyer asked, 'Would anyone here have a problem awarding a substantial amount of money to my client, perhaps in the millions?'
A lawyer can also disqualify a juror without giving reasons. Although this may be just, the only time I have seen it happen it misfired. I was a member of a panel of 24 individuals from which juries of 12 were to be selected. I had already participated in two juries with members of this panel, and I knew their individual foibles. One particular man was cast-iron prosecution fodder; he would take the same hard line almost regardless of the particular case. The defence lawyer waved him through like a breeze. The next one up, a large middle-aged woman, was the opposite: a guaranteed softie, a pure gift to the defence. But her appearance perhaps suggested the opposite, and it was against her that the defence lawyer chose to exercise his right of veto. I have never forgotten the look of wounded hurt on her face as, with a cutting movement of the hand, learned counsel struck her - whom he little knew could have been his secret weapon - out of the jury box.
But, to repeat the astonishing fact, lawyers in the United States have been known to use the following reason for striking down potential jurors: the prospective juror is well educated in science, or has some knowledge of genetics or probability theory. What is the problem? Are geneticists known to harbour deep-seated prejudices against certain sections of society? Are mathematicians especially likely to be of the 'flog 'em . . . string 'em up . . . it's the only language they understand . . . law and order' persuasion? Of course not. Nobody has ever claimed such a thing.
The lawyers' objections are more ignobly based. There is a new kind of evidence increasingly coming into the criminal courts: evidence from DNA fingerprinting, and it is extremely powerful. If your client is innocent, DNA evidence may well provide a knock-down convincing way to establish his innocence. Conversely, if he is guilty, DNA evidence has a good chance of establishing his guilt in cases where no other evidence can. DNA evidence is quite hard to understand at the best of times. There are controversial aspects of it which are even harder. In these circumstances, you would think that an honest lawyer who wishes to see justice done would welcome jurors capable of grasping the arguments. Wouldn't it be an obviously good thing to have at least one or two people in the jury room who can redress the ignorance of their baffled colleagues? What kind of a lawyer is it who prefers a jury incapable of following the case that either attorney is making?
The answer is a lawyer who is more interested in winning than in seeing justice done. A lawyer, in other words. And it seems to be a fact that advocates, of both prosecution and defence, frequently disallow individual jurors specifically because they are educated in science.
Courts of law have always needed to establish individual identity. Was the individual seen hurrying from the scene Richard Dawkins? Is the hat dropped at the scene of the crime his hat? Are those his fingerprints on the weapon? A yes answer to one of these questions does not by itself prove his guilt, but it is certainly an important factor to be taken into account. Most of us, including most jurors and lawyers, have an intuitive sense that there is something specially reliable about eye-witness evidence. In this we are almost certainly wrong, but the error is a pardonable one. It may even be built into us by millennia of evolutionary history in which eye-witness evidence really was the most reliable. If I see a man in a red woolly hat climbing a drainpipe, you will have a hard time persuading me later that he was actually wearing a blue beret. Our intuitive biases are such that eye-witness evidence trumps all other categories. Yet numerous studies have shown that eye-witnesses, however convinced they may be, however sincere and well-meaning, frequently mis-remember even conspicuous details such as the colour of clothing and the number of assailants present.
When individual identification is important, for instance when a woman who has been raped is called upon to identify her attacker, courts perform a rudimentary statistical test known as the identity parade or line-up. The woman is led past a line of men, one of whom the police suspect on other grounds. The others have been pulled in off the streets or are out-of-work actors, or police officers dressed in plain clothes. If the woman picks out one of these stooges, her identification evidence is discounted. But if she picks out the man the police already suspect, her evidence is taken seriously.
Rightly so. Especially if the number of people in the identity parade is large. We are all statisticians enough to see why this is. The prior suspicion of the police must be open to doubt - otherwise there would be no point in seeking the woman's evidence at all. What impresses us is agreement between the woman's identification and the independent evidence offered by the police. If the identity parade contains only two men, the witness would have a 50 per cent chance of picking the man already suspected by the police, even if she chose at random - or if she were mistaken. Since the police might also be mistaken, this represents an unacceptably high risk of injustice. But if there are 20 men in the line, the woman has only a 1 in 20 chance of choosing, by guesswork or error, the man the police already suspect. The coincidence of her identification and the police's prior suspicion probably really means something. What is going on here is the assessment of coincidence, or the odds that something might happen by chance alone. The probability of meaningless coincidence is even less if the identity parade has 100 men, because a 1 in 100 chance of error is noticeably less than a 1 in 20 chance of error. The longer the line-up, the more secure the eventual conviction.
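The arithmetic of the line-up can be written out as a minimal sketch. It assumes, as the argument above does, that a guessing or mistaken witness is equally likely to pick any man in the line; the function name chance_of_coincidence is invented here purely for illustration.

```python
from fractions import Fraction


def chance_of_coincidence(lineup_size: int) -> Fraction:
    """Chance that a witness choosing at random picks the one man the police
    already suspect. Assumes every man in the line is equally likely to be
    picked (an idealization, not a claim about real witnesses)."""
    if lineup_size < 1:
        raise ValueError("a line-up needs at least one person")
    return Fraction(1, lineup_size)


# The three line-up sizes discussed in the text.
for size in (2, 20, 100):
    p = chance_of_coincidence(size)
    print(f"{size:>3} men in the line: chance of coincidence = {p} ({float(p):.0%})")
```

Under this assumption the chance of a meaningless coincidence is simply one divided by the length of the line, which is why adding men to the parade makes agreement with the police's prior suspicion more impressive.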
We also have an intuitive sense that the men chosen for the line-up must not look too obviously different from the suspect. If the woman originally told the police to look for a man with a beard, and the police have now arrested a bearded suspect, it is clearly unjust to stand him in a line with 19 clean-shaven men. He might as well be standing by himself. Even if the woman has said nothing about the appearance of her attacker, if the police have arrested a punk in a leather jacket it would be wrong to stand him in a line of suited accountants with furled umbrellas. In multiracial countries such considerations have added importance. Everyone understands that a black suspect should not be placed in an otherwise all-white line-up, or vice versa.
When we think about how we identify somebody, the face first leaps to mind. We are particularly good at distinguishing faces. As we shall see in another connection, we even seem to have evolved a special part of the brain set aside for the purpose, and certain kinds of brain damage disable our face-recognition faculty while leaving the rest of vision intact. In any case, faces are good for recognition because they are so variable. With the well-known exception of identical twins, you seldom meet two people whose faces are confusable. It is not totally unknown, however, and an actor can be made up to look very like somebody else. Dictators often employ doubles to perform for them when they are too busy, or to draw the fire of assassins. It has been suggested that one reason charismatic leaders so often sport moustaches (Hitler, Stalin, Franco, Saddam Hussein, Oswald Mosley) is to make it easier for doubles to impersonate them. Mussolini's shaven head perhaps served the same purpose.
Apart from identical twins, ordinary close relatives are sometimes sufficiently alike to fool people who don't know them well. (Unfortunately the story that Doctor Spooner, when Warden of my college, once stopped an undergraduate and said, 'I never can remember is it you or your brother was killed in the war?' is probably not true, like most alleged Spoonerisms.) The resemblance of brothers and sisters, of fathers and sons, of grandparents and grandchildren, serves to remind us of the huge pool of facial variety in the general population of non-relatives.
But faces are only a special case. We are riddled with idiosyncrasies which, with sufficient training, can be used to identify individuals. I had a school friend who claimed (and my spot checks confirmed it) that he could recognize any member of the 80-strong residence in which we lived purely by listening to their footsteps. I had another friend from Switzerland who claimed that when she walked into a room she could tell, by smell, which members of her circle of acquaintances had recently left the room. It is not that her colleagues didn't wash, just that she was unusually sensitive. That this is in principle possible is confirmed by the fact that police dogs can distinguish between any two human beings by smell alone, with the exception, yet again, of identical twins. As far as I know, the police haven't adopted the following technique, but I bet you could train bloodhounds to track down a kidnapped child after giving them a sample sniff of his brother. A way might even be found to use a jury of bloodhounds to decide paternity cases.
Voices are as idiosyncratic as faces, and various research teams are working on computer voice recognition systems for authenticating identity. It would be a great boon if, in the future, we could dispense with front door keys and rely on a voice-operated computer to obey our personal Open Sesame command. Handwriting is sufficiently individual for the written signature to be used as a guarantee of identity on bank cheques and important legal documents. Signatures are actually not particularly secure because they are too easily forged, but it is still impressive how recognizable handwriting can be. A promising newcomer to the list of individual 'signatures' is the iris of the eye. At least one bank is experimenting with automated iris-scanning machines as a way of verifying identity. The customer stands in front of a camera which photographs the eye and digitizes the image into what a newspaper described as 'a 256 byte human barcode'. But none of these methods of verifying human identity even comes close to the potential of DNA fingerprinting, properly applied.
It is not surprising that police dogs can smell the difference between any two humans except identical twins. Our sweat contains a complicated cocktail of proteins, and the precise details of all proteins are minutely specified by the coded DNA instructions that are our genes. Unlike handwriting and faces, which vary continuously and grade smoothly into one another, genes are digital codes, much like those used in computers. Again with the exception of identical twins, we differ genetically from all other people in discrete, discontinuous ways: an exact number of ways that you could even count if you had the patience. The DNA in each one of my cells (give or take a tiny minority of mistakes, and not including red blood cells which have lost all their DNA, or reproductive cells which contain a random half of my genes) is identical to the DNA in all my other cells. It differs from the DNA in every one of your cells, not in some vague, impressionistic way but at a precise number of locations dotted along the billions of DNA letters that we both have.
It is almost impossible to exaggerate the importance of the digital revolution in molecular genetics. Before Watson and Crick's epochal announcement in 1953 of the structure of DNA, it was still possible to agree with the concluding words of Charles Singer's authoritative A Short History of Biology, published in 1931:
. . . despite interpretations to the contrary, the theory of the gene is not a 'mechanist' theory. The gene is no more comprehensible as a chemical or physical entity than is the cell or, for that matter, the organism itself. Further, though the theory speaks in terms of genes as the atomic theory speaks in terms of atoms, it must be remembered that there is a fundamental distinction between the two theories. Atoms exist independently, and their properties as such can be examined. They can even be isolated. Though we cannot see them, we can deal with them under various conditions and in various combinations. We can deal with them individually. Not so the gene. It exists only as a part of the chromosome, and the chromosome only as part of a cell. If I ask for a living chromosome, that is, for the only effective kind of chromosome, no one can give it to me except in its living surroundings any more than he can give me a living arm or leg. The doctrine of the relativity of functions is as true for the gene as it is for any of the organs of the body. They exist and function only in relation to other organs. Thus the last of the biological theories leaves us where the first started, in the presence of a power called life or psyche which is not only of its own kind but unique in each and all of its exhibitions.
This is dramatically, profoundly, hugely wrong. And it really matters. Following Watson and Crick and the revolution that they sparked, a gene can be isolated. It can be purified, bottled, crystallized, read as digitally coded information, printed on a page, fed into a computer, read out again into a test tube and reinserted into an organism where it works exactly as it did before. When the Human Genome Project, which set out to work out the complete gene sequence of a human being, is completed, probably by the year 2005, the full genome will fit comfortably on two standard CD ROM discs, leaving enough space for a textbook of molecular embryology. These two discs could then be sent into outer space, and the human race could go extinct secure in the knowledge that there is now a chance that at some future time and in some distant place, a sufficiently advanced civilization would be able to reconstitute a human being. Meanwhile, back on earth, it is because DNA is deeply and fundamentally digital - because the differences between individuals and between species can be precisely counted, not vaguely and impressionistically measured - that DNA fingerprinting is potentially so powerful.
I assert the uniqueness of each individual's DNA with confidence, but even this is only a statistical judgement. Theoretically, the sexual lottery could throw up the same genetic sequence twice. An 'identical twin' of Isaac Newton could be born tomorrow. But the number of people that would have to be born in order to make this event at all likely would be larger than the number of atoms in the universe. Unlike our face, voice or handwriting, the DNA in most of our cells stays the same from babyhood to old age, and it cannot be altered by training or cosmetic surgery. Our DNA text has such a huge number of letters that we can precisely quantify the expected number shared by, say, brothers or first cousins as opposed to, say, second cousins or random pairs chosen from the population at large. This makes it useful not only for labelling individuals uniquely and matching them to traces such as blood or semen, but for establishing paternity and other genetic relationships.
British law allows people to immigrate if they can prove that their parents are already British citizens. A number of children from the Indian subcontinent have been arrested by sceptical immigration officials. Before the advent of DNA fingerprinting it was often impossible for these unfortunate people to prove their parentage. Now it is easy. All you do is take a sample of blood from the putative parents and compare a particular set of genes with the corresponding set of genes from the child. The verdict is clear and unequivocal, with none of the doubt or fuzziness that creates a need for qualitative judgements. Several young people in Britain today owe their citizenship to DNA technology.
A similar method was used to identify skeletons discovered in Yekaterinburg and suspected of belonging to the executed Russian royal family. Prince Philip, Duke of Edinburgh, whose exact relationship to the Romanovs is known, graciously gave blood, and from this it was possible to establish that the skeletons were indeed those of the Tsar's family. In a more macabre case, a skeleton exhumed in South America was proved to belong to Doctor Josef Mengele, the Nazi war criminal known as the 'Angel of Death'. DNA taken from the bones was compared with blood from Mengele's still-living son, and the identity of the skeleton proved. More recently, a corpse dug up in Berlin has been proved, by the same method, to be that of Martin Bormann, Hitler's deputy, whose disappearance had led to endless legends and rumours and more than 6,000 'sightings' around the world.
Despite the name 'fingerprinting', our DNA, being digital, is even more individually characteristic than the patterns of whorls on our fingers. The name is appropriate because, like true fingerprints, DNA evidence is often inadvertently left behind after a person has departed the scene. DNA can be extracted from a bloodstain on a carpet, from semen inside a rape victim, from a crust of dried nasal mucus on a handkerchief, from sweat or from shed hairs. The DNA in the sample can then be compared with that in the blood taken from a suspect. It is possible to assess, to almost any desired level of probability, whether the sample belongs to a particular person or not.
So, what are the snags? Why is DNA evidence controversial? What is it about this important kind of evidence that makes it possible for lawyers to bamboozle juries into misinterpreting or ignoring it? Why have some courts been moved to the despairing extreme of ruling out this evidence altogether?
There are three major classes of potential problem, one simple, one sophisticated and one silly. I'll come to the silly problem and the more sophisticated difficulties later but first, as with any kind of evidence, there is the simple - and very important - possibility of human error. Possibilities, rather, for there are plenty of opportunities for mistakes and even sabotage. A tube of blood may be mislabelled, either by accident or in a deliberate attempt to frame somebody. A sample from the scene of a crime may be contaminated by sweat from a lab technician or a police officer. The danger of contamination is especially great in those cases where an ingenious technique of amplification called PCR (polymerase chain reaction) is used.
You can easily see why amplification might be desirable. A tiny smear of sweat on a gun butt contains precious little DNA. Sensitive though DNA analysis can be, it needs a certain minimum quantity of material to work on. The technique of PCR, invented in 1983 by the American biochemist Kary B. Mullis, is the dramatically successful answer. PCR takes what little DNA there is and produces millions of copies, multiplying again and again whatever code sequences are there. But, as always with amplification, errors are amplified along with the true signal. Stray scraps of DNA contamination from a technician's sweat are amplified as effectively as the specimen from the scene of the crime, with obvious possibilities for injustice.
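Why amplification cannot tell signal from contamination can be shown with a small sketch. It assumes an idealized PCR in which every cycle exactly doubles whatever DNA is present; real reactions fall short of perfect doubling, and the copy counts below are invented for illustration.

```python
def amplify(copies: int, cycles: int) -> int:
    """Idealized PCR: each cycle doubles the number of copies (an assumption)."""
    return copies * 2 ** cycles


def contaminant_share(specimen: int, contaminant: int, cycles: int) -> float:
    """Fraction of the amplified product that derives from the contaminant."""
    amplified_specimen = amplify(specimen, cycles)
    amplified_contaminant = amplify(contaminant, cycles)
    return amplified_contaminant / (amplified_specimen + amplified_contaminant)


# Hypothetical starting material: 90 copies from the crime scene,
# 10 stray copies from a technician's sweat.
for cycles in (0, 10, 30):
    total = amplify(90, cycles) + amplify(10, cycles)
    share = contaminant_share(90, 10, cycles)
    print(f"after {cycles:>2} cycles: {total} copies, contaminant share {share:.0%}")
```

In this idealized picture the total grows about a millionfold in twenty cycles, but the contaminant's share of the product never shrinks. Amplification multiplies the error along with the true signal.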
But human error is not peculiar to DNA evidence. All kinds of evidence are vulnerable to bungling and sabotage, and must be handled with scrupulous care. The files in a conventional fingerprint library may be mislabelled. The murder weapon may have been touched by innocent people as well as the murderer, and their fingerprints have to be taken, along with the suspect's, for elimination purposes. Courts of law are already accustomed to the need to take all possible precautions against mistakes and they still, sometimes tragically, happen. DNA evidence is not immune to human bungling but nor is it particularly vulnerable, except in so far as PCR amplifies error. If all DNA evidence were to be thrown out because of occasional mistakes, the precedent should rule out most other kinds of evidence, too. We have to suppose that codes of practice and rigorous precautions can be developed to guard against human error in the presentation of all kinds of legal evidence.
The more sophisticated difficulties that bedevil DNA evidence will take longer to explain. They, too, have their precedents in conventional types of evidence, although this point often does not seem to be understood in law courts.
Where identification evidence of any kind is concerned, there are two types of error which correspond to the two types of error in any statistical evidence. In another chapter, we shall call them Type 1 and Type 2 errors, but it is easier to think of them as false positive and false negative. A guilty suspect may escape, through not being recognized - false negative. And - false positive (which most people would see as the more dangerous error) - an innocent suspect may be convicted because he happens, by ill luck, to resemble the genuinely guilty party. In the case of ordinary eye-witness identification, an innocent bystander who happens to look a bit like the real criminal could consequently be arrested - false positive. Identity parades are designed to make this less probable. The chance of a miscarriage of justice is inversely related to the number of people standing in the line-up. The danger can be increased in the ways we have already considered - the line-up being unfairly stacked with clean-shaven men for example.
In the case of DNA evidence the danger of a false positive conviction is theoretically very low indeed. We have a blood sample from a suspect, and we have a specimen from the scene of the crime. If the entire set of genes in both these samples could be written down, the probability of a false conviction is one in billions and billions. Identical twins apart, the chance that any two humans would match all their DNA is tantamount to zero. But unfortunately it is not practical to work out the complete gene sequence of a human being. Even after the Human Genome Project is completed, to attempt the equivalent in the solution of each crime is unrealistic. In practice, forensic detectives concentrate on small sections of the genome, preferably sections that are known to vary in the population. And now our fear must be that, although we could safely rule out mis-identification if the whole genome were considered, there might be a danger of two individuals' being identical with respect to the small portion of DNA that we have time to analyse.
The probability that this would happen ought to be measurable for any particular section of the genome; we could then decide whether it was an acceptable risk. The larger the section of DNA, the smaller the probability of error, just as, in an identity parade, the longer the line-up the safer the conviction. The difference is that an identity parade, in order to compete with the DNA equivalent, would need to contain not a couple of dozen people but thousands, millions or even billions in the line. Apart from this quantitative difference, the analogy with the identity parade continues. We shall see that there is a DNA equivalent of our hypothetical line-up of clean-shaven men with one bearded suspect. But first, a little more background on DNA fingerprinting.
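The comparison between a stretch of DNA and a very long line-up can be made concrete in a rough sketch. It rests on two assumptions that are not in the text: that each sampled region has the same chance of matching between two unrelated people (1 in 10 here, a made-up figure), and that the regions vary independently of one another, so their chances multiply. The function names are hypothetical.

```python
from fractions import Fraction


def coincidental_match_probability(per_region: Fraction, regions: int) -> Fraction:
    """Chance that two unrelated people match at every sampled region.
    Assumes the regions are statistically independent, so that the
    per-region chances simply multiply."""
    return per_region ** regions


def equivalent_lineup_size(per_region: Fraction, regions: int) -> Fraction:
    """Length of identity parade giving the same chance of coincidence."""
    return 1 / coincidental_match_probability(per_region, regions)


per_region = Fraction(1, 10)  # hypothetical per-region match chance
for regions in (1, 3, 6, 9):
    p = coincidental_match_probability(per_region, regions)
    n = equivalent_lineup_size(per_region, regions)
    print(f"{regions} regions: chance {p}, like a line-up of {n} people")
```

With these made-up figures, nine regions already correspond to a parade of a billion people. The independence assumption is the fragile part: if the sampled regions, or the people being compared, are not independent, the true chance of a coincidental match is higher than the product suggests.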
Obviously we sample the equivalent parts of the genome in both suspect and specimen. These parts of the genome are chosen for their tendency to vary widely in the population. A Darwinian would note that the parts that don't vary are often the parts that have an important role to play in the survival of the organism. Any substantial variations in these important genes are likely to have been removed from the population by the death of their possessors - Darwinian natural selection. But there are other parts of the genome that are very variable, perhaps because they are not important for survival. This isn't the whole story because in fact some useful genes are quite variable. The reasons for this are controversial. It's a bit of a digression but . . . What is this life if, full of stress, we have no freedom to digress?
The 'neutralist' school of thought, associated with the distinguished Japanese geneticist Motoo Kimura, believes that useful genes are equally
useful in a variety of different forms. This emphatically does not mean that they are useless, only that the different forms are equally good at what they do. If you think of genes as writing out their recipes in words, the alternative forms of a gene can be thought of as the very same words written in different typefaces: the meaning is the same, and the product of the recipe will come out the same. Genetic changes, 'mutations', that make no difference are not 'seen' by natural selection. They aren't mutations at all, for all the difference they make to the life of the animal, but they are potentially useful mutations from the point of view of the forensic scientist. The population ends up with lots of variety at such a locus (position in a chromosome), and this kind of variety could in principle be used for fingerprinting.
The other theory of variation, opposed to Kimura's neutral theory, believes that the different versions of the genes really do different things and that there is some special reason why both are preserved by natural selection in the population. For example, there might be two alternative forms of a blood protein, α and β, which are susceptible to two infectious diseases called alfluenza and betaccosis respectively, each being immune to the other disease. Typically, an infectious disease needs a critical density of susceptible victims in a population, otherwise an epidemic can't get going. In a population dominated by α types, there are frequent epidemics of alfluenza but not of betaccosis. So natural selection favours the β types who are immune to alfluenza. It favours them so much that after a while they come to dominate the population. Now the tables are turned. There are epidemics of betaccosis, but not of alfluenza. The α types now are favoured by natural selection because they are immune to betaccosis. The population may keep oscillating between α dominance and β dominance, or it may settle down to an intermediate mixture, an 'equilibrium'. Either way, we'll see plenty of variation at the gene locus concerned, and this is good news for the finger-printers. The phenomenon is called 'frequency dependent selection' and it is one suggested reason for high levels of genetic variation in the population. There are others.
However, for our forensic purposes, it matters only that there are variable sections of the genome. Whatever the verdict in the controversy over whether the useful bits of the genome are variable, there are in any case lots of other regions of the genome which are never even read, or never translated into their protein equivalents. Indeed, an astonishingly high proportion of our genes seem to be doing nothing whatsoever. They are therefore free to vary, which makes them excellent DNA fingerprinting material.
As if to confirm the fact that a great deal of DNA is doing nothing useful, the sheer quantity of DNA in the cells of different kinds of organisms is
wildly variable. Since DNA information is digital, we can measure it in
the same kind of units as we measure computer information. One bit of information is enough to specify one yes/no decision: a 1 or a 0, a true or a false. The computer on which I am writing this has 256 megabits (32 megabytes) of core memory. (The first computer that I owned was a
bigger box but had less than one five thousandth of the memory capacity.) The equivalent fundamental unit in DNA is the nucleotide base. Since there are 4 possible bases, the information content of each base is equivalent to 2 bits. The common gut bacterium Escherichia coli has a genome of 4 mega-bases or 8 megabits. The crested newt, Triturus cristatus, has 40,000 megabits. The 5,000-fold ratio between crested
newt and bacterium is about the same as that between my present computer and my first one. We humans have 3,000 mega-bases or 6,000 megabits. This is 750 times as great as the bacterium (which satisfies
our vanity), but what are we to make of the newt trumping us sixfold? We'd like to think that genome size is not strictly proportional to what it does: presumably quite a lot of that newt DNA isn't doing anything. This is certainly true. It is also true of most of our DNA. We know from other evidence that, of the 3,000 mega-base human genome, only about 2 per cent is actually used for coding protein synthesis. The rest is often called junk DNA. Presumably the crested newt has an even higher percentage
of junk DNA. Other newts have not.
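The arithmetic behind these comparisons is easy to check. What follows is only a sketch in Python, not anything from the original argument: the function name `megabits` is invented for illustration, and the genome sizes are simply the round figures quoted above (4 mega-bases for E. coli, 3,000 mega-bases for humans, 40,000 megabits for the crested newt).

```python
import math

# Four possible bases, so each base carries log2(4) = 2 bits.
BITS_PER_BASE = int(math.log2(4))

def megabits(megabases):
    """Convert a genome size in mega-bases to megabits (hypothetical helper)."""
    return megabases * BITS_PER_BASE

# Round figures quoted in the text.
e_coli = megabits(4)       # bacterium: 4 mega-bases
human = megabits(3000)     # human: 3,000 mega-bases
newt = 40000               # crested newt, already given in megabits

print("E. coli megabits:", e_coli)
print("human megabits:", human)
print("human / E. coli:", human / e_coli)
print("newt / E. coli:", newt / e_coli)
print("newt / human:", round(newt / human, 1))
```

On these figures the human genome comes out at 750 times the bacterium's, the newt's at 5,000 times, and the newt beats us by a factor of between six and seven.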
The surplus of unused DNA falls into various categories. Some of it looks like real genetic information, and probably represents old, defunct genes, or out-of-date copies of genes that are still in use. These pseudo-genes would make sense if they were read and translated. But they are not read and translated. Hard disks on computers usually contain comparable junk: old copies of work in progress, scratchpad space used by the computer for interim operations, and so on. We users don't see this junk, because our computers only show us those parts of the disk that we need to know about. But if you get right down and read the actual information on the disk, byte by byte, you'll see the junk, and much of it will make some sort of sense. There are probably dozens of disjointed fragments of this very chapter peppered around my hard disk at present, although there is only one 'official' copy that the computer tells me about (plus a prudent back-up).
In addition to the junk DNA which could be read but isn't, there is plenty of junk DNA which not only isn't read but wouldn't make any sense if it were. There are huge stretches of repeated nonsense, perhaps repeats of one base, or alternations of the same two bases, or repeats of a more complicated pattern. Unlike the other class of junk DNA, we cannot account for these 'tandem repeats' as outdated copies of useful genes. This repetitive DNA has never been decoded, and presumably has never been of any use. (Never useful for the animal's survival, anyway. From
the point of view of the selfish gene, as I explained in another book, we could say that any kind of junk DNA is 'useful' to itself if it just keeps surviving and making more copies of itself. This suggestion has come to be known by the catch-phrase 'selfish DNA', although this is a little unfortunate because, in my original sense, working DNA is selfish too. For this reason, some people have taken to calling it 'ultra-selfish DNA'.)
Anyway, whatever the reason, junk DNA is there, and there in prodigious quantities. Because it is not used, it is free to vary. Useful genes, as we have seen, are severely constrained in their freedom to change. Most changes (mutations) make a gene work less effectively, the animal dies and the change is not passed on. This is what Darwinian natural selection is all about. But mutations in junk DNA (mostly changes in the number of repeats in a given region) are not noticed by natural selection. So, as we look around the population, we find most of the variation that is useful for fingerprinting in the junk regions. As we shall now see, tandem repeats are particularly useful because they vary with respect to number of repeats, a gross feature which is easy to measure.
If it wasn't for this, the forensic geneticist would need to look at the exact sequence of bases in our sample region. This can be done, but sequencing DNA is time-consuming. The tandem repeats allow us to use cunning short-cuts, as discovered by Alec Jeffreys of the University of Leicester, rightly regarded as the father of DNA fingerprinting (and now Sir Alec). Different people have different numbers of tandem repeats in particular places. I might have 147 repeats of a particular piece of nonsense, where you have 84 repeats of the same piece of nonsense in the corresponding place in your genome. In another region, I might have 24 repeats of a particular piece of nonsense to your 38 repeats. Each of us has a characteristic fingerprint consisting of a set of numbers. Each of these numbers in our fingerprint is the number of times a particular piece of nonsense is repeated in our genome.
We get our tandem repeats from our parents. We each have 46 chromosomes, 23 from our father and 23 homologous, or corresponding, chromosomes from our mother. These chromosomes come complete with tandem repeats. Your father got his 46 chromosomes from your paternal grandparents, but he didn't pass them on to you in their entirety. Each of his mother's chromosomes was lined up with its paternal opposite number and bits were exchanged before a composite chromosome was put into the sperm that helped to make you. Every sperm and every egg is unique because it is a different mix of maternal and paternal chromosomes. The mixing process affects the tandem repeat sections as well as the meaningful sections of the chromosomes. So our characteristic numbers of tandem repeats are inherited, in much the same way as our eye colour and hair curliness are inherited. With the difference that, whereas our eye colour results from some kind of joint
verdict of our paternal and our maternal genes, our tandem repeat numbers are properties of the chromosomes themselves and can therefore be measured separately for paternal and maternal chromosomes. At any particular tandem repeat region, each of us has two readings: a paternal chromosome repeat number and a maternal chromosome repeat number. From time to time, chromosomes mutate - suffer a random change - in their tandem repeat numbers. Or a particular tandem region may be split by chromosomal crossing over. This is why there is variation in tandem repeat numbers in the population. The beauty of tandem repeat numbers is that they are easy to measure. You don't have to get embroiled in detailed sequencing of coded DNA bases. You do something a bit like weighing them. Or, to take another equally apt analogy, you spread them out like coloured bands from a prism. I'll explain one way of doing this.
First you need to make some preparations. You make a so-called DNA probe, which is a short sequence of DNA that exactly matches the nonsense sequence in question - up to about 20 nucleotide bases long. This is not difficult to do nowadays. There are several methods. You can even buy a machine off the shelf which makes short DNA sequences to any specification, just as you can buy a keyboard to punch any desired string of letters on a paper tape. By supplying the synthesizing machine with radioactive raw materials, you make the probes themselves radioactive, and so 'label' them. This makes the probes easy to find again later, as natural DNA is not radioactive, and so the two are readily distinguishable from each other.
Radioactive probes are a tool of the trade, which you must have ready before you start a Jeffreys fingerprinting exercise. Another essential tool is the 'restriction enzyme'. Restriction enzymes are chemical tools that specialize in cutting DNA, but cutting it only in particular places. For example, one restriction enzyme may search the length of a chromosome until it finds the sequence GAATTC (G, C, T and A are the four letters of the DNA alphabet; all genes, from all species on earth, differ only in consisting of different sequences of these four letters). Another restriction enzyme cuts the DNA wherever it can find the sequence GCGGCCGC. A number of different restriction enzymes are available in the toolbox of the molecular biologist. They originate from bacteria, who use them for their own defensive purposes. Each restriction enzyme has its own unique search string which it homes in on and cuts.
Now, the trick is to choose a restriction enzyme whose specific search string is completely absent from the tandem repeat we are interested in. The whole length of DNA is therefore chopped into short stretches, bounded by the characteristic search string of the restriction enzyme. Of course, not all the stretches will consist of the tandem repeat we are
looking for. All sorts of other stretches of DNA will happen to be bounded by the favoured search string of the restriction enzyme scissors. But some of them will consist of tandem repeats and the length of each scissored stretch will be largely determined by the number of tandem repeats in it. If I have 147 repeats of a particular piece of DNA nonsense, where you have only 84, my snipped fragments will be correspondingly longer than your snipped fragments.
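The logic of the scissors can be imitated with strings. The following is a toy sketch in Python, with several loudly stated assumptions: the repeat unit 'CA', the flanking letters and the function names are all invented for illustration, the cut is made at the start of each search string (real enzymes cut at a fixed position within or near it), and only one strand is modelled. The search string GAATTC and the repeat counts of 147 and 84 are borrowed from the examples in the text.

```python
SITE = "GAATTC"   # search string of the restriction enzyme, as quoted in the text
UNIT = "CA"       # hypothetical 'nonsense' repeat unit; it never contains SITE

def sample(repeats):
    """Build a made-up stretch of DNA: a tandem repeat flanked by two cut sites."""
    return "TTAC" + SITE + "AC" + UNIT * repeats + "GT" + SITE + "ACTT"

def fragment_lengths(dna, site=SITE):
    """Cut dna at the start of every occurrence of site; return fragment lengths."""
    lengths = []
    start = 0
    pos = dna.find(site)
    while pos != -1:
        if pos > start:
            lengths.append(pos - start)
        start = pos
        pos = dna.find(site, start + 1)
    lengths.append(len(dna) - start)
    return lengths

mine = fragment_lengths(sample(147))
yours = fragment_lengths(sample(84))

# The longest fragment is the one carrying the tandem repeat.
print("my fragments:", mine)
print("your fragments:", yours)
print("difference in repeat-bearing fragment:", max(mine) - max(yours))
```

The two samples yield identical fragments except for the one that carries the repeat, and that one differs in length by exactly the repeat unit's length multiplied by the difference in repeat number. It is this length, not the sequence, that the technique goes on to measure.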
We can measure these characteristic lengths using a technique that has been around in molecular biology for quite a while. This is the bit that is rather like spreading them out with a prism, as Newton did for white light. The standard DNA 'prism' is a gel electrophoresis column, that is, a long tube filled with jelly through which an electric current is passed. A solution containing the scissored stretches of DNA, all jumbled together, is poured into one end of the tube. The DNA fragments, which carry a negative charge, are all electrically attracted to the positive end of the column, which is at the other end of the tube, and they move steadily through the jelly. But they don't all move at the same rate.
Should we feel indignant on Keats's behalf at such a comparison? I do not believe that Keats himself would have done so - Coleridge even less. The 'Ode to a Nightingale' accepts the implication of the drug analogy, makes it wonderfully real. It is not demeaning to human emotion that we try to analyse and explain it, any more than, to a balanced judge, the rainbow is diminished when a prism unweaves it.
In this chapter and the previous one, I have used the barcode as a symbol of precise analysis, in all its beauty. Mixed light is sorted into its rainbow of component colours and everybody sees beauty. That is a first analysis. Closer detail reveals fine lines and a new elegance, the elegance of detection, of the bringing of order and understanding. Fraunhofer barcodes speak to us of the exact elemental nature of distant stars. A precisely measured pattern of stripes is a coded message from across the parsecs. There is grace in the sheer economy of unweaving intimate details about a star which, one had thought, could be found only through the costly undertaking of a journey lasting 3,000 human lifetimes. On another scale, we find a similar story when we look at the formant stripes in speech, the harmonic barcodes of music. There is elegance, too, in the barcodes of dendrochronology: the stripes across ancient Sequoia wood which tell us precisely in which year BC the tree was seeded, and what the weather was like in every one of the intervening years (for weather conditions are what give tree rings their characteristic widths). Like Fraunhofer's lines transmitted across space, tree rings transmit messages to us across time, and again there is a supple economy. It is the power - the fact that we can learn so much by precise analysis of what seems so little information - that gives these unweavings their beauty. The same is true, perhaps even more dramatically, of sound waves in speech and music - barcodes on the air.
Recently we have been hearing much about another kind of barcode - DNA 'fingerprints', barcodes in the blood. DNA barcodes expose and reconstruct details of human affairs that one might have supposed forever inaccessible even to legendarily great detectives. The main practical use of barcodes in the blood so far is in courts of law, and it is to them - and the benefits that a scientific attitude may bring to them - that we turn in the next chapter.
5
BARCODES AT THE BAR
And he said, Woe unto you also, ye lawyers! for ye lade men with burdens grievous to be borne, and ye yourselves touch not the burdens with one of your fingers. . . . Woe unto you, lawyers! for ye have taken away the key of knowledge: ye entered not in yourselves, and them that were entering in ye hindered.
Luke 11
On the face of it, the law may seem about as far as you can get from poetry or the wonder of science. Perhaps there is poetic beauty in the abstract ideas of justice or fairness, but I doubt if many lawyers are moved by it. In any case, that is not what this chapter is about. I shall be looking at an example of the role of science in the law: at a different aspect of science and its importance in society; a sense in which scientific understanding may become a valuable part of good citizenship. In courts of law, juries are increasingly asked to understand evidence which the lawyers themselves may not fully comprehend. Evidence from the unweaving of DNA - what we shall come to see as barcodes in the blood - is the outstanding example, and it is the main subject of this chapter. But it is not just facts about DNA that scientists can contribute. More importantly, it is the underlying theory of probability and statistics; it is scientific ways of making inferences that need to be brought to bear. Such matters stretch beyond the narrow subject of DNA evidence.
I am told on good authority that defence lawyers in the United States sometimes object to jury candidates on the grounds that they have had a scientific education. What can this mean? I would not question the right of defence lawyers to disallow the selection of particular jurors. A juror may be prejudiced against the race or class to which the defendant belongs. It is obviously undesirable that a raving homophobe should try a case of anti-homosexual violence. It is for this kind of reason that defence lawyers in some countries are allowed to cross-examine potential jurors and strike them off the list. In the USA lawyers can be completely blatant about their criteria for jury selection. A colleague tells me of a time when he was up for selection to a jury, on an injury litigation case. The lawyer asked, 'Would anyone here have a problem awarding a substantial amount of money to my client, perhaps in the millions?'
A lawyer can also disqualify a juror without giving reasons. Although this may be just, the only time I have seen it happen it misfired. I was a member of a panel of 24 individuals from which juries of 12 were to be selected. I had already participated in two juries with members of this panel, and I knew their individual foibles. One particular man was cast-iron prosecution fodder; he would take the same hard line almost regardless of the particular case. The defence lawyer waved him through like a breeze. The next one up, a large middle-aged woman, was the opposite: a guaranteed softie, a pure gift to the defence. But her appearance perhaps suggested the opposite, and it was against her that the defence lawyer chose to exercise his right of veto. I have never forgotten the look of wounded hurt on her face as, with a cutting movement of the hand, learned counsel struck her - whom he little knew could have been his secret weapon - out of the jury box.
But, to repeat the astonishing fact, lawyers in the United States have been known to use the following reason for striking down potential jurors: the prospective juror is well educated in science, or has some knowledge of genetics or probability theory. What is the problem? Are geneticists known to harbour deep-seated prejudices against certain sections of society? Are mathematicians especially likely to be of the 'flog 'em . . . string 'em up . . . it's the only language they understand . . . law and
order' persuasion? Of course not. Nobody has ever claimed such a thing.
The lawyers' objections are more ignobly based. There is a new kind of evidence increasingly coming into the criminal courts: evidence from
DNA fingerprinting, and it is extremely powerful. If your client is innocent, DNA evidence may well provide a knock-down convincing way to establish his innocence. Conversely, if he is guilty, DNA evidence has a good chance of establishing his guilt in cases where no other evidence can. DNA evidence is quite hard to understand at the best of times.
There are controversial aspects of it which are even harder. In these circumstances, you would think that an honest lawyer who wishes to see justice done would welcome jurors capable of grasping the arguments. Wouldn't it be an obviously good thing to have at least one or two people in the jury room who can redress the ignorance of their baffled colleagues? What kind of a lawyer is it who prefers a jury incapable of following the case that either attorney is making?
The answer is a lawyer who is more interested in winning than in seeing justice done. A lawyer, in other words. And it seems to be a fact that advocates, of both prosecution and defence, frequently disallow individual jurors specifically because they are educated in science.
Courts of law have always needed to establish individual identity. Was the individual seen hurrying from the scene Richard Dawkins? Is the hat dropped at the scene of the crime his hat? Are those his fingerprints on the weapon? A yes answer to one of these questions does not by itself prove his guilt, but it is certainly an important factor to be taken into account. Most of us, including most jurors and lawyers, have an intuitive sense that there is something specially reliable about eye-witness evidence. In this we are almost certainly wrong, but the error is a pardonable one. It may even be built into us by millennia of evolutionary history in which eye-witness evidence really was the most reliable. If I see a man in a red woolly hat climbing a drainpipe, you will have a hard time persuading me later that he was actually wearing a blue beret. Our intuitive biases are such that eye-witness evidence trumps all other categories. Yet numerous studies have shown that eye-witnesses, however convinced they may be, however sincere and well-meaning, frequently mis-remember even conspicuous details such as the colour of clothing and the number of assailants present.
When individual identification is important, for instance when a woman who has been raped is called upon to identify her attacker, courts perform a rudimentary statistical test known as the identity parade or line-up. The woman is led past a line of men, one of whom the police suspect on other grounds. The others have been pulled in off the streets or are out-of-work actors, or police officers dressed in plain clothes. If the woman picks out one of these stooges, her identification evidence is discounted. But if she picks out the man the police already suspect, her evidence is taken seriously.
Rightly so. Especially if the number of people in the identity parade is large. We are all statisticians enough to see why this is. The prior suspicion of the police must be open to doubt - otherwise there would be no point in seeking the woman's evidence at all. What impresses us is agreement between the woman's identification and the independent evidence offered by the police. If the identity parade contains only two men, the witness would have a 50 per cent chance of picking the man already suspected by the police, even if she chose at random - or if she were mistaken. Since the police might also be mistaken, this represents an unacceptably high risk of injustice. But if there are 20 men in the line, the woman has only a 1 in 20 chance of choosing, by guesswork or error, the man the police already suspect. The coincidence of her identification and the police's prior suspicion probably really means something. What
is going on here is the assessment of coincidence, or the odds that something might happen by chance alone. The probability of meaningless coincidence is even less if the identity parade has 100 men, because a 1 in 100 chance of error is noticeably less than a 1 in 20 chance of error. The longer the line-up, the more secure the eventual conviction.
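The line-up arithmetic can be spelled out in a few lines. This is a minimal sketch in Python, assuming, as the argument above does, that a guessing or mistaken witness is equally likely to pick any member of the parade; the function name `chance_of_coincidence` is invented for illustration.

```python
from fractions import Fraction

def chance_of_coincidence(lineup_size):
    """Probability that a witness choosing at random picks the one man
    the police already suspect (assumes every man is equally likely)."""
    return Fraction(1, lineup_size)

# The three line-up sizes discussed in the text.
for size in (2, 20, 100):
    p = chance_of_coincidence(size)
    print(f"line-up of {size}: chance of meaningless coincidence = {p} ({float(p):.0%})")
```

The equal-likelihood assumption is exactly what a badly composed line-up violates, which is why the make-up of the parade matters as much as its length.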
We also have an intuitive sense that the men chosen for the line-up must not look too obviously different from the suspect. If the woman originally told the police to look for a man with a beard, and the police have now arrested a bearded suspect, it is clearly unjust to stand him in a line with 19 clean-shaven men. He might as well be standing by himself. Even if the woman has said nothing about the appearance of her attacker, if the police have arrested a punk in a leather jacket it would be wrong to stand him in a line of suited accountants with furled umbrellas. In multiracial countries such considerations have added importance. Everyone understands that a black suspect should not be placed in an otherwise all-white line-up, or vice versa.
When we think about how we identify somebody, the face first leaps to mind. We are particularly good at distinguishing faces. As we shall see in another connection, we even seem to have evolved a special part of the brain set aside for the purpose, and certain kinds of brain damage
disable our face-recognition faculty while leaving the rest of vision intact. In any case, faces are good for recognition because they are so variable. With the well-known exception of identical twins, you seldom meet two people whose faces are confusable. It is not totally unknown, however, and an actor can be made up to look very like somebody else. Dictators often employ doubles to perform for them when they are too busy, or to draw the fire of assassins. It has been suggested that one reason charismatic leaders so often sport moustaches (Hitler, Stalin, Franco, Saddam Hussein, Oswald Mosley) is to make it easier for doubles to impersonate them. Mussolini's shaven head perhaps served the same purpose.
Apart from identical twins, ordinary close relatives are sometimes sufficiently alike to fool people who don't know them well. (Unfortunately the story that Doctor Spooner, when Warden of my college, once stopped an undergraduate and said, 'I never can remember, is it you or your brother was killed in the war?' is probably not true, like most alleged Spoonerisms.) The resemblance of brothers and sisters, of fathers and sons, of grandparents and grandchildren, serves to remind us of the huge pool of facial variety in the general population of non-relatives.
But faces are only a special case. We are riddled with idiosyncrasies which, with sufficient training, can be used to identify individuals. I had
a school friend who claimed (and my spot checks confirmed it) that he could recognize any member of the 80-strong residence in which we lived purely by listening to their footsteps. I had another friend from Switzerland who claimed that when she walked into a room she could tell, by smell, which members of her circle of acquaintances had recently left the room. It is not that her colleagues didn't wash, just that she was unusually sensitive. That this is in principle possible is confirmed by the fact that police dogs can distinguish between any two human beings by smell alone, with the exception, yet again, of identical twins. As far as I know, the police haven't adopted the following technique, but I bet you could train bloodhounds to track down a kidnapped child after giving them a sample sniff of his brother. A way might even be found to use a jury of bloodhounds to decide paternity cases.
Voices are as idiosyncratic as faces, and various research teams are working on computer voice recognition systems for authenticating identity. It would be a great boon if, in the future, we could dispense with front door keys and rely on a voice-operated computer to obey our personal Open Sesame command. Handwriting is sufficiently individual for the written signature to be used as a guarantee of identity on bank cheques and important legal documents. Signatures are actually not particularly secure because they are too easily forged, but it is still impressive how recognizable handwriting can be. A promising newcomer
to the list of individual 'signatures' is the iris of the eye. At least one bank is experimenting with automated iris-scanning machines as a way of verifying identity. The customer stands in front of a camera which photographs the eye and digitizes the image into what a newspaper described as 'a 256 byte human barcode'. But none of these methods of verifying human identity even comes close to the potential of DNA fingerprinting, properly applied.
It is not surprising that police dogs can smell the difference between any two humans except identical twins. Our sweat contains a complicated cocktail of proteins, and the precise details of all proteins are minutely specified by the coded DNA instructions that are our genes. Unlike handwriting and faces, which vary continuously and grade smoothly into one another, genes are digital codes, much like those used in computers. Again with the exception of identical twins, we differ genetically from all other people in discrete, discontinuous ways: an exact number of ways that you could even count if you had the patience. The DNA in each one of my cells (give or take a tiny minority of mistakes, and not including red blood cells which have lost all their DNA, or reproductive cells which contain a random half of my genes) is identical to the DNA in all my other cells. It differs from the DNA in every one of your cells, not in some vague, impressionistic way but at a precise number of locations dotted along the billions of DNA letters that we both have.
It is almost impossible to exaggerate the importance of the digital revolution in molecular genetics. Before Watson and Crick's epochal announcement in 1953 of the structure of DNA, it was still possible to agree with the concluding words of Charles Singer's authoritative A Short History of Biology, published in 1931:
. . despite interpretations to the contrary, the theory of the gene is not a 'mechanist' theory. The gene is no more comprehensible as a chemical or physical entity than is the cell or, for that matter, the organism itself. Further, though the theory speaks in terms of genes as the atomic theory speaks in terms of atoms, it must be remembered that there is a fundamental distinction between the two theories. Atoms exist independently, and their properties as such can be examined. They can even be isolated. Though we cannot see them, we can deal with them under various conditions and in various combinations. We can deal with them individually. Not so the gene. It exists only as a part of the chromosome, and the chromosome only as part of a cell. If I ask for a living chromosome, that is, for the only effective kind of chromosome, no one can give it to me except in its living surroundings any more than he can give me a living arm or leg. The doctrine of the relativity of functions is as true for the gene as it is for any of the organs of the body. They exist and function only in relation to other organs. Thus the last of the
biological theories leaves us where the first started, in the presence of a power called life or psyche which is not only of its own kind but unique in each and all of its exhibitions.
This is dramatically, profoundly, hugely wrong. And it really matters. Following Watson and Crick and the revolution that they sparked, a gene can be isolated. It can be purified, bottled, crystallized, read as digitally coded information, printed on a page, fed into a computer, read out again into a test tube and reinserted into an organism where it works exactly as it did before. When the Human Genome Project, which set out to work out the complete gene sequence of a human being, is completed, probably by the year 2005, the full genome will fit comfortably on two standard CD ROM discs, leaving enough space for a textbook of molecular embryology. These two discs could then be sent into outer space, and the human race could go extinct secure in the knowledge that there is now a chance that at some future time and in some distant place, a sufficiently advanced civilization would be able to reconstitute a human being. Meanwhile, back on earth, it is because DNA is deeply and fundamentally digital - because the differences between individuals and between species can be precisely counted, not vaguely and impressionistically measured - that DNA fingerprinting is potentially so powerful.
I assert the uniqueness of each individual's DNA with confidence, but even this is only a statistical judgement. Theoretically, the sexual lottery could throw up the same genetic sequence twice. An 'identical twin' of Isaac Newton could be born tomorrow. But the number of people that would have to be born in order to make this event at all likely would be larger than the number of atoms in the universe. Unlike our face, voice
or handwriting, the DNA in most of our cells stays the same from babyhood to old age, and it cannot be altered by training or cosmetic surgery. Our DNA text has such a huge number of letters that we can precisely quantify the expected number shared by, say, brothers or first cousins as opposed to, say, second cousins or random pairs chosen from the population at large. This makes it useful not only for labelling individuals uniquely and matching them to traces such as blood or semen, but for establishing paternity and other genetic relationships.
British law allows people to immigrate if they can prove that their parents are already British citizens. A number of children from the Indian subcontinent have been arrested by sceptical immigration officials. Before the advent of DNA fingerprinting it was often impossible for these unfortunate people to prove their parentage. Now it is easy. All you do is take a sample of blood from the putative parents and compare a particular set of genes with the corresponding set of genes from the child. The verdict is clear and unequivocal, with none of the doubt or fuzziness that creates a need for qualitative judgements. Several young people in Britain today owe their citizenship to DNA technology.
A similar method was used to identify skeletons discovered in Yekaterinburg and suspected of belonging to the executed Russian royal family. Prince Philip, Duke of Edinburgh, whose exact relationship to the Romanovs is known, graciously gave blood, and from this it was possible to establish that the skeletons were indeed those of the Tsar's family. In a more macabre case, a skeleton exhumed in South America was proved to belong to Doctor Josef Mengele, the Nazi war criminal known as the 'Angel of Death'. DNA taken from the bones was compared with blood from Mengele's still-living son, and the identity of the skeleton proved. More recently, a corpse dug up in Berlin has been proved, by the same method, to be that of Martin Bormann, Hitler's deputy, whose disappearance had led to endless legends and rumours and more than 6,000 'sightings' around the world.
Despite the name 'fingerprinting', our DNA, being digital, is even more individually characteristic than the patterns of whorls on our fingers. The name is appropriate because, like true fingerprints, DNA evidence is often inadvertently left behind after a person has departed the scene. DNA can be extracted from a bloodstain on a carpet, from semen inside a rape victim, from a crust of dried nasal mucus on a handkerchief, from sweat or from shed hairs. The DNA in the sample can then be compared with that in the blood taken from a suspect. It is possible to assess, to almost any desired level of probability, whether the sample belongs to a particular person or not.
So, what are the snags? Why is DNA evidence controversial? What is it about this important kind of evidence that makes it possible for lawyers to bamboozle juries into misinterpreting or ignoring it? Why have some courts been moved to the despairing extreme of ruling out this evidence altogether?
There are three major classes of potential problem, one simple, one sophisticated and one silly. I'll come to the silly problem and the more sophisticated difficulties later but first, as with any kind of evidence, there is the simple - and very important - possibility of human error. Possibilities, rather, for there are plenty of opportunities for mistakes and even sabotage. A tube of blood may be mislabelled, either by accident or in a deliberate attempt to frame somebody. A sample from the scene of a crime may be contaminated by sweat from a lab technician or a police officer. The danger of contamination is especially great in those cases where an ingenious technique of amplification called PCR (polymerase chain reaction) is used.
You can easily see why amplification might be desirable. A tiny smear of sweat on a gun butt contains precious little DNA. Sensitive though DNA analysis can be, it needs a certain minimum quantity of material to work on. The technique of PCR, invented in 1983 by the American biochemist Kary B. Mullis, is the dramatically successful answer. PCR takes what little DNA there is and produces millions of copies, multiplying again and again whatever code sequences are there. But, as always with amplification, errors are amplified along with the true signal. Stray scraps of DNA contamination from a technician's sweat are amplified as effectively as the specimen from the scene of the crime, with obvious possibilities for injustice.
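The arithmetic behind this worry can be sketched in a few lines. The sketch assumes an idealized reaction in which every PCR cycle exactly doubles each DNA molecule present (real reactions fall somewhat short of this), and the starting quantities and the function name `pcr_copies` are invented purely for illustration.

```python
def pcr_copies(initial_copies, cycles):
    """Idealized PCR: each cycle doubles every DNA molecule present."""
    return initial_copies * 2 ** cycles


# Hypothetical starting amounts: 100 molecules from the crime-scene specimen
# and 5 stray molecules from a technician's sweat.
CYCLES = 30
specimen = pcr_copies(100, CYCLES)
contaminant = pcr_copies(5, CYCLES)

# Both are multiplied by the same factor, so the contaminant's share of the
# mixture is just as large after amplification as it was before.
share_before = 5 / (100 + 5)
share_after = contaminant / (specimen + contaminant)

print("specimen copies:   ", specimen)
print("contaminant copies:", contaminant)
print("contaminant share before and after:", share_before, share_after)
```

Under these assumptions amplification makes the contamination no smaller in proportion, while turning it into an amount of DNA that is easily large enough to analyse.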
But human error is not peculiar to DNA evidence. All kinds of evidence are vulnerable to bungling and sabotage, and must be handled with scrupulous care. The files in a conventional fingerprint library may be mislabelled. The murder weapon may have been touched by innocent people as well as the murderer, and their fingerprints have to be taken, along with the suspect's, for elimination purposes. Courts of law are already accustomed to the need to take all possible precautions against mistakes and they still, sometimes tragically, happen. DNA evidence is not immune to human bungling but nor is it particularly vulnerable, except in so far as PCR amplifies error. If all DNA evidence were to be thrown out because of occasional mistakes, the precedent should rule out most other kinds of evidence, too. We have to suppose that codes of practice and rigorous precautions can be developed to guard against human error in the presentation of all kinds of legal evidence.
The more sophisticated difficulties that bedevil DNA evidence will take longer to explain. They, too, have their precedents in conventional types of evidence, although this point often does not seem to be understood in law courts.
Where identification evidence of any kind is concerned, there are two types of error which correspond to the two types of error in any statistical evidence. In another chapter, we shall call them Type 1 and Type 2 errors, but it is easier to think of them as false positive and false negative. A guilty suspect may escape, through not being recognized - false negative. And - false positive (which most people would see as the more dangerous error) - an innocent suspect may be convicted because he happens, by ill luck, to resemble the genuinely guilty party. In the case of ordinary eye-witness identification, an innocent bystander who happens to look a bit like the real criminal could consequently be arrested - false positive. Identity parades are designed to make this less probable. The chance of a miscarriage of justice is inversely related to the number of people standing in the line-up. The danger can be increased in the ways we have already considered - the line-up being unfairly stacked with clean-shaven men for example.
In the case of DNA evidence the danger of a false positive conviction is theoretically very low indeed. We have a blood sample from a suspect, and we have a specimen from the scene of the crime. If the entire set of genes in both these samples could be written down, the probability of a false conviction is one in billions and billions. Identical twins apart, the chance that any two humans would match all their DNA is tantamount to zero. But unfortunately it is not practical to work out the complete gene sequence of a human being. Even after the Human Genome Project is completed, to attempt the equivalent in the solution of each crime is unrealistic. In practice, forensic detectives concentrate on small sections of the genome, preferably sections that are known to vary in the population. And now our fear must be that, although we could safely rule out mis-identification if the whole genome were considered, there might be a danger of two individuals' being identical with respect to the small portion of DNA that we have time to analyse.
The probability that this would happen ought to be measurable for any particular section of the genome; we could then decide whether it was an acceptable risk. The larger the section of DNA, the smaller the probability of error, just as, in an identity parade, the longer the line-up the safer the conviction. The difference is that an identity parade, in order to compete with the DNA equivalent, would need to contain not a couple of dozen people but thousands, millions or even billions in the line. Apart from this quantitative difference, the analogy with the identity parade continues. We shall see that there is a DNA equivalent of our hypothetical line-up of clean-shaven men with one bearded suspect. But first, a little more background on DNA fingerprinting.
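To get a feel for how quickly the risk shrinks as more DNA is examined, here is a minimal sketch. It rests on two loudly stated assumptions that are not in the text: that the regions examined vary independently of one another, and that two unrelated people have a one-in-ten chance of matching at any single region. Both the figure and the function names are hypothetical.

```python
def coincidental_match_probability(per_region, regions):
    """Chance that two unrelated people match at every region examined.

    Assumes the regions are statistically independent, which is a
    simplification made for this illustration.
    """
    return per_region ** regions


def equivalent_lineup_size(per_region, regions):
    """Size of identity parade giving the same chance of a lucky match."""
    return 1 / coincidental_match_probability(per_region, regions)


PER_REGION = 0.1  # hypothetical one-in-ten chance of a match at one region

for regions in (1, 3, 6, 9):
    probability = coincidental_match_probability(PER_REGION, regions)
    lineup = round(equivalent_lineup_size(PER_REGION, regions))
    print(regions, "regions:", probability, "equivalent line-up of", lineup)
```

On these assumptions, six regions already correspond to a line-up of about a million people, and nine to about a billion.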
Obviously we sample the equivalent parts of the genome in both suspect and specimen. These parts of the genome are chosen for their tendency to vary widely in the population. A Darwinian would note that the parts that don't vary are often the parts that have an important role to play in the survival of the organism. Any substantial variations in these important genes are likely to have been removed from the population by the death of their possessors - Darwinian natural selection. But there are other parts of the genome that are very variable, perhaps because they are not important for survival. This isn't the whole story because in fact some useful genes are quite variable. The reasons for this are controversial. It's a bit of a digression but . . . What is this life if, full of stress, we have no freedom to digress?
The 'neutralist' school of thought, associated with the distinguished Japanese geneticist Motoo Kimura, believes that useful genes are equally
useful in a variety of different forms. This emphatically does not mean that they are useless, only that the different forms are equally good at what they do. If you think of genes as writing out their recipes in words, the alternative forms of a gene can be thought of as the very same words written in different typefaces: the meaning is the same, and the product of the recipe will come out the same. Genetic changes, 'mutations', that make no difference are not 'seen' by natural selection. They aren't mutations at all, for all the difference they make to the life of the animal, but they are potentially useful mutations from the point of view of the forensic scientist. The population ends up with lots of variety at such a locus (position in a chromosome), and this kind of variety could in principle be used for fingerprinting.
The other theory of variation, opposed to Kimura's neutral theory, believes that the different versions of the genes really do different things and that there is some special reason why both are preserved by natural selection in the population. For example, there might be two alternative forms of a blood protein, α and β, which are susceptible to two infectious diseases called alfluenza and betaccosis respectively, each being immune to the other disease. Typically, an infectious disease needs a critical density of susceptible victims in a population, otherwise an epidemic can't get going. In a population dominated by α types, there are frequent epidemics of alfluenza but not of betaccosis. So natural selection favours the β types who are immune to alfluenza. It favours them so much that after a while they come to dominate the population. Now the tables are turned. There are epidemics of betaccosis, but not of alfluenza. The α types now are favoured by natural selection because they are immune to betaccosis. The population may keep oscillating between α dominance and β dominance, or it may settle down to an intermediate mixture, an 'equilibrium'. Either way, we'll see plenty of variation at the gene locus concerned, and this is good news for the finger-printers. The phenomenon is called 'frequency dependent selection' and it is one suggested reason for high levels of genetic variation in the population. There are others.
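A toy model makes the logic of frequency dependent selection concrete. Everything in it is an assumption made for illustration: the two forms are treated as simple heritable types, each disease is taken to do damage in direct proportion to how common its susceptible type is, and the severity figure is invented. With these particular assumptions the population settles to the intermediate equilibrium rather than oscillating.

```python
def next_frequency(p_alpha, severity=0.5):
    """Share of alpha types after one generation of selection.

    Assumption: the commoner a type is, the more its own disease spreads,
    so the fitness of each type falls as its frequency rises.
    """
    fitness_alpha = 1 - severity * p_alpha        # alpha types catch alfluenza
    fitness_beta = 1 - severity * (1 - p_alpha)   # beta types catch betaccosis
    mean_fitness = p_alpha * fitness_alpha + (1 - p_alpha) * fitness_beta
    return p_alpha * fitness_alpha / mean_fitness


def simulate(p_alpha, generations):
    """Return the frequency of alpha types in each successive generation."""
    history = [p_alpha]
    for _ in range(generations):
        history.append(next_frequency(history[-1]))
    return history


# Start with a population dominated by alpha types.
history = simulate(0.95, 200)
print([round(p, 3) for p in history[:6]])
print("after 200 generations:", round(history[-1], 3))
```

Neither type is ever eliminated, so variation persists at the locus, which is all the forensic scientist needs.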
However, for our forensic purposes, it matters only that there are variable sections of the genome. Whatever the verdict in the controversy over whether the useful bits of the genome are variable, there are in any case lots of other regions of the genome which are never even read, or never translated into their protein equivalents. Indeed, an astonishingly high proportion of our genes seem to be doing nothing whatsoever. They are therefore free to vary, which makes them excellent DNA fingerprinting material.
As if to confirm the fact that a great deal of DNA is doing nothing useful, the sheer quantity of DNA in the cells of different kinds of organisms is wildly variable. Since DNA information is digital, we can measure it in the same kind of units as we measure computer information. One bit of information is enough to specify one yes/no decision: a 1 or a 0, a true or a false. The computer on which I am writing this has 256 megabits (32 megabytes) of core memory. (The first computer that I owned was a bigger box but had less than one five thousandth of the memory capacity.) The equivalent fundamental unit in DNA is the nucleotide base. Since there are 4 possible bases, the information content of each base is equivalent to 2 bits. The common gut bacterium Escherichia coli has a genome of 4 mega-bases or 8 megabits. The crested newt, Triturus cristatus, has 40,000 megabits. The 5,000-fold ratio between crested newt and bacterium is about the same as that between my present computer and my first one. We humans have 3,000 mega-bases or 6,000 megabits. This is 750 times as great as the bacterium (which satisfies our vanity), but what are we to make of the newt trumping us sixfold? We'd like to think that genome size is not strictly proportional to what it does: presumably quite a lot of that newt DNA isn't doing anything. This is certainly true. It is also true of most of our DNA. We know from other evidence that, of the 3,000 mega-base human genome, only about 2 per cent is actually used for coding protein synthesis. The rest is often called junk DNA. Presumably the crested newt has an even higher percentage of junk DNA. Other newts have not.
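The figures in this comparison can be checked with a few lines of arithmetic. The sketch below uses only numbers quoted in the text (4 and 3,000 mega-bases, 40,000 megabits, 2 per cent); the variable and function names are merely illustrative.

```python
import math

# Four possible bases means log2(4) = 2 bits of information per base.
BITS_PER_BASE = math.log2(4)


def megabits(megabases):
    """Convert a genome size in mega-bases to megabits."""
    return megabases * BITS_PER_BASE


e_coli = megabits(4)       # Escherichia coli: 4 mega-bases
human = megabits(3000)     # human genome: 3,000 mega-bases
newt = 40000               # crested newt, quoted directly in megabits

print("E. coli megabits:", e_coli)
print("human megabits:", human)
print("newt / bacterium:", newt / e_coli)
print("human / bacterium:", human / e_coli)
print("newt / human:", round(newt / human, 2))

# About 2 per cent of the human genome codes for protein.
coding_megabases = 0.02 * 3000
print("coding mega-bases:", coding_megabases)
```

The newt-to-human ratio comes out at a little under seven, which is the 'sixfold' of the text in round numbers.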
The surplus of unused DNA falls into various categories. Some of it looks like real genetic information, and probably represents old, defunct genes, or out-of-date copies of genes that are still in use. These pseudo-genes would make sense if they were read and translated. But they are not read and translated. Hard disks on computers usually contain comparable junk: old copies of work in progress, scratchpad space used by the computer for interim operations, and so on. We users don't see this junk, because our computers only show us those parts of the disk that we need to know about. But if you get right down and read the actual information on the disk, byte by byte, you'll see the junk, and much of it will make some sort of sense. There are probably dozens of disjointed fragments of this very chapter peppered around my hard disk at present, although there is only one 'official' copy that the computer tells me about (plus a prudent back-up).
In addition to the junk DNA which could be read but isn't, there is plenty of junk DNA which not only isn't read but wouldn't make any sense if it were. There are huge stretches of repeated nonsense, perhaps repeats of one base, or alternations of the same two bases, or repeats of a more complicated pattern. Unlike the other class of junk DNA, we cannot account for these 'tandem repeats' as outdated copies of useful genes. This repetitive DNA has never been decoded, and presumably has never been of any use. (Never useful for the animal's survival, anyway. From the point of view of the selfish gene, as I explained in another book, we could say that any kind of junk DNA is 'useful' to itself if it just keeps surviving and making more copies of itself. This suggestion has come to be known by the catch-phrase 'selfish DNA', although this is a little unfortunate because, in my original sense, working DNA is selfish too. For this reason, some people have taken to calling it 'ultra-selfish DNA'.)
Anyway, whatever the reason, junk DNA is there, and there in prodigious quantities. Because it is not used, it is free to vary. Useful genes, as we have seen, are severely constrained in their freedom to change. Most changes (mutations) make a gene work less effectively, the animal dies and the change is not passed on. This is what Darwinian natural selection is all about. But mutations in junk DNA (mostly changes in the number of repeats in a given region) are not noticed by natural selection. So, as we look around the population, we find most of the variation that is useful for fingerprinting in the junk regions. As we shall now see, tandem repeats are particularly useful because they vary with respect to number of repeats, a gross feature which is easy to measure.
If it wasn't for this, the forensic geneticist would need to look at the exact sequence of bases in our sample region. This can be done, but sequencing DNA is time-consuming. The tandem repeats allow us to use cunning short-cuts, as discovered by Alec Jeffreys of the University of Leicester, rightly regarded as the father of DNA fingerprinting (and now Sir Alec). Different people have different numbers of tandem repeats in particular places. I might have 147 repeats of a particular piece of nonsense, where you have 84 repeats of the same piece of nonsense in the corresponding place in your genome. In another region, I might have 24 repeats of a particular piece of nonsense to your 38 repeats. Each of us has a characteristic fingerprint consisting of a set of numbers. Each of these numbers in our fingerprint is the number of times a particular piece of nonsense is repeated in our genome.
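Because a fingerprint of this kind is just a set of numbers, comparing two of them is trivial. In the sketch below the repeat counts 147, 84, 24 and 38 are those of the example above, while the region names, the crime-scene sample and the function name are hypothetical.

```python
# Each fingerprint maps a region of the genome to a number of repeats.
# Region names are invented; the counts are the ones used in the example.
profiles = {
    "me": {"region_1": 147, "region_2": 24},
    "you": {"region_1": 84, "region_2": 38},
}


def matching_people(sample, known_profiles):
    """Names of the people whose repeat numbers equal the sample's everywhere."""
    return [name for name, profile in known_profiles.items() if profile == sample]


# A hypothetical specimen recovered from the scene of a crime.
crime_scene_sample = {"region_1": 84, "region_2": 38}
print(matching_people(crime_scene_sample, profiles))
```

A real comparison has to allow for measurement error and for the two readings each person carries at every region, but the principle of matching lists of numbers is the same.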
We get our tandem repeats from our parents. We each have 46 chromosomes, 23 from our father and 23 homologous, or corresponding, chromosomes from our mother. These chromosomes come complete with tandem repeats. Your father got his 46 chromosomes from your paternal grandparents, but he didn't pass them on to you in their entirety. Each of his mother's chromosomes was lined up with its paternal opposite number and bits were exchanged before a composite chromosome was put into the sperm that helped to make you. Every sperm and every egg is unique because it is a different mix of maternal and paternal chromosomes. The mixing process affects the tandem repeat sections as well as the meaningful sections of the chromosomes. So our characteristic numbers of tandem repeats are inherited, in much the same way as our eye colour and hair curliness are inherited. With the difference that, whereas our eye colour results from some kind of joint
verdict of our paternal and our maternal genes, our tandem repeat numbers are properties of the chromosomes themselves and can therefore be measured separately for paternal and maternal chromosomes. At any particular tandem repeat region, each of us has two readings: a paternal chromosome repeat number and a maternal chromosome repeat number. From time to time, chromosomes mutate - suffer a random change - in their tandem repeat numbers. Or a particular tandem region may be split by chromosomal crossing over. This is why there is variation in tandem repeat numbers in the population. The beauty of tandem repeat numbers is that they are easy to measure. You don't have to get embroiled in detailed sequencing of coded DNA bases. You do something a bit like weighing them. Or, to take another equally apt analogy, you spread them out like coloured bands from a prism. I'll explain one way of doing this.
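The two readings per region are what make parentage testable. As a minimal sketch, assuming no mutation or crossing over within the regions tested (both of which, as just noted, do occasionally happen), a child's pair of repeat numbers at each region must consist of one number from the mother's pair and one from the father's. All the names and numbers below are hypothetical.

```python
def compatible_with_parents(child, mother, father):
    """Check whether a child's readings could come from these two parents.

    Each argument maps a region name to a pair of repeat numbers, one for
    each chromosome. Assumes no mutation in the regions tested.
    """
    for region, (first, second) in child.items():
        maternal_pair = mother[region]
        paternal_pair = father[region]
        # One reading must come from the mother and the other from the father.
        one_way = first in maternal_pair and second in paternal_pair
        other_way = second in maternal_pair and first in paternal_pair
        if not (one_way or other_way):
            return False
    return True


# Invented readings for illustration.
mother = {"region_1": (147, 90), "region_2": (24, 31)}
father = {"region_1": (84, 120), "region_2": (38, 24)}
stranger = {"region_1": (60, 61), "region_2": (12, 15)}
child = {"region_1": (147, 84), "region_2": (31, 38)}

print(compatible_with_parents(child, mother, father))
print(compatible_with_parents(child, mother, stranger))
```

The more regions are tested, the less likely it is that an unrelated man would pass this check by luck.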
First you need to make some preparations. You make a so-called DNA probe, which is a short sequence of DNA that exactly matches the nonsense sequence in question - up to about 20 nucleotide bases long. This is not difficult to do nowadays. There are several methods. You can even buy a machine off the shelf which makes short DNA sequences to any specification, just as you can buy a keyboard to punch any desired string of letters on a paper tape. By supplying the synthesizing machine with radioactive raw materials, you make the probes themselves radioactive, and so 'label' them. This makes the probes easy to find again later, as natural DNA is not radioactive, and so the two are readily distinguishable from each other.
Radioactive probes are a tool of the trade, which you must have ready before you start a Jeffreys fingerprinting exercise. Another essential tool is the 'restriction enzyme'. Restriction enzymes are chemical tools that specialize in cutting DNA, but cutting it only in particular places. For example, one restriction enzyme may search the length of a chromosome until it finds the sequence GAATTC (G, C, T and A are the four letters of the DNA alphabet; all genes, from all species on earth, differ only in consisting of different sequences of these four letters). Another restriction enzyme cuts the DNA wherever it can find the sequence GCGGCCGC. A number of different restriction enzymes are available in the toolbox of the molecular biologist. They originate from bacteria, who use them for their own defensive purposes. Each restriction enzyme has its own unique search string which it homes in on and cuts.
Now, the trick is to choose a restriction enzyme whose specific search string is completely absent from the tandem repeat we are interested in. The whole length of DNA is therefore chopped into short stretches, bounded by the characteristic search string of the restriction enzyme. Of course, not all the stretches will consist of the tandem repeat we are
looking for. All sorts of other stretches of DNA will happen to be bounded by the favoured search string of the restriction enzyme scissors. But some of them will consist of tandem repeats and the length of each scissored stretch will be largely determined by the number of tandem repeats in it. If I have 147 repeats of a particular piece of DNA nonsense, where you have only 84, my snipped fragments will be correspondingly longer than your snipped fragments.
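The scissoring step can be imitated with ordinary string handling. This is only a sketch under simplifying assumptions: the DNA is a made-up text string, the repeated unit 'CA' and the flanking letters are invented, and the enzyme is taken to cut at the very start of its search string GAATTC. The repeat counts 147 and 84 are those used earlier.

```python
SITE = "GAATTC"  # search string of the restriction enzyme, as quoted in the text


def digest(dna, site=SITE):
    """Cut dna at the start of every occurrence of site; return the fragments."""
    fragments = []
    start = 0
    position = dna.find(site)
    while position != -1:
        if position > start:
            fragments.append(dna[start:position])
        start = position
        position = dna.find(site, position + 1)
    fragments.append(dna[start:])
    return fragments


def toy_chromosome(repeat_unit, count):
    """A made-up stretch of DNA: a tandem repeat bounded by two cutting sites."""
    return "CCC" + SITE + "AC" + repeat_unit * count + "TG" + SITE + "GGG"


def longest_fragment(dna):
    """Length of the longest piece left after cutting."""
    return max(len(fragment) for fragment in digest(dna))


# The repeat unit 'CA' does not contain the search string, so the enzyme
# never cuts inside the tandem repeat itself.
mine = longest_fragment(toy_chromosome("CA", 147))
yours = longest_fragment(toy_chromosome("CA", 84))
print(mine, yours, mine - yours)
```

The difference in fragment length is exactly the difference in repeat count multiplied by the length of the repeated unit, which is why measuring length amounts to counting repeats.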
We can measure these characteristic lengths using a technique that has been around in molecular biology for quite a while. This is the bit that is rather like spreading them out with a prism, as Newton did for white light. The standard DNA 'prism' is a gel electrophoresis column, that is, a long tube filled with jelly through which an electric current is passed. A solution containing the scissored stretches of DNA, all jumbled together, is poured into one end of the tube. The DNA fragments, which carry a negative charge, are all electrically attracted to the positive end of the column, which is at the other end of the tube, and they move steadily through the jelly. But they don't all move at the same rate.
