Richard-Dawkins-Unweaving-the-Rainbow
Other newts have not.
The surplus of unused DNA falls into various categories. Some of it looks like real genetic information, and probably represents old, defunct genes, or out-of-date copies of genes that are still in use. These pseudo-genes would make sense if they were read and translated. But they are not read and translated. Hard disks on computers usually contain comparable junk: old copies of work in progress, scratchpad space used by the computer for interim operations, and so on. We users don't see this junk, because our computers only show us those parts of the disk that we need to know about. But if you get right down and read the actual information on the disk, byte by byte, you'll see the junk, and much of it will make some sort of sense. There are probably dozens of disjointed fragments of this very chapter peppered around my hard disk at present, although there is only one 'official' copy that the computer tells me about (plus a prudent back-up).
In addition to the junk DNA which could be read but isn't, there is plenty of junk DNA which not only isn't read but wouldn't make any sense if it were. There are huge stretches of repeated nonsense, perhaps repeats of one base, or alternations of the same two bases, or repeats of a more complicated pattern. Unlike the other class of junk DNA, we cannot account for these 'tandem repeats' as outdated copies of useful genes. This repetitive DNA has never been decoded, and presumably has never been of any use. (Never useful for the animal's survival, anyway. From the point of view of the selfish gene, as I explained in another book, we could say that any kind of junk DNA is 'useful' to itself if it just keeps surviving and making more copies of itself. This suggestion has come to be known by the catch-phrase 'selfish DNA', although this is a little unfortunate because, in my original sense, working DNA is selfish too. For this reason, some people have taken to calling it 'ultra-selfish DNA'.)
Anyway, whatever the reason, junk DNA is there, and there in prodigious quantities. Because it is not used, it is free to vary. Useful genes, as we have seen, are severely constrained in their freedom to change. Most changes (mutations) make a gene work less effectively, the animal dies and the change is not passed on. This is what Darwinian natural selection is all about. But mutations in junk DNA (mostly changes in the number of repeats in a given region) are not noticed by natural selection. So, as we look around the population, we find most of the variation that is useful for fingerprinting in the junk regions. As we shall now see, tandem repeats are particularly useful because they vary with respect to number of repeats, a gross feature which is easy to measure.
If it wasn't for this, the forensic geneticist would need to look at the exact sequence of bases in our sample region. This can be done, but sequencing DNA is time-consuming. The tandem repeats allow us to use cunning short-cuts, as discovered by Alec Jeffreys of the University of Leicester, rightly regarded as the father of DNA fingerprinting (and now Sir Alec). Different people have different numbers of tandem repeats in particular places. I might have 147 repeats of a particular piece of nonsense, where you have 84 repeats of the same piece of nonsense in the corresponding place in your genome. In another region, I might have 24 repeats of a particular piece of nonsense to your 38 repeats. Each of us has a characteristic fingerprint consisting of a set of numbers. Each of these numbers in our fingerprint is the number of times a particular piece of nonsense is repeated in our genome.
We get our tandem repeats from our parents. We each have 46 chromosomes, 23 from our father and 23 homologous, or corresponding, chromosomes from our mother. These chromosomes come complete with tandem repeats. Your father got his 46 chromosomes from your paternal grandparents, but he didn't pass them on to you in their entirety. Each of his mother's chromosomes was lined up with its paternal opposite number and bits were exchanged before a composite chromosome was put into the sperm that helped to make you. Every sperm and every egg is unique because it is a different mix of maternal and paternal chromosomes. The mixing process affects the tandem repeat sections as well as the meaningful sections of the chromosomes. So our characteristic numbers of tandem repeats are inherited, in much the same way as our eye colour and hair curliness are inherited. With the difference that, whereas our eye colour results from some kind of joint
verdict of our paternal and our maternal genes, our tandem repeat numbers are properties of the chromosomes themselves and can therefore be measured separately for paternal and maternal chromosomes. At any particular tandem repeat region, each of us has two readings: a paternal chromosome repeat number and a maternal chromosome repeat number. From time to time, chromosomes mutate - suffer a random change - in their tandem repeat numbers. Or a particular tandem region may be split by chromosomal crossing over. This is why there is variation in tandem repeat numbers in the population. The beauty of tandem repeat numbers is that they are easy to measure. You don't have to get embroiled in detailed sequencing of coded DNA bases. You do something a bit like weighing them. Or, to take another equally apt analogy, you spread them out like coloured bands from a prism. I'll explain one way of doing this.
First you need to make some preparations. You make a so-called DNA probe, which is a short sequence of DNA that exactly matches the nonsense sequence in question - up to about 20 nucleotide bases long. This is not difficult to do nowadays. There are several methods. You can even buy a machine off the shelf which makes short DNA sequences to any specification, just as you can buy a keyboard to punch any desired string of letters on a paper tape. By supplying the synthesizing machine with radioactive raw materials, you make the probes themselves radioactive, and so 'label' them. This makes the probes easy to find again later, as natural DNA is not radioactive, and so the two are readily distinguishable from each other.
Radioactive probes are a tool of the trade, which you must have ready before you start a Jeffreys fingerprinting exercise. Another essential tool is the 'restriction enzyme'. Restriction enzymes are chemical tools that specialize in cutting DNA, but cutting it only in particular places. For example, one restriction enzyme may search the length of a chromosome until it finds the sequence GAATTC (G, C, T and A are the four letters of the DNA alphabet; all genes, from all species on earth, differ only in consisting of different sequences of these four letters). Another restriction enzyme cuts the DNA wherever it can find the sequence GCGGCCGC. A number of different restriction enzymes are available in the toolbox of the molecular biologist. They originate from bacteria, who use them for their own defensive purposes. Each restriction enzyme has its own unique search string which it homes in on and cuts.
Now, the trick is to choose a restriction enzyme whose specific search string is completely absent from the tandem repeat we are interested in. The whole length of DNA is therefore chopped into short stretches, bounded by the characteristic search string of the restriction enzyme. Of course, not all the stretches will consist of the tandem repeat we are
looking for. All sorts of other stretches of DNA will happen to be bounded by the favoured search string of the restriction enzyme scissors. But some of them will consist of tandem repeats and the length of each scissored stretch will be largely determined by the number of tandem repeats in it. If I have 147 repeats of a particular piece of DNA nonsense, where you have only 84, my snipped fragments will be correspondingly longer than your snipped fragments.
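The scissoring step can be sketched as a toy program. Everything here except the search string GAATTC and the repeat counts 147 and 84 is an assumption made for illustration: the three-letter repeat unit, the flanking letters, and the simplification that the enzyme cuts just before each occurrence of its search string.

```python
SITE = "GAATTC"  # the search string mentioned in the text

def digest(dna, site=SITE):
    """Cut dna just before every occurrence of site (a toy model of a restriction enzyme)."""
    fragments = []
    start = 0
    pos = dna.find(site, 1)
    while pos != -1:
        fragments.append(dna[start:pos])
        start = pos
        pos = dna.find(site, pos + 1)
    fragments.append(dna[start:])
    return fragments

def toy_chromosome(repeat_unit, n_repeats):
    """Hypothetical stretch of DNA: a tandem repeat bounded by two search strings."""
    return "AAAA" + SITE + "CCCC" + repeat_unit * n_repeats + "TTTT" + SITE + "GGGG"

# The repeat unit "CAG" is invented; it must not contain the search string.
mine = digest(toy_chromosome("CAG", 147))
yours = digest(toy_chromosome("CAG", 84))

# The middle fragment is the one carrying the tandem repeat.
print(len(mine[1]), len(yours[1]))
```

The two middle fragments differ in length by exactly 63 repeat units, which is the gross difference that the measuring step relies on.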
We can measure these characteristic lengths using a technique that has been around in molecular biology for quite a while. This is the bit that is rather like spreading them out with a prism, as Newton did for white light. The standard DNA 'prism' is a gel electrophoresis column, that is, a long tube filled with jelly through which an electric current is passed. A solution containing the scissored stretches of DNA, all jumbled together, is poured into one end of the tube. The DNA fragments are all electrically attracted to the positive end of the column, which is at the other end of the tube, and they move steadily through the jelly. But they don't all move at the same rate. Like light of low vibration frequency moving through glass, small fragments of DNA move faster than large ones. The result is that, if you switch the current off after a suitable interval, the fragments have spread themselves out along the column, just as Newton's colours spread themselves out because light from the blue end of the spectrum is more readily slowed down by glass than light from the red end.
But so far we can't see the fragments. The jelly column looks uniform all the way down. There is nothing to show that DNA fragments of different size are lurking in discrete bands along its length, and nothing to show which bands contain which variety of tandem repeat. How do we make them visible? This is where the radioactive probes come in.
To make them visible you can use another cunning technique, the Southern blot, named after its inventor, Edwin Southern. (Slightly confusingly, there are other techniques called the Northern blot and the Western blot, but no Mr Northern or Mr Western.) The jelly column is removed from the tube and laid out on blotting paper. The liquid in the jelly, including the DNA fragments, seeps out of the jelly into the blotting paper. The blotting paper has previously been laced with quantities of the radioactive probe for the particular tandem repeat that we are interested in. The probe molecules line up along the blotting paper, pairing precisely, by the ordinary rules of DNA, with their opposite numbers in the tandem repeats. Surplus probe molecules are washed away. Now the only radioactive probe molecules left in the blotting paper are those bound to their exact opposite numbers that seeped out of the jelly. The blotting paper is now placed on a piece of X-ray film, which is then marked by the radioactivity. So, what you see when you develop the
film is a set of dark bands - another barcode. The final barcode pattern that we read on the Southern blot is a fingerprint for a person, in very much the same way as the Fraunhofer lines are a fingerprint for a star, or the formant lines are the fingerprint for a vowel sound. Indeed, the barcode from the blood looks very like Fraunhofer lines or formant lines.
The details of DNA fingerprinting techniques get quite complicated and I won't go much further. For instance, one strategy is to hit the DNA with lots of probes all at the same time. What you get then is a mixed bag of barcode stripes simultaneously. In extreme cases, the stripes merge into each other and all you get is one big smear with all possible sizes of DNA fragment represented somewhere in the genome. This is no good for identification purposes. At the other extreme, people use only one probe at a time looking at one genetic 'locus'. This 'single-locus fingerprinting' gives you nice clean bars like Fraunhofer lines. But only one or two bars per person. Even so, the chances of confusing people are small. This is because the characteristics we are talking about are not like 'brown eyes versus blue eyes', in which case lots of people would be the same. The characteristics we are measuring, remember, are lengths of tandem repeat fragments. The number of possible lengths is very large, so even single-locus fingerprinting is pretty good for identification purposes. Not quite good enough, however, so in practice forensic DNA finger-printers usually use half a dozen separate probes. Now the chances of error are very low indeed. But we still need to talk about exactly how low, because people's lives or liberties might depend upon it.
First, we must return to our distinction between false positives and false negatives. DNA evidence can be used to clear an innocent suspect, or it can be made to point the finger at a guilty one. Suppose semen is recovered from the vagina of a rape victim. Circumstantial evidence leads the police to arrest a man, suspect A. Suspect A gives a blood sample and it is compared to the semen sample, using a single DNA probe to look at one tandem repeat locus. If the two are different, suspect A is in the clear. We don't even need to look at a second locus.
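The exclusion logic can be put as a short sketch. The profile layout, the locus names and the repeat counts are all hypothetical: each locus carries two readings (one per parental chromosome), and the pairs are compared without regard to order because this sketch does not assume we know which reading came from which parent.

```python
# Hypothetical profile: locus name -> the two repeat counts read at that locus.
def compare(sample, suspect):
    """Report the first locus at which the profiles differ, or that none does."""
    shared = sample.keys() & suspect.keys()
    for locus in sorted(shared):
        if sorted(sample[locus]) != sorted(suspect[locus]):
            return f"excluded at {locus}"
    return f"not excluded ({len(shared)} loci match)"

semen = {"locus_1": (147, 24), "locus_2": (38, 84)}
suspect_a = {"locus_1": (24, 147), "locus_2": (38, 84)}
suspect_b = {"locus_1": (147, 30), "locus_2": (38, 84)}

print(compare(semen, suspect_a))
print(compare(semen, suspect_b))
```

A single differing locus is enough to clear a suspect, whereas a match only ever yields 'not excluded', which is why the remaining question is a statistical one.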
But what if suspect A's blood matches the semen sample at this locus? Suppose they both share the same barcode pattern, which we shall call pattern P. This is compatible with the suspect's being guilty, but it doesn't prove it. He could just happen to share pattern P with the real rapist. We must now look at some more loci. If the samples still match, what are the odds against such a match being coincidental - a false positive mis-identification? This is where we have to start thinking statistically about the population at large. In theory, by taking blood from a sample of men in the population at large, we should be able to calculate the likelihood that any two men will be identical at each locus
concerned. But from which section of the population do we draw our sample?
Remember our lone bearded man in the old-fashioned line-up identity parade? Here's the molecular equivalent. Suppose that, in the world at large, only one in a million men has pattern P. Does this mean that there is a million to one chance against a wrongful conviction of suspect A? No. Suspect A may belong to a minority group of people whose ancestors immigrated from a particular part of the world. Local populations often share genetic peculiarities, for the simple reason that they are descended from the same ancestors. Of the 2.5 million South African Dutch, or Afrikaners, most are descended from one shipload of immigrants who arrived from the Netherlands in 1652. As an indicator of the narrowness of this genetic bottleneck, about a million still bear the surnames of 20 of these original settlers. The Afrikaners have a much higher frequency of certain genetic diseases than the population of the world in general. According to one estimate, about 8,000 (one in 300) have the blood condition porphyria variegata, which is much rarer in the rest of the world. This is apparently because they are descended from one particular couple on the ship, Gerrit Jansz and Ariaantje Jacobs, although it is not known which one was the carrier of the (dominant) gene for the condition. (She was one of eight Rotterdam orphanage girls put on the ship to provide wives for the settlers.) In fact, the condition wasn't noticed at all before modern medicine, because its most marked symptom is a lethal reaction to certain modern anaesthetics (South African hospitals now routinely test for the gene before administering anaesthetic). Other populations often have locally high frequencies of other particular genes, for the same kind of reason. If, to return to our hypothetical court case, suspect A and the real criminal both belong to the same minority group, the likelihood of chance confusion could be dramatically greater than you'd think if you based your estimates on the population at large.
The point is that the frequency of pattern P in humans at large is no longer relevant. We need to know the frequency of pattern P in the group to which the suspect belongs.
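A rough calculation shows how much the choice of reference group can matter. The frequencies below are invented for illustration, and multiplying them together assumes that the loci are statistically independent, which is itself an assumption that would need checking in any real population.

```python
from fractions import Fraction
from math import prod

def random_match_probability(pattern_freqs):
    """Chance that an unrelated person matches at every locus.

    Multiplying the per-locus frequencies assumes the loci are independent.
    """
    return prod(pattern_freqs)

# Hypothetical frequencies of the matching pattern at each of six loci.
population_at_large = [Fraction(1, 100)] * 6
suspects_subgroup = [Fraction(1, 10)] * 6

p_large = random_match_probability(population_at_large)
p_sub = random_match_probability(suspects_subgroup)

print(p_large, p_sub, p_sub / p_large)
```

With these made-up figures a coincidental match is a million times more likely inside the subgroup than the population-wide estimate suggests, yet it is still only one in a million, which is why examining enough loci can rescue the method.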
This need is nothing new. We've already seen the equivalent danger in an ordinary line-up identity parade. If the prime suspect is Chinese, it doesn't do to stand him in a line-up largely consisting of westerners. And the same kind of statistical reasoning about the background population is needed in identifying stolen goods, as well as individual suspects. I have already mentioned my jury service in the Oxford Court. In one of the three cases I sat on, a man was accused of stealing three coins from a rival numismatist. The accused had been caught with three coins in his possession which matched those lost. Counsel for the prosecution was eloquent.
Ladies and gentlemen of the jury, are we really supposed to believe that three coins, of exactly the same type as the three missing coins, would just happen to be present in the house of a rival collector? I put it to you that such a coincidence is too much to stomach.
Jurymen are not permitted to cross-examine. That was the duty of counsel for the defence, and he, though doubtless learned in the law and also eloquent, had no more clue about probability theory than the prosecutor. I wish he'd said something like this:
M'Lud, we don't know whether the coincidence is too much to stomach, because m'learned friend has not presented us with any evidence at all as to the rarity or commonness of these three coins in the population at large. If these coins are so rare that only one in a hundred collectors in the country has any one of them, the prosecution has a good case, since the defendant was caught with three of them. If on the other hand, these coins are as common as dirt, there is not enough evidence to convict. (To push to the extreme, three coins that I have in my pocket today, all current legal tender, are very probably the same as three coins in Your Lordship's pocket.)
My point is that it simply never occurred to any of the legally trained minds in the court that it was relevant even to ask how rare these three coins were in the population at large. Lawyers can certainly add up (I once received a lawyer's bill, the last item of which was 'Time spent making out this bill') but probability theory is another matter.
I expect the coins were actually rare. If they hadn't been, the theft would not have been such a serious matter, and the prosecution presumably would never have been brought. But the jury should have been told explicitly. I remember that the question came up in the jury room, and we wished that we were allowed to go back into the court to seek clarification. The equivalent question is equally relevant in the case of DNA evidence, and it is most certainly being asked. Fortunately, provided a sufficient number of separate genetic loci are examined, the chances of mis-identification - even among members of minority groups, even among family members (except identical twins) - can be reduced to genuinely very small levels, far smaller than can be achieved by any other method of identification, including eye-witness evidence.
Exactly how small the residual possibility of error is may still be open to dispute. And this is where we come to the third category of objection to DNA evidence, the just plain silly. Lawyers are accustomed to pouncing when expert witnesses seem to disagree. If two geneticists are summoned to the stand and are asked to estimate the probability of a mis-identification with DNA evidence, the first may say 1,000,000 to one while the second may say only 100,000 to one. Pounce. 'Aha! AHA! The experts disagree! Ladies and gentlemen of the jury, what confidence can we place in a scientific method if the experts themselves can't get within a factor of ten of one another? Obviously the only thing to do is throw the entire evidence out, lock, stock and barrel.'
But, in these cases, although geneticists may be inclined to give different weightings to imponderables such as the racial subgroup effect, any disagreement between them is only over whether the odds against a wrongful identification are hyper-mega-astronomical or just plain astronomical. The odds cannot normally be lower than thousands to one, and they may well be up in the billions. Even on the most conservative estimate, the odds against wrongful identification are hugely greater than they are in an ordinary identity parade. 'M'lud, an identity parade of only 20 men is grossly unfair on my client. I demand a line-up of at least a million men!' Expert statisticians called to give evidence on the likelihood that a conventional 20-man identity parade could yield a false identification would also disagree among themselves. Some would give the simple answer, one in 20. Under cross-examination they would then agree that it could be one in less than 20, depending upon the nature of the variation in the line-up in relation to the features of the suspect (this was the point about the lone bearded man in the line-up). But the one thing all the statisticians would agree upon is that the odds of mis-identification by sheer chance are at least one in 20. Yet lawyers and judges are normally happy to go along with ordinary identity parades in which the suspect stands in a line of only 20 men.
After reporting the throwing out of DNA evidence in a case at London's central criminal court the Old Bailey, the Independent newspaper of 12 December 1992 predicted a consequent flood of appeals. The idea is that everybody at present languishing in jail, as a result of DNA identification evidence, will now be able to appeal, citing the precedent. But the flood may be even greater than the Independent imagines because, if this throwing out of DNA evidence is really a serious precedent for anything, it will cast doubt on all cases in which the odds against a chance mistake are less than thousands to one. If a witness says she 'saw' somebody and identified him in a line-up, lawyers and juries are satisfied. But the odds of mistaken identity when the human eye is involved are far greater than when the identification is done by DNA fingerprinting. If we take the precedent seriously, it ought to mean that every convicted criminal in the country will have excellent cause to appeal on grounds of mistaken identity. Even where a suspect was seen by dozens of witnesses with a smoking gun in his hand, the odds of injustice must be greater than one in 1,000,000. A recent highly publicized case in America, where the jury were systematically confused about DNA evidence, has also become
notorious for another piece of bungled probability theory. The defendant, who was known to have beaten his wife, was on trial for finally murdering her. One of the high-profile defence team, a Harvard professor of law, advanced the following argument: Statistics show that of men who beat their wives, only one in 1,000 go on to kill them.
The inference that any jury might be expected to draw (indeed, were intended to draw) is that the defendant's beating of his wife should be discounted in the murder trial. Doesn't the evidence show overwhelmingly that a wife-beater is unlikely to turn into a wife murderer? Wrong. Doctor I. J. Good, a professor of statistics, wrote to the scientific journal Nature in June 1995 to explode the fallacy. The defence lawyer's argument overlooks the additional fact that wife-killing is rare compared with wife-beating. Good calculated that if you take that minority of wives who are both beaten by their husbands and murdered by somebody, it is very likely indeed that the murderer will be the husband. This is the relevant way to calculate the odds because, in the case under discussion, the unfortunate wife had been murdered by somebody, after being beaten by her husband.
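The shape of the argument can be illustrated with a small conditional-probability sketch. These are not Good's published figures: the one-in-1,000 value is the defence's own statistic, the one-in-10,000 chance of a beaten wife being murdered by somebody else is a purely hypothetical number chosen for illustration, and the two outcomes are treated as mutually exclusive.

```python
from fractions import Fraction

def p_husband_given_murdered(p_killed_by_husband, p_killed_by_other):
    """Among beaten wives who were murdered, the share murdered by the husband."""
    return p_killed_by_husband / (p_killed_by_husband + p_killed_by_other)

p_husband = Fraction(1, 1000)   # the statistic quoted by the defence
p_other = Fraction(1, 10_000)   # hypothetical: murdered by anybody else

posterior = p_husband_given_murdered(p_husband, p_other)
print(posterior, float(posterior))
```

On these assumptions the husband is the murderer in ten cases out of eleven. The one-in-1,000 figure answers a different question, namely how likely a beaten wife is to be murdered at all.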
No doubt there are lawyers, judges and coroners who could benefit from a better understanding of the theory of probability. On some occasions, however, one cannot help suspecting that they understand very well and are feigning incompetence. I do not know if this was so in the case just quoted. The same suspicion is raised by Doctor Theodore Dalrymple, the (London) Spectator's acerbic medical raconteur, in this typically sardonic account, from 7 January 1995, of his being called as an expert witness in a coroner's court:
. . . a wealthy and successful man I knew swallowed 200 tablets and a bottle of rum. The coroner asked me whether I thought he might have taken them by accident. I was about to answer with a ringing and confident no, when the coroner made himself a little clearer: was there even a one in a million chance he had taken them by accident?
'Err, well, I suppose so,' I replied. The coroner (and the man's family) relaxed, an open verdict was returned, the family was £750,000 the richer and an insurance company the poorer by an equivalent sum, at least until it put my premium up.
The power of DNA fingerprinting is an aspect of the general power of science that makes some people fear it. It is important not to exacerbate such fears by claiming too much or trying to move too fast. Let me end this rather technical chapter by returning to society and an important and difficult decision that we must collectively make. I would normally fight shy of discussing a topical issue for fear of going out of date, or a local one for fear of being parochial, but the question of a national DNA database is starting to preoccupy most nations in their different ways, and it is bound to become more pressing in the future.
It would in theory be possible to keep a national database of DNA sequences from every man, woman and child in the country. Then, whenever a sample of blood, semen, saliva, skin or hair was found at the scene of a crime, the police would not have to locate a suspect by other means before comparing his DNA with the sample. They could simply do a computer search of the national database. The very suggestion elicits howls of protest. It would be an infringement of individual liberty. It's the thin end of the wedge. A giant step towards a police state. I have always been a little puzzled about why people automatically react so strongly against suggestions such as these. If I examine the matter dispassionately, I think that, on balance, I come out against it. But it is not something to condemn out of hand without even looking at the pros and cons. So let us do so.
If the information is guaranteed to be used only for catching criminals, it is hard to see why anybody who is not a criminal should object. I am aware that plenty of activists for civil liberties will still object in principle. But I genuinely don't understand why, unless we want to protect the rights of criminals to perform crimes without detection. I also see no good reason against a national database of conventional, ink-pad fingerprints (except the practical one that, unlike with DNA, it is hard to do an automatic computer search of conventional fingerprints). Crime is a serious problem which diminishes the quality of life for everybody except the criminals (perhaps even them: presumably there is nothing to stop a burglar's house being burgled). If a national DNA database would significantly help the police to catch criminals, the objections had better be good ones to outweigh the benefits.
Here's an important caution, though, to begin with. It's one thing to use DNA evidence, or mass-screening identification evidence of any kind, to corroborate a suspicion that the police have already reached on other grounds. It's quite another matter to use it to arrest anybody in the country who matches the sample. If there is a certain low probability of coincidental resemblance between, say, a semen sample and the blood of an innocent individual, the probability that that individual will also be falsely suspected on independent grounds is obviously far lower. So the technique of simply searching the database and arresting the one person who matches the sample is significantly more likely to lead to injustice than a system which requires other grounds for suspicion first. If a sample from the scene of a crime in Edinburgh happens to match my DNA, should the police be allowed to hammer on my door in Oxford and arrest me on no other evidence? I think not, but it is worth remarking that the police already do something equivalent with facial features, when they release to the national newspapers an Identikit picture, or a snapshot taken by a witness, and invite people from all over the country
to telephone them if they 'recognize' the face. Once again, we must beware of our natural tendency to trust facial recognition above all other kinds of individual identification.
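The difference between corroborating a prior suspicion and trawling a database can be given rough numbers. Both figures below are hypothetical (a database of 50 million people and a one-in-a-million chance of coincidental match), and the second function further assumes that the true source is somewhere in the database and that, before the search, everybody in it is equally likely to be the source.

```python
from fractions import Fraction

def expected_innocent_matches(database_size, match_prob):
    """Expected number of people other than the source who match by coincidence."""
    return (database_size - 1) * match_prob

def p_match_is_source(database_size, match_prob):
    """Chance that a matching person picked from the trawl is the true source,
    assuming the source is in the database and all entries are equally likely a priori."""
    return 1 / (1 + expected_innocent_matches(database_size, match_prob))

N = 50_000_000              # hypothetical size of a national database
p = Fraction(1, 1_000_000)  # hypothetical chance of a coincidental match

print(float(expected_innocent_matches(N, p)))
print(float(p_match_is_source(N, p)))
```

On these assumptions a trawl throws up about fifty innocent matches, so a match alone leaves the matching person only about a one-in-fifty chance of being the source, whereas for a man already suspected on independent grounds the chance of a coincidental match remains one in a million.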
Setting crime aside, there is a real danger of the information in the national DNA database falling into the wrong hands. I mean into the hands of those who wish to use it not for catching criminals but for other purposes, perhaps connected with medical insurance or blackmail. There are respectable reasons why people with no criminal intent at all might not wish their DNA profile to be known, and it seems to me that their privacy should be respected. For instance, a significant number of individuals who believe they are the father of a particular child are not. Equally, a significant number of children believe somebody to be their real father who is not. Anyone with access to the national DNA database might discover the truth, and the result could be huge emotional distress, marital breakdown, nervous breakdown, blackmail, or worse. There may be some who feel that the truth should always out, however painful, but I think a good case could be made that the sum total of human happiness would not be enhanced by a sudden outburst of revelations about everybody's true paternity.
Then there are the medical and insurance issues. The whole life insurance business depends upon the inability to forecast exactly when somebody will die. As Sir Arthur Eddington said: 'Human life is proverbially uncertain; few things are more certain than the solvency of a life-insurance company.' We all pay our premiums. Those of us who die later than expected subsidize (the heirs of) those who die earlier than expected. Insurance companies already make statistical guesses which partially subvert the system by enabling them to charge high-risk clients larger premiums. They send a doctor to listen to our hearts, take our blood pressure and investigate our smoking and drinking habits. If actuaries knew exactly when we were all going to die, life insurance would become impossible. In principle, a national DNA database, if actuaries could get their hands on it, might lead us closer to this unfortunate outcome. An extreme could be reached where the only kind of death risk that could be insured against would be pure accident.
Similarly, people screening job applicants, or applicants for places at university, could use DNA information in ways that many of us might find undesirable. Some employers already use dubious methods such as graphology (analysis of handwriting as a supposed guide to character or aptitude). Unlike the case of graphology, there is good reason to think that DNA information might be genuinely useful for judging abilities. But still, I would be one of many who would be disturbed if selection panels made use of DNA information, at least if they did so secretly.
One of the general arguments against national databases of any kind is the 'What if it fell into the hands of a Hitler?' argument. On the face of it, it is not clear how an evil government would benefit from a database of true information about people. They are so adept at using false information, one might say, why should they bother to abuse true information? In the case of Hitler, however, there is the point about his campaign against Jews and others. Although it is not true that you can recognize a Jew from his DNA, there are particular genes which are characteristic of people whose ancestors come from certain regions of, say, central Europe, and there are statistical correlations between possession of certain genes and being Jewish. It seems undeniable that, if Hitler's regime had had a national DNA database at their disposal, they would have found terrible ways to abuse it.
Are there ways to safeguard society from these potential ills, while retaining the benefit of helping to catch criminals? I'm not sure. I think it might be difficult. You could protect honest citizens against insurance companies and employers by restricting the national database to non-coding regions of the genome. The database would refer only to tandem repeat areas of the genome, not genes that actually do anything. This would prevent actuaries working out our life expectancy and talent scouts second-guessing our abilities. But it would do nothing to protect us against discovering (or against blackmailers discovering) truths about paternity that we might prefer not to know. Quite the contrary. The identification of Josef Mengele's bones from his son's blood was entirely based upon tandem repeat DNA. I see no easy answer to this objection except to say that, as DNA testing becomes easier, it will increasingly be possible to discover paternity in any case, without recourse to a national database. A man who suspects that 'his' child is not really his could already take the child's blood and have it compared with his own. He wouldn't need a national database.
Not just in courts of law, the decisions of commissions of inquiry and other bodies charged with discovering what happened in some incident or accident frequently turn upon scientific matters. Scientists are called as expert witnesses on factual matters: on the technicalities of metal fatigue, on the infectivity of mad cow disease, and so on. Then, having delivered their expertise, the scientists are dismissed so those charged with the serious business of actually making the decisions can get on with it. The implication is that scientists are good at discovering detailed facts but others, often lawyers or judges, are better qualified to integrate them and recommend what needs to be done. On the contrary, a good case can be made that scientific ways of thinking are valuable, not just for assembling the detailed facts but for reaching the final verdict. When there has been an air crash, say, or a disastrous football riot, a scientist might be better qualified to chair the inquiry than a judge, not because of what scientists know, but because of the methods they use to find things out and make decisions.
The case of DNA fingerprinting suggests that lawyers would be better lawyers, judges better judges, parliamentarians better parliamentarians and citizens better citizens if they knew more science and, more to the point, if they reasoned more like scientists. This is not only because scientists value reaching the truth above winning a case. Judges, and decision-takers in general, might be better decision-takers if they were more adept in the arts of statistical reasoning and probability assessment. This point will resurface in the next two chapters, which deal with superstition and the so-called paranormal.
6
HOODWINKED WITH FAERY FANCY
Credulity is the man's weakness, but the child's strength. CHARLES LAMB, Essays of Elia (1823)
We have an appetite for wonder, a poetic appetite, which real science ought to be feeding but which is being hijacked, often for monetary gain, by purveyors of superstition, the paranormal and astrology. Resonant phrases like 'the Fourth House of the Age of Aquarius', or 'Neptune went retrograde and moved into Sagittarius' whip up a bogus romance which, to the naive and impressionable, is almost indistinguishable from authentic scientific poetry: 'The Universe is lavish beyond imagining' for example, from Carl Sagan and Ann Druyan's Shadows of Forgotten Ancestors (1992); or, out of the same book (after describing how the solar system condensed out of a spinning disc), 'The disk is rippling with possible futures.' In another book, Carl Sagan remarked,
How is it that hardly any major religion has looked at science and concluded, 'This is better than we thought! The Universe is much bigger than our prophets said, grander, more subtle, more elegant'? Instead they say, 'No, no, no! My god is a little god, and I want him to stay that way.' A religion, old or new, that stressed the magnificence of the Universe as revealed by modern science might be able to draw forth reserves of reverence and awe hardly tapped by the conventional faiths. Pale Blue Dot (1995)
In so far as traditional religions are in decline in the West, their place seems to be taken not by science, with its clearer-sighted, grander vision of the cosmos, so much as by the paranormal and astrology. One might have hoped that by the end of this most scientifically successful of all centuries science would have been incorporated into our culture and our aesthetic sense risen to meet its poetry. Without reviving the mid-century pessimism of C. P. Snow, I reluctantly find that, with only two years to run, these hopes are not realized. Astrology books outsell astronomy books. Television beats a path to the doors of second-rate conjurors masquerading as psychics and clairvoyants. This chapter examines superstition and gullibility, trying to explain them and the ease with which they can be exploited. Chapter 7 then advocates simple statistical thinking as an antidote to the paranormal disease. We begin with astrology.
On 27 December 1997, one of Britain's largest circulation national newspapers, the Daily Mail, devoted its main front-page story to astrology under the banner headline '1998: The Dawn of Aquarius'. One feels almost grateful when the article goes on to concede that the Hale-Bopp comet was not the direct cause of Princess Diana's death. The paper's highly paid astrologer tells us that 'slow-moving, powerful Neptune' is about to join 'forces' with the equally powerful Uranus as it moves into Aquarius. This will have dramatic consequences:
. . . the Sun is rising. And the comet has come to remind us that this Sun is not a physical sun but a spiritual, psychic, inner sun. It does not, therefore, have to obey the law of gravity. It can come over the horizon more swiftly if enough people rise to greet and encourage it. And it can dispel the darkness the moment it appears.
How can people find this meaningless pap appealing, especially in the face of the real universe as revealed by astronomy?
On a moonless night when 'the stars look very cold about the sky', and the only clouds to be seen are the glowing smudges of the Milky Way, go out to a place far from street light pollution, lie on the grass and gaze up at the sky. Superficially you notice constellations, but a constellation's pattern means no more than a patch of damp on the bathroom ceiling. Note, accordingly, how little it means to say something like 'Neptune moves into Aquarius'. Aquarius is a miscellaneous set of stars all at different distances from us which are unconnected with each other except that they constitute a (meaningless) pattern when seen from a certain (not particularly special) place in the galaxy (here). A constellation is not an entity at all, and so not the kind of thing that Neptune, or anything else, can sensibly be said to 'move into'.
The shape of a constellation, moreover, is ephemeral. A million years ago our Homo erectus ancestors gazed out nightly (no light pollution then, unless it came from that species' brilliant innovation, the camp fire) at a set of very different constellations. A million years hence, our descendants will see yet other shapes in the sky and we already know exactly how these will look. This is the sort of detailed prediction that astronomers, but not astrologers, can make. And - again by contrast with astrological predictions - it will be correct.
Because of light's finite speed, when you look at the great galaxy in Andromeda you are seeing it as it was 2.3 million years ago, when Australopithecus stalked the high veldt. You are looking back in time. Shift your eyes a few degrees to the nearest bright star in the constellation of Andromeda and you see Mirach, but much more recently, as it was when Wall Street crashed. The sun, when you witness its colour and shape, is only eight minutes ago. But point a large telescope at the Sombrero galaxy and you behold a trillion suns as they were when your tailed ancestors peered shyly through the canopy and India collided with Asia to raise the Himalayas. A collision on a larger scale, between two galaxies in Stephan's Quintet, is shown to us at a time when on earth dinosaurs were dawning and the trilobites fresh dead.
Name any event in history and you will find a star out there whose light gives you a glimpse of something happening during the year of that event. Provided you are not a very young child, somewhere up in the night sky you can find your personal birth star. Its light is a thermonuclear glow that heralds the year of your birth. Indeed, you can find quite a few such stars (about 40 if you are 40; about 70 if you are 50; about 175 if you are 80 years old). When you look at one of your birth year stars, your telescope is a time machine letting you witness thermonuclear events that are actually taking place during the year you were born. A pleasing conceit, but that is all. Your birth star will not deign to tell anything about your personality, your future or your sexual compatibilities. The stars have larger agendas in which the preoccupations of human pettiness do not figure.
Your birth star, of course, is yours for only this year. Next year you must look to the surface of a larger sphere one light year more distant. Think of this expanding sphere as a radius of good news, the news of your birth broadcast steadily outwards. In the Einsteinian universe in which most physicists now think we live, nothing can in principle travel faster than light. So, if you are 50 years old, you have a personal news bubble of 50 light years' radius. Within that sphere (of a little more than a thousand stars) it is in principle possible (although obviously not in practice) for news of your existence to have permeated. Outside that sphere you might as well not exist; in an Einsteinian sense you do not exist. Older people have larger existence spheres than younger people, but nobody's existence extends to more than a tiny fraction of the universe. The birth of Jesus may seem an ancient and momentous event to us as we reach his second millenary. But the news is so recent on this scale that, even in the most ideal circumstances, it could in principle have been proclaimed to less than one 200 million millionth of the stars in the universe. Many, if not most, of the stars out there will be orbited by planets. The numbers are so vast that probably some of them have life forms, some have evolved intelligence and technology.
Yet the distances and times that separate us are so great that thousands of life forms could independently evolve and go extinct without it being possible for any to know of the existence of any other.
In order to make my calculations about numbers of birth stars, I assumed that the stars are spaced, on average, about 7.6 light years apart. This is approximately true of our local region of the Milky Way galaxy. It seems an astonishingly low density (about 440 cubic light years per star), but it is actually high by comparison with the density of stars in the universe as a whole, where space lies empty between the galaxies. Isaac Asimov has a dramatic illustration: it is as if all the matter of the universe were a single grain of sand, set in the middle of an empty room 20 miles long, 20 miles wide and 20 miles high. Yet, at the same time, it is as if that single grain of sand were pulverized into a thousand million million million fragments, for that is approximately the number of stars in the universe. These are some of the sobering facts of astronomy, and you can see that they are beautiful.
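The birth-star arithmetic can be roughly checked in a few lines of Python. What follows is only a sketch under stated assumptions, not necessarily the calculation used for the figures above: it takes the 7.6 light year spacing from the text, treats each star as occupying a cube of that side (about 440 cubic light years), and counts the stars in a thin shell one light year thick. The function names are invented for illustration.

```python
import math

# Assumption taken from the text: stars are spaced about 7.6 light years apart,
# so each star occupies roughly 7.6**3 (about 440) cubic light years.
SPACING_LY = 7.6
VOLUME_PER_STAR = SPACING_LY ** 3


def stars_within(radius_ly: float) -> float:
    """Estimated number of stars inside a sphere of the given radius."""
    return (4.0 / 3.0) * math.pi * radius_ly ** 3 / VOLUME_PER_STAR


def birth_stars(age_years: float) -> float:
    """Estimated stars in a shell one light year thick, at a distance
    in light years equal to the observer's age (thin-shell approximation)."""
    shell_volume = 4.0 * math.pi * age_years ** 2 * 1.0
    return shell_volume / VOLUME_PER_STAR


if __name__ == "__main__":
    for age in (40, 50, 80):
        print(f"age {age}: roughly {birth_stars(age):.0f} birth stars")
    print(f"stars within 50 light years: roughly {stars_within(50):.0f}")
```

The sketch gives numbers of the same order as those quoted (tens of birth stars at 40 or 50, well over a hundred at 80, and a little more than a thousand stars within 50 light years). Small differences from the quoted figures are to be expected, since the rounding and the exact shell used are assumptions here.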
Astrology, by comparison, is an aesthetic affront. Its pre-Copernican dabblings demean and cheapen astronomy, like using Beethoven for commercial jingles. It is also an insult to the science of psychology and the richness of human personality. I am talking about the facile and potentially damaging way in which astrologers divide humans into 12 categories. Scorpios are cheerful, outgoing types while Leos, with their methodical personalities, go well with Libras (or whatever it is). My wife Lalla Ward recalls an occasion when an American starlet approached the director of the film they were both working on with a 'Gee, Mr Preminger, what sign are you?' and received the immortal rebuff, in a thick Austrian accent, 'I am a Do Not Disturrrb sign.'
Personality is a real phenomenon and psychologists have had some success in developing mathematical models to handle its variation in many dimensions. The initially large number of dimensions can be mathematically collapsed into fewer dimensions with measurable, and for some purposes conscionable, loss in predictive power. These fewer derived dimensions sometimes correspond to the dimensions that we intuitively think we recognize - aggressiveness, obstinacy, affectionateness and so on. Summarizing an individual's personality as a point in multidimensional space is a serviceable approximation whose limitations can be stated. It is a far cry from any mutually exclusive categorization, and certainly far from the preposterous fiction of the newspaper astrologers' 12 dump-bins. It is based upon genuinely relevant data about people themselves, not their birthdays. The psychologist's multidimensional scaling can be useful in deciding whether a person is suited to a particular career, or a proposed couple to each other. The astrologer's 12 pigeonholes are, if nothing worse, a costly and irrelevant distraction. Moreover, they sit oddly with our current strong taboos, and laws, against discrimination. Newspaper readers are schooled to regard themselves and their friends and colleagues as Scorpios or Libras or one of the other 10 mythic 'signs'. If you think about it for a moment, isn't this a form of discriminatory labelling rather like the cultural stereotypes which many of us nowadays find objectionable? I can imagine a Monty Python sketch in which a newspaper publishes a daily column something like this:
Germans: It is in your nature to be hard-working and methodical, which should serve you well at work today. In your personal relationships, especially this evening, you will need to curb your natural tendency to obey orders.
Spaniards: Your Latin hot blood may get the better of you, so beware of doing something you might regret. And lay off the garlic at lunch if you have romantic aspirations in the evening.
Chinese: Inscrutability has many advantages, but it may be your undoing today . . .
British: Your stiff upper lip may serve you well in business dealings, but try to relax and let yourself go in your social life.
And so on through 12 national stereotypes. No doubt the astrology columns are less offensive than this, but we should ask ourselves exactly where the difference lies.
The surplus of unused DNA falls into various categories. Some of it looks like real genetic information, and probably represents old, defunct genes, or out-of-date copies of genes that are still in use. These pseudo-genes would make sense if they were read and translated. But they are not read and translated. Hard disks on computers usually contain comparable junk: old copies of work in progress, scratchpad space used by the computer for interim operations, and so on. We users don't see this junk, because our computers only show us those parts of the disk that we need to know about. But if you get right down and read the actual information on the disk, byte by byte, you'll see the junk, and much of it will make some sort of sense. There are probably dozens of disjointed fragments of this very chapter peppered around my hard disk at present, although there is only one 'official' copy that the computer tells me about (plus a prudent back-up).
In addition to the junk DNA which could be read but isn't, there is plenty of junk DNA which not only isn't read but wouldn't make any sense if it were. There are huge stretches of repeated nonsense, perhaps repeats of one base, or alternations of the same two bases, or repeats of a more complicated pattern. Unlike the other class of junk DNA, we cannot account for these 'tandem repeats' as outdated copies of useful genes. This repetitive DNA has never been decoded, and presumably has never been of any use. (Never useful for the animal's survival, anyway. From the point of view of the selfish gene, as I explained in another book, we could say that any kind of junk DNA is 'useful' to itself if it just keeps surviving and making more copies of itself. This suggestion has come to be known by the catch-phrase 'selfish DNA', although this is a little unfortunate because, in my original sense, working DNA is selfish too. For this reason, some people have taken to calling it 'ultra-selfish DNA'.)
Anyway, whatever the reason, junk DNA is there, and there in prodigious quantities. Because it is not used, it is free to vary. Useful genes, as we have seen, are severely constrained in their freedom to change. Most changes (mutations) make a gene work less effectively, the animal dies and the change is not passed on. This is what Darwinian natural selection is all about. But mutations in junk DNA (mostly changes in the number of repeats in a given region) are not noticed by natural selection. So, as we look around the population, we find most of the variation that is useful for fingerprinting in the junk regions. As we shall now see, tandem repeats are particularly useful because they vary with respect to number of repeats, a gross feature which is easy to measure.
If it wasn't for this, the forensic geneticist would need to look at the exact sequence of bases in our sample region. This can be done, but sequencing DNA is time-consuming. The tandem repeats allow us to use cunning short-cuts, as discovered by Alec Jeffreys of the University of Leicester, rightly regarded as the father of DNA fingerprinting (and now Sir Alec). Different people have different numbers of tandem repeats in particular places. I might have 147 repeats of a particular piece of nonsense, where you have 84 repeats of the same piece of nonsense in the corresponding place in your genome. In another region, I might have 24 repeats of a particular piece of nonsense to your 38 repeats. Each of us has a characteristic fingerprint consisting of a set of numbers. Each of these numbers in our fingerprint is the number of times a particular piece of nonsense is repeated in our genome.
We get our tandem repeats from our parents. We each have 46 chromosomes, 23 from our father and 23 homologous, or corresponding, chromosomes from our mother. These chromosomes come complete with tandem repeats. Your father got his 46 chromosomes from your paternal grandparents, but he didn't pass them on to you in their entirety. Each of his mother's chromosomes was lined up with its paternal opposite number and bits were exchanged before a composite chromosome was put into the sperm that helped to make you. Every sperm and every egg is unique because it is a different mix of maternal and paternal chromosomes. The mixing process affects the tandem repeat sections as well as the meaningful sections of the chromosomes. So our characteristic numbers of tandem repeats are inherited, in much the same way as our eye colour and hair curliness are inherited. With the difference that, whereas our eye colour results from some kind of joint verdict of our paternal and our maternal genes, our tandem repeat numbers are properties of the chromosomes themselves and can therefore be measured separately for paternal and maternal chromosomes. At any particular tandem repeat region, each of us has two readings: a paternal chromosome repeat number and a maternal chromosome repeat number. From time to time, chromosomes mutate - suffer a random change - in their tandem repeat numbers. Or a particular tandem region may be split by chromosomal crossing over. This is why there is variation in tandem repeat numbers in the population. The beauty of tandem repeat numbers is that they are easy to measure. You don't have to get embroiled in detailed sequencing of coded DNA bases. You do something a bit like weighing them. Or, to take another equally apt analogy, you spread them out like coloured bands from a prism. I'll explain one way of doing this.
First you need to make some preparations. You make a so-called DNA probe, which is a short sequence of DNA that exactly matches the nonsense sequence in question - up to about 20 nucleotide bases long. This is not difficult to do nowadays. There are several methods. You can even buy a machine off the shelf which makes short DNA sequences to any specification, just as you can buy a keyboard to punch any desired string of letters on a paper tape. By supplying the synthesizing machine with radioactive raw materials, you make the probes themselves radioactive, and so 'label' them. This makes the probes easy to find again later, as natural DNA is not radioactive, and so the two are readily distinguishable from each other.
Radioactive probes are a tool of the trade, which you must have ready before you start a Jeffreys fingerprinting exercise. Another essential tool is the 'restriction enzyme'. Restriction enzymes are chemical tools that specialize in cutting DNA, but cutting it only in particular places. For example, one restriction enzyme may search the length of a chromosome until it finds the sequence GAATTC (G, C, T and A are the four letters of the DNA alphabet; all genes, from all species on earth, differ only in consisting of different sequences of these four letters). Another restriction enzyme cuts the DNA wherever it can find the sequence GCGGCCGC. A number of different restriction enzymes are available in the toolbox of the molecular biologist. They originate from bacteria, who use them for their own defensive purposes. Each restriction enzyme has its own unique search string which it homes in on and cuts.
Now, the trick is to choose a restriction enzyme whose specific search string is completely absent from the tandem repeat we are interested in. The whole length of DNA is therefore chopped into short stretches, bounded by the characteristic search string of the restriction enzyme. Of course, not all the stretches will consist of the tandem repeat we are looking for. All sorts of other stretches of DNA will happen to be bounded by the favoured search string of the restriction enzyme scissors. But some of them will consist of tandem repeats and the length of each scissored stretch will be largely determined by the number of tandem repeats in it. If I have 147 repeats of a particular piece of DNA nonsense, where you have only 84, my snipped fragments will be correspondingly longer than your snipped fragments.
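The scissoring step can be illustrated with a toy model in Python. This is a minimal sketch and not a description of real laboratory software: the search string GAATTC is the one quoted in the text, but the repeat unit, the flanking sequences and all function names are hypothetical, and the cut is simplified so that the search string itself is discarded rather than split.

```python
SITE = "GAATTC"        # restriction enzyme search string quoted in the text
REPEAT_UNIT = "CAGT"   # hypothetical piece of 'nonsense'; it does not contain SITE


def make_locus(n_repeats: int) -> str:
    """Build a toy locus: cut site, flank, tandem repeats, flank, cut site.

    The flanking sequences are invented purely for illustration.
    """
    return SITE + "ACGTAC" + REPEAT_UNIT * n_repeats + "TTGACA" + SITE


def digest(dna: str, site: str = SITE) -> list[str]:
    """Cut at every occurrence of the site and return the stretches in between.

    Simplification: the site itself is dropped instead of being cut in two.
    """
    return [piece for piece in dna.split(site) if piece]


def fragment_lengths(n_repeats: int) -> list[int]:
    """Lengths, in bases, of the fragments produced from a toy locus."""
    return [len(fragment) for fragment in digest(make_locus(n_repeats))]


if __name__ == "__main__":
    print("147 repeats:", fragment_lengths(147))
    print(" 84 repeats:", fragment_lengths(84))
```

Because the search string never occurs inside the repeat, each locus yields a single fragment whose length grows with the number of repeats, which is the property the fingerprinting method relies upon.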
We can measure these characteristic lengths using a technique that has been around in molecular biology for quite a while. This is the bit that is rather like spreading them out with a prism, as Newton did for white light. The standard DNA 'prism' is a gel electrophoresis column, that is, a long tube filled with jelly through which an electric current is passed. A solution containing the scissored stretches of DNA, all jumbled together, is poured into one end of the tube. The DNA fragments are all electrically attracted to the positive end of the column, which is at the other end of the tube, and they move steadily through the jelly. But they don't all move at the same rate. Like light of low vibration frequency moving through glass, small fragments of DNA move faster than large ones. The result is that, if you switch the current off after a suitable interval, the fragments have spread themselves out along the column, just as Newton's colours spread themselves out because light from the blue end of the spectrum is more readily slowed down by glass than light from the red end.
But so far we can't see the fragments. The jelly column looks uniform all the way down. There is nothing to show that DNA fragments of different size are lurking in discrete bands along its length, and nothing to show which bands contain which variety of tandem repeat. How do we make them visible? This is where the radioactive probes come in.
To make them visible you can use another cunning technique, the Southern blot, named after its inventor, Edwin Southern. (Slightly confusingly, there are other techniques called the Northern blot and the Western blot, but no Mr Northern or Mr Western.) The jelly column is removed from the tube and laid out on blotting paper. The liquid in the jelly, including the DNA fragments, seeps out of the jelly into the blotting paper. The blotting paper has previously been laced with quantities of the radioactive probe for the particular tandem repeat that we are interested in. The probe molecules line up along the blotting paper, pairing precisely, by the ordinary rules of DNA, with their opposite numbers in the tandem repeats. Surplus probe molecules are washed away. Now the only radioactive probe molecules left in the blotting paper are those bound to their exact opposite numbers that seeped out of the jelly. The blotting paper is now placed on a piece of X-ray film, which is then marked by the radioactivity. So, what you see when you develop the film is a set of dark bands - another barcode. The final barcode pattern that we read on the Southern blot is a fingerprint for a person, in very much the same way as the Fraunhofer lines are a fingerprint for a star, or the formant lines are the fingerprint for a vowel sound. Indeed, the barcode from the blood looks very like Fraunhofer lines or formant lines.
The details of DNA fingerprinting techniques get quite complicated and I won't go much further. For instance, one strategy is to hit the DNA with lots of probes all at the same time. What you get then is a mixed bag of barcode stripes simultaneously. In extreme cases, the stripes merge into each other and all you get is one big smear with all possible sizes of DNA fragment represented somewhere in the genome. This is no good for identification purposes. At the other extreme, people use only one probe at a time looking at one genetic 'locus'. This 'single-locus fingerprinting' gives you nice clean bars like Fraunhofer lines. But only one or two bars per person. Even so, the chances of confusing people are small. This is because the characteristics we are talking about are not like 'brown eyes versus blue eyes', in which case lots of people would be the same. The characteristics we are measuring, remember, are lengths of tandem repeat fragments. The number of possible lengths is very large, so even single-locus fingerprinting is pretty good for identification purposes. Not quite good enough, however, so in practice forensic DNA fingerprinters usually use half a dozen separate probes. Now the chances of error are very low indeed. But we still need to talk about exactly how low, because people's lives or liberties might depend upon it.
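Why half a dozen probes help so much can be seen with a little arithmetic. The Python sketch below assumes, purely for illustration, that the loci are statistically independent, so that the chance of a coincidental match at all of them is the product of the chances at each. Neither that independence nor the one-in-ten figure per locus comes from the text; both are hypothetical, and the function name is invented.

```python
from fractions import Fraction
from math import prod


def combined_match_probability(per_locus: list[Fraction]) -> Fraction:
    """Chance that two unrelated people match at every locus.

    Assumes the loci are independent, so the per-locus chances multiply.
    """
    return prod(per_locus, start=Fraction(1))


if __name__ == "__main__":
    # Hypothetical: a one-in-ten chance of a coincidental match at each locus.
    one_probe = [Fraction(1, 10)]
    six_probes = [Fraction(1, 10)] * 6

    for label, loci in (("one probe", one_probe), ("six probes", six_probes)):
        p = combined_match_probability(loci)
        print(f"{label}: one in {p.denominator // p.numerator}")
```

Under these assumptions a single probe leaves a one-in-ten chance of confusion, while six probes leave one in a million. Real loci need not be independent, particularly within closely related groups, so the true figure has to be estimated from population data.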
First, we must return to our distinction between false positives and false negatives. DNA evidence can be used to clear an innocent suspect, or it can be made to point the finger at a guilty one. Suppose semen is recovered from the vagina of a rape victim. Circumstantial evidence leads the police to arrest a man, suspect A. Suspect A gives a blood sample and it is compared to the semen sample, using a single DNA probe to look at one tandem repeat locus. If the two are different, suspect A is in the clear. We don't even need to look at a second locus.
But what if suspect A's blood matches the semen sample at this locus? Suppose they both share the same barcode pattern, which we shall call pattern P. This is compatible with the suspect's being guilty, but it doesn't prove it. He could just happen to share pattern P with the real rapist. We must now look at some more loci. If the samples still match, what are the odds against such a match being coincidental - a false positive mis-identification? This is where we have to start thinking statistically about the population at large. In theory, by taking blood from a sample of men in the population at large, we should be able to calculate the likelihood that any two men will be identical at each locus concerned. But from which section of the population do we draw our sample?
Remember our lone bearded man in the old-fashioned line-up identity parade? Here's the molecular equivalent. Suppose that, in the world at large, only one in a million men has pattern P. Does this mean that there is a million to one chance against a wrongful conviction of suspect A? No. Suspect A may belong to a minority group of people whose ancestors immigrated from a particular part of the world. Local populations often share genetic peculiarities, for the simple reason that they are descended from the same ancestors. Of the 2.5 million South African Dutch, or Afrikaners, most are descended from one shipload of immigrants who arrived from the Netherlands in 1652. As an indicator of the narrowness of this genetic bottleneck, about a million still bear the surnames of 20 of these original settlers. The Afrikaners have a much higher frequency of certain genetic diseases than the population of the world in general. According to one estimate, about 8,000 (one in 300) have the blood condition porphyria variegata, which is much rarer in the rest of the world. This is apparently because they are descended from one particular couple on the ship, Gerrit Jansz and Ariaantje Jacobs, although it is not known which one was the carrier of the (dominant) gene for the condition. (She was one of eight Rotterdam orphanage girls put on the ship to provide wives for the settlers.) In fact, the condition wasn't noticed at all before modern medicine, because its most marked symptom is a lethal reaction to certain modern anaesthetics (South African hospitals now routinely test for the gene before administering anaesthetic). Other populations often have locally high frequencies of other particular genes, for the same kind of reason. If, to return to our hypothetical court case, suspect A and the real criminal both belong to the same minority group, the likelihood of chance confusion could be dramatically greater than you'd think if you based your estimates on the population at large.
The point is that the frequency of pattern P in humans at large is no longer relevant. We need to know the frequency of pattern P in the group to which the suspect belongs.
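The difference the reference group makes can be put into numbers. The Python sketch below is an illustration only. The one-in-a-million world frequency is the supposition used in the text; the one-in-300 subgroup frequency is borrowed from the porphyria example merely to show the scale such local frequencies can reach, and the pool of 10,000 plausible suspects is entirely hypothetical, as is the function name.

```python
from fractions import Fraction


def expected_coincidental_matches(frequency: Fraction, pool_size: int) -> Fraction:
    """Expected number of men in a pool who carry the pattern by chance alone."""
    return frequency * pool_size


if __name__ == "__main__":
    world_frequency = Fraction(1, 1_000_000)  # supposition used in the text
    subgroup_frequency = Fraction(1, 300)     # hypothetical, on the scale of the porphyria example
    pool = 10_000                             # hypothetical number of plausible suspects

    for label, freq in (("world", world_frequency), ("subgroup", subgroup_frequency)):
        expected = expected_coincidental_matches(freq, pool)
        print(f"{label} frequency: about {float(expected):.2f} chance matches in the pool")
```

With the world frequency, a chance match among the 10,000 is very unlikely (an expectation of one hundredth of a man). With the subgroup frequency, around 33 men in the same pool would be expected to share the pattern, so a match on its own would prove very little.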
This need is nothing new. We've already seen the equivalent danger in an ordinary line-up identity parade. If the prime suspect is Chinese, it doesn't do to stand him in a line-up largely consisting of westerners. And the same kind of statistical reasoning about the background population is needed in identifying stolen goods, as well as individual suspects. I have already mentioned my jury service in the Oxford Court. In one of the three cases I sat on, a man was accused of stealing three coins from a rival numismatist. The accused had been caught with three coins in his possession which matched those lost. Counsel for the prosecution was eloquent.
Ladies and gentlemen of the jury, are we really supposed to believe that three coins, of exactly the same type as the three missing coins, would just happen to be present in the house of a rival collector? I put it to you that such a coincidence is too much to stomach.
Jurymen are not permitted to cross-examine. That was the duty of counsel for the defence, and he, though doubtless learned in the law and also eloquent, had no more clue about probability theory than the prosecutor. I wish he'd said something like this:
M'Lud, we don't know whether the coincidence is too much to stomach, because m'learned friend has not presented us with any evidence at all as to the rarity or commonness of these three coins in the population at large. If these coins are so rare that only one in a hundred collectors in the country has any one of them, the prosecution has a good case, since the defendant was caught with three of them. If, on the other hand, these coins are as common as dirt, there is not enough evidence to convict. (To push to the extreme, three coins that I have in my pocket today, all current legal tender, are very probably the same as three coins in Your Lordship's pocket.)
My point is that it simply never occurred to any of the legally trained minds in the court that it was relevant even to ask how rare these three coins were in the population at large. Lawyers can certainly add up (I once received a lawyer's bill, the last item of which was 'Time spent making out this bill') but probability theory is another matter.
I expect the coins were actually rare. If they hadn't been, the theft would not have been such a serious matter, and the prosecution presumably would never have been brought. But the jury should have been told explicitly. I remember that the question came up in the jury room, and we wished that we were allowed to go back into the court to seek clarification. The equivalent question is equally relevant in the case of DNA evidence, and it is most certainly being asked. Fortunately, provided a sufficient number of separate genetic loci are examined, the chances of mis-identification - even among members of minority groups, even among family members (except identical twins) - can be reduced to genuinely very small levels, far smaller than can be achieved by any other method of identification, including eye-witness evidence.
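The arithmetic behind the defence's hypothetical, and behind the point about examining several loci, can be sketched in a few lines. This is only an illustration: the function name is invented here, the figures for common coins and for the loci are made up, and the sketch assumes that each coin type (or each locus) matches independently of the others, which is precisely the assumption that the subgroup objection puts under strain.

```python
from math import prod

def chance_match_probability(frequencies):
    """Probability that one random person matches on every item by sheer
    coincidence, ASSUMING the items are statistically independent."""
    return prod(frequencies)

# Rare coins, as in the speech: one collector in a hundred owns any given one.
rare = chance_match_probability([1 / 100] * 3)

# Common coins: HYPOTHETICAL figure of nine collectors in ten owning each.
common = chance_match_probability([0.9] * 3)

# Ten HYPOTHETICAL genetic loci, each pattern shared by one person in ten.
loci = chance_match_probability([0.1] * 10)

print(f"rare coins:   about one in {1 / rare:,.0f}")
print(f"common coins: {common:.3f}")
print(f"ten loci:     about one in {1 / loci:,.0f}")
```

With rare coins a coincidental triple match is roughly a one-in-a-million affair; with common coins it is more likely than not. Multiplying modest per-locus frequencies across many loci is what drives the DNA figure down so far.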
Exactly how small the residual possibility of error is may still be open to dispute. And this is where we come to the third category of objection to DNA evidence, the just plain silly. Lawyers are accustomed to pouncing when expert witnesses seem to disagree. If two geneticists are summoned to the stand and are asked to estimate the probability of a mis-identification with DNA evidence, the first may say 1,000,000 to one while the second may say only 100,000 to one. Pounce. 'Aha! AHA! The experts disagree! Ladies and gentlemen of the jury, what confidence can we place in a scientific method if the experts themselves can't get within a factor of ten of one another? Obviously the only thing to do is throw the entire evidence out, lock, stock and barrel.'
But, in these cases, although geneticists may be inclined to give different weightings to imponderables such as the racial subgroup effect, any disagreement between them is only over whether the odds against a wrongful identification are hyper-mega-astronomical or just plain astronomical. The odds cannot normally be lower than thousands to one, and they may well be up in the billions. Even on the most conservative estimate, the odds against wrongful identification are hugely greater than they are in an ordinary identity parade. 'M'lud, an identity parade of only 20 men is grossly unfair on my client. I demand a line-up of at least a million men!' Expert statisticians called to give evidence on the likelihood that a conventional 20-man identity parade could yield a false identification would also disagree among themselves. Some would give the simple answer, one in 20. Under cross-examination they would then agree that it could be one in less than 20, depending upon the nature of the variation in the line-up in relation to the features of the suspect (this was the point about the lone bearded man in the line-up). But the one thing all the statisticians would agree upon is that the odds of mis-identification by sheer chance are at least one in 20. Yet lawyers and judges are normally happy to go along with ordinary identity parades in which the suspect stands in a line of only 20 men.
After reporting the throwing out of DNA evidence in a case at London's central criminal court the Old Bailey, the Independent newspaper of 12 December 1992 predicted a consequent flood of appeals. The idea is that everybody at present languishing in jail, as a result of DNA identification evidence, will now be able to appeal, citing the precedent. But the flood may be even greater than the Independent imagines because, if this throwing out of DNA evidence is really a serious precedent for anything, it will cast doubt on all cases in which the odds against a chance mistake are less than thousands to one. If a witness says she 'saw' somebody and identified him in a line-up, lawyers and juries are satisfied. But the odds of mistaken identity when the human eye is involved are far greater than when the identification is done by DNA fingerprinting. If we take the precedent seriously, it ought to mean that every convicted criminal in the country will have excellent cause to appeal on grounds of mistaken identity. Even where a suspect was seen by dozens of witnesses with a smoking gun in his hand, the odds of injustice must be greater than one in 1,000,000.
A recent highly publicized case in America, where the jury were systematically confused about DNA evidence, has also become notorious for another piece of bungled probability theory. The defendant, who was known to have beaten his wife, was on trial for finally murdering her. One of the high-profile defence team, a Harvard professor of law, advanced the following argument: Statistics show that of men who beat their wives, only one in 1,000 go on to kill them.
The inference that any jury might be expected to draw (indeed, were intended to draw) is that the defendant's beating of his wife should be discounted in the murder trial. Doesn't the evidence show overwhelmingly that a wife-beater is unlikely to turn into a wife murderer? Wrong. Doctor I. J. Good, a professor of statistics, wrote to the scientific journal Nature in June 1995 to explode the fallacy. The defence lawyer's argument overlooks the additional fact that wife-killing is rare compared with wife-beating. Good calculated that if you take that minority of wives who are both beaten by their husbands and murdered by somebody, it is very likely indeed that the murderer will be the husband. This is the relevant way to calculate the odds because, in the case under discussion, the unfortunate wife had been murdered by somebody, after being beaten by her husband.
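The shape of this calculation can be sketched as follows. The one-in-1,000 figure is the defence's own; the rate at which beaten wives are murdered by somebody other than the husband is an invented number, chosen only to show how the conditioning works, and is not claimed to be the figure Good used. The function name is likewise hypothetical.

```python
def p_husband_given_murdered(p_killed_by_husband, p_killed_by_other):
    """Among beaten wives who were murdered by somebody, the fraction
    murdered by the husband (the two routes are treated as exclusive)."""
    return p_killed_by_husband / (p_killed_by_husband + p_killed_by_other)

# The defence's statistic: one beaten wife in 1,000 is killed by her husband.
p_husband = 1 / 1000

# ASSUMPTION, for illustration only: one beaten wife in 10,000 is murdered
# by someone other than her husband.
p_other = 1 / 10_000

posterior = p_husband_given_murdered(p_husband, p_other)
print(f"P(husband did it | beaten and murdered) = {posterior:.2f}")
```

On these illustrative numbers the husband is the murderer in about ten cases out of eleven. The one-in-1,000 figure answers a different question (will a beaten wife be killed?) from the one the court faced (given that she has been killed, by whom?).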
No doubt there are lawyers, judges and coroners who could benefit from a better understanding of the theory of probability. On some occasions, however, one cannot help suspecting that they understand very well and are feigning incompetence. I do not know if this was so in the case just quoted. The same suspicion is raised by Doctor Theodore Dalrymple, the (London) Spectator's acerbic medical raconteur, in this typically sardonic account, from 7 January 1995, of his being called as an expert witness in a coroner's court:
. . . a wealthy and successful man I knew swallowed 200 tablets and a bottle of rum. The coroner asked me whether I thought he might have taken them by accident. I was about to answer with a ringing and confident no, when the coroner made himself a little clearer: was there even a one in a million chance he had taken them by accident?
'Err, well, I suppose so,' I replied. The coroner (and the man's family) relaxed, an open verdict was returned, the family was £750,000 the richer and an insurance company the poorer by an equivalent sum, at least until it put my premium up.
The power of DNA fingerprinting is an aspect of the general power of science that makes some people fear it. It is important not to exacerbate such fears by claiming too much or trying to move too fast. Let me end this rather technical chapter by returning to society and an important and difficult decision that we must collectively make. I would normally fight shy of discussing a topical issue for fear of going out of date, or a local one for fear of being parochial, but the question of a national DNA database is starting to preoccupy most nations in their different ways, and it is bound to become more pressing in the future.
It would in theory be possible to keep a national database of DNA sequences from every man, woman and child in the country. Then, whenever a sample of blood, semen, saliva, skin or hair was found at the scene of a crime, the police would not have to locate a suspect by other means before comparing his DNA with the sample. They could simply do a computer search of the national database. The very suggestion elicits howls of protest. It would be an infringement of individual liberty. It's the thin end of the wedge. A giant step towards a police state. I have always been a little puzzled about why people automatically react so strongly against suggestions such as these. If I examine the matter dispassionately, I think that, on balance, I come out against it. But it is not something to condemn out of hand without even looking at the pros and cons. So let us do so.
If the information is guaranteed to be used only for catching criminals, it is hard to see why anybody who is not a criminal should object. I am aware that plenty of activists for civil liberties will still object in principle. But I genuinely don't understand why, unless we want to protect the rights of criminals to perform crimes without detection. I also see no good reason against a national database of conventional, ink-pad fingerprints (except the practical one that, unlike with DNA, it is hard to do an automatic computer search of conventional fingerprints). Crime is a serious problem which diminishes the quality of life for everybody except the criminals (perhaps even them: presumably there is nothing to stop a burglar's house being burgled). If a national DNA database would significantly help the police to catch criminals, the objections had better be good ones to outweigh the benefits.
Here's an important caution, though, to begin with. It's one thing to use DNA evidence, or mass-screening identification evidence of any kind, to corroborate a suspicion that the police have already reached on other grounds. It's quite another matter to use it to arrest anybody in the country who matches the sample. If there is a certain low probability of coincidental resemblance between, say, a semen sample and the blood of an innocent individual, the probability that that individual will also be falsely suspected on independent grounds is obviously far lower. So the technique of simply searching the database and arresting the one person who matches the sample is significantly more likely to lead to injustice than a system which requires other grounds for suspicion first. If a sample from the scene of a crime in Edinburgh happens to match my DNA, should the police be allowed to hammer on my door in Oxford and arrest me on no other evidence? I think not, but it is worth remarking that the police already do something equivalent with facial features, when they release to the national newspapers an Identikit picture, or a snapshot taken by a witness, and invite people from all over the country to telephone them if they 'recognize' the face. Once again, we must beware of our natural tendency to trust facial recognition above all other kinds of individual identification.
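The difference between trawling a database and corroborating an existing suspicion can be put in rough numbers. Everything in the sketch below is hypothetical: the population size, the coincidental match rate, the chance of an innocent person being independently suspected, and the function names. It also assumes that matching and being suspected are independent events.

```python
def expected_innocent_matches(database_size, match_probability):
    """Expected number of innocent people in the database who match the
    crime-scene sample by coincidence."""
    return database_size * match_probability

def p_match_and_suspected(match_probability, p_independently_suspected):
    """Chance that one given innocent person both matches the sample and is
    suspected on other grounds, ASSUMING the two events are independent."""
    return match_probability * p_independently_suspected

# ALL FIGURES ARE HYPOTHETICAL, chosen only to show the orders of magnitude.
population = 50_000_000          # everyone on a national database
p_match = 1 / 1_000_000          # coincidental match rate per person
p_suspected = 1 / 10_000         # chance an innocent is suspected anyway

trawl = expected_innocent_matches(population, p_match)
corroborated = population * p_match_and_suspected(p_match, p_suspected)

print(f"innocent matches expected from a database trawl: {trawl:.0f}")
print(f"innocents expected to match AND be suspected:    {corroborated:.3f}")
```

Even a one-in-a-million match rate, applied to tens of millions of records, is expected to throw up dozens of innocent matches, whereas requiring independent grounds for suspicion cuts the expected number of wrongly implicated people by several orders of magnitude.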
Setting crime aside, there is a real danger of the information in the national DNA database falling into the wrong hands. I mean into the hands of those who wish to use it not for catching criminals but for other purposes, perhaps connected with medical insurance or blackmail. There are respectable reasons why people with no criminal intent at all might not wish their DNA profile to be known, and it seems to me that their privacy should be respected. For instance, a significant number of individuals who believe they are the father of a particular child are not. Equally, a significant number of children believe somebody to be their real father who is not. Anyone with access to the national DNA database might discover the truth, and the result could be huge emotional distress, marital breakdown, nervous breakdown, blackmail, or worse. There may be some who feel that the truth should always out, however painful, but I think a good case could be made that the sum total of human happiness would not be enhanced by a sudden outburst of revelations about everybody's true paternity.
Then there are the medical and insurance issues. The whole life insurance business depends upon the inability to forecast exactly when somebody will die. As Sir Arthur Eddington said: 'Human life is proverbially uncertain; few things are more certain than the solvency of a life-insurance company.' We all pay our premiums. Those of us who die later than expected subsidize (the heirs of) those who die earlier than expected. Insurance companies already make statistical guesses which partially subvert the system by enabling them to charge high-risk clients larger premiums. They send a doctor to listen to our hearts, take our blood pressure and investigate our smoking and drinking habits. If actuaries knew exactly when we were all going to die, life insurance would become impossible. In principle, a national DNA database, if actuaries could get their hands on it, might lead us closer to this unfortunate outcome. An extreme could be reached where the only kind of death risk that could be insured against would be pure accident.
Similarly, people screening job applicants, or applicants for places at university, could use DNA information in ways that many of us might find undesirable. Some employers already use dubious methods such as graphology (analysis of handwriting as a supposed guide to character or aptitude). Unlike the case of graphology, there is good reason to think that DNA information might be genuinely useful for judging abilities. But still, I would be one of many who would be disturbed if selection panels made use of DNA information, at least if they did so secretly.
One of the general arguments against national databases of any kind is the 'What if it fell into the hands of a Hitler?' argument. On the face of it, it is not clear how an evil government would benefit from a database of true information about people. They are so adept at using false information, one might say, why should they bother to abuse true information? In the case of Hitler, however, there is the point about his campaign against Jews and others. Although it is not true that you can recognize a Jew from his DNA, there are particular genes which are characteristic of people whose ancestors come from certain regions of, say, central Europe, and there are statistical correlations between possession of certain genes and being Jewish. It seems undeniable that, if Hitler's regime had had a national DNA database at their disposal, they would have found terrible ways to abuse it.
Are there ways to safeguard society from these potential ills, while retaining the benefit of helping to catch criminals? I'm not sure. I think it might be difficult. You could protect honest citizens against insurance companies and employers by restricting the national database to non-coding regions of the genome. The database would refer only to tandem repeat areas of the genome, not genes that actually do anything. This would prevent actuaries working out our life expectancy and talent scouts second-guessing our abilities. But it would do nothing to protect us against discovering (or against blackmailers discovering) truths about paternity that we might prefer not to know. Quite the contrary. The identification of Josef Mengele's bones from his son's blood was entirely based upon tandem repeat DNA. I see no easy answer to this objection except to say that, as DNA testing becomes easier, it will increasingly be possible to discover paternity in any case, without recourse to a national database. A man who suspects that 'his' child is not really his could already take the child's blood and have it compared with his own. He wouldn't need a national database.
Not just in courts of law, the decisions of commissions of inquiry and other bodies charged with discovering what happened in some incident or accident frequently turn upon scientific matters. Scientists are called as expert witnesses on factual matters: on the technicalities of metal fatigue, on the infectivity of mad cow disease, and so on. Then, having delivered their expertise, the scientists are dismissed so those charged with the serious business of actually making the decisions can get on with it. The implication is that scientists are good at discovering detailed facts but others, often lawyers or judges, are better qualified to integrate them and recommend what needs to be done. On the contrary, a good case can be made that scientific ways of thinking are valuable, not just for assembling the detailed facts but for reaching the final verdict. When there has been an air crash, say, or a disastrous football riot, a scientist might be better qualified to chair the inquiry than a judge, not because of what scientists know, but because of the methods they use to find things out and make decisions.
The case of DNA fingerprinting suggests that lawyers would be better lawyers, judges better judges, parliamentarians better parliamentarians and citizens better citizens if they knew more science and, more to the point, if they reasoned more like scientists. This is not only because scientists value reaching the truth above winning a case. Judges, and decision-takers in general, might be better decision-takers if they were more adept in the arts of statistical reasoning and probability assessment. This point will resurface in the next two chapters, which deal with superstition and the so-called paranormal.
6
HOODWINKED WITH FAERY FANCY
Credulity is the man's weakness, but the child's strength. CHARLES LAMB, Essays of Elia (1823)
We have an appetite for wonder, a poetic appetite, which real science ought to be feeding but which is being hijacked, often for monetary gain, by purveyors of superstition, the paranormal and astrology. Resonant phrases like 'the Fourth House of the Age of Aquarius', or 'Neptune went retrograde and moved into Sagittarius' whip up a bogus romance which, to the naive and impressionable, is almost indistinguishable from authentic scientific poetry: 'The Universe is lavish beyond imagining' for example, from Carl Sagan and Ann Druyan's Shadows of Forgotten Ancestors (1992); or, out of the same book (after describing how the solar system condensed out of a spinning disc), 'The disk is rippling with possible futures.' In another book, Carl Sagan remarked,
How is it that hardly any major religion has looked at science and concluded, 'This is better than we thought! The Universe is much bigger than our prophets said, grander, more subtle, more elegant'? Instead they say, 'No, no, no! My god is a little god, and I want him to stay that way.' A religion, old or new, that stressed the magnificence of the Universe as revealed by modern science might be able to draw forth reserves of reverence and awe hardly tapped by the conventional faiths. Pale Blue Dot (1995)
In so far as traditional religions are in decline in the West, their place seems to be taken not by science, with its clearer sighted, grander vision of the cosmos, so much as by the paranormal and astrology. One might have hoped that by the end of this most scientifically successful of all centuries science would have been incorporated into our culture and our aesthetic sense risen to meet its poetry. Without reviving the mid-century pessimism of C. P. Snow, I reluctantly find that, with only two years to run, these hopes are not realized. Astrology books outsell astronomy books. Television beats a path to the doors of second-rate conjurors masquerading as psychics and clairvoyants. This chapter examines superstition and gullibility, trying to explain them and the ease with which they can be exploited. Chapter 7 then advocates simple statistical thinking as an antidote to the paranormal disease. We begin with astrology.
On 27 December 1997, one of Britain's largest circulation national newspapers, the Daily Mail, devoted its main front-page story to astrology under the banner headline '1998: The Dawn of Aquarius'. One feels almost grateful when the article goes on to concede that the Hale Bopp comet was not the direct cause of Princess Diana's death. The paper's highly paid astrologer tells us that 'slow-moving, powerful Neptune' is about to join 'forces' with the equally powerful Uranus as it moves into Aquarius. This will have dramatic consequences:
. . . the Sun is rising. And the comet has come to remind us that this Sun is not a physical sun but a spiritual, psychic, inner sun. It does not, therefore, have to obey the law of gravity. It can come over the horizon more swiftly if enough people rise to greet and encourage it. And it can dispel the darkness the moment it appears.
How can people find this meaningless pap appealing, especially in the face of the real universe as revealed by astronomy?
On a moonless night when 'the stars look very cold about the sky', and the only clouds to be seen are the glowing smudges of the Milky Way, go out to a place far from street light pollution, lie on the grass and gaze up at the sky. Superficially you notice constellations, but a constellation's pattern means no more than a patch of damp on the bathroom ceiling. Note, accordingly, how little it means to say something like 'Neptune moves into Aquarius'. Aquarius is a miscellaneous set of stars all at different distances from us which are unconnected with each other except that they constitute a (meaningless) pattern when seen from a certain (not particularly special) place in the galaxy (here). A constellation is not an entity at all, and so not the kind of thing that Neptune, or anything else, can sensibly be said to 'move into'.
The shape of a constellation, moreover, is ephemeral. A million years ago our Homo erectus ancestors gazed out nightly (no light pollution then, unless it came from that species' brilliant innovation, the camp fire) at a set of very different constellations. A million years hence, our descendants will see yet other shapes in the sky and we already know exactly how these will look. This is the sort of detailed prediction that astronomers, but not astrologers, can make. And - again by contrast with astrological predictions - it will be correct.
Because of light's finite speed, when you look at the great galaxy in Andromeda you are seeing it as it was 2.3 million years ago and Australopithecus stalked the high veldt. You are looking back in time. Shift your eyes a few degrees to the nearest bright star in the constellation of Andromeda and you see Mirach, but much more recently, as it was when Wall Street crashed. The sun, when you witness its colour and shape, is only eight minutes ago. But point a large telescope at the Sombrero galaxy and you behold a trillion suns as they were when your tailed ancestors peered shyly through the canopy and India collided with Asia to raise the Himalayas. A collision on a larger scale, between two galaxies in Stephan's Quintet, is shown to us at a time when on earth dinosaurs were dawning and the trilobites fresh dead.
Name any event in history and you will find a star out there whose light gives you a glimpse of something happening during the year of that event. Provided you are not a very young child, somewhere up in the night sky you can find your personal birth star. Its light is a thermonuclear glow that heralds the year of your birth. Indeed, you can find quite a few such stars (about 40 if you are 40; about 70 if you are 50; about 175 if you are 80 years old). When you look at one of your birth year stars, your telescope is a time machine letting you witness thermonuclear events that are actually taking place during the year you were born. A pleasing conceit, but that is all. Your birth star will not deign to tell anything about your personality, your future or your sexual compatibilities. The stars have larger agendas in which the preoccupations of human pettiness do not figure.
Your birth star, of course, is yours for only this year. Next year you must look to the surface of a larger sphere one light year more distant. Think of this expanding sphere as a radius of good news, the news of your birth broadcast steadily outwards. In the Einsteinian universe in which most physicists now think we live, nothing can in principle travel faster than light. So, if you are 50 years old, you have a personal news bubble of 50 light years' radius. Within that sphere (of a little more than a thousand stars) it is in principle possible (although obviously not in practice) for news of your existence to have permeated. Outside that sphere you might as well not exist; in an Einsteinian sense you do not exist. Older people have larger existence spheres than younger people, but nobody's existence extends to more than a tiny fraction of the universe. The birth of Jesus may seem an ancient and momentous event to us as we reach his second millenary. But the news is so recent on this scale that, even in the most ideal circumstances, it could in principle have been proclaimed to less than one 200 million millionth of the stars in the universe. Many, if not most, of the stars out there will be orbited by planets. The numbers are so vast that probably some of them have life forms, some have evolved intelligence and technology.
Yet the distances and times that separate us are so great that thousands of life forms could independently evolve and go extinct without it being possible for any to know of the existence of any other.
In order to make my calculations about numbers of birth stars, I assumed that the stars are spaced, on average, about 7.6 light years apart. This is approximately true of our local region of the Milky Way galaxy. It seems an astonishingly low density (about 440 cubic light years per star), but it is actually high by comparison with the density of stars in the universe as a whole, where space lies empty between the galaxies. Isaac Asimov has a dramatic illustration: it is as if all the matter of the universe were a single grain of sand, set in the middle of an empty room 20 miles long, 20 miles wide and 20 miles high. Yet, at the same time, it is as if that single grain of sand were pulverized into a thousand million million million fragments, for that is approximately the number of stars in the universe. These are some of the sobering facts of astronomy, and you can see that they are beautiful.
Astrology, by comparison, is an aesthetic affront. Its pre-Copernican dabblings demean and cheapen astronomy, like using Beethoven for commercial jingles. It is also an insult to the science of psychology and the richness of human personality. I am talking about the facile and potentially damaging way in which astrologers divide humans into 12 categories. Scorpios are cheerful, outgoing types while Leos, with their methodical personalities, go well with Libras (or whatever it is). My wife Lalla Ward recalls an occasion when an American starlet approached the director of the film they were both working on with a 'Gee, Mr Preminger, what sign are you?' and received the immortal rebuff, in a thick Austrian accent, 'I am a Do Not Disturrrb sign.'
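A calculation of this kind can be sketched as follows. The 7.6 light year spacing is the figure given above; treating the stars as uniformly scattered, counting a birth star as any star in the one-light-year shell at a distance equal to your age, and the function names themselves are assumptions of this sketch, not necessarily the method actually used.

```python
from math import pi

SPACING_LY = 7.6                     # average spacing between stars (from the text)
VOLUME_PER_STAR = SPACING_LY ** 3    # roughly 440 cubic light years per star

def stars_within(radius_ly):
    """Approximate number of stars inside a sphere of the given radius,
    ASSUMING a uniform density of one star per VOLUME_PER_STAR."""
    return (4 / 3) * pi * radius_ly ** 3 / VOLUME_PER_STAR

def birth_stars(age_years):
    """Approximate number of stars whose light now arriving left them
    during the year of birth: those in the shell between age - 1 and
    age light years away."""
    return stars_within(age_years) - stars_within(age_years - 1)

for age in (40, 50, 80):
    print(f"age {age}: roughly {birth_stars(age):.0f} birth stars")

print(f"stars inside a 50 light year news bubble: roughly {stars_within(50):.0f}")
```

Under these assumptions the shell counts come out at roughly 45, 70 and 181, in the same region as the rounder figures quoted earlier, and the 50 light year sphere holds a little over a thousand stars.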
Personality is a real phenomenon and psychologists have had some success in developing mathematical models to handle its variation in many dimensions. The initially large number of dimensions can be mathematically collapsed into fewer dimensions with measurable, and for some purposes conscionable, loss in predictive power. These fewer derived dimensions sometimes correspond to the dimensions that we intuitively think we recognize - aggressiveness, obstinacy, affectionateness and so on. Summarizing an individual's personality as a point in multidimensional space is a serviceable approximation whose limitations can be stated. It is a far cry from any mutually exclusive categorization, and certainly far from the preposterous fiction of the newspaper astrologer's 12 dump-bins. It is based upon genuinely relevant data about people themselves, not their birthdays. The psychologist's multidimensional scaling can be useful in deciding whether a person is suited to a particular career, or a proposed couple to each other. The astrologer's 12 pigeonholes are, if nothing worse, a costly and irrelevant distraction. Moreover, they sit oddly with our current strong taboos, and laws, against discrimination. Newspaper readers are schooled to regard themselves and their friends and colleagues as Scorpios or Libras or one of the other 12 mythic 'signs'. If you think about it for a moment, isn't this a form of discriminatory labelling rather like the cultural stereotypes which many of us nowadays find objectionable? I can imagine a Monty Python sketch in which a newspaper publishes a daily column something like this:
Germans: It is in your nature to be hard-working and methodical, which should serve you well at work today. In your personal relationships, especially this evening, you will need to curb your natural tendency to obey orders.
Spaniards: Your Latin hot blood may get the better of you, so beware of doing something you might regret. And lay off the garlic at lunch if you have romantic aspirations in the evening.
Chinese: Inscrutability has many advantages, but it may be your undoing today . . .
British: Your stiff upper lip may serve you well in business dealings, but try to relax and let yourself go in your social life.
And so on through 12 national stereotypes. No doubt the astrology columns are less offensive than this, but we should ask ourselves exactly where the difference lies.
