CHAPTER 1 Why Risk Intelligence Matters
He who knows best, best knows how little he knows.
Kathryn, who is a detective, is good at spotting lies. While her colleagues seem to see them everywhere, she is more circumspect. When she’s interviewing a suspect, she doesn’t jump to conclusions. Instead she patiently looks for the telltale signs that suggest dishonesty. Even so, she is rarely 100 percent sure that she’s spotted a lie; it’s more often a question of tilting the scales one way or another, she says.
Jamie is viewed as a bit of an oddball at the investment bank where he works. When everyone else is sure that prices will continue to go up, Jamie is often more skeptical. On the other hand, there are times when everyone else is pessimistic but Jamie is feeling quite bullish. Jamie and his colleagues are not always at odds, but when they disagree it tends to be Jamie who is right.
Diane is overjoyed about her new relationship. When she phones her best friend, Evelyn, to tell her all about the new man in her life, Evelyn urges caution. “What’s the chance that you’ll still be with this guy in twelve months?” she asks, as she has done before. Diane’s reply is just as predictable. “Oh, ninety, maybe ninety-five percent,” she replies, as she always does. “I’m sure Danny is the one!” Two months later, she’s broken up again.
Jeff has just been promoted to the rank of captain in the US Army. Since he is new to the role, he often feels unsure of his decisions and seeks out his colonel for a second opinion. The colonel is beginning to get rather tired of Jeff’s pestering him, and has taken to playing a little game. Whenever Jeff asks his opinion, he responds by asking how confident Jeff is of his own hunch. Usually Jeff replies that he’s only about 40 or 50 percent sure. But nine times out of ten, the colonel agrees with Jeff’s opinion.
These four people display different degrees of risk intelligence. Kathryn and Jamie have high risk intelligence, while Diane and Jeff are at the other end of the spectrum. What exactly do I mean by risk intelligence? Most simply put, it is the ability to estimate probabilities accurately, whether the probabilities of various events occurring in our lives, such as a car accident, or the likelihood that some piece of information we’ve just come across is actually true, such as a rumor about a takeover bid. Or perhaps we have to judge whether a defendant in a murder trial is guilty, or must decide whether it’s safe to take a trip to a country that’s been put on a watch list. We often have to make educated guesses about such things, but fifty years of research in the psychology of judgment and decision making show that most people are not very good at doing so. Many people, for example, tend to overestimate their chances of winning the lottery, while they underestimate the probability that they will get divorced.
At the heart of risk intelligence lies the ability to gauge the limits of your own knowledge—to be cautious when you don’t know much, and to be confident when, by contrast, you know a lot. People with high risk intelligence tend to be on the button in doing this. Kathryn and Jamie, for example, are relatively risk intelligent because they know pretty well how much they know and have just the right level of confidence in their judgments. Diane and Jeff are much less proficient, though in different ways; while Diane is overconfident, Jeff is underconfident.
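One way to make the contrast between overconfidence and underconfidence concrete is to compare the probability a person states with how often they turn out to be right. The sketch below is only an illustration: the records for Diane and Jeff are invented to match the anecdotes, the function name is hypothetical, and for Jeff the colonel's agreement is assumed to stand in for being right.

```python
from collections import defaultdict


def calibration_table(forecasts):
    """Map each stated probability to the observed frequency of the event.

    `forecasts` is a list of (stated_probability, event_happened) pairs.
    A well-calibrated forecaster has observed frequencies close to the
    stated probabilities.
    """
    buckets = defaultdict(list)
    for stated, happened in forecasts:
        buckets[round(stated, 1)].append(1 if happened else 0)
    return {p: sum(hits) / len(hits) for p, hits in sorted(buckets.items())}


# Hypothetical records, invented to mirror the anecdotes above.
# Diane says "90 percent" every time, but only 1 relationship in 10 lasts.
diane = [(0.9, False)] * 9 + [(0.9, True)]
# Jeff says "50 percent" every time, but is right 9 times in 10.
jeff = [(0.5, True)] * 9 + [(0.5, False)]

for name, record in (("Diane", diane), ("Jeff", jeff)):
    for stated, observed in calibration_table(record).items():
        gap = stated - observed  # positive: overconfident; negative: underconfident
        print(f"{name}: stated {stated:.0%}, observed {observed:.0%}, gap {gap:+.0%}")
```

A positive gap marks overconfidence and a negative gap underconfidence; someone like Kathryn or Jamie would show gaps close to zero.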
This is a book about why so many of us are so bad at estimating probabilities and how we can become better at it. This is a vital skill to develop, as our ability to cope with uncertainty is one of the most important requirements for success in life, yet also one of the most neglected. We may not appreciate just how often we’re required to exercise it, and how much impact our ability to do so can have on our lives, and even on the whole of society. Consider these examples, from the relatively mundane to the life-threatening:
You are buying a new 42-inch HDTV, and a sales assistant asks if you would also like to purchase an extended warranty. He explains that if anything goes wrong with your TV in the next three years, the warranty will entitle you to swap it for a brand-new one, no questions asked. When deciding whether or not to purchase the extended warranty, you should consider the price of the TV, the price of the warranty, and the probability that the TV will indeed go wrong in the next three years. But what’s the chance that this will actually happen? Here’s where your risk intelligence comes in.
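The warranty decision can be framed as a simple expected-cost comparison. The sketch below is a rough illustration under stated assumptions: the prices and failure probabilities are hypothetical, the function names are made up for this example, a failed TV is assumed to be replaced at full price, and attitudes to risk are ignored.

```python
def break_even_failure_probability(tv_price: float, warranty_price: float) -> float:
    """Failure probability above which the warranty pays off on average."""
    return warranty_price / tv_price


def expected_cost_without_warranty(tv_price: float, p_failure: float) -> float:
    """Average replacement cost you bear if you skip the warranty."""
    return p_failure * tv_price


# Hypothetical prices, chosen only for illustration.
tv_price = 800.0
warranty_price = 120.0

p_break_even = break_even_failure_probability(tv_price, warranty_price)
print(f"break-even failure probability: {p_break_even:.2f}")  # 0.15

# Hypothetical estimates of the chance the TV fails within three years.
for p_failure in (0.05, 0.10, 0.30):
    cost = expected_cost_without_warranty(tv_price, p_failure)
    verdict = "buy" if cost > warranty_price else "skip"
    print(f"p = {p_failure:.2f}: expected cost without warranty {cost:.0f} -> {verdict}")
```

With these made-up numbers the warranty is worth buying only if you judge the failure probability to be above 15 percent, which is exactly the kind of estimate that calls on risk intelligence.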
A bank manager is explaining to you the various options available for investing a windfall that has just come your way. Riskier investment funds pay more interest, but there’s also a higher chance of making a loss. How much of your money should you allocate to the high-risk funds and how much to the low-risk ones? It’s partly a question of risk appetite, but you also need to know more about how much riskier the high-risk funds are. Are they 2 percent or 10 percent riskier? You need, in other words, to put a number on it.
Doctors have discovered a tumor in your breast. Luckily, it is not malignant. It will not spread to the rest of your body, and there is no need to remove your breast. But there is a chance that it may recur and become malignant at some time in the future, and it might then spread quickly. In order to prevent this possibility, the doctor suggests that you do, after all, consider having your breast removed. It’s a terrible dilemma; clearly you don’t want the cancer to recur, but it seems a tragedy to remove a healthy breast. How high would the chance of recurrence have to be before you decided to have the breast removed?
When making evaluations in situations of uncertainty, people often make very poor probability estimates and may even ignore probabilities altogether, with sometimes devastating consequences. The decisions that we face, both individually and as a society, are only becoming more daunting. The following cases further illustrate how important it is that we learn to develop our risk intelligence.
THE CSI EFFECT
The television drama CSI: Crime Scene Investigation is hugely popular. In 2002, it was the most watched show on American television, and by 2009 the worldwide audience was estimated to be more than 73 million. It isn’t, however, such a hit with police officers and district attorneys, who have criticized the series for presenting a highly misleading image of how crimes are solved. Their fears have been echoed by Monica Robbers, a criminologist, who found evidence that jurors have increasingly unrealistic expectations of forensic evidence. Bernard Knight, formerly one of Britain’s chief pathologists, agrees. Jurors today, he observes, expect more categorical proof than forensic science is capable of delivering. And he attributes this trend directly to the influence of television crime dramas.
Science rarely proves anything conclusively. Rather, it gradually accumulates evidence that makes it more or less likely that a hypothesis is true. Yet in CSI and other shows like it, the evidence is often portrayed as decisive. When those who have watched such shows then serve on juries, the evidence in real-life court cases can appear rather disappointing by contrast. Even when high-quality DNA evidence is available, the expert witnesses who present such evidence in court point out that they are still dealing only in probabilities. When the jurors contrast this with the certainties of television, where a match between a trace of DNA found at a crime scene and that of the suspect may be unequivocal, they can be less willing to convict than in the past.
The phenomenon has even been given a name: “the CSI effect.” In 2010, a study published in Forensic Science International found that prosecutors now have to spend time explaining to juries that investigators often fail to find evidence at a crime scene and hence that its absence in court is not conclusive proof of the defendant’s innocence. They have even introduced a new kind of witness to make this point—a so-called negative evidence witness.
Unrealistic expectations about the strength of forensic evidence did not begin with CSI, of course. Fingerprints led to the same problem; they have been treated by the courts as conclusive evidence for a hundred years. In 1892, Charles Darwin’s cousin Francis Galton calculated that the chance of two different individuals having the same fingerprints was about 1 in 64 billion, and fingerprint evidence has been treated as virtually infallible ever since, which means that a single incriminating fingerprint can still send someone to jail. But, like DNA evidence, even the best fingerprints are imperfect. After a mark is found at a crime scene, it must be compared to a reference fingerprint, or “exemplar,” retrieved from police files or taken from a suspect. But no reproduction is perfect; small variations creep in when a finger is inked or scanned to create an exemplar.
More important, fingerprint analysis is a fundamentally subjective process; when identifying distorted prints, examiners must choose which features to highlight, and even highly trained experts can be swayed by outside information. Yet the subjective nature of this process is rarely highlighted during court cases and is badly understood by most jurors. Christophe Champod, an expert in forensic identification at the University of Lausanne in Switzerland, thinks the language of certainty that examiners are forced to use hides the element of subjective judgment from the court. He proposes that fingerprint evidence be presented in probabilistic terms and that examiners should be free to talk about probable or possible matches. In a criminal case, for example, an examiner could testify that there was a 95 percent chance of a match if the defendant left the mark but a one-in-a-billion chance of a match if someone else left it. “Once certainty is quantified,” says Champod, “it becomes transparent.” Certainty may not seem like the kind of thing that can be quantified, but this is exactly what numerical probabilities are designed to do. By expressing chance in terms of numbers—by saying, for example, that there is a 95 percent chance that a fingerprint was left by a particular suspect—the strength of the evidence becomes much clearer and easier to comprehend. Even with a probability of 95 percent it is clear that there is still a one-in-twenty chance that the mark came from someone else.
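The two figures in Champod's example (a 95 percent chance of a match if the defendant left the mark, one in a billion if someone else did) describe how likely the evidence is under each hypothesis. Turning them into a probability that the defendant left the mark also requires a starting estimate, or prior. The sketch below applies Bayes' rule under that framing; the priors are hypothetical values chosen for illustration, and the function name is made up for this example.

```python
def posterior_probability(prior: float, p_match_if_source: float,
                          p_match_if_other: float) -> float:
    """Bayes' rule: probability the suspect left the mark, given a reported match."""
    numerator = prior * p_match_if_source
    denominator = numerator + (1.0 - prior) * p_match_if_other
    return numerator / denominator


# Figures from the examiner's hypothetical testimony in the text.
p_match_if_defendant = 0.95
p_match_if_someone_else = 1e-9

# How many times more likely a match is if the defendant left the mark.
likelihood_ratio = p_match_if_defendant / p_match_if_someone_else
print(f"likelihood ratio: {likelihood_ratio:.3g}")

# Hypothetical priors: the defendant is one of a million possible sources,
# one of a thousand, or already as likely as not to be the source.
for prior in (1e-6, 1e-3, 0.5):
    post = posterior_probability(prior, p_match_if_defendant, p_match_if_someone_else)
    print(f"prior {prior:g} -> posterior {post:.6f}")
```

Stating the evidence as two likelihoods, rather than as a flat "match," lets the court see both how strong the evidence is and how much the conclusion still depends on everything else known about the case.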
The tendency to consider fingerprint evidence as more conclusive than it is can have tragic consequences. Take the case of Shirley McKie, a successful Scottish policewoman who was accused of leaving her fingerprint at a crime scene and lying about it. In 1997, McKie was part of a police team investigating the vicious murder of Marion Ross in Kilmarnock, Scotland. After the thumbprint of a local builder was found on a gift tag in the victim’s home, he was accused of the murder. When the murdered woman’s fingerprints were found on a cookie tin stuffed with banknotes, which McKie discovered when searching the builder’s bedroom, it looked like an open-and-shut case. At the time, fingerprints were the gold standard of forensic evidence, and even a single print was sufficient to secure a conviction. Moreover, in the ninety-two years since Scotland Yard had first used them to prove a murderer’s guilt, their veracity had never been successfully challenged in a Scottish court.
Then the forensic team discovered something else. They identified a thumbprint on the bathroom door frame at the victim’s house as belonging to Shirley McKie. This was a serious matter, as McKie had never been granted permission to enter the dead woman’s bungalow, which had been sealed off. If she was thought to have crossed the cordon and contaminated vital forensic evidence, she would face disciplinary action. But McKie knew she had never set foot inside the crime scene, so the match between her print and the mark on the bathroom door frame could only be a mistake. Could it have been mislabeled by the fingerprint experts?
The Scottish Criminal Record Office (SCRO) refused even to contemplate the possibility. Not only would it undermine its case against the builder they suspected of murdering Marion Ross, but it might also wreck the Lockerbie trial—conducted in The Hague under Scottish jurisdiction—of two Libyans accused of blowing up a Boeing 747 while en route from London to New York in December 1988. The case against one of the Libyan suspects involved a contentious fingerprint found on a travel document, and several senior figures involved in the Lockerbie trial were also involved in the Marion Ross investigation. If the work of those experts was revealed to be so seriously flawed that they could not even accurately match a blameless policewoman’s prints, both cases could fall flat. According to Pan Am’s senior Lockerbie investigator, the FBI was so concerned that the case against the two Libyans might be undermined by the McKie debacle that they put pressure on the Scottish team to interfere with the evidence against her.
Since McKie had stated at the murder trial that she had never been in the victim’s house, she was charged with perjury. Arrested in an early-morning raid, she was taken to the local police station (where her father had been a commanding officer), marched past colleagues and friends, strip-searched, and thrown in a cell. Luckily, two US fingerprint experts came to McKie’s rescue. Pat Wertheim and David Grieve spent hours comparing the fingerprint on the door frame with an imprint of McKie’s left thumb and concluded that they belonged to different people. Moreover, they became convinced that the misidentification of the two marks could not have been an honest mistake. “Shirley’s thumbprint appears to have been smudged to mask the differences with the mark on the frame,” Wertheim noted. That clinched it; the jury acquitted McKie of perjury in May 1999, saving her from a possible eight-year jail sentence. Effectively they saved her life, since McKie later admitted that she could not have faced prison knowing she was innocent.
As she left the court, McKie thought she would receive a formal apology and be invited to return to the job she loved. Instead, she was deemed medically unfit for service and forced into a long legal battle with the police. Although she was eventually awarded £750,000 in compensation, the SCRO never admitted it was wrong, and nobody ever offered her an apology.
HOMELAND SECURITY
Of the many new security measures introduced in the wake of the terrorist attacks of September 11, 2001, few have caused more irritation than those implemented at airports.
Two days after the attacks, the Federal Aviation Administration (FAA) promulgated new rules prohibiting any type of knife in secured airport areas and on airplanes. The hijackers had been able to carry box cutters through security because at the time any knife with a blade up to four inches long was permitted on US domestic flights. In November 2001, all airport screening in the United States was transferred from private companies to the newly created Transportation Security Administration (TSA). Since then, every new terrorist plot adds further checks to the gauntlet that passengers must run.
After the “shoe bomber” Richard Reid failed in his attempt to blow up a commercial aircraft in flight, all airline passengers departing from an airport in the United States were made to walk through airport security in socks or bare feet while their shoes were scanned for bombs. After British police foiled a plot to detonate liquid explosives on board airliners in 2006, passengers at UK airports were not allowed to take liquids on board, and laptop computers were banned. The restrictions were gradually relaxed in the following weeks, but the ability of passengers to carry liquids onto commercial aircraft is still limited. The attempted bombing of Northwest Airlines Flight 253 on Christmas Day 2009, in which a passenger tried to set off plastic explosives sewn to his underwear, led the US government to announce plans to spend about $1 billion on full-body scanners and other security technology such as bomb detectors.
While for many passengers, waiting in line and taking off their shoes are necessary evils (a poll conducted by Rasmussen Reports shortly after the failed bombing attempt on Flight 253 found that 63 percent of Americans felt security precautions put in place since 9/11 were “not too much of a hassle”), many others disagree. Martin Broughton, the chairman of British Airways, probably spoke for many when, at a meeting for airport operators in October 2010, he described the security procedures as “completely redundant” and called for them to be ditched. The security expert Bruce Schneier has dubbed many of the measures “security theater” on the grounds that they serve merely to create an appearance that the authorities are doing something but do nothing to reduce the actual risk of a terrorist attack. Indeed, it is intelligence tip-offs, not airport checkpoints, that have foiled the vast majority of attempted attacks on aircraft.
Schneier may be right that many of the new airport security procedures are purely theatrical, but that begs the question as to why they are such good theater. In other words, it is not enough to point out the mismatch between feeling safe and being safe; if we want to understand this blind spot in our risk intelligence, we need to know why things such as taking one’s shoes off and walking through a body scanner are so effective in creating such (objectively unreliable) feelings of safety. It probably has something to do with their visibility; intelligence gathering may be more effective at reducing the risk of a terrorist attack, but it is by its very nature invisible to the general public. The illusion of control may be another factor; when we do something active such as taking our shoes off, we tend to feel more in control of the situation, but when we sit back and let others (such as spies gathering intelligence) do all the work, we feel passive and impotent. Maybe there’s a ritual aspect here, too, as in the joke “Something must be done. This is something. Therefore, we must do it.” The default assumption is that the “something” is good, and we feel better. Psychologists have long known that the illusion of control is a key factor in risk perception; it is probably one of the main reasons why people feel safer driving than when flying, even though driving is more dangerous.
Politicians have an obvious incentive to put on this security theater; they get credit for taking visible action. A little reflection, however, should make clear that not everyone is equally likely to be carrying a bomb. The International Air Transport Association (IATA), the air transport industry’s trade body, has argued for a more selective approach by, for example, prescreening passengers before they turn up at the airport and flagging the more suspicious ones for a more thorough pat-down. Better training of airport screeners could also help them improve their ability to spot suspicious behavior.
Now consider the costs. To gauge the true cost of screening passengers at airports in the United States, it is not enough to look at the TSA’s operating budget; we should also take into account the extra time passengers have spent waiting in line, taking their shoes off, and so on. Robert Poole, a member of the National Aviation Studies Advisory Panel in the Government Accountability Office, has calculated that the additional time spent waiting at airports since 9/11 has cost the nation about $8 billion a year. It is by no means clear that this was the wisest use of the security budget. Every dollar spent on one security measure is a dollar that can’t be spent on an alternative one.
The costs of the new security procedures do not end there. Long lines at airports have prompted more people to drive rather than fly, and that has cost lives because driving is so much more dangerous than flying. The economist Garrick Blalock estimated that from September 2001 to October 2003, enhanced airport security measures led to 2,300 more road fatalities than would otherwise have occurred. Those deaths represent a victory for Al Qaeda.
One of the principal goals of terrorism is to provoke overreactions that damage the target far more than the terrorist acts themselves, but such knee-jerk responses also depend on our unwillingness to think things through carefully. As long as we react fearfully to each new mode of attack, democratic governments are likely to continue to implement security theater to appease our fears. Indeed, this is the Achilles’ heel of democracy that terrorists exploit. One thing we could all do to help combat terrorism is to protect this Achilles’ heel by developing our risk intelligence.
GLOBAL WARMING AND CLIMATE CHANGE
High levels of risk intelligence will be required to deal not just with the threat of international terrorism but also with other big challenges that humanity faces in the twenty-first century. Climate change is a particularly vexing case in point. Nobody knows precisely how increasing levels of greenhouse gases in the atmosphere will affect the climate in various regions around the globe. The Intergovernmental Panel on Climate Change (IPCC) does not make definite predictions; instead, it sets out a variety of possible scenarios and attaches various probabilities to them to indicate the level of uncertainty associated with each.
Knowing how to make sense of this information is crucial if we are to allocate resources sensibly to the various solutions, from carbon-trading schemes to the development of alternative energy sources or planetary-scale geoengineering. But how can citizens make informed decisions about such matters if they are not equipped to think clearly about risk and uncertainty?
One problem is that too often, the pundits who take opposing views about climate change make exaggerated claims that convey greater certainty than is warranted by the evidence. Rarely do we hear them quote probabilities; rather, critics dismiss the IPCC’s claims out of hand, while believers in climate change paint vivid pictures of ecological catastrophes. Both kinds of exaggeration seriously hamper informed debate; the latter also terrifies kids. One survey of five hundred American preteens found that one in three children between the ages of six and eleven feared that the earth would not exist when they reached adulthood because of global warming and other environmental threats. Another survey, this one in the United Kingdom, showed that half of young children between ages seven and eleven are anxious about the effects of global warming, often losing sleep because of their concern. Without the tools to understand the uncertainty surrounding the future of our climate, we are left with a choice between two equally inadequate alternatives: ignorant bliss or fearful overreaction.
Some environmentalists have attempted to dress up the second alternative in fancy theoretical clothing. The so-called precautionary principle states that new policies or technologies should be heavily regulated or even prohibited whenever there is a possible risk to the environment or human health. This principle may appear sensible at first glance, but scratch the surface and it turns out to be terribly misguided. To be fair, it should be noted that there are many alternative versions of the precautionary principle, and some of them are less stupid than others. But the common theme that links all of the versions together is an overemphasis on downside risks and a corresponding neglect of the benefits of new technologies (the “upside risks”).
The precautionary principle is most often applied to the impact of human actions on the environment and human health and in the context of new technological developments. According to stronger versions of the principle, risky policies and technologies should be regulated or even prohibited, even if the evidence for such risks is weak and even if the economic costs of regulation are high. In 1982, the UN World Charter for Nature gave the first international recognition to a strong version of the principle, suggesting that when “potential adverse effects are not fully understood, the activities should not proceed.”
That sets the bar way too high. The potential adverse effects of any new technology are never fully understood. Nor are the potential benefits, for that matter, or the costs of regulation. Advocates of the precautionary principle often make no attempt to estimate the probabilities of the alleged dangers, on the grounds that they are “unknowable.” But that just shows a deep misunderstanding of what probabilities are. Probabilities are an expression of our ignorance; by quantifying uncertainty, we are already conceding that we don’t “know” the relevant facts with 100 percent certainty and admitting that we have to work on the basis of educated guesses. It is much better to reason on the basis of such guesses than to neglect probabilities altogether.
At first blush, the precautionary principle may not seem relevant to climate change, since few people doubt that our planet is getting warmer and that the chief cause of this is the burning of fossil fuels. It is a near certainty that the global climate will change. The polar ice caps will melt, and the sea will rise and flood a great deal of land that is now inhabited. There is, however, much debate about the extent of the danger. The precautionary principle suggests that this uncertainty is in itself good reason to take aggressive action. The planet is at risk, the argument goes, so it would be prudent to take bold steps immediately. Isn’t it better to be safe than sorry?
Not necessarily, argues Cass Sunstein, a legal scholar who was appointed to head the Office of Information and Regulatory Affairs in 2009. Sunstein points out that there are always risks on both sides of a decision; inaction can bring danger, but so can action. Precautions themselves, in other words, create risks. No choice is risk free.
A high tax on carbon emissions, for example, would increase the hardship on people who can least afford it and probably lead to greater unemployment and hence poverty. A sensible climate change policy must balance the costs and benefits of emissions reductions. A policy that includes costly precautions should be adopted only if the costs are outweighed by the benefits.
Such rational analyses are often trumped, however, by the strong emotional responses triggered by images of dramatic climate change such as those in films like The Day After Tomorrow (2004) and An Inconvenient Truth (2006). Sunstein has also argued that “in the face of a fearsome risk, people often exaggerate the benefits of preventive, risk-reducing, or ameliorative measures.” When a hazard stirs strong emotions, people also tend to factor in probability less, with the result that they will go to great lengths to avoid risks that are extremely unlikely. Psychologists refer to this phenomenon as “probability neglect” and have investigated it in a variety of experimental settings.
As with the threat of international terrorism, high levels of risk intelligence will be required to face the challenges posed by climate change. If we are to contribute sensibly to the debate, we must learn to deal better in probabilities and to craft policies that are sensitive to the different probabilities of the various possible scenarios.
EXPERTS AND COMPUTERS CAN’T SAVE US FROM OURSELVES
Many of us may be inclined to believe that it’s best to defer to experts regarding such tricky assessments or, when possible, to allow computer programs to do the hard work for us, as so many bankers decided to do in assessing the risks of subprime mortgages in the decade preceding the 2007 financial crisis. But it’s a big mistake to think we can offload the responsibility for risk intelligence. Indeed, research suggests that many experts have quite poor risk intelligence, and the financial crisis illustrated all too well the problems of relying too heavily on computer models.
Take the experts first. A famous study by the psychologist Philip Tetlock asked 284 people who made their living “commenting or offering advice on political and economic trends” to estimate the probability of future events in both their areas of specialization and areas in which they claimed no expertise. Over the course of twenty years, Tetlock asked them to make a total of 82,361 forecasts. Would there be a nonviolent end to apartheid in South Africa? Would Mikhail Gorbachev be ousted in a coup? Would the United States go to war in the Persian Gulf? And so on.
Tetlock put most of the forecasting questions into a “three possible futures” form, in which three alternative outcomes were presented: the persistence of the status quo, more of something (political freedom, economic growth), or less of something (repression, recession). The results were embarrassing. The experts performed worse than they would have if they had simply assigned an equal probability to all three outcomes. Dart-throwing monkeys would have done better.
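To see how a confident forecaster can score worse than someone who simply spreads probability evenly across the three futures, it helps to put a number on forecast accuracy. The sketch below uses the Brier score, one common scoring rule for probability forecasts (lower is better); it is not presented as Tetlock's exact procedure, and the forecasts and outcomes are hypothetical.

```python
def brier_score(forecast, outcome_index):
    """Squared error between a probability forecast and what actually happened.

    `forecast` lists the probabilities given to each possible outcome;
    `outcome_index` is the position of the outcome that occurred.
    0 is a perfect score and 2 is the worst possible.
    """
    return sum(
        (p - (1.0 if i == outcome_index else 0.0)) ** 2
        for i, p in enumerate(forecast)
    )


def mean_brier(forecasts_and_outcomes):
    """Average Brier score over a set of (forecast, outcome_index) pairs."""
    scores = [brier_score(f, o) for f, o in forecasts_and_outcomes]
    return sum(scores) / len(scores)


# Three possible futures: status quo, more of something, less of something.
uniform = [1 / 3, 1 / 3, 1 / 3]   # equal probability for each outcome
confident = [0.90, 0.05, 0.05]    # hypothetical pundit, always sure of outcome 0

# Hypothetical results for ten questions: outcome 0 occurs only four times.
outcomes = [0] * 4 + [1] * 3 + [2] * 3

uniform_score = mean_brier([(uniform, o) for o in outcomes])
confident_score = mean_brier([(confident, o) for o in outcomes])
print(f"equal-probability forecaster: {uniform_score:.3f}")
print(f"confident forecaster:         {confident_score:.3f}")
```

In this made-up example the confident forecaster picks the right outcome more often than chance would, yet still scores worse than the equal-probability baseline, because the score penalizes misplaced certainty heavily.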
Furthermore, the pundits were not significantly better at forecasting events in their area of expertise than at assessing the likelihood of events outside their field of study. Knowing a little helped a bit, but Tetlock found that knowing a lot can actually make a person less reliable. “We reach the point of diminishing marginal predictive returns for knowledge disconcertingly quickly,” he observed. “In this age of academic hyperspecialization, there is no reason for supposing that contributors to top journals—distinguished political scientists, area study specialists, economists, and so on—are any better than journalists or attentive readers of the New York Times in ‘reading’ emerging situations.” And the more famous the forecaster, the lower his or her risk intelligence seemed to be. “Experts in demand,” Tetlock noted, “were more overconfident than their colleagues who eked out existences far from the limelight.”
As for relying on computer programs to help us assess risks, the story of the 2007 financial crisis reveals the vital importance of more nuanced human risk intelligence in alerting us to risks even when the data tell us not to worry.
During the 1990s, Wall Street was invaded by a new breed of risk assessors. According to Aaron Brown of AQR Capital Management, a hedge fund located in Connecticut, Wall Street used to be full of game players—literally. Many of those in trading and running trading-related businesses in the 1970s were frequent poker players, bridge players, and backgammon players. Those who weren’t gamblers in the strict sense of the term were nevertheless used to taking risks in all aspects of their lives. But in the 1990s, the risk lovers were gradually edged out and replaced by a new wave of risk avoiders. Put simply, the banks wanted to stop gambling. That, it turned out, was a mistake.
The most famous invention of the new risk avoiders, who became known as “quants,” short for quantitative analysts, was the Black-Scholes formula, which made it possible to put a price on financial instruments that weren’t traded very often. Trading is an effective way of determining value, so if an instrument is not traded frequently it can be hard to price it. The formula devised by Fischer Black and Myron Scholes came up with a value for rarely traded instruments by linking them with a comparable security that did trade regularly. Taking things a step further, a team of quants at J.P. Morgan developed a way to sum up the risks of whole portfolios of financial assets in a single number called value at risk, or VaR. The beauty of VaR was that it synthesized the dizzying variety of variables that make up the market risk of an investment portfolio into a single dollar value that risk managers could report to top executives.
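The idea of boiling a portfolio's risk down to one dollar figure can be illustrated with a very simple historical calculation. The sketch below is a simplification and not the method the J.P. Morgan team developed: it takes a list of hypothetical daily profit-and-loss figures and reports the loss that was not exceeded on 95 percent of those days. The data and the function name are invented for this example.

```python
import math


def historical_var(daily_pnl, confidence_pct=95):
    """Loss not exceeded on `confidence_pct` percent of the observed days.

    `daily_pnl` holds daily profit (positive) or loss (negative) in dollars.
    The result is reported as a positive dollar loss.
    """
    losses = sorted(-x for x in daily_pnl)  # ascending; positive values are losses
    index = math.ceil(confidence_pct * len(losses) / 100) - 1
    index = max(0, min(index, len(losses) - 1))
    return losses[index]


# Hypothetical profit-and-loss figures for twenty trading days.
pnl = [120, -80, 45, -300, 210, -15, 60, -150, 90, -40,
       30, -500, 75, -20, 10, -220, 140, -60, 25, -95]

var_95 = historical_var(pnl, 95)
worst_day = max(-x for x in pnl)
print(f"95% one-day VaR: {var_95}")   # 300
print(f"worst single day: {worst_day}")  # 500
```

The single number says nothing about how bad the remaining 5 percent of days can be: here the worst day is a loss of 500, well beyond the reported 300.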
At first the nonquants—the traders and executives who had been running Wall Street more on the basis of hunches and educated guesswork than on math—were suspicious of the new methods. But as the equations turned out to be right again and again, the executives came round to the new way of thinking, and by the late 1990s VaR was firmly entrenched in both the practice and regulation of investment banking.
The ironic outcome of this was that during the last decade of the twentieth century, Wall Street hemorrhaged risk intelligence. People who were used to thinking about risk intuitively left the banks for new pastures, and their ranks were filled by people who were more at home in the world of equations and formulae. According to Aaron Brown, that was an important but widely neglected cause of the 2007 crisis.
The problem with any kind of mathematical technology is that you may come to rely on it so much that your capacity to benchmark it against other standards withers away, leaving you unable to spot previously obvious errors. A case in point is the replacement of slide rules by pocket calculators in the 1970s. When people used slide rules to carry out multiplication and division, they would constantly check their intermediate steps against common sense and an understanding of their subject as they performed calculations. In particular, they had to note the order of magnitude at each stage and so were less likely to make wild errors. With an electronic calculator, the intermediate steps are all taken care of by the machine, so the habit of checking tends to atrophy, leaving people less able to spot, for example, that the decimal point is now in the wrong place.
In the same way, the greater reliance on IT systems has led to a “de-skilling of the risk process,” according to Stephen O’Sullivan, formerly of Accenture, a consultancy. A friend of mine who worked in the Foreign Exchange Complex Risk Group at a major international bank told me a story that illustrates the dangers of such uncritical reliance on mathematical technology. One morning he watched the global exchange rate for a pair of currencies, both from G7 economies, get fixed, all around the world, at an obviously stupid price. One bank’s automated trading system had developed a problem and was quoting a wildly inaccurate “giveaway” price, well below the true market rate. The rest of the global FX market participants, many of them running their own automated trading systems, rapidly switched to trading with the error-hit system, buying currency at cheap prices. The speed and magnitude of the market’s rush to exploit the mistake by one automated trading system meant that the incorrect price became, for a brief while, the global exchange rate for that pair of currencies. It was only when human traders literally pulled the plug on the automated trading system that the bank stopped bleeding money. In the next few minutes the global exchange rate fell back to where everyone knew it was supposed to be, and the blip passed.
Many errors have been caused by computerized trading, at great cost to investors, and they are fixed only when actual people step in and switch off the machine that has screwed up. The only reason this is possible is that some people still have, in their heads, standards against which they can benchmark the performance of the machines.
THE DARKENED ROOM
The unfortunate fact, though, is that most of us simply aren’t comfortable with or adept at making judgments in the netherland of uncertainty, and this is largely due to our reluctance to gauge the limits of what we know. Picture your mind as a lightbulb shining in an otherwise dark room. Some nearby objects are fully illuminated; you can see them in every detail, present and identifiable. They are the things you know very well: the names of your friends, what you had for breakfast this morning, how many sides a triangle has, and so on. The objects on the other side of the room are completely shrouded in darkness. They are the things about which you know nothing: the five thousandth digit of pi, the composition of dark matter, King Nebuchadnezzar’s favorite color. Between the light and the darkness, however, lies a gray area in which the level of illumination gradually shades away.
In this twilight zone, the objects are not fully illuminated, but neither are they completely invisible. You know something about those things, but your knowledge is patchy and incomplete—the law of the land (unless you are a lawyer), the evidence for climate change (unless you are a climatologist), the causes of the credit crunch (even economists are still arguing about this). The question is, how much do you know about those things? How good are you at judging the precise level of illumination at different points in the twilight zone?
In 1690, the English philosopher John Locke noted that “in the greatest part of our concernments, [God] has afforded us only the twilight, as I may so say, of probability.” Yet we are still remarkably ill equipped to operate in this twilight zone. If we’re cautious, we relegate everything beyond the zone of complete illumination to complete obscurity, not daring to venture an opinion on things about which we do, in fact, have some inkling. If we’re overconfident, we do the opposite, expressing views about things in the twilight zone with more conviction than is justified. It’s hard to steer between the two extremes, daring to speculate but with prudence. This book is a traveler’s guide to that twilight zone and a manifesto for what the poet John Keats called “negative capability”: “when man is capable of being in uncertainties, Mysteries, doubts without any irritable reaching after fact and reason.”
THE LIGHT AT THE END OF THE TUNNEL
It’s not all doom and gloom. There is light at the end of the tunnel. Although the general level of risk intelligence is not high, and therefore many of the mechanisms that we invent to help us do a better job of assessing risks (such as color-coded warnings about terrorist threat levels and elaborate mathematical models for measuring financial risks) can lead to perverse results, we are not condemned to repeat our mistakes. There are in fact people, such as the hypothetical Kathryn and Jamie I described at the beginning of this chapter, who have high risk intelligence—at least in certain subject areas—and I have found that by studying them, and the patterns that show up in people’s risk judgments more generally, it’s possible to discern ways in which we can all boost our risk intelligence.
Philip Tetlock’s conclusions about the limited value of expertise, which I introduced earlier, must be qualified. Many self-proclaimed experts are indeed no better than monkeys at forecasting world events. But Tetlock also found that among the hundreds of experts he studied, there were a handful who seemed particularly good at estimating probabilities. If your sample is large enough, of course, you’re bound to come across a few outliers by chance alone, but the wise forecasters identified by Tetlock do not seem to be a statistical fluke. Psychologists have also identified other groups with unusually high risk intelligence, which suggests that risk intelligence can be developed significantly under the right conditions. In fact, it was a fascinating study about one such group that first got me thinking about this whole subject. The group in question was a bunch of men who were fanatical about horse racing.
Let me take you to a sunny afternoon in 1984 at Brandywine Raceway, a harness racetrack in North Wilmington, Delaware. A young psychologist is chatting with a sixty-two-year-old man. “Which horse do you think will win the next race?” he asks the older man.
“The four-horse should win easily; he should go off three to five or shorter, or there’s something wrong,” replies the man, a crane operator who has been coming to the racetrack several times a week for the past eight years.
“What exactly is it about the four-horse that makes him your odds-on favorite?”
“He’s the fastest, plain and simple!”
The psychologist looks puzzled. “But it looks to me like other horses are even faster,” he interjects, pointing to a page in the Brandywine Official Form Program. “For instance, both the two-horse and the six-horse have recorded faster times than the four-horse, haven’t they?”
“Yeah,” says the crane operator with a smile, “but you can’t go by that. The two-horse didn’t win that outing, he just sucked up.”
“You gotta read between the lines if you want to be good at this. The two-horse just sat on the rail and didn’t fight a lick. He just kept on the rail and sucked up lengths when horses in front of him came off the rail to fight with the front runner.”
“Why does that make his speed any slower? I don’t get it.”
“Now, listen. If he came out and fought with other horses, do you think for one minute he’d have run that fast? Let me explain something to you that will help you understand. See the race on June 6?” he asks, pointing to the relevant line of the racing program. “Well, if the two-horse had to do all of this fighting, he’d run three seconds slower. It’s that simple. There ain’t no comparison between the two-horse and the four-horse. The four is tons better!”
“And the longer you’re on the outside, the longer the race you have to run, right?” asks the psychologist, as he begins to understand what the seasoned handicapper is saying. “In other words, the shortest route around the track is along the rail, and the farther off of it you are, the longer the perimeter you have to run.”
“Exactly,” replies the crane operator. “But there’s another horse in this race that you have to watch. I’m talking about the eight-horse. He don’t mind the outside post because he lays back early. Christ, he ran a monster of a race on June 20! He worries me because if he repeats here, he’s unbeatable.”
“Do you like him better than the four-horse?”
“Not for the price. He’ll go off even money. He isn’t that steady to be even money. If he’s geared up, there’s no stopping him, but you can’t bet on him being geared up. If he were three to one, I’d bet him in a minute because he’ll return a profit over the long run. But not at even money.”
The psychologist’s name was Stephen Ceci. In 1982, not long out of grad school, Ceci and his colleague Jeffrey Liker had approached the owners of Brandywine Raceway to ask permission to conduct a study of their clients. Ceci and Liker identified thirty middle-aged and older men who were avid racetrack patrons and studied them over a four-year period. None of the men earned their living by gambling, though all of them had attended the races nearly every day of their adult lives.
As part of their study, Ceci and Liker asked all thirty men to handicap ten actual horse races—that is, to estimate the chances of each horse winning—as well as fifty imaginary ones they concocted. As it happened, the men fell into two distinct groups, one of which was significantly better than the other at handicapping. Moreover, the experts seemed to be unconsciously using a highly sophisticated mental model. For example, to predict the speed with which a horse could run the final quarter mile of the race, the experts took as many as seven different variables into account, including the speed at which the horse had run in its last race, the quality of the jockey, and the current condition of the racetrack. And they didn’t just consider each of these factors independently. Rather, they considered them all in context. For example, coming third in one race may actually be more impressive than coming first in another race if the quality of the competition was higher in the former.
Ceci and Liker also tested the men’s IQs. And that was when they got their biggest surprise—as did I, when I read their paper some twenty years later. For Ceci and Liker found that handicapping expertise had zero correlation with IQ. IQ is the best single measure of intelligence that psychologists have, because it correlates with so many cognitive capacities. It’s that very correlation that underpins the concept of “general intelligence.” The discovery that expertise in handicapping doesn’t correlate at all with IQ means that whatever cognitive capacities are involved in estimating the odds of a horse winning a race, they are not a part of general intelligence. Or, to put it the other way around, IQ is unrelated to some forms of cognitive calculation that are, nonetheless, clear-cut cases of intelligence.
Not everyone is happy with the concept of general intelligence. The psychologist Howard Gardner argues that, rather than thinking in terms of one unitary measure, we should instead conceive of the mind as possessing multiple types of intelligence. Gardner identifies eight different kinds of intelligence: bodily-kinesthetic, interpersonal, verbal-linguistic, logical-mathematical, naturalistic, intrapersonal, visual-spatial, and musical. None of these involves an ability to estimate probabilities accurately, yet the study by Ceci and Liker shows that this is a cognitive skill that some people are very good at, which suggests that it might constitute a ninth kind of intelligence to add to Gardner’s list.
In a similar vein, the psychologist Daniel Goleman argues that IQ tests fail to capture a set of social and emotional skills that he refers to collectively as “emotional intelligence.” Goleman claims that proficiency with these skills—which include impulse control, self-awareness, social awareness, and relationship management—is a much stronger indicator of success than high IQ. But measures of EQ are no better than IQ tests at capturing our capacity for judging risks and weighing probabilities. This suggests that we should also test people for risk intelligence (RQ) when selecting candidates for jobs that involve estimating probabilities and making decisions under uncertainty.
This book is a manifesto for this specific kind of intelligence, for coming to appreciate how risk intelligence operates and then working to build up your own skills. I’m going to demonstrate why, when we get it wrong—when banks fail, doctors misdiagnose, and weapons of mass destruction turn out not to exist—we’re in such a bad position to understand the reasons. I’ll reveal the primary reasons why we tend to be so bad at estimating probabilities and then provide a powerful set of methods whereby we can hone our skills. Expert handicappers are not the only group of people with unusually high levels of risk intelligence; bridge players and weather forecasters are also pretty good in their areas of expertise. By studying what those groups have in common, as well as a fascinating set of findings about how our brains lead us astray in making risk assessments, we can discover ways to improve our own risk intelligence and thereby make better decisions in all aspects of our lives.