on September 27, 2012
This is the best general-readership book on applied statistics that I've read. Short review: if you're interested in science, economics, or prediction: read it. It's full of interesting cases, builds intuition, and is a readable example of Bayesian thinking.
Longer review: I'm an applied business researcher and that means my job is to deliver quality forecasts: to make them, persuade people of them, and live by the results they bring. Silver's new book offers a wealth of insight for many different audiences. It will help you to develop intuition for the kinds of predictions that are possible, that are not so possible, where they may go wrong, and how to avoid some common pitfalls.
The core concept is this: prediction is a vital part of science, of business, of politics, of pretty much everything we do. But we're not very good at it, and fall prey to cognitive biases and other systemic problems such as information overload that make things worse. However, we are simultaneously learning more about how such things occur and that knowledge can be used to make predictions better -- and to improve our models in science, politics, business, medicine, and so many other areas.
The book presents real-world experience and critical reflection on what happens to research in social contexts. Data-driven models with inadequate theory can lead to terrible inferences. For example, on p. 162: "What happens in systems with noisy data and underdeveloped theory - like earthquake prediction and parts of economic and political science - is a two-step process. First, people start to mistake the noise for a signal. Second, this noise pollutes journals, blogs, and news accounts with false alarms, undermining good science and setting back our ability to understand how the system really works." This is the kind of insight that every good practitioner acquires through hard-won battles, and continues to wrestle with every day, both in doing the work and in communicating it to others.
It is both readable and technically accurate: it presents just enough model details yet avoids being formula-heavy. Statisticians will be able to reproduce models similar to the ones he discusses, but general readers will not be left out: the material is clear and applicable. Scholars of all stripes will appreciate the copious notes and citations, 56 pages of notes and another 20 pages of index, which detail the many sources. It is also important to note that this is perhaps the best general readership book from a Bayesian perspective -- a viewpoint that is overdue for readable exposition.
The models cover a diversity of areas from baseball to politics, from earthquakes to finance, from climate science to chess. Of course this makes the book fascinating to generalists, geeks, and breadth thinkers, but perhaps more importantly, I think it serves well to develop reusable intuition across domains. And, for those of us who practice such things professionally, to bring stories and examples that we can tell and use to illustrate concepts with the people we inform.
There are three audiences who might not appreciate the book as much. First are students looking for a how-to book. Silver provides a lot of pointers and examples, but does not get into nuts-and-bolts details or supply foundational technical instruction. That requires coursework in research methods and statistics. Second, his approach of building multiple models and interpreting them humbly will not satisfy those who promote a naive, gee-whiz, "look how great these new methods are" approach to research. But then, that's not a problem; it's a good thing. The third audience that won't fit will be experts who desire depth in one of the book's many topic areas; it's not a technical treatise for them, and I can confidently predict grumbling in some quarters. Overall, those three audiences are small, which happily leaves the rest of us to enjoy the book.
What would make it better? As a pro, I'd like a little more depth (of course). It emphasizes games a little too much for my taste. And a clearer prescriptive framework could be nice (but also could be a problem for reasons he illustrates). But those are minor points; it hits its target better than any other such book I know.
Conclusion: if you're interested in scientific or statistical forecasting, either as a professional or layperson, or if you simply enjoy general science books, get it. Cheers!
on November 11, 2012
Excellent book!!! People looking for a "how to predict" silver bullet will (like some reviewers here) be disappointed, mainly because Silver is too honest to pretend that such a thing exists. The anecdotes and exposition are fantastic, and I wish we could make this book required reading for, say, everyone in the country.
During election season, everyone with a newspaper column or TV show feels entitled to make (transparently partisan) predictions about the consequences of each candidate's election for unemployment/crime/abortion/etc. This kind of pundit chatter, as Silver notes, tends to be insanely inaccurate. But there are also some amazing success stories in the prediction business. I list some chapter-by-chapter takeaways below (though there's obviously a lot more depth to the book than I can fit into a list like this):
1. People have puzzled over prediction and uncertainty for centuries.
2. TV pundits make terrible predictions, no better than random guesses. They are rewarded for being entertaining, and not really penalized for being wrong.
3. Statistics has revolutionized baseball. But computer geeks have not replaced talent scouts altogether. They're working together in more interesting ways now.
4. Weather prediction has gotten lots better over the last fifty years, due to highly sophisticated, large-scale supercomputer modeling.
5. We have almost no ability to predict earthquakes. But we know that some regions are more earthquake prone, and that in a given region an earthquake of magnitude n happens about ten times as often as an earthquake of magnitude (n+1).
6. Economists are terrible at predicting quantities such as next year's GDP. Predictions are only very slightly correlated with reality. They also tend to be overconfident, drastically underestimating the margin of error in their guesses. Politically motivated predictions (such as those historically released by the White House) are even worse.
7. The spread of a disease like the flu is hard to predict. Sometimes we overreact because the risk of under-reacting seems greater.
8. A few professional sports gamblers are able to make a living by spotting meaningful patterns before others do, and being right slightly more than half the time.
9. Kasparov thought he could beat Deep Blue. Couldn't. Interesting tale of humans/computers trying to outguess each other.
10. Nate Silver made a living playing online poker for a few years. When the government tightened the rules, the less savvy players ("fish") stopped playing, and he found he couldn't make money any more. So he started FiveThirtyEight.
11. Efficient market hypothesis: the market seems very efficient, but not perfectly so. Possible source of error: most investment is done by institutions, and individuals at these institutions are rewarded based on short-term profits. Rational employees may have less career risk when they "bet with the consensus" than when they buck a trend: this may increase herding effects and make bubbles worse. Note: Nate pointedly does not claim that one can make money on Intrade by betting based on FiveThirtyEight probabilities. But he stresses that Intrade prices are themselves probably heavily informed by poll-based models like the ones on FiveThirtyEight.
12. Climate prediction: the prima facie case for anthropogenic warming is very strong (greenhouse gases up, temperature up, good theoretical reason for the former causing the latter). But there is good reason to doubt the accuracy of specific elaborate computer models, and most scientists admit uncertainty about the details.
13. We failed to predict both Pearl Harbor and September 11. Unknown unknowns got us. Got to watch out for loose Pakistani nukes and other potential catastrophic surprises in the future.
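The magnitude-frequency rule in item 5 above is the Gutenberg-Richter relation. A minimal sketch, assuming the typical b-value of 1.0 (the assumption that yields the book's factor-of-ten statement; real regional b-values vary somewhat):

```python
def relative_rate(magnitude, b=1.0):
    """Gutenberg-Richter relation: the rate of earthquakes of at least a
    given magnitude falls off as 10**(-b * magnitude), with b near 1.0
    in most seismic regions."""
    return 10 ** (-b * magnitude)

# With b = 1, magnitude-6 quakes occur about ten times as often as
# magnitude-7 quakes in the same region.
print(relative_rate(6.0) / relative_rate(7.0))  # ≈ 10
```

The relation says nothing about *when* the next quake will strike, only how often quakes of each size occur on average, which is exactly the distinction the chapter draws.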
on November 7, 2012
This book explains the unerring accuracy of Nate Silver's election predictions using Bayesian statistics. The BEST part of the book for me was that I finally understand Bayes' analysis. I used quite a few sophisticated statistical tools in my work (I retired as a reliability physics expert for semiconductor devices, aka chips), but I was never able to grasp Bayes' Theorem until now. Wikipedia's "tutorial" was far too complicated even for a PhD, but Nate provides a simple version that a layman can understand ... and he does it using a hilarious example (look for "cheating"). In fact, I am so impressed with Bayes' analysis that I am thinking about writing a corollary to my two best technical papers grafting on a Bayesian view.
Returning to the election prediction issue, consider that each poll of 1000 people has a sampling error of about ±3%, easily derived from binomial statistics. However, when one pools the results from, say, 25 polls (and removes bias), the effective sample size increases 25-fold, which reduces the sampling error 5-fold, down to about ±0.6%. Thus, one can make confident predictions about differences FAR smaller than the usual single-poll sampling error. When one combines Bayesian pooling with a state-by-state analysis, one can make astonishingly accurate predictions ... Nate predicted ALL 50 states correctly, so his electoral-vote count matched reality exactly as well, once fractional electoral votes are rounded off.
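A quick sketch of the pooling arithmetic, under the textbook simple-random-sample assumption (proportion near 0.5, 95% confidence): the margin of error scales as 1/√n, so pooling 25 comparable 1000-person polls shrinks it by √25 = 5. The numbers below are illustrative, not figures from the book:

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """95% margin of error for a proportion estimated from a simple
    random sample of size n (worst case at p = 0.5)."""
    return z * math.sqrt(p * (1 - p) / n)

single = margin_of_error(1000)       # one 1000-person poll: ~±3.1 points
pooled = margin_of_error(25 * 1000)  # 25 pooled polls:      ~±0.6 points
print(round(single * 100, 1), round(pooled * 100, 1))
```

Real poll aggregation is messier than this (polls have house effects and correlated errors, so they are not independent random samples), which is why the bias-removal step the reviewer mentions matters so much.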
Buy the book as it is educational and fun to read.
on September 30, 2012
This book was a disappointment for me, and I feel that the time I spent reading it was mostly wasted. I will first, however, describe what I think is *good* about the book. Everything in this book is very clear and understandable. As for the content, I think that the idea of Bayesian thinking is interesting and sound. The idea is that, whenever turning any hypothesis (e.g., that a positive mammogram is indicative of breast cancer) into a prediction (for example, that a particular woman with a positive mammogram actually has cancer), one must not forget to estimate all three of the following pieces of information:
1. The general prevalence of breast cancer in the population. (This is often called the "prior": how likely you thought it was that the woman had cancer before you saw the mammogram.)
2. The chance of getting a positive mammogram for a woman with cancer.
3. The chance of getting a positive mammogram for a woman without cancer.
People often ignore items 1 and 3 on the list, leading to very erroneous conclusions. "Bayes' rule" is simply a mathematical gadget that combines these three pieces of information and outputs the prediction (the chance that the particular woman with a positive mammogram has cancer). There is a very detailed explanation of this online (search Google for "yudkowsky on bayes rule"), no worse (if more technical) than the one in the book. If you'd like a less technical description, read Chapter 8 of the book (but ignore the rest of it).
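To make the three-ingredient recipe concrete, here is a minimal sketch of Bayes' rule applied to the mammogram example. The specific numbers (1.4% prevalence, 75% true-positive rate, 10% false-positive rate) are illustrative assumptions in the range such discussions typically use, not figures quoted from this review:

```python
def posterior(prior, p_pos_given_cancer, p_pos_given_no_cancer):
    """Bayes' rule: P(cancer | positive mammogram) from the three
    ingredients listed above."""
    numerator = prior * p_pos_given_cancer                       # item 1 × item 2
    denominator = numerator + (1 - prior) * p_pos_given_no_cancer  # + item 3 weighted
    return numerator / denominator

# ≈ 0.096: even after a positive test, cancer is still unlikely,
# because the low prior (item 1) and the false positives (item 3) dominate.
print(round(posterior(0.014, 0.75, 0.10), 3))
```

Ignoring items 1 and 3 amounts to reading the 75% true-positive rate as "a positive test means a 75% chance of cancer", which is off by almost a factor of eight with these inputs.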
Now for the *bad*. While the Bayesian idea is valuable, its description would fit in a dozen pages, and it is certainly insufficient by itself to make good predictions about the real world. I had hoped that the book would draw on the author's experience and give insight into how to apply this idea in the real world. It does the former, but not the latter. There are lots of examples and stories (sometimes amusing; I liked the chess story in Chapter 9), but the stories lead the reader to few insights.
The examples clearly lead to only one conclusion. If you need to be convinced that "the art of making predictions is important, but it is easy to get wrong", read this book. If you wonder, "how can we actually make good predictions?", don't. The only answers provided are useless platitudes: for example, "it would be foolish to ignore the commonly accepted opinion of the community, but one must also be careful not to get carried away by herd mentality". While searching for the words to describe the book, I found the perfect description in Chapter 12 of the book itself:
- - - - - - - - - - - - -
Heuristics like Occam's razor ... sound sexy, but they are hard to apply.... An admonition like "The more complex you make the model the worse the forecast gets" is equivalent to saying "Never add too much salt to the recipe".... If you want to get good at forecasting, you'll need to immerse yourself in the craft and trust your own taste-buds.
- - - - - - - - - - - - -
Had this quote been from the introduction, and had the book given any insight into how to get beyond the platitudes, it would be the book I hoped to read. However, the quote is from the penultimate chapter, and there is no further insight inside this book.
P.S. I first posted this review at Goodreads, and any updates will happen there and not here. Amazon has destroyed all the formatting and the hyperlink in this review, so the version at Goodreads is slightly better already.
on November 22, 2012
I'm a member of SABR, a fan of Silver's political blog, and as an atmospheric science professor I teach seminars on forecasting (weather forecasting and also an interdisciplinary seminar in prediction). I've won awards in a national weather forecasting contest, too. I thought this book would be the long-awaited grand unifier of the field of prediction. Instead, any time that Silver turns from his areas of expertise toward science, he faceplants.
Chapter 4: Nate thinks the U.S. weather forecast models are run on computers in Boulder, CO--completely wrong, total confusion of a research lab with the operational forecasting center in Maryland. He doesn't understand the complex interplay of models and data in weather forecasting, even though the previous chapter on baseball scouting is a perfect setup for it. He misdescribes what a nonlinear equation is. He misdescribes Ed Lorenz's famous chaos theory experiments. By the end of Ch. 4, he can't even spell "television" correctly. While he is complimentary of weather forecasters, he misses numerous connections with earlier chapters and really doesn't seem to understand the subject matter. (At least he doesn't think weather forecasting = Al Roker.)
But this is considerably better than his chapter on climate prediction (Ch. 12), in which he completely, utterly botches the definition of the greenhouse effect--he thinks it has to do with reflected solar radiation. At that moment, he loses all credibility... it's as if he were a scientist discussing sabermetrics in the Baseball Research Journal and referred to a baseball as "the pigskin." Total fail. (He cites a couple of URLs in this passage, including lecture notes from a Columbia/Barnard course I used to teach. The quickie-Wiki approach to becoming an 'expert' in a field. The URLs sound good, but clearly Silver learned zero about the greenhouse effect from them--zero.) Then Silver sets up a false comparison between a discredited Heartland Institute forecaster and the worldwide efforts of the IPCC... fail. He skims the whole world of climate modeling. By chapter's end, he basically comes off as a college freshman out of his depth, writing a term paper the morning it's due. He'd know more about climate forecasting if he read Ch. 16 of my college-level intro meteorology textbook, but apparently there was no time for anything that pedantic.
Silver's Acknowledgments seem to reveal the problem: his research assistant seems to have been mostly responsible for the science parts of the book. In retrospect, it's not hard to tell (from the depth of thinking and analysis, from the lack of proofreading, etc.) that Silver didn't care much about the science chapters in his book--a couple hours with a couple experts, some URLs provided by the researcher, that's plenty for him. If Nate were in my freshman Intro to Weather and Climate class, he'd miss a lot of questions on the final exam.
I like a lot of Silver's big-picture perspectives on forecasting, and his discussion of Bayesian statistics is illuminating, but his lack of follow-through in the scientific fields where forecasting has been explored most successfully has seriously undermined his book. Somebody tell the New York Times that "The Signal and the Noise" is *not* one of the momentous books of the decade. Instead, it's yet another example of a book that grew out of a blog but needed to stay in the oven a while longer. Hey, Penguin Press, don't publishers hire expert content reviewers (not to mention copy editors) anymore?
on January 2, 2013
What aspects of the future are predictable? This wide-ranging book, from an author who not only understands both data and theory but also can write engaging prose, is certainly the best non-technical discussion I have ever seen. The empirical fact, that in many contexts predictions have turned out to be much less accurate than their makers claim, has often been made (recently in The Black Swan, for instance), but deserves frequent repetition to counteract the media punditry to which we are all exposed. The 13 chapters tackle different subjects via a nice combination of details and overview. Though aimed at the general reader, even a professional statistician will find many interesting new details (for instance, I didn't know that local TV weather forecasters deliberately mis-state the probability of rain). The chapter titles are alas more cutesy than informative, so let me list their actual topics here, which are (predictability of): mortgage defaults, elections, baseball player performance, weather, earthquakes, economic indicators, epidemics, Bayes and sports betting, chess computers, professional poker, stock market, climate change, terrorism.
Because the point is that predictability differs in different contexts, the book wisely offers no grand theory. From the list above of topics, you will see the author focuses on contexts where there is a lot of past data, and the central issue (his signal/noise analogy) is determining which aspects of the data are useful in predicting the future. Using Tetlock's fox/hedgehog analogy, he advocates being a fox: "... pursue multiple approaches; incorporate ideas from different disciplines; see the universe as complicated, perhaps ... inherently unpredictable". And he advocates "thinking probabilistically", making probabilistic rather than deterministic predictions. Both are views I would heartily endorse.
Silver's own technical expertise is in three of the topics (baseball player performance, poker, election prediction from opinion polls). These are particularly amenable to the signal/noise paradigm -- with lots of data and only slowly changing ground rules -- and number-crunching can lead to prediction rules that are human-interpretable. But his accounts of the other topics are equally good. Indeed I may use several of his chapters as a basis for future lectures in my Berkeley "Probability in the real world" course and will assign my (Statistics major) students to read the book.
My main quibble concerns a glaring omission. There is an active field, inside and outside academia, called "machine learning", which seeks to develop prediction algorithms to be numerically calculated by computer without caring if they form human-interpretable rules. These are used in a huge range of contexts, from spotting trends in retail sales to genomic research. But Silver makes no mention of this parallel world. In particular, as this book notes, there is an intrinsic competitive aspect to prediction -- did my prediction work out more accurate than yours? -- which often has material consequences -- all financial speculation is in essence about predicting better than the market consensus. Machine learning has a culture of public competitions (best known being the Netflix Prize), from which we know quite a lot about how accurate its predictions can be. So Mr Silver, please add a chapter on machine learning in the second edition.
on November 7, 2012
This would be a great book anytime but it is a must read following the election. Why were so many pundits surprised by Obama's victory? It's not rocket science if you know how to separate the signal from the noise. Unfortunately the pundits are frequently the source of the noise. It will be hard to watch the Sunday morning news programs after reading this book. No wonder the pundits hate Nate Silver.
I had high hopes for this book. A book on forecasting by someone who has actually been successful at it; what could be better than that?
===The Good Stuff===
* Silver is a fairly honest writer. He is able to describe his previous forecasts, the methodology he used, and is frank with himself and the reader about what failed and what worked.
* Unlike some other books on this subject, Silver comes across as actually understanding the theory behind probability, and knows how to apply it to everyday problems.
* The material is chosen from a variety of topics, with different constraints and limitations. Chess, which is relatively deterministic but has a large number of permutations, is very different from weather, which follows a few simple laws but requires intensive calculation to forecast. Silver explains how both of these create different challenges, and techniques for overcoming those problems.
* He avoids the "bag of red and black ping-pong balls" nonsense that usually clutters up books on this subject, and his discussion of Bayesian probability analysis is first-rate.
===The Not-So-Good Stuff===
* I found the book a little tough to read. Silver writes well enough, but the material is presented in a much drier and more formal manner than other books such as Freakonomics. This is my major issue with the book. It is not a "mass market, fun to read" book, but neither is it a rigorous treatment of a mathematical subject. Rather, it sort of languishes in a no-man's-land between the two. And for those of us who at least think we understand statistics, some of the passages in the book are frustrating as we try to decode what Silver is really talking about.
* Silver has a bad habit of interjecting his own opinions and thoughts into the discussion. For example, his discussion of economic forecasting gets mired down in his own opinions on government spending levels and priorities. It takes away from the objectivity, and I know Silver knows better.
Even though it took me over a week to read this book (a very long time for me), I enjoyed most of it. I did learn a few things about how forecasts work, and how to spot the difficulties in forecasting any given event. However, it is more of a high-level look, and you will not learn much about how to actually forecast anything from this book.
I would have preferred the book to either be more entertaining (Moneyball), or more "textbooky" rather than trying to be a little of both. In some ways, I think I would have enjoyed it more if I knew less about the math of statistics. Still, if you are at all interested in how complicated processes like weather, chess and national economies are forecast, it is well worth a read.
on October 27, 2012
This is a wonderful book. This is another book I believe should be required reading for everyone. The author weaves a story about prediction/forecasting and its limitations with a somewhat autobiographical journey which provides detailed explorations of politics, economics, sport (baseball), weather, earthquakes, gambling (poker), climate change and more.
The writing style is engaging, forthright, and humorous as well as instructive. The book covers the limitations of models' predictive power and how deeply human the endeavour of predicting and forecasting is. It requires both deep understanding and statistical modelling. It is an iterative process and needs to be open and driven by the pursuit of truth.
The insights of Nobel Laureate Daniel Kahneman on the limits of human reason (see Thinking, Fast and Slow) are discussed, along with the problems of faulty assumptions (linearity for non-linear phenomena, independence in highly dependent environments, power-law distributions) and a number of biases, including our propensity for overconfidence, distortion of risks, and closed-mindedness (attachment to preconceived notions). The book is not a dry technical exploration but a clear, entertaining one. The graphics powerfully reinforce the narrative.
There is overlap with the themes of The Black Swan. I must admit that I prefer Nate Silver's exploration: I found the storytelling more enjoyable, the arguments clearer and more convincing, and the witticisms more amusing. I did not know anything about American baseball but enjoyed the author's passion. The perspective on Moneyball provided an insight into that complex world better than the somewhat simplistic movie version (I have not read the book). Similarly, I have never played poker, but the chapter devoted to it was interesting both in the context of forecasting and as an object lesson in the important theme of the deeply human endeavour of trying to understand the world and its uncertainty.
I am inspired after reading this book to understand more about modelling, despite an emphasis on its pitfalls. If nothing else, I hope I am more analytical and open to information presented to me and more insightful and reflective enough to learn from my failures as well as more realistically appraise my apparent successes.
on September 26, 2013
I fail to understand why so many people seem to love this book. I suppose that as a first encounter with forecasting and probability, this book seems to explain it all. (Which means hardly anyone knows probability/forecasting, which probably is true.) There are several good examples in the book, but that is pretty much it. There is no insight in the book unless it is your first encounter with the subject. Since the book is such easy reading, you are likely to come away thinking that it was great and that you learnt a lot.
Personally, I don't like the author's tone. He presents himself as an expert on everything. He comes in with his helicopter perspective, pronounces what is right and wrong, and then flies off to the next location. I could accept this if the author were 50 years old and had 25 years of astonishing forecasts behind him.