What is a p-value anyway? 34 Stories to Help You Actually Understand Statistics 1st Edition
What is a p-value Anyway offers a fun introduction to the fundamental principles of statistics, presenting the essential concepts in thirty-four brief, enjoyable stories. Drawing on his experience as a medical researcher, Vickers blends insightful explanations and humor, with minimal math, to help readers understand and interpret the statistics they read every day.
What is a p-value Anyway is the perfect complement to any introductory statistics textbook and will succeed in demonstrating the everyday importance of statistics to your class.
- Publisher : Pearson; 1st edition (November 18, 2009)
- Language : English
- Paperback : 224 pages
- ISBN-10 : 0321629302
- ISBN-13 : 978-0321629302
- Item Weight : 11.6 ounces
- Dimensions : 6.9 x 0.6 x 9 inches
- Best Sellers Rank: #695,853 in Books
Top reviews from the United States
In this introduction, the author summarizes the content well by noting that "the first 12 chapters deal with some basics, such as averages, variation, distributions and confidence intervals. I then have a few chapters on hypothesis testing and p-values, before discussing regression - the statistical method I use most in my work - and decision making - which generally should be, but often isn't, what statistics is about. The last third of the book, starting from the chapter 'One better than Tommy John', is devoted to discussing a wide variety of statistical errors."
"If it seems odd to devote so much of a book to slip-ups, it is because I have a little theory that 'science' is just a special name for 'learning from our mistakes'. When I teach, I give bonus points for any student giving a particularly dumb answer because those are the ones we really learn from. In fact, I don't think you can really understand, say, a p-value, without seeing some of the ways it has been misused and thinking through why these constitute mistakes. So please don't blow these chapters off thinking you've read the stuff you'll be examined on: the final chapters will really fill in your statistical knowledge."
Chapters that I especially appreciated include Chapter 9 on degree of normal distribution fit, Chapter 11 on variation and confidence intervals, Chapters 13, 14, 15, 23, and 29 on p-values, Chapter 17 on sample size, precision, and statistical power, Chapter 19 on regression and confounding, Chapter 20 on specificity and sensitivity, Chapter 21 on decision analysis, Chapter 22 on statistical errors, and Chapter 32 on science, statistics, and reproducibility. The discussion section that comprises the last 25% of what the author shares here works through questions posed at the end of each of the 34 chapters, and much of the value that I personally obtained from this text can be found in this section.
In my opinion, it is the rare reader who will not find the author's discussions worth remembering, because Vickers simply tells it like it is. In Chapter 5, for example, the author describes a continuous variable as one that can take "a lot of different values", and in the discussion section for this chapter he points out that "statisticians disagree on this point (statisticians disagree on a lot of points, which just goes to show how much of statistics is a judgement call)." In Chapter 13, the author indicates that he "provided strong evidence" for something, and in the discussion section comments that "'proof' is not a word often used by scientists."
"Statisticians are particularly careful with the word 'proof', because they are keenly aware of the limitations of data, and the important role that chance plays in any set of results. Statisticians normally use the word 'proof' only to refer to mathematical relationships between formulas. The point here is that you don't use data to do math theory, so you aren't subject to the limitations of data, and so you can go about really claiming to have 'proved' something. It is certainly unwise to think that you can prove anything by applying a statistical test to a data set."
In Chapter 9, the author notes that "statisticians don't typically seem to worry too much about whether or not the data are a close fit to the normal distribution because they realize that statistics isn't football, and no one is going to throw a flag and send you back 10 yards if you are caught breaking the rules. In fact, there aren't really many 'rules' at all." After quoting one sentence from a scientific paper describing a clinical trial in Chapter 22, the author works through the sentence and discusses each of the four statistical errors made by the researcher, and discusses why he cares about such errors.
"Many people seem to think that we statisticians spend most of our time doing calculations, but that is perhaps the least interesting thing we do. Far more important is that we spend time looking at numbers and thinking through what they mean. If I see any number in a scientific report that is meaningless - a p-value for baseline differences in a randomized trial, say, or a sixth significant figure - I know that the authors are not being careful about what they are doing, they are just pulling numbers from a computer printout. Statistics is more than just cutting and pasting from one software package to another. We have to think about what the numbers mean and the implications for our scientific question." Well said.
Since then I’ve found Geoff Cumming’s book (Understanding the New Statistics). Spend the extra bucks and get it. He’s entertaining (great visuals) and will really teach you what you need to know about doing sound statistics, with no p-values.
If you care to bear with me, let’s talk about p-values.
If you’ve taken an intro stats course, you were taught to compute your statistic, state a “null hypothesis” that your result is due to chance, then look at the p-value (computed by the software) to see whether it was caused by “chance”. If p is greater than 0.05 (or 0.01), your result is “not statistically significant”, and you lose. If the p-value is less than 0.05, your result is “statistically significant”, and you win. Vickers knows this is a big problem. In chapter 13, he explains how null hypothesis testing (p-values) doesn’t really tell you what you want to know. (A “null hypothesis” is that there is no difference, or that nothing happened.) In chapter 14, he waffles on about it. In chapter 15, he shows how a LARGE p (> 0.05) doesn’t actually tell you that your result is “insignificant”. In chapters 23 and 28, he shows how a SMALL p doesn’t actually tell you that your result is “significant”. In chapter 29, he tells you that you cannot compare p-values, so you cannot say that a smaller p means something more than a larger p. And this continues into the discussion sections. Yet he religiously discusses and reports p-values: chapters 13, 14, 15, 17, 22, 23, 24, 28, 29, plus many of the discussion pages.
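To make the textbook recipe concrete, here is a minimal sketch (not from the book) of one way a p-value can actually be computed: a permutation test for a difference in group means. The data and group names are made up for illustration; the point is only the mechanics of "assume the null, count how often chance alone produces a result at least this extreme."

```python
import random
import statistics

def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in means.

    Under the null hypothesis, group labels are arbitrary. The p-value is
    the fraction of random relabelings that yield a mean difference at
    least as large as the one actually observed.
    """
    rng = random.Random(seed)
    observed = abs(statistics.mean(group_a) - statistics.mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(statistics.mean(pooled[:n_a]) - statistics.mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    return extreme / n_permutations

# Hypothetical data: some outcome under treatment vs. control.
treated = [5.1, 6.2, 5.8, 6.0, 5.5, 6.4]
control = [4.9, 5.0, 5.2, 4.8, 5.3, 5.1]
p = permutation_p_value(treated, control)
print(p, "significant" if p < 0.05 else "not significant")
```

Note what the last line does: it collapses the p-value into the win/lose dichotomy the reviewer describes, which is exactly the practice the later chapters criticize.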
Ok, so what is a p-value anyway?
Null hypothesis tests are poorly understood, and p-values even more so. You want to know whether YOUR hypothesis is true, given your data. What a p-value actually tells you is the probability of seeing data at least as extreme as yours, assuming the “null hypothesis” IS true. If you’ve read this far, read those two sentences a couple of times, because the statistical logic really is backward from normal logic.
P-values are based on the assumption that the “null hypothesis” IS true, and also that you took a sample from a population. The p-value doesn’t tell you whether the null hypothesis is true or even likely, and it doesn’t tell you whether your hypothesis is true. In addition, you usually want to know whether what you observe matters. P-values don’t tell you this either; to paraphrase many noted statisticians, a large p-value only tells you that you needed more samples, while a small p-value means you had plenty of samples.
So why is this a problem, and why am I annoyed by this book? What a novice will get from this book (just like many intro stats courses) is that they are supposed to calculate some p-value to determine if the “null hypothesis” is true. They will also get that there are problems with it, but that they are still supposed to calculate it. Well, the point that Vickers dances around is that a p-value doesn’t actually mean anything. The only thing you can do with it is compare it to some number like 0.05. Then you are back to p > 0.05 and you lose, p < 0.05 and you win.
P-values are based on samples (sub-sets) from a population. In my field (environmental science), we often examine the entire population of data (for example, in the last year, how much hotter was the temperature in Washington when Congress was in session than when it wasn’t). So we have all the data, not a sample of the data. We also examine models where we know the “null hypothesis” is false (because of much other research), and what we want, anyway, is the error on the model, not this peculiar likelihood that is a p-value. Unfortunately, too many people throw away real trends and real patterns because p was > 0.05, thinking that the null hypothesis was “proved” when it wasn’t. This is not a random rant; statisticians have been all over this problem, and I see it repeatedly in the science I review.
If you borrow the book, read until page 53 (the last page before p-values), then quit. Vickers covers means, medians, standard deviation and confidence intervals, and hasn’t started on the deep-fried bacon. Then go get Geoff Cumming’s book.
While I really like this book, I do not love it, because it is missing guidance for people who want to learn more. It will desensitize people who are afraid of statistics, but I fear that they will then flounder their way to less friendly books and relearn why they hate math. It would have been better if Vickers had pointed readers to other friendly statistics books for people with different backgrounds, like Biostatistics: The Bare Essentials, 3e, or Intuitive Biostatistics: A Nonmathematical Guide to Statistical Thinking, or Biostatistics For Dummies for people who like biology instead of math. Even with this serious shortcoming, this is an excellent (but rather pricey) book.