Kirkwood's book is one of the best, but it is not a standalone book.
on November 21, 2009
Medical Statistics 2nd ed. by Kirkwood and Sterne is a 501-page book printed on semi-glossy paper. There are six chapters and nine appendices. Essentially every page has a mathematical formula or a table of numbers. This is a good thing because it keeps the narratives concrete and firmly grounded. Only high school math is needed.
We learn about standard deviations, standard normal distributions, t distributions, Z statistic, t statistic, confidence intervals, hypothesis testing, p values, and more advanced formulas such as goodness of fit, Mantel-Haenszel methods, hazard ratio, Wilcoxon signed rank test, and Bayesian statistics. All of the examples concern clinical studies. This is in contrast to David Jones' excellent statistics book, where the many examples tend to concern manufacturing pills and vials.
Kirkwood's book and David Jones' (http://www.amazon.com/Pharmaceutical-Statistics-David-S-Jones/dp/0853694257/ref=sr_1_5?ie=UTF8&qid=1387942697&sr=8-5&keywords=statistics+david+jones) book appear to be the best of the clinical statistics books, but neither is a standalone book. Both books suffer from the organizational problems that seem to characterize most books on clinical statistics. The reader will need to turn to Lange and to Durham to fill in various gaps.
I was specifically interested in understanding standard deviations, the Z statistic, the t statistic, p values, confidence intervals, alpha values, hazard ratios, Wilcoxon rank sum test, critical values, sample size and power calculations, logrank statistics, and survival curves (Kaplan-Meier curves).
In setting out to understand these equations, I bought and read:
(1) Dawson and Trapp (2004) Basic & Clinical Biostatistics (Lange Series);
(2) Durham and Turner (2008) Introduction to Statistics in Pharmaceutical Clinical Trials;
(3) Kirkwood and Sterne (2003) Medical Statistics;
(4) Motulsky (1995) Intuitive Biostatistics. In my opinion, Motulsky leaves the reader in a free-fall. On occasion, Motulsky provides some insight or guidance on using the various equations. But I would not recommend that any novice interested in statistics look first to Motulsky;
(5) Bart J. Harvey (2009) Statistics for Medical Writers and Editors. This tiny book is the best for understanding Standard Deviations. Harvey does not get much beyond SDs, and there is almost nothing about the t test or about p values;
(6) Rosner. Fundamentals of Statistics.
Provided that you have all four of these books -- (1) Jones, (2) Kirkwood, (3) Lange (Dawson and Trapp), and (4) Durham -- it is possible to learn statistics for clinical trials. The following compares Jones, Kirkwood, Lange, and Durham.
KAPLAN-MEIER PLOTS (survival plots). Kirkwood (pages 272-286) discloses survival plots. Jones fails to disclose Kaplan-Meier curves. For this topic, I refer the reader to the excellent discussion in Durham & Turner. But Lange (pages 221-244) is by far the best for this topic.
SAMPLE SIZE AND POWER CALCULATIONS. Kirkwood (pages 417-418) is the best of the books for sample size calculations, as Kirkwood provides a straightforward formula. Jones discloses sample size and power calculations on pages 172-179. The Jones narrative seems to come to a dead end on page 175, where delta is revealed as equaling 2.5 mL. But according to my calculations, delta should be 1.2 mL (50-48.8=1.2 mL). Lange (page 127) seems to lead to a dead end (since Lange fails to explain where the number -0.84 comes from).
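One likely source for the -0.84 (an assumption, since it cannot be confirmed from the page itself) is the standard normal table: -0.84 is the Z value that cuts off the lower 20% of the distribution, which corresponds to 80% power. The sketch below uses a common textbook sample-size formula for comparing two means, n per group = 2 x ((z_alpha + z_beta) x SD / delta)^2. It is not necessarily the formula printed in Kirkwood, and the SD of 2.0 mL is a made-up number for illustration only.

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_group(delta, sd, alpha=0.05, power=0.80):
    """Subjects needed per group to detect a difference `delta` between two means.

    Common normal-approximation formula (an assumption, not taken from the books):
        n = 2 * ((z_alpha + z_beta) * sd / delta) ** 2
    """
    z = NormalDist()                       # standard normal: mean 0, SD 1
    z_alpha = z.inv_cdf(1 - alpha / 2)     # about 1.96 for a 2-tailed alpha of 0.05
    z_beta = z.inv_cdf(power)              # about 0.84 for 80% power
    return ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)

# The lower 20% point of the standard normal curve is about -0.84.
print(round(NormalDist().inv_cdf(0.20), 2))

# delta = 1.2 mL as in the review (50 - 48.8); sd = 2.0 mL is hypothetical.
print(sample_size_per_group(delta=1.2, sd=2.0))
```

Raising the power or shrinking delta makes the required sample size grow, which is the trade-off these calculations are meant to expose.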
FORMULA for CONFIDENCE INTERVAL FOR LARGE SAMPLES. Pages 60-63 of Kirkwood cover this topic, while pages 135-141 of Jones cover this topic. For two different treatments (study drug group and placebo group), Kirkwood's coverage is on pages 67-70, and Jones' coverage is on pages 141-144.
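For readers who want to see the large-sample formula in action, here is a minimal sketch of the usual form, sample mean plus or minus Z x SD / (square root of n). The function name and the numbers (mean 100, SD 15, n 100) are hypothetical and are not taken from Kirkwood or Jones.

```python
from math import sqrt
from statistics import NormalDist

def large_sample_ci(sample_mean, sd, n, level=0.95):
    """Confidence interval for a mean, using the Z (normal) multiplier."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # about 1.96 for a 95% CI
    half_width = z * sd / sqrt(n)                  # Z times the standard error
    return sample_mean - half_width, sample_mean + half_width

# Hypothetical data: mean 100, SD 15, 100 subjects.
low, high = large_sample_ci(sample_mean=100.0, sd=15.0, n=100)
print(f"95% CI: {low:.2f} to {high:.2f}")
```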
FORMULA for CONFIDENCE INTERVAL FOR SMALL SAMPLES. Kirkwood's pages 53-55 are equivalent to pages 151-154 of Jones, for this simple formula. But Kirkwood's presentation of general information on Confidence Intervals (pages 50-53 of Kirkwood) is far superior to any presentation in Jones on this particular topic. Hence, I would recommend readers consult both Jones and Kirkwood for this formula.
FORMULA for HYPOTHESIS TESTING FOR LARGE SAMPLES. Kirkwood's treatment is on pages 46-49 and 69, while Jones' treatment is on pages 87-125. Kirkwood has only two examples, but Jones has seven examples. In this respect, Jones is much better than Kirkwood. Both books teach us that once we have calculated the Z statistic, there are two things we can do with it. First, we can plug it into a table of STANDARD NORMAL DISTRIBUTION and get the P value (a probability). Second, we can compare it with a standard number, e.g., "1.96" for use in hypothesis testing, that is, for getting a yes/no answer (pages 157-159 of Jones, page 244 of Rosner). To write this commentary, I needed to piece together information from several books. None of the books provides an organized stand-alone account of this formula.
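The two uses of the Z statistic can be sketched in a few lines. This is only an illustration under the assumption of a 2-tailed test. The function names are made up, and Python's built-in NormalDist stands in for the printed table of the standard normal distribution.

```python
from statistics import NormalDist

def two_sided_p_value(z):
    """Use 1: look up the Z statistic in the standard normal 'table' to get a P value."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

def reject_null(z, alpha=0.05):
    """Use 2: compare the Z statistic with the critical value (about 1.96 for alpha = 0.05)."""
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z) > critical

z = 2.5  # hypothetical Z statistic
print(f"P value: {two_sided_p_value(z):.4f}")
print("Reject the null hypothesis?", reject_null(z))
```

Both routes give the same verdict: a Z statistic beyond 1.96 always has a 2-tailed P value below 0.05.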
FORMULA for HYPOTHESIS TESTING FOR SMALL SAMPLES. Kirkwood's treatment is on page 66, while Jones' treatment is on pages 168-178. These books teach that once we have the t statistic, there are two things we can do with it. First, we can plug it into a table (Table A4 of Kirkwood) to obtain a P value, and get a probability. Second, we can compare it to the "critical value of the t distribution" for hypothesis testing, and get a yes/no answer (Lange, pages 101 and 107). To write this commentary, I needed to piece together information from several books. None of the books provides an organized stand-alone account of this formula.
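The yes/no route for small samples can be sketched as follows, assuming a one-sample, 2-tailed test at alpha = 0.05. The ten data points and the hypothesized mean of 5.0 are invented for illustration. The small dictionary holds a few standard 2-tailed 5% critical values of the t distribution, copied by hand in place of a full printed table.

```python
from math import sqrt
from statistics import mean, stdev

# 2-tailed 5% critical values of t, keyed by degrees of freedom (n - 1).
T_CRITICAL_05 = {4: 2.776, 9: 2.262, 14: 2.145, 19: 2.093, 29: 2.045}

def one_sample_t(data, mu0):
    """t statistic for comparing a sample mean with a hypothesized mean mu0."""
    n = len(data)
    standard_error = stdev(data) / sqrt(n)  # stdev() uses the n - 1 denominator
    return (mean(data) - mu0) / standard_error

# Hypothetical measurements from 10 subjects.
data = [5.1, 4.9, 5.6, 5.8, 6.0, 5.3, 5.7, 5.2, 5.9, 5.5]
t = one_sample_t(data, mu0=5.0)
critical = T_CRITICAL_05[len(data) - 1]
print(f"t = {t:.2f}, critical value = {critical}")
print("Reject the null hypothesis?", abs(t) > critical)
```

Getting an exact P value from t requires the t distribution itself, which is not in Python's standard library; a printed table or a statistics package fills that role.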
DISORGANIZATION. A fault with all of the above statistics books is their disorganization. The reader would have an easier time if the authors would devote one page to disclosing that statistics requires making one's way through a decision tree. There are four yes/no questions in this decision tree. These four yes/no questions are independent of each other (see below). Following this, there are two more branches, dependent on earlier branches.
(1) Are you interested in CI, or are you interested in hypothesis testing?
(2) Is your sample from a large group, e.g., 60 data points or more (then you should use the Z statistic), or is your sample from a small group, e.g., under 60 data points (then you should use the t statistic)?
(3) Are you comparing one sample mean with a population mean? Or does your comparison involve a "paired measurement," that is, data from a STUDY DRUG GROUP and data from CONTROL GROUP, where you compare [sample mean#1 minus sample mean#2] with [population mean#1 minus population mean#2]?
(4) Does your experimental setup involve a 1-tailed experiment, or does your experimental setup involve a 2-tailed experiment?
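The four questions above can be written down as a tiny function, which is roughly the one-page summary the books lack. This is only a sketch: the function and parameter names are invented, and the cutoff of 60 data points is the rule of thumb stated in question (2), not a universal standard.

```python
def choose_procedure(goal, n, two_groups, two_tailed, large_n=60):
    """Answer the four independent yes/no questions and name the procedure.

    goal:       "ci" for a confidence interval, "test" for hypothesis testing
    n:          number of data points in the sample
    two_groups: True for study drug group vs. control group,
                False for one sample mean vs. a population mean
    two_tailed: True for a 2-tailed setup, False for a 1-tailed setup
    """
    if goal not in ("ci", "test"):
        raise ValueError("goal must be 'ci' or 'test'")
    goal_name = "confidence interval" if goal == "ci" else "hypothesis test"
    statistic = "Z" if n >= large_n else "t"   # rule of thumb from question (2)
    comparison = ("difference between two group means" if two_groups
                  else "one sample mean vs. a population mean")
    tails = "2-tailed" if two_tailed else "1-tailed"
    return f"{goal_name} using the {statistic} statistic, {comparison}, {tails}"

print(choose_procedure("test", n=25, two_groups=True, two_tailed=True))
print(choose_procedure("ci", n=100, two_groups=False, two_tailed=True))
```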
Then, there are three additional yes/no decisions to make in the decision tree:
FIRST DECISION. If you are working with the t statistic, you can use it to acquire a P value (page 66, Kirkwood) or you can use it for comparing with the "critical value of the t statistic," for use in hypothesis testing (page 63, Kirkwood; pages 140-141, Lange). To repeat, once you have a value for the "t statistic," you can use this value for two different purposes: (1) PLUG IT INTO A TABLE TO GET A P VALUE; or (2) COMPARE IT WITH THE CRITICAL VALUE OF THE t STATISTIC TO DO HYPOTHESIS TESTING.
SECOND DECISION. If you are working with the Z statistic, you can use it to acquire a P value (page 92, Jones), or you can use it to do hypothesis testing (pages 157-159 of Jones, page 244 of Rosner, page 141 of Lange). To repeat, once you have a value for the "Z statistic," you can use this value for two different purposes: (1) PLUG IT INTO A TABLE TO GET A P VALUE; or (2) COMPARE IT WITH THE CRITICAL VALUE OF THE NORMAL DISTRIBUTION TO DO HYPOTHESIS TESTING.
THIRD DECISION. If you are working with the Z statistic, you need to decide between two formulas. Z = [X-u]/[SD] (FIRST FORMULA); and Z = [X-u] / [(SD)/(square root of n)] (SECOND FORMULA).
The question is, when do you use the FIRST FORMULA and when do you use the SECOND FORMULA? Kirkwood fails to address this question. But Jones in combination with Lange provides excellent guidance as to when to use the FIRST FORMULA and when to use the SECOND FORMULA. For use of the FIRST FORMULA, see Jones pages 87-89 and Lange page 87. For use of the SECOND FORMULA, see Jones pages 156-159, and Lange pages 85-86. In a nutshell, the FIRST FORMULA is used for comparing data from a mean with a hypothetical value of interest to you (a value that you dreamed up out of thin air). The SECOND FORMULA is used when numbers are available for a Population Mean (and Population SD), and where numbers are available for a Sample Mean, and where your goal is to find the probability that a hypothetical sample mean will have a value that is higher than (or lower than) the value of the sample mean. Jones does a better job at distinguishing between the two formulas than does Lange. Jones comes to the rescue!
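Side by side, the two formulas differ only in the denominator: the FIRST FORMULA divides by the SD itself, while the SECOND FORMULA divides by the SD over the square root of n (the standard error of the mean). Here is a minimal sketch with hypothetical numbers (population mean 100, SD 15); the function names are invented for illustration.

```python
from math import sqrt

def z_first_formula(x, mu, sd):
    """FIRST FORMULA: Z = [X - u] / [SD], for a single value X."""
    return (x - mu) / sd

def z_second_formula(sample_mean, mu, sd, n):
    """SECOND FORMULA: Z = [X - u] / [SD / sqrt(n)], for the mean of n data points."""
    return (sample_mean - mu) / (sd / sqrt(n))

# A single value of 130 is 2 SDs above the population mean.
print(z_first_formula(x=130, mu=100, sd=15))

# A sample mean of 103 from 25 subjects is 1 standard error above the population mean.
print(z_second_formula(sample_mean=103, mu=100, sd=15, n=25))
```

With n = 1 the two formulas give the same answer, which is one way to remember how they are related.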
The above collection of decisions in the decision tree is inherent in all of the statistics books, including Kirkwood, but the books fail to make explicit the fact that these decisions are what statistics is about. In my opinion, there is no excuse for this sort of obscurity and disorganization that so characterizes statistics books. It does not have to be this way.
The formulas used for clinical statistics have been around for over 100 years. The mathematics used in these books is on the level of a college freshman. Why is it that the presentation of statistics in all of these books, including Kirkwood's book, is so disorganized, disjointed, and sporadic? These formulas are not rocket science. There is no excuse for the generally uneven quality of the available statistics books.
Kirkwood is further confusing when a specific symbol in a formula needs a subscript. What is confusing is that Kirkwood fails to print the subscript-worthy symbol as a subscript. The consequence is that the reader might think that the first symbol needs to be multiplied by the second subscript-worthy symbol. What a CONFUSING MESS this is! Kirkwood's confusing mess can be found, for example, on pages 148-164. We read about "s.e.(p1-p0)." But the (p1-p0), or in other formulas the (p1), which occurs in the part of a formula to the left of the equal sign (and also in parts of a formula to the right of an equal sign), is not properly subscripted. Mess, MESS, MESSSSSSSSS. Fortunately, Lange comes to the rescue. Lange properly shows subscripted symbols on pages 137 and 147 of Lange. Thank you, Lange! This is not rocket science, gang. Subscripting is what we learn in middle school math. There is no excuse for this time-wasting, confusing way of writing statistics books.
Actually, Lange does provide a DECISION TREE for determining which statistics formula to use. This DECISION TREE is located on the inside front cover, but it does not come with any written commentary. Lange has the right idea, as far as efficacy in statistics teaching is concerned. But Lange does not go far enough.
I finally found a statistics book with this kind of decision tree. The book is: Daniel WW (2009) Biostatistics, 9th ed., John Wiley & Sons, Inc., Hoboken, NJ, p. 176. Another fine book is as follows. Norman GR, Streiner DL (2008) Biostatistics 3rd ed. B.C. Decker, Inc., Hamilton, Ontario, p. 35. Norman and Streiner know how to teach, and they warn students of the various ambiguities and inconsistencies found in other statistics books. The Daniel book, and the Norman & Streiner book, fill in the gaps where other statistics books are just plain confusing. These confusing statistics books (at least confusing in some issues) are as follows: Kirkwood & Sterne, David Jones, Dawson & Trapp, and Durham & Turner. I am still puzzled as to why all of these statistics books use such strikingly different approaches to using the Z statistic, and different approaches for plugging the Z statistic into a table to get the P value. At any rate, Norman & Streiner is unique among all statistics books, in actually recognizing this inconsistency in the table that must be used for plugging in the Z value.