Ensemble Methods in Data Mining: Improving Accuracy Through Combining Predictions (Synthesis Lectures on Data Mining and Knowledge Discovery) Paperback – February 24, 2010
From the Inside Flap
"This book by Seni and Elder provides a timely, concise introduction to this topic. After an intuitive, highly accessible sketch of the key concerns in predictive learning, the book takes the readers through a shortcut into the heart of the popular tree-based ensemble creation strategies, and follows that with a compact yet clear presentation of the developments in the frontiers of statistics, where active attempts are being made to explain and exploit the mysteries of ensembles through conventional statistical theory and methods."
"The practical implementations of ensemble methods are enormous. Most current implementations of them are quite primitive and this book will definitely raise the state of the art. Giovanni Seni's thorough mastery of the cutting-edge research and John Elder's practical experience have combined to make an extremely readable and useful book."
About the Author
The authors are industry experts in data mining and machine learning who are also adjunct professors and popular speakers. Although early pioneers in discovering and using ensembles, they here distill and clarify the recent groundbreaking work of leading academics (such as Jerome Friedman) to bring the benefits of ensembles to practitioners.
Top customer reviews
But overall, this is a must-read book if you are in the data science field.
On the one hand, the book seems rather light for an academic audience (it only skims the surface of each topic). On the other hand, it is too academic for industry practitioners. So it’s not fully clear who the target audience is.
Some figures are missing axis labels, and the quality of certain figures is really low. In conclusion, I would recommend it only if you need an overview of techniques in the field and are not scared of reading equations instead of plain English.
The heart of the text is the chapter on Importance Sampling. The authors frame the classic ensemble methods (bagging, boosting, and random forests) as special cases of the Importance Sampling methodology. This not only clarifies the explanations of each approach, but also provides a principled basis for finding improvements to the original algorithms. They have one of the clearest descriptions of AdaBoost that I've ever read.
The penultimate chapter is on "Rule Ensembles": an attempt at a more interpretable ensemble learner. They also discuss measures for variable importance and interaction strength. The last chapter discusses Generalized Degrees of Freedom as an alternative complexity measure; it is probably of more interest to researchers and mathematicians than to practitioners.
Overall, I found the book clear and concise, with good attention to practical details. I appreciated the snippets of R code and the references to relevant R packages. One minor nitpick: this book has also been published digitally, presumably with color figures. Because the print version is grayscale, some of the color-coded graphs are now illegible. Usually the major points of the figure are clear from the context in the text; still, the color to grayscale conversion is something for future authors in this series to keep in mind.
PS. Morgan & Claypool sells the book's PDF for $20, or $0 for those affiliated with the publisher's institutional subscribers.
- good mathematical descriptions of the algorithms;
- intuitive explanations of the concepts involved;
- several illustrative examples (including R code); and
- a great structured guide to the vast literature on ensemble methods.
The only other book I know that covers ensemble methods is the well-known "The Elements of Statistical Learning" (TEOSL), which can be quite a dense read at times. For example, TEOSL covers Importance Sampling Learning Ensembles (ISLE) and Rule Ensembles (RE) in a couple of pages each, whereas EMIDM dedicates a chapter to each (personally, I had overlooked the significance of those 2 methods until I read the more developed narrative in EMIDM).
In any event, you should own a copy of TEOSL (which can be freely downloaded from the authors' website), but if you want to master ensemble methods (currently one of the hottest areas in data mining and machine learning) and confidently be able to apply them in practice, then EMIDM is a wise investment.