Ensemble Methods in Data Mining: Improving Accuracy Through Combining Predictions (Synthesis Lectures on Data Min)
 
 

Ensemble Methods in Data Mining: Improving Accuracy Through Combining Predictions (Synthesis Lectures on Data Min) [Paperback]

Giovanni Seni (Author), John Elder (Author), Robert Grossman (Series Editor)
4.8 out of 5 stars  See all reviews (6 customer reviews)

List Price: $35.00
Price: $27.84 & this item ships for FREE with Super Saver Shipping. Details
You Save: $7.16 (20%)

Book Description

Synthesis Lectures on Data Min February 24, 2010
Ensemble methods have been called the most influential development in Data Mining and Machine Learning in the past decade. They combine multiple models into a single model that is usually more accurate than the best of its components. Ensembles can provide a critical boost to industrial challenges -- from investment timing to drug discovery, and fraud detection to recommendation systems -- where predictive accuracy is more vital than model interpretability.

Ensembles are useful with all modeling algorithms, but this book focuses on decision trees to explain them most clearly. After describing trees and their strengths and weaknesses, the authors provide an overview of regularization -- today understood to be a key reason for the superior performance of modern ensembling algorithms. The book continues with a clear description of two recent developments: Importance Sampling (IS) and Rule Ensembles (RE). IS reveals classic ensemble methods -- bagging, random forests, and boosting -- to be special cases of a single algorithm, thereby showing how to improve their accuracy and speed. REs are linear rule models derived from decision tree ensembles. They are the most interpretable version of ensembles, which is essential to applications such as credit scoring and fault diagnosis. Lastly, the authors explain the paradox of how ensembles achieve greater accuracy on new data despite their (apparently much greater) complexity.
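
As a rough illustration of the bagging idea mentioned above (fit many trees on bootstrap resamples, then average their predictions), here is a minimal sketch in Python. It is not taken from the book, whose snippets are in R; the function names (`fit_stump`, `bagged_stumps`, `predict_bagged`), the toy data, and the choice of one-split regression trees are all assumptions made for illustration.

```python
import random
from statistics import mean

def fit_stump(xs, ys):
    """Fit a one-split regression tree (a stump) on 1-D inputs by least squares."""
    best = None
    # Candidate thresholds: every distinct x except the largest,
    # so the right-hand side of the split is never empty.
    for t in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        lm, rm = mean(left), mean(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    if best is None:  # all x values identical: fall back to a constant prediction
        m = mean(ys)
        return (xs[0], m, m)
    return best[1:]  # (threshold, left mean, right mean)

def predict_stump(stump, x):
    t, lm, rm = stump
    return lm if x <= t else rm

def bagged_stumps(xs, ys, n_models=25, seed=0):
    """Fit n_models stumps, each on a bootstrap resample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return models

def predict_bagged(models, x):
    """The ensemble prediction is the plain average of the member predictions."""
    return mean(predict_stump(m, x) for m in models)

# Toy data (hypothetical): a step from 0 to 1 at x = 1.0.
xs = [i / 10 for i in range(20)]
ys = [0.0 if x < 1.0 else 1.0 for x in xs]
models = bagged_stumps(xs, ys)
low, high = predict_bagged(models, 0.0), predict_bagged(models, 1.9)
```

Each stump places its split wherever its own resample suggests, so averaging the members smooths the prediction near the step. Random forests and boosting change how the members are generated and weighted, which this sketch does not attempt to show.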

This book is aimed at novice and advanced analytic researchers and practitioners -- especially in Engineering, Statistics, and Computer Science. Those with little exposure to ensembles will learn why and how to employ this breakthrough method, and advanced practitioners will gain insight into building even more powerful models. Throughout, snippets of code in R are provided to illustrate the algorithms described and to encourage the reader to try the techniques.

Frequently Bought Together

Ensemble Methods in Data Mining: Improving Accuracy Through Combining Predictions (Synthesis Lectures on Data Min) + Handbook of Statistical Analysis and Data Mining Applications + The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition (Springer Series in Statistics)
Price For All Three: $161.49



Editorial Reviews

From the Inside Flap

"This book by Seni and Elder provides a timely, concise introduction to this topic. After an intuitive, highly accessible sketch of the key concerns in predictive learning, the book takes the readers through a shortcut into the heart of the popular tree-based ensemble creation strategies, and follows that with a compact yet clear presentation of the developments in the frontiers of statistics, where active attempts are being made to explain and exploit the mysteries of ensembles through conventional statistical theory and methods." 
-- Tin Kam Ho, Bell Labs, Alcatel-Lucent

"The practical implementations of ensemble methods are enormous. Most current implementations of them are quite primitive and this book will definitely raise the state of the art. Giovanni Seni's thorough mastery of the cutting-edge research and John Elder's practical experience have combined to make an extremely readable and useful book." 
-- Jaffray Woodriff, Quantitative Investment Management

About the Author

The authors are industry experts in data mining and machine learning who are also adjunct professors and popular speakers. Although early pioneers in discovering and using ensembles, they here distill and clarify the recent groundbreaking work of leading academics (such as Jerome Friedman) to bring the benefits of ensembles to practitioners.

Product Details

  • Paperback: 126 pages
  • Publisher: Morgan and Claypool Publishers (February 24, 2010)
  • Language: English
  • ISBN-10: 1608452840
  • ISBN-13: 978-1608452842
  • Product Dimensions: 7.5 x 9.2 x 0.3 inches
  • Shipping Weight: 8.5 ounces (View shipping rates and policies)
  • Average Customer Review: 4.8 out of 5 stars  See all reviews (6 customer reviews)
  • Amazon Best Sellers Rank: #272,916 in Books (See Top 100 in Books)


 

Customer Reviews

6 Reviews
5 star: (5)
4 star: (1)
3 star: (0)
2 star: (0)
1 star: (0)
 
 
 
 
 
Most Helpful Customer Reviews

5 of 5 people found the following review helpful:
5.0 out of 5 stars really helpful in learning the method, June 11, 2010
During my 10-plus years of modeling experience, I have always paid most of my attention to variable selection, predictive power, effectiveness and efficiency of a single model form such as a logistic model, ordinary regression, a tree, etc. From time to time, I also segment my sample space into pieces and then apply different modeling techniques. I was never really aware of the concept of 'model selection' or 'model combination'. That classical approach has served me well. But I always suspected that there was a better approach to combine different methods to get better predictions.
The ensemble methods detailed in this book gave me the 'ah ha'. It gave a nicely balanced flavor of easy implementation and difficult concepts. I really enjoyed the book. I was able to finish the book quickly and will save it for reference.
If there is anything that I would want to see in more detail, it is the treatment of evaluation of model prediction. It is a bit light on how to tell if the final product is really working. Given that the book is an intro, it is not really a mistreatment.
Overall, an awesome small book.


3 of 3 people found the following review helpful:
5.0 out of 5 stars Clear, accessible introduction, July 31, 2011
This book is an accessible introduction to the theory and practice of ensemble methods in machine learning. It is a quick read, has sufficient detail for a novice to begin experimenting, and copious references for those who are interested in digging deeper. The authors also provide a nice discussion of cross-validation, and their section on regularization techniques is much more straightforward, in my opinion, than the equivalent sections in The Elements of Statistical Learning (Elements is a wonderful, necessary book, but a hard read).

The heart of the text is the chapter on Importance Sampling. The authors frame the classic ensemble methods (bagging, boosting, and random forests) as special cases of the Importance Sampling methodology. This not only clarifies the explanations of each approach, but also provides a principled basis for finding improvements to the original algorithms. They have one of the clearest descriptions of AdaBoost that I've ever read.

The penultimate chapter is on "Rule Ensembles": an attempt at a more interpretable ensemble learner. They also discuss measures for variable importance and interaction strength. The last chapter discusses Generalized Degrees of Freedom as an alternative complexity measure; it is probably of more interest to researchers and mathematicians than to practitioners.

Overall, I found the book clear and concise, with good attention to practical details. I appreciated the snippets of R code and the references to relevant R packages. One minor nitpick: this book has also been published digitally, presumably with color figures. Because the print version is grayscale, some of the color-coded graphs are now illegible. Usually the major points of the figure are clear from the context in the text; still, the color to grayscale conversion is something for future authors in this series to keep in mind.

Recommended.


2 of 2 people found the following review helpful:
5.0 out of 5 stars Effective introduction, December 12, 2010
For once, the "Product Description" is specific and hype-free (apart from the claim regarding importance sampling, which is dealt with on a single page). This is a concise, to-the-point and accessible introduction to the subject, discussing bagging, random forest and boosting methods in a classification context. Once these methods are explained, the authors move on to measures of variable importance and model complexity, which may be of less interest to practitioners. R snippets, leveraging the rpart and gbm packages, are a plus, but the programming is fairly simple.

PS. Morgan Claypool sell the book's PDF for $20, or $0 for those affiliated with the publisher's institutional subscribers.
