Applied Predictive Modeling 1st ed. 2013, Corr. 2nd printing 2018 Edition
"I used this as a supplement in teaching a data science course that I use a range of different resources because I need to cover working with data, model evaluation, and machine learning methods. The next time I teach this course, I will use only this book because it covers all of these aspects of the field." (Louis Luangkesorn, lugerpitt.blogspot.com, June, 2015)
"This is such a good book it has taken me awhile to work through the book. All the while finding examples of why people should read the book...Well thought out examples with the R packages and example code. Take your time and work through this book." (Mary Anne, Cats and Dogs with Data, maryannedata.com, February, 2015)
"This monograph presents a very friendly practical course on prediction techniques for regression and classification models...The authors are recognized experts in modeling and forecasting , as well as developers of R packages and statistical methodologies...It is a well-written book very useful to students and practitioners who need an immediate and helpful way to apply complex statistical techniques." (Stan Lipovetsky, Technometrics, Vol. 56 (3), August, 2014)
"There are hundreds of books that have something worthwhile to say about predictive modeling. However, in my judgment, Applied Predictive Modeling by Max Kuhn and Kjell Johnson (Springer 2013) ought to be at the very top of the reading list ...They come across like coaches who really, really want you to be able to do this stuff. They write simply and with great clarity...Applied Predictive Modeling is a remarkable text...it is the succinct distillation of years of experience of two expert modelers...." (Joseph Rickert, blog.revolutionanalytics.com, June, 2014)
"This strong, technical, hands-on treatment clearly spells out the concepts, and illustrates its themes tangibly with the language R, the most popular open source analytics solution." (Eric Siegel, Ph.D. Founder, Predictive Analytics World, Author, Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die)
There is a natural comparison to be made to The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition (Springer Series in Statistics). I found this book much, much better. Where ESLII was fractured and seemed to jump from point to point with no explanation, APM proceeded in a well thought-out manner. ESLII used some non-standard notation and assumptions, where APM used notation familiar to anyone with a background in statistics and linear algebra. To be fair, it may be that I'll return to ESL after having read APM and be able to bridge the leaps the authors made with material I've learned from this book.
Pros:
- Gives a solid introduction to the problem prediction is trying to solve
- Provides a framework for evaluating prediction results, using a consistent data set across all problems
- Has citations and references for further reading
- Does a good job of contrasting machine learning's black-box models with classical statistics' interpretability (see Breiman's "Statistical Modeling: The Two Cultures" paper for some great insights into this phenomenon)
Cons:
- A bit light on theory, especially proofs and details behind the models. I feel this is a bit of a pro, though, since the citations for the work are provided, and the theorems and proofs are there if you are interested in them.
I purchased Applied Predictive Modeling after visiting a high performance hedge fund that employs a number of brilliant minds. This book appeared in most of the work spaces so I decided to pick up a copy and read it for myself.
I read the first half of APM on vacation and honestly I couldn’t put it down. The book goes into detail on a wide range of models, many of which I’d never heard of before. Beyond this, APM provides the R code showing exactly how to implement the models. For me, this application focus is valuable.
The book weaves in many case studies, ranging from pharmaceuticals to business to using machine learning to find the optimal concrete formula.
I will say that this book is not for complete beginners, but as soon as you get through the basics, this is a great book from two of the best minds in modeling. For beginners I recommend R for Data Science.
Hope this helps.
A major theme throughout the book is detection of overfitting. Techniques to manage overfitting are discussed in detail. These include data preprocessing, normalization, standardization, transformation of distributions, feature selection, train-test split, cross validation, goodness of fit, and error metrics.
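As a rough illustration of how cross validation exposes overfitting, here is a minimal Python sketch. It is not code from the book, which uses R. The synthetic data, the k-nearest-neighbor model, and the helper names `knn_predict`, `rmse`, and `cross_validated_rmse` are all assumptions chosen to keep the example short.

```python
import random


def knn_predict(train, x, k):
    """Average the targets of the k training points nearest to x."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def rmse(train, test, k):
    """Root mean squared error of a k-NN model built on train, scored on test."""
    errors = [(knn_predict(train, x, k) - y) ** 2 for x, y in test]
    return (sum(errors) / len(errors)) ** 0.5


def cross_validated_rmse(data, k, folds=5):
    """Mean held-out RMSE over `folds` disjoint folds."""
    scores = []
    for i in range(folds):
        held_out = data[i::folds]  # every folds-th point, starting at i
        training = [p for j, p in enumerate(data) if j % folds != i]
        scores.append(rmse(training, held_out, k))
    return sum(scores) / len(scores)


rng = random.Random(0)
# Synthetic data (an assumption for illustration): y = 2x plus Gaussian noise.
data = [(x, 2 * x + rng.gauss(0, 1))
        for x in (rng.uniform(0, 10) for _ in range(100))]

for k in (1, 10):
    training_error = rmse(data, data, k)
    held_out_error = cross_validated_rmse(data, k)
    print(f"k={k}: training RMSE={training_error:.2f}, "
          f"cross-validated RMSE={held_out_error:.2f}")
```

With k=1 every training point is its own nearest neighbor, so the training error is zero while the held-out error is not. That gap is the overfitting signal that resampling is meant to surface.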
Linear and non-linear models are described, with detailed examples of use with actual data.
The illustrations are superb. Fully disclosed code in R is included.
This book is a very readable handbook that I highly recommend to everyone developing predictive models.
"The Elements of Statistical Learning" and "Pattern Recognition and Machine Learning" illustrate concepts clearly but rarely shed light on real-world application. Some more applied books, like "Data Science for Business", give an intuitive way of understanding the concepts behind the scenes and are great for people who have no previous experience in data science. For a person with a master's degree in statistics looking for a book that is neither as theoretically heavy as ESL nor as elementary as the Foster & Tom classic, this book is what you need. Besides the clear explanation of a variety of statistical methods in each chapter, the authors walk us through a case study to make sense of every single line of code compiled in the book.
I learned quite a lot about data resampling and model tuning just from finishing Chapter 5. I am pretty sure I will benefit more as I delve into the subsequent chapters. I recommend you spend time reading through this great book, and may you feel the same way I do.
Top international reviews
I was reluctant to buy this book, but it has been so worth it. I know I will refer to it for years to come!
I also have The Elements of Statistical Learning, which I love, but I'm not yet fluent enough in some of the maths for that book to read as easily as this one.
The book pairs well with the Elements of Statistical Learning (which is what the publisher probably attempted) as it addresses similar methods.
I hope that someone writes a similar book but focused on Bayesian machine learning methods.
The core of Applied Predictive Modeling consists of four distinct parts:
1. General Strategies on how to manipulate and re-sample data.
2. Regression Models for making numeric predictions.
3. Classification Models for making factor predictions.
4. Other Considerations concerning model quality.
Overall, Applied Predictive Modeling is a very informative course on machine learning. Although it leaves out unnecessary equations, it assumes some prior knowledge and might be difficult to access for a complete newcomer (An Introduction to Statistical Learning by Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani would be a good read before starting this book). Some of the book's examples are taken from medicine and pharmaceuticals, which makes them hard to understand for people outside the health sciences.
However, the book does a very good job of making machine learning in R much more systematic. It clearly shows the advantages of using the caret package (written by one of the book's authors) and how to evaluate and tune your model's performance.
If you are not entirely new to data science, this book will yield a high return for you. It makes your process of training a model more straightforward and thorough.
If you are looking for modeling tools in R, then this book is better than the others out there in terms of practical code to follow. I would have liked to see some variants of the base k-means algorithm by Hartigan and Wong used for clustering data without class labels, yet my impression is that the authors chose tools that are more commonly used in the data science field.
Attached is some of my code. The Summarize.all function was written by a user on a forum, but unfortunately I could not find the user who wrote it, so I cannot properly credit them. In essence, it allows all metrics to be shown, e.g. AUC, ROC, SPEC, SEN, CI, etc. Note that your model may not produce these values, in which case you will get an error.
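The reviewer's attached code is not reproduced on this page. As a stand-in, the following Python sketch shows what a combined metric summary of this kind might compute for a two-class problem. The function name `summarize_all`, its parameters, the 0.5 threshold, and the toy labels and scores are all hypothetical, and it is not the forum user's function.

```python
def summarize_all(labels, scores, threshold=0.5):
    """Return sensitivity, specificity, and ROC AUC for binary labels (1 = positive)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        # These metrics are undefined unless both classes are present.
        raise ValueError("both classes must be present to compute these metrics")

    # Sensitivity: share of true positives scored at or above the threshold.
    sensitivity = sum(s >= threshold for s in positives) / len(positives)
    # Specificity: share of true negatives scored below the threshold.
    specificity = sum(s < threshold for s in negatives) / len(negatives)

    # AUC as the fraction of positive/negative pairs ranked correctly,
    # counting ties as half a correct ranking.
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    auc = wins / (len(positives) * len(negatives))

    return {"Sens": sensitivity, "Spec": specificity, "ROC": auc}


# Toy inputs, invented for illustration only.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
print(summarize_all(labels, scores))
```

Returning every metric from one function keeps them side by side when comparing models, and raising an error when a class is missing makes an undefined metric fail loudly instead of silently returning a misleading number.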
Moving on, you will soon encounter neural networks, which turn out to be a special case of nonlinear regression. Classification is covered, but machine learning is not precisely mapped out. Box-Jenkins modeling is also not covered; for that I recommend the 4th edition of Time Series Analysis by Box, Jenkins, and Reinsel (Reinsel also contributed to the 3rd edition).
Cons: Not appropriate for the business user. It feels more like a technical manual than an explanation of what problem we're trying to solve or why a certain method should be used. Data sets and models are introduced without much explanation, but I can see how this book could be useful as a textbook for a course, where you have an instructor or professor explaining it to you.
It is one of the best summaries of ML, a good introduction and a good starting point.
It is possible to check all the ML basics.
The problem is that it is pretty old and Kaggle is the algorithms. The book is quite old.