An Introduction to Statistical Learning: with Applications in R (Springer Texts in Statistics) 1st ed. 2013, Corr. 7th printing 2017 Edition
"An Introduction to Statistical Learning (ISL)" by James, Witten, Hastie and Tibshirani is the "how to" manual for statistical learning. Inspired by "The Elements of Statistical Learning" (Hastie, Tibshirani and Friedman), this book provides clear and intuitive guidance on how to implement cutting edge statistical and machine learning methods. ISL makes modern methods accessible to a wide audience without requiring a background in Statistics or Computer Science. The authors give precise, practical explanations of what methods are available, and when to use them, including explicit R code. Anyone who wants to intelligently analyze complex data should own this book." (Larry Wasserman, Professor, Department of Statistics and Machine Learning Department, Carnegie Mellon University)
Well, I'm lucky (and probably so are you) because in 2013 Stanford Statistics professors James/Witten/Hastie/Tibshirani wrote this simpler 'An Introduction to Statistical Learning' that requires only a Bachelor's degree in Mathematics or Statistics. If you have that math grounding, then this is a wonderful book to start your Statistical Learning. The book offers a clear application of Mathematical Statistics and the programming language R to Statistical Learning. At the end of each chapter, the authors provide 10-15 questions to test whether you've digested the material.
Only a few times have I needed to review my Hogg/Craig 'Introduction to Mathematical Statistics'. If you want an excellent book on Mathematical Statistics to prepare you for both 'Introduction to Statistical Learning' and 'The Elements of Statistical Learning', buy the 7th edition of 'Introduction to Mathematical Statistics' by Hogg/McKean/Craig, which is typically used for a year-long (2 semesters) class for 1st or 2nd year graduate students in Mathematics or Statistics. In fact, you could simply bone up on Hogg/McKean/Craig, skip 'Introduction to Statistical Learning', and go straight to the more challenging 'Elements of Statistical Learning'. I wanted to digest some Statistical Learning asap and probably so will you. Enjoy.
If you are already programming ML a lot and want to step up your ML math, but find ESL too hard because it is not self-contained and uses too much graduate stats terminology, then do not fall for the reviewers who recommend reading ISL (Introduction to Statistical Learning) instead. ISL does not contain the explanations missing from ESL. In fact, it does not explain the math at all; instead, it gives a very broad overview of statistical methods that overlap with ML.
Then who is this book for? This book is for someone who juuust started learning ML, for example someone who has completed the Coursera ML course or started using Python scikit-learn.
The book is well-written though. It is not self-contained because it does not explain math but merely gives a minimum intuition behind it.
As one example, I have established as a personal practice that I will never use the subset argument of lm(), even though it is used throughout this entire text. Why is this? I was curious one day, and decided to compare subsetting the data argument, versus putting the indices inside the subset argument.
It turns out that the two approaches gave me different results. (See StackOverflow, with q/46939063/ appended to the link.) After asking around on Cross Validated as well (q/309931 appended to the URL of Cross Validated), I concluded that using the subset argument of lm() was bad advice.
Now, in prediction, this issue doesn't occur. But if you're planning on using lm() to interpret parameter estimates, don't follow this textbook's advice.
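One way such a discrepancy can arise, offered here as an assumption rather than a summary of the linked threads, is that a data-dependent transformation in the model (centering, scaling, an orthogonal polynomial basis) gets computed on the full data before the rows are dropped, instead of on the subset alone. The sketch below mimics that mechanism in plain Python rather than R. The helpers `ols` and `center` and the toy data are hypothetical and are not taken from the book or from lm() itself.

```python
# Illustrative sketch only: a plain-Python analogue of "transform, then subset"
# versus "subset, then transform". Nothing here calls R or reproduces lm().

def ols(x, y):
    """Least-squares fit of y = intercept + slope * x; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def center(values, reference):
    """Subtract the mean of `reference` from every element of `values`."""
    ref_mean = sum(reference) / len(reference)
    return [v - ref_mean for v in values]

# Hypothetical toy data; `keep` plays the role of the training indices.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]
keep = [0, 1, 2, 3]

x_sub = [x[i] for i in keep]
y_sub = [y[i] for i in keep]

# Route A: center using the full-data mean, then keep the selected rows.
z_a = [center(x, x)[i] for i in keep]
# Route B: keep the selected rows first, then center using the subset mean.
z_b = center(x_sub, x_sub)

a_intercept, a_slope = ols(z_a, y_sub)
b_intercept, b_slope = ols(z_b, y_sub)

# Fitted values on the selected rows under each route.
fitted_a = [a_intercept + a_slope * z for z in z_a]
fitted_b = [b_intercept + b_slope * z for z in z_b]

print("intercepts:", a_intercept, b_intercept)
print("slopes:", a_slope, b_slope)
```

In this toy case the two routes give the same slope and the same fitted values but different intercepts, which is consistent with the point that prediction is unaffected while individual parameter estimates can change. Whether this is the exact cause of the difference described above is not established here.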
Top international reviews
Since data science is a fast-moving field, there are people who want to jump on the deep learning bandwagon straightaway. This is not the correct way to enter the field. You have to have your statistical bases covered before you touch the more advanced topics. This is especially true for CS students, who learn more discrete math, which doesn't lend itself well to the world of AI/ML.
So for those learners I would recommend this book. If you assess yourself to be good at Maths and an advanced learner, I would recommend the authors' other book, Elements of Statistical Learning.
A set of tools used to analyze data. Includes most general techniques in AI excluding Neural Networks. Kinds of tools covered: Regressions, Logistic Regression, Linear Discriminant Analysis, Decision Trees, Random Forest, Boosting, Cross Validation, SVM, PCA, K-means clustering.
Standouts (Strong topics):
1) Great coverage of Linear Regression. Absolutely brilliant. For the first time in my life (I've been in data science for ten odd years) I learnt about the t-statistic and F-statistic in the way that they should be taught.
2) Good mathematical coverage of cross validation
3) Good coverage of Logistic Regression, PCA, Random Forest and trees, Clustering.
Bonus: Great coverage of the relationship between SVM and logistic regression, and of the history of hype behind kernel methods in SVM
Not so good parts:
1) I mentioned in my headline that I have a love-hate relationship with this book. The reason for the lower rating is that there were many parts where I was left looking for external references. This book leaves you in the wilderness of mathematics many times by stating a conclusion without a proof. This to me is the same mistake made by several Indian books, and unlike the authors' other book, ESL.
My advice to the authors is to cover a topic fully or not at all, and not to assume the reader has no knowledge of mathematics, especially given the next point:
2) The book is mathematically hard in parts. They don't treat the reader with kid gloves. Several times the summations used take some disambiguation to understand. This is, in my opinion, good. It is just that there are other parts of the book where they do not assume the same level of expertise from the reader and will just state a complex formula without derivation or justification. The consistency is not there.
3) It felt like different chapters in the book were written by different people, and that's why there is a difference in the level of mathematics and the tone of teaching used. I would advise the authors to produce another edition in which the additions and editing are done by one author throughout and the tougher parts are put in an appendix.
The copy I got from Springer was simply a delight to read. It was made of silky paper and page turning was so easy. It will be one of the books in my collection.
These authors are very famous. They are pioneers in the field of statistical learning. The term was coined by them, to include all methods of learning from data excluding neural networks (which came from the AI world).
The MOOC is available for free from Stanford Lagunita. Do check it out if you are buying the book. I easily recommend the book over the video lectures. The reason is that the book is the best "Introductory statistics for ML", but there are several better MOOCs than the Lagunita one for ML.
I would rate it 5/5 for applied learning, as they run a parallel stream through the book that teaches you R as well. For those of you who don't know, R was the lingua franca for Data Scientists before the TensorFlow age. Though it has marginally decreased in popularity since then, it is still the best non-production data science language available.
Note: Because several R paradigms (libraries) have changed since the book was written, I would not recommend it for learning R. It can give you a taste of R, but you have to learn the language fully elsewhere. I recommend the MOOC The Analytics Edge for this.
The book is great for the right audience. Decide whether
1) You are medium to advanced in the field. Then buy ESL (Elements of Statistical Learning) over ISLR
2) You are from a different field and are not thrown off by mathematical notation
3) You are disappointed with regular statistics books as required for Data Science.
4) You want to go "the right way" to learning AI and ML, and don't want to jump to the advanced topics straightaway without understanding the basics.
This is THE book for a first- or second-year undergraduate taking a first course in AI or ML. But you have to be ready to work through another book after this. The foundations you learn in this book will hold you steady as you trudge into the world of data science.
Free availability of book:
The authors have officially made the book available for free as a PDF from the book website. I have personally found it extremely hard to read books on a laptop because our computers are filled with all kinds of distractions. Further, the printing quality of the book was extremely good.
But if you cannot afford it (college students, etc.), then by all means read the PDF.
Note: Advanced learners can straightaway go for the book by the same authors, ESL (Elements of Statistical Learning).
Note 2: I didn't have the opportunity to work through the exercises, but I have to note that the exercises are extensive, which again makes the book suitable for college learners.
Would be nice to have a chapter on using the tidyverse to simplify tasks.
Nothing on cleaning data in here, you'll need another reference for that.
It's simple enough to understand, but complex enough to make me scratch my head and ponder about things.