- Paperback: 394 pages
- Publisher: O'Reilly Media; 1 edition (October 21, 2016)
- Language: English
- ISBN-10: 1449369413
- ISBN-13: 978-1449369415
- Product Dimensions: 7 x 0.8 x 9.2 inches
- Shipping Weight: 1.5 pounds
- Average Customer Review: 31 customer reviews
- Amazon Best Sellers Rank: #8,073 in Books
Introduction to Machine Learning with Python: A Guide for Data Scientists 1st Edition
About the Author
Andreas Müller received his PhD in machine learning from the University of Bonn. After working as a machine learning researcher on computer vision applications at Amazon for a year, he recently joined the Center for Data Science at New York University. For the last four years, he has been a maintainer and one of the core contributors of scikit-learn, a machine learning toolkit widely used in industry and academia, and an author of and contributor to several other widely used machine learning packages. His mission is to create open tools that lower the barrier to entry for machine learning applications, promote reproducible science, and democratize access to high-quality machine learning algorithms.
Sarah is a data scientist who has spent a lot of time working in start-ups. She loves Python, machine learning, large quantities of data, and the tech world. She is an accomplished conference speaker, currently resides in New York City, and attended the University of Michigan for grad school.
Top customer reviews
The book starts with only four sentences about the Jupyter notebook, although it is the main environment for the whole book. The first code sample shown starts on line two of a cell, and it was very strange that there was no line one. I wondered whether it was some kind of misprint.
The code as printed is broken on page 10 where there is a line with 'display(data_pandas)'. This line gave me an error that display was unrecognized. I thought maybe this was a built-in Jupyter function so I went online to search. Eventually, I had to go to the author's GitHub and ask about this problem where I was told that he simply forgot to include 'from IPython.display import display'. It was a surprising admission because he did not say there was a misprint or mistake, but simply that he forgot to do that. It is very obvious there were zero technical reviewers for this book, because they would have also noticed the broken code right away.
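For readers who hit the same error, the fix described above can be sketched as follows. This is a minimal illustration, not the book's listing: the contents of `data_pandas` are a hypothetical stand-in (a plain dictionary rather than the pandas DataFrame the book builds), and the `print` fallback is an assumption added so the snippet also runs where IPython is not installed.

```python
# Import display explicitly; the review reports that the printed code omits this line.
try:
    from IPython.display import display
except ImportError:
    # Assumption for this sketch only: fall back to print outside IPython/Jupyter.
    display = print

# Hypothetical stand-in data; the book's example uses a pandas DataFrame here.
data_pandas = {"Name": ["John", "Anna"], "Age": [24, 13]}

# With the import in place, this call no longer raises a NameError.
display(data_pandas)
```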
On page 11 we are introduced to a library called 'mglearn', a collection of utility functions that the authors say they wrote for the book. Strangely, its repository has 733 stars on GitHub, so it is obvious the library is not just for the book. Then in chapter two the author makes tons of calls to mglearn that take multiple parameters. The parameters are never explained, and you have to go to the author's GitHub to see what the code actually does. In the second chapter, several of these mglearn calls broke for me. One seemed to be a conflict with numpy, and another I never figured out. I went to look at the discussions on mglearn and discovered it is still a work in progress; in some threads somebody was notifying the author that something was broken, and the author was replying that he would look at it soon.
The second chapter has 120 cell entries for supervised learning techniques. Each cell has roughly 5-10 lines of code, so there are nearly 1000 lines of code for the second chapter, all tossed into one gigantic Jupyter notebook. Explanations are very weak, often defaulting to a brief description followed by code and then more code. Function calls and parameters are rarely explained at all.
The last chapter is about natural language processing, which is the machine learning subject I am most familiar with. Terms are often introduced with zero effort to define them, and it is assumed you already know many of the concepts. TF-IDF barely had any explanation at all, except to show the formula for it. You can find much better explanations online.
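As a rough illustration of what TF-IDF computes, here is a minimal pure-Python sketch. It assumes one common smoothed variant, weight = tf × (ln((1 + N) / (1 + df)) + 1), where tf is the term's count in a document, N is the number of documents, and df is the number of documents containing the term. This may not be the exact formula the book prints, it omits the length normalization that libraries often apply, and the function name `tfidf` and the sample documents are made up for the example.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document (smoothed TF-IDF, unnormalized)."""
    # Naive tokenization: lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    df = Counter(term for tokens in tokenized for term in set(tokens))

    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)  # raw term counts within this document
        weights.append({
            term: count * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return weights


# Made-up documents for illustration only.
docs = ["the movie was great", "the movie was terrible", "great acting"]
scores = tfidf(docs)

# Terms that occur in fewer documents receive a higher weight.
print(scores[1]["terrible"] > scores[1]["the"])
```

The point of the weighting is that a word appearing in only one document ("terrible") counts for more than a word spread across many documents ("the"), even when both occur once in the text being scored.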
For a book which is so heavy on code and light on explanations, it is unacceptable that the code is broken.
The concepts are clearly described and their implementation is presented through useful and exciting data science problems, giving the reader a clear understanding of how to apply the ML tools on real problems. The code is very well organized and structured, and ready to be used as reference and as a starting point for future projects.
The flow of the book is constructed so that it can serve two purposes: it can be read to become familiar with machine learning techniques and how they are applied to data, without actually having to get into coding with Python, or it can be read as an ML course for those who want to learn ML with scikit-learn by studying the theory and applying it to real data problems along the way. Therefore, I recommend this book not only for Python users, but for anyone who wants to learn the basics of ML and see their applications without delving too deeply into ML theory and math. The book is a comprehensive, self-contained text for those who want to learn the basics of ML and try them out on data.
I have a background in math and wrote software professionally for a number of years, but haven't spent much time doing either for the past 5-10 years. This book is technical enough to keep me interested, and accessible enough to allow me to ramp up on the language and the scikit framework.
An added bonus - the instructions actually allowed me to set up my development environment, and the code in the book actually runs!
100% recommend for someone looking to get started in ML with Python.