Introduction to Machine Learning with Python: A Guide for Data Scientists 1st Edition
Sarah Guido (Author)
Machine learning has become an integral part of many commercial applications and research projects, but this field is not exclusive to large companies with extensive research teams. If you use Python, even as a beginner, this book will teach you practical ways to build your own machine learning solutions. With all the data available today, machine learning applications are limited only by your imagination.
You’ll learn the steps necessary to create a successful machine-learning application with Python and the scikit-learn library. Authors Andreas Müller and Sarah Guido focus on the practical aspects of using machine learning algorithms, rather than the math behind them. Familiarity with the NumPy and matplotlib libraries will help you get even more from this book.
With this book, you’ll learn:
- Fundamental concepts and applications of machine learning
- Advantages and shortcomings of widely used machine learning algorithms
- How to represent data processed by machine learning, including which data aspects to focus on
- Advanced methods for model evaluation and parameter tuning
- The concept of pipelines for chaining models and encapsulating your workflow
- Methods for working with text data, including text-specific processing techniques
- Suggestions for improving your machine learning and data science skills
Editorial Reviews
About the Author
Sarah is a data scientist who has spent a lot of time working in start-ups. She loves Python, machine learning, large quantities of data, and the tech world. She is an accomplished conference speaker, currently resides in New York City, and attended the University of Michigan for grad school.
Product details
- Publisher : O'Reilly Media; 1st edition (November 15, 2016)
- Language : English
- Paperback : 398 pages
- ISBN-10 : 1449369413
- ISBN-13 : 978-1449369415
- Item Weight : 1.3 pounds
- Dimensions : 7 x 0.82 x 9.19 inches
- Best Sellers Rank: #56,325 in Books (See Top 100 in Books)
- #14 in Computer Algorithms
- #14 in Natural Language Processing (Books)
- #29 in Programming Algorithms
About the authors

Andreas Mueller is a lecturer at the Data Science Institute at Columbia University and author of the O'Reilly book "Introduction to Machine Learning with Python", which describes a practical approach to machine learning with Python and scikit-learn. He is one of the core developers of the scikit-learn machine learning library and has been co-maintaining it for several years. He is also a Software Carpentry instructor. In the past, Andreas Mueller worked at the NYU Center for Data Science on open source and open science, and as a Machine Learning Scientist at Amazon.

Customer reviews
Top reviews from the United States
I read the Geron book "Hands-on Machine Learning with Scikit-learn & TensorFlow" before reading this book. This book provides a better start for several reasons. First, it is better organized. Second, the code implementations rely primarily on Python modules, instead of custom programming.
Regarding the first, this book is set up so that a reader can build an understanding of machine learning (ML) step by step, from the bottom up. For instance, supervised learning, feature engineering, and model evaluation all get separate chapters. The model evaluation chapter devotes an entire section, with graphics, to the roles of training, validation, and test data, which are probably the most important bedrock concepts in ML. In contrast, Geron throws you right into an entire ML pipeline in the second chapter. It's a mix of feature engineering, linear models, stochastic gradient descent, random forest models, cross-validation, grid search, and even object-oriented programming for custom transformers! This might be useful for quickly understanding what ML is like in practice. If later sections of Geron then went step by step and elaborated on the second chapter, it would be great. Instead, the third chapter, for instance, jumps to binary classification for image data. You only get two paragraphs in the first chapter on cross-validation and validation sets, and a sentence or two later in the book. I had to go to Wikipedia to make sure that I understood it correctly and robustly. I wish I had read this book instead.
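For readers unfamiliar with the training, validation, and test roles mentioned above, here is a minimal sketch of the idea. It is not taken from either book: the iris dataset, the k-nearest-neighbors model, and the candidate values of `n_neighbors` are all assumptions chosen only for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Illustrative data only; any labeled dataset would do.
X, y = load_iris(return_X_y=True)

# 1) Hold out a test set first. It is touched once, at the very end.
X_trainval, X_test, y_trainval, y_test = train_test_split(
    X, y, random_state=0, stratify=y
)

# 2) Split the remainder into a training set and a validation set.
X_train, X_valid, y_train, y_valid = train_test_split(
    X_trainval, y_trainval, random_state=0, stratify=y_trainval
)

# 3) Fit on the training set, pick the parameter on the validation set.
best_score, best_k = -1.0, None
for k in [1, 3, 5, 7]:  # hypothetical candidate values
    model = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    score = model.score(X_valid, y_valid)
    if score > best_score:
        best_score, best_k = score, k

# 4) Refit on training + validation data, then report the test score once.
final_model = KNeighborsClassifier(n_neighbors=best_k).fit(X_trainval, y_trainval)
test_score = final_model.score(X_test, y_test)
print(f"best n_neighbors: {best_k}, test accuracy: {test_score:.2f}")
```

The point of the three-way split is that the validation score is used to make a choice, so it is no longer an unbiased estimate; only the untouched test set gives one.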
Regarding the second, this book does not assume a heavy programming background. Most of the ML pipeline is taught through the Python module scikit-learn. This is useful because the programming does not distract from learning the fundamentals of ML. In contrast, the second chapter of Geron contains object-oriented programming code involving concepts like constructors and inheritance. In this book, even the most sophisticated chapter at the end, which is on pipelines and which expertly explains why feature engineering should be performed during model evaluation, doesn't go into this. Some reviews mention that the author uses an mglearn Python package that he wrote. It is true that when he uses functions from this package, the code is concealed. Arguably, this prevents readers who aren't familiar with Python from getting distracted by code that is unrelated to machine learning (such as creating visualizations). At times I was curious about how some of the code was working in the background (it is all available on GitHub), but the book's job is not to cover all aspects of data analysis with Python (which would be a separate book).
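The remark about performing feature engineering during model evaluation can be illustrated with a short scikit-learn sketch. This is not code from the book: the breast cancer dataset, the `MinMaxScaler` plus `SVC` combination, and the parameter grid are assumptions made for illustration. The idea is that wrapping the preprocessing step in a `Pipeline` causes it to be refit on the training portion of every cross-validation fold, so no information from the held-out fold leaks into the scaling.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# Illustrative data only.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Chain the preprocessing step and the model into one estimator.
pipe = Pipeline([("scaler", MinMaxScaler()), ("svm", SVC())])

# Parameters of a pipeline step are addressed as "<step name>__<parameter>".
# The grid values here are hypothetical.
param_grid = {"svm__C": [0.1, 1, 10], "svm__gamma": [0.01, 0.1, 1]}

# In each of the 5 folds, the scaler is fit on that fold's training part only.
grid = GridSearchCV(pipe, param_grid, cv=5)
grid.fit(X_train, y_train)

cv_score = grid.best_score_
test_score = grid.score(X_test, y_test)
print(f"best params: {grid.best_params_}")
print(f"cross-validation accuracy: {cv_score:.2f}, test accuracy: {test_score:.2f}")
```

Scaling the full training set once before calling the grid search would let the held-out folds influence the scaler, which tends to make the cross-validation score look better than it should.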
In summary, Geron teaches more advanced topics interspersed with the basics, without an entirely coherent organizational structure. This book has an intuitive structure that elaborates at length on core ML concepts. It doesn't overburden the reader with code, but may leave computer scientists wanting a bit more.
Another issue is the mglearn library that is required for this text. It is a huge annoyance because it obscures code that is otherwise necessary to understand if you have any intention of transferring the information in this text to the real world.
Some general concepts are explained well, but clarity begins to decline as topics become more complex. Almost all the code is poorly explained. Expect to spend as much time, if not more, examining the documentation for the referenced libraries as you will reading this text if you hope to get anything useful out of it.
This book shows you how to use the various machine learning algorithms, and provides an intuitive discussion of how they work, but it does not go into the mathematical details needed to program the algorithms from scratch. Thus, this book is perfect for the practitioner, but does not attempt to teach the theory or mathematics behind the algorithms.
Nevertheless, this is a good intro book and a nice companion to online classes that do not provide written notes.
Edit 3/21/2020 Received new copy that is readable. Changing rating to reflect original opinion re: content.
Top reviews from other countries
The saving grace is this: if you are teaching ML (like me) and need good, well-designed examples, go for this book; likewise if you need very visual explanations. I would not recommend the book for a student, though.
Various algorithms are employed, detailed, and explained.
Perfect to start building skills on these topics. Great accessory if you are teaching yourself online.









