Data Science from Scratch: First Principles with Python 2nd Edition
- Kindle: $13.20 - $36.99
- Paperback: $47.94 - $57.39 (14 used from $26.39, 23 new from $37.64)
To really learn data science, you should not only master the tools—data science libraries, frameworks, modules, and toolkits—but also understand the ideas and principles underlying them. Updated for Python 3.6, this second edition of Data Science from Scratch shows you how these tools and algorithms work by implementing them from scratch.
If you have an aptitude for mathematics and some programming skills, author Joel Grus will help you get comfortable with the math and statistics at the core of data science, and with the hacking skills you need to get started as a data scientist. Packed with new material on deep learning, statistics, and natural language processing, this updated book shows you how to find the gems in today’s messy glut of data.
- Get a crash course in Python
- Learn the basics of linear algebra, statistics, and probability—and how and when they’re used in data science
- Collect, explore, clean, munge, and manipulate data
- Dive into the fundamentals of machine learning
- Implement models such as k-nearest neighbors, Naïve Bayes, linear and logistic regression, decision trees, neural networks, and clustering
- Explore recommender systems, natural language processing, network analysis, MapReduce, and databases
- ISBN-10: 1492041130
- ISBN-13: 978-1492041139
- Edition: 2nd
- Publisher: O'Reilly Media
- Publication date: May 16, 2019
- Language: English
- Dimensions: 6.9 x 0.9 x 9.1 inches
- Print length: 406 pages
From the brand
Sharing the knowledge of experts
O'Reilly's mission is to change the world by sharing the knowledge of innovators. For over 40 years, we've inspired companies and individuals to do new things (and do them better) by providing the skills and understanding that are necessary for success.
Our customers are hungry to build the innovations that propel the world forward. And we help them do just that.
Editorial Reviews
About the Author
Joel Grus is a research engineer at the Allen Institute for Artificial Intelligence. Previously he worked as a software engineer at Google and a data scientist at several startups. He lives in Seattle, where he regularly attends data science happy hours.
Product details
- Publisher : O'Reilly Media; 2nd edition (May 16, 2019)
- Language : English
- Paperback : 406 pages
- ISBN-10 : 1492041130
- ISBN-13 : 978-1492041139
- Item Weight : 1.4 pounds
- Dimensions : 6.9 x 0.9 x 9.1 inches
- Best Sellers Rank: #28,350 in Books
- #15 in Data Processing
- #17 in Data Mining (Books)
- #33 in Python Programming
About the author

Joel Grus is Principal Engineer at Capital Group, where he leads a small team that designs and implements machine learning and data products. Before that he was a software engineer at the Allen Institute for AI and Google, and a data scientist at a variety of startups.
He's the author of the beloved "Data Science from Scratch", the quirky "Ten Essays on Fizz Buzz", and the polarizing JupyterCon talk "I Don't Like Notebooks".
He lives in Seattle, where he regularly attends data science happy hours. He blogs infrequently at joelgrus.com.
Customer reviews
Reviewed in the United States on February 13, 2020
Top reviews from the United States
TLDR: If you're looking for a concise introduction to data science and have a bit of knowledge of basic Python, algebra, statistics and probability, look no further than this book! Otherwise, come back once you've picked up those tools and you'll feel right at home :)
I am a proficient Python engineer and I can read the code and understand what is being done, but the author makes no effort to explain how he reached each conclusion, or why it matters. The author does implement the mathematical formulas in Python skillfully, but that misses the point of the book.
Another big problem with this book is that it assumes you can learn mathematics by just doing mathematics without understanding the why. It is frustrating to read and follow the author implementing mathematical formulas without explaining why. I believe this is the case because the author DOES assume you have the required math to follow the book. I believe the author should add a section in the preface that lists the prerequisites for this book:
- Linear Algebra
- Statistics
- Probability
- Vector Calculus
- Continuous Optimization
Above all, it is a good book if used as an index on where to start to understand Data Science, but it definitely doesn't fulfill the promise of being "from scratch". From scratch IMHO means you dive into the internals of Data Science algorithms. I had the expectation that this book was going to be more like "Designing Data-Intensive Applications" for Data Science, where the "why" is as important as the "how". Data Science from Scratch is a book about the how, with no effort to dive into the why. The book does provide the vocabulary for me to discuss Data Science with practitioners, but I didn't feel it got me any closer to becoming a practitioner myself.
BTW, the fact that the book is monochrome doesn't matter; the font and figures are very clear and readable.
I really like the author and the way he writes. Sometimes it's funny, but it's also very straight to the point and dry when it comes to a "quick overview" of concepts and things you need to know, and I quickly realized that I need to do some "back learning" on a lot of mathematics. So I had to stop reading this book and purchase some other books to understand what the author was talking about.
Though I had to buy some other books to help me understand this book, I still enjoy it and think that once I have a better foundation in algebra and calculus, I can continue reading it. You also need an expert-level understanding of Python, which I don't have either, but baby steps.
I would say this book is a great guideline to what you need to know, but it doesn't teach you from "scratch" like the title says; you still need to put in the work.
I highly recommend this book as your first book on data science because the code and thought processes are very clear. 70-80% of the book covers data science foundations and basics that prepare you to tackle harder topics later.
Top reviews from other countries
Reviewed in the United Kingdom on August 25, 2020
The lack of colour coding in the graphs and of syntax highlighting in the code segments makes the book very difficult to read.
I'm sure the book itself is great, and I'm looking forward to reading the ePub instead.
It's definitely not for complete beginners. If you have a foundational knowledge of Python then you'll understand some of the concepts outlined. It's not a tutorial book either; the best way to use this book is to find a part of it and apply it to a data set.