Data Science from Scratch: First Principles with Python 1st Edition

4.1 out of 5 stars 56 customer reviews
ISBN-13: 978-1491901427
ISBN-10: 149190142X

Editorial Reviews

About the Author

Joel Grus is a software engineer at Google. Before that he worked as a data scientist at multiple startups. He lives in Seattle, where he regularly attends data science happy hours. He blogs infrequently at joelgrus.com.

Product Details

  • Paperback: 330 pages
  • Publisher: O'Reilly Media; 1 edition (April 30, 2015)
  • Language: English
  • ISBN-10: 149190142X
  • ISBN-13: 978-1491901427
  • Product Dimensions: 6.9 x 0.7 x 9 inches
  • Shipping Weight: 1.1 pounds
  • Average Customer Review: 4.1 out of 5 stars (56 customer reviews)
  • Amazon Best Sellers Rank: #4,889 in Books

Customer Reviews

Top Customer Reviews

Format: Paperback Verified Purchase
This is a great book-- well written, easy to digest and informative. I've been in Data Mining and Statistical Analysis for a little over a decade now; I was looking for a book to share with my team to ensure we were all up-to-speed on some foundational concepts: this book is it. EDIT: I also forgot to mention, it has probably the best get-up-and-running in Python introduction I've seen (see, e.g., Chapter 2, ~20pp.)

It's the right size, with the right coverage for the content, and the author's sense of humor (indeed, that of a data scientist) resonates with the audience.
Solid introduction, even better review or brief explanation of commonly encountered topics.

One of the best O'Reilly books I've read in a long time-- in fact, a technical book at the level I used to expect from O'Reilly.
27 people found this helpful.
Format: Kindle Edition
This book emphasizes excellent pedagogy and understandable Python code. The basis of all the programming and mathematical algorithms is given, assuming only minimal prior programming experience and high school mathematics. The basics of data science covered include: 1. a clear 20-page introduction to Python 2; 2. an introduction to linear algebra (described by Python functions); 3. a similar introduction to practical statistics.
In keeping with most scientific programmers who use Python, the 2.6/2.7 branch is used throughout, given the availability of appropriate libraries (like the Anaconda distribution). Tools for each type of algorithm are prototyped "from Scratch" in the author's own exemplary code, with references to the professional libraries in the final chapters. Math is for the most part taught from code rather than mathematical notation.

Highly Recommended
49 people found this helpful.
Format: Kindle Edition
The book is well-written and covers a wide range of topics related to data science and machine learning. Things I like about it:
- The range of topics covered is wide, and includes (i) an intro/refresher for Python, (ii) statistics/probability, (iii) several ML techniques, (iv) data manipulation
- For each topic, there is enough explanation of the underlying theory, as well as pointers to further reading
- For each topic, the author builds up the code in simple steps, so it's easy to follow along
- Everything is explained very clearly; there is enough precision, without difficult formal language
- The explanation of eigenvector centrality is awesome
56 people found this helpful.
Format: Paperback
This book is exactly what I needed to get started on data science.
Prerequisites to benefit from the book:
I really wanted a starter book, as despite the fact that 30% of my university training was stats/probability, I'm very rusty on those topics. The book assumes no prior maths knowledge beyond basic operations. I already code in Python, so the refresher course was more a "which specific parts of Python are useful for data science". I've learned a few things, but I assume that if you are entirely new to programming, this may be a bit tough (in that case I would recommend a Python starter book such as Dive Into Python / Dive Into Python 3).
Note for Python 3 people: so far I have had no major issues running the scripts with Python 3, except a case of tuple unpacking that was easy to work around.
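The review does not say which construct failed. One likely candidate, offered here only as an assumption, is tuple unpacking in function or lambda parameters, which Python 2 allowed and Python 3 removed (PEP 3113). A minimal sketch of the workaround, using hypothetical data that is not taken from the book:

```python
# Python 2 allowed: sorted(pairs, key=lambda (word, count): count)
# In Python 3 that line is a SyntaxError, so either index into the
# tuple or unpack it inside the function body.

# Hypothetical (word, count) pairs, for illustration only.
pairs = [("data", 3), ("science", 1), ("python", 2)]

# Workaround 1: index into the tuple.
by_count = sorted(pairs, key=lambda pair: pair[1])

# Workaround 2: a named function that unpacks its single argument.
def count_of(pair):
    word, count = pair
    return count

by_count_again = sorted(pairs, key=count_of)
```

Both forms also run unchanged on Python 2.7, so the same script can serve either version.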

Approach of the book:
The book takes the example of the reader becoming chief data scientist at a dummy company, and the author provides the "business context" as needed (it's web-oriented, but easy to grasp).
What I love is that the book takes the approach of building the tooling in plain Python before pointing to the libraries: this serves as a support to explain the underlying maths, and it also avoids the "magic" effect of libraries, which makes it hard to solve issues when one doesn't understand the underlying mechanism. Libraries are still used, but only after explaining the basics. This makes the knowledge more "portable" if you decide to use another library or language.
Each chapter ends with a list of pointers to interesting online resources, libraries, etc., making it a good starting point to dig further.

I highly recommend this book to anyone interested in learning data science, especially to those with rusty/limited math background.
26 people found this helpful.
Format: Paperback Verified Purchase
This is among the handful of very best technical books I have ever read.

As the "from Scratch" in the title implies, the objective of this book is to teach the fundamental ideas and techniques of data science from first (or nearly first) principles. After working through this book, you'll be better able to meaningfully utilize the pre-packaged software (whether it's Matlab, R, scikit-learn, or whatever) that you will use in "real life".

And although the knowledge you'll gain is largely independent of the programming language, you will as a bonus learn from the clear and elegant Python code included. Every key topic, from probability, statistics, and other mathematical subjects, to machine learning and databases, is covered in a crystal clear manner.

In summary, this book is the bee's knees.
34 people found this helpful.
Format: Paperback
This book aims to give an introduction to data science from scratch and show concrete examples with Python. That sounded great to me, as I know some Python and wanted to learn more concrete data science. I was hoping for easy-to-understand, practical explanations backed by nice, explicit, easy-to-understand Python code. The author uses a lot of Python, but it is a specific variant using various lesser-known (at least to me) constructs (as he so proudly explains in the beginning...). So I have to keep leafing back from the later chapters to the Python introduction to understand what all the special classes are and try to decipher them. Similarly, the examples are spread throughout each chapter, so running them is a jigsaw puzzle. That is, I find the code examples of limited use; they could be much clearer in the different chapters if they skipped all the author's beloved little gimmicks. I want to learn one thing and not be distracted by special Python tricks.

The explanations are also rather limited. For example, standard deviation is described as a Python one-liner in the section on Statistics. Maybe just a little more explanation and background could be useful? In many places I find the proper explanation completely missing and not even proper code is there. For example, conditional probability is described using formulas such as P(B|G) = P(B,G)/P(G) = P(B)/P(G) = 1/2. That's it for the "example" on how to calculate it. So what were the numbers, where did they come from, how did they end up as 1/2? Where is the code? You know what your grade school teacher would say? SHOW YOUR THINKING. And you expect the reader with no stats background to learn from this?
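The quoted line does not define B and G. Assuming it refers to the usual two-child example, where B is "both children are girls" and G is "the older child is a girl" (an assumption, since the review does not say), the numbers can be recovered by enumerating the four equally likely families. This is only a sketch of that reasoning, not the book's code:

```python
from fractions import Fraction
from itertools import product

# Assumption: each child is independently a boy or a girl with equal
# probability, so the four (older, younger) outcomes are equally likely.
families = list(product(["boy", "girl"], repeat=2))

def probability(event):
    """Fraction of the equally likely families for which event(family) is true."""
    matches = [family for family in families if event(family)]
    return Fraction(len(matches), len(families))

def both_girls(family):      # event B (assumed meaning)
    return family == ("girl", "girl")

def older_is_girl(family):   # event G (assumed meaning)
    return family[0] == "girl"

p_b = probability(both_girls)       # 1 of 4 families
p_g = probability(older_is_girl)    # 2 of 4 families
# B implies G, so the joint event "B and G" is the same event as B.
p_b_and_g = probability(lambda f: both_girls(f) and older_is_girl(f))

p_b_given_g = p_b_and_g / p_g
print(p_b_given_g)  # 1/2
```

Under these assumptions P(B,G) equals P(B) because every family with two girls also has an older girl, which is why the formula collapses to P(B)/P(G) = (1/4)/(1/2) = 1/2.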
37 people found this helpful.