Doing Data Science: Straight Talk from the Frontline 1st Edition

47 customer reviews
ISBN-13: 978-1449358655
ISBN-10: 1449358659
$26.99. FREE Shipping on orders over $35. In Stock. Gift-wrap available.

Frequently Bought Together

Doing Data Science: Straight Talk from the Frontline + Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython + Data Science from Scratch: First Principles with Python
Price for all three: $81.13

Editorial Reviews

Review

What's the animal featured on the cover?

The animal on the cover of Doing Data Science is a nine-banded armadillo (Dasypus novemcinctus), a mammal widespread throughout North, Central, and South America. From Latin, novemcinctus literally translates to “nine-banded” (after the telescoping rings of armor around the midsection), though the animal can actually have between 7 and 11 bands. The three-banded armadillo native to South America is the only armadillo that can roll into a ball for protection; other species have too many plates. The armadillo’s skin is perhaps its most notable feature. Brownish-gray and leathery, it is composed of scaly plates called scutes that cover everything but its underside.

The animals also have powerful digging claws, and are known to create several burrows within their territory, which they mark with scent glands. Nine-banded armadillos typically weigh between 5.5 and 14 pounds, and are around the size of a large domestic cat. Their diet is largely made up of insects, though they will also eat fruit, small reptiles, and eggs. Females almost always have a litter of four: quadruplets of the same sex, because the zygote splits into four embryos after implantation. Young armadillos have soft skin when they are born, but it hardens as they get older. They are able to walk within a few hours of birth. Nine-banded armadillos are capable of jumping three to four feet in the air if startled. Though this reaction can scare off natural predators, it is usually fatal for the armadillo if an approaching car is what has frightened it, as it will collide with the underside of the vehicle. Another unfortunate connection between humans and nine-banded armadillos is that they are among the few animals other than humans known to carry leprosy; it is not unheard of for humans to become infected when they eat or handle armadillos. The cover image is from Shaw’s Zoology, and was reinterpreted in color by Karen Montgomery.


"Every once in a while a single book comes to crystallize a new discipline. If books still have this power in the era of electronic media, "Doing Data Science: Straight Talk from the Frontline" by Rachel Schutt and Cathy O'Neil: O'Reilly, 2013 might just be the book that defines data science."

 -- Joseph Rickert
Revolutions Blog

"I enjoyed Rachel and Cathy’s book, it’s readable, informative, and like no other book I’ve read on the topic of statistics or data science." 
—Andrew Gelman
Professor of statistics and political science, and director of the Applied Statistics Center at Columbia University

"I got a lot out of Doing Data Science, finding the chapter organization on business problem specification, analytics formulation, data access/wrangling, and computer code to be very helpful in understanding DS solutions."
—Steve Miller
Co-founder, OpenBI, LLC, a Chicago-based business intelligence services firm


Product Details

  • Paperback: 408 pages
  • Publisher: O'Reilly Media; 1 edition (November 3, 2013)
  • Language: English
  • ISBN-10: 1449358659
  • ISBN-13: 978-1449358655
  • Product Dimensions: 6 x 0.8 x 9 inches
  • Shipping Weight: 1.3 pounds
  • Average Customer Review: 4.1 out of 5 stars (47 customer reviews)
  • Amazon Best Sellers Rank: #26,820 in Books

Customer Reviews

Most Helpful Customer Reviews

113 of 113 people found the following review helpful By Carsten Jørgensen on December 28, 2013
Format: Paperback
Book review - Doing Data Science by O'Neil and Schutt, O'Reilly Media.

More breadth than depth

What is data science? The book Doing Data Science not only explains what data science is but also provides a broad overview of methods and techniques that one must master in order to call oneself a data scientist. The book is based on a course about data science given at Columbia University. However, it should not be considered a textbook about data science but rather a broad introduction to a number of topics in data science.

In the spring of 2013 I followed two Coursera courses: one about the statistical programming language R and one on data analysis. I had for some time been looking for a book that could be used as follow-up reading on topics in data science. This was the reason I picked up "Doing Data Science".

The book begins with a chapter about what data science is all about, followed by four chapters on topics like statistical inference, exploratory data analysis, various machine learning algorithms, linear and logistic regression, and Naive Bayes. I have a background in both mathematics and statistics and I was able to understand these chapters, but the material is covered in such broad terms that I find it hard to believe that a newcomer to these topics will understand or gain much knowledge from reading them. Basic math is presented about the models, but without some kind of detailed explanation one cannot develop any deeper intuition for the approach explained.

The best parts of the book are definitely chapters 6 to 8 and 10. Here we find interesting discussions of data science applied to financial modeling, extracting information from data, and social networks.
49 of 51 people found the following review helpful By Dan D. Gutierrez on November 19, 2013
Format: Paperback
I found this book to be a very odd bird indeed. It is one book you can read from back cover to front cover and not be at a disadvantage. This is because the book is really just a collection of presentations made by various people to a class taught by the primary author Rachel Schutt at Columbia University in the Fall of 2012 – Introduction to Data Science. It wasn’t entirely clear what content Schutt was directly responsible for since only some of the chapters indicate who the contributors were (one of the chapters was contributed by a group of her students!). The co-author, Cathy O’Neil, I’ve encountered before as an outspoken blogger going by the name “mathbabe” but it wasn’t specifically stated how she became part of the book project, other than to say she was one of the students in Schutt’s class. Chapter 6 was partly written by O’Neil.

Both Schutt and O’Neil hold Ph.D.s in fields appropriate to data science, but the book was not “written” by the two; rather, they seem to have performed some kind of editing function with the materials submitted by each contributor and added commentaries of their own. As a result, the book is a hodgepodge of anecdotes, factoids, R code snippets, plots, and mathematics, all from the in-class presentations. I enjoy seeing math in data science books, but the equations in this book were sort of just floating there, requiring the reader to explore further at another time.

Although I have issues with the book as it is not any sort of text for the field, I did enjoy reading it with a number of “Ah, I didn’t know that!” moments. Schutt’s credentials in data science are considerable, having worked at Google for a few years around the same time that “data science” was growing up in Silicon Valley.
74 of 88 people found the following review helpful By Dimitri Shvorob on October 29, 2013
Format: Kindle Edition
... helps the medicine go down, as Mary Poppins used to say. An IT-focused publisher, O'Reilly has twice before used the "book as collection of chapters by different contributors" formula in its foray into the attractive "data" niche, with such titles as "Beautiful data" and "Bad data". "Doing data science" - by the way, I prefer Hastie and Tibshirani's "statistical learning" to the fuzzy and grandiose "data science" - follows the same approach, but, with its subject matter being closer to the academe, the company enlisted two young PhDs to steer the collaborative effort. Rachel Schutt took the lead as author and editor, and, assisted by Cathy O'Neil, produced an engaging, informal - you don't often see "science" in the title and "huge-ass" in the text - yet sufficiently technical to be hands-on, sequence-of-vignettes-styled book. Imagine a mash-up of a magazine article and a textbook. Neither part may be best-in-class, but their combination makes for a "unique selling proposition".

Well, maybe not a textbook. Most textbooks are carefully written and carefully checked. In contrast, when I see "Doing data science" introduce the ROC curve in three places, one of which translates the "O" as "operator", I can guess that this is a copy-paste of papers by three contributors. When Dr. O'Neil casually redefines an English word ("causal") to avoid rewriting a couple of sentences, or pronounces, on page 159, that "priors reduce degrees of freedom" - this is painfully meaningless, and neither term is defined, only name-checked - I suspect that she knows better, but just did not feel like spending more time on her half-chapter. Neither author speaks of their own projects - if this is the "frontline", then it's other soldiers' "trenches" that we are visiting.