- Paperback: 318 pages
- Publisher: O'Reilly Media; 1 edition (May 28, 2017)
- Language: English
- ISBN-10: 1491952962
- ISBN-13: 978-1491952962
- Product Dimensions: 6.9 x 0.6 x 9.1 inches
- Shipping Weight: 1.2 pounds
- Average Customer Review: 18 customer reviews
- Amazon Best Sellers Rank: #10,649 in Books
Practical Statistics for Data Scientists: 50 Essential Concepts 1st Edition
From the Publisher
About this Book
Data science is a fusion of multiple disciplines, including statistics, computer science, information technology and domain-specific fields. As a result, several different terms could be used to reference a given concept. Key terms and their synonyms will be highlighted throughout the book in a sidebar within the text.
This book is aimed at the data scientist with some familiarity with the R programming language, and with some prior (perhaps spotty or ephemeral) exposure to statistics. Both of us came to the world of data science from the world of statistics, and have some appreciation of the contribution that statistics can make to the art of data science. At the same time, we are well aware of the limitations of traditional statistics instruction: statistics as a discipline is a century and a half old, and most statistics textbooks and courses are laden with the momentum and inertia worthy of an ocean liner.
Two goals underlie this book:
- To lay out, in digestible, navigable and easily referenced form, key concepts from statistics that are relevant to data science.
- To explain which concepts are important and useful from a data science perspective, which are less so, and why.
About the Author
Peter Bruce founded and grew the Institute for Statistics Education at Statistics.com, which now offers about 100 courses in statistics, roughly a third of which are aimed at the data scientist. In recruiting top authors as instructors and forging a marketing strategy to reach professional data scientists, Peter has developed both a broad view of the target market, and his own expertise to reach it.
Andrew Bruce has over 30 years of experience in statistics and data science in academia, government and business. He has a Ph.D. in statistics from the University of Washington and has published numerous papers in refereed journals. He has developed statistical-based solutions to a wide range of problems faced by a variety of industries, from established financial firms to internet startups, and offers a deep understanding of the practice of data science.
Top customer reviews
I dislike that the authors make a number of categorical statements of the form "Data Scientists do this" or "Data Scientists don't need that". I disagree with many of these assertions and I think they have taken a definition of "data science" which is narrower than the prevailing consensus in the industry.
This book has some errors (see, for example, the confusion matrix on page 196) but overall the accuracy is above average relative to recent norms.
As other reviewers have noted, the authors' GitHub repository for the book is currently empty. If that's important to you, check it under "andrewgbruce" on GitHub and make sure it's been updated before you buy the book.
The concepts are not explained in exhaustive depth, but in just enough depth that I can explain each of them to other people. What really stands out for me so far is that after nearly every concept there is a section labeled "further reading" (well, in the digital copy), the kind of thing that usually sits at the very end of a book. I found myself realizing I own a lot of those books, so the authors really know where to look and how to guide those who want more depth.
Yeah yeah yeah, the code is missing (as of mid-June 2017), but if you really understand the material and know which packages to use, you won't need the code. The first half of the book is two- or three-liners of code anyway; it's the explanations that matter the most. The second half of the book is the good part, which separates a white hat statistician from a grey hat data scientist, which is exactly what I wanted in a <300 page book.
Thanks for keeping me waiting since November though, thought it would never come! The O'Reilly books always keep me in awe at how they always know what topic I want a brief book on (probably collecting data on me :P) & simultaneously leave me in suspense because I never notice I am preordering the books! Sigh. My only request is to be able to preorder the Kindle editions rather than the physical editions; my data science book cubby is starting to overwhelm my statistics cubby (NOT FOR LONG MASTERS PROGRAM ~).
OK, the datasets are up. There is a short R script to run to download the data; it requires some small modifications to work correctly.
You need to create a folder named "data",
and I changed the second line in the script from:
PSDS_PATH <- file.path('~', 'statistics-for-data-scientists')
to:
PSDS_PATH <- file.path('.')
This will download the data into a folder named "data" in whatever directory you run the script. The script runs with no real feedback and some of the data sets are large, so just be patient. Once these were downloaded the examples in the book run great.
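The two changes described above could be sketched in R roughly as follows. This is only a minimal sketch, not the book's actual download script: `PSDS_PATH` is the variable quoted in the review, while `data_dir` is a hypothetical name introduced here for illustration, and the sketch assumes the script writes into a "data" folder directly under `PSDS_PATH`.

```r
# Point the download location at the current working directory
# instead of a folder under the home directory (the change from the review).
PSDS_PATH <- file.path('.')

# Create the "data" folder if it does not exist yet.
# `data_dir` is a hypothetical helper name, not taken from the book's script.
data_dir <- file.path(PSDS_PATH, 'data')
if (!dir.exists(data_dir)) {
  dir.create(data_dir)
}
```

Run from the directory where you want the files to land, this leaves a "data" folder in place before the download script starts.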
It is true that the textbook does not provide in-depth coverage for all topics, but I don't think that was the intent of the authors. However, the text DOES provide an excellent introduction to topics relevant to students and data scientists. After reading the text and working through the examples, you will be equipped to further your knowledge in whichever topic you require for your data analysis task.