Data Science For Dummies Paperback – March 9, 2015
There is a newer edition of this item.
From the Back Cover
- Deduce, discover, and communicate valuable insights from structured, semi-structured, and unstructured data sources
- Use meaningful visualizations to display and interpret data
- Take advantage of data processing tools like Hadoop® and MapReduce
- Turn your organization's data into a competitive advantage
Gain in-depth insight into your business with data science: this book makes it easy!
Big data is a big deal. This book helps you harness its power and give your business that all-important competitive edge. You'll learn to manage large amounts of data within hardware and software limitations, merge data sources, ensure consistent reporting, and interpret the data to tell your business story in a way that's easily understood.
- Get a grip on data science: understand what it is, who uses it, and what it can do
- How big is it? See how big data is defined and how to handle it with MapReduce, Hadoop, and alternative solutions
- It's probable: explore probability and statistics in interpreting your data
- Model ideas: learn about mathematical modeling, fuzzy multi-criteria programming, and modeling spatial data with statistics
- Make it visual: examine different types of data visualization techniques and learn to choose the style that's right for your purpose and your audience
- Ideal technology: learn where Python®, Open Source R, SQL, or even Excel® may be the tool you need
- The sky's the limit: see how data science can help solve environmental issues, drive e-commerce, and even predict criminal activity
Open the book and find:
- The basics of data science
- Ways to define big data
- How business benefits from data science
- Information about regression and clustering techniques
- Various visualization options
- Tips for designing great dashboards
- How data science is used in journalism
- Ten free data science tools and applications
About the Author
Lillian Pierson, P.E. is an entrepreneurial data scientist and professional environmental engineer. She's the founder of Data-Mania, a start-up that focuses mainly on web analytics, data-driven growth services, data journalism, and data science training services. She also covers the topics of data science, analytics, and statistics for prominent organizations like IBM and UBM.
Top customer reviews
The topics covered seem like the right ones, and there are “cheat sheets” and updates online that could be very helpful in keeping the content current. The problem is that the book is so poorly written that it is not going to be useful unless you already know the material.
The author takes pains to define the many technical terms used in data science, but most of the definitions are poorly done and almost seem circular. Time and again the definition includes the very terms that are being defined or uses the same word repeatedly and unclearly, e.g., “If science is a systematic method by which people study and explain domain-specific phenomenon[SIC] that occur in the natural world, then you can think of data science as the scientific domain that’s dedicated to knowledge discovery via data analysis.”
Despite the many definitions provided, many explanations include technical terms that the average non-engineering person would not know. For example, there is a detailed description of the MapReduce programming paradigm that forms the basis for the Hadoop data-processing platform that is apparently used extensively in data science. The explanation uses terms like “key-value pairs” without defining them. When I gave up trying to understand the explanation I showed it to a recently retired computer science professor and a still-employed engineer who is familiar with Hadoop. Both found it very difficult to understand. In this case the explanation MIGHT have been improved greatly by a simple example.
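To show the kind of simple example meant here: a key-value pair is just a label (the key) attached to a piece of data (the value). The sketch below is not taken from the book and does not use Hadoop; it is a minimal, single-machine imitation of the MapReduce idea, counting words in plain Python. The function names (map_phase, shuffle_phase, reduce_phase, word_count) are made up for illustration.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit one (key, value) pair, (word, 1), for every word."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle: gather all values that share the same key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: collapse each key's list of values into a single total."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over a list of strings."""
    pairs = []
    for doc in documents:
        pairs.extend(map_phase(doc))
    return reduce_phase(shuffle_phase(pairs))


counts = word_count(["big data is big", "data science is fun"])
print(counts)
```

In a real cluster the map and reduce steps would run in parallel on many machines, with the framework handling the shuffle; the sketch only illustrates how the data flows as key-value pairs.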
In addition to the problems above, the writing style is awkward. It is extremely repetitive. For example, data science and data engineering are defined and distinguished from each other three times in the first three chapters; in several cases, exact sentences are repeated. The Dummies books are usually organized so that you can use them as a reference and would not have to read straight through, but the level of repetition here is unnecessary and does not make for an enjoyable read. Word usage is also poor, which can contribute to misunderstanding, e.g., “Since a data scientist must also have subject matter expertise in the particular area in which they work, this requirement generally precludes a data scientist from having expertise in data engineering.” “Precludes”? I can see how such expertise might be fairly rare, but it certainly is not precluded. And sentences are poorly constructed, e.g., “Machine learning is the practice of applying algorithms to learn from and make automated predictions about data.” No, "machine learning" learns from data, but it does not make predictions about data.
In sum, this book is likely to leave most of its intended audience as frustrated as I was. This is a topic of great interest and importance today, but I suggest you look elsewhere for an introduction.
Sadly, this wasn't so. Admittedly, I only made it through the first 25% of the book before I couldn't get myself to continue. I found myself spending more time marking up mistakes in the book than actually reading. Even the treatment of a concept as simple as singular value decomposition (SVD) is full of errors: matrices are called vectors, the order of matrix multiplication is displayed incorrectly, etc.
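For reference, the standard textbook statement of the decomposition (in common notation, which is an assumption here and not taken from the book) factors a real m × n matrix A into three matrices, and the order of the factors matters:

\[
A = U \Sigma V^{\mathsf{T}}, \qquad
U \in \mathbb{R}^{m \times m}, \quad
\Sigma \in \mathbb{R}^{m \times n}, \quad
V \in \mathbb{R}^{n \times n},
\]

where U and V are orthogonal matrices (U^T U = I, V^T V = I) and Σ is zero everywhere except for the non-negative singular values σ_1 ≥ σ_2 ≥ … ≥ 0 on its diagonal. All three factors are matrices; the vectors involved are the individual columns of U and V, known as the left and right singular vectors.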
Everybody makes mistakes, and this wouldn't be such a deal breaker if the rest of the book up to here contained useful information. To put it in terms of data science, if you had to perform data reduction on Part I of the book, the eigenvalue would be zero, yet this would be a good predictor for the next 5%. There's absolutely nothing in the first part that would be worth keeping if I were the technical editor. A bunch of pointless drivel to fill the pages of what was touted in the abstract as
"Some books on data science are needlessly wordy, with their authors going in circles trying to get to the point. Not so here."
Pierson, Lillian. Data Science For Dummies (p. 1). Wiley. Kindle Edition.
Oh, the irony. The author surely fooled me. Take, for example, the following 'explanation':
"Spark SQL: You use this module to work with and query structured data using Spark. Within Spark, you can query data using Spark’s built-in SQL package: SparkSQL."
Pierson, Lillian. Data Science For Dummies (p. 58). Wiley. Kindle Edition.
The book is full of these kinds of circular, zero-content statements. Despite only having paid about a lunch's worth of money, I will kindly ask Amazon to issue a refund. This book should not be for sale before it has been edited by someone.
If you are interested in data science, have a look at Data Science for Business: What You Need to Know about Data Mining and Data-Analytic Thinking, for example. I also recently checked out Naked Statistics: Stripping the Dread from the Data from our local library, but haven't started reading it yet.
Most recent customer reviews
It's an introduction that in very relatable terms explains the breadth and business...