Data Smart: Using Data Science to Transform Information into Insight 1st Edition
John W. Foreman (Author)



- List Price: $45.00
- Save: $30.00 (67%)

But how does one exactly do data science? Do you have to hire one of these priests of the dark arts, the "data scientist," to extract this gold from your data? Nope.
Data science is little more than using straightforward steps to process raw data into actionable insight. And in Data Smart, author and data scientist John Foreman will show you how that's done within the familiar environment of a spreadsheet.
Editorial Reviews
From the Inside Flap
"Data Smart makes modern statistical methods and algorithms understandable and easy to implement. Slogging through textbooks and academic papers is no longer required!"
—Patrick Crosby, Founder of StatHat & first CTO at OkCupid
"When Mr. Foreman interviewed for a job at my company, he arrived dressed in a 'Kentucky Colonel' kind of suit and spoke about nonsensical things like barbecue, lasers, and orange juice pulp. Then, he explained how to de-mystify and solve just about any complex 'big data' problem in our company with simple spreadsheets. No server clusters, mainframes, or Hadoop-a-ma-jigs. Just Excel. I hired him on the spot. After reading this book, you too will learn how to use math and basic spreadsheet formulas to improve your business or, at the very least, how to trick senior executives into hiring you as their data scientist."
—Ben Chestnut, Founder & CEO of MailChimp
"You need a John Foreman on your analytics team. But if you can't have John, then reading this book is the next best thing."
—Patrick Lennon, Director of Analytics, The Coca-Cola Company
Most people are approaching data science all wrong. Here's how to do it right.
Not to disillusion you, but data scientists are not mystical practitioners of magical arts. Data science is something you can do. Really. This book shows you the significant data science techniques, how they work, how to use them, and how they benefit your business, large or small. It's not about coding or database technologies. It's about turning raw data into insight you can act upon, and doing it as quickly and painlessly as possible.
Roll up your sleeves and let's get going.
Relax — it's just a spreadsheet
Visit the companion website at www.wiley.com/go/datasmart to download spreadsheets for each chapter, and follow them as you learn about:
- Artificial intelligence using the general linear model, ensemble methods, and naive Bayes
- Clustering via k-means, spherical k-means, and graph modularity
- Mathematical optimization, including non-linear programming and genetic algorithms
- Working with time series data and forecasting with exponential smoothing
- Using Monte Carlo simulation to quantify and address risk
- Detecting outliers in single or multiple dimensions
- Exploring the data-science-focused R language
Product details
- Publisher: Wiley; 1st edition (November 12, 2013)
- Language: English
- Paperback: 432 pages
- ISBN-10: 111866146X
- ISBN-13: 978-1118661468
- Item Weight: 1.6 pounds
- Dimensions: 7.3 x 0.8 x 9.1 inches
- Best Sellers Rank: #63,179 in Books
- #16 in Business Operations Research (Books)
- #29 in Artificial Intelligence (Books)
- #39 in Data Mining (Books)
About the author

John is the Chief Data Scientist for MailChimp.com. He's also a recovering management consultant who's done a lot of analytics work for large businesses (Coke, Royal Caribbean, Intercontinental Hotels) and the government (DoD, IRS, DHS).
These days John does all sorts of awesome data science for MailChimp, and he blogs for fun about analytics through narrative fiction at AnalyticsMadeSkeezy.com. Spoiler alert: the characters who do meth are frequently confused or in peril. John does not do meth.
Customer reviews
Top reviews from the United States
First, a drop about me from the standpoint of this book. I have been an IT professional for many years specializing in programming, database, and MS Office add-ons. Part of my job entails self-enrichment, that is, expanding my working knowledge in areas potentially important for my job. I chose Foreman's book to help with this task for a number of reasons: a) Data Science is a hot area and my company does have a Data Science group, b) I have lots of data experience under my belt - I felt that it would be nice for once to get some useful information from the data, and c) I have a really good Excel background - so I figured that Foreman's approach would be perfect for me - little did I know that I would seriously add to my Excel bag of tricks.
The author makes the assumptions that: a) the reader is somewhat technical, b) he knows nothing about Data Science, and c) he is relatively comfortable working in Excel.
Reading the book is a joy because Foreman has a cozy, chummy style. He definitely doesn't throw all the technical stuff at the reader rat-tat-tat machine gun style like many other authors. Instead, Foreman gently introduces his topics and then ramps up technical details carefully. This most definitely helps the learning process.
Speaking of learning, by the end of the book you will have learned important concepts in "machine learning" and I believe that you will be ready for the next step. I sure was. I found the topics interesting and I wanted to learn more. This is where the book's only problem area comes into play - the next step. Foreman has 3 references - one good but minor, one terrible, and one inappropriate. Let me explain.
Foreman recommends a free resource as a follow-on to his Forecasting Chapter. This is a good reference, but I believe that Forecasting is a minor topic in Data Science, unless, of course, Forecasting becomes your thing.
Foreman's main reference is: "Data Mining with R" by Luis Torgo. Foreman recommends this as the next step after his book. I tried to read this several times, but couldn't. It certainly wasn't my next step.
The other reference, "The Elements of Statistical Learning" by Trevor Hastie, et al., is totally inappropriate for Data Science newbies. You can check out the Amazon reviews for this book and you'll see that you need a pretty serious background in statistics to get anything out of that reference. In fact, the author Hastie says as much in his next book "An Introduction to Statistical Learning: with Applications in R". This is the appropriate next step, but I'll get to that in a moment.
Here are my recommendations:
A. Read Foreman's book and follow along with him in working through the Excel spreadsheets. This is a first step in getting comfortable with Machine Learning.
B. Take the Coursera courses: 1) Machine Learning Foundations: A Case Study Approach, and 2) Machine Learning: Regression. The courses are free unless you want completion certificates, in which case there is a nominal cost.
C. Now you are ready for: An Introduction to Statistical Learning: with Applications in R (Springer Texts in Statistics). This book is also available for free from the authors - check online.
Foreman’s book is written in a nice and funny style, which makes it an easy read. Data mining algorithms are described with the minimum equations needed. Foreman has written a practical book and thus decided to use Excel as a tool for data science. The book starts with an introduction to Excel and its most famous functions. If you are a data scientist using SAS, R, Python or Matlab, you may discover how powerful Excel is. But you will also see how clumsy it is to use Excel for data science. Whereas you would need a few lines in R, the book will take you through a dozen pages of step-by-step actions you need to perform to obtain the same in Excel. Not only is it more time consuming but also more prone to errors.
Don’t get me wrong: Data Smart is excellent at explaining how to perform data science in Excel. I just think Excel is not the right tool for it. The book is also a journey into MailChimp, the author’s company. This is nice and provides plenty of examples related to e-mail marketing. The book thus provides a quick, high-level description of each problem, followed by Excel steps to solve it. In conclusion, Data Smart is a must read to get a fresh perspective on data science with a “Data Science using Excel” user manual. And for the experts? You can just skip the Excel parts and get insights into the field, with a focus on MailChimp use cases.
Too often I'm in a meeting where the data scientist is incapable of articulating what it is they did or found, and the stakeholders in the room nod in agreement despite their obvious incomprehension. This book helps both: the stakeholders understand what's going on with the major themes in data science, and the scientist can learn ways to pragmatically describe their art. Plus, both sides get to laugh. Win-win.
What is nice about him using Excel (with data sets that are also available for download from the publisher's site) is that all the extra steps Excel requires to carry out these methods also help the reader to better grasp the reasoning behind them along the way. That he peppers the text with a humorous style also makes this a rare treasure among introductory data science books.
Top reviews from other countries

Thankfully I didn't, because it is a difficult subject matter, but there aren't any problems with the writing or content.
I have had to go over it quite a few times to "get it" and there are still things I don't fully understand, but at least I can replicate the steps to get results. Some readers might be able to grasp it first time, but I think many won't unless they already know about things such as Clustering and Exponential Smoothing.
If you are looking for a basic intro to Data Analysis you should avoid this, but if you are looking to really get stuck in, then buy it. Be warned what you are letting yourself in for: just browsing it or reading once and putting on the bookshelf will be of no use whatsoever. I've spent at least 30 hours on it so far.
Overall the content and the ideas are brilliant, and the guy is really really funny - still doesn't make it any easier.


Those with a strong background in statistics and SAS/SPSS/R will find it interesting, and useful for helping others understand what they are doing (i.e. it isn't magic) - but the target audience would be those trying to gain business insight (probably in SMEs) who are technically able, but who only know and / or have access to one tool for solving the problem - i.e. Excel. There are a lot of people in this situation, and this book could dramatically widen their horizons when considering issues such as optimisation or customer insight.
Ultimately the author himself ends the book by stating that Excel isn't really the right tool for the job, and introduces R as an open source alternative. The demographic at which this book is aimed would most likely not have considered this a possible path before they picked up this book, but I believe by the end of it most will give it proper consideration as a means to best apply the methods detailed within the book.
As an aside, the author has a quirky and amusing style, which went down well at least with me, such that I read the first 4 chapters in one sitting...


The author has a great sense of humour. Also, he is top class at what he does...otherwise he wouldn't lead the analytics team at MailChimp.