Leveraging data-driven insights is really the only way any of us should be making decisions
on June 20, 2015
Great book on the importance of data-driven decision making. While I have always been someone who lets the data do the talking, I hadn't found an easy way to explain why. Super Crunchers is that easy way! Below I have summarized some of the important points of the book.
Super Crunching is crucially about the impact of statistical analysis on real-world decisions. Two core techniques for Super Crunching are regression and randomization.
1. Regression will make your predictions more accurate (Historical approach):
It all starts with regression. Although this method is a basic statistical test of a causal relationship, it's still a very powerful tool that I need to re-introduce into my analytical life.
Regressions make predictions and tell you how precise those predictions are. A regression tries to home in on the causal impact of a variable on a dependent variable. It can tell us the weights to place upon various factors and simultaneously tell us how precisely it was able to estimate these weights.
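To make the "weights plus precision" idea concrete, here is a minimal sketch of a one-variable least-squares regression in plain Python. It is my own illustration, not something from the book: the function name `fit_line` and the numbers are made up, and a real analysis would use a statistics package.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x.

    Returns (intercept, slope, slope_standard_error).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # the "weight" on x
    intercept = mean_y - slope * mean_x
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    # Residual variance uses n - 2 degrees of freedom (two fitted parameters).
    resid_var = sum(r ** 2 for r in residuals) / (n - 2)
    slope_se = math.sqrt(resid_var / sxx)  # how precisely the weight is estimated
    return intercept, slope, slope_se

# Made-up data, for illustration only.
xs = [1, 2, 3, 4, 5, 6]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]
intercept, slope, slope_se = fit_line(xs, ys)
print(f"weight on x: {slope:.3f} +/- {slope_se:.3f} (standard error)")
```

The standard error is the "how precise" part: a small value relative to the weight means the data pin the weight down tightly.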
2. Randomization and large sample sizes (Present/Real-Time approach):
Reliance on historical data makes it harder to discern causation. Large randomized tests work because, as the sample grows, the randomly assigned groups become increasingly identical in every respect except the treatment. Think A/B testing on steroids that allows you to quickly test different combinations! It boils down to comparing the averages of the "treated" and "untreated" groups.
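A rough sketch of that comparison of averages, using simulated data. Everything here is an assumption for illustration: the `TRUE_EFFECT` of 2.0, the population of 10,000, and the helper `difference_in_means` are mine, not the book's.

```python
import math
import random
import statistics

def difference_in_means(treated, untreated):
    """Return (effect estimate, standard error) for two independent groups."""
    effect = statistics.mean(treated) - statistics.mean(untreated)
    se = math.sqrt(statistics.variance(treated) / len(treated)
                   + statistics.variance(untreated) / len(untreated))
    return effect, se

rng = random.Random(0)   # fixed seed so the simulation is repeatable
TRUE_EFFECT = 2.0        # hypothetical effect built into the simulation

# Each unit has its own baseline outcome, whatever group it lands in.
baselines = [rng.gauss(50, 10) for _ in range(10_000)]

# Random assignment: a coin flip, not the analyst, decides who is treated.
treated, untreated = [], []
for baseline in baselines:
    if rng.random() < 0.5:
        treated.append(baseline + TRUE_EFFECT)
    else:
        untreated.append(baseline)

effect, se = difference_in_means(treated, untreated)
print(f"estimated effect: {effect:.2f} +/- {se:.2f} (standard error)")
```

Because the coin flip ignores the baselines, the two groups start out alike on average, so the gap between their averages can be read as the effect of the treatment.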
Government has embraced randomization as the best way to test what works. Statistical profiling led to smarter targeting of government support.
With finite amounts of data, we can only estimate a finite number of causal effects.
3. Neural networks:
Unlike the regression approach, which estimates the weights to apply to a single equation, the neural approach uses a system of equations represented by a series of interconnected switches.
Computers use historical data to train the equation switches to come up with optimal weights. But while the neural technique can yield powerful predictions, it does a poorer job of telling you why it is working or how much confidence it has in its prediction.
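For a feel of what "training the switches" means, here is a toy sketch of a one-hidden-layer network trained by gradient descent in plain Python. It is my own illustration under stated assumptions: the class `TinyNet`, the XOR data, the four hidden units, and the learning rate are all invented for the example and are not from the book.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class TinyNet:
    """One hidden layer of sigmoid 'switches' feeding a single sigmoid output."""

    def __init__(self, n_inputs, n_hidden, seed=0):
        rng = random.Random(seed)
        # The last weight in each row is a bias term.
        self.w_hidden = [[rng.uniform(-1, 1) for _ in range(n_inputs + 1)]
                         for _ in range(n_hidden)]
        self.w_out = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]

    def forward(self, x):
        hidden = [sigmoid(sum(w * v for w, v in zip(ws, x + [1.0])))
                  for ws in self.w_hidden]
        out = sigmoid(sum(w * h for w, h in zip(self.w_out, hidden + [1.0])))
        return hidden, out

    def train_step(self, x, target, lr):
        """One gradient-descent update on the squared error for one example."""
        hidden, out = self.forward(x)
        delta_out = (out - target) * out * (1 - out)
        # Hidden-layer deltas use the output weights from before this update.
        delta_hidden = [delta_out * self.w_out[j] * h * (1 - h)
                        for j, h in enumerate(hidden)]
        for j, h in enumerate(hidden + [1.0]):
            self.w_out[j] -= lr * delta_out * h
        for j, ws in enumerate(self.w_hidden):
            for i, v in enumerate(x + [1.0]):
                ws[i] -= lr * delta_hidden[j] * v

def mean_squared_error(net, data):
    return sum((net.forward(x)[1] - t) ** 2 for x, t in data) / len(data)

# XOR: a pattern that a single straight-line equation cannot capture.
data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0),
        ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)]

net = TinyNet(n_inputs=2, n_hidden=4)
loss_before = mean_squared_error(net, data)
for _ in range(5000):              # repeated passes over the historical data
    for x, target in data:
        net.train_step(x, target, lr=0.5)
loss_after = mean_squared_error(net, data)
print(f"error before training: {loss_before:.4f}, after: {loss_after:.4f}")
```

Notice what is missing compared with a regression: the trained weights come with no standard errors, and no single weight has an obvious meaning, which is the "why is it working?" problem described above.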
Super Crunching requires analysis of the results of repeated decisions. If you can't measure what you're trying to maximize, you're not going to be able to rely on data-driven decisions.
We humans just overestimate our ability to make good decisions and we're skeptical that a formula that necessarily ignores innumerable pieces of information could do a better job than we could.