- File Size: 6263 KB
- Print Length: 357 pages
- Publisher: Dey Street Books; Reprint edition (May 9, 2017)
- Publication Date: May 9, 2017
- Sold by: HarperCollins Publishers
- Language: English
- ASIN: B01AFXZ2F4
- Text-to-Speech: Enabled
- Word Wise: Enabled
- Lending: Not Enabled
- Amazon Best Sellers Rank: #52,260 Paid in Kindle Store
Everybody Lies: Big Data, New Data, and What the Internet Can Tell Us About Who We Really Are Kindle Edition
From the Back Cover
How much sex are people really having?
How many Americans are actually racist?
Is America experiencing a hidden back-alley abortion crisis?
Can you game the stock market?
Does violent entertainment increase the rate of violent crime?
Do parents treat sons differently from daughters?
How many people actually read the books they buy?
In this groundbreaking work, Seth Stephens-Davidowitz, a Harvard-trained economist, former Google data scientist, and New York Times writer, argues that much of what we thought about people has been dead wrong. The reason? People lie, to friends, lovers, doctors, surveys—and themselves.
However, we no longer need to rely on what people tell us. New data from the internet—the traces of information that billions of people leave on Google, social media, dating, and even pornography sites—finally reveals the truth. By analyzing this digital goldmine, we can now learn what people really think, what they really want, and what they really do. Sometimes the new data will make you laugh out loud. Sometimes the new data will shock you. Sometimes the new data will deeply disturb you. But, always, this new data will make you think.
Everybody Lies combines the informed analysis of Nate Silver’s The Signal and the Noise, the storytelling of Malcolm Gladwell’s Outliers, and the wit and fun of Steven Levitt and Stephen Dubner’s Freakonomics in a book that will change the way you view the world. There is almost no limit to what can be learned about human nature from Big Data—provided, that is, you ask the right questions.--This text refers to the hardcover edition.
From the Inside Flap
In this groundbreaking work, Seth Stephens-Davidowitz, a Harvard-trained economist, former Google data scientist, and New York Times writer, argues that much of what we thought about people has been dead wrong. The reason? People lie, to friends, lovers, doctors, surveys--and themselves. However, we no longer need to rely on what people tell us. New data from the internet--the traces of information that billions of people leave on Google, social media, dating, and even pornography sites--finally reveals the truth.
Everybody Lies combines the informed analysis of Nate Silver's The Signal and the Noise, the storytelling of Malcolm Gladwell's Outliers, and the wit and fun of Stephen Dubner and Steven Levitt's Freakonomics in a book that will change the way you view the world. There is almost no limit to what can be learned about human nature from Big Data--provided, that is, you ask the right questions.--Lawrence Summers, President Emeritus and Charles W. Eliot University Professor of Harvard University --This text refers to an alternate kindle_edition edition.
What has allowed us to access this pool of unguarded opinions and truckloads of data concerning human behavior is the Internet and the tools of "big" data. As the author puts it, this data is not just "big" but also "new", which means that the kind of data we can access is also quite different from what we are used to; in his words, we live in a world where every sneeze, cough, internet purchase, political opinion, and evening run can be considered "data". This makes it possible to test hypotheses that we could not have tested before. For instance, the author gives the example of testing Freud's Oedipus Complex using data from pornography sites, which indicates a measurable interest in incest. Generally speaking there is quite an emphasis on exploring human sexuality in the book, partly because sexuality is one of those aspects of our life that we wish to hide the most and are also pruriently interested in, and partly because investigating this data through Google searches and pornographic sites reveals some rather bizarre sexual preferences that are also sometimes specific to one country or another. This is a somewhat fun use of data mining.
Data exploration can reveal the obvious as well as throw up unexpected observations. A more serious use of data tools concerns political opinions. Based on Google searches in particular states, the author shows how racism (as indicated by racist Google searches) was a primary indicator of which states voted for Obama in the 2008 election and Trump in the 2016 election. That's possibly an obvious conclusion, at least in retrospect. A more counterintuitive conclusion is that the racism divide does not seem to map neatly onto the urban-rural divide or the North-South divide, but rather onto the East-West divide; people seem to be searching much more for explicitly racist things in the East compared to the West. There is also an interesting survey of gay people in more and less tolerant states which concludes that you are equally likely to find gay people in both parts of the country. Another interesting section of the book talks about how calls for peace by politicians after terrorist attacks actually lead to more rather than fewer xenophobic Google searches; this is accompanied by a section that hints at how the trends can potentially be reversed if different words are used in political speeches. There is also an interesting discussion of how the belief that newspaper political leanings drive customer political preferences gets it exactly backward; the data shows that customer political preferences shape what newspapers print, so effectively they are doing nothing different from any other customer-focused, profit-making organization.
The primary tool for doing all this data analysis is correlation or regression analysis, where you look at online searches and try to find correlations between certain terms and factors like geographic location, gender, ethnicity. One hopes that one has separated the most important correlated variable and has eliminated other potentially important ones.
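To make that concrete, here is a minimal sketch of what such an analysis boils down to. The numbers are made up for illustration (they are not from the book), and the names `search_rate`, `outcome`, `pearson_r` and `ols_fit` are my own hypothetical choices; a real study would use far more regions and a proper statistics package.

```python
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def ols_fit(xs, ys):
    """Least-squares line y = slope * x + intercept for one predictor."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    intercept = my - slope * mx
    return slope, intercept


# Hypothetical data: one value per region (assumed, not from the book).
search_rate = [1.0, 2.0, 3.0, 4.0, 5.0]  # relative frequency of a search term
outcome = [2.1, 3.9, 6.2, 7.8, 10.0]     # some regional outcome of interest

r = pearson_r(search_rate, outcome)
slope, intercept = ols_fit(search_rate, outcome)
print(f"r = {r:.3f}, slope = {slope:.2f}, intercept = {intercept:.2f}")
```

A high `r` and a clean slope say only that the two series move together across regions; they say nothing about why.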
There are tons of other amusing and informative studies - sometimes the author's own but more often other people's - that reveal human desires and behavior across a wide swathe of fields, including politics, dating, sports, education, shopping and sexuality. There's plenty of potentially useful material in these studies. For instance, some of the data that indicates gaps in educational or social attainment in different parts of the country are immediately actionable in principle. Google searches have also been used to keep track of flu and other disease epidemics. Sometimes finding correlations is financially lucrative; there is a story about how a horse expert found that success in horse races seems to correlate with one factor more than any other: the size of the left ventricle. Another study isolated the impact of the early growing season on the quality of wines. There is no doubt that financial firms, supermarkets, newspapers, hospitals and online purveyors of everything from pornography to peanuts are going to keep a close eye on this data to maximize their reach and profits.
Generally speaking I enjoyed "Everybody Lies" for the scope of the material, the easy-going style and some of the counterintuitive observations it reveals. My main reservation about the book is that I think the author overstates his case and sometimes sounds a little too breathless about the great changes these tools are going to bring. More than once he uses the term "revolutionary" in describing these data tools, but I am much more suspicious of their ultimate utility. Firstly, data does not equal knowledge; rather, it is the raw material for knowledge. As the author himself acknowledges, understanding correlation is not the same as understanding causation, and it's in very few cases that a true causal relationship between people's Google searches and their true nature can be established. Part of the reason I think this way is that I don't believe that a person's Google search is as reflective of their innermost desires as the book seems to think, so what a person truly believes may go way beyond their online behavior. Consider the studies revealing people's sexual preferences for instance; how many of them point to trivial idiosyncrasies and how many are indicative of some deeper truth about human brains? The tools alone cannot draw this distinction. At the end of the day you could thus end up with a lot of data (including a lot of noise), but teasing apart the useful data points from the red herrings is a completely different matter. In this sense, looking at Google searches and other information can be a reductionist and simplistic approach.
Secondly, it's usually quite hard to control for all possible variables that may influence a Google search; for instance in concluding that racism contributes the most to a particular political behavior, it's very hard to tease out all other factors that also may do so, especially when you are talking about a heterogeneous collection of human beings. How can you know that you have corrected for every possible factor? Thirdly and finally, the "science" part of "data science" still lacks rigor in my opinion. For instance, a lot of the conclusions the book talks about are based on single studies which don't seem to have been repeated. In some cases the sample sizes are large, but in other cases they are small. Plus, people's opinions can change over time, so it's important to pick the right time window in which to do the study. All this points to great responsibility on the part of data scientists to make sure that their results are rigorous and not too simplistic, before they are taken up by both politicians and the general public as blunt instruments to change social policies. This responsibility grows as these approaches become more widespread and cheaper to use, particularly in the hands of non-specialists. When you are in possession of a hammer, everything starts looking like a nail.
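The worry about uncontrolled variables can be illustrated with a toy example. In the sketch below, all numbers are invented and the variable names (`hidden`, `search_rate`, `outcome`) are hypothetical: the outcome is driven entirely by a hidden factor, yet a regression on the search rate alone still produces a large coefficient. Only when the hidden factor is included does the search-rate coefficient drop to zero.

```python
from statistics import mean


def centered_sum(a, b):
    """Sum of products of deviations from the mean (an unscaled covariance)."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b))


def simple_slope(x, y):
    """Slope of the least-squares fit of y on x alone."""
    return centered_sum(x, y) / centered_sum(x, x)


def two_predictor_slopes(x, z, y):
    """Least-squares coefficients of y on x and z together (with intercept),
    solved directly from the 2x2 normal equations."""
    sxx, szz, sxz = centered_sum(x, x), centered_sum(z, z), centered_sum(x, z)
    sxy, szy = centered_sum(x, y), centered_sum(z, y)
    det = sxx * szz - sxz ** 2
    b_x = (sxy * szz - szy * sxz) / det
    b_z = (szy * sxx - sxy * sxz) / det
    return b_x, b_z


# Invented data: the hidden factor drives the outcome; the search rate merely
# tracks the hidden factor with some wobble and has no effect of its own.
hidden = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
search_rate = [1.5, 1.5, 2.5, 4.5, 5.5, 5.5]
outcome = [2.0 * h for h in hidden]

naive = simple_slope(search_rate, outcome)
b_search, b_hidden = two_predictor_slopes(search_rate, hidden, outcome)
print(f"search rate alone: {naive:.2f}")
print(f"with hidden factor: search={b_search:.2f}, hidden={b_hidden:.2f}")
```

Of course, in a real study you can only include the factors you thought to measure, which is exactly the problem.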
Considering all these caveats, I thus find tools like those described in this volume to be the starting points for understanding human behavior, rather than direct determinants of human behavior. The tools themselves can tell you what they can be used for, not necessarily what problems would benefit the most from their application. The many interesting studies in this book certainly answer the "what" quite well, but most of them are still quite far from answering the "how" and especially the "why". They point out the path to the door, but don't necessarily tell us which door to open. And they can be especially impoverished in illuminating what lies beyond; for that only a true understanding of the human mind will pave the way.
At the very start, he creates a map showing the prevalence of racist Google searches. Then he compares it to a map where Donald Trump performed the best in Republican primaries. Lo and behold, they’re a close match! However, he didn’t bother to continue this comparison with the general election map. That would have shown that many of the states Hillary won show up as equally or more racist than the surrounding Trump states. I guess he didn’t see any point in muddying up the narrative.
Then he mentions that five journals turned down all this amazing research into racist Google searching. Kudos to him for admitting that, but it would have been nice to know that the research on which this book was based was substandard. I could’ve saved a few bucks.
Only it isn't a joke.
The author of this book is an economist and a NY Times author. He describes the use of Internet data, especially aggregated Google searches, for social science. In this, the book is very good. That is, if you've never heard of the idea of using these data for research, this might be a book for you.
However, people have worked for several years on these data, and have much deeper, more interesting insights than Seth SD. Indeed, he seems ignorant of their work, which is a pity, because it's so much better than the work he describes. For example, Google Flu was preceded by the work of Polgreen and (independently) Eysenbach. Tracking events from Internet searches was previously done by Backstrom (2008), etc. Indeed, there is an entire academic conference (WSDM) devoted to the use of internet data.
It is as if a medical doctor would write a book about the effect of drug prices on consumption, ignoring all that's known in economics. Sure, it might be interesting, but it's probably going to contain a whole lot of errors and obvious stuff. The same happens in this book.
Thus, there are better books about many of the subjects described in the book, including "A billion wicked thoughts", "Dataclysm", and the recent "Crowdsourced health".
Additionally, and perhaps to make the book more readable, the author makes some mistakes which will be painful for anyone with an understanding of statistics. For example, the coefficients in regression don't show causality, only correlation.
In summary, if you've never heard of this idea, by all means read the book, but with caution. Otherwise, better books exist.
Top international reviews
So this book is full of interest for those believing - or who are open to being persuaded - that the march of big data into the social sciences is continuing. On the flip side, it shows such techniques are being used in the corporate and political world as well, to sell us more stuff or get us to donate more, primarily by using these big data techniques to run quick, natural feedback experiments to find out "what works". It also shows why this won't work for the stock market, as part of an overall section on the limitations of these techniques.
Highly recommended for those interested in the uses, actual and potential (and abuses), of big data in the modern world, particularly using internet searches as the dataset.
How useful is all this data? Well, it can’t yet be used to predict election results, but this may be possible in the future. In terms of medicine, the potential is enormous, particularly in the field of public health.
I really enjoyed this book and was pleased to find that the author’s columns for the New York Times are free to access, so even if you don’t read the book, you can see what he’s all about.
Disturbing as well, but useful for understanding how idiots get elected, how people learn to be thieves and commit tax fraud, for instance, and what makes us happy and sad.
On the downside, after reading 50-100 pages you sort of get the idea the author is trying to convey, after which it becomes highly repetitive. Also, since the author is US-based, 9 out of 10 examples are about things in America.
Big data may well yield insight into how we think, but the lack of knowledge about basic scientific methods (and the incorrect use of terms) does not make a convincing case for these studies being more than just something fun to read.