Q&A with Christian Rudder, cofounder of OkCupid and author of Dataclysm
As more of our social interaction happens on social media, how much can researchers learn about us from our online interactions?
Well, they can only learn what we tell them, but in the age of Facebook and Google, that’s become pretty much everything. To the extent that friendship, anger, sex, love, and whatever else happen online, we can investigate them.
Your search history tells us what kind of jokes you like. Your Facebook network reveals not just your friendships, but in some cases the state of your marriage. Your preferences on OkCupid tell us what you find sexy, and your reaction to the strangers the site offers up tells us how you judge people. The articles you “like” tell us not just about your politics, but even predict your intelligence.
You fold in data points like these for millions and millions of people, and you start to get a whole new picture of humankind.
In Dataclysm you’re taking this flood of information and putting it to an entirely new use: understanding human nature. So what have you found?
I tried really hard to avoid the numerical dog and pony show. There are of course lots of interesting one-off factoids, but I mostly found what I (and probably you) have always known: that people are gentle, mean, stupid, lusty, lonely, kind, foolish, shrewd, shallow, and endlessly complex. Dataclysm’s central idea isn’t necessarily what we can see using big data; it’s the fact of the vision itself. That we can get real data on even the most private moments in people’s lives is an astounding thing. It’s like the second advent of reality television, but this time without the television part. Just the reality.
Are you worried about any of this?
I have mixed feelings about the implications. I myself almost never tweet, post, or share anything about my personal life. At the same time, I’ve just spent three years writing about how interesting all this data is, and I cofounded OkCupid. My hope is that this ambivalence makes me a trustworthy guide through the thicket of technology and data. I admire the knowledge that social data can bring us; I also fear the consequences.
You have a lot to say about race in the book, and you use data to shed light on the many ways it affects the way we interact with one another. What surprised you about your research in this area? Did you find anything unsurprising?
The data on race was surprising only in its stubborn predictability—for all the glitzy technology, the results could’ve been from the 1950s. I grew up in Little Rock and graduated from Central High, the first school in the South to be integrated: Eisenhower, the National Guard, mobs of white people screaming at nine black children, that’s Central. The school embraces its history and is now over half black. I’m no brave crusader, but race (and racism) were part of my education. So when, in researching the book, I unpacked three separate databases and found that in every one white people gave black people short shrift, I wasn’t shocked, you know? Asians and Latinos apply the same penalty to African Americans that white folks do, which says something about how even (relatively) recent additions to the “American experience” have acquired its biases.
What makes this moment in time—and this set of data—different from the massive data surveys of the past, such as Pew, Gallup, or the Kinsey Institute?
The data in my book is almost all passively observed—there’s no questionnaire, no contrived experiment to simulate “real life.” This data is real life. Online you have friends, lovers, enemies, and intense moments of truth without a thought for who’s watching, because ostensibly no one is—except of course the computers recording it all. This is how digital data circumvents that old research obstacle: people’s inability to be honest when the truth makes them look bad. Digital data’s ability to get at the private mind like this is unprecedented and very powerful.
"Most data-hyping books are vapor and slogans. This one has the real stuff: actual data and actual analysis taking place on the page. That’s something to be praised, loudly and at length. Praiseworthy, too, is Rudder’s writing, which is consistently zingy and mercifully free of Silicon Valley business gabble."—Jordan Ellenberg, Washington Post
"As a researcher, Mr. Rudder clearly possesses the statistical acumen to answer the questions he has posed so well. As a writer, he keeps the book moving while fully exploring each topic, revealing his graphs and charts with both explanatory and narrative skill. Though he forgoes statistical particulars like p-values and confidence intervals, he gives an approachable, persuasive account of his data sources and results. He offers explanations of what the data can and cannot tell us, why it is sufficient or insufficient to answer some question we may have and, if the latter is the case, what sufficient data would look like. He shows you, in short, how to think about data."—Wall Street Journal
"Rudder is the co-founder of the dating site OKCupid and the data scientist behind its now-legendary trend analyses, but he is also — as it becomes immediately clear from his elegant writing and wildly cross-disciplinary references — a lover of literature, philosophy, anthropology, and all the other humanities that make us human and that, importantly in this case, enhance and ennoble the hard data with dimensional insight into the richness of the human experience...an extraordinarily unusual and dimensional lens on what Carl Sagan memorably called ‘the aggregate of our joy and suffering.’"—Maria Popova, Brain Pickings
"Fascinating, funny, and occasionally howl-inducing...[Rudder] is a quant with soul, and we’re lucky to have him."—Elle
"There's another side of Big Data you haven't seen—not the one that promised to use our digital world to our advantage to optimize, monetize, or systematize every last part our lives. It's the big data that rears its ugly head and tells us what we don't
want to know. And that, as Christian Rudder demonstrates in his new book, Dataclysm
, is perhaps an equally worthwhile pursuit. Before we heighten the human experience, we should understand it first." —TIME
"At a time when consumers are increasingly wary of online tracking, Rudder makes a powerful argument in Dataclysm
that the ability to tell so much about us from the trails we leave is as potentially useful as it is pernicious, and as educational as it may be unsettling. By explaining some of the insights he has gleaned from OkCupid and other social networks, he demystifies data-mining and sheds light on what, for better or for worse, it is now capable of."—Financial Times
"Dataclysm is a well-written and funny look at what the numbers reveal about human behavior in the age of social media. It’s both profound and a bit disturbing, because, sad to say, we’re generally not the kind of people we like to think — or say — we are."—Salon
"For all its data and its seemingly dating-specific focus, Dataclysm
tells the story set forth by the book's subtitle, in an entertaining and accessible way. Informative, eye-opening, and (gasp) fun to read. Even if you’re not a giant stat head." —Grantland
"[Rudder] doesn’t wring or clap his hands over the big-data phenomenon (see N.S.A., Google ads, that sneaky Fitbit) so much as plunge them into big data and attempt to pull strange creatures from the murky depths." —The New Yorker
"A hopeful and exciting journey into the heart of data collection...[Rudder's] book delivers both insider access and a savvy critique of the very machinery he is employed by. Since he's been in the data mines and has risen above them, Rudder becomes a singular and trustworthy guide.—The Globe and Mail
"Compulsively readable — including for those with no particular affinity for numbers in and of themselves — and surprisingly personal. Starting with aggregates, Rudder posits, we can zoom in on the details of how we live, love, fight, work, play, and age; from numbers, we can derive narrative. There are few characters in the book, and few anecdotes — but the human story resounds throughout."—Refinery29
"Rudder’s lively, clear prose…makes heady concepts understandable and transforms the book’s many charts into revealing truths…Rudder teaches us a bit about how wonderfully peculiar humans are, and how we go about hiding it."—Flavorwire
"Dataclysm is all about what we can learn about human minds and hearts by analyzing the massive ongoing experiment that is the internet." —Forbes
"The book reads as if it's written (well) by a curious child whose parents beg him or her to stop asking "what-if" questions. Rudder examines the data of the website he helped create with unwavering curiosity. Every turn presents new questions to be answered, and he happily heads down the rabbit hole to resolve them."—U.S. News
"This is the best book that I've read on data in years, perhaps ever. If you want to understand how data is affecting the present and what it portends for the future, buy it now."—Huffington Post
"Rudder draws from big data sets – Google searches, Twitter updates, illicitly obtained Facebook data passed shiftily between researchers like bags of weed – to draw out subtle patterns in politics, sexuality, identity and behaviour that are only revealed with distance and aggregation…Dataclysm
will entertain those who want to know how machines see us. It also serves as a call to action, showing us how server farms running everything from home shopping to homeland security turn us into easily digested data products. Rudder's message is clear: in this particular sausage factory, we are the pigs.”
"Studying human behavior is a little like exploring a jungle: it's messy, hard, and easy to lose your way. But Christian Rudder is a consummate guide, revealing essential truths about who we are. Big Data has never been so fun."—Dan Ariely, author of Predictably Irrational
"Dataclysm is a book full of juicy secrets—secrets about who we love, what we crave, why we like, and how we change each other’s minds and lives, often without even knowing it. Christian Rudder makes this mathematical narrative of our culture fun to read and even more fun to discuss: You will find yourself sharing these intriguing data-driven revelations with everyone you know."—Jane McGonigal, author of Reality Is Broken
"In the first few pages of Dataclysm,
Christian Rudder uses massive amounts of actual behavioral data to prove what I always believed in my heart: Belle and Sebastian is the whitest band ever. It only gets better from there."—Aziz Ansari
"It’s unheard of for a book about Big Data to read like a guilty pleasure, but Dataclysm
does. It’s a fascinating, almost voyeuristic look at who we really are and what we really want."—Steven Strogatz, Schurman Professor of Applied Mathematics, Cornell University, author of The Joy of x
"Smart, revealing, and sometimes sobering, Dataclysm
affirms what we probably suspected in our darker moments: When it comes to romance, what we say we want isn't what will actually make us happy. Christian Rudder has tapped the tremendous wealth of data that the Internet offers to tease out thoughts on topics like beauty and race that most of us wouldn’t cop to publicly. It's a riveting read, and Rudder is an affable and humane guide."—Adelle Waldman, author of The Love Affairs of Nathaniel P.
"Christian Rudder has written a funny and profound book about important issues. Race, love, sex—you name it. Are we the sum of the data we produce? Read this book immediately and see if you can answer the question."—Errol Morris
"Big Data can be like a 3D movie without 3D glasses—you know there's a lot going on but you're mainly just disoriented. We should feel fortunate to have an interpreter as skilled (and funny) as Christian Rudder. Dataclysm is filled with insights that boil down Big Data into byte-sized revelations."—Michael Norton, Harvard Business School, coauthor of Happy Money
"With a zest for both the profound and the wacky, Rudder demonstrates how the information we provide individually tells a vast deal about who we are collectively. A visually engaging read and a fascinating topic make this a great choice not just for followers of Nate Silver and fans of infographics, but for just about anyone who, by participating in online activity, has contributed to the data set."—Library Journal
"Demographers, entrepreneurs, students of history and sociology, and ordinary citizens alike will find plenty of provocations and, yes, much data in Rudder's well-argued, revealing pages."