When we are online, every like, every follow, every click is recorded and analyzed by the corporations, large and small, that rule the internet. They use these terabytes of data to market their products, to predict how new products will sell, and more. Most of us don’t think much about exactly what other uses they make of the data, but the corporations own it, and we give them permission to collect and use it when we agree to their terms of service.
The fact that most of us don’t think about someone watching our online behavior is a central assumption in Christian Rudder’s book, Dataclysm, made explicit by the subtitle Who We Are (When We Think No One’s Looking). Using that premise, Rudder analyzes the clicks, messaging behavior, and survey results from the online dating site OkCupid, as well as a few others. He has access to this data because he is a founder of the site and knows other people in the field. He leverages this privileged information into a book-length speculation about what the data means.
Some of Rudder’s observations are well-considered and interesting. Some are less profound. At times I think Rudder jumps to erroneous conclusions, and I’d wager a significant amount of money that any thoughtful reader of the book will agree with Rudder at some points and disagree at others, depending on the specific context. Most readers will probably be offended by the book on occasion. But although his ideas are often not fully supported by the data, they are also not fully contradicted by it. So, even when you disagree with his conclusions, you have to admit he could be right. We just don’t know.
Overall, that makes for a provocative book that opens the imagination for the kinds of knowledge we could gain with careful analysis of the vast quantities of data we, as a global internet society, are collecting.
But beyond agreeing or disagreeing with Rudder, I have a more fundamental issue with his approach to the data. He writes, “As far as I know, I’ve made no motivated decision that has bent the outcome of my work.” With this sentence he claims that he uses no theory to reach his conclusions, as if, somehow, he just lets the data talk and listens carefully, transcribing the data’s proclamations accurately.
I don’t think Rudder is naive, but I can only take him at his word. As any scientist or thinker knows, it is impossible to be theoryless. So, to claim explicitly to be theoryless means either he doesn’t know what theory or theories are guiding his decisions or he refuses to tell us. Either way, it is a deep flaw in the book that the reader doesn’t know the theoretical approach taken by the author.
Read the book for some interesting applications of descriptive statistics (and, typographically, for some great use of the color red!). But read with a skeptical mind.