
What Big Data Will Never Explain


“She told him that she loves me, which is an important data point.” I overheard those words a few months ago, and they stopped me in my tracks. I did not know the smitten and empirical young man who spoke them well enough to offer a correction of his way of talking about desire, but I was pleased to have stumbled upon such a blunt formulation of one of the shibboleths of the day. I refer to the messianic conception of data, or Big Data. (It always sounds to me like a tragic bully out of Tennessee Williams: “Big Data’s going to live!”) What the young man was doing was datafying. I take the term from Viktor Mayer-Schönberger and Kenneth Cukier, whose book Big Data: A Revolution That Will Transform How We Live, Work, and Think is a useful if propagandistic introduction to the digital world’s latest instrument of salvation.

“To datafy a phenomenon,” they explain, “is to put it in a quantified format so it can be tabulated and analyzed.” To illustrate the scope of datafication, and the enlightenment that it may provide, they adduce “the datafication of sentiment”: “Twitter enabled the datafication of sentiment by creating an easy way for people to record and share their stray thoughts, which had previously been lost to the winds of time.” I think I prefer the winds of time.

I have been browsing in the literature on “sentiment analysis,” a branch of digital analytics that—in the words of a scientific paper—“seeks to identify the viewpoint(s) underlying a text span.” This is accomplished by mechanically identifying the words in a proposition that originate in “subjectivity,” and thereby obtaining an accurate understanding of the feelings and the preferences that animate the utterance. This finding can then be tabulated and integrated with similar findings, with millions of them, so that a vast repository of information about inwardness can be created: the Big Data of the Heart. The purpose of this accumulated information is to detect patterns that will enable prediction: a world with uncertainty steadily decreasing to zero, as if that is a dream and not a nightmare. I found a scientific paper that even provided a mathematical model for grief, which it bizarrely defined as “dissatisfaction.” It called its discovery the Good Grief Algorithm.

The mathematization of subjectivity will founder upon the resplendent fact that we are ambiguous beings. We frequently have mixed feelings, and are divided against ourselves. We use different words to communicate similar thoughts, but those words are not synonyms. Though we dream of exactitude and transparency, our meanings are often approximate and obscure. What algorithm will capture “the feel of not to feel it / when there is none to heal it,” or “half in love with easeful Death”? How will the sentiment analysis of those words advance the comprehension of bleak emotions? (In my safari into sentiment analysis I found some recognition of the problem of ambiguity, but it was treated as merely a technical obstacle.) We are also self-interpreting beings—that is, we deceive ourselves and each other. We even lie. It is true that we make choices, and translate our feelings into actions; but a choice is often a coarse and inadequate translation of a feeling, and a full picture of our inner states cannot always be inferred from it. I have never voted wholeheartedly in a general election.

For the purpose of the outcome of an election, of course, it does not matter that I vote complicatedly. All that matters is that I vote. The same is true of what I buy. A business does not want my heart; it wants my money. Its interest in my heart is owed to its interest in my money. (For business, dissatisfaction is grief.) It will come as no surprise that the most common application of the datafication of subjectivity is to commerce, in which I include politics. Again and again in the scholarly papers on sentiment analysis the examples given are restaurant reviews and movie reviews. This is fine: the study of the consumer is one of capitalism’s oldest techniques. But it is not fine that the consumer is mistaken for the entirety of the person. Mayer-Schönberger and Cukier exult that “datafication is a mental outlook that may penetrate all areas of life.” This is the revolution: the Rotten Tomatoes view of life. “Datafication represents an essential enrichment in human comprehension.” It is this inflated claim that gives offense. It would be more proper to say that datafication represents an essential enrichment in human marketing. But marketing is hardly the supreme or most consequential human activity. Subjectivity is not most fully achieved in shopping. Or is it, in our wired consumerist satyricon?

“With the help of big data,” Mayer-Schönberger and Cukier continue, “we will no longer regard our world as a string of happenings that we explain as natural and social phenomena, but as a universe comprised essentially of information.” An improvement! Can anyone seriously accept that information is the essence of the world? Of our world, perhaps; but we are making this world, and acquiescing in its making. The religion of information is another superstition, another distorting totalism, another counterfeit deliverance. In some ways the technology is transforming us into brilliant fools. In the riot of words and numbers in which we live so smartly and so articulately, in the comprehensively quantified existence in which we presume to believe that eventually we will know everything, in the expanding universe of prediction in which hope and longing will come to seem obsolete and merely ignorant, we are renouncing some of the primary human experiences. We are certainly renouncing the inexpressible. The other day I was listening to Mahler in my library. When I caught sight of the computer on the table, it looked small.