Ever wondered why corporations spend so much on gathering and analysing data? After all, it is just digits and complex mathematics. Data might be boring to look at and difficult to comprehend, but it is a treasure trove for those in business, governance, media and more. But what is data? Or perhaps the right question is: who is data? The answer is: you. Data is people like you and me, who flood the internet with tons of personal information, including our deepest and darkest desires, fetishes and moods.
In his new book, Everybody Lies: What the Internet Can Tell Us About Who We Really Are, Seth Stephens-Davidowitz – a former Google data scientist – takes an in-depth look at our true selves. The book helps us confront a reality that most of us would rather sweep under the rug. Ask anyone whether they are racist or sexist, and the answer will most likely be no. The internet, however, tells us otherwise.
Unlike popular sample-collection methodologies, Seth relies mainly on Google searches as the foundation of his research. The findings of his study are shocking and saddening, but nonetheless revealing. The study draws not only on Google searches but also on data from Pornhub – one of the biggest pornographic websites in the world.
Why Google search? Seth offers an apt justification in the book: “In the pre-digital age, people hid their embarrassing thoughts from other people. In the digital age, they still hide from other people, but not from the internet and in particular sites such as Google and Pornhub, which protect their anonymity. These sites function as a sort of digital truth serum …”
The author further analyses the data available on these websites to conclude that racism, hate and insecurity are far more prevalent in society than people admit. He points out that had research agencies focused on the data available on Google and its other services, Donald Trump’s victory in the US presidential election would not have shocked them. Seth stresses that data collected by traditional researchers is not necessarily accurate, whereas mining data from Google could give more accurate results.
One of the challenges traditional surveys face is that people tend to lie. This is mainly because people avoid admitting their dark truths and want to show the world that they are good, balanced and sane. But they confide their secrets to Google, because no one is watching or judging them there. In solitude, people tell the internet about their mental health, sexless marriages and real political views.
The author also uses the data to analyse other aspects of our lives, such as what it actually takes to become a successful NBA player, which first dates are more likely to lead to a second date, and so on.
Pornhub, another major reference point for Seth, helps him evaluate our sexual fantasies, something we may not be comfortable sharing even with our closest friends. He points out that a large number of searches on the pornographic website are about homosexuality, incest, violence and bizarre things like humping a stuffed toy.
The author shares some interesting examples. He says that one of the most common searches made immediately before or after “gay porn” is “gay test.” Searches for “gay test” are about twice as prevalent in the least tolerant states in the US as in those that have recognised same-sex marriage.
The author also mentions sexual search phrases looked up in developing countries such as India.
“Did you know that in India the number one search beginning ‘what my husband wants…’ is ‘my husband wants me to breastfeed him’? This comment is far more common in India than in any other country. Moreover, porn searches for depictions of women breastfeeding men are four times higher in India and Bangladesh than in any other country in the world,” he writes.
Everybody Lies… is not just a commentary on our ‘real’ behavioural patterns; it also highlights the need to analyse new data sets, which need not necessarily be very big. A smarter understanding of the data available online can help governments, businesses and individuals alike. The book is itself big data for those wanting to learn more about the real usage and implementation of Big Data.