Distant Reading: An Offshoot of Big Data
The advent of Web 2.0 has enabled a far more systematic and categorised means of collecting data. The surfeit of information produced by our cyber footprint has given rise to a concept known as ‘big data’: the large body of information generated from our digital and electronic footprints. Almost everything one does on the internet, from posting a status on Facebook, to searching through Google or making an online transaction, is recorded and analysed to produce predictions at scale. “And so a veritable Big Bang has set off, delivering an epic sea of raw materials, a plethora of examples so great in number, only a computer could manage to learn from them.” (Siegel 2013, 3)
Siegel (2013, 4) describes the raw data as malarkey; it is only after processing that the real benefits of the data emerge. The predictions derived from it have infiltrated every facet of our lives. The analysis of big data can predict the success of a certain political leader or party; it can predict energy demand in particular areas; it can also predict student essay grades, crime hotspots and general estimates regarding health, wealth and human behaviour (Siegel 2013, 5-9). These predictions are not 100% accurate, but they produce indicators that are far-reaching and influential.
Just as businesses base predictions on big data, academics and university students are using the abundance of information gathered to assist their research. As Don Tapscott (The Economist 2012) stated, “this is not an information age, it is an age of networked intelligence and one of collaboration.” An interesting repercussion of big data is the development of ‘distant reading’ and, as Tapscott alluded, it is indeed an intelligent form of collaboration.
“Distant reading identifies what something says without actually reading it,” and is the practice of using computational analysis of big data to pinpoint useful information on a subject, in an age where there is an over-abundance of knowledge (PHD Comics 2013). Franco Moretti, Italian literary scholar and founder of the Stanford Literary Lab, hypothesises that in order to accurately understand literature in a digital age we must stop reading books (Schulz 2011). In a test of distant reading, Moretti fed 30 novels of specific genres into two different computer programs. Each program was then given six separate works of literature and asked to pinpoint their genres. Both programs succeeded and, interestingly, used methods dissimilar to those a person would use to identify the genres. Where a human reader focuses on landscapes and atmosphere, the programs identified genre through the repetition of specific words. “[Big Data] suggests that genres ‘possess distinctive features at every possible scale of analysis.’ More importantly for the Lit Lab, it suggests that there are formal aspects of literature that people, unaided, cannot detect.” (Schulz 2011)
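To make the idea of identifying genre "through repetition of specific words" more concrete, here is a minimal sketch in Python. It is not the Literary Lab's actual software or method, which the sources above do not describe in detail. It assumes a simple approach: count the words in a few labelled texts, merge the counts into one profile per genre, and assign an unseen text to the genre whose profile it most resembles (measured by cosine similarity). The function names and the tiny "gothic" and "detective" sample sentences are invented purely for illustration.

```python
import math
import re
from collections import Counter


def word_counts(text):
    """Lowercase the text and count how often each word appears."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two word-count profiles (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_genre_profiles(labelled_texts):
    """Merge the word counts of all training texts that share a genre label."""
    profiles = {}
    for genre, text in labelled_texts:
        profiles.setdefault(genre, Counter()).update(word_counts(text))
    return profiles


def guess_genre(text, profiles):
    """Return the genre whose profile is most similar to the text's word counts."""
    counts = word_counts(text)
    return max(profiles, key=lambda g: cosine_similarity(counts, profiles[g]))


# Invented toy corpus (hypothetical sentences, not real novels).
training = [
    ("gothic", "The castle was dark and the ghost wandered the dark castle tower at midnight."),
    ("gothic", "A ghost haunted the ruined abbey, and terror filled the dark vault."),
    ("detective", "The inspector examined the clue and questioned the suspect about the murder."),
    ("detective", "Every clue pointed to one suspect, so the inspector made an arrest."),
]
profiles = build_genre_profiles(training)

print(guess_genre("The ghost rose from the dark vault beneath the castle.", profiles))
print(guess_genre("The inspector found a new clue about the suspect.", profiles))
```

With this toy corpus the first sentence is labelled gothic and the second detective, purely because of which words recur. The program never considers plot, landscape or atmosphere. A real study would use whole novels and far more careful statistics.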
Big data has revolutionised the way we approach learning and qualitative analysis of data. The results of big data analysis have insinuated themselves into the day-to-day development and operation of society. The concept of big data is in a constant state of flux, and distant reading is but one offshoot of this new and highly informative area of digital evolution.
Big Data + Distant Reading in relation to Art: http://5702x.graeworks.net/?p=437
Harrington, Stephen. 2013. “Ch 18 Tweeting about the Telly: Live TV, Audiences, and Social Media.” In Twitter and Society, edited by Katrin Weller, Axel Bruns, Jean Burgess, Merja Mahrt & Cornelius Puschmann, 237-248. New York, NY: Peter Lang.
PHD Comics. 2013. “Big Data + Old History.” YouTube video, posted on September 6. Accessed May 8, 2014. https://www.youtube.com/watch?feature=player_embedded&v=tp4y-_VoXdA
Schulz, Kathryn. 2011. “The Mechanic Muse: What is Distant Reading?” The New York Times, June 24. Accessed May 8, 2014. http://www.nytimes.com/2011/06/26/books/review/the-mechanic-muse-what-is-distant-reading.html?pagewanted=all&_r=0
Siegel, Eric. 2013. “Introduction – The Prediction Effect.” In Predictive Analytics, edited by Eric Siegel, 1-16. Hoboken, NJ: John Wiley and Sons Inc.
The Economist. 2012. “What is Big Data?” YouTube video, posted on June 26. Accessed May 8, 2014. https://www.youtube.com/watch?v=ahZGEusG13A