Cultural analytics 


Cultural Analytics, originally proposed by Lev Manovich, is a field of investigation built on the quantitative analysis of very large amounts of global cultural data, visualized on state-of-the-art interactive displays.

(reposted from Literature+ blog)


I thought that if I could put it all down, that would be one way. And next the thought came to me to leave all out would be another, and truer, way.

from “The New Spirit,” John Ashbery

This leaving-out business. On it hinges the very importance of what’s novel

Or autocratic, or dense or silly. It is as well to call attention

To it by exaggeration, perhaps. But calling attention

Isn’t the same thing as explaining, and as I said I am not ready

To line phrases with the costly stuff of explanation, and shall not,

Will not do so for the moment. Except to say that the carnivorous

Way of these lines is to devour their own nature, leaving

Nothing but a bitter impression of absence, which as we know involves presence, but still.

Nevertheless these are fundamental absences, struggling to get up and be off themselves.

from “The Skaters,” John Ashbery


Cultural Analytics extends digital humanities analyses of texts by also looking at visual new media artifacts such as video games, video clips, cinema, animation, and art, and by examining data from human interaction with websites, software, and social media. Using various data mining algorithms and image processing techniques, Cultural Analytics researchers transmute raw data into quantifiable representations of cultural phenomena. This increase in scope and interpretive techniques leads to useful visualizations of, for instance, global flows of cultural change.

Lev Manovich, Noah Wardrip-Fruin, Jeremy Douglass, William Huber, and others at the Software Studies Initiative at UCSD have in the last few years written a series of articles and proposals describing the Cultural Analytics research program, which aims to create quantitative measures of cultural innovation. These measures can be used to create visual maps of “global cultural production and consumption” that are sufficiently temporally discrete to represent the flow of change in various cultural mediums: music, design, art, finance. The goal of the research program, as defined in the white paper “Cultural Analytics: Analysis and Visualization of Large Cultural Data Sets” (authored by Manovich in 2007), is to create detailed interactive visualizations of various cultural flows which provide “rich information” that can be presented in different formats. These cultural data sets are conceived of at the same scope as other global data sets, such as those created by various scientific endeavors. For instance, Manovich compares forms of cultural data to real-time maps of global computer networks and to real-time maps of the range and intensity of an earthquake. Manovich makes a point of differentiating these dynamic contemporary data sets from the historical data sets that are more often used in cultural analysis within the humanities. This nascent, interdisciplinary field draws on digital humanities, social sciences, statistics, data mining, information visualization, and art.

A primary question is whether this project is actually possible: can cultural data be meaningfully extrapolated at all? Manovich discusses the (mainly technical) reasons why current data projects fail to provide effective global cultural analyses and points to possible ways to provide them. Looking at current art or information visualization projects, Manovich notes that they use relatively small amounts of data (compared to what is available via Google or Amazon, or what is captured from scientific sensors). That is, they are constrained by the data, or at least the form of the data, rather than motivated by the “more challenging” theoretical questions and agendas that the creator might desire. Moreover, these projects do not perform any sophisticated data analysis, and in general do not do a good job of layering multiple sets of dynamic data. In particular, he points to the field of digital humanities and notes that its representations of textual data are not transformed into “compelling visualizations”, and that the data sets are inherently limited to texts (as opposed to images, statistical information, etc.), and generally to static, historical, canonical, “high-culture” texts at that.

Manovich summarizes the “new paradigm” of cultural analytics by outlining a series of steps that encapsulate its agenda, which include: a focus on visual data; the use of extremely large contemporary global data sets; the use of statistical analysis, feature extraction, clustering, etc.; the formatting of output to work on very large, high-resolution displays; and a focus on non-corporate agendas. As an overarching goal, he wants to be able to track and visualize the flow of cultural ideas and influences to provide “the first ever data-driven detailed map of how cultural globalization actually works”. A variety of other projects are featured on the Cultural Analytics website, such as a visualization of comments on a set of interconnected MySpace pages, visual analyses of cartoons, films, and music videos, and an examination of art history through the computational analysis of 20 million paintings.
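To make the "feature extraction, clustering" step in this list a little more concrete, here is a minimal sketch of what clustering a handful of images by extracted features might look like. It is not taken from the white paper or from any Software Studies Initiative tool; the feature vectors (mean brightness, mean saturation), the starting centroids, and the function name kmeans are all assumptions made up for illustration.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then move
    each centroid to the mean of the points assigned to it."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: index of the closest centroid for every point.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: recompute each centroid as the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids


# Hypothetical feature vectors (mean brightness, mean saturation), one per image.
features = [
    (0.82, 0.15), (0.78, 0.20), (0.85, 0.10),  # bright, muted images
    (0.25, 0.70), (0.30, 0.65), (0.20, 0.75),  # dark, saturated images
]
labels, centroids = kmeans(features, centroids=[features[0], features[3]])
print(labels)  # → [0, 0, 0, 1, 1, 1]
```

The algorithm only groups whatever numbers it is handed. Which features get extracted, and how many clusters are asked for, are decisions made before any "pattern" appears.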

The papers, slides, and videos introducing Cultural Analytics tend to describe the research program in terms of “explorations” of data, presentations of an “overview” of information, and the finding of cultural “trends” and “patterns”. In other words, their work is in many ways analogous to John Tukey’s work in “exploratory data analysis” and, more relevantly for this class, to Franco Moretti’s “distant reading”. And certainly Cultural Analytics is explicitly positioned as a logical extension of this aspect of Digital Humanities, incorporating visual and other non-textual elements. On the other hand, there is less of a focus on the methodologies of interpretation and models of meaning, which surely is the more compelling aspect of the digital humanities agenda. The innovative strength of Cultural Analytics is in the joining together of new sets of tools to display and analyze new kinds of data. But the field as a whole seems to be less vigorous about doing sophisticated cultural examination with the data that is processed and visualized. Part of this, of course, is that it is a more difficult problem, and the goals described in the white paper and other articles are ambitious and perhaps currently secondary to the immediate need of developing technological methods to formulate useful quantifications of cultural artifacts.

However, it may also be true that this type of statistical analysis, even if it is of tantalizing cultural data, is simply not effective at ascribing cultural meaning. In other words, thinking of Willard McCarty, the high-level overviews offered as examples of Cultural Analytics do not build a model that is capable of wrestling with novel hypotheses. For instance, the video describing the Rothko visualization is impressive, yet the features that are being extracted from the paintings are relatively basic. Certainly a human expert would be able to organize the paintings into more relevant categories as well as be able to find greater cultural associations to history, artistic techniques, other artists, art theory and criticism, etc. It may be that the state of the art in visual data mining and image processing is not yet sufficient to achieve compelling results. Color analysis seems too basic to be useful, and object recognition, especially of non-photorealistic objects, is notoriously poor. It also may be that the data sets are too homogeneous to allow the emergence of a machine synthesis which would replicate or augment human expertise. For instance, all of the projects featured use a limited number of data sets. And there doesn’t seem to be a methodology of how to turn the visual cultural artifacts into a networked ontology of interconnected data.
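To give a sense of how basic such color features are, the sketch below reduces an image to three numbers. It is not the feature set used in the Rothko visualization, which I have not seen specified; the function name color_features, the three chosen measures, and the two tiny four-pixel "paintings" are invented for illustration.

```python
import colorsys
import statistics


def color_features(pixels):
    """Summarize an image, given as a list of (r, g, b) tuples in 0-255,
    by mean saturation, mean brightness, and brightness spread."""
    hsv = [colorsys.rgb_to_hsv(r / 255, g / 255, b / 255) for r, g, b in pixels]
    saturations = [s for _, s, _ in hsv]
    values = [v for _, _, v in hsv]
    return {
        "mean_saturation": statistics.fmean(saturations),
        "mean_brightness": statistics.fmean(values),
        "brightness_spread": statistics.pstdev(values),  # crude contrast measure
    }


# Two made-up four-pixel "paintings"; real input would be full pixel arrays.
paintings = {
    "warm_field": [(200, 60, 30), (210, 80, 40), (190, 50, 20), (220, 90, 50)],
    "dark_field": [(40, 30, 50), (35, 25, 45), (50, 40, 60), (30, 20, 40)],
}

# Order the paintings from darkest to brightest and show their features.
ranked = sorted(paintings, key=lambda title: color_features(paintings[title])["mean_brightness"])
for title in ranked:
    print(title, color_features(paintings[title]))
```

Everything a viewer might say about composition, scale, edge, or context has been discarded by the time the numbers are sorted, which is the gap between this kind of feature and the categories a human expert would use.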

The opportunities for my project, Associative Concordance, are numerous. By extending the range of what can be considered data, there is an exponential increase in the number of associations that can be made. A project like Susan Howe’s (which I mention in my annotated bibliography) indicates an almost immediate unmanageability with the introduction of a few tangential or secondary textual inputs into her poetic work. She feels a need to typographically display the smashing up and confusion of scalability and interpretation when competing readings share space. And the allure of these compilings is not at all due to their arbitrariness. Certainly an authorial mediation is present, which prevents the work from feeling random, whimsical, or flarf-y. Not that there would be anything wrong with that. But Cultural Analytics is clearly positioning itself as an analogue to a more scientific analytics, and maybe without coming to terms with more fundamental issues of meaning creation. In other words, though there is an attempt to use cultural data, it is unclear whether any cultural work is actually being done. What do interactive, animated pie charts and shifting curvy lines tell us about cultural flow that couldn’t be discerned from existing tools like economic spreadsheets? What does a Rothko painting that has an unusual texture or palette tell us about anything? It’s an outlier, but are all outliers points of contention? How can Cultural Analytics techniques ensure that their process of cultural transmutation hasn’t left out some important feature which would re-calibrate its axes into a more robust organization? How does the transmutation from artifact to data continue onward to social narrative? Does it make sense to privilege visual and technological artifacts as repositories of culture?
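The worry about outliers can be stated in a few lines of code. The sketch below flags a painting as an outlier when a measurement lies more than two standard deviations from the mean; the threshold, the function name outliers, and the texture and palette numbers are all hypothetical, chosen only to show that the same painting can be an outlier on one axis and unremarkable on another.

```python
import statistics


def outliers(values, threshold=2.0):
    """Return the indices of values whose z-score magnitude exceeds the threshold."""
    mean = statistics.fmean(values)
    spread = statistics.pstdev(values)
    if spread == 0:  # all values identical: nothing can stand out
        return []
    return [i for i, v in enumerate(values) if abs(v - mean) / spread > threshold]


# Hypothetical per-painting measurements for the same seven paintings.
texture = [0.30, 0.32, 0.29, 0.31, 0.30, 0.33, 0.95]
palette = [0.50, 0.52, 0.49, 0.51, 0.50, 0.53, 0.51]

print(outliers(texture))  # → [6]
print(outliers(palette))  # → []
```

The seventh painting is an outlier only because texture was measured at all. Had a different feature been extracted, it would have vanished into the crowd, which is why the choice of axes matters more than the arithmetic.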

I have a lot of affinity for the reductive nature of the Cultural Analytics research program. But I am also suspicious of the assumption that anything interesting will emerge simply by throwing newer, cooler algorithms and higher-resolution screens at bigger, more intricate data sets. There will always be more “putting down” (to quote John Ashbery) available, and the resolution of real-world data is infinite. I am also suspicious of the automatic assumption that just because something is impossible it is therefore not worthwhile. I suppose I am still figuring out the figure/ground of what constitutes an association, and what is even quantifiably interesting about these associations in the first place.


Resources for Further Study

The Software Studies Initiative website has a listing of articles describing Cultural Analytics and projects utilizing Cultural Analytics techniques.

John Ashbery (1972) Three Poems, Harmondsworth: Penguin.

Jeremy Douglass. “Computer Visions of Computer Games: analysis and visualization of play recordings.” Workshop on Media Arts, Science, and Technology (MAST) 2009: The Future of Interactive Media. UC Santa Barbara, January 2009. [slides]

Lev Manovich. “Cultural Analytics: Visualising Cultural Patterns in the Era of ‘More Media.’” Domus (Milan), March 2009.

Lev Manovich. “How to Follow Global Digital Cultures, or Cultural Analytics for Beginners,” Deep Search, ed. Felix Stalder and Konrad Becker. Transaction Publishers (English version) and Studienverlag (German version), in press.

Lev Manovich. White paper: Cultural Analytics: Analysis and Visualization of Large Cultural Data Sets, May 2007. With contributions from Noah Wardrip-Fruin.

Lev Manovich and Jeremy Douglass. “Visualizing Temporal Patterns in Visual and Interactive Media.” Forthcoming in Visualising the 21st Century, ed. Oliver Grau. MIT Press.

Willard McCarty (2005) Humanities Computing. Palgrave Macmillan.

Franco Moretti (2005) Graphs, Maps, Trees: Abstract Models for a Literary History. London: Verso.

John W. Tukey (1977) Exploratory Data Analysis. Addison-Wesley.