This semester I’m teaching Programming with Data for Social Science, and my students have recently started reading Matt Salganik’s excellent book, Bit by Bit: Social Research in the Digital Age. The book gives a detailed and thoughtful overview of the many challenges and opportunities of computational social science.
One dimension Salganik introduces early on is the difference between “custommade” and “readymade” data. Borrowed from the art world, these phrases suggest different origins: custommades are intentionally created with a specific purpose in mind while readymades are repurposed.
Traditional social science methods tend to be custommades — you design your experiment, surveys, and sampling approach with a specific research question in mind. Data science, on the other hand, relies more on readymades — data are detritus of some other goal.
Both of these approaches have their strengths and weaknesses, and each is appropriate in different contexts. Understanding this is a key piece of the art of computational social science.
As I mentioned, the terms “readymade” and “custommade” come from the art world, and Salganik illustrates the metaphor by comparing two specific works of art:
[Marcel] Duchamp is best known for his readymades, such as Fountain, where he took ordinary objects and repurposed them as art. Michelangelo, on the other hand, didn’t repurpose. When he wanted to create a statue of David, he didn’t look for a piece of marble that kind of looked like David: he spent three years laboring to create his masterpiece. David is not a readymade; it is a custommade.
This excerpt doesn’t quite do justice to Duchamp’s work. Because Duchamp is somewhat less well known than Michelangelo, I’m afraid this leaves the metaphor incomplete. That is, you may understand Fountain as a readymade without really appreciating what that means.
So, here’s some more detail on that readymade piece of art. As the Philadelphia Museum of Art describes:
In the spring of 1917, Duchamp, with the help of several friends, notoriously submitted a porcelain urinal to an unjuried exhibition held by the Society of Independent Artists in New York. Purchased from a store that sold plumbing fixtures, this object, which was titled Fountain and signed “R. Mutt” was rejected by a vote of the organizers, touching off a fierce debate.
Duchamp, who had been one of the organizers, resigned in protest.
In May of that year, the avant-garde magazine, The Blind Man — published by Duchamp and his friends — ran an editorial defending the work:
They [said] any artist…may exhibit.
Mr. Richard Mutt sent in a fountain. Without discussion this article disappeared and was never exhibited.
What were the grounds for refusing Mr. Mutt’s fountain: –
- Some contended it was immoral. Vulgar.
- Others, it was plagiarism, a plain piece of plumbing.
Now, Mr. Mutt’s fountain is not immoral, that is absurd, no more than a bath tub is immoral. It is a fixture that you see every day in plumber’s store windows.
Whether Mr. Mutt with his own hands made the fountain or not has no importance. He CHOSE it. He took an ordinary article of life, placed it so that its useful significance disappeared under the new title and point of view – created a new thought for that object.
While it may seem somewhat tangential to computational social science, the story of Mr. Mutt’s urinal makes me more fully appreciate the concept of readymade data.
Is it vulgar? Is it mundane? The art of Duchamp’s Fountain is in that very debate itself: he took an everyday item and made it worthy of public discussion, encouraging us to question conventional wisdom and to ask questions we didn’t even know we had.
Similarly, there are important concerns about readymade data – questions of ethics and meaning, which should be rigorously debated. But that debate itself is an integral part of the scientific endeavor. The art of this work is in engaging critically with these concerns.
We may now be so inundated with data, so used to our movements and habits being passively tracked, that it’s easy to forget: there’s something profoundly radical about repurposing found data for social science research.
We CHOOSE it, we place it so that its useful significance disappears under the new point of view. In reimagining and reinterpreting these data we bring new knowledge into the world; we create a new thought for that object.