XIX.6 November + December 2012
Page: 10

Content, the once and future king

Steve Portigal

Christian Marclay's The Clock is a 24-hour film, in which each minute of the 24 hours is depicted by images of clocks (or other depictions of the time) from other movies. As described in the New Yorker, creating The Clock was an intensive, meticulous process [1]. For at least several months, as many as six people spent their days watching DVDs and ripping potential clips; Marclay spent three years working at his computer for 10 to 12 hours a day. With at least 90 years of cinematic history to work with, and perhaps 90,000 movies available (that is reportedly the size of Netflix's library, although that figure includes television programming; Netflix doesn't have all the movies that exist on DVD so this rough number at least gives us a sense of scale), there is a substantial corpus of moving images to draw from. Let's call this big content [2]. Of course, those images are not indexed, nor is there a unified aggregated source to draw from, so a project like The Clock must be painstakingly handcrafted.

Pulling together existing footage to create something new is something we continue to see regularly on the Internet, smaller in scope than Marclay's work but still provocative. Pop hit "Call Me Maybe" by Carly Rae Jepsen (a song that gained attention when Justin Bieber tweeted about it) has been "covered" by the characters from Star Wars, edited to match each word from the song to a word of dialogue from one of the six Star Wars films. Though it doesn't quite work for me, you can see what they're getting at.

The first video like this I encountered was back in early 2010, a re-creation of the Beastie Boys' "Sabotage," using only clips from Battlestar Galactica. That project drew from about 3,500 minutes of footage to match each shot of the three-minute video. According to Wired, "Galactica: Sabotage" was done over five nights, at most two hours per night [3]. With a much tighter brief than The Clock, it evokes a very digital sense of the big-content future by raising (let alone answering!) the question: Is there enough content from Source X to match to Template Y?
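
The Source X versus Template Y question can be sketched as a simple coverage check. This is only an illustration, not how either video was actually made: the function name coverage and the sample word lists are hypothetical, and it assumes the source has already been transcribed into individual words and that a clip may be reused.

```python
def coverage(template_words, source_words):
    """Return (fraction of template slots the source can fill, missing words).

    Assumes a source clip can be reused, so only the presence of a word
    in the source matters, not how many times it occurs.
    """
    available = {w.lower() for w in source_words}
    needed = [w.lower() for w in template_words]
    if not needed:
        return 1.0, []  # an empty template is trivially covered
    missing = [w for w in needed if w not in available]
    filled = len(needed) - len(missing)
    return filled / len(needed), sorted(set(missing))


# Hypothetical data: a lyric fragment as the template, made-up dialogue as the source.
template = "hey I just met you and this is crazy".split()
source = "I have a bad feeling about this you just met him and that is all".split()

ratio, missing = coverage(template, source)
print(f"{ratio:.0%} of the template can be filled; missing: {missing}")
```

Matching shots rather than words, as "Galactica: Sabotage" does, would need a much richer index than a word list, which is exactly why that kind of project is still done by hand.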

This form hearkens back to the "break-in" records, most famously by Dickie Goodman. In 1975 my friends and I were entertained by "Mr. Jaws," in which the shark from Jaws was interviewed by a reporter, and all of his responses were samples from popular songs (e.g., Q: I know sharks are stupid, but what did you think when you took that first bite? A: [James Taylor singing] How sweet it is...). At that point, Goodman had been making those records for 20 years! Extremely dated now, the break-in form fails to fully transcend the original bits, coming off as more of a concatenation than a creation, and feels inherently analog. You can almost picture Goodman in a classic recording studio with his hands full of strips of audio tape.

In the digital era, we have seen the emergence of something called a supercut, "[a] fast-paced montage of short video clips that obsessively isolates a single element from its source, usually a word, phrase, or cliché from film and TV" [4]. And of course, has them all. Every Hitchcock cameo, a series of exploding heads, a collection of people saying, "We've got company!", every cigarette smoked on Mad Men, and on and on. We've seen a plethora of hilarious trailer recuts, including turning The Shining into a rom-com called Shining (or turning the ultimate romantic comedy, When Harry Met Sally, into a horror film). All of these projects are artisanal labors of love that dwell within the category of remixes. We're going to big content just for new ways of looking at the same stuff. Better tools (and better digitization of the content) can accelerate this, but I think there is more we can do.

We are headed toward a paradigm shift, on the verge of a new type of user experience enabled by these petabytes of data. We're constantly reading how big data is remaking everything around us, from design (A/B testing in general, Google's 41 shades of blue as a key signifier) to education ("Colleges Awaken To The Opportunities of Data Mining") to health ("Sergey Brin's Search for a Parkinson's Cure"; the quantified self). And this data is certainly big. The systems we use every day process or produce data in a volume we can't easily conceive of. People upload more than four million minutes of video to YouTube and tweet 400 million times every day. Less obvious are examples such as digital map provider Navteq, which makes 2.4 million changes to its database daily.

But where can we go with big content (and big data)? Pointer Pointer is a simple site that shows a photograph with a person pointing to your cursor. Depending on where the cursor is located, a different image will appear. While it's obviously an indexed database of images and grid points [5], the aesthetic choice of images suggests that these are (or could be someday) random images pulled from anyone's Facebook feed. It's not hard to imagine a big-content era in which this kind of interaction is automatically generated from a much larger data set. Pointer Pointer is the artisanal version of what is yet to come, a proof-of-concept prototype for a world of marked-up data described in Bruce Sterling's 1998 story "Maneki Neko":

"Tsuyoshi enjoyed his work. Quite often he came across bits and pieces of videotape that were of archival interest. He would pass the images on to the net. The really big network databases, with their armies of search engines, indexers, and catalogues, had some very arcane interests. The net machines would never pay for data, because the global information networks were noncommercial. But the net machines were very polite, and had excellent net etiquette. They returned a favor for a favor, and since they were machines with excellent, enormous memories, they never forgot a good deed" [6].

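Returning to Pointer Pointer for a moment, the "indexed database of images and grid points" can be pictured with a minimal sketch. This is an assumption about the general idea, not the site's actual implementation: the file names, the normalized coordinates, and the nearest-point rule are all hypothetical.

```python
import math

# Hypothetical index: each photo is stored with the screen point, in
# normalized 0..1 coordinates, that the person in it is pointing at.
INDEX = [
    ("photo_a.jpg", (0.10, 0.20)),
    ("photo_b.jpg", (0.50, 0.50)),
    ("photo_c.jpg", (0.90, 0.80)),
]


def photo_for_cursor(x, y, index=INDEX):
    """Return the photo whose pointing target is nearest to the cursor."""
    nearest = min(index, key=lambda entry: math.dist(entry[1], (x, y)))
    return nearest[0]


print(photo_for_cursor(0.55, 0.45))
```

With three hand-tagged photos the effect is crude. The big-content version imagined here would draw on an index large enough that some image lands almost exactly on any cursor position.
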
In 2010 the company CereProc created a speech synthesizer for film critic Roger Ebert, who had lost the ability to speak due to illness and surgery. The synthesizer used samples of his voice, taken from his TV show and movie commentaries. But even with that corpus of available data—30 years of co-hosting a weekly television show—much of it wasn't usable (because he often spoke over music or dialog from a movie). Still, this seems like a watershed moment for new products being developed from existing datasets. We can see the tantalizing possibility for building prosthetic speech appliances for anyone, without decades of media presence. While Ebert's solution was carefully built by hand, we're clearly headed toward some new and exciting experiences.

This is a new sandbox for technologists, data scientists, marketers, and experience designers. What are the corpora we have access to? What is lurking within our data smog? What are the new experiences we can create? No doubt we will continue to see art and humor, but let's use those to inspire us as we imagine what else is possible. The biggest potential (and as always the hardest problem) is in the development of game-changing experiences. I look forward to seeing where this goes.


References

1. Zalewski, D. The hours: How Christian Marclay created the ultimate digital mosaic. The New Yorker, March 12, 2012.

2. Admittedly, this term is in use already, as a pejorative for the organizations that control content. I'm using it here like big data, not so much characterizing the motivations as acknowledging that when you get a whole lot of something, then maybe something new can happen.

3. Van Buskirk, E. 'Galactica: Sabotage' creator discusses her brilliant Beastie Boys sci-fi mashup. Wired, March 10, 2010.


5. A technical deconstruction is posted at

6. The full story is at

Author

Principal of Portigal Consulting. Expert in capturing and reframing user insights.

©2012 ACM  1072-5220/12/11  $15.00

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

The Digital Library is published by the Association for Computing Machinery. Copyright © 2012 ACM, Inc.
