In the past few years, I have sifted through more trace data than I care to remember. By trace data I mean logs of actions taken by users on Internet sites: mostly aggregated data from many users, but sometimes data from single users, such as search query logs.
Why all this immersion in activity logs and trace data? When designing and developing any kind of technology, data is useful for many reasons, some of which are:
- evaluation and understanding, using summative methods. Did feature X or service Y get used? And if so, what happened just before or just after it was used? From this we can get a sense of someone’s experience as they interact with a page or an application, or move between pages and applications. If it did not get used, can we determine why not?
- iterative design, and data collection for formative and generative design. Did people do what we expected or desired them to do? Did their actions match our assumed task models? Did users seem to be thwarted, repeating actions and abandoning without completing a set of task-related actions? And if so, do we have a way to understand what would have worked for them?
- theorizing about fundamentals of human behavior. What are the gross patterns we can discern from individuals, from variously grouped sets of people, or across the whole user population? What, if anything, do these data patterns tell us about humans and technology more generally? Do they contribute to social science understandings of human psychology and/or social behavior, and in turn, do they offer any insights that may contribute to generative design efforts?
- modeling and designing adaptive systems. Can we infer from people’s actions what they are likely to want to do next so we can invoke some model-based, on-the-fly interface and/or interaction adaptations to improve their experience or move them through some pre-scripted sequence of experiences (e.g., move them up a gaming level)?
- business relevance and prediction. Does the activity data within and across applications and services offer any insights that are business-relevant to existing offerings or suggest areas for innovation?
These of course represent a small fraction of the activities that are part of the current worldwide obsession with data, most publicly manifested in discussions of the global and societal implications of “big data.” Indeed, in the world of Internet innovation, you can’t walk 10 feet without someone telling you how much data they have, and it seems calling oneself a “data scientist” is like saying you can lay golden eggs. As discussed by authors from Microsoft in the May + June issue of interactions, there are many challenges facing data analysts. These challenges will only increase as more and more data is collected from wireless sensor networks, mobile devices, software logs, cameras, and RFID readers. We have our work cut out for us in designing better tools and services for data analytics.
For the purposes of this column, my comments will sit conceptually somewhere between daily data analytics and societal concerns about big data. I want to offer a few personal observations and cautionary remarks about the frenzy over data and data analytics, as well as a couple of ruminations about where practitioners and researchers in HCI can, and I believe must, weigh in.
First, while most of us in design-oriented research areas are very aware of the value of qualitative methods and data, many discussions in which the word data is used tend to focus almost entirely on quantitative data, with little acknowledgement that behind every quantitative measure and every metric are a host of qualitative judgments and technological limitations: What should we measure? What can we measure? Of what is the metric constituted and what assumptions are embedded within that metric? Decisions about what does and what does not get measured are choices. Those choices have weight, and should not be taken lightly. In their 1999 book, Sorting Things Out: Classification and Its Consequences, Geof Bowker and Susan Leigh Star look at many instances of classification, and remind us that “information scientists work every day on the design, delegation, and choice of classification systems and standards” and that “each standard and each category valorizes some point of view and silences another.” They note: “This is not inherently a bad thing; indeed, it is inescapable. But it is an ethical choice, and as such it is dangerous: not bad, but dangerous.”
Second, there are a couple of problems with numbers. Numbers impress us and sometimes even intimidate us. There is a great deal of fear and fetishism around numbers, and more than a touch of zealous reverence. This means invoking numbers can be a powerful persuasion technique: afraid to show their own ignorance, recipients of “factiful” arguments (that is, arguments full of fanciful facts), especially data-laden ones, can be hoodwinked into accepting invalid conclusions. In his book, Proofiness: The Dark Arts of Mathematical Deception, Charles Seife reminds us that, as humans, our reasoning in support of an argument, including data specification, data collection, data analysis, and reportage, can be biased. Arguments and the selection of data (facts) that purportedly support them are sometimes driven by what we want to be true, not necessarily what would be revealed through the rigorous application of tested and unbiased methods. He offers some compelling examples of what he calls “the art of using bogus mathematical arguments to prove something that you know in your heart is true, even when it’s not” (see sidebar).
Third, in part due to the aforementioned reverence, and nicely fueled by the arcane nature of most data-analysis tools, trace data analysis can seem more like a dark art than a science. Mathematical notation and clean code can look like precise languages, but there is a real art and craft involved in their production and application. Data analysts sometimes have the aura (or worse, adopt the mantle!) of shamans; invested with oracular power, they issue ritual incantations to the murky unknown, looking for answers that live beyond, “on the grid” or “in the cloud.” Indeed, it was this sifting through logged user-activity traces in the attempt to work out why some users abandoned our website that led me to declare a few years ago that doing trace and log data analysis was like a séance: there we were, examining the traces of the dearly departed in the hope we would find some clue as to how to make them come back. Before you dismiss this analogy, give it some thought. In a séance you have the following elements: people you wish to contact, tools that allow you to make contact with the other side (Ouija board, candles, etc.), a medium, and the departed themselves, shadowy, ill-formed presences who sometimes offer insights into why they left, how they feel about their departure, reflections on their current setting, and general advice for the inquirers. A data analyst can sometimes play the role of data whisperer, the medium who connects with the other side.
Finally, even for the highly numerate among us, we need to pose questions about how data was collected. There is much to do in the world of system, application, and service instrumentation that focuses on gathering data to reflect user experience. Inadequate or inappropriate instrumentation affects data quality and its fit for the purpose we have in mind. The interface we design affects what data it is possible to capture and therefore what it is possible to aggregate, summarize, and base one’s models on. The interface is thus in some ways a conversation with a user about a task or experience. A singular focus on the data without a consideration of the circumstances of its collection may lead us to miss the point or be overly general in our conclusions. For example, the “Like” button might be capturing something fundamental about human communication, but I am not sure I could conclude much other than that it is available to be clicked. In user modeling, this has been characterized as the “garbage in, garbage out” problem, which, loosely speaking, translates to “if your input data is garbage, your output results are also garbage.” If you want to derive a model of user experience or user behavior, you need to instrument systems to gather the data that is necessary to generate a reliable and valid model; you need to fully engage in “experience instrumentation” designed for “experience mining,” not sift through data that was collected for some other purpose and hope it will suffice. Nor should we make assertions about invariants of human social action based on analyses of activities that take place in single-case, specific settings (e.g., on a particular social network) without explicitly acknowledging what that specific interface, interaction, application, service, or company brings to the table, how that affects adoption and use, and thus how the data collected is biased by those aspects.
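To make the instrumentation point concrete, here is a minimal sketch of “experience instrumentation” in the spirit described above: logging not just that a control was clicked, but the interface context in which the click happened. All names here (`ExperienceEvent`, `log_event`, the field names) are hypothetical illustrations, not any real logging API.

```python
from dataclasses import dataclass, asdict, field
from datetime import datetime, timezone
import json

@dataclass
class ExperienceEvent:
    """One logged interaction, carrying interface context alongside the raw action."""
    user_id: str
    action: str    # e.g., "click", "abandon", "submit"
    target: str    # which control was acted on
    ui_state: dict # what the interface offered at that moment
    timestamp: str = field(default="")

    def __post_init__(self):
        if not self.timestamp:
            self.timestamp = datetime.now(timezone.utc).isoformat()

def log_event(event: ExperienceEvent, sink: list) -> None:
    """Append a structured record; a real system would write to a log service."""
    sink.append(json.dumps(asdict(event)))

# A "Like" click logged with its context: we record not only that the button
# was clicked, but that it was visible and what else was on offer, so a later
# analyst is not left guessing what the interface made possible.
log = []
log_event(ExperienceEvent(
    user_id="u123",
    action="click",
    target="like_button",
    ui_state={"visible_controls": ["like_button", "share_button"],
              "page": "article_42"},
), log)
```

The design choice is the column’s point in miniature: what ends up in `ui_state` is itself a judgment about what matters, made at design time, not at analysis time.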
These are exciting times! Let’s bring a creative and yet critical eye to the collection, the design, of data to complement the focus on the analysis of data. Of course, those trained with experimental and survey methodologies are continually designing data by designing experiments and instruments to address core science questions. However, I do not see this kind of thinking commonly applied when designing interfaces and interactions. I do see a deep commitment to discoverability, usability, the support of tasks and activity flows, and to aesthetic appeal, but not a critical lens on the data consequences of design choices at the interface/interaction level. I’d like to see more of what Tim Brown, among others, has called “design thinking” applied to data capture (including application, service, and system instrumentation), to data management (including collation and summarization), to user/use models that utilize machine-learning techniques, and, of course, as has been invited elsewhere, to data visualization and analysis (including interpretation). Brown says that design thinking is neither art, nor science, nor religion. It is the capacity, ultimately, for “integrative thinking.” In Brown’s view, a design paradigm requires that the solution is “not locked away somewhere waiting to be discovered”; he advocates that we embrace “incongruous details” rather than smoothing them or removing them. In the incongruous details lie the insights.
I want to underscore: I am not saying there are no design thinkers doing data analytics, nor am I saying there are no data analysts who are design thinkers; I am just saying we need more of them. Ultimately, my observations highlight the need for what I have been calling data-aware design (and innovation, but I am trying to avoid the acronym DADI here) within HCI. I am intentionally playing with the word aware. I want to contrast data aware with data driven for two reasons. First, data driven is a meaningless term; all design is data driven in some sense, but that data may be informal or under-specified and thus offer no metrics for determining cause and effect, or assessments of success, failure, and learning with regard to the original design intent. Second, driven seems overly deterministic; a lot comes into play when designing a feature, application, or service, not simply what the data tells us. I also intend with the notion of data aware that the data itself be aware. I am not invoking an anthropomorphic notion of awareness, but rather the idea of reflective data systems and an invitation that we design/develop systematic ways that gaps, errors, elisions, and unexpected abstractions are noted and reported. Reporting these along with carefully presented, “clean” stories of results will give us a sense of how much confidence we can have in the conclusions that are drawn.
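The idea of “aware” data, data that reports its own gaps and oddities alongside the clean summary, can be sketched as follows. This is a hypothetical illustration under my own assumptions (the function name, field names, and record shapes are invented for the example), not a prescription for any particular system.

```python
from collections import Counter

def summarize_with_caveats(records: list, field_name: str) -> dict:
    """Summarize one numeric field, reporting missing values and unexpected
    types rather than silently dropping them."""
    values, missing, odd_types = [], 0, Counter()
    for r in records:
        v = r.get(field_name)
        if v is None:
            missing += 1
        elif isinstance(v, (int, float)):
            values.append(v)
        else:
            odd_types[type(v).__name__] += 1
    return {
        "mean": sum(values) / len(values) if values else None,
        "n_used": len(values),
        "n_missing": missing,                    # elisions, reported not hidden
        "unexpected_types": dict(odd_types),     # abstraction surprises
    }

# Two clean dwell-time records, one missing value, one malformed entry:
events = [{"dwell": 12}, {"dwell": 30}, {}, {"dwell": "n/a"}]
report = summarize_with_caveats(events, "dwell")
# The mean (21.0) arrives with its caveats: 1 missing, 1 unexpected string.
```

The caveat fields are the point: a reader of the report can judge how much confidence the mean deserves, rather than receiving only the “clean” story.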
In sum, given that the charter of human-computer interaction is to address how humans interact with and through computers, an HCI researcher or practitioner needs to be part of the conversation that addresses what and how quantitative trace data is collected (what is instrumented and how); how data is represented, extracted (sampled), and/or aggregated; what questions are asked of data; what processes and practices are enacted as results are generated; and how data thus extracted is understood. It is our responsibility to engage with the deeper epistemological question: How do we come to know what (we think) we know about people and their interactions with and through technologies? Data is a design problem.
2. I note that there are hot debates about whether data as a word should be treated as singular or plural. Following an extended discussion with trusted friends and editors, for this column, I have elected to go with data as a collective noun.
Some examples from Seife’s book:
Disestimation is when too much meaning is assigned to a measurement, ignoring any uncertainties about the measurement and/or errors that could be present. In the 2008 Minnesota Senate race between Norm Coleman and Al Franken, errors in counting the votes were much larger than the number of votes that separated the candidates (estimated to be between 200 and 300). Seife concludes that flipping a coin would have been better than assuming any veracity in the measure (the number of votes) given these errors.
Potemkin numbers are statistics based on erroneous numbers and/or nonexistent calculations. Seife cites Justice Scalia’s statement that 0.027 percent of convicted felons are wrongly imprisoned. This turned out to be based on an informal calculation, with rigorous studies suggesting that the actual number is between 3 and 5 percent.
Some “fruit salad” examples include “comparing apples and oranges,” “cherry picking” data for rhetorical effect, and “apple polishing.”
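Seife’s disestimation point reduces to a simple check: a reported margin is only meaningful if it exceeds the uncertainty of the count itself. The numbers below are placeholders consistent with the rough figures given above, not official tallies.

```python
def margin_is_meaningful(margin: int, error: int) -> bool:
    """True only when the margin separating candidates exceeds the
    plausible counting error; otherwise the 'winner' is noise."""
    return abs(margin) > error

# A margin of roughly 200-300 votes against counting errors "much larger"
# than that margin (illustrative value):
vote_margin = 250
counting_error = 2000

print(margin_is_meaningful(vote_margin, counting_error))  # prints False
```

When the check fails, as here, the precise-looking vote count carries no more information than Seife’s coin flip.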
Elizabeth Churchill is the current vice president of ACM SIGCHI. Her research focuses on social media. Currently an independent consultant, she was formerly a principal research scientist and founder of the Internet Experiences Group at Yahoo! Research.
©2012 ACM 1072-5220/12/0900 $15.00