In 2003, I was working at a research lab on what then seemed like a new and unmapped field: urban computing. We envisioned a network of devices that would playfully engage urban dwellers with the strangers around them. In an initial vision, sensors at points of interest, such as bus stops or coffee shops, would track nearby devices. That information could then drive place-based recommendations along the lines of Amazon's product recommendations.
A visitor to our lab immediately pointed out that the system would harvest easily identifiable location data—data that corporations and governments could use to assemble individual profiles. I was too polite to pooh-pooh him, but I thought our visitor was unrealistically paranoid. What would be the point of such a sparse surveillance network? Setting up towers would cost too much for the limited data they would provide. Or so I thought.
Fast forward 10 years to the anti-government protestors in Kiev's Maidan Square. According to The New Yorker, many protestors received SMS notifications, likely from the Ukrainian government, noting their presence at the demonstration [1]. Observers explain that it's easy to set up the unofficial, temporary honeypot cell towers that had harvested protestors' telephone numbers before passing connections on to the wider telecom network. Reading the article, I thought of our "familiar stranger" devices—and our visitor's warning. We meant the project to inspire new directions for interaction design. With 10 years of development in mobile devices and networked sensing behind us, the project now spurs me to reconsider the ethics of interaction design in the age of big data.
From fitness trackers to car sensors that monitor speed and position, the R&D proposals of 10 years ago are now hitting the market. These new products and services produce and consume a tremendous amount of personally identifiable information about people's characteristics, opinions, and behavior. Some of the data comes from intentional contributions, such as status updates. Some comes from unwitting or background capture, as with car insurance modules that sense speed and location. As ever more data is collected, increasing numbers of products depend on reused data, as with the targeted advertising services that combine data from many sources to customize online advertising. Personally identifiable data is now a commodity, transferred to others for reuse beyond the initial context in which it was captured. Yet the fate of this data is largely opaque to the people who provide it.
Philosopher Helen Nissenbaum defines privacy as "contextual integrity"—that is, sharing information according to the expectations under which it was originally offered [2]. Selling or reusing data is a great way to destroy contextual integrity. Even if the users consented to the original capture and use of their data, they do not always know when it is later used in ways they did not expect or desire.
My ethical concerns emerge from four related trends:
- The sensor-infused world. Embedded sensing allows many more types of devices to collect data about the actions of users and conditions of the world around them. It's not just special-purpose devices such as the Nike+ or the Nest thermostat. Cars, such as the Prius, are increasingly instrumented. Telephones, too, host an array of sensors. Applications that bridge multiple devices (such as the Kindle or Instagram) allow the integration of sensor and online behavior data. Combined with the systemic tracking of online activities, networked devices that monitor and share offline human behavior can potentially enable companies and governments to assemble a 360-degree profile of individual characteristics and behavior. Even if the results are intended to benefit users, this is, to put it bluntly, mass-market consumer surveillance.
- Data as a commodity. Once collected, data can be stored indefinitely, passing into new management and new uses. Consider the horrified reactions of British citizens who discovered recently that the National Health Service was preparing to sell access to the entire country's health records—only barely anonymized—through a consultancy [3], and that the consultancy had already uploaded those records to a distribution service. Even if they could have chosen not to have health information recorded, how could citizens in the 1980s have anticipated the technology, business, and government policies of the 2010s?
- The opacity of back-end information exchange. What's often called the data curtain veils the circulation of data in uncertainty. Where is your data stored? Who has access, whether openly or through secretive (but legal) back doors? Do managers share your beliefs about what they can or should do with it?
- Mass scale. When millions of users are scattered around the world, how are designers to take into account different culturally inflected expectations for the collection and reuse of data? Even when developed with the best of intentions, these products will unpleasantly surprise some of their users. How can we give users a chance to object before being unpleasantly surprised—or an opportunity to mitigate any consequences afterward?
We may think of data collection and processing as happening "on the back end," far away from the traditional responsibilities of interaction design. However, designers like myself are active participants in this business of collecting, processing, and sharing personally identifiable data. We may create interfaces that make it easier to share information without limits than to withhold or control it. We may specify features that rely on harvested or reused data. Or we may simply not protest when we see developers implementing code that maximizes the amount of data requested or stored.
The uncertainties surrounding the pervasive collection, storage, and sharing of personal data wouldn't be such an obvious ethical problem if interaction design, as a discipline, weren't organized around an agenda of user—or better yet, human—centeredness. Educational programs, professional organizations, major companies—all profess to serve the needs and desires of the humans who use the technologies they design. It's ironic when designers claim they want to fulfill unstated desires for, say, convenient shopping or up-to-the-minute friendships—yet simultaneously ignore or undermine an arguably more important need for contextual integrity.
Data-intensive applications and devices test the accepted professional ethics for designers. If interaction design depends upon being user-centered, and people's personal data can be easily used to harm them, then designers need to concern themselves with data policies as well as functionality.
Of course, designers are not the only people responsible for this situation—designers are not even primarily responsible for it. Executives, product managers, engineers, and legal staffs also make consequential decisions about data management.
Users themselves also bear responsibility. We cannot view consumers and users as merely passive victims. That's not just condescending—it's incorrect. From teens on Facebook to online daters, people consciously edit, tune, and falsify information about their lives. They also reject, opt out, or otherwise refuse to participate.
Regardless of where we might look, the complexity of algorithmic modeling sometimes makes it difficult to figure out where responsibility for failures might lie. In many cases, no single data collection event is at fault. Rather, disastrous results emerge unpredictably when organizations exchange vast amounts of personally identifiable data and algorithmically fit that data into culturally specific models. As Jason Schultz and Kate Crawford ask, "When a pregnant teenager is shopping for vitamins, could she predict that any particular visit or purchase would trigger a retailer's algorithms to flag her as a pregnant customer? And at what point would it have been appropriate to give notice and request her consent?" [4] And at what point, I would add, could the designers of that system have anticipated what might happen when a pregnancy-related marketing flyer ended up in her parents' mailbox?
Are designers directly guilty of the harms people experience? No. But when they help design mechanisms for harvesting and sharing personally identifiable information, they are not innocent, either. And the more power designers have, the more complicit they are. Not having total responsibility for data collection and use does not absolve designers from taking any responsibility.
So if they don't solely control product features and business policies, how can designers help protect users' interests?
There are already opportunities throughout the design process. Interaction designers can start making the collection and transmission of personally identifiable information more visible at the interface. They can specify functionality to minimize the amount of personal data collected and stored. Researchers at Berkeley, for example, have started to collect "privacy patterns" that promise to make it easier for interaction designers to build respect for personal data into interfaces (http://privacypatterns.org/about/). Though they do not control the organizations in which they work, designers can also act as persuasive advocates throughout the design process, helping decision makers keep the uncertainties surrounding the data, along with the clarity it seems to produce, in mind.
However, we will have to change how we think of personal data—and interaction design.
A first step is treating personal data as potentially perilous, rather than as an innocuous source of value. When data is a harmless commodity, we are more likely to maximize what we collect and store, just in case it will be valuable later. A world in which data presents dangers to users and businesses alike changes that calculation. Instead of treating databases like tidy libraries, we might see them instead as more like blood banks—full of messy, leaky substances that are helpful when used appropriately but harmful when wrongly transfused.
A next step might be to reconsider the design assumptions that lead to opacity rather than disclosure. In the name of minimalism and usability, the mechanisms of data collection and storage often remain behind the curtain. Valuing simplicity doesn't excuse keeping people in the dark.
Finally, acting ethically doesn't always mean saying no. Legal scholars and sociologists have started to propose means of "due process" for big data. Such due process would include not only notice of data collection and management processes but also the opportunity for mediation by an impartial outsider. Right now, such technological due process exists only as conceptual proposals. For researchers and designers (or at least, for me), these calls for due process present a fascinating service design challenge. It's easy to think of ethical responsibilities as a barrier to unrestrained data collection and reuse. But the unfulfilled promise of proposals for technological due process shows us that ethics can be generative as well as restrictive.
A usual response to my call for action is that a company's stance toward data management is not a problem for interaction designers. There's a tendency to outsource worrying about this to professional advocates—the "experts"—who will review policies and take action. Or to engineers who will code in protections. For example, proponents of Privacy by Design (http://privacybydesign.ca/) have directed their efforts largely at changing legal policies and software architecture. Despite their advocacy of user-centered design, there's not much evidence that they are trying to recruit design experts.
But effective data management is very much a question of what users want and expect in context. It's a question of culture. The people in tech concerned with what users know and want are typically designers—not engineers, and not policy makers. Moreover, it's better to avoid a problem than to institute a messy and expensive fix. Invention often outpaces regulation. As we've seen with myriad debates over Internet policing, lawyers and policy makers are often stuck playing catch-up. Given the focus of many privacy advocates on architecture and legal agreements, there is a real opportunity for interaction designers to make a difference—if they take it.
For in practice, working as an interaction designer requires stretching the limits of any idealistic definition of user-centeredness. Budgets and timelines often don't allow for the kind of engagement with potential users that might help designers understand their desires or capabilities. And what users want is only one factor in design.
Even when that engagement takes place, the business priorities of the client organization often contradict what designers believe to be in the best interests of users. Limiting the amount of data collected or stored, or advocating for legal restrictions on its use, can go against corporate priorities. People need jobs; they may want to support product teams as best they can; they may believe in the mission of their companies. Being seen as a gadfly is not the best career move.
There is also the sheer pleasure of invention. That's what I felt when I refused to believe that my beautiful playful design proposal could ever be a surveillance tool.
Most important, we often make decisions confident that past conditions will dictate the future. That very human shortsightedness lies at the heart of my confession: I am now embarrassed that just two years after 9/11 and the obvious explosion of U.S. surveillance efforts, I did not take our visitor's warning seriously. We don't anticipate that the National Security Agency will demand secret, unprecedented access to company databases. Nor do we predict acquisitions, new executives, bankruptcy, or any one of a number of all-too-possible fates that can change data-management policies.
These reasons for inaction are all understandable. None of them, however, excuse avoiding what is becoming a pressing professional responsibility. Interaction designers bear responsibility for the data that powers the applications we make. It's time to make the perils of big data—as well as its new and exciting possibilities—our business.
1. Kopstein, J. A phone for the age of Snowden. The New Yorker Blogs. Jan. 30, 2014; http://www.newyorker.com/online/blogs/elements/2014/01/a-phone-for-the-age-of-snowden.html
2. Nissenbaum, H. Privacy as contextual integrity. Washington Law Review 79, 119 (2004); http://heinonline.org/HOL/Page?handle=hein.journals/washlr79&id=129&div=&collection=
3. Ramesh, R. NHS England patient data 'uploaded to Google servers,' Tory MP says. The Guardian. Mar. 3, 2014; http://www.theguardian.com/society/2014/mar/03/nhs-england-patient-data-google-servers
4. Crawford, K. and Schultz, J. Big data and due process: Toward a framework to redress predictive privacy harms (SSRN Scholarly Paper No. ID 2325784). Social Science Research Network. 2013; http://papers.ssrn.com/abstract=2325784
Elizabeth Goodman received her Ph.D. from UC Berkeley's School of Information. Her dissertation focuses on commercial interaction design practice and ubiquitous computing. She is also the author of Observing the User Experience, a handbook of design research methods for students and professionals.
Copyright Held by Author
The Digital Library is published by the Association for Computing Machinery. Copyright © 2014 ACM, Inc.