The 2020 book Data Feminism, by Catherine D'Ignazio and Lauren F. Klein, seems to have broken through. I have been seeing it on recommended reading lists and in library diversity, equity, and inclusion collections, and hearing about it from colleagues, who are discussing it in academic book clubs. Though criticized by some for not going far enough, it is likely the authors' just-enough approach to theories, feminist and otherwise, that makes the book a powerful invitation to thinking differently about data and data collection.
For those familiar with the literature on science and technology studies, the book's seven chapters present examples, some familiar and some new, that provide the sort of neat, clean, and clear stand-alone snippets that are easy to work into conversation. These examples, and the seven correlating principles, written in clear, commanding language, can provide both the means to convince a skeptic that data might not in fact be neutral and the means for a corrective approach. The principles are:
- Examine power
- Challenge power
- Elevate emotion and embodiment
- Rethink binaries and hierarchies
- Embrace pluralism
- Consider context
- Make labor visible.
For example, in the chapter that explains the importance of communicating data's context, the authors rework a chart's title and subtitle while presenting the same categorization and graphic representation of data. The first iteration reads "Mental Health in Jail: Rate of mental health diagnoses of inmates." The second iteration shifts the focus: "Racism in Jail: People of color less likely to get mental health diagnosis." The authors explain how stripping the dehumanizing language of inmates, calling out racism, and indicating that there is a disparity in healthcare reflect the responsibility of designers and data scientists to see the impacts in intersectional terms and to "name racism, sexism, or other forces of oppression when they are clearly present in the numbers." This kind of direct instruction is a helpful corollary to Edward Tufte's instructions that visual representations should have explanatory titles. But what adds depth is the third and final iteration of the title offered by the authors: "Racism in Jail: White people get more mental health diagnoses," which foregrounds the unfair accrual of benefits to a dominant group, rather than reinforcing a deficit narrative associated with minoritized groups. For those interested in changing the status quo, these seemingly small moves in the framing of an idea can have a big impact, because shifting the frame can bump people out of entrenched ways of thinking. In this case, a white reader might start to think differently about the impacts of the uneven distribution of resources. This rhetorical strategy is similar to (and, like it, data-based!) the one taken by Iris Bohnet, who points out that discriminatory patterns, even if they are entirely unintentional, diminish the quality of the talent pool available to a business or organization.
D'Ignazio and Klein suggest challenging power by reframing six concepts: moving from ethics to justice; from bias to oppression; from fairness to equity; from accountability to co-liberation; from transparency to reflexivity; and from understanding algorithms to understanding history, culture, and context.
What Data Feminism largely sidesteps is the question of to what degree individuals and organizations are free to reframe data in the increasingly politicized environments where many of us find ourselves working and living. Late capitalism has corralled increasing calls for diversity, equity, inclusion, and justice in ways that raise eyebrows even among those very much a part of the cause. And many companies, universities, government agencies, and cities have created new organizational roles, such as chief diversity officer, many of which carry six-figure salaries. How might we think about this shift in organizational structure using the principles in Data Feminism? From one perspective, the elevation of justice to the C-suite could be seen as a way to both examine and challenge power. But what about D'Ignazio and Klein's fourth injunctive principle: rethink binaries and hierarchies? They remind us here to consider "who is doing the counting and whose interests are served." In many cases within organizations, the creation of a singular entity, whether an officer or a committee, leaves those doing the work on precarious ground. Pointing out the sheer number of existing shortfalls (the very reason these positions and initiatives exist) won't win many friends among peers and higher-ups, and even well-resourced organizations are challenged when change comes too quickly or on too many fronts. On the other hand, celebrating successes, even when they are well deserved, risks downplaying how much work remains to be done and alienating those groups waiting for their concerns to be addressed. Interestingly, at some junctures for those in this often unenviable position, the absence of data, and the corresponding need to count people in ways that are legible and advance toward a shared goal, can constitute an important justification for future work. As the apocryphal adage goes, "You can't manage what you can't measure."
But there is something about this that seems a bit too blithe. Data on its own, no matter how thoughtfully and carefully collected, is not typically thought of as possessing agency. To demonstrate the point, I will borrow an example from the building industry, the context where I have been spending most of my time lately. After decades of calls for buildings to become more green, sustainable, and environmental, a powerful new concept has taken center stage: embodied carbon. Embodied carbon is a measure of the greenhouse gas emissions associated with making a given building or component of a building. On the face of it, it seems quite simple. For example, if you were to measure the embodied carbon of earthen bricks, you would determine how much carbon was emitted from mining the clay and firing the kiln. But here's where it gets complex, and fast: What if the clay comes from different places? Do you include the energy it takes to truck the bricks to the construction site or hoist them up to the top story of the building? What if the brick kiln is using a biomass material, such as wood, but using that wood is leading to deforestation, which creates negative ecological consequences, including increased carbon outputs? What if we are not measuring a single-component material such as a brick, but instead a mechanical device with thousands of subcomponents, such as an air conditioner? While it's conceptually possible to put a single carbon "price" on anything, whether a brick, air conditioner, or box of cereal, getting to a simple, legible, and single data point necessarily means flattening a long and complicated chain of judgments that abstract a lot of local specificity. On the other hand, now you can compare the proverbial apples and oranges, or bricks and air conditioners.
Returning to Data Feminism, it's easy to see embodied carbon as an example of what D'Ignazio and Klein describe in Chapter 3 as a "rational, scientific, objective viewpoint" (a single number representing embodied carbon!) taken from "mythical, imaginary, impossible standpoints." The ideal embodied carbon figure would take into account what happens at the end of a product's life; the number would be different if, for example, a wood beam is burned or reused, but since this will happen in the future, it is, at best, an imaginary standpoint. That said, in defense of those doing this kind of accounting, it is possible to evaluate best-case and worst-case scenarios; the question is how one determines the proportion of beams that will be burned versus reused in the future. (There is also the question of how long that future should extend; given the immediacy of the need to reduce carbon emissions in the near term, many are making the sensible argument that carbon reductions 50 years in the future are of zero utility today.)
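To make the chain of judgments concrete, here is a minimal sketch of how a single embodied carbon number for a wood beam might be assembled. Everything in it is an assumption for illustration: the stage names, the emission figures, and the function `embodied_carbon` are hypothetical and do not come from any real accounting standard or dataset. The point is only that the "one number" moves depending on which stages are counted and on a guess about how many beams will be reused rather than burned.

```python
# All figures are invented placeholders (kg CO2e per beam), not measured data.
STAGE_EMISSIONS = {
    "harvest_and_milling": 40.0,
    "transport_to_site": 10.0,
    "installation": 5.0,
}

# Hypothetical end-of-life outcomes; a negative value stands for a credit.
END_OF_LIFE = {"burned": 60.0, "reused": -20.0}


def embodied_carbon(included_stages, reuse_share):
    """Collapse chosen stages plus an end-of-life guess into one number."""
    if not 0.0 <= reuse_share <= 1.0:
        raise ValueError("reuse_share must be between 0 and 1")
    # Judgment 1: which stages fall inside the accounting boundary.
    upfront = sum(STAGE_EMISSIONS[stage] for stage in included_stages)
    # Judgment 2: what share of beams will be reused versus burned.
    end_of_life = (reuse_share * END_OF_LIFE["reused"]
                   + (1.0 - reuse_share) * END_OF_LIFE["burned"])
    return upfront + end_of_life


all_stages = list(STAGE_EMISSIONS)
best_case = embodied_carbon(all_stages, reuse_share=1.0)    # every beam reused
worst_case = embodied_carbon(all_stages, reuse_share=0.0)   # every beam burned
narrow_boundary = embodied_carbon(["harvest_and_milling"], reuse_share=0.5)

print(best_case, worst_case, narrow_boundary)
```

With these made-up inputs, the same beam can be reported with quite different figures, and none of them is wrong on its own terms. Each reflects a different answer to questions the final number no longer shows.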
The case of embodied carbon also sheds light on D'Ignazio and Klein's sixth principle—consider context—which contends that data is the result of unequal social relations. Why go to such great lengths to construct embodied carbon data? How will it be used? Some see embodied carbon accounting as paving a path toward a carbon tax, while others see it as enabling the emergence of incentives that will avoid a carbon tax. This is, of course, a highly politicized discussion, as is the popular discourse around diversity, equity, inclusion, and justice, but it is worthwhile to note that the ambiguity around exactly what embodied carbon data will achieve hasn't slowed efforts to define how it will be calculated and who can be trusted to do this work. On the contrary, disagreements and the occasional controversy seem to have accelerated its development, with different parties making the call for standards and regulation of data.
Ask those active in the embodied carbon movement and you will find that concern for the health of the planet and its people is a frequently and consistently cited motivation for their work. Ultimately, this concern is the political basis for the construction of embodied carbon data. The data itself, along with all of the arguments over the accounting procedures, is the means to an end, not the end in itself. This is the most powerful insight of Data Feminism. We are, for better or worse, locked into systems that limit individual agency through the long grasp of capitalism, colonialism, and patriarchy. Being more conscious of data's agentic force and potential is one way that we can come together to bend the arc of the future, even when our individual potential is circumscribed.
Thanks to my colleague Clare Robinson for bringing Data Feminism to my attention and for insightful discussions about concepts and applications.
Jonathan Bean is an assistant professor of architecture, sustainable built environments, and marketing at the University of Arizona. He studies taste, technology, and market transformation.
Copyright held by author
The Digital Library is published by the Association for Computing Machinery. Copyright © 2023 ACM, Inc.