Before the algorithmic systems being deployed to assist with decision making in both public and private sectors came into being—and before they were shown to be harmful to various groups of people because of predictions based on presumed relationships found in data—they existed in the minds of designers, product managers, and organizations. Scholars have shown the issues with using biased data stemming from structurally unfair systems, as well as the implications of using unfair models. But also important is the ideation behind these systems, and how imaginaries shape their creation and the problems found within them.
Algorithmic systems are systems of representation: "ways of organizing, clustering, arranging, and classifying concepts and of establishing complex relationships between them." Representation systems put things that are alike together. For example, many different breeds of dog exist. "Dog" is an umbrella category: all of the things called dog share similarities, but they are not all the same. For one person the word dog might evoke the image of a husky; for another person it could be a rottweiler.
This means that individuals can have their own specific idea of what the ideal type—the image that is most representative for a category—is. Organizations encode rules related to that ideal type, rules that presumably shape what the outcomes should look like, who should be served, and how the algorithm should behave with data and feedback loops. A simple example of how organizations encode rules for algorithms to achieve specific results can be seen in entertainment recommendation systems. Systems like Netflix, Hulu, Spotify, and Amazon Prime offer users recommendations for television shows, music, and movies based on how the rules are programmed into the algorithms; the recommendations are refined as the user continues to interact with the system by watching, listening, searching, and so on. Ideally this results in recommendations for content similar to what the users have already watched or listened to. Of course, there may be some differences among the recommendations, but too much difference is a cause for concern from platform users. The main idea is that the rules decide who or what to include as well as exclude. And this is an important consideration, as it symbolizes how we, or organizations, think about deviance. Large-scale algorithms classify based on institutional and/or organizational rules. Deviance from these rules can have significant consequences.
As an example, the 2020 A-level and GCSE examinations in the U.K. were canceled. Instead, policymakers used an algorithm designed to predict how students would have scored on the exams had they taken them. At the same time, teachers were asked to provide their own predictions about where they thought students would end up in terms of grades. The outcome of the algorithmic grading was that nearly 40 percent of student grades were ranked one grade lower than teachers predicted; another 3 percent were ranked two grades lower.
These predictions meant that students were rejected from admission to college, university, and training programs. An even more insidious result of the use of the algorithm was that downgrading happened significantly more for those who attended state schools; upgrading happened more on average with students at privately funded independent schools. This means the algorithm negatively affected a lot more poor and middle-class students, who predominantly attend state schools. More-affluent students, on average, attend independent schools.
The algorithm, and others like it, used an exemplar—the "most perfect" example—of what a student who would score highly would "look like." The algorithm factored in not only the grades the teachers assigned but also other data and labels to see whether a student would resemble the others in a particular grade category. Membership in categories is a matter of degree, and organizations can set the boundaries. These boundaries define which individuals are closest to the ideal. Predictions are based on family resemblance, meaning that some, but not all, of the common qualities exist for membership in a category. Those lacking more familial traits, those further away from the exemplar, are marked for exclusion.
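The idea of graded membership around an exemplar can be made concrete with a small sketch. This is not the grading algorithm itself. It is a minimal illustration that assumes a hypothetical two-feature description of each student, a hypothetical exemplar, and an inverse-distance membership score; the names `Student`, `membership`, `classify`, and `EXEMPLAR` are all invented for the example.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class Student:
    name: str
    # Hypothetical features scaled to 0..1:
    # (teacher-assigned grade, school's historical results)
    features: tuple[float, ...]


def distance(a: tuple[float, ...], b: tuple[float, ...]) -> float:
    """Euclidean distance between two feature vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def membership(features: tuple[float, ...], exemplar: tuple[float, ...]) -> float:
    """Degree of membership in (0, 1]: 1.0 at the exemplar, lower further away."""
    return 1.0 / (1.0 + distance(features, exemplar))


def classify(students, exemplar, boundary: float) -> dict[str, bool]:
    """The organization picks `boundary`; anyone below it is excluded."""
    return {s.name: membership(s.features, exemplar) >= boundary for s in students}


# Hypothetical exemplar of a "top-grade student".
EXEMPLAR = (1.0, 1.0)

students = [
    Student("A", (0.9, 0.9)),  # strong grade, school with strong past results
    Student("B", (0.9, 0.4)),  # same grade, school with weaker past results
]

strict = classify(students, EXEMPLAR, boundary=0.8)
loose = classify(students, EXEMPLAR, boundary=0.6)
print(strict)
print(loose)
```

Students A and B have the same teacher-assigned grade, yet under the stricter boundary only A counts as a member of the category. Nothing about B changed between the two runs; only the organization's choice of boundary did.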
While we can and should be concerned with the model used for algorithms like that used for the GCSE, just as concerning is the lack of imagination demonstrated in creating the rules for the system. It would seem to have been a simple task to consider, well before creating and deploying a technology with such high stakes, the possible impacts of encoding particular ideals. These ideals can be thought of as someone having asked, "What is supposed to be?" Three "supposed to's" are particularly insidious for algorithmic systems: 1) What is this supposed to solve? 2) What and who is this (product, service, process) supposed to look like? and 3) Who is/not supposed to be included and who is responsible for inclusion/exclusion?
What is this supposed to solve? Problem-solving is promoted as a rationale for the use of algorithms. Those employing algorithmic systems want to be more efficient and effective and to remove possible human bias. This was the rationale provided for ignoring the grades the teachers provided for the GCSE exams. The algorithm was supposed to have removed the possibility of grade inflation by teachers who may have had an emotional attachment to their students. But focusing on the possibility of human bias ignored that the system had similar biases related to who got what kind of scores. A similar instance of algorithmic "problem-solving" that was found to create more problems is Detroit's Project Green Light, a citywide law enforcement scheme that uses facial recognition technology to identify those captured on public surveillance. The scheme has resulted in at least one lawsuit for wrongful arrest and imprisonment.
What (who) is this (product, service, process) supposed to look like? This question asks how closely an individual resembles the exemplar for a category. A recent example of the problems that arise with this "supposed to" was found in state adoption of a facial recognition system to verify unemployment claims during the Covid-19 pandemic. The system was supposed to identify legitimate claimants for unemployment and deny the claims of those who allegedly attempted to commit fraud. California alone reported that it had suspended 1.4 million claims under suspicion of fraud; 300,000 of those people have been able to verify their identity. This leaves 1.1 million people who have approximately 30 days to somehow verify their identity.
Some legitimate claims are being rejected because the system could not confirm the individual's ID. Scholars like Timnit Gebru and Joy Buolamwini have already shown that facial recognition systems are particularly error-prone for people who do not resemble the faces in the data on which the systems were trained. Therefore, darker-skinned people and those who do not read as male are less likely to be correctly identified.
So what does this mean when states employ systems like this that we know are faulty? Many people who need assistance are unable to get it, or have to wait a very long time. It also indicates that government officials and public sector administrators are, at a minimum, lacking in imagination by not foreseeing the very real impacts of implementing technology known to disparately affect particular groups of people. What this calls for is threat modeling where the focus is not on the security of the system, but rather on imagining all the places that the system could go wrong for all of those within and beyond its immediate reach.
Who is/not supposed to be here? This consideration attempts to recognize those who are and are not considered important stakeholders by those building and deploying technology. At the same time, a focus on "stakeholders," "users," and "customers" ignores a significant population of people who may not have monetary or other value-related stakes in a system but who are still affected by its use, a much broader and more complex consideration. But the difficulty of this kind of consideration should not be a deterrent to making it a priority. The narrower focus leads to algorithmic systems producing harmful outcomes, with individuals in affected groups left without recourse. For example, algorithms are widely used in healthcare settings, where they are perceived as offering guidance to doctors about the kinds of treatments to provide for certain kinds of patients. As in the other examples, because systemic racism and sexism, among other things, affect data, these algorithms were found to be suggesting lax healthcare guidance when patients were Black, as well as when they were poorer. The use of race as a factor for algorithmic decisions about kidney disease treatment and placement on the transplant list is an express example of how the encoding of the ideal causes harm. The algorithmic model weights patients identified as Black as having healthier kidneys than others, a coefficient indicating deviance in a system designed to guide the treatment of unhealthy organs. This weighting has been shown to have kept Black kidney disease patients from being included on transplant lists and receiving more aggressive treatments.
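To see how a single coefficient can change who receives care, consider the following sketch. It does not reproduce any clinical formula. It assumes a hypothetical kidney-function score where lower means sicker, a hypothetical multiplier applied to patients recorded as Black, and a hypothetical referral cutoff; `RACE_COEFFICIENT`, `REFERRAL_THRESHOLD`, and both functions are invented for illustration.

```python
# All numbers below are illustrative assumptions, not clinical values.
RACE_COEFFICIENT = 1.2     # hypothetical multiplier for patients recorded as Black
REFERRAL_THRESHOLD = 20.0  # hypothetical cutoff: scores at or below it trigger referral


def adjusted_score(base_score: float, recorded_as_black: bool) -> float:
    """Apply the race multiplier to a kidney-function score (lower = sicker)."""
    return base_score * RACE_COEFFICIENT if recorded_as_black else base_score


def referred(score: float) -> bool:
    """A patient is referred only when the score is at or below the cutoff."""
    return score <= REFERRAL_THRESHOLD


# Two patients with identical underlying measurements.
base = 18.0
score_without = adjusted_score(base, recorded_as_black=False)
score_with = adjusted_score(base, recorded_as_black=True)

print(referred(score_without))  # referred: the score is under the cutoff
print(referred(score_with))     # not referred: the multiplier lifts the score over it
```

The two patients differ only in the recorded race label, yet one crosses the cutoff and the other does not. The multiplier makes the second patient look healthier on paper, which in this sketch is exactly what delays the referral.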
Whether through ignorance or express bias, the creators of this kidney disease algorithm have created and distributed an algorithm that is being used to make harmful, if not deadly, decisions. A better question for designers of systems like these is, Who is neglected? or What is the human impact of the use of these kinds of tools?
There are many other harms that have happened and continue to happen with the use of algorithmic machine learning or decision system technologies, technologies encoded to fulfill human and/or organizational imaginaries. Yet there are opportunities to remedy the problems with how systems are imagined and created, and whether or not to implement them. This requires critical consideration of what or who is being programmed as the ideal, as well as who or what is considered the deviant.
Instead of the "supposed to's," system creators and deployers should consider the "must be's," central factors related to algorithmic systems. The first "must be" is an identification of the ideal, and then a reorientation from a system that assesses only proximity to the ideal/deviance to one focused on possible impacts to the most vulnerable. But this must happen before the system is built, at the ideation stage, and continue throughout the iterative creative process.
We must also be willing to normalize stopping or prohibiting the use of certain tools when they create harm. Allegations of harm should force the pausing of the use of an algorithmic system; there should then be an investigation. If an investigation finds harm, then use must be contingent on a proper remedy. If none is available, use should be prohibited. Recently, several U.S. legislators have proposed bills that would prohibit the use of discriminatory algorithms. In May 2021, Senator Ed Markey and Congresswoman Doris Matsui proposed the Algorithmic Justice and Online Platform Transparency Act (https://www.markey.senate.gov/imo/media/doc/ajopta.pdf), which would ban the use of algorithmic processes by online platforms to discriminate on the basis of race, age, gender, ability, and other protected characteristics. These kinds of requirements, if passed into law, could force creators to think about their creations before letting them loose in the wild.
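The pause, investigate, and remedy-or-prohibit sequence described above can be written down as a small state machine. This is only one possible reading of that sequence; the `Status` values and the two transition functions are hypothetical names, and a real governance process would involve far more than two inputs.

```python
from enum import Enum


class Status(Enum):
    IN_USE = "in use"
    PAUSED = "paused pending investigation"
    IN_USE_WITH_REMEDY = "in use, remedy applied"
    PROHIBITED = "prohibited"


def on_allegation(status: Status) -> Status:
    """Any allegation of harm pauses a system that is not already prohibited."""
    if status is Status.PROHIBITED:
        return status
    return Status.PAUSED


def on_investigation(status: Status, harm_found: bool, remedy_available: bool) -> Status:
    """Resolve a paused system once the investigation concludes."""
    if status is not Status.PAUSED:
        raise ValueError("only a paused system can be resolved by an investigation")
    if not harm_found:
        return Status.IN_USE
    # Harm was found: continued use is contingent on a proper remedy.
    return Status.IN_USE_WITH_REMEDY if remedy_available else Status.PROHIBITED


status = on_allegation(Status.IN_USE)
print(status.value)
status = on_investigation(status, harm_found=True, remedy_available=False)
print(status.value)
```

The design choice worth noticing is the default: an allegation alone moves the system to `PAUSED`, so the burden falls on the deployer to show that use can safely resume rather than on affected people to prove harm while the system keeps running.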
Of course, there must be continuous evaluation and auditing of algorithmic or decision systems. Whether it's banking, admissions, or hiring, people need to know what they are encountering, and organizations and governments must put policies in place requiring auditing and transparency. But auditing and transparency are reactive; we need proactive policy requiring system creators to meet safety and impact standards set with the input of community and advocacy organizations. At the same time, we should not be attempting to assess acceptable levels and kinds of harm, acceptable loss of life, or loss of opportunities. And although the use of science and technology, including sociotechnical systems, inherently comes with risk, harm and risk are not the same. No harm is the ideal; as little harm as possible is the goal.
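One simple form such an audit could take is comparing how often an algorithm's output falls below a human baseline for different groups, as in the exam-grading case. The sketch below is a minimal illustration; the records are invented toy data, not real exam results, and `downgrade_rates` is a hypothetical helper.

```python
from collections import defaultdict


def downgrade_rates(records):
    """Share of cases per group where the algorithm's grade is below the teacher's.

    `records` is an iterable of (group, teacher_grade, algorithm_grade),
    with higher numbers meaning better grades.
    """
    totals = defaultdict(int)
    downgraded = defaultdict(int)
    for group, teacher_grade, algorithm_grade in records:
        totals[group] += 1
        if algorithm_grade < teacher_grade:
            downgraded[group] += 1
    return {group: downgraded[group] / totals[group] for group in totals}


# Invented toy data for illustration only.
records = [
    ("state", 6, 5), ("state", 5, 5), ("state", 7, 5), ("state", 4, 4),
    ("independent", 6, 6), ("independent", 5, 6),
    ("independent", 7, 6), ("independent", 6, 6),
]

rates = downgrade_rates(records)
gap = rates["state"] - rates["independent"]
print(rates)
print(gap)
```

A check like this only measures a disparity after outputs exist, which is why it is reactive: it can flag that one group is downgraded more often, but it cannot say whether the system should have been built in the first place.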
Ensuring as little harm as possible requires legislation. At the local level in the U.S., we are seeing movement, particularly with respect to bans on the use of facial recognition in certain cities. The European Commission's April 2021 proposed legal framework for artificial intelligence (https://ec.europa.eu/commission/presscorner/detail/en/ip_21_1682) is underpinned by a focus on harm prevention and would ban all AI systems posing a clear threat to safety or fundamental rights. While these policy activities are certainly welcome, it is imperative that designers, product managers, and ultimately organizations build considerations of harm into the ideation and iterative phases of systems creation. It is not enough to recognize harm after the threat has been realized. Instead, because algorithmic systems have the potential for such consequential and long-term impacts, creators must be responsible for predicting the possible outcomes, then imagining and creating something different.
There is still a lot to be done.
2. BBC News. A-level and GCSE results: Pressure mounts on ministers to solve exam crisis. BBC News. Aug. 17, 2020; https://www.bbc.com/news/education-53804323
3. Hill, K. Wrongfully Accused by an Algorithm. The New York Times. Jun. 24, 2020; https://www.nytimes.com/2020/06/24/technology/facial-recognition-arrest.html
4. Gershgorn, D. 21 states are now vetting unemployment claims with a 'risky' facial recognition system. OneZero. Feb. 4, 2021; https://onezero.medium.com/21-states-are-now-vetting-unemployment-claims-with-a-risky-facial-recognition-system-85c9ad882b60
5. Bichell, R.E. and Anthony, C. For Black kidney patients, an algorithm may help perpetuate harmful racial disparities. Washington Post. Jun. 6, 2021; https://www.washingtonpost.com/health/black-kidney-patients-racial-health-disparities/2021/06/04/7752b492-c3a7-11eb-9a8d-f95d7724967c_story.html
Jasmine McNealy is an associate professor in the Department of Media Production, Management & Technology at the University of Florida. She researches media, technology, and law with an emphasis on privacy, surveillance, and data governance.
©2022 ACM 1072-5520/22/05 $15.00