PeopleLens
Issue: XXVIII.3 May - June 2021
Cecily Morrison, Ed Cutrell, Martin Grayson, Geert Roumen, Rita Marques, Anja Thieme, Alex Taylor, Abigail Sellen
The PeopleLens is an open-ended AI system that offers people who are blind or who have low vision further resources to make sense of and engage with their immediate social surroundings. It has been used most recently by children who are blind in school settings, supporting their skills in proactively interacting with classmates.
The PeopleLens is an exploration of how we can design human-AI interaction that moves beyond discrete task support to provide a continuous stream of dynamic information. It was inspired by ethnographic research with Paralympic athletes and spectators, which captured the diverse sense-making skills that people with low vision use to orientate to and interact with others. The PeopleLens was imagined as a way of amplifying the details in people's surroundings, enabling them to extend their existing skills and capabilities.
Our current implementation of PeopleLens, being used by schoolchildren, has three key features:
- Person-in-Front: This feature reads out the name of a person to whom a user orientates.
- Orientation Guide: This feature provides additional sound cues, giving the user a better understanding of the system's detection of bodies or faces. These cues assist users in directing their body and head orientation, improving system accuracy. The orientation guide also conveys to those nearby where a user's attention is directed.
- External Feedback: Finally, the head-mounted device has an external LED interface that indicates the system state to a communication partner.
The PeopleLens is not just a resource for the user but also a means of supporting reciprocal interaction with communication partners. In our trials, we found people adjusting their own body position in order to be identified. To supplement this, we developed external feedback on the device. A semicircular LED interface affixed to the top of the HoloLens provides communication partners with information about the state of the system. This assists the development of common ground and reflexive interpretation of behavior, giving users and communication partners additional cues to establish and maintain mutual attention.
TH using the PeopleLens system in a classroom. TH, who has used different versions of the device since 2018, has been integral in informing its current design.*
Key to this design was finding ways for users and communication partners to establish shared understandings through different sensory modalities. In the final design, a moving white light tracks the location of the nearest detected person, flashing green when that person is identified to the user. We also explored a number of visual-tactile interactions to support the user's understanding of the system information.
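The light behavior described above can be illustrated with a small sketch. This is not the PeopleLens implementation: the LED count, the 180-degree span, the bearing convention (0 is straight ahead from the wearer's point of view, negative is to the wearer's left), and the names `Detection`, `led_index`, and `render_frame` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

NUM_LEDS = 24          # assumed number of LEDs on the semicircle
FIELD_OF_VIEW = 180.0  # assumed angular span of the strip, in degrees

WHITE = (255, 255, 255)
GREEN = (0, 255, 0)
OFF = (0, 0, 0)


@dataclass
class Detection:
    """Hypothetical record for one detected person."""
    bearing_deg: float  # -90 (wearer's far left) to +90 (far right)
    distance_m: float   # estimated distance from the wearer
    identified: bool    # True once the person has been named to the user


def led_index(bearing_deg: float, num_leds: int = NUM_LEDS) -> int:
    """Map a bearing onto an LED position along the semicircle."""
    half = FIELD_OF_VIEW / 2
    clamped = max(-half, min(half, bearing_deg))
    fraction = (clamped + half) / FIELD_OF_VIEW
    return min(num_leds - 1, int(fraction * num_leds))


def render_frame(detections: List[Detection], blink_on: bool,
                 num_leds: int = NUM_LEDS) -> List[Tuple[int, int, int]]:
    """Return one RGB frame: a single light follows the nearest person."""
    frame = [OFF] * num_leds
    if not detections:
        return frame
    nearest = min(detections, key=lambda d: d.distance_m)
    idx = led_index(nearest.bearing_deg, num_leds)
    if nearest.identified:
        # Flash green: lit only on the "on" phase of the blink cycle.
        frame[idx] = GREEN if blink_on else OFF
    else:
        frame[idx] = WHITE
    return frame
```

Calling `render_frame` once per display refresh, and toggling `blink_on` on a timer, would give a light that slides along the strip as the nearest person moves and flashes green once that person is identified.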
The PeopleLens offers an example of human-AI interaction that seeks to expand people's capabilities. As a system, it is not designed to operate on behalf of its users, replacing what might be thought of as an absence of sight. Rather, interaction with the system is intended to extend abilities: helping a user to achieve existing goals, adding and enriching information already relied upon, and building new strategies on top of existing ones. Our perspective on this more complex and unfolding coupling of people and technology seeks to approach people's capacities in expansive terms, seeing them as always emerging through interwoven relations.
This concrete example demonstrates the potential opportunities that an AI experience can provide when it is designed to work in concert with human capabilities.
* We acknowledge TH's considerable contribution to PeopleLens and refer to him (in abbreviated form) in this work with permission from him and his immediate family.
Cecily Morrison is a principal researcher at Microsoft Research, working at the intersection of human-computer interaction and artificial intelligence. She is interested in how our tools, models, and interfaces enable people to extend their own capabilities through using AI-infused systems.
Ed Cutrell is a senior principal researcher at Microsoft Research, where he explores computing for disability, accessibility, and inclusive design in the MSR Ability group. Over the years, he has worked on a broad range of HCI topics, ranging from input tech to search interfaces to technology for global development.
Martin Grayson is a principal research software design engineer at Microsoft Research who led the design and engineering of the PeopleLens.
Geert Roumen is a maker and interaction designer who recently graduated from the Umeå Institute of Design. He focuses on bridging the digital and physical world with a hands-on design approach, creating early prototypes and doing research in a playful yet serious way, to bring the design and people together from the start.
Rita Faia Marques is an associate designer whose work focuses on designing responsible AI systems.
Anja Thieme is a senior researcher in the Healthcare Intelligence group at Microsoft Research, designing and studying mental health technologies.
Alex Taylor is a sociologist at the Centre for Human Centred Interaction Design, at City, University of London. With a fascination for the entanglements between social life and machines, his research ranges from empirical studies of technology in everyday life to speculative design interventions.
Abigail Sellen is deputy director at Microsoft Research Cambridge, and has published on many different topics in HCI that put human aspiration front and center in designing new technology. A recent focus is on the intelligibility of AI systems, viewed through the lens of both HCI and philosophy.
Consider this scenario where the user combines his own abilities with the system's features, and where others also work with the user and system: The user walks into a familiar classroom. He hears three bumps (percussive sounds representing body presence) at about 10 o'clock (forward left). He guesses that three people are standing at the interactive whiteboard working on a problem set with their backs to him. As he shifts his gaze to the right, he hears a quick succession of bumps, which he guesses are other children sitting at their desks reading with their heads down. As he moves to his right, he hears a bump followed by a name, Jane. The user clicks his tongue and listens for its echo. He guesses that Jane is standing next to the wall, perhaps at the classroom coat rack. However, the user really wants to tell his friend Oscar about the new Lego set he got over the weekend. He heads in the direction of Oscar's seat. As he gets closer, he hears woodblock sounds, which prompt him to look down. Looking at the external interface on the user's headset, Oscar can see that the system recognizes him, but he's to the right of center so he moves slightly to be properly detected. Oscar's name is then read out. The user surmises that Oscar must be sitting down. He grabs a chair and pulls out a Lego figure for Oscar to see.
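The scenario describes directions on a clock face ("10 o'clock (forward left)"). As a rough illustration of that convention only, the sketch below converts a bearing into a clock position and counts how many body-presence bumps would fall at each position. The bearing convention (0 degrees is straight ahead, positive is to the right) and the function names `clock_position` and `bumps_by_clock` are assumptions, not part of the PeopleLens system.

```python
from collections import Counter
from typing import Dict, Iterable


def clock_position(bearing_deg: float) -> int:
    """Convert a bearing to a clock hour from 1 to 12.

    Assumes 0 degrees is straight ahead (12 o'clock) and positive
    bearings turn clockwise, so +90 is 3 o'clock and -60 is 10 o'clock.
    """
    hour = round((bearing_deg % 360) / 30) % 12
    return 12 if hour == 0 else hour


def bumps_by_clock(bearings: Iterable[float]) -> Dict[int, int]:
    """Count detected bodies (one bump each) per clock position."""
    return dict(Counter(clock_position(b) for b in bearings))


# Three people near the whiteboard at forward left, one person to the right.
print(bumps_by_clock([-65.0, -60.0, -55.0, 40.0]))
```

Each clock hour covers 30 degrees, so nearby people collapse into the same position, which matches the way the scenario groups "three bumps" at a single direction.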
Copyright held by authors
The Digital Library is published by the Association for Computing Machinery. Copyright © 2021 ACM, Inc.