Adam Williams, Francisco Ortega
A popular technique for developing gesture-interaction sets is elicitation. Elicitation is a type of participatory design in which a user is tasked with producing interaction techniques for emerging technologies. Commonly this is done for gesture inputs, but it can be extended into nearly any input design space. Elicitation is typically done by showing the desired outcome of the command to be generated (the referent), then recording what a participant produces, for example, a gesture they come up with. After a set of gestures is collected, raters place them into equivalence classes. Suppose a swipe with two fingers and a swipe with one finger are both produced. Work has shown that participants typically don't care much about the number of fingers used in a gesture, so it is reasonable to bin those gestures into the same category. After this step, various agreement-based metrics can be used to find which gestures are the most common, or most agreed upon, by all participants.
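The binning and agreement steps can be sketched in a few lines of Python. The article does not say which agreement metric was used; the sketch below assumes the agreement rate (AR) proposed by Vatavu and Wobbrock, which is one widely used option. The proposals, the `equivalence_class` rule, and all names are hypothetical and serve only as illustration.

```python
from collections import Counter

# Hypothetical proposals for one referent (e.g., "move object left").
# Each entry is (gesture form, number of fingers); this is not data from the article.
proposals = [
    ("swipe", 1), ("swipe", 2), ("swipe", 1),
    ("push", 1), ("push", 1), ("grab", 5),
]


def equivalence_class(proposal):
    """Assumed rater rule: ignore the finger count and bin by gesture form only."""
    form, _fingers = proposal
    return form


def agreement_rate(labels):
    """Agreement rate: AR = |P|/(|P|-1) * sum((|Pi|/|P|)^2) - 1/(|P|-1).

    |P| is the number of proposals for the referent and |Pi| is the size of
    each equivalence class.
    """
    n = len(labels)
    if n < 2:
        raise ValueError("agreement rate needs at least two proposals")
    sum_of_squares = sum((count / n) ** 2 for count in Counter(labels).values())
    return n / (n - 1) * sum_of_squares - 1 / (n - 1)


labels = [equivalence_class(p) for p in proposals]
bins = Counter(labels)
ar = agreement_rate(labels)

print(bins.most_common())  # classes ordered from most to least proposed
print(round(ar, 3))
```

With this rule the one-finger and two-finger swipes land in the same class, so the swipe class grows and agreement for the referent rises. A stricter rule that kept finger count would split the class and lower the score.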
This type of study design has the benefit of producing inputs that are more memorable, discoverable, and preferred than expert-defined gestures. However, within these studies, there is a recurring theme: People frequently suggest a gesture that is the same as one found in previous technology. This is called legacy bias. Legacy bias arises from a user's desire to minimize cognitive effort by transferring existing knowledge of previous systems into their interactions with new ones.
This bias has benefits. The gestures it produces are memorable to users, feel natural, and are discoverable because of the user's past exposure to the interactions. But legacy bias is not without costs, which arise when a gesture is produced that does not quite match the capabilities of the new interaction space. A salient example of this is holding one's hand as though grasping an invisible mouse and mimicking clicking on a multitouch surface (Figure 1). This interaction does not use the capabilities of the new interface. However, it does tap into the person's existing interaction knowledge and the affordances of a mouse: Clicking on something selects it.
Figure 1. Tapping gesture, single finger. Left: mouse click emulated. Right: midair tap.
While this clicking gesture is considered legacy, what if the person is only tapping the screen and not holding their hand in the shape of a mouse? In this article we explore that categorization, proposing the term evolutionary gesture to refer to gestures that are not quite legacy but whose form is based on a user's past experiences with technology. These gestures can be seen as a direct extension of legacy gestures, but they don't seem to fit into the same equivalence class. Consider the two-finger zoom-in and zoom-out commands found on touchscreens (Figure 2). In 3D space, perhaps in virtual or augmented reality, that gesture may be represented by two hands collapsing toward the center or expanding from the center (Figure 3). While the number of fingers used may be safely ignored when creating equivalence classes, the size of the gesture and the number of hands used cannot be.
Figure 2. One-hand, two-finger pinch gesture. Left: touchscreen zoom-out. Right: touchscreen zoom-in.
Figure 3. Two-handed expansion gesture. Midair bimanual expansion.
Many ways to reduce legacy bias, or to use it to one's advantage, have been suggested [4,5]. These include production, partnered elicitation, and priming. Production is asking participants to generate more than one gesture proposal. The idea is that the first gesture produced may be legacy, but a participant who is asked for a different one, having exhausted the legacy gesture, will generate a new gesture. Partnered elicitation is placing users in pairs or small groups and having them work together to generate new gestures by playing a variation of charades. The premise is that a participant who knows the referent should be guessable by their partner will put more effort into creating an appropriate gesture proposal. Priming is administering some sort of task or activity to a participant before eliciting a gesture. An example is having participants do large-range body motions like jumping jacks before producing a full-body midair gesture, which may cause them to generate a more physically involved gesture.
Without a separate equivalence class for evolutionary gestures, it is difficult to quantify the effectiveness of these reduction techniques. It is obvious that a successful production method would change a touchscreen tapping gesture into a midair grabbing gesture in the case of selection. However, it would be important to note whether the produced gesture was an evolution of the touchscreen gesture. This information is lost when the equivalence classes, as defined by the study's raters, place legacy gestures and near-legacy gestures in the same class, or when the opposite occurs and this middle category gets binned with the more extreme gestures produced.
Legacy bias can be a great tool for creating easily discoverable and highly memorable gesture interaction sets. This means that legacy gestures can reduce the learning curve for interactions with new technologies. A gentler learning curve could lead to improved adoption of new interaction technologies and increase the acceptance of gesture inputs. The issue with legacy bias arises when it results in improper utilization of the new interaction space (in this case, midair gesturing), which can manifest as gestures that are less than optimally ergonomic or appropriate.
Introducing this new bin of evolutionary gestures makes it possible to calculate the reduction of legacy bias and the transfer of legacy-informed gestures. As elicitation studies' main goal is to derive interaction sets from users that enable more natural-feeling interfaces, this new equivalence class is critical to helping design new gesture sets that tap into the benefits of legacy bias while minimizing the ergonomic costs incurred by doing so.
In a recent elicitation study of midair gestures, we found 79 unique binned gestures. This study was aimed at reducing legacy bias in gestures by using production. Participants were asked to produce three unique gestures for each referent. To measure this effect, we had two raters go through each of the 79 gestures and label them as legacy gestures, evolutionary gestures, or new gestures. They then made a second pass using only the labels legacy gesture and new gesture. With the two-level labels, nine legacy gestures and 70 new gestures were identified (about 11 percent legacy gestures). With the evolutionary gesture label added, the counts were six legacy gestures, 16 evolutionary gestures, and 57 new gestures, bringing the percentages to 8 percent legacy gestures and 20 percent evolutionary gestures.
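The arithmetic behind these percentages can be checked with a short Python sketch. The counts are the ones reported above; the function and variable names are illustrative. The last two values are net differences between the two labelings, since per-gesture relabeling is not reported.

```python
def shares(counts):
    """Return each label's share of all binned gestures, as a whole percent."""
    total = sum(counts.values())
    return {label: round(100 * n / total) for label, n in counts.items()}


# Counts reported for the 79 unique binned gestures.
two_level = {"legacy": 9, "new": 70}
three_level = {"legacy": 6, "evolutionary": 16, "new": 57}

two_level_pct = shares(two_level)
three_level_pct = shares(three_level)

# Net change per label when the evolutionary bin is introduced.
net_from_legacy = two_level["legacy"] - three_level["legacy"]
net_from_new = two_level["new"] - three_level["new"]

print(two_level_pct)
print(three_level_pct)
print(net_from_legacy, net_from_new)
```

On these counts, the evolutionary bin draws a net three gestures from the legacy label and a net 13 from the new label, so a two-level scheme misplaces near-legacy gestures in both directions.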
Next, we analyzed the impact of production by looking at the gestures produced by participants in sequence. Each participant was asked to produce three gestures, so we looked at the counts for each bin of gesture produced over those three tries. The referents with the highest rate of legacy bias were those for translating objects and selecting objects. When asked to translate an object to the left, participants often used a touchscreen swipe gesture. The selection gesture was often pointing or midair tapping (think back to the mouse example in Figure 1).
For the referents involving translation along an axis and selection, we see a slight legacy-bias reduction from using production. With two bins, 72 percent of the first gestures, 83 percent of the second gestures, and 88 percent of the third gestures produced were not biased. When we analyzed the same data with three bins, we found that 68 percent, 79 percent, and 85 percent of the produced gestures were not biased (in that order). The missing percentages were found in the evolutionary gestures. This shows the same effect of legacy reduction, but it also captures a shift from legacy gestures to evolutionary gestures, and then finally to new gestures.
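The per-attempt comparison can be laid out as a small Python sketch. It uses only the percentages reported above and reads the gap between the two labelings as the share of gestures that the three-bin scheme moves out of the not-biased group and into the evolutionary bin; that reading, and the variable names, are assumptions rather than the authors' stated procedure.

```python
# Percent of gestures labeled "not biased" on the first, second, and third attempt.
not_biased_two_bins = [72, 83, 88]
not_biased_three_bins = [68, 79, 85]

# Percentage points that leave the not-biased group once the
# evolutionary bin exists (assumed reading of "the missing percentages").
moved_to_evolutionary = [
    two - three for two, three in zip(not_biased_two_bins, not_biased_three_bins)
]

# Overall gain in not-biased gestures from the first to the third attempt.
gain_two_bins = not_biased_two_bins[-1] - not_biased_two_bins[0]
gain_three_bins = not_biased_three_bins[-1] - not_biased_three_bins[0]

for attempt, points in enumerate(moved_to_evolutionary, start=1):
    print(f"attempt {attempt}: {points} points counted as evolutionary")
print(gain_two_bins, gain_three_bins)
```

Both labelings show a similar first-to-third gain, which matches the claim that the overall reduction effect is preserved. What the three-bin labeling adds is the per-attempt share that would otherwise be counted as unbiased.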
In a separate midair gesture elicitation study, 56 binned gestures were produced. Of those, 10 were legacy under the two-bin system. With three bins, four were legacy and six were evolutionary. This highlights some of the potential data loss found when only considering gestures as legacy or not legacy. When we look at the gestures proposed for each referent in aggregate, we get 11 percent legacy gestures if only two bins are allowed and 8 percent legacy gestures with 4 percent evolutionary gestures when three bins are allowed. The addition of the new bin helps uncover a more accurate representation of legacy gestures compared with legacy-informed gestures.
These differences in data interpretation show that having the two-label system can inflate the number of legacy gestures encountered. This inflation can impact how the overall reduction in legacy gestures is interpreted. When the three-level system is used, the count of legacy gestures proposed decreases. Additionally, there is now a separate set of evolutionary gestures. By using these two sets in tandem, we can more accurately gauge the impact of production on legacy-bias reduction in gesture-elicitation studies. The evolutionary gesture category also allows us to calculate a more granular transition from legacy gestures to new gestures. This granularity is critical to showing how a legacy-bias-reduction method is impacting the gestures produced. In its absence, reduction methods may appear less effective than they are (near-legacy gestures inflate the legacy count) or more effective than they are (near-legacy gestures are counted as new).
For those reasons, we conclude that in order to more accurately quantify the impact of legacy-bias-reduction methods, a new bin of gestures near but not quite legacy should be added. We call this bin evolutionary gestures. With this new term, we can better measure how gestures evolve when transitioning from one technology to another. Monitoring this will help us to better tap into the benefits of legacy gestures (e.g., memorability, familiarity) while minimizing the downsides (e.g., low fit with new technology, poor ergonomics). With all three bins available, a designer of inputs for new technologies could pick the legacy-informed gestures that may have high memorability due to users' past exposure to similar gestures. These gestures could also be selected to limit the impact of poor fit, ergonomic or otherwise, with the new interaction space.
2. Vatavu, R-D., and Wobbrock, J.O. Between-subjects elicitation studies: Formalization and tool support. Proc. of the 2016 CHI Conference on Human Factors in Computing Systems. ACM, New York, 2016, 3390–3402.
3. Nacenta, M.A., Kamber, Y., Qiang, Y., and Kristensson, P.O. Memorability of pre-designed and user-defined gesture sets. Proc. of the SIGCHI Conference on Human Factors in Computing Systems. ACM, New York, 2013; https://doi.org/10.1145/2470654.2466142
Adam S. Williams is a Ph.D. student in computer science at Colorado State University. His research is on multimodal inputs for augmented reality, specifically user-elicited gestures and speech interactions. His research goals are to create novice-friendly interactions for 3D learning environments. AdamWil@colostate.edu
Francisco R. Ortega is an assistant professor at Colorado State University and director of the Natural User Interaction Lab (NUILAB). His main research area focuses on improving user interaction in 3D user interfaces by eliciting (hand and full-body) gesture and multimodal interactions, developing techniques for multimodal interaction, and developing interactive multimodal recognition systems. email@example.com
Copyright held by authors. Publication rights licensed to ACM.
The Digital Library is published by the Association for Computing Machinery. Copyright © 2020 ACM, Inc.