Manual and Cognitive Benefits of Two-Handed Input: An Experimental Study
IBM Almaden Research Center
Stretchy widgets, such as rubber-band lines, have been a part of GUIs for over a decade. But, with only a few notable exceptions, such as Krueger (1983), all of these stretchy widgets are "nailed down" at one end. Consequently, the way that we interact with rubber-band lines, for example, resembles how we interact with a catapult in the physical world, rather than how we typically manipulate an elastic band.
Bearing this in mind, consider the task of sweeping a rectangle around an oval so that each side comes within one pixel of the lateral and vertical extremities of the oval. Figure 1 is an example of my attempt to do this with the Paint program supplied with my PC. The technique used, which is the norm with GUIs, involved selecting point A, dragging the lower right corner of the rectangle, and releasing the mouse button at point B.
The problem in performing the task is in selecting where to start (point A). One has to sight laterally and vertically in order to line the point up with the top and left extremities of the oval.
The figure, as shown, represents my fourth attempt to perform the task as specified. Even then, I did a pretty poor job of it (I started too high in this case).
While we don’t spend the bulk of our lives sweeping rectangles around ovals, the example is representative of a larger class of tasks that we do frequently perform, such as sweeping out bounding boxes, selecting by dragging marquees, and drawing geometry. All of these share the same type of problems encountered in this "simple" example.
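The difficulty is purely one of interaction, not geometry: the target rectangle in the example is simply the oval's axis-aligned bounding box, fully determined by its center and radii. A minimal sketch (the coordinates and function name are illustrative assumptions, not from the paper):

```python
def oval_bounding_box(cx, cy, rx, ry):
    """Return (left, top, right, bottom) of the rectangle that touches
    the lateral and vertical extremities of an oval centered at (cx, cy)
    with horizontal radius rx and vertical radius ry."""
    return (cx - rx, cy - ry, cx + rx, cy + ry)

# An oval centered at (100, 80) with radii 40 and 25:
print(oval_bounding_box(100, 80, 40, 25))  # (60, 55, 140, 105)
```

The machine could compute points A and B exactly; the user, sighting by eye, cannot.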
"Manual and Cognitive Benefits of Two-Handed Input: An Experimental Study," is a first attempt to investigate this class of task in a formal way. In it, we compare the conventional means of task performance with two two-handed techniques. One is based on Krueger’s two-handed stretchy technique, where, for example, corner A in Figure 1 would be held in the left hand, corner B in the right hand, and both stretched simultaneously. The benefit of this approach, in contrast with the status quo, is that one is not committed to the final position of either corner until the full criteria of the task are met.
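The contrast with the status quo can be made concrete with a minimal sketch (not the paper's implementation; the function names and two-pointer model are assumptions):

```python
def one_handed_rect(anchor, cursor):
    # Conventional rubber-band: corner A (the anchor) is committed the
    # moment the drag begins; only corner B (the cursor) remains adjustable.
    (ax, ay), (bx, by) = anchor, cursor
    return (min(ax, bx), min(ay, by), max(ax, bx), max(ay, by))

def two_handed_rect(left_pointer, right_pointer):
    # Krueger-style stretchy rectangle: both corners track live pointers,
    # so neither position is committed until the whole rectangle meets
    # the task criteria. Geometrically identical; interactionally not.
    return one_handed_rect(left_pointer, right_pointer)

# Same rectangle either way, but in the two-handed case both arguments
# can still be refined on every input event:
print(one_handed_rect((10, 10), (5, 20)))  # (5, 10, 10, 20)
```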
The second two-handed technique investigated was a variation on this which employed the Toolglass technique described by Bier, Stone, Pier, Buxton and DeRose (1993).
The paper describes two experiments that were performed in order to gain some insight into which technique performed best, and the underlying mechanisms that produced the results observed.
In terms of performance, as predicted, both two-handed techniques significantly out-performed the conventional one-handed one. Contrary to expectations, the Toolglass technique only matched, rather than exceeded, the performance of the simple two-handed stretchy technique. Perhaps most interesting of all, the difference in performance between the one-handed and two-handed techniques was not explainable in terms of time-motion efficiency alone, leading to the conclusion that there was a cognitive advantage as well. This is because, our paper argues, two-handed input elevates object manipulation to a form closer to the user's cognitive representation of the task: the need to cognitively visualize where the correct control points should be (Figure 1) is thus externalized. The paper uses the concept of an "appropriate level of chunking" as a more general framework for organizing task elements.
In conclusion, the paper makes three contributions. First, it studies a class of tasks that are relevant to the real world, in that they commonly occur in conventional GUIs. Second, while there is an extensive literature covering so-called Fitts' Law selection (Fitts, 1954), there is virtually no previous literature investigating such area-sweeping selection tasks. Finally, the paper extends our theoretical understanding of the benefits of bimanual techniques.
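For context, Fitts' Law models the time to select a target of width W at distance D as MT = a + b·log2(2D/W), with device-dependent constants a and b; no comparably established model exists for the area-sweeping tasks studied here. A minimal sketch (the constants below are illustrative assumptions):

```python
import math

def fitts_index_of_difficulty(distance, width):
    # Fitts (1954) formulation: ID = log2(2D / W), measured in bits.
    return math.log2(2 * distance / width)

def predicted_movement_time(distance, width, a=0.1, b=0.15):
    # a and b are device-dependent regression constants; the values
    # here are assumed purely for illustration.
    return a + b * fitts_index_of_difficulty(distance, width)

print(fitts_index_of_difficulty(256, 16))  # 5.0 (bits)
```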
Bier, E.A., Stone, M.C., Pier, K., Buxton, W. and DeRose, T.D. (1993). Toolglass and magic lenses: The see-through interface. In Proceedings of SIGGRAPH '93, Annual Conference Series (Anaheim, California), Computer Graphics 27, 73-80.
Fitts, P.M. (1954). The information capacity of the human motor system controlling the amplitude of movement. Journal of Experimental Psychology 47, 6, 381-391.
Krueger, M. (1983). Artificial Reality. Reading, MA: Addison-Wesley.
The Integrality of Speech in Multimodal Interfaces
Michael A. Grasso, David S. Ebert and Timothy W. Finin
University of Maryland Baltimore County
For many applications, the human-computer interface has become a limiting factor. One example of this is in the field of health care, which imposes hands-busy restrictions during patient care. One way to address these restrictions is to use a speech-based computer interface. Speech is a natural form of communication that is pervasive, efficient, and can be used at a distance. However, the widespread acceptance of speech-based systems has yet to occur. This effort sought to cultivate the speech modality by evaluating it in a multimodal environment with direct manipulation.
We proposed that a multimodal interface works best when the input attributes are perceived as separable and that a unimodal interface works best when the inputs are perceived as integral. Following the theory of perceptual structure [1], the attributes of an input task were defined as integral if they could not be attended to individually; otherwise they were separable. This was also based on work by Jacob with unimodal interfaces [2] and the finding that contrastive functionality can drive a user's preference of input devices [3].
We developed a biomedical software prototype with two interfaces to test this hypothesis. The congruent interface used speech and mouse input in a way that matched the perceptual structure of the input attributes while the baseline interface did not. Twenty pathologists evaluated the interface in an experimental setting where they read tissue slides and entered histopathologic observations. The independent variables were interface type and slide order. Dependent variables were task completion time, speech errors, mouse errors, diagnosis errors, and user acceptance.
As shown in Figure 1, using the congruent interface decreased task completion time by 22.5% (p<.001), reduced speech errors by 36% (p<.01), and improved user acceptance by 9.0% (p<.05). Differences in mouse and diagnosis errors were not significant. Note that a lower acceptability index (AI) was indicative of higher acceptance. User acceptance correlated with speech recognition errors (p<.01) and domain expertise (p<.01), but not with task completion time.
The results of this experiment supported the hypothesis that the perceptual structure of an input task is an important consideration when designing a multimodal computer interface. Task completion time, the number of speech errors, and user acceptance all improved when the interface best matched the perceptual structure of the input attributes.
1. Garner, W.R. 1974. The Processing of Information and Structure. Lawrence Erlbaum, Potomac, Maryland.
2. Jacob, R.J., Sibert, L.E., McFarlane, D.C., Mullen, M.P. Jr. 1994. Integrality and Separability of Input Devices. ACM Transactions on Computer-Human Interaction, 1, 1, 3-26.
3. Oviatt, S.L. and Olsen, E. 1994. Integration Themes in Multimodal Human-Computer Interaction. In Proceedings of the International Conference on Spoken Language Processing, Volume 2, Acoustical Society of Japan, 551-554.
©1999 ACM 1072-5220/99/0500 $5.00