Raquel Prates, Clarisse de Souza, Simone Barbosa
In parallel with software usability, we can then assess software communicability. Communicability is the property of software that efficiently and effectively conveys to users its underlying design intent and interactive principles. Thus, the goal of the communicability evaluation method is to let designers appreciate how well users receive the intended messages conveyed through the interface and to identify communication breakdowns that may take place during interaction. This method is carried out in three steps that can be performed by different groups of people (users, designers, human-computer interaction [HCI] experts, and semiotic engineering experts). It yields distinctive types of representations about interaction, which tell us something about user-system interactive patterns and designer-to-user (intentional or unintentional) communication.
As users try to meet their needs and expectations by exchanging messages with the system, they should be able to grasp the designer's conception of who they are and what they want. A system can be perceived as a "discourse deputy" for the designer. The deputy can only communicate to users the set of conversational turns and themes that the designer has predicted at design time. Conversely, users can only tell the deputy what the communicative language allows them to say. Furthermore, users can only communicate with the designers' deputies, not with the designers themselves. Therefore, unlike in human conversation, where message sender and receiver can negotiate what they mean, interactive words and phrases in HCI have a fixed meaning (because of implementation) when used by either party. What varies is the intention of use, which depends on contextual features. As a consequence, user interfaces should help users negotiate what they mean by what they say and what is meant by the designer's deputy. For instance, applications that provide an undo feature support this negotiation by allowing users to correct their misconceptions at low cost.
Figures 1 and 2 show examples of high and low communicability, respectively. In Figure 1, an analogy with an existing physical device is used effectively to tell users what they must do to play back compact discs on the computer. In Figure 2, however, within the context of searching for a computer, some readily available operations (in the pull-down menu) are completely unrelated to the present context of discourse (they relate instead to file manipulation). Users have little chance of assigning any meaning to the corresponding interface elements, since these refer to an entirely different context of use, which may have long been forgotten.
The communicability evaluation method described in this paper provides a way for evaluators to identify points at which the designer may have failed to convey his intended message to users, as well as a way for users to communicate to the actual designers, albeit indirectly, what they have not understood or agreed with. When users perform the communicability evaluation, they can spontaneously express their expectations, attitudes, interpretations, approval, or rejection regarding the HCI design choices present in the software. When designers or experts perform the test, they produce what should be perceived as an inferred message about the same topics, qualified by the evaluator's background and expertise in HCI.
Communicability evaluation can be carried out at different stages of design and serve different purposes. In formative evaluation, it can help designers compare design alternatives and make further design decisions. In summative evaluation, it can inform the changes needed in new releases. Compared with other evaluation methods, ours focuses on the signs, structures, and conversational patterns presented to users at the interface level, signaling the immediate interpretations assigned to them and the role they play in user-to-system and designer-to-user communication.
The communicability evaluation method consists of three major steps: tagging, interpretation, and semiotic profiling. Each of these steps requires different expertise from the evaluator and yields a distinctive type of representation about the interaction. Users, designers, HCI experts, or semiotic engineering experts may perform the first step, tagging, which identifies communication breakdown points. The next step, interpretation, maps these breakdowns to HCI problems. In this step an HCI expert is usually necessary. Semiotic profiling, the last step, requires a semiotic engineering expert and yields a characterization of the overall message conveyed by the system.
Tagging amounts to "putting words in the user's mouth" while observing his actions during goal-oriented interactions. The "words" are selected from a set of utterances that express a user's reaction to what happens during interaction (e.g., "Oops!" or "What's this?" or "Where is that?" and the like) when conversational breakdowns occur. The result of tagging is a correlation of elements of a predefined set of utterances (to be discussed later) with a sequence of actions in the interface. These actions for accomplishing a predefined task must be recorded using software that is able to capture mouse-pointer movements and other screen events (e.g., Lotus® ScreenCam).
This step may be performed by users, designers, or experts. Users may perform the tagging either while executing a task or afterwards, by watching a screen movie of their own interaction. In these cases, we capture a spontaneous response to interaction patterns, cast in the form of one of the available utterances, as a kind of constrained thinking-aloud protocol. When designers or HCI experts do the tagging, they identify the interactive breakdowns that users experienced and express them by means of the same set of utterances. That is why we say that they put words in the users' mouths, in an attempt to recreate a verbal protocol. Designers and experts could also videotape users' behavior and take notes during test sessions and use this material for occasional disambiguation during the tagging process.
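The product of tagging can be thought of as an ordered list of records correlating utterances with recorded interface events. The sketch below is only an illustration of such a record, assuming hypothetical names and fields (the method itself prescribes no data format):

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical record type for one tagging in a recorded session;
# the field names are illustrative, not part of the method.
@dataclass
class Tagging:
    timestamp: float   # seconds into the screen movie
    utterance: str     # one of the predefined utterances
    element: str       # interface element the action involved

# A tagged movie is simply an ordered sequence of such records.
tagged_movie = [
    Tagging(12.4, "Where is?", "toolbar"),
    Tagging(31.0, "What's this?", "Edit menu item"),
    Tagging(35.2, "What happened?", "error dialog"),
]

# The taggings can later be tabulated by utterance for the
# interpretation step.
counts = Counter(t.utterance for t in tagged_movie)
print(counts["Where is?"])  # → 1
```

An ordered sequence like this also preserves the temporal structure needed later for analyzing sequences of utterances.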
The following set of utterances is the one we have selected to express different kinds of breakdown situations and user attitudes likely to occur during human-computer interaction. They can be mapped to various ontologies of HCI design problems or guidelines and represent categories of HCI phenomena. The utterances in parentheses are variants of the most representative utterance associated with the interactive pattern described.
- Where is? (What now?)
The user seems to be searching for a specific function but demonstrates difficulty in locating it. So, he sequentially (worst case) or thematically (best case) browses menus and/or toolbars for that function, without triggering any action. This category includes a special case we have called What now?, which applies when a user is clearly searching for a clue about what to do next, rather than for a specific function that he hopes will achieve what he wants to do.
- What's this? (Object or action?)
The user seems to be exploring the possibilities of interaction to gain more (or some) understanding of what a specific function achieves. He lingers on some symbol waiting for a tool tip and/or explicitly calls for help about that symbol, or he hesitates between what he thinks are equivalent options. This category also includes cases in which users are confused about widgets being associated with objects instead of actions and vice versa (Object or action?).
- Oops! (I can't do it this way./ Where am I?)
This category accounts for cases in which a user performs some action to achieve a specific state of affairs, but the outcome is not what he expected. The user then either immediately corrects his decision (typically via Undo or by attempting to restore some previous state) or completes the task with an additional sequence of actions. Sometimes the user follows some path of action and then realizes that it's not leading him where he expected. He then cancels the sequence of actions and chooses a different path. In this case the associated utterance is I can't do it this way. This category includes another one, Where am I?, in which the user performs some action that is appropriate in another context but not in the current one.
- Why doesn't it? (What happened?)
This category involves cases in which the user expects some sort of outcome but does not achieve it. He then insists on the same path, as if he were so sure that some function should do what he expects that he simply cannot accept the fact that it doesn't. Movies show that users carefully step through the path again and again to check that they are not doing something wrong. The alternative scenario (What happened?) occurs when users do not get feedback from the system and are apparently unable to assign meaning to the function's outcome, halting for a moment.
- Looks fine to me...
The user achieves some result he believes is the expected one. At times he misinterprets feedback from the application and does not realize that the result is not the expected one.
- I can't do it.
The user is unable to achieve the proposed goal, either because he does not know how to or because he does not have enough resources (time, will, patience) to do it.
- Thanks, but no, thanks. (I can do otherwise.)
The user ignores some preferential intended affordance present in the application's interface and finds an alternative way of executing the task to achieve his goal. If the user has successfully used the afforded strategy before and still decides to switch to a different path of action, then it is a case of Thanks, but no, thanks. If the user is not aware of the intended affordance or has not been able to use it effectively (that is, he has probably uttered an "I can't do it this way" right before he engaged in an alternative path), then it is a case of I can do otherwise. Whereas Thanks, but no, thanks is an explicit declination of some affordance, I can do otherwise is a case of missing some intended affordance.
Notice that all the utterances could be expressed naturally by the users, except for Looks fine to me... and I can do otherwise. These utterances are tagged to a user's actions only when the user is unaware of the result of his actions or of some function afforded by the interface. But if someone is unaware of something, he cannot possibly tell what he is unaware of. Therefore, these taggings can be produced by users themselves only after the fact, that is, when they watch a movie of their own performance and realize then what they didn't realize before.
For instance, in Figure 2 we can assign the utterance "What's this?" to the Edit > Undo Copy menu item and "What happened?" to the moment when the error message appears.
For samples of communicability tagged movies, visit http://peirce.inf.puc-rio.br/ and select SERG resources (under Communicability Tagging).
This step consists of tabulating the gathered data and mapping the utterances onto HCI ontologies of problems or design guidelines. This step must be done by an HCI expert, unless the mapping has been predefined. In that case, designers can benefit from some sort of automatic mapping and obtain a mechanically generated diagnosis of interaction problems. We have associated the seven categories of HCI phenomena (as represented by their corresponding utterances) with four high-level classes of interaction and usability problems, as shown in Table 1.
Notice that navigation, meaning assignment, task accomplishment, and missing of affordance are known usability problems. However, communicability evaluation reveals yet another class of problems: declination of affordance. Popular sets of design guidelines or usability principles do not address this problem explicitly, nor do cognitively based evaluation methods and techniques. Nevertheless, in communicability evaluation, this phenomenon can be perceived and can be used to refine HCI problem taxonomies. Users generally decline an affordance when they regard the cost-benefit ratio for an afforded feature as disadvantageous, compared with an alternative way of performing the same task. Among the causes of declination of affordance are inconvenient navigational structures, such as deep nesting in menu structures or lack of shortcuts.
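When the utterance-to-class mapping has been predefined, the tabulation described above can indeed be mechanized. Since Table 1 is not reproduced here, the class assignments in the following sketch are assumptions for illustration only; the point is the mechanical diagnosis, not the specific mapping:

```python
from collections import Counter

# ASSUMED mapping from utterance categories to problem classes.
# The actual assignments in the paper's Table 1 may differ.
UTTERANCE_CLASS = {
    "Where is?": "navigation",
    "What's this?": "meaning assignment",
    "Oops!": "navigation",
    "Why doesn't it?": "meaning assignment",
    "Looks fine to me...": "task accomplishment",
    "I can't do it.": "task accomplishment",
    "I can do otherwise.": "missing of affordance",
    "Thanks, but no, thanks.": "declination of affordance",
}

def diagnose(utterances):
    """Tabulate tagged utterances into high-level problem classes."""
    return Counter(UTTERANCE_CLASS[u] for u in utterances)

result = diagnose(["Where is?", "Oops!", "What's this?"])
print(result["navigation"])  # → 2
```

Such a table turns a tagged movie into a first-pass profile of where the interface's communication is breaking down, which an HCI expert can then interpret.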
HCI experts may use alternative taxonomies for a more precise user-system interaction diagnosis. For instance, utterances may be mapped to such distinct taxonomies as Nielsen's discount evaluation guidelines, Shneiderman's eight golden rules, Norman's gulfs of execution and evaluation, or even Sellen and Nicol's taxonomy of contents for building online help.
Regardless of the taxonomy chosen for mapping the utterances, during interpretation HCI experts identify the main interaction problems at the interface. By examining the utterances, the expert has some indication of the cause of a problem and, thus, of its solution. The taggings capture the symptoms of interaction breakdowns quite precisely, and thus give a more detailed and refined indication of these problems. For instance, both the "Why doesn't it?" and "What's this?" utterances may be mapped to a meaning assignment category. Nevertheless, the former indicates that users believe they understand the signs but are assigning the wrong interpretations to them, whereas the latter indicates that they cannot generate any interpretation at all. The expert can then plan the redesign of the communicative breakdown points of the interface, supported by more in-depth information. From this interpretation, the expert can also plan context-sensitive help, or the help module as a whole (especially if redesign is not an option or if the breakdown is not perceived as critical), as well as create the application's semiotic profile.
Profiling must be done by semiotic engineering experts. Profiling consists of interpreting the tabulation in semiotic terms, in an attempt to retrieve the original designer's meta-communication, that is, the meaning of the designer-to-user message. Thus, semiotic profiling adds value to the evaluation made during interpretation, since it goes beyond the communication breakdowns and interaction problems identified and tackles a more abstract level, the interface language.
At this level, the semiotic engineering expert deals with the implicit messages conveyed by the choice of signs, structures, and interactive patterns that compose the user interface. These messages may be intentional or unintentional and greatly influence the perceptions and reactions of users to software artifacts. Unintentional messages are generally the result of designers' tacit knowledge and assumptions. The role of the expert is to reveal these implicit messages to designers, who may then change or confirm their choices.
An example of the outcome of the semiotic profiling of an application is the realization that some piece of software is implicitly conveyed as a toolbox for solving a certain set of related problems, whereas another piece (a commercial competitor, for instance) is conveyed as a monitor of the user's action.
Communicability evaluation provides different instances of representation that can be used in different ways by people with different degrees and types of expertise. Semiotic engineering and HCI experts can use this method to evaluate interface design, identifying problems and proposing redesign choices. Designers can use it to predict or diagnose interaction problems. And users can use it as a means of direct or indirect communication with the designer. Direct communication takes place either when (a) designers have access to users' interaction through the movies and can recognize problems that occurred, or (b) designers have access to users' own taggings. Indirect communication can be mediated by (a) the HCI expert, who produces taggings and interpretations based on his technical knowledge and professional experience, or (b) the semiotic engineering expert, who can additionally build the application's semiotic profile. Moreover, taggings provide a common language in which users, designers, and HCI and semiotic engineering experts can share their knowledge.
As we said, communicability evaluation can be used at different stages of the design process. At early stages, it can serve as a formative evaluation tool, allowing designers to compare different design options or assess the choices they have made. In particular, our method can be used as an instrument for inspection evaluation, since designers and experts may try to put themselves in the users' shoes and tag potential interactive breakdowns. At later stages, it can be used as a summative evaluation tool to inform the features to be changed or added in future software releases.
Our method applies basically to single-user interfaces. Multi-user interfaces would probably require other utterances related to interacting with other users, such as Who are you? What are you doing? and Where are you? The same is true of artificial intelligence applications, for which utterances related to the system's cognitive abilities are likely to occur (e.g., Do you know this? Can you learn this?).
Compared with other evaluation methods, our method focuses on what is being said by the interface signs a user is supposed to interpret. Thus, we do not directly address problems in inadequate task or user modeling, except perhaps as a further inference on why certain communication is conveyed. By the same token, other methods typically do not directly address the problems we deal with. For instance, failure to provide feedback [e.g., 3, 7] may cause differentiated taggings (e.g., Why doesn't it? What's this? I can do otherwise.). The effect of differentiation can be noticed in redesign tasks or online help design, for example. Designers can use the tagging to decide which sign or message they will incorporate into the application so that the problem is solved or minimized.
Sequences of utterances may additionally provide relevant insights into how users interpret the designers' messages. For instance, a Where is? utterance may often be followed by an I can do otherwise utterance. In this case, the sequence probably indicates that the user fails to perceive some feature's intended affordance and thus that the designer is not getting the message across. Also, a sequence of Thanks, but no, thanks utterances is likely to indicate a mismatch between the designer's ideal user and the actual user participating in the evaluation.
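Because taggings are ordered in time, pattern-spotting of this kind can be sketched as a simple scan over the utterance sequence. The function below is a hypothetical illustration of detecting the Where is? followed by I can do otherwise pattern mentioned above, not part of the method itself:

```python
def missed_affordances(utterances):
    """Return the indices at which a 'Where is?' utterance is
    immediately followed by 'I can do otherwise.', suggesting the
    user missed an intended affordance."""
    hits = []
    for i in range(len(utterances) - 1):
        if (utterances[i] == "Where is?"
                and utterances[i + 1] == "I can do otherwise."):
            hits.append(i)
    return hits

# Example sequence from a hypothetical tagged session.
seq = ["What's this?", "Where is?", "I can do otherwise.",
       "Looks fine to me..."]
print(missed_affordances(seq))  # → [1]
```

An analogous scan (e.g., counting runs of Thanks, but no, thanks) could flag a mismatch between the designer's ideal user and the actual one.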
The goals and issues evaluated by usability and communicability methods are distinct and complementary. Thus, in order to have a broader evaluation of software, taking into consideration appropriateness to the task, users' performance, and communication of design intent, the expert should combine usability and communicability methods. In our user interface evaluation projects we have combined communicability evaluation (all three steps), user interviews, and interface inspection. As a result, we have been able to evaluate the interface thoroughly for the factors mentioned.
Some challenging issues about communicability evaluation remain, namely,
- Is this set of utterances appropriate? Is it technology dependent? Is it culturally determined?
If we allow people to tag movies with other utterances (in addition to the ones in our set), we may assess whether the set we are working with is satisfactory for the analysis. The same applies to specialized technologies, such as multi-user applications or artificial intelligence-based systems. Different cultures may also react in different ways to the same communicative acts (even in the absence of translation problems). The latter issue may be particularly interesting for software localization.
- What is the spectrum of taggings that can be done to the same movie by different groups of people (users, designers, and experts)?
We shall soon have the results of a case study in which interaction with a small application will be tagged by different groups of people. By contrasting the taggings, we expect to assess the range of plausible interpretations the same phenomena can yield, which is seldom achieved with other evaluation methods.
- How do utterances change along the users' learning curve?
An ongoing study with 6 participants using the same software over a period of 10 weeks will provide us with data for an appreciation of the types of utterances novices are likely to make, compared with more experienced users. For instance, Where is? can be expected to disappear, or at least to come down to an insignificant number of occurrences during a task. Conversely, Thanks, but no, thanks might become more frequent over time.
The authors would like to thank CNPq for their support. They also thank all their colleagues who discussed the issues presented in this paper and gave them insightful suggestions, in particular, Tom Carey, Michael Muller, Kevin Harrigan, and John Thompson. The members of the Semiotic Engineering Research Group and the graduate students of INF2062 at PUC-Rio also contributed significantly to the results presented here.
2. de Souza, C.S., Prates, R.O., and Barbosa, S.D.J. A method for evaluating software communicability. In Proceedings of the Second Brazilian Workshop in Human-Computer Interaction (IHC '99), forthcoming.
Raquel O. Prates
Departamento de Informática, PUC-Rio
Rua Marquês de São Vicente, 225
Rio de Janeiro, RJ, Brazil 22453-900
Departamento de Informática e Ciência da Computação, UERJ
R. São Francisco Xavier, 524 6o. andar
Rio de Janeiro, RJ, Brazil, 20550-013
Clarisse S. de Souza, and Simone D. J. Barbosa
Departamento de Informática, PUC-Rio
Rua Marquês de São Vicente, 225
Rio de Janeiro, RJ, Brazil 22453-900
Methods & Tools Column Editors
©2000 ACM 1072-5220/00/0100 $5.00