An 1878 brochure (at left) from the New England Telephone company, "How to Make a Telephone Call," explains, with illustrations, the use of its new instrument. One of the drawings "represents a person calling attention by pressing the knob at the end of Bell Box, and turning the crank, causing the Bell at the other station to ring. When the person at the other end hears the call, he will call back; then both will turn the switches to button marked T." As the instructions continue, "The Telephone can then be used." Of course, these early adopters began "using" the telephone the minute they pressed that knob and turned that crank. But the writer of this pamphlet understood that he was selling a revolutionary new experience, not the intricacies of a complicated machine. He needed to bracket off the preliminary manipulation of buttons and cranks as something different from this new and seemingly magical phenomenon of talking to another human over a great distance.
And yet those pesky controls required attention. This was a machine that demanded some expertise. As the pamphlet emphatically notes, "When you have finished talking, BE SURE AND TURN THE SWITCH TO BUTTON MARKED B." Oh, and watch out if the sky clouds up: "If a thunderstorm threatens, insert the plug that is supplied with the Bell, into the hole marked A." This presumably breaks the circuit between the two telephones and prevents electricity from a lightning strike from traveling along the wires that connect them. But remember to remove the plug the next time you want to make a call!
It is easy to laugh at how difficult previous generations of technology were to use, especially once they have matured and become everyday consumer items. Many improvements arise from better materials, manufacturing processes, and engineering: The physical design of a machine has to account for how the human body is built, what kinds of controls best match the capabilities of the human hand, ear, foot, etc. These early telephones required two hands and a great deal of manual dexterity to press knobs, turn cranks, hold ear- (and sometimes mouth-) pieces. But as readers of interactions know, advancements in industrial design must be informed by principles of human cognition that make using such devices something that we don't have to think about very much. Central to this process is an understanding of the role of mental and conceptual models in interaction design.
Donald Norman began this discussion for the current generation of interaction designers with the simple axiom, "A good conceptual model allows us to predict the effects of our actions." In Norman's view, systems should present accurate images of how they work, whether through the controls in the interface, or through accompanying marketing material, instruction manuals, diagrams, support sites, etc. Problems occur when the conceptual model is faulty, when the system's image of itself is inaccurate. This often results when marketers or technical writers attempt to simplify for consumers the intricate details of a design. When the system breaks down or doesn't perform the way a user expects, the inaccurate system image reveals a fissure between the user's understanding of how a system works and how it actually does work. In Norman's classic example, he has trouble adjusting the temperature of his refrigerator compartments: The system image indicates that the freezer and cooler compartments can be controlled independently, when in fact there is only one cooler for the unit and controls that direct more or less cool air to one section or the other. He finds it impossible to alter the temperature in one compartment without affecting the other compartment as well, and this contradicts the conceptual model presented by the refrigerator's system image: the diagram on the refrigerator itself.
What must be emphasized in any discussion of conceptual models is the knowledge that users bring with them from previous experiences with mechanical devices and information systems. When humans engage in learning, they attempt to assimilate the new experience into previously formed mental representations of reality. If the new experience matches those representations, learning occurs more easily. If there is a mismatch, real learning can occur only when the person alters his or her mental representations of reality in some way to accommodate the new experience. One can see how damaging a false system image can be to this process: Rather than strengthening a person's understanding of the world, the new learning experience corrupts it. We must always consider the previously formed mental representations of the world that people bring with them to new experiences with mechanical devices and information systems, for these inform their perception of the system's image and therefore the conceptual model they perceive as part of their experience.
Telegraph operators of the 19th century would have understood Bell's box telephone immediately. Because of their prior experiences, they had already built mental models that could assimilate the idea of electrical current traveling along a wire; the need for a closed circuit between two devices on the wire and the need to interrupt that circuit under certain circumstances; and the ability to convert electrical pulses into something else at the ends of the wire, such as the swings of a needle on a galvanometer, the series of dots and dashes in Morse code, or sound waves in the telephone. Telegraph operators also would have understood the knob and crank of the Bell Box as a way of generating current along the wire and ringing a bell to indicate at the other site that a response is requested. The plugs to open and close the circuit would likewise fit well with their mental picture of such instruments. For those who had prior experiences with it, the telegraph served as a valuable antecedent in the new experience of using the early box telephones.
But to people of the 1870s who did not have prior intimate experience with the telegraph, the new telephone was magical and frightening. Merritt Ierley notes, "the reaction was confusion or disbelief. Many people were apprehensive confronting a telephone for the first time. The disembodied sound of a human voice coming out of a box was too eerie, too supernatural, for many to accept." Only when the telephone became understood as a "speaking telegraph" did the masses become more comfortable with it. But this is merely another way of saying that people learned to assimilate the telephone into their established mental picture of how such devices work. And this could happen only when the everyday operation of the telephone became more like everyday uses of the telegraph. Ordinary people did not operate a telegraph machine. They handed their messages to the clerk, and he or she sent them along the wires. Likewise, the maturation of the telephone (the improvements in design that made it useful as an everyday appliance) took much of the operation of the apparatus out of the hands of the user, making it much easier for ordinary people to learn how to make a telephone call. Even the character of the first telephone conversations was determined by users' prior experience with the telegraph. Explaining their brevity, Ierley writes: "the telegraph was understood to be a medium for short, to-the-point, business-like messages. So too, it seemed, the telephone."
Our goals have not changed in the past hundred years and more. We want to write, to communicate, to buy and sell goods and services, to move from one place to another, to understand a problem, to make good decisions. But the technologies that help us achieve these goals do change, sometimes quite rapidly. They are "contingent" technologies in at least two important and connected ways. First, our ability to learn and use new technologies is contingent upon our experience with prior technologies. On a computer, for instance, each time we learn a new interaction idiom such as drag-and-drop, or double clicking, or scrolling, we adopt new ways of understanding how software applications and hardware devices work. We compare these new experiences to past ones; we recognize solutions to problems that previously puzzled us; we assimilate the new experience and store our new understandings so that they will serve as helpful antecedents in future encounters with unfamiliar technologies. We create new mental models that we carry with us, to help us predict how other software and devices will work when we encounter something slightly different or new. Our ability to learn and use something new is then (again) contingent upon our ability to apply antecedent experiences in productive ways. We "run the model," and we hope that the new technology operates according to the same rules that controlled the previous experience. In this fundamental way, our ability to use new technologies is contingent upon our prior experiences with other technologies.
But new technologies don't always follow the same principles and patterns as previous ones, and this is the other way in which they are contingent. They change. They take one form today, another tomorrow. They use language in unexpected ways and deploy idioms and metaphors inconsistently and confusingly. It becomes difficult to refer to antecedent experiences in order to learn how to use something new. There is too much incompatibility between our previously formed mental models and the new system before us. An online game that appears to allow direct manipulation in fact requires a memorized sequence of keystrokes. A line of text that looks like a hyperlink actually requires a double-click to activate. A custom-designed scrolling widget uses unconventional horizontal arrows in addition to vertical ones, thus making it difficult to access much of the content in an application. This kind of contingency in the technologies we use does not further our understanding. Rather, it makes it more difficult to learn new technologies because it introduces inconsistency and randomness, making it impossible to predict the effects of our actions. It hinders our ability to develop reliable mental models about the workings of the digital world in which we live and work.
A common example of the problem of contingency can be found at the checkout aisle of any grocery store or discount superstore. For several years now, people who use credit cards for purchases in such places have encountered a confusing interaction: After swiping our cards, we are prompted for our Personal Identification Number (PIN) to authorize the transaction. Observing our bewilderment, helpful clerks will ask the now familiar question, credit or debit? Answering "credit," we are then instructed to press the cancel button and proceed as usual. Sure enough, pressing cancel sends the process request to the bank and moves the transaction along, culminating in a request for an ink or digital signature. Sometimes pressing cancel leads to an intervening screen that again offers the choice of credit or debit. Today "Press Cancel to Proceed" is everywhere. Those of us who habitually pay with credit cards have accommodated this new aspect of reality to our mental representations of how the systems work. We now do it automatically, without bothering the cashier. But this is not legitimate new knowledge that will help users of information systems with other interactions. In every other situation, cancel means abort the procedure, stop, don't do anything.
Banks and stores have many reasons to favor (and default to) one form of payment or another. But the inconsistency in this interaction from store to store makes it clear that customer confidence and understanding about the transaction process is not high among them. The fact that "credit or debit" often really means "signature or PIN authorization," but the consumer rarely is informed of the implications of each, is further evidence that these systems are not designed to meet the user's needs. To further complicate matters, some banks encourage customers to choose credit at checkout, even when using a debit card. That is, they encourage signature-based authorizations over PIN-based ones, because signature-based authorizations are more profitable for the bank issuing the card. But they often disguise this fact with arguments about security and purchase protections.
One credit union, for instance, offers this advice on its Visa/ATM FAQ page: "Take advantage of all of the benefits of your VISA Check Card by always selecting 'credit.' The funds will still be withdrawn directly from your checking, and you will receive the purchase protection of VISA, a service that does not apply if you choose 'debit'." In other words, choose credit when you want debit. For those who use a debit card instead of a credit card as a way to control spending, it is probably clear enough that their debit card will subtract funds from the checking account regardless of the authorization method. But the fact that the choice of authorization method is disguised as a choice of payment account (credit or checking) is damaging to the customer's efforts to build an understanding of the way their everyday transactions work. The misleading system image confounds learning and understanding.
If you have made the necessary mental accommodation and you understand why you should select credit when you want debit, you might think you understand what happens when you enter a PIN to authorize a credit transaction when that is the default at the checkout. You simply use the same PIN that allows you to take a cash advance from your credit card at an ATM, for example. You are merely choosing to authorize the transaction with your PIN rather than with your signature. But sometimes "credit or debit?" does mean just that: Some credit cards are actually "dual access" cards and can be used as debit cards attached to checking accounts at the issuing bank. The cards themselves do not indicate this dual functionality, and the system provides no indication that in choosing to authorize the transaction with a PIN instead of a signature, the customer is actually requesting that the funds be taken from her checking account rather than her line of credit. The customer has every reason to believe that she is making a credit card purchase, but her choice of PIN-based authorization makes it a debit transaction instead. And of course it is overly generous to call this a "choice," since the system never actually presents the option of choosing debit instead of credit.
This is not the place to completely redesign the point-of-sale interaction, but we can point out what is missing from the current system and how some basic principles of interaction design can improve it. First and foremost, the system needs to present an accurate picture of which account one's money is being taken from. "Debit or credit?" should present a choice of accounts, not authorization method. If the industry needs to support multiple authorization methods, it should educate the customer at the point of sale about those choices: the screen should prompt us to "enter PIN or sign below to authorize." If PIN authorizations are less secure but transfer funds immediately, and signature authorizations carry more protections but take a day or more, the system should present that information in close proximity to the buttons on the screen. If the account the user selects has insufficient funds for the transaction, the system should present that information and present the option of continuing anyway (if the bank permits it), and should indicate how much will be charged for an insufficient-funds overdraft. These modifications improve the visibility of the system, as Norman defined this term: by making the correct controls visible and by making them convey the correct information. They enhance the feedback of the system by informing the user of what actions have been taken and what the consequences of further actions will be. By improving the visibility and feedback of the interaction, they help users build a conceptual model of the entire process that will help them understand when things go wrong, such as when funds are taken from the wrong account, or an unexpected overdraft occurs. Most important, this more robust conceptual model will serve to make future electronic financial transactions more comprehensible.
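The principles above can be made concrete with a minimal sketch. It is an illustration under stated assumptions, not a description of any real payment terminal: the names `Account`, `Authorization`, `Card`, and `describe_payment`, the balances, and the overdraft fee are all hypothetical. The point is only that the choice of account and the choice of authorization method are kept as two separate questions, and that the consequences of the choice are shown before the customer commits.

```python
from dataclasses import dataclass
from enum import Enum


class Account(Enum):
    """Where the money comes from (the real meaning of 'debit or credit?')."""
    CHECKING = "checking account"
    CREDIT = "line of credit"


class Authorization(Enum):
    """How the customer approves the payment; independent of the account."""
    PIN = "PIN"
    SIGNATURE = "signature"


@dataclass
class Card:
    # Hypothetical data in cents; a real terminal would ask the bank.
    available: dict            # maps Account -> funds available, in cents
    overdraft_fee: int = 3500  # illustrative figure, not a real bank's fee


def describe_payment(card, account, authorization, amount):
    """Return the lines a terminal could display before confirmation."""
    lines = [
        f"Pay {amount / 100:.2f} from your {account.value}.",
        f"Authorize with {authorization.value}.",
    ]
    shortfall = amount - card.available[account]
    if shortfall > 0:
        # Feedback: state the problem and the cost of continuing anyway.
        lines.append(
            f"Insufficient funds: short by {shortfall / 100:.2f}. "
            f"Continuing adds a {card.overdraft_fee / 100:.2f} overdraft fee."
        )
    return lines


card = Card(available={Account.CHECKING: 2000, Account.CREDIT: 50000})
for line in describe_payment(card, Account.CHECKING, Authorization.SIGNATURE, 2500):
    print(line)
```

Because the account and the authorization method are separate values, no combination requires the customer to press cancel in order to proceed, and a PIN can never silently redirect a purchase to a different account.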
Some retailers have done a better job than others with their point-of-sale payment systems. Some, for instance, allow the customer to select credit or debit after swiping their card, instead of defaulting to a PIN screen. The Giant Eagle grocery store in my area does this. And to its credit, it replaced the point-of-sale devices as I wrote this essay, and the new system works exactly the same way. My local Lowe's, on the other hand, changed from a system that offered the credit-or-debit choice to one that assumed debit/PIN by default and required the credit customer to press cancel to proceed. Others have eliminated the requirement to authorize payment entirely (and thus to choose an authorization method), typically for transactions below a certain amount. (Panera Bread is an example of a national company that does this.) We can hope that the less desirable interactions soon will be replaced by better ones; indeed, in some areas this transition to understandable interactions is occurring quite rapidly. But this is just another part of the problem: Implementations that are temporary or contingent, that change overnight without warning, make it more difficult for people to become habituated to the interaction and to build strong mental models to help them understand what is happening in a typical transaction.
The common cordless home telephone presents another case study in how inattention to the concepts of feedback, visibility, and mental and conceptual models can confound users. A standard analog telephone is already "on" when you lift the handset: The perfectly pitched dial tone provides feedback that the system is ready to receive one or more numbers as input, and a slightly differentiated series of tones provides additional feedback as one enters the numbers. Modern cordless telephones, however, are not "on" in the same way when you pick them up: There is no dial tone. Theoretically, this should not be a problem, as users can begin dialing immediately, just as they would with an older phone, and then "send" the number in some way. But they do not have the audio feedback of the dial tone that the older sets used to indicate that the system was ready to receive a number. The silence of the handset contradicts the user's mental model of the way a telephone operates, which is carried over from the analog telephone experience.
Most new users in this situation look in vain for an "ON" button and might be encouraged by finding an "OFF" button prominently featured near the top of the button pad. Alas, the corresponding button is usually "TALK," not "ON." "TALK" just is not the correct label for these cordless handsets because it never makes sense: not before I've dialed (how can I talk if I haven't dialed yet?); and not after I've dialed (how can I talk if the number hasn't been sent yet?). But instead of focusing as a community on this interaction design challenge, the makers of these handsets experiment and change the design as often as they like, with no observable progress toward the best solution.
An inspection of the four bestselling cordless telephones at Best Buy in early 2008 reveals that none of them uses the same interaction methods to accomplish the basic task of initiating a telephone call. Common to each design is a directional wheel with a prominent button to the left and right, but there the similarities end. The following table compares the design of these top models:
The absence of an ON label, or a picture of a handset, or at least the color green, makes it difficult to infer how to turn on the AT&T model. Conversely, the absence of an OFF label, or a picture of a cradled handset, or the color red, makes a puzzle out of turning off the GE model. The Uniden model's use of color in combination with images of handsets provides the best immediate indication of how to initiate a call. But then those ambiguous icons on the directional wheel are bound to frustrate a user at some point in the conversation. Ironically, three of the models use some representation of a handset on the button that initiates the call, but the depicted handset harkens back to the modern analog Bell telephone. This is particularly evident in the models that use a cradled Bell handset to indicate which button ends a connection. The designers of these buttons appear to know that users learn best when they can compare something new to something familiar. Sadly, they also know that they can't rely on depictions of today's ever-changing digital phones to provide any shared foundation of experience on which to build.
We live in an increasingly digital world, and it is also an increasingly unusable one. Making a telephone call and paying for something at the store are mundane aspects of everyday life. But unless we can accomplish these mundane tasks without giving them much thought, we will be hindered in realizing our higher aspirations, such as communicating to ourselves and to others about who we are, what it is we wish to accomplish in life, how we want to change the world. Both the ordinary and the exceptional aspects of modern existence are increasingly connected to technological systems. Unless we are able to design systems that encourage us to build upon our experiences with them from one day to another, we will fail both ourselves and those who come after us.
Charles Hannon is associate professor and founding chair of the information technology leadership department at Washington & Jefferson College in Washington, PA. He teaches courses in human computer interaction, the history of information technology, data presentation, and project management, among others. He is the author of Faulkner and the Discourses of Culture. More recently, he has published widely on the role of educational technologies in higher education. His current book project is Usable Devices: Mental and Conceptual Models, and the Problem of Contingency.
The Amazon Kindle received its share of criticism for "Next Page" and "Prev Page" buttons that are too easy to press unintentionally. But it is the Kindle's "Back" button that illustrates the fissure that opens up when there is a mismatch between mental and conceptual models. How does "Back," on a device that promotes itself as a book substitute, fit into our mental models of reading?
The Kindle's user guide states that readers can use Back to return to their book or magazine after briefly looking up a word, highlighting text, or following a footnote: "Pressing the Back button, located to the right of the select wheel, will bring you back to where you were." But the nature of hypertext is fundamentally associative. Once we have linked away from our book in the Kindle and looked through several pages in several new texts, what does it mean to go "back" to where we were?
A reader leaves a book to browse the Kindle store. Once there, she selects a top-level category, such as blogs, then a subcategory, like news, politics, and opinion, and then uses Next Page and Prev Page to look through several pages of available blog subscriptions. Where should Back now take her? To the home page of the Kindle store? To the page listing all blog subcategories? What about back to the page of the book she was reading? In this scenario Back takes her to the list of blog subcategories, from which she had previously selected news, politics, and opinion. But it is difficult to imagine a rationale for this behavior that would help her predict the results of using Back in future contexts.
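One way to see why this result is hard to predict is to write down a rule that would produce it. The sketch below is a guess for illustration only, not the Kindle's actual implementation; the class `History` and its methods are hypothetical. It assumes that following a link adds an entry to a history stack while Next Page and Prev Page do not, and it contrasts the "back" that such a stack yields with the "back" that the user guide appears to promise.

```python
class History:
    """A hypothetical back stack: links are recorded, page turns are not."""

    def __init__(self, start):
        self.stack = [start]

    def follow_link(self, destination):
        # Selecting a link, category, or subcategory adds a history entry.
        self.stack.append(destination)

    def turn_page(self):
        # Assumption: paging within a listing leaves the history untouched.
        pass

    def back(self):
        """Web-style Back: undo only the most recent link."""
        if len(self.stack) > 1:
            self.stack.pop()
        return self.stack[-1]

    def back_to_start(self):
        """Book-style Back: return to the text the reader left."""
        del self.stack[1:]
        return self.stack[0]


h = History("book")
h.follow_link("Kindle store")
h.follow_link("blog subcategories")
h.follow_link("news, politics, and opinion")
h.turn_page()
h.turn_page()
print(h.back())           # blog subcategories
print(h.back_to_start())  # book
```

Both methods are defensible meanings of "back," which is the difficulty: a single button gives the reader no way to know which of the two models is in force.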
The Kindle's Back button represents a collision between what we have learned in the past 500 years about reading books, and what we know so far about hypertext. Consider what the Kindle user's guide says about underlined words in a text: "They indicate a link to somewhere else in the material you are reading like a footnote, a chapter, or a web site." A typical mental model of book reading includes the concept of a footnote and a chapter as being "somewhere else in the material you are reading," but a website is something else altogether. We put down our books when we look something up on the Internet. If we are reading online, we know we are leaving the current text when we link to another site.
Back is a flawed concept for the Kindle because it mixes two separate mental models of reading (Web and print), and because Back is hardly a settled concept for Web-based interactions in the first place. Initially, Back was simple to understand and use. The predictability of Back and Forward made risk taking more acceptable for new users. But the No. 1 "design sin" in Jakob Nielsen's "Top Ten Web Design Mistakes of 1999" was the breaking of Back by coding links so that they open new browser windows or redirect users to the undesired page. Over the years, Back has been degraded further by the use of frames; by forms that send a user's information to a server for processing; by websites that use Flash animations for navigation; and by rich Internet applications that process information within the context of a single URL.
These "contingent" implementations make it difficult for users to develop mental models that will allow them to predict what will happen when they use a Back button, whether on the Web, the Kindle, or any other device.
©2008 ACM 1072-5220/08/0900 $5.00
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.