In a recent talk, Daniel Rosenberg asked why we were still talking about return-on-investment (ROI) justifications for HCI work. Rosenberg said he had never been asked for an ROI justification in his 20-plus years in the field. That may be true for most practitioners, but if we want our contribution to be taken seriously by other stakeholders, we absolutely must demonstrate the business value of HCI.
The idea of cost justification can seem intimidating, but demonstrating our business value need not be a difficult task or involve complex equations. It does need to be done in a way that speaks clearly to business decision-makers and targets issues that are truly of concern to them. If our arguments target the right issues, the task of showing our business value can often be surprisingly simple, requiring no more than "tweaking" data we have collected anyway, or connecting our findings to data that the organization already has.
Sometimes, the issues that will catch the attention of business decision makers are not the ones that seem, at first, related to the problem; it is the job of the HCI professional to make the connection. For instance, in one ergonomic review of a group of 2500 workers using a mission-critical computer system in a telecommunications company, the data revealed a workplace in deep crisis: absenteeism rates were dramatically high, several suicide attempts had occurred, many workers were on long-term stress leave, and staff turnover rates reached 120 percent in three months. Yet, management refused to own the problem. However, once certain HCI observations, such as the amount of time customers spent "on hold," were quantified, connected to the poor workplace performance, and presented in terms of monetary loss, management became interested in HCI analysis and recommendations; in this case, a redesign of the application user interface.
In determining which issues will provide the most persuasive arguments, we cannot always rely on the way management may have framed its research requests. Often, when seeking information, management will pose questions too narrowly. Sometimes, redefining the problem is the best solution; we are all familiar with the rewards of helping our internal customers simply ask the "right questions." Broadening the scope of research may lead you to collect different data or to employ different methods of analysis than first planned. Expect such a change of plans to meet resistance, especially if the nature and origin of the real problem are not where management anticipated. However, you can improve your chances that management listens by focusing on the business goals of the company, presenting your calculations in realistic financial terms, and backing up your statements with real data, even if it is only preliminary.
In this article I attempt to show how we can make powerful, convincing, and realistic business statements by targeting our arguments and by applying the best analysis method to a given problem. I focus on two main examples. First, I argue that the authors of a well-known study from the HCI literature that applied GOMS (Goals, Operators, Methods, and Selection rules) to demonstrate the superiority of one telecommunication application over another should have redefined the problem. To support my argument, I cite a lesser-known study that applied a simple activity analysis to a very similar problem. Second, I present another study in which the actual problems, and hence the resulting recommendations, were completely unrelated to the solution management had anticipated.
One study, portrayed as a major success in the HCI literature, compared a proposed application with an existing workstation used by Toll and Assistance Operators (TAOs) in a telephone company. Although this study is now dated, it is considered a classic and remains instructive. The study aimed specifically to advise the business on whether or not to replace an existing application with a proposed one. The TAOs' job was to handle collect calls and person-to-person calls. The researchers generated a series of GOMS models for both workstations. Initially they predicted that the proposed workstation would reduce the average call-completion time by 2.5 seconds per call, or two seconds per call, depending on which version of the study one reads. This would save the company around $7.5 million per year. At the same time as the initial GOMS models were constructed, a field study was conducted to generate a set of benchmark tasks. The models were then modified to reflect the differences in design between the two workstations. When combining all the call types into one calculation, the resulting GOMS model predicted that the proposed workstation would be 0.65 or 0.80 seconds slower than the existing workstation, again depending on the version of the study one reads.
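To see what such fractions of a second mean in money, the figures quoted above can be turned into a rough annual estimate. The sketch below is only a back-of-envelope illustration, not the original authors' calculation: it assumes that the $7.5 million figure corresponds to the 2.5-second prediction and that cost scales linearly with average call time. The function name `annual_cost_of_slowdown` is hypothetical.

```python
# Rough, linear back-of-envelope estimate; not the original authors' calculation.
PREDICTED_SAVING_SECONDS = 2.5          # initial GOMS prediction quoted above
PREDICTED_SAVING_DOLLARS = 7_500_000    # "around $7.5 million per year"

# Assumption: annual cost scales linearly with average call time, and the
# $7.5 million figure belongs to the 2.5-second prediction.
DOLLARS_PER_SECOND_PER_YEAR = PREDICTED_SAVING_DOLLARS / PREDICTED_SAVING_SECONDS


def annual_cost_of_slowdown(seconds_slower: float) -> float:
    """Return the implied yearly cost of a workstation that is slower per call."""
    return seconds_slower * DOLLARS_PER_SECOND_PER_YEAR


if __name__ == "__main__":
    for slowdown in (0.65, 0.80):
        cost = annual_cost_of_slowdown(slowdown)
        print(f"{slowdown:.2f} s slower per call -> about ${cost:,.0f} per year")
```

Under these assumptions, the predicted slowdown would translate into roughly $1.95 million to $2.4 million per year, which is the kind of statement a business decision-maker can weigh directly.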
GOMS models treat third-party activity, such as a customer interacting with the operator during task-completion, as a constant rather than as a variable. GOMS also assumes expert, error-free performance. This is entirely appropriate, as the original intention behind GOMS was to provide the HCI community with predictive, theory-based engineering models. Therefore, GOMS models do not take variations in the duration of task performance into account. Thus, even though the authors stated that the most time-consuming factor was the TAO/customer interaction, they did not consider variations in performance attributed to this extra-system factor and instead focused only on evaluating theoretical performance based on keystrokes in the critical path of the task.
Compare this with another study that decomposed Directory Assistance (DA) calls to learn where call time could be reduced without increasing operators' workload. Each call took, on average, less than 20 seconds; many were completed in 10 seconds, and a few exceeded 60 seconds. Each operator handled several thousand calls per week. The authors of this study generated stereotypical calls from simple event logs rather than producing GOMS models. Calls were analyzed from the moment the operator accepted the call until the caller was disconnected. Some 6700 calls were divided into behaviorally observable and meaningful events to obtain a distribution of time devoted to each in a typical call. One outcome of this is shown in Figure 1, with time displayed along the abscissa and event-types along the ordinate.
Each event-type is denoted by a color and occupies a separate row, showing when an event occurred relative to others, how long it lasted, and the sequence in which events occurred. The pale blue area in Figure 1 shows the task-complexity where several events co-occur.
As was apparently the case in the GOMS study, the analysis here revealed that the single most time-consuming event, amounting to 35 percent of the stereotypical call, was interacting with the caller to obtain details about the person/company the caller wanted. This is consistent with other findings from similar tasks and environments: Customer interaction typically accounts for 30 percent to 40 percent of the task-time. To reduce the 35 percent event-time, the researchers inserted an Interactive Voice Response (IVR) mechanism at the start of a call. Tests showed that callers and operators preferred this to the conventional method in which the caller first encountered an operator. This remedy shaved several seconds off the average call time. Thus, rather than ignoring the most time-consuming variable as the GOMS models do, this simple quantification of the task yielded enough insight to support a creative solution that targeted precisely that variable, thereby meeting the objective and improving the call experience for all involved.
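The kind of decomposition described above needs little more than timestamped event logs. The following is a minimal sketch of how the share of call time per event type might be computed. The event names and timings are invented for illustration (chosen so that caller interaction comes to 35 percent of a 20-second call), and the function `time_share_by_event` is hypothetical; it is not the procedure the study's authors used.

```python
from collections import defaultdict

# Hypothetical event log for ONE call: (event_type, start_s, end_s).
# Event names and timings are invented for illustration only.
CALL_EVENTS = [
    ("greet caller", 0.0, 2.0),
    ("obtain request details", 2.0, 9.0),
    ("key in search", 6.0, 11.0),   # overlaps with talking to the caller
    ("scan results", 11.0, 15.0),
    ("release number", 15.0, 20.0),
]


def time_share_by_event(events):
    """Return each event type's duration as a fraction of total call duration."""
    call_start = min(start for _, start, _ in events)
    call_end = max(end for _, _, end in events)
    call_duration = call_end - call_start

    totals = defaultdict(float)
    for event_type, start, end in events:
        totals[event_type] += end - start

    # Shares can sum to more than 1.0 because events may co-occur.
    return {etype: total / call_duration for etype, total in totals.items()}


if __name__ == "__main__":
    shares = time_share_by_event(CALL_EVENTS)
    for etype, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{etype:<25} {share:.0%}")
```

Sorting the shares in descending order puts the most time-consuming event at the top, which is where a redesign effort is most likely to pay off.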
Comparing these studies suggests not only that the GOMS analysis may have been the wrong way to approach a competitive evaluation of the two workstations, but also that a competitive evaluation per se was not the way to have the greatest business impact. If dealing with callers takes up substantially more task-time than the predicted 0.65- or 0.80-second difference between workstations, this calls into question the business significance of one or two keystrokes. The choice of workstation probably did not matter; the business may have been better served by asking, instead, how operators spent their task-time and then determining the best solution to make the interaction with callers more efficient.
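A quick calculation gives a sense of the scale involved. The sketch below is illustrative only: it borrows the 20-second call length and the 35 percent caller-interaction share from the Directory Assistance study and sets them against the workstation differences predicted for the TAO study, so it deliberately mixes figures from two different settings. The constant `ASSUMED_CALL_SECONDS` and the function name are assumptions.

```python
# Illustrative only: the 20-second call length comes from the Directory
# Assistance study, not from the TAO workstation study, so this mixes sources.
ASSUMED_CALL_SECONDS = 20.0
CALLER_INTERACTION_SHARE = 0.35          # share reported for the stereotypical DA call
WORKSTATION_DIFFERENCES = (0.65, 0.80)   # seconds, as predicted by the GOMS models

caller_interaction_seconds = ASSUMED_CALL_SECONDS * CALLER_INTERACTION_SHARE


def interaction_to_difference_ratio(difference_seconds: float) -> float:
    """How many times larger the caller interaction is than the workstation gap."""
    return caller_interaction_seconds / difference_seconds


if __name__ == "__main__":
    print(f"Caller interaction: about {caller_interaction_seconds:.0f} s per call")
    for diff in WORKSTATION_DIFFERENCES:
        ratio = interaction_to_difference_ratio(diff)
        print(f"Workstation difference of {diff:.2f} s is ~{ratio:.0f}x smaller")
```

On these assumed numbers, the time spent talking to the caller is roughly nine to eleven times larger than the difference between the two workstations.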
Our study on warehouse management started out as a competitive analysis of two technologies under consideration for use in a large grocery distribution warehouse [9, 10, 12]. The warehouse comprised approximately five acres of floor space arranged in aisles and shelves from floor to ceiling. Assemblers navigated these aisles, collected goods ordered by supermarkets, stacked them on a pallet, and wrapped them in plastic before dropping them off in one of the 18 loading bays.
Management was concerned about large losses due to incomplete orders and damaged products received by supermarkets. For example, cases of soft or fragile goods were often found crushed beneath cases of heavy items. Management attributed the problem to carelessness on the part of workers known as "assemblers," who used hand-held input devices. Management believed that replacing these devices with a hands-free speech device would solve the problem at least to some extent.
Analysis of the two technologies revealed multiple usability problems with each. However, observation of the assemblers suggested that the major problems had little to do with the physical device. Using the activity theory model operationalized by Engeström [3, 4], we were able to expand the framework and identify the location of breakdowns. One problem lay with the Warehouse Management System (WMS) program, which recorded the outside dimensions of trucks, not the inside dimensions. Consequently, inaccurate space estimates caused some loads to be too tall. The loaders, whose role was to pack the trucks, were forced to dismantle and re-package the orders to fit, adding pallets to avoid damage to the product, but increasing the overall size of the load. In spite of these adjustments, they were still often unable to fit the load into the assigned truck. This upset the WMS scheduling and route calculations.
The WMS was also programmed with a flawed procedure for selecting items. The sequence was designed to ensure that the assembler always moved in one direction through the warehouse, but product weight and volume were not taken into consideration. For example, if an assembler had just picked 16 cartons of cornflakes, the next item could be 12 cartons of 4-liter bottled orange juice. The assembler would then have three options: (1) "break the order," that is, leave the juice for another assembler to get; (2) unload the cornflakes, load the juice, reload the cornflakes; (3) ignore the problem and allow some of the cornflake packs to be damaged by the weight of the juice packs. Since the assemblers' scanning technology displayed only a few items at a time, it was impossible to predict or avoid these problems. The proposed speech technology similarly provided only one item at a time, so it was not a solution to the problem.
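The flaw in the pick sequence is easy to express as a check that a system could run before issuing an order. The sketch below is a hypothetical illustration, not the WMS's actual logic: the carton weights, the `weight_ratio` threshold, and the names `PickLine` and `crush_risks` are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PickLine:
    product: str
    cartons: int
    carton_weight_kg: float  # hypothetical values, not from the study


def crush_risks(pick_sequence, weight_ratio=2.0):
    """Return (lighter, heavier) pairs where a much heavier carton is picked
    after a lighter one and so would end up stacked on top of it."""
    risks = []
    for i, later in enumerate(pick_sequence):
        for earlier in pick_sequence[:i]:
            if later.carton_weight_kg >= weight_ratio * earlier.carton_weight_kg:
                risks.append((earlier.product, later.product))
    return risks


# The cornflakes-then-juice example from the text, with assumed carton weights.
ORDER = [
    PickLine("cornflakes", 16, 6.0),
    PickLine("orange juice, 4-liter bottles", 12, 17.0),
]

if __name__ == "__main__":
    for lighter, heavier in crush_risks(ORDER):
        print(f"Risk: {heavier} would be stacked on {lighter}")
```

A check of this kind only flags the problem; resolving it still means trading off travel direction against stacking order, which is the judgment the assemblers were left to make with only a few items visible at a time.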
The WMS calculated the time to complete each order, and assemblers were required to work to within ±5 percent of the predicted time. Therefore, assemblers were motivated to pass the problem on to the loaders, who spent so much time unpacking and repacking orders that they were behind with their own loading tasks and ended up keeping the truck drivers waiting. With the loading bays backing up, assemblers were unable to drop their loads in the locations specified by the WMS and left them wherever they could in the rush to report back to their supervisors within the allocated time frame.
A simple time-and-motion study of loader activities showed that over half of the loaders' task-time was taken up by unnecessary activities. Retrieving stray orders, the most time-consuming activity, could be avoided by setting the computer algorithm to the internal rather than the external truck dimensions, and by removing the requirement for assemblers to work to set time standards. This correction eliminated excessively tall loads, allowing assemblers to apply common sense to their work. They were also given a printout of orders before starting the assembling task, which reduced the need to break orders and the motivation to pass on the problem. By eliminating the need for loaders to repack orders, loading bays could be kept clear, enabling assemblers to drop orders in the right place and avoiding loaders having to hunt for them.
Management expected us to identify the best technology for assemblers and recommend one of the two technologies evaluated. We were not asked to do a time-and-motion study or even to study the loaders. However, our early data suggested that the problems resulting in damaged goods and incomplete orders lay elsewhere, and our subsequent analysis confirmed this suspicion. Indeed, the usability problems associated with each technology were quite insignificant. The study revealed that it was unnecessary for the company to invest in the new speech technology and instead indicated a relatively simple adjustment in the WMS to reflect workspace and practices. As a bonus, this solution also enabled management to assign one out of every three loaders to other duties, thereby increasing the efficiency of the workplace as a whole.
I often hear colleagues complain that their activities are dictated by others who may not be in the best position to judge how a problem should be addressed or what methods should be applied to solve it. They seem to believe that they do not have the authority to change the scope of research questions that are handed to them. Or, they may feel that the best way to win business allies is to do what they are asked to do. The case studies presented above, however, show that it can be both possible and politically beneficial to reframe our research questions, provided we truly address management concerns.
Like Rosenberg, I have never been asked explicitly to provide ROI for HCI activities, but by enlarging the focus to the business itself, we give ourselves and others more freedom to define problems, address them correctly, and provide novel solutions rather than merely "doing as we are told." Without relating HCI activities to the business, I fear many HCI professionals will continue to be confined to the laboratory and have minimal, if any, impact on the projects in which they are involved.
3. Engeström, Y. (2000a). From individual action to collective activity and back: developmental work research as an interventionist methodology. In Paul Luff, Jon Hindmarsh & Christian Heath (Eds.), Workplace Studies (pp. 150-168). Cambridge, MA: Cambridge University Press.
5. Gray, W.D., John, B.E. & Atwood, M.E. (1990). An application and evaluation of GOMS techniques for operator workstation evaluation, Proceedings 13th. International Symposium Human Factors in Telecommunications, HFT, Turin.
6. Gray, W.D., John, B.E. & Atwood, M.E. (1992). The precis of project Ernestine or an overview of a validation of GOMS, in Proceedings CHI '92 Striking a balance Human Factors in Computing Systems, Monterey, CA, (pp.307-312).
7. Gray, W.D., John, B.E. & Atwood, M.E. (1993). Project Ernestine: Validating a GOMS analysis for predicting and explaining real-world task performance, Human Computer Interaction, 8, (3), (pp.237-309).
9. Lindgaard, G. & Madore, S. (2003). The impact of interactive technology on worker efficiency: A job assessment, Proceedings of the 4th World Congress on e-commerce management, Hamilton, Ontario, Canada, 15-17 January.
11. McEwen, S. & Bergman, H. (1993). Automating directory assistance service: A human factors case study, in Proceedings 14th. International Symposium Human Factors in Telecommunications, Darmstadt, Germany, May 1993.
Human Oriented Technology Lab (HOTLab)
Ottawa, Ontario K1S 5B6, Canada
I would like to thank Ian Milburn, Christine O'Connor, Sherri Madore, Todd Yates, and all the people who allowed us to observe them in action and for permission to publish the above data.
©2004 ACM 1072-5220/04/0500 $5.00