XXI.6 November-December 2014
Page: 80

The usefulness of traditional usability evaluation methods

Gitte Lindgaard

Calls for novel user experience (UX) evaluation methods have been echoing through the HCI literature for several years. Although the traditional notions of effectiveness, efficiency, and satisfaction may not adequately capture all types of new interactive experiences, novel approaches would not necessarily render existing evaluation methods irrelevant. As Tom Stewart so aptly says, “Being pretty and engaging is not enough” [1]. Despite the proven longevity of traditional usability evaluation methods, the HCI literature still points to confusion about how UX, including usability, is and should be tested.

One reason for this confusion is the undeniable distance between HCI research and practice that does not seem to diminish as our interdisciplinary field matures [2]. We all know that academics are under constant pressure to publish. We also know that practitioners face very different pressures in their work of solving real-life problems in real time and with limited resources. These pressures make it almost impossible for practitioners to also produce publishable papers meeting the stringent journal editorial requirements; hence, the relative scarcity of practitioner-authored HCI papers. Usability research is largely derived from journal and conference papers written and assessed by academics. Because very few academics are also experienced practitioners, most receive relatively little exposure to industry-based usability evaluation practice. The unfortunate consequence is that the HCI literature remains mainly a forum for academics conversing with other academics, and that practitioners perceive much of what academics have to say as irrelevant to them.



I wondered how experienced practitioners deal with usability evaluation. I assumed, perhaps naively, that academic researchers writing about usability testing, especially those who make claims about current usability measurement practice, would want to learn more about industry-based practice. I also assumed that practitioners would welcome research literature addressing issues that they find relevant. One serious concern was that research discussing usability methods appears largely to originate in educational settings using students or academic colleagues as evaluators. Since few academics also have extensive experience developing and testing software that must survive in a fiercely competitive world, it seemed to me that the extrapolation of such findings to the messy world of industry would be questionable. I wanted to learn more about actual practice and to see if some of my assumptions were correct.

Interviews with Practitioners

I interviewed 12 UX practitioners and UX team managers in four countries (Australia, Britain, Canada, United States) about the usability testing methods currently in use in their organizations. Participants were employed in a mixture of national and international/global organizations, both public and private, ranging from one to more than 40,000 employees. They represented domains including telecommunications, health, game production, worldwide consumer software, and consultancies. All had five to 30 or more years of practical usability experience, and all were currently involved with UX work. Six interviewees’ native discipline was technology-oriented, for instance engineering and computer science, and the remainder were psychologists and ergonomists. All had master’s or Ph.D. degrees.

Everyone said they adhered to a user-centered design (UCD) philosophy, implemented to varying degrees in the organizations they represented. All 12 interviewees spent considerable energy advocating the value of UCD, UX, and usability testing. For consultants external to their client organizations, use of the UCD approach depended both on how far their clients had already embraced UCD and on the tasks specified in the brief or tender document. If the client organization did not have a dedicated UX team, the consultants said they worked to educate the client in charge of the brief about the importance of UCD as a philosophy and as a design approach. For employees in larger organizations, these efforts were directed toward people from other branches and product lines/groups.

UX Work Structures

In some of the larger organizations represented in the sample, individual UX experts were assigned to a particular project and were thus responsible for the usability of that product throughout the development process. They worked closely and continually with the developers from product conception to deployment. The UX experts were held personally accountable for the usability of the end product to the extent their jobs depended on achieving the agreed-upon usability goals.

In other, similarly large organizations, the UX expert was responsible for finding timely opportunities to apply their usability skills to different projects. These organizations tended already to be very user focused. Widespread understanding of usability throughout the organizations meant that UX activities were very much in demand. To cope with these demands, UX team members had scheduled consultation hours during which they offered help and advice to any product line or project team. Consultants who were not employed by their client organizations reported that they worked opportunistically, doing whatever they could usability-wise. Even the smaller consultancies (those with fewer than 10 employees) employed dedicated UX personnel. In my sample, all of the external consultants reported that they had significant repeat business, representing 50 to 80 percent of their work.

Usability Evaluation Inspection Methods

Of the 12 interviewees, 11 used some variety of expert inspection methods, and the last one was about to introduce inspection methods so that stakeholders beyond the UX team could participate directly in formative usability evaluations throughout the process, or at least observe usability tests. Several interviewees said they used inspection methods only when time, budget, or the client brief did not permit actual usability testing. Some consultants offered inspection methods as an entry into a new organization because initial contracts are typically very small, designed cautiously to “test the usability/UCD/UX waters.” Consultants said they also conduct expert reviews in projects that are so obviously full of extreme usability problems that full-blown usability testing would make no sense until the worst problems had been fixed. The main purpose of inspection methods, however, was to identify user tasks likely to yield the highest return on subsequent user tests. Most interviewees said they have abandoned the term heuristic evaluation, instead doing what they call an expert review or design review. The term heuristic, they said, is meaningless to people outside the HCI community, and the purpose of evaluation is to focus on the user interaction, rather than getting bogged down by particular heuristics.


The actual process of review and inspection was broadly similar, but not standardized, from person to person or from team to team in most cases. The variations in the process and rules guiding these inspections were almost endless. It was also clear that this sample of practitioners believed that the value of inspection methods was limited to the above-mentioned situations. Although research is afoot to improve heuristic evaluation methods, especially in the area of computer games, “feedback from developers suggests that they [heuristics] are too generic to be of much use” [3]. These practitioners would almost certainly agree.

User-Based Usability Evaluation

All 12 interviewees expressed a strong preference for, and almost religious adherence to, traditional user-based usability tests, even in organizations using agile software development processes, and even including computer game providers. As one consultant said, “There is just no other or better way to do it [collect diagnostic usability data].” According to 10 interviewees, empirical user-based tests made up 70 to 90 percent of their work. It did not matter if these were conducted in a dedicated usability lab or in the field. Interviewees were all acutely aware of the potential biases in sampling methods and sample sizes, limited experimental control in the lab and in the field, and so on, but for these participants, it would appear that quasi-experimental methods are the best we have and that they seem to work.

The interviewees’ clear distinction in the relative value of opinion-based inspection methods and user-based empirical methods is a far cry from the often-blurred distinction between these methods discussed in the academic literature. Comparing these two method types is much like comparing apples with oranges; no matter their similarities, they remain distinctly different and their outcomes differ. In addition, when one’s job depends on delivering usability results and recommendations that help to fix problems, the detection of usability stumbling blocks is only half the equation. Problem detection is necessary and important but certainly not sufficient! Continued focus in the literature on detecting, rather than fixing, usability problems is therefore of neither theoretical nor practical value.

I was somewhat surprised to find that even with the current emphasis on UX rather than merely on usability, traditional empirical user-based evaluation was still as much in vogue today as it was back in my own former life as a human factors/HCI practitioner. Then again, for an interactive experience to be engaging and fun, it must also be usable. It seems to me that the HCI community needs a selection of enhanced, augmented, or additional evaluation methods rather than replacements of empirical methods that have served us well since the 1970s.

Next Steps

Although the backgrounds, skills, and job demands of the interviewees vary considerably, a sample of 12 is possibly unrepresentative of the larger population of practitioners. Four of them graduated from my Carleton University lab, so they may have been "contaminated" by our own worldviews. The fact that they all work in English-speaking countries is another potential bias. The fact that they work in vastly different kinds of organizations may be a strength or a weakness. My aim now is to interview a much larger sample that includes practitioners representing other countries and cultures to learn more about the role of usability testing and usability evaluation methods in UX elsewhere. To the extent that my provocative assertion is correct, namely that academics have too little exposure to usability practice outside universities to speak authoritatively about it, I strongly recommend that my academic colleagues begin to seek out such exposure. Given that relatively few of our graduates end up as academic researchers studying some aspect of HCI, it follows that most of them become practitioners. Some are likely to work in UX teams. One easy way to learn more about what they do would simply be to stay in touch once they graduate.

With respect to methods that practitioners use or could use, there seems to be plenty of room for novel evaluation methods, but it seems likely that empirical methods will remain a cornerstone of UX practice. Inspection methods might be useful, but only if practitioners are actually part of their development.


I wholeheartedly thank the 12 interviewees who so freely gave me their time and information, typically at great inconvenience to them due to our different time zones.


1. Stewart, T. Usability (editorial). Behaviour & Information Technology 28, 2 (2009), 99–100.

2. Dray, S. Engaged scholars, thoughtful practitioners: The interdependence of academics and practitioners in user-centered design and usability. Journal of Usability Studies 5, 1 (2009), 1–7.

3. McAllister, G. and White, G.R. Video game development and user experience. In Evaluating the User Experience in Games. R. Bernhaupt, ed. Springer Verlag, Heidelberg, Germany, 2010, 107–130.


Gitte Lindgaard is professor of strategic design at Swinburne University of Technology in Melbourne, Australia, and Distinguished Research Professor (HCI) at Carleton University in Ottawa, Canada. As a human factors expert in telecommunications for many years, she has ample first-hand experience on both sides of the academic/practitioner fence.

©2014 ACM  1072-5220/14/11  $15.00

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

The Digital Library is published by the Association for Computing Machinery. Copyright © 2014 ACM, Inc.
