Seven heuristics for identifying proper UX instruments and metrics


Authors: Maximilian Speicher
Posted: Tue, September 19, 2023 - 11:09:00

In the two previous articles of this series, we first learned that metrics such as conversion rate, average order value, or Net Promoter Score are not suitable for reliably measuring user experience (UX) [1]. The second article then explained that UX is a latent variable and that we must therefore rely on research instruments and corresponding composite indicators (which produce a metric) to measure it [2]. The logical next question is how we can identify those instruments and metrics that do reliably measure UX. This boils down to what is called construct validity and reliability, which this final article briefly introduces before deriving easily applicable heuristics for practitioners and researchers alike who don’t know which UX instrument or metric to choose.

Construct validity refers to the extent to which a test measures what it is supposed to measure [3]. In the case of UX, this means that the instrument or metric should measure the concept of UX as it is understood in the research literature, and not, for example, only usability. One good way to establish construct validity is through factor analysis [3].
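To make this more tangible, here is a minimal sketch of how one could run an exploratory factor analysis in Python. It assumes responses to a hypothetical six-item questionnaire stored in a CSV file and uses the third-party factor_analyzer package; the file name, item names, and two-factor structure are illustrative, not taken from any specific instrument.

```python
import pandas as pd
from factor_analyzer import FactorAnalyzer

# Hypothetical questionnaire data: one row per participant,
# one column per item (q1..q6), values on a 1-7 scale.
responses = pd.read_csv("questionnaire_responses.csv")

# Fit a two-factor model (e.g., expecting a "pragmatic" and a "hedonic" factor).
fa = FactorAnalyzer(n_factors=2, rotation="varimax")
fa.fit(responses)

# Items should load strongly on the factor they were designed for and weakly
# on the other one; unexpected cross-loadings hint at construct validity problems.
loadings = pd.DataFrame(fa.loadings_, index=responses.columns,
                        columns=["factor_1", "factor_2"])
print(loadings.round(2))
```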

Construct reliability refers to the consistency of a test or measure [4]. Put differently, it is a measure of how reproducible the results of an instrument or metric are. A good way to establish construct reliability is through studies that assess the test-retest reliability of the instrument or metric, as well as its internal consistency, such as Cronbach’s alpha [4].
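Cronbach’s alpha, in particular, is straightforward to compute from raw item responses. The following is a minimal sketch based on the standard formula, using a hypothetical DataFrame of questionnaire responses (one row per participant, one column per item); the example data are made up.

```python
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """Cronbach's alpha for a DataFrame with one column per item."""
    k = items.shape[1]                        # number of items
    item_variances = items.var(axis=0, ddof=1)
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Hypothetical responses to a five-item scale (1-7 ratings).
responses = pd.DataFrame({
    "q1": [5, 6, 4, 7, 5], "q2": [5, 7, 4, 6, 5],
    "q3": [4, 6, 3, 7, 4], "q4": [6, 6, 4, 7, 5],
    "q5": [5, 7, 4, 6, 6],
})
print(round(cronbach_alpha(responses), 2))  # values >= 0.7 are commonly deemed acceptable
```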

In addition, the Joint Research Centre of the European Commission (JRC) provides a “Handbook on Constructing Composite Indicators” [5], which summarizes the proper process in a 10-step checklist. We build on all of the above for the following list of seven heuristics for identifying proper UX instruments and metrics.

Heuristic 1: Is there a paper about it? If there is no paper about the instrument and/or metric in question, there’s barely a chance you’ll be able to answer any of the following questions with yes. So, this should be the first thing to look for. A peer-reviewed paper published in a scientific journal or conference would be the best case, but there should be at the very least some kind of white paper available.

Heuristic 2: Is there a sound theoretical basis? In the case of UX, this means, does the provider of the instrument and/or metric clearly explain their understanding of UX and, therefore, what their construct actually measures? The JRC states: “What is badly defined is likely to be badly measured” [5].

Heuristic 3: Is the choice of items explained in detail? Why were these specific variables of the instrument chosen, and not others? And how do they relate to the theoretical framework, that is, the understanding of UX? The JRC states: “The strengths and weaknesses of composite indicators largely derive from the quality of the underlying variables” [5].

Heuristic 4: Is an evaluation of construct validity reported? This could be reported in terms of, for example, a confirmatory factor analysis [3]. If not, you can’t be sure whether the instrument or metric actually measures what it’s supposed to measure.

Heuristic 5: Is an evaluation of construct reliability reported? This could be reported in terms of, for example, Cronbach’s alpha [4]. If not, you can’t be sure whether the measurements you obtain are proper and reproducible approximations of the actual UX you want to measure.

Heuristic 6: Is the data that’s combined to form the metric properly normalized? This is necessary if the items in an instrument have different units of measurement. The JRC states: “Avoid adding up apples and oranges” [5].
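As a concrete, purely hypothetical illustration: if a metric combines a 1–7 satisfaction rating with a task time measured in seconds, the raw values live on completely different scales and must first be brought onto a common one, for instance via min–max normalization or z-scores. A minimal sketch:

```python
import pandas as pd

# Hypothetical study data: a 1-7 satisfaction rating and task time in seconds.
data = pd.DataFrame({
    "satisfaction": [6, 5, 7, 4, 6],
    "task_time_s":  [42, 95, 30, 120, 55],
})

# Min-max normalization maps every item to the range [0, 1].
normalized = (data - data.min()) / (data.max() - data.min())

# Task time is a "lower is better" item, so invert it before combining.
normalized["task_time_s"] = 1 - normalized["task_time_s"]
print(normalized.round(2))
```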

Heuristic 7: Is the weighting of the different factors that form the metric explained? Factors should be weighted according to their importance. “Combining variables with a high degree of correlation” (double counting) should be avoided [5].
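Continuing the hypothetical sketch from above: the weights used to combine the normalized items should be made explicit and justified, and highly correlated items should be checked for double counting, for example via a simple correlation matrix. The item names and weights here are again purely illustrative.

```python
# Explicit, documented weights (these must sum to 1 and should be justified
# by the theoretical framework, not chosen arbitrarily).
weights = {"satisfaction": 0.6, "task_time_s": 0.4}

# Check for highly correlated items before combining them: a correlation close
# to 1 between two items means the underlying aspect would be counted twice.
print(data.corr().round(2))

# Weighted composite score per participant.
composite = sum(normalized[item] * w for item, w in weights.items())
print(composite.round(2))
```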

In the following, we demonstrate the application of these heuristics in two very brief case studies.

Case Study 1: UEQ
The User Experience Questionnaire (UEQ) is a popular UX instrument developed at SAP AG.

  • H1: There is a peer-reviewed research paper about UEQ, which is available at [6]. ✓
  • H2: The paper clearly defines the authors’ understanding of UX, and they elaborate on the theoretical background. ✓
  • H3: The paper explains the selection of the item pool and how it relates to the theoretical background. ✓
  • H4: The paper describes, in detail, two studies in which the validity of UEQ was investigated. ✓
  • H5: The paper reports Cronbach’s alpha for all subscales of the instrument. ✓
  • H6: Not applicable, since UEQ doesn’t explicitly define a composite indicator. However, a composite indicator can be constructed from the instrument (see the sketch after this list).
  • H7: See H6.
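Regarding H6 and H7, here is one simple, purely illustrative way such a composite could be constructed from the six UEQ scales, assuming scale means on the usual −3 to +3 range. This is not an official UEQ metric; it merely shows that the already normalized scales can be aggregated with explicit (here: equal) weights.

```python
import pandas as pd

# Hypothetical UEQ scale means per participant, on the usual -3..+3 range.
scale_means = pd.DataFrame({
    "attractiveness": [1.8, 2.1, 0.9],
    "perspicuity":    [2.0, 1.5, 1.2],
    "efficiency":     [1.6, 1.9, 0.8],
    "dependability":  [1.4, 1.7, 1.0],
    "stimulation":    [1.2, 2.0, 0.5],
    "novelty":        [0.9, 1.4, 0.3],
})

# Equal weighting of all six scales; no further normalization is needed here
# because all scales already share the same -3..+3 range.
composite = scale_means.mean(axis=1)
print(composite.round(2))
```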

Case Study 2: QX score “for measuring user experience”
This metric was developed by SaaS provider UserZoom and is now provided by UserTesting. It is a composite of two parts: 1) the widely used SUPR-Q instrument and 2) the individual task success rates from the user study where the metric was measured, in a 50/50 proportion.

  • H1: There is no research paper, but at least a blog post explaining the instrument and metric. ✓
  • H2: There is no clear definition of UX given. The theoretical basis for the metric is the assumption that all existing UX metrics use either only behavioral or only attitudinal data. There is no well-founded explanation given why this is considered problematic. The implicit reasoning is that only by mixing behavioral and attitudinal data can we properly measure UX, which is factually incorrect (cf. [2]). ❌
  • H3: The metric mixes attitudinal (SUPR-Q) and behavioral (task success) items, but no well-founded reasoning is given as to why only task success rate was chosen, or why this would improve SUPR-Q, which is already a valid and reliable UX instrument in itself. ❌
  • H4: No evaluation of construct validity is reported. ❌
  • H5: No evaluation of construct reliability is reported. ❌
  • H6: No approach to data normalization is reported; the metric seemingly adds up apples and oranges. ❌
  • H7: There is no reasoning given for the weighting of the attitudinal and behavioral items. ❌

In conclusion, the seven heuristics provided in this article serve as a useful guide for identifying proper UX instruments and metrics. By considering construct validity and reliability, as well as the JRC’s 10-step checklist, practitioners and researchers alike can make informed decisions when choosing a UX instrument or metric. Not every instrument or metric will pass all of these heuristics, but the more of them that are met, the more confident one can be that the chosen instrument or metric properly measures UX. If in doubt, choose the one that checks more boxes. Note that some heuristics, like H1, are not strictly necessary, and H6 and H7 apply only to composite indicators; there may also be valid instruments or metrics that fail some of these heuristics, and invalid ones that pass some. The heuristics are intended as a quick and robust framework for evaluating UX instruments and metrics, but ultimately, the best approach will depend on the specific research or design project.

Endnotes
1. Speicher, M. Conversion rate & average order value are not UX metrics. UX Collective. Jan. 2022; https://uxdesign.cc/conversion...
2. Speicher, M. So, How Can We Measure UX? Interactions 30, 1 (2023), 6–7; https://doi.org/10.1145/357096...
3. Kline, R.B. Principles and Practice of Structural Equation Modeling. Guilford Publications, 2015.
4. Cronbach, L.J. and Meehl, P.E. Construct validity in psychological tests. Psychological Bulletin 52, 4 (1955), 281.
5. Joint Research Centre of the European Commission. Handbook on Constructing Composite Indicators: Methodology and User Guide. OECD publishing, 2008; https://www.oecd.org/sdd/42495...
6. Laugwitz, B., Held, T., and Schrepp, M. Construction and evaluation of a user experience questionnaire. Symposium of the Austrian HCI and Usability Engineering Group. Springer, Berlin, Heidelberg, 2008, 63–76.



Maximilian Speicher

Maximilian Speicher is a computer scientist, designer, researcher, and ringtennis player. Currently, he is director of product design at BestSecret and cofounder of UX consulting firm Jagow Speicher. His research interests lie primarily with novel ways to do digital design, usability evaluation, augmented and virtual reality, and sustainable design. [email protected]