XXVII.3 May - June 2020
Page: 30

Modeling humans via physiological and behavioral signals

Ronnie Taib, Shlomo Berkovsky


The time has come to accurately and unobtrusively model humans! Modeling humans—whether in terms of their skills, emotions, or attitudes—can help us deliver tailored services, interaction, and information. Let's teach math by associating each formula with other concepts and formulas the student already knows. Let's recommend movies based on the emotions that past movies elicited in an individual. Let's produce data visualizations in line with what has previously resonated with a user. Why are these objectives so obviously needed, yet so elusive? A key reason is that a lack of integration between tasks, human behavior, and reasoning is hindering the full power of interaction customization. Modeling humans can help bring these elements together.

Insights


In recent decades, HCI and AI have developed a range of tools and methodologies for modeling humans and personalizing interaction [1], but these often remain bounded by the information needed by the application itself. For example, movie recommenders generally focus on behavioral signals, such as past movie selections by the user or other users of the same platform, sometimes looking at related factors like social media likes or movie reviews. Human modeling, however, rarely ventures into gauging thoughts, emotions, or attitudes, at best resorting to clunky pop-ups such as "Would you prefer option A or B?"


Another common drawback of many human-modeling tools is that they can be manipulated, making their reliability questionable. For example, it is easy to pose as a horror-movie lover in a system, or similarly to spread such fake information on social media. In fact, unbalanced training data will result in strong biases even without any manipulation [2]. Models that capitalize on direct input are even easier to trick, as the human user may have a good idea of the desired answers, allowing them to steer the system in their preferred direction. These risks intensify in high-stakes scenarios, such as performance evaluation or job recruitment. Here, humans may be willing to paint a fictitious picture of their behavior, overstate their skills and knowledge, or simply provide inaccurate information, all of which may hinder the modeling and affect its outcomes.

A solution is at hand; it is all a question of piecing things together. In order to make human modeling more objective and reliable, let's turn to the shelves of physiologists, neuroscientists, and sensing engineers. A multitude of physiological and behavioral signals generated by the human (consider heartbeat, brain activity, skin conductance, blood pressure, and eye movements) can hardly be consciously controlled and can potentially disclose precious and reliable information about the human experience, if properly harnessed. While this raises significant technical challenges, requiring a combination of sensing, signal processing, and machine-learning skillsets, it also has a tremendous potential to pave the way for next-generation human-modeling methods.

A Human-Modeling Framework

Based on our recent work on the detection of personality traits [3], as well as another work on the prediction of Parkinson's disease [4] (both published at CHI 2019), we present here a framework for such physiological and behavioral signal-based human modeling (Figure 1).

Figure 1. Human-modeling framework.

The main idea behind the framework is that consciously uncontrollable signals can be treated as objective predictors for the model. To offer a reliable and measurable input, such signals need to come in response to a standardized stimulus or task triggering them. A range of triggers can be used—from passive exposure to multimedia content to highly interactive activities—as long as the triggers are causally linked to the derived model. Domain expertise is required for this step. For example, when teaching mathematics, such a trigger could be a series of mathematical exercises of increasing difficulty. The modalities of the stimuli and tasks are crucial, as they are likely to trigger different physiological signals. For example, as our work discussed below shows, signals elicited by a video clip would be stronger than those elicited by an image.


The physiological signals are captured by a sensing technology. Guided by the main goal of objective human modeling, we prioritize technologies that can capture difficult-to-control signals. Indeed, some physiological or behavioral signals, like breathing rate or eye gaze, can be consciously controlled; others, like blinking, blood pressure, or heart rate, are difficult to control; while many other signals, like skin conductance or electric brain activity, cannot be controlled at all. The level of control that humans exhibit over the signal drives the objectivity of the captured data and the reliability of the derived models. Another consideration is the practicality and ease of use of the sensors, which ideally should not impede normal behavior and interactions, and should mitigate artifacts related to movement, temperature, lighting, and more.

While it may be appealing to bring in many sensors, they help only once useful features are extracted; otherwise they simply flood the system with irrelevant data. Raw sensor data, like skin-conductance values, electric brain signals, or heart-rate data, needs to be preprocessed and its statistical characteristics extracted during data processing. While this varies according to the deployed sensor, three typical preprocessing steps are: filtering (data cleansing, noise reduction, and artifact removal), segmentation (partitioning into time intervals), and normalization (with respect to a baseline signal). These are followed by feature extraction, which again depends on the sensor and the stimuli. For example, analyzing the shape of spikes in electrodermal activity may reflect cognitive responses to a short math question but may be less useful when aggregated over the duration of a feature-length movie.
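As a concrete illustration, the three preprocessing steps and a simple feature-extraction pass could be sketched as follows. This is a minimal sketch in Python with NumPy; the smoothing kernel, window length, and chosen features are hypothetical choices for illustration, not those of any particular sensor or study.

```python
import numpy as np

def preprocess(raw, baseline, window=250):
    """Illustrative preprocessing for a 1-D physiological signal
    (e.g., skin conductance); all parameters are hypothetical."""
    # 1. Filtering: moving-average smoothing to reduce sensor noise.
    kernel = np.ones(5) / 5
    smooth = np.convolve(raw, kernel, mode="same")
    # 2. Segmentation: partition into fixed-length time windows.
    n = len(smooth) // window
    segments = smooth[: n * window].reshape(n, window)
    # 3. Normalization: subtract the mean of a resting-baseline recording.
    return segments - baseline.mean()

def extract_features(segments):
    """Per-segment statistical features: mean, standard deviation,
    and a count of local peaks (spikes)."""
    feats = []
    for seg in segments:
        peaks = np.sum((seg[1:-1] > seg[:-2]) & (seg[1:-1] > seg[2:]))
        feats.append([seg.mean(), seg.std(), peaks])
    return np.array(feats)
```

Each row of the resulting feature matrix summarizes one time window and can be fed to the machine-learning stage described next.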

Finally, the processed data can be used to build the human models using machine learning. The model consists of a set of parameters trained from labeled data—features extracted from the physiological signals of humans with already established models—which serve as ground-truth examples (marked by the dashed arrow in Figure 1). These labels are collected using traditional methods such as questionnaires or observations, and hence are prone to noise and manipulation. However, system developers can control quality through incentives (payment for truthful responses) or disincentives (no benefit from cheating). A range of machine-learning algorithms is readily available to train the human-modeling component of the framework. Once trained, it can be deployed to predict the model values for new subjects, whose features are likewise extracted from their physiological signals. When too many features are extracted, machine learning is at risk of overfitting, but this can be mitigated using feature-selection methods.
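To make this step concrete, here is a minimal sketch in Python with NumPy. The correlation criterion and the nearest-centroid classifier are illustrative stand-ins—simple enough to show the train-then-predict flow—not the algorithms used in the studies discussed below.

```python
import numpy as np

def select_features(X, y, k=3):
    """Keep the k features most correlated with the labels: a simple
    stand-in for the feature-selection step that mitigates overfitting."""
    corrs = np.array([abs(np.corrcoef(X[:, j], y)[0, 1])
                      for j in range(X.shape[1])])
    return np.argsort(corrs)[::-1][:k]

class NearestCentroid:
    """Minimal classifier: predict the class whose mean (centroid)
    feature vector is closest; any off-the-shelf algorithm could
    be substituted here."""
    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.centroids_ = np.array(
            [X[y == c].mean(axis=0) for c in self.classes_])
        return self

    def predict(self, X):
        # Squared Euclidean distance to every class centroid.
        dists = ((X[:, None, :] - self.centroids_[None, :, :]) ** 2).sum(axis=2)
        return self.classes_[dists.argmin(axis=1)]
```

Training uses subjects whose ground-truth labels are known; prediction then applies the fitted model to feature vectors extracted from new subjects' signals.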

Two Recent Human-Modeling Case Studies

We predicted human personality traits using affective stimuli and eye-tracking data [3]. Personality is considered to be a set of stable characteristics that affect human behavior, cognition, and emotions. Personality detection is a nontrivial task, typically requiring humans to fill out lengthy questionnaires rooted in personality and psychometric theories. Since the questionnaire data is self-reported, the models are often noisy and manipulation-prone.

We showed our participants 50 images and seven videos, all validated to evoke emotional responses. We focused on eye signals captured using commercial-grade eye-tracking glasses—eye blinks, saccades, fixations, and pupil-size measurements. Ten features were extracted, ranging from simple ones, like saccade rate per second, to more complex geometric ones intended to reflect ocular muscle activity, such as the average peak angular velocity of saccades. We extracted a total of 170 features from the images and videos. Since the data collection involved only 21 participants, we applied correlation-based feature selection to select a predictive set of fewer than 10 features. These were fed into machine-learning classifiers trained to predict 16 personality traits across three established personality models: Dark Triad, BIS/BAS, and HEXACO [5]. The ground-truth data was obtained by administering the personality questionnaires associated with these models and grouping the participants into low, medium, and high classes for each trait.

The overall predictive accuracy of the best-performing classifier, across all the personality traits and participants, was close to 0.86, with six traits being predicted with an accuracy greater than 0.90. This is much higher than the benchmark random-guess probability of 0.33 in the three-class classification task. In particular, we noted that the tactics, views, and morality traits achieved over 0.90 accuracy; all belong to the Machiavellianism component of the Dark Triad, which is associated with affective rather than cognitive traits [6]. We attributed this to the fact that our stimuli were affect-based. Considering the image and video stimuli, we unsurprisingly found that the videos resulted in more accurate predictions than the images. Similarly, we examined the predictive power of features for various traits and found the most predictive feature-trait combination was the number of blinks and psychopathy, which aligns with prior work showing that people with psychopathic traits tended to display unusual blink responses [7].

Practically speaking, such a system could dramatically reduce the time required to administer questionnaires and provide near real-time objective personality modeling. Accuracy rates of 0.86 may not yet allow the detection of mental pathologies but could prove useful in longitudinal psychological assessments. The low entry cost of sensors and data-analysis packages suggests that such a method could practically supplement traditional personality questionnaires.

Predictions of Parkinson's disease (PD) from mobile-phone gesture analysis were achieved using a similar approach [4]. PD is a neurodegenerative disorder that affects the motor system and is diagnosed using neurological examinations, computed tomography scans, and magnetic resonance imaging. It is important to highlight that PD symptoms often show up—and, thus, PD is diagnosed—at relatively late stages, when significant and irreversible brain damage has already occurred, emphasizing the importance of early detection.

To predict the PD diagnosis, the authors used commercial-grade smartphones and common mobile gestures: flick, drag, handwriting, pinch, and tapping. Since PD affects humans' fine-motor skills [8], the authors hypothesized that this would manifest in finger movements captured by the smartphone's sensors. The participants were tasked with performing 60 flick, 60 pinch, and 30 drag gestures, writing and typing for 10 minutes, and performing the alternative finger-tapping test currently used to diagnose PD. The touch signal captured by the screen sensors was processed and 46 features were extracted, grouped into touch, trajectory, temporal, and inertia categories. The study involved 102 participants: about a third diagnosed with PD and the rest healthy. Hence, the human modeling was essentially a PD-diagnosis prediction represented by a two-class classification.

The accuracy of the predictions was measured using the area under the receiver-operating-characteristic curve (AUC), which ranges from 0 to 1. First, the authors studied the performance of the four groups of features. Although the trajectory and inertia features achieved AUCs close to 0.9, combining all the groups improved the AUC to 0.95. Considering the mobile-input gestures, drags and pinches achieved an AUC of 0.92. Further adding flicks and handwriting boosted the AUC to 0.95, while eventually adding typing increased the AUC to as high as 0.97. Overall, for the best-performing combination of gestures and features, both the true-positive and true-negative rates—that is, the ratios of correct predictions for PD-diagnosed and healthy subjects—were around 0.9. To position this with respect to existing clinical methods, the currently deployed alternative finger-tapping test achieves an AUC of approximately 0.83 [9].
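The AUC metric used here has a convenient rank interpretation: it is the probability that a randomly chosen positive case (a PD-diagnosed subject) receives a higher classifier score than a randomly chosen negative one. A minimal Python sketch of that computation, with hypothetical scores:

```python
import numpy as np

def auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney formulation:
    the fraction of positive-negative pairs ranked correctly,
    with tied scores counting half."""
    labels, scores = np.asarray(labels), np.asarray(scores)
    pos = scores[labels == 1]  # scores of positive (e.g., PD) cases
    neg = scores[labels == 0]  # scores of negative (healthy) cases
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (wins + 0.5 * ties) / (len(pos) * len(neg))
```

A classifier that scores every positive case above every negative one reaches an AUC of 1.0; random guessing sits at 0.5, which puts values like 0.95-0.97 in context.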

Practically speaking, this case study showcased another implementation of the proposed framework in a promising medical application of human modeling, using a simple smartphone. By joining cross-disciplinary skillsets, the authors demonstrated that a challenging medical condition can be predicted with accuracy levels surpassing the current clinical methods. Research is yet to study whether neurological conditions can be detected with electro- and magneto-encephalogram (EEG and MEG, respectively) sensors, directly capturing brain activity.

Where To Next?

With recent advances on the sensing, signal/data processing, and machine-learning fronts, the exercise of accurate and reliable human modeling seems to be within reach. The modeling framework we proposed seeks to provide structure and confidence to teams aiming to boost user experience by introducing human modeling and personalization in their applications. The discussed case studies are promising; in the near future, we may be able to:

  • Predict mental/cognitive disorders. With a plethora of new body and activity-tracking devices, mental or cognitive disorders could be assessed on a high-frequency basis by the proposed framework, instead of requiring a visit to a practitioner's clinic. Conditions such as anxiety and depression in the mental space, or even reading and learning disorders like dyslexia, are often hard to establish objectively but could potentially be screened using our framework and the right combination of stimuli, sensors, data processing, and machine learning. Affective stimuli were shown to be accurate in the first case study discussed here; hence, suitable cognitive triggers could also be developed to target specific disorders, which can potentially be captured by common sensors such as a camera/microphone or motion trackers. Stimulus-based processing also safeguards against misuse of the system, as the evaluation takes place in an agreed time and place. Such screening technologies would be invaluable for effective cognitive-behavioral therapies.
  • Detect susceptibility to cybersecurity attacks. The human factor plays a major role in cybersecurity: even with existing security software in place, many cyber incidents are associated with human error. For example, making a hasty decision on an incoming email, potentially a phishing attack, can have disastrous consequences for the targeted individual and their organization alike. Hence, it is critical to understand who is susceptible to which types of cybersecurity risk, in order to educate and protect these people. Deploying our framework to examine the brain signals or mouse movements of a person faced with simulated cybersecurity threats could allow the modeling of their cognitive processes. Once such a model is derived, it will be possible to detect hesitation or subconscious behavior when a person faces a potential cyber threat, and bring this to their attention for conscious examination. Combining the model with adaptive training could further reduce vulnerability and turn human users into an active defense against cyberattacks.

We believe the HCI community and available technology now provide the support required for novel, next-generation methods of human modeling and personalized interaction. The use cases we presented highlight how important real-life problems have been addressed with promising results, and how they can be abstracted into a reasonably simple framework. Many more challenging problems are out there, offering high reward and real-life impact. Hence, we take this opportunity to issue a call to action and encourage researchers and practitioners to look into these problems!

References

1. Kobsa, A., Nejdl, W., and Brusilovsky, P., eds. The Adaptive Web: Methods and Strategies of Web Personalization. Springer, 2007.

2. Belkin, M., Hsu, D., Ma, S., and Mandal, S. Reconciling modern machine-learning practice and the classical bias-variance trade-off. Proc. of the National Academy of Sciences 116, 32 (2019), 15849–15854.

3. Berkovsky, S., Taib, R., Koprinska, I., Wang, E., Zeng, Y., Li, J., and Kleitman, S. Detecting personality traits using eye-tracking data. Proc. of CHI 2019, paper 221.

4. Tian, F., Fan, X., Fan, J., Zhu, Y., Gao, J., Wang, D., Bi, X., and Wang, H. What can gestures tell? Detecting motor impairment in early Parkinson's from common touch gestural interaction. Proc. of CHI 2019, paper 83.

5. McCrae, R.R. and Costa, P.T. Personality in Adulthood: A Five-Factor Theory Perspective. Guilford Press, 2003.

6. Jonason, P.K. and Krause, L. The emotional deficits associated with the Dark Triad traits: Cognitive empathy, affective empathy and alexithymia. Personality and Individual Differences 55, 5 (2013), 532–537.

7. Patrick, C.J., Bradley, M.M., and Lang, P.J. Emotion in the criminal psychopath: Startle reflex modulation. Journal of Abnormal Psychology 102, 1 (1993), 82–92.

8. Pradhan, S., Brewer, B., Carvell, G., Sparto, P., Delitto, A., and Matsuoka, Y. Assessment of fine motor control in individuals with Parkinson's disease using force tracking with a secondary cognitive task. Journal of Neurologic Physical Therapy 34, 1 (2010), 32–40.

9. Arroyo-Gallego, T., Ledesma-Carbayo, M.J., Butterworth, I., Matarazzo, M., Montero-Escribano, P., Puertas-Martín, V., Gray, M.L., Giancardo, L., and Sánchez-Ferro, A. Detecting motor impairment in early Parkinson's disease via natural typing interaction with keyboards: Validation of the neuroQWERTY approach in an uncontrolled at-home setting. Journal of Medical Internet Research 20, 3 (2018), e89.

Authors

Ronnie Taib is a principal research engineer at Data61 – CSIRO in Sydney, Australia. With a passion for understanding and measuring human-machine interaction, he has published over 50 papers covering multimodal interaction and cognitive load measurement based on physiology and behavioral signals. ronnie.taib@data61.csiro.au

Shlomo Berkovsky is an associate professor at the Australian Institute of Health Innovation, Macquarie University, where he is leading a team of researchers working on precision health. He is a computer scientist who has published over 130 papers. His core expertise areas are user modeling and personalized technologies. shlomo.berkovsky@mq.edu.au


©2020 ACM  1072-5520/20/05  $15.00

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

The Digital Library is published by the Association for Computing Machinery. Copyright © 2020 ACM, Inc.
