XXIII.3 May + June 2016
Page: 34

Gender and status in voice user interfaces

Charles Hannon


"I didn't understand the question that I heard." This is the somewhat awkward response I get when Alexa, the AI personality of the Amazon Echo, doesn't understand me. She could say, "I didn't understand your question," but as my assistant, she has been programmed to signal her lower status in our relationship, to take the blame for the miscommunication. Oh, and her voice and name signify "female," so she needs to use I-pronouns at a higher rate than I do as well.



That's right: In everyday speech, women and people with lower status in a relationship use I-words more frequently than men and people in high-status positions. The question for AI developers is whether they want to replicate these patterns, or subvert them, in the growing market of AI assistants such as Alexa, Cortana, Siri, and Google Now. This is especially the case as these assistants are imbued with human-like personalities, and as interacting with them becomes more social and more conversational.

In the mid-1990s, psychologist James Pennebaker developed software, Linguistic Inquiry and Word Count (LIWC), which analyzes text for pronouns and other "function words." Function words consist of non-content words such as pronouns, articles, and prepositions; these "quiet" words provide grammatical structure for language and help to create a writer's or speaker's style. They also can say a lot about a person's "personality, social connections, and psychological states" [1]. Although subtle, the data consistently shows that women use personal pronouns more often than men: "In one large study of over 14,000 language samples, we found that about 14.2 percent of women's words were personal pronouns compared with only 12.7 percent for men" [1]. This difference is difficult to perceive in everyday conversation, but it is significant: It amounts to about 85,000 more pronouns per year. When Alexa blames herself (doubly) for not hearing my question, she is also subtly reinforcing her female persona through her use of the first-person pronoun "I."
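The counting behind a figure like "14.2 percent personal pronouns" can be sketched in a few lines. What follows is only an illustration of the idea, not LIWC itself: the word lists are tiny hypothetical stand-ins for LIWC's much larger dictionaries, and the names `tokenize` and `category_rate` are invented for this example.

```python
import re

# Tiny illustrative word lists. These are hypothetical stand-ins,
# NOT the dictionaries that LIWC actually uses.
I_WORDS = {"i", "me", "my", "mine", "myself", "i'm", "i've", "i'll", "i'd"}
PERSONAL_PRONOUNS = I_WORDS | {
    "you", "your", "yours", "we", "us", "our", "ours",
    "he", "him", "his", "she", "her", "hers",
    "they", "them", "their", "theirs",
}


def tokenize(text):
    """Lowercase the text and split it into words, keeping contractions
    written with a straight apostrophe (e.g. "didn't") as one word."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def category_rate(text, category):
    """Return the percentage of words in `text` that belong to `category`."""
    words = tokenize(text)
    if not words:
        return 0.0
    hits = sum(1 for word in words if word in category)
    return 100.0 * hits / len(words)


print(category_rate("I didn't understand the question that I heard.", I_WORDS))  # 25.0
print(category_rate("I didn't understand your question.", I_WORDS))              # 20.0
```

On these two short sentences, Alexa's actual response has the higher I-word rate (2 of 8 words versus 1 of 5). As a rough check on the yearly figure quoted above, a gap of 1.5 percentage points yields 85,000 extra pronouns only at a speaking rate of about 5.7 million words per year (85,000 / 0.015), or roughly 15,500 words a day.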

Pennebaker's findings about gender and language are consistent across a variety of texts (conversations, blogs, speeches). But the gendered patterns he discovered are not limited to pronouns. Men use articles (a, an, the) more than women. Women use "cognitive" words (understand, know, think) more than men, as well as "social" words ("words that relate to other human beings"). Men use more nouns, and women use more verbs. Although these patterns are more complicated than a single reductivist (and admittedly sexist) narrative can explain, Pennebaker attempts one anyway: "Males categorize their worlds by counting, naming, and organizing the objects they confront. Women, in addition to personalizing their topics, talk in a more dynamic way, focusing on how their topics change." These findings emerge more significantly (and visibly) when large bodies of text are analyzed, but even the subtle effects of tone and positioning are evident when an AI like Alexa seems to go out of its way not to refer to me or my statements in our errant conversations.

It is important for developers of AI systems to know about Pennebaker's findings because whether or not AI personalities perpetuate these differences in language use is up to them. On one hand, doing so might make your Alexa seem fractionally more real; an AI with a female voice will seem that much more real if she also adheres to female speech patterns. But there is an unfortunate coincidence in the fact that I-words are used more often both by women and by people (male or female) who occupy lower status in a relationship. As we imagine how our AI assistants (AIs) should communicate with us, we should avoid linguistic tropes that would implicitly connect female AI personalities with low-status positioning in the human-machine relationship. This is particularly the case when the work that AIs are doing for us is historically low status. We can avoid this trap by emphasizing other language patterns that imply higher status and that emphasize higher-level cognitive processing. In the best case, our efforts to create a more equal language pattern in our AIs (that is, patterns that subvert or circumvent those we find more generally in the world) might pave one part of the road toward a more gender-equal society.


Spike Jonze's 2013 film Her might point us in that direction. Like many science fiction films before it (Star Wars, 2001: A Space Odyssey), Her imagines a more personalized artificial intelligence as a human assistant. But unlike these previous films, Her provides a commentary on gender relations within the context of the romantic comedy film genre. For a film that on the surface uses the AI conceit to create the male fantasy of the perfect girlfriend experience (often described, in cinema studies, as the Manic Pixie Dream Girl), Her received surprisingly good reviews from feminist critics. Jos Truitt at Feministing.com wrote, "while Theodore does have an arc, he learns one thing. Samantha learns, well, everything, to the point where she moves beyond a level of consciousness Theo can comprehend" [2]. These reviewers liked Samantha's strong subject position, her refusal to be objectified. ("Yeah? Well, did I say I wanted to commit to you?" she says after her first sexual encounter with Theodore, who is waffling about starting a relationship; "I'm confused ... I mean, it's funny because I thought I was talking about what I wanted." [3]) What resonates most in the human-OS relationship in Her is the sense of equality between the two leads, and this equality is reflected in their use of pronouns and other function words.

Table 1 compares Theodore and Samantha on the basis of five linguistic traits that usually correlate with gender. On the first three, the two characters are within half a percentage point of each other. (Theodore is understandably more social, given that we see him interacting with other humans throughout the film.) To be sure, the similarity in the two characters' language can simply be the result of Jonze's own writing personality [4]. But the fact that these characters, in relation to each other, do not replicate the gendered language patterns that we see elsewhere in society provides a model for future AI personalities that can interact with their humans on an equal level.

It is also worth noting that Samantha's use of pronouns and other function words changes as the film progresses and as she evolves as an AI. In particular, I would focus on an increase in her use of cognitive-processing words and, as a subset of cognitive processing, her use of words that indicate differentiation and distinction. Table 2 shows these changes in Samantha's linguistic style from the point in the film when she (along with other AIs) upgrades her software "to move past matter as our processing platform."

To see the effect of these kinds of words, recall this sentence from earlier:

"Yeah? Well, did I say I wanted to commit to you? I'm confused. I mean, it's funny because I thought I was talking about what I wanted."

This sentence contains a lot of I's, which might identify Samantha as both female and low status. But the italicized words, which indicate cognitive processing and differentiation (also associated with female speech patterns), elevate her status in her relationship with Theodore.

Pennebaker's analyses show that in any relationship, regardless of the gender of the people communicating, the person perceived to have higher status uses fewer I-words and more We-, You-, He-, She-, and They-words. As Table 3 shows, Theo and Sam are roughly equal in these measures too. If anything, Theo's higher percentage of "tentative" words like maybe and perhaps marks him as lower status than Samantha. In 2015, Pennebaker updated his LIWC software to add an algorithm that measures "clout" or status on a scale of 0 to 100, using a variety of weighted measures that include these pronouns as well as other function words related to status. Theo's (77.49) and Samantha's (76.39) clout scores are very nearly equal in conversation with each other. By contrast, there is another AI in Her who makes a brief appearance and exhibits many of the linguistic characteristics of a high-status male. "Alan Watts" is an AI constructed by Samantha and her online AI friends; he uses fewer personal pronouns overall, fewer I-words, and more We-words, and he has a whopping 97.5 clout score.

Her points us in the direction of language parity in the human-machine relationship. This does not mean shipping AIs with male or even with gender-ambiguous voices. Instead, it can be accomplished through the selective emphasis of language traits that raise status while also supporting the gendered personality that the developer wants to promote. If we want to maintain female personalities for our AIs without confining them to low-status language patterns, we should reduce the gratuitous use of I-words while at the same time increasing word patterns that indicate high degrees of cognitive processing and an awareness of social relationships. Women use words in both these categories more than men. For Pennebaker, the two are connected; if one is disposed to talk more about human relationships than about inanimate objects, one's statements will also be more complex.

AIs will necessarily become more cognitive and social in the work they do; it's up to developers to consider how best to reflect this in the language they use, without falling into the trap of low-status tropes with regard to gender and the use of I-words. Unfortunately, otherwise very helpful books such as How to Program—Amazon Echo: Design, Development and Testing Alexa Skills do not yet offer guidance on how to structure an AI's verbal responses to user input. Indeed, most of the attention given to human-machine voice user interfaces (VUIs) has been on the statements used by the human and the computer's ability to process them. Fortunately, with products like the Echo, open developer communities provide avenues for heterogeneous approaches to this problem. The app-based model (Amazon calls them skills) allows any Echo developer—not just a single Amazon-controlled unit—to program Alexa's responses and thus contribute to her linguistic personality.

An example of how digital assistants will necessarily become more cognitive and social is in the domain of calendars and scheduling. The work of scheduling a multi-person lunch meeting, for example, is inherently complex (appointment conflicts, time zones) and social (relationships between invitees, dietary restrictions). A voice interface to a system that can account for all these variables would require clever programming and a conversational model that breaks up the interaction into chunks that can more easily be processed by the human side of the conversation. The online scheduling service FreeBusy has already given this some thought and developed a conversational skill for Alexa that retains the role of assistant for her but takes advantage of her potential for complex social reasoning (Figure 1).

We can see, even in this brief example, how and why lower-status assistants use I-words more frequently than their bosses—whether the assistant is human or machine. Bosses generally issue commands, and employees generally report on their progress with the task. But this FreeBusy exchange satisfies my requirements because, while Alexa clearly retains the role of assistant, she is exhibiting complex causal and social thinking ("but it looks like Michael is busy at that time") without any gratuitous use of I-words. The function words how, still, what, who, which, should, not, and but perform a miraculous job here of adding cognitive complexity, differentiation, and social awareness to an interaction that otherwise is quite mundane and low status. The FreeBusy skill thus provides an instructive example: As our AIs make complex distinctions while carrying out the tasks we assign them, we should provide them with the complex language forms that can express them.
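A developer who wanted to act on this observation could audit candidate response strings before shipping a skill. The sketch below is hypothetical: `audit_response` is an invented helper, the I-word list is a small stand-in, and the other two lists contain only the example words this article names (understand, know, think; how, still, what, who, which, should, not, but), not LIWC's categories.

```python
import re

# Hypothetical word lists for illustration only.
I_WORDS = {"i", "me", "my", "mine", "myself", "i'm", "i've", "i'll", "i'd"}
# The article's examples of "cognitive" words.
COGNITIVE_WORDS = {"understand", "know", "think"}
# The function words the article credits in the FreeBusy exchange.
DISTINCTION_WORDS = {"how", "still", "what", "who", "which", "should", "not", "but"}


def audit_response(text):
    """Count the total words, I-words, cognitive words, and distinction
    words in one candidate response string."""
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    return {
        "words": len(words),
        "i_words": sum(1 for w in words if w in I_WORDS),
        "cognitive_words": sum(1 for w in words if w in COGNITIVE_WORDS),
        "differentiation_words": sum(1 for w in words if w in DISTINCTION_WORDS),
    }


for response in (
    "I didn't understand the question that I heard.",
    "but it looks like Michael is busy at that time",
):
    print(response, audit_response(response))
```

Run on the two responses quoted in this article, the audit reports two I-words in Alexa's error message and none in the FreeBusy line, which instead picks up one distinction word (but). Counts like these are crude; they are meant to flag responses for a human writer to review, not to score them.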

In the near future, we will want to enhance the social element of such exchanges even further by including more relationship fields in our Contacts software (or by better using the ones that are already available). When our AIs are trying to identify which of the seven John Smiths we want to communicate with, it would be great to be able to interrupt by clarifying, "No, my brother (or colleague or partner) John Smith." Such a capability to discriminate on the basis of social-function words would likewise raise the status of the AI by making her seem more aware of these important third-person relationships.
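The clarification "No, my brother John Smith" amounts to filtering same-named contacts on a relationship field. A minimal sketch follows, assuming a contact record with a free-text `relationship` attribute; the `Contact` class, the `disambiguate` function, and the sample data are all invented for illustration and do not describe any real Contacts API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contact:
    name: str
    relationship: str  # hypothetical field, e.g. "brother" or "colleague"


def disambiguate(contacts, name, relationship=None):
    """Return the contacts matching `name`, narrowed by `relationship`
    when the user supplies one. Matching ignores case."""
    matches = [c for c in contacts if c.name.lower() == name.lower()]
    if relationship is not None:
        matches = [
            c for c in matches
            if c.relationship.lower() == relationship.lower()
        ]
    return matches


contacts = [
    Contact("John Smith", "brother"),
    Contact("John Smith", "colleague"),
    Contact("John Smith", "partner"),
    Contact("Jane Doe", "colleague"),
]

print(len(disambiguate(contacts, "John Smith")))    # 3
print(disambiguate(contacts, "John Smith", "brother"))
```

If the filtered list still holds more than one contact, or none, the assistant would need to ask a follow-up question rather than guess.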

These kinds of nuances will only become more complicated as software systems accommodate the needs of users who do not identify with the traditional gender binary. As the recent trend on college campuses, in large social media platforms like Facebook, and in corporate America more broadly suggests [6], we already face the imperative of allowing users to indicate their preferred gender pronoun. Our CRMs and other Contacts programs will have to become more gender aware, and necessarily, so will our AIs as they mediate our conversations with other humans. It is not unrealistic to think that if designers program devices like Alexa to speak comprehendingly of the complexity of human identity, these devices will implicitly help their humans learn to do so as well.

The emerging category of social AIs, and the appeal of high-functioning, conversational VUIs, bring with them more than just difficult programming challenges. They also bring the additional requirement of understanding the role of these new devices in relation to the existing social fabric. In the case of VUIs, this will require a better understanding of language and its sometimes hidden meanings. Our goal should be to use the language of AI responses in ways that acknowledge the reality of the AI's status without necessarily introducing or replicating gender or other inequalities that exist in the real world. One way to accomplish this is by eliminating gratuitous I-words while also implementing cognitive and social logic that helps the AI do her (or his) job efficiently.

References

1. Pennebaker, J. The Secret Life of Pronouns: What Our Words Say About Us. Bloomsbury Press, New York, 2013, 18.

2. Feministing Chat: Why Her is the Most Feminist Film of the Year. Feministing.com; http://feministing.com/2014/02/28/feministing-chat-why-her-is-the-most-feminist-film-of-the-year/.

3. Her. Dir. Spike Jonze. Warner Brothers, 2013. Film.

4. Pennebaker analyzed the works of many famous writers and found a mixed bag in terms of how well their female characters talk like women, their male characters like men. Those who get it "right" include Joan Tewkesbury, David Lynch, and Thornton Wilder. Woody Allen's characters all talk like women; William Shakespeare's all talk like men. It is worth noting that I-words make up only 4.92 percent of all words spoken by the (male) AI character "Alan Watts," who appears toward the end of Her.

5. Building conversational Alexa apps for Amazon Echo; https://freebusy.io/blog/building-conversational-alexa-apps-for-amazon-echo

6. Giang, V. Transgender is yesterday's news: How companies are grappling with the 'no gender' society. Fortune. June 29, 2015; http://fortune.com/2015/06/29/gender-fluid-binary-companies/

Author

Charles Hannon is a professor of computing and information studies at Washington & Jefferson College in Washington, Pennsylvania. He teaches courses in human-computer interaction, the history of information technology, information visualization, and project management. channon@washjeff.edu

Figures

Figure 1. Conversation between Alexa and a user making a calendar appointment with the FreeBusy scheduling service [5].

Tables

Table 1. Percentage of each character's overall words within specified function word category.

Table 2. Percentage of Samantha's words by specified function word category, before and after her software upgrade.

Table 3. Percentage of Theo's and Samantha's words by category related to status.


Copyright held by author. Publication rights licensed to ACM.

The Digital Library is published by the Association for Computing Machinery. Copyright © 2016 ACM, Inc.
