P. Foglia, F. Giuntoli, C.A. Prete, M. Zanda
E-government (e-gov) is the use of information technologies to deliver government services through Web sites devoted to users interacting with government. E-gov is a growing sector of Web usage, driven by the expectation of improving the delivery of public administration services, easier data integration in information systems, and overall cost reduction. However, citizens use e-gov services reluctantly, partly due to usability problems.
To improve the user experience of such sites, we developed a prototype e-gov Web site using an animated face (AF) with text-to-speech (TTS) voice to assist users. We used the AF because faces stimulate human attention [3], and anthropomorphic agents increase users' perception of flow [5] (flow is a construct depicting a user's interactions as playful and exploratory). Consistent with physiological and computer science studies, we conducted user tests to measure how adding the AF influences users' behavior. Our results show that the AF was effective in reducing the number of visited pages when users had specific tasks to perform on a Web site; however, contrary to our expectations, task completion times were not significantly altered by the introduction of the AF. We conclude that the AF can be useful in e-gov Web sites, but not in situations where time to perform a task is critical or must be shortened.
E-Gov Web Site. The e-gov Web site was developed following recognized usability recommendations, also taken from the Italian government. We developed a local administration Web site that offered a broad range of online services for citizens, companies, and tourists.
The Web site had two versions, one without and one with the AF. In the latter version, the AF was placed in the bottom-right corner of each page without modifying the original content and layout (Figures 1 and 2).
We followed published guidelines to make the AF pronounce text effectively [4, 7]. The AF's speech was neutral with respect to the assigned tasks. When a Web page was loaded, the face briefly introduced the Web page, describing the page structure and classifying page sections. The face moved its eyes synchronously to indicate which area was being described. When a user pointed to a link, the AF described where the link would lead. When a wizard was in use, the face introduced the page contents and gave suggestions on how to proceed. For instance: "Welcome to the registration area of the Web site. Please find the input form at the center of the page. You can start by filling out your name." When the participants focused on a text box, the face described what kind of information was required and what had to be done next. The AF gave further information with respect to the text in the body of the page, and the spoken text appeared below the AF. The Web site included help buttons in each page, offering textual assistance or vocal assistance (with the AF); the AF could be deactivated if users didn't find it helpful.
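The behavior described above is essentially a mapping from user-interface events (page load, pointing at a link, focusing a text box) to spoken text. The following is a minimal sketch of that idea, not the prototype's actual implementation: the `UiEvent` type, the `SCRIPT` table, the target identifiers, and the link description are hypothetical, and only the two quoted utterances are taken from the article.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class UiEvent:
    kind: str    # "page_load", "link_hover", or "field_focus"
    target: str  # hypothetical identifier of the page, link, or field


# Hypothetical script table. The registration welcome and the fine-date hint
# are quoted from the article; the link description is invented.
SCRIPT = {
    ("page_load", "registration"): (
        "Welcome to the registration area of the Web site. "
        "Please find the input form at the center of the page. "
        "You can start by filling out your name."
    ),
    ("link_hover", "pay_fine"): "This link leads to the parking fine payment page.",
    ("field_focus", "fine_date"): (
        "Please find the data required for this textbox in the date field "
        "at the top right corner of the fine."
    ),
}


def utterance_for(event: UiEvent, af_enabled: bool = True) -> Optional[str]:
    """Return the text the AF should speak, or None if it should stay silent."""
    if not af_enabled:  # the user switched the AF off
        return None
    return SCRIPT.get((event.kind, event.target))
```

In this sketch the same string would be sent to the TTS engine and shown as the caption below the face, and a deactivated AF simply yields no utterance.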
The face was animated from a 2D photograph. Reallusion Crazytalk (www.reallusion.com/crazytalk/) was used as the morphing software, synchronized with the TTS voice. The TTS engine was Loquendo TTS (www.loquendo.com). Crazytalk provides a client-side plugin, so the only overhead for the AF was the 2D image and the speech to be pronounced.
Experiments and Results. For user tests we chose two tasks: registering on the Web site and paying a parking fine in the citizens' section (Figures 3, 4, and 5). Because the potential users are all Italian citizens, we recruited a group of 38 citizens that was as heterogeneous as we could make it. Half of the participants completed the first task without the AF and the second task with the AF; the other half did the opposite.
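The split described above is a counterbalanced design: every participant performs both tasks, one with the AF and one without, and the two orderings are used equally often. A minimal sketch of such an assignment follows. The article does not say how participants were allocated to the two halves, so the random shuffle, the `seed` parameter, and the task keys are assumptions made for illustration.

```python
import random


def assign_conditions(participant_ids, seed=0):
    """Split participants into two equal halves with opposite AF conditions.

    Returns {participant_id: {"registration": bool, "parking_fine": bool}},
    where True means the AF is shown for that task.
    """
    ids = list(participant_ids)
    random.Random(seed).shuffle(ids)  # assumed: random allocation to halves
    half = len(ids) // 2
    plan = {}
    for index, pid in enumerate(ids):
        af_on_second_task = index < half
        plan[pid] = {
            "registration": not af_on_second_task,
            "parking_fine": af_on_second_task,
        }
    return plan


# 38 participants, as in the study; the numeric ids are placeholders.
plan = assign_conditions(range(1, 39))
```

With 38 participants this yields 19 people who see the AF only during registration and 19 who see it only while paying the fine.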
We observed that participants who visited the home page with the AF used the help feature more frequently than those who visited the home page without the AF (Table 1).
The presence of the AF reduced the number of pages visited in both tasks, thus reducing, to some extent, wrong navigation decisions (Table 1). However, task-completion times were not reduced, because participants listened to the AF and only then decided what to do, so their interaction with the Web site was not faster. Wrong navigation decisions were reduced due both to the information provided by the AF and to the increased usage of the help feature on the home page.
On average, the participants rated the effectiveness of the AF at 3.7 on a five-point Likert scale. Participants were more relaxed while visiting a Web page that included the AF than one without it, and only one participant switched off the AF.
Discussion. In general, the AF proved its effectiveness when participants felt lost and did not know how to proceed, which happened if the task was unusual or difficult. Conversely, the users who knew what to do and how to proceed considered the AF annoying and time-consuming. Some participants did not consider the AF useful during the registration task: since this was a common task, they required little assistance. Comments such as "The face continues to talk; I want to stop it," or "Can I go on while it is talking?" were frequent. These results are in line with Nielsen's suggestions to keep animations short and avoid using video unless necessary. Internet users drive their own experience through a continuous set of choices and clicks, so long broadcast-style video comes across as boring on the Web.
Participants' attitudes changed substantially in the second task. None of the participants had ever paid a parking fine online, so they were willing to listen to the details given by the AF. When participants had to fill out the form with the details of the fine, they listened to the AF attentively. A common scenario was:
AF: "Please find the data required for this textbox in the date field at the top right corner of the fine."
Participant: "Let me check... here it is!"
A benefit of online interactions is their potential speed, and AFs should assist users while slowing them down as little as possible. Adopting the AF for fast interactions is not effective; however, the AF proved particularly useful when participants experienced difficulties on the Web site and when fast interactions were not required.
- The AF is effective with unusual or difficult tasks.
- The AF is not useful with usual or simple tasks.
- In either case, the AF does not provide faster interactions.
Acknowledgements. This work has been partially supported by "Fondazione Cassa di Risparmio di Pisa," under the Easy.Gov project.
3. Haxby J.V., Gobbini M.I., Furey M.L., Ishai A., Schouten J.L., Pietrini P. Distributed and overlapping representations of faces and objects in ventral temporal cortex. Science, 293(5539):2405-7, Sept. 28, 2001.
5. Qiu L., Benbasat I. An investigation into the effects of text-to-speech voice and 3D avatars on the perception of presence and flow of live help in electronic commerce. ACM Transactions on Computer-Human Interaction (TOCHI), 12(4), 2005.
P. Foglia, University of Pisa, Italy
F. Giuntoli, University of Pisa, Italy
C.A. Prete, University of Pisa, Italy
M. Zanda, IMT Advanced Studies, Italy
About the Authors
Pierfrancesco Foglia is assistant professor at the Information Engineering Department of the University of Pisa, Italy. His research interests include computer architecture, coherence protocols, high-performance servers, and infrastructures for E-Commerce. He coordinated a research project involving the University of Pisa and Siemens ICN for the development of a manager for a telecommunication network of GSM devices. He is a member of IEEE, IEEE Computer Society, and ACM.
Fabio Giuntoli received an MS in Computer Engineering from the University of Pisa in 2006. Currently he is developing a usable e-learning system to be adopted by the University of Pisa and other public institutions.
Cosimo Antonio Prete is full professor at the Information Engineering Department of the University of Pisa. He is also a board member of the PhD course "Computer Science Engineering" at IMT Advanced Studies in Lucca and is a coordinator of various research projects. His research interests include multiprocessor architectures, cache memory, embedded systems, and usability engineering. He is the Italian coordinator of the HiPEAC European Network of Excellence and a member of ACM.
Michele Zanda is a PhD student in Computer Science and Engineering at the IMT Lucca Institute for Advanced Studies. He received the MS degree in Computer Engineering from the University of Pisa in 2004. He is investigating methods and techniques to ensure usability in web applications. He is a member of ACM SIGCHI.
Table 1. Statistical evaluation of the hypotheses tested by the user studies. P-values were calculated using Wilcoxon's signed ranks test. Roughly, p-values estimate the likelihood that observed differences are due to chance.
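For readers unfamiliar with the test named in the caption, the following is a minimal sketch of how the Wilcoxon signed-rank statistic is computed from paired observations. The page counts below are invented purely for illustration and are not the study's data; the function name and variable names are likewise hypothetical.

```python
def signed_rank_sums(without_af, with_af):
    """Return (W_plus, W_minus) for paired samples.

    Differences are with_af - without_af; zero differences are dropped and
    tied absolute differences receive the average of the ranks they span.
    """
    diffs = [b - a for a, b in zip(without_af, with_af) if b != a]
    ordered = sorted(diffs, key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over the run of equal absolute differences (ties).
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return w_plus, w_minus


# Invented example: pages visited by six hypothetical participants.
pages_without_af = [9, 7, 8, 6, 10, 7]
pages_with_af = [6, 7, 5, 7, 6, 5]
w_plus, w_minus = signed_rank_sums(pages_without_af, pages_with_af)
```

The smaller of the two sums is the test statistic; a p-value is then obtained from its distribution under the null hypothesis, for example with a library routine such as `scipy.stats.wilcoxon`.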
©2007 ACM 1072-5220/07/0100 $5.00
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.