Bridge the gap

XIII.3 May + June 2006
Page: 66

Discovering modalities for adaptive multimodal interfaces

Srihathai Prammanee, Klaus Moessner, Rahim Tafazolli

With every new generation of mobile terminals, whether cell phones, PDAs, or gaming consoles, the range of features grows wider and the way interactive content is presented grows more diverse. Yet the means of interacting with applications, the user-interface capabilities, remain restricted to a small screen, audio input and output, and occasionally a stylus.

On the other hand, the environment in which mobile communication occurs is becoming ever more crowded with devices offering a wide range of interface modalities. We designed a scheme in which, when a mobile user comes into physical range of devices hosting modalities (screens, audio systems, etc.), these ambient interface devices are bound into an overall multimodal user interface. In effect, whenever suitable devices are available, the user interface can be tailored to the needs of the user as well as the requirements of the application.

While the principle is rather straightforward, there are a number of associated problems, including:

  • discovery of user interfaces, modalities, and their capabilities
  • interface binding (establishing physical connections to forward media streams)
  • adaptation of the application

While multimodal interfaces and interactions have been researched [1-4], little ongoing research investigates how multimodal interfaces can be made adaptive. One exception is the European Framework 6 Research Project "MobiLife", in which the principles of a context-aware implementation of user interfaces are investigated.

One of the features of multimodal applications is the possibility of interacting via different interface devices. This means that the user should be able to interact via any modality available, even if the environment is mobile. Hence, our development scans the environment, then discovers and incorporates locally available devices. The aim is to dynamically extend the range of interface modalities offered by the personal portable terminal. The devices that can be incorporated could be visual (screen, camera), audio-based (microphone, speaker), motion-detecting, and so on. The architecture (we call it MID-B, Multi Modal Interface-Binding Engine) consists of one "private" terminal that senses the environment, selects appropriate devices and external modalities, and binds them into the (distributed) adaptive multimodal user interface.

Making user interfaces adaptable extends the user-friendliness and usability of mobile terminals and applications. But this comes at a price: mobile connections are rather volatile, and configurations that combine multiple different devices into a multimodal interface can be unstable. Hence our system must monitor and react, in real time, to changes in connectivity between the devices, as well as in the availability of devices and external modalities. It also has to enable (user) interface reallocation in case the owners of a temporarily used modality want to use their device themselves. Implementing a (multimodal) user-interface configuration involves four main functions, illustrated in Figure 1:

  • detection of the user's environment context (i.e., mobility awareness, situation awareness)
  • device and modality discovery, which discovers the type and capabilities of external modalities
  • device and interface binding, which amalgamates selected devices and modalities into the overall user interface
  • user content adaptation, which transcodes the format of the data content
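
The reallocation requirement described above can be illustrated with a small sketch. It is not the MID-B implementation; the class name `InterfaceBinding`, its methods, and the device names are hypothetical. It assumes that when an external device disappears, or its owner reclaims it, each affected modality falls back to the private terminal's built-in equivalent if one exists.

```python
from dataclasses import dataclass, field


@dataclass
class InterfaceBinding:
    """Tracks which device currently serves each modality (hypothetical model)."""

    builtin: dict[str, str]  # modality -> device on the private terminal
    active: dict[str, str] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Start with only the terminal's own modalities bound.
        self.active = dict(self.builtin)

    def bind(self, modality: str, device: str) -> None:
        """Route a modality to an external device."""
        self.active[modality] = device

    def release(self, device: str) -> list[str]:
        """React to a lost connection or to an owner reclaiming a device.

        Each modality served by that device falls back to the built-in one,
        or is dropped if the terminal has no equivalent.
        Returns the list of affected modalities.
        """
        affected = [m for m, d in self.active.items() if d == device]
        for modality in affected:
            if modality in self.builtin:
                self.active[modality] = self.builtin[modality]
            else:
                del self.active[modality]
        return affected
```

In a real system `release` would be triggered by a connectivity monitor; here it is called directly to keep the sketch short.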

Looking into the mechanisms for device and modality discovery, we identified the main features and a solution approach.

Discovering Available Interfaces and Modalities. Assume a situation in which many public devices with their interface capabilities are available: A user interacts with an application using his portable private interface device, which in turn searches for and accesses the modalities that are provided by these public devices. To find the available devices, there are a number of discovery protocols, such as Jini, UPnP, and Bluetooth SDP. While all of them are perfectly capable of announcing and discovering services and devices within ad hoc networks, they have not been designed to discover modalities and their capabilities. Information that describes the dependencies between discovered devices and modalities, and the characteristics and capabilities of these modalities, is missing.

Three-Party Model for Multimodal Interface Discovery. Extending the functionality of those discovery mechanisms, we introduce a three-party model for multimodal interface discovery. The three parties are the "private interface device," "public interface devices," and a "service manager." Our aim is to extend the existing discovery mechanisms with features supporting modality characteristics discovery. Our model implementation is an extension of the Service Discovery Protocol (SDP) Transaction of Bluetooth [5]. The transaction provides a request/response scenario for wireless networks, and any transaction consists of a request and a response PDU (Protocol Data Unit).

SDP already provides some service description, but it lacks descriptions of modality services and characteristics. We added an extension for Multimodality Transactions (MM Transaction), which uses the request/response scenario and provides details of the multimodal service. The added details include modality type(s), characteristics, dependencies, and device connectivity.
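
To make the added details concrete, the following sketch models one possible shape of an MM Transaction response. The article does not give the PDU layout, so every field name here is an assumption, and JSON stands in for the binary encoding that real SDP PDUs use. The only points taken from the text are the four kinds of detail (type, characteristics, dependencies, connectivity) and the pairing of a request with a response.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ModalityDescription:
    """One modality offered by a device (field names are illustrative)."""

    modality_type: str            # e.g. "display" or "audio-out"
    characteristics: dict         # e.g. {"resolution": "1024x768"}
    dependencies: list = field(default_factory=list)  # modalities it relies on
    connectivity: list = field(default_factory=list)  # links the device offers


@dataclass
class MMResponse:
    """Response half of a hypothetical MM Transaction."""

    transaction_id: int           # echoes the ID of the request it answers
    device_id: str
    modalities: list


def encode_response(resp: MMResponse) -> str:
    """Serialize to JSON (a stand-in, not the SDP wire format)."""
    return json.dumps(asdict(resp), sort_keys=True)


def decode_response(raw: str) -> MMResponse:
    """Rebuild the response object from its JSON form."""
    data = json.loads(raw)
    mods = [ModalityDescription(**m) for m in data["modalities"]]
    return MMResponse(data["transaction_id"], data["device_id"], mods)
```

A receiver would compare `transaction_id` against the request it sent before accepting the description.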

Figure 2.

Figure 3.

Figure 4.

We defined an interface and modality description, based on a script for private as well as public interface devices. To keep it short, we named the private device the "User Equipment Core Device" (UE_C) and the public device "User Equipment Interface Device" (UE_I).

While the Bluetooth SDP implements its service discovery in a two-party model (request/response architecture), our system introduces a third party called the "Multimodal Service Base" (MSB). The MSB implements the basic management of modalities and hosts a registry of the available device and modality characteristics. It also manages the information on dependencies between the discovered interface devices and modalities; this dependency description is necessary because each UE_I may carry one or more modalities.
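
A registry of this kind could be sketched as follows. This is an illustration under assumptions, not the authors' MSB: the class and method names are invented, and the device-to-modality dependency is reduced to a simple two-way lookup.

```python
from collections import defaultdict


class ModalityRegistry:
    """Toy stand-in for the MSB registry (all names are hypothetical)."""

    def __init__(self) -> None:
        # device_id -> {modality_type: characteristics}
        self._by_device = defaultdict(dict)

    def register(self, device_id: str, modality_type: str, characteristics: dict) -> None:
        """Record that a UE_I carries a modality with given characteristics."""
        self._by_device[device_id][modality_type] = dict(characteristics)

    def unregister(self, device_id: str) -> None:
        """Forget a device and every modality it carried."""
        self._by_device.pop(device_id, None)

    def modalities_of(self, device_id: str) -> list:
        """Modalities that depend on this device being reachable."""
        return sorted(self._by_device.get(device_id, {}))

    def devices_offering(self, modality_type: str) -> list:
        """Devices that could serve a requested modality."""
        return sorted(
            dev for dev, mods in self._by_device.items() if modality_type in mods
        )
```

Keeping the mapping keyed by device means that removing one UE_I withdraws all of its modalities in a single step, which matches the observation that a device may carry several of them.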

The MSB also unifies the modality descriptions of the discovered devices; in this process it dynamically combines information on the state of the service life cycle, user preferences, and user environment context.

The mechanism is fairly straightforward: A user, carrying a UE_C, moves into a new environment. An application is running on the private device (UE_C). As the application can use multiple modalities, the UE_C searches for external interfaces, issuing a request PDU. An available UE_I responds with an interface and modality description. The MSB unifies and stores this information for each responding UE_I and provides it to the UE_C. The application gets (from the MSB) information on available modalities; the UE_C then binds the chosen UE_Is and their modalities and reroutes the modality streams accordingly.
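
The sequence just described can be traced in a few lines of code. The sketch below is a simplified simulation, not the protocol itself: devices and messages are plain dictionaries, the function names are invented, and the user-preference ranking in `unify` is an assumed policy used only to show where the MSB could apply preferences.

```python
def discover(request: dict, devices: list) -> list:
    """The UE_C issues a request; each UE_I in range answers with the
    modalities it carries that match the request."""
    return [
        {"device": dev["id"], "modality": modality}
        for dev in devices
        for modality in dev["modalities"]
        if modality in request["wanted"]
    ]


def unify(responses: list, preferred_devices: list) -> dict:
    """The MSB merges all responses into one view per modality, ordering
    candidates by an assumed user-preference list, then by name."""
    view = {}
    for resp in responses:
        view.setdefault(resp["modality"], []).append(resp["device"])
    rank = {dev: i for i, dev in enumerate(preferred_devices)}
    for candidates in view.values():
        candidates.sort(key=lambda dev: (rank.get(dev, len(rank)), dev))
    return view


def bind(view: dict) -> dict:
    """The UE_C binds the first candidate for each modality."""
    return {modality: candidates[0] for modality, candidates in view.items()}
```

For example, with a wall screen, a hi-fi system, and a television in range, calling `bind(unify(discover(request, devices), ["tv"]))` routes both display and audio output to the television, while an empty preference list falls back to alphabetical order.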

Summary. We've described a scheme developed to facilitate user-interface adaptivity. The scheme foresees the mobility of users and devices and the need to dynamically reconfigure the user interface to whatever modality is available and sensible. Our mechanism for the real-time discovery of devices carrying interface modalities is also outlined.

References

1. S. Zachariadis, C. Mascolo and W. Emmerich, "SATIN: A Component Model for Mobile Self-Organisation," in Proc. of Int. Symposium on Distributed Objects and Applications (DOA), Agia Napa, Cyprus, October 2004.

2. M. Mikic-Rakic and N. Medvidovic, "Support for Disconnected Operation via Architectural Self-Reconfiguration," 1st Int. Conf. on Autonomic Computing (ICAC'04), New York, May 2004.

3. B. Keller, T. Owen, I. Wakeman, J. Weeds and D. Weir, "Middleware for User Controlled Environments," PerWare Workshop, PerCom 2005 Middleware Support for Pervasive Computing, Hawaii, USA, March 2005.

4. G. Niklfeld, H. Anegg and A. Gassner, "Device independent mobile multimodal user interfaces with the MONA multimodal Presentation Server," Eurescom Summit 2005 Ubiquitous Services and Applications Exploiting the Potential, Heidelberg, Germany, April 2005.


Authors

Srihathai Prammanee
University of Surrey, UK

Klaus Moessner
University of Surrey, UK

Rahim Tafazolli
University of Surrey, UK

About the Authors:

Srihathai Prammanee is a researcher at the Centre for Communication Systems Research (CCSR) at the University of Surrey, UK.

Klaus Moessner is a senior research fellow at the Centre for Communication Systems Research at the University of Surrey, UK.

Professor Rahim Tafazolli leads the Mobile Communications research group at the Centre for Communication Systems Research at the University of Surrey, UK.

To submit a research article, please email Carolyn Gale at


Carolyn Gale
National Center on the Psychology of Terrorism, and Stanford University

Figures

Figure 1. Multimodality in Mobile Environment

Figure 2. The SDP Request/Response Model

Figure 3. Three-Party Model of Multimodal Interface Discovery

Figure 4. The Sequence Diagram of the Extension

©2006 ACM  1072-5220/06/0500  $5.00

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

The Digital Library is published by the Association for Computing Machinery. Copyright © 2006 ACM, Inc.
