Gadgets: part 2

XIII.5 September + October 2006
Page: 30
Increasing text-entry usability in mobile devices for languages used in Europe


Authors:
Martin Böcker, Bruno von Niman, Karl Ivar Larsson

Telecommunications devices currently represent one of the largest global consumer product segments. As telecommunications devices and services converge with technologies such as information processing, broadcast services and the Internet, while at the same time becoming mobile and ubiquitous, the usability of these devices and services becomes a critical factor in service uptake. One of the most challenging aspects of mobile-device usability is text entry using the standard 12-key telephone keypad.

At present, finding the characters necessary to enter a name in the terminal’s phone book, searching for a name, writing an SMS (text) message or logging on to a mobile Internet portal is not always easy, because manufacturers differ in terms of which European characters their devices support, how they are ordered in lists, and how the specific characters are mapped onto the keys of the keypad. Character-set implementations sometimes vary even between devices and applications from the same manufacturer. Standardizing the way characters are mapped onto keypads gives users easier access to different communication devices and services, allowing simple, correct and efficient text input, search, and retrieval. It also broadens market opportunities for manufacturers and suppliers and reduces their development costs.

The original reason for assigning letters to the rotary dial pad and later to the numeric telephone keys was to provide alphabetic "aliases" for digits, as mnemonics in dialing. The need to use a telephone keypad for entering text or data was not envisioned. Nobody in the pioneer days of telephony anticipated the concept of a "phone book" stored inside the telephone, or a service like SMS, the very successful service for transmitting short text messages as an alternative to voice communication.

The only standards previously available (e.g., ETSI ETS 300 640 or ITU-T Recommendation E.161 (02/01)), addressing the assignment of characters to the 12-key telephone keypad, were limited to the assignment of the basic 26 Latin letters (A to Z). Language-specific letters (e.g., ü, é, å, ä, ö) as well as other characters (e.g., € or @) were not addressed. The lack of a standard on these issues has led to diverse and inconsistent solutions for European languages, creating obvious accessibility barriers to basic communication access in Europe.

Europe has around 230 indigenous languages—there are close to 7,000 worldwide. The largest number of languages presently supported by a specific ICT device or service is approaching 50. Cultural and linguistic diversity is one of the key strengths of Europe. However, in ICT, it raises issues that need to be considered and solved in order not to limit access to services and their availability and usability—on the basic as well as more advanced levels.

The first version of ETSI ES 202 130 has been developed to solve the problem for some of the most important European languages by defining character repertoires, sorting orders, and the assignment of letters to the 12-key telephony keypad for these languages. A new version of ETSI ES 202 130 will extend this work to cover other major languages spoken in Europe including official languages, minority languages, and immigrants’ languages. All of this work was aligned with the European Commission’s initiative eEurope, a program for accelerated uptake and inclusive deployment of new, important, consumer-oriented technologies (http://europa.eu.int/information_society/eeurope).

* Scope of ETSI ES 202 130.

The current version of ETSI ES 202 130 specifies the minimum repertoire and assignment of graphic (letter, digit and special) characters to standard 12-key telephone keypads on ICT devices with telephony functionality. It applies to public or private, fixed or mobile network terminals, without an alphanumeric keyboard but providing a 12-key keypad in hardware form (e.g., as push-button keys) or software form (e.g., as soft keys on a visual display). It also applies to network-based services accessed through such terminal devices. It complements ETS 300 640 by additionally including European language-specific letters (Latin, Greek and Cyrillic scripts) and other common characters (e.g., € and punctuation marks). It specifies solutions for both language-independent and language-specific keypad assignments, mapped to the 12-key telephone keypad, also providing common and language-specific information on character repertoires and ordering.

The standard is fully applicable to the official languages of the European Union (EU) member countries as of 2005 and those of countries with candidate status (Romania, Bulgaria, and Turkey) and, additionally, to the official languages of the EFTA (the European Free Trade Association) countries, as well as Russian. The languages fully covered by the first version of ETSI ES 202 130 are therefore: Bulgarian, Czech, Danish, Dutch, English, Estonian, Finnish, French, German, Greek, Hungarian, Icelandic, Irish, Italian, Latvian, Lithuanian, Luxemburgish, Maltese, Norwegian, Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, Spanish, Swedish, and Turkish. In anticipation of future expansions, the language-independent repertoires and keypad assignments specified also include letters needed in some of the remaining European official languages.

ETSI ES 202 130 does not cover any implementation-related issues, e.g., specifics of predictive text input or user interface design.

* User Requirements.

Users of the standard are those implementing it: for example, interaction designers and other developers of ICT devices and services who design user interfaces for text input and output on 12-key keypad arrays provided in hardware form (e.g., as push-button keys) or software form (e.g., as soft keys on a visual display), and of telecommunication-network-based services accessed through such terminal devices.

End users addressed are the consumers of the ICT devices and services mentioned above, ranging from first-time to experienced advanced users, who can produce tactile stimuli in the form of a key press and perceive written text. The end users’ main goal is to use ICT devices and services efficiently under the circumstances for which these are intended. The implementation of ES 202 130 enables users to reapply knowledge and previous experience across different ICT devices and services that use a standard 12-key keypad array and a display. Control of common functions such as entering characters and retrieving text in a certain order will be simplified. Well-established services that rely on alpha mnemonics (e.g., "800 DOCTOR" rather than "800 362867") are not negatively influenced, as the standard only complements ETS 300 640.
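The alpha-mnemonic example can be reproduced with a minimal sketch, assuming the basic A to Z assignment of ETS 300 640 / ITU-T E.161 (ABC on key 2 through WXYZ on key 9). The names `KEY_LETTERS` and `mnemonic_to_digits` are illustrative and do not come from any standard.

```python
# Basic letter groups on keys 2-9 (A-Z only, as in ETS 300 640 / ITU-T E.161).
KEY_LETTERS = {
    "2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
    "6": "MNO", "7": "PQRS", "8": "TUV", "9": "WXYZ",
}

# Invert the table: each letter points to the digit of the key it is printed on.
LETTER_TO_DIGIT = {
    letter: digit
    for digit, letters in KEY_LETTERS.items()
    for letter in letters
}


def mnemonic_to_digits(text: str) -> str:
    """Replace each A-Z letter by its key digit; leave other characters as-is."""
    return "".join(LETTER_TO_DIGIT.get(ch.upper(), ch) for ch in text)


print(mnemonic_to_digits("800 DOCTOR"))  # 800 362867
```

Because this table covers only the 26 basic letters, adding language-specific letters to the keys leaves existing mnemonics unchanged.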

For certain end users with special needs, ES 202 130 is particularly helpful due to consistent implementations (the same character always appears in the same position, regardless of the terminal manufacturer). The standard is not expected to have any impact for certain disabilities (e.g., in the case of temporary or permanent difficulties caused by cognitive problems or the lack of necessary proficiency level in the respective language and other communication impairments such as visual impairments, the inability to produce distinctive tactile stimuli, or difficulties in handling, distinguishing, and understanding textual information).

Uniformity in the basic interactive elements increases the transfer of learning between devices and services and improves the overall usability of the entire interactive environment. Such transference becomes even more important in a world of ubiquitous devices and services.

Guiding principles during the development of the ordering and assignments of the alphanumeric characters have been:

  • 1) Consistent and harmonized across different devices and services
  • 2) Easy to learn and remember
  • 3) As natural as possible, matching previously acquired knowledge
  • 4) Redundancy (multiple solutions possible to reach desired input)

Methodology.

* Initial survey.

As an early component of developing the standard, an informal survey of the key assignments in a number of mobile-phone models was carried out on several major manufacturers’ handsets. The survey was based mainly on specifications and user manuals downloaded from the Internet but also on "hands-on" investigation.

* Principles applying to ES 202 130.

In order to arrive at a consistent and easy-to-implement presentation of the requirements for character repertoires, ordering rules and character assignment to the 12-key keypad, the principles listed in Table 1 were applied throughout the production of the standard. Some of these are elaborated in the following.

* Characters needed.

Approximately 240 Latin-repertoire letters are needed to cover the major European languages. With Greek and Cyrillic letters added, the number increases to well over 350. This can be compared to the 75 Latin-repertoire letters (mix of capital and small) supported by the present GSM 03.38 7-bit scheme generally implemented in today’s mobile phones and networks (85 letters total, when the Greek capital letters of that scheme are included). It was necessary to include in the language-specific repertoires more letters than are contained in the "core" of those languages, called "Type A" letters. This is because in all languages there is a user need also to input foreign-origin words, some of them needing "foreign" letters. Further, in all countries there exist user preferences in spelling of some names with "foreign" letters, and possibly also a need to represent names—personal and/or geographical—correctly in recognised minority languages. The repertoire tables therefore also include "Type B" letters (see Figure 1).

* Character ordering.

Ordering of characters is a highly complex problem that has been the subject of very large amounts of work in several standardization bodies, both national and international. Earlier ETSI and ISO/IEC standards specify principles based on a "multilevel" approach for the ordering of strings of characters. However, it was found necessary to adopt a simplified "single-level" method for this standard, considering the limited capabilities of telephone devices as compared with computer systems. As regards letters, the two language-independent repertoire tables specify a deterministic ordering. For the language-specific repertoire tables, however, some additional criteria were applied because of established practices in telecommunications, e.g., for printed telephone directories.

In all European languages, the letters A to Z are considered part of the alphabet, even if, in many of them, some of the letters are not used in any indigenous-origin words. Some languages also have special-shape letters, like the German ß. Additionally, all languages use special variants of letters A to Z with diacritical marks, like the acute accent and the cedilla (e.g., é and ç). For ordering, most languages consider such variants equivalent to the basic letter. In some languages, however, a few of them are considered letters of their own and are ordered differently. For instance, the letter Ö is ordered in Swedish as the last letter of the alphabet. As far as possible, national conventions were followed for the language-specific repertoire tables. This may possibly cause "non-deterministic" ordering in specific cases. Although unsatisfactory in principle, it was concluded that this could be acceptable for the relevant applications.
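The difference between treating an accented letter as a variant of its base letter and treating it as a letter of its own can be illustrated with a rough single-level sort key. This is only a sketch under stated assumptions, not the ordering tables of ES 202 130: the function names and the `SWEDISH_EXTRA` tailoring string are hypothetical, and the sample names are made up.

```python
import unicodedata


def base_letter(ch: str) -> str:
    """Strip combining marks so that, e.g., 'é' is keyed like 'e'."""
    decomposed = unicodedata.normalize("NFD", ch)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


# Hypothetical tailoring: letters that Swedish orders as separate letters after Z.
SWEDISH_EXTRA = "åäö"


def sort_key(word: str, extra_letters: str = "") -> list:
    """Single-level key: one entry per character, no secondary accent level."""
    key = []
    for ch in unicodedata.normalize("NFC", word.lower()):
        if ch in extra_letters:
            # Separate letter: sorts after every base letter, in the given order.
            key.append((1, extra_letters.index(ch)))
        else:
            # Variant: sorts exactly like its base letter.
            key.append((0, base_letter(ch)))
    return key


names = ["Öberg", "Olsson", "Zetterlund", "Oscar"]

# Default: Ö is keyed like O.
print(sorted(names, key=sort_key))
# ['Öberg', 'Olsson', 'Oscar', 'Zetterlund']

# With the Swedish tailoring: Ö sorts after Z.
print(sorted(names, key=lambda w: sort_key(w, SWEDISH_EXTRA)))
# ['Olsson', 'Oscar', 'Zetterlund', 'Öberg']
```

In this sketch, two strings that differ only in a diacritic receive the same key when no tailoring applies, so their relative order is left undetermined. That is one way a single-level scheme can be non-deterministic.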

* Keypad input sequences.

In today’s keypad-input implementations, foremost in mobile phones, the digits are generally placed as the last character in the key-press sequence, following not only the standardized letter assignments (ABC on key 2, DEF on key 3, etc.) but also all special-letter variants assigned to the keys. The same principle was considered for ES 202 130. However, the special needs of visually impaired users make the principle questionable. It was therefore decided to place the digits instead immediately after the presently standardized letter assignments, i.e., as the fourth key press on all keys except 7 and 9 (PQRS and WXYZ), where it is the fifth (see Table 1 and Figure 2).
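The effect of the digit position on a multi-press sequence can be sketched as follows. The base letter groups are the standardized A to Z assignment; the extra letters passed in the example ("äå" on key 2) are purely illustrative and are not taken from the tables of ES 202 130. The helpers `key_sequence` and `presses_needed` are hypothetical names.

```python
# Standardized basic letter assignment on keys 2-9.
BASE_LETTERS = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}


def key_sequence(key: str, extra_letters: str = "", digit_last: bool = False) -> list:
    """Characters reached by pressing `key` repeatedly, in order.

    digit_last=False: digit directly after the basic letters (as described
    for ES 202 130). digit_last=True: digit after all special-letter variants
    (the earlier common practice).
    """
    base = list(BASE_LETTERS[key])
    extras = list(extra_letters)
    if digit_last:
        return base + extras + [key]
    return base + [key] + extras


def presses_needed(key: str, char: str, extra_letters: str = "",
                   digit_last: bool = False) -> int:
    """Number of presses of `key` needed to reach `char`."""
    return key_sequence(key, extra_letters, digit_last).index(char) + 1


print(key_sequence("2", "äå"))                          # ['a', 'b', 'c', '2', 'ä', 'å']
print(presses_needed("2", "2", "äå"))                   # 4
print(presses_needed("7", "7"))                         # 5
print(presses_needed("2", "2", "äå", digit_last=True))  # 6
```

With the digit in a fixed position, the number of presses needed to reach it does not depend on how many language-specific letters a given key carries.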

* Digits and special characters.

As the ordering and keypad assignment of digits and special characters turned out to be somewhat controversial, they were treated following a different set of rules. ES 202 130 defines a set of special characters that must be supported. In addition, other characters may also be supported. The order of appearance in the respective table is only a recommendation, valid for a language-independent implementation, and alternative orders of appearance of special characters are allowed. Furthermore, language-specific orders of appearance are also allowed. The full set of special characters must be accessible via a single entry point. It is recommended that this entry point be the "1" key. In addition, a device may use other, different keys to access different sets of special characters and/or digits. In this case, Rule 1 and Rule 6 must still be followed. Thereby, the implementation of language-specific keypad assignments for special characters and digits is possible.

* Update/Extension of ES 202 130.

ES 202 130 was met with positive responses from industry. An extension of its language coverage is therefore highly desirable, in particular in view of the strong emphasis on multilingualism by the European Union. In January 2006, ETSI decided to begin work on such an extension. This will take the form of either a revision of ES 202 130, or of a complementary standard. The terms of reference for this work specify the extension as containing "major minority languages, some official European languages and… non-European languages used by a considerable number of ICT users in Europe." The interpretation of this is not obvious, since different delimitations may be concluded from the wording.

Geographically, "Europe" is traditionally delimited in the south and the east by the Bosporus, the Caucasian mountain range, the Ural Mountains, and the Ural River. The North Atlantic islands (but generally not Greenland) are also included, as well as those Mediterranean islands not in the proximity of the African continent. This definition is, however, unsatisfactory as a language basis for the terms of reference, since it contains only a small part of the nation of Turkey, also part of Kazakhstan, but not Cyprus, and further leaves the Trans-Caucasian states somewhat undefined. A more suitable definition of "Europe" should be that of the Council of Europe (CoE), which includes all "traditionally European" states; the definition should also clarify that Turkey and Cyprus are part of Europe but that Kazakhstan is not. The CoE also concluded after a thorough investigation, considering historical and cultural as well as other factors, that the Trans-Caucasian states Georgia, Armenia, and Azerbaijan shall be considered European, and therefore eligible for entry in the Council (and all three are currently members).

An adoption of this definition for the update/extension work does not necessarily imply that all of the CoE member states’ official/majority languages will be covered in the selected extension/standard. In particular, the Armenian and Georgian unique script systems will have to be considered and will be studied in the initial phases of the standard extension work. A complication of the CoE definition is that it includes all of the Russian state, which is obviously European as well as Asiatic. This, however, relates only to the selection of minority languages to be covered in the ETSI work, and not to the definition as such. Another complication is the overseas territories of some of the European states. This will need study, which will be performed in the initial stage of the work. As regards European-origin minority languages, special consideration will be taken of languages recognized in ratifications of the CoE charter ETS 148, "European Charter for Regional or Minority Languages." This charter has so far been ratified by about half of the CoE member states and signed—although not yet ratified—by several more. The ES 202 130 update/complement could therefore cover the following categories of languages:

  • official/majority languages of European countries not covered in the current version of the standard (e.g., Croatian)
  • recognized European-country minority languages not already covered by the majority language (e.g., Sorbian in Germany)
  • other important but (as yet) unrecognized minority languages (e.g., Friulian in Italy)
  • large immigrant languages (e.g., Arabic)
  • other important immigrant non-European languages (e.g., Vietnamese, which poses special character complications).

* Summary.

The implementation of ES 202 130 in its current version allows users to enter text into modern ICT devices in a number of major European languages, in a way that is logically consistent and that renders the learning of new keypad assignments superfluous when moving from the devices of one manufacturer to those of another. The usability of future ICT devices will be further increased by a revision/complement of ES 202 130 to expand the number of languages covered by the standard.

Notes

ETSI references are available free of charge at www.etsi.org.

CEN ENV 13710 (2000): "European Ordering Rules—Ordering of characters from the Latin, Greek and Cyrillic scripts."

ETSI ES 202 130 Human Factors; User Interfaces; Character repertoires, ordering and assignment to the 12-key telephone keypad (European languages)

ETSI ETS 300 640 Human Factors (HF); Assignment of alphabetic letters to digits on standard telephone keypad arrays

ETSI TS 100 900 Digital cellular telecommunications system (Phase 2+); Alphabets and language-specific information (same as GSM 03.38 version 7.2.0, Release 1998)

ISO 8859-5: "Information technology—eight-bit single-byte coded graphic character sets—Part 5: Latin/Cyrillic alphabet."

ISO/IEC 14651 (2001): "Information technology—International string ordering and comparison—Method for comparing character strings and description of the common template tailorable ordering."

ISO/IEC 10646-1 (2000): "Information technology—Universal Multiple-Octet Coded Character Set (UCS); Part 1: Architecture and Basic Multilingual Plane."

ISO/IEC 6937 (2001): "Information technology—Coded graphic character set for text communication—Latin alphabet."

ITU-T Recommendation E.161 (02/01) Arrangement of digits, letters and symbols on telephones and other devices that can be used for gaining access to a telephone network.

Authors

Martin Böcker
BenQ Mobile, Germany
martin.boecker@benq.com

Bruno von Niman
vonniman Consulting
bruno@vonniman.com

Karl Ivar Larsson
LWP Consulting, Sweden
ki@lwp.se

About the Authors

Dr. Martin Böcker studied Psychology and Linguistics at the Technical University in West-Berlin and at Exeter University, England. After working with SEL Alcatel and the Heinrich-Hertz-Institute in Berlin on user interfaces for telecommunications systems, he joined Siemens ICM in 1997 where he became head of the Siemens ICM Competence Center for User Interfaces in 2002. Since 2005, Dr. Böcker has been head of user experience at BenQ Mobile in Munich.

Bruno von Niman is founder and lead expert of vonniman consulting, the ICT user experience company. Bruno is vice chairman of the European Telecommunication Standards Institute’s Human Factors Committee and leader of several projects sponsored by the European Commission. Bruno is also the guest editor of this and last issue’s special section on Gadgets.

Karl Ivar Larsson is an independent consultant after retirement from Ericsson. His working-life background is mainly in military electronics engineering, especially real-time computer systems. He has participated broadly in international standardization, in particular in ISO/IEC JTC1 SC2 "Coded character sets" and SC35 "User interfaces," and in CEN/TC304 "Information and communication technologies—European localization Requirements." He also took part in the development of the original version of ETSI standard ES 202 130.

Figures

Figure 1. Extract of the table specifying character repertoire and sorting order for Czech

Figure 2. Extract of the table specifying character assignment to 12-key keypad for Czech

Tables

Table 1. Principles employed in character repertoires, sorting order and keypad assignment

©2006 ACM  1072-5220/06/0900  $5.00

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

The Digital Library is published by the Association for Computing Machinery. Copyright © 2006 ACM, Inc.

 
