What if the everyday objects around us came to life? What if they could sense our presence, our focus of attention, and our actions, and could respond with relevant information, suggestions, and actions? This is the central question addressed by the Ambient Intelligence research group at the MIT Media Laboratory. We are developing novel hardware and software techniques that make it possible to create so-called "attentive objects." For example, we are creating technologies that make it possible for the book you are holding to tell you what passages you may be particularly interested in, while the bookshelf in the room might show you which books are similar to the one in your hands, and the picture of your grandmother on the wall could keep you abreast of how she is doing when you glance up at it.
One motivation for this research is to make people’s lives more convenient. We increasingly live in two parallel worlds: the physical or "real" world of objects, places, and people, and the digital or "virtual" world of electronic communication and information. For most physical objects a wealth of information exists online, but that information is hard to get to. It requires the use of a computer and a cumbersome keyboard interface, as well as navigation through several Web sites and thousands of hits returned by search engines. By augmenting objects with sensors, communication, and computation, and by using intelligent interface techniques to predict what a person may be interested in, we can integrate the digital and physical worlds more closely, so that the most relevant digital information and services can be offered to a person when they interact with a physical object.
Another motivation for making objects come to life is to enrich people’s lives and hopefully even "enlighten" them. Attentive objects can convey interesting information to people that they would never have bothered to look up in the first place, thereby turning every moment into a learning opportunity. For example, objects could tell you about their history, how they are made or what others have said about them. Augmented objects can also enrich a person’s life by providing a form of immersive entertainment.
Of course the wealth of information available about any object is enormous, and making all of this information available to a person while they are going about their life would be overwhelming. That is why it is important to "personalize" the interaction and present to a specific person the information that is of the greatest potential interest, given that person’s focus, context, interests and past actions. To do so, we have had to develop novel techniques to model a user, personalize information and distill large amounts of information into "gems of knowledge" which a person can quickly understand and act upon.
Concretely, our research group has a handful of ongoing projects. A first set of projects gives a person more information about the object that he or she is paying attention to. One way to determine what object a person is interested in is to observe what object they are touching or grasping. For example, a person may pick up a book, or shake another person’s hand. To sense this tactile expression of interest, we have developed a wristband that a person wears all the time (it could possibly be integrated in their wristwatch). This wristband contains a small RFID reader, as well as some accelerometers and a module for wireless communication (see Figure 1). It is completely wireless and powered by a small battery. Throughout the day, as the wearer touches different objects, the wristband picks up the RFID tags of these objects. That information is sent via a cell phone to a server that looks up the object corresponding to this tag and searches for information about that object on sites such as Amazon and Google. The system verifies and updates the profile of the user in question. This profile consists of short-term data (where the person is located, what other objects they recently interacted with, etc.), as well as long-term data (such as the long-term interests and preferences of this user). It then compares the information about the object with the information about the user and decides what information and services to present to the user. This information is sent back to the person’s cell phone in real time, effectively turning the person’s cell phone into a generic interface for any tagged object the person happens to hold.
Users are oblivious to all this information exchange going on in the background. From their point of view, all they need to know is that their cell phones almost instantly present information and suggestions relevant to the object they are attending to. The information the system provides in the case of books includes: an option to buy the book "with one click" (if this is one of the first times they hold this book), an option to do a keyword search in the book, summaries of reviews of the book (specifically by sources which this user typically consults), an informed guess as to whether the user will like the book or not, a listing of other books the user has picked up which are similar to this book, a list of messages left in this book for the user by other people, an opportunity to leave a message in this book for someone else, and so on.
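To make the flow concrete, here is a minimal sketch of the server-side tag-to-suggestion pipeline described above. All names, data shapes, and the overlap heuristic are illustrative assumptions; the actual system queried sources such as Amazon and Google and used richer user models.

```python
# Hypothetical sketch of the wristband pipeline: an RFID tag read arrives,
# the server resolves the object, updates the user's profile, and ranks
# candidate services. Everything here is a stand-in, not the real system.
from dataclasses import dataclass, field

@dataclass
class UserProfile:
    long_term_interests: set                     # long-term data
    recent_tags: list = field(default_factory=list)  # short-term data

# Stand-in for the server-side tag -> object lookup.
OBJECT_DB = {
    "tag:0451526538": {"title": "The Time Machine", "topics": {"sci-fi", "classics"}},
    "tag:0262631857": {"title": "Society of Mind", "topics": {"ai", "cognition"}},
}

def handle_tag_read(tag_id: str, profile: UserProfile) -> list:
    """Resolve a tag, update the profile, and return ranked suggestions."""
    obj = OBJECT_DB.get(tag_id)
    if obj is None:
        return []
    profile.recent_tags.append(tag_id)           # record the interaction
    overlap = obj["topics"] & profile.long_term_interests
    suggestions = [f"keyword search in '{obj['title']}'"]
    if overlap:
        # Matching interests push review summaries to the top of the list.
        suggestions.insert(0, f"reviews of '{obj['title']}' (matches {sorted(overlap)})")
    return suggestions
```

In the real system the returned list would be rendered on the cell phone; the point of the sketch is only the shape of the loop: tag → object → profile update → personalized ranking.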
We have experimented with different input/output formats for presenting this information. One format uses the cell phone’s screen as the output device and the cell phone’s keys as the means of input. In another project we have developed an "on-the-go" style of interface where output is provided in audio/speech format in a cell phone "earbud" while wrist gestures (picked up by the accelerometers on the wristband) are used for input, for example, for navigating through the menus.
Another way in which a user’s current interest can be detected is by measuring the person’s visual focus, or gaze. We have developed a second hardware platform by attaching a small infrared (IR) emitter/receiver to a person’s cell phone earbud. This forward-pointing IR sensor is able to detect IR beacons which are nearby and in the person’s visual focus (see Figure 2). When the user’s head is pointed for a few seconds at an object augmented with such an IR beacon, chances are good that the user is expressing visual interest in this object, and information about it can be spoken into the user’s ear through the earbud. In our laboratory, we have augmented a car engine with approximately 20 such IR beacons. Wearing the augmented earbud, a person can learn about the car engine by just staring at different parts of the engine. The text spoken into the user’s ear is personalized and contextualized based on the level of expertise of the user, the native language of the user, what other parts of the engine the user recently looked at, and so on.
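The "pointed for a few seconds" condition above is essentially a dwell-time filter over the beacon sensor's readings. The following is a toy sketch of that idea; the two-second threshold and beacon identifiers are assumptions, not values from the actual system.

```python
# Illustrative dwell-time filter for the IR-beacon gaze detector: speak
# about a beacon only after the head has stayed on it long enough, and
# only once per fixation. Threshold and IDs are assumptions.
DWELL_THRESHOLD = 2.0  # seconds of sustained fixation before speaking

class GazeDwellDetector:
    def __init__(self, threshold=DWELL_THRESHOLD):
        self.threshold = threshold
        self.current = None      # beacon currently in view (None if none)
        self.since = 0.0         # time the current beacon entered view
        self.announced = False   # already triggered speech for this fixation?

    def update(self, beacon_id, t):
        """Feed one sensor sample; return a beacon id when dwell is long enough."""
        if beacon_id != self.current:
            # Gaze moved: restart the dwell clock on the new beacon.
            self.current, self.since, self.announced = beacon_id, t, False
            return None
        if beacon_id is not None and not self.announced and t - self.since >= self.threshold:
            self.announced = True
            return beacon_id     # trigger personalized speech for this part
        return None
```

Each returned beacon id would then be mapped to a spoken description chosen for the user's expertise, language, and recent gaze history.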
While sometimes the right thing to do is to give the user more information about what they are currently attending to (visually or tactilely), at other times it is useful for objects to try to attract and change the person’s focus of attention. In a related project, we have augmented objects with sensors, an embedded communication device, and an LED light. When a person walks into a room full of such augmented objects, the person’s cell phone exchanges the person’s profile of interests with the augmented objects. The objects that match the person’s interests light up, thereby attracting the person’s attention. Again, this interaction is completely personalized and contextualized. We have experimented with this set-up in the domain of books. If a person walks into the room with augmented books, a different set of books lights up every time (based on what interests the person, what he or she recently paid attention to, whether he or she has been in that room before, etc.). When the user picks up one of the books, books similar to that one light up. The person can also use their cell phone to find a particular book (ordering it to turn on its light) or can do a topic search in an entire library of books (ordering books about topic X to blink their lights).
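The matching step that decides which books light up can be sketched as a simple overlap test between the visitor's profile and each book's topics. The data shapes and the "skip recently seen books" rule are illustrative assumptions standing in for the system's richer personalization.

```python
# Toy version of the profile-to-object matching for the augmented bookshelf:
# a book lights up when its topics overlap the visitor's interests, and
# recently attended books are skipped so the selection varies across visits.
def books_to_light(profile_interests: set, shelf: dict,
                   recently_seen: set = frozenset()) -> set:
    """Return ids of books on the shelf that should turn on their LEDs."""
    lit = set()
    for book_id, topics in shelf.items():
        if book_id in recently_seen:
            continue  # already caught this visitor's attention before
        if topics & profile_interests:
            lit.add(book_id)
    return lit
```

The topic-search command ("books about topic X, blink") is the same test run with a single-topic interest set.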
In another series of projects we are focused on augmenting specific types of objects with which people typically surround themselves. We are extending the functionality of familiar objects such as clocks, coffee machines, and picture frames. Moving Portrait, for example, is a project that brings a portrait in a typical picture frame to life. The portrait interacts with its viewers using a variety of sensing techniques (vision and ultrasonic). The sensors enable the portrait to be aware of the viewers’ presence, distance, and body movements. The portrait reacts to this data by animating the person in the photo. For example, one of our moving portraits depicts a shy little girl (see Figure 4). When a person looks at the portrait long enough, without too much movement and without getting too close, the little girl becomes less shy and eventually starts smiling at the user. But when more people join to look at her, or when the viewer gets too close or becomes too agitated, she quickly hides her face in her hands again. By adding interaction, dynamics, and memory to a familiar portrait, we create a different and more engaging relationship between the viewer and the portrait. The viewer gets to know more about the subject and, in addition, the portrait’s responses are adapted to the viewer’s behavior and to prior interactions with current and former viewers. Just as in real life a person reveals different sides of his or her personality to different people and in different situations, the moving portrait reveals different sides of its own personality.
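The shy girl's behavior can be thought of as a small state machine driven by the sensor data: calm, single-viewer frames accumulate toward a smile, and any startling input resets her to hiding. The thresholds and the three animation states below are assumptions for the sketch, not the project's actual parameters.

```python
# Toy state machine for the "shy girl" Moving Portrait behavior: each
# sensor frame reports viewer count, closest distance, and motion level,
# and the portrait picks an animation. All thresholds are invented.
class ShyPortrait:
    MIN_DISTANCE_OK = 1.0   # meters: any closer scares her
    MAX_MOTION_OK = 0.3     # normalized body-movement level she tolerates
    SMILE_AFTER = 5         # consecutive calm frames before she smiles

    def __init__(self):
        self.calm_frames = 0

    def update(self, viewers: int, distance: float, motion: float) -> str:
        """One sensor frame -> animation to show: hiding, peeking, or smiling."""
        startled = (viewers != 1
                    or distance < self.MIN_DISTANCE_OK
                    or motion > self.MAX_MOTION_OK)
        if startled:
            self.calm_frames = 0
            return "hiding"   # face in hands
        self.calm_frames += 1
        return "smiling" if self.calm_frames >= self.SMILE_AFTER else "peeking"
```

The memory the article mentions (responses shaped by former viewers) would extend this by carrying state across sessions rather than resetting it each time.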
In summary, we are augmenting everyday objects so they can sense a person’s presence and actions and react in interesting ways. Much previous work by others is related, particularly research in augmented reality and mixed reality. Our work differs in that it is less concerned with creating a visual image combining the real with the virtual. There is also a great deal of related work in the field of ubiquitous computing. Our work differs from the majority of that literature in combining ubiquitous computing with techniques from the field of intelligent interfaces (specifically, using techniques for user modeling, personalization, and recommendations to present the user with the most relevant information). Finally, in contrast to related approaches, our work focuses on interfaces for "natural," "on-the-go" interaction. This novel approach to human-computer interaction does not require any explicit actions from the user: Users go about their business interacting with objects the way they always do. Output devices close to the user present relevant information about the user’s environment and current focus of attention. The user is aware of the availability of this additional information, but is able to ignore it if he or she does not have time or does not want to be distracted. The form factor in all our projects stresses the use of non-invasive technology and piggybacks onto the things we already wear or carry around every day, such as a cell phone and wristwatch.
The work discussed here is the research of graduate students Assaf Feldman, David Gatenby, Xingyu (Hugo) Liu, David Merrill, Sajid Sadi, Chia-Ming (James) Teng, and Orit Zuckerman working with the author in the Ambient Intelligence Research Group of the MIT Media Laboratory.
About the Author:
Pattie Maes is an associate professor in MIT’s Program in Media Arts and Sciences and is the founder and director of the Media Lab’s Ambient Intelligence research group. Previously, she founded and ran the Software Agents group. Prior to joining the Media Lab, Maes was a visiting professor and a research scientist at the MIT Artificial Intelligence Lab. She holds a bachelor’s degree and a Ph.D. in computer science from the Vrije Universiteit Brussel in Belgium. Her areas of expertise are human-computer interaction, artificial intelligence, and intelligence augmentation. Maes is the editor of three books, and is an editorial board member and reviewer for numerous professional journals and conferences. She has been named one of the "100 Americans to watch for" by Newsweek, a member of the Cyber-Elite by TIME Digital, and a "Global Leader of Tomorrow" by the World Economic Forum; she has been awarded the 1995 World Wide Web category prize by Ars Electronica and the "Lifetime Achievement Award" by the Massachusetts Interactive Media Council.
©2005 ACM 1072-5220/05/0700 $5.00
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.