As with other areas of user interface design, considerable leverage can be obtained by drawing analogies that use people’s already-existing skills for operating in the natural environment and searching for ways to apply them to communicating with a computer. Direct manipulation interfaces have enjoyed great success, particularly with novice users, largely because they draw on analogies to existing human skills (pointing, grabbing, moving objects in physical space) rather than on trained behaviors; and virtual realities offer the promise of usefully exploiting people’s existing physical navigation and manipulation abilities. These notions are more difficult to extend to eye movement-based interaction, since few objects in the real world respond to people’s eye movements. The principal exception is, of course, other people: they detect and respond to being looked at directly and, to a lesser and much less precise degree, to what else one may be looking at.
In describing eye movement-based human-computer interaction we can draw two distinctions: one is in the nature of the user’s eye movements and the other is in the nature of the responses. Each of these could be viewed as natural (that is, based on a corresponding real-world analogy) or unnatural (no real-world counterpart):
• On the eye movement axis, users within the world created by an eye movement-based interface could move their eyes to scan the scene, just as they would a real-world scene, unaffected by the presence of eye tracking equipment (natural eye movement). The alternative is to instruct users of the eye movement-based interface to move their eyes in particular ways, not necessarily those they would have employed if left to their own devices, in order to actuate the system (unnatural or learned eye movements).
• On the response axis, objects could respond to a user’s eye movements in a natural way, that is, the object responds to the user’s looking in the same way real objects do. As noted, there is a limited domain from which to draw such analogies in the real world. The alternative is unnatural response, where objects respond in ways not experienced in the real world.
The natural eye movement/natural response area is a difficult one, because it draws on a limited and subtle domain, principally how people respond to other people’s gaze. Starker and Bolt provide an excellent example of this mode, drawing on the analogy of a tour guide or host who estimates the visitor’s interests by his or her gazes. In the work described in this chapter, we try to use natural (not trained) eye movements as input, but we provide responses unlike those in the real world. This is a compromise between full analogy to the real world and an entirely artificial interface. We present a display and allow the user to observe it with his or her normal scanning mechanisms, but such scans then induce responses from the computer not normally exhibited by real-world objects.
Most previous eye movement-based systems have used learned ("unnatural") eye movements for operation and thus, of necessity, unnatural responses. Much of that work has been aimed at disabled or hands-busy applications, where the cost of learning the required eye movements ("stare at this icon to activate the device") is repaid by the acquisition of an otherwise impossible new ability. However, we believe that the real benefits of eye movement interaction for the majority of users will be in its naturalness, fluidity, low cognitive load, and almost unconscious operation; these benefits are attenuated if unnatural, and thus quite conscious, eye movements are required. The remaining category, unnatural eye movement/natural response, is anomalous and has not been used in practice.