Eye Movement-Based Human-Computer Interaction Techniques Toward Non-Command Interfaces
ABSTRACT
User-computer dialogues are typically one-sided, with the bandwidth from
computer to user far greater than that from user to computer. The movement
of a user’s eyes can provide a convenient, natural, and high-bandwidth source
of additional user input, to help redress this imbalance. We therefore investigate
the introduction of eye movements as a computer input medium. Our
emphasis is on the study of interaction techniques that incorporate eye movements
into the user-computer dialogue in a convenient and natural way. This
chapter describes research at NRL on developing such interaction techniques
and the broader issues raised by non-command-based interaction styles. It
discusses some of the human factors and technical considerations that arise in
trying to use eye movements as an input medium, describes our approach and
the first eye movement-based interaction techniques that we have devised and
implemented in our laboratory, reports our experiences and observations on
them, and considers eye movement-based interaction as an exemplar of a new,
more general class of non-command-based user-computer interaction.
INTRODUCTION
In searching for better interfaces between users and their computers, an additional mode
of communication between the two parties would be of great use. The problem of human-computer
interaction can be viewed as two powerful information processors (human and computer)
attempting to communicate with each other via a narrow-bandwidth, highly constrained
interface [25]. Faster, more natural, more convenient (and, particularly, more parallel, less
sequential) means for users and computers to exchange information are needed to increase the
useful bandwidth across that interface.
On the user’s side, the constraints are in the nature of the communication organs and abilities with which humans are endowed; on the computer side, the only constraint is the range of
devices and interaction techniques that we can invent and their performance. Current technology
has been stronger in the computer-to-user direction than user-to-computer, hence today’s
user-computer dialogues are typically one-sided, with the bandwidth from the computer to the
user far greater than that from user to computer. We are especially interested in input media
that can help redress this imbalance by obtaining data from the user conveniently and rapidly.
We therefore investigate the possibility of using the movements of a user’s eyes to provide a
high-bandwidth source of additional user input. While the technology for measuring a user’s
visual line of gaze (where he or she is looking in space) and reporting it in real time has been
improving, what is needed is appropriate interaction techniques that incorporate eye movements
into the user-computer dialogue in a convenient and natural way.
NON-COMMAND INTERFACE STYLES
Eye movement-based interaction is one of several areas of current research in human-computer
interaction in which a new interface style seems to be emerging. It represents a
change in input from objects for the user to actuate by specific commands to passive equipment
that simply senses parameters of the user’s body. Jakob Nielsen describes this property:

    The fifth generation user interface paradigm seems to be centered around
    non-command-based dialogues. This term is a somewhat negative way of
    characterizing a new form of interaction, but so far the unifying concept does
    seem to be exactly the abandonment of the principle underlying all earlier
    paradigms: that a dialogue has to be controlled by specific and precise
    commands issued by the user and processed and replied to by the computer.
    The new interfaces are often not even dialogues in the traditional meaning of
    the word, even though they obviously can be analyzed as having some dialogue
    content at some level, since they do involve the exchange of information
    between a user and a computer. The principles shown at CHI’90 which I am
    summarizing as being non-command-based interaction are eye tracking
    interfaces, artificial realities, play-along music accompaniment, and agents [19].
CHARACTERISTICS OF EYE MOVEMENTS
In order to proceed with the design of effective eye movement-based human-computer
interaction, we must first examine the characteristics of natural eye movements, with emphasis
on those likely to be exhibited by a user in front of a conventional (non-eyetracking) computer
console.
Optical/Video Methods – Single Point
More practical methods use remote imaging of some visible feature located on the eye, such
as the boundary between the sclera (the white portion of the front of the eye) and the iris
(the colored portion), although this boundary is only partially visible at any one time; the
outline of the pupil, which works best for subjects with light-colored eyes, unless the pupil is
illuminated so that it appears lighter than the iris regardless of eye color; or the reflection
off the front of the cornea of a collimated light beam shone at the eye. Any of these can then
be used with photographic or video recording (for retrospective analysis) or with real-time
video processing. All of these methods require the head to be held absolutely stationary, to
be sure that any movement detected represents movement of the eye rather than of the
head in space; a bite board is customarily used.
Multiple "Fixations" in a Single "Gaze"
A user may view a single object with a sequence of several fixations, all in the general
area of the object. Since they are distinct fixations, separated by measurable saccades larger
than the jitter mentioned above, they would be reported as individual fixations. Once again, if
the user thinks he or she is looking at a single object, the user interface ought to treat the eye
tracker data as if there were one event, not several. Therefore, following the approach of Just
and Carpenter [14], if the user makes several fixations near the same screen object, connected
by small saccades, we group them together into a single "gaze." Further dialogue processing
is performed in terms of these gazes, rather than fixations, since the former should be more
indicative of the user’s intentions.
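
For concreteness, the grouping rule can be sketched in a few lines of Python. The fixation
format, the 0.5-degree radius threshold, and the function name here are illustrative
assumptions, not the actual parameters of our implementation:

    import math

    def group_fixations_into_gazes(fixations, radius_deg=0.5):
        # fixations: chronological list of (x_deg, y_deg, duration_ms).
        # A fixation joins the current gaze if it falls within radius_deg
        # of that gaze's running centroid; otherwise a new gaze begins.
        gazes = []
        for x, y, dur in fixations:
            if gazes:
                g = gazes[-1]
                if math.hypot(x - g["x"], y - g["y"]) <= radius_deg:
                    # Small saccade: fold the fixation into the current
                    # gaze, updating its duration-weighted centroid.
                    total = g["dur"] + dur
                    g["x"] = (g["x"] * g["dur"] + x * dur) / total
                    g["y"] = (g["y"] * g["dur"] + y * dur) / total
                    g["dur"] = total
                    continue
            # Large saccade (or the first fixation): start a new gaze.
            gazes.append({"x": x, "y": y, "dur": dur})
        return gazes

Grouping against the running centroid, rather than the first fixation alone, keeps a long
gaze anchored to the object even as the individual fixations jitter around it.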
Accuracy and Range
A user generally need not position his or her eye more accurately than the width of the
fovea (about one degree) to see an object sharply. Finer accuracy from an eye tracker might be
needed for studying the operation of the eye muscles but adds little for our purposes. The
eye’s normal jittering further limits the practical accuracy of eye tracking. It is possible to
improve accuracy by averaging over a fixation, but not in a real-time interface.
Despite the servo-controlled mirror mechanism for following the user’s head, we find that
the steadier the user holds his or her head, the better the eye tracker works. We find that we
can generally get two-degree accuracy quite easily, and can sometimes achieve one degree
(approximately 0.4 inch, or 40 pixels on the screen, at a 24-inch viewing distance). The eye tracker
should thus be viewed as having a resolution much coarser than that of a mouse or most other
pointing devices, perhaps more like a traditional touch screen. An additional problem is that
the range over which the eye can be tracked with this equipment is fairly limited.
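
The degree-to-pixel figures above follow from simple trigonometry: at viewing distance D, a
visual angle of theta degrees subtends about D tan(theta) on the screen. A short Python
check (the 96 pixels-per-inch display resolution is our assumption, not a measured property
of the apparatus):

    import math

    viewing_distance_in = 24.0  # eye-to-screen distance, in inches
    angle_deg = 1.0             # roughly the width of the fovea
    pixels_per_inch = 96.0      # assumed display resolution

    span_in = viewing_distance_in * math.tan(math.radians(angle_deg))
    span_px = span_in * pixels_per_inch
    print("1 degree ~ %.2f in ~ %.0f pixels" % (span_in, span_px))
    # Prints: 1 degree ~ 0.42 in ~ 40 pixels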
Eye-controlled Scrolling Text
A window of text is shown, but not all of the material to be displayed can fit. As shown
at the bottom left of Figure 8, a row of arrows appears below the last line of the text and
above the first line, indicating that there is additional material not shown. If the user looks at
the arrows, the text itself starts to scroll. Note, though, that it never scrolls when the user is
actually reading the text (rather than looking at the arrows). The assumption is that, as soon as
the text starts scrolling, the user’s eye will be drawn to the moving display and away from the
arrows, which will stop the scrolling. The user can thus read down to the end of the window
and then, after he or she finishes reading the last line, look slightly below it, at the arrows, in
order to retrieve the next part of the text. The arrows are visible above and/or below the text
display only when there is additional scrollable material in that direction.
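
The scrolling rule can be stated compactly. The sketch below is a reconstruction of the
behavior described above, assuming a line-oriented text buffer polled at regular intervals;
the region names and the one-line-per-tick scroll rate are our assumptions:

    def scroll_step(region, first_line, total_lines, window_lines):
        # region: where the gaze currently rests, one of "text",
        # "upper_arrows", or "lower_arrows" (illustrative names).
        # Returns the new first visible line after one polling tick.
        if region == "lower_arrows":
            # Scroll forward one line, stopping at the end of the text.
            return min(first_line + 1, max(total_lines - window_lines, 0))
        if region == "upper_arrows":
            # Scroll backward one line, stopping at the beginning.
            return max(first_line - 1, 0)
        # Gaze is on the text itself (or elsewhere): never scroll
        # while the user is reading.
        return first_line

    def arrows_visible(first_line, total_lines, window_lines):
        # Show each arrow row only when more material lies that way.
        return first_line > 0, first_line + window_lines < total_lines

Because scrolling is keyed to the arrow regions alone, the moment the user’s eye is drawn
back to the moving text the scrolling stops, giving exactly the self-terminating behavior
described above.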
Listener Window
In a window system, the user must designate the active or ‘‘listener’’ window, that is, the
one that receives keyboard inputs. Current systems use an explicit mouse command to designate
the active window; in some, the command is simply pointing, in others, it is pointing and
clicking. Instead, we use eye position: the listener window is simply the one the user is looking
at. A delay is built into the system, so that the user can look briefly at other windows without
changing the listener window designation. Fine cursor motions within a window are still handled
with the mouse, which gives an appropriate partition of tasks between eye tracker and
mouse, analogous to that between speech and mouse used by Schmandt [22]. A possible
extension to this approach is for each window to remember the location of the mouse cursor
within it when the user last left that window. When the window is reactivated (by looking at
it), the mouse cursor is restored to that remembered position.
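
This policy, together with the proposed cursor-restoring extension, can be sketched as
follows; the 150-millisecond dwell threshold, the window interface, and the warp_cursor
method are hypothetical placeholders rather than details of our system:

    import time

    DWELL_SEC = 0.15  # assumed delay before a glance switches windows

    class GazeWindowManager:
        def __init__(self):
            self.listener = None    # window currently receiving keystrokes
            self.candidate = None   # window the gaze has moved onto
            self.candidate_since = 0.0
            self.saved_cursor = {}  # window -> last mouse (x, y) in it

        def on_mouse_leave(self, window, mouse_xy):
            # Remember the cursor position for later reactivation.
            self.saved_cursor[window] = mouse_xy

        def on_gaze(self, window, now=None):
            now = time.monotonic() if now is None else now
            if window is None or window is self.listener:
                self.candidate = None  # a brief glance away resets the clock
                return
            if window is not self.candidate:
                # Gaze arrived on a new window; start the dwell clock.
                self.candidate, self.candidate_since = window, now
            elif now - self.candidate_since >= DWELL_SEC:
                # Gaze has rested long enough: switch the listener and
                # restore the cursor where the user last left it there.
                self.listener, self.candidate = window, None
                if window in self.saved_cursor:
                    # warp_cursor is a hypothetical window method.
                    window.warp_cursor(*self.saved_cursor[window])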