Skinput
ABSTRACT
We present Skinput, a technology that appropriates the human
body for acoustic transmission, allowing the skin to be
used as an input surface. In particular, we resolve the location
of finger taps on the arm and hand by analyzing mechanical
vibrations that propagate through the body. We
collect these signals using a novel array of sensors worn as
an armband. This approach provides an always available,
naturally portable, and on-body finger input system. We
assess the capabilities, accuracy and limitations of our technique
through a two-part, twenty-participant user study. To
further illustrate the utility of our approach, we conclude
with several proof-of-concept applications we developed.
Author Keywords
Bio-acoustics, finger input, buttons, gestures, on-body interaction,
projected displays, audio interfaces.
ACM Classification Keywords
H.5.2 [User Interfaces]: Input devices and strategies; B.4.2
[Input/Output Devices]: Channels and controllers
General terms: Human Factors
INTRODUCTION
Devices with significant computational power and capabilities
can now be easily carried on our bodies. However, their
small size typically leads to limited interaction space (e.g.,
diminutive screens, buttons, and jog wheels) and consequently
diminishes their usability and functionality. Since
we cannot simply make buttons and screens larger without
losing the primary benefit of small size, we consider alternative
approaches that enhance interactions with small mobile
systems.
One option is to opportunistically appropriate surface area
from the environment for interactive purposes. For example,
[10] describes a technique that allows a small mobile
device to turn tables on which it rests into a gestural finger
input canvas. However, tables are not always present, and
in a mobile context, users are unlikely to want to carry appropriated
surfaces with them (at this point, one might as
well just have a larger device). However, there is one surface
that has been previously overlooked as an input canvas,
and one that happens to always travel with us: our skin.
Appropriating the human body as an input device is appealing
not only because we have roughly two square meters of
external surface area, but also because much of it is easily
accessible by our hands (e.g., arms, upper legs, torso). Furthermore,
proprioception – our sense of how our body is
configured in three-dimensional space – allows us to accurately
interact with our bodies in an eyes-free manner. For
example, we can readily flick each of our fingers, touch the
tip of our nose, and clap our hands together without visual
assistance. Few external input devices can claim this accurate,
eyes-free input characteristic and provide such a large
interaction area.
In this paper, we present our work on Skinput – a method
that allows the body to be appropriated for finger input using
a novel, non-invasive, wearable bio-acoustic sensor.
The contributions of this paper are:
1) We describe the design of a novel, wearable sensor for
bio-acoustic signal acquisition (Figure 1).
2) We describe an analysis approach that enables our system
to resolve the location of finger taps on the body.
Figure 1. A wearable, bio-acoustic sensing array built into
an armband. Sensing elements detect vibrations transmitted
through the body. The two sensor packages shown
above each contain five specially weighted, cantilevered
piezo films, responsive to a particular frequency range.
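The caption's mention of "specially weighted" films can be grounded in a standard result: a cantilever with effective stiffness k carrying an effective mass m resonates at

```latex
f_{\text{res}} = \frac{1}{2\pi}\sqrt{\frac{k}{m}}
```

so attaching a heavier weight to a film lowers the frequency band it responds to. (This is the textbook mass-spring resonance relation; the specific weights and stiffnesses of the sensor elements are not given here.)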
Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that copies
bear this notice and the full citation on the first page. To copy otherwise,
or republish, to post on servers or to redistribute to lists, requires prior
specific permission and/or a fee.
CHI 2010, April 10–15, 2010, Atlanta, Georgia, USA.
Copyright 2010 ACM 978-1-60558-929-9/10/04..$10.00.
3) We assess the robustness and limitations of this system
through a user study.
4) We explore the broader space of bio-acoustic input
through prototype applications and additional experimentation.
RELATED WORK
Always-Available Input
The primary goal of Skinput is to provide an always-available
mobile input system – that is, an input system that
does not require a user to carry or pick up a device. A number
of alternative approaches have been proposed that operate
in this space. Techniques based on computer vision are
popular (e.g. [3,26,27], see [7] for a recent survey). These,
however, are computationally expensive and error prone in
mobile scenarios (where, e.g., non-input optical flow is
prevalent). Speech input (e.g. [13,15]) is a logical choice
for always-available input, but is limited in its precision in
unpredictable acoustic environments, and suffers from privacy
and scalability issues in shared environments.
Other approaches have taken the form of wearable computing.
This typically involves a physical input device built in
a form considered to be part of one’s clothing. For example,
glove-based input systems (see [25] for a review) allow
users to retain most of their natural hand movements, but
are cumbersome, uncomfortable, and disruptive to tactile
sensation. Post and Orth [22] present a “smart fabric” system
that embeds sensors and conductors into fabric, but
taking this approach to always-available input necessitates
embedding technology in all clothing, which would be prohibitively
complex and expensive.
The SixthSense project [19] proposes a mobile, always-available
input/output capability by combining projected
information with a color-marker-based vision tracking system.
This approach is feasible, but suffers from serious occlusion
and accuracy limitations. For example, determining
whether, e.g., a finger has tapped a button, or is merely hovering
above it, is extraordinarily difficult. In the present
work, we briefly explore the combination of on-body sensing
with on-body projection.
Bio-Sensing
Skinput leverages the natural acoustic conduction properties
of the human body to provide an input system, and is thus
related to previous work in the use of biological signals for
computer input. Signals traditionally used for diagnostic
medicine, such as heart rate and skin resistance, have been
appropriated for assessing a user’s emotional state (e.g.
[16,17,20]). These features are generally subconsciously driven
and cannot be controlled with sufficient precision for
direct input. Similarly, brain sensing technologies such as
electroencephalography (EEG) and functional near-infrared
spectroscopy (fNIR) have been used by HCI researchers to
assess cognitive and emotional state (e.g. [9,11,14]); this
work also primarily looked at involuntary signals. In contrast,
brain signals have been harnessed as a direct input for
use by paralyzed patients (e.g. [8,18]), but direct brain-computer
interfaces (BCIs) still lack the bandwidth required
for everyday computing tasks, and require levels of focus,
training, and concentration that are incompatible with typical
computer interaction.
There has been less work relating to the intersection of finger
input and biological signals. Researchers have harnessed
the electrical signals generated by muscle activation
during normal hand movement through electromyography
(EMG) (e.g. [23,24]). At present, however, this approach
typically requires expensive amplification systems and the
application of conductive gel for effective signal acquisition,
which would limit the acceptability of this approach
for most users.
The input technology most related to our own is that of
Amento et al. [2], who placed contact microphones on a
user’s wrist to assess finger movement. However, this work
was never formally evaluated, and is constrained to finger
motions in one hand. The Hambone system [6] employs a
similar setup, and through an HMM, yields classification
accuracies around 90% for four gestures (e.g., raise heels,
snap fingers). False positive rejection remains
untested in both systems at present. Moreover, both
techniques required the placement of sensors near the area
of interaction (e.g., the wrist), increasing the degree of invasiveness
and visibility.
Finally, bone conduction microphones and headphones –
now common consumer technologies – represent an additional
bio-sensing technology that is relevant to the present
work. These leverage the fact that sound frequencies relevant
to human speech propagate well through bone. Bone
conduction microphones are typically worn near the ear,
where they can sense vibrations propagating from the
mouth and larynx during speech. Bone conduction headphones
send sound through the bones of the skull and jaw
directly to the inner ear, bypassing transmission of sound
through the air and outer ear, leaving an unobstructed path
for environmental sounds.
Acoustic Input
Our approach is also inspired by systems that leverage
acoustic transmission through (non-body) input surfaces.
Paradiso et al. [21] measured the arrival time of a sound at
multiple sensors to locate hand taps on a glass window.
Ishii et al. [12] use a similar approach to localize a ball hitting
a table, for computer augmentation of a real-world
game. Both of these systems use acoustic time-of-flight for
localization, which we explored, but found to be insufficiently
robust on the human body, leading to the fingerprinting
approach described in this paper.
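The time-of-flight idea that proved too fragile on the body can be sketched in its simplest form: with two sensors at known positions on a rigid surface and a known propagation speed, the difference in arrival times pins down a tap between them. This is a minimal one-dimensional illustration, not the method of [21]; the sensor positions and the propagation speed are assumed values.

```python
# Minimal 1-D time-difference-of-arrival (TDOA) sketch.
# Two sensors sit at x = 0.0 m and x = 1.0 m on a surface; a tap
# at position x emits a pulse travelling at speed V (assumed value).

V = 500.0                       # propagation speed in m/s (illustrative)
SENSOR_A, SENSOR_B = 0.0, 1.0   # sensor positions in metres

def locate_tap(t_a, t_b, v=V):
    """Estimate tap position from the two arrival times.

    For a tap between the sensors,
        dt = t_a - t_b = (|x - A| - |x - B|) / v
    which rearranges to x = (A + B)/2 + v*dt/2.
    """
    dt = t_a - t_b
    return (SENSOR_A + SENSOR_B) / 2.0 + v * dt / 2.0

# A tap at x = 0.3 m: each arrival time is distance / speed.
tap = 0.3
t_a = abs(tap - SENSOR_A) / V   # 0.0006 s
t_b = abs(tap - SENSOR_B) / V   # 0.0014 s
print(round(locate_tap(t_a, t_b), 3))  # recovers 0.3
```

On a table or pane of glass the speed v is stable enough for this to work; on the body, dispersion and variable tissue paths make the arrival times unreliable, which is what motivates the fingerprinting approach instead.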
SKINPUT
To expand the range of sensing modalities for alwaysavailable
input systems, we introduce Skinput, a novel input
technique that allows the skin to be used as a finger input
surface. In our prototype system, we choose to focus on the
arm (although the technique could be applied elsewhere).
This is an attractive area to appropriate as it provides considerable
surface area for interaction, including a contiguous
and flat area for projection (discussed subsequently). Furthermore,
the forearm and hands contain a complex assemblage
of bones that increases acoustic distinctiveness of
different locations. To capture this acoustic information, we
developed a wearable armband that is non-invasive and
easily removable (Figures 1 and 5).
In this section, we discuss the mechanical phenomena that
enable Skinput, with a specific focus on the mechanical
properties of the arm. Then we will describe the Skinput
sensor and the processing techniques we use to segment,
analyze, and classify bio-acoustic signals.
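The segment–analyze–classify pipeline can be sketched abstractly: summarize a windowed tap signal as per-band spectral energies (a crude acoustic fingerprint) and match it against per-location templates. This is an illustrative sketch under assumed parameters, not the paper's actual pipeline; the naive DFT, band edges, and nearest-centroid matcher are placeholders for the real feature extraction and classifier.

```python
import math

def band_energies(window, n_bands=4):
    """Crude acoustic fingerprint: signal energy in n_bands frequency
    bands, via a naive DFT (a real system would use an FFT and
    carefully tuned band edges)."""
    n = len(window)
    half = n // 2
    mags = []
    for k in range(half):
        re = sum(window[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = -sum(window[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        mags.append(re * re + im * im)
    size = half // n_bands
    return [sum(mags[b * size:(b + 1) * size]) for b in range(n_bands)]

def classify(window, templates):
    """Match a tap's fingerprint against per-location templates
    (nearest centroid in band-energy space)."""
    feat = band_energies(window)
    def dist(template):
        return sum((a - b) ** 2 for a, b in zip(feat, template))
    return min(templates, key=lambda loc: dist(templates[loc]))

# Toy usage: two tap locations that (by assumption) ring at
# different frequencies; templates come from clean examples.
N = 64
tone = lambda k: [math.sin(2 * math.pi * k * i / N) for i in range(N)]
templates = {"wrist": band_energies(tone(4)), "palm": band_energies(tone(12))}
print(classify(tone(4), templates))   # "wrist"
```

The design point the sketch illustrates is that location is read off the *shape* of the spectrum rather than arrival timing, which is why the approach tolerates the body's inconsistent propagation speeds.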