01-06-2012, 11:16 AM
Outdoors Augmented Reality on Mobile Phone using Loxel-Based Visual Feature Organization
Outdoors Augmented Reality on Mobile Phone using Loxel.pdf (Size: 563.29 KB / Downloads: 57)
INTRODUCTION
HIGH-END mobile phones have developed into capable
computational devices equipped with high-quality
color displays, high-resolution digital cameras, and real-time
hardware-accelerated 3D graphics. They can exchange information
over broadband data connections, and sense location
using GPS. This enables a new class of augmented reality
applications which use the phone camera to initiate search
queries about objects in visual proximity to the user. Pointing
with a camera provides a natural way of indicating one’s interest
and browsing information available at a particular location.
Once the system recognizes the user's target, it can augment the
viewfinder with graphics and hyperlinks that provide further
information (the menu or customer ratings of a restaurant)
or services (reserve a table and invite friends for dinner).
This paper presents the technologies we have developed for
supporting these kinds of point-and-find applications.
Prior Work in Mobile Augmented Reality
A recent demonstration of an outdoor mobile augmented
reality application running on a cell phone is Nokia's MARA
project by Kähäri and Murphy [15]. The system does not
perform any image analysis; instead, it uses an external GPS
for localization and an inertial sensor to provide orientation.
PhoneGuide [?] is one of the first object recognition systems
performing the computation on a mobile phone, instead of
sending the images to a remote server. The system employs a
neural network trained to recognize normalized color features
and is used as a museum guide. Seifer et al. [17] use a mobile
system based on a hand-held device, GPS sensor, and a camera
for roadside sign detection and inventory. Their algorithm
is efficient enough to produce good-quality detection results
in a mobile setting.
Contributions and Overview
In our own work, we have developed an outdoors augmented
reality system for matching images taken with a GPS-equipped
camera-phone against a database of location-tagged images.
The system then provides the user with links or services associated
with the recognized object. If no match is found, the user
has the option of associating the query image with a label
from a list of nearby points of interest and submitting the
data to the server. The system is fully implemented on the
mobile device, and runs at close to real-time while maintaining
excellent recognition performance.
Image Retrieval Pipeline
The algorithm assumes the presence of a database of labeled
images. The goal is to assign labels to a test image by matching
it with images in the database. Our algorithm starts by extracting
SURF features from all labeled images and inserting them
into a shared kd-tree that allows for fast approximate nearest
neighbor (ANN) queries in high-dimensional spaces [24].
Next, the algorithm extracts features from the test image and
matches them against the features in the ANN data structure
using the multiple ratio test described below. The database
images are then ranked by the number of their features that
match query features.
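The retrieval pipeline above can be sketched as follows. This is a minimal illustration, not the paper's implementation: random vectors stand in for 64-D SURF descriptors, scipy's kd-tree stands in for the paper's ANN structure [24], and a single 0.8 distance-ratio threshold stands in for the multiple ratio test. All names and parameters here are assumptions for the sketch.

```python
import numpy as np
from scipy.spatial import cKDTree
from collections import Counter

rng = np.random.default_rng(0)

# Stand-in for 64-D SURF descriptors extracted from all labeled
# database images (real feature extraction omitted in this sketch).
db_descriptors = rng.normal(size=(500, 64))
db_image_id = rng.integers(0, 10, size=500)  # source image of each feature

# Shared kd-tree over the pooled database features.
tree = cKDTree(db_descriptors)

# Stand-in for features extracted from the query (test) image.
query = rng.normal(size=(40, 64))

votes = Counter()
for f in query:
    # Approximate nearest-neighbor query; eps > 0 trades a little
    # accuracy for a large speed-up in high dimensions.
    dist, idx = tree.query(f, k=2, eps=0.5)
    # Simple ratio test: keep the match only if the best neighbor is
    # clearly closer than the runner-up.
    if dist[0] < 0.8 * dist[1]:
        votes[int(db_image_id[idx[0]])] += 1

# Database images ranked by the number of matched query features.
ranking = votes.most_common()
```

With real descriptors, the ranking's top entry would be the candidate label for the query image; ties and empty rankings correspond to the "no match found" case described above.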
Location Grid
A simple but effective method of reducing the size of the
feature set is to consider only features that originate from
objects directly visible from the current location. We estimate
the visibility by laying down a uniform grid over a geographic
area of interest. We refer to a grid cell as a location cell,
or simply loxel, and we will refer to the area visible from
a given loxel as a visibility kernel, or simply kernel. A
kernel will typically span multiple adjacent loxels as shown in
Fig. 5. Each loxel L(i,j) has an associated kernel K(i,j). Note that
while the loxels cover disjoint areas, the kernel areas overlap.
This affects the way we represent and transmit data to avoid
redundancy.
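A toy version of the loxel grid can make the loxel/kernel distinction concrete. The cell size, the local metric coordinate frame, and the fixed 3x3 kernel below are all assumptions for illustration; in the paper the kernel is the actual visibility region, which need not be a square block.

```python
# Hypothetical loxel side length in meters (the paper's cell size may differ).
CELL_M = 100.0

def loxel_of(x_m, y_m):
    """Map a position (meters east/north of the grid origin) to a loxel (i, j)."""
    return (int(x_m // CELL_M), int(y_m // CELL_M))

def kernel_of(loxel, radius=1):
    """Approximate the visibility kernel of a loxel as the (2r+1) x (2r+1)
    block of cells around it; a real kernel depends on what is visible."""
    i, j = loxel
    return {(i + di, j + dj)
            for di in range(-radius, radius + 1)
            for dj in range(-radius, radius + 1)}

lox = loxel_of(250.0, 120.0)  # -> loxel (2, 1)
ker = kernel_of(lox)          # 9 cells spanning the loxel and its neighbors
```

Note how the kernels of adjacent loxels share most of their cells (here, 6 of 9) even though the loxels themselves are disjoint; this overlap is exactly the redundancy the data representation and transmission scheme must avoid.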
MOBILE PHONE IMPLEMENTATION
In this section, we present our implementation of the SURF
algorithm and its adaptation to the mobile phone. Next, we
discuss the impact that accuracy has on the speed of the
nearest-neighbor search and show that we can achieve an
order of magnitude speed-up with minimal impact on matching
accuracy. Finally, we discuss the details of the phone implementation
of the image matching pipeline. We study the
performance, memory use, and bandwidth consumption on the
phone.