25-10-2012, 03:57 PM
Vision Processing for Realtime 3-D Data Acquisition Based on Coded Structured Light
Vision Processing for Realtime.pdf (Size: 1.01 MB / Downloads: 19)
Abstract
Structured light vision systems have been successfully
used for accurate measurement of 3-D surfaces in computer vision.
However, their applications are mainly limited to scanning
stationary objects so far since tens of images have to be captured for
recovering one 3-D scene. This paper presents an idea for real-time
acquisition of 3-D surface data by a specially coded vision system.
To achieve 3-D measurement for a dynamic scene, the data acquisition
must be performed with only a single image. A principle of
uniquely color-encoded pattern projection is proposed to design a
color matrix for improving the reconstruction efficiency. The matrix
is produced by a special code sequence and a number of state
transitions. A color projector is controlled by a computer to generate
the desired color patterns in the scene. The unique indexing
of the light codes is crucial here for color projection since it is essential
that each light grid be uniquely identified by incorporating
local neighborhoods so that 3-D reconstruction can be performed
with only local analysis of a single image. A scheme is presented to
describe such a vision processing method for fast 3-D data acquisition.
Practical experimental performance is provided to analyze
the efficiency of the proposed methods.
INTRODUCTION
Motivation
COMPUTER vision has become a very important means
of obtaining the 3-D model of an object. A number of 3-D
sensing methods have been explored by researchers in the past
30 years [1]–[7]. Structured light has progressed from
single light-spot projection to complex coded patterns, and, consequently,
3-D scanning has sped up from several
hours per image to dozens of images per second [4], [8], [9].
The first stage of feasible structured light systems came in
the early 1980s, when binary coding or Gray coding methods were
employed. Fig. 1 illustrates a typical set of light patterns by
Inokuchi et al. [10]. This kind of pattern can achieve high accuracy
in the measurements [11]–[16]. The reason is that the pattern
resolution increases exponentially across the coarse-to-fine
light projections, so the stripe gap tends to 0, yet the stripe
locations remain easily distinguishable since only a small
set of primitives is used; therefore, the position of a pixel
can be encoded precisely. The method also has the advantage of easy
implementation and is thus still the most widely
used in structured light systems. Its main drawback is that it
cannot be applied to moving surfaces since multiple patterns
must be projected. To obtain a better resolution, a technique
based on the combination of Gray code and phase shifting
is often used [11]. Its drawback is that a larger number of projection
patterns (i.e., images) is required.
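To make the multi-image cost concrete, here is a minimal sketch of Gray-coded stripe projection. It assumes one-row patterns and a projector with 2^n columns; the function names are hypothetical and are not taken from the paper or from [10]. Each column needs n projected patterns before it can be decoded, which is why the scheme cannot follow a moving surface.

```python
def gray_code(index: int) -> int:
    """Binary-reflected Gray code of a non-negative integer."""
    return index ^ (index >> 1)


def stripe_patterns(num_bits: int) -> list[list[int]]:
    """Return num_bits one-row patterns, coarse to fine (1 = lit, 0 = dark).

    Pattern k holds bit (num_bits - 1 - k) of each column's Gray code.
    """
    width = 1 << num_bits
    return [
        [(gray_code(col) >> (num_bits - 1 - k)) & 1 for col in range(width)]
        for k in range(num_bits)
    ]


def decode_column(bits: list[int]) -> int:
    """Recover the column index from the bits seen across all patterns."""
    gray = 0
    for bit in bits:  # most significant (coarsest) pattern first
        gray = (gray << 1) | bit
    col = 0
    while gray:  # Gray-to-binary: XOR of all right shifts
        col ^= gray
        gray >>= 1
    return col
```

With `num_bits = 10`, ten images resolve 1024 columns, so the resolution grows exponentially with the number of projections while each pattern still uses only two primitives.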
COLOR CODIFICATION
Color-Coded Structured Light System
The structured light system in this work consists of a CCD
camera and a digital projector (Fig. 5). This is similar to a
traditional stereo vision system, but with the second camera replaced
by a light source that projects a known pattern of
light onto the scene. The single remaining camera captures the illuminated
scene. The required 3-D information can be obtained by
analyzing the deformation of the imaged pattern with respect
to the projected one. Here, the correspondences between the
projected pattern and the imaged one can be solved directly
by codifying the projected pattern, so that each projected light
point carries some information. When the point is imaged on
the image plane, this information can be used to determine its
coordinates on the projected pattern.
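The lookup from an imaged code back to projector coordinates can be sketched as follows. This is only an illustration under stated assumptions: colors are represented as small integers, a "word" is taken to be a square window of grid points (the excerpt does not specify the word shape), and the names `build_word_index` and `locate` are hypothetical.

```python
Word = tuple[int, ...]


def build_word_index(pattern: list[list[int]], size: int = 2) -> dict[Word, tuple[int, int]]:
    """Map every size x size window of the pattern to its top-left position.

    Raises ValueError if two windows are identical, since the decoding
    relies on every word being unique in the pattern matrix.
    """
    index: dict[Word, tuple[int, int]] = {}
    rows, cols = len(pattern), len(pattern[0])
    for r in range(rows - size + 1):
        for c in range(cols - size + 1):
            word = tuple(
                pattern[r + dr][c + dc] for dr in range(size) for dc in range(size)
            )
            if word in index:
                raise ValueError(f"word {word} repeats at {index[word]} and {(r, c)}")
            index[word] = (r, c)
    return index


def locate(index: dict[Word, tuple[int, int]], window: list[list[int]]):
    """Return the projector grid position of an observed window, or None."""
    return index.get(tuple(v for row in window for v in row))


# Toy pattern (assumed for illustration): all four 2 x 2 windows differ.
PATTERN = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 1, 2],
]
```

Because each window occurs once, observing it anywhere in the camera image identifies its projector position without reference to the rest of the image.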
Flood Search for Word Identification
With the known grid size and an initial seed word, it is easy to
find all adjacent words by a flood search algorithm [27], [28]. It
first searches several grid points around the seed word,
and then searches more grid points near the known area. Each
point to be added to the known partial net has to satisfy three
conditions: its color, size, and regularity.
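The search order described above can be sketched as a breadth-first flood from the seed. This is a minimal sketch, not the algorithm of [27], [28]: it assumes 4-connected grid coordinates, and the caller-supplied predicate `is_valid` (a hypothetical name) stands in for the color, size, and regularity checks.

```python
from collections import deque
from typing import Callable

GridPoint = tuple[int, int]
NEIGHBOR_STEPS = ((-1, 0), (1, 0), (0, -1), (0, 1))  # assumed 4-connectivity


def flood_search(
    seed: GridPoint,
    is_valid: Callable[[GridPoint], bool],
    max_points: int = 10_000,
) -> set[GridPoint]:
    """Grow the known net outward from the seed, testing each point once."""
    accepted = {seed}
    seen = {seed}  # includes rejected points, so nothing is tested twice
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in NEIGHBOR_STEPS:
            nxt = (r + dr, c + dc)
            if nxt in seen:
                continue
            seen.add(nxt)
            if is_valid(nxt):
                accepted.add(nxt)
                if len(accepted) >= max_points:
                    return accepted  # safety cap for unbounded predicates
                queue.append(nxt)
    return accepted
```

Points are visited in order of distance from the seed, so the net expands from the known area outward and stops at points that fail the checks.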
The color measured in the image often deviates from the ideal
one because of distortion in the vision system and scene
reflection. Besides the color calibration strategies to be discussed
later, we can determine it by a color likelihood function. The
image pixel is compared with all seven ideal colors in the
coding set. If the desired code color corresponds to one of the
three largest likelihood values, the grid point is accepted into the
net.
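The acceptance rule can be illustrated with a short sketch. Two things here are assumptions, since the excerpt gives neither: the seven ideal colors are taken to be the non-black corners of the RGB cube, and the likelihood is modeled as a Gaussian of the Euclidean RGB distance with an arbitrary `sigma`. Only the "among the three largest" test comes from the text.

```python
import math

# Assumed coding set: the seven non-black corners of the RGB cube.
IDEAL_COLORS = {
    "red": (255, 0, 0),
    "green": (0, 255, 0),
    "blue": (0, 0, 255),
    "yellow": (255, 255, 0),
    "magenta": (255, 0, 255),
    "cyan": (0, 255, 255),
    "white": (255, 255, 255),
}


def color_likelihoods(pixel, sigma: float = 60.0) -> dict[str, float]:
    """Score the pixel against every ideal color (assumed Gaussian model)."""
    scores = {}
    for name, ideal in IDEAL_COLORS.items():
        dist_sq = sum((p - q) ** 2 for p, q in zip(pixel, ideal))
        scores[name] = math.exp(-dist_sq / (2.0 * sigma**2))
    return scores


def accept_color(pixel, expected: str, top_k: int = 3) -> bool:
    """Accept if the expected code color is among the top_k likelihoods."""
    scores = color_likelihoods(pixel)
    ranked = sorted(scores, key=scores.get, reverse=True)
    return expected in ranked[:top_k]
```

For a distorted red such as `(230, 40, 30)`, the three most likely colors under this model are red, yellow, and magenta, so a grid point expected to be yellow would still be accepted while one expected to be cyan would be rejected. Tolerating the top three trades some strictness for robustness to color distortion.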
Since it is a “one-pass” method, i.e., the pixels in each
small local area are processed only once, the image processing can be performed
very fast, which is promising for real-time applications. The speed
is evaluated in the next section on performance
analysis and also in the experiment section.
CONCLUSION
Real-time, low-cost, reliable, and accurate 3-D data acquisition
is a dream for us in the vision community. While the available
technology is still not able to deliver all these features together,
this paper makes significant progress toward that goal. An
idea was presented and implemented for generating a specially
color-coded light pattern, which combines the advantages of
both fast 3-D vision processing from a single image and reliability
and accuracy from the principle of structured light systems.
With a given set of color primitives, the generated patterns
are guaranteed to form a large matrix of the desired shape under the
restriction that each word in the pattern matrix must be unique.
By using such a light pattern, correspondence is solved within a
single image, and, therefore, the method can be used in a dynamic environment
for real-time applications. Furthermore, the method does
not place a limit on the smoothness of object surfaces since it only
requires analyzing a small part of the scene and identifies the
coordinates by local image processing, which greatly improves
the 3-D reconstruction efficiency. Theoretical analysis and experimental
results show that acquisition of a 3-D surface with
mid-level resolution takes about 100 ms, which is adequate for
many practical applications. Software and hardware optimizations
may be applied to further improve the speed to above 30 fps. A
parallel processing scheme would further increase the efficiency
several times.