Single-Pixel Imaging via Compressive Sampling
INTRODUCTION
Humans are visual animals, and imaging sensors that extend our reach – cameras – have
improved dramatically in recent times thanks to the introduction of CCD and CMOS digital
technology. Consumer digital cameras in the mega-pixel range are now ubiquitous thanks to the
happy coincidence that the semiconductor material of choice for large-scale electronics integration
(silicon) also happens to readily convert photons at visual wavelengths into electrons. In
contrast, imaging at wavelengths where silicon is blind is considerably more complicated, bulky,
and expensive. Thus, for comparable resolution, a $500 digital camera for the visible becomes a
$50,000 camera for the infrared.
In this paper, we present a new approach to building simpler, smaller, and cheaper digital
cameras that can operate efficiently across a much broader spectral range than conventional
silicon-based cameras. Our approach fuses a new camera architecture based on a digital micromirror
device (DMD – see Sidebar: Spatial Light Modulators) with the new mathematical theory
and algorithms of compressive sampling (CS – see Sidebar: Compressive Sampling in a Nutshell).
CS combines sampling and compression into a single nonadaptive linear measurement process
[1–4]. Rather than measuring pixel samples of the scene under view, we measure inner products
between the scene and a set of test functions. Interestingly, random test functions play a key role,
making each measurement a random sum of pixel values taken across the entire image. When
the scene under view is compressible by an algorithm like JPEG or JPEG2000, the CS theory
enables us to stably reconstruct an image of the scene from fewer measurements than the number
of reconstructed pixels. In this manner we achieve sub-Nyquist image acquisition.
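
As a concrete illustration (ours, not from the paper), the following Python/NumPy sketch
measures a synthetic K-sparse signal with M < N random ±1 test functions and recovers it with
iterative soft thresholding (ISTA), a simple l1 solver standing in for the reconstruction
algorithms of [1–4]; all sizes and parameter values are illustrative.

import numpy as np

rng = np.random.default_rng(1)
N, K, M = 512, 10, 128        # signal length, sparsity, measurements (M << N)

# A K-sparse test signal; the identity basis stands in for a JPEG-style
# sparsifying transform.
x = np.zeros(N)
x[rng.choice(N, size=K, replace=False)] = rng.standard_normal(K)

# Nonadaptive linear measurements: each y[m] is a random +/-1 weighted sum
# of values taken across the entire signal.
Phi = rng.choice([-1.0, 1.0], size=(M, N)) / np.sqrt(M)
y = Phi @ x

# Reconstruct by iterative soft thresholding (ISTA), a basic l1 solver.
L = np.linalg.norm(Phi, 2) ** 2    # Lipschitz constant of the gradient
lam = 0.01                         # sparsity penalty (illustrative)
x_hat = np.zeros(N)
for _ in range(500):
    x_hat -= (Phi.T @ (Phi @ x_hat - y)) / L                       # gradient step
    x_hat = np.sign(x_hat) * np.maximum(np.abs(x_hat) - lam / L, 0.0)  # shrinkage

print("relative recovery error:", np.linalg.norm(x_hat - x) / np.linalg.norm(x))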
The Single-Pixel Camera
Architecture
The single-pixel camera is an optical computer that sequentially measures the inner products
y[m] = ⟨x, φm⟩ between an N-pixel sampled version x of the incident light-field from the scene
under view and a set of two-dimensional (2D) test functions {φm} [5]. As shown in Fig. 1, the
light-field is focused by biconvex Lens 1 not onto a CCD or CMOS sampling array but rather
onto a DMD consisting of an array of N tiny mirrors (see Sidebar: Spatial Light Modulators).
Each mirror corresponds to a particular pixel in x and φm and can be independently oriented
either towards Lens 2 (corresponding to a 1 at that pixel in φm) or away from Lens 2 (corresponding
to a 0 at that pixel in φm). The reflected light is then collected by biconvex Lens 2 and
focused onto a single photon detector (the single pixel) that integrates the product x[n]φm[n] to
compute the measurement y[m] = ⟨x, φm⟩ as its output voltage. This voltage is then digitized
by an A/D converter. Values of φm between 0 and 1 can be obtained by dithering the mirrors
back and forth during the photodiode integration time. To obtain φm with both positive and
negative values (±1, for example), we estimate the mean light intensity (easily measured by
setting all mirrors to the full-on 1 position) and subtract it from each measurement.
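
The following NumPy sketch (ours, with a hypothetical scene x) simulates one such measurement:
a random 0/1 mirror pattern yields y[m] = ⟨x, φm⟩, and a ±1 pattern is emulated by the
mean-subtraction trick just described.

import numpy as np

rng = np.random.default_rng(2)
n = 16
N = n * n
x = rng.random(N)            # hypothetical N-pixel sampled light-field

# One test function: each mirror is flipped towards Lens 2 (1) or away (0).
phi = rng.integers(0, 2, size=N).astype(float)

# The photodetector integrates the light routed toward Lens 2, so its
# voltage is the inner product y[m] = <x, phi_m> with phi_m in {0, 1}.
y01 = phi @ x

# +/-1 patterns via mean subtraction: <x, 2*phi - 1> = 2*<x, phi> - <x, 1>,
# where <x, 1> is measured once with all mirrors in the full-on 1 position.
y_all_on = np.ones(N) @ x
y_pm1 = 2.0 * y01 - y_all_on

assert np.isclose(y_pm1, (2.0 * phi - 1.0) @ x)
print("0/1 measurement:", y01, "  +/-1 measurement:", y_pm1)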
Structured illumination configuration
In a reciprocal configuration to that in Fig. 1, we can illuminate the scene using a projector
displaying a sequence of random patterns {φm} and collect the reflected light using a single
lens and photodetector. Such a “structured illumination” setup has advantages in applications
where we can control the light source. In particular, there are intriguing possible combinations
of single-pixel imaging with techniques such as 3D imaging and dual photography [9].
Shutterless video imaging
We can also acquire video sequences using the single-pixel camera. Recall that a traditional
video camera opens a shutter periodically to capture a sequence of images (called video frames)
that are then compressed by an algorithm like MPEG that jointly exploits their spatio-temporal
redundancy. In contrast, the single-pixel video camera needs no shutter; we merely continuously
sequence through randomized test functions φm and then reconstruct a video sequence using an
optimization that exploits the video’s spatio-temporal redundancy [10].
If we view a video sequence as a 3D space/time cube, then the test functions φm lie
concentrated along a periodic sequence of 2D image slices through the cube. A naïve way to
reconstruct the video sequence would be to group the corresponding measurements y[m] into blocks
over which the video is quasi-stationary and then perform a 2D frame-by-frame reconstruction on
each block (sketched in code below). This exploits the compressibility of the 3D video cube in
the space direction but not the time direction.
A more powerful alternative exploits the fact that even though each φm is testing a different
2D image slice, the image slices are often related temporally through smooth object motions in
the video. Exploiting this 3D compressibility in both the space and time directions and inspired
by modern 3D video coding techniques [11], we can, for example, attempt to reconstruct the
sparsest video space/time cube in the 3D wavelet domain.
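
Here is a minimal sketch of the naive grouping strategy above, on hypothetical data: the
measurement stream is partitioned into quasi-stationary blocks and each frame is reconstructed
independently. For brevity, a minimum-l2-norm least-squares solve stands in for a proper sparse
solver, so the sketch illustrates only the bookkeeping, not CS-quality recovery.

import numpy as np

rng = np.random.default_rng(3)
n, N = 16, 256
F, M = 4, 64                  # frames, measurements per quasi-stationary block

# Hypothetical video cube: a blob drifting across the scene.
yy, xx = np.mgrid[0:n, 0:n]
frames = [np.exp(-((xx - 4 - 2 * f) ** 2 + (yy - 8) ** 2) / 8.0) for f in range(F)]

# The camera streams one measurement at a time while the scene evolves.
Phi = rng.choice([-1.0, 1.0], size=(F * M, N)) / np.sqrt(M)
y = np.array([Phi[f * M + m] @ frames[f].ravel()
              for f in range(F) for m in range(M)])

# Naive reconstruction: cut the stream into F blocks, solve each 2D frame alone.
recon = []
for f in range(F):
    Pf, yf = Phi[f * M:(f + 1) * M], y[f * M:(f + 1) * M]
    xf, *_ = np.linalg.lstsq(Pf, yf, rcond=None)   # min-norm stand-in solver
    recon.append(xf.reshape(n, n))

print("per-frame MSE:", [float(np.mean((r - g) ** 2)) for r, g in zip(recon, frames)])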
Single-Pixel Camera Tradeoffs
The single-pixel camera is a flexible architecture to implement a range of different multiplexing
methodologies, just one of them being CS. In this section, we analyze the performance of CS
and two other candidate multiplexing methodologies and compare them to the performance of a
brute-force array of N pixel sensors. Integral to our analysis is the consideration of Poisson photon
counting noise at the detector, which is image-dependent. We conduct two separate analyses to
assess the “bang for the buck” of CS. The first is a theoretical analysis that provides general
guidance; the second is an experimental study of how the systems typically perform in practice.
Experimental results
Since CS acquisition/reconstruction methods often perform much better in practice than the
above theoretical bounds suggest, in this section we conduct a simple experiment using real data
from the CS imaging testbed depicted in Fig. 1. Thanks to the programmability of the testbed,
we acquired raster scan (RS), basis scan (BS), and CS measurements from the same hardware. We
fixed the number of A/D converter bits across all methodologies. Figure 4 shows the pixel-wise
MSE for the capture of an N = 128 × 128 pixel “R” test image as a function of the total capture
time T. Here the
MSE combines both quantization and photon counting effects. For CS we took M = N/10 total
measurements per capture and used a Daubechies-4 wavelet basis for the sparse reconstruction.
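
The sketch below (ours, not the authors' code) mimics this experiment at a reduced 32 × 32
scale, assuming NumPy and PyWavelets are available: M = N/10 random ±1 measurements of a
synthetic test image (standing in for the “R” target) are reconstructed by iterative soft
thresholding over Daubechies-4 wavelet coefficients; the image, iteration count, and threshold
are illustrative.

import numpy as np
import pywt

rng = np.random.default_rng(0)
n = 32                        # scaled-down side length (the paper uses 128)
N = n * n
M = N // 10                   # M = N/10 measurements, as in the experiment

# Hypothetical test image: two smooth blobs (stands in for the "R" target).
yy, xx = np.mgrid[0:n, 0:n]
img = (np.exp(-((xx - 10) ** 2 + (yy - 12) ** 2) / 30.0)
       + 0.7 * np.exp(-((xx - 22) ** 2 + (yy - 20) ** 2) / 50.0))

Phi = rng.choice([-1.0, 1.0], size=(M, N)) / np.sqrt(M)   # random +/-1 patterns
y = Phi @ img.ravel()

# Daubechies-4; periodization keeps the transform orthonormal and shape-stable.
wavelet, mode = 'db4', 'periodization'
_, slices = pywt.coeffs_to_array(pywt.wavedec2(img, wavelet, mode=mode))

def synth(alpha):     # wavelet coefficient array -> image
    coeffs = pywt.array_to_coeffs(alpha, slices, output_format='wavedec2')
    return pywt.waverec2(coeffs, wavelet, mode=mode)

def analyze(x):       # image -> wavelet coefficient array
    return pywt.coeffs_to_array(pywt.wavedec2(x, wavelet, mode=mode))[0]

# ISTA on the wavelet coefficients: gradient step, then soft thresholding.
L = np.linalg.norm(Phi, 2) ** 2     # Lipschitz constant of the gradient
lam = 1e-3
alpha = np.zeros((n, n))
for _ in range(300):
    r = Phi @ synth(alpha).ravel() - y
    alpha -= analyze((Phi.T @ r).reshape(n, n)) / L
    alpha = np.sign(alpha) * np.maximum(np.abs(alpha) - lam / L, 0.0)

x_hat = synth(alpha)
print(f"pixel-wise MSE from M = {M} of N = {N} measurements:",
      float(np.mean((x_hat - img) ** 2)))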
Conclusions
For certain applications, CS promises to substantially increase the performance and capabilities
of data acquisition, processing, and fusion systems while lowering the cost and complexity
of deployment. A useful practical feature of the CS approach is that it off-loads processing from
data acquisition (which can be complicated and expensive) into data reconstruction or processing
(which can be performed on a digital computer, perhaps not even co-located with the sensor).
We have presented an overview of the theory and practice of a simple yet flexible single-pixel
architecture for CS based on a DMD spatial light modulator. While there are promising potential
applications where current digital cameras have difficulty imaging, there are clear tradeoffs and
challenges in the single-pixel design. Our current and planned work involves better understanding
and addressing these tradeoffs and challenges. Other potential avenues for research include
extending the single-pixel concept to wavelengths where the DMD fails as a modulator, such
as THz and X-rays.