A Scalable System for Real-Time Acquisition, Transmission, and
Autostereoscopic Display of Dynamic Scenes
Introduction
Humans gain three-dimensional information from a variety of cues.
Two of the most important ones are binocular parallax, scientifically
studied by Wheatstone in 1838, and motion parallax, described
by Helmholtz in 1866. Binocular parallax refers to seeing
a different image of the same object with each eye, whereas motion
parallax refers to seeing different images of an object when
moving the head. Wheatstone was able to scientifically prove the
link between parallax and depth perception using a stereoscope – the
world’s first three-dimensional display device [Okoshi 1976]. Ever
since, researchers have proposed and developed devices to stereoscopically
display images. These three-dimensional displays hold
tremendous potential for many applications in entertainment, information
presentation, reconnaissance, tele-presence, medicine, visualization,
remote manipulation, and art.
Previous Work and Background
The topic of 3D TV – with thousands of publications and patents –
incorporates knowledge from multiple disciplines, such as image-based
rendering, video coding, optics, stereoscopic displays, multi-projector
displays, computer vision, virtual reality, and psychology.
Some of the work may not be widely known across disciplines.
There are some good overview books on 3D TV [Okoshi 1976; Javidi
and Okano 2002]. In addition, we provide an extensive review
of the previous work.
Lightfield Systems
A lightfield represents radiance as a function of position and direction
in regions of space free of occluders [Levoy and Hanrahan
1996]. The ultimate goal, which Gavin Miller called the “hyper
display” [Miller 1995], is to capture a time-varying lightfield passing
through a surface and to emit the same (directional) lightfield
through another surface with minimal delay.
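To make the lightfield abstraction concrete, the two-plane parameterization of Levoy and Hanrahan can be sketched as a 4D array of radiance samples indexed by where a ray crosses two parallel planes. The array shape, variable names, and nearest-neighbor lookup below are illustrative assumptions for this sketch, not details taken from the paper (real systems interpolate, typically quadrilinearly):

```python
import numpy as np

# Hypothetical discretized lightfield: radiance sampled where rays cross
# two parallel planes, (u, v) on the camera plane and (s, t) on the focal
# plane. The resolution here is illustrative only.
U, V, S, T = 8, 8, 16, 16
lightfield = np.random.rand(U, V, S, T)  # scalar radiance per ray

def sample_ray(lf, u, v, s, t):
    """Nearest-neighbor lookup of the radiance along the ray through
    (u, v) on the camera plane and (s, t) on the focal plane."""
    ui, vi = int(round(u)), int(round(v))
    si, ti = int(round(s)), int(round(t))
    return lf[ui, vi, si, ti]

# A novel view is synthesized by gathering one such ray per output pixel.
radiance = sample_ray(lightfield, 3.2, 4.7, 10.1, 2.9)
```

A time-varying lightfield, as in the "hyper display" goal, adds a frame index to this structure, which is what makes the bandwidth and routing questions discussed below acute.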
Early work in image-based graphics and 3D displays has dealt with
static lightfields [Ives 1928; Levoy and Hanrahan 1996; Gortler
et al. 1996]. In 1929, H. E. Ives proposed a photographic multi-camera
recording method for large objects in conjunction with
the first projection-based 3D display [Ives 1929]. His proposal
bears some architectural similarities to our system, although modern
technology allows us to achieve real-time performance.
Multiview Video Compression and Transmission
Multiview video compression has mostly focused on static lightfields
(e.g., [Magnor et al. 2003; Ramanathan et al. 2003]). There
has been relatively little research on how to compress and transmit
multiview video of dynamic scenes in real-time. A notable exception
is the work by Yang et al. [2002]. They achieve real-time display
from an 8×8 lightfield camera by transmitting only the rays
that are necessary for view interpolation. However, it is impossible
to anticipate all the viewpoints in a TV broadcast setting. We
transmit all acquired video streams and use a similar strategy on
the receiver side to route the videos to the appropriate projectors
for display (see Section 3.3).
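The receiver-side strategy of routing acquired streams to the appropriate projectors can be sketched as a nearest-neighbor assignment. The positions, counts, and the specific nearest-camera criterion below are illustrative assumptions for this sketch; the actual routing in the system is described in Section 3.3:

```python
# Hypothetical sketch of receiver-side routing: feed each projector the
# camera stream whose acquisition position is closest to the viewpoint
# that projector must emit. Positions are 1D and illustrative only.
def route_streams(camera_positions, projector_positions):
    """Return a mapping from projector index to nearest camera index."""
    routing = {}
    for p_idx, p_pos in enumerate(projector_positions):
        nearest = min(range(len(camera_positions)),
                      key=lambda c: abs(camera_positions[c] - p_pos))
        routing[p_idx] = nearest
    return routing

# 16 cameras and 16 projectors on a line, slightly offset from each other.
cams = [i * 1.0 for i in range(16)]
projs = [i * 1.0 + 0.3 for i in range(16)]
print(route_streams(cams, projs))
```

Because all streams are transmitted, this assignment can be computed independently at each receiver, which is what sidesteps the broadcast-setting problem of anticipating viewpoints at the sender.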