Seminar Topics & Project Ideas On Computer Science Electronics Electrical Mechanical Engineering Civil MBA Medicine Nursing Science Physics Mathematics Chemistry ppt pdf doc presentation downloads and Abstract

Full Version: A Scalable System for Real-Time Acquisition
You're currently viewing a stripped down version of our content. View the full version with proper formatting.
A Scalable System for Real-Time Acquisition, Transmission, and
Autostereoscopic Display of Dynamic Scenes



Introduction

Humans gain three-dimensional information from a variety of cues.
Two of the most important ones are binocular parallax, scientifically
studied by Wheatstone in 1838, and motion parallax, described
by Helmholtz in 1866. Binocular parallax refers to seeing
a different image of the same object with each eye, whereas motion
parallax refers to seeing different images of an object when
moving the head. Wheatstone demonstrated the link between parallax
and depth perception using a stereoscope, the world’s first
three-dimensional display device [Okoshi 1976]. Ever
since, researchers have proposed and developed devices to stereoscopically
display images. These three-dimensional displays hold
tremendous potential for many applications in entertainment, information
presentation, reconnaissance, tele-presence, medicine, visualization,
remote manipulation, and art.


Previous Work and Background

The topic of 3D TV – with thousands of publications and patents –
incorporates knowledge from multiple disciplines, such as image-based
rendering, video coding, optics, stereoscopic displays, multi-projector
displays, computer vision, virtual reality, and psychology.
Some of the work may not be widely known across disciplines.
There are some good overview books on 3D TV [Okoshi 1976; Javidi
and Okano 2002]. In addition, we provide an extensive review
of the previous work.


Lightfield Systems

A lightfield represents radiance as a function of position and direction
in regions of space free of occluders [Levoy and Hanrahan
1996]. The ultimate goal, which Gavin Miller called the “hyper
display” [Miller 1995], is to capture a time-varying lightfield passing
through a surface and to emit the same (directional) lightfield
through another surface with minimal delay.
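
As a concrete illustration, here is a minimal sketch of a sampled
lightfield under the common two-plane (u, v, s, t) parameterization
[Levoy and Hanrahan 1996]; the array shape, the nearest-sample
lookup, and all names below are illustrative assumptions, not details
from the paper.

    import numpy as np

    # A lightfield under the two-plane parameterization: each ray is
    # indexed by where it crosses the camera plane (u, v) and the focal
    # plane (s, t). A sampled lightfield is then a 4D radiance array
    # (plus an RGB axis). All shapes here are illustrative.
    U, V, S, T = 8, 8, 256, 256        # e.g., an 8x8 camera grid
    lightfield = np.zeros((U, V, S, T, 3), dtype=np.float32)

    def radiance(u, v, s, t):
        """Nearest-sample lookup of L(u, v, s, t); real renderers
        interpolate between the nearest samples instead."""
        return lightfield[int(round(u)), int(round(v)),
                          int(round(s)), int(round(t))]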
Early work in image-based graphics and 3D displays has dealt with
static lightfields [Ives 1928; Levoy and Hanrahan 1996; Gortler
et al. 1996]. In 1929, H. E. Ives proposed a photographic multi-camera
recording method for large objects in conjunction with
the first projection-based 3D display [Ives 1929]. His proposal
bears some architectural similarities to our system, although modern
technology allows us to achieve real-time performance.


Multiview Video Compression and Transmission

Multiview video compression has mostly focused on static lightfields
(e.g., [Magnor et al. 2003; Ramanathan et al. 2003]). There
has been relatively little research on how to compress and transmit
multiview video of dynamic scenes in real time. A notable exception
is the work by Yang et al. [2002]. They achieve real-time display
from an 8×8 lightfield camera by transmitting only the rays
that are necessary for view interpolation. However, it is impossible
to anticipate all the viewpoints in a TV broadcast setting. We
transmit all acquired video streams and use a similar strategy on
the receiver side to route the videos to the appropriate projectors
for display (see Section 3.3).
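
A hedged sketch of this receiver-side routing idea: assume cameras
and projector viewpoints lie along a common baseline, and that each
consumer interpolates its view from the k nearest camera streams. The
positions, the value of k, and the names below are assumptions for
illustration, not the actual scheme the paper describes in its
Section 3.3.

    # Route decoded camera streams: each projector's consumer is sent
    # only the streams nearest its viewpoint. A 1D camera baseline and
    # k = 2 nearest streams are illustrative assumptions.
    camera_pos = [float(i) for i in range(16)]   # 16-camera linear array

    def streams_for_projector(view_pos, k=2):
        """Indices of the k camera streams closest to a viewpoint."""
        order = sorted(range(len(camera_pos)),
                       key=lambda i: abs(camera_pos[i] - view_pos))
        return sorted(order[:k])

    # Projector p sits halfway between cameras p and p + 1.
    routing = {p: streams_for_projector(p + 0.5) for p in range(15)}
    print(routing[7])                            # -> [7, 8]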


System Architecture

Figure 2 shows a schematic representation of our 3D TV system.
The acquisition stage consists of an array of hardware-synchronized
cameras. Small clusters of cameras are connected
to producer PCs. The producers capture live, uncompressed video
streams and encode them using standard MPEG coding. The compressed
video streams are then broadcast on separate channels over
a transmission network, which could be digital cable, satellite TV,
or the Internet. On the receiver side, individual video streams are
decompressed by decoders. The decoders are connected by network
(e.g., Gigabit Ethernet) to a cluster of consumer PCs. The
consumers render the appropriate views and send them to a standard
2D, stereo-pair 3D, or multiview 3D display. In our current
implementation, each consumer corresponds to a projector in the
display and needs to project a slightly different viewpoint.
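
The flow above can be condensed into a small, self-contained sketch;
the encode/decode/transport functions below are trivial stand-ins
(assumptions, not the system's actual API) meant only to show how
producers, broadcast channels, and consumers fit together.

    network = {}                         # channel -> latest encoded frame

    def mpeg_encode(frame):              # stand-in for a real MPEG encoder
        return frame

    def mpeg_decode(packet):             # stand-in for a real MPEG decoder
        return packet

    def producer(camera_frames, base_channel):
        """Producer PC: encode each camera stream in its cluster and
        broadcast each on its own channel."""
        for cam, frame in camera_frames.items():
            network[base_channel + cam] = mpeg_encode(frame)

    def consumer(needed_channels, viewpoint):
        """Consumer PC: decode the channels its projector needs and
        render that projector's slightly offset viewpoint."""
        frames = [mpeg_decode(network[ch]) for ch in needed_channels]
        return ("view@%.2f" % viewpoint, frames)  # stand-in for rendering

    # One frame time: two producers, each serving a two-camera cluster.
    producer({0: "f0", 1: "f1"}, base_channel=0)
    producer({0: "f2", 1: "f3"}, base_channel=2)
    print(consumer([1, 2], viewpoint=1.5))  # projector between cams 1 and 2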