29-09-2016, 10:52 AM
1456734846-OnuralNAB06Assessment.pdf
ABSTRACT
Stereoscopic 3D viewing techniques are almost as
old as their 2D counterparts: experimental
stereoscopic 3DTV immediately followed the
invention of TV. Holography is a newer
technology compared to stereoscopy, and there are
indicators that satisfactory holographic 3DTV may
be feasible. Another candidate technology for
3DTV is integral imaging. Holography and integral
imaging provide true full parallax 3D displays in
the ideal case. All these technologies have their
own distinct features, advantages and problems.
Interest in all forms of 3DTV has been
significantly increasing both in research and
commercial communities. An integrated 3DTV
system naturally has different components:
capturing of 3D moving scenes, their
representation, compression and transport, and
finally display are the main building blocks.
Naturally, the consumer attitude and the related
social issues will be rather centred around the
display and interaction. 3D scenes can be captured
by various means, for example, by using many
cameras simultaneously. Furthermore, it is
desirable to serve all types of 3D displays with
different capabilities. Therefore, it is envisaged
that scene capture and display operation will be
decoupled in future 3DTV systems: captured scene
information will be converted to abstract
representations (and maybe stored) using computer
graphics techniques, and the display (and observer)
will interact with this intermediate data. It is
natural to extend conventional video compression
techniques to 3D video signals by exploiting the
inherent redundancies. Coding of 3D video signals
is attracting research interest and related
standardization activities are ongoing in bodies like
ISO-MPEG. Digital transmission using adapted
streaming techniques is another research area.
Autostereoscopic, holographic and volumetric
displays have been demonstrated and used. Signal
processing techniques are employed to find the
technology-dependent display driver signals to get
the 3D images from abstract 3D scenes.
INTRODUCTION AND HISTORICAL OVERVIEW
The ultimate goal of the viewing experience is to
create the illusion of a real environment in its
absence. If this goal is fully achieved, there is no
way for an observer to distinguish whether what he
or she sees is real or an optical illusion.
Due to its ease and technological feasibility in
earlier times, 2D representations of images have
been with us since the beginning of history in the
form of paintings and drawings. Capturing the
sense of depth in simple 2D drawings has been a
challenge; art historians are well aware of
developments in art which led to drawing
techniques based on perspective.
Photography was publicly introduced in 1839 by
Sir John Herschel; however, the underlying optical
process was known about three centuries before
that date [1]. Since then, 2D still imaging has been
improved to near its limits giving us the beautiful
pictures we see around us. Animation of 2D
pictures to achieve movies was documented in
1867 in a US patent about a device called
“zoopraxiscope” invented by William Lincoln [2].
Observing images from remote places was
accomplished by the invention of television as
early as the 1920s (Edouard Belin and John Logie
Baird) [3]. The high quality viewing experience in
today's digital TV sets and movie houses is a
natural consequence of continuing scientific and
technological developments, improvements and
inventions in this field. Of course, the driving force
behind all this development is the never ending
consumer demand for a better viewing experience,
the curiosity and the talents of those who provide
the technological basis, and the entrepreneurial
skills and attempts to satisfy such demands.
As soon as photography and its motion picture
equivalent were invented, stereoscopic 3D
immediately followed. Indeed, it is known that a
mirror device was used in 1838 to deliver
stereoscopic 3D images by Sir Charles Wheatstone
[4]. By 1844, stereoscopic viewing was popular
both in Europe and in the USA. Similarly, the
stereoscopic 3D counterparts of motion picture and
television quickly became a reality after the
invention of their 2D counterparts: the concept of
stereoscopic cinema appeared in the early 1900s,
and stereoscopic TV was proposed in the 1920s.
By 1950, 3D movies had become quite popular. 3D
movie theaters spread all over the world with their
high resolution format, and gave audiences a
highly satisfactory stereoscopic 3D experience.
Although experimental 3DTV broadcast dates back
to as early as 1953, the first commercial 3DTV
broadcast took place in 1980 in the USA [5].
An overview of 3D exhibitions in different parts of
the world between 1985 and 1996, and of the
technologies presented at those exhibitions, is
given in [8].
Stereoscopy is relatively simple; its fundamental
operational principle is based on the human visual
system and perception. However, most
stereoscopic systems create mismatches between
various 3D cues in human perception, and thus
create discomfort while viewing. Indeed, most of
the current research in stereoscopic 3D is aimed
at overcoming such problems [4,5].
Even though stereoscopy has been known for a
long time, there are other 3D imaging techniques,
and some of them have also been known for a long time.
Based on scientific developments in optics and
diffraction theory dating back to the early 1600s,
the principles of holography were established in
1948 [6]. The first off-axis holograms were
created in the early 1960s when lasers became
available. Digital holography techniques followed,
and eventually holographic cameras appeared.
Experimental holographic motion pictures
appeared for the first time in 1989 [6]. Recent
developments in this field strongly hint at
successful holographic 3DTV displays being
produced in the near future.
Another technology for 3D imaging is usually
referred to as “integral imaging” and has been
known since 1908 [6,7]. The basics of integral
imaging can be
explained as capturing many 2D pictures of an
object simultaneously from different angles, and
then optically projecting the pictures back to the
geometric location of the object, in its absence, to
create the 3D image. Lenslet arrays (micro-lens
arrays) are generally used in both capture and
reconstruction. Extension of the technique for
motion pictures and TV is possible. Integral
imaging is a strong candidate for the next
generation of 3DTV.
Holography and integral imaging provide true full
parallax 3D displays. Unlike stereoscopy, their
principles are not based primarily on human visual
perception, but on the principle of duplicating the
physical light distribution in the viewing space in
the absence of the original objects. The quality of
the generated 3D image is, therefore, based on the
success in duplicating the physical properties of the
original light. Scientific and technological
developments in both fields have been significant
and the quality of displays has been improving.
Similarly, the problems in stereoscopy are being
solved with the advances in autostereoscopic multiuser
systems. The ultimate goal is to provide the
viewer with the freedom to move and change his or
her viewing direction while interacting with the 3D
image and virtual environment, together with a
perception of the vivid colors and sharpness that
we experience in real life. The association of still
3D imagery with 3D motion pictures and 3DTV
is similar to the 2D case: if the visual information
can be updated fast enough, motion will be
observed, and if that data can be electronically
transmitted, we get 3DTV. The difference lies in the
details of the technology used to capture, represent,
transmit, and display such pictures.
As seen from the brief historical overview above,
3D imaging technologies have been known and
utilized for a long time; indeed, it would not be
grossly wrong to state that 2D and 3D
technologies have been developed in parallel. Yet,
it is a simple observation that the popularity of 2D
products in any form surpasses their 3D
counterparts by far. The reasons for this
imbalance, and the basic underlying consumer
attitude and preferences should be well understood
to overcome this unfavorable situation for 3D.
History is full of unsuccessful entrepreneurial
attempts in the form of business failures in 3D
imaging. As the reasons for such failures are
understood by analyzing consumer behavior,
and as technology provides solutions to the
problem areas, there is no doubt that 3D
viewing will be the choice of the future. Such a
future will provide a completely new experience.
Any associated social and psychological impact
remains unknown at this time.
However, it is certain that interest in 3D
imaging, both in society and in the research
community, is increasing significantly. An indicator
is the volume of scientific papers, news articles,
and patents in these fields.
A collection of historical pictures in 3D
stereoscopic imaging (both still and motion) is
presented in [14].
A stereoscopic 3D-HDTV system was reported in
1999 in the NHK-STRL annual report [9,16]. The
report mentions the regular problems associated
with stereoscopy, and describes subjective
evaluation tests targeted at overcoming such
problems. It is claimed that two factors, “sensation
of reality” and “ease of viewing”, are extracted
from such tests. With improvements in these
factors, this study concluded that 3D images were
better than 2D in terms of sensation-of-reality, but
scores for ease-of-viewing varied depending on the
image content. A discussion of future 3DTV
systems is presented including autostereoscopic,
holographic and integral-imaging-based systems.
Capturing techniques for 3D scenery are also
covered, and a description of a 3D camera (1998),
based on infrared sensors to detect depth, is
presented. Some test results on visual and
psychological effects associated with wide-screen
display systems are also given.
Another Korean broadcasting experiment in 3D-HDTV
was the broadcast of the 2002 FIFA World Cup
within the activities of the 3D-HDTV project [13]. The
project spanned human visual fatigue studies,
stereoscopic cameras, video multiplexer-demultiplexer,
receiver, coding, related image
processing techniques based on MPEG-2 and
MPEG-4, etc. Different stereoscopic cameras were
tested. The activities involved 10 demo rooms with
50 seats and a 300 in. screen; it is claimed that the
demo rooms were visited by about 571,000 visitors
during the events. The stereoscopic viewing was
via polarizing glasses.
Perceptual evaluation of 3DTV displays and
system requirements based on such evaluations are
presented in [10]. The focus is only on stereoscopic
displays, and autostereoscopic systems with
multiple viewers. Rather immature holographic or
integral-imaging-based displays are omitted. On
the capture side, dual camera (stereoscopic), single
camera assisted with a depth camera, and a single
camera with 2D-to-3D conversion are considered.
Captured data is compressed and delivered to the
displays. Evaluation paradigms are discussed, and
in particular, the applicability of 2D video assessment
techniques to 3D is questioned. It is concluded
that the 3D experience is quite different, and therefore
must be assessed based on criteria better suited to
3D. Six major viewing artifacts for the stereoscopic
case are listed and discussed.
ATTEST was a project on 3DTV funded by the EC
between 2002 and 2004. A full 3DTV processing
chain has been realized and demonstrated in the
European ATTEST project [17]. The result is a
backward compatible (to classical DVB) approach
for 3DTV. In this context compression of depth
data has also been investigated. It has been found
that depth data can be very efficiently compressed
using standard video codecs such as H.264/AVC
[18]. From a standards point of view, the realization
of the ATTEST concept for 3DTV only requires
minor additions on the Systems level of MPEG-4.
These are currently under investigation and may
provide an interoperable solution for 3DTV
broadcast in the very near future. This concept for
depth-based 3D rendering is easily extended to N
views, as shown in [19]. Depending on the user
position, a simple switch to the nearest original
view with depth (or pair of views with
disparity/depth) is possible. This extends the
navigation range in front of the screen with the
number of cameras used. For some application
scenarios such as 3DTV broadcast this implies
compression and transmission of multi-view video,
which is an ongoing work item in MPEG
standardization activities.
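The view-switching idea described above can be sketched in a few lines of
Python. This is only an illustrative sketch, not the method of [19]: the
`View` class, the one-dimensional `camera_x` position, and the function
names are all hypothetical, and a real system would work with full camera
poses and per-view depth maps.

```python
from dataclasses import dataclass


@dataclass
class View:
    """One captured view; camera_x is a hypothetical 1D camera position."""
    camera_x: float
    name: str


def nearest_view(views, viewer_x):
    """Return the captured view whose camera position is closest to the viewer."""
    if not views:
        raise ValueError("at least one view is required")
    return min(views, key=lambda v: abs(v.camera_x - viewer_x))


def nearest_pair(views, viewer_x):
    """Return the two adjacent views that bracket the viewer.

    If the viewer is outside the camera range, the outermost pair is returned.
    """
    ordered = sorted(views, key=lambda v: v.camera_x)
    if len(ordered) < 2:
        raise ValueError("at least two views are required")
    for left, right in zip(ordered, ordered[1:]):
        if viewer_x <= right.camera_x:
            return left, right
    return ordered[-2], ordered[-1]


views = [View(0.0, "cam0"), View(1.0, "cam1"), View(2.0, "cam2")]
single = nearest_view(views, 0.4)   # closest single view with depth
pair = nearest_pair(views, 1.5)     # closest pair of views
```

Adding cameras widens the range of viewer positions for which a nearby
original view exists, which is the sense in which the navigation range
grows with the number of cameras.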
Newer generations of 3DTV techniques aim
to decouple the image capture and image display
components further: in such systems, the captured
scene, by some means, is first converted to an
abstract 3D moving scene using aids such as
wire-mesh models and other representation
techniques. The 3D scene is then rendered at the
display side depending on the display technique
employed. Based on human perception and
physical properties and the technology of the
display, there are many different ways of rendering
the captured 3D information. One such system, based on
scanning different depth slices of a 3D scene by
holographic means of reproducing each slice in a
time-sequential fashion, is presented in [12].
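As a rough illustration of what dividing a scene into depth slices could
look like on the data side, the sketch below groups 3D points into
equal-width depth bins that a display could then reproduce one after
another. It is an assumption-laden toy example and not the system of
[12]: the point format `(x, y, z)`, the equal-width binning, and the
function name `slice_by_depth` are all hypothetical.

```python
def slice_by_depth(points, num_slices):
    """Group (x, y, z) points into num_slices equal-width depth bins.

    Bin 0 holds the points with the smallest z; the last bin holds the
    points with the largest z.
    """
    if num_slices < 1:
        raise ValueError("num_slices must be at least 1")
    slices = [[] for _ in range(num_slices)]
    if not points:
        return slices
    z_min = min(p[2] for p in points)
    z_max = max(p[2] for p in points)
    span = z_max - z_min
    for p in points:
        if span == 0:
            index = 0  # all points share one depth, so use the first bin
        else:
            # Scale depth into [0, num_slices] and clamp the far end.
            index = min(int((p[2] - z_min) / span * num_slices), num_slices - 1)
        slices[index].append(p)
    return slices


scene = [(0, 0, 0.0), (1, 0, 0.5), (0, 1, 1.0), (1, 1, 0.9)]
for depth_slice in slice_by_depth(scene, 2):
    pass  # a display driver would reproduce each slice in turn here
```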
A PC-based stereoscopic interactive video system
to give the sensation of walking through a prerecorded
3D environment is presented in [11].
A 3D videoconference application is described in a
patent document [15]. The 3D image is captured by
an array of video cameras. Digitized video data is
computer processed and the resultant data is
transmitted. 3D data from all such teleconference
attendees are combined to form a single
3D image which is then transmitted to all locations.
Received 3D video data is displayed using a 3D
projection system.
A paper published in 1995 [20] describes the state
of 3DTV research at that time. An overview of
human factors is presented; issues related to
stereoscopic 3DTV systems are discussed; bandwidth
and its possible reduction through coding are
also covered.
After a brief general introduction to early
anaglyphic broadcasts in Europe in the early 1980s,
[21] mentions more advanced two-channel PAL
demonstrations from both Europe and Japan in 1983,
1985 and 1987. Then a technological
overview of research in Europe is presented,
including psycho-optic aspects and signal
processing issues. An overview of European COST
230 “Stereoscopic Television” is also given,
together with the RACE DISTIMA project.
Developments in Japan and the USA are also briefly
presented.
An end-to-end distributed scalable 3DTV system,
consisting of an array of cameras, clusters of
network-connected PCs, and a multi-projector
display, was developed and implemented by
Mitsubishi Electric Research Laboratories (MERL)
[22]. Multiple video streams are individually
encoded and transmitted over broadband networks.
The display is based on lenticular technology.
Design choices and tradeoffs are presented.