06-10-2012, 05:33 PM
Watermarking of Free-view Video
Watermarking of Free-view.pdf (Size: 1.72 MB / Downloads: 39)
Abstract
With the advances in image based rendering (IBR) in
recent years, generation of a realistic arbitrary view of a scene from
a number of original views has become cheaper and faster. One
of the main applications of this progress has emerged as free-view
TV (FTV), where TV viewers freely select the viewing position and
angle via IBR applied to the transmitted multiview video. Since a
TV viewer might record a personal video of this arbitrarily selected
view and misuse the content, copyright and copy-protection
problems clearly also exist for FTV and must be solved. In this
paper, we focus on this newly emerged problem by
proposing a watermarking method for free-view video. The watermark
is embedded into every frame of multiple views by exploiting
the spatial masking properties of the human visual system.
Assuming that the position and rotation of the virtual camera are
known, the proposed method extracts the watermark successfully
from an arbitrarily generated virtual image. In order to extend
the method for the case of an unknown virtual camera position
and rotation, the transformations on the watermark pattern due
to image based rendering operations are analyzed. Based upon this
analysis, camera position and homography estimation methods are
proposed for the virtual camera. The encouraging simulation results
promise not only a novel method, but also a new direction for
watermarking research.
INTRODUCTION
In the last decade, the utilization of 3-D information for
modelling and representation of a real world scene has become
widespread in many applications, such as animation films,
video games and stereoscopic displays. Parallel to these applications,
the research on more sophisticated technologies based
upon rendering of 3-D scenes, such as free-view televisions
(FTVs) and 3-D holographic televisions, has already reached
an acceptable maturity [1]. In addition, standardization on multiview
coding has been delivered by ISO and ITU bodies, and
FTV is expected to be the next goal for standardization [2]–[4].
General Framework and Basic Requirements
First of all, besides robustness to common video processing
and multiview video processing operations, the main characteristic
challenge for multiview watermarking is to detect the embedded
signal from a virtual video sequence generated for an arbitrary
(user-selected) view (Fig. 1). Depending on the availability
of the original and/or watermarked multiview content, the data
utilized during watermark detection can be as follows:
1) a single watermarked virtual view generated from
watermarked multiview video;
2) a single watermarked virtual view, and the watermarked
multiview video from which the virtual view is generated;
3) a single watermarked virtual view generated from watermarked
multiview video, and original (unwatermarked)
multiview video.
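In all three scenarios, detection ultimately reduces to deciding whether a known watermark pattern is present in a rendered frame. A minimal correlation-based sketch of that decision is given below; the function name, the threshold value, and the use of plain normalized correlation are illustrative assumptions — the paper's actual detector additionally compensates for the rendering transformations discussed later.

```python
import numpy as np

def detect_watermark(frame, watermark, threshold=0.05):
    """Decide whether `watermark` is present in `frame` via
    normalized correlation (an illustrative sketch, not the
    paper's exact detector).

    Returns the correlation value and a presence decision.
    """
    f = frame.astype(float) - frame.mean()
    w = watermark.astype(float) - watermark.mean()
    corr = (f * w).sum() / (np.linalg.norm(f) * np.linalg.norm(w))
    return corr, corr > threshold
```

With a sufficiently strong embedded pattern, the correlation for a watermarked frame stands well above that of an unwatermarked one, which is what makes a fixed threshold workable.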
LIGHT FIELD RENDERING (LFR)
LFR is a simple method for generating novel views from arbitrary
camera angles by combining and resampling the available
images without utilizing any depth information about the scene
[18]. The basic idea behind the technique is a representation of
the light field as the radiance at a point in a given direction in
free space which is free of occluders. This representation characterizes
the flow of light through free space in a static scene
with fixed illumination [18].
A practical way to create (capture) light fields is to assemble a
collection of images by a number of cameras where the intensity
of the pixels of the camera views corresponds to the radiance of
the light rays passing through the pixel locations and camera
centers. A general configuration for LFR is given in Fig. 2(a)
with two parallel planes, namely camera (uv) plane and focal
(st) plane.
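In this two-plane setup, each captured light ray is identified by where it crosses the camera (uv) plane and the focal (st) plane. The geometric core — intersecting the ray through a camera center and a pixel with the focal plane — can be sketched as follows; the axis-aligned plane and the function names are simplifying assumptions for illustration.

```python
import numpy as np

def ray_plane_intersection(cam_center, pixel_point, plane_z):
    """Intersect the ray through `cam_center` and `pixel_point`
    with the plane z = plane_z (standing in for the focal/st
    plane of the two-plane parameterization).

    Both inputs are 3-D points; returns the 3-D intersection,
    whose x, y coordinates play the role of (s, t).
    """
    d = pixel_point - cam_center          # ray direction
    t = (plane_z - cam_center[2]) / d[2]  # parameter where z hits plane_z
    return cam_center + t * d
```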
Watermark Embedding
The proposed approach inserts the watermark into each image
of the light field image array [Fig. 2(b)] by exploiting the spatial
sensitivity of the human visual system (HVS) [20]. For that purpose,
the watermark is modulated with the output image obtained
by filtering each light field image with a high-pass
filter, and is spatially added onto the light field image. Such
a process decreases the watermark strength over the flat regions
of the image, where the HVS is more sensitive, and increases
the embedded watermark energy in the textured regions, where
the HVS is relatively insensitive [20].
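The masking step described above can be sketched in a few lines. Here the high-pass response is approximated as the image minus a 3x3 local mean, and `alpha` is an illustrative strength parameter — neither the filter nor the scaling is claimed to match the paper's exact choices.

```python
import numpy as np

def embed_watermark(image, watermark, alpha=0.1):
    """Embed `watermark` into `image`, gated by local high-frequency
    activity (a sketch of HVS spatial masking).

    Flat regions (near-zero high-pass response) receive almost no
    watermark energy; textured regions receive more.
    """
    img = image.astype(float)
    padded = np.pad(img, 1, mode="edge")
    # 3x3 box blur computed as the mean of nine shifted copies
    blur = sum(padded[i:i + img.shape[0], j:j + img.shape[1]]
               for i in range(3) for j in range(3)) / 9.0
    highpass = np.abs(img - blur)        # local activity measure
    return img + alpha * highpass * watermark
```

On a perfectly flat image the high-pass response is zero everywhere, so the watermarked output is identical to the input — exactly the behavior the masking model calls for.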
Two important points should be noted at the embedding stage.
First of all, the watermark is embedded into the transmitted light
field images, which are the sheared perspective projections of the
original camera-captured frames [18]. These frames can be obtained
easily using the camera calibration information [18]. Secondly,
the watermark component, added to the intensity of each image
pixel, is determined according to the intersection of the light ray
corresponding to that pixel with the focal plane. The same watermark
sample is added to the pixels of different camera views whose
corresponding light rays intersect at the same point in the
focal plane, as illustrated in Fig. 4. The rationale behind such a
procedure is to avoid the superposition of different
watermark samples from different camera frames in the
interpolation step during rendering.
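The focal-plane indexing above can be sketched as a lookup keyed by the (s, t) intersection point rather than by pixel coordinates; the uniform grid of watermark samples and the `cell_size` quantization are illustrative assumptions.

```python
import numpy as np

def watermark_value(st_point, wm_grid, cell_size=1.0):
    """Return the watermark sample for a ray, indexed by the
    focal-plane point (s, t) it passes through.

    Rays from different cameras that intersect the focal plane at
    the same point fall into the same grid cell and therefore
    receive the same watermark sample, which is what prevents
    different samples from being mixed during rendering.
    """
    s_idx = int(np.floor(st_point[0] / cell_size)) % wm_grid.shape[0]
    t_idx = int(np.floor(st_point[1] / cell_size)) % wm_grid.shape[1]
    return wm_grid[s_idx, t_idx]
```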
ANALYSIS OF LFR ON WATERMARK PATTERN
FOR AN UNKNOWN CAMERA
In the previous tests, the position and orientation of the virtual
camera are assumed to be known during the detection stage.
However, this information is not available in practice. Hence,
the detection algorithm should also include a procedure to determine
the position and orientation of the virtual camera.
In order to handle this problem, the relation between the embedded
watermark and the rendered watermark should be analyzed
separately for the two interpolation methods used in LFR,
namely nearest-neighbor and bilinear interpolation. In this part,
the analysis for nearest-neighbor interpolation and its robustness
results are given; the proposed solution for the bilinear case
is presented later.
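The two interpolation modes affect the embedded pattern differently, which is why they are analyzed separately: nearest-neighbor copies a single source pixel (so a watermark sample survives unchanged), while bilinear blends four neighbors (so watermark samples are attenuated by the blending weights). A minimal sketch of the two samplers, with illustrative function names:

```python
import numpy as np

def nearest_sample(img, x, y):
    """Nearest-neighbor sampling: the rendered pixel copies exactly
    one source pixel, leaving its watermark sample intact."""
    return img[int(round(y)), int(round(x))]

def bilinear_sample(img, x, y):
    """Bilinear sampling: the rendered pixel is a weighted mix of
    four neighbors, so watermark samples are blended by the same
    weights."""
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    dx, dy = x - x0, y - y0
    return ((1 - dx) * (1 - dy) * img[y0, x0]
            + dx * (1 - dy) * img[y0, x0 + 1]
            + (1 - dx) * dy * img[y0 + 1, x0]
            + dx * dy * img[y0 + 1, x0 + 1])
```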
CONCLUSION
The emerging FTV systems have introduced copyright
problems for multiview video content. In this paper, we have
pointed out the essential differences of multiview video watermarking,
as a solution to this copyright problem, compared
to the well-studied single-view video watermarking. The main
challenge has emerged as watermark detection from virtual
views rendered for arbitrary camera positions. Our proposed
method succeeds in detecting the watermark from the rendered
views for both known and unknown camera positions by
analyzing and estimating the variations on the watermark
patterns due to the rendering operations. In this pioneering
work, we have considered static scenes consisting of a single
object, and the main distortions in the transmission chain of multiview
video as the attacks on the rendered object.