
A Statistical Modeling Approach to Content Based Video Retrieval


Abstract

Statistical modeling for content based retrieval is examined in the context of the recent TREC Video benchmark exercise. The TREC Video exercise can be viewed as a test bed for evaluation and comparison of a variety of different algorithms on a set of high-level queries for multimedia retrieval. We report on the use of techniques adopted from statistical learning theory. Our method depends on training models on large data sets. In particular, we use statistical models such as Gaussian mixture models to build computational representations for a variety of semantic concepts, including rocket-launch, outdoor, greenery, and sky. Training requires a large amount of annotated (labeled) data. We therefore explore the use of active learning in the annotation engine to minimize the number of training samples that must be labeled for satisfactory performance.

Introduction

Supporting semantic queries in video retrieval is an important and difficult problem in multimedia analysis [7, 9], and query by keywords is needed for effective utilization of multimedia repositories [10]. Recent work in video retrieval has witnessed an interesting shift from the query by example (QBE) paradigm [5, 10, 2] to the query by keywords paradigm using semantic video objects [7, 9, 3, 12, 6]. The reason behind this shift lies in the desire to build media search engines that are as pervasive and useful from the user's perspective as their counterparts in text.
To build systems that help users find what they desire, it is important to address the semantics of the queries. To evaluate the effectiveness of the search engines, it is then necessary to have a set of queries and a benchmark database of videos that can be used to conduct experiments. In this light, the emergence of Video TREC [1] as a benchmark is a significant development.

Learning Semantic Concepts for Video Retrieval


The architecture of our statistical modeling approach to video retrieval is outlined in Figure 1. This system includes three main components: semantic annotation, learning and model construction, and video retrieval. The idea behind using automatically constructed models for video retrieval is to facilitate the paradigm of query by keyword [9, 3].
We now describe modeling a set of semantic concepts by learning them from annotated content. We also describe how these models are used to answer the V-TREC queries.
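
As a rough illustration of how a Gaussian mixture model could represent a semantic concept, the sketch below fits a mixture to feature values from annotated positive examples and scores a new sample by its log-likelihood ratio against a background mixture. This is only a minimal sketch under stated assumptions: the scalar feature, the likelihood-ratio scoring rule, and every function name (fit_gmm, log_likelihood, concept_score) are hypothetical choices made for illustration and are not taken from the system described here.

```python
import math
import random


def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian with the given mean and variance."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def fit_gmm(data, k=2, iters=50, seed=0, min_var=1e-3):
    """Fit a k-component 1-D GMM with EM; returns (weights, means, variances)."""
    rng = random.Random(seed)
    n = len(data)
    center = sum(data) / n
    spread = max(sum((x - center) ** 2 for x in data) / n, min_var)
    weights = [1.0 / k] * k
    means = rng.sample(data, k)  # start the means at random data points
    variances = [spread] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in data:
            logs = [math.log(weights[j]) + log_gauss(x, means[j], variances[j])
                    for j in range(k)]
            total = log_sum_exp(logs)
            resp.append([math.exp(v - total) for v in logs])
        # M-step: re-estimate parameters from the responsibilities.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-9)  # guard against empty components
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            variances[j] = max(
                sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj,
                min_var)
            weights[j] = nj / n
    return weights, means, variances


def log_likelihood(x, model):
    """Log-likelihood of one sample under a fitted mixture."""
    weights, means, variances = model
    return log_sum_exp([math.log(w) + log_gauss(x, m, v)
                        for w, m, v in zip(weights, means, variances)])


def concept_score(x, concept_model, background_model):
    """Assumed scoring rule: log-likelihood ratio of concept vs. background."""
    return log_likelihood(x, concept_model) - log_likelihood(x, background_model)
```

Under these assumptions, one mixture would be fitted to features from shots annotated with a concept and another to the remaining shots; a positive concept_score then suggests the concept is present. Real features would be multi-dimensional, which this scalar sketch does not cover.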

Active Learning to Enable Faster Annotation

The abundance of video data and the diversity of labels make annotation a difficult and expensive task. We formulate the task of annotation in the framework of active learning. We first train a classifier with a small set of labeled data, and propagate persistent annotation by prompting the annotator to provide the ground truth for the most informative, or most uncertain, subset of the available data set. The learner is thus refined at every iteration, and the user needs to annotate as few samples as possible.
We use a support vector machine (SVM) as the intrinsic classifier for the active annotator and the active learner. This approach was proposed by Naphade et al. [8]. The initial support vector classifier is built on the basis of a very small annotated data set. Each unseen example is classified by the SVM classifier and, if the uncertainty of classification is high, the annotator is prompted to provide the class label. The classifier is retrained after every iteration. If the uncertainty associated with classification is low, the label provided by the classifier is propagated. This reduces the number of examples to be annotated by a factor of 10.
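
The annotation loop described above can be sketched as follows. This is a minimal sketch, not the implementation of [8]: it assumes a linear SVM trained by Pegasos-style subgradient descent, and it treats an example as uncertain when its decision value falls inside the margin. The function names, the margin_threshold parameter, and the oracle callback standing in for the human annotator are all hypothetical.

```python
import random


def train_linear_svm(X, y, epochs=30, lam=0.01, seed=0):
    """Linear SVM via Pegasos-style subgradient descent; labels are +1 / -1."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            w = [(1.0 - eta * lam) * wj for wj in w]  # regularisation shrink
            if margin < 1.0:  # hinge loss is active for this sample
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
                b += eta * y[i]
    return w, b


def decision(w, b, x):
    """Signed decision value; its magnitude serves as a confidence proxy."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def active_annotate(seed_X, seed_y, pool, oracle, margin_threshold=1.0):
    """Label every pool item, asking the oracle only for uncertain ones."""
    X, y = list(seed_X), list(seed_y)
    w, b = train_linear_svm(X, y)
    labels, asked = [], 0
    for x in pool:
        score = decision(w, b, x)
        if abs(score) < margin_threshold:
            # High uncertainty: prompt the annotator, then retrain.
            label = oracle(x)
            asked += 1
            X.append(x)
            y.append(label)
            w, b = train_linear_svm(X, y)
        else:
            # Low uncertainty: propagate the classifier's own label.
            label = 1 if score > 0 else -1
        labels.append(label)
    return labels, asked
```

Here oracle would be replaced by the human annotator's response, and asked counts how many labels were actually requested. How much annotation effort is saved depends on the data and on the threshold, so this sketch should not be expected to reproduce the reduction reported above.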

Concluding Remarks and Future Directions

In this paper we examine statistical modeling for content based retrieval in the context of the recent TREC Video benchmark exercise. The V-TREC exercise can be viewed as a test bed for evaluation and comparison of a variety of different algorithms on a set of high-level queries for multimedia retrieval. We build models for semantic concepts including Outdoors, Rocket-launch, and several others. Using multiple such models, we answer the set of V-TREC queries.