Image Ranking and Retrieval based on Multi-Attribute Queries

Abstract

We propose a novel approach for ranking and retrieval of images based on multi-attribute queries. Existing image retrieval methods train separate classifiers for each word and heuristically combine their outputs when retrieving multi-word queries. Moreover, these approaches also ignore the interdependencies among the query terms. In contrast, we propose a principled approach for multi-attribute retrieval which explicitly models the correlations present between the attributes. Given a multi-attribute query, we also utilize other attributes in the vocabulary which are not present in the query for ranking/retrieval. Furthermore, we integrate ranking and retrieval within the same formulation by posing them as structured prediction problems. Extensive experimental evaluation on the Labeled Faces in the Wild (LFW), FaceTracer and PASCAL VOC datasets shows that our approach significantly outperforms several state-of-the-art ranking and retrieval methods.

Introduction

In the past few years, methods that exploit the semantic
attributes of objects have attracted significant attention
in the computer vision community. The usefulness of these
methods has been demonstrated in several different application
areas, including object recognition [5, 17, 24], face verification [16] and image search [22, 15].
In this paper we address the problem of image ranking
and retrieval based on semantic attributes. Consider the
problem of ranking/retrieval of images of people according
to queries describing the physical traits of a person, including
facial attributes (e.g. hair color, presence of beard
or mustache, presence of eyeglasses or sunglasses etc.),
body attributes (e.g. color of shirt and pants, striped shirt,
long/short sleeves etc.), demographic attributes (e.g. age,
race, gender) and even non-visual attributes (e.g. voice type,
temperature and odor) which could potentially be obtained
from other sensors.
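
To make the setting concrete, the sketch below is a minimal illustration (the attribute names, feature layout and scoring functions are hypothetical, not taken from the paper) of how a multi-attribute query can be expressed as a set of attribute terms, and of the kind of naive baseline the abstract refers to, in which independently trained per-attribute classifier scores are simply summed.

```python
# Minimal sketch (hypothetical names, not the paper's method): a multi-attribute
# query is a set of attribute terms; the naive baseline scores each image by
# summing the outputs of independently trained per-attribute classifiers,
# ignoring any correlations between attributes.
from typing import Callable, Dict, List

# Each per-attribute classifier maps an image feature vector to a confidence score.
AttributeClassifier = Callable[[List[float]], float]

def baseline_score(image_features: List[float],
                   query: List[str],
                   classifiers: Dict[str, AttributeClassifier]) -> float:
    """Sum of independent per-attribute scores for the query terms."""
    return sum(classifiers[attr](image_features) for attr in query)

def rank_images(images: Dict[str, List[float]],
                query: List[str],
                classifiers: Dict[str, AttributeClassifier]) -> List[str]:
    """Return image ids ordered by the combined score, highest first."""
    return sorted(images,
                  key=lambda img_id: baseline_score(images[img_id], query, classifiers),
                  reverse=True)

# Toy usage: the classifiers are placeholders for learned models.
if __name__ == "__main__":
    classifiers = {
        "beard": lambda x: x[0],          # toy: score read off a feature dimension
        "sunglasses": lambda x: x[1],
        "striped shirt": lambda x: x[2],
    }
    images = {"img_001": [0.9, 0.2, 0.7], "img_002": [0.1, 0.8, 0.3]}
    print(rank_images(images, ["beard", "striped shirt"], classifiers))
```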

Related Work

An approach that has proved extremely successful for document retrieval is learning to rank [12, 7, 18, 2], where a ranking function is learnt given either the pairwise preference relations or the relevance levels of the training examples. Similar methods have also been proposed for ranking images [10]. Several image retrieval methods, which retrieve images relevant to a textual query, adopt a visual reranking framework [1, 6, 13, 23], which is a two-stage process. In the first stage, images are retrieved based purely on textual features such as tags (e.g. in Flickr), query terms in webpages and image metadata. The second stage involves reranking or filtering these images using a classifier trained on visual features. A major limitation of these approaches is the requirement of textual annotations for the first stage of retrieval, which are not available in many applications, for example the surveillance scenario described in the introduction. Another drawback of both the image ranking approaches and the visual reranking methods is that they learn a separate ranking/classification function for each query term and hence have to resort to ad-hoc methods for retrieving/ranking multi-word queries. A few methods have been proposed for dealing with multi-word queries, notably PAMIR [8] and TagProp [9]. However, these methods do not take into account the dependencies between query terms. We show that significant dependencies often exist between query words and that modeling them can substantially improve ranking and retrieval performance.
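
As background for the learning-to-rank idea mentioned above, the following sketch shows a generic RankSVM-style pairwise objective: it is an assumed, simplified illustration of learning from pairwise preference relations, not the formulation of any of the cited works. Given pairs (x_pos, x_neg) where x_pos should rank above x_neg, a linear scoring function w·x is learned by minimizing a hinge loss on the score differences.

```python
# Sketch of pairwise learning to rank (generic RankSVM-style objective,
# assumed for illustration): learn w so that w.x_pos > w.x_neg by a margin.
import numpy as np

def train_pairwise_ranker(pairs, dim, lr=0.1, reg=1e-3, epochs=100):
    """pairs: list of (x_pos, x_neg) tuples of np.ndarray feature vectors of length dim."""
    w = np.zeros(dim)
    for _ in range(epochs):
        for x_pos, x_neg in pairs:
            margin = w @ (x_pos - x_neg)
            grad = reg * w                 # L2 regularization
            if margin < 1.0:               # hinge loss active: push the pair apart
                grad -= (x_pos - x_neg)
            w -= lr * grad
    return w

# Toy example: two preference pairs in a 3-dimensional feature space.
pairs = [(np.array([1.0, 0.2, 0.0]), np.array([0.1, 0.9, 0.0])),
         (np.array([0.8, 0.1, 0.3]), np.array([0.2, 0.7, 0.1]))]
w = train_pairwise_ranker(pairs, dim=3)
print("learned weights:", w)
```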

Multi-Attribute Retrieval and Ranking

We now describe our Multi-Attribute Retrieval and Ranking (MARR) approach. Our image retrieval method is based on the concept of reverse learning. Here, we are given a set of labels X and a set of training images Y. Corresponding to each label x_i (x_i ∈ X), a mapping is learned to predict the set of images y (y ⊆ Y) that contain the label x_i. Since reverse learning has a structured output (a set of images), it fits well into the structured prediction framework. Reverse learning was recently proposed in [19], and was shown to be extremely effective for multi-label classification. The main advantage of reverse learning is that it allows learning based on the minimization of loss functions corresponding to a wide variety of performance measures such as Hamming loss, precision and recall. We build upon this approach in three different ways. First, we propose a single framework for both retrieval and ranking. This is accomplished by adopting a ranking approach similar to [18], where the output is a set of images ordered by relevance, enabling the integration of ranking and reverse learning within the same framework.
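
Since reverse learning is motivated by the ability to optimize set-level performance measures, the short sketch below illustrates what Hamming loss, precision and recall look like when a predicted set of images is compared against the ground-truth set for a label. This is purely illustrative; the paper's actual structured losses and inference procedure are not reproduced here.

```python
# Illustrative set-level measures that reverse learning can target (a sketch,
# not the paper's structured loss): compare a predicted set of image ids
# against the ground-truth set of images carrying a given label.
def hamming_loss(predicted: set, relevant: set, all_images: set) -> float:
    """Fraction of images that are either false positives or false negatives."""
    mistakes = (predicted - relevant) | (relevant - predicted)
    return len(mistakes) / len(all_images)

def precision(predicted: set, relevant: set) -> float:
    return len(predicted & relevant) / len(predicted) if predicted else 0.0

def recall(predicted: set, relevant: set) -> float:
    return len(predicted & relevant) / len(relevant) if relevant else 0.0

# Toy example: 6 images, the label is present in 3 of them.
all_images = {"i1", "i2", "i3", "i4", "i5", "i6"}
relevant = {"i1", "i2", "i3"}
predicted = {"i1", "i2", "i5"}
print(hamming_loss(predicted, relevant, all_images),  # 2/6
      precision(predicted, relevant),                 # 2/3
      recall(predicted, relevant))                    # 2/3
```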

Conclusion

We have presented an approach for ranking and retrieval of images based on multi-attribute queries. We utilize a structured prediction framework to integrate ranking and retrieval within the same formulation. Furthermore, our approach models the correlations between different attributes, leading to improved ranking/retrieval performance. The effectiveness of our framework was demonstrated on three different datasets, where our method outperformed a number of state-of-the-art approaches for both ranking and retrieval. In the future, we plan to explore image retrieval for more complex queries such as scene descriptions consisting of the objects present, along with their attributes and the relationships among them.