23-06-2012, 12:11 PM
Improving Web Image Search by Bag-Based Reranking
Improving Web Image.pdf (Size: 1,005.52 KB / Downloads: 37)
Abstract
Given a textual query in traditional text-based image
retrieval (TBIR), relevant images are to be reranked using visual
features after the initial text-based search. In this paper, we propose
a new bag-based reranking framework for large-scale TBIR.
Specifically, we first cluster relevant images using both textual
and visual features. By treating each cluster as a “bag” and the
images in the bag as “instances,” we formulate this problem as
a multi-instance (MI) learning problem.
INTRODUCTION
With the ever-growing number of images on the
Internet (such as those on online photo-sharing websites and
online photo forums), retrieving relevant images from a large
collection of database images has become an important research
topic. Over the
past decades, many image retrieval systems have been developed,
such as text-based image retrieval (TBIR) [3], [12], [19],
[38], [42] and content-based image retrieval [23], [33], [39].
RELATED WORK ON MI LEARNING
MI learning methods have been proposed to solve learning
problems with ambiguity on training samples. In the traditional
supervised learning problems, there is clear knowledge on the
labels of training samples. In contrast, in MI learning problems,
a label is attached only to each training "bag," which consists
of several instances (i.e., training samples). Specifically, in the
traditional setting of MI learning problems, each positive bag
has at least one positive instance, while a negative bag has no
positive instances. MI learning methods [1], [7], [20], [35], [40]
learn models from the training data with such ambiguous label
information and predict the label of test bags or instances.
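The standard MI assumption described above can be made concrete with a short sketch. The function below (a hypothetical helper, not from the paper) derives a bag label from its instance labels: a bag is positive if and only if it contains at least one positive instance.

```python
def bag_label(instance_labels):
    """Standard MI assumption: a bag is positive iff it
    contains at least one positive instance (1 = positive,
    0 = negative)."""
    return int(any(label == 1 for label in instance_labels))

# A bag with one positive instance is a positive bag.
print(bag_label([0, 0, 1]))  # 1
# A bag with no positive instances is a negative bag.
print(bag_label([0, 0, 0]))  # 0
```

This ambiguity, where only the bag label is observed while the individual instance labels stay hidden, is exactly what MI learning methods must resolve during training.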
BAG-BASED WEB IMAGE RERANKING FRAMEWORK
Here, we present our proposed bag-based reranking framework
for large-scale TBIR. Our goal is to improve Web
image retrieval in large Internet image databases.
These Web images are usually accompanied by textual descriptions.
For the i-th Web image, a low-level visual feature vector v_i
(e.g., color, texture, and shape) and a textual feature vector t_i
(e.g., term frequencies) can be extracted. We further aggregate
them into a single feature vector for subsequent operations,
namely, f_i = [v_i', λ t_i']', where λ is a weight parameter.
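As a minimal sketch of this aggregation step (the function name and the exact combination rule are assumptions for illustration, not the paper's implementation), the visual and weighted textual features can be concatenated into one vector:

```python
import numpy as np

def aggregate_features(visual, textual, weight=0.5):
    """Concatenate the visual feature vector with the textual
    feature vector scaled by a weight parameter, yielding a
    single feature vector for subsequent operations."""
    visual = np.asarray(visual, dtype=float)
    textual = np.asarray(textual, dtype=float)
    return np.concatenate([visual, weight * textual])

# Hypothetical descriptors: a 3-D visual feature and a
# 3-D term-frequency vector.
v = np.array([0.2, 0.8, 0.1])
t = np.array([1.0, 0.0, 2.0])
f = aggregate_features(v, t, weight=0.5)
# f = [0.2, 0.8, 0.1, 0.5, 0.0, 1.0]
```

The weight parameter controls the trade-off between the two modalities; setting it to 0 reduces the representation to visual features only.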
CONCLUSION AND FUTURE WORK
In this paper, we have proposed a bag-based framework for
large-scale TBIR. Web images with textual descriptions (i.e.,
tags) have been used for this real-world application. Given a
textual query, relevant images are to be reranked after the initial
text-based search. Instead of directly reranking the relevant
images by using traditional image reranking methods, we have
partitioned the relevant images into clusters. By treating each
cluster as a “bag” and the images in a bag as “instances,” we
have formulated this problem as an MI learning problem.