24-09-2014, 09:43 AM
Content Based Image Retrieval Methods Using Graphical Image Retrieval Algorithm (GIRA) Project Report
Abstract
This document gives a brief description of a system
developed for retrieving images similar to a query image from a
large set of distinct images. It follows an image segmentation
based approach to extract the different features present in an
image. These features are stored in feature vectors and compared
with the feature vector of the query image; the image database is
then sorted in decreasing order of similarity.
Different from traditional dimensionality reduction algorithms
such as Principal Component Analysis (PCA) and Linear
Discriminant Analysis (LDA), which effectively see only the global
Euclidean structure, GIRA is designed for discovering the local
manifold structure. Therefore, GIRA is likely to be more suitable
for image retrieval, where nearest neighbor search is usually
involved. After projecting the images into a lower dimensional
subspace, the relevant images get closer to the query image; thus,
the retrieval performance can be enhanced.
INTRODUCTION
Content-based image retrieval (CBIR) is motivated by the fast growth of
digital image databases, which, in turn, require efficient
search schemes. Rather than describing an image by using
text, in these systems, an image query is described using one or
more example images. The low-level visual features (color,
texture, shape, etc.) are automatically extracted to represent the
images. However, the low-level features may not accurately
characterize the high-level semantic concepts. Image
retrieval techniques based on visual image content have been
in focus for more than a decade. Many web search engines
retrieve similar images by searching and matching textual
metadata associated with digital images. This paper analyses
the challenges and issues of CBIR techniques and systems that
have evolved in recent years, covering various methods for
segmentation; edge-, boundary-, region-, color-, texture-, and
shape-based feature extraction; and object detection and
identification. For better precision of the retrieved images,
text-based search requires meaningful descriptive text labels
to be associated as metadata with every image in the database. In
real-world image retrieval systems, the relevance feedback
provided by the user is often limited, typically fewer than 20
examples, whereas the dimensionality of the image space can range
from several hundred to thousands. Manual image labeling
OPTIMAL LINEAR EMBEDDING
Histogram refinement based on color coherence vectors was
proposed in [3]. The technique considers spatial information
and classifies the pixels of each histogram bucket as coherent if
they belong to a large similarly colored region and incoherent
otherwise. Although computationally expensive, the technique
improves the performance of histogram-based matching. The color
correlogram feature was proposed in [2]; it takes into account
the local spatial correlation of colors as well as the global
distribution of this spatial correlation. The correlogram
expresses how the spatial correlation of pairs of colors changes
with distance, and hence performs better than classical
histogram-based techniques. A modified histogram-based technique
that incorporates the spatial layout of each color using annular,
angular, and hybrid histograms was proposed in [4]. In [5],
cumulative histograms and corresponding distances were proposed
as image similarity measures that overcome the quantization
problem of histogram bins.
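As a small illustration of why cumulative histograms soften the bin
quantization problem mentioned for [5], the following Python sketch
compares an L1 distance on plain and on cumulative histograms. The
function names, the choice of the L1 distance, and the toy 4-bin
histograms are assumptions made for illustration; they are not taken
from [5].

```python
from itertools import accumulate


def normalize(hist):
    """Scale bin counts so that they sum to 1."""
    total = sum(hist)
    return [count / total for count in hist]


def cumulative(hist):
    """Running sum of the normalized histogram."""
    return list(accumulate(normalize(hist)))


def l1_distance(a, b):
    """Sum of absolute bin-wise differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


# Three toy images whose pixels all fall into a single bin.
near_a = [4, 0, 0, 0]   # all mass in bin 0
near_b = [0, 4, 0, 0]   # all mass in the adjacent bin 1
far = [0, 0, 0, 4]      # all mass in the distant bin 3

# Plain histograms cannot tell "adjacent bin" from "distant bin".
plain_near = l1_distance(normalize(near_a), normalize(near_b))
plain_far = l1_distance(normalize(near_a), normalize(far))

# Cumulative histograms grow with the number of bins the mass moved.
cum_near = l1_distance(cumulative(near_a), cumulative(near_b))
cum_far = l1_distance(cumulative(near_a), cumulative(far))

print(plain_near, plain_far)
print(cum_near, cum_far)
```

In this toy case the plain distance is the same for both pairs, while
the cumulative distance is smaller for the pair whose colors fall into
neighboring bins.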
PROPOSED GIRA ALGORITHM
The proposed scheme is not the same as the existing frameworks
that unify keywords and visual content. Keyword models built
from the visual features of a set of images are labeled with
keywords. The scheme incorporates an image analysis algorithm
into text-based image search engines and is implemented on a
real-world image database. High-level semantic retrieval can be
done by using relevant images from the Yahoo image search
engine. For low-level features, we introduce a fast and robust
color feature extraction technique, namely auto color
correlogram and correlation (ACCC), based on the color
correlogram (CC) [7] and autocorrelogram (AC) [7] algorithms, for
extracting and indexing low-level features of images. Its
retrieval performance is satisfactory, with higher average
precision than retrieval using the autocorrelogram (AC).
Moreover, it can reduce the computation time from O(m^2 d) to
O(md) [8]. A multi-threaded processing framework is proposed to
incorporate the image analysis algorithm into text-based image
search engines. It enhances the capability of an application
when downloading images, indexing them, and comparing the
similarity of retrieved images from diverse sources.
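To make the autocorrelogram (AC) feature mentioned above concrete,
here is a minimal Python sketch. It is a naive illustration, not the
ACCC method and not the efficient O(md) algorithm: it simply estimates,
for each color c and distance k, the fraction of pixels at L-infinity
distance exactly k from a pixel of color c that also have color c. The
function name, the input format (rows of already quantized color
indices), and the handling of image borders (out-of-image positions are
ignored) are assumptions made for this example.

```python
from collections import defaultdict


def autocorrelogram(image, distances):
    """Naive autocorrelogram of an image given as rows of color indices.

    Returns a dict mapping (color, k) to the estimated probability that
    a pixel at L-infinity distance k from a pixel of that color has the
    same color. Positions outside the image are skipped (assumption).
    """
    height, width = len(image), len(image[0])
    same = defaultdict(int)    # neighbor pairs with matching color
    total = defaultdict(int)   # all in-bounds neighbor pairs
    for y in range(height):
        for x in range(width):
            color = image[y][x]
            for k in distances:
                for dy in range(-k, k + 1):
                    for dx in range(-k, k + 1):
                        # Keep only the ring at distance exactly k.
                        if max(abs(dx), abs(dy)) != k:
                            continue
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < height and 0 <= nx < width:
                            total[(color, k)] += 1
                            if image[ny][nx] == color:
                                same[(color, k)] += 1
    return {key: same[key] / total[key] for key in total}


# A uniform image: every neighbor has the same color as its center.
uniform = [[0] * 4 for _ in range(4)]
print(autocorrelogram(uniform, [1, 2]))
```

The resulting dictionary can be flattened in a fixed (color, k) order
to obtain a feature vector for indexing.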
IMAGE BROWSING EXAMPLE
Queries based on texture properties have many applications
in image and multimedia databases. Here, we describe, with an
example, our current work on incorporating these features for
browsing large satellite images and air photos. This work
relates to the UCSB Alexandria digital library project [11]
whose goal is to create a digital library of spatially indexed data
such as maps and satellite images. Typical images in such a
database range from a few megabytes to hundreds of megabytes,
posing challenging problems in image analysis and
visualization of data. Content-based retrieval will be very useful
in this context in answering queries such as "Retrieve all
LANDSAT images of Santa Barbara which have less than 20%
cloud cover," or "Find a vegetation patch that looks like this
region." We are currently investigating the use of texture
primitives to accomplish rapid content-based browsing within
an image or across similar images.
FEATURE EXTRACTION
There are various visual descriptors used to extract a low-level
feature vector of an image [3]. However, in this paper, we use
color descriptors for retrieving images. The color texture
database used in the experiments consists of 116 different
texture classes. Each of the 512 x 512 images is divided into 16
non-overlapping 128 x 128 subimages, thus creating a database
of 1,856 texture images. A query pattern in the following is
any one of the 1,856 patterns in the database. This pattern is then
processed to compute the feature vector as in (7). The distance
d(i, j), where i is the query pattern and j is a pattern from the
database, is computed. The distances are then sorted in
increasing order and the closest set of patterns is then retrieved.
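The database construction and ranking steps described here can be
sketched in Python as follows. The helper names are hypothetical, the
Euclidean distance stands in for d(i, j) (its definition is not given
in this excerpt), and the feature vector of (7) is not reproduced, so
the ranking function takes precomputed vectors as input.

```python
import math


def split_into_tiles(image, tile):
    """Cut an image (list of pixel rows) into non-overlapping
    tile x tile subimages, scanning left to right, top to bottom."""
    rows, cols = len(image), len(image[0])
    tiles = []
    for top in range(0, rows - tile + 1, tile):
        for left in range(0, cols - tile + 1, tile):
            block = [row[left:left + tile] for row in image[top:top + tile]]
            tiles.append(block)
    return tiles


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_by_distance(query, database):
    """Indices of database vectors, sorted by increasing distance
    to the query vector (closest pattern first)."""
    distances = [euclidean(query, vector) for vector in database]
    return sorted(range(len(database)), key=lambda j: distances[j])


# One 512 x 512 image yields 16 subimages of 128 x 128 pixels,
# so 116 texture classes give 116 * 16 = 1,856 database patterns.
image = [[0] * 512 for _ in range(512)]
tiles = split_into_tiles(image, 128)
print(len(tiles), 116 * len(tiles))

# Toy feature vectors (illustrative values only).
ranking = rank_by_distance([0.0, 0.0], [[3.0, 4.0], [0.0, 1.0], [1.0, 1.0]])
print(ranking)
```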
The correlogram is an efficient feature extraction technique used in
content-based image retrieval (CBIR) systems. The technique,
namely the color correlogram, is widely used for finding the spatial
correlation of each color in an image. It was introduced by
Huang et al. [7]. The technique was implemented and it was
found that the retrieval performance of the color correlogram was
better than that of the standard color histogram and the color
coherence vector methods. However, the color correlogram is
expensive to compute: its computation time is O(m^2 d).
The authors also present a technique
that captures the spatial correlation between identical colors,
called the autocorrelogram, with a computation time of
O(md). However,
CONCLUSIONS
This paper presents a novel manifold learning algorithm, called
GIRA, for image retrieval. In the first step, we construct a
between-class nearest neighbor graph and a within-class nearest
neighbor graph to model both the geometrical and the discriminant
structures in the data. A standard spectral technique is then used
to find an optimal projection, which respects the graph structure.
This way, the Euclidean distances in the reduced subspace can
reflect the semantic structure in the data to some extent.
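The first step named above, building the two nearest neighbor graphs,
can be sketched in Python as follows. This is one plausible reading
and not necessarily the report's exact construction: each point's k
nearest neighbors are split by label, same-label neighbors go into the
within-class graph, different-label neighbors go into the between-class
graph, and both graphs are made symmetric with 0/1 weights. The
function names, the value of k, and the weighting are assumptions. The
spectral step that finds the projection is not shown.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_neighbor_graphs(points, labels, k):
    """Build symmetric 0/1 adjacency matrices (lists of lists).

    within[i][j] = 1 if i and j share a label and one of them is
    among the k nearest neighbors of the other;
    between[i][j] = 1 in the same situation with differing labels.
    """
    n = len(points)
    within = [[0] * n for _ in range(n)]
    between = [[0] * n for _ in range(n)]
    for i in range(n):
        # All other points, closest first.
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: squared_distance(points[i], points[j]),
        )
        for j in others[:k]:
            graph = within if labels[i] == labels[j] else between
            graph[i][j] = 1
            graph[j][i] = 1
    return within, between


# Toy 1-D data: two well-separated classes.
points = [[0.0], [1.0], [10.0], [11.0]]
labels = ["a", "a", "b", "b"]
within, between = nearest_neighbor_graphs(points, labels, k=1)
print(within)
print(between)
```

With well-separated classes and a small k, the between-class graph is
empty; its edges appear only where differently labeled images lie
close together in feature space, which is where a projection has to
pull them apart.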