01-07-2014, 03:08 PM
Content-Based Video Retrieval
ABSTRACT
In recent years, multimedia data has grown rapidly thanks to advanced multimedia capturing devices such as digital video recorders and mobile cameras. Since conventional query-by-text retrieval cannot effectively satisfy users’ requirements in finding the desired videos, content-based video retrieval is regarded as one of the most practical ways to improve retrieval quality. Video retrieval using query-by-image has not been successful in associating videos with the user’s interest either. In this paper, we propose a method that achieves high-quality content-based video retrieval by discovering the temporal patterns in the video contents. On the basis of the discovered temporal patterns, an efficient indexing technique and an effective sequence matching technique are integrated to reduce the computation cost and to raise the retrieval accuracy, respectively. Experimental results indicate that our approach is promising for enhancing content-based video retrieval in terms of both efficiency and effectiveness.
INTRODUCTION
In recent years, advanced digital capturing technology has led to the rapid growth of digital data. Thanks to easy-to-use communication tools, millions of multimedia items are exchanged on the Internet at any time. Hence, knowledge discovery from the massive amount of multimedia data, so-called multimedia mining, has been the focus of attention over the past few years. For multimedia mining, compound and complex multimedia data are usually organized into multimedia repositories by multimedia conceptualizing techniques such as classification and annotation. The main idea behind these techniques is that researchers attempt to satisfy the users’ semantic demands by automatically bridging human concepts and low-level features. Nevertheless, very few studies so far have been successful in modeling the relationships between the complex low-level features and the diverse human concepts. For example, two similar videos annotated with different conceptual descriptions can produce a large gap between the user’s intention and the multimedia search results.
Also, video conceptualization is much more difficult than image conceptualization, since videos consist of multiple types of multimedia content, including image, audio and text. A video can be viewed as a set of sequential images that contains many different concepts. Even though several studies have annotated the image frames in a video, the concepts in individual image frames still cannot represent the whole video. In addition, almost all online search engines, such as YouTube, Google, Yahoo and MSN, provide users with text-based multimedia search that relies on manual descriptions. However, such a search cannot precisely capture what the user has in mind. A real example is the query “Racing Car”, for which the search results are almost entirely incorrect with respect to what the user has in mind.
MOTIVATION
The amount of digital video data has grown remarkably in recent years, yet tools for classifying and retrieving video content are lacking. There is a gap between low-level features and high-level semantic content, so enabling machines to understand video is both important and challenging.
Necessity of a video database management system: the amount of captured video data keeps increasing, and an efficient way to handle multimedia data is needed.
Traditional databases vs. video databases: a traditional database has the tuple as its basic unit of data, whereas a video database has the shot as its basic unit of data.
DOMAIN STUDY
Search engines have been widely used as a platform for finding knowledge on the web in the last few decades. Yet very little attention was given to video retrieval until digital capturing devices and communication tools became popular. The most natural way to facilitate text-based video retrieval is manual annotation. However, manual annotation is expensive because of the massive amount of video content. For this reason, a considerable number of past studies addressed automated semantic annotation of videos using techniques such as decision trees, hidden Markov models (HMM), k-nearest neighbors (KNN), association mining and support vector machines (SVM). Through the automated descriptions of videos, the user’s interest and the videos can be associated semantically.
Unfortunately, diverse concepts cause distorted descriptions and thereby limit the effectiveness of video retrieval. Because of this limitation of text-based video retrieval, we focus our attention on content-based video retrieval. For content-based video retrieval, a video is traditionally divided into several scenes, and each scene contains some shots that consist of a few time-limited or similarity-limited image frames. Out of these sequential frames, a representative frame is defined as the key-frame. In general, based on the extracted visual features such as color, shape and texture, the related work on content-based video retrieval can be divided into several categories.
LITERATURE SURVEY
The main challenge in content-based video retrieval is how to use video contents to find the videos a user is interested in, both effectively and efficiently. Effective and efficient retrieval rests primarily on two aspects: the index strategy and the search strategy. In this section, we present in detail how our proposed method achieves high-quality content-based video retrieval through pattern-based matching techniques.
Basic idea
Before introducing our proposed method, we clarify the basic idea. Generally speaking, a video is composed of a sequence of shots (key-frames). Because these sequential shots are related to one another, a video retrieval strategy naturally has to consider the temporal continuity of shots. That is to say, two video clips are similar if their subsequences are similar. Because of the complicated video contents, the complexity of content-based video retrieval (CBVR) is much higher than that of content-based image retrieval (CBIR). For example, Fig. 2 shows two shot sequences extracted from two video clips. Overall, (a) and (b) can be viewed as a set of relevant video clips, since the temporal continuities of the two sequences are almost the same in terms of visual features. In contrast to CBIR, this example shows the critical point that effective video retrieval depends on good sequence matching. Based on this viewpoint, we index identical sequences in a tree structure to make video retrieval more efficient.
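The idea that two clips are similar when their shot sequences share similar subsequences can be illustrated with a small sketch. It assumes each shot has already been reduced to a symbol, and it uses the longest common subsequence (LCS) as a stand-in similarity measure; the source does not state which matching measure is used, and the function names and the sample sequences are made up for illustration.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of sequences a and b."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                # Extend the best match that ended before both symbols.
                curr.append(prev[j - 1] + 1)
            else:
                # Skip one symbol from either sequence.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def sequence_similarity(a, b):
    """LCS length normalised by the longer sequence, a value in [0, 1]."""
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / max(len(a), len(b))


# Hypothetical shot-symbol sequences for three clips.
clip_a = ["A", "B", "B", "C", "D"]
clip_b = ["A", "B", "C", "C", "D"]
clip_c = ["D", "E", "E", "A", "F"]

print(sequence_similarity(clip_a, clip_b))  # 0.8
print(sequence_similarity(clip_a, clip_c))  # 0.2
```

Clips whose shots appear in nearly the same temporal order score high, while a clip that merely shares a few shots in a different order scores low.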
Shot clustering and encoding
To construct the pattern-based index tree, the shots must first be encoded. The main contribution of this work is that the feature dimensionality can be reduced substantially and the pattern matching cost becomes very low. In this work, the shots are clustered by the well-known k-means algorithm, and each shot is assigned a symbol according to the number of the cluster it belongs to, as shown in
Another important issue is the quality of the clustering, since it has a significant impact on the quality of pattern-based video retrieval. Thus, we adopt the following validation measures to confirm the clustering quality. The whole validation procedure does not stop until all criteria are satisfied. The criteria involved are Local Proportion, Local Density and Global Density.
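The clustering and encoding step of this section (k-means on shot features, then one symbol per cluster number) can be sketched as follows. This is only an illustration under stated assumptions: the 2-D feature vectors are toy values, the number of clusters, the seed and the letter symbols are arbitrary choices, and the validation criteria named above are not implemented because their definitions are not given here.

```python
import random


def squared_distance(p, q):
    return sum((x - y) ** 2 for x, y in zip(p, q))


def nearest(point, centroids):
    """Index of the centroid closest to point."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))


def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's k-means; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Assumption: initial centroids are k randomly chosen input points.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [nearest(p, centroids) for p in points]
    for _ in range(iterations):
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
        # Assignment step: stop once no shot changes cluster.
        new_labels = [nearest(p, centroids) for p in points]
        if new_labels == labels:
            break
        labels = new_labels
    return centroids, labels


def encode_shots(labels):
    """Map cluster numbers 0, 1, 2, ... to symbols 'A', 'B', 'C', ..."""
    return [chr(ord("A") + lab) for lab in labels]


# Toy 2-D feature vectors, one per shot (made-up values).
shot_features = [
    (0.10, 0.20), (0.12, 0.18), (0.90, 0.85),
    (0.88, 0.90), (0.11, 0.21), (0.92, 0.88),
]
_, shot_labels = kmeans(shot_features, k=2)
shot_symbols = encode_shots(shot_labels)
print("".join(shot_symbols))
```

After encoding, each video becomes a short string of symbols, so matching works on symbols instead of high-dimensional feature vectors.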
PROPOSAL
We propose to develop a novel method for content-based video retrieval that uses pattern-based indexing and matching techniques. The main contribution of the proposed method is that it achieves high-quality video retrieval without considering query terms. The pattern-based index can effectively deal with the problem of high-dimensional visual features, which occurs in current visual-based sequence matching methods. The experimental results show that the proposed method can substantially enhance the precision and recall of content-based video retrieval even though only two kinds of visual features are considered. Besides, AFPI is an efficient method for finding the desired videos in a massive amount of diverse data. In the future, we will address the following issues. First, in addition to color layout and edge histogram, more types of features, such as motion, audio and other visual features, will be considered. Second, we will apply the AFPI-tree to other types of content-based multimedia retrieval.
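To give a feel for how a tree-shaped pattern index speeds up lookup, here is a minimal sketch of a plain prefix tree over runs of shot symbols. It is not the AFPI-tree, whose construction is not described in this text; the class names, the `max_len` window limit and the sample videos are all hypothetical.

```python
class PatternNode:
    def __init__(self):
        self.children = {}      # symbol -> PatternNode
        self.video_ids = set()  # videos containing the pattern ending here


class PatternIndex:
    """Prefix tree over contiguous runs of up to max_len shot symbols."""

    def __init__(self, max_len=3):
        self.root = PatternNode()
        self.max_len = max_len

    def add_video(self, video_id, symbols):
        # Insert every contiguous window starting at each position.
        for start in range(len(symbols)):
            node = self.root
            for symbol in symbols[start:start + self.max_len]:
                node = node.children.setdefault(symbol, PatternNode())
                node.video_ids.add(video_id)

    def lookup(self, pattern):
        """Ids of videos containing `pattern` as a contiguous run.

        Patterns longer than max_len are not indexed and return an empty set.
        """
        node = self.root
        for symbol in pattern:
            node = node.children.get(symbol)
            if node is None:
                return set()
        return set(node.video_ids)


index = PatternIndex(max_len=3)
index.add_video("v1", "ABBCD")
index.add_video("v2", "ABCCD")
index.add_video("v3", "DEEAF")

print(sorted(index.lookup("AB")))   # ['v1', 'v2']
print(sorted(index.lookup("BCD")))  # ['v1']
```

A query pattern is answered by walking one path from the root, so the cost depends on the pattern length and not on the number of indexed videos.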
IMPLEMENTATION DETAILS
1. Color feature space: Color statistics are used for measuring global or local dissimilarities. Global color features are analyzed through histograms. These histograms offer the advantage of being invariant under rotation, translation and many other geometric operations.
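A global color histogram and its invariance under rotation can be illustrated with a short sketch. The image representation (rows of RGB tuples), the number of bins per channel and the L1 distance are assumptions made for the example, not details taken from the implementation described here.

```python
from collections import Counter


def color_histogram(image, bins_per_channel=4):
    """Normalised global colour histogram.

    `image` is a list of rows, each a list of (r, g, b) tuples in 0..255.
    Assumes bins_per_channel divides 256 evenly.
    """
    step = 256 // bins_per_channel
    counts = Counter()
    total = 0
    for row in image:
        for r, g, b in row:
            counts[(r // step, g // step, b // step)] += 1
            total += 1
    return {bin_: n / total for bin_, n in counts.items()}


def histogram_distance(h1, h2):
    """L1 distance between normalised histograms (0 = identical, 2 = disjoint)."""
    bins = set(h1) | set(h2)
    return sum(abs(h1.get(b, 0.0) - h2.get(b, 0.0)) for b in bins)


def rotate_90(image):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


red, green, blue = (255, 0, 0), (0, 255, 0), (0, 0, 255)
frame = [[red, red, blue],
         [red, green, blue]]
rotated = rotate_90(frame)
all_blue = [[blue, blue, blue],
            [blue, blue, blue]]

# Rotation moves pixels around but does not change how many have each colour.
print(histogram_distance(color_histogram(frame), color_histogram(rotated)))  # 0.0
print(histogram_distance(color_histogram(frame), color_histogram(all_blue)) > 0)  # True
```

Because the histogram only counts colors and ignores pixel positions, it stays the same under rotation or translation of the image content, which is also why it cannot distinguish two frames with the same colors in different layouts.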
CURRENT STATUS
After the individual intensity values have been successfully extracted in the color feature space from the provided input, also referred to as the native image, the next step is to be performed. So far, we have presented a novel method for content-based video retrieval that uses pattern-based indexing and matching. The proposed approach achieves high-quality video retrieval without considering query terms. The pattern-based index can effectively deal with the problem of high-dimensional visual features, which occurs in current visual-based sequence matching methods. The experimental results show that the proposed method can substantially enhance the precision and recall of content-based video retrieval even though only two kinds of visual features are considered. Besides, AFPI is an efficient method for finding the desired videos in a massive amount of diverse data. In the future, we will address the following issues. First, in addition to color layout and edge histogram, more types of features, such as motion, audio and other visual features, will be considered. Second, we will apply the AFPI-tree to other types of content-based multimedia retrieval.
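Precision and recall, the two measures used above to judge retrieval quality, can be computed as in the following sketch. The video ids and the relevance judgments are invented for the example and are not experimental results.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = relevant retrieved / all retrieved
    recall    = relevant retrieved / all relevant
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query result and ground truth.
retrieved_videos = ["v1", "v2", "v5", "v7"]
relevant_videos = ["v1", "v2", "v3"]

precision, recall = precision_recall(retrieved_videos, relevant_videos)
print(precision)  # 0.5
```

Here two of the four retrieved videos are relevant, so precision is 0.5, and two of the three relevant videos were found, so recall is two thirds.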