25-08-2017, 09:32 PM
Image Clustering and Retrieval using Image Mining Techniques
Abstract:
Image retrieval is a basic requirement in many present-day applications. Content-Based Image Retrieval (CBIR) is a popular approach in which the target image is retrieved based on useful features of a given query image. Image mining, on the other hand, is an emerging concept that can be used to extract potentially useful information from a general collection of images. Target or similar images can be retrieved faster if the collection is clustered appropriately. In this paper, the concepts of CBIR and image mining are combined, and a new clustering technique is introduced in order to increase the speed of the image retrieval system.
Introduction
Images play a vital role in every aspect of business, for example business images, satellite images, medical images and so on. Analysing these data can reveal useful information to human users. Unfortunately, there are certain difficulties in gathering those data in the right way. When the data are incomplete, the information gathered is not processed further to reach any conclusion.
Image retrieval, on the other hand, is a fast-growing and challenging research area with regard to both still and moving images. Many Content Based Image Retrieval (CBIR) system prototypes have been proposed, and a few are used as commercial systems. CBIR aims at searching image databases for specific images that are similar to a given query image. It also focuses on developing new techniques that support effective searching and browsing of large digital image libraries based on automatically derived image features. It is a rapidly expanding research area situated at the intersection of databases, information retrieval, and computer vision. Although CBIR is still immature, there has been an abundance of prior work. CBIR relies on image 'features' to enable the query, and these have been the recent focus of studies of image databases. The features can be further classified as low-level and high-level. Users can query by example image based on features such as texture, color, shape, region and others. The target image is then retrieved from the image repository by similarity comparison.
Meanwhile, the next important area of focus is clustering techniques. Clustering algorithms can offer superior organization of multidimensional data for effective retrieval, and they allow a nearest-neighbor search to be performed efficiently. Hence, image mining is rapidly gaining attention among researchers in the fields of data mining, information retrieval and multimedia databases.
Spatial databases are one of the concepts that play a major role in multimedia systems. Research that can extract semantically meaningful information from image data is increasingly in demand.
Topic Overview
The main objective of image mining is to eliminate data loss and to extract meaningful information that meets the needs of human users. The images are preprocessed with various techniques, with particular emphasis on texture calculation. Here, images are clustered based on RGB components, texture values and the Fuzzy C-means algorithm. Entropy is used to compare images subject to threshold constraints. In future, this application could be used to classify medical images in order to help diagnose the right disease based on previously verified cases.
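The entropy comparison mentioned above can be illustrated with a minimal sketch. It assumes Shannon entropy over the intensity histogram and an absolute-difference test against a threshold; the function names, the flat pixel list and the threshold value are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

def image_entropy(pixels):
    """Shannon entropy (in bits) of a flat sequence of intensity values."""
    if not pixels:
        raise ValueError("pixels must not be empty")
    counts = Counter(pixels)
    total = len(pixels)
    # Sum -p * log2(p) over every intensity value that actually occurs.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def similar_by_entropy(pixels_a, pixels_b, threshold=0.5):
    """Assumed rule: two images match if their entropies differ by at most threshold."""
    return abs(image_entropy(pixels_a) - image_entropy(pixels_b)) <= threshold

# Usage: two intensity values in equal proportion carry exactly one bit.
two_level = [0, 0, 255, 255]
flat = [7, 7, 7, 7]
print(image_entropy(two_level))
print(similar_by_entropy(two_level, flat))
```

Entropy alone is a coarse signature, since very different images can share the same value, which is presumably why it is combined here with RGB and texture features.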
Comparison of Image Mining with other Techniques
Image mining normally deals with the extraction of implicit knowledge, image data relationships, or other patterns not explicitly stored in the images, building on low-level computer vision and image processing techniques. That is, the focus of image mining is the extraction of patterns from a large collection of images, whereas the focus of computer vision and image processing techniques is understanding or extracting specific features from a single image.
RGB Components Processing
An RGB color image is an M × N × 3 array of color pixels, where each color pixel is a triplet corresponding to the red, green, and blue components of the image at a specific spatial location. An RGB image can be viewed as a stack of three grayscale images that, when fed into the red, green, and blue inputs of a color monitor, produce the color image on the screen. By convention, the three images forming an RGB image are called its red, green, and blue components.
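As a small illustration of this layout, the sketch below represents an image as M rows of N (red, green, blue) triplets and separates it into its three component images. The nested-list representation and the helper names split_rgb and channel_means are assumptions made for the example; the paper does not specify an implementation.

```python
def split_rgb(image):
    """Split an M x N image of (r, g, b) triplets into three M x N component images."""
    red = [[pixel[0] for pixel in row] for row in image]
    green = [[pixel[1] for pixel in row] for row in image]
    blue = [[pixel[2] for pixel in row] for row in image]
    return red, green, blue

def channel_means(image):
    """Mean intensity of each component, a simple per-image color feature."""
    means = []
    for channel in split_rgb(image):
        values = [value for row in channel for value in row]
        means.append(sum(values) / len(values))
    return tuple(means)

# Usage: a 2 x 2 image.
image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (10, 20, 30)],
]
red, green, blue = split_rgb(image)
print(red)                   # [[255, 0], [0, 10]]
print(channel_means(image))  # (66.25, 68.75, 71.25)
```

Per-component statistics such as these means are one possible way to turn the RGB components into features for clustering.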
Performance Evaluation of Proposed CBIR System
Evaluation of retrieval performance is a crucial problem in Content-Based Image Retrieval (CBIR). Many different methods for measuring the performance of a system have been created and used by researchers. We have used the most common evaluation measures, namely Precision and Recall, usually presented as a Precision vs Recall graph. Precision or recall alone contains insufficient information. We can always make the recall value 1 just by retrieving all images. In a similar way, precision can be kept high by retrieving only a few images. Precision and recall should therefore either be used together, or the number of images retrieved should be specified. With this, the following formulae are used for finding Precision and Recall values.
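Under the standard definitions, precision is the number of relevant images retrieved divided by the total number of images retrieved, and recall is the number of relevant images retrieved divided by the total number of relevant images in the database. The sketch below computes both; the function name and the use of image identifiers as strings are assumptions made for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a retrieved list and the set of relevant images."""
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)  # relevant images that were retrieved
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall

# Usage: 4 images retrieved, 2 of the 3 relevant images are among them.
relevant = ["a", "c", "e"]
print(precision_recall(["a", "b", "c", "d"], relevant))

# Retrieving the whole database forces recall to 1 while precision drops.
database = ["a", "b", "c", "d", "e"]
print(precision_recall(database, relevant))
```

The second call shows why neither value is meaningful in isolation: recall reaches 1 simply because everything was returned.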
Genetic k-means clustering algorithms
The application of genetic algorithms to cluster analysis takes advantage of their extensive optimum-search capabilities. The general genetic procedure for determining the best k cluster centers consists of setting the parameters (number of clusters), population initialization, initial population fitness calculation, and repeated [4] selection, cross-over and mutation operations until the termination criteria are met.
Genetically optimized k-means clustering algorithms
For genetic k-means (GKM) and its variants (GKHM, GFKM), the number of clusters and other algorithm-specific parameter values must be selected. Next, the population is initialized with randomly created cluster centers. Starting from the initial population, subsequent iterations create new populations by the operations of selection, cross-over and mutation. For every solution in the population, a fitness value is calculated according to the specific fitness function, as described in Section 2. Solutions with high fitness values enter the mating pool. The process is repeated until the termination criteria are met. Some implementation details are given below.
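The generic loop described above (initialization, fitness calculation, selection, cross-over, mutation, termination) can be sketched as follows. This is an illustrative outline only and does not reproduce the specific operators of GKM, GKHM or GFKM. The fitness function 1 / (1 + SSE), the fitness-proportional selection, the single-point cross-over, the Gaussian mutation, the elitism step and all parameter names and defaults are assumptions, since the fitness function of Section 2 is not included in this excerpt.

```python
import random

def sse(points, centers):
    """Sum of squared distances from each point to its nearest center."""
    return sum(
        min(sum((p - c) ** 2 for p, c in zip(point, center)) for center in centers)
        for point in points
    )

def fitness(points, centers):
    """Assumed fitness: higher when the clustering error is lower."""
    return 1.0 / (1.0 + sse(points, centers))

def genetic_kmeans(points, k, pop_size=20, generations=50,
                   mutation_rate=0.1, sigma=0.5, seed=0):
    """Search for k cluster centers with a simple genetic algorithm."""
    rng = random.Random(seed)
    # Population initialization: each solution is k centers drawn from the data.
    population = [rng.sample(points, k) for _ in range(pop_size)]
    best = max(population, key=lambda s: fitness(points, s))
    for _ in range(generations):  # termination criterion: fixed generation count
        scores = [fitness(points, s) for s in population]
        # Selection: fitter solutions are more likely to enter the mating pool.
        pool = rng.choices(population, weights=scores, k=pop_size)
        children = [best]  # elitism: the best solution so far always survives
        while len(children) < pop_size:
            a, b = rng.sample(pool, 2)
            # Single-point cross-over on the list of centers.
            cut = rng.randint(1, k - 1) if k > 1 else 0
            child = a[:cut] + b[cut:]
            # Mutation: occasionally perturb a coordinate with Gaussian noise.
            child = [
                tuple(x + rng.gauss(0, sigma) if rng.random() < mutation_rate else x
                      for x in center)
                for center in child
            ]
            children.append(child)
        population = children
        best = max(population, key=lambda s: fitness(points, s))
    return best

# Usage: two well-separated groups of 2-D feature vectors.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers = genetic_kmeans(points, k=2)
print(centers, sse(points, centers))
```

Because the best solution is carried over unchanged into each new population, the clustering error of the returned centers can never be worse than that of the best initial solution.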