02-11-2016, 09:25 AM
1463208594-FaceAnnotationandPedestrianDetectionSRS1.doc
INTRODUCTION AND OBJECTIVES
Due to the popularity of digital cameras and the rapid growth of social media tools for Internet-based photo sharing, recent years have witnessed an explosion in the number of digital photos captured and stored by consumers. A large portion of the photos shared by users on the Internet are human facial images. Some of these facial images are tagged with names, but many are not tagged properly. This has motivated the study of auto face annotation, an important technique that aims to annotate facial images automatically.

This work is also concerned with the challenging task of pedestrian detection in real-world environments; that is, the aim is to localize pedestrians reliably despite the presence of background clutter. Even though pedestrian detection has many practical applications and has been an active area of research for many years, only recently have recognition algorithms become robust enough to deal with scenes of realistic complexity. This project proposes algorithms and algorithmic extensions that further enhance detection robustness compared to existing approaches. Pedestrians, however, are not rigid, and their appearance changes greatly depending on body articulation and pose; the variety of textures and colors in clothing and accessories adds further difficulty.

One challenging problem for a search-based face annotation scheme is how to perform annotation effectively by exploiting the list of most similar facial images and their weak labels, which are often noisy and incomplete. To tackle this problem, we propose an effective unsupervised label refinement (ULR) approach for refining the labels of web facial images using machine learning techniques. We formulate the learning problem as a convex optimization and develop effective optimization algorithms to solve the large-scale learning task efficiently.
To further speed up the proposed scheme, we also propose a clustering-based approximation algorithm which can improve the scalability considerably.
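The search-based annotation idea above can be sketched in a few lines. Note that the actual ULR method solves a convex optimization over the whole label matrix; the snippet below shows only the simpler similarity-weighted voting baseline that such refinement improves upon, and all names and data in it are hypothetical.

```python
def annotate_by_weighted_vote(neighbors, top_n=1):
    """Rank candidate names for a query face from its retrieved neighbors.

    neighbors: list of (similarity, weak_label_set) pairs for the most
    similar facial images; the weak labels may be noisy or incomplete.
    Returns the top_n names ranked by similarity-weighted votes.
    """
    scores = {}
    for sim, labels in neighbors:
        for name in labels:
            # Each neighbor votes for all of its weak labels,
            # weighted by how similar it is to the query face.
            scores[name] = scores.get(name, 0.0) + sim
    ranked = sorted(scores.items(), key=lambda kv: -kv[1])
    return [name for name, _ in ranked[:top_n]]
```

A label refinement step would then correct the weak label sets themselves before this voting is applied, which is where the proposed convex formulation comes in.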
HARDWARE SPECIFICATION
Processor : Intel Pentium 4
RAM : 2GB
Hard Disk : 250GB
Drives : CD ROM Drive
Display Size : 14” Color Monitor
Screen Resolution : 1024 × 768 pixels
Keyboard : PC/AT Enhanced Type
Mouse : Logitech PS/2 Port Mouse
SOFTWARE SPECIFICATION
Operating System : Windows 7 or higher
Front End : JAVA
Back End : MySQL
EXISTING SYSTEM
In the existing system, a feature detection algorithm for human identification is used to find pedestrians. The required calculations are very complex; to perform them in a time-critical manner, the existing system must rely on complex and expensive computational devices.
PROPOSED SYSTEM
The efficiency of the fast feature detection methods is demonstrated with three distinct detection frameworks using individual and deformable models. To overcome the defects of the existing system, a new system is proposed that uses the histogram of oriented gradients (HOG) for pattern matching in pedestrian detection. The system is trained with a large set of pedestrian images, so that the HOG pattern-matching process can compare detected HOG patterns against the HOG patterns stored in the dictionary to identify pedestrians.
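The dictionary-matching step described above can be sketched as a nearest-neighbor search over stored HOG vectors. This is a minimal illustration, not the project's actual classifier (the SVM-based chain described later); the labels, vectors, and threshold below are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def match_hog(query, dictionary, threshold=0.5):
    """Compare a detected HOG pattern against the stored dictionary.

    dictionary: {label: hog_vector}. Returns (best_label, distance),
    or (None, distance) if no stored pattern is close enough.
    """
    best, best_d = None, float("inf")
    for label, vec in dictionary.items():
        d = euclidean(query, vec)
        if d < best_d:
            best, best_d = label, d
    if best_d > threshold:
        return None, best_d
    return best, best_d
```

In practice the decision is made by a trained classifier rather than a raw distance threshold, but the comparison of a detected pattern against stored patterns proceeds in this spirit.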
DATA FLOW DIAGRAM
A Data Flow Diagram (DFD) is a network that describes the flow of data and the processes that change, or transform, data throughout the system. The network is constructed using a set of symbols that do not imply a physical implementation. It is a graphical tool for structured analysis of the system requirements. A DFD models a system using external entities from which data flows to a process, which transforms the data and creates output data flows that go to other processes, external entities, or files.
STRUCTURE OF THE PROJECT
PROCESS DESIGN
The main modules are
1. Dictionary Construction
2. Face Recognition
3. Human HOG detection (pedestrian detection)
1. Dictionary Construction
For the purpose of dictionary construction, feature extraction from high-quality face images is essential: both features and orientation are extracted from them. A critical step is to extract features from the input images automatically and reliably. However, the performance of a feature extraction algorithm relies heavily on the quality of the input images; to ensure that automatic feature extraction remains robust with respect to image quality, it is essential to incorporate a matching enhancement algorithm into the feature extraction. Constructing a proper retrieval database is a key step for a retrieval-based face annotation system. For each downloaded image, we crop the face according to the given face position rectangle and resize all face images to the same size. To evaluate the retrieval-based face annotation scheme on even larger web facial image databases, we construct new databases that contain celebrity and web facial images. In general, there are two main steps to build a weakly labeled web facial image database: (i) construct a name list of popular persons; and (ii) query an existing search engine with the names, and then crawl the web images according to the retrieval results.
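The crop-and-resize preprocessing above can be sketched on a plain 2-D pixel array. This is a minimal illustration assuming nearest-neighbor resampling (the document does not specify the resampling method); real systems would use an image library with better interpolation.

```python
def crop(image, x, y, w, h):
    """Cut out the face region given by the rectangle (x, y, w, h).

    image: 2-D list of pixel values, indexed as image[row][col].
    """
    return [row[x:x + w] for row in image[y:y + h]]

def resize_nearest(image, out_w, out_h):
    """Nearest-neighbor resize so every face crop gets the same dimensions."""
    in_h, in_w = len(image), len(image[0])
    return [[image[r * in_h // out_h][c * in_w // out_w]
             for c in range(out_w)]
            for r in range(out_h)]
```

Applying `crop` with each image's face position rectangle and then `resize_nearest` with one fixed output size yields the uniformly sized face images the retrieval database requires.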
2. Face Recognition
A straightforward idea for automatic/semi-automatic face annotation is to integrate face recognition algorithms, which have been well studied in the last decade. Here we use face recognition technology to sort faces by their similarity to a chosen face or a trained face model. However, despite progress made in recent years, face recognition continues to be a challenging topic in computer vision research. Most algorithms perform well in a controlled environment, while in the scenario of family photo management the performance of face recognition algorithms becomes unacceptable due to difficult lighting/illumination conditions and large head-pose variations. This module aims to clean up the noisy web facial images for face recognition applications; this is done as a simple preprocessing step in the whole system, without adopting sophisticated techniques. The algorithm works by normalizing each face into a 150 x 120 pixel image, transforming it based on five landmarks: the positions of both eyes, the nose, and the two corners of the mouth. It then divides each image into overlapping patches of 25 x 25 pixels and describes each patch using a mathematical object known as a vector, which captures its basic features. Having done that, the algorithm is ready to compare the images looking for similarities. But first it needs to know what to look for; this is where the training data set comes in. The usual approach is to use a single dataset to train the algorithm and a sample of images from the same dataset to test it.
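The patch-extraction step described above can be sketched directly. The patch size of 25 x 25 matches the text, but the stride for the overlapping patches is an assumption (the document does not give it), and the raw pixel values stand in for whatever patch features the real system computes.

```python
def extract_patches(image, patch=25, stride=12):
    """Divide a normalized face into overlapping patches and vectorize each.

    image: 2-D list of pixel values (e.g. 150 rows x 120 cols after the
    landmark-based normalization). stride < patch makes the patches overlap.
    Returns one flattened feature vector per patch.
    """
    h, w = len(image), len(image[0])
    vectors = []
    for top in range(0, h - patch + 1, stride):
        for left in range(0, w - patch + 1, stride):
            # Flatten the patch row by row into a single vector.
            vectors.append([image[top + r][left + c]
                            for r in range(patch)
                            for c in range(patch)])
    return vectors
```

Comparing two faces then reduces to comparing their corresponding patch vectors, e.g. with a cosine or Euclidean similarity, against models learned from the training set.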
3. Human HOG Detection (pedestrian detection)
The method is based on evaluating well-normalized local histograms of image gradient orientations in a dense grid. The basic idea is that local object appearance and shape can often be characterized rather well by the distribution of local intensity gradients or edge directions, even without precise knowledge of the corresponding gradient or edge positions. In practice this is implemented by dividing the image window into small spatial regions ("cells") and, for each cell, accumulating a local 1-D histogram of gradient directions or edge orientations over the pixels of the cell. The combined histogram entries form the representation. For better invariance to illumination, shadowing, etc., it is also useful to contrast-normalize the local responses before using them. This can be done by accumulating a measure of local histogram energy over somewhat larger spatial regions ("blocks") and using the results to normalize all of the cells in the block. We refer to the normalized descriptor blocks as Histogram of Oriented Gradients (HOG) descriptors. Tiling the detection window with a dense grid of HOG descriptors and feeding the combined feature vector to a conventional SVM-based window classifier gives our human detection chain. The use of orientation histograms has many precursors, but it only reached maturity when combined with local spatial histogramming and normalization in Lowe's Scale-Invariant Feature Transform (SIFT) approach to baseline image matching, in which it provides the underlying image-patch descriptor for matching scale-invariant keypoints. The success of these sparse feature-based representations has somewhat overshadowed the power and simplicity of HOGs as dense image descriptors.
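The two core HOG operations described above, the per-cell orientation histogram and the per-block contrast normalization, can be sketched as follows. The 9 unsigned orientation bins and L2 normalization are common choices but are assumptions here; the document does not fix these parameters.

```python
import math

def cell_histogram(gx, gy, bins=9):
    """Accumulate a 1-D histogram of gradient orientations for one cell.

    gx, gy: per-pixel horizontal and vertical gradient components.
    Orientations are unsigned (folded into 0-180 degrees) and each
    pixel's vote is weighted by its gradient magnitude.
    """
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for dx, dy in zip(gx, gy):
        mag = math.hypot(dx, dy)
        ang = math.degrees(math.atan2(dy, dx)) % 180.0
        hist[min(int(ang / bin_width), bins - 1)] += mag
    return hist

def l2_normalize_block(cell_hists, eps=1e-6):
    """Concatenate the histograms of the cells in one block and L2-normalize.

    This is the contrast normalization over the larger "block" region that
    gives the descriptor its invariance to illumination and shadowing.
    """
    v = [x for h in cell_hists for x in h]
    norm = math.sqrt(sum(x * x for x in v)) + eps
    return [x / norm for x in v]
```

Tiling the window with such normalized blocks and concatenating them yields the feature vector that is then passed to the SVM-based window classifier.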