08-05-2012, 03:54 PM
HUMAN FACE RECOGNITION USING NEURAL NETWORKS
INTRODUCTION
Referring to Figure 1, the operation of the human
face recognition system (HFRS) is divided into
three modules: the human head tracer (HHT), the
eye locator (EL), and recognition of face (RF).
There are two processes associated with the HHT
module. The first senses the presence of an object
in front of the video camera, and the second
approximates the location of the human head in
the image frame.
The first process analyzes the projection of grey
levels along the horizontal lines. The grey level
values of each horizontal line are summed,
starting at the top line of the image frame and
proceeding to the middle of the frame. An object
is considered present when there is a sudden drop
in the total grey level: the total for a
horizontal line of white background is greater
than the total for a line containing low grey
level pixels, for example a face. This process
also provides the location of the object in the
vertical direction.
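The row scan described above can be sketched as follows. This is a minimal illustration, not the authors' C implementation: the function names, the `drop_ratio` threshold, and the use of the top line as the background reference are all assumptions, since the text only speaks of a "sudden drop".

```python
def row_sums(image):
    """Total grey level of each horizontal line (image is a list of rows)."""
    return [sum(row) for row in image]


def find_top_of_object(image, drop_ratio=0.8):
    """Scan from the top line to the middle of the frame and return the
    index of the first line whose total falls below drop_ratio times the
    total of the top line. Returns None if no such line is found.

    Assumptions (not from the source): the top line shows only bright
    background, and drop_ratio is a hypothetical threshold.
    """
    sums = row_sums(image)
    background = sums[0]
    for y in range(1, len(image) // 2 + 1):
        if sums[y] < drop_ratio * background:
            return y
    return None


# Three bright background lines followed by five lines crossing a dark object.
background_rows = [[255] * 8 for _ in range(3)]
object_rows = [[255, 255, 40, 40, 40, 40, 255, 255] for _ in range(5)]
frame = background_rows + object_rows
print(find_top_of_object(frame))  # prints 3
```

The returned index serves both purposes named in the text: a non-None result signals that an object is present, and its value gives the vertical location.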
The second process of the HHT determines the
location of the object in the horizontal
direction. This is accomplished by analyzing the
projection of grey levels along the vertical
lines. The projection is a straight line if there
is no object in the frame. When an object is
present, the total grey level drops over the
range of columns that the object occupies. From
this range, the center location of the object can
be easily determined.
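A comparable sketch for the column projection is given below. The threshold `drop_ratio`, the use of the brightest column as the background level, and taking the midpoint of the dark range as the center are assumptions made for illustration.

```python
def column_sums(image):
    """Total grey level of each vertical line (image is a list of rows)."""
    return [sum(column) for column in zip(*image)]


def find_horizontal_center(image, drop_ratio=0.8):
    """Return the midpoint of the range of columns whose total grey level
    drops below drop_ratio times the brightest column, or None if the
    projection is flat (no object in the frame).

    drop_ratio is a hypothetical threshold, not a value from the source.
    """
    sums = column_sums(image)
    background = max(sums)
    dark = [x for x, total in enumerate(sums) if total < drop_ratio * background]
    if not dark:
        return None
    return (dark[0] + dark[-1]) / 2


frame = [[255] * 8 for _ in range(3)]
frame += [[255, 255, 40, 40, 40, 40, 255, 255] for _ in range(5)]
print(find_horizontal_center(frame))  # prints 3.5
```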
EYE LOCATOR
The EL searches for the location of the human
eyes in the image frame. Once the approximate
location of the human head has been determined by
the HHT, the EL scans for the left and right eyes
of the face image. The scanning area can be
limited, since the whereabouts of the human head
are already known. The detected locations of both
eyes serve as reference points for locating the
face area, which is then used as input to the RF.
Since the EL detects both eyes, the tilt of the
human head can be estimated. A compensation
method is designed to re-position the tilted
image.
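The text does not describe the compensation method, so the following is only one plausible sketch: estimate the tilt from the line joining the two detected eyes, then rotate about the midpoint between them by the opposite angle. The function names and the choice of a plain rotation are assumptions.

```python
import math


def tilt_angle_degrees(left_eye, right_eye):
    """Angle of the line joining the eyes, measured from the horizontal.
    Eyes are (x, y) pixel coordinates; left_eye has the smaller x."""
    (x1, y1), (x2, y2) = left_eye, right_eye
    return math.degrees(math.atan2(y2 - y1, x2 - x1))


def rotate_point(point, center, angle_degrees):
    """Rotate point about center by angle_degrees."""
    a = math.radians(angle_degrees)
    dx, dy = point[0] - center[0], point[1] - center[1]
    return (center[0] + dx * math.cos(a) - dy * math.sin(a),
            center[1] + dx * math.sin(a) + dy * math.cos(a))


def level_eyes(left_eye, right_eye):
    """Rotate both eye positions about their midpoint so that the line
    joining them becomes horizontal (assumed compensation, see lead-in)."""
    center = ((left_eye[0] + right_eye[0]) / 2, (left_eye[1] + right_eye[1]) / 2)
    angle = tilt_angle_degrees(left_eye, right_eye)
    return (rotate_point(left_eye, center, -angle),
            rotate_point(right_eye, center, -angle))
```

Applying the same rotation to every pixel coordinate of the face window would re-position the whole image; only the two eye points are rotated here to keep the example short.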
The EL is built with a multi-layer perceptron
(MLP) neural network. The EL MLP, which has a
single hidden layer and a single output node, is
trained with eye images and non-eye images. The
output node is trained to respond positively when
the input is an eye image and negatively when the
input is a non-eye image. During the scanning
process, any image area that causes the EL MLP to
respond positively is considered an eye feature.
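The scanning step can be sketched as a sliding window whose contents are passed to a single-output MLP. The tanh activation, the window size, and the weights in the usage example are assumptions chosen for illustration; the source gives none of them.

```python
import math


def mlp_output(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """Forward pass of an MLP with one hidden layer and one output node.
    tanh is an assumed activation; it gives an output in (-1, 1), so a
    positive value can be read as "eye" and a negative value as "non-eye"."""
    hidden = [math.tanh(sum(w * x for w, x in zip(weights, inputs)) + bias)
              for weights, bias in zip(hidden_weights, hidden_biases)]
    return math.tanh(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)


def scan_for_eyes(image, win_h, win_w, classify):
    """Slide a win_h x win_w window over the image and return the
    (top, left) corners of all windows for which classify() is positive."""
    hits = []
    for top in range(len(image) - win_h + 1):
        for left in range(len(image[0]) - win_w + 1):
            window = [image[y][x]
                      for y in range(top, top + win_h)
                      for x in range(left, left + win_w)]
            if classify(window) > 0:
                hits.append((top, left))
    return hits


# Toy usage: a hand-set, untrained network that responds positively only
# to an all-dark 2 x 2 window (hypothetical weights).
image = [[1.0] * 4 for _ in range(4)]
for y in (1, 2):
    for x in (1, 2):
        image[y][x] = 0.0
toy_classifier = lambda w: mlp_output(w, [[-1.0] * 4], [0.5], [1.0], 0.0)
print(scan_for_eyes(image, 2, 2, toy_classifier))  # prints [(1, 1)]
```

In practice the loops would cover only the limited area around the head location reported by the HHT, as the text notes.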
The RF is the main aim of the work. It is
designed to emulate the human face recognition
task. The RF also uses an MLP neural network to
achieve its objective. The input of the RF MLP
is the image area within a window that encloses
the crucial features of the human face (see box E
in Fig. 1). The RF MLP, which has a single hidden
layer and several output nodes, is trained with a
set of face images derived from our own database.
The number of output nodes corresponds to the
number of face classes.
During the testing period, if an input image fed
to the MLP causes the corresponding output node
to respond positively, the input is recognized as
the class of the positively responding node. If
none of the output nodes responds to an input
image, the input image is considered not
recognized. When an input image belonging to a
known face class does not produce a positive
response at the corresponding output node, the
input image is used in a retraining or
reinforcement phase. This is necessary because of
the inadequate number of different test images
for each of the face classes.
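The decision rule described above can be sketched as follows. The threshold of 0 for a "positive" response and the tie-break of choosing the strongest node when several respond are assumptions; the text does not say how multiple positive responses are handled.

```python
def recognize(outputs, threshold=0.0):
    """Return the index of the recognized face class, or None if no
    output node responds positively (input not recognized).

    Assumption: when several nodes respond positively, the strongest wins.
    """
    positives = [i for i, value in enumerate(outputs) if value > threshold]
    if not positives:
        return None
    return max(positives, key=lambda i: outputs[i])


def needs_retraining(outputs, true_class, threshold=0.0):
    """True when an image of a known class fails to produce a positive
    response at its own output node, so it should be fed back into the
    retraining (reinforcement) phase."""
    return outputs[true_class] <= threshold


print(recognize([-0.9, 0.7, -0.8]))            # prints 1
print(recognize([-0.5, -0.2, -0.1]))           # prints None
print(needs_retraining([-0.9, 0.7, -0.8], 0))  # prints True
```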
The MLPs used in the EL and the RF are trained
with the fast learning back propagation algorithm
[Kirayiannis 1992]. Kirayiannis showed that the
algorithm learns input patterns much faster than
the ordinary back propagation algorithm.
EXPERIMENTAL SETUP
The computer used in the software simulation of
the HFRS is a Sun SPARC workstation. The images
are captured by a CCD video camera (Sony video
camera, model no. CCD-TR55E) connected to the
workstation via a VideoPix image grabber system.
The simulation software is written in C. It runs
under the X Window System and uses the XView
toolkit. Figure 1 illustrates the HFRS setup.
RESULTS AND CONCLUSIONS
The EL was tested on 85 face images, of which 25
show subjects with closed eyes and 28 show
subjects with glasses, and was found to be 87%
accurate. A detection is counted as correct when
the center point between the detected eyes is
within 5 pixels horizontally or vertically of the
actual center point.
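A sketch of this scoring rule is given below. The phrase "horizontally or vertically" is ambiguous; the sketch assumes that the offset must be within the tolerance along both axes, which is an interpretation rather than something the source states. The function names are hypothetical.

```python
def eye_center(left_eye, right_eye):
    """Midpoint between two (x, y) eye positions."""
    return ((left_eye[0] + right_eye[0]) / 2, (left_eye[1] + right_eye[1]) / 2)


def is_correct(detected_center, actual_center, tolerance=5):
    """Assumed reading of the criterion: the detected center must lie
    within `tolerance` pixels of the actual center along both axes."""
    dx = abs(detected_center[0] - actual_center[0])
    dy = abs(detected_center[1] - actual_center[1])
    return dx <= tolerance and dy <= tolerance


def accuracy(pairs):
    """Fraction of (detected_center, actual_center) pairs judged correct."""
    return sum(is_correct(d, a) for d, a in pairs) / len(pairs)


pairs = [((40, 30), (44, 33)),   # within 5 pixels on both axes
         ((40, 30), (46, 30)),   # 6 pixels off horizontally
         ((41, 29), (40, 30)),
         ((40, 30), (40, 30))]
print(accuracy(pairs))  # prints 0.75
```

Under this rule, the reported 87% on 85 images corresponds to roughly 74 correct detections.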
Figure 2 shows the performance of the MLP when
trained with various numbers of images. The face
image is 92 by 80 pixels. For training, the
resolution of the input images is reduced to
1/16th of the original size, so that the input
size is actually 460 pixels (92 x 80 / 16). The
network has a single hidden layer with 10 hidden
nodes, and the number of output classes is 6.
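The input size can be checked with a short sketch. The text only states the 1/16 reduction; averaging 4 x 4 pixel blocks is an assumed way of achieving it, and the weight count assumes one bias per hidden and output node.

```python
def downsample(image, factor=4):
    """Reduce resolution by averaging factor x factor pixel blocks
    (assumed method; the source only gives the 1/16 size ratio)."""
    height, width = len(image), len(image[0])
    return [[sum(image[y + dy][x + dx]
                 for dy in range(factor) for dx in range(factor)) / factor ** 2
             for x in range(0, width, factor)]
            for y in range(0, height, factor)]


def mlp_parameter_count(n_inputs, n_hidden, n_outputs):
    """Weights plus biases of an MLP with a single hidden layer."""
    return (n_inputs + 1) * n_hidden + (n_hidden + 1) * n_outputs


face = [[128] * 80 for _ in range(92)]        # a 92 x 80 test image
small = downsample(face)                      # 23 x 20 after 4 x 4 averaging
inputs = [pixel for row in small for pixel in row]
print(len(inputs))                            # prints 460
print(mlp_parameter_count(len(inputs), 10, 6))  # prints 4676
```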