22-08-2012, 02:33 PM
COMBINING NEURAL NETWORKS FOR SKIN DETECTION
ABSTRACT
Two types of combining strategies were evaluated, namely combining skin features and combining skin
classifiers. Several combining rules were applied to the outputs of the skin classifiers: the binary
AND and OR operators, “Voting”, “Sum of Weights” and a new neural network. The three chrominance
components from the YCbCr colour space that gave the highest correct detection on their single-feature
MLPs were selected as the combining parameters. A major issue in designing an MLP neural network is
determining the optimal number of hidden units for a given set of training patterns. Therefore, a
“coarse to fine search” method for finding the number of neurons in the hidden layer is
proposed. The strategy of combining Cb/Cr and Cr features improved the correct detection by 3.01%
compared to the best single feature MLP given by Cb-Cr. The strategy of combining the outputs of three
skin classifiers using the “Sum of Weights” rule further improved the correct detection by 4.38% compared
to the best single feature MLP.
INTRODUCTION
Skin detection is an important preliminary process for subsequent feature extraction in a wide
range of image processing techniques such as face detection, face tracking, gesture analysis,
content-based image retrieval systems, various computer vision applications, etc. Studies have
shown that by combining more than one feature or classifier, the performance of the skin
detection system is improved. Zhu et al. [1] combined two Gaussian feature spaces where the first
one is related to the colour distribution and the second one is related to the skin spatial and shape
distribution. The combined-feature method performed better than the single-feature method and also
better than other generic skin models, namely the histogram model, the single Gaussian model and the
Gaussian mixture model. Brand and Mason [2] evaluated the performance of combined colour components or
features from the RGB colour space and concluded that the combination of (R/G + R/B + G/B)
gave better performance than the single colour component. Jiang et al. [3] proposed a Skin
Probability Map (SPM) based skin detection system that integrated the colour, texture and space
information and claimed that their proposed method performed better than the generic SPM
method. Gasparini et al. [4] combined different skin classifiers based on different colour features
using several combination rules such as the sum rule, the product rule, the majority rule and the
authors’ proposed skin corrected by non-skin (SCNS) rule.
DATA PREPARATION
The database used in this work is the Compaq database [6]. This database consists of 13,640
images with their corresponding masked images. These images contain skin pixels belonging to
persons of different origins, with unconstrained illumination and background conditions, which
makes the skin detection task more challenging. Figure 1 shows an example of
images and their corresponding masked images. Two sets of data are prepared, namely the training
data and test data. The training data comprises training and validation samples that will be used to
train the MLP neural networks. The training sample consists of 420,000 image pixels and will be
validated by a similar number of image pixels randomly selected from the Compaq database.
Note that each image pixel can be selected only once. The training data is divided into 30 data
files where each data file consists of 14,000 pixels from the training sample and an equal number
of pixels from the validation sample. Each data file will be used to train a network during a
training run. The test data consists of 100 images selected at random from the Compaq database.
The test images are used to evaluate the performance of the skin detection system.
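The sampling and splitting described above can be sketched as follows. This is a minimal illustration, not the procedure actually used in this work: it assumes the candidate pixels are available as a sequence of identifiers, and the function name `split_into_files`, its parameters and the fixed seed are all hypothetical.

```python
import random


def split_into_files(pixel_ids, n_files=30, per_file=14_000, seed=0):
    """Draw disjoint training and validation samples and split them into files.

    pixel_ids is any sequence of pixel identifiers (an assumption of this
    sketch). Each returned file holds per_file training pixels and per_file
    validation pixels.
    """
    needed = 2 * n_files * per_file
    if len(pixel_ids) < needed:
        raise ValueError("not enough pixels to sample without replacement")

    rng = random.Random(seed)
    # random.sample draws without replacement, so no position in pixel_ids
    # is selected more than once.
    chosen = rng.sample(pixel_ids, needed)

    half = needed // 2
    train, valid = chosen[:half], chosen[half:]

    files = []
    for i in range(n_files):
        lo, hi = i * per_file, (i + 1) * per_file
        files.append({"train": train[lo:hi], "valid": valid[lo:hi]})
    return files
```

With the default arguments this gives 30 files of 14,000 training and 14,000 validation pixels each, that is 420,000 pixels per sample, matching the figures quoted above.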
NEURAL NETWORK PROPERTIES
One of the important aspects in designing an MLP neural network is how to determine the network
topology. The input size is dictated by the number of available input features and the output
size is dictated by the number of classes. Thus, the two decisions that must be made regarding the
hidden units are to determine the number of hidden layers and the number of neurons in each
hidden layer. Fu [7] stated that using only one hidden layer is sufficient to solve many practical
problems and thus a one-hidden-layer MLP neural network is used in this work. The determination
of the number of neurons in the hidden layer will be discussed in Section 4. Hence, the neural
network topology will be C-HN-O, where C denotes the input neuron, which is the chrominance
component, HN is the number of neurons in the hidden layer and O is the output neuron. The
output layer has one neuron, coded as 1 for skin and 0 for non-skin. The training
algorithm used is Levenberg-Marquardt because of its rapid convergence compared with
other fast training algorithms such as conjugate gradient and quasi-Newton. The transfer
function applied is the sigmoid function. The maximum number of epochs is set to 500, as
trial and error has shown that convergence with the Levenberg-Marquardt training algorithm
always occurs well below 500 epochs [8]. The
training goal selected is 1 × 10^-6.
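To make the C-HN-O topology concrete, the forward pass of such a network can be sketched in a few lines. This is an illustrative sketch only: it assumes the sigmoid is applied in both the hidden and the output layer, that the output is turned into a skin/non-skin label with a 0.5 threshold, and that weights start as uniform random values in [-1, 1]. None of these details are stated in the text, the function names are hypothetical, and Levenberg-Marquardt training itself is not shown.

```python
import math
import random


def sigmoid(x):
    """Numerically safe logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def init_mlp(n_inputs, n_hidden, seed=0):
    """Random initial weights for a C-HN-1 network (assumed range [-1, 1])."""
    rng = random.Random(seed)
    return {
        "w_hidden": [[rng.uniform(-1, 1) for _ in range(n_inputs)]
                     for _ in range(n_hidden)],
        "b_hidden": [rng.uniform(-1, 1) for _ in range(n_hidden)],
        "w_out": [rng.uniform(-1, 1) for _ in range(n_hidden)],
        "b_out": rng.uniform(-1, 1),
    }


def forward(net, features):
    """Return the single output activation, a value strictly between 0 and 1."""
    hidden = [
        sigmoid(sum(w * x for w, x in zip(ws, features)) + b)
        for ws, b in zip(net["w_hidden"], net["b_hidden"])
    ]
    return sigmoid(sum(w * h for w, h in zip(net["w_out"], hidden)) + net["b_out"])


def classify(net, features, threshold=0.5):
    """Map the output to 1 (skin) or 0 (non-skin); the threshold is assumed."""
    return 1 if forward(net, features) >= threshold else 0


def mse(net, samples):
    """Mean squared error over (features, target) pairs."""
    return sum((forward(net, x) - t) ** 2 for x, t in samples) / len(samples)
```

For a single chrominance input, `init_mlp(1, 8)` would build a 1-8-1 network, and `mse` is the quantity a training goal such as the one above would be compared against.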
DETERMINATION OF THE NUMBER OF NEURONS IN THE HIDDEN LAYER
The first step is to find the number of neurons for several chrominance components in the YCbCr
colour space, namely Cb, Cr, Cb/Cr, Cb.Cr and Cb-Cr. Each chrominance component is treated as the input
neuron of an MLP neural network. Thus, the MLP neural network structure for each chrominance
component is 1-HN-1. Existing techniques for finding the number of neurons in the hidden layer
are network growing [9] and network pruning [10]. In this work, a modified network growing method
called “coarse to fine search” is applied. This method is divided into two stages. The
first stage is a coarse search using binary search, in which HN is doubled at each step. Hence,
the values of HN to be tested are 1, 2, 4, 8, 16, 32, 64 and 128. The maximum HN is set at 128
because a larger HN requires more memory space. Each network structure with a given HN is trained
30 times (training runs), each run using different initial values and different training and
validation data, and the average Mean Squared Error (MSE) over the 30 runs is calculated. The HN
value that gives the lowest average MSE will be selected.
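The coarse stage described above can be sketched as a simple loop. This is a minimal illustration under stated assumptions: `train_and_score` is a hypothetical callable, supplied by the caller, that trains a 1-HN-1 network for one run (fresh initial weights, one data file) and returns its MSE. The fine stage is not covered by this sketch.

```python
def coarse_search(train_and_score, max_hn=128, n_runs=30):
    """Coarse stage: try HN = 1, 2, 4, ..., max_hn and average the MSE per HN.

    train_and_score(hn, run) is a caller-supplied (hypothetical) function that
    trains one network with hn hidden neurons on run number `run` and returns
    its MSE. Returns (best_hn, avg_mse), where avg_mse maps each tested HN to
    its mean MSE over n_runs runs.
    """
    avg_mse = {}
    hn = 1
    while hn <= max_hn:
        scores = [train_and_score(hn, run) for run in range(n_runs)]
        avg_mse[hn] = sum(scores) / n_runs
        hn *= 2  # doubling gives the coarse grid of powers of two

    # Pick the HN with the lowest average MSE (first one wins on ties).
    best_hn = min(avg_mse, key=avg_mse.get)
    return best_hn, avg_mse
```

Keeping the training step behind a callable leaves the choice of training algorithm and of which MSE to report (training or validation) to the caller, since this sketch does not assume either.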
CONCLUSION
In this work, several combination strategies for combining MLP neural networks for skin
detection were evaluated. A modified network growing technique for finding the number of
neurons in the hidden layer of an MLP neural network was applied. The three chrominance
components Cb-Cr, Cb/Cr and Cr, which gave the highest correct detection rate (CDR) on their
respective MLPs, were used for
the combination. The combination of Cb/Cr and Cr features improved the CDR by 3.01%
compared to the best single feature MLP given by Cb-Cr. Combining classifiers using the Sum of
Weights strategy further improved the CDR by 4.38% compared to the best single feature MLP.
Furthermore, combining classifiers using the Sum of Weights strategy improved the correct
detection rate by 3.98% compared to the Bayes’ rule classifier reported by Jones and Rehg [6]
using the Compaq database.
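For readers unfamiliar with classifier combination, the rules named in this work (AND, OR, Voting and Sum of Weights) can be illustrated per pixel as below. This is a hedged sketch, not the formulation used in the paper: the exact definition of the Sum of Weights rule is not given in this text, so it is assumed here to be a weighted average of the classifier outputs compared against a threshold. The function names, the weights and the 0.5 threshold are all hypothetical.

```python
def combine_and(decisions):
    """Skin (1) only if every classifier says skin."""
    return int(all(decisions))


def combine_or(decisions):
    """Skin (1) if at least one classifier says skin."""
    return int(any(decisions))


def combine_voting(decisions):
    """Skin (1) if a strict majority of classifiers say skin."""
    return int(sum(decisions) * 2 > len(decisions))


def combine_sum_of_weights(scores, weights, threshold=0.5):
    """Assumed form: weighted average of raw outputs, then a threshold.

    scores are the classifiers' outputs in [0, 1]; weights are hypothetical
    per-classifier weights (for example, derived from validation accuracy).
    """
    total = sum(w * s for w, s in zip(weights, scores))
    return int(total / sum(weights) >= threshold)
```

For three classifiers, `combine_voting([1, 1, 0])` labels the pixel as skin, while `combine_and([1, 1, 0])` does not. The weighted rule differs from the others in that it uses the classifiers' continuous outputs rather than their binary decisions.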