29-03-2014, 11:41 AM
A Survey on Pixel-Based Skin Color Detection Techniques
Abstract
Skin color has proven to be a useful and robust cue for face de-
tection, localization and tracking. Image content filtering, content-
aware video compression and image color balancing applications
can also benefit from automatic detection of skin in images. Numerous
techniques for skin color modelling and recognition have been
proposed over the past several years. A few papers comparing different
approaches have been published [Zarit et al. 1999], [Terrillon
et al. 2000], [Brand and Mason 2000]. However, a comprehensive
survey on the topic is still missing. We try to fill this gap by
reviewing the most widely used methods and techniques and collecting
their numerical evaluation results.
Introduction
Face detection and tracking have been the topic of extensive
research for the past several decades. Many heuristic and
pattern-recognition based strategies have been proposed for achieving
a robust and accurate solution. Among feature-based face detection
methods, the ones using skin color as a detection cue have gained
strong popularity. Color allows fast processing and is highly robust
to geometric variations of the face pattern. Experience also
suggests that human skin has a characteristic color, which is easily
recognized by humans. Employing skin color modelling for face
detection was therefore an idea suggested both by task properties and
common sense.
When building a system that uses skin color as a feature for
face detection, the researcher usually faces three main problems:
first, what colorspace to choose; second, how exactly the skin color
distribution should be modelled; and finally, how the color
segmentation results should be processed for face detection. This
paper covers the first two questions, leaving the third (an equally
important one) for another discussion.
RGB
RGB is a colorspace that originated from CRT (or similar) display
applications, where it was convenient to describe color as a
combination of three colored rays (red, green and blue). It is one of
the most widely used colorspaces for processing and storing digital
image data. However, the high correlation between channels, the
significant perceptual non-uniformity (see section 2.6 for an
explanation of perceptual uniformity) and the mixing of chrominance
and luminance data make RGB a less favorable choice for color analysis
and color-based recognition algorithms. This colorspace was used in
[Brand and Mason 2000], [Jones and Rehg 1999].
HSI, HSV, HSL - Hue Saturation Intensity (Value, Lightness)
Hue-saturation based colorspaces were introduced when there was
a need for the user to specify color properties numerically. They
describe color with intuitive values, based on the artist’s idea of
tint, saturation and tone. Hue defines the dominant color (such as red,
green, purple and yellow) of an area; saturation measures the
colorfulness of an area in proportion to its brightness [Poynton 1995].
The "intensity", "lightness" or "value" is related to the color
luminance. The intuitiveness of the colorspace components and the
explicit discrimination between luminance and chrominance properties
made these colorspaces popular in works on skin color
segmentation [Zarit et al. 1999], [McKenna et al. 1998], [Sigal et al.
2000], [Birchfield 1998], [Jordao et al. 1999]. Several interesting
properties of hue were noted in [Skarbek and Koschan 1994]: it is
invariant to highlights under white light sources and, for matte
surfaces, to ambient light and surface orientation relative to the
light source. However, [Poynton 1995] points out several
undesirable features of these colorspaces, including hue discontinuities
and the computation of "brightness" (lightness, value), which
conflicts badly with the properties of color vision.
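The separation of luminance from chrominance described above can be illustrated with Python's standard colorsys module. This is only a minimal sketch and not part of any surveyed method: the sample color and the brightness factors are assumptions chosen for the example, and uniform scaling of the RGB channels is a crude stand-in for a matte surface seen under dimmer light. Under that assumption, V changes with the scaling while H stays the same.

```python
import colorsys

def hsv_under_scaling(rgb, factors):
    """Return the HSV triple of `rgb` after scaling all channels by each factor.

    `rgb` holds floats in [0, 1]; `factors` are hypothetical brightness
    scales chosen so the scaled channels stay inside [0, 1].
    """
    results = []
    for k in factors:
        r, g, b = (k * c for c in rgb)
        results.append(colorsys.rgb_to_hsv(r, g, b))
    return results

# An assumed skin-like color, used only for illustration.
sample = (0.8, 0.6, 0.5)
for h, s, v in hsv_under_scaling(sample, [1.0, 0.75, 0.5]):
    print(f"H={h:.4f} S={s:.4f} V={v:.4f}")
```

Only the V column differs between the printed rows, which is the property that makes hue attractive as a skin cue under varying illumination intensity.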
Methods discussion
The main advantage of the methods that use explicitly defined skin
cluster boundaries (section 3.1) is the simplicity and intuitiveness
of the classification rules. However, the difficulty with them is the
need to find both a good colorspace and adequate decision rules
empirically. The recently proposed method that uses machine learning
algorithms to find both a suitable colorspace and simple decision
rules [Gomez and Morales 2002] has shown a way to overcome
these difficulties.
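To make the idea of explicitly defined skin cluster boundaries concrete, here is a minimal sketch of a per-pixel rule in RGB. The thresholds are illustrative assumptions, similar in spirit to published rules for uniform daylight illumination; they are not taken from this section, and the function names are hypothetical.

```python
def is_skin_rgb(r, g, b):
    """Classify one 8-bit RGB pixel with a fixed set of threshold rules.

    All thresholds are illustrative assumptions, not values from the text.
    """
    spread = max(r, g, b) - min(r, g, b)
    return (
        r > 95 and g > 40 and b > 20   # bright enough in every channel
        and spread > 15                # not a gray pixel
        and abs(r - g) > 15            # red clearly separated from green
        and r > g and r > b            # red is the dominant channel
    )

def skin_mask(pixels):
    """Apply the rule to an iterable of (r, g, b) tuples."""
    return [is_skin_rgb(r, g, b) for r, g, b in pixels]
```

The whole classifier is a handful of comparisons, which shows both the appeal of this family (simple, fast, easy to read) and its weakness: every threshold has to be found empirically for the chosen colorspace.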
The non-parametric methods (section 3.2) are fast in both training
and classification, and are independent of the distribution shape and
therefore of colorspace selection (see the colorspaces discussion
section for more information on the topic). However, they require much
storage space and a representative training dataset.
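A histogram-based classifier is one way to picture the non-parametric family. The sketch below is an assumption-laden illustration rather than a reproduction of any surveyed method: the 32-bin quantization, the add-one smoothing, the threshold of 1.0 and all function names are choices made for this example. It estimates the ratio P(c|skin) / P(c|non-skin) directly from two color histograms.

```python
from collections import Counter

BINS = 32  # assumed quantization: 32 bins per 8-bit channel

def quantize(pixel, bins=BINS):
    """Map an 8-bit (r, g, b) pixel to its histogram bin index triple."""
    return tuple(c * bins // 256 for c in pixel)

def train_histograms(skin_pixels, nonskin_pixels, bins=BINS):
    """Training is a single counting pass over the labelled pixels."""
    skin = Counter(quantize(p, bins) for p in skin_pixels)
    nonskin = Counter(quantize(p, bins) for p in nonskin_pixels)
    return skin, nonskin

def skin_likelihood_ratio(pixel, skin, nonskin, bins=BINS):
    """Estimate P(c|skin) / P(c|non-skin) from the two histograms.

    Add-one smoothing (an assumption of this sketch) keeps empty bins
    from producing a division by zero.
    """
    total_bins = bins ** 3
    key = quantize(pixel, bins)
    p_skin = (skin[key] + 1) / (sum(skin.values()) + total_bins)
    p_nonskin = (nonskin[key] + 1) / (sum(nonskin.values()) + total_bins)
    return p_skin / p_nonskin

def is_skin(pixel, skin, nonskin, threshold=1.0, bins=BINS):
    """Label a pixel as skin when the likelihood ratio exceeds the threshold."""
    return skin_likelihood_ratio(pixel, skin, nonskin, bins) > threshold
```

No shape is assumed for the skin distribution, and classification is a table lookup. The cost is the one named above: a dense table at this quantization has 32^3 cells per class, and bins never seen in training carry no information, so the training set must be representative.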
The parametric methods (section 3.3) can also be fast. They have
a useful ability to interpolate and generalize incomplete training
data, are expressed by a small number of parameters and need
very little storage space. However, they can be really slow (like
a mixture of Gaussians) in both training and classification, and their
performance depends strongly on the skin distribution shape. Besides,
most parametric skin modelling methods ignore the non-skin color
statistics. This, together with the dependence on skin cluster shape,
results in a higher false positive rate compared to non-parametric
methods.
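The trade-offs of the parametric family can be seen in a single-Gaussian sketch. Everything specific here is an assumption made for illustration: the use of a two-component chrominance vector, the Mahalanobis distance threshold of 9.0 and the function names. The model is fitted to skin samples only, so non-skin statistics are ignored, as the text notes for most parametric methods.

```python
def fit_gaussian(samples):
    """Fit a mean and a 2x2 covariance to a list of 2D chrominance samples.

    Returns ((mx, my), (sxx, sxy, syy)): five numbers describe the model.
    """
    n = len(samples)
    mx = sum(x for x, _ in samples) / n
    my = sum(y for _, y in samples) / n
    sxx = sum((x - mx) ** 2 for x, _ in samples) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in samples) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in samples) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def mahalanobis_sq(point, mean, cov):
    """Squared Mahalanobis distance using the closed-form 2x2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det

def gaussian_is_skin(point, mean, cov, max_dist_sq=9.0):
    """Label a point as skin when it lies inside the assumed ellipse."""
    return mahalanobis_sq(point, mean, cov) <= max_dist_sq
```

The fitted model interpolates smoothly between training samples and needs only five stored numbers, but the elliptical decision boundary is accurate only if the skin cluster is roughly elliptical in the chosen colorspace.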
Conclusion
In this paper, we have provided the description, comparison and
evaluation results of popular methods for skin modelling and detec-
tion. We tried to summarize the most notable and significant dif-
ferences between the methods, their advantages and disadvantages.
The most important conclusions we draw are listed below:
• Parametric skin modelling methods are better suited for
constructing classifiers when the training set and the expected
target data set are limited. The generalization and interpolation
ability of these methods makes it possible to construct a classifier
with acceptable performance from incomplete training data.
• The methods that are less dependent on the skin cluster shape
and take into account the overlap between skin and non-skin colors
(Bayes SPM, maximum entropy model [Jedynak et al. 2002],
automatically constructed colorspace and classification rules
[Gomez and Morales 2002]) look more promising for constructing
a skin classifier for large target datasets.