10-08-2012, 11:58 AM
TOWARDS FACE RECOGNITION AT A DISTANCE
Abstract
Current face recognition algorithms require the active
cooperation of users, who must position themselves in a
small area of space and face the camera. Face recognition in
uncontrolled conditions, such as security camera footage,
presents two extra challenges. First, it is difficult to capture
good-quality images of faces in this setting. Second, the pose
of the face is relatively uncontrolled, which causes most face
recognition algorithms to fail. In this paper, we present a
series of solutions to address these problems. High quality
face images are captured using a foveated wide field sensor,
in which a narrow-field camera is directed towards faces
using information from a static wide-field camera. Feature
points corresponding to the eyes/nose etc. are accurately
localized and face shape is normalized. A novel algorithm is
introduced to identify these (typically non-frontal) faces from
a test gallery of frontal faces. Results are demonstrated to be
superior to contemporary approaches.
Introduction
Commercial face recognition systems typically operate in
restricted circumstances: they require the user to place
themselves in front of the device and to face the camera. In this
paper we summarize our results on determining identity using
face recognition in a security setting. Consider surveillance
equipment in a large open area such as an outdoor car park.
Challenge 1: In order to visualize the area the camera needs a
wide field of view. Unfortunately, wide field of view comes
at the expense of image resolution. For example, a human
face at 5 m will subtend only about 4×6 pixels on a 640×480
sensor with a 130° field of view. This is insufficient resolution
for biometric tasks. One solution is to use two cameras: a
wide-field pre-attentive camera observes the whole scene at
low resolution, and its output is used to orient a narrow-field
foveal camera towards faces in the scene via a pan-tilt motor.
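The resolution argument above can be checked with a short back-of-the-envelope calculation. The assumed face width (0.15 m) is our own illustrative figure, not a number from the paper:

```python
import math

def face_pixels(face_size_m, distance_m, fov_deg, sensor_px):
    """Approximate pixels a face spans on a sensor, given its physical
    size, its distance, and the camera's field of view."""
    # Angle subtended by the face.
    angle_deg = math.degrees(math.atan(face_size_m / distance_m))
    # Pixels per degree for this sensor/lens combination.
    px_per_deg = sensor_px / fov_deg
    return angle_deg * px_per_deg

# Wide-field camera: 640 px across a 130 deg horizontal field.
wide = face_pixels(0.15, 5.0, 130.0, 640)   # only a handful of pixels
# Foveal camera: 640 px across a 13 deg field -- 10x the detail.
fov = face_pixels(0.15, 5.0, 13.0, 640)
print(f"wide: {wide:.1f} px, foveal: {fov:.1f} px")
```

Because the foveal field of view is exactly one tenth of the wide field on the same sensor, it delivers ten times the linear resolution on the face.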
Wide Field Sensor and Person Detection
The sensor consists of two 30Hz RGB Point Grey Dragonfly
cameras (see Figure 1). The wide field camera is fixed in
position and has a 2.1mm lens subtending a 130 deg horizontal
field of view. The foveal camera is mounted on a pan/tilt
platform and has a 24 mm lens with a 13 deg horizontal field of
view. The pan and tilt motors are Lin Engineering step motors,
with a step size of 0.1 degrees. We aim to use the output from
the wide-field camera to orient the foveal camera at faces in the
scene.
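A minimal sketch of how a detection in the wide-field image might be converted to pan/tilt commands, assuming a simple pinhole camera model, square pixels, and the 0.1° motor steps quoted above; the actual calibration procedure is not described in the text:

```python
import math

def pixel_to_steps(u, v, width=640, height=480,
                   hfov_deg=130.0, step_deg=0.1):
    """Convert a pixel position in the wide-field image into pan/tilt
    motor steps that centre the foveal camera on that target."""
    # Pinhole model: focal length in pixels from the horizontal FOV.
    f_px = (width / 2) / math.tan(math.radians(hfov_deg / 2))
    # Angles of the target relative to the optical axis
    # (same f_px used vertically, assuming square pixels).
    pan_deg = math.degrees(math.atan((u - width / 2) / f_px))
    tilt_deg = math.degrees(math.atan((v - height / 2) / f_px))
    # Round to whole motor steps (0.1 deg per step).
    return round(pan_deg / step_deg), round(tilt_deg / step_deg)

print(pixel_to_steps(320, 240))  # optical centre -> (0, 0)
```

A face detected at the image centre needs no motion; targets further from the centre map to proportionally larger step counts.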
The dominant paradigm in face/body detection is to combine
weak classifiers using Adaboost to build a strong classifier
based on supervised data [15, 8]. A number of factors limit
the utility of these methods for far-field or wide-field face
detection. First, they assume a minimum face scale of
12×16 pixels. However, in our relatively unconstrained indoor
testing environments, the median face size is 4×5 pixels, and
many faces subtend less than a pixel (Figure 2). Moreover,
faces are not restricted to frontal or profile views, and heads
may even face away from the camera. We would still like
to orient the foveal camera towards them in this case, since
the head may turn toward the camera in subsequent frames.
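At test time, the boosted-classifier paradigm described above reduces to a weighted vote of weak classifiers. A toy sketch, with made-up weights and weak outputs purely for illustration:

```python
def strong_classifier(weak_outputs, alphas, threshold=0.0):
    """AdaBoost-style strong classifier: a weighted vote over weak
    classifiers, each voting +1 (face) or -1 (non-face)."""
    score = sum(a * h for a, h in zip(alphas, weak_outputs))
    return 1 if score >= threshold else -1

# Three weak classifiers; the two with larger weights agree on "face".
print(strong_classifier([+1, +1, -1], [0.9, 0.6, 0.3]))  # -> 1
```

The weights alphas are learned from supervised data during boosting; the limitation noted in the text is that such detectors need faces larger than the few-pixel targets seen in the wide field.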
Pre-processing Faces
In the previous section we have discussed how to gather high
quality face footage over a wide area. In this section we discuss
how to pre-process faces so that they are suitable for face
recognition. A cascade-based face detector [8] is employed.
This provides an approximate estimate of scale and position,
and the image is rescaled using these estimates to a standard size.
However, for accurate face recognition we need high quality
registration. We identify distinct face features such as the eyes,
nose etc. A warp is performed by triangulating the image based
on these features, and mapping the vertices of the triangle to the
standard shape. The color of each pixel at given barycentric coordinates
remains the same after this warp (see Figure 4).
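The triangulation-based warp can be illustrated for a single triangle: a point keeps its barycentric coordinates while the triangle's vertices move to the standard shape. This is a minimal sketch of the geometry, not the authors' implementation:

```python
def barycentric(p, tri):
    """Barycentric coordinates of point p in triangle tri."""
    (x1, y1), (x2, y2), (x3, y3) = tri
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    l1 = ((y2 - y3) * (p[0] - x3) + (x3 - x2) * (p[1] - y3)) / det
    l2 = ((y3 - y1) * (p[0] - x3) + (x1 - x3) * (p[1] - y3)) / det
    return l1, l2, 1.0 - l1 - l2

def warp_point(p, src_tri, dst_tri):
    """Carry p to the position in dst_tri that has the same barycentric
    coordinates it had in src_tri -- the pixel colour travels with it."""
    l1, l2, l3 = barycentric(p, src_tri)
    x = l1 * dst_tri[0][0] + l2 * dst_tri[1][0] + l3 * dst_tri[2][0]
    y = l1 * dst_tri[0][1] + l2 * dst_tri[1][1] + l3 * dst_tri[2][1]
    return x, y
```

Applying this per-triangle over the whole feature triangulation gives a piecewise-affine warp of the face to the standard shape.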
Face Recognition Results
We built binary pose models comparing frontal faces with one
other pose. In each case we trained the system with 220 faces
from the FERET dataset. We built a separate model each for 10
of the face features that were visible in both profile and frontal
faces. These sub-models are assumed to be independent, and
the likelihoods are multiplied together under each hypothesis.
The test database consisted of 100 different frontal faces from
the FERET dataset. On each trial a single probe face is
presented at a different pose. We report the proportion of first
choice correct matches to the database. Results are given in
Figure 6. Performance at ±22.5° pose difference was perfect,
but dropped to an average of 95% at ±67.5° and 86% at ±90°.
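The independence assumption above means the joint likelihood is a product over the feature sub-models, which is conveniently computed as a sum in log space. A toy sketch with made-up per-feature likelihoods (the identifiers are hypothetical):

```python
import math

def match_score(feature_likelihoods):
    """Joint log-likelihood under the independence assumption:
    the product of per-feature likelihoods, summed in log space
    for numerical stability."""
    return sum(math.log(p) for p in feature_likelihoods)

def identify(scores_per_gallery_face):
    """First-choice match: the gallery face with the highest score."""
    return max(scores_per_gallery_face, key=scores_per_gallery_face.get)

# Hypothetical per-feature likelihoods for two gallery faces.
gallery = {
    "id_A": match_score([0.8, 0.7, 0.9]),
    "id_B": match_score([0.4, 0.5, 0.6]),
}
print(identify(gallery))  # -> id_A
```

In the experiments above there would be 10 likelihood terms per hypothesis (one per feature) and 100 gallery hypotheses.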
Discussion: The Challenges Ahead
In this paper we have sketched a series of solutions for building
reliable face recognition solutions which do not require the
co-operation of the subject. However, extensions are
required before we have a fully useful system. First, in the
examples in this paper, we have classified the pose of the
face by hand, and this needs to be automated. Second, we
have not dealt with lighting variation. The system described
for creating invariance to pose variations could be adapted to
eliminate variation due to illumination.