11-03-2014, 02:54 PM
Face Recognition and Detection
Recognition problems
What is it?
Object and scene recognition
Who is it?
Identity recognition
Where is it?
Object detection
What are they doing?
Activities
All of these are classification problems
Choose one class from a list of possible candidates
What is recognition?
A different taxonomy from [Csurka et al. 2006]:
Recognition
Where is this particular object?
Categorization
What kind of object(s) is(are) present?
Content-based image retrieval
Find me something that looks similar
Detection
Locate all instances of a given class
Sources
Steve Seitz, CSE 455/576, previous quarters
Fei-Fei, Fergus, Torralba, CVPR’2007 course
Efros, CMU 16-721 Learning in Vision
Freeman, MIT 6.869 Computer Vision: Learning
Linda Shapiro, CSE 576, Spring 2007
Today’s lecture
Face recognition and detection
color-based skin detection
recognition: eigenfaces [Turk & Pentland]
and parts [Moghaddam & Pentland]
detection: boosting [Viola & Jones]
Principal component analysis
Suppose each data point is N-dimensional
Same procedure applies:
The eigenvectors of A define a new coordinate system
eigenvector with largest eigenvalue captures the most variation among training vectors x
eigenvector with smallest eigenvalue has least variation
We can compress the data using the top few eigenvectors
corresponds to choosing a “linear subspace”
represent points on a line, plane, or “hyper-plane”
these eigenvectors are known as the principal components
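The points above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the definition of A is not spelled out in the text here, so the sketch assumes A is the scatter matrix of the mean-centered training vectors, A = sum over x of (x - mean)(x - mean)^T. It also uses power iteration to find only the top eigenvector, rather than a full eigendecomposition. All function names are hypothetical.

```python
import math

def scatter_matrix(points):
    """Return (mean, A), with A = sum_x (x - mean)(x - mean)^T (assumed definition of A)."""
    n = len(points[0])
    mean = [sum(p[i] for p in points) / len(points) for i in range(n)]
    A = [[0.0] * n for _ in range(n)]
    for p in points:
        d = [p[i] - mean[i] for i in range(n)]
        for i in range(n):
            for j in range(n):
                A[i][j] += d[i] * d[j]
    return mean, A

def top_eigenvector(A, iters=200):
    """Power iteration: approximates the eigenvector with the largest eigenvalue.

    Caveat: fails if the starting vector happens to be orthogonal to that eigenvector.
    """
    n = len(A)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    # Rayleigh quotient v^T A v gives the eigenvalue for a unit-length v.
    eigenvalue = sum(v[i] * sum(A[i][j] * v[j] for j in range(n)) for i in range(n))
    return v, eigenvalue

def project(point, mean, v):
    """Compress a point to a single coordinate along the principal direction v."""
    return sum((point[i] - mean[i]) * v[i] for i in range(len(v)))

def reconstruct(coord, mean, v):
    """Map the compressed coordinate back onto the line through the mean along v."""
    return [mean[i] + coord * v[i] for i in range(len(v))]
```

For example, with the four points (0,0), (1,2), (2,4), (3,6), which lie on one line, the recovered direction is proportional to (1, 2), and projecting then reconstructing any of the points returns it unchanged: the one-dimensional "linear subspace" loses nothing in that case.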
Constructing the classifier
For each round of boosting:
Evaluate each rectangle filter on each example
Sort examples by filter values
Select best threshold for each filter (min error)
Use sorting to quickly scan for optimal threshold
Select best filter/threshold combination
Weight is a simple function of error rate
Reweight examples
(There are many tricks to make this more efficient.)
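One round of the procedure above might look roughly like the following sketch. It is not the Viola-Jones implementation: it assumes the rectangle filter responses are already computed and passed in as plain lists, uses the textbook AdaBoost weight alpha = 0.5 * ln((1 - err) / err) as the "simple function of error rate", and omits the efficiency tricks. All names are hypothetical.

```python
import math

def stump_predict(value, threshold, polarity):
    """polarity +1: predict +1 when value < threshold; polarity -1: the opposite."""
    below = value < threshold
    return 1 if below == (polarity == 1) else -1

def best_threshold(values, labels, weights):
    """Sort examples by filter value, then scan once for the min weighted-error threshold.

    labels are +1 / -1. Returns (error, threshold, polarity).
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    total_pos = sum(w for w, y in zip(weights, labels) if y == 1)
    total_neg = sum(w for w, y in zip(weights, labels) if y == -1)
    pos_below = neg_below = 0.0  # weight of examples strictly below the candidate
    best = (float("inf"), None, 1)
    for k, i in enumerate(order):
        # Only evaluate at the first example of a run of equal values.
        if k == 0 or values[i] != values[order[k - 1]]:
            err_pos = neg_below + (total_pos - pos_below)  # polarity +1
            err_neg = pos_below + (total_neg - neg_below)  # polarity -1
            if err_pos < best[0]:
                best = (err_pos, values[i], 1)
            if err_neg < best[0]:
                best = (err_neg, values[i], -1)
        if labels[i] == 1:
            pos_below += weights[i]
        else:
            neg_below += weights[i]
    return best

def boosting_round(feature_values, labels, weights):
    """feature_values[f][i] is the response of filter f on example i."""
    best = None
    for f, values in enumerate(feature_values):
        err, thr, pol = best_threshold(values, labels, weights)
        if best is None or err < best[0]:
            best = (err, f, thr, pol)
    err, f, thr, pol = best
    err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0) for a perfect stump
    alpha = 0.5 * math.log((1 - err) / err)
    # Reweight: misclassified examples gain weight, correct ones lose it.
    new_w = [w * math.exp(-alpha * y * stump_predict(feature_values[f][i], thr, pol))
             for i, (w, y) in enumerate(zip(weights, labels))]
    total = sum(new_w)
    return (f, thr, pol, alpha), [w / total for w in new_w]
```

Because the examples are sorted once per filter, every candidate threshold is scored with running sums rather than by re-classifying all examples, which is the point of the "use sorting" step.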
Speed of face detector (2001)
Speed is proportional to the average number of features computed per sub-window.
On the MIT+CMU test set, an average of 9 features (out of 6061) are computed per sub-window.
On a 700 MHz Pentium III, a 384x288 pixel image takes about 0.067 seconds to process (15 fps).
Roughly 15 times faster than Rowley-Baluja-Kanade and 600 times faster than Schneiderman-Kanade.
Face detector comparison
Informal study by Andrew Gallagher, CMU,
for CMU 16-721 Learning-Based Methods in Vision, Spring 2007
The OpenCV implementation of the Viola-Jones algorithm was used (<2 sec per image).
For Schneiderman and Kanade, Object Detection Using the Statistics of Parts [IJCV’04], the www.pittpatt.com demo was used (~10-15 seconds per image, including web transmission).