17-08-2012, 03:10 PM
Fisher Discrimination Dictionary Learning for Sparse Representation
Abstract
Sparse representation based classification has led to
interesting image recognition results, and the dictionary
used for sparse coding plays a key role in its performance.
This paper presents a novel dictionary learning (DL) method
to improve pattern classification performance. Based on
the Fisher discrimination criterion, a structured dictionary,
whose atoms correspond to the class labels, is learned so
that the reconstruction error after sparse coding can be
used for pattern classification. Meanwhile, the Fisher
discrimination criterion is imposed on the coding
coefficients so that they have small within-class scatter
but large between-class scatter. A new classification
scheme associated with the proposed Fisher discrimination
DL (FDDL) method is then presented, using the discriminative
information in both the reconstruction error and the sparse
coding coefficients.
Introduction
The past several years have witnessed the rapid
development of the theory and algorithms of sparse
representation (or coding) [30] and its successful
applications in image restoration [1-3] and compressed
sensing [4]. Recently sparse representation techniques have
also led to promising results in image classification, e.g.
face recognition (FR) [5-7, 10, 31], digit and texture
classification [8-9, 11-12], etc. The success of sparse
representation based classification owes to the fact that a
high-dimensional image can be represented or coded by a
few representative samples from the same class lying on a
low-dimensional manifold, and to the recent progress of
l0-norm and l1-norm minimization techniques [28].
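As a concrete illustration of the l1-norm minimization mentioned above, the sparse code of a sample y over a dictionary D can be obtained with an iterative shrinkage-thresholding (ISTA) loop. The function below is a minimal sketch for illustration only, not the solver used in the paper:

```python
import numpy as np

def ista(D, y, lam=0.1, n_iter=200):
    """Iterative shrinkage-thresholding for min_x 0.5*||y - D x||^2 + lam*||x||_1."""
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant (squared spectral norm)
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        g = D.T @ (D @ x - y)              # gradient of the smooth data-fidelity term
        z = x - g / L                      # gradient step
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft-thresholding
    return x
```

With an orthonormal dictionary, the loop reduces to a single soft-thresholding of the correlations, which is a useful sanity check for the implementation.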
Discriminative coefficient term f(X)
To make the dictionary D discriminative for the samples
in A, we can make the coding coefficients of A over D, i.e. X,
discriminative. Based on a criterion such as the
Fisher discrimination criterion [18], this can be achieved by
minimizing the within-class scatter of X, denoted by SW(X),
and maximizing the between-class scatter of X, denoted by SB(X).
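The traces of these two scatter matrices can be computed directly from the coefficient matrix X and the class labels. The helper below (`fisher_scatter` is a hypothetical name, not from the paper) is a minimal sketch of the quantities the Fisher term penalizes and rewards:

```python
import numpy as np

def fisher_scatter(X, labels):
    """Return tr(S_W(X)) and tr(S_B(X)) for coefficients X (atoms x samples)."""
    m = X.mean(axis=1, keepdims=True)              # global mean coefficient vector
    sw, sb = 0.0, 0.0
    for c in np.unique(labels):
        Xc = X[:, labels == c]
        mc = Xc.mean(axis=1, keepdims=True)        # class-c mean coefficient vector
        sw += ((Xc - mc) ** 2).sum()               # within-class scatter (trace)
        sb += Xc.shape[1] * ((mc - m) ** 2).sum()  # between-class scatter (trace)
    return sw, sb
```

Coefficients that are identical within each class but differ across classes give zero within-class scatter and positive between-class scatter, which is the configuration the Fisher term drives toward.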
The classification scheme
When D is available, a testing sample can be classified
by coding it over D. Depending on the employed dictionary D,
different information can be utilized for the classification
task. In [16] and [10], a common dictionary is shared by all
classes, and the sparse coding coefficients are used for
classification. In SRC [5], the original training samples are
used to form a structured dictionary to code the testing
sample, and the reconstruction error associated with each
class is used for classification. In contrast, in [14]
and [11] the testing sample is coded over the sub-dictionary
associated with each class, and then the per-class
reconstruction error is computed for classification.
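The per-class reconstruction-error scheme of [14] and [11] can be sketched as follows. For simplicity this illustration codes the query with plain least squares rather than l1-regularized sparse coding, and omits the coefficient term that FDDL additionally uses:

```python
import numpy as np

def classify_by_reconstruction(y, sub_dicts):
    """Code y over each class sub-dictionary and return the index of the
    class with the smallest reconstruction error."""
    errs = []
    for D in sub_dicts:
        x, *_ = np.linalg.lstsq(D, y, rcond=None)  # least-squares code of y over D
        errs.append(np.linalg.norm(y - D @ x))     # class-wise reconstruction error
    return int(np.argmin(errs))
```

The query is assigned to the class whose sub-dictionary reconstructs it best, mirroring the intuition that each sub-dictionary represents its own class well and the other classes poorly.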
Face recognition
We apply the proposed algorithm to FR on the Extended
Yale B [21-22], AR [23], and Multi-PIE [24] face
databases. In order to clearly illustrate the advantage of the
proposed method, we compare FDDL with SRC, two recent
DL-based classification methods (discriminative KSVD
(DKSVD) [10] and dictionary learning with structure
incoherence (DLSI) [11]), and two popular classification
methods (nearest neighbor (NN) and linear support vector
machines (SVM)). Note that the original DLSI method
codes the testing sample over each class-specific
sub-dictionary. For a fair comparison, we also report
results (denoted by DLSI*) obtained by coding the testing
sample over the whole dictionary and using the
reconstruction error for classification. The default number
of dictionary atoms in FDDL for each class is set to the
number of training samples of that class.
Parameter selection
One important parameter in FDDL is the number of
atoms in Di, denoted by pi. For FDDL, we usually set all the
pi equal, i = 1, 2, …, c. We use SRC as the baseline method
and analyze the effect of pi on the performance of FDDL,
taking FR on Extended Yale B [21-22] as an example (the
experiment setting is given in the next subsection). Because
SRC uses the original training samples as the dictionary, we
randomly select pi training samples per class as dictionary
atoms and run the experiment 10 times to report the average
recognition rate. Fig. 3 plots the recognition rates of FDDL
and SRC versus the number of dictionary atoms. We can see
that in all cases FDDL achieves about a 3% improvement
over SRC.
Digit recognition
We then perform handwritten digit recognition on the
widely used USPS database [26] with 7,291 training and
2,007 testing images. We compare the proposed FDDL
with state-of-the-art methods reported in [11], [9] and [8].
These methods include the best reconstructive DL method
with linear and bilinear classifier models (denoted by
REC-L and REC-BL) [9], the best supervised DL method
with generative training and discriminative training
(denoted by SDL-G and SDL-D) [9], the best result of
sparse representation for signal classification (denoted by
SRSC) [8] and the best result of DLSI [11]. In addition,
some results of problem-specific methods (i.e., the standard
Euclidean k-NN and an SVM with a Gaussian kernel)
reported in [11] are also listed. Here the original 16×16
image is directly used as the feature, and the dictionary of
each class has 90 atoms in FDDL, with λ1 = γ1 = 0.1,
λ2 = 0.001, and γ2 = 0.005.
Conclusion and discussion
In this paper, we proposed a Fisher Discrimination
Dictionary Learning (FDDL) approach to sparse
representation based image classification. FDDL aims
to learn a structured dictionary whose sub-dictionaries have
specific class labels. The discrimination ability of FDDL is
two-fold. First, each sub-dictionary of the learned
dictionary has good representation power for the samples
of its corresponding class but poor representation power
for the samples of other classes. Second, FDDL yields
discriminative coefficients by minimizing their within-class
scatter and maximizing their between-class scatter.
Accordingly, we presented a classification scheme associated
with FDDL, which uses the discriminative information in both
the reconstruction error and the sparse coding coefficients
to classify the input query image.