2017 IEEE International Conference on Computational Science and Engineering (CSE) and IEEE International Conference on Embedded and Ubiquitous Computing (EUC)
Kernel Bilinear Discriminant Analysis for Face Recognition
Xiao-Zhang Liu
College of Information Science & Technology
Hainan University
Haikou, China
With this consideration, some bilateral-projection-based 2D feature extraction techniques have been proposed for seeking transforms on both sides of the image matrix, such as generalized low-rank approximation of matrices (GLRAM) [5], which can be seen as a kind of two-dimensional PCA, and two-dimensional LDA (2DLDA) [6], which can implicitly resolve the SSS problem suffered by LDA. These 2D methods are more efficient and more effective than their 1D counterparts. Furthermore, Yan et al. [7] proposed an algorithm called multilinear discriminant analysis (MDA) for supervised dimensionality reduction, which encodes an image object as a general tensor of second or even higher order. As a general alternative to LDA, MDA avoids the curse-of-dimensionality dilemma and alleviates the SSS problem. Moreover, when the tensors are of second order, MDA also performs a 2D feature extraction from image matrices.
However, the above 2D subspace learning algorithms are linear projection techniques, whereas face recognition is a highly nonlinear problem owing to image variations such as pose, illumination and facial expression. To handle such an inherently nonlinear problem, several kernel-based PCA/LDA methods have been developed and applied to pattern recognition tasks, such as kernel PCA (KPCA) [8], kernel Fisher discriminant (KFD) [9], generalized discriminant analysis (GDA) [10], and kernel direct LDA (KDDA) [11]. Theoretical and experimental results both show that these kernel-based subspace methods are effective for face recognition compared with their linear counterparts. Nevertheless, all these classical kernel methods are based on a vector representation of face images. Since the 2D versions of PCA and LDA can achieve better performance than PCA and LDA, respectively, it is natural to expect that incorporating the kernel trick into 2DLDA will give better classification than kernel-based LDA.
In this paper, we develop a kernel-based 2D subspace learning method, called kernel bilinear discriminant analysis (KBDA), which extracts image discriminant features by maximizing the interclass scatter and at the same time minimizing the intraclass scatter, both measured in a kernel-based tensor metric. The rest of this paper is organized as follows. In Section 2, we propose a kernel-based discriminant tensor criterion, called the kernel bilinear Fisher criterion (KBFC).
Abstract—This paper proposes a novel kernel-based image subspace learning method for face recognition, by encoding a face image as a second-order tensor (matrix). First, we propose a kernel-based discriminant tensor criterion, called the kernel bilinear Fisher criterion (KBFC), which is designed to
simultaneously pursue two projection vectors to maximize the
interclass scatter and at the same time minimize the intraclass
scatter in its corresponding subspace. Then, a score level fusion
method is presented to combine two separate projection results
to achieve classification tasks. Experimental results on the
ORL and UMIST face databases show the effectiveness of the
proposed approach.
Keywords-kernel; bilinear discriminant; matrix representation; face recognition
During the past decades, subspace learning has been
one of the mainstream directions of the face recognition
field. Most conventional subspace learning techniques, such
as Principal Component Analysis (PCA) [1] and Linear
Discriminant Analysis (LDA) [2], are based on the so-called
vector-space model. Under this model, the original two-dimensional (2D) image data are reshaped into a one-dimensional (1D) long vector by stacking either rows or columns of the image. This vector-space model
introduces the following problems in practical applications.
First, the intrinsic 2D structure of the image matrix is destroyed. As a result, the spatial information stored in the 2D image is discarded and not effectively utilized for representation and recognition. Second, each image sample is modeled
as a point in a high-dimensional space, e.g., for an image
of size 112 × 92, the commonly used image size in face
recognition, the dimension of the vector space is 10304,
and the size of the scatter matrices is 10304 × 10304.
Obviously, a large number of training samples are needed
to get a reliable and robust estimation of data statistics.
This problem, known as curse of dimensionality, is often
encountered in real applications. Third, usually only a very limited number of samples are available in real applications, so that the Small Sample Size (SSS) problem [3] frequently arises in practice.
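As a concrete illustration of the blow-up described above, the following sketch (using NumPy; image size as in the text, random placeholder pixels) shows how flattening a 112 × 92 image yields a 10304-dimensional vector whose scatter matrix has over 10^8 entries:

```python
import numpy as np

# A single 112 x 92 face image (random placeholder pixels).
image = np.random.rand(112, 92)

# Vector-space model: stack the columns into one long 1D vector.
x = image.reshape(-1, order="F")  # column stacking
print(x.shape)  # (10304,)

# The scatter matrix of such vectors is D x D.
D = x.size
print(D, D * D)  # 10304 entries per axis, 106172416 entries total
```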
To overcome the above drawbacks, efforts have been made to extract 2D features directly from the matrix form [4] of image samples, without vectorization.
978-1-5386-3221-5/17 $31.00 © 2017 IEEE
DOI 10.1109/CSE-EUC.2017.110
The KBFC is designed to simultaneously pursue two projection vectors to maximize the interclass scatter and at the same time minimize the intraclass scatter in the corresponding subspaces. In Section 3, the fusion procedure for the two projection results is presented. Experiments on standard face recognition databases are reported in Section 4. Finally, we draw our conclusions in Section 5.

For k = 1, 2, the KBFC seeks the projection vector that maximizes the interclass scatter and at the same time minimizes the intraclass scatter in its corresponding subspace, that is,

$$v_k^* = \arg\max_{v_k} \frac{\sum_{i=1}^{C} N_i \left\| M_i^{\Phi_k} \times_k v_k - M^{\Phi_k} \times_k v_k \right\|^2}{\sum_{i=1}^{C} \sum_{j=1}^{N_i} \left\| \Phi_k(X_j^{(i)}) \times_k v_k - M_i^{\Phi_k} \times_k v_k \right\|^2}, \quad k = 1, 2.$$
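To make the scatter ratio concrete, the sketch below evaluates it for a candidate projection vector in the linear special case, i.e., with the identity feature map Φ_k(x) = x and k = 1 (column mode). This is a simplified illustration on toy data, not the paper's kernelized implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two classes of 5 toy "images" each, of size m1 x m2.
m1, m2 = 6, 4
classes = [[rng.normal(c, 1.0, (m1, m2)) for _ in range(5)] for c in (0.0, 3.0)]

# Identity feature map (linear special case): Phi(X) = X, d_k = m1.
class_means = [np.mean(Xs, axis=0) for Xs in classes]
global_mean = np.mean([X for Xs in classes for X in Xs], axis=0)

def kbfc_ratio(v):
    """Interclass over intraclass scatter of the 1-mode projections X^T v."""
    between = sum(len(Xs) * np.sum((M.T @ v - global_mean.T @ v) ** 2)
                  for Xs, M in zip(classes, class_means))
    within = sum(np.sum((X.T @ v - M.T @ v) ** 2)
                 for Xs, M in zip(classes, class_means) for X in Xs)
    return between / within

v = np.ones(m1) / np.sqrt(m1)
print(kbfc_ratio(v))  # positive; larger values mean better class separation
```

Note that the ratio is invariant to the scale of v, so only the direction of the projection vector matters, which is why the criterion is stated as an arg max over directions.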
The criterion above can be reformulated as follows. Let $\Phi_k : x \in \mathbb{R}^{m_k} \rightarrow \Phi_k(x) \in F_k$ be a nonlinear mapping from the vector space $\mathbb{R}^{m_k}$ to some high-dimensional feature space $F_k$, in which different classes of objects are assumed to be linearly separable, with $d_k = \dim F_k$, $k = 1, 2$. Any matrix $X \in \mathbb{R}^{m_1 \times m_2}$ can be written as $X = [\alpha_1, \alpha_2, \ldots, \alpha_{m_2}]$, where $\alpha_t \in \mathbb{R}^{m_1}$ is the $t$-th column of $X$, $t = 1, 2, \ldots, m_2$, or as $X = [\beta_1^T, \beta_2^T, \ldots, \beta_{m_1}^T]$, where $\beta_r \in \mathbb{R}^{m_2}$ is the $r$-th row of $X$, $r = 1, 2, \ldots, m_1$. Denote

$$\Phi_1(X) = [\Phi_1(\alpha_1), \Phi_1(\alpha_2), \ldots, \Phi_1(\alpha_{m_2})] \in \mathbb{R}^{d_1 \times m_2}, \quad (1)$$

$$\Phi_2(X) = [\Phi_2(\beta_1^T), \Phi_2(\beta_2^T), \ldots, \Phi_2(\beta_{m_1}^T)] \in \mathbb{R}^{d_2 \times m_1}. \quad (2)$$

Assume that the matrices $X_j^{(i)} \in \mathbb{R}^{m_1 \times m_2}$, $j = 1, 2, \ldots, N_i$, represent the 2D images in the sample set belonging to the $i$-th class, $i = 1, 2, \ldots, C$ (so $N = \sum_{i=1}^{C} N_i$). Under $\Phi_k$, the $i$-th mapped matrix class is given by

$$\mathcal{X}_i^{\Phi_k} = \left\{ \Phi_k(X_1^{(i)}), \Phi_k(X_2^{(i)}), \ldots, \Phi_k(X_{N_i}^{(i)}) \right\},$$

and the mapped matrix set is $\mathcal{X}^{\Phi_k} = \bigcup_{i=1}^{C} \mathcal{X}_i^{\Phi_k}$. The mean of the mapped class $\mathcal{X}_i^{\Phi_k}$ and that of the mapped set $\mathcal{X}^{\Phi_k}$ are respectively given by

$$M_i^{\Phi_k} = \frac{1}{N_i} \sum_{j=1}^{N_i} \Phi_k(X_j^{(i)}), \qquad M^{\Phi_k} = \frac{1}{N} \sum_{i=1}^{C} \sum_{j=1}^{N_i} \Phi_k(X_j^{(i)}).$$

With the terminology of tensor operations [12], the $k$-mode product of the matrix $\Phi_k(X_j^{(i)})$ and a vector $v_k \in \mathbb{R}^{d_k}$ is

$$y_{j,k} = \Phi_k(X_j^{(i)}) \times_k v_k = \Phi_k(X_j^{(i)})^T v_k, \quad k = 1, 2. \quad (3)$$

With these notations, the criterion becomes

$$v_k^* = \arg\max_{v_k} \frac{v_k^T S_b^{\Phi_k} v_k}{v_k^T S_w^{\Phi_k} v_k}, \quad k = 1, 2, \quad (4)$$

where

$$S_b^{\Phi_k} = \sum_{i=1}^{C} N_i \left( M_i^{\Phi_k} - M^{\Phi_k} \right) \left( M_i^{\Phi_k} - M^{\Phi_k} \right)^T,$$

$$S_w^{\Phi_k} = \sum_{i=1}^{C} \sum_{j=1}^{N_i} \left( \Phi_k(X_j^{(i)}) - M_i^{\Phi_k} \right) \left( \Phi_k(X_j^{(i)}) - M_i^{\Phi_k} \right)^T.$$

It can easily be seen that, for k = 1, 2 respectively, the optimization criterion (4) is a special case of that of the KDDA method [11], and it can be solved within the framework of KDDA.

Our kernel bilinear Fisher criterion (KBFC) pursues two kernel-based subspaces, in order to give full play to the matrix structure of images. Thus our face recognition method consists of three steps. The first and second steps separately perform the KDDA algorithm to optimize criterion (4) for k = 1 and k = 2. The two separate procedures yield two separate results, which are combined by score-level fusion in the final step to accomplish classification.

The score-level fusion criterion is described as follows. Given N images belonging to C subjects, denote the membership degree of the n-th image belonging to the c-th subject as $\lambda_{cn}$, $n = 1, 2, \ldots, N$, $c = 1, 2, \ldots, C$. $\lambda_{cn}$ is obtained as

$$\lambda_{cn} = \sum_{k=1}^{2} \alpha_k s_{cn}(k), \quad (5)$$

where $s_{cn}(k)$ denotes the score of the n-th image with regard to the c-th subject, obtained by the optimization of the KBFC, and $\alpha_k$ is the corresponding weight, $k = 1, 2$. The n-th image is assigned to the I-th subject if and only if $\lambda_{In} = \max_c \lambda_{cn}$.

We now explain the score $s_{cn}(k)$, i.e., the score of the n-th image with regard to the c-th subject, obtained by optimizing criterion (4) using KDDA for k = 1, 2, respectively.
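The fusion rule (5) and the assignment rule can be sketched as follows; this is a minimal illustration in which the per-mode score matrices and the weights α_k are toy values, not values from the paper:

```python
import numpy as np

# Toy per-mode score matrices s[c, n]: C = 3 subjects, N = 2 test images,
# produced (in the real method) by the two KDDA-optimized subspaces.
s1 = np.array([[2.0, 0.0], [1.0, 2.0], [0.0, 1.0]])  # scores for k = 1
s2 = np.array([[2.0, 1.0], [0.0, 2.0], [1.0, 0.0]])  # scores for k = 2
alpha = (0.5, 0.5)  # fusion weights alpha_1, alpha_2

# Eq. (5): membership degree lambda_{cn} = sum_k alpha_k * s_{cn}(k).
lam = alpha[0] * s1 + alpha[1] * s2

# Assign each image to the subject with the greatest membership degree.
assignment = np.argmax(lam, axis=0)
print(assignment)  # -> [0 1]
```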
The kernel bilinear Fisher criterion (KBFC) is designed to simultaneously pursue two projection vectors $v_k$, $k = 1, 2$, each of which maximizes the interclass scatter and at the same time minimizes the intraclass scatter in its corresponding subspace.

Figure 1. Images of one person from the ORL face database.
Figure 2. Images of one person from the UMIST face database.

For the n-th image $X_n$, we can compute the squared kernel distance between $X_n$ and the center $M_c$ of the c-th class as

$$d_{cn} = \left\| \Phi(X_n) - \Phi(M_c) \right\|_F^2.$$
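Since Φ is accessed only through the kernel, the squared distance above can be computed without the explicit mapping, using the identity ‖Φ(a) − Φ(b)‖² = k(a,a) − 2k(a,b) + k(b,b). The sketch below assumes a column-wise Gaussian-kernel realization, consistent with the column mapping Φ1 but not taken verbatim from the paper:

```python
import numpy as np

def gaussian_kernel(a, b, sigma=1.0):
    """Gaussian RBF kernel k(a, b) between two column vectors."""
    return np.exp(-np.sum((a - b) ** 2) / (2.0 * sigma ** 2))

def squared_kernel_distance(X, M, sigma=1.0):
    """||Phi(X) - Phi(M)||_F^2 with Phi applied column-wise (kernel trick)."""
    total = 0.0
    for t in range(X.shape[1]):
        a, b = X[:, t], M[:, t]
        total += (gaussian_kernel(a, a, sigma)        # = 1 for the Gaussian
                  - 2.0 * gaussian_kernel(a, b, sigma)
                  + gaussian_kernel(b, b, sigma))     # = 1 for the Gaussian
    return total

X = np.zeros((4, 3))
print(squared_kernel_distance(X, X))  # identical inputs -> 0.0
```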
Images from one subject are shown in Fig. 2. In our experiments, the 92 × 112 images are likewise resized to 23 × 28 for efficiency.
To test the recognition performance with different numbers of training samples on UMIST, Tr (2 ≤ Tr ≤ 5) images of each subject are randomly selected for training, i.e., the training set contains 20Tr images in all; the remaining (575 − 20Tr) images form the test set.
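The random per-subject split can be sketched as follows; the array layout (one label per image) is a hypothetical choice for illustration, not code from the paper:

```python
import numpy as np

def split_per_subject(labels, tr, rng=np.random.default_rng(0)):
    """Pick `tr` random images per subject for training; the rest for testing."""
    labels = np.asarray(labels)
    train_idx, test_idx = [], []
    for subject in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == subject))
        train_idx.extend(idx[:tr])
        test_idx.extend(idx[tr:])
    return np.array(train_idx), np.array(test_idx)

# Toy labels: 20 subjects as in UMIST, but only 5 images each for brevity.
labels = np.repeat(np.arange(20), 5)
train, test = split_per_subject(labels, tr=2)
print(len(train), len(test))  # 40 training images (20 * Tr), 60 test images
```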
Here, $\|\cdot\|_F$ denotes the Frobenius norm. We then sort the C squared distances $\{d_{cn} \mid c = 1, 2, \ldots, C\}$ in ascending order, and let $i(c)$ denote the class whose squared kernel distance ranks c-th smallest in $\{d_{cn}\}$. The score of that class is defined as

$$s_{i(c)n}(k) = C - c, \quad c = 1, 2, \ldots, C.$$
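The rank-based scoring rule can be sketched as follows, using toy distances for illustration:

```python
import numpy as np

def rank_scores(distances):
    """Score each class as C - rank, where rank 1 is the smallest distance."""
    C = len(distances)
    order = np.argsort(distances)           # ascending: order[0] is the nearest class
    scores = np.empty(C, dtype=int)
    scores[order] = C - (np.arange(C) + 1)  # nearest class gets score C - 1
    return scores

d = np.array([0.7, 0.2, 0.9])  # squared kernel distances to C = 3 class centers
print(rank_scores(d))  # -> [1 2 0]: class 1 is nearest, so it gets the top score
```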
B. Recognition results
Fig. 3 shows the recognition results of KDDA, K2DFDA, and the proposed method for Tr = 2, 3, 4, 5 on the ORL and UMIST databases, where the results are obtained with the respective optimized kernel parameters. All results are average recognition accuracies over 10 random runs.
From the results in Fig. 3, it can be seen that the proposed method achieves higher accuracies than KDDA and K2DFDA. Moreover, as the number of training samples increases, the proposed method maintains a significant advantage over KDDA, compared with K2DFDA.
In this way we obtain the score set $\{s_{cn}(k) \mid c = 1, 2, \ldots, C\}$. Thus, given the n-th image, the smaller the kernel distance between it and the center of the c-th class, the greater $s_{cn}(k)$ is.
When recognizing images with the proposed method, the Gaussian kernel parameter is optimized separately for the two kernel-based subspaces. The two results are then combined to compute the membership degree of the image with respect to every subject according to Eq. (5). Finally, the image is assigned to the subject with the greatest membership degree.
In this paper, we propose a novel kernel-based image feature extraction method for face recognition, based on a matrix representation of face images. We first propose a kernel-based discriminant tensor criterion, called the kernel bilinear Fisher criterion (KBFC), which is designed to simultaneously pursue two projection vectors to maximize the interclass scatter and at the same time minimize the intraclass scatter in the corresponding subspaces. Then, we present a score-level fusion method that combines the two separate projection results for classification. The proposed method yields higher recognition accuracies on the ORL and UMIST face databases, compared with KDDA and K2DFDA.
To evaluate the performance of our method for face recognition, we have made experimental comparisons with KDDA [11] and K2DFDA [13], based on the Gaussian RBF kernel, in terms of recognition accuracy. Images are from two face databases, namely ORL and UMIST.
A. Face image datasets
In our experiments, we use two standard face recognition
databases which are widely used as benchmark datasets in
feature extraction literature.
The ORL face database: There are ten images for each of the 40 human subjects, taken at different times and varying in lighting, facial expression and facial details. Images from one subject are shown in Fig. 1. The original images (with 256 gray levels) have size 92 × 112 and are resized to 23 × 28 to reduce the computational load.
To test the recognition performance with different numbers of training samples on ORL, Tr (2 ≤ Tr ≤ 5) images of each subject are randomly selected for training, and the remaining (10 − Tr) images are used for testing.
The UMIST face database: It consists of 575 gray-scale images of 20 subjects, each covering a wide range of poses from profile to frontal views, as well as variations in race, gender and appearance.
This work is partially supported by the National Natural
Science Foundation of China under grant No. 61562017,
and the Scientific Research Foundation of Hainan University
(Project No.: kyqd1443).
[1] Turk M, Pentland A, Eigenfaces for Recognition. J. Cogn. Neurosci., 1991, 3(1): 71-86.
[2] Belhumeur P N, Hespanha J P, Kriegman D J, Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection. IEEE Trans. Pattern Anal. Mach. Intell., 1997, 19(7): 711-720.
[9] Mika S, Rätsch G, Weston J, Schölkopf B, Müller K R, Fisher Discriminant Analysis with Kernels. Proc. IEEE Workshop Neural Netw. Signal Process. IX, IEEE Press, Madison, 1999: 41-48.
[10] Baudat G, Anouar F, Generalized Discriminant Analysis Using a Kernel Approach. Neural Comput., 2000, 12(10): 2385-2404.
[11] Lu J W, Plataniotis K, Venetsanopoulos A N, Face Recognition Using Kernel Direct Discriminant Analysis Algorithms. IEEE Trans. Neural Netw., 2003, 14(1): 117-126.
[12] Kolda T, Orthogonal Tensor Decompositions. SIAM J. Matrix Anal. Appl., 2001, 23(1): 243-255.
[13] Liu X Z, Wang P S P, Feng G C, Kernel Based 2D Fisher
Discriminant Analysis with Parameter Optimization for Face
Recognition. Int. J. Patt. Recogn. Artif. Intell. 2013, 27(8).
Figure 3. Comparison of recognition performance on ORL and UMIST (recognition rate (%) versus number of training images per subject).
[3] Fukunaga K, Introduction to Statistical Pattern Recognition.
Academic Press, 1991.
[4] Zheng W S, Lai J H, Li S Z, 1D-LDA vs. 2D-LDA: When Is Vector-based Linear Discriminant Analysis Better than Matrix-based? Pattern Recognition, 2008, 41(7): 2156-2172.
[5] Ye J, Generalized Low Rank Approximations of Matrices. Proceedings of the 21st International Conference on Machine Learning, Banff, Canada, July 2004.
[6] Ye J, Janardan R, Li Q, Two-dimensional Linear Discriminant
Analysis. Advances in Neural Information Processing Systems
17, 2005, 1569-1576.
[7] Yan S C, Xu D, Yang Q, Zhang L, Tang X O, Zhang H J, Multilinear Discriminant Analysis for Face Recognition. IEEE Trans. Image Process., 2007, 16(1): 212-220.
[8] Schölkopf B, Smola A, Müller K R, Nonlinear Component
Analysis as a Kernel Eigenvalue Problem. Neural Comput.,
1998, 10(5): 1299-1319.