2017 IEEE International Conference on Computational Science and Engineering (CSE) and IEEE International Conference on Embedded and Ubiquitous Computing (EUC)

Kernel Bilinear Discriminant Analysis for Face Recognition

Xiao-Zhang Liu
College of Information Science & Technology, Hainan University, Haikou, China
Email: liuxiaozhang@gmail.com

978-1-5386-3221-5/17 $31.00 © 2017 IEEE. DOI 10.1109/CSE-EUC.2017.110

Abstract—This paper proposes a novel kernel-based image subspace learning method for face recognition, which encodes a face image as a second-order tensor (matrix). First, we propose a kernel-based discriminant tensor criterion, called the kernel bilinear Fisher criterion (KBFC), which simultaneously pursues two projection vectors, each maximizing the interclass scatter while minimizing the intraclass scatter in its corresponding subspace. Then, a score-level fusion method is presented to combine the two separate projection results for classification. Experimental results on the ORL and UMIST face databases show the effectiveness of the proposed approach.

Keywords—kernel; bilinear discriminant; matrix representation; face recognition

I. INTRODUCTION

During the past decades, subspace learning has been one of the mainstream directions in face recognition. Most conventional subspace learning techniques, such as Principal Component Analysis (PCA) [1] and Linear Discriminant Analysis (LDA) [2], are based on the so-called vector-space model. Under this model, the original two-dimensional (2D) image is reshaped into a one-dimensional (1D) long vector by stacking either the rows or the columns of the image. This vector-space model introduces the following problems in practical applications. First, the intrinsic 2D structure of the image matrix is destroyed, so the spatial information stored in the 2D image is discarded and not effectively utilized for representation and recognition. Second, each image sample is modeled as a point in a high-dimensional space; e.g., for an image of size 112 × 92, a size commonly used in face recognition, the dimension of the vector space is 10304 and the scatter matrices are of size 10304 × 10304. Obviously, a large number of training samples is needed to obtain a reliable and robust estimate of the data statistics. This problem, known as the curse of dimensionality, is often encountered in real applications. Third, usually only a very limited number of samples is available in real applications, so the Small Sample Size (SSS) problem [3] arises frequently in practice.

To overcome these drawbacks, efforts have been made to extract 2D features directly from the matrix form [4] of image samples, without vectorization. With this consideration, bilateral-projection-based 2D feature extraction techniques have been proposed that seek transforms on both sides of the image matrix, such as generalized low-rank approximation of matrices (GLRAM) [5], which can be seen as a kind of two-dimensional PCA, and two-dimensional LDA (2DLDA) [6], which implicitly resolves the SSS problem suffered by LDA. These 2D methods are both more efficient and more effective than their respective 1D counterparts. Furthermore, Yan et al. [7] proposed multilinear discriminant analysis (MDA) for supervised dimensionality reduction, which encodes an image object as a general tensor of second or even higher order. As a general alternative to LDA, MDA avoids the curse of dimensionality and alleviates the SSS problem; when the tensors are of second order, MDA likewise performs 2D feature extraction on image samples.

However, the above 2D subspace learning algorithms are linear projection techniques, whereas face recognition is a highly nonlinear problem owing to image variations such as pose, illumination and facial expression. To handle this inherent nonlinearity, kernel-based PCA/LDA methods have been developed and applied to pattern recognition tasks, such as kernel PCA (KPCA) [8], kernel Fisher discriminant (KFD) [9], generalized discriminant analysis (GDA) [10], and kernel direct LDA (KDDA) [11]. Theoretical and experimental results both show that these kernel-based subspace methods are effective for face recognition compared with their linear counterparts.

Nevertheless, all these classical kernel methods are based on a vector representation of face images. Since the 2D versions of PCA and LDA achieve better performance than PCA and LDA, respectively, it is natural to expect that incorporating the kernel trick into 2DLDA will give better classification than kernel-based LDA. In this paper, we develop a kernel-based 2D subspace learning method, called kernel bilinear discriminant analysis (KBDA), which extracts image discriminant features by maximizing the interclass scatter while simultaneously minimizing the intraclass scatter, both measured in a kernel-based tensor metric.

The rest of this paper is organized as follows. In Section 2, we propose a kernel-based discriminant tensor criterion, called the kernel bilinear Fisher criterion (KBFC), which simultaneously pursues two projection vectors, each maximizing the interclass scatter while minimizing the intraclass scatter in its corresponding subspace. In Section 3, the fusion procedure for the two projection results is presented. Experiments on standard face recognition databases are reported in Section 4. Finally, we draw our conclusions in Section 5.

II. KERNEL BILINEAR FISHER CRITERION

Let $\Phi_k : x \in \mathbb{R}^{m_k} \to \Phi_k(x) \in F_k$ be a nonlinear mapping from the vector space $\mathbb{R}^{m_k}$ to some high-dimensional feature space $F_k$, in which different classes of objects are assumed to be linearly separable, with $d_k = \dim F_k$, $k = 1, 2$. Any matrix $X \in \mathbb{R}^{m_1 \times m_2}$ can be written column-wise as $X = [\alpha_1, \alpha_2, \ldots, \alpha_{m_2}]$, where $\alpha_t \in \mathbb{R}^{m_1}$ is the $t$-th column of $X$, $t = 1, 2, \ldots, m_2$, or row-wise as $X = [\beta_1^T, \beta_2^T, \ldots, \beta_{m_1}^T]^T$, where $\beta_r \in \mathbb{R}^{m_2}$ is the $r$-th row of $X$, $r = 1, 2, \ldots, m_1$. Denote
$$\Phi_1(X) = [\Phi_1(\alpha_1), \Phi_1(\alpha_2), \ldots, \Phi_1(\alpha_{m_2})] \in \mathbb{R}^{d_1 \times m_2}$$
and
$$\Phi_2(X) = [\Phi_2(\beta_1^T), \Phi_2(\beta_2^T), \ldots, \Phi_2(\beta_{m_1}^T)] \in \mathbb{R}^{d_2 \times m_1}.$$

Assume that the matrices $X_j^{(i)} \in \mathbb{R}^{m_1 \times m_2}$, $j = 1, 2, \ldots, N_i$, represent the 2D images in the sample set belonging to the $i$-th class, $i = 1, 2, \ldots, C$ ($N = \sum_{i=1}^{C} N_i$). Under $\Phi_k$, the $i$-th mapped class is given by
$$X_{\Phi_k}^{(i)} = \left\{ \Phi_k(X_1^{(i)}), \Phi_k(X_2^{(i)}), \ldots, \Phi_k(X_{N_i}^{(i)}) \right\},$$
and the mapped set is $X_{\Phi_k} = \bigcup_{i=1}^{C} X_{\Phi_k}^{(i)}$. The means of the mapped class $X_{\Phi_k}^{(i)}$ and of the mapped set $X_{\Phi_k}$ are, respectively,
$$M_{\Phi_k}^{(i)} = \frac{1}{N_i} \sum_{j=1}^{N_i} \Phi_k(X_j^{(i)}), \qquad (1)$$
$$M_{\Phi_k} = \frac{1}{N} \sum_{i=1}^{C} \sum_{j=1}^{N_i} \Phi_k(X_j^{(i)}). \qquad (2)$$

With the terminology of tensor operations [12], the $k$-mode product of the matrix $\Phi_k(X_j^{(i)})$ and a vector $v_k \in \mathbb{R}^{d_k}$ is
$$y_{j,k}^{(i)} = \Phi_k(X_j^{(i)}) \times_k v_k = \Phi_k(X_j^{(i)})^T v_k, \quad k = 1, 2. \qquad (3)$$

The kernel bilinear Fisher criterion (KBFC) is designed to simultaneously pursue two projection vectors $v_k$, $k = 1, 2$, each of which maximizes the interclass scatter and at the same time minimizes the intraclass scatter in its corresponding subspace. That is,
$$v_k^* = \arg\max_{v_k} \frac{\sum_{i=1}^{C} N_i \left\| M_{\Phi_k}^{(i)} \times_k v_k - M_{\Phi_k} \times_k v_k \right\|^2}{\sum_{i=1}^{C} \sum_{j=1}^{N_i} \left\| \Phi_k(X_j^{(i)}) \times_k v_k - M_{\Phi_k}^{(i)} \times_k v_k \right\|^2}, \quad k = 1, 2.$$

The criterion above can be reformulated as
$$v_k^* = \arg\max_{v_k} \frac{v_k^T S_b^{\Phi_k} v_k}{v_k^T S_w^{\Phi_k} v_k}, \quad k = 1, 2, \qquad (4)$$
where
$$S_b^{\Phi_k} = \sum_{i=1}^{C} N_i \left( M_{\Phi_k}^{(i)} - M_{\Phi_k} \right) \left( M_{\Phi_k}^{(i)} - M_{\Phi_k} \right)^T,$$
$$S_w^{\Phi_k} = \sum_{i=1}^{C} \sum_{j=1}^{N_i} \left( \Phi_k(X_j^{(i)}) - M_{\Phi_k}^{(i)} \right) \left( \Phi_k(X_j^{(i)}) - M_{\Phi_k}^{(i)} \right)^T.$$

It is easy to see that, for $k = 1, 2$ respectively, the optimization criterion (4) is a special case of that of the KDDA method [11], and can be solved within the KDDA framework.

III. FUZZY FUSION OF KBFC

Our kernel bilinear Fisher criterion (KBFC) pursues two kernel-based subspaces in order to give full play to the matrix structure of images. Our face recognition method therefore consists of three steps. The first and second steps separately run the KDDA algorithm to optimize criterion (4) for $k = 1$ and $k = 2$. The two separate procedures yield two separate results, which are combined in a score-level fusion, in the final step, to perform classification.

The score-level fusion criterion is as follows. Given $N$ images belonging to $C$ subjects, denote the membership degree of the $n$-th image with respect to the $c$-th subject by $\lambda_{cn}$, $n = 1, 2, \ldots, N$, $c = 1, 2, \ldots, C$. It is obtained as
$$\lambda_{cn} = \sum_{k=1}^{2} \alpha_k s_{cn}(k), \qquad (5)$$
where $s_{cn}(k)$ denotes the score of the $n$-th image with regard to the $c$-th subject, obtained from the optimization of KBFC, and $\alpha_k$ is the corresponding weight, $k = 1, 2$. The $n$-th image is assigned to the $I$-th subject if and only if $\lambda_{In} = \max_{1 \le c \le C} \lambda_{cn}$.

We now explain the score $s_{cn}(k)$, i.e., the score of the $n$-th image with regard to the $c$-th subject, obtained by optimizing criterion (4) with KDDA for $k = 1, 2$, respectively. For the $n$-th image $X_n$, we compute the squared kernel distance between $X_n$ and the center $M_c$ of the $c$-th class:
$$d_{cn} = \left\| \Phi(X_n) - \Phi(M_c) \right\|_F^2, \qquad (6)$$
where $\|\cdot\|_F$ denotes the Frobenius norm. We then sort the $C$ squared distances $\{d_{cn} \mid c = 1, 2, \ldots, C\}$ in ascending order and denote the sorted values by $\{d_n^{i(c)} \mid c = 1, 2, \ldots, C\}$, i.e., $i(c)$ is the index of the class whose squared kernel distance ranks $c$-th smallest. Letting
$$s_n^{i(c)}(k) = C - c, \quad c = 1, 2, \ldots, C, \qquad (7)$$
we obtain $\{s_{cn}(k) \mid c = 1, 2, \ldots, C\}$. Thus, for the $n$-th image, the smaller the kernel distance between it and the center of the $c$-th class, the greater $s_{cn}(k)$ is, $c = 1, 2, \ldots, C$.

When recognizing images with the proposed method, the Gaussian kernel parameter is optimized separately for the two kernel-based subspaces. The two results are then gathered to compute the membership degree of the image with respect to every subject according to Eq. (5). Finally, the image is assigned to the subject with the greatest membership degree.

IV. EXPERIMENTAL RESULTS

To evaluate the performance of our method for face recognition, we compare it experimentally with KDDA [11] and K2DFDA [13], based on the Gaussian RBF kernel, in terms of recognition accuracy. Images are drawn from two face databases, namely the ORL and UMIST databases.

A. Face image datasets

In our experiments, we use two standard face recognition databases that are widely used as benchmark datasets in the feature extraction literature.

The ORL face database: There are ten images for each of 40 human subjects, taken at different times and varying in lighting, facial expression and facial details. Images of one subject are shown in Fig. 1. The original images (with 256 gray levels) have size 92 × 112 and are resized to 23 × 28 to reduce the computational load. To test the recognition performance with different numbers of training samples on ORL, $Tr$ ($2 \le Tr \le 5$) images of each subject are randomly selected for training and the remaining $(10 - Tr)$ images are used for testing.

Figure 1. Images of one person from the ORL face database

The UMIST face database: It consists of 575 gray-scale images of 20 subjects, each covering a wide range of poses from profile to frontal views, as well as variation in race, gender and appearance. Images of one subject are shown in Fig. 2. In our experiments, the 92 × 112 images are likewise resized to 23 × 28 for efficiency. To test the recognition performance with different numbers of training samples on UMIST, $Tr$ ($2 \le Tr \le 5$) images of each subject are randomly selected for training, i.e., the training set contains $20\,Tr$ images in all; the remaining $(575 - 20\,Tr)$ images form the test set.

Figure 2. Images of one person from the UMIST face database

B. Recognition results

Fig. 3 shows the recognition results of KDDA, K2DFDA, and the proposed method for $Tr = 2, 3, 4, 5$ on the ORL and UMIST databases, where each result is obtained with the optimized kernel parameters. All results are average recognition accuracies over 10 random runs. From Fig. 3, it can be seen that the proposed method achieves higher accuracies than KDDA and K2DFDA. Moreover, as the number of training samples increases, the proposed method maintains a significant advantage over KDDA, compared with K2DFDA.

Figure 3. Comparison of recognition performance on ORL and UMIST: recognition rate (%) versus number of training images per subject, for KDDA, K2DFDA and the proposed method

V. CONCLUSIONS

In this paper, we propose a novel kernel-based image feature extraction method for face recognition, based on a matrix representation of face images. We first propose a kernel-based discriminant tensor criterion, called the kernel bilinear Fisher criterion (KBFC), which simultaneously pursues two projection vectors, each maximizing the interclass scatter while minimizing the intraclass scatter in its corresponding subspace. We then present a score-level fusion method that combines the two separate projection results for classification. The proposed method yields higher recognition accuracies on the ORL and UMIST face databases than KDDA and K2DFDA.

ACKNOWLEDGMENT

This work is partially supported by the National Natural Science Foundation of China under grant No. 61562017, and by the Scientific Research Foundation of Hainan University (Project No. kyqd1443).

REFERENCES

[1] Turk M, Pentland A. Eigenfaces for Recognition. J. Cogn. Neurosci., 1991, 3(1): 71-86.
[2] Belhumeur P N, Hespanha J P, Kriegman D J. Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection. IEEE Trans. Pattern Anal. Mach. Intell., 1997, 19(7): 711-720.
[3] Fukunaga K. Introduction to Statistical Pattern Recognition. Academic Press, 1991.
[4] Zheng W S, Lai J H, Li S Z. 1D-LDA vs. 2D-LDA: When Is Vector-based Linear Discriminant Analysis Better than Matrix-based? Pattern Recognition, 2008, 41(7): 2156-2172.
[5] Ye J. Generalized Low Rank Approximations of Matrices. Proc. 21st International Conference on Machine Learning, Banff, Canada, July 2004.
[6] Ye J, Janardan R, Li Q. Two-dimensional Linear Discriminant Analysis. Advances in Neural Information Processing Systems 17, 2005: 1569-1576.
[7] Yan S C, Xu D, Yang Q, Zhang L, Tang X O, Zhang H J. Multilinear Discriminant Analysis for Face Recognition. IEEE Trans. Image Process., 2007, 16(1): 212-220.
[8] Schölkopf B, Smola A, Müller K R. Nonlinear Component Analysis as a Kernel Eigenvalue Problem. Neural Comput., 1998, 10(5): 1299-1319.
[9] Mika S, Rätsch G, Weston J, Schölkopf B, Müller K R. Fisher Discriminant Analysis with Kernels. IEEE Workshop on Neural Networks for Signal Processing IX, Madison, 1999: 41-48.
[10] Baudat G, Anouar F. Generalized Discriminant Analysis Using a Kernel Approach. Neural Comput., 2000, 12(10): 2385-2404.
[11] Lu J W, Plataniotis K N, Venetsanopoulos A N. Face Recognition Using Kernel Direct Discriminant Analysis Algorithms. IEEE Trans. Neural Netw., 2003, 14(1): 117-126.
[12] Kolda T G. Orthogonal Tensor Decompositions. SIAM J. Matrix Anal. Appl., 2001, 23(1): 243-255.
[13] Liu X Z, Wang P S P, Feng G C. Kernel Based 2D Fisher Discriminant Analysis with Parameter Optimization for Face Recognition. Int. J. Pattern Recognit. Artif. Intell., 2013, 27(8).
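To make the bilinear Fisher criterion of Section II concrete, the following sketch implements its linear analogue (i.e., with $\Phi_k$ taken as the identity map, essentially a 2DLDA-style criterion rather than the full KDDA solver the paper uses): it builds mode-wise between-class and within-class scatter matrices so that $v^T S_b v / v^T S_w v$ equals the ratio in criterion (4), and solves the generalized eigenproblem. All function names here are illustrative, not from the paper.

```python
import numpy as np

def mode_scatters(X, y, mode=1):
    """Between/within scatter matrices for mode-k projections of image matrices.

    X: array of shape (N, m1, m2); y: class labels of length N.
    For mode 1, a projection vector v in R^m1 acts on an image as X^T v
    (the 1-mode product), so v^T Sb v = sum_i Ni ||(M_i - M)^T v||^2 and
    v^T Sw v = sum_{i,j} ||(X_ij - M_i)^T v||^2, the linear analogue of (4).
    """
    if mode == 2:                       # row mode: transpose so the code is shared
        X = X.transpose(0, 2, 1)
    M = X.mean(axis=0)                  # global mean matrix
    m = X.shape[1]
    Sb = np.zeros((m, m))
    Sw = np.zeros((m, m))
    for c in np.unique(y):
        Xc = X[y == c]
        Mc = Xc.mean(axis=0)            # class mean matrix, Eq. (1) analogue
        D = Mc - M
        Sb += len(Xc) * D @ D.T
        for Xi in Xc:
            E = Xi - Mc
            Sw += E @ E.T
    return Sb, Sw

def top_direction(Sb, Sw, reg=1e-6):
    """Maximize the Rayleigh quotient of (4): top eigenvector of Sw^{-1} Sb."""
    Sw_reg = Sw + reg * np.eye(len(Sw))     # small ridge for numerical safety
    w, V = np.linalg.eig(np.linalg.solve(Sw_reg, Sb))
    return np.real(V[:, np.argmax(np.real(w))])
```

A kernelized version would replace each image column/row by its feature-space image and solve the same quotient inside the KDDA framework, as described in Section II.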
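The squared kernel distance of Eq. (6) never needs the mapped images explicitly: by the kernel trick, $\|\Phi(x)-\Phi(y)\|^2 = k(x,x) - 2k(x,y) + k(y,y)$. The sketch below applies this column-by-column with a Gaussian RBF kernel (for which $k(x,x)=1$). Comparing $\Phi(X_n)$ with $\Phi$ applied to the class-mean matrix $M_c$ column-wise is this sketch's assumption about how Eq. (6) is evaluated; the paper does not spell that detail out.

```python
import numpy as np

def rbf(x, y, gamma=1.0):
    """Gaussian RBF kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    return np.exp(-gamma * np.sum((x - y) ** 2))

def kernel_sq_dist(Xn, Mc, gamma=1.0):
    """Eq. (6) via the kernel trick, column by column:
    ||Phi(Xn) - Phi(Mc)||_F^2 = sum_t [k(a_t,a_t) - 2 k(a_t,b_t) + k(b_t,b_t)],
    where a_t, b_t are the t-th columns of Xn and Mc; for RBF, k(x,x) = 1."""
    return sum(2.0 - 2.0 * rbf(Xn[:, t], Mc[:, t], gamma)
               for t in range(Xn.shape[1]))
```

The distance is zero for identical matrices and grows monotonically with the column-wise Euclidean gaps, which is what the ranking step of Section III relies on.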
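The score-level fusion of Section III (Eqs. (5)–(7)) reduces to: rank the class distances per image, convert ranks to scores $C - c$, and take a weighted sum over the two subspaces. A minimal sketch, with function names of my own choosing:

```python
import numpy as np

def rank_scores(dists):
    """Eq. (7): convert squared kernel distances to rank-based scores.

    dists: array of shape (N, C), distance of each image to each class centre.
    The nearest class receives score C-1, the farthest receives 0.
    """
    order = np.argsort(dists, axis=1)               # ascending: nearest first
    N, C = dists.shape
    scores = np.empty_like(dists, dtype=float)
    rows = np.arange(N)[:, None]
    scores[rows, order] = C - 1 - np.arange(C)      # sorted position c -> C - c
    return scores

def fuse_and_classify(d1, d2, alpha=(0.5, 0.5)):
    """Eq. (5): lambda_cn = a1*s_cn(1) + a2*s_cn(2); assign argmax class."""
    lam = alpha[0] * rank_scores(d1) + alpha[1] * rank_scores(d2)
    return lam.argmax(axis=1)
```

For example, an image whose column-subspace distances favour class 1 and whose row-subspace distances also rank class 1 near the top will receive the largest fused membership degree for class 1.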
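The evaluation protocol of Section IV (randomly pick $Tr$ training images per subject, test on the rest, average over random runs) can be sketched as follows; the helper name is illustrative:

```python
import numpy as np

def per_subject_split(y, tr, rng):
    """Randomly select `tr` training indices per subject; the rest form the test set,
    matching the ORL/UMIST protocol of Section IV."""
    train, test = [], []
    for c in np.unique(y):
        idx = rng.permutation(np.where(y == c)[0])
        train.extend(idx[:tr].tolist())
        test.extend(idx[tr:].tolist())
    return np.array(train), np.array(test)
```

Repeating the split with fresh randomness and averaging the resulting accuracies gives the 10-run averages reported in Fig. 3; e.g., for ORL (40 subjects × 10 images) with $Tr = 3$, each run trains on 120 images and tests on 280.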
