TY - JOUR
T1 - Fast Algorithms for the Generalized Foley–Sammon Discriminant Analysis
AU - Zhang, Lei Hong
AU - Liao, Li Zhi
AU - Ng, Michael K.
N1 - Funding information:
*Corresponding author. Department of Mathematics, Hong Kong Baptist University, Kowloon Tong, Kowloon, Hong Kong, People’s Republic of China. The work of this author was partially supported by FRG grants from Hong Kong Baptist University and the Research Grants Council of Hong Kong.
§ Department of Mathematics, Hong Kong Baptist University, Kowloon Tong, Kowloon, Hong Kong, People’s Republic of China. The work of this author was partially supported by RGC grants 201508 and HKBU FRGs.
Publisher copyright:
Copyright © 2010 Society for Industrial and Applied Mathematics
PY - 2010/2/17
Y1 - 2010/2/17
AB - Linear discriminant analysis (LDA) is one of the most popular approaches to feature extraction and dimension reduction for overcoming the curse of dimensionality in high-dimensional data, with many applications in data mining, machine learning, and bioinformatics. In this paper, we make two main contributions to an important LDA scheme, the generalized Foley–Sammon transform (GFST) [Foley and Sammon, IEEE Trans. Comput., 24 (1975), pp. 281–289; Guo et al., Pattern Recognition Lett., 24 (2003), pp. 147–158], also formulated as a trace ratio model [Wang et al., Proceedings of the International Conference on Computer Vision and Pattern Recognition, 2007, pp. 1–8], and to its regularized version (RGFST), which handles the undersampled problem: a small sample size n combined with a large number of features N (N > n), a situation that arises frequently in modern applications. Our first main result establishes an equivalent reduced model for the RGFST that effectively reduces the computational overhead. The iteration method proposed by Wang et al. is applied to solve the GFST or the reduced RGFST. Wang et al. proved that this iteration converges globally, and fast convergence has been observed numerically, but no theoretical analysis of the convergence rate has been available thus far. Our second main contribution supplies this missing piece by proving quadratic convergence, even under two kinds of inexact computations. Practical implementation issues, including computational complexity and storage requirements, are also discussed. Our experimental results on several real-world data sets indicate the efficiency of the algorithm and the advantages of the GFST model in classification.
KW - Dimension reduction
KW - Foley–Sammon transform
KW - Global convergence
KW - Linear discriminant analysis
KW - Quadratic convergence
KW - Regularization
UR - http://www.scopus.com/inward/record.url?scp=77955675408&partnerID=8YFLogxK
U2 - 10.1137/080720863
DO - 10.1137/080720863
M3 - Journal article
AN - SCOPUS:77955675408
SN - 0895-4798
VL - 31
SP - 1584
EP - 1605
JO - SIAM Journal on Matrix Analysis and Applications
JF - SIAM Journal on Matrix Analysis and Applications
IS - 4
ER -