TY - JOUR
T1 - Block-diagonal discriminant analysis and its bias-corrected rules
AU - Pang, Herbert
AU - Tong, Tiejun
AU - Ng, Kwok Po
N1 - Funding Information:
Acknowledgments: Herbert Pang’s research was supported in part by the National Institutes of Health under Award P01CA142538 and by funds from DUMC. Tiejun Tong’s research was supported in part by the Hong Kong Research Grants Council under Grant 202711 and by HKBU FRGs. Michael Ng’s research was supported in part by the Hong Kong Research Grants Council under Grant 201508 and by HKBU FRGs. The authors are grateful to the editor, the associate editor, and two reviewers for their constructive comments and suggestions, which have led to a substantial improvement in the article.
PY - 2013/6
Y1 - 2013/6
N2 - High-throughput expression profiling allows the simultaneous measurement of tens of thousands of genes. These data have motivated the development of reliable biomarkers for disease subtype identification and diagnosis. Many methods have been developed in the literature for analyzing these data, such as diagonal discriminant analysis, support vector machines, and k-nearest neighbor methods. The diagonal discriminant methods have been shown to perform well for high-dimensional data with small sample sizes. Despite their popularity, the independence assumption they rely on is unlikely to hold in practice. Recently, a gene-module-based linear discriminant analysis strategy has been proposed that utilizes the correlation among genes in discriminant analysis. However, the approach can be underpowered when the sample sizes of the two classes are unbalanced. In this paper, we propose to correct the biases in the discriminant scores of block-diagonal discriminant analysis. In simulation studies, our proposed method outperforms other approaches in various settings. We also illustrate our proposed discriminant analysis method by applying it to microarray data studies.
AB - High-throughput expression profiling allows the simultaneous measurement of tens of thousands of genes. These data have motivated the development of reliable biomarkers for disease subtype identification and diagnosis. Many methods have been developed in the literature for analyzing these data, such as diagonal discriminant analysis, support vector machines, and k-nearest neighbor methods. The diagonal discriminant methods have been shown to perform well for high-dimensional data with small sample sizes. Despite their popularity, the independence assumption they rely on is unlikely to hold in practice. Recently, a gene-module-based linear discriminant analysis strategy has been proposed that utilizes the correlation among genes in discriminant analysis. However, the approach can be underpowered when the sample sizes of the two classes are unbalanced. In this paper, we propose to correct the biases in the discriminant scores of block-diagonal discriminant analysis. In simulation studies, our proposed method outperforms other approaches in various settings. We also illustrate our proposed discriminant analysis method by applying it to microarray data studies.
KW - Bias-correction
KW - Block-diagonal
KW - Classification
KW - High-dimensional data
KW - Linear discriminant analysis
UR - http://www.scopus.com/inward/record.url?scp=84881622680&partnerID=8YFLogxK
U2 - 10.1515/sagmb-2012-0017
DO - 10.1515/sagmb-2012-0017
M3 - Journal article
C2 - 23735433
AN - SCOPUS:84881622680
SN - 1544-6115
VL - 12
SP - 347
EP - 359
JO - Statistical Applications in Genetics and Molecular Biology
JF - Statistical Applications in Genetics and Molecular Biology
IS - 3
ER -