TY - JOUR
T1 - Identification of differentially expressed genes with multivariate outlier analysis
AU - Zhao, Hong Ya
AU - YUE, Patrick Y K
AU - Fang, Kai Tai
N1 - Copyright:
Copyright 2008 Elsevier B.V., All rights reserved.
PY - 2004
Y1 - 2004
N2 - DNA microarray offers a powerful and effective technology to monitor the changes in the gene expression levels for thousands of genes simultaneously. It is being widely applied to explore the quantitative alternation in gene regulation in response to a variety of aspects including diseases and exposure of toxicant. A common task in analyzing microarray data is to identify the differentially expressed genes under two different experimental conditions. Because of the large number of genes and small number of arrays, and higher signal-noise ratio in microarray data, many traditional approaches seem improper. In this paper, a multivariate mixture model is applied to model the expression level of replicated arrays, considering the differentially expressed genes as the outliers of the expression data. In order to detect the outliers of the multivariate mixture model, an effective and robust statistical method is first applied to microarray analysis. This method is based on the analysis of - kurtosis coefficient (KC) of the projected multivariate data arising from a mixture model so as to identify the outliers. We utilize the multivariate KC algorithm to our microarray experiment with the control and toxic treatment. After the processing of data, the differential genes are successfully identified from 1824 genes on the UCLA M07 microarray chip. We also use the RT-PCR method and two robust statistical methods, minimum covariance determinant (MCD) and minimum volume ellipsoid (MVE), to verify the expression level of outlier genes identified by KC algorithm. We conclude that the robust multivariate tool is practical and effective for the detection of differentially expressed genes.
AB - DNA microarray offers a powerful and effective technology to monitor the changes in the gene expression levels for thousands of genes simultaneously. It is being widely applied to explore the quantitative alternation in gene regulation in response to a variety of aspects including diseases and exposure of toxicant. A common task in analyzing microarray data is to identify the differentially expressed genes under two different experimental conditions. Because of the large number of genes and small number of arrays, and higher signal-noise ratio in microarray data, many traditional approaches seem improper. In this paper, a multivariate mixture model is applied to model the expression level of replicated arrays, considering the differentially expressed genes as the outliers of the expression data. In order to detect the outliers of the multivariate mixture model, an effective and robust statistical method is first applied to microarray analysis. This method is based on the analysis of - kurtosis coefficient (KC) of the projected multivariate data arising from a mixture model so as to identify the outliers. We utilize the multivariate KC algorithm to our microarray experiment with the control and toxic treatment. After the processing of data, the differential genes are successfully identified from 1824 genes on the UCLA M07 microarray chip. We also use the RT-PCR method and two robust statistical methods, minimum covariance determinant (MCD) and minimum volume ellipsoid (MVE), to verify the expression level of outlier genes identified by KC algorithm. We conclude that the robust multivariate tool is practical and effective for the detection of differentially expressed genes.
KW - CDNA Microarray
KW - Gene expression data
KW - Kurtosis coefficient (KC)
KW - Mahalanobis distance (MD)
KW - Mixture model
KW - Multivariate outlier
UR - http://www.scopus.com/inward/record.url?scp=4644299293&partnerID=8YFLogxK
U2 - 10.1081/BIP-200025654
DO - 10.1081/BIP-200025654
M3 - Journal article
C2 - 15468756
AN - SCOPUS:4644299293
SN - 1054-3406
VL - 14
SP - 629
EP - 646
JO - Journal of Biopharmaceutical Statistics
JF - Journal of Biopharmaceutical Statistics
IS - 3
ER -