Identification of differentially expressed genes with multivariate outlier analysis

Hong Ya Zhao, Patrick Y K YUE, Kai Tai Fang*

*Corresponding author for this work

Research output: Contribution to journal › Journal article › peer-review

12 Citations (Scopus)


DNA microarrays offer a powerful and effective technology for monitoring changes in the expression levels of thousands of genes simultaneously. They are widely applied to explore quantitative alterations in gene regulation in response to a variety of factors, including disease and exposure to toxicants. A common task in analyzing microarray data is to identify the genes that are differentially expressed under two different experimental conditions. Because of the large number of genes, the small number of arrays, and the low signal-to-noise ratio of microarray data, many traditional approaches seem inappropriate. In this paper, a multivariate mixture model is applied to the expression levels of replicated arrays, treating the differentially expressed genes as outliers of the expression data. To detect the outliers of the multivariate mixture model, an effective and robust statistical method is applied to microarray analysis for the first time. This method identifies outliers by analyzing the kurtosis coefficient (KC) of projections of the multivariate data arising from a mixture model. We apply the multivariate KC algorithm to our microarray experiment with control and toxic treatment conditions. After data processing, the differentially expressed genes are successfully identified among the 1824 genes on the UCLA M07 microarray chip. We also use the RT-PCR method and two robust statistical methods, minimum covariance determinant (MCD) and minimum volume ellipsoid (MVE), to verify the expression levels of the outlier genes identified by the KC algorithm. We conclude that the robust multivariate tool is practical and effective for the detection of differentially expressed genes.
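The idea of finding outliers through the kurtosis of projected data can be illustrated with a minimal sketch. It is not the KC algorithm of the paper: it assumes only two dimensions (one control and one treatment log-expression value per gene), replaces any optimization step with a plain grid search over directions, and flags genes with a median/MAD cutoff chosen here for illustration. The function names, the cutoff of 5, and the synthetic data are all hypothetical.

```python
import math
import random
import statistics


def kurtosis(values):
    """Kurtosis coefficient: fourth central moment over squared variance."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / (m2 ** 2)


def project(points, angle):
    """Project 2-D points onto the unit vector at the given angle."""
    c, s = math.cos(angle), math.sin(angle)
    return [x * c + y * s for x, y in points]


def max_kurtosis_direction(points, n_angles=180):
    """Grid search over [0, pi) for the direction with the largest kurtosis.

    A small cluster of outliers inflates the kurtosis of the projection
    that separates it from the bulk of the data.
    """
    angles = [math.pi * k / n_angles for k in range(n_angles)]
    return max(angles, key=lambda a: kurtosis(project(points, a)))


def flag_outliers(points, cutoff=5.0):
    """Return indices whose robust z-score along that direction exceeds cutoff.

    The median/MAD rule and the cutoff are assumptions of this sketch.
    """
    proj = project(points, max_kurtosis_direction(points))
    med = statistics.median(proj)
    mad = statistics.median([abs(p - med) for p in proj])
    scale = 1.4826 * mad  # makes the MAD comparable to a normal std. dev.
    return [i for i, p in enumerate(proj) if abs(p - med) / scale > cutoff]


# Synthetic data (not from the paper): 500 unchanged genes followed by
# 10 genes whose treatment value is shifted upward by 3 log units.
rng = random.Random(0)
points = []
for i in range(510):
    control = rng.gauss(8.0, 1.5)
    shift = 3.0 if i >= 500 else 0.0
    points.append((control, control + shift + rng.gauss(0.0, 0.2)))

flagged = flag_outliers(points)
print(flagged)
```

In this toy setting the selected direction lies close to the treatment-minus-control axis, so the flagged indices should correspond to the shifted genes. With more than two replicated arrays a grid search is no longer practical, and a numerical search for directions would be needed instead.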

Original language: English
Pages (from-to): 629-646
Number of pages: 18
Journal: Journal of Biopharmaceutical Statistics
Issue number: 3
Publication status: Published - 2004

Scopus Subject Areas

  • Statistics and Probability
  • Pharmacology
  • Pharmacology (medical)

User-Defined Keywords

  • cDNA microarray
  • Gene expression data
  • Kurtosis coefficient (KC)
  • Mahalanobis distance (MD)
  • Mixture model
  • Multivariate outlier

