Ultrahigh dimensional time course feature selection

Peirong Xu, Lixing ZHU, Yi Li*

*Corresponding author for this work

Research output: Contribution to journal, Journal article, peer-review

15 Citations (Scopus)

Abstract

Statistical challenges arise from modern biomedical studies that produce time course genomic data with ultrahigh dimensions. In a renal cancer study that motivated this paper, the pharmacokinetic measures of a tumor suppressor (CCI-779) and expression levels of 12,625 genes were recorded for each of 33 patients at 8 and 16 weeks after the start of treatments, with the goal of identifying predictive gene transcripts and their interactions with time in peripheral blood mononuclear cells for pharmacokinetics over the time course. The resulting data set defies analysis even with regularized regression. Although some remedies have been proposed for both linear and generalized linear models, there are virtually no solutions in the time course setting. As such, a novel GEE-based screening procedure is proposed, which requires specifying only the first two marginal moments and a working correlation structure. Unlike existing methods that either fit separate marginal models or compute pairwise correlation measures, the new procedure involves only a single evaluation of estimating functions and is thus extremely computationally efficient. The new method is robust against mis-specification of correlation structures and enjoys theoretical readiness, which is further verified via Monte Carlo simulations. The procedure is applied to analyze the aforementioned renal cancer study and identify gene transcripts and possible time-interactions that are relevant to CCI-779 metabolism in peripheral blood.
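
The idea of screening by a single evaluation of estimating functions can be illustrated with a rough sketch. This is not the authors' implementation. It assumes a Gaussian response with an identity link and constant variance, a balanced design (every subject observed at the same time points), and evaluation of the estimating function at the null model with all coefficients set to zero. The function name `gee_screen` and the parameters `working_corr` and `top_k` are hypothetical names chosen for illustration.

```python
import numpy as np


def gee_screen(X, y, working_corr, top_k):
    """Rank covariates by one evaluation of a GEE-type estimating function.

    Illustrative sketch only (assumptions: identity link, constant
    variance, balanced design, evaluation at beta = 0).

    X            : array, shape (n, m, p), n subjects, m time points, p covariates
    y            : array, shape (n, m), responses
    working_corr : array, shape (m, m), symmetric working correlation matrix
    top_k        : number of covariates to retain (hypothetical tuning choice)
    """
    n, m, p = X.shape

    # Standardize each covariate over all subjects and time points.
    Xs = (X - X.mean(axis=(0, 1))) / X.std(axis=(0, 1))

    # Residuals under the null model: only an intercept is fitted.
    resid = y - y.mean()

    # Weight each subject's residual vector by the inverse working correlation.
    # R is symmetric, so row i of (resid @ R_inv) equals (R_inv @ resid_i)^T.
    R_inv = np.linalg.inv(working_corr)
    weighted = resid @ R_inv

    # Component j: (1/n) * sum_i X_ij^T R^{-1} resid_i, for all j at once.
    g = np.einsum("imp,im->p", Xs, weighted) / n

    # Keep the covariates with the largest absolute components.
    selected = np.argsort(-np.abs(g))[:top_k]
    return selected, g
```

No model is fitted per covariate: all p components come from one pass over the data, which is where the computational saving described in the abstract would come from. In practice the cutoff would be chosen by a rule suited to the study rather than a fixed `top_k`.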

Original language: English
Pages (from-to): 356-365
Number of pages: 10
Journal: Biometrics
Volume: 70
Issue number: 2
Publication status: Published - Jun 2014

Scopus Subject Areas

  • Statistics and Probability
  • Biochemistry, Genetics and Molecular Biology (all)
  • Immunology and Microbiology (all)
  • Agricultural and Biological Sciences (all)
  • Applied Mathematics

User-Defined Keywords

  • Correlated data
  • Generalized estimating equations
  • Longitudinal analysis
  • Sure screening property
  • Time course data
  • Ultrahigh dimensionality
  • Variable selection
