Greedy forward regression for variable screening

Ming Yen Cheng*, Sanying Feng, Gaorong Li, Heng Lian

*Corresponding author for this work

Research output: Contribution to journal › Journal article › peer-review

8 Citations (Scopus)

Abstract

In the ultra-high dimensional setting, two popular variable screening methods with the desirable sure screening property are sure independence screening (SIS) and forward regression (FR). Both are classical variable screening methods that have recently attracted greater attention in high-dimensional data analysis. We consider a new and simple screening method that incorporates multiple predictors at each step of forward regression, with decisions on which variables to incorporate based on the same criterion. If only one step is carried out, the new procedure reduces to SIS. Thus it can be regarded as a generalisation and unification of FR and SIS. More importantly, it preserves the sure screening property and has computational complexity similar to FR at each step, yet it can discover the relevant covariates in fewer steps. It therefore reduces the computational burden of FR drastically while retaining the advantages of the latter over SIS. Furthermore, we show that it can find all the true variables if the number of steps taken is the same as the correct model size, which is a new theoretical result even for the original FR. An extensive simulation study and application to two real data examples demonstrate excellent performance of the proposed method.
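The procedure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `greedy_forward_screening`, the parameters `n_steps` and `n_per_step`, and the choice of scoring each candidate by the reduction in residual sum of squares it would give when added alone to the current model are all assumptions made for illustration. The sketch also assumes the total number of selected variables stays below the sample size.

```python
import numpy as np

def greedy_forward_screening(X, y, n_steps, n_per_step):
    """Illustrative sketch: add the n_per_step best candidates at each step.

    Each candidate is scored by the drop in residual sum of squares (RSS)
    obtained by adding it alone to the currently selected set, which is
    the usual forward regression criterion (an assumption here).
    """
    p = X.shape[1]
    Xc = X - X.mean(axis=0)   # centre columns so no intercept is needed
    yc = y - y.mean()
    selected = []
    for _ in range(n_steps):
        if selected:
            # Project out the span of the already selected columns.
            Q, _ = np.linalg.qr(Xc[:, selected])
            r = yc - Q @ (Q.T @ yc)
            Xr = Xc - Q @ (Q.T @ Xc)
        else:
            r, Xr = yc, Xc
        norms = np.sum(Xr ** 2, axis=0)
        ok = norms > 1e-10    # skip columns already explained by the model
        score = np.zeros(p)
        # RSS reduction from adding column j: (x_j' r)^2 / ||x_j||^2,
        # with x_j residualised on the current model.
        score[ok] = (Xr[:, ok].T @ r) ** 2 / norms[ok]
        score[selected] = -np.inf
        top = np.argsort(score)[::-1][:n_per_step]
        selected.extend(int(j) for j in top)
    return selected
```

With `n_steps=1` the score is proportional to the squared marginal correlation between each predictor and the response, so the sketch ranks variables the same way as SIS; with `n_per_step=1` it behaves like ordinary forward regression.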

Original language: English
Pages (from-to): 20-42
Number of pages: 23
Journal: Australian and New Zealand Journal of Statistics
Volume: 60
Issue number: 1
Publication status: Published - Mar 2018

Scopus Subject Areas

  • Statistics and Probability
  • Statistics, Probability and Uncertainty

User-Defined Keywords

  • big data problems
  • high-dimensional statistical inference
  • model selection
  • variable selection
