Goodness-of-fit testing-based selection for large-p-small-n problems: A two-stage ranking approach

Xiaobo Ding, Lexin Li, Lixing ZHU*

*Corresponding author for this work

Research output: Contribution to journal › Journal article › peer-review

1 Citation (Scopus)

Abstract

In this paper, we investigate two-stage ranking-selection procedures for ultra-high dimensional data in the framework of goodness-of-fit testing. In the first stage, we develop a k-step marginal F-test (MFTk) screening. As a statistic, the MFT1 is equivalent to the one used in sure independence screening (SIS), proposed by Fan and Lv (2008). The MFTk with k≥2 improves on the MFT1 mainly by better handling correlations among predictors. To select a more parsimonious working model in the first stage, we propose a soft threshold cutoff based on sequential goodness-of-fit testing. This avoids some drawbacks of the hard threshold cutoff in Fan and Lv (2008) and of the extended BIC used in Wang (2009). In the second stage, we develop a one-step backward screening to further remove insignificant predictors from the model. Further, as with the iterative SIS, we provide iterative versions of the proposed procedures to achieve more accurate variable selection. Extensive numerical studies and a real data analysis are carried out to examine the performance of the proposed procedures.
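
The stated equivalence between the one-step marginal F-test and SIS can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `marginal_f_statistics`, the simulated data, and the dimensions are all assumptions made for illustration. It relies only on the standard identity that the F-statistic of a simple linear regression with intercept equals (n - 2) r² / (1 - r²), where r is the sample correlation, so ranking predictors by marginal F-statistic matches ranking them by absolute marginal correlation, which is the quantity SIS ranks on when predictors are standardized.

```python
import numpy as np


def marginal_f_statistics(X, y):
    """F-statistic for the slope in the simple regression of y on each column of X.

    Hypothetical helper for illustration; each regression includes an intercept.
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)  # center each predictor
    yc = y - y.mean()        # center the response
    # Squared sample correlation between each predictor and the response.
    r2 = (Xc.T @ yc) ** 2 / ((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    # F-test of "slope = 0" with 1 and n - 2 degrees of freedom.
    return (n - 2) * r2 / (1.0 - r2)


# Simulated large-p-small-n data (assumed setup): only the first two predictors matter.
rng = np.random.default_rng(0)
n, p = 50, 500
X = rng.standard_normal((n, p))
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + rng.standard_normal(n)

f_stats = marginal_f_statistics(X, y)

# Absolute marginal correlations, computed independently for comparison.
abs_corr = np.abs(np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(p)]))

# Predictor indices ordered from strongest to weakest marginal signal.
rank_by_f = np.argsort(-f_stats)
rank_by_corr = np.argsort(-abs_corr)
```

Because the F-statistic is a strictly increasing function of the absolute correlation, `rank_by_f` and `rank_by_corr` order the predictors the same way, up to floating-point ties. The k-step version, the soft threshold cutoff, and the backward screening described in the abstract are not covered by this sketch.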

Original language: English
Pages (from-to): 148-164
Number of pages: 17
Journal: Journal of Statistical Planning and Inference
Volume: 145
Publication status: Published - Feb 2014

Scopus Subject Areas

  • Statistics and Probability
  • Statistics, Probability and Uncertainty
  • Applied Mathematics

User-Defined Keywords

  • Backward screening
  • Linear model
  • Marginal effect
  • Sequential goodness-of-fit testing
