Abstract
An exhaustive search, as required by traditional variable selection methods, is impractical in high dimensional statistical modeling. To conduct variable selection, various forms of penalized estimators with good statistical and computational properties have therefore been proposed during the past two decades. The attractive properties of these shrinkage and selection estimators, however, depend critically on the size of the regularization, which controls model complexity. In this paper, we consider the problem of consistent tuning parameter selection in high dimensional sparse linear regression where the dimension of the predictor vector is larger than the sample size. First, we propose a family of high dimensional Bayesian Information Criteria (HBIC) and investigate their selection consistency, extending the results for the extended Bayesian Information Criterion (EBIC) of Chen and Chen (2008) to ultra-high dimensional situations. Second, we develop a two-step procedure, the SIS+AENET, to conduct variable selection in p>n situations. The consistency of tuning parameter selection is established under fairly mild technical conditions. Simulation studies are presented to confirm the theoretical findings, and an empirical example illustrates the method on internet advertising data.
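As a rough illustration of the two ideas named in the abstract, the sketch below ranks predictors by absolute marginal correlation with the response (the screening idea behind SIS) and scores a candidate model with a BIC-type criterion whose penalty grows with log p. The abstract does not state the HBIC formula, so the penalty form log(RSS/n) + 2·gamma·k·log(p)/n, the function names, and the parameter `gamma` are assumptions made for illustration, not the paper's definition. The adaptive elastic net step is omitted.

```python
import math


def marginal_correlations(X, y):
    """Absolute Pearson correlation between each column of X and y.

    X is a list of n rows, each a list of p numbers; y is a list of n numbers.
    """
    n = len(y)
    p = len(X[0])
    y_mean = sum(y) / n
    y_dev = [v - y_mean for v in y]
    y_norm = math.sqrt(sum(d * d for d in y_dev))
    scores = []
    for j in range(p):
        col = [row[j] for row in X]
        c_mean = sum(col) / n
        c_dev = [v - c_mean for v in col]
        c_norm = math.sqrt(sum(d * d for d in c_dev))
        if c_norm == 0.0 or y_norm == 0.0:
            # A constant column (or response) carries no marginal signal.
            scores.append(0.0)
            continue
        cov = sum(a * b for a, b in zip(c_dev, y_dev))
        scores.append(abs(cov) / (c_norm * y_norm))
    return scores


def screen(X, y, d):
    """Indices (ascending) of the d predictors with the largest marginal correlation."""
    scores = marginal_correlations(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:d])


def bic_type_criterion(rss, n, p, k, gamma=1.0):
    """Assumed BIC-type score: log(RSS/n) + 2*gamma*k*log(p)/n.

    rss: residual sum of squares of the candidate model
    n:   sample size, p: number of predictors, k: candidate model size
    """
    return math.log(rss / n) + 2.0 * gamma * k * math.log(p) / n


# Toy usage: the response is an exact multiple of the second column.
X = [[1, 5, 2], [2, 1, 2], [3, 4, 1], [4, 2, 3]]
y = [10, 2, 8, 4]
kept = screen(X, y, d=1)
score = bic_type_criterion(rss=10.0, n=10, p=100, k=2)
```

In a two-step workflow of this kind, a penalized fit would then be run only on the columns in `kept`, and the tuning parameter whose fitted model gives the smallest criterion value would be chosen. Because the penalty scales with log p rather than log n, larger models are penalized more heavily as the number of predictors grows.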
Original language | English |
---|---|
Pages (from-to) | 1141-1151 |
Number of pages | 11 |
Journal | Journal of Multivariate Analysis |
Volume | 102 |
Issue number | 7 |
Publication status | Published - Aug 2011 |
Scopus Subject Areas
- Statistics and Probability
- Numerical Analysis
- Statistics, Probability and Uncertainty
User-Defined Keywords
- Adaptive Elastic Net
- Bayesian information criterion
- High dimensionality
- Sure independence screening
- Tuning parameter selection
- Variable selection