The sure independence screening procedure, which ranks variables by their marginal Pearson correlation with the response, is well documented in the literature and works satisfactorily in the ultra-high dimensional case. However, this marginal Pearson correlation learning can easily miss a variable that is marginally uncorrelated with the response but correlated with it jointly with some other variables. This failure arises because the marginal Pearson correlation does not use the joint information of the response and a set of covariates. In this paper, we introduce a new screening method that retains a variable in the active set if, jointly with some other variables, it has a high canonical correlation with the response. This is accomplished by ranking the canonical correlations between the response and all possible sets of k variables. Our results show that the procedure has the sure screening property and substantially reduces the dimensionality to a moderate size relative to the sample size. Extensive simulations demonstrate that our new method performs substantially better than the existing sure independence screening approaches based on the marginal Pearson correlation or Kendall's tau rank correlation. A real data set is also analyzed using our approach.
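The idea of ranking canonical correlations over sets of k variables can be illustrated with a minimal Python sketch for the special case k = 2 and a scalar response. It is not the paper's implementation. With a scalar response, the canonical correlation with a pair of covariates equals the multiple correlation coefficient, which has a closed form in the three pairwise Pearson correlations. The function names (`pearson`, `pair_canonical_corr`, `pairwise_screen`), the rule of scoring each variable by its best pair, and the simulated data are all assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt
import random


def pearson(u, v):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / sqrt(suu * svv)


def pair_canonical_corr(r_ya, r_yb, r_ab):
    """Canonical correlation between a scalar y and the pair (x_a, x_b).

    For a scalar response this is the multiple correlation coefficient.
    Assumes x_a and x_b are not perfectly collinear (|r_ab| < 1).
    """
    r2 = (r_ya ** 2 + r_yb ** 2 - 2 * r_ya * r_yb * r_ab) / (1 - r_ab ** 2)
    return sqrt(max(0.0, min(1.0, r2)))  # clamp against rounding error


def pairwise_screen(columns, y, n_keep):
    """Hypothetical rule: score each variable by the largest canonical
    correlation over all pairs containing it, then keep the top n_keep."""
    p = len(columns)
    r_y = [pearson(c, y) for c in columns]
    scores = [0.0] * p
    for a, b in combinations(range(p), 2):
        r = pair_canonical_corr(r_y[a], r_y[b], pearson(columns[a], columns[b]))
        scores[a] = max(scores[a], r)
        scores[b] = max(scores[b], r)
    ranked = sorted(range(p), key=lambda j: -scores[j])
    return ranked[:n_keep], r_y, scores


# Simulated data (an assumption for illustration): x0 is marginally
# uncorrelated with y in the population, yet matters jointly with x1.
rng = random.Random(0)
n, p = 2000, 8
x1 = [rng.gauss(0, 1) for _ in range(n)]
x0 = [0.5 * a + sqrt(0.75) * rng.gauss(0, 1) for a in x1]  # corr(x0, x1) = 0.5
noise_cols = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(p - 2)]
y = [a - 2 * b + 0.5 * rng.gauss(0, 1) for a, b in zip(x0, x1)]  # cov(x0, y) = 0

columns = [x0, x1] + noise_cols
selected, marginal, scores = pairwise_screen(columns, y, n_keep=2)
print("marginal correlation of x0 with y:", round(marginal[0], 3))
print("variables kept by pairwise screening:", selected)
```

In this toy design, a marginal ranking would place x0 among the noise variables, while the pairwise score picks it up through its pairing with x1. For general k, the closed form would have to be replaced by a least-squares or eigenvalue computation, and the number of candidate sets grows combinatorially with k.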
Scopus Subject Areas
- Statistics and Probability
- Statistics, Probability and Uncertainty
- Group screening
- High dimension