High dimensional inference via kernel convolution smoothing

Project: Research project

Project Details


Technological advances have made high-dimensional data ubiquitous in many scientific fields. In the past decades, statisticians have witnessed the explosive development of high-dimensional statistical methods and theory, driven by the analysis of big data in fields such as biological science, medical science, public health, social sciences and finance. Statistical inference is a key tool for scientific discovery. Hypothesis testing under ultra-high dimensionality is important and very challenging. Existing work on high-dimensional inference is typically based on the least squares approach and the canonical regression model, and it mainly focuses on the conditional mean and light-tailed data. However, in real applications, quantities other than the conditional mean are often of interest. For instance, value at risk in econometrics and finance is studied through expectiles, which describe a general position in a distribution and include the mean as a special case. The existing theory in the literature cannot provide inference for expectiles in ultra-high dimensions. Meanwhile, heavy-tailed distributions are a stylized feature of high-dimensional data, as confirmed by many studies in the literature. In these cases, the canonical regression model has severe limitations and nonstandard regression is more powerful. The overall aim of this research project is to propose statistically and computationally efficient methods for inference in these nonstandard yet more realistic high-dimensional settings. In the first project of this proposal, we study the linear hypothesis testing problem for high-dimensional regression coefficients, with no moment condition imposed on the error distribution. We propose novel Wald and score test statistics based on the convoluted rank regression estimator, a powerful estimator that remains valid under heavy-tailed errors while adopting a smooth loss, which makes it computationally efficient.
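As a rough illustration of what a smooth rank-based loss can look like, the sketch below convolves the absolute value with a Gaussian kernel and averages it over pairwise residual differences. The Gaussian kernel, the bandwidth `h`, and the names `smoothed_abs` and `convoluted_rank_loss` are assumptions made for this example; the project description does not specify them, and this is not presented as the project's implementation.

```python
import math


def std_normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(x):
    """Distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def smoothed_abs(u, h):
    """Gaussian-kernel convolution of |u|: E|u - h*Z| with Z ~ N(0, 1).

    Closed form: u * (2*Phi(u/h) - 1) + 2*h*phi(u/h).
    Assumption: Gaussian kernel with bandwidth h > 0.
    """
    z = u / h
    return u * (2.0 * std_normal_cdf(z) - 1.0) + 2.0 * h * std_normal_pdf(z)


def convoluted_rank_loss(residuals, h):
    """Average smoothed absolute difference over all ordered pairs i != j."""
    n = len(residuals)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                total += smoothed_abs(residuals[i] - residuals[j], h)
    return total / (n * (n - 1))


if __name__ == "__main__":
    # Hypothetical residuals with one extreme value, as under heavy tails.
    example_residuals = [0.1, -0.4, 0.3, 25.0]
    print(convoluted_rank_loss(example_residuals, h=0.5))
```

Because the loss depends only on differences of residuals and grows linearly in them, a single extreme residual has bounded influence on the gradient, while the smoothing makes the objective differentiable everywhere.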
In the second project, we focus on the testing problem for high-dimensional expectile regression. Owing to technical challenges caused by the non-smoothness of the expectile loss, inference for expectile regression has not been studied in the literature. We adopt kernel convolution to smooth the expectile loss, and propose corresponding Wald and score tests for high-dimensional expectile regression. In both projects, we will prove that the proposed test statistics asymptotically follow chi-squared distributions with certain degrees of freedom. Power against local alternatives will also be investigated. We will conduct simulation studies and apply the proposed methods to analyze high-dimensional datasets from genomics, climate studies, econometrics and finance.
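To make the idea of kernel convolution smoothing concrete, here is a minimal sketch that smooths the asymmetric squared (expectile) loss with a Gaussian kernel, for which the convolution has a closed form. The kernel choice, the bandwidth `h`, and the function names are assumptions for illustration only; the project description does not state which kernel or bandwidth is used.

```python
import math


def std_normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(x):
    """Distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def expectile_loss(u, tau):
    """Asymmetric squared loss: tau*u^2 if u >= 0, else (1 - tau)*u^2."""
    weight = tau if u >= 0 else 1.0 - tau
    return weight * u * u


def smoothed_expectile_loss(u, tau, h):
    """Gaussian-kernel convolution of the expectile loss.

    Computes E[expectile_loss(W, tau)] with W = u - h*Z, Z ~ N(0, 1), using
      E[W^2]            = u^2 + h^2
      E[W^2 * 1(W >= 0)] = (u^2 + h^2)*Phi(u/h) + u*h*phi(u/h).
    Assumption: Gaussian kernel with bandwidth h > 0.
    """
    z = u / h
    second_moment = u * u + h * h
    upper_part = second_moment * std_normal_cdf(z) + u * h * std_normal_pdf(z)
    return (1.0 - tau) * second_moment + (2.0 * tau - 1.0) * upper_part


if __name__ == "__main__":
    # Compare the raw and smoothed losses near the origin.
    for u in (-0.2, 0.0, 0.2):
        print(u, expectile_loss(u, 0.9), smoothed_expectile_loss(u, 0.9, h=0.1))
```

The raw expectile loss is differentiable once, but its second derivative jumps at zero; the smoothed version is infinitely differentiable, which is the kind of regularity that Wald and score statistics typically rely on.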
Effective start/end date: 1/01/24 to 31/12/26

