TY - JOUR
T1 - Enveloped Huber Regression
AU - Zhou, Le
AU - Cook, R. Dennis
AU - Zou, Hui
N1 - Zhou’s research is supported in part by HKBU 162864 and 179424. Zou’s research is supported in part by NSF grants 1915842, 2015120 and 2220286. Publisher Copyright: © 2023 American Statistical Association.
PY - 2024/10/1
Y1 - 2024/10/1
AB - Huber regression (HR) is a popular, flexible alternative to least squares regression when the error follows a heavy-tailed distribution. We propose a new method called enveloped Huber regression (EHR) by considering the envelope assumption that there exists some subspace of the predictors that has no association with the response, which is referred to as the immaterial part. More efficient estimation is achieved by removing the immaterial part. Unlike the envelope least squares (ENV) model, whose estimation is based on maximum normal likelihood, the EHR model is estimated through the generalized method of moments. The asymptotic normality of the EHR estimator is established, and it is shown that EHR is more efficient than HR. Moreover, EHR is more efficient than ENV when the error distribution is heavy-tailed, while incurring only a small efficiency loss when the error distribution is normal. Our theory also covers the heteroscedastic case in which the error may depend on the covariates. The envelope dimension in EHR is a tuning parameter to be determined by the data in practice. We further propose a novel generalized information criterion (GIC) for dimension selection and establish its consistency. Extensive simulation studies confirm the messages from our theory. EHR is further illustrated on a real dataset. Supplementary materials for this article are available online.
KW - Asymptotic efficiency
KW - Envelope model
KW - Generalized information criterion
KW - Heavy-tailed distributions
KW - Huber regression
UR - http://www.scopus.com/inward/record.url?scp=85180236392&partnerID=8YFLogxK
U2 - 10.1080/01621459.2023.2277403
DO - 10.1080/01621459.2023.2277403
M3 - Journal article
AN - SCOPUS:85180236392
SN - 0162-1459
VL - 119
SP - 2722
EP - 2732
JO - Journal of the American Statistical Association
JF - Journal of the American Statistical Association
IS - 548
ER -