TY - JOUR
T1 - Rank-Based Greedy Model Averaging for High-Dimensional Survival Data
AU - He, Baihua
AU - Ma, Shuangge
AU - Zhang, Xinyu
AU - Zhu, Lixing
N1 - Shuangge Ma was financially supported by the National Institutes of Health (CA204120). Xinyu Zhang gratefully acknowledges research support from the National Natural Science Foundation of China (71925007, 72091212, and 71988101), the CAS Project for Young Scientists in Basic Research (YSBR-008), and the Beijing Academy of Artificial Intelligence. Lixing Zhu was financially supported by grant HKBU12302720 from the Research Grants Council of Hong Kong and grant NSFC12131006 from the National Natural Science Foundation of China.
Publisher Copyright:
© 2022 American Statistical Association.
PY - 2023/12
Y1 - 2023/12
AB - Model averaging is an effective way to enhance prediction accuracy. However, most previous works focus on low-dimensional settings with completely observed responses. To attain an accurate prediction for the risk effect of survival data with high-dimensional predictors, we propose a novel method: rank-based greedy (RG) model averaging. Specifically, adopting the transformation model with splitting predictors as working models, we doubly use the smooth concordance index function to derive the candidate predictions and optimal model weights. The final prediction is achieved by weighted averaging of all the candidates. Our approach is flexible, computationally efficient, and robust against model misspecification, as it neither requires the correctness of a joint model nor involves the estimation of the transformation function. We further adopt the greedy algorithm for high dimensions. Theoretically, we derive an asymptotic error bound for the optimal weights under some mild conditions. In addition, the summation of weights assigned to the correct candidate submodels is proven to approach one in probability when there are correct models included among the candidate submodels. Extensive numerical studies are carried out using both simulated and real datasets to show the proposed approach’s robust performance compared to the existing regularization approaches. Supplementary materials for this article are available online.
KW - Greedy algorithm
KW - High-dimensional survival data
KW - Model averaging
KW - Prediction
KW - Smooth concordance index
UR - http://www.scopus.com/inward/record.url?scp=85133628062&partnerID=8YFLogxK
U2 - 10.1080/01621459.2022.2070070
DO - 10.1080/01621459.2022.2070070
M3 - Journal article
AN - SCOPUS:85133628062
SN - 0162-1459
VL - 118
SP - 2658
EP - 2670
JO - Journal of the American Statistical Association
JF - Journal of the American Statistical Association
IS - 544
ER -