TY - JOUR
T1 - BIVAS: A Scalable Bayesian Method for Bi-Level Variable Selection With Applications
AU - Cai, Mingxuan
AU - Dai, Mingwei
AU - Ming, Jingsi
AU - Peng, Heng
AU - Liu, Jin
AU - Yang, Can
N1 - Funding Information:
This work was supported in part by the National Natural Science Foundation of China [61501389]; the Hong Kong Research Grants Council [22302815, 12316116, 12301417, and 16307818]; The Hong Kong University of Science and Technology [startup grant R9405 and IGN17SC02]; Duke-NUS Medical School WBS [R-913-200-098-263]; Ministry of Education, Singapore, AcRF Tier 2 [MOE2016-T2-2-029, MOE2018-T2-1-046, and MOE2018-T2-2-006].
Publisher copyright:
© 2019 American Statistical Association, Institute of Mathematical Statistics, and Interface Foundation of North America
PY - 2020/1/2
Y1 - 2020/1/2
N2 - In this article, we consider a Bayesian bi-level variable selection problem in high-dimensional regressions. In many practical situations, it is natural to assign group membership to each predictor: for example, genetic variants can be grouped at the gene level, and a covariate measured across different tasks naturally forms a group. It is therefore of interest to select important groups as well as important members within those groups. Existing Markov chain Monte Carlo methods are often computationally intensive and do not scale to large datasets. To address this problem, we consider variational inference for bi-level variable selection. In contrast to the commonly used mean-field approximation, we propose a hierarchical factorization of the approximate posterior distribution that exploits the structure of bi-level variable selection. Moreover, we develop a computationally efficient and fully parallelizable algorithm based on this variational approximation. We further extend the method to model datasets from multitask learning. Comprehensive numerical results from both simulation studies and real data analysis demonstrate the advantages of BIVAS over existing methods in variable selection, parameter estimation, and computational efficiency. The method is implemented in the R package “bivas”, available at https://github.com/mxcai/bivas. Supplementary materials for this article are available online.
AB - In this article, we consider a Bayesian bi-level variable selection problem in high-dimensional regressions. In many practical situations, it is natural to assign group membership to each predictor: for example, genetic variants can be grouped at the gene level, and a covariate measured across different tasks naturally forms a group. It is therefore of interest to select important groups as well as important members within those groups. Existing Markov chain Monte Carlo methods are often computationally intensive and do not scale to large datasets. To address this problem, we consider variational inference for bi-level variable selection. In contrast to the commonly used mean-field approximation, we propose a hierarchical factorization of the approximate posterior distribution that exploits the structure of bi-level variable selection. Moreover, we develop a computationally efficient and fully parallelizable algorithm based on this variational approximation. We further extend the method to model datasets from multitask learning. Comprehensive numerical results from both simulation studies and real data analysis demonstrate the advantages of BIVAS over existing methods in variable selection, parameter estimation, and computational efficiency. The method is implemented in the R package “bivas”, available at https://github.com/mxcai/bivas. Supplementary materials for this article are available online.
KW - Bayesian variable selection
KW - Group sparsity
KW - Parallel computing
KW - Variational inference
UR - http://www.scopus.com/inward/record.url?scp=85083617153&partnerID=8YFLogxK
U2 - 10.1080/10618600.2019.1624365
DO - 10.1080/10618600.2019.1624365
M3 - Journal article
AN - SCOPUS:85083617153
SN - 1061-8600
VL - 29
SP - 40
EP - 52
JO - Journal of Computational and Graphical Statistics
JF - Journal of Computational and Graphical Statistics
IS - 1
ER -