TY - JOUR
T1 - Independence tests with random subspace of two random vectors in high dimension
AU - Qiu, Tao
AU - Xu, Wangli
AU - Zhu, Lixing
N1 - This work was partially supported by the Beijing Natural Science Foundation, China (Z200001), and National Science Foundation of China (11971478, 11731011, 11931014).
Publisher Copyright:
© 2023 Elsevier Inc.
PY - 2023/5
Y1 - 2023/5
N2 - Testing for independence between two random vectors is a fundamental problem in statistics. When the dimensions of these two random vectors are fixed, the existing tests based on the distance covariance and the Hilbert–Schmidt independence criterion have many desirable properties, including the capacity to capture linear and non-linear dependence. However, these tests may fail to capture non-linear dependence due to the “curse of dimensionality” when the random vectors are high dimensional. To address this problem, we propose a general framework for testing the dependence of two random vectors that randomly selects two subspaces, each consisting of components of the respective vector. To enhance the performance of this method, we repeat this procedure to construct the final test statistic. The new method can also detect non-linear dependence in a high-dimensional setup. Theoretically, if the number of replications used to form the final statistic tends to infinity, we can avoid the potential power loss caused by poorly chosen subspaces. Therefore, the two proposed tests are consistent against general alternatives. The weak limit under the null hypothesis is normal; thus, determining the critical value does not require resampling approximation. We demonstrate the finite-sample performance of the proposed tests using Monte Carlo simulations and the analysis of a real-data example.
AB - Testing for independence between two random vectors is a fundamental problem in statistics. When the dimensions of these two random vectors are fixed, the existing tests based on the distance covariance and the Hilbert–Schmidt independence criterion have many desirable properties, including the capacity to capture linear and non-linear dependence. However, these tests may fail to capture non-linear dependence due to the “curse of dimensionality” when the random vectors are high dimensional. To address this problem, we propose a general framework for testing the dependence of two random vectors that randomly selects two subspaces, each consisting of components of the respective vector. To enhance the performance of this method, we repeat this procedure to construct the final test statistic. The new method can also detect non-linear dependence in a high-dimensional setup. Theoretically, if the number of replications used to form the final statistic tends to infinity, we can avoid the potential power loss caused by poorly chosen subspaces. Therefore, the two proposed tests are consistent against general alternatives. The weak limit under the null hypothesis is normal; thus, determining the critical value does not require resampling approximation. We demonstrate the finite-sample performance of the proposed tests using Monte Carlo simulations and the analysis of a real-data example.
KW - Distance covariance
KW - High dimension
KW - Hilbert–Schmidt independence criterion
KW - Independence test
KW - Random subspace sampling
KW - U-statistics
UR - http://www.scopus.com/inward/record.url?scp=85147818203&partnerID=8YFLogxK
U2 - 10.1016/j.jmva.2023.105160
DO - 10.1016/j.jmva.2023.105160
M3 - Journal article
AN - SCOPUS:85147818203
SN - 0047-259X
VL - 195
JO - Journal of Multivariate Analysis
JF - Journal of Multivariate Analysis
M1 - 105160
ER -