TY - JOUR
T1 - A Simple Two-Sample Test in High Dimensions Based on L2-Norm
AU - Zhang, Jin Ting
AU - Guo, Jia
AU - Zhou, Bu
AU - Cheng, Ming Yen
N1 - Funding Information:
The authors thank the co-editor, AE, and two reviewers for their constructive comments and suggestions, which helped us improve the article substantially. Zhang is supported by the National University of Singapore academic research grant R-155-000-187-114. Zhou is financially supported by the First Class Discipline of Zhejiang-A (Zhejiang Gongshang University-Statistics). Cheng is supported by the Hong Kong Baptist University grants RC-ICRS17-18 and FRG2/17-18/086. Guo would like to thank Professor Wen-Lung Shiau, the Advanced Data Analysis Center (PLS-SEM of Zhejiang University of Technology), for his support on this research.
PY - 2020/4/2
Y1 - 2020/4/2
AB - Testing the equality of two means is a fundamental inference problem. For high-dimensional data, Hotelling’s T2-test either performs poorly or becomes inapplicable. Several modifications have been proposed to address this issue. However, most of them are based on asymptotic normality of the null distributions of their test statistics, which inevitably requires strong assumptions on the covariance. We study this problem thoroughly and propose an L2-norm based test that works under mild conditions, even when there are fewer observations than the dimension. Specifically, to cope with general nonnormality of the null distribution, we employ the Welch–Satterthwaite χ2-approximation. We derive a sharp upper bound on the approximation error and use it to justify that the χ2-approximation is preferred to the normal approximation. Simple ratio-consistent estimators for the parameters in the χ2-approximation are given. Importantly, our test can cope with singularity or near singularity of the covariance, which is commonly seen in high dimensions and is the main cause of nonnormality. The power of the proposed test is also investigated. Extensive simulation studies and an application show that our test is at least comparable to and often outperforms several competitors in terms of size control, and that the powers are comparable when the sizes are. Supplementary materials for this article are available online.
KW - High-dimensional data
KW - Hotelling’s T2-test
KW - Welch–Satterthwaite χ2-approximation
KW - χ2-type mixtures
UR - https://www.ingentaconnect.com/content/tandf/uasa20/2020/00000115/00000530/art00044
UR - http://www.scopus.com/inward/record.url?scp=85066631145&partnerID=8YFLogxK
U2 - 10.1080/01621459.2019.1604366
DO - 10.1080/01621459.2019.1604366
M3 - Journal article
AN - SCOPUS:85066631145
SN - 0162-1459
VL - 115
SP - 1011
EP - 1027
JO - Journal of the American Statistical Association
JF - Journal of the American Statistical Association
IS - 530
ER -