TY - JOUR
T1 - When Tukey meets Chauvenet: a new boxplot criterion for outlier detection
AU - Lin, Hongmei
AU - Zhang, Riquan
AU - Tong, Tiejun
N1 - The authors thank the editor, the associate editor, and the two reviewers for their constructive comments that have led to a significant improvement of the paper, in particular on the quality of the figures and the user-friendly R package ‘ChauBoxplot’ for implementing the Chauvenet-type boxplot. Hongmei Lin’s research was supported in part by the National Natural Science Foundation of China (12171310), the Shanghai “Project Dawn 2022” (22SG52), and the Basic Research Project of Shanghai Science and Technology Commission (22JC1400800). Riquan Zhang’s research was supported in part by the National Natural Science Foundation of China (12371272) and the Basic Research Project of Shanghai Science and Technology Commission (22JC1400800). Tiejun Tong’s research was supported in part by the General Research Fund of Hong Kong (HKBU12300123 and HKBU12303421) and the Initiation Grant for Faculty Niche Research Areas of Hong Kong Baptist University (RC-FNRA-IG/23-24/SCI/03).
Publisher copyright:
© 2025 The Author(s). Published with license by Taylor and Francis Group, LLC
PY - 2025/8/6
Y1 - 2025/8/6
N2 - The box-and-whisker plot, introduced by Tukey (1977), is one of the most popular graphical methods in descriptive statistics. On the other hand, however, Tukey’s boxplot is free of sample size, yielding the so-called “one-size-fits-all” fences for outlier detection. Although improvements on the sample size adjusted boxplots do exist in the literature, most of them are either not easy to implement or lack justification. As another common rule for outlier detection, Chauvenet’s criterion uses the sample mean and standard deviation to perform the test, but it is often sensitive to the included outliers and hence is not robust. In this paper, by combining Tukey’s boxplot and Chauvenet’s criterion, we introduce a new boxplot, namely the Chauvenet-type boxplot, with the fence coefficient determined by an exact control of the outside rate per observation. Our new outlier criterion not only maintains the simplicity of the boxplot from a practical perspective, but also serves as a robust Chauvenet’s criterion. Simulation study and a real data analysis on the civil service pay adjustment in Hong Kong demonstrate that the Chauvenet-type boxplot performs extremely well regardless of the sample size, and can therefore be highly recommended for practical use to replace both Tukey’s boxplot and Chauvenet’s criterion. Lastly, to increase the visibility of the work, a user-friendly R package named ‘ChauBoxplot’ has also been officially released on CRAN.
AB - The box-and-whisker plot, introduced by Tukey (1977), is one of the most popular graphical methods in descriptive statistics. On the other hand, however, Tukey’s boxplot is free of sample size, yielding the so-called “one-size-fits-all” fences for outlier detection. Although improvements on the sample size adjusted boxplots do exist in the literature, most of them are either not easy to implement or lack justification. As another common rule for outlier detection, Chauvenet’s criterion uses the sample mean and standard deviation to perform the test, but it is often sensitive to the included outliers and hence is not robust. In this paper, by combining Tukey’s boxplot and Chauvenet’s criterion, we introduce a new boxplot, namely the Chauvenet-type boxplot, with the fence coefficient determined by an exact control of the outside rate per observation. Our new outlier criterion not only maintains the simplicity of the boxplot from a practical perspective, but also serves as a robust Chauvenet’s criterion. Simulation study and a real data analysis on the civil service pay adjustment in Hong Kong demonstrate that the Chauvenet-type boxplot performs extremely well regardless of the sample size, and can therefore be highly recommended for practical use to replace both Tukey’s boxplot and Chauvenet’s criterion. Lastly, to increase the visibility of the work, a user-friendly R package named ‘ChauBoxplot’ has also been officially released on CRAN.
KW - Box-and-whisker plot
KW - Chauvenet-type boxplot
KW - Chauvenet’s criterion
KW - Fence coefficient
KW - Outlier detection
KW - Sample size
UR - http://www.scopus.com/inward/record.url?scp=105012635303&partnerID=8YFLogxK
U2 - 10.1080/10618600.2025.2520577
DO - 10.1080/10618600.2025.2520577
M3 - Journal article
SN - 1061-8600
JO - Journal of Computational and Graphical Statistics
JF - Journal of Computational and Graphical Statistics
ER -