TY - JOUR
T1 - Class-Wise Denoising for Robust Learning under Label Noise
AU - Gong, Chen
AU - Ding, Yongliang
AU - Han, Bo
AU - Niu, Gang
AU - Yang, Jian
AU - You, Jane J.
AU - Tao, Dacheng
AU - Sugiyama, Masashi
N1 - Funding Information:
The work of Chen Gong was supported in part by the NSF of China under Grant 61973162, in part by the NSF of Jiangsu Province under Grant BZ2021013, in part by the Fundamental Research Funds for the Central Universities under Grants 30920032202 and 30921013114, and in part by the "111 Program" under Grant B13022. The work of Bo Han was supported in part by the NSFC Young Scientists Fund under Grant 62006202, and in part by the RGC Early Career Scheme under Grant 22200720. The work of Jane You was supported in part by Hong Kong Polytechnic University under Grants YZ3K, UAJP/UAGK, and ZVRH. The work of Masashi Sugiyama was supported in part by KAKENHI under Grant 20H04206.
Publisher Copyright:
© 1979-2012 IEEE.
PY - 2023/3/1
Y1 - 2023/3/1
N2 - Label noise is ubiquitous in many real-world scenarios; it often misleads the training algorithm and degrades classification performance. Therefore, many approaches have been proposed to combat such label noise by correcting the loss function computed on corrupted labels. Among them, a line of work achieves this goal by unbiasedly estimating the data centroid, which plays an important role in constructing an unbiased risk estimator for minimization. However, these methods usually handle the noisy labels of all classes at once, so the local information inherent in each class is ignored, which often leads to unsatisfactory performance. To address this defect, this paper presents a novel robust learning algorithm dubbed 'Class-Wise Denoising' (CWD), which tackles the noisy labels in a class-wise way to ease the overall noise-correction task. Specifically, two virtual auxiliary sets are constructed by presuming, respectively, that the positive labels and the negative labels in the training set are clean, so that the original false-negative labels and false-positive labels are tackled separately. As a result, an improved centroid estimator can be designed, which helps to yield a more accurate risk estimator. Theoretically, we prove that: 1) the variance in centroid estimation can often be reduced by CWD when compared with existing methods based on unbiased centroid estimators; and 2) the performance of CWD trained on the noisy set converges to that of the optimal classifier trained on the clean set at a rate of O(1/√n), where n is the number of training examples. These sound theoretical properties enable CWD to deliver improved classification performance under label noise, which is also demonstrated by comparisons with ten representative state-of-the-art methods on a variety of benchmark datasets.
AB - Label noise is ubiquitous in many real-world scenarios; it often misleads the training algorithm and degrades classification performance. Therefore, many approaches have been proposed to combat such label noise by correcting the loss function computed on corrupted labels. Among them, a line of work achieves this goal by unbiasedly estimating the data centroid, which plays an important role in constructing an unbiased risk estimator for minimization. However, these methods usually handle the noisy labels of all classes at once, so the local information inherent in each class is ignored, which often leads to unsatisfactory performance. To address this defect, this paper presents a novel robust learning algorithm dubbed 'Class-Wise Denoising' (CWD), which tackles the noisy labels in a class-wise way to ease the overall noise-correction task. Specifically, two virtual auxiliary sets are constructed by presuming, respectively, that the positive labels and the negative labels in the training set are clean, so that the original false-negative labels and false-positive labels are tackled separately. As a result, an improved centroid estimator can be designed, which helps to yield a more accurate risk estimator. Theoretically, we prove that: 1) the variance in centroid estimation can often be reduced by CWD when compared with existing methods based on unbiased centroid estimators; and 2) the performance of CWD trained on the noisy set converges to that of the optimal classifier trained on the clean set at a rate of O(1/√n), where n is the number of training examples. These sound theoretical properties enable CWD to deliver improved classification performance under label noise, which is also demonstrated by comparisons with ten representative state-of-the-art methods on a variety of benchmark datasets.
KW - Centroid estimation
KW - Label noise
KW - Unbiasedness
KW - Variance reduction
UR - http://www.scopus.com/inward/record.url?scp=85131712340&partnerID=8YFLogxK
U2 - 10.1109/TPAMI.2022.3178690
DO - 10.1109/TPAMI.2022.3178690
M3 - Journal article
AN - SCOPUS:85131712340
SN - 0162-8828
VL - 45
SP - 2835
EP - 2848
JO - IEEE Transactions on Pattern Analysis and Machine Intelligence
JF - IEEE Transactions on Pattern Analysis and Machine Intelligence
IS - 3
ER -