TY - JOUR
T1 - Latent Class-Conditional Noise Model
AU - Yao, Jiangchao
AU - Han, Bo
AU - Zhou, Zhihan
AU - Zhang, Ya
AU - Tsang, Ivor W.
N1 - Funding Information:
This work was supported in part by the STCSM under Grants 22511106101, 18DZ2270700, and 21DZ1100100, in part by the 111 plan under Grant BP0719010, in part by the State Key Laboratory of UHD Video and Audio Production and Presentation, in part by the NSFC Young Scientists Fund under Grant 62006202, in part by the RGC Early Career Scheme under Grant 22200720, and in part by the Guangdong Basic and Applied Basic Research Foundation under Grant 2022A151501165.
Publisher Copyright:
IEEE
PY - 2023/8
Y1 - 2023/8
N2 - Learning with noisy labels has become imperative in the Big Data era, since it saves the expensive human labor of accurate annotation. Previous noise-transition-based methods have achieved theoretically grounded performance under the Class-Conditional Noise model (CCN). However, these approaches build upon an ideal but impractical anchor set assumed to be available for pre-estimating the noise transition. Even though subsequent works adapt the estimation as a neural layer, the ill-posed stochastic learning of its parameters in back-propagation easily falls into undesired local minima. We solve this problem by introducing a Latent Class-Conditional Noise model (LCCN) to parameterize the noise transition under a Bayesian framework. By projecting the noise transition into the Dirichlet space, the learning is constrained on a simplex characterized by the complete dataset, instead of some ad-hoc parametric space wrapped by the neural layer. We then deduce a dynamic label regression method for LCCN, whose Gibbs sampler allows us to efficiently infer the latent true labels to train the classifier and to model the noise. Our approach safeguards the stable update of the noise transition, avoiding the previous arbitrary tuning from a mini-batch of samples. We further generalize LCCN to counterparts compatible with open-set noisy labels, semi-supervised learning, and cross-model training. A range of experiments demonstrate the advantages of LCCN and its variants over the current state-of-the-art methods. The code is available here.
AB - Learning with noisy labels has become imperative in the Big Data era, since it saves the expensive human labor of accurate annotation. Previous noise-transition-based methods have achieved theoretically grounded performance under the Class-Conditional Noise model (CCN). However, these approaches build upon an ideal but impractical anchor set assumed to be available for pre-estimating the noise transition. Even though subsequent works adapt the estimation as a neural layer, the ill-posed stochastic learning of its parameters in back-propagation easily falls into undesired local minima. We solve this problem by introducing a Latent Class-Conditional Noise model (LCCN) to parameterize the noise transition under a Bayesian framework. By projecting the noise transition into the Dirichlet space, the learning is constrained on a simplex characterized by the complete dataset, instead of some ad-hoc parametric space wrapped by the neural layer. We then deduce a dynamic label regression method for LCCN, whose Gibbs sampler allows us to efficiently infer the latent true labels to train the classifier and to model the noise. Our approach safeguards the stable update of the noise transition, avoiding the previous arbitrary tuning from a mini-batch of samples. We further generalize LCCN to counterparts compatible with open-set noisy labels, semi-supervised learning, and cross-model training. A range of experiments demonstrate the advantages of LCCN and its variants over the current state-of-the-art methods. The code is available here.
KW - Bayesian Modeling
KW - Deep Learning
KW - Noisy Supervision
KW - Semi-Supervised Learning
UR - http://www.scopus.com/inward/record.url?scp=85149402656&partnerID=8YFLogxK
U2 - 10.1109/TPAMI.2023.3247629
DO - 10.1109/TPAMI.2023.3247629
M3 - Journal article
SN - 0162-8828
VL - 45
SP - 9964
EP - 9980
JO - IEEE Transactions on Pattern Analysis and Machine Intelligence
JF - IEEE Transactions on Pattern Analysis and Machine Intelligence
IS - 8
ER -