TY - GEN
T1 - Understanding and Improving Early Stopping for Learning with Noisy Labels
AU - Bai, Yingbin
AU - Yang, Erkun
AU - Han, Bo
AU - Yang, Yanhua
AU - Li, Jiatong
AU - Mao, Yinian
AU - Niu, Gang
AU - Liu, Tongliang
N1 - Funding Information:
YB was partially supported by Agriculture Consultant and Smart Management. BH was supported by the RGC Early Career Scheme No. 22200720, NSFC Young Scientists Fund No. 62006202, and HKBU CSD Departmental Incentive Grant. YY was partially supported by the Key Research and Development Program of Shaanxi (Program No. 2021ZDLGY01-03). GN was supported by JST AIP Acceleration Research Grant Number JPMJCR20U3, Japan. TL was partially supported by Australian Research Council Projects DE-190101473 and IC-190100031.
Publisher Copyright:
© 2021 Neural information processing systems foundation. All rights reserved.
PY - 2021/12/6
Y1 - 2021/12/6
N2 - The memorization effect of deep neural networks (DNNs) plays a pivotal role in many state-of-the-art label-noise learning methods. To exploit this property, the early stopping trick, which halts optimization at an early stage of training, is usually adopted. Current methods generally decide the early stopping point by considering a DNN as a whole. However, a DNN can be considered as a composition of a series of layers, and we find that the latter layers in a DNN are much more sensitive to label noise, while their former counterparts are quite robust. Therefore, selecting a stopping point for the whole network may make different DNN layers antagonistically affect each other, thus degrading the final performance. In this paper, we propose to separate a DNN into different parts and progressively train them to address this problem. Instead of early stopping, which trains a whole DNN all at once, we initially train the former DNN layers by optimizing the DNN for a relatively large number of epochs. During training, we progressively train the latter DNN layers using a smaller number of epochs, with the preceding layers fixed, to counteract the impact of noisy labels. We term the proposed method progressive early stopping (PES). Despite its simplicity, compared with traditional early stopping, PES helps to obtain more promising and stable results. Furthermore, by combining PES with existing approaches to training with noisy labels, we achieve state-of-the-art performance on image classification benchmarks. The code is publicly available at https://github.com/tmllab/PES.
AB - The memorization effect of deep neural networks (DNNs) plays a pivotal role in many state-of-the-art label-noise learning methods. To exploit this property, the early stopping trick, which halts optimization at an early stage of training, is usually adopted. Current methods generally decide the early stopping point by considering a DNN as a whole. However, a DNN can be considered as a composition of a series of layers, and we find that the latter layers in a DNN are much more sensitive to label noise, while their former counterparts are quite robust. Therefore, selecting a stopping point for the whole network may make different DNN layers antagonistically affect each other, thus degrading the final performance. In this paper, we propose to separate a DNN into different parts and progressively train them to address this problem. Instead of early stopping, which trains a whole DNN all at once, we initially train the former DNN layers by optimizing the DNN for a relatively large number of epochs. During training, we progressively train the latter DNN layers using a smaller number of epochs, with the preceding layers fixed, to counteract the impact of noisy labels. We term the proposed method progressive early stopping (PES). Despite its simplicity, compared with traditional early stopping, PES helps to obtain more promising and stable results. Furthermore, by combining PES with existing approaches to training with noisy labels, we achieve state-of-the-art performance on image classification benchmarks. The code is publicly available at https://github.com/tmllab/PES.
UR - http://www.scopus.com/inward/record.url?scp=85130859995&partnerID=8YFLogxK
UR - https://proceedings.neurips.cc/paper_files/paper/2021/hash/cc7e2b878868cbae992d1fb743995d8f-Abstract.html
M3 - Conference proceeding
AN - SCOPUS:85130859995
SN - 9781713845393
T3 - Advances in Neural Information Processing Systems
SP - 24392
EP - 24403
BT - NIPS '21: Proceedings of the 35th International Conference on Neural Information Processing Systems
A2 - Ranzato, Marc'Aurelio
A2 - Beygelzimer, Alina
A2 - Dauphin, Yann
A2 - Liang, Percy S.
A2 - Wortman Vaughan, Jenn
PB - Neural Information Processing Systems Foundation
T2 - 35th Conference on Neural Information Processing Systems, NeurIPS 2021
Y2 - 6 December 2021 through 14 December 2021
ER -