TY - JOUR
T1 - Searching to Exploit Memorization Effect in Deep Learning with Noisy Labels
AU - Yang, Hansi
AU - Yao, Quanming
AU - Han, Bo
AU - Kwok, James T.
N1 - This work was supported in part by the National Natural Science Foundation of China under Grant 92270106, in part by the Tsinghua University-Tencent Joint Laboratory for Internet Innovation Technology, in part by the Research Grants Council of the Hong Kong Special Administrative Region, China under Grant 16202523, in part by the Research Grants Council of the Hong Kong Special Administrative Region, China under Grant HKU C7004-22G, in part by the Early CAREER Scheme from Research Grants Council of the Hong Kong Special Administrative Region, China under Grant 22200720, in part by the Young Scientists Fund of National Natural Science Foundation of China under Grant 62006202, and in part by the National Natural Science Foundation of China under Grant 62376235.
PY - 2024/12
Y1 - 2024/12
AB - Sample selection approaches are popular in robust learning from noisy labels. However, how to control the selection process properly so that deep networks can benefit from the memorization effect is a hard problem. In this paper, motivated by the success of automated machine learning (AutoML), we propose to control the selection process by bi-level optimization. Specifically, we parameterize the selection process by exploiting the general patterns of the memorization effect in the upper level, and then update these parameters using the prediction accuracy obtained from model training in the lower level. We further introduce semi-supervised learning algorithms to utilize noisy-labeled data as unlabeled data. To solve the bi-level optimization problem efficiently, we consider more information from the validation curvature via the Newton method and the cubic regularization method. We provide convergence analysis for both optimization methods. Results show that while both methods can converge to an (approximately) stationary point, the cubic regularization method can find a better local optimum than the Newton method in less time. Experiments on both benchmark and real-world data sets demonstrate that the proposed searching method leads to significant improvements over existing methods. Compared with existing AutoML approaches, our method is much more efficient at finding a good selection schedule.
KW - Automated machine learning (AutoML)
KW - Deep learning
KW - Label-noise learning
KW - Nonconvex optimization
UR - http://www.scopus.com/inward/record.url?scp=85192145769&partnerID=8YFLogxK
U2 - 10.1109/TPAMI.2024.3394552
DO - 10.1109/TPAMI.2024.3394552
M3 - Journal article
SN - 0162-8828
VL - 46
SP - 7833
EP - 7849
JO - IEEE Transactions on Pattern Analysis and Machine Intelligence
JF - IEEE Transactions on Pattern Analysis and Machine Intelligence
IS - 12
ER -