TY - GEN
T1 - SIGUA: Forgetting May Make Learning with Noisy Labels More Robust
T2 - 37th International Conference on Machine Learning, ICML 2020
AU - Han, Bo
AU - Niu, Gang
AU - Yu, Xingrui
AU - Yao, Quanming
AU - Xu, Miao
AU - Tsang, Ivor W.
AU - Sugiyama, Masashi
N1 - Funding Information:
BH was supported by the Early Career Scheme (ECS) through the Research Grants Council of Hong Kong under Grant No. 22200720, an HKBU Tier-1 Start-up Grant, an HKBU CSD Start-up Grant, and a RIKEN BAIHO Award. IWT was supported by the Australian Research Council under Grants DP180100106 and DP200101328. MS was supported by the International Research Center for Neurointelligence (WPI-IRCN) at The University of Tokyo Institutes for Advanced Study.
PY - 2020/7
Y1 - 2020/7
AB - Given data with noisy labels, over-parameterized deep networks can gradually memorize the data, and fit everything in the end. Although equipped with corrections for noisy labels, many learning methods in this area still suffer from overfitting due to undesired memorization. In this paper, to relieve this issue, we propose stochastic integrated gradient underweighted ascent (SIGUA): in a minibatch, we adopt gradient descent on good data as usual, and learning-rate-reduced gradient ascent on bad data; the proposal is a versatile approach where data goodness or badness is w.r.t. desired or undesired memorization given a base learning method. Technically, SIGUA pulls optimization back for generalization when their goals conflict with each other; philosophically, SIGUA shows that forgetting undesired memorization can reinforce desired memorization. Experiments demonstrate that SIGUA successfully robustifies two typical base learning methods, so that their performance is often significantly improved.
UR - http://www.scopus.com/inward/record.url?scp=85105289816&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85105289816
T3 - 37th International Conference on Machine Learning, ICML 2020
SP - 3964
EP - 3974
BT - 37th International Conference on Machine Learning, ICML 2020
A2 - Daumé III, Hal
A2 - Singh, Aarti
PB - International Machine Learning Society (IMLS)
Y2 - 13 July 2020 through 18 July 2020
ER -
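
As a rough illustration of the idea summarized in the abstract (gradient descent on "good" data, learning-rate-reduced gradient ascent on "bad" data), the following is a minimal PyTorch-style sketch of a single training step. The names sigua_step, is_good, and underweight are assumptions introduced for illustration; how the base learning method identifies good versus bad data, and the paper's full algorithm, are not reproduced here.

    import torch

    def sigua_step(model, optimizer, criterion, inputs, labels, is_good, underweight=0.01):
        # criterion is assumed to return per-example losses (e.g. reduction='none');
        # is_good is a boolean mask marking examples treated as desired memorization.
        optimizer.zero_grad()
        losses = criterion(model(inputs), labels)

        good_loss = losses[is_good].sum()      # gradient descent on good data, as usual
        bad_loss = losses[~is_good].sum()      # gradient ascent on bad data

        # Subtracting the bad-data loss ascends on it; scaling by `underweight`
        # plays the role of the reduced learning rate for the ascent step.
        total = (good_loss - underweight * bad_loss) / inputs.size(0)
        total.backward()
        optimizer.step()
        return total.item()

In this sketch, scaling the ascent term by underweight inside the loss is equivalent to running the ascent with a smaller learning rate, which is the "underweighted ascent" notion the abstract describes.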