TY - GEN
T1 - Subclass-Dominant Label Noise
T2 - 37th Conference on Neural Information Processing Systems, NeurIPS 2023
AU - Bai, Yingbin
AU - Han, Zhongyi
AU - Yang, Erkun
AU - Yu, Jun
AU - Han, Bo
AU - Wang, Dadong
AU - Liu, Tongliang
N1 - Funding Information:
The authors would like to thank the anonymous reviewers and the meta-reviewer for their constructive feedback and encouraging comments on this work. Yingbin Bai was supported by CSIRO Data61. Erkun Yang was supported in part by the National Natural Science Foundation of China under Grant 62202365, Guangdong Basic and Applied Basic Research Foundation (2021A1515110026), and Natural Science Basic Research Program of Shaanxi (Program No. 2022JQ-608). Jun Yu was supported by the Natural Science Foundation of China (62276242), National Aviation Science Foundation (2022Z071078001), CAAI-Huawei MindSpore Open Fund (CAAIXSJLJJ-2021-016B, CAAIXSJLJJ-2022-001A), Anhui Province Key Research and Development Program (202104a05020007), USTC-IAT Application Sci. & Tech. Achievement Cultivation Program (JL06521001Y), and Sci. & Tech. Innovation Special Zone (20-163-14-LZ-001-004-01). Bo Han was supported by the NSFC Young Scientists Fund No. 62006202, NSFC General Program No. 62376235, and Guangdong Basic and Applied Basic Research Foundation No. 2022A1515011652. Tongliang Liu was partially supported by the following Australian Research Council projects: FT220100318, DP220102121, LP220100527, LP220200949, and IC190100031.
Publisher Copyright:
© 2023 Neural Information Processing Systems Foundation. All rights reserved.
PY - 2023/12/10
Y1 - 2023/12/10
AB - In this paper, we empirically investigate a previously overlooked and widespread type of label noise, subclass-dominant label noise (SDN). Our findings reveal that, during the early stages of training, deep neural networks can rapidly memorize mislabeled examples in SDN. This phenomenon poses challenges in effectively selecting confident examples using conventional early stopping techniques. To address this issue, we delve into the properties of SDN and observe that long-trained representations are superior at capturing the high-level semantics of mislabeled examples, leading to a clustering effect where similar examples are grouped together. Based on this observation, we propose a novel method called NoiseCluster that leverages the geometric structures of long-trained representations to identify and correct SDN. Our experiments demonstrate that NoiseCluster outperforms state-of-the-art baselines on both synthetic and real-world datasets, highlighting the importance of addressing SDN in learning with noisy labels. The code is available at https://github.com/tmllab/2023_NeurIPS_SDN.
UR - http://www.scopus.com/inward/record.url?scp=85185872133&partnerID=8YFLogxK
UR - https://proceedings.neurips.cc/paper_files/paper/2023/hash/d763b4a2dde0ae7b77498516ce9f439e-Abstract-Conference.html
M3 - Conference proceeding
AN - SCOPUS:85185872133
SN - 9781713899921
T3 - Advances in Neural Information Processing Systems
BT - 37th Conference on Neural Information Processing Systems, NeurIPS 2023
A2 - Oh, A.
A2 - Naumann, T.
A2 - Globerson, A.
A2 - Saenko, K.
A2 - Hardt, M.
A2 - Levine, S.
PB - Neural Information Processing Systems Foundation
Y2 - 10 December 2023 through 16 December 2023
ER -