Continuous Review and Timely Correction: Enhancing the Resistance to Noisy Labels Via Self-Not-True and Class-Wise Distillation

  • Long Lan
  • Jingyi Wang
  • Xinghao Wu*
  • Bo Han
  • Xinwang Liu

*Corresponding author for this work

Research output: Contribution to journal › Journal article › peer-review

Abstract

Deep neural networks possess remarkable learning capabilities and expressive power, but this makes them vulnerable to overfitting, especially when they encounter mislabeled data. A notable phenomenon called the memorization effect occurs when networks first learn the correctly labeled data and later memorize the mislabeled instances. While early stopping can mitigate overfitting, it does not entirely prevent networks from adapting to incorrect labels during the initial training phases, which can result in losing valuable insights from accurate data. Moreover, early stopping cannot rectify the mistakes caused by mislabeled inputs, underscoring the need for improved strategies.

In this paper, we introduce an innovative mechanism for continuous review and timely correction of learned knowledge. Our approach allows the network to repeatedly revisit and reinforce correct information while promptly addressing any inaccuracies stemming from mislabeled data. We present a novel method called self-not-true distillation (SNTD). This technique employs self-distillation, where the network from previous training iterations acts as a teacher, guiding the current network to review and solidify its understanding of accurate labels. Crucially, SNTD masks the true class label in the logits during this process, concentrating on the non-true classes to correct any erroneous knowledge that may have been acquired.

We also recognize that different data classes follow distinct learning trajectories. A single teacher network might struggle to effectively guide the learning of all classes at once, which necessitates selecting different teacher networks for each specific class. Additionally, the influence of the teacher network's guidance varies throughout the training process. To address these challenges, we propose SNTD+, which integrates a class-wise distillation strategy along with a dynamic weight adjustment mechanism. Together, these enhancements significantly bolster SNTD's robustness in tackling complex scenarios characterized by label noise.
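The core idea of masking the true class during self-distillation can be sketched as follows. This is a minimal illustration, not the paper's exact loss: the function names and the choice of a plain KL divergence over the renormalized non-true-class distributions are our own assumptions for exposition.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def not_true_distill_loss(teacher_logits, student_logits, true_idx):
    """Illustrative SNTD-style loss: mask the true-class logit, then
    measure KL(teacher || student) over the remaining (non-true) classes.

    teacher_logits: logits from an earlier-epoch snapshot (the "teacher").
    student_logits: logits from the current network (the "student").
    true_idx: index of the (possibly noisy) labeled class, which is masked
    so distillation focuses only on the non-true classes.
    """
    t = [v for i, v in enumerate(teacher_logits) if i != true_idx]
    s = [v for i, v in enumerate(student_logits) if i != true_idx]
    p = softmax(t)  # teacher distribution over non-true classes
    q = softmax(s)  # student distribution over non-true classes
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
```

The loss is zero when teacher and student agree on the non-true classes and grows as they diverge, regardless of what either network assigns to the labeled class itself — which is what lets the mechanism review non-true-class knowledge without reinforcing a potentially wrong label.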
Original language: English
Number of pages: 15
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
Publication status: E-pub ahead of print - 29 Dec 2025

User-Defined Keywords

  • Class-Wise Distillation
  • Early Stopping
  • Learning with Noisy Labels
  • Self-Not-True Distillation

