Does Confusion Really Hurt Novel Class Discovery?

Haoang Chi*, Wenjing Yang, Feng Liu, Long Lan, Tao Qin, Bo Han

*Corresponding author for this work

Research output: Contribution to journal / Journal article / peer-review


When sampling data of specific classes (i.e., known classes) for a scientific task, collectors may encounter unknown classes (i.e., novel classes). Since these novel classes might be valuable for future research, collectors also sample them and assign them to several clusters with the help of known-class data. This assignment process is known as novel class discovery (NCD). However, category confusion is common during sampling and may make NCD unreliable. To tackle this problem, this paper introduces a new and more realistic setting in which collectors may misidentify known classes and even confuse known classes with novel classes; we call this setting NCD under unreliable sampling (NUSA). Empirically, we find that existing NCD methods degrade under NUSA when sampling errors are not accounted for. To handle NUSA, we propose an effective solution, named hidden-prototype-based discovery network (HPDN): (1) we obtain relatively clean data representations even from confusedly sampled data; (2) we propose a mini-batch K-means variant for robust clustering, which alleviates the negative impact of residual errors embedded in the representations by detaching the noisy supervision in a timely manner. Experiments demonstrate that, under NUSA, HPDN significantly outperforms competitive baselines (e.g., 6% more than the best baseline on CIFAR-10) and remains robust under serious sampling errors.
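The abstract does not describe the clustering variant used in HPDN in detail. As background only, the following is a minimal sketch of the standard mini-batch K-means update that such a variant would start from, written in plain Python. It is not the authors' method: it omits the hidden prototypes and the detaching of noisy supervision, and the function names and parameters (`mini_batch_kmeans`, `batch_size`, `n_iters`) are assumptions made for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centers):
    """Index of the center closest to `point`."""
    return min(range(len(centers)), key=lambda j: sq_dist(point, centers[j]))


def mini_batch_kmeans(points, k, batch_size=8, n_iters=200, seed=0):
    """Standard mini-batch K-means (illustrative sketch, not HPDN).

    Each iteration draws a small batch, assigns every batch point to its
    nearest center, then moves that center toward the point with a
    per-center step size of 1 / (number of points assigned so far).
    """
    rng = random.Random(seed)
    # Initialise centers from k distinct data points.
    centers = [list(p) for p in rng.sample(points, k)]
    counts = [0] * k

    for _ in range(n_iters):
        batch = rng.sample(points, min(batch_size, len(points)))
        # Assign with the centers frozen, then apply the updates.
        assignments = [nearest(p, centers) for p in batch]
        for p, j in zip(batch, assignments):
            counts[j] += 1
            lr = 1.0 / counts[j]
            centers[j] = [(1 - lr) * c + lr * x for c, x in zip(centers[j], p)]
    return centers
```

Calling `mini_batch_kmeans(features, k)` on a list of feature vectors returns `k` cluster centers. In an NCD setting, the features would presumably be learned representations of the novel-class samples, and `k` the assumed number of novel classes.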
Original language: English
Pages (from-to): 3191-3207
Number of pages: 17
Journal: International Journal of Computer Vision
Issue number: 8
Publication status: E-pub ahead of print - 7 Mar 2024

Scopus Subject Areas

  • Software
  • Artificial Intelligence
  • Computer Vision and Pattern Recognition

User-Defined Keywords

  • Novel class discovery
  • Open-world recognition
  • Semi-supervised learning
  • Transfer learning

