Abstract
When sampling data of specific classes (i.e., known classes) for a scientific task, collectors may encounter unknown classes (i.e., novel classes). Since these novel classes might be valuable for future research, collectors also sample them and group them into several clusters with the help of known-class data. This grouping process is known as novel class discovery (NCD). However, category confusion is common during sampling and may make NCD unreliable. To tackle this problem, this paper introduces a new and more realistic setting in which collectors may misidentify known classes and even confuse known classes with novel classes; we name it NCD under unreliable sampling (NUSA). We find empirically that NUSA degrades existing NCD methods when sampling errors are not accounted for. To handle NUSA, we propose an effective solution named hidden-prototype-based discovery network (HPDN): (1) we aim to obtain relatively clean data representations even from confusedly sampled data; (2) we propose a mini-batch K-means variant for robust clustering, which alleviates the negative impact of residual errors embedded in the representations by detaching the noisy supervision in a timely manner. Experiments demonstrate that, under NUSA, HPDN significantly outperforms competitive baselines (e.g., 6% more than the best baseline on CIFAR-10) and remains robust under severe sampling errors.
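For readers unfamiliar with the clustering step the abstract builds on, the following is a minimal sketch of plain mini-batch K-means, using the common update in which each centroid has its own learning rate of 1/count. It is not the HPDN variant: the abstract does not give enough detail to reproduce the hidden prototypes or the detaching of noisy supervision, so neither is modeled here. The function names `minibatch_kmeans` and `assign_clusters`, the `init` parameter, and all default values are assumptions made for illustration.

```python
import numpy as np


def assign_clusters(X, centroids):
    """Return the index of the nearest centroid for each row of X."""
    # Squared Euclidean distance between every point and every centroid.
    dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)


def minibatch_kmeans(X, k, batch_size=32, n_iters=100, init=None, seed=0):
    """Plain mini-batch K-means (illustrative; not the paper's variant)."""
    rng = np.random.default_rng(seed)
    if init is None:
        # Start from k distinct data points chosen at random.
        centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    else:
        centroids = np.array(init, dtype=float)
    counts = np.zeros(k)  # number of points each centroid has absorbed
    batch_size = min(batch_size, len(X))
    for _ in range(n_iters):
        batch = X[rng.choice(len(X), size=batch_size, replace=False)]
        # Assignments are fixed for the whole batch before any update.
        nearest = assign_clusters(batch, centroids)
        for x, c in zip(batch, nearest):
            counts[c] += 1
            lr = 1.0 / counts[c]  # per-centroid learning rate
            # Move the centroid toward x; this keeps it a running mean.
            centroids[c] = (1.0 - lr) * centroids[c] + lr * x
    return centroids
```

Because each centroid is a running mean of the points assigned to it, early mistaken assignments keep influencing it; this is one reason a clustering step may be sensitive to errors in the representations it receives.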
| Original language | English |
|---|---|
| Pages (from-to) | 3191-3207 |
| Number of pages | 17 |
| Journal | International Journal of Computer Vision |
| Volume | 132 |
| Issue number | 8 |
| Early online date | 7 Mar 2024 |
| Publication status | Published - Aug 2024 |
User-Defined Keywords
- Novel class discovery
- Open-world recognition
- Semi-supervised learning
- Transfer learning
Title
Does Confusion Really Hurt Novel Class Discovery?