Abstract
In novel class discovery (NCD), we are given labeled data from seen classes and unlabeled data from unseen classes, and we train clustering models for the unseen classes. However, the implicit assumptions behind NCD are still unclear. In this paper, we demystify these assumptions and find that high-level semantic features should be shared among the seen and unseen classes. Based on this finding, NCD is theoretically solvable under certain assumptions and can be naturally linked to meta-learning, which has exactly the same assumption as NCD. Thus, we can empirically solve the NCD problem with meta-learning algorithms after slight modifications. This meta-learning-based methodology significantly reduces the amount of unlabeled data needed for training and makes NCD more practical, as demonstrated in experiments. The use of very limited data is also justified by the application scenario of NCD: since it is unnatural to label only seen-class data, NCD is sampling rather than labeling in causality. Therefore, unseen-class data should be collected in the course of collecting seen-class data, which is why they are novel and first need to be clustered.
Original language | English |
---|---|
Title of host publication | Proceedings of Tenth International Conference on Learning Representations, ICLR 2022 |
Publisher | International Conference on Learning Representations |
Pages | 1-20 |
Number of pages | 20 |
Publication status | Published - 25 Apr 2022 |
Event | 10th International Conference on Learning Representations, ICLR 2022 - Virtual, Online. Duration: 25 Apr 2022 → 29 Apr 2022. https://iclr.cc/Conferences/2022, https://openreview.net/group?id=ICLR.cc/2022/Conference |
Conference
Conference | 10th International Conference on Learning Representations, ICLR 2022 |
---|---|
Period | 25/04/22 → 29/04/22 |