Abstract
The world is understood from various modalities, such as appearance, sound, and language. Since each modality only partially represents objects in a certain meaning, leveraging additional ones is beneficial in both theory and practice. However, exploiting novel modalities normally requires cross-modal pairs corresponding to the same instance, which is extremely resource-consuming and sometimes even impossible, making knowledge exploration of novel modalities largely restricted. To seek practical multi-modal learning, here we study Out-of-Modal (OOM) Generalization as an initial attempt to generalize to an unknown modality without given instance-level modal correspondence. Specifically, we consider Semi-Supervised and Unsupervised scenarios of OOM Generalization, where the first has scarce correspondences and the second has none, and propose Connect&Explore (COX) to solve these problems. COX first connects OOM data and known In-Modal (IM) data through a variational information bottleneck framework to extract shared information. Then, COX leverages the shared knowledge to create emergent correspondences, which is theoretically justified from an information-theoretic perspective. As a result, the label information on OOM data emerges along with the correspondences, which helps explore the OOM data with unknown knowledge, thus benefiting generalization results. We carefully evaluate the proposed COX method under various OOM generalization scenarios, verifying its effectiveness and extensibility. The code is available at https://github.com/tmllab/2025_ICLR_COX.
| Original language | English |
|---|---|
| Title of host publication | Proceedings of the Thirteenth International Conference on Learning Representations, ICLR 2025 |
| Publisher | International Conference on Learning Representations, ICLR |
| Pages | 56245-56264 |
| Number of pages | 20 |
| ISBN (Electronic) | 9798331320850 |
| Publication status | Published - 24 Apr 2025 |
| Event | 13th International Conference on Learning Representations, ICLR 2025, Singapore. Duration: 24 Apr 2025 → 28 Apr 2025. https://iclr.cc/Conferences/2025 (Conference website), https://openreview.net/group?id=ICLR.cc/2025/Conference#tab-accept-oral (Conference proceedings) |
Publication series
| Name | International Conference on Learning Representations, ICLR |
|---|---|
Conference
| Conference | 13th International Conference on Learning Representations, ICLR 2025 |
|---|---|
| Country/Territory | Singapore |
| Period | 24/04/25 → 28/04/25 |