Abstract
Cross-modal matching has recently gained significant popularity as a way to facilitate retrieval across multi-modal data, and existing works rely heavily on the implicit assumption that the training data pairs are perfectly aligned. However, this ideal assumption rarely holds in practice because mismatched data pairs, a.k.a. noisy correspondence, are inevitable; they wrongly force mismatched data to be similar and thus degrade performance. Although some recent methods have attempted to address this problem, they still face two challenging issues: 1) unreliable data division, which leads to training inefficiency, and 2) unstable prediction, which leads to matching failure. To address these problems, we propose an efficient Uncertainty-Guided Noisy Correspondence Learning (UGNCL) framework to achieve noise-robust cross-modal matching. Specifically, a novel Uncertainty Guided Division (UGD) algorithm is designed to leverage the derived uncertainty to divide the data into clean, noisy and hard partitions, which readily mitigates the impact of easily determined noisy pairs. Meanwhile, an efficient Trusted Robust Loss (TRL) is designed to use the uncertainty to recast the soft margins, calibrated by confident yet erroneous soft correspondence labels, for the data pairs in the hard partition. This increases the importance of matched pairs and decreases that of mismatched pairs, further alleviating the impact of noisy pairs and improving robustness. Extensive experiments conducted on three public datasets highlight the advantages of the proposed framework and show its competitive performance compared with state-of-the-art methods.
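The abstract gives no formulas, so the following is only an illustrative sketch of the two ideas it names: a three-way division of pairs by uncertainty, and a hinge loss whose margin is scaled by a soft correspondence label. The function names, the thresholds (`sim_threshold`, `unc_threshold`), the `base_margin` value and the exact decision rule are all assumptions made for illustration; they are not the paper's UGD or TRL definitions.

```python
from typing import List


def divide_by_uncertainty(
    similarity: List[float],
    uncertainty: List[float],
    sim_threshold: float = 0.5,   # hypothetical cut-off, not from the paper
    unc_threshold: float = 0.3,   # hypothetical cut-off, not from the paper
) -> List[str]:
    """Label each pair 'clean', 'noisy' or 'hard' (assumed rule).

    Pairs with high uncertainty are treated as 'hard'; the remaining
    confident pairs are split into 'clean' or 'noisy' by similarity.
    """
    labels = []
    for s, u in zip(similarity, uncertainty):
        if u > unc_threshold:
            labels.append("hard")
        elif s >= sim_threshold:
            labels.append("clean")
        else:
            labels.append("noisy")
    return labels


def soft_margin_hinge(
    pos_sim: float,
    neg_sim: float,
    soft_label: float,
    base_margin: float = 0.2,     # hypothetical value
) -> float:
    """Hinge loss whose margin shrinks as the soft label approaches 0.

    soft_label in [0, 1] is an assumed stand-in for a soft
    correspondence label: 1 means "likely matched", 0 "likely mismatched".
    """
    margin = base_margin * soft_label
    return max(0.0, margin - pos_sim + neg_sim)


if __name__ == "__main__":
    print(divide_by_uncertainty([0.9, 0.1, 0.5], [0.1, 0.1, 0.8]))
    print(soft_margin_hinge(0.6, 0.5, soft_label=1.0))
    print(soft_margin_hinge(0.6, 0.5, soft_label=0.0))
```

Under these assumptions, a pair judged likely mismatched (soft label near 0) gets a near-zero margin and so contributes little to the loss, which is one simple way to reduce the influence of noisy pairs.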
Original language | English |
---|---|
Title of host publication | SIGIR '24 |
Subtitle of host publication | Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval |
Place of Publication | New York |
Publisher | Association for Computing Machinery (ACM) |
Pages | 852-861 |
Number of pages | 10 |
ISBN (Electronic) | 9798400704314 |
Publication status | Published - 11 Jul 2024 |
Event | 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2024, United States. Duration: 14 Jul 2024 → 18 Jul 2024. Conference website: https://sigir-2024.github.io/ |
Conference
Conference | 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2024 |
---|---|
Country/Territory | United States |
Period | 14/07/24 → 18/07/24 |
User-Defined Keywords
- Cross-Modal Matching
- Noisy Correspondence Learning
- Uncertainty Guided Division
- Trusted Robust Loss