Abstract
In recent years, the zero-shot sketch-based image retrieval (ZS-SBIR) task has attracted considerable attention. Although several ZS-SBIR approaches have been proposed, it remains challenging to model the inherent linkages between the sketch and image domains. Moreover, how to transfer semantic knowledge from seen categories to unseen categories is still an open problem that significantly affects retrieval performance. In this article, we propose a novel approach, Modality Fused Class-Proxy with Knowledge Distillation (MFCPKD), which develops two novel schemes to remedy the above issues. Specifically, MFCPKD leverages a Modality Fusion Model to learn modality-fused feature embeddings and class proxies. Knowledge distillation is employed so that the student network learns features from seen categories and infers unseen categories through the class proxies. Furthermore, three losses constrain the student network to narrow the modality gap between the sketch and image domains. Finally, we conduct extensive experiments on three benchmark datasets (Sketchy Ext, TU-Berlin Ext, and QuickDraw Ext) and demonstrate that MFCPKD achieves excellent performance compared with existing methods in ZS-SBIR scenarios.
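To make the abstract's mechanism concrete, the following is a minimal sketch of class-proxy classification combined with a temperature-softened distillation loss. All names (`proxy_logits`, `distillation_loss`, the embedding dimensions) are illustrative assumptions, not the authors' actual implementation, which additionally involves a Modality Fusion Model and three constraint losses not shown here.

```python
import numpy as np

def softmax(z, temperature=1.0):
    z = z / temperature
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def l2_normalize(x, eps=1e-12):
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def proxy_logits(embeddings, proxies):
    # Cosine similarity between L2-normalized embeddings and class proxies:
    # each seen category is represented by one learned proxy vector.
    return l2_normalize(embeddings) @ l2_normalize(proxies).T

def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    # KL divergence between temperature-softened teacher and student
    # proxy-similarity distributions (standard knowledge distillation).
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return float(np.mean(np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12)),
                                axis=-1)))

rng = np.random.default_rng(0)
proxies = rng.normal(size=(5, 64))      # one proxy per seen category
student_emb = rng.normal(size=(8, 64))  # batch of sketch/image embeddings
teacher_emb = rng.normal(size=(8, 64))  # embeddings from the teacher network

loss = distillation_loss(proxy_logits(student_emb, proxies),
                         proxy_logits(teacher_emb, proxies))
```

At test time, an unseen-category query would be matched to gallery images by the same cosine similarity in the shared embedding space, which is how proxy-based schemes typically generalize beyond seen classes.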
| Original language | English |
|---|---|
| Pages (from-to) | 6158-6169 |
| Number of pages | 12 |
| Journal | IEEE Transactions on Circuits and Systems for Video Technology |
| Volume | 35 |
| Issue number | 6 |
| Early online date | 15 Jan 2025 |
| DOIs | |
| Publication status | Published - Jun 2025 |
User-Defined Keywords
- Sketch-based image retrieval
- cross modality alignment
- knowledge distillation
- modality fusion
- zero-shot learning
Fingerprint
Dive into the research topics of 'Modality Fused Class-Proxy with Knowledge Distillation for Zero-Shot Sketch-based Image Retrieval'. Together they form a unique fingerprint.