Adversarial Tri-Fusion Hashing Network for Imbalanced Cross-Modal Retrieval

Xin Liu*, Yiu Ming Cheung, Zhikai Hu, Yi He, Bineng Zhong

*Corresponding author for this work

Research output: Contribution to journal, Journal article, peer-reviewed

16 Citations (Scopus)

Abstract

Cross-modal retrieval has received increasing attention as a means of efficient retrieval across different modalities, and hashing techniques have made significant progress recently owing to their low storage cost and high query speed. However, most existing cross-modal hashing methods still face two challenges: narrowing the semantic gap between different modalities, and training with imbalanced multi-modal data. This article presents an efficient Adversarial Tri-Fusion Hashing Network (ATFH-N) for cross-modal retrieval, one of the early attempts to incorporate adversarial learning for working with imbalanced multi-modal data. Specifically, a triple fusion network with a zero-padding operation is proposed to accommodate either balanced or imbalanced multi-modal training data. At the same time, an adversarial training mechanism is leveraged to bridge, as far as possible, the semantic gap between the common representations of balanced and imbalanced data. Further, a label prediction network is used to guide the feature learning process and promote hash code learning, while the manifold structure is additionally embedded to preserve both inter-modal and intra-modal similarities. Through the joint exploitation of these components, the underlying semantic structure of multimedia data can be well preserved in Hamming space, which can benefit various cross-modal retrieval tasks. Extensive experiments on three benchmark datasets show that the proposed ATFH-N method achieves comparable performance in the balanced scenario and brings substantial improvements over state-of-the-art methods in imbalanced scenarios.
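
The two mechanical ideas named in the abstract, zero padding for a missing modality and retrieval in Hamming space, can be illustrated with a minimal sketch. This is not the authors' implementation: ATFH-N learns its hash codes with trained networks, whereas the sketch below simply concatenates hand-written feature vectors and takes their signs. All function names, dimensions, and values are hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Sequence


def fuse_with_zero_padding(
    image_feat: Optional[Sequence[float]],
    text_feat: Optional[Sequence[float]],
    image_dim: int,
    text_dim: int,
) -> List[float]:
    """Concatenate image and text features, zero-filling a missing modality.

    Hypothetical helper: a sample that lacks one modality (the imbalanced
    case) still yields a fused vector of length image_dim + text_dim.
    """
    if image_feat is None and text_feat is None:
        raise ValueError("at least one modality must be present")
    img = list(image_feat) if image_feat is not None else [0.0] * image_dim
    txt = list(text_feat) if text_feat is not None else [0.0] * text_dim
    if len(img) != image_dim or len(txt) != text_dim:
        raise ValueError("feature length does not match the declared dimension")
    return img + txt


def binarize(values: Sequence[float]) -> List[int]:
    """Map real values to a {-1, +1} code by sign (zero maps to +1).

    Stand-in for a learned hashing layer; assumed here for illustration.
    """
    return [1 if v >= 0 else -1 for v in values]


def hamming_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the positions at which two equal-length codes differ."""
    if len(a) != len(b):
        raise ValueError("codes must have the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def rank_by_hamming(query: Sequence[int], database: Sequence[Sequence[int]]) -> List[int]:
    """Return database indices ordered from nearest to farthest code."""
    return sorted(range(len(database)), key=lambda i: hamming_distance(query, database[i]))


# A sample with an image feature but no text feature.
fused = fuse_with_zero_padding([0.5, -1.0], None, image_dim=2, text_dim=3)
code = binarize(fused)
ranking = rank_by_hamming([1, 1, 1], [[-1, -1, -1], [1, 1, 1], [1, -1, 1]])
```

Keeping the fused vector at a fixed length is what lets the same downstream layers handle both complete and incomplete samples; how the real network compensates for the padded zeros (the adversarial and label-prediction components) is outside the scope of this sketch.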

Original language: English
Pages (from-to): 607-619
Number of pages: 13
Journal: IEEE Transactions on Emerging Topics in Computational Intelligence
Volume: 5
Issue number: 4
Early online date: 13 Jul 2020
Publication status: Published - Aug 2021

Scopus Subject Areas

  • Artificial Intelligence
  • Computer Science Applications
  • Computational Mathematics
  • Control and Optimization

User-Defined Keywords

  • Cross-modal hashing
  • imbalanced multi-modal data
  • adversarial tri-fusion hashing
  • manifold structure
