TY - JOUR
T1 - MTFH: A Matrix Tri-Factorization Hashing Framework for Efficient Cross-Modal Retrieval
AU - Liu, Xin
AU - Hu, Zhikai
AU - Ling, Haibin
AU - Cheung, Yiu Ming
N1 - Funding Information:
This work was supported by the National Science Foundation of China (No. 61673185 and No. 61672444), Fundamental Research Funds for the Central Universities of Huaqiao University (No. ZQN-PY309), State Key Laboratory of Integrated Services Networks of Xidian University (No. ISN20-11), National Science Foundation of Fujian Province (No. 2017J01112), Quanzhou City Science & Technology Program of China (No. 2018C107R), the ITF project of HKSAR (No. ITS/339/18), and Hong Kong Baptist University, Research Committee, Initiation Grant-Faculty Niche Research Areas (IG-FNRA) 2018/19 (No. RC-FNRA-IG/18-19/SCI/03).
Publisher Copyright:
© 1979-2012 IEEE.
PY - 2021/3/1
Y1 - 2021/3/1
AB - Hashing has recently sparked a great revolution in cross-modal retrieval because of its low storage cost and high query speed. Recent cross-modal hashing methods often learn unified or equal-length hash codes to represent the multi-modal data and make them intuitively comparable. However, such unified or equal-length hash representations could inherently sacrifice their representation scalability because the data from different modalities may not have one-to-one correspondence and could be encoded more efficiently by different hash codes of unequal lengths. To mitigate these problems, this paper explores a related and relatively unexplored problem: encoding the heterogeneous data with varying hash lengths and generalizing cross-modal retrieval to various challenging scenarios. To this end, a generalized and flexible cross-modal hashing framework, termed Matrix Tri-Factorization Hashing (MTFH), is proposed to work seamlessly in various settings including paired or unpaired multi-modal data, and equal or varying hash length encoding scenarios. More specifically, MTFH exploits an efficient objective function to flexibly learn the modality-specific hash codes with different length settings, while synchronously learning two semantic correlation matrices that semantically correlate the different hash representations and make the heterogeneous data comparable. As a result, the derived hash codes are more semantically meaningful for various challenging cross-modal retrieval tasks. Extensive experiments on public benchmark datasets highlight the superiority of MTFH under various retrieval scenarios and show its competitive performance with state-of-the-art methods.
KW - Cross-modal retrieval
KW - matrix tri-factorization hashing
KW - semantic correlation matrix
KW - varying hash length
UR - http://www.scopus.com/inward/record.url?scp=85100821891&partnerID=8YFLogxK
U2 - 10.1109/TPAMI.2019.2940446
DO - 10.1109/TPAMI.2019.2940446
M3 - Journal article
C2 - 31514125
AN - SCOPUS:85100821891
SN - 0162-8828
VL - 43
SP - 964
EP - 981
JO - IEEE Transactions on Pattern Analysis and Machine Intelligence
JF - IEEE Transactions on Pattern Analysis and Machine Intelligence
IS - 3
ER -