TY - JOUR
T1 - GMFAD: Towards Generalized Visual Recognition via Multilayer Feature Alignment and Disentanglement
AU - Li, Haoliang
AU - Wang, Shiqi
AU - Wan, Renjie
AU - Kot, Alex C.
N1 - Publisher Copyright: © 1979-2012 IEEE.
PY - 2022/3/1
Y1 - 2022/3/1
N2 - Deep learning based approaches, which have repeatedly proven beneficial for visual recognition tasks, usually make the strong assumption that training and test data are drawn from similar feature spaces and distributions. However, this assumption may not hold in many practical visual recognition scenarios. Inspired by the hierarchical organization of deep feature representations, which yields progressively more abstract features at higher layers, we propose to tackle this problem with a novel feature learning framework, called GMFAD, that offers better generalization capability in a multilayer perceptron manner. We first learn feature representations at the shallow layer, where shareable underlying factors among domains (e.g., a subset of which could be relevant for each particular domain) can be explored. In particular, we propose to reduce the divergence between domain pair(s) by considering both inter-dimension and inter-sample correlations, which have been largely ignored by many cross-domain visual recognition methods. Subsequently, to learn more abstract information that could further benefit transferability, we propose to conduct feature disentanglement at the deep feature layer. Extensive experiments on different visual recognition tasks demonstrate that our proposed framework learns more transferable feature representations than state-of-the-art baselines.
KW - covariance matrix
KW - disentanglement
KW - generalization capability
KW - visual recognition
UR - http://www.scopus.com/inward/record.url?scp=85124055653&partnerID=8YFLogxK
U2 - 10.1109/TPAMI.2020.3020554
DO - 10.1109/TPAMI.2020.3020554
M3 - Journal article
C2 - 32870783
AN - SCOPUS:85124055653
SN - 0162-8828
VL - 44
SP - 1289
EP - 1303
JO - IEEE Transactions on Pattern Analysis and Machine Intelligence
JF - IEEE Transactions on Pattern Analysis and Machine Intelligence
IS - 3
ER -