TY - JOUR
T1 - SIR-HCL: Semantic-Inconsistency Reasoning and Hybrid Contrastive Learning for Efficient Cross-Emotion Anomaly Detection
AU - Liu, Xin
AU - Chen, Qiyan
AU - Cheung, Yiu-ming
AU - Peng, Shu-Juan
N1 - This work was supported in part by the National Natural Science Foundation of China under Grant 62476103, in part by the Natural Science Foundation of Xiamen City under Grant 3502Z202473043, in part by the Natural Science Foundation of Fujian Province under Grant 2024J01096 and Grant 2022J01316, in part by the NSFC/Research Grants Council (RGC) Joint Research Scheme under Grant N_HKBU214/21, in part by the RGC Senior Research Fellow Scheme under Grant SRFS2324-2S02, and in part by the General Research Fund of RGC under Grant 12201321 and Grant 12202622.
PY - 2025/10
Y1 - 2025/10
N2 - Cross-emotion anomaly detection is an emerging and challenging research topic in the field of cognitive analysis. It aims to identify abnormal emotion pairs whose semantic patterns are inconsistent across different emotional modalities. To the best of our knowledge, this topic has yet to be well studied, although it could benefit many valuable cognitive applications, such as the diagnosis of autism in children and criminal deception detection. To this end, this article proposes an efficient cross-emotion anomaly detection approach via semantic-inconsistency reasoning and hybrid contrastive learning (SIR-HCL), which is the first attempt to detect anomalous emotional pairs across audio-visual emotions. First, the proposed framework uses a dual-branch network to obtain deep emotional features in each modality and then employs a shared residual block to derive semantically compatible features. Subsequently, an efficient hybrid contrastive learning approach is designed to enlarge the semantic inconsistency between abnormal emotional pairs from different affective classes, while enhancing the semantic consistency and increasing the feature correlation between normal emotional pairs from the same affective class. At the same time, an efficient bidirectional learning scheme is employed to significantly improve data utilization, and a two-component Beta Mixture Model is adaptively used to reason about the anomalous emotion pairs. Extensive experiments on two benchmark datasets show that the proposed SIR-HCL method detects anomalous emotional pairs across audio-visual emotional data well and brings substantial improvements over state-of-the-art competing methods.
AB - Cross-emotion anomaly detection is an emerging and challenging research topic in the field of cognitive analysis. It aims to identify abnormal emotion pairs whose semantic patterns are inconsistent across different emotional modalities. To the best of our knowledge, this topic has yet to be well studied, although it could benefit many valuable cognitive applications, such as the diagnosis of autism in children and criminal deception detection. To this end, this article proposes an efficient cross-emotion anomaly detection approach via semantic-inconsistency reasoning and hybrid contrastive learning (SIR-HCL), which is the first attempt to detect anomalous emotional pairs across audio-visual emotions. First, the proposed framework uses a dual-branch network to obtain deep emotional features in each modality and then employs a shared residual block to derive semantically compatible features. Subsequently, an efficient hybrid contrastive learning approach is designed to enlarge the semantic inconsistency between abnormal emotional pairs from different affective classes, while enhancing the semantic consistency and increasing the feature correlation between normal emotional pairs from the same affective class. At the same time, an efficient bidirectional learning scheme is employed to significantly improve data utilization, and a two-component Beta Mixture Model is adaptively used to reason about the anomalous emotion pairs. Extensive experiments on two benchmark datasets show that the proposed SIR-HCL method detects anomalous emotional pairs across audio-visual emotional data well and brings substantial improvements over state-of-the-art competing methods.
KW - Audio-visual emotion
KW - Beta Mixture Model
KW - cross-emotion anomaly detection
KW - hybrid contrastive learning
KW - semantic-inconsistency reasoning
UR - https://www.scopus.com/pages/publications/105000170854
U2 - 10.1109/TCDS.2025.3550645
DO - 10.1109/TCDS.2025.3550645
M3 - Journal article
SN - 2379-8920
VL - 17
SP - 1310
EP - 1322
JO - IEEE Transactions on Cognitive and Developmental Systems
JF - IEEE Transactions on Cognitive and Developmental Systems
IS - 5
ER -