TY - GEN
T1 - Multi-modal Mood Reader
T2 - 5th International Conference on Neural Computing for Advanced Applications, NCAA 2024
AU - Dong, Yihang
AU - Chen, Xuhang
AU - Shen, Yanyan
AU - Ng, Michael Kwok-Po
AU - Qian, Tao
AU - Wang, Shuqiang
N1 - This work was supported in part by the National Natural Science Foundation of China under Grant 62172403 and the Distinguished Young Scholars Fund of Guangdong under Grant 2021B1515020019. M. Ng’s research is supported in part by HKRGC GRF 17201020 and 17300021, HKRGC CRF C7004-21GF, and the Joint NSFC-RGC Grant N-HKU769/21.
Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025.
PY - 2024/9/22
Y1 - 2024/9/22
AB - Emotion recognition based on Electroencephalography (EEG) has gained significant attention and diversified development in fields such as neural signal processing and affective computing. However, the unique brain anatomy of individuals leads to non-negligible natural differences in EEG signals across subjects, posing challenges for cross-subject emotion recognition. While recent studies have attempted to address these issues, they still face limitations in practical effectiveness and model framework unity. Current methods often struggle to capture the complex spatial-temporal dynamics of EEG signals and fail to effectively integrate multimodal information, resulting in suboptimal performance and limited generalizability across subjects. To overcome these limitations, we develop a pre-trained-model-based Multimodal Mood Reader for cross-subject emotion recognition that utilizes masked brain signal modeling and an interlinked spatial-temporal attention mechanism. The model learns universal latent representations of EEG signals through pre-training on a large-scale dataset, and employs the interlinked spatial-temporal attention mechanism to process Differential Entropy (DE) features extracted from EEG data. Subsequently, a multi-level fusion layer is proposed to integrate the discriminative features, maximizing the advantages of features across different dimensions and modalities. Extensive experiments on public datasets demonstrate Mood Reader’s superior performance in cross-subject emotion recognition tasks, outperforming state-of-the-art methods. Additionally, the model is dissected from an attention perspective, providing qualitative analysis of emotion-related brain areas and offering valuable insights for affective research in neural signal processing.
KW - EEG-based emotion recognition
KW - masked brain signal modeling
KW - pre-trained model
KW - spatial-temporal attention
UR - http://www.scopus.com/inward/record.url?scp=85205474533&partnerID=8YFLogxK
U2 - 10.1007/978-981-97-7007-6_13
DO - 10.1007/978-981-97-7007-6_13
M3 - Conference proceeding
AN - SCOPUS:85205474533
SN - 9789819770069
T3 - Communications in Computer and Information Science
SP - 178
EP - 192
BT - Neural Computing for Advanced Applications - 5th International Conference, NCAA 2024, Proceedings
A2 - Zhang, Haijun
A2 - Li, Xianxian
A2 - Hao, Tianyong
A2 - Meng, Weizhi
A2 - Wu, Zhou
A2 - He, Qian
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 5 July 2024 through 7 July 2024
ER -