TY - JOUR
T1 - TB-MMRD: transformer-based multi-modal election rumor detection with agreement-aware gating and semantic fusion
AU - Kwao, Lazarus
AU - Ma, Jing
AU - Yussif, Sophyani Banaamwini
AU - Ativi, Wisdom Xornam
AU - Ayawli, Ben Beklisi Kwame
N1 - This work was supported by the National Natural Science Foundation of China (62402093) and the Sichuan Science and Technology Program (2025ZNSFSC0479 and 2024NSFTD0034). It was also supported in part by the National Natural Science Foundation of China under grants U20B2063 and 62220106008.
Publisher Copyright:
© The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2025.
PY - 2025/8/19
Y1 - 2025/8/19
AB - The rise of social media has made it easier to share information in real time. However, it has also made it easier for rumors to spread quickly, particularly during sensitive events, such as elections. Many of these rumors appear in the form of posts that combine text, images, and emotionally charged captions in ways that can mislead people, making them hard to detect with traditional models. While prior multimodal fusion approaches have shown promise, they continue to face persistent challenges, including semantic misalignment across modalities, noisy user-generated content, and the high computational demands of deep fusion architectures. To overcome these limitations, we propose TB-MMRD, a Transformer-based Multi-modal Rumor Detection framework. TB-MMRD consists of three main components. First, a multimodal feature extraction module uses DistilRoBERTa for textual inputs and VGG-19 for images to capture informative representations while reducing computational overhead. Second, a dual-stage fusion architecture introduces agreement-aware gating before and after Linformer-based attention to suppress semantically inconsistent or noisy features at both early and late fusion stages. Third, a lightweight classification head enables fast and reliable rumor classification. We evaluate our model on Twitter, FakeNewsNet (GossipCop and PolitiFact), and a new Ghana-focused dataset (GhElection). Experimental results show that our model consistently outperforms state-of-the-art baselines in terms of accuracy, F1-score, and robustness to noise, validating the effectiveness of alignment-aware filtering and efficient attention in multimodal rumor detection.
KW - Multimodal rumor detection
KW - Noise gating
KW - Political elections
KW - Semantic alignment
KW - Transformer
UR - http://www.scopus.com/inward/record.url?scp=105013762847&partnerID=8YFLogxK
U2 - 10.1007/s00530-025-01937-9
DO - 10.1007/s00530-025-01937-9
M3 - Journal article
AN - SCOPUS:105013762847
SN - 0942-4962
VL - 31
JO - Multimedia Systems
JF - Multimedia Systems
IS - 4
M1 - 315
ER -