ExplainHM++: Explainable Harmful Meme Detection With Retrieval-Augmented Debate Between Large Multimodal Models

  • Hongzhan Lin
  • Wei Gao
  • Jing Ma*
  • Yang Deng
  • Ziyang Luo
  • Bo Wang
  • Ruichao Yang
  • Tat-Seng Chua
  • *Corresponding author for this work

Research output: Contribution to journal › Journal article › peer-review

Abstract

Identifying harmful memes is challenging due to their implicit meanings, which are not always evident from texts and images alone. Existing solutions often lack clear explanations to justify their decisions. To address this gap, we propose an explainable approach, ExplainHM++, which detects harmful memes by reasoning over competing rationales from both harmful and harmless perspectives. First, inspired by the capabilities of Large Multimodal Models (LMMs) in text generation and multimodal reasoning, we develop ExplainHM, which conducts a one-stage multimodal debate in which LMMs generate explanations through contradictory arguments. Second, we fine-tune a small language model to serve as a judge in the debate, improving the integration of harmfulness rationales with the multimodal content of memes. However, we observe that a naive multimodal debate remains vulnerable, as it depends heavily on the inherent reasoning ability of LMMs to understand the memes. Given the evolving and noisy nature of memes, we further introduce a meme sample retrieval mechanism and a retrieval-augmented debate paradigm to strengthen and refine LMM-generated explanations. Extensive experiments on three public meme datasets demonstrate that ExplainHM++ not only outperforms state-of-the-art methods but also provides superior, interpretable explanations for harmful meme detection.
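The paper's implementation is not reproduced on this page. As a rough, hypothetical illustration only, the sketch below shows how a pipeline of the shape described in the abstract might be wired together: retrieve similar meme samples, elicit opposing "harmful" and "harmless" rationales from an LMM, then have a judge model decide. Every name here (Meme, retrieve_similar, debate_round, judge), the prompt wording, the toy embeddings, and the stubbed models are assumptions made for illustration, not the authors' code or API.

```python
from dataclasses import dataclass
from typing import Callable, List
import math

@dataclass
class Meme:
    text: str                # overlaid/accompanying text
    image_desc: str          # stand-in for visual content (a caption)
    embedding: List[float]   # precomputed multimodal embedding (assumed given)

def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve_similar(query: Meme, corpus: List[Meme], k: int = 2) -> List[Meme]:
    """Meme sample retrieval: nearest neighbours by embedding similarity."""
    ranked = sorted(corpus, key=lambda m: cosine(query.embedding, m.embedding),
                    reverse=True)
    return ranked[:k]

def debate_round(meme: Meme, exemplars: List[Meme],
                 lmm: Callable[[str], str]) -> tuple:
    """One-stage debate: ask the LMM for contradictory rationales,
    grounded in retrieved exemplars (retrieval-augmented debate)."""
    context = "\n".join(f"- similar meme: {m.text} / {m.image_desc}"
                        for m in exemplars)
    pro = lmm(f"Argue this meme IS harmful.\nContext:\n{context}\n"
              f"Meme: {meme.text} / {meme.image_desc}")
    con = lmm(f"Argue this meme is HARMLESS.\nContext:\n{context}\n"
              f"Meme: {meme.text} / {meme.image_desc}")
    return pro, con

def judge(meme: Meme, pro: str, con: str,
          judge_model: Callable[[str], str]) -> str:
    """A fine-tuned small language model acts as judge over the rationales."""
    return judge_model(f"Meme: {meme.text} / {meme.image_desc}\n"
                       f"Harmful rationale: {pro}\n"
                       f"Harmless rationale: {con}\nVerdict:")

if __name__ == "__main__":
    # Toy corpus and query; real embeddings would come from a multimodal encoder.
    corpus = [
        Meme("stay strong", "a cat lifting weights", [0.9, 0.1]),
        Meme("go back home", "a crowd at a border", [0.1, 0.9]),
    ]
    query = Meme("they don't belong here", "a group of people", [0.2, 0.8])

    stub_lmm = lambda prompt: f"[rationale for: {prompt[:30]}...]"  # placeholder LMM
    stub_judge = lambda prompt: "harmful"  # placeholder fine-tuned judge

    exemplars = retrieve_similar(query, corpus)
    pro, con = debate_round(query, exemplars, stub_lmm)
    print(judge(query, pro, con, stub_judge))
```

In this sketch the judge sees both rationales alongside the meme itself, mirroring the abstract's point that the verdict should integrate the harmfulness rationales with the multimodal content rather than rely on either argument alone.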

Original language: English
Pages (from-to): 1-14
Number of pages: 14
Journal: IEEE Transactions on Knowledge and Data Engineering
Volume: 38
Issue number: 2
Publication status: E-pub ahead of print - 26 Nov 2025

User-Defined Keywords

  • Harmful meme detection
  • explainability
  • retrieval-augmented debate
  • large multimodal models
