TY - GEN
T1 - Message Injection Attack on Rumor Detection under the Black-Box Evasion Setting Using Large Language Model
AU - Luo, Yifeng
AU - Li, Yupeng
AU - Wen, Dacheng
AU - Lan, Liang
N1 - This work was supported in part by the National Natural Science Foundation of China under Grant 62202402, in part by the Guangdong Basic and Applied Basic Research Foundation under Grant 2022A1515011583 and Grant 2023A1515011562, in part by the Hong Kong Research Grants Council Early Career Scheme under Grant 22202423, in part by the National Natural Science Foundation of China under Grant 61906161, in part by the Germany/Hong Kong Joint Research Scheme sponsored by the Research Grants Council of Hong Kong and the German Academic Exchange Service of Germany under Grant G-HKBU203/22, in part by the One-Off Tier 2 Start-Up Grant (2020/2021) of Hong Kong Baptist University under Grant RC-OFSGT2/20-21/COMM/002, and in part by the Startup Grant (Tier 1) for New Academics of Hong Kong Baptist University under Grant AY2020/21.
Publisher Copyright:
© 2024 ACM.
PY - 2024/5/13
Y1 - 2024/5/13
N2 - Recent analyses have disclosed that existing rumor detection techniques, despite playing a pivotal role in countering the dissemination of misinformation on social media, are vulnerable to both white-box and surrogate-based black-box adversarial attacks. However, such attacks rely heavily on unrealistic assumptions, e.g., modifiable user data and white-box access to the rumor detection models, or an appropriate selection of surrogate models, which are impractical in the real world. Thus, existing analyses fail to uncover the robustness of rumor detectors in practice. In this work, we take a further step toward investigating the robustness of existing rumor detection solutions. Specifically, we focus on state-of-the-art rumor detectors, which leverage graph neural network based models to predict whether a post is a rumor based on its Message Propagation Tree (MPT), a conversation tree with the post as its root and the replies to the post as the descendants of the root. We propose a novel black-box attack method, HMIA-LLM, against these rumor detectors, which uses a Large Language Model to generate malicious messages and injects them into the targeted MPTs. Our extensive evaluation across three rumor detection datasets, four target rumor detectors, and three comparison baselines demonstrates the effectiveness of our proposed attack method in compromising the performance of state-of-the-art rumor detectors.
AB - Recent analyses have disclosed that existing rumor detection techniques, despite playing a pivotal role in countering the dissemination of misinformation on social media, are vulnerable to both white-box and surrogate-based black-box adversarial attacks. However, such attacks rely heavily on unrealistic assumptions, e.g., modifiable user data and white-box access to the rumor detection models, or an appropriate selection of surrogate models, which are impractical in the real world. Thus, existing analyses fail to uncover the robustness of rumor detectors in practice. In this work, we take a further step toward investigating the robustness of existing rumor detection solutions. Specifically, we focus on state-of-the-art rumor detectors, which leverage graph neural network based models to predict whether a post is a rumor based on its Message Propagation Tree (MPT), a conversation tree with the post as its root and the replies to the post as the descendants of the root. We propose a novel black-box attack method, HMIA-LLM, against these rumor detectors, which uses a Large Language Model to generate malicious messages and injects them into the targeted MPTs. Our extensive evaluation across three rumor detection datasets, four target rumor detectors, and three comparison baselines demonstrates the effectiveness of our proposed attack method in compromising the performance of state-of-the-art rumor detectors.
KW - black-box evasion setting
KW - graph neural networks
KW - large language model
KW - message injection attack
KW - rumor detection
UR - http://www.scopus.com/inward/record.url?scp=85194097248&partnerID=8YFLogxK
U2 - 10.1145/3589334.3648139
DO - 10.1145/3589334.3648139
M3 - Conference proceeding
AN - SCOPUS:85194097248
T3 - WWW 2024 - Proceedings of the ACM Web Conference
SP - 4512
EP - 4522
BT - WWW '24: Proceedings of the ACM Web Conference 2024
PB - Association for Computing Machinery (ACM)
T2 - 33rd ACM Web Conference, WWW 2024
Y2 - 13 May 2024 through 17 May 2024
ER -