TY - GEN
T1 - MCFEND
T2 - 33rd ACM Web Conference, WWW 2024
AU - Li, Yupeng
AU - He, Haorui
AU - Bai, Jin
AU - Wen, Dacheng
N1 - This work was supported by the National Natural Science Foundation of China (No. 62202402), the Guangdong Basic and Applied Basic Research Foundation (No. 2022A1515011583 and No. 2023A1515011562), the Hong Kong RGC Early Career Scheme (No. 22202423), the Germany/Hong Kong Joint Research Scheme sponsored by the Research Grants Council of Hong Kong and the German Academic Exchange Service of Germany (No. G-HKBU203/22), the One-off Tier 2 Startup Grant (2020/2021) of Hong Kong Baptist University (Ref. RCOFSGT2/20-21/COMM/002), and the Startup Grant (Tier 1) for New Academics AY2020/21 of Hong Kong Baptist University.
Publisher Copyright:
© 2024 ACM.
PY - 2024/5/13
Y1 - 2024/5/13
AB - The prevalence of fake news across various online sources has had a significant influence on the public. Existing Chinese fake news detection datasets are limited to news sourced solely from Weibo. However, fake news originating from multiple sources is diverse in many respects, including its content and social context. Methods trained on a single news source are therefore hardly applicable to real-world scenarios. Our pilot experiment demonstrates that the F1 score of the state-of-the-art method trained on a large Chinese fake news detection dataset, Weibo-21, drops significantly from 0.943 to 0.470 when the test data is changed to multi-source news data, failing to identify more than one-third of the multi-source fake news. To address this limitation, we constructed the first multi-source benchmark dataset for Chinese fake news detection, termed MCFEND, which is composed of news we collected from diverse sources such as social platforms, messaging apps, and traditional online news outlets. Notably, this news has been fact-checked by 14 authoritative fact-checking agencies worldwide. In addition, various existing Chinese fake news detection methods are thoroughly evaluated on our proposed dataset in cross-source, multi-source, and unseen-source settings. MCFEND, as a benchmark dataset, aims to advance Chinese fake news detection approaches in real-world scenarios.
KW - Chinese fake news detection
KW - cross-source evaluation
KW - multi-source benchmark dataset
KW - multi-source evaluation
UR - http://www.scopus.com/inward/record.url?scp=85194096251&partnerID=8YFLogxK
U2 - 10.1145/3589334.3645385
DO - 10.1145/3589334.3645385
M3 - Conference proceeding
AN - SCOPUS:85194096251
T3 - WWW 2024 - Proceedings of the ACM Web Conference
SP - 4018
EP - 4027
BT - WWW 2024 - Proceedings of the ACM Web Conference
PB - Association for Computing Machinery (ACM)
Y2 - 13 May 2024 through 17 May 2024
ER -