TY - JOUR
T1 - Content moderation on social media
T2 - Does it matter who and why moderates hate speech?
AU - Wang, Sai
AU - Kim, Ki Joon
N1 - Funding information:
The research was supported by the first author's Start-up Grant (21.4431.162804) from Hong Kong Baptist University, Hong Kong SAR, China.
Publisher copyright:
© Mary Ann Liebert, Inc.
PY - 2023/7/17
Y1 - 2023/7/17
N2 - Artificial intelligence (AI) has been increasingly integrated into content moderation to detect and remove hate speech on social media. An online experiment (N = 478) was conducted to examine how moderation agents (AI vs. human vs. human–AI collaboration) and removal explanations (with vs. without) affect users' perceptions and acceptance of removal decisions for hate speech targeting social groups with certain characteristics, such as religion or sexual orientation. The results showed that individuals exhibit consistent levels of perceived trustworthiness and acceptance of removal decisions regardless of the type of moderation agent. When explanations for the content takedown were provided, removal decisions made jointly by humans and AI were perceived as more trustworthy than the same decisions made by humans alone, which increased users' willingness to accept the verdict. However, this moderated mediation effect was only significant when Muslims, not homosexuals, were the target of hate speech.
AB - Artificial intelligence (AI) has been increasingly integrated into content moderation to detect and remove hate speech on social media. An online experiment (N = 478) was conducted to examine how moderation agents (AI vs. human vs. human–AI collaboration) and removal explanations (with vs. without) affect users' perceptions and acceptance of removal decisions for hate speech targeting social groups with certain characteristics, such as religion or sexual orientation. The results showed that individuals exhibit consistent levels of perceived trustworthiness and acceptance of removal decisions regardless of the type of moderation agent. When explanations for the content takedown were provided, removal decisions made jointly by humans and AI were perceived as more trustworthy than the same decisions made by humans alone, which increased users' willingness to accept the verdict. However, this moderated mediation effect was only significant when Muslims, not homosexuals, were the target of hate speech.
KW - artificial intelligence
KW - content moderation
KW - hate speech
KW - social media
KW - transparency
UR - http://www.scopus.com/inward/record.url?scp=85163602458&partnerID=8YFLogxK
U2 - 10.1089/cyber.2022.0158
DO - 10.1089/cyber.2022.0158
M3 - Journal article
SN - 2152-2715
VL - 26
SP - 527
EP - 534
JO - Cyberpsychology, Behavior, and Social Networking
JF - Cyberpsychology, Behavior, and Social Networking
IS - 7
ER -