Content moderation on social media: Does it matter who and why moderates hate speech?

Sai Wang*, Ki Joon Kim

*Corresponding author for this work

Research output: Contribution to journal › Journal article › peer-review

1 Citation (Scopus)


Artificial intelligence (AI) has been increasingly integrated into content moderation to detect and remove hate speech on social media. An online experiment (N = 478) was conducted to examine how moderation agents (AI vs. human vs. human–AI collaboration) and removal explanations (with vs. without) affect users' perceptions and acceptance of removal decisions for hate speech targeting social groups with certain characteristics, such as religion or sexual orientation. The results showed that individuals exhibited consistent levels of perceived trustworthiness and acceptance of removal decisions regardless of the type of moderation agent. When explanations for the content takedown were provided, removal decisions made jointly by humans and AI were perceived as more trustworthy than the same decisions made by humans alone, which increased users' willingness to accept the verdict. However, this moderated mediation effect was significant only when Muslims, not homosexuals, were the target of hate speech.

Original language: English
Pages (from-to): 527-534
Number of pages: 8
Journal: Cyberpsychology, Behavior, and Social Networking
Issue number: 7
Early online date: 3 May 2023
Publication status: Published - 17 Jul 2023

Scopus Subject Areas

  • Communication
  • Social Psychology
  • Human-Computer Interaction
  • Applied Psychology
  • Computer Science Applications

User-Defined Keywords

  • artificial intelligence
  • content moderation
  • hate speech
  • social media
  • transparency
