We agreed to measure agreement: Redefining reliability de-justifies Krippendorff’s alpha

Xinshu Zhao, Guangchao Charles Feng, Jun S. Liu, Ke Deng

Research output: Contribution to journal; Journal article; peer-review


Zhao, Liu, & Deng (2013) reviewed 22 inter-coder reliability indices and found that each makes one or more unrealistic assumptions about coder behavior, leading to paradoxes and abnormalities. Krippendorff’s α makes more such assumptions and consequently produces more paradoxes and abnormalities than any other index. Professor Krippendorff (2013) countered that “most of the authors’ discoveries are the artifacts of being led astray by strange, almost conspiratorial uses of language.” The commentary reiterated Krippendorff’s long-standing position that Krippendorff’s α is the standard reliability measure and is the only index qualified to serve the function (Hayes & Krippendorff, 2007; Krippendorff, 2004b, 2016). This paper continues this dialogue. We offer a review of the literature to show that the scientific community, including Krippendorff, has defined intercoder reliability as intercoder agreement, and that Krippendorff’s α, like all its main competitors, was designed and declared to measure intercoder agreement. Now that evidence is mounting that α, like Scott’s π and Cohen’s κ, does not accurately measure intercoder agreement, Krippendorff has chosen to redefine intercoder reliability and, furthermore, to redefine information, variation, sensitivity, and specificity. By redefining reliability, we argue, Prof. Krippendorff has redefined the function of Krippendorff’s α, thereby disqualifying α as an indicator of intercoder agreement. The search for a better index of intercoder agreement, also known as intercoder reliability, should continue. We also note, however, a spiral of inertia in science communication in general and in reliability research in particular. This powerful spiral, we argue, should not forever keep up appearances for α, π, or κ.
Original language: English
Pages (from-to): 1-15
Number of pages: 15
Journal: China Media Research
Issue number: 2
Publication status: Published - Nov 2018

User-Defined Keywords

  • spiral of inertia
  • selective spiral
  • reliability
  • inter-coder reliability
  • inter-rater reliability
  • agreement
  • Cohen’s kappa
  • Scott’s pi
  • Krippendorff’s alpha
  • multi-signification
  • multi-concepts
  • multi-signified
  • multi-signs
  • multi-signifiers
  • sensitivity
  • specificity
  • mechanical information
  • human information
  • aggregate estimation
  • individual classification
  • individual prediction

