When to use Cohen’s κ, if ever?

Xinshu Zhao

Research output: Contribution to conference › Conference paper



For half a century, Cohen’s κ has been the most widely used general indicator of reliability. It was cited in more than three thousand journal articles between 1994 and 2009, according to the Social Science Citation Index.

This article presents 14 paradoxes to show that κ is not a general indicator. An analysis of κ’s mathematics and underlying logic uncovers three assumptions: (1) each assessor predetermines a quota and faithfully enforces it; (2) assessors maximize chance diagnosing as the second priority; (3) assessors conduct honest diagnosing as the last priority. These assumptions have three implications: (1) assessors perform a constrained task of assigning objects to categories predetermined by the quota; (2) assessors fix the distribution before diagnosing; (3) assessors apply a variable benchmark that depends on the predetermined distribution.

These assumptions constitute boundaries beyond which κ should not be used. We show that the 14 paradoxes emerge because κ is used beyond its boundaries – when at least one of the three assumptions is violated.
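
As a rough illustration of how κ can behave unexpectedly, the sketch below computes κ from its standard definition, κ = (p_o − p_e) / (1 − p_e), for two made-up 2×2 tables that share the same 80% observed agreement. The function name `cohens_kappa` and both tables are hypothetical, chosen only for illustration; the abstract does not list the 14 paradoxes, so this widely known sensitivity to skewed marginal distributions is not necessarily one of them.

```python
def cohens_kappa(table):
    """Cohen's kappa from a square contingency table.

    table[i][j] = number of objects that assessor A put in category i
    and assessor B put in category j (hypothetical input layout).
    """
    n = sum(sum(row) for row in table)
    k = len(table)
    # Observed agreement: share of objects on the diagonal.
    p_o = sum(table[i][i] for i in range(k)) / n
    # Marginal distributions of each assessor.
    row_m = [sum(table[i]) / n for i in range(k)]
    col_m = [sum(table[i][j] for i in range(k)) / n for j in range(k)]
    # Chance agreement estimated from the product of the marginals.
    p_e = sum(row_m[i] * col_m[i] for i in range(k))
    return (p_o - p_e) / (1 - p_e)


# Two illustrative tables, both with 80 of 100 objects on the diagonal.
balanced = [[40, 10], [10, 40]]  # marginals 50/50
skewed = [[80, 10], [10, 0]]     # marginals 90/10

print(f"balanced: kappa = {cohens_kappa(balanced):.2f}")  # 0.60
print(f"skewed:   kappa = {cohens_kappa(skewed):.2f}")    # -0.11
```

Both pairs of assessors agree on 80 of 100 objects, yet κ falls from about 0.60 to about −0.11 once the marginal distributions become skewed, because the chance-agreement term p_e is derived from those marginals.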

We conclude that, when the assumptions do not hold, κ does not apply. As the assumptions rarely hold, we should rarely use κ, if ever.

Original language: English
Publication status: Published - 30 May 2011
Event: 61st Annual International Communication Association Conference, ICA 2011: Communication @ the Center - Boston, United States
Duration: 26 May 2011 to 30 May 2011


Conference: 61st Annual International Communication Association Conference, ICA 2011
Country/Territory: United States


