Many intercoder reliability indices exist, but which of them to use remains debated. In this study, a Monte Carlo simulation was used to test empirically what determines the chance agreement estimated by each index. The chance agreement of Bennett’s S is found to be affected only by the number of categories; S is therefore a category-based index. The chance agreements of Krippendorff’s α, Scott’s π and Cohen’s κ are affected by the marginal distribution, the level of difficulty and the interaction between the two, although the difficulty level influences their chance agreements abnormally. These three indices are therefore, in general, distribution-based indices. Gwet’s AC1 reverses the direction of the three aforementioned indices, and its chance agreement is additionally affected by the number of categories and by the interaction between the number of categories and the marginal distribution. AC1 can thus be classified as an index based on the number of categories, the marginal distribution and the level of difficulty. Theoretical and practical implications are discussed at the end.
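To make the distinction between category-based and distribution-based indices concrete, the following is a minimal sketch of the textbook chance-agreement terms for two coders and nominal categories. It is not the simulation code of this study; the function name `chance_agreement` and its arguments are hypothetical, and the formulas assume complete data (no missing codes) and at least two categories.

```python
from collections import Counter

def chance_agreement(coder1, coder2, categories):
    """Chance-agreement term of each index for two coders (nominal data).

    Illustrative sketch only: assumes both coders rated every item
    and that len(categories) >= 2.
    """
    n_items = len(coder1)
    k = len(categories)
    c1, c2 = Counter(coder1), Counter(coder2)

    # Marginal proportions of each coder, and their average per category.
    p1 = {c: c1[c] / n_items for c in categories}
    p2 = {c: c2[c] / n_items for c in categories}
    mean_p = {c: (p1[c] + p2[c]) / 2 for c in categories}

    # Pooled counts over both coders, used by Krippendorff's alpha.
    n_values = 2 * n_items
    pooled = {c: c1[c] + c2[c] for c in categories}

    return {
        # Bennett's S: depends only on the number of categories.
        "S": 1 / k,
        # Scott's pi: squared average marginals.
        "pi": sum(mean_p[c] ** 2 for c in categories),
        # Cohen's kappa: product of each coder's own marginals.
        "kappa": sum(p1[c] * p2[c] for c in categories),
        # Krippendorff's alpha (nominal): pooled counts, drawn without replacement.
        "alpha": sum(pooled[c] * (pooled[c] - 1) for c in categories)
        / (n_values * (n_values - 1)),
        # Gwet's AC1: uses both the marginals and the number of categories.
        "AC1": sum(mean_p[c] * (1 - mean_p[c]) for c in categories) / (k - 1),
    }

# Hypothetical, heavily skewed codings of ten items into two categories.
coder1 = ["a"] * 9 + ["b"]
coder2 = ["a"] * 8 + ["b"] * 2
result = chance_agreement(coder1, coder2, ["a", "b"])
```

With these skewed marginals, the chance agreement of π is about 0.745 and that of κ about 0.74, whereas that of AC1 is about 0.255 and that of S stays at 0.5. For two categories the chance agreements of π and AC1 sum to one, which is one way to read the statement that AC1 reverses the direction of the distribution-based indices.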
- Intercoder reliability