Understanding Cohen’s Kappa
Cohen’s kappa (and its weighted variant, the quadratic weighted kappa) is a metric for measuring the agreement between two raters or two sets of ratings.
The key idea of this metric is to compare p_0, the probability that the two classifiers actually agree, with p_e, the probability that they agree by chance.
The formula for this metric is:
K = (p_0 - p_e) / (1 - p_e)
You may wonder why the metric is defined this way rather than, say, K = p_0 * p_e or some other form. Here is the reason:
- When the two classifiers agree perfectly: K = 1
- When the agreement between the two classifiers is no better than chance: K = 0 (see the substitution below)
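A quick substitution into the formula shows both properties (assuming p_e < 1):
If p_0 = 1 (perfect agreement): K = (1 - p_e) / (1 - p_e) = 1
If p_0 = p_e (agreement is purely by chance): K = (p_e - p_e) / (1 - p_e) = 0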
For example, suppose we have two classifiers that predict the categories 1, 2, and 3, with predictions y_1 and y_2:
y_1 = [1,2,3,1,2,3,1,2,3]
y_2 = [3,2,1,3,2,1,3,2,1]
p_0 = 3/9 = 1/3
# observed agreement: the two classifiers give the same label at 3 of the 9 positions
y_1_p = [1/3, 1/3, 1/3]
# each of the categories 1, 2, 3 appears 3 times in y_1, so each marginal probability is 3/9
y_2_p = [1/3, 1/3, 1/3]
# the same holds for y_2
# probability that the two classifiers agree by chance:
# sum over categories of (marginal of y_1) * (marginal of y_2)
p_e = 1/9 + 1/9 + 1/9 = 1/3
cohen_kappa = (1/3 - 1/3) / (1 - 1/3) = 0
Since the observed agreement between y_1 and y_2 is exactly what we would expect by chance, Cohen’s kappa is 0.
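The same arithmetic can be checked programmatically. The snippet below is a minimal sketch that recomputes p_0 and p_e from the two prediction lists; the variable names c_1 and c_2 are just illustrative.
from collections import Counter

y_1 = [1, 2, 3, 1, 2, 3, 1, 2, 3]
y_2 = [3, 2, 1, 3, 2, 1, 3, 2, 1]
n = len(y_1)

# p_0: fraction of positions where the two classifiers give the same label
p_0 = sum(a == b for a, b in zip(y_1, y_2)) / n   # 3/9 = 1/3

# p_e: for each category, multiply the two marginal probabilities, then sum
c_1, c_2 = Counter(y_1), Counter(y_2)
p_e = sum(c_1[k] * c_2[k] / n ** 2 for k in set(y_1) | set(y_2))   # 1/3

print((p_0 - p_e) / (1 - p_e))   # 0.0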
The scikit-learn library has an implementation of it:
from sklearn import metrics
y_1 = [1,2,3,1,2,3,1,2,3]
y_2 = [3,2,1,3,2,1,3,2,1]
metrics.cohen_kappa_score(y_1, y_2)
# 0.0
metrics.accuracy_score(y_1, y_2)
# 0.333... (1/3)
This metric can also be implemented directly in Python.
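A minimal sketch of such an implementation, following the formula above, might look like this: it builds the confusion matrix between the two sets of labels and reads p_0 off the diagonal and p_e from the marginals. The function name cohen_kappa and the NumPy-based structure are my own choices here, not scikit-learn’s implementation.
import numpy as np

def cohen_kappa(y1, y2):
    """Unweighted Cohen's kappa for two equally long lists of labels."""
    labels = sorted(set(y1) | set(y2))
    index = {lab: i for i, lab in enumerate(labels)}
    n = len(y1)

    # Confusion matrix between the two classifiers, normalised to probabilities.
    cm = np.zeros((len(labels), len(labels)))
    for a, b in zip(y1, y2):
        cm[index[a], index[b]] += 1
    cm /= n

    p_0 = np.trace(cm)                      # observed agreement (diagonal mass)
    p_e = cm.sum(axis=1) @ cm.sum(axis=0)   # chance agreement from the marginals
    return (p_0 - p_e) / (1 - p_e)

y_1 = [1, 2, 3, 1, 2, 3, 1, 2, 3]
y_2 = [3, 2, 1, 3, 2, 1, 3, 2, 1]
print(cohen_kappa(y_1, y_2))   # 0.0, matching metrics.cohen_kappa_score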