The kappa statistic is commonly used as a measure of interrater reliability. The kappa functions below compute a kappa score given a 2×2 table of observed counts. These interrater “confusion” counts can be represented using either a one-dimensional array (use the kappa1 function below):
array(count(no/no), count(no/yes), count(yes/no), count(yes/yes));
or a two-dimensional array (use the kappa2 function below):
array(
    array(count(no/no), count(no/yes)),
    array(count(yes/no), count(yes/yes))
);
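Both functions implement the same calculation: if pObserved is the proportion of cases on which the raters agree (the diagonal counts divided by the total count n) and pExpected is the agreement expected by chance alone (computed from the row and column totals), then

kappa = (pObserved − pExpected) / (1 − pExpected)

where pExpected = ((col1 × row1) + (col2 × row2)) / n².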
<?php
/**
 * Compute kappa for a 2x2 table.
 *
 * An index which compares the agreement against that which might be
 * expected by chance. Possible values range from +1 (perfect agreement)
 * to 0 (no agreement above that expected by chance) to -1 (complete
 * disagreement).
 *
 * @see http://www.dmi.columbia.edu/homepages/chuangj/kappa/
 *
 * @param array $table2x2 2x2 array of interrater no/yes agreements/disagreements
 *
 * @return float the kappa statistic
 */
function kappa2($table2x2) {
    $row1 = $table2x2[0][0] + $table2x2[0][1];
    $row2 = $table2x2[1][0] + $table2x2[1][1];
    $col1 = $table2x2[0][0] + $table2x2[1][0];
    $col2 = $table2x2[0][1] + $table2x2[1][1];
    $n    = $col1 + $col2;

    $pObserved = ($table2x2[0][0] + $table2x2[1][1]) / $n;
    $pExpected = (($col1 * $row1) + ($col2 * $row2)) / ($n * $n);

    $kappa = ($pObserved - $pExpected) / (1 - $pExpected);
    return $kappa;
}

/**
 * Compute kappa for a 2x2 table stored as a flat one-dimensional array
 * in no/no, no/yes, yes/no, yes/yes order.
 *
 * @see http://bij.isi.uu.nl/ImageJ/api/index.html {BIJStats.java}
 *
 * @param array $table flat array of interrater no/yes counts
 *
 * @return float the kappa statistic
 */
function kappa1($table) {
    $g1 = $table[0] + $table[1];  // row totals
    $g2 = $table[2] + $table[3];
    $f1 = $table[0] + $table[2];  // column totals
    $f2 = $table[1] + $table[3];
    $n  = $f1 + $f2;

    $pObserved = ($table[0] + $table[3]) / $n;
    $pExpected = (($f1 * $g1) + ($f2 * $g2)) / ($n * $n);

    $kappa = ($pObserved - $pExpected) / (1 - $pExpected);
    return $kappa;
}

$table2x2 = array(
    array(10, 7),
    array(0, 12)
);
echo kappa2($table2x2);
?>
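As a quick check (not part of the original example), kappa1 should return the same value when handed the same counts flattened row by row; for the table above, both functions yield 240/443 ≈ 0.5418:

<?php
// Same counts as $table2x2 above, flattened in
// no/no, no/yes, yes/no, yes/yes order.
echo kappa1(array(10, 7, 0, 12));  // ≈ 0.5418, matching kappa2($table2x2)
?>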
The example values input and output by this script were taken from this discussion of the kappa statistic. There are two reasons why I am particularly interested in the kappa statistic:

1. The kappa statistic is a close cousin of the chi-square statistic, which I have written about on numerous occasions and developed math packages for. I have seen the kappa statistic used alongside the chi-square statistic to provide additional insight into tables of categorical count data.

2. An important aspect of the concept of “Collaborative Filtering” is the idea of “interrater reliability” – developing operational definitions of what it means to agree or disagree in your assessment of some object attribute. The kappa statistic is a good place to begin one’s exploration of the mathematics of collaborative filtering.

Exercise

Generalize the algorithm for computing a “kappa” score so that it can be used on 3×3 tables and n×n tables.
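For readers who want to check their answer against something, here is a sketch of one possible generalization (my own, not from the sources cited above; kappaN is a hypothetical name). It treats the diagonal of the n×n table as the agreement cells and builds the expected agreement from the row and column marginals:

<?php
/**
 * Sketch: kappa for an arbitrary n x n table of interrater counts.
 * Assumes $table is a square two-dimensional array of non-negative counts.
 */
function kappaN($table) {
    $size = count($table);
    $rowTotals = array_fill(0, $size, 0);
    $colTotals = array_fill(0, $size, 0);
    $n = 0;         // grand total of all counts
    $agreement = 0; // sum of the diagonal cells (raters agree)

    for ($i = 0; $i < $size; $i++) {
        for ($j = 0; $j < $size; $j++) {
            $rowTotals[$i] += $table[$i][$j];
            $colTotals[$j] += $table[$i][$j];
            $n += $table[$i][$j];
        }
        $agreement += $table[$i][$i];
    }

    $pObserved = $agreement / $n;
    $pExpected = 0;
    for ($i = 0; $i < $size; $i++) {
        $pExpected += ($rowTotals[$i] * $colTotals[$i]) / ($n * $n);
    }
    return ($pObserved - $pExpected) / (1 - $pExpected);
}

// For n = 2 this reproduces the example above (≈ 0.5418):
echo kappaN(array(array(10, 7), array(0, 12)));
?>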