BIOSTATISTICS
Year : 2016  Volume : 2  Issue : 2  Page : 217–219

Understanding the calculation of the kappa statistic: A measure of interobserver reliability

Sidharth S Mishra, Nitika
Department of Community Medicine, School of Public Health, Postgraduate Institute of Medical Education and Research, Chandigarh, India

It is common practice to assess the consistency of diagnostic ratings in terms of “agreement beyond chance.” The kappa coefficient is a popular index of agreement for binary and categorical ratings. This article focuses on the unweighted kappa statistic calculation by providing a stepwise approach that is supplemented with an example. The aim is that health care personnel may better understand the purpose of the kappa statistic and how to calculate it. The following core competencies are addressed in this article: Medical knowledge.
Introduction

It is common practice to assess the consistency of diagnostic ratings in terms of “agreement beyond chance”: specifically, the agreement between two clinicians under two different conditions or the agreement among multiple clinicians under one condition.[1] To this end, we consider Cohen's kappa, which is a common index of agreement for binary and categorical ratings.[2] If the categories are unordered, the unweighted kappa statistic (K) is appropriate. If the categories are ordered, as they are in most rating scales in clinical, psychological, and epidemiological research, the weighted kappa statistic (K[w]) is preferable.[3] While there are many modifications and variants of the kappa statistic, this article focuses on the calculation of the unweighted kappa statistic, providing a stepwise approach supplemented with an example.

Estimating the Kappa Statistic

Step 1: Calculate the percentages of each row and column out of the grand total of all four cells [Table 1].{Table 1}

Step 2: Calculate the percentage of observed agreement. [INLINE:1]

Step 3: Calculate the percentage of agreement expected by chance alone. Agreement is present in two cells, A and D, which are the two cells in which both raters give the same rating. Here “a” is the expected value for cell A, and “d” is the expected value for cell D. For each cell, the expected value is found by [INLINE:2] That is, [INLINE:3] The same method is followed to calculate d [Table 2].{Table 2} The percentage of agreement expected by chance is [INLINE:4]

Step 4: [INLINE:5]

Step 5 (inference): Landis and Koch [4] suggested that a kappa value above 0.75 represents excellent agreement beyond chance, whereas a value below 0.40 represents poor agreement. A kappa value in the range of 0.40–0.75 represents intermediate to good agreement.
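The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' own calculation: it assumes the standard definition of Cohen's unweighted kappa, (observed − expected) / (100 − expected) with both agreements expressed as percentages, and the function names `cohen_kappa_2x2` and `interpret_kappa` are hypothetical. The cell counts in the usage lines are invented for illustration and are not the data in the article's tables.

```python
def cohen_kappa_2x2(a, b, c, d):
    """Unweighted kappa for a 2x2 table of counts.

    a and d are the cells in which the two raters agree;
    b and c are the cells in which they disagree.
    Returns (observed %, expected %, kappa).
    """
    n = a + b + c + d
    if n == 0:
        raise ValueError("the table contains no observations")

    # Step 1: row and column totals as percentages of the grand total.
    row1, row2 = 100 * (a + b) / n, 100 * (c + d) / n
    col1, col2 = 100 * (a + c) / n, 100 * (b + d) / n

    # Step 2: percentage of observed agreement (the two agreement cells).
    observed = 100 * (a + d) / n

    # Step 3: expected value of each agreement cell under chance alone,
    # i.e. row percentage x column percentage / 100, then their sum.
    expected_a = row1 * col1 / 100
    expected_d = row2 * col2 / 100
    expected = expected_a + expected_d
    if expected == 100:
        raise ValueError("kappa is undefined when chance agreement is 100%")

    # Step 4: agreement beyond chance, scaled by the maximum possible.
    kappa = (observed - expected) / (100 - expected)
    return observed, expected, kappa


def interpret_kappa(kappa):
    """Step 5: apply the cut-offs quoted in the text (0.40 and 0.75)."""
    if kappa > 0.75:
        return "excellent"
    if kappa < 0.40:
        return "poor"
    return "intermediate to good"


# Hypothetical counts, chosen only to exercise the function.
observed, expected, kappa = cohen_kappa_2x2(35, 10, 20, 35)
print(observed, expected, round(kappa, 3), interpret_kappa(kappa))
```

Working in percentages rather than proportions mirrors the stepwise description; dividing every percentage by 100 gives the same kappa.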
Example of Estimation of the Kappa Statistic

Suppose that 100 patients suffering from pancreatic carcinoma underwent contrast-enhanced computed tomography of the abdomen and that two radiologists reviewed the reports [Table 3].{Table 3}

Solution to Problem

Step 1: Calculate the percentages of each row and column out of the grand total of all four cells [Table 4].{Table 4}

Step 2: Calculate the percentage of observed agreement. [INLINE:6]

Step 3: Calculate the percentage of agreement expected by chance alone [Table 5].{Table 5} For each cell, the expected value is found by [INLINE:7] That is, [INLINE:8] Similarly, b = 20.25, c = 30.25, and d = 24.75. The percentage of agreement expected by chance alone is [INLINE:9]

Step 4: [INLINE:10]

Step 5 (inference): The kappa value indicates intermediate to good agreement.

Conclusion

The kappa statistic is a frequently used measure of interobserver reliability, but its manual calculation may cause confusion. The aim of this article is to help health care personnel better understand the purpose of the kappa statistic and how to calculate it.

Acknowledgment

We would like to thank Dr. Reshmi Mishra and Dr. Tushar Subhadarshan Mishra.

Financial support and sponsorship

Nil.

Conflicts of interest

There are no conflicts of interest.

References


