At least a dozen indexes have been proposed for measuring agreement between two judges on a categorical scale. Using the binary (positive-negative) case as a model, this paper presents and critically evaluates some of these proposed measures. The importance of correcting for chance-expected agreement is emphasized, and identities with intraclass correlation coefficients are pointed out.
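To illustrate the idea of chance correction discussed above (this sketch is not from the paper itself), the best-known chance-corrected index for a 2x2 agreement table is Cohen's kappa: the observed proportion of agreement is reduced by the proportion expected by chance from the judges' marginal rates, then rescaled. The `cohen_kappa` function below is an assumed illustrative implementation, not code from the source.

```python
def cohen_kappa(table):
    """Chance-corrected agreement (Cohen's kappa) for a square
    contingency table, e.g. the binary positive-negative case.

    table[i][j] = number of subjects placed in category i by
    judge 1 and category j by judge 2.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    # Observed agreement: proportion of subjects on the diagonal.
    p_o = sum(table[i][i] for i in range(k)) / n
    # Chance-expected agreement from the two judges' marginal totals.
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    p_e = sum(row_totals[i] * col_totals[i] for i in range(k)) / n**2
    # Rescale so 0 = chance-level agreement, 1 = perfect agreement.
    return (p_o - p_e) / (1 - p_e)


# Hypothetical binary example: 20 agreements on positive,
# 15 on negative, 15 disagreements among 50 subjects.
kappa = cohen_kappa([[20, 5], [10, 15]])
```

For this table the observed agreement is 0.70 while chance alone would produce 0.50, so the raw agreement rate overstates how well the judges concur; kappa reports the excess over chance, 0.40.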