ABSTRACT
This paper views clusterings as elements of a lattice. Distances between clusterings are analyzed in their relationship to the lattice. From this vantage point, we first give an axiomatic characterization of some criteria for comparing clusterings, including the variation of information and the unadjusted Rand index. Then we study other distances between partitions w.r.t these axioms and prove an impossibility result: there is no "sensible" criterion for comparing clusterings that is simultaneously (1) aligned with the lattice of partitions, (2) convexely additive, and (3) bounded.
References
- Ben-Hur, A., Elisseeff, A., & Guyon, I. (2002). A stability based method for discovering structure in clustered data. Pacific Symposium on Biocomputing (pp. 6--17).]]Google Scholar
- Cover, T. M., & Thomas, J. A. (1991). Elements of information theory. Wiley.]] Google Scholar
Digital Library
- Fowlkes, E. B., & Mallows, C. L. (1983). A method for comparing two hierarchical clusterings. Journal of the American Statistical Association, 78, 553--569.]]Google Scholar
Cross Ref
- Golumbic, M. (1980). Algorithmic graph theory and perfect graphs. Academic Press, New York.]] Google Scholar
Digital Library
- Hubert, L., & Arabie, P. (1985). Comparing partitions. Journal of Classification, 2, 193--218.]]Google Scholar
Cross Ref
- Kleinberg, J. (2002). An impossibility theorem for clustering. Advances in Neural Information Processing Systems. Cambridge, MA: MIT Press.]]Google Scholar
- Meilǎ, M. (2003). Comparing clusterings by the variation of information. Proceedings of the Sixteenth Annual Conference ofn Computational Learning Theory (COLT). Springer.]]Google Scholar
Cross Ref
- Mirkin, B. (1996). Mathematical classification and clustering. Kluwer Academic Press.]]Google Scholar
- Rand, W. M. (1971). Objective criteria for the evaluation of clustering methods. Journal of the American Statistical Association, 66, 846--850.]]Google Scholar
Cross Ref
- Rènyi, A. (1970). Probability theory. North-Holland.]]Google Scholar
- Stanley, R. P. (1997). Enumerative combinatorics. Cambridge University Press.]] Google Scholar
Digital Library
- van Dongen, S. (2000). Performance criteria for graph clustering and Markov cluster experiments (Technical Report INS-R0012). Centrum voor Wiskunde en Informatica.]] Google Scholar
Digital Library
- Wallace, D. L. (1983). Comment. Journal of the American Statistical Association, 78, 569--576.]]Google Scholar
Index Terms
(auto-classified)Comparing clusterings



Comments