ABSTRACT
Graph-based clustering algorithms are particularly well suited to data that do not come from a Gaussian or spherical distribution. They can detect clusters of any size and shape without requiring the number of clusters to be specified in advance; moreover, they can be applied profitably to cluster detection problems.
In this paper, we present a detailed performance evaluation of four graph-based clustering approaches. Three of the algorithms are taken from the literature. Although they do not require the number of clusters to be set, they still need some parameters to be provided by the user. As the fourth algorithm, we therefore propose an approach that overcomes this limitation and proves to be an effective solution in real applications where a completely unsupervised method is desirable.
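As a rough illustration of the general idea, the sketch below clusters points by building a Euclidean minimum spanning tree and cutting its unusually long edges. It is not one of the four algorithms evaluated in the paper; the function name `mst_clusters` and the `cut_factor` parameter are assumptions made for this example. The number of clusters is never supplied, but a user-chosen threshold still is, which is the kind of limitation discussed above.

```python
import math
from itertools import combinations


def mst_clusters(points, cut_factor=2.0):
    """Label points by cutting long edges of the Euclidean minimum spanning tree.

    cut_factor is a hypothetical user-supplied parameter: an MST edge is
    removed when it is longer than cut_factor times the mean MST edge length.
    """
    n = len(points)
    if n < 2:
        return [0] * n

    # Edges of the complete graph, sorted by length (input to Kruskal's algorithm).
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )

    parent = list(range(n))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Kruskal: keep an edge only if it joins two different components.
    mst = []
    for w, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            mst.append((w, i, j))

    mean_w = sum(w for w, _, _ in mst) / len(mst)

    # Rebuild the components using only the MST edges that survive the cut.
    parent = list(range(n))
    for w, i, j in mst:
        if w <= cut_factor * mean_w:
            parent[find(i)] = find(j)

    # Map each component root to a consecutive cluster label.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels = mst_clusters(points)
```

With the six sample points, the single long edge joining the two groups exceeds the threshold and is cut, so the two groups receive different labels. A different `cut_factor` can change the result, so the method is parameter-dependent rather than fully unsupervised.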