ABSTRACT
Graph clustering is a key technique for understanding the structures present in complex graphs such as the Web and social networks. In the Web and data mining communities, modularity-based graph clustering algorithms have been used successfully in many applications. However, modularity-based methods have difficulty finding the fine-grained clusters hidden in large-scale graphs, and so they fail to reproduce the ground truth. In this paper, we present a novel modularity-based algorithm, CAV-Partitioning, that yields better clustering results than traditional algorithms. The proposed method introduces cohesiveness-aware vector partitioning into graph spectral analysis to improve clustering accuracy. Extensive experiments on public datasets show that CAV-Partitioning outperforms state-of-the-art approaches.
Graph Clustering via Cohesiveness-aware Vector Partitioning




