research-article

Sampling Graphlets of Multiplex Networks: A Restricted Random Walk Approach

Published: 14 June 2021

Abstract

Graphlets are induced subgraph patterns that are crucial to understanding the structure and function of a large network. Considerable effort has been devoted to calculating graphlet statistics, where random walk-based approaches are commonly used to access restricted graphs through the available application programming interfaces (APIs). However, most of these approaches consider only individual networks and overlook the strong coupling between different networks. In this article, we estimate the graphlet concentration in multiplex networks with real-world applications. An inter-layer edge connects two nodes in different layers if they actually belong to the same node. Access to a multiplex network is restricted in the sense that the upper layer allows random walk sampling, whereas the nodes of lower layers can be accessed only through the inter-layer edges and support only random node or edge sampling. To cope with this new challenge, we define a suite of two-layer graphlets and propose novel random walk sampling algorithms to estimate the proportion of all three-node graphlets. An analytical bound on the number of sampling steps is proved to guarantee the convergence of our unbiased estimator. We further generalize our algorithm to explore the tradeoff between the estimation accuracy of different graphlets when the sample budget is split across different layers. Experimental evaluation on real-world and synthetic multiplex networks demonstrates the accuracy and high efficiency of our unbiased estimators.
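To make the idea of estimating a graphlet concentration by random walk concrete, here is a minimal single-layer sketch. It is not the two-layer algorithm of this article, whose details the abstract does not give. It assumes an undirected, connected graph held in memory as a dictionary of neighbor sets, uses a plain simple random walk with no burn-in, and reweights each visited node triple by the degree of its middle node, a standard correction for simple random walks. The function names and the toy graph are hypothetical and chosen only for illustration.

```python
import random
from itertools import combinations

# Hypothetical toy graph (undirected, connected): two triangles joined by two edges.
TOY_ADJ = {
    0: {1, 2},
    1: {0, 2, 4},
    2: {0, 1, 3},
    3: {2, 4, 5},
    4: {1, 3, 5},
    5: {3, 4},
}


def estimate_triangle_concentration(adj, steps, seed=0):
    """Estimate the share of triangles among connected three-node induced subgraphs.

    Assumes every node has at least one neighbor and the graph is connected.
    """
    rng = random.Random(seed)
    nbrs = {v: sorted(ns) for v, ns in adj.items()}  # lists, so rng.choice works
    u = rng.choice(sorted(nbrs))
    v = rng.choice(nbrs[u])
    tri_weight = 0.0
    wedge_weight = 0.0
    for _ in range(steps):
        w = rng.choice(nbrs[v])
        if w != u:  # u - v - w are three distinct nodes
            # In the stationary regime the walk traverses (u, v, w) with
            # probability proportional to 1 / deg(v); multiplying by deg(v)
            # removes that bias.
            if w in adj[u]:
                tri_weight += len(nbrs[v]) / 6.0    # a triangle yields 6 ordered walks
            else:
                wedge_weight += len(nbrs[v]) / 2.0  # an open wedge yields 2 ordered walks
        u, v = v, w
    total = tri_weight + wedge_weight
    return tri_weight / total if total else float("nan")


def exact_triangle_concentration(adj):
    """Brute-force reference value; only practical for small graphs."""
    tri = wedge = 0
    for a, b, c in combinations(sorted(adj), 3):
        edges = (b in adj[a]) + (c in adj[a]) + (c in adj[b])
        if edges == 3:
            tri += 1
        elif edges == 2:
            wedge += 1
    return tri / (tri + wedge)


if __name__ == "__main__":
    print("exact   :", exact_triangle_concentration(TOY_ADJ))
    print("estimate:", estimate_triangle_concentration(TOY_ADJ, steps=200_000, seed=1))
```

On the toy graph there are 2 triangles and 8 open wedges, so the exact concentration is 0.2, and the random walk estimate should approach that value as the number of steps grows. The restricted multiplex setting described in the abstract adds constraints this sketch ignores, in particular that lower layers are reachable only through inter-layer edges and support only random node or edge sampling.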


Published in

ACM Transactions on the Web, Volume 15, Issue 4
November 2021
152 pages
ISSN: 1559-1131
EISSN: 1559-114X
DOI: 10.1145/3465465

Copyright © 2021 Association for Computing Machinery.

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

• Published: 14 June 2021
• Accepted: 1 March 2021
• Revised: 1 December 2020
• Received: 1 May 2020

Qualifiers

• research-article
• Refereed
