research-article

Finding the Source in Networks: An Approach Based on Structural Entropy

Published: 27 March 2023

Abstract

The popularity of intelligent devices provides straightforward access to the Internet and online social networks. However, the quick and easy propagation of data over networks also facilitates the spread of risks such as rumors, malware, and computer viruses. To this end, this article studies the problem of source detection: inferring the source node from the aftermath of a cascade, that is, from the observed infected graph G_N of the network at some time. Prior work has adopted various statistical quantities such as degree, distance, or infection size to reflect the structural centrality of the source. In this article, we propose a new metric, which we call the infected tree entropy (ITE), to utilize richer underlying structural features for source detection. The idea of ITE is inspired by the concept of structural entropy [21], which demonstrated that minimizing the average number of bits needed to encode the network structure under different partitions is the principle for detecting the natural or true structures in real-world networks. Accordingly, our proposed ITE-based estimator for the source tries to minimize the coding length of the network partitions induced by the infected trees rooted at each potential source, thus minimizing the structural deviation between the cascades from the potential sources and the actual infection process contained in G_N. On polynomially growing geometric trees, as tree heterogeneity increases, the ITE estimator yields more reliable detection under only moderate infection sizes and returns an asymptotically complete detection. In contrast, for regular expanding trees, we still observe a guaranteed detection probability for the ITE estimator even with an infinite infection size, thanks to the degree regularity property. We also realize the ITE-based detection algorithmically, with linear time complexity via a message-passing scheme, and further extend it to general graphs. Extensive experiments on synthetic and real datasets confirm the superiority of ITE over the baselines. For example, ITE achieves an accuracy of 85% in ranking the source among the top 10%, far exceeding the 55% achieved by the classic algorithm on scale-free networks.
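To make the notion of "bits needed to encode a network under a partition" concrete, the following is a minimal sketch of the two-dimensional structural entropy of an undirected, unweighted graph, in the form commonly attributed to [21]. It is not the ITE estimator itself, whose definition is not given in this abstract; the function name `structural_entropy_2d` and the toy graph are assumptions made purely for illustration.

```python
from collections import defaultdict
from math import log2


def structural_entropy_2d(edges, partition):
    """Two-dimensional structural entropy of an undirected graph.

    edges: iterable of (u, v) pairs (no self-loops, unweighted).
    partition: dict mapping every node to a module label.
    Hypothetical helper for illustration; not the paper's ITE.
    """
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    two_m = sum(degree.values())  # total volume, i.e. 2 * |E|

    volume = defaultdict(int)  # sum of degrees inside each module
    cut = defaultdict(int)     # edges with exactly one endpoint in the module
    for node, d in degree.items():
        volume[partition[node]] += d
    for u, v in edges:
        if partition[u] != partition[v]:
            cut[partition[u]] += 1
            cut[partition[v]] += 1

    h = 0.0
    # Bits to locate a node once its module is known.
    for node, d in degree.items():
        h -= (d / two_m) * log2(d / volume[partition[node]])
    # Bits to name the module, paid only when a random walk enters it.
    for module, vol in volume.items():
        h -= (cut[module] / two_m) * log2(vol / two_m)
    return h


# Toy graph (an assumption): two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
natural = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
scrambled = {0: "A", 3: "A", 4: "A", 1: "B", 2: "B", 5: "B"}

print(structural_entropy_2d(edges, natural))
print(structural_entropy_2d(edges, scrambled))
```

On this toy graph the partition that follows the two triangles yields a lower value than the scrambled one, which is the sense in which entropy minimization favors the "natural" structure. An ITE-style estimator would presumably compare such coding costs across the partitions induced by candidate source trees, but that step is not sketched here.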

REFERENCES

  [1] Agaskar Ameya and Lu Yue M. 2013. A fast Monte Carlo algorithm for source localization on graphs. In Proceedings of the Wavelets and Sparsity XV, Vol. 8858, 429–434.
  [2] Anand Kartik and Bianconi Ginestra. 2009. Entropy measures for networks: Toward an information theory of complex topologies. Physical Review E 80, 4 (2009), 045102.
  [3] Barabási Albert-László and Albert Réka. 1999. Emergence of scaling in random networks. Science 286, 5439 (1999), 509–512.
  [4] Bianconi Ginestra. 2009. Entropy of network ensembles. Physical Review E 79, 3 (2009), 036114.
  [5] Bonchev D. and Trinajstić N. 1977. Information theory, distance matrix, and molecular branching. The Journal of Chemical Physics 67, 10 (1977), 4517–4533.
  [6] Braunstein Samuel L., Ghosh Sibasish, and Severini Simone. 2006. The Laplacian of a graph as a density matrix: A basic combinatorial approach to separability of mixed states. Annals of Combinatorics 10, 3 (2006), 291–317.
  [7] Chai Yun, Wang Youguo, and Zhu Liang. 2021. Information sources estimation in time-varying networks. IEEE Transactions on Information Forensics and Security 16 (2021), 2621–2636.
  [8] Choi Jaeyoung, Moon Sangwoo, Woo Jiin, Son Kyunghwan, Shin Jinwoo, and Yi Yung. 2017. Rumor source detection under querying with untruthful answers. In Proceedings of the IEEE INFOCOM. 1–9.
  [9] Choi Jaeyoung and Yi Yung. 2018. Necessary and sufficient budgets in information source finding with querying: Adaptivity gap. In Proceedings of the IEEE ISIT. 2261–2265.
  [10] Choi Yongwook and Szpankowski Wojciech. 2009. Compression of graphical structures. In Proceedings of the IEEE ISIT. 364–368.
  [11] Comin Cesar Henrique and Costa Luciano da Fontoura. 2011. Identifying the starting point of a spreading process in complex networks. Physical Review E 84, 5 (2011), 056105.
  [12] Dehmer Matthias. 2008. Information processing in complex networks: Graph entropy and information functionals. Applied Mathematics and Computation 201, 1-2 (2008), 82–94.
  [13] Dong Wenxiang, Zhang Wenyi, and Tan Chee Wei. 2013. Rooting out the rumor culprit from suspects. In Proceedings of the IEEE ISIT. 2671–2675.
  [14] Fioriti Vincenzo and Chinnici Marta. 2012. Predicting the sources of an outbreak with a spectral technique. arXiv:1211.2333.
  [15] Goldenberg Jacob, Libai Barak, and Muller Eitan. 2001. Talk of the network: A complex systems look at the underlying process of word-of-mouth. Marketing Letters 12, 3 (2001), 211–223.
  [16] Jiang Jiaojiao, Wen Sheng, Yu Shui, Xiang Yang, and Zhou Wanlei. 2015. K-center: An approach on the multi-source identification of information diffusion. IEEE Transactions on Information Forensics and Security 10, 12 (2015), 2616–2626.
  [17] Karamchandani Nikhil and Franceschetti Massimo. 2013. Rumor source detection under probabilistic sampling. In Proceedings of the IEEE ISIT. 2184–2188.
  [18] Kunegis Jérôme. 2013. KONECT: The Koblenz network collection. In Proceedings of the WWW. 1343–1350.
  [19] Lappas Theodoros, Terzi Evimaria, Gunopulos Dimitrios, and Mannila Heikki. 2010. Finding effectors in social networks. In Proceedings of the ACM SIGKDD. 1059–1068.
  [20] Leskovec Jure and Krevl Andrej. 2014. SNAP Datasets: Stanford Large Network Dataset Collection. (June 2014). Retrieved May 10, 2021 from http://snap.stanford.edu/data.
  [21] Li Angsheng and Pan Yicheng. 2016. Structural information and dynamical complexity of networks. IEEE Transactions on Information Theory 62, 6 (2016), 3290–3339.
  [22] Liu Yiwei, Liu Jiamou, Zhang Zijian, Zhu Liehuang, and Li Angsheng. 2019. REM: From structural entropy to community structure deception. In Advances in Neural Information Processing Systems. 12938–12948.
  [23] Lokhov Andrey Y., Mézard Marc, Ohta Hiroki, and Zdeborová Lenka. 2014. Inferring the origin of an epidemic with a dynamic message-passing algorithm. Physical Review E 90, 1 (2014), 012801.
  [24] Luo Wuqiong and Tay Wee Peng. 2013. Estimating infection sources in a network with incomplete observations. In Proceedings of the IEEE GlobalSIP. 301–304.
  [25] Luo Wuqiong and Tay Wee Peng. 2013. Finding an infection source under the SIS model. In Proceedings of the IEEE ICASSP. 2930–2934.
  [26] Luo Wuqiong, Tay Wee Peng, and Leng Mei. 2013. Identifying infection sources and regions in large networks. IEEE Transactions on Signal Processing 61, 11 (2013), 2850–2865.
  [27] Pinto Pedro C., Thiran Patrick, and Vetterli Martin. 2012. Locating the source of diffusion in large-scale networks. Physical Review Letters 109, 6 (2012), 068702.
  [28] Prakash B. Aditya, Vreeken Jilles, and Faloutsos Christos. 2012. Spotting culprits in epidemics: How many and which ones? In Proceedings of the ICDM. 11–20.
  [29] Rashevsky Nicolas. 1955. Life, information theory, and topology. Bulletin of Mathematical Biophysics 17, 3 (1955), 229–235.
  [30] Raychaudhury C., Ray S. K., Ghosh J. J., Roy A. B., and Basak S. C. 1984. Discrimination of isomeric structures using information theoretic topological indices. Journal of Computational Chemistry 5, 6 (1984), 581–588.
  [31] Rosvall Martin and Bergstrom Carl T. 2008. Maps of random walks on complex networks reveal community structure. Proceedings of the National Academy of Sciences of the United States of America 105, 4 (2008), 1118–1123.
  [32] Shah Devavrat and Zaman Tauhid. 2011. Rumors in a network: Who’s the culprit? IEEE Transactions on Information Theory 57, 8 (2011), 5163–5181.
  [33] Shah Devavrat and Zaman Tauhid. 2016. Finding rumor sources on random trees. Operations Research 64, 3 (2016), 736–755.
  [34] Shannon Claude E. 1948. A mathematical theory of communication. The Bell System Technical Journal 27, 3 (1948), 379–423.
  [35] Tang Wenchang, Ji Feng, and Tay Wee Peng. 2018. Estimating infection sources in networks using partial timestamps. IEEE Transactions on Information Forensics and Security 13, 12 (2018), 3035–3049.
  [36] Wang Zhaoxu, Dong Wenxiang, Zhang Wenyi, and Tan Chee Wei. 2014. Rumor source detection with multiple observations: Fundamental limits and algorithms. In Proceedings of the ACM SIGMETRICS.
  [37] Watts Duncan J. and Strogatz Steven H. 1998. Collective dynamics of “small-world” networks. Nature 393, 6684 (1998), 440–442.
  [38] Yu Pei-Duo, Tan Chee Wei, and Fu Hung-Lin. 2018. Rumor source detection in finite graphs with boundary effects by message-passing algorithms. In Proceedings of the IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining. 175–192.
  [39] Zhu Kai, Chen Zhen, and Ying Lei. 2016. Locating the contagion source in networks with partial timestamps. Data Mining and Knowledge Discovery 30, 5 (2016), 1217–1248.
  [40] Zhu Kai and Ying Lei. 2014. Information source detection in the SIR model: A sample-path-based approach. IEEE/ACM Transactions on Networking 24, 1 (2014), 408–421.
  [41] Zhu Kai and Ying Lei. 2015. Source localization in networks: Trees and beyond. arXiv:1510.01814.

Published in

ACM Transactions on Internet Technology, Volume 23, Issue 1
February 2023
564 pages
ISSN: 1533-5399
EISSN: 1557-6051
DOI: 10.1145/3584863
Editor: Ling Liu

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

  • Published: 27 March 2023
  • Online AM: 6 February 2023
  • Accepted: 9 October 2022
  • Revised: 16 August 2022
  • Received: 12 September 2021
