Towards Anomalous Diffusion Sources Detection in a Large Network

Published: 18 January 2016

Abstract

Witnessing the wide spread of malicious information in large networks, we develop an efficient method to detect anomalous diffusion sources and thus protect networks from security and privacy attacks. To date, most existing work on diffusion source detection assumes that network snapshots reflecting information diffusion can be obtained continuously. However, obtaining snapshots of an entire network requires deploying detectors on all network nodes and is therefore very expensive. In this article, we instead study the problem of locating diffusion sources by learning from information diffusion data collected from only a small subset of network nodes. Specifically, we present a new regression learning model that detects anomalous diffusion sources by jointly addressing five challenges: an unknown number of source nodes, few activated detectors, an unknown initial propagation time, uncertain propagation paths, and uncertain propagation time delays. We theoretically analyze the strength of the model and derive performance bounds. We empirically test and compare the model on both synthetic and real-world networks to demonstrate its performance.
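To make the setting concrete, the following is a minimal sketch of locating a source from a few detector timestamps. It is not the authors' regression model: it assumes a single source, uses hop counts as a stand-in for the uncertain propagation path, and fits one line t = t0 + delay * d per candidate to absorb the unknown start time and per-hop delay. All function names, the example graph, and the observation values are hypothetical.

```python
from collections import deque


def hop_distances(adj, start):
    """Breadth-first search: hop count from `start` to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def fit_line(ds, ts):
    """Least-squares fit of t = t0 + delay * d; returns (sse, t0, delay)."""
    n = len(ds)
    mean_d = sum(ds) / n
    mean_t = sum(ts) / n
    var_d = sum((d - mean_d) ** 2 for d in ds)
    if var_d == 0:
        delay = 0.0  # all detectors equidistant: the slope is not identifiable
    else:
        cov = sum((d - mean_d) * (t - mean_t) for d, t in zip(ds, ts))
        delay = cov / var_d
    t0 = mean_t - delay * mean_d
    sse = sum((t - (t0 + delay * d)) ** 2 for d, t in zip(ds, ts))
    return sse, t0, delay


def locate_source(adj, observations):
    """Return (sse, node, t0, delay) for the best-fitting candidate source.

    `observations` maps each activated detector to its first arrival time.
    """
    best = None
    for candidate in adj:
        dist = hop_distances(adj, candidate)
        if any(node not in dist for node in observations):
            continue  # candidate cannot reach every activated detector
        ds = [dist[node] for node in observations]
        ts = [observations[node] for node in observations]
        sse, t0, delay = fit_line(ds, ts)
        if delay < 0:
            continue  # farther detectors should not be reached earlier
        if best is None or sse < best[0]:
            best = (sse, candidate, t0, delay)
    return best


# Hypothetical example: a 7-node path 0-1-...-6 with the true source at node 2,
# start time 5 and a delay of 2 per hop; only nodes 0, 4 and 6 carry detectors.
adj = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 6] for i in range(7)}
observations = {0: 9.0, 4: 9.0, 6: 13.0}
result = locate_source(adj, observations)
print("estimated source:", result[1])
```

With only three detectors the fit is exact for the true source in this toy case; on real data with noisy delays and several sources, a residual-based ranking like this one would only be a rough baseline.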

    Reviews

    Andrew Kalafut

    The identification of diffusion sources in a network is an interesting and important problem. A solution to this problem may be used, for example, for solving security problems such as finding the source of false or malicious information being spread through a network. When designing a solution for a problem such as this one that has real-world applications, it is important to keep in mind that solutions should be realistic. This is one area where much of the previous work on identification of diffusion sources falls short. For example, much of the previous work assumes that snapshots of the entire network are available. In this paper, the authors propose five technical challenges that reflect common real-world conditions. They then propose a learning model that accurately detects diffusion sources under these conditions. Both a theoretical analysis of the model and experimental validation are provided. Four of the datasets used for the experiments are synthetic, but one is based on a crawl of a real social network, further emphasizing the real-world applicability of these results. The results show that the proposed model outperforms benchmark models (meaning the predicted diffusion sources are closer to the actual diffusion sources) on a variety of datasets and conditions.

    Online Computing Reviews Service

    • Published in

      ACM Transactions on Internet Technology  Volume 16, Issue 1
      February 2016
      129 pages
      ISSN:1533-5399
      EISSN:1557-6051
      DOI:10.1145/2869768
      • Editor: Munindar P. Singh

      Copyright © 2016 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 18 January 2016
      • Accepted: 1 July 2015
      • Revised: 1 June 2015
      • Received: 1 November 2014

      Qualifiers

      • research-article
      • Research
      • Refereed
