
Non-Markovian Monte Carlo on Directed Graphs

Published: 26 March 2019

Abstract

Markov Chain Monte Carlo (MCMC) has been the de facto technique for sampling and inference on large graphs such as online social networks. At the heart of MCMC lies the ability to construct an ergodic Markov chain that attains any given stationary distribution $\boldsymbol{\pi}$, often in the form of random walks or crawling agents on the graph. Most work on MCMC, however, presumes that the graph is undirected or has reciprocal edges, and becomes inapplicable when the graph is directed and non-reciprocal. Here we develop a similar framework for directed graphs, which we call Non-Markovian Monte Carlo (NMMC), by establishing a mapping that converts $\boldsymbol{\pi}$ into the quasi-stationary distribution of a carefully constructed transient Markov chain on an extended state space. As applications, we demonstrate how to achieve any given distribution $\boldsymbol{\pi}$ on a directed graph and how to estimate the eigenvector centrality using a set of non-Markovian, history-dependent random walks on the same graph in a distributed manner. We also provide numerical results on various real-world directed graphs to confirm our theoretical findings, and present several practical enhancements that make our NMMC method ready for use on most directed graphs. To the best of our knowledge, the proposed NMMC framework for directed graphs is the first of its kind, removing the limitations imposed by standard MCMC methods for undirected graphs.
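
As a rough illustration of the term "quasi-stationary distribution" used above, and not of the NMMC construction itself, the sketch below computes the quasi-stationary distribution of a small transient chain by normalized power iteration. The three-node directed graph, the substochastic matrix `Q = A / d_max`, and the function name `quasi_stationary` are all assumptions made for this example. For this particular choice of `Q`, the resulting vector is the normalized left leading eigenvector of the adjacency matrix `A`, which is one common definition of eigenvector centrality.

```python
# Illustrative sketch only: a toy directed graph and a hypothetical helper,
# not the algorithm from the paper.

def quasi_stationary(Q, iters=2000, tol=1e-12):
    """Normalized power iteration for a substochastic matrix Q.

    Returns (nu, lam) with nu >= 0, sum(nu) == 1 and nu Q ~= lam * nu,
    assuming Q is irreducible and aperiodic so the iteration converges.
    """
    n = len(Q)
    nu = [1.0 / n] * n
    lam = 0.0
    for _ in range(iters):
        # One step of the killed chain started from nu.
        nxt = [sum(nu[i] * Q[i][j] for i in range(n)) for j in range(n)]
        lam = sum(nxt)                # one-step survival probability under nu
        nxt = [x / lam for x in nxt]  # condition on not being killed
        converged = max(abs(a - b) for a, b in zip(nu, nxt)) < tol
        nu = nxt
        if converged:
            break
    return nu, lam


# Assumed toy graph with edges 0->1, 0->2, 1->2, 2->0 (strongly connected).
A = [
    [0, 1, 1],
    [0, 0, 1],
    [1, 0, 0],
]
d_max = max(sum(row) for row in A)            # largest out-degree
Q = [[a / d_max for a in row] for row in A]   # row sums <= 1: transient chain

nu, lam = quasi_stationary(Q)
print("quasi-stationary distribution:", nu)
print("survival rate per step:", lam)
```

Rows of `Q` that sum to less than one correspond to states from which the walk can be killed, which is what makes the chain transient. A simulation-based estimator would replace the exact matrix-vector product with random walks, but that is beyond this sketch.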

