Abstract
Markov Chain Monte Carlo (MCMC) has been the de facto technique for sampling and inference of large graphs such as online social networks. At the heart of MCMC lies the ability to construct an ergodic Markov chain that attains any given stationary distribution $\boldsymbol{\pi}$, often in the form of random walks or crawling agents on the graph. Most work on MCMC, however, presumes that the graph is undirected or has reciprocal edges, and becomes inapplicable when the graph is directed and non-reciprocal. Here we develop a similar framework for directed graphs, which we call Non-Markovian Monte Carlo (NMMC), by establishing a mapping that converts $\boldsymbol{\pi}$ into the quasi-stationary distribution of a carefully constructed transient Markov chain on an extended state space. As applications, we demonstrate how to achieve any given distribution $\boldsymbol{\pi}$ on a directed graph and how to estimate the eigenvector centrality using a set of non-Markovian, history-dependent random walks on the same graph in a distributed manner. We also provide numerical results on various real-world directed graphs to confirm our theoretical findings, and present several practical enhancements that make our NMMC method ready for use on most directed graphs. To the best of our knowledge, the proposed NMMC framework for directed graphs is the first of its kind, removing the limitations imposed by standard MCMC methods for undirected graphs.
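To make the idea of a history-dependent walk targeting a quasi-stationary distribution concrete, the following is a minimal sketch and not the NMMC algorithm itself. It assumes a classical urn-style redistribution rule: the walker follows a substochastic kernel $Q = A / d_{\max}$ built from the adjacency matrix $A$, and whenever it would be "killed" it restarts from a node drawn from its own visit history. Under these assumptions the visit frequencies approach the left leading eigenvector of $A$, which is one common definition of eigenvector centrality. The toy graph `GRAPH` and the function names `history_dependent_walk` and `left_perron_vector` are hypothetical and chosen only for illustration.

```python
import random

# Hypothetical toy directed graph (adjacency lists of out-neighbors).
# It is strongly connected and aperiodic, so the leading eigenvector is unique.
GRAPH = [
    [1, 2],  # 0 -> 1, 0 -> 2
    [2],     # 1 -> 2
    [0, 3],  # 2 -> 0, 2 -> 3
    [0],     # 3 -> 0
]


def history_dependent_walk(out_neighbors, num_steps, seed=0):
    """Estimate the quasi-stationary distribution of Q = A / d_max.

    Assumed scheme (illustrative only): at node i the walker moves to a
    uniformly random out-neighbor with probability deg(i) / d_max; otherwise
    it is 'killed' and restarts from a node drawn uniformly from its own
    past trajectory, i.e. from its empirical visit distribution.
    """
    rng = random.Random(seed)
    n = len(out_neighbors)
    d_max = max(len(nbrs) for nbrs in out_neighbors)
    current = 0
    history = [current]
    for _ in range(num_steps):
        nbrs = out_neighbors[current]
        if rng.random() < len(nbrs) / d_max:
            current = rng.choice(nbrs)     # ordinary random-walk move
        else:
            current = rng.choice(history)  # history-dependent restart
        history.append(current)
    counts = [0] * n
    for v in history:
        counts[v] += 1
    return [c / len(history) for c in counts]


def left_perron_vector(out_neighbors, iters=200):
    """Reference value: left leading eigenvector of A via power iteration."""
    n = len(out_neighbors)
    x = [1.0 / n] * n
    for _ in range(iters):
        y = [0.0] * n
        for i, nbrs in enumerate(out_neighbors):
            for j in nbrs:
                y[j] += x[i]  # mass flows along each edge i -> j
        total = sum(y)
        x = [v / total for v in y]
    return x


if __name__ == "__main__":
    print("walk estimate :", history_dependent_walk(GRAPH, 200_000, seed=1))
    print("power iteration:", left_perron_vector(GRAPH))
```

The walk only ever follows out-edges of the node it currently occupies and reuses nodes it has already visited, so it needs no knowledge of incoming edges; the power iteration is included purely as a centralized reference for comparison.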
Non-Markovian Monte Carlo on Directed Graphs
SIGMETRICS '19: Abstracts of the 2019 SIGMETRICS/Performance Joint International Conference on Measurement and Modeling of Computer Systems