ABSTRACT
We introduce a set of techniques for efficiently generating many independent random walks in the Massively Parallel Computation (MPC) model with space per machine strongly sublinear in the number of vertices. In this space-per-machine regime, many natural approaches to graph problems struggle to overcome the Θ(log n) MPC round complexity barrier, where n is the number of vertices. Our techniques overcome this barrier for PageRank, one of the most important applications of random walks, even in the more challenging setting of directed graphs, as well as for approximate bipartiteness and expansion testing.
In the undirected case, we start our random walks from the stationary distribution, which implies that we approximately know the empirical distribution of their next steps. This allows us to prepare continuations of random walks in advance and to apply a doubling approach. As a result, we can generate multiple random walks of length l in Θ(log l) rounds in the MPC model. Moreover, we show that under the popular 1-vs.-2-Cycles conjecture, this round complexity is asymptotically tight.
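The doubling idea can be illustrated with a small sequential sketch. It is not the MPC algorithm of the paper: the function name `double_walks`, the parameter `per_degree`, and the rule of dropping a walk when no continuation is left are assumptions made for illustration, and the sketch ignores the accounting needed to guarantee independence. Each vertex prepares a pool of walk segments whose size is proportional to its degree (mirroring the stationary distribution of an undirected graph), and each round stitches a segment to an unused prepared segment that starts where it ends, doubling the length.

```python
import random

def double_walks(adj, rounds, per_degree, seed=0):
    """Grow walks of 2**rounds edges by repeatedly stitching prepared segments.

    adj: undirected graph as {vertex: [neighbors]} (hypothetical input format).
    per_degree: number of initial one-edge segments prepared per incident edge.
    """
    rng = random.Random(seed)
    # pools[v] holds walk segments (lists of vertices) that start at v.
    pools = {
        v: [[v, rng.choice(adj[v])] for _ in range(per_degree * len(adj[v]))]
        for v in adj
    }
    for _ in range(rounds):
        heads, tails = {}, {}
        for v, segs in pools.items():
            rng.shuffle(segs)
            half = len(segs) // 2
            heads[v], tails[v] = segs[:half], segs[half:]
        new_pools = {v: [] for v in adj}
        for v in adj:
            for head in heads[v]:
                end = head[-1]
                if tails[end]:
                    # Use a prepared continuation at most once.
                    tail = tails[end].pop()
                    new_pools[v].append(head + tail[1:])
                # If no continuation is left, this sketch simply drops the head.
        pools = new_pools
    return pools
```

After r rounds every surviving walk has 2^r edges, so walks of length l need about log₂ l stitching rounds, which is the intuition behind the Θ(log l) bound.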
For directed graphs, our approach stems from our treatment of the PageRank Markov chain. We first compute the PageRank for the undirected version of the input graph and then slowly transition towards the directed case, considering convex combinations of the transition matrices in the process.
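The transition from the undirected to the directed chain can be pictured with the following sketch. It swaps in plain power iteration with a warm start as a stand-in for the paper's walk-based MPC procedure, so it only illustrates the interpolation between transition matrices. The helper names, the number of interpolation `steps`, and the assumption that every vertex has at least one out-edge are all hypothetical.

```python
def transition(adj):
    """Row-stochastic matrix {u: {v: prob}}; assumes every vertex has an out-edge."""
    return {u: {v: nbrs.count(v) / len(nbrs) for v in set(nbrs)}
            for u, nbrs in adj.items()}

def mix(p_undirected, p_directed, t):
    """Convex combination (1 - t) * P_undirected + t * P_directed."""
    mixed = {}
    for u in p_undirected:
        row = {}
        for v, p in p_undirected[u].items():
            row[v] = row.get(v, 0.0) + (1 - t) * p
        for v, p in p_directed[u].items():
            row[v] = row.get(v, 0.0) + t * p
        mixed[u] = row
    return mixed

def pagerank(p, eps, start, iters=200):
    """Power iteration for damping factor 1 - eps, started from `start`."""
    n = len(p)
    x = dict(start)
    for _ in range(iters):
        nxt = {v: eps / n for v in p}          # random jump with probability eps
        for u, row in p.items():
            for v, prob in row.items():
                nxt[v] += (1 - eps) * x[u] * prob
        x = nxt
    return x

def pagerank_by_interpolation(directed, eps, steps=4):
    """Move from the undirected chain (t = 0) to the directed one (t = 1)."""
    undirected = {u: list(nbrs) for u, nbrs in directed.items()}
    for u, nbrs in directed.items():
        for v in nbrs:
            undirected[v].append(u)            # antiparallel edges become multi-edges
    p_u, p_d = transition(undirected), transition(directed)
    x = {v: 1 / len(directed) for v in directed}
    for k in range(steps + 1):
        # Warm-start each stage from the previous stage's PageRank vector.
        x = pagerank(mix(p_u, p_d, k / steps), eps, x)
    return x
```

For example, `pagerank_by_interpolation({0: [1, 2], 1: [2], 2: [0]}, eps=0.15)` ends at the PageRank vector of the directed graph, since the last stage uses t = 1.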
For PageRank with damping factor 1 − ε, we achieve the following round complexities:
- O(log log n + log 1/ε) rounds for undirected graphs (with Õ(m / ε²) total space),
- Õ(log² log n + log² 1/ε) rounds for directed graphs (with Õ((m + n^(1+o(1))) / poly(ε)) total space).
The round complexity of our PageRank algorithm depends only logarithmically on 1/ε. We use this to show that the algorithm can construct directed length-l random walks in O(log² log n + log² l) rounds with Õ((m + n^(1+o(1))) · poly(l)) total space. More specifically, for ε = Θ(1 / l), a length-l PageRank walk contains no random jump with constant probability, and hence is a directed random walk.
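The constant-probability claim can be checked with a short calculation, assuming the usual reading of the damping factor: at every step the walk independently makes a random jump with probability ε and follows an edge otherwise. The specific choice ε = 1/l below is one illustrative instance of ε = Θ(1/l).

```latex
\Pr[\text{no jump in } l \text{ steps}] = (1-\varepsilon)^{l},
\qquad
\varepsilon = \frac{1}{l},\ l \ge 2
\;\Longrightarrow\;
\left(1-\frac{1}{l}\right)^{l} \ge \frac{1}{4},
\qquad
\lim_{l\to\infty}\left(1-\frac{1}{l}\right)^{l} = \frac{1}{e}.
```

So a constant fraction of the generated PageRank walks contain no jump, and each of these is a genuine length-l random walk in the directed graph.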
Index Terms
Walking randomly, massively, and efficiently