DOI: 10.1145/3357713.3384303
research-article
Open Access

Walking randomly, massively, and efficiently

Published: 22 June 2020

ABSTRACT

We introduce a set of techniques that allow for efficiently generating many independent random walks in the Massively Parallel Computation (MPC) model with space per machine strongly sublinear in the number of vertices. In this space-per-machine regime, many natural approaches to graph problems struggle to overcome the Θ(log n) MPC round complexity barrier, where n is the number of vertices. Our techniques overcome this barrier for PageRank, one of the most important applications of random walks, even in the more challenging setting of directed graphs, as well as for approximate bipartiteness and expansion testing.

In the undirected case, we start our random walks from the stationary distribution, which implies that we approximately know the empirical distribution of their next steps. This allows for preparing continuations of random walks in advance and applying a doubling approach. As a result we can generate multiple random walks of length l in Θ(log l) rounds on MPC. Moreover, we show that under the popular 1-vs.-2-Cycles conjecture, this round complexity is asymptotically tight.
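The doubling idea can be illustrated with a small sequential simulation. This is only a sketch under simplifying assumptions, not the MPC algorithm of the paper: the function name `doubled_walks`, the parameter `copies`, and the rule of discarding a prefix when no unused continuation is available are all hypothetical choices made for illustration, and `length` is assumed to be a power of two. Starting a number of walks proportional to each vertex's degree mimics starting from the stationary distribution of an undirected graph, so the number of walks ending at a vertex roughly matches the number of prepared continuations starting there.

```python
import random
from collections import defaultdict

def doubled_walks(adj, length, copies=8, seed=0):
    """Toy doubling: stitch walks of k edges into walks of 2k edges.

    adj    -- dict mapping each vertex to a non-empty list of neighbours
    length -- target number of edges per walk (assumed to be a power of two)
    copies -- hypothetical oversampling factor; copies * deg(v) walks start at v
    Returns (walks, rounds), where rounds is the number of doubling steps.
    """
    rng = random.Random(seed)
    # One-edge walks, started proportionally to degree (the stationary
    # distribution of a random walk on an undirected graph).
    walks = [[v, rng.choice(adj[v])]
             for v in adj for _ in range(copies * len(adj[v]))]
    steps, rounds = 1, 0
    while steps < length:
        rng.shuffle(walks)
        mid = len(walks) // 2
        prefixes, suffixes = walks[:mid], walks[mid:]
        # Index the prepared continuations by their starting vertex.
        by_start = defaultdict(list)
        for w in suffixes:
            by_start[w[0]].append(w)
        walks = []
        for p in prefixes:
            pool = by_start[p[-1]]
            if pool:
                # Each continuation is used at most once, so no randomness
                # is shared between two stitched walks.
                walks.append(p + pool.pop()[1:])
        steps *= 2
        rounds += 1
    return walks, rounds

# A 4-cycle: walks of 8 edges are produced after log2(8) = 3 doubling steps.
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
walks, rounds = doubled_walks(cycle, 8)
print(rounds, len(walks))
```

Each doubling step roughly halves the number of surviving walks in this toy version, which is why it oversamples at the start. The sketch only shows why the number of stitching steps is logarithmic in the walk length.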

For directed graphs, our approach stems from our treatment of the PageRank Markov chain. We first compute the PageRank for the undirected version of the input graph and then slowly transition towards the directed case, considering convex combinations of the transition matrices in the process.
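As a numerical illustration of the interpolation between the undirected and directed chains, the sketch below computes PageRank for a sequence of convex combinations (1 − t)·P_und + t·P_dir, warm-starting each stage from the previous answer. It is an assumption-laden toy: it uses plain power iteration rather than the random-walk sampling of the paper, the five-stage schedule for t is arbitrary, the helper names are hypothetical, and every vertex is assumed to have at least one out-neighbour.

```python
def transition(adj):
    """Row-stochastic matrix of the walk that moves to a uniform out-neighbour."""
    n = len(adj)
    return [[adj[u].count(v) / len(adj[u]) for v in range(n)] for u in range(n)]

def interpolate(P_und, P_dir, t):
    """Convex combination (1 - t) * P_und + t * P_dir, entry by entry."""
    return [[(1 - t) * a + t * b for a, b in zip(row_u, row_d)]
            for row_u, row_d in zip(P_und, P_dir)]

def pagerank(P, eps, start, iters=200):
    """Power iteration for pi = eps * uniform + (1 - eps) * pi P."""
    n = len(P)
    pi = list(start)
    for _ in range(iters):
        pi = [eps / n + (1 - eps) * sum(pi[u] * P[u][v] for u in range(n))
              for v in range(n)]
    return pi

# Hypothetical 3-vertex example: directed edges and their undirected version.
directed = [[1], [2], [0, 1]]
undirected = [[1, 2], [0, 2], [0, 1]]
P_dir, P_und = transition(directed), transition(undirected)

eps = 0.15                    # damping factor is 1 - eps
pi = [1 / 3] * 3              # uniform start
for step in range(5):         # t = 0, 0.25, 0.5, 0.75, 1
    t = step / 4
    pi = pagerank(interpolate(P_und, P_dir, t), eps, pi)
print([round(x, 4) for x in pi])
```

At t = 0 the chain is the undirected one and at t = 1 it is the directed one, so the final vector is the PageRank of the directed graph.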

For PageRank, we achieve the following round complexities for damping factor equal to 1 − ε:

• O(log log n + log 1/ε) rounds for undirected graphs (with Õ(m / ε²) total space),
• Õ(log² log n + log² 1/ε) rounds for directed graphs (with Õ((m + n^(1+o(1))) / poly(ε)) total space).

The round complexity of our result for computing PageRank has only logarithmic dependence on 1/ε. We use this to show that our PageRank algorithm can be used to construct directed length-l random walks in O(log² log n + log² l) rounds with Õ((m + n^(1+o(1))) poly(l)) total space. More specifically, by setting ε = Θ(1/l), a length-l PageRank walk with constant probability contains no random jump, and hence is a directed random walk.
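The constant-probability claim can be checked with a short calculation. This is a sketch that assumes each step independently makes a random jump with probability ε. The constant `c` in ε = c/l is a hypothetical stand-in for the constant hidden by the Θ notation.

```python
import math

def no_jump_probability(l, c=1.0):
    """Probability that l consecutive steps all follow an edge, when each
    step independently jumps with probability eps = c / l."""
    eps = c / l
    return (1.0 - eps) ** l

for l in (2, 10, 100, 1000):
    print(l, round(no_jump_probability(l), 4))
print("limit 1/e =", round(math.exp(-1), 4))
```

With c = 1 the probability equals 1/4 at l = 2 and increases towards 1/e as l grows, so it is bounded below by a constant independent of l.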


          • Published in

            STOC 2020: Proceedings of the 52nd Annual ACM SIGACT Symposium on Theory of Computing
            June 2020
            1429 pages
            ISBN:9781450369794
            DOI:10.1145/3357713

            Copyright © 2020 Owner/Author

            This work is licensed under a Creative Commons Attribution International 4.0 License.

            Publisher

            Association for Computing Machinery

            New York, NY, United States


            Acceptance Rates

            Overall Acceptance Rate: 1,469 of 4,586 submissions, 32%
