DOI: 10.1145/1099554.1099705
Venue: CIKM Conference Proceedings (ACM)
Article

Distributed PageRank computation based on iterative aggregation-disaggregation methods

Published: 31 October 2005

ABSTRACT

PageRank has been widely used as a major factor in search engine ranking systems. However, computing PageRank requires global link graph information, which incurs a prohibitive communication cost when accurate results are needed from a distributed solution. In this paper, we propose a distributed PageRank computation algorithm based on the iterative aggregation-disaggregation (IAD) method with Block Jacobi smoothing. The basic idea is divide-and-conquer. We treat each web site as a node to exploit the block structure of hyperlinks. Each node computes its local PageRank itself and then updates it through low-cost communication with a coordinator. We prove the global convergence of the Block Jacobi method and then analyze the communication overhead and major advantages of our algorithm. Experiments on three real web graphs show that our method converges 5-7 times faster than the traditional Power method. We believe our work provides an efficient and practical distributed solution for PageRank on large-scale Web graphs.
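To make the contrast between the Power method and a site-wise block iteration concrete, the following is a minimal sketch and not the algorithm of the paper: it omits the aggregation step and the coordinator entirely and only illustrates plain Block Jacobi applied to the PageRank linear system, with one block per web site. The damping factor 0.85, the six-page toy graph, the uniform teleport vector, the absence of dangling pages, and all function and variable names are assumptions made for illustration.

```python
ALPHA = 0.85  # damping factor; a common choice, not a value taken from the paper


def power_pagerank(links, alpha=ALPHA, tol=1e-10, max_iter=1000):
    """Plain power iteration. Assumes every page has at least one out-link."""
    n = len(links)
    x = [1.0 / n] * n
    for it in range(1, max_iter + 1):
        new = [(1.0 - alpha) / n] * n
        for src, outs in enumerate(links):
            share = alpha * x[src] / len(outs)
            for dst in outs:
                new[dst] += share
        delta = sum(abs(a - c) for a, c in zip(new, x))
        x = new
        if delta < tol:
            return x, it
    return x, max_iter


def block_jacobi_pagerank(links, site_of, alpha=ALPHA, tol=1e-10,
                          max_outer=1000, inner_tol=1e-12, max_inner=1000):
    """Block Jacobi on (I - alpha * P^T) x = (1 - alpha) / n, one block per site."""
    n = len(links)
    x = [1.0 / n] * n
    for outer in range(1, max_outer + 1):
        # Right-hand side: teleport term plus rank arriving over inter-site
        # links, evaluated with the values from the previous outer round.
        rhs = [(1.0 - alpha) / n] * n
        for src, outs in enumerate(links):
            share = alpha * x[src] / len(outs)
            for dst in outs:
                if site_of[dst] != site_of[src]:
                    rhs[dst] += share
        # Local solves: intra-site links never couple two different sites, so
        # this single loop is equivalent to each site iterating on its own.
        new = list(x)
        for _ in range(max_inner):
            nxt = list(rhs)
            for src, outs in enumerate(links):
                share = alpha * new[src] / len(outs)
                for dst in outs:
                    if site_of[dst] == site_of[src]:
                        nxt[dst] += share
            inner_delta = sum(abs(a - c) for a, c in zip(nxt, new))
            new = nxt
            if inner_delta < inner_tol:
                break
        delta = sum(abs(a - c) for a, c in zip(new, x))
        x = new
        if delta < tol:
            return x, outer
    return x, max_outer


# Hypothetical toy graph: pages 0-2 belong to site 0, pages 3-5 to site 1.
links = [[1, 2], [2], [0, 3], [4], [5], [3, 0]]
site_of = [0, 0, 0, 1, 1, 1]

pr_power, it_power = power_pagerank(links)
pr_block, it_block = block_jacobi_pagerank(links, site_of)
print("power iterations:", it_power, "block outer rounds:", it_block)
```

In this sketch only the inter-site contributions change between outer rounds, which is the part that would require communication in a distributed setting; the inner loop uses purely site-local links. Both routines target the same fixed point, so their outputs should agree up to the chosen tolerances.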

