ABSTRACT
PageRank is widely used as a major factor in search engine ranking systems. However, computing PageRank requires global link graph information, so a distributed solution incurs prohibitive communication cost to achieve accurate results. In this paper, we propose a distributed PageRank computation algorithm based on the iterative aggregation-disaggregation (IAD) method with Block Jacobi smoothing. The basic idea is divide-and-conquer: we treat each web site as a node to exploit the block structure of hyperlinks. Each node computes its local PageRank itself and then updates it through low-cost communication with a coordinator. We prove the global convergence of the Block Jacobi method and then analyze the communication overhead and major advantages of our algorithm. Experiments on three real web graphs show that our method converges 5-7 times faster than the traditional Power method. We believe our work provides an efficient and practical distributed solution for PageRank on large-scale Web graphs.
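To make the block idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes PageRank is written as the linear system (I - alpha * P^T) x = (1 - alpha) * v with a uniform v, groups pages by site, and applies a plain Block Jacobi iteration in which each site solves its local system using only the previous iterate of the other sites. The aggregation-disaggregation step with a coordinator is omitted. The toy graph, the damping factor `ALPHA`, and the names `inlinks`, `power_method`, and `block_jacobi` are all hypothetical and chosen for illustration.

```python
# Illustrative sketch only: Block Jacobi on the PageRank linear system
# (I - alpha * P^T) x = (1 - alpha) * v, with pages grouped by site.
# The graph, ALPHA, and all function names are hypothetical.

ALPHA = 0.85  # assumed damping factor

# Toy web graph: page -> outlinks. Every page has at least one outlink,
# so no dangling-node correction is needed in this sketch.
LINKS = {
    0: [1, 2], 1: [0], 2: [0, 1, 3],   # site A: pages 0-2
    3: [4], 4: [3, 5], 5: [3, 0],      # site B: pages 3-5
}
BLOCKS = [[0, 1, 2], [3, 4, 5]]        # one block per web site


def inlinks(links):
    """Return page -> list of (source page, transition probability)."""
    incoming = {p: [] for p in links}
    for src, outs in links.items():
        for dst in outs:
            incoming[dst].append((src, 1.0 / len(outs)))
    return incoming


def power_method(links, alpha=ALPHA, tol=1e-12, max_iter=1000):
    """Reference solution: x <- alpha * P^T x + (1 - alpha) / n."""
    n, inc = len(links), inlinks(links)
    x = {p: 1.0 / n for p in links}
    for _ in range(max_iter):
        new = {p: (1 - alpha) / n + alpha * sum(w * x[s] for s, w in inc[p])
               for p in links}
        delta = sum(abs(new[p] - x[p]) for p in links)
        x = new
        if delta < tol:
            break
    return x


def block_jacobi(links, blocks, alpha=ALPHA, tol=1e-12, max_iter=1000):
    """Each block solves its local system with the other blocks frozen."""
    n, inc = len(links), inlinks(links)
    x = {p: 1.0 / n for p in links}
    for _ in range(max_iter):
        new = {}
        for block in blocks:
            members = set(block)
            # Right-hand side: teleport term plus rank flowing in from
            # other sites, taken from the previous outer iterate only.
            rhs = {p: (1 - alpha) / n
                   + alpha * sum(w * x[s] for s, w in inc[p]
                                 if s not in members)
                   for p in block}
            # Local solve by fixed-point iteration on intra-site links.
            y = {p: x[p] for p in block}
            for _ in range(max_iter):
                y_new = {p: rhs[p]
                         + alpha * sum(w * y[s] for s, w in inc[p]
                                       if s in members)
                         for p in block}
                inner_delta = sum(abs(y_new[p] - y[p]) for p in block)
                y = y_new
                if inner_delta < tol:
                    break
            new.update(y)
        delta = sum(abs(new[p] - x[p]) for p in links)
        x = new
        if delta < tol:
            break
    return x


if __name__ == "__main__":
    reference = power_method(LINKS)
    blocked = block_jacobi(LINKS, BLOCKS)
    for page in sorted(LINKS):
        print(page, round(reference[page], 6), round(blocked[page], 6))
```

In this sketch only the cross-site term of the right-hand side depends on other blocks, which is the quantity a site would need to exchange in a distributed setting; everything inside the local solve uses intra-site links alone.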
Index Terms
Distributed PageRank computation based on iterative aggregation-disaggregation methods