DOI: 10.1145/1390334.1390412
research-article

BrowseRank: letting web users vote for page importance

Published: 20 July 2008

ABSTRACT

This paper proposes a new method for computing page importance, referred to as BrowseRank. The conventional approach to computing page importance is to exploit the link graph of the web and to build a model based on that graph. PageRank is one such algorithm; it employs a discrete-time Markov process as its model. Unfortunately, the link graph can be an incomplete and inaccurate source of data for determining page importance, because web content creators can easily add and delete links. In this paper, we propose computing page importance from a 'user browsing graph' created from user behavior data. In this graph, vertices represent pages and directed edges represent transitions between pages in the users' web browsing history. The graph also records how long users stay on each page. The user browsing graph is more reliable than the link graph for inferring page importance. This paper further proposes using a continuous-time Markov process on the user browsing graph as the model and taking the stationary probability distribution of that process as page importance. An efficient algorithm for this computation has also been devised. In this way, we can leverage hundreds of millions of users' implicit votes on page importance. Experimental results show that BrowseRank outperforms baseline methods such as PageRank and TrustRank in several tasks.
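The idea in the abstract can be illustrated with a small sketch. It is not the paper's algorithm, which the abstract does not spell out. It relies on a standard property of continuous-time Markov processes: the stationary probability of a state is proportional to the stationary probability of the embedded (jump) chain multiplied by the mean staying time in that state. The session format, the function name `browse_importance`, the uniform reset with `damping=0.85`, and the use of the sample mean as the staying-time estimate are all assumptions made for illustration.

```python
from collections import defaultdict


def browse_importance(sessions, damping=0.85, iterations=100):
    """Toy page-importance score from browsing sessions.

    sessions: list of sessions, each a list of (page, seconds_on_page)
    tuples in visit order. This input format is hypothetical.
    Assumes at least one page and positive staying times.
    """
    counts = defaultdict(lambda: defaultdict(int))  # page -> next page -> count
    stay = defaultdict(list)                        # page -> observed staying times

    # Build the user browsing graph: vertices, transitions, staying times.
    for session in sessions:
        for i, (page, seconds) in enumerate(session):
            stay[page].append(seconds)
            if i + 1 < len(session):
                counts[page][session[i + 1][0]] += 1

    pages = sorted(stay)
    n = len(pages)

    # Power iteration on the embedded discrete-time chain, with a uniform
    # reset (an assumption borrowed from PageRank-style smoothing).
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        nxt = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            out = counts.get(p)
            total = sum(out.values()) if out else 0
            if total == 0:
                # Page with no observed outgoing transition: spread uniformly.
                share = damping * rank[p] / n
                for q in pages:
                    nxt[q] += share
            else:
                for q, c in out.items():
                    nxt[q] += damping * rank[p] * c / total
        rank = nxt

    # Continuous-time correction: weight each page by its mean staying time,
    # then renormalize so the scores sum to 1.
    weighted = {p: rank[p] * sum(stay[p]) / len(stay[p]) for p in pages}
    z = sum(weighted.values())
    return {p: w / z for p, w in weighted.items()}


if __name__ == "__main__":
    demo = [[("a", 10), ("b", 30), ("a", 10)]]
    for page, score in browse_importance(demo).items():
        print(page, round(score, 3))
```

In the demo session, pages `a` and `b` are visited equally often by the embedded chain, but users stay three times longer on `b`, so `b` receives roughly three times the score of `a`. That difference is what the staying-time information adds over a purely discrete-time model.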

Published in

SIGIR '08: Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
July 2008
934 pages
ISBN: 9781605581644
DOI: 10.1145/1390334

Copyright © 2008 ACM

Publisher: Association for Computing Machinery, New York, NY, United States
Acceptance Rates

Overall Acceptance Rate: 792 of 3,983 submissions, 20%