ABSTRACT
The Kleinberg HITS and the Google PageRank algorithms are eigenvector methods for identifying "authoritative" or "influential" articles, given hyperlink or citation information. That such algorithms should give reliable or consistent answers is surely a desideratum, and in [10], we analyzed when they can be expected to give stable rankings under small perturbations to the linkage patterns. In this paper, we extend the analysis and show how it gives insight into ways of designing stable link analysis methods. This in turn motivates two new algorithms, whose performance we study empirically using citation data and web hyperlink data.
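To make the notion of stability concrete, the following is a minimal sketch, not the authors' implementation, of the two baseline eigenvector methods named above, together with a simple check of how far the scores move when one link is removed. The function names, the toy graph, the iteration count, and the reset probability `eps=0.15` are illustrative assumptions that do not come from the paper.

```python
import numpy as np


def hits_authority(adj, iters=200):
    """Authority scores as the principal eigenvector of A^T A (power iteration).

    adj[i, j] = 1 means page i links to page j.
    """
    a = np.ones(adj.shape[0])
    for _ in range(iters):
        h = adj @ a            # hub score: sum of authorities a page points to
        a = adj.T @ h          # authority score: sum of hubs pointing to a page
        a /= np.linalg.norm(a)  # keep the vector at unit length
    return a


def pagerank(adj, eps=0.15, iters=200):
    """Stationary distribution of a random surfer with reset probability eps.

    eps=0.15 is an assumed, commonly used value, not one taken from the paper.
    """
    n = adj.shape[0]
    out = adj.sum(axis=1, keepdims=True)
    # Row-normalize; pages without out-links jump uniformly at random.
    walk = np.where(out > 0, adj / np.maximum(out, 1), 1.0 / n)
    trans = (1 - eps) * walk + eps / n
    p = np.full(n, 1.0 / n)
    for _ in range(iters):
        p = p @ trans
    return p


# Hypothetical 4-page graph used only for illustration.
adj = np.array([
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

# Perturb the linkage pattern by deleting the single link 0 -> 1.
perturbed = adj.copy()
perturbed[0, 1] = 0.0

hits_shift = np.linalg.norm(hits_authority(adj) - hits_authority(perturbed))
pr_shift = np.linalg.norm(pagerank(adj) - pagerank(perturbed))
print("HITS authority shift:", hits_shift)
print("PageRank shift:", pr_shift)
```

A small shift under such a deletion is what a stable ranking would look like on this toy graph. The sketch only measures the shift for one perturbation and says nothing about the general conditions analyzed in the paper.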
References
- 1. B. Amento, L. G. Terveen, and W. C. Hill. Does "authority" mean quality? Predicting expert quality ratings of web documents. In Proc. 23rd Annual Intl. ACM SIGIR Conference, pages 296-303. ACM, 2000.
- 2. K. Bharat and M. Henzinger. Improved algorithms for topic distillation in a hyperlinked environment. In Proc. 21st Annual Intl. ACM SIGIR Conf., pages 104-111. ACM, 1998.
- 3. S. Brin and L. Page. The anatomy of a large-scale hypertextual (Web) search engine. In The Seventh International World Wide Web Conference, 1998.
- 4. Fan R. K. Chung. Spectral Graph Theory. American Mathematical Society, 1994.
- 5. D. Cohn and H. Chang. Probabilistically identifying authoritative documents. In Proc. 17th International Conference on Machine Learning, 2000.
- 6. S. Deerwester, S. Dumais, G. Furnas, T. Landauer, and R. Harshman. Indexing by latent semantic analysis. Journal of the American Society for Information Science, 41(6):391-407, 1990.
- 7. G. H. Golub and C. F. Van Loan. Matrix Computations. Johns Hopkins Univ. Press, 1996.
- 8. J. Kleinberg. Authoritative sources in a hyperlinked environment. Proc. 9th ACM-SIAM Symposium on Discrete Algorithms, 1998.
- 9. A. McCallum, K. Nigam, J. Rennie, and K. Seymore. Automating the construction of Internet portals with machine learning. Information Retrieval Journal, 3:127-163, 2000.
- 10. A. Y. Ng, A. X. Zheng, and M. I. Jordan. Link analysis, eigenvectors, and stability. In Proc. 17th International Joint Conference on Artificial Intelligence, 2001.
- 11. F. Osareh. Bibliometrics, citation analysis and co-citation analysis: A review of literature I. Libri, 46:149-158, 1996.
- 12. L. Page, S. Brin, R. Motwani, and T. Winograd. The PageRank citation ranking: Bringing order to the web. Unpublished Manuscript, 1998.
- 13. C. Papadimitriou, P. Raghavan, H. Tamaki, and S. Vempala. Latent semantic indexing: A probabilistic analysis. In Proc. SIGMODS/PODS, 1998.
- 14. Davood Rafiei and Alberto Mendelzon. What is this Page Known for? Computing Web Page Reputations. In Proc. WWW9 Conference, 2000.
- 15. G. W. Stewart and Ji-Guang Sun. Matrix Perturbation Theory. Academic Press, 1990.
Index Terms
Stable algorithms for link analysis