Abstract
The continuous growth in the size and use of the World Wide Web calls for new methods of designing and developing online information services. Predicting users' needs in order to improve the usability and user retention of a Web site is clearly necessary, and personalization is one way to address it. Recommendation algorithms propose “next” pages to users based on their current visit and on past users' navigational patterns. In the vast majority of such algorithms, however, only usage data is used to produce recommendations, and the structural properties of the Web graph are disregarded. As a result, pages that are important in terms of PageRank authority score may be underrated. In this work, we present UPR, a PageRank-style algorithm that combines usage data and link analysis techniques to assign probabilities to Web pages based on their importance in the Web site's navigational graph. We propose applying a localized version of UPR (l-UPR) to personalized navigational subgraphs for online Web page ranking and recommendation. Moreover, we propose a hybrid probabilistic predictive model that combines Markov models with link analysis, the latter being used to assign prior probabilities. We show, through experimentation, that this approach results in more objective and representative predictions than those produced by pure usage-based approaches.
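As a rough illustration of combining usage data with link analysis, the following Python sketch computes a PageRank-style score in which each link is weighted by how often users traversed it. It is not the UPR definition from the paper, which the abstract does not spell out. The function name `usage_weighted_pagerank`, the damping value of 0.85, the uniform handling of pages with no recorded outgoing traversals, and the toy traversal counts are all assumptions made for illustration.

```python
from collections import defaultdict


def usage_weighted_pagerank(transitions, damping=0.85, iterations=100, tol=1e-10):
    """Hypothetical usage-weighted PageRank sketch (not the paper's exact UPR).

    transitions maps (source_page, target_page) to the number of times users
    followed that link, as might be counted from Web logs.
    """
    pages = sorted({page for edge in transitions for page in edge})
    # Keep only links that were actually traversed, so no division by zero occurs.
    edges = {edge: count for edge, count in transitions.items() if count > 0}

    # Total outgoing traversal count per page, used to normalize link weights.
    out_weight = defaultdict(float)
    for (src, _dst), count in edges.items():
        out_weight[src] += count

    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Assumption: rank held by pages without outgoing traversals is spread uniformly.
        dangling = sum(rank[page] for page in pages if out_weight[page] == 0)
        new_rank = {page: (1.0 - damping) / n + damping * dangling / n for page in pages}
        # Each page passes its rank along its links in proportion to observed usage.
        for (src, dst), count in edges.items():
            new_rank[dst] += damping * rank[src] * count / out_weight[src]
        delta = sum(abs(new_rank[page] - rank[page]) for page in pages)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Toy traversal counts (invented for illustration): "a" is visited from "home"
# far more often than "b", so it should end up with the higher score.
toy_transitions = {
    ("home", "a"): 9,
    ("home", "b"): 1,
    ("a", "home"): 1,
    ("b", "home"): 1,
}
toy_ranks = usage_weighted_pagerank(toy_transitions)
for page, score in sorted(toy_ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 4))
```

With uniform traversal counts this reduces to ordinary PageRank on the link graph, so the usage data acts only as a reweighting of the existing structure. A localized variant in the spirit of l-UPR could presumably be obtained by passing in only the transitions of a personalized subgraph, although the paper's exact construction is not given in the abstract.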
Index Terms
Web site personalization based on link analysis and navigational patterns