Abstract
This article addresses search engine personalization. We present a new approach to mining a user's preferences on the search results from clickthrough data and using the discovered preferences to adapt the search engine's ranking function for improving search quality. We develop a new preference mining technique called SpyNB, which is based on the practical assumption that the search results clicked on by the user reflect the user's preferences but does not draw any conclusions about the results that the user did not click on. As such, SpyNB is still valid even if the user does not follow any order in reading the search results or does not click on all relevant results. Our extensive offline experiments demonstrate that SpyNB discovers many more accurate preferences than existing algorithms do. The interactive online experiments further confirm that SpyNB and our personalization approach are effective in practice. We also show that the efficiency of SpyNB is comparable to existing simple preference mining algorithms.
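The abstract states only SpyNB's assumption: clicked results count as positive evidence, and no conclusion is drawn about unclicked results. The sketch below is one possible illustration of that idea, not the authors' algorithm. It assumes the "spy" technique from learning with positive and unlabeled examples: each clicked result in turn is hidden among the unclicked ones, a small naive Bayes model is trained, and any unclicked result that scores below the spy receives a vote as a likely negative. The function names, the token-list representation of results, and the `vote_fraction` parameter are all hypothetical.

```python
import math
from collections import Counter


def train_nb(pos_docs, neg_docs):
    """Train a two-class multinomial naive Bayes model with Laplace smoothing.

    Each document is a list of tokens. Both lists must be non-empty.
    """
    vocab = {w for d in pos_docs + neg_docs for w in d}
    total = len(pos_docs) + len(neg_docs)
    model = {}
    for label, docs in (("pos", pos_docs), ("neg", neg_docs)):
        counts = Counter(w for d in docs for w in d)
        n = sum(counts.values())
        model[label] = {
            "prior": math.log(len(docs) / total),
            "loglik": {
                w: math.log((counts[w] + 1) / (n + len(vocab))) for w in vocab
            },
        }
    return model


def pos_probability(model, doc):
    """Return the posterior probability that doc belongs to the positive class."""
    scores = {}
    for label, params in model.items():
        loglik = params["loglik"]
        scores[label] = params["prior"] + sum(loglik[w] for w in doc if w in loglik)
    top = max(scores.values())  # subtract the max for numerical stability
    exp = {label: math.exp(s - top) for label, s in scores.items()}
    return exp["pos"] / (exp["pos"] + exp["neg"])


def mine_preferences(clicked, unclicked, vote_fraction=0.5):
    """Return (preferred_id, less_preferred_id) pairs.

    clicked / unclicked map a result id to its token list. Unclicked results
    start as unlabeled; only those voted negative by more than
    vote_fraction of the spies end up in a preference pair (assumed rule).
    """
    if len(clicked) < 2 or not unclicked:
        return []  # a spy needs at least one other positive to train against
    votes = Counter()
    for spy_id, spy_doc in clicked.items():
        pos = [d for rid, d in clicked.items() if rid != spy_id]
        neg = list(unclicked.values()) + [spy_doc]  # the spy hides among unlabeled
        model = train_nb(pos, neg)
        threshold = pos_probability(model, spy_doc)
        for rid, doc in unclicked.items():
            if pos_probability(model, doc) < threshold:
                votes[rid] += 1
    negatives = [rid for rid in unclicked if votes[rid] > vote_fraction * len(clicked)]
    return [(c, n) for c in clicked for n in negatives]


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    clicked = {
        "r1": ["apple", "fruit", "juice"],
        "r2": ["apple", "fruit", "pie"],
    }
    unclicked = {
        "r3": ["apple", "fruit", "tree"],
        "r4": ["apple", "computer", "laptop"],
        "r5": ["apple", "computer", "phone"],
    }
    for preferred, other in mine_preferences(clicked, unclicked):
        print(f"{preferred} is preferred over {other}")
```

Because an unclicked result is never treated as negative by default, a result the user simply did not reach produces no preference pair unless the spies vote it out. Pairs of this kind could then serve as training constraints for a ranking function, which is the adaptation step the abstract mentions.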
Index Terms
Mining User preference using Spy voting for search engine personalization