Online Learning of Optimally Diverse Rankings

Abstract
Search engines answer users' queries by listing relevant items (e.g., documents, songs, products, web pages). These engines rely on algorithms that learn to rank items so as to present an ordered list maximizing the probability that it contains a relevant item. The main challenge in the design of learning-to-rank algorithms stems from the fact that queries often have different meanings for different users. In the absence of any contextual information about the query, one often has to adhere to the diversity principle, i.e., to return a list covering the various possible topics or meanings of the query. To formalize this learning-to-rank problem, we propose a natural model where (i) items are categorized into topics, (ii) users find items relevant only if they match the topic of their query, and (iii) the engine is not aware of the topic of an arriving query, nor of the frequency at which queries related to various topics arrive, nor of the topic-dependent click-through rates of the items. For this problem, we devise LDR (Learning Diverse Rankings), an algorithm that efficiently learns the optimal list based on users' feedback only. We show that after $T$ queries, the regret of LDR scales as $O((N-L)\log(T))$, where $N$ is the total number of items and $L$ is the length of the displayed list. We further establish that this scaling cannot be improved, i.e., LDR is order optimal. Finally, using numerical experiments on both artificial and real-world data, we illustrate the superiority of LDR compared to existing learning-to-rank algorithms.
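The feedback model sketched in points (i)-(iii) can be made concrete with a small simulation. The sketch below is illustrative only: the function name, the cascade-style scan (the user reads the list top to bottom and clicks the first relevant item that attracts them), and all parameter names are assumptions for this example, not details taken from the paper.

```python
import random

def simulate_query(ranking, item_topic, click_prob, topic_freq, rng=random):
    """Simulate one query under the topic-based feedback model:
    the query has a latent topic, and an item can be clicked only
    if its topic matches the query's topic.

    ranking    : ordered list of item ids shown to the user (length L)
    item_topic : dict, item id -> topic of that item
    click_prob : dict, item id -> click-through rate given a topic match
    topic_freq : dict, topic -> arrival probability (unknown to the engine)

    Returns the position of the click, or None if nothing was clicked.
    """
    # Draw the query's latent topic from the (engine-unknown) distribution.
    topics, weights = zip(*topic_freq.items())
    query_topic = rng.choices(topics, weights=weights)[0]
    # Cascade-style scan: the first relevant clicked item ends the query.
    for pos, item in enumerate(ranking):
        if item_topic[item] == query_topic and rng.random() < click_prob[item]:
            return pos
    return None
```

An online algorithm such as LDR only ever observes the return value of a routine like this (which position was clicked, if any), never the query's topic or the distributions themselves, which is what makes the learning problem nontrivial.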
Published in SIGMETRICS '18: Abstracts of the 2018 ACM International Conference on Measurement and Modeling of Computer Systems.