DOI: 10.1145/3477495.3531994
research-article

Implicit Feedback for Dense Passage Retrieval: A Counterfactual Approach

Published: 07 July 2022

ABSTRACT

In this paper we study how to effectively exploit implicit feedback in Dense Retrievers (DRs). We consider the specific case in which click data from a historic click log is available as implicit feedback. We then exploit such historic implicit interactions to improve the effectiveness of a DR. A key challenge that we study is the effect that biases in the click signal, such as position bias, have on the DRs. To overcome the problems associated with the presence of such bias, we propose the Counterfactual Rocchio (CoRocchio) algorithm for exploiting implicit feedback in Dense Retrievers. We demonstrate both theoretically and empirically that dense query representations learnt with CoRocchio are unbiased with respect to position bias and lead to higher retrieval effectiveness. We make available the implementations of the proposed methods and the experimental framework, along with all results at https://github.com/ielab/Counterfactual-DR.
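The abstract does not state the CoRocchio update itself, so the following is only a minimal sketch of the general idea it describes: a Rocchio-style update of a dense query vector in which each clicked passage is weighted by the inverse of its examination propensity, so that clicks at low ranks are not under-counted. The function names, the `(1 / rank) ** eta` propensity model, the `alpha` and `beta` weights, and the normalisation by the number of logged sessions are all assumptions made for illustration, not the authors' implementation (which is available at the repository linked above).

```python
from typing import List, Sequence, Tuple

Vector = List[float]
# (clicked passage embedding, 1-based rank at which it was shown)
Click = Tuple[Sequence[float], int]


def examination_propensity(rank: int, eta: float = 1.0) -> float:
    """Assumed position-bias model: P(examined at rank) = (1 / rank) ** eta."""
    if rank < 1:
        raise ValueError("rank must be 1 or greater")
    return (1.0 / rank) ** eta


def ipw_rocchio(
    query: Sequence[float],
    clicks: Sequence[Click],
    num_sessions: int,
    alpha: float = 1.0,
    beta: float = 1.0,
    eta: float = 1.0,
) -> Vector:
    """Hypothetical inverse-propensity-weighted Rocchio update.

    Each clicked passage vector is scaled by 1 / propensity(rank), summed,
    averaged over the logged sessions, and interpolated with the original
    query vector.
    """
    if num_sessions < 1:
        raise ValueError("num_sessions must be 1 or greater")
    feedback = [0.0] * len(query)
    for passage, rank in clicks:
        weight = 1.0 / examination_propensity(rank, eta)
        for i, value in enumerate(passage):
            feedback[i] += weight * value
    return [alpha * q + beta * f / num_sessions for q, f in zip(query, feedback)]


# One logged session with a single click at rank 2 (weight 1 / 0.5 = 2).
print(ipw_rocchio([1.0, 0.0], [([0.0, 1.0], 2)], num_sessions=1))  # [1.0, 2.0]
```

Under a position-based click model, where the click probability is the examination propensity times the relevance, dividing each click by its propensity makes the averaged feedback vector an unbiased estimate of the relevance-weighted sum of passage vectors. This relies on the propensities being known or well estimated.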


Published in

SIGIR '22: Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval
July 2022
3569 pages
ISBN: 9781450387323
DOI: 10.1145/3477495

Copyright © 2022 ACM

Publisher: Association for Computing Machinery, New York, NY, United States
