DOI: 10.1145/2213556.2213580

Max-Sum diversification, monotone submodular functions and dynamic updates

Published: 21 May 2012

ABSTRACT

Result diversification has many important applications in databases, operations research, information retrieval, and finance. In this paper, we study and extend a particular version of result diversification, known as max-sum diversification. More specifically, we consider the setting where we are given a set of elements in a metric space and a set valuation function f defined on every subset. For any given subset S, the overall objective is a linear combination of f(S) and the sum of the distances induced by S. The goal is to find a subset S satisfying some constraints that maximizes the overall objective.
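As a minimal sketch of the objective described above, the linear combination can be written as f(S) plus a trade-off weight times the sum of pairwise distances in S. The name `lam` for that weight, the helper names, and the toy instance are assumptions made for illustration; the abstract does not fix a notation.

```python
from itertools import combinations

def max_sum_objective(S, f, dist, lam):
    """Return f(S) + lam * (sum of pairwise distances inside S)."""
    dispersion = sum(dist(u, v) for u, v in combinations(list(S), 2))
    return f(S) + lam * dispersion

# Hypothetical toy instance: points on a line with modular (additive) weights.
position = {"a": 0.0, "b": 1.0, "c": 3.0}
weight = {"a": 1.0, "b": 2.0, "c": 0.5}

def f(S):
    # Modular valuation: the value of a set is the sum of its element weights.
    return sum(weight[u] for u in S)

def dist(u, v):
    # Absolute difference on a line is a metric.
    return abs(position[u] - position[v])

value = max_sum_objective({"a", "c"}, f, dist, lam=0.5)
print(value)
```

Any set function can be passed as `f`; the modular one here is only the simplest case.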

This problem was first studied by Gollapudi and Sharma in [17] for modular set functions and for sets satisfying a cardinality constraint (uniform matroids). In their paper, they give a 2-approximation algorithm by reducing to an earlier result in [20]. The first part of this paper considers an extension of the modular case to the monotone submodular case, for which the algorithm in [17] no longer applies. Interestingly, we are able to maintain the same 2-approximation using a natural, but different greedy algorithm. We then further extend the problem by considering any matroid constraint and show that a natural single swap local search algorithm provides a 2-approximation in this more general setting. This extends the Nemhauser, Wolsey and Fisher approximation result [29] for the problem of submodular function maximization subject to a matroid constraint (without the distance function component).
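The abstract does not spell out the greedy rule, so the following is only a sketch of one plausible greedy under a cardinality constraint. It assumes each step adds the element that maximizes half of its marginal gain in f plus `lam` times its total distance to the current set; the halving, the names, and the coverage-style toy valuation are all assumptions here, and no approximation guarantee is claimed for this code.

```python
def greedy_diversify(universe, f, dist, lam, p):
    """Greedily build a set of size at most p (assumed scoring rule, see above)."""
    S = set()
    while len(S) < p:
        best, best_score = None, float("-inf")
        for u in universe:
            if u in S:
                continue
            gain = f(S | {u}) - f(S)              # marginal value of adding u
            spread = sum(dist(u, v) for v in S)   # distance from u to current set
            score = 0.5 * gain + lam * spread
            if score > best_score:
                best, best_score = u, score
        if best is None:                          # universe exhausted
            break
        S.add(best)
    return S

# Hypothetical instance: f counts covered topics (monotone submodular).
topics = {"a": {1, 2}, "b": {2, 3}, "c": {4}}
position = {"a": 0.0, "b": 1.0, "c": 10.0}

def coverage(S):
    covered = set()
    for u in S:
        covered |= topics[u]
    return len(covered)

def line_dist(u, v):
    return abs(position[u] - position[v])

chosen = greedy_diversify(["a", "b", "c"], coverage, line_dist, lam=1.0, p=2)
print(sorted(chosen))
```

On this instance the second pick favors the far-away item "c" over the nearby item "b", which is the behavior the distance term is meant to encourage.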

The second part of the paper focuses on dynamic updates for the modular case. Suppose we have a good initial approximate solution and then there is a single weight-perturbation either on the valuation of an element or on the distance between two elements. Given that users expect some stability in the results they see, we ask how easy it is to maintain a good approximation without significantly changing the initial set. We measure this by the number of updates, where each update is a swap of a single element in the current solution with a single element outside the current solution. We show that we can maintain an approximation ratio of 3 by just a single update if the perturbation is not too large.
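To make the notion of an update concrete, the sketch below applies a single swap after a weight perturbation in the modular case. It assumes the swap is chosen by trying every (inside, outside) pair and keeping the one with the largest objective, which is an illustrative choice and not necessarily the paper's rule; `lam`, the function names, and the toy data are hypothetical.

```python
from itertools import combinations

def modular_objective(S, weight, dist, lam):
    """Sum of element weights plus lam times the pairwise distances in S."""
    value = sum(weight[u] for u in S)
    spread = sum(dist(u, v) for u, v in combinations(list(S), 2))
    return value + lam * spread

def best_single_swap(S, universe, weight, dist, lam):
    """Return S after at most one swap; S is unchanged if no swap improves it."""
    best_set = set(S)
    best_val = modular_objective(S, weight, dist, lam)
    for out in S:
        for inn in universe:
            if inn in S:
                continue
            candidate = (S - {out}) | {inn}
            val = modular_objective(candidate, weight, dist, lam)
            if val > best_val:
                best_set, best_val = candidate, val
    return best_set

# Hypothetical instance: three points on a line, unit weights.
position = {"a": 0.0, "b": 4.0, "c": 10.0}
weight = {"a": 1.0, "b": 1.0, "c": 1.0}

def line_dist(u, v):
    return abs(position[u] - position[v])

current = {"a", "c"}
weight["b"] = 20.0  # single weight perturbation on one element
updated = best_single_swap(current, list(position), weight, line_dist, lam=1.0)
print(sorted(updated))
```

The updated set differs from the original in exactly one element, matching the stability measure described above.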

References

  1. R. Agrawal, S. Gollapudi, A. Halverson, and S. Ieong. Diversifying search results. In WSDM, pages 5--14, 2009.
  2. N. Bansal, K. Jain, A. Kazeykina, and J. Naor. Approximation algorithms for diversified search ranking. In ICALP (2), pages 273--284, 2010.
  3. C. Brandt, T. Joachims, Y. Yue, and J. Bank. Dynamic ranked retrieval. In WSDM, pages 247--256, 2011.
  4. R. A. Brualdi. Comments on bases in dependence structures. Bulletin of the Australian Mathematical Society, 1(02):161--167, 1969.
  5. G. Călinescu, C. Chekuri, M. Pál, and J. Vondrák. Maximizing a monotone submodular function subject to a matroid constraint. SIAM J. Comput., 40(6):1740--1766, 2011.
  6. J. Carbonell and J. Goldstein. The use of mmr, diversity-based reranking for reordering documents and producing summaries. In Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval, SIGIR '98, pages 335--336, New York, NY, USA, 1998. ACM.
  7. B. Chandra and M. M. Halldórsson. Facility dispersion and remote subgraphs. In Proceedings of the 5th Scandinavian Workshop on Algorithm Theory, pages 53--65, London, UK, 1996. Springer-Verlag.
  8. B. Chandra and M. M. Halldórsson. Approximation algorithms for dispersion problems. J. Algorithms, 38(2):438--465, 2001.
  9. H. Chen and D. R. Karger. Less is more: probabilistic models for retrieving fewer relevant documents. In SIGIR, pages 429--436, 2006.
  10. E. Demidova, P. Fankhauser, X. Zhou, and W. Nejdl. Divq: diversification for keyword search over structured databases. In Proceeding of the 33rd international ACM SIGIR conference on Research and development in information retrieval, SIGIR '10, pages 331--338. ACM, 2010.
  11. Z. Dou, S. Hu, K. Chen, R. Song, and J.-R. Wen. Multi-dimensional search result diversification. In WSDM, pages 475--484, 2011.
  12. M. Drosou and E. Pitoura. Diversity over continuous data. IEEE Data Eng. Bull., 32(4):49--56, 2009.
  13. M. Drosou and E. Pitoura. Search result diversification. SIGMOD Record, 39(1):41--47, 2010.
  14. J. Edmonds. Matroids and the greedy algorithm. Mathematical Programming, 1:127--136, 1971.
  15. E. Erkut. The discrete p-dispersion problem. European Journal of Operational Research, 46(1):48--60, May 1990.
  16. E. Erkut and S. Neuman. Analytical models for locating undesirable facilities. European Journal of Operational Research, 40(3):275--291, June 1989.
  17. S. Gollapudi and A. Sharma. An axiomatic approach for result diversification. In World Wide Web Conference Series, pages 381--390, 2009.
  18. M. M. Halldórsson, K. Iwano, N. Katoh, and T. Tokuyama. Finding subsets maximizing minimum structures. In Symposium on Discrete Algorithms, pages 150--159, 1995.
  19. P. Hansen and I. D. Moon. Dispersion facilities on a network. Presentation at the TIMS/ORSA Joint National Meeting, Washington, D.C., 1988.
  20. R. Hassin, S. Rubinstein, and A. Tamir. Approximation algorithms for maximum dispersion. Oper. Res. Lett., 21(3):133--137, 1997.
  21. D. Kempe, J. Kleinberg, and E. Tardos. Maximizing the spread of influence through a social network. In Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining, KDD '03, pages 137--146, 2003.
  22. S. Khanna, R. Motwani, M. Sudan, and U. V. Vazirani. On syntactic versus computational views of approximability. Electronic Colloquium on Computational Complexity (ECCC), 2(23), 1995.
  23. M. J. Kuby. Programming models for facility dispersion: The p-dispersion and maxisum dispersion problems. Geographical Analysis, 19(4):315--329, 1987.
  24. H. Lin and J. Bilmes. Multi-document summarization via budgeted maximization of submodular functions. In HLT-NAACL, pages 912--920, 2010.
  25. H. Lin and J. Bilmes. A class of submodular functions for document summarization. In North American chapter of the Association for Computational Linguistics/Human Language Technology Conference (NAACL/HLT-2011), Portland, OR, June 2011. (long paper).
  26. H. Lin, J. Bilmes, and S. Xie. Graph-based submodular selection for extractive summarization. In Proc. IEEE Automatic Speech Recognition and Understanding (ASRU), Merano, Italy, December 2009.
  27. Z. Liu, P. Sun, and Y. Chen. Structured search result differentiation. PVLDB, 2(1):313--324, 2009.
  28. E. Minack, W. Siberski, and W. Nejdl. Incremental diversification for very large sets: a streaming-based approach. In SIGIR, pages 585--594, 2011.
  29. G. Nemhauser, L. Wolsey, and M. Fisher. An analysis of the approximations for maximizing submodular set functions. Mathematical Programming, 1978.
  30. F. Radlinski, R. Kleinberg, and T. Joachims. Learning diverse rankings with multi-armed bandits. In ICML, pages 784--791, 2008.
  31. R. Rado. A note on independence functions. Proceedings of the London Mathematical Society, 7:300--320, 1957.
  32. D. Rafiei, K. Bharat, and A. Shukla. Diversifying web search results. In WWW, pages 781--790, 2010.
  33. S. S. Ravi, D. J. Rosenkrantz, and G. K. Tayi. Heuristic and special case algorithms for dispersion problems. Operations Research, 42(2):299--310, March-April 1994.
  34. R. L. T. Santos, C. Macdonald, and I. Ounis. Intent-aware search result diversification. In SIGIR, pages 595--604, 2011.
  35. A. Schrijver. Combinatorial Optimization: Polyhedra and Efficiency. Springer, 2003.
  36. A. Slivkins, F. Radlinski, and S. Gollapudi. Learning optimally diverse rankings over large document collections. In ICML, pages 983--990, 2010.
  37. M. R. Vieira, H. L. Razente, M. C. N. Barioni, M. Hadjieleftheriou, D. Srivastava, C. T. Jr., and V. J. Tsotras. Divdb: A system for diversifying query results. PVLDB, 4(12):1395--1398, 2011.
  38. M. R. Vieira, H. L. Razente, M. C. N. Barioni, M. Hadjieleftheriou, D. Srivastava, C. T. Jr., and V. J. Tsotras. On query result diversification. In ICDE, pages 1163--1174, 2011.
  39. D. W. Wang and Y.-S. Kuo. A study on two geometric location problems. Inf. Process. Lett., 28:281--286, August 1988.
  40. C. Yu, L. Lakshmanan, and S. Amer-Yahia. It takes variety to make a world: diversification in recommender systems. In Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology, EDBT '09, pages 368--378, 2009.
  41. Y. Yue and T. Joachims. Predicting diverse subsets using structural svms. In ICML, pages 1224--1231, 2008.
  42. C. Zhai, W. W. Cohen, and J. D. Lafferty. Beyond independent relevance: methods and evaluation metrics for subtopic retrieval. In SIGIR, pages 10--17, 2003.
  43. F. Zhao, X. Zhang, A. K. H. Tung, and G. Chen. Broad: Diversified keyword search in databases. PVLDB, 4(12):1355--1358, 2011.
  44. X. Zhu, A. B. Goldberg, J. V. Gael, and D. Andrzejewski. Improving diversity in ranking using absorbing random walks. In HLT-NAACL, pages 97--104, 2007.
