research-article

Multimodal Retrieval with Diversification and Relevance Feedback for Tourist Attraction Images

Published: 12 August 2017

Abstract

In this article, we present a novel framework that produces a visual description of a tourist attraction by choosing the most diverse pictures from community-contributed datasets, so that the selected pictures describe different details of the queried location. The main strength of the proposed approach is its flexibility: it filters out non-relevant images and obtains a reliable set of diverse and relevant images by first clustering similar images according to their textual descriptions and their visual content, and then extracting images from different clusters according to a measure of the user’s credibility. Clustering is a two-step process in which textual descriptions are used first and the resulting clusters are then refined according to visual features. The degree of diversification can be further increased by exploiting users’ judgments on the results produced by the proposed algorithm, through a novel approach in which users provide not only relevance feedback but also diversity feedback. Experiments on the MediaEval 2015 “Retrieving Diverse Social Images” dataset show that the proposed framework achieves very good performance both in the automatic retrieval of diverse images and when users’ feedback is exploited. The effectiveness of the proposed approach has also been confirmed by a small case study involving a number of real users.
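
The abstract does not name the clustering algorithms, the similarity measures, or the credibility measure, so the following is only a minimal sketch of the pipeline it describes: cluster on text first, refine each cluster on visual features, then draw images from different clusters in order of user credibility. Everything concrete here is an assumption made for illustration and is not taken from the article: the `Image` record, Jaccard similarity on tag sets, Euclidean distance on feature vectors, the greedy seed-based clustering, and both thresholds.

```python
from dataclasses import dataclass
from math import dist


@dataclass(frozen=True)
class Image:
    image_id: str
    tags: frozenset      # textual description reduced to a tag set (assumption)
    visual: tuple        # visual feature vector (assumption)
    credibility: float   # credibility score of the uploading user (assumption)


def jaccard(a, b):
    """Jaccard similarity between two tag sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def greedy_cluster(items, is_similar):
    """Assign each item to the first cluster whose seed (first member) it resembles."""
    clusters = []
    for item in items:
        for cluster in clusters:
            if is_similar(cluster[0], item):
                cluster.append(item)
                break
        else:
            clusters.append([item])
    return clusters


def two_step_clusters(images, text_threshold=0.5, visual_radius=1.0):
    """Step 1: cluster on text. Step 2: refine each text cluster on visual features."""
    text_clusters = greedy_cluster(
        images, lambda a, b: jaccard(a.tags, b.tags) >= text_threshold)
    refined = []
    for cluster in text_clusters:
        refined.extend(greedy_cluster(
            cluster, lambda a, b: dist(a.visual, b.visual) <= visual_radius))
    return refined


def select_diverse(images, k, **thresholds):
    """Pick up to k images round-robin across clusters, most credible first."""
    clusters = [sorted(c, key=lambda im: im.credibility, reverse=True)
                for c in two_step_clusters(images, **thresholds)]
    # Visit clusters in order of their most credible member.
    clusters.sort(key=lambda c: c[0].credibility, reverse=True)
    selected, rank = [], 0
    while len(selected) < k and any(rank < len(c) for c in clusters):
        for c in clusters:
            if rank < len(c) and len(selected) < k:
                selected.append(c[rank])
        rank += 1
    return selected


images = [
    Image("a", frozenset({"tower", "night"}), (0.0, 0.0), 0.9),
    Image("b", frozenset({"tower", "night"}), (0.1, 0.0), 0.4),
    Image("c", frozenset({"tower", "night"}), (5.0, 5.0), 0.7),
    Image("d", frozenset({"garden", "fountain"}), (2.0, 2.0), 0.6),
]
print([im.image_id for im in select_diverse(images, 3)])  # ['a', 'c', 'd']
```

In this toy input, images a, b, and c share one textual cluster, and the visual refinement separates c from the near-duplicates a and b. Taking one image per cluster before revisiting any cluster is what keeps the near-duplicate b out of the top three.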


      • Published in

        ACM Transactions on Multimedia Computing, Communications, and Applications  Volume 13, Issue 4
        November 2017
        362 pages
        ISSN:1551-6857
        EISSN:1551-6865
        DOI:10.1145/3129737

        Copyright © 2017 ACM

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 12 August 2017
        • Accepted: 1 May 2017
        • Revised: 1 April 2017
        • Received: 1 October 2016
        Qualifiers

        • research-article
        • Research
        • Refereed
