Abstract
This article presents a novel attribute-augmented semantic hierarchy (A2SH) and demonstrates its effectiveness in bridging both the semantic and intention gaps in content-based image retrieval (CBIR). A2SH organizes semantic concepts into multiple semantic levels and augments each concept with a set of related attributes. The attributes describe the multiple facets of the concept and act as an intermediate bridge between the concept and low-level visual content. A hierarchical semantic similarity function is learned to characterize the semantic similarities among images for retrieval. To better capture user search intent, a hybrid feedback mechanism is developed that collects feedback on both attributes and images. This feedback is then used to refine the search results based on A2SH. We use A2SH as the basis for a unified content-based image retrieval system. We conduct extensive experiments on a large-scale dataset of over one million Web images. Experimental results show that the proposed A2SH can characterize the semantic affinities among images accurately and can shape user search intent quickly, leading to more accurate search results than state-of-the-art CBIR solutions.
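To make the structure described above concrete, the following is a minimal sketch of a concept hierarchy in which every concept carries its own attributes, together with a toy hierarchical similarity between two images. It is not the learned similarity function of the article, whose exact form the abstract does not give. The `Concept` class, the attribute names, the per-attribute scores in [0, 1], and the rule of averaging attribute agreement over the concepts two images share are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Concept:
    """Hypothetical hierarchy node, augmented with a list of attribute names."""
    name: str
    attributes: List[str]
    parent: Optional["Concept"] = None

    def path_from_root(self) -> List["Concept"]:
        """Return the chain of concepts from the root down to this node."""
        node, path = self, []
        while node is not None:
            path.append(node)
            node = node.parent
        return path[::-1]


def shared_path(a: Concept, b: Concept) -> List[Concept]:
    """Concepts that lie on both root-to-leaf paths (the common ancestors)."""
    shared = []
    for x, y in zip(a.path_from_root(), b.path_from_root()):
        if x is not y:
            break
        shared.append(x)
    return shared


def attribute_similarity(concept: Concept,
                         scores_a: Dict[str, float],
                         scores_b: Dict[str, float]) -> float:
    """Assumed local similarity: 1 minus the mean absolute score difference
    over the attributes attached to this concept."""
    if not concept.attributes:
        return 0.0
    diffs = [abs(scores_a.get(n, 0.0) - scores_b.get(n, 0.0))
             for n in concept.attributes]
    return 1.0 - sum(diffs) / len(diffs)


def hierarchical_similarity(leaf_a: Concept, scores_a: Dict[str, float],
                            leaf_b: Concept, scores_b: Dict[str, float]) -> float:
    """Assumed aggregation: sum the local similarities over shared concepts
    and normalize by the depth of the deeper of the two paths."""
    depth = max(len(leaf_a.path_from_root()), len(leaf_b.path_from_root()))
    total = sum(attribute_similarity(c, scores_a, scores_b)
                for c in shared_path(leaf_a, leaf_b))
    return total / depth


# Toy two-level hierarchy with illustrative attribute names.
animal = Concept("animal", ["furry", "has_legs"])
dog = Concept("dog", ["snout", "floppy_ears"], parent=animal)
cat = Concept("cat", ["whiskers", "pointed_ears"], parent=animal)

# Hypothetical attribute scores for two images (e.g. classifier outputs).
dog_image = {"furry": 1.0, "has_legs": 1.0, "snout": 1.0, "floppy_ears": 1.0}
cat_image = {"furry": 1.0, "has_legs": 1.0, "whiskers": 1.0, "pointed_ears": 1.0}

same_concept = hierarchical_similarity(dog, dog_image, dog, dog_image)
sibling_concepts = hierarchical_similarity(dog, dog_image, cat, cat_image)
```

In this sketch, two images assigned to the same leaf concept are compared at every level of their shared path, while images under sibling concepts are compared only on the attributes of their common ancestors, so their similarity is capped by how deep that shared path goes.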
Index Terms
Attribute-Augmented Semantic Hierarchy: Towards a Unified Framework for Content-Based Image Retrieval