Abstract
Most large-scale image retrieval systems are based on the bag-of-visual-words model. However, the traditional bag-of-visual-words model does not adequately capture the geometric context among local features in images, which plays an important role in image retrieval. To fully exploit the geometric context of all visual words in an image, efficient global geometric verification methods have attracted considerable attention. Unfortunately, existing global geometric verification methods are either too computationally expensive to ensure real-time response or cannot handle rotation well. To address these problems, we propose a novel geometric coding algorithm that encodes the spatial context among local features for large-scale partial-duplicate Web image retrieval. Our geometric coding consists of geometric square coding and geometric fan coding, which encode the spatial relationships of SIFT features into three geo-maps; these maps are used for global verification to remove geometrically inconsistent SIFT matches. Our approach is not only computationally efficient but also effective in detecting partial-duplicate images with rotation, scale changes, partial occlusion, and background clutter.
Experiments in partial-duplicate Web image search, using two datasets with one million Web images as distractors, show that our approach outperforms the baseline bag-of-visual-words approach in mean average precision, even when the baseline is followed by RANSAC verification. In addition, our approach achieves performance comparable to other state-of-the-art global geometric verification methods, such as the spatial coding scheme, while being more computationally efficient.
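The abstract does not spell out how the geo-maps are built, so the following is only a simplified sketch of the general idea, not the authors' exact encoding. It assumes each SIFT feature is a tuple `(x, y, scale, orientation)`, rotates relative positions by each reference feature's orientation, and records three binary maps per image. The names `geo_maps`, `verify`, `similarity`, the labels H/V/S for the maps, and the square-size factor `r` are all hypothetical choices made for illustration. Matches whose maps disagree between the two images are removed one at a time, starting with the match involved in the most disagreements.

```python
import math

def geo_maps(features, r=3.0):
    """Build three binary maps for features given as (x, y, scale, orientation).

    For each reference feature i, the offset to feature j is rotated by
    -orientation_i, so the maps do not change under image rotation.
    H and V record the side of j along the rotated axes (a coarse stand-in
    for fan coding); S records whether j lies inside a square of half-side
    r * scale_i around i (a coarse stand-in for square coding).
    """
    n = len(features)
    h = [[0] * n for _ in range(n)]
    v = [[0] * n for _ in range(n)]
    s = [[0] * n for _ in range(n)]
    for i, (xi, yi, si, ti) in enumerate(features):
        c, sn = math.cos(-ti), math.sin(-ti)
        for j, (xj, yj, _, _) in enumerate(features):
            if i == j:
                continue
            dx, dy = xj - xi, yj - yi
            rx = c * dx - sn * dy   # offset in the frame of feature i
            ry = sn * dx + c * dy
            h[i][j] = int(rx > 0)
            v[i][j] = int(ry > 0)
            s[i][j] = int(max(abs(rx), abs(ry)) < r * si)
    return h, v, s

def verify(query, target, r=3.0):
    """Return indices of matches kept; query[k] is matched to target[k]."""
    n = len(query)
    q_maps = geo_maps(query, r)
    t_maps = geo_maps(target, r)
    # conflict[i][j]: number of maps in which the pair (i, j) disagrees
    conflict = [[sum(qm[i][j] != tm[i][j] for qm, tm in zip(q_maps, t_maps))
                 for j in range(n)] for i in range(n)]
    active = set(range(n))
    while active:
        score = {i: sum(conflict[i][j] + conflict[j][i] for j in active)
                 for i in active}
        worst = max(sorted(active), key=score.get)
        if score[worst] == 0:
            break               # remaining matches are mutually consistent
        active.remove(worst)    # drop the most inconsistent match
    return sorted(active)

def similarity(f, phi, k, tx, ty):
    """Apply rotation phi, scaling k, and translation (tx, ty) to a feature."""
    x, y, s, t = f
    c, sn = math.cos(phi), math.sin(phi)
    return (k * (c * x - sn * y) + tx, k * (sn * x + c * y) + ty, k * s, t + phi)

# Toy data: five true matches plus one false match (index 5).
query = [(10, 10, 8, 0.0), (50, 20, 8, 0.0), (30, 60, 8, 0.0),
         (80, 70, 8, 0.0), (20, 95, 8, 0.0), (60, 40, 8, 0.0)]
phi, k = math.radians(40), 1.5
target = [similarity(f, phi, k, 100, -50) for f in query[:5]]
target.append(similarity((0, 100, 8, 0.0), phi, k, 100, -50))  # wrong location

kept = verify(query, target)
print(kept)  # [0, 1, 2, 3, 4]
```

In this toy example the target image is a rotated, scaled, and shifted copy of the query, so the five true matches produce identical maps in both images, while the misplaced sixth match disagrees with several of them and is removed. Building the maps takes time quadratic in the number of matches, which is much cheaper than repeated model fitting but is still worth keeping in mind for images with many matches.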
Index Terms
SIFT match verification by geometric coding for large-scale partial-duplicate web image search