Abstract
Near-duplicate keyframe (NDK) retrieval techniques are critical to many real-world multimedia applications, and near-duplicate image/keyframe retrieval has attracted a surge of attention in the multimedia community over the last few years. To support NDK retrieval on large-scale data, we propose a Multi-Level Ranking (MLR) scheme that retrieves NDKs in a coarse-to-fine manner. One key stage of the MLR scheme is learning an effective ranking function from extremely few training examples in a near-duplicate detection task. To address this challenge, we employ a semi-supervised learning method, semi-supervised support vector machines, which significantly improves retrieval performance by exploiting unlabeled data. Another key stage of the MLR scheme is fine matching among the subset of keyframe candidates retrieved by the preceding coarse ranking stage. In contrast to previous approaches based on either simple heuristics or rigid matching models, we propose a novel Nonrigid Image Matching (NIM) approach for this fine matching stage, targeting near-duplicate keyframe retrieval from real-world video corpora. Compared with conventional methods, the proposed NIM approach recovers an explicit mapping between two near-duplicate images using a few deformation parameters while simultaneously identifying the correct correspondences in noisy data. To evaluate our approach, we performed extensive experiments on two benchmark testbeds extracted from the TRECVID2003 and TRECVID2004 corpora. The promising results indicate that the proposed method is more effective than other state-of-the-art approaches for near-duplicate keyframe retrieval.
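The coarse-to-fine idea behind the MLR scheme can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the dictionary layout of a keyframe, and both scoring functions are assumptions made for illustration. In particular, a plain Euclidean distance on a global feature vector stands in for the learned semi-supervised SVM ranking function, and a Jaccard overlap of hypothetical quantized local-feature IDs stands in for Nonrigid Image Matching.

```python
from math import dist


def coarse_score(query_global, cand_global):
    # Cheap stand-in for the learned ranking function (assumption):
    # a smaller Euclidean distance between global features gives a higher score.
    return -dist(query_global, cand_global)


def fine_score(query_local, cand_local):
    # Expensive-matcher stand-in (assumption, not NIM): Jaccard overlap
    # between two sets of hypothetical quantized local-feature IDs.
    union = query_local | cand_local
    return len(query_local & cand_local) / len(union) if union else 0.0


def multi_level_rank(query, database, top_k):
    # Stage 1: coarse ranking over the whole database with the cheap score.
    coarse = sorted(
        database,
        key=lambda kf: coarse_score(query["global"], kf["global"]),
        reverse=True,
    )
    shortlist = coarse[:top_k]
    # Stage 2: fine re-ranking, applied only to the shortlisted candidates.
    return sorted(
        shortlist,
        key=lambda kf: fine_score(query["local"], kf["local"]),
        reverse=True,
    )
```

The point of the two-stage design is cost: the expensive matcher runs on only `top_k` candidates rather than on the full collection, so a candidate that the coarse stage ranks outside the shortlist is never examined by the fine stage.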
Index Terms
Near-duplicate keyframe retrieval by semi-supervised learning and nonrigid image matching