Abstract
Concept-based video indexing produces a matrix of scores, each estimating the likelihood that a concept occurs in a video shot. Drawing on the idea of collaborative filtering, this article presents unsupervised methods that refine the initial scores generated by concept classifiers by exploiting the concept-to-concept correlation and shot-to-shot similarity embedded in the score matrix. Given a noisy score matrix, we refine the inaccurate scores via matrix factorization. The method is further improved by learning multiple local models and incorporating contextual-temporal structures. Experiments on the TRECVID 2006–2008 datasets demonstrate relative performance gains ranging from 13% to 52% without using any user annotations or external knowledge resources.
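To make the refinement idea concrete, the following is a minimal sketch of replacing a shot-by-concept score matrix with a low-rank reconstruction. It is an illustration under stated assumptions, not the article's algorithm: the abstract does not give the factorization objective, so the sketch substitutes a plain truncated low-rank approximation computed by power iteration with deflation. The function names `leading_singular_triplet` and `refine_scores`, the `rank`, `iters`, and `seed` parameters, and the example scores are all hypothetical, and the local models and contextual-temporal structures mentioned above are not modeled.

```python
import math
import random


def leading_singular_triplet(matrix, iters=200, seed=0):
    """Estimate the largest singular value and its vectors by power iteration."""
    rows, cols = len(matrix), len(matrix[0])
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(cols)]  # random start vector
    u = [0.0] * rows
    sigma = 0.0
    for _ in range(iters):
        # u <- M v, scaled to unit length
        u = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        norm_u = math.sqrt(sum(x * x for x in u))
        if norm_u < 1e-12:
            return 0.0, [0.0] * rows, [0.0] * cols
        u = [x / norm_u for x in u]
        # v <- M^T u; its length is the current singular value estimate
        v = [sum(matrix[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        sigma = math.sqrt(sum(x * x for x in v))
        if sigma < 1e-12:
            return 0.0, [0.0] * rows, [0.0] * cols
        v = [x / sigma for x in v]
    return sigma, u, v


def refine_scores(scores, rank=1, iters=200):
    """Return a rank-`rank` reconstruction of a shots-by-concepts score matrix.

    Assumption: low-rank structure stands in for the concept-to-concept
    correlation and shot-to-shot similarity described in the abstract.
    """
    rows, cols = len(scores), len(scores[0])
    residual = [row[:] for row in scores]          # what is left to explain
    refined = [[0.0] * cols for _ in range(rows)]  # accumulated reconstruction
    for _ in range(rank):
        sigma, u, v = leading_singular_triplet(residual, iters)
        for i in range(rows):
            for j in range(cols):
                component = sigma * u[i] * v[j]
                refined[i][j] += component
                residual[i][j] -= component        # deflate the found component
    return refined


# Hypothetical scores: 4 shots (rows) by 3 concepts (columns).
noisy_scores = [
    [0.90, 0.45, 0.18],
    [0.60, 0.05, 0.12],  # the middle score is an illustrative outlier
    [0.30, 0.15, 0.06],
    [0.10, 0.05, 0.02],
]
refined_scores = refine_scores(noisy_scores, rank=1)
```

The `rank` parameter controls how much of the matrix is treated as shared structure rather than noise: a small rank smooths aggressively, while a rank equal to the smaller matrix dimension reproduces the input up to numerical error.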
Index Terms
Collaborative video reindexing via matrix factorization