Collaborative video reindexing via matrix factorization

Published: 22 May 2012

Abstract

Concept-based video indexing generates a matrix of scores estimating how likely each concept is to occur in each video shot. Based on the idea of collaborative filtering, this article presents unsupervised methods that refine the initial scores produced by concept classifiers by exploiting the concept-to-concept correlation and shot-to-shot similarity embedded within the score matrix. Given a noisy matrix, we refine the inaccurate scores via matrix factorization. The method is further improved by learning multiple local models and by incorporating contextual-temporal structures. Experiments on the TRECVID 2006 to 2008 datasets demonstrate relative performance gains ranging from 13% to 52% without using any user annotations or external knowledge resources.
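The core idea of refining a noisy score matrix through a low-rank factorization can be illustrated with a minimal sketch. This is not the article's algorithm: it is a generic regularized factorization fitted by stochastic gradient descent, and it omits the local models and contextual-temporal structures mentioned above. The toy scores, the function names (`factorize`, `reconstruct`, `squared_error`), and all hyperparameters are assumptions chosen for illustration.

```python
import random


def factorize(scores, rank=2, steps=1000, lr=0.05, reg=0.02, seed=0):
    """Fit scores (shots x concepts) as U * V^T by SGD on a
    regularized squared error. Generic sketch, not the article's method."""
    rng = random.Random(seed)
    n_shots, n_concepts = len(scores), len(scores[0])
    # Small random initial factors: one latent vector per shot and per concept.
    U = [[rng.uniform(0.0, 0.5) for _ in range(rank)] for _ in range(n_shots)]
    V = [[rng.uniform(0.0, 0.5) for _ in range(rank)] for _ in range(n_concepts)]
    for _ in range(steps):
        for i in range(n_shots):
            for j in range(n_concepts):
                pred = sum(U[i][k] * V[j][k] for k in range(rank))
                err = scores[i][j] - pred
                for k in range(rank):
                    u, v = U[i][k], V[j][k]
                    # Gradient step with L2 shrinkage on both factors.
                    U[i][k] += lr * (err * v - reg * u)
                    V[j][k] += lr * (err * u - reg * v)
    return U, V


def reconstruct(U, V):
    """Return the low-rank score matrix implied by the factors."""
    return [[sum(a * b for a, b in zip(u, v)) for v in V] for u in U]


def squared_error(a, b):
    """Sum of squared differences between two equally sized matrices."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# Hypothetical classifier scores: 4 shots (rows) x 3 concepts (columns).
scores = [
    [0.9, 0.8, 0.1],
    [0.8, 0.9, 0.2],
    [0.1, 0.2, 0.9],
    [0.2, 0.3, 0.8],
]

U, V = factorize(scores)
refined = reconstruct(U, V)  # smoothed scores that share structure across shots and concepts
```

Because every refined score is built from a shot vector and a concept vector, shots with similar score patterns and concepts that tend to co-occur influence one another, which is the collaborative-filtering intuition the abstract describes. The rank and regularization strength control how aggressively the scores are smoothed.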
