Abstract
We often remember images and videos that we have seen or recorded before but cannot recall the exact venues or details of their contents. What remains is typically a vague memory that can be expressed as a textual description and/or a rough visual description of the scenes, and we want to use it to search for the videos of interest. We call this problem “Memory Recall based Video Search” (MRVS). To tackle it, we propose a video search system that lets a user enter a vague and incomplete query as a combination of a text query, a sequence of visual queries, and/or concept queries. Here, a visual query is typically a sketch depicting the outline of a scene within the desired video, while the corresponding concept query lists the visual concepts that appear in that scene. Because user-specified queries are generally approximate or incomplete, we develop techniques that handle this inexact specification and leverage user feedback to refine it. We use several approaches to enhance the automatic search. First, we employ a visual query suggestion model to automatically suggest potential visual features to users as better queries. Second, we use a color similarity matrix to compensate for inexact color specification in visual queries. Third, we exploit the ordering of the visual queries and/or concept queries to rerank the results with a greedy algorithm. Moreover, because the query is inexact and there is likely to be only one or a few correct answers, we incorporate an interactive feedback loop that lets users label related samples, i.e., samples that are visually similar or semantically close to the relevant sample. Based on the labeled samples, we propose optimization algorithms that update the visual queries and concept weights to refine the search results.
We conduct experiments on two large-scale video datasets: TRECVID 2010 and YouTube. The experimental results demonstrate that our proposed system is effective for MRVS tasks.
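As an illustration of the second idea, the sketch below shows one way a color similarity matrix can soften color matching between a sketched query and a video frame. It is not the formulation used in the paper: the four-color palette, the exponential form of the matrix, the RGB distance, the `sigma` parameter, and the function names are all assumptions made for this example. A perceptual color space would likely be a better choice than raw RGB.

```python
import math

# Hypothetical palette of quantized colors (an assumption for illustration).
PALETTE = {
    "red": (255, 0, 0),
    "orange": (255, 165, 0),
    "blue": (0, 0, 255),
    "green": (0, 128, 0),
}
NAMES = list(PALETTE)


def color_similarity_matrix(sigma=120.0):
    """Build A with A[i][j] = exp(-d_ij / sigma).

    d_ij is the Euclidean RGB distance between palette colors i and j.
    Both the exponential form and sigma are assumptions, not the paper's.
    """
    return [
        [math.exp(-math.dist(PALETTE[a], PALETTE[b]) / sigma) for b in NAMES]
        for a in NAMES
    ]


def soft_color_score(query_hist, frame_hist, sim):
    """Bilinear score q^T A h between two color histograms.

    Identical colors get full credit; nearby colors get partial credit,
    so a slightly wrong color in the sketch still matches.
    """
    n = len(NAMES)
    return sum(
        query_hist[i] * sim[i][j] * frame_hist[j]
        for i in range(n)
        for j in range(n)
    )


if __name__ == "__main__":
    sim = color_similarity_matrix()
    query = [1.0, 0.0, 0.0, 0.0]         # user sketched the region in red
    orange_frame = [0.0, 1.0, 0.0, 0.0]  # the remembered scene was orange
    blue_frame = [0.0, 0.0, 1.0, 0.0]
    print(soft_color_score(query, orange_frame, sim))
    print(soft_color_score(query, blue_frame, sim))
```

With exact bin matching, both frames would score zero against the red query. With the similarity matrix, the orange frame scores higher than the blue one, which is the kind of tolerance to inexact color recall that the abstract describes.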
Index Terms
Memory recall based video search: Finding videos you have seen before based on your memory