Abstract
The overwhelming number of Web videos returned by search engines makes effective browsing and search a challenging task. Rather than presenting a conventional ranked list, it becomes necessary to organize the retrieved videos in alternative ways. In this article, we explore the problem of mining topics from retrieved Web videos and organizing the videos into semantic clusters. We present a framework for clustering-based video retrieval and build a visualization user interface. A hierarchical topic structure is used to encode the characteristics of the retrieved video collection, and a semi-supervised hierarchical topic model is proposed to guide the discovery of the topic hierarchy. Carefully designed experiments on a web-scale video dataset collected from video sharing websites validate the proposed method and demonstrate that clustering-based video retrieval is a practical way to help users browse effectively.
Browse by chunks: Topic mining and organizing on web-scale social media