Browse by chunks: Topic mining and organizing on web-scale social media

Published: 04 November 2011

Abstract

The overwhelming number of web videos returned by search engines makes effective browsing and search a challenging task. Rather than presenting a conventional ranked list, it becomes necessary to organize the retrieved videos in alternative ways. In this article, we explore the issue of mining topics from retrieved web videos and organizing the videos into semantic clusters. We present a framework for clustering-based video retrieval and build a visualization user interface. A hierarchical topic structure is exploited to encode the characteristics of the retrieved video collection, and a semi-supervised hierarchical topic model is proposed to guide the discovery of the topic hierarchy. Carefully designed experiments on a web-scale video dataset collected from video sharing websites validate the proposed method and demonstrate that clustering-based video retrieval is a practical way to help users browse effectively.
