research-article
Best Paper

Cross-Platform Emerging Topic Detection and Elaboration from Multimedia Streams

Published: 02 June 2015

Abstract

With the explosive growth of online media platforms in recent years, detecting and elaborating emerging topics for users has become increasingly attractive. It also poses a real challenge to both industrial and academic researchers, because the available information is overwhelming, spans multiple modalities, and contains many noisy outliers. This article presents a method for emerging topic detection and elaboration using multimedia streams across different online platforms. Specifically, Twitter, New York Times, and Flickr are selected to represent microblog, news portal, and image-sharing platforms, respectively. First, the emerging keywords of Twitter are extracted using aging theory. Then, to overcome the short message length typical of microblogs, Robust Cross-Platform Multimedia Co-Clustering (RCPMM-CC) is proposed to detect emerging topics, with three novelties: 1) the data from the different media platforms are multimodal; 2) the co-clustering is based on a pairwise correlated structure, in which the three media platforms involved are pairwise dependent; 3) noninformative samples are automatically pruned away during co-clustering. In the last step, cross-platform elaboration, we enrich each emerging topic with samples from New York Times and Flickr by computing the implicit links between social topics and samples from the selected news and Flickr image clusters, which are obtained by RCPMM-CC. Qualitative and quantitative evaluation results demonstrate the effectiveness of our method.
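The keyword-extraction step can be illustrated with a toy energy model in the spirit of aging theory: a keyword gains energy each time it occurs in a time window and loses a fixed amount per window, and it is flagged once its energy reaches a threshold. This is only a minimal sketch under stated assumptions, not the formulation used in the article; the function name `emerging_keywords` and the parameters `alpha`, `beta`, and `threshold` are hypothetical and chosen purely for illustration.

```python
from collections import Counter


def emerging_keywords(windows, alpha=1.0, beta=0.5, threshold=2.0):
    """Return, for each time window, the keywords whose energy >= threshold.

    windows:   list of token lists, one per time window, oldest first.
    alpha:     hypothetical energy gained per keyword occurrence.
    beta:      hypothetical energy lost by every tracked keyword per window.
    threshold: hypothetical energy level at which a keyword is flagged.
    """
    energy = {}
    flagged_per_window = []
    for tokens in windows:
        counts = Counter(tokens)
        # Each tracked or newly seen keyword decays, then absorbs new energy.
        for word in set(energy) | set(counts):
            updated = energy.get(word, 0.0) - beta + alpha * counts.get(word, 0)
            energy[word] = max(updated, 0.0)
        # Keywords whose energy has dropped to zero are forgotten.
        energy = {w: e for w, e in energy.items() if e > 0.0}
        flagged_per_window.append(
            sorted(w for w, e in energy.items() if e >= threshold)
        )
    return flagged_per_window


if __name__ == "__main__":
    stream = [
        ["quake", "cat"],
        ["quake", "quake", "cat"],
        ["quake", "quake", "quake"],
    ]
    print(emerging_keywords(stream))
```

In this toy stream, "quake" keeps accumulating energy and is flagged from the second window onward, while "cat" decays without ever reaching the threshold. A realistic system would also need tokenization, stop-word filtering, and tuned parameters.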

      • Published in

        ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 11, Issue 4
        April 2015
        231 pages
        ISSN:1551-6857
        EISSN:1551-6865
        DOI:10.1145/2788342
        Copyright © 2015 ACM

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 2 June 2015
        • Accepted: 1 January 2015
        • Revised: 1 August 2014
        • Received: 1 January 2014

        Qualifiers

        • research-article
        • Research
        • Refereed
