Abstract
With the explosive growth of online media platforms in recent years, it has become increasingly attractive to provide users with a solution for emerging topic detection and elaboration. This poses a real challenge to both industrial and academic researchers, because the available information is overwhelming, spans multiple modalities, and contains substantial outlier noise. This article presents a method for emerging topic detection and elaboration using multimedia streams across different online platforms. Specifically, Twitter, the New York Times, and Flickr are selected to represent microblog, news portal, and image-sharing platforms, respectively. Emerging keywords on Twitter are first extracted using aging theory. Then, to overcome the short length of microblog messages, Robust Cross-Platform Multimedia Co-Clustering (RCPMM-CC) is proposed to detect emerging topics, with three novelties: 1) the data from different media platforms are multimodal; 2) the co-clustering is based on a pairwise correlated structure, in which the three media platforms involved are pairwise dependent; 3) noninformative samples are automatically pruned away during co-clustering. In the last step, cross-platform elaboration, we enrich each emerging topic with samples from the New York Times and Flickr by computing the implicit links between social topics and samples from the selected news and Flickr image clusters obtained by RCPMM-CC. Qualitative and quantitative evaluation results demonstrate the effectiveness of our method.
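To make the keyword-extraction step more concrete, the following is a minimal sketch of an aging-theory-style score, in which each keyword holds an "energy" that is fed by its share of the current time slot and decays between slots. It is an illustration only, not the formulation used in the article: the function names, the `alpha` (transfer) and `beta` (decay) factors, and the `ratio` and `min_energy` thresholds are all assumptions introduced here.

```python
from collections import Counter


def keyword_energy(slots, alpha=1.0, beta=0.5):
    """Track a per-keyword energy value across time slots.

    slots: list of token lists, one per time slot, oldest first.
    alpha: hypothetical transfer factor turning nutrition into energy.
    beta:  hypothetical decay factor applied to stored energy each slot.
    Returns one {keyword: energy} snapshot per slot.
    """
    energy = {}
    history = []
    for tokens in slots:
        counts = Counter(tokens)
        total = sum(counts.values()) or 1  # avoid division by zero on empty slots
        # Stored energy of every keyword seen so far decays first.
        energy = {word: value * (1.0 - beta) for word, value in energy.items()}
        # Nutrition is the keyword's relative frequency in this slot.
        for word, count in counts.items():
            energy[word] = energy.get(word, 0.0) + alpha * count / total
        history.append(dict(energy))
    return history


def emerging_keywords(history, ratio=2.0, min_energy=0.2):
    """Flag keywords whose energy is high and grew sharply in the last slot.

    ratio and min_energy are illustrative thresholds, not values from the article.
    """
    if len(history) < 2:
        return []
    previous, current = history[-2], history[-1]
    return sorted(
        word
        for word, value in current.items()
        if value >= min_energy and value >= ratio * previous.get(word, 0.0)
    )


# Toy stream: three time slots of already-tokenized microblog text.
slots = [
    ["game", "game", "rain"],
    ["game", "rain", "rain"],
    ["quake", "quake", "quake", "game"],
]
history = keyword_energy(slots)
print(emerging_keywords(history))
```

In this toy stream only the keyword that appears suddenly in the last slot is flagged, while steadily discussed keywords are not, which is the behavior an emerging-keyword detector is meant to capture.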
Index Terms
Cross-Platform Emerging Topic Detection and Elaboration from Multimedia Streams