Dual Structure Constrained Multimodal Feature Coding for Social Event Detection from Flickr Data

Published: 27 March 2017

Abstract

In this work, a three-stage social event detection (SED) framework is proposed to discover events from Flickr-like data. First, multiple bipartite graphs are constructed over the heterogeneous feature modalities to obtain fused features. Second, a dual structure constrained multimodal feature coding model is designed that takes into account the geometrical structures of both the dictionary and the data, and learns discriminative feature codes by incorporating the corresponding regularization terms into the objective. Finally, clustering models that exploit density or label knowledge, together with data recovery residual models, are devised to discover real-world events. The proposed SED approach achieves the highest performance on the MediaEval 2014 SED dataset.
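The abstract does not state the coding objective itself. As a rough illustration only, a coding model regularized by two graph structures (one over the data, one over the dictionary) is often written in the following generic form. Every symbol here is an assumption introduced for this sketch and is not taken from the paper: $X \in \mathbb{R}^{d \times n}$ holds the fused features, $D \in \mathbb{R}^{d \times k}$ is the dictionary, $S \in \mathbb{R}^{k \times n}$ holds the feature codes, $L_X \in \mathbb{R}^{n \times n}$ and $L_D \in \mathbb{R}^{k \times k}$ are graph Laplacians built over the samples and the dictionary atoms, and $\alpha, \beta \ge 0$ are trade-off weights.

```latex
% Illustrative sketch only: a generic dual graph-regularized coding objective.
% All symbols (X, D, S, L_X, L_D, alpha, beta) are assumed for this example
% and are NOT the formulation given in the paper.
\begin{equation}
  \min_{D,\,S} \;
  \underbrace{\lVert X - D S \rVert_F^2}_{\text{data recovery residual}}
  \;+\; \alpha \,
  \underbrace{\operatorname{tr}\!\left( S \, L_X \, S^{\top} \right)}_{\text{data structure}}
  \;+\; \beta \,
  \underbrace{\operatorname{tr}\!\left( S^{\top} L_D \, S \right)}_{\text{dictionary structure}}
\end{equation}
% With L_X = \Delta_X - W_X (degree matrix minus symmetric affinity matrix),
% the data term expands to a sum over pairs of code columns s_i, s_j:
\begin{equation}
  \operatorname{tr}\!\left( S \, L_X \, S^{\top} \right)
  = \frac{1}{2} \sum_{i,j=1}^{n} (W_X)_{ij} \, \lVert s_i - s_j \rVert_2^2
\end{equation}
```

Under these assumptions, the second term pulls the codes of neighboring samples together, and the third does the same for the rows of $S$ that belong to similar dictionary atoms. The paper's actual regularizers and constraints may differ.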
