Extracting and Summarizing Situational Information from the Twitter Social Media during Disasters

Published: 17 July 2018

Abstract

Microblogging sites like Twitter have become important sources of real-time information during disaster events. A large amount of valuable situational information is posted on these sites during disasters; however, this information is dispersed among hundreds of thousands of tweets containing sentiments and opinions of the masses. To utilize microblogging sites effectively during disaster events, it is necessary not only to extract the situational information from the large volume of sentiments and opinions, but also to summarize the situational information posted in real time. During disasters in countries like India, a sizable number of tweets are posted in local, resource-poor languages in addition to English. For instance, in the Indian subcontinent, a large number of tweets are posted in Hindi (an official language of India, written in the Devanagari script), and some of the information contained in such non-English tweets is not available through English tweets, or becomes available only later. In this work, we develop a novel classification-summarization framework that handles tweets in both English and Hindi: we first extract tweets containing situational information, and then summarize this information. Our methodology is based on an understanding of how several concepts evolve on Twitter during a disaster. This understanding helps us achieve superior performance compared to state-of-the-art tweet classifiers and summarization approaches on English tweets. Additionally, to our knowledge, this is the first attempt to extract situational information from non-English tweets.
