Abstract
Microblogging sites like Twitter have become important sources of real-time information during disaster events. A large amount of valuable situational information is posted on these sites during disasters; however, this information is dispersed among hundreds of thousands of tweets that express the sentiments and opinions of the masses. To use microblogging sites effectively during disaster events, it is necessary not only to separate the situational information from the surrounding sentiments and opinions, but also to summarize the large volume of situational information posted in real time. During disasters in countries like India, a sizable number of tweets are posted in local resource-poor languages in addition to the usual English-language tweets. For instance, in the Indian subcontinent, a large number of tweets are posted in Hindi (written in the Devanagari script; an official language of India), and some of the information contained in such non-English tweets is either unavailable through English tweets or becomes available only later. In this work, we develop a novel classification-summarization framework that handles tweets in both English and Hindi: we first extract tweets containing situational information, and then summarize this information. Our methodology is based on an understanding of how several concepts evolve on Twitter during disasters. This understanding helps us achieve superior performance compared to state-of-the-art tweet classifiers and summarization approaches on English tweets. Additionally, to our knowledge, this is the first attempt to extract situational information from non-English tweets.
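The two-stage idea described in the abstract (first filter situational tweets, then summarize them) can be illustrated with a minimal Python sketch. This is not the method of the paper: the cue-word list `OPINION_CUES`, the rule in `is_situational`, and the greedy word-coverage heuristic in `summarize` are all hypothetical stand-ins chosen for brevity, and the example tweets are invented.

```python
import string

# Hypothetical opinion/sentiment cue words; illustrative only, not taken from the paper.
OPINION_CUES = {"pray", "praying", "sad", "hope", "heartbreaking", "shocked"}


def tokens(tweet):
    """Split on whitespace, lowercase, and trim ASCII punctuation from token edges."""
    stripped = (tok.strip(string.punctuation) for tok in tweet.lower().split())
    return [tok for tok in stripped if tok]


def is_situational(tweet):
    """Toy stage 1: treat a tweet as situational if it contains no opinion cue word."""
    return not any(tok in OPINION_CUES for tok in tokens(tweet))


def summarize(tweets, word_budget):
    """Toy stage 2: greedily pick tweets that add the most not-yet-covered words,
    while keeping the total summary length within word_budget tokens."""
    covered, summary, used = set(), [], 0
    remaining = list(tweets)
    while remaining:
        best, best_gain = None, 0
        for tweet in remaining:
            toks = tokens(tweet)
            if used + len(toks) > word_budget:
                continue  # this tweet would exceed the length budget
            gain = len(set(toks) - covered)
            if gain > best_gain:
                best, best_gain = tweet, gain
        if best is None:
            break  # nothing fits, or nothing adds new words
        summary.append(best)
        covered |= set(tokens(best))
        used += len(tokens(best))
        remaining.remove(best)
    return summary


# Invented example tweets, used only to exercise the sketch.
example_tweets = [
    "Praying for everyone in Nepal, so sad",
    "7.8 magnitude earthquake hits Nepal, 150 dead",
    "Earthquake hits Nepal",
    "Kathmandu airport closed, flights diverted",
]
situational = [t for t in example_tweets if is_situational(t)]
example_summary = summarize(situational, word_budget=12)
```

In this toy run the first tweet is filtered out as opinion, and the near-duplicate third tweet is skipped by the summarizer because it adds no new words. A real system would replace both stages with trained, language-aware components.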
Extracting and Summarizing Situational Information from the Twitter Social Media during Disasters