
Modeling Temporal Patterns of Cyberbullying Detection with Hierarchical Attention Networks

Published: 08 April 2021

Abstract

Cyberbullying is rapidly becoming one of the most serious online risks for adolescents. This has motivated work on machine learning methods that automate cyberbullying detection. Most of these methods, however, treat cyberbullying as one-off incidents that occur at a single point in time, and comparatively little is known about how cyberbullying behavior occurs and evolves over time. This oversight is a crucial open challenge for cyberbullying-related research, given that cyberbullying is typically defined as intentional acts of aggression via electronic communication that occur repeatedly and persistently. In this article, we center our discussion on the challenge of modeling temporal patterns of cyberbullying behavior. Specifically, we investigate how temporal information within a social media session, which has an inherently hierarchical structure (e.g., words form a comment and comments form a session), can be leveraged to facilitate cyberbullying detection. Recent findings from interdisciplinary research suggest that the temporal characteristics of bullying sessions differ from those of non-bullying sessions and that the temporal information from users’ comments can improve cyberbullying detection. The proposed framework has three distinctive features: (1) a hierarchical structure that reflects how a social media session is formed in a bottom-up manner; (2) attention mechanisms applied at the word and comment levels to differentiate the contributions of words and comments to the representation of a social media session; and (3) the incorporation of temporal features in modeling cyberbullying behavior at the comment level. Quantitative and qualitative evaluations are conducted on a real-world dataset collected from Instagram, the social networking site with the highest percentage of users reporting cyberbullying experiences. Results from these empirical evaluations show the significance of the proposed methods, which are tailored to capture temporal patterns of cyberbullying behavior.
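The three features listed in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not specify encoders, feature definitions, or training, so everything named here (`attention_pool`, `encode_session`, the log time-gap feature, and the hand-set context vectors) is a hypothetical stand-in. The sketch only shows the general shape of the idea: words are attention-pooled into comment vectors, a temporal feature is attached to each comment, and comments are attention-pooled into a session vector.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(vectors, context):
    """Weight each vector by softmax(dot(vector, context)) and sum them.

    Simplification: a trained model would normally pass each vector through
    a learned projection before scoring it against the context vector.
    """
    scores = [sum(v_i * c_i for v_i, c_i in zip(v, context)) for v in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    pooled = [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]
    return pooled, weights


def encode_session(session, timestamps, word_context, comment_context):
    """Build a session vector bottom-up: words -> comments -> session.

    session: list of comments, each a list of word vectors.
    timestamps: posting time of each comment in seconds, non-decreasing.
    """
    comment_vectors = []
    prev = timestamps[0]
    for words, t in zip(session, timestamps):
        pooled, _ = attention_pool(words, word_context)
        # Hypothetical temporal feature: log of the gap since the previous comment.
        gap = math.log1p(t - prev)
        comment_vectors.append(pooled + [gap])
        prev = t
    return attention_pool(comment_vectors, comment_context)


# Toy session: two comments with 2-dimensional word vectors (made-up values).
session = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.5, 0.5]],
]
timestamps = [0.0, 60.0]
word_context = [1.0, 0.0]
comment_context = [0.0, 0.0, 1.0]  # attends only to the temporal feature here

session_vec, comment_weights = encode_session(
    session, timestamps, word_context, comment_context
)
```

In this toy setup the comment-level context vector looks only at the time-gap dimension, so the second comment, which arrives 60 seconds after the first, receives most of the attention weight. In a trained model the context vectors would be learned, and the choice of temporal features would follow the article rather than this illustration.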

Published in

ACM/IMS Transactions on Data Science, Volume 2, Issue 2
May 2021
149 pages
ISSN: 2691-1922
DOI: 10.1145/3454114

Copyright © 2021 Association for Computing Machinery.

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

• Published: 8 April 2021
• Accepted: 1 December 2020
• Revised: 1 September 2020
• Received: 1 June 2020

Qualifiers

• research-article
• Refereed
