research-article

Toxic Comment Classification Based on Bidirectional Gated Recurrent Unit and Convolutional Neural Network

Published: 21 December 2021

Abstract

For English toxic comment classification, this paper presents BG-GCNN, a model that combines a bidirectional gated recurrent unit (Bi-GRU) with a convolutional neural network (CNN) optimized by global average pooling. The model treats each type of toxic comment as a separate binary classification task. First, the Bi-GRU extracts the time-series features of the comment; the global-pooling-optimized CNN then reduces their dimensionality. Finally, a sigmoid function outputs the classification result. Comparative experiments show that BG-GCNN classifies better than Text-CNN, LSTM, Bi-GRU, and other models. On the toxic comment dataset from the Kaggle competition platform, the model reaches a Macro-F1 value of 0.62. The F1 values for three of the toxic labels (toxic, obscene, and insult) are 0.81, 0.84, and 0.74, respectively, the highest values in the comparative experiments.
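The pipeline described in the abstract (Bi-GRU, then a convolution reduced by global average pooling, then one sigmoid per label) can be sketched as a forward pass in plain Python. This is an illustrative sketch only, not the authors' implementation: the layer sizes, the random untrained weights, the single dense layer before the sigmoid, the six-label list (assumed to follow the Kaggle toxic comment data), and every function name are assumptions, since the abstract gives no hyperparameters.

```python
import math
import random

# Assumed label set; the abstract names only toxic, obscene, and insult.
LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def make_gru(in_dim, hid, rng):
    # One (input weights, recurrent weights) pair per gate:
    # z = update gate, r = reset gate, h = candidate state. Biases omitted.
    return {g: (rand_matrix(hid, in_dim, rng), rand_matrix(hid, hid, rng)) for g in "zrh"}

def gru_step(p, x, h):
    z = [sigmoid(a + b) for a, b in zip(matvec(p["z"][0], x), matvec(p["z"][1], h))]
    r = [sigmoid(a + b) for a, b in zip(matvec(p["r"][0], x), matvec(p["r"][1], h))]
    rh = [ri * hi for ri, hi in zip(r, h)]
    cand = [math.tanh(a + b) for a, b in zip(matvec(p["h"][0], x), matvec(p["h"][1], rh))]
    # One common convention: z interpolates between the old state and the candidate.
    return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]

def run_gru(p, xs, hid):
    h, out = [0.0] * hid, []
    for x in xs:
        h = gru_step(p, x, h)
        out.append(h)
    return out

def bi_gru(fwd, bwd, xs, hid):
    f = run_gru(fwd, xs, hid)
    b = run_gru(bwd, xs[::-1], hid)[::-1]     # read right-to-left, then realign
    return [fi + bi for fi, bi in zip(f, b)]  # concatenate: 2 * hid features per step

def conv1d_relu(seq, filters, width):
    out = []
    for t in range(len(seq) - width + 1):
        window = [v for step in seq[t:t + width] for v in step]
        out.append([max(0.0, sum(w * x for w, x in zip(f, window))) for f in filters])
    return out

def global_avg_pool(fmap):
    # Average each filter's activations over all positions: one value per filter.
    return [sum(col) / len(fmap) for col in zip(*fmap)]

def forward(embedded, params, hid):
    states = bi_gru(params["fwd"], params["bwd"], embedded, hid)
    fmap = conv1d_relu(states, params["filters"], params["width"])
    pooled = global_avg_pool(fmap)
    logits = matvec(params["dense"], pooled)
    return [sigmoid(v) for v in logits]       # independent binary output per label

# Hypothetical sizes and untrained weights, for shape-checking only.
rng = random.Random(0)
EMB, HID, WIDTH, N_FILTERS = 8, 4, 3, 5
params = {
    "fwd": make_gru(EMB, HID, rng),
    "bwd": make_gru(EMB, HID, rng),
    "filters": rand_matrix(N_FILTERS, WIDTH * 2 * HID, rng),
    "width": WIDTH,
    "dense": rand_matrix(len(LABELS), N_FILTERS, rng),
}
comment = [[rng.uniform(-1, 1) for _ in range(EMB)] for _ in range(10)]  # 10 fake token vectors
probs = forward(comment, params, HID)
for label, p in zip(LABELS, probs):
    print(f"{label}: {p:.3f}")
```

Because each label gets its own sigmoid rather than sharing a softmax, a comment can score high on several labels at once, which matches the abstract's per-label binary framing. The printed probabilities carry no meaning here, since the weights are random.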



    • Published in

      ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 21, Issue 3
      May 2022
      413 pages
      ISSN: 2375-4699
      EISSN: 2375-4702
      DOI: 10.1145/3505182
      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 21 December 2021
      • Accepted: 1 August 2021
      • Received: 1 October 2020

      Qualifiers

      • research-article
      • Refereed