research-article

Denigrate Comment Detection in Low-Resource Hindi Language Using Attention-Based Residual Networks

Published: 29 November 2021

Abstract

Cyberspace has been recognized as a conducive environment for various hostile, direct, and indirect behavioural tactics that target individuals or groups. Denigration is one of the most frequently used cyberbullying ploys: it actively damages, humiliates, and disparages the online reputation of a target by sending, posting, or publishing cruel rumours, gossip, and untrue statements. Previous studies mainly report the detection of profane, vulgar, and offensive words in the English language. This research puts forward a model to detect online denigration bullying in the low-resource Hindi language using attention residual networks. The proposed model, the Hindi Denigrate Comment–Attention Residual Network (HDC-ARN), aims to uncover defamatory posts (denigrate comments) written in Hindi that publicly attack and vilify a person or an entity. A dataset of 942 denigrate comments and 1499 non-denigrate comments is scraped using selected hashtags from two recent trending events in India: Tablighi Jamaat spiked Covid-19 (April 2020, Event 1) and Sushant Singh Rajput Death (June 2020, Event 2). Only text-based features, that is, the actual content of the post, are considered. Pre-trained fastText word embeddings for Hindi are used. The model has three ResNet blocks with an attention layer that generates a post vector for a single input; this vector is passed through a sigmoid activation function to produce the final output, either denigrate (positive class) or non-denigrate (negative class). An F1 score of 0.642 is achieved on the dataset.
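The pipeline described in the abstract (word vectors, three residual blocks, attention pooling into a post vector, sigmoid output) can be illustrated with a small forward pass. This is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not give layer sizes, kernel widths, or the attention formulation, so the width-3 convolution, the dot-product attention, the random weights, the 8-dimensional stand-in vectors (in place of real fastText vectors), and all function names below are hypothetical choices made for illustration.

```python
import math
import random

def relu(v):
    return [max(0.0, a) for a in v]

def matvec(W, v):
    return [sum(w * a for w, a in zip(row, v)) for row in W]

def residual_block(seq, W, b):
    """Width-3 convolution over the token axis, ReLU, then a skip connection.
    The kernel width is an assumption; the abstract does not specify it."""
    dim = len(seq[0])
    zero = [0.0] * dim
    padded = [zero] + seq + [zero]  # zero padding keeps the sequence length
    out = []
    for t in range(len(seq)):
        window = padded[t] + padded[t + 1] + padded[t + 2]  # 3 neighbouring tokens
        h = relu([a + c for a, c in zip(matvec(W, window), b)])
        out.append([x + y for x, y in zip(seq[t], h)])  # skip connection: x + F(x)
    return out

def attention_pool(seq, u):
    """Dot-product attention (assumed form): softmax over token scores,
    then a weighted sum of token vectors gives one post vector."""
    scores = [sum(a * c for a, c in zip(tok, u)) for tok in seq]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(seq[0])
    post = [sum(w * tok[d] for w, tok in zip(weights, seq)) for d in range(dim)]
    return post, weights

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def init_params(dim, n_blocks=3, seed=0):
    """Random, untrained weights; three blocks to match the abstract."""
    rng = random.Random(seed)
    rand = lambda n: [rng.uniform(-0.1, 0.1) for _ in range(n)]
    return {
        "blocks": [([rand(3 * dim) for _ in range(dim)], rand(dim))
                   for _ in range(n_blocks)],
        "attn": rand(dim),
        "out_w": rand(dim),
        "out_b": 0.0,
    }

def forward(seq, params):
    for W, b in params["blocks"]:
        seq = residual_block(seq, W, b)
    post, weights = attention_pool(seq, params["attn"])
    z = sum(a * c for a, c in zip(post, params["out_w"])) + params["out_b"]
    return sigmoid(z), weights

if __name__ == "__main__":
    dim, n_tokens = 8, 5  # toy sizes; real fastText vectors are much larger
    rng = random.Random(1)
    tokens = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_tokens)]
    prob, weights = forward(tokens, init_params(dim))
    label = "denigrate" if prob >= 0.5 else "non-denigrate"
    print(label, round(prob, 3))
```

Because the weights are untrained, the printed label carries no meaning; the sketch only shows how a variable-length comment is reduced to a single probability. The 0.5 decision threshold is likewise an assumption.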


Published in

ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 21, Issue 1
January 2022
442 pages
ISSN: 2375-4699
EISSN: 2375-4702
DOI: 10.1145/3494068

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

• Published: 29 November 2021
• Revised: 1 November 2020
• Accepted: 1 October 2020
• Received: 1 July 2020

Qualifiers

• research-article
• Refereed