Toward Integrated CNN-based Sentiment Analysis of Tweets for Scarce-resource Language—Hindi

Published: 30 June 2021

Abstract

Linguistic resources for widely used languages such as English and Mandarin Chinese are abundant, which explains the volume of existing research in these languages. For other languages, however, linguistic resources are scarce, and Hindi is one of them. Although Hindi is the fourth-most popular language, it still lacks richly populated linguistic resources, owing to the challenges involved in processing Hindi text. This article first explores machine learning-based approaches (Naïve Bayes, Support Vector Machine, Decision Tree, and Logistic Regression) to analyze the sentiment contained in Hindi language text derived from Twitter.
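As a rough illustration of the first of these baselines, the sketch below trains a multinomial Naïve Bayes classifier with add-one smoothing on whitespace-tokenized text. It is not the article's implementation: the three training sentences, the function names `train_naive_bayes` and `predict`, and the tokenization are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy training set, invented for illustration only.
TRAIN = [
    ("फिल्म बहुत अच्छी है", "positive"),   # "the film is very good"
    ("खाना बहुत खराब था", "negative"),    # "the food was very bad"
    ("आज सोमवार है", "neutral"),          # "today is Monday"
]

def train_naive_bayes(samples):
    """Collect per-label document counts, per-label word counts, and the vocabulary."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        label_counts[label] += 1
        for tok in text.split():  # whitespace tokenization (an assumption)
            word_counts[label][tok] += 1
            vocab.add(tok)
    return label_counts, word_counts, vocab

def predict(model, text):
    """Return the label with the highest log-posterior under add-one smoothing."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total_docs)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in text.split():
            if tok in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = train_naive_bayes(TRAIN)
print(predict(model, "फिल्म अच्छी"))  # expected: positive
print(predict(model, "खाना खराब"))   # expected: negative
```

A realistic pipeline would also need tweet-specific preprocessing (hashtags, mentions, transliterated words) and a much larger labeled corpus, neither of which this sketch attempts.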

Further, the article presents lexicon-based approaches (Hindi Senti-WordNet, NRC Emotion Lexicon) for sentiment analysis in Hindi and proposes a Domain-specific Sentiment Dictionary. Finally, an integrated convolutional neural network (CNN) approach, incorporating a Recurrent Neural Network and Long Short-term Memory, is proposed to analyze sentiment in Hindi language tweets. The dataset comprises 23,767 tweets classified as positive, negative, or neutral. The proposed CNN approach achieves an accuracy of 85%.
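The general idea behind a lexicon-based approach can be sketched as follows: each known word carries a polarity score, and the summed score of a tweet is mapped to one of the three classes. This is a minimal sketch under stated assumptions, not the article's method. The entries in `TOY_LEXICON` are hypothetical and are not taken from Hindi Senti-WordNet, the NRC Emotion Lexicon, or the proposed dictionary, and the `classify_tweet` function and its `threshold` parameter are names invented for this example.

```python
# Hypothetical toy lexicon mapping a word to a polarity score.
# These entries are illustrative only and do not come from any published lexicon.
TOY_LEXICON = {
    "अच्छा": 1.0,    # good
    "शानदार": 1.0,   # great
    "खुश": 1.0,      # happy
    "बुरा": -1.0,    # bad
    "खराब": -1.0,    # poor
    "दुखी": -1.0,    # sad
}

def classify_tweet(text, lexicon=TOY_LEXICON, threshold=0.0):
    """Sum the polarity of known words and map the total to a class label."""
    tokens = text.split()  # whitespace tokenization (an assumption)
    score = sum(lexicon.get(tok, 0.0) for tok in tokens)
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"

print(classify_tweet("यह फिल्म शानदार है"))  # expected: positive
print(classify_tweet("सेवा बहुत खराब है"))   # expected: negative
print(classify_tweet("आज सोमवार है"))        # expected: neutral
```

Words missing from the lexicon contribute nothing to the score, which is one reason a domain-specific dictionary can help: it extends coverage to vocabulary that a general-purpose lexicon does not list.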
