Abstract
Deep learning has become the dominant approach to many Natural Language Processing (NLP) tasks, including sentiment analysis. However, these techniques require a large annotated corpus, which is difficult to obtain for most languages, especially in low-resource settings. In this article, we propose a deep multi-task, multi-lingual adversarial framework that addresses the resource-scarcity problem in sentiment analysis by leveraging relevant knowledge from a high-resource language. To transfer knowledge between languages, both languages are mapped to a shared semantic space using cross-lingual word embeddings. We evaluate the proposed architecture on a low-resource language, Hindi, using English as the high-resource language. Experiments show that the proposed model achieves an accuracy of 60.09% on the movie review dataset and 72.14% on the product review dataset. The approach yields significant performance gains over state-of-the-art systems and translation-based baselines.
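The abstract does not say how the two languages are mapped into the shared semantic space. As a minimal sketch of one common way to do this (an assumption, not necessarily the method used in this article), the example below aligns two embedding spaces with an orthogonal Procrustes rotation learned from paired word vectors. The vectors are synthetic toy data, and the names `procrustes_align`, `english`, and `hindi` are hypothetical.

```python
import numpy as np


def procrustes_align(src, tgt):
    """Return the orthogonal matrix W that minimises ||src @ W - tgt||_F.

    src, tgt: arrays of shape (n_pairs, dim) holding the vectors of
    dictionary-paired words in the source and target language.
    """
    u, _, vt = np.linalg.svd(src.T @ tgt)
    return u @ vt


# Synthetic toy data (assumption): random vectors stand in for English embeddings.
rng = np.random.default_rng(0)
dim, n_pairs = 8, 50
english = rng.normal(size=(n_pairs, dim))

# Stand-in "Hindi" vectors: the same points seen through a hidden rotation.
hidden_rotation, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
hindi = english @ hidden_rotation

# Learn a rotation that maps the Hindi space onto the English space.
w = procrustes_align(hindi, english)
aligned = hindi @ w
max_error = float(np.abs(aligned - english).max())
print(f"max alignment error: {max_error:.2e}")
```

Once both vocabularies live in one space, a single classifier can in principle consume sentences from either language, which is what makes knowledge transfer from the high-resource language possible. Real embedding spaces are only approximately related by a rotation, so the alignment error would not vanish as it does on this toy data.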
Index Terms
Exploring Multi-lingual, Multi-task, and Adversarial Learning for Low-resource Sentiment Analysis