Abstract
Emotion detection (ED) plays a vital role in determining an individual's interest in any field. Humans express their emotions through gestures, facial expressions, voice pitch, and word choice. Significant work has been done on detecting emotions from textual data in English, French, Chinese, and other high-resource languages. However, emotion classification has not been well studied in low-resource languages such as Urdu, owing to the lack of labeled corpora. This article presents the publicly available Urdu Nastalique Emotions Dataset (UNED), a corpus of sentences and paragraphs annotated with emotions, and proposes a deep learning (DL)-based technique for classifying emotions in it. The annotated UNED corpus covers six emotions for both paragraphs and sentences. We perform extensive experiments to evaluate the quality of the corpus and then classify it using machine learning and DL approaches. Experimental results show that the developed DL-based model outperforms generic machine learning approaches, achieving an F1 score of 85% on the UNED sentence-based corpus and 50% on the UNED paragraph-based corpus.
Index Terms
Context-aware Emotion Detection from Low-resource Urdu Language Using Deep Neural Network