Abstract
Word Sense Disambiguation (WSD), the task of automatically identifying the correct meaning of a word used in a given context, is a significant challenge in Natural Language Processing. The research community has explored a wide range of approaches to the problem, but the majority of these efforts have focused on a relatively small set of languages, particularly English. Research on WSD for South Asian languages, particularly Urdu, is still in its infancy. In recent years, deep learning methods have proved extremely successful across a range of Natural Language Processing tasks. The main aim of this study is to apply, evaluate, and compare a range of deep learning approaches to Urdu WSD (both Lexical Sample and All-Words), including Simple Recurrent Neural Networks, Long Short-Term Memory (LSTM), Gated Recurrent Units (GRU), Bidirectional LSTM, and Ensemble Learning. The evaluation was carried out on two benchmark corpora: (1) the ULS-WSD-18 corpus and (2) the UAW-WSD-18 corpus. The results (Accuracy = 63.25%, F1-Measure = 0.49) show that a deep learning approach outperforms previously reported results for the Urdu All-Words WSD task, whereas its performance on the Urdu Lexical Sample task (Accuracy = 72.63%, F1-Measure = 0.60) is lower than previously reported results.
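The recurrent architectures compared in the study all share the same basic pattern: encode the context of an ambiguous word into a fixed-size vector, then classify that vector into one of the word's senses. As a minimal illustrative sketch (not the authors' implementation; all names and dimensions here are assumptions), the update equations of one such architecture, the GRU cell, can be written in plain NumPy:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GRUCell:
    """Minimal GRU cell for illustration only (not the paper's implementation)."""

    def __init__(self, input_size, hidden_size, seed=0):
        rng = np.random.default_rng(seed)
        s = 1.0 / np.sqrt(hidden_size)
        # One weight matrix per gate: update (z), reset (r), candidate (h).
        self.W = {g: rng.uniform(-s, s, (hidden_size, input_size)) for g in "zrh"}
        self.U = {g: rng.uniform(-s, s, (hidden_size, hidden_size)) for g in "zrh"}
        self.b = {g: np.zeros(hidden_size) for g in "zrh"}

    def step(self, x, h):
        z = sigmoid(self.W["z"] @ x + self.U["z"] @ h + self.b["z"])  # update gate
        r = sigmoid(self.W["r"] @ x + self.U["r"] @ h + self.b["r"])  # reset gate
        h_tilde = np.tanh(self.W["h"] @ x + self.U["h"] @ (r * h) + self.b["h"])
        # Interpolate between the previous state and the candidate state.
        return (1.0 - z) * h + z * h_tilde

def encode(cell, embeddings):
    """Run the cell over a sequence of word vectors; return the final hidden state."""
    h = np.zeros(cell.b["z"].shape[0])
    for x in embeddings:
        h = cell.step(x, h)
    return h
```

In a lexical-sample setting, the final hidden state produced by `encode` (or the concatenated forward and backward states, for a bidirectional variant) would be fed to a softmax layer over the target word's sense inventory.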
Investigating the Feasibility of Deep Learning Methods for Urdu Word Sense Disambiguation