Investigating the Feasibility of Deep Learning Methods for Urdu Word Sense Disambiguation

Published: 31 October 2021

Abstract

Word Sense Disambiguation (WSD), the process of automatically identifying the correct meaning of a word used in a given context, is a significant challenge in Natural Language Processing. A range of approaches to the problem has been explored by the research community, but the majority of these efforts have focused on a relatively small set of languages, particularly English. Research on WSD for South Asian languages, particularly Urdu, is still in its infancy. In recent years, deep learning methods have proved extremely successful for a range of Natural Language Processing tasks. The main aim of this study is to apply, evaluate, and compare a range of deep learning approaches to Urdu WSD (both Lexical Sample and All-Words), including Simple Recurrent Neural Networks, Long Short-Term Memory, Gated Recurrent Units, Bidirectional Long Short-Term Memory, and Ensemble Learning. The evaluation was carried out on two benchmark corpora: (1) the ULS-WSD-18 corpus and (2) the UAW-WSD-18 corpus. The results show that a deep learning approach (Accuracy = 63.25%, F1-Measure = 0.49) outperforms previously reported results for the Urdu All-Words WSD task, whereas the best deep learning performance on the Urdu Lexical Sample task (Accuracy = 72.63%, F1-Measure = 0.60) is lower than previously reported results.
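To make the recurrent-network formulation of WSD concrete, the sketch below shows how a Simple Recurrent Neural Network (the Elman-style baseline named in the abstract) can be framed as a Lexical Sample sense classifier: it reads a window of word embeddings around an ambiguous target word and outputs a probability distribution over that word's senses. This is an illustrative toy with randomly initialised weights, not the authors' implementation; all dimensions, parameter names, and inputs are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

EMB_DIM, HID_DIM, N_SENSES = 8, 16, 3  # toy sizes for illustration

# Randomly initialised parameters stand in for trained ones.
W_xh = rng.normal(scale=0.1, size=(EMB_DIM, HID_DIM))  # input -> hidden
W_hh = rng.normal(scale=0.1, size=(HID_DIM, HID_DIM))  # hidden -> hidden
b_h = np.zeros(HID_DIM)
W_hy = rng.normal(scale=0.1, size=(HID_DIM, N_SENSES))  # hidden -> senses
b_y = np.zeros(N_SENSES)

def predict_senses(context_embeddings):
    """Run a simple (Elman) RNN over the context words surrounding the
    ambiguous target and return P(sense | context) via a softmax."""
    h = np.zeros(HID_DIM)
    for x in context_embeddings:              # one recurrent step per word
        h = np.tanh(x @ W_xh + h @ W_hh + b_h)
    logits = h @ W_hy + b_y
    exp = np.exp(logits - logits.max())       # numerically stable softmax
    return exp / exp.sum()

# A context window of five words, each a toy 8-dimensional embedding.
context = rng.normal(size=(5, EMB_DIM))
probs = predict_senses(context)
print(probs.shape)                            # distribution over 3 senses
```

The gated variants the study compares (LSTM, GRU, BiLSTM) replace the single `tanh` update with gated cells, and the bidirectional case runs a second pass over the context in reverse, but the overall classification framing is the same.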

