
A Study on the Performance of Recurrent Neural Network based Models in Maithili Part of Speech Tagging

Published: 21 February 2023

Abstract

This article presents our effort to develop a Maithili part-of-speech (POS) tagger. Substantial effort has been devoted to developing POS taggers for several Indian languages, including Hindi, Bengali, Tamil, Telugu, Kannada, Punjabi, and Marathi, but Maithili has received little attention from the research community. Maithili is one of the official languages of India, with around 50 million native speakers. We therefore developed a POS tagger for Maithili. For the development, we use a manually annotated in-house Maithili corpus containing 56,126 tokens. The tagset contains 27 tags. We train a conditional random fields (CRF) classifier as a baseline system, which achieves an accuracy of 82.67%. We then employ several recurrent neural network (RNN)-based models, including Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), LSTM with a CRF layer (LSTM-CRF), and GRU with a CRF layer (GRU-CRF), and perform a comparative study. We also study the effect of both word embeddings and character embeddings on the task. The highest accuracy of the system is 91.53%.
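To make the role of the CRF layer in the LSTM-CRF and GRU-CRF models more concrete, the following is a minimal sketch of Viterbi decoding, the standard inference step for a linear-chain CRF. It is not taken from the article: the function name `viterbi_decode`, the two-tag set (`NOUN`, `VERB`), and all scores are hypothetical values chosen for illustration, and the article's actual tagset has 27 tags. The emission scores stand in for the per-token outputs an LSTM or GRU might produce.

```python
from typing import Dict, List, Sequence


def viterbi_decode(
    emissions: Sequence[Dict[str, float]],
    transitions: Dict[str, Dict[str, float]],
    tags: Sequence[str],
) -> List[str]:
    """Return the highest-scoring tag sequence for one sentence.

    emissions[i][t]   : score of tag t at position i (e.g. from an LSTM or GRU)
    transitions[p][t] : score of moving from tag p to tag t (the CRF layer)
    """
    if not emissions:
        return []

    # best[i][t] = best score of any tag path that ends in tag t at position i
    best = [{t: emissions[0][t] for t in tags}]
    back: List[Dict[str, str]] = []

    for i in range(1, len(emissions)):
        scores: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for t in tags:
            # Pick the previous tag that maximizes the path score into t.
            prev = max(tags, key=lambda p: best[-1][p] + transitions[p][t])
            scores[t] = best[-1][prev] + transitions[prev][t] + emissions[i][t]
            pointers[t] = prev
        best.append(scores)
        back.append(pointers)

    # Start from the best final tag and follow the back-pointers.
    path = [max(tags, key=lambda t: best[-1][t])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical scores for a three-token sentence (illustration only).
TAGS = ["NOUN", "VERB"]
EMISSIONS = [
    {"NOUN": 2.0, "VERB": 0.5},
    {"NOUN": 1.0, "VERB": 1.2},
    {"NOUN": 0.3, "VERB": 2.0},
]
TRANSITIONS = {
    "NOUN": {"NOUN": 0.2, "VERB": 1.0},
    "VERB": {"NOUN": 0.8, "VERB": -1.0},
}

decoded = viterbi_decode(EMISSIONS, TRANSITIONS, TAGS)
```

With these assumed scores, choosing the best tag for each token independently would give NOUN, VERB, VERB, whereas the decoded sequence is NOUN, NOUN, VERB, because the VERB-to-VERB transition is penalized. This is the kind of sequence-level constraint a CRF layer adds on top of a recurrent encoder.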


    • Published in

      ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 22, Issue 2
      February 2023
      624 pages
      ISSN: 2375-4699
      EISSN: 2375-4702
      DOI: 10.1145/3572719

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 21 February 2023
      • Online AM: 1 June 2022
      • Accepted: 16 May 2022
      • Revised: 13 March 2022
      • Received: 5 September 2019

      Qualifiers

      • research-article
      • Refereed