Abstract
This article presents our effort to develop a Maithili part-of-speech (POS) tagger. Substantial effort has been devoted to developing POS taggers for several Indian languages, including Hindi, Bengali, Tamil, Telugu, Kannada, Punjabi, and Marathi, but Maithili has received little attention from the research community, even though it is one of the official languages of India, with around 50 million native speakers. We therefore develop a POS tagger for Maithili. We use a manually annotated in-house Maithili corpus containing 56,126 tokens and a tagset of 27 tags. We first train a conditional random fields (CRF) classifier as a baseline system, which achieves an accuracy of 82.67%. We then employ several recurrent neural network (RNN)-based models, including long short-term memory (LSTM), gated recurrent unit (GRU), LSTM with a CRF layer (LSTM-CRF), and GRU with a CRF layer (GRU-CRF), and perform a comparative study. We also study the effect of both word embeddings and character embeddings on the task. The highest accuracy achieved by the system is 91.53%.
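To illustrate what a CRF layer on top of an LSTM or GRU can add, the following minimal sketch shows Viterbi decoding for a linear-chain model: per-token scores (as a recurrent network might produce) are combined with tag-to-tag transition scores to pick the best whole sequence. This is not the authors' implementation. The two-tag set, all score values, and the function name `viterbi_decode` are assumptions made purely for illustration; the actual tagset has 27 tags and the scores would be learned.

```python
from typing import Dict, List, Sequence


def viterbi_decode(
    emissions: Sequence[Dict[str, float]],
    transitions: Dict[str, Dict[str, float]],
    tags: Sequence[str],
) -> List[str]:
    """Return the highest-scoring tag sequence under a linear-chain model.

    emissions[t][tag] is the score of `tag` for token t (hypothetical values
    standing in for LSTM/GRU outputs); transitions[prev][cur] is the score of
    moving from tag `prev` to tag `cur`.
    """
    if not emissions:
        return []
    # best[t][tag]: best score of any path that ends in `tag` at position t
    best = [{tag: emissions[0][tag] for tag in tags}]
    back: List[Dict[str, str]] = []  # back[t-1][tag]: best predecessor of `tag` at t
    for t in range(1, len(emissions)):
        scores: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for cur in tags:
            prev = max(tags, key=lambda p: best[-1][p] + transitions[p][cur])
            scores[cur] = best[-1][prev] + transitions[prev][cur] + emissions[t][cur]
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    # Start from the best final tag and follow the back-pointers to position 0.
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Illustrative (made-up) tags and scores for a three-token sentence.
tags = ["NOUN", "VERB"]
emissions = [
    {"NOUN": 2.0, "VERB": 0.5},
    {"NOUN": 1.0, "VERB": 1.2},
    {"NOUN": 0.3, "VERB": 2.0},
]
transitions = {
    "NOUN": {"NOUN": 0.5, "VERB": 0.2},
    "VERB": {"NOUN": 0.1, "VERB": -1.0},
}

# Picking the best tag for each token independently ignores transitions.
greedy = [max(e, key=e.get) for e in emissions]
decoded = viterbi_decode(emissions, transitions, tags)
print("greedy :", greedy)
print("viterbi:", decoded)
```

With these made-up scores, the independent per-token choice and the sequence-level decoding disagree on the second token, because the transition scores penalize a VERB following a VERB. This is the kind of tag-sequence constraint that a CRF layer is meant to capture.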
Index Terms
A Study on the Performance of Recurrent Neural Network based Models in Maithili Part of Speech Tagging