research-article

A Neural Joint Model with BERT for Burmese Syllable Segmentation, Word Segmentation, and POS Tagging

Published: 26 May 2021

Abstract

In Burmese, the syllable is the smallest semantic unit. This study proposes the first neural joint learning model for Burmese syllable segmentation, word segmentation, and part-of-speech (POS) tagging with BERT. The proposed model alleviates the problem of error propagation from syllable segmentation. More specifically, it extends the neural joint model for Vietnamese word segmentation, POS tagging, and dependency parsing [28] with pre-trained Burmese character, syllable, and word embeddings and BiLSTM-CRF-based neural layers. To evaluate the proposed model, we fine-tune multilingual BERT and carry out experiments on Burmese benchmark datasets. The results show that the proposed joint model achieves strong performance.
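The abstract does not state how the joint output is encoded. As a rough illustration only, one common way to cast joint word segmentation and POS tagging as sequence labeling is to give each syllable a combined boundary-and-POS tag (for example `B-n` for a syllable that begins a noun, `I-n` for one that continues it). The sketch below decodes such tags into (word, POS) pairs. The tag format, the function name `decode_joint_tags`, and the placeholder syllables are assumptions made for this example, not the authors' scheme or data.

```python
from typing import List, Tuple


def decode_joint_tags(syllables: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Merge per-syllable joint tags such as "B-n" / "I-n" into (word, POS) pairs.

    Assumed (hypothetical) tag format: "<B|I>-<POS>".
    """
    if len(syllables) != len(tags):
        raise ValueError("syllables and tags must have the same length")

    words: List[Tuple[str, str]] = []
    current: List[str] = []   # syllables of the word being built
    current_pos = ""          # POS label of the word being built

    for syl, tag in zip(syllables, tags):
        boundary, _, pos = tag.partition("-")
        # Open a new word on "B", at the start of the sequence, or when an
        # "I" tag carries a POS that disagrees with the open word.
        if boundary == "B" or not current or pos != current_pos:
            if current:
                words.append(("".join(current), current_pos))
            current, current_pos = [syl], pos
        else:
            current.append(syl)

    if current:
        words.append(("".join(current), current_pos))
    return words


# Placeholder syllables and POS labels, not real Burmese data.
example = decode_joint_tags(["s1", "s2", "s3", "s4"], ["B-n", "I-n", "B-v", "B-n"])
print(example)
```

With this encoding a single tagger produces word boundaries and POS labels in one pass, which is the general motivation for joint models over a pipeline of separate segmenters and taggers.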

References

  1. Chris Alberti, Kenton Lee, and Michael Collins. 2019. A BERT baseline for the natural questions. arXiv: Computation and Language.
  2. Cunli Mao, Zhibo Man, Zhengtao Yu, Zhenhan Wang, Shengxiang Gao, and Yafei Zhang. 2020. A Burmese dependency parsing method based on transfer learning. In Proceedings of the 2020 International Conference on Asian Language Processing (IALP’20). IEEE, 92–97.
  3. Bernd Bohnet, Ryan McDonald, Goncalo Simoes, Daniel Andor, Emily Pitler, and Joshua Maynez. 2018. Morphosyntactic tagging with a meta-BiLSTM model over context sensitive token encodings. arXiv:1805.08237.
  4. Wanxiang Che, Yijia Liu, Yuxuan Wang, Bo Zheng, and Ting Liu. 2018. Towards better UD parsing: Deep contextualized word embeddings, ensemble, and treebank concatenation. arXiv:1807.03121.
  5. Xinchi Chen, Xipeng Qiu, and Xuanjing Huang. 2016. A feature-enriched neural model for joint Chinese word segmentation and part-of-speech tagging. arXiv:1611.05384.
  6. Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv:1810.04805.
  7. Chenchen Ding, Hnin Thu Zar Aye, Win Pa Pa, Khin Thandar Nwet, Khin Mar Soe, Masao Utiyama, and Eiichiro Sumita. 2019. Towards Burmese (Myanmar) morphological analysis: Syllable-based tokenization and part-of-speech tagging. ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP) 19, 1 (2019), 1–34.
  8. Chenchen Ding, Ye Kyaw Thu, Masao Utiyama, and Eiichiro Sumita. 2016. Word segmentation for Burmese (Myanmar). ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP) 15, 4 (2016), 1–10.
  9. Chenchen Ding, Masao Utiyama, and Eiichiro Sumita. 2018. NOVA: A feasible and flexible annotation system for joint tokenization and part-of-speech tagging. ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP) 18, 2 (2018), 1–18.
  10. Chenchen Ding, Sann Su Su Yee, Win Pa Pa, Khin Mar Soe, Masao Utiyama, and Eiichiro Sumita. 2020. A Burmese (Myanmar) treebank: Guideline and analysis. ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP) 19, 3 (2020), 1–13.
  11. Jun Hatori, Takuya Matsuzaki, Yusuke Miyao, and Jun’ichi Tsujii. 2012. Incremental joint approach to word segmentation, POS tagging, and dependency parsing in Chinese. In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers-Volume 1. Association for Computational Linguistics, 1045–1053.
  12. Tin Htay Hlaing and Yoshiki Mikami. 2014. Automatic syllable segmentation of Myanmar texts using finite state transducer. ICTer 6, 2 (2014).
  13. Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long short-term memory. Neural Computation 9, 8 (1997), 1735–1780.
  14. Hla Hla Htay and Kavi Narayana Murthy. 2008. Myanmar word segmentation using syllable level longest matching. In Proceedings of the 6th Workshop on Asian Language Resources.
  15. Zhiheng Huang, Wei Xu, and Kai Yu. 2015. Bidirectional LSTM-CRF models for sequence tagging. arXiv:1508.01991.
  16. Wenbin Jiang, Liang Huang, Qun Liu, and Yajuan Lü. 2008. A cascaded linear model for joint Chinese word segmentation and part-of-speech tagging. In Proceedings of ACL-08: HLT. 897–904.
  17. Zhanming Jie and Wei Lu. 2019. Dependency-guided LSTM-CRF for named entity recognition. arXiv:1909.10148.
  18. Dan Kondratyuk and Milan Straka. 2019. 75 languages, 1 model: Parsing universal dependencies universally. arXiv:1904.02099.
  19. Canasai Kruengkrai, Kiyotaka Uchimoto, Jun’ichi Kazama, Yiou Wang, Kentaro Torisawa, and Hitoshi Isahara. 2009. An error-driven word-character hybrid model for joint Chinese word segmentation and POS tagging. In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1-Volume 1. Association for Computational Linguistics, 513–521.
  20. John Lafferty. 2001. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proceedings of ICML.
  21. Guillaume Lample, Miguel Ballesteros, Sandeep Subramanian, Kazuya Kawakami, and Chris Dyer. 2016. Neural architectures for named entity recognition. arXiv:1603.01360.
  22. Yang Liu. 2019. Fine-tune BERT for extractive summarization. arXiv:1903.10318.
  23. Zin Maung Maung and Yoshiki Mikami. 2008. A rule-based syllable segmentation of Myanmar text. In Proceedings of the IJCNLP-08 Workshop on NLP for Less Privileged Languages.
  24. Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Efficient estimation of word representations in vector space. arXiv:1301.3781.
  25. Aye Myat Mon, Soe Lai Phyue, Myint Myint Thein, Su Su Htay, and Thinn Thinn Win. 2010. Analysis of Myanmar word boundary and segmentation by using statistical approach. In Proceedings of the 2010 3rd International Conference on Advanced Computer Theory and Engineering (ICACTE’10), Vol. 5. IEEE, V5-233–V5-237.
  26. Cynthia Myint. 2011. A hybrid approach for part-of-speech tagging of Burmese texts. In Proceedings of the 2011 International Conference on Computer and Management (CAMAN’11). IEEE, 1–4.
  27. Phyu Hninn Myint, Tin Myat Htwe, and Ni Lar Thein. 2011. Bigram part-of-speech tagger for Myanmar language. In Proceedings of 2011 International Conference on Information Communication and Management, Singapore. 147–152.
  28. Dat Quoc Nguyen. [n.d.]. A neural joint model for Vietnamese word segmentation, POS tagging and dependency parsing.
  29. Rodrigo Nogueira and Kyunghyun Cho. 2019. Passage re-ranking with BERT.
  30. Matthew E. Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, and Luke Zettlemoyer. 2018. Deep contextualized word representations. arXiv:1802.05365.
  31. Myat Lay Phyu and Kiyota Hashimoto. 2017. Burmese word segmentation with character clustering and CRFs. In Proceedings of the 2017 14th International Joint Conference on Computer Science and Software Engineering (JCSSE’17). IEEE, 1–6.
  32. Tao Qian, Yue Zhang, Meishan Zhang, Yafeng Ren, and Donghong Ji. 2015. A transition-based model for joint segmentation, POS-tagging and normalization. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. 1837–1846.
  33. Xian Qian and Yang Liu. 2012. Joint Chinese word segmentation, POS tagging and parsing. In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning. Association for Computational Linguistics, 501–511.
  34. Xipeng Qiu, Tianxiang Sun, Yige Xu, Yunfan Shao, Ning Dai, and Xuanjing Huang. 2020. Pre-trained models for natural language processing: A survey. Science China Technological Sciences (2020), 1–26.
  35. Nuo Qun, Hang Yan, Xipeng Qiu, and Xuanjing Huang. 2020. Chinese word segmentation via BiLSTM+Semi-CRF with relay node. Journal of Computer Science and Technology 35, 5 (2020), 1115–1126.
  36. Lin Songkai, Mao Cunli, Yu Zhengtao, Guo Jianyi, Wang Hongbin, and Zhang Jiafu. 2018. A method of Myanmar word segmentation based on convolution neural network. Journal of Chinese Information Processing 6 (2018), 8.
  37. Chen Sun, Austin Myers, Carl Vondrick, Kevin Murphy, and Cordelia Schmid. 2019. VideoBERT: A joint model for video and language representation learning. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 7464–7473.
  38. Chi Sun, Xipeng Qiu, Yige Xu, and Xuanjing Huang. 2019. How to fine-tune BERT for text classification? CoRR abs/1905.05583 (2019). arXiv:1905.05583. http://arxiv.org/abs/1905.05583
  39. Weiwei Sun. 2011. A stacked sub-word model for joint Chinese word segmentation and part-of-speech tagging. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies-Volume 1. Association for Computational Linguistics, 1385–1394.
  40. Ian Tenney, Dipanjan Das, and Ellie Pavlick. 2019. BERT rediscovers the classical NLP pipeline. arXiv:1905.05950.
  41. Tun Thura Thet, Jin-Cheon Na, and Wunna Ko Ko. 2008. Word segmentation for the Myanmar language. Journal of Information Science 34, 5 (2008), 688–704.
  42. Ye Kyaw Thu, Andrew Finch, Eiichiro Sumita, and Yoshinori Sagisaka. 2014. Integrating dictionaries into an unsupervised model for Myanmar word segmentation. In Proceedings of the Fifth Workshop on South and Southeast Asian Natural Language Processing. 20–27.
  43. Ye Kyaw Thu, Win Pa Pa, Masao Utiyama, Andrew Finch, and Eiichiro Sumita. 2016. Introducing the Asian language treebank (ALT). In Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC’16). 1574–1578.
  44. Yuxuan Wang, Wanxiang Che, Jiang Guo, Yijia Liu, and Ting Liu. 2019. Cross-lingual BERT transformation for zero-shot dependency parsing. arXiv:1909.06775.
  45. Liner Yang, Meishan Zhang, Yang Liu, Maosong Sun, Nan Yu, and Guohong Fu. 2017. Joint POS tagging and dependence parsing with transition-based neural networks. IEEE/ACM Transactions on Audio, Speech, and Language Processing 26, 8 (2017), 1352–1358.
  46. Meishan Zhang, Nan Yu, and Guohong Fu. 2018. A simple and effective neural model for joint word segmentation and POS tagging. IEEE/ACM Transactions on Audio, Speech, and Language Processing 26, 9 (2018), 1528–1538.
  47. Meishan Zhang, Yue Zhang, Wanxiang Che, and Ting Liu. 2014. Character-level Chinese dependency parsing. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 1326–1336.
  48. Shaoning Zhang, Cunli Mao, Zhengtao Yu, Hongbin Wang, Zhongwei Li, and Jiafu Zhang. 2018. Word segmentation for Burmese based on dual-layer CRFs. ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP) 18, 1 (2018), 1–11.
  49. Yue Zhang and Stephen Clark. 2008. Joint word segmentation and POS tagging using a single perceptron. In Proceedings of ACL-08: HLT. 888–896.
  50. Yue Zhang and Stephen Clark. 2010. A fast decoder for joint word segmentation and POS-tagging using a single discriminative model. In Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 843–852.
  51. Xiaoqing Zheng, Hanyang Chen, and Tianyu Xu. 2013. Deep learning for Chinese word segmentation and POS tagging. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing. 647–657.

Published in

ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 20, Issue 4
July 2021
419 pages
ISSN: 2375-4699
EISSN: 2375-4702
DOI: 10.1145/3465463

Copyright © 2021 Association for Computing Machinery.

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

  • Published: 26 May 2021
  • Accepted: 1 November 2020
  • Revised: 1 July 2020
  • Received: 1 March 2020

Qualifiers

  • research-article
  • Refereed
