Short Paper

Robust Multi-task Learning-based Korean POS Tagging to Overcome Word Spacing Errors

Published: 17 June 2023

Abstract

End-to-end neural network-based approaches have recently demonstrated significant improvements in natural language processing (NLP). In practice, however, NLP applications such as assistant systems still combine their components in a pipeline paradigm, and pipelines suffer from error propagation. In the Korean morphological analysis and part-of-speech (POS) tagging step, incorrect POS tags for a sentence containing word spacing errors degrade every module downstream of the POS tagger. Hence, we present a multi-task learning-based neural POS tagging model for Korean that addresses the word spacing problem. Applied to Korean morphological analysis and POS tagging, the model proves robust to word spacing errors. We adopt syllable-level input and output formats together with a simple multi-task architecture combining ELECTRA and RNN-CRF models, and we achieve an F1 score of 98.30 on the Sejong corpus test set, surpassing previous studies.
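The syllable-level formulation the abstract mentions can be illustrated with a small sketch. This is an assumption-laden simplification, not the paper's exact scheme: it converts a spaced Korean sentence into a sequence of syllable tokens plus BIO-style word-boundary labels (B for the first syllable of a word, I for continuations). A spacing-robust tagger would receive the unspaced syllable sequence as input and predict labels like these as one of its tasks; the function name and label set here are purely illustrative.

```python
def syllable_spacing_labels(sentence: str):
    """Split a spaced sentence into syllable tokens and BIO spacing labels.

    'B' marks the first syllable of each word, 'I' marks continuation
    syllables, so the original word spacing can be recovered from the
    label sequence even when the input text has spacing errors removed.
    """
    syllables, labels = [], []
    for word in sentence.split():
        for i, ch in enumerate(word):
            syllables.append(ch)
            labels.append("B" if i == 0 else "I")
    return syllables, labels


# "나는 학교에 간다" ("I go to school") yields seven syllables with
# word-initial syllables labeled B: B I | B I I | B I
tokens, tags = syllable_spacing_labels("나는 학교에 간다")
```

Under this encoding, training pairs can be built from correctly spaced text while the model sees space-stripped input, which is one simple way to make a tagger tolerant of spacing errors.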


• Published in: ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 22, Issue 6 (June 2023), 635 pages. ISSN: 2375-4699; EISSN: 2375-4702; DOI: 10.1145/3604597


• Publisher: Association for Computing Machinery, New York, NY, United States

Publication History

• Received: 27 March 2022
• Revised: 27 December 2022
• Accepted: 31 March 2023
• Online AM: 5 April 2023
• Published: 17 June 2023
