research-article

Adversarial Evaluation of Robust Neural Sequential Tagging Methods for Thai Language

Published: 18 May 2020

Abstract

Sequential tagging tasks, such as Part-Of-Speech (POS) tagging and Named-Entity Recognition, are the building blocks of many natural language processing applications. Although prior works have reported promising results in standard settings, they often underperform on non-standard text, such as microblogs and social media. In this article, we introduce an adversarial evaluation scheme for the Thai language that creates adversarial examples based on known spelling errors. Furthermore, we propose novel methods, including UNK masking, condition initialization with affixation embeddings, and an untied-directional self-attention mechanism, to enhance the robustness and interpretability of the neural networks. We conducted experiments on two Thai corpora: BEST2010 and ORCHID. Our adversarial evaluation scheme reveals that bidirectional LSTM (BiLSTM) models do not perform well on adversarial examples. Our best methods match the performance of the BiLSTM baseline model and outperform it on adversarial examples.
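As a rough illustration of building adversarial examples from known spelling errors, the sketch below swaps words for listed misspellings while leaving the gold tags unchanged. It is a minimal sketch under stated assumptions, not the procedure used in the article: the function name `make_adversarial`, the `misspellings` table, the `rate` parameter, and the tag labels are all hypothetical, and the single Thai misspelling pair is included only as an example.

```python
import random
from typing import Dict, List, Optional, Sequence, Tuple

Token = Tuple[str, str]  # (word, tag); the tag names used below are placeholders


def make_adversarial(
    sentence: Sequence[Token],
    misspellings: Dict[str, List[str]],
    rate: float = 1.0,
    rng: Optional[random.Random] = None,
) -> List[Token]:
    """Replace words with a listed misspelling; gold tags stay untouched.

    `misspellings` maps a correctly spelled word to its known misspelled
    variants (a hypothetical resource, not one supplied by the article).
    `rate` is the probability of perturbing a word that has a variant.
    """
    rng = rng or random.Random(0)  # fixed seed keeps the sketch reproducible
    perturbed: List[Token] = []
    for word, tag in sentence:
        variants = misspellings.get(word)
        if variants and rng.random() < rate:
            word = rng.choice(variants)
        perturbed.append((word, tag))
    return perturbed


# Illustrative data only: one common Thai misspelling and placeholder tags.
known_errors = {"อนุญาต": ["อนุญาติ"]}
clean = [("ไม่", "NEG"), ("อนุญาต", "VERB")]
noisy = make_adversarial(clean, known_errors)
```

Because the tags are copied unchanged, a tagger can be scored on `clean` and `noisy` with the same gold labels, and any gap between the two scores is attributable to the spelling perturbation.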

Published in

ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 19, Issue 4
July 2020
291 pages
ISSN: 2375-4699
EISSN: 2375-4702
DOI: 10.1145/3391538

Copyright © 2020 ACM

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

• Published: 18 May 2020
• Online AM: 7 May 2020
• Revised: 1 February 2020
• Accepted: 1 February 2020
• Received: 1 July 2019

Qualifiers

• research-article
• Research
• Refereed