Abstract
Sequential tagging tasks, such as part-of-speech (POS) tagging and named-entity recognition, are the building blocks of many natural language processing applications. Although prior work has reported promising results in standard settings, existing models often underperform on non-standard text, such as microblogs and social media posts. In this article, we introduce an adversarial evaluation scheme for the Thai language by creating adversarial examples based on known spelling errors. Furthermore, we propose novel methods, including UNK masking, condition initialization with affixation embeddings, and an untied-directional self-attention mechanism, to enhance the robustness and interpretability of neural networks. We conducted experiments on two Thai corpora: BEST2010 and ORCHID. Our adversarial evaluation scheme reveals that bidirectional LSTM (BiLSTM) models do not perform well on adversarial examples. Our best methods match the performance of the BiLSTM baseline model in the standard setting and outperform it on adversarial examples.
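The evaluation idea described above, perturbing test sentences with known spelling errors and comparing tagging accuracy before and after, can be illustrated with a minimal sketch. This is not the authors' implementation: the misspelling table, the function names (`perturb`, `accuracy`, `adversarial_accuracy`), the perturbation rate, and the toy lookup tagger are all hypothetical, and English placeholder words stand in for Thai tokens.

```python
import random
from typing import Callable, Dict, List, Sequence, Tuple

Sentence = Tuple[List[str], List[str]]  # (tokens, gold tags)

# Hypothetical table of known misspellings (illustrative placeholders only;
# the article works with Thai spelling errors, which are not reproduced here).
MISSPELLINGS: Dict[str, List[str]] = {
    "receive": ["recieve"],
    "definitely": ["definately", "definitly"],
}


def perturb(tokens: Sequence[str], table: Dict[str, List[str]],
            rate: float, rng: random.Random) -> List[str]:
    """Swap each token that has a known misspelling with probability `rate`."""
    out = []
    for tok in tokens:
        variants = table.get(tok)
        if variants and rng.random() < rate:
            out.append(rng.choice(variants))
        else:
            out.append(tok)
    return out


def accuracy(tagger: Callable[[List[str]], List[str]],
             sentences: Sequence[Sentence]) -> float:
    """Token-level accuracy of `tagger` against the gold tags."""
    correct = total = 0
    for tokens, gold in sentences:
        pred = tagger(list(tokens))
        correct += sum(p == g for p, g in zip(pred, gold))
        total += len(gold)
    return correct / total if total else 0.0


def adversarial_accuracy(tagger: Callable[[List[str]], List[str]],
                         sentences: Sequence[Sentence],
                         table: Dict[str, List[str]],
                         rate: float = 1.0, seed: int = 0) -> float:
    """Accuracy on misspelled copies of the sentences; gold tags are unchanged."""
    rng = random.Random(seed)
    perturbed = [(perturb(toks, table, rate, rng), gold)
                 for toks, gold in sentences]
    return accuracy(tagger, perturbed)


def lookup_tagger(tokens: List[str]) -> List[str]:
    """Toy stand-in for a trained tagger: lexicon lookup, NOUN as fallback."""
    lexicon = {"i": "PRON", "receive": "VERB", "mail": "NOUN"}
    return [lexicon.get(t, "NOUN") for t in tokens]


data: List[Sentence] = [(["i", "receive", "mail"], ["PRON", "VERB", "NOUN"])]
clean_acc = accuracy(lookup_tagger, data)
adv_acc = adversarial_accuracy(lookup_tagger, data, MISSPELLINGS)
```

Because the gold tags are kept fixed while only the surface forms change, the gap between `clean_acc` and `adv_acc` isolates how much a tagger depends on exact spelling, which is the kind of weakness the adversarial evaluation is meant to expose.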
Index Terms
Adversarial Evaluation of Robust Neural Sequential Tagging Methods for Thai Language