Abstract
Machine translation aims to break the language barriers that prevent communication and to increase access to information. Deaf people face huge language barriers in their daily lives, including limited access to digital and spoken information, yet very few digital resources exist for sign language processing. In this article, we present a transfer-based machine translation system that translates Korean into Korean Sign Language (KSL) glosses. The system is mainly composed of (1) dictionary-based lexical transfer and (2) a hybrid syntactic transfer based on a data-driven model. In particular, we formulate the complicated word reordering problems in syntactic transfer as multi-class classification tasks and propose a “syntactically guided” data-driven syntactic transfer. The core of our study is a neural classification model that reorders order-important constituent pairs, using a reordering task newly designed for Korean-to-KSL translation. Experimental results on news transcript data show that the proposed system achieves a BLEU score of 0.512 and a RIBES score of 0.425, significantly improving upon the baseline system.
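To make the idea of reordering as classification concrete, the following is a minimal sketch and not the system described in the article. It assumes a two-label scheme (`KEEP`, `SWAP`), whereas the abstract only says the task is multi-class and does not list the labels. The constituent labels (`SUBJ`, `NEG`, `PRED`), the English placeholder glosses, and the single rule inside `toy_classifier` are all hypothetical stand-ins for a trained neural model.

```python
from typing import Callable, List, Tuple

# A constituent is a (gloss, syntactic label) pair; both fields are
# illustrative assumptions, not the article's actual representation.
Constituent = Tuple[str, str]

KEEP, SWAP = "KEEP", "SWAP"  # hypothetical label set


def toy_classifier(left: Constituent, right: Constituent) -> str:
    """Stand-in for a trained pair classifier, using one made-up rule."""
    # Hypothetical rule: a negation marker should follow the predicate.
    if left[1] == "NEG" and right[1] == "PRED":
        return SWAP
    return KEEP


def reorder(
    constituents: List[Constituent],
    classify: Callable[[Constituent, Constituent], str],
) -> List[Constituent]:
    """Insert each constituent in turn, moving it left past every
    neighbour for which the classifier predicts SWAP."""
    out: List[Constituent] = []
    for item in constituents:
        pos = len(out)
        while pos > 0 and classify(out[pos - 1], item) == SWAP:
            pos -= 1
        out.insert(pos, item)
    return out


source_order = [("I", "SUBJ"), ("NOT", "NEG"), ("GO", "PRED")]
print(reorder(source_order, toy_classifier))
```

The insertion loop always terminates because each item moves left at most as many positions as there are items already placed, so even an inconsistent classifier cannot cause an endless swap cycle.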
Index Terms
Word Reordering for Translation into Korean Sign Language Using Syntactically-guided Classification