Abstract
Parenthetical translations are translations of terms that appear inside parentheses in otherwise monolingual text. Parenthetical translation extraction (PTE) is the task of extracting parenthetical translations from natural language documents. One of the main difficulties in PTE is detecting the left boundary of the translated term in the preparenthetical text. In this article, we propose a collective approach that employs Markov logic to model multiple constraints used in the PTE task. We show how various constraints can be formulated and combined in a Markov logic network (MLN). Our experimental results show that the proposed collective PTE approach significantly outperforms a current state-of-the-art method, improving the average F-measure by up to 27.11% over the previous word alignment approach. It also outperforms an individual MLN-based system by 8.2% and a system based on conditional random fields by 5.9%.
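To make the left-boundary problem concrete, the following is a minimal sketch, not the article's model. It assumes two made-up constraints with made-up weights and scores every candidate left boundary by the weighted sum of the constraints it satisfies, in the log-linear style that Markov logic uses. The names `BOUNDARY_TOKENS`, `length_matches`, `follows_boundary`, and `boundary_distribution`, the weights, and the example sentence are all hypothetical. The sketch treats one instance in isolation and does not attempt the collective inference across instances that the article proposes.

```python
import math

# Assumed cue tokens that may precede a translated term (illustrative, not from the article).
BOUNDARY_TOKENS = {"，", "。", "的", "研究"}

def length_matches(tokens, start, translation):
    # Fires when the candidate term has as many tokens as the translation has words.
    return int(len(tokens) - start == len(translation.split()))

def follows_boundary(tokens, start, translation):
    # Fires when the candidate begins the text or follows an assumed cue token.
    return int(start == 0 or tokens[start - 1] in BOUNDARY_TOKENS)

# Hypothetical (weight, constraint) pairs; a real system would learn the weights.
WEIGHTED_CONSTRAINTS = [(1.5, length_matches), (1.0, follows_boundary)]

def boundary_distribution(tokens, translation):
    # Score each candidate left boundary by the weighted sum of satisfied
    # constraints, then normalize the exponentiated scores over all candidates.
    scores = [
        sum(w * f(tokens, start, translation) for w, f in WEIGHTED_CONSTRAINTS)
        for start in range(len(tokens))
    ]
    z = sum(math.exp(s) for s in scores)
    return [math.exp(s) / z for s in scores]

# Preparenthetical tokens for a text such as "他研究馬可夫邏輯網路 (Markov logic network)".
tokens = ["他", "研究", "馬可夫", "邏輯", "網路"]
probs = boundary_distribution(tokens, "Markov logic network")
best_start = max(range(len(tokens)), key=probs.__getitem__)
print("".join(tokens[best_start:]))
```

Under these assumed weights, the candidate starting at the third token satisfies both constraints and receives the highest probability. Because the constraints are soft, a candidate that violates one of them is penalized rather than ruled out.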
Index Terms
Collective Web-Based Parenthetical Translation Extraction Using Markov Logic Networks