 Yuval Marton

Bibliometrics: publication history
Average citations per article: 11.67
Citation Count: 175
Publication count: 15
Publication years: 2005-2014
Available for download: 11
Average downloads per article: 171.55
Downloads (cumulative): 1,887
Downloads (12 Months): 188
Downloads (6 Weeks): 20

1
December 2014 Journal of King Saud University - Computer and Information Sciences: Volume 26 Issue 4, December 2014
Publisher: Elsevier Science Inc.
Bibliometrics:
Citation Count: 0

Foreign name transliterations typically include multiple spelling variants. These variants cause data sparseness and inconsistency problems, increase the Out-of-Vocabulary (OOV) rate, and present challenges for Machine Translation, Information Extraction and other natural language processing (NLP) tasks. This work aims to identify and cluster name spelling variants using a Statistical Machine ...
Keywords: Arabic, Machine Translation, Name normalization, Named Entity Recognition, Information Extraction, Transliteration

2
July 2013 ACM Transactions on Intelligent Systems and Technology (TIST) - Special Sections on Paraphrasing; Intelligent Systems for Socially Aware Computing; Social Computing, Behavioral-Cultural Modeling, and Prediction: Volume 4 Issue 3, June 2013
Publisher: ACM
Bibliometrics:
Citation Count: 1
Downloads (6 Weeks): 4,   Downloads (12 Months): 30,   Downloads (Overall): 208

Paraphrase generation has been shown useful for various natural language processing tasks, including statistical machine translation. A commonly used method for paraphrase generation is pivoting [Callison-Burch et al. 2006], which benefits from linguistic knowledge implicit in the sentence alignment of parallel texts, but has limited applicability due to its reliance ...
Keywords: SMT, Semantic similarity, paraphrase generation, statistical machine translation, semantic distance

3
March 2013 Computational Linguistics: Volume 39 Issue 1, March 2013
Publisher: MIT Press
Bibliometrics:
Citation Count: 5
Downloads (6 Weeks): 0,   Downloads (12 Months): 10,   Downloads (Overall): 83

We explore the contribution of lexical and inflectional morphology features to dependency parsing of Arabic, a morphologically rich language with complex agreement patterns. Using controlled experiments, we contrast the contribution of different part-of-speech (POS) tag sets and morphological features in two input conditions: machine-predicted condition in which POS tags and ...

4
June 2012 NAACL HLT '12: Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Tutorials
Publisher: Association for Computational Linguistics
Bibliometrics:
Citation Count: 0
Downloads (6 Weeks): 1,   Downloads (12 Months): 3,   Downloads (Overall): 72

Semantic distance measures aim to answer questions such as: How close in meaning are words A and B? For example: "couch" and "sofa"? (very); "wave" and "ripple"? (so-so); "wave" and "bank"? (far). Distributional measures do that by modeling which words occur next to A and next to B in large ...

5
March 2012 Machine Translation: Volume 26 Issue 1-2, March 2012
Publisher: Kluwer Academic Publishers
Bibliometrics:
Citation Count: 3

We study challenges raised by the order of Arabic verbs and their subjects in statistical machine translation (SMT). We show that the boundaries of post-verbal subjects (VS) are hard to detect accurately, even with a state-of-the-art Arabic dependency parser. In addition, VS constructions have highly ambiguous reordering patterns when translated ...
Keywords: Subject detection, Word alignment, Dependency parsing, Matrix subject, Post-verbal subjects, Reordering, Statistical machine translation, VS

6
March 2012 Machine Translation: Volume 26 Issue 1-2, March 2012
Publisher: Kluwer Academic Publishers
Bibliometrics:
Citation Count: 1

In adding syntax to statistical machine translation, there is a tradeoff between taking advantage of linguistic analysis and allowing the model to exploit parallel training data with no linguistic analysis: translation quality versus coverage. A number of previous efforts have tackled this tradeoff by starting with a commitment to linguistically ...
Keywords: Arabic, Syntax, Soft constraints, Statistical methods, Machine translation, Parsing

7
July 2011 WMT '11: Proceedings of the Sixth Workshop on Statistical Machine Translation
Publisher: Association for Computational Linguistics
Bibliometrics:
Citation Count: 2
Downloads (6 Weeks): 1,   Downloads (12 Months): 9,   Downloads (Overall): 87

Paraphrases are useful for statistical machine translation (SMT) and natural language processing tasks. Distributional paraphrase generation is independent of parallel texts and syntactic parses, and hence is suitable also for resource-poor languages, but tends to erroneously rank antonyms, trend-contrasting, and polarity-dissimilar candidates as good paraphrases. We present here a novel ...

8
June 2011 HLT '11: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Publisher: Association for Computational Linguistics
Bibliometrics:
Citation Count: 5
Downloads (6 Weeks): 3,   Downloads (12 Months): 13,   Downloads (Overall): 128

We explore the contribution of morphological features -- both lexical and inflectional -- to dependency parsing of Arabic, a morphologically rich language. Using controlled experiments, we find that definiteness, person, number, gender, and the undiacritized lemma are most helpful for parsing on automatically tagged input. We further contrast the contribution ...

9
July 2010 ACLShort '10: Proceedings of the ACL 2010 Conference Short Papers
Publisher: Association for Computational Linguistics
Bibliometrics:
Citation Count: 11
Downloads (6 Weeks): 1,   Downloads (12 Months): 17,   Downloads (Overall): 164

We study the challenges raised by Arabic verb and subject detection and reordering in Statistical Machine Translation (SMT). We show that post-verbal subject (VS) constructions are hard to translate because they have highly ambiguous reordering patterns when translated to English. In addition, implementing reordering is difficult because the boundaries of ...

10
June 2010 SPMRL '10: Proceedings of the NAACL HLT 2010 First Workshop on Statistical Parsing of Morphologically-Rich Languages
Publisher: Association for Computational Linguistics
Bibliometrics:
Citation Count: 15
Downloads (6 Weeks): 3,   Downloads (12 Months): 27,   Downloads (Overall): 210

We explore the contribution of different lexical and inflectional morphological features to dependency parsing of Arabic, a morphologically rich language. We experiment with all leading POS tagsets for Arabic, and introduce a few new sets. We show that training the parser using a simple regular expressive extension of an impoverished ...

11
August 2009 EMNLP '09: Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 1
Publisher: Association for Computational Linguistics
Bibliometrics:
Citation Count: 28
Downloads (6 Weeks): 5,   Downloads (12 Months): 48,   Downloads (Overall): 406

Untranslated words still constitute a major problem for Statistical Machine Translation (SMT), and current SMT systems are limited by the quantity of parallel training texts. Augmenting the training data with paraphrases generated by pivoting through other languages alleviates this problem, especially for the so-called "low density" languages. But pivoting requires ...

12
August 2009 EMNLP '09: Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 2
Publisher: Association for Computational Linguistics
Bibliometrics:
Citation Count: 4
Downloads (6 Weeks): 1,   Downloads (12 Months): 6,   Downloads (Overall): 89

Strictly corpus-based measures of semantic distance conflate co-occurrence information pertaining to the many possible senses of target words. We propose a corpus-thesaurus hybrid method that uses soft constraints to generate word-sense-aware distributional profiles (DPs) from coarser "concept DPs" (derived from a Roget-like thesaurus) and sense-unaware traditional word DPs (derived from ...

13
March 2009 StatMT '09: Proceedings of the Fourth Workshop on Statistical Machine Translation
Publisher: Association for Computational Linguistics
Bibliometrics:
Citation Count: 7
Downloads (6 Weeks): 0,   Downloads (12 Months): 3,   Downloads (Overall): 55

This paper describes the techniques we explored to improve the translation of news text in the German-English and Hungarian-English tracks of the WMT09 shared translation task. Beginning with a conventional hierarchical phrase-based system, we found benefits for using word segmentation lattices as input, explicit generation of beginning and end of ...

14
October 2008 EMNLP '08: Proceedings of the Conference on Empirical Methods in Natural Language Processing
Publisher: Association for Computational Linguistics
Bibliometrics:
Citation Count: 80
Downloads (6 Weeks): 1,   Downloads (12 Months): 22,   Downloads (Overall): 385

Minimum-error-rate training (MERT) is a bottleneck for current development in statistical machine translation because it is limited in the number of weights it can reliably optimize. Building on the work of Watanabe et al., we explore the use of the MIRA algorithm of Crammer et al. as an alternative to ...

15
March 2005 ECIR'05: Proceedings of the 27th European conference on Advances in Information Retrieval Research
Publisher: Springer-Verlag
Bibliometrics:
Citation Count: 13

Compression-based text classification methods are easy to apply, requiring virtually no preprocessing of the data. Most such methods are character-based, and thus have the potential to automatically capture non-word features of a document, such as punctuation, word-stems, and features spanning more than one word. However, compression-based classification methods have drawbacks ...



The ACM Digital Library is published by the Association for Computing Machinery. Copyright © 2018 ACM, Inc.