Example-Based Machine Translation for Low-Resource Language Using Chunk-String Templates
Abstract
Example-Based Machine Translation (EBMT) for low-resource languages, such as Bengali, suffers from low coverage because parallel corpora are scarce. In this paper, we propose an EBMT system for low-resource languages that uses chunk-string templates (CSTs) and translates unknown words. A CST consists of a chunk in the source language, a string in the target language, and word alignment information. CSTs are prepared automatically from an aligned parallel corpus and WordNet. To translate unknown words, we use the WordNet hypernym tree and an English-Bengali dictionary. If no translation candidate is found, the system transliterates the word. In human evaluation, the proposed EBMT system improved wide coverage by 41 points and quality by 48.81 points.
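The abstract names three CST components and a three-step fallback for unknown words (dictionary lookup, hypernym lookup, transliteration). The sketch below is one possible reading of that description, not the authors' implementation. All names (ChunkStringTemplate, translate_unknown, the toy HYPERNYMS and DICTIONARY tables, the placeholder transliterator) are assumptions made for illustration, and the romanized Bengali strings are illustrative sample data rather than output of the paper's system.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class ChunkStringTemplate:
    """Hypothetical container for the three CST parts named in the abstract."""
    source_chunk: Tuple[str, ...]            # chunk in the source language
    target_string: str                       # string in the target language
    alignment: Tuple[Tuple[int, int], ...]   # (source index, target index) pairs


# Toy stand-ins (assumptions): a real system would query WordNet and a
# full English-Bengali dictionary instead of these small tables.
HYPERNYMS: Dict[str, str] = {"dove": "pigeon", "pigeon": "bird", "bird": "animal"}
DICTIONARY: Dict[str, str] = {"bird": "pakhi"}


def placeholder_transliterate(word: str) -> str:
    """Marks the word for transliteration; a real transliterator would go here."""
    return "<translit:" + word + ">"


def translate_unknown(
    word: str,
    dictionary: Dict[str, str],
    hypernyms: Dict[str, str],
    transliterate: Callable[[str], str],
) -> Tuple[str, str]:
    """Return (translation, method) for a word missing from the templates.

    Walks up the hypernym chain until some ancestor has a dictionary entry;
    falls back to transliteration when the chain is exhausted.
    """
    current = word
    seen = set()  # guards against cycles in the toy hypernym table
    while current is not None and current not in seen:
        if current in dictionary:
            method = "dictionary" if current == word else "hypernym"
            return dictionary[current], method
        seen.add(current)
        current = hypernyms.get(current)
    return transliterate(word), "transliteration"


# Illustrative template: "a bird" paired with a romanized target string.
cst = ChunkStringTemplate(
    source_chunk=("a", "bird"),
    target_string="ekti pakhi",
    alignment=((0, 0), (1, 1)),
)

direct = translate_unknown("bird", DICTIONARY, HYPERNYMS, placeholder_transliterate)
via_hypernym = translate_unknown("dove", DICTIONARY, HYPERNYMS, placeholder_transliterate)
fallback = translate_unknown("Dhaka", DICTIONARY, HYPERNYMS, placeholder_transliterate)
```

Here "dove" has no dictionary entry, so the lookup climbs to "pigeon" and then "bird", which does; "Dhaka" has neither a dictionary entry nor a hypernym in the toy tables, so it is passed to the transliteration step.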
Similar Resources
Template Extraction for a Bidirectional English-Filipino Machine Translation System
A bidirectional English-Filipino Example-based Machine Translation System that learns and uses templates is presented. The system uses machine learning techniques to initially extract templates from a given bilingual corpus. These templates are subsequently used for translating English input text into Filipino and vice versa. The system implements the similarity template learning algorithm perf...
Phrase-Boundary Translation Model Using Shallow Syntactic Labels
Phrase-boundary model for statistical machine translation labels the rules with classes of boundary words on the target side phrases of training corpus. In this paper, we extend the phrase-boundary model using shallow syntactic labels including POS tags and chunk labels. With the priority of chunk labels, the proposed model names non-terminals with shallow syntactic labels on the boundaries of ...
The Best Templates Match Technique For Example Based Machine Translation
It has been proved that large-scale realistic Knowledge Based Machine Translation (KBMT) applications require acquisition of huge knowledge about language and about the world. This knowledge is encoded in computational grammars, lexicons and domain models. Another approach, which avoids the need for collecting and analyzing massive knowledge, is the Example Based approach, which is the topic of...
A Full-Text Experiment in Example-Based Machine Translation
This paper describes an experiment in example-based machine translation (EBMT) on full text. The unit of translation is a text chunk of arbitrary length, in contrast to sentence-level EBMT experiments. Intra- and inter-language matching techniques and metrics used in the experiment are described.
Template-Based English-Filipino Machine Translation System
This paper presents a template-based machine translation system that extracts templates from a given bilingual corpus, then uses these templates to perform bidirectional English-Filipino translations. The system extended the similarity template learning algorithm of Cicekli and Guvenir [2] by refining existing templates and deriving templates from previously learned chunks. Chunk alignment and ...