ALEPH: an EBMT system based on the preservation of proportional analogies between sentences across languages
Authors
Abstract
We designed, implemented and assessed ALEPH, a pure example-based machine translation system. It strictly makes no use of variables, templates or training, has no explicit transfer component, and does not require any preprocessing of the aligned examples. It relies on a specific operation, the resolution of analogical equations, which neutralizes translation divergences in an elegant way. Starting only from theoretical results, a system comparable with the top IWSLT 2004 results could be built in six months' time. Evaluated on the Unrestricted Data track of IWSLT 2004, our system achieved second place in CE and third place in JE (with the best BLEU for the latter track). For this year’s evaluation campaign, the features of the system allowed its immediate application to all possible language pairs in the C-STAR tracks.
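To give a feel for what resolving an analogical equation A : B :: C : x means, here is a minimal Python sketch. It is not the ALEPH algorithm: it is a deliberately restricted illustration that only handles the case where A and B differ by an ending or by a beginning, and the function names (`common_prefix_len`, `solve_analogy`) are hypothetical.

```python
from typing import Optional


def common_prefix_len(x: str, y: str) -> int:
    """Length of the longest common prefix of x and y."""
    n = 0
    while n < min(len(x), len(y)) and x[n] == y[n]:
        n += 1
    return n


def solve_analogy(a: str, b: str, c: str) -> Optional[str]:
    """Solve a : b :: c : x for x in a restricted, affix-only setting.

    Assumption (not from the source): only ending or beginning
    substitutions are handled; general analogies need a fuller algorithm.
    """
    # Case 1: a and b share a prefix and differ only in their endings.
    p = common_prefix_len(a, b)
    end_a, end_b = a[p:], b[p:]
    if c.endswith(end_a):
        return c[: len(c) - len(end_a)] + end_b

    # Case 2: a and b share a suffix and differ only in their beginnings.
    s = common_prefix_len(a[::-1], b[::-1])
    start_a, start_b = a[: len(a) - s], b[: len(b) - s]
    if c.startswith(start_a):
        return start_b + c[len(start_a):]

    # No solution within this restricted model.
    return None
```

For instance, `solve_analogy("walk", "walked", "talk")` yields `"talked"`, and `solve_analogy("do", "undo", "tie")` yields `"untie"`. The same idea can in principle be applied to sequences of words rather than characters, which is closer to the sentence-level setting described in the abstract.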
Related resources
The ‘purest’ EBMT system
We designed, implemented and assessed an EBMT system that can be dubbed the “purest ever built”: it strictly does not make any use of variables, templates or training, does not have any explicit transfer component, and does not require any preprocessing of the aligned examples. It uses a specific operation, namely proportional analogy, that implicitly neutralises divergences between languages a...
The ‘purest’ EBMT system ever
We designed, implemented and assessed an EBMT system that can be dubbed the “purest ever built”: it strictly does not make any use of variables, templates or training, does not have any explicit transfer component, and does not require any preprocessing of the aligned examples. It uses a specific operation, namely proportional analogy, that implicitly neutralises divergences between languages a...
Word Selection for EBMT based on Monolingual Similarity and Translation Confidence
We propose a method of constructing an example-based machine translation (EBMT) system that exploits a content-aligned bilingual corpus. First, the sentences and phrases in the corpus are aligned across the two languages, and the pairs with high translation confidence are selected and stored in the translation memory. Then, for a given input sentence, the system searches for fitting examples b...
Identification of Divergence for English to Hindi EBMT
Divergence is a key aspect of translation between two languages. Divergence occurs when structurally similar sentences of the source language do not translate into sentences that are similar in structures in the target language. Divergence assumes special significance in the domain of Example-Based Machine Translation (EBMT). An EBMT system generates translation of a given sentence by retrievin...
Marker-based Chunking for Analogy-based Translation of Chunks
An example-based machine translation (EBMT) system based on analogies requires numerous analogies between linguistic units to work properly. Consequently, long sentences cannot be handled directly in such a framework. In this paper, we inspect the quality of translation of chunks obtained by marker-based chunking in English and French in both directions. Our results show that more than three qu...