Genetic Algorithm-based Multi-Word Automatic Language Translation
Abstract
The quality of an automatic language translation system depends mainly on two components: the alignment approach and the translation model. In this paper, we present an alignment approach that covers one-to-one, one-to-many, many-to-one, and many-to-many alignments, and whose output is used by a translation model based on a genetic algorithm. The translation model searches for the decomposition of a phrase into single- and multi-word units that gives the best translation, while allowing units to contain or overlap one another. The system was used on 17,475 English-French phrases from the European Parliament's debates.
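The search described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the phrase table, its scores, the bit-vector encoding, the fitness function, and the names `PHRASE_TABLE`, `candidate_units`, `fitness`, and `evolve` are all assumptions made for illustration. Each chromosome marks which candidate units (single- or multi-word spans) are selected. Units may overlap or contain one another, and each word is scored by the best selected unit that covers it.

```python
import random

# Hypothetical toy phrase table (not from the paper):
# source unit -> (translation, made-up alignment score).
PHRASE_TABLE = {
    ("the",): ("le", 0.8),
    ("european",): ("européen", 0.6),
    ("parliament",): ("parlement", 0.7),
    ("european", "parliament"): ("parlement européen", 0.95),
}

def candidate_units(words):
    """Return every span (start, end) of the phrase found in the phrase table."""
    return [(i, j)
            for i in range(len(words))
            for j in range(i + 1, len(words) + 1)
            if tuple(words[i:j]) in PHRASE_TABLE]

def fitness(chromosome, units, words, unit_penalty=0.1):
    """Sum, over words, of the best score among selected units covering the word,
    minus a small penalty per selected unit (assumed scoring, for illustration)."""
    best = [0.0] * len(words)
    selected = 0
    for gene, (i, j) in zip(chromosome, units):
        if gene:
            selected += 1
            score = PHRASE_TABLE[tuple(words[i:j])][1]
            for k in range(i, j):
                best[k] = max(best[k], score)
    return sum(best) - unit_penalty * selected

def evolve(words, pop_size=20, generations=30, mutation_rate=0.1, seed=0):
    """Simple genetic algorithm: tournament selection, one-point crossover,
    bit-flip mutation, and elitism (the best individual is always kept)."""
    rng = random.Random(seed)
    units = candidate_units(words)
    n = len(units)

    def score(c):
        return fitness(c, units, words)

    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        next_gen = [population[0][:]]  # elitism
        while len(next_gen) < pop_size:
            p1 = max(rng.sample(population, 3), key=score)
            p2 = max(rng.sample(population, 3), key=score)
            cut = rng.randint(1, n - 1) if n > 1 else 0
            child = p1[:cut] + p2[cut:]
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            next_gen.append(child)
        population = next_gen
    best = max(population, key=score)
    return best, units, score(best)

if __name__ == "__main__":
    phrase = ["the", "european", "parliament"]
    best, units, value = evolve(phrase)
    chosen = [" ".join(phrase[i:j]) for g, (i, j) in zip(best, units) if g]
    print(chosen, round(value, 3))
```

Under this assumed scoring, a decomposition that uses the multi-word unit "european parliament" scores higher than one built only from single words, which is the kind of preference the translation model is meant to discover.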
Similar resources
A Hybrid Machine Translation System Based on a Monotone Decoder
In this paper, a hybrid Machine Translation (MT) system is proposed by combining the result of a rule-based machine translation (RBMT) system with a statistical approach. The RBMT uses a set of linguistic rules for translation, which leads to better translation results in terms of word ordering and syntactic structure. On the other hand, SMT works better in lexical choice. Therefore, in our sys...
A new model for Persian multi-part words edition based on statistical machine translation
Multi-part words in the English language are hyphenated, with the hyphen separating the different parts. The Persian language has multi-part words as well. Based on Persian morphology, a half-space character is needed to separate the parts of multi-part words, but in many cases people incorrectly use a space character instead of the half-space character. This common incorrect use of space leads to some s...
The Correlation of Machine Translation Evaluation Metrics with Human Judgement on Persian Language
Machine Translation Evaluation Metrics (MTEMs) are the central core of Machine Translation (MT) engines as they are developed based on frequent evaluation. Although MTEMs are widespread today, their validity and quality for many languages is still under question. The aim of this research study was to examine the validity and assess the quality of MTEMs from Lexical Similarity set on machine tra...
Phrase-Based Translation Models
In previous lectures we’ve seen IBM translation models 1 and 2. In this note we will describe phrase-based translation models. Phrase-based translation models give much improved translations over the IBM models, and give state-of-the-art translations for many pairs of languages. Crucially, phrase-based translation models allow lexical entries with more than one word on either the source-language...
Automatic Construction of Persian ICT WordNet using Princeton WordNet
WordNet is a large lexical database of English language, in which, nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms (synsets). Each synset expresses a distinct concept. Synsets are interlinked by both semantic and lexical relations. WordNet is essentially used for word sense disambiguation, information retrieval, and text translation. In this paper, we propose s...