Towards a Model of Competence for Corpus-Based Machine Translation
Abstract
A translation is a conversion from a source language into a target language that preserves the meaning. A huge number of techniques and computational approaches have been tried in order to translate natural languages automatically, yet no satisfactory solution has been found. This paper examines approaches to corpus-based machine translation (CBMT). In CBMT, a set of reference example translations is given to the MT system. These are analyzed and compiled into the system's internal representation according to the theory of meaning the system implements. The representations then serve as a basis for translating new sentences. This paper discusses three main approaches in the CBMT paradigm: the memory-based approach (e.g. translation memories (TM)), the example-based approach (EBMT) and the statistics-based approach (SBMT). Concrete CBMT systems are discussed in light of the theory of meaning (preservation) they implement. This discussion then leads to a model of competence for CBMT systems. The paper concludes that CBMT systems can be designed to achieve either high reliability or broad coverage, though the two qualities seem to be mutually exclusive.
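As a rough illustration of the memory-based approach and of the tension between reliability and coverage, here is a minimal sketch of a translation-memory lookup. It is not taken from the paper: the reference pairs, the function names `similarity` and `translate`, and the use of Python's `difflib.SequenceMatcher` as the similarity measure are all assumptions made for this example.

```python
from difflib import SequenceMatcher

# Hypothetical reference translations (English -> German), invented for illustration.
REFERENCE_TRANSLATIONS = {
    "start the program": "Starten Sie das Programm",
    "the printer is out of paper": "Der Drucker hat kein Papier mehr",
}


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]; an assumed stand-in for a real matcher."""
    return SequenceMatcher(None, a, b).ratio()


def translate(sentence: str, memory: dict, threshold: float = 1.0):
    """Return the stored translation of the closest source sentence.

    Returns None when the best match scores below `threshold`.
    threshold=1.0 accepts exact matches only (reliable, narrow coverage);
    lower values accept fuzzy matches (broader coverage, less reliable).
    """
    query = sentence.lower().strip()
    best_source = max(memory, key=lambda src: similarity(query, src))
    if similarity(query, best_source) >= threshold:
        return memory[best_source]
    return None


if __name__ == "__main__":
    print(translate("Start the program", REFERENCE_TRANSLATIONS))
    print(translate("stop the program", REFERENCE_TRANSLATIONS))
    print(translate("stop the program", REFERENCE_TRANSLATIONS, threshold=0.8))
```

With the strict threshold, the unseen sentence "stop the program" gets no translation at all. With the relaxed threshold it is matched to "start the program" and the stored translation is returned, which covers the input but does not preserve its meaning.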
Similar resources
A Model of Competence for Corpus-Based Machine Translation
In this paper I elaborate a model of competence for corpus-based machine translation (CBMT) along the lines of the representations used in the translation system. Representations in CBMT systems can be rich or austere, molecular or holistic, and they can be fine-grained or coarse-grained. The paper shows that different CBMT architectures are required dependent on whether a better translation qualit...
A new model for persian multi-part words edition based on statistical machine translation
Multi-part words in English are hyphenated, with the hyphen separating the different parts. Persian has multi-part words as well. According to Persian morphology, a half-space character is needed to separate the parts of a multi-part word, but in many cases people incorrectly use a space character instead. This common incorrect use of the space leads to some s...
Corpus based coreference resolution for Farsi text
"Coreference resolution", or "finding all expressions that refer to the same entity" in a text, is one of the important requirements in natural language processing. Two words are coreferent when both refer to a single entity in the text or the real world. So the main task of coreference resolution systems is to identify terms that refer to a unique entity. A coreference resolution tool could be...
Translation Strategies in English to Persian Translation of Children's Literature based on Klingberg's Model
This research sought to identify the translation strategies adopted by the translator in the Persian translation of 'whatever after, Fairest of all' by Sarah Mlynowski, based on Klingberg's model (1986). To achieve the objectives of the study, a qualitative content analysis design was selected. The corpus of the study consisted of 60 pages of the novel 'whatever after, Fairest of al...
Promoting Translation Sub-Competences and Identifying the Ranking of Influence among Them
In the first stage of this two-stage empirical research, the authors studied the impact of promoting the translation sub-competences defined by PACTE's Multi-Componential Model for Translation Competence on the promotion of total translation competence. The experiment for this purpose was conducted on a group of Iranian undergraduate students and consisted of their exposure to a targeted sy...