Lexical gaps and idioms in machine translation
Abstract
This paper describes the treatment of lexical gaps, collocation information and idioms in the English to Portuguese machine translation system PORTUGA. The perspective is strictly bilingual, in the sense that all the problems referenced above are considered to belong to the transfer phase, and not, as in other systems, to analysis or generation. The solution presented invokes a parser for the target language (Portuguese) that analyses the multiword expression selected as the result of lexical transfer and produces the corresponding graph structure. This process seems to bring considerable advantages as far as readability and ease of bilingual dictionary development are concerned, and to furnish maximal flexibility together with minimal storage requirements. Finally, it also provides complete independence between dictionary and grammar formalisms.

Organization

The general architecture of the MT system is first described very briefly, emphasizing the features relevant to a full understanding of the problem at hand. Then the problem is presented, and a literature survey is given. The solution put forward is then described. Finally, we furnish a detailed example, together with some evaluation results.

The structure of the transfer MT system PORTUGA is illustrated in Figure 1. The main characteristics of this English to Portuguese translator are:
• The separation between possible translation (which may be multiple) and best or chosen translation (decided in the "style transfer" module).
• Complete independence between English and Portuguese processing. English analysis is performed by PEG [8].
• A bilingual dictionary kept to a minimum: only the selection conditions for lexical transfer and contrastive knowledge are stored. It should also be mentioned that all information in this dictionary is associated with the translations, and not with the English index, as is usual for lexical transfer in MT.
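The idea in the abstract (store a multiword translation as plain text in the bilingual dictionary, then let a target-language parser build its structure) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from PORTUGA: the names `BILINGUAL`, `PT_LEXICON`, `Node`, `lexical_transfer`, `parse_portuguese` and `transfer` are hypothetical, the toy parser handles only verb, determiner and noun sequences, and the example entry ("shrug" rendered as "encolher os ombros") is ours rather than the paper's.

```python
from __future__ import annotations
from dataclasses import dataclass, field

@dataclass
class Node:
    """One node of the (toy) target-language graph structure."""
    word: str
    pos: str
    children: list[Node] = field(default_factory=list)

# Hypothetical bilingual dictionary: selection conditions are stored with
# each translation, and a multiword translation is kept as plain text.
BILINGUAL = {
    "shrug": [("encolher os ombros", lambda context: True)],
}

# Hypothetical target-language (Portuguese) lexicon used by the toy parser.
PT_LEXICON = {"encolher": "VERB", "os": "DET", "ombros": "NOUN"}

def lexical_transfer(english_lemma, context):
    """Return every translation whose selection condition holds."""
    entries = BILINGUAL.get(english_lemma, [])
    return [text for text, condition in entries if condition(context)]

def parse_portuguese(expression):
    """Toy parser: the first verb is the head, nouns attach to it,
    and determiners attach to the noun that follows them."""
    head = None
    pending_determiners = []
    for word in expression.split():
        pos = PT_LEXICON[word]  # raises KeyError for unknown words
        if pos == "DET":
            pending_determiners.append(Node(word, pos))
        elif pos == "NOUN":
            noun = Node(word, pos, pending_determiners)
            pending_determiners = []
            if head is None:
                head = noun
            else:
                head.children.append(noun)
        elif pos == "VERB" and head is None:
            head = Node(word, pos)
    if head is None:
        raise ValueError("no head found in: " + expression)
    return head

def transfer(english_lemma, context=None):
    """Lexical transfer followed by target-language parsing."""
    return [parse_portuguese(text)
            for text in lexical_transfer(english_lemma, context or {})]
```

In this sketch the dictionary never stores a structure, only a readable string plus a condition, which is one way to picture the claimed independence between dictionary and grammar formalisms: changing the parser's output format would not require touching any dictionary entry.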
Similar resources
A Proposed Standard for the Lexical Representation of Idioms
In this paper I first explain briefly the properties of one type of Multi-Word Expression (MWE), viz., flexible idioms, and how they are dealt with in the Rosetta machine translation system. Taking this as a starting point and generalizing beyond it, I argue that a standardized lexical representation for flexible idioms is not so straightforward. Nevertheless, I make a very concrete proposal for an ...
Equivalency and Non-equivalency of Lexical Items in English Translations of Nahj al-balagha
Lexical items play a key role both in language in general and in translation in particular. Likewise, equivalence is a controversial and widely discussed concept in translation studies. Some theorists deem it to be fundamental in translation theory and define translation in terms of equivalence. The aim of this study is to identify the problems of lexical gaps in two translations of Nahj al-ba...
A Hybrid Machine Translation System Based on a Monotone Decoder
In this paper, a hybrid Machine Translation (MT) system is proposed by combining the result of a rule-based machine translation (RBMT) system with a statistical approach. The RBMT uses a set of linguistic rules for translation, which leads to better translation results in terms of word ordering and syntactic structure. On the other hand, SMT works better in lexical choice. Therefore, in our sys...
Etymology, Contextual Pragmatic Clues, and Lexical Knowledge in L2 Idioms Learning
To investigate the effects of etymological elaboration, contextual pragmatic clues, and lexical knowledge on L2 idioms comprehension and production, 60 male intermediate level EFL students in three groups were selected. Each group was randomly assigned to one treatment condition. Group one participants were presented with the etymological explanation of idioms. In group two, the same idioms wer...