Addressing Problems across Linguistic Levels in SMT: Combining Approaches to Model Morphology, Syntax and Lexical Choice

Authors

  • Alexander M. Fraser
  • Sabine Schulte im Walde
  • Marion Weller-Di Marco
Abstract

Morphological complexity

  • Data sparsity due to uncovered inflected forms
  • Difficulty producing the correct target-side inflection from the available information

Combining approaches

  • Pre-processing (syntactic level): source-side reordering (Gojun and Fraser, 2012)
  • At decoding time (lexical level): discriminative classifier that scores translation rules using source-side context (Tamchyna et al., 2014)
  • Post-processing (morphological level): target-side inflection prediction (Fraser et al., 2012)
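
To make the division of labour among the three levels concrete, the following is a minimal toy sketch of how a reordering step, a context-sensitive lexical choice step, and an inflection step might be chained. It is not the authors' system: every function name, the verb-final reordering rule, the weight-summing scorer, and the tiny lexicon and inflection table are hypothetical stand-ins chosen only for illustration.

```python
from typing import Dict, List, Set, Tuple

# Everything in this block is a hypothetical toy, not the cited systems.


def reorder_source(tokens: List[str], verbs: Set[str]) -> List[str]:
    """Pre-processing (syntactic level): move the first listed verb to the end."""
    for i, tok in enumerate(tokens):
        if tok in verbs:
            return tokens[:i] + tokens[i + 1:] + [tok]
    return list(tokens)


def choose_lemma(
    word: str,
    context: List[str],
    lexicon: Dict[str, Dict[str, Dict[str, float]]],
) -> str:
    """Decoding time (lexical level): pick the candidate lemma whose
    source-context weights sum highest. Unknown words pass through."""
    candidates = lexicon.get(word)
    if not candidates:
        return word

    def score(lemma: str) -> float:
        return sum(candidates[lemma].get(c, 0.0) for c in context)

    return max(candidates, key=score)


def inflect(lemma: str, features: str, table: Dict[Tuple[str, str], str]) -> str:
    """Post-processing (morphological level): look up the surface form for
    a lemma plus feature string, falling back to the bare lemma."""
    return table.get((lemma, features), lemma)


def translate(
    tokens: List[str],
    verbs: Set[str],
    lexicon: Dict[str, Dict[str, Dict[str, float]]],
    features: Dict[str, str],
    table: Dict[Tuple[str, str], str],
) -> List[str]:
    """Chain the three toy steps in the order listed above."""
    reordered = reorder_source(tokens, verbs)
    lemmas = [choose_lemma(t, reordered, lexicon) for t in reordered]
    return [inflect(lemma, features.get(lemma, ""), table) for lemma in lemmas]


if __name__ == "__main__":
    verbs = {"saw"}
    lexicon = {
        "bank": {"Bank": {"money": 1.0}, "Ufer": {"river": 1.0}},
        "river": {"Fluss": {}},
        "saw": {"sehen": {}},
    }
    features = {"sehen": "past.3sg"}
    table = {("sehen", "past.3sg"): "sah"}
    print(translate(["saw", "river", "bank"], verbs, lexicon, features, table))
    # -> ['Fluss', 'Ufer', 'sah']
```

In this toy run, the source context word "river" selects "Ufer" over "Bank", and the inflection table turns the lemma "sehen" into "sah". A real system would replace each step with a trained model.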

Similar resources

A Hybrid Machine Translation System Based on a Monotone Decoder

In this paper, a hybrid Machine Translation (MT) system is proposed by combining the result of a rule-based machine translation (RBMT) system with a statistical approach. The RBMT uses a set of linguistic rules for translation, which leads to better translation results in terms of word ordering and syntactic structure. On the other hand, SMT works better in lexical choice. Therefore, in our sys...

Code-Copying in the Balochi Language of Sistan

This empirical study deals with language contact phenomena in Sistan. Code-copying is viewed as a strategy of linguistic behavior when a dominated language acquires new elements in lexicon, phonology, morphology, syntax, pragmatic organization, etc., which can be interpreted as copies of a dominating language. In this framework Persian is regarded as the model code which provides elements for b...

Cognitive Task Complexity and Iranian EFL Learners’ Written Linguistic Performance across Writing Proficiency Levels

Recently, tasks, as the basic units of syllabi, and cognitive complexity, as the criterion for sequencing them, have attracted the attention of many second language researchers. This study sought to explore the effect of cognitively simple and complex tasks on the linguistic performance of high- and low-proficiency Iranian EFL writers, i.e., fluency, accuracy, lexical complexity, and structura...

Lexical Syntax for Statistical Machine Translation

Statistical Machine Translation (SMT) is by far the most dominant paradigm of Machine Translation. This can be justified by many reasons, such as accuracy, scalability, computational efficiency and fast adaptation to new languages and domains. However, current approaches to Phrase-based SMT lack the ability to produce more grammatical translations and to handle long-range reordering whil...

Linguistic Creativity at Different Levels of Decision in Sentence Production

The shape taken by linguistic creativity at the different levels of decision involved in sentence production (phonetics, rhythm, lexical choice, semantics, syntax and narrative content) is explored in relation to existing computational models of creativity. A general outline of the possibilities is given for each level, and two specific levels: word invention at the lexical level, illustrated by...

Publication date: 2017