Discriminative Feature-Tied Mixture Modeling for Statistical Machine Translation

Authors

  • Bing Xiang
  • Abraham Ittycheriah
Abstract

In this paper we present a novel discriminative mixture model for statistical machine translation (SMT). We model the feature space with a log-linear combination of multiple mixture components. Each component contains a large set of features trained in a maximum-entropy framework. All features within the same mixture component are tied and share the same mixture weight, and the mixture weights are trained discriminatively to maximize translation performance. This approach aims to bridge the gap between maximum-likelihood training and discriminative training for SMT. We show that the feature space can be partitioned in a variety of ways, for example by feature type, word alignment, or domain, to suit different applications. The proposed approach significantly improves translation performance on a large-scale Arabic-to-English MT task.
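
The abstract gives no formulas, so the following is only a minimal sketch of one plausible reading of "feature-tied" scoring: each feature keeps its own maximum-entropy weight, and every feature in a component is scaled by that component's single mixture weight, giving an unnormalized score of the form sum_k lambda_k * sum_{i in C_k} w_i * f_i. The function name, feature names, component labels, and all numeric values are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def tied_mixture_score(active_features, maxent_weights, component_of, mixture_weights):
    """Unnormalized log-linear score with tied mixture weights.

    Assumed form (not confirmed by the abstract):
        score = sum_k lambda_k * sum_{i in C_k} w_i * f_i
    """
    # Accumulate the maximum-entropy contribution of each component separately.
    per_component = defaultdict(float)
    for name, value in active_features.items():
        per_component[component_of[name]] += maxent_weights[name] * value
    # Scale each component total by its single shared (tied) mixture weight.
    return sum(mixture_weights[k] * total for k, total in per_component.items())


# Hypothetical feature weights, as if trained beforehand by maximum entropy.
maxent_weights = {
    "lex:kitab->book": 1.2,
    "lex:kitab->write": -0.4,
    "align:src_gap": 0.5,
}
# Hypothetical partition of the feature space by feature type.
component_of = {
    "lex:kitab->book": "lexical",
    "lex:kitab->write": "lexical",
    "align:src_gap": "alignment",
}
# One mixture weight per component; these would be tuned discriminatively.
mixture_weights = {"lexical": 0.7, "alignment": 1.5}

# Features firing on one translation hypothesis.
hypothesis = {"lex:kitab->book": 1.0, "align:src_gap": 1.0}
score = tied_mixture_score(hypothesis, maxent_weights, component_of, mixture_weights)
print(score)
```

Under this reading, discriminative tuning only has to adjust one weight per component rather than one per feature, which is what would keep the tuned parameter set small even when each component holds a large number of features.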

Similar resources

Simulating Discriminative Training for Linear Mixture Adaptation in Statistical Machine Translation

Linear mixture models are a simple and effective technique for performing domain adaptation of translation models in statistical MT. In this paper, we identify and correct two weaknesses of this method. First, we show that standard maximum-likelihood weights are biased toward large corpora, and that a straightforward preprocessing step that down-samples phrase tables can be used to counter this ...

Statistical Modeling for Improved Land Cover Classification

Novel statistical modeling and training techniques are proposed for improving classification accuracy of land cover data acquired by Landsat Thematic Mapper (TM). The proposed modeling techniques consist of joint modeling of spectral feature distributions among neighboring pixels and partial modeling of spectral correlations across TM sensor bands with a set of semi-tied covariance matrices in...

Machine translation in continuous space

We present a different perspective on the machine translation problem that relies upon continuous-space probabilistic models for words and phrases. Within this perspective we propose a method called Tied-Mixture Machine Translation (TMMT) that uses a trainable parametric model employing Gaussian mixture probability density functions to represent word and phrase pairs. In the new perspective, ma...

Modeling the Translation of Predicate-Argument Structure for SMT

Predicate-argument structure contains rich semantic information of which statistical machine translation has not taken full advantage. In this paper, we propose two discriminative, feature-based models to exploit predicate-argument structures for statistical machine translation: 1) a predicate translation model and 2) an argument reordering model. The predicate translation model explores lexical ...

Discriminative Models and Training Methods For Statistical Machine Translation

Statistical Machine Translation (SMT) has been the dominant flavor of Machine Translation (MT) over the last decade. Traditional SMT systems have a pipeline structure in which different kinds of Machine Learning models are employed in different stages. For the translation modeling, most state-of-the-art systems use hybrid models that combine a handful of generative models in a discriminative framew...

Publication date: 2011