Two Word Reordering Strategies in English-to-Chinese Translation

Authors

  • Fei Su
  • Jin Huang
Abstract

In English-to-Chinese machine translation, reordering mistakes are frequently caused by mislocated prepositional phrases (PPs). In English, a PP usually follows the host it attaches to, while in Chinese the order is reversed. Recent phrase-based MT approaches tend to ignore such syntactic information. In our work, we propose two strategies to resolve this problem: first, for MT that adopts the lexical reordering model, we change the reordering orientations from Monotone-Swap-Discontinuous (MSD) to Monotone-LeftDiscontinuous-RightDiscontinuous (MLR), which guides the reordering direction of PPs more effectively; second, for MT using the pre-reordering approach, we apply PP attachment disambiguation to find the host of each PP and then pre-reorder the PPs precisely. The superiority of the two approaches is verified in our empirical studies. Our work has already been applied in the Youdao online translation system (http://fanyi.youdao.com).
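The difference between the two orientation schemes can be illustrated with a minimal sketch. The abstract does not define MLR precisely, so the code below assumes that MLR keeps the monotone class, splits discontinuous jumps by direction, and folds swap into the left-discontinuous class. The function names and the span representation are hypothetical and are not taken from the paper.

```python
from typing import Tuple

# Inclusive (start, end) source-word indices covered by one phrase.
Span = Tuple[int, int]


def msd_orientation(prev: Span, cur: Span) -> str:
    """Standard MSD orientation of `cur` relative to the previously
    translated source phrase `prev` (spans assumed not to overlap)."""
    if cur[0] == prev[1] + 1:
        return "monotone"       # cur starts right after prev
    if cur[1] == prev[0] - 1:
        return "swap"           # cur ends right before prev
    return "discontinuous"      # any other jump, direction lost


def mlr_orientation(prev: Span, cur: Span) -> str:
    """ASSUMED MLR scheme: swap is folded into left-discontinuous,
    and every non-monotone jump keeps its direction."""
    if cur[0] == prev[1] + 1:
        return "monotone"
    if cur[0] > prev[1]:
        return "right-discontinuous"  # jump forward in the source
    return "left-discontinuous"       # jump backward in the source


# "He read a book in the library" with the PP "in the library" at
# source positions 4-6, translated in the order: He, PP, read, a book.
order = [(0, 0), (4, 6), (1, 1), (2, 3)]
for prev, cur in zip(order, order[1:]):
    print(prev, cur, msd_orientation(prev, cur), mlr_orientation(prev, cur))
```

Under these assumptions, MSD labels both the jump forward to the PP and the jump back to the verb as discontinuous, whereas the MLR labels keep the direction of each jump, which is the information needed to place a PP before its host.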

Related resources

Chinese Syntactic Reordering for Statistical Machine Translation

Syntactic reordering approaches are an effective method for handling word-order differences between source and target languages in statistical machine translation (SMT) systems. This paper introduces a reordering approach for translation from Chinese to English. We describe a set of syntactic reordering rules that exploit systematic differences between Chinese and English word order. The result...

Reordered Search and Tuple Unfolding for Ngram-based SMT

In Statistical Machine Translation, the use of reordering for certain language pairs can produce a significant improvement in translation accuracy. However, the search problem is shown to be NP-hard when arbitrary reorderings are allowed. This paper addresses the question of reordering for an Ngram-based SMT approach following two complementary strategies, namely reordered search and tuple unfo...

The application of source language information in Chinese-English statistical machine translation

The quality of machine translation (MT) has been significantly improved by using statistical approaches. The integration of syntactic knowledge into a statistical MT system is still an open problem. This talk investigates the application of syntactic knowledge of the source language to the phrase-based MT system for translating Chinese into English. In this thesis, particular issues have been a...

Rule-Based Preordering on Multiple Syntactic Levels in Statistical Machine Translation

We propose a novel data-driven rule-based preordering approach, which uses the tree information of multiple syntactic levels. This approach extends tree-based reordering from one level to multiple levels, which lets it handle more complicated reordering cases. We have conducted experiments in English-to-Chinese and Chinese-to-English translation directions. Our results show ...

To Swap or Not to Swap? Exploiting Dependency Word Pairs for Reordering in Statistical Machine Translation

Reordering poses a major challenge in machine translation (MT) between two languages with significant differences in word order. In this paper, we present a novel reordering approach utilizing sparse features based on dependency word pairs. Each instance of these features captures whether two words, which are related by a dependency link in the source sentence dependency parse tree, follow the ...

Publication date: 2013