An Empirical Comparison of Parsers in Constraining Reordering for E-J Patent Machine Translation
Abstract
Machine translation of patent documents is very important from a practical point of view. One of the key technologies for improving machine translation quality is the utilization of syntax. It is difficult to select the appropriate parser for English to Japanese patent machine translation because the effects of each parser on patent translation are not clear. This paper provides an empirical comparative evaluation of several state-of-the-art parsers for English, focusing on the effects on patent machine translation from English to Japanese. We add syntax to a method that constrains the reordering of noun phrases for phrase-based statistical machine translation. There are two methods for obtaining the noun phrases from input sentences: 1) an input sentence is directly parsed by a parser and 2) noun phrases from an input sentence are determined by a method using the parsing results of the context document that contains the input sentence. We measured how much each parser contributed to improving the translation quality for each of the two methods and how much a combination of parsers contributed to improving the translation quality for the second method. We conducted experiments using the NTCIR-8 patent translation task dataset. Most of the parsers improved translation quality. Combinations of parsers using the method based on context documents achieved the best translation quality.
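The first of the two ways of obtaining noun phrases (parsing the input sentence directly) can be illustrated with a minimal sketch. Several things here are assumptions rather than details from the paper: the parser output is taken to be a Penn-Treebank-style bracketed string, only maximal (outermost) NPs longer than one word are kept, and the constraint is written as `<zone>` markup in the style documented for the Moses decoder. The function names `parse_brackets`, `np_spans`, and `mark_zones` are hypothetical.

```python
import re

def parse_brackets(s):
    """Parse a bracketed parse string into nested (label, children) tuples."""
    tokens = re.findall(r"\(|\)|[^\s()]+", s)
    pos = 0

    def node():
        nonlocal pos
        assert tokens[pos] == "("
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])  # a leaf word
                pos += 1
        pos += 1  # consume ")"
        return (label, children)

    return node()

def np_spans(tree):
    """Return (words, spans); spans are [start, end) word indices of maximal NPs."""
    words, spans = [], []

    def walk(n, inside_np):
        label, children = n
        start = len(words)
        is_np = label == "NP"
        for child in children:
            if isinstance(child, str):
                words.append(child)
            else:
                walk(child, inside_np or is_np)
        # Assumption: keep only outermost NPs, not NPs nested inside another NP.
        if is_np and not inside_np:
            spans.append((start, len(words)))

    walk(tree, False)
    return words, spans

def mark_zones(words, spans):
    """Wrap each multi-word NP span in <zone> ... </zone> markers."""
    multi = [(s, e) for s, e in spans if e - s > 1]
    starts = {s for s, _ in multi}
    ends = {e for _, e in multi}
    out = []
    for i, w in enumerate(words):
        if i in starts:
            out.append("<zone>")
        out.append(w)
        if i + 1 in ends:
            out.append("</zone>")
    return " ".join(out)
```

Called on a hypothetical parse such as `(S (NP (DT the) (NN valve)) (VP (VBZ controls) (NP (DT the) (NN fuel) (NN flow))))`, the sketch marks "the valve" and "the fuel flow" as zones, so a decoder that honours the markup would keep each span contiguous during reordering. The second, context-document method described in the abstract is not covered by this sketch.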
Related resources
A Comparison Study of Parsers for Patent Machine Translation
Machine translation of patent documents is very important from a practical point of view. One of the key technologies for improving machine translation quality is the utilization of syntax. It is difficult to select the appropriate parser for patent translation because the effects of each parser on patent translation are not clear. This paper provides comparative evaluation of several state-of-...
Analyzing the Influence of Parsing Errors on Pre-reordering Performance for SMT
Word alignment for long distance language pairs is problematic in state-of-the-art phrase-based statistical machine translation. Linguistically motivated reordering models have been widely studied to conquer this challenge. One of the most popular and effective methods is called pre-reordering, where words in sentences from the source language are re-arranged with the objective to resemble the w...
Pre-reordering Model of Chinese Special Sentences for Patent Machine Translation
Chinese prepositions play an important role in sentence reordering, especially in patent texts. In this paper, a rule-based model is proposed to deal with the long distance reordering of sentences with special prepositions. We firstly identify the prepositions and their syntax levels. After that, sentences are parsed and transformed to be much closer to English word order with reordering rules....
Using unlabeled dependency parsing for pre-reordering for Chinese-to-Japanese statistical machine translation
Chinese and Japanese have a different sentence structure. Reordering methods are effective, but need reliable parsers to extract the syntactic structure of the source sentences. However, Chinese has a loose word order, and Chinese parsers that extract the phrase structure do not perform well. We propose a framework where only POS tags and unlabeled dependency parse trees are necessary, and ling...
Training dependency parsers by jointly optimizing multiple objectives
We present an online learning algorithm for training parsers which allows for the inclusion of multiple objective functions. The primary example is the extension of a standard supervised parsing objective function with additional loss-functions, either based on intrinsic parsing quality or task-specific extrinsic measures of quality. Our empirical results show how this approach performs for two...
Journal: JIP
Volume: 20
Publication date: 2012