Efficient Statistical Machine Translation Algorithm based on IBM Model 4
Authors
Abstract
This paper describes our methodology for the NTCIR-7 Patent Translation Task and reports the official results on the English-Japanese corpus. Our system is a novel combination of machine translation algorithms, pairing a classical statistical method, the IBM model, with a highly efficient decoding algorithm. The new method produces reasonably good results and runs quickly. It is a candidate for situations in which a reader wants a quick, simple grasp of the main idea of a text.
Similar resources
A new model for Persian multi-part words edition based on statistical machine translation
Multi-part words in English are hyphenated, with the hyphen separating the parts. Persian has multi-part words as well. Under Persian morphology, a half-space character is needed to separate the parts of a multi-part word, but in many cases people incorrectly use the space character instead. This common incorrect use of the space leads to some s...
Simultaneous Word-Morpheme Alignment for Statistical Machine Translation
Current word alignment models for statistical machine translation do not address morphology beyond merely splitting words. We present a two-level alignment model that distinguishes between words and morphemes, in which we embed an IBM Model 1 inside an HMM based word alignment model. The model jointly induces word and morpheme alignments using an EM algorithm. We evaluated our model on Turkish-...
11-928 Master’s Thesis Symmetric Probabilistic Alignment
The CMU Example-Based Machine Translation (EBMT) system has been deployed successfully in many projects for years. Although a good alignment algorithm is essential, since the CMU EBMT system uses parallel corpora, alignment has been studied relatively less than other components of EBMT. For this reason, we developed a new alignment algorithm which uses statistical information drawn from parallel corp...
A Convex Alternative to IBM Model 2
The IBM translation models have been hugely influential in statistical machine translation; they are the basis of the alignment models used in modern translation systems. Excluding IBM Model 1, the IBM translation models, and practically all variants proposed in the literature, have relied on the optimization of likelihood functions or similar functions that are non-convex, and hence have multi...
Revisiting Optimal Decoding for Machine Translation IBM Model 4
This paper revisits optimal decoding for statistical machine translation using IBM Model 4. We show that exact/optimal inference using Integer Linear Programming is more practical than previously suggested when used in conjunction with the Cutting-Plane Algorithm. In our experiments we see that exact inference can provide a gain of up to one BLEU point for sentences of length up to 30 tokens.
Publication date: 2008