Exact Decoding for Phrase-Based Statistical Machine Translation
Authors
Abstract
The combinatorial space of translation derivations in phrase-based statistical machine translation is given by the intersection between a translation lattice and a target language model. We replace this intractable intersection with a tractable relaxation which incorporates a low-order upper bound on the language model. Exact optimisation is achieved through a coarse-to-fine strategy with connections to adaptive rejection sampling. We perform exact optimisation with unpruned language models of order 3 to 5 and show search-error curves for beam search and cube pruning on standard test sets. This is the first work to tractably tackle exact optimisation with language models of orders higher than 3.
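The role of the upper bound can be illustrated with a much simplified sketch. It is not the method of the paper, which refines a relaxation over a translation lattice. Here a best-first search over a small explicit candidate list stands in for that refinement, on the assumption that the same principle applies: candidates are ranked by an optimistic score built from a context-free bound on a bigram model, and the full model is consulted only for candidates whose optimistic score could still beat the best exact score found so far. The bigram table, the candidate list, the translation scores, and all function names are hypothetical.

```python
import heapq

# Hypothetical toy bigram log-probabilities; unseen pairs receive UNSEEN.
BIGRAM = {
    ("<s>", "the"): -0.5, ("the", "cat"): -1.0, ("the", "dog"): -1.5,
    ("cat", "sleeps"): -0.7, ("dog", "sleeps"): -0.9, ("<s>", "cat"): -3.0,
    ("cat", "the"): -4.0, ("sleeps", "the"): -3.5,
}
UNSEEN = -6.0

# Hypothetical candidates: (target words, translation-model log-score).
CANDIDATES = [
    (("the", "cat", "sleeps"), -1.0),
    (("cat", "the", "sleeps"), -0.5),
    (("the", "dog", "sleeps"), -1.2),
]

def bigram(prev, word):
    """Full (order-2) language model score of `word` after `prev`."""
    return BIGRAM.get((prev, word), UNSEEN)

def upper_bounds():
    """Low-order bound: ub[w] = max over all contexts v of bigram(v, w)."""
    ub = {}
    for (_, word), logp in BIGRAM.items():
        ub[word] = max(ub.get(word, UNSEEN), logp)
    return ub

def true_score(words, tm):
    """Exact score: translation score plus the full bigram score."""
    prev, total = "<s>", tm
    for word in words:
        total += bigram(prev, word)
        prev = word
    return total

def exact_best(candidates):
    """Return (best words, exact score, number of exact evaluations)."""
    ub = upper_bounds()
    # Max-heap on the optimistic score (negated for heapq's min-heap).
    heap = [(-(tm + sum(ub.get(w, UNSEEN) for w in words)), words, tm)
            for words, tm in candidates]
    heapq.heapify(heap)
    best, best_score, evaluated = None, float("-inf"), 0
    # Stop once no remaining optimistic score can beat the best exact score.
    while heap and -heap[0][0] > best_score:
        _, words, tm = heapq.heappop(heap)
        score = true_score(words, tm)
        evaluated += 1
        if score > best_score:
            best, best_score = words, score
    return best, best_score, evaluated

best, score, evaluated = exact_best(CANDIDATES)
print(best, round(score, 3), evaluated)
```

Because the optimistic score never falls below the exact score, the search can stop without scoring every candidate and still return the exact optimum. In this toy setting the third candidate is never scored with the full model.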
Similar resources
NiuTrans: An Open Source Toolkit for Phrase-based and Syntax-based Machine Translation
We present a new open source toolkit for phrase-based and syntax-based machine translation. The toolkit supports several state-of-the-art models developed in statistical machine translation, including the phrase-based model, the hierarchical phrase-based model, and various syntax-based models. The key innovation provided by the toolkit is that the decoder can work with various grammars and offers...
Phrase-based Machine Translation using Multiple Preordering Candidates
In this paper, we propose a new decoding method for phrase-based statistical machine translation which directly uses multiple preordering candidates as a graph structure. Compared with previous phrase-based decoding methods, our method is based on a simple left-to-right dynamic programming in which no decoding-time reordering is performed. As a result, its runtime is very fast and implementing ...
Incremental Decoding for Phrase-Based Statistical Machine Translation
In this paper we focus on the incremental decoding for a statistical phrase-based machine translation system. In incremental decoding, translations are generated incrementally for every word typed by a user, instead of waiting for the entire sentence as input. We introduce a novel modification to the beam-search decoding algorithm for phrase-based MT to address this issue, aimed at efficient co...
Improving Neural Machine Translation through Phrase-based Forced Decoding
Compared to traditional statistical machine translation (SMT), neural machine translation (NMT) often sacrifices adequacy for the sake of fluency. We propose a method to combine the advantages of traditional SMT and NMT by exploiting an existing phrase-based SMT model to compute the phrase-based decoding cost for an NMT output and then using this cost to rerank the n-best NMT outputs. The main ...
Climbing Mount BLEU: The Strange World of Reachable High-BLEU Translations
We present a method for finding oracle BLEU translations in phrase-based statistical machine translation using exact document-level scores. Experiments are presented where the BLEU score of a candidate translation is directly optimised in order to examine the properties of reachable translations with very high BLEU scores. This is achieved by running the document-level decoder Docent in BLEU-dec...