Analysis and System Combination of Phrase- and N-Gram-Based Statistical Machine Translation Systems
Abstract
In the framework of the TC-STAR project, we analyze two Statistical Machine Translation systems: a phrase-based one and an N-gram-based one. The analysis compares the translation models in terms of efficiency (number of translation units used in the search and computational time) and examines the errors in each system's output. Finally, we combine both systems and show accuracy improvements.
Related resources
The RWTH machine translation system for IWSLT 2007
The RWTH system for the IWSLT 2007 evaluation is a combination of several statistical machine translation systems. The combination includes phrase-based models, an n-gram translation model and a hierarchical phrase model. We describe the individual systems and the method that was used for combining the system outputs. Compared to our 2006 system, we newly introduce a hierarchical phrase-based tr...
N-Gram-Based Statistical Machine Translation versus Syntax Augmented Machine Translation: Comparison and System Combination
In this paper we compare and contrast two approaches to Machine Translation (MT): the CMU-UKA Syntax Augmented Machine Translation system (SAMT) and UPC-TALP N-gram-based Statistical Machine Translation (SMT). SAMT is a hierarchical syntax-driven translation system underlain by a phrase-based model and a target part parse tree. In N-gram-based SMT, the translation process is based on bilingual ...
TALP phrase-based system and TALP system combination for IWSLT 2006
This paper describes the TALP phrase-based statistical machine translation system, enriched with the statistical machine reordering technique. We also report the combination of this system and the TALP-tuple, the n-gram-based statistical machine translation system. We report the results for all the tasks (Chinese, Arabic, Italian and Japanese to English) in the framework of the third evaluation...
Phrase and Ngram-Based Statistical Machine Translation System Combination
Multiple translations can be computed by one machine translation (MT) system or by different MT systems. We may assume that different MT systems make different errors due to using different models, generation strategies, or tweaks. An investigated technique, inherited from automatic speech recognition (ASR), is the so-called system combination that is based on combining the outputs of multiple...
Phrase-level System Combination for Machine Translation Based on Target-to-Target Decoding
In this paper, we propose a novel lattice-based MT combination methodology that we call Target-to-Target Decoding (TTD). The combination process is carried out as a “translation” from backbone to the combination result. This perspective suggests the use of existing phrase-based MT techniques in the combination framework. We show how phrase extraction rules and confidence estimations inspired fro...