Ensembling Factored Neural Machine Translation Models for Automatic Post-Editing and Quality Estimation
Abstract
This work presents a novel approach to Automatic Post-Editing (APE) and Word-Level Quality Estimation (QE) using ensembles of specialized Neural Machine Translation (NMT) systems. Word-level features that have proven effective for QE are included as input factors, expanding the representation of the original source and the machine translation hypothesis, which are used to generate an automatically post-edited hypothesis. We train a suite of NMT models that use different input representations but share the same output space. These models are then ensembled and tuned for both the APE and the QE task. We thus attempt to connect the state-of-the-art approaches to APE and QE within a single framework. Our models achieve state-of-the-art results in both tasks; the only difference between the two settings is the tuning step, which learns a weight for each component of the ensemble.
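The ensembling idea in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that each model emits next-token log-probabilities over the same output vocabulary and that the ensemble combines them as a weighted log-linear sum followed by renormalization. The function name `ensemble_log_probs`, the toy distributions, and the weight values are hypothetical and not taken from the paper; in the paper the weights are learned in a tuning step, which is not shown here.

```python
import math


def ensemble_log_probs(model_log_probs, weights):
    """Weighted log-linear combination of per-model next-token log-probabilities.

    model_log_probs: one list of log-probabilities per model, all over the
                     same (shared) output vocabulary.
    weights:         one weight per model (hypothetical values; in practice
                     they would come from a tuning step).
    Returns a renormalized list of log-probabilities.
    """
    vocab_size = len(model_log_probs[0])
    scores = [
        sum(w * lp[i] for w, lp in zip(weights, model_log_probs))
        for i in range(vocab_size)
    ]
    # Renormalize with a numerically stable log-sum-exp.
    top = max(scores)
    log_z = top + math.log(sum(math.exp(s - top) for s in scores))
    return [s - log_z for s in scores]


# Toy example: two models with different input representations but the same
# three-token output vocabulary.
model_a = [math.log(p) for p in (0.7, 0.2, 0.1)]
model_b = [math.log(p) for p in (0.1, 0.6, 0.3)]

# Two hypothetical weight settings, standing in for task-specific tuning.
weights_task_1 = (0.8, 0.2)
weights_task_2 = (0.2, 0.8)

combined_1 = ensemble_log_probs([model_a, model_b], weights_task_1)
combined_2 = ensemble_log_probs([model_a, model_b], weights_task_2)
best_1 = max(range(3), key=lambda i: combined_1[i])
best_2 = max(range(3), key=lambda i: combined_2[i])
```

The same set of component models is reused in both calls; only the weight vector changes, which mirrors the abstract's statement that the two tasks differ only in the tuning step.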
Similar resources
The FBK Participation in the WMT 2016 Automatic Post-editing Shared Task
In this paper, we present a novel approach to combine the two variants of phrase-based APE (monolingual and context-aware) by a factored machine translation model that is able to leverage benefits from both. Our factored APE models include part-of-speech-tag and class-based neural language models (LM) along with statistical word-based LM to improve the fluency of the post-edits. These models are ...
Neural Post-Editing Based on Quality Estimation
Automatic post-editing (APE) is a challenging task in the WMT evaluation campaign. Analyzing the training set of the WMT17 APE en-de task, we find that most machine translation outputs require only a small number of edit operations. Based on this statistical analysis, two neural post-editing (NPE) models are trained depending on the number of edits: single edit and minor edits. The improve...
LIUM Machine Translation Systems for WMT17 News Translation Task
This paper describes LIUM submissions to WMT17 News Translation Task for English↔German, English↔Turkish, English→Czech and English→Latvian language pairs. We train BPE-based attentive Neural Machine Translation systems with and without factored outputs using the open source nmtpy framework. Competitive scores were obtained by ensembling various systems and exploiting the availability of target...
Findings of the 2017 Conference on Machine Translation (WMT17)
This paper presents the results of the WMT17 shared tasks, which included three machine translation (MT) tasks (news, biomedical, and multimodal), two evaluation tasks (metrics and run-time estimation of MT quality), an automatic post-editing task, a neural MT training task, and a bandit learning task.
Japanese to English/Chinese/Korean Datasets for Translation Quality Estimation and Automatic Post-Editing
Aiming at facilitating the research on quality estimation (QE) and automatic post-editing (APE) of machine translation (MT) outputs, especially for those among Asian languages, we have created new datasets for Japanese to English, Chinese, and Korean translations. As the source text, actual utterances in Japanese were extracted from the log data of our speech translation service. MT outputs wer...