Quality estimation for translation selection
Authors
Abstract
We describe experiments on using quality estimation to select the best translation among multiple options for a given source sentence. We consider a realistic and challenging setting in which the translation systems used are unknown and no relative quality assessments are available for training the prediction models. Our findings indicate that prediction errors are higher in this blind setting. However, these errors do not have a negative impact on performance when the predictions are used to select the best translation, compared to non-blind settings. This holds even when test conditions (text domains, MT systems) differ from model building conditions. In addition, we experiment with quality prediction for translations produced by both translation systems and human translators. Although the latter are on average of much higher quality, we show that automatically distinguishing the two types of translation is not a trivial problem.
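The selection step described in the abstract can be illustrated with a minimal sketch: score every candidate translation with a quality predictor and keep the highest-scoring one. This is not the authors' implementation. The function names `select_best_translation` and `toy_length_ratio_score` are hypothetical, and the toy scorer is only a stand-in for a trained quality estimation model.

```python
from typing import Callable, Sequence


def select_best_translation(
    source: str,
    candidates: Sequence[str],
    predict_quality: Callable[[str, str], float],
) -> str:
    """Return the candidate with the highest predicted quality score.

    `predict_quality` is assumed to map (source, translation) to a score
    where higher means better. Ties go to the earliest candidate.
    """
    if not candidates:
        raise ValueError("at least one candidate translation is required")
    return max(candidates, key=lambda cand: predict_quality(source, cand))


def toy_length_ratio_score(source: str, translation: str) -> float:
    """Hypothetical stand-in for a trained QE model.

    Prefers translations whose token count is close to the source's.
    A real model would use learned features, not this heuristic.
    """
    src_len = len(source.split())
    tgt_len = len(translation.split())
    return -abs(src_len - tgt_len)


if __name__ == "__main__":
    src = "the cat sits on the mat"
    options = [
        "le chat",
        "le chat est assis sur le tapis",
        "chat tapis assis",
    ]
    print(select_best_translation(src, options, toy_length_ratio_score))
```

If the predictor outputs an error-style label where lower is better, the score would need to be negated before selection. Note that the selector never needs to know which system produced each candidate, which matches the blind setting the abstract describes.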
Similar resources
Quality Estimation-guided Data Selection for Domain Adaptation of SMT
Supplementary data selection is a strongly motivated approach in domain adaptation of statistical machine translation systems. In this paper we report a novel approach of data selection guided by automatic quality estimation. In contrast to the conventional approach of using the entire target-domain data as reference for data selection, we restrict the reference set only to sentences poorly tra...
Quality estimation for Machine Translation output using linguistic analysis and decoding features
We describe a submission to the WMT12 Quality Estimation task, including extensive machine learning experimentation. Data were augmented with features from linguistic analysis and statistical features from the SMT search graph. Several feature selection algorithms were employed. The Quality Estimation problem was addressed both as a regression task and as a discretised classification task, b...
Combining Quality Prediction and System Selection for Improved Automatic Translation Output
This paper presents techniques for reference-free, automatic prediction of Machine Translation output quality at both sentence- and document-level. In addition to helping with document-level quality estimation, sentence-level predictions are used for system selection, improving the quality of the output translations. We present three system selection techniques and perform evaluations that quantify...
An Investigation on the Effectiveness of Features for Translation Quality Estimation
We describe a systematic analysis on the effectiveness of features commonly exploited for the problem of predicting machine translation quality. Using a feature selection technique based on Gaussian Processes, we identify small subsets of features that perform well across many datasets for different language pairs, text domains, machine translation systems and quality labels. In addition, we sh...
HuQ: An English-Hungarian Corpus for Quality Estimation
Quality estimation for machine translation is an important task. The standard automatic evaluation methods that use reference translations cannot perform the evaluation task well enough. These methods produce low correlation with human evaluation for English-Hungarian. Quality estimation is a new approach to solve this problem. This method is a prediction task estimating the quality of translat...
A Quality-based Active Sample Selection Strategy for Statistical Machine Translation
This paper presents a new active learning technique for machine translation based on quality estimation of automatically translated sentences. It uses an error-driven strategy, i.e., it assumes that the more errors an automatically translated sentence contains, the more informative it is for the translation system. Our approach is based on a quality estimation technique which involves a wider r...