Parser Accuracy in Quality Estimation of Machine Translation: A Tree Kernel Approach
Abstract
We report on experiments designed to investigate the role of syntactic features in the task of quality estimation for machine translation, focusing on the effect of parser accuracy. Tree kernels are used to predict the segment-level BLEU score of English-French translations. To examine how the accuracy of the parse trees affects the accuracy of the quality estimation system, we experiment with various parsing systems which differ substantially with respect to their Parseval F-scores. We find that the choice of parsing system makes very little difference in the quality estimation task; this is particularly apparent for source-side English parse trees.
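To make the notion of parser accuracy concrete, here is a minimal sketch of a Parseval-style labeled bracketing F-score. It is an illustration only, not the evaluation setup used in the paper: the tuple-based tree encoding and the names `labeled_spans` and `parseval_f1` are assumptions introduced here, and the sketch omits conventions of standard scoring tools such as punctuation handling.

```python
from collections import Counter


def labeled_spans(tree, start=0):
    """Return (Counter of (label, start, end) brackets, end index).

    Assumed encoding: a tree is a tuple (label, child, ...) and a leaf is a
    plain string token. Preterminals (nodes whose only child is a token)
    are skipped, so only phrase-level brackets are counted.
    """
    label, children = tree[0], tree[1:]
    spans = Counter()
    pos = start
    for child in children:
        if isinstance(child, str):
            pos += 1  # a token advances the position by one
        else:
            child_spans, pos = labeled_spans(child, pos)
            spans += child_spans
    is_preterminal = len(children) == 1 and isinstance(children[0], str)
    if not is_preterminal:
        spans[(label, start, pos)] += 1
    return spans, pos


def parseval_f1(gold, predicted):
    """Harmonic mean of labeled bracket precision and recall."""
    gold_spans, _ = labeled_spans(gold)
    pred_spans, _ = labeled_spans(predicted)
    # Multiset intersection: a bracket matches at most as often as it
    # occurs in both trees.
    matched = sum((gold_spans & pred_spans).values())
    precision = matched / sum(pred_spans.values()) if pred_spans else 0.0
    recall = matched / sum(gold_spans.values()) if gold_spans else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical gold and predicted parses of "the cat sat".
gold = ("S", ("NP", ("DT", "the"), ("NN", "cat")), ("VP", ("VBD", "sat")))
predicted = ("S", ("NP", ("DT", "the")), ("VP", ("NN", "cat"), ("VBD", "sat")))

score = parseval_f1(gold, predicted)
```

In this toy pair only the `S` bracket matches, so both precision and recall are one third. Parsing systems that differ substantially on this kind of score are what the experiments compare.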
Similar resources
Quality Estimation of English-French Machine Translation: A Detailed Study of the Role of Syntax
We investigate the usefulness of syntactic knowledge in estimating the quality of English-French translations. We find that dependency and constituency tree kernels perform well but the error rate can be further reduced when these are combined with hand-crafted syntactic features. Both types of syntactic features provide information which is complementary to tried-and-tested nonsyntactic featur...
UHH Submission to the WMT17 Quality Estimation Shared Task
The field of Quality Estimation (QE) aims to provide automatic methods for evaluating Machine Translation (MT) that do not require reference translations. We present our submission to the sentence-level WMT17 Quality Estimation Shared Task. It combines tree and sequence kernels for predicting the post-editing effort of the target sentence. The kernels exploi...
Cross-language Projection of Dependency Trees for Tree-to-tree Machine Translation
Syntax-based machine translation (MT) is an attractive approach for introducing additional linguistic knowledge in corpus-based MT. Previous studies have shown that tree-to-string and string-to-tree translation models perform better than tree-to-tree translation models since tree-to-tree models require two high-quality parsers on the source as well as the target language side. In practice, high ...
Studying impressive parameters on the performance of Persian probabilistic context free grammar parser
In linguistics, a tree bank is a parsed text corpus that annotates syntactic or semantic sentence structure. The exploitation of tree bank data has been important ever since the first large-scale tree bank, The Penn Treebank, was published. However, although originating in computational linguistics, the value of tree bank is becoming more widely appreciated in linguistics research as a whole. F...
Top Accuracy and Fast Dependency Parsing is not a Contradiction
In addition to a high accuracy, short parsing and training times are the most important properties of a parser. However, parsing and training times are still relatively long. To determine why, we analyzed the time usage of a dependency parser. We illustrate that the mapping of the features onto their weights in the support vector machine is the major factor in time complexity. To resolve this p...