Cross-lingual Pronoun Prediction for English, French and German with Maximum Entropy Classification
Abstract
We present our submission to the cross-lingual pronoun prediction (CLPP) shared task for English-German and English-French at the First Conference on Machine Translation (WMT16). We trained a Maximum Entropy (MaxEnt) classifier based on features from Wetzel et al. (2015), which we adapted to the new task and applied to a new language pair. Additional features, such as n-grams of the pronoun context and prediction of NULL-translations, proved helpful to varying degrees. Experiments with a sequence classifier over pronoun sequences did not show any improvements. Our submission is among the top three systems for English-French (61.62% macro-averaged recall) and in the middle range for English-German (48.72%) out of nine submissions.
Similar resources
Feature Exploration for Cross-Lingual Pronoun Prediction
We explore a large number of features for cross-lingual pronoun prediction for translation between English and German/French. We find that features related to German/French are more informative than features related to English, regardless of the translation direction. Our most useful features are local context, dependency head features, and source pronouns. We also find that it is sometimes mor...
Findings of the 2016 WMT Shared Task on Cross-lingual Pronoun Prediction
We describe the design, the evaluation setup, and the results of the 2016 WMT shared task on cross-lingual pronoun prediction. This is a classification task in which participants are asked to provide predictions on what pronoun class label should replace a placeholder value in the target-language text, provided in lemmatised and PoS-tagged form. We provided four subtasks, for the English–French...
Findings of the 2017 DiscoMT Shared Task on Cross-lingual Pronoun Prediction
We describe the design, the setup, and the evaluation results of the DiscoMT 2017 shared task on cross-lingual pronoun prediction. The task asked participants to predict a target-language pronoun given a source-language pronoun in the context of a sentence. We further provided a lemmatized target-language human-authored translation of the source sentence, and automatic word alignments between t...
Pronoun-Focused MT and Cross-Lingual Pronoun Prediction: Findings of the 2015 DiscoMT Shared Task on Pronoun Translation
We describe the design, the evaluation setup, and the results of the DiscoMT 2015 shared task, which included two subtasks, relevant to both the machine translation (MT) and the discourse communities: (i) pronoun-focused translation, a practical MT task, and (ii) cross-lingual pronoun prediction, a classification task that requires no specific MT expertise and is interesting as a machine learni...
It-disambiguation and source-aware language models for cross-lingual pronoun prediction
We present our systems for the WMT 2016 shared task on cross-lingual pronoun prediction. The main contribution is a classifier used to determine whether an instance of the ambiguous English pronoun “it” functions as an anaphoric, pleonastic or event reference pronoun. For the English-to-French task the classifier is incorporated in an extended baseline, which takes the form of a source-aware la...