Cross-lingual Pronoun Prediction with Linguistically Informed Features
Abstract
We present LIMSI’s cross-lingual pronoun prediction system for the WMT 2016 shared task. We use high-level linguistic features with explicit coreference resolution and expletive detection, relying on dependency annotations and a morphological lexicon. We show that our small set of carefully chosen features performs significantly better than several language model baselines and is competitive with the other submitted systems.
Similar resources
Pronoun Prediction with Linguistic Features and Example Weighing
We present a system submitted to the WMT16 shared task in cross-lingual pronoun prediction, in particular, to the English-to-German and German-to-English sub-tasks. The system is based on a linear classifier making use of features both from the target language model and from linguistically analyzed source and target texts. Furthermore, we apply example weighing in classifier learning, which prov...
Feature Exploration for Cross-Lingual Pronoun Prediction
We explore a large number of features for cross-lingual pronoun prediction for translation between English and German/French. We find that features related to German/French are more informative than features related to English, regardless of the translation direction. Our most useful features are local context, dependency head features, and source pronouns. We also find that it is sometimes mor...
Baseline Models for Pronoun Prediction and Pronoun-Aware Translation
This paper presents baseline models for the cross-lingual pronoun prediction task and the pronoun-focused translation task at DiscoMT 2015. We present simple yet effective classifiers for the former and discuss the impact of various contextual features on the prediction performance. In the translation task we rely on the document-level decoder Docent and a cross-sentence target language-model o...
A Linear Baseline Classifier for Cross-Lingual Pronoun Prediction
This paper presents baseline models using linear classifiers for the pronoun translation task at WMT 2016. We explore various local context features and include history features of potential antecedents extracted by means of a simple PoS-matching strategy. The results show the difficulties of the task in general but also represent valuable baselines to compare other more-informed systems with. O...
Cross-lingual Pronoun Prediction for English, French and German with Maximum Entropy Classification
We present our submission to the cross-lingual pronoun prediction (CLPP) shared task for English-German and English-French at the First Conference on Machine Translation (WMT16). We trained a Maximum Entropy (MaxEnt) classifier based on features from Wetzel et al. (2015), which we adapted to the new task and applied to a new language pair. Additional features such as n-grams of the pronoun context...