Using Self-Trained Bilexical Preferences to Improve Disambiguation Accuracy
Author
Abstract
A method is described to incorporate bilexical preferences between phrase heads, such as selection restrictions, in a Maximum-Entropy parser for Dutch. The bilexical preferences are modelled as association rates which are determined on the basis of a very large parsed corpus (about 500M words). We show that the incorporation of such self-trained preferences improves parsing accuracy significantly.
Similar resources
Self-Trained Bilexical Preferences to Improve Disambiguation Accuracy
A method is described to incorporate bilexical preferences between phrase heads, such as selection restrictions, in a Maximum-Entropy parser for Dutch. The bilexical preferences are modelled as association rates which are determined on the basis of a very large parsed corpus (about 500M words). The preferences are incorporated in the Maximum-Entropy framework as auxiliary distributions, using ...
Semi-supervised Dependency Parsing using Bilexical Contextual Features from Auto-Parsed Data
We present a semi-supervised approach to improve dependency parsing accuracy by using bilexical statistics derived from auto-parsed data. The method is based on estimating the attachment potential of head-modifier words, by taking into account not only the head and modifier words themselves, but also the words surrounding the head and the modifier. When integrating the learned statistics as fea...
Using Large Monolingual and Bilingual Corpora to Improve Coordination Disambiguation
Resolving coordination ambiguity is a classic hard problem. This paper looks at coordination disambiguation in complex noun phrases (NPs). Parsers trained on the Penn Treebank are reporting impressive numbers these days, but they don’t do very well on this problem (79%). We explore systems trained using three types of corpora: (1) annotated (e.g. the Penn Treebank), (2) bitexts (e.g. Europarl),...
Improving Supervised Sense Disambiguation with Web-Scale Selectors
This paper introduces a method to improve supervised word sense disambiguation performance by including a new class of features which leverage contextual information from large unannotated corpora. This new feature class, selectors, contains words that appear in other corpora with the same local context as a given lexical instance. We show that support vector sense classifiers trained with sele...