Using Self-Trained Bilexical Preferences to Improve Disambiguation Accuracy

Author

  • Gertjan van Noord
Abstract

A method is described to incorporate bilexical preferences between phrase heads, such as selection restrictions, in a Maximum Entropy parser for Dutch. The bilexical preferences are modelled as association rates which are determined on the basis of a very large parsed corpus (about 500M words). We show that the incorporation of such self-trained preferences improves parsing accuracy significantly.
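The abstract does not say how the association rates are computed. As a rough illustration only, the sketch below assumes pointwise mutual information (PMI) between a head and its dependent within one dependency relation, estimated from counts of (head, relation, dependent) triples taken from automatically parsed text. The function name `association_scores`, the triple layout, the `min_count` cutoff, and the Dutch toy words are all hypothetical and not taken from the paper.

```python
import math
from collections import Counter


def association_scores(triples, min_count=1):
    """Score (head, relation, dependent) triples with PMI.

    Assumption: PMI stands in for the "association rates" of the
    abstract, which does not name a specific measure.
    """
    triples = list(triples)
    joint = Counter(triples)                            # count(head, rel, dep)
    head_rel = Counter((h, r) for h, r, _ in triples)   # count(head, rel)
    rel_dep = Counter((r, d) for _, r, d in triples)    # count(rel, dep)
    rel = Counter(r for _, r, _ in triples)             # count(rel)

    scores = {}
    for (h, r, d), n in joint.items():
        if n < min_count:
            continue  # skip rare triples, whose estimates are unreliable
        p_joint = n / rel[r]                 # p(head, dep | rel)
        p_head = head_rel[(h, r)] / rel[r]   # p(head | rel)
        p_dep = rel_dep[(r, d)] / rel[r]     # p(dep | rel)
        scores[(h, r, d)] = math.log(p_joint / (p_head * p_dep))
    return scores


# Toy data: "drinken" (to drink) prefers "melk" (milk) as direct object,
# "kopen" (to buy) prefers "auto" (car).
toy = (
    [("drinken", "obj1", "melk")] * 3
    + [("drinken", "obj1", "auto")]
    + [("kopen", "obj1", "auto")] * 3
    + [("kopen", "obj1", "melk")]
)
toy_scores = association_scores(toy)
```

A positive score marks a head-dependent pair seen more often than chance would predict, and a negative score marks a dispreferred pair. A parser could then use such scores as additional features when ranking competing analyses.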

Similar resources

Self-Trained Bilexical Preferences to Improve Disambiguation Accuracy

A method is described to incorporate bilexical preferences between phrase heads, such as selection restrictions, in a Maximum Entropy parser for Dutch. The bilexical preferences are modelled as association rates which are determined on the basis of a very large parsed corpus (about 500M words). The preferences are incorporated in the Maximum Entropy framework as auxiliary distributions, using ...

Semi-supervised Dependency Parsing using Bilexical Contextual Features from Auto-Parsed Data

We present a semi-supervised approach to improve dependency parsing accuracy by using bilexical statistics derived from auto-parsed data. The method is based on estimating the attachment potential of head-modifier words, by taking into account not only the head and modifier words themselves, but also the words surrounding the head and the modifier. When integrating the learned statistics as fea...

Using Large Monolingual and Bilingual Corpora to Improve Coordination Disambiguation

Resolving coordination ambiguity is a classic hard problem. This paper looks at coordination disambiguation in complex noun phrases (NPs). Parsers trained on the Penn Treebank are reporting impressive numbers these days, but they don’t do very well on this problem (79%). We explore systems trained using three types of corpora: (1) annotated (e.g. the Penn Treebank), (2) bitexts (e.g. Europarl),...

Improving Supervised Sense Disambiguation with Web-Scale Selectors

This paper introduces a method to improve supervised word sense disambiguation performance by including a new class of features which leverage contextual information from large unannotated corpora. This new feature class, selectors, contains words that appear in other corpora with the same local context as a given lexical instance. We show that support vector sense classifiers trained with sele...

Publication date: 2007