Parse disambiguation for a rich HPSG grammar

Authors

  • Kristina Toutanova
  • Christopher D. Manning
  • Stuart M. Shieber
  • Dan Flickinger
  • Stephan Oepen
Abstract

The fine-grained nature of the HPSG representations found in the Redwoods treebank raises novel issues in parse disambiguation relative to more traditional treebanks such as the Penn treebank, which have been the focus of most past work on probabilistic parsing (e.g., Charniak 1997; Collins 1997). The Redwoods treebank is much richer in the representations it makes available. Most similar to Penn treebank parse trees are the phrase structure trees (Figure 1(b)). In this work we have concentrated on the derivation trees (Figure 1(a)), which record the combining rule schemas of the HPSG grammar. The nodes represent, for example, head-complement, head-specifier, and head-adjunct schemas, so the derivation trees differ significantly from phrase structure trees. The preterminals of the derivation trees come from a set of about 8,000 lexical labels and are much finer grained than the Penn treebank labels, which comprise about 45 part-of-speech tags and 27 phrasal node labels.

Similar resources

Unsupervised Parse Selection for HPSG

Parser disambiguation with precision grammars generally takes place via statistical ranking of the parse yield of the grammar using a supervised parse selection model. In the standard process, the parse selection model is trained over a hand-disambiguated treebank, meaning that without a significant investment of effort to produce the treebank, parse selection is not possible. Furthermore, as t...

Stochastic HPSG Parse Disambiguation Using the Redwoods Corpus

This article details our experiments on HPSG parse disambiguation, based on the Redwoods treebank. Using existing and novel stochastic models, we evaluate the usefulness of different information sources for disambiguation: lexical, syntactic, and semantic. We perform careful comparisons of generative and discriminative models using equivalent features and show the consistent advantage of discr...

An HPSG Parser Based on Description Logics

In this paper I present a parser based on Description Logics (DL) for a German HPSG-style fragment. The specified parser relies mainly on the inferential capabilities of the underlying DL system. Given a preferential default extension for DL, disambiguation is achieved by choosing the parse containing a qualitatively minimal number of exceptions.

The Leaf Path Projection View of Parse Trees: Exploring String Kernels for HPSG Parse Selection

We present a novel representation of parse trees as lists of paths (leaf projection paths) from leaves to the top level of the tree. This representation allows us to achieve significantly higher accuracy in the task of HPSG parse selection than standard models, and makes the application of string kernels natural. We define tree kernels via string kernels on projection paths and explore their pe...

Using Lexical and Compositional Semantics to Improve HPSG Parse Selection

Chair of the Supervisory Committee: Dr. Emily Bender, University of Washington. Accurate parse ranking is essential for deep linguistic processing applications and is one of the classic problems for academic research in NLP. Despite significant advances, there remains a big need for improvement, especially for domains where...

Publication date: 2002