Why is German Dependency Parsing More Reliable than Constituent Parsing?

Authors

Sandra Kübler

Jelena Prokić

Abstract

In recent years, research in parsing has extended in several new directions. One of these directions is concerned with parsing languages other than English. Treebanks have become available for many European languages, but also for Arabic, Chinese, or Japanese. However, it was shown that parsing results on these treebanks depend on the types of treebank annotations used [ , ]. Another direction in parsing research is the development of dependency parsers. Dependency parsing profits from the non-hierarchical nature of dependency relations, so lexical information can be included in the parsing process in a much more natural way. Machine learning based approaches are especially successful (cf. e.g. [12, 13]). The results achieved by these dependency parsers are very competitive, although comparisons are difficult because of the differences in annotation. For English, the Penn Treebank [11] has been converted to dependencies. For this version, Nivre et al. [14] report an accuracy rate of 86.3%, as compared to an F-score of 2.1 for Charniak’s parser [1]. The Penn Chinese Treebank [1 ] is also available in both a constituent and a dependency representation. The best results reported for parsing experiments with this treebank give an F-score of 81.8 for the constituent version [2] and . % accuracy for the dependency version [14]. The general trend in comparisons between constituent and dependency parsers is that the dependency parser performs slightly worse than the constituent parser. The only exception occurs for German, where F-scores for constituent plus grammatical function parses range between 51.4 and 5.3, depending on the treebank, NEGRA [1 ] or TüBa-D/Z [1 ]. The dependency parser based on a converted version of TüBa-D/Z, in contrast, reached an accuracy of 3.4% [14], i.e. 12 percentage points better than the best constituent analysis including grammatical functions.

Similar resources

An improved joint model: POS tagging and dependency parsing

Dependency parsing is a form of syntactic parsing that automatically analyzes the dependency structure of sentences, creating a dependency graph for each input sentence. Part-Of-Speech (POS) tagging is a prerequisite for dependency parsing. Generally, dependency parsers do the POS tagging task along with dependency parsing in a pipeline mode. Unfortunately, in pipel...

Experiments with Easy-first nonprojective constituent parsing

Less-configurational languages such as German often show not just morphological variation but also free word order and nonprojectivity. German is not exceptional in this regard, as other morphologically-rich languages such as Czech, Tamil or Greek, offer similar challenges that make context-free constituent parsing less attractive. Advocates of dependency parsing have long pointed out that the ...

Language Independent Dependency to Constituent Tree Conversion

We present a dependency to constituent tree conversion technique that aims to improve constituent parsing accuracies by leveraging dependency treebanks available in a wide variety in many languages. The technique works in two steps. First, a partial constituent tree is derived from a dependency tree with a very simple deterministic algorithm that is both language and dependency type independent...

Universal Dependency Annotation for Multilingual Parsing

We present a new collection of treebanks with homogeneous syntactic dependency annotation for six languages: German, English, Swedish, Spanish, French and Korean. To show the usefulness of such a resource, we present a case study of crosslingual transfer parsing with more reliable evaluation than has been possible before. This ‘universal’ treebank is made freely available in order to facilitate...

The PaGe 2008 Shared Task on Parsing German

The ACL 2008 Workshop on Parsing German features a shared task on parsing German. The goal of the shared task was to find reasons for the radically different behavior of parsers on the different treebanks and between constituent and dependency representations. In this paper, we describe the task and the data sets. In addition, we provide an overview of the test results and a first analysis.

Publication date: 2006
