Evaluating a German Sketch Grammar: A Case Study on Noun Phrase Case

Authors

  • Kremena Ivanova
  • Ulrich Heid
  • Sabine Schulte im Walde
  • Adam Kilgarriff
  • Jan Pomikálek
Abstract

Word sketches are part of the Sketch Engine corpus query system. They are automatic, corpus-derived summaries of a word's grammatical and collocational behaviour. Besides the corpus itself, word sketches require a sketch grammar, a regular expression-based shallow grammar over part-of-speech tags, which extracts evidence for the properties of the targeted words from the corpus. The paper presents a sketch grammar for German, a language which is not strictly configurational and which shows a considerable amount of case syncretism, and evaluates its accuracy, something that has not been done for other sketch grammars. The evaluation focuses on NP case as a crucial part of German grammar. We present several versions of the NP definitions, thereby demonstrating the influence of grammar detail on precision and recall.
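The idea of a regular expression-based shallow grammar over part-of-speech tags, scored by precision and recall, can be illustrated with a minimal Python sketch. It is not the paper's grammar and does not use Sketch Engine's own grammar syntax: the toy NP pattern, the tagged sentence, and the function names `find_nps` and `precision_recall` are assumptions made for illustration, the tags merely follow STTS-style names (ART, ADJA, NN), and case is ignored entirely.

```python
import re

# Hypothetical tagged sentence: (token, STTS-style POS tag) pairs.
tagged = [
    ("der", "ART"), ("alte", "ADJA"), ("Mann", "NN"),
    ("sieht", "VVFIN"), ("den", "ART"), ("Hund", "NN"),
]

# Toy NP pattern over the tag sequence (an assumption, not the paper's rule):
# optional article, any number of attributive adjectives, one common noun.
# The lookbehind forces a match to start at a tag boundary.
NP_PATTERN = re.compile(r"(?<!\S)(?:ART )?(?:ADJA )*NN ")


def find_nps(tagged_tokens):
    """Return (start, end) token spans matched by the toy NP pattern."""
    tags = "".join(tag + " " for _, tag in tagged_tokens)
    spans = []
    for match in NP_PATTERN.finditer(tags):
        # Each tag is followed by one space, so counting spaces counts tokens.
        start = tags[:match.start()].count(" ")
        end = start + match.group().count(" ")
        spans.append((start, end))
    return spans


def precision_recall(predicted, gold):
    """Precision and recall of predicted spans against gold spans."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


if __name__ == "__main__":
    spans = find_nps(tagged)
    print(spans)
    # Hypothetical gold annotation for the same sentence.
    print(precision_recall(spans, [(0, 3), (4, 6)]))
```

Making the pattern stricter or looser (for example, requiring the article) changes which spans are returned, which is the kind of trade-off between precision and recall that the abstract attributes to grammar detail.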

Similar resources

An HPSG-Analysis for Free Relative Clauses in German

At the moment there is no theory for free relative clauses in German in the framework of Head-driven Phrase Structure Grammar (HPSG) (Pollard and Sag, 1994). From GB literature on the subject it is known that free relative clauses behave partly like noun phrases. They can fill argument positions of verbs. And although they are finite sentences, they are serialized like noun phrases in the Germa...

Incremental Identification of Inflectional Types

We present an approach to the incremental accrual of lexical information for unknown words that is constraint-based and compatible with standard unification-based grammars. Although the techniques are language-independent and can be applied to all kinds of information, in this paper we concentrate on the domain of German noun inflection. We show how morphological information, especially inflect...

A CHAT-Based Annotation Scheme for Case and Noun-Phrase Inflection in Child Language Data

This paper describes a coding scheme and a set of semi-automatic procedures for the annotation of complex noun phrases and their morpho-syntactic properties in child language data. These tools are based on the CHAT conventions of the Child Language Data Exchange System (MacWhinney 2000; CHILDES: http://childes.psy.cmu.edu/; CHAT: http://childes.psy.cmu.edu/manuals/chat.pdf). The coding scheme p...

Evaluating and Extending the Coverage of HPSG Grammars: A Case Study for German

In this work, we examine and attempt to extend the coverage of a German HPSG grammar. We use the grammar to parse a corpus of newspaper text and evaluate the proportion of sentences which have a correct attested parse, and analyse the cause of errors in terms of lexical or constructional gaps which prevent parsing. Then, using a maximum entropy model, we evaluate prediction of lexical types in ...

Towards a corpus-based dictionary of German noun-verb collocations

We describe our attempts to automatically extract raw material for a dictionary of German noun-verb collocations from large corpora of newspaper text. Such a dictionary should be about collocations and it should include a description of their linguistic properties, rather than listing the mere lexical cooccurrence. Since most statistical collocation finding tools do not provide other than lexic...


Publication date: 2008