Linguistic Issues in Language Technology – LiLT

Authors

  • Ines Rehbein
  • Hagen Hirschmann
  • Anke Lüdeling
  • Marc Reznicek
Abstract

Parsing learner data poses a great challenge for standard tools, since non-canonical and unusual structures may lead taggers and parsers to wrong interpretations. It is well known that providing a statistical parser with perfect part-of-speech (POS) tags greatly benefits parsing accuracy, and that parsing accuracy can decrease considerably when the parser has to predict its own POS tags. One might therefore expect that even small improvements in POS accuracy have a positive effect on parsing performance. In this paper we test this assumption and assess the impact of POS tag accuracy on constituency parsing for German learner language. We compare different strategies for manual correction of the learner text and of specific POS tags, and we measure the time requirements for each strategy. We show that tagging a canonical equivalent of the non-canonical learner text substantially improves POS tag accuracy. Correcting only selected POS tags can lead to parsing results comparable to a setting where all POS tags are corrected, while reducing annotation time substantially. However, the manual corrections of the POS tags do not result in a statistically significant improvement for parsing, which is evidence of the high quality of the automatically predicted POS tags for the corrected learner data.

LiLT Volume 7, Issue 10, January 2012. Better tags give better trees – or do they? Copyright © 2012, CSLI Publications.

Similar resources

Linguistic Issues in Language Technology LiLT

In this paper, we overview the ways in which computational methods can serve the goals of analysis and theory development in linguistics, and encourage the reader to become involved in the emerging cyberinfrastructure for linguistics. We survey examples from diverse subfields of how computational methods are already being used, describe the current state of the art in cyberinfrastructure for li...

Linguistic Issues in Language Technology – LiLT

Lakoff (1974) argues that affective demonstratives in English are markers of solidarity, with exclamative overtones deriving from their close association with evaluative predication. Focusing on this, we seek to inform these claims using quantitative corpus evidence. Our experiments suggest that affectivity is not limited to specific uses of this, but rather that it arises in a wide range of li...

Linguistic Issues in Language Technology – LiLT

Morphology is a key component for many Language Technology applications. However, morphological relations, especially those relying on the derivation and compounding processes, are often addressed in a superficial manner. In this article, we focus on assessing the relevance of deep and motivated morphological knowledge in Natural Language Processing applications. We first describe an annotation...

Publication date: 2011