KU Leuven at HOO-2012: A Hybrid Approach to Detection and Correction of Determiner and Preposition Errors in Non-native English Text

Authors

  • Li Quan
  • Oleksandr Kolomiyets
  • Marie-Francine Moens
Abstract

In this paper we describe the technical implementation of our system that participated in the Helping Our Own 2012 Shared Task (HOO-2012). The system employs a number of preprocessing steps and machine learning classifiers for correction of determiner and preposition errors in non-native English texts. We use maximum entropy classifiers trained on the provided HOO-2012 development data and a large high-quality English text collection. The system proposes a number of highly probable corrections, which are evaluated by a language model and compared with the original text. A number of deterministic rules are used to increase the precision and recall of the system. Our system is ranked among the three best performing HOO-2012 systems with a precision of 31.15%, recall of 22.08% and F1-score of 25.84% for correction of determiner and preposition errors combined.
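The step in which proposed corrections are "evaluated by a language model and compared with the original text" can be illustrated with a minimal sketch. This is not the authors' implementation: the toy corpus, the add-one-smoothed bigram model, the fixed candidate list (standing in for the output of the maximum entropy classifiers), and the `margin` threshold are all assumptions made for illustration.

```python
import math
from collections import Counter

# Toy stand-in for a large English text collection (assumption).
CORPUS = [
    "i arrived at the station",
    "she is good at math",
    "we arrived at the hotel",
    "he lives in the city",
]

def train_bigram_counts(sentences):
    """Count context unigrams and bigrams, with sentence boundary markers."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        unigrams.update(tokens[:-1])          # every token that acts as a context
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def log_prob(tokens, unigrams, bigrams, vocab_size):
    """Add-one-smoothed bigram log-probability of a token sequence."""
    padded = ["<s>"] + tokens + ["</s>"]
    total = 0.0
    for prev, cur in zip(padded, padded[1:]):
        total += math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size))
    return total

def correct_preposition(tokens, index, candidates, unigrams, bigrams,
                        vocab_size, margin=0.5):
    """Replace tokens[index] only if a candidate beats the original by `margin`.

    `candidates` is a hypothetical placeholder for classifier proposals.
    """
    original_score = log_prob(tokens, unigrams, bigrams, vocab_size)
    best_tokens, best_score = tokens, original_score
    for candidate in candidates:
        trial = tokens[:index] + [candidate] + tokens[index + 1:]
        score = log_prob(trial, unigrams, bigrams, vocab_size)
        if score > best_score:
            best_tokens, best_score = trial, score
    # Keep the learner's original text unless the improvement is clear.
    if best_score - original_score > margin:
        return best_tokens
    return tokens

unigrams, bigrams = train_bigram_counts(CORPUS)
vocab_size = len({word for pair in bigrams for word in pair})

learner = "i arrived in the station".split()
corrected = correct_preposition(learner, 2, ["at", "on", "in"],
                                unigrams, bigrams, vocab_size)
print(" ".join(corrected))
```

Requiring the best candidate to beat the original by a margin is one simple way to favor precision: the original text is left untouched unless the language model clearly prefers a replacement.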

Similar resources

Informing Determiner and Preposition Error Correction with Word Clusters

We extend our n-gram-based data-driven prediction approach from the Helping Our Own (HOO) 2011 Shared Task (Boyd and Meurers, 2011) to identify determiner and preposition errors in non-native English essays from the Cambridge Learner Corpus FCE Dataset (Yannakoudakis et al., 2011) as part of the HOO 2012 Shared Task. Our system focuses on three error categories: missing determiner, incorrect de...

Informing Determiner and Preposition Error Correction with Hierarchical Word Clustering

We extend our n-gram-based data-driven prediction approach from the Helping Our Own (HOO) 2011 Shared Task (Boyd and Meurers, 2011) to identify determiner and preposition errors in non-native English essays from the Cambridge Learner Corpus FCE Dataset (Yannakoudakis et al., 2011) as part of the HOO 2012 Shared Task. Our system focuses on three error categories: missing determiner, incorrect de...

HOO 2012: A Report on the Preposition and Determiner Error Correction Shared Task

Incorrect usage of prepositions and determiners constitute the most common types of errors made by non-native speakers of English. It is not surprising, then, that there has been a significant amount of work directed towards the automated detection and correction of such errors. However, to date, the use of different data sets and different task definitions has made it difficult to compare work...

Detection and Correction of Preposition and Determiner Errors in English: HOO 2012

This paper reports on our work in the HOO 2012 shared task. The task is to automatically detect, recognize and correct the errors in the use of prepositions and determiners in a set of given test documents in English. For that, we have developed a hybrid system of an n-gram statistical model along with some rule-based techniques. The system has been trained on the HOO shared task’s training dat...

Data-Driven Correction of Function Words in Non-Native English

We extend the n-gram-based data-driven prediction approach (Elghafari, Meurers and Wunsch, 2010) to identify function word errors in non-native academic texts as part of the Helping Our Own (HOO) Shared Task. We focus on substitution errors for four categories: prepositions, determiners, conjunctions, and quantifiers. These error types make up 12% of the errors annotated in the HOO training dat...

Publication date: 2012