A New Dataset and Method for Automatically Grading ESOL Texts

نویسندگان

  • Helen Yannakoudakis
  • Ted Briscoe
  • Ben Medlock
چکیده

We demonstrate how supervised discriminative machine learning techniques can be used to automate the assessment of ‘English as a Second or Other Language’ (ESOL) examination scripts. In particular, we use rank preference learning to explicitly model the grade relationships between scripts. A number of different features are extracted and ablation tests are used to investigate their contribution to overall performance. A comparison between regression and rank preference models further supports our method. Experimental results on the first publically available dataset show that our system can achieve levels of performance close to the upper bound for the task, as defined by the agreement between human examiners on the same corpus. Finally, using a set of ‘outlier’ texts, we test the validity of our model and identify cases where the model’s scores diverge from that of a human examiner.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

شناسایی نوع و مدل وسیله نقلیه با استفاده از مجموعه بخش‌های متمایز‌کننده

In fine-grained recognition, the main category of object is well known and the goal is to determine the subcategory or fine-grained category. Vehicle make and model recognition (VMMR) is a fine-grained classification problem. It includes several challenges like the large number of classes, substantial inner-class and small inter-class distance. VMMR can be utilized when license plate numbers ca...

متن کامل

Modeling coherence in ESOL learner texts

To date, few attempts have been made to develop new methods and validate existing ones for automatic evaluation of discourse coherence in the noisy domain of learner texts. We present the first systematic analysis of several methods for assessing coherence under the framework of automated assessment (AA) of learner free-text responses. We examine the predictive power of different coherence mode...

متن کامل

Automatically Assessing Free Texts

Evaluation of the content of free texts is a challenging task for humans. Automation of this process is largely useful in order to reduce human related errors. We consider one instance of the “free texts” assessment problems; automatic essay grading where the task is to grade student written essays automatically given course materials and a set of human-graded essays as training data. We use a ...

متن کامل

Presenting a method for extracting structured domain-dependent information from Farsi Web pages

Extracting structured information about entities from web texts is an important task in web mining, natural language processing, and information extraction. Information extraction is useful in many applications including search engines, question-answering systems, recommender systems, machine translation, etc. An information extraction system aims to identify the entities from the text and extr...

متن کامل

Land Cover Subpixel Change Detection using Hyperspectral Images Based on Spectral Unmixing and Post-processing

  The earth is continually being influenced by some actions such as flood, tornado and human artificial activities. This process causes the changes in land cover type. Thus, for optimal management of the use of resources, it is necessary to be aware of these changes. Today’s remote sensing plays key role in geology and environmental monitoring by its high resolution, wide covering and low cost...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2011