SCESS: a WFSA-based automated simplified chinese essay scoring system with incremental latent semantic analysis

نویسندگان

  • Shudong Hao
  • Yanyan Xu
  • Dengfeng Ke
  • Kaile Su
  • Hengli Peng
چکیده

Writing in language tests is regarded as an important indicator for assessing language skills of test takers. As Chinese language tests become popular, scoring a large number of essays becomes a heavy and expensive task for the organizers of these tests. In the past several years, some efforts have been made to develop automated simplified Chinese essay scoring systems, reducing both costs and evaluation time. In this paper, we introduce a system called SCESS (automated Simplified Chinese Essay Scoring System) based on Weighted Finite State Automata (WFSA) and using Incremental Latent Semantic Analysis (ILSA) to deal with a large number of essays. First, SCESS uses an n-gram language model to construct a WFSA to perform text pre-processing. At this stage, the system integrates a Confusing-Character Table, a Part-Of-Speech Table, beam search and heuristic search to perform automated word segmentation and correction of essays. Experimental results show that this pre-processing procedure is effective, with a Recall Rate of 88.50%, a Detection Precision of 92.31% and a Correction Precision of 88.46%. After text preprocessing, SCESS uses ILSA to perform automated essay scoring. We have carried out experiments to compare the ILSA method with the traditional LSA method on the corpora of essays from the MHK test (the Chinese proficiency test for minorities). Experimental results indicate that ILSA has a significant advantage over LSA, in terms of both running time and memory usage. Furthermore, experimental results also show that SCESS is quite effective with a scoring performance of 89.50%. † Corresponding author.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Automated Essay Scoring Using Incremental Latent Semantic Analysis

Writing has been increasingly regarded by the testers of language tests as an important indicator to assess the language skill of testees. As such tests become more and more popular and the number of testees becomes larger, it is a huge task to score so many essays by raters. So far, many methods have been used to solve this problem and the traditional method is Latent Semantic Analysis (LSA). ...

متن کامل

ESSAY ASSESSMENT 1 Running head: ESSAY ASSESSMENT Essay Assessment with Latent Semantic Analysis

Latent semantic analysis (LSA) is an automated, statistical technique for comparing the semantic similarity of words or documents. In this paper, I examine the application of LSA to automated essay scoring. I compare LSA methods to earlier statistical methods for assessing essay quality, and critically review contemporary essay-scoring systems built on LSA, including the Intelligent Essay Asses...

متن کامل

Automated Scoring of Handwritten Essays Based on Latent Semantic Analysis

Handwritten essays are widely used in educational assessments, particularly in classroom instruction. This paper concerns the design of an automated system for performing the task of taking as input scanned images of handwritten student essays in reading comprehension tests and to produce as output scores for the answers which are analogous to those provided by human scorers. The system is base...

متن کامل

Automated Essay Scoring Based on Finite State Transducer: towards ASR Transcription of Oral English Speech

Conventional Automated Essay Scoring (AES) measures may cause severe problems when directly applied in scoring Automatic Speech Recognition (ASR) transcription as they are error sensitive and unsuitable for the characteristic of ASR transcription. Therefore, we introduce a framework of Finite State Transducer (FST) to avoid the shortcomings. Compared with the Latent Semantic Analysis with Suppo...

متن کامل

Parameters Driving Effec- Tiveness of Automated Essay Scoring with Lsa

Automated essay scoring with latent semantic analysis (LSA) has recently been subject to increasing interest. Although previous authors have achieved grade ranges similar to those awarded by humans, it is still not clear which and how parameters improve or decrease the effectiveness of LSA. This paper presents an analysis of the effects of these parameters, such as text preprocessing, weighting...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Natural Language Engineering

دوره 22  شماره 

صفحات  -

تاریخ انتشار 2016