Extracting meaningful speech features to support diagnostic feedback: an ECD approach to automated scoring
Abstract
Although the operational scoring of the TOEFL iBT speaking section relies on an overall judgment of an examinee's speaking ability, evaluating specific components of speech such as delivery (pace and clarity of speech) and language use (vocabulary and grammar use) may be a promising approach to providing diagnostic information to learners. This study used an evidence-centered design (ECD) approach to extract features that provide evidence about the quality of responses to TOEFL iBT speaking tasks through the use of speech and NLP technologies. The computed features were identified through a detailed explication of the rubrics and confirmed by content specialists. Classification trees were developed to model human holistic scores as well as delivery and language use scores, and were validated on independent samples. We discuss the feasibility of extracting meaningful speech features amenable to diagnostic feedback.

Introduction

The speaking section of the Test of English as a Foreign Language Internet-Based Test (TOEFL® iBT) is designed to measure the academic English speaking proficiency of non-native speakers who plan to study at universities where English is spoken. This test represents an important advancement in the large-scale assessment of productive skills. However, it poses particular challenges to learners in parts of the world where opportunities to learn speaking skills are limited. First, the learning environments in those regions may not be optimal for acquiring speaking skills. Second, English teachers there may not be adequately trained to teach speaking skills or to provide reliable feedback on their students' speaking performance. This calls for research that supports the design of effective learning products that can serve the diverse needs of learners, promote better learning, and improve teaching practices.
An important requirement for such learning products is that they be capable of providing instant feedback to learners. This project explores automated evaluation of TOEFL® iBT speaking performances that supports instant feedback capabilities for potential speaking products.

Previous work

The technologies that support automated evaluation of speaking proficiency are automated speech recognition (ASR) and natural language processing (NLP) tools (for a survey of these technologies see, e.g., Jurafsky & Martin, 2000). Applying these technologies to responses to the TOEFL® iBT Speaking test is challenging because the test elicits spontaneous speech and because the scoring rubrics against which responses are evaluated draw on models of communicative competence. In addition, TOEFL® iBT speaking has been developed to …
Similar resources
Automated Essay Scoring with the E-rater System
This paper provides an overview of e-rater®, a state-of-the-art automated essay scoring system developed at the Educational Testing Service (ETS). E-rater is used as part of the operational scoring of two high-stakes graduate admissions programs: the GRE® General Test and the TOEFL iBT® assessments. E-rater is also used to provide score reporting and diagnostic feedback in Criterion℠, ETS’s ...
Modeling Discourse Coherence for the Automated Scoring of Spontaneous Spoken Responses
This study describes an approach for modeling the discourse coherence of spontaneous spoken responses in the context of automated assessment of non-native speech. Although the measurement of discourse coherence is typically a key metric in human scoring rubrics for assessments of spontaneous spoken language, little prior research has been done to assess a speaker’s coherence in the context of a...
Computing and Evaluating Syntactic Complexity Features for Automated Scoring of Spontaneous Non-Native Speech
This paper focuses on identifying, extracting and evaluating features related to syntactic complexity of spontaneous spoken responses as part of an effort to expand the current feature set of an automated speech scoring system in order to cover additional aspects considered important in the construct of communicative competence. Our goal is to find effective features, selected from a large set ...
Using an Ontology for Improved Automated Content Scoring of Spontaneous Non-Native Speech
This paper presents an exploration into automated content scoring of non-native spontaneous speech using ontology-based information to enhance a vector space approach. We use content vector analysis as a baseline and evaluate the correlations between human rater proficiency scores and two cosine-similarity-based features, previously used in the context of automated essay scoring. We use two ont...
Toward Evaluation of Writing Style: Finding Overly Repetitive Word Use in Student Essays
Automated essay scoring is now an established capability used from elementary school through graduate school for purposes of instruction and assessment. Newer applications provide automated diagnostic feedback about student writing. Feedback includes errors in grammar, usage, and mechanics, comments about writing style, and evaluation of discourse structure. This paper reports on a system that ...