“We All Make Mistakes!”. Analysing an Error-coded Corpus of Spanish University Students’ Written English
Abstract
The present study analyses the errors identified in the written argumentative texts of 304 Spanish university students of English, taken from two different corpora: one from a technical university context and the other from learners enrolled in the Humanities. Because learner metadata are considered an important design criterion in computer learner corpus studies, the students’ metadata were recorded and their competence levels were measured using the Oxford Quick Placement Test. The scores obtained (0 to 60) were then related to the CEFR (Common European Framework of Reference for Languages) levels, which range from A1 to C2. Within the field of applied linguistics and language teaching and learning, many studies over the years have addressed the phenomenon of interlanguage errors made by learners of English (Dusková 1969, Green & Hecht 1985, Lennon 1991, Olsen 1999, among many others). These studies involved analysing a small number of texts with a limited number of tags, based on either linguistic taxonomies or surface structure categories of errors (Dulay, Burt, & Krashen 1982). However, in the last three decades, technological advances have facilitated the computer-assisted analysis of much larger amounts of data, both through the development of learner corpora and through programs for a more detailed analysis of the learner data. The error coding system used in the present research has been designed to address all possible levels of error (with as many sub-categories as required), since learners writing in a foreign language make errors related not only to grammar and vocabulary but also to syntax, discourse relations and pragmatics, among others. The aim of the present study is two-fold. Firstly, we explore the nature of the errors coded in the corpus, i.e. which errors are most frequent, considering not only the main categories but also the most delicate levels of error.
Secondly, we address the question of the relationship, if any, between the learners’ competence levels and the type and frequency of the errors they make. The results show that grammar errors are the most frequent, and that the learners’ linguistic competence has a lower-than-expected influence on the most frequent types of errors coded in the corpus.
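The two steps described in the abstract (relating a placement score to a CEFR level, then comparing error frequencies across levels) can be illustrated with a minimal Python sketch. It is not the authors' procedure: the band cut-offs in `ASSUMED_BANDS`, the dotted tag format (`"grammar.article"`), and the function names are all assumptions made for illustration, and the sample data are invented.

```python
from collections import Counter, defaultdict

# ASSUMPTION: illustrative cut-offs (upper bound inclusive) for a 0-60 score.
# The study does not state its bands; replace these with the real ones.
ASSUMED_BANDS = [(17, "A1"), (29, "A2"), (39, "B1"),
                 (47, "B2"), (54, "C1"), (60, "C2")]


def cefr_level(score):
    """Map a 0-60 placement score to a CEFR label using ASSUMED_BANDS."""
    if not 0 <= score <= 60:
        raise ValueError(f"score out of range: {score}")
    for upper, level in ASSUMED_BANDS:
        if score <= upper:
            return level


def error_rates_by_level(texts):
    """Return errors per 100 words, per main category, per CEFR level.

    `texts` is an iterable of (score, word_count, error_tags) tuples.
    Tags are hypothetical dotted strings such as "grammar.article";
    only the part before the first dot (the main category) is counted.
    """
    counts = defaultdict(Counter)
    words = Counter()
    for score, word_count, tags in texts:
        level = cefr_level(score)
        words[level] += word_count
        counts[level].update(tag.split(".")[0] for tag in tags)
    return {
        level: {cat: 100 * n / words[level] for cat, n in cats.items()}
        for level, cats in counts.items()
        if words[level] > 0
    }


# Invented sample data, for illustration only.
sample = [
    (35, 200, ["grammar.article", "grammar.tense", "lexis.choice"]),
    (38, 200, ["grammar.article"]),
    (50, 300, ["grammar.article"]),
]
rates = error_rates_by_level(sample)
```

Normalising by word count per level matters here because the levels are unlikely to contribute equal amounts of text; raw counts alone would overstate the errors of whichever level wrote the most.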
Similar resources
REALEC learner treebank: annotation principles and evaluation of automatic parsing
The paper presents a Universal Dependencies (UD) annotation scheme for a learner English corpus. The REALEC dataset consists of essays written in English by Russian-speaking university students in the course of general English. The original corpus is manually annotated for learners’ errors and gives information on the error span, error type, and the possible correction of the mistake provided b...
The Cambridge Learner Corpus - error coding and analysis for lexicography and ELT
The Cambridge Learner Corpus is a 16-million-word corpus of Learner English collected by Cambridge University Press in collaboration with the University of Cambridge Local Examinations Syndicate (now Cambridge ESOL). It comprises English examination scripts, transcribed retaining all errors, written by learners of English with 86 different mother tongues. The scripts range across 8 EFL examinat...
Data-Driven Learning and Awareness-Raising: An Effective Tandem to Improve Grammar in Written Composition?
The present paper, framed within the ECTS scheme currently being piloted at the University of Jaén, reports on a study carried out in the second semester of the academic year 2004-5 with English Philology freshmen at this University. One of its aims, described in an initial section of the paper, was to determine whether the use of Computer Assisted Language Learning (CALL), and Data-Driven Learn...
Error Analysis of Taiwanese University Students’ English Essay Writing: A Longitudinal Corpus Study
Writing is considered one of the most difficult skills in EFL/ESL. Thus, meticulous recognition and classification of students’ errors in certain contexts is a worthwhile endeavor which provides us with both diagnostic and prognostic power. Accordingly, a total of 430 students in 15 English writing classes held during 12 consecutive semesters in a private university in central Taiwan were the s...
Cultural Influence on the Expression of Cathartic Conceptualization in English and Spanish: A Corpus-Based Analysis
This paper investigates the conceptualization of emotional release from a cognitive linguistics perspective (Cognitive Metaphor Theory). The metaphor WEEPING IS A MEANS OF LIBERATING CONTAINED EMOTIONS is grounded in universal embodied cognition and is reflected in linguistic expressions in English and Spanish. Lexicalization patterns which encapsulate this conceptualization i...