QCRI@QALB-2015 Shared Task: Correction of Arabic Text for Native and Non-Native Speakers' Errors
Abstract
This paper describes the error correction model that we used for the QALB-2015 Automatic Correction of Arabic Text shared task. We employed a case-specific correction approach that handles specific error types, such as dialectal word substitution and word splits and merges, with the aid of a language model. We also applied corrections specific to second-language learners, targeting erroneous preposition selection, definiteness, and gender-number agreement.
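The abstract gives no implementation details, so the following is only a minimal sketch of how split and merge errors might be corrected with the aid of a language model. It assumes a toy add-one smoothed unigram model over Latin-script placeholder words; the `COUNTS` table and the functions `logprob`, `score`, `fix_splits`, and `fix_merges` are hypothetical names introduced for illustration and are not taken from the paper.

```python
import math

# Toy unigram counts standing in for a real language model (hypothetical data).
COUNTS = {"in": 50, "the": 100, "house": 20, "to": 60, "day": 30, "today": 25}
TOTAL = sum(COUNTS.values())


def logprob(word):
    """Add-one smoothed unigram log-probability of a single word."""
    return math.log((COUNTS.get(word, 0) + 1) / (TOTAL + len(COUNTS) + 1))


def score(words):
    """Log-probability of a word sequence under the unigram model."""
    return sum(logprob(w) for w in words)


def fix_splits(tokens):
    """Split an unknown token in two when both parts are known words
    and the language model prefers the split form."""
    out = []
    for tok in tokens:
        best = [tok]
        if tok not in COUNTS:
            for i in range(1, len(tok)):
                cand = [tok[:i], tok[i:]]
                if all(c in COUNTS for c in cand) and score(cand) > score(best):
                    best = cand
        out.extend(best)
    return out


def fix_merges(tokens):
    """Join two adjacent tokens when the joined form is a known word
    and the language model prefers it over the separate tokens."""
    out = []
    i = 0
    while i < len(tokens):
        if i + 1 < len(tokens):
            joined = tokens[i] + tokens[i + 1]
            if joined in COUNTS and score([joined]) > score(tokens[i:i + 2]):
                out.append(joined)
                i += 2
                continue
        out.append(tokens[i])
        i += 1
    return out


print(fix_splits(["inthe", "house"]))
print(fix_merges(["the", "hou", "se"]))
```

A unigram model almost always prefers a known joined form over two separate words, so a practical system would more plausibly score candidates with a higher-order n-gram model that takes the surrounding context into account.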
Similar resources
SAHSOH@QALB-2015 Shared Task: A Rule-Based Correction Method of Common Arabic Native and Non-Native Speakers' Errors
This paper describes our participation in the QALB-2015 Automatic Correction of Arabic Text shared task. We employed various tools and external resources to build a rule-based correction method. Handwritten linguistic rules were added by using existing lexicons and regular expressions. We handled specific errors with dedicated rules reserved for non-native speakers. The system is simple as it d...
The Second QALB Shared Task on Automatic Text Correction for Arabic
We present a summary of QALB-2015, the second shared task on automatic text correction of Arabic texts. The shared task extends QALB-2014, which focused on correcting errors in Arabic texts produced by native speakers of Arabic. The competition this year, in addition to native data, includes texts produced by learners of Arabic as a foreign language. The report includes an overview of the QALB ...
TECHLIMED@QALB-Shared Task 2015: a hybrid Arabic Error Correction System
This paper reports on the participation of Techlimed in the Second Shared Task on Automatic Arabic Error Correction organized by the Arabic Natural Language Processing Workshop. This year's competition includes two tracks, and, in addition to errors produced by native speakers (L1), also includes correction of texts written by learners of Arabic as a foreign language (L2). Techlimed participate...
Non-native text analysis: A survey
Non-native speakers of English far outnumber native speakers; English is the main language of books, newspapers, airports, air-traffic control, international business, academic conferences, science, technology, diplomacy, sports, international competitions, pop music, and advertising (British Council 2014). Online education in the form of massive online open courses is also primarily in English...
The First QALB Shared Task on Automatic Text Correction for Arabic
We present a summary of the first shared task on automatic text correction for Arabic text. The shared task received 18 systems submissions from nine teams in six countries and represented a diversity of approaches. Our report includes an overview of the QALB corpus which was the source of the datasets used for training and evaluation, an overview of participating systems, results of the compet...