Grammatical error correction using neural machine translation
Abstract
This paper presents the first study using neural machine translation (NMT) for grammatical error correction (GEC). We propose a two-step approach to handle the rare word problem in NMT, which proves useful and effective for the GEC task. Our best NMT-based system trained on the CLC outperforms our SMT-based system when tested on the publicly available FCE test set. The same system achieves an F0.5 score of 39.90% on the CoNLL-2014 shared task test set, outperforming the state of the art and demonstrating that the NMT-based GEC system generalises effectively.
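The F0.5 score reported above is the F-beta measure with beta = 0.5, which weights precision more heavily than recall. The following is a minimal sketch of that formula only; the function name `f_beta` and the precision/recall values in the usage lines are illustrative assumptions, not figures or code from the paper.

```python
def f_beta(precision: float, recall: float, beta: float = 0.5) -> float:
    """F-beta score: (1 + beta^2) * P * R / (beta^2 * P + R).

    With beta = 0.5 this is the F0.5 measure. The inputs are plain
    precision and recall values in [0, 1]; how those are obtained
    (edit matching, scorer) is outside this sketch.
    """
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when nothing is correct
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Illustrative values only (not from the paper): swapping precision
# and recall changes the score, because F0.5 favours precision.
high_precision = f_beta(1.0, 0.5)
high_recall = f_beta(0.5, 1.0)
```

Under these made-up inputs the high-precision case scores higher than the high-recall case, which is the property that makes F0.5 a natural fit for a correction task where a wrong edit is costlier than a missed one.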
Similar resources
Grammatical Error Correction
Grammatical error correction (GEC) is the task of automatically correcting grammatical errors in written text. Earlier attempts at grammatical error correction involved rule-based and classifier approaches, which are limited to correcting only particular types of errors in a sentence. As sentences may contain multiple errors of different types, a practical error correction system should be ab...
Neural Network Translation Models for Grammatical Error Correction
Phrase-based statistical machine translation (SMT) systems have previously been used for the task of grammatical error correction (GEC) to achieve state-of-the-art accuracy. The superiority of SMT systems comes from their ability to learn text transformations from erroneous to corrected text, without explicitly modeling error types. However, phrase-based SMT systems suffer from limitations of d...
Connecting the Dots: Towards Human-Level Grammatical Error Correction
We build a grammatical error correction (GEC) system primarily based on the state-of-the-art statistical machine translation (SMT) approach, using task-specific features and tuning, and further enhance it with the modeling power of neural network joint models. The SMT-based system is weak in generalizing beyond patterns seen during training and lacks granularity below the word level. To address...
Neural Sequence-Labelling Models for Grammatical Error Correction
We propose an approach to N-best list reranking using neural sequence-labelling models. We train a compositional model for error detection that calculates the probability of each token in a sentence being correct or incorrect, utilising the full sentence as context. Using the error detection model, we then re-rank the N-best hypotheses generated by statistical machine translation systems. Our ...
Discriminative Reranking for Grammatical Error Correction with Statistical Machine Translation
Research on grammatical error correction has received considerable attention. To deal with all types of errors, grammatical error correction methods that employ statistical machine translation (SMT) have been proposed in recent years. An SMT system generates scored candidates and selects the sentence with the highest score as the correction result. However, the 1-bes...