SoftCorrect: Error Correction with Soft Detection for Automatic Speech Recognition
Authors
Abstract
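Error correction in automatic speech recognition (ASR) aims to correct the incorrect words in sentences generated by ASR models. Since recent ASR models usually have a low word error rate (WER), error correction models should modify only incorrect words to avoid affecting originally correct tokens, and therefore detecting incorrect words is important for error correction. Previous works on error correction either implicitly detect error words through target-source attention or CTC (connectionist temporal classification) loss, or explicitly locate specific deletion/substitution/insertion errors. However, implicit error detection does not provide a clear signal about which tokens are incorrect, and explicit error detection suffers from low detection accuracy. In this paper, we propose SoftCorrect with a soft error detection mechanism to avoid the limitations of both explicit and implicit error detection. Specifically, we first detect whether a token is correct or not through a probability produced by a dedicatedly designed language model, and then design a constrained CTC loss that duplicates only the detected incorrect tokens to let the decoder focus on the correction of error tokens. Compared with implicit error detection with CTC loss, SoftCorrect provides an explicit signal about which words are incorrect and thus does not need to duplicate every token but only the incorrect tokens; compared with explicit error detection, SoftCorrect does not detect specific deletion/substitution/insertion errors but just leaves it to the CTC loss. Experiments on the AISHELL-1 and Aidatatang datasets show that SoftCorrect achieves 26.1% and 9.4% CER reduction respectively, outperforming previous works by a large margin, while still enjoying the fast speed of parallel generation.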
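The soft-detection idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function name `expand_detected_tokens`, the `threshold` and `copies` parameters, and the example probabilities are all hypothetical, and neither the dedicated language model nor the constrained CTC loss is reproduced. The sketch only shows the input-side effect the abstract describes, namely that tokens flagged as likely incorrect are duplicated while the other tokens are passed through once.

```python
from typing import List, Sequence


def expand_detected_tokens(
    tokens: Sequence[str],
    correct_probs: Sequence[float],
    threshold: float = 0.5,  # hypothetical cutoff, not taken from the paper
    copies: int = 3,         # hypothetical duplication factor
) -> List[str]:
    """Duplicate only the tokens whose probability of being correct is below the threshold."""
    if len(tokens) != len(correct_probs):
        raise ValueError("tokens and correct_probs must have the same length")
    expanded: List[str] = []
    for token, prob in zip(tokens, correct_probs):
        # Tokens judged likely incorrect get extra slots; others are kept once.
        repeat = copies if prob < threshold else 1
        expanded.extend([token] * repeat)
    return expanded


# Toy example: only the middle token is flagged as likely incorrect.
print(expand_detected_tokens(["a", "b", "c"], [0.9, 0.2, 0.8]))
```

In this toy setup, duplicating every token would turn a 3-token input into 9 slots, whereas duplicating only the flagged token yields 5, which mirrors the abstract's point that an explicit detection signal removes the need to duplicate every token.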
Similar resources
Error Detection in Automatic Speech Recognition
We offer a supervised machine learning approach for recognizing erroneous words in the output of a speech recognizer. We have investigated several sets of features combined with two word configurations, and compared the performance of two classifiers: Decision Trees and Naïve Bayes. Evaluation was performed on a corpus of 400 spoken referring expressions, with Decision Trees yielding a high rec...
Recent Improvements on Error Detection for Automatic Speech Recognition
Automatic speech recognition (ASR) offers the ability to access the semantic content present in spoken language within audio and video documents. While acoustic models based on deep neural networks have recently significantly improved the performance of ASR systems, automatic transcriptions still contain errors. Errors perturb the exploitation of these ASR outputs by introducing noise to the te...
Context-based Speech Recognition Error Detection and Correction
In this paper we present preliminary results of a novel unsupervised approach for high-precision detection and correction of errors in the output of automatic speech recognition systems. We model the likely contexts of all words in an ASR system vocabulary by performing a lexical co-occurrence analysis using a large corpus of output from the speech system. We then identify regions in the data th...
Robust Error Correction of Continuous Speech Recognition
We present a post-processing technique for correcting errors committed by an arbitrary continuous speech recognizer. The technique leverages our observation that consistent recognition errors arising from mismatched training and usage conditions can be modeled and corrected. We have implemented a post-processor called SPEECHPP to correct word-level errors, and we show that this post-processing te...
Journal
Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence
Year: 2023
ISSN: 2159-5399, 2374-3468
DOI: https://doi.org/10.1609/aaai.v37i11.26531