The 2011 KIT QUAERO speech-to-text system for Spanish
Authors
Abstract
This paper describes our current Spanish speech-to-text (STT) system, which is being developed within the Quaero program and with which we participated in the 2011 Quaero STT evaluation. The system consists of four separate subsystems: in addition to the standard MFCC and MVDR phoneme-based subsystems, we included both a phoneme-based and a grapheme-based bottleneck subsystem. We carefully evaluate the performance of each subsystem. After including several new techniques, we were able to reduce the WER by over 30% relative, from 20.79% to 14.53%.
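The "over 30%" figure in the abstract is a relative reduction, not a drop in percentage points. A minimal sketch of that arithmetic follows, using only the two WER values quoted in the abstract; the function name `relative_reduction` is an illustrative assumption and not something taken from the paper.

```python
def relative_reduction(baseline_wer: float, final_wer: float) -> float:
    """Return the relative WER reduction as a fraction of the baseline."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - final_wer) / baseline_wer


baseline = 20.79  # WER (%) before the new techniques, as quoted in the abstract
final = 14.53     # WER (%) after the new techniques, as quoted in the abstract

absolute = baseline - final                     # difference in percentage points
relative = relative_reduction(baseline, final)  # fraction of the baseline WER

print(f"absolute reduction: {absolute:.2f} points")
print(f"relative reduction: {relative:.1%}")
```

Run as written, this reports an absolute reduction of 6.26 points and a relative reduction of 30.1%, which matches the "over 30%" claim.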
Similar resources
The 2013 KIT Quaero Speech-to-Text System for French
This paper describes our Speech-to-Text (STT) system for French, which was developed as part of our efforts in the Quaero program for the 2013 evaluation. Our STT system consists of six subsystems which were created by combining multiple complementary sources of pronunciation modeling including graphemes with various feature front-ends based on deep neural networks and tonal features. Both spea...
Speech recognition for machine translation in Quaero
This paper describes the speech-to-text systems used to provide automatic transcriptions used in the Quaero 2010 evaluation of Machine Translation from speech. Quaero (www.quaero.org) is a large research and industrial innovation program focusing on technologies for automatic analysis and classification of multimedia and multilingual documents. The ASR transcript is the result of a Rover combin...
Quaero 2010 Speech-to-Text Evaluation Systems
Our laboratory has used the HP XC4000, the high performance computer of the federal state Baden-Württemberg, in order to participate in the third Quaero evaluation (2010) for automatic speech recognition (ASR). State-of-the-art ASR research systems usually employ techniques which require the parallel execution of several recognition systems for the purpose of system combination. The use of unsu...
Protocol and lessons learnt from the production of parallel corpora for the evaluation of speech translation systems
Machine translation evaluation campaigns require the production of reference corpora to automatically measure system output. This paper describes recent efforts to create such data with the objective of measuring the quality of the systems participating in the Quaero evaluations. In particular, we focus on the protocols behind such production as well as all the issues raised by the complexity o...
RWTH LVCSR systems for quaero and EU-bridge: German, Polish, Spanish and Portuguese
In this paper, German, Polish, Spanish, and Portuguese large vocabulary continuous speech recognition (LVCSR) systems developed by the RWTH Aachen University are presented. All of these systems are used for the Quaero and EU-Bridge project evaluations. The LVCSR systems developed for these competitive evaluations focus on various domains like broadcas...