Improvements in RWTH LVCSR evaluation systems for Polish, Portuguese, English, Urdu, and Arabic
Abstract
In this work, Portuguese, Polish, English, Urdu, and Arabic automatic speech recognition evaluation systems developed by RWTH Aachen University are presented. Our LVCSR systems focus on various domains such as broadcast news, spontaneous speech, and podcasts. All of these systems except Urdu are used for the Euronews and Skynews evaluations as part of the EU-Bridge project. Our previously developed LVCSR systems for these languages were improved using several techniques. Significant improvements are obtained using multilingual tandem and hybrid approaches, minimum phone error training, lexical adaptation, open-vocabulary long short-term memory language models, maximum entropy language models, and confusion-network-based system combination.
Similar resources
RWTH LVCSR systems for Quaero and EU-Bridge: German, Polish, Spanish and Portuguese
In this paper, German, Polish, Spanish, and Portuguese large vocabulary continuous speech recognition (LVCSR) systems developed by RWTH Aachen University are presented. All of these systems are used for the Quaero and EU-Bridge project evaluations. The LVCSR systems developed for these competitive evaluations focus on various domains like broadcas...
The RWTH Aachen German and English LVCSR systems for IWSLT-2013
In this paper, German and English large vocabulary continuous speech recognition (LVCSR) systems developed by the RWTH Aachen University for the IWSLT-2013 evaluation campaign are presented. Good improvements are obtained with state-of-the-art monolingual and multilingual bottleneck features. In addition, an open vocabulary approach using morphemic sub-lexical units is investigated along with t...
Recent improvements of the RWTH GALE Mandarin LVCSR system
This paper describes the current improvements of the RWTH Mandarin LVCSR system. We introduce a new reduced toneme set developed at RWTH. We are using different toneme sets and pronunciation lexica. For the purpose of discriminative training we will show a fast way to transform word lattices between systems using different toneme sets and pronunciation lexica. In addition to various acoustic fr...
The RWTH Aachen machine translation system for IWSLT 2011
In this paper, the statistical machine translation (SMT) systems of RWTH Aachen University developed for the evaluation campaign of the International Workshop on Spoken Language Translation (IWSLT) 2011 are presented. We participated in the MT (English-French, Arabic-English, Chinese-English) and SLT (English-French) tracks. Both hierarchical and phrase-based SMT decoders are applied. A number of ...
The RWTH Aachen speech recognition and machine translation system for IWSLT 2012
In this paper, the automatic speech recognition (ASR) and statistical machine translation (SMT) systems of RWTH Aachen University developed for the evaluation campaign of the International Workshop on Spoken Language Translation (IWSLT) 2012 are presented. We participated in the ASR (English), MT (English-French, Arabic-English, Chinese-English, German-English) and SLT (English-French) tracks. F...