Improving wideband acoustic models using mixed-bandwidth training data via DNN adaptation

Authors

Zhao You

Bo Xu

Abstract

In the past few years, deep neural networks (DNNs) have achieved great success in speech recognition. A deep network model can be viewed as a series of feature transforms followed by a log-linear classifier. For input speech of different bandwidths, the hidden-layer transforms and the log-linear classifier can be shared, but the input-layer transform must be designed separately for each bandwidth. Training DNNs directly on speech of different bandwidths is therefore intractable. In this paper, we treat the problem of training DNNs on mixed-bandwidth data as a domain-adaptation problem. With our adaptation approach, DNNs trained on the abundant narrowband speech can be adapted effectively to the target wideband domain and show good performance on wideband speech. We evaluate this approach on the wideband clean7k and noise360 speech. Experimental results show that the DNN adaptation approach achieves relative character error rate (CER) reductions of 5% to 15% over the baseline DNNs trained only on the limited wideband data.
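The abstract does not spell out the adaptation procedure, so the following is only a minimal sketch of the parameter-sharing idea it describes: one input layer per bandwidth, with the hidden layers and the log-linear (softmax) classifier shared. All dimensions, layer counts, and names (`NB_DIM`, `WB_DIM`, `init_layer`, `forward`) are hypothetical, the weights are random rather than trained, and this is not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_layer(n_in, n_out):
    # Small random weights; a real system would train these.
    return {"W": rng.normal(0.0, 0.1, size=(n_in, n_out)), "b": np.zeros(n_out)}

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def forward(x, input_layer, shared_hidden, classifier):
    # Bandwidth-specific input transform, then shared hidden transforms,
    # then the shared log-linear classifier.
    h = sigmoid(x @ input_layer["W"] + input_layer["b"])
    for layer in shared_hidden:
        h = sigmoid(h @ layer["W"] + layer["b"])
    return softmax(h @ classifier["W"] + classifier["b"])

# Hypothetical sizes: narrowband and wideband features differ in dimension.
NB_DIM, WB_DIM, HIDDEN, N_STATES = 23, 40, 64, 10

# Stand-in for the network trained on the abundant narrowband data.
nb_input = init_layer(NB_DIM, HIDDEN)
shared_hidden = [init_layer(HIDDEN, HIDDEN) for _ in range(2)]
classifier = init_layer(HIDDEN, N_STATES)

# Adaptation sketch: a new wideband input layer reuses the same shared layers.
wb_input = init_layer(WB_DIM, HIDDEN)

nb_post = forward(rng.normal(size=(4, NB_DIM)), nb_input, shared_hidden, classifier)
wb_post = forward(rng.normal(size=(4, WB_DIM)), wb_input, shared_hidden, classifier)
```

In an actual adaptation run, the wideband input layer (and possibly the shared layers) would then be updated on the limited wideband data; which layers to update, and how, is a design choice the abstract leaves open.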

Similar resources

Improving DNN Bluetooth Narrowband Acoustic Models by Cross-Bandwidth and Cross-Lingual Initialization

The success of deep neural network (DNN) acoustic models is partly owed to large amounts of training data available for different applications. This work investigates ways to improve DNN acoustic models for Bluetooth narrowband mobile applications when relatively small amounts of in-domain training data are available. To address the challenge of limited in-domain data, we use cross-bandwidth and...

Optimizing DNN Adaptation for Recognition of Enhanced Speech

Speech enhancement directly using a deep neural network (DNN) is of major interest due to the capability of DNNs to tangibly reduce the impact of noisy conditions in speech recognition tasks. Similarly, DNN-based acoustic model adaptation to new environmental conditions is another challenging topic. In this paper we present an analysis of acoustic model adaptation in the presence of a disjoint speech ...

Improving DNN-Based Automatic Recognition of Non-native Children's Speech with Adult Speech

Acoustic models for state-of-the-art DNN-based speech recognition systems are typically trained using at least several hundred hours of task-specific training data. However, this amount of training data is not always available for some applications. In this paper, we investigate how to use an adult speech corpus to improve DNN-based automatic speech recognition for non-native children's speech....

GMM-derived features for effective unsupervised adaptation of deep neural network acoustic models

In this paper we investigate GMM-derived features recently introduced for adaptation of context-dependent deep neural network HMM (CD-DNN-HMM) acoustic models. We improve the previously proposed adaptation algorithm by applying the concept of speaker adaptive training (SAT) to DNNs built on GMM-derived features and by using fMLLR-adapted features for training an auxiliary GMM model. Traditional...

WTIMIT: The TIMIT Speech Corpus Transmitted Over The 3G AMR Wideband Mobile Network

In anticipation of upcoming mobile telephony services with higher speech quality, a wideband (50 Hz to 7 kHz) mobile telephony derivative of TIMIT has been recorded called WTIMIT. It opens up various scientific investigations; e.g., on speech quality and intelligibility, as well as on wideband upgrades of network-side interactive voice response (IVR) systems with retrained or bandwidth-extended...

Publication date: 2014
