Text-informed speech enhancement with deep neural networks

Authors

Keisuke Kinoshita

Marc Delcroix

Atsunori Ogawa

Tomohiro Nakatani

Abstract

A speech signal captured by a distant microphone is generally contaminated by background noise, which severely degrades the audible quality and intelligibility of the observed signal. To resolve this issue, speech enhancement has been intensively studied. In this paper, we consider text-informed speech enhancement, where the enhancement process is guided by the corresponding text information, i.e., a correct transcription of the target utterance. The proposed deep neural network (DNN)-based framework is motivated by the recent success of DNNs in text-to-speech (TTS) research, as well as by the high audible quality of the output of corpus-based speech enhancement, which borrows knowledge from the TTS research field. Taking advantage of the ability of DNNs to utilize disparate features in the inference stage, the proposed method infers the clean speech features by jointly using the observed signal and widely used TTS features derived from the corresponding text. In this paper, we first introduce the background and the details of the proposed method. Then, we show how the text information can be naturally integrated into speech enhancement by utilizing a DNN, and how it improves the enhancement performance.
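The joint use of observed-signal features and text-derived features described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the network architecture, the feature types, or how text features are aligned to frames. The sketch assumes, purely for illustration, that both feature streams are already frame-aligned, that they are combined by concatenation, and that a single hidden layer with random (untrained) weights stands in for the DNN. All names and dimensions (`enhance`, `init_layer`, `N_ACOUSTIC`, `N_TEXT`, and so on) are hypothetical.

```python
import math
import random

def init_layer(n_in, n_out, rng):
    """Return (weights, biases) with small random weights; purely illustrative."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def dense(x, layer, activation=None):
    """Apply one fully connected layer to a single feature vector."""
    weights, biases = layer
    out = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
    return [activation(o) for o in out] if activation else out

def enhance(noisy_frames, text_frames, hidden, output):
    """Estimate clean features frame by frame from noisy and text-derived features."""
    if len(noisy_frames) != len(text_frames):
        raise ValueError("noisy and text features must be frame-aligned")
    estimates = []
    for noisy, text in zip(noisy_frames, text_frames):
        joint = list(noisy) + list(text)      # assumption: features are concatenated
        h = dense(joint, hidden, math.tanh)   # single hidden layer (assumed size)
        estimates.append(dense(h, output))    # linear output: clean-feature estimate
    return estimates

# Hypothetical dimensions, chosen only for this example.
N_ACOUSTIC, N_TEXT, N_HIDDEN, N_FRAMES = 4, 3, 8, 5
rng = random.Random(0)
hidden_layer = init_layer(N_ACOUSTIC + N_TEXT, N_HIDDEN, rng)
output_layer = init_layer(N_HIDDEN, N_ACOUSTIC, rng)

# Stand-in data: random "noisy" acoustic frames and binary text-derived features.
noisy = [[rng.gauss(0.0, 1.0) for _ in range(N_ACOUSTIC)] for _ in range(N_FRAMES)]
text = [[rng.choice([0.0, 1.0]) for _ in range(N_TEXT)] for _ in range(N_FRAMES)]

clean_estimate = enhance(noisy, text, hidden_layer, output_layer)
print(len(clean_estimate), len(clean_estimate[0]))  # one estimate per input frame
```

Because the weights are untrained, the output values carry no meaning; the sketch only shows the data flow. In a real system the network would be trained on pairs of noisy and clean features.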

Related resources

Speech Enhancement for a Noise-Robust Text-to-Speech Synthesis System Using Deep Recurrent Neural Networks

The quality of text-to-speech voices built from noisy recordings is diminished. In order to improve it, we propose the use of a recurrent neural network to enhance acoustic parameters prior to training. We trained a deep recurrent neural network using a parallel database of noisy and clean acoustic parameters as input and output of the network. The database consisted of multiple speakers and divers...

Speech Enhancement in Multiple-Noise Conditions Using Deep Neural Networks

In this paper, we consider the problem of speech enhancement in real-world-like conditions where multiple noises can simultaneously corrupt speech. Most of the current literature on speech enhancement focuses primarily on the presence of a single noise in corrupted speech, which is far from real-world environments. Specifically, we deal with improving speech quality in an office environment where multiple s...

Singing Voice Separation Using Deep Neural Networks and F0 Estimation

Deep Neural Networks (DNNs) have become a popular approach for speech enhancement and singing voice separation. DNNs are typically trained to estimate a time-frequency mask using ground-truth examples. In this submission, we combine DNN estimation as a first step with traditional refinement via F0 estimation, using the YINFFT algorithm.

Convolutional Neural Network with Adaptive Windows for Speech Recognition

Although speech recognition systems are widely used and their accuracy is continuously improving, there is a considerable gap between their performance and human recognition ability. This is partially due to high speaker variation in the speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...

Improving Speaker Verification for Reverberant Conditions with Deep Neural Network Dereverberation Processing

We present an improved method for training Deep Neural Networks for dereverberation and show that it can improve performance for the speech processing tasks of speaker verification and speech enhancement. We replicate recently proposed methods for dereverberation using Deep Neural Networks and present our improved method, highlighting important aspects that influence performance. We then experi...

Publication date: 2015
