Fast speech adaptation in linear spectral domain for additive and convolutional noise
Abstract
In this paper, we propose a transform-based adaptation technique for robust speech recognition in unknown environments. It uses the maximum likelihood spectral transform (MLST) algorithm with additive and convolutional noise parameters. Many adaptation algorithms have previously been proposed in the cepstral domain. Although the cepstral domain may be appropriate for speech recognition, it is difficult to handle environmental noise directly there. Our approach therefore deals with such noise in the linear spectral domain, where the noise acts on the speech directly. As a result, a small number of noise parameters suffices for fast adaptation. Experiments on the FFMTIMIT corpus show promising results with only a small amount of adaptation data.
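The reason the linear spectral domain needs so few parameters can be illustrated with a small sketch. It assumes the commonly used environment model in which each frequency band of the observed spectrum is the clean spectrum scaled by a convolutional (channel) gain plus an additive noise term. The function names, the synthetic data, and the per-band least-squares fit are all assumptions made for illustration. The fit is a simple stand-in, not the MLST algorithm of the paper, and it assumes paired clean and noisy frames, which the paper does not necessarily require.

```python
import numpy as np

def apply_environment(clean_spec, h, n):
    """Assumed linear-spectral model: y = h * x + n, independently per band.

    clean_spec: (frames, bands) array of clean spectral magnitudes
    h: (bands,) convolutional (channel) gain per band
    n: (bands,) additive noise level per band
    """
    return h * clean_spec + n

def fit_noise_params(clean_spec, noisy_spec):
    """Per-band least-squares fit of h and n (illustrative stand-in, not MLST)."""
    x_mean = clean_spec.mean(axis=0)
    y_mean = noisy_spec.mean(axis=0)
    x_c = clean_spec - x_mean
    y_c = noisy_spec - y_mean
    # Slope and intercept of a straight-line fit in each band.
    h = (x_c * y_c).sum(axis=0) / (x_c ** 2).sum(axis=0)
    n = y_mean - h * x_mean
    return h, n

# Synthetic example: 50 frames, 4 frequency bands (hypothetical values).
rng = np.random.default_rng(0)
clean = rng.uniform(1.0, 10.0, size=(50, 4))
true_h = np.array([0.5, 0.8, 1.2, 1.0])
true_n = np.array([2.0, 1.0, 0.5, 3.0])
noisy = apply_environment(clean, true_h, true_n)

h_est, n_est = fit_noise_params(clean, noisy)
```

In this sketch the environment is described by only two numbers per band, so a handful of frames is enough to recover them. In the cepstral domain the same distortion becomes a nonlinear function of the clean features, which is the difficulty the abstract points to.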
Similar resources
Noise suppression and loudness normalization in an auditory model-based acoustic front-end
It is commonly acknowledged that the presence of additive and convolutional noise and speech level variations can seriously deteriorate the performance of a speech recognizer. In case an auditory model is used as the acoustic front-end, it turns out that compensation techniques such as spectral subtraction and log-spectral mean subtraction can be outperformed by time-domain techniques operating...
Noise and room acoustics distorted speech recognition by HMM composition
This paper presents a robust speech recognition method, based on HMM composition, for speech distorted by noise and room acoustics. The method enables an improved user interface in which the user is not encumbered by microphone equipment. The proposed HMM composition is obtained by naturally extending the HMM composition method of an additive noise to that of the convolutional room acoustics di...
A New Method for Speech Enhancement Based on Incoherent Model Learning in Wavelet Transform Domain
The quality of a speech signal degrades significantly in the presence of environmental noise, which leads to imperfect performance of hearing aid devices, automatic speech recognition systems, and mobile phones. In this paper, single-channel enhancement of speech corrupted by additive noise is considered. A dictionary-based algorithm is proposed to train the speech...
Forward masking on a generalized logarithmic scale for robust speech recognition
This paper examines forward masking on the generalized logarithmic scale for speech recognition that is robust to both additive and convolutional noise. The forward masking in the dynamic cepstral (DyC) representation is based upon subtraction of a masking pattern from a current spectrum on a logarithmic spectral domain, whereas the proposed method intends to make a compromise between the logarithm...
Uncertainty in Signal Estimation and Stochastic Weighted Viterbi Algorithm: A Unified Framework to Address Robustness in Speech Recognition and Speaker Verification
Robustness to noise and low-bit rate coding distortion is one of the main problems faced by automatic speech recognition (ASR) and speaker verification (SV) systems in real applications. Usually, ASR and SV models are trained with speech signals recorded in conditions that are different from testing environments. This mismatch between training and testing can lead to unacceptable error rates. N...