Preservation of speech spectral dynamics enhances intelligibility
Authors
Abstract
We propose a method for enhancing intelligibility in scenarios where speech is rendered in a noisy environment. The method is based on the hypothesis that intelligibility is a monotonic function of the degree to which the speech spectral dynamics are preserved. The accuracy of the speech spectral dynamics can then be traded against the power of the rendered speech signal: we can either maximize the dynamics accuracy for a given signal power, or minimize the signal power for a given dynamics accuracy. In our implementation, the spectral dynamics are quantified as the difference of the mel cepstra between time frames of the speech signal. We compared the speech rendered by our implementation against both natural speech and a reference method, for the scenario where signal power is minimized given a target dynamics accuracy, and observed significantly improved intelligibility. The low system delay, low complexity, and low memory requirements make the new method particularly suitable for real-time applications.
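The quantity named in the abstract, the frame-to-frame difference of the mel cepstra, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the frame length, hop size, number of mel filters, number of cepstral coefficients, and the mean-squared-error comparison in the hypothetical `dynamics_distortion` helper are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular mel filters, shape (n_filters, n_fft // 2 + 1)."""
    mel_edges = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    hz_edges = mel_to_hz(mel_edges)
    freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    bank = np.zeros((n_filters, freqs.size))
    for i in range(n_filters):
        lo, mid, hi = hz_edges[i], hz_edges[i + 1], hz_edges[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def mel_cepstra(signal, sample_rate, frame_len=400, hop=160, n_filters=24, n_ceps=13):
    """Mel cepstra per frame, shape (n_frames, n_ceps). All sizes are assumed defaults."""
    signal = np.asarray(signal, dtype=float)
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(frame_len)
    power = np.abs(np.fft.rfft(frames, n=frame_len, axis=1)) ** 2
    bank = mel_filterbank(n_filters, frame_len, sample_rate)
    log_mel = np.log(power @ bank.T + 1e-10)  # small floor avoids log(0)
    # Unnormalized DCT-II over the mel bands gives the cepstral coefficients.
    k = np.arange(n_ceps)[:, None]
    n = np.arange(n_filters)[None, :]
    dct = np.cos(np.pi * k * (n + 0.5) / n_filters)
    return log_mel @ dct.T

def spectral_dynamics(cepstra):
    """Difference of the mel cepstra between consecutive time frames."""
    return np.diff(cepstra, axis=0)

def dynamics_distortion(reference, rendered, sample_rate):
    """Assumed accuracy measure: mean squared error between the two dynamics."""
    d_ref = spectral_dynamics(mel_cepstra(reference, sample_rate))
    d_ren = spectral_dynamics(mel_cepstra(rendered, sample_rate))
    return float(np.mean((d_ref - d_ren) ** 2))
```

One property of this sketch is worth noting: a pure gain change shifts every log mel energy by the same constant in every frame, so it cancels in the frame difference. Under these assumptions, the dynamics measure is therefore insensitive to overall signal level, which is consistent with treating dynamics accuracy and signal power as separate quantities to trade against each other.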
Similar resources
A binaural microscopic model based on a modulation filter bank for predicting speech intelligibility in normal-hearing listeners
In this study, a binaural microscopic model for the prediction of speech intelligibility based on the modulation filter bank is introduced. So far, the spectral criteria such as the STI and SII or other analytical methods have been used in the binaural models to determine the binaural intelligibility. In the proposed model, unlike all models of binaural intelligibility prediction, an automatic ...
Spectral Features for Automatic Blind Intelligibility Estimation of Spastic Dysarthric Speech
In this paper, we explore the use of the standard ITU-T P.563 speech quality estimation algorithm for automatic assessment of dysarthric speech intelligibility. A linear mapping consisting of three salient P.563 internal features is proposed and shown to accurately estimate spastic dysarthric speech intelligibility. Delta-energy features are further proposed in order to characterize the atypica...
Speech intelligibility after repair of cleft lip and palate
Background: Intelligibility refers to understandability of speech; and lack of it can negatively affect children’s overall communication effectiveness. Children with repaired cleft lip and/or cleft palate (CL/P) may experience poor speech intelligibility. This study aimed at evaluating speech intelligibility in children with repaired CL/P who had not been referred to sp...
Speech Intelligibility of Cochlear-Implanted and Normal-Hearing Children
Introduction: Speech intelligibility, the ability to be understood verbally by listeners, is the gold standard for assessing the effectiveness of cochlear implantation. Thus, the goal of this study was to compare the speech intelligibility between normal-hearing and cochlear-implanted children using the Persian intelligibility test. Materials and Methods: Twenty-six cochlear-implanted childre...
Noisy Speech Intelligibility Enhancement
This paper studies speech intelligibility enhancement. The speech model, noise sources, perceptual aspects of speech, and performance evaluation are reviewed. An intelligibility enhancement system based on the spectral subtraction technique is investigated. A spectral density estimator based on the smoothed periodogram algorithm is analysed. Determination of the silenc...