Prediction of speech intelligibility with DNN-based performance measures

Abstract

This paper presents a speech intelligibility model based on automatic speech recognition (ASR), combining phoneme probabilities from deep neural networks (DNN) and a performance measure that estimates the word error rate from these probabilities. The model does not require a clean speech reference nor word labels during testing, as the ASR decoding step, which finds the most likely sequence of words given the phoneme posterior probabilities, is omitted. The model is evaluated via the root-mean-squared error between the predicted and observed speech reception thresholds of eight normal-hearing listeners. The recognition task consists of identifying noisy words from a German matrix sentence test. The speech material was mixed with noise maskers covering different modulation types, from speech-shaped stationary noise to a single-talker masker. The prediction performance is compared to five established models and an ASR-based model using word labels. Two combinations of features and networks were tested. Both include temporal information, either at the feature level (amplitude modulation filterbanks and a feed-forward network) or captured by the architecture (mel-spectrograms and a time-delay neural network, TDNN). The TDNN model is on par with the DNN model while reducing the number of parameters by a factor of 37; this optimization allows parallel streams on dedicated hearing aid hardware, as a forward pass can be computed within 10 ms of each frame. The proposed model performs almost as well as the label-based model and produces more accurate predictions than the baseline models.
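The evaluation metric named in the abstract, the root-mean-squared error between predicted and observed speech reception thresholds (SRTs), can be illustrated with a minimal sketch. The function name `rmse` and the SRT values below are assumptions chosen for illustration only; they are not taken from the paper, and the sketch does not reproduce the authors' evaluation code.

```python
import math


def rmse(predicted, observed):
    """Root-mean-squared error between two equal-length sequences."""
    if len(predicted) != len(observed) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Placeholder SRTs in dB SNR, one value per noise masker.
# These numbers are invented for illustration, NOT results from the paper.
predicted_srt = [-7.0, -9.5, -12.0, -15.5]
observed_srt = [-7.5, -9.0, -13.0, -15.0]

error_db = rmse(predicted_srt, observed_srt)
print(f"RMSE between predicted and observed SRTs: {error_db:.2f} dB")
```

A lower value means the model's predicted thresholds lie closer to the thresholds measured with listeners; because the error is squared before averaging, a single badly predicted masker weighs more heavily than several small deviations.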

Similar resources

An evaluation of objective quality measures for speech intelligibility prediction

In this research various objective quality measures are evaluated in order to predict the intelligibility for a wide range of non-linearly processed speech signals and speech degraded by additive noise. The obtained results are compared with the prediction results of a more advanced perceptual-based model proposed by Dau et al. and an objective intelligibility measure, namely the coherence spee...

Outcome measures based on classification performance fail to predict the intelligibility of binary-masked speech.

To date, the most commonly used outcome measure for assessing ideal binary mask estimation algorithms is based on the difference between the hit rate and the false alarm rate (H-FA). Recently, the error distribution has been shown to substantially affect intelligibility. However, H-FA treats each mask unit independently and does not take into account how errors are distributed. Alternatively, a...

Speech Intelligibility in Persian Children with Down Syndrome

Objectives: One of the most effective methods to describe speech disorders is the measurement of speech intelligibility. Speech intelligibility indicates the extent to which the acoustic signal produced by the speaker is correctly received by the listener. The purpose of this study was to investigate speech intelligibility in Persian children with Down syndrome, aged 3 to 5 years, who had spo...

An evaluation of objective measures for intelligibility prediction of time-frequency weighted noisy speech.

Existing objective speech-intelligibility measures are suitable for several types of degradation; however, it turns out that they are less appropriate in cases where noisy speech is processed by a time-frequency weighting. To this end, an extensive evaluation is presented of objective measures for intelligibility prediction of noisy speech processed with a technique called ideal time frequency (...

Evaluation of Objective Intelligibility Prediction Measures for Speech Enhancement in Mandarin

In this paper, we evaluate the performance of several state-of-the-art objective measures in predicting the intelligibility of noisy Mandarin speech processed by speech enhancement algorithms. The speech signals were first corrupted by three types of noise at two signal-to-noise ratios, followed by four classes of speech enhancement algorithms. The objective intelligibility...


Journal

Journal title: Computer Speech & Language

Year: 2022

ISSN: 1095-8363, 0885-2308

DOI: https://doi.org/10.1016/j.csl.2021.101329