Learning speaker-specific pronunciations of disordered speech

Authors

  • Heidi Christensen
  • Phil D. Green
  • Thomas Hain
Abstract

One of the main clinical applications of speech technology is voice-enabled assistive technology for people with disordered speech. Progress in this area is hampered by the sparsity of suitable data, and recent research has focused on ways of incorporating knowledge about typical (i.e., unimpaired) speech, for example through the use of deep belief neural networks. This paper presents a new way of using deep belief neural networks trained on typical speech, namely to improve pronunciations for individual speakers. Analysis of the posterior probabilities shows a clear correlation between measured pronunciation ‘disorderedness’ and the overall speech recognition performance of the full system. Based on this, we propose a method that uses deep belief network outputs to (i) identify which words are pronounced differently from what would be expected from a typical pronunciation, and (ii) subsequently generate new pronunciations. We investigate different methods for pronunciation generation, as well as how best to use the modified pronunciations to inform the system development stages. Using the UAspeech database of disordered speech, we demonstrate an improvement in average accuracy from 69.76% to 70.51%, with some speakers showing individual improvements of up to 10%.
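The two steps named in the abstract (flagging atypically pronounced words, then generating new pronunciations) can be illustrated with a minimal sketch. This is not the authors' implementation: the scoring rule (mean negative log posterior of the expected phone), the frame-wise decoding, the function names, the threshold and the toy posteriors are all assumptions made for illustration, and the abstract does not say how the paper defines or computes these quantities.

```python
import math
from itertools import groupby


def disorderedness(posteriors, expected):
    """Hypothetical score: mean negative log posterior of the expected phone.

    posteriors: one dict per frame, mapping phone label -> probability.
    expected:   one phone label per frame (e.g. from a forced alignment).
    """
    eps = 1e-10  # floor to avoid log(0) when a phone has no probability mass
    total = sum(
        -math.log(max(frame.get(phone, 0.0), eps))
        for frame, phone in zip(posteriors, expected)
    )
    return total / len(expected)


def flag_words(scores, threshold):
    """Return the words whose score exceeds an (assumed) threshold."""
    return sorted(word for word, score in scores.items() if score > threshold)


def decode_pronunciation(posteriors, min_frames=2):
    """Take the most probable phone per frame, drop runs shorter than
    min_frames, and collapse what remains into a phone sequence."""
    best = [max(frame, key=frame.get) for frame in posteriors]
    runs = [(phone, len(list(group))) for phone, group in groupby(best)]
    kept = [phone for phone, length in runs if length >= min_frames]
    return [phone for phone, _ in groupby(kept)]


# Toy data (invented): the word "cat" with canonical phones k-ae-t.
expected = ["k", "k", "ae", "ae", "t", "t"]
typical = [
    {"k": 0.9, "t": 0.1}, {"k": 0.8, "t": 0.2},
    {"ae": 0.9, "k": 0.1}, {"ae": 0.9, "t": 0.1},
    {"t": 0.9, "k": 0.1}, {"t": 0.9, "k": 0.1},
]
atypical = [
    {"k": 0.1, "t": 0.9}, {"k": 0.2, "t": 0.8},
    {"ae": 0.9, "k": 0.1}, {"ae": 0.9, "t": 0.1},
    {"t": 0.9, "k": 0.1}, {"t": 0.9, "k": 0.1},
]

scores = {
    "cat_typical": disorderedness(typical, expected),
    "cat_atypical": disorderedness(atypical, expected),
}
flagged = flag_words(scores, threshold=0.5)
new_pronunciation = decode_pronunciation(atypical)
```

In this toy case the atypical rendition scores higher than the typical one and is the only item flagged, and the decoded sequence replaces the initial `k` with `t`. A real system would presumably work with posteriors from a trained network and some form of alignment rather than hand-written dictionaries.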


Similar resources

A comparison of methods for speaker-dependent pronunciation tuning for text-to-speech synthesis

Unit-based text-to-speech (TTS) systems typically use a set of speech recordings that have been phonetically transcribed to create a large set of phonetic units. During synthesis, pronunciations for input text are generated and used to guide the selection of a sequence of phonetic units. The style of these system pronunciations must match the style of the phonetic transcriptions of the recorded...


Prelexical adjustments to speaker idiosyncrasies: are they position-specific?

Listeners use lexical knowledge to adjust their prelexical representations of speech sounds in response to the idiosyncratic pronunciations of particular speakers. We used an exposure-test paradigm to investigate whether this type of perceptual learning transfers across syllabic positions. No significant learning effect was found in Experiment 1, where exposure sounds were onsets and test sound...


Regularized-MLLR speaker adaptation for computer-assisted language learning system

In this paper, we propose a novel speaker adaptation technique, regularized-MLLR, for Computer Assisted Language Learning (CALL) systems. This method uses a linear combination of a group of teachers’ transformation matrices to represent each target learner’s transformation matrix, thus avoids the over-adaptation problem that erroneous pronunciations come to be judged as good pronunciations afte...


Perceptual adjustments to multiple speakers

Different speakers may pronounce the same sounds very differently, yet listeners have little difficulty perceiving speech accurately. Recent research suggests that listeners adjust their preexisting phonemic categories to accommodate speakers’ pronunciations (perceptual learning). In some cases, these adjustments appear to reflect general changes to phonemic categories, rather than speaker-spec...


Speech Recognition as Feature Extraction for Speaker Recognition

Information from speech recognition can be used in various ways in state-of-the-art speaker recognition systems. This includes the obvious use of recognized words to enable the use of text-dependent speaker modeling techniques when the words spoken are not given. Furthermore, it has been shown that the choice of words and phones itself can be a useful indicator of speaker identity. Also, recogn...

Publication date: 2013