Impact of Vocal Tract Length Normalization on the Speech Recognition Performance of an English Vowel Phoneme Recognizer for the Recognition of Children Voices
Abstract
Differences in vocal tract length cause inter-speaker acoustic variability: different speakers produce acoustically different signals for the same text, and this variability degrades the robustness of a speaker-independent (SI) speech recognition system. Speaker normalization using vocal tract length normalization (VTLN) is an effective way to reduce the effect of this variability on speech signals. This paper investigates the impact of VTLN on the recognition performance of an English vowel phoneme recognizer for both noise-free and noisy speech spoken by children. The recognizer was developed using a pattern recognition approach based on hidden Markov models (HMMs). The automatic speech recognition (ASR) system was trained on speech from adult male and female speakers and tested on children's speech. The investigation shows that VTLN can effectively improve the robustness of the English vowel phoneme recognizer in both noise-free and noisy conditions.
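As a rough illustration of what VTLN does to the frequency axis, the sketch below implements a piecewise-linear warping function, one common choice in the VTLN literature. The abstract does not say which warping function the paper uses, so this is not the authors' method. The function name, the warping factor `alpha`, the 8000 Hz band edge, and the `knee_ratio` of 0.85 are all assumptions made for the example.

```python
def warp_frequency(freq_hz, alpha, nyquist_hz=8000.0, knee_ratio=0.85):
    """Piecewise-linear frequency warp (illustrative sketch only).

    Frequencies below a knee are scaled by alpha; above the knee the
    mapping is a straight line chosen so that nyquist_hz maps to itself.
    All parameter names and defaults are assumptions for this example.
    """
    if alpha <= 0.0:
        raise ValueError("alpha must be positive")
    if not 0.0 <= freq_hz <= nyquist_hz:
        raise ValueError("freq_hz must lie in [0, nyquist_hz]")

    # Keep the warped knee below the band edge when alpha > 1.
    knee_hz = knee_ratio * nyquist_hz * min(1.0, 1.0 / alpha)
    if freq_hz <= knee_hz:
        return alpha * freq_hz

    # Line through (knee_hz, alpha * knee_hz) and (nyquist_hz, nyquist_hz).
    slope = (nyquist_hz - alpha * knee_hz) / (nyquist_hz - knee_hz)
    return alpha * knee_hz + slope * (freq_hz - knee_hz)


if __name__ == "__main__":
    # Hypothetical candidate warping factors, for illustration only.
    for alpha in (0.88, 1.00, 1.12):
        print(alpha, round(warp_frequency(1000.0, alpha), 1))
```

In a typical setup the warp is applied to the filterbank centre frequencies during feature extraction, and the factor for each speaker is picked from a small grid of candidates, often by maximizing the likelihood under the acoustic model. Whether values above or below 1 compensate for children's shorter vocal tracts depends on the direction in which the warp is defined.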
Similar resources
Adaptation and normalization experiments in speech recognition for 4 to 8 year old children
An experimental offline investigation of the performance of connected digits recognition was performed on children in the age range four to eight years. Poor performance using adult models was improved significantly by adaptation and vocal tract length normalisation but not to the same level as training on children. Age dependent models were tried with limited advantage. A combined adult and ch...
Efficient Vocal Tract Normalization in Automatic Speech Recognition
In this paper we study the effect of vocal tract normalization (VTN) on the word error rate (WER) in speaker independent large vocabulary speech recognition. Evaluation test results are reported for the German VerbMobil II (VM II) and the English Wall Street Journal (WSJ) corpus. In particular, we analyse: the effect of the type of warping function (linear vs. non-linear) on the WER; different met...
Fast estimation of warping factors in vocal tract length normalization using scores obtained from gender detection modeling
The performance of automatic speech recognition (ASR) systems is adversely affected by the variations in speakers, audio channels and environmental conditions. Making these systems robust to these variations is still a big challenge. One of the main sources of variations in the speakers is the differences between their Vocal Tract Length (VTL). Vocal Tract Length Normalization (VTLN) is an effe...
Real-Time Vocal Tract Length Normalization in a Phonological Awareness Teaching System
Speaker normalization in a speech recognizer can significantly improve recognition accuracy. One such method, vocal tract length normalization (VTLN), is especially useful when the system has to work reliably for males, females and children. This is exactly the situation with our phonological awareness teaching system, the “SpeechMaster”, which aims at real-time phoneme recognition and feed...
Predicting consonant intelligibility in normal-hearing listeners using microscopic models with different distance measures in an automatic speech recognizer
In this study, recognition rates of consonants available in vowel-consonant-vowel structure in hearing tests and two microscopic models will be investigated. Such a syllable structure doesn’t exist in Farsi and Azerbaijani languages, but since the goal is only recognition of middle phoneme, according to hearing tests, listeners are able to properly recognize phonemes in clean speech conditions....