What kind of pronunciation variation is hard for triphones to model?
Abstract
To help understand why gains in pronunciation modeling have proven so elusive, we investigate which kinds of pronunciation variation are well captured by triphone models and which are not. We do this by examining how the behavior of a recognizer changes as it receives further triphone training. We show that many of the kinds of variation that previous pronunciation models attempt to capture, including phone substitution and phone reduction, are in fact already well captured by triphones. Our analysis suggests new areas where future pronunciation models should focus, including syllable deletion.
Similar resources
Data-driven pronunciation modeling for ASR using acoustic subword units
We describe a method to model pronunciation variation for ASR in a data-driven way, namely by use of automatically derived acoustic subword units. The inventory of units is designed so as to produce maximal separable pronunciation variants of words while at the same time only the most important variants for the particular application are trained. In doing so, the optimal number of variants per ...
English Pronunciation Instruction: A Literature Review
English pronunciation instruction is difficult for several reasons. Teachers are left without clear guidelines and are faced with contradictory practices for pronunciation instruction. There is no well-established systematic method of deciding what to teach, when, and how to do it. As a result of these problems, pronunciation instruction is treated as less important and teachers are not very comfortable in t...
Multi-path Syllable Models Based on Phonetic Knowledge
Recent research suggests that syllable-length acoustic models might be more appropriate for pronunciation variation modelling than the context-dependent phones that conventional automatic speech recognisers use. In this paper, we compare the recognition performance of two types of recognisers: a conventional recogniser that only uses triphones, and an experimental recogniser that employs a mix ...
Model partial pronunciation variations for spontaneous Mandarin speech recognition
The high error rate in spontaneous speech recognition is due in part to the poor modeling of pronunciation variations. An analysis of acoustic data reveals that pronunciation variations include both complete changes and partial changes. Complete changes are the replacement of a canonical phoneme by another alternative phone, such as b being pronounced as p. Partial changes are the variations w...