Automatic phoneme alignment based on acoustic-phonetic modeling
Author
Abstract
This paper presents a method for speaker-independent automatic phonetic alignment that differs from standard HMM-based “forced alignment” in three respects: (1) specific acoustic-phonetic features are used, in addition to PLP features, by the phonetic classifier; (2) the units of classification consist of distinctive phonetic features instead of phonemes; and (3) observation probabilities depend not only on the current state, but also on the state transition information. The proposed method is compared with a state-of-the-art baseline forced-alignment system on a number of corpora, including telephone speech, microphone speech, and children’s speech. The new method reaches 92.57% agreement within 20 msec on the TIMIT corpus, a 26% reduction in error relative to the baseline method (89.95% agreement on TIMIT). The average reduction in error over all corpora is 28%.
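The two figures quoted in the abstract (agreement within 20 msec, and relative reduction in error) can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function names, the inclusive tolerance, the one-to-one pairing of boundaries, and the example boundary times are all assumptions made for illustration. Only the 89.95% and 92.57% agreement values come from the abstract.

```python
def boundary_agreement(reference_ms, predicted_ms, tolerance_ms=20):
    """Fraction of predicted boundaries within tolerance_ms of the reference.

    Assumes both lists hold boundary times in milliseconds and are already
    paired one-to-one (same phoneme sequence), as in forced alignment.
    The tolerance is treated as inclusive, which is an assumption.
    """
    if len(reference_ms) != len(predicted_ms):
        raise ValueError("boundary lists must have the same length")
    if not reference_ms:
        raise ValueError("boundary lists must not be empty")
    hits = sum(
        abs(ref - pred) <= tolerance_ms
        for ref, pred in zip(reference_ms, predicted_ms)
    )
    return hits / len(reference_ms)


def relative_error_reduction(baseline_agreement_pct, new_agreement_pct):
    """Relative reduction in error, where error = 100% - agreement."""
    baseline_error = 100.0 - baseline_agreement_pct
    new_error = 100.0 - new_agreement_pct
    return (baseline_error - new_error) / baseline_error


# Hypothetical boundary times (ms); one of four is off by more than 20 ms.
reference = [100, 250, 400, 610]
predicted = [105, 275, 395, 612]
print(boundary_agreement(reference, predicted))  # 0.75

# TIMIT figures from the abstract: errors of 10.05% and 7.43%.
print(round(relative_error_reduction(89.95, 92.57) * 100, 1))  # 26.1
```

With the abstract's numbers, the baseline error is 10.05% and the new error is 7.43%, so the reduction is 2.62 / 10.05, or about 26%, which matches the reported figure.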
Similar resources
Allophone-based acoustic modeling for Persian phoneme recognition
Phoneme recognition is one of the fundamental phases of automatic speech recognition. Coarticulation, which refers to the integration of sounds, is one of the important obstacles in phoneme recognition. In other words, each phone is influenced and changed by the characteristics of its neighboring phones, and coarticulation is responsible for most of these changes. The idea of modeling the effects o...
Automatic Phoneme Ali Acoustic-phonetic
This paper presents a method for speaker-independent automatic phonetic alignment that is distinguished from standard HMM-based “forced alignment” in three respects: (1) specific acoustic-phonetic features are used, in addition to PLP features, by the phonetic classifier; (2) the units of classification consist of distinctive phonetic features instead of phonemes; and (3) observation probabilit...
Deep Learning Techniques in Tandem with Signal Processing Cues for Phonetic Segmentation for Text to Speech Synthesis in Indian Languages
Automatic detection of phoneme boundaries is an important sub-task in building speech processing applications, especially text-to-speech synthesis (TTS) systems. The main drawback of Gaussian mixture model hidden Markov model (GMM-HMM) based forced alignment is that the phoneme boundaries are not explicitly modeled. In an earlier work, we had proposed the use of signal processing cues in tan...
On the impact of phoneme alignment in DNN-based speech synthesis
Recently, deep neural networks (DNNs) have significantly improved the performance of acoustic modeling in statistical parametric speech synthesis (SPSS). However, in current implementations, when training a DNN-based speech synthesis system, phonetic transcripts are required to be aligned with the corresponding speech frames to obtain the phonetic segmentation, called phoneme alignment. Such an...
Designing a phoneme recognition algorithm using the acoustic correlates of phonological features
In the present paper, the phonological feature geometry of the Persian phonemes is analyzed in the form of articulator-free and articulator-bound features based on the articulator model of nonlinear phonology. Then, the reference phonetic pattern of each feature that consists of one or a set of acoustic correlates, characterized by the quantitative or qualitative values in its phonological re...