Syllable HMM based Mandarin TTS and comparison with concatenative TTS
Authors
Abstract
This paper introduces a syllable HMM-based Mandarin TTS system in which each syllable is modeled by a 10-state left-to-right HMM. We leverage the corpus and the front end of a concatenative TTS system to build the syllable HMM-based system. Furthermore, we use the unique consonant/vowel structure of Mandarin syllables to improve the voiced/unvoiced decision for HMM states. Evaluation results show that the syllable HMM-based Mandarin TTS system, with a model size of 5.3 MB, achieves an overall quality close to that of a concatenative TTS system with 1 GB of data.
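As a rough illustration of the topology mentioned in the abstract, the sketch below builds the transition matrix of a 10-state left-to-right HMM. The abstract gives only the state count and the left-to-right constraint. The absence of skip transitions, the self-loop probability of 0.6, the absorbing final state, and the function name `left_to_right_transitions` are assumptions made for illustration, not details taken from the paper.

```python
def left_to_right_transitions(n_states=10, self_loop=0.6):
    """Build a transition matrix for a left-to-right HMM without skips.

    Assumptions (not from the paper): each state may only stay where it
    is or advance to the next state, and `self_loop` is a placeholder
    probability rather than a trained value.
    """
    if n_states < 1:
        raise ValueError("n_states must be at least 1")
    if not 0.0 <= self_loop <= 1.0:
        raise ValueError("self_loop must lie in [0, 1]")

    matrix = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states):
        if i + 1 < n_states:
            matrix[i][i] = self_loop            # stay in the current state
            matrix[i][i + 1] = 1.0 - self_loop  # advance to the next state
        else:
            matrix[i][i] = 1.0  # final state absorbs in this sketch
    return matrix


# One matrix per syllable model; 10 states as stated in the abstract.
A = left_to_right_transitions()
```

Every entry below the diagonal is zero, which is what prevents the model from moving backwards in time. In a trained system the non-zero entries would be estimated from data rather than fixed by hand.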
Similar resources
Data pruning approach to unit selection for inventory generation of concatenative embeddable Chinese TTS systems
In this paper, a data pruning approach is presented for building the acoustic unit inventory of a syllable-based concatenative embeddable Chinese TTS system. A 3-portion segmentation of a syllable is proposed based on the voiced/unvoiced structure of Chinese syllables. Individual factorial acoustic measurement of syllable is used to calculate the penalty of perceptual unsatisfactory for con...
Spectral Continuity Measures at Mandarin Syllable Boundaries
In Text-to-Speech (TTS) systems based on concatenative synthesis, the naturalness of synthetic speech is strongly affected by spectral continuity at the concatenation point. In this paper, we focused on 4 kinds of syllable boundaries in Mandarin and used several spectral distance measures combined with time-derivative distance measures to predict their audible discontinuities. A perceptual...
Relative Functional Comparison of Neural and Non-Neural Approaches for Syllable Segmentation in Devnagari TTS System
This paper presents methods for automatic speech signal segmentation using neural networks. The speech signal is segmented to form syllables. The syllable is a common unit for concatenative TTS systems. Because concatenative TTS uses segments of recorded speech, it sounds more natural than formant or articulatory TTS systems. This TTS stores small segments of speech and join them togeth...
Maximum-likelihood dynamic intonation model for concatenative text-to-speech system
In this work we present a Maximum Likelihood (ML) joint pitch curve model, inspired by the HMM TTS synthesis concept. This model provides an optimal solution for the coarse target intonation curve (3 points per syllable) and incorporates both static and dynamic pitch values for better utterance intonation modeling. The coarse intonation curve may be optionally combined with the original pitch ex...
Text to Speech System for Malayalam
Text-to-speech (TTS) systems, which are mainly meant for speech synthesis, are used for Malayalam, one of the South Indian languages. The paper makes a brief study of Malayalam linguistics and compares two prominent methodologies for speech synthesis, namely concatenative synthesis and HMM-based synthesis. As a result, the paper mentions some of the problems facing ...