Analysis and modeling of syllable duration for Thai speech synthesis
Abstract
This paper describes an analysis of the factors that control Thai syllable duration and a statistical control model based on linear regression. The analyses were carried out at both the syllable level and the phrase level. At the syllable level, the effects of the five Thai tones and of syllable structure are investigated. To analyze syllable structure effects statistically, we applied quantification theory with two linguistic factors: (1) the phone categories themselves, and (2) categories grouped by articulatory similarity. At the phrase level, the effects of position in the phrase and of the number of syllables in the phrase were analyzed. The experimental results showed that tone, syllable structure, and position in the phrase play significant roles in syllable duration control, whereas the syllable count of a phrase affects syllable duration only slightly. These analysis results have been integrated into a statistical control model. The duration assignment precision of the proposed model is evaluated using 2480-word speech data. A total correlation of 0.73 between predicted and observed values for the test set samples indicates fair precision for the proposed control model.
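As a rough illustration of the kind of model the abstract describes, the sketch below fits a linear regression on dummy-coded categorical factors (the usual reading of quantification theory applied to a numeric target) and reports the correlation between predicted and observed durations on a held-out set. Everything specific in it is an assumption made for the example: the two factors used (tone and position in the phrase), the level names, the effect sizes in milliseconds, and the synthetic data. It is not the authors' feature set, corpus, or implementation.

```python
import random

# Hypothetical factor levels; the first level of each factor is the baseline.
TONES = ["mid", "low", "falling", "high", "rising"]
POSITIONS = ["initial", "medial", "final"]

# Made-up effect sizes (ms), used only to generate synthetic data.
TONE_EFFECT = {"mid": 0, "low": 10, "falling": -15, "high": 5, "rising": 25}
POS_EFFECT = {"initial": 0, "medial": -30, "final": 60}


def encode(tone, position):
    """Intercept plus dummy variables for the non-baseline levels."""
    row = [1.0]
    row += [1.0 if tone == t else 0.0 for t in TONES[1:]]
    row += [1.0 if position == p else 0.0 for p in POSITIONS[1:]]
    return row


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r_: abs(m[r_][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r_ in range(col + 1, n):
            f = m[r_][col] / m[col][col]
            for c in range(col, n + 1):
                m[r_][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit(samples):
    """Least squares via the normal equations (X^T X) w = X^T y."""
    xs = [encode(t, p) for t, p, _ in samples]
    ys = [d for _, _, d in samples]
    k = len(xs[0])
    xtx = [[sum(r_[i] * r_[j] for r_ in xs) for j in range(k)] for i in range(k)]
    xty = [sum(r_[i] * y for r_, y in zip(xs, ys)) for i in range(k)]
    return solve(xtx, xty)


def predict(w, tone, position):
    """Predicted duration: sum of the weights of the active dummies."""
    return sum(wi * xi for wi, xi in zip(w, encode(tone, position)))


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def make_samples(rng, n):
    """Synthetic (tone, position, duration_ms) triples with Gaussian noise."""
    out = []
    for _ in range(n):
        t, p = rng.choice(TONES), rng.choice(POSITIONS)
        d = 300.0 + TONE_EFFECT[t] + POS_EFFECT[p] + rng.gauss(0.0, 15.0)
        out.append((t, p, d))
    return out


rng = random.Random(0)
train, test = make_samples(rng, 400), make_samples(rng, 100)
w = fit(train)
predicted = [predict(w, t, p) for t, p, _ in test]
observed = [d for _, _, d in test]
r = pearson(predicted, observed)
print(f"test-set correlation: {r:.2f}")
```

The correlation printed here depends entirely on the invented effect sizes and noise level, so it says nothing about the 0.73 reported in the abstract; the sketch only shows how categorical factors enter a linear duration model and how such a correlation would be computed.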
Similar resources
Modeling Rhythmic Variation in Thai and its Application to Speech Synthesis
This study concerns a preliminary experiment on modeling the duration of Thai syllables. It is based on a corpus of minimal pairs of sentences that differ only in their stress patterns. Following a factor analysis of syllabic durations in the corpus, a simple duration model was developed. This model was used for re-synthesizing the utterances by manipulating speech from a Thai TTS system by adj...
Duration prediction using multi-level model for GPR-based speech synthesis
This paper introduces frame-based Gaussian process regression (GPR) into phone/syllable duration modeling for Thai speech synthesis. The GPR model is designed for predicting frame-level acoustic features using corresponding frame information, which includes relative position in each unit of utterance structure and linguistic information such as tone type and part of speech. Although the GPR-base...
Issues in Thai Text-to-Speech Synthesis: The NECTEC Approach
This paper presents the essential issues in developing text-to-speech synthesis for Thai: text analysis, prosody generation and speech synthesis. In the text analysis, problems in Thai text processing can be decomposed into the models of sentence extraction, phrase boundary determination and grapheme-to-phoneme conversion. The syllable duration and F0 contour generation rules are include...
Tone Question of Tree Based Context Clustering for Hidden Markov Model Based Thai Speech Synthesis
Problem statement: In HMM-based Thai speech synthesis, tone is an important issue that affects the intelligibility of the synthesized speech. Tone distortion resulting from imbalance in the training data should be appropriately treated. Approach: This study described an HMM-based speech synthesis system for the Thai language. In the system, spectrum, pitch and state duration are modeled simulta...