Syllable HMM based Mandarin TTS and comparison with concatenative TTS

نویسندگان

  • Zhiwei Shuang
  • Shiyin Kang
  • Qin Shi
  • Yong Qin
  • Lianhong Cai
چکیده

This paper introduces a Syllable HMM based Mandarin TTS system. 10-state left-to-right HMMs are used to model each syllable. We leverage the corpus and the front end of a concatenative TTS system to build the Syllable HMM based TTS system. Furthermore, we utilize the unique consonant/vowel structure of Mandarin syllable to improve the voiced/unvoiced decision of HMM states. Evaluation results show that the Syllable HMM based Mandarin TTS system with a 5.3MB’s model size can achieve an overall quality close to a concatenative TTS system with 1GB’ data size.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Data pruning approach to unit selection for inventory generation of concatenative embeddable Chinese TTS systems

In this paper, a data pruning approach is presented for building acoustic unit inventory for syllable-based concatenative embeddable Chinese TTS system. A 3-portion segmentation of a syllable is proposed based on the nature of voiced/unvoiced structure of Chinese syllable. Individual factorial acoustic measurement of syllable is used to calculate the penalty of perceptual unsatisfactory for con...

متن کامل

Spectral Continuity Measures at Mandarin Syllable Boundaries

In Text-to-Speech (TTS) systems based on concatenative synthesis, the naturalness of synthetic speech is highly affected by the spectral continuities at the concatenation point. In this paper, we focused on 4 kinds of syllable boundaries in mandarin and used several spectral distance measures combined with time derivatives distance measures to predict their audible discontinuities. A perceptual...

متن کامل

Relative Functional Comparison of Neural and Non- Neural Approaches for Syllable Segmentation in Devnagari TTS System. Prof Mrs

This paper presents methods for automatic speech signal segmentation using neural network. Speech signal segmentation is carried out to form syllables. Syllable is a common unit for concatenative TTS systems. Concatenative TTS being using speech segments of recorded speech is natural as compare to Formant or Articulatory TTS systems. This TTS stores small segments of speech and join them togeth...

متن کامل

Maximum-likelihood dynamic intonation model for concatenative text-to-speech system

In this work we present a Maximum Likelihood (ML) joint pitch curve modeling, inspired by HMM TTS synthesis concept. This model provides an optimal solution for the coarse target intonation curve (3 points per syllable) and incorporates both static and dynamic pitch values for better utterance intonation modeling. The coarse intonation curve may be optionally combined with the original pitch ex...

متن کامل

Text to Speech System for Malayalam

Text-to speech (TTS) systems which mainly meant for speech synthesis are, used for one of the South Indian languages called Malayalam. The paper makes a brief study on, Malayalam linguistics, and also gives a comparison between two prominent methodologies for speech synthesis, viz Concatenative based synthesis and HMM based synthesis. As a result, the paper mentions some of the problems facing ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2009