New approach to the polyglot speech generation by means of an HMM-based speaker adaptable synthesizer
Abstract
In this paper we present a new method for synthesizing multiple languages with the same voice, using HMM-based speech synthesis. Our approach, which we call HMM-based polyglot synthesis, consists of mixing speech data from several speakers in different languages to create a speaker- and language-independent (SI) acoustic model. We then adapt the resulting SI model to a specific speaker to create a speaker-dependent (SD) acoustic model. Using the SD model, it is possible to synthesize any of the languages used to train the SI model with the voice of that speaker, regardless of the speaker's own language. We show that our method performs better than methods based on phone mapping, for both adaptation and synthesis. Furthermore, for languages not included during training, the performance of our approach equals or surpasses that of any monolingual synthesizer based on the languages used to train the multilingual one. This means that our method can be used to create synthesizers for languages for which no speech resources are available. © 2006 Elsevier B.V. All rights reserved.
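As a rough illustration of the adaptation step, the sketch below estimates a linear transform that moves the Gaussian means of an SI model toward a target speaker. The abstract does not say which adaptation algorithm is used, so the MLLR-style mean transform here is an assumption, and it is simplified further to one scale and bias per feature dimension with a fixed frame-to-state alignment. All function names and the toy data are hypothetical.

```python
from typing import List, Sequence, Tuple


def estimate_diag_mean_transform(
    means: Sequence[Sequence[float]],
    variances: Sequence[Sequence[float]],
    frames: Sequence[Sequence[float]],
    alignment: Sequence[int],
) -> List[Tuple[float, float]]:
    """Return one (scale, bias) pair per feature dimension.

    For each dimension, maximizing the Gaussian likelihood of the adaptation
    frames under adapted means (scale * mean + bias) reduces to a weighted
    least-squares line fit with weights 1 / variance.
    """
    dim = len(means[0])
    transform: List[Tuple[float, float]] = []
    for d in range(dim):
        sw = sx = sy = sxx = sxy = 0.0
        for frame, state in zip(frames, alignment):
            w = 1.0 / variances[state][d]
            x = means[state][d]   # SI mean of the aligned state
            y = frame[d]          # observed target-speaker feature
            sw += w
            sx += w * x
            sy += w * y
            sxx += w * x * x
            sxy += w * x * y
        denom = sw * sxx - sx * sx
        if abs(denom) < 1e-12:
            # Only one distinct mean was observed: estimate a bias only.
            scale = 1.0
            bias = (sy - sx) / sw
        else:
            scale = (sw * sxy - sx * sy) / denom
            bias = (sy - scale * sx) / sw
        transform.append((scale, bias))
    return transform


def adapt_means(
    means: Sequence[Sequence[float]],
    transform: Sequence[Tuple[float, float]],
) -> List[List[float]]:
    """Apply the per-dimension (scale, bias) transform to every state mean."""
    return [[a * m + b for m, (a, b) in zip(row, transform)] for row in means]


# Toy SI model: three states, two feature dimensions (made-up numbers).
si_means = [[0.0, 1.0], [2.0, -1.0], [4.0, 3.0]]
si_vars = [[1.0, 1.0], [0.5, 2.0], [1.5, 1.0]]

# Toy adaptation data: the target speaker shifts and scales every mean.
true_transform = [(1.5, 0.5), (0.8, -1.0)]
alignment = [0, 0, 1, 1, 2, 2]
frames = [
    [a * si_means[s][d] + b for d, (a, b) in enumerate(true_transform)]
    for s in alignment
]

transform = estimate_diag_mean_transform(si_means, si_vars, frames, alignment)
adapted = adapt_means(si_means, transform)
```

Because a single transform is shared by all states, a small amount of speech from the target speaker moves every mean in the model, including means of phones that occur only in other languages. A full MLLR implementation would use full matrices, soft state occupancies and regression classes, none of which are shown here.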
Similar resources
New approach to polyglot synthesis: how to speak any language with anyone’s voice
In this paper we present a new method to synthesize multiple languages with the voice of any arbitrary speaker. We call this method “HMM-based speaker-adaptable polyglot synthesis”. The idea consists of mixing data from several speakers in different languages to create a speaker-independent multilingual acoustic model. By means of MLLR, we can adapt this model to the voice of any given speaker. ...
A Study on Speaker-Adaptable Multilingual Synthesis
This thesis introduces a new method for synthesizing multiple languages with the voice of any speaker, so that for example Japanese speech can be synthesized with the voice of a Russian monolingual speaker. This approach is based on the hypothesis that the average voice created by mixing a sufficient number of speakers is the same for all languages, i.e., the average voice is equivalent to a po...
Cross-lingual voice conversion-based polyglot speech synthesizer for Indian languages
A polyglot speech synthesizer synthesizes speech for any given monolingual or multilingual text in a single speaker’s voice. In this regard, a polyglot speech corpus is required. It is difficult to find a speaker proficient in multiple languages. Therefore, in the current work, by exploiting the acoustic similarity of phonemes across Indian languages, a polyglot speech corpus is obtained for ...
Speaker and language adaptive training for HMM-based polyglot speech synthesis
This paper proposes a novel technique for speaker and language adaptive training for HMM-based statistical parametric polyglot speech synthesis. Language-specific context-dependencies in the system are captured using CAT with cluster-dependent decision trees. Acoustic variations caused by speaker characteristics are handled by CMLLR-based transforms. This framework allows multi-speaker/multi-la...
HMM-based polyglot speech synthesis by speaker and language adaptive training
This paper describes a technique for speaker and language adaptive training (SLAT) for HMM-based polyglot speech synthesis and its evaluations on a multi-lingual speech corpus. The SLAT technique allows multi-speaker/multi-language adaptive training and synthesis to be performed. Experimental results show that the SLAT technique achieves better naturalness than both speaker-adaptively trained l...
Journal title:
- Speech Communication
Volume 48
Publication date: 2006