The “Kiel Corpus of Read Speech” as a Resource for Prosody Prediction in Speech Synthesis
Abstract
The naturalness of synthetic speech depends strongly on the prediction of appropriate prosody. For the present study, the original annotation of the German speech database “Kiel Corpus of Read Speech” was extended automatically with syntactic features, word frequency, and syllable boundaries. Several classification and regression trees for predicting symbolic prosody features, postlexical phonological processes, duration, and F0 were trained on this database. The perceptual evaluation showed that the overall perceptual quality of the German text-to-speech system MARY can be significantly improved by training all models that contribute to prosody prediction on the same database. Furthermore, it showed that the error introduced by symbolic prosody prediction is perceptually equal to the error produced by a direct method that does not exploit any symbolic prosody features.
Similar resources
Study on Unit-Selection and Statistical Parametric Speech Synthesis Techniques
One of the interesting topics in the multimedia domain is enabling computers to produce speech. Speech synthesis grants the computer the human ability to produce speech. The data-based approach and the process-based approach are the two main approaches to speech synthesis. Each approach has its own challenges. Unit-selection speech synthesis and statistical parametr...
A corpus-based Chinese speech synthesis with contextual dependent unit selection
This paper describes the realization of a corpus-based Chinese speech synthesis system, including the corpus design and unit selection procedure. The system selects the synthesis unit according to context similarity between target unit and candidate unit. Neither prosody parameter prediction nor prosody feature modification is needed. The informal test shows that the synthesized speech is quite...
Rule-based Prosody Prediction for German Text-to-Speech Synthesis
This paper presents two empirical studies that examine the influence of different linguistic aspects on prosody in German. First, we analysed a German corpus with respect to the effect of syntax and information status on prosody. Second, we conducted a listening test which investigated the prosodic realisation of constituents in the German ’Vorfeld’ depending on their information status. The re...
Automatic labeling of Japanese prosody using J-ToBI style description
Speech corpora with prosodic labels are getting more and more important not only for speech synthesis but also for discourse modeling. A widely used labeling system for Japanese prosody, J-ToBI, however, is insufficient for applications like discourse modeling and it even lacks an accurate method for automatic labeling. In this paper, we propose an automatic labeling method for J-ToBI style des...
Syllable detection in read and spontaneous speech
Automatic syllable detection is an important task when analysing very large speech corpora in order to answer questions concerning prosody, rhythm, speech rate, speech recognition and synthesis. In this paper a new method for automatic detection of syllable nuclei is presented. Two large spoken language corpora (PhonDatII, Verbmobil) were labelled by three phoneticians and then used to adjust t...