Multi-level rhythm control for speech synthesis using hybrid data driven and rule-based approaches
نویسندگان
چکیده
This paper presents: a multi-level concept to generate the speech rhythm in the Dresden TTS system for German (DreSS). The rhythm control includes the phrase, the syllabic and the phonemic level. The concept allows the alternative use of rulebased or statistical, but also data driven methods on these levels. To create the rules and to train a neural network, a new speech corpus from original speakers of the diphone-based inventories has been recorded. The corpus covers texts and single utterances and is subdivided into phrase, syllabic and phonemic databases. First results indicating that the rule-based and the train-based methods generate a comparable speech rhythm, if the databases are uniform. The stepwise duration control on several prosodic levels shows promise as a method of producing a flexible rhythm depending on the specific TTS application.
منابع مشابه
Learning the parameters of quantitative prosody models
The article introduces a novel hybrid data driven and rule based approach for the prosody control in a TTS system, which combines the advantages of well-balanced, quantitative models with the flexible training of derived model parameters. Instancing the training of Fujisaki intonation parameters for German (MFGI) the article describes the hybrid data driven and rule based architecture HYDRA, th...
متن کاملCreating an individual speech rhythm: a data driven approach
Generating a near-to-natural speech rhythm can greatly contribute to the user's acceptance of TTS systems. Beside common aspects of the rhythm control (correctness of the segmental durations, robust function, etc.) rhythmic flexibility for several applications and individual speaking styles are desired. This article describes a data driven concept, which aims at the generation of an individual ...
متن کاملProposing a Novel Cost Sensitive Imbalanced Classification Method based on Hybrid of New Fuzzy Cost Assigning Approaches, Fuzzy Clustering and Evolutionary Algorithms
In this paper, a new hybrid methodology is introduced to design a cost-sensitive fuzzy rule-based classification system. A novel cost metric is proposed based on the combination of three different concepts: Entropy, Gini index and DKM criterion. In order to calculate the effective cost of patterns, a hybrid of fuzzy c-means clustering and particle swarm optimization algorithm is utilized. This ...
متن کاملStudy on Unit-Selection and Statistical Parametric Speech Synthesis Techniques
One of the interesting topics on multimedia domain is concerned with empowering computer in order to speech production. Speech synthesis is granting human abilities to the computer for speech production. Data-based approach and process-based approach are the two main approaches on speech synthesis. Each approach has its varied challenges. Unit-selection speech synthesis and statistical parametr...
متن کاملAre rule-based syllabification methods adequate for languages with low syllabic complexity? the case of Italian
Syllabification information is a valuable component in speech synthesis systems. Linguistic rule-based methods have been assumed to be the best technique for determining the syllabification of unknown words. This has recently been shown to be incorrect for the English language where data-driven algorithms have been shown to outperform rule-based methods. It may be possible, however, that data-d...
متن کامل