Parametric model for vocal effort interpolation with harmonics plus noise models
Abstract
Voice quality is known to play an important role in expressive speech. In this paper, we present a methodology for modifying vocal effort level, which text-to-speech (TTS) systems can apply to gain the flexibility needed to improve the naturalness of synthesized speech. This extends previous work using low-order Linear Prediction Coefficients (LPC), where flexibility was constrained by the number of vocal effort levels available in the corpora. The proposed methodology overcomes this limitation by replacing the low-order LPC with ninth-order polynomials, which allows vocal effort to be modified towards the available templates and also allows intermediate vocal effort levels to be generated between the levels available in the training data. This flexibility comes from combining Harmonics plus Noise Models with a parametric model of the spectral envelope. The perceptual tests conducted demonstrate the effectiveness of the proposed technique in performing vocal effort interpolation while maintaining signal quality in the final synthesis. The proposed technique can be used in unit-selection TTS systems to reduce corpus size while increasing flexibility, and it could potentially be employed by HMM-based speech synthesis systems if appropriate acoustic features are used.
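The interpolation idea can be illustrated with a minimal sketch. The abstract does not say what the ninth-order polynomial is fitted to or how intermediate levels are computed, so the following assumes (hypothetically) that the polynomial models harmonic log-amplitudes over normalized frequency and that intermediate effort levels are obtained by linear blending of the coefficients. The function names and the synthetic data are illustrative only and are not taken from the paper.

```python
import numpy as np
from numpy.polynomial import polynomial as P

ORDER = 9  # ninth-order polynomial, as stated in the abstract


def _normalize(freqs_hz, fs):
    # Map [0, fs/2] onto [-1, 1] to keep the least-squares fit well conditioned.
    return 4.0 * np.asarray(freqs_hz, dtype=float) / fs - 1.0


def fit_envelope(freqs_hz, log_amps, fs):
    """Fit polynomial coefficients to harmonic log-amplitudes (assumed target)."""
    return P.polyfit(_normalize(freqs_hz, fs), log_amps, ORDER)


def interpolate_envelopes(coef_low, coef_high, alpha):
    """Blend two coefficient sets; alpha=0 gives the low level, alpha=1 the high.

    Linear blending in coefficient space is an assumption of this sketch.
    """
    return (1.0 - alpha) * np.asarray(coef_low) + alpha * np.asarray(coef_high)


def eval_envelope(coef, freqs_hz, fs):
    """Evaluate the polynomial envelope at the given frequencies."""
    return P.polyval(_normalize(freqs_hz, fs), coef)


# Synthetic example: two frames with different spectral tilt (made-up values).
fs = 16000
harmonics = 100.0 * np.arange(1, 80)   # harmonics of a 100 Hz fundamental
x = harmonics / (fs / 2)
soft = -40.0 * x                       # steeper tilt, standing in for low effort
loud = -15.0 * x                       # flatter tilt, standing in for high effort

c_soft = fit_envelope(harmonics, soft, fs)
c_loud = fit_envelope(harmonics, loud, fs)
c_mid = interpolate_envelopes(c_soft, c_loud, 0.5)
mid_env = eval_envelope(c_mid, harmonics, fs)
```

Because a polynomial is linear in its coefficients, blending coefficients in this sketch is equivalent to blending the two fitted envelopes point by point. In a Harmonics plus Noise setting, the blended envelope would then be sampled at the harmonic frequencies to set the amplitudes used for resynthesis.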
Similar resources
Improving Voice Outcomes After Injury to the Recurrent Laryngeal Nerve
Objectives: The present study aimed to determine the voice outcomes before and after the administration of voice therapy in patients who suffered an injury to the recurrent laryngeal nerve after undergoing thyroidectomy. Methods: The sample consisted of 26 patients (2 males and 24 females) aged between 18 and 80 years (m=55±12) who experienced injury to the recurrent laryngeal nerve fol...
A hybrid harmonics-and-bursts modelling approach to speech synthesis
Statistical speech synthesis systems rely on a parametric speech generation model, typically some sort of vocoder. Vocoders are great for voiced speech because they offer independent control over voice source (e.g. pitch) and vocal tract filter (e.g. vowel quality) through control parameters that typically vary smoothly in time and lend themselves well to statistical modelling. Voiceless sounds...
Vocal performance affects metabolic rate in dolphins: implications for animals communicating in noisy environments.
Many animals produce louder, longer or more repetitious vocalizations to compensate for increases in environmental noise. Biological costs of increased vocal effort in response to noise, including energetic costs, remain empirically undefined in many taxa, particularly in marine mammals that rely on sound for fundamental biological functions in increasingly noisy habitats. For this investigatio...
A New Empirical Model to Increase the Accuracy of Software Cost Estimation (TECHNICAL NOTE)
We can say a software project is successful when it is delivered on time, within the budget and maintaining the required quality. However, nowadays software cost estimation is a critical issue for the advance software industry. As the modern software’s behaves dynamically so estimation of the effort and cost is significantly difficult. Since last 30 years, more than 20 models are already develo...
Traffic Noise Mapping in Urban 3D Area by Using GIS and CORTN Model
Urban communities have been developing and they are being industrialized. These developments have some benefits for these communities; however, they have created some significant problems. One of these problems in this area is traffic and road congestion and following that, noise pollution on urban areas. These days, noise pollution is one the considerable problems that the residents of crowded...