A Neural Parametric Singing Synthesizer Modeling Timbre and Expression from Natural Songs
Abstract
We recently presented a new model for singing synthesis based on a modified version of the WaveNet architecture. Instead of modeling the raw waveform, we model features produced by a parametric vocoder that separates the influence of pitch and timbre. This makes it convenient to modify pitch to match any target melody, facilitates training on more modest dataset sizes, and significantly reduces training and generation times. Nonetheless, compared to modeling the waveform directly, effectively handling higher-dimensional outputs, multiple feature streams, and regularization becomes more important with our approach. In this work, we extend our proposed system to include additional components for predicting F0 and phonetic timings from a musical score with lyrics. These expression-related features are learned together with timbral features from a single set of natural songs. We compare our method to existing statistical parametric, concatenative, and neural network-based approaches using quantitative metrics as well as listening tests.
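Because the vocoder keeps pitch in its own F0 stream, a target melody can in principle be imposed by replacing that stream. The following is a minimal sketch of the simplest form of such a target, a flat frame-wise F0 track derived from note pitches and durations. It is not the system described in the abstract, which learns F0 from natural songs. The 5 ms hop size, the `(midi_note, duration_s)` score format, and the function names `midi_to_hz` and `score_to_f0` are assumptions made for illustration.

```python
def midi_to_hz(note: float, a4_hz: float = 440.0) -> float:
    """Equal-tempered frequency of a MIDI note number (69 = A4)."""
    return a4_hz * 2.0 ** ((note - 69) / 12.0)


def score_to_f0(notes, hop_s: float = 0.005):
    """Expand (midi_note, duration_s) pairs into a frame-wise F0 track in Hz.

    A note value of None marks a rest and yields unvoiced frames (0.0 Hz).
    The hop size of 5 ms is an assumed analysis frame step.
    """
    f0 = []
    for note, duration_s in notes:
        n_frames = round(duration_s / hop_s)
        hz = 0.0 if note is None else midi_to_hz(note)
        f0.extend([hz] * n_frames)
    return f0


# Hypothetical three-event score: A4, a rest, then A5.
melody = [(69, 0.5), (None, 0.25), (81, 0.5)]
f0_track = score_to_f0(melody)
```

A step-wise contour like this has no vibrato, note transitions, or other expressive detail, which is the gap a learned F0 predictor is meant to fill.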
Related resources
A Neural Parametric Singing Synthesizer
We present a new model for singing synthesis based on a modified version of the WaveNet architecture. Instead of modeling raw waveform, we model features produced by a parametric vocoder that separates the influence of pitch and timbre. This allows conveniently modifying pitch to match any target melody, facilitates training on more modest dataset sizes, and significantly reduces training and g...
A Music Information Retrieval System Based on Singing Voice Timbre
We developed a music information retrieval system based on singing voice timbre, i.e., a system that can search for songs in a database that have similar vocal timbres. To achieve this, we developed a method for extracting feature vectors that represent characteristics of singing voices and calculating the vocal-timbre similarity between two songs by using a mutual information content of their ...
A singing style modeling system for singing voice synthesizers
This paper describes a statistical method for modeling singing styles. In this system, singing expression parameters consisting of melody and dynamics, which are derived from F0 and power, are modeled by context-dependent Hidden Markov Models (HMMs). The modeling method is optimized for dealing with these parameters. Since parameters we focus on are essential but general ones for ...
Generating Singing Voice Expression Contours Based on Unit Selection
A common problem of many current singing voice synthesizers is that obtaining a natural-sounding and expressive performance requires a lot of manual user input, which makes it a time-consuming and difficult task. In this paper we introduce a unit selection-based approach for the generation of expression parameters that control the synthesizer. Given the notes of a target score, the system is...
Expressive Singing Synthesis Based on Unit Selection for the Singing Synthesis Challenge 2016
Sample-based and statistically based singing synthesizers typically require a large amount of data for automatically generating expressive synthetic performances. In this paper we present a singing synthesizer that, using two rather small databases, is able to generate expressive synthesis from an input consisting of notes and lyrics. The system is based on unit selection and uses the Wide-Band Harmoni...