نتایج جستجو برای: Wavenet
تعداد نتایج: 91 فیلتر نتایج به سال:
This paper presents an efficient implementation of the Wavenet generation process called Fast Wavenet. Compared to a naı̈ve implementation that has complexity O(2) (L denotes the number of layers in the network), our proposed approach removes redundant convolution operations by caching previous calculations, thereby reducing the complexity to O(L) time. Timing experiments show significant advant...
In this study, we propose a speaker-dependent WaveNet vocoder, a method of synthesizing speech waveforms with WaveNet, by utilizing acoustic features from existing vocoder as auxiliary features of WaveNet. It is expected that WaveNet can learn a sample-by-sample correspondence between speech waveform and acoustic features. The advantage of the proposed method is that it does not require (1) exp...
Here, we propose a new approach for modeling conditional probability distributions of polyphonic music by combining WaveNET and CRF-RNN variants, and show that this approach beats LSTM and WaveNET baselines that do not take into account the statistical dependencies between simultaneous notes.
The recently-developed WaveNet architecture [27] is the current state of the art in realistic speech synthesis, consistently rated as more natural sounding for many different languages than any previous system. However, because WaveNet relies on sequential generation of one audio sample at a time, it is poorly suited to today’s massively parallel computers, and therefore hard to deploy in a rea...
Various sources have reported the WaveNet deep learning architecture being able to generate high-quality speech, but to our knowledge there haven’t been studies on the interpretation or visualization of trained WaveNets. This study investigates the possibility that WaveNet understands speech by unsupervisedly learning an acoustically meaningful latent representation of the speech signals in its...
We introduce FFTNet, a deep learning approach synthesizing audio waveforms. Our approach builds on the recent WaveNet project, which showed that it was possible to synthesize a natural sounding audio waveform directly from a deep convolutional neural network. FFTNet offers two improvements over WaveNet. First it is substantially faster, allowing for real-time synthesis of audio waveforms. Secon...
When the aim is to make an arbitrary nonlin-ear mapping, neural networks are known to be a suitable technique. The Wavenet combines them with the wavelet transform, enabling a multi-scale approximation, while dilation and translation parameters can be t to the data. Some properties of the Wavenet are investigated and an outlook to application in machinery monitoring is provided.
This paper presents a statistical voice conversion (VC) technique with the WaveNet-based waveform generation. VC based on a Gaussian mixture model (GMM) makes it possible to convert the speaker identity of a source speaker into that of a target speaker. However, in the conventional vocoding process, various factors such as F0 extraction errors, parameterization errors and over-smoothing effects...
In this paper we present algorithms which are adaptive and based on neural networks and wavelet series to build wavenets function approximators. Results are shown in numerical simulation of two wavenets approximators architectures: the first is based on a wavenet for approach the signals under study where the parameters of the neural network are adjusted online, the other uses a scheme approxim...
This paper describes Tacotron 2, a neural network architecture for speech synthesis directly from text. The system is composed of a recurrent sequence-to-sequence feature prediction network that maps character embeddings to mel-scale spectrograms, followed by a modified WaveNet model acting as a vocoder to synthesize timedomain waveforms from those spectrograms. Our model achieves a mean opinio...
نمودار تعداد نتایج جستجو در هر سال
با کلیک روی نمودار نتایج را به سال انتشار فیلتر کنید