Search results for: wavenet

Number of results: 91

Journal: CoRR 2016
Tom Le Paine, Pooya Khorrami, Shiyu Chang, Yang Zhang, Prajit Ramachandran, Mark A. Hasegawa-Johnson, Thomas S. Huang

This paper presents an efficient implementation of the Wavenet generation process called Fast Wavenet. Compared to a naïve implementation that has complexity O(2^L) (L denotes the number of layers in the network), our proposed approach removes redundant convolution operations by caching previous calculations, thereby reducing the complexity to O(L) time. Timing experiments show significant advant...
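The caching idea in this abstract can be illustrated with a minimal sketch. It assumes a toy network that is not taken from the paper: one scalar channel, kernel size 2, dilation 2^l at layer l, linear layers, and made-up weights from a hypothetical `make_weights` helper. The naive version recomputes the whole dependency tree for each output sample, while the cached version keeps one queue of past inputs per layer and does a single operation per layer per step.

```python
from collections import deque


def make_weights(num_layers):
    # Hypothetical fixed (w_past, w_current) pairs, for illustration only.
    return [(0.5, 0.25 + 0.1 * layer) for layer in range(num_layers)]


def naive_output(x, t, weights, counter):
    """Recompute the top-layer value at time t by recursion.

    Every node asks for two nodes in the layer below, so once t is past
    the receptive field this costs 2**L - 1 operations for L layers.
    counter is a one-element list used to count operations.
    """
    def node(layer, time):
        if time < 0:
            return 0.0  # zero padding before the start of the signal
        if layer == 0:
            return x[time]  # layer 0 is the raw input
        w_past, w_cur = weights[layer - 1]
        dilation = 2 ** (layer - 1)
        counter[0] += 1
        return (w_past * node(layer - 1, time - dilation)
                + w_cur * node(layer - 1, time))

    return node(len(weights), t)


class CachedGenerator:
    """Keeps, for each layer, a queue of the inputs that layer has seen."""

    def __init__(self, weights):
        self.weights = weights
        # Layer l needs its input from 2**l steps ago, so its queue holds
        # exactly 2**l past inputs, initialised to the zero padding.
        self.queues = [deque([0.0] * (2 ** layer), maxlen=2 ** layer)
                       for layer in range(len(weights))]
        self.ops = 0

    def step(self, sample):
        h = sample
        for (w_past, w_cur), queue in zip(self.weights, self.queues):
            past = queue[0]   # this layer's input from 2**l steps ago
            queue.append(h)   # maxlen silently drops the oldest entry
            h = w_past * past + w_cur * h
            self.ops += 1     # one operation per layer per step: O(L)
        return h
```

Here `x` is a fixed input sequence so that both versions can be compared sample by sample; in real autoregressive generation each new input would be the previously generated sample, and real WaveNet layers use many channels and gated nonlinearities rather than a scalar linear map.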

2017
Akira Tamamori, Tomoki Hayashi, Kazuhiro Kobayashi, Kazuya Takeda, Tomoki Toda

In this study, we propose a speaker-dependent WaveNet vocoder, a method of synthesizing speech waveforms with WaveNet, by utilizing acoustic features from an existing vocoder as auxiliary features of WaveNet. It is expected that WaveNet can learn a sample-by-sample correspondence between speech waveform and acoustic features. The advantage of the proposed method is that it does not require (1) exp...

2017
Umut Güçlü, Yağmur Güçlütürk, Luca Ambrogioni, Eric Maris, Rob van Lier, Marcel van Gerven

Here, we propose a new approach for modeling conditional probability distributions of polyphonic music by combining WaveNET and CRF-RNN variants, and show that this approach beats LSTM and WaveNET baselines that do not take into account the statistical dependencies between simultaneous notes.

Journal: CoRR 2017
Aäron van den Oord, Yazhe Li, Igor Babuschkin, Karen Simonyan, Oriol Vinyals, Koray Kavukcuoglu, George van den Driessche, Edward Lockhart, Luis C. Cobo, Florian Stimberg, Norman Casagrande, Dominik Grewe, Seb Noury, Sander Dieleman, Erich Elsen, Nal Kalchbrenner, Heiga Zen, Alex Graves, Helen King, Tom Walters, Dan Belov, Demis Hassabis

The recently-developed WaveNet architecture [27] is the current state of the art in realistic speech synthesis, consistently rated as more natural sounding for many different languages than any previous system. However, because WaveNet relies on sequential generation of one audio sample at a time, it is poorly suited to today’s massively parallel computers, and therefore hard to deploy in a rea...

Journal: CoRR 2018
Kanru Hua

Various sources have reported the WaveNet deep learning architecture being able to generate high-quality speech, but to our knowledge there haven’t been studies on the interpretation or visualization of trained WaveNets. This study investigates the possibility that WaveNet understands speech by unsupervisedly learning an acoustically meaningful latent representation of the speech signals in its...

2018
Zeyu Jin, Adam Finkelstein, Gautham J. Mysore, Jingwan Lu

We introduce FFTNet, a deep learning approach synthesizing audio waveforms. Our approach builds on the recent WaveNet project, which showed that it was possible to synthesize a natural sounding audio waveform directly from a deep convolutional neural network. FFTNet offers two improvements over WaveNet. First it is substantially faster, allowing for real-time synthesis of audio waveforms. Secon...

1997
Alexander Ypma, Robert P. W. Duin

When the aim is to make an arbitrary nonlinear mapping, neural networks are known to be a suitable technique. The Wavenet combines them with the wavelet transform, enabling a multi-scale approximation, while dilation and translation parameters can be fit to the data. Some properties of the Wavenet are investigated and an outlook to application in machinery monitoring is provided.
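As a rough illustration of what fitting dilation and translation parameters to data can look like, the sketch below builds a one-dimensional wavelet network as a weighted sum of dilated and translated copies of a mother wavelet and takes plain gradient-descent steps on all three parameter groups. The abstract does not specify any of this: the Mexican-hat wavelet, the squared-error loss, the learning rate, and every function name here are assumptions chosen for illustration.

```python
import math


def mexican_hat(u):
    """Mexican-hat mother wavelet (unnormalised): (1 - u^2) * exp(-u^2 / 2)."""
    return (1.0 - u * u) * math.exp(-0.5 * u * u)


def mexican_hat_grad(u):
    """Derivative of mexican_hat with respect to u."""
    return (u ** 3 - 3.0 * u) * math.exp(-0.5 * u * u)


def predict(x, params):
    """params is a list of (weight, translation, dilation) triples."""
    return sum(w * mexican_hat((x - t) / d) for w, t, d in params)


def loss(xs, ys, params):
    """Mean of half the squared error over the data set."""
    return sum(0.5 * (predict(x, params) - y) ** 2
               for x, y in zip(xs, ys)) / len(xs)


def gradient_step(xs, ys, params, lr=0.02):
    """One gradient-descent step on weights, translations and dilations."""
    n = len(xs)
    grads = [[0.0, 0.0, 0.0] for _ in params]
    for x, y in zip(xs, ys):
        err = predict(x, params) - y
        for g, (w, t, d) in zip(grads, params):
            u = (x - t) / d
            dpsi = mexican_hat_grad(u)
            g[0] += err * mexican_hat(u) / n        # d loss / d weight
            g[1] += err * w * dpsi * (-1.0 / d) / n  # d loss / d translation
            g[2] += err * w * dpsi * (-u / d) / n    # d loss / d dilation
    return [(w - lr * gw, t - lr * gt, d - lr * gd)
            for (w, t, d), (gw, gt, gd) in zip(params, grads)]
```

A typical use would start from a few wavelets spread over the input range, for example `params = [(0.5, -1.0, 1.0), (0.5, 0.0, 1.0), (0.5, 1.0, 1.0)]`, and call `gradient_step` repeatedly while monitoring `loss`.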

2017
Kazuhiro Kobayashi, Tomoki Hayashi, Akira Tamamori, Tomoki Toda

This paper presents a statistical voice conversion (VC) technique with the WaveNet-based waveform generation. VC based on a Gaussian mixture model (GMM) makes it possible to convert the speaker identity of a source speaker into that of a target speaker. However, in the conventional vocoding process, various factors such as F0 extraction errors, parameterization errors and over-smoothing effects...

2011
Carlos Roberto Domínguez Mayorga, María Angélica Espejel Rivera, Luis Enrique Ramos Velasco, Julio César Ramos Fernández, Enrique Escamilla Hernández

In this paper we present adaptive algorithms, based on neural networks and wavelet series, for building wavenet function approximators. Numerical simulation results are shown for two wavenet approximator architectures: the first is based on a wavenet that approximates the signals under study, with the parameters of the neural network adjusted online; the other uses a scheme approxim...

Journal: CoRR 2017
Jonathan Shen, Ruoming Pang, Ron J. Weiss, Mike Schuster, Navdeep Jaitly, Zongheng Yang, Zhifeng Chen, Yu Zhang, Yuxuan Wang, R. J. Skerry-Ryan, Rif A. Saurous, Yannis Agiomyrgiannakis, Yonghui Wu

This paper describes Tacotron 2, a neural network architecture for speech synthesis directly from text. The system is composed of a recurrent sequence-to-sequence feature prediction network that maps character embeddings to mel-scale spectrograms, followed by a modified WaveNet model acting as a vocoder to synthesize time-domain waveforms from those spectrograms. Our model achieves a mean opinio...
