Search results for: wavenet

Number of results: 91

2003
Shu-Fai Wong Kwan-Yee Kenneth Wong

Human body tracking is useful in applications such as medical diagnostics, human-computer interfaces, and visual surveillance. In this paper, a trajectory-learning algorithm using wavenet is proposed to track the human body in real time without sacrificing accuracy. The human body is located within a small search window using color and shape as heuristics. The location and size of the searching window are...

2017
Jesse Engel Cinjon Resnick Adam Roberts Sander Dieleman Mohammad Norouzi Douglas Eck Karen Simonyan

Generative models in vision have seen rapid progress due to algorithmic improvements and the availability of high-quality image datasets. In this paper, we offer contributions in both these areas to enable similar progress in audio modeling. First, we detail a powerful new WaveNet-style autoencoder model that conditions an autoregressive decoder on temporal codes learned from the raw audio wave...

2017

This paper introduces HybridNet, a hybrid neural network to speed-up autoregressive models for raw audio waveform generation. As an example, we propose a hybrid model that combines an autoregressive network named WaveNet and a conventional LSTM model to address speech synthesis. Instead of generating one sample per time-step, the proposed HybridNet generates multiple samples per time-step by ex...

Journal: :CoRR 2017
W. Bastiaan Kleijn Felicia S. C. Lim Alejandro Luebs Jan Skoglund Florian Stimberg Quan Wang Thomas C. Walters

Traditional parametric coding of speech facilitates low rate but provides poor reconstruction quality because of the inadequacy of the model used. We describe how a WaveNet generative speech model can be used to generate high quality speech from the bit stream of a standard parametric coder operating at 2.4 kb/s. We compare this parametric coder with a waveform coder based on the same generativ...

2016
Aäron van den Oord Sander Dieleman Heiga Zen Karen Simonyan Oriol Vinyals Alex Graves Nal Kalchbrenner Andrew W. Senior Koray Kavukcuoglu

This paper introduces WaveNet, a deep neural network for generating raw audio waveforms. The model is fully probabilistic and autoregressive, with the predictive distribution for each audio sample conditioned on all previous ones; nonetheless we show that it can be efficiently trained on data with tens of thousands of samples per second of audio. When applied to text-to-speech, it yields state-...

Journal: :CoRR 2017
Dan Elbaz Michael Zibulevsky

PESQ, Perceptual Evaluation of Speech Quality [5], and POLQA, Perceptual Objective Listening Quality Assessment [1], are standards comprising a test methodology for automated assessment of voice quality of speech as experienced by human beings. The predictions of those objective measures should come as close as possible to subjective quality scores as obtained in subjective listening tests, us...

Journal: :Journal of Physics: Conference Series 2020

Journal: :Optics Communications 2021

Express Wavenet is an improved optical diffractive neural network. At each layer, it uses a wavelet-like pattern to modulate the phase of waves. For an input image with n^2 pixels, Express Wavenet reduces the number of parameters from O(n^2) to O(n). It needs only about one percent of the parameters, and accuracy remains very high. On the MNIST dataset, it needs only 1229 parameters to reach 92%, while a standard network needs 125440 parameters. The r...

2017
Kaizhi Qian Yang Zhang Shiyu Chang Xuesong Yang Dinei A. F. Florêncio Mark Hasegawa-Johnson

In recent years, deep learning has achieved great success in speech enhancement. However, there are two major limitations regarding existing works. First, the Bayesian framework is not adopted in many such deep-learning-based algorithms. In particular, the prior distribution for speech in the Bayesian framework has been shown useful by regularizing the output to be in the speech space, and thus...

Journal: :CoRR 2017
Dario Rethage Jordi Pons Xavier Serra

Currently, most speech processing techniques use magnitude spectrograms as frontend and are therefore by default discarding part of the signal: the phase. In order to overcome this limitation, we propose an end-to-end learning method for speech denoising based on Wavenet. The proposed model adaptation retains Wavenet’s powerful acoustic modeling capabilities, while significantly reducing its ti...
