speech coding

Fully vector-quantized neural network-based code-excited nonlinear predictive speech coding

Journal: IEEE Trans. Speech and Audio Processing, 1994

Lizhong Wu, Mahesan Niranjan, Frank Fallside

Recent studies have shown that non-linear prediction can be implemented with neural networks, and that non-linear predictors achieve on average about 2-3 dB improvement in prediction gain over conventional linear predictors. In this paper, we take advantage of non-linear prediction with neural networks, apply it to predictive speech coding, and attempt to improve speech coding performance....
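As a rough illustration of the prediction gain figure quoted in this abstract, the sketch below computes the gain in dB of a conventional least-squares linear predictor, i.e. the baseline side of the comparison. It is not the paper's neural predictor: the function name `prediction_gain_db`, the synthetic AR(2) test signal, and the predictor order are all assumptions made for the example.

```python
import numpy as np

def prediction_gain_db(signal, order=10):
    """Prediction gain in dB of a least-squares linear predictor (hypothetical helper)."""
    x = np.asarray(signal, dtype=float)
    # Column k holds the sample k+1 steps before each target sample.
    past = np.column_stack(
        [x[order - k - 1 : len(x) - k - 1] for k in range(order)]
    )
    target = x[order:]
    # Solve for the predictor coefficients in the least-squares sense.
    coeffs, *_ = np.linalg.lstsq(past, target, rcond=None)
    residual = target - past @ coeffs
    # Gain = signal energy over prediction-error energy, expressed in dB.
    return 10.0 * np.log10(np.sum(target ** 2) / np.sum(residual ** 2))

# Synthetic, strongly correlated AR(2) signal standing in for a speech frame (assumption).
rng = np.random.default_rng(0)
noise = rng.standard_normal(4000)
x = np.zeros_like(noise)
for n in range(2, len(x)):
    x[n] = 1.6 * x[n - 1] - 0.8 * x[n - 2] + noise[n]

gain = prediction_gain_db(x, order=2)        # correlated signal: clearly positive gain
white_gain = prediction_gain_db(noise, order=2)  # white noise: gain close to 0 dB
print(f"AR(2) gain: {gain:.1f} dB, white-noise gain: {white_gain:.2f} dB")
```

A non-linear predictor would be evaluated the same way, by substituting its prediction error for `residual`; the improvement cited in the abstract is the difference between the two gains.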


A WIDEBAND CELP CODER AT 16 kbit/s FOR REAL TIME APPLICATIONS

2007

Erik Harborg, Arild Fuldseth, Finn Tore Johansen, Jan Eikeset Knudsen

Since its introduction in 1984, Code Excited Linear Predictive (CELP) [1] coding has received considerable attention for high quality speech coding at low bit-rates. Although most of the research has been focused on coding of narrowband (200-3400 Hz) speech, some recent studies on CELP coding of wideband (50-7000 Hz) speech have been reported [2], [3], [4]. A possible application for wideband...


Syllable-based pitch encoding for low bit rate speech coding with recognition/synthesis architecture

2013

Milos Cernak, Xingyu Na, Philip N. Garner

Current HMM-based low bit rate speech coding systems work with phonetic vocoders. Pitch contour coding (on frame or phoneme level) is usually fairly orthogonal to other speech coding parameters. We make an assumption in our work that the speech signal contains supra-segmental cues. Hence, we present encoding of the pitch on the syllable level, used in the framework of a recognition/synthesis sp...


A Simple Continuous Excitation Model for Parametric Vocoding

2015

Philip N. Garner, Milos Cernak, Blaise Potard

We describe a continuous-pitch parametric vocoder suitable for speech coding and statistical text to speech synthesis. The spectral model is based on linear prediction. We show that glottal modelling techniques from recent literature can be cherry-picked to produce an excitation signal with properties known to be useful in the above application areas. We further show that the continuous pitch p...


Modelos de evolución de la Tecnología del Habla, y tendencias futuras

Journal: Procesamiento del Lenguaje Natural, 2003

Luis A. Hernández Gómez

This talk proposes a common model for describing the evolution of different technologies involved in Speech Technology. The proposed model describes the evolution from a general knowledge of basic language theory concepts that is combined with powerful data-driven processing algorithms. After presenting the overall technological success of speech coding, synthesis and recognition, we discuss the...


Prominence based scoring of speech segments for automatic speech-to-speech summarization

2010

Sree Harsha Yella, Vasudeva Varma, Kishore Prahallad

In order to perform speech summarization it is necessary to identify important segments in the speech signal. The importance of a speech segment can be effectively determined by using information from lexical and prosodic features. Standard speech summarization systems depend on ASR transcripts or gold standard human reference summaries to train a supervised system which combines lexical and prosod...


Automatic Parameter Estimation for a Context-Independent Speech Segmentation Algorithm

2002

Guido Aversano, Anna Esposito

In the framework of a recently introduced algorithm for speech phoneme segmentation, a novel strategy has been developed for comparing different speech encoding methods and for finding the parameters that are optimal for the algorithm. The automatic procedure that implements this strategy improves on previously reported performance and lays the basis for a more accurate comparison between ...


Towards flexible speech coding for speech synthesis: an LF + modulated noise vocoder

2008

Yannis Agiomyrgiannakis, Olivier Rosec

This paper presents an ARX-LF-based model of speech that is amenable to low-bit-rate quantization and speech modifications directly at the parametric domain. The new model successfully addresses the non-deterministic part of voiced speech by modulating noise with the glottal flow, while unvoiced speech and transients are synthesized by modulating noise with a signal-derived time envelope. The p...


Lexical tone production by Cantonese speakers with Parkinson's disease

2009

Joan Ka-Yin Ma

The aim of this study was to investigate lexical tone production in Cantonese speakers associated with Parkinson’s disease (PD speakers). The effect of intonation on the production of lexical tone was also examined. Speech data was collected from five Cantonese PD speakers. Speech materials consisted of targets contrasting in tones, embedded in different sentence contexts (initial, medial and f...


Prosodic cues of spontaneous speech in French

2005

Katarina Bartkova

Disfluencies, when present in the speech signal, can make syntactic parsing difficult. This difficulty is increased when machines are involved in communication and when speech devices rely on automatic speech recognition techniques. In order to improve automatic speech parsing and thus speech comprehension, methods have been proposed to filter disfluencies out of the speech signal. Attempts have ...
