speaker transformation

Speaker normalization and adaptation based on linear transformation

1997

Jun Ishii, Masahiro Tonomura

We propose novel speaker independent (SI) modeling and speaker adaptation based on a linear transformation. An SI model and speaker dependent (SD) models are usually generated using the same preprocessing of acoustic data. This straightforward preprocessing causes a serious problem. Probability distributions of the SI models become broad and the SI models do not give good initial estimates for ...

Spectral normalization employing hidden Markov modeling of line spectrum pair frequencies

1997

Bryan L. Pellom, John H. L. Hansen

This paper proposes a spectral normalization approach in which the acoustical qualities of an input speech waveform are mapped onto that of a desired neutral voice. Such a method can be effective in reducing the impact of speaker variability such as accent, stress, and emotion for speech recognition. In the proposed method, the transformation is performed by modeling the temporal characteristics...

Structural KLD for Cross-variety Speaker Adaptation in HMM-based Speech Synthesis

2012

Markus E. Toman, Michael Pucher

While the synthesis of natural sounding, neutral style speech can be achieved using today’s technology, fast adaptation of speech synthesis to different contexts and situations still poses a challenge. In the context of variety modeling (dialects, sociolects) we have to cope with the problem that no standardized orthographic form is available and that existing speech resources for these varieti...

Single-pass adapted training with all-pass transforms

1999

John W. McDonough, William J. Byrne

In recent work, the all-pass transform (APT) was proposed as the basis of a speaker adaptation scheme intended for use with a large vocabulary speech recognition system. It was shown that APT-based adaptation reduces to a linear transformation of cepstral means, much like the better known maximum likelihood linear regression (MLLR), but is specified by far fewer free parameters. Due to its line...
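As a rough illustration of what "a linear transformation of cepstral means" looks like in the MLLR style, the sketch below applies one shared affine map, mu' = A mu + b, to a set of Gaussian means. It is a minimal sketch and not the APT method itself: the function name `adapt_means` and all numeric values are hypothetical, and the estimation of A and b from adaptation data is not shown.

```python
import numpy as np

def adapt_means(means, A, b):
    """Apply one shared affine transform to every Gaussian mean.

    means: (n_gaussians, dim) array of cepstral mean vectors.
    A:     (dim, dim) transform matrix (assumed already estimated).
    b:     (dim,) bias vector (assumed already estimated).
    """
    means = np.asarray(means, dtype=float)
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    # Row-wise version of mu' = A @ mu + b.
    return means @ A.T + b

# Toy values, purely illustrative.
means = np.array([[1.0, 0.0], [0.0, 2.0]])
A = np.array([[1.0, 0.5], [0.0, 1.0]])
b = np.array([0.1, -0.1])
adapted = adapt_means(means, A, b)
```

A full matrix A with a bias has dim * (dim + 1) free parameters per transform, which is the count the abstract contrasts with the much smaller APT parameterization.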

Acoustic Analysis of Whispered Speech for Phoneme and Speaker Dependency

2011

Xing Fan, Keith W. Godin, John H. L. Hansen

Whisper is used by speakers in certain circumstances to protect personal information. Due to the differences in production mechanisms between neutral and whispered speech, there are considerable differences between the spectral structure of neutral and whispered speech, such as formant shifts and shifts in spectral slope. This study analyzes the dependency of these differences on speakers and p...

Feature normalization using smoothed mixture transformations

2006

Patrick Kenny, Vishwa Gupta, Gilles Boulianne, Pierre Ouellet, Pierre Dumouchel

We propose a method for estimating the parameters of SPLICE-like transformations from individual utterances so that this type of transformation can be used to normalize acoustic feature vectors for speech recognition on an utterance-by-utterance basis in a similar manner to cepstral mean normalization. We report results on an in-house French language multi-speaker database collected while deploy...
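For readers unfamiliar with the baseline the abstract compares against, utterance-level cepstral mean normalization can be sketched as follows. This shows only the standard mean-subtraction baseline, not the SPLICE-like transformation the paper proposes; the function name and the toy feature values are hypothetical.

```python
import numpy as np

def cepstral_mean_normalize(features):
    """Subtract the per-utterance mean from each cepstral coefficient.

    features: (n_frames, n_coeffs) array for a single utterance.
    """
    features = np.asarray(features, dtype=float)
    # The mean is taken over frames, so each coefficient is centered at zero.
    return features - features.mean(axis=0, keepdims=True)

# Toy utterance with two frames and two coefficients.
utterance = np.array([[1.0, 2.0], [3.0, 6.0]])
normalized = cepstral_mean_normalize(utterance)
```

Because the statistics come from the utterance itself, no speaker enrollment data is needed, which is the property the abstract refers to with "utterance-by-utterance basis".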

Speaker verification on the polycost database using frequency filtered spectral energies

1998

Javier Hernando, Climent Nadeu

The spectral parameters that result from filtering the frequency sequence of log mel-scaled filter-bank energies with a first or second order FIR filter have proved to be competitive for speech recognition. Recently, the authors have shown that this frequency filtering can approximately equalize the cepstrum variance enhancing the oscillations of the spectral envelope curve that are most effect...

Discriminative PLDA training with application-specific loss functions for speaker verification

2014

Johan Rohdin, Sangeeta Biswas, Koichi Shinoda

Speaker verification systems are usually evaluated by a weighted average of their false acceptance (FA) rate and false rejection (FR) rate. The weights are known as the operating point (OP) and depend on the application. Recent research suggests that, for the purpose of score calibration of speaker verification systems, it is beneficial to let discriminative training emphasize the operating ...
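The evaluation measure described here, a weighted average of the FA and FR rates, can be sketched as below. This is a minimal illustration under stated assumptions: the function names, the scores, the threshold, and the weights `w_fa` and `w_fr` are all hypothetical, and the paper's exact definition of the operating point is not visible in the truncated abstract.

```python
def error_rates(target_scores, nontarget_scores, threshold):
    """Return (FA rate, FR rate) for a fixed decision threshold.

    A trial is accepted when its score is >= threshold.
    """
    # False rejection: a genuine (target) trial scored below the threshold.
    fr = sum(s < threshold for s in target_scores) / len(target_scores)
    # False acceptance: an impostor (non-target) trial scored at or above it.
    fa = sum(s >= threshold for s in nontarget_scores) / len(nontarget_scores)
    return fa, fr

def weighted_cost(fa, fr, w_fa, w_fr):
    """Weighted average of the two error rates; the weights play the role of the operating point."""
    return w_fa * fa + w_fr * fr

# Toy scores, purely illustrative.
target_scores = [0.9, 0.4, 0.8, 0.7]
nontarget_scores = [0.1, 0.6, 0.2, 0.3]
fa, fr = error_rates(target_scores, nontarget_scores, threshold=0.5)
cost = weighted_cost(fa, fr, w_fa=0.9, w_fr=0.1)
```

Shifting weight toward `w_fa` penalizes accepting impostors more heavily, which is the kind of application dependence the abstract mentions.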

Speaker clustering and transformation for speaker adaptation in speech recognition systems

Journal: IEEE Trans. Speech and Audio Processing, 1998

Mukund Padmanabhan, Lalit R. Bahl, David Nahamoo, Michael Picheny

A speaker adaptation strategy is described that is based on finding a subset of speakers, from the training set, who are acoustically close to the test speaker, and using only the data from these speakers (rather than the complete training corpus) to reestimate the system parameters. Further, a linear transformation is computed for every one of the selected training speakers to better map the t...
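The selection step, picking the training speakers who are acoustically closest to the test speaker, might be sketched as a nearest-neighbour search. The abstract does not say how closeness is measured, so the Euclidean distance between per-speaker summary vectors used below is purely an assumption, as are the function name and the toy vectors; the per-speaker linear transformations and the re-estimation step are not shown.

```python
import numpy as np

def select_closest_speakers(test_vec, train_vecs, k):
    """Return indices of the k training speakers nearest to the test speaker.

    test_vec:   (dim,) summary vector for the test speaker (hypothetical representation).
    train_vecs: (n_speakers, dim) summary vectors for the training speakers.
    Closeness is Euclidean distance here, an assumed stand-in for the paper's measure.
    """
    test_vec = np.asarray(test_vec, dtype=float)
    train_vecs = np.asarray(train_vecs, dtype=float)
    dists = np.linalg.norm(train_vecs - test_vec, axis=1)
    # argsort orders speakers from nearest to farthest.
    return np.argsort(dists)[:k]

# Toy speaker vectors, purely illustrative.
train_vecs = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [0.9, 1.2]])
selected = select_closest_speakers([1.0, 1.0], train_vecs, k=2)
```

Only the data from the selected speakers would then be used for re-estimation, rather than the complete training corpus.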

A study on speaker normalization using vocal tract normalization and speaker adaptive training

1998

Lutz Welling, Reinhold Häb-Umbach, X. Aubert, N. Haberland

Although speaker normalization is attempted in very different manners, vocal tract normalization (VTN) and speaker adaptive training (SAT) share many common properties. We show that both lead to more compact representations of the phonetically relevant variations of the training data and that both achieve improved error rate performance only if a complementary normalization or adaptation operat...
