Search results for: speaker transformation
Number of results: 242055
Constrained Maximum Likelihood Linear Regression (CMLLR) is a speaker adaptation method for speech recognition that can be realized as a feature-space transformation. In its original form it does not work well when the amount of speech available for adaptation is less than about five seconds, because of the difficulty of robustly estimating the parameters of the transformation matrix. In this pa...
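As a rough illustration of what "realized as a feature-space transformation" means, the sketch below applies an affine map x_hat = A x + b to each feature frame and computes the log|det A| term that is added to the per-frame log-likelihood when features are transformed this way. The function names, the 2-dimensional toy matrix, and the frame values are assumptions made for this example only; estimating A and b from adaptation data, which is the hard part the abstract refers to, is not shown.

```python
import math

def apply_feature_transform(frames, A, b):
    """Apply x_hat = A x + b to every feature frame (hypothetical helper)."""
    adapted = []
    for x in frames:
        adapted.append([
            sum(a_ij * x_j for a_ij, x_j in zip(row, x)) + b_i
            for row, b_i in zip(A, b)
        ])
    return adapted

def log_abs_det_2x2(A):
    """log|det A| for a 2x2 matrix: the Jacobian term of the transform."""
    return math.log(abs(A[0][0] * A[1][1] - A[0][1] * A[1][0]))

# Toy values, chosen only so the arithmetic is easy to follow.
A = [[2.0, 0.0],
     [0.0, 0.5]]
b = [1.0, -1.0]
frames = [[1.0, 2.0],
          [0.0, 4.0]]

adapted = apply_feature_transform(frames, A, b)
jacobian_term = log_abs_det_2x2(A)
```

Because the transform acts on the features, the acoustic model itself stays unchanged; only the frames and the Jacobian term differ per speaker.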
This paper describes a new speaker verification system based on orthogonal Gaussian mixture modeling (GMM) techniques combined with maximum a posteriori (MAP) adaptation. In most GMM-based speaker verification systems, the covariance matrix of each component is constrained to be diagonal for computational simplicity. However, this approximation inevitably introduces performance degradation. T...
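The abstract is cut off before the method is described, so the following is only a generic sketch of why an orthogonal transformation can help with the diagonal constraint, not the paper's algorithm. It assumes a toy 2x2 covariance matrix and a hand-picked rotation whose columns are that matrix's eigenvectors; in the rotated coordinates the covariance becomes diagonal, so a diagonal model loses nothing there.

```python
import math

def matmul(X, Y):
    """Plain-Python matrix product (hypothetical helper for this sketch)."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

def transpose(X):
    return [list(row) for row in zip(*X)]

# Toy full covariance with correlated dimensions (assumed values).
cov = [[2.0, 1.0],
       [1.0, 2.0]]

# Orthogonal matrix whose columns are the eigenvectors of cov.
s = 1.0 / math.sqrt(2.0)
R = [[s, s],
     [s, -s]]

# Covariance of the rotated features y = R^T x is R^T cov R.
rotated_cov = matmul(matmul(transpose(R), cov), R)
```

Here the off-diagonal entries of `rotated_cov` vanish and the diagonal holds the eigenvalues 3 and 1, which is the sense in which decorrelated features suit a diagonal-covariance component.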
This paper addresses the issue of closed-set text-independent speaker identification from samples of speech recorded over the telephone. It focuses on the effects of acoustic mismatches between training and testing data, and concentrates on two approaches: 1) extracting features that are robust against channel variations and 2) transforming the speaker models to compensate for channel effects. ...
We give a unification of several different speaker recognition problems in terms of the general speaker partitioning problem, where a set of N inputs has to be partitioned into subsets according to speaker. We show how to solve this problem in terms of a simple generative model and demonstrate performance on NIST SRE 2006 and 2008 data. Our solution yields probabilistic outputs, which we show h...
In the context of automatic speaker recognition, we propose a model transformation technique that renders speaker models more robust to acoustic mismatches and to data scarcity by appropriately increasing their variances. We use a stereo database containing speech recorded simultaneously under different acoustic conditions to derive a synthetic variance distribution. This distribution is then us...
Confidence measures are expected to give a measure of reliability on the result of a speech/speaker recognition system. The most commonly used confidence measures are based on posterior word or phoneme probabilities, which can be obtained from the output of the recognizer. In this paper we introduce a linear interpretation of the posterior-probability-based confidence measure by using inverse Fisher trans...
This paper presents a novel algorithm that modifies the speech uttered by a source speaker to sound as if produced by a target speaker. In particular, we address the issue of transformation of the vocal tract characteristics from one speaker to another. The approach is based on estimating spectral envelopes using radial basis function (RBF) networks, which is one of the well-known models of art...
An adaptation of the mixed-excitation linear predictive (MELP) model for application in voice conversion is presented. The adapted model features only numerical parameters, which can be used for phonetic space transformation from source to target speaker using machine learning methods. The validity of the model was demonstrated by applying transformation to both the pitch and the spectral envelope o...
A new multiscale voice morphing algorithm using radial basis function (RBF) analysis is presented in this paper. The approach copes well with small training sets of high dimension, which is a problem often encountered in voice morphing. The aim of this algorithm is to transform one person’s speech pattern so that it is perceived as if it was spoken by another speaker. The voice morphing system ...