Transformation enhanced multi-grained modeling for text-independent speaker recognition
Abstract
We describe our formulation of transformation enhanced data modeling, which we use to develop a multi-grained data analysis approach to text-independent speaker recognition. The broad goal is to address difficulties caused by sparse training and test data. First, we detail our development of maximum likelihood transformation based recognition with diagonally constrained Gaussian mixture models, and give results showing its robustness to decreasing amounts of training data. Then, using these models as building blocks, we develop a multi-grained model structure. For this, the training data must be labeled, e.g. with an HMM based phone labeler. A graduated phone class structure is then used to train the speaker model at various levels of detail. This structure is a tree whose root node contains all the phones; subsequent levels partition the phones into increasingly fine-grained linguistic classes. We demonstrate the effectiveness of the modeling with identification and verification experiments.
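The graduated phone class structure described above can be pictured with a small sketch. This is only an illustration under stated assumptions, not the authors' implementation: the class names (`vowels`, `consonants`, `stops`, `nasals`), the tiny phone inventory, and the helpers `PhoneClassNode`, `is_partition`, and `pool_frames` are all hypothetical. It shows how phone-labeled frames could be pooled at each node of the tree so that a model can be trained per node; the Gaussian mixture training and the maximum likelihood transformation are left out.

```python
from dataclasses import dataclass, field


@dataclass
class PhoneClassNode:
    """One node of the phone class tree: a named set of phones plus finer child classes."""
    name: str
    phones: frozenset
    children: list = field(default_factory=list)


def is_partition(node):
    """Check that each node's children are disjoint and together cover the node's phones."""
    if not node.children:
        return True
    seen = set()
    for child in node.children:
        if seen & child.phones:
            return False  # two sibling classes share a phone
        seen |= child.phones
    return seen == set(node.phones) and all(is_partition(c) for c in node.children)


def pool_frames(node, labeled_frames, pools=None):
    """Map each node name to the feature frames whose phone label falls in that node."""
    if pools is None:
        pools = {}
    pools[node.name] = [feat for phone, feat in labeled_frames if phone in node.phones]
    for child in node.children:
        pool_frames(child, labeled_frames, pools)
    return pools


# Hypothetical three-level tree; the phone inventory and class names are illustrative only.
stops = PhoneClassNode("stops", frozenset({"p", "t", "k"}))
nasals = PhoneClassNode("nasals", frozenset({"m", "n"}))
consonants = PhoneClassNode("consonants", stops.phones | nasals.phones, [stops, nasals])
vowels = PhoneClassNode("vowels", frozenset({"aa", "iy", "uw"}))
root = PhoneClassNode("all", vowels.phones | consonants.phones, [vowels, consonants])

# Toy labeled frames: (phone label, feature vector) pairs, as a phone labeler might emit.
frames = [("aa", [0.1, 0.2]), ("t", [0.5, 0.4]), ("m", [0.3, 0.9]), ("iy", [0.0, 0.7])]
pools = pool_frames(root, frames)
```

In this sketch the root pool holds every frame, while deeper nodes hold progressively smaller subsets, which is one way to see why sparse data makes the finest-grained models the hardest to train.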
Similar resources
Very large population text-independent speaker identification using transformation enhanced multi-grained models
The paper presents results on speaker identification with a population size of over 10000 speakers. Speaker modeling is accomplished via our Transformation Enhanced Multi-Grained Models. We pursue two goals. The first is to study the performance of a number of different systems within the modeling framework of multi-grained models. The second is to analyze performance as a function of population ...
Multi-Grained Modeling with Pattern Specific Maximum
We present a transformation based, multi-grained data modeling technique in the context of text-independent speaker recognition, aimed at mitigating difficulties caused by sparse training and test data. Both identification and verification are addressed, where we view the entire population as divided into the target population and its complement, which we refer to as the background population. F...
Text-Independent Speaker Verification via State Alignment
To model the speech utterance at a finer granularity, this paper presents a novel state-alignment based supervector modeling method for text-independent speaker verification, which takes advantage of the state-alignment method used in hidden Markov model (HMM) based acoustic modeling in speech recognition. In this way, the proposed modeling method can convert a text-independent speaker verification...
A Speech Biometrics System with Multi-Grained Speaker Modeling
The paper describes a system for voice-based personal authentication in a conceptual framework of conversational speech biometrics relying on two sources of authentication: the acoustic voice-print and the user knowledge. Several technologies are closely integrated: speaker recognition and speech recognition as well as natural language understanding and dialog management. The first part of this p...
A segmental approach to text-independent speaker verification
Current text-independent speaker verification systems are usually based on modeling globally the probability density function (PDF) of the speaker feature vectors. In this paper, segmental approaches to text-independent speaker verification are discussed. Unlike the schemes based on Large Vocabulary Continuous Speech Recognition (LVCSR) with previously trained phone models, our systems are based ...