Comparisons of recent speaker recognition approaches based on word-conditioning
Abstract
We examine the effectiveness of several speaker recognition approaches based on word-conditioning. Subsets of 62 keywords (used for word-conditioning) are evaluated, individually and in combination, with four approaches: a keyword HMM approach, a supervector keyword HMM approach, a keyword phone N-gram approach, and a keyword phone HMM approach. In the individual keyword results, acoustic features prove effective and keyword frequency proves important: the keywords "yeah" and "you know" outperform the others. The keyword combination experiments demonstrate the power of SVMs in conjunction with acoustic features: the supervector keyword HMM approach (4.3% EER) outperforms the other keyword-based approaches and achieves a 6.5% relative improvement over the GMM baseline (4.6% EER) on the SRE06 8-conversation-side task.
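The 6.5% figure quoted in the abstract is consistent with a relative reduction in equal error rate (EER) from 4.6% to 4.3%. The sketch below checks that arithmetic and shows one simple way an EER can be approximated from trial scores. The function names and the toy scores are illustrative assumptions, not taken from the paper, and the threshold sweep is a minimal approximation rather than the evaluation tooling the authors used.

```python
def equal_error_rate(target_scores, impostor_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    A trial is accepted when its score is >= threshold. The EER is taken at
    the threshold where false-accept and false-reject rates are closest.
    """
    best_gap, eer = float("inf"), None
    for threshold in sorted(set(target_scores) | set(impostor_scores)):
        # False rejects: target (same-speaker) trials scored below the threshold.
        frr = sum(s < threshold for s in target_scores) / len(target_scores)
        # False accepts: impostor trials scored at or above the threshold.
        far = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


def relative_reduction(baseline, system):
    """Relative improvement of `system` over `baseline`, in percent."""
    return 100.0 * (baseline - system) / baseline


# Toy scores, invented for illustration only.
toy_targets = [0.9, 0.8, 0.7, 0.4]
toy_impostors = [0.6, 0.3, 0.2, 0.1]
toy_eer = equal_error_rate(toy_targets, toy_impostors)

# EERs reported in the abstract: GMM baseline 4.6%, supervector keyword HMM 4.3%.
improvement = relative_reduction(4.6, 4.3)
print(f"toy EER: {toy_eer:.2%}, relative improvement: {improvement:.1f}%")
```

Note that the absolute difference between the two reported systems is only 0.3 percentage points; the 6.5% value appears only when that difference is expressed relative to the baseline.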
Similar resources
Closed-Set Speaker Identification Based on a Single Word Utterance: An Evaluation of Alternative Approaches
The problem of closed-set speaker identification based on a single spoken word from a limited vocabulary is relevant to several current and futuristic interactive multimedia applications. In this paper, we evaluate the effectiveness of several potential solutions using an isolated word speech corpus. In addition to evaluating the text-dependent and text-constrained variants of the Gaussian Mixt...
Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation
A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...
Clustered maximum likelihood linear basis for rapid speaker adaptation
Speaker space based adaptation methods for automatic speech recognition have been shown to provide significant performance improvements for tasks where only a few seconds of adaptation speech is available. This paper proposes a robust, low complexity technique within this general class that has been shown to reduce word error rate, reduce the large storage requirements associated with speaker s...
Phonetic Speaker Recognition with Support Vector Machines
A recent area of significant progress in speaker recognition is the use of high level features—idiolect, phonetic relations, prosody, discourse structure, etc. A speaker not only has a distinctive acoustic sound but uses language in a characteristic manner. Large corpora of speech data available in recent years allow experimentation with long term statistics of phone patterns, word patterns, et...