Speaker Recognition System for Limited Speech Data Using High-Level Speaker Specific Features and Support Vector Machines
Abstract
High-level speaker-specific features (HLSSFs), such as pronunciation style, word usage, phonotactics and prosody, are a main subject of state-of-the-art research on automatic speaker recognition (ASR). In this paper, we experimentally evaluate HLSSF extraction and support vector machine (SVM)-based modelling techniques. HLSSF extraction produces patterns of symbols for each speaker during ASR training. During both training and testing, these patterns are converted into n-gram frequencies for a given voice sample. We implemented ASR using SVMs on the n-gram frequencies, with a new kernel based on the linear log-probability proportional scoring framework. This approach yielded strong results across a variety of high-level features in ASR. We showed that the proposed ASR based on the linear log-probability proportional scoring framework is superior to other standard log-probability frameworks. The equal error rate (EER) of our ASR method was 2.5%, a 2% improvement over the standard method.
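The abstract does not give the kernel's formula, so the following is only a rough sketch of the general idea: symbol sequences are turned into n-gram relative frequencies, and two samples are compared with a linear kernel in which each n-gram is scaled by the inverse square root of its background frequency. This scaling is an assumption borrowed from the common term-frequency log-likelihood-ratio style of kernel, not necessarily the authors' method. The function names (`ngram_freqs`, `background_freqs`, `scaled_linear_kernel`) and the toy symbol strings are hypothetical.

```python
from collections import Counter


def ngrams(symbols, n=2):
    """Return the list of n-grams (as tuples) in a symbol sequence."""
    return [tuple(symbols[i:i + n]) for i in range(len(symbols) - n + 1)]


def ngram_freqs(symbols, n=2):
    """Relative frequency of each n-gram in one voice sample's symbol sequence."""
    counts = Counter(ngrams(symbols, n))
    total = sum(counts.values())
    return {g: c / total for g, c in counts.items()} if total else {}


def background_freqs(sequences, n=2):
    """Relative n-gram frequencies pooled over many samples (the background)."""
    counts = Counter()
    for seq in sequences:
        counts.update(ngrams(seq, n))
    total = sum(counts.values())
    return {g: c / total for g, c in counts.items()}


def scaled_linear_kernel(p, q, background):
    """Linear kernel on frequencies scaled by 1/sqrt(background frequency).

    (p/sqrt(b)) * (q/sqrt(b)) simplifies to p*q/b, so rare n-grams that two
    samples share contribute more than common ones. Assumed form, see lead-in.
    """
    shared = p.keys() & q.keys()
    return sum(p[g] * q[g] / background[g] for g in shared if g in background)


# Toy symbol sequences standing in for HLSSF extraction output (hypothetical).
sample_a = list("abab")
sample_b = list("abba")
bg = background_freqs([sample_a, sample_b])
k_ab = scaled_linear_kernel(ngram_freqs(sample_a), ngram_freqs(sample_b), bg)
```

A Gram matrix built from such kernel values could then be passed to an SVM that accepts a precomputed kernel, for example scikit-learn's `SVC(kernel="precomputed")`.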
Related articles
A Comparative Study of Gender and Age Classification in Speech Signals
Accurate gender classification is useful in speech and speaker recognition as well as speech emotion classification, because a better performance has been reported when separate acoustic models are employed for males and females. Gender classification is also apparent in face recognition, video summarization, human-robot interaction, etc. Although gender classification is rather mature in a...
Speaker and Speech recognition by Audio-Visual lip biometrics
This paper proposes a new robust bi-modal audio visual speech and speaker recognition system by lip-motion and speech biometrics. To increase the robustness of speech and speaker recognition, we have proposed a method using speaker lip motion information extracted from video sequences with low resolution (128 ×128 pixels). In this paper we investigate a biometric system for speech recognition a...
Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation
A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...
MLLR transforms as features in speaker recognition
We explore the use of adaptation transforms employed in speech recognition systems as features for speaker recognition. This approach is attractive because, unlike standard frame-based cepstral speaker recognition models, it normalizes for the choice of spoken words in text-independent speaker verification. Affine transforms are computed for the Gaussian means of the acoustic models used in a re...