Utterance Verification for Text-Dependent Speaker Recognition: A Comparative Assessment Using the RedDots Corpus
Abstract
Text-dependent automatic speaker verification naturally calls for the simultaneous verification of speaker identity and spoken content. These two tasks can be achieved with automatic speaker verification (ASV) and utterance verification (UV) technologies. While both have been addressed previously in the literature, a treatment of simultaneous speaker and utterance verification with a modern, standard database is so far lacking. This is despite the burgeoning demand for voice biometrics in a plethora of practical security applications. With the goal of improving overall verification performance, this paper reports different strategies for simultaneous ASV and UV in the context of short-duration, text-dependent speaker verification. Experiments performed on the recently released RedDots corpus are reported for three different ASV systems and four different UV systems. Results show that the combination of utterance verification with automatic speaker verification is (almost) universally beneficial with significant performance improvements being observed.
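The abstract does not say which combination strategies were evaluated, so the following is only a minimal sketch of two generic ways an ASV score and a UV score could be combined for a single trial. All names (`Trial`, `accept_gated`, `accept_fused`), the thresholds, and the fusion weight are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    # Hypothetical scores; higher means "more likely genuine".
    speaker_score: float    # output of some ASV system
    utterance_score: float  # output of some UV system


def accept_gated(trial: Trial, speaker_threshold: float, utterance_threshold: float) -> bool:
    """Decision-level combination: accept only if BOTH checks pass."""
    return (trial.speaker_score >= speaker_threshold
            and trial.utterance_score >= utterance_threshold)


def accept_fused(trial: Trial, weight: float, threshold: float) -> bool:
    """Score-level combination: threshold a weighted sum of the two scores.

    `weight` is the share given to the speaker score (assumed to lie in [0, 1]).
    """
    fused = weight * trial.speaker_score + (1.0 - weight) * trial.utterance_score
    return fused >= threshold


# Right speaker, wrong pass-phrase: the gate rejects, while an evenly
# weighted sum can still accept because the speaker score dominates.
right_speaker_wrong_text = Trial(speaker_score=2.0, utterance_score=-1.0)
gated_decision = accept_gated(right_speaker_wrong_text, 0.0, 0.0)
fused_decision = accept_fused(right_speaker_wrong_text, 0.5, 0.0)
```

The gated rule treats a wrong pass-phrase as disqualifying regardless of how well the voice matches, whereas the fused rule lets a strong score on one task compensate for a weak score on the other; which behaviour is preferable depends on the application.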
Related papers
Text Dependent Speaker Verification Using Un-Supervised HMM-UBM and Temporal GMM-UBM
In this paper, we investigate the Hidden Markov Model (HMM) and the temporal Gaussian Mixture Model (GMM) systems based on the Universal Background Model (UBM) concept to capture temporal information of speech for Text Dependent (TD) Speaker Verification (SV). In TD-SV, target speakers are constrained to use only predefined fixed sentence/s during both the enrollment and the test process. The t...
Tandem Features for Text-Dependent Speaker Verification on the RedDots Corpus
We use tandem features and a fusion of four systems for text-dependent speaker verification on the RedDots corpus. In the tandem system, a senone-discriminant neural network provides a low-dimensional bottleneck feature at each frame, which is concatenated with a standard Mel-frequency cepstral coefficient (MFCC) feature representation. The concatenated features are propagated to a conventional...
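The frame-wise concatenation described in this abstract can be illustrated with a small sketch. It assumes both feature streams are already computed and frame-aligned; the function name `tandem_features` and the toy dimensions are hypothetical, and the bottleneck network itself is not modelled here.

```python
from typing import List, Sequence


def tandem_features(
    mfcc_frames: Sequence[Sequence[float]],
    bottleneck_frames: Sequence[Sequence[float]],
) -> List[List[float]]:
    """Concatenate MFCC and bottleneck vectors frame by frame.

    Both inputs must contain one vector per frame, in the same frame order.
    """
    if len(mfcc_frames) != len(bottleneck_frames):
        raise ValueError("feature streams must have the same number of frames")
    return [list(m) + list(b) for m, b in zip(mfcc_frames, bottleneck_frames)]


# Toy example: 3 frames, 4-dimensional MFCCs, 2-dimensional bottleneck features.
mfcc = [[0.1, 0.2, 0.3, 0.4] for _ in range(3)]
bottleneck = [[1.0, 2.0] for _ in range(3)]
tandem = tandem_features(mfcc, bottleneck)
```

Each output frame has the MFCC values first, followed by the bottleneck values, so its dimensionality is the sum of the two input dimensionalities.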
Incorporating pass-phrase dependent background models for text-dependent speaker verification
In this paper, we propose a pass-phrase dependent background model (PBM) for text-dependent (TD) speaker verification (SV) that integrates the pass-phrase identification process into the conventional TD-SV system without a separate identification system. A PBM is derived from a text-independent background model through adaptation using the utterances of a particular pass-phrase. Duri...
i-Vector/HMM Based Text-Dependent Speaker Verification System for RedDots Challenge
Recently, a new data collection was initiated within the RedDots project in order to evaluate text-dependent and text-prompted speaker recognition technology on data from a wider speaker population and with more realistic noise, channel and phonetic variability. This paper analyses our systems built for the RedDots challenge – the effort to collect and compare the initial results on this new evalua...
Parallel Speaker and Content Modelling for Text-Dependent Speaker Verification
Text-dependent short duration speaker verification involves two challenges. The primary challenge of interest is the verification of the speaker’s identity, and often a secondary challenge of interest is the verification of the lexical content of the pass-phrase. In this paper, we propose the use of two systems to handle these two tasks in parallel with one subsystem modelling speaker identity ...