An investigation into vocal tract length normalisation
Abstract
This paper investigates several different methods for performing vocal tract length normalisation (VTLN), each of which is either completely linear or piecewise linear. It also considers the combination of VTLN with either standard unconstrained maximum likelihood linear regression (MLLR) or constrained MLLR. Results on the Switchboard corpus show that there is little difference in performance between the different forms of VTLN and that, as previously reported, the effects of VTLN and unconstrained MLLR are largely additive. However, when multiple iterations of constrained MLLR are used, there is no additional advantage to also using VTLN.
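To make the distinction between a completely linear and a piecewise-linear warp concrete, the following is a minimal sketch of two frequency-warping functions. It is not taken from the paper: the function names, the 4000 Hz band edge (the Nyquist frequency of 8 kHz telephone speech), and the `cut_ratio` breakpoint are illustrative assumptions, and the paper's actual warping functions may differ.

```python
def linear_warp(freq_hz: float, alpha: float) -> float:
    """Completely linear warp: scale every frequency by the warp factor alpha."""
    return alpha * freq_hz


def piecewise_linear_warp(
    freq_hz: float,
    alpha: float,
    f_max: float = 4000.0,     # assumed band edge (Nyquist for 8 kHz audio)
    cut_ratio: float = 0.875,  # hypothetical breakpoint as a fraction of f_max
) -> float:
    """Piecewise-linear warp: scale by alpha below the breakpoint, then follow
    a straight line that ends at (f_max, f_max) so the band edge is preserved.

    Assumes alpha * cut_ratio < 1, otherwise the upper segment would slope down.
    """
    f_cut = cut_ratio * f_max
    if freq_hz <= f_cut:
        return alpha * freq_hz
    # Slope of the segment joining (f_cut, alpha * f_cut) to (f_max, f_max).
    slope = (f_max - alpha * f_cut) / (f_max - f_cut)
    return alpha * f_cut + slope * (freq_hz - f_cut)


if __name__ == "__main__":
    for f in (1000.0, 3500.0, 4000.0):
        print(f, linear_warp(f, 1.1), piecewise_linear_warp(f, 1.1))
```

With a warp factor above 1, the purely linear warp maps the top of the band beyond `f_max`, whereas the piecewise-linear form bends the mapping so that `f_max` stays fixed. This is one common motivation for the piecewise variant, stated here as background rather than as a finding of the paper.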
Similar resources
On combining vocal tract length normalisation and speaker adaptation for noise robust speech recognition
This paper investigates the combination of vocal tract length normalisation and speaker adaptation in connected digit recognition. In particular, we focus on performing this task under a continuously varying car noise environment. Continuous supervised speaker and environment adaptation is carried out on the test data according to the Bayesian framework. The paper also evaluates various approac...
Adaptation and normalization experiments in speech recognition for 4 to 8 year old children
An experimental offline investigation of the performance of connected digits recognition was performed on children in the age range four to eight years. Poor performance using adult models was improved significantly by adaptation and vocal tract length normalisation but not to the same level as training on children. Age dependent models were tried with limited advantage. A combined adult and ch...
Utilise Vocal Tract Length Normalisation for Robust Automatic Language Identification
This paper investigates the application of Vocal Tract Length Normalisation (VTLN) for robust automatic Language Identification (LID). Two different LID systems are utilised to evaluate the improvement in performance obtained by the application of VTLN. The first system is a Gaussian Mixture Model based system that utilised the Universal Background Model technique to improve efficiency. The sec...
An analysis of the size information in classical formant data: Peterson and Barney (1952) revisited
Irino and Patterson (2002) have suggested the Mellin Transform as a model for vocal tract normalisation in the auditory system. In this report, we reanalyse the classical formant data reported by Peterson and Barney (1952) to see if it supports the normalisation hypothesis. The vowel formant data are clustered, quantitatively, using very general assumptions about speaker-variability. These clus...