gaussian mixed model gmm

A Statistical Sample-Based Approach to GMM-Based Voice Conversion Using Tied-Covariance Acoustic Models

Journal: IEICE Transactions 2016

Shinnosuke Takamichi Tomoki Toda Graham Neubig Sakriani Sakti Satoshi Nakamura

This paper presents a novel statistical sample-based approach for Gaussian Mixture Model (GMM)-based Voice Conversion (VC). Although GMM-based VC has the promising flexibility of model adaptation, quality in converted speech is significantly worse than that of natural speech. This paper addresses the problem of inaccurate modeling, which is one of the main reasons causing the quality degradatio...

Combining a Gaussian mixture model front end with MFCC parameters

2002

Matthew N. Stuttle Mark J. F. Gales

Fitting a Gaussian mixture model (GMM) to the smoothed speech spectrum allows an alternative set of features to be extracted from the speech signal. These features have been shown to possess information complementary to the standard MFCC parameterisation. This paper further investigates the use of these GMM features in combination with MFCCs. The extraction and use of a confidence metric to com...

Combining five acoustic level modeling methods for automatic speaker age and gender recognition

2010

Ming Li Chi-Sang Jung Kyu Jeong Han

This paper presents a novel automatic speaker age and gender identification approach which combines five different methods at the acoustic level to improve the baseline performance. The five subsystems are (1) Gaussian mixture model (GMM) system based on mel-frequency cepstral coefficient (MFCC) features, (2) Support vector machine (SVM) based on GMM mean supervectors, (3) SVM based on GMM maxi...

A Hybrid GMM/SVM System for Text Independent Speaker Identification

2012

Rafik Djemili Mouldi Bedda Hocine Bourouba

This paper proposes a novel approach that combines statistical models and support vector machines. A hybrid scheme which appropriately incorporates the advantages of both the generative and discriminant model paradigms is described and evaluated. Support vector machines (SVMs) are trained to divide the whole speakers’ space into small subsets of speakers within a hierarchical tree structure. Du...

Gaussian Mixture Model-Based Ensemble Kalman Filtering for State and Parameter Estimation for a PMMA Process

2016

Ruoxia Li Vinay Prasad Biao Huang

Abstract: Polymer processes often contain state variables whose distributions are multimodal; in addition, the models for these processes are often complex and nonlinear with uncertain parameters. This presents a challenge for Kalman-based state estimators such as the ensemble Kalman filter. We develop an estimator based on a Gaussian mixture model (GMM) coupled with the ensemble Kalman filter ...

Microsoft Word - djemili_rafik_paper.DOC

2007

Rafik Djemili Mouldi Bedda Hocine Bourouba

This paper proposes a novel approach that combines statistical models and support vector machines. A hybrid scheme which appropriately incorporates the advantages of both the generative and discriminant model paradigms is described and evaluated. Support vector machines (SVMs) are trained to divide the whole speakers’ space into small subsets of speakers within a hierarchical tree structure. Du...

Factor analysis for audio-based video genre classification

2009

Mickael Rouvier Driss Matrouf Georges Linarès

Statistical classifiers operate on features that generally include both useful and useless information. These two types of information are difficult to separate in the feature domain. Recently, a new paradigm based on Latent Factor Analysis (LFA) proposed a model decomposition into useful and useless components. This method was successfully applied to speaker and language recognition tasks. ...

Audio context recognition in variable mobile environments from short segments using speaker and language recognizers

2012

Tomi Kinnunen Rahim Saeidi Jussi Leppänen Jukka Saarinen

The problem of context recognition from mobile audio data is considered. We consider ten different audio contexts (such as car, bus, office and outdoors) prevalent in daily life situations. We choose mel-frequency cepstral coefficient (MFCC) parametrization and present an extensive comparison of six different classifiers: k-nearest neighbor (kNN), vector quantization (VQ), Gaussian mixture model...

Utterance independent bimodal emotion recognition in spontaneous communication

Journal: EURASIP J. Adv. Sig. Proc. 2011

Jianhua Tao Shifeng Pan Minghao Yang Ya Li Kaihui Mu Jianfeng Che

Emotion expressions are sometimes mixed with the utterance expression in spontaneous face-to-face communication, which makes emotion recognition difficult. This article introduces methods for reducing the influence of the utterance on visual parameters in audio-visual emotion recognition. The audio and visual channels are first combined under a Multistream Hidden Markov Model (MH...

A Hybrid of Genetic Algorithm and Gaussian Mixture Model for Features Reduction and Detection of Vocal Fold Pathology

Journal: Journal of Advances in Computer Research 2013

Vahid Majidnezhad Igor Kheidorov

Acoustic analysis is a suitable method for vocal fold pathology diagnosis, as it can complement, and in some cases replace, other invasive methods based on direct vocal fold observation. There are different approaches and algorithms for vocal fold pathology diagnosis. These algorithms usually have three stages: feature extraction, feature reduction, and classification. In this paper in...
