Gaussian Process Regression for Continuous Emotion Recognition with Global Temporal Invariance

Authors

Mia Atcheson

Vidhyasaharan Sethu

Julien Epps

Abstract

Continuous emotion recognition (CER) is a task which requires the prediction of time series emotional parameter outputs corresponding to query time series inputs given training data in the form of matched pairs of input and output time series. In order to address this task, it is important to be able to model not only relationships between points in the input and output spaces, but also temporal relationships between points within the output space. Gaussian process regression (GPR) is an inference technique which has desirable properties for CER, including its ability to produce predictive distributions over the outputs rather than only point estimates. However, GPR is generally applied to pointwise prediction or interpolation tasks, rather than to predictions of entire functional outputs. We propose a covariance structure that is able to incorporate both input-output and temporal information to produce predictions that take into account the functional nature of CER data. We demonstrate the application of this method to simulated data, and to the AVEC2016 CER task, showing that GPR with this covariance structure is able to make predictions of emotional arousal from audio with over twice the accuracy of a straightforward pointwise application of GPR in the input feature space, and is furthermore able to produce predictions with accuracy approaching that of a competitive CER system using only very general component covariance models.
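The idea of a covariance that uses both input-feature and temporal information can be illustrated with a minimal sketch. The abstract does not specify the proposed covariance structure, so the code below is not the authors' method: it assumes a simple product of a squared-exponential kernel on the features and a squared-exponential kernel on time, with hypothetical names (`rbf`, `combined_kernel`, `gp_predict`) and arbitrary length scales, applied to synthetic data in which points are held out from a single sequence.

```python
import numpy as np

def rbf(a, b, length_scale):
    """Squared-exponential kernel between the rows of a and the rows of b."""
    sq_dist = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale ** 2)

def combined_kernel(xa, ta, xb, tb, ls_x=1.0, ls_t=1.0):
    """Feature kernel times time kernel (an illustrative assumption,
    not the covariance structure proposed in the paper)."""
    return rbf(xa, xb, ls_x) * rbf(ta[:, None], tb[:, None], ls_t)

def gp_predict(x_train, t_train, y_train, x_query, t_query, noise_var=1e-2):
    """Posterior mean and variance of a zero-mean GP at the query points."""
    k_tt = combined_kernel(x_train, t_train, x_train, t_train)
    k_qt = combined_kernel(x_query, t_query, x_train, t_train)
    k_qq = combined_kernel(x_query, t_query, x_query, t_query)
    # Cholesky factor of the noisy training covariance for stable solves.
    chol = np.linalg.cholesky(k_tt + noise_var * np.eye(len(y_train)))
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    v = np.linalg.solve(chol, k_qt.T)
    mean = k_qt @ alpha
    var = np.diag(k_qq) - np.sum(v ** 2, axis=0)
    return mean, var

# Synthetic stand-in for one sequence: 2-D features and a scalar target.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 10.0, 60)
x = np.column_stack([np.sin(t), np.cos(0.5 * t)])
x = x + 0.05 * rng.standard_normal((60, 2))
y = np.sin(t) + 0.1 * rng.standard_normal(60)

# Hold out every third time step as the query set.
train = np.arange(60) % 3 != 0
mean, var = gp_predict(x[train], t[train], y[train], x[~train], t[~train])
```

Here `mean` is the point prediction and `var` the predictive variance at each held-out time step, which is the kind of predictive distribution the abstract refers to. Dropping the time factor from `combined_kernel` would give a pointwise GPR in the feature space only.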

Related articles

MediaEval 2015: A Segmentation-based Approach to Continuous Emotion Tracking

In this paper we approach the task of continuous music emotion recognition using unsupervised audio segmentation as a preparatory step. The MediaEval task requires predicting emotion of the song with a high time resolution of 2Hz. Though this resolution is necessary to find exact locations of emotional changes, we believe that those changes occur more sparsely. We suggest that using bigger time...

Hidden Markov model-based speech emotion recognition

In this contribution we introduce speech emotion recognition by use of continuous hidden Markov models. Two methods are propagated and compared throughout the paper. Within the first method a global statistics framework of an utterance is classified by Gaussian mixture models using derived features of the raw pitch and energy contour of the speech signal. A second method introduces increased te...

Music Emotion Recognition using Gaussian Processes

This paper describes the music emotion recognition system developed at the University of Aizu for the Emotion in Music task of the MediaEval’2013 benchmark evaluation campaign. A set of standard feature types provided by the Marsyas toolkit was used to parametrize each music clip. Arousal and valence are modeled separately using Gaussian Process regression (GPR). We compared performances of the...

Facing Realism in Spontaneous Emotion Recognition from Speech: Feature Enhancement by Autoencoder with LSTM Neural Networks

During the last decade, speech emotion recognition technology has matured well enough to be used in some real-life scenarios. However, these scenarios require an almost silent environment to not compromise the performance of the system. Emotion recognition technology from speech thus needs to evolve and face more challenging conditions, such as environmental additive and convolutional noises, i...

Speech Emotion Recognition Using Scalogram Based Deep Structure

Speech Emotion Recognition (SER) is an important part of speech-based Human-Computer Interface (HCI) applications. Previous SER methods rely on the extraction of features and training an appropriate classifier. However, most of those features can be affected by emotionally irrelevant factors such as gender, speaking styles and environment. Here, an SER method has been proposed based on a concat...

Publication date: 2017
