Improved Emotion Recognition with Novel Global Utterance-level Features

Authors

  • Yongming Huang
  • Guobao Zhang
  • Xiong Li
Abstract

Traditional features, which are extracted frame by frame, cannot accurately reflect the dynamic characteristics of an emotional speech signal. To address this problem, novel global utterance-level features are first proposed; they are computed with multi-scale optimal wavelet packet decomposition, without dividing the emotional speech into frames. Then, for the case of few training samples, a fusion strategy based on metric learning, called weak metric learning in this work, is proposed for fusing the global and traditional features. Experimental results with LIBSVM show that fusing the novel global features with the traditional features yields significant improvements of about 5.2% to 13.6% over using the local utterance-level features alone.
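As a rough illustration of what a global, frame-free feature could look like, the sketch below runs a full wavelet packet decomposition over an entire signal and returns the normalized energy of each sub-band. It is not the authors' method: the Haar filter, the full (unpruned) tree, the energy statistic, and the function names `haar_step`, `wavelet_packet_nodes`, and `global_energy_features` are all assumptions made for the example. The abstract's "optimal" decomposition presumably selects a subset of the tree, which is not attempted here.

```python
import math


def haar_step(signal):
    """One orthonormal Haar analysis step.

    Returns (approximation, detail), each half the input length.
    A trailing sample is dropped when the input length is odd.
    """
    s = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) * s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * s for i in pairs]
    return approx, detail


def wavelet_packet_nodes(signal, levels):
    """Full wavelet packet tree: every node is split again at each level.

    Returns the 2**levels leaf nodes (assumption: no best-basis pruning).
    """
    nodes = [list(signal)]
    for _ in range(levels):
        next_nodes = []
        for node in nodes:
            approx, detail = haar_step(node)
            next_nodes.extend([approx, detail])
        nodes = next_nodes
    return nodes


def global_energy_features(signal, levels=3):
    """Normalized sub-band energies computed over the whole utterance."""
    nodes = wavelet_packet_nodes(signal, levels)
    energies = [sum(x * x for x in node) for node in nodes]
    total = sum(energies) or 1.0  # avoid division by zero for silence
    return [e / total for e in energies]


# Hypothetical "utterance": a short synthetic sine wave, not real speech.
utterance = [math.sin(2 * math.pi * 5 * n / 64) for n in range(64)]
features = global_energy_features(utterance, levels=3)
```

The resulting vector has one entry per sub-band and a fixed length regardless of utterance duration, which is what makes it usable as a single utterance-level input to a classifier such as an SVM.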


Similar resources

Improving emotion recognition using class-level spectral features

Traditional approaches to automatic emotion recognition from speech typically make use of utterance level prosodic features. Still, a great deal of useful information about expressivity and emotion can be gained from segmental spectral features, which provide a more detailed description of the speech signal, or from measurements from specific regions of the utterance, such as the stressed vowel...


Class-level spectral features for emotion recognition

The most common approaches to automatic emotion recognition rely on utterance level prosodic features. Recent studies have shown that utterance level statistics of segmental spectral features also contain rich information about expressivity and emotion. In our work we introduce a more fine-grained yet robust set of spectral features: statistics of Mel-Frequency Cepstral Coefficients computed ov...


Fusion of global statistical and segmental spectral features for speech emotion recognition

Speech emotion recognition is an interesting and challenging speech technology, which can be applied to broad areas. In this paper, we propose to fuse the global statistical and segmental spectral features at the decision level for speech emotion recognition. Each emotional utterance is individually scored by two recognition systems, the global statistics-based and segmental spectrum-based syst...


Improved Frame Level Features and SVM Supervectors Approach for the Recogniton of Emotional States from Speech: Application to categorical and dimensional states

The purpose of speech emotion recognition system is to classify speaker's utterances into different emotional states such as disgust, boredom, sadness, neutral and happiness. Speech features that are commonly used in speech emotion recognition (SER) rely on global utterance level prosodic features. In our work, we evaluate the impact of frame-level feature extraction. The speech samples are fro...


Incremental emotion recognition

Most emotion recognition systems do not perform real-time emotion recognition due to latencies caused by phrase segmentation and resource-intensive feature acquisition, etc. To address this issue, we present an emotion recognition approach that can estimate speaker emotions with much lower latency. The proposed approach does not rely on phrase-level features to recognize speaker emotion; rather...



Publication date: 2011