Emotion Recognition via Continuous Mandarin Speech

Authors

  • Tsang-Long Pao
  • Jun-Heng Yeh
  • Yu-Te Chen
Abstract

Emotion plays a significant role in cognitive psychology, the behavioural sciences and humanoid robot design. Continuing improvements in speech recognition technology have led to many new and fascinating applications in human-computer interaction, context-aware computing and computer-mediated communication. A growing number of studies on emotion recognition from isolated short sentences shed some light on the implementation of human-computer interfaces. However, to the best of our knowledge, no work has focused on automatic emotion tracking in continuous Mandarin speech. In this chapter, we describe a method for emotion recognition in continuous Mandarin speech that divides the utterance into independent segments, each of which contains a single emotional category.

Within the growing range of interactive interfaces, research on emotional voice is still at an early stage, and the literature on real applications is scarce. The crucial difficulty of this subject is how to combine knowledge from several disciplines, especially speech processing, applied psychology and human-computer interfaces. To date, no clear direction has emerged to suggest how such considerations translate into practical interface design. The crux of the problem is that emotion recognition in continuous speech has not yet been much explored. From the viewpoint of communication, it is natural for human beings to communicate with others in continuous dialogue. Even so, most proposed methods of emotion recognition from voice accept only a fragmented sentence (i.e. a sentence that has been manually and deliberately cut out). To ensure practicability, this chapter addresses these areas by processing speech signals rather than interpreting the words spoken. Moreover, processing speech signals has the further benefit that it can track abrupt changes of emotional expression in dialogue.

In light of these concerns, this chapter has three purposes: (a) to report on trends in research published in the major journals on emotion recognition; (b) to provide a method for recognizing emotion from continuous Mandarin speech; and (c) to recommend promising research paradigms for emotion recognition from continuous speech. This chapter is organized as follows. In section 2, related work is presented. In section 3, the testing corpus is introduced. In section 4, the proposed speech recognition method is

Similar resources

Speech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions

Automatic recognition of emotional states from speech in noisy conditions has become an important research topic in emotional speech recognition in recent years. This paper considers the recognition of emotional states via speech in real environments. For this task, we employ the power normalized cepstral coefficients (PNCC) in a speech emotion recognition system. We investigate its perfor...

Speech Emotion Recognition Using Scalogram Based Deep Structure

Speech Emotion Recognition (SER) is an important part of speech-based Human-Computer Interface (HCI) applications. Previous SER methods rely on the extraction of features and training an appropriate classifier. However, most of those features can be affected by emotionally irrelevant factors such as gender, speaking styles and environment. Here, an SER method has been proposed based on a concat...

Emotion Recognition and Evaluation of Mandarin Speech Using Weighted D-KNN Classification

In this paper, we propose a weighted discrete K-nearest neighbor (weighted D-KNN) classification algorithm for detecting and evaluating emotion in Mandarin speech. In the emotion recognition experiments, the Mandarin emotional speech database used contains five basic emotions (anger, happiness, sadness, boredom and neutral), and the extracted acoustic features are Mel-Frequency C...

Modeling Lexical Tones for Mandarin Large Vocabulary Continuous Speech Recognition

Modeling context-dependent phonetic units in a continuous speech recognition system for Mandarin Chinese

We study the problem of phonetic modeling for continuous Mandarin speech recognition by providing a systematic performance comparison for systems based on the following primitive speech units: syllable, demi-syllable (Initials and Finals), context-independent phones, left-or-right context-dependent phones (diphones), and left-and-right context-dependent phones (triphones). In our speaker-dependent con...

Publication date: 2012