Predicting head motion from prosodic and linguistic features

نویسندگان

  • Angelika Hönemann
  • Diego Evin
  • Alejandro J. Hadad
  • Hansjörg Mixdorff
  • Sascha Fagel
چکیده

This paper describes an approach to predict non-verbal cues from speech-related features. Our previous investigations of audiovisual speech showed that there are strong correlations between the two modalities. In this work we developed two models using different kinds of Recurrent Artificial Neural Networks: Elman and NARX, to predict parameters of activity for head motion using linguistic and prosodic inputs, and compared their performance. Prosodic inputs included F0 and intensity, while linguistic parameters included the former plus additional information such as the type of syllables, phrases, and different relations between them. Using speaker specific models for six subjects, performance measures in terms of root mean square error (RMSE) showed that there are significant differences between the models with respect to the input parameters, and that NARX network outperformed the Elman network on the prediction task.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Study of the Relationship between Acoustic Features of “bæle” and the Paralinguistic Information

Language users benefit from special phonetic tools in order to communicate linguistic information as well as different emotional aspects and paralinguistic information through daily conversation. Having functions in conveying semantic information to listeners, prosodic features form the essential part of linguistic behavour, manipulating  them potentially can play an important role in transmitt...

متن کامل

Predicting Prosodic Phrasing Using Linguistic Features

The prosodic structure of speech is based on complex interaction within and between several different levels of linguistic, and paralinguistic organization, and is expressed in the modulation of F0, intensity, duration, and voice quality, as well as the occurrence of pauses. Even though leading theories of prosody maintain that prosody is shaped through the interaction of grammatical factors fr...

متن کامل

An Information Theoretic Analysis of the Temporal Synchrony Between Head Gestures and Prosodic Patterns in Spontaneous Speech

We analyze the temporal co-ordination between head gestures and prosodic patterns in spontaneous speech in a data-driven manner. For this study, we consider head motion and speech data from 24 subjects while they tell a fixed set of five stories. The head motion, captured using a motion capture system, is converted to Euler angles and translations in X, Y and Z-directions to represent head gest...

متن کامل

Automatic head gesture learning and synthesis from prosodic cues

We present a novel approach to automatically learn and synthesize head gestures using prosodic features extracted from acoustic speech signals. A minimum entropy hidden Markov model is employed to learn the 3-D head-motion of a speaker. The result is a generative model that is compact and highly predictive. The model is further exploited to synchronize the head-motion with a set of continuous p...

متن کامل

The Role of Linguistic and Prosodic Cues on the Prediction of Self-Reported Satisfaction in Contact Centre Phone Calls

Call Centre data is typically collected by organizations and corporations in order to ensure the quality of service, supporting for example mining capabilities for monitoring customer satisfaction. In this work, we analyze the significance of various acoustic features extracted from customer-agents’ spoken interaction in predicting self-reported satisfaction by the customer. We also investigate...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2013