Speech-Driven Mouth Animation Based on Support Vector Regression
Authors
Abstract
Visual mouth information has proved very helpful for understanding speech content. Synthesizing the corresponding face video directly from the speech signal is in strong demand, since it can significantly reduce the amount of video information to be transmitted. In this paper, we present a novel statistical learning approach that learns the mapping from input voice signals to the corresponding mouth images. A deformable mouth template model parameterizes the mouth shapes corresponding to different transient speech signals, and a radial basis function (RBF) interpolation technique then synthesizes a mouth image from a new set of predicted mouth shape parameters. Support vector regression (SVR) machines learn the mapping from speech features to visemes, which are represented by the mouth shape parameters. From the input speech signals, the trained SVRs dynamically predict mouth shapes, from which realistic mouth images are synthesized. Experimental results demonstrate the vivid mouth images synthesized from speech by the proposed algorithm.
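To make the pipeline concrete, the following minimal Python sketch (not the authors' implementation) illustrates the two learned stages under stated assumptions: an RBF-kernel SVR maps per-frame acoustic features to a small vector of mouth-shape parameters, and a radial basis function interpolator blends flattened key mouth images according to those parameters. The feature dimension, the number of shape parameters, and the synthetic training data are illustrative placeholders, not values from the paper.

```python
# Hypothetical sketch: SVR from speech features to mouth-shape parameters,
# then RBF interpolation from shape parameters to a synthesized mouth image.
import numpy as np
from sklearn.svm import SVR
from sklearn.multioutput import MultiOutputRegressor
from scipy.interpolate import RBFInterpolator

# Assumed data: X holds one acoustic feature vector per frame (e.g. MFCCs),
# Y holds the mouth-shape parameters fitted with the deformable template.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(500, 13))   # 500 frames, 13-dim speech features
Y_train = rng.normal(size=(500, 6))    # 6 template shape parameters

# One RBF-kernel SVR per shape parameter (SVR is single-output).
svr = MultiOutputRegressor(SVR(kernel="rbf", C=10.0, epsilon=0.01))
svr.fit(X_train, Y_train)

# Predict mouth-shape parameters for new speech frames.
X_new = rng.normal(size=(20, 13))
Y_pred = svr.predict(X_new)            # (20, 6) predicted shape parameters

# RBF interpolation from shape-parameter space to image space: unseen
# parameter vectors are mapped to blends of flattened key mouth images.
# This stands in for the paper's image-synthesis step.
key_params = rng.normal(size=(30, 6))          # shape params of key images
key_images = rng.random(size=(30, 64 * 64))    # flattened 64x64 key images
rbf = RBFInterpolator(key_params, key_images, kernel="thin_plate_spline")
frames = rbf(Y_pred).reshape(-1, 64, 64)       # synthesized mouth frames
```

In practice, the training pairs would presumably come from a synchronized audio-visual corpus, with the shape parameters obtained by fitting the deformable mouth template to each recorded frame and the key images drawn from the same recordings.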
Similar resources
Data-Driven Speech Animation Synthesis Focusing on Realistic Inside of the Mouth
Speech animation synthesis is still a challenging topic in the field of computer graphics. Despite many challenges, representing the detailed appearance of the inner mouth, such as the tip of the tongue nipped between the teeth and the back of the tongue, has not been achieved in the resulting animation. To solve this problem, we propose a method of data-driven speech animation synthesis especially when focusing on the inside of...
Speech Recognition with Hidden Markov Models in Visual Communication
Speech is produced by the vibration of the vocal cords and the configuration of the articulators. Because some of these articulators are visible, there is an inherent relationship between the acoustic and the visual forms of speech. This relationship has historically been used in lipreading. Today's advanced computer technology opens up new possibilities to exploit the correlation between acou...
Photo-Realistic Mouth Animation Based on an Asynchronous Articulatory DBN Model for Continuous Speech
This paper proposes a continuous speech driven photo realistic visual speech synthesis approach based on an articulatory dynamic Bayesian network model (AF_AVDBN) with constrained asynchrony. In the training of the AF_AVDBN model, the perceptual linear prediction (PLP) features and YUV features are extracted as acoustic and visual features respectively. Given an input speech and the trained AF_...
Image-based Talking Head: Analysis and Synthesis
In this paper, our image-based talking head system is presented, which includes two parts: analysis and synthesis. In the analysis part, a subject reading a predefined corpus is recorded first. The recorded audio-visual data is analyzed in order to create a database containing a large number of normalized mouth images and their related information. The synthesis part generates natural looking t...
Acoustic Viseme Modelling for Speech Driven Animation: a Case Study
This paper addresses the problem of animating a talking figure, such as an avatar, using speech input only. The proposed system is based on Hidden Markov Models for the acoustic observation vectors of the speech sounds that correspond to each of 16 visually distinct mouth shapes (called visemes). This case study illustrates that it is indeed possible to obtain visually relevant speech segmentat...
Journal: IJEBM
Volume: 5, Issue: -
Pages: -
Publication date: 2007