Performance Enhancement in Lip Synchronization Using MFCC Parameters

Author

  • MAHESH GOYANI
Abstract

Many multimedia applications and entertainment products, such as games, cartoons, and film dubbing, require speech-driven face animation and audio-video synchronization. An Automatic Speech Recognition (ASR) system alone does not give good results in noisy environments. An Audio-Visual Speech Recognition system plays a vital role in such harsh environments because it uses both audio and visual information. In this paper, we propose a novel approach that improves on the traditional methods reported so far. Our algorithm works on the basis of acoustic and visual parameters to achieve better results. We tested our system on English speech using MFCC and LPC parameters. Lip parameters such as lip width and lip height are extracted from the video, and both the acoustic and the visual parameters are used to train a neural network. Our system gives an almost 100 percent correct response for vowels.
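
The acoustic-plus-visual feature idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the frame length, the number of mel filters or cepstral coefficients, or the network design, so the values below (256-sample frames, 20 filters, 12 coefficients, 8 kHz sampling) and every function name are assumptions chosen for illustration. LPC features and the neural network itself are left out. The code computes MFCCs for one frame using only the standard library, then appends two hypothetical lip measurements to form a single input vector.

```python
import math

def hz_to_mel(f_hz):
    # Common mel-scale formula (O'Shaughnessy form).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def power_spectrum(frame):
    # Hamming window followed by a naive DFT (fine for short frames).
    n = len(frame)
    windowed = [s * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2.0 * math.pi * k * i / n) for i, x in enumerate(windowed))
        im = sum(x * math.sin(2.0 * math.pi * k * i / n) for i, x in enumerate(windowed))
        spectrum.append((re * re + im * im) / n)
    return spectrum

def mel_filterbank(n_filters, n_fft, sample_rate):
    # Triangular filters whose edges are evenly spaced on the mel scale.
    top = hz_to_mel(sample_rate / 2.0)
    edges_hz = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bins = [int((n_fft + 1) * f / sample_rate) for f in edges_hz]
    bank = []
    for j in range(1, n_filters + 1):
        left, center, right = bins[j - 1], bins[j], bins[j + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(left, center):
            row[k] = (k - left) / (center - left)
        for k in range(center, right):
            row[k] = (right - k) / (right - center)
        bank.append(row)
    return bank

def mfcc(frame, sample_rate, n_filters=20, n_ceps=12):
    # Log mel-band energies followed by an unnormalized DCT-II.
    spec = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    log_e = [math.log(max(sum(w * p for w, p in zip(row, spec)), 1e-10))
             for row in bank]
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / n_filters)
                for j, e in enumerate(log_e))
            for c in range(n_ceps)]

def fuse_features(mfcc_vec, lip_width, lip_height):
    # Hypothetical fusion step: concatenate acoustic and visual parameters.
    return list(mfcc_vec) + [lip_width, lip_height]

# Example: one synthetic 440 Hz frame plus made-up lip measurements (pixels).
sample_rate = 8000
frame = [math.sin(2.0 * math.pi * 440.0 * t / sample_rate) for t in range(256)]
features = fuse_features(mfcc(frame, sample_rate), lip_width=42.0, lip_height=18.0)
print(len(features))
```

In a real pipeline a vector like `features` would be computed per audio frame, aligned with the lip measurements from the matching video frame, and passed to a classifier. How the paper aligns and normalizes the two streams is not stated in the abstract.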

Similar resources

Real-time language independent lip synchronization method using a genetic algorithm

Lip synchronization is a method for the determination of the mouth and tongue motion during a speech. It is widely used in multimedia productions, and real time implementation is opening application possibilities in multimodal interfaces. We present an implementation of real time, language independent lip synchronization based on the classification of the speech signal, represented by MFCC vect...

Vector Quantization Approach for Speaker Recognition using MFCC and Inverted MFCC

Front-end or feature extractor is the first component in an automatic speaker recognition system. Feature extraction transforms the raw speech signal into a compact but effective representation that is more stable and discriminative than the original signal. Since the front-end is the first component in the chain, the quality of the later components (speaker modeling and pattern matching) is st...

Robust digit recognition in noise: an evaluation using the AURORA corpus

In this paper, a variety of techniques for robust digit recognition in noise are considered using the AURORA 2.0 corpus. Current recognizers perform as well as humans in small vocabulary tasks but computer recognition performance degrades substantially when noise is introduced into the speech, while human performance is much less sensitive. To make the recognizer robust, several methodologies a...

A Survey – Audio and Video Synchronization

Audio and video synchronization is essential. Loss of synchronization between image and sound continues to disturb viewers and irritate broadcasters. The goal is to ensure synchronization without altering the content while keeping cost low. The objective of synchronization is to align the audio and video signals, which are processed separately. T...

Statistical correlation analysis between lip contour parameters and formant parameters for Mandarin monophthongs

In this study we examine quantitatively the correlation between the geometric lip contour parameters and the formant parameters for Mandarin monophthongs, and carry out a multiple linear regression study between the two parameters. We explicitly analyze the relationship between different geometric lip parameters and the formant parameters, which have some linguistic significance instead of the ...

Publication date: 2010