Performance Enhancement in Lip Synchronization Using MFCC Parameters
Author
Abstract
Many multimedia applications and entertainment products, such as games, cartoons, and film dubbing, require speech-driven face animation and audio-video synchronization. An automatic speech recognition (ASR) system on its own does not give good results in noisy environments. An audio-visual speech recognition system plays a vital role in such harsh environments because it uses both audio and visual information. In this paper, we propose a novel approach with enhanced performance over the traditional methods reported so far. Our algorithm works on the basis of acoustic and visual parameters to achieve better results. We tested our system on the English language using MFCC and LPC parameters of the speech. Lip parameters such as lip width and lip height are extracted from the video, and both the acoustic and the visual parameters are used to train a neural network. Our system gives close to 100 percent correct responses for vowels.
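The acoustic and visual feature extraction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no frame sizes, filter counts, or network details, so every parameter below (16 kHz sampling, 25 ms frames, 26 mel filters, 13 coefficients), the function names `mfcc` and `fuse_features`, and the lip measurements are assumptions chosen for illustration. The LPC parameters and the neural network itself are left out.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):        # rising edge of the triangle
            fbank[i - 1, k] = (k - left) / (center - left)
        for k in range(center, right):       # falling edge of the triangle
            fbank[i - 1, k] = (right - k) / (right - center)
    return fbank

def mfcc(signal, sample_rate=16000, frame_len=400, hop=160,
         n_fft=512, n_filters=26, n_ceps=13):
    """Return an (n_frames, n_ceps) array of MFCCs (all defaults are assumed values)."""
    signal = np.asarray(signal, dtype=float)
    # Pre-emphasis boosts high frequencies before analysis.
    emphasized = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])
    n_frames = 1 + (len(emphasized) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = emphasized[idx] * np.hamming(frame_len)
    # Power spectrum of each windowed frame (zero-padded to n_fft).
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    log_energy = np.log(power @ mel_filterbank(n_filters, n_fft, sample_rate).T + 1e-10)
    # DCT-II of the log mel energies gives the cepstral coefficients.
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.arange(n_ceps)[:, None] * (2 * n[None, :] + 1) / (2 * n_filters))
    return log_energy @ basis.T

def fuse_features(mfcc_frames, lip_width, lip_height):
    """Hypothetical fusion: mean MFCC vector of a segment plus two lip measurements."""
    return np.concatenate([mfcc_frames.mean(axis=0), [lip_width, lip_height]])

# Synthetic one-second tone stands in for a recorded vowel; lip values are made up.
t = np.arange(16000) / 16000.0
tone = np.sin(2 * np.pi * 220.0 * t)
feats = mfcc(tone)
vec = fuse_features(feats, lip_width=42.0, lip_height=18.0)
```

A vector such as `vec` is the kind of combined input that could be passed to a classifier; how the paper actually combines and normalizes the two feature sets is not stated in the abstract.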
Similar resources
Real-time language independent lip synchronization method using a genetic algorithm
Lip synchronization is a method for determining mouth and tongue motion during speech. It is widely used in multimedia productions, and real-time implementation is opening up application possibilities in multimodal interfaces. We present an implementation of real-time, language-independent lip synchronization based on the classification of the speech signal, represented by MFCC vect...
Vector Quantization Approach for Speaker Recognition using MFCC and Inverted MFCC
Front-end or feature extractor is the first component in an automatic speaker recognition system. Feature extraction transforms the raw speech signal into a compact but effective representation that is more stable and discriminative than the original signal. Since the front-end is the first component in the chain, the quality of the later components (speaker modeling and pattern matching) is st...
Robust digit recognition in noise: an evaluation using the AURORA corpus
In this paper, a variety of techniques for robust digit recognition in noise are considered using the AURORA 2.0 corpus. Current recognizers perform as well as humans in small vocabulary tasks but computer recognition performance degrades substantially when noise is introduced into the speech, while human performance is much less sensitive. To make the recognizer robust, several methodologies a...
A Survey – Audio and Video Synchronization
Audio and video synchronization is essential. Loss of synchronization between image and sound continues to disturb observers and irritate telecasters. The demand is to ensure synchronization without altering content while still keeping cost low. The objective of synchronization is to line up the audio and video signals, which are processed individually. T...
Statistical correlation analysis between lip contour parameters and formant parameters for Mandarin monophthongs
In this study we examine quantitatively the correlation between the geometric lip contour parameters and the formant parameters for Mandarin monophthongs, and carry out a multiple linear regression study between the two parameters. We explicitly analyze the relationship between different geometric lip parameters and the formant parameters, which have some linguistic significance instead of the ...