Utilising Human Audio Visual Response for Lip Synchronisation in Virtual Environments
Abstract
This paper discusses the difficulties of maintaining the impression of a 3D virtual persona within a multimedia desktop environment and proposes a method that minimises this task by exploiting the degree to which human perception accepts asynchronous audio-visual speech information. An example of the algorithm devised is tested both objectively and through subjective user appraisal, and the results are promising enough to merit further research.
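The abstract does not spell out the algorithm, but the core idea of trading lip-sync precision against perceptual tolerance can be sketched. The following minimal Python sketch is an illustration under stated assumptions, not the paper's method: the 125 ms lag tolerance is a figure commonly cited in general lip-sync perception studies, and the function and parameter names are hypothetical.

```python
# Illustrative sketch only: exploit viewers' tolerance of the mouth trailing the
# audio to reduce how often a desktop renderer must update the talking head.
# The tolerance value is an assumption from general lip-sync perception
# literature, not a figure quoted by the paper.

AUDIO_LAG_TOLERANCE_S = 0.125  # assumed perceptual tolerance for video lagging audio


def minimum_facial_update_rate_hz(lag_tolerance_s=AUDIO_LAG_TOLERANCE_S):
    """Lowest facial animation rate whose worst-case viseme delay stays inside
    the tolerance window.  Worst case: a phoneme boundary falls just after a
    rendered frame, so the matching mouth shape appears one frame interval late."""
    return 1.0 / lag_tolerance_s


def thin_viseme_track(viseme_track, update_rate_hz):
    """Drop viseme events that a renderer running at update_rate_hz cannot show
    within the assumed tolerance; the remaining events are all the desktop
    system actually has to animate.

    viseme_track: sorted list of (audio_time_s, viseme_id) tuples."""
    frame_interval = 1.0 / update_rate_hz
    rendered, last_render = [], float("-inf")
    for audio_time, viseme in viseme_track:
        render_time = max(audio_time, last_render + frame_interval)
        if render_time - audio_time <= AUDIO_LAG_TOLERANCE_S:
            rendered.append((render_time, viseme))
            last_render = render_time
    return rendered


if __name__ == "__main__":
    track = [(0.00, "sil"), (0.08, "AA"), (0.15, "F"), (0.21, "OW"), (0.30, "sil")]
    rate = minimum_facial_update_rate_hz()  # roughly 8 Hz for a 125 ms tolerance
    print(thin_viseme_track(track, rate))
```

In this sketch, a larger tolerance window directly lowers the required facial update rate, which is the kind of workload reduction the abstract alludes to.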
Similar resources
Synthesising Tongue Movements for conversing Virtual Humans
Facial motion capture is a common method for accurately capturing realistic facial movements. An actor’s performance can be used to bring a virtual human to life. However, the movement of the tongue is often forgotten in character animation. For the most part, the problem arises from the difficulty in capturing tongue movements due to occlusion by the teeth and the lips. Techniques from traditi...
Towards Natural Communication in Networked Collaborative Virtual Environments
Networked Collaborative Virtual Environments (NCVE) have been a hot topic of research for some time now. However, most of the existing NCVE systems restrict the communication between the participants to text messages or audio communication. The natural means of human communication are richer than this. Facial expressions, lip movements, body postures and gestures all play an important role in o...
Multifactor Fusion for Audio-Visual Speaker Recognition
In this paper we propose a multifactor hybrid fusion approach for enhancing security in audio-visual speaker verification. Speaker verification experiments conducted on two audio-visual databases, VidTIMIT and UCBN, show that multifactor hybrid fusion, combining feature-level fusion of lip-voice features with face-lip-voice features at score level, is indeed a powerful technique for spe...
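The excerpt describes hybrid fusion in general terms only; the sketch below illustrates the generic pattern (feature-level concatenation followed by score-level weighting) under assumed classifiers, features, and weights, none of which are taken from the cited work.

```python
# Generic hybrid-fusion sketch: concatenate per-modality feature vectors
# (feature-level fusion), score each fused vector with some matcher, then
# combine the scores (score-level fusion).  All concrete choices here are
# assumptions for illustration.

import numpy as np


def feature_level_fusion(*feature_vectors):
    """Concatenate per-modality feature vectors (e.g. lip and voice) into one."""
    return np.concatenate(feature_vectors)


def score_level_fusion(scores, weights):
    """Weighted average of per-classifier match scores."""
    scores, weights = np.asarray(scores, float), np.asarray(weights, float)
    return float(np.dot(scores, weights) / weights.sum())


def verify_speaker(lip_feat, voice_feat, face_feat, score_fn, threshold=0.5):
    """Hypothetical pipeline: score the lip-voice vector and the face-lip-voice
    vector separately, then fuse the two scores and threshold the result."""
    s_lip_voice = score_fn(feature_level_fusion(lip_feat, voice_feat))
    s_face_lip_voice = score_fn(feature_level_fusion(face_feat, lip_feat, voice_feat))
    fused = score_level_fusion([s_lip_voice, s_face_lip_voice], weights=[0.5, 0.5])
    return fused >= threshold, fused
```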
Confusability of Phonemes Grouped According to their Viseme Classes in Noisy Environments
Using visual information, such as lip shapes and movements, as a secondary source of speech information has been shown to make speech recognition systems more robust to problems associated with environmental noise, training/testing mismatch, and channel and speech style variations. Research into utilising visual information for speech recognition has been ongoing for 20 years; however, over thi...
Audio, visual, and audio-visual egocentric distance perception by moving participants in virtual environments
A study on audio, visual, and audio-visual egocentric distance perception by moving participants in virtual environments is presented. Audio-visual rendering is provided using tracked passive visual stereoscopy and acoustic wave field synthesis (WFS). Distances are estimated using indirect blind-walking (triangulation) under each rendering condition. Experimental results show that distances perce...