Audio-visual Integration in Multimodal Communication

Authors

  • Tsuhan Chen
  • Ram R. Rao
Abstract

In this paper, we review recent research that examines audio-visual integration in multimodal communication. The topics include bimodality in human speech, human and automated lip-reading, facial animation, lip synchronization, joint audio-video coding, and bimodal speaker verification. We also study the enabling technologies for these research topics, including automatic facial feature tracking and audio-to-visual mapping. Recent progress in audio-visual research shows that joint processing of audio and video provides advantages that are not available when the audio and video are processed independently.

Similar resources

A multimedia platform for audio-visual speech processing

In the framework of the European ESPRIT Project MIAMI ("Multimodal Integration for Advanced Multimedia Interfaces"), a platform has been developed at the ICP to study the various combinations of audiovisual speech processing, including real-time lip motion analysis, real-time synthesis of models of the lips and of the face, audiovisual speech recognition of isolated words, and text-to-audio-vis...

A Multimodal Discourse Analysis of Some Visual Images in the Political Rally Discourse of 2011 Electioneering Campaigns in Southwestern Nigeria

This paper presented a multimodal discourse analysis of some visual images in the political rally discourse of 2011 electioneering campaigns in Southwestern Nigeria. The data comprised purposively selected political visual artefacts from political rallies across the six Southwestern States in Nigeria (Osun, Oyo, Ondo, Ekiti, Ogun, and Lagos). The data were analyzed using Halliday’s (1985) syste...

The effects of perceptual load and set on audio-visual speech integration

This study examines the hypothesis that audio-visual integration of speech requires both expectation to perceive speech and sufficient attentional resources to allow multimodal integration. Audio-visual integration was measured by recording susceptibility to the McGurk effect whilst participants simultaneously performed a primary visual task under conditions of high or low perceptual load. Acco...

The multimodal nature of spoken word processing in the visual world: Testing the predictions of alternative models of multimodal integration

Ambiguity in natural language is ubiquitous (Piantadosi, Tily & Gibson, 2012), yet spoken communication is effective due to integration of information carried in the speech signal with information available in the surrounding multimodal landscape. However, current cognitive models of spoken word recognition and comprehension are underspecified with respect to when and how multimodal information...

Perception of 'speech-and-gesture' integration

This paper describes two experiments conducted to identify the role of synchronization in the perception of ‘speech and gesture’ communication and to isolate the parameters that determine the perception of temporal alignment. The results of the first experiment show that the synchronization between audio and visual signals determines the felicitousness of a multimodal utterance. With the second...

Publication date: 1998