audio visual sign

Seeing to hear better: evidence for early audio-visual interactions in speech identification.

Journal: Cognition, 2004

Jean-Luc Schwartz, Frédéric Berthommier, Christophe Savariaux

Lip reading is the ability to partially understand speech by looking at the speaker's lips. It improves the intelligibility of speech in noise when audio-visual perception is compared with audio-only perception. A recent set of experiments showed that seeing the speaker's lips also enhances sensitivity to acoustic information, decreasing the auditory detection threshold of speech embedded in no...

Text to Avatar in Multi-modal Human Computer Interface

2003

Yiqiang Chen, Wen Gao, Zhaoqi Wang, Changshui Yang, Dalong Jiang

In this paper, we present a new text-driven avatar system, which consists of three major components: a text-to-speech (TTS) unit, a speech-driven facial animation (SDFA) unit, and a text-to-sign language (TTSL) unit. A new visual prosody time control model and an integrated learning framework are proposed to realize synchronization among speech synthesis, face animation and gesture animation, wh...

Large-vocabulary audio-visual speech recognition: a summary of the Johns Hopkins Summer 2000 Workshop

2001

Chalapathy Neti, Gerasimos Potamianos, Juergen Luettin, Iain A. Matthews, Hervé Glotin, Dimitra Vergyri

We report a summary of the Johns Hopkins Summer 2000 Workshop on audio-visual automatic speech recognition (ASR) in the large-vocabulary, continuous speech domain. Two problems of audio-visual ASR were mainly addressed: visual feature extraction and audio-visual information fusion. First, image transform and model-based visual features were considered, obtained by means of the discrete cosine t...

Speaker independent audio-visual continuous speech recognition

2002

Luhong Liang, Xiaoxing Liu, Yibao Zhao, Xiaobo Pi, Ara V. Nefian

The increase in the number of multimedia applications that require robust speech recognition systems has generated considerable interest in the study of audio-visual speech recognition (AVSR) systems. The use of visual features in AVSR is justified by both the audio and visual modalities of speech generation and the need for features that are invariant to acoustic noise perturbation. The speaker inde...

Maximising audio-visual speech correlation

2007

Ibrahim Almajai, Ben P. Milner

The aim of this work is to investigate a selection of audio and visual speech features in order to find pairs that maximise audio-visual correlation. Two audio speech features have been used in the analysis: filterbank vectors and the first four formant frequencies. Similarly, three visual features have been considered: active appearance model (AAM), 2-D DCT and cross-DCT. From a data...

A Generic Evaluation Model for Auditory Feedback in Complex Visual Searches

2014

Timothy Neate, Norberto Degara, Andy Hunt, Frederik Nagel

This paper proposes a method of evaluating the effect of auditory display techniques on a complex visual search task. The approach uses a pre-existing visual search task (conjunction search) to create a standardized model for audio-assisted and non-audio-assisted visual search tasks. A pre-existing auditory display technique is evaluated to test the system. Using randomly generated images, participants...

Audio-visual Integration in Multimodal Communication

1998

Tsuhan Chen, Ram R. Rao

In this paper, we review recent research that examines audio-visual integration in multimodal communication. The topics include bimodality in human speech, human and automated lip-reading, facial animation, lip synchronization, joint audio-video coding, and bimodal speaker verification. We also study the enabling technologies for these research topics, including automatic facial feature trackin...

Large Vocabulary Audio-Visual Speech Recognition Using Active Shape Models

2000

Tanveer A. Faruquie, Abhik Majumdar, Nitendra Rajput, L. Venkata Subramaniam

Orthogonal information present in the video signal associated with the audio helps improve the accuracy of a speech recognition system. Audio-visual speech recognition involves extraction of both audio and visual features from the input signal. Visual parameters are extracted by recognizing speech-dependent features in the video sequence. This paper uses geometr...

Form, Meaning and Context in Lexical Access: MEG and Behavioral Evidence

2009

Diogo Almeida, David Poeppel, Donald J. Bolger

Title of dissertation: Form, Meaning and Context in Lexical Access: MEG and Behavioral Evidence. Diogo Almeida, Doctor of Philosophy, 2009. Dissertation directed by Professor David Poeppel, Department of Linguistics. One of the main challenges in the study of cognition is how to connect brain activity to cognitive processes. In the domain of language, this requires coordination between two differe...

Perceptual-based quality assessment for audio-visual services: A survey

Journal: Sig. Proc.: Image Comm., 2010

Junyong You, Ulrich Reiter, Miska M. Hannuksela, Moncef Gabbouj, Andrew Perkis

Accurate measurement of the perceived quality of audio–visual services at the end-user is becoming a crucial issue in digital applications due to the growing demand for compression and transmission of audio–visual services over communication networks. Content providers strive to offer the best quality of experience for customers linked to their different quality of service (QoS) solutions. Ther...
