On the audiovisual asynchrony of speech

نویسنده

  • László Czap
چکیده

The temporal synchrony of auditory and visual signals is known to affect the perception of audiovisual speech. Several papers have discussed the asymmetry of acoustic and visual timing cues. These results are usually based on subjective intelligibility tests and the reason is remained obscure. It is not clear that the observation is perception or production origin. In this paper the effect of audio-visual asynchrony is studied in an automatic bimodal speech recognition task, eliminating the perception expertise of observers. Results are utilized to improve naturalness of audiovisual speech synthesis.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Audiovisual Speech Recognition with Articulator Positions as Hidden Variables

Speech recognition, by both humans and machines, benefits from visual observation of the face, especially at low signal-to-noise ratios (SNRs). It has often been noticed, however, that the audible and visible correlates of a phoneme may be asynchronous; perhaps for this reason, automatic speech recognition structures that allow asynchrony between the audible phoneme and the visible viseme outpe...

متن کامل

Audiovisual synchrony perception for speech and music assessed using a temporal order judgment task.

This study investigated people's sensitivity to audiovisual asynchrony in briefly-presented speech and musical videos. A series of speech (letters and syllables) and guitar and piano music (single and double notes) video clips were presented randomly at a range of stimulus onset asynchronies (SOAs) using the method of constant stimuli. Participants made unspeeded temporal order judgments (TOJs)...

متن کامل

Data and simulations about audiovisual asynchrony and predictability in speech perception

Since a paper by Chandrasekaran et al. (2009), an increasing number of neuroscience papers capitalize on the assumption that visual speech would be typically 150 ms ahead of auditory speech. It happens that the estimation of audiovisual asynchrony by Chandrasekaran et al. is valid only in very specific cases, for isolated CV syllables or at the beginning of a speech utterance. We present simple...

متن کامل

Influenсe of Phone-Viseme Temporal Correlations on Audiovisual STT and TTS Performance

In this paper, we present a research of temporal correlations of audiovisual units in continuous Russian speech. The corpus-based study identifies natural time asynchronies between flows of audible and visible speech modalities partially caused by inertance of the articulation organs. Original methods for speech asynchrony modeling have been proposed and studied using bimodal ASR and TTS system...

متن کامل

No, There Is No 150 ms Lead of Visual Speech on Auditory Speech, but a Range of Audiovisual Asynchronies Varying from Small Audio Lead to Large Audio Lag

An increasing number of neuroscience papers capitalize on the assumption published in this journal that visual speech would be typically 150 ms ahead of auditory speech. It happens that the estimation of audiovisual asynchrony in the reference paper is valid only in very specific cases, for isolated consonant-vowel syllables or at the beginning of a speech utterance, in what we call "preparator...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2011