Visual feature analysis for automatic speechreading

Authors

  • Patricia Scanlon
  • Richard B. Reilly
  • Philip de Chazal
Abstract

This paper proposes a novel method of visual feature extraction for automatic speechreading. Whereas current methods of extracting delta (difference) features compute the difference between adjacent frames, the proposed method captures how the visual features evolve over a period longer than the interval between adjacent frames, with that period set relative to the length of the utterance. These new features provide a visual memory capability for improved system performance. Good visual discrimination is achieved by maintaining a base level of detail in the features. A frame rate of 30 frames per second provides rapid visual recognition of speech. The combination of the novel visual memory features, good visual discrimination and rapid visual recognition of speech movements is shown to improve visual speech recognition. Using this method, an isolated-word accuracy of 28.1% was achieved for a vocabulary of 78 words over a database of 10 speakers.
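The contrast between adjacent-frame deltas and longer-span differences can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything here is an assumption made for illustration: the function names, the backward difference against a frame `span` steps earlier, the clamping at the first frame, and the `fraction` parameter that ties the span to the utterance length. It should not be read as the authors' implementation.

```python
from typing import List, Sequence

Frame = List[float]


def span_deltas(frames: Sequence[Sequence[float]], span: int) -> List[Frame]:
    """Difference between frame t and frame t - span.

    Assumption: frames before the start of the utterance are replaced by
    the first frame (index clamped at 0).
    """
    if span < 1:
        raise ValueError("span must be at least 1 frame")
    deltas: List[Frame] = []
    for t, current in enumerate(frames):
        reference = frames[max(t - span, 0)]
        deltas.append([c - r for c, r in zip(current, reference)])
    return deltas


def adjacent_deltas(frames: Sequence[Sequence[float]]) -> List[Frame]:
    """Conventional delta features: difference between adjacent frames."""
    return span_deltas(frames, 1)


def utterance_relative_deltas(
    frames: Sequence[Sequence[float]], fraction: float = 0.25
) -> List[Frame]:
    """Differences over a span proportional to the utterance length.

    `fraction` is a hypothetical tuning parameter, not a value from the paper.
    """
    span = max(1, round(fraction * len(frames)))
    return span_deltas(frames, span)


# Toy utterance: 8 frames, each with a single visual feature value.
utterance = [[0.0], [1.0], [3.0], [6.0], [10.0], [15.0], [21.0], [28.0]]
short_term = adjacent_deltas(utterance)
long_term = utterance_relative_deltas(utterance, fraction=0.25)
```

In this toy sequence the adjacent-frame deltas describe only the frame-to-frame change, while the utterance-relative deltas (a span of 2 frames here) summarise movement over a longer stretch, which is the kind of "visual memory" the abstract refers to.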

Similar resources

Exploiting lower face symmetry in appearance-based automatic speechreading

Appearance-based visual speech feature extraction is being widely used in the automatic speechreading and audio-visual speech recognition literature. In its most common application, the discrete cosine transform (DCT) is utilized to compress the image of the speaker’s mouth region-of-interest (ROI), and the highest energy spatial frequency components are retained as visual features. Good genera...

Linear discriminant analysis for speechreading

This paper investigates the use of Fisher-Rao linear discriminant analysis (LDA) as a means of visual feature extraction for hidden Markov model based automatic speechreading. For every video frame, a three-dimensional region of interest containing the speaker's mouth over a sequence of adjacent frames is lexicographically arranged into a data vector. Such vectors are then projected onto the sp...

A hierarchy probability-based visual features extraction method for speechreading

(This research is supported by the President Foundation of the Institute of Acoustics, Chinese Academy of Sciences (No. 98-02) and the “863” High Tech R&D Project of China (No. 863-306-ZD-11-1).) Visual feature extraction has become the key technique in automatic speechreading systems. However, it remains a difficult problem due to large inter-person and intra-person appearance ...

Feature analysis for automatic speechreading

Audio-visual automatic speech recognition systems use visual information to enhance ASR systems in clean and noisy environments. This paper compares a number of different visual feature extraction methods. When performing visual speech recognition, the visual feature vector requires a base level of detail for optimum recognition. Geometric feature extraction provides lower recognition than ...

Towards speaker independent continuous speechreading

This paper describes recent speechreading experiments for a speaker independent continuous digit recognition task. Visual feature extraction is performed by a lip tracker which recovers information about the lip shape and information about the greylevel intensity around the mouth. These features are used to train visual word models using continuous density HMMs. Results show that the method gen...

Publication date: 2003