Unsupervised Feature Extraction for the Representation and Recognition of Lip Motion Video
نویسندگان
چکیده
The lip-reading recognition is reported with lip-motion features extracted from multiple video frames by three unsupervised learning algorithms, i.e., Principle Component Analysis (PCA), Independent Component Analysis (ICA), and Non-negative Matrix Factorization (NMF). Since the human perception of facial motion goes through two different pathways, i.e., the lateral fusifom gyrus for the invariant aspects and the superior temporal sulcus for the changeable aspects of faces, we extracted the dynamic video features from multiple consecutive frames for the latter. The multiple-frame features require less number of coefficients for the same frame length than the single-frame static features. The ICA-based features are most sparse, while the corresponding coefficients for the video representation are the least sparse. PCA-based features have the opposite characteristics, while the characteristics of the NMF-based features are in the middle. Also the ICA-based features result in much better recognition performance than the others.
منابع مشابه
Action Change Detection in Video Based on HOG
Background and Objectives: Action recognition, as the processes of labeling an unknown action of a query video, is a challenging problem, due to the event complexity, variations in imaging conditions, and intra- and inter-individual action-variability. A number of solutions proposed to solve action recognition problem. Many of these frameworks suppose that each video sequence includes only one ...
متن کاملBlock-Based Motion Estimation Analysis for Lip Reading User Authentication Systems
This paper proposes a lip reading technique for speech recognition by using motion estimation analysis. The method described in this paper represents a sub-system of the Silent Pass project. Silent Pass is a lip reading password entry system for security applications. It presents a user authentication system based on password lip reading. Motion estimation is done for lip movement image sequenc...
متن کاملUnsupervised Extraction of Multi-Frame Features for Lip-Reading
The features of human lip motion from video clips are extracted by three unsupervised learning algorithms, i.e., Principal Component Analysis (PCA), Independent Component Analysis (ICA), and Non-negative Matrix Factorization (NMF). Since the human perception of facial motion goes through two different pathways, i.e., the lateral fusifom gyrus for the invariant aspects and the superior temporal ...
متن کاملMotion Estimation Analysis for Unsupervised Training for Lip Reading User Authentication Systems
This paper proposes a lip reading technique for speech recognition by using motion estimation analysis. Motion estimation is done for lip movement image sequences representing speech. In this methodology, the motion estimation is computed without extracting the speaker’s lip contours and location. This leads to obtaining robust visual features for lip movements representing utterances. Our meth...
متن کاملA Lip Localization Based Visual Feature Extraction Method
This paper presents a lip localization based visual feature extraction method to segment lip region from image or video in real time. Lip localization and tracking is useful in many applications such as lip reading, lip synchronization, visual speech recognition, facial animation etc. To synchronize lip movements with input audio we need to first segment lip region from input image or video fra...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2006