Unsupervised Extraction of Multi-Frame Features for Lip-Reading
Authors
Abstract
Features of human lip motion are extracted from video clips by three unsupervised learning algorithms: Principal Component Analysis (PCA), Independent Component Analysis (ICA), and Non-negative Matrix Factorization (NMF). Since the human perception of facial motion goes through two different pathways, i.e., the lateral fusiform gyrus for the invariant aspects and the superior temporal sulcus for the changeable aspects of faces, we extracted dynamic video features from multiple consecutive frames for the latter. PCA yields global features, whereas ICA yields local features with high sparsity. The sparsity of the NMF-based features lies between those of the PCA-based and ICA-based features. The probability density functions and kurtosis of these features are almost independent of the number of consecutive frames, and the multiple-frame features require fewer coefficients to represent video clips than the single-frame static features.
Keywords: Feature extraction, lip-reading, multi-frame features, convolutive feature extraction, Principal Component Analysis (PCA), Independent Component Analysis (ICA), Non-negative Matrix Factorization (NMF)
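As a rough illustration of the multi-frame idea described in the abstract, the sketch below stacks consecutive frames into one vector per time window, extracts PCA and NMF bases from those vectors, and measures sparsity through kurtosis. It is a minimal sketch under stated assumptions, not the authors' implementation: the synthetic random clip, the window length of 3, the 5 components, and all function names (`stack_frames`, `pca_features`, `nmf_features`, `excess_kurtosis`) are hypothetical choices made for this example. NMF is done here with plain multiplicative updates, and ICA is omitted to keep the example dependency-free.

```python
import numpy as np

def stack_frames(video, n_frames):
    """Turn a (T, H, W) clip into rows of n_frames consecutive flattened frames."""
    T = video.shape[0]
    windows = [video[t:t + n_frames].ravel() for t in range(T - n_frames + 1)]
    return np.asarray(windows)

def pca_features(X, n_components):
    """PCA via SVD of the mean-centred data; returns (basis, coefficients)."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    basis = vt[:n_components]          # rows are orthonormal principal axes
    return basis, Xc @ basis.T

def nmf_features(X, n_components, n_iter=200, seed=0, eps=1e-9):
    """NMF by multiplicative updates: X ~= W @ H with W, H non-negative."""
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], n_components)) + eps
    H = rng.random((n_components, X.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return H, W                        # (basis, coefficients)

def excess_kurtosis(a):
    """Excess kurtosis of all values in a (0 for a Gaussian; larger means sparser)."""
    a = a.ravel() - a.mean()
    return float(np.mean(a**4) / np.mean(a**2)**2 - 3.0)

# Synthetic stand-in for a cropped lip-region clip: 40 frames of 8x8 pixels.
rng = np.random.default_rng(0)
video = rng.random((40, 8, 8))

X = stack_frames(video, n_frames=3)    # one row per 3-frame window
pca_basis, pca_coef = pca_features(X, n_components=5)
nmf_basis, nmf_coef = nmf_features(X, n_components=5)

print("PCA basis kurtosis:", excess_kurtosis(pca_basis))
print("NMF basis kurtosis:", excess_kurtosis(nmf_basis))
```

Setting `n_frames=1` gives the single-frame static case, so the same functions can be used to compare basis kurtosis across window lengths. On random data such as this the numbers carry no meaning; real lip-region clips would be needed to observe the behaviour the abstract reports.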
Similar resources
Block-Based Motion Estimation Analysis for Lip Reading User Authentication Systems
This paper proposes a lip reading technique for speech recognition by using motion estimation analysis. The method described in this paper represents a sub-system of the Silent Pass project. Silent Pass is a lip reading password entry system for security applications. It presents a user authentication system based on password lip reading. Motion estimation is done for lip movement image sequenc...
Unsupervised Feature Extraction for the Representation and Recognition of Lip Motion Video
Lip-reading recognition is reported with lip-motion features extracted from multiple video frames by three unsupervised learning algorithms, i.e., Principal Component Analysis (PCA), Independent Component Analysis (ICA), and Non-negative Matrix Factorization (NMF). Since the human perception of facial motion goes through two different pathways, i.e., the lateral fusiform gyrus for the invari...
Motion Estimation Analysis for Unsupervised Training for Lip Reading User Authentication Systems
This paper proposes a lip reading technique for speech recognition by using motion estimation analysis. Motion estimation is done for lip movement image sequences representing speech. In this methodology, the motion estimation is computed without extracting the speaker’s lip contours and location. This leads to obtaining robust visual features for lip movements representing utterances. Our meth...
Automatic Lip Reading for Daily Indonesian Words Based on Frame Difference and Horizontal-vertical Image Projection
Automatic lip reading is an area of research that has been developing recently. It has been used for various purposes, such as enhancing speech recognition and aiding speech training for the deaf. There are two approaches to lip feature extraction, namely appearance based and shape based. The appearance-based approach is usually better, because it provides visual features that cover not only li...
Classical Flexible Lip Model Based Relative Weight Finder for Better Lip Reading Utilizing Multi Aspect Lip Geometry
Problem statement: Deaf and mute people need assistance from a technical device that uses lip movements to identify words. This technical article provides an appropriate implementation of a flexible lip model for a better visual lip-reading system. Approach: From the frame sequence of words, an Active Shape Model (ASM) based lip model provided local tracking and extraction of geometric lip-feature....