Similar resources
Finding phonemes: improving machine lip-reading
In machine lip-reading there is continued debate and research around the correct classes to be used for recognition. In this paper we use a structured approach for devising speaker-dependent viseme classes, which enables the creation of a set of phoneme-to-viseme maps where each has a different quantity of visemes ranging from two to 45. Viseme classes are based upon the mapping of articulated ...
Designing and implementing a system for automatic recognition of Persian letters by lip-reading using image processing methods
For many years, speech has been the most natural and efficient means of information exchange for human beings. With the advancement of technology and the prevalence of computer usage, researchers have turned their attention to the design and production of speech recognition systems. Among these, lip-reading techniques face many challenges for speech recognition; one of the challenges b...
Comparison of human and machine-based lip-reading
We investigate the performance of a machine-based lip-reading system using both shape-only parameters and full shape and appearance parameters. Furthermore, we contrast the performance of a machine-based lip-reading system with human lip-reading ability. We find that the automated system outperforms human lip-readers. Curiously however, for relatively simple tasks there is little improvement in...
Speaker-independent machine lip-reading with speaker-dependent viseme classifiers
In machine lip-reading, which is identification of speech from visual-only information, there is evidence to show that visual speech is highly dependent upon the speaker [1]. Here, we use a phoneme-clustering method to form new phoneme-to-viseme maps for both individual and multiple speakers. We use these maps to examine how similarly speakers talk visually. We conclude that broadly speaking, s...
Toward movement-invariant automatic lip-reading and speech recognition
We present the development of a modular system for flexible human–computer interaction via speech. The speech recognition component integrates acoustic and visual information (automatic lip-reading) improving overall recognition, especially in noisy environments. The image of the lips, constituting the visual input, is automatically extracted from the camera picture of the speaker’s face by the...
Journal
Journal title: TRANSACTIONS OF THE JAPAN SOCIETY OF MECHANICAL ENGINEERS Series C
Year: 1987
ISSN: 0387-5024,1884-8354
DOI: 10.1299/kikaic.53.2613