Lip Reading in Profile
Abstract
There has been a quantum leap in the performance of automated lip reading recently, due to the application of neural network sequence models trained on a very large corpus of aligned text and face videos. However, this advance has only been demonstrated for frontal or near-frontal faces, and so the question remains: can lips be read in profile to the same standard? The objective of this paper is to answer that question. We make three contributions: first, we obtain a new large aligned training corpus that contains profile faces, and select these using a face pose regressor network; second, we propose a curriculum learning procedure that is able to extend SyncNet [10] (a network to synchronize face movements and speech) progressively from frontal to profile faces; third, we demonstrate lip reading in profile for unseen videos. The trained model is evaluated on a held-out test set, and is also shown to far surpass the state of the art on the OuluVS2 multi-view benchmark.
Similar resources
Designing and implementing a system for Automatic recognition of Persian letters by Lip-reading using image processing methods
For many years, speech has been the most natural and efficient means of information exchange for human beings. With the advancement of technology and the prevalence of computer usage, researchers have turned to the design and production of speech recognition systems. Among these, lip-reading techniques have encountered many challenges for speech recognition; one of the challenges b...
Lip reading and speech perception of hearing-impaired students at special schools for the hearing impaired in Tehran
Objective: The goal of this study was to evaluate the lip reading ability and speech perception of hearing-impaired students of special schools for the hearing impaired at different speech levels. Materials & Methods: In this cross-sectional study, 44 deaf students (9-12 years old) were selected with a multi-stage cluster sampling method from two special schools for the deaf in Tehran. Tools...
Soft tissue facial profile and anteroposterior lip positioning in Iranians
Objective: Since orthodontic and orthognathic treatment planning in each ethnic group must be done according to the soft tissue facial characteristics regarded as beauty, they thus vary from country to country. The main purpose of this article was to determine the mean range of the middle third of soft tissue facial profile and anteroposterior lip positioning using 3 angular and 2 linear measurement...
Pose Compensation for Bimodal Speech Recognition
Lip reading has been proven to improve speech recognition accuracy in adverse environments. Most existing lip reading systems assume a frontal pose, which makes them very difficult to use in tasks such as video transcription (speech recognition of the audio stream for video indexing and retrieval). In this paper, we propose a new method to compensate for the lip pose change by exploiting the g...
Can you "read tongue movements"?
Lip reading relies on visible articulators to ease audiovisual speech understanding. However, lips and face alone provide very incomplete phonetic information: the tongue, which is generally not entirely seen, carries an important part of the articulatory information not accessible through lip reading. The question was thus whether the direct and full vision of the tongue allows tongue reading. ...