Photo-realistic expressive text to talking head synthesis
نویسندگان
چکیده
A controllable computer animated avatar that could be used as a natural user interface for computers is demonstrated. Driven by text and emotion input, it generates expressive speech with corresponding facial movements. To create the avatar, HMM-based text-to-speech synthesis is combined with active appearance model (AAM)-based facial animation. The novelty is the degree of control achieved over the expressiveness of both the speech and the face while keeping the controls simple. Controllability is achieved by training both the speech and facial parameters within a cluster adaptive training (CAT) framework. CAT creates a continuous, low dimensional eigenspace of expressions, which allows the creation of expressions of different intensity (including ones more intense than those in the original recordings) and combining different expressions to create new ones. Results on an emotion-recognition task show that recognition rates given the synthetic output are comparable to those given the original videos of the speaker.
منابع مشابه
Photo-Realistic Talking-Heads from Image Samples
This paper describes a system for creating a photo-realistic model of the human head that can be animated and lip-synched from phonetic transcripts of text. Combined with a state-of-the-art text-to-speech synthesizer (TTS), it generates video animations of talking heads that closely resemble real people. To obtain a naturally looking head, we choose a “data-driven” approach. We record a talking...
متن کاملFace Analysis for the Synthesis of Photo-Realistic Talking Heads
This paper describes techniques for extracting bitmaps of facial parts from videos of a talking person. The goal is to synthesize photo-realistic talking heads of high quality that show picture-perfect appearance and realistic head movements with good lip-sound synchronization. For the synthesis of a talking head, bitmaps of facial parts are combined to form whole heads and then sequences of su...
متن کاملText Driven 3D Photo-Realistic Talking Head
We propose a new 3D photo-realistic talking head with a personalized, photo realistic appearance. Different head motions and facial expressions can be freely controlled and rendered. It extends our prior, high-quality, 2D photo-realistic talking head to 3D. Around 20-minutes of audio-visual 2D video are first recorded with read prompted sentences spoken by a speaker. We use a 2D-to-3D reconstru...
متن کاملAudio-Visual Unit Selection for the Synthesis of Photo-Realistic Talking-Heads
This paper investigates audio-visual unit selection for the synthesis of photo-realistic, speech-synchronized talking-head animations. These animations are synthesized from recorded video samples of a subject speaking in front of a camera, resulting in a photo-realistic appearance. The lip-synchronization is obtained by optimally selecting and concatenating variable-length video units of the mo...
متن کاملA 3d audio-visual animated agent for expressive conversational question answering
This paper reports on the ACQA (Animated agent for Conversational Question Answering) project conducted at LIMSI. The aim is to design an expressive animated conversational agent (ACA) for conducting research along two main lines: 1/ perceptual experiments (eg perception of expressivity and 3D movements in both audio and visual channels): 2/ design of human-computer interfaces requiring head mo...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2013