The quality of Text-to-Visual-Speech synthesis is judged by how well it matches the visual perception of speech articulators with acoustic speech perception. Concurrently, di erent viewers often prefer di erent head models for subjective reasons. Traditional facial animation approach tied the parameterization of animation directly to the model. Switching the head model is di cult because a leng...