Toward Clustering Persian Vowel Viseme: A New Clustering Approach based on HMM
Authors
Abstract
This paper addresses the problem of clustering Persian vowel visemes. Clustering audio-visual data has been discussed for a decade or so. However, it remains an open problem owing to the shortage of appropriate data and its dependency on the target language. Our main contribution is a speaker-independent and robust method for identifying Persian viseme classes. The proposed method consists of three main steps: (I) mouth region segmentation, (II) feature extraction, and (III) hierarchical clustering. After the mouth region is segmented in all frames, feature vectors are extracted using a new view of the Hidden Markov Model (HMM). This is a second contribution of this work: the HMM is used as a probabilistic model-based feature detector. Finally, a hierarchical clustering approach is used to cluster the Persian vowel visemes. The main advantage of this work over others is that it produces a single clustering output for all subjects, which can simplify the research process in other applications. To demonstrate the efficiency of the proposed method, a set of experiments is conducted on AVAII.
Keywords: Viseme, Visual Speech, HMM
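The final step named in the abstract, hierarchical clustering, can be illustrated with a minimal sketch. It assumes each vowel has already been reduced to one fixed-length feature vector; the abstract does not specify the HMM-based features, the distance measure, or the linkage rule, so the Euclidean distance, average linkage, the placeholder labels `v1` to `v6`, and the toy numbers below are all assumptions for illustration, not the authors' method or data.

```python
from itertools import combinations
from math import dist


def average_linkage(a, b, points):
    """Mean Euclidean distance between all cross-cluster pairs."""
    return sum(dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))


def agglomerative(points, n_clusters):
    """Merge the two closest clusters until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest average-linkage distance.
        x, y = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: average_linkage(clusters[p[0]], clusters[p[1]], points),
        )
        merged = clusters[x] + clusters[y]
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)


# Toy, made-up feature vectors (NOT data from the paper); one row per vowel.
# The labels are hypothetical placeholders, not the paper's vowel inventory.
labels = ["v1", "v2", "v3", "v4", "v5", "v6"]
features = [
    [0.0, 0.1],
    [0.1, 0.0],
    [5.0, 5.1],
    [5.2, 5.0],
    [9.0, 0.0],
    [9.1, 0.2],
]

groups = agglomerative(features, n_clusters=3)
viseme_classes = [[labels[i] for i in g] for g in groups]
print(viseme_classes)
```

Because the clustering runs once over features pooled across speakers, it yields a single grouping for all subjects, which is the property the abstract highlights. The number of clusters is fixed by hand here; choosing it from the merge distances would be a separate design decision.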
Similar resources
Based Persian Viseme Clustering
Viseme (Visual Phoneme) clustering in every language is among the most important preliminaries for conducting various multimedia researches such as lip reading, lip synchronization and computer assisted pronunciation training applications. With respect to the fact that clustering and analyzing visemes are language dependent processes, we concentrated our research on Persian language, which indeed has suffered from lack of su[...] paper, we used a hierarchical approach for c[...] in Persian langu...
Persian Viseme Classification Using Interlaced Derivative Patterns and Support Vector Machine
Viseme (Visual Phoneme) classification and analysis in every language are among the most important preliminaries for conducting various multimedia researches such as talking head, lip reading, lip synchronization, and computer assisted pronunciation training applications. With respect to the fact that analyzing visemes is a language dependent process, we concentrated our research on Persian lan...
Electrofacies clustering and a hybrid intelligent based method for porosity and permeability prediction in the South Pars Gas Field, Persian Gulf
This paper proposes a two-step approach for characterizing the reservoir properties of the world’s largest non-associated gas reservoir. This approach integrates geological and petrophysical data and compares them with the field performance analysis to achieve a practical electrofacies clustering. Porosity and permeability prediction is done on the basis of linear functions, succeeding the elec...
HMM-based visual speech synthesis using dynamic visemes
In this paper we incorporate dynamic visemes into hidden Markov model (HMM)-based visual speech synthesis. Dynamic visemes represent intuitive visual gestures identified automatically by clustering purely visual speech parameters. They have the advantage of spanning multiple phones and so they capture the effects of visual coarticulation explicitly within the unit. The previous application of d...
Improved visual speech synthesis using dynamic viseme k-means clustering and decision trees
We present a decision tree-based viseme clustering technique that allows visual speech synthesis after training on a small dataset of phonetically-annotated audiovisual speech. The decision trees allow improved viseme grouping by incorporating k-means clustering into the training algorithm. The use of overlapping dynamic visemes, defined by tri-phone time-varying oral pose boundaries, allows im...