A Neural Clustering Algorithm for Estimating Visible Articulatory Trajectory

Authors

  • Fabio Vignoli
  • Sergio Curinga
  • Fabio Lavagetto
Abstract

The bimodal acoustic-visual nature of speech establishes sound correlations between its audio component and the corresponding articulatory information associated with the time-varying geometry of the vocal tract. In this paper we propose an estimation structure consisting of a simplified Time-Delay Neural Network (TDNN) working on 4-5 dimensional cepstrum trajectories provided by a preceding clusterization layer based on a Self-Organizing Map (SOM). The use of this pre-processing layer has allowed an effective non-linear clusterization of cepstrum vectors, thus reducing the complexity of the resulting system by an order of magnitude while leaving the global estimation performance unchanged. The achieved results are shown in terms of estimation precision and robustness with reference to previously published results.

1 A direct approach to articulatory estimation

The objective of any direct approach to articulatory estimation is the design of a suitable mechanism for mapping a predefined acoustic representation of speech into a corresponding motor space representation. No intermediate explicit recognition or classification is required, since it is assumed that all the necessary processing is embedded in the conversion mechanism itself.

Extensive experiments on normal-hearing and hearing-impaired subjects [2], [3] have clearly demonstrated that while phonemes can be associated rather easily with well-defined mouth configurations (called "visemes"), the inverse association is usually troublesome, since the same posture of the mouth can correspond to different phonemes. As an example, the "bilabial" viseme is associated with different phonemes like /m,p,b/ and the "velar" viseme is associated with different phonemes like /k,g/.
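The many-to-one relation between phonemes and visemes can be made concrete with a small sketch. It uses only the two viseme classes named in the text; the dictionary layout and the helper name `invert` are illustrative assumptions, not part of the paper.

```python
from collections import defaultdict

# Phoneme -> viseme, restricted to the two examples given in the text.
PHONEME_TO_VISEME = {
    "m": "bilabial", "p": "bilabial", "b": "bilabial",
    "k": "velar", "g": "velar",
}

def invert(mapping):
    """Group phonemes by viseme (hypothetical helper for illustration)."""
    inverse = defaultdict(list)
    for phoneme, viseme in mapping.items():
        inverse[viseme].append(phoneme)
    # Sort so the result does not depend on insertion order.
    return {viseme: sorted(phonemes) for viseme, phonemes in inverse.items()}

VISEME_TO_PHONEMES = invert(PHONEME_TO_VISEME)

# The forward lookup is unambiguous ...
print(PHONEME_TO_VISEME["p"])
# ... while the inverse lookup returns several candidate phonemes.
print(VISEME_TO_PHONEMES["bilabial"])
```

Each phoneme maps to exactly one viseme, but each viseme maps back to a list of phonemes, which is why the inverse association cannot be resolved from mouth posture alone.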
Moreover, in-depth investigation of the articulatory dynamics [4], [5] stresses the role played by coarticulatory phenomena, which describe the effects on articulation due to past acoustic outputs (backward coarticulation) and to future, about-to-be-produced acoustic information (forward coarticulation). Since no exhaustive study of coarticulatory dynamics is yet available, an approach to articulatory description of speech that passes through the phonemic level turns out to be extremely complex and, at the least, formally incomplete. On the contrary, a more viable solution seems to be that of performing the articulatory estimation
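Backward and forward coarticulation mean that the articulation at one instant depends on both earlier and later acoustic frames. One common way to expose that context to an estimator, and the general idea behind time-delayed inputs, is to stack neighbouring frames into a single window. The sketch below is illustrative only: the function name, the window widths, and the edge padding by frame repetition are assumptions, not the authors' configuration.

```python
from typing import List, Sequence

def context_windows(frames: List[Sequence[float]],
                    past: int, future: int) -> List[List[float]]:
    """Concatenate each frame with `past` earlier and `future` later frames.

    Indices outside the trajectory are clamped, so the first and last
    frames are repeated at the edges (an assumption made for this sketch).
    """
    if not frames:
        return []
    last = len(frames) - 1
    windows = []
    for t in range(len(frames)):
        window: List[float] = []
        for offset in range(-past, future + 1):
            idx = min(max(t + offset, 0), last)  # clamp to a valid index
            window.extend(frames[idx])
        windows.append(window)
    return windows

# Toy 2-dimensional trajectory with four frames (made-up values).
trajectory = [[0.0, 1.0], [0.1, 0.9], [0.2, 0.8], [0.3, 0.7]]
stacked = context_windows(trajectory, past=1, future=1)
print(len(stacked), len(stacked[0]))
```

With one past and one future frame, every 2-dimensional input becomes a 6-dimensional vector, so the input size grows linearly with the context width. That growth is one reason a low-dimensional trajectory representation is attractive ahead of a time-delay network.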

Similar resources

Ball Trajectory Estimation and Robot Control to Reach the Ball Using Single Camera

In robotics research, catching a projectile object with a robotic system is one of the challenging problems. The results of this research can be used in a wide range of applications such as video surveillance systems, analysis of sports videos, monitoring programs for human activities, and human-machine interactions. In this paper, we propose a new vision-based algorithm to estimate the traj...

Assessment of the Performance of Clustering Algorithms in the Extraction of Similar Trajectories

In recent years, the rapid growth of spatial trajectory data, together with the need to process it and extract useful information and meaningful patterns, has attracted many researchers to the field of spatio-temporal trajectory clustering. The processing and analysis of these trajectories have resulted in the extraction of useful information whic...

A comparison of modified k-means(MKM) and NN based real time adaptive clustering algorithms for articulatory space codebook formation

This paper proposes the use of a neural network based real time adaptive clustering algorithm for the formation of a codebook of a limited set of acoustical representations of a finite set of vocal tract shapes from an articulatory space. The modified k-means algorithm (MKM), used for clustering nearly 10000 vocal tract shapes into 1000 cluster centers to form a codebook of articulatory shapes, is comput...

Trajectory Tracking of a Mobile Robot Using Fuzzy Logic Tuned by Genetic Algorithm (TECHNICAL NOTE)

In recent years, soft computing methods such as fuzzy logic and neural networks have been presented and developed for the purpose of mobile robot trajectory tracking. In this paper we present a fuzzy approach to the problem of mobile robot path tracking for the CEDRA rescue robot, which has a complicated kinematic model. After designing the fuzzy tracking controller, the membership functions an...

A trajectory mixture density network for the acoustic-articulatory inversion mapping

This paper proposes a trajectory model which is based on a mixture density network trained with target features augmented with dynamic features together with an algorithm for estimating maximum likelihood trajectories which respects constraints between the static and derived dynamic features. This model was evaluated on an inversion mapping task. We found the introduction of the trajectory mode...

Publication date: 1996