Relevance-weighted-reconstruction of articulatory features in deep-neural-network-based acoustic-to-articulatory mapping
Abstract
We present a strategy for learning Deep-Neural-Network (DNN)-based Acoustic-to-Articulatory Mapping (AAM) functions in which the contribution of each articulatory feature (AF) to the global reconstruction error is weighted by its relevance. We first show empirically that the more crucial an articulator is for the production of a given phone, the less variable it is, confirming previous findings. We then compute the relevance of an articulatory feature as a function of its frame-wise variance conditioned on the acoustic evidence, estimated through a Mixture Density Network (MDN). Finally, we combine acoustic and recovered articulatory features in a hybrid DNN-HMM phone recognizer. Tested on the MOCHA-TIMIT corpus, articulatory features reconstructed by a standardly trained DNN lead to an 8.4% relative phone error reduction (w.r.t. a recognizer that uses only MFCCs), whereas articulatory features reconstructed taking their relevance into account increase the relative phone error reduction to 10.9%.
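The weighting idea in the abstract can be illustrated with a minimal sketch. The abstract does not state which function maps variance to relevance, so the inverse-variance weighting below is an assumption made for illustration only, as are the names `relevance_weights` and `weighted_mse`. The variances would come from a model such as an MDN; here they are passed in as a plain array.

```python
import numpy as np


def relevance_weights(variances, eps=1e-8):
    """Turn per-frame, per-feature variances into relevance weights.

    ASSUMPTION: relevance is taken as inverse variance, normalized so that
    the weights in each frame sum to the number of features. The abstract
    only says relevance is "a function of" the frame-wise variance.

    variances: array of shape (n_frames, n_features), all values >= 0.
    """
    inv = 1.0 / (variances + eps)
    n_features = variances.shape[1]
    return inv * n_features / inv.sum(axis=1, keepdims=True)


def weighted_mse(pred, target, weights):
    """Mean squared reconstruction error with a weight per frame and feature."""
    return float(np.mean(weights * (pred - target) ** 2))


# Hypothetical toy data: 2 frames, 2 articulatory features.
variances = np.array([[0.1, 0.4],
                      [1.0, 1.0]])
weights = relevance_weights(variances)

target = np.zeros((2, 2))
error_on_relevant = np.array([[1.0, 0.0], [0.0, 0.0]])    # low-variance feature is wrong
error_on_irrelevant = np.array([[0.0, 1.0], [0.0, 0.0]])  # high-variance feature is wrong

loss_relevant = weighted_mse(error_on_relevant, target, weights)
loss_irrelevant = weighted_mse(error_on_irrelevant, target, weights)
```

Under this assumed weighting, an error of the same size costs more on the low-variance (more relevant) feature than on the high-variance one, which is the behavior the abstract describes for the reconstruction error.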
Similar resources
Integrating Articulatory Information in Deep Learning-Based Text-to-Speech Synthesis
Articulatory information has been shown to be effective in improving the performance of hidden Markov model (HMM)-based text-to-speech (TTS) synthesis. Recently, deep learning-based TTS has outperformed HMM-based approaches. However, articulatory information has rarely been integrated in deep learning-based TTS. This paper investigated the effectiveness of integrating articulatory movement data t...
Hybrid convolutional neural networks for articulatory and acoustic information based speech recognition
Studies have shown that articulatory information helps model speech variability and, consequently, improves speech recognition performance. But learning speaker-invariant articulatory models is challenging, as speaker-specific signatures in both the articulatory and acoustic space increase complexity of speech-to-articulatory mapping, which is already an ill-posed problem due to its inherent no...
Parkinson's condition estimation using speech acoustic and inversely mapped articulatory data
Parkinson’s disease is a neurological disorder that affects patients’ motor function, including speech articulation. There is no cure for Parkinson’s disease. Speech and motor function decline as the disease progresses. Automatic assessment of the disease condition may advance the treatment of Parkinson’s disease with objective, inexpensive measures. Speech acoustics, which can be easily obtain...
Multiview Representation Learning via Deep CCA for Silent Speech Recognition
Silent speech recognition (SSR) converts non-audio information such as articulatory (tongue and lip) movements to text. Articulatory movements generally have less information than acoustic features for speech recognition, and therefore, the performance of SSR may be limited. Multiview representation learning, which can learn better representations by analyzing multiple information sources simul...
A New Bidirectional Neural Network Model for the Acoustic-Articulatory Inversion Mapping For Speech Recognition
In this paper, a new bidirectional neural network for better acoustic-articulatory inversion mapping is proposed. The model is motivated by the parallel structure of the human brain, which processes information through forward-inverse connections. In other words, there would be feedback from the articulatory system to the acoustic signals emitted from that organ. Inspired by this mechanism, a new bidire...