Analysis and Compensation of Pa Speech Recognition Usin
Abstract
The aim of this work is to improve the robustness of speech recognition systems operating under burst-like packet loss. First, a set of highly artificial packet loss profiles is used to analyse the effect of loss on both recognition performance and the underlying feature vector stream. This analysis indicates that the simple technique of vector repetition can make the recogniser robust to high percentages of packet loss, provided that burst lengths are reasonably short. This leads to the proposal of interleaving the feature vector sequence prior to packetisation, so that bursts of packet loss are dispersed throughout the feature vector stream. Recognition results on the Aurora connected digits database show considerable accuracy gains across a range of packet loss rates and burst lengths. For example, at a packet loss rate of 50% with an average burst length of 4 packets (corresponding to 8 static vectors), performance increases from 49.4% to 88.5% at the cost of an additional 90 ms of delay.
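The idea of interleaving before packetisation and repairing gaps by vector repetition can be illustrated with a minimal Python sketch. This is not the authors' implementation: the simple stride-based interleaver, the depth of 4, the forward-fill form of repetition, and all function names are assumptions chosen for illustration. Only the figure of two vectors per packet follows from the abstract (4 packets corresponding to 8 static vectors).

```python
from typing import List, Optional, Sequence, Set


def interleave_order(n: int, depth: int) -> List[int]:
    """Transmission order for a simple stride interleaver (assumed scheme).

    Vectors that are adjacent in time end up `depth` positions apart
    in the transmitted stream.
    """
    return [i for start in range(depth) for i in range(start, n, depth)]


def transmit(vectors: Sequence, order: Sequence[int],
             lost_packets: Set[int], vectors_per_packet: int = 2) -> List[Optional[object]]:
    """Send vectors in `order`, drop the given packets, and return the
    received stream restored to original time order (None marks a loss)."""
    received: List[Optional[object]] = [None] * len(vectors)
    for pos, idx in enumerate(order):
        if pos // vectors_per_packet not in lost_packets:
            received[idx] = vectors[idx]
    return received


def repeat_fill(received: Sequence) -> list:
    """Vector repetition (assumed variant): repeat the last received vector;
    losses at the very start are filled from the first received vector."""
    filled = list(received)
    last = None
    for i, v in enumerate(filled):
        if v is None:
            filled[i] = last
        else:
            last = v
    nxt = None
    for i in range(len(filled) - 1, -1, -1):
        if filled[i] is None:
            filled[i] = nxt
        else:
            nxt = filled[i]
    return filled


def longest_gap(received: Sequence) -> int:
    """Length of the longest run of consecutive lost vectors."""
    longest = run = 0
    for v in received:
        run = run + 1 if v is None else 0
        longest = max(longest, run)
    return longest


# Integers stand in for feature vectors; packets 2 and 3 are lost as one burst.
vectors = list(range(16))
lost = {2, 3}
rx_plain = transmit(vectors, list(range(16)), lost)
rx_inter = transmit(vectors, interleave_order(16, 4), lost)
print(longest_gap(rx_plain), longest_gap(rx_inter))
```

In this toy setting the same two-packet burst removes four consecutive vectors without interleaving, but only isolated single vectors with it, which is the short-burst condition under which repetition is reported to work well. The price is buffering: the receiver must wait for a full interleaving block before restoring time order, which is the source of the added delay.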
Similar resources
Persian Phone Recognition Using Acoustic Landmarks and Neural Network-based variability compensation methods
Speech recognition is a subfield of artificial intelligence that develops technologies to convert spoken utterances into transcriptions. So far, various methods such as hidden Markov models and artificial neural networks have been used to develop speech recognition systems. In most of these systems, the speech signal frames are processed uniformly, while the information is not evenly distributed ...
A New Method for Robust Speech Recognition Based on Missing Data Using a Bidirectional Neural Network
The performance of speech recognition systems is greatly reduced when speech is corrupted by noise. One common approach to robust speech recognition is the missing-feature method. In this approach, the components of the time-frequency representation of the signal (the spectrogram) that have a low signal-to-noise ratio (SNR) are tagged as missing and deleted, then replaced using the remaining components and statistical ...
Statistical Variation Analysis of Formant and Pitch Frequencies in Anger and Happiness Emotional Sentences in Farsi Language
Setting up an emotion recognition or emotional speech recognition system depends directly on how emotion changes speech features. In this research, the influence of anger and happiness on speech features was evaluated and the results were compared with neutral speech, using the pitch frequency and the first three formant frequencies. The experimental results showed that there are lo...
A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation
Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by making man-machine interaction more natural. Since speech is the most popular method of communication, recognizing human emotions from the speech signal has become a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...
Lost Speech Reconstruction Method using Missing Feature Theory and HMM
In recent years, IP telephone service has spread rapidly. However, an unavoidable problem of IP telephone service is deterioration of speech due to packet loss, which often occurs on wireless networks. To overcome this problem, we propose a novel lost speech reconstruction method using speech recognition based on Missing Feature Theory and HMM-based speech synthesis. The proposed method uses li...