High-level feature weighted GMM network for audio stream classification
Abstract
Unsupervised audio classification continues to be a challenging research problem that significantly impacts ASR and Spoken Document Retrieval (SDR) performance. This paper addresses novel advances in audio classification for speech recognition. A new algorithm for audio classification is proposed, based on a Weighted GMM Network (WGN). Two new high-level features, VSF (Variance of the Spectrum Flux) and VZCR (Variance of the Zero-Crossing Rate), are used to pre-classify the audio and to supply weights to the output probabilities of the GMM networks. The classification is then implemented using the weighted GMM networks. Evaluations on a standard data set, the DARPA Hub4 Broadcast News 1997 evaluation data, show that the WGN classification algorithm achieves over a 50% improvement versus the GMM network baseline algorithm. The WGN also obtains very satisfactory results on the more diverse and challenging NGSW (National Gallery of the Spoken Word [8]) corpus. A segmentation-based classification method is also explored.
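As a rough illustration of the two high-level features, the sketch below computes VZCR and VSF over a window of audio and applies weights to per-class output probabilities. The abstract does not give the exact definitions, so everything here is an assumption: the textbook zero-crossing rate and spectrum flux, a 25 ms frame with a 10 ms hop at 16 kHz, a Hann window, and a simple multiplicative weighting. The function names are hypothetical and are not taken from the paper.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames, one frame per row."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def zero_crossing_rate(frames):
    """Fraction of adjacent sample pairs whose sign differs, per frame."""
    signs = np.signbit(frames)
    return np.mean(signs[:, 1:] != signs[:, :-1], axis=1)

def spectrum_flux(frames):
    """Squared L2 distance between magnitude spectra of consecutive frames."""
    window = np.hanning(frames.shape[1])
    mag = np.abs(np.fft.rfft(frames * window, axis=1))
    return np.sum(np.diff(mag, axis=0) ** 2, axis=1)

def vzcr_vsf(x, frame_len=400, hop=160):
    """Return (VZCR, VSF): variances of the frame-level ZCR and spectrum flux.

    frame_len and hop are assumed values (25 ms / 10 ms at 16 kHz).
    """
    frames = frame_signal(np.asarray(x, dtype=float), frame_len, hop)
    return np.var(zero_crossing_rate(frames)), np.var(spectrum_flux(frames))

def weighted_decision(class_probs, weights):
    """Scale per-class output probabilities by weights and pick the best class.

    The multiplicative form is an assumption; the abstract only says that
    the features supply weights to the GMM output probabilities.
    """
    scores = np.asarray(class_probs, dtype=float) * np.asarray(weights, dtype=float)
    return int(np.argmax(scores))
```

In this sketch a steady signal such as a pure tone yields near-zero values for both features, while a window whose character changes over time yields larger ones, which is the kind of contrast a pre-classifier could exploit.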
Similar resources
Feature Selection and Classification of Microarray Gene Expression Data of Ovarian Carcinoma Patients using Weighted Voting Support Vector Machine
DNA microarray gene expression data provide a wealth of information spanning thousands of variables (genes). Analysis of this information can reveal the genetic causes of disease and differences between tumors. In this study we try to reduce high-dimensional data by a statistical method to select valuable genes with high impact as biomarkers and then classify ovarian tumors based on gene expression data of...
Continuous Multimodal Emotion Recognition Approach for AVEC 2017
This paper reports the analysis of audio and visual features in predicting the continuous emotion dimensions under the seventh Audio/Visual Emotion Challenge (AVEC 2017), which was done as part of a B.Tech. 2nd year internship project. For visual features we used the HOG (Histogram of Gradients) features, Fisher encodings of SIFT (Scale-Invariant Feature Transform) features based on Gaussian mi...
Classification of Protected Forest Areas for Road Network Planning (Case Study: Arasbaran Area)
The road network is an important element of sustainable forest ecosystem management. Moreover, the efficiency and quality of road design can be improved by considering environmental and technical principles. Therefore, this study was performed to determine the capability of the Arasbaran protected area to accommodate roads and communication routes, for use in managing the area. For th...
GAN-Assisted Two-Stream Neural Network for High-Resolution Remote Sensing Image Classification
Using deep learning to improve the capabilities of high-resolution satellite images has emerged recently as an important topic in automatic classification. Deep networks track hierarchical high-level features to identify objects; however, enhancing the classification accuracy from low-level features is often disregarded. We therefore proposed a two-stream deep-learning neural network strategy, ...
Speech/Music Classification using SVM and GMM
Today, digital audio applications are part of our everyday lives. Automatic audio classification is very useful in audio indexing, content-based audio retrieval, and online audio distribution. The accuracy of the classification relies on the strength of the features and the classification scheme. In this work, both time-domain and frequency-domain features are extracted from the input signal. Time d...