Feature space mutual information in speech-video sequences
Abstract
We present an approach for directly studying the mutual relationships between audio and video signals in multimedia applications. The approach is grounded in information theory and is closely related to information-theoretic classification. We show that even very simple features of the audio and video channels, respectively, can carry a substantial amount of mutual information between the two modalities. The mathematical approach is very general, however, and is not restricted to the multimedia application presented here.
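For readers unfamiliar with the quantity the abstract refers to, the following is a minimal sketch of a plug-in (histogram) estimate of the mutual information I(A;V) between two paired scalar feature sequences. It is not the authors' method, which the abstract does not specify. The function names `quantize` and `mutual_information`, the equal-width binning, and the default of 8 bins are all assumptions made for illustration.

```python
import math
from collections import Counter


def quantize(values, bins):
    """Map each value to an equal-width bin index in [0, bins - 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant sequence falls into a single bin.
        return [0] * len(values)
    width = (hi - lo) / bins
    # The maximum value would land in bin `bins`, so clamp it to the last bin.
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(audio_feat, video_feat, bins=8):
    """Plug-in estimate of I(A;V) in bits from paired per-frame samples.

    Hypothetical helper: `audio_feat` and `video_feat` stand for any scalar
    feature measured once per frame in each channel.
    """
    if len(audio_feat) != len(video_feat):
        raise ValueError("feature sequences must have the same length")
    a = quantize(audio_feat, bins)
    v = quantize(video_feat, bins)
    n = len(a)
    joint = Counter(zip(a, v))   # counts of co-occurring bin pairs
    count_a = Counter(a)         # marginal counts, audio feature
    count_v = Counter(v)         # marginal counts, video feature
    mi = 0.0
    for (ai, vi), c in joint.items():
        # p(a,v) / (p(a) p(v)) simplifies to c * n / (count_a * count_v).
        mi += (c / n) * math.log2(c * n / (count_a[ai] * count_v[vi]))
    return mi
```

Two identical binary sequences give an estimate of 1 bit, while two sequences whose values co-occur uniformly give 0 bits. Histogram estimates of this kind are biased for short sequences or many bins, so the bin count should be treated as a tuning choice.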
Similar resources
Enhancing multimodal silent speech interfaces with feature selection
In research on Silent Speech Interfaces (SSI), different sources of information (modalities) have been combined, aiming at obtaining better performance than the individual modalities. However, when combining these modalities, the dimensionality of the feature space rapidly increases, yielding the well-known “curse of dimensionality”. As a consequence, in order to extract useful information from...
A New Unequal Error Protection Technique Based on the Mutual Information of the MPEG-4 Video Frames over Wireless Networks
The performance of video transmission over wireless channels is limited by the channel noise. Thus many error resilience tools have been incorporated into the MPEG-4 video compression method. In addition to these tools, the unequal error protection (UEP) technique has been proposed to protect the different parts in an MPEG-4 video packet with different channel coding rates based on the rate...
Mutual information based visual feature selection for lipreading
Image transforms, such as the discrete cosine, are widely used to extract visual features from the speaker’s mouth region to be used in automatic speechreading and audio-visual speech recognition. Typically, the spatial frequency components with the highest energy in the transform space are retained for recognition. This paper proposes an alternative technique for selecting such features, by ut...
Combining Feature Space Discriminative Training with Long-Term Spectro-Temporal Features for Noise-Robust Speech Recognition
Discriminative training of feature space using maximum mutual information (fMMI) objective function has been shown to yield remarkable accuracy improvements. For noisy environments, fMMI can be regarded as an effective noise compensation algorithm and can play a significant role for noise robustness. Feature space speaker adaptation techniques such as feature space maximum likelihood linear reg...
An evaluation of using mutual information for selection of acoustic-features representation of phonemes for speech recognition
This paper addresses the problem of finding a subset of the acoustic feature space that best represents the phoneme set used in a speech recognition system. A maximum mutual information approach is presented for selecting acoustic features to be combined together to represent the distinctions among the phonemes. The overall phoneme recognition accuracy is slightly increased for the same length ...