audio system

SMART-I: Spatial Multi-user Audio-Visual Real Time Interactive Interface

2011

Marc Rébillat

The SMART-I aims to create a precise and coherent virtual environment by providing users with accurate audio and visual localization cues. It is known that Wave Field Synthesis for audio rendering and Tracked Stereoscopy for visual rendering each permit high-quality spatial immersion within an extended space. The proposed system combines these two rendering approaches through...

MIREX 2008 Audio Music Classification Using a Combination of Spectral, Timbral, Rhythmic, Temporal and Symbolic Features

2008

T. Lidy, A. Rauber, A. Pertusa, P. J. Ponce

The novel approach of combining audio and symbolic features for music classification from audio enhanced previous audio-only based results in MIREX 2007. We extended the approach by including temporal audio features, enhancing the polyphonic audio to MIDI transcription system and including an extended set of symbolic features. Recent research in music genre classification hints at a glass ceili...

Using Weighted Oriented Optical Flow Histograms for Multimodal Speaker Diarization

2008

Mary Tai Knox

Speaker diarization currently focuses on using audio features to partition an audio stream into speaker-homogeneous speech regions, in other words to determine “who spoke when”. Recent speaker diarization corpora contain video recordings in addition to the commonly used audio. Thus, we investigated the benefits of incorporating video features, namely histograms of weighted oriented optical flo...

Experimental Results in Audio Indexing

1997

S. Dharanipragada, S. Roukos

In this paper we describe the IBM Audio-Indexing System and present some experimental results on the performance of the system on an audio indexing task.

Comparative Study of Filter Performance for Separation of Singing Voice from Music Accompaniment

2015

Harshada P. Burute, Madhuri Patil, Pradeep B. Mane

An audio signal is a representation of sound. Audio signals have a frequency range of 20 Hz to 20 kHz. Audio signals may be synthesized directly. A mixture refers to the physical combination of two or more substances in which the identities of the substances are retained, so that they can be separated out. An audio signal classification system should be able to categorize different audio input formats (speech, background nois...

Feature-fusion Based Audio-visual Speech Recognition Using Lip Geometry Features in Noisy Environment

2015

M. Z. Ibrahim, D. J. Mulvaney, M. F. Abas

Humans are often able to compensate for noise degradation and uncertainty in speech information by augmenting the received audio with visual information. Such bimodal perception generates a rich combination of information that can be used in the recognition of speech. However, due to wide variability in the lip movement involved in articulation, not all speech can be substantially improved by a...

Vision Steered Beam-forming and Transaural Rendering for the Artificial Life Interactive Video Environment (ALIVE)

1995

Michael A. Casey, William G. Gardner, Sumit Basu

This paper describes the audio component of a virtual reality system that uses remote sensing to free the user from body-mounted tracking equipment. Position information is obtained from a camera and used to constrain a beam-forming microphone array, for far-field speech input, and a two-speaker transaural audio system for rendering 3D audio.

AUDIO FOR A MULTIMODAL ASSISTIVE INTERFACE Demo paper for the ICAD05 workshop "Combining Speech and Sound in the User Interface"

2005

Emma Murphy, Graham McAllister, Philip Strain, Ravi Kuber, Wai Yu

This paper details the design of an audio interface for a multi-modal content-aware web plug-in. The system aims to provide spatial and navigational information to visually impaired Internet users through speech and non-speech audio with haptic feedback. The web plug-in and audio interface are presented and discussed, along with recommendations for future system development.

SMART-I: “Spatial Multi-user Audio-visual Real-time Interactive Interface”, a Broadcast Application Context

2011

Marc Rébillat, Brian F.G. Katz, Etienne Corteel

SMART-I is a high quality 3D audio-visual interactive rendering system. In SMART-I, the screen is also used as a multichannel loudspeaker. The spatial audio rendering is based on Wave Field Synthesis, an approach that creates a coherent spatial perception of a spatial sound scene over a large listening area. The azimuth localization accuracy of the system has been verified by a perceptual exper...

An Interactive Concert Program Based on Infrared Watermark and Audio Synthesis

2009

Hsi-Chun Wang, Wen-Pin Hope Lee, Feng-Ju Liang

The objective of this research is to propose a video/audio system which allows the user to listen to the typical music notes in the concert program under infrared detection. The system synthesizes audio with different pitches and tempi in accordance with the encoded data in a 2-D barcode embedded in the infrared watermark. The digital halftoning technique has been used to fabricate the infrared wa...
