A saliency-based auditory attention model with applications to unsupervised prominent syllable detection in speech
Authors
Abstract
Bottom-up, or saliency-driven, attention allows the brain to detect nonspecific conspicuous targets in cluttered scenes before fully processing and recognizing them. Here, a novel, biologically plausible auditory saliency map is presented to model such saliency-based auditory attention. Multi-scale auditory features are extracted based on the processing stages in the central auditory system and combined into a single master saliency map. The usefulness of the proposed auditory saliency map for detecting prominent syllable and word locations in speech is tested in an unsupervised manner. When evaluated on broadcast news-style read speech from the BU Radio News Corpus, the model achieves 75.9% accuracy at the syllable level and 78.1% accuracy at the word level. These results compare well with reported human performance.
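The idea of extracting features at several scales and merging them into one saliency map can be illustrated with a toy sketch. This is not the model from the paper: the abstract does not specify the features or the combination rule, so the sketch assumes a single one-dimensional per-frame energy envelope, a center-surround contrast computed with moving averages, and a plain sum of normalized contrasts. All function names, scale radii, the threshold, and the sample envelope are hypothetical.

```python
def moving_average(signal, radius):
    """Mean of signal over a window of +/- radius samples, clipped at the edges."""
    n = len(signal)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out


def normalize(values):
    """Rescale values linearly to [0, 1]; a constant input maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def saliency_curve(envelope, scales=((1, 4), (2, 8))):
    """Sum normalized center-surround contrasts over (center, surround) radii.

    Assumption: each scale contributes equally; the paper may weight or
    normalize its feature maps differently.
    """
    total = [0.0] * len(envelope)
    for center_r, surround_r in scales:
        center = moving_average(envelope, center_r)
        surround = moving_average(envelope, surround_r)
        # Keep only frames where the fine scale exceeds its coarse context.
        contrast = [max(c - s, 0.0) for c, s in zip(center, surround)]
        total = [t + x for t, x in zip(total, normalize(contrast))]
    return normalize(total)


def salient_peaks(curve, threshold=0.5):
    """Indices of local maxima whose saliency is at least the threshold."""
    return [
        i for i in range(1, len(curve) - 1)
        if curve[i] >= threshold
        and curve[i] > curve[i - 1]
        and curve[i] >= curve[i + 1]
    ]


# Hypothetical envelope: flat background with two bumps centered on
# frames 10 and 30 (stand-ins for prominent syllables).
envelope = [0.1] * 41
envelope[9:12] = [0.5, 1.0, 0.5]
envelope[29:32] = [0.4, 0.8, 0.4]

curve = saliency_curve(envelope)
peaks = salient_peaks(curve)
print(peaks)
```

No labels are used anywhere in the sketch, which is the sense in which such a detector is unsupervised: candidate prominent locations are simply the frames that stand out from their temporal context.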
Similar resources
Graph-based Visual Saliency Model using Background Color
Visual saliency is a cognitive psychology concept that makes some stimuli of a scene stand out relative to their neighbors and attract our attention. Computing visual saliency is a topic of recent interest. Here, we propose a graph-based method for saliency detection, which contains three stages: pre-processing, initial saliency detection and final saliency detection. The initial saliency map i...
Word segmentation in Persian continuous speech using F0 contour
Word segmentation in continuous speech is a complex cognitive process. Previous research on spoken word segmentation has revealed that in fixed-stress languages, listeners use acoustic cues to stress to de-segment speech into words. It has been further assumed that stress in non-final or non-initial position hinders the demarcative function of this prosodic factor. In Persian, stress is retract...
A Saliency Detection Model via Fusing Extracted Low-level and High-level Features from an Image
Salient regions attract more human attention than other regions in an image. Low-level and high-level features are utilized in saliency region detection. Low-level features contain primitive information such as color or texture while high-level features usually consider visual systems. Recently, some salient region detection methods have been proposed based on only low-level features or hig...
Combining task-dependent information with auditory attention cues for prominence detection in speech
Auditory attention is a highly complex mechanism that involves the processing of low-level acoustic features of sound together with higher level cognitive rules. In this paper, a novel method that combines biologically inspired auditory attention cues with higher level lexical and syntactic information is proposed to model task-dependent influences on a given task. The feature maps are extracted f...
Compressed-Sampling-Based Image Saliency Detection in the Wavelet Domain
When watching natural scenes, an overwhelming amount of information is delivered to the Human Visual System (HVS). The optic nerve is estimated to receive around 10^8 bits of information a second. This large amount of information can’t be processed right away through our neural system. Visual attention mechanism enables HVS to spend neural resources efficiently, only on the selected parts of the...