Models of Bottom-Up Attention and Saliency
Author
Abstract
Visually conspicuous, or so-called salient, stimuli often attract focal visual attention towards their locations. This article reviews several computational architectures that subserve this bottom-up, stimulus-driven, spatiotemporal deployment of attention. The resulting computational models have applications not only to the prediction of visual search psychophysics but also, in the domain of machine vision, to the rapid selection of regions of interest in complex, cluttered visual environments. We describe one unusual such application: the objective evaluation of advertising designs.
One of the most important functions of selective visual attention is to rapidly direct our gaze towards objects of interest in our visual environment. From an evolutionary standpoint, this rapid orienting capability is critical in allowing living systems to quickly become aware of possible prey, mates or predators in their cluttered visual world. It has become clear that attention guides where to look next based on both bottom-up (image-based) and top-down (task-dependent) cues (James, 1890/1981). As such, attention implements an information processing bottleneck, allowing only a small part of the incoming sensory information to reach short-term memory and visual awareness [Linking Attention to Learning, Expectation, Competition and Consciousness]. That is, instead of attempting to fully process the massive sensory input in parallel, nature has devised a serial strategy to achieve near real-time performance despite limited computational capacity: attention allows us to break down the problem of scene understanding into a rapid series of computationally less demanding, localized visual analysis problems.
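The serial strategy described above can be illustrated with a small sketch. It assumes a saliency map has already been computed and is given as a 2D grid of numbers. The function name `serial_scan`, the `radius` parameter, and the suppression step that keeps the scan from returning to the same spot are illustrative assumptions, not details taken from the text above.

```python
from typing import List, Tuple

Grid = List[List[float]]


def serial_scan(saliency: Grid, n_fixations: int, radius: int = 1) -> List[Tuple[int, int]]:
    """Visit the most salient locations of a grid one at a time.

    After each selection, a square neighbourhood around the chosen
    location is suppressed so that the next selection moves elsewhere.
    This suppression is an assumption made for the sketch.
    """
    work = [row[:] for row in saliency]  # copy so the input stays untouched
    rows, cols = len(work), len(work[0])
    visited: List[Tuple[int, int]] = []
    for _ in range(n_fixations):
        # Pick the currently most salient location (first one wins on ties).
        r, c = max(
            ((i, j) for i in range(rows) for j in range(cols)),
            key=lambda p: work[p[0]][p[1]],
        )
        visited.append((r, c))
        # Suppress the neighbourhood of the selected location.
        for i in range(max(0, r - radius), min(rows, r + radius + 1)):
            for j in range(max(0, c - radius), min(cols, c + radius + 1)):
                work[i][j] = float("-inf")
    return visited


# Hypothetical toy saliency map; the values are made up for illustration.
toy_map = [
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.9, 0.3, 0.0],
    [0.1, 0.3, 0.1, 0.6],
    [0.0, 0.0, 0.4, 0.2],
]
fixations = serial_scan(toy_map, n_fixations=3)
print(fixations)
```

Each iteration handles one small, localized region instead of the whole grid at once, which is the sense in which a serial scan trades parallel processing for a sequence of cheaper steps.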
Developing computational models of how attention is deployed onto complex visual scenes has been a longstanding challenge for fundamental neuroscience, with additional motivation provided by numerous potential applications in artificial vision, for tasks including surveillance, automatic target detection, navigational aids and robotics control. Here we focus on biologically-plausible computational modeling of bottom-up guidance of attention towards salient image locations, while [Attention and scene understanding] casts these models within broader frameworks that combine bottom-up and top-down attention control signals.
1 Preattentive Features and Saliency Map
Development of computational models of attention started with the Feature Integration Theory of Treisman & Gelade (1980), which proposed that only simple visual features are computed in a massively parallel
∗University of Southern California, Hedco Neuroscience Building HNB-30A, Los Angeles, CA 90089-2520
Similar resources
Compressed-Sampling-Based Image Saliency Detection in the Wavelet Domain
When watching natural scenes, an overwhelming amount of information is delivered to the Human Visual System (HVS). The optic nerve is estimated to receive around 10^8 bits of information per second. This large amount of information cannot be processed all at once by our neural system. The visual attention mechanism enables the HVS to spend neural resources efficiently, only on the selected parts of the...
Graph-based Visual Saliency Model using Background Color
Visual saliency is a cognitive psychology concept that makes some stimuli of a scene stand out relative to their neighbors and attract our attention. Computing visual saliency is a topic of recent interest. Here, we propose a graph-based method for saliency detection, which contains three stages: pre-processing, initial saliency detection and final saliency detection. The initial saliency map i...
A Novel Method to Study Bottom-up Visual Saliency and its Neural Mechanism
In this study, we propose a novel method to measure bottom-up saliency maps of natural images. To eliminate the influence of top-down signals, backward masking is used to make the stimuli (natural images) subjectively invisible to subjects; however, bottom-up saliency can still orient the subjects' attention. To measure this orientation/attention effect, we adopt the cueing effect parad...
A New Detection Model for Saliency Map
Visual attention is a mechanism that filters out redundant visual information and focuses on the most relevant parts when observing an image. Many bottom-up computational models of visual attention have been devised to obtain the saliency map of an image. In this paper, a new visual attention model is proposed. Based on the experimental results obtained in this study, as compared with existing bot...
Learning to predict where to look in interactive environments using deep recurrent q-learning
Bottom-Up (BU) saliency models do not perform well in complex interactive environments where humans are actively engaged in tasks (e.g., making sandwiches and playing video games). In this paper, we leverage Reinforcement Learning (RL) to highlight task-relevant locations of input frames. We propose a soft attention mechanism combined with the Deep Q-Network (DQN) model to teach an RL agent h...
Integrating Object Affordances with Artificial Visual Attention
Affordances, such as grasping possibilities, are known to play a role in the guidance of human attention but have not been considered in artificial attention systems so far. Extending our earlier work, we investigate the combination of affordance estimation and visual saliency in an artificial visual attention model. Different models based on saliency, affordance estimation, or their com...