Attention in Low Resolution: Learning Proto-Object Representations with a Deep Network.

Authors

  • Chengyao Shen
  • Xun Huang
  • Qi Zhao
Abstract

While previous research in eye fixation prediction has typically relied on integrating low-level features (e.g. color, edges) to form a saliency map, it has recently been found that the structural organization of these features into perceptual objects (proto-objects) can play a significant role, and is often more important than the low-level features themselves. In this work, we presented a computational framework based on a deep network to demonstrate that proto-object representations can be learned naturally from low-resolution image patches taken from fixation regions. We advocated the use of low-resolution inputs for several reasons: (1) Stimuli that trigger eye movements usually fall in para-foveal or peripheral regions of the retina, which have lower resolution than the fovea. (2) People can perceive and recognize objects well even when they are in low resolution. (3) Fixations on lower-resolution images can predict fixations on higher-resolution images. In the proposed computational model, we extracted multi-scale image patches at fixation regions from eye fixation datasets, resized them to low resolution, and fed them into a two-layer neural network. With layer-wise unsupervised feature learning, we found that many proto-object-like features, responsive to different shapes of object blobs, were learned in the second layer. Visualizations also show that these features are selective to potential objects in the scene, and that their responses, when combined with learned weights, work well in predicting eye fixations on the images. Meeting abstract presented at VSS 2015.
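The patch-preparation step described in the abstract (multi-scale patches at fixation points, resized to low resolution) might look roughly like the sketch below. It is an illustration under stated assumptions, not the authors' implementation: the function name `extract_low_res_patches`, the scales (32, 64, 128), the 16×16 output size, the grayscale input, and block averaging as the resizing method are all hypothetical choices that the abstract does not specify. The two-layer unsupervised learning stage is left out because the abstract does not name the learning algorithm.

```python
import numpy as np

def extract_low_res_patches(image, fixations, scales=(32, 64, 128), out_size=16):
    """Crop square patches centered on each fixation at several scales and
    downsample each one to out_size x out_size by block averaging.

    Assumptions (not from the abstract): `image` is a 2-D grayscale array,
    `fixations` is a list of (row, col) pixel coordinates, and every scale
    is a multiple of `out_size`.
    """
    h, w = image.shape
    patches = []
    for row, col in fixations:
        for s in scales:
            half = s // 2
            top, left = row - half, col - half
            # Skip patches that would extend past the image border.
            if top < 0 or left < 0 or top + s > h or left + s > w:
                continue
            patch = image[top:top + s, left:left + s]
            f = s // out_size  # integer downsampling factor
            # Average each f x f block to get one low-resolution pixel.
            low = patch.reshape(out_size, f, out_size, f).mean(axis=(1, 3))
            patches.append(low.ravel())
    return np.array(patches)

# Usage on synthetic data: one fixation in the image center, one too close
# to the border for any of the scales to fit.
rng = np.random.default_rng(0)
image = rng.random((256, 256))
fixations = [(128, 128), (10, 10)]
X = extract_low_res_patches(image, fixations)
print(X.shape)  # (3, 256): three scales for the central fixation only
```

Each row of `X` is one flattened low-resolution patch, so patches from every scale share the same input dimensionality and could be stacked into a single training matrix for a feature-learning network.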

Similar resources

A Deep Model for Super-resolution Enhancement from a Single Image

This study presents a method to reconstruct a high-resolution image using a deep convolutional neural network. We propose a deep model, entitled Deep Block Super Resolution (DBSR), that fuses the output features of a deep convolutional network and a shallow convolutional network. In this way, our model benefits from the high-frequency and low-frequency features extracted from the deep and shallow networks...

Emergence of Proto-Object Representations via Fixations in Low-Resolution

One prominent feature of our visual system is that the fovea, the highest-resolution portion of the retina, occupies only two degrees of visual angle, while the remaining portion of the retina (parafovea and periphery) is mainly low-resolution. Therefore, before we make a saccadic eye movement, the potential fixation target is usually located in the parafovea or periphery and is perceived in low-resol...

Anomaly-based Web Attack Detection: The Application of Deep Neural Network Seq2Seq With Attention Mechanism

Today, the use of the Internet and websites has become an integral part of people's lives, and most activities and important data reside on websites. Thus, attempts to intrude into these websites have grown exponentially. Intrusion detection systems (IDS) for web attacks are one approach to protecting users. However, these systems suffer from drawbacks such as low accuracy in ...

Deep Unsupervised Domain Adaptation for Image Classification via Low Rank Representation Learning

Domain adaptation is a powerful technique given a large amount of labeled data with similar attributes in different domains. In real-world applications, there is a huge amount of data, but most of it is unlabeled. Domain adaptation is effective in image classification, where it is expensive and time-consuming to obtain adequately labeled data. We propose a novel method named DALRRL, which consists of deep ...

Integration of Deep Learning Algorithms and Bilateral Filters with the Purpose of Building Extraction from Mono Optical Aerial Imagery

Extracting buildings from mono optical aerial imagery with high spatial resolution has long been considered an important challenge in map preparation. The goal of the current research is to take advantage of semantic segmentation of mono optical aerial imagery to extract buildings, realized through a combination of deep convolutional neural networks (DCNN) an...

Journal:
  • Journal of Vision

Volume 15, Issue 12

Pages: -

Publication date: 2015