Multi-scale discriminant saliency with wavelet-based Hidden Markov Tree modelling

Authors

  • Anh Cat Le Ngo
  • Li-Minn Ang
  • Kah Phooi Seng
  • Guoping Qiu
Abstract

Bottom-up saliency, an early stage of human visual attention, can be considered as a binary classification problem between centre and surround classes. The discriminant power of features for this classification is measured as the mutual information between the distributions of image features and the corresponding classes. As the estimated discrepancy depends strongly on the scale level considered, multi-scale structure and discriminant power are integrated by employing discrete wavelet features and a Hidden Markov Tree (HMT). With wavelet coefficients and Hidden Markov Tree parameters, quad-tree-like label structures are constructed and used in maximum a posteriori (MAP) estimation of the hidden class variables at the corresponding dyadic sub-squares. A saliency value for each square block at each scale level is computed with the discriminant power principle. Finally, the final saliency map is integrated across multiple scales by an information maximization rule. Both standard quantitative tools such as NSS, LCC and AUC and qualitative assessments are used to evaluate the proposed multi-scale discriminant saliency (MDIS) against the well-known information-based approach AIM on its released image collection with eye-tracking data. Simulation results are presented and analysed to verify the validity of MDIS as well as to point out its limitations for further research.

Preprint submitted to Elsevier, February 1, 2013. arXiv:1301.7641v1 [cs.CV] 31 Jan 2013

1. Visual Attention Computational Approach

Visual attention is a psychological phenomenon in which human visual systems are optimized for capturing scenic information. The robustness and efficiency of the biological devices involved (the eyes, their control systems, and the visual pathways in the brain) have amazed scientists and engineers for centuries. From Neisser [1] to Marr [2], researchers have put intensive effort into discovering attention principles and engineering artificial systems with equivalent capability.
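The discriminant power named in the abstract is a mutual information between a feature and the centre/surround class label. The sketch below is only a minimal illustration of that quantity, assuming features that have already been quantized into discrete bins and a simple plug-in (counting) estimator. It is not the wavelet HMT procedure of the paper, and the function name `mutual_information` and the label strings are hypothetical.

```python
import math
from collections import Counter


def mutual_information(features, labels):
    """Plug-in estimate of I(X; C) in bits from paired discrete samples.

    features: quantized feature values (assumption: already binned)
    labels:   class labels, e.g. "centre" or "surround"
    """
    n = len(features)
    if n == 0 or n != len(labels):
        raise ValueError("features and labels must be non-empty and of equal length")

    joint = Counter(zip(features, labels))   # counts of (x, c) pairs
    count_x = Counter(features)              # marginal counts of x
    count_c = Counter(labels)                # marginal counts of c

    mi = 0.0
    for (x, c), count in joint.items():
        p_xc = count / n
        # p(x, c) / (p(x) p(c)) simplifies to count * n / (count_x * count_c)
        mi += p_xc * math.log2(count * n / (count_x[x] * count_c[c]))
    return mi


# A feature that separates the two classes perfectly carries 1 bit.
separating = mutual_information([0, 0, 1, 1],
                                ["centre", "centre", "surround", "surround"])
# A feature that is independent of the class carries 0 bits.
independent = mutual_information([0, 1, 0, 1],
                                 ["centre", "centre", "surround", "surround"])
```

Under this reading, a feature is more discriminant, and the location more salient, the closer its mutual information with the class label gets to the entropy of the label itself.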
For decades, this research field has been dominated by a visual attention principle that proposes the existence of a saliency map for attention guidance. The idea is further promoted in Feature Integration Theory (FIT) [3], which elaborates computational principles of saliency map generation with centre-surround operators and basic image features such as intensity, orientation and colour. Then, Itti et al. [4] implemented and released the first complete computer algorithms of FIT. Feature Integration Theory is widely accepted as the principle behind visual attention, partly due to its use of basic image features such as colour, intensity, and orientation. Moreover, this hypothesis is supported by evidence from several psychological experiments. However, it only defines the theoretical aspects of saliency maps and visual attention, and does not investigate how such principles would be implemented algorithmically. This lack of implementation detail leaves the research field open for many later saliency algorithms [4, 5, 6, 7]. Saliency might be computed as a linear contrast between features of central and surrounding environments across multiple scales by centre-surround operators. Saliency is also modelled as a phase difference in the Fourier transform domain [8], or its value may depend on statistical modelling of the local feature distribution [6]. Though many approaches are mentioned in the long and rich literature on visual saliency, only a few are built on a solid theory or linked to other well-established computational theories. Among these approaches, Neil Bruce's work [9] established a bridge between visual saliency and information theory. It takes a first step towards bridging two unrelated fields; moreover, visual attention could for the first time be viewed as an information system. Since then, information-based visual saliency has been continuously investigated and developed in several works [10, 11, 12, 13].
The distinguishing points between these works are their computational approaches for retrieving information from features. The process attracts much inter...

http://ilab.usc.edu/toolkit/
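The linear centre-surround contrast mentioned in the text can be pictured with a small sketch. The code below assumes a grey-level image stored as a list of rows, box-shaped (mean) windows clipped at the image border, and a sum over a few radius pairs as the cross-scale combination. All names and the choice of radii are hypothetical, and this is neither the implementation of Itti et al. nor the MDIS method.

```python
def box_mean(img, r, c, radius):
    """Mean of img over a square window around (r, c), clipped to the borders."""
    rows, cols = len(img), len(img[0])
    r0, r1 = max(0, r - radius), min(rows, r + radius + 1)
    c0, c1 = max(0, c - radius), min(cols, c + radius + 1)
    total = sum(img[i][j] for i in range(r0, r1) for j in range(c0, c1))
    return total / ((r1 - r0) * (c1 - c0))


def centre_surround(img, centre_radius, surround_radius):
    """Absolute difference between a small (centre) and a large (surround) local mean.

    Note: for simplicity the surround window here also contains the centre.
    """
    return [[abs(box_mean(img, r, c, centre_radius)
                 - box_mean(img, r, c, surround_radius))
             for c in range(len(img[0]))]
            for r in range(len(img))]


def multiscale_contrast(img, radius_pairs=((0, 1), (1, 3))):
    """Sum of centre-surround maps over several (centre, surround) radius pairs."""
    saliency = [[0.0] * len(img[0]) for _ in img]
    for centre_radius, surround_radius in radius_pairs:
        contrast = centre_surround(img, centre_radius, surround_radius)
        for r, row in enumerate(contrast):
            for c, value in enumerate(row):
                saliency[r][c] += value
    return saliency


# A single bright pixel on a dark 7x7 background stands out at its own location.
spot = [[0.0] * 7 for _ in range(7)]
spot[3][3] = 1.0
spot_saliency = multiscale_contrast(spot)

# A perfectly uniform image has no contrast anywhere.
flat_saliency = multiscale_contrast([[1.0] * 7 for _ in range(7)])
```

Information-based approaches replace this fixed linear contrast with a statistical measure of how informative the local features are.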

Similar resources

Discriminant Analysis and Adaptive Wavelet Feature Selection for Statistical Object Detection

We utilize the discriminant analysis to select wavelet features for efficient object detection. The analysis applies to the Bayesian classifier and is extended to the case of boosting. Based on the error analysis under the Bayesian decision rule, we reduce the number of coefficients involved in detection to lower the computational cost. Using a Hidden Markov Tree (HMT) model to describe the pat...

Multiscale Discriminant Saliency for Visual Attention

Bottom-up saliency, an early stage of human visual attention, can be considered as a binary classification problem between center and surround classes. The discriminant power of features for the classification is measured as the mutual information between the features and the distribution of the two classes. The estimated discrepancy of two feature classes very much depends on considered scale levels; then, mul...

Compressed-Sampling-Based Image Saliency Detection in the Wavelet Domain

When watching natural scenes, an overwhelming amount of information is delivered to the Human Visual System (HVS). The optic nerve is estimated to receive around 10^8 bits of information per second. This large amount of information can't be processed right away by our neural system. The visual attention mechanism enables the HVS to spend neural resources efficiently, only on the selected parts of the...

Linear discriminant analysis for speechreading

This paper investigates the use of Fisher-Rao linear discriminant analysis (LDA) as a means of visual feature extraction for hidden Markov model based automatic speechreading. For every video frame, a three-dimensional region of interest containing the speaker's mouth over a sequence of adjacent frames is lexicographically arranged into a data vector. Such vectors are then projected onto the sp...

Evaluation of the Hidden Markov Model for Detection of P300 in EEG Signals

Introduction: Evoked potentials elicited by stimulating the brain can be used as a communication tool between humans and machines. Most brain-computer interface (BCI) systems use the P300 component, which is an evoked potential. In this paper, we evaluate the use of the hidden Markov model (HMM) for detection of P300. Materials and Methods: The wavelet transforms, wavelet-enhanced indepen...

Journal:
  • Computers & Electrical Engineering

Volume 40, Issue

Pages -

Publication date: 2014