Mahalanobis Encodings for Visual Categorization

Authors

Tomoki Matsuzawa

Raissa Relator

Wataru Takei

Shinichiro Omachi

Tsuyoshi Kato

Abstract

The design of the image representation is now one of the most crucial factors in the performance of visual categorization. A common pipeline in most recent research for obtaining an image representation consists of two steps: the encoding step and the pooling step. In this paper, we introduce the Mahalanobis metric into two popular image patch encoding modules, Histogram Encoding and Fisher Encoding, which are used in the Bag-of-Visual-Words method and the Fisher Vector method, respectively. Moreover, for the proposed Fisher Vector method, a closed-form approximation of the Fisher Vector can be derived under the same assumption used in the original Fisher Vector, and the codebook is built without resorting to time-consuming EM (Expectation-Maximization) steps. Experimental evaluation on multi-class classification demonstrates the effectiveness of the proposed encoding methods.
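To make the encoding and pooling steps concrete, the following is a minimal sketch of histogram encoding with a Mahalanobis metric in place of the Euclidean one. It is not the authors' implementation: the abstract does not say how the metric or the codebook is obtained, so the matrix `M`, the two-word codebook, and the function names `mahalanobis_sq` and `histogram_encode` are all hypothetical and chosen only for illustration.

```python
from typing import List, Sequence

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]


def mahalanobis_sq(x: Vector, c: Vector, M: Matrix) -> float:
    """Squared Mahalanobis distance (x - c)^T M (x - c)."""
    d = [xi - ci for xi, ci in zip(x, c)]
    n = len(d)
    return sum(d[i] * M[i][j] * d[j] for i in range(n) for j in range(n))


def histogram_encode(patches: Sequence[Vector],
                     codebook: Sequence[Vector],
                     M: Matrix) -> List[float]:
    """Encoding step: hard-assign each patch to its nearest codeword under M.
    Pooling step: average the one-hot codes into a normalized histogram."""
    counts = [0.0] * len(codebook)
    for x in patches:
        dists = [mahalanobis_sq(x, c, M) for c in codebook]
        counts[dists.index(min(dists))] += 1.0
    total = sum(counts)
    return [v / total for v in counts] if total else counts


# Toy data (assumed, not from the paper): two codewords, two patch descriptors.
codebook = [(0.0, 0.0), (2.0, 2.0)]
patches = [(2.0, 0.5), (2.0, 2.1)]

identity = [[1.0, 0.0], [0.0, 1.0]]   # reduces to the Euclidean metric
M = [[0.1, 0.0], [0.0, 10.0]]         # hand-picked metric that stresses axis 2

hist_euclidean = histogram_encode(patches, codebook, identity)
hist_mahalanobis = histogram_encode(patches, codebook, M)
print(hist_euclidean)    # [0.0, 1.0]
print(hist_mahalanobis)  # [0.5, 0.5]
```

In this toy case the first patch moves from the second codeword to the first once the metric down-weights the first axis, which is the kind of change in assignment that a learned metric would be expected to produce.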


Similar resources

Image Retrieval and Classification Using Local Distance Functions

$(x_i - x_j)^\top A (x_i - x_j)$ Mahalanobis distance: Previous work on learning metrics has focused on learning a single distance metric for all instances. One of our primary contributions is to learn a distance function for every training image. Most visual categorization approaches make use of machine learning after computing distances between images (e.g. SVM with pyramid kernel). We want to learn how to co...


Coarse blobs or fine edges? Evidence that information diagnosticity changes the perception of complex visual stimuli.

Efficient categorizations of complex visual stimuli require effective encodings of their distinctive properties. However, the question remains of how processes of object and scene categorization use the information associated with different perceptual spatial scales. The psychophysics of scale perception suggests that recognition uses coarse blobs before fine scale edges, because the former is ...


Visualization of Movement in Multiscale

This paper will explore a number of visual encodings that can be layered over (x, y, scale, rotation) movement through multiscale. The visual encodings are designed to enhance viewers’ understanding of movement in between key frames. Since none of the visual encodings in this paper conveys all aspects of movement, to give the viewer a complete sense of movement, multiple visual encodings must b...


Face retrieval by an adaptive Mahalanobis distance using a confidence factor

This paper proposes an adaptive Mahalanobis distance for face retrieval. The distance is derived from a posterior distribution of observation errors in features categorized by confidence of face images. Since the distance is calculated considering error variances of each dimension according to the confidence, it can reflect error distribution of each matching more precisely than a standard Mahalan...


Visual Analytics Using Density Equalizing Geographic Distortion

Visualizing large geo-demographical data sets using pixel-based techniques involves mapping the geo-spatial dimensions of a data point to screen coordinates and appropriately encoding its statistical value by color. Analysis of such data is a great challenge. General tasks involve clustering, categorization and searching for patterns of interest for sociological or economic research. Available ...



Journal:

IPSJ Trans. Computer Vision and Applications

Volume 7, Issue

Pages -

Publication date: 2015
