Multi-scale Orderless Pooling of Deep Convolutional Activation Features
نویسندگان
چکیده
Deep convolutional neural networks (CNN) have shown their promise as a universal representation for recognition. However, global CNN activations lack geometric invariance, which limits their robustness for classification and matching of highly variable scenes. To improve the invariance of CNN activations without degrading their discriminative power, this paper presents a simple but effective scheme called multiscale orderless pooling (MOP-CNN). This scheme extracts CNN activations for local patches at multiple scale levels, performs orderless VLAD pooling of these activations at each level separately, and concatenates the result. The resulting MOP-CNN representation can be used as a generic feature for either supervised or unsupervised recognition tasks, from image classification to instance-level retrieval; it consistently outperforms global CNN activations without requiring any joint training of prediction layers for a particular target dataset. In absolute terms, it achieves stateof-the-art results on the challenging SUN397 and MIT Indoor Scenes classification datasets, and competitive results on ILSVRC2012/2013 classification and INRIA Holidays retrieval datasets.
منابع مشابه
VLAD Is not Necessary for CNN
Global convolutional neural networks (CNNs) activations lack geometric invariance, and in order to address this problem, Gong et al proposed multi-scale orderless pooling(MOP-CNN), which extracts CNN activations for local patches at multiple scale levels, and performs orderless VLAD pooling to extract features. However, we find that this method can improve the performance mainly because it extr...
متن کاملAdaptive Deep Pyramid Matching for Remote Sensing Scene Classification
Convolutional neural networks (CNNs) have attracted increasing attention in the remote sensing community. Most CNNs only take the last fully-connected layers as features for the classification of remotely sensed images, discarding the other convolutional layer features which may also be helpful for classification purposes. In this paper, we propose a new adaptive deep pyramid matching (ADPM) mo...
متن کاملتشخیص و فیلترینگ هوشمند تصاویر نامتعارف بهکمک شبکههای عصبی عمیق
Currently vast improvement of internet access and significant growth of web based broadcasters have resulted in distribution and sharing of informative resources such as images worldwide. Although this kind of sharing may bring many advantages, there are certain risks such as access of kids to porn images which should not be neglected. In fact, access to these images can be a threat to the cult...
متن کاملA multi-scale convolutional neural network for automatic cloud and cloud shadow detection from Gaofen-1 images
The reconstruction of the information contaminated by cloud and cloud shadow is an important step in pre-processing of high-resolution satellite images. The cloud and cloud shadow automatic segmentation could be the first step in the process of reconstructing the information contaminated by cloud and cloud shadow. This stage is a remarkable challenge due to the relatively inefficient performanc...
متن کاملCompetitive Multi-scale Convolution
In this paper, we introduce a new deep convolutional neural network (ConvNet) module that promotes competition among a set of multi-scale convolutional filters. This new module is inspired by the inception module, where we replace the original collaborative pooling stage (consisting of a concatenation of the multi-scale filter outputs) by a competitive pooling represented by a maxout activation...
متن کامل