Weakly Supervised PatchNets: Learning Aggregated Patch Descriptors for Scene Recognition

Authors

  • Zhe Wang
  • Limin Wang
  • Yali Wang
  • Bowen Zhang
  • Yu Qiao
  • Charless Fowlkes
Abstract

In this paper, we propose a hybrid representation that leverages the discriminative capacity of CNNs and the efficiency of descriptor encoding schemes for scene recognition. We make three main contributions. First, we train an end-to-end PatchNet in a weakly supervised manner to extract discriminative deep descriptors of local patches. Second, we design a novel VSAD encoding approach that, with the help of semantic predictions from PatchNet, effectively aggregates deep local-patch descriptors into a global image representation. Finally, we evaluate our approach on two standard scene recognition benchmarks, MIT Indoor67 (86.2%) and SUN397 (73.0%), to show its effectiveness.
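The aggregation idea in the second contribution can be illustrated with a minimal sketch. The abstract does not give the VSAD formula, so everything below is an assumption made for illustration: each patch's descriptor is weighted by its semantic prediction, the weighted residuals to a per-codeword mean are summed, and the result is concatenated and L2-normalized. The function name `aggregate_patches` and the inputs `semantic_probs` and `codeword_means` are hypothetical and are not taken from the paper.

```python
from math import sqrt
from typing import List, Sequence


def aggregate_patches(
    descriptors: Sequence[Sequence[float]],
    semantic_probs: Sequence[Sequence[float]],
    codeword_means: Sequence[Sequence[float]],
) -> List[float]:
    """Aggregate local patch descriptors into one global vector.

    Assumed simplification (not the paper's exact VSAD formula): for each
    semantic codeword k, sum the probability-weighted residuals
    (descriptor - mean_k) over all patches, concatenate the per-codeword
    blocks, and L2-normalize the result.
    """
    num_codewords = len(codeword_means)
    dim = len(codeword_means[0])
    blocks = [[0.0] * dim for _ in range(num_codewords)]

    for desc, probs in zip(descriptors, semantic_probs):
        for k in range(num_codewords):
            weight = probs[k]  # semantic prediction for codeword k on this patch
            mean = codeword_means[k]
            for d in range(dim):
                blocks[k][d] += weight * (desc[d] - mean[d])

    # Concatenate the per-codeword blocks into a single image-level vector.
    flat = [value for block in blocks for value in block]
    norm = sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat


# Toy input: three patches, 2-D descriptors, two semantic codewords.
descriptors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
semantic_probs = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
codeword_means = [[0.5, 0.5], [0.0, 0.0]]

image_vector = aggregate_patches(descriptors, semantic_probs, codeword_means)
print(len(image_vector))  # 4 = number of codewords * descriptor dimension
```

The output length depends only on the number of codewords and the descriptor dimension, not on the number of patches, which is what lets a variable set of local descriptors become a fixed-size image representation.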

Similar resources

Deep patch learning for weakly supervised object classification and discovery

Patch-level image representation is very important for object classification and detection, since it is robust to spatial transformation, scale variation, and cluttered background. Many existing methods require fine-grained supervision (e.g., bounding-box annotations) to learn patch features, which demands great effort to label images and may limit their potential applications. In this ...

Learning Patch-based Structural Element Models with Hierarchical Palettes

Jeroen Chua, Master of Applied Science, Graduate Department of Electrical and Computer Engineering, University of Toronto, 2012. Image patches can be factorized into ‘shapelets’ that describe segmentation patterns, and palettes that describe how to paint the segments. This allows a flexible factorization of local shape (segmen...

Boosting VLAD with Supervised Dictionary Learning and High-Order Statistics

Recent studies show that aggregating local descriptors into a super vector yields an effective representation for retrieval and classification tasks. A popular method along this line is the vector of locally aggregated descriptors (VLAD), which aggregates the residuals between descriptors and visual words. However, the original VLAD ignores high-order statistics of local descriptors and its dictionary may n...

Weakly Supervised Object Localization with Stable Segmentations

Multiple Instance Learning (MIL) provides a framework for training a discriminative classifier from data with ambiguous labels. This framework is well suited for the task of learning object classifiers from weakly labeled image data, where only the presence of an object in an image is known, but not its location. Some recent work has explored the application of MIL algorithms to the tasks of im...

Support region estimation in biological images

We describe a series of experiments on patch size selection in a challenging dataset of coral reef images and show that we can achieve a significant performance increase over the naive strategy of selecting a fixed patch size. Many computer vision methodologies use the bag of visual words representation of image objects. This is true, in particular, for the state of the art texture recog...

Publication date: 2017