Scalable Image Annotation by Summarizing Training Samples into Labeled Prototypes
Authors
Abstract:
As the number of images grows, fast search methods and intelligent image filtering become essential. To handle images in large datasets, relevant tags are assigned to each image to describe its content. Automatic Image Annotation (AIA) aims to automatically assign a group of keywords to an image based on its visual content. AIA frameworks have two main stages, Feature Extraction and Tag Assignment, both of which are important for good performance. In the first stage of our proposed method, we use deep models to obtain a visual representation of images. We apply different pre-trained Convolutional Neural Network (CNN) architectures to the input image, including VGG16, DenseNet169, and ResNet101. After passing the image through the layers of the CNN, we take a single feature vector from the penultimate layer, which gives a rich representation of the visual content of the image. One advantage of a deep feature extractor is that it replaces multiple feature vectors with a single one, so there is no need to combine multiple features. In the second stage, called Tag Assignment, tags are assigned from training images to a test image. Our approach belongs to the search-based methods, which achieve high performance despite their simple structure. However, these methods are time-consuming because they compare the test image to every training image in order to find similar images. Despite the effectiveness of automatic image annotation methods, providing a method that scales to large datasets remains challenging. In this paper, to address this challenge, we propose a novel approach that summarizes the training database (images and their relevant tags) into a small number of prototypes. To this end, we apply a clustering algorithm to the visual descriptors of the training images to extract the visual part of the prototypes.
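The summarization step can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm, so plain Lloyd's k-means is assumed here, and the function name `kmeans_prototypes`, the toy 2-D points (standing in for CNN descriptors), and all parameter values are hypothetical.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans_prototypes(features, k, iterations=20, seed=0):
    """Summarize feature vectors into k centroids (assumed: Lloyd's k-means)."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(features, k)]
    assignment = [0] * len(features)
    for _ in range(iterations):
        # Assign each descriptor to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in features
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, a in zip(features, assignment) if a == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assignment

# Toy stand-in for CNN descriptors: two well-separated groups of 2-D points.
features = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
            [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
prototypes, assignment = kmeans_prototypes(features, k=2)

# Fraction of the original number of vectors that is kept after summarization.
print(len(prototypes) / len(features))
```

At annotation time a test image would then be compared against the `k` centroids rather than against every training image, which is where the reduction in search cost comes from.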
Since the number of clusters is much smaller than the number of images, our approach achieves a good level of summarization. In the next step, we extract the labels of the prototypes from the labels of the input images in the dataset. To do so, semantic labels are propagated from the training images to the prototypes using a label propagation process on a graph. This graph has one node for each input image and one node for each prototype, so its node set is the union of the input images and the prototypes. To extract the edges of the graph, the visual feature of each node is coded using the other nodes to obtain its K nearest neighbors, which is done with the Locality-constrained Linear Coding algorithm. After constructing this graph, a label propagation algorithm is applied to it to extract the labels of the prototypes. The result is a set of labeled prototypes that can be used to annotate any test image. To assign tags to an input image, we propose an adaptive thresholding method that finds the labels of a new image by linear interpolation from the labels of the learned prototypes. The proposed method can reduce the size of a training dataset to 22.6% of its original size. This reduction considerably shortens annotation time: the proposed method is at least 4.2 times faster than state-of-the-art search-based methods such as 2PKNN, while maintaining annotation performance in terms of Precision, Recall and F1 on different datasets.
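The propagation and annotation steps can be sketched roughly as follows, under several assumptions that do not come from the abstract. Normalized inverse-distance weights over the K nearest neighbors stand in for Locality-constrained Linear Coding, a simple clamped averaging iteration stands in for the unspecified label propagation algorithm, and a ratio-of-maximum cutoff stands in for the adaptive threshold. All function names, the toy data, and the parameter values are hypothetical.

```python
import math

def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def neighbor_weights(query, candidates, k):
    """Normalized inverse-distance weights over the k nearest candidates.

    Assumed stand-in for LLC coding; returns {candidate_index: weight}.
    """
    nearest = sorted(range(len(candidates)),
                     key=lambda j: dist(query, candidates[j]))[:k]
    raw = {j: 1.0 / (dist(query, candidates[j]) + 1e-9) for j in nearest}
    total = sum(raw.values())
    return {j: w / total for j, w in raw.items()}

def propagate_labels(nodes, labels, labeled, k=3, iterations=20):
    """Spread label scores over a kNN graph; labeled nodes stay clamped."""
    n, num_tags = len(nodes), len(labels[0])
    graph = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        local = neighbor_weights(nodes[i], [nodes[j] for j in others], k)
        graph.append({others[j]: w for j, w in local.items()})
    scores = [row[:] for row in labels]
    for _ in range(iterations):
        # Each node takes the weighted average of its neighbors' scores.
        updated = [[sum(w * scores[j][t] for j, w in graph[i].items())
                    for t in range(num_tags)] for i in range(n)]
        for i in labeled:  # training images keep their known labels
            updated[i] = labels[i][:]
        scores = updated
    return scores

def annotate(test_feature, prototypes, prototype_labels, tags, k=2, ratio=0.5):
    """Interpolate tag scores from nearby prototypes, then threshold them."""
    weights = neighbor_weights(test_feature, prototypes, k)
    scores = [sum(w * prototype_labels[p][t] for p, w in weights.items())
              for t in range(len(tags))]
    threshold = ratio * max(scores)  # hypothetical adaptive rule
    return [tag for tag, s in zip(tags, scores) if s > 0 and s >= threshold]

tags = ["sky", "car"]
images = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
          [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
image_labels = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0],
                [0.0, 1.0], [0.0, 1.0], [0.0, 1.0]]
prototypes = [[0.1, 0.1], [5.0, 5.0]]  # e.g. cluster centroids

# Graph nodes are the union of training images and (unlabeled) prototypes.
nodes = images + prototypes
labels = image_labels + [[0.0, 0.0] for _ in prototypes]
scores = propagate_labels(nodes, labels, labeled=range(len(images)))
prototype_labels = scores[len(images):]

predicted = annotate([0.1, 0.0], prototypes, prototype_labels, tags)
print(predicted)
```

In this sketch only the prototypes and their propagated label scores are needed at test time; the training images are used once, during propagation.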
Journal title
volume 18 issue 4
pages 49-68
publication date 2022-03