Automated Image Captioning for Rapid Prototyping and Resource Constrained Environments

Authors

  • Karan Sharma
  • Arun C. S. Kumar
  • Suchendra M. Bhandarkar
Abstract

Significant performance gains in deep learning, coupled with the exponential growth of image and video data on the Internet, have resulted in the recent emergence of automated image captioning systems. Ensuring that these systems scale with the ever-increasing volume of image and video data is a significant challenge. This paper offers the insight that detecting a few significant (top) objects in an image allows one to extract other relevant information, such as the actions (verbs) in the image. We expect this insight to be useful in the design of scalable image captioning systems. We address two parameters by which the scalability of image captioning systems can be quantified: the traditional algorithmic time complexity, which matters given the resource limitations of the user device, and the system development time, since programmers' time is a critical resource constraint in many real-world scenarios. Additionally, we address how word embeddings can be used to infer the verb (action) from the nouns (objects) in a given image in a zero-shot manner. Our results show that our approaches attain reasonably good performance on predicting actions and captioning images, with the added advantage of being simple to implement.
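The zero-shot idea in the abstract, inferring a verb from detected nouns through word embeddings, can be illustrated with a minimal sketch. The abstract does not describe the authors' actual procedure, so everything below is an assumption made for illustration: the noun vectors are averaged, candidate verbs are ranked by cosine similarity, and the three-dimensional vectors in `TOY_EMBEDDINGS` are invented rather than taken from any trained model. The names `infer_verb`, `cosine`, and `mean_vector` are hypothetical.

```python
import math

# Hand-made 3-dimensional vectors, invented purely for illustration.
# A real system would load pretrained word embeddings instead.
TOY_EMBEDDINGS = {
    "horse": [0.9, 0.1, 0.0],
    "person": [0.5, 0.5, 0.2],
    "ride": [0.8, 0.3, 0.1],
    "eat": [0.1, 0.9, 0.2],
    "fly": [0.0, 0.2, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]


def infer_verb(nouns, candidate_verbs, embeddings=TOY_EMBEDDINGS):
    """Pick the candidate verb closest to the averaged noun vectors.

    Assumed scoring rule (not confirmed by the source): cosine similarity
    between the mean noun embedding and each verb embedding.
    """
    query = mean_vector([embeddings[noun] for noun in nouns])
    return max(candidate_verbs, key=lambda verb: cosine(query, embeddings[verb]))


if __name__ == "__main__":
    # Nouns stand in for the top object detections of an image.
    print(infer_verb(["person", "horse"], ["ride", "eat", "fly"]))
```

No verb classifier is trained here: the candidate verbs are scored only through their position in the embedding space, which is what makes the inference zero-shot in this sketch.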

Similar resources

Automated Image Captioning Using Nearest-Neighbors Approach Driven by Top-Object Detections

The significant performance gains in deep learning coupled with the exponential growth of image and video data on the Internet have resulted in the recent emergence of automated image captioning systems. Two broad paradigms have emerged in automated image captioning, i.e., generative model-based approaches and retrieval-based approaches. Although generative model-based approaches that use the r...

Impact of Prototyping Resource Environments and Timing of Awareness of Constraints on Idea Generation in Product Design

Research and development laboratories in universities and firms around the world are trying to maximize innovation with a limited set of resources. However, questions remain about the influence of resource constraints on idea generation in early-stage product design. Multiple embedded case studies were conducted with engineering students and professors at two university campuses in Mexico. Stud...

Guided Open Vocabulary Image Captioning with Constrained Beam Search

Existing image captioning models do not generalize well to out-of-domain images containing novel scenes or objects. This limitation severely hinders the use of these models in real world applications dealing with images in the wild. We address this problem using a flexible approach that enables existing deep captioning architectures to take advantage of image taggers at test time, without re-tr...

A Model-based Approach for Rapid Prototyping of Parallel Applications on Manycore

Rapid prototyping of highly parallel applications on manycore platforms is extremely challenging. This paper presents an automated analysis and code generation flow for implementing high-level KPN models on STHORM, an embedded 64-core computing fabric developed by STMicroelectronics. The flow is model-based with sound semantical basis and enables formal verification and performance analysis at ...

Toward Scalable Social Alt Text: Conversational Crowdsourcing as a Tool for Refining Vision-to-Language Technology for the Blind

The access of visually impaired users to imagery in social media is constrained by the availability of suitable alt text. It is unknown how imperfections in emerging tools for automatic caption generation may help or hinder blind users’ understanding of social media posts with embedded imagery. In this paper, we study how crowdsourcing can be used both for evaluating the value provided by exist...

Journal:
  • CoRR

Volume: abs/1606.01393

Publication date: 2016