captions

Generating captions without looking beyond objects

Journal: CoRR 2016

Hendrik Heuer Christof Monz Arnold W. M. Smeulders

This paper explores new evaluation perspectives for image captioning and introduces a noun translation task that achieves comparable image caption generation performance by translating from a set of nouns to captions. This implies that in image captioning, all word categories other than nouns can be evoked by a powerful language model without sacrificing performance on n-gram precision. The pa...

ANVIL: a System for the Retrieval of Captioned Images using NLP Techniques

2000

Tony Rose David Elworthy Aaron Kotcheff Amanda Clare Petros Tsonis

ANVIL is a system designed for the retrieval of images annotated with short captions. It uses NLP techniques to extract dependency structures from captions and queries, and then applies a robust matching algorithm to recursively explore and compare them. There are currently two main interfaces to ANVIL: a list-based display and a 2D spatial layout that allows users to interact with and navigate...

Spatial Natural Language Generation for Location Description in Photo Captions

2015

Mark M. Hall Christopher B. Jones Philip David Smart

We present a spatial natural language generation system to create captions that describe the geographical context of geo-referenced photos. An analysis of existing photo captions was used to design templates representing typical caption language patterns, while the results of human subject experiments were used to create field-based spatial models of the applicability of some commonly used spat...

Enhancing Social Tagging with a Knowledge Organization System

2008

Koraljka Golub Jim Moon Douglas Tudhope Marianne Lykke Nielsen

The paper investigates the effect on indexing and retrieval when using only social tagging versus when using social tagging in combination with suggestions from a knowledge organization system. The specific context is that of tagging by Web document readers, using Dewey Decimal Classification, its captions, Relative Index Terms and Library of Congress Subject Headings mapped to the captions. Th...

Déjà Image-Captions: A Corpus of Expressive Descriptions in Repetition

2015

Jianfu Chen Polina Kuznetsova David Warren Yejin Choi

We present a new approach to harvesting a large-scale, high-quality image-caption corpus that makes better use of already existing web data with no additional human effort. The key idea is to focus on Déjà Image-Captions: naturally existing image descriptions that are repeated almost verbatim by more than one individual for different images. The resulting corpus provides association struct...

Slightly Supervised Adaptation of Acoustic Models on Captioned BBC Weather Forecasts

2013

Christian Mohr Christian Saam Kevin Kilgour Jonas Gehring Sebastian Stüker Alexander H. Waibel

In this paper we investigate the exploitation of loosely transcribed audio data, in the form of captions for weather forecast recordings, in order to adapt acoustic models for automatically transcribing these kinds of forecasts. We focus on dealing with inaccurate time stamps in the captions and the fact that they often deviate from the exact spoken word sequence in the forecasts. Furthermore, ...

Full Utilization of Closed-captions in Broadcast News Recognition

2006

Meng Meng Shijin Wang Jiaen Liang Peng Ding Bo Xu

Lightly supervised acoustic model training has been recognized as an effective way to improve acoustic model training for broadcast news recognition. In this paper, a new approach is introduced to both fully utilize the un-transcribed data by using closed captions as transcripts and to select more informative data for acoustic model training. We will show that this approach is superior to regul...

Automatic Speech Recognition and Hybrid Machine Translation for High-Quality Closed-Captioning and Subtitling for Video Broadcast

2012

Hassan Sawaf

We describe a system to rapidly generate high-quality closed captions and subtitles for live broadcast TV shows, using automated components, namely Automatic Speech Recognition and Machine Translation. The human stays in the loop for quality assurance and optional post-editing. We also describe how the system feeds the human edits and corrections back into the different components for improvem...

A Graphic Page Image Processing System

2007

Linlin Li Shijian Lu Chew Lim Tan

Patent images maintained by the U.S. patent database have figures and their related descriptions separated on different pages. This makes it difficult for users to refer to a figure while reading the description. In order to prepare these patent images for a friendly user interface, this paper presents a system that is able to segment a multiple-figure image page into individual figures and extract ...

Image Representations and New Domains in Neural Image Captioning

2015

Jack Hessel Nicolas Savva Michael J. Wilber

We examine the possibility that recent promising results in automatic caption generation are due primarily to language models. By varying image representation quality produced by a convolutional neural network, we find that a state-of-the-art neural captioning algorithm is able to produce quality captions even when provided with surprisingly poor image representations. We replicate this result i...
