Search results for: captions
Number of results: 1268
Multimedia content may be supplemented with time-aligned closed captions for accessibility. Often these captions are created manually by professional editors — an expensive and time-consuming process. In this paper, we present a novel approach to automatic creation of a well-formatted, readable transcript for a video from closed captions or ASR output. Our approach uses acoustic and lexical feat...
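As a toy illustration of how acoustic and lexical cues might combine for this task (the abstract is truncated, so this is not the paper's actual method), consider inserting sentence boundaries based on pause lengths and a small lexicon of likely sentence-initial words; the lexicon and threshold below are invented for the example:

```python
# Toy sketch (not the paper's method): combine a lexical cue and an acoustic
# cue (pause before a word) to decide where to insert sentence boundaries in
# raw caption/ASR text. The lexicon and threshold are illustrative only.

SENTENCE_STARTERS = {"the", "we", "in", "this", "our"}  # hypothetical lexicon

def insert_boundaries(words, pauses, pause_threshold=0.5):
    """words: token list; pauses: seconds of silence before each word."""
    out = []
    for word, pause in zip(words, pauses):
        # A long pause followed by a likely sentence-initial word suggests
        # a sentence boundary at this position.
        if out and pause >= pause_threshold and word.lower() in SENTENCE_STARTERS:
            out[-1] += "."
            word = word.capitalize()
        out.append(word)
    return " ".join(out)

words = ["we", "present", "a", "system", "the", "results", "are", "promising"]
pauses = [0.0, 0.1, 0.0, 0.1, 0.8, 0.1, 0.0, 0.1]
print(insert_boundaries(words, pauses))
# -> "we present a system. The results are promising"
```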
Synthesizing realistic images from text descriptions on a dataset like Microsoft Common Objects in Context (MS COCO), where each image can contain several objects, is a challenging task. Prior work has used text captions to generate images. However, captions might not be informative enough to capture the entire image, and may be insufficient for the model to understand which objects in the i...
We discuss the obstacles to inference of correspondences between objects within photographic images and their counterpart concepts in descriptive captions of those images. This is important for information retrieval of photographic data since its content analysis is much harder than linguistic analysis of its captions. We argue that the key mapping is between certain caption concepts representin...
Video captions, also known as same-language subtitles, benefit everyone who watches videos (children, adolescents, college students, and adults). More than 100 empirical studies document that captioning a video improves comprehension of, attention to, and memory for the video. Captions are particularly beneficial for persons watching videos in their non-native language, for children and adults ...
We discuss the properties of a collection of news photos and captions, collected from the Associated Press and Reuters. Captions have a vocabulary dominated by proper names. We have implemented various text clustering algorithms to organize these items by topic, as well as an iconic matcher that identifies articles that share a picture. We have found that the special structure of captions allow...
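To make the kind of topic clustering mentioned above concrete, here is a minimal sketch using scikit-learn; the TF-IDF features, cluster count, and toy captions are illustrative assumptions, not the paper's configuration:

```python
# Minimal sketch of topic clustering over caption text, assuming scikit-learn
# is available; feature choices and cluster count are illustrative only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

captions = [
    "President meets foreign leaders at summit",
    "Storm damages coastal towns in Florida",
    "Leaders sign trade agreement at summit",
]

# Proper-name-heavy captions tend to cluster well on TF-IDF weighted terms.
vectorizer = TfidfVectorizer(stop_words="english")
X = vectorizer.fit_transform(captions)

# Group captions into topics; k=2 is arbitrary for this toy example.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)  # e.g. [0, 1, 0]: the two summit captions share a topic
```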
In this paper we describe the Microsoft COCO Caption dataset and evaluation server. When completed, the dataset will contain over one and a half million captions describing over 330,000 images. For the training and validation images, five independent human generated captions will be provided. To ensure consistency in evaluation of automatic caption generation algorithms, an evaluation server is...
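For readers who want to inspect the per-image captions, a brief sketch using the pycocotools COCO API follows; the annotation-file path is an assumed local download of the captions JSON:

```python
# Sketch of reading per-image captions with the pycocotools COCO API;
# the path below assumes a local copy of the captions annotation file.
from pycocotools.coco import COCO

coco = COCO("annotations/captions_val2014.json")  # assumed local path

img_id = coco.getImgIds()[0]
ann_ids = coco.getAnnIds(imgIds=[img_id])
for ann in coco.loadAnns(ann_ids):
    print(ann["caption"])  # typically five independent human captions
```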
The New Yorker publishes a weekly captionless cartoon. More than 5,000 readers submit captions for it. The editors select three of them and ask the readers to pick the funniest one. We describe an experiment that compares a dozen automatic methods for selecting the funniest caption. We show that negative sentiment, human-centeredness, and lexical centrality most strongly match the funniest capt...
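Lexical centrality can be sketched as a LexRank-style graph ranking over the submitted captions; the Jaccard similarity, toy captions, and use of networkx PageRank below are simplifying assumptions, not the paper's exact method:

```python
# Hedged sketch of lexical centrality for caption ranking (LexRank-style),
# assuming networkx; the tokenizer and similarity are simplified stand-ins.
import networkx as nx

def jaccard(a, b):
    a, b = set(a.lower().split()), set(b.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0

captions = [
    "He said the cat would handle the merger",
    "The cat is clearly in charge of the merger",
    "I never agreed to casual Friday",
]

# Edge weights are pairwise lexical similarity between candidate captions.
G = nx.Graph()
G.add_nodes_from(range(len(captions)))
for i, ci in enumerate(captions):
    for j in range(i + 1, len(captions)):
        w = jaccard(ci, captions[j])
        if w > 0:
            G.add_edge(i, j, weight=w)

# The most central caption best reflects what the submissions converge on.
scores = nx.pagerank(G, weight="weight")
print(captions[max(scores, key=scores.get)])
```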
We explore a variety of nearest neighbor baseline approaches for image captioning. These approaches find a set of nearest neighbor images in the training set from which a caption may be borrowed for the query image. We select a caption for the query image by finding the caption that best represents the “consensus” of the set of candidate captions gathered from the nearest neighbor images. When ...
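The consensus step described above can be sketched directly: score each candidate caption by its average similarity to the other candidates and keep the top one. The token-overlap similarity below is a stand-in assumption for whatever metric the paper actually uses (e.g. a CIDEr-style measure):

```python
# Minimal sketch of consensus caption selection; candidates would come from
# the nearest-neighbor images, and the similarity is a simplified stand-in.

def similarity(a, b):
    a, b = set(a.lower().split()), set(b.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0

def consensus_caption(candidates):
    # Pick the caption with the highest average similarity to the rest:
    # the one that best represents what the candidate set agrees on.
    def avg_sim(c):
        others = [x for x in candidates if x is not c]
        return sum(similarity(c, o) for o in others) / max(len(others), 1)
    return max(candidates, key=avg_sim)

candidates = [
    "a dog runs on the beach",
    "a dog running along the beach",
    "two people sit on a bench",
]
print(consensus_caption(candidates))  # one of the two dog captions
```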
The present study examined the relative effectiveness of bilingual subtitles for L2 viewing comprehension, compared to other subtitling types. Learners' allocation of attention to the image and subtitles/captions in different conditions, as well as the relationship between them, were also investigated. A total of 112 Chinese learners of English watched a documentary clip in one of four conditions (bilingual subtitles, ...