Combining Geometric, Textual and Visual Features for Predicting Prepositions in Image Descriptions: Supplementary Material

Authors

  • Arnau Ramisa
  • Josiah Wang
  • Ying Lu
  • Emmanuel Dellandrea
  • Francesc Moreno-Noguer
  • Robert Gaizauskas
Abstract

Arnau Ramisa*, Josiah Wang*, Ying Lu, Emmanuel Dellandrea, Francesc Moreno-Noguer, Robert Gaizauskas

  • 1 Institut de Robòtica i Informàtica Industrial (UPC-CSIC), Barcelona, Spain: {aramisa, fmoreno}@iri.upc.edu
  • 2 Department of Computer Science, University of Sheffield, UK: {j.k.wang, r.gaizauskas}@sheffield.ac.uk
  • 3 LIRIS, École Centrale de Lyon, France: {ying.lu, emmanuel.dellandrea}@ec-lyon.fr

Similar resources

Combining Geometric, Textual and Visual Features for Predicting Prepositions in Image Descriptions

We investigate the role that geometric, textual and visual features play in the task of predicting a preposition that links two visual entities depicted in an image. The task is an important part of the subsequent process of generating image descriptions. We explore the prediction of prepositions for a pair of entities, both in the case when the labels of such entities are known and unknown. In...

Natural Language Descriptions for Human Activities in Video Streams

There has been continuous growth in the volume and ubiquity of video material. It has become essential to define video semantics in order to aid the searchability and retrieval of this data. We present a framework that produces textual descriptions of video, based on the visual semantic content. Detected action classes rendered as verbs, participant objects converted to noun phrases, visual pro...

Generating Descriptions of Spatial Relations between Objects in Images

We investigate the task of predicting prepositions that can be used to describe the spatial relationships between pairs of objects depicted in images. We explore the extent to which such spatial prepositions can be predicted from (a) language information, (b) visual information, and (c) combinations of the two. In this paper we describe the dataset of object pairs and prepositions we have creat...

I Can Has Cheezburger? A Nonparanormal Approach to Combining Textual and Visual Information for Predicting and Generating Popular Meme Descriptions

The advent of social media has brought Internet memes, a unique social phenomenon, to the front stage of the Web. Although memes are embodied in the form of images with text descriptions, we know little about the “language of memes”. In this paper, we statistically study the correlations among popular memes and their wordings, and generate meme descriptions from raw images. To do this, we take a multimodal app...

Tags Re-ranking Using Multi-level Features in Automatic Image Annotation

Automatic image annotation is a process in which computer systems automatically assign textual tags related to the visual content of a query image. In most cases, inappropriate tags generated by users, as well as images without any tags, are among the challenges in this field that have a negative effect on the query's result. In this paper, a new method is presented for automatic image...

Publication date: 2015