Search results for: captioning order

Number of results: 908879

Journal: Artificial Intelligence Review, 2021

Image captioning is the task of automatically generating sentences that describe an input image in the best way possible. The most successful techniques for generating captions have recently used attentive deep learning models. There are variations in how models with attention are designed. In this survey, we provide a review of the literature related to captioning. Instead of offering a comprehensive review of all prior work on models, expl...

Journal: Expert Systems With Applications, 2022

In recent years, Transformer structures have been widely applied in image captioning with impressive performance. However, previous works often neglect the geometry and position relations of different visual objects. These relations are thought to be crucial information for good results. Aiming to further improve image captioning with Transformers, this paper proposes an improved Geometry Attention Transformer (GAT) framework. In order to obta...

Journal: Proceedings of the ACM on Human-Computer Interaction, 2022

Deaf and Hard-of-Hearing (DHH) audiences have long complained about the caption quality of many online videos created by individual content creators on video-sharing platforms (e.g., YouTube). However, there is a lack of exploration of the practices, challenges, and perceptions of video captions from the perspectives of both content creators and DHH audiences. In this work, we first explore audiences' feedback on and reactions to YouTube throu...

2017
Hui Chen Guiguang Ding Sicheng Zhao Jungong Han

Existing methods for image captioning usually train the language model under the cross-entropy loss, which results in exposure bias and inconsistency with the evaluation metric. Recent research has shown that these two issues can be well addressed by the policy gradient method from reinforcement learning, owing to its unique capability of directly optimizing the discrete and non-differentia...

Journal: CoRR, 2017
Sang Phan Le Gustav Eje Henter Yusuke Miyao Shin'ichi Satoh

Captioning models are typically trained using the cross-entropy loss. However, their performance is evaluated on other metrics designed to better correlate with human assessments. Recently, it has been shown that reinforcement learning (RL) can directly optimize these metrics in tasks such as captioning. However, this is computationally costly and requires specifying a baseline reward at each st...
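To make the contrast between cross-entropy training and reward-based training concrete, here is a minimal sketch in Python. It is not the method of any paper listed here: it assumes a toy one-step "caption" drawn from a four-token vocabulary, a hypothetical reward function, and a moving-average baseline. All function names (`cross_entropy_grad`, `reinforce_grad`, `train`) are illustrative.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy_grad(logits, target):
    """Gradient of -log p(target) with respect to the logits."""
    probs = softmax(logits)
    return [p - (1.0 if i == target else 0.0) for i, p in enumerate(probs)]


def reinforce_grad(logits, sampled, reward, baseline):
    """Single-sample gradient of -(reward - baseline) * log p(sampled)."""
    advantage = reward - baseline
    return [advantage * g for g in cross_entropy_grad(logits, sampled)]


def train(reward_fn, vocab_size=4, steps=2000, lr=0.1, seed=0):
    """Toy policy-gradient loop; returns the final token probabilities."""
    rng = random.Random(seed)
    logits = [0.0] * vocab_size
    baseline = 0.0  # assumption: exponential moving average of past rewards
    for _ in range(steps):
        probs = softmax(logits)
        sampled = rng.choices(range(vocab_size), weights=probs)[0]
        reward = reward_fn(sampled)
        grad = reinforce_grad(logits, sampled, reward, baseline)
        logits = [w - lr * g for w, g in zip(logits, grad)]
        baseline = 0.9 * baseline + 0.1 * reward
    return softmax(logits)


# Hypothetical reward: token 2 scores 1.0, every other token scores 0.0.
final_probs = train(lambda tok: 1.0 if tok == 2 else 0.0)
print(final_probs)
```

The only difference between the two gradients is the scaling by `reward - baseline`: cross-entropy always pushes toward a fixed target token, while the policy-gradient update pushes toward whatever was sampled, in proportion to how much better than the baseline it scored.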

2017
Bor-Chun Chen Yan-Ying Chen Francine Chen

Video summarization and video captioning are considered two separate tasks in existing studies. For longer videos, automatically identifying the important parts of video content and annotating them with captions will enable a richer and more concise condensation of the video. We propose a general neural network configuration that jointly considers two supervisory signals (i.e., an image-based v...

2017
Jingkuan Song Lianli Gao Zhao Guo Wu Liu Dongxiang Zhang Heng Tao Shen

Recent progress has been made in using the attention-based encoder-decoder framework for video captioning. However, most existing decoders apply the attention mechanism to every generated word, including both visual words (e.g., "gun" and "shooting") and non-visual words (e.g., "the", "a"). These non-visual words can be easily predicted using a natural language model without considering visual...
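The idea of attending to visual features only when a word needs them can be illustrated with a small Python sketch. This is an assumption-laden toy, not the model from the abstract above (which is truncated here): it uses plain dot-product soft attention over frame feature vectors, and a hypothetical scalar `visual_gate` in [0, 1] that blends the attended visual context with the decoder's own language state.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, frames):
    """Dot-product soft attention: returns (context vector, attention weights)."""
    scores = [sum(q * f for q, f in zip(query, frame)) for frame in frames]
    weights = softmax(scores)
    dim = len(frames[0])
    context = [
        sum(w * frame[d] for w, frame in zip(weights, frames))
        for d in range(dim)
    ]
    return context, weights


def gated_context(query, frames, visual_gate):
    """Blend visual context and language state.

    visual_gate near 1.0 relies on the video frames (a visual word such as
    "gun"); visual_gate near 0.0 falls back on the language state (a
    non-visual word such as "the"). The gate is a hypothetical input here;
    in a trained model it would be predicted by the decoder.
    """
    context, _ = attend(query, frames)
    return [visual_gate * c + (1.0 - visual_gate) * q
            for c, q in zip(context, query)]


# Two toy frame features and a decoder state that matches the first frame.
frames = [[1.0, 0.0], [0.0, 1.0]]
state = [10.0, 0.0]
context, weights = attend(state, frames)
print(weights)
print(gated_context(state, frames, visual_gate=0.0))
```

With the gate at 0.0 the output is just the language state, so no visual evidence influences the prediction; with the gate at 1.0 the output is the attended frame summary.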

Journal: Proceedings of the AAAI Conference on Artificial Intelligence, 2019

Journal: CoRR, 2016
Jiuxiang Gu Gang Wang Tsuhan Chen

Language models based on recurrent neural networks have dominated recent image caption generation tasks. In this paper, we introduce a language CNN model that is suitable for statistical language modeling tasks and shows competitive performance in image captioning. In contrast to previous models, which predict the next word based on one previous word and the hidden state, our language CNN is fed with a...
