captioning order

Search results for: captioning order

Number of results: 908879

Attention-Aligned Transformer for Image Captioning

Journal: Proceedings of the ... AAAI Conference on Artificial Intelligence 2022

Recently, attention-based image captioning models, which are expected to ground correct regions for proper word generations, have achieved remarkable performance. However, some researchers have argued that existing attention mechanisms suffer from a “deviated focus” problem in determining the effective and influential features. In this paper, we present A2, an attention-aligned Transformer for image captioning, which guides learn...

Efficient Image Captioning for Edge Devices

Journal: Proceedings of the ... AAAI Conference on Artificial Intelligence 2023

Recent years have witnessed the rapid progress of image captioning. However, the demands for large memory storage and heavy computational burden prevent these captioning models from being deployed on mobile devices. The main obstacles lie in the heavyweight visual feature extractors (i.e., object detectors) and complicated cross-modal fusion networks. To this end, we propose LightCap, a lightweight caption...

CapERA: Captioning Events in Aerial Videos

Journal: Remote Sensing 2023

In this paper, we introduce the CapERA dataset, which upgrades the Event Recognition in Aerial Videos (ERA) dataset to aerial video captioning. The newly proposed dataset aims to advance visual–language-understanding tasks for UAV videos by providing each video with diverse textual descriptions. To build the dataset, 2864 videos are manually annotated with a caption that includes information such as the main event, object, place, action, numbe...

Full-Memory Transformer for Image Captioning

Journal: Symmetry 2023

The Transformer-based approach represents the state of the art in image captioning. However, existing studies have shown that the Transformer has a problem: irrelevant tokens with overlapping neighbors incorrectly attend to each other with relatively large attention scores. We believe this limitation is due to the incompleteness of the Self-Attention Network (SAN) and Feed-Forward Network (FFN). To solve this problem, we presen...

Real-Time Captioning for Improving Informed Consent

Journal: Regional Anesthesia and Pain Medicine 2016

Image Captioning using Facial Expression and Attention

Journal: Journal of Artificial Intelligence Research 2020

Non-Autoregressive Coarse-to-Fine Video Captioning

Journal: Proceedings of the ... AAAI Conference on Artificial Intelligence 2021

It is encouraging to see that progress has been made to bridge videos and natural language. However, mainstream video captioning methods suffer from slow inference speed due to the sequential manner of autoregressive decoding, and prefer generating generic descriptions due to the insufficient training of visual words (e.g., nouns and verbs) and an inadequate decoding paradigm. In this paper, we propose a non-autoregressive based ...

Diverse video captioning through latent variable expansion

Journal: Pattern Recognition Letters 2022

• A diverse captioning model of full convolution design is proposed.
• We develop a new evaluation metric to assess the sentence diversity.
• Our method achieves superior performance compared to state-of-the-art benchmarks.

Automatically describing video content with a text description is a challenging but important task, which has been attracting a lot of attention in the computer vision community. Previous works ma...

Object Relation Attention for Image Paragraph Captioning

Journal: Proceedings of the ... AAAI Conference on Artificial Intelligence 2021

Image paragraph captioning aims to automatically generate a paragraph from a given image. It is an extension of image captioning in terms of generating multiple sentences instead of a single one, and it is more challenging because paragraphs are longer, more informative, and more linguistically complicated. Because a paragraph consists of several sentences, an effective method should generate consistent sentences rather than contradictory ones. It is still an open question how to achieve t...

RNIC-A retrospect network for image captioning

Journal: Soft Computing 2022

As a cross-domain research area combining computer vision and natural language processing, current image captioning research mainly considers how to improve visual features; less attention has been paid to utilizing the inherent properties of language to boost captioning performance. Facing this challenge, we proposed a textual attention mechanism, which can obtain semantic relevance between words by scanning all generated words. The retrospect ...
