MAMBA: Multi-level Aggregation via Memory Bank for Video Object Detection

Authors

Abstract

State-of-the-art video object detection methods maintain a memory structure, either a sliding window or a memory queue, to enhance the current frame using attention mechanisms. However, we argue that these memory structures are neither efficient nor sufficient because of two implied operations: (1) concatenating all features in memory for enhancement, leading to a heavy computational cost; (2) frame-wise memory updating, preventing the memory from capturing more temporal information. In this paper, we propose a multi-level aggregation architecture via memory bank called MAMBA. Specifically, our memory bank employs two novel operations to eliminate the disadvantages of existing methods: (1) light-weight key-set construction, which can significantly reduce the computational cost; (2) a fine-grained feature-wise updating strategy, which enables our method to utilize knowledge from the whole video. To better enhance features from complementary levels, i.e., feature maps and proposals, we further propose a generalized enhancement operation (GEO) to aggregate multi-level features in a unified manner. We conduct extensive evaluations on the challenging ImageNetVID dataset. Compared with existing state-of-the-art methods, our method achieves superior performance in terms of both speed and accuracy. More remarkably, MAMBA achieves mAP of 83.7%/84.6% at 12.6/9.1 FPS with ResNet-101.
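The two memory operations named in the abstract can be illustrated with a toy sketch. The abstract does not specify how the key set is chosen or which memory entries are overwritten, so the uniform random sampling and random slot replacement below are assumptions, and the names `MemoryBank`, `key_set`, and `enhance` are hypothetical. This is not the authors' implementation, only a minimal pure-Python picture of attending over a small subset of memory and updating memory one feature at a time.

```python
import math
import random


class MemoryBank:
    """Toy fixed-capacity feature memory (hypothetical, not the authors' code)."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.features = []  # each entry is one feature vector (list of floats)
        self.rng = random.Random(seed)

    def update(self, new_features):
        """Feature-wise update: slots are overwritten one feature at a time,
        so older frames can stay partially represented in memory."""
        for feat in new_features:
            if len(self.features) < self.capacity:
                self.features.append(feat)
            else:
                # Assumption: overwrite a uniformly random slot.
                self.features[self.rng.randrange(self.capacity)] = feat

    def key_set(self, size):
        """Light-weight key set: a random subset instead of the whole memory."""
        k = min(size, len(self.features))
        return self.rng.sample(self.features, k)


def enhance(query, keys):
    """Residual scaled dot-product attention of one query over the key set."""
    if not keys:
        return list(query)
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    top = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    attended = [
        sum(w * key[d] for w, key in zip(weights, keys)) / total
        for d in range(len(query))
    ]
    return [q + a for q, a in zip(query, attended)]
```

In this sketch the attention cost per query scales with the key-set size rather than with the full memory capacity, which is the kind of saving the abstract attributes to key-set construction.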

Similar resources

Spatial-Temporal Memory Networks for Video Object Detection

We introduce Spatial-Temporal Memory Networks (STMN) for video object detection. At its core, we propose a novel Spatial-Temporal Memory module (STMM) as the recurrent computation unit to model long-term temporal appearance and motion dynamics. The STMM’s design enables the integration of ImageNet pre-trained backbone CNN weights for both the feature stack as well as the prediction head, which ...

Feature-Level based Video Fusion for Object Detection

Fusion of three-dimensional data from multiple sensors gained momentum, especially in applications pertaining to surveillance, when promising results were obtained in moving object detection. Several approaches to video fusion of visual and infrared data have been proposed in recent literature. They mainly comprise pixel-based methodologies. Surveillance is a major application of video fusio...

Multi-Model Estimation Based Moving Object Detection for Aerial Video

With the wide development of UAV (Unmanned Aerial Vehicle) technology, moving target detection for aerial video has become a popular research topic in the computer field. Most of the existing methods are under the registration-detection framework and can only deal with simple background scenes. They tend to go wrong in the complex multi background scenarios, such as viaducts, buildings and tree...

Adaptive Multi-Level Region Merging for Salient Object Detection

Salient object detection is a long-standing problem in computer vision and plays a critical role in understanding the mechanism of human visual attention. In applications that require object-level prior (e.g. image retargeting), it is desirable that saliency detection highlights holistic objects. Lately over-segmentation techniques such as SLIC superpixel [6], Meanshift [1], and graph-based [3]...

Video Transmission Using Multi-Level Content Aware Compression Based on Object Detection

The computational power of network nodes whether it is a mobile device or a miniaturized computer increases year by year, however the capacity of wireless channels in 4G networks is still falling behind for real-time video transmission. The paper proposes utilizing the computational capacities of mobile devices to detect objects of interest and apply less compression to corresponding regions, w...

Journal

Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence

Year: 2021

ISSN: 2159-5399, 2374-3468

DOI: https://doi.org/10.1609/aaai.v35i3.16365