SlowFast Multimodality Compensation Fusion Swin Transformer Networks for RGB-D Action Recognition

Abstract

RGB-D-based technology combines the advantages of RGB and depth sequences and can effectively recognize human actions in different environments. However, it is difficult for the modalities to learn spatio-temporal information from each other. To enhance the exchange of information between modalities, we introduce a SlowFast multimodality compensation block (SFMCB) designed to extract features. Concretely, the SFMCB fuses features from two independent pathways with different frame rates into a single convolutional neural network to achieve performance gains for the model. Furthermore, we explore fusion schemes to combine the features from the pathways with different frame rates. To facilitate the learning of features from multiple pathways, multiple loss functions are utilized for joint optimization. To evaluate the effectiveness of our proposed architecture, we conducted experiments on four challenging datasets: NTU RGB+D 60, NTU RGB+D 120, THU-READ, and PKU-MMD. Experimental results demonstrate the effectiveness of the model, which utilizes this mechanism to capture complementary information from multimodal inputs.
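
The idea of two pathways running at different frame rates can be illustrated with a minimal sketch. It is not the paper's SFMCB: the strides (8 and 2), the 32-frame clip, the average-pooling alignment, and all function names are assumptions chosen for illustration, and a single number per frame stands in for a real feature map.

```python
from typing import List, Sequence


def sample_indices(num_frames: int, stride: int) -> List[int]:
    """Return every `stride`-th frame index, starting at 0."""
    if stride < 1:
        raise ValueError("stride must be >= 1")
    return list(range(0, num_frames, stride))


def average_pool(values: Sequence[float], window: int) -> List[float]:
    """Average consecutive, non-overlapping windows of `window` values."""
    pooled = []
    for start in range(0, len(values), window):
        chunk = values[start:start + window]
        pooled.append(sum(chunk) / len(chunk))
    return pooled


def fuse_pathways(slow: Sequence[float], fast: Sequence[float]) -> List[List[float]]:
    """Align the fast pathway to the slow one in time, then pair the features.

    Pairing per time step is a stand-in for channel-wise concatenation
    (an assumed fusion scheme, not necessarily the one used in the paper).
    """
    if len(slow) == 0 or len(fast) % len(slow) != 0:
        raise ValueError("fast length must be a positive multiple of slow length")
    ratio = len(fast) // len(slow)
    fast_aligned = average_pool(fast, ratio)
    return [[s, f] for s, f in zip(slow, fast_aligned)]


# Hypothetical 32-frame clip with one scalar "feature" per frame.
clip = [float(t) for t in range(32)]
slow_idx = sample_indices(len(clip), stride=8)  # low frame rate pathway
fast_idx = sample_indices(len(clip), stride=2)  # high frame rate pathway
slow_feat = [clip[i] for i in slow_idx]
fast_feat = [clip[i] for i in fast_idx]
fused = fuse_pathways(slow_feat, fast_feat)
```

Here the slow pathway sees 4 frames and the fast pathway 16, so the fast features are pooled by a factor of 4 before each time step is paired with its slow counterpart.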

Similar resources

Bilinear Heterogeneous Information Machine for RGB-D Action Recognition

This paper proposes a novel approach to action recognition from RGB-D cameras, in which depth features and RGB visual features are jointly used. Rich heterogeneous RGB and depth data are effectively compressed and projected to a learned shared space, in order to reduce noise and capture useful information for r...

Beyond Action Recognition: Action Completion in RGB-D Data

An action is completed when its goal has been successfully achieved. Using current state-of-the-art depth features, designed primarily for action recognition, an incomplete sequence may still be classified as its complete counterpart due to the overlap in evidence. In this work we show that while features can perform comparably for action recognition, they vary in their ability to recognise inc...

Viewpoint Invariant Action Recognition using RGB-D Videos

In video-based action recognition, viewpoint variations often pose major challenges because the same actions can appear different from different views. We use the complementary RGB and depth information from RGB-D cameras to address this problem. The proposed technique capitalizes on the spatiotemporal information available in the two data streams to extract action features that are lar...

RGB-D-based action recognition datasets: A survey

Human action recognition from RGB-D (Red, Green, Blue and Depth) data has attracted increasing attention since the first work reported in 2010. Over this period, many benchmark datasets have been created to facilitate the development and evaluation of new algorithms. This raises the question of which dataset to select and how to use it in providing a fair and objective comparative evaluation ag...

RGB-D Object Recognition Using Deep Convolutional Neural Networks

We address the problem of object recognition from RGB-D images using deep convolutional neural networks (CNNs). We advocate the use of 3D CNNs to fully exploit the 3D spatial information in depth images as well as the use of pretrained 2D CNNs to learn features from RGB-D images. There is currently no large-scale dataset available comprising depth information as compared to those for RGB da...

Journal

Journal title: Mathematics

Year: 2023

ISSN: 2227-7390

DOI: https://doi.org/10.3390/math11092115