Fusing Geometric Features for Skeleton-Based Action Recognition using Multilayer LSTM Networks
Abstract
Recent skeleton-based action recognition approaches have improved substantially by using RNN models. Currently, these approaches build an end-to-end network from joint coordinates to class categories and improve accuracy by extending RNNs to the spatial domain. First, while such well-designed models and optimization strategies explore relations between different body parts directly from joint coordinates, we provide a simple, universal spatial modeling method that is orthogonal to RNN model enhancement. Specifically, drawing on how previous work has evolved, we select a set of simple geometric features and then separately feed each type of feature to a 3-layer LSTM framework. Second, we propose a multi-stream LSTM architecture with a new smoothed score fusion technique to learn classification from the different geometric feature streams. Furthermore, we observe that the geometric relational features based on distances between joints and selected lines outperform the other features, and that the fusion results achieve state-of-the-art performance on four datasets. We also show the sparsity of the input gate weights in the first LSTM layer trained on geometric features, and demonstrate that using joint-line distances as input requires less training data.
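The joint-line distance feature mentioned in the abstract can be illustrated with a short sketch. This is not the authors' implementation: it only applies the standard point-to-line distance formula in 3D, and the function names, the array layout `(frames, joints, 3)`, and the `triples` index list are assumptions made for illustration. Which joints and lines the paper actually selects is not stated here.

```python
import numpy as np


def joint_line_distance(joint, line_start, line_end, eps=1e-8):
    """Distance from a 3D point to the infinite line through two other points.

    Uses |(p - a) x (b - a)| / |b - a|. All inputs have a trailing axis of size 3.
    `eps` (an assumption) guards against division by zero for degenerate lines.
    """
    direction = line_end - line_start
    offset = joint - line_start
    cross = np.cross(offset, direction)
    return np.linalg.norm(cross, axis=-1) / (np.linalg.norm(direction, axis=-1) + eps)


def joint_line_features(frames, triples):
    """Per-frame joint-line distances.

    frames:  array of shape (T, J, 3) with 3D joint coordinates (assumed layout).
    triples: list of (joint, line_start, line_end) joint indices (hypothetical).
    Returns an array of shape (T, len(triples)).
    """
    j, a, b = np.asarray(triples).T
    return joint_line_distance(frames[:, j], frames[:, a], frames[:, b])
```

Each row of the returned array would then serve as one time step of input to a recurrent model; the network and the fusion step are left out because the abstract does not give their details.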
Similar resources
Co-Occurrence Feature Learning for Skeleton Based Action Recognition Using Regularized Deep LSTM Networks
Skeleton-based action recognition distinguishes human actions using the trajectories of skeleton joints, which provide a very good representation for describing actions. Considering that recurrent neural networks (RNNs) with Long Short-Term Memory (LSTM) can learn feature representations and model long-term temporal dependencies automatically, we propose an end-to-end fully connected deep LSTM n...
A Fusion of Appearance based CNNs and Temporal evolution of Skeleton with LSTM for Daily Living Action Recognition
In this paper, we propose an efficient method that combines skeleton information and appearance features for daily-living action recognition. Many RGB methods focus only on short-term temporal information obtained from optical flow. Skeleton-based methods, on the other hand, show that modeling long-term skeleton evolution improves action recognition accuracy. In this paper we propose to fuse skelet...
Skeleton-Based Action Recognition Using Spatio-Temporal LSTM Network with Trust Gates
Skeleton-based human action recognition has attracted a lot of research attention during the past few years. Recent works attempted to utilize recurrent neural networks to model the temporal dependencies between the 3D positional configurations of human body joints for better analysis of human activities in the skeletal data. The proposed work extends this idea to the spatial domain as well as temp...
Aircraft Visual Identification by Neural Networks
In the present paper, an efficient method for three-dimensional aircraft pattern recognition is introduced. In this method, a set of simple area-based features extracted from the silhouettes of aerial vehicles is used to recognize an aircraft type from its optical or infrared images taken by a CCD camera or a FLIR sensor. These images can be taken from any direction and distance relative to the fly...
Action Classification and Highlighting in Videos
Inspired by recent advances in neural machine translation, which jointly align and translate using encoder-decoder networks equipped with attention, we propose an attention-based LSTM model for human activity recognition. Our model jointly learns to classify actions and highlight frames associated with the action, by attending to salient visual information through a jointly learned soft-attention...