DenseGCN: A multi-level and multi-temporal graph convolutional network for action recognition

Published in IET Image Processing, 2023

Abstract

This paper presents DenseGCN, a novel multi-level and multi-temporal graph convolutional network designed for skeleton-based action recognition. The proposed method addresses the limitations of existing graph convolutional networks in capturing long-range spatial and temporal dependencies in human motion sequences.

Key Contributions

Multi-level Spatial Modeling: DenseGCN incorporates multiple levels of spatial graph convolutions to capture both local joint relationships and global body structure dependencies.
Multi-temporal Feature Learning: The network employs temporal convolutions at different time scales to model both short-term motion patterns and long-term action dynamics.
Dense Connectivity: Inspired by DenseNet, the method uses dense connections between different temporal scales to enhance feature reuse and gradient flow.
Comprehensive Evaluation: Extensive experiments on NTU RGB+D and NTU RGB+D 120 datasets demonstrate the effectiveness of the proposed approach.
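
The combination of multi-temporal feature learning and dense connectivity described above can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the function names (temporal_conv, dense_temporal_block), the use of dilation to obtain different time scales, the growth width G, and all tensor sizes are assumptions chosen for illustration. Each temporal branch receives the concatenation of the input and every earlier branch output, which is the DenseNet-style reuse pattern.

```python
import numpy as np


def temporal_conv(x, w, dilation=1):
    """1D convolution along the frame axis with 'same' zero padding.

    x: (T, V, C_in) skeleton features (frames, joints, channels).
    w: (K, C_in, C_out) kernel with odd K. Random weights stand in
       for learned parameters in this sketch.
    """
    k = w.shape[0]
    pad = dilation * (k - 1) // 2
    xp = np.pad(x, ((pad, pad), (0, 0), (0, 0)))
    t = x.shape[0]
    out = np.zeros((t, x.shape[1], w.shape[2]))
    for i in range(k):
        # Tap i reads the frames shifted by i * dilation.
        out += xp[i * dilation : i * dilation + t] @ w[i]
    return out


def dense_temporal_block(x, kernels, dilations):
    """Stack temporal branches with dense (concatenative) connections.

    Assumption: one branch per dilation rate; the paper may build its
    time scales differently.
    """
    features = [x]
    for w, d in zip(kernels, dilations):
        # Every branch sees the input plus all earlier branch outputs.
        inp = np.concatenate(features, axis=-1)
        features.append(np.maximum(temporal_conv(inp, w, d), 0.0))  # ReLU
    return np.concatenate(features, axis=-1)


rng = np.random.default_rng(0)
T, V, C, G = 16, 25, 8, 4          # frames, joints, input channels, growth width
dilations = [1, 2, 4]              # short-, mid- and long-range time scales
kernels = [rng.normal(size=(3, C + i * G, G)) * 0.1 for i in range(len(dilations))]
x = rng.normal(size=(T, V, C))
y = dense_temporal_block(x, kernels, dilations)
print(y.shape)  # (16, 25, 20)
```

Because features are concatenated rather than summed, the input channels appear unchanged at the front of the output, and each branch adds G new channels. That is how dense connections give later layers direct access to earlier features and shorten gradient paths.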

Technical Innovation

The DenseGCN architecture combines:

Spatial Graph Convolution Modules for modeling joint relationships
Multi-scale Temporal Convolutions for capturing motion dynamics
Dense Skip Connections for improved feature propagation
Adaptive Graph Topology Learning for flexible joint relationship modeling
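
As a rough illustration of the spatial graph convolution and adaptive topology components listed above, the sketch below uses one common formulation: a fixed, symmetrically normalized skeleton adjacency plus a learnable offset matrix. Whether DenseGCN uses exactly this form is an assumption. The 5-joint chain, the names normalize_adjacency and spatial_graph_conv, and the offset matrix b are hypothetical and serve only as an example.

```python
import numpy as np


def normalize_adjacency(a):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2 with self-loops."""
    a_hat = a + np.eye(a.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def spatial_graph_conv(x, a_norm, b, w):
    """Graph convolution over joints, applied independently to each frame.

    x: (T, V, C_in) features; a_norm: (V, V) fixed skeleton graph;
    b: (V, V) offset that would be learned (assumed form of the adaptive
    topology); w: (C_in, C_out) channel-mixing weights.
    """
    topology = a_norm + b
    # Aggregate features from connected joints, then mix channels.
    return np.einsum("uv,tvc->tuc", topology, x) @ w


# Hypothetical 5-joint chain, not a real skeleton layout.
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
V = 5
a = np.zeros((V, V))
for i, j in edges:
    a[i, j] = a[j, i] = 1.0
a_norm = normalize_adjacency(a)

rng = np.random.default_rng(0)
x = rng.normal(size=(8, V, 3))
w = rng.normal(size=(3, 6))
b = np.zeros((V, V))   # zero offset: the layer starts from the fixed graph
y = spatial_graph_conv(x, a_norm, b, w)
print(y.shape)  # (8, 5, 6)
```

A nonzero entry such as b[0, 4] lets joint 0 aggregate features from joint 4 even though the fixed graph has no edge between them. This is the sense in which a learned topology can capture relationships between joints that are far apart on the skeleton.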

Experimental Results

The proposed DenseGCN achieves competitive performance on standard benchmarks:

NTU RGB+D: Significant improvements in both cross-subject and cross-view evaluations
NTU RGB+D 120: State-of-the-art results on the large-scale dataset
Computational Efficiency: Maintains reasonable computational complexity while improving accuracy

Impact and Applications

This work contributes to the field of skeleton-based action recognition by providing a more effective way to model complex human motions. The multi-level and multi-temporal approach has potential applications in:

Human-computer interaction
Video surveillance
Sports analysis
Healthcare monitoring

Recommended citation: Yu, C., et al. (2023). "DenseGCN: A multi-level and multi-temporal graph convolutional network for action recognition." IET Image Processing. 17(11), 3299-3312.

Chengzhang Yu