Open Access
ARTICLE
HgaNets: Fusion of Visual Data and Skeletal Heatmap for Human Gesture Action Recognition
Wuyan Liang1, Xiaolong Xu2,*
1 School of Internet of Things, Nanjing University of Posts and Telecommunications, Nanjing, 210023, China
2 School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing, 210023, China
* Corresponding Author: Xiaolong Xu. Email:
(This article belongs to the Special Issue: Machine Vision Detection and Intelligent Recognition)
Computers, Materials & Continua 2024, 79(1), 1089-1103. https://doi.org/10.32604/cmc.2024.047861
Received 20 November 2023; Accepted 04 March 2024; Issue published 25 April 2024
Abstract
Recognizing human gesture actions is challenging because of the complex patterns in both visual and skeletal features. Existing gesture action recognition (GAR) methods typically analyze visual and skeletal data and fail to meet the demands of varied scenarios. Furthermore, multi-modal approaches lack the versatility to efficiently process both uniform and disparate input patterns. This paper therefore proposes HgaNets, an attention-enhanced pseudo-3D residual model for the GAR problem. The model comprises two independent components that model visual RGB (red, green and blue) images and 3D skeletal heatmaps, respectively. Each component consists of two main parts: 1) a multi-dimensional attention module that captures important spatial, temporal and feature information in human gestures; 2) a spatiotemporal convolution module that uses pseudo-3D residual convolution to characterize the spatiotemporal features of gestures. The output weights of the two components are then fused to generate the recognition results. Finally, we conducted experiments on four datasets to assess the proposed model. The accuracy on the four datasets reaches 85.40%, 91.91%, 94.70%, and 95.30%, respectively, with an inference time of 0.54 s and 2.74M parameters. These findings indicate that the proposed model outperforms existing approaches in terms of recognition accuracy.
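As a rough illustration of two ideas named in the abstract, the sketch below compares the weight count of a full 3D convolution with a pseudo-3D factorization (a 1 x k x k spatial convolution followed by a k x 1 x 1 temporal one) and then fuses the class scores of two streams. The function names, the channel width of 64, and the fusion rule (a weighted average of softmax probabilities) are assumptions made for illustration; the abstract does not specify how HgaNets fuses its two components.

```python
import math


def conv3d_params(c_in, c_out, kt, kh, kw):
    """Weight count of a bias-free 3D convolution with a kt x kh x kw kernel."""
    return c_in * c_out * kt * kh * kw


def pseudo3d_params(c_in, c_out, k=3):
    """Weight count when a k x k x k kernel is factorized into a
    1 x k x k spatial convolution followed by a k x 1 x 1 temporal one."""
    spatial = conv3d_params(c_in, c_out, 1, k, k)
    temporal = conv3d_params(c_out, c_out, k, 1, 1)
    return spatial + temporal


def softmax(scores):
    """Numerically stable softmax over a list of raw class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_streams(rgb_scores, heatmap_scores, w_rgb=0.5):
    """Assumed fusion rule: weighted average of per-class probabilities
    from the RGB stream and the skeletal-heatmap stream."""
    p_rgb = softmax(rgb_scores)
    p_heat = softmax(heatmap_scores)
    fused = [w_rgb * a + (1.0 - w_rgb) * b for a, b in zip(p_rgb, p_heat)]
    label = max(range(len(fused)), key=fused.__getitem__)
    return fused, label


if __name__ == "__main__":
    # Hypothetical layer with 64 input and 64 output channels.
    print(conv3d_params(64, 64, 3, 3, 3))  # 110592
    print(pseudo3d_params(64, 64))         # 49152

    # Hypothetical raw scores for three gesture classes from each stream.
    fused, label = fuse_streams([2.0, 1.0, 0.1], [0.5, 2.5, 0.3])
    print(fused, label)
```

Under these assumptions the factorized layer needs fewer than half the weights of the full 3D kernel, which is the usual motivation for pseudo-3D designs. The fusion weight `w_rgb` is a free parameter here and could instead be learned or tuned per dataset.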
Keywords