Open Access



Multi-Layered Deep Learning Features Fusion for Human Action Recognition

Sadia Kiran1, Muhammad Attique Khan1, Muhammad Younus Javed1, Majed Alhaisoni2, Usman Tariq3, Yunyoung Nam4,*, Robertas Damaševičius5, Muhammad Sharif6

1 Department of Computer Science, HITEC University Taxila, Taxila, Pakistan
2 College of Computer Science and Engineering, University of Ha’il, Ha’il, Saudi Arabia
3 College of Computer Engineering and Sciences, Prince Sattam Bin Abdulaziz University, Al-Khraj, Saudi Arabia
4 Department of Computer Science and Engineering, Soonchunhyang University, Asan, Korea
5 Faculty of Applied Mathematics, Silesian University of Technology, Gliwice, Poland
6 Department of Computer Science, COMSATS University Islamabad, Wah Campus, Pakistan

* Corresponding Author: Yunyoung Nam. Email: email

(This article belongs to the Special Issue: Recent Advances in Deep Learning, Information Fusion, and Features Selection for Video Surveillance Application)

Computers, Materials & Continua 2021, 69(3), 4061-4075.


Human Action Recognition (HAR) has been an active research topic in machine learning for the last few decades. Visual surveillance, robotics, and pedestrian detection are its main applications. Computer vision researchers have introduced many HAR techniques, but these still face challenges such as redundant features and high computational cost. In this article, we propose a new deep learning-based method for HAR. In the proposed method, video frames are first pre-processed using a global contrast approach and then used to train a deep learning model through domain transfer learning. The pre-trained ResNet-50 model is used as the deep learning model in this work. Features are extracted from two layers: Global Average Pool (GAP) and Fully Connected (FC). The features of both layers are fused using Canonical Correlation Analysis (CCA). Features are then selected using a Shannon entropy-based threshold function. The selected features are finally passed to multiple classifiers for classification. Experiments are conducted on five publicly available datasets: IXMAS, UCF Sports, YouTube, UT-Interaction, and KTH. The accuracies achieved on these datasets were 89.6%, 99.7%, 100%, 96.7%, and 96.6%, respectively. Comparison with existing techniques shows that the proposed method provides improved accuracy for HAR. The proposed method is also computationally fast, as measured by execution time.
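The entropy-based selection step can be illustrated with a minimal sketch. The abstract does not state the entropy estimator or the threshold, so this sketch assumes a histogram estimate of Shannon entropy and a mean-entropy threshold. The function names (`shannon_entropy`, `select_features`), the `bins` parameter, and the toy matrix are hypothetical. The sketch also assumes the fused feature matrix is already available, with one row per sample and one column per feature.

```python
import math
from collections import Counter


def shannon_entropy(values, bins=8):
    """Histogram-based Shannon entropy (in bits) of one feature column.

    Assumption: equal-width binning; the paper's estimator is not given
    in the abstract.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0  # a constant feature carries no information
    width = (hi - lo) / bins
    # Map each value to a bin index; clamp the maximum into the last bin.
    counts = Counter(min(int((v - lo) / width), bins - 1) for v in values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def select_features(samples, bins=8):
    """Keep feature columns whose entropy is at least the mean entropy.

    Assumption: the mean-entropy threshold is an illustrative choice,
    not necessarily the threshold function used in the paper.
    """
    columns = list(zip(*samples))  # transpose rows into feature columns
    entropies = [shannon_entropy(col, bins) for col in columns]
    threshold = sum(entropies) / len(entropies)
    kept = [i for i, h in enumerate(entropies) if h >= threshold]
    return kept, entropies, threshold


# Toy fused feature matrix: 4 samples, 3 features (hypothetical values).
fused = [
    [0.0, 1.0, 5.0],
    [1.0, 1.0, 5.0],
    [2.0, 1.0, 5.0],
    [3.0, 1.0, 6.0],
]
kept, entropies, threshold = select_features(fused, bins=4)
print(kept)  # indices of the retained feature columns
```

In this toy matrix the second column is constant, so its entropy is zero and it is discarded as redundant. Only columns whose value distribution is spread out enough to reach the threshold are passed on to the classifiers.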


Cite This Article

APA Style
Kiran, S., Khan, M.A., Javed, M.Y., Alhaisoni, M., Tariq, U. et al. (2021). Multi-layered deep learning features fusion for human action recognition. Computers, Materials & Continua, 69(3), 4061-4075.
Vancouver Style
Kiran S, Khan MA, Javed MY, Alhaisoni M, Tariq U, Nam Y, et al. Multi-layered deep learning features fusion for human action recognition. Comput Mater Contin. 2021;69(3):4061-4075.
IEEE Style
S. Kiran et al., "Multi-Layered Deep Learning Features Fusion for Human Action Recognition," Comput. Mater. Contin., vol. 69, no. 3, pp. 4061-4075, 2021.


This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.