Open Access

ARTICLE

Group Activity Recognition in Crowded Scenes Using Multi-Stage Feature Optimization and ST-GCN-LSTM Networks

Mohammed Alnusayri1, Tingting Xue2,3, Saleha Kamal4, Nouf Abdullah Almujally5, Khaled Alnowaiser6, Ahmad Jalal4,7,*, Hui Liu3,8,9,*

1 Department of Computer Science, College of Computer and Information Sciences, Jouf University, Sakaka, Saudi Arabia
2 School of Environmental Science & Engineering, Nanjing University of Information Science and Technology, Nanjing, China
3 Cognitive Systems Lab, University of Bremen, Bremen, Germany
4 Department of Computer Science, Air University, Islamabad, Pakistan
5 Department of Information Systems, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia
6 Department of Computer Science, College of Computer Engineering and Sciences, Prince Sattam Bin Abdulaziz University, Al-Kharj, Saudi Arabia
7 Department of Computer Science and Engineering, College of Informatics, Korea University, Seoul, Republic of Korea
8 Jiangsu Key Laboratory of Intelligent Medical Image Computing, School of Future Technology, Nanjing University of Information Science and Technology, Nanjing, China
9 Guodian Nanjing Automation Co., Ltd., Nanjing, China

* Corresponding Authors: Ahmad Jalal, Hui Liu

(This article belongs to the Special Issue: Deep Learning: Emerging Trends, Applications and Research Challenges for Image Recognition)

Computers, Materials & Continua 2026, 88(1), 58 https://doi.org/10.32604/cmc.2026.074115

Abstract

Group activity recognition in public environments is challenging due to dynamic formations, complex inter-person interactions, and frequent occlusions. Existing methods often emphasize individual actions and overlook collective behavioral patterns. This work introduces a multi-modal framework that integrates silhouette-based appearance and skeleton-based pose information for robust recognition in surveillance scenarios. You Only Look Once v11 (YOLOv11) detects persons, Segmenting Objects by LOcations version 2 (SOLOv2) segments instances, and AlphaPose extracts skeletons; hierarchical grouping then forms spatially coherent clusters. A hybrid feature extraction strategy combines handcrafted descriptors, namely Extended GIST (ExGIST), Distance Transform, Binary Robust Independent Elementary Features (BRIEF), and Ridge, with deep representations, and the two are fused via multi-head attention. The fused features are refined through a three-stage selection pipeline of Kernel Principal Component Analysis (K-PCA), mutual information ranking, and genetic algorithm-based optimization. A Spatio-Temporal Graph Convolutional Network (ST-GCN) models spatio-temporal dependencies, while a Long Short-Term Memory (LSTM) network captures long-term dynamics for activity classification. On the Collective Activity Dataset (CAD), the framework achieves 96.80% accuracy, surpassing state-of-the-art approaches. Its modular design supports scalability and adaptability for intelligent surveillance and smart city applications.
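To make the grouping step concrete, the following is a minimal sketch of hierarchical (agglomerative) grouping of detected persons into spatially coherent clusters. The abstract does not specify the linkage criterion, the distance measure, or the stopping rule, so single linkage over bounding-box centroids with a pixel threshold is an assumption made purely for illustration; the function name `group_people` and the parameter `max_gap` are hypothetical and do not come from the paper.

```python
from math import hypot


def group_people(centroids, max_gap):
    """Group person centroids by single-linkage agglomerative clustering.

    Assumption (not from the paper): two clusters are merged while the
    distance between their closest members is at most `max_gap` pixels.
    Returns a sorted list of clusters, each a sorted list of person indices.
    """
    # Start with every detected person in a cluster of their own.
    clusters = [[i] for i in range(len(centroids))]

    def linkage(a, b):
        # Single linkage: distance between the closest members of a and b.
        return min(
            hypot(centroids[i][0] - centroids[j][0],
                  centroids[i][1] - centroids[j][1])
            for i in a
            for j in b
        )

    while len(clusters) > 1:
        # Find the closest pair of clusters (ties broken by lowest indices).
        dist, p, q = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if dist > max_gap:
            break  # Remaining clusters are too far apart to merge.
        clusters[p] = sorted(clusters[p] + clusters[q])
        del clusters[q]

    return sorted(clusters)


# Hypothetical centroids (x, y) in pixels for six detected persons.
people = [(0, 0), (1, 0), (2, 0), (10, 10), (11, 10), (30, 30)]
groups = group_people(people, max_gap=1.5)
print(groups)  # [[0, 1, 2], [3, 4], [5]]
```

In this sketch, persons 0 to 2 form a chain whose neighbours are one pixel apart, so single linkage merges them into one group even though the end points are two pixels apart; a complete-linkage rule would split them under the same threshold. Which behaviour suits crowded scenes depends on the formation, and the paper's actual criterion may differ.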

Keywords

BRIEF features; YOLOv11; ST-GCN; group activity recognition; LSTM

Cite This Article

APA Style
Alnusayri, M., Xue, T., Kamal, S., Almujally, N.A., Alnowaiser, K. et al. (2026). Group Activity Recognition in Crowded Scenes Using Multi-Stage Feature Optimization and ST-GCN-LSTM Networks. Computers, Materials & Continua, 88(1), 58. https://doi.org/10.32604/cmc.2026.074115
Vancouver Style
Alnusayri M, Xue T, Kamal S, Almujally NA, Alnowaiser K, Jalal A, et al. Group Activity Recognition in Crowded Scenes Using Multi-Stage Feature Optimization and ST-GCN-LSTM Networks. Comput Mater Contin. 2026;88(1):58. https://doi.org/10.32604/cmc.2026.074115
IEEE Style
M. Alnusayri et al., “Group Activity Recognition in Crowded Scenes Using Multi-Stage Feature Optimization and ST-GCN-LSTM Networks,” Comput. Mater. Contin., vol. 88, no. 1, pp. 58, 2026. https://doi.org/10.32604/cmc.2026.074115



Copyright © 2026 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.