Open Access
ARTICLE
GaitMAFF: Adaptive Multi-Modal Fusion of Skeleton Maps and Silhouettes for Robust Gait Recognition in Complex Scenarios
1 College of Computer Science, Chongqing University, Chongqing, 400044, China
2 School of Traffic and Transportation, Chongqing Jiaotong University, Chongqing, 400074, China
3 Department of Engineering Sciences and Applied Mathematics, Northwestern University, Evanston, IL 60208, USA
4 School of Civil Engineering, Chongqing Jiaotong University, Chongqing, 400074, China
5 National & Local Joint Engineering Research Center of Transportation Civil Engineering Materials, Chongqing Jiaotong University, Chongqing, 400074, China
* Corresponding Author: Yanqiu Bi. Email:
Computers, Materials & Continua 2026, 87(2), 22 https://doi.org/10.32604/cmc.2025.075704
Received 06 November 2025; Accepted 15 December 2025; Issue published 12 March 2026
Abstract
Gait recognition is a key biometric for long-distance identification, yet its performance is severely degraded by real-world challenges such as varying clothing, carrying conditions, and changing viewpoints. While combining silhouette and skeleton data is a promising direction, effectively fusing these heterogeneous modalities and adaptively weighting their contributions in response to diverse conditions remains a central problem. This paper introduces GaitMAFF, a novel Multi-modal Adaptive Feature Fusion Network, to address this challenge. Our approach first transforms discrete skeleton joints into a dense Skeleton Map representation to align with silhouettes, then employs an attention-based module to dynamically learn the fusion weights between the two modalities. The fused features are processed by a powerful spatio-temporal backbone with Weighted Global-Local Feature Fusion Modules (WFFM) to learn a discriminative representation. Extensive experiments on the challenging CCPG and Gait3D datasets show that GaitMAFF achieves state-of-the-art performance, with an average Rank-1 accuracy of 84.6% on CCPG and 58.7% on Gait3D. These results demonstrate that our adaptive fusion strategy effectively integrates complementary multi-modal information, significantly enhancing gait recognition robustness and accuracy in complex scenes and providing a practical solution for real-world applications.
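The full fusion module is specified in the paper itself; as a rough, hedged illustration of the general idea described in the abstract (not the authors' implementation), an attention-gated fusion might pool each modality's feature map into a descriptor, project the concatenated descriptors to one logit per modality, and blend the maps with the resulting softmax weights. All names, shapes, and the gating design below are assumptions for illustration only.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def adaptive_fuse(sil_feat, skel_feat, w_gate):
    """Sketch of attention-style adaptive fusion of two modalities.

    sil_feat, skel_feat: (C, H, W) feature maps from the silhouette and
        skeleton-map branches, assumed already spatially aligned
        (the paper's Skeleton Map representation enables this alignment).
    w_gate: (2, 2*C) learned projection, one fusion logit per modality
        (hypothetical parameterization, not taken from the paper).
    """
    # Global average pooling gives a compact descriptor per modality.
    desc = np.concatenate([sil_feat.mean(axis=(1, 2)),
                           skel_feat.mean(axis=(1, 2))])   # shape (2*C,)
    logits = w_gate @ desc            # shape (2,): one logit per modality
    alpha = softmax(logits)           # adaptive fusion weights, sum to 1
    # Weighted sum keeps the fused map the same shape as each input.
    fused = alpha[0] * sil_feat + alpha[1] * skel_feat
    return fused, alpha
```

Because the weights are computed from the inputs themselves, the gate can lean on the silhouette branch when appearance is informative and shift toward the skeleton branch under clothing or carrying changes, which is the adaptivity the abstract describes.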
Copyright © 2026 The Author(s). Published by Tech Science Press. This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.