Open Access
ARTICLE
HAMOT: A Hierarchical Adaptive Framework for Robust Multi-Object Tracking in Complex Environments
1 School of Software Engineering, Northwestern Polytechnical University, Xi’an, 710000, China
2 School of Computer Science, Northwestern Polytechnical University, Xi’an, 710000, China
3 Ningbo Institute of Northwestern Polytechnical University, Beilun, Ningbo, 315800, China
4 School of Electronic and Communication Engineering, Quanzhou University of Information Engineering, Quanzhou, 362000, China
5 Department of Electrical Engineering, College of Engineering, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh, 11671, Saudi Arabia
6 Centre for Smart Systems and Automation, CoE for Robotics and Sensing Technologies, Faculty of Artificial Intelligence and Engineering, Multimedia University, Persiaran Multimedia, Cyberjaya, 63100, Selangor, Malaysia
* Corresponding Authors: Peng Zhang; Teong Chee Chuah
(This article belongs to the Special Issue: Advanced Image Segmentation and Object Detection: Innovations, Challenges, and Applications)
Computer Modeling in Engineering & Sciences 2025, 145(1), 947-969. https://doi.org/10.32604/cmes.2025.069956
Received 04 July 2025; Accepted 18 August 2025; Issue published 30 October 2025
Abstract
Multiple Object Tracking (MOT) is essential for applications such as autonomous driving, surveillance, and analytics. However, challenges such as occlusion, low-resolution imaging, and identity switches remain persistent. We propose HAMOT, a hierarchical adaptive multi-object tracker that addresses these challenges with a novel, unified framework. Unlike previous methods that rely on isolated components, HAMOT incorporates a Swin Transformer-based Adaptive Enhancement (STAE) module, comprising Scene-Adaptive Transformer Enhancement and Confidence-Adaptive Feature Refinement, to improve detection under low-visibility conditions. A hierarchical Dynamic Graph Neural Network with Temporal Attention (DGNN-TA) models both short- and long-term associations, and an Adaptive Unscented Kalman Filter with Gated Recurrent Unit (AUKF-GRU) provides accurate motion prediction. A novel Graph-Based Density-Aware Clustering (GDAC) module improves occlusion recovery by adapting to scene density, preserving identity integrity. This integrated approach enables adaptive responses to complex visual scenarios. HAMOT achieves a Higher Order Tracking Accuracy (HOTA) of 67.05%, a Multiple Object Tracking Accuracy (MOTA) of 82.4%, an ID F1 Score (IDF1) of 83.1%, and 1052 Identity Switches (IDSW) on MOT17; 66.61% HOTA, 78.3% MOTA, 82.1% IDF1, and 748 IDSW on MOT20; and 66.4% HOTA, 92.32% MOTA, and 68.96% IDF1 on DanceTrack. With fixed thresholds, the full HAMOT model (all six components) runs in real time at 24 FPS on MOT17 using an RTX 3090 GPU, supporting robustness and scalability in real-world MOT applications.
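As a reading aid for the metrics quoted in the abstract, the following minimal sketch shows how MOTA and IDF1 are conventionally computed from error counts, following the standard CLEAR MOT and identity-metric definitions. The function names and the example counts are hypothetical and are not taken from the paper. HOTA is omitted because it requires per-threshold detection and association matching that a few lines cannot capture.

```python
def mota(fn: int, fp: int, idsw: int, gt: int) -> float:
    """MOTA = 1 - (FN + FP + IDSW) / GT, summed over all frames.

    fn, fp, idsw: total false negatives, false positives, identity switches.
    gt: total number of ground-truth boxes (must be positive).
    """
    if gt <= 0:
        raise ValueError("gt must be positive")
    return 1.0 - (fn + fp + idsw) / gt


def idf1(idtp: int, idfp: int, idfn: int) -> float:
    """IDF1 = 2*IDTP / (2*IDTP + IDFP + IDFN), using identity-level matches."""
    denom = 2 * idtp + idfp + idfn
    if denom == 0:
        raise ValueError("no detections and no ground truth")
    return 2 * idtp / denom


if __name__ == "__main__":
    # Hypothetical counts, chosen only to illustrate the arithmetic.
    print(f"MOTA: {mota(fn=150, fp=50, idsw=10, gt=1000):.4f}")
    print(f"IDF1: {idf1(idtp=800, idfp=100, idfn=200):.4f}")
```

Because identity switches enter MOTA only as one term in a sum that is usually dominated by false negatives and false positives, MOTA can stay high while identities are poorly preserved. This is why IDF1 and the IDSW count are reported alongside it.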
Copyright © 2025 The Author(s). Published by Tech Science Press. This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

