Open Access
ARTICLE
Aerial Images for Intelligent Vehicle Detection and Classification via YOLOv11 and Deep Learner
1 Guodian Nanjing Automation Co., Ltd., Nanjing, 210032, China
2 Faculty of Computing and AI, Air University, Islamabad, 44000, Pakistan
3 College of Computer Science, King Khalid University, Abha, 61421, Saudi Arabia
4 Department of Informatics and Computer Systems, King Khalid University, Abha, 61421, Saudi Arabia
5 Department of Information Systems, College of Computer Engineering and Sciences, Prince Sattam bin Abdulaziz University, Al-Kharj, 16273, Saudi Arabia
6 Jiangsu Key Laboratory of Intelligent Medical Image Computing, School of Artificial Intelligence (School of Future Technology), Nanjing University of Information Science and Technology, Nanjing, 210044, China
7 Cognitive Systems Lab, University of Bremen, Bremen, 28359, Germany
* Corresponding Author: Hui Liu. Email:
# These authors contributed equally to this work
Computers, Materials & Continua 2026, 86(1), 1-19. https://doi.org/10.32604/cmc.2025.067895
Received 15 May 2025; Accepted 02 September 2025; Issue published 10 November 2025
Abstract
As urban landscapes evolve and vehicular volumes soar, traditional traffic monitoring systems struggle to scale, often failing under the complexities of dense, dynamic, and occluded environments. This paper introduces a novel, unified deep learning framework for vehicle detection, tracking, counting, and classification in aerial imagery, designed explicitly for the demands of modern smart city infrastructure. Our approach begins with adaptive histogram equalization to optimize aerial image clarity, followed by a cutting-edge scene parsing technique using Mask2Former, enabling robust segmentation even in visually congested settings. Vehicle detection leverages the latest YOLOv11 architecture, delivering superior accuracy in aerial contexts by addressing occlusion, scale variance, and fine-grained object differentiation. We incorporate the highly efficient ByteTrack algorithm for tracking, enabling seamless identity preservation across frames. Vehicle counting is achieved through an unsupervised DBSCAN-based method, ensuring adaptability to varying traffic densities. We further introduce a hybrid feature extraction module combining Convolutional Neural Networks (CNNs) with Zernike Moments, capturing both deep semantic and geometric signatures of vehicles. The final classification is powered by NASNet, a neural architecture search-optimized model, ensuring high accuracy across diverse vehicle types and orientations. Extensive evaluations on the VAID benchmark dataset demonstrate the system's outstanding performance, achieving 96% detection, 94% tracking, and 96.4% classification accuracy. On the UAVDT dataset, the system attains 95% detection, 93% tracking, and 95% classification accuracy, confirming its robustness across diverse aerial traffic scenarios.
These results establish new benchmarks in aerial traffic analysis and validate the framework's scalability, making it a powerful and adaptable solution for next-generation intelligent transportation systems and urban surveillance.
Keywords
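The counting step described in the abstract clusters detections with unsupervised DBSCAN, so the vehicle count falls out as the number of dense clusters. As a hedged illustration only (this is not the authors' implementation; the centroid input, `eps`, and `min_pts` values are assumptions for the sketch), a minimal pure-Python density-based counter might look like:

```python
from math import hypot

def dbscan_count(points, eps=2.0, min_pts=2):
    """Count clusters of detection centroids with a minimal DBSCAN.

    points  -- list of (x, y) centroids, e.g. from detected boxes
    eps     -- neighborhood radius (illustrative, not a tuned value)
    min_pts -- minimum neighbors (incl. self) to seed a cluster
    Returns the number of clusters; isolated detections count as noise.
    """
    UNVISITED, NOISE = -2, -1
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        xi, yi = points[i]
        return [j for j, (xj, yj) in enumerate(points)
                if hypot(xi - xj, yi - yj) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] != UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later become a border point
            continue
        labels[i] = cluster
        queue = list(seeds)
        while queue:  # expand the cluster over density-reachable points
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: absorb, don't expand
            if labels[j] != UNVISITED:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:  # core point: keep expanding
                queue.extend(nbrs)
        cluster += 1
    return cluster
```

For example, two tight groups of centroids plus one isolated point yield a count of 2, since the lone detection is treated as noise rather than a cluster of its own.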
Copyright © 2026 The Author(s). Published by Tech Science Press. This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.