Intelligent Semantic Segmentation with Vision Transformers for Aerial Vehicle Monitoring

Moneerah Alotaibi

doi:10.32604/cmc.2025.069195

ARTICLE

Moneerah Alotaibi*

Department of Computer Science, College of Science and Humanities Dawadmi, Shaqra University, Dawadmi, 11911, Saudi Arabia

* Corresponding Author: Moneerah Alotaibi. Email: email

Computers, Materials & Continua 2026, 86(1), 1-20. https://doi.org/10.32604/cmc.2025.069195

Received 17 June 2025; Accepted 25 September 2025; Issue published 10 November 2025

Abstract

Advanced traffic monitoring systems face substantial challenges in vehicle detection and classification because conventional methods often demand extensive computational resources and struggle with diverse data acquisition techniques. This research presents a novel approach for vehicle classification and recognition in aerial image sequences that integrates several techniques to improve detection accuracy. The proposed model begins with preprocessing using Multiscale Retinex (MSR) to enhance image quality, followed by Expectation-Maximization (EM) segmentation to identify foreground objects. Vehicle detection is performed with the YOLOv10 framework, while feature extraction combines Maximally Stable Extremal Regions (MSER), Dense Scale-Invariant Feature Transform (Dense SIFT), and Zernike moment features to capture distinct object characteristics. A Hybrid Swarm-based Optimization algorithm then selects the most informative features to improve classification performance. The final classification is conducted using a Vision Transformer. Experimental evaluations on benchmark datasets, including UAVDT and the Unmanned Aerial Vehicle Intruder Dataset (UAVID), yield an accuracy of 94.40% on UAVDT and 93.57% on UAVID. The results indicate that the model improves vehicle detection and classification in aerial imagery, outperforming existing methodologies with a statistically validated improvement for intelligent traffic monitoring systems.
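As a rough illustration of the EM segmentation step, the sketch below fits a two-component one-dimensional Gaussian mixture to grayscale intensities and labels each pixel by its more likely component. The abstract does not give the paper's EM formulation, so the two-class intensity model, the initialisation at the intensity extremes, and the function name `em_two_class` are all assumptions made for illustration only.

```python
import math


def em_two_class(pixels, iters=50):
    """Hypothetical helper: fit a two-component 1-D Gaussian mixture with EM.

    Returns (labels, means); label 1 marks the brighter component.
    Assumes foreground and background differ mainly in intensity.
    """
    lo, hi = min(pixels), max(pixels)
    means = [float(lo), float(hi)]              # start at the intensity extremes
    var0 = max(((hi - lo) / 4.0) ** 2, 1e-6)    # broad initial variance
    variances = [var0, var0]
    weights = [0.5, 0.5]
    resp = []
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each pixel
        resp = []
        for x in pixels:
            p = [
                weights[k]
                * math.exp(-((x - means[k]) ** 2) / (2 * variances[k]))
                / math.sqrt(2 * math.pi * variances[k])
                for k in range(2)
            ]
            s = (p[0] + p[1]) or 1e-300         # guard against total underflow
            resp.append((p[0] / s, p[1] / s))
        # M-step: re-estimate weight, mean, and variance of each component
        for k in range(2):
            nk = sum(r[k] for r in resp) or 1e-12
            means[k] = sum(r[k] * x for r, x in zip(resp, pixels)) / nk
            variances[k] = max(
                sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, pixels)) / nk,
                1e-6,                           # floor keeps the density finite
            )
            weights[k] = nk / len(pixels)
    labels = [1 if r[1] > r[0] else 0 for r in resp]
    return labels, means


# Toy "image": four dark background pixels and four bright foreground pixels.
labels, means = em_two_class([10, 12, 11, 9, 200, 205, 198, 202])
```

On real aerial frames the mixture would more plausibly be fitted to colour or texture features with more than two components; the one-dimensional case is used here only to keep the E-step and M-step readable.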

Keywords

Machine learning; semantic segmentation; remote sensors; deep learning; object monitoring system

Cite This Article

APA Style

Alotaibi, M. (2026). Intelligent Semantic Segmentation with Vision Transformers for Aerial Vehicle Monitoring. Computers, Materials & Continua, 86(1), 1–20. https://doi.org/10.32604/cmc.2025.069195

Vancouver Style

Alotaibi M. Intelligent Semantic Segmentation with Vision Transformers for Aerial Vehicle Monitoring. Comput Mater Contin. 2026;86(1):1–20. https://doi.org/10.32604/cmc.2025.069195

IEEE Style

M. Alotaibi, “Intelligent Semantic Segmentation with Vision Transformers for Aerial Vehicle Monitoring,” Comput. Mater. Contin., vol. 86, no. 1, pp. 1–20, 2026. https://doi.org/10.32604/cmc.2025.069195

Copyright © 2026 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.