Open Access

ARTICLE

Intelligent Semantic Segmentation with Vision Transformers for Aerial Vehicle Monitoring

Moneerah Alotaibi*

Department of Computer Science, College of Science and Humanities Dawadmi, Shaqra University, Dawadmi, 11911, Saudi Arabia

* Corresponding Author: Moneerah Alotaibi.

Computers, Materials & Continua 2026, 86(1), 1-20. https://doi.org/10.32604/cmc.2025.069195

Abstract

Advanced traffic monitoring systems face substantial challenges in vehicle detection and classification because conventional methods often demand extensive computational resources and struggle with diverse data acquisition techniques. This research presents a novel approach for vehicle classification and recognition in aerial image sequences that integrates several techniques to improve detection accuracy. The proposed model begins with preprocessing using Multiscale Retinex (MSR) to enhance image quality, followed by Expectation-Maximization (EM) segmentation to identify foreground objects. Vehicle detection is performed with the YOLOv10 framework, while feature extraction combines Maximally Stable Extremal Regions (MSER), Dense Scale-Invariant Feature Transform (Dense SIFT), and Zernike moment features to capture distinct object characteristics. A Hybrid Swarm-based Optimization algorithm then selects the features passed to the classifier. The final classification is conducted using a Vision Transformer. Experimental evaluations on benchmark datasets, including UAVDT and the Unmanned Aerial Vehicle Intruder Dataset (UAVID), show that the proposed approach achieves an accuracy of 94.40% on UAVDT and 93.57% on UAVID. The results indicate that the model outperforms existing methodologies and offers a statistically validated improvement in vehicle detection and classification for intelligent traffic monitoring systems.
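
As an illustration of the preprocessing stage named above, the following is a minimal sketch of a generic Multiscale Retinex on a single-channel image, not the implementation used in this work. It averages, over several Gaussian scales, the difference between the log of the image and the log of its blurred version. The function names, the scale values (2, 8, 16), the equal scale weights, and the final rescaling to the 0 to 255 range are all assumptions made for the example; the abstract does not specify them.

```python
import numpy as np

def gaussian_kernel(sigma):
    """1-D normalized Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def gaussian_blur(img, sigma):
    """Separable Gaussian blur with edge padding; output keeps the input shape."""
    k = gaussian_kernel(sigma)
    r = len(k) // 2
    padded = np.pad(img, r, mode="edge")
    # Filter along rows, then along columns.
    tmp = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 0, tmp)

def multiscale_retinex(img, sigmas=(2.0, 8.0, 16.0)):
    """Equal-weight MSR on a 2-D grayscale array (scales are assumed values)."""
    img = img.astype(np.float64) + 1.0  # offset avoids log(0)
    log_img = np.log(img)
    out = np.zeros_like(img)
    for s in sigmas:
        # Single-scale Retinex: log(image) - log(Gaussian-smoothed image)
        out += log_img - np.log(gaussian_blur(img, s))
    out /= len(sigmas)
    lo, hi = out.min(), out.max()
    if hi - lo < 1e-12:  # flat input: nothing to stretch
        return np.zeros(img.shape, dtype=np.uint8)
    # Linear stretch to the displayable range (an assumed post-processing step)
    return ((out - lo) / (hi - lo) * 255.0).astype(np.uint8)
```

For a colour frame, one common choice is to apply the same function to each channel separately; whether the proposed model does so is not stated here.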

Keywords

Machine learning; semantic segmentation; remote sensors; deep learning; object monitoring system

Copyright © 2026 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.