
Open Access

ARTICLE

A Large-Scale Dataset for Real-Time Vehicle Detection in Vietnamese Urban Traffic Scenes

Quang Dong Nguyen Vo1, Gia Nhu Nguyen1, Hoang Vu Tran2,*
1 School of Computer Science and Artificial Intelligence, Duy Tan University, Danang, Vietnam
2 The University of Danang-University of Technology and Education, Danang, Vietnam
* Corresponding Author: Hoang Vu Tran. Email: email

Computers, Materials & Continua https://doi.org/10.32604/cmc.2026.078756

Received 07 January 2026; Accepted 12 February 2026; Published online 09 March 2026

Abstract

Reliable vehicle detection in urban traffic environments remains challenging, particularly for fixed-view CCTV systems deployed in Southeast Asian cities, where heterogeneous traffic composition, high traffic density, frequent occlusions, and complex visual conditions are prevalent. The absence of large-scale datasets tailored to such mixed-traffic environments poses a significant limitation to the performance and generalization capability of existing object detection models. To address this gap, this paper presents a large-scale traffic image dataset for real-time vehicle detection in Vietnamese urban environments. The proposed dataset comprises 23,364 images collected from fixed-view CCTV traffic cameras deployed across Da Nang City, a representative urban area exhibiting mixed-traffic patterns commonly observed in Southeast Asian cities. The data cover diverse temporal periods, weather conditions, and traffic density levels encountered in real-world traffic monitoring scenarios. To comprehensively characterize these conditions, over 1.1 million instances are annotated across multiple traffic-related categories, including pedestrians, bicycles, motorbikes, cars, buses, trucks, and traffic lights with explicit signal-state labels. Such fine-grained, multi-class annotations support not only object-level detection but also higher-level traffic scene analysis relevant to intelligent transportation system (ITS) applications, such as traffic flow analysis and signal control. To balance annotation accuracy and scalability, a semi-automatic labeling pipeline is employed. Initial object annotations are generated using a pretrained YOLOv11m model and subsequently refined through systematic manual verification using the CVAT platform. 
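The pre-annotation step described above can be illustrated with a minimal sketch. It assumes a detector has already produced pixel-coordinate boxes with confidence scores, and it converts them into YOLO-format label lines (class ID plus normalized center, width, and height) that an annotation tool can import for manual correction. The class order, the 0.25 confidence threshold, and the names `Detection` and `to_yolo_lines` are hypothetical; the abstract does not specify the authors' class IDs, thresholds, or tooling.

```python
from dataclasses import dataclass

# Hypothetical class order; the dataset's actual class IDs are not stated in the abstract.
CLASSES = ["pedestrian", "bicycle", "motorbike", "car", "bus", "truck"]


@dataclass
class Detection:
    """One detector output in pixel coordinates (top-left and bottom-right corners)."""
    label: str
    confidence: float
    x1: float
    y1: float
    x2: float
    y2: float


def to_yolo_lines(detections, img_w, img_h, min_conf=0.25):
    """Convert detections to YOLO label lines: 'class cx cy w h', all normalized to [0, 1].

    min_conf is an assumed threshold, not a value taken from the paper.
    """
    lines = []
    for d in detections:
        # Skip low-confidence or unknown-class proposals; annotators add missed objects by hand.
        if d.confidence < min_conf or d.label not in CLASSES:
            continue
        # Clamp the box to the image bounds before normalizing.
        x1 = min(max(d.x1, 0.0), img_w)
        y1 = min(max(d.y1, 0.0), img_h)
        x2 = min(max(d.x2, 0.0), img_w)
        y2 = min(max(d.y2, 0.0), img_h)
        if x2 <= x1 or y2 <= y1:
            continue  # degenerate box after clamping
        cx = (x1 + x2) / 2 / img_w
        cy = (y1 + y2) / 2 / img_h
        w = (x2 - x1) / img_w
        h = (y2 - y1) / img_h
        lines.append(f"{CLASSES.index(d.label)} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}")
    return lines
```

In a workflow of this kind, each image would get one text file of such lines, and reviewers would then correct, delete, or add boxes manually rather than drawing every box from scratch.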
Experiments compare two variants of the same YOLOv11m architecture under an identical evaluation protocol: a pretrained baseline and a model fine-tuned on the proposed dataset with domain-specific data augmentation and optimized hyperparameter settings tailored to fixed-view CCTV conditions. The pretrained YOLOv11m achieves a mean Average Precision (mAP) of 0.409, whereas the model fine-tuned on the proposed dataset reaches an mAP of 0.788. These results underscore the necessity of localized, context-aware datasets such as the one presented in this work for robust real-time traffic perception in Vietnam and similar Southeast Asian urban contexts.
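For readers unfamiliar with the metric, the following is a minimal sketch of how mAP is commonly computed: predictions are matched to ground-truth boxes by intersection over union (IoU), average precision (AP) is the area under the resulting precision-recall curve, and mAP is the mean of AP over classes. The IoU threshold of 0.5, the single-image and single-class simplification, and the function names are assumptions made for illustration; the abstract does not state the threshold or the evaluation tooling the authors used.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def average_precision(preds, gts, iou_thr=0.5):
    """AP for one class on one image.

    preds: list of (confidence, box); gts: list of boxes.
    iou_thr=0.5 is an assumed threshold, not a value from the paper.
    """
    if not gts:
        return 0.0
    matched = set()
    hits = []
    # Process predictions from most to least confident.
    for _, box in sorted(preds, key=lambda p: p[0], reverse=True):
        best_j, best_iou = -1, iou_thr
        for j, gt in enumerate(gts):
            if j in matched:
                continue  # each ground-truth box can be matched only once
            overlap = iou(box, gt)
            if overlap >= best_iou:
                best_j, best_iou = j, overlap
        if best_j >= 0:
            matched.add(best_j)
            hits.append(1)  # true positive
        else:
            hits.append(0)  # false positive
    # Running precision and recall after each prediction.
    precisions, recalls, tp = [], [], 0
    for k, hit in enumerate(hits, start=1):
        tp += hit
        precisions.append(tp / k)
        recalls.append(tp / len(gts))
    # Make precision non-increasing from left to right (all-point interpolation).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum the area under the interpolated precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_ap(per_class_ap):
    """mAP as the unweighted mean of per-class AP values."""
    return sum(per_class_ap) / len(per_class_ap)
```

A full evaluation performs the matching per image and pools the results across the whole test set before building the curve. Read against this definition, the reported change from 0.409 to 0.788 is an absolute gain of 0.379, or roughly 93% relative to the pretrained baseline.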

Keywords

Deep learning; ITS; real-time vehicle detection; Vietnamese traffic dataset; traffic detection