TCR-RoadNet: A Transformer-Enhanced Multi-Task Deep Learning Architecture for Real-Time Road Damage Detection and Segmentation
Olzhas Olzhayev1, Bakhytzhan Kulambayev2,*, Azizah Suliman3
1 Department of Mathematical and Computer Modeling, International Information Technology University, Almaty, Kazakhstan
2 Higher School of Telecommunications, Turan University, Almaty, Kazakhstan
3 Faculty of Science and Technology, Asia Metropolitan University, Subang Jaya Campus, Malaysia
* Corresponding Author: Bakhytzhan Kulambayev. Email:
Computers, Materials & Continua https://doi.org/10.32604/cmc.2026.082618
Received 19 March 2026; Accepted 25 May 2026; Published online 15 June 2026
Abstract
Automated road damage detection is a critical component of intelligent transportation systems, enabling efficient infrastructure maintenance and improved traffic safety. However, existing approaches often suffer from limited contextual understanding, insufficient segmentation accuracy, and suboptimal real-time performance. This study presents TCR-RoadNet, a transformer-enhanced multi-task deep learning architecture designed for simultaneous road damage detection and segmentation in real-world driving environments. The proposed framework integrates a multi-scale convolutional backbone with a Transformer Context Refinement (TCR) module to capture both fine-grained structural details and long-range spatial dependencies across feature scales. To further enhance performance, a Decoupled Detection Head (DDH) is employed to stabilize localization and classification learning, while a Classification Refinement Module (CRM) improves inter-class discrimination using region-based feature enhancement. In addition, a Boundary-Aware Segmentation Head is introduced to generate precise damage contours by incorporating edge-sensitive learning mechanisms. The model is evaluated on the RDD2022 dataset, complemented by an extended dataset containing additional road images collected from Kazakhstan to increase environmental diversity and robustness. Experimental results demonstrate that the proposed method achieves 0.9416 precision, 0.9235 recall, and 0.8718 mAP@50 for detection, and a mean Intersection over Union (mIoU) of 0.8129 for segmentation, while maintaining real-time inference at 57 FPS. Comparative analysis and ablation studies confirm that each architectural component contributes to consistent performance gains in both detection and segmentation tasks. Qualitative results further illustrate the robustness of the proposed framework under varying lighting, weather, and road conditions. The proposed approach offers a scalable and efficient solution for real-time road condition monitoring and intelligent infrastructure management systems.
Keywords
Road damage detection; pavement defect segmentation; transformer; computer vision; multi-task deep learning; intelligent transportation systems