Pyramid–MixNet: Integrate Attention into Encoder-Decoder Transformer Framework for Automatic Railway Surface Damage Segmentation

Hui Luo; Wenqing Li; Wei Zeng

doi:10.32604/cmc.2025.062949

Open Access

ARTICLE


Hui Luo, Wenqing Li*, Wei Zeng

School of Information and Software Engineering, East China Jiaotong University, Nanchang, 330013, China

* Corresponding Author: Wenqing Li

(This article belongs to the Special Issue: Artificial Intelligence and Advanced Computation Technology in Railways)

Computers, Materials & Continua 2025, 84(1), 1567-1580. https://doi.org/10.32604/cmc.2025.062949

Received 31 December 2024; Accepted 18 April 2025; Issue published 09 June 2025

Abstract

The rail surface is a critical component of high-speed railway infrastructure, and damage to it directly affects train operational stability and safety. Existing methods are limited in accuracy and speed on small-sample, multi-category, and multi-scale target segmentation tasks. To address these challenges, this paper proposes Pyramid–MixNet, an intelligent segmentation model for high-speed rail surface damage that combines dataset construction and expansion with a feature-pyramid-based encoder-decoder network using multiple attention mechanisms. The encoding network integrates Spatial Reduction Masked Multi-Head Attention (SRMMHA) to enhance global feature extraction while reducing trainable parameters. The decoding network incorporates Mix-Attention (MA), enabling multi-scale structural understanding and cross-scale token group correlation learning. Experimental results show that the proposed method achieves an average segmentation accuracy of 62.17%, a Damage Dice Coefficient of 80.28%, and a speed of 56.83 FPS, which meets real-time detection requirements. The model’s high accuracy and scene adaptability significantly improve the detection of small-scale and complex multi-scale rail damage, offering practical value for real-time monitoring in high-speed railway maintenance systems.
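The abstract reports a Dice coefficient as one of its segmentation metrics. As a rough illustration only, the standard Dice score between a predicted binary mask P and a ground-truth mask T is 2|P ∩ T| / (|P| + |T|). The sketch below computes it for small nested-list masks; the function name `dice_coefficient`, the smoothing term `eps`, and the toy masks are assumptions made for this example, and the paper's exact "Damage Dice Coefficient" (for instance, how classes are averaged) is not specified in this abstract.

```python
from typing import Sequence


def dice_coefficient(
    pred: Sequence[Sequence[int]],
    target: Sequence[Sequence[int]],
    eps: float = 1e-7,  # assumed smoothing term, not taken from the paper
) -> float:
    """Standard Dice score, 2*|P ∩ T| / (|P| + |T|), for two binary masks."""
    intersection = 0
    pred_sum = 0
    target_sum = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            intersection += p * t  # 1 only where both masks mark damage
            pred_sum += p
            target_sum += t
    # eps keeps the ratio defined (and equal to 1) when both masks are empty
    return (2.0 * intersection + eps) / (pred_sum + target_sum + eps)


# Hypothetical 2x3 masks: 1 marks a damaged pixel, 0 marks background.
pred = [[0, 1, 1],
        [0, 1, 0]]
target = [[0, 1, 0],
          [0, 1, 1]]

score = dice_coefficient(pred, target)
print(f"Dice: {score:.4f}")
```

Here the masks overlap on two pixels and each marks three, so the score is 4/6, or about 0.667. A multi-class variant would typically compute this per damage category and average the results, but that averaging scheme is an assumption rather than something stated here.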

Keywords

Pyramid vision transformer; encoder–decoder architecture; railway damage segmentation; masked multi-head attention; mix-attention

Cite This Article

APA Style

Luo, H., Li, W., & Zeng, W. (2025). Pyramid–MixNet: Integrate Attention into Encoder-Decoder Transformer Framework for Automatic Railway Surface Damage Segmentation. Computers, Materials & Continua, 84(1), 1567–1580. https://doi.org/10.32604/cmc.2025.062949

Vancouver Style

Luo H, Li W, Zeng W. Pyramid–MixNet: Integrate Attention into Encoder-Decoder Transformer Framework for Automatic Railway Surface Damage Segmentation. Comput Mater Contin. 2025;84(1):1567–1580. https://doi.org/10.32604/cmc.2025.062949

IEEE Style

H. Luo, W. Li, and W. Zeng, “Pyramid–MixNet: Integrate Attention into Encoder-Decoder Transformer Framework for Automatic Railway Surface Damage Segmentation,” Comput. Mater. Contin., vol. 84, no. 1, pp. 1567–1580, 2025. https://doi.org/10.32604/cmc.2025.062949


Copyright © 2025 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


