Open Access

ARTICLE

Bearing Fault Diagnosis Based on Multimodal Fusion GRU and Swin-Transformer

Yingyong Zou*, Yu Zhang, Long Li, Tao Liu, Xingkui Zhang

College of Mechanical and Vehicle Engineering, Changchun University, Changchun, 130012, China

* Corresponding Author: Yingyong Zou.

(This article belongs to the Special Issue: Advancements in Machine Fault Diagnosis and Prognosis: Data-Driven Approaches and Autonomous Systems)

Computers, Materials & Continua 2026, 86(1), 1-24. https://doi.org/10.32604/cmc.2025.068246

Abstract

Fault diagnosis of rolling bearings is crucial for ensuring the stable operation of mechanical equipment and production safety in industrial environments. However, because the collected vibration signals are nonlinear and non-stationary, single-modal methods struggle to fully capture fault features. This paper proposes a rolling bearing fault diagnosis method based on multi-modal information fusion. The method first employs the Hippopotamus Optimization Algorithm (HO) to optimize the number of modes in Variational Mode Decomposition (VMD) and thereby achieve optimal decomposition performance. It then combines Convolutional Neural Networks (CNN) and Gated Recurrent Units (GRU) to extract temporal features from the one-dimensional time-series signals. In parallel, the Markov Transition Field (MTF) transforms the one-dimensional signals into two-dimensional images for spatial feature mining; images generated with different parameter combinations are compared through visualization techniques to determine the optimal parameter configuration. A multi-modal network (GSTCN) is constructed by integrating Swin-Transformer with the Convolutional Block Attention Module (CBAM), where the attention module enhances fault features. Finally, the fault features extracted from the different modalities are deeply fused and fed into a fully connected layer to complete fault classification. Experimental results show that the GSTCN model achieves an average diagnostic accuracy of 99.5% across three datasets, significantly outperforming the comparison methods. This demonstrates that the proposed model has high diagnostic precision and good generalization ability, providing an efficient and reliable solution for rolling bearing fault diagnosis.
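To make the signal-to-image step concrete, the following is a minimal NumPy sketch of the standard Markov Transition Field construction. It is not the authors' implementation: the function name `markov_transition_field`, the choice of 8 quantile bins, and the synthetic test signal are all assumptions made for illustration, and the parameter configuration selected in the paper is not reproduced here.

```python
import numpy as np


def markov_transition_field(x, n_bins=8):
    """Return the (len(x), len(x)) Markov Transition Field of a 1-D signal.

    Hypothetical helper for illustration; n_bins=8 is an assumed default.
    """
    x = np.asarray(x, dtype=float)
    # Interior quantile edges, so each bin holds roughly the same number of samples.
    edges = np.quantile(x, np.linspace(0.0, 1.0, n_bins + 1)[1:-1])
    # Bin index in [0, n_bins - 1] for every sample.
    q = np.digitize(x, edges)
    # Count first-order transitions between consecutive samples.
    W = np.zeros((n_bins, n_bins))
    np.add.at(W, (q[:-1], q[1:]), 1.0)
    # Row-normalise counts to transition probabilities; empty rows stay zero.
    row_sums = W.sum(axis=1, keepdims=True)
    W = np.divide(W, row_sums, out=np.zeros_like(W), where=row_sums > 0)
    # M[i, j] = probability of moving from the bin of x[i] to the bin of x[j].
    return W[q[:, None], q[None, :]]


# Synthetic stand-in for a vibration segment (assumed, not from the paper's datasets).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 256)
signal = np.sin(2 * np.pi * 30 * t) + 0.2 * rng.standard_normal(t.size)

mtf = markov_transition_field(signal, n_bins=8)
print(mtf.shape)  # (256, 256)
```

The resulting matrix can be treated as a single-channel image. In practice it is usually downsampled (for example by block averaging) before being passed to an image backbone, and the number of bins and the image size are the kind of parameters one would compare visually.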

Keywords

Multi-modal; GRU; Swin-Transformer; CBAM; CNN; feature fusion

Cite This Article

APA Style
Zou, Y., Zhang, Y., Li, L., Liu, T., Zhang, X. (2026). Bearing Fault Diagnosis Based on Multimodal Fusion GRU and Swin-Transformer. Computers, Materials & Continua, 86(1), 1–24. https://doi.org/10.32604/cmc.2025.068246
Vancouver Style
Zou Y, Zhang Y, Li L, Liu T, Zhang X. Bearing Fault Diagnosis Based on Multimodal Fusion GRU and Swin-Transformer. Comput Mater Contin. 2026;86(1):1–24. https://doi.org/10.32604/cmc.2025.068246
IEEE Style
Y. Zou, Y. Zhang, L. Li, T. Liu, and X. Zhang, “Bearing Fault Diagnosis Based on Multimodal Fusion GRU and Swin-Transformer,” Comput. Mater. Contin., vol. 86, no. 1, pp. 1–24, 2026. https://doi.org/10.32604/cmc.2025.068246



Copyright © 2026 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.