Open Access
ARTICLE
Full Ceramic Bearing Fault Diagnosis with Few-Shot Learning Using GPT-2
1 Department of Mechanical and Industrial Engineering, College of Engineering, University of Illinois Chicago, Chicago, IL 60607, USA
2 Siemens Corporation, Princeton, NJ 08540, USA
3 NOV, Inc., Houston, TX 77042, USA
* Corresponding Author: David He. Email:
(This article belongs to the Special Issue: Applications of Large Language Models (LLMs) in Prognostics and Health Management)
Computer Modeling in Engineering & Sciences 2025, 143(2), 1955-1969. https://doi.org/10.32604/cmes.2025.063975
Received 31 January 2025; Accepted 22 April 2025; Issue published 30 May 2025
Abstract
Full ceramic bearings are mission-critical components in oil-free environments, such as food processing, semiconductor manufacturing, and medical applications. Developing effective fault diagnosis methods for these bearings is essential to ensuring operational reliability and preventing costly failures. Traditional supervised deep learning approaches have demonstrated promise in fault detection, but their dependence on large labeled datasets poses significant challenges in industrial settings where fault-labeled data is scarce. This paper introduces a few-shot learning approach for full ceramic bearing fault diagnosis by leveraging the pre-trained GPT-2 model. Large language models (LLMs) like GPT-2, pre-trained on diverse textual data, exhibit remarkable transfer learning and few-shot learning capabilities, making them ideal for applications with limited labeled data. In this study, acoustic emission (AE) signals from bearings were processed using empirical mode decomposition (EMD), and the extracted AE features were converted into structured text for fine-tuning GPT-2 as a fault classifier. To enhance its performance, we incorporated a modified loss function and softmax activation with cosine similarity, ensuring better generalization in fault identification. Experimental evaluations on a laboratory-collected full ceramic bearing dataset demonstrated that the proposed approach achieved high diagnostic accuracy with as few as five labeled samples, outperforming conventional methods such as k-nearest neighbor (KNN), large memory storage and retrieval (LAMSTAR) neural network, deep neural network (DNN), recurrent neural network (RNN), long short-term memory (LSTM) network, and model-agnostic meta-learning (MAML). The results highlight LLMs’ potential to revolutionize fault diagnosis, enabling faster deployment, reduced reliance on extensive labeled datasets, and improved adaptability in industrial monitoring systems.
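As an illustration only, the two ideas named in the abstract (rendering extracted features as structured text, and applying a softmax over cosine similarities) can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the feature names (`imf1_rms`, `imf1_kurtosis`), the text template, the `scale` factor, and both function names are hypothetical, and the paper's modified loss function is not reproduced here.

```python
import numpy as np

def features_to_text(features):
    """Render named feature values as one structured text line.

    The "name: value" template is an assumption; the abstract does not
    specify the exact text format used for fine-tuning.
    """
    return ", ".join(f"{name}: {value:.4f}" for name, value in features.items())

def cosine_softmax(embedding, class_weights, scale=10.0):
    """Class probabilities from a softmax over scaled cosine similarities.

    embedding:     1-D vector, e.g. a pooled hidden state (assumed).
    class_weights: 2-D array with one row per fault class (assumed).
    scale:         hypothetical factor that sharpens the distribution.
    """
    e = embedding / np.linalg.norm(embedding)
    w = class_weights / np.linalg.norm(class_weights, axis=1, keepdims=True)
    logits = scale * (w @ e)        # cosine similarity per class, scaled
    logits = logits - logits.max()  # shift for numerical stability
    p = np.exp(logits)
    return p / p.sum()

# Hypothetical feature values for one AE sample.
sample = {"imf1_rms": 0.1234, "imf1_kurtosis": 3.4567}
text = features_to_text(sample)

# Toy example: three classes with orthogonal weight vectors.
probs = cosine_softmax(np.array([0.1, 2.0, 0.1]), np.eye(3))
```

Because cosine similarity ignores vector magnitude, the predicted class depends only on the direction of the embedding relative to each class vector, which is one plausible reason such a head can help when only a handful of labeled samples are available.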

This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.