Aero-Engine Surge Fault Diagnosis Using Deep Neural Network

Deep learning techniques perform well in feature extraction and model fitting, and their introduction into aero-engine fault diagnosis is of great significance. The aero-engine is the heart of the aircraft, and its stable operation is the primary guarantee of flight safety. To ensure normal operation of the aircraft, aero-engine faults must be studied and diagnosed. Among the many engine failures, surge occurs relatively frequently and is especially hazardous, often posing a great threat to flight safety. Based on an analysis of the mechanism of aero-engine surge, this paper proposes an aero-engine surge fault diagnosis method based on deep learning. Key sensor data are obtained by analyzing the sensor data of different engines. An aero-engine surge dataset acquisition algorithm (ASDA) is proposed to sample fault and normal points and generate the training set, validation set and test set. Based on neural network models such as the one-dimensional convolutional neural network (1D-CNN), the recurrent neural network (RNN) and the long short-term memory neural network (LSTM), different neural network optimization algorithms are selected to achieve fault diagnosis and classification. The experimental results show that deep learning is effective for aero-engine surge fault diagnosis, and the aero-engine surge fault diagnosis network (ASFDN) proposed in this paper achieves the best results among the compared models. After training, the network achieves more than 99% classification accuracy on the test set.


Introduction
In recent years, the aviation industry has developed rapidly, and aircraft safety has attracted widespread attention. The aero-engine is the core component of the aircraft; once it fails, the aircraft can no longer fly normally. Engine surge has long restricted the development of turbine engines, degrading engine performance and even causing serious damage. It is the most dangerous of all engine failures. Surge is an abnormal operating state of the engine caused by the compressor [1]. When surge occurs, the airflow oscillates along the compressor axis at low frequency and high amplitude. This oscillation drives the compressor blades into strong vibration, seriously damaging or even breaking them within a short time [2]. An even more dangerous situation is the reverse flow of flame from the combustion chamber. It arises when the compressor surges so violently that the high-temperature gas in the combustion chamber behind it flows back into the compressor. Even if this lasts only a few tenths of a second, the inrushing high-temperature flame is enough to burn out all the compressor blades and cause the engine to be scrapped [3]. The compressor must therefore be kept out of the surge state in every working condition. To ensure the safety of the aero-engine, it is necessary to carry out failure tests on the engine and to identify and classify the failure signals.
In the field of mechanical fault diagnosis, many domestic and foreign scholars have used deep neural networks to process the vibration signals of mechanical equipment and have achieved good results. In 2019, Luo et al. [4] proposed an LSTM-based method for the fault diagnosis of reciprocating compression machinery; the fault identification accuracy of the best model reached 93%. To solve the problems of traditional one-dimensional vibration signal processing methods, Huang et al. [5] proposed a 1D-CNN fault diagnosis method based on cubic spline interpolation pooling. Experiments show that the method has a high recognition rate and is stable on one-dimensional signals. Yu et al. [6] realized end-to-end fault diagnosis of rolling bearings based on a stacked LSTM, reaching 99% accuracy. In 2020, Gou et al. [7] proposed an intelligent fault diagnosis method for aero-engine sensors based on deep learning and time-frequency analysis. This method requires neither modeling nor threshold design, is strongly robust, and achieves an accuracy of more than 97%. Qiu et al. [8] proposed a rolling bearing fault diagnosis method based on an improved bidirectional long short-term memory (Bi-LSTM) network, motivated by the non-stationary characteristics and simple logical structure of rolling bearings, which further reduced the error rate of rolling bearing fault diagnosis. Yin et al. [9] proposed a method based on a Cos-LSTM neural network for the gearboxes of wind turbines and proved its effectiveness.
Based on the above research, this paper applies 1D-CNN, LSTM and other neural network models, which are often used to process the vibration signals and sequence data of mechanical equipment, to aero-engine surge fault diagnosis. Different network optimization algorithms are used for fault identification and classification. The experimental results of the different methods are compared to verify the effectiveness of deep learning for aero-engine surge fault diagnosis. Finally, this paper proposes an aero-engine surge fault diagnosis network (ASFDN). Based on 1D-CNN, this network classifies normal and surge fault data; it is built and its parameters are tuned for the aero-engine surge data set.

Convolutional Layer
The convolutional layer is composed of multiple convolution kernels, and the shape and number of convolution kernels directly determine the performance of the network [13]. The parameter sharing of each convolution kernel reduces the number of network model parameters, giving the trained model stronger generalization [14]. The convolution process is shown in Eq. (1):
$$x_k^l = \sum_{i=1}^{N} w_{ik}^{l-1} * x_i^{l-1} + b_k^l \tag{1}$$
where $x_k^l$ and $b_k^l$ represent the output and bias of the $k$th neuron in layer $l$; $x_i^{l-1}$ represents the output of the $i$th neuron in layer $l-1$; $w_{ik}^{l-1}$ represents the convolution kernel between the $i$th neuron in layer $l-1$ and the $k$th neuron in layer $l$; $i = 1, 2, \ldots, N$; and $N$ is the number of neurons.
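As an illustration of Eq. (1), the following minimal sketch computes the output of a single neuron from the outputs of the previous layer. The function name `conv1d_neuron`, the absence of padding, a stride of 1 and the cross-correlation convention (no kernel flipping) are assumptions made for this example and are not specified in this paper.

```python
def conv1d_neuron(inputs, kernels, bias):
    """Output x_k^l of one neuron k in layer l, following Eq. (1).

    inputs:  list of N sequences, the outputs x_i^{l-1} of layer l-1
    kernels: list of N kernels w_ik^{l-1}, one per input sequence
    bias:    scalar b_k^l
    Assumptions of this sketch: no padding, stride 1, no kernel flipping.
    """
    kernel_size = len(kernels[0])
    out_len = len(inputs[0]) - kernel_size + 1
    output = [bias] * out_len
    # Sum the contribution of every neuron i of the previous layer.
    for x_i, w_ik in zip(inputs, kernels):
        for t in range(out_len):
            window = x_i[t:t + kernel_size]
            output[t] += sum(w * x for w, x in zip(w_ik, window))
    return output


# Two input sequences of length 5 and kernels of size 3 (illustrative values).
x_prev = [[1.0, 2.0, 3.0, 4.0, 5.0],
          [0.0, 1.0, 0.0, 1.0, 0.0]]
w = [[1.0, 0.0, -1.0],
     [0.5, 0.5, 0.5]]
result = conv1d_neuron(x_prev, w, bias=0.1)
print(result)  # three values, one per valid kernel position
```

The same kernel weights are reused at every position `t`, which is the parameter sharing described above.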

Activation Layer
The activation layer adds nonlinear factors through the activation function to enhance the expressive ability of the model [15]. Commonly used activation functions are Sigmoid, Tanh and ReLU. Because the ReLU function is linear and non-saturating for positive inputs and converges quickly, it can mitigate the vanishing gradient problem and is widely used [16]. The formula is shown in Eq. (2):
$$a_k^l = \max\left(0, x_k^l\right) \tag{2}$$
where $a_k^l$ is the activation value of the $k$th neuron in layer $l$.
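The non-saturation property can be checked numerically. The sketch below compares the derivatives of Sigmoid, Tanh and ReLU at a few illustrative inputs; the function names and the chosen inputs are assumptions made for this example only.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)


def tanh_grad(x):
    return 1.0 - math.tanh(x) ** 2


def relu(x):
    return max(0.0, x)


def relu_grad(x):
    # 1 for positive inputs, 0 otherwise (the value at x = 0 is a convention).
    return 1.0 if x > 0 else 0.0


for x in (0.5, 5.0, 10.0):
    print(f"x={x:5.1f}  sigmoid'={sigmoid_grad(x):.6f}  "
          f"tanh'={tanh_grad(x):.6f}  relu'={relu_grad(x):.1f}")
```

For large positive inputs the Sigmoid and Tanh derivatives shrink toward zero, whereas the ReLU derivative stays at 1, so gradients passing through ReLU units are not attenuated in that regime.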

Pooling Layer
The pooling layer is usually added after the convolutional layer and performs down-sampling according to certain rules, thereby reducing the feature space size and the number of network parameters [17]. Common pooling operations include maximum pooling (Max Pooling) and average pooling (Average Pooling). The general formula is shown in Eq. (3):
$$s_i = \operatorname{pool}_{j \in R_i}\left(a_j\right) \tag{3}$$
where $s_i$ represents the value of the $i$th neuron after the pooling operation; $\operatorname{pool}$ represents the pooling function; $R_i$ represents the $i$th pooling area of the feature map; $j$ is the index of each element in the region; and $a_j$ denotes the activation value at index $j$.
Max pooling takes the maximum value of the pooling area, as shown in Eq. (4):
$$s_i = \max_{j \in R_i} a_j \tag{4}$$
Average pooling takes the average of the pooling area, as shown in Eq. (5):
$$s_i = \frac{1}{|R_i|} \sum_{j \in R_i} a_j \tag{5}$$
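The two pooling operations can be sketched as follows. The function name `pool1d`, the use of non-overlapping regions (stride equal to the pooling width) and the dropping of an incomplete trailing region are assumptions made for this example.

```python
def pool1d(values, width, mode="max"):
    """Down-sample a 1-D feature map over regions of `width` elements.

    Assumptions of this sketch: regions do not overlap, and an
    incomplete region at the end of the sequence is dropped.
    """
    pooled = []
    for start in range(0, len(values) - width + 1, width):
        region = values[start:start + width]
        if mode == "max":
            pooled.append(max(region))
        elif mode == "average":
            pooled.append(sum(region) / len(region))
        else:
            raise ValueError(f"unknown pooling mode: {mode}")
    return pooled


feature_map = [1.0, 3.0, 2.0, 8.0, 4.0, 6.0, 5.0]
max_pooled = pool1d(feature_map, width=2, mode="max")
avg_pooled = pool1d(feature_map, width=2, mode="average")
print(max_pooled)  # [3.0, 8.0, 6.0]
print(avg_pooled)  # [2.0, 5.0, 5.0]
```

With a pooling width of 2, a feature map of seven values is reduced to three, which is the reduction in feature space size described above.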

Fully Connected Layer
The fully connected layer classifies the signal after feature extraction [18]. This paper addresses a two-class classification problem of normal sequences and fault sequences, so the Sigmoid activation function can be used. The formula is shown in Eq. (6):
$$p_m = \frac{1}{1 + e^{-z}} \tag{6}$$
where $m = 1, 2$ indexes the two categories; $p_m$ represents the probability of being classified into category $m$; and $z$ is the input of the output-layer neuron to be activated.

Data Set Acquisition
This paper uses sensor data collected from different engines during testing, and takes the sensors common to all of them (the largest common set) to obtain 6 key sensors. Surge is mainly judged from the data of two sensors: the sensor that records the manually controlled position of the engine throttle lever (PLA), and the total pressure sensor (PT3) at the compressor outlet. When the PLA data are stable but the PT3 curve shows sudden changes and severe jitter, the interval is a surge fault interval [19]. According to this rule, this paper collects the fault points and normal points. To generate the normal sequences and the fault sequences, this paper adopts a sliding-window method similar in principle to the way convolutional neural networks process images. Each sequence runs from the nth point to point n + window_size [20]. The judgment basis is that a sequence containing at least one fault point is a fault sequence. In this way, the original data are reshaped into samples of a specified length for modeling.
Because the sensors measure different quantities, their data usually have different dimensions and orders of magnitude. When the value levels differ greatly, analyzing the original values directly overstates the role of attributes with higher values and understates the role of attributes with lower values [21]. Therefore, to ensure the reliability of the results, the original data need to be standardized. This experiment uses the commonly used z-score standardization, and the conversion formula is shown in Eq. (7):
$$z = \frac{x - \mu}{\sigma} \tag{7}$$
where $x$ is the original data, $\mu$ and $\sigma$ are the mean and standard deviation of the data, and $z$ is the standardized value.
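A minimal sketch of z-score standardization for one sensor channel is given below. The function name `z_score`, the use of the population standard deviation and the sample values are assumptions made for this example; the readings are not taken from the data set of this paper.

```python
import statistics


def z_score(column):
    """Standardize one sensor channel to zero mean and unit standard deviation."""
    mu = statistics.fmean(column)
    sigma = statistics.pstdev(column)  # population standard deviation (assumption)
    if sigma == 0:
        raise ValueError("constant channel: standard deviation is zero")
    return [(x - mu) / sigma for x in column]


# Illustrative readings only.
readings = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
standardized = z_score(readings)
print(standardized)  # mean 5 and standard deviation 2 give values from -1.5 to 2.0
```

Each of the sensor channels would be standardized separately, so that channels with large raw magnitudes do not dominate those with small ones.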

Data Set Division
When dividing the data set, this paper adopts random shuffling. The original data set may be ordered: if the positive examples are concentrated at the front and the negative examples at the back, the validation set or test set will consist mostly of negative examples. The model is then likely to train poorly on negative examples, and its overall performance deteriorates. Even if the data are not ordered, shuffling makes the division fairer and gives the model an opportunity to improve. When the original data follow a certain distribution pattern, the learning curve becomes unsmooth [22]. If the amount of data is large enough, the shuffled data are approximately randomly distributed, and the commonality of the samples is better reflected after learning [23]. To strengthen the generalization ability of the model, this paper shuffles the data set (feature data and labels together) while preserving the correspondence between the feature data and the label of each sample.
The shuffled data are divided into three parts: a training set, a validation set and a test set. First, the data set is divided into a training set and a test set. Since the configuration and training level of the model must also be checked while the model is being built, the training data are further divided into two parts: a training set used for training and a validation set used for validation. The training set is used to train the neural network model, the validation set is then used to verify its effectiveness, and the best-performing model is selected; this is repeated until a satisfactory model is obtained. Finally, once the model "passes" the validation set, the test set is used to measure the final effect of the model and to evaluate its precision and recall. This paper uses the data preprocessing method proposed in Section 3.1 to obtain the experimental data set; the selected window size is 64, the step size is 1, and the resulting sequence length is 64. The data set contains normal sequences, fault sequences and their respective labels. After shuffling, the data set is divided into a training set, a validation set and a test set in the ratio 7:2:1.
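The shuffling and 7:2:1 division described above can be sketched as follows. The function name `shuffle_and_split`, the fixed random seed and the toy data are assumptions made for this example; shuffling a list of indices keeps each feature sequence paired with its own label.

```python
import random


def shuffle_and_split(features, labels, ratios=(7, 2, 1), seed=0):
    """Shuffle samples and labels together, then split them by integer ratios."""
    if len(features) != len(labels):
        raise ValueError("features and labels must have the same length")
    indices = list(range(len(features)))
    random.Random(seed).shuffle(indices)  # seed fixed only for reproducibility

    total = sum(ratios)
    n_train = len(indices) * ratios[0] // total
    n_val = len(indices) * ratios[1] // total
    parts = (indices[:n_train],
             indices[n_train:n_train + n_val],
             indices[n_train + n_val:])  # the remainder becomes the test set
    return [([features[i] for i in part], [labels[i] for i in part])
            for part in parts]


# Toy data: 100 samples whose label is recoverable from the feature itself.
toy_features = [[i, i * 10] for i in range(100)]
toy_labels = [i % 2 for i in range(100)]
train, val, test = shuffle_and_split(toy_features, toy_labels)
print(len(train[0]), len(val[0]), len(test[0]))  # 70 20 10
```

Integer ratios are used instead of fractions such as 0.7 so that the split sizes are not affected by floating-point rounding.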
The aero-engine surge data set acquisition algorithm (ASDA) used in this paper is described as follows:

Algorithm 1: Data set acquisition algorithm (ASDA)
Input: A data list X, the list of label Y, the starting index n, the window_size ws, the step of sliding window s.
...
append s data to the failure sequence list FL;
...
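Only part of Algorithm 1 is reproduced here, so the following is a sketch of a sliding-window acquisition procedure reconstructed from the description in this section, not the ASDA listing itself. The function name `build_sequences`, the label encoding (1 for a fault point, 0 for a normal point) and the returned structure are assumptions; the parameters correspond to X, Y, n, ws and s in the input list.

```python
def build_sequences(data, labels, start=0, window_size=64, step=1):
    """Cut fixed-length sequences and label each one as normal or fault.

    data:   list of samples (X); labels: per-point labels (Y), where
    1 marks a fault point and 0 a normal point (assumed encoding).
    A sequence containing at least one fault point is a fault sequence.
    """
    normal_sequences, fault_sequences = [], []
    for n in range(start, len(data) - window_size + 1, step):
        window = data[n:n + window_size]
        if any(labels[n:n + window_size]):
            fault_sequences.append(window)
        else:
            normal_sequences.append(window)
    return normal_sequences, fault_sequences


# Toy example: ten points with a single fault point at index 5.
toy_data = list(range(10))
toy_point_labels = [0, 0, 0, 0, 0, 1, 0, 0, 0, 0]
normal, fault = build_sequences(toy_data, toy_point_labels, window_size=4, step=1)
print(len(normal), len(fault))  # 3 4
```

With a step of 1, a single fault point appears in up to `window_size` consecutive sequences, so the fault class is represented by many overlapping samples.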

Experimental Environment and Model Structure
The operating system used in this paper is Windows 10, and the deep learning frameworks are TensorFlow and Keras. The hardware configuration is an Intel i5 processor, a GTX1650Ti graphics card and 8 GB of memory. The ASFDN network configuration proposed in this paper is shown in Tab. 1, where $K_c$ is the number of convolution kernels, $S_i$ is the size of the convolution kernel, and $P_w$ is the pooling width.
According to the network parameters shown in Tab. 1, the ASFDN in this paper for aero-engine surge fault diagnosis is shown in Fig. 2.

Evaluation Metric
In general, the confusion matrix visualizes the performance of a classification algorithm in tabular form, as shown in Tab. 2. In Tab. 2, with "normal" taken as the positive class, true positive (TP), false negative (FN), false positive (FP) and true negative (TN) are defined as follows:
1. True positive (TP) is the total number of samples predicted as "normal" and actually "normal";
2. False negative (FN) is the total number of samples predicted as "fault" and actually "normal";
3. False positive (FP) is the total number of samples predicted as "normal" and actually "fault";
4. True negative (TN) is the total number of samples predicted as "fault" and actually "fault".
To evaluate the performance of the classification model, this paper uses precision, recall and F1 score as evaluation metrics, all of which can be calculated from the four counts in the confusion matrix. They are calculated by Eqs. (8)-(10), respectively:
$$\text{Precision} = \frac{TP}{TP + FP} \tag{8}$$
$$\text{Recall} = \frac{TP}{TP + FN} \tag{9}$$
$$F1 = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{10}$$
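The three metrics can be computed directly from the confusion-matrix counts, as in the sketch below. The function names and the example counts are hypothetical; only the precision and recall values in the final check (99.6% and 94.7%) are the figures reported for ASFDN in this paper.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1 score from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = f1_from(precision, recall) if precision + recall else 0.0
    return precision, recall, f1


def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, for illustration only.
p, r, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")

# Consistency check on the reported ASFDN precision and recall.
print(round(f1_from(0.996, 0.947), 3))  # 0.971
```

The last line reproduces the reported F1 score of 97.1% from the reported precision and recall, so the three figures are mutually consistent.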

Experiment Results
On the ASFDN model built in this paper, five optimization algorithms (SGD, RMSprop, Adagrad, Adadelta and Adam) are used to optimize the network. The experimental results are shown in Fig. 3, which presents the effect of the different optimization algorithms on the validation set. The left picture shows the curve of the loss value (loss), and the right picture shows the curve of the accuracy value (accuracy). The loss of the ASFDN model decreases rapidly and then stabilizes, while the accuracy rises rapidly to a stable level, which shows the effectiveness of the model for the classification of aero-engine surge faults. Among the five algorithms, Adam gives the lowest loss and the highest accuracy after the model converges.
Figure 2: The structure of ASFDN
Next, the experiment also uses the five optimization algorithms for testing. In Fig. 4, the left picture is the test result for the fault sequences, and the right picture is the test result for the normal sequences. Since this paper mainly studies the classification of aero-engine surge faults, it focuses more on fault identification. It can be seen from Fig. 4 that for the fault sequences, the accuracy and F1 score are highest when Adam is used. For the normal sequences, the 5 optimization algorithms perform comparably on the 3 metrics.
The experimental results show that the ASFDN model achieves its best results with the Adam optimizer. ASFDN is then compared mainly with the RNN and its improved variants, which are commonly used in sequential data processing. It can be seen from Tab. 3 that the improved RNN variants outperform the simple RNN, but the ASFDN model proposed in this paper outperforms all the other experimental models on every evaluation metric. On the test set, its F1 score is 97.1%, its precision is 99.6%, and its recall is 94.7%. These comparative experiments demonstrate the effectiveness of ASFDN in the diagnosis of aero-engine surge faults, with outstanding results on this two-class classification task.

Conclusion
The ASFDN model proposed in this paper is an adaptive fault diagnosis algorithm. The experimental data set is obtained from aero-engine sensor data, and the ASFDN model is verified and tested on this data set from several aspects. The experimental results show that the ASFDN model does not require manual feature extraction and can directly use the original vibration signal as the model input to realize aero-engine surge fault diagnosis with good results. The method shows strong generalization ability and robustness, and achieves higher accuracy than the other methods in this experiment. In addition, this paper shows that ASFDN can be applied well to the time series analysis of sensor data. The experiments also show that there is still room to optimize and improve the ASFDN model. Future research can consider combining the better-performing ASFDN with LSTM to improve classification accuracy and model generalization ability.
Conflicts of Interest: The authors declare that they have no conflicts of interest to report regarding the present study.