1. Introduction
The turbofan engine is a highly complex, high-precision thermal machine and the “heart” of the aircraft. About 60% of all aircraft faults are related to the turbofan engine [1]. Predicting the remaining useful life (RUL) of the turbofan engine provides an important basis for predictive and preventive maintenance. In recent years, the rapid development of machine learning and deep learning has greatly improved the intelligence and operating efficiency of turbofan engines. However, as large-scale precision equipment, the engine operates under the combined influence of internal factors, such as physical and electrical characteristics, and external factors, such as temperature and humidity [2]. Its performance degradation also shows temporal and spatial characteristics [3], which provide the necessary data support for RUL prediction but also make it challenging. At the same time, the data generated during engine operation are nonlinear [4], time-varying [5], large in scale, and high-dimensional [6]. As a result, effective features are hard to extract, and the nonlinear relationship between the extracted features and the RUL is hard to map. These are the key problems that urgently need to be solved.
Many models have been developed to predict the RUL of turbofan engines. Ahmadzadeh et al. [7] divided the prediction methods into four categories: experimental, physics-based, data-driven, and hybrid. Experimental methods rely on prior knowledge and historical data, but the operating conditions and environment of the equipment are uncertain, which leads to large prediction errors and makes these methods hard to apply in complex scenarios. Physics-based methods use the physical and electrical characteristics of the equipment to construct accurate mathematical equations that describe its degradation law and predict its remaining life. A physical model is usually difficult to obtain for large precision equipment such as a turbofan engine, which restricts the application of this approach. Data-driven methods are independent of the failure mechanism of the equipment; their key step is to monitor and extract effective performance degradation data. They lack an analysis of the uncertainty of the predicted results and need a large amount of historical data to build a high-precision model. Hybrid methods combine two or more neural network models and have become the mainstream research trend. Among them, the hybrid model composed of CNN [8,9,10] and LSTM is the most common in turbofan engine RUL prediction. CNN has a strong feature extraction ability: it can not only extract local abstract features but also process data with multiple working conditions and multiple faults [11,12,13]. In particular, a one-dimensional CNN is well suited to analyzing time series generated by sensors (such as gyroscope or accelerometer data [14,15,16]) and signals with a fixed-length period (such as audio signals). Zhang et al. [17] adopted a fully convolutional neural network for feature self-learning with fewer training parameters; a weighted average method was used to denoise the prediction results, and a bearing accelerated-life experiment verified the effectiveness of the proposed method. Yang et al. [18] proposed an intelligent RUL prediction method based on a dual-CNN architecture: the first CNN determines the initial failure point, and the second CNN predicts the RUL. This method does not require a separate feature extractor; it accepts the original vibration signal directly and retains as much useful information as possible. The prediction results and evaluation indicators demonstrate the effectiveness and superiority of the method. Hsu [19] applied several deep learning methods to assess the status of aircraft engines in operation and to classify the stages of operational degradation in order to predict the remaining functional lifespan of components. Li et al. [20] designed a new data-driven method that uses a deep convolutional neural network (DCNN) for prediction; time windows are used for sample preparation to better extract features, and experiments on the C-MAPSS data set confirmed the effectiveness of the method. In fact, recent studies have used CNN to extract features from the turbofan engine data set (C-MAPSS) and achieved good results: a one-dimensional CNN can extract features from the sensor data, and a fully convolutional layer can reduce the number of training parameters and weights.
The long short-term memory network (LSTM) is a kind of recurrent neural network (RNN) that mitigates the vanishing and exploding gradient problems of the standard RNN. Its special "three gate structure" enables it to capture long-range dependencies and process time series data. RUL prediction for turbofan engines requires processing time series data, and LSTM can both obtain the optimal features of the time series generated by the engine and mine the patterns in them. Zhang et al. [21] applied an LSTM-based method designed to discover the underlying patterns embedded in the time series, so as to track the degradation of system performance and thereby predict the RUL. Kong et al. [22] used polynomial regression to obtain health indicators and then combined them with CNN and LSTM neural networks to extract spatiotemporal features. Song et al. [23] proposed a hybrid health prediction model that combines the advantages of the autoencoder neural network and the bidirectional long short-term memory (BLSTM) neural network: the autoencoder serves as the feature extraction tool, and the BLSTM captures the bidirectional long-range dependencies of the features. The above methods were all tested on the C-MAPSS data set to verify their effectiveness and accuracy; however, they share common problems such as a complex training process and low prediction accuracy.
We found that the turbofan engine data set is composed of multiple time series and that the data in different subsets contain different noise levels. It is therefore necessary to normalize the original data, which reduces the influence of noise and centers the data to enhance the generalization ability of the model. At the same time, it is difficult to capture multi-fault-mode and multi-dimensional feature data in different operating environments, so data from multiple scenes and multiple time points are needed to extract effective features and improve prediction accuracy. Traditional methods cannot extract temporal and spatial features simultaneously and fuse them effectively. In addition, a single neural network model has difficulty extracting enough effective information when faced with multiple working conditions and multiple types of features.
The main contributions of this paper are as follows: (1) LSTM is used to extract the temporal characteristics of the data sequence and to learn how to model the sequence according to the target RUL so as to provide accurate results. (2) A one-dimensional full-convolutional-layer neural network is adopted to extract spatial features, and dimensionality reduction greatly reduces the number of parameters and the computational complexity of the training process. (3) The spatiotemporal features extracted by the two models are fused and used as the input of a one-dimensional convolutional neural network for secondary processing. Compared with other mainstream RUL prediction methods, the proposed method achieves a better score and lower error, which demonstrates its feasibility and effectiveness.
The rest of this article is arranged as follows. Section 2 presents the basic theory, mainly the neural network model structures and the evaluation indicators. Section 3 is the focus of this article and covers the proposed model structure, algorithm, training process, and implementation flow. Section 4 presents the experiments and the analysis of the results, and the last section gives the summary and outlook.
4. Experiments and Analysis
This section first introduces the C-MAPSS data set in detail. Second, the data are preprocessed, and the proposed prediction model is tested and verified on the data set. Then, the parameter settings are tuned by training the model. Finally, the experimental results are compared with those of other methods.
4.1. C-MAPSS Data Set
This paper uses the NASA C-MAPSS turbofan engine degradation data set (https://ti.arc.nasa.gov/tech/dash/groups/pcoe/prognostic-data-repository/) [33], which was derived from the C-MAPSS database created by NASA Army Research Lab. The main control system consists of a fan controller, a regulator, and a limiter. The fan maintains normal operation under the flight conditions by sending air into the inner and outer ducts, as shown in Figure 8. A low-pressure compressor (LPC) and a high-pressure compressor (HPC) supply compressed high-temperature, high-pressure gas to the combustor. The low-pressure turbine (LPT) can decelerate and pressurize air to improve the chemical energy conversion efficiency of aviation kerosene. The high-pressure turbine (HPT) generates mechanical energy when high-temperature, high-pressure gas strikes the turbine blades. The low-pressure rotor (N1), the high-pressure rotor (N2), and the nozzle guarantee the combustion efficiency of the engine.
The C-MAPSS database contains four data subsets (FD001-FD004), each made up of multiple time series, with complexity increasing cumulatively across the subsets. Each subset includes a training set and a test set, and the number of engines varies between subsets. Each engine starts with a different degree of initial wear, which is considered normal. Three operating settings have a significant impact on engine performance. The engine works normally at the beginning of each time series and fails at the end of it. In the training set, the fault grows until the system fails; in the test set, the time series ends at some time before the system fails [35]. In each time series, 21 sensor parameters and 3 other parameters describe the running state of the turbofan engine. The data set is provided as a compressed text file. Each row is a snapshot of the data taken during a single operating cycle, and each column is a different variable. The specific contents are shown in Table 1 and Table 2.
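The row-and-column layout described above can be read with a few lines of code. The following is a minimal sketch, not the authors' loader. It assumes the commonly documented column order (engine unit number, cycle index, 3 operating settings, 21 sensor readings, separated by whitespace), which should be checked against Table 1 and Table 2. The function name `parse_cmapss` and the sample rows are hypothetical.

```python
from collections import defaultdict

# Assumed column layout (to be checked against Table 1 and Table 2):
# unit id, cycle, 3 operating settings, 21 sensor readings = 26 columns.
N_SETTINGS = 3
N_SENSORS = 21
N_COLUMNS = 2 + N_SETTINGS + N_SENSORS


def parse_cmapss(lines):
    """Group whitespace-separated rows into per-engine lists of snapshots."""
    engines = defaultdict(list)
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != N_COLUMNS:
            raise ValueError(f"expected {N_COLUMNS} columns, got {len(fields)}")
        unit, cycle = int(fields[0]), int(fields[1])
        settings = [float(x) for x in fields[2:2 + N_SETTINGS]]
        sensors = [float(x) for x in fields[2 + N_SETTINGS:]]
        engines[unit].append(
            {"cycle": cycle, "settings": settings, "sensors": sensors}
        )
    return dict(engines)


# Synthetic rows for illustration only; these are not real C-MAPSS values.
sample = [
    "1 1 " + " ".join(["0.0"] * 24),
    "1 2 " + " ".join(["0.5"] * 24),
    "2 1 " + " ".join(["1.0"] * 24),
]
engines = parse_cmapss(sample)
```

Grouping the rows by engine keeps each run-to-failure trajectory together, which is what the later windowing and labeling steps operate on.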
According to the needs of the experiment, this paper adopts the data subsets FD001 and FD003 for model verification. The specific description of these subsets is shown in Table 3.
In this table, the training set contains data for the entire engine life cycle, while each trajectory in the test set terminates at some point before failure. FD001 and FD003 were both simulated under the same (sea level) condition. FD001 was simulated with HPC degradation only, and FD003 was simulated with two fault modes: HPC degradation and fan degradation. The number of sensors and the types of operating parameters are the same for all four data subsets (FD001-FD004). The subsets FD001 and FD003 contain the actual RUL values, so the performance of the model can be judged by comparing the actual and predicted values. The goal of the experiment is to predict, for each engine in the test set, the number of running cycles remaining before failure, namely the RUL.
4.2. Data Preprocessing
In the training stage, the original turbofan engine data must be preprocessed before they are fed into the model to obtain the model parameters. Preprocessing includes feature selection, data standardization and normalization, setting the size of the sliding window, and setting the RUL labels of the training and test sets. The FD001 data set contains 21 sensor features and 3 operating parameters (flight altitude, Mach number, and throttle resolver angle). The number of running cycles is also used as a feature, giving a total of 25 features. To keep the model input and output consistent and to make the results on different data sets comparable, the feature selection for FD003 is the same as for FD001. Because multiple sensors produce features on different scales, the normalization method of Formula (14) is adopted to eliminate the influence of the different scales on the prediction results. The input is a 2D matrix whose dimensions are the size of the sliding window and the number of selected features. To keep the input and output sizes of the FD001 and FD003 subsets constant and to speed up data processing, this paper uses a larger window to capture more detailed features. The sliding window size and the number of features are set to 50 and 25, respectively.
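The normalization and windowing steps can be illustrated as follows. This is a sketch under stated assumptions rather than the paper's implementation: Formula (14) is not reproduced in this section, so min-max scaling to [0, 1] is assumed here as a stand-in, and a window stride of 1 is also an assumption. The window size of 50 and the 25 features come from the text; the function names and the synthetic trajectory are hypothetical.

```python
def min_max_normalize(rows):
    """Scale each feature column to [0, 1]; constant columns map to 0.0.

    Assumption: min-max scaling stands in for Formula (14).
    """
    n_features = len(rows[0])
    lows = [min(r[j] for r in rows) for j in range(n_features)]
    highs = [max(r[j] for r in rows) for j in range(n_features)]
    return [
        [
            (r[j] - lows[j]) / (highs[j] - lows[j]) if highs[j] > lows[j] else 0.0
            for j in range(n_features)
        ]
        for r in rows
    ]


def sliding_windows(rows, window=50):
    """Return every run of `window` consecutive rows (stride 1, assumed)."""
    return [rows[i:i + window] for i in range(len(rows) - window + 1)]


# Synthetic trajectory: 60 cycles x 25 features, for illustration only.
trajectory = [[float(t + j) for j in range(25)] for t in range(60)]
windows = sliding_windows(min_max_normalize(trajectory), window=50)
```

Each element of `windows` is one 50 x 25 input matrix. A trajectory shorter than the window yields no samples under this sketch, so short test trajectories would need separate handling such as padding.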
A neural network model needs a target output for each input sample. However, the state of the system at each time step and the exact target RUL depend on the physical model and are difficult to determine. Different solutions to this problem have been proposed. One solution is simply to set the target output to the actual time remaining before functional failure, but this implies that the state of the system declines linearly from the very beginning [36]. Another option is to derive the target output from an appropriate degradation model. Following the current literature, this paper adopts a piece-wise linear degradation model to determine the target RUL [37,38,39,40]. The piece-wise linear degradation model prevents the algorithm from overestimating the RUL. An engine can be considered healthy during its initial period; degradation becomes obvious only after the equipment has run for some time or been used to a certain extent, that is, near the end of its life. The RUL is therefore set to a constant value while the equipment is in its normal working state and decreases linearly with time after a certain point. This model limits the maximum value of the RUL, which is determined from the data observed in the degradation stage. In this experiment, the maximum RUL is set to 125, and any value above 125 is clipped to 125. Once the critical point is reached, the RUL decreases linearly as the number of running cycles increases. The piece-wise linear RUL target function is shown in Figure 9.
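The piece-wise linear target can be written in one line per engine. The sketch below assumes that cycles are numbered from 1 and that the RUL equals 0 at the last recorded cycle of a training trajectory; both conventions are assumptions, and the function name `piecewise_rul` is hypothetical. The cap of 125 comes from the text.

```python
def piecewise_rul(total_cycles, max_rul=125):
    """RUL label per cycle: constant at max_rul, then linear decrease to 0.

    Assumes cycles are numbered 1..total_cycles and the RUL is 0 at failure.
    """
    return [
        min(max_rul, total_cycles - cycle)
        for cycle in range(1, total_cycles + 1)
    ]


# Example: an engine that fails after 200 cycles (synthetic length).
labels = piecewise_rul(200)
```

For this 200-cycle example the label stays at 125 for the first 75 cycles and then falls by one per cycle until it reaches 0.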
4.3. Parameter Settings
Because the model parameters must be tuned during training, selecting appropriate values is important for the whole experiment. Each of the two data subsets, FD001 and FD003, is divided into a training set (85%) and a validation set (15%). All data sets are trained with the mini-batch method [35]. The main parameters in this experiment are optimized one at a time, starting from their most commonly used values. The candidate values are: number of epochs (40, 60, 80, 100), batch size (64, 128, 256, 512), and dropout rate (0.1, 0.2, 0.3, 0.4). During training, if the training error and the validation error do not decrease for five consecutive epochs, training is stopped early. The parameter results for FD001 are shown in Figure 10, and the parameter results for FD003 are shown in Figure 11.
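The tuning procedure can be sketched as a coordinate-wise search combined with an early-stopping rule. This is one reading of the "one at a time" optimization, not a confirmed description of the authors' procedure. The candidate values and the patience of five epochs come from the text; the functions `one_at_a_time_search` and `epochs_run`, the starting configuration, and the toy objective are hypothetical, and the early-stopping sketch monitors a single validation loss for simplicity.

```python
GRID = {
    "epochs": [40, 60, 80, 100],
    "batch_size": [64, 128, 256, 512],
    "dropout": [0.1, 0.2, 0.3, 0.4],
}


def one_at_a_time_search(evaluate, grid, start):
    """Tune each parameter in turn while holding the others fixed.

    `evaluate` maps a parameter dict to a validation error (lower is better).
    """
    best = dict(start)
    for name, values in grid.items():
        scores = {v: evaluate({**best, name: v}) for v in values}
        best[name] = min(scores, key=scores.get)
    return best


def epochs_run(val_losses, patience=5):
    """Return how many epochs run before early stopping triggers."""
    best, stale = float("inf"), 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best, stale = loss, 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    return len(val_losses)


# Toy objective standing in for "train the model and return validation RMSE".
def toy_error(p):
    return (abs(p["epochs"] - 60)
            + abs(p["batch_size"] - 256) / 64
            + abs(p["dropout"] - 0.2))


start = {"epochs": 40, "batch_size": 64, "dropout": 0.1}
best_params = one_at_a_time_search(toy_error, GRID, start)
stopped_at = epochs_run([5, 4, 3, 3, 3, 3, 3, 3, 2])
```

A one-at-a-time search needs 12 training runs for this grid instead of the 64 that an exhaustive search would need, at the cost of possibly missing interactions between parameters.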
After model training and a comparative analysis of the experimental results, the parameter settings that give the best model performance on the FD001 and FD003 data subsets are obtained, as shown in Table 4.
4.4. Experimental Results and Comparison
This section presents the prediction results of the proposed model and compares them with recent popular methods. With the same input data and the same preprocessing, the prediction results of a traditional convolutional neural network are compared with those of the 1-FCLCNN-LSTM model proposed in this paper. The traditional convolutional neural network consists of two convolutional layers, two pooling layers, and a fully connected layer. For the FD001 and FD003 data subsets, the training performance of the CNN and the 1-FCLCNN-LSTM model is compared on the same data set and the same engine. The training results of the two models on engines from FD001 and FD003 are shown in Figure 12 and Figure 13. The training curves of the two models on each data subset show that the RUL decreases as the time step increases until the engine finally fails. As the RUL decreases, the prediction accuracy rises and the predicted values move closer to the actual values; that is, the predictions are more accurate when the RUL is small and the engine is close to a potential fault. In this paper, the RMSE, defined in Formula (12), is used to measure the training performance on the FD001 and FD003 training sets. The comparison results are shown in Table 5.
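For reference, the RMSE can be computed as below. Formula (12) is not reproduced in this section, so the sketch assumes the standard definition, the square root of the mean squared difference between predicted and actual RUL. The function name `rmse` and the sample values are hypothetical.

```python
import math


def rmse(predicted, actual):
    """Root mean square error between predicted and actual RUL values."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


# Synthetic values for illustration only.
perfect = rmse([112.0, 98.0, 69.0], [112.0, 98.0, 69.0])
offset = rmse([0.0, 0.0], [3.0, 3.0])
```

Because the errors are squared before averaging, a few large misses raise the RMSE more than many small ones.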
From Table 5 and the training curves of the two models on the different data sets, it can be concluded that the proposed 1-FCLCNN-LSTM performs better during training than the traditional single CNN. The RMSE of the 1-FCLCNN-LSTM model on the FD001 training set is 41% lower than that of the CNN model, and its RMSE on the FD003 training set is 46% lower. FD003 has two fault modes while FD001 has only one, and the larger reduction on FD003 indicates that the multi-network model has certain advantages in dealing with complex fault problems.
The test sets of FD001 and FD003 were input into the trained CNN and 1-FCLCNN-LSTM models to obtain the prediction results, which are shown in Figure 14 and Figure 15, respectively.
The RMSE, defined in Formula (12), is also used to measure performance on the FD001 and FD003 test sets. See Table 6 for details.
It can be seen from Table 6 that the training performance of a model directly affects its performance on the test set. The RMSE of the 1-FCLCNN-LSTM model on the FD001 test set is 35% lower than that of the CNN model, and its RMSE on the FD003 test set is 35.5% lower.
To measure the prediction performance of the model more comprehensively, this paper selects recent advanced RUL prediction methods and compares the errors of the various methods on the same data sets. The evaluation indicators are the RMSE and the score function; for both, lower values are better. The comparison results for the FD001 data set are shown in Table 7, and the comparison results for the FD003 data set are shown in Table 8.
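The score function is not reproduced in this section. The sketch below assumes the asymmetric exponential score that is widely used with C-MAPSS, in which late predictions (predicted RUL above the actual RUL) are penalized more heavily than early ones. The constants 13 and 10 belong to that commonly used definition and should be checked against the formula given in Section 2. The function name `rul_score` and the sample values are hypothetical.

```python
import math


def rul_score(predicted, actual):
    """Asymmetric exponential score (assumed definition); lower is better."""
    total = 0.0
    for p, a in zip(predicted, actual):
        d = p - a  # positive d means the prediction is late
        if d < 0:
            total += math.exp(-d / 13.0) - 1.0
        else:
            total += math.exp(d / 10.0) - 1.0
    return total


# Synthetic values for illustration only.
exact = rul_score([100.0], [100.0])
late = rul_score([110.0], [100.0])   # overestimates RUL by 10 cycles
early = rul_score([90.0], [100.0])   # underestimates RUL by 10 cycles
```

Under this definition an overestimate costs more than an underestimate of the same size, which reflects the higher risk of predicting failure too late. Unlike the RMSE, the score is a sum rather than a mean, so it grows with the number of test engines.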
The comparison with multiple models shows that the proposed model has the lowest score and RMSE values on both the FD001 and FD003 data sets. The RMSE of the 1-FCLCNN-LSTM model on FD001 is 11.4–36.6% lower than that of RF, DCNN, D-LSTM, and other traditional methods, and its RMSE on FD003 is 37.5–78% lower than that of GB, SVM, LSTMBS, and other traditional methods. These results are attributed to the multi-network structure and the parallel processing of feature information in the model, which extract RUL information effectively. Compared with currently popular multi-model methods such as Autoencoder-BLSTM, VAE-D2GAN, and HDNN, the RMSE on FD001 is 4–18% lower, and the RMSE on FD003 is 18–37.5% lower than that of HDNN, DCNN, Rulclipper, and other methods. Although these methods also use multi-model and multi-network structures, the 1-FCLCNN-LSTM model has an advantage in feature processing in the 1-FCLCNN path, and the fused data are processed by the 1D full-convolutional layer to obtain more accurate predictions. The score of the 1-FCLCNN-LSTM model on FD001 is 5% lower than that of LSTMBS, the best of the previous models, and its score on FD003 is 17.6% lower than that of DNN, the best of the previous models. This indicates that the proposed model improves prediction accuracy on the C-MAPSS data set without requiring expert or physical knowledge, which can support predictive maintenance of turbofan engines as a research direction in the health management of mechanical equipment.