Article

Health State Prediction Method Based on Multi-Featured Parameter Information Fusion

by Xiaojing Yin 1,*, Yao Rong 1, Lei Li 2, Weidong He 1, Ming Lv 1 and Shiqi Sun 1

1 Mechanical Engineering Program, School of Mechanical and Electrical Engineering, Changchun University of Technology, Changchun 130012, China
2 FAW Mold Manufacturing Co., Changchun 130015, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2024, 14(15), 6809; https://doi.org/10.3390/app14156809
Submission received: 14 June 2024 / Revised: 17 July 2024 / Accepted: 23 July 2024 / Published: 4 August 2024

Abstract

Predicting the health status of critical components is an important factor in making accurate maintenance decisions for rotating equipment. Because vibration signals contain a large amount of fault information, they can describe the health status of critical components accurately and are therefore widely used in health state prediction for rotating equipment. However, there are two major problems in predicting the health status of key components from vibration signals: (1) The working environment of rotating equipment is harsh, and if only one feature in the time or frequency domain is selected for fault analysis, it is susceptible to that environment and cannot completely reflect the fault information. (2) Vibration signals are unlabeled time-series data, which are difficult to convert accurately into the health status of key components. To solve these problems, this paper proposes a combined prediction model that integrates a bidirectional long short-term memory network (BiLSTM), a self-organizing map (SOM) neural network and particle swarm optimization (PSO). First, the SOM is used to fuse the fault characteristics of multiple vibration-signal features of key components to obtain a health indicator (HI) that reflects the health status of the rotating equipment, compensating for the vulnerability of single time- or frequency-domain features to environmental influences. Second, the K-means clustering method is employed to cluster the health indicator and determine the health state, which addresses the difficulty of determining the health of a component from unsupervised vibration-signal data. Finally, the PSO-optimized BiLSTM model is used to predict the health state of key components, and the bearing dataset from the IEEE PHM 2012 Data Challenge verifies the effectiveness and validity of the method.

1. Introduction

Rotating equipment offers high power-to-weight ratios, excellent energy conversion rates, compactness, smooth operation, low maintenance costs and a long service life. Such equipment, including centrifugal pumps, compressors and fans, is widely used in both emerging and traditional industrial fields such as petroleum, chemicals, mining, natural gas and metallurgy. These machines often need to operate for extended periods under high-temperature and high-pressure conditions, and such harsh working conditions inevitably shorten the service life of important parts. The failure of important parts not only has a significant negative impact on operational efficiency and cost control but can even seriously jeopardize the lives of staff. Therefore, to reduce the hazards of rotating equipment failures and improve the safe operation of the system over its full life cycle, it is becoming increasingly urgent to accurately obtain and predict the health status of rotating equipment [1].
When rotating equipment malfunctions or operates abnormally, the characteristics of its vibration signals change significantly. For example, an unbalanced rotor causes the system to produce periodic vibrations whose frequency and amplitude increase as the degree of unbalance increases. Similarly, misaligned shafts and couplings cause periodic vibrations that manifest themselves at the rotational frequency and its harmonic components. Loose components cause non-linear vibrations characterized by irregular variations in vibration amplitude and frequency. Wear of components such as bearings and gears introduces shock and random vibration components into the vibration signal, while cracks may lead to significant changes in the periodic and non-periodic components of the vibration signal of a structural member or rotating shaft. Vibration signals can therefore characterize the state of health of a component. In general, fault features can be extracted from vibration signals from two perspectives: the time domain (parameters such as mean, variance, kurtosis, waveform factor, etc.) and the frequency domain (center-of-gravity frequency, mean-square frequency, frequency variance, etc.) [2]. Wang et al. [3] extracted time-domain features from superimposed vibration signals as inputs to a GRU model for fault diagnosis. Although time-domain features of vibration signals can serve as a basis for condition monitoring and fault diagnosis of mechanical components, they are poorly resistant to interference in a strong-noise background and are prone to misjudgment. Therefore, it is sometimes necessary to extract features in the frequency domain: the signal is transformed from the time domain to the frequency domain by the Fourier transform to obtain its frequency information, from which the fault characteristics are determined and the fault features extracted. Wang et al. [4] extracted fault features from frequency-domain samples of gears by using sparse filtering, then used softmax regression as a classifier to categorize the learned fault types, and demonstrated the validity and effectiveness of the approach on a dataset. Although the above methods perform fault diagnosis to some extent by characterizing signal features, most of them use only the time domain or the frequency domain for fault feature extraction. However, different feature extraction perspectives reflect different health conditions, and there are both overlaps and complementarities between them. Therefore, multi-feature fusion technology is used to fuse multiple features and obtain more comprehensive information.
Currently, the main feature fusion methods include principal component analysis (PCA) [5], kernel principal component analysis (KPCA) [6], neighborhood component analysis (NCA) [7] and linear discriminant analysis (LDA) [8]. Stief et al. [9] fused vibration data from sensors by combining a two-stage Bayesian approach with PCA to diagnose electrical and mechanical faults in rotating equipment. Sun et al. [10] obtained the entropy of a sound signal by variational mode decomposition and then fused the entropy features by KPCA to eliminate the redundancy of high-dimensional feature information, achieving high diagnostic accuracy. These studies show that both PCA and KPCA can realize data fusion but require dimensionality reduction; however, the dimension of the data for health state prediction is fixed, and reducing it loses much useful information. Yaman et al. [11] selected sound-signal features by using the NCA method and used the selected features as inputs to a fault-diagnosis model. Zhou et al. [12] extracted Fisher features of vibration-signal features by using LDA, which can reduce the dimensionality of the features and improve the accuracy of fault identification. Although NCA and LDA do not require dimensionality reduction for feature fusion, they rely on labeled data for clustering and fusion.
Cluster analysis is a powerful statistical method for grouping observations in a dataset. Common clustering methods include K-means clustering [13], hierarchical clustering [14] and Gaussian mixture models (GMM) [15].
Among them, Yiakopoulos et al. [16] used a K-means clustering method for the automatic diagnosis of rolling bearing defects; their model does not require training data measured on a specific machine under bearing failure conditions. However, K-means has some drawbacks: it is sensitive to the initial centroid positions, and different initial values may lead to different clustering results; it is susceptible to outliers, which shift the centroid positions and affect the clustering accuracy; and it is only applicable to linearly separable, spherical data and is less effective for non-spherical and unevenly sized data distributions. Lin et al. [17] used an improved hierarchical clustering algorithm to reduce the computational effort of modeling every motor of a motor system separately. Hierarchical clustering provides a visual tree diagram, which helps in understanding the hierarchical structure of the data, but its computational complexity is high and it is not suitable for large-scale datasets. Wang et al. [18] analyzed the features extracted from large volumes of vehicle-charging data with a Gaussian mixture model to reduce the computational difficulties caused by the complexity of the data. Gaussian mixture models, on the other hand, provide more flexible probabilistic models to describe the distribution of data points, but their parameter estimation is more complex and their computational cost is higher.
Currently, health state prediction methods can be broadly categorized into three groups: analytical-model-based methods, expert-system-based methods and data-driven methods. Model-based methods represent the actual working process of mechanical equipment components by building mathematical or physical models. However, in practice it is very difficult to build accurate physical models of complex rotating equipment systems, which limits the scope of application of this approach. The advantage of expert-knowledge-based prediction over model-based prediction is that it can estimate future trends by predictive reasoning from known information without an accurate physical model; its main problem is that the expert system is susceptible to the incompleteness of the expert's experience and knowledge base, which leads to misjudgments, cannot reflect the failure process of mechanical components in real time, affects real-time prediction and leaves uncertainty in the inference results. Therefore, data-driven methods are widely used; they only need to analyze a large amount of historical data to provide decision support based on facts and evidence. Commonly used prediction algorithms mainly include recurrent neural networks (RNN) [19] and long short-term memory networks (LSTM) [20]. Recurrent neural networks are known for their ability to handle temporal prediction [21]. Zollanvari et al. [22] used RNN prediction models to predict transformer failures. However, RNNs have significant difficulty retaining long-term information and are prone to gradient vanishing and explosion. Long short-term memory (LSTM) [23] is a variant of the RNN that manages the accumulation of information by selectively forgetting past data and integrating new data, effectively solving the long-term dependency problem. Liu et al. [24] built an LSTM prediction model for diesel engine exhaust temperatures, using the predictions to infer in advance whether the diesel engine is in an abnormal working state.
Based on the above research, this paper proposes a health state prediction method for rotating equipment based on vibration signals; the technical flow chart is shown in Figure 1. The method addresses two main problems in predicting the health state of key components from vibration signals: (1) the working environment of rotating equipment is harsh, and a single vibration-signal feature is susceptible to that environment and cannot completely reflect the fault information; and (2) vibration signals are unlabeled time-series data, so they must be converted accurately into the health state of key components. First, features are extracted from the original vibration signal; second, the extracted features are fused by an SOM neural network to obtain the health indicator (HI); the health indicator is then clustered to obtain the health state of each stage. The optimized BiLSTM model is used to predict the health state, and the validity of the proposed method is verified on a bearing dataset. The method realizes the fusion of multiple feature parameters and obtains optimal health state prediction results. The paper is organized as follows: Section 2 describes the basic theory used in this paper; Section 3 describes the improved BiLSTM model; Section 4 presents the data description and preprocessing; Section 5 describes the experimental analysis; and Section 6 concludes the paper.

2. Methods

2.1. Bi-LSTM

LSTM is a variant of the RNN that mitigates the gradient explosion problem RNNs encounter when processing long time sequences. Its structure is shown in Figure 2.
It converts ordinary hidden nodes into storage units, which overcomes the difficulty of training RNN models. The key to the LSTM is the cell state and the various gate structures, including the forget gate, input gate and output gate. The LSTM filters out redundant information by adding information to and deleting information from the cell state C_t through these "gate" structures. The process of information transfer and updating is as follows:
i_t = \sigma(W_i \cdot [y_{t-1}, x_t] + b_i)
f_t = \sigma(W_f \cdot [y_{t-1}, x_t] + b_f)
o_t = \sigma(W_o \cdot [y_{t-1}, x_t] + b_o)
\tilde{C}_t = \tanh(W_C \cdot [y_{t-1}, x_t] + b_C)
C_t = f_t \times C_{t-1} + i_t \times \tilde{C}_t
y_t = o_t \times \tanh(C_t)
In the above equations, i_t is the input gate at time t, and \sigma is the nonlinear activation function. W_i is the weight matrix of the input gate, and b_i is its bias vector. y_{t-1} is the hidden state of the previous time step, which carries the information of that step, and x_t is the input at the current time step. Similarly, f_t and o_t represent the forget gate and output gate at time t, and the matrices W and vectors b are the corresponding gate weight matrices and bias terms. \tilde{C}_t is the candidate cell state, C_t is the new cell state, and y_t is the new hidden state. Although LSTM networks can deal with time series and achieve good predictions, their unidirectional structure can only process time-series data in one direction. This is a clear limitation for prediction, because effective prediction usually involves mining information from both past and future data to infer future trends, and a unidirectional LSTM cannot obtain any information from future inputs. In contrast, the bidirectional long short-term memory network (BiLSTM) considers both past and future information and better captures the interdependencies in the data; its structure is shown in Figure 3.
It includes an input layer, a forward LSTM layer, a backward LSTM layer, a hidden layer and an output layer. The forward layer parses the dynamic pattern of forward development in the sequence, while the backward layer analyzes the reverse trend of the sequence, so the BiLSTM synthesizes the data from both temporal directions at any given point. When the information from these two directions is fused at the output layer, the BiLSTM provides a more comprehensive and in-depth contextual understanding by synthesizing past and future information to make decisions.
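To make this structure concrete, the following is a minimal PyTorch sketch of a bidirectional LSTM regressor of the kind described above. It is an illustrative assumption rather than the authors' implementation; the class name BiLSTMRegressor, the hidden size of 32 and the nine-step input window are hypothetical choices.

```python
# Minimal sketch of a BiLSTM regressor (illustrative; not the authors' code).
import torch
import torch.nn as nn

class BiLSTMRegressor(nn.Module):
    def __init__(self, input_size=1, hidden_size=32, num_layers=1):
        super().__init__()
        # bidirectional=True adds the backward LSTM layer described above
        self.bilstm = nn.LSTM(input_size, hidden_size, num_layers,
                              batch_first=True, bidirectional=True)
        # forward and backward hidden states are concatenated (2 * hidden_size)
        self.fc = nn.Linear(2 * hidden_size, 1)

    def forward(self, x):
        # x: (batch, sequence_length, input_size)
        out, _ = self.bilstm(x)
        # use the fused forward/backward features of the last time step
        return self.fc(out[:, -1, :])

if __name__ == "__main__":
    model = BiLSTMRegressor()
    dummy = torch.randn(8, 9, 1)   # batch of 8 windows, 9 past HI values each
    print(model(dummy).shape)      # torch.Size([8, 1])
```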

2.2. SOM

The SOM is an unsupervised competitive learning algorithm that does not require pre-labeled data and learns the intrinsic structure of the data through a self-organizing process. Its structure, shown in Figure 4, consists mainly of an input layer and a map layer. The topology of the input data is preserved in the low-dimensional output layer by self-organized adjustment of the neuron weights [25]; the SOM can thus discretize an input of any dimension into a one- or two-dimensional discrete space. The algorithm is implemented in the following steps, with a minimal code sketch given after the final step:
1. Randomly initialize the network weights.
2. Let the input vibration-signal feature vector be X = [x_1, x_2, ..., x_n]^T. Calculate the Euclidean distance between each neuron in the topology layer and X:
d_j = \|X - w_j\| = \sqrt{\sum_{i=1}^{m} (x_i(t) - w_{ij}(t))^2}
where w_{ij} is the weight between input i and neuron j in the topology layer.
3. The neuron with the smallest distance is selected as the winning neuron C, and the set of its neighborhood neurons is determined.
4. Weight learning: correct the weights of the winning neuron C and its neighborhood neurons:
\Delta w_{ij} = w_{ij}(t+1) - w_{ij}(t) = \eta(t)(x_i(t) - w_{ij}(t))
where \eta(t) is the gain function.
5. Repeat steps 2–4 until the maximum number of training iterations is reached.
6. At each time point, calculate the distance between the input feature vector and the winning unit C, i.e., the minimum quantization error (MQE):
MQE = \|X - w_c\|
where w_c is the weight vector of the winning unit. The calculated MQE contains many burrs and perturbations, and using it directly as the HI would reduce prediction accuracy; therefore, this paper decomposes the MQE with db5 wavelet packets and uses the low-frequency trend term as the HI curve.
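The following pure-NumPy sketch illustrates steps 1–6 above on random data. The grid size, neighborhood function, learning-rate schedule and the helper name som_mqe are illustrative assumptions, not the authors' configuration.

```python
# Illustrative NumPy sketch of SOM training and per-sample MQE computation.
import numpy as np

def som_mqe(features, grid=(5, 5), n_iter=100, lr0=0.8, sigma0=2.0, seed=0):
    rng = np.random.default_rng(seed)
    n_samples, n_dim = features.shape
    # step 1: randomly initialize the weights of the map neurons
    weights = rng.random((grid[0] * grid[1], n_dim))
    coords = np.array([(i, j) for i in range(grid[0]) for j in range(grid[1])])

    for t in range(n_iter):
        lr = lr0 * (1 - t / n_iter)              # decaying gain function eta(t)
        sigma = sigma0 * (1 - t / n_iter) + 0.5  # shrinking neighborhood radius
        for x in features:
            # step 2: Euclidean distance between x and every map neuron
            d = np.linalg.norm(weights - x, axis=1)
            # step 3: winning neuron c and its neighborhood on the map grid
            c = np.argmin(d)
            grid_dist = np.linalg.norm(coords - coords[c], axis=1)
            h = np.exp(-grid_dist**2 / (2 * sigma**2))[:, None]
            # step 4: move the winner and its neighbors toward the input
            weights += lr * h * (x - weights)

    # step 6: minimum quantization error for each sample
    return np.array([np.min(np.linalg.norm(weights - x, axis=1)) for x in features])

if __name__ == "__main__":
    demo = np.random.default_rng(1).random((50, 15))  # 50 samples, 15-D features
    print(som_mqe(demo)[:5])
```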
The wavelet packet transform (WPT) is an analytical method that decomposes a signal into different frequencies and scales, providing a finer frequency resolution than the traditional wavelet transform. The db5 wavelet is the fifth wavelet basis of the Daubechies family. The transform is performed as follows (a minimal code sketch follows these steps):
  • Obtain the filter coefficients of the db5 wavelet: let the original signal be x[n], the low-pass filter coefficients of the db5 wavelet be h[k] and the high-pass filter coefficients be g[k].
    Low-frequency component:
    A[n] = \sum_k x[2n - k] h[k]
    High-frequency component:
    D[n] = \sum_k x[2n - k] g[k]
  • Recursive decomposition: Apply the low-pass and high-pass filters again to each new component A and D and continue the decomposition to obtain more detailed frequency components. Repeat the steps until a predetermined level of decomposition is reached.
  • Reconstructing the signal: wavelet packet reconstruction is the inverse of the decomposition process, in which the signal is recovered by applying reconstruction filters to the low- and high-frequency components.
Low-frequency reconstruction:
x[n] = \sum_k A[(n - k)/2] h[k]
High-frequency reconstruction:
x[n] = \sum_k D[(n - k)/2] g[k]
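As a rough illustration of this smoothing step, the sketch below uses the PyWavelets library to decompose a noisy series with db5 wavelet packets and keep only the low-frequency (approximation) node as the trend. The decomposition level of 3 and the function name wpt_trend are assumptions; the authors' exact settings are not stated.

```python
# Illustrative sketch: keep only the low-frequency wavelet-packet node as the trend.
import numpy as np
import pywt

def wpt_trend(signal, wavelet="db5", level=3):
    wp = pywt.WaveletPacket(data=signal, wavelet=wavelet,
                            mode="symmetric", maxlevel=level)
    approx_path = "a" * level                       # e.g. 'aaa' at level 3
    # zero out every detail node at the deepest level, keep the approximation
    for node in wp.get_level(level, order="natural"):
        if node.path != approx_path:
            node.data = np.zeros_like(node.data)
    trend = wp.reconstruct(update=False)
    return trend[: len(signal)]                     # truncate padding, if any

if __name__ == "__main__":
    t = np.linspace(0, 1, 512)
    noisy_mqe = np.exp(2 * t) + 0.3 * np.random.default_rng(0).standard_normal(512)
    print(wpt_trend(noisy_mqe)[:5])
```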

2.3. K-Means Clustering

The K-means clustering method [13] is centered on the concept of the centroid. Its algorithm proceeds as follows (a minimal code sketch follows these steps):
  • Initialize the centroids: divide the dataset into k classes and select k data points as the initial centroids.
  • Calculate the Euclidean distance from each data point to every centroid and assign each data point to the class of its nearest centroid.
      J_k = \sum_{j=1}^{k} \sum_{i=1}^{n_j} \|x_i - m_j\|^2
    where m_j denotes the location of the centroid of the j-th class. To optimize the clustering results, the criterion J_k should be minimized.
  • Update the centroids: recalculate the centroid of each class, then recompute the Euclidean distance from each data point to every centroid and reassign each data point to the class of its nearest centroid.
  • Repeat steps 2–3 until the algorithm converges and the class assignments of all samples no longer change.
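A minimal scikit-learn sketch of this procedure applied to a one-dimensional HI series is shown below. The three clusters mirror the healthy/normal-wear/failure stages used later in the paper; the synthetic HI values and variable names are hypothetical.

```python
# Illustrative K-means clustering of a 1-D health indicator into three states.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
hi = np.concatenate([rng.normal(0.1, 0.02, 200),   # healthy stage (synthetic)
                     rng.normal(0.4, 0.05, 263),   # normal-wear stage
                     rng.normal(1.0, 0.10, 39)])   # failure stage

km = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = km.fit_predict(hi.reshape(-1, 1))         # assign each HI point to a cluster

# relabel clusters 1..3 in order of increasing centroid so labels follow degradation
order = np.argsort(km.cluster_centers_.ravel())
state = np.empty_like(labels)
for rank, cluster in enumerate(order, start=1):
    state[labels == cluster] = rank
print(np.unique(state, return_counts=True))
```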

2.4. Particle Swarm Optimization Algorithm

Particle swarm optimization (PSO) [25] is an optimization algorithm based on swarm intelligence: it simulates the behavior of a group, such as a flock of birds or a school of fish, and searches for an optimal solution through collaboration and information sharing among individuals. The algorithm proceeds in the following steps, with a minimal code sketch given after them:
  • Population initialization: randomly initialize the position and velocity of the particles and calculate the fitness value of each particle; initialize the particle parameters such as the number, maximum number of iterations, and learning factor.
  • Velocity and position update: each particle has two key attributes: position and velocity. The position of a particle indicates the location of an unknown solution in space, and its velocity is the direction and step size of moving in space. The position of each particle is evaluated by the fitness function to guide the search for the particle. The update formula is as follows:
      v_i^{t+1} = \omega v_i^t + c_1 \gamma (x_{p,i}^t - x_i^t) + c_2 \eta (x_g^t - x_i^t)
      x_i^{t+1} = x_i^t + v_i^{t+1}
    where v_i^t is the velocity of particle i at iteration t, x_i^t is the position of the particle at iteration t, and x_{p,i}^t is the best individual position found so far by the particle. x_g^t is the best overall position of all the particles, and \omega is the inertia weight, which controls the search range of the particle. c_1 and c_2 are the individual and population balance factors, respectively, and t is the iteration number. \gamma and \eta are random numbers in [0, 1] that increase the randomness of the search. The particle velocity variation is shown in Figure 5.
  • Individual and population optimal solution update: in each iteration, each particle records its own position; if the updated individual position is better than before, it becomes the individual optimal position, and if the updated global position is better than before, it becomes the global optimal position.
  • Termination condition: when the maximum number of iterations is reached or the convergence criterion is met, the algorithm terminates and outputs the global optimal position and the optimal fitness value.
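The sketch below implements the velocity and position update equations above in NumPy for a simple sphere test function. The swarm size, inertia weight, acceleration coefficients and the test objective are illustrative assumptions.

```python
# Illustrative particle swarm optimization of a simple sphere function.
import numpy as np

def pso(objective, dim=2, n_particles=20, n_iter=100,
        w=0.7, c1=1.5, c2=1.5, bounds=(-5.0, 5.0), seed=0):
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    x = rng.uniform(lo, hi, (n_particles, dim))    # positions
    v = np.zeros_like(x)                           # velocities
    p_best = x.copy()                              # individual best positions
    p_val = np.array([objective(p) for p in x])
    g_best = p_best[np.argmin(p_val)].copy()       # global best position

    for _ in range(n_iter):
        gamma, eta = rng.random((2, n_particles, dim))
        # velocity and position updates (equations above)
        v = w * v + c1 * gamma * (p_best - x) + c2 * eta * (g_best - x)
        x = np.clip(x + v, lo, hi)
        vals = np.array([objective(p) for p in x])
        improved = vals < p_val                    # update individual bests
        p_best[improved], p_val[improved] = x[improved], vals[improved]
        g_best = p_best[np.argmin(p_val)].copy()   # update global best
    return g_best, p_val.min()

if __name__ == "__main__":
    best_x, best_f = pso(lambda p: np.sum(p**2))
    print(best_x, best_f)
```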

3. Improved BiLSTM Model Optimized by PSO

Hyperparameters are usually set based on previous experience, but a model needs different hyperparameters for different working requirements, and manual settings have great limitations [26]. Therefore, in this paper the particle swarm algorithm is used to optimize four key hyperparameters of the BiLSTM model: the number of neurons, the learning rate, the number of network layers and the L2 regularization coefficient. The flowchart is shown in Figure 6, and the optimization proceeds in the following steps; a minimal sketch of the fitness evaluation follows them.
Step 1: Set the particle swarm parameters and the BiLSTM network parameters; input the particle positions into the BiLSTM model and make predictions on the training set obtained after preprocessing.
Step 2: Calculate the fitness value of the initial particle swarm, update the individual and population optimal solutions of the particles, and judge whether the convergence condition is satisfied.
Step 3: If it is not satisfied, update the velocity and position of each particle, evaluate the fitness value of each particle again, and train on the training set again until the optimal solution is reached.
Step 4: If the condition is satisfied, output the optimized parameters, retrain the model and then make predictions.
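A minimal sketch of the fitness evaluation that such a loop would call is given below: a particle encodes the hidden-layer size, learning rate, number of layers and L2 coefficient, a BiLSTM is trained briefly on a sliding-window version of a synthetic HI series, and the validation RMSE is returned as the fitness. The window length of 9, the 30 training epochs and the use of Adam's weight_decay as the L2 term are assumptions for illustration, not the authors' exact procedure.

```python
# Illustrative fitness evaluation for PSO-based BiLSTM hyperparameter tuning.
import numpy as np
import torch
import torch.nn as nn

class BiLSTM(nn.Module):
    def __init__(self, hidden, layers):
        super().__init__()
        self.rnn = nn.LSTM(1, hidden, layers, batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * hidden, 1)
    def forward(self, x):
        out, _ = self.rnn(x)
        return self.fc(out[:, -1, :])

def windows(series, width=9):
    # turn a 1-D HI series into (samples, width, 1) windows and next-step targets
    X = np.stack([series[i:i + width] for i in range(len(series) - width)])
    y = series[width:]
    return (torch.tensor(X, dtype=torch.float32).unsqueeze(-1),
            torch.tensor(y, dtype=torch.float32).unsqueeze(-1))

def fitness(particle, X_tr, y_tr, X_va, y_va):
    # particle = [hidden neurons, learning rate, layers, L2 coefficient]
    hidden, lr, layers, l2 = int(particle[0]), particle[1], int(particle[2]), particle[3]
    model = BiLSTM(hidden, layers)
    opt = torch.optim.Adam(model.parameters(), lr=lr, weight_decay=l2)  # weight_decay as L2 term
    loss_fn = nn.MSELoss()
    for _ in range(30):                 # short training run for the sketch
        opt.zero_grad()
        loss = loss_fn(model(X_tr), y_tr)
        loss.backward()
        opt.step()
    with torch.no_grad():
        rmse = torch.sqrt(loss_fn(model(X_va), y_va)).item()
    return rmse                         # PSO minimizes this value

if __name__ == "__main__":
    hi = np.exp(np.linspace(0, 2, 300)) / np.exp(2)   # synthetic HI curve
    X, y = windows(hi)
    X_tr, y_tr, X_va, y_va = X[:200], y[:200], X[200:], y[200:]
    print(fitness(np.array([32, 0.01, 1, 1e-5]), X_tr, y_tr, X_va, y_va))
```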

4. Data Description and Pre-Processing

4.1. Data Descriptions

The experimental data were obtained from the PHM 2012 Challenge dataset, which was collected on the PRONOSTIA accelerated aging test bed [27]. The structure of the test rig, shown in Figure 7, includes a synchronous motor, double pulleys, a pneumatic jack and two DYTRAN 3035B accelerometers. The accelerometers were mounted in the horizontal and vertical positions of the bearing to acquire the vibration signals. The data from bearing 1_1 were chosen as the test dataset because the bearing exhibited significant degradation behavior. It was subjected to a run-to-failure test under a 4000 N radial load at a rotational speed of 1800 r/min; each record contains 2560 samples acquired at a sampling frequency of 25.6 kHz, with one record taken every 10 s. The signals from the horizontal accelerometer are used to predict the bearing's health state, and the publicly available Bearing1_1 dataset is truncated: the last 1004 sets of data, with 2560 data points in each set, are used as the experimental data.
The vibration signal of bearing 1_1 is shown in Figure 8. At the beginning, the vibration amplitude is low and steady, indicating that the bearing is in normal working condition. In the middle of the cycle, the amplitude of the signal waveform gradually increases, indicating that the bearing is starting to show initial wear; finally, the amplitude changes dramatically, indicating that the bearing has failed.

4.2. Data Preprocessing

4.2.1. Feature Extraction

In this experiment, 12 time-domain analysis methods and 3 frequency-domain analysis methods are selected to extract features from the dataset, and the corresponding feature parameters are obtained. Taking the peak value in the time domain and the center-of-gravity frequency in the frequency domain as examples, the extracted state feature parameters are shown in Figure 9. A minimal sketch of such feature computation is given below.
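The sketch below computes a few representative time-domain features (mean, RMS, peak, variance, kurtosis, skewness, crest factor) and the three frequency-domain features named earlier for a single 2560-point record. It is illustrative only; the paper's full set of 12 time-domain features is not reproduced, and the definitions follow common conventions.

```python
# Illustrative extraction of a few time- and frequency-domain features from one record.
import numpy as np
from scipy.stats import kurtosis, skew

def extract_features(x, fs=25600):
    feats = {}
    # time-domain examples
    feats["mean"] = np.mean(x)
    feats["rms"] = np.sqrt(np.mean(x**2))
    feats["peak"] = np.max(np.abs(x))
    feats["variance"] = np.var(x)
    feats["kurtosis"] = kurtosis(x)
    feats["skewness"] = skew(x)
    feats["crest_factor"] = feats["peak"] / feats["rms"]
    # frequency-domain examples (one-sided amplitude spectrum as weights)
    spec = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    feats["gravity_freq"] = np.sum(freqs * spec) / np.sum(spec)          # center-of-gravity frequency
    feats["mean_square_freq"] = np.sum(freqs**2 * spec) / np.sum(spec)   # mean-square frequency
    # one common definition of frequency variance: E[f^2] - (E[f])^2
    feats["freq_variance"] = feats["mean_square_freq"] - feats["gravity_freq"]**2
    return feats

if __name__ == "__main__":
    record = np.random.default_rng(0).standard_normal(2560)  # one 2560-point record
    for name, value in extract_features(record).items():
        print(f"{name}: {value:.4f}")
```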

4.2.2. Feature Fusion

A total of 15 dimensions of fault feature data are obtained from the time and frequency domains and used as inputs to the SOM for feature fusion. The specific steps of the network setup are as follows:
  • The selected degradation-sensitive features are normalized.
  • Training and test sets: the first 502 sets of data are taken as the training set and the last 502 sets as the test set.
  • The SOM network is trained with an input dimension of 15, an output dimension of 25, a learning rate of 0.80 and 100 training iterations, and the MQE between the input feature vector and the winning neuron is computed for each sampling point. The test-set features are then input into the trained SOM network and their MQE is calculated. The MQE contains many burrs, which would affect the accuracy of the prediction, so it is decomposed with db5 wavelet packets and the low-frequency trend term is used as the HI. The result is shown in Figure 10.

4.2.3. Health Status

The HI curve represents the health state of the bearing, and clustering the stages of the HI curve yields the health state of the bearing at each stage. This study clusters the health indicator (HI) curve with the K-means method; the results are shown in Figure 11a. On the HI curve, data points 0–200 correspond to the healthy state and are labeled 1, data points 200–463 correspond to the normal-wear state and are labeled 2, and data points 463–502 correspond to the failure state and are labeled 3. To verify whether the health states obtained by this clustering match the real health state of the bearing in the dataset, the HI data points are mapped back to the original vibration signal: data points 463–502 correspond to the original vibration-signal data points 2,470,403–2,570,243. The failure-state cluster in Figure 11b is consistent with the failure segment of the original vibration signal, proving that the state clustering method is effective and can obtain the bearing's health status.

5. Experimental Verification

5.1. Predictive Modeling Based on PSO-BiLSTM

In this case study, the bidirectional long short-term memory network (BiLSTM) is chosen as the prediction model. The hyperparameters are important factors affecting the prediction accuracy of the model, so they must be optimized to obtain more accurate prediction results. First, the prediction model parameters are initialized: the model has 9 input neurons and 1 output neuron; the number of hidden layers is set to 1, and the number of neurons in the hidden layer is set to 32. The number of iterations, the batch size and the learning rate are 1000, 50 and 0.01, respectively. The loss is optimized with the Adam algorithm, and the dropout technique is used to prevent overfitting [24]. The data are divided into training and test sets in a ratio of 7:3. Table 1 lists the parameters of the BiLSTM model before and after PSO optimization.

5.2. Experimental Research

Once the hyperparameters of the model have been determined, the prediction of the health state of the bearing can begin. The results are shown in Figure 12: the predicted values of the health state show some minor fluctuations compared with the actual values, but the overall trend is the same. Therefore, the method can accurately predict the health state of the bearing.
Figure 13 illustrates the prediction results of the traditional prediction methods and of the combined model proposed in this paper, and Figure 14 summarizes the corresponding values of the evaluation metrics for health state prediction; a short sketch of how these metrics are computed is given below. From the comparison in Figure 13, the RNN deviates most from the actual values, with strong oscillation amplitudes at each stage; the LSTM and BiLSTM also oscillate at each stage, but with gradually decreasing amplitudes compared with the RNN. The method proposed in this paper mostly converges to the stable stage, with only a small portion showing smaller oscillations, so the optimized BiLSTM can accurately predict the health state of the bearing. According to the evaluation metrics in Figure 14, across the four methods the root mean square error (RMSE) decreases progressively and is lowest for the combined model proposed in this paper, while the R2 coefficient increases progressively and is highest for the proposed method. The prediction accuracy of the proposed combined model is therefore better than that of the other three methods.
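For reference, a short sketch of how the RMSE and R2 metrics can be computed with scikit-learn is shown below; the arrays are placeholders, not the paper's results.

```python
# Illustrative computation of the RMSE and R^2 evaluation metrics.
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

actual = np.array([1, 1, 2, 2, 2, 3, 3], dtype=float)      # true health states (placeholder)
predicted = np.array([1.1, 0.9, 2.0, 2.2, 1.9, 2.8, 3.1])  # model outputs (placeholder)

rmse = np.sqrt(mean_squared_error(actual, predicted))
r2 = r2_score(actual, predicted)
print(f"RMSE = {rmse:.4f}, R2 = {r2:.4f}")
```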

6. Conclusions

In this study, a novel deep learning method for predicting the health status of key components of rotating equipment is proposed. A new combined model is constructed by combining vibration-signal processing with SOM, BiLSTM and PSO networks. The combined model is tested on the bearing dataset provided by the IEEE PHM 2012 Data Challenge: the health state of the bearings is predicted with RNN, LSTM, BiLSTM and the proposed combined model, and the results are compared. The experimental results show that the method outperforms the existing algorithms; when the input data contain 1004 sets, the predicted RMSE on the dataset is 0.047% and the R2 coefficient is 0.954. Therefore, the proposed method can predict the health state of the key components of rotating equipment well. However, there are still some deficiencies in its application. This paper uses only the fault data of one sensor; some of the fault features are sensitive to the degradation of the bearings, while others may not be, so the fault features selected for fusion should be optimized. On the other hand, a new health indicator could be constructed to characterize the health state of the bearing; a suitable health indicator would allow quantitative analysis of the health state prediction results. Further research is needed to obtain more accurate and robust health state prediction results.

Author Contributions

Conceptualization, X.Y. and Y.R.; writing—original draft preparation, Y.R., L.L., W.H., M.L. and S.S.; writing—review and editing, X.Y. and L.L. All authors have read and agreed to the published version of the manuscript.

Funding

This study was supported by the Science and Technology Department of Jilin Province, Project Number 20230201095GX.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Conflicts of Interest

Author L.L. is an employee of FAW Mold Manufacturing Co.; the other authors declare no conflicts of interest.

References

  1. Chen, Y.; Rao, M.; Feng, K.; Zuo, M.J. Physics-Informed LSTM hyperparameters selection for gearbox fault detection. Mech. Syst. Signal Process. 2022, 171, 108907. [Google Scholar] [CrossRef]
  2. Yu, X.; Ding, E.; Chen, C.; Liu, X.; Li, L. A novel characteristic frequency bands extraction method for automatic bearing fault diagnosis based on Hilbert Huang transform. Sensors 2015, 15, 27869–27893. [Google Scholar] [CrossRef] [PubMed]
  3. Wang, C.; Peng, Z.; Liu, R.; Chen, C. Research on multi-fault diagnosis method based on time domain features of vibration signals. Sensors 2022, 22, 8164. [Google Scholar] [CrossRef] [PubMed]
  4. Wang, J.; Li, S.; Xin, Y.; An, Z. Gear fault intelligent diagnosis based on frequency-domain feature extraction. J. Vib. Eng. Technol. 2019, 7, 159–166. [Google Scholar] [CrossRef]
  5. Jiang, W.; Xie, C.; Zhuang, M.; Shou, Y.; Tang, Y. Sensor data fusion with z-numbers and its application in fault diagnosis. Sensors 2016, 16, 1509. [Google Scholar] [CrossRef] [PubMed]
  6. Yin, Y.; Liu, F.; Zhou, X.; Li, Q. An efficient data compression model based on spatial clustering and principal component analysis in wireless sensor networks. Sensors 2015, 15, 19443–19465. [Google Scholar] [CrossRef] [PubMed]
  7. Widodo, A.; Yang, B.S. Application of nonlinear feature extraction and support vector machines for fault diagnosis of induction motors. Expert Syst. Appl. 2007, 33, 241–250. [Google Scholar] [CrossRef]
  8. Zhou, H.; Chen, J.; Dong, G.; Wang, H.; Yuan, H. Bearing fault recognition method based on neighbourhood component analysis and coupled hidden Markov model. Mech. Syst. Signal Process. 2016, 66, 568–581. [Google Scholar] [CrossRef]
  9. Stief, A.; Ottewill, J.R.; Baranowski, J.; Orkisz, M. A PCA and two-stage Bayesian sensor fusion approach for diagnosing electrical and mechanical faults in induction motors. IEEE Trans. Ind. Electron. 2019, 66, 9510–9520. [Google Scholar] [CrossRef]
  10. Sun, Y.; Cao, Y.; Li, P.; Su, S. Entropy feature fusion-based diagnosis for railway point machines using vibration signals based on kernel principal component analysis and support vector machine. IEEE Intell. Transp. Syst. Mag. 2023, 15, 96–108. [Google Scholar] [CrossRef]
  11. Yaman, O. An automated faults classification method based on binary pattern and neighborhood component analysis using induction motor. Measurement 2021, 168, 108323. [Google Scholar] [CrossRef]
  12. Zhou, Y.; Yan, S.; Ren, Y.; Liu, S. Rolling bearing fault diagnosis using transient-extracting transform and linear discriminant analysis. Measurement 2021, 178, 109298. [Google Scholar] [CrossRef]
  13. Wang, Q.; Liu, J.; Wei, B.; Chen, W.; Xu, S. Investigating the construction, training, and verification methods of k-means clustering fault recognition model for rotating machinery. IEEE Access 2020, 8, 196515–196528. [Google Scholar] [CrossRef]
  14. Liu, Y.; Ge, Z. Weighted random forests for fault classification in industrial processes with hierarchical clustering model selection. J. Process Control 2018, 64, 62–70. [Google Scholar] [CrossRef]
  15. Guo, Y.; Chen, H. Fault diagnosis of VRF air-conditioning system based on improved Gaussian mixture model with PCA approach. Int. J. Refrig. 2020, 118, 1–11. [Google Scholar] [CrossRef]
  16. Yiakopoulos, C.T.; Gryllias, K.C.; Antoniadis, I.A. Rolling element bearing fault detection in industrial environments based on a K-means clustering approach. Expert Syst. Appl. 2011, 38, 2888–2911. [Google Scholar] [CrossRef]
  17. Lin, X.; Wu, H.; Ye, B.; Xu, B.; Tao, W. Dynamic equivalent modeling of motors based on improved hierarchical clustering algorithm. J. Electr. Eng. Technol. 2019, 14, 1139–1150. [Google Scholar] [CrossRef]
  18. Wang, S.; Wang, Z.; Cheng, X.; Zhang, Z. A double-layer fault diagnosis strategy for electric vehicle batteries based on Gaussian mixture model. Energy 2023, 281, 128318. [Google Scholar] [CrossRef]
  19. Zhang, L.; Zhang, Z.; Peng, H. Diagnostic Method for Short Circuit Faults at the Generator End of Ship Power Systems Based on MWDN and Deep-Gated RNN-FCN. J. Mar. Sci. Eng. 2023, 11, 1806. [Google Scholar] [CrossRef]
  20. Zhang, J.; Feng, Y.; Zhang, J.; Li, Y. The Short Time Prediction of the Dst Index Based on the Long-Short Time Memory and Empirical Mode Decomposition–Long-Short Time Memory Models. Appl. Sci. 2023, 13, 11824. [Google Scholar] [CrossRef]
  21. Ahmad, T.; Zhang, D. A data-driven deep sequence-to-sequence long-short memory method along with a gated recurrent neural network for wind power forecasting. Energy 2022, 239, 122109. [Google Scholar] [CrossRef]
  22. Zollanvari, A.; Kunanbayev, K.; Bitaghsir, S.A.; Bagheri, M. Transformer fault prognosis using deep recurrent neural network over vibration signals. IEEE Trans. Instrum. Meas. 2020, 70, 2502011. [Google Scholar] [CrossRef]
  23. Jiang, T.; Liu, Y. A short-term wind power prediction approach based on ensemble empirical mode decomposition and improved long short-term memory. Comput. Electr. Eng. 2023, 110, 108830. [Google Scholar] [CrossRef]
  24. Liu, Y.; Gan, H.; Cong, Y.; Hu, G. Research on fault prediction of marine diesel engine based on attention-LSTM. Proc. Inst. Mech. Eng. Part M J. Eng. Marit. Environ. 2023, 237, 508–519. [Google Scholar] [CrossRef]
  25. Qiang, Z.; Jieying, G.; Junming, L.; Ying, T.; Shilei, Z. Gearbox fault diagnosis using data fusion based on self-organizing map neural network. Int. J. Distrib. Sens. Netw. 2020, 16, 1550147720923476. [Google Scholar] [CrossRef]
  26. Peng, X.; Zhang, B. Ship motion attitude prediction based on EMD-PSO-LSTM integrated model. J. Coast. Isl. Technol. 2019, 27, 421–426. [Google Scholar]
  27. Nectoux, P.; Gouriveau, R.; Medjaher, K.; Ramasso, E.; Chebel-Morello, B.; Zerhouni, N.; Varnier, C. An experimental platform for bearings accelerated degradation tests. In Proceedings of the IEEE International Conference on Prognostics and Health Management, PHM’12, IEEE Catalog Number: CPF12PHM-CDR, Spokane, WA, USA, 17–19 June 2012; pp. 1–8. [Google Scholar]
Figure 1. Sketch map of the proposed method.
Figure 2. Structure of the LSTM network.
Figure 3. Structure of the BiLSTM network.
Figure 4. Structure of the SOM network.
Figure 5. Particle velocity variation graph.
Figure 6. Flowchart of the improved BiLSTM model.
Figure 7. Bearing-accelerated degradation test rig.
Figure 8. The vibration signal of bearing 1_1.
Figure 9. (a) Gravity frequency; (b) peak frequency.
Figure 10. HI curve.
Figure 11. (a) Health status; (b) failure data.
Figure 12. Prediction of bearing 1_1 health status using this method.
Figure 13. Comparison of the results of four health state prediction methods.
Figure 14. Evaluation metrics for different methods of health state prediction.
Table 1. Parameter setting.

Parameter Name                     | Starting Value | Optimal Value
Training optimizer                 | Adam           | -
Maximum number of iteration rounds | 1000           | -
Batch size                         | 50             | -
Initial learning rate              | 0.01           | 0.0748
Dropout rate                       | 0.2            | -
Hidden layer neurons               | 50             | 156
Network layers                     | 1              | 2
L2 regularization factor           | -              | 1.6164 × 10−5

