1. Introduction
The main pump, the core of the valve cooling system, drives the cooling medium so that the converter valve operates at a normal temperature through heat exchange. Its condition therefore affects the safety and stability of high-voltage direct current (HVDC) transmission and can even threaten large-scale renewable power generation and load electrification [1, 2]. Fault diagnosis of the main pump is thus of great practical significance [3, 4]. However, there are few studies on fault diagnosis of the main pump, and most existing methods are time-consuming and laborious. An algorithm that can diagnose the state of the main pump promptly and accurately is therefore urgently needed.
The main pump is generally a horizontal centrifugal pump that undertakes the task of supplying power, so the main pump in this paper refers to the horizontal centrifugal pump [5]. In practice, the main pump is most often found in one of five states: four fault states, namely unbalance, looseness, parallel misalignment (PM), and angular misalignment (AM), and the normal state.
At present, two main approaches have been applied to fault diagnosis of the main pump: machine learning and deep learning. The former extracts signal features manually and then performs fault diagnosis with machine learning methods such as the support vector machine (SVM) and the k-nearest neighbor algorithm (KNN). Kumar et al. [6] extracted features from the original signal and the scale edge integral graph, optimized the SVM parameters with a genetic algorithm (GA), trained the SVM with the optimal parameters, and classified the states of the centrifugal pump, reaching a classification accuracy of 96.66%. Ebrahimi et al. [7] decomposed the vibration signal into three levels with the Daubechies wavelet and extracted 44 descriptive statistical features from the detail and approximation coefficients; the resulting SVM classifier achieved an accuracy of 96.67%. Hui et al. [8] proposed a time-frequency signal analysis method based on cyclostationary theory: the cyclic autocorrelation functions (CAFs) of the various signals were calculated first, and the frequency-domain features of the CAFs were then obtained by the fast Fourier transform (FFT) so as to carry out fault classification. Janani et al. [9] used a wrapper model to select appropriate features from the power spectra of the vibration and line current signals of an induction motor. The features were fed into a multi-class support vector machine (MSVM), and the optimal MSVM classifier was obtained by using fivefold cross-validation to select the optimal Gaussian radial basis function (RBF) and MSVM parameters. Maamar et al. [10] combined a multilayer perceptron with backward propagation (MLP-BP) and a GA: features were extracted with the continuous wavelet transform using three different wavelet functions, and the GA then optimized the number of hidden layers and neurons of the MLP-BP. Janani et al. [11] proposed two methods based on MSVM combined with the best energy (BE) criterion and principal component analysis (PCA). The motor current and vibration signals were preprocessed by the wavelet packet transform (WPT), appropriate features were selected according to BE and PCA, and the classification was completed by the MSVM. Zahoor et al. [12] proposed a three-level fault diagnosis strategy. First, the fault characteristic modes of the vibration signals were identified and selected, and mixed features were extracted in the time, frequency, and time-frequency domains. Then, the highly correlated features among the mixed features were reduced in dimension and a new feature pool was formed by using Pearson linear discriminant analysis (PLDA). Finally, fault classification was carried out by KNN. Zahoor et al. [13] used the cross-correlation between a healthy baseline signal and signals of the other classes to obtain new features from the correlation sequence. They then extracted mixed features in the time, frequency, and time-frequency domains from these features and formed feature vectors by calculating correlation coefficients between different signals. Finally, the feature vectors were fed into an MSVM to perform fault diagnosis. Most research on fault diagnosis of the main pump adopts the above methods, but they are time-consuming and prone to errors caused by human misinterpretation.
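To make the idea of manual feature extraction concrete, the following is a minimal sketch of a few common time-domain statistics computed from a vibration segment. It does not reproduce the feature set of any cited work; the function name `time_domain_features`, the choice of statistics, and the synthetic sine signal are assumptions made for illustration only.

```python
import math


def time_domain_features(signal):
    """Compute a few hand-crafted time-domain features of a 1-D signal.

    The chosen statistics (mean, RMS, peak, crest factor, kurtosis) are
    common in vibration analysis but are an illustrative assumption here.
    """
    n = len(signal)
    if n == 0:
        raise ValueError("signal must not be empty")
    mean = sum(signal) / n
    rms = math.sqrt(sum(x * x for x in signal) / n)
    peak = max(abs(x) for x in signal)
    variance = sum((x - mean) ** 2 for x in signal) / n
    fourth_moment = sum((x - mean) ** 4 for x in signal) / n
    # Non-excess kurtosis: fourth central moment over squared variance.
    kurtosis = fourth_moment / variance ** 2 if variance > 0 else 0.0
    crest_factor = peak / rms if rms > 0 else 0.0
    return {
        "mean": mean,
        "rms": rms,
        "peak": peak,
        "crest_factor": crest_factor,
        "kurtosis": kurtosis,
    }


# Hypothetical stand-in for a measured vibration segment: five full sine periods.
sine = [math.sin(2 * math.pi * 5 * t / 1000) for t in range(1000)]
sine_features = time_domain_features(sine)
square_features = time_domain_features([1.0, -1.0, 1.0, -1.0])
```

A feature vector of this kind would then be passed to a classifier such as an SVM or KNN; choosing which statistics to include is exactly the manual step that the deep learning methods discussed next try to avoid.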
The latter preprocesses the signal and extracts features automatically through deep learning to perform fault diagnosis. Deep learning methods, especially the convolutional neural network (CNN) and the recurrent neural network (RNN), have been used in fault diagnosis of mechanical equipment because of their strong ability to extract features automatically. Wang et al. [14] transformed the raw signal into a spectrum through the discrete Fourier transform (DFT) and then stacked the spectra into samples as the input of a CNN. Guo et al. [15] proposed a hierarchical CNN structure with an adaptive learning rate. The first layer recognized the bearing fault type, and the second layer evaluated the fault size. Because the learning rate has a great impact on the network, they also proposed a method for obtaining an adaptive learning rate to improve training. Zhang et al. [16] studied rolling bearing faults in noisy environments and under constantly changing workloads and proposed a new CNN training method, which greatly improved the robustness of the network and maintained high accuracy and stability under those conditions. Kumar et al. [17] proposed an improved CNN. Gray images were obtained by processing the sound signal with the analytic wavelet transform (AWT), and a new entropy-based divergence function was used as the CNN loss function to address overfitting. Because RNNs are particularly good at extracting temporal features, RNN-based fault diagnosis models have also been widely developed. Talebi et al. [18] proposed RNN-based dynamic modeling of wind turbines to address faults in wind energy conversion systems. Residuals were obtained by comparing the output of the built model with the actual system output in order to improve the performance of the model. Experiments showed that the scheme could obtain fault diagnosis results quickly and effectively. Przystalka et al. [19] proposed a robust fault detection method based on RNN and chaotic engineering, using a local RNN to learn the chaotic behavior of the chaotic engineering system. Mrugalski et al. [20] studied dynamic nonlinear systems with a focus on the robustness of fault diagnosis and obtained a fault diagnosis model that improves the robustness of the RNN. The model was used to simulate the disturbance attenuation process of a dynamic nonlinear system, and the results showed that it could improve the robustness of fault estimation. Although deep learning has achieved efficient results in mechanical fault diagnosis, few studies have applied it to main pump fault diagnosis. Moreover, the automatic feature extraction of CNN overcomes the drawbacks of manual feature extraction in traditional fault diagnosis methods, but CNN cannot extract the temporal features of the signal. RNN, on the other hand, can effectively extract temporal features, but its feature extraction ability in other respects is not as good as that of CNN.
To solve the above problems, this paper proposes a fault diagnosis method for the main pump of the valve cooling system in a converter station, based on a hybrid neural network model that combines a Multi-scale Convolutional Neural Network with Long Short-Term Memory (MCNN-LSTM). The performance of the model is evaluated with several indexes. The method takes both temporal and spatial feature extraction into account and retains as many features as possible, which makes it more accurate than the other methods compared. The experimental results show that the method diagnoses the main pump quickly and accurately and generalizes well. The rest of this paper is organized as follows. Section 2 discusses related work such as 1DCNN and LSTM. Section 3 introduces the construction and function of the network in detail. Section 4 describes the composition and preprocessing of the data. Section 5 describes the experiments and analyzes the results, and Section 6 draws conclusions.
5. Results and Discussion
Accuracy, recall, precision, and F1-score are widely adopted in the literature as criteria for model evaluation. This paper therefore selected accuracy, recall, precision, and F1-score as evaluation indexes, following their standard definitions:
Accuracy = (TP + TN) / (TP + TN + FP + FN)
Recall = TP / (TP + FN)
Precision = TP / (TP + FP)
F1-score = 2 × Precision × Recall / (Precision + Recall)
where TP, TN, FP, and FN represent the numbers of true positives, true negatives, false positives, and false negatives, respectively.
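As an illustration of how these indexes can be obtained for a multi-class problem, the sketch below derives per-class precision, recall, and F1-score from a confusion matrix by treating each class in turn as the positive class, and then averages them over classes. It is a minimal sketch under stated assumptions, not the evaluation code of this paper: the function names, the macro-averaging, the use of the fraction of correct predictions as overall accuracy, and the 3-class example matrix are all hypothetical.

```python
def per_class_metrics(confusion):
    """Return ({class: (precision, recall, f1)}, overall_accuracy).

    confusion[i][j] counts samples whose true class is i and whose
    predicted class is j. Each class is scored one-vs-rest.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n))
    metrics = {}
    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                        # true k, predicted other
        fp = sum(confusion[i][k] for i in range(n)) - tp   # predicted k, true other
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[k] = (precision, recall, f1)
    accuracy = correct / total if total else 0.0
    return metrics, accuracy


def macro_average(metrics):
    """Average precision, recall, and F1 over classes (unweighted)."""
    n = len(metrics)
    return tuple(sum(m[i] for m in metrics.values()) / n for i in range(3))


# Hypothetical 3-class confusion matrix, for illustration only.
example = [
    [8, 2, 0],
    [1, 9, 0],
    [0, 0, 10],
]
class_metrics, overall_accuracy = per_class_metrics(example)
mean_precision, mean_recall, mean_f1 = macro_average(class_metrics)
```

Averaging over classes in this way gives every fault type the same weight regardless of how many test samples it has, which is one reasonable reading of "average values of evaluation indexes of each fault type".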
5.1. Results Analysis
All experiments in this study were run in the Spyder IDE (Python 3.6) on a machine with a GTX 950M graphics card, an Intel Core i5 2.3 GHz processor, and 4 GB of RAM. The neural network was implemented in Keras (2.0.8) with the TensorFlow backend. Third-party libraries such as scikit-learn (Sklearn), SciPy, and Matplotlib were used for data preprocessing and visualization.
We carried out two sets of comparative experiments to study the effect of sample length and of the RNN variant on model performance.
Table 3 shows the test-set results for different sample lengths; the values are the evaluation indexes averaged over all fault types. As Table 3 shows, the 1024-length sample performs best in F1-score and precision, which are basically above 0.95, and it also has the best overall performance. The mean F1-score, recall, and precision decreased markedly as the sample length was reduced below 1024, and they also declined when it was increased above 1024. Thus, 1024 was the most suitable sample length. For the RNN variants, we selected unidirectional LSTM, unidirectional gated recurrent unit (GRU) [28], bidirectional LSTM (BiLSTM) [29], and bidirectional GRU (BiGRU) [30], and compared them on the 1024-length samples. As shown in Table 4, LSTM performed well in F1-score and recall, but its advantage in precision was not obvious: its precision was only slightly higher than that of BiLSTM. LSTM was nevertheless superior in the other evaluation indexes, and the unidirectional networks outperformed the bidirectional ones, contrary to what is commonly observed for RNNs in text processing. We speculate that this may be caused by the change in the length and channel number of the data after processing by the CNN.
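The sample-length experiment depends on how a long vibration record is cut into fixed-length samples. The sketch below shows one simple way to do this with a sliding window. It is an assumption-laden illustration rather than the preprocessing actually used here: the function name `segment_signal`, the default non-overlapping stride, the 5000-point synthetic record, and the candidate lengths other than 1024 are all hypothetical.

```python
def segment_signal(signal, sample_length, stride=None):
    """Cut a 1-D signal into windows of `sample_length` points.

    `stride` is the step between window starts; by default windows do
    not overlap. Trailing points that do not fill a window are dropped.
    """
    if sample_length <= 0:
        raise ValueError("sample_length must be positive")
    if stride is None:
        stride = sample_length
    if stride <= 0:
        raise ValueError("stride must be positive")
    samples = []
    for start in range(0, len(signal) - sample_length + 1, stride):
        samples.append(signal[start:start + sample_length])
    return samples


# Hypothetical 5000-point record standing in for a measured vibration signal.
record = [float(i) for i in range(5000)]

# Number of samples obtained for each hypothetical candidate length.
sample_counts = {
    length: len(segment_signal(record, length))
    for length in (256, 512, 1024, 2048)
}

# Overlapping windows (stride smaller than the length) yield more samples.
overlapping = segment_signal(record, 1024, stride=512)
```

Shorter windows produce more training samples but cover less of the signal per sample, while longer windows do the opposite; this trade-off is one plausible reason why an intermediate length can perform best.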
5.2. Model Evaluation
The confusion matrix for the best classification result on the test set is shown in Figure 3, which indicates that the model has learned to separate the classes well. To evaluate the model more objectively and comprehensively, we calculated the F1-score, recall, and precision on the test set, which are summarized in Table 5. All indexes score highly and consistently, indicating strong generalization ability and good performance for fault diagnosis.
5.3. Algorithm Comparison
Many algorithms exist for fault diagnosis from vibration signals. We chose several machine learning and deep learning algorithms for comparative experiments. Table 6 lists the specific values of each index and shows that the proposed model outperforms the comparison algorithms in F1-score, recall, precision, and accuracy.
5.4. Network Visualization
The inside of a neural network model is often regarded as a black box whose working principle is difficult to understand. In this section, t-SNE was applied to visualize the feature extraction and classification process inside the network. Starting from the input data, the wide-kernel CNN performed a preliminary separation, distinguishing PM, normal, and unbalance from AM and looseness. The narrow-kernel CNN then subdivided AM and looseness. After the first LSTM layer, the data could be preliminarily divided into three categories: PM, normal, and unbalance. On this basis, the second LSTM layer clearly delineated the boundaries between the five types of data. Finally, softmax divided the data into five categories. As shown in Figure 4, the extracted feature distributions have very clear boundaries, and the classification effect is very good.
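A layer-wise visualization of this kind can be sketched with the `TSNE` class from scikit-learn. The example below assumes that the activations of one layer have already been exported as a 2-D array with one row per sample; because the trained network is not available here, synthetic clustered data stands in for real activations. The function name `embed_features`, the feature dimension, the number of samples per class, and the perplexity value are assumptions for illustration.

```python
import numpy as np
from sklearn.manifold import TSNE


def embed_features(features, perplexity=30.0, seed=0):
    """Project layer activations (n_samples x n_features) to 2-D with t-SNE."""
    features = np.asarray(features, dtype=np.float64)
    if features.ndim != 2:
        raise ValueError("features must be a 2-D array")
    tsne = TSNE(
        n_components=2,
        perplexity=perplexity,  # must be smaller than the number of samples
        init="pca",
        random_state=seed,
    )
    return tsne.fit_transform(features)


# Synthetic stand-in for the activations of one layer: five classes,
# 30 samples each, 16 features per sample (all values hypothetical).
rng = np.random.default_rng(0)
class_names = ["normal", "unbalance", "looseness", "PM", "AM"]
centers = rng.normal(scale=10.0, size=(len(class_names), 16))
features = np.vstack([c + rng.normal(size=(30, 16)) for c in centers])
labels = np.repeat(np.arange(len(class_names)), 30)

embedding = embed_features(features)
```

Repeating the projection for the output of each layer and coloring the points by `labels` in a scatter plot (for example with Matplotlib) would show how class separation evolves through the network.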
5.5. Future Work
To address some remaining problems of the model, future research will focus on three aspects. First, we need to improve the data preprocessing method and the network structure to achieve more accurate fault classification while reducing the number of network parameters as much as possible. Second, this study only classified four faults and one normal state; in the future, vibration signals of other fault types will be collected so that more faults can be classified. Finally, other data augmentation methods will be tried, and a new fault diagnosis model will be established by combining machine learning methods such as PCA with deep learning algorithms.