Intelligent Compound Fault Diagnosis of Roller Bearings Based on Deep Graph Convolutional Network

Chen, Caifeng; Yuan, Yiping; Zhao, Feiyang

doi:10.3390/s23208489

Caifeng Chen, Yiping Yuan * and Feiyang Zhao

School of Mechanical Engineering, Xinjiang University, Urumqi 830047, China

* Author to whom correspondence should be addressed.

Sensors 2023, 23(20), 8489; https://doi.org/10.3390/s23208489

Submission received: 6 July 2023 / Revised: 27 September 2023 / Accepted: 6 October 2023 / Published: 16 October 2023

(This article belongs to the Section Fault Diagnosis & Sensors)


Abstract

Compound faults in rolling bearings are highly correlated with single-fault samples and are therefore prone to misclassification. To address this, this paper proposes a rolling bearing compound fault diagnosis method based on a deep graph convolutional network. First, the acquired raw vibration signals are pre-processed and divided into sub-samples. Second, the sub-samples from different health states are constructed into graph-structured data and divided into a training set and a test set. Finally, the training set is used as input to a deep graph convolutional network (DGCN) model, which is trained to determine the optimal structure and parameters of the network, and the test set is used to verify the feasibility and effectiveness of the network. The experimental results show that the DGCN can effectively identify compound faults in rolling bearings, which provides a new approach for the identification of compound faults in bearings.

Keywords: intelligent compound fault diagnosis; deep graph convolutional neural network; roller bearing

1. Introduction

Bearings are an important part of rotating equipment and are also the component most easily damaged. According to statistics, about 30% of mechanical failures in rotating machinery that uses rolling bearings are related to bearings [1], and the operating condition of the bearings directly affects the overall performance of the equipment [2]. In bearing fault diagnosis, the extraction of fault characteristic information from the original vibration signal still plays a dominant role [3]. However, in actual engineering, several bearing faults often co-exist and form compound faults. The vibration signal of a compound fault is not simply a superposition of single-fault signals; it is also coupled with the vibration signals of other components through complex transmission paths. The faults of different components interact and influence each other, so the compound fault signal is non-stationary and non-linear [4], which makes bearing fault diagnosis very difficult. Therefore, how to extract the various fault characteristics from a compound fault signal is both the focus and the main difficulty of current research.

Feature extraction is the most important step in traditional rolling bearing fault diagnosis. Its purpose is to extract useful fault information from the signal so as to improve the accuracy of fault diagnosis [5]. Common feature extraction methods include the Wavelet Transform (WT), Principal Component Analysis (PCA) and the Short-Time Fourier Transform (STFT) [6]. Currently, most fault diagnosis methods require manual feature selection when dealing with non-stationary, non-linear signals, and the choice of features determines the effectiveness of fault diagnosis. In addition, the performance of existing feature extraction methods gradually decreases as the amount of data increases.

Deep learning has been introduced into the field of fault diagnosis to overcome these shortcomings. For example, Jiang et al. [7] proposed a multi-scale convolutional neural network (MSCNN) to automatically learn fault characteristics from the original vibration signal and classify gearbox fault types. Xu et al. [8] used deep convolutional neural networks (DCNN) to directly process the original vibration signals, thus realizing the fault diagnosis of rolling bearings. Compared with traditional shallow models, deep learning builds an “end-to-end” model: health information can be obtained directly from the collected signals, which meets the high requirements for fault diagnosis accuracy in the era of big data [9]. However, these deep learning methods can only learn features from the vertices of the input data, while ignoring the information contained in the edges formed between vertices.

To solve this problem, graph neural networks (GNNs) were introduced to extend CNNs to graph data [10,11]. The GNN was first proposed by Scarselli et al. in 2009 [12] and builds on graph theory to construct a neural network for data in the graph domain. In the graph domain, the attributes of nodes and edges provide additional information that improves the extracted features and also improves noise immunity. As a result, the graph domain can provide more information than a general data domain. Bruna et al. [13] introduced convolutional operations based on spectral graph theory into GNNs and constructed the first graph convolutional network (GCN) model. Compared with traditional CNN methods, GCN has advantages in dealing with non-stationary, non-linear signals and in discriminative feature extraction from discrete spatial-domain signals [14]. To date, GCN methods have been successfully applied in several research areas, such as intelligent acoustic fault diagnosis of rolling bearings [15] and wind turbine gearbox fault diagnosis [16,17,18]. The first research focus of this paper is how to construct graph-structured data from the original vibration signal. The second is to use the feature mining capability of the GCN model to eliminate the manual feature selection step in fault diagnosis and improve the efficiency of compound fault diagnosis.

The structure of this paper is as follows. Section 2 presents the underlying theory. Section 3 describes the process for compound bearing fault diagnosis based on deep graph convolutional networks. The feasibility of the proposed method is verified in Section 4. The conclusions are presented in Section 5.

2. Related Work

The core idea of GCN is to extend convolutional operations from Euclidean data to non-Euclidean data by aggregating information about the nodes in the graph and their neighbors through supervised or semi-supervised learning. Extracting high-dimensional features from the graph structure in this way supports numerous tasks such as node prediction, classification and edge prediction. GCNs can be divided into spectral and spatial GCNs. This article uses a spectral GCN, which mainly involves three steps: First, a graph Fourier transform is applied to the input data. Second, the transformed data are convolved in the spectral domain. Finally, the convolution results are subjected to an inverse graph Fourier transform.

2.1. Representation of Graphs

Based on graph theory, this study represents the sub-samples of the original data as a graph with nodes and edges. Any undirected graph can be represented as:

\begin{cases} G = (V, E, A) \\ V = \{v_{1}, v_{2}, \dots, v_{N}\} \\ E = \{(v_{i}, v_{j})\} \end{cases}

(1)

where V refers to the set of N nodes, E refers to the set of edges and A ∈ R^{N × N} is the adjacency matrix defining the interconnections between nodes; in an undirected graph, A_{i,j} = A_{j,i}.

Besides the adjacency matrix A, the graph can be described by the Laplacian matrix L = D − A, where D and A denote the degree matrix and the adjacency matrix, respectively. The adjacency matrix and degree matrix are calculated as shown in Figure 1. The degree matrix records the number of nodes connected to each node; for example, node one has one edge and node five has four edges. The adjacency matrix represents the relationships between nodes; for example, node one and node five are connected, which is represented by one, while node one and node two are not connected, which is represented by zero.
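The degree matrix, adjacency matrix and Laplacian described above can be computed in a few lines. The following is a minimal sketch in Python with NumPy; the five-node edge list is a hypothetical example chosen only to match the degrees mentioned in the text (node one has one edge, node five has four) and is not necessarily the graph drawn in Figure 1.

```python
import numpy as np

# Hypothetical 5-node undirected graph (an assumption, not taken from Figure 1):
# node 5 is connected to nodes 1-4, so node 1 has one edge and node 5 has four.
edges = [(1, 5), (2, 5), (3, 5), (4, 5)]
n = 5

A = np.zeros((n, n), dtype=int)
for i, j in edges:
    A[i - 1, j - 1] = 1  # nodes are numbered from 1 in the text
    A[j - 1, i - 1] = 1  # undirected graph, so A is symmetric

D = np.diag(A.sum(axis=1))  # degree matrix: number of edges at each node
L = D - A                   # graph Laplacian L = D - A

print(D.diagonal())  # degrees of nodes 1..5
print(L)
```

Each row of L sums to zero, which is a quick sanity check that D and A were built consistently.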

2.2. Spectral Graph Convolution

In spectral graph convolution, the symmetric normalized graph Laplace operator is usually used and is defined as:

L = I_{N} - D^{-\frac{1}{2}} A D^{-\frac{1}{2}}

(2)

where D = \mathrm{diag}(\sum_{j} A_{ij}) refers to the degree matrix and I_{N} is the identity matrix.

The graph Laplacian matrix is a real symmetric matrix, so it admits the eigendecomposition:

L = U \Lambda U^{-1} = U \begin{bmatrix} λ_{1} & & \\ & \ddots & \\ & & λ_{n} \end{bmatrix} U^{-1}

(3)

where Λ is a diagonal matrix of eigenvalues and U is a matrix of eigenvectors.

The spectral convolution of the set of nodes V with the node features can be expressed as:

h = {(x *_{G} f)}_{θ} = U ((U^{T} x) ⊙ (U^{T} f))

(4)

where h refers to the feature map after graph convolution, *_G represents the graph convolution, x is the node feature, f is the filter, a function of Λ, i.e., f(Λ), θ is the learnable parameter, ⊙ is the element-wise (Hadamard) product and U^T x refers to the graph Fourier transform of the node feature x.

Using f_θ = diag(U^T f) as a learnable graph convolution filter, the above equation simplifies to:

h = U f_{θ} U^{T} x .

(5)
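Equations (2) to (5) can be checked numerically on a small graph. The sketch below assumes a hypothetical four-node path graph and random node features and filter values; it builds the normalized Laplacian, performs the eigendecomposition and confirms that the Hadamard-product form of Equation (4) and the diagonal-filter form of Equation (5) give the same result.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 4-node path graph 0-1-2-3 (an assumption for illustration).
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
n = A.shape[0]

d = A.sum(axis=1)
D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
L = np.eye(n) - D_inv_sqrt @ A @ D_inv_sqrt   # Equation (2)

# L is real symmetric, so eigh returns real eigenvalues and an orthonormal U,
# i.e. U^{-1} = U^T as used in Equation (3).
lam, U = np.linalg.eigh(L)

x = rng.normal(size=n)   # one scalar feature per node
f = rng.normal(size=n)   # an arbitrary filter signal on the nodes

# Equation (4): element-wise product in the graph Fourier domain.
h4 = U @ ((U.T @ x) * (U.T @ f))

# Equation (5): the same filter written as a diagonal matrix.
f_theta = np.diag(U.T @ f)
h5 = U @ f_theta @ U.T @ x

print(np.allclose(h4, h5))
```

The eigenvalues of the symmetric normalized Laplacian lie between 0 and 2, which is why the approximation λmax ≈ 2 is used later in the section.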

However, the filter f_θ is computationally expensive and not spatially localized. This article therefore approximates the filter with Chebyshev polynomials; truncating the expansion at first order and setting λ_{max} ≈ 2 further simplifies the resulting graph convolution, which can be expressed as:

f_{θ} = \sum_{k = 0}^{K} θ_{k} T_{k} (\tilde{\Lambda})

(6)

h = θ_{0} x + θ_{1} (L - I_{n}) x = θ_{0} x - θ_{1} D^{- 1 / 2} A D^{- 1 / 2} x

(7)

where K is the order of the Chebyshev polynomial, \Lambda = \mathrm{diag}([λ_{0}, λ_{1}, \dots, λ_{n - 1}]) and λ_{n - 1} are, respectively, the diagonal matrix of eigenvalues and an eigenvalue of L, λ_{max} is the largest eigenvalue, \tilde{\Lambda} = 2 \Lambda / λ_{max} - I_{n} is the rescaled eigenvalue matrix and T_k is the Chebyshev polynomial of order k.

Let θ = θ_0 = −θ_1; then the above equation becomes:

h = θ (I_{n} + D^{- 1 / 2} A D^{- 1 / 2}) x .

(8)
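The step from Equation (6) to Equation (8) is compressed in the text. The following derivation is a sketch of the intermediate steps, assuming that the Chebyshev expansion is truncated at K = 1 (with T_0(z) = 1 and T_1(z) = z) and that U is orthogonal; the symbol \tilde{L} is introduced here only for illustration and does not appear in the original.

```latex
\begin{aligned}
h &= U f_{\theta} U^{T} x
   = \sum_{k=0}^{K} \theta_{k}\, U\, T_{k}(\tilde{\Lambda})\, U^{T} x
   = \sum_{k=0}^{K} \theta_{k}\, T_{k}(\tilde{L})\, x,
   \qquad \tilde{L} = \frac{2}{\lambda_{max}} L - I_{n} \\
  &\approx \theta_{0} x + \theta_{1} \tilde{L} x
   && (K = 1) \\
  &= \theta_{0} x + \theta_{1} (L - I_{n}) x
   && (\lambda_{max} \approx 2) \\
  &= \theta_{0} x - \theta_{1} D^{-1/2} A D^{-1/2} x
   && (L = I_{n} - D^{-1/2} A D^{-1/2}) \\
  &= \theta \left( I_{n} + D^{-1/2} A D^{-1/2} \right) x
   && (\theta = \theta_{0} = -\theta_{1})
\end{aligned}
```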

To alleviate the exploding/vanishing gradient problem, I_{n} + D^{- 1 / 2} A D^{- 1 / 2} is further reduced to D^{- 1 / 2} A D^{- 1 / 2} and the final expression is:

H = σ (D^{- 1 / 2} A D^{- 1 / 2} X Θ)

(9)

where H is the convolutional signal matrix, σ refers to the non-linear activation function and Θ refers to the learnable parameters.

2.3. Graph Convolutional Networks

With a non-linear activation function applied, a GCN layer with a single message-passing step is expressed as:

H^{(l + 1)} = σ (D^{- 1 / 2} A D^{- 1 / 2} H^{(l)} W_{1}^{(l)} + H^{(l)} W_{0}^{(l)} + b^{(l)})

(10)

where H^{(l)} \in R^{N \times d_{l}} is the hidden node matrix of dimension d_l at layer l, H^{(0)} = X is the matrix of input node features, σ(·) is the ReLU activation function, W_{0}^{(l)} \in R^{d_{l} \times d_{l + 1}} and W_{1}^{(l)} \in R^{d_{l} \times d_{l + 1}} are the learnable parameter matrices and b^{(l)} is the bias vector. A GCN with one message-passing operation can be regarded as a first-order approximation of a spectral graph convolution. To further reduce the number of model parameters, the single-parameter GCN model at layer l can be expressed as:

H^{(l + 1)} = σ ({\tilde{D}}^{- 1 / 2} \tilde{A} {\tilde{D}}^{- 1 / 2} H^{(l)} W^{(l)} + b^{(l)})

(11)

where \tilde{A} = A + I_{n} is the adjacency matrix with self-connections added, {\tilde{D}}^{- 1 / 2} \tilde{A} {\tilde{D}}^{- 1 / 2} is the corresponding normalized adjacency matrix, {\tilde{D}}_{ii} = \sum_{j} {\tilde{A}}_{ij} and W^{(l)} \in R^{d_{l} \times d_{l + 1}} is the learnable parameter matrix. GCN reduces over-fitting by using a first-order approximate filtering operation that involves fewer free parameters per filtering operation.
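A single layer of Equation (11) can be written directly in NumPy. This is a minimal sketch rather than the authors' implementation; the function name `gcn_layer`, the three-node graph and the random weights are all assumptions made for illustration.

```python
import numpy as np

def gcn_layer(A, H, W, b):
    """One single-parameter GCN layer, following Equation (11)."""
    n = A.shape[0]
    A_tilde = A + np.eye(n)                    # add self-connections
    d_tilde = A_tilde.sum(axis=1)              # D~_ii = sum_j A~_ij
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d_tilde))
    A_hat = D_inv_sqrt @ A_tilde @ D_inv_sqrt  # normalized adjacency matrix
    return np.maximum(A_hat @ H @ W + b, 0.0)  # ReLU activation

rng = np.random.default_rng(0)
A = np.array([[0, 1, 1],
              [1, 0, 0],
              [1, 0, 0]], dtype=float)   # hypothetical 3-node graph
H0 = rng.normal(size=(3, 4))             # N = 3 nodes, d_0 = 4 features
W0 = rng.normal(size=(4, 2))             # maps d_0 = 4 to d_1 = 2
b0 = np.zeros(2)

H1 = gcn_layer(A, H0, W0, b0)
print(H1.shape)
```

Because self-connections are added before normalization, every node has degree at least one, so the inverse square root is always defined.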

3. Compound Fault Diagnosis of Bearings Based on Deep Graph Convolution

This article proposes a compound fault diagnosis method for bearings based on deep graph convolutional networks; the process is shown in Figure 2. First, vibration signals from the bearings are collected and divided into sub-samples. Second, the sub-samples are used to form the sample graph. Finally, the DGCN model is used to extract the features of the graph and achieve rapid classification of compound faults.

3.1. Building Graphs Based on Time Series

The original time series X with signal length L is normalized and denoted as:

X^{nol} = \mathrm{normalization}(X)

(12)

where X^{nol} is the normalized time series and normalization(∙) denotes the chosen normalization method. Max-min normalization is used in this paper.

Next, the normalized time series of length L is divided into sub-samples of length d. The sub-samples do not overlap, and a corresponding label is assigned to each sub-sample. The obtained sub-sample set can be expressed as:

\Pi = [(x_{1}^{nol}, y_{1}), (x_{2}^{nol}, y_{2}), \dots, (x_{n}^{nol}, y_{n})]

(13)

n = \left\lfloor \frac{L}{d} \right\rfloor

(14)

where \Pi refers to the constructed sub-sample set, x_{n}^{nol} refers to the sub-sample, y_{n} refers to the label and n refers to the number of sub-samples.

Following reference [19], using frequency-domain inputs can improve the performance of the model. Therefore, a Fast Fourier Transform (FFT) is applied to each sub-sample and the transformed data are used as a new sample. The process can be expressed as:

{\tilde{x}}_{i} = \mathrm{FFT}(x_{i}^{nol}), \quad i = 1, 2, \dots, n

(15)

where FFT(∙) refers to converting the sub-sample to the frequency domain and taking the first half of the result.

The resulting labeled dataset is represented as:

\tilde{\Pi} = [({\tilde{x}}_{1}, y_{1}), ({\tilde{x}}_{2}, y_{2}), \dots, ({\tilde{x}}_{n}, y_{n})] .

(16)
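The pre-processing chain of Equations (12) to (16) can be sketched as follows. The function name `build_samples` and the synthetic input signal are assumptions for illustration, and taking the magnitude of the FFT is also an assumption, since the text only says that the first half of the result is kept.

```python
import numpy as np

def build_samples(X, d, label):
    """Normalize X, cut it into non-overlapping windows of length d and
    keep the first half of each window's FFT magnitude."""
    X = np.asarray(X, dtype=float)
    X_nol = (X - X.min()) / (X.max() - X.min())      # Equation (12), max-min
    n = len(X_nol) // d                              # Equation (14)
    windows = X_nol[: n * d].reshape(n, d)           # Equation (13)
    spectrum = np.abs(np.fft.fft(windows, axis=1))   # Equation (15); magnitude is assumed
    x_tilde = spectrum[:, : d // 2]                  # first half of the result
    y = np.full(n, label)                            # one label per sub-sample
    return x_tilde, y

rng = np.random.default_rng(0)
signal = rng.normal(size=10_000)   # synthetic stand-in for a vibration signal
x_tilde, y = build_samples(signal, d=1024, label=3)
print(x_tilde.shape, y.shape)
```

With d = 1024, each sub-sample becomes a 512-dimensional node feature, which matches the input width of 512 listed in Table 3.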

3.2. Constructing Graph-Structured Data

The radius graph method is used to construct the graph-structured data. The specific steps are as follows. First, the n sub-samples obtained above are used as the n nodes of the graph. Second, cosine similarity is used to measure the distance between each pair of sub-samples, and a threshold ε is set. If the cosine similarity is larger than the threshold, there is an edge between the two nodes. Therefore, the neighbors of node \tilde{x}_i can be obtained by:

Ne({\tilde{x}}_{i}) = \begin{cases} ε\text{-radius}({\tilde{x}}_{i}, y_{i}), & \text{if } s({\tilde{x}}_{i}, {\tilde{x}}_{j}) > ε \\ 0, & \text{otherwise} \end{cases}

(17)

where Ne({\tilde{x}}_{i}) is the set of neighbor nodes of {\tilde{x}}_{i}, ε is the selected radius (here ε = 0) and s({\tilde{x}}_{i}, {\tilde{x}}_{j}) is the cosine similarity of nodes {\tilde{x}}_{i} and {\tilde{x}}_{j}.

The weight between each two nodes is calculated by a threshold Gaussian kernel weight function, which is expressed as:

w_{ij} = \begin{cases} \exp\left(- \frac{s^{2}({\tilde{x}}_{i}, {\tilde{x}}_{j})}{2 β^{2}}\right), & \text{if } s({\tilde{x}}_{i}, {\tilde{x}}_{j}) > ε \\ 0, & \text{otherwise} \end{cases}

(18)

where β denotes the bandwidth variance of the Gaussian function.
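Equations (17) and (18) together define a weighted adjacency matrix, which can be sketched as below. The function name `build_graph`, the synthetic node features, the default 2β² = 1024 (read from Section 4.2) and the removal of self-loops are assumptions for illustration.

```python
import numpy as np

def build_graph(x_tilde, eps=0.0, two_beta_sq=1024.0):
    """Weighted adjacency matrix of an epsilon-radius graph."""
    norms = np.linalg.norm(x_tilde, axis=1, keepdims=True)
    unit = x_tilde / norms
    S = unit @ unit.T                     # pairwise cosine similarity s(x_i, x_j)
    W = np.exp(-(S ** 2) / two_beta_sq)   # Gaussian kernel weight, Equation (18)
    W[S <= eps] = 0.0                     # keep an edge only if s > eps, Equation (17)
    np.fill_diagonal(W, 0.0)              # no self-loops (an assumption)
    return W

rng = np.random.default_rng(0)
nodes = rng.random(size=(10, 512))        # synthetic non-negative spectra
W = build_graph(nodes)
print(int((W > 0).sum()))                 # number of directed edges
```

Non-negative features such as FFT magnitudes always have positive cosine similarity, so with ε = 0 every pair of distinct nodes is connected and ten nodes give 10 × 9 = 90 directed edges.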

3.3. Fault Diagnosis Structure

In this article, the proposed DGCN model includes seven layers: one input layer, two graph convolution layers, two BatchNorm layers, one fully connected layer and one output layer. The graph convolution layers are used to extract the features of nodes and edges, the BatchNorm layers enable faster and more stable training, and the fully connected layer is used for node classification.
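The layer sequence described above can be traced with a forward pass in NumPy. This is a sketch under several assumptions, not the authors' model: the layer widths are those listed in Table 3, the weights are random, BatchNorm is reduced to per-feature standardization without learnable scale and shift, dropout is omitted, the ReLU after the first fully connected layer is assumed, and the ring graph is a toy example.

```python
import numpy as np

def normalized_adjacency(A):
    """Return D~^{-1/2} (A + I) D~^{-1/2}."""
    A_tilde = A + np.eye(A.shape[0])
    d = A_tilde.sum(axis=1)
    return A_tilde / np.sqrt(np.outer(d, d))

def batch_norm(H, eps=1e-5):
    # Standardize each feature over the nodes; scale and shift are omitted.
    return (H - H.mean(axis=0)) / np.sqrt(H.var(axis=0) + eps)

def dgcn_forward(A, X, params):
    A_hat = normalized_adjacency(A)
    H = X
    for W in params["gconv"]:                # GConv -> BatchNorm -> ReLU, twice
        H = np.maximum(batch_norm(A_hat @ H @ W), 0.0)
    H = np.maximum(H @ params["fc1"], 0.0)   # FC_1 (ReLU here is an assumption)
    return H @ params["fc2"]                 # FC_2: per-node class scores

rng = np.random.default_rng(0)
n_nodes, n_classes = 10, 4
params = {
    "gconv": [rng.normal(scale=0.01, size=(512, 1024)),
              rng.normal(scale=0.01, size=(1024, 1024))],
    "fc1": rng.normal(scale=0.01, size=(1024, 512)),
    "fc2": rng.normal(scale=0.01, size=(512, n_classes)),
}

# Toy ring graph: node i is connected to node i + 1 (wrapping around).
A = np.zeros((n_nodes, n_nodes))
for i in range(n_nodes):
    j = (i + 1) % n_nodes
    A[i, j] = A[j, i] = 1.0

X = rng.random(size=(n_nodes, 512))
scores = dgcn_forward(A, X, params)
print(scores.shape)
```

The output has one row of class scores per node, which reflects that the task is node classification rather than graph classification.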

4. Case Study

4.1. Description of Experimental Data

In this section, the XJTU-SY [20] bearing dataset is used to verify the validity of the proposed DGCN method. As shown in Figure 3, the experimental platform mainly consists of a drive motor, support shaft, speed controller, support bearings, test bearings and hydraulic loading system. The test bearings are LDK UER204 rolling bearings. The sampling frequency was 25.6 kHz and the sampling interval was one minute. The experiments selected in this paper are listed in Table 1: six bearings were selected from two operating conditions, with a speed of 2100 r/min for condition one and 2400 r/min for condition three. Each condition contains one normal state and three fault states, as shown in Figure 4.

The vibration signals collected from the XJTU-SY test stand are shown in Figure 5 and were pre-processed as follows. First, the original signal was max-min normalized. Second, a sliding window was used to cut the vibration signals into non-overlapping sub-samples, each containing 1024 data points, giving 1000 samples for each fault type and a total of 8000 samples for the eight fault types. Finally, each sub-sample was transformed by FFT and labeled accordingly. The division of the training and test sets is shown in Table 2.

Dataset A contains the two working conditions. The fault types of condition one are normal, outer ring fault, cage fault, and compound fault of the inner and outer rings. The fault types of condition three are normal, outer ring fault, compound fault of the inner ring, rolling element and cage, and inner ring fault. The faults in the two operating conditions were diagnosed separately, with 80% of the samples from each bearing health state used for training and the rest for testing, so that diagnosis under the same working condition can be compared with diagnosis across working conditions.

In Dataset B, the data for the different working conditions were considered as a whole and only one normal bearing was considered, so that there were seven health states in total. For each bearing health state, 80% of the samples were used for training and the remaining samples for testing.

In practice, compound fault samples are more difficult to collect than single fault samples. Thus, in Dataset C, the percentage of training samples was 50% for normal bearings, 30% for single faults and 20% or 10% for compound faults.

4.2. Constructing the Graph

This study selected eight fault types from two operating conditions, each with a signal length of 1,024,000 and a sub-sample length of 1024, for a total of 1000 sub-samples per fault type. Every ten sub-samples were used to form one graph, with an ε-radius of zero and 2β² = 1024, as shown in Figure 6. First, the cosine similarity of each sample to its ten nearest neighbors is calculated and the edge weights are obtained by Equation (18). Then, the procedure moves to the next ten neighboring nodes and is repeated until all samples are traversed.

The adopted approach is effective in extracting graph features. Node information is embedded in each layer and fault features in the original data are extracted by the graph convolution layers. Because this study performs node classification, the structure of the graph does not change from the input layer through the first and second convolution layers. As shown in Figure 7, each layer of the network is visualized with Gephi.

4.3. Detailed Framework for Fault Diagnosis

This paper uses stochastic gradient descent (SGD) with momentum as the optimizer, where the momentum is 0.9, the number of iterations is 100 and the batch size is 64. A learning rate decay strategy is also used to adjust the learning rate, where the weight decay is initialized at 0.0005 and the learning rate at 0.001. The number of network nodes is 640 and the number of edges is 5760. The network connection of the model is I-C1-B1-C2-B2-FC1-FC2, where I stands for the input layer, C for the graph convolution layer, B for the BatchNorm layer and FC for the fully connected layer. The specific network structure of the model is set out in Table 3.
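The text does not say how the figures of 640 nodes and 5760 edges arise. One reading that reproduces both numbers, offered here as an assumption rather than the authors' stated procedure, is that a batch holds 64 graphs of ten nodes each, that every pair of distinct nodes within a graph is connected, and that each undirected edge is stored as two directed edges.

```python
batch_size = 64        # graphs per batch (Section 4.3)
nodes_per_graph = 10   # sub-samples per graph (Figure 6)

# Assumption: each graph is fully connected without self-loops, and every
# undirected edge is counted once in each direction.
edges_per_graph = nodes_per_graph * (nodes_per_graph - 1)

total_nodes = batch_size * nodes_per_graph
total_edges = batch_size * edges_per_graph
print(total_nodes, total_edges)
```

Under these assumptions the totals are 64 × 10 = 640 nodes and 64 × 90 = 5760 edges, matching the values reported in Table 3.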

4.4. Results and Analysis

Each experiment was repeated 10 times to reduce the randomness of the results and then the average of the 10 test results was taken as the final result.

4.4.1. Compound Fault Diagnosis under the Same Operating Conditions

This section focuses on the classification performance of the proposed DGCN under the same operating conditions and with balanced samples, and therefore considers Dataset A_1 and Dataset A_2. Figure 8 shows the t-SNE visualization of the fault features extracted by the DGCN. The figure shows that the method extracts useful fault features under the same operating conditions and that all fault types can be clearly distinguished, achieving good fault detection accuracy. The confusion matrix of the diagnostic results is shown in Figure 9, indicating that the DGCN model can effectively detect multiple faults under the same operating conditions.

4.4.2. Compound Fault Diagnosis under Different Operating Conditions

For comparison, this section applies the proposed DGCN to fault classification under different operating conditions and therefore considers Dataset B. The extracted features are again visualized using t-SNE. As shown in Figure 10, the method can effectively extract compound fault features. The confusion matrix of the classification results is shown in Figure 11. The method effectively avoids the mutual interference between single faults and compound faults that would otherwise lead to more misclassifications. This shows that the proposed method is effective in compound fault diagnosis.

4.4.3. Diagnosis of Compound Faults under Sample Imbalance

In practice, fault samples are harder to collect than normal samples, and compound fault samples are more difficult to collect than single fault samples. In this section, the proposed DGCN is used to classify imbalanced samples and Dataset C is therefore considered. The predictions of the model on the test set were visualized using t-SNE, as shown in Figure 12, and the features of the different fault types are well separated. Therefore, DGCN is able to learn good features from vibration signals. It can be seen from Figure 13 that DGCN achieves very high accuracy in the confusion matrix. These results demonstrate the effectiveness of this method in the classification of imbalanced samples.

4.5. Evaluation Indicators

The following three evaluation metrics were chosen to provide a comprehensive assessment of the model’s performance on the different datasets: overall accuracy (ACC), true positive rate (TPR) and false acceptance rate (FAR). They are defined below. As before, each experiment was conducted 10 times to reduce the effect of randomness.

ACC = \frac{TN + TP}{TN + TP + FN + FP}

(19)

TPR = \frac{TP}{TP + FN}

(20)

FAR = \frac{FP}{TN + FP}

(21)

where TP is the number of unstable samples correctly judged as unstable, TN is the number of stable samples correctly judged as stable, FP is the number of stable samples misjudged as unstable and FN is the number of unstable samples misjudged as stable.
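Equations (19) to (21) translate directly into code. The sketch below uses a hypothetical helper named `evaluate` and made-up confusion counts that are not taken from the experiments in this paper.

```python
def evaluate(tp, tn, fp, fn):
    """Return (ACC, TPR, FAR) from confusion counts, Equations (19)-(21)."""
    acc = (tn + tp) / (tn + tp + fn + fp)
    tpr = tp / (tp + fn)
    far = fp / (tn + fp)
    return acc, tpr, far

# Hypothetical counts for illustration only.
acc, tpr, far = evaluate(tp=98, tn=199, fp=1, fn=2)
print(f"ACC={acc:.3f}, TPR={tpr:.3f}, FAR={far:.3f}")
```

For a multi-class problem such as this one, the counts would typically be obtained per class in a one-versus-rest fashion from the confusion matrix, although the text does not state how the per-class values are aggregated.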

The performance metrics on the test set under different datasets are shown in Table 4. It can be found that the overall accuracy of DGCN is high, and the performance on stable and unstable samples is relatively balanced. This fully demonstrates that the model is capable of achieving compound fault classification.

The authors also investigated the effect of different distance calculation methods and of the number of nodes on the accuracy of the algorithm. The results are as follows.

The effect of the number of nodes on the accuracy for the different datasets is shown in Figure 14. When testing on the same-working-condition datasets (Dataset A_1 and Dataset A_2), the accuracy is essentially stable when the number of nodes is 5–10 but decreases significantly when the number of nodes is greater than 15. When testing on the cross-working-condition and imbalanced-sample datasets, changing the number of nodes had little effect on accuracy.

Figure 15 shows the impact of the three distance calculation methods on the diagnostic accuracy for the different datasets. Using cosine similarity to calculate distance gives the highest accuracy, while the Euclidean and Chebyshev distance calculation methods are slightly less accurate.
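The three measures compared in Figure 15 can be illustrated on a pair of vectors. This is a small sketch with hypothetical helper names and toy vectors; it only shows how the measures differ and does not reproduce the experiment.

```python
import numpy as np

def cosine_similarity(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def euclidean_distance(a, b):
    return float(np.linalg.norm(a - b))

def chebyshev_distance(a, b):
    return float(np.max(np.abs(a - b)))

a = np.array([1.0, 2.0, 2.0])
b = np.array([2.0, 4.0, 4.0])   # same direction as a, twice the length

print(cosine_similarity(a, b))
print(euclidean_distance(a, b))
print(chebyshev_distance(a, b))
```

Cosine similarity depends only on the direction of the vectors, so it treats a and b as identical even though their amplitudes differ, whereas the Euclidean and Chebyshev distances both grow with the amplitude difference.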

5. Conclusions

This paper proposes a compound fault diagnosis method for rolling bearings based on deep graph convolutional networks. The method addresses shortcomings of traditional fault diagnosis methods, such as the need to extract features manually and the reliance on expert knowledge. The following conclusions can be drawn from the experiments:

(1): The DGCN provides an end-to-end diagnostic model, eliminating the complex feature engineering required by traditional diagnostic methods. Graphs constructed from vertices and edges provide more information for training diagnostic models. The experimental results indicate that the DGCN method performs well in identifying rolling bearing faults under different operating conditions and sample imbalance.
(2): Regarding the high correlation between compound faults and single fault samples, which can easily cause misclassification, the experiments show that DGCN can effectively avoid misclassification between single faults and compound faults.

However, this study has not been combined with the bearing failure mechanism, so its interpretability is limited. In addition, this study used a standard dataset, so the method does not yet generalize well to actual operating data.

Author Contributions

Conceptualization, C.C.; Methodology, C.C.; Data curation, F.Z.; Writing—original draft, C.C.; Visualization, F.Z.; Supervision, Y.Y.; Funding acquisition, Y.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the National Natural Science Foundation of China (71961029 and 72361032) and the Technology Innovation Program for Doctoral Students (XJU2022BS091).

Data Availability Statement

The data used for this study is a publicly available dataset linked to: http://biaowang.tech/xjtu-sy-bearing-datasets (accessed on 5 October 2023).

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Li, X.; Ma, Z.; Kang, D.; Li, X. Fault diagnosis for rolling bearing based on VMD-FRFT. Measurement 2020, 155, 107554.
2. Yasir, M.N.; Koh, B.H. Data Decomposition Techniques with Multi-Scale Permutation Entropy Calculations for Bearing Fault Diagnosis. Sensors 2018, 18, 1278.
3. Zhang, H.; Chen, X.; Du, Z.; Yan, R. Kurtosis based weighted sparse model with convex optimization technique for bearing fault diagnosis. Mech. Syst. Signal Process. 2016, 80, 349–376.
4. Worden, K.; Staszewski, W.J.; Hensman, J.J. Natural computing for mechanical systems research: A tutorial overview. Mech. Syst. Signal Process. 2011, 25, 4–111.
5. Cui, L.; Wu, N.; Ma, C.; Wang, H. Quantitative fault analysis of roller bearings based on a novel matching pursuit method with a new step-impulse dictionary. Mech. Syst. Signal Process. 2016, 68–69, 34–43.
6. Lei, Y.; Jia, F.; Lin, J.; Xing, S.; Ding, S.X. An Intelligent Fault Diagnosis Method Using Unsupervised Feature Learning Towards Mechanical Big Data. IEEE Trans. Ind. Electron. 2016, 63, 3137–3147.
7. Jiang, G.; He, H.; Yan, J.; Xie, P. Multiscale Convolutional Neural Networks for Fault Diagnosis of Wind Turbine Gearbox. IEEE Trans. Ind. Electron. 2019, 66, 3196–3207.
8. Xu, Z.; Li, C.; Yang, Y. Fault diagnosis of rolling bearing of wind turbines based on the Variational Mode Decomposition and Deep Convolutional Neural Networks. Appl. Soft Comput. 2020, 95, 106515.
9. Jia, F.; Lei, Y.; Lin, J.; Zhou, X.; Lu, N. Deep neural networks: A promising tool for fault characteristic mining and intelligent diagnosis of rotating machinery with massive data. Mech. Syst. Signal Process. 2016, 72–73, 303–315.
10. Wu, Z.; Pan, S.; Long, G.; Jiang, J.; Zhang, C. Graph WaveNet for Deep Spatial-Temporal Graph Modeling. arXiv 2019, arXiv:1906.00121.
11. Wu, Z.; Pan, S.; Chen, F.; Long, G.; Zhang, C.; Yu, P.S. A Comprehensive Survey on Graph Neural Networks. IEEE Trans. Neural Netw. Learn. Syst. 2021, 32, 4–24.
12. Scarselli, F.; Gori, M.; Tsoi, A.C.; Hagenbuchner, M.; Monfardini, G. The graph neural network model. IEEE Trans. Neural Netw. 2009, 20, 61–80.
13. Bruna, J.; Zaremba, W.; Szlam, A.; LeCun, Y. Spectral Networks and Deep Locally Connected Networks on Graphs. arXiv 2014, arXiv:1312.6203v3.
14. Such, F.P.; Sah, S.; Dominguez, M.A.; Pillai, S.; Zhang, C.; Michael, A.; Cahill, N.D.; Ptucha, R. Robust Spatial Filtering with Graph Convolutional Neural Networks. IEEE J. Sel. Top. Signal Process. 2017, 11, 884–896.
15. Zhang, D.; Stewart, E.; Entezami, M.; Roberts, C.; Yu, D. Intelligent acoustic-based fault diagnosis of roller bearings using a deep graph convolutional network. Measurement 2020, 156, 107585.
16. Yu, X.; Tang, B.; Zhang, K. Fault Diagnosis of Wind Turbine Gearbox Using a Novel Method of Fast Deep Graph Convolutional Networks. IEEE Trans. Instrum. Meas. 2021, 70, 6502714.
17. Zhao, B.; Zhang, X.; Zhan, Z.; Wu, Q.; Zhang, H. Multiscale Graph-Guided Convolutional Network with Node Attention for Intelligent Health State Diagnosis of a 3-PRR Planar Parallel Manipulator. IEEE Trans. Ind. Electron. 2022, 69, 11733–11743.
18. Li, T.; Zhao, Z.; Sun, C.; Yan, R.; Chen, X. Multireceptive Field Graph Convolutional Networks for Machine Fault Diagnosis. IEEE Trans. Ind. Electron. 2021, 68, 12739–12749.
19. Li, T.; Zhou, Z.; Li, S.; Sun, C.; Yan, R.; Chen, X. The emerging graph neural networks for intelligent fault diagnostics and prognostics: A guideline and a benchmark study. Mech. Syst. Signal Process. 2022, 168, 108653.
20. Wang, B.; Lei, Y.; Li, N.; Li, N. A Hybrid Prognostics Approach for Estimating Remaining Useful Life of Rolling Element Bearings. IEEE Trans. Reliab. 2020, 69, 401–412.

Figure 1. Laplace matrix calculation.

Figure 2. A framework for compound fault diagnosis of bearings based on deep graph convolution.

Figure 3. Bearing tested.

Figure 4. Photos of tested bearings.

Figure 5. Vibration signals for each health condition of the bearing.

Figure 6. Graph data construction process. (The raw signal is divided into 10 parts, each representing a node).

Figure 7. Graph Structure Visualization.

Figure 8. Results of feature visualization under the same working conditions. (a) Dataset A_1, (b) Dataset A_2.

Figure 9. Confusion matrix for the same operating conditions. (a) Dataset A_1, (b) Dataset A_2.

Figure 10. Results of feature visualization under different working conditions.

Figure 11. Confusion matrix for different operating conditions.

Figure 12. Results of feature visualization under sample imbalance.

Figure 13. Confusion matrix under sample imbalance.

Figure 14. Effect of the number of nodes on accuracy.

Figure 15. Effect of different distance calculation methods on ACC.

Table 1. Test bearing conditions.

Operating Condition	Bearing Dataset	Fault Types	Fault Description	Label
Condition 1	Bearing 1_3	N_1	Normal	0
	Bearing 1_3	OF_1	Outer race fault	1
	Bearing 1_4	CF	Cage fault	2
	Bearing 1_5	OIF	Inner and outer race compound fault	3
Condition 3	Bearing 3_1	N_2	Normal	4
	Bearing 3_1	OF_2	Outer race fault	5
	Bearing 3_2	ICRF	Inner race, cage and rolling body compound failures	6
	Bearing 3_3	IF	Inner race fault	7

Table 2. Description of the dataset.

Fault Types	Label	Training Samples (Dataset A)	Training Samples (Dataset B)	Training Samples (Dataset C)	Testing Samples (Dataset A/B)
N_1	0	80%	80%	50%	20%
OF_1	1	80%	80%	30%	20%
CF	2	80%	80%	30%	20%
OIF	3	80%	80%	10%	20%
N_2	4	80%	80%	-	20%
OF_2	5	80%	80%	20%	20%
ICRF	6	80%	80%	10%	20%
IF	7	80%	80%	20%	20%

Table 3. Structure of the model.

Layer	Filter	Nodes	Edges
Input	512	640	5760
GConv_1	512 × 1024	640	5760
BatchNorm_1	1024	640	5760
ReLU_1	-	-	-
GConv_2	1024 × 1024	640	5760
BatchNorm_2	1024	640	5760
ReLU_2	-	-	-
FC_1	1024 × 512	640	5760
Dropout ratio	0.2	640	5760
FC_2	512 × C	640	5760

Table 4. DGCN performance on different datasets.

Dataset	ACC (%)	TPR (%)	FAR (%)
Dataset A_1	1	98	0.7
Dataset A_2	1	98	0.5
Dataset B	1	98	0.2
Dataset C	1	98	0.2

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
