Article

Mechanical Fault Diagnosis of High-Voltage Circuit Breakers with Dynamic Multi-Attention Graph Convolutional Networks Based on Adaptive Graph Construction

State Key Laboratory of Electrical Insulation and Power Equipment, Xi’an Jiaotong University, Xi’an 710000, China
*
Author to whom correspondence should be addressed.
Appl. Sci. 2024, 14(10), 4036; https://doi.org/10.3390/app14104036
Submission received: 26 March 2024 / Revised: 28 April 2024 / Accepted: 30 April 2024 / Published: 9 May 2024
(This article belongs to the Section Electrical, Electronics and Communications Engineering)

Abstract

With the rapid development of deep learning, its powerful capabilities make it possible to perform mechanical fault diagnosis of high-voltage circuit breakers (HVCBs). Among deep learning approaches, the convolutional neural network (CNN) is widely used. However, while it can extract features effectively, it also has limitations: it depends on a large amount of training data and uses only numerical information, ignoring structural information. These shortcomings leave information unused and lead to unsatisfactory model results. To address them, this paper proposes AKNN-DMGCN, a novel dynamic multi-attention graph convolutional network based on an adaptively constructed graph, which can achieve accurate and robust mechanical fault diagnosis of HVCBs. First, a novel adaptive k-nearest neighbor (AKNN) graph construction method is proposed to construct informative graphs. The AKNN method can mine the relationships between the original data samples and utilize both the data and label information. Thus, it has high fault tolerance to noise signals and can construct a structure graph with rich and accurate information, which improves the overall model performance. Then, a dynamic multi-attention graph convolutional network (DMGCN) is applied for mechanical fault diagnosis of HVCBs. DMGCN fully utilizes the structural and numerical information representing HVCB signals to perform classification. Its dynamic multi-attention mechanism has strong expressive ability, which allows it to achieve high diagnostic accuracy. The experimental results indicate that the accuracy of AKNN-DMGCN reaches 97.22% on a balanced dataset and 95.01% on an imbalanced dataset, demonstrating that the proposed method is effective for both balanced and imbalanced samples.

1. Introduction

High-voltage circuit breakers (HVCBs) serve the function of protection and control in power systems, and their normal operation guarantees the security and stability of transmission and distribution systems [1]. Abnormal operation of HVCBs can lead to serious consequences, such as damage to the power grid and personal harm to staff. Therefore, fault diagnosis of HVCBs has high practical significance [2].
With the rapid development of computing power, machine learning and deep learning have been widely used in the fields of fault diagnosis and defect analysis. Konovalenko et al. [3] investigated the use of residual neural networks for classifying defects, with a classifier based on the ResNet50 network as the foundation. ResNet50 was shown to provide excellent recognition, high speed, and accuracy, making it an effective tool for detecting defects on metal surfaces. Havryliuk [4] proposed a method combining pre-processing of the time dependence of relay transient current with artificial neural networks, which makes it possible to reveal the features of the main defects of the relay armature, contact springs, and magnetic system. Havryliv et al. [5] applied deep learning to surface defect detection in industrial ceramics and demonstrated its effectiveness, selecting defect features and producing more stable detection results. Kozlenko et al. [6] used a classification model based on an artificial multilayer dense feed-forward neural network and a deep learning approach for software-implemented diagnosis of a GTK-25-i pumping unit, with results competitive with the latest industry research findings. Neftissov et al. [7] presented a machine-learning-based monitoring system to prevent and predict possible damage or malfunctions. Shumilo et al. [8] presented a novel data augmentation method employing generative adversarial networks (GANs) with pixel-to-pixel transformation (pix2pix). This method can improve the control of training data balance and provides a new idea for the diagnosis of imbalanced data.
During operation, an HVCB generates signals such as sound waves and vibrations that contain a large amount of equipment status information. By analyzing these signals, the status of the HVCB can be assessed to detect potential or early failures. Among the different signals, the vibration signal offers strong anti-interference capability, high sensitivity, and a broad range of applications [9]. Therefore, a series of mechanical fault diagnosis methods for circuit breakers based on vibration signal analysis and deep learning have been widely researched and applied [10,11].
Among these methods, the data-driven convolutional neural network (CNN) has strong learning and representation ability and has achieved favorable results in general circuit breaker fault diagnosis. Ye et al. [12] proposed a capsule CNN, which fully mines the mechanical state information in the vibration signal to improve diagnostic accuracy. Yang et al. [13] employed a CNN to evaluate the state of HVCBs, thereby avoiding manual feature extraction and improving the assessment accuracy. In addition, Márquez-Vera et al. [14] proposed an inverse fuzzy fault model to detect and isolate faults, which requires only four fuzzy rules and achieves a shorter isolation time than a fuzzy classifier. Zhang et al. [15] proposed a multi-sensor information fusion approach that identifies mechanical faults of high-voltage circuit breakers more accurately and more quickly. Yan et al. [16] proposed a lightweight CNN construction method that can achieve fault diagnosis of HVCBs under big data.
However, the above methods only utilize numerical information and ignore the structural topology information of fault features, resulting in insufficient information utilization. To address these shortcomings, graph convolutional networks (GCNs) have been proposed and widely used in fault diagnosis [17]. Li et al. [17] introduced a multi-receptive field GCN into mechanical fault diagnosis and observed that the GCN can make full use of representational features to enhance the diagnostic accuracy. Chen et al. [18] proposed a GCN-based diagnosis method, which achieved higher performance than the original diagnosis method by using hybrid metrics and prior knowledge. Zhang et al. [19] proposed a new adaptive propagation graph convolutional network model based on the attention mechanism, which is superior to the baseline model and can improve the accuracy of node- and graph-level classification tasks in downstream tasks.
Graph construction in GCNs is a very important step, as the constructed graph structure determines the performance of the model [20]. At present, many graph construction methods have been proposed. Sun et al. [21] combined an autoencoder with existing graph construction methods to propose a novel graph construction method. This proposed method uses the sigmoid activation function and the top-k order to identify the first k neighbors to complete graph construction. Li et al. [22] proposed a dynamic time warping (DTW) graph construction method. This method first calculates the DTW between samples; then, for the central sample, k neighbors with the smallest DTW are connected to construct the graph. Gao et al. [23] used the Euclidean distance to determine the similarity between nodes; then, they used the top-k order to determine the k neighbors of nodes to construct the graph.
In the above-mentioned methods, the measure of similarity between nodes differs, but the principle of graph construction is the same, namely k-nearest neighbor (KNN) graph construction. This approach has only one hyperparameter, k, which is a major limitation. In addition, it does not use the label information of the original data, leaving information unused. Furthermore, although the GCN has achieved relatively favorable results, its adjacency matrix is composed of zeros and ones, which means that every neighbor of a central node in the graph is treated as equally important. This makes it impossible to pay sufficient attention to key nodes and limits the expressive ability of the GCN.
To address the above problems, this paper proposes a dynamic multi-attention graph convolutional network (DMGCN) based on adaptively constructed graphs called AKNN-DMGCN for mechanical fault diagnosis of HVCBs. First, a new dynamic adaptive k-nearest neighbor (AKNN) graph construction method is proposed to construct the graph. The AKNN method has high fault tolerance to noise signals and can construct a graph structure with rich and accurate information, which can fully utilize the original data information and enhance the overall performance of the method. Then, DMGCN is adopted to mine the numerical and structural information of the graph for classification. Dynamic multi-attention [24] ensures that the method assigns attention coefficients according to the condition of each node, which improves the expressive power. According to the experimental results, AKNN-DMGCN exhibits high performance on both balanced and imbalanced datasets and can achieve better diagnosis of HVCB mechanical faults than traditional methods. The main achievements of this study are as follows:
(1)
Graph construction is an important step in the diagnosis process of GCNs. This paper proposes a new AKNN graph construction method that is able to take full advantage of the numerical and label information of the original data and assign different k according to different node conditions, allowing the graph to contain richer information.
(2)
A dynamic multi-attention mechanism is added to the GCN to form DMGCN. To the best of our knowledge, this is the first time that DMGCN is used in the mechanical fault diagnosis of HVCBs. DMGCN not only adds a multi-attention mechanism to the GCN but also achieves the transformation from static attention to dynamic attention.
(3)
This paper proposes an AKNN-DMGCN model for HVCB mechanical fault diagnosis. This model exhibits high performance on both balanced and imbalanced datasets and achieves high-precision diagnosis of HVCB mechanical faults.

2. Preliminaries

2.1. Graph Construction

The construction of the association graph is the process of extracting the original numerical and feature information and converting it into a graph, which includes the relationship between each node and the data characteristics of each node [25]. The constructed graph structure determines the performance of the model.
Let G(V, E) represent a graph, where V is the set of nodes and E is the set of edges. Generally, each sample of the original data is regarded as a node, and the characteristics of the nodes in the graph are defined by the sample characteristics. The adjacency matrix A represents the structural information between nodes. Its entries are zeros and ones, defined as follows:
$$A_{ij} = \begin{cases} 1, & \text{if } v_i \leftrightarrow v_j \\ 0, & \text{otherwise} \end{cases} \quad (1)$$

where $v_i \leftrightarrow v_j$ indicates that $v_i$ and $v_j$ are connected by an edge.
To provide accurate information for the subsequent model, nodes of the same category in the graph should be connected. Nodes of the same category often have high similarity; therefore, common graph construction methods calculate the similarity between every two nodes and determine whether the nodes are connected by an edge. Among these methods, the KNN graph construction method is one of the most common. Its principle is to connect the k nodes most similar to the central node. The similarity between nodes can be represented by the Euclidean distance, as expressed by Equation (2):
$$D(v_i, v_j) = \sqrt{\sum_{l=1}^{n} \left(v_i^l - v_j^l\right)^2} \quad (2)$$

where $D(v_i, v_j)$ represents the Euclidean distance between $v_i$ and $v_j$, $n$ represents the number of features of each node, and $v_i^l$ and $v_j^l$ represent the $l$-th feature of $v_i$ and $v_j$, respectively. If $i$ and $j$ satisfy Equation (3) or (4),

$$D_{ij} \in \min\nolimits_k D_i \quad (3)$$

$$D_{ij} \in \min\nolimits_k D_j \quad (4)$$

the corresponding entry of the adjacency matrix can be expressed as

$$A_{ij} = 1 \quad (5)$$

where $A_{ij}$ indicates the connection status of the edge between $v_i$ and $v_j$ at row $i$ and column $j$ of the adjacency matrix; $D_{ij}$ is the Euclidean distance between $v_i$ and $v_j$; and $\min_k D_i$ denotes the set of the top-$k$ smallest Euclidean distances between $v_i$ and all other nodes.
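As a concrete illustration of Equations (2)–(5), the following is a minimal sketch of KNN graph construction in Python, assuming the samples are provided as a NumPy array; the function name and the symmetrization step are illustrative choices rather than details taken from the paper.

```python
import numpy as np

def knn_adjacency(X: np.ndarray, k: int) -> np.ndarray:
    """Build a symmetric KNN adjacency matrix from raw samples.

    X : (N, n) array with one row per node (sample) and n features per node.
    k : number of nearest neighbors connected to each node.
    """
    N = X.shape[0]
    # Pairwise Euclidean distances, Equation (2)
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(D, np.inf)           # exclude self-distances

    A = np.zeros((N, N))
    for i in range(N):
        neighbors = np.argsort(D[i])[:k]  # top-k smallest distances, Equations (3)-(4)
        A[i, neighbors] = 1               # Equation (5)
    return np.maximum(A, A.T)             # an edge exists if either endpoint selects the other
```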

2.2. GCNs

A GCN is a feature extraction method for graph-structured data and has several advantages over a CNN. Traditional CNNs depend on a large amount of training data. In addition, they can only extract the numerical information of samples and cannot effectively use their structural information. As a result, the sample information is not fully used, and the diagnostic accuracy of CNNs is not ideal. The GCN can take full advantage of both structural and numerical information and reduce the dependence on samples, thereby enabling good performance with a small number of samples or with sample imbalance. Graph construction and forward propagation are important components of the GCN. The forward propagation process of the GCN is similar to the convolution operation; that is, for a central node, the information of its neighbors and of the node itself is aggregated into its characteristic information. The expression is as follows:
$$h_i^{(k)} = \sigma\left( W^{(k-1)} \sum_{u \in N(i)} \frac{h_u^{(k-1)}}{|N(i)|} + B^{(k-1)} h_i^{(k-1)} \right) \quad (6)$$

where $h_i^{(k)}$ represents the characteristics of $v_i$ in the $k$-th layer of the neural network, $N(i)$ represents the neighbors of $v_i$, $|N(i)|$ is the number of neighbors of $v_i$, $W^{(k-1)}$ and $B^{(k-1)}$ are the learnable weight parameters of the neural network, and $\sigma$ stands for the activation function.
The mathematical principle of the GCN is mainly to explore the properties of the graph with the help of the eigenvalues and eigenvectors of the Laplacian matrix [26]. After a series of optimizations, such as Chebyshev polynomial approximation, the final graph convolution expression can be expressed as follows:
$$h^{(k)} = \mathrm{ReLU}\left(\hat{A}\, h^{(k-1)} W^{(k-1)}\right) \quad (7)$$

$$\tilde{A} = A + I_N \quad (8)$$

$$\hat{A} = \tilde{D}^{-\frac{1}{2}} \tilde{A}\, \tilde{D}^{-\frac{1}{2}} \quad (9)$$

$$\tilde{D}_{ii} = \sum_j \tilde{A}_{ij} \quad (10)$$

where $A$ is the adjacency matrix, $h^{(k)}$ is the feature matrix of the nodes in the $k$-th layer of the GCN, $W^{(k-1)}$ is the trainable weight matrix of the network, $\tilde{A}$ is the adjacency matrix with self-connections added, $\tilde{D}$ is the degree matrix of $\tilde{A}$, $I_N$ is the identity matrix, $\hat{A}$ is the normalized adjacency matrix, and $\mathrm{ReLU}$ is the activation function.
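For illustration, the following is a minimal PyTorch sketch of the propagation rule in Equations (7)–(10), assuming a dense adjacency matrix; the class name and layer sizes are illustrative.

```python
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    """One graph convolution layer implementing h^(k) = ReLU(A_hat h^(k-1) W^(k-1))."""

    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.weight = nn.Linear(in_dim, out_dim, bias=False)

    @staticmethod
    def normalize_adjacency(A: torch.Tensor) -> torch.Tensor:
        # A_tilde = A + I_N (Equation (8)); A_hat = D~^(-1/2) A_tilde D~^(-1/2) (Equations (9)-(10))
        A_tilde = A + torch.eye(A.size(0), device=A.device)
        d_inv_sqrt = torch.diag(A_tilde.sum(dim=1).pow(-0.5))
        return d_inv_sqrt @ A_tilde @ d_inv_sqrt

    def forward(self, h: torch.Tensor, A_hat: torch.Tensor) -> torch.Tensor:
        # Equation (7): propagate features through the normalized adjacency matrix
        return torch.relu(A_hat @ self.weight(h))
```

A multi-layer network would call `normalize_adjacency` once on the constructed graph and then stack several `GCNLayer` instances.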

3. Proposed Model

3.1. Overview of Proposed Model

This paper proposes a novel AKNN-DMGCN model that can make full use of the numerical and structural characteristics of the mechanical fault signals of HVCBs and achieve high-precision, robust diagnosis of HVCB mechanical faults. Its diagnostic framework is presented in Figure 1. First, a new AKNN method is proposed to construct a graph structure from the mechanical fault data of the HVCB. The advantage of this method is that the label information of the original data is used during graph construction, and a calculation coefficient is introduced to adjust the calculation weight, so the resulting graph structure contains richer information. Then, to fully exploit the features of the graph structure and achieve adaptive high-precision mechanical fault diagnosis of HVCBs, the constructed graph is input into DMGCN for fault feature extraction and classification, where dynamic multi-attention improves the expressive ability of the model. In this way, high-precision and robust diagnosis of HVCB mechanical faults is achieved.

3.2. AKNN Graph Construction

The traditional KNN graph construction method has significant limitations [27]. Namely, it has only one hyperparameter k; as a result, the choice of k directly determines the performance of the method. At present, the selection of k mainly depends on empirical judgment with certain limitations. In addition, the single k value is not suitable for all nodes, especially in imbalanced datasets, and often performs poorly. This is because when the samples are imbalanced, the method is biased toward the class with a dominant sample number. For classes with a small number, the accuracy of class prediction is very low. Furthermore, the KNN graph construction method does not use the label information of the training set, resulting in unused information.
To construct a graph structure with rich and accurate information, the concept of adaptive graph construction [28] is used for reference, and a graph construction method based on the principle of AKNN is introduced here. This method can adaptively select the k value according to the neighbors of each node. In addition, it can fully utilize the numerical and label information of the data to obtain deeper hidden information. The method introduces two new variables: the weight of neighbors W and the accuracy p. The graph construction process is as follows.
First, the value range of k is set, that is, the number of edges connected to each node is set to 1–z, and the value of z is determined according to the number of nodes. Then, the weight of the neighbors of the node and the accuracy p corresponding to different k values for each node are calculated to determine the number of connected edges. The expression of the graph construction process is as follows:
$$W_{ij} = \exp\left(-D_{ij}^2 / 2\right) \quad (11)$$

$$\mathrm{node}_i = \mathrm{sort}\left(\max\nolimits_z W_{ij}\right) \quad (12)$$

$$p_i^t = \frac{\sum_{m=1}^{t} \left(Y_i == Y_{\mathrm{node}_i^m}\right) W_{\mathrm{node}_i^m}}{t}, \quad t \in [1, z] \quad (13)$$

$$\left(Y_i == Y_{\mathrm{node}_i^m}\right) = \begin{cases} 1, & Y_i = Y_{\mathrm{node}_i^m} \\ 0, & Y_i \neq Y_{\mathrm{node}_i^m} \end{cases} \quad (14)$$

$$k_i = \arg\max\nolimits_t\, p_i^t \quad (15)$$

Equation (11) calculates the weight between nodes, where $W_{ij}$ is the weight of the edge between $v_i$ and $v_j$, and $D_{ij}$ is the Euclidean distance between $v_i$ and $v_j$. Equation (12) selects the $z$ neighbors of $v_i$ with the largest weights and arranges them from large to small, where $\mathrm{node}_i$ denotes the set of these $z$ neighbor nodes, and $\mathrm{sort}$ denotes arranging the nodes by weight in descending order. Equation (13) calculates the accuracy corresponding to different $k$ values of $v_i$, where $p_i^t$ denotes the accuracy $p$ when $v_i$ has $t$ connected edges, forming a set of $z$ values; $Y_i$ indicates the category to which $v_i$ belongs; and $Y_{\mathrm{node}_i^m}$ indicates the category of the node with the $m$-th largest weight among the neighbors of $v_i$. Equation (14) is a judgment formula: if the categories of the two nodes are the same, its value is one; otherwise, it is zero. Equation (15) calculates the most suitable $k$ value for $v_i$, where $k_i$ is the value of $t$ corresponding to the maximum of $p_i^t$, that is, the optimal number of connected edges for $v_i$.
However, experiments revealed that most of the nodes in the graph structure constructed by this initial AKNN formulation have only one edge. This is because the weight function produces very little distinction between nodes at different distances, and the accuracy function p makes the calculation cost of adding an edge too high. As a result, the method always tends to build a graph structure with a small number of edges and ignores important structural information, which is undesirable. Therefore, the AKNN method modifies the weight function and the accuracy calculation function.
After calculation, the Euclidean distances between the nodes are distributed between zero and one. A comparison of the weight functions before and after modification is presented in Figure 2. It can be seen that the original weight function (11) yields very small weight differentiation between nodes at different distances, so the weights of most nodes are concentrated in a small interval. In contrast, the modified weight function (16) increases the differentiation of the node weights. Moreover, the calculation coefficient θ and the weight coefficient γ are introduced to increase the generalization of the method and to adjust the calculation cost of adding an edge. Equations (11) and (13) are replaced with Equations (16) and (17):
$$W_{ij} = \left(1 - \tanh(D_{ij})\right)^{\frac{1}{4}} \quad (16)$$

$$p_i^t = \frac{\sum_{m=1}^{t} \left(Y_i == Y_{\mathrm{node}_i^m}\right) W_{\mathrm{node}_i^m}}{t^{\theta}}, \quad t \in [1, z] \quad (17)$$

$$\theta = \gamma \bar{W} \quad (18)$$
where $\theta$ is the calculation coefficient, positively related to the node weights; $\gamma$ is the weight coefficient; and $\bar{W}$ is the average node weight. Modifying the weight and accuracy calculations increases the differentiation between nodes at different distances, reduces the calculation cost of adding edges, and allows the AKNN graph construction method to focus not only on the local information of the few nearest nodes around the central node but also on the global information of additional neighbor nodes. The graph construction process for the HVCB can be summarized as follows. First, each high-voltage circuit breaker signal is converted into a node in the graph, with the signal used as the node's features. Then, the weight between each pair of nodes is calculated according to Equation (16), and the best number of edges $k_i$ for each node is determined according to Equations (15), (17), and (18). Finally, each node is connected to its $k_i$ neighbors with the largest weights, completing the construction of the graph; a code sketch of this procedure is given below.
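The following sketch summarizes the AKNN procedure (Equations (15)–(18)) under the assumption that every node carries a training label Y; the handling of unlabeled nodes, the default value of z, and the function name are simplified, illustrative choices rather than details from the paper.

```python
import numpy as np

def aknn_adjacency(X: np.ndarray, Y: np.ndarray, z: int = 20, gamma: float = 1.75) -> np.ndarray:
    """Adaptive-k graph construction, following Equations (15)-(18).

    X : (N, n) feature matrix; Y : (N,) integer class labels;
    z : maximum candidate number of edges per node; gamma : weight coefficient.
    """
    N = X.shape[0]
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))
    W = (1.0 - np.tanh(D)) ** 0.25            # modified weight function, Equation (16)
    np.fill_diagonal(W, 0.0)                  # no self-edges

    theta = gamma * W[W > 0].mean()           # calculation coefficient, Equation (18)
    A = np.zeros((N, N))
    for i in range(N):
        order = np.argsort(-W[i])[:z]         # z neighbors with the largest weights
        same = (Y[order] == Y[i]).astype(float)
        best_k, best_p = 1, -np.inf
        for t in range(1, z + 1):             # accuracy p for each candidate k, Equation (17)
            p_t = (same[:t] * W[i, order[:t]]).sum() / (t ** theta)
            if p_t > best_p:
                best_p, best_k = p_t, t
        A[i, order[:best_k]] = 1              # connect the best_k strongest neighbors, Equation (15)
    return np.maximum(A, A.T)                 # symmetrize
```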

3.3. Dynamic Multi-Attention GCNs

Although a GCN can utilize numerical and structural information, it cannot adaptively assign weight coefficients according to the importance of neighbors, which leads to insufficient attention to key neighbor information. The graph attention network (GAT) solves this problem to a certain extent by adding an attention mechanism to the GCN. The GAT can weight connected edges according to the characteristics of the nodes and thus differentiate the weights of the neighbors of a central node. The core of the GAT is the attention coefficient. Figure 3 presents a schematic diagram of the attention layer, which is expressed as follows:
$$e_{ij} = \mathrm{LeakyReLU}\left(\alpha^{T}\left(W h_i \,\Vert\, W h_j\right)\right) \quad (19)$$

where $e_{ij}$ represents the attention coefficient of the edge connecting $v_i$ and $v_j$, and $h$ is the node feature matrix, $h \in \mathbb{R}^{N \times M}$; that is, there are $N$ nodes in total, and each node has $M$ features. In addition, $W \in \mathbb{R}^{M \times F}$ is the trainable weight matrix of the network, which adjusts the output feature dimension; $\alpha$ is a trainable weight vector of dimension $2F$; and $\Vert$ is the concatenation operation. To better represent the importance of neighbors, the attention coefficient must be normalized:
$$a_{ij} = \mathrm{softmax}\left(e_{ij}\right) = \frac{\exp\left(\mathrm{LeakyReLU}\left(\alpha^{T}\left(W h_i \Vert W h_j\right)\right)\right)}{\sum_{k \in N_i} \exp\left(\mathrm{LeakyReLU}\left(\alpha^{T}\left(W h_i \Vert W h_k\right)\right)\right)} \quad (20)$$

where $a_{ij}$ represents the weight of the edge between $v_i$ and $v_j$, $\mathrm{LeakyReLU}$ is the activation function, and $N_i$ represents all neighbors of $v_i$. After obtaining the edge weights, the node feature matrix of the next layer of the network can be obtained according to Equation (21), which achieves the forward propagation of the GAT node features:

$$h_i' = \sigma\left(\sum_{j \in N_i} a_{ij} W h_j\right) \quad (21)$$

where $h_i'$ represents the updated feature of $v_i$, and $\sigma$ represents the activation function. A sketch of this single-head attention layer is given below.
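For reference, the following is a compact PyTorch sketch of a single-head GAT attention layer (Equations (19)–(21)); a dense adjacency matrix with self-loops, an ELU output activation, and the initialization scheme are simplifying assumptions rather than details from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GATLayer(nn.Module):
    """Single-head graph attention layer, Equations (19)-(21)."""

    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim, bias=False)
        self.a = nn.Parameter(torch.randn(2 * out_dim) * 0.1)  # attention vector alpha

    def forward(self, h: torch.Tensor, A: torch.Tensor) -> torch.Tensor:
        Wh = self.W(h)                                          # (N, F)
        N = Wh.size(0)
        # e_ij = LeakyReLU(alpha^T [W h_i || W h_j]), Equation (19)
        pairs = torch.cat([Wh.unsqueeze(1).expand(N, N, -1),
                           Wh.unsqueeze(0).expand(N, N, -1)], dim=-1)
        e = F.leaky_relu(pairs @ self.a)
        # keep only existing edges before the softmax of Equation (20)
        alpha = torch.softmax(e.masked_fill(A == 0, float('-inf')), dim=1)
        # h_i' = sigma(sum_j a_ij W h_j), Equation (21)
        return F.elu(alpha @ Wh)
```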
However, the attention in the GAT is static; that is, there are sets of nodes whose maximum-weight neighbor is the same node. Decomposing $\alpha$ into $\alpha = [\alpha_1 \Vert \alpha_2]$ allows the attention score between nodes to be written as follows:

$$e_{ij} = \mathrm{LeakyReLU}\left(\alpha_1^{T} W h_i + \alpha_2^{T} W h_j\right) \quad (22)$$

Among all nodes of the graph, there must be a node $j_{max}$ that maximizes $\alpha_2^{T} W h_j$. Let $i$ be any neighbor of $j_{max}$; then, the neighbor with the largest weight for $i$ is also $j_{max}$. In the same way, for all neighbor nodes of $j_{max}$, the neighbor with the largest weight is $j_{max}$. This is the static attention of the GAT. The expressive ability of static attention is limited, which reduces the expressive ability of the GAT.
This paper uses DMGCN to add multi-attention to the GCN. In addition, it achieves the transformation from static attention to dynamic attention, making the model more expressive and robust. First, the operation order of the attention coefficient formula in the GAT is changed to transition from static attention to dynamic attention, and Equation (19) is transformed into Equation (23). The remainder of the forward propagation process is consistent with that of the GAT.
$$e_{ij} = \alpha^{T} \mathrm{LeakyReLU}\left(W \left[h_i \Vert h_j\right]\right) \quad (23)$$
Then, the calculation vector $\alpha$ and the weight matrix $W$ are initialized $N$ times with different values, and the weight $a_{ij}$ is calculated $N$ times. Finally, the weight values are integrated to achieve the multi-attention mechanism. Figure 4 presents a schematic diagram of the multi-attention mechanism. There are generally two methods of integration: splicing (concatenation) and averaging.
$$\text{splicing:}\quad h_i' = \Big\Vert_{n=1}^{N} \sigma\left(\sum_{v_j \in N(v_i)} a_{ij}^{n} W^{n} h_j\right), \qquad \text{averaging:}\quad h_i' = \sigma\left(\frac{1}{N} \sum_{n=1}^{N} \sum_{v_j \in N(v_i)} a_{ij}^{n} W^{n} h_j\right) \quad (24)$$
Here, $N$ denotes the number of attention heads; that is, the weight of each node is calculated $N$ times. Intermediate layers of the network generally use the splicing method to make the features more expressive; however, splicing in the last layer of the network is not meaningful, so the averaging operation is used in the last layer instead. A sketch of such a dynamic multi-attention layer is given below.
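As an illustration of Equations (23) and (24), the following PyTorch sketch implements one dynamic multi-attention layer with N heads. A dense adjacency matrix with self-loops is assumed, a separate per-head value transform is used for the aggregation step, and the head count and initialization are illustrative choices, not details taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicMultiAttentionLayer(nn.Module):
    """Dynamic multi-attention layer in the sense of Equations (23)-(24)."""

    def __init__(self, in_dim: int, out_dim: int, heads: int = 4, concat: bool = True):
        super().__init__()
        self.concat = concat
        # per-head transforms: W_att scores node pairs (Eq. 23), W_val transforms neighbor features (Eq. 24)
        self.W_att = nn.ModuleList([nn.Linear(2 * in_dim, out_dim, bias=False) for _ in range(heads)])
        self.a = nn.ParameterList([nn.Parameter(torch.randn(out_dim) * 0.1) for _ in range(heads)])
        self.W_val = nn.ModuleList([nn.Linear(in_dim, out_dim, bias=False) for _ in range(heads)])

    def forward(self, h: torch.Tensor, A: torch.Tensor) -> torch.Tensor:
        N = h.size(0)
        pairs = torch.cat([h.unsqueeze(1).expand(N, N, -1),
                           h.unsqueeze(0).expand(N, N, -1)], dim=-1)       # [h_i || h_j]
        head_outputs = []
        for W_att, a, W_val in zip(self.W_att, self.a, self.W_val):
            # Equation (23): the nonlinearity is applied before the attention vector (dynamic attention)
            e = F.leaky_relu(W_att(pairs)) @ a
            alpha = torch.softmax(e.masked_fill(A == 0, float('-inf')), dim=1)
            head_outputs.append(alpha @ W_val(h))                          # weighted aggregation, Equation (24)
        if self.concat:                                                    # splicing for intermediate layers
            return F.elu(torch.cat(head_outputs, dim=-1))
        return F.elu(torch.stack(head_outputs).mean(dim=0))                # averaging for the output layer
```

In a full model, intermediate layers would use concat=True and the final classification layer concat=False, matching the splicing/averaging choice described above.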

4. Experimental Study

4.1. HVCB Vibration Signal Acquisition

The experimental platform and procedure are described as follows. This study used the VS1 (ZN63A) HVCB as the experimental prototype, with a rated operating voltage of 12 kV. The acceleration sensor was a CT1000LG accelerometer with a measuring range of 0–5000 g, a bandwidth resolution of 50 mg, and a frequency range of 1–15 kHz. The HVCB vibration signal was collected by the CT1000LG accelerometer. The signals were collected by the acquisition card, amplified by a charge amplifier, and transmitted to the host computer through the vibration trigger. Figure 5 displays the equipment used in the experimental platform.
The acceleration sensors were mounted on the HVCB beam. Faults are prone to occur during the closing and opening of the HVCB, and when a fault occurs, the circuit breaker can no longer be opened and closed normally. Even if the opening and closing function can occasionally be achieved, the contact opening distance and overtravel cannot return to normal after an operation. This experiment included four HVCB working conditions: normal operation, closing-spring fatigue failure, opening-spring fatigue failure, and a jammed iron core. A total of 400 samples were collected, i.e., 100 for each working condition. Figure 6 presents the corresponding waveforms.

4.2. Experimental Setup

The computer used in the experiments had an Intel Core i7-13700KF CPU and an NVIDIA RTX 3070 Ti graphics card with GPU acceleration, 10,496 CUDA cores, and a memory capacity of 12 GB. The operating system was Windows 10, the programming environment was PyTorch 2.1.0, and the Python interpreter version was 3.8.5. The Adam optimizer was selected with a learning rate of 0.001. The samples were randomly shuffled, and approximately 70% of the dataset was used as the training set while the remaining 30% was used as the test set. Each experiment was repeated 60 times, and the results were averaged to reduce the influence of randomness. A sketch of this training configuration is given below.
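The following sketch illustrates the training configuration described above (Adam optimizer, learning rate 0.001, random 70/30 split, 60 repeated runs). The model constructor, feature matrix, normalized adjacency, labels, and epoch count are placeholders and not values reported in the paper.

```python
import numpy as np
import torch

def run_once(model, features, A_hat, labels, epochs=200, seed=0):
    """One training/evaluation run: Adam (lr = 0.001), random 70/30 train/test split."""
    idx = torch.as_tensor(np.random.default_rng(seed).permutation(len(labels)))
    split = int(0.7 * len(labels))
    train_idx, test_idx = idx[:split], idx[split:]

    optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
    criterion = torch.nn.CrossEntropyLoss()
    for _ in range(epochs):
        model.train()
        optimizer.zero_grad()
        out = model(features, A_hat)                 # full-graph forward pass
        loss = criterion(out[train_idx], labels[train_idx])
        loss.backward()
        optimizer.step()

    model.eval()
    with torch.no_grad():
        pred = model(features, A_hat).argmax(dim=1)
    return (pred[test_idx] == labels[test_idx]).float().mean().item()

# Average test accuracy over 60 repeated runs to reduce the influence of randomness:
# accuracies = [run_once(build_model(), features, A_hat, labels, seed=s) for s in range(60)]
```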
To clarify the abbreviations used in this article, a list of abbreviations is provided at the end of the paper. To demonstrate that the proposed method performs well on both balanced and imbalanced datasets, several datasets were constructed from the data collected in the experiment. Dataset1 contained 100 samples of each of the four cases, while dataset2, dataset3, and dataset4 reduced the number of samples of one, two, and three of the cases to 30, respectively, yielding increasingly imbalanced datasets. Table 1 lists the composition of the four datasets, where case 0 represents normal operation, case 1 represents closing-spring fatigue failure, case 2 represents opening-spring fatigue failure, and case 3 represents a jammed iron core.
To evaluate the performance of the modified AKNN method proposed in this paper, the modified AKNN-GCN and AKNN-DMGCN and the unmodified AKNN-GCN and AKNN-DMGCN were selected for comparison, and the value of the weight coefficient of AKNN was analyzed. For the sake of distinction, the model before modification is referred to as BAKNN-GCN and BAKNN-DMGCN below. To evaluate the performance of the proposed AKNN-DMGCN model, CNN, KNN-GCN, KNN-DMGCN, AKNN-GCN, and AKNN-DMGCN were selected for comparison.

4.3. Experiment Results and Discussion

Four models—AKNN-GCN, AKNN-DMGCN, BAKNN-GCN, and BAKNN-DMGCN—were run on the four datasets, and the weight coefficient γ in the AKNN method was 1.75. Table 2 presents the average accuracy of the four models on the four datasets. As displayed in the table, the accuracy of AKNN-GCN on the four datasets is 1.46%, 2.4%, 3.03%, and 1.12% higher than that of BAKNN-GCN, respectively. Furthermore, the accuracy of AKNN-DMGCN on the four datasets is 1.17%, 3.64%, 2.99%, and 2.6% higher than that of BAKNN-DMGCN, respectively. It can be seen that the improved AKNN method greatly improves the overall performance of the model. This is because the model learns the characteristics of the data from the previously constructed graph; therefore, the information contained in the constructed graph directly affects the overall performance of the model. In addition, the improved AKNN method improves the accuracy of DMGCN more than the accuracy of the GCN, which also demonstrates that the learning ability of DMGCN is stronger than that of the GCN. The two most representative sets—dataset1 and dataset4—were selected, and Figure 7 and Figure 8 present the confusion matrices of the four models under dataset1 and dataset4, respectively. The vertical axis represents the actual label, and the horizontal axis represents the predicted label. If the abscissa value is equal to the ordinate value, the diagnosis is correct, that is, the diagonal line represents the accuracy. What needs to be emphasized here is that the percentage in each cell represents the accuracy of a category, that is, the sum of each row in the confusion matrix should be 100%.
The BAKNN method was used to construct the graphs for dataset1 and dataset4. Figure 9 and Figure 10 present the graph structures constructed for dataset1 and dataset4, respectively. There are 400 nodes and 486 edges in Figure 9 and 190 nodes and 276 edges in Figure 10. It can be seen that most of the nodes in the graphs constructed by the BAKNN method have only one connected edge. This prevents the structural information of the original data from being captured, resulting in unsatisfactory model accuracy. Thus, BAKNN is improved into AKNN. With the weight coefficient in AKNN set to γ = 1.75, the graphs constructed for dataset1 and dataset4 are presented in Figure 11 and Figure 12, respectively. Figure 11 contains 400 nodes and 3362 edges, while Figure 12 contains 190 nodes and 1411 edges.
These results are obtained because there is very little weight difference between nodes with different distances in BAKNN, and the calculation cost of adding an edge to a node is too high; therefore, BAKNN always tends to choose a smaller k value to ensure a high p value. In contrast, AKNN increases the weight difference of nodes with different distances by improving the weight function and accuracy calculation function. By introducing calculation coefficients and weight coefficients, the calculation cost of adding an edge in the method is adjusted, and the data characteristics can be extracted fully to construct a structure graph containing more global information. In this paper, the value of the weight coefficient γ in the AKNN method is also studied. Table 3 presents the accuracy of AKNN-DMGCN for different values of γ , demonstrating that the choice of weight coefficient has a major impact on method performance. The performance of the method is the highest when γ is approximately 1.75.
For the KNN-based comparison models, the value of k was calculated according to the empirical formula [24] to perform the KNN graph construction. Figure 13 presents the results of testing on the four datasets using the BP neural network (BPNN), CNN, KNN-GCN, AKNN-GCN, KNN-DMGCN, and AKNN-DMGCN models. As illustrated in the figure, for the balanced dataset, dataset1, the accuracy of AKNN-DMGCN reaches 97.22%, which is the highest among all methods, demonstrating the high expressive ability of this method. As the dataset imbalance increases, the accuracy of each algorithm decreases. The accuracies of the CNN and BPNN decrease the most, while the accuracy of AKNN-DMGCN decreases the least. This is because the traditional CNN and BPNN only pay attention to numerical information and ignore structural information; therefore, they cannot obtain the full information of the signal and rely on a large amount of labeled data. The GCN can comprehensively consider both the structural and numerical information. Therefore, it can still exhibit high performance in the case of imbalanced datasets and fewer training samples.
For the most imbalanced dataset, dataset4, the accuracy of AKNN-DMGCN only decreases by 2.21%, and the accuracy reaches 95.01%, which is 6.52% higher than that of KNN-GCN. This is partly due to the limited ability of KNN to construct graphs, resulting in a certain degree of unused information. In addition, because the GCN pays the same attention to all neighbors in the process of node feature forward propagation, the model is unable to adaptively assign weights according to the importance of neighbor nodes and does not distinguish between neighbor nodes. This significantly affects the model’s ability to learn representative and important features in the graph structure. In contrast, AKNN can more effectively mine the relationship between nodes to adaptively assign different k values to each node, filter noise information, and ensure that key information is not lost, thereby constructing a more informative graph structure. DMGCN not only introduces a multi-attention mechanism but also achieves the transformation from static attention to dynamic attention, which greatly enhances the model’s expressive ability. The combination of the powerful graph construction ability of AKNN and the powerful expressive ability of DMGCN greatly improves the model accuracy. In addition, there is basically no difference in the running times of these models in our dataset, which indicates that the algorithmic complexity of the proposed models has not increased. Figure 14 presents box plots of the accuracy of the five methods on the four datasets. As illustrated in the figure, the maximum and minimum values and median and quartiles of AKNN-DMGCN are significantly higher than those of other methods.
To demonstrate the differences between the models more intuitively, the results of four models—KNN-GCN, AKNN-GCN, KNN-DMGCN, and AKNN-DMGCN—were compared on the balanced dataset, dataset1, and the most imbalanced dataset, dataset4. t-Distributed stochastic neighbor embedding (t-SNE) was used to visualize and analyze the output vectors of each model after training, as illustrated in Figure 15 and Figure 16; a code sketch of this visualization is given below. It can be seen that the four fault classes in the KNN-GCN visualization are not completely separated, and the boundaries are not sufficiently clear. The AKNN-GCN visualization is somewhat better but still not ideal. In contrast, the AKNN-DMGCN visualization has obvious boundaries, and its classes are easy to distinguish, which further demonstrates its superior performance. As a result, AKNN-DMGCN provides a reliable reference for the fault diagnosis of HVCBs.
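For reference, a minimal sketch of the t-SNE visualization used for Figures 15 and 16 is given below, assuming the trained model's output vectors and class labels are available as NumPy arrays named `embeddings` and `labels`; the scikit-learn parameters shown are defaults, not settings from the paper.

```python
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

def plot_embeddings(embeddings, labels, class_names):
    """Project model output vectors to 2-D with t-SNE and color the points by fault class."""
    coords = TSNE(n_components=2, random_state=0).fit_transform(embeddings)
    for c, name in enumerate(class_names):
        mask = labels == c
        plt.scatter(coords[mask, 0], coords[mask, 1], s=10, label=name)
    plt.legend()
    plt.show()

# plot_embeddings(embeddings, labels,
#                 ["normal", "closing-spring fatigue", "opening-spring fatigue", "jammed iron core"])
```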

5. Conclusions

To fully utilize vibration data and achieve accurate and robust diagnosis of the mechanical faults of HVCBs, the AKNN-DMGCN model is presented in this paper. A new AKNN graph construction method is proposed that can construct a more informative and accurate graph structure. According to the experimental results, the accuracy of AKNN-DMGCN on dataset1, dataset2, dataset3, and dataset4 is 2.58%, 4.34%, 2.75%, and 5% higher than that of KNN-DMGCN, respectively. In addition, the accuracy of AKNN-GCN on the four datasets is 3.35%, 2.54%, 4.21%, and 1.65% higher than that of KNN-GCN, respectively. It can be seen that, compared with KNN graph construction, AKNN graph construction extracts the internal information of the data more comprehensively. At the same time, the DMGCN algorithm is proposed to identify the type of fault. DMGCN not only introduces a multi-attention mechanism but also achieves the transformation from static to dynamic attention, which solves the problem of the GCN paying insufficient attention to key nodes and too much attention to noise. Therefore, DMGCN can differentiate the weights of the neighbors of a central node and take full advantage of the topological information, endowing the model with stronger anti-interference capability and robustness. For dataset1, dataset2, dataset3, and dataset4, the accuracy of AKNN-DMGCN is 4%, 7.01%, 5.76%, and 7.05% higher than that of KNN-GCN, respectively. This demonstrates that AKNN-DMGCN achieves the best performance on both balanced and imbalanced datasets, providing a new and effective solution for HVCB mechanical fault diagnosis.

Author Contributions

Conceptualization, G.S. and J.Y.; methodology, G.S.; software, G.S.; validation, G.S., Y.W., Z.X. and M.Q.; formal analysis, G.S.; investigation, G.S., Y.W., Z.X. and M.Q.; resources, Z.Z.; data curation, G.S.; writing—original draft preparation, G.S.; writing—review and editing, J.Y.; visualization, G.S.; supervision, J.Y.; project administration, J.Y.; funding acquisition, J.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Key Research and Development Program of China under Grant 2022YFB2403700.

Data Availability Statement

The data are not publicly available due to privacy or ethical restrictions.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

Abbreviation    Meaning
HVCBs           High-voltage circuit breakers
KNN             K-nearest neighbor graph construction method
AKNN            Adaptive k-nearest neighbor graph construction method
CNNs            Convolutional neural networks
BPNNs           BP neural networks
GCNs            Graph convolutional networks
GATs            Graph attention networks
DMGCNs          Dynamic multi-attention graph convolutional networks
KNN-GCN         Graph convolutional network based on a k-nearest neighbor constructed graph
KNN-DMGCN       Dynamic multi-attention graph convolutional network based on a k-nearest neighbor constructed graph
BAKNN-GCN       Graph convolutional network based on an unmodified adaptively constructed graph
BAKNN-DMGCN     Dynamic multi-attention graph convolutional network based on an unmodified adaptively constructed graph
AKNN-GCN        Graph convolutional network based on a modified adaptively constructed graph
AKNN-DMGCN      Dynamic multi-attention graph convolutional network based on a modified adaptively constructed graph

References

1. Janssen, A.; Makareinis, D.; Sölver, C.E. International surveys on circuit-breaker reliability data for substation and system studies. IEEE Trans. Power Deliv. 2013, 29, 808–814.
2. Ma, S.; Li, J.; Wu, Y.; Xin, C.; Li, Y.; Wu, J. A novel multi-information decision fusion based on improved random forests in HVCB fault detection application. Meas. Sci. Technol. 2022, 33, 055115.
3. Konovalenko, I.; Maruschak, P.; Brezinová, J.; Viňáš, J.; Brezina, J. Steel Surface Defect Classification Using Deep Residual Neural Network. Metals 2020, 10, 846.
4. Havryliuk, V. Artificial neural network based detection of neutral relay defects. In Proceedings of the 2nd International Scientific and Practical Conference “Energy-Optimal Technologies, Logistic and Safety on Transport” (EOT-2019), Lviv, Ukraine, 1 January 2019; Volume 294, p. 03001.
5. Havryliv, D.; Ivakhiv, O.; Semenchenko, M. Defect detection on the surface of the technical ceramics using image processing and deep learning algorithms. In Proceedings of the 2020 21st International Conference on Research and Education in Mechatronics (REM), Cracow, Poland, 9–11 December 2020; pp. 1–3.
6. Kozlenko, M.; Zamikhovska, O.; Tkachuk, V.; Zamikhovskyi, L. Deep Learning Based Fault Detection of Natural Gas Pumping Unit. In Proceedings of the 2021 IEEE 12th International Conference on Electronics and Information Technologies (ELIT), Lviv, Ukraine, 19–21 May 2021; pp. 71–75.
7. Neftissov, A.; Sarinova, A.; Kazambaev, I.; Kirichenko, L.; Bronin, S. Development of an Intelligent Monitoring System Based on the Use of Fiber-Optic Sensors Deep Learning. In Proceedings of the 2023 IEEE International Conference on Smart Information Systems and Technologies (SIST), Astana, Kazakhstan, 4–6 May 2023; pp. 370–374.
8. Shumilo, L.; Okhrimenko, A.; Kussul, N.; Drozd, S.; Shkalikov, O. Generative adversarial network augmentation for solving the training data imbalance problem in crop classification. Remote Sens. Lett. 2023, 14, 1129–1138.
9. Attoui, I.; Oudjani, B.; Boutasseta, N.; Fergani, N.; Bouakkaz, M.S.; Bouraiou, A. Novel predictive features using a wrapper model for rolling bearing fault diagnosis based on vibration signal analysis. Int. J. Adv. Manuf. Technol. 2020, 106, 3409–3435.
10. Ma, S.; Yuan, Y.; Wu, J.; Jiang, Y.; Jia, B.; Li, W. Multisensor decision approach for HVCB fault detection based on the vibration information. IEEE Sens. J. 2020, 21, 985–994.
11. Gao, W.; Qiao, S.-P.; Wai, R.-J.; Guo, M.-F. A newly designed diagnostic method for mechanical faults of high-voltage circuit breakers via SSAE and IELM. IEEE Trans. Instrum. Meas. 2021, 70, 1–13.
12. Ye, X.; Yan, J.; Wang, Y.; Lu, L.; He, R. A novel capsule convolutional neural network with attention mechanism for high-voltage circuit breaker fault diagnosis. Electr. Power Syst. Res. 2022, 209, 108003.
13. Yang, Q.; Ruan, J.; Zhuang, Z.; Huang, D. Condition evaluation for opening damper of spring operated high-voltage circuit breaker using vibration time-frequency image. IEEE Sens. J. 2019, 19, 8116–8126.
14. Márquez-Vera, M.A.; Ramos-Velasco, L.E.; López-Ortega, O.; Zúñiga-Peña, N.S.; Ramos-Fernández, J.C.; Ortega-Mendoza, R.M. Inverse fuzzy fault model for fault detection and isolation with least angle regression for variable selection. Comput. Ind. Eng. 2021, 159, 107499.
15. Zhang, J.; Wu, Y.; Xu, Z.; Din, Z.; Chen, H. Fault diagnosis of high voltage circuit breaker based on multi-sensor information fusion with training weights. Measurement 2022, 192, 110894.
16. Yan, J.; Liu, T.; Ye, X.; Jing, Q.; Dai, Y. Rotating machinery fault diagnosis based on a novel lightweight convolutional neural network. PLoS ONE 2021, 16, e0256287.
17. Li, T.; Zhao, Z.; Sun, C.; Yan, R.; Chen, X. Multireceptive field graph convolutional networks for machine fault diagnosis. IEEE Trans. Ind. Electron. 2021, 68, 12739.
18. Chen, Z.; Xu, J.; Peng, T.; Yang, C. Graph convolutional network-based method for fault diagnosis using a hybrid of measurement and prior knowledge. IEEE Trans. Cybern. 2022, 52, 9157.
19. Zhang, C.; Gan, Y.; Yang, R. Adaptive Propagation Graph Convolutional Networks Based on Attention Mechanism. Information 2022, 13, 471.
20. Chen, Z.; Xu, J.; Alippi, C.; Ding, S.X.; Shardt, Y.; Peng, T.; Yang, C. Graph neural network-based fault diagnosis: A review. arXiv 2021, arXiv:2111.08185.
21. Sun, K.; Huang, Z.; Mao, H.; Qin, A.; Li, X.; Tang, W.; Xiong, J. Multi-scale cluster-graph convolution network with multi-channel residual network for intelligent fault diagnosis. IEEE Trans. Instrum. Meas. 2021, 71, 1–12.
22. Jiang, L.; Li, X.; Wu, L.; Li, Y. Bearing fault diagnosis method based on a multi-head graph attention network. Meas. Sci. Technol. 2022, 33, 075012.
23. Gao, Y.; Chen, M.; Yu, D. Semi-supervised graph convolutional network and its application in intelligent fault diagnosis of rotating machinery. Measurement 2021, 186, 110084.
24. Brody, S.; Alon, U.; Yahav, E. How attentive are graph attention networks? arXiv 2022, arXiv:2105.14491. Published as a conference paper at ICLR 2022.
25. Ning, S.; Ren, Y.; Wu, Y. Intelligent fault diagnosis of rolling bearings based on the visibility algorithm and graph neural networks. J. Braz. Soc. Mech. Sci. Eng. 2023, 45, 12.
26. Bruna, J.; Zaremba, W.; Szlam, A.; LeCun, Y. Spectral networks and locally connected networks on graphs. arXiv 2013, arXiv:1312.6203.
27. Hou, Y.; Zhang, J.; Cheng, J.; Ma, K. Measuring and improving the use of graph information in graph neural networks. In Proceedings of the International Conference on Learning Representations 2019, New Orleans, LA, USA, 6–9 May 2019.
28. Li, A. Adaptive KNN algorithm based on Gaussian function weighting. Mod. Comput. Res. Dev. 2018, 1007–1423, 14-0003-05.
Figure 1. Structure of dynamic multi-attention graph convolutional networks based on adaptive graph construction (AKNN-DMGCN).
Figure 2. Weight function comparison graph.
Figure 3. Graph attention layer.
Figure 4. Multi-attention mechanism.
Figure 5. High-voltage circuit breaker mechanical fault acquisition system.
Figure 6. Waveform diagram under four working conditions: (a) normal; (b) closing-spring fatigue failure; (c) opening-spring fatigue failure; (d) jammed iron core.
Figure 7. Diagnosis results of each method on dataset1: (a) AKNN-DMGCN; (b) BAKNN-DMGCN; (c) AKNN-GCN; (d) BAKNN-GCN.
Figure 8. Diagnosis results of each method on dataset4: (a) AKNN-DMGCN; (b) BAKNN-DMGCN; (c) AKNN-GCN; (d) BAKNN-GCN.
Figure 9. Graph structure constructed by BAKNN for dataset1.
Figure 10. Graph structure constructed by BAKNN for dataset4.
Figure 11. Graph structure constructed by AKNN for dataset1.
Figure 12. Graph structure constructed by AKNN for dataset4.
Figure 13. Diagnosis results of different methods.
Figure 14. Box plots of the diagnosis results.
Figure 15. Scatter plots of different methods for dataset1.
Figure 16. Scatter plots of different methods for dataset4.
Table 1. Constructed datasets.

Dataset     Number of Samples
            Case 0    Case 1    Case 2    Case 3
Dataset1    100       100       100       100
Dataset2    30        100       100       100
Dataset3    30        30        100       100
Dataset4    30        30        30        100
Table 2. Fault diagnosis accuracy of HVCB using different methods.

Dataset     Diagnostic Accuracy (%)
            BAKNN-GCN    AKNN-GCN    BAKNN-DMGCN    AKNN-DMGCN
Dataset1    95.11        96.57       96.05          97.22
Dataset2    88.69        91.09       91.92          95.56
Dataset3    89.68        92.71       91.27          94.26
Dataset4    88.49        89.61       92.41          95.01
Table 3. Fault diagnosis accuracy of HVCB under different values of γ.

γ       Diagnostic Accuracy (%)
        Dataset1    Dataset2    Dataset3    Dataset4
1       96.20       94.50       92.67       91.81
1.5     96.73       94.43       93.19       92.53
1.75    97.22       95.56       94.26       95.01
2       96.68       95.34       92.37       94.16
2.5     96.73       91.98       91.05       89.22
3       95.95       91.66       90.91       85.97
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
