Article

Research on Chebyshev Graph Convolutional Neural Network Modeling Method for Rotating Equipment Fault Diagnosis under Variable Working Conditions

by Jige Liao, Yaohua Deng *, Xiaobo Xie and Zilin Zhang
School of Electromechanical Engineering, Guangdong University of Technology, Guangzhou 510006, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2024, 14(20), 9208; https://doi.org/10.3390/app14209208
Submission received: 18 September 2024 / Revised: 2 October 2024 / Accepted: 8 October 2024 / Published: 10 October 2024

Abstract

Rotating equipment fault diagnosis under variable working conditions faces three challenges: unbalanced transmission of information during feature extraction, difficulty in capturing both global and local features, and limited generalization across working conditions. A Chebyshev graph convolutional neural network (ChebyNet) method is proposed to address these issues. First, a symmetry processing mechanism is incorporated into the ChebyNet framework to balance the transfer of information between nodes in the graph, ensuring the fair and efficient integration of information. Second, the wide-area feature extraction capability of the ChebyNet and the adaptive node weighting of the graph attention network (GAT) are integrated to achieve comprehensive mining of fault characteristics and accurate characterization of complex interactive relationships. Finally, a self-supervised node reconstruction task is trained jointly with the node classification task to enhance the model's ability to capture complex changes in variable working conditions data, significantly improving generalization across working conditions. In comparative and cross-validation experiments, the proposed method achieved an average diagnostic accuracy of 99.72%, an improvement of up to 17.96% over other graph neural network (GNN) models, with gains in accuracy, stability, and generalization. Ablation experiments further validate the effectiveness of the proposed method in improving fault diagnosis performance under variable working conditions.

1. Introduction

The fault diagnosis of rotating equipment plays a vital role in ensuring the normal operation of such equipment. Its effectiveness is directly related to the safety and economic benefits associated with the equipment in complex industrial environments [1]. With the advancement of deep learning technology, especially the breakthroughs in the study of neural networks in complex pattern recognition, research has demonstrated that deep learning neural networks offer significant advantages in data feature mining and can accurately capture valuable data related to faults. This capability provides strong support for high-precision, high-reliability fault diagnosis. However, under variable working conditions, the distribution of fault data shifts, and the internal complexity and interrelationships within the data increase. This amplifies the difficulty of accurately capturing underlying correlations in the fault data, posing significant challenges for deep learning-based fault diagnosis methods [2]. Additionally, the commonly used linear or geometric representations of Euclidean space are inadequate for capturing the complex relationships between the fault data in such dynamic environments [3].
In recent years, GNNs have gained significant attention in the field of equipment fault diagnosis due to their superior ability to capture complex data dependencies, particularly in diagnosing faults within rotating equipment under variable working conditions. A GNN is a deep learning model specially designed to process graph-structured data. These graphs are composed of nodes (data points) and edges (relationships among data points). A GNN learns the embedded representation of each node through the interaction between nodes to enhance the understanding of complex relationships among data points [4,5,6]. For example, Jiang et al. (2022) [7] proposed a rotating equipment fault diagnosis method (MHGAT) based on dynamic time warping and the GAT to achieve the automatic learning of node weights, extraction, and the aggregation of multi-scale features, addressing the limitations of Euclidean spatial data relationship mining. However, a GAT may overly focus on significant nodes under specific working conditions and may overlook key information as conditions change, resulting in a decline in diagnostic performance. Li et al. (2022) [8] proposed an improved graph attention network (GATv2) for variable operating conditions using gearbox fault diagnosis, assigning higher weights to nodes with rich information. However, the receptive field of a GAT is confined to direct neighbor nodes, which hinders the capture of a broader range of a graph’s structural information. When the operating conditions change, the structure and features of the graph data will also evolve correspondingly. A GAT may struggle to extract sufficiently comprehensive feature information from these changes, thereby limiting the model’s generalization ability. More recently, Chang et al. (2023) [9] proposed a graph structure search network (GASN) for cross-domain fault diagnosis to address the problem that existing models have difficulty in clearly representing the correlation between data. 
The goal of the GASN is to enhance adaptability to unknown working conditions through a domain adversarial training strategy driven by triple losses. However, the GASN does not fully consider the global key fault characteristics in the time dimension, which may lead to misjudgment.
To address the issue of discrepancies in the data distribution between the source domain and the target domain caused by different operating environments, Zhang et al. (2024) [10] proposed using a dual-path convolutional neural network (CNN) and multi-parallel graph convolutional networks (GCN) to gradually extract features from fault data, thereby tackling the challenge of modeling fault diagnosis for rotating equipment under variable operating conditions. Li et al. (2021) [11] proposed a fault diagnosis method based on a domain adversarial graph convolutional network (DAGCN). They constructed an instance graph to extract the structural feature relationships of the samples and measure the cross-domain structural differences using the maximum mean square error. Since the representations of different GCN nodes become increasingly similar with depth, it becomes challenging to capture subtle differences between nodes in graph data under variable working conditions, thereby affecting diagnostic accuracy and stability. Gao et al. (2023) [12] studied a fault diagnosis method for rotating equipment under variable operating conditions that integrates GNN with one-shot learning. They used a CNN for feature extraction and utilized GNN for one-shot learning to achieve fault diagnosis. Li et al. (2024) [13] proposed a semi-supervised meta-path space extended graph neural network (ME-GNN) to transform the nearest neighbor relationships among vibration data, fault information, and variable speed information into a graph; facilitate message passing and aggregation across heterogeneous data types; and investigate the fault diagnosis of rotating equipment under time-varying speed conditions.
Based on existing research, the current GNN method still has problems that need to be solved, such as unbalanced information transmission, insufficient global and local feature extraction capabilities, and limited cross-condition generalization capabilities. To address the above problems, this paper proposes a Chebyshev graph convolutional neural network modeling method for rotating equipment fault diagnosis under variable working conditions. Its application performance is verified through a series of variable working condition fault diagnosis experiments. The main contributions of this paper are as follows:
  • To address the problem of unbalanced information transmission faced by feature extraction under variable working conditions, a symmetry processing mechanism is introduced in the ChebyNet framework to ensure that the information exchange between nodes in the graph is conducted with full consideration of the symmetry in their mutual relationships, thus achieving a fair transmission and effective integration of information.
  • To tackle the challenge of interwoven global and local features, as well as the coexistence of complex relationships in variable operating conditions data, the ChebyNet, which is proficient at capturing global structural features, is integrated with a GAT, which effectively characterizes local dependencies among nodes. This integration yields a model that comprehensively mines and analyzes both global and local features, as well as the intricate relationships within the data.
  • To address the problem of insufficient generalization across different working conditions, a self-supervised learning mechanism incorporating a node reconstruction task designed as an auxiliary training component is proposed. This allows the model to deeply explore and map the intrinsic feature structure of the original data during the reconstruction process, forming an efficient and collaborative multi-task learning framework alongside the primary node classification task. As a result, the model can more accurately capture the complex and dynamic features of variable working conditions data, fundamentally enhancing its generalization ability across different conditions.
The remainder of the paper is organized in the following way: Section 2 elaborates on the architecture of the proposed ChebyNet modeling approach for rotating equipment fault diagnosis under variable operating conditions; Section 3 introduces the experimental setup and provides an analysis of the experimental results; and Section 4 concludes the paper.

2. Chebyshev Graph Convolutional Neural Network Modeling Method

The framework of the ChebyNet modeling method for rotating equipment fault diagnosis under variable operating conditions is illustrated in Figure 1 and comprises three modules: data preprocessing, feature extraction, and multi-task collaborative diagnosis.
First, the data preprocessing module includes a fast Fourier transform (FFT) unit for raw data and a data graph construction unit. The FFT unit converts the raw time domain data into frequency domain data to reveal the characteristics of the data in the frequency domain. In the data graph construction unit, the PathGraph method is used to comprehensively consider the complex interactive relationships among the data and transform them into graph data, laying the groundwork for subsequent GNN processing.
Secondly, the data feature extraction module includes the ChebyNet global feature extraction unit and the GAT local feature extraction unit. In the global feature extraction unit of the ChebyNet, symmetry processing enhancements are implemented to enhance its capacity for equitable information integration, thereby addressing the issue of biased information transmission. At the same time, the ChebyNet’s efficient global feature extraction capabilities enable the comprehensive mining of overall features within variable operating conditions data. In the GAT local feature extraction unit, node attention is adaptively calibrated to accurately capture local details and key features within the data. This fusion method not only ensures the integrity of global features but also takes into account the accuracy of local features, thoroughly analyzing the complex interplay between global and local features.
Finally, the multi-task collaborative diagnosis module comprises a node classification task and a node reconstruction task. The node classification task provides a distinct supervisory signal to the model through a Cross-Entropy Loss function, guiding it to differentiate between fault categories. The node reconstruction task is a self-supervised task that uses a Mean Squared Error (MSE) Loss function to map the processed features back to the input, compelling the model to gain a deeper understanding of the internal structure and patterns within the data and enhancing its generalization ability on variable operating conditions data. The sum of the loss values of the two tasks is taken as the total loss, and the parameters are updated according to the gradient of the total loss so that both tasks are optimized together. The category with the highest probability is selected using the Argmax function and reported as the model's predicted diagnosis.

2.1. Data Preprocessing Based on PathGraph Method

In the data preprocessing module, to conduct an in-depth analysis of the internal connections and potential patterns among data samples, the PathGraph method is employed to transform the data into graph representations [14]. The PathGraph method effectively retains local order information among data samples through iterative partitioning and the construction of ordered node pairs. By introducing the weight mechanism, the graph structure model can more accurately characterize the complexity and diversity of relationships among data samples.
$$C = \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} \tag{1}$$
$$D_{ij} = \sqrt{\sum_{k=1}^{n} \left(v_{1k} - v_{2k}\right)^2} \tag{2}$$
$$w_{ij} = \exp\left(-\frac{D_{ij}^2}{2\sigma^2}\right) \tag{3}$$
For each pair of adjacent nodes $i$ and $j$, the features are denoted as $v_1$ and $v_2$, respectively, and the two feature vectors are stacked row-wise into a matrix $C$, as shown in Equation (1). The distance $D_{ij}$ between adjacent nodes $i$ and $j$, given in Equation (2), measures the difference between their feature vectors. Finally, the Gaussian kernel function is employed to model the similarity of adjacent nodes, and the edge weight $w_{ij}$ in the graph is computed accordingly, as shown in Equation (3). The edge weight directly reflects the similarity between nodes; the higher the similarity, the greater the weight.
Here, $C$ denotes the matrix formed by stacking the feature vectors of nodes $i$ and $j$; $v_{1k}$ and $v_{2k}$ represent the $k$-th dimension of the feature vectors of nodes $i$ and $j$, respectively, while $n$ denotes the total number of feature dimensions; $D_{ij}$ is the distance between nodes $i$ and $j$; $w_{ij}$ is the weight of the edge between nodes $i$ and $j$; and $\sigma$ is the standard deviation of the Gaussian kernel, used to control the kernel's width.
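Equations (2) and (3) can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: it assumes that the ordered node pairs are simply consecutive samples, and the function names (`euclidean_distance`, `gaussian_weight`, `path_graph_edges`) and the value of `sigma` are hypothetical.

```python
import math

def euclidean_distance(v1, v2):
    """Equation (2): Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(v1, v2)))

def gaussian_weight(v1, v2, sigma=1.0):
    """Equation (3): Gaussian-kernel similarity used as the edge weight."""
    d = euclidean_distance(v1, v2)
    return math.exp(-d ** 2 / (2 * sigma ** 2))

def path_graph_edges(features, sigma=1.0):
    """Assumption: each sample is linked to the next one in order.

    Returns a list of (i, j, w_ij) tuples.
    """
    return [
        (i, i + 1, gaussian_weight(features[i], features[i + 1], sigma))
        for i in range(len(features) - 1)
    ]

# Three toy samples with two feature dimensions each (illustrative values).
features = [[0.0, 0.0], [0.0, 0.0], [3.0, 4.0]]
edges = path_graph_edges(features, sigma=5.0)
```

Identical neighbors receive a weight of 1, and the weight decays toward 0 as the distance grows relative to `sigma`.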

2.2. Feature Extraction Module Based on Fused Graph Neural Network

In the feature extraction module, the ChebyNet employs Chebyshev polynomials as convolution kernels to iteratively refine features. Its high-order terms can capture features across various dimensions in the data, thereby facilitating the effective extraction of global features across low to high dimensions, significantly enhancing the model’s capacity to comprehend the global structure. However, this iterative update method that emphasizes global features may inadvertently divert attention from local details, leading to the oversight of certain crucial local information. Therefore, a flexible attention mechanism is introduced via a GAT to automatically learn and assign varying weights to neighboring nodes, thereby accurately capturing and enhancing local feature representation, and effectively considering the extraction of both global and local features from the data.

2.2.1. ChebyNet Symmetry Processing Optimizes Feature Extraction

The ChebyNet [15] serves as both the initial and final layer of the feature extraction network, updating the global node representation by determining the neighborhood range and iteratively aggregating information from adjacent nodes. When the neighborhood range is set to 1, a node aggregates information from all directly adjacent nodes (i.e., nodes directly connected by edges) to update its representation. When the range is set to 2, the node aggregates information from both directly adjacent nodes and indirectly connected nodes to update its representation. Finally, the learned node representation is used to perform node classification. The iterative update process of the ChebyNet node features is illustrated in Figure 2. Here, $h_i$ denotes the feature of node $i$, $h_{Ne}$ represents the neighborhood node feature set, $W$ is the weight matrix, and $h_i'$ is the updated feature of node $i$.
The symmetry processing of the ChebyNet feature extraction process is enhanced to ensure a balanced consideration of the relationships between nodes and effectively address the challenge of uneven information transmission during feature extraction under variable working conditions. The process of introducing the symmetry processing mechanism into the ChebyNet framework is elaborated in detail across three parts: the description of the original ChebyNet feature update mechanism, the implementation of the symmetry processing mechanism, and the description of the improved ChebyNet feature update mechanism.
(1) Original ChebyNet feature update process: The initial node feature matrix $X$ is given in Equation (4). The Laplacian matrix $\hat{L}$ propagates the neighborhood features, and the node features after one Chebyshev polynomial iteration are given in Equation (5). The node features after two Chebyshev polynomial iterations are given in Equation (6). Each iteration combines the results of the two preceding iterations and recursively captures multi-order neighbor information to optimize the propagation and aggregation of node features. Subtracting the initial node feature matrix reintroduces the initial node feature information and enhances its propagation. Combining the recursive update with the original features enhances the model's ability to capture complex relationships in the graph. Similarly, the node features after $k$ iterations of the Chebyshev polynomial are given in Equation (7).
$$Z_0 = X \tag{4}$$
$$Z_1 = \hat{L} \cdot X \tag{5}$$
$$Z_2 = 2\hat{L} \cdot Z_1 - Z_0 \tag{6}$$
$$Z_k = 2\hat{L} \cdot Z_{(k-1)} - Z_{(k-2)}, \quad k \geq 2 \tag{7}$$
Here, $Z_0$ represents the initial node feature matrix; $X$ denotes the input data feature matrix; $Z_1$ denotes the node features after one iteration of the Chebyshev polynomial; $\hat{L}$ represents the scaled and standardized Laplacian matrix; $Z_2$ denotes the node features after two iterations of the Chebyshev polynomial; and $Z_k$ denotes the node features after the $k$-th iteration of the Chebyshev polynomial.
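The recursion in Equations (4) to (7) can be sketched as follows. The helper names (`matmul`, `chebyshev_features`) are hypothetical, and the 2×2 matrix `l_hat` is a toy stand-in rather than a properly scaled Laplacian; the sketch only illustrates how each order is built from the two preceding ones.

```python
def matmul(a, b):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def chebyshev_features(l_hat, x, order):
    """Return [Z_0, ..., Z_order] with Z_k = 2 * L_hat @ Z_(k-1) - Z_(k-2)."""
    z = [x]                                  # Equation (4): Z_0 = X
    if order >= 1:
        z.append(matmul(l_hat, x))           # Equation (5): Z_1 = L_hat @ X
    for _ in range(2, order + 1):
        prod = matmul(l_hat, z[-1])
        # Equations (6)-(7): twice the propagated features minus Z_(k-2).
        z.append([[2 * p - q for p, q in zip(p_row, q_row)]
                  for p_row, q_row in zip(prod, z[-2])])
    return z

# Toy example: 2 nodes, 1 feature per node (illustrative values only).
l_hat = [[0.0, 1.0], [1.0, 0.0]]
x = [[1.0], [2.0]]
z = chebyshev_features(l_hat, x, order=2)
```

Each additional order multiplies by $\hat{L}$ once more, which is why a higher order reaches nodes further away in the graph.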
(2) Implementation of symmetry processing: Because the data distribution within the initial node feature matrix varies significantly, symmetry processing is introduced into the calculation of the node features after two Chebyshev polynomial iterations: the intermediate matrix representation and its transpose are added and averaged, as shown in Equation (8). The node features obtained by applying the same symmetry processing within the Chebyshev polynomial recursion are shown in Equation (9).
$$Z_2' = \frac{1}{2}\left(Z_2 + (Z_2)^{T}\right) \tag{8}$$
$$Z_k' = \frac{1}{2}\left(2\hat{L} \cdot Z_{(k-1)} - Z_{(k-2)}\right) + \frac{1}{2}\left(2\hat{L} \cdot Z_{(k-1)} - Z_{(k-2)}\right)^{T}, \quad k \geq 2 \tag{9}$$
Here, $Z_2'$ is the node feature after introducing symmetry processing and two Chebyshev polynomial iterations; $Z_k'$ is the node feature after introducing symmetry processing and $k$ Chebyshev polynomial iterations.
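The averaging step in Equation (8) can be illustrated with a small sketch. Adding a matrix to its transpose is only defined for a square matrix, and the text does not state how the shapes are reconciled in practice, so this example simply assumes a square intermediate matrix. The function names are hypothetical.

```python
def transpose(m):
    """Transpose of a nested-list matrix."""
    return [list(col) for col in zip(*m)]

def symmetrize(m):
    """Equation (8): average a square matrix with its transpose."""
    if len(m) != len(m[0]):
        # Assumption of this sketch: only square inputs are supported.
        raise ValueError("symmetrization needs a square matrix")
    t = transpose(m)
    return [[0.5 * (a + b) for a, b in zip(row_m, row_t)]
            for row_m, row_t in zip(m, t)]

# Toy square matrix standing in for the intermediate representation Z_2.
z2 = [[1.0, 4.0], [2.0, 3.0]]
z2_sym = symmetrize(z2)
```

After this step the entry linking node 0 to node 1 equals the entry linking node 1 to node 0, which is the balanced exchange the mechanism aims for.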
(3) Improved ChebyNet feature update: After the Chebyshev polynomial recursion has been completed for all orders, the node features of each order are multiplied by their corresponding weight matrices $\Theta_k$ and summed to obtain a linear combination of the updated node features, as shown in Equation (10). This process integrates Chebyshev polynomials of varying orders into a single node feature update.
$$X' = \sum_{k=0}^{K} Z_k' \cdot \Theta_k \tag{10}$$
Here, $\Theta_k$ is the weight matrix corresponding to the node features of each order; $X'$ is the linear combination of the updated node features.
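As a rough illustration of Equation (10), the per-order features can be multiplied by their weight matrices and summed. The names `matmul` and `combine_orders` and all numeric values are hypothetical; in a trained model the weight matrices would be learned.

```python
def matmul(a, b):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def combine_orders(z_list, theta_list):
    """Equation (10): sum over k of Z_k @ Theta_k."""
    out = None
    for z_k, theta_k in zip(z_list, theta_list):
        term = matmul(z_k, theta_k)
        if out is None:
            out = term
        else:
            out = [[a + b for a, b in zip(r, s)] for r, s in zip(out, term)]
    return out

# Two orders (k = 0, 1), 2 nodes, 1 input feature mapped to 2 output features.
z_list = [[[1.0], [2.0]], [[2.0], [1.0]]]
theta_list = [[[1.0, 0.0]], [[0.0, 1.0]]]
x_new = combine_orders(z_list, theta_list)
```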

2.2.2. Introduction of Multi-Head Graph Attention Mechanism

To accurately capture the local features in the variable working conditions data, the graph attention network (GAT) is incorporated as an intermediate layer to thoroughly account for the node neighborhood features and dynamically adjust the weight representations [16]. The node feature update process of the GAT is shown in Figure 3a and comprises two key steps: attention coefficient calculation and weighted summation.
(1) Attention coefficient calculation: For node $i$, the correlation coefficient with each neighboring node $j$ ($j \in N_i$) is calculated individually, as shown in Equation (11). A linear mapping increases the dimensionality of the node features to enhance feature representation, the transformed features of the two nodes are concatenated, and the resulting high-dimensional feature is mapped to the correlation coefficient. The correlation calculation between nodes $i$ and $j$ is thus completed through the learnable parameters $W$ and the mapping function $a(\cdot)$. The correlation coefficient is then normalized to obtain the attention coefficient, as shown in Figure 3b and Equation (12).
$$e_{ij} = a\left(\left[W h_i \,\Vert\, W h_j\right]\right), \quad j \in N_i \tag{11}$$
$$\alpha_{ij} = \frac{\exp\left(\mathrm{LeakyReLU}(e_{ij})\right)}{\sum_{k \in N_i} \exp\left(\mathrm{LeakyReLU}(e_{ik})\right)} \tag{12}$$
Here, $e_{ij}$ is the correlation coefficient between nodes $i$ and $j$; $W$ is a shared parameter; $h_i$ and $h_j$ represent the features of nodes $i$ and $j$, respectively; $\left[W h_i \,\Vert\, W h_j\right]$ represents the concatenation of the transformed features of nodes $i$ and $j$; $e_{ik}$ is the correlation coefficient between nodes $i$ and $k$; $k$ denotes the $k$-th neighbor node of node $i$; and $N_i$ is the set of neighboring nodes of node $i$, including node $i$ itself.
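A minimal sketch of Equations (11) and (12) for a single node is given below. It assumes that the mapping $a(\cdot)$ is a dot product with a learnable vector `a` and that the LeakyReLU slope is 0.2; neither choice is specified in the text, and the function names are hypothetical.

```python
import math

def leaky_relu(x, slope=0.2):
    """LeakyReLU with an assumed negative slope of 0.2."""
    return x if x > 0 else slope * x

def attention_coefficients(wh, neighbors, a, i):
    """Equations (11)-(12) for node i.

    wh holds the already transformed features W h for every node,
    neighbors[i] lists the neighbors of node i (including i itself).
    """
    scores = {}
    for j in neighbors[i]:
        concat = wh[i] + wh[j]                          # [W h_i || W h_j]
        e_ij = sum(w * c for w, c in zip(a, concat))    # assumed form of a(.)
        scores[j] = math.exp(leaky_relu(e_ij))
    total = sum(scores.values())
    return {j: s / total for j, s in scores.items()}    # softmax over N_i

# Toy values: 3 nodes with 1 transformed feature each.
wh = [[1.0], [1.0], [3.0]]
neighbors = {0: [0, 1, 2]}
a = [0.5, 0.5]
alpha = attention_coefficients(wh, neighbors, a, i=0)
```

The coefficients of one node always sum to 1, so they can be read as the share of attention given to each neighbor.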
(2) Weighted summation: The attention coefficients are used to perform a weighted fusion of the neighbor features, and the features obtained from the different attention heads are concatenated to yield the new node features, as illustrated in Figure 3c and expressed in Equation (13).
$$h_i' = \mathop{\Big\Vert}_{k=1}^{K} \sigma\left(\sum_{j \in N_i} \alpha_{ij}^{k} W^{k} h_j\right) \tag{13}$$
Here, $h_i'$ is the new feature of node $i$ that integrates the neighborhood features; $K$ is the number of attention heads; $\alpha_{ij}^{k}$ is the attention coefficient between nodes $i$ and $j$ computed by the $k$-th attention head, following the previously described calculation steps; and $W^{k}$ is the weight matrix for the input linear transformation corresponding to the $k$-th attention head.
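The multi-head update in Equation (13) can be sketched for one node as follows. The activation $\sigma$ is not named in the text, so ELU is used here purely as an assumption; the attention coefficients are passed in as given values, and all names and numbers are illustrative.

```python
import math

def elu(x):
    """Assumed activation; the text only writes sigma."""
    return x if x > 0 else math.exp(x) - 1.0

def multi_head_update(h, neighbors_i, alphas, weights):
    """Equation (13): concatenate the per-head weighted sums for one node.

    alphas[k][j] is the attention coefficient of neighbor j in head k,
    weights[k] is the matrix W^k stored as a list of rows.
    """
    out = []
    for alpha_k, w_k in zip(alphas, weights):           # one pass per head
        agg = [0.0] * len(w_k)
        for j in neighbors_i:
            wh_j = [sum(w * x for w, x in zip(row, h[j])) for row in w_k]
            agg = [s + alpha_k[j] * v for s, v in zip(agg, wh_j)]
        out.extend(elu(v) for v in agg)                 # activate, then concatenate
    return out

# Toy setup: 2 nodes, 2 input features, 2 heads with 1 output feature each.
h = [[1.0, 0.0], [0.0, 1.0]]
neighbors_i = [0, 1]
alphas = [{0: 0.5, 1: 0.5}, {0: 1.0, 1: 0.0}]
weights = [[[1.0, 1.0]], [[2.0, 0.0]]]
h_new = multi_head_update(h, neighbors_i, alphas, weights)
```

The output length is the number of heads times the per-head output size, because the head outputs are concatenated rather than averaged.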

2.3. Multi-Task Collaborative Fault Diagnosis Strategy

In the multi-task collaborative diagnosis module, the model’s generalization ability for variable working conditions data is improved by concurrently executing node classification and node reconstruction tasks. In the node classification task, the node representation produced by the second improved ChebyNet layer is refined through supervised learning, enabling the model to better meet classification requirements. In the node reconstruction task, the objective is to map the node features processed by the network back to the original input space, reconstructing node features that closely resemble the original input. This process does not depend on externally labeled data; instead, it utilizes the structural information inherent to the data as a supervisory signal.
The output of the node classification task serves as the input for the node reconstruction task, and its accuracy directly influences the quality of the reconstruction task. The node reconstruction task assesses the effectiveness of reconstruction by calculating the reconstruction loss, and this evaluation subsequently aids the node classification task in conducting self-inspection and optimization. When the reconstruction loss is low, it indicates that the model retains sufficient effective information during the feature extraction and transformation process, enhancing the accuracy of the node classification task in identifying node categories; conversely, a larger reconstruction loss may indicate that the model has deviations in feature processing, prompting adjustments and optimizations in subsequent learning.
In the forward propagation, for each sample, the network output includes the probability that the sample belongs to the i -th category and the node reconstruction feature vector (the reconstructed output and the original input features are transformed by the fully connected layer to ensure dimensional consistency), which correspond to the node classification Cross-Entropy Loss and the node reconstruction MSE Loss, respectively. Moreover, the node classification Cross-Entropy Loss measures the difference between the model’s predictions for each category and the true labels. The calculation of the classification Cross-Entropy Loss is shown in Equation (14).
$$\mathit{loss} = -\sum_{i=1}^{n} t_i \log P_i \tag{14}$$
Here, $\mathit{loss}$ is the node classification Cross-Entropy Loss; $t_i$ is the value of the $i$-th element in the true label vector; $n$ is the number of categories; and $P_i$ represents the predicted probability that the sample belongs to the $i$-th category.
The node reconstruction loss is calculated using the MSE, which measures the difference between the output of the model's node reconstruction and the original input, as shown in Equation (15).
$$\mathit{reconstruction\_loss} = \frac{1}{m}\sum_{i=1}^{m} \left(\mathit{reconstructed\_x}_i - x_i\right)^2 \tag{15}$$
Here, $\mathit{reconstruction\_loss}$ is the node reconstruction MSE Loss; $\mathit{reconstructed\_x}_i$ is the $i$-th feature dimension of the node reconstruction layer output; $x_i$ is the $i$-th feature dimension of the input data; and $m$ is the number of feature dimensions (experimentally set to 1024).
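Both loss terms can be written out directly from Equations (14) and (15). The sketch below uses plain Python with hypothetical function names and toy values; it assumes a one-hot label vector.

```python
import math

def cross_entropy(t, p):
    """Equation (14): t is a one-hot label, p the predicted class probabilities."""
    # Terms with t_i = 0 contribute nothing and are skipped to avoid log(0).
    return -sum(t_i * math.log(p_i) for t_i, p_i in zip(t, p) if t_i > 0)

def reconstruction_mse(reconstructed_x, x):
    """Equation (15): mean squared error over the m feature dimensions."""
    m = len(x)
    return sum((r - v) ** 2 for r, v in zip(reconstructed_x, x)) / m

# Toy sample: 3 classes, true class is index 1; 4 feature dimensions.
loss = cross_entropy([0.0, 1.0, 0.0], [0.2, 0.5, 0.3])
reconstruction_loss = reconstruction_mse([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 6.0])
```

With a one-hot label, the Cross-Entropy Loss reduces to the negative log-probability of the true class.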
In backpropagation, the previous gradient is first cleared, and the total loss is computed as the sum of the node classification Cross-Entropy Loss and the node reconstruction MSE Loss, as shown in Equation (16). The backpropagation algorithm then computes the gradient of the total loss with respect to the model parameters, as shown in Equation (17). With stochastic gradient descent (SGD) as the optimization method, each parameter is updated in the direction opposite to its gradient. For each parameter $\theta_i^{(t)}$, the update rule is shown in Equation (18).
$$\mathit{total\_loss} = -\sum_{i=1}^{n} t_i \log P_i + \frac{1}{m}\sum_{i=1}^{m} \left(\mathit{reconstructed\_x}_i - x_i\right)^2 \tag{16}$$
$$\frac{\partial\, \mathit{total\_loss}}{\partial \theta_i^{(t)}} = \frac{\partial\, \mathit{loss}}{\partial \theta_i^{(t)}} + \frac{\partial\, \mathit{reconstruction\_loss}}{\partial \theta_i^{(t)}} \tag{17}$$
$$\theta_i^{(t+1)} = \theta_i^{(t)} - lr \times \frac{\partial\, \mathit{total\_loss}}{\partial \theta_i^{(t)}} \tag{18}$$
Here, $\mathit{total\_loss}$ is the total loss; $lr$ is the learning rate; $\theta_i^{(t)}$ is the parameter value in the $t$-th iteration; and $\theta_i^{(t+1)}$ is the parameter value in the $(t+1)$-th iteration.
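The update in Equations (17) and (18) amounts to adding the two gradients and stepping against the result. The following sketch assumes the per-parameter gradients of both losses are already available; the function name `sgd_step` and the numeric values are hypothetical.

```python
def sgd_step(theta, grad_loss, grad_reconstruction, lr=0.001):
    """Equations (17)-(18): sum the two gradients, then step against them."""
    grad_total = [g1 + g2 for g1, g2 in zip(grad_loss, grad_reconstruction)]
    return [t - lr * g for t, g in zip(theta, grad_total)]

# Two toy parameters with made-up gradients from each task.
theta = [1.0, -2.0]
new_theta = sgd_step(
    theta,
    grad_loss=[0.5, 1.0],
    grad_reconstruction=[1.5, -1.0],
    lr=0.1,
)
```

When the two gradients cancel for a parameter, as for the second one here, that parameter is left unchanged, which shows how the two tasks jointly determine each update.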

3. Experiments and Comparisons

The workstation configuration used in the experiments is as follows: the GPU is an NVIDIA GeForce RTX 3060; the CPU is an Intel Core i5 10980XE; the software environment is CUDA 12.2, PyTorch 1.13.0, and Python 3.9.0; and PyCharm is used to run the code under Windows 10. The model proposed in this article is named ChebyAT (ChebyNet-GAT).

3.1. Model Hyperparameter Settings

The setting of neural network hyperparameters has a crucial impact on the model’s training performance, capacity to generalize to complex data, training speed, and its overall performance in practical applications. The learning rate regulates the step size of gradient updates for each weight adjustment. Reference [6] recommends a value of 0.001. Experiments demonstrate that this learning rate effectively facilitates the updating of model weights and accelerates the convergence process while ensuring model stability. Epochs denote the frequency at which the training dataset is fully processed and weights are updated. Its optimal value is determined through experimentation. Setting the number of epochs to 100 avoids overfitting, thereby optimizing computational resources and time while ensuring comprehensive learning. Batch size refers to the number of samples used in each iteration during neural network training. Additional parameters were determined through comparative validation based on the aforementioned experimental conditions to identify the hyperparameters yielding optimal experimental outcomes. The detailed hyperparameter configurations are presented in Table 1.

3.2. Comparing Model Settings

Five GNNs, each with distinct characteristics, were selected as benchmark models: the ChebyNet [15], GAT [16], GCN [17], Graph Isomorphism Network (GIN) [18], and Graph Sample and Aggregate (GraphSAGE) [19]. Comparing against these models highlights the advantages of the proposed ChebyAT model in variable operating conditions fault diagnosis from several perspectives. Among these, the ChebyNet and GAT are directly related to the proposed method. Using them as comparison models not only highlights the technical relevance of the proposed method but also allows for an intuitive assessment of how the ChebyAT integrates the two and addresses the limitations of each. A GCN is a GNN that realizes the local feature aggregation of graph-structured data through spectral graph theory. It is used as a comparison to demonstrate the improvement of the ChebyAT in local feature extraction and aggregation. A GIN is a GNN that captures the overall characteristics of the graph structure by using a multi-layer perceptron (MLP) as the node update function. It is used as a comparison to verify the ability of the ChebyAT to represent the global information of complex graph structures. GraphSAGE is an inductive learning GNN method that learns node representations through the sampling and aggregation of neighbor nodes and can therefore learn node representations on unseen graphs. It is used as a comparison model to evaluate the generalization ability of the ChebyAT and its effectiveness in processing large-scale graph data.

3.3. Experimental Dataset Introduction

To validate the effectiveness of the proposed method for diagnosing faults in rotating equipment under variable working conditions, this study utilizes a publicly available gearbox bearing variable operating conditions fault dataset as the experimental data. This dataset contains the operating data of gearbox rotating equipment under variable working conditions, such as various speeds and loads. These conditions are analogous to the complex working conditions encountered in actual application scenarios. Experiments utilizing this data can accurately reflect the impact of variable working conditions on fault diagnosis methods [20]. As shown in Figure 4a, the gearbox test bench consists of a motor, a motor controller, a planetary gearbox, a reduction gearbox, a brake, and a brake controller. It uses vibration sensors to collect data in the x, y, and z directions of the planetary gearbox and the reduction gearbox, respectively. The sampling frequency is 5120 Hz, and the sensor collection position is shown in Figure 4b, where each fault type corresponds to two working conditions.
The vibration signal in the x direction of the planetary gearbox is extracted as input, and 1024 sampling points are set as a sample. Given the sample imbalance characteristics of fault data under variable working conditions, an unbalanced configuration strategy for the number of samples is implemented. The detailed information is shown in Table 2.

3.4. Experimental Results Analysis

3.4.1. Analysis of Comparative Experimental Results

The experimental results are shown in the comparative experiment section of Table 3. The proposed ChebyAT model reaches an accuracy of 0.9975, which, based on the values in Table 3, is 0.1255, 0.1030, 0.1015, 0.0830, and 0.1760 higher than the ChebyNet, GAT, GCN, GIN, and GraphSAGE models, respectively. Reference [6] applied the same data preprocessing method to this dataset and reported a highest accuracy of 0.9483. Compared with both the five baseline models and the accuracy reported in the literature, the proposed model is clearly more accurate. When the model is applied in actual industrial environments, its generalization and transfer capabilities still need to be studied to meet practical fault diagnosis needs. Preliminary fault diagnosis trials on the equipment of a company that manufactures injection molding machines achieved an accuracy exceeding 95%, which suggests that the proposed method can support real-world fault diagnosis.
The accuracy curves of the six models over 100 epochs are shown in Figure 5a. The ChebyAT model converges at a speed comparable to that of the other models in the early stages of training; as training progresses, however, its advantages in stability and accuracy become increasingly evident. The other models fluctuate significantly after fitting, which highlights their insufficient adaptability to variable working conditions. By maintaining high accuracy and stability on variable graph data, the ChebyAT model demonstrates strong generalization capability under varying working conditions.
The confusion matrix was introduced to further evaluate the recognition ability of each model, as shown in Figure 6. The ChebyNet, GAT, GCN, and GIN exhibited extremely low recognition rates for category 2. This is likely due to the difference in sample sizes between the two categories: category 2, which has fewer samples, is misclassified as category 1, which has more. The GraphSAGE failed to recognize categories 1 and 6, indicating weak adaptability and poor diagnostic performance under varying working conditions. In contrast, the proposed ChebyAT model achieved a recognition rate of 97% for category 7 and 100% for the other categories. This near-perfect recognition rate demonstrates the ChebyAT's ability to extract key features and accurately classify complex and unbalanced datasets.
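The per-category recognition rates read off a confusion matrix can be computed as the diagonal count divided by the row total. The sketch below illustrates this on toy labels that are not the paper's data; the helper names `confusion_matrix` and `recognition_rates` are hypothetical, and rows are assumed to index the true label.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Rows are true labels, columns are predicted labels."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def recognition_rates(matrix):
    """Per-class recognition rate (recall): diagonal count / row total."""
    rates = []
    for label, row in enumerate(matrix):
        total = sum(row)
        rates.append(row[label] / total if total else 0.0)
    return rates


# Toy labels (not the paper's data): class 2 is rare and is partly
# absorbed by the larger class 1, mimicking the failure mode described.
y_true = [0, 0, 1, 1, 1, 1, 2, 2]
y_pred = [0, 0, 1, 1, 1, 1, 1, 2]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
print(cm)
print(recognition_rates(cm))
```

In this toy case the minority class 2 is recognized only half of the time even though overall accuracy is 7/8, which is why per-category rates are more informative than accuracy alone on imbalanced data.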
The t-distributed Stochastic Neighbor Embedding (t-SNE) visualization technique was used to illustrate how well each model separates the categories after 100 epochs of training. Figure 7 shows the distribution of the features learned by the six graph neural network models. The ChebyNet, GAT, GCN, GIN, and GraphSAGE models exhibit similar shortcomings in clustering: category boundaries are unclear and several unexpected clusters appear, which typically results from insufficient feature extraction. In contrast, the clustering of the ChebyAT model is particularly clean. It shows clear cluster boundaries and very few classification errors, and the number of clusters matches the number of category labels in the dataset. This reflects accurate feature extraction and classification and highlights the adaptability and robustness of the ChebyAT model when dealing with complex graph data structures.
In summary, the ChebyAT model demonstrates considerable advantages in the comparative experiments. It not only outperforms the other graph neural network models with an accuracy of 0.9975, but also remains efficient and stable under changing working conditions. The confusion matrix evaluation shows that ChebyAT has superior recognition capability on complex, imbalanced datasets, while the t-SNE visualization further highlights its accuracy in feature extraction and classification, demonstrating its reliability in handling data from variable working conditions.

3.4.2. Analysis of 10-Fold Cross-Validation Experimental Results

The model’s generalization performance was evaluated using 10-fold cross-validation, dividing the bearing fault dataset into ten equal subsets, with nine subsets used for training and the remaining subset for evaluation. This process was repeated 10 times, with a different subset selected as the test set each time, ensuring that each data point had the opportunity to be used as part of the test set.
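The 10-fold procedure described above can be sketched in a few lines of Python. Shuffling before splitting, the fixed `seed`, and the absence of stratification are assumptions, since the text does not say how the subsets were drawn; `k_fold_indices` is a hypothetical helper name.

```python
import random


def k_fold_indices(num_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of k folds.

    Indices are shuffled once, then dealt into k folds whose sizes differ
    by at most one; each fold serves as the test set exactly once.
    """
    if not 2 <= k <= num_samples:
        raise ValueError("k must be between 2 and num_samples")
    indices = list(range(num_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]


splits = list(k_fold_indices(num_samples=100, k=10))
for fold_id, (train_idx, test_idx) in enumerate(splits):
    print(fold_id, len(train_idx), len(test_idx))
```

Every index appears in exactly one test fold, so each sample is evaluated once over the ten runs. For a class-imbalanced dataset such as this one, a stratified split would keep class proportions similar across folds, but whether the authors stratified is not stated.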
The results of the cross-validation experiment are shown in the cross-validation section of Table 3 and in Figure 5b. The average diagnostic accuracy of the ChebyAT reaches 99.72%, which is 9.79, 9.33, 10.31, 9.89, and 17.96 percentage points higher than that of the ChebyNet, GAT, GCN, GIN, and GraphSAGE, respectively. The ChebyAT thus holds a significant advantage in average diagnostic accuracy, showing that it can stably and accurately extract relevant features for classification across the various subsets. In comparison, although the ChebyNet, GAT, GCN, and GIN also achieved an average diagnostic accuracy of nearly 90%, this average may obscure their subpar performance on specific subsets.
The standard deviation of the ChebyAT is 0.0005, which is 0.0277, 0.0140, 0.0287, 0.0297, and 0.0180 lower than that of the ChebyNet, GAT, GCN, GIN, and GraphSAGE, respectively. In relative terms, it is roughly 30 to 60 times smaller than the standard deviations of the other models, indicating that the performance of the ChebyAT fluctuates far less across different subsets and is considerably more stable. The high standard deviations of the other five GNNs reveal substantial fluctuations in their performance across different dataset partitions. Such fluctuations often indicate that the models rely heavily on specific data features, which restricts their generalization capability.
As the cross-validation shows, the ChebyAT achieves high accuracy not only on a single train/test split but also across all ten partitions of the dataset, indicating that its predictions remain stable when the composition of the training and test subsets changes, which is particularly important for complex scenarios in practical applications. This further supports the conclusion that the ChebyAT model generalizes well rather than fitting one specific split.
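The averages and standard deviations quoted above can be recomputed from the per-fold accuracies in Table 3. The sketch below does this for the ChebyNet and ChebyAT columns; using the sample standard deviation (n - 1 in the denominator) is an assumption about how the table was produced.

```python
from statistics import mean, stdev

# Per-fold accuracies copied from Table 3 (folds 0-9).
fold_accuracy = {
    "ChebyNet": [0.8622, 0.9185, 0.9188, 0.8962, 0.8622,
                 0.9213, 0.9285, 0.9335, 0.8800, 0.8718],
    "ChebyAT": [0.9975, 0.9980, 0.9970, 0.9965, 0.9965,
                0.9970, 0.9970, 0.9975, 0.9970, 0.9975],
}

summary = {
    model: (mean(values), stdev(values))  # stdev = sample std (n - 1)
    for model, values in fold_accuracy.items()
}
for model, (avg, std) in summary.items():
    print(f"{model}: mean={avg:.4f}, std={std:.4f}")
```

Under this assumption the standard deviation comes out at about 0.028 for ChebyNet and about 0.0005 for ChebyAT, consistent with the values listed in Table 3.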

3.4.3. Analysis of Ablation Experiment Results

To evaluate the effectiveness of the ChebyAT-based fault diagnosis method, this paper used the classic ChebyNet as a control group and utilized the original model alongside three improved models, as detailed in Table 4, for comparative analysis. Four evaluation metrics—accuracy, precision, recall, and F1 score—were utilized to assess model performance [21]. The precision, recall, and F1 score of the classes were weighted and averaged according to the number of samples in each class. The specific descriptions of each model are as follows:
(1) ChebyNet: Used as a control group with no modifications to its network structure.
(2) S-ChebyNet: Introduces symmetry processing improvements based on ChebyNet to evaluate the effectiveness of this enhancement and serves as a baseline to assess the impact of adding a GAT and the multi-task collaborative strategy.
(3) S-ChebyA: Integrates a GAT into the improved ChebyNet structure from the S-ChebyNet to evaluate the effectiveness of the GAT and serves as a baseline to assess the impact of the multi-task collaborative diagnosis strategy.
(4) ChebyAT: Adds the multi-task collaborative diagnosis strategy to the S-ChebyA network. As an experimental group, it is compared with the S-ChebyA to verify the effectiveness of the multi-task collaborative diagnosis strategy.
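The support-weighted averaging of precision, recall, and F1 score used for the ablation metrics can be sketched as follows. This is a plain-Python illustration on toy labels rather than the authors' evaluation code, and `weighted_metrics` is a hypothetical helper name.

```python
def weighted_metrics(y_true, y_pred, num_classes):
    """Return support-weighted precision, recall, and F1."""
    total = len(y_true)
    if total == 0:
        raise ValueError("y_true must not be empty")
    precision = recall = f1 = 0.0
    for c in range(num_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        predicted = sum(1 for p in y_pred if p == c)
        support = sum(1 for t in y_true if t == c)
        p_c = tp / predicted if predicted else 0.0
        r_c = tp / support if support else 0.0
        f_c = 2 * p_c * r_c / (p_c + r_c) if p_c + r_c else 0.0
        weight = support / total  # class share of the true labels
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    return precision, recall, f1


# Toy labels (not the paper's data).
y_true = [0, 0, 1, 1, 1, 1, 2, 2]
y_pred = [0, 0, 1, 1, 1, 1, 1, 2]
precision, recall, f1 = weighted_metrics(y_true, y_pred, num_classes=3)
print(precision, recall, f1)
```

With support weighting, the weighted recall equals the overall accuracy (7/8 in this toy case), which is consistent with the nearly identical Accuracy and Recall columns in Table 4.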
The ablation experiment results are shown in Table 4. Compared to the classic ChebyNet model, the F1 score of the S-ChebyNet model, enhanced by symmetry processing, improved by 4.18 percentage points. This indicates that symmetry processing effectively addresses the imbalance in information transmission that results from differing data distributions under variable working conditions, ensuring the fair integration of information and reducing misclassification both for categories with fewer samples and for those with more. The F1 score of the S-ChebyA model, which adds a multi-head graph attention mechanism to the S-ChebyNet, increased by a further 5.73 percentage points. This improvement indicates that the GAT compensates for the local feature extraction limitations of the ChebyNet, balances global and local features, mitigates misjudgment due to information loss, and enables the model to identify samples accurately through a comprehensive analysis of fault features. The F1 score of the ChebyAT model, which adds the multi-task collaborative diagnosis strategy to the S-ChebyA, increased by another 5.73 percentage points, reaching 99.75%. This indicates that the multi-task collaborative strategy enhances the model's ability to distinguish similar faults under varying working conditions and significantly improves its generalization capability. Taken together, the evaluation metrics show that symmetry processing, the integration of the GAT, and the multi-task collaborative strategy each significantly enhance the model's fault recognition performance under variable working conditions.
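The stepwise F1 gains quoted above follow directly from the F1 column of Table 4. A short check, with the gains expressed in percentage points:

```python
# F1 scores copied from Table 4, in ablation order.
f1_scores = [
    ("ChebyNet", 0.8411),
    ("S-ChebyNet", 0.8829),
    ("S-ChebyA", 0.9402),
    ("ChebyAT", 0.9975),
]

# Gain of each variant over its predecessor, in percentage points.
gains = [
    (curr_name, round((curr_f1 - prev_f1) * 100, 2))
    for (_, prev_f1), (curr_name, curr_f1) in zip(f1_scores, f1_scores[1:])
]
total_gain = round((f1_scores[-1][1] - f1_scores[0][1]) * 100, 2)
print(gains, total_gain)
```

The three increments sum to the overall gain of ChebyAT over the unmodified ChebyNet, so each module accounts for a distinct share of the improvement in this ablation.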
The results of the ablation experiment clearly demonstrate the interaction and mutual enhancement of improvements in symmetry processing, the integration of the GAT, and the inclusion of the multi-task collaborative strategy in enhancing the model’s fault recognition performance under variable operating conditions. The enhanced symmetry processing supplies the model with a more balanced and comprehensive data input. The integration of the GAT further improves the model’s feature extraction capability by considering both local and global features, facilitating balanced extraction. The multi-task collaborative strategy enhances the model’s generalization capability under varying working conditions by simultaneously addressing node classification and reconstruction tasks, leveraging its powerful feature extraction capability. These three modules collaborate during the stages of data input, feature extraction, and fault identification, mutually reinforcing each other and collectively enhancing the model’s fault identification performance under variable operating conditions.
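To make the idea of jointly addressing node classification and node reconstruction concrete, the sketch below combines a classification loss with a reconstruction loss for a single node. The specific form (cross-entropy plus mean squared error) and the weighting factor `recon_weight` are assumptions for illustration; this section does not state the loss actually used, and all function names are hypothetical.

```python
import math


def cross_entropy(probabilities, label):
    """Negative log-likelihood of the true class for one node."""
    return -math.log(max(probabilities[label], 1e-12))


def mean_squared_error(reconstructed, original):
    """Mean squared error between reconstructed and input node features."""
    return sum((r - o) ** 2 for r, o in zip(reconstructed, original)) / len(original)


def multitask_loss(probabilities, label, reconstructed, original, recon_weight=0.5):
    """Classification loss plus weighted reconstruction loss (assumed form)."""
    return (cross_entropy(probabilities, label)
            + recon_weight * mean_squared_error(reconstructed, original))


# Toy values: a 3-class prediction and a 3-dimensional node feature.
loss = multitask_loss(
    probabilities=[0.1, 0.8, 0.1], label=1,
    reconstructed=[0.9, 0.1, 0.0], original=[1.0, 0.0, 0.0],
    recon_weight=0.5,
)
print(loss)
```

In a shared-encoder setup of this kind, the reconstruction term acts as a self-supervised regularizer: the encoder must retain enough information to rebuild the input features, not only enough to separate the training labels.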
Table 4. Ablation experiment results.
| Model | Accuracy | Precision | Recall | F1 Score |
| --- | --- | --- | --- | --- |
| ChebyNet | 0.8720 | 0.8339 | 0.8723 | 0.8411 |
| S-ChebyNet | 0.9125 | 0.8691 | 0.9120 | 0.8829 |
| S-ChebyA | 0.9435 | 0.9651 | 0.9441 | 0.9402 |
| ChebyAT | 0.9975 | 0.9976 | 0.9975 | 0.9975 |

4. Conclusions

This study presents a ChebyNet modeling method for fault diagnosis in rotating equipment under variable working conditions. The Path composition method is utilized to map complex data from variable working conditions into a non-Euclidean space, successfully revealing the intricate interactive relationships within the data. Symmetry processing enhancements have been incorporated into the ChebyNet framework, improving the symmetric transmission of information between graph nodes and ensuring the efficient and equitable integration of fault features. By combining the enhanced ChebyNet with a GAT, the method accurately captures both global features and local details of the data. Additionally, the incorporation of a node reconstruction task through self-supervised learning creates an effective synergy with the node classification task, enhancing the model’s intrinsic understanding of the input data and significantly improving its generalization ability for processing data under variable working conditions. The experimental results indicate that the proposed method achieves an average diagnostic accuracy of 99.72%, surpassing other GNN models. Moreover, the method displays notable stability and generalization, emphasizing its robust performance and significant applicability in complex and variable industrial environments. This study enhances the application of GNNs in industrial contexts and introduces an innovative approach to tackling the challenges of fault diagnosis in rotating equipment under variable operating conditions.
This study primarily focuses on a single data source and does not fully explore the potential of multimodal data fusion, which restricts the model’s ability to comprehensively assess equipment status in complex and dynamic industrial environments. Looking ahead, we are committed to further exploring the application of multimodal data fusion within the GNN framework; investigating effective data preprocessing and feature extraction methods that fully preserve the uniqueness and complementarity of each data modality; exploring innovative data integration strategies to intelligently assign weights to different modal data to achieve efficient and accurate information fusion; and building a GNN model that can efficiently fuse multidimensional information from different sensors. Implementing the model for intelligent health management and maintenance of industrial equipment can yield more accurate and efficient solutions for industrial production, aiding enterprises in achieving cost reduction, enhanced efficiency, and sustainable development.

Author Contributions

J.L. proposed the method and wrote the paper; Y.D. revised and improved the paper; J.L. and Y.D. contributed equally to this paper; X.X. and Z.Z. conducted the experiment and analysis of results. All authors have read and agreed to the published version of the manuscript.

Funding

This work is partially supported by the Guangdong Basic and Applied Basic Research Foundation (Grant No. 2022B151520053), the National Natural Science Foundation of China (Grant No. 52175457), and the Guangdong Science and Technology Plan Project (Grant No. 2023A0505050151).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare that there is no conflict of interest regarding the publication of this paper.

References

  1. Hu, C.; Li, G.; Yong Zhao, Y.; Cheng, F. Summary of Fault Diagnosis Methods for Rolling Bearings under Variable Working Conditions. J. Comput. Eng. Appl. 2022, 58, 26.
  2. Ding, Y.; Jia, M.; Zhuang, J.; Cao, Y.; Zhao, X.; Lee, C.G. Deep imbalanced domain adaptation for transfer learning fault diagnosis of bearings under multiple working conditions. Reliab. Eng. Syst. Saf. 2023, 230, 108890.
  3. Zhu, Z.; Lei, Y.; Qi, G.; Chai, Y.; Mazur, N.; An, Y.; Huang, X. A review of the application of deep learning in intelligent fault diagnosis of rotating machinery. Measurement 2023, 206, 112346.
  4. Fan, L.; Cheng, Q.; Yu, W.; Griffith, D.; Golmie, N. Survey of graph neural networks and applications. Wirel. Commun. Mob. Comput. 2022, 2022, 9261537.
  5. Chen, Z.; Xu, J.; Alippi, C.; Ding, S.X.; Shardt, Y.; Peng, T.; Yang, C. Graph neural network-based fault diagnosis: A review. arXiv 2021, arXiv:2111.08185.
  6. Li, T.; Zhou, Z.; Li, S.; Sun, C.; Yan, R.; Chen, X. The emerging graph neural networks for intelligent fault diagnostics and prognostics: A guideline and a benchmark study. Mech. Syst. Signal Process. 2022, 168, 108653.
  7. Jiang, L.; Li, X.; Wu, L.; Li, Y. Bearing fault diagnosis method based on a multi-head graph attention network. Meas. Sci. Technol. 2022, 33, 075012.
  8. Li, C.; Kwoh, C.K.; Li, X.; Mo, L.; Yan, R. Rotating Machinery Fault Diagnosis Based on Multi-Sensor Information Fusion Using Graph Attention Network. In Proceedings of the 2022 17th International Conference on Control, Automation, Robotics and Vision (ICARCV), Singapore, 11–13 December 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 678–683.
  9. Chang, Y.; Chen, J.; Zheng, W.; He, S.; Xu, E. Triplet adversarial Learning-driven graph architecture search network augmented with Probsparse-attention mechanism for fault diagnosis under Few-shot & Domain-shift. Mech. Syst. Signal Process. 2023, 199, 110462.
  10. Zhang, Y.; Zhang, S.; Zhu, Y.; Ke, W. Cross-domain bearing fault diagnosis using dual-path convolutional neural networks and multi-parallel graph convolutional networks. ISA Trans. 2024, 152, 129–142.
  11. Li, T.; Zhao, Z.; Sun, C.; Yan, R.; Chen, X. Domain adversarial graph convolutional network for fault diagnosis under variable working conditions. IEEE Trans. Instrum. Meas. 2021, 70, 3515010.
  12. Gao, Y.; Wu, H.; Liao, H.; Chen, X.; Yang, S.; Song, H. A fault diagnosis method for rolling bearings based on graph neural network with one-shot learning. EURASIP J. Adv. Signal Process. 2023, 2023, 101.
  13. Li, Y.; Zhang, L.; Liang, P.; Wang, X.; Wang, B.; Xu, L. Semi-supervised meta-path space extended graph convolution network for intelligent fault diagnosis of rotating machinery under time-varying speeds. Reliab. Eng. Syst. Saf. 2024, 251, 110363.
  14. Yang, C.; Liu, J.; Zhou, K.; Jiang, X.; Ge, M.F.; Liu, Y. A node-level PathGraph-based bearing remaining useful life prediction method. IEEE Trans. Instrum. Meas. 2022, 71, 3517610.
  15. Defferrard, M.; Bresson, X.; Vandergheynst, P. Convolutional neural networks on graphs with fast localized spectral filtering. Adv. Neural Inf. Process. Syst. 2016, 29, 3844–3852.
  16. Veličković, P.; Cucurull, G.; Casanova, A.; Romero, A.; Lio, P.; Bengio, Y. Graph attention networks. arXiv 2017, arXiv:1710.10903.
  17. Kipf, T.N.; Welling, M. Semi-supervised classification with graph convolutional networks. arXiv 2016, arXiv:1609.02907.
  18. Xu, K.; Hu, W.; Leskovec, J.; Jegelka, S. How powerful are graph neural networks? arXiv 2018, arXiv:1810.00826.
  19. Hamilton, W.; Ying, Z.; Leskovec, J. Inductive representation learning on large graphs. Adv. Neural Inf. Process. Syst. 2017, 30, 1025–1035.
  20. Shao, S.; McAleer, S.; Yan, R.; Baldi, P. Highly accurate machine fault diagnosis using deep transfer learning. IEEE Trans. Ind. Inform. 2018, 15, 2446–2455.
  21. Han, T.; Xie, W.; Pei, Z. Semi-supervised adversarial discriminative learning approach for intelligent fault diagnosis of wind turbine. Inf. Sci. 2023, 648, 119496.
Figure 1. Framework diagram of ChebyNet modeling method for rotating equipment fault diagnosis under variable operating conditions. The black circle represents the reference node whose features are to be updated, and the color of other circles changes, indicating that the features of the corresponding nodes are aggregated by the reference node.
Figure 2. Node feature update mechanism of ChebyNet when the aggregation neighborhood range is 1 and 2, respectively.
Figure 3. (a) GAT node feature update process. (b) Calculating the attention coefficient of adjacent nodes. (c) Node feature fusion update of multiple attention heads.
Figure 4. (a) Gearbox test platform. (b) Sensor collection location diagram.
Figure 5. (a) Comparison experiment accuracy curve. (b) Cross-validation error bar chart.
Figure 6. Confusion matrix visualization experimental results of six models. (a) ChebyNet. (b) GAT. (c) GCN. (d) GIN. (e) GraphSAGE. (f) ChebyAT.
Figure 7. t-SNE visualization experimental results of six models. (a) ChebyNet. (b) GAT. (c) GCN. (d) GIN. (e) GraphSAGE. (f) ChebyAT.
Table 1. Model hyperparameter settings.
| Layer | Name | Parameter |
| --- | --- | --- |
| Input Layer | Sample length | 1024 |
| | Batch size | 64 |
| | Epochs | 100 |
| | Learning rate | 0.001 |
| ChebyNet Layer1 | K | 2 |
| | Batchnorm1 | 1024 |
| GAT | Heads number | 4 |
| | Batchnorm2 | 1024 × 4 |
| | Fully connected | 512 |
| | Dropout | 0.2 |
| ChebyNet Layer2 | K | 2 |
| | Batchnorm3 | 10 |
| Node Reconstruction | Fully connected | 1024 |
Table 2. Experimental dataset details.
| Working Conditions | Fault Location | Fault Type | Label | Number of Training/Testing |
| --- | --- | --- | --- | --- |
| Working condition 1: speed 20 Hz, load 0 V | Roller | Crack | 0 | 680/170 |
| | Inner and outer ring | Crack | 1 | 840/210 |
| | None | Normal | 2 | 440/110 |
| | Inner ring | Crack | 3 | 840/210 |
| | Outer ring | Crack | 4 | 680/170 |
| Working condition 2: speed 30 Hz, load 2 V | Roller | Crack | 5 | 880/220 |
| | Inner and outer ring | Crack | 6 | 1000/250 |
| | None | Normal | 7 | 760/190 |
| | Inner ring | Crack | 8 | 600/150 |
| | Outer ring | Crack | 9 | 800/200 |
Table 3. Comparison and cross-validation experiment accuracy results.
| Experiment | Fold / metric | ChebyNet | GAT | GCN | GIN | GraphSAGE | ChebyAT |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Comparative | Accuracy | 0.8720 | 0.8945 | 0.8960 | 0.9145 | 0.8215 | 0.9975 |
| 10-fold cross-validation | 0 | 0.8622 | 0.9137 | 0.9267 | 0.9012 | 0.8125 | 0.9975 |
| | 1 | 0.9185 | 0.8888 | 0.8812 | 0.8615 | 0.8365 | 0.9980 |
| | 2 | 0.9188 | 0.8975 | 0.8625 | 0.8848 | 0.8220 | 0.9970 |
| | 3 | 0.8962 | 0.9087 | 0.9117 | 0.9327 | 0.7843 | 0.9965 |
| | 4 | 0.8622 | 0.8730 | 0.9097 | 0.8830 | 0.8400 | 0.9965 |
| | 5 | 0.9213 | 0.9113 | 0.9062 | 0.8702 | 0.7983 | 0.9970 |
| | 6 | 0.9285 | 0.9167 | 0.8740 | 0.9310 | 0.8177 | 0.9970 |
| | 7 | 0.9335 | 0.8975 | 0.9087 | 0.9387 | 0.8025 | 0.9975 |
| | 8 | 0.8800 | 0.9175 | 0.8370 | 0.8612 | 0.8230 | 0.9970 |
| | 9 | 0.8718 | 0.9140 | 0.9235 | 0.9183 | 0.8387 | 0.9975 |
| | Average value | 0.8993 | 0.9039 | 0.8941 | 0.8983 | 0.8176 | 0.9972 |
| | Standard deviation | 0.0282 | 0.0145 | 0.0292 | 0.0302 | 0.0185 | 0.0005 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Liao, J.; Deng, Y.; Xie, X.; Zhang, Z. Research on Chebyshev Graph Convolutional Neural Network Modeling Method for Rotating Equipment Fault Diagnosis under Variable Working Conditions. Appl. Sci. 2024, 14, 9208. https://doi.org/10.3390/app14209208
