Fault Diagnosis of Rolling Bearings in Agricultural Machines Using SVD-EDS-GST and ResViT

Xie, Fengyun; Wang, Yang; Wang, Gan; Sun, Enguang; Fan, Qiuyang; Song, Minghua

doi:10.3390/agriculture14081286

by Fengyun Xie ¹,²,*, Yang Wang ¹, Gan Wang ¹, Enguang Sun ¹, Qiuyang Fan ¹ and Minghua Song ¹

¹ School of Mechanical Electrical and Vehicle Engineering, East China Jiaotong University, Nanchang 330013, China

² State Key Laboratory of Performance Monitoring Protecting of Rail Transit Infrastructure, East China Jiaotong University, Nanchang 330013, China

* Author to whom correspondence should be addressed.

Agriculture 2024, 14(8), 1286; https://doi.org/10.3390/agriculture14081286

Submission received: 5 July 2024 / Revised: 2 August 2024 / Accepted: 3 August 2024 / Published: 4 August 2024

(This article belongs to the Special Issue Computational, AI and IT Solutions Helping Agriculture)

Abstract:

In the complex and harsh environment of agriculture, rolling bearings, as key transmission components in agricultural machinery, are very prone to failure, so research on the intelligent fault diagnosis of agricultural machinery components is critical. This paper therefore proposes a new method based on SVD-EDS-GST and ResNet-Vision Transformer (ResViT) for the fault diagnosis of rolling bearings in agricultural machines. First, an experimental platform for rolling bearing failure in agricultural machinery is built, and one-dimensional vibration signals are obtained using acceleration sensors. Second, the signal is denoised using singular value decomposition (SVD) combined with the energy difference spectrum (EDS) to suppress the interference of complex noise and redundant components in the vibration signal. Third, the generalized S-transform (GST) is used to convert the denoised vibration signals into images. Then, the ResViT model is proposed, in which a ResNet34 network replaces the image chunking mechanism of the original Vision Transformer model for feature extraction. Finally, the improved Vision Transformer (ViT) is used to combine global and local information for fault classification. The experimental results show that the proposed method’s average accuracy in rolling bearing fault classification for agricultural machinery reaches 99.08%. In addition, compared with SVD-EDS-GST-CNN, SVD-EDS-GST-LSTM, STFT-ViT, GST-ViT, and SVD-EDS-GST-ViT, the accuracy was improved by 3.5%, 3.84%, 4.8%, 8.02%, and 0.56%, respectively, and the proposed method also had the smallest standard deviation.

Keywords:

agricultural machinery rolling bearing; fault diagnosis; SVD-EDS; GST; ResViT

1. Introduction

Today, the world is undergoing rapid changes, including population growth and changes in the global climate and ecology, so sustainable, safe, and intelligent solutions for food production are urgently needed. Intelligent fault diagnosis research on agricultural machinery components is critical in this context [1]. Agricultural machinery is widely used in farming, sowing, fertilization, harvesting, and other aspects of modern agricultural production. Given the major role of this machinery, its efficient operation directly affects the efficiency and output of agricultural production [2]. Rolling bearings are a key transmission component in agricultural machinery, responsible for supporting rotating shafts and reducing friction, thereby improving mechanical efficiency and reducing energy loss. However, the operating conditions of agricultural machinery differ greatly from those of other machinery. The operating environment is particularly harsh and changeable, including exposure to dust, moisture, and changing loads, and these conditions readily produce fatigue-related problems such as increased strain, wear, and insufficient lubrication, which lead to the deterioration of bearing performance and failure [3]. The failure of a rolling bearing not only leads to the shutdown of agricultural machinery but may also cause more serious mechanical damage, resulting in economic losses. Therefore, it is very important to diagnose faults in agricultural machinery rolling bearings [4].

Traditional fault diagnosis methods mainly rely on practical experience and regular maintenance, but this approach is inefficient and costly and cannot easily provide early warnings [4,5]. With the advancements in artificial intelligence, more and more researchers are using intelligent diagnostic methods to detect faults in mechanical equipment [6]. Xie et al. [7] applied an improved SMOTE method to the early fault diagnosis of rolling bearings in agricultural machinery, enhancing the effectiveness of diagnosis under imbalanced data conditions. Wei et al. [8] applied improved DE and improved VMD algorithms for fault diagnosis in combine harvesters, successfully identifying faults in rolling bearings. Liu et al. [9] used ABC-VMD to decompose the signal of rolling bearing faults in a corn harvester, selected the best decomposed signal, and then proposed an improved EfficientNet model for diagnosis, realizing the accurate identification of rolling bearing fault conditions in a heavy noise environment. Fan et al. [10] proposed an algorithm combining FUDL and SC and applied it to tractor fault diagnosis, successfully identifying bearing and gearbox faults. Mystkowski et al. [11] applied different machine learning methods to create a model based on MLP for the fault measurement and diagnosis of rotary hay stands. Luo et al. [12] proposed a learning vector quantization neural network and a CNN VGG-16 network for tractor state recognition, providing theoretical support for angle detection error correction. Wang et al. [13] proposed a DCNN-SR-based infrared thermal imaging method for the early fault diagnosis of diesel engines, which showed a good ability to resist interference from temperature fluctuations.

The above methods show the application potential of artificial intelligence algorithms in fault analysis, but they also show that it is difficult to achieve both rapidity and accuracy in a model. In 2017, Google proposed the Transformer model. Traditional neural networks are not easy to parallelize, but the Transformer model shows great potential in solving this problem with its unique self-attention mechanism. Dosovitskiy et al. [14] introduced it into the field of image recognition and built the Vision Transformer (ViT) network structure, which does not rely on recurrent neural networks or convolutional neural networks. Its recognition accuracy and efficiency were found to be significantly improved, compared with those of traditional neural network models, when processing large-scale data sets [15]. This model has been widely applied. For example, Salamai et al. [16] used a ViT network for the classification and detection of agricultural rice diseases. Jamil et al. [17] used ViT for the medical classification and detection of heart valve diseases. Wang et al. [18] combined EfficientNet and Vision Transformer for cross-domain bearing fault diagnosis under mixed working conditions and achieved good results. To sum up, Vision Transformer has been proven to be superior to convolutional networks in many fields. However, when processing classification tasks, the image segmentation method of ViT may lead to the loss of local information in the curve. Therefore, in order to complete the task of fault classification more effectively, ViT needs to be improved.

At present, the vast majority of agricultural machinery fault diagnosis methods based on vibration signals still use acceleration sensors to obtain one-dimensional vibration signals of rolling bearings for analysis and processing, followed by traditional machine learning methods and deep learning methods for pattern recognition and classification [19]. Because the working environment of agricultural machinery is very complicated, the vibration signal is often subject to various types of interference, which complicates fault feature extraction. Converting a one-dimensional sequence signal into an image has potential advantages for feature extraction and differentiation in fault diagnosis. Deep learning also has better diagnostic capabilities than traditional machine learning methods. For example, Wang et al. [20] applied an optimized BPNN and CNN method to the fault diagnosis of a tractor’s hydraulic continuously variable transmission shift system, converted the one-dimensional signal into a two-dimensional gray-level map, and adopted a convolutional neural network for diagnosis, which greatly improved the diagnosis accuracy when compared with the one-dimensional signal and BPNN.

In order to solve the above problems, a new fault diagnosis method based on SVD-EDS-GST and ResViT for rolling bearings in agricultural machinery is proposed in this paper. SVD-EDS-GST technology is used to enhance the feature extraction process by breaking down the vibration signals and identifying the most important components that represent fault-related information. Then, the proposed ResViT is used for feature extraction and pattern recognition on the GST fault images. The main contents and innovations of this paper are as follows:

(1) An experimental platform for the fault diagnosis of rolling bearings in agricultural machinery is built, and a new fault diagnosis method for such bearings with SVD-EDS-GST and ResViT is proposed.
(2) SVD combined with the EDS is proposed to reduce the noise in vibration signals in order to remove the interference of complex noise and redundant components. The vibration signal noise reduction is realized via matrix reconstruction.
(3) GST is applied to the vibration signal of rolling bearings in agricultural machinery after noise removal; thus, the one-dimensional vibration signal is converted into a two-dimensional time–frequency image, and a fault data set is established.
(4) An improved ViT model combined with ResNet34 (ResViT) is proposed. This model uses a ResNet34 network to replace the image segmentation mechanism in the original Vision Transformer model for feature extraction, which makes the training more efficient and has strong robustness to small perturbations. At the same time, considering that the attention mechanism is not sensitive to the element position when processing global information, relative position coding is used instead of absolute position coding in order to better retain spatial information.
(5) Comparison experiments with the popular CNN, LSTM, and Transformer deep learning models verify that the proposed fault diagnosis model combining SVD-EDS-GST with ResViT has good test accuracy and stability.

The framework of the rest of this paper is as follows: The second section introduces relevant theories, including SVD, EDS, GST, and ViT. The third section introduces the fault diagnosis model based on SVD-EDS-GST and ResViT for rolling bearings in agricultural machinery. The fourth section describes the setup of the experimental platform for rolling bearing failure in agricultural machinery and introduces the process of data acquisition. In Section 5, the experimental results are analyzed from many aspects and compared with other models. The sixth section presents the conclusions.

2. Basic Principles

2.1. SVD-EDS

As a key matrix decomposition technique, SVD plays an important role in numerical analysis and linear algebra. In particular, in the processing of rolling bearing vibration signals, SVD technology greatly retains the fault information of the original vibration signal by removing noise [21]. However, the methods commonly used to determine the effective order of SVD often rely on the user’s experience, resulting in their unsatisfactory noise reduction performance when applied to rolling bearing vibration signals [22]. To overcome this challenge, a new strategy, singular value decomposition combined with the energy difference spectrum (EDS), is proposed in this paper to determine the order of SVD more precisely, so as to improve the noise reduction effect on the vibration signals of agricultural machinery rolling bearings.

The one-dimensional signal $A = \{a_{1}, a_{2}, \cdots, a_{N}\}$ is transformed into a two-dimensional form using the Hankel matrix [23], as shown in Equation (1):

$$A_{m \times n} = \begin{bmatrix} a(1) & \cdots & a(n) \\ \vdots & \ddots & \vdots \\ a(m) & \cdots & a(N) \end{bmatrix} = D_{m \times n} + W_{m \times n} \qquad (1)$$

Here, $A_{m \times n}$ represents the Hankel matrix, the noise signal is $W_{m \times n}$, and the processed signal is $D_{m \times n}$. When $m = N / 2$, that is, when m is half the length of the original fault signal, the noise reduction effect of the Hankel matrix is generally more obvious, and the fault characteristics of the original signal can be retained [21].

In this study, the energy difference spectrum (EDS) is used to determine the singular value order, as shown in Equation (2):

$$E = \sum_{i = 1}^{q} \sigma_{i}^{2} \qquad (2)$$

Here, the energy is represented by E, and the singular value is represented by $\sigma_{i}$. Using Equation (3) eliminates the influence of energy differences between different signals:

$$p(i) = \frac{\sigma_{i}^{2} - \sigma_{i + 1}^{2}}{E} \qquad (3)$$

Here, the sequence p(i) represents the energy difference spectrum. The change in energy between singular values of adjacent orders helps to separate the useful signal from the noise: a distinct peak in p(i) marks the order at which the two separate. The singular values up to that order are retained, the remaining ones are discarded, and the signal is reconstructed from the truncated matrix, which effectively removes the noise from the rolling bearing signal.
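The procedure of Equations (1) to (3) can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the effective order is taken as the location of the largest value of p(i), and the denoised series is recovered by averaging the anti-diagonals of the truncated matrix (a common choice that the text does not specify).

```python
import numpy as np

def hankel_matrix(signal, m):
    """Build the m x n Hankel matrix of Equation (1), with n = N - m + 1."""
    signal = np.asarray(signal, dtype=float)
    n = signal.size - m + 1
    idx = np.arange(m)[:, None] + np.arange(n)[None, :]
    return signal[idx]

def energy_difference_spectrum(singular_values):
    """Equations (2)-(3): energy drop between adjacent orders, divided by E."""
    energy = singular_values ** 2
    return (energy[:-1] - energy[1:]) / energy.sum()

def svd_eds_denoise(signal, order=None):
    """Denoise a 1-D signal by truncating the SVD of its Hankel matrix."""
    signal = np.asarray(signal, dtype=float)
    N = signal.size
    m = N // 2                                   # m = N / 2, as in the text
    H = hankel_matrix(signal, m)
    U, s, Vt = np.linalg.svd(H, full_matrices=False)
    p = energy_difference_spectrum(s)
    if order is None:
        # ASSUMPTION: the effective order is where p(i) is largest.
        order = int(np.argmax(p)) + 1
    H_trunc = (U[:, :order] * s[:order]) @ Vt[:order]
    # ASSUMPTION: map the matrix back to a series by anti-diagonal averaging.
    n = H.shape[1]
    idx = (np.arange(m)[:, None] + np.arange(n)[None, :]).ravel()
    sums = np.bincount(idx, weights=H_trunc.ravel(), minlength=N)
    counts = np.bincount(idx, minlength=N)
    return sums / counts, order, p

# Synthetic check: one sinusoid (which occupies two singular values) plus noise.
rng = np.random.default_rng(0)
t = np.arange(256)
clean = np.sin(2 * np.pi * 0.05 * t)
noisy = clean + 0.1 * rng.standard_normal(t.size)
denoised, order, p = svd_eds_denoise(noisy)
print("selected order:", order)
print("MSE before:", np.mean((noisy - clean) ** 2))
print("MSE after: ", np.mean((denoised - clean) ** 2))
```

On real bearing signals the spectrum may contain several peaks, so in practice the order is usually confirmed by inspecting the plot of p(i) rather than by taking the maximum blindly.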

2.2. GST

The generalized S-transform method combines time–domain and frequency–domain analysis to obtain instantaneous frequency information from a signal [24]. It builds on the standard S-transform, which is expressed as shown in Equation (4):

$$ST_{x}(t, f) = \int_{-\infty}^{+\infty} x(\tau)\, w(t - \tau)\, e^{-j 2 \pi f \tau}\, d\tau \qquad (4)$$

In Equation (4), $ST_{x}(t, f)$ represents the S-transform, $x(\tau)$ represents the signal to be analyzed, and the translation quantity is represented by $\tau$. $w(t)$ represents the Gaussian window function, $w(t) = \frac{1}{\sigma(f) \sqrt{2 \pi}} \exp\left(\frac{-t^{2}}{2 \sigma(f)^{2}}\right)$, with $\sigma(f) = 1 / |f|$, so that $w(t - \tau) = \frac{|f|}{\sqrt{2 \pi}} e^{-\frac{(t - \tau)^{2} f^{2}}{2}}$. Introducing a parameter m into the S-transform to adjust the width of the Gaussian window yields the GST [25], which is expressed as shown in Equation (5):

$$GST_{x}(t, f) = \int_{-\infty}^{+\infty} x(\tau)\, w_{m}(t - \tau)\, e^{-j 2 \pi f \tau}\, d\tau \qquad (5)$$

Here, $w_{m}(t) = \frac{1}{\sigma_{m}(f) \sqrt{2 \pi}} \exp\left(\frac{-t^{2}}{2 \sigma_{m}(f)^{2}}\right)$, and $\sigma_{m}(f) = 1 / |f|^{m}$.
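To make the role of the window width concrete, the sketch below discretizes the GST as a windowed Fourier integral with a Gaussian window of standard deviation 1/|f|^m. It is an illustrative, unoptimized time-domain implementation under assumptions: the function name, the test signal, and the frequency grid are hypothetical, and the integral is approximated by a Riemann sum.

```python
import numpy as np

def generalized_s_transform(x, fs, freqs, m=1.0):
    """Riemann-sum approximation of the GST on a grid of nonzero frequencies.

    Returns a complex array of shape (len(freqs), len(x)); row k holds the
    transform at frequency freqs[k] for every time sample.
    """
    x = np.asarray(x, dtype=float)
    t = np.arange(x.size) / fs
    dt = 1.0 / fs
    diff = t[:, None] - t[None, :]               # (t - tau) for all pairs
    out = np.empty((len(freqs), x.size), dtype=complex)
    for k, f in enumerate(freqs):
        sigma = 1.0 / abs(f) ** m                # window width sigma_m(f)
        window = np.exp(-diff ** 2 / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))
        carrier = x * np.exp(-2j * np.pi * f * t)
        out[k] = (window @ carrier) * dt         # sum over tau
    return out

# Hypothetical test signal: a 50 Hz sinusoid sampled at 1 kHz.
fs = 1000.0
t = np.arange(256) / fs
x = np.sin(2 * np.pi * 50.0 * t)
freqs = np.arange(10.0, 101.0, 10.0)
tf = generalized_s_transform(x, fs, freqs, m=1.0)
magnitude = np.abs(tf)                           # rows: frequency, columns: time
print(freqs[np.argmax(magnitude[:, 128])])
```

With m = 1 this reduces to the standard S-transform. The magnitude array is what would be rendered as the time–frequency image; for long signals an FFT-based implementation is preferable to this O(N²) loop.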

2.3. ResNet

ResNet (Residual Network) is a deep convolutional neural network architecture proposed by Microsoft Research Asia [26]. By introducing residual learning, very deep networks can be trained. The ResNet model is an improvement on VGG due to the introduction of residual network units, which use skip connections to directly add inputs to outputs, thus propagating information more effectively and making deep network structures easier to optimize and train. The most common residual blocks include Basic Block and Bottleneck Block [27], as shown in Figure 1a,b.

Basic Block consists of two convolution layers with 3 × 3 convolution kernels and a skip connection that adds the input directly to the output. Bottleneck Block contains a 1 × 1 convolution layer (to reduce the input dimensions), a 3 × 3 convolution layer, and a second 1 × 1 convolution layer (to restore the dimensions). This design can significantly reduce the number of parameters and the amount of computation in the network, improving network performance. ResNet is available in many different network depths [26].
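The parameter saving of the bottleneck design can be checked with simple arithmetic. The sketch below counts convolution weights only and rests on illustrative assumptions: both blocks are compared at the same input and output width of 256 channels, the bottleneck reduces the width by a factor of 4, and bias and batch-normalization parameters are ignored.

```python
def conv_params(kernel, c_in, c_out):
    """Number of weights in a kernel x kernel convolution (no bias)."""
    return kernel * kernel * c_in * c_out

def basic_block_params(channels):
    """Two 3 x 3 convolutions at full width."""
    return 2 * conv_params(3, channels, channels)

def bottleneck_block_params(channels, reduction=4):
    """1 x 1 reduce, 3 x 3 at reduced width, 1 x 1 restore."""
    mid = channels // reduction
    return (conv_params(1, channels, mid)
            + conv_params(3, mid, mid)
            + conv_params(1, mid, channels))

basic = basic_block_params(256)          # 1,179,648 weights
bottleneck = bottleneck_block_params(256)  # 69,632 weights
print(basic, bottleneck, round(basic / bottleneck, 1))
```

Under these assumptions the bottleneck block needs roughly one seventeenth of the weights of a basic block of the same width, which is why it is preferred in the deeper ResNet variants.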

2.4. Vision Transformer

In 2020, Dosovitskiy et al. proposed an image classification model with Vision Transformer (ViT) architecture [14]. Its structure is shown in Figure 2.

(1) Patch and Position Embedding module. First, the GST two-dimensional fault image is divided into image blocks of the same size. These image blocks are then linearly projected into a low-dimensional space, and a position encoding containing spatial information is added to form the input to the Transformer Encoder layer [28].
(2) Transformer Encoder module. The Transformer Encoder can efficiently extract the features of the input data [29]. Its structure is shown in Figure 3. The input to the Transformer Encoder is first normalized to improve training stability. A multi-head self-attention mechanism then captures the dependencies within the sequence, and dropout is applied to prevent overfitting. After that, the processed data are fused with the original input using a residual connection and are normalized again. Finally, the data are fed into the MLP block for feature transformation, and the original information is again retained through a residual connection [30].
(3) Classification module (MLP head). The features output by the Transformer Encoder are fed into the MLP head, which produces the fault classification result.
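The data flow through one encoder layer (normalize, self-attention, residual, normalize, MLP, residual) can be sketched in NumPy as follows. This is a simplified illustration under assumptions, not the configuration used in the paper: it uses a single attention head, random weights, ReLU in the MLP, no learnable normalization parameters, and no dropout, and all sizes are hypothetical.

```python
import numpy as np

def layer_norm(x, eps=1e-6):
    """Normalize each token vector to zero mean and unit variance."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)        # for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product attention over a token sequence."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    weights = softmax(q @ k.T / np.sqrt(q.shape[-1]))
    return weights @ v, weights

def encoder_block(x, params):
    """Norm -> attention -> residual, then norm -> MLP -> residual."""
    attn_out, _ = self_attention(layer_norm(x), params["w_q"], params["w_k"], params["w_v"])
    x = x + attn_out                             # first residual connection
    hidden = np.maximum(layer_norm(x) @ params["w1"], 0.0)  # ASSUMPTION: ReLU
    return x + hidden @ params["w2"]             # second residual connection

# Hypothetical sizes: 196 tokens of dimension 32, MLP hidden width 64.
rng = np.random.default_rng(0)
tokens, dim, hidden_dim = 196, 32, 64
shapes = {"w_q": (dim, dim), "w_k": (dim, dim), "w_v": (dim, dim),
          "w1": (dim, hidden_dim), "w2": (hidden_dim, dim)}
params = {name: 0.1 * rng.standard_normal(shape) for name, shape in shapes.items()}
x = rng.standard_normal((tokens, dim))
y = encoder_block(x, params)
print(y.shape)
```

Each row of the attention weight matrix sums to one, so every output token is a weighted average over all tokens, which is how the encoder gathers global context.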

3. Fault Diagnosis Model

3.1. ResViT Model

In image processing, the first problem to be solved is to transform the two-dimensional vibration image into a two-dimensional matrix that the model can receive as input. Vision Transformer divides the input two-dimensional image into image blocks, then transforms these image blocks into one-dimensional sequences and uses the self-attention mechanism to learn the relationship between the position and sequence of each image block. Finally, the learned image data features are used in classification tasks. When expanding into a one-dimensional sequence form, a 16 × 16 two-dimensional convolution with a step size of 16 is usually used. If the step size and convolution kernel are too large, they may not be effective in capturing local information in the image [31].

In order to solve the above problems and improve the feature extraction performance of the model, a ResNet network was used to replace the image chunking mechanism of the Vision Transformer model for feature extraction. The ResNet network gradually extracts image features through cascaded convolutional and pooling layers, which can capture feature information from low level to high level. Compared with the original Vision Transformer, in which the image must be chunked before the self-attention mechanism is applied, ResNet can offer better computational performance and speed when dealing with a large amount of image information.

After a comprehensive analysis, a specific Residual Network model, ResNet34, was used. It has certain advantages over both smaller (such as ResNet18) and larger (such as ResNet50 or larger) ResNet models [27]. ResNet34 may offer a good balance of keeping the computational complexity low while achieving high performance. ResNet34 is shown in Figure 4.

At the end of each ResNet34 basic block, there is a convolution layer that adjusts the dimensions to ensure that the input has the same dimensions as the output of the block. At the beginning of each stage, there is a downsampling convolution layer (stride = 2) that halves the size of the feature map, so each stage outputs a feature map half the size of its input [27]. To sum up, when an image passes through the ResNet34 model, the total downsampling factor is 2 × 2 × 2 × 2 = 2^4, that is, 16-times downsampling. The specific process is shown in Figure 5.
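The downsampling arithmetic can be verified in a few lines. The sketch assumes four stride-2 steps and a 224 × 224 input image; the input size is an assumption chosen for illustration, since the text does not state it here.

```python
def feature_map_side(input_side, strides):
    """Side length of the feature map after a sequence of strided steps."""
    side = input_side
    for stride in strides:
        side //= stride
    return side

strides = [2, 2, 2, 2]               # four halving steps
total_factor = 1
for stride in strides:
    total_factor *= stride           # factors multiply: 2 * 2 * 2 * 2 = 16

side = feature_map_side(224, strides)  # ASSUMPTION: 224 x 224 input
tokens = side * side
print(total_factor, side, tokens)
```

With these assumptions the backbone yields a 14 × 14 feature map, that is, 196 spatial positions that can be passed to the Transformer encoder as tokens.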

Considering that the attention mechanism is not sensitive to the element position when processing global information, relative position coding was used instead of absolute position coding in order to better retain spatial information. The parameters used in the Vision Transformer model are shown in Table 1.
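One common way to realize relative position coding on a two-dimensional token grid is a learned bias table indexed by the row and column offset between two tokens. The sketch below builds that index for a 14 × 14 grid. It is an assumption made for illustration: the text does not state which relative position scheme was used, and the function name and grid size are hypothetical.

```python
import numpy as np

def relative_position_index(height, width):
    """Map every token pair to an entry of a (2H-1)*(2W-1) bias table."""
    rows, cols = np.meshgrid(np.arange(height), np.arange(width), indexing="ij")
    coords = np.stack([rows, cols]).reshape(2, -1)       # (2, H*W)
    rel = coords[:, :, None] - coords[:, None, :]        # (2, H*W, H*W) offsets
    rel[0] += height - 1                                 # shift rows to start at 0
    rel[1] += width - 1                                  # shift columns to start at 0
    return rel[0] * (2 * width - 1) + rel[1]

index = relative_position_index(14, 14)
table_size = (2 * 14 - 1) * (2 * 14 - 1)                 # 729 distinct offsets
bias_table = np.zeros(table_size)                        # learnable in a real model
attention_bias = bias_table[index]                       # added to attention logits
print(index.shape, table_size)
```

Because the index depends only on the offset between two tokens, every pair with the same relative displacement shares one bias value, whereas absolute position coding assigns a separate vector to each location.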

In the traditional Vision Transformer model, a larger patch size is usually used, such as 16 × 16 or 32 × 32. Here, in order to match the ResNet processed image data input, a patch size of 14 × 14 was chosen. The use of a smaller patch size (e.g., 14 × 14) also increases the model’s sensitivity to image layers.

3.2. Proposed SVD-EDS-GST and ResViT Fault Diagnosis Model

Fault diagnosis of rolling bearings in agricultural machines using SVD-EDS-GST and ResViT is shown in Figure 6.

The model flow is divided into the following parts: (1) Firstly, a piezoelectric acceleration sensor is used to obtain the vibration signal of a rolling bearing in agricultural machinery. The analog signal from the sensor is received by the data acquisition card and converted into a digital signal, which is then transmitted to the computer through the computer interface and processed by YE7600 software to convert the file into a .mat format. (2) The combined SVD and energy difference spectrum method is used to reduce the noise in the signal. (3) GST is used to process vibration signals into images. (4) A ResNet34 model is used to replace the image segmentation mechanism in the original Vision Transformer model for feature extraction, and considering that the attention mechanism is not sensitive to the element position when processing global information, relative position coding was used instead of absolute position coding. (5) A Vision Transformer model is used to recognize patterns in the image data and output the training results. (6) After the training is completed, the trained model is used to conduct a rolling bearing fault classification test.

4. Experimental Platform for Rolling Bearing Failure in Agricultural Machinery

The rolling bearing failure experimental platform constructed in this paper mainly consisted of a three-phase asynchronous motor (Model: YE3-100L2-4, Manufacturer: Zhejiang Jinsu Electric Motor Co., Ltd., Taizhou, China), gearbox reducer (Model: JZQ250, Manufacturer: Shandong Zibo Xinyuan Machinery Co., Ltd., Zibo, China), data acquisition card (Model: YE6231, Manufacturer: Jiangsu Lianeng Electronics Technology Co., Ltd., Yangzhou, China), frequency converter (Model: G7R5/P011T4, Manufacturer: Zhejiang Xintuo New Energy Co., Ltd., Yueqing, China), piezoelectric acceleration sensor (Model: CAYD051V, Manufacturer: Xiamen Nair Electronics Co., Ltd., Xiamen, China), and other components [23]. The flow of the test program is shown in Figure 7. Following the flow chart, the rolling bearing fault diagnosis experimental platform was constructed, and a photograph of the completed fault diagnosis platform is shown in Figure 8.

A total of four fault states and one normal state were set on the rolling bearing fault test platform, namely, the inner ring fault, cage fracture fault, outer ring fault, rolling element fault, and a normal state. The fault form of each state is shown in Figure 9. The data settings are shown in Table 2.

For each of the five rolling bearing states, the sampling frequency was set to 6 kHz, with 1000 groups of samples, for a total of 5000 groups. Each group of samples had 1024 sampling signal points. The sample groups were divided into a training set, validation set, and test set in a ratio of 7:2:1, i.e., 3500 groups for the training set, 1000 groups for the validation set, and 500 groups for the test set. The specific division is shown in Table 3.
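The split sizes follow directly from the 7:2:1 ratio. The sketch below reproduces the counts and shows one possible way to draw the split. It assumes that the ratio is applied within each bearing state (a stratified split) and that samples are shuffled with a fixed seed; neither detail is stated in the text, and the function name is hypothetical.

```python
import random

def split_indices(n_per_class, n_classes, ratios=(7, 2, 1), seed=0):
    """Split sample indices of every class into train/val/test by ratio."""
    rng = random.Random(seed)
    total = sum(ratios)
    n_train = n_per_class * ratios[0] // total
    n_val = n_per_class * ratios[1] // total
    splits = {"train": [], "val": [], "test": []}
    for label in range(n_classes):
        idx = list(range(n_per_class))
        rng.shuffle(idx)
        splits["train"] += [(label, i) for i in idx[:n_train]]
        splits["val"] += [(label, i) for i in idx[n_train:n_train + n_val]]
        splits["test"] += [(label, i) for i in idx[n_train + n_val:]]
    return splits

splits = split_indices(n_per_class=1000, n_classes=5)
print({name: len(items) for name, items in splits.items()})

# Each sample holds 1024 points at 6 kHz, i.e. about 0.17 s of vibration data.
segment_seconds = 1024 / 6000
print(round(segment_seconds, 4))
```

Splitting per class keeps the five states equally represented in all three sets, which matters when per-class recognition rates are compared.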

5. Results and Discussion

5.1. Vibration Signal Denoising Analysis

This study uses the SVD-EDS method for denoising. A specific analysis follows. As shown in Figure 10, the singular-value curve changed markedly at around the 30th order, after which its rate of decline slowed considerably. Because this transition is difficult to locate precisely from the singular-value curve alone, it was analyzed further by means of the energy difference spectrum. As shown in Figure 11, the energy difference spectrum provided a more explicit order identification. At the 30th order, a distinct peak can be observed, indicating a sudden change in energy. This indicates that the useful signal and noise were separated at this point. The singular values up to order 30 correspond to useful signals, and the singular values after that are mainly noise.

As shown in Figure 12, signals reconstructed using the first 30 singular values show better periodicity and readability. The processed data (shown in red) have a clearer signal and less noise than the original data (shown in blue). This supports the use of order 30 for signal reconstruction.

In summary, using the SVD-EDS, the optimal order of vibration signal denoising for rolling bearings was determined to be 30. This significantly improved the clarity of useful signals and was a reliable signal-processing method.

5.2. GST 2D Graph Conversion

GST was used to convert the vibration signals denoised via SVD and EDS into two-dimensional time–frequency graphs, as shown in Figure 13.

5.3. Fault Diagnosis Results

5.3.1. Fault Diagnosis Model Analysis

The method used in this study was implemented and tested with the PyTorch deep learning framework on a Windows 10 system. AdamW was selected as the optimizer, and cosine annealing was chosen as the learning rate adjustment strategy. Figure 14 shows the iterative changes in the accuracy and loss of the proposed method on the training and validation sets.
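Cosine annealing lowers the learning rate from its initial value to a minimum along half a cosine period. The closed form can be sketched as below. The initial learning rate of 1e-3, the minimum of 0, and the 100-epoch horizon are hypothetical values chosen for illustration, not the settings used in this study.

```python
import math

def cosine_annealing_lr(epoch, total_epochs, lr_max, lr_min=0.0):
    """Learning rate after `epoch` epochs of a single cosine decay."""
    cosine = math.cos(math.pi * epoch / total_epochs)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + cosine)

# ASSUMPTION: hypothetical schedule, 100 epochs starting from 1e-3.
schedule = [cosine_annealing_lr(e, 100, 1e-3) for e in range(101)]
print(schedule[0], schedule[50], schedule[100])
```

The rate changes slowly at the start and end and fastest in the middle, so the optimizer takes large steps early and progressively smaller ones as training converges.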

First, consider the loss curves in Figure 14a. The loss values on the training and validation samples decreased continuously as the number of epochs increased and finally became relatively stable. Within the first 10 epochs, both losses decreased very rapidly and at a similar rate, from about 1.75 to about 0.25, indicating that the model was converging rapidly. Beyond 10 epochs, the training loss and validation loss continued to decrease gradually, approaching near-zero values by epoch 80. This suggests that the model was learning effectively and fitting the training data. The curves converged well, with no sign of overfitting.

Next, consider the accuracy curves in Figure 14. The training accuracy and validation accuracy started low, increased rapidly within the first 20 epochs, and then continued to improve gradually. From epoch 10 to epoch 40, the accuracy on the training and validation samples rose relatively steadily and slowly, from about 80% to about 98%. After 40 epochs, the accuracy curves of the training and validation samples almost coincided and remained basically unchanged. The curves were smooth on the whole, without large fluctuations, which indicates that the network model was trained successfully and that it combined fast convergence with high fault classification accuracy.

5.3.2. Feature Visualization Analysis

The features output by the trained model are visualized in three dimensions using t-SNE [32], as shown in Figure 15.

From the three-dimensional graph (a), it can be seen that the fault features of each category in the original data are relatively scattered and that there is a lot of overlap between different categories, so the various faults cannot be accurately identified. After feature extraction and training with the method proposed in this paper, the three-dimensional graph (b) shows that the samples of each category are clustered much more tightly and that the overlap between categories is significantly reduced, indicating good accuracy and stability. This further supports the effectiveness of the proposed model as a new fault diagnosis method.

5.3.3. Result Analysis

The confusion matrix [23] of the classification results of the test samples is shown in Figure 16.

As can be seen from the sample test results in Figure 16, the recognition rate of the model for label 1 (normal state) in this recognition process reached 100%. This shows that the model can identify the normal state of a rolling bearing very accurately. For label 0 (cage breakage), label 3 (rolling element failure), and label 4 (outer ring failure), the recognition rates reached 96.51%, 98.98%, and 97.48%, respectively, and the overall performance was very good. For label 2 (inner ring failure), the recognition rate was relatively low, reaching 94.23%. In all, 4.65% of the inner ring failure conditions were mistaken for bearing cage breakage, and 1.68% were mistaken for outer ring failure. Therefore, the proposed method has satisfactory diagnostic accuracy.
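Per-class recognition rates of this kind are the row-normalized diagonal of the confusion matrix. The sketch below shows the computation on a small three-class matrix. The counts are invented for illustration only and are not taken from Figure 16.

```python
def per_class_recall(confusion):
    """Recognition rate of each true class: diagonal count / row total."""
    return [row[i] / sum(row) for i, row in enumerate(confusion)]

def overall_accuracy(confusion):
    """Correctly classified samples divided by all samples."""
    correct = sum(row[i] for i, row in enumerate(confusion))
    total = sum(sum(row) for row in confusion)
    return correct / total

# HYPOTHETICAL counts: rows are true labels, columns are predicted labels.
confusion = [
    [48, 1, 1],
    [0, 50, 0],
    [2, 0, 48],
]
recalls = per_class_recall(confusion)
accuracy = overall_accuracy(confusion)
print(recalls, round(accuracy, 4))
```

Reading along a row shows which classes a given fault is mistaken for; the misclassification percentages of one row, together with its recognition rate, should sum to 100%.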

The results of repeated experiments were obtained and are shown in Figure 17. When the fault diagnosis model based on SVD-EDS-GST combined with ResViT was run ten times, the sixth time had the highest recognition rate of 99.88%, and the third time had the lowest recognition rate of 97.63%. The average accuracy of 10 runs is 99.08%, indicating that the proposed method has satisfactory diagnostic accuracy and stability.

5.4. Comparative Analysis

In order to illustrate the advantages of the SVD-EDS-GST/ResViT method proposed in this paper in fault diagnosis, it is compared with several CNN, LSTM, and Transformer deep learning models on the same diagnosis case. All models were set with the same parameters and data preprocessing to enable fair comparison. To reduce the impact of randomness, the diagnostic results obtained from 10 repeated experiments are averaged, and the standard deviation of each diagnostic result is calculated, as shown in Table 4.
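The averaging over repeated runs can be sketched with the standard library. The ten accuracy values below are placeholders invented for illustration, not the measured results, and the use of the sample standard deviation (n − 1 in the denominator) is an assumption, since the text does not say which estimator was used.

```python
import statistics

def summarize_runs(accuracies):
    """Mean and sample standard deviation over repeated experiments."""
    return statistics.mean(accuracies), statistics.stdev(accuracies)

# HYPOTHETICAL accuracies (%) from ten repeated runs of one model.
hypothetical_runs = [98.5, 99.0, 99.5, 99.0, 98.0, 99.5, 99.0, 98.5, 99.5, 99.5]
mean_acc, std_acc = summarize_runs(hypothetical_runs)
print(f"{mean_acc:.2f} +/- {std_acc:.2f}")
```

Reporting the standard deviation alongside the mean makes it possible to judge whether a difference in average accuracy between two models exceeds the run-to-run variation.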

As shown in Table 4, the identification accuracy of the fault diagnosis model proposed in this paper, based on SVD-EDS-GST combined with ResViT, reached 99.076%. Compared with STFT-ViT, SVD-EDS-GST-2DCNN, SVD-EDS-GST-LSTM, GST-ViT, and SVD-EDS-GST-ViT, the accuracy rate was improved by 3.5%, 3.84%, 4.8%, 8.02%, and 0.56%, respectively. The model proposed in this study also has the smallest standard deviation, 0.4128. These results show that the method proposed in this paper has good accuracy and stability.

6. Conclusions

In the complex and harsh environment of agriculture, research on the intelligent fault diagnosis of agricultural machinery components is very important. The Vision Transformer model has high accuracy in image classification, but it processes image blocks instead of traditional pixel-level features, which may affect the acquisition of local information in some cases. To solve this problem, this study proposed a method based on SVD-EDS-GST combined with ResViT for the fault diagnosis of rolling bearings in agricultural machinery. Through comprehensive experimental verification and performance evaluation, the following conclusions were drawn:

(1) Singular value decomposition combined with the energy difference spectrum method can be used to accurately reconstruct signals and significantly improve the quality and clarity of useful signals, effectively de-noising rolling bearing vibration signals.

(2) The ResNet-Vision Transformer method proposed in this paper integrates the ResNet34 architecture and retains the relative relationships between locations through relative position coding to enhance the model’s generalization ability. The experimental results showed that the average accuracy of the fault diagnosis model used in this paper was 99.08% for different fault states of rolling bearings, with higher accuracy and stability.

(3) The proposed fault diagnosis model based on SVD-EDS-GST combined with ResViT provided a more significant improvement when compared to STFT-ViT, SVD-EDS-GST-2DCNN, SVD-EDS-GST-LSTM, GST-ViT, and SVD-EDS-GST-ViT, with improvements in accuracy of 3.5%, 3.84%, 4.8%, 8.02%, and 0.56%, respectively. The standard deviation was also minimized, again proving that the proposed model has good accuracy and stability.

Although the diagnostic results of this study are very good, there are still some limitations. The method we proposed is currently only studied and applied under laboratory offline conditions, and cannot be used for online real-time monitoring. In the future, we will monitor the rolling bearing status online in real time and realize early prediction of faults through deep learning models. Agricultural machinery and equipment often operate under variable working conditions, and the introduction of transfer learning can be considered in future work. It is also hoped that the experimental theory can be applied to engineering practice.

Author Contributions

Conceptualization, F.X. and Y.W.; methodology, F.X. and G.W.; validation, F.X. and Y.W.; investigation, F.X., E.S. and Q.F.; writing—original draft preparation, F.X. and Y.W.; writing—review and editing, F.X. and M.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (52265068), the Natural Science Foundation of Jiangxi Province (20224BAB204050), Equipment Key Laboratory Project of the Ministry of Education (KLCEZ2022-02), and the Project of Jiangxi Provincial Department of Education (GJJ2200627).

Institutional Review Board Statement

Not applicable.

Data Availability Statement

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Zhao, S.; Li, T.; Wang, G.; Zhang, Y. Adjustment of Meat Consumption Structure under the Dual Goals of Food Security and Carbon Reduction in China. Agriculture 2023, 13, 2242.
Li, C.; Wu, J.; Pan, X.; Dou, H.; Zhao, X.; Gao, Y.; Yang, S.; Zhai, C. Design and Experiment of a Breakpoint Continuous Spraying System for Automatic-Guidance Boom Sprayers. Agriculture 2023, 13, 2203.
Fargnoli, M.; Lombardi, M. Safety Vision of Agricultural Tractors: An Engineering Perspective Based on Recent Studies (2009–2019). Safety 2020, 6, 1.
Liu, Z.C.; Fang, L.L.; Jiang, D.; Qu, R.H. A machine-learning-based fault diagnosis method with adaptive secondary sampling for multiphase drive systems. IEEE Trans. Power Electron. 2022, 37, 8767–8772.
Cao, H.R.; Shao, H.D.; Zhang, X.; Deng, Q.W.; Yang, X.K.; Xuan, J.P. Unsupervised domain-share CNN for machine fault transfer diagnosis from steady speeds to time-varying speeds. J. Manuf. Syst. 2022, 62, 186–198.
Rui, L.; Ding, X.X.; Wu, S.S.; Wu, Q.H.; Shao, Y.M. Signal processing collaborated with deep learning: An interpretable FIRNet for industrial intelligent diagnosis. Mech. Syst. Signal Process. 2024, 212, 111314.
Xie, F.Y.; Li, G.; Liu, H.; Sun, E.G.; Wang, Y. Advancing Early Fault Diagnosis for Multi-Domain Agricultural Machinery Rolling Bearings through Data Enhancement. Agriculture 2024, 14, 112.
Jiang, W.; Shan, Y.H.; Xue, X.M.; Ma, J.P.; Chen, Z.; Zhang, N. Fault Diagnosis for Rolling Bearing of Combine Harvester Based on Composite-Scale-Variable Dispersion Entropy and Self-Optimization Variational Mode Decomposition Algorithm. Entropy 2023, 25, 1111.
Liu, Z.Y.; Sun, W.L.; Chang, S.K.; Zhang, K.N.; Ba, Y.J.; Jiang, R.B. Corn Harvester Bearing fault diagnosis based on ABC-VMD and optimized EfficientNet. Entropy 2023, 25, 1273.
Fan, W.; Yang, C.X.; Chen, C.; He, C.B.; Yuan, Y.; Li, Y. Adaptive feature-oriented dictionary learning and sparse classification framework for bearing compound fault diagnosis. IEEE Trans. Instrum. Meas. 2024, 73, 1–10.
Mystkowski, A.; Wolniakowski, A.; Idzkowski, A.; Ciężkowski, M.; Ostaszewski, M.; Kociszewski, R.; Kotowski, A.; Kulesza, Z.; Kulesza, S.; Miastkowski, K. Measurement and diagnostic system for detecting and classifying faults in the rotary hay tedder using multilayer perceptron neural networks. Eng. Appl. Artif. Intell. 2024, 133, 108513.
Luo, Y.H.; Li, C.; Jiang, P.; Shi, Y.X.; Li, B.; Hu, W.W. Research on Tractor Condition Recognition Based on Neural Networks. Agriculture 2024, 14, 584.
Wang, R.C.; Jia, X.S.; Liu, Z.C.; Dong, E.Z.; Li, S.Y.; Chen, Z.H. Conditional generative adversarial network based data augmentation for fault diagnosis of diesel engines applied with infrared thermography and deep convolutional neural network. Eksploat. I Niezawodn. 2024, 26, 175291.
Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv 2020, arXiv:2010.11929.
Acheampong, F.A.; Nunoo-Mensah, H.; Chen, W.Y. Transformer models for text-based emotion detection: A review of BERT-based approaches. Artif. Intell. Rev. 2021, 54, 5789–5829.
Salamai, A.A.; Ajabnoor, N.; Khalid, W.E.; Ali, M.M.; Murayr, A.A. Lesion-aware visual transformer network for Paddy diseases detection in precision agriculture. Eur. J. Agron. 2023, 148, 126884.
Jamil, S.; Roy, A.M. An efficient and robust phonocardiography (pcg)-based valvular heart diseases (vhd) detection framework using vision transformer (vit). Comput. Biol. Med. 2023, 158, 106734.
Wang, Y.G.; Zhou, J. Research on bearing fault diagnosis under mixed working condition based on Vision Transformer hybrid model. In Proceedings of the International Conference on Computer Graphics, Artificial Intelligence, and Data Processing (ICCAID 2023), Qingdao, China, 1–3 December 2023.
Wang, J.Y.; Mo, Z.L.; Zhang, H.; Miao, Q. A deep learning method for bearing fault diagnosis based on time-frequency image. IEEE Access 2019, 7, 42373–42383.
Wang, J.B.; Lu, Z.X.; Wang, G.M.; Hussain, G.; Zhao, S.H.; Zhang, H.J.; Xiao, M.H. Research on fault diagnosis of HMCVT shift hydraulic system based on optimized BPNN and CNN. Agriculture 2023, 13, 461.
Thi, T.X.C. Singular value decomposition and applications in data processing and artificial intelligence. HPU2 J. Sci. Nat. Sci. Technol. 2023, 2, 34–41.
Lv, Y.; Zhang, Q.X.; Yuan, R.; Dang, Z.; Ge, M. Local lowest-rank dynamic mode decomposition for transient feature extraction of rolling bearings. ISA Trans. 2023, 133, 539–558.
Xie, F.Y.; Wang, G.; Zhu, H.Y.; Sun, E.G.; Fan, Q.Y.; Wang, Y. Rolling bearing fault diagnosis based on SVD-GST combined with vision transformer. Electronics 2023, 12, 3515.
Wang, H.W.; Fang, Z.W.; Wang, H.L.; Li, Y.A.; Geng, Y.D.; Chen, L.; Chang, X. A novel time-frequency analysis method for fault diagnosis based on generalized S-transform and synchroextracting transform. Meas. Sci. Technol. 2023, 35, 36101.
Cheng, H.; Zhang, Y.; Lu, W.; Yang, Z. A bearing fault diagnosis method based on VMD-SVD and Fuzzy clustering. Int. J. Pattern Recognit. Artif. Intell. 2019, 33, 1950018.
He, C.B.; Cao, Y.J.; Yang, Y.; Liu, Y.B.; Liu, X.Z.; Cao, Z. Fault diagnosis of rotating machinery based on the improved multidimensional normalization ResNet. IEEE Trans. Instrum. Meas. 2023, 72, 1–11.
Wen, L.; Li, X.Y.; Gao, L. A transfer convolutional neural network for fault diagnosis based on ResNet-50. Neural Comput. Appl. 2020, 32, 6111–6124.
Xu, Z.B.; Tang, X.Y.; Wang, Z.G. A multi-information fusion ViT model and its application to the fault diagnosis of bearing with small data samples. Machines 2023, 11, 277.
Liang, P.F.; Yu, Z.Z.; Wang, B.; Xu, X.F.; Tian, J.Y. Fault transfer diagnosis of rolling bearings across multiple working conditions via subdomain adaptation and improved vision transformer network. Adv. Eng. Inform. 2023, 57, 102075.
Hou, Y.D.; Wang, J.J.; Chen, Z.G.; Ma, J.L.; Li, T.Z. Diagnosisformer: An efficient rolling bearing fault diagnosis method based on improved Transformer. Eng. Appl. Artif. Intell. 2023, 124, 106507.
Diao, N.K.; Wang, Z.C.; Ma, H.X.; Yang, W.B. Fault diagnosis of rolling bearing under variable working conditions based on CWT and T-ResNet. J. Vib. Eng. Technol. 2023, 11, 3747–3757.
Liang, Z.G.; Zhang, L.J.; Wang, X.Z. A novel intelligent method for fault diagnosis of steam turbines based on T-SNE and XGBoost. Algorithms 2023, 16, 98.

Figure 1. Residual structures: (a) Basic Block; (b) Bottleneck Block.

Figure 2. Vision Transformer model structure.

Figure 3. Transformer Encoder model structure.

Figure 4. ResNet34.

Figure 5. The ResNet process.

Figure 6. Overall model of SVD-EDS-GST and ResViT.

Figure 7. Fault diagnosis experiment flowchart.

Figure 8. Fault diagnosis experimental platform.

Figure 9. Failure pictures: (a) inner raceway fault (a defect in the inner track on which the bearing balls run); (b) ball fault (a defect in the bearing ball itself); (c) cage fracture (a break in the cage that holds the bearing balls in place); (d) normal (no faults detected); (e) outer raceway fault (a defect in the outer track on which the bearing balls run).

Figure 10. Vibration signal singular-value distribution curve.

Figure 11. Vibration signal EDS distribution curve.

Figure 12. Denoising effect diagram.

Figure 13. GST diagrams of rolling bearings in different states: (a) cage fracture; (b) normal; (c) inner raceway fault; (d) ball fault; (e) outer raceway fault.

Figure 14. Iterative curves: (a) loss value change curves for the training and validation sets; (b) accuracy change curves for the training and validation sets.

Figure 15. t-SNE visualization: (a) 3D distribution of the original data; (b) 3D distribution of feature data.

Figure 16. Confusion matrix.

Figure 17. Display of recognition accuracy results from 10 runs.

Table 1. Improved ViT parameters.

Patch Size	Layers	Hidden Size D	MLP Size	Heads	Params
14 × 14	12	768	3072	12	86M
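As a rough plausibility check on Table 1, the parameter count of the encoder stack can be estimated from the listed layer count, hidden size, and MLP size. The sketch below assumes a standard Transformer encoder layer with bias terms and two layer norms, as in the original ViT; the improved ViT with relative position coding may differ in detail, and the function name is hypothetical.

```python
def encoder_layer_params(d_model: int, d_mlp: int) -> int:
    """Parameters in one standard Transformer encoder layer (weights and biases)."""
    attention = 4 * (d_model * d_model + d_model)        # Q, K, V and output projections
    mlp = (d_model * d_mlp + d_mlp) + (d_mlp * d_model + d_model)
    layer_norms = 2 * (2 * d_model)                      # scale and shift, two norms
    return attention + mlp + layer_norms

# Values taken from Table 1.
layers, hidden, mlp_size, heads = 12, 768, 3072, 12

per_layer = encoder_layer_params(hidden, mlp_size)
encoder_total = layers * per_layer
print("head dimension:", hidden // heads)
print("parameters per layer:", per_layer)
print("encoder stack total:", encoder_total)
```

Under these assumptions the encoder stack alone accounts for roughly 85M parameters, which is consistent with the 86M in Table 1 once embeddings and the classification head are included.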

Table 2. Data settings.

Label	Rolling Bearing Condition	Motor Speed (r/min)	Sample Length (points)	Number of Samples	Sampling Frequency (Hz)
1	Cage fracture	900	1024	1000	6000
2	Normal	900	1024	1000	6000
3	Inner raceway fault	900	1024	1000	6000
4	Ball fault	900	1024	1000	6000
5	Outer raceway fault	900	1024	1000	6000
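The settings in Table 2 fix how much of the shaft rotation each sample covers. The short calculation below is only an illustration and assumes that the listed sampling frequency means 6000 Hz and that each sample is a contiguous run of 1024 points.

```python
# Values taken from Table 2 (sampling frequency read as 6000 Hz).
sampling_frequency_hz = 6000
sample_length = 1024
motor_speed_rpm = 900

sample_duration_s = sample_length / sampling_frequency_hz      # time span of one sample
shaft_frequency_hz = motor_speed_rpm / 60                      # revolutions per second
revolutions_per_sample = sample_duration_s * shaft_frequency_hz
frequency_resolution_hz = sampling_frequency_hz / sample_length
nyquist_hz = sampling_frequency_hz / 2

print(f"sample duration: {sample_duration_s:.4f} s")
print(f"shaft frequency: {shaft_frequency_hz:.1f} Hz")
print(f"revolutions per sample: {revolutions_per_sample:.2f}")
print(f"frequency resolution: {frequency_resolution_hz:.3f} Hz")
print(f"Nyquist frequency: {nyquist_hz:.0f} Hz")
```

Each 1024-point sample therefore spans about 0.17 s, or roughly two and a half shaft revolutions at 900 r/min.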

Table 3. Data segmentation preprocessing.

Rolling Bearing Condition	Training Set	Validation Set	Test Set	Label
Cage fracture	700	200	100	0
Normal	700	200	100	1
Inner raceway fault	700	200	100	2
Ball fault	700	200	100	3
Outer raceway fault	700	200	100	4
Total	3500	1000	500
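A per-class split with the sizes in Table 3 could be produced as sketched below. The source does not state how samples were assigned to each subset, so the random shuffling, the seed, and the function name are assumptions for illustration only.

```python
import numpy as np

def split_per_class(labels, n_train=700, n_val=200, n_test=100, seed=0):
    """Return train/validation/test indices with fixed counts per class."""
    rng = np.random.default_rng(seed)
    train, val, test = [], [], []
    for c in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == c))  # shuffle within the class
        train.extend(idx[:n_train])
        val.extend(idx[n_train:n_train + n_val])
        test.extend(idx[n_train + n_val:n_train + n_val + n_test])
    return np.array(train), np.array(val), np.array(test)

# Five conditions with 1000 samples each, labeled 0 to 4 as in Table 3.
labels = np.repeat(np.arange(5), 1000)
train_idx, val_idx, test_idx = split_per_class(labels)
print(len(train_idx), len(val_idx), len(test_idx))
```

Splitting within each class keeps the five conditions balanced in all three subsets, which matches the equal per-class counts in the table.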

Table 4. Diagnostic results of different methods.

Method	Average Accuracy (%)	Standard Deviation (%)
STFT-ViT	95.58	0.7236
SVD-EDS-GST-2DCNN	95.24	1.2933
SVD-EDS-GST-LSTM	94.28	1.7863
GST-ViT	91.06	0.9834
SVD-EDS-GST-ViT	98.52	0.4266
SVD-EDS-GST-ResViT	99.08	0.4128
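The accuracy gains quoted in the conclusions follow directly from Table 4. The snippet below recomputes them as differences in percentage points; the variable names are illustrative only.

```python
# Average accuracy (%) and standard deviation (%) from Table 4.
results = {
    "STFT-ViT": (95.58, 0.7236),
    "SVD-EDS-GST-2DCNN": (95.24, 1.2933),
    "SVD-EDS-GST-LSTM": (94.28, 1.7863),
    "GST-ViT": (91.06, 0.9834),
    "SVD-EDS-GST-ViT": (98.52, 0.4266),
    "SVD-EDS-GST-ResViT": (99.08, 0.4128),
}

proposed = "SVD-EDS-GST-ResViT"
proposed_accuracy = results[proposed][0]

# Gain of the proposed model over each baseline, in percentage points.
gains = {
    name: round(proposed_accuracy - accuracy, 2)
    for name, (accuracy, _) in results.items()
    if name != proposed
}
most_stable = min(results, key=lambda name: results[name][1])

for name, gain in gains.items():
    print(f"{name}: +{gain} percentage points")
print("smallest standard deviation:", most_stable)
```

The gains are absolute differences between average accuracies, not relative improvements.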

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

