1. Introduction
The gearbox is a common rotating mechanical device used to transmit power, and it is widely used in many fields. Once a gearbox breaks down, it directly affects industrial production and daily life [1,2]. Severe cases may even cause personal injury and serious economic losses [3]. Therefore, the fault diagnosis of a gearbox and its key components, together with timely detection and troubleshooting, plays a key role in ensuring the healthy operation of mechanical equipment and reducing maintenance costs. The gearbox has a complex structure, and the interaction between its parts during operation is accompanied by environmental noise interference [4]. Extracting useful information from vibration signals that contain complex information and interference components, and thereby realizing the intelligent diagnosis of failures in gearboxes and their key components, is of particular practical significance for the normal operation of mechanical equipment and the improvement of production efficiency.
Methods for the fault diagnosis of gearboxes can generally be divided into three categories: signal-processing-based methods [5], traditional machine-learning-based methods [6], and deep-learning-based methods [7]. Signal-processing-based methods diagnose gearbox faults by analyzing the acquired vibration signal and extracting characteristic information that reflects the health condition [8]. These methods usually analyze the signal in the time domain, frequency domain, and time–frequency domain. Although they have made some achievements in the fault diagnosis of rotating equipment such as gearboxes, they require considerable manual experience and theoretical knowledge during diagnosis [9,10]. Especially when faced with the massive data collected in engineering practice, it is difficult for them to realize automatic and intelligent fault diagnosis.
Fault diagnosis methods based on traditional machine learning first extract feature indexes that represent the vibration signal by means of signal-processing methods [11]. A shallow machine learning model then performs pattern recognition on these features to complete the intelligent diagnosis of gearbox faults [12]. Commonly used shallow machine learning models include the back-propagation (BP) neural network [13], the support vector machine (SVM) [14], and the extreme learning machine (ELM) [15]. Traditional machine learning algorithms can realize the intelligent diagnosis of faults and improve the efficiency and accuracy of fault diagnosis to a certain extent [16]. However, they still require manual feature extraction from the vibration signals, and existing feature-extraction methods still rely on signal processing. Moreover, the quality of the extracted features has a direct impact on the subsequent pattern recognition and classification accuracy [17,18].
The concept of deep learning was first proposed by Hinton et al. [19], and it has been widely used in image processing and other fields. In recent years, more and more scholars have applied deep learning models to the fault diagnosis of mechanical equipment. Compared with fault diagnosis methods based on traditional machine learning, methods based on deep learning do not need manual feature extraction: the stacked nonlinear layers of a deep learning model can automatically mine the latent features inside the signal [20]. This reduces the dependence on signal-processing-based feature extraction and the influence of human experience on it, and it improves the generalization performance and adaptability of the fault diagnosis method [21]. Huang et al. [22] fed gearbox vibration signals decomposed by wavelet packet decomposition (WPD) into a multi-scale convolutional neural network (CNN) and achieved effective fault classification. This method combines the multi-scale characteristics of WPD with the strong classification ability of the CNN and avoids complicated manual feature-extraction steps; this end-to-end approach to intelligent fault diagnosis is worth drawing on. Jin et al. [23] proposed a lightweight CNN-based network that can effectively diagnose faults of rotating machinery such as gearboxes with fewer parameters while maintaining good diagnostic performance under different working conditions. However, the scale of this light neural network is small and its number of parameters is limited, and it is prone to overfitting during training. Yin et al. [24] used a long short-term memory (LSTM) network with cosine loss to improve the classification of wind turbine gearbox faults. By introducing memory units and gating mechanisms, the LSTM can capture and maintain long-term dependencies, so it performs well on long sequence data. Miao et al. [25] used a gated recurrent unit (GRU) network for the real-time status monitoring of planetary gearboxes and introduced dropout to reduce the requirements for training data, achieving good real-time diagnostic capability. The GRU has a more compact structure than the LSTM and requires fewer parameters, so a GRU model is relatively light and easier to train and compute. If a CNN and a GRU are combined to obtain deep spatiotemporal features covering both spatial and temporal information, good results may be obtained.
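The parameter saving of the GRU relative to the LSTM can be illustrated with a short calculation. The sketch below assumes the textbook formulation in which every gate block has one input weight matrix, one recurrent weight matrix, and one bias vector; the function names and the example sizes (128 input features, 16 hidden units) are hypothetical and chosen only for illustration. Library implementations may add extra bias terms, so their reported counts can differ slightly.

```python
def gated_layer_params(input_dim, hidden_dim, n_blocks):
    """Parameter count of a gated recurrent layer (textbook formulation).

    Each block holds an input weight matrix (hidden_dim x input_dim),
    a recurrent weight matrix (hidden_dim x hidden_dim), and one bias vector.
    """
    per_block = hidden_dim * (input_dim + hidden_dim) + hidden_dim
    return n_blocks * per_block


def lstm_params(input_dim, hidden_dim):
    # LSTM: input, forget, and output gates plus the candidate cell state.
    return gated_layer_params(input_dim, hidden_dim, n_blocks=4)


def gru_params(input_dim, hidden_dim):
    # GRU: update and reset gates plus the candidate hidden state.
    return gated_layer_params(input_dim, hidden_dim, n_blocks=3)


if __name__ == "__main__":
    d, h = 128, 16  # hypothetical sizes, not taken from the paper
    print("LSTM:", lstm_params(d, h))
    print("GRU: ", gru_params(d, h))
```

Under this formulation, a GRU always has three quarters of the parameters of an LSTM with the same input and hidden sizes, because it uses three gate blocks instead of four.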
With the continuous improvement of sensor technology, sensors can now acquire various signals that effectively characterize the operating status of equipment, and the use of these signals to analyze the health status of mechanical equipment has attracted wide research interest in recent years [26]. In fault diagnosis, vibration, acoustic emission, oil, and temperature signals are commonly analyzed. Compared with the other signals, vibration signal detection technology is more mature and easier to operate, and vibration signals contain a wealth of useful information, so they are often used to analyze the health status of rotating machinery. At the same time, thanks to advances in artificial intelligence, fault diagnosis methods have also developed rapidly. In particular, progress in machine learning and deep learning has gradually reduced the dependence of fault diagnosis on expert experience and human judgment [27]. Intelligent fault diagnosis has gradually become an important direction for gearbox fault diagnosis.
In summary, this paper proposes a gearbox fault diagnosis method based on multi-sensor deep spatiotemporal feature representation. The contributions of this paper are as follows:
- (1) The vibration signal obtained by a single sensor is susceptible to noise interference and cannot fully characterize the operating state and fault characteristics of the gearbox. To address this and to obtain a more effective and stable gearbox fault diagnosis model, a fault diagnosis model based on the parallel CNN–GRU (PCNN–GRU) fusion of multi-sensor information is proposed.
- (2) Building on CNN–GRU, a fault diagnosis model that combines a parallel CNN with a GRU is proposed to fuse the vibration signal information acquired by multiple sensors. A SoftMax layer is used for classification, completing the end-to-end intelligent diagnosis of the gearbox.
- (3) A fault diagnosis experiment platform was designed and built, and the validity and stability of the proposed model were verified by comparing it against other related models.
The remainder of this paper is organized as follows: Section 2 introduces the relevant background of this paper, including the CNN, GRU, and multi-sensor information fusion technology; Section 3 introduces the construction of the fault diagnosis model; Section 4 introduces the construction of the gearbox fault diagnosis experiment platform and the data collection and division; Section 5 presents the experimental analysis and verification; and, lastly, Section 6 contains the conclusion of this paper.
3. Fault Diagnosis Model Construction
The fault diagnosis model was constructed by taking into account the complexity of the network and the scale of the data obtained from the designed fault diagnosis experiment, and by adjusting and optimizing the model parameters over multiple experiments. The structure of the proposed fault diagnosis model, which is based on the PCNN–GRU fusion of multi-sensor information, is shown in Figure 2. The model structure parameters are shown in Table 2.
From Figure 2 and Table 2, it can be seen that the model mainly consists of two parallel convolutional neural network branches, a fusion layer, a gated recurrent unit (GRU) layer, and two fully connected layers (the last of which is the SoftMax layer for the final classification). Each parallel CNN branch is composed of two one-dimensional convolutional layers and two one-dimensional pooling layers, and the structure parameters of the upper and lower channels are identical. For convolutional layers 1–1 and 1–2, the number of convolution kernels, the kernel size, and the stride are 32, 64, and 8, respectively. For convolutional layers 2–1 and 2–2, they are 64, 5, and 1, respectively. All convolutional layers use ‘same’ padding and the ‘ReLU’ activation function. Pooling layers 1–1 and 1–2 have a size of 4 and a stride of 1, and pooling layers 2–1 and 2–2 have a size of 2 and a stride of 1; all of them use max pooling. Through the joint action of the successive convolutional and pooling layers, the spatial characteristics of the vibration signals acquired by sensors at different positions are progressively mined and then fused by the fusion layer. A GRU with 16 neurons further mines the time series information from the fused spatial features to obtain the final multi-sensor deep spatiotemporal features. Finally, the expanded features are classified using a fully connected layer with 32 neurons and a SoftMax layer with four neurons, thus completing the end-to-end intelligent diagnosis of gearbox faults. That is, different fault states can be identified effectively by directly inputting the vibration signals of the gearbox into the model. The overall flowchart of the fault diagnosis model is shown in Figure 3.
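To make the layer settings described above more concrete, the following sketch traces how the length of a one-dimensional signal changes through one CNN branch and the subsequent layers. It is only an illustration under stated assumptions and not the implementation used in this paper: the input length of 2048 samples is hypothetical, the pooling layers are assumed to use no padding, the fusion layer is assumed to concatenate the two branches along the channel axis, and the GRU is assumed to return only its last hidden state. The ‘same’ padding rule (output length equal to the input length divided by the stride, rounded up) follows the convention of common deep learning frameworks.

```python
import math


def conv1d_same_len(length, stride):
    """Output length of a 1-D convolution with 'same' padding."""
    return math.ceil(length / stride)


def pool1d_valid_len(length, size, stride):
    """Output length of 1-D pooling without padding (an assumption here)."""
    return (length - size) // stride + 1


def branch_shape(input_len):
    """(time steps, channels) at the output of one CNN branch."""
    n = conv1d_same_len(input_len, stride=8)    # conv 1: 32 kernels, size 64, stride 8
    n = pool1d_valid_len(n, size=4, stride=1)   # max pooling 1: size 4, stride 1
    n = conv1d_same_len(n, stride=1)            # conv 2: 64 kernels, size 5, stride 1
    n = pool1d_valid_len(n, size=2, stride=1)   # max pooling 2: size 2, stride 1
    return n, 64                                # 64 channels after conv 2


def model_shapes(input_len, n_sensors=2):
    """Feature shapes after each stage, one branch per sensor."""
    steps, channels = branch_shape(input_len)
    return {
        "branch": (steps, channels),
        "fusion": (steps, channels * n_sensors),  # assumed channel-wise concatenation
        "gru": (16,),                             # assumed: last hidden state only
        "dense": (32,),
        "softmax": (4,),
    }


if __name__ == "__main__":
    # 2048 samples per segment is a hypothetical input length.
    for name, shape in model_shapes(2048).items():
        print(name, shape)
```

The large stride of the first convolutional layer accounts for almost all of the reduction in sequence length; the pooling layers, with a stride of 1, shorten the sequence by only a few steps.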
It can be seen from Figure 3 that the overall process is divided into the following parts. The first part is sensor data collection: the vibration signals of the gearbox are collected separately by two sensors and transferred to the computer through the acquisition card. The second part is feature extraction: the gearbox vibration signals are input into the parallel CNN to mine their spatial information, the fusion layer fuses the spatial information of the different sensors, and the GRU then extracts the time series information to obtain deep spatiotemporal features that integrate the spatial and temporal information of the multiple sensors. The third part is pattern recognition: the SoftMax layer classifies the features, and the classification result is output as the final diagnosis.
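The final classification step can be sketched as follows. The SoftMax function converts the four output values of the last layer into class probabilities, and the class with the highest probability is taken as the diagnosis. The function names, the example output values, and the class labels below are hypothetical placeholders; the actual fault states are those defined in the experiment.

```python
import math


def softmax(logits):
    """Numerically stable SoftMax: subtract the maximum before exponentiating."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits, labels):
    """Return the most probable label together with its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


if __name__ == "__main__":
    # Hypothetical labels and network outputs, for illustration only.
    labels = ["state_0", "state_1", "state_2", "state_3"]
    label, prob = classify([2.0, 1.0, 0.1, -1.0], labels)
    print(label, round(prob, 3))
```

Subtracting the maximum value does not change the result, because the SoftMax function is invariant to a constant shift of its inputs, but it prevents overflow when the output values are large.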