1. Introduction
Rolling-element bearings are key components of mechanical equipment, and harsh, complex working environments can easily cause bearing faults during operation [1]. To ensure the long-term, stable operation of rolling-element bearings, much research has been done on rolling-element bearing fault diagnosis. Traditional rolling-element bearing fault diagnosis mainly adopts signal processing and machine learning techniques. The vibration signal processing techniques used in rolling-element bearing fault diagnosis mainly include time-domain analysis [2], frequency-domain analysis [3] and time-frequency analysis [4,5,6,7]. Wavelet analysis [4], the short-time Fourier transform (STFT) [5], empirical mode decomposition [6] and singular value decomposition [7] are commonly used methods in the time-frequency analysis of bearing vibration signals. Machine learning methods first extract fault features from the vibration signals and then map the extracted features to the fault type of the rolling-element bearing. Common machine learning methods for rolling-element bearing fault diagnosis include the support vector machine (SVM) [8], k-nearest neighbor (k-NN) [9], K-Means clustering [10] and the back propagation neural network (BPNN) [11]. These traditional methods have been widely used, but they become limited as vibration signals grow more complex; deep learning methods have a greater advantage in analyzing complicated and non-stationary vibration signals.
Deep learning methods can automatically extract fault features from vibration signals [12], and much recent research on rolling-element bearing fault diagnosis uses deep learning. Yin et al. [13] extracted the original features of vibration signals through time-domain analysis, frequency-domain analysis and wavelet transform, obtained low-dimensional features from the 38 original features using a nonlinear global algorithm, and input the low-dimensional feature array into a deep belief network (DBN) to evaluate the performance status of the rolling-element bearing. Liu et al. [14] obtained the spectrogram of vibration signals through STFT, used a stacked sparse auto-encoder (SAE) to automatically extract fault features, and employed softmax regression to identify the fault type of the rolling-element bearing. Liu et al. [15] used a recurrent neural network (RNN) to classify rolling-element bearing faults, and adopted a gated recurrent unit based denoising auto-encoder to enhance the fault classification accuracy. Compared with DBN, SAE and RNN, the convolutional neural network (CNN) has the characteristics of local perception, weight sharing and subsampling, which can achieve higher performance at a lower cost.
Recently, 2D CNN has been widely used in rolling-element bearing fault diagnosis [16,17,18,19,20,21,22,23,24,25]. Janssens et al. [16] proposed a feature learning method based on 2D CNN for detecting rolling-element bearing faults, and the accuracy increased by about 6% compared with a random forest classifier. Hoang et al. [17] proposed a bearing fault diagnosis method based on 2D CNN without manual feature extraction, which converts 1D vibration signals into 2D gray images and takes them as the input data of the CNN classifier. Lu et al. [18] built a rolling-element bearing fault diagnosis model using a hierarchical 2D CNN; their experiments show that it provides higher classification accuracy than SAE and SVM. Guo et al. [19] investigated a hierarchical adaptive 2D CNN for bearing fault diagnosis, which can automatically and sensitively extract fault features from vibration signals. Fuan et al. [20] proposed an adaptive deep 2D CNN for rolling-element bearing fault diagnosis, in which the key parameters of the CNN classifier are determined by the particle swarm optimization algorithm. Li et al. [21] proposed a bearing fault diagnosis method based on deep 2D CNN and D-S evidence theory; the results show that it can adapt to different load conditions. Liu et al. [22] proposed a bearing fault diagnosis method using a lightweight 2D CNN, and improved the diagnosis accuracy and generalization ability by adding a batch normalization (BN) layer and L2 regularization. Wen et al. [23,24,25] conducted a series of studies on rolling-element bearing fault diagnosis using state-of-the-art 2D CNN models including AlexNet, VGG-19 and ResNet-50, and the experiments show that they work well in the bearing fault diagnosis field. Existing research indicates that fault diagnosis methods based on 2D CNN can achieve high diagnosis accuracy, but problems remain, such as a time-consuming preprocessing stage, high computational complexity, long training time and poor real-time performance.
Compared with 2D CNN, 1D CNN has a simpler network structure and lower computational complexity, and it takes 1D raw vibration signals directly as input without any preprocessing, so it offers faster processing and is suitable for real-time fault diagnosis. Recently, there have been many works on rolling-element bearing fault diagnosis using 1D CNN [26,27,28,29,30]. Eren et al. [26,27] developed a bearing fault diagnosis system using a compact adaptive 1D CNN classifier, which directly takes raw vibration signals as input and provides competitive classification performance. Abdeljaber et al. [28] studied a compact 1D CNN to identify, quantify, and localize ball bearing damage. Zhang et al. [29] proposed a method based on deep 1D CNN to address the bearing fault diagnosis problem; it takes raw vibration signals as input without any denoising preprocessing, and the results show that the method performs well in noisy environments and achieves high fault diagnosis accuracy under different working loads. Ma et al. [30] proposed a lightweight deep 1D CNN for rotating machinery fault diagnosis, which has a high training speed and a strong transfer-learning ability.
The LeNet-5 network developed by LeCun et al. [31] is a classic 2D CNN model, which has been successfully applied to Alzheimer’s disease recognition [32], traffic sign recognition [33], facial expression recognition [34], gas recognition [35], pedestrian detection [36] and other fields. Because the LeNet-5 network has a relatively simple structure and a powerful classification capability, this paper employs it for rolling-element bearing fault diagnosis. To address the low recognition accuracy, slow convergence and weak generalization ability of rolling-element bearing fault diagnosis based on the traditional LeNet-5 network, this paper proposes a novel rolling-element bearing fault diagnosis method using an improved 2D LeNet-5 network, which provides a fault diagnosis model with high classification accuracy, fast convergence speed and strong generalization ability. On the basis of the improved 2D LeNet-5 network, this paper also proposes an improved 1D LeNet-5 network for rolling-element bearing fault diagnosis, which can greatly reduce the training time and provide better diagnosis accuracy in most cases. The effectiveness of the proposed methods is evaluated on the rolling-element bearing data [37] from Case Western Reserve University (CWRU). The main contributions of this paper are as follows:
- Histogram equalization is carried out on the gray images during the preprocessing of experimental data, which provides better input data for the improved 2D LeNet-5 network.
- The convolution and pooling layers are redesigned, and the size and number of convolution kernels are carefully adjusted, which enhances the fault classification capability of the improved 2D LeNet-5 network.
- Batch normalization is used to normalize the output of each convolution layer, and the dropout operation is introduced after each full-connection layer except the last one, which improves the convergence speed and generalization ability of the improved 2D LeNet-5 network.
- On the basis of the improved 2D LeNet-5 network, a 1D LeNet-5 network is proposed that performs 1D convolution and pooling operations directly on the 1D raw vibration signals, which provides higher fault diagnosis accuracy with less training time in most cases.
The rest of the paper is organized as follows. The basic theory is introduced in Section 2. The proposed rolling-element bearing fault diagnosis method using the improved 2D LeNet-5 network is described in Section 3. The proposed rolling-element bearing fault diagnosis method using the improved 1D LeNet-5 network is discussed in Section 4. The experimental results and analysis are presented in Section 5. The conclusions and future work are given in Section 6.
3. Rolling-Element Bearing Fault Diagnosis Method Using Improved 2D LeNet-5 Network
3.1. Process of Rolling-Element Bearing Fault Diagnosis Based on Improved 2D LeNet-5 Network
The process of rolling-element bearing fault diagnosis based on the improved 2D LeNet-5 network is shown in Figure 1 and can be described as follows:
- Step 1: The vibration signals are collected by sensors deployed on the rolling-element bearing.
- Step 2: The 1D raw vibration signals are transformed into 2D gray images, and histogram equalization is carried out on the gray images for enhancement.
- Step 3: The dataset composed of gray images is divided into a training set and a test set.
- Step 4: The training set is input into the improved 2D LeNet-5 network for training, and the fault diagnosis model based on the improved 2D LeNet-5 network is obtained.
- Step 5: The test set is input into the fault diagnosis model for testing, and the results of rolling-element bearing fault diagnosis are analyzed to evaluate the validity of the model.
3.2. Preprocessing of Experimental Data Used in Improved 2D LeNet-5 Network
The experimental data is provided by CWRU [37]. The data used in this paper was collected at sampling frequencies of 12 kHz and 48 kHz under motor loads of 0, 1, 2 and 3 horsepower (HP). Specifically, the experimental data includes normal condition data, inner-race fault data, ball fault data and outer-race fault data.
The preprocessing of experimental data used in the improved 2D LeNet-5 network is similar to the signal transformation process described in [39]. First, every 4096 consecutive points of the raw vibration signal are grouped into a sample; each sample is then divided into 64 equal parts, which are aligned as the rows of the 2D image. In this way, a 1D raw vibration signal of length 4096 is transformed into a 2D image of size 64×64, and each sample is normalized according to Equation (3) and transformed into a gray image of 64×64 pixels using MATLAB.
Equation (3) is a min-max normalization: each sampling point of the current sample is rescaled using the minimum and maximum values of all sampling points of that sample.
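The transformation described above can be sketched in a few lines of Python. This is a minimal sketch, not the authors' MATLAB code: the function name `signal_to_gray_image` is hypothetical, and scaling the normalized values to 8-bit gray levels (0 to 255) with rounding is an assumption, since the text only states that min-max normalization is applied.

```python
def signal_to_gray_image(sample, side=64, levels=256):
    """Reshape a 1D sample of side*side points into a side x side gray image.

    Assumption: min-max normalized values are scaled to integers in
    [0, levels - 1]; the text only states that min-max normalization is used.
    """
    if len(sample) != side * side:
        raise ValueError(f"expected {side * side} points, got {len(sample)}")

    lo, hi = min(sample), max(sample)
    span = hi - lo
    if span == 0:
        # A constant sample carries no contrast; map it to gray level 0.
        pixels = [0] * len(sample)
    else:
        pixels = [round((x - lo) / span * (levels - 1)) for x in sample]

    # Consecutive segments of `side` points become the rows of the image.
    return [pixels[r * side:(r + 1) * side] for r in range(side)]


# Usage with a synthetic ramp standing in for a 4096-point vibration sample.
image = signal_to_gray_image([i / 4095 for i in range(4096)])
```

Because the minimum and maximum are taken per sample, every image uses the full gray range regardless of the absolute vibration amplitude of that sample.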
Because the local features of the gray images are not obvious, histogram equalization is adopted to make the distribution of pixel gray values more uniform and to enhance the contrast of the images, which helps improve the convergence speed and fault classification accuracy of the improved 2D LeNet-5 network. The process of performing histogram equalization on a gray image is as follows.
- Step 1: The number of pixels at each gray level is counted from the gray value of each pixel of the gray image, and the histogram is built over the gray levels. The x-axis and y-axis of the histogram represent the gray level and the number of pixels, respectively.
- Step 2: All the gray levels whose number of pixels is greater than zero are found.
- Step 3: The gray level with the least number of pixels is found.
- Step 4: The cumulative distribution function (CDF) of each gray level is calculated.
- Step 5: The gray value of each pixel whose gray level has a non-zero pixel count is updated according to Equation (4), where M and N represent the length and width of the gray image, respectively.
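The steps above can be sketched as follows. This is a hedged illustration using the standard textbook form of histogram equalization, in which each occupied gray level g is mapped to round((cdf(g) − cdf_min) / (M·N − cdf_min) · (L − 1)), with cdf_min the smallest non-zero CDF value and L the number of gray levels. It is an assumption that Equation (4) takes this form; the function name `equalize_histogram` is hypothetical.

```python
def equalize_histogram(image, levels=256):
    """Standard histogram equalization of a 2D list of integer gray values."""
    rows, cols = len(image), len(image[0])
    total = rows * cols  # M * N

    # Step 1: count the pixels at each gray level.
    hist = [0] * levels
    for row in image:
        for value in row:
            hist[value] += 1

    # Step 4: cumulative distribution (as pixel counts) for each gray level.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    # Smallest non-zero CDF value, i.e. the CDF at the lowest occupied level.
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        # Constant image: there is nothing to stretch.
        return [row[:] for row in image]

    # Step 5: remap only the occupied gray levels (Step 2).
    mapping = [
        round((cdf[g] - cdf_min) / (total - cdf_min) * (levels - 1)) if hist[g] else 0
        for g in range(levels)
    ]
    return [[mapping[value] for value in row] for row in image]


# Usage on a tiny low-contrast image with gray values between 10 and 30.
equalized = equalize_histogram([[10, 10], [20, 30]])
```

In the usage example the darkest occupied level is mapped to 0 and the brightest to 255, so the output spans the full gray range even though the input covered only a narrow band.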
To illustrate the effect of histogram equalization, four samples under a motor load of 1 HP are selected: one with normal condition, one with inner-race fault, one with ball fault and one with outer-race fault. These four samples are transformed into four gray images, as shown in Figure 2. The left side of each sub-figure is the gray image without histogram equalization, and the right side is the gray image with histogram equalization. As the figure shows, histogram equalization enhances the contrast of the images.
The dataset composed of gray images for the different conditions of the rolling-element bearing under different motor loads is divided into training and test sets at a ratio of 7:3, as shown in Table 2. The gray images are labeled according to the condition of the rolling-element bearing: the normal condition is labeled N; inner-race faults with fault diameters of 0.007, 0.014 and 0.021 inches are labeled I007, I014 and I021, respectively; ball faults with fault diameters of 0.007, 0.014 and 0.021 inches are labeled B007, B014 and B021, respectively; and outer-race faults with fault diameters of 0.007, 0.014 and 0.021 inches are labeled O007, O014 and O021, respectively.
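A 7:3 split of labeled samples can be sketched as below. The text does not say whether the samples are shuffled or split per class, so the seeded random shuffle here is an assumption, and `split_train_test` and the placeholder dataset are hypothetical names used only for illustration.

```python
import random

LABELS = ["N", "I007", "I014", "I021", "B007", "B014", "B021", "O007", "O014", "O021"]


def split_train_test(samples, train_ratio=0.7, seed=0):
    """Shuffle (sample, label) pairs and split them into training and test sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # assumption: random assignment
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


# Usage with placeholder samples: 10 dummy items per label.
dataset = [(f"{label}_{i}", label) for label in LABELS for i in range(10)]
train_set, test_set = split_train_test(dataset)
```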
3.3. Structure of Improved 2D LeNet-5 Network for Fault Diagnosis
The traditional LeNet-5 network, when used for rolling-element bearing fault diagnosis, shows low fault classification accuracy, slow convergence speed and weak generalization ability. Therefore, the following improvements are made to the traditional LeNet-5 network.
- Gray images of 64×64 pixels are used in the input layer. In training the rolling-element bearing fault diagnosis model, it is found that the smaller the image, the lower the fault diagnosis accuracy, and the larger the image, the slower the training speed. Balancing fault diagnosis accuracy against training speed, an image size of 64×64 pixels is selected.
- One convolution layer and one pooling layer are added. Theoretically, the deeper the neural network, the stronger its feature expression ability, but the more difficult the optimization problem. Three convolution layers and three pooling layers are used in the improved 2D LeNet-5 network, which can extract more fault feature information and obtain a better training effect.
- The size and number of convolution kernels are changed. Each convolution layer of the traditional LeNet-5 network has few convolution kernels; in view of the non-stationarity and complexity of vibration signals, the size and number of convolution kernels are carefully adjusted to enhance the fault classification capability. The first convolution layer uses 8 convolution kernels, the second uses 32 and the third uses 64.
- Batch normalization (BN) is adopted. BN can speed up convergence, simplify parameter tuning and help avoid the vanishing gradient problem.
- The dropout operation is introduced. Dropout can reduce over-fitting during the training of the fault diagnosis model and improve the generalization ability of the model.
- The ReLU activation function is used. When the error gradient is computed by back propagation, the ReLU activation function alleviates the vanishing gradient problem, and it is faster to compute than the sigmoid activation function used in the traditional LeNet-5 network, so it can accelerate the training of the neural network.
The improved 2D LeNet-5 network for rolling-element bearing fault diagnosis has nine layers, as shown in Figure 3: three convolution layers (i.e., Conv1, Conv2 and Conv3), three pooling layers (i.e., Pool1, Pool2 and Pool3) and three full-connection layers (i.e., FC1, FC2 and FC3).
The Conv1 layer performs the convolution operation on the input gray image of 64×64 pixels with 8 convolution kernels and generates 8 feature maps. The Pool1 layer performs the max-pooling operation on local neighborhoods of each feature map output by the Conv1 layer and generates 8 smaller feature maps. In this paper, the strides of each convolution operation and each pooling operation are set to 1 and 2, respectively, and the padding modes of all the convolution and pooling layers are set to ‘VALID’.
The Conv2 layer performs the convolution operation on each feature map output by the Pool1 layer with 32 convolution kernels and generates 32 feature maps. The Pool2 layer performs the max-pooling operation on local neighborhoods of each feature map output by the Conv2 layer and generates 32 smaller feature maps.
The Conv3 layer performs the convolution operation on each feature map output by the Pool2 layer with 64 convolution kernels and generates 64 feature maps. The Pool3 layer performs the max-pooling operation on local neighborhoods of each feature map output by the Conv3 layer and generates 64 smaller feature maps.
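With ‘VALID’ padding, each spatial dimension shrinks in a predictable way: an input of length n, a window of length k and a stride s give floor((n − k)/s) + 1 outputs. The helper below is a sketch of that arithmetic. The 5×5 kernel and 2×2 pooling window in the usage lines are hypothetical values chosen only for illustration and are not taken from the paper.

```python
def valid_output_size(n, k, stride):
    """Output length along one axis for a 'VALID' (no padding) conv or pooling."""
    if k > n:
        raise ValueError("window larger than input")
    return (n - k) // stride + 1


# Hypothetical example: 64x64 input, 5x5 convolution with stride 1,
# then 2x2 max-pooling with stride 2. These sizes are illustrative only.
after_conv = valid_output_size(64, 5, 1)          # 60
after_pool = valid_output_size(after_conv, 2, 2)  # 30
```

Applying the same function layer by layer gives the feature-map size at every stage once the actual kernel and pooling sizes are substituted.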
The FC1 layer is fully connected to the output of the Pool3 layer through 120 neurons; it combines all the local features of the feature maps output by the Pool3 layer into global features and produces 120 feature values. The FC2 layer is fully connected to the output of the FC1 layer through 84 neurons. The FC3 layer (i.e., the output layer) is fully connected to the output of the FC2 layer through four neurons and uses the softmax function to classify the input data into four categories, corresponding to the normal condition, inner-race fault, ball fault and outer-race fault of the rolling-element bearing.
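The softmax step of the output layer can be illustrated with a short sketch. The logit values below are made up for the example, and subtracting the maximum before exponentiation is a common numerical-stability measure rather than something stated in the text.

```python
import math


def softmax(logits):
    """Convert raw output-layer scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


CLASSES = ["normal", "inner-race fault", "ball fault", "outer-race fault"]

# Hypothetical FC3 outputs for one sample.
probs = softmax([2.0, 0.5, -1.0, 0.1])
predicted = CLASSES[probs.index(max(probs))]
```

The predicted class is the one with the largest probability, which is also the one with the largest raw score.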
After each convolution layer, BN is adopted to normalize each feature map generated by the convolution operation, which can reduce internal covariate shift and improve the training efficiency of the improved 2D LeNet-5 network.
After each of the first two full-connection layers, the dropout operation is introduced with a dropout ratio of 0.2; that is, each neuron is temporarily discarded from the neural network with a probability of 20% during training, which restrains over-fitting to some extent.
After each convolution layer and each of the first two full-connection layers, the ReLU activation function is used to set all the negative values of each feature map to zero. For positive activations the gradient is passed backward unchanged, which alleviates the vanishing gradient problem.
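The three operations described above can be sketched on plain Python lists. This is an illustrative sketch, not the authors' implementation: the batch normalization shown is the training-time form for a single feature with fixed scale and shift, and the dropout uses the "inverted" convention of rescaling surviving values by 1/(1 − rate), which is an assumption because the text only gives the dropout ratio.

```python
import random


def batch_norm(values, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature across a mini-batch to zero mean and unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [gamma * (v - mean) / (var + eps) ** 0.5 + beta for v in values]


def relu(values):
    """Set negative activations to zero and keep positive ones unchanged."""
    return [max(0.0, v) for v in values]


def dropout(values, rate=0.2, rng=None):
    """Inverted dropout: zero each value with probability `rate` during training.

    Assumption: surviving values are scaled by 1 / (1 - rate) so that the
    expected activation is unchanged; the text only specifies the rate.
    """
    rng = rng or random.Random(0)
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]


# Usage on a toy mini-batch of one feature.
normalized = batch_norm([1.0, 2.0, 3.0, 4.0])
activated = relu(normalized)
dropped = dropout(activated, rate=0.2)
```

At test time dropout is switched off, so with the inverted convention no extra rescaling of the activations is needed.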
The detailed settings of the improved 2D LeNet-5 network structure for rolling-element bearing fault diagnosis are listed in Table 3.
4. Rolling-Element Bearing Fault Diagnosis Method Using Improved 1D LeNet-5 Network
Although the proposed rolling-element bearing fault diagnosis method based on the improved 2D LeNet-5 network has high diagnosis accuracy, fast convergence speed and strong generalization ability, it has the following disadvantages: (i) the transformation of 1D raw vibration signals into 2D gray images is time-consuming; (ii) the multi-layer 2D convolution and pooling operations result in a relatively long training time. To further improve the efficiency and effectiveness of fault diagnosis, an end-to-end rolling-element bearing fault diagnosis method based on an improved 1D LeNet-5 network, built on the improved 2D LeNet-5 network, is discussed in this section.
4.1. Process of Rolling-Element Bearing Fault Diagnosis Based on Improved 1D LeNet-5 Network
Similar to the process of rolling-element bearing fault diagnosis based on the improved 2D LeNet-5 network, the process of fault diagnosis based on the improved 1D LeNet-5 network can be described as follows: firstly, the vibration signals are collected by sensors deployed on the rolling-element bearing; secondly, the dataset composed of 1D raw vibration signals is divided into training set and test set; thirdly, the training set is input into the improved 1D LeNet-5 network for training, and the rolling-element bearing fault diagnosis model is obtained; finally, the test set is input into the rolling-element bearing fault diagnosis model for testing, and the testing results are analyzed to evaluate the performance of the model.
For the improved 1D LeNet-5 network, the experimental data is also provided by CWRU, and every 4096 consecutive points of vibration data are grouped into a sample. The dataset composed of 1D raw vibration signals for the different conditions of the rolling-element bearing under different motor loads is divided into training and test sets at a ratio of 7:3, as shown in Table 4.
4.2. Structure of Improved 1D LeNet-5 Network for Fault Diagnosis
The improved 1D LeNet-5 network used in rolling-element bearing fault diagnosis has a structure similar to that of the improved 2D LeNet-5 network. As shown in Figure 4, it includes five convolution layers, five pooling layers and three full-connection layers. The detailed settings of the improved 1D LeNet-5 network structure for rolling-element bearing fault diagnosis are listed in Table 5.
Each convolution layer adopts an appropriate number of convolution kernels with suitable size to perform the 1D convolution operation with a stride of one. Specifically, the Conv1 layer adopts six convolution kernels of size 64×1, the Conv2 layer adopts 16 convolution kernels of size 64×1, the Conv3 layer adopts 16 convolution kernels of size 16×1, the Conv4 layer adopts 32 convolution kernels of size 8×1, and the Conv5 layer adopts 32 convolution kernels of size 4×1. Each pooling layer adopts a suitable size of pooling kernel to perform the 1D pooling operation. Specifically, the Pool1 layer performs the 8×1 max-pooling operation with a stride of eight, the Pool2 layer performs the 4×1 max-pooling operation with a stride of four, and the Pool3, Pool4 and Pool5 layers perform the 2×1 max-pooling operation with a stride of one.
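The kernel sizes and strides listed above determine how the 4096-point input shrinks through the network. The sketch below traces the signal length layer by layer. The padding mode of the 1D network is not stated in this section, so the trace assumes no padding (‘VALID’); with a different padding mode the lengths would differ. The names `LAYERS` and `trace_lengths` are hypothetical.

```python
def valid_output_length(n, k, stride):
    """Output length of a 1D conv or pooling layer without padding."""
    return (n - k) // stride + 1


# (name, kernel size, stride) for each layer, as described in the text.
LAYERS = [
    ("Conv1", 64, 1), ("Pool1", 8, 8),
    ("Conv2", 64, 1), ("Pool2", 4, 4),
    ("Conv3", 16, 1), ("Pool3", 2, 1),
    ("Conv4", 8, 1), ("Pool4", 2, 1),
    ("Conv5", 4, 1), ("Pool5", 2, 1),
]


def trace_lengths(input_length=4096):
    """Return the signal length after every layer, assuming no padding."""
    lengths = {}
    n = input_length
    for name, k, stride in LAYERS:
        n = valid_output_length(n, k, stride)
        lengths[name] = n
    return lengths


lengths = trace_lengths()
# Conv5 has 32 kernels, so 32 channels of this length would reach FC1.
flattened = lengths["Pool5"] * 32
```

Under this no-padding assumption most of the length reduction happens in Pool1 and Pool2, whose strides of eight and four cut the signal from 4033 to 504 points and from 441 to 110 points.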
After each convolution layer, BN and the ReLU activation function are applied. After the FC1 and FC2 layers, the dropout operation is performed with a dropout ratio of 0.2. The samples composed of 1D raw vibration signals are used in the input layer, and four different conditions of the rolling-element bearing (i.e., normal condition, inner-race fault, ball fault and outer-race fault) are recognized by the FC3 layer with four neurons.