4.1. Datasets Introduction
This experiment was conducted in the Precision Metrology Laboratory of the Mechanical Engineering Department, Sant Longowal Institute of Engineering and Technology, Longowal, India. In this case, we identify fault types to evaluate the performance of the DA-ResNet model using laboratory cylindrical roller bearings with different defect sizes. The test rig is shown in
Figure 3, and the shaft speed is measured by a proximity sensor. Power is provided by a 346 W AC motor and transferred to the shaft; a 2 kg disc is mounted at the middle of the shaft. A lever arrangement is used to load the roller bearing in the vertical direction, and a load cell is installed below the bearing housing to measure the applied load. The accelerometer is mounted on top of the bearing housing to reduce the effect of the transfer path, while the microphone is placed as close as possible to the test bearing. The experiment was conducted under the following conditions: a shaft speed of 2050 rpm and a vertical load of 200 N; the signals acquired in this work were recorded at a sampling rate of 70,000 Hz.
Table 1 and
Table 2, respectively, describe the size parameters of the roller bearings and the four failure degrees of the inner race, outer race, and roller. The four failure degrees are used to validate the proposed ideas, and the data are divided into four datasets. We labeled the different failure degrees with different numbers to construct the source and target domains. Images of the failed elements at each degree are displayed in
Figure 4.
The waveform of vibration and acoustic signals is shown in
Figure 5. The
x axis represents the number of sampled points, and the vertical axis is the amplitude of the vibration and acoustic signals. Note that the subfigures of
Figure 5a are exhibited from top to bottom as VS1, VS2, VS3 (the vertical, horizontal, and axial directions of the tested bearing), and the AS signal. The faulty vibration signals contain transient impulses, and in the inner race and roller cases the acoustic signals can sometimes match the impulse locations. Two ideas are verified with the above signals: the first is to check that DA-ResNet is superior to the other intelligent diagnosis methods; the second is to validate the effectiveness of the vibro-acoustic multi-source signal through a transfer task.
4.2. Experimental Configuration
Several commonly compared models are used to test the performance of the proposed model. The multilayer perceptron (MLP), also called an ANN, has input and output layers, with one or more hidden layers between them; the simplest network has a single hidden layer, which can learn features from the input data. BiLSTM consists of a forward and a backward LSTM: the former processes the input data and the latter the reversed data, and the outputs of the two LSTMs are spliced after processing. A CNN is a supervised neural network with convolutional layers, pooling layers, batch normalization layers, and activation functions, and is commonly regarded as a feature extractor. CNNs have had great success on high-dimensional data such as images, video, and light fields; low-dimensional data include seismic waves, radar data, biological signals, and so on. In particular, CNNs are widely applied in fault diagnosis for feature extraction.
- (1)
Baseline: MLP
As the baseline model, the MLP is composed of two dense (also called fully connected) layers. The large number of parameters in the MLP results from the full connection between the input and output of each dense layer, and dropout layers are used to overcome the overfitting caused by the numerous trainable parameters. Specifically, the structure of the MLP can be described as: {Input (4096,), dense (32,), dropout (32,), dense (128,), dropout (128,)}.
- (2)
BiLSTM
For the BiLSTM model, a convolutional layer is introduced to reduce the computational complexity caused by the recurrence mechanism of the LSTM; the solution is to embed the raw signals into a low-dimensional feature vector. The structure of the BiLSTM is as follows: {Input (4096,), convolution (128, 64), BiLSTM (32,), dropout (32,), dense (128,), dropout (128,)}.
- (3)
CNN
The CNN model stacks several convolutional and max-pooling layers in turn. Specifically, the details of the CNN model are as follows: {Input (4096,), convolution (1024, 4), max-pooling (512, 4), convolution (128, 8), max-pooling (64, 8), flatten (512,), dense (128,)}.
- (4)
ResNet
For the ResNet, residual blocks are the significant characteristic, providing a multi-receptive field through skip connections. Inspired by residual networks in computer vision, a simple ResNet is designed to diagnose bearing faults. The structure of the model can be described as follows: {Input (4096,), convolution (1024, 4), residual block (512, 8), max-pooling (256, 8), residual block (256, 16), max-pooling (128, 16), residual block (128, 32), max-pooling (64, 32), flatten (2048,), dense (128,)}.
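The key ingredient of the ResNet structure above, the residual block with its skip connection, can be sketched as follows. This is a minimal PyTorch illustration under assumed kernel sizes and channel counts (the text lists layer output shapes, not these hyperparameters), not the authors' implementation:

```python
import torch
import torch.nn as nn

class ResidualBlock1d(nn.Module):
    """1-D residual block: two conv layers plus a skip connection.
    Kernel size and channel counts here are illustrative assumptions."""
    def __init__(self, in_ch, out_ch, kernel=3):
        super().__init__()
        self.conv1 = nn.Conv1d(in_ch, out_ch, kernel, padding=kernel // 2)
        self.bn1 = nn.BatchNorm1d(out_ch)
        self.conv2 = nn.Conv1d(out_ch, out_ch, kernel, padding=kernel // 2)
        self.bn2 = nn.BatchNorm1d(out_ch)
        # 1x1 conv matches channel counts so the skip can be added
        self.shortcut = (nn.Conv1d(in_ch, out_ch, 1)
                         if in_ch != out_ch else nn.Identity())
        self.relu = nn.ReLU()

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + self.shortcut(x))  # skip connection

# Toy ResNet-style classifier for a 4096-point signal, 4 health classes
model = nn.Sequential(
    nn.Conv1d(1, 8, 4, stride=4),   # 4096 -> 1024
    ResidualBlock1d(8, 8),
    nn.MaxPool1d(2),                # 1024 -> 512
    ResidualBlock1d(8, 16),
    nn.MaxPool1d(2),                # 512 -> 256
    nn.Flatten(),
    nn.Linear(16 * 256, 128),
    nn.Linear(128, 4),
)

x = torch.randn(2, 1, 4096)   # batch of 2 single-channel signals
print(model(x).shape)         # torch.Size([2, 4])
```

The skip connection lets each block pass its input through unchanged, which is why accuracy can be maintained as depth increases.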
In this paper, the specific parameters of the DA-ResNet model are shown in
Table 3, with the improved ResNet shown in
Figure 2c. The specific layer parameters of the four compared models are described in the upper part of the table. Similarly, further information on these models is given in
Table 4.
4.3. Cross-Domain Fault Diagnosis
To further validate the performance of DA-ResNet, the following methods were used for comparison tests: MLP, BiLSTM, CNN, ResNet, and domain adaptation CNN (DA-CNN). The first four are common deep learning methods for object detection, object recognition, and so on. In this case, the effectiveness of the feature transfer is validated with four experimental tasks. Vibration and acoustic signals are collected under the same working conditions; the difference lies in the defect size of the faulty bearing elements, and each training sample consists of vibration and acoustic signals.
The diagnostic results of six methods are given in
Table 5, and their corresponding histogram is shown in
Figure 6. The F1-scores of MLP, BiLSTM, CNN, ResNet, DA-CNN, and DA-ResNet are shown in
Table 6. The histogram of the F1-scores is shown in
Figure 7. In
Table 5, the capital letters A, B, C, and D denote the fault degrees (also named datasets under the same defect size); for the detailed sizes, refer to
Table 2. For example, the fault degree A is the source domain and B is the target domain.
The F1-score is a tool that evaluates the accuracy of predictions and takes into account whether an intelligent diagnostic method has a preference among the different categories. The formulas of the F1-score are given as follows:

precision = TP / (TP + FP)

recall = TP / (TP + FN)

F1 = 2 × precision × recall / (precision + recall)

in which TP denotes the true positives (samples correctly predicted as the positive class), FP denotes the false positives (negative samples wrongly predicted as positive), and FN denotes the false negatives (positive samples wrongly predicted as negative). The F1-score is introduced to evaluate the diagnostic performance of MLP, BiLSTM, CNN, ResNet, DA-CNN, and DA-ResNet in the diagnostic tasks.
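As a concrete check of the per-class F1 computation, a short NumPy sketch follows; the labels below are made up purely for illustration:

```python
import numpy as np

def f1_per_class(y_true, y_pred, n_classes):
    """Per-class F1 from true and predicted integer labels."""
    scores = []
    for c in range(n_classes):
        tp = np.sum((y_pred == c) & (y_true == c))  # true positives
        fp = np.sum((y_pred == c) & (y_true != c))  # false positives
        fn = np.sum((y_pred != c) & (y_true == c))  # false negatives
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        scores.append(f1)
    return scores

# Made-up labels for four health conditions (0-3)
y_true = np.array([0, 0, 1, 1, 2, 2, 3, 3])
y_pred = np.array([0, 1, 1, 1, 2, 3, 3, 3])
print([round(s, 2) for s in f1_per_class(y_true, y_pred, 4)])
# [0.67, 0.8, 0.67, 0.8]
```

Because it is computed per class, a model that favors one category scores poorly on the neglected classes even when overall accuracy is high.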
The key to this case is to use one defect size of the faulty bearing element to diagnose the other defect sizes. The diagnosis results of the MLP, BiLSTM, CNN, ResNet, and DA-CNN are given in
Table 5, where, as before, the capital letters A, B, C, and D denote the fault degrees; for example, fault degree A is the training set and B the test set. The average accuracy of the compared methods is 48.9%, 74.27%, 85.43%, 88.73%, and 91.83%, whereas the proposed method achieves 94.22%, which is superior to the comparison methods in all four diagnostic tasks. In particular, in the task from A to C, the lowest accuracy is about half of the highest diagnostic value. The results indicate that the MLP and BiLSTM methods cannot separate the unknown-label samples. CNN and ResNet can separate part of the unknown-label samples; nevertheless, without solving the domain adaptation problem, these models do not achieve good results. The structural characteristic of ResNet is its residual connections: the residual block is introduced to address the accuracy decrease caused by increasing network depth, and it can maintain model performance as depth grows. By contrast, with a conventional network, accuracy plateaus at a certain value and then degrades rapidly as depth increases, and adding more layers brings higher training error. From
Table 6, the highest F1-score values are shown in bold font. The performance of MLP in the four diagnostic tasks is poor, while the proposed method achieves the highest F1-score in three of the four diagnostic tasks. Combined with the accuracy results, this verifies that the diagnostic performance of DA-ResNet is superior to that of the other diagnostic methods.
To compare the results of six methods, t-Distributed Stochastic Neighbor Embedding (t-SNE) [
43] is introduced to visualize the diagnostic results. Four colors represent the four health conditions of the roller bearings. As shown in
Figure 8, the four colors obtained by MLP are completely mixed together. Comparing CNN and ResNet, the blue and red classes are mixed in the CNN result, which is worse than that of ResNet. The confusion matrices of the diagnostic results of MLP, BiLSTM, CNN, ResNet, DA-CNN, and DA-ResNet are shown in
Figure 9. It is obvious that the multi-layer method MLP does not solve the domain adaptation problem between the source and target domains: only in the shared part is the recognition accuracy close to 95%, while the non-shared part from the target domain is mixed with the other classes and its average accuracy is 15%. The confusion matrix of BiLSTM is better than that of MLP; however, the second class is recognized with a low diagnostic accuracy of 69.6%. The comparison of CNN with DA-CNN shows that the MMD principle can decrease the discrepancy between the source and target domains, and the same conclusion is obtained from ResNet and DA-ResNet. Moreover, the diagnostic accuracy of DA-CNN is lower than that of DA-ResNet. This result again indicates that a conventional network's accuracy plateaus at a certain value and then degrades rapidly as network depth increases, and adding more layers brings higher training error.
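The MMD principle used to measure the discrepancy between the source and target domains can be sketched as follows. This is a minimal NumPy illustration using a biased RBF-kernel MMD estimate; the bandwidth and the made-up Gaussian features are illustrative assumptions, not the authors' configuration:

```python
import numpy as np

def mmd_rbf(X, Y, sigma=1.0):
    """Biased estimate of squared MMD between sample sets X and Y
    using an RBF kernel with bandwidth sigma."""
    def kernel(A, B):
        # Pairwise squared distances, then a Gaussian kernel
        d2 = (np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :]
              - 2 * A @ B.T)
        return np.exp(-d2 / (2 * sigma**2))
    return kernel(X, X).mean() + kernel(Y, Y).mean() - 2 * kernel(X, Y).mean()

rng = np.random.default_rng(0)
source = rng.normal(0.0, 1.0, size=(200, 8))  # toy source-domain features
near   = rng.normal(0.1, 1.0, size=(200, 8))  # target close to the source
far    = rng.normal(2.0, 1.0, size=(200, 8))  # target far from the source

# A larger domain shift yields a larger MMD value
print(mmd_rbf(source, near) < mmd_rbf(source, far))  # True
```

Minimizing such a term alongside the classification loss pulls the feature distributions of the two domains together, which is what distinguishes DA-CNN and DA-ResNet from their plain counterparts.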