**4. Experiment**

#### *4.1. Rolling Bearing Test Bench*

To verify the superiority of the proposed method, experimental data were collected on a self-made rolling bearing fault test bench at Anhui University of Technology, as shown in Figure 3. The test bearing is an SKF 6206-2RS1. Faults of different depths were machined on the inner ring, outer ring and rolling ball of the bearing by wire electrical discharge machining. Figure 4 presents four different health states of the rolling bearing.

**Figure 3.** Schematic diagram of rolling bearing test rig.

**Figure 4.** Different health states of rolling bearings.

#### *4.2. Rolling Bearing Multisensor Signals*

The sampling frequency is set to 8192 Hz. The signals of the rolling bearings in different health states are collected at a load of 5 kN and a motor speed of 300 r/min. Figure 5 shows the time-domain vibration signals of the rolling bearings from three sensors mounted in different directions. The signals collected from each sensor cover six health states: two inner ring faults with fault depths of 0.3 and 0.4 mm, two outer ring faults with fault depths of 0.2 and 0.3 mm, one rolling ball fault, and the normal state.

**Figure 5.** Time-domain vibration signals of rolling bearing from three different sensors.

#### *4.3. Dataset Construction*

For each health state, 1024 consecutive data points are taken as one sample, and 100 samples are drawn at random, which gives 6 × 100 = 600 samples in total in this experiment. Each sample is decomposed by VMD into three IMF components, and 17 time-domain and frequency-domain features plus five multiscale entropy values are extracted from each IMF component. After feature extraction, each 1024-point sample is therefore reduced to a feature vector of 3 × (17 + 5) = 66 values, which serves as the input of the proposed model and the comparison models. The samples are randomly divided into a training set (90%) and a testing set (10%), as shown in Table 1. That is, each health state yields a training set of dimension 90 × 66 and a testing set of dimension 10 × 66. After fusing the three sensors at the one-dimensional feature level (3 × 66 = 198 features), each health state of the multisensor signal yields a training set of 90 × 198 and a testing set of 10 × 198.
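The bookkeeping above can be checked with a short sketch. It only reproduces the sample counts and feature dimensions stated in this section; the signal length of 81,920 points, the helper names and the choice to allow overlapping windows are assumptions made for illustration, and VMD and the feature extraction itself are not implemented here.

```python
import random

POINTS_PER_SAMPLE = 1024
SAMPLES_PER_STATE = 100
N_STATES = 6
N_IMFS = 3
FEATURES_PER_IMF = 17 + 5   # 17 time/frequency-domain features + 5 multiscale entropy values
N_SENSORS = 3
TRAIN_FRACTION = 0.9


def random_window_starts(signal_length, n_samples, window, rng):
    """Draw random start indices such that every window fits inside the signal.

    Windows may overlap; the paper does not say whether they do (assumption).
    """
    return [rng.randrange(0, signal_length - window + 1) for _ in range(n_samples)]


def split_indices(n_samples, train_fraction, rng):
    """Shuffle sample indices and split them into training and testing parts."""
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_train = round(n_samples * train_fraction)
    return indices[:n_train], indices[n_train:]


rng = random.Random(0)
# Hypothetical record length: 10 s at 8192 Hz (not stated in the paper).
starts = random_window_starts(81920, SAMPLES_PER_STATE, POINTS_PER_SAMPLE, rng)
train_idx, test_idx = split_indices(SAMPLES_PER_STATE, TRAIN_FRACTION, rng)

features_per_sensor = N_IMFS * FEATURES_PER_IMF    # 3 * 22 = 66
fused_features = N_SENSORS * features_per_sensor   # 3 * 66 = 198
total_samples = N_STATES * SAMPLES_PER_STATE       # 6 * 100 = 600

print(f"per-state training set: {len(train_idx)} x {features_per_sensor}")
print(f"per-state fused training set: {len(train_idx)} x {fused_features}")
print(f"total samples per sensor: {total_samples}")
```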



#### *4.4. Comparative Experiments and Analysis of Results*

#### 4.4.1. The Feasibility and Effectiveness of Multisensor Collaborative Diagnosis

To prove the feasibility and effectiveness of multisensor collaborative diagnosis, the vibration signals from each of the three directional sensors and the fused multisensor signal are fed separately into DAEN for comparison, using the dataset construction method of Section 4.3. After repeated trials, the structure of DAEN for a single sensor signal is set to [66 50 30 10 6], and the structure of DAEN for the multisensor signal is set to [198 50 30 10 6], i.e., one input layer, three hidden layers and one output layer [30]. The initial learning rate of DAEN is 0.01, the maximum number of iterations is 100, the sparse parameter *r* is 0.01 and the sparse penalty coefficient is 0.13. To eliminate the influence of random errors, 10 experiments were conducted for each method, and the mean and standard deviation of the 10 results were used as the evaluation index. The results of the 10 experiments are compared in Figure 6, and their mean accuracy and standard deviation are listed in Table 2.
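As a rough illustration of how the two structures differ in size, the following sketch counts the trainable parameters of each layer stack. It assumes plain fully connected layers with one bias per unit, which the text does not state explicitly; the function name is hypothetical.

```python
def dense_parameter_count(layer_sizes):
    """Count weights and biases of a stack of fully connected layers.

    Assumes each layer is dense with one bias per output unit (assumption).
    """
    pairs = zip(layer_sizes[:-1], layer_sizes[1:])
    return sum(n_in * n_out + n_out for n_in, n_out in pairs)


SINGLE_SENSOR_LAYERS = [66, 50, 30, 10, 6]    # structure for one sensor
MULTISENSOR_LAYERS = [198, 50, 30, 10, 6]     # structure for fused sensors

single_params = dense_parameter_count(SINGLE_SENSOR_LAYERS)
multi_params = dense_parameter_count(MULTISENSOR_LAYERS)

print(single_params)  # 5256
print(multi_params)   # 11856
```

Under this assumption, only the first layer changes between the two structures, so the fused model grows by (198 - 66) × 50 = 6600 weights.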

As can be seen from Table 2, compared with sensors 1 to 3 used individually, the diagnostic accuracy based on multisensor fusion is improved by 4.43%, 10.10% and 6.27%, respectively. These results show that diagnosis based on the fused multisensor signal is significantly better than diagnosis based on any single-sensor signal, which proves that multisensor collaborative diagnosis is feasible and effective. Table 2 also shows that the diagnostic accuracy differs considerably between sensors, indicating that different sensor signals contain different fault information. When different sensors are combined in a diagnosis, more accurate and reliable results can be obtained.

#### 4.4.2. Verification of the Superiority of the Proposed Method

To verify the performance of the proposed model, we compared it with a stacked sparse autoencoder (SSAE) and two traditional machine learning methods, random forest (RF) and support vector machine (SVM). For a fair comparison, the network structure of SSAE is the same as that of the proposed method; its sparse parameter is set to 0.2 and its sparse penalty coefficient to 0.15. The RF contains 200 trees with a maximum depth of 2. The SVM uses the RBF kernel function, with the penalty factor and the kernel function parameter set to 10 and 0.01, respectively.
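The two traditional baselines could be configured as in the sketch below. It uses scikit-learn, which the paper does not name as its implementation, and it assumes that the "penalty factor" corresponds to `C` and the "kernel function parameter" to `gamma`; the function name and the random seed are hypothetical.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC


def build_baselines(random_state=0):
    """Return the RF and SVM baselines with the hyperparameters given in the text."""
    rf = RandomForestClassifier(
        n_estimators=200,   # 200 trees
        max_depth=2,        # maximum depth of 2
        random_state=random_state,
    )
    # Assumption: penalty factor -> C, kernel function parameter -> gamma.
    svm = SVC(kernel="rbf", C=10, gamma=0.01)
    return {"RF": rf, "SVM": svm}


baselines = build_baselines()
```

Each model would then be trained with `fit(X_train, y_train)` on the 198-dimensional fused features and evaluated with `score(X_test, y_test)`.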

**Figure 6.** Comparison of 10 experiment results for different sensors' datasets.



To eliminate the influence of random errors, 10 experiments were conducted for each method, and the mean and standard deviation of the 10 results were used as the evaluation index. The results of the 10 experiments are compared in Figure 7, and their mean accuracy and standard deviation are listed in Table 3.
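The evaluation index described above can be computed as in the following sketch. The accuracy values are placeholders chosen only to make the example runnable and are not results from this paper; the use of the sample standard deviation (rather than the population form, `statistics.pstdev`) is also an assumption, since the text does not specify which is reported.

```python
import statistics


def summarize_runs(accuracies):
    """Return (mean, sample standard deviation) of repeated-run accuracies."""
    return statistics.mean(accuracies), statistics.stdev(accuracies)


# Placeholder accuracies (%) for 10 repeated runs; NOT results from the paper.
placeholder_accuracies = [95.0, 96.7, 93.3, 95.0, 98.3, 96.7, 95.0, 93.3, 96.7, 95.0]

mean_accuracy, std_accuracy = summarize_runs(placeholder_accuracies)
print(f"mean = {mean_accuracy:.2f}%, std = {std_accuracy:.2f}")
```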

**Figure 7.** Comparison of experimental results.


**Table 3.** The average accuracy and standard deviation.

As can be seen from Figure 7 and Table 3, the diagnostic results of the two traditional machine learning methods, RF and SVM, are lower than those of the two autoencoder networks in all 10 experiments. This shows that traditional machine learning methods have a weak feature extraction ability and a low generalization ability when dealing with complex signals, so it is difficult for them to achieve a good diagnostic effect. Of the two autoencoder networks, the diagnostic accuracy of SSAE is 6.88% lower than that of DAEN and its standard deviation is 71.26% higher, which indicates that SSAE has a weaker feature extraction ability. The proposed method has the highest diagnostic accuracy and the lowest standard deviation over the 10 experiments, indicating that it can mine fault-sensitive features effectively, make fuller use of multisensor information and improve both the accuracy and the stability of the diagnosis.

Figure 8 shows the confusion matrix of the first trial of the proposed method. The horizontal coordinates of the confusion matrix are the true labels, the vertical coordinates are the predicted labels, and the numbers on the diagonal indicate the classification accuracy of the proposed method for each type of sample. From Figure 8, it can be seen that, for the rolling bearing dataset of six health conditions, the proposed method correctly identifies all samples of five conditions: inner ring fault 2, outer ring fault 1, outer ring fault 2, rolling ball fault and the normal condition. The only misclassification occurs in the inner ring fault 1 class.

**Figure 8.** Confusion matrix of the first trial of the proposed method.
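The layout of such a confusion matrix, with true labels along the horizontal axis and predicted labels along the vertical axis, can be sketched as follows. The class order, the helper names and the labels are illustrative assumptions: the example uses 10 test samples per class and one hypothetical misclassified inner ring fault 1 sample, and it does not reproduce the actual counts in Figure 8.

```python
# Class order is an assumption made for this illustration.
STATES = [
    "inner ring fault 1", "inner ring fault 2",
    "outer ring fault 1", "outer ring fault 2",
    "rolling ball fault", "normal",
]


def confusion_matrix(true_labels, predicted_labels, n_classes):
    """Build a matrix whose rows are predicted labels and columns are true labels."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for true, predicted in zip(true_labels, predicted_labels):
        matrix[predicted][true] += 1
    return matrix


def per_class_accuracy(matrix):
    """Diagonal entry divided by the column total (all samples of that true class)."""
    n = len(matrix)
    accuracies = []
    for c in range(n):
        column_total = sum(matrix[r][c] for r in range(n))
        accuracies.append(matrix[c][c] / column_total if column_total else 0.0)
    return accuracies


# Illustrative labels: 10 test samples per state, all predicted correctly...
true_labels = [state for state in range(len(STATES)) for _ in range(10)]
predicted_labels = list(true_labels)
# ...except one hypothetical inner ring fault 1 sample predicted as inner ring fault 2.
predicted_labels[0] = 1

matrix = confusion_matrix(true_labels, predicted_labels, len(STATES))
accuracies = per_class_accuracy(matrix)
for name, accuracy in zip(STATES, accuracies):
    print(f"{name}: {accuracy:.0%}")
```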

The t-distributed stochastic neighbor embedding (t-SNE) algorithm [33] is adopted for feature visualization. t-SNE is used to draw scatter plots of the raw data and of the features output by the Softmax layer of the proposed method, as shown in Figure 9. From Figure 9, it can be seen that the raw time-domain signal contains too much redundant information, and the features of the different categories are difficult to distinguish. In contrast, the features extracted by the proposed method at the Softmax layer are easier to distinguish and show a better classification effect, i.e., features of the same fault cluster around a common center and features of different faults are separated, which proves the better performance of the proposed method.

**Figure 9.** Feature visualization. (**a**) Feature visualization of raw signal; (**b**) Feature visualization of Softmax layer.
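A two-dimensional embedding of this kind could be produced as in the sketch below. It uses the scikit-learn `TSNE` class, which the paper does not name as its implementation, and a synthetic stand-in for the Softmax-layer outputs (60 samples, six classes); the perplexity, the seed and the function name are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.manifold import TSNE


def embed_2d(features, perplexity=15.0, seed=0):
    """Project feature vectors to two dimensions with t-SNE."""
    tsne = TSNE(n_components=2, perplexity=perplexity, init="pca", random_state=seed)
    return tsne.fit_transform(features)


rng = np.random.default_rng(0)
labels = np.repeat(np.arange(6), 10)   # 10 test samples for each of 6 states

# Synthetic stand-in for Softmax outputs: noisy logits biased toward the true class.
logits = rng.normal(size=(60, 6)) + 4.0 * np.eye(6)[labels]
softmax_like = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

embedding = embed_2d(softmax_like)
print(embedding.shape)  # (60, 2)
```

The resulting coordinates can be drawn as a scatter plot colored by `labels`; applying the same function to the raw 1024-point samples would give the counterpart of panel (a).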
