**3. Experimental Verification**

In this section, the open dataset from the Bearing Data Center of Case Western Reserve University is used to verify the proposed method. To reduce the amount of real data that must be collected, model-based data augmentation is used to construct the training dataset. To reduce the number of parameters to be identified, Equation (1) is rewritten as:

$$\begin{aligned} \ddot{y}\_s + \frac{c\_s}{m\_s} \dot{y}\_s + \frac{k\_s}{m\_s} y\_s &= \frac{F\_r - F\_y}{m\_s}, \\ \ddot{y}\_p + \frac{c\_p}{m\_p} \dot{y}\_p + \frac{k\_p}{m\_p} y\_p &= \frac{F\_y}{m\_p}. \end{aligned} \tag{15}$$

An error index is used to quantify the distance between the simulation results and the measured experimental results. To account for time shifts between the simulated and measured waveforms, this index is defined in the frequency domain as

$$e\_{\text{inx}} = \frac{\left\| \, |FFT\left(\ddot{y}\_{p,\text{sim}}\right)| - |FFT\left(\ddot{y}\_{p,\text{real}}\right)| \, \right\|}{\left\| \, |FFT\left(\ddot{y}\_{p,\text{real}}\right)| \, \right\|} \tag{16}$$

where |*FFT*(·)| is the amplitude spectrum of the signal, and the subscripts sim and real denote the simulated and the measured acceleration, respectively.

Then, the system parameters are obtained by solving the optimization problem

$$\begin{array}{l} \underset{P}{\operatorname{argmin}} \; e\_{\text{inx}}, \\ P \in \{ c\_s / m\_s, c\_p / m\_p, k\_s / m\_s, k\_p / m\_p \} \end{array} \tag{17}$$
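As an illustration of the error index in Equation (16), the following minimal sketch computes the relative distance between two amplitude spectra with NumPy. The function name `error_index`, the use of the Euclidean norm, and the synthetic test signal are assumptions made for this example; the text does not state which norm is used.

```python
import numpy as np


def error_index(acc_sim, acc_real):
    """Relative distance between amplitude spectra, in the spirit of Eq. (16)."""
    acc_sim = np.asarray(acc_sim, dtype=float)
    acc_real = np.asarray(acc_real, dtype=float)
    if acc_sim.shape != acc_real.shape:
        raise ValueError("both signals must have the same length")
    # Amplitude spectra of the two real-valued signals.
    amp_sim = np.abs(np.fft.rfft(acc_sim))
    amp_real = np.abs(np.fft.rfft(acc_real))
    # Assumption: Euclidean norm (the source does not name the norm).
    return np.linalg.norm(amp_sim - amp_real) / np.linalg.norm(amp_real)


# Synthetic stand-in for a measured signal (hypothetical values).
fs = 12_000                      # sampling rate in Hz, assumed
t = np.arange(fs) / fs           # one second of samples
real = np.sin(2 * np.pi * 30 * t) + 0.5 * np.sin(2 * np.pi * 160 * t)

print(error_index(real, real))                # identical signals
print(error_index(np.roll(real, 250), real))  # circularly shifted copy
print(error_index(0.5 * real, real))          # amplitude mismatch
```

Because only the amplitude spectrum enters the index, a circular time shift of the waveform leaves it essentially unchanged, whereas an amplitude mismatch is penalized.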

By comparing the simulation data with the experimental data, the parameters of the bearing model can be obtained, as shown in Table 1.
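The identification step can be sketched on a reduced problem. The example below is not the authors' procedure: it uses a single oscillator instead of the two-degree-of-freedom model, identifies only two ratios (c/m and k/m), and replaces the unspecified optimizer with an exhaustive grid search. The helper names `simulate_acceleration` and `identify`, the excitation, and all numerical values are hypothetical.

```python
import itertools

import numpy as np


def simulate_acceleration(c_over_m, k_over_m, force_over_m, dt):
    """Semi-implicit Euler integration of y'' + (c/m) y' + (k/m) y = F/m."""
    y, v = 0.0, 0.0
    acc = np.empty_like(force_over_m)
    for i, f in enumerate(force_over_m):
        a = f - c_over_m * v - k_over_m * y
        v += a * dt          # update velocity first ...
        y += v * dt          # ... then position with the new velocity
        acc[i] = a
    return acc


def error_index(acc_sim, acc_real):
    """Relative distance between amplitude spectra (Euclidean norm assumed)."""
    amp_sim = np.abs(np.fft.rfft(acc_sim))
    amp_real = np.abs(np.fft.rfft(acc_real))
    return np.linalg.norm(amp_sim - amp_real) / np.linalg.norm(amp_real)


def identify(acc_real, force_over_m, dt, c_grid, k_grid):
    """Return the (c/m, k/m) pair on the grid with the smallest error index."""
    return min(
        itertools.product(c_grid, k_grid),
        key=lambda p: error_index(
            simulate_acceleration(p[0], p[1], force_over_m, dt), acc_real
        ),
    )


dt = 1e-4
t = np.arange(0.0, 0.2, dt)
force = np.sin(2 * np.pi * 30 * t)                 # assumed excitation
# "Measured" data generated from known ratios, so the search can be checked.
acc_real = simulate_acceleration(40.0, 4.0e5, force, dt)
c_grid = [20.0, 40.0, 80.0]
k_grid = [2.0e5, 4.0e5, 8.0e5]
print(identify(acc_real, force, dt, c_grid, k_grid))
```

In practice a gradient-free optimizer over all four ratios would likely replace the grid search; the grid is used here only because it makes the argmin in Equation (17) explicit.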

**Table 1.** Value of parameters in the dynamic equation.


To better reproduce the real situation, a disturbance force is added to Equation (1):

$$m\_s\ddot{y}\_s + c\_s\dot{y}\_s + k\_s y\_s + F\_y + F\_{\text{ext}} = F\_r \tag{18}$$

with

$$F\_{\text{ext}} = A \sin(\omega t)$$

where *A* and *ω* are the amplitude and the angular frequency of the disturbance force, respectively. The information about the disturbance force can be extracted from the fault-free case. Figure 5a shows the measured signal in the fault-free case, in which a 30 Hz vibration component is observed; this component is attributed to the disturbance force.
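One possible way to read the disturbance frequency off a fault-free record is to locate the largest peak of its amplitude spectrum. The sketch below does this for a synthetic signal; the function name `dominant_component`, the 12 kHz sampling rate, and the signal itself are assumptions for illustration. Note that the amplitude returned is that of the measured vibration component, not of the force amplitude *A*, which would have to be inferred through the model.

```python
import numpy as np


def dominant_component(signal, fs):
    """Return (frequency in Hz, amplitude) of the largest non-DC spectral peak."""
    signal = np.asarray(signal, dtype=float)
    n = signal.size
    # Remove the mean so that the DC bin cannot win the peak search.
    spectrum = np.fft.rfft(signal - signal.mean())
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    peak = np.argmax(np.abs(spectrum))
    # For a sinusoid lying on an exact frequency bin, 2|X|/n is its amplitude.
    amplitude = 2.0 * np.abs(spectrum[peak]) / n
    return freqs[peak], amplitude


# Synthetic fault-free record: a 30 Hz component buried in weak noise.
fs = 12_000                                   # assumed sampling rate in Hz
t = np.arange(fs) / fs
rng = np.random.default_rng(0)
record = 0.8 * np.sin(2 * np.pi * 30 * t) + 0.05 * rng.standard_normal(fs)

freq_hz, amp = dominant_component(record, fs)
omega = 2 * np.pi * freq_hz                   # angular frequency for A*sin(omega*t)
print(freq_hz, amp, omega)
```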

**Figure 5.** Real and simulation data in the fault-free case: (**a**,**c**,**e**) are the original data, the envelope data, and the frequency spectrum of the real envelope data; (**b**,**d**,**f**) are the original data, the envelope data, and the frequency spectrum of the simulation envelope data.

For data augmentation, we used the envelopes of the signals instead of the original signals, for two reasons. First, the envelope signals contain less noise. Second, the envelope signal does not retain the eigenfrequency information, so a highly accurate model is not needed, i.e., the parameters *k*<sub>s</sub> and *k*<sub>p</sub> may deviate from their true values to some extent. Figure 5 shows the real and simulation data in the fault-free case, and Figure 6 shows the real and simulation data in the outer race fault case. In both figures, (a,c,e) are the original data, the envelope data, and the frequency spectrum of the envelope for the real data, and (b,d,f) are the same quantities for the simulation data.

The task of fault detection is to distinguish between the normal case, the outer race fault, and the inner race fault. A ResNet deep neural network was therefore used as the classifier. Deep neural networks are powerful classifiers but require a large amount of training data, so we used the dynamic model to generate data to assist the training process.
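The text does not state how the envelope is computed. A common choice, assumed here, is the magnitude of the analytic signal obtained with an FFT-based Hilbert transform (the same idea as `scipy.signal.hilbert`, written with NumPy only). The function names and the amplitude-modulated test signal are hypothetical.

```python
import numpy as np


def envelope(signal):
    """Amplitude envelope as the magnitude of the analytic signal."""
    x = np.asarray(signal, dtype=float)
    n = x.size
    spectrum = np.fft.fft(x)
    # Keep DC (and Nyquist for even n), double positive frequencies,
    # and zero out negative frequencies.
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    analytic = np.fft.ifft(spectrum * weights)
    return np.abs(analytic)


def envelope_spectrum(signal, fs):
    """One-sided amplitude spectrum of the mean-removed envelope."""
    env = envelope(signal)
    env = env - env.mean()
    amp = 2.0 * np.abs(np.fft.rfft(env)) / env.size
    freqs = np.fft.rfftfreq(env.size, d=1.0 / fs)
    return freqs, amp


# A 3 kHz carrier (standing in for a resonance) modulated at 100 Hz.
fs = 12_000                                   # assumed sampling rate in Hz
t = np.arange(fs) / fs
modulation = 1.0 + 0.5 * np.cos(2 * np.pi * 100 * t)
x = modulation * np.sin(2 * np.pi * 3000 * t)

freqs, amp = envelope_spectrum(x, fs)
print(freqs[np.argmax(amp)])                  # frequency of the largest envelope peak
```

The carrier frequency disappears from the envelope spectrum and only the modulation frequency remains, which is consistent with the remark that the eigenfrequency information is not retained in the envelope.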

**Figure 6.** Real and simulation data with the outer race fault: (**a**,**c**,**e**) are the original data, the envelope data, and the frequency spectrum of the real envelope data; (**b**,**d**,**f**) are the original data, the envelope data, and the frequency spectrum of the simulation envelope data.

Parameter identification requires a set of experimental data. Once the parameters are identified, the dynamic model can generate data under other operating conditions. For example, when the parameters are identified at a rotation speed of 1797 rpm with a 12 kg load, the model can then generate vibration data at different speeds and different loads. Figure 7 shows the generated data with the outer race fault, where (a,c,e) are the original data at different speeds, at different pre-loads, and with noise, and (b,d,f) are the corresponding envelope data. The vibration data in the normal case and in the inner race fault case can be generated in the same way; Figure 8 shows the simulated vibration data in the inner race fault case. Thus, only a small amount of measured data is required (such as data with the outer race fault), and the model can generate rich data for different situations.
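The way a model turns a few operating conditions into a larger dataset can be sketched as a sweep over speed, load, and noise level. The generator below is a toy stand-in and not the dynamic model of Equation (15): it places impacts at a fault frequency and lets each one excite a decaying resonance. The names `simulate_fault_signal` and `build_dataset`, the fault order of 3.58, the load scaling, the resonance frequency, and the decay rate are all assumptions for illustration.

```python
import itertools

import numpy as np


def simulate_fault_signal(speed_rpm, load_scale, fault_order, fs=12_000,
                          duration=0.5, resonance_hz=3_000.0, decay=800.0,
                          noise_std=0.0, rng=None):
    """Toy fault signal: impact train at the fault frequency plus noise."""
    rng = np.random.default_rng() if rng is None else rng
    n = int(fs * duration)
    t = np.arange(n) / fs
    fault_hz = fault_order * speed_rpm / 60.0     # impacts per second
    impacts = np.zeros(n)
    idx = np.round(np.arange(0.0, duration, 1.0 / fault_hz) * fs).astype(int)
    impacts[idx[idx < n]] = load_scale            # assumed: impact size scales with load
    ring = np.exp(-decay * t) * np.sin(2 * np.pi * resonance_hz * t)
    clean = np.convolve(impacts, ring)[:n]        # each impact excites the resonance
    return clean + noise_std * rng.standard_normal(n)


def build_dataset(speeds_rpm, load_scales, noise_stds, fault_order, seed=0):
    """Generate one signal per (speed, load, noise) combination."""
    rng = np.random.default_rng(seed)
    samples, conditions = [], []
    for speed, load, noise in itertools.product(speeds_rpm, load_scales, noise_stds):
        samples.append(simulate_fault_signal(speed, load, fault_order,
                                             noise_std=noise, rng=rng))
        conditions.append((speed, load, noise))
    return np.stack(samples), conditions


signals, conditions = build_dataset(
    speeds_rpm=[1797, 1730],      # speeds that appear in Figure 8
    load_scales=[1.0, 1.5],       # hypothetical relative loads
    noise_stds=[0.0, 0.05],       # hypothetical noise levels
    fault_order=3.58,             # hypothetical outer race fault order
)
print(signals.shape, len(conditions))
```

Each generated signal would then be converted to its envelope before being used for training, in line with the choice described above.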

The ResNet classifier is used to detect the bearing faults. Table 2 lists the parameters of the ResNet. Three training datasets are constructed to verify the proposed method: the original data, the real envelope data, and the simulated envelope data. Each of them is used to train the ResNet classifier. The testing dataset includes the normal, outer race fault, and inner race fault cases.

**Figure 7.** The generated data with the outer race fault: (**a**,**c**,**e**) are the original data at different speeds, different pre-loads, and with noise; (**b**,**d**,**f**) are the envelope data at different speeds, different pre-loads, and with noise.

**Figure 8.** The generated data with the inner race fault: (**a**,**b**) are the original data and the envelope data when the rotation speed is 1797 rpm; (**c**,**d**) are the original data and the envelope data when the rotation speed is 1730 rpm.


**Table 2.** Structure parameters of the ResNet.

In general, the smaller the difference between the simulation data and the real data, the more accurate the classification results. However, a gap between the simulation and the real data always exists. Figure 9 compares the simulation and real data. The error index is used to evaluate how well the simulation results are suited to data augmentation: it is 0.9214 for the original data and 0.4622 for the envelope data. The gap between the real and simulation results is therefore much larger for the original data than for the envelope data, which is why we use the envelope data as the training dataset.

**Figure 9.** Comparison of experimental and simulation data: (**a**,**c**) are the data in the time domain; (**b**,**d**) are the data in the frequency domain, where the blue line is the experimental data and the red line is the simulation data.

Figure 10 shows the distribution of the classification probability for each dataset after training. The training dataset contains 500 groups of data, and the testing dataset contains 150 groups of data. If the real data (the original data or the envelope data) are used for training, the classification accuracy reaches 100%, because the signals in the three cases differ greatly. However, collecting data under different operating conditions is expensive. Figure 10c shows the classification results of the ResNet classifier trained purely on simulation data. The classification accuracy is still 100%, but the predicted probability is lower than that obtained with the real data. The reason is that the simulation data are not identical to the real data, and this gap lowers the predicted probability. By using the envelope data, we can reduce the gap and achieve acceptable classification results. The proposed method, based on the ResNet classifier with model-based data augmentation, reduces the high cost of classifier training and therefore has broad application prospects.
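The distinction between accuracy and predicted probability can be made concrete with a small sketch. The probability values below are illustrative numbers chosen for this example and are not results from the experiments; the function name `accuracy_and_confidence` is likewise hypothetical.

```python
import numpy as np


def accuracy_and_confidence(probabilities, labels):
    """Return (accuracy, mean probability assigned to the true class)."""
    probabilities = np.asarray(probabilities, dtype=float)
    labels = np.asarray(labels)
    predicted = probabilities.argmax(axis=1)
    accuracy = float(np.mean(predicted == labels))
    confidence = float(probabilities[np.arange(labels.size), labels].mean())
    return accuracy, confidence


# Classes: 0 = normal, 1 = outer race fault, 2 = inner race fault.
labels = np.array([0, 1, 2])

# Illustrative classifier outputs (NOT measured results).
probs_trained_on_real = np.array([[0.98, 0.01, 0.01],
                                  [0.02, 0.97, 0.01],
                                  [0.01, 0.02, 0.97]])
probs_trained_on_sim = np.array([[0.60, 0.30, 0.10],
                                 [0.25, 0.55, 0.20],
                                 [0.15, 0.20, 0.65]])

print(accuracy_and_confidence(probs_trained_on_real, labels))
print(accuracy_and_confidence(probs_trained_on_sim, labels))
```

Both sets of outputs classify every sample correctly, yet the second assigns a lower probability to the true class, which is the behaviour described for the classifier trained on simulation data.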

**Figure 10.** Classification results of the ResNet classifier: (**a**) the training and testing datasets are the real original data; (**b**) the training and testing datasets are the real envelope data; (**c**) the training dataset is the simulation data while the testing dataset is the real data.
