*3.4. Data Augmentation*

Recognizing images with deep learning, particularly with neural network models, requires a large amount of image data for training. Most common deep-learning benchmark datasets are correspondingly large: the Mixed National Institute of Standards and Technology (MNIST) dataset contains 60,000 training samples and 10,000 test samples, and the Canadian Institute For Advanced Research-10 (CIFAR-10) dataset contains 60,000 color images, of which 50,000 are for training and 10,000 for testing. To train our own neural network model, we therefore need a comparably large amount of experimental data. However, the experimental data we collected cannot meet the actual training requirements of the neural network, so data augmentation must be introduced to increase the number of samples. In computer vision, data augmentation is usually achieved through operations such as flipping, rotation, cropping, distortion, and scaling; such methods, however, cannot be applied to time-domain sequence signals. We instead augment the fault-diagnosis data with a fixed-length sliding window that slices the sequential time-domain signal step by step, as shown in Figure 11.

Using this method, we obtain 79,980 training samples from the 80,000 original data points collected by the UAV. This overlapping-sampling approach effectively solves the problem of insufficient training samples, yet it has been overlooked in many articles [40–43], which do not use overlapping sampling for augmentation and consequently train their models on only hundreds or thousands of samples. Later in the article, we verify the necessity of data augmentation through experiments.
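The sliding-window slicing described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the window length of 21 and stride of 1 are assumptions chosen only because they reproduce the stated count of 79,980 samples from an 80,000-point signal (80,000 − 21 + 1 = 79,980); the section does not state the actual window parameters used.

```python
import numpy as np

def sliding_window_augment(signal, window_len, stride=1):
    """Slice a 1-D time-domain signal into overlapping fixed-length windows.

    With stride < window_len, consecutive windows overlap, which multiplies
    the number of training samples obtainable from one recording.
    """
    n = len(signal)
    starts = range(0, n - window_len + 1, stride)
    return np.stack([signal[s:s + window_len] for s in starts])

# Hypothetical parameters: an 80,000-point signal, window length 21, stride 1
# yields 80,000 - 21 + 1 = 79,980 overlapping windows.
signal = np.arange(80_000, dtype=np.float32)
samples = sliding_window_augment(signal, window_len=21, stride=1)
print(samples.shape)  # (79980, 21)
```

Setting the stride equal to the window length would recover non-overlapping slicing, which is what produces only a few thousand samples from the same recording; the overlap is what provides the augmentation.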

### **4. Validation of the sPSDAE-CNN Model**
