## 2.5.3. Learning Rate

The learning rate is a crucial hyperparameter in CNN classification models and directly affects recognition accuracy, so choosing an appropriate value is both difficult and important. In this paper, the model was trained with the equal-interval learning rate decay method, in which the values of step\_size and gamma were determined by BOA. The equal-interval learning rate decay is computed as follows.

$$new\_lr = initial\_lr \times gamma^{\frac{epoch}{step\_size}} \tag{4}$$

where *new\_lr* is the learning rate after decay, *initial\_lr* is the learning rate before decay, *gamma* is the decay rate (less than 1), *epoch* is the number of training epochs completed, and *step\_size* is the decay interval in epochs.
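Equation (4) can be sketched in a few lines of Python. The values of initial\_lr, gamma, and step\_size below are illustrative placeholders; in the paper, step\_size and gamma are found by BOA. The exponent uses integer division, matching the usual step-decay convention in which the rate is multiplied by gamma once every step\_size epochs.

```python
def step_decay_lr(initial_lr, gamma, step_size, epoch):
    """Equal-interval (step) learning rate decay, Eq. (4):
    the rate is multiplied by gamma once every step_size epochs."""
    return initial_lr * gamma ** (epoch // step_size)

# Illustrative values only; the paper obtains step_size and gamma via BOA.
initial_lr, gamma, step_size = 0.01, 0.5, 10
schedule = [step_decay_lr(initial_lr, gamma, step_size, e) for e in (0, 9, 10, 25)]
# The rate stays at 0.01 for epochs 0-9, halves at epoch 10, and halves again at 20.
```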

## 2.5.4. Regularization

Regularization adds a penalty term to the loss function to reduce model complexity and instability and thus avoid overfitting. L2 regularization not only prevents overfitting but also, through weight decay, makes the optimization process stable and fast. Therefore, the L2 regularization method was adopted to address model overfitting, and the regularization coefficient was determined by BOA.
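A minimal sketch of the L2 penalty described above: the penalized loss adds the sum of squared weights, scaled by a regularization coefficient lam. The function name, weight values, and lam below are illustrative assumptions; in the paper, the coefficient is obtained by BOA.

```python
def l2_penalized_loss(base_loss, weights, lam):
    """Add an L2 penalty lam * sum(w^2) to the base loss.
    lam is the regularization coefficient (tuned by BOA in the paper)."""
    return base_loss + lam * sum(w * w for w in weights)

# Illustrative: base loss 0.30, a few weights, lam = 0.01
loss = l2_penalized_loss(0.30, [0.5, -1.0, 2.0], 0.01)
```

Larger lam shrinks the weights more aggressively during training, which is the weight-decay effect noted above.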
