**3. Results and Discussion**

Rice image datasets with different DOMs were used to train the BPNN, AlexNet, VGG16, ResNet34, and IRBOA models, and the five models were compared to identify the optimal rice DOM inspection model. The training epochs for the BPNN and CNN models were 5000 and 100, respectively. Figure 6 shows the loss and accuracy curves of the four CNN models on the training set. The horizontal axis is the number of training epochs, and the vertical axes are the loss value (Loss) and accuracy (Acc) of the model, respectively. As the number of training epochs increases, the classification error on the training set trends downward while the accuracy trends upward. When the training epochs of the IRBOA model reach 69, the training loss approaches a stable value. The stable average loss is 0.087, which is lower than that of the other three CNN models, and the accuracy is significantly higher than that of the other models. In conclusion, the IRBOA model designed in this paper is reasonably designed and provides satisfactory training results.
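The notion of the loss curve "approaching a stable value" can be made concrete with a plateau check over a sliding window. The sketch below is illustrative only: the synthetic loss curve, the window size, and the tolerance are assumptions, not the paper's actual training logs.

```python
# Minimal sketch of detecting the epoch at which a loss curve plateaus.
# The synthetic curve below decays toward 0.087 (the stable average loss
# reported for the IRBOA model); it is not real training data.

def find_stable_epoch(losses, window=5, tol=1e-3):
    """Return the first epoch after which the mean loss over a sliding
    window changes by less than `tol`, i.e. the curve has flattened."""
    for i in range(window, len(losses) - window):
        prev = sum(losses[i - window:i]) / window
        curr = sum(losses[i:i + window]) / window
        if abs(prev - curr) < tol:
            return i
    return len(losses) - 1

# Synthetic decaying loss curve: starts high, flattens near 0.087.
losses = [0.087 + 1.5 * (0.9 ** e) for e in range(100)]
epoch = find_stable_epoch(losses)
print(f"loss stabilizes around epoch {epoch}, final loss {losses[-1]:.3f}")
```

In practice such a check can double as an early-stopping criterion, ending training once the windowed loss stops improving.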

**Figure 6.** Comparison of the learning curves of the four CNN models. (**a**) Loss curve. (**b**) Accuracy curve.

The hyperparameter optimization result of the IRBOA model is shown in Figure 7. The horizontal axis (Trial) in Figure 7 represents the number of iterations of the BOA; at iteration 98, the objective function value reaches 0.9690 and the best result is obtained. However, the value of the objective function continues to change as the number of iterations increases, which indicates that the BOA keeps exploring other candidate positions even while approaching the optimal value. Table 3 lists the hyperparameters obtained by the BOA for the five models, from which we can see that these values would be difficult to arrive at by manual tuning. The algorithm saves time and achieves results that cannot be captured by manual search. The models were trained and tested with the optimized hyperparameters, and the recognition rate of each model was calculated on the test set. According to the comparative analysis in Table 3, the detection accuracy of the IRBOA model for recognizing rice images was higher than that of the other four models, at 96.90%.
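The trial loop behind Figure 7 can be sketched as follows. This is a deliberately simplified stand-in: it uses random search rather than a full Bayesian surrogate model, and the search space, the hyperparameter names, and the placeholder objective are assumptions for illustration, not the paper's actual configuration.

```python
import random

# Simplified stand-in for the hyperparameter-search loop: random search
# over a hypothetical space. The real BOA proposes each trial from a
# surrogate model; here trials are sampled uniformly and only the best
# objective value (validation accuracy) seen so far is tracked.

random.seed(0)

SPACE = {
    "learning_rate": (1e-5, 1e-2),     # hypothetical bounds
    "batch_size": [16, 32, 64, 128],   # hypothetical choices
    "dropout": (0.1, 0.5),             # hypothetical bounds
}

def sample_trial():
    return {
        "learning_rate": random.uniform(*SPACE["learning_rate"]),
        "batch_size": random.choice(SPACE["batch_size"]),
        "dropout": random.uniform(*SPACE["dropout"]),
    }

def objective(params):
    # Placeholder for "train the model, return validation accuracy".
    # A real objective runs a full training cycle per trial.
    return 0.9 + 0.07 * random.random()

best_score, best_params = -1.0, None
for trial in range(100):
    params = sample_trial()
    score = objective(params)
    if score > best_score:
        best_score, best_params = score, params

print(f"best objective {best_score:.4f} with {best_params}")
```

The key difference from true Bayesian optimization is that the surrogate model would bias later trials toward promising regions instead of sampling uniformly, which is why the BOA can still probe new positions while converging on the optimum.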

**Figure 7.** The Bayesian optimization process for the IRBOA model.

Accuracy alone is not sufficient to describe the practical performance of a model when the data samples differ significantly or are imbalanced. Confusion matrices were therefore plotted for the five models on the test set (Figure 8) to accurately assess their classification performance for rice DOM. In Figure 8, the actual categories (horizontal axis) are compared with the predicted categories (vertical axis) to depict the classification performance for each category: 'A' denotes well-milled, 'B' reasonably well-milled, and 'C' substandard rice. These results demonstrate that the classification performance of the CNN models was better than that of the BPNN, with the IRBOA model performing best. The recognition precision of this model was 99.22%, 94.92%, and 96.55% for well-milled, reasonably well-milled, and substandard rice, respectively, with an average correct detection rate of 96.90%. The accuracy of the IRBOA model is 7.41% higher than that of traditional machine learning and at least 1.35% higher than that of the classic CNN models.
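The per-class precision values read off Figure 8 come from the columns of the confusion matrix. A minimal sketch of that computation, using made-up label vectors rather than the paper's test-set predictions:

```python
# Sketch of a confusion matrix and per-class precision for the three
# DOM classes. The label vectors below are illustrative only.

CLASSES = ["well-milled", "reasonably well-milled", "substandard"]

def confusion_matrix(y_true, y_pred, n_classes):
    """rows = actual class, columns = predicted class."""
    m = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        m[t][p] += 1
    return m

def per_class_precision(m):
    """Precision for class j = m[j][j] / (sum of predicted-j column)."""
    n = len(m)
    out = []
    for j in range(n):
        col = sum(m[i][j] for i in range(n))
        out.append(m[j][j] / col if col else 0.0)
    return out

y_true = [0, 0, 0, 1, 1, 2, 2, 2, 1, 0]
y_pred = [0, 0, 1, 1, 1, 2, 2, 1, 1, 0]
cm = confusion_matrix(y_true, y_pred, 3)
for name, p in zip(CLASSES, per_class_precision(cm)):
    print(f"{name}: precision {p:.2%}")
```

Averaging these per-class precisions (weighted by class support) yields the overall correct detection rate quoted in the text.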

**Table 3.** Hyperparameter results for the five models of Bayesian optimization.


**Figure 8.** Confusion matrix of five models. (**a**) BPNN. (**b**) AlexNet. (**c**) VGG16. (**d**) ResNet34. (**e**) IRBOA.

According to the prediction values in the confusion matrix, four statistical indicators were obtained, namely *TP*, *TN*, *FP*, and *FN*. From these, the four evaluation indicators of accuracy, precision, recall, and F1-score, as well as the training time and single-image test time of each model, were calculated to compare the performance of the classification models (Table 4). As Table 4 shows, the precision, recall, and F1-score of the IRBOA model were all 96.90%, and the corresponding values of BPNN, AlexNet, VGG16, and ResNet34 were all lower than those of the proposed model; their F1-scores were 89.43%, 92.32%, 92.94%, and 95.59%, respectively. The experiments indicate that the recognition performance of the IRBOA model is better than that of the remaining four models, with higher accuracy and generalization performance. Meanwhile, we found that the BPNN took longer to test the network on a single sample, although its training time was much shorter than that of the CNN models. The reason is that the BPNN spends a large amount of time extracting the color and texture feature parameters and reducing their dimensionality using principal component analysis. Among the four CNN models, the IRBOA model for recognizing rice DOM is characterized by a long training time but high detection accuracy and a single-image test time of less than 20 milliseconds. The proposed model can therefore meet actual needs in terms of both time and recognition performance.
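The four evaluation indicators follow directly from the TP, TN, FP, and FN counts. The counts in this sketch are illustrative, not taken from Table 4:

```python
# Accuracy, precision, recall, and F1-score from the four statistical
# indicators. The example counts are made up for illustration.

def metrics(tp, tn, fp, fn):
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, precision, recall, f1

acc, prec, rec, f1 = metrics(tp=96, tn=190, fp=4, fn=4)
print(f"accuracy={acc:.2%} precision={prec:.2%} "
      f"recall={rec:.2%} f1={f1:.2%}")
```

Note that when FP equals FN, as in this toy example, precision, recall, and F1 coincide, which matches the pattern of the IRBOA row in Table 4 where all three indicators are 96.90%.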


**Table 4.** Detection performance indicators for the five models.
