**4. Conclusions and Future Research**

The detection of walnut impurities is of great significance to the safety of nut foods. In this paper, an impurity detection model for walnut kernels is established based on an improved YOLOv5 network. First, a small-target recognition layer is added to the original prediction head of the model to obtain more feature information about small impurities. Then, some convolution blocks in the network are replaced by Trans-E blocks, which can capture more comprehensive information in different subspaces at different locations. The CBAM attention module is added to the neck of the network for feature fusion, which improves network performance at a small cost. Finally, GhostConv is introduced to replace the original Conv, which reduces the computational burden of the model and improves the detection speed. The improved model reaches a detection *mAP* of 88.9% and an *F*1 of 90.81%, which is better than the original YOLOv5 network and the other networks compared. Moreover, the improved network not only has a high detection rate but also significantly improves the identification rate of small-target impurities. The improvements studied in this paper aim to maintain a balance between detection performance and detection speed, so as to meet the demand for real-time detection of walnut impurities.

Near-infrared spectroscopy is an important tool in the field of food impurity detection [28]. However, it requires demanding hardware. Compared with near-infrared spectroscopy, the detection technology based on YOLOv5 has a higher detection rate, lighter detection equipment, and a wider range of application objects. It also has certain advantages in detection accuracy. The research content is also applicable to impurity detection in other nut foods and provides a technical reference for the detection of impurities in snack foods.
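To illustrate why replacing a standard convolution with GhostConv reduces the computational burden, the following sketch compares weight counts for a single layer. It assumes the common Ghost design in which half of the output channels come from an ordinary convolution and the other half from a cheap depthwise convolution applied to that first half. The function names, the 5×5 cheap kernel, the ratio of 2, and the channel sizes are illustrative assumptions, not values reported in this paper; BatchNorm parameters are ignored.

```python
def conv_params(c_in: int, c_out: int, k: int, groups: int = 1) -> int:
    """Weight count of a bias-free 2D convolution (BatchNorm ignored)."""
    return (c_in // groups) * c_out * k * k


def ghost_conv_params(c_in: int, c_out: int, k: int, cheap_k: int = 5) -> int:
    """Weight count of a Ghost-style convolution with ratio 2 (assumed design).

    Half of the output channels come from an ordinary k x k convolution;
    the other half come from a depthwise cheap_k x cheap_k convolution
    applied to that first half.
    """
    hidden = c_out // 2
    primary = conv_params(c_in, hidden, k)
    cheap = conv_params(hidden, hidden, cheap_k, groups=hidden)
    return primary + cheap


if __name__ == "__main__":
    # Hypothetical layer: 128 input channels, 256 output channels, 3 x 3 kernel.
    standard = conv_params(128, 256, 3)
    ghost = ghost_conv_params(128, 256, 3)
    print(f"standard conv weights: {standard}")
    print(f"ghost conv weights:    {ghost}")
    print(f"reduction factor:      {standard / ghost:.2f}")
```

Under these assumptions the Ghost layer needs roughly half the weights of the standard layer, because the depthwise branch adds only a small number of parameters on top of the halved primary convolution.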

However, the improved YOLOv5 model has limitations, such as a fraction of missed and false detections for small foreign bodies. Therefore, the detection accuracy of the model still needs to be improved. Increasing the resolution of the camera is one way to improve the detection accuracy. In addition, external light sources bias the illumination of the images. Fan Youchen et al. combined an improved YOLOv5 with dark channel enhancement to solve the problem of insufficient illumination [29]. This method can be applied to the illumination problem in the images. Making the detection model lighter is another key point of future research. Chu et al. proposed a real-time apple flower detection method based on YOLOv4 that uses channel pruning [30]. Isa Iza Sazanita et al. used the adaptive moment estimation (Adam) optimizer and a reduce-learning-rate-on-plateau function to optimize the model's training scheme [31]. In the future, we can try to replace the backbone network with other lightweight networks to reduce the number of model parameters.
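The reduce-learning-rate-on-plateau idea mentioned above can be illustrated with a minimal, framework-free sketch. It is not the implementation used in [31] or in any particular library; the class name `PlateauLR`, the default values, and the loss sequence are hypothetical and chosen only to show the rule: when the validation loss has not improved for more than `patience` consecutive epochs, the learning rate is multiplied by `factor`.

```python
class PlateauLR:
    """Hypothetical minimal reduce-on-plateau rule (not a library API)."""

    def __init__(self, lr: float, factor: float = 0.5,
                 patience: int = 2, min_lr: float = 1e-6) -> None:
        self.lr = lr
        self.factor = factor
        self.patience = patience
        self.min_lr = min_lr
        self.best = float("inf")   # best validation loss seen so far
        self.bad_epochs = 0        # consecutive epochs without improvement

    def step(self, val_loss: float) -> float:
        """Update the state with one epoch's validation loss; return the lr."""
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
            if self.bad_epochs > self.patience:
                # Plateau detected: shrink the learning rate, never below min_lr.
                self.lr = max(self.lr * self.factor, self.min_lr)
                self.bad_epochs = 0
        return self.lr


if __name__ == "__main__":
    scheduler = PlateauLR(lr=0.01, factor=0.5, patience=2)
    # Made-up validation losses: improvement, then three stagnant epochs.
    for epoch, loss in enumerate([1.0, 0.9, 0.95, 0.96, 0.97, 0.8]):
        print(epoch, loss, scheduler.step(loss))
```

In a real training run, a rule of this kind would typically be paired with an optimizer such as Adam, with the scheduler adjusting the optimizer's learning rate once per validation pass.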

**Author Contributions:** Conceptualization, L.Y.; Data curation, Q.C.; Formal analysis, F.S.; Funding acquisition, M.Q.; Investigation, J.P.; Project administration, M.Q.; Resources, L.Y.; Software, Q.C.; Visualization, F.S. and J.P.; Writing—original draft, L.Y.; Writing—review and editing, M.Q. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the National Key R&D Program of China (2017YFC1600800) and the "Pioneer" and "Leading Goose" R&D Program of Zhejiang (2022C02042 & 2022C02057).

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** The data are available from the corresponding author.

**Acknowledgments:** The authors would like to thank the anonymous reviewers and editors, whose insightful comments and valuable suggestions were crucial to the improvement of the manuscript. Financial support from the above funds and organizations is gratefully acknowledged.

**Conflicts of Interest:** The authors declare no conflict of interest.
