1. Introduction
Food is a matter of the national economy and people’s livelihood, and food security is an important foundation of national security. During grain storage and transportation, constant changes in temperature and humidity promote granary pests, which cause mold and decay; this directly affects the edible quality and processing characteristics of stored grain and leads to substantial grain loss [
1]. Controlling pest infestation in grain silos has therefore become a top priority in grain storage. Accurate detection and identification of granary pests is of great practical value, as it enables effective and efficient pest control, reduces damage and helps ensure food security [
2].
The traditional method of granary pest recognition is mainly manual identification: the species and number of pests in grain bins are identified by the human eye, aided by equipment such as magnifying glasses and microscopes. This approach suffers from low efficiency and requires professional expertise that is in limited supply. To overcome these limitations, physical and chemical detection methods have been applied to granary pest detection. Physical methods, such as the sieving method [
3], probe tube trapping method [
4], conductivity method [
5], etc., are limited by low detection sensitivity, long detection times, high labor costs and an inability to detect hidden pests. Chemical methods, such as the staining method [
6], solid-phase microextraction method [
7], molecular biology methods [
8], characteristic volatile compounds detection method [
9], etc., are more convenient, intuitive and sensitive in operation. However, because they require specialized chemical instruments and equipment, experienced professionals and high costs, chemical methods have been difficult to popularize.
With the development of image recognition technology, computer vision has been applied to the identification of granary pests. For example, Chelladurai et al. used soft X-ray and hyperspectral imaging to distinguish infested from uninfested soybeans with an accuracy of more than 90% [
10]. Karunakaran et al. used soft X-ray to identify larvae and adults of rice weevils and granary borers with an identification rate of 98% [
11]. Dowell et al. used near-infrared spectroscopy to detect 11 beetle species associated with stored grain, identifying 99% of the tested insects as primary pests [
12]. Keagy et al. used a machine recognition method to detect weevils and weevil-damaged regions in film X-ray images of wheat grain [
13]. However, these traditional recognition models rely mainly on manually extracted features, which makes it difficult to distinguish the characteristic features of small storage pests from complex backgrounds, and they suffer from poor recognition performance, low accuracy and slow computation.
In recent years, deep learning methods have been increasingly applied to the recognition of granary pests. Chen et al. mounted a camera on a vehicle driven over the surface of a grain bin to photograph two pests, the red flour beetle and the rice weevil, and used a trained YOLOv4 model to identify them with a mean average precision (mAP) of 97.55% [
14]. Jiang et al. applied deep convolutional neural networks to construct feature pyramid networks that extract insect image features at different spatial resolutions and semantic levels, used tiled anchor points at different scales to handle grain storage pests of different sizes, and achieved an average detection accuracy of 94.77% for 10 kinds of grain bin pests [
15]. Shen et al. used the Faster R-CNN method to extract image regions that might contain grain storage pests and identified these regions with an improved Inception neural network, which effectively detected grain storage pests against backgrounds containing impurities, with an mAP of 88% [
16]. He, Y. et al. improved the Faster R-CNN and YOLOv3 algorithms and tested them on a brown rice lice dataset; the improved algorithm achieved a higher recognition rate than Faster R-CNN, with an average accuracy of 96.48% [
17]. Rahman, C.R. et al. proposed a two-stage small CNN that detects 10 rice pests with 93.3% accuracy [
18].
Deep learning based recognition methods for granary pests compensate for the insufficient feature extraction of traditional methods and improve detection performance. However, detecting and identifying granary pests in a real grain bin environment requires distinguishing pests of different scales and species against varied backgrounds in a complex spatial environment. As a result, no existing model achieves fully effective results for granary pest recognition.
The detection of granary pests must account not only for the small target size of the pests but also for the complex background of the granary and the small differences between pest species. Thus, the key to detecting granary pests with deep learning methods is to solve the problem of detecting multi-scale granary pests in complex backgrounds and to improve detection accuracy across different environmental backgrounds. In this study, we first collected a multi-scale image dataset of granary pests. This dataset comprises 5231 images of seven common granary pests acquired with a DSLR camera, a microscope, a cell phone and an online crawler. Each image contains different species of granary pests against a different background. We then proposed an improved granary pest recognition model based on the YOLOv5 network. The main contributions of this work are threefold. First, we collected a dataset covering seven common grain bin pests, containing images of mixed species in different backgrounds and environments. Second, we designed an improved YOLOv5 model, which can, to some extent, solve the problem of detecting and identifying multiple species of grain bin pests in complex backgrounds; the improved model reaches an average accuracy of 97.20% and an mAP0.5 of 98.20%. Third, we compared the performance of models built with EfficientDet [
19], Faster R-CNN [
20], RetinaNet [
21], SSD [
22], YOLOX [
23], YOLOv3 [
24], YOLOv4 [
25], YOLOv5s and the improved YOLOv5s we designed. The experimental results show that the detection, recognition and generalization abilities of our model exceed those of the above models. Next, an ablation analysis was conducted to examine the robustness of the proposed model. Finally, feature visualization was analyzed. The flowchart of this paper is shown in
Figure 1.
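The mAP0.5 figure reported above is the mean, over classes, of the average precision computed at an intersection-over-union (IoU) threshold of 0.5. As background for readers unfamiliar with the metric, the following is a minimal sketch of a single-class evaluation; the function names (`iou`, `average_precision`) are illustrative, not from the paper's code, and it uses a simple non-interpolated precision-recall sum rather than the interpolated variant used by most detection toolkits.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def average_precision(detections, ground_truths, iou_thr=0.5):
    """AP for one class: detections is a list of (confidence, box),
    ground_truths a list of boxes. Greedy matching at the IoU threshold,
    then a non-interpolated sum over the precision-recall curve."""
    detections = sorted(detections, key=lambda d: -d[0])  # high confidence first
    matched = [False] * len(ground_truths)
    tp = fp = 0
    ap, prev_recall = 0.0, 0.0
    for conf, box in detections:
        # match to the best unmatched ground truth
        best, best_i = 0.0, -1
        for i, gt in enumerate(ground_truths):
            if not matched[i]:
                v = iou(box, gt)
                if v > best:
                    best, best_i = v, i
        if best >= iou_thr:
            matched[best_i] = True
            tp += 1
        else:
            fp += 1
        precision = tp / (tp + fp)
        recall = tp / len(ground_truths)
        ap += precision * (recall - prev_recall)
        prev_recall = recall
    return ap
```

mAP0.5 is then the mean of this per-class AP over all seven pest classes; mAP0.5:0.95 additionally averages over IoU thresholds from 0.5 to 0.95.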