Article

Identification of Grape Diseases Based on Improved YOLOXS

1
School of Information and Control Engineering, Xi’an University of Architecture and Technology, Xi’an 710055, China
2
State Key Laboratory for Biology of Plant Disease and Insect Pests, Chinese Academy of Agricultural Sciences, Beijing 100193, China
3
School of Life Science, Institute of Life Science and Green Development, Hebei University, Baoding 071002, China
*
Author to whom correspondence should be addressed.
Appl. Sci. 2023, 13(10), 5978; https://doi.org/10.3390/app13105978
Submission received: 22 March 2023 / Revised: 22 April 2023 / Accepted: 8 May 2023 / Published: 12 May 2023

Abstract
Here we propose a grape disease identification model based on an improved YOLOXS (GFCD-YOLOXS) to achieve real-time detection of grape diseases in field conditions. We built a dataset of 11,056 grape disease images in 15 categories by pre-processing 2566 original grape disease images provided by the data center of the State Key Laboratory of Plant Pest Biology. To improve the YOLOXS algorithm, first, the FOCUS module was added to the backbone network to reduce the loss of information related to grape diseases during convolution, so that features of different depths in the backbone network are fused. Then, the CBAM (Convolutional Block Attention Module) was introduced at the prediction end to make the model focus on the key features of grape diseases and mitigate the influence of the natural environment. Finally, double residual edges were introduced at the prediction end to prevent degradation in the deep network and to make full use of non-key features. Compared with the experimental results in the relevant authoritative literature, GFCD-YOLOXS achieved the highest identification accuracy, 99.10%, indicating the superiority of the proposed algorithm.

1. Introduction

Grapes are one of the four most produced fruits in the world, with an annual production of about 75 million tonnes; thus, they occupy an important position among the world's fruits [1]. Various diseases can attack grapes during the growing process, reducing the quality and yield of the grapes and, in severe cases, even killing the vines. Controlling grapevine diseases is important for the grape industry because of the variety of diseases and the severe damage they cause. Efficient identification of grape diseases is one of the key issues in grape disease control. Currently, grape diseases are identified mainly using artificial intelligence methods, especially deep learning, of which there are two main categories: classification of grape diseases using image classification, and detection of grape diseases using target detection.
In the study of grape disease image classification, Alkan et al. [2] developed a grape disease identification system using convolutional neural networks and transfer learning, achieving an accuracy of 92.5% in identifying ten grape diseases. Liu et al. [3] enhanced multidimensional feature extraction using the Inception network structure and applied a dense connectivity strategy to four cascaded Inception structures, achieving an accuracy of 97.22% in identifying six grape diseases. Shantkumari et al. [4] used CNNC and IKNN, respectively; the two models achieved accuracies of 96.60% and 98.07% in identifying four grape diseases in the Plant Village dataset [5]. Math et al. [6] constructed their own CNN classification model (DCNN) using a dataset of four grape crop image classes from the Plant Village dataset, achieving an accuracy of 99.34%. Peng et al. [7] fused the features extracted by ResNet50 and ResNet10 to build a grape disease identification model, eventually achieving 99.08% accuracy in identifying four grape diseases in the Plant Village dataset. Yin et al. [8] used transfer learning in combination with the MobileNetV3 network to achieve a final accuracy of 99.4% for six grape diseases. Yang et al. [9] achieved 96.05% accuracy in identifying six grape pests and diseases in complex environments based on the ShuffleNet V2 network combined with the MDF decision-making method. Lu et al. [10] combined ghost convolution with a transformer to propose the GeT grape pest identification model, which eventually achieved an accuracy of 98.14% for 11 grape pests and diseases. Kaur et al. [11] extracted features using a fully connected layer, removed irrelevant features from the feature vector using a proposed variance technique, and applied a classification algorithm to the resulting features, achieving an accuracy of 98.7% for the detection of six grape diseases. Suo et al. [12] first used the GSLL algorithm to reduce image noise and then used CASM-AMFMNet to achieve an accuracy of 95.95% in identifying five grape diseases. Ali et al. [13] created a dataset of grape leaves affected by nutrient deficiencies from farmland in a controlled environment, customized a CNN model, and validated it using n-fold cross-validation, ultimately achieving an accuracy of 97.07% for the identification of four grape deficiency conditions. Lin et al. [14] achieved 86.29% accuracy in recognizing seven grape diseases in AI Challenger 2018 by implementing feature fusion at different depths through the RFFB module and introducing CBAM to extract valid disease information. Zinonos et al. [15] combined long-range radio technology with deep learning to transmit images over long distances, achieving an accuracy of 98.33% in identifying four grape diseases in the Plant Village dataset, even with a packet loss rate of 50%. Although much research has been performed on the classification of grape diseases, image classification can only determine the disease type of the whole picture. It cannot directly locate and mark the position of the disease; thus, it has limitations in real-time detection.
Fewer studies have been conducted on the target detection of grape diseases. Xie et al. [16] built on the Faster R-CNN detection algorithm and introduced the Inception-v1, Inception-ResNet-v2, and SE modules to achieve a recognition rate of 81.1% for four grape diseases. Zhu et al. [17] first enhanced the images of grape black rot in the Plant Village dataset via bilinear interpolation, then introduced the SPP module into the backbone network of YOLOv3 and replaced the IOU loss function with GIOU, achieving an accuracy of 95.79% for the detection of grape black rot. Dwivedi et al. [18] achieved an accuracy of 99.93% in identifying four grape diseases in the Plant Village dataset based on Faster R-CNN, introducing multi-task learning and a dual attention mechanism. However, most of these images are from the Plant Village dataset, which contains only four classes of grape images, i.e., healthy leaves, black rot, brown spot, and Esca, and which mainly reflects object detection tasks in laboratory environments with relatively simple backgrounds that differ from those of grape diseases in natural environments; this may limit the generalization ability of the models. In addition, the detection speed of these algorithms is slow, which is a drawback for application scenarios with demanding real-time detection requirements.
To solve the above problems, this paper builds on previous research and addresses the shortcomings of the improved YOLOV3 identification method from our earlier work [19] in terms of real-time performance and model size. It realizes real-time detection of grape diseases in natural environments using the more advanced YOLOXS [20] algorithm in the YOLO [21] series, adapted to the characteristics of grape diseases. The main work of this paper is as follows:
(1) In response to the fact that most current grape disease detection is performed in a laboratory setting, this paper relies on the State Key Laboratory of Plant Pest Biology to establish a dataset of grape diseases in natural environments.
(2) To address the slow detection of grape diseases, the advanced YOLOXS algorithm in the YOLO series is selected, and the FOCUS module, CBAM module [22], and double residual edges are added according to the characteristics of grape diseases.
(3) We experimentally compare the performance of the model before and after the improvement and compare the results with currently available related research. The experimental results show that the enhanced method has greater practicality and application value, ensuring high accuracy while maintaining a fast detection speed.

2. Improved YOLOXS Model for Grape Disease Detection

2.1. YOLOXS Network

The YOLO (You Only Look Once) algorithm was proposed by Redmon et al. [21] in 2016 and has since developed into a series of algorithms; YOLOX, proposed by MEGVII Inc. of Beijing, China in 2021, offers superior performance. According to the characteristics of grape diseases, YOLOXS was selected as the benchmark model for grape disease identification in this paper.
YOLOXS consists of five parts: input, backbone network, neck, prediction, and output. On the input side, a 416 × 416 × 3 resolution image is fed into the CSPNet [23] backbone network to obtain three features of different granularity. The neck then uses FPN [24] and PAN [25] for feature fusion and extraction, and the prediction part uses a double decoupling head. Finally, the results are produced at the output. The network structure is shown in Figure 1.

2.2. Improved YOLOXS Network

The improved YOLOXS model (GFCD-YOLOXS) is shown in Figure 2, and the details are described below.

2.2.1. Addition of Three FOCUS Modules to the Backbone Network

FOCUS, first proposed in YOLOV5 [26], has very few parameters and can obtain down-sampled maps without information loss, as shown in the structure diagram in Figure 3.
In response to the possible loss of information when YOLOXS extracts features from grapes in natural environments, this paper adds three FOCUS modules to the backbone network of YOLOXS to connect feature maps of different depths, effectively using features of different depths and reducing the loss of disease features from grapes; the specific structure is shown in Figure 4.
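As a rough illustration, the lossless down-sampling that FOCUS performs can be sketched in NumPy (the function name `focus_slice` is ours; the real module follows this slicing with a convolution, omitted here):

```python
import numpy as np

def focus_slice(x):
    # x: (C, H, W) feature map with even H and W.
    # Sample every second pixel four ways and stack along the channel axis,
    # producing a (4C, H/2, W/2) map that keeps every input value.
    return np.concatenate([
        x[:, 0::2, 0::2],  # even rows, even cols
        x[:, 1::2, 0::2],  # odd rows, even cols
        x[:, 0::2, 1::2],  # even rows, odd cols
        x[:, 1::2, 1::2],  # odd rows, odd cols
    ], axis=0)

x = np.arange(2 * 4 * 4, dtype=np.float32).reshape(2, 4, 4)
print(focus_slice(x).shape)  # (8, 2, 2)
```

Because no pixel is discarded, feature maps of different depths can be connected without the information loss of strided convolution or pooling.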

2.2.2. Introduction of CBAM Module for Prediction Heads

The CBAM module was proposed by Woo et al. [22] in 2018 and consists of channel attention and spatial attention modules. It infers the weights of the feature map in the channel and spatial dimensions to improve the extraction of the feature map's key features; the specific structure is shown in Figure 5.
This paper adds a CBAM module to the prediction head of YOLOXS, ensuring that the network can focus on the key information of grape diseases and improve the accuracy of grape disease identification. The network structure is shown in Figure 6.
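A minimal NumPy sketch of the two CBAM stages follows; it is illustrative only: the weights `w1`, `w2`, and `k` are placeholders, and the 7 × 7 convolution of the spatial branch is reduced to two scalar weights for brevity.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(x, w1, w2):
    # x: (C, H, W). A shared two-layer MLP (w1: (r, C), w2: (C, r)) is applied
    # to both the average-pooled and max-pooled channel descriptors.
    avg_w = w2 @ np.maximum(0.0, w1 @ x.mean(axis=(1, 2)))
    max_w = w2 @ np.maximum(0.0, w1 @ x.max(axis=(1, 2)))
    return x * sigmoid(avg_w + max_w)[:, None, None]

def spatial_attention(x, k):
    # Channel-wise average and max maps are combined; the weights k stand in
    # for CBAM's 7x7 convolution over the two stacked maps.
    mask = sigmoid(k[0] * x.mean(axis=0) + k[1] * x.max(axis=0))
    return x * mask[None, :, :]

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 3, 3))
w1, w2 = rng.standard_normal((2, 4)), rng.standard_normal((4, 2))
out = spatial_attention(channel_attention(x, w1, w2), (0.5, 0.5))
print(out.shape)  # (4, 3, 3)
```

The output has the same shape as the input, so the module can be dropped into the prediction head without changing the surrounding architecture.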

2.2.3. Prediction Head Introduces Double Residual Edges

The residual edge was proposed by He et al. [27] in the ResNet network structure; it adds the input features directly to the output features to solve the problem of network degradation caused by the depth of the network model, while preserving the integrity of the original features, as shown in Figure 7.
This paper introduces the idea of residual edges to address problems resulting from the YOLOXS prediction head being located in the deeper part of the network, which can easily lead to network degradation. It is inspired by the fact that human beings need to associate non-key features with key features when recognizing objects, and adds double residual edges (DRES) to the YOLOXS prediction head to strengthen the information of non-key features. The improved prediction head is shown in Figure 8.
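The idea can be sketched as follows. This is a generic illustration, not the exact wiring of Figure 8: `f` and `g` are stand-ins for the convolutional branches of the prediction head.

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def residual(x, f):
    # Single residual edge (ResNet style): the input skips past the branch f,
    # so the original features survive intact.
    return relu(f(x) + x)

def double_residual(x, f, g):
    # Double residual edges (DRES): two stacked shortcut connections, so both
    # the raw input and the intermediate features reach the deeper layers,
    # preserving non-key features alongside the key ones.
    y = relu(f(x) + x)
    return relu(g(y) + y)

x = np.array([1.0, -2.0, 3.0])
print(double_residual(x, lambda v: 0.0 * v, lambda v: 0.0 * v))  # [1. 0. 3.]
```

With zero branches the input passes straight through (up to the ReLU), which is exactly the identity-preserving behaviour that prevents degradation in deep networks.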

3. Experiments and Analysis

3.1. Experimental Environment

The programming language used is Python 3.8.13, the deep learning framework is PyTorch 1.21.1, the development platform is PyCharm 2021.3.2, and the operating system is Linux 4.15.0. The CPU is an Intel(R) Xeon(R) Gold 5218 CPU @ 2.30 GHz with 32 GB of memory; a Tesla T4 GPU with CUDA 11.3 was used for GPU acceleration.

3.2. Datasets and Pre-Processing

3.2.1. Initial Dataset

The dataset used in this paper contains 15 categories covering diseases that may be encountered during grape growth, namely: (1) healthy grapes, (2) healthy leaves, (3) powdery mildew, (4) black rot, (5) Esca, (6) brown spot, (7) black spot, (8) downy mildew, (9) sour rot, (10) deficiency, (11) anthracnose, (12) Botrytis cinerea, (13) felt disease, (14) leaf roll virus, and (15) phylloxera. These images come from the State Key Laboratory of Plant Pest Biology, the authoritative body in China's modern agricultural technology industrial system for grapevine pest prevention and control, which conducts comprehensive and targeted research on grapevine pests and diseases in China and holds complete data on grapevine diseases and control strategies. Figure 9 shows representative images of some grape diseases in the dataset.
As shown in Figure 9, grape anthracnose manifests as small brown circular spots on the fruit surface; grape sour rot manifests as weak brown spots or streaks on the grape surface; grape downy mildew manifests as polygonal yellow–brown spots on the grape leaves; and grape leaf roll manifests as a curling of the growing leaves downward from the leaf margin.
Figure 10 shows the initial data volume of the dataset, with 15 categories and 2566 images. From this figure, it is found that the various types of image data in the original dataset are unbalanced, which may lead to problems such as overfitting and poor generalization of the trained model. Therefore, pre-processing of the original grape disease images is needed.

3.2.2. Data Preprocessing

In this paper, a data augmentation method [28] was used to expand the 15 categories of grape images through rotation, scaling, cropping, and the addition of Gaussian noise, increasing the number of training images while balancing the distribution across categories; the effect on some images after processing is shown in Figure 11. Figure 11a–c correspond to the damage symptoms of grape black spot, grape leaf roll, and grape felt disease, with the first image from the left in each category being the original disease image and the remaining four images being data-augmented versions. After the expansion, the number of images in each category was balanced, and the grape disease dataset reached 11,056 images, as shown in Figure 12.
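The four augmentation operations can be sketched with NumPy alone; this is a simplified stand-in for the actual pipeline, and the noise level (0.05), crop margin, and 2× upscale factor are illustrative choices of ours.

```python
import numpy as np

rng = np.random.default_rng(42)

def augment(img):
    # img: (H, W, 3) float image in [0, 1]; returns four augmented copies.
    rotated = np.rot90(img)                                  # 90-degree rotation
    cropped = img[1:-1, 1:-1]                                # central crop
    scaled = np.repeat(np.repeat(img, 2, axis=0), 2, axis=1) # 2x nearest-neighbor upscale
    noisy = np.clip(img + rng.normal(0.0, 0.05, img.shape), 0.0, 1.0)  # Gaussian noise
    return [rotated, cropped, scaled, noisy]

img = np.full((8, 6, 3), 0.5)
print([a.shape for a in augment(img)])  # [(6, 8, 3), (6, 4, 3), (16, 12, 3), (8, 6, 3)]
```

Each original image yields several variants, which is how a class with few samples can be brought up to the same size as the others.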

3.3. Evaluation Indicator

In this experiment, mAP (mean average precision), FPS (frames per second), and model size are used as model performance evaluation metrics. The precision, recall, and AP underlying mAP are defined in Equations (1)–(3); mAP is the mean of the per-class AP values.
Precision = TP / (TP + FP)      (1)
Recall = TP / (TP + FN)      (2)
AP = ∫0^1 Precision(Recall) d(Recall)      (3)
T and F denote samples classified correctly and incorrectly, respectively, and P and N denote samples predicted to be positive and negative, respectively. TP is the number of positive samples that the classifier predicts as positive; FP is the number of negative samples that the classifier predicts as positive; and FN is the number of positive samples that the classifier predicts as negative.
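Under these definitions, Equations (1)–(3) can be computed as in the following sketch; `average_precision` here integrates a precision–recall curve supplied as sample points sorted by ascending recall (the function names are ours).

```python
def precision_recall(tp, fp, fn):
    # Equations (1) and (2) from the counts defined above.
    return tp / (tp + fp), tp / (tp + fn)

def average_precision(points):
    # Equation (3): numerical integral of precision over recall, with
    # points = [(recall, precision), ...] sorted by ascending recall.
    ap, prev_recall = 0.0, 0.0
    for recall, precision in points:
        ap += precision * (recall - prev_recall)
        prev_recall = recall
    return ap

print(precision_recall(tp=8, fp=2, fn=2))           # (0.8, 0.8)
print(average_precision([(0.5, 1.0), (1.0, 0.8)]))  # 0.9
```

Averaging the per-class AP values over the 15 disease categories then gives the mAP reported in the experiments.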

3.4. Model Training and Parameter Settings

In this paper, the input image has a resolution of 416 × 416, and the model is trained using transfer learning [29]. Training is divided into two phases: pre-training and formal training.
For pre-training, 300 epochs of unfrozen training were performed on the public dataset VOC [30] (Visual Object Classes) to obtain the initial weights for the grape disease identification model. For formal training, the dataset of this paper was divided into training, validation, and test sets in the ratio 8:1:1. Formal training starts with a frozen phase of 50 epochs, followed by an unfrozen phase of 250 epochs, for a total of 300 epochs. The initial learning rates for frozen and unfrozen training were 0.001 and 0.0001, respectively, and the learning rates were adjusted between 0.01 and 0.0001 using cosine annealing decay. The batch sizes for frozen and unfrozen training were set to 16 and 8, respectively, and stochastic gradient descent (SGD) was used as the optimizer. Mosaic data augmentation was used for the first 150 rounds of both pre-training and formal training, but not for the last 150 rounds.
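The cosine annealing decay used here can be sketched with the standard formula (a generic illustration; the exact scheduler configuration of the original experiments is not given in the text):

```python
import math

def cosine_annealed_lr(epoch, total_epochs, lr_max, lr_min):
    # Cosine annealing: start at lr_max and decay smoothly to lr_min.
    cos = math.cos(math.pi * epoch / total_epochs)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + cos)

# Example: a 250-epoch phase decaying within [0.01, 0.0001].
for epoch in (0, 125, 250):
    print(round(cosine_annealed_lr(epoch, 250, 0.01, 0.0001), 5))  # 0.01, 0.00505, 0.0001
```

The smooth decay avoids the abrupt drops of step schedules, which helps fine-tuning converge stably after the frozen phase.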

3.5. Improvement Experiments

3.5.1. Ablation Experiments

In this paper, eight sets of experiments were designed to verify the impact of the three aspects of GFCD-YOLOXS improvement and their combinations on the results of grape disease identification, as shown in Table 1, where “√” indicates that the module was added in this experiment.
As can be seen from Table 1, adding the FOCUS module alone increased the mAP for all disease identification by 1.15% compared with the original YOLOXS, because using the FOCUS network to connect features of different depths while retaining complete features reduces the loss of information related to grape diseases. Introducing the CBAM module alone increased the mAP by 0.41%, as it allowed the network to focus on the key features of grape diseases. Introducing only the double residual edges decreased the mAP by 0.25%, because residual edges on their own pass more noise through the model, reducing the recognition rate. When the CBAM module and DRES were introduced together, the result improved compared with introducing only the CBAM module, because the two modules together allow the model to focus on key features while connecting non-key features for recognition. When all three improvements were used together, the highest mAP of 99.10% was achieved for all grape disease recognition, because the combination not only preserves complete information but also allows the model to connect non-key features while focusing on key ones. In addition, the model size did not increase excessively after the improvements, and the FPS values did not decrease.
To further demonstrate the effectiveness of these three improvements, we plotted the loss function and mAP curves for the validation set during formal training in the ablation experiment. The loss curves are shown in Figure 13: the loss values decrease and level off as the training epoch increases, with the GFCD-YOLOXS model having the lowest loss. The mAP curves are shown in Figure 14: the mAP values gradually increase and level off with increasing training epoch, with GFCD-YOLOXS achieving the highest values. Combining the analyses of Figure 13 and Figure 14, the GFCD-YOLOXS model improved in this paper has the most effective recognition.

3.5.2. A Comparison of Different Attention Mechanisms

In order to verify the superiority of introducing the CBAM module into the grape disease identification model, this paper replaces the CBAM with SE [31] and CA [32] (Coordinate Attention) modules, respectively, based on GFCD-YOLOXS, for comparative experiments. The results are shown in Table 2, where the introduction of the SE attention mechanism resulted in the highest AP values for the identification of grape anthracnose, grape felt, and grape leaf roll; the introduction of the CA attention mechanism resulted in the highest AP values for the identification of grape sour rot, grape deficiency, grape downy mildew, and healthy grapes; and the introduction of the CBAM attention mechanism resulted in the highest AP values for identifying healthy grape leaves, grape phylloxera, and grape powdery mildew. In a comprehensive comparison, the introduction of the CBAM module had the highest mAP value of 99.10% for all disease identification, which was 0.25% higher than the introduction of the SE module and 0.08% higher than the introduction of the CA module. This result indicates that introducing the CBAM module allows the model to more effectively focus on the key features of grape diseases and reduce the influence of noise, thus improving the accuracy of grape disease identification.

3.5.3. Comparison of Residual Edges

In order to verify the superiority of introducing double residual edges into the grape disease identification model, the GFCD-YOLOXS model was compared in separate experiments with a variant without residual edges and with a variant with single residual edges. The results are shown in Table 3. Introducing single residual edges improved the mAP for all disease identification by 0.18% compared to the variant without residual edges, indicating that single residual edges can prevent network degradation. Introducing double residual edges increased the mAP for all disease identification by a further 0.21% compared to single residual edges, suggesting that double residual edges further strengthen non-key features and improve the accuracy of grape disease identification.

3.6. Comparison of the Model in This Paper with Other Models

3.6.1. Comparison with Other Models on This Dataset

In order to verify the feasibility of the GFCD-YOLOXS algorithm, this paper compares the model with the existing classical target detection models Faster R-CNN [33], SSD [34], RetinaNet [35], YOLOV5S, and Eff-B3-YOLOV3 [19]; the experimental results are shown in Table 4. Table 4 shows that the GFCD-YOLOXS recognition method proposed in this paper has the highest accuracy, 99.10%. Moreover, its model is smaller and its detection faster than Faster R-CNN, SSD, RetinaNet, and the previously studied Eff-B3-YOLOV3 model, meaning that it can meet the real-time detection requirements of plant protection equipment.
Furthermore, three pictures showing grape powdery mildew, grape leaf roll, and grape deficiency were selected to compare the detection effects of different models. The experimental results are shown in Figure 15. With the Faster R-CNN model, grape powdery mildew was incorrectly identified, and grape leaf roll and grape deficiency were missed in complex backgrounds. With the SSD model, grape powdery mildew was missed at image boundaries, grape leaf roll was missed in dim environments, and grape deficiency was missed in the presence of occlusion. With the RetinaNet model, grape powdery mildew was missed; the detection of small targets in grape leaf roll and grape deficiency improved, but there were still missed detections. With the YOLOV5S model, grape deficiency was missed for complex backgrounds and small targets. With Eff-B3-YOLOV3, the overall confidence of the identification results was low. In contrast, GFCD-YOLOXS reduces feature loss through the FOCUS network fusing features of different depths, thus reducing missed detections, and the inclusion of CBAM and double residual edges allows the network to focus on key features while linking non-key features, thus improving detection accuracy.

3.6.2. Comparison with Other Literature on Public Datasets

In order to verify the advancement of this model, experiments were conducted using the grape data from the Plant Village dataset, and the results were compared with the latest literature. The comparison results are shown in Table 5, which shows that the GFCD-YOLOXS model has the highest recognition rate and better generalization and robustness compared with the methods proposed in the literature [6,7,14].

3.7. Example of Recognition Results

Typical examples of the results of this model for grape disease identification are shown in Figure 16. The model can successfully identify diseases on grape fruit stalks, such as grape Botrytis cinerea; diseases on grape fruit, such as Botrytis cinerea, sour rot, black spot, powdery mildew, and anthracnose; diseases on grape leaves, such as grape leaf roll and grape downy mildew; and multiple targets of the same species in one image, such as grape leaf roll and healthy grapes.

4. Conclusions

This paper investigates the intelligent identification of grape diseases in their natural state based on an improved YOLOXS. A grape disease dataset in a natural environment was initially constructed. The YOLOXS algorithm was then improved by adding the FOCUS module to the backbone network and introducing CBAM and double residual edges in the prediction head. Finally, a grape disease identification model (GFCD-YOLOXS) with a mAP of 99.10% was obtained through training. The rich experimental and comparative results show that the model in this paper is superior and practical compared to classical models and models in the relevant recent authoritative literature, and has important reference value for the control of grape diseases.
In the future, we will investigate the following aspects: (1) Maintenance and updating of the dataset. The initial dataset for this work was small; it did not include photos from complex conditions such as rain and snow, and deficiency diseases were not broken down further. In the future, we will collect more experimental data, further refine the severity grading of grape diseases, and include non-invasive disease species of grapes to improve the comprehensiveness and reliability of the dataset. (2) Multimodal data fusion. Practical applications may involve multiple data sources, such as infrared images and hyperspectral images. Fusing these data can improve the recognition accuracy and robustness of the model and further improve diagnosis and treatment. (3) Identification of other crop diseases. We will study the disease characteristics of different crops and improve the existing models to apply them to disease identification in other crops. This will help farmers detect and treat crop diseases promptly and enhance the efficiency of agricultural production and the quality of farm output. In conclusion, future research will further explore the characteristics of crop diseases and their detection techniques, continuously improving the models' accuracy, robustness, and practicality to promote the sustainable development of agricultural production.

Author Contributions

Methodology, Y.W. and C.W.; Dataset Acquisition, C.M. and G.M.; Writing—original draft, Y.W.; Writing—review and editing, G.B.; Supervision, C.W. and C.M.; Formal analysis: C.W. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (No. 62072363) and the Natural Science Foundation of Shaanxi Province (No. 2019JM-167).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The dataset for this article is available from Y.W. on request.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Zhou, D.D.; Li, J.; Xiong, R.G.; Saimaiti, A.; Huang, S.Y.; Wu, S.X.; Yang, Z.J.; Shang, A.; Zhao, C.N.; et al. Bioactive compounds, health benefits and food applications of grape. Foods 2022, 11, 2755. [Google Scholar] [CrossRef]
  2. Alkan, A.; Abdullah, M.U.; Abdullah, H.O.; Assaf, M.; Zhou, H. A smart agricultural application: Automated detection of diseases in vine leaves using hybrid deep learning. Turk. J. Agric. For. 2021, 45, 717–729. [Google Scholar] [CrossRef]
  3. Liu, B.; Ding, Z.; Tian, L.; He, D.; Li, S.; Wang, H. Grape leaf disease identification using improved deep convolutional neural networks. Front. Plant Sci. 2020, 11, 1082. [Google Scholar] [CrossRef]
  4. Shantkumari, M.; Uma, S.V. Grape leaf image classification based on machine learning technique for accurate leaf disease detection. Multimed. Tools Appl. 2023, 82, 1477–1487. [Google Scholar] [CrossRef]
  5. Geetharamani, G.; Pandian, A. Identification of plant leaf diseases using a nine-layer deep convolutional neural network. Comput. Electr. Eng. 2019, 76, 323–338. [Google Scholar]
  6. Math, R.K.M.; Dharwadkar, N.V. Early detection and identification of grape diseases using convolutional neural networks. J. Plant Dis. Prot. 2022, 129, 521–532. [Google Scholar] [CrossRef]
  7. Peng, Y.; Zhao, S.; Liu, J. Fused-Deep-Features Based Grape Leaf Disease Diagnosis. Agronomy 2021, 11, 2234. [Google Scholar] [CrossRef]
  8. Yin, X.; Li, W.; Li, Z.; Yi, L. Recognition of grape leaf diseases using MobileNetV3 and deep transfer learning. Int. J. Agric. Biol. Eng. 2022, 15, 184–194. [Google Scholar] [CrossRef]
  9. Yang, R.; Lu, X.; Huang, J.; Zhou, J.; Jiao, J.; Liu, Y.; Liu, F.; Su, B.; Gu, P. A Multi-Source Data Fusion Decision-Making Method for Disease and Pest Detection of Grape Foliage Based on ShuffleNet V2. Remote Sens. 2021, 13, 5102. [Google Scholar] [CrossRef]
  10. Lu, X.; Yang, R.; Zhou, J.; Jiao, J.; Liu, F.; Liu, Y.; Su, B.; Gu, P. A hybrid model of ghost-convolution enlightened transformer for effective diagnosis of grape leaf disease and pest. J. King Saud Univ. Comput. Inf. Sci. 2022, 34, 1755–1767. [Google Scholar] [CrossRef]
  11. Kaur, P.; Harnal, S.; Tiwari, R.; Upadhyay, S.; Bhatia, S.; Mashat, A.; Alabdali, A.M. Recognition of leaf disease using hybrid convolutional neural network by applying feature reduction. Sensors 2022, 22, 575. [Google Scholar] [CrossRef] [PubMed]
  12. Suo, J.; Zhan, J.; Zhou, G.; Chen, A.; Hu, Y.; Li, L. CASM-AMFMNet: A Network Based on Coordinate Attention Shuffle Mechanism and Asymmetric Multi-Scale Fusion Module for Classification of Grape Leaf Diseases. Front. Plant Sci. 2022, 13, 846767. [Google Scholar] [CrossRef]
  13. Ali, A.; Ali, S.; Husnain, M.; Missen, M.M.S.; Khan, M. Detection of deficiency of nutrients in grape leaves using deep network. Math. Probl. Eng. 2022, 2022, 3114525. [Google Scholar] [CrossRef]
  14. Lin, J.; Chen, X.; Pan, R.; Cao, T.; Cai, J.; Chen, Y.; Peng, X.; Cernava, T.; Zhang, X. GrapeNet: A Lightweight Convolutional Neural Network Model for Identification of Grape Leaf Diseases. Agriculture 2022, 12, 887. [Google Scholar] [CrossRef]
  15. Zinonos, Z.; Gkelios, S.; Khalifeh, A.F.; Hadjimitsis, D.G.; Boutalis, Y.S.; Chatzichristofis, S.A. Grape Leaf Diseases Identification System Using Convolutional Neural Networks and LoRa Technology. IEEE Access 2021, 10, 122–133. [Google Scholar] [CrossRef]
  16. Xie, X.; Ma, Y.; Liu, B.; He, J.; Li, S.; Wang, H. A deep-learning-based real-time detector for grape leaf diseases using improved convolutional neural networks. Front. Plant Sci. 2020, 11, 751. [Google Scholar] [CrossRef] [PubMed]
  17. Zhu, J.; Cheng, M.; Wang, Q.; Yuan, H.; Cai, Z. Grape leaf black rot detection based on super-resolution image enhancement and deep learning. Front. Plant Sci. 2021, 12, 1308. [Google Scholar] [CrossRef]
  18. Dwivedi, R.; Dey, S.; Chakraborty, C.; Tiwari, S. Grape disease detection network based on multi-task learning and attention features. IEEE Sens. J. 2021, 21, 17573–17580. [Google Scholar] [CrossRef]
  19. Wang, C.; Qi, X.; Ma, G.; Zhu, L.; Wang, B.X.; Ma, C.S. Artificial intelligence identification system for grape diseases based on YOLO V3. Plant Prot. 2022, 48, 278–288. [Google Scholar] [CrossRef]
20. Ge, Z.; Liu, S.; Wang, F.; Li, Z.; Sun, J. YOLOX: Exceeding YOLO Series in 2021. arXiv 2021, arXiv:2107.08430. [Google Scholar]
21. Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788. [Google Scholar]
22. Woo, S.; Park, J.; Lee, J.Y.; Kweon, I.S. CBAM: Convolutional block attention module. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 3–19. [Google Scholar]
23. Bochkovskiy, A.; Wang, C.Y.; Liao, H.Y.M. YOLOv4: Optimal speed and accuracy of object detection. arXiv 2020, arXiv:2004.10934. [Google Scholar]
24. Lin, T.Y.; Dollár, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature pyramid networks for object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 2117–2125. [Google Scholar]
  25. Wang, W.; Xie, E.; Song, X.; Zang, Y.; Wang, W.; Lu, T.; Yu, G.; Shen, C. Efficient and accurate arbitrary-shaped text detection with pixel aggregation network. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 8440–8449. [Google Scholar]
26. Ultralytics. YOLOv5. Available online: https://github.com/ultralytics/yolov5 (accessed on 21 October 2022).
  27. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
  28. Shorten, C.; Khoshgoftaar, T.M. A survey on image data augmentation for deep learning. J. Big Data 2019, 6, 1–48. [Google Scholar] [CrossRef]
  29. Sharif Razavian, A.; Azizpour, H.; Sullivan, J.; Carlsson, S. CNN features off-the-shelf: An astounding baseline for recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Columbus, OH, USA, 23–28 June 2014; pp. 806–813. [Google Scholar]
  30. Everingham, M.; Eslami, S.M.A.; Van Gool, L.; Williams, C.K.I.; Winn, J.; Zisserman, A. The pascal visual object classes challenge: A retrospective. Int. J. Comput. Vis. 2015, 111, 98–136. [Google Scholar] [CrossRef]
  31. Hu, J.; Shen, L.; Sun, G. Squeeze-and-excitation networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 7132–7141. [Google Scholar]
  32. Hou, Q.; Zhou, D.; Feng, J. Coordinate attention for efficient mobile network design. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Virtual, 19–25 June 2021; pp. 13713–13722. [Google Scholar]
33. Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. In Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada, 7–12 December 2015; Volume 28. [Google Scholar]
34. Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.Y.; Berg, A.C. SSD: Single Shot MultiBox Detector. In Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands, 11–14 October 2016; Springer: Cham, Switzerland, 2016; pp. 21–37. [Google Scholar]
  35. Lin, T.Y.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal loss for dense object detection. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2980–2988. [Google Scholar]
Figure 1. YOLOXS network structure.
Figure 2. GFCD-YOLOXS network structure.
Figure 3. FOCUS module.
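The FOCUS operation illustrated in Figure 3 (as popularized by YOLOv5 [26]) slices the input feature map into four phase-shifted subsamples and stacks them along the channel axis, so spatial detail is moved into channels rather than discarded by a strided convolution. A minimal NumPy sketch of just the slicing step (the convolution that follows in the module is omitted):

```python
import numpy as np

def focus_slice(x: np.ndarray) -> np.ndarray:
    """Rearrange (C, H, W) -> (4C, H/2, W/2) by interleaved slicing.

    The four phase-shifted subsamples keep every input pixel, unlike a
    stride-2 convolution, which samples only a quarter of the positions.
    """
    return np.concatenate(
        [x[:, ::2, ::2],    # top-left phase
         x[:, 1::2, ::2],   # bottom-left phase
         x[:, ::2, 1::2],   # top-right phase
         x[:, 1::2, 1::2]], # bottom-right phase
        axis=0,
    )

x = np.arange(1 * 4 * 4).reshape(1, 4, 4).astype(np.float32)
y = focus_slice(x)
print(y.shape)  # (4, 2, 2)
```

Because every input value reappears exactly once in the output, the downsampling is lossless in the sense that no pixel information is dropped before the convolution sees it.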
Figure 4. Introduction of FOCUS module in the backbone network.
Figure 5. CBAM module.
Figure 6. Introduction of CBAM in prediction head.
Figure 7. Residual structure.
Figure 8. Prediction head introduces double residual edges.
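The residual structure of Figure 7 follows He et al. [27]: the block computes y = F(x) + x, so when the transform F contributes nothing the block falls back to the identity and deep stacks do not degrade. A minimal sketch with a hypothetical two-layer transform standing in for the branch (the exact wiring of the paper's double residual edge is that of Figure 8 and is not reproduced here):

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def residual_block(x, w1, w2):
    """y = relu(F(x) + x): the skip edge lets the block learn a residual.

    With near-zero weights the block approximates the identity mapping,
    which is what keeps gradients flowing through very deep networks.
    """
    return relu(w2 @ relu(w1 @ x) + x)

x = np.ones(3)
w0 = np.zeros((3, 3))
# With a zeroed transform the block reduces to the identity on positive inputs.
print(residual_block(x, w0, w0))  # [1. 1. 1.]
```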
Figure 9. Dataset images. (a) Anthracnose; (b) sour rot; (c) powdery mildew; (d) leaf roll.
Figure 10. Distribution of raw grape disease data.
Figure 11. Enhanced grape disease data. (a) Grape black spot and its data enhancement effects; (b) grape leaf roll disease and its data enhancement effects; and (c) grape felt disease and its data enhancement effects.
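The enhancement of rare disease classes shown in Figure 11 relies on label-preserving image augmentation of the kind surveyed in [28]. A NumPy sketch of a few simple geometric variants (the paper's exact augmentation recipe may include further transforms such as brightness or noise changes):

```python
import numpy as np

def augment(img: np.ndarray) -> list:
    """Return simple geometric variants of an (H, W, C) image.

    Flips and 90-degree rotations preserve disease labels for leaf
    photos, so each rare-class image yields several training samples.
    """
    return [
        np.fliplr(img),      # horizontal flip
        np.flipud(img),      # vertical flip
        np.rot90(img, k=1),  # 90-degree rotation
        np.rot90(img, k=2),  # 180-degree rotation
    ]

img = np.arange(2 * 2 * 3).reshape(2, 2, 3)
variants = augment(img)
print(len(variants))  # 4
```

Applied alongside the originals, four such variants per image are enough to grow a class several-fold, consistent with the expansion of the dataset from 2566 to 11,056 images reported in the abstract.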
Figure 12. Distribution of enhanced grape disease data.
Figure 13. Loss diagram of ablation experiment.
Figure 14. mAP curve of ablation experiment.
Figure 15. Identification results of different models. (a) Recognition effect of the Faster RCNN model; (b) recognition effect of the SSD model; (c) recognition effect of the RetinaNet model; (d) recognition effect of the YOLOV5S model; (e) recognition effect of the Eff-B3-YOLOV3 model; and (f) recognition effect of GFCD-YOLOXS model.
Figure 16. Effect of grape disease identification.
Table 1. Comparison of mAP in ablation experiment.

| Model | FOCUS | CBAM | DRES | mAP | Model Size (M) | FPS |
|---|---|---|---|---|---|---|
| YOLOXS | | | | 97.05% | 34.3 | 64.58 |
| YOLOXS_1 | ✓ | | | 98.20% | 37.0 | 50.21 |
| YOLOXS_2 | | ✓ | | 97.46% | 34.4 | 53.57 |
| YOLOXS_3 | | | ✓ | 96.80% | 34.3 | 64.32 |
| YOLOXS_4 | ✓ | ✓ | | 98.71% | 37.1 | 49.83 |
| YOLOXS_5 | ✓ | | ✓ | 98.22% | 37.0 | 50.14 |
| YOLOXS_6 | | ✓ | ✓ | 97.64% | 34.4 | 53.32 |
| GFCD-YOLOXS | ✓ | ✓ | ✓ | 99.10% | 37.1 | 48.57 |
Table 2. Comparison of different attention mechanism modules.

| Disease Name | SE | CA | CBAM |
|---|---|---|---|
| Sour rot | 95.16% | 95.86% | 95.26% |
| Anthracnose | 99.91% | 96.93% | 99.90% |
| Black rot | 100.00% | 100.00% | 100.00% |
| Black spot | 100.00% | 100.00% | 100.00% |
| Botrytis cinerea | 100.00% | 100.00% | 100.00% |
| Brown spot | 100.00% | 100.00% | 100.00% |
| Deficiency | 95.78% | 97.47% | 96.48% |
| Downy mildew | 98.77% | 99.85% | 99.64% |
| Esca | 100.00% | 100.00% | 100.00% |
| Felt diseases | 98.40% | 97.41% | 98.21% |
| Healthy grape | 99.28% | 99.83% | 99.36% |
| Healthy leaves | 99.20% | 99.06% | 99.69% |
| Leaf roll | 99.18% | 99.04% | 98.11% |
| Phylloxera | 97.06% | 99.86% | 99.90% |
| Powdery mildew | 99.97% | 99.98% | 99.99% |
| All diseases | 98.85% | 99.02% | 99.10% |
Table 3. Comparison of different residual edges.

| Disease Name | No RES | RES | DRES |
|---|---|---|---|
| Sour rot | 93.99% | 93.66% | 95.26% |
| Anthracnose | 99.73% | 98.42% | 99.90% |
| Black rot | 100.00% | 100.00% | 100.00% |
| Black spot | 100.00% | 100.00% | 100.00% |
| Botrytis cinerea | 100.00% | 99.98% | 100.00% |
| Brown spot | 100.00% | 100.00% | 100.00% |
| Deficiency | 96.64% | 96.42% | 96.48% |
| Downy mildew | 99.73% | 99.88% | 99.64% |
| Esca | 100.00% | 99.95% | 100.00% |
| Felt diseases | 98.24% | 98.58% | 98.21% |
| Healthy grape | 99.48% | 98.63% | 99.36% |
| Healthy leaves | 99.07% | 99.36% | 99.69% |
| Leaf roll | 95.14% | 99.18% | 98.11% |
| Phylloxera | 98.71% | 99.34% | 99.90% |
| Powdery mildew | 99.98% | 100.00% | 99.99% |
| All diseases | 98.71% | 98.89% | 99.10% |
Table 4. Results of different models on this paper’s dataset.

| Model | mAP | Model Size (M) | FPS |
|---|---|---|---|
| Faster RCNN | 94.57% | 108 | 17.20 |
| SSD | 90.54% | 100 | 20.12 |
| RetinaNet | 94.60% | 140 | 16.40 |
| YOLOV5S | 95.99% | 28.6 | 60.30 |
| Eff-B3-YOLOV3 | 98.60% | 77.6 | 31.20 |
| GFCD-YOLOXS | 99.10% | 37.1 | 48.57 |
Table 5. Comparison with other literature on public datasets.

| Literature | Model | Precision | F1 | Accuracy |
|---|---|---|---|---|
| Math [6] | Custom Models | 99.34% | 99.44% | 99.32% |
| Peng [7] | CNN + SVM | 99.08% | 99.26% | 99.25% |
| Lin [14] | GrapeNet | 86.29% | 77.76% | 79.05% |
| Ours | GFCD-YOLOXS | 100% | 100% | 100% |

Share and Cite

MDPI and ACS Style

Wang, C.; Wang, Y.; Ma, G.; Bian, G.; Ma, C. Identification of Grape Diseases Based on Improved YOLOXS. Appl. Sci. 2023, 13, 5978. https://doi.org/10.3390/app13105978

