Article

Automatic Disease Detection from Strawberry Leaf Based on Improved YOLOv8

1 College of Mechanical and Electrical Engineering, Fujian Agriculture and Forestry University, Fuzhou 350002, China
2 Fujian Key Laboratory of Agricultural Information Sensoring Technology, Fujian Agriculture and Forestry University, Fuzhou 350002, China
3 College of Computer and Information Sciences, Fujian Agriculture and Forestry University, Fuzhou 350002, China
* Author to whom correspondence should be addressed.
Plants 2024, 13(18), 2556; https://doi.org/10.3390/plants13182556
Submission received: 6 August 2024 / Revised: 29 August 2024 / Accepted: 9 September 2024 / Published: 11 September 2024
(This article belongs to the Section Plant Modeling)

Abstract

Strawberries are susceptible to various diseases during their growth, and their leaves often show the first symptoms. Because these diseases cause yield loss and compromise fruit quality, timely detection is imperative. To automatically identify diseases on strawberry leaves, a KTD-YOLOv8 model is introduced to enhance both accuracy and speed. KernelWarehouse convolution replaces the standard convolution in the YOLOv8 backbone to reduce computational complexity. In addition, the Triplet Attention mechanism is added to fully extract and fuse multi-scale features. Furthermore, a parameter-sharing diverse branch block (DBB) sharing head is constructed to improve the model's ability to process targets at different spatial scales and to increase accuracy without adding much computation. The experimental results show that, compared with the original YOLOv8, the proposed KTD-YOLOv8 increases the mean average precision by 2.8% and reduces the floating-point operations by 38.5%. It provides a new option for guiding intelligent plant monitoring and precision pesticide spraying systems during strawberry cultivation.

1. Introduction

Commercial strawberry production has high economic value [1]. However, strawberry plants are susceptible to various diseases, among which leaf diseases [2] are particularly common. These diseases seriously reduce the yield and quality of strawberry production and may cause substantial economic losses [3]. Because the wide variety of diseases and complex pathogenesis make prevention and control difficult, growers have traditionally relied on visual inspection to detect plant diseases at an early stage [4]. However, because strawberries are affected by many diseases with similar symptoms [5], visual inspection may result in missed detections or misjudgments [6]. With advances in image-processing technology and increases in computational power [7], researchers are paying growing attention to the intelligent diagnosis of crop diseases based on image processing [8]. These techniques not only save time and reduce labor costs but are also more precise and accurate [9]. Consequently, employing image recognition to achieve efficient and timely detection of strawberry diseases [10] will boost strawberry production, enable more precise application of pesticides, and provide more reliable data for agricultural administration [11,12,13].
In recent years, machine learning has gradually become a common approach in plant disease detection [14]. K-means clustering was used to separate the disease-affected area of grape leaves from the healthy area, and features were extracted based on three color models: RGB, HSV, and L*a*b*. A support vector machine (SVM) was then used to diagnose and classify the leaf diseases, achieving an accuracy of 98.71% [15]. A K-Nearest Neighbor (KNN) classifier was also adopted to recognize affected maize leaves with an accuracy of 90% [16]. However, these methods rely on agricultural experts' experience to design disease features, which is time-consuming and limits recognition accuracy. Compared with traditional methods, deep learning techniques can shorten recognition time and improve accuracy [17]. For example, DenseNet201 and an SVM were used together to recognize maize leaf diseases, achieving a recognition rate of 94.9% [18]. To detect small-target lepidopteran pests on leaves, MD-YOLO was proposed, adding a DenseNet block and an adaptive attention module to the feature extraction process and reaching an accuracy of 86.2% [19]. An updated Faster R-CNN architecture was proposed [20] for the automatic detection of beet leaf spot disease and achieved an overall correct classification rate of 95.48% on 155 processed images.
For the detection of strawberry diseases, a new Faster R-CNN architecture was proposed to deal with the complex background and small lesions in images of strawberry diseases [21]. Its model can effectively extract abundant image features of strawberry diseases and reach an average accuracy of 92.18% on their self-built dataset. Four convolutional neural network models (SqueezeNet, EfficientNet-B3, VGG-16, and AlexNet) were trained to classify healthy and leaf scorch-infected strawberry plants [22], and the EfficientNet-B3 model achieved the highest accuracy of 80% among these four models. Moreover, the DAC-YOLOv4 was proposed to detect powdery mildew and infected leaves of strawberries under complex backgrounds [23]. The original YOLOv4 backbone and neck are replaced with deep convolution and a hybrid attention mechanism. By using a combination of convolutional block layer (CBL) and depthwise-convolutional block layer (DW-CBL) structures to simplify the convolution, the DAC-YOLOv4 can achieve an average accuracy of 72.7%.
These studies provide good ideas for image-based crop disease detection. However, several practical problems remain in strawberry disease detection under real-world conditions: (1) Poor quality of training datasets. Most existing training datasets in this field are derived from single-background images taken in the laboratory, and models trained on such data struggle to adapt to recognition tasks under complex real-world conditions. (2) Insufficient accuracy. The features of some strawberry leaf diseases are nearly indistinguishable; their high degree of similarity makes them difficult for a neural network to classify, lowering recognition accuracy. (3) High computational complexity. Leaf disease areas are small and their symptoms are not obvious, so a large feature space is needed for classification. This requires substantial computing resources and makes deployment on mobile devices difficult.
To solve these problems, the following key improvements are made to the original YOLOv8 for strawberry leaf disease detection: (1) KernelWarehouse [24] dynamic convolution replaces the original convolutions in the backbone network and in the C2f (CSPLayer_2Conv) function; because KWConv replaces Conv in the C2f function, the modified module is named KW-C2f. This module extracts the texture information of leaf diseases more efficiently and significantly reduces computational complexity. (2) The Triplet Attention mechanism [25] is adopted, which enhances the model's feature extraction ability at different spatial scales and further improves accuracy. (3) A parameter-sharing DBB (diverse branch block [26]) sharing head is constructed to help the model extract feature information more efficiently without significantly increasing computational complexity. For the dataset, we collected pictures of strawberry diseases in real plantations and increased its generalization via data enhancement. Finally, ablation studies and comparative experiments were conducted to confirm the performance of the algorithm. The results show that the introduced algorithm performs better for strawberry disease detection in complex real-world circumstances. After engineering optimization, the proposed algorithm can be deployed on a self-propelled field robot with a digital camera, detecting strawberry diseases from the camera's real-time video.

2. Results and Discussion

The most common evaluation metrics in deep learning were selected to evaluate the algorithm: precision (P), recall (R), mean average precision (mAP), and floating-point operations (FLOPs). Precision is the proportion of true positives among all detections, while recall is the proportion of actual targets that are detected. The average precision (AP) is calculated from precision and recall: the area under the precision–recall (P–R) curve is the AP value. mAP is the mean of the AP values over all categories. It is the main evaluation index for object detection and reflects the model's overall performance; the higher the value, the better the model performs. FLOPs measure the computational cost of the model; the lower the value, the less computation the model requires. Furthermore, [email protected] denotes the mAP value when the Intersection over Union (IoU) threshold is set to 0.5. The formulas are shown in Equations (1) and (2).
AP = \int_{0}^{1} P(R)\, dR \quad (1)

mAP = \frac{1}{n} \sum_{i=1}^{n} AP_i \quad (2)
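The two metrics can be computed from a precision–recall curve as in the following sketch. The VOC-style monotone envelope used here is one common convention; the paper does not state its exact interpolation scheme, so this is an illustration rather than the authors' implementation:

```python
import numpy as np

def average_precision(recall, precision):
    """Area under the precision-recall curve (Equation (1)),
    computed by step integration over recall."""
    # Sentinel points so the curve spans recall 0..1.
    r = np.concatenate(([0.0], recall, [1.0]))
    p = np.concatenate(([0.0], precision, [0.0]))
    # Make precision monotonically non-increasing (VOC-style envelope).
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    # Sum rectangular areas where recall changes.
    idx = np.where(r[1:] != r[:-1])[0]
    return float(np.sum((r[idx + 1] - r[idx]) * p[idx + 1]))

def mean_average_precision(ap_per_class):
    """mAP (Equation (2)): mean of the per-class AP values."""
    return float(np.mean(ap_per_class))
```

A perfect detector (precision 1.0 at every recall level) yields AP = 1.0, and mAP simply averages such per-class values.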
In the overall recognition experiments, the testing dataset was input into the introduced KTD-YOLOv8 model, and the results are shown in Figure 1 and Figure 2. The precision of the method was 0.90, the recall was 0.813, its mAP@0.5 was 0.897, and the computational cost was 17.7 GFLOPs. The results show that KTD-YOLOv8 increases the confidence level, effectively improving detection accuracy in the actual environment while reducing computation.
To further verify the effectiveness of the individual modules in KTD-YOLOv8, the YOLOv8s model was used as the baseline for comparison. Ablation experiments were executed on the dataset by adding modules to the baseline network one by one, and the results are shown in Table 1, Figure 3, and Figure 4. Compared with the original YOLOv8s model, the Triplet Attention mechanism makes the most significant contribution to accuracy, increasing mAP@0.5 by 2.4% while reducing the computational load by 0.3 GFLOPs. The constructed DBB sharing head makes the most significant contribution to inference time, reducing it from 13.1 ms to 11.1 ms; it also increases mAP@0.5 by 1.2%, at the cost of an increase in computational load and parameter count. Using KernelWarehouse to replace the base convolution reduces the computational load by 14.6 GFLOPs, and mAP@0.5 also increases by 1%. Figure 3 shows the mAP@0.5 curves for this experiment, and the overall loss comparison is shown in Figure 4.
On real strawberry leaves, some disease spots occupy only a tiny portion of the leaf. Thus, the baseline 3 × 3 convolution kernel cannot extract enough information from the target, while larger convolution kernels cost more computational resources and may ignore fine details. The KWConv mechanism addresses this by linearly mixing several static convolution kernels: it increases the number of kernels while reducing the dimension of each, significantly cutting the amount of computation and improving accuracy. To evaluate its effectiveness, experiments were executed involving DySnakeConv [27], SPDConv [28], and KWConv. The baseline convolution component was replaced with these algorithms one by one, and the models were trained on the self-constructed dataset with the same training parameters as before. The experimental results are shown in Table 2. Replacing the base convolution with SPDConv yields the highest mAP@0.5, reaching 88.1%; however, it increases the computation significantly, to 43 GFLOPs. Replacing the base convolution with KWConv increases mAP@0.5 to 87.9% while requiring only 14.2 GFLOPs.
To verify the effectiveness of the constructed DBB sharing head, the Pose Head [29] and Aux Head [30] modules were selected for comparison. In the actual environment, disease areas are often confused with the background, so the detection head needs to output a large amount of prediction information. The original YOLOv8 has a decoupled head structure, which occupies a large portion of the whole model's computation. To obtain higher accuracy without significantly increasing computational complexity, the DBB sharing head was constructed in this study. The experimental results are shown in Table 3. The Aux Head, the Pose Head, and the DBB sharing head have a nearly equal effect on improving the model, each reaching about 88% mAP@0.5; considering their computational requirements, the DBB sharing head has the advantage of saving system resources.
The attention mechanism can help the model distinguish similar diseases. The Triplet Attention mechanism introduced in this study integrates information across different dimensions through a series of rotations and permutations to better capture the intrinsic characteristics of the disease area. Other mechanisms, such as SimAM [31] and CPCA [32], were added for comparison. The attention modules were inserted into the network one by one, with the same experimental parameters as before. According to the results shown in Table 4, adding SimAM reduces the model's computation to 28.4 GFLOPs, but mAP@0.5 is not significantly improved. Triplet Attention increases mAP@0.5 to 89.3% and reduces the computation to 28.5 GFLOPs, only 0.1 GFLOPs more than SimAM. The experiment shows that the Triplet Attention mechanism performs well in practical strawberry disease detection.
To further validate the detection performance of KTD-YOLOv8, the algorithm was compared with YOLOv5, YOLOv6, YOLOv7, YOLOv8, and YOLOv9, with the results shown in Table 5. Over 100 training epochs, YOLOv5, YOLOv8, and KTD-YOLOv8 did not increase the parameter count much. Among these three algorithms, KTD-YOLOv8 acquired the highest mAP@0.5 of 89.7% and the shortest inference time of 12.1 ms, showing the best overall performance. Furthermore, these five algorithms and KTD-YOLOv8 were used to detect five kinds of strawberry diseases, and the comparative results are shown in Figure 5. KTD-YOLOv8 detected targets more accurately and had a higher average confidence than the other algorithms. In the detection of powdery mildew, with its small disease features and overlapping leaves, there were many missed objects in the outputs of the YOLOv5, YOLOv6, YOLOv7, and YOLOv8 models and one missed object for YOLOv9, while KTD-YOLOv8 did not miss any detections. In the detection of bacterial leaf blight against a complex background, YOLOv9 falsely detected wilted leaves, while KTD-YOLOv8 showed good robustness. KTD-YOLOv8 also performed better at recognizing infested leaves in the disease scene and obtained a high confidence level.
Through the experiments, it was found that the improved KTD-YOLOv8 increases mAP@0.5 by 2.8% and reduces the computational load by 11.1 GFLOPs, as shown in Table 1. The introduced KWConv component reduces the computational load by 14.6 GFLOPs and increases mAP@0.5 by 1%. This is mainly due to the reduction of the kernel dimension and the increase in kernel number, as well as the strengthened dependence of convolution parameters within the same layer and between consecutive layers. As shown in Table 4, the Triplet Attention mechanism increases mAP@0.5 by 2.4% and reduces the computational load by 0.3 GFLOPs; it integrates multi-scale features better and improves extraction for small targets. As shown in Table 1, the DBB sharing head improves the mAP@0.5 of the baseline model by 1.2% and decreases inference time, though it increases the computational load by 3 GFLOPs and the number of parameters by 0.214 × 10⁷. This is because the DBB module adopts a multi-branch structure during training and fuses the branches into a single main branch for inference.

3. Materials and Methods

3.1. Image Dataset

The dataset in this study was collected from a strawberry orchard in Ganzhou City, Jiangxi Province, China, between 2:00 and 5:00 p.m. on 9–24 February 2023. This period falls within the dormant stage of strawberries, when the plants bear many leaves. The strawberry leaves were photographed at a distance of 40 cm. After screening, 823 images containing strawberry leaf disease information comprised the dataset, covering five kinds of leaf diseases. The samples are shown in Figure 6.

3.2. Image Enhancement

Too few training images may lead to overfitting or non-convergence of a deep learning algorithm. Therefore, we increased the number of images in the dataset through image enhancement to overcome this defect. We labeled the dataset with LabelImg software (version 1.8.6) and generated XML files in the PASCAL VOC [33] dataset format. We then processed the images with image-augmentation software using two randomly selected methods from the general image-enhancement functions, including rotating the images from multiple angles, adjusting the brightness, blurring, and adding Gaussian noise. A total of 5714 images were obtained through this enhancement, and the dataset was divided according to the ratio 6:2:2: 3428 images in the training subset, 1143 in the validation subset, and 1143 in the testing subset. The number of images for each disease type is shown in Figure 7.
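The enhancement-and-split procedure above can be sketched as follows. The operation names in `AUG_OPS` and both helper functions are illustrative placeholders, not the authors' actual tooling, and simple proportional rounding may differ by a few images from the exact 3428/1143/1143 counts reported:

```python
import random

# Hypothetical names for the general enhancement functions mentioned above.
AUG_OPS = ["rotate", "brightness", "blur", "gaussian_noise"]

def pick_augmentations(rng=random):
    """Randomly select two distinct enhancement methods per image."""
    return rng.sample(AUG_OPS, 2)

def split_dataset(items, ratios=(0.6, 0.2, 0.2), seed=0):
    """Shuffle and split a list of images into train/val/test
    subsets according to the 6:2:2 ratio."""
    rng = random.Random(seed)
    items = list(items)
    rng.shuffle(items)
    n = len(items)
    n_train = int(n * ratios[0])
    n_val = int(n * ratios[1])
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])
```

For example, splitting 10 items this way yields subsets of 6, 2, and 2 images.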

3.3. Experimental Platform

The experiments used the SGD optimizer, and training was set to 100 epochs. The input image size was 640 × 640 pixels, the learning rate was set to 0.01, and the batch size was 16. To avoid overfitting and improve the model's generalization ability, mosaic data augmentation was set to the YOLOv8 default of 1, and mix-up was not used. To ensure the reliability and accuracy of the experimental results, all experiments were trained without using pre-trained weights. The operating system was Windows 10, with an Intel(R) Core(TM) i5-12400F CPU, an RTX3060 GPU, PyTorch 2.1.1, and CUDA 12.3. Under this experimental platform, the training time of the original YOLOv8s was 146 min, and its number of parameters was 1.113 × 10⁷.
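The configuration above corresponds roughly to the following Ultralytics training call. This is an untested sketch, not the authors' script: the dataset YAML path `strawberry.yaml` is a hypothetical placeholder, and `pretrained=False` reflects the no-pre-trained-weights setting stated above.

```python
from ultralytics import YOLO

# Build YOLOv8s from its architecture YAML (random initialization,
# matching the no-pretrained-weights setting described above).
model = YOLO("yolov8s.yaml")

# Hyperparameters as reported: SGD, 100 epochs, 640x640 input,
# learning rate 0.01, batch 16, mosaic on (1.0), mix-up off.
model.train(
    data="strawberry.yaml",  # hypothetical dataset config path
    epochs=100,
    imgsz=640,
    batch=16,
    optimizer="SGD",
    lr0=0.01,
    mosaic=1.0,
    mixup=0.0,
    pretrained=False,
)
```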

3.4. KTD-YOLOv8 Model

Currently, YOLO networks are widely used in agricultural detection and segmentation tasks. YOLOv8 is characterized by fast detection and a lightweight design, making it suitable for embedded devices. Compared with YOLOv5, YOLOv8 has advantages in several respects. In the backbone, YOLOv8 adopts the DarkNet53 structure for bottom-up feature extraction and uses the C2f module in place of YOLOv5's C3 module, improving feature extraction efficiency and accuracy. In the neck, the Path Aggregation Network (PANet) enhances the model's feature fusion and multi-scale detection ability. In the head, YOLOv8 adopts a decoupled structure that processes the classification and localization tasks separately, reducing complexity and improving accuracy. However, the actual strawberry leaf disease detection task is often affected by background interference, which increases the computational load; given the limited resources of embedded devices, the original YOLOv8 struggles to balance accuracy and computation. To solve this problem, this study introduces three improvements based on the YOLOv8s model. First, in the backbone network, KWConv replaces the Conv function of YOLOv8, and KW-C2f replaces the C2f function. Second, the Triplet Attention mechanism is added after each KW-C2f operation. Third, a DBB sharing head with parameter sharing is constructed. These improvements aim to improve recognition accuracy while keeping the model lightweight. The network structure of KTD-YOLOv8 is shown in Figure 8.
Each improving component is described below:
  1. KernelWarehouse convolution (KWConv)
The structure of the KernelWarehouse convolution is shown in Figure 9. Its operation can be divided into two steps: kernel partition and warehouse sharing. In the kernel partition step, KernelWarehouse first divides the static kernel W in the convolution layer into m non-overlapping parts equally (m is set to 16 in this paper), w1, …, wm, which are called “kernel units”. Through these kernel units, kernel partition can be defined as
W = w_1 \cup \cdots \cup w_m, \quad \forall i, j \in \{1, \ldots, m\},\ i \neq j,\ w_i \cap w_j = \varnothing \quad (3)
To obtain a high degree of freedom, each kernel unit of the static convolution kernel W is expressed as a linear mixture of shared basis units, and the mixtures are recombined to form the new convolution kernel used by KWConv. The linear mixture is generated as follows. The kernel units w1, …, wm are regarded as "local" units, and a "warehouse" E = {e1, …, en} containing n units is defined, where e1, …, en have the same dimensions as w1, …, wm. Each kernel unit can then be defined as
w_i = \alpha_{i1} e_1 + \cdots + \alpha_{in} e_n, \quad i \in \{1, \ldots, m\} \quad (4)
where αi1, …, αin are the scalar attention elements generated by the new attention module (NAF) based on the input x.
For the i-th kernel unit in the static convolution kernel W, the attention function can be defined as
\alpha_{ij} = \frac{(1 - \tau)\, Z_{ij}}{\sum_{p=1}^{n} Z_{ip}} + \tau \beta_{ij}, \quad j \in \{1, \ldots, n\} \quad (5)
where τ is a temperature parameter that decreases linearly from 1 to 0 during the training phase, and βij is a binary value (0 or 1) used to initialize the attention. The feature scalars Zi1, …, Zin are generated as follows. First, for any convolutional layer with a static convolution kernel W, a global average pooling (GAP) operation maps the input to a feature vector. The vector then passes through a fully connected (FC) layer, a rectified linear unit (ReLU), a second FC layer, and the attention function. The first FC layer reduces the length of the feature vector to 1/16 of the original, and the second FC layer generates the m groups of feature scalars Zi1, …, Zin.
In the warehouse sharing operation, if the same kernel unit size is used, multiple adjacent KWConv convolutional layers can share the same warehouse, which enhances efficiency and representation capability. In the YOLOv8 network, the C2f function reduces the resolution of the feature map and extracts more representative features. Within the C2f module, the Bottleneck part reduces computation by decreasing the number of channels in the feature map, as shown in Figure 10. The base convolutions in the C2f and Bottleneck parts are replaced by KWConv, and the replaced modules are named KW-C2f and KW-Bottleneck, respectively. KWConv reduces computation by dynamically adjusting the weights of the convolution kernel, and it uses global average pooling and attention to make the extracted features more representative.
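The kernel mixture of Equations (4) and (5) can be sketched in NumPy as follows. This is a minimal illustration that treats the feature scalars Z as given; in the real module they are produced by the GAP–FC–ReLU–FC pipeline described above:

```python
import numpy as np

def kw_attention(z, tau, beta):
    """Temperature-annealed attention (Equation (5)):
    alpha_ij = (1 - tau) * Z_ij / sum_p Z_ip + tau * beta_ij.
    z, beta: arrays of shape (m, n)."""
    z = np.asarray(z, dtype=float)
    norm = z / z.sum(axis=1, keepdims=True)
    return (1.0 - tau) * norm + tau * np.asarray(beta, dtype=float)

def kw_mix_kernels(warehouse, alpha):
    """Assemble m kernel units as linear mixtures of the n warehouse
    cells (Equation (4)): w_i = sum_j alpha_ij * e_j.
    warehouse: (n, *kernel_shape); alpha: (m, n)."""
    n = warehouse.shape[0]
    flat = warehouse.reshape(n, -1)   # (n, d): flatten each cell
    mixed = alpha @ flat              # (m, d): linear mixtures
    return mixed.reshape((alpha.shape[0],) + warehouse.shape[1:])
```

At the start of training (τ = 1) the attention reduces to the binary initialization β, and at the end (τ = 0) it is a normalized function of the learned scalars, matching the linear schedule described above.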
  2. Triplet Attention mechanism
The attention mechanism can help the model filter the target region with important information from many irrelevant background areas. In the actual environment, the spots of strawberry leaves only occupy a small portion of the image, so a specific attention mechanism is needed to deal with the target object information. The Triplet Attention mechanism is selected in this work to process the feature map (C × H × W), where C is the number of channels, H is the height, and W is the width.
The high branch computes the attention weights between the channel dimension C and the spatial dimension W. It rotates the input feature 90 degrees counterclockwise, and the rotated feature satisfies F̂ ∈ ℝ^(C×H×W). A Z-pooling operation is then performed: Z-pooling takes the maximum and the average of the input tensor along the H-axis, concatenates the results, and maintains the original shape by using convolutional layers and Sigmoid activation functions. Finally, the output is rotated 90 degrees clockwise around the H-axis. The central branch captures the dependencies between the channel dimension C and the spatial dimensions H and W. It first rotates the input feature F 90 degrees counterclockwise around the W-axis, with the rotated feature satisfying F̂ ∈ ℝ^(C×H×W); Z-pooling and convolution operations are then performed, and the attention weights are generated by a Sigmoid function. The output is finally rotated 90 degrees clockwise around the W-axis to restore the original shape. The low branch captures the dependencies between the spatial dimensions; it keeps the input unchanged, performs Z-pooling and convolution operations, and generates the attention weights through the Sigmoid function. The Z-pooling formula is shown in Equation (6).
Z\text{-}Pool(x) = \big[\mathrm{MaxPool}_{0d}(x),\ \mathrm{AvgPool}_{0d}(x)\big] \quad (6)
Each branch aligns its output after generating the attention weights (permutation). The outputs of the three branches are then averaged (Avg) to obtain the Triplet Attention output. In this way, Triplet Attention enhances the network's attention to strawberry leaf spot diseases and improves the efficiency and efficacy of the model. The schematic diagram of the Triplet Attention mechanism is shown in Figure 11.
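Equation (6) and a single Sigmoid-gated branch can be sketched as follows. The `conv` argument stands in for the learned convolution that maps the 2-channel Z-pool output to a 1-channel map, and the rotations between branches are omitted for brevity:

```python
import numpy as np

def z_pool(x):
    """Z-pool (Equation (6)): stack the max- and mean-pooled maps of
    a (C, H, W) tensor along the channel axis, giving 2 channels."""
    return np.stack([x.max(axis=0), x.mean(axis=0)])

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def branch_attention(x, conv):
    """One Triplet Attention branch: Z-pool, a conv (any callable
    mapping the 2-channel map to an (H, W) map), then a Sigmoid
    gate applied multiplicatively to the input."""
    gate = sigmoid(conv(z_pool(x)))   # (H, W) attention map
    return x * gate                   # broadcast over channels
```

With an untrained (all-zero) conv the gate is uniformly 0.5, i.e. the branch initially passes a scaled copy of its input.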
  3. DBB Sharing Head
The DBB sharing head is constructed to improve the feature extraction ability of the model and reduce computation. DBB separates training from inference and uses a single convolution to combine different branches. During training, DBB takes a multi-branch structure in which each branch extracts a different type of feature: it uses 1 × 1, 1 × 1–K × K, 1 × 1–average pooling, and K × K branches in place of a single regular K × K convolution to augment the original layer. After features are extracted from the multiple branches, they are fused into a final comprehensive output. Before the inference stage, the multi-branch structure is reparameterized and fused into one main branch to save time. Because the DBB structure can be equivalently converted into a single convolutional layer for deployment, it trades longer training time for higher accuracy while having relatively little impact on inference time. The structure of the DBB training and inference process is shown in Figure 12a.
The YOLOv8 head structure and the DBB sharing head structure are shown in Figure 12b. Using DBB to replace the original convolutions of the YOLOv8 head improves model accuracy without significantly increasing computational complexity. Moreover, to further reduce computation, a parameter-sharing approach is adopted: two DBB modules replace the four original convolutions in the YOLOv8 decoupled head to improve efficiency.
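The reparameterization idea can be illustrated for the simplest case: folding a parallel 1 × 1 branch (without batch normalization) into the K × K branch. The full DBB transform also handles BN, the 1 × 1–K × K sequence, and the average-pooling branch, so this is only a sketch of the additivity principle:

```python
import numpy as np

def fuse_1x1_into_kxk(k1x1, kxk):
    """Fold a parallel 1x1 branch into a KxK branch: a 1x1 conv is
    equivalent to a KxK conv whose kernel is zero everywhere except
    at the center tap, so the two branches sum into one kernel
    (convolution is additive in its kernel).
    k1x1: (C_out, C_in, 1, 1); kxk: (C_out, C_in, K, K); K odd."""
    fused = kxk.copy()
    c = kxk.shape[-1] // 2             # center tap of the KxK kernel
    fused[:, :, c, c] += k1x1[:, :, 0, 0]
    return fused
```

Running both branches and summing their outputs is then numerically identical to running one convolution with the fused kernel, which is why the merge costs nothing at inference time.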

4. Conclusions

To improve the recognition of strawberry leaf diseases, a KTD-YOLOv8 method is introduced in this work. Experimental results show that, compared with the original YOLOv8, KTD-YOLOv8 improves mAP@0.5 from 86.9% to 89.7%, reduces the model inference time from 13.1 ms to 12.1 ms, and reduces the computational load from 28.8 GFLOPs to 17.7 GFLOPs, a reduction of 38.5%. It thus achieves better performance in strawberry leaf disease recognition with higher accuracy and lower computation, which is especially valuable for mobile embedded devices. The detection results can help farmers detect and control diseases early and gain more profit from growing strawberries.

Author Contributions

Conceptualization, Y.H. and T.Z.; methodology, Y.H., T.Z., Y.P. and Y.Z.; validation, Y.P. and C.W.; investigation, Y.H., T.Z., C.Y. and Y.P.; writing—original draft preparation, Y.H. and T.Z.; writing—review and editing, Y.H., T.Z. and C.Y.; supervision, T.Z. and Y.Z.; funding acquisition, T.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This project is supported by the Natural Science Foundation of Fujian Province in China (2023J01470) and (2022J01609) and the Science and Technology Innovation Special Fund of Fujian Agriculture and Forestry University (CXZX2020132B).

Data Availability Statement

The authors confirm that all data underlying the findings of this work are available within this manuscript. Raw data that support the outcome of this study are available from the corresponding authors, upon reasonable request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Giampieri, F.; Tulipani, S.; Alvarez-Suarez, J.M.; Quiles, J.L.; Mezzetti, B.; Battino, M. The strawberry: Composition, nutritional quality, and impact on human health. Nutrition 2012, 28, 9–19.
  2. Sarkar, C.; Gupta, D.; Gupta, U.; Hazarika, B.B. Leaf disease detection using machine learning and deep learning: Review and challenges. Appl. Soft Comput. 2023, 145, 110534.
  3. Li, G.; Jiao, L.; Chen, P.; Liu, K.; Wang, R.; Dong, S.; Kang, C. Spatial convolutional self-attention-based transformer module for strawberry disease identification under complex background. Comput. Electron. Agric. 2023, 212, 108121.
  4. Camargo, A.; Smith, J. Image pattern classification for the identification of disease causing agents in plants. Comput. Electron. Agric. 2009, 66, 121–125.
  5. Maas, J.L. Strawberry disease management. In Diseases of Fruits and Vegetables: Volume II: Diagnosis and Management; Springer: Dordrecht, The Netherlands, 2004; pp. 441–483.
  6. You, J.; Jiang, K.; Lee, J. Deep metric learning-based strawberry disease detection with unknowns. Front. Plant Sci. 2022, 13, 891785.
  7. Li, Z.; Li, B.-x.; Li, Z.-h.; Zhan, Y.-f.; Wang, L.-h.; Gong, Q. Research progress in crop disease and pest identification based on deep learning. Hubei Agric. Sci. 2023, 62, 165.
  8. Bharate, A.A.; Shirdhonkar, M. A review on plant disease detection using image processing. In Proceedings of the 2017 International Conference on Intelligent Sustainable Systems (ICISS), Palladam, India, 7–8 December 2017; pp. 103–109.
  9. Radhakrishnan, S. An improved machine learning algorithm for predicting blast disease in paddy crop. Mater. Today Proc. 2020, 33, 682–686.
  10. Thakur, P.S.; Khanna, P.; Sheorey, T.; Ojha, A. Trends in vision-based machine learning techniques for plant disease identification: A systematic review. Expert Syst. Appl. 2022, 208, 118117.
  11. Liu, J.; Abbas, I.; Noor, R.S. Development of deep learning-based variable rate agrochemical spraying system for targeted weeds control in strawberry crop. Agronomy 2021, 11, 1480.
  12. Meshram, A.T.; Vanalkar, A.V.; Kalambe, K.B.; Badar, A.M. Pesticide spraying robot for precision agriculture: A categorical literature review and future trends. J. Field Robot. 2022, 39, 153–171.
  13. Wu, G.; Fang, Y.; Jiang, Q.; Cui, M.; Li, N.; Ou, Y.; Diao, Z.; Zhang, B. Early identification of strawberry leaves disease utilizing hyperspectral imaging combing with spectral features, multiple vegetation indices and textural features. Comput. Electron. Agric. 2023, 204, 107553.
  14. Aggarwal, M.; Khullar, V.; Goyal, N. Exploring classification of rice leaf diseases using machine learning and deep learning. In Proceedings of the 2023 3rd International Conference on Innovative Practices in Technology and Management (ICIPTM), Uttar Pradesh, India, 22–24 February 2023; pp. 1–6.
  15. Javidan, S.M.; Banakar, A.; Vakilian, K.A.; Ampatzidis, Y. Diagnosis of grape leaf diseases using automatic K-means clustering and machine learning. Smart Agric. Technol. 2023, 3, 100081.
  16. Zhang, S.; Shang, Y.; Wang, L. Plant disease recognition based on plant leaf image. J. Anim. Plant Sci. 2015, 25, 42–45.
  17. Dargan, S.; Kumar, M.; Ayyagari, M.R.; Kumar, G. A survey of deep learning and its applications: A new paradigm to machine learning. Arch. Comput. Methods Eng. 2020, 27, 1071–1092.
  18. Dash, A.; Sethy, P.K.; Behera, S.K. Maize disease identification based on optimized support vector machine using deep feature of DenseNet201. J. Agric. Food Res. 2023, 14, 100824.
  19. Tian, Y.; Wang, S.; Li, E.; Yang, G.; Liang, Z.; Tan, M. MD-YOLO: Multi-scale Dense YOLO for small target pest detection. Comput. Electron. Agric. 2023, 213, 108233. [Google Scholar] [CrossRef]
  20. Ozguven, M.M.; Adem, K. Automatic detection and classification of leaf spot disease in sugar beet using deep learning algorithms. Phys. A Stat. Mech. Its Appl. 2019, 535, 122537. [Google Scholar] [CrossRef]
  21. Zhao, S.; Liu, J.; Wu, S. Multiple disease detection method for greenhouse-cultivated strawberry based on multiscale feature fusion Faster R_CNN. Comput. Electron. Agric. 2022, 199, 107176. [Google Scholar] [CrossRef]
  22. Abbas, I.; Liu, J.; Amin, M.; Tariq, A.; Tunio, M.H. Strawberry fungal leaf scorch disease identification in real-time strawberry field using deep learning architectures. Plants 2021, 10, 2643. [Google Scholar] [CrossRef]
  23. Li, Y.; Wang, J.; Wu, H.; Yu, Y.; Sun, H.; Zhang, H. Detection of powdery mildew on strawberry leaves based on DAC-YOLOv4 model. Comput. Electron. Agric. 2022, 202, 107418. [Google Scholar] [CrossRef]
  24. Li, C.; Yao, A. KernelWarehouse: Towards Parameter-Efficient Dynamic Convolution. arXiv 2023, arXiv:2308.08361. [Google Scholar]
  25. Misra, D.; Nalamada, T.; Arasanipalai, A.U.; Hou, Q. Rotate to attend: Convolutional triplet attention module. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Virtual, 5–9 January 2021; pp. 3139–3148. [Google Scholar]
  26. Ding, X.; Zhang, X.; Han, J.; Ding, G. Diverse branch block: Building a convolution as an inception-like unit. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 10886–10895. [Google Scholar]
  27. Qi, Y.; He, Y.; Qi, X.; Zhang, Y.; Yang, G. Dynamic snake convolution based on topological geometric constraints for tubular structure segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Paris, France, 2–6 October 2023; pp. 6070–6079. [Google Scholar]
  28. Sunkara, R.; Luo, T. No more strided convolutions or pooling: A new CNN building block for low-resolution images and small objects. In Proceedings of the Joint European Conference on Machine Learning and Knowledge Discovery in Databases, Grenoble, France, 19–23 September 2022; pp. 443–459. [Google Scholar]
  29. Jin, G.; Taniguchi, R.-I.; Qu, F. Auxiliary detection head for one-stage object detection. IEEE Access 2020, 8, 85740–85749. [Google Scholar] [CrossRef]
  30. Maji, D.; Nagori, S.; Mathew, M.; Poddar, D. Yolo-pose: Enhancing yolo for multi person pose estimation using object keypoint similarity loss. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 2637–2646. [Google Scholar]
  31. Yang, L.; Zhang, R.-Y.; Li, L.; Xie, X. Simam: A simple, parameter-free attention module for convolutional neural networks. In Proceedings of the International Conference on Machine Learning, Virtual, 18–24 July 2021; pp. 11863–11874. [Google Scholar]
  32. Huang, H.; Chen, Z.; Zou, Y.; Lu, M.; Chen, C. Channel prior convolutional attention for medical image segmentation. arXiv 2023, arXiv:2306.05196. [Google Scholar] [CrossRef]
  33. Tong, K.; Wu, Y. Rethinking PASCAL-VOC and MS-COCO dataset for small object detection. J. Vis. Commun. Image Represent. 2023, 93, 103830. [Google Scholar] [CrossRef]
Figure 1. Performance curves of the KTD-YOLOv8 results. (a) Precision–recall curve; (b) recall–confidence curve.
Figure 2. A comparison of the accuracy of YOLOv8 and KTD-YOLOv8 for various diseases.
Figure 3. Comparison of mAP@0.5 of models.
Figure 4. Overall loss comparison of models.
Figure 5. A comparison of detection effects of different detection models.
Figure 6. The samples of different types of diseases.
Figure 7. The number of images for different types of diseases.
Figure 8. Diagram of KTD-YOLOv8 network structure.
Figure 9. The framework of the KernelWarehouse dynamic convolution.
Figure 10. The structure of C2f, Bottleneck, KW-C2f, and KW-Bottleneck.
Figure 11. A schematic diagram of the Triplet Attention mechanism.
Figure 12. (a) The structure of the diverse branch block; (b) the structure of the YOLOv8 head with a DBB sharing head component.
Table 1. Comparative results of ablation experiments.

| Baseline | Convolution | Attention | Head | Accuracy (%) | Recall (%) | mAP@0.5 (%) | GFLOPS | Parameters | Inference Time (ms) |
|----------|-------------|-----------|------|--------------|------------|-------------|--------|------------|---------------------|
| ✓ |   |   |   | 89.1 | 77.6 | 86.9 | 28.8 | 1.113 × 10⁷ | 13.1 |
| ✓ | ✓ |   |   | 91.3 | 77.2 | 87.9 | 14.2 | 1.123 × 10⁷ | 17.1 |
| ✓ |   | ✓ |   | 90.0 | 81.0 | 89.3 | 28.5 | 1.114 × 10⁷ | 14.2 |
| ✓ |   |   | ✓ | 89.2 | 79.4 | 88.1 | 31.8 | 1.327 × 10⁷ | 11.1 |
| ✓ | ✓ |   | ✓ | 89.5 | 80.3 | 88.2 | 17.6 | 1.338 × 10⁷ | 12.3 |
| ✓ | ✓ | ✓ |   | 92.1 | 79.0 | 89.2 | 14.3 | 1.124 × 10⁷ | 18.6 |
| ✓ |   | ✓ | ✓ | 91.9 | 80.2 | 89.6 | 31.9 | 1.327 × 10⁷ | 11.8 |
| ✓ | ✓ | ✓ | ✓ | 90.0 | 81.3 | 89.7 | 17.7 | 1.343 × 10⁷ | 12.1 |
Table 2. Comparison of different base convolutions.

| Convolution | Accuracy (%) | Recall (%) | mAP@0.5 (%) | GFLOPS |
|-------------|--------------|------------|-------------|--------|
| YOLOv8s     | 89.1 | 77.6 | 86.9 | 28.8 |
| DySnakeConv | 88.8 | 79.2 | 87.9 | 31.6 |
| SPDConv     | 90.3 | 78.6 | 88.1 | 43.0 |
| KWConv      | 91.3 | 77.2 | 87.9 | 14.2 |
Table 3. Comparative experiment for different detecting heads.

| Head | Accuracy (%) | Recall (%) | mAP@0.5 (%) | GFLOPS |
|------|--------------|------------|-------------|--------|
| YOLOv8s          | 89.1 | 77.6 | 86.9 | 28.8 |
| Aux Head         | 89.8 | 80.0 | 88.2 | 36.8 |
| Pose Head        | 89.9 | 79.0 | 88.2 | 39.7 |
| DBB Sharing Head | 89.2 | 79.4 | 88.1 | 31.8 |
Table 4. Comparative experiment for different attention mechanisms.

| Attention | Accuracy (%) | Recall (%) | mAP@0.5 (%) | GFLOPS |
|-----------|--------------|------------|-------------|--------|
| YOLOv8s           | 89.1 | 77.6 | 86.9 | 28.8 |
| SimAM             | 88.1 | 79.9 | 88.7 | 28.4 |
| CPCA              | 86.5 | 80.1 | 87.6 | 29.4 |
| Triplet Attention | 90.0 | 81.0 | 89.3 | 28.5 |
Table 5. Comparative results of different detection models.

| Model | Accuracy (%) | Recall (%) | mAP@0.5 (%) | GFLOPS | Parameters | Inference Time (ms) |
|-------|--------------|------------|-------------|--------|------------|---------------------|
| YOLOv5      | 86.9 | 77.8 | 86.5 | 14.2  | 0.711 × 10⁷ | 12.5 |
| YOLOv6      | 86.5 | 73.6 | 83.2 | 44.0  | 1.629 × 10⁷ | 13.2 |
| YOLOv7      | 89.8 | 78.7 | 88.0 | 103.2 | 3.650 × 10⁷ | 21.0 |
| YOLOv8      | 89.1 | 77.6 | 86.9 | 28.8  | 1.113 × 10⁷ | 13.1 |
| YOLOv9      | 89.2 | 80.0 | 89.4 | 237.7 | 5.097 × 10⁷ | 30.2 |
| KTD-YOLOv8  | 90.0 | 81.3 | 89.7 | 17.7  | 1.343 × 10⁷ | 12.1 |
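As a quick sanity check, the headline improvements reported in the abstract can be recomputed from the YOLOv8 and KTD-YOLOv8 rows of Table 5. The sketch below simply re-derives those two figures from the tabulated values; the dictionary names are illustrative, not part of the authors' code.

```python
# Re-derive the abstract's headline figures from Table 5:
# mAP@0.5 gain in percentage points, and relative GFLOPS reduction.

yolov8 = {"map50": 86.9, "gflops": 28.8}   # baseline YOLOv8 row
ktd = {"map50": 89.7, "gflops": 17.7}      # KTD-YOLOv8 row

map_gain = round(ktd["map50"] - yolov8["map50"], 1)
gflops_drop = round((yolov8["gflops"] - ktd["gflops"]) / yolov8["gflops"] * 100, 1)

print(map_gain)     # 2.8 percentage points, matching the abstract
print(gflops_drop)  # 38.5 percent, matching the abstract
```

Both values agree with the claims "increases the average accuracy by 2.8%" and "reduces the floating-point calculation by 38.5%".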
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

He, Y.; Peng, Y.; Wei, C.; Zheng, Y.; Yang, C.; Zou, T. Automatic Disease Detection from Strawberry Leaf Based on Improved YOLOv8. Plants 2024, 13, 2556. https://doi.org/10.3390/plants13182556


