CEFW-YOLO: A High-Precision Model for Plant Leaf Disease Detection in Natural Environments

Tao, Jinxian; Li, Xiaoli; He, Yong; Islam, Muhammad Adnan

doi:10.3390/agriculture15080833

College of Biosystems Engineering and Food Science, Zhejiang University, Hangzhou 310058, China

* Author to whom correspondence should be addressed.

Agriculture 2025, 15(8), 833; https://doi.org/10.3390/agriculture15080833

Submission received: 17 March 2025 / Revised: 4 April 2025 / Accepted: 9 April 2025 / Published: 12 April 2025

(This article belongs to the Section Digital Agriculture)

Abstract

The accurate and rapid detection of apple leaf diseases is a critical component of precision management in apple orchards. The existing deep-learning-based detection algorithms for apple leaf diseases typically demand high computational resources, which limits their practical applicability in orchard environments. Furthermore, the detection of apple leaf diseases in natural settings faces significant challenges due to the diversity of disease types, the varied morphology of affected areas, and the influence of factors such as lighting variations, leaf occlusions, and differences in disease severity. To address the above challenges, we constructed an apple leaf disease detection (ALD) dataset, which was collected from real-world scenarios, and we applied data augmentation techniques, resulting in a total of 9808 images. Based on the ALD dataset, we proposed a lightweight YOLO11n-based detection network, named CEFW-YOLO, designed to tackle the current issues in apple leaf disease identification. First, we designed a novel channel-wise squeeze convolution (CWSConv), which employs channel compression and standard convolution to reduce computational resource consumption, enhance the detection of small objects, and improve the model’s adaptability to the morphological diversity of apple leaf diseases and complex backgrounds. Second, we developed an enhanced cross-channel attention (ECCAttention) module and integrated it into the C2PSA_ECCAttention module. By extracting global information, combining horizontal and vertical convolutions, and strengthening cross-channel interactions, this module enables the model to more accurately capture disease features on apple leaves, thereby enhancing detection accuracy and robustness. 
Additionally, we introduced a new fine-grained multi-level linear attention (FMLAttention) module, which utilizes multi-level asymmetric convolutions and linear attention mechanisms to improve the model’s ability to capture fine-grained features and local details critical for disease detection. Finally, we incorporated the Wise-IoU (WIoU) loss function, which enhances the model’s ability to differentiate overlapping targets across multiple scales. A comprehensive evaluation of CEFW-YOLO was conducted, comparing its performance against state-of-the-art (SOTA) models. Compared to the original YOLO11n, CEFW-YOLO reduced computational complexity by 20.6% and improved detection precision by 3.7 percentage points, with mAP@0.5 and mAP@0.5:0.95 increasing by 7.6 and 5.2 percentage points, respectively. Notably, CEFW-YOLO outperformed advanced SOTA algorithms in apple leaf disease detection, underscoring its practical application potential in real-world orchard scenarios.

Keywords: apple leaf disease; YOLO11; convolutional module; attention mechanism; loss function

1. Introduction

The importance of apple leaf disease detection in modern agriculture is becoming increasingly prominent, particularly in the context of precision orchard management and the development of smart agricultural systems. As climate change and the diversity of disease types continue to evolve, apple leaf diseases exhibit a wide range of forms and varied lesion areas, making traditional manual detection methods insufficient to meet the growing demands for efficient and accurate monitoring [1]. Compounding this complexity is the fact that disease manifestation is influenced by factors such as lighting variations, leaf occlusion, and differing degrees of lesion severity, all of which pose challenges to existing detection technologies. In this context, deep-learning-based automated detection methods have emerged as an effective solution. However, current deep learning models typically require substantial computational resources, which hinders their ability to achieve real-time, efficient disease detection in resource-constrained practical environments [2,3]. Therefore, researching and proposing efficient and accurate disease detection algorithms is of paramount importance for enhancing disease recognition accuracy, reducing computational complexity, and advancing the intelligent and precise management of agricultural production. Furthermore, precise disease detection provides scientific decision support for agricultural managers [4], enabling early warning and targeted control of diseases in apple orchards, thereby ensuring the stability and sustainable development of agricultural production.

Research in the field of agricultural disease detection has primarily focused on two-stage detection models based on candidate boxes and one-stage regression-based detection models [5]. Typical representatives of the former include Faster R-CNN [6] and Mask R-CNN [7]. These models achieve high-precision detection by generating candidate regions, and they are suitable for pest and disease identification in complex backgrounds. The latter, such as YOLO [8] and SSD [9], offer superior real-time performance, making them more appropriate for the rapid detection of pests and diseases over large agricultural areas. Numerous studies have focused on optimizing these models to address the challenges posed by agricultural environments. Gong et al. [10] proposed an enhanced Faster R-CNN method that integrates Res2Net and a feature pyramid network to effectively extract multidimensional features, improving the performance of apple leaf disease detection. Li et al. [11] improved Faster R-CNN by introducing binary cross-entropy and box regression loss functions, which enhanced the recognition accuracy of maize leaf diseases. Bondre et al. [12] proposed the IFMR-CNN method, which leverages transfer learning and Mask R-CNN for the precise segmentation of leaf diseases. Luo et al. [13] improved apple leaf disease detection accuracy by integrating the CBAM attention mechanism into the RepSSD network. Yao et al. [14] proposed the YOLO-Wheat algorithm based on YOLOv8s for wheat disease recognition, enhancing the model’s ability to extract features from small targets through improved modules. Additionally, Trinh et al. [15], Du et al. [16], and Zhang et al. [17] improved Faster R-CNN and Mask R-CNN for the detection and segmentation of targets such as mangosteen, Spodoptera frugiperda, and pepper clusters, respectively.

Additionally, numerous studies have proposed new solutions to address the complex environments and diverse features encountered in agricultural pest and disease detection. Huang et al. [18] improved Mask R-CNN to enhance the accuracy of cabbage leaf area estimation, while Wang et al. [19] applied a Transformer-based Mask R-CNN to improve tomato detection performance. Guo et al. [20] introduced MobileNetV2 and various attention mechanisms into an enhanced SSD model, achieving efficient cotton leaf disease detection. The AgriPest-YOLO model proposed by Zhang et al. [21] increased pest detection accuracy through coordination and local attention mechanisms. Tian et al. [22] developed the MD-YOLO method, which achieved high precision in detecting small target Lepidoptera pests. Zhang et al. [23] enhanced the YOLO-CRD model by improving the YOLOv5s architecture, thereby boosting rice disease detection performance. Zhang et al. [24] proposed the GVC-YOLO model, which efficiently extracted features in cotton aphid detection, delivering high accuracy and speed. Zhao et al. [25] further improved the YOLOv7 algorithm by incorporating CBAM and convolutional self-attention mechanisms, enhancing the precision and efficiency of crop pest detection. These studies have not only advanced agricultural pest and disease detection technologies but have also provided crucial technical support for the development of smart agriculture.

The existing research has made significant advancements in the accuracy of agricultural disease identification [26], with the continual optimization of convolutional neural network (CNN) models expanding both their depth and breadth. While more complex models can offer higher precision, they come with longer training times and greater computational resource demands, posing challenges for mobile devices that require rapid responses [27]. Consequently, the design of lightweight models that combine high accuracy with fast inference has become a focal point of current research [28]. The substantial computational load makes it difficult to deploy complex models on practical edge computing devices, hindering their ability to meet the real-time requirements of field-based apple leaf disease identification tasks [29]. Additionally, model deployment in real-world applications faces challenges related to dataset quality, particularly in the domain of apple leaf disease identification in natural environments [30]. Current apple leaf datasets often focus solely on individual leaves, failing to account for potential overlap or occlusion that may occur in natural settings, with simple backgrounds that do not fully capture the complexities of real-world environments [31]. As a result, the lack of high-quality, representative datasets, particularly in the context of precise management of apple leaf diseases, has become a significant barrier to the widespread deployment of these models in practical applications [32]. Efforts must continue to address the shortage of apple leaf disease datasets in natural environments to facilitate the technological implementation and application of this field.

To address the aforementioned challenges, this paper proposes an enhanced deep learning model based on YOLO11n (https://github.com/ultralytics/ultralytics, accessed on 7 January 2025) to improve the generalization capability of apple leaf disease detection. The algorithm effectively resolves common issues encountered in apple leaf disease images, such as occlusion, dense target distribution, and variations in lighting conditions, while simultaneously improving detection accuracy. First, a dataset of apple leaf diseases in natural field environments was constructed. Next, the CWSConv, C2PSA_ECCAttention, and FMLAttention modules and WIoU loss function [33] were integrated into the YOLO11n model to develop the CEFW-YOLO model. The main contributions are as follows:

(1) The CWSConv module is designed by combining channel compression and standard convolution, which reduces computational resource consumption while enhancing the model’s ability to detect small objects. This module improves the model’s adaptability to the morphological diversity of apple leaf diseases and complex backgrounds, thereby enhancing both the accuracy and efficiency of detection.

(2) The ECCAttention module is proposed, and the C2PSA_ECCAttention module is constructed, which combines the extraction of global information, the fusion of horizontal and vertical convolutions, and the enhancement of cross-channel interactions. This enables the model to more accurately capture the disease characteristics of apple leaves. The introduction of this module enhances the model’s performance in complex backgrounds.

(3) The FMLAttention module is proposed, which enhances the model’s ability to capture fine-grained features and local information through multi-level asymmetric convolutions and a linear attention mechanism. This module enables more precise recognition of local features associated with apple leaf diseases, thereby improving detection performance under complex disease patterns.

(4) The introduction of the WIoU loss function enhances the detection accuracy of multi-scale overlapping objects, particularly in addressing target overlap and small object detection, thereby improving the model’s resolution capability. The WIoU loss function optimizes the regression process of object boundaries, leading to improved accuracy in apple leaf disease detection.

(5) A comprehensive performance analysis and comparative experiments were conducted on CEFW-YOLO. The results indicate that the model exhibits better convergence and accuracy, with reduced computational complexity.

The remainder of this paper is organized as follows: Section 2 outlines the dataset construction process and provides a detailed description of our methodology and contributions. Section 3 presents the experimental results, along with a comprehensive performance analysis and comparison with state-of-the-art methods. Finally, Section 4 concludes the paper.

2. Materials and Methods

2.1. Data Acquisition

This study developed an apple leaf disease dataset, comprising images of apple leaves captured under natural conditions from an apple orchard located in Laizhou (120° E, 37° N), Yantai, Shandong Province. The location is shown in Figure 1A. The weather conditions included sunny, overcast, morning, and evening scenarios. The dataset accounts for various conditions, such as single and multiple targets, complex backgrounds, and multiple angles. The images were taken with a Xiaomi 13 smartphone at a height ranging from 100 to 200 cm above the ground, with a resolution of 1080 pixels × 1920 pixels. A total of 2452 useful JPG images were selected for the dataset, with a sample shown in Figure 1B.

2.2. Data Annotation

In this study, we collected a dataset consisting of 2452 images of apple leaf diseases and annotated the image data using the LabelImg tool. The annotation results were saved in YOLO-format txt label files to meet the requirements for deep learning model training. Based on the disease conditions of the apple leaves in the dataset, we classified them into four common types of diseases: powdery mildew (PM), black rot (Rot), apple scab (Scab), and rust. Images are shown in Figure 1C. Through the above annotation process, we constructed a high-quality annotated dataset, aimed at providing reliable training data support for subsequent deep-learning-based disease detection and classification models. Furthermore, the annotations for each disease category adhered to strict consistency standards to ensure annotation quality and data integrity, thus enhancing the accuracy and robustness of model training.
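As an illustration of the label format mentioned above, the following minimal sketch parses one YOLO-format label line (`class_id x_center y_center width height`, all coordinates normalized to the image size) and converts it to pixel coordinates. The class order in `CLASS_NAMES`, the function name, and the sample label values are assumptions made for this example; the article does not state the index-to-class mapping, and the 1080 × 1920 size is read here as width × height.

```python
# Hypothetical class order; the authors' actual index-to-name mapping is not given.
CLASS_NAMES = ["PM", "Rot", "Scab", "Rust"]

def parse_yolo_line(line, img_w, img_h):
    """Convert one YOLO label line to (class_name, x_min, y_min, x_max, y_max) in pixels."""
    parts = line.split()
    class_id = int(parts[0])
    xc, yc, w, h = (float(v) for v in parts[1:5])
    x_min = (xc - w / 2) * img_w
    y_min = (yc - h / 2) * img_h
    x_max = (xc + w / 2) * img_w
    y_max = (yc + h / 2) * img_h
    return CLASS_NAMES[class_id], x_min, y_min, x_max, y_max

# Made-up label line for an image assumed to be 1080 pixels wide and 1920 pixels high.
box = parse_yolo_line("2 0.5 0.5 0.25 0.125", img_w=1080, img_h=1920)
print(box)
```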

2.3. Data Partitioning and Data Augmentation

In this study, a total of 2452 original image samples were collected. Since the training of deep neural network models relies on a large volume of diverse image data, data augmentation techniques were applied to enhance the model’s generalization capability and robustness and to mitigate the risk of overfitting caused by limited data. Specifically, operations such as rotation, translation, brightness adjustment, and flipping were employed. These augmentation methods effectively increased the background diversity of the images and simulated various complex scenarios that may occur in real-world applications. As a result of these augmentation techniques, the dataset was expanded fourfold, from 2452 to 9808 images. The augmented dataset was then randomly divided into training, validation, and test sets at a ratio of 8:1:1, resulting in 7846 images for training, 981 for validation, and 981 for testing. This enriched and diversified dataset provided a more solid foundation for training the deep neural network model, thereby significantly improving its performance.
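The image counts above can be reproduced with a short calculation. The sketch below is only one possible way to perform an 8:1:1 random split; the function name, the fixed seed, and the choice to give the rounding remainder to the training set are assumptions, since the article does not describe the splitting procedure in detail.

```python
import random

def split_indices(n_images, ratios=(8, 1, 1), seed=0):
    """Shuffle image indices and split them into train/val/test lists by ratio."""
    indices = list(range(n_images))
    random.Random(seed).shuffle(indices)  # seed is an arbitrary choice for repeatability
    total = sum(ratios)
    n_val = round(n_images * ratios[1] / total)
    n_test = round(n_images * ratios[2] / total)
    n_train = n_images - n_val - n_test  # remainder goes to the training set
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test

n_original = 2452
n_augmented = n_original * 4  # fourfold expansion reported in the text
train, val, test = split_indices(n_augmented)
print(n_augmented, len(train), len(val), len(test))
```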

2.4. General Technical Route

This study followed the standard process for deep learning model development, which includes model construction, training, and testing. The research process begins with the data collection phase, followed by key stages, such as data preprocessing, model design and optimization, model training, and performance evaluation and testing. Ultimately, the trained model is applied to inference and validation on the test data to assess its effectiveness in practical applications. The detailed steps of the entire research process and its respective stages are illustrated in Figure 2A.

2.5. The Proposed CEFW-YOLO

In this study, we proposed an improved detection algorithm based on YOLO11n, named CEFW-YOLO, aimed at addressing the precise detection of apple leaf diseases in natural environments. Specifically, the CWSConv module was designed to replace the original Conv module within the YOLO11n framework. A new C2PSA_ECCAttention module was introduced to enhance the model’s capability to extract disease features. Additionally, the FMLAttention module was designed to improve the model’s ability to capture fine-grained features and local information. Lastly, the WIoU loss function was introduced to replace the original CIoU loss function, enhancing the accuracy of apple leaf disease detection. The network architecture of CEFW-YOLO is illustrated in Figure 2B.

2.5.1. CWSConv

In the task of apple leaf disease detection, the design of lightweight models is particularly important to address high computational resource demands and performance bottlenecks in real-world applications. Traditional convolution modules (Conv), while capable of extracting effective features from complex images, tend to have high computational costs and redundant parameters, especially when processing a large number of channels. This leads to inefficient computation and slow inference, which hinders their suitability for real-time applications. Apple leaf disease detection requires a fast response, fine-grained feature extraction, and efficient computation. However, the original Conv module does not effectively compress channels and struggles with handling the diversity and complexity of input data, making it inadequate for addressing variations in lighting, occlusion, and different disease manifestations. This makes the original Conv module less effective for practical applications in apple orchards.

The core design of the CWSConv module lies in the combination of channel compression and standard convolution. First, CWSConv applies a 1 × 1 convolution layer to compress the channels of the input feature map. Assuming the input feature map has the dimensions $H \times W \times C_{in}$, after the 1 × 1 convolution, the number of channels is reduced to $\frac{C_{in}}{2}$, resulting in an output feature map of size $H \times W \times \frac{C_{in}}{2}$. Next, a standard $K \times K$ convolution is applied to further extract key features. This design reduces the computational cost of the convolution operation while preserving the ability to extract important features. Finally, a batch normalization layer (BatchNorm) normalizes the output, accelerating network training and stabilizing the features, and a ReLU activation is applied. The entire process can be mathematically expressed as follows:

$$x_{out} = \mathrm{ReLU}(\mathrm{BN}(\mathrm{Conv}(\mathrm{Conv}_{1 \times 1}(x)))) \tag{1}$$

where $x$ is the input feature map, $x_{out}$ is the output feature map, $\mathrm{Conv}_{1 \times 1}$ represents the channel compression operation, $\mathrm{Conv}$ represents the standard convolution operation, $\mathrm{BN}$ represents batch normalization, and $\mathrm{ReLU}$ is the activation function.

The CWSConv module effectively addresses the challenges in apple leaf disease detection, particularly excelling in handling complex environmental variations. Firstly, the channel compression operation reduces the number of parameters, significantly lowering computational complexity and memory usage, making it suitable for resource-constrained embedded devices. Secondly, by applying a 1 × 1 convolution for channel compression followed by a standard convolution for feature extraction, CWSConv can better retain useful information while eliminating redundant features. This allows the model to more efficiently process the details of different diseased regions in apple leaf images. Given the complexity of disease lesions, significant lighting variations, and occlusion in apple leaf images, the advantages of CWSConv lie in its lower computational cost and higher feature extraction efficiency. This enables the model to quickly and accurately detect diseases in real-world settings, meeting the practical requirements for both real-time performance and accuracy. The CWSConv module is shown in Figure 3.
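To make the parameter saving concrete, the following sketch counts convolution weights for a plain $K \times K$ convolution and for the squeeze-then-convolve arrangement described in this subsection. It is a rough illustration under stated assumptions: bias and batch normalization parameters are ignored, the channel sizes are illustrative rather than taken from the network, and the function names are hypothetical.

```python
def conv_weights(c_in, c_out, k):
    """Number of weights in a k x k convolution without bias."""
    return k * k * c_in * c_out

def cwsconv_weights(c_in, c_out, k):
    """Weights of a 1 x 1 squeeze to c_in // 2 channels followed by a k x k convolution."""
    squeezed = c_in // 2
    return conv_weights(c_in, squeezed, 1) + conv_weights(squeezed, c_out, k)

c_in, c_out, k = 128, 128, 3  # illustrative sizes, not taken from the paper
standard = conv_weights(c_in, c_out, k)
squeezed = cwsconv_weights(c_in, c_out, k)
print(standard, squeezed, round(squeezed / standard, 3))
```

For these sizes the squeezed variant needs 81,920 weights against 147,456 for the plain convolution. At stride 1 the same ratio applies to the multiply-accumulate count, because both variants are evaluated at every spatial position.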

2.5.2. C2PSA_ECCAttention

In apple leaf disease detection, the complexity of leaf lesion types and shapes poses significant challenges. Although the original C2PSA module provides a certain level of feature extraction capability, it is limited in capturing fine-grained lesion features and details from various directions and scales. In particular, under natural environmental conditions, apple leaf diseases are affected by factors such as lighting variations, leaf occlusion, and the extent of lesions, making it difficult for the traditional C2PSA module to fully address these challenges. To overcome these limitations, we propose a new C2PSA_ECCAttention module, which enhances the model’s ability to adapt to complex situations through more flexible cross-channel interactions and directional convolutions.

The ECCAttention module first employs global average pooling to extract global information from the input features. It then uses convolution operations to capture features at different scales and directions. Specifically, ECCAttention introduces horizontal convolutions and vertical convolutions to capture features in the horizontal and vertical directions, respectively. These features are then fused through a cross-channel interaction module, which uses a 1 × 1 convolution to strengthen the exchange of information between channels. Finally, the second convolution layer integrates the features further, and the Sigmoid activation function generates an attention map, which is multiplied by the input features to highlight important features and suppress irrelevant ones. Mathematically, the operations of ECCAttention can be expressed as follows:

$$\mathrm{ECCAttention} = \sigma(\mathrm{Conv}_{2}(\mathrm{Conv}_{1}(\mathrm{AvgPool}(X)))) \tag{2}$$

where $\mathrm{Conv}_{1}$ and $\mathrm{Conv}_{2}$ represent the first and second convolution layers, respectively; $\mathrm{AvgPool}(X)$ is the global average pooling operation; and $\sigma$ is the Sigmoid activation function. The ECCAttention module is shown in Figure 4.
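The gating pattern of Equation (2) can be sketched in a few lines of plain Python. This is a simplified illustration, not the authors' implementation. Both convolutions are modelled as channel-mixing matrices, which is how a 1 × 1 convolution acts on a globally pooled feature. The horizontal and vertical convolution branches are omitted, and all function names and weights are hypothetical.

```python
import math

def global_avg_pool(x):
    """Mean of each channel; x is a list of C channels, each an H x W nested list."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in x]

def mix_channels(weights, vec):
    """Apply a C_out x C_in matrix to a channel vector (a 1 x 1 convolution on a 1 x 1 map)."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def ecc_attention_sketch(x, w1, w2):
    """Rescale each channel of x by sigma(Conv2(Conv1(AvgPool(x))))."""
    pooled = global_avg_pool(x)
    attn = [sigmoid(z) for z in mix_channels(w2, mix_channels(w1, pooled))]
    scaled = [[[a * v for v in row] for row in ch] for a, ch in zip(attn, x)]
    return scaled, attn

# Two channels of size 2 x 2 and identity mixing weights (arbitrary example values).
x = [[[1.0, 3.0], [5.0, 7.0]], [[0.0, 0.0], [0.0, 0.0]]]
identity = [[1.0, 0.0], [0.0, 1.0]]
scaled, attn = ecc_attention_sketch(x, identity, identity)
print(attn)
```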

The C2PSA_ECCAttention module integrates ECCAttention with the PSABlock module. First, C2PSA_ECCAttention uses a 1 × 1 convolution to expand the input feature channels to 2c and splits the resulting features into two parts: one part is processed by the ECCAttention module, while the other part is passed through n PSABlock modules (each consisting of ECCAttention and feed-forward operations) for feature extraction and fusion. The final features are then combined through another 1 × 1 convolution, producing the integrated output. The C2PSA_ECCAttention module not only enhances cross-channel feature interaction but also incorporates a multi-layer self-attention mechanism, further improving sensitivity to complex lesion areas. The C2PSA_ECCAttention module is shown in Figure 5.

The C2PSA_ECCAttention module excels in capturing fine-grained features of apple leaf diseases, particularly in terms of directional and scale variations in the lesion areas. The introduction of horizontal and vertical convolutions within the ECCAttention module allows the model to handle features from different angles and scales, especially in the presence of lighting changes and leaf occlusion, which improves its ability to detect local lesions. Furthermore, the cross-channel interaction mechanism enhances the flow of information between channels, enabling the network to better exploit the correlations between multi-channel features. As a result, the C2PSA_ECCAttention module significantly outperforms the original C2PSA module in terms of both accuracy and robustness, providing a more effective solution for apple leaf disease detection in complex natural environments.

2.5.3. FMLAttention

In the task of apple leaf disease detection, the diversity of disease types, the complexity of lesions, and the influence of environmental factors, such as lighting changes and leaf occlusion, pose significant challenges. Traditional convolutional neural networks often fail to effectively capture the fine-grained features of the lesion areas, as they struggle to distinguish and extract detailed information from diverse disease regions. While existing deep learning models can recognize broad disease patterns, they are less effective in addressing the intricate forms of lesions. Therefore, in order to improve both detection accuracy and robustness, we proposed the design of a new fine-grained multi-level linear attention module (FMLAttention), which integrates multi-scale convolution and linear attention mechanisms to enhance the model’s ability to capture fine details. This design aims to address the complex challenges associated with apple leaf disease detection in real-world environments.

The core of the FMLAttention module lies in its combination of multi-level asymmetric convolution chains and linear attention mechanisms. Initially, FMLAttention employs several asymmetric convolution operations of varying scales, including kernels of sizes (1, 7), (7, 1), (1, 11), (11, 1), (1, 21), and (21, 1), which allow the model to capture local features of different sizes. This enables the extraction of fine-grained details, particularly through asymmetric convolutions that enhance the module’s ability to capture subtle features. Following the convolution operations, the linear attention mechanism is used to aggregate the features. Specifically, the input features are linearly transformed to produce query (Q), key (K), and value (V) vectors. The attention weights are computed as the inner product between the query and key vectors, and these weights are then applied to the value vector to generate the output. The process is described mathematically as follows:

$$Q = \Phi_{q}(X), \quad K = \Phi_{k}(X), \quad V = \Phi_{v}(X) \tag{3}$$

$$\mathrm{FMLAttention}_{\mathrm{attn\_weights}} = \mathrm{Softmax}(Q K^{T}) \tag{4}$$

$$\mathrm{FMLAttention}_{\mathrm{attn\_output}} = \mathrm{FMLAttention}_{\mathrm{attn\_weights}} \times V \tag{5}$$

where $\Phi_{q}$, $\Phi_{k}$, and $\Phi_{v}$ represent the linear transformation functions, and $Q$, $K$, and $V$ correspond to the query, key, and value features, respectively. This mechanism allows the model to adaptively capture important features in the image while enhancing the recognition of detailed regions of interest.
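Equations (3)–(5) can be traced step by step with small matrices. The sketch below follows the equations as printed (a softmax over $QK^{T}$ with no scaling factor). It treats each row of `x` as one feature vector, for example one flattened spatial position, which is an assumption. The weight matrices and helper names are hypothetical.

```python
import math

def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def softmax_rows(m):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    out = []
    for row in m:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

def attention(x, wq, wk, wv):
    q, k, v = matmul(x, wq), matmul(x, wk), matmul(x, wv)  # Eq. (3)
    weights = softmax_rows(matmul(q, transpose(k)))        # Eq. (4)
    return matmul(weights, v), weights                     # Eq. (5)

# Three feature vectors of dimension 2 and identity projections (arbitrary example values).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
out, weights = attention(x, identity, identity, identity)
print(weights)
print(out)
```

Each row of `weights` sums to 1, so every output row is a weighted average of the rows of $V$.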

The FMLAttention module, by incorporating multi-level asymmetric convolutions and a linear attention mechanism, significantly improves the performance of apple leaf disease detection. Firstly, the multi-level asymmetric convolutions enable the extraction of richer local features at various scales and orientations, which is critical for capturing the diverse and irregular forms of lesions on the leaves. This enhancement proves particularly effective when dealing with complex disease appearances. Secondly, the linear attention mechanism enables the model to focus on key areas of the disease regions by calculating the relationships among global features, thereby emphasizing critical information while suppressing irrelevant background noise. This makes the model more precise and robust in real-world applications. Lastly, the design of FMLAttention balances computational efficiency with accuracy, reducing the high computational cost often associated with other methods while improving detection performance. This results in a model that is both accurate and efficient, particularly in challenging conditions, such as those with varying lighting and occlusion, where it maintains stable and reliable performance. Therefore, FMLAttention enhances the model’s ability to extract fine-grained features and improves its robustness and adaptability to complex environments. The FMLAttention module is shown in Figure 6.
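One way to see the efficiency argument is to count the weights per input-output channel pair for the asymmetric kernel sizes listed above. The comparison with a single square $k \times k$ kernel of the same extent is added here for illustration only and does not appear in the article; the function names are hypothetical.

```python
def strip_pair_weights(k):
    """Weights in a (1, k) kernel plus a (k, 1) kernel, per input-output channel pair."""
    return 1 * k + k * 1

def square_weights(k):
    """Weights in a single k x k kernel, per input-output channel pair."""
    return k * k

for k in (7, 11, 21):
    print(k, strip_pair_weights(k), square_weights(k))
```

The strip pair grows linearly with $k$ (14, 22, and 42 weights), whereas a square kernel grows quadratically (49, 121, and 441 weights).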

2.5.4. WIoU Loss Function

In apple leaf disease detection, the original CIoU loss function combines penalty terms for the IoU value, the center-point distance, and the aspect ratio of the bounding box to optimize the localization accuracy and shape matching of the predicted boxes, but its performance is significantly limited in complex field environments. CIoU treats all target areas equally and does not adjust dynamically to differences in background complexity, target size, or the morphological characteristics of the disease. For example, in field scenes with drastic lighting changes, partially shaded leaves, or complex background noise, the importance of certain key disease regions may be underestimated, so that the optimization direction deviates from the true objective. CIoU also relies mainly on the combined metrics of IoU and center-point distance and pays little attention to boundary details, which matter for disease detection. Disease areas in the field may have irregular or scattered shapes, and the global optimization strategy of CIoU struggles to capture such detail, which reduces detection accuracy. Moreover, the distribution and shape of disease targets in the field are often highly non-uniform, and target sizes may vary considerably. The fixed optimization mechanism of CIoU adapts poorly to these variations, which can lead to underperformance on small targets or unusually shaped regions. The formulas of CIoU are shown in Equations (6)–(8).

$$L_{CIoU} = 1 - L_{IoU} + \frac{\rho^{2}(b, b^{gt})}{c^{2}} + \alpha v \tag{6}$$

$$\alpha = \frac{v}{(1 - L_{IoU}) + v} \tag{7}$$

$$v = \frac{4}{\pi^{2}} \left( \arctan \frac{w^{gt}}{h^{gt}} - \arctan \frac{w}{h} \right)^{2} \tag{8}$$

where $L_{IoU}$ denotes the IoU of the predicted and ground-truth boxes; $\rho$ is the Euclidean distance between $b$ and $b^{gt}$; $b$ and $b^{gt}$ are the center points of the predicted and ground-truth boxes, respectively; $c$ is the diagonal length of the smallest enclosing box that contains both the predicted and ground-truth boxes; $w$ and $h$ are the width and height of the predicted box, respectively; $w^{gt}$ and $h^{gt}$ are the width and height of the ground-truth box, respectively; $\alpha$ is the weight function; and $v$ measures the consistency of the width-to-height ratio.
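Equations (6)–(8) translate directly into a short function. The sketch below is an illustrative, framework-free version for a single pair of boxes in `(x_min, y_min, x_max, y_max)` form. It reads $L_{IoU}$ in Equation (6) as the IoU value. The small `eps` term is added to avoid division by zero and is not part of the equations. The function name and the example boxes are hypothetical.

```python
import math

def ciou_loss(pred, gt, eps=1e-9):
    """CIoU loss for two boxes given as (x_min, y_min, x_max, y_max)."""
    px1, py1, px2, py2 = pred
    gx1, gy1, gx2, gy2 = gt
    # Intersection over union of the two boxes.
    iw = max(0.0, min(px2, gx2) - max(px1, gx1))
    ih = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = iw * ih
    w, h = px2 - px1, py2 - py1
    wg, hg = gx2 - gx1, gy2 - gy1
    iou = inter / (w * h + wg * hg - inter + eps)
    # Squared distance between the box centers (rho^2).
    rho2 = ((px1 + px2) / 2 - (gx1 + gx2) / 2) ** 2 + ((py1 + py2) / 2 - (gy1 + gy2) / 2) ** 2
    # Squared diagonal of the smallest enclosing box (c^2).
    c2 = (max(px2, gx2) - min(px1, gx1)) ** 2 + (max(py2, gy2) - min(py1, gy1)) ** 2 + eps
    v = (4 / math.pi ** 2) * (math.atan(wg / hg) - math.atan(w / h)) ** 2  # Eq. (8)
    alpha = v / ((1 - iou) + v + eps)                                      # Eq. (7)
    return 1 - iou + rho2 / c2 + alpha * v                                 # Eq. (6)

same = ciou_loss((0.0, 0.0, 2.0, 2.0), (0.0, 0.0, 2.0, 2.0))
apart = ciou_loss((0.0, 0.0, 2.0, 2.0), (3.0, 3.0, 5.0, 5.0))
print(same, apart)
```

Identical boxes give a loss of approximately zero. Two disjoint squares still receive a finite, distance-dependent penalty through the $\rho^{2}/c^{2}$ term, which a plain IoU loss cannot provide.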

Compared with CIoU, the WIoU loss function offers several advantages for apple leaf disease detection. Its core improvement is a weighting mechanism with a dynamic adjustment strategy, which adapts better to the characteristics of disease targets in complex field environments. WIoU dynamically adjusts the importance of different targets by assigning weights to each pixel or target region. In field environments, disease regions are affected by light, occlusion, and other factors; through the weighting mechanism, WIoU strengthens the model’s focus on key regions, improving the robustness and accuracy of disease detection. WIoU also refines the IoU calculation by combining these weights with the intersection-over-union ratio, so that the details of disease boundaries are preserved and optimized more accurately. This is particularly important for detecting disease areas with complex shapes or blurred boundaries, and it helps to improve the model’s sensitivity to different disease features. Because WIoU assigns weights to targets according to their size, shape, and importance, it copes effectively with the heterogeneous distribution of targets in complex field scenarios, which enhances the model’s ability to detect small disease areas (e.g., initial disease spots) and local details. Disease detection also requires a balance between detection accuracy and generalization ability. WIoU suppresses IoU values that are too large or too small by weighting the penalty on the squared IoU term, steering the optimization in a balanced way so that the model retains good global detection performance in complex scenes. The formula of WIoU is shown in Equation (9).

L_{WIoU} = \exp\left( \frac{(x - x_{gt})^{2} + (y - y_{gt})^{2}}{(W_{g}^{2} + H_{g}^{2})^{*}} \right)

(9)
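The exponential term in Equation (9) can be illustrated with a short sketch. It rests on assumptions that the text above does not state and that follow Tong et al. [33]: W_g and H_g are taken as the width and height of the smallest box enclosing the predicted and ground-truth boxes, the superscript * is read as "treated as a constant during back-propagation", and the term is multiplied by the IoU loss (1 - IoU). The function names and the (x_min, y_min, x_max, y_max) box format are hypothetical, chosen only for illustration; this is not the authors' implementation.

```python
import math

def wiou_distance_factor(pred_box, gt_box):
    """Exponential centre-distance term of Equation (9).

    Boxes are (x_min, y_min, x_max, y_max). W_g and H_g are ASSUMED to be
    the width and height of the smallest box enclosing both boxes.
    """
    px1, py1, px2, py2 = pred_box
    gx1, gy1, gx2, gy2 = gt_box
    # Centre points (x, y) and (x_gt, y_gt)
    x, y = (px1 + px2) / 2, (py1 + py2) / 2
    x_gt, y_gt = (gx1 + gx2) / 2, (gy1 + gy2) / 2
    # Size of the smallest enclosing box
    w_g = max(px2, gx2) - min(px1, gx1)
    h_g = max(py2, gy2) - min(py1, gy1)
    return math.exp(((x - x_gt) ** 2 + (y - y_gt) ** 2) / (w_g ** 2 + h_g ** 2))

def iou(pred_box, gt_box):
    """Plain intersection over union of two axis-aligned boxes."""
    px1, py1, px2, py2 = pred_box
    gx1, gy1, gx2, gy2 = gt_box
    inter_w = max(0.0, min(px2, gx2) - max(px1, gx1))
    inter_h = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = inter_w * inter_h
    union = (px2 - px1) * (py2 - py1) + (gx2 - gx1) * (gy2 - gy1) - inter
    return inter / union

def wiou_v1_loss(pred_box, gt_box):
    """ASSUMED combination: distance factor times the IoU loss (1 - IoU)."""
    return wiou_distance_factor(pred_box, gt_box) * (1.0 - iou(pred_box, gt_box))

# A predicted box shifted by one unit along each axis from the ground truth
print(wiou_v1_loss((0, 0, 2, 2), (1, 1, 3, 3)))
```

When the two boxes coincide, the factor equals 1 and the loss is 0. The factor grows as the centres move apart, so poorly centred predictions are penalized more strongly than the IoU loss alone would penalize them.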

3. Results and Discussion

3.1. Model Evaluation

To objectively evaluate the detection performance of the model, we adopted multiple metrics for a comprehensive assessment, including precision (P), recall (R), mean average precision (mAP), frames per second (FPS), and computational cost (GFLOPs). Specifically, the mAP consisted of two components: mAP@0.5 and mAP@0.5:0.95. The corresponding calculation formulas can be found in Equations (10)–(13).

P = \frac{TP}{TP + FP} \times 100\%

(10)

R = \frac{TP}{TP + FN} \times 100\%

(11)

AP = \int_{0}^{1} P(r) \, dr

(12)

mAP = \frac{1}{m} \sum_{i=1}^{m} AP_{i}

(13)
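Equations (10)–(13) can be sketched in a few lines of Python. The function names are hypothetical, and the integral in Equation (12) is approximated here with the trapezoidal rule over sampled (recall, precision) points. This is a simplification: common detection toolkits use an interpolated precision envelope instead, so the values would differ slightly from those reported by such tools.

```python
def precision(tp, fp):
    """Equation (10): percentage of predicted positives that are correct."""
    return tp / (tp + fp) * 100

def recall(tp, fn):
    """Equation (11): percentage of ground-truth objects that are found."""
    return tp / (tp + fn) * 100

def average_precision(recalls, precisions):
    """Equation (12): area under the P(r) curve, by the trapezoidal rule.

    `recalls` must be sorted in ascending order; both lists hold
    fractions in [0, 1].
    """
    area = 0.0
    for i in range(1, len(recalls)):
        width = recalls[i] - recalls[i - 1]
        area += width * (precisions[i] + precisions[i - 1]) / 2
    return area

def mean_average_precision(ap_per_class):
    """Equation (13): mean of the per-class AP values (m classes)."""
    return sum(ap_per_class) / len(ap_per_class)

# Toy numbers, for illustration only (not taken from the ALD dataset)
print(precision(tp=80, fp=20), recall(tp=80, fn=20))
print(average_precision([0.0, 0.5, 1.0], [1.0, 1.0, 0.5]))
print(mean_average_precision([0.8, 0.9]))
```

mAP@0.5 applies this procedure with an IoU threshold of 0.5 for counting a true positive, whereas mAP@0.5:0.95 averages the result over IoU thresholds from 0.5 to 0.95.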

The experimental environment was configured as follows: the operating system was Windows 10, the processor was an Intel Core i9-10900 CPU (2.80 GHz) (Intel Corporation, Santa Clara, CA, USA), and the graphics card was an NVIDIA Quadro RTX 5000 (16 GB) (NVIDIA Corporation, Santa Clara, CA, USA). The deep learning framework was PyTorch 2.0.1 with CUDA 11.7. Training used the SGD optimizer with a momentum factor of 0.937, a weight decay coefficient of 0.0005, a batch size of 16, 200 training epochs, and an initial learning rate of 0.01.
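For reproduction, the training hyperparameters listed above can be collected in one place. The sketch below is a plain Python dictionary. Its key names follow the training-argument names used by the Ultralytics package (`optimizer`, `lr0`, `momentum`, `weight_decay`, `batch`, `epochs`), which is an assumption, since the text does not state which training script was used.

```python
# Hyperparameters as reported in Section 3.1.
# Key names are an ASSUMPTION (Ultralytics-style argument names).
train_cfg = {
    "optimizer": "SGD",      # optimizer
    "momentum": 0.937,       # momentum factor
    "weight_decay": 0.0005,  # weight decay coefficient
    "batch": 16,             # batch size
    "epochs": 200,           # number of training rounds
    "lr0": 0.01,             # initial learning rate
}

for name, value in train_cfg.items():
    print(f"{name:>12}: {value}")
```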

3.2. Comparison of YOLO11n and CEFW-YOLO

To evaluate the effectiveness of the CEFW-YOLO model, we conducted a comparative study with the original YOLO11n model, with the results presented in Table 1. Figure 2C shows the comparison curves of the precision, recall, mAP@0.5, and mAP@0.5:0.95 for the CEFW-YOLO and YOLO11n models. According to Table 1, the precision improved from 0.818 for YOLO11n to 0.855 for CEFW-YOLO, an increase of 0.037, and the recall improved from 0.779 to 0.812. The mAP@0.5 increased from 0.797 to 0.873 and the mAP@0.5:0.95 from 0.519 to 0.571, confirming the model's improved accuracy across IoU thresholds. CEFW-YOLO reached 136.4 frames per second (FPS), 33.1 FPS more than YOLO11n (103.3 FPS), demonstrating an improvement in real-time performance. The computational cost decreased from 6.3 GFLOPs for YOLO11n to 5.0 GFLOPs, indicating a reduction in computational overhead. In summary, CEFW-YOLO outperformed YOLO11n in accuracy, speed, and computational cost on the ALD dataset.
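The differences quoted from Table 1 can be reproduced with simple arithmetic. The sketch below copies the Table 1 values and reports the absolute and relative change for each metric; the helper name `changes` is hypothetical.

```python
# Values copied from Table 1
table1 = {
    "YOLO11n": {"P": 0.818, "R": 0.779, "mAP@0.5": 0.797,
                "mAP@0.5:0.95": 0.519, "FPS": 103.3, "GFLOPs": 6.3},
    "CEFW-YOLO": {"P": 0.855, "R": 0.812, "mAP@0.5": 0.873,
                  "mAP@0.5:0.95": 0.571, "FPS": 136.4, "GFLOPs": 5.0},
}

def changes(base, new):
    """Return {metric: (absolute change, relative change in percent)}."""
    return {
        k: (round(new[k] - base[k], 3),
            round((new[k] - base[k]) / base[k] * 100, 1))
        for k in base
    }

delta = changes(table1["YOLO11n"], table1["CEFW-YOLO"])
for metric, (abs_change, rel_change) in delta.items():
    print(f"{metric:>13}: {abs_change:+.3f} ({rel_change:+.1f}%)")
```

In relative terms, the frame rate is about 32.0% higher and the GFLOPs about 20.6% lower than for YOLO11n.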

To provide a comprehensive comparison between the CEFW-YOLO and YOLO11n models, visualizations of their respective feature maps are presented in Figure 7. A comparison of the heatmaps generated by YOLO11n and CEFW-YOLO with the original images indicates that CEFW-YOLO outperformed YOLO11n in apple leaf disease detection. For black rot, CEFW-YOLO extracted the disease features more completely and clearly, and it was particularly good at identifying small-scale lesions at the leaf edges. For powdery mildew, CEFW-YOLO showed enhanced adaptability under complex lighting and angle variations, effectively capturing the features of tiny lesions. For rust, the enhanced modules improved the fine-grained feature extraction of rust spots, and the heatmap clearly depicts the disease characteristics. For scab, CEFW-YOLO accurately identified morphological changes in the lesions.

To evaluate the performance of the improved CEFW-YOLO algorithm in apple leaf disease detection, we compared its detection results with those of the original YOLO11n algorithm. The experimental results indicate that the original algorithm exhibited lower detection accuracy in complex scenarios, with issues of both missed detections and false positives. In contrast, the improved CEFW-YOLO algorithm not only enhanced the detection accuracy but also significantly reduced the occurrence of missed detections and false positives. Figure 8 presents a comparison of the algorithms before and after the improvement.

3.3. Ablation Experiment

To investigate the roles of CWSConv, C2PSA_ECCAttention, FMLAttention, and the WIoU loss function in the CEFW-YOLO network, we conducted ablation experiments, with the results presented in Table 2. According to the experimental results shown in Table 2, various combinations of the proposed modules significantly enhanced the performance of the CEFW-YOLO network. The baseline model (without any modules) achieved a precision of 0.818, recall of 0.779, mAP@0.5 of 0.797, and mAP@0.5:0.95 of 0.519. After introducing the CWSConv module alone, the precision increased to 0.823, the recall to 0.784, the mAP@0.5 to 0.805, and the mAP@0.5:0.95 to 0.524, indicating its effectiveness in lightweight feature extraction and small object detection. When only the C2PSA_ECCAttention module was used, the model achieved a precision of 0.820, recall of 0.787, mAP@0.5 of 0.802, and mAP@0.5:0.95 of 0.522, demonstrating its ability to enhance cross-channel interactions and spatial feature extraction in complex backgrounds. The FMLAttention module alone yielded a precision of 0.835, recall of 0.791, mAP@0.5 of 0.835, and mAP@0.5:0.95 of 0.531, which highlights its superior capability in capturing fine-grained and local features through multi-scale asymmetric convolutions and linear attention. The WIoU loss function improved the precision to 0.830, recall to 0.795, mAP@0.5 to 0.827, and mAP@0.5:0.95 to 0.528 by enhancing the multi-scale object localization accuracy. Combining CWSConv and C2PSA_ECCAttention yielded a precision of 0.837, recall of 0.797, mAP@0.5 of 0.841, and mAP@0.5:0.95 of 0.539, showing the synergy in lightweight feature processing and enhanced attention mechanisms. The combination of CWSConv and FMLAttention further improved the performance to a precision of 0.839, recall of 0.799, mAP@0.5 of 0.847, and mAP@0.5:0.95 of 0.545, emphasizing the benefit of combining efficient convolution with detailed feature extraction. 
When CWSConv, C2PSA_ECCAttention, and FMLAttention were integrated, the model achieved a precision of 0.842, recall of 0.800, mAP@0.5 of 0.855, and mAP@0.5:0.95 of 0.556, effectively improving the robustness in complex natural environments. The combination of C2PSA_ECCAttention, FMLAttention, and WIoU reached a precision of 0.845, recall of 0.804, mAP@0.5 of 0.865, and mAP@0.5:0.95 of 0.562, demonstrating advantages in fine-grained feature learning and object localization. Finally, when all four modules were integrated, the model achieved the best overall performance, with a precision of 0.855, recall of 0.812, mAP@0.5 of 0.873, and mAP@0.5:0.95 of 0.571, confirming that the joint contribution of all proposed modules yielded significant improvements in detection accuracy, robustness, and generalization.
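One way to read the ablation results is to compare the mAP@0.5 gain of each module combination with the sum of the gains that the same modules achieve individually. The sketch below does this with values copied from Table 2; the variable and function names are hypothetical. It is a rough diagnostic only, since Table 2 reports one result per configuration without variance.

```python
BASELINE = 0.797  # mAP@0.5 of YOLO11n without any proposed module (Table 2)

# mAP@0.5 with a single module enabled (Table 2)
single = {
    "CWSConv": 0.805,
    "C2PSA_ECCAttention": 0.802,
    "FMLAttention": 0.835,
    "WIoU": 0.827,
}

# mAP@0.5 for the module combinations reported in Table 2
combined = {
    ("CWSConv", "C2PSA_ECCAttention"): 0.841,
    ("CWSConv", "FMLAttention"): 0.847,
    ("CWSConv", "C2PSA_ECCAttention", "FMLAttention"): 0.855,
    ("C2PSA_ECCAttention", "FMLAttention", "WIoU"): 0.865,
    ("CWSConv", "C2PSA_ECCAttention", "FMLAttention", "WIoU"): 0.873,
}

def gain(value):
    """Gain in mAP@0.5 over the baseline."""
    return round(value - BASELINE, 3)

# For each combination: (observed gain, sum of the single-module gains)
comparison = {
    modules: (gain(value), round(sum(gain(single[m]) for m in modules), 3))
    for modules, value in combined.items()
}

for modules, (observed, additive) in comparison.items():
    print(f"{' + '.join(modules)}: observed {observed:+.3f}, "
          f"sum of single gains {additive:+.3f}")
```

For CWSConv with C2PSA_ECCAttention, the observed gain (0.044) is well above the sum of the individual gains (0.013). For the full model, the observed gain (0.076) is slightly below that sum (0.081), so the modules' contributions are not simply additive.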

3.4. Comparison Experiment

To further validate the effectiveness of the CEFW-YOLO model in apple leaf disease detection, comparative experiments were conducted against Faster R-CNN [6], YOLOv5n [34], YOLOv6n [35], YOLOv8n (https://github.com/ultralytics/ultralytics (7 January 2025)), YOLOv9t [36], and YOLOv10n [37]. All experiments were performed using the same dataset and training environment, with the results presented in Table 3. Compared with the classical two-stage detection model Faster R-CNN (P = 0.753, R = 0.732, mAP@0.5 = 0.720, mAP@0.5:0.95 = 0.441, FPS = 25.3), CEFW-YOLO achieved improvements across all evaluation metrics, attaining a precision of 0.855, recall of 0.812, mAP@0.5 of 0.873, and mAP@0.5:0.95 of 0.571. Moreover, its inference speed reached 136.4 FPS, the highest in Table 3. When compared to lightweight YOLO models, such as YOLOv5n (mAP@0.5 = 0.794, FPS = 98.2), YOLOv6n (mAP@0.5 = 0.795, FPS = 54.5), YOLOv8n (mAP@0.5 = 0.804, FPS = 76.2), YOLOv9t (mAP@0.5 = 0.802, FPS = 90.1), YOLOv10n (mAP@0.5 = 0.790, FPS = 78.7), and YOLO11n (mAP@0.5 = 0.797, FPS = 103.3), CEFW-YOLO not only achieved superior detection accuracy but also a better balance between speed and computational complexity, with a cost of just 5.0 GFLOPs, the lowest among the models for which FLOPs are reported. These results demonstrate the advantages of CEFW-YOLO in terms of detection precision, real-time performance, and lightweight design, highlighting its potential for practical deployment in real-world scenarios (Figure 9).

4. Conclusions

This study addressed the challenges in apple leaf disease detection under natural environmental conditions, including high computational resource demands, poor adaptability to complex environments, and insufficient small object detection capability. We proposed a lightweight apple leaf disease detection network based on the YOLO11n architecture, named CEFW-YOLO, and introduced three new modules: CWSConv, C2PSA_ECCAttention, and FMLAttention. The CWSConv module, by combining channel compression with standard convolutions, reduced computational resource consumption and enhanced the detection ability for small objects. The C2PSA_ECCAttention module, through the integration of global information extraction, the fusion of horizontal and vertical convolutions, and cross-channel interaction enhancement, improved the model's feature capture ability in complex backgrounds. The FMLAttention module, leveraging multi-stage asymmetric convolutions and a linear attention mechanism, enhanced the model's ability to capture fine-grained features and local information. Additionally, the WIoU loss function was introduced to further optimize the detection performance of multi-scale targets. The experimental results on the ALD dataset show that CEFW-YOLO outperformed the original YOLO11n in terms of detection accuracy (mAP@0.5 and mAP@0.5:0.95), real-time performance (FPS), and computational efficiency (FLOPs). It also surpassed the other detectors compared in Table 3 (Faster R-CNN, YOLOv5n, YOLOv6n, YOLOv8n, YOLOv9t, and YOLOv10n). These results validate the effectiveness of the CWSConv, C2PSA_ECCAttention, and FMLAttention modules and the WIoU loss function in apple leaf disease detection, and they indicate that CEFW-YOLO has practical application potential.

Despite the significant achievements of CEFW-YOLO in the detection of apple leaf diseases, there remains room for further optimization. In future work, we plan to investigate the following aspects: Firstly, in response to a broader range of disease types and the increasing complexity of scenarios, we aim to further expand and enrich the apple leaf disease dataset in order to enhance the model’s generalization ability. Secondly, we will consider integrating more sophisticated self-supervised learning methods with the existing network architecture to reduce reliance on annotated data while improving the model’s adaptability and robustness. Additionally, we will explore the incorporation of multimodal data (such as infrared images, spectral data, etc.) to enhance the model’s comprehensive representation of disease features, thereby enabling more effective disease detection under various lighting conditions and complex scenarios. Finally, we hope to extend the application of CEFW-YOLO to the detection of diseases in other crops, further validating its versatility and practical value in agricultural production management.

Author Contributions

Conceptualization, J.T. and X.L.; methodology, J.T. and Y.H.; software, J.T. and M.A.I.; validation, J.T. and X.L.; formal analysis, X.L., Y.H. and M.A.I.; investigation, J.T., X.L. and M.A.I.; resources, X.L.; data curation, J.T.; writing—original draft preparation, J.T. and X.L.; writing—review and editing, J.T., X.L. and Y.H.; visualization, J.T.; supervision, X.L.; project administration, X.L.; funding acquisition, X.L. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Key R&D Projects in Zhejiang Province [2023C02009, 2023C02043, 2022C02044], the National Natural Science Foundation of China [32171889], and the Earmarked Fund for CARS [CARS-19-02A].

Data Availability Statement

The data provided in this study are available from the corresponding author upon request. The data are not publicly available due to privacy.

Conflicts of Interest

The authors declare no conflicts of interest.

References

[1] Dutot, M.; Nelson, L.M.; Tyson, R.C. Predicting the Spread of Postharvest Disease in Stored Fruit, with Application to Apples. Postharvest Biol. Technol. 2013, 85, 45–56.
[2] Yang, R.; He, Y.; Hu, Z.; Gao, R.; Yang, H. CA-YOLOv5: A YOLO Model for Apple Detection in the Natural Environment. Syst. Sci. Control Eng. 2024, 12, 2278905.
[3] Lv, M.; Su, W.-H. YOLOV5-CBAM-C3TR: An Optimized Model Based on Transformer Module and Attention Mechanism for Apple Leaf Disease Detection. Front. Plant Sci. 2024, 14, 1323301.
[4] Zhu, S.; Ma, W.; Wang, J.; Yang, M.; Wang, Y.; Wang, C. EADD-YOLO: An Efficient and Accurate Disease Detector for Apple Leaf Using Improved Lightweight YOLOv5. Front. Plant Sci. 2023, 14, 1120724.
[5] Pavate, A.; Kukreja, S.; Janrao, S.; Bankar, S.; Patil, R.; Bidve, V. Efficient Model for Cotton Plant Health Monitoring via YOLO-Based Disease Prediction. Indones. J. Electr. Eng. Comput. Sci. 2025, 37, 164.
[6] Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 1137–1149.
[7] He, K.; Gkioxari, G.; Dollar, P.; Girshick, R. Mask R-CNN. In Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 2980–2988.
[8] Vijayakumar, A.; Vairavasundaram, S. YOLO-Based Object Detection Models: A Review and Its Applications. Multimed. Tools Appl. 2024, 83, 83535–83574.
[9] Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.-Y.; Berg, A.C. SSD: Single Shot Multibox Detector. Computer Vision—ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, 11–14 October 2016, Proceedings, Part I; Lecture Notes in Computer Science; Springer International Publishing: Cham, Switzerland, 2016; pp. 21–37.
[10] Gong, X.; Zhang, S. A High-Precision Detection Method of Apple Leaf Diseases Using Improved Faster R-CNN. Agriculture 2023, 13, 240.
[11] Li, X.; Yang, C. Maize Leaf Disease Identification Method Based on Improved Faster R-CNN. In Proceedings of the 2023 IEEE 5th International Conference on Civil Aviation Safety and Information Technology (ICCASIT), Dali, China, 11 October 2023; pp. 961–964.
[12] Bondre, S.; Patil, D. Crop Disease Identification Segmentation Algorithm Based on Mask-RCNN. Agron. J. 2024, 116, 1088–1098.
[13] Luo, W.; Cai, L.; Yang, Y. Apple Leaf Disease Recognition in Natural Scenes Based on Re-Parameterized SSD Algorithm. In Proceedings of the International Conference on Computer Graphics, Artificial Intelligence, and Data Processing (ICCAID 2022), Guangzhou, China, 23–25 December 2023; Liang, R., Wang, J., Eds.; SPIE: Guangzhou, China, 2023; p. 137.
[14] Yao, X.; Yang, F.; Yao, J. YOLO-Wheat: A Wheat Disease Detection Algorithm Improved by YOLOv8s. IEEE Access 2024, 12, 133877–133888.
[15] Trinh, T.; Bui, X.; Tran, T.; Nguyen, H.; Ninh, K. Mangosteen Fruit Detection Using Improved Faster R-CNN. Intelligence of Things: Technologies and Applications: The First International Conference on Intelligence of Things (ICIT 2022), Hanoi, Vietnam, 17–19 August 2022, Proceedings; Lecture Notes on Data Engineering and Communications Technologies; Springer International Publishing: Cham, Switzerland, 2022; pp. 366–375.
[16] Du, L.; Sun, Y.; Chen, S.; Feng, J.; Zhao, Y.; Yan, Z.; Zhang, X.; Bian, Y. A Novel Object Detection Model Based on Faster R-CNN for Spodoptera frugiperda According to Feeding Trace of Corn Leaves. Agriculture 2022, 12, 248.
[17] Zhang, Z.; Wang, S.; Wang, C.; Wang, L.; Zhang, Y.; Song, H. Segmentation Method of Zanthoxylum bungeanum Cluster Based on Improved Mask R-CNN. Agriculture 2024, 14, 1585.
[18] Huang, F.; Li, Y.; Liu, Z.; Gong, L.; Liu, C. A Method for Calculating the Leaf Area of Pak Choi Based on an Improved Mask R-CNN. Agriculture 2024, 14, 101.
[19] Wang, C.; Yang, G.; Huang, Y.; Liu, Y.; Zhang, Y. A Transformer-Based Mask R-CNN for Tomato Detection and Segmentation. IFS 2023, 44, 8585–8595.
[20] Guo, W.; Feng, S.; Feng, Q.; Li, X.; Gao, X. Cotton Leaf Disease Detection Method Based on Improved SSD. Int. J. Agric. Biol. Eng. 2024, 17, 211–220.
[21] Zhang, W.; Huang, H.; Sun, Y.; Wu, X. AgriPest-YOLO: A Rapid Light-Trap Agricultural Pest Detection Method Based on Deep Learning. Front. Plant Sci. 2022, 13, 1079384.
[22] Tian, Y.; Wang, S.; Li, E.; Yang, G.; Liang, Z.; Tan, M. MD-YOLO: Multi-Scale Dense YOLO for Small Target Pest Detection. Comput. Electron. Agric. 2023, 213, 108233.
[23] Zhang, R.; Liu, T.; Liu, W.; Yuan, C.; Seng, X.; Guo, T.; Wang, X. YOLO-CRD: A Lightweight Model for the Detection of Rice Diseases in Natural Environments. Phyton 2024, 93, 1275–1296.
[24] Zhang, Z.; Yang, Y.; Xu, X.; Liu, L.; Yue, J.; Ding, R.; Lu, Y.; Liu, J.; Qiao, H. GVC-YOLO: A Lightweight Real-Time Detection Method for Cotton Aphid-Damaged Leaves Based on Edge Computing. Remote Sens. 2024, 16, 3046.
[25] Zhao, C.; Bai, C.; Yan, L.; Xiong, H.; Suthisut, D.; Pobsuk, P.; Wang, D. AC-YOLO: Multi-Category and High-Precision Detection Model for Stored Grain Pests Based on Integrated Multiple Attention Mechanisms. Expert Syst. Appl. 2024, 255, 124659.
[26] Boudaa, B.; Abada, K.; Aichouche, W.A.; Nabil Belakermi, A. Advancing Plant Diseases Detection with Pre-Trained YOLO Models. In Proceedings of the 2024 6th International Conference on Pattern Analysis and Intelligent Systems (PAIS), El Oued, Algeria, 24–25 April 2024; pp. 1–6.
[27] Liu, B.; Huang, X.; Sun, L.; Wei, X.; Ji, Z.; Zhang, H. MCDCNet: Multi-Scale Constrained Deformable Convolution Network for Apple Leaf Disease Detection. Comput. Electron. Agric. 2024, 222, 109028.
[28] Zeng, W.; Pang, J.; Ni, K.; Peng, P.; Hu, R. Apple Leaf Disease Detection Based on Lightweight YOLOv8-GSSW. Appl. Eng. Agric. 2024, 40, 589–598.
[29] Zhou, S.; Yin, W.; He, Y.; Kan, X.; Li, X. Detection of Apple Leaf Gray Spot Disease Based on Improved YOLOv8 Network. Mathematics 2025, 13, 840.
[30] Yan, C.; Yang, K. FSM-YOLO: Apple Leaf Disease Detection Network Based on Adaptive Feature Capture and Spatial Context Awareness. Digit. Signal Process. 2024, 155, 104770.
[31] Qiu, Z.; Xu, Y.; Chen, C.; Zhou, W.; Yu, G. Enhanced Disease Detection for Apple Leaves with Rotating Feature Extraction. Agronomy 2024, 14, 2602.
[32] Huo, S.; Duan, N.; Xu, Z. An Improved Multi-scale YOLOv8 for Apple Leaf Dense Lesion Detection and Recognition. IET Image Proc. 2024, 18, 4913–4927.
[33] Tong, Z.; Chen, Y.; Xu, Z.; Yu, R. Wise-IoU: Bounding Box Regression Loss with Dynamic Focusing Mechanism. arXiv 2023, arXiv:2301.10051.
[34] Zhu, X.; Lyu, S.; Wang, X.; Zhao, Q. TPH-YOLOv5: Improved YOLOv5 Based on Transformer Prediction Head for Object Detection on Drone-Captured Scenarios. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), Montreal, QC, Canada, 11–17 October 2021; pp. 2778–2788.
[35] Li, C.; Li, L.; Jiang, H.; Weng, K.; Geng, Y.; Li, L.; Ke, Z.; Li, Q.; Cheng, M.; Nie, W.; et al. YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications. arXiv 2022, arXiv:2209.02976.
[36] Wang, C.-Y.; Yeh, I.-H.; Mark Liao, H.-Y. YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information. In Proceedings of the Computer Vision—ECCV 2024, Milan, Italy, 29 September–4 October 2024; Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G., Eds.; Lecture Notes in Computer Science. Springer Nature: Cham, Switzerland, 2025; Volume 15089, pp. 1–21, ISBN 978-3-031-72750-4.
[37] Wang, A.; Chen, H.; Liu, L.; Chen, K.; Lin, Z.; Han, J.; Ding, G. YOLOv10: Real-Time End-to-End Object Detection. Adv. Neural Inf. Process. Syst. 2024, 37, 107984–108011.

Figure 1. Data acquisition and enhancement. (A) Map of data acquisition location; (B) different shooting background renderings; (C) apple leaf disease images; (D) data enhancement atlas.

Figure 2. Model improvement process, entire structure of CEFW-YOLO, and comparison of experimental results. (A) Overall technology roadmap; (B) network structure of CEFW-YOLO; (C) performance comparison between CEFW-YOLO and YOLO11n.

Figure 3. CWSConv module structure.

Figure 4. ECCAttention module structure.

Figure 5. C2PSA_ECCAttention module structure.

Figure 6. FMLAttention module structure.

Figure 7. Comparison of feature visualizations before and after model improvement.

Figure 8. (a) YOLO11n detection results; (b) CEFW-YOLO detection results.

Figure 9. Test results for each model.

Table 1. Improved experimental results.

Model	P	R	mAP@0.5	mAP@0.5:0.95	FPS	FLOPs(G)
YOLO11n	0.818	0.779	0.797	0.519	103.3	6.3
CEFW-YOLO	0.855	0.812	0.873	0.571	136.4	5.0

Table 2. Ablation experiment.

CWSConv	C2PSA_ECCAttention	FMLAttention	WIoU	P	R	mAP@0.5	mAP@0.5:0.95
×	×	×	×	0.818	0.779	0.797	0.519
√	×	×	×	0.823	0.784	0.805	0.524
×	√	×	×	0.820	0.787	0.802	0.522
×	×	√	×	0.835	0.791	0.835	0.531
×	×	×	√	0.830	0.795	0.827	0.528
√	√	×	×	0.837	0.797	0.841	0.539
√	×	√	×	0.839	0.799	0.847	0.545
√	√	√	×	0.842	0.800	0.855	0.556
×	√	√	√	0.845	0.804	0.865	0.562
√	√	√	√	0.855	0.812	0.873	0.571

Table 3. Comparison experiment.

Model	P	R	mAP@0.5	mAP@0.5:0.95	FPS	FLOPs(G)
FasterRCNN	0.753	0.732	0.720	0.441	25.3	-
YOLOv5n	0.787	0.794	0.794	0.518	98.2	7.1
YOLOv6n	0.813	0.774	0.795	0.520	54.5	11.8
YOLOv8n	0.798	0.785	0.804	0.526	76.2	8.1
YOLOv9t	0.809	0.775	0.802	0.536	90.1	7.6
YOLOv10n	0.817	0.766	0.790	0.515	78.7	8.2
YOLO11n	0.818	0.779	0.797	0.519	103.3	6.3
CEFW-YOLO	0.855	0.812	0.873	0.571	136.4	5.0

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

