Article

Research on Lightweight Rice False Smut Disease Identification Method Based on Improved YOLOv8n Model

1 College of Information and Electrical Engineering, Shenyang Agricultural University, Shenyang 110866, China
2 Liaoning Key Laboratory of Intelligent Agricultural Technology, Shenyang 110866, China
* Authors to whom correspondence should be addressed.
Agronomy 2024, 14(9), 1934; https://doi.org/10.3390/agronomy14091934
Submission received: 21 July 2024 / Revised: 23 August 2024 / Accepted: 27 August 2024 / Published: 28 August 2024
(This article belongs to the Section Pest and Disease Management)

Abstract

In order to detect rice false smut quickly and accurately, a lightweight detection model, YOLOv8n-MBS, is proposed in this study. The model introduces the C2f_MSEC module in place of C2f in the backbone network to better extract the key features of false smut, enhances the neck network's ability to fuse features of false smut at different sizes by using a weighted bidirectional feature pyramid network, and designs a group-normalized shared-convolution lightweight detection head to reduce the number of parameters in the model head and make the model lightweight. The experimental results show that YOLOv8n-MBS achieves an average accuracy of 93.9% with 1.4 M parameters and a model size of 3.3 MB. Compared with the SSD model, average accuracy increased by 4%, the number of parameters decreased by 89.8%, and the model size decreased by 86.9%. Compared with the YOLOv7-tiny, YOLOv5n, YOLOv5s, and YOLOv8n models of the YOLO series, YOLOv8n-MBS showed outstanding accuracy and detection performance. Compared with the latest YOLOv9t and YOLOv10n models, average accuracy increased by 2.8% and 2.2%, the number of parameters decreased by 30% and 39.1%, and the model size decreased by 29.8% and 43.1%, respectively. This method enables more accurate and lighter-weight detection of false smut, providing a basis for intelligent management of rice false smut in the field and thus promoting food security.

1. Introduction

Rice is the main food crop for more than half of the global population, and its high quality and yield are essential to ensuring food security in China [1]. The incidence of rice false smut (RFS) has been increasing year by year and has become a significant threat to rice production, owing to the widespread promotion of high-yielding, densely planted, large-spike rice varieties and changes in the global climate [2,3]. False smut is a disease of the spike caused by the fungus Ustilaginoidea virens: it first appears as small yellowish-green clumps at the junction of the glumes, gradually expands until it wraps around the whole glume as a dark green or olive-colored ball, and finally ruptures and becomes covered with dark green powder. Currently, rice false smut occurs in more than 10 provinces in China, affecting more than one-third of the rice planting area, with average yield reductions of 20–30%; in severe cases, losses can reach 50% or more, and some fields may even produce no harvest at all [4]. Rice false smut not only reduces yield but also produces a variety of mycotoxins, such as ustiloxins and ustilaginoidins, which are a severe health hazard after processing and consumption and pose a serious threat to rice food safety [5,6,7]. Therefore, there is an urgent need for a rapid and accurate means of detecting false smut.
Existing disease detection methods rely on plant protection staff conducting field surveys, which are accurate and reliable but require a large workforce and substantial material resources [8,9,10,11]. Biological methods such as PCR are accurate but require professional technical guidance and costly equipment. Disease detection based on digital images, by contrast, not only captures the phenotypic characteristics of diseases well but is also inexpensive and widely applicable. On extracted digital images, disease detection can be carried out with deep learning techniques, whereas traditional machine learning relies heavily on hand-crafted feature extraction, a process that suffers from subjective human judgment and poor generalizability [12].
Deep learning models, on the other hand, show great potential in disease recognition by learning disease features directly from raw data; their powerful feature extraction and detection performance provide new ideas for disease detection [13,14,15,16,17]. Masood et al. proposed MaizeNet, which uses ResNet-50 as the base network to improve Faster R-CNN for detecting three classes of maize leaf diseases, achieving an average accuracy of 97.89% [18]. Tian et al. used the SSD algorithm to perform multiscale feature fusion and introduced an attention mechanism module for detecting apple leaf diseases, reaching an average detection accuracy of 83.19% [19]. Zhong et al. integrated bidirectional connection and connection modules into the Darknet53 structure to strengthen the network and effectively reduce false and missed detections; the resulting algorithm achieved an mAP (0.5) of 96.1% [20]. Xie et al. proposed YOLO-EAF, which enhances the model's feature extraction capability by integrating the efficient multiscale attention (EMA) module into the backbone feature extraction network and improves generalization by strengthening feature fusion across levels with the adaptive spatial feature fusion (ASFF) module. Experiments show that the accuracy of YOLO-EAF on a self-constructed dataset improves by 8.4% to 82.7% compared with YOLOv8n, although detection speed is slightly reduced [21].
To address the large frameworks and complex structures of convolutional neural networks, some researchers have used lightweight networks for disease identification. Sun et al. designed a lightweight feature extraction module to reduce the number of model parameters and improve small-target detection; the final model is only 2.4 MB, with 22% fewer parameters than YOLOv5s [22]. Li et al. integrated the GhostNet lightweight network and the Triplet attention mechanism to improve detection of densely distributed diseases while reducing the number of parameters; experiments showed an mAP (0.5) of 91.40% at a model size of 11.20 MB [23]. Yang et al. proposed a lightweight improvement of YOLOv8s to address the large parameter counts and high resource demands that make such models difficult to deploy on mobile devices, replacing the C2f module with a C2f-Faster module to reduce parameters and computation; the improved YOLOv8s-CGF model is only 11.7 MB, roughly half (51.4%) the size of the original YOLOv8s [24]. Bai et al. established T-YOLO, a lightweight and efficient model for accurately detecting tea nutrient buds in complex environments, by pruning the YOLOv5m head network and integrating a dynamic detection head; the resulting model has 11.26 M parameters, 47% fewer than the original [25]. Ma et al. proposed a lightweight YOLOv8n-ShuffleNetv2-Ghost-SE model for monitoring apple fruits throughout the growing period in smart orchards, with an accuracy of 94.1% and 1.18 M parameters [26]. Solimani et al. proposed a YOLOv8-based model for effectively detecting flowers, fruits, and nodes on tomato plants, addressing data imbalance and improving detection of objects of different sizes in complex environments [27]. Li et al. used their proposed model to identify diseases on an apple dataset with 98.2% accuracy, and visualization of the results showed that the model focuses well on the critical features of apple diseases [28].
The scholars above balanced detection accuracy and speed by improving model structure, which improved practical performance to a certain extent. However, most of these studies addressed diseases on crop leaves. False smut, in contrast, is more challenging to detect because it develops on the spike and varies in morphological size. Therefore, this study proposes a YOLOv8n-MBS model for false smut detection, aiming to better extract the key features of false smut, fuse features across different false smut sizes, and reduce the model's parameters and size so that it is lightweight enough for later real-time detection of false smut on smart devices in the field, enabling agricultural control strategies to limit the spread of the disease. The main improvements in this paper are as follows:
(1)
Backbone: a multiscale lightweight convolutional module, C2f_MSEC, is designed to replace some of the C2f modules of the backbone network for better extraction of the key features of false smut.
(2)
Neck: the BiFPN module is integrated, a new small-size detection layer is added, and the large-size detection layer is removed, reducing the model parameters while enhancing feature fusion across different false smut sizes.
(3)
Detection head: a new shared-convolution lightweight detection head is designed, processing the results of the convolution operation with a group normalization layer, which further lightens the model while preserving training effectiveness.

2. Materials and Methods

2.1. Digital Image Data Acquisition and Processing

2.1.1. Experimental Design and Digital Image Acquisition of Rice False Smut

The field trial was conducted at the Scientific Experimental Base of Shenyang Agricultural University in Shenyang City, Liaoning Province (123°55′85″ E, 41°81′63″ N), as shown in Figure 1. The base is located in the south of northeast China, with an average annual temperature of 10.4 °C, an average yearly precipitation of 600–800 mm, and a temperate continental semi-humid climate. The rice varieties bred in this experimental field include Liao Japonica 9, Northern Japonica 202, Northern Japonica 204, and Northern Japonica 1705.
The experiment was conducted in this test area from August to September 2023 (rice heading to maturity). False smut data were sampled every three days to obtain images of rice at different incidence levels. Images were collected between 10 a.m. and 3 p.m. To ensure data diversity and improve the model's generalization ability, different shooting angles and distances were used during collection. Images were captured with a Canon EOS 1500D DSLR camera (approximately 24.1 effective megapixels) fitted with a Canon EF-S 18–55 mm f/4–5.6 IS STM lens, using automatic white balance, at a resolution of 1920 × 1280 pixels, and saved in JPG format, preserving background information such as the rice paddy. The captured images are shown in Figure 2.

2.1.2. Data Labeling and Dataset Construction

The LabelImg image annotation tool [29] was used to annotate the false smut targets, yielding labels in txt format. To avoid overfitting caused by the small number of samples, the images and labels were augmented to suit different training needs: Gaussian noise and Gaussian blur were applied to simulate additional environmental influences, and spatial transformations such as flipping, random rotation, and random translation were applied to increase spatial diversity. Together, these two classes of enhancement increase the environmental diversity of the rice false smut data and improve the model's generalization ability [30]. Part of the enhanced image set is shown in Figure 3. In total, 4680 images and the corresponding txt label files were obtained and divided into training, test, and validation sets at a ratio of 7:2:1. The training set of 3276 images contains 12,181 false smut labels, the test set of 936 images contains 4224 labels, and the validation set of 468 images contains 1780 labels.
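As an illustration of this pipeline, the sketch below applies the same families of augmentation (noise, blur, flip, rotation, translation) and the 7:2:1 split. It is a minimal example assuming OpenCV and NumPy; the parameter ranges are hypothetical, since the paper does not report exact settings, and the matching transformation of the YOLO txt labels under geometric operations is omitted for brevity.

```python
import random

import cv2
import numpy as np


def augment(image):
    """Apply one randomly chosen augmentation from the families used in the paper.
    Parameter ranges here are illustrative assumptions."""
    h, w = image.shape[:2]
    op = random.choice(["noise", "blur", "flip", "rotate", "translate"])
    if op == "noise":  # additive Gaussian noise simulates sensor/environment effects
        noisy = image.astype(np.float32) + np.random.normal(0, 15, image.shape)
        return np.clip(noisy, 0, 255).astype(np.uint8)
    if op == "blur":  # Gaussian blur simulates defocus and motion softness
        return cv2.GaussianBlur(image, (5, 5), 0)
    if op == "flip":  # horizontal flip (label x-coordinates must be mirrored too)
        return cv2.flip(image, 1)
    if op == "rotate":  # random rotation about the image center
        M = cv2.getRotationMatrix2D((w / 2, h / 2), random.uniform(-30, 30), 1.0)
        return cv2.warpAffine(image, M, (w, h))
    # random translation by up to 10% of the image size
    tx, ty = random.uniform(-0.1, 0.1) * w, random.uniform(-0.1, 0.1) * h
    M = np.float32([[1, 0, tx], [0, 1, ty]])
    return cv2.warpAffine(image, M, (w, h))


def split_dataset(paths, seed=0):
    """7:2:1 train/test/validation split, as used for the 4680-image dataset."""
    rng = random.Random(seed)
    paths = list(paths)
    rng.shuffle(paths)
    n_train, n_test = int(0.7 * len(paths)), int(0.2 * len(paths))
    return (paths[:n_train],
            paths[n_train:n_train + n_test],
            paths[n_train + n_test:])
```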

2.2. Improved YOLOv8n Model Design for Lightweight Rice False Smut

Based on analysis of previous experiments and the existing research literature [31,32], the YOLOv8 [31] family of models was selected for disease detection in this study. To keep the detection model lightweight, the smallest version, YOLOv8n [33], was chosen and optimized. Although YOLOv8n adapts well to detecting rice false smut targets, its ability to extract and fuse features of false smut targets of different sizes still needs improvement. The large number of convolutional blocks and C2f blocks in YOLOv8n also inflates the parameter count, especially in the detection head, whose computations and parameters account for a large share of the model.
Therefore, this study makes the following improvements. The lightweight multiscale convolutional module C2f_MSEC replaces part of the C2f modules of the original backbone network to reduce the parameter volume and improve the extraction of features at different sizes. The feature pyramid network BiFPN is fused in, a new small-size detection head is added to improve detection of small targets in the early period of false smut, and the large-size detection layer is removed to reduce the number of parameters. Lightweight shared convolution is adopted in the detection head to further reduce the model's parameters and volume, meeting the requirements of subsequent embedded-platform integration. The improved network structure is shown in Figure 4, where Backbone is the backbone network, Neck is the neck network, and Head is the head network. The Backbone Conv blocks contain convolutional layers, BN (batch normalization), and activation functions for feature extraction, and Upsample denotes the up-sampling operation that increases the spatial resolution of the feature map. C2f_MSEC, BiFPN, and SLDH-Head are the three improvements of this paper and are described in detail below.

2.2.1. Lightweight Multiscale Convolutional Module Design

The C2f module in YOLOv8n suffers from channel information redundancy and extracts targets of different sizes poorly when extracting false smut features. To address this, a lightweight multiscale convolution module, MSEC (Multiscale Efficient Conv), is proposed. It combines the advantage of the GhostNet [34] network in reducing redundant features with the group convolution concept used by the MobileNet [35,36,37] networks to fuse multi-channel features [38,39], thereby reducing the number of parameters and computations, improving the portability of the model, and improving the model's ability to extract features of false smut at different sizes from a large number of redundant feature maps.
The module processes the input with convolution kernels of several scales, applied per layer across the channel dimension. Half of the channels bypass convolution and pass directly to the subsequent fusion output; the other half undergo feature extraction through grouped convolutions of different kernel sizes (typically 1 × 1, 3 × 3, 5 × 5, and 7 × 7, denoted by Φ in the figure). The two parts of the feature maps are then stacked and their channel information exchanged, and the fused output is produced by point-by-point convolution (1 × 1 Conv). This module effectively fuses information on false smut at different sizes and improves detection performance across the three distinct periods of false smut. It also reduces the number of parameters and the model size, lowering storage requirements and improving reliability in practical applications. The module structure is shown in Figure 5; a sketch of this design appears below.
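The PyTorch sketch below is one plausible reading of this description: half the channels bypass convolution, the other half pass through grouped 1 × 1/3 × 3/5 × 5/7 × 7 branches, and a channel shuffle plus 1 × 1 convolution fuse the result. The exact kernel assignment, grouping ratio, and how the module is embedded inside C2f_MSEC are assumptions, not the authors' released code.

```python
import torch
import torch.nn as nn


class MSEC(nn.Module):
    """Sketch of the Multiscale Efficient Conv described in Section 2.2.1.
    Requires `channels` divisible by 8 (half is split over four scales)."""

    def __init__(self, channels):
        super().__init__()
        half = channels // 2
        sub = half // 4
        # one grouped branch per kernel scale Φ ∈ {1, 3, 5, 7}; depthwise
        # grouping (groups=sub) is one choice consistent with "group convolution"
        self.branches = nn.ModuleList(
            nn.Conv2d(sub, sub, k, padding=k // 2, groups=sub, bias=False)
            for k in (1, 3, 5, 7)
        )
        self.fuse = nn.Sequential(  # point-by-point (1x1) fusion convolution
            nn.Conv2d(channels, channels, 1, bias=False),
            nn.BatchNorm2d(channels),
            nn.SiLU(),
        )

    @staticmethod
    def channel_shuffle(x, groups=2):
        # exchange channel information between the bypass and conv halves
        b, c, h, w = x.shape
        return x.view(b, groups, c // groups, h, w).transpose(1, 2).reshape(b, c, h, w)

    def forward(self, x):
        identity, conv_in = x.chunk(2, dim=1)  # half the channels skip convolution
        parts = conv_in.chunk(4, dim=1)        # remaining half split across scales
        conv_out = torch.cat([branch(p) for branch, p in zip(self.branches, parts)], dim=1)
        out = torch.cat([identity, conv_out], dim=1)
        return self.fuse(self.channel_shuffle(out))


# quick shape check: the module preserves (B, C, H, W)
assert MSEC(64)(torch.randn(1, 64, 80, 80)).shape == (1, 64, 80, 80)
```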

2.2.2. Fusion-Weighted Bidirectional Feature Pyramid Networks

During training, the features of rice false smut differ in size across the different periods of the disease. The stacking method of the traditional PANet [40] can assign unequal weights to false smut target features in the fused output, masking features of different sizes and dimensions and leading to false and missed detections of the disease target.
To address this problem, this study fuses BiFPN (bidirectional feature pyramid network) [41] into the YOLOv8n model, adds a new small-size detection layer, and removes the large-size detection layer. BiFPN introduces a bidirectional feature propagation mechanism that covers both top-down and bottom-up feature information, which conveys and fuses information across levels more comprehensively and finely and enhances the model's ability to capture false smut features of different sizes. BiFPN weights and selects features from the merged results, ensuring that the critical features of false smut targets of various sizes are extracted against the complex field background. This improves the accuracy and validity of false smut target detection while further reducing the number of model parameters and improving the overall effectiveness of the model. The structure of BiFPN is shown in Figure 6; a sketch of its weighted fusion follows.
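At the heart of BiFPN is the "fast normalized fusion" of Tan et al. [41]: each incoming feature map receives a learnable, non-negative weight, and the weights are normalized to sum to one before the maps are added. A minimal sketch follows, assuming the inputs have already been resized to a common resolution and channel width.

```python
import torch
import torch.nn as nn


class WeightedFusion(nn.Module):
    """Fast normalized fusion node from BiFPN [41]: out = sum_i (w_i * x_i),
    with w_i = relu(p_i) / (sum_j relu(p_j) + eps) learned per input."""

    def __init__(self, num_inputs, eps=1e-4):
        super().__init__()
        self.weights = nn.Parameter(torch.ones(num_inputs))
        self.eps = eps

    def forward(self, inputs):
        w = torch.relu(self.weights)      # keep the fusion weights non-negative
        w = w / (w.sum() + self.eps)      # normalize so the weights sum to ~1
        return sum(wi * x for wi, x in zip(w, inputs))


# fusing a top-down and a bottom-up feature map of the same shape
fuse = WeightedFusion(num_inputs=2)
p_td, p_bu = torch.randn(1, 64, 40, 40), torch.randn(1, 64, 40, 40)
out = fuse([p_td, p_bu])  # shape (1, 64, 40, 40)
```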

2.2.3. Shared Convolutional Lightweight Detection Head Design

The original model's detection head occupies most of the model's memory, affecting efficiency and performance. For this reason, this study designs a new shared convolutional lightweight detection head (SLDH) that reduces memory occupancy through lightweight processing while maintaining the model's training effectiveness. The module structure is shown in Figure 7.
Feature extraction for false smut is performed according to Equation (1) in the convolutional operation:
$$y_{ij} = \sum_{m=0}^{k-1} \sum_{n=0}^{k-1} x_{i+m,\,j+n}\, w_{m,n} \quad (1)$$
where $y_{ij}$ is the value of the output feature map at position $(i, j)$, $x_{i+m,\,j+n}$ is the value of the input feature map at position $(i+m, j+n)$, $w_{m,n}$ is the weight of the convolution kernel at position $(m, n)$, and $k$ is the size of the convolution kernel.
The group normalization (GN) [42] layer is used instead of the original BN layer to process the results of the convolution operation, preserving the model's training effect while keeping it lightweight. The channels $c$ of the feature map $x$ (with $c$ ranging from 0 to $C-1$, where $C$ is the number of channels) are divided into $G$ groups (here, $G = 16$) of $C/G$ channels each, and the elements within each group are normalized. For each element $x_{n,c,h,w}$ (where $n$ is the index in the batch, $c$ is the channel index, and $h$ and $w$ are the spatial location indexes) in group $g$ ($g$ ranging from 0 to $G-1$), group normalization is given by (2):
$$\hat{x}_{n,c,h,w} = \frac{x_{n,c,h,w} - \mu_g}{\sqrt{\sigma_g^2 + \epsilon}} \quad (2)$$
where $\mu_g$ and $\sigma_g^2$ are the mean and variance of group $g$, respectively, and $\epsilon$ is a small constant for numerical stability. In addition, the nonlinear representation of the model is enhanced using the SiLU (also known as Swish) activation function, given by (3):
$$f(x) = x \cdot \mathrm{sigmoid}(x) = \frac{x}{1 + e^{-x}} \quad (3)$$
To cope with the inconsistent sizes of the false smut targets detected by each detection head, a Scale layer is invoked to scale the features (S in the figure represents the Scale layer operation). The Scale layer learns a parameter that scales the input data, gradually adjusting the magnitude of the outputs so that the model flexibly adapts to different data sizes. The operation of the Scale layer can be expressed as (4):
$$y = x \cdot \mathrm{scale} \quad (4)$$
In this expression, x is an input tensor of arbitrary shape, and scale is a learnable parameter of shape (1).
In this study, by designing a shared-convolution lightweight detection head, employing group normalization, introducing the SiLU activation function, and using the Scale layer, the number of model parameters and the model volume are further reduced, improving the operational efficiency and stability of the model while ensuring that feature information for false smut targets of different sizes is not lost. A sketch of these pieces put together follows.
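The sketch below shows one way the shared head could look: a single Conv-GN-SiLU stack (counted once, reused on every pyramid level) followed by a per-level Scale layer implementing Eq. (4). Channel widths, the single prediction branch, and the number of levels are assumptions for illustration; the paper's head will differ in detail.

```python
import torch
import torch.nn as nn


class Scale(nn.Module):
    """Learnable scalar from Eq. (4): y = x * scale."""

    def __init__(self):
        super().__init__()
        self.scale = nn.Parameter(torch.ones(1))

    def forward(self, x):
        return x * self.scale


class SharedLightweightHead(nn.Module):
    """Sketch of the SLDH idea: one shared Conv-GN-SiLU stack serves all
    detection scales, so its parameters are counted only once."""

    def __init__(self, channels, num_outputs, num_levels=3, groups=16):
        super().__init__()
        self.shared = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.GroupNorm(groups, channels),  # GN (G = 16) replaces BN, Eq. (2)
            nn.SiLU(),                       # Eq. (3)
        )
        self.pred = nn.Conv2d(channels, num_outputs, 1)  # shared prediction conv
        self.scales = nn.ModuleList(Scale() for _ in range(num_levels))

    def forward(self, feats):
        # the same convolutions process every pyramid level; only the
        # per-level Scale layers differ, compensating for target-size variation
        return [s(self.pred(self.shared(f))) for s, f in zip(self.scales, feats)]


head = SharedLightweightHead(channels=64, num_outputs=5)  # e.g., 4 box terms + 1 class
levels = [torch.randn(1, 64, s, s) for s in (80, 40, 20)]
outs = head(levels)  # three output maps, one per scale
```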

2.3. Test Environment Configuration and Parameter Setting

The experimental platform used an Intel Core i9-10980XE CPU with a clock frequency of 2934 MHz, 64 GB of memory, and a Quadro RTX 5000 GPU. The software environment was PyTorch 1.13.1 + CUDA 11.6 + Python 3.8.18, running on Windows 11. The experiments used an input image size of 640 × 640 and 300 training epochs. The Adam gradient optimization algorithm was used with an initial learning rate of 0.001 and a momentum factor of 0.937. A sketch of an equivalent training call is shown below.
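For reference, this configuration maps onto the Ultralytics YOLOv8 training API roughly as sketched below. The dataset YAML name is hypothetical, and the custom YOLOv8n-MBS modules would have to be registered through a modified model YAML rather than the stock yolov8n.yaml used here.

```python
from ultralytics import YOLO

# Stock YOLOv8n with the training hyperparameters reported in Section 2.3.
model = YOLO("yolov8n.yaml")
model.train(
    data="rice_false_smut.yaml",  # hypothetical dataset config (paths + classes)
    imgsz=640,                    # input image size 640 x 640
    epochs=300,                   # 300 training epochs
    optimizer="Adam",             # Adam gradient optimization
    lr0=0.001,                    # initial learning rate
    momentum=0.937,               # momentum factor
)
```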

2.4. Evaluation Indicators

The main evaluation metrics in this paper are mean average precision mAP (0.5), recall, the number of model parameters, and model size. Recall measures the proportion of actual positive examples that the model correctly predicts and is calculated as:
$$R = \frac{TP}{TP + FN} \quad (5)$$
The average precision of each category is used to evaluate the model's target detection performance; mAP (0.5) is obtained by averaging the average precision over all categories:
$$mAP = \frac{1}{N} \sum_{i=1}^{N} \int_{0}^{1} \mathrm{Precision}(\mathrm{Recall})\, \mathrm{d}\,\mathrm{Recall} \quad (6)$$
Here, $N$ denotes the number of categories, and the integral gives $AP_i$, the average precision for category $i$, computed from the intersection over union (IoU) between the predicted and ground-truth boxes. The metric mAP (0.5) is the mean average precision when the IoU threshold is set to 0.5. These metrics provide a comprehensive assessment of accuracy under different IoU conditions.
The parameter metric counts the parameters in a model and indicates model complexity: a larger parameter count generally means a more complex model. For a convolutional layer, the number of parameters can be calculated as:
$$\mathrm{Parameters} = C_{in} \times C_{out} \times K \times K \quad (7)$$
Here, $K$ denotes the convolutional kernel size, and $C_{in}$ and $C_{out}$ denote the numbers of input and output feature channels, respectively.
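The sketch below makes Equations (5)–(7) concrete: recall from true-positive and false-negative counts, a per-class AP as a simple rectangle-rule integral of precision over recall, and the convolutional parameter count checked against PyTorch. The AP integration scheme is a simplification of the interpolated AP that detection toolkits actually use.

```python
import torch.nn as nn


def recall(tp, fn):
    """Eq. (5): fraction of actual false smut targets that were detected."""
    return tp / (tp + fn)


def average_precision(precisions, recalls):
    """Eq. (6) for one class: integrate precision over recall with a
    rectangle rule (real toolkits use interpolated AP; this is a sketch)."""
    ap, prev_r = 0.0, 0.0
    for p, r in sorted(zip(precisions, recalls), key=lambda pr: pr[1]):
        ap += p * (r - prev_r)
        prev_r = r
    return ap


# Eq. (7): parameter count of one conv layer, verified against PyTorch.
cin, cout, k = 64, 128, 3
conv = nn.Conv2d(cin, cout, k, bias=False)
assert sum(p.numel() for p in conv.parameters()) == cin * cout * k * k  # 73,728
```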

3. Results

3.1. Model Training Results

3.1.1. YOLOv8 Series Model Analysis

All experiments used the Adam gradient optimization algorithm with an initial learning rate of 0.001 and a momentum factor of 0.937 for the YOLOv8 series models. The results in Table 1 show that although the YOLOv8x model has the highest average accuracy, at 95.9%, its parameter count reaches 68.1 M and its model size 136.7 MB, a dramatic increase over the smallest model, YOLOv8n, and at odds with this study's need for a lightweight model that can later be deployed on mobile devices. Therefore, YOLOv8n, the smallest of the series, was selected as the baseline model for this study.

3.1.2. Comparison of Detection Performance of Different Models

The improved YOLOv8n-MBS model was trained and tested against the mainstream target detection models YOLOv8n, SSD [43], YOLOv7-tiny [44], YOLOv5s [17], YOLOv5n [17], YOLOv9t [45], and YOLOv10n [46] on the same in-field rice false smut dataset. Model performance was evaluated on recall, mAP (0.5), mAP (0.5–0.95), number of parameters, and model weight size. The comparison results are summarized in Table 2.
As shown in Table 2, the YOLOv8n-MBS model achieved significant results: it has substantially fewer parameters and a smaller model size than the other models, while its average accuracy is also higher. Compared with the SSD model, YOLOv8n-MBS improves average accuracy by 4%, reduces the number of parameters by 89.8%, and reduces model size by 86.9%. Compared with the YOLOv7-tiny, YOLOv5n, YOLOv5s, and YOLOv8n models of the YOLO series, YOLOv8n-MBS shows excellent detection results in both accuracy and model performance. Moreover, compared with YOLOv5n, which has the smallest parameter count and model size among those models, YOLOv8n-MBS improves average accuracy by 3.8% while reducing the number of parameters by 44% and the model size by 37.7%. Compared with the latest YOLOv9t and YOLOv10n models, average accuracy increases by 2.8% and 2.2%, the number of parameters decreases by 30% and 39.1%, and the model size decreases by 29.8% and 43.1%, respectively. Figure 8 shows the mAP (0.5), mAP (0.5–0.95), and recall curves for these models. In conclusion, the improved YOLOv8n-MBS model shows clear advantages across all indicators, obtaining excellent detection results with fewer parameters and a smaller model. This supports efficient deployment in resource-constrained equipment environments and the realization of efficient, sustainable field management of false smut.

4. Discussion

4.1. Ablation Study

Most current detection work targets leaf diseases of crops, and there are few reports on spike diseases. Moreover, in the natural field environment, false smut presents differently across periods of onset, and traditional target detection models struggle to extract the critical information accurately, resulting in missed and false detections of the disease. At the same time, existing target detection frameworks are large and structurally complex. To tackle these problems, this paper proposes a lightweight target detection model for false smut that accurately detects the disease at different periods in complex field environments. A comparative ablation test against the original baseline model was designed to validate the improved model's real-time performance and accuracy. Model training and testing used the same dataset and validation set to ensure the reliability of the test. The results are shown in Table 3.
As shown in Table 3, YOLOv8n-MBS significantly reduces the number of parameters and the model size compared with the original YOLOv8n model while also improving average accuracy. Replacing some of the C2f blocks of the YOLOv8n backbone with the multiscale lightweight convolutional module MSEC enhances feature extraction for false smut images of different sizes and reduces the parameter count and model size; average accuracy does not improve at this step, possibly because the module's grouping operation, while effectively cutting model size and computational redundancy, loses a small amount of the feature information contained in the channels. Integrating BiFPN as the feature fusion module in place of the original PANet, adding a new small-size detection layer to reduce the missed and false detection of small targets in the early stage of false smut, and removing the large-size detection layer further shrink the model: after integrating BiFPN, the number of parameters falls by 40.1% and the model size by 2.3 MB, while average accuracy improves. This indicates that BiFPN better integrates the feature information of false smut at different sizes and, through its weighted feature fusion mechanism, effectively gives the correct features higher weight, improving overall performance. Replacing the original detection head with the shared convolutional lightweight detection head SLDH further reduces the number of parameters and compensates for the original head occupying most of the model's memory; average accuracy does not improve here because shared convolution is comparatively weak at capturing global information, but accuracy is retained while the parameter count and model size drop markedly, by 52.2% and 47.6%, respectively, relative to the original model.
Overall, compared with the original YOLOv8n model, the improved model raises average accuracy by 0.4%, reduces the number of parameters by 53.3%, and reduces the model size by 47.6%. This indicates that the improved YOLOv8n-MBS model extracts features of rice false smut at different periods in the field more effectively while being more lightweight, providing a reference for subsequent real-time monitoring on mobile field terminals.

4.2. Analysis of Model Detection Performance for Different Shooting Distances

To further validate the performance of the YOLOv8n-MBS model in detecting false smut against complex field backgrounds, two different shooting angles were selected for testing, and comparative tests were conducted for the YOLOv5, SSD, YOLOv8, and YOLOv10 models and the YOLOv8n-MBS method of this paper. The test results are shown in Figure 9 and Figure 10, where the purple box marks a missed detection, the blue box a false detection, and the yellow box a repeated detection. Figure 9 shows that the YOLOv5 model misses a false smut target partially shaded by rice grains, the SSD model produces one false detection, and the YOLOv8n and YOLOv10n models detect the same target twice. The YOLOv8n-MBS model of this study, by contrast, detects partially occluded and blurred false smut targets well at different angles.

4.3. Feature Visualization Network

To better present the effect of the model in practical applications, Figure 11 shows Grad-CAM [47,48] class activation mapping visualizations for early, middle, and late images of false smut. To reflect the importance of the target object more directly, the last convolutional module of the network was chosen as the target layer for the computation. This choice follows from the hierarchical way convolutional neural networks process images: the last convolutional layer often contains the richest and most abstract feature information and thus most directly reflects the model's attention to the target object. The results show that, compared with the original baseline model, the visualizations produced by this method align more consistently with the location and extent of false smut, better highlighting the key regions of the disease and indicating that the improved network focuses better on the feature points of false smut. The Grad-CAM technique thus demonstrates superior performance in visualizing false smut detection and provides a new idea and tool for the intelligent diagnosis of agricultural diseases. A minimal version of the computation is sketched below.
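A minimal Grad-CAM computation over a chosen layer is sketched below, assuming a PyTorch model and a caller-supplied reduction of the detection output to a scalar class score (detection heads, unlike classifiers, need such a task-specific reduction). This is the generic Grad-CAM recipe of [47], not the authors' exact visualization code.

```python
import torch
import torch.nn.functional as F


def grad_cam(model, image, target_layer, class_score_fn):
    """Generic Grad-CAM: weight the target layer's activation maps by the
    global-average-pooled gradients of a scalar class score."""
    acts, grads = [], []
    h1 = target_layer.register_forward_hook(lambda m, i, o: acts.append(o))
    h2 = target_layer.register_full_backward_hook(lambda m, gi, go: grads.append(go[0]))
    try:
        score = class_score_fn(model(image))  # scalar score for the class of interest
        model.zero_grad()
        score.backward()
    finally:
        h1.remove()
        h2.remove()
    a, g = acts[0], grads[0]                    # (B, C, H, W) activations and gradients
    weights = g.mean(dim=(2, 3), keepdim=True)  # pooled gradients as channel weights
    cam = F.relu((weights * a).sum(dim=1))      # weighted sum over channels
    return cam / (cam.max() + 1e-8)             # normalize to [0, 1] for overlay
```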

4.4. Model Performance Analysis Based on New Data

To further test the detection performance of the model, the latest data collected in 2024 for the early and middle stages of false smut were selected for testing. Because the disease had not yet reached the late stage of development, late-stage testing is not included here and will be added in future work. The test results are shown in Figure 12 and Figure 13.
In these figures, the blue box marks a target box that correctly detects false smut, the red box a false detection, and the black box a missed detection. In the tests on the early and middle stages of false smut, YOLOv8n-MBS showed good detection performance, with no false or missed detections, indicating that the model extracts the key information of false smut well and integrates its features across different periods.
In this paper, only images of rice false smut taken from different angles were used; data collected under various weather conditions will be added in the future to increase sample diversity. The digital images in this paper were captured at ground scale; in the future, we will consider collecting false smut data from UAV remote sensing imagery for real-time detection of the disease over large field areas. We will continue to explore more accurate and lighter-weight models for detecting false smut so that timely field management measures can be taken to improve food security.

5. Conclusions

In this paper, a YOLOv8n-MBS model is proposed to achieve lighter-weight and more accurate detection of false smut, providing a reference for subsequent intelligent management of the disease in the field. The backbone of the model introduces the C2f_MSEC module, which integrates the lightweight GhostNet network, the MobileNet network, and the group convolution idea for better extraction of the critical features of false smut. At the neck, the BiFPN structure is fused in to improve feature fusion across different sizes of false smut, a new small-size detection layer is added to better handle small false smut targets, and the large-size detection layer is removed to reduce the number of parameters. A new shared-convolution lightweight detection head is designed to further shrink the model head and achieve a lightweight model. The model accurately detects false smut targets at different angles and periods. The experimental results show that YOLOv8n-MBS reaches an average accuracy of 93.9% with 1.4 M parameters and a model size of 3.3 MB. Compared with the SSD model, average accuracy improves by 4%, the number of parameters falls by 89.8%, and the model size falls by 86.9%. Compared with the YOLOv7-tiny, YOLOv5n, YOLOv5s, and YOLOv8n models of the YOLO series, YOLOv8n-MBS shows excellent detection results in both accuracy and model performance; against YOLOv5n, the smallest of these in parameters and size, YOLOv8n-MBS improves average accuracy by 3.8% while reducing the number of parameters by 44% and the model size by 37.7%. Compared with the latest YOLOv9t and YOLOv10n models, average accuracy increases by 2.8% and 2.2%, the number of parameters falls by 30% and 39.1%, and the model size falls by 29.8% and 43.1%, respectively. These properties will facilitate migrating the model to hardware platforms such as edge devices and embedded systems.

Author Contributions

Methodology, Software, Validation, Formal Analysis, Visualization, Investigation, Data curation, Resources, Writing—original draft, Writing—review & editing, L.Y.; Data curation, Investigation, F.G.; Data curation, Visualization, H.Z.; Methodology, Validation, Investigation, Funding acquisition, Project administration, Supervision, Y.C.; Data curation, Resources, Validation, Software, S.F. All authors have read and agreed to the published version of the manuscript.

Funding

This study was supported by the earmarked fund for the National Rice Industry Technology System (CARS-01) and by the Doctoral Research Fund of Shenyang Agricultural University (880423038).

Data Availability Statement

The experimental data are used jointly by our research team and are required for the training of subsequent students; however, they can be obtained on request from the authors of this paper or the corresponding author.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

  1. Lu, Y.; Yi, S.; Zeng, N.; Liu, Y.; Zhang, Y. Identification of rice diseases using deep convolutional neural networks. Neurocomputing 2017, 267, 378–384. [Google Scholar] [CrossRef]
  2. Rush, M.C.; Shahjahan, A.K.M.; Jones, J.P.; Groth, D.E. Outbreak of false smut of rice in Louisiana. Plant Dis. 2000, 84, 100. [Google Scholar] [CrossRef]
  3. Guo, X.; Li, Y.; Fan, J.; Li, L.; Huang, F.; Wang, W. Progress in the study of false smut disease in rice. J. Agric. Sci. Technol. 2012, 11, 1211–1217. [Google Scholar]
  4. Khanal, S.; Gaire, S.P.; Zhou, X.-G. Kernel Smut and False Smut: The Old-Emerging Diseases of Rice—A Review. Phytopathology 2023, 113, 931–944. [Google Scholar] [CrossRef]
  5. Koiso, Y.; Li, Y.; Iwasaki, S.; Hanaka, K.; Kobayashi, T.; Sonoda, R.; Fujita, Y.; Yaegashi, H.; Sato, Z. Ustiloxins, antimitotic cydic peptides from false smut balls on rice panicles caused by Ustilaginoidea virens. J. Antibiot. 1994, 47, 765–773. [Google Scholar] [CrossRef]
  6. Nakamura, K.; Izumiyama, N.; Ohtsubo, K.; Koiso, Y.; Iwasaki, S.; Sonoda, R.; Fujita, Y.; Yaegashi, H.; Sato, Z. “Lupinosis”-Like lesions in mice caused by ustiloxin, produced by Ustilaginoieda virens: A morphological study. Nat. Toxins 1994, 2, 22–28. [Google Scholar] [CrossRef] [PubMed]
  7. Zhou, L.; Lu, S.; Shan, T.; Wang, P. Chemistry and biology of mycotoxins from rice false smut pathogen. In Mycotoxins: Properties, Applications and Hazards; Melborn, B.J., Greene, J.C., Eds.; Nova Science Publishers: New York, NY, USA, 2012. [Google Scholar]
  8. Chahal, S. Epidemiology and management of two cereals. Indian Phytopathol. 2001, 54, 145–157. [Google Scholar]
  9. Bin, L.; Yun, Z.; He, D.; Li, Y. Identification of Apple Leaf Diseases Based on Deep Convolutional Neural Networks. Symmetry 2017, 10, 11. [Google Scholar] [CrossRef]
  10. Vithu, P.; Moses, J.A. Machine vision system for food grain quality evaluation: A review. Trends Food Sci. Technol. 2016, 56, 13–20. [Google Scholar] [CrossRef]
  11. Dutot, M.; Nelson, L.; Tyson, R. Predicting the spread of postharvest disease in stored fruit, with application to apples. Postharvest Biol. Technol. 2013, 85, 45–56. [Google Scholar] [CrossRef]
  12. Sujatha, R.; Chatterjee, J.M.; Jhanjhi, N.; Brohi, S.N. Performance of deep learning vs machine learning in plant leaf disease detection. Microprocess. Microsystems 2020, 80, 103615. [Google Scholar] [CrossRef]
  13. Ye, Z.; Zhao, M.; Jia, L. Research on image recognition of complex background crop diseases. Trans. Chin. Soc. Agric. Mach. 2021, 52, 118–124. [Google Scholar]
  14. Guo, X.; Yu, S.; Shen, H. A crop disease identification model based on global feature extraction. Trans. Chin. Soc. Agric. Mach. 2022, 53, 301–307. [Google Scholar]
  15. Sun, J.; Zhu, W.; Luo, Y. Identification of field crop leaf diseases based on improved MobileNet-V2. Trans. Chin. Soc. Agric. Eng. (Trans. CSAE) 2021, 37, 161–169. [Google Scholar]
  16. Du, T.; Nan, X.; Huang, J.; Zhang, W.; Ma, Z. Improving RegNet to identify the damage degree of various crop diseases. Trans. Chin. Soc. Agric. Eng. (Trans. CSAE) 2022, 38, 150–158. [Google Scholar]
  17. Sun, F.; Wang, Y.; Lan, P.; Zhang, X.; Chen, X.; Wang, Z. Identification method of apple fruit diseases based on improved YOLOv5s and transfer learning. Trans. Chin. Soc. Agric. Eng. (Trans. CSAE) 2022, 38, 171–179. [Google Scholar]
  18. Masood, M.; Nawaz, M.; Nazir, T.; Javed, A.; Alkanhel, R.; Elmannai, H.; Dhahbi, S.; Bourouis, S. MaizeNet: A Deep Learning Approach for Effective Recognition of Maize Plant Leaf Diseases. IEEE Access 2023, 11, 52862–52876. [Google Scholar] [CrossRef]
  19. Tian, Y.; Yang, G.; Wang, Z.; Wang, H.; Li, E.; Liang, Z. Apple detection during different growth stages in orchards using the improved YOLO-V3 model. Comput. Electron. Agric. 2019, 157, 417–426. [Google Scholar] [CrossRef]
  20. Zhong, Z.; Yun, L.; Cheng, F.; Chen, Z.; Zhang, C. Light-YOLO: A Lightweight and Efficient YOLO-Based Deep Learning Model for Mango Detection. Agriculture 2024, 14, 140. [Google Scholar] [CrossRef]
  21. Xie, W.; Feng, F.; Zhang, H. A Detection Algorithm for Citrus Huanglongbing Disease Based on an Improved YOLOv8n. Sensors 2024, 24, 4448. [Google Scholar] [CrossRef]
  22. Sun, Z.; Feng, Z.; Chen, Z. Highly Accurate and Lightweight Detection Model of Apple Leaf Diseases Based on YOLO. Agronomy 2024, 14, 1331. [Google Scholar] [CrossRef]
  23. Li, R.; Li, Y.; Qin, W.; Abbas, A.; Li, S.; Ji, R.; Wu, Y.; He, Y.; Yang, J. Lightweight Network for Corn Leaf Disease Identification Based on Improved YOLO v8s. Agriculture 2024, 14, 220. [Google Scholar] [CrossRef]
  24. Yang, C.; Sun, X.; Wang, J.; Lv, H.; Dong, P.; Xi, L.; Shi, L. YOLOv8s-CGF: A lightweight model for wheat ear Fusarium head blight detection. PeerJ Comput. Sci. 2024, 10, 1948. [Google Scholar] [CrossRef]
  25. Bai, B.; Wang, J.; Li, J.; Yu, L.; Wen, J.; Han, Y. T-YOLO: A lightweight and efficient detection model for nutrient bud in complex tea plantation environment. J. Sci. Food Agric. 2024, 104, 5698–5711. [Google Scholar] [CrossRef] [PubMed]
  26. Ma, B.; Hua, Z.; Wen, Y.; Deng, H.; Zhao, Y.; Pu, L.; Song, H. Using an improved lightweight YOLOv8 model for real-time detection of multi-stage apple fruit in complex orchard environments. Artif. Intell. Agric. 2024, 11, 70–82. [Google Scholar] [CrossRef]
  27. Solimani, F.; Cardellicchio, A.; Dimauro, G.; Petrozza, A.; Summerer, S.; Cellini, F.; Renò, V. Optimizing tomato plant phenotyping detection: Boosting YOLOv8 architecture to tackle data complexity. Comput. Electron. Agric. 2024, 218, 108728. [Google Scholar] [CrossRef]
  28. Li, D.; Zeng, X.; Liu, Y. Apple leaf disease identification model by coupling global and patch features. Trans. Chin. Soc. Agric. Eng. 2022, 38, 207–214. [Google Scholar]
  29. Tzutalin, D. LabelImg. Git Code (2015). Available online: https://github.com/tzutalin/labelImg (accessed on 26 August 2024).
  30. Shorten, C.; Khoshgoftaar, T.M. A survey on Image Data Augmentation for Deep Learning. J. Big Data 2019, 6, 60. [Google Scholar] [CrossRef]
  31. Wang, X.; Liu, J. Vegetable disease detection using an improved YOLOv8 algorithm in the greenhouse plant environment. Sci. Rep. 2024, 14, 4261. [Google Scholar] [CrossRef]
  32. Ye, R.; Shao, G.; He, Y.; Gao, Q.; Li, T. YOLOv8-RMDA: Lightweight YOLOv8 Network for Early Detection of Small Target Diseases in Tea. Sensors 2024, 24, 2896. [Google Scholar] [CrossRef]
  33. Aboah, A.; Wang, B.; Bagci, U.; Adu-Gyamfi, Y. Real-time multi-class helmet violation detection using few-shot data sampling technique and yolov8. arXiv 2023, arXiv:2304.08256. [Google Scholar]
  34. Han, K.; Wang, Y.; Tian, Q.; Guo, J.; Xu, C.; Xu, C. Ghostnet: More features from cheap operations. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020. [Google Scholar] [CrossRef]
  35. Sandler, M.; Howard, A.; Zhu, M.; Zhmoginov, A.; Chen, L.-C. Mobilenetv2: Inverted residuals and linear bottlenecks. arXiv 2018, arXiv:1801.04381. [Google Scholar]
  36. Howard, A.G.; Zhu, M.; Chen, B.; Kalenichenko, D.; Wang, W.; Weyand, T.; Andreetto, M.; Adam, H. Mobilenets: Efficient convolutional neural networks for mobile vision applications. arXiv 2017, arXiv:1704.04861. [Google Scholar]
  37. Howard, A.; Sandler, M.; Chu, G.; Chen, L.-C.; Chen, B.; Tan, M.; Wang, W.; Zhu, Y.; Pang, R.; Vasudevan, V. Searching for MobileNetV3. arXiv 2019, arXiv:1905.02244. [Google Scholar]
  38. Wang, X.; Kan, M.; Shan, S.; Chen, X. Fully learnable group convolution for the acceleration of deep neural networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019. [Google Scholar] [CrossRef]
  39. Zhang, T.; Qi, G.J.; Xiao, B.; Wang, J. Interleaved group convolutions. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; Volume 469, pp. 4383–4392. [Google Scholar]
  40. Wang, C.; He, W.; Nie, Y.; Guo, J.; Liu, C.; Han, K.; Wang, Y. Gold-YOLO: Efficient Object Detector via Gather-and-Distribute Mechanism. arXiv 2023, arXiv:2309.11331. [Google Scholar]
  41. Tan, M.; Pang, R.; Le, Q.V. EfficientDet: Scalable and efficient object detection. arXiv 2020, arXiv:1911.09070. [Google Scholar]
  42. Wu, Y.; He, K. Group normalization. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 3–19. [Google Scholar]
  43. Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.-Y.; Berg, A.C. SSD: Single shot multibox detector. arXiv 2016, arXiv:1512.02325. [Google Scholar]
  44. Wang, C.Y.; Bochkovskiy, A.; Liao, H.Y.M. YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. In Proceedings of the 2023 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 17–24 June 2023; pp. 7464–7475. [Google Scholar]
  45. Wang, C.-Y.; Yeh, I.-H.; Liao, H.-Y.M. YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information. arXiv 2024, arXiv:2402.13616. [Google Scholar]
  46. Ahmed, A.; Manaf, A. Pediatric Wrist Fracture Detection in X-rays via YOLOv10 Algorithm and Dual Label Assignment System. arXiv 2024, arXiv:2407.15689. [Google Scholar]
  47. Selvaraju, R.R.; Cogswell, M.; Das, A.; Vedantam, R.; Parikh, D.; Batra, D. Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization. Int. J. Comput. Vis. 2020, 128, 336–359. [Google Scholar] [CrossRef]
  48. Tian, L.; Xue, B.; Wang, Z.; Li, D.; Yao, X.; Cao, Q.; Zhu, Y.; Cao, W.; Cheng, T. Spectroscopic detection of rice leaf blast infection from asymptomatic to mild stages with integrated machine learning and feature selection. Remote Sens. Environ. 2021, 257, 112350. [Google Scholar] [CrossRef]
Figure 1. Scientific Experiment Base of Shenyang Agricultural University.
Figure 2. Early, mid, and late images of rice false smut.
Figure 3. Partial data enhancement pictures.
Figure 4. YOLOv8n-MBS structure diagram and C2f module structure.
Figure 5. C2f_MSEC network structure diagram.
Figure 6. Schematic diagrams of the PANet feature fusion structure, the BiFPN feature fusion structure, and the improved feature fusion structure.
Figure 7. Structure diagram of the shared convolutional lightweight detection head.
Figure 8. Plots of training results for different models.
Figure 9. Comparison of detection results taken at the same horizontal line.
Figure 10. Comparison of detection results taken at a tilt of 30 degrees.
Figure 11. Feature visualization heat maps.
Figure 12. Effectiveness of early-stage testing of rice false smut in 2024.
Figure 13. Effectiveness of mid-stage testing of rice false smut in 2024.
Table 1. YOLOv8 series model comparison test.

Model    | mAP (0.5)/% | mAP (0.5–0.95)/% | R/%  | Params/M | Model Size/MB
YOLOv8n  | 93.5        | 66.6             | 86.6 | 3.0      | 6.3
YOLOv8s  | 94.3        | 69.2             | 88.6 | 11.1     | 22.5
YOLOv8m  | 95.7        | 77.7             | 91.3 | 25.9     | 52.1
YOLOv8l  | 95.5        | 79.9             | 91.2 | 43.6     | 87.7
YOLOv8x  | 95.9        | 82.3             | 90.9 | 68.1     | 136.7
Table 2. Comparison test of different models.

Model       | mAP (0.5)/% | mAP (0.5–0.95)/% | R/%  | Params/M | Model Size/MB
SSD         | 89.9        | 59.2             | 81.7 | 13.7     | 25.2
YOLOv7-tiny | 90.7        | 61.1             | 84.3 | 6.0      | 14.6
YOLOv5n     | 90.1        | 61.7             | 81.7 | 2.5      | 5.3
YOLOv5s     | 91.6        | 61.8             | 82.5 | 2.7      | 5.3
YOLOv8n     | 93.5        | 66.6             | 86.6 | 3.0      | 6.3
YOLOv9t     | 91.1        | 63.9             | 84.0 | 2.0      | 4.7
YOLOv10n    | 91.7        | 62.5             | 83.8 | 2.3      | 5.8
YOLOv8n-MBS | 93.9        | 68.5             | 86.7 | 1.4      | 3.3
Table 3. Results of the ablation test.

MSEC | BiFPN | SLDH | mAP (0.5)/% | R/%  | Params/M | Model Size/MB
×    | ×     | ×    | 93.5        | 86.6 | 3.0      | 6.3
√    | ×     | ×    | 93.5        | 86.7 | 2.8      | 6.0
√    | √     | ×    | 93.9        | 86.1 | 1.8      | 4.0
√    | √     | √    | 93.9        | 86.7 | 1.4      | 3.3
Note: “√” indicates the module is used and “×” indicates that it is not.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

