Article
Peer-Review Record

Rice Diseases Identification Method Based on Improved YOLOv7-Tiny

Agriculture 2024, 14(5), 709; https://doi.org/10.3390/agriculture14050709
by Duoguan Cheng, Zhenqing Zhao and Jiang Feng *
Reviewer 1:
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 7 March 2024 / Revised: 19 April 2024 / Accepted: 25 April 2024 / Published: 29 April 2024
(This article belongs to the Section Crop Protection, Diseases, Pests and Weeds)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

General comments

 

Authors present a paper entitled “Rice Diseases Identification Method Based on Improved YOLOv7-Tiny”. The paper proposes an innovative YOLOv7-Tiny model enhanced to detect diseases in rice crops. The authors improve the YOLOv7-Tiny model by incorporating the Convolutional Block Attention Module (CBAM) into its backbone network. This type of tool is very useful for detecting objects, but where is the novelty in this? Detecting diseases that are visually identifiable does not supply any value to the farmer, since they can already see them with their own eyes. The current state of the art in this field is based on using hyperspectral images to detect diseases before the symptoms become visually observable.

 

Introduction

 

The introduction lacks depth as the authors primarily concentrate on the detection of diseases using object detection models without clarifying the superiority of their method or delineating the state of the art. As a researcher, I am left questioning the novelty of their approach.

Why did the authors opt for YOLOv7 when YOLOv9 is the latest version? It's also puzzling why they didn't at least compare it with YOLOv9. Furthermore, their choice to use YOLOv7 Tiny without juxtaposing it with the full YOLOv7 architecture raises additional questions about their methodology.

 

The introduction significantly omits details on how rice diseases are detected, particularly failing to explain the novelty of their method compared to the state of the art. For instance, based on my experience, multispectral and hyperspectral information is crucial for disease detection. However, the authors neglect to mention this in the introduction.

 

Materials and methods

 

The data preprocessing section is inadequately detailed, necessitating further elaboration. Questions arise regarding the rationale behind the specific data augmentation techniques chosen and the scientific principles guiding these decisions.

 

Regarding line 91, is a dataset of 1,500 images deemed sufficient? Given the complexity of the disease, there are significant doubts concerning the model's ability to generalize beyond the training data.

 

On line 94, the question arises as to which annotation format was utilized by the authors (e.g., PASCAL VOC). Although LabelImg is a widely recognized software, it is imperative for the authors to cite it properly.

 

Results

 

Table 2 indicates that while the models proposed by the authors achieved good F1 scores and mAP@0.5, there needs to be an explanation for the significantly longer inference times observed in the proposed models.

 

Discussion

 

This section lacks a comparative approach, such as juxtaposing the results with other models within the YOLO family to elucidate the superiority of their model. The authors emphasize the model's impressive inference time, yet this attribute is not inherently indicative of the model's quality, as it largely depends on the GPU utilized for image inference.

Author Response

To Reviewer #1:

Dear Reviewer:

Thank you very much for your letter and the comments regarding our paper submitted to Agriculture. We have carefully considered the comments and have revised the manuscript accordingly. We submit here the revised manuscript as well as a list of changes. The change list is marked in yellow.

Comment 1: Authors present a paper entitled “Rice Diseases Identification Method Based on Improved YOLOv7-Tiny”. The paper proposes an innovative YOLOv7-Tiny model enhanced to detect diseases in rice crops. The authors improve the YOLOv7-Tiny model by incorporating the Convolutional Block Attention Module (CBAM) into its backbone network. This type of tool is very useful for detecting objects, but where is the novelty in this? Detecting diseases that are visually identifiable does not supply any value to the farmer, since they can already see them with their own eyes. The current state of the art in this field is based on using hyperspectral images to detect diseases before the symptoms become visually observable.

 

Response: Thanks for your suggestion.

In this study, the convolutional block attention module (CBAM) was integrated into the YOLOv7-Tiny model to substantially enhance the model’s capability to detect features of rice diseases.  CBAM, serving as an attention mechanism, augments the model's focus on critical features during image processing, thereby elevating both the accuracy and efficiency of disease detection.  This enhancement represents an innovation within existing deep learning frameworks, particularly for the Tiny version, which is characterized by reduced processing speed and lower resource consumption, rendering it more apt for practical agricultural settings.
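The integration described in this response can be illustrated schematically. The following is a minimal NumPy sketch of CBAM's two sequential gates, channel attention followed by spatial attention. It is not the authors' implementation: the MLP weights are random placeholders, and the 7×7 convolution of the spatial branch is simplified to an element-wise gate.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, reduction=4):
    """Gate each channel of feat (C, H, W) using pooled descriptors
    passed through a shared bottleneck MLP (weights are placeholders)."""
    c = feat.shape[0]
    avg = feat.mean(axis=(1, 2))                      # (C,)
    mx = feat.max(axis=(1, 2))                        # (C,)
    rng = np.random.default_rng(0)                    # fixed toy weights
    w1 = rng.standard_normal((c // reduction, c)) * 0.1
    w2 = rng.standard_normal((c, c // reduction)) * 0.1
    mlp = lambda v: w2 @ np.maximum(w1 @ v, 0.0)      # ReLU bottleneck
    gate = sigmoid(mlp(avg) + mlp(mx))                # (C,)
    return feat * gate[:, None, None]

def spatial_attention(feat):
    """Gate each spatial position using channel-pooled maps.
    (Real CBAM applies a 7x7 conv to the stacked maps; simplified here.)"""
    avg = feat.mean(axis=0)                           # (H, W)
    mx = feat.max(axis=0)                             # (H, W)
    gate = sigmoid(avg + mx)                          # (H, W)
    return feat * gate[None, :, :]

def cbam(feat):
    """CBAM refines features with channel attention, then spatial attention."""
    return spatial_attention(channel_attention(feat))

refined = cbam(np.random.default_rng(1).standard_normal((8, 4, 4)))
print(refined.shape)  # (8, 4, 4)
```

In the setting described here, the refined feature map would replace the backbone output it was computed from, letting subsequent layers weight disease regions more heavily.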

Although numerous diseases are visible to the naked eye, accurately identifying the type of disease remains challenging for those without experience.  This paper introduces an enhanced YOLOv7-Tiny model designed for the identification of rice diseases, which plays a crucial role in curbing the spread of diseases and minimizing pesticide application.  Moreover, the model is capable of consistent disease detection under complex conditions, such as varying lighting and backgrounds, a task that proves challenging for visual observation alone.

 

Existing research predominantly relies on hyperspectral imaging techniques which, despite their effectiveness, are costly and cumbersome for field deployment.  By contrast, this study employs standard RGB imaging, utilizing an advanced deep learning model to achieve superior recognition outcomes.  This approach significantly lowers the barriers to technological adoption and facilitates the broader dissemination of precision agriculture technologies.

 

Comment 2: The introduction lacks depth as the authors primarily concentrate on the detection of diseases using object detection models without clarifying the superiority of their method or delineating the state of the art. As a researcher, I am left questioning the novelty of their approach.

Why did the authors opt for YOLOv7 when YOLOv9 is the latest version? It's also puzzling why they didn't at least compare it with YOLOv9. Furthermore, their choice to use YOLOv7 Tiny without juxtaposing it with the full YOLOv7 architecture raises additional questions about their methodology.

The introduction significantly omits details on how rice diseases are detected, particularly failing to explain the novelty of their method compared to the state of the art. For instance, based on my experience, multispectral and hyperspectral information is crucial for disease detection. However, the authors neglect to mention this in the introduction.

 

Response: Thanks for your suggestion.

YOLOv9 is indeed the latest version, but its code was not open source when we submitted the paper. The code has since been released, but no lightweight version has been provided. The GPU used in our experiments has only 6 GB of video memory and cannot run the full-size YOLOv9 and YOLOv7 models.

As for discussing hyperspectral technology for disease detection in the introduction: we examined several papers that use object detection to identify rice diseases, and none of their introductions discuss hyperspectral technology, so we have some doubts on this point. If possible, we would like to discuss further with you whether the introduction should cover hyperspectral disease detection.

 

 

Comment 3: The data preprocessing section is inadequately detailed, necessitating further elaboration. Questions arise regarding the rationale behind the specific data augmentation techniques chosen and the scientific principles guiding these decisions.

Regarding line 91, is a dataset of 1,500 images deemed sufficient? Given the complexity of the disease, there are significant doubts concerning the model's ability to generalize beyond the training data.

On line 94, the question arises as to which annotation format was utilized by the authors (e.g., PASCAL VOC). Although LabelImg is a widely recognized software, it is imperative for the authors to cite it properly.

 

Response: Thanks for your suggestion.

Regarding the lack of detail in the data preprocessing section, we have added examples of images after data augmentation in Section 2.2. In this study, we combined offline and online data augmentation, and we cite references that use this approach in the paper.

Our dataset combines public data with data collected in the field. Experimental results show that the model's mAP@0.5 converges after 150 iterations, and the model has good generalization ability.

As for the annotation format, we annotated the images with LabelImg using the annotation format of the PASCAL VOC dataset. This has been clarified in Section 2.2.
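For reference, PASCAL VOC stores one XML file per image. A minimal illustrative annotation (the file name, class name, and coordinates below are hypothetical) looks like:

```xml
<annotation>
  <filename>rice_0001.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>brown_spot</name>
    <bndbox>
      <xmin>120</xmin><ymin>85</ymin><xmax>210</xmax><ymax>170</ymax>
    </bndbox>
  </object>
</annotation>
```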

 

Comment 4: Table 2 indicates that while the models proposed by the authors achieved good F1 scores and mAP@0.5, there needs to be an explanation for the significantly longer inference times observed in the proposed models.

 

Response: Thanks for your suggestion.

The increase in inference time is due to the larger number of parameters in the improved model compared to the baseline model. A comparison of model parameter counts and inference times is shown in Table 2.

Comment 5: This section lacks a comparative approach, such as juxtaposing the results with other models within the YOLO family to elucidate the superiority of their model. The authors emphasize the model's impressive inference time, yet this attribute is not inherently indicative of the model's quality, as it largely depends on the GPU utilized for image inference.

Response: Thanks for your suggestion.

This study compares and discusses the improved model against existing lightweight models in Section 3.4. The discussion section was indeed lacking in its treatment of inference time, and the revised discussion no longer emphasizes inference time.

 

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

This paper develops a computer vision system based on a YOLO architecture for detection of rice leaf diseases.  I have a few concerns that can be improved in a revision of the paper.

One main concern that I have about this paper is the lack of context for this research. One can first note that there are many papers recently about using YOLO based architectures in agriculture. However, more specifically, there are papers that use YOLO algorithms for rice disease recognition. For instance:

1.      A paper on leaf blast and brown spot using YOLO from 2021 (200 images): M. K. Agbulos, Y. Sarmiento and J. Villaverde, "Identification of Leaf Blast and Brown Spot Diseases on Rice Leaf with YOLO Algorithm," 2021 IEEE 7th International Conference on Control Science and Systems Engineering (ICCSSE), doi: 10.1109/ICCSSE52761.2021.9545153.

2.      A paper on bacterial leaf blight, brown spot and leaf smut using YOLO from 2023 (120 images). V Senthil Kumar et al 2023 Environ. Res. Commun. 5 065014DOI 10.1088/2515-7620/acdece

3.      A paper on Brown Spot, Hispa, Blast, and Blight using YOLO and data augmentation from 2023  (550 images) F. Aziz, F. Ernawan, M. Fakhreldin and P. W. Adi, "YOLO Network-Based for Detection of Rice Leaf Disease," 2023 International Conference on Information Technology Research and Innovation (ICITRI), Jakarta, Indonesia, 2023, pp. 65-69, doi: 10.1109/ICITRI59340.2023.10249843.

4.      A paper on Bacterial leaf blight, Rice blast and Brown spot using YOLO and data augmentation from 2024 A. K. Sangaiah, F. -N. Yu, Y. -B. Lin, W. -C. Shen and A. Sharma, "UAV T-YOLO-Rice: An Enhanced Tiny Yolo Networks for Rice Leaves Diseases Detection in Paddy Agronomy," in IEEE Transactions on Network Science and Engineering, doi: 10.1109/TNSE.2024.3350640.

These are aided by a few publicly available datasets:

1.      https://archive.ics.uci.edu/dataset/486/rice+leaf+diseases

2.      https://www.kaggle.com/datasets/vbookshelf/rice-leaf-diseases

3.      https://github.com/aldrin233/RiceDiseases-DataSet

 

I do not present these papers simply for you to cite them. They were located through a very cursory search.  Understandably there are differences in the approaches in these papers, particularly the size of the data sets and the backgrounds of the images in some cases. However, the lack of any mention of these types of papers means that the paper cannot be considered ready for publication.

One paper is cited, noting an accuracy of 81.79%. But this is not mentioned in the discussion or compared with the current research. Accuracy is not used by the current paper as a measure, so some sense of how much of an improvement over this result is obtained in the current paper would be helpful.

 

Another issue worth noting, especially in the context of data augmentation, is the imbalance in the data set. Bacterial blight is underrepresented and Brown spot is overrepresented in the data set. Did the authors consider, e.g., random undersampling to determine the effect of the overrepresentation on performance? Interestingly, from the results, it seems that the model is underperforming on Brown spot. This suggests that a model presented with undersampled data is worth exploration.

Other minor points:

1.      Line 138 – here and elsewhere, the authors state with certainty the effect of certain modules in the pipeline. I am not confident that this can be so certainly ascribed. For instance, on these lines, are we certain that CBAM reduces the impact of intricate backgrounds? If so, please include a reference that establishes this. The provided reference [20] says “we conjecture that the performance boost comes from accurate attention and noise reduction of irrelevant clutters”. This does not seem certain.  This is not the only location of such assertions and I would encourage the authors to reread the paper to ensure that effect of modules is not oversold.

2.      Line 254 / Eq (8) – AP_i is not defined, nor, crucially, is mAP@x.

3.      Line 255 – is the term “reconciled mean” a well-understood term? Do the authors mean “harmonic mean”?

4.      Line 274 “number of decreases” is a typo.

5.      Why is F1 score not included in Table 4?

Comments on the Quality of English Language

included above

Author Response

To Reviewer #2:

Dear Reviewer:

Thank you very much for your letter and the comments regarding our paper submitted to Agriculture. We have carefully considered the comments and have revised the manuscript accordingly. We submit here the revised manuscript as well as a list of changes. The change list is marked in yellow.

Comment 1: I do not present these papers simply for you to cite them. They were located through a very cursory search.  Understandably there are differences in the approaches in these papers, particularly the size of the data sets and the backgrounds of the images in some cases. However, the lack of any mention of these types of papers means that the paper cannot be considered ready for publication.

One paper is cited, noting an accuracy of 81.79%. But this is not mentioned in the discussion or compared with the current research. Accuracy is not used by the current paper as a measure, so some sense of how much of an improvement over this result is obtained in the current paper would be helpful.

Response: Thanks for your suggestion.

The references cited in our introduction were indeed insufficient. We have reviewed the references you provided and have cited them to make the introduction more substantial.

The discussion section compares our model with existing baseline models in Section 3.4, not with existing improved models. There are certain difficulties in comparing with existing improved models. First, the datasets in those papers are different from ours, so evaluation metrics such as mAP cannot be compared directly. Second, the code for those improved models is not open source, making them difficult to reproduce.

Comment 2: Another issue worth noting, especially in the context of data augmentation, is the imbalance in the data set. Bacterial blight is underrepresented and Brown spot is overrepresented in the data set. Did the authors consider, e.g., random undersampling to determine the effect of the overrepresentation on performance? Interestingly, from the results, it seems that the model is underperforming on Brown spot. This suggests that a model presented with undersampled data is worth exploration.

Response: Thanks for your suggestion.

A model trained on undersampled data is indeed worth exploring, but due to time constraints we could not repeat the experiments to examine the effect of random undersampling on model performance.
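For future work, the random undersampling the reviewer suggests is straightforward to sketch. The following is an illustrative Python snippet (the file names and class labels are hypothetical), which trims every class down to the size of the rarest one:

```python
import random
from collections import Counter

def undersample(samples, seed=0):
    """Randomly drop items from majority classes so that every class
    is reduced to the size of the smallest class."""
    rng = random.Random(seed)
    by_class = {}
    for path, label in samples:
        by_class.setdefault(label, []).append((path, label))
    n_min = min(len(items) for items in by_class.values())
    balanced = []
    for items in by_class.values():
        balanced.extend(rng.sample(items, n_min))
    return balanced

# Hypothetical imbalanced annotation list: 6 brown spot vs. 2 bacterial blight.
data = ([(f"bs_{i}.jpg", "brown_spot") for i in range(6)]
        + [(f"bb_{i}.jpg", "bacterial_blight") for i in range(2)])
balanced = undersample(data)
print(Counter(label for _, label in balanced))  # 2 images per class
```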

Comment 3:  Line 138 – here and elsewhere, the authors state with certainty the effect of certain modules in the pipeline. I am not confident that this can be so certainly ascribed. For instance, on these lines, are we certain that CBAM reduces the impact of intricate backgrounds? If so, please include a reference that establishes this. The provided reference [20] says “we conjecture that the performance boost comes from accurate attention and noise reduction of irrelevant clutters”. This does not seem certain.  This is not the only location of such assertions and I would encourage the authors to reread the paper to ensure that effect of modules is not oversold.

Response: Thanks for your suggestion.

The heat maps in Figure 11 show that CBAM reduces the influence of complex backgrounds to a certain extent. Our description in the paper was too absolute, so we have softened the relevant statements to say that CBAM reduces the influence of complex backgrounds to a certain extent.

Comment 4:  Line 254 / Eq (8) – AP_i is not defined, nor, crucially, is mAP@x.

Response: Thanks for your suggestion. We have added the definitions of AP@0.5 and mAP@0.5 in Section 3.2:

AP@0.5 signifies the AP calculated at an IoU threshold of 0.5, and mAP@0.5 indicates the mAP computed at an IoU threshold of 0.5. AP@0.5 and mAP@0.5 served as evaluation metrics.
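As an illustration of the threshold behind this metric, a detection counts as a true positive at an IoU threshold of 0.5 only if its box overlaps a same-class ground-truth box with IoU ≥ 0.5. A minimal sketch:

```python
def iou(box_a, box_b):
    """Intersection over Union of two (xmin, ymin, xmax, ymax) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda b: (b[2] - b[0]) * (b[3] - b[1])
    union = area(box_a) + area(box_b) - inter
    return inter / union if union else 0.0

# Boxes overlapping on half their width: IoU = 50 / 150.
print(round(iou((0, 0, 10, 10), (5, 0, 15, 10)), 4))  # 0.3333
# At the 0.5 threshold this prediction would NOT count as a true positive.
print(iou((0, 0, 10, 10), (5, 0, 15, 10)) >= 0.5)  # False
```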

Comment 5: Line 255 – is the term “reconciled mean” a well-understood term? Do the authors mean “harmonic mean”?

Response: Thanks for your suggestion. We have changed "reconciled mean" to "harmonic mean".
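For clarity, the harmonic mean in question combines precision and recall as follows (a minimal sketch):

```python
def f1(precision, recall):
    """F1 score: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# The harmonic mean penalizes imbalance: f1(0.9, 0.6) is below the
# arithmetic mean of 0.75.
print(round(f1(0.9, 0.6), 2))  # 0.72
```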

Comment 6: Line 274 “number of decreases” is a typo.

Response: Thanks for your suggestion. A word was omitted here; the corrected phrase is "number of parameters".

Comment 7:  Why is F1 score not included in Table 4?

Response: Thanks for your suggestion. We have added the F1 score to Table 4.

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

 

This paper proposes an enhanced model for recognising rice diseases by leveraging the improved YOLOv7-Tiny. The model strikes a balance between accuracy and reasoning speed by integrating the Convolutional Block Attention Module, RepGhost bottleneck module, and T-ELAN module into the backbone network of YOLOv7-Tiny. The paper is well-structured, the methodology is well-described, and the results have been well-discussed. Therefore, I recommend it for publication. However, I do have some remarks.

-       Add the structure of the paper at the end of the introduction.

-       I suggest including a 'related work' section to better contextualise your research and identify any relevant gaps.

-       I suggest adding a definition for mAP@0.5

-       Correct line 272-273 :  number of parameters or the size of the parameters

-       Add the VRAM size for the RTX 3060 used

Author Response

To Reviewer #3:

Dear Reviewer:

Thank you very much for your letter and the comments regarding our paper submitted to Agriculture. We have carefully considered the comments and have revised the manuscript accordingly. We submit here the revised manuscript as well as a list of changes. The change list is marked in yellow.

Comment 1: Add the structure of the paper at the end of the introduction.

Response: Thanks for your suggestion. We have added the structure of the paper at the end of the introduction:

In the remainder of this paper, the methods for rice disease image acquisition and identification are detailed in Section 2, the experimental results are presented in Section 3, the discussion is provided in Section 4, and the conclusions are outlined in Section 5.

Comment 2: I suggest including a 'related work' section to better contextualise your research and identify any relevant gaps.

Response: Thanks for your suggestion.

In Section 3.4, the discussion compares the improved model with existing baseline models rather than with existing improved approaches. Comparing against existing improved models presents some difficulties. First, the datasets used in those papers are different from ours, so evaluation metrics such as mAP cannot be compared directly. Second, the code for those improved models is not open source, so they are difficult to replicate.

Comment 3: I suggest adding a definition for mAP@0.5.

Response: Thanks for your suggestion. We have added the definitions of AP@0.5 and mAP@0.5 in Section 3.2:

AP@0.5 signifies the AP calculated at an IoU threshold of 0.5, and mAP@0.5 indicates the mAP computed at an IoU threshold of 0.5. AP@0.5 and mAP@0.5 served as evaluation metrics.

Comment 4: Correct line 272-273 :  number of parameters or the size of the parameters.

Response: Thanks for your suggestion. The term "parameters" was omitted in several instances between lines 272 and 273; this has been corrected.

Comment 5: Add the VRAM size for the RTX 3060 used.

Response: Thanks for your suggestion. We have added a description of GPU video memory in Section 3.1:

It is complemented by an NVIDIA GeForce RTX 3060 laptop GPU equipped with 6 GB of video memory.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

Dear Editor,

The authors did not properly address my question: They failed to include a citation for the LabelImg software, which should be referenced as follows:
Lin, T. (2015). LabelImg. Online: https://github.com/tzutalin/labelImg, 706.

Furthermore, there seems to be a misunderstanding regarding my comments on the use of hyperspectral data. I did not suggest that they need to incorporate hyperspectral data into their current experiment. Rather, my point was that their introduction lacks a discussion on the state of the art in disease detection, which they should include.

The authors place significant emphasis on the novelty of using the smaller version of the YoloV7 model, yet they fail to compare it with other versions such as YoloV5 and YoloV6. They need to clarify why YoloV7 is superior. Although they mentioned in their last letter that their computer is limited to 6GB of GPU, they should consider utilizing tools like Google Colaboratory, which provides 12GB of GPU and could enhance the robustness of their models.

Consequently, I am uncertain about the quality of this paper. I would appreciate the opinions of the other reviewers to determine if my concerns are justified.

Regards,
Enrique

Author Response

To Reviewer #1:

Dear Reviewer:

Thank you very much for your letter and the comments regarding our paper submitted to Agriculture. We have carefully considered the comments and have revised the manuscript accordingly. We submit here the revised manuscript as well as a list of changes. The change list is marked in red.

Comment 1: The authors did not properly address my question: They failed to include a citation for the LabelImg software, which should be referenced as follows:

Lin, T. (2015). LabelImg. Online: https://github.com/tzutalin/labelImg, 706.

 

Response: Thanks for your suggestion.

We have included references in the relevant locations.

Comment 2: Furthermore, there seems to be a misunderstanding regarding my comments on the use of hyperspectral data. I did not suggest that they need to incorporate hyperspectral data into their current experiment. Rather, my point was that their introduction lacks a discussion on the state of the art in disease detection, which they should include.

Response: Thanks for your suggestion.

In the introduction, we now discuss the state of crop disease detection technology based on both hyperspectral imaging and RGB images, and we explain why we chose RGB-based detection. The specific modifications are shown in the red part of the paper.

 

Comment 3: The authors place significant emphasis on the novelty of using the smaller version of the YoloV7 model, yet they fail to compare it with other versions such as YoloV5 and YoloV6. They need to clarify why YoloV7 is superior. Although they mentioned in their last letter that their computer is limited to 6GB of GPU, they should consider utilizing tools like Google Colaboratory, which provides 12GB of GPU and could enhance the robustness of their models.

Response: Thanks for your suggestion.

We think your suggestion is very good, but limited by equipment and time, we cannot currently re-run the experiments on a GPU with more video memory. We will conduct more detailed comparative tests in future work.

 

Sincerely,

Duoguan Cheng

College of Electrical and Information, Northeast Agricultural University, Harbin 150030, China

[email protected]

Author Response File: Author Response.pdf
