Article
Peer-Review Record

Intelligent Detection of Muskmelon Ripeness in Greenhouse Environment Based on YOLO-RFEW

Agronomy 2024, 14(6), 1091; https://doi.org/10.3390/agronomy14061091
by Defang Xu 1,*, Rui Ren 2, Huamin Zhao 2,* and Shujuan Zhang 2
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 22 April 2024 / Revised: 17 May 2024 / Accepted: 20 May 2024 / Published: 21 May 2024

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The study has several significant shortcomings, which are as follows:

1.    The abstract needs to be improved. Provide insights into the practical deployment of the YOLO-RFEW model in real-world muskmelon-picking robots in terms of both accuracy and efficiency.

2.    In the introduction, provide more context on the specific limitations or gaps in existing muskmelon detection methods that your research aims to address. How does your proposed solution build upon or differentiate itself from prior research in this domain?

a.     Elaborate further on the specific challenges or limitations encountered in muskmelon ripeness detection that prompted the development of the YOLO-RFEW model. How do these challenges differ from those addressed in previous research on other fruits, and what unique considerations did you account for in your model design?

3.    Materials and Methods: The section effectively describes the architecture of the proposed YOLO-RFEW model, highlighting key components such as RFAConv, C2f-FE module, and WIoU loss function.

a.     Provide more insights into the decision-making process behind choosing specific backbone models and their trade-offs in terms of model complexity, detection speed, and accuracy.

b.    Explain in detail the augmentation techniques used to enhance dataset quality, such as noise addition and brightness adjustment. How do these techniques contribute to mitigating overfitting and improving model generalization?

c.     How do the scalability and performance characteristics of YOLOv8n align with the requirements of real-time muskmelon detection applications?

d.    Elaborate on the process of incorporating target frame size and shape information. How do these factors influence the adaptation of weight coefficients, and what empirical or theoretical justifications support this approach?

4.    How did you determine the optimal configuration of the attention mechanisms (ELA, SimAM, SE, EMA) for the C2f-F structure? Were there any specific criteria or experiments used to make this decision?

5.    Regarding the integration of the WIoU loss function, how did you ensure that it effectively addresses the limitations of conventional IoU loss and adapts to targets with diverse sizes and shapes? Were there any challenges or considerations in implementing this loss function within the model architecture?

6.    The paper effectively demonstrates the superiority of YOLO-RFEW in achieving optimal performance with reduced model size and inference time. However, it would be beneficial to discuss potential limitations or trade-offs associated with optimizing for these metrics and how they impact real-world applicability and scalability.

7.    Can you provide more insights into the specific strategies or techniques employed in the YOLO-RFEW model to address challenges such as uneven ripening stages and environmental factors like lighting conditions? How do these strategies contribute to improving detection accuracy under varying conditions?

8.    How do metrics such as precision, recall, F1 score, and mAP in the context of muskmelon ripeness detection correlate with practical considerations such as harvesting efficiency and fruit quality?

9.    The findings suggest significant improvements in mAP and F1 scores across different enhancement schemes. Provide insights into why certain enhancements were more effective than others and how they complement each other in improving overall model performance.

10. Further comments are included in the annotated paper. Respond to all of them and incorporate the corresponding changes into the manuscript.
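The metrics named in comment 8 can be related to harvesting outcomes through their definitions. The following is a minimal sketch, assuming a single ripeness class and hypothetical counts that are not taken from the manuscript: a false positive can be read as a fruit wrongly flagged as ripe (a quality risk), and a false negative as a ripe fruit that was missed (a harvesting-efficiency loss). The function name `detection_metrics` is illustrative only.

```python
def detection_metrics(tp: int, fp: int, fn: int) -> dict:
    """Compute precision, recall, and F1 from detection counts.

    tp: correct detections, fp: spurious detections, fn: missed targets.
    Returns 0.0 for any metric whose denominator would be zero.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not from the paper).
    metrics = detection_metrics(tp=90, fp=10, fn=30)
    for name, value in metrics.items():
        print(f"{name}: {value:.3f}")
```

With these made-up counts, precision is higher than recall, which under the reading above would mean few unripe fruits are picked but a larger share of ripe fruits is left behind.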

Comments for author File: Comments.pdf

Comments on the Quality of English Language

 Minor editing of the English language is required.

Author Response

Dear Reviewer:

We have responded to your questions in detail. Because the response is lengthy, please see the attached MS Office file. Thank you.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

This paper proposes using an improved version of YOLOv8n to detect muskmelon ripeness. Here are some considerations.

1. The Introduction may be improved by discussing relevant recent agricultural works that use the latest YOLO versions. For example, the authors can check https://doi.org/10.1016/j.compag.2024.108728 or https://doi.org/10.3390/electronics13091620.

2. The authors should consider making the dataset publicly available to allow reproducibility.

3. The authors claim that they chose YOLOv8n as it ensures real-time performance. However, official performance metrics (check https://docs.ultralytics.com/models/yolov8/#performance-metrics) claim that YOLOv8s is only slightly slower than YOLOv8n, with a reasonable increment in size. As such, I suggest the authors consider at least YOLOv8s in their experiments.

4. The module proposed by the authors (RFAConv) is outperformed in terms of size, precision, and recall by KWConv. The authors should properly discuss the implications and the motivation behind this. Furthermore, the mAP in Table 1 does not indicate whether it is computed at 0.5 or 0.5-0.95; this is extremely important for a proper evaluation of the metrics. The same goes for Tables 2 and 3.

5. How were the training hyperparameters selected? Please justify the choices.

6. From Table 4, inference time is not strictly related to model size. Specifically, the improvement in inference time provided by YOLO-RFEW is not linearly related to the size of models such as YOLOv5s and YOLOv8n. As such, the authors should stress whether model size is a discriminant factor in the analysis. In other words, which was the most relevant factor, time complexity or spatial (memory) complexity?

7. If time is the relevant constraint and 60 fps is taken as the lower threshold, any model that achieves an inference time of 17 ms or less can provide real-time capabilities. I suggest also considering (at least) YOLOv8s in the comparison, possibly integrating RFW, C2-f, and WIoU in this model version.

For all these reasons, I suggest the paper undergoes a major revision.
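The 17 ms figure in comment 7 follows from the per-frame time budget at a given frame rate: 1000 ms / 60 fps ≈ 16.67 ms, which the reviewer rounds to 17 ms. A minimal sketch of that arithmetic is given here; the function names are hypothetical, the latency values are illustrative and not taken from the manuscript's Table 4, and pre- and post-processing time is assumed to be excluded.

```python
def frame_budget_ms(fps: float) -> float:
    """Maximum per-frame processing time (ms) that sustains the given frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def meets_real_time(inference_ms: float, fps: float = 60.0) -> bool:
    """True if the per-frame inference time fits within the frame budget."""
    return inference_ms <= frame_budget_ms(fps)


if __name__ == "__main__":
    print(f"budget at 60 fps: {frame_budget_ms(60):.2f} ms")
    # Illustrative latencies only (assumed values, not measured results).
    for latency in (5.0, 16.0, 20.0):
        print(f"{latency} ms -> real-time at 60 fps: {meets_real_time(latency)}")
```

Taken strictly, a model running at exactly 17 ms per frame falls just short of 60 fps, so the threshold is best read as approximate.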

Author Response

Dear Reviewer:

We have responded to your questions in detail. Because the response is lengthy, please see the attached MS Office file. Thank you.

Author Response File: Author Response.pdf

Round 2

Reviewer 2 Report

Comments and Suggestions for Authors

The authors successfully fixed all the issues highlighted during the first round of review. Therefore, the manuscript can be considered for publication.
