Article
Peer-Review Record

Optimization of Litchi Fruit Detection Based on Defoliation and UAV

Agronomy 2025, 15(10), 2421; https://doi.org/10.3390/agronomy15102421
by Jing Wang 1, Mingyue Zhang 2, Zhenhui Zheng 3, Zhaoshen Yao 2, Boxuan Nie 4, Dongliang Guo 1, Ling Chen 2, Jianguang Li 1,* and Juntao Xiong 2,*
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 14 August 2025 / Revised: 22 September 2025 / Accepted: 1 October 2025 / Published: 19 October 2025
(This article belongs to the Section Precision and Digital Agriculture)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The paper proposes an integration of agronomic techniques and deep learning as a collaborative optimization strategy. The paper needs revision as follows:

1- add the research gap or research importance to the abstract

2- add contribution instead of objectives 

3- add more references related to  the use of deep learning for defoliation, detection, prediction etc. 

4- Add numbers to all equations and use the same format.

5- Grammar check and fix errors such as "Where SSB was the ....." --> where SSB is .... SSW is ...

6- Is the height of the UAV 2-3 m above the ground or above the tree? What if the height increases?

7- YOLO has newer versions than YOLOv8; why did the authors use YOLOv8, and do they expect better results if a newer YOLO is used?

8- Add the confusion matrix to the results to show the numbers of true positives, true negatives, false positives, and false negatives.
 


Comments on the Quality of English Language

The manuscript needs English language revision 

Author Response

Dear Editor and Reviewers,

We would like to sincerely thank the editor and the reviewers for their valued comments and suggestions, which have helped to improve the quality of the manuscript. We have revised the manuscript carefully based on the comments and hope that it is now suitable for publication in the journal Agronomy. The parts revised in response to the reviewers' suggestions are marked with tracked changes in the manuscript. Our responses are listed below the comments, point by point.

 

Sincerely yours,

Juntao Xiong

College of Mathematics and Informatics, South China Agricultural University, Guangzhou, 510642 China.

E-mail: xiongjt2340@163.com

 

Reviewer 1:

The paper proposes an integration of agronomic techniques and deep learning as a collaborative optimization strategy. The paper needs revision as follows:

  • 1. add the research gap or research importance to the abstract

Response:

Many thanks for the valued comment. We have added the research importance to the abstract as the first sentence: "The use of UAVs to detect litchi in natural environments is imperative for rapid litchi yield estimation and automated harvesting systems."

 

  • 2. add contribution instead of objectives

Response:

Thanks. We have added the contribution at the end of the abstract: "The study contributes a novel collaborative optimization strategy that effectively mitigates occlusion issues in fruit detection. This approach demonstrates that agronomic techniques can be strategically used to enhance AI perception, offering a significant step forward in the integration of agricultural machinery and agronomy for intelligent orchard systems." This replaces the previous text: "This approach effectively mitigated occlusion-induced fruit miss detection. The study demonstrates the feasibility of optimizing visual perception environments through agronomic intervention, offering a novel strategy for deeper agricultural machinery-agronomy integration in intelligent orchard systems and enabling refined orchard management."

  • 3. add more references related to the use of deep learning for defoliation, detection, prediction etc.

Response:

Thank you very much for your suggestion. In accordance with your instructions, we have added more references related to the use of deep learning for leaf removal, detection and prediction in the introduction section of the original manuscript, and reorganized the review section. The newly added and modified contents are located on page 2 and highlighted in red in the latest manuscript.

  • 4. Add numbers to all equation and use the same format.

Response:

Thank you very much for your suggestion. We have added numbers to all the equations and used the same format throughout the entire manuscript, highlighting them in red in the latest version.

  • 5. Grammar check and fix error such as "Where SSB was the ....." --> where SSB is ....SSW is ...

Response:

Thank you very much for your careful review. We are deeply sorry for our negligence. Based on your feedback, we have made the necessary corrections to ensure the accuracy of the wording throughout the manuscript, and the modifications are as follows:

Revised:

Among them, SSB represents the between-group variation, while SSW represents the within-group variation. The calculation formulas are given in the manuscript.

  • 6. is the height of UAV is 2-3 m above the ground or above the tree? what if the height increases?

Response:

Thank you for your valuable question, which is crucial to the collection of the experimental data. In this study, the unmanned aerial vehicle (UAV) flew at an altitude of approximately 2 to 3 meters above the treetops, rather than above the ground. We have made revisions on page 6 of the latest manuscript. This height was selected mainly to strike a balance between image resolution and flight safety. We apologize for the ambiguity. In subsequent work, we will conduct more in-depth research to more comprehensively assess the impact of flight height on detection performance.

  • 7. YOLO has newer versions than YOLO8, why did the authors used YOLO8 and do they expect better results if newer YOLO is used?

Response:

Thank you for this important point. We chose YOLOv8 for our study for several practical and scientific reasons; we did not evaluate very recent YOLO forks/releases because doing so is outside the manuscript’s stated scope (which focuses on the effect of canopy defoliation on fruit visibility and on demonstrating the benefits of combining a horticultural intervention with an established, high-performing detector). Below we summarize our rationale and our expectations regarding newer YOLO variants.

At the time we designed and executed the experiments, YOLOv8 was a mature, well-documented implementation with stable training recipes and widely available pre-trained weights (Ultralytics distribution). These properties were important for reproducible training and for fair, controlled comparisons (we compared YOLOv8 against YOLOv5 and YOLOv7 under identical data and hyperparameter regimes). Using a stable, widely used baseline reduces confounding factors caused by immature or experimental codebases.

Good trade-off between accuracy, speed and ease-of-integration for UAV workflows.

YOLOv8 provides a favorable balance of accuracy and real-time performance, which is important for UAV-based pipelines where inference latency and model size matter for downstream deployment. In our experiments YOLOv8 produced the best compromise between detection quality and inference time (reported mAP = 0.868 and inference time = 160.3 ms/frame), enabling us to focus on the impact of defoliation rather than on pushing incremental detection numbers.

The core contribution of the paper is the integrated pipeline—showing that a horticultural intervention (moderate defoliation) materially improves UAV detection performance and fruit quality. For that goal it is essential to compare detection performance across canopy treatments using the same detector and training regimen so that observed differences reflect canopy effects rather than differences in model architecture or training. Using YOLOv8 as the common backbone allowed a controlled, apples-to-apples comparison (Control vs T1 vs T2).

Newer YOLO variants (and numerous community forks) frequently appear; evaluating each would multiply computational cost and expand the paper’s scope substantially. Given the project emphasis on horticultural–vision integration, we prioritized a sound, reproducible comparison using a representative state-of-the-art detector rather than exhaustively benchmarking every emerging YOLO variant.

We acknowledge that newer YOLO releases or alternative detection models may yield incremental improvements in mAP, speed, or model size because of architectural tweaks or improved training recipes. However, we do not expect such changes to invalidate the paper’s central conclusions: (i) moderate defoliation significantly reduces occlusion and (ii) this reduction in occlusion materially improves detection metrics when evaluated with the same detector. Put differently, the relative gains attributable to defoliation (e.g., improvements in precision/recall/mAP between control and T1) are unlikely to disappear if a different high-performance detector is used; those gains stem from improved fruit visibility rather than the detector’s absolute ceiling. Thank you again for the helpful suggestion.

  • 8. Add the confusion matrix to the results to show the numbers of true positives, true negatives, false positives, and false negatives.

Response:

We appreciate the reviewer’s suggestion regarding the confusion matrix and explicit reporting of false negatives. In the context of object detection, however, a conventional 2×2 confusion matrix (TP, TN, FP, FN) is not directly meaningful. Unlike classification tasks, the number of true negatives in detection is effectively unbounded, since every background region or pixel not matched to a fruit instance could be considered a negative. For this reason, the established practice in computer vision research is to summarize performance using instance-level metrics such as Precision, Recall, F1-score, and mean Average Precision (mAP).

Importantly, the reviewer’s concern about missed detections is already fully reflected in the Recall metric. Recall is defined as TP / (TP + FN), so the false negatives (missed fruits) are inherently accounted for. Similarly, the F1-score balances Precision and Recall, thereby integrating the effects of both false positives and false negatives. These metrics are widely recognized in the literature as more interpretable and consistent across datasets than raw counts of FN or TN.

Furthermore, in our study we explicitly discussed in the Results section that untreated canopies exhibited a larger number of missed detections, particularly for small fruits (≤25 mm) or those subject to severe occlusion. This qualitative observation is supported quantitatively by the lower Recall values under the control treatment compared with the defoliation treatments. Thus, while we did not present raw FN counts, the influence of missed detections is already captured and communicated through the standard detection metrics and our accompanying analysis.

We therefore believe that Precision, Recall, F1-score, and mAP, together with the treatment-wise comparisons provided, offer a comprehensive and field-relevant evaluation of detector performance without the need for an additional confusion matrix.
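To illustrate the point above, the instance-level metrics can be computed directly from TP/FP/FN counts; FN (missed fruits) enters Recall and therefore F1. This is a minimal sketch with hypothetical counts, not the study's evaluation code:

```python
def detection_metrics(tp: int, fp: int, fn: int) -> dict:
    """Instance-level detection metrics. TN is left out because it is
    effectively unbounded in object detection (any background region)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0  # FN is accounted for here
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}

# hypothetical counts: 80 matched fruits, 15 spurious boxes, 20 missed fruits
m = detection_metrics(tp=80, fp=15, fn=20)
```

Fewer missed detections (smaller FN) raises Recall directly, which is why the Recall improvement reported in Table 3 already reflects the reviewer's concern.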

 

Reviewer 2 Report

Comments and Suggestions for Authors

The article is technically sound and I find the application of deep learning to litchi fruits interesting and useful. But several issues need to be addressed:

1. Some examples:

  • In page 2, "Tang et al. (2023) proposed YOLO-Oleifera based on im-proved YOLOv4-tiny model and binocular stereo vision". There is a hyphen in the middle of a sentence.
  • In page 6, "Where SSB was the between-group fluctuation, SSW was the within-group fluctuation, was specifically the number of groups minus one, and was the sum of the number of individuals in each group minus one." Check writing.
  • In page 6, with bounding boxes delineating only the primary fruit body irrespective of occlusion state." There is an extra quotation mark.
  • In page 6, "The vision system was the core of agricultural UAV" and "The YOLOv8 network structure was shown in Figure 4." use present tense and not past.
  • In the document the word Litchi is used with upper and lower case. The authors should select one option and use it consistently throughout the document.
  • In page 9, "Stage II was mainly characterized by the growth of an embryo, and the rapid aril growth and maturation (Stage III)." Rewrite to be clearer about Stage III, because when you read the text after the "and" it seems part of Stage II.
  • In page 14, "The specific quantitative results were shown in Table 3. When the experimental ob-jects were fruit trees without leaf defoliation processing, the accuracy, recall, mean aver-age precision (mAP), and F1 value of the model were 0.787, 0.805, 0818, and 0.796, respec-tively. On the contrary, the accuracy, recall, mean average precision (mAP), and F1 value of the model were increased to 0.846, 0.839, 0.884, and 0.842, respectively, when the objects with leaf defoliation processing." Check writing, it is not clear what the authors want to convey.

These are just some examples, but a thorough review of the writing must be carried out.

2. Position of the images and tables in the document need to be adjusted.

3. Figure 1 state "The diagram of defoliation treatments", but the image is not a diagram, it is more like examples of defoliation treatments.

4. Since mAP is also a metric presented in the results section, it's necessary to include it as part of the section 2.2.3. Evaluation metrics.

5. In page 7, the text "In this section, comparative experiment" doesn't seem correct. The comparative analysis is not performed in this section, but in Section 3.3, "Fruit Target Detection Performance Analysis". This section presents the metrics used to compare the different methods.

6. The information presented in Table 1 is a bit confusing, mostly due to format; it is difficult to read the results presented. Also, the table presents a column named edible rate. This information is presented in the table, but the authors didn't explain the meaning of the column or why it is important.

7. For the experiments and results presented, the split of the image dataset (train, validation, and test) isn't clear, nor whether the metrics presented were calculated on the test set. Also, it is not clear from the experimental configuration whether k-fold cross-validation was implemented or a hold-out split (80% train - 20% test, for example) run several times. This has to be explained in the experimental setup.

8. In page 13, "The experimental comparison objects include SSD, YOLOv5, and YOLOv8". Review the writing; should it be "The experimental comparison of models includes YOLOv5, YOLOv7, and YOLOv8"?

9. It is not clear to which defoliation treatment the results presented in Table 2 correspond.

10. In page 14, "A UAV image dataset was constructed using the highest-per-forming YOLOv8 model for comparative assessment against the control group (CK)." Rewrite to express what the authors really mean. A model (YOLOv8) is not used to construct the dataset, instead you create a dataset and then evaluated models using that dataset.

11. In page 14, "Untreated canopies (Figs. 8a, 8b) exhibited substantial missed detections due to severe foliar occlusion, particularly affecting small (≤25 mm diameter) and partially obscured fruits." Due to the size of the images is difficult to really observe the missed detections, instead if the authors present a metric based on FN (number of missed fruits), the impact of missed detections would become clearer.

12. In page 14, "accurate identification of 92.7% ± 2.1% targets". What metric is this? Because in Table 3 no metric shows a value of 92.7.

13. In page 14, "with improved boundary precision (IoU: 0.83 ± 0.04 vs. CK's 0.71 ± 0.07)". I guess you mean IoU (T1) vs. IoU (CK). Also, the IoU metric was never explained; at least add a reference where it can be consulted.

 

Author Response

Dear Editor and Reviewers,

We would like to sincerely thank the editor and the reviewers for their valued comments and suggestions, which have helped to improve the quality of the manuscript. We have revised the manuscript carefully based on the comments and hope that it is now suitable for publication in the journal Agronomy. The parts revised in response to the reviewers' suggestions are marked with tracked changes in the manuscript. Our responses are listed below the comments, point by point.

 

Sincerely yours,

Juntao Xiong

College of Mathematics and Informatics, South China Agricultural University, Guangzhou, 510642 China.

E-mail: xiongjt2340@163.com

 

 

Reviewer 2:

The article is technically sound and I find the application of deep learning to Litchi fruits interesting and useful. But several issues need to be addressed:

  1. Some examples:

In page 2, "Tang et al. (2023) proposed YOLO-Oleifera based on im-proved YOLOv4-tiny model and binocular stereo vision". There is an hyphen in the middle of a sentence.

Response:

Thanks. The hyphen is deleted on page 2.

In page 6, "Where SSB was the between-group fluctuation, SSW was the within-group fluctuation, was specifically the number of groups minus one, and was the sum of the number of individuals in each group minus one." Check writing.

Response:

Thank you very much for your careful review and suggestions. We agree that the original sentence on page 6 was unclear and contained grammatical errors. To improve clarity and accuracy, we have revised the sentence as follows:

Original:

"Where SSB was the between-group fluctuation, SSW was the within-group fluctuation, was specifically the number of groups minus one, and was the sum of the number of individuals in each group minus one."

Revised:

"Where SSB represents the between-group variation, SSW represents the within-group variation, (k-1) is the number of groups minus one, and (N-k) is the sum of the number of individuals in each group minus one."

This change has been made on page 6. Thank you for your valuable feedback.
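For reference, the quantities named in the revised sentence (SSB, SSW, and the degrees of freedom k-1 and N-k) can be sketched as follows; this is an illustration with hypothetical data, not the authors' analysis code:

```python
def anova_sums(groups):
    """Between-group (SSB) and within-group (SSW) sums of squares,
    with df_between = k - 1 and df_within = N - k
    (the sum over groups of each group's size minus one)."""
    all_vals = [x for g in groups for x in g]
    grand_mean = sum(all_vals) / len(all_vals)
    group_means = [sum(g) / len(g) for g in groups]
    # SSB: size-weighted squared deviation of each group mean from the grand mean
    ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # SSW: squared deviation of each observation from its own group mean
    ssw = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    k, n = len(groups), len(all_vals)
    return ssb, ssw, k - 1, n - k

# hypothetical example: two treatment groups of three measurements each
ssb, ssw, df_between, df_within = anova_sums([[1, 2, 3], [4, 5, 6]])
```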

In page 6, with bounding boxes delineating only the primary fruit body irrespective of occlusion state." There is an extra quotation mark.

Response:

Thanks. The extra quotation mark is deleted in page 6.

In page 6, "The vision system was the core of agricultural UAV" and "The YOLOv8 network structure was shown in Figure 4." use present tense and not past.

Response:

Thanks. The two mentioned sentences are revised to use the present tense.

In the document the word Litchi is used with upper and lower case. The authors should select one option and used consistently throughout the document.

Response:

Thanks. The word is used as "litchi" consistently throughout the document, except for the Latin name "Litchi chinensis".

In page 9, "Stage II was mainly characterized by the growth of an embryo, and the rapid aril growth and maturation (Stage III)." Rewrite to be clearer about the Stage III, because when you read the text after the and seams part of stage II.

Response:

Thanks. For clarity, on page 9 the descriptions of Stage II and Stage III are rewritten as follows: "Stage II is mainly characterized by the growth of the embryo; the accumulation of pericarp and seed mass is significantly faster than that of fresh aril mass in this stage. Stage III is characterized by rapid aril growth and maturation; 70% to 80% of the fresh aril weight is accumulated during this period." These sentences are based on the following reference: Li, J.G., Huang, H., Huang, X. (2003). A revised division of the developmental stages in litchi fruit. Yuan Yi Xue Bao, 30, 307–310.

In page 14, "The specific quantitative results were shown in Table 3. When the experimental ob-jects were fruit trees without leaf defoliation processing, the accuracy, recall, mean aver-age precision (mAP), and F1 value of the model were 0.787, 0.805, 0818, and 0.796, respec-tively. On the contrary, the accuracy, recall, mean average precision (mAP), and F1 value of the model were increased to 0.846, 0.839, 0.884, and 0.842, respectively, when the objects with leaf defoliation processing." Check writing, it is not clear what the authors want to convey.

Response:

Thank you for pointing this out. We agree that the original writing was unclear. We have revised the paragraph for clarity. The new version now reads: “The quantitative results are presented in Table 3. For trees without defoliation, the accuracy, recall, mAP, and F1-score of the YOLOv8 model were 0.787, 0.805, 0.818, and 0.796, respectively. Under the moderate defoliation treatment, these values increased to 0.846, 0.839, 0.884, and 0.842, respectively.”
This revision has been made on page 14.

These are just some examples, but a thorough review of the writing must be carried out

  2. Position of the images and tables in the document need to be adjusted.

Response:

Thanks. The positions of all images and tables in the document have been adjusted to be in line with the text.

  3. Figure 1 states "The diagram of defoliation treatments", but the image is not a diagram; it is more like examples of defoliation treatments.

Response:

Thanks. The legend of Figure 1 is revised as “Example of the images of defoliation treatments.”

  4. Since mAP is also a metric presented in the results section, it's necessary to include it as part of the section 2.2.3. Evaluation metrics.

Response:

Thank you very much for your suggestion. We agree that mAP should be explicitly included as one of the evaluation metrics in Section 2.2.3, "Evaluation Metrics," as it is also presented in the results section. We have revised the corresponding paragraph to clearly incorporate mAP and provided a more detailed explanation of all evaluation indicators. This modification has been made in Section 2.2.3.
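As a minimal illustration of the per-class average precision (AP) computation underlying mAP (all-point interpolation over the precision-recall curve), the following sketch is illustrative only and is not the evaluation code used in the study:

```python
def average_precision(scores, is_tp, n_gt):
    """AP for one class: area under the interpolated precision-recall curve.
    scores: detection confidences; is_tp: whether each detection matched a
    ground-truth instance; n_gt: number of ground-truth instances (> 0).
    mAP is the mean of this value over all classes."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    tp = fp = 0
    recalls, precisions = [0.0], [1.0]
    for i in order:
        if is_tp[i]:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / n_gt)
        precisions.append(tp / (tp + fp))
    # make precision monotonically non-increasing from right to left
    for j in range(len(precisions) - 2, -1, -1):
        precisions[j] = max(precisions[j], precisions[j + 1])
    # integrate the step function over recall
    return sum((recalls[j + 1] - recalls[j]) * precisions[j + 1]
               for j in range(len(recalls) - 1))

# hypothetical: three detections sorted by confidence, two ground-truth fruits
ap = average_precision([0.9, 0.8, 0.7], [True, True, False], n_gt=2)
```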

  5. In page 7, the text "In this section, comparative experiment" doesn't seem correct. The comparative analysis is not performed in this section, but in Section 3.3, "Fruit Target Detection Performance Analysis". This section presents the metrics used to compare the different methods.

Response:

Thank you very much for your valuable comments. We agree with your observation that the original sentence “In this section, comparative experiment was designed to verify the effectiveness of the proposed detection algorithm.” may cause misunderstanding, as the comparative analysis is actually conducted in Section 3.3 “Fruit Target Detection Performance Analysis.” In this section, we only introduce the metrics used for comparison. Accordingly, we have revised the sentence on page 7 as follows:

Original:
“In this section, comparative experiment was designed to verify the effectiveness of the proposed detection algorithm. Precision, recall, mAP, F1 and Inference time are used as evaluation indicators, and the calculation formula was as follows:”

Revised:
“In this section, the metrics used to compare different detection methods are introduced. Precision, recall, mAP, F1, and inference time are used as evaluation indicators, and their calculation formulas are as follows:”

We appreciate your suggestion, which has helped us improve the clarity and accuracy of the manuscript. The modification has been made on page 7.

 

  6. The information presented in Table 1 is a bit confusing, mostly due to format; it is difficult to read the results presented. Also, the table presents a column named edible rate. This information is presented in the table, but the authors didn't explain the meaning of the column or why it is important.

Response:

Thanks. The edible rate of a fruit refers to the proportion of edible parts in the total weight of the fruit. A high edible rate means more flesh, and it is an important indicator of the economic value of the fruit. The weight and thickness of the peel are also indicators of fruit quality. However, these three indicators showed no difference among treatments. To present the information more clearly, the weight and thickness of the peel, together with the edible rate, were removed from Table 1.

  7. For the experiments and results presented, the split of the image dataset (train, validation, and test) isn't clear, nor whether the metrics presented were calculated on the test set. Also, it is not clear from the experimental configuration whether k-fold cross-validation was implemented or a hold-out split (80% train - 20% test, for example) run several times. This has to be explained in the experimental setup.

Response:

Thank you for this valuable suggestion. As noted in the original manuscript (Section 2.2.1), the UAV images were already described as being randomly partitioned into training, validation, and test subsets. Following the reviewer’s advice, we have now elaborated this part to make it clearer. Specifically, the dataset was divided in a 2:1:1 ratio, with 50% used for training, 25% for validation, and 25% for testing. All the quantitative metrics reported in the Results section (Tables 2 and 3) were calculated on the independent test set, while the validation set was only used for hyperparameter tuning and to monitor overfitting. These details have been added to Section 2.2.1 (page 6) of the revised manuscript.
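The random 2:1:1 hold-out split described above can be sketched as follows; this is illustrative only (the file names and random seed are hypothetical), not the authors' pipeline:

```python
import random

def split_dataset(paths, seed=42):
    """Random hold-out split in a 2:1:1 ratio:
    50% training, 25% validation, 25% testing."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    paths = list(paths)
    rng.shuffle(paths)
    n_train, n_val = len(paths) // 2, len(paths) // 4
    return (paths[:n_train],
            paths[n_train:n_train + n_val],
            paths[n_train + n_val:])

# hypothetical image list
train, val, test = split_dataset([f"img_{i}.jpg" for i in range(100)])
```

Metrics are then reported only on the held-out test subset, with the validation subset reserved for hyperparameter tuning, as stated in the response.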

  8. In page 13, "The experimental comparison objects include SSD, YOLOv5, and YOLOv8". Review the writing; should it be "The experimental comparison of models includes YOLOv5, YOLOv7, and YOLOv8"?

Response:

Thank you very much for your suggestion. According to your advice, we have revised the experimental comparison models from “SSD, YOLOv5, and YOLOv8” to “YOLOv5, YOLOv7, and YOLOv8” to ensure consistency with the actual experimental setup.

This modification has been made in the relevant section 3.3.2.

  9. It is not clear to which defoliation treatment the results presented in Table 2 correspond.

Response:

We thank the reviewer for raising this point. We have clarified that Table 2 presents a baseline comparison of different detection models (YOLOv5, YOLOv7, YOLOv8) using the control (CK) dataset without defoliation. The purpose of Table 2 is to identify the most suitable model architecture under natural orchard conditions. The effect of defoliation treatments (CK vs. T1) is analyzed separately in Table 3. This clarification has been added to both the main text (page 13) and the caption of Table 2.

  10. In page 14, "A UAV image dataset was constructed using the highest-performing YOLOv8 model for comparative assessment against the control group (CK)." Rewrite to express what the authors really mean. A model (YOLOv8) is not used to construct the dataset; instead, you create a dataset and then evaluate models using that dataset.

Response:

Thank you for your valuable comment. To address the ambiguity in the original sentence, we have revised the text as follows:
    Original: “A UAV image dataset was constructed using the highest-performing YOLOv8 model for comparative assessment against the control group (CK).”

Revised: “A UAV image dataset was constructed, and then the highest-performing YOLOv8 model was evaluated using this dataset for comparative assessment against the control group (CK).”

This revision clarifies the sequence of dataset construction and model evaluation, eliminating any misunderstanding. The modification has been made in the main text.

  11. In page 14, "Untreated canopies (Figs. 8a, 8b) exhibited substantial missed detections due to severe foliar occlusion, particularly affecting small (≤25 mm diameter) and partially obscured fruits." Due to the size of the images, it is difficult to really observe the missed detections; instead, if the authors presented a metric based on FN (number of missed fruits), the impact of missed detections would become clearer.

Response:

We appreciate this insightful suggestion. Missed detections are indeed an important factor in evaluating detection performance. While we did not separately present the absolute FN counts, this aspect is inherently reflected in the Recall metric, since Recall = TP / (TP + FN). As reported in Table 3, recall improved from 0.805 (control) to 0.839 (defoliation), which indicates a reduction in missed detections. We agree that presenting FN counts would make this impact more explicit, and we will incorporate such analyses in future studies to further strengthen the results.

  12. In page 14, "accurate identification of 92.7% ± 2.1% targets". What metric is this? Because in Table 3 no metric shows a value of 92.7.

Response: We thank the reviewer for pointing out this inconsistency. The phrase “accurate identification of 92.7% ± 2.1% targets with improved boundary precision (IoU: 0.83 ± 0.04 vs. CK’s 0.71 ± 0.07)” was originally included to qualitatively illustrate our field observations, where the defoliation treatment visibly exposed more fruits and produced a clearer separation between fruit and foliage in UAV imagery. These values were not derived from the standardized evaluation pipeline (Table 2 and Table 3) but were intended as descriptive, experience-based estimates from preliminary manual inspections. We fully acknowledge that this phrasing may cause confusion, as it can be misinterpreted as formal quantitative results comparable to precision, recall, mAP, and F1-score.

To avoid ambiguity and ensure that only systematically computed and reproducible metrics are reported, we have removed this sentence from the revised manuscript. The key findings are now expressed directly through the established evaluation indicators (Precision, Recall, mAP, and F1), which already demonstrate the superiority of moderate defoliation (T1) over untreated canopies. This modification strengthens the clarity and consistency of the Results section while keeping the scientific rigor intact.

Original: Conversely, T1-treated trees demonstrated significantly enhanced fruit exposure, enabling accurate identification of 92.7% ± 2.1% targets with improved boundary precision (IoU: 0.83 ± 0.04 vs. CK's 0.71 ± 0.07) and detection completeness (Figs. 8c-8d).

Revised: Conversely, T1-treated trees demonstrated significantly enhanced fruit exposure, enabling more precise fruit boundary localization and higher overall detection completeness (Figs. 8c–8d; Table 3).

  13. In page 14, "with improved boundary precision (IoU: 0.83 ± 0.04 vs. CK's 0.71 ± 0.07)". I guess you mean IoU (T1) vs. IoU (CK). Also, the IoU metric was never explained; at least add a reference where it can be consulted.

Response:

Thank you for your valuable comment. You are absolutely right: the phrase "IoU: 0.83 ± 0.04 vs. CK's 0.71 ± 0.07" refers to the comparison between IoU (T1) and IoU (CK). We have clarified this in the revised manuscript. In addition, as you pointed out, the IoU metric was not previously explained. We have now included a definition of IoU (Intersection over Union) in the evaluation metrics Section 2.2.3. Thank you for your careful review and helpful suggestion.
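For reference, IoU for axis-aligned bounding boxes can be sketched as follows; this is an illustrative definition, not the study's evaluation code:

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes given as
    (x1, y1, x2, y2) with x1 < x2 and y1 < y2."""
    # coordinates of the intersection rectangle
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# hypothetical predicted vs. ground-truth boxes
score = iou((0, 0, 2, 2), (1, 1, 3, 3))
```

A higher IoU between a predicted box and its matched ground-truth box indicates tighter boundary localization, which is the sense in which T1 outperformed CK.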

 

 

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

The author made the required modifications 
