Peer-Review Record

Wheat Ear Detection Algorithm Based on Improved YOLOv4

Appl. Sci. 2022, 12(23), 12195; https://doi.org/10.3390/app122312195
by Fengkui Zhao 1,2,3, Lizhang Xu 2,*, Liya Lv 1,* and Yong Zhang 1
Submission received: 16 September 2022 / Revised: 31 October 2022 / Accepted: 8 November 2022 / Published: 29 November 2022

Round 1

Reviewer 1 Report

Interesting article, defines the initial parameters of autonomous agricultural production.

Author Response

Dear Editor,

I really appreciate your work.

All the best,

Fengkui Zhao

Reviewer 2 Report

The paper proposes a method for automatically counting wheat ears in images. The method is based on object detection via a variant of the YOLOv4 network and is very simple: wheat ears detected by the network with a confidence level of at least 0.5 are counted as actual ears. With respect to YOLOv4, the proposed network includes additional spatial pyramid pooling (SPP) modules in the first detection stage (in the Neck, to be precise), which supposedly improves the network's ability to detect small objects such as wheat ears. Experiments show that the proposed network yields better detection performance than the standard YOLOv4 network: for instance, the precision in detecting wheat ears increases from 91.25 to 96.76. As for the counting performance, see concern #2 below.
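For clarity, the counting rule summarized above amounts to a simple threshold on detection confidence. The following is a minimal sketch of that rule; the detection format (a confidence paired with a bounding box) is a hypothetical illustration, not the authors' actual implementation.

```python
# Minimal sketch of the counting rule described above: every detection
# whose confidence is at least 0.5 is counted as an actual wheat ear.
# The (confidence, bbox) tuple format is a hypothetical stand-in for
# the network's raw output, not the authors' real interface.

CONF_THRESHOLD = 0.5

def count_wheat_ears(detections):
    """Count detections whose confidence meets the threshold."""
    return sum(1 for conf, _bbox in detections if conf >= CONF_THRESHOLD)

# Example: three detections, two of which clear the 0.5 threshold.
example = [
    (0.92, (10, 10, 40, 60)),
    (0.35, (50, 20, 80, 70)),
    (0.61, (90, 15, 120, 65)),
]
print(count_wheat_ears(example))  # prints 2
```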

I have five significant concerns about this submission, reported below in decreasing order of severity. I therefore ask for a MAJOR REVISION before the paper is reconsidered for publication.

1) The experiments fail to prove that the proposed network detects wheat ears better than *competing networks specifically designed for the same target object*. The authors compare their network only with YOLOv3 and YOLOv4, which are general-purpose. The authors know that special-purpose networks exist, and mention many in Section 1. To sum up: additional experiments are needed to compare the proposed network with the existing literature, adjusting performance metrics (advice: IoU) where appropriate. Instead, I think the authors can remove the comparison with YOLOv3: this network is evidently worse than YOLOv4 across the board.

2) The experiments fail to prove that the proposed network *counts* more accurately than something else. Counting wheat ears, which is a relevant aspect of the research, is different from detecting them. In theory, a network may detect with lower accuracy but be better at counting because, e.g., false positives and false negatives compensate for some reason. The submission contains experiments related to counting (Table 2, Figure 5), but they are self-referential: no comparisons with other networks, not even with YOLOv4. To sum up: once again, more experiments are required. Comparisons should be performed with both YOLOv4 (which should be easy) and special-purpose networks.
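The compensation effect mentioned above can be made concrete with a small numeric illustration (hypothetical numbers, chosen only to demonstrate the point, not taken from the paper): a detector with worse detection accuracy can nevertheless produce a more accurate count when its false positives and false negatives cancel out.

```python
# Hypothetical illustration: a detector with lower detection accuracy
# can still count better, because false positives and false negatives
# can compensate in the total count. All numbers are invented.

ground_truth = 100  # actual number of wheat ears in an image

# Detector A: high detection accuracy (no false positives),
# but its three misses go uncompensated.
a_true_pos, a_false_pos = 97, 0

# Detector B: lower detection accuracy (8 misses, 8 spurious boxes),
# but the errors cancel out in the total.
b_true_pos, b_false_pos = 92, 8

a_count = a_true_pos + a_false_pos  # 97
b_count = b_true_pos + b_false_pos  # 100

print(abs(a_count - ground_truth))  # counting error of A: prints 3
print(abs(b_count - ground_truth))  # counting error of B: prints 0
```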

3) The proposed modification to the network architecture is minor. I will re-evaluate this aspect in the revised submission. If a minor modification leads to significant performance gains, then no problem! 

4) The paper does not explain how the proposed modification relates to what the authors themselves (lines 92-98) consider as the two open issues in their research area. The first issue comes from environmental conditions (suboptimal lighting, wind leading to blurred images, et cetera). The second issue comes from variations in wheat ears. I do not see how adding SPP modules addresses any of the two. Please expand Section 2.2 to explain.

5) The document is littered with English language errors, sometimes with comical effects (e.g., "imposed algorithm" instead of "proposed algorithm"). Occasionally, the style is too colloquial (e.g., "bags of tricks"). A substantial, albeit not colossal, revision is mandatory.

Author Response

Dear Reviewer,

Thank you very much for your comments.

Please see the attachment.

Author Response File: Author Response.docx

Reviewer 3 Report

The authors improved wheat ear detection performance in images captured under natural field conditions. The authors claim that the improved model performs well compared with state-of-the-art results.

My reviews and suggestions about their publication are listed below:

The article is generally well organized and prepared.

The YOLOv4 object detection algorithm should be described in more detail. In addition, a general block diagram of the study should be presented and explained.


You should present more experimental results for your work. Sample applications should be demonstrated on real images taken at different times; the experimental studies are insufficient as they stand. You should also provide comparisons with similar studies, in particular studies from recent years.


It would be good if "Figure 4. Detection results of some example images" were presented as several different images, in a form that is more understandable for the reader.


YOLOv7, the newest YOLO algorithm, surpasses all previous object detection models and YOLO versions in both speed and accuracy. It requires considerably cheaper hardware than other neural networks and can be trained much faster on small datasets without any pre-trained weights. If possible, I suggest you try the YOLOv6 or YOLOv7 algorithms.

Author Response

Dear Reviewer,

Thank you very much for your comment.

Please find the response in the attachment.

Best wishes,

Fengkui Zhao

Author Response File: Author Response.docx

Round 2

Reviewer 2 Report

The authors addressed my main concerns:

- the proposed network is now compared with one special-purpose network (Concern #1 in my previous review) called CBAM-YOLOv4,

- the counting performance of the proposed network is now compared with other networks (Concern #2).

I think the authors could have done much more to strengthen their paper: indeed, they provided only minor changes instead of the major revision I had asked for. The revision was hastily assembled and rushed out: there are still problems with the language and style, and the authors did not even realize that the references in the PDF (bibliographic, figures, tables) are all broken. I disapprove of this behavior. Furthermore, the new comparison with CBAM-YOLOv4 is somewhat inconclusive because it shows that the proposed network is sometimes better and sometimes worse than the reference. All in all, not a strong paper. However, it is above the publishability threshold, albeit barely.

Author Response

Dear reviewer,

Thank you very much for your comment. According to the suggestions of the editing service, we have revised the manuscript. Compared with CBAM-YOLOv4 on the first dataset, the AP of the proposed algorithm is smaller by 1.24%, and the F1 value is smaller by 0.06. The gap is very small. However, the proposed algorithm works much better on the GWHD dataset: AP is larger by 4.85%, F1 is larger by 0.04, and precision is significantly larger, by 8.85%. We believe the proposed algorithm is effective considering the overall performance.

We are grateful for all your work.

Best wishes,

Fengkui Zhao
