Article
Peer-Review Record

Improved Faster Region-Based Convolutional Neural Networks (R-CNN) Model Based on Split Attention for the Detection of Safflower Filaments in Natural Environments

Agronomy 2023, 13(10), 2596; https://doi.org/10.3390/agronomy13102596
by Zhenguo Zhang 1,2,3,*, Ruimeng Shi 1, Zhenyu Xing 1,3, Quanfeng Guo 1 and Chao Zeng 1
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 11 September 2023 / Revised: 4 October 2023 / Accepted: 9 October 2023 / Published: 11 October 2023
(This article belongs to the Special Issue AI, Sensors and Robotics for Smart Agriculture)

Round 1

Reviewer 1 Report

Manuscript “Small-sized Detection Method of Safflower Filaments in Natural Environments Based on Improved Faster R-CNN”

This manuscript will be of interest to scientists studying robotic picking operations in agriculture, and the results will have practical application in agriculture. In order to improve the manuscript, I suggest the following corrections:

1. The article states: “The shooting equipment was a Canon E700D camera in July 2023”. This sentence needs correction (in July 2023?).

2. The article already has a Section "2. Materials and methods", so there is no need for a separate Section "3. Method".

3. In the article, it is advisable to present the overall structure of a safflower picking robot with a visual recognition system.

4. The article needs to state the speed at which a safflower-picking robot with visual recognition system (shooting equipment) should move through the field to accurately detect safflower filaments. It is also necessary to specify the recommended width of the safflower row in the field, at which the visual recognition system can correctly detect safflower filaments.

Comments for author File: Comments.pdf

Author Response

We are very grateful for the comments, which we think will be of great help in improving the paper. Below are our specific responses to each question.

This manuscript will be of interest to scientists studying robotic picking operations in agriculture, and the results will have practical application in agriculture. In order to improve the manuscript, I suggest the following corrections:

1. The article states: "The shooting equipment was a Canon E700D camera in July 2023". This sentence needs correction (in July 2023?).

Response: Thanks for your kind reminders. The sentence "The shooting equipment was a Canon E700D camera in July 2023" has been revised and marked in blue in the article. It now reads: "The shooting equipment was the Intel Realsense D435i RGB-D depth camera and Canon E700D camera, during the actual safflower filaments opening period from July 15, 2023, to July 27, 2023."

2. The article already has a Section "2. Materials and methods", so there is no need for a separate Section "3. Method".

Response: Thanks for your kind reminders. Section 2 covers the collection of safflower images and the construction of the safflower dataset; the "Methods" in its title refers to the method for constructing the dataset. Section 3 covers the selection of Faster R-CNN as the object detection framework and its improvement. Therefore, there is no need to merge Sections 2 and 3. To remove the ambiguity, the title of Section 2 has been changed to "Materials" and the title of Section 3 to "Methods".

3. In the article, it is advisable to present the overall structure of a safflower picking robot with a visual recognition system.

Response: Thanks for your kind reminders. The overall structure of the safflower picking robot with a vision system has been added in Section 2.1 of the article. The vision system uses an Intel Realsense D435i RGB-D depth camera to acquire safflower images. The modified sections have been marked in blue and read as follows: "The safflower picking robot is mainly composed of a visual recognition system, a three-axis linear slide, an electronic control box, an end-effector, a calculator, and a machine frame. The overall structure is shown in Figure 1. The safflower picking robot moves at a speed of 0.2 m/s in the field. The vision system uses an Intel Realsense D435i RGB-D depth camera to capture the safflower image and obtain the positional information of the safflower filaments. The controller drives the end-effector to move to the picking position based on the obtained filament information."

Figure 1. The overall structure of the safflower picking robot. 1. walking device; 2. collecting box; 3. machine frame; 4. visual recognition system; 5. calculator; 6. three-axis linear slide; 7. end-effector; 8. electronic control box; 9. mobile power.

4. The article needs to state the speed at which a safflower-picking robot with visual recognition system (shooting equipment) should move through the field to accurately detect safflower filaments. It is also necessary to specify the recommended width of the safflower row in the field, at which the visual recognition system can correctly detect safflower filaments.

Response: Thanks for your kind reminders. The article aims to optimize the Faster R-CNN model, and tests verify that the Faster R-CNN-S101 model improves the detection accuracy of small-sized safflower filaments. Tests show that safflower filaments can be accurately detected when the safflower picking robot moves at a speed of 0.2 m/s. The visual recognition system can correctly detect safflower filaments in a test safflower field with 50 cm spacing between rows and 20 cm spacing between plants within each row. These details have been added and marked in blue in Sections 2.1 and 2.2 of the article: "The safflower picking robot moves at a speed of 0.2 m/s in the field.", "The planting pattern is standardized to 50 cm spacing between rows of safflower and 20 cm spacing between plants per row of safflower."

Author Response File: Author Response.docx

Reviewer 2 Report

Please see the attached file

Comments for author File: Comments.pdf

Author Response

We are very grateful for the comments, which we think will be of great help in improving the paper. Below are our specific responses to each question.

Overall, the approach of the article is sound and contains novelty that is publishable. The authors must address the following comments.

1. In the title of the paper, the authors are invited to find a suitable title that is simple and contains solid technical words for the scientific community.

Response: Thanks for your kind reminders. The article adopts the Faster R-CNN model as the detection framework for safflower filaments. Combined with the Split Attention module of ResNeSt-101 for multi-scale effective feature extraction, it reduces the feature error of small-sized safflower filaments and improves the detection accuracy. Therefore, the title has been modified to "Improved Faster R-CNN based on Split-Attention for Detection Model of Safflower Filaments in Natural Environments".
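For readers unfamiliar with the Split Attention mechanism mentioned above, the following is a minimal PyTorch sketch of the idea (an illustrative simplification with a single cardinal group and radix 2, not the authors' ResNeSt-101 implementation): the feature map is expanded into radix splits, the splits are fused and pooled into a channel descriptor, and a softmax over the splits re-weights them before recombination.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SplitAttention(nn.Module):
    # Simplified Split-Attention block (ResNeSt-style): one cardinal group, radix splits.
    def __init__(self, channels, radix=2, reduction=4):
        super().__init__()
        self.radix = radix
        self.channels = channels
        inter = max(channels // reduction, 32)
        # a grouped conv produces radix feature-map "splits" from the input
        self.conv = nn.Conv2d(channels, channels * radix, kernel_size=3,
                              padding=1, groups=radix, bias=False)
        self.bn = nn.BatchNorm2d(channels * radix)
        # small bottleneck turning the pooled, fused descriptor into per-split logits
        self.fc1 = nn.Conv2d(channels, inter, kernel_size=1)
        self.fc2 = nn.Conv2d(inter, channels * radix, kernel_size=1)

    def forward(self, x):
        b = x.size(0)
        splits = F.relu(self.bn(self.conv(x)))
        splits = splits.view(b, self.radix, self.channels, *splits.shape[-2:])
        fused = splits.sum(dim=1)                      # element-wise fusion of the splits
        gap = F.adaptive_avg_pool2d(fused, 1)          # global channel descriptor
        attn = self.fc2(F.relu(self.fc1(gap)))         # per-split channel attention logits
        attn = F.softmax(attn.view(b, self.radix, self.channels, 1, 1), dim=1)
        return (attn * splits).sum(dim=1)              # attention-weighted recombination


x = torch.randn(2, 64, 32, 32)
print(SplitAttention(64)(x).shape)   # torch.Size([2, 64, 32, 32])
```

In ResNeSt, blocks of this form replace the standard residual bottlenecks, which the response above credits with improving multi-scale feature extraction for small, dense filaments.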

2. The authors are asked to avoid general sentences in the abstract part. For example, it is necessary to add the types of data, as well as the types of agricultural fields in which they are extracting data.

Response: Thanks for your kind reminders. The abstract has been revised, with particular attention to replacing generalized statements about the data. The data description now reads as follows: "The test results showed that the mean Average Precision (mAP) of the improved Faster R-CNN reached 91.49%. Compared with Faster R-CNN, YOLOv3, YOLOv4, YOLOv5, and YOLOv6, the improved Faster R-CNN increased the mAP by 9.52%, 2.49%, 5.95%, 3.56%, and 1.47%, respectively. The mAP of safflower filament detection was higher than 91% under sunny, cloudy, and overcast conditions, in sunlight and backlight, and under branch-and-leaf occlusion and dense occlusion." The modified parts have been marked in red.

3. Please add a reference with application as:

Response: Thanks for your kind reminders. After carefully reading the recommended papers, we found that three of them, concerning vegetation detection in agricultural areas, are important references for safflower filament detection. They have been added and marked in red in the article.

4. In the introduction section, please highlight the contribution of your work by placing it in context with the work that has been done previously in the same domain. Please add the main research questions at the end of the introduction section and add the structure of the article. You must describe the objectives of each section.

Response: Thanks for your kind reminders.

Re 1 Studies in the same field have been compared and summarized in the introduction. Current research has many shortcomings regarding small-size detection and interference from natural environments. This article addresses these issues by optimizing the Faster R-CNN model to improve the detection accuracy of small-sized safflower filaments in natural environments. The additions are as follows: "The above studies demonstrated the feasibility and effectiveness of deep learning methods applied to safflower filament detection, providing a reference for filament detection. However, the main objects of the current methods were the detection of apple blossom, tomato, and hydroponic lettuce. These have larger object sizes than safflower filaments and sparser target distributions than filaments. Meanwhile, these methods did not consider the interference of the natural environment on target detection. In reality, safflower filaments are small and dense, with severe branch-and-leaf occlusion and dense occlusion. In addition, different weather and illumination can also affect detection in natural environments. Therefore, there is still the problem of insufficient performance in practical applications.", "To effectively solve the detection problems posed by small-sized filaments, weather, and occlusion interference, avoiding misdetection and omission, images of safflower were collected in different natural environments to construct datasets for model training and evaluation. Combining ResNeSt-101 and ROI Align, and optimizing the anchor boxes in Faster R-CNN, safflower filament detection was explored in natural environments. The following is a list of the key contributions of the study:

(1) Aiming at the interference problem of natural environments, ResNeSt-101 used the Split Attention module to carry out multi-scale effective feature extraction. Meanwhile, enhancing the ability of local feature extraction improved the detection accuracy and avoided missed and false detections.

(2) Based on the size of safflower filaments, the anchor box sizes in the region proposal network were optimized using the PAM clustering algorithm. The anchor box sizes matched the filament sizes more closely, leading to improved accuracy in detecting small-sized targets.

(3) ROI Align was used instead of ROI Pooling to reduce the feature error of safflower filaments caused by double quantization. The target bounding box was depicted more accurately, improving the detection accuracy."
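To make contribution (2) above concrete, here is a minimal sketch of how anchor sizes could be derived by clustering labelled box dimensions with a PAM-style (k-medoids) algorithm. It is an illustrative simplification using Euclidean distance and made-up box sizes, not the authors' exact procedure or data.

```python
import numpy as np


def k_medoids(points, k, iters=100, seed=0):
    # Simplified PAM-style k-medoids: alternate assignment and medoid update
    # until the medoids stop changing (a sketch, not the authors' exact algorithm).
    rng = np.random.default_rng(seed)
    n = len(points)
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)  # pairwise distances
    medoids = rng.choice(n, size=k, replace=False)
    for _ in range(iters):
        labels = np.argmin(dist[:, medoids], axis=1)            # assign each box to its nearest medoid
        new_medoids = medoids.copy()
        for j in range(k):
            members = np.where(labels == j)[0]
            if members.size == 0:
                continue
            costs = dist[np.ix_(members, members)].sum(axis=0)  # cost of each member serving as medoid
            new_medoids[j] = members[np.argmin(costs)]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    return points[medoids]


# Hypothetical (width, height) pairs of labelled filament boxes, in pixels.
boxes_wh = np.array([[18, 22], [20, 25], [16, 19], [45, 60], [50, 55], [48, 58]], dtype=float)
print(k_medoids(boxes_wh, k=2))   # candidate anchor (width, height) sizes for the RPN
```

The resulting medoid (width, height) pairs would then replace the default anchor scales and aspect ratios in the region proposal network.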

Re 2 The main research questions have been added in the introduction section, and the additions have been marked in red. The additions are as follows: "The detection of safflower filaments has many particularities and faces the following problems in natural environments. Firstly, safflower filaments are small, dense, and unevenly distributed. Since detecting small objects is a difficult task, most object detection models have a tough time dealing with small objects [22]. Secondly, densely planted safflower is severely shaded by branches and leaves, leading to missed detections. Thirdly, detection can be interfered with by weather and illumination variations in natural environments, making the color and texture characteristics of filaments unevenly represented and leading to misdetection."

Re 3 The structure of the article has been added at the end of the introduction section, and the purpose of each section is described. The additional text has been highlighted in red and reads as follows: "This article is structured as follows. Section 2 describes the safflower picking robot and the safflower image dataset used. Section 3 proposes an improved Faster R-CNN framework for safflower filament detection. Section 4 describes the evaluation indicators of the network model. Section 5 conducts multiple sets of experiments and evaluates and comparatively analyzes the results. Section 6 gives the conclusions of the article and future work."
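Contribution (3) above swaps ROI Pooling for ROI Align. As an illustration only (the feature-map resolution, image size, and box coordinates below are assumptions, not values from the paper), torchvision exposes this operator directly:

```python
import torch
from torchvision.ops import roi_align

# Assumed shapes: a 256-channel feature map at 1/8 resolution of a 400x400 input image.
features = torch.randn(1, 256, 50, 50)

# One region proposal as (batch_index, x1, y1, x2, y2), in input-image coordinates.
rois = torch.tensor([[0.0, 24.3, 30.7, 110.9, 95.2]])

# ROI Align samples features with bilinear interpolation instead of rounding box
# coordinates and bin edges to the grid (the "double quantization" of ROI Pooling),
# so small boxes such as safflower filaments keep sub-pixel accuracy.
pooled = roi_align(features, rois, output_size=(7, 7),
                   spatial_scale=50 / 400, sampling_ratio=2, aligned=True)
print(pooled.shape)   # torch.Size([1, 256, 7, 7])
```

The aligned=True setting additionally removes the half-pixel offset of the legacy behavior; the key difference from ROI Pooling remains the interpolation in place of coordinate rounding.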

5. The data used in Figure 2 must be explained in detail.

Response: Thanks for your kind reminders. The description of the image data has been reorganized to give the number of images of each type, and the modified parts have been highlighted in red. According to the weather, illumination, and occlusion conditions in the natural environment, the scenes are categorized as sunny, cloudy, overcast, sunlight, backlight, branch-and-leaf occlusion, and dense occlusion; images of the different scenes are collected to train the detection model and evaluate how well it adapts to the natural environment. The additions are as follows: "The safflower images were obtained from a safflower plantation in Ili Kazakh Autonomous Prefecture, China. The shooting equipment was the Intel Realsense D435i RGB-D depth camera and Canon E700D camera, during the actual safflower filaments opening period from July 15, 2023, to July 27, 2023. The planting pattern is standardized to 50 cm spacing between rows of safflower and 20 cm spacing between plants per row of safflower. The image acquisition process simulates the natural environments of the safflower-picking robot during field operations, capturing images of different weather, light, and shade conditions.", "The total amount of the expanded safflower filaments dataset reaches 4500 images. The number of images under sunny, cloudy, and overcast conditions was 2260, 1083, and 1167, respectively. The number of images under sunlight and backlight conditions was 1272 and 988, respectively. The number of images under dense occlusion and branches and leaves occlusion was 903 and 1197, respectively."

6. Please add references and the F1 score.

Response: Thanks for your kind reminders. References and the F1 score have been added to the evaluation indicators. Section 4.2, Evaluation indicators of the network model, has been revised and highlighted in red. The modifications are as follows: "The Precision rate (P), the Recall rate (R), the F1 score, the Average Precision (AP), and the mean Average Precision (mAP) were chosen as the metrics of filament recognition precision [31,32]. The F1 score is the harmonic mean of precision and recall. AP is the area enclosed by the precision-recall curve and the coordinate axes. The mAP is the average of the AP values over the different categories. The evaluation indicators are calculated as [33]:"
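For reference, the standard definitions behind these indicators (consistent with the quoted text, with TP, FP, and FN denoting true positives, false positives, and false negatives) are:

```latex
P = \frac{TP}{TP + FP}, \qquad
R = \frac{TP}{TP + FN}, \qquad
F_1 = \frac{2PR}{P + R}, \qquad
AP = \int_{0}^{1} P(R)\,\mathrm{d}R, \qquad
mAP = \frac{1}{N}\sum_{i=1}^{N} AP_i
```

where N is the number of categories (here, the opening and flower-shedding periods).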

7. A comparative analysis of results is very important. In many places, I cannot see that the authors compared the results with previous studies.

Response: Thanks for your kind reminders. The Faster R-CNN-S101 results have been supplemented with comparisons against the Faster R-CNN, YOLOv3, YOLOv4, YOLOv5, and YOLOv6 models from previous studies. Section 5.2, Comparison with other object detection models, has been modified and marked in red. The modification is as follows: "To evaluate the performance of the Faster R-CNN-S101 model, we report quantitative comparison results, as shown in Table 1. Faster R-CNN-S101 had a precision of 86.75%, a recall of 92.19%, an AP of 93.19% in the opening period, an AP of 89.93% in the flower-shedding period, and a mAP of 91.49%. All values of Faster R-CNN-S101 were higher than those of YOLOv3, YOLOv4, YOLOv5, and Faster R-CNN, with the mAP improved by 2.49%, 5.95%, 3.56%, and 9.52%, respectively. The results showed that the overall improvement of Faster R-CNN-S101 could better detect small-sized safflower filaments and increase the detection mAP. Compared with YOLOv6, Faster R-CNN-S101 showed a detection precision lower by 0.37%, but a recall higher by 7.56%, an AP in the opening period higher by 3.57%, an AP in the flower-shedding period higher by 0.02%, and a mAP higher by 1.47%. In particular, the recall rate and the AP of filaments in the blooming period were improved significantly. This showed that Faster R-CNN-S101 could extract richer feature information, distinguish filaments from branches, leaves, and background areas, and reduce false and missed detections.

The results showed that the recall, AP, and mAP of the Faster R-CNN-S101 model were significantly higher than those of the other models. The Faster R-CNN-S101 model was better than the other models at detecting small-sized safflower filaments, with excellent comprehensive detection performance."

8. Authors are invited to join the algorithmic study.

Response: Thanks for the invitation. We would also be keen to collaborate on an algorithmic study.

9. Please write the main suggestions at the end of the conclusion section.

Response: Thanks for your kind reminders. The key recommendations have been added at the end of the conclusions. The revisions in Section 6, Conclusions and future work, have been marked in red and read as follows: "In the next step of the study, more types of safflower images will be collected to increase the diversity of the dataset. Meanwhile, the Faster R-CNN model will be improved and optimized according to the detection tasks in other scenes, extending the applicability of the Faster R-CNN-S101 model. In addition, there is a gap between the Faster R-CNN-S101 model and one-stage object detection models in terms of detection speed. The above issues will continue to be researched and explored in the following research work."

Author Response File: Author Response.docx

Round 2

Reviewer 2 Report

The authors have made a great effort to improve the paper and now the work can be accepted.
