Article
Peer-Review Record

Target Soybean Leaf Segmentation Model Based on Leaf Localization and Guided Segmentation

Agriculture 2023, 13(9), 1662; https://doi.org/10.3390/agriculture13091662
by Dong Wang 1, Zetao Huang 1, Haipeng Yuan 1, Yun Liang 1,*, Shuqin Tu 1 and Cunyi Yang 2
Submission received: 14 July 2023 / Revised: 18 August 2023 / Accepted: 18 August 2023 / Published: 23 August 2023
(This article belongs to the Section Digital Agriculture)

Round 1

Reviewer 1 Report

1. Did the authors perform any phenotyping classification in their proposed model?

2. The details discussed in Section 2 regarding the Libra R-CNN model can be limited, as this is the authors' proposed work. However, the part emphasizing accurate leaf localization, as discussed in Section 2.2.2, can be kept as it is.

3. Is the RPN shown in Figure 8 proposed by the authors? If yes, please mark it as "proposed", or, in case of minor modifications, as "improved".

4. The overall flow of the manuscript is haphazard. Line 236 discusses detection loss, which could more specifically be mentioned while writing the detection details.

5. Please elaborate on the Figure 9 caption.

6. If section 2.2.2 (b) is explained in Section 3.1, why is it discussed earlier in Section 2?

7. From where is the target leaf bounding box in Figure 11 obtained?

8. The caption of Table 3 needs more explanation.

9. It is suggested not to refer to any figure in the conclusion section; figures can be discussed in the results section.

Please also correct typos and grammatical mistakes, such as "Deep Labv3" on lines 464 and 466, and others.

Author Response

Response to Reviewer 1 Comments

 

1. Did the authors perform any phenotyping classification in their proposed model?

 

Response: Thank you for raising this question. We did not perform any phenotyping classification in our model.

2. The details discussed in Section 2 regarding the Libra R-CNN model can be limited, as this is the authors' proposed work. However, the part emphasizing accurate leaf localization, as discussed in Section 2.2.2, can be kept as it is.

 

Response: Thank you for your suggestion. We have simplified the description of Libra R-CNN, leaving out unnecessary details. At the same time, we have also enriched the content on target leaf localization. You can see the changes in this section on Lines 153-215.

 

3. Is the RPN shown in Figure 8 proposed by the authors? If yes, please mark it as "proposed", or, in case of minor modifications, as "improved".

4. The overall flow of the manuscript is haphazard. Line 236 discusses detection loss, which could more specifically be mentioned while writing the detection details.

 

Response: Thank you for your suggestions and for your careful advice on the organization of the paper. We have made some adjustments to the structure of the manuscript and have incorporated the discussion on detection loss into the specific detection details. You can refer to Lines 179-188 and 204-213 for more details.

5. Please elaborate on the Figure 9 caption.

 

Response: Thank you for your advice. In order to express the idea more clearly, we have changed the style of Figure 9, as shown below. The caption of Figure 9 is rewritten as: "Figure 9. The accuracy of the Target Leaf Localization Algorithm using different values of the two localization parameters. The green dots indicate that the accuracy reaches 100% with the corresponding parameter values. The red dots indicate that there are wrong target leaf localizations with the corresponding parameter values in our test data." We have moved the content of this section to Section 3.1, so Figure 9 is now numbered Figure 15.

6. If section 2.2.2 (b) is explained in Section 3.1, why is it discussed earlier in Section 2?

 

Response: Thank you very much for your suggestion. The vertex offset strategy is a part of the model, and the optimal vertex offset comes from experiments, so we discussed them separately. Following your suggestion, we have put them together in Section 3.1. You can check directly on Line 348.

7. From where is the target leaf bounding box in Figure 11 obtained?

 

Response: Thank you for your question. The target leaf bounding boxes shown in Figure 11 are generated by Libra R-CNN. Initially, Libra R-CNN detects all possible leaves in the image, and each leaf is assigned a bounding box. Then, our model runs the target leaf localization algorithm to filter out the bounding boxes that belong to the target leaves; these bounding boxes are the input for Figure 11. Due to the adjustment of the paper structure, Figure 11 is now renumbered as Figure 8. We group Libra R-CNN and the leaf localization algorithm into a single target leaf localization module, as they serve the same purpose of finding the bounding boxes of target leaves. This is mentioned on Lines 136 and 149.
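For illustration, the localization idea can be sketched in a few lines of Python. This is not the exact algorithm from the paper; the box format, the weighted-sum scoring, and the default weights are all assumptions:

```python
import math

def locate_target_leaf(boxes, image_size, center_weight=1.0, area_weight=1.0):
    """Pick the detected box most likely to be the target leaf.

    boxes: list of (x1, y1, x2, y2) leaf boxes from the detector.
    image_size: (width, height) of the image.
    The target leaf is assumed to lie close to the image center and to be
    relatively large, so each box is scored by a weighted combination of
    center proximity and normalized area.
    """
    w, h = image_size
    cx, cy = w / 2.0, h / 2.0
    max_dist = math.hypot(cx, cy)  # farthest possible center distance
    best_box, best_score = None, -1.0
    for x1, y1, x2, y2 in boxes:
        bx, by = (x1 + x2) / 2.0, (y1 + y2) / 2.0
        dist = math.hypot(bx - cx, by - cy) / max_dist  # 0 = centered
        area = (x2 - x1) * (y2 - y1) / float(w * h)     # 0..1
        score = center_weight * (1.0 - dist) + area_weight * area
        if score > best_score:
            best_box, best_score = (x1, y1, x2, y2), score
    return best_box
```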

8. The caption of Table 3 needs more explanation.

 

Response: Thank you very much for your suggestion. We have changed the title of Table 3 to "Quantitative comparisons with baseline segmentation models". Since we have adjusted the structure of the paper, Table 3 has become Table 4.

9. It is suggested not to refer to any figure in the conclusion section; figures can be discussed in the results section.

 

Response: Thank you very much for your suggestion. We have removed the figure and made adjustments to the conclusion section. You can check directly on Line 436.

Please correct typos and grammatical mistakes, such as "Deep Labv3" on lines 464 and 466, and others.

 

Response: Thank you for your suggestion. We have carefully reviewed the manuscript and corrected spelling and grammar errors; see, for example, Lines 414-433.

Author Response File: Author Response.docx

Reviewer 2 Report

The paper proposes an automatic soybean leaf segmentation method based on object detection and an interactive segmentation model, in which the Libra R-CNN object detection algorithm is used to detect all soybean leaves; a segmentation network is then applied and optimized to completely enclose the whole leaf. The overall novelty of the paper is limited, and substantial improvements are needed, summarized as follows:

The abstract appears to be lacking in clarity and adequate representation of the paper's contributions. It fails to concisely convey the novel aspects and significance of the research. Additionally, the absence of any mention of experimental results further weakens the abstract, as these outcomes are critical in validating the proposed approach and assessing its effectiveness.
In the introduction section, the language employed throughout is unclear and poorly structured, making it difficult for readers to grasp the paper's core message. Moreover, the lack of clear research objectives makes it challenging to understand the specific goals of the study and the problem the authors aim to address. This absence of a well-defined research focus detracts from the overall impact and contribution of the work.
• In the introduction section, the authors claim that “This paper presents a target soybean leaf segmentation model. First, all soybean leaves in the image are detected using object detection algorithm…”. I cannot see any research contribution in this part.

In the introduction section, the authors claim “An automatic target soybean leaf segmentation model was designed.” What is meant by automatic? How is the designed model novel?
• What is meant by “soybean leaf location algorithm”?

 In section 2.1, the discussion lacks critical information pertaining to the acquisition parameters, such as time, location, camera specifications, and other relevant details. These parameters play a pivotal role in understanding the data collection process and ensuring the reproducibility of the study. Furthermore, the section fails to provide sufficient annotation details, which are vital for comprehending the ground truth labels and evaluating the reliability of the dataset. Without this essential information, readers are left with uncertainty about the dataset's quality and suitability for the proposed model evaluation. Incorporating comprehensive acquisition parameters and annotation specifics in this section is crucial to bolster the credibility and rigor of the research.

In section 2.2, the paper appears to suffer from significant issues in its organization and presentation. The lack of a clear and coherent structure makes it challenging for readers to follow the flow of information. Additionally, this section is burdened with an abundance of unnecessary details and figures, which obscure the essential aspects of the proposed approach. The extensive discussion of previously published networks further complicates matters, as it becomes difficult for the reader to discern what is genuinely novel and original in the authors' work compared to what has already been explored in the literature.
• In section 2.2, the paper necessitates a rigorous revision to establish a consistent and standardized definition of symbols, equations, and abbreviations throughout the document. Presently, there is an evident lack of clarity and coherence in the usage of these elements, resulting in confusion and ambiguity for the reader.

In section 3, the discussion misses important experimental configurations and does not justify the choice of training parameters.
• In section 3, the comparative experiment misses state-of-the-art models and only considers legacy techniques.
• Did the authors ensure fairness in experimental comparisons?
• Which validation strategy is used?
• What is meant by “The statistics results are shown in Table 3.”? I cannot see any statistics.
• Complexity and ablation analysis are missed.
• We observe a mismatch in mask color between Figure 2 and Figure 18.
• The paper is marred by a noticeable poor quality of language, which hinders its overall readability and comprehension. The writing is riddled with grammatical errors, awkward sentence structures, and inconsistent terminology, making it difficult for the reader to fully grasp the intended meaning of the content.

The English language requires improvement.

Author Response

Response to Reviewer 2 Comments

 

1. The abstract appears to be lacking in clarity and adequate representation of the paper's contributions. It fails to concisely convey the novel aspects and significance of the research. Additionally, the absence of any mention of experimental results further weakens the abstract, as these outcomes are critical in validating the proposed approach and assessing its effectiveness.

 

Response: Thank you very much for your suggestions on our abstract. We have reorganized the content of the abstract and supplemented the missing parts based on your suggestions. The modified content is listed as follows:

 

The phenotypic characteristics of soybean leaves are of great significance for studying the growth status, physiological traits, and response to the environment of soybeans. The segmentation model for soybean leaves plays a crucial role in morphological analysis. However, current baseline segmentation models are unable to accurately segment leaves in soybean leaf images due to issues like leaf overlap. In this paper, we propose a target leaf segmentation model based on leaf localization and guided segmentation. The segmentation model adopts a two-stage segmentation framework. The first stage involves leaf detection and target leaf localization. Based on the idea that a target leaf is close to the center of the image and has a relatively large area, we propose a target leaf localization algorithm. We also design an experimental scheme to provide optimal localization parameters to ensure precise target leaf localization. The second stage utilizes the target leaf localization information obtained from the first stage to guide the segmentation of the target leaf. To reduce the dependency of the segmentation results on the localization information, we propose a solution called guidance offset strategy to improve the segmentation accuracy. We design multiple guided model experiments and select the one with the highest segmentation accuracy. Experimental results demonstrate that the proposed model exhibits strong segmentation capabilities, with the highest average precision (AP) and average recall (AR) reaching 0.976 and 0.981, respectively. We also compare our segmentation results with current baseline segmentation models, and multiple quantitative indicators and qualitative analysis indicate that our segmentation results are better.

2. In the introduction section, the language employed throughout is unclear and poorly structured, making it difficult for readers to grasp the paper's core message. Moreover, the lack of clear research objectives makes it challenging to understand the specific goals of the study and the problem the authors aim to address. This absence of a well-defined research focus detracts from the overall impact and contribution of the work.
• In the introduction section, the authors claim that “This paper presents a target soybean leaf segmentation model. First, all soybean leaves in the image are detected using object detection algorithm…”. I cannot see any research contribution in this part.

 

Response: Thank you very much for your advice and patience. Here, we have reorganized the introduction section and removed unnecessary content.

 

(1) In the first paragraph of the introduction, we introduce the significance of target leaf segmentation for soybean leaf phenotype analysis and the potential challenges posed by complex backgrounds. The content of this part is as follows:

 

The study of soybean leaf phenotypes plays an important role in soybean breeding, real-time monitoring of plant growth, and precision cultivation management [1]. Phenotypic parameters of soybean leaves include leaf length, leaf width, and leaf area. Traditional methods of data acquisition rely on manual measurements, which are not only time-consuming but also cause irreversible damage to crops [2]. To prevent harm to plant growth, the practice of non-contact data collection is gradually becoming a trend [3]. Images, as the most convenient and easily obtainable medium, have become the primary data type. The target leaf images for phenotype parameter measurements usually contain complex backgrounds. These backgrounds usually contain leaves with the same color and texture as the target leaves, which makes it difficult to segment the target leaves.

(2) In the second and third paragraphs of the introduction, we present some leaf segmentation methods in recent years, highlighting the research status. The content of this part is as follows:

 

Achieving fast and accurate leaf segmentation under complex background conditions has always been a challenge in the field of agricultural image recognition. Currently, there are numerous leaf segmentation algorithms based on traditional image processing techniques. For example, Kumar et al. [4] utilized a graph-based approach to extract leaf regions. Bai et al. [5] utilized a marker-based watershed algorithm that relies on the HSI space to effectively segment target leaves. Kumar et al. [6] proposed an improved unsupervised K-means clustering algorithm that utilizes particle swarm optimization and the firefly algorithm to enhance segmentation performance. Tian et al. [7] proposed an adaptive K-means clustering algorithm to mitigate the adverse effects of manually selecting inappropriate cluster numbers on segmentation quality. Gao et al. [8] combined the OTSU and watershed segmentation methods to achieve leaf segmentation by utilizing manually labeled leaf edge points. Although numerous leaf segmentation algorithms have been proposed, these algorithms often heavily depend on the selection of initial parameters, involve complex preprocessing procedures, or fail to effectively segment each leaf in complex practical situations involving image noise, varying brightness, or overlapping leaves. These limitations of traditional techniques restrict their widespread use in agricultural production.

In recent years, significant attention has been given to the use of deep neural networks in addressing agricultural production issues. With the continuous advancement of smart agriculture, the demand for leaf segmentation algorithms in agricultural production is also increasing. Compared to traditional image processing techniques, segmentation models have a wider range of applications, streamlined processes, and a more significant impact. Bhagat et al. [9] proposed an encoder-decoder architecture for leaf segmentation; they used EfficientNet-B4 as the encoder and implemented a lateral output structure to improve segmentation accuracy. Wang et al. [10] proposed an automated algorithm for corn leaf segmentation that improves the segmentation results of the model by incorporating image restoration techniques. Liu et al. [11] combined Mask R-CNN [12] with the DBSCAN clustering algorithm to propose a highly accurate automatic segmentation method. Tian [13] combined the mask prediction branch of Mask R-CNN with the U-Net [14] model to improve the accuracy of segmenting apple blossom images. Although the aforementioned methods have yielded positive outcomes, they have only been examined in relatively simple settings, so their effectiveness must still be enhanced when confronted with complex background environments. In addressing the issue of complex backgrounds in image processing, some scholars have adopted a two-stage approach of pre-screening the complex background. Wang et al. [15] proposed the DU-Net model, which first utilized DeepLabv3 [16] to segment cucumber leaves and then employed the U-Net model to segment leaf lesions. Tassis et al. [17] proposed a two-stage model based on Mask R-CNN, which first utilized Mask R-CNN to identify the region of target leaves and then applied the U-Net model to segment leaf lesions within the identified leaf region.

(3) In the fourth paragraph of the introduction, we combine the aforementioned research status and practical application needs, and propose two key issues that require our attention: (1) how to identify leaves of research value in the images, and (2) how to accurately segment the target leaves in complex backgrounds. We then present our solution to address these two issues. The content of this part is as follows:

 

Based on the above research and practical application requirements, we need to solve two problems: 1) high-value leaves in soybean leaf images need to be identified and segmented during the segmentation process; 2) an effective segmentation algorithm needs to be designed for soybean leaf images with complex backgrounds. For the first problem, we set the large leaf close to the center of the image as the target leaf to be segmented and designed a target leaf localization algorithm to identify it. For the second problem, we draw on the approach of Wang et al. [15] and propose a two-stage soybean leaf segmentation algorithm. The model consists of two modules: one for target leaf recognition and localization, and the other for guided segmentation of the target leaf.

3. In the introduction section, the authors claim “An automatic target soybean leaf segmentation model was designed.” What is meant by automatic? How is the designed model novel?
• What is meant by “soybean leaf location algorithm”?

 

Response: Thank you very much for your valuable questions.

 

(1) The term "automatic" refers to the fact that the guidance information required for segmenting the target leaf is generated automatically by the detector, without the need for manual input. Upon reevaluating the term "automatic," we realized that it is unnecessary and can potentially confuse readers. Therefore, we have removed the term "automatic" from our revised manuscript.

 

(2) There are three main innovations in our model, which are described on Lines 13-20 of the abstract:

 

  • We propose a target leaf segmentation model based on leaf localization and guided segmentation. Compared with current baseline segmentation models, our segmentation results are better on multiple quantitative indicators and in qualitative analysis.
  • Based on the idea that a target leaf is close to the center of the image and has a relatively large area, we propose a target leaf localization algorithm. We also design an experimental scheme to provide optimal localization parameters to ensure precise target leaf localization.
  • To reduce the dependency of target leaf segmentation results on the localization information, we propose a solution called the guidance offset strategy to improve the segmentation accuracy. We design multiple guidance offset experiments and select the one with the highest segmentation accuracy (a rough sketch of this idea is given after this list).
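For illustration only, one plausible form of such a strategy is to randomly perturb the vertices of the guidance bounding box during training, so that the segmentation module does not overfit to perfectly placed boxes. The sketch below is an assumption, not the paper's actual scheme; the function name, offset range, and clamping are hypothetical:

```python
import random

def offset_guidance_box(box, image_size, max_offset=0.1):
    """Randomly offset the vertices of a guidance bounding box.

    box: (x1, y1, x2, y2) target-leaf box from the localization module.
    max_offset: maximum shift of each vertex as a fraction of box size.
    Training the guided segmentation module on such perturbed boxes should
    make its masks less sensitive to imprecise localization.
    """
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    img_w, img_h = image_size
    nx1 = x1 + random.uniform(-max_offset, max_offset) * w
    ny1 = y1 + random.uniform(-max_offset, max_offset) * h
    nx2 = x2 + random.uniform(-max_offset, max_offset) * w
    ny2 = y2 + random.uniform(-max_offset, max_offset) * h
    # Clamp to the image bounds and keep the corners ordered.
    nx1, nx2 = sorted((min(max(nx1, 0), img_w), min(max(nx2, 0), img_w)))
    ny1, ny2 = sorted((min(max(ny1, 0), img_h), min(max(ny2, 0), img_h)))
    return (nx1, ny1, nx2, ny2)
```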

 

(3) The soybean leaf localization algorithm refers to our proposed algorithm for selecting the bounding boxes of target leaves. The specific steps of this algorithm are described on Lines 212-232.

4. In section 2.1, the discussion lacks critical information pertaining to the acquisition parameters, such as time, location, camera specifications, and other relevant details. These parameters play a pivotal role in understanding the data collection process and ensuring the reproducibility of the study. Furthermore, the section fails to provide sufficient annotation details, which are vital for comprehending the ground truth labels and evaluating the reliability of the dataset. Without this essential information, readers are left with uncertainty about the dataset's quality and suitability for the proposed model evaluation. Incorporating comprehensive acquisition parameters and annotation specifics in this section is crucial to bolster the credibility and rigor of the research.

 

Response: Thank you for your suggestions regarding our data collection and annotation section. As per your request, we have provided additional details on the data collection, annotation, and augmentation processes. The modified sections can be found on Lines 94-128. For your convenience, we have pasted the revised content here:

 

2.1.1. Data acquisition

We conducted image data acquisition at the College of Agriculture and the Zengcheng Teaching and Research Base of South China Agricultural University in Tianhe District, Guangdong Province, China. The main acquisition device was an iPhone 12, which captured images at a resolution of 4032 × 3024 pixels. The data collection took place in the morning under cloudy weather conditions, without harsh sunlight. We captured images of soybean leaves at two growth stages: flower bud differentiation, and flowering and podding. Subsequently, we performed an initial screening of the images to eliminate those with similar backgrounds. Overall, we obtained 220 original images, each of which underwent the necessary cropping and resizing operations; the image size was adjusted to 512 × 512 pixels. Figure 1 shows some representative samples of the processed image data.

 

Figure 1. Examples of processed image data.

2.1.2. Data annotation and enhancement

We used Labelme (version 5.1.1, https://github.com/wkentaro/labelme) for annotating our dataset. Labelme is a free annotation software specifically designed for object detection and segmentation tasks. In this paper, as our model includes the target leaf localization module and the guided segmentation module, preprocessing of the dataset was necessary to train these two modules effectively. For the target leaf localization module, we required supervised data in the object detection format. This involved annotating bounding boxes to indicate the position and contour of each leaf, as depicted in Figure 2-a. Regarding the guided segmentation module, we required supervised data in the object segmentation format. As the objective of the guided segmentation module is to segment target leaves, we only needed to annotate segmentation masks for them, as shown in Figure 2-b.

 

Figure 2. Examples of image annotation (a) Image annotation for the target leaf localization module. (b) Image annotation for the guided segmentation module.
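For readers unfamiliar with the Labelme format, a minimal sketch of rasterizing its polygon annotations into a binary mask is shown below. The file name and label string are hypothetical, and Labelme also ships its own conversion scripts:

```python
import json

import numpy as np
from PIL import Image, ImageDraw

def labelme_polygons_to_mask(json_path, label="leaf"):
    """Rasterize all polygons with a given label from a Labelme JSON file."""
    with open(json_path) as f:
        ann = json.load(f)
    w, h = ann["imageWidth"], ann["imageHeight"]
    mask = Image.new("L", (w, h), 0)
    draw = ImageDraw.Draw(mask)
    for shape in ann["shapes"]:
        if shape["label"] == label and shape["shape_type"] == "polygon":
            draw.polygon([tuple(p) for p in shape["points"]], fill=255)
    return np.array(mask)

# Example with a hypothetical file: mask = labelme_polygons_to_mask("soy_001.json")
```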

To ensure that the trained model possesses good robustness, we employed a range of data augmentation techniques to enhance the diversity of the dataset. The specific techniques and their corresponding values are presented in Table 1, and a small code sketch of these operations follows the table. After the augmentation process, the dataset comprises a total of 2954 images. We randomly selected 1619 images for model training and allocated 1335 images for model evaluation.

Table 1. The data-augmentation operations and the corresponding values.

Operation      | Value
flip           | horizontal/vertical flip
brightness     | {0.4, 0.8}
gaussian noise | mean = 0.0, standard deviation = {10, 18}
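The exact augmentation code is not part of this response, so the NumPy sketch below of the three operations in Table 1 is illustrative only:

```python
import numpy as np

def flip(img, mode="horizontal"):
    """Flip an HxWxC uint8 image horizontally or vertically."""
    return np.flip(img, axis=1 if mode == "horizontal" else 0)

def adjust_brightness(img, factor):
    """Scale brightness by a factor such as 0.4 or 0.8 (Table 1)."""
    return np.clip(img.astype(np.float32) * factor, 0, 255).astype(np.uint8)

def add_gaussian_noise(img, std, mean=0.0):
    """Add Gaussian noise with mean 0.0 and std 10 or 18 (Table 1)."""
    noise = np.random.normal(mean, std, img.shape)
    return np.clip(img.astype(np.float32) + noise, 0, 255).astype(np.uint8)
```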

5. In section 2.2, the paper appears to suffer from significant issues in its organization and presentation. The lack of a clear and coherent structure makes it challenging for readers to follow the flow of information. Additionally, this section is burdened with an abundance of unnecessary details and figures, which obscure the essential aspects of the proposed approach. The extensive discussion of previously published networks further complicates matters, as it becomes difficult for the reader to discern what is genuinely novel and original in the authors' work compared to what has already been explored in the literature.
• In section 2.2, the paper necessitates a rigorous revision to establish a consistent and standardized definition of symbols, equations, and abbreviations throughout the document. Presently, there is an evident lack of clarity and coherence in the usage of these elements, resulting in confusion and ambiguity for the reader.

 

Response: Thank you very much for your suggestions and patience. We have reorganized the structure of Section 2.2, shortened unnecessary details, and deleted some figures. You can refer to Lines 129-330 for the specific changes.

 

(1) We have divided the content of Section 2.2 into three parts: target leaf localization module, guided segmentation module, and guidance offset strategy.

 

(2) The part on the target leaf localization module will describe the process of locating the target leaf, including the leaf detector based on Libra R-CNN and the target leaf localization algorithm.

(3) The part on the guided segmentation module will explain the data processing and solution of the guided segmentation module.

(4) The part on the guidance offset strategy will introduce the function and scheme of the guidance offset strategy.

6. In section 3, the discussion misses important experimental configurations and does not justify the choice of training parameters.
• In section 3, the comparative experiment misses state-of-the-art models and only considers legacy techniques.
• Did the authors ensure fairness in experimental comparisons?
• Which validation strategy is used?
• What is meant by “The statistics results are shown in Table 3.”? I cannot see any statistics.
• Complexity and ablation analysis are missed.
• We observe a mismatch in mask color between Figure 2 and Figure 18.
• The paper is marred by a noticeable poor quality of language, which hinders its overall readability and comprehension. The writing is riddled with grammatical errors, awkward sentence structures, and inconsistent terminology, making it difficult for the reader to fully grasp the intended meaning of the content.

 

Response: Thank you very much for your advice and patience. We greatly appreciate the positive feedback you have given us.

 

(1) At the beginning of Section 3 (Lines 332-342), we discussed the experimental configurations and the settings of the training parameters. On Line 334, we added a description of model parameter initialization: "To ensure the performance of these models and reduce the training time, we employed transfer learning techniques. Specifically, we initialized the backbone parameters of Libra R-CNN with pre-trained weights from the COCO dataset. Similarly, for the guided segmentation module parameters, we utilized weights pre-trained on the VOC2012 dataset."
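As a rough illustration of this kind of backbone initialization, a generic PyTorch sketch is given below; the checkpoint path and key handling are assumptions, not the actual training code:

```python
import torch

def load_pretrained_backbone(model, checkpoint_path):
    """Copy matching pre-trained weights into a model, e.g., a COCO-pretrained
    backbone for the detector or VOC2012 weights for the segmentation module.

    Only parameters whose names and shapes match are copied; all other
    parameters keep their original (random) initialization.
    """
    checkpoint = torch.load(checkpoint_path, map_location="cpu")
    state_dict = checkpoint.get("state_dict", checkpoint)
    model_dict = model.state_dict()
    matched = {k: v for k, v in state_dict.items()
               if k in model_dict and v.shape == model_dict[k].shape}
    model_dict.update(matched)
    model.load_state_dict(model_dict)
    return model
```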

 

(2) Thank you for your suggestion. In the comparative experiments, we added a popular segmentation model, Yolov5x, for comparison.

 

(3) We ensured the fairness of the comparative experiments by training and testing all models on the same device, using the same dataset, and initializing the model parameters with weights pre-trained on the COCO or VOC2012 dataset. We also ensured that each model was trained to convergence. We mentioned this in Section 3.4, Line 400.

 

(4) We used a cross-validation strategy.

 

(5) We apologize for the improper use of the word "statistics". This sentence has been modified to "the evaluation results of these models' performance metrics are shown in Table 4" (Line 404).

 

(6) In Sections 3.2 and 3.3, we discussed the impact of different guidance offset strategies and leaf detectors on leaf segmentation accuracy, considering complexity and conducting ablation analysis.

 

(7) The masks in Figure 17 were generated using code based on the labels to highlight the masks more intuitively and enhance contrast. The mask images in the data acquisition part were generated using the label visualization code provided by LabelMe (https://github.com/wkentaro/labelme/tree/main/examples/instance_segmentation).

Author Response File: Author Response.docx

Round 2

Reviewer 1 Report

The authors have significantly improved the manuscript, which is appreciable.

1. Section 2.1: Prior to commencing the acquisition process in the subsequent subsection, the authors have the option to introduce the dataset being utilized.

2. Figure 16: It is advisable to provide a comprehensive explanation for the entire figure, considering that while Figure 16(a) is elaborated upon, Figures 16(b) and (c) could be discussed comparatively within the text.

1. Please check minor spelling and typos, like line 433: ", an segmentation model of soybean leaf ....."

Author Response

Please see the attachment

Author Response File: Author Response.docx

Reviewer 2 Report

Authors covered all comments.

Author Response

Response to Reviewer 2 Comments

Thank you very much for providing us with many valuable suggestions. It has greatly helped us in revising our paper. We sincerely appreciate your patience.
