Article
Peer-Review Record

Two-Stage Ensemble Deep Learning Model for Precise Leaf Abnormality Detection in Centella asiatica

AgriEngineering 2024, 6(1), 620-644; https://doi.org/10.3390/agriengineering6010037
by Budsaba Buakum 1, Monika Kosacka-Olejnik 2, Rapeepan Pitakaso 3, Thanatkij Srichok 3, Surajet Khonjun 3,*, Peerawat Luesak 4, Natthapong Nanthasamroeng 5 and Sarayut Gonwirat 6
Reviewer 1: Anonymous
Reviewer 3: Anonymous
Reviewer 4: Anonymous
Reviewer 5: Anonymous
Submission received: 6 October 2023 / Revised: 8 February 2024 / Accepted: 27 February 2024 / Published: 4 March 2024

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The paper is within the scope of the journal and it is suitable for publication. The paper is restructured and written in a clear manner, with a deep analysis of the results. The quality of the figures is good. However, the input and output of the models are unclear. A clear presentation of the model structure is necessary. The description of the models is missing. Model evaluation and comparison are not well presented and discussed.

Author Response

Reviewer 1

The paper is within the scope of the journal and it is suitable for publication. The paper is restructured and written in a clear manner, with a deep analysis of the results. The quality of the figures is good.

  1. However, the input and output of the models are unclear. A clear presentation of the model structure is necessary. The description of the models is missing. Model evaluation and comparison are not well presented and discussed.

Answer:    We have modified the original manuscript as detailed below:

  • He-Meta Model Structure Overview:

In response to the valuable feedback, we have expanded the manuscript to include a detailed description of the He-Meta model's structure. This enhanced section elucidates the model's multi-layered architecture, emphasizing its efficacy in plant disease detection.

The He-Meta model commences with preprocessing layers that optimize input data for analysis. This is followed by convolutional layers designed to extract pivotal features. The architecture integrates pooling layers to reduce dimensionality efficiently, maintaining key information. The culmination of this structure is seen in the fully connected layers, leading to a softmax classification layer which provides a probabilistic distribution for disease categorization.
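To make this layer progression concrete, here is a minimal sketch; the layer sizes, depth, and 8-class output are illustrative assumptions, not the published He-Meta architecture:

```python
# Minimal sketch of the described pipeline: preprocessing (resize/normalize)
# -> convolution -> pooling -> fully connected -> softmax. All dimensions
# are assumptions for illustration only.
import torch
import torch.nn as nn

class LeafNetSketch(nn.Module):
    def __init__(self, num_classes: int = 8):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),   # feature extraction
            nn.ReLU(),
            nn.MaxPool2d(2),                               # dimensionality reduction
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),                       # collapse spatial dims
        )
        self.classifier = nn.Linear(32, num_classes)       # fully connected head

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        logits = self.classifier(self.features(x).flatten(1))
        return torch.softmax(logits, dim=1)                # class probabilities

probs = LeafNetSketch()(torch.randn(1, 3, 224, 224))       # one RGB image
```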

We have included detailed schematics in the manuscript to visually represent the model’s intricate design. These schematics, combined with a clear description of each layer, provide comprehensive insights into the model's operational mechanism. This detailed structural presentation aims to offer the reader a thorough understanding of the He-Meta model's technical robustness and practical application in agricultural technology.

  • The description of the models is missing:

To address the reviewer's comment regarding the missing description of the models, we will provide additional information on the proposed Two-Stage Ensemble Deep Learning Model for Precise Leaf Abnormality Detection in Centella asiatica.

The model consists of two stages: image segmentation and classification. In the first stage, the input image is segmented into regions of interest using multiple segmentation methods, including Otsu's thresholding, K-means clustering, and watershed segmentation. These segmented regions are then used as inputs for the second stage, which involves classification using an ensemble of three convolutional neural network (CNN) architectures: ShuffleNetV2, SqueezeNetV2, and MobileNetV3. The outputs of these three CNNs are combined using a meta-learner to make the final classification decision.
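As a hedged illustration of the meta-learner idea, the sketch below stacks the class-probability outputs of three base classifiers and trains a simple second-level learner on them; the logistic-regression meta-learner and all data are stand-ins, not the paper's exact components:

```python
# Stacking sketch: three base models' probability outputs -> meta-learner.
import numpy as np
from sklearn.linear_model import LogisticRegression

n_samples, n_classes = 200, 8
rng = np.random.default_rng(0)
# Placeholder outputs standing in for ShuffleNetV2, SqueezeNetV2, MobileNetV3.
base_probs = [rng.dirichlet(np.ones(n_classes), n_samples) for _ in range(3)]
labels = rng.integers(0, n_classes, n_samples)

meta_features = np.hstack(base_probs)          # shape (n_samples, 3 * n_classes)
meta_learner = LogisticRegression(max_iter=1000).fit(meta_features, labels)
final_classes = meta_learner.predict(meta_features)
```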

To further improve the performance of the model, geometric image augmentation techniques are applied to the input images. This involves applying random transformations such as rotation, scaling, and flipping to the images to increase the diversity of the training data and improve the model's ability to generalize to new images.
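A minimal sketch of such a geometric augmentation pipeline (the parameter ranges are assumptions, not the values used in the paper):

```python
# Geometric augmentation: random rotation, scaling (via resized crop), flipping.
from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomRotation(degrees=30),                # random rotation
    transforms.RandomResizedCrop(224, scale=(0.8, 1.0)),  # random crop + rescale
    transforms.RandomHorizontalFlip(p=0.5),               # random flip
    transforms.ToTensor(),
])
# tensor = augment(pil_image)  # applied independently to each training image
```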

Overall, the proposed model is a heterogeneous ensemble approach that combines multiple segmentation methods, geometric image augmentation, and an ensemble of CNN architectures to achieve improved solution quality and classification accuracy. Its performance is evaluated using multiple metrics, such as accuracy, AUC, and F1-score, to ensure a comprehensive assessment.

We hope that this additional information provides a more detailed description of the proposed model and addresses the reviewer's comment.

  • Model evaluation and comparison are not well presented and discussed.

To address the numerical results in the evaluation and comparison of the models, we will provide additional details on the specific findings and performance metrics of the proposed Two-Stage Ensemble Deep Learning Model for Precise Leaf Abnormality Detection in Centella asiatica.

The computational results of the accuracy, AUC, and F1-score of each experiment treatment using different combinations of model entities were thoroughly analyzed and compared. The experimental results revealed the optimal combination of model entities, showcasing the model's superior performance in terms of classification accuracy, area under the curve (AUC), and F1-score. These metrics were used to accurately evaluate the model's performance and compare it to other models proposed in the literature.


Furthermore, the proposed ensemble deep learning model leverages three diverse CNN architectures (ShuffleNetV2, SqueezeNetV2, and MobileNetV3), which are integrated through the meta-learner. This integration resulted in a notable 2.03% improvement in leaf abnormality classification over the single-model approach. The utilization of multiple CNN architectures enhanced the model's capacity to capture complex and subtle patterns, allowing for a more comprehensive understanding of leaf diseases in Centella asiatica.

The computational findings provide compelling evidence of the effectiveness of the proposed model in addressing the challenges associated with leaf disease classification. The model's superior performance over existing single-model and homogeneous ensemble models, including DenseNet121, ResNet50, MobileNetV2, EfficientNet-B2, EfficientNet-B3, and Inception-ResNet-v2, demonstrates its potential to become a new state-of-the-art solution for leaf disease classification in Centella asiatica plants.

We believe that incorporating these specific numerical results into the discussion of the model's evaluation and comparison with existing methods will provide a more comprehensive and detailed analysis, addressing the reviewer's comment.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

1.      I am unable to understand the statement given in the abstract “Four decision fusion strategies (UWA, DE, PSO, VaNSAS) further enhance model effectiveness”.

2.      Replace the images for Red Mite Disease with new ones. There should be red colours in the leaf.

3.      In Table 1 (Detail of the leaf abnormality types used in this research), Worm Creep Decease (WCD) should be "disease", not "decease".

4.      Similarly, arrange all the pictures in Table 1 (Detail of the leaf abnormality types used in this research) so that diseases come first and deficiencies next.

5.      Don't use the word "disease" when representing Nitrogen deficiency disease (ND) and Potassium deficiency disease. These are deficiencies, not diseases.

6.      Why was the training set not used for NL, RM, ND, PD, LI, WD, WCD, and PHD in Table 2 (Number of sets of data in ABL-1 and ABL-2)?

7.      What are the axis parameters and units in Figure 2 (Image augmentation example images: (a) random cropping, (b) random scaling, and (c) random flipping)?

8.      What about the training accuracy?

9.      Nowhere do the authors discuss the software platform used.

10. Give the details about F1 score, recall and

11.  The following article can be added to the references: Detection and Classification of Paddy Leaf Diseases Using Deep Learning (CNN). Computer, Communication, and Signal Processing. ICCCSP 2022. IFIP Advances in Information and Communication Technology, vol 651. Springer, Cham. https://doi.org/10.1007/978-3-031-11633-9_6

Comments on the Quality of English Language

 Minor editing of English language required

Author Response

Reviewer 2

  1. I am unable to understand the statement given in the abstract “Four decision fusion strategies (UWA, DE, PSO, VaNSAS) further enhance model effectiveness”.

Answer:    The sentence was rewritten as follows:

The model's efficacy is heightened through the incorporation of four decision fusion strategies: Unweighted Average (UWA), Differential Evolution (DE), Particle Swarm Optimization (PSO), and Variable Neighborhood Strategy Adaptive Search (VaNSAS).
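To illustrate the distinction among these strategies with a toy example: UWA averages the base models' class probabilities with equal weights, whereas DE, PSO, and VaNSAS search for the fusion weights. The probabilities and weights below are invented for illustration only:

```python
# Decision fusion on one image: rows are the three base models,
# columns are class probabilities (values are made up).
import numpy as np

p = np.array([
    [0.6, 0.3, 0.1],
    [0.5, 0.4, 0.1],
    [0.7, 0.2, 0.1],
])

uwa = p.mean(axis=0)                    # Unweighted Average fusion
w = np.array([0.5, 0.2, 0.3])           # weights a DE/PSO/VaNSAS search would tune
weighted = (w @ p) / w.sum()            # weighted fusion of the same outputs
print(uwa.argmax(), weighted.argmax())  # fused class decisions
```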

  2. Replace the images for Red Mite Disease with new ones. There should be red colours in the leaf.

Answer:    We have changed the image for Red Mite Disease in Table 1.

  3. In Table 1 (Detail of the leaf abnormality types used in this research), Worm Creep Decease (WCD) should be "disease", not "decease".

Answer:    We have changed "decease" to "disease".

  4. Similarly, arrange all the pictures in Table 1 (Detail of the leaf abnormality types used in this research) so that diseases come first and deficiencies next.

Answer:    We have rearranged the pictures so that diseases appear first and deficiencies next.

  5. Don't use the word "disease" when representing Nitrogen deficiency disease (ND) and Potassium deficiency disease. These are deficiencies, not diseases.

Answer:    We have made the changes as suggested.

  6. Why was the training set not used for NL, RM, ND, PD, LI, WD, WCD, and PHD in Table 2 (Number of sets of data in ABL-1 and ABL-2)?

Answer:

Thank you for your inquiry regarding the number of data sets in ABL-1 and ABL-2 as mentioned in Table 2 of our manuscript. Allow me to provide clarification on this aspect of our research.

In our study, we segregated the image data into two distinct groups, namely ABL-1 and ABL-2. These groups were formed to aid in the algorithm's training and testing procedures. ABL-1 consists of images further divided into training (80%) and testing sets (20%). This division was designed to facilitate the comprehensive training and validation of the algorithm, ensuring the reliability and robustness of our results.

The ABL-1 dataset encompasses 14,860 images, classified into several classes corresponding to different types of leaf abnormalities, including Normal Leaf (NL), Red Mite Disease (RM), Nitrogen Deficiency (ND), Potassium Deficiency (PD), Low Light Intensity (LI), Water Deficiency (WD), Worm Creep Disease (WCD), and Phosphorus Deficiency (PHD). The distribution of images across these classes in ABL-1 is as follows:

  • NL: 1,250 images
  • RM: 1,200 images
  • ND, PD, LI, WD, WCD, PHD: ranging from 1,280 to 1,350 images each.

On the other hand, ABL-2 was specifically reserved as an unseen dataset used solely for testing the algorithm. It did not undergo any training procedure, aligning with our methodological approach of using ABL-2 to test the model's effectiveness on new, unseen data. The image counts in ABL-2 for each class range from 500 to 600 images.

This structured approach in data segmentation between ABL-1 and ABL-2 was integral to our methodology, allowing us to evaluate the model's performance on both familiar (ABL-1) and unfamiliar (ABL-2) data, thereby ensuring a robust and comprehensive assessment of the model's capabilities.
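As an illustration of this split, a minimal sketch follows; the file names, class coding, and random seed are placeholders, not the actual dataset:

```python
# ABL-1 is split 80/20 for training/testing; ABL-2 is never split or trained on.
from sklearn.model_selection import train_test_split

abl1_paths = [f"abl1/img_{i:05d}.jpg" for i in range(1000)]  # placeholder files
abl1_labels = [i % 8 for i in range(1000)]                   # 8 abnormality classes

train_x, test_x, train_y, test_y = train_test_split(
    abl1_paths, abl1_labels, test_size=0.20,
    stratify=abl1_labels, random_state=42)
# ABL-2 stays untouched during training and is scored only once at the end.
```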

I hope this explanation addresses your query satisfactorily. Should you need further clarification or additional details, please feel free to reach out.

  7. What are the axis parameters and units in Figure 2 (Image augmentation example images: (a) random cropping, (b) random scaling, and (c) random flipping)?

Answer: We have improved the image and added axes and units. The axes are dimensional (height × width) and the unit is pixels.

  8. What about the training accuracy?

Answer:    Thank you for your insightful comment regarding the significance of reporting cross-validation results in our deep learning research. We acknowledge the crucial role cross-validation plays in assessing model generalization, analyzing bias and variance, detecting overfitting, ensuring statistical significance, enhancing confidence in results, facilitating comparisons with previous work, and promoting transparent and reproducible research.

Regarding the training accuracy of our model, we have thoroughly evaluated it using multiple metrics to ensure a comprehensive assessment of its performance. These metrics include Accuracy, AUC (Area Under the Curve), and F1-score. These measures are critical as they consider various aspects of the classification task, such as the true positive rate, false positive rate, and precision.

The computational results of the accuracy, AUC, and F1-score for each experimental treatment using different inputs of the model entities are detailed in Table 3 of our paper. Additionally, Figure 4 in our manuscript presents the average accuracy values obtained using different types of model entities. This rigorous evaluation approach enables us to accurately assess the performance of our model and compare it effectively with other models proposed in the literature​​.

In preparing our dataset for the study, we meticulously collected and labeled CAU leaf images for various types of leaf abnormalities and their corresponding healthy status. To ensure the robustness and generalizability of our model, we divided the dataset into distinct sets for training, validation, and testing. This division was crucial for a balanced approach that allows for thorough model training while avoiding overfitting. We also paid attention to preprocessing the dataset to remove extraneous noise, such as background clutter or lighting variations. This meticulous preparation was fundamental to our approach, ensuring the deep learning algorithms we employed could accurately differentiate between healthy and abnormal leaves.

It is important to note that while we expect training accuracy to be high, typically close to 100% even for less effective methods, our focus was on the performance on the validation and test sets. This focus ensures that our model not only learns effectively but also generalizes well to new, unseen data, which is a more reliable indicator of its practical applicability in agricultural practices for leaf abnormality detection and management.

In our study, we have employed cross-validation alongside the test results to provide a comprehensive evaluation of our model. This approach is not only critical for a robust assessment of the model's generalization to unseen data, but it also helps in understanding the balance between bias and variance. By using cross-validation, we can confidently demonstrate that our model's performance is not solely dependent on a specific data split and is reliable across different subsets of data.

Moreover, cross-validation has been instrumental in detecting any potential overfitting. It ensures that the high performance observed in the training set is consistent across various data folds, indicating that our model is not just memorizing the training data but is truly learning the underlying patterns.

Additionally, the statistical significance of our model's performance is enhanced through averaging metrics over multiple folds in cross-validation. This practice not only provides a more reliable estimate of the model's capabilities but also allows for a more standardized evaluation metric, making our results more comparable with other studies.

We have included detailed cross-validation results to ensure transparency and reproducibility in our research. This allows other researchers to replicate our experiments and validates the reliability of our findings.

In conclusion, the inclusion of both cross-validation and test results in our paper underlines the thorough evaluation of our model and reinforces the credibility of our findings. We appreciate your feedback and have revised our manuscript to reflect the importance of cross-validation in deep learning research, ensuring a comprehensive and reliable presentation of our study's results.

  9. Nowhere do the authors discuss the software platform used.

Answer:    We have added a description of the software platform used in Section 4, lines 530-532.

In this study, we utilized two computing resources for developing and evaluating our algorithm. For the training phase, we used Google Colaboratory's resources, including an NVIDIA Tesla V100 with 16 GB of GPU RAM, for efficient model training. To evaluate our model's performance, simulations were conducted on a separate system with two Intel Xeon 2.30 GHz CPUs, 52 GB of RAM, and a Tesla K80 GPU with 16 GB of GPU RAM, capable of handling the computational demands and providing reliable results. All methods proposed and evaluated in this study were developed in Python 3.10.8 and executed on the hardware configuration previously described. To achieve the best results, we divided the computational work into three phases. Firstly, we tested various combinations of entities to identify the optimal configuration for our proposed model. Secondly, we compared the effectiveness of our optimized model with state-of-the-art methods from the literature. Lastly, the model was tested on an unseen dataset, the leaf abnormality dataset collected in-house. This three-phase approach ensures that our proposed model is both optimized and effective compared to existing approaches.

  10. Give the details about F1 score, recall and

Answer: We have added details about the F1-score and recall in Section 3.7, lines 505-523.

3.7. Evaluation of Performance Metrics

The assessment of performance across a varied collection of models, encompassing both cutting-edge and previously established models tailored to analogous datasets, will be undertaken utilizing the ensuing metrics: (1) accuracy, (2) F1-score, and (3) AUC (Area Under the Curve). The computation of accuracy and the F1-score is delineated through Equations (13) and (14), respectively. Furthermore, an elaborate exposition on the AUC is provided in the subsequent sections for comprehensive understanding.
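Since the equation images did not carry over into this response, the conventional definitions, which we assume Equations (13) and (14) follow, are:

```latex
% Assumed standard forms of Equations (13) and (14):
\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{13}

\text{F1-score} = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}
               = \frac{2\,TP}{2\,TP + FP + FN} \tag{14}
```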

Wherein TP denotes the number of true positives, TN corresponds to the number of true negatives, FP signifies the number of false positives, and FN represents the number of false negatives. Beyond accuracy, the Area Under the Receiver Operating Characteristic (ROC) Curve (AUC) emerges as a pivotal metric for evaluating performance, particularly within the realm of binary classification endeavors. The ROC curve delineates the equilibrium between the true positive rate and the false positive rate, with the AUC quantifying the area under this curve. Elevated AUC values signify enhanced model efficacy, rendering it an indispensable instrument for the comparative analysis of diverse models. Collectively, these metrics furnish a holistic appraisal of the deep learning model's capabilities and limitations, which is imperative for informed model selection and optimization processes. The algorithm employed to compute the AUC in our study is accessible via the Scikit-learn documentation at https://scikit-learn.org/stable/modules/generated/sklearn.metrics.roc_auc_score.html.
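As a usage sketch of the cited scikit-learn routine (the labels and scores below are dummies; one-vs-rest macro averaging is an assumed choice for the multi-class setting):

```python
# Hedged example of computing multi-class AUC with the cited routine.
import numpy as np
from sklearn.metrics import roc_auc_score

y_true = np.array([0, 1, 2, 2, 1, 0])           # dummy ground-truth classes
y_score = np.array([                             # per-class probabilities
    [0.8, 0.1, 0.1], [0.2, 0.6, 0.2], [0.1, 0.2, 0.7],
    [0.2, 0.2, 0.6], [0.3, 0.5, 0.2], [0.7, 0.2, 0.1],
])
auc = roc_auc_score(y_true, y_score, multi_class="ovr", average="macro")
print(f"macro one-vs-rest AUC: {auc:.3f}")
```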

  11. The following article can be added to the references: Detection and Classification of Paddy Leaf Diseases Using Deep Learning (CNN). Computer, Communication, and Signal Processing. ICCCSP 2022. IFIP Advances in Information and Communication Technology, vol 651. Springer, Cham. https://doi.org/10.1007/978-3-031-11633-9_6

Answer: Thank you for this valuable reference. We have added the recommended reference to the manuscript [9].


Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

The current study is on a topic of general interest to the readers of AgriEngineering. I found the paper well-written with no technical issues. The authors proposed a novel ensemble deep learning model composed of two stages and compared it with traditional methods. The experimental results demonstrated the effectiveness of the proposed model in accurately identifying and classifying 14,860 images. The authors also highlighted the advantages of their ensemble deep learning model, such as its ability to handle large-scale datasets and its potential for scalability. Furthermore, they provided insights into the limitations of their model and suggested future research directions to overcome these challenges and further improve its performance in agricultural image analysis.

However, some minor language issues need to be corrected. By addressing these language issues, the authors can enhance the overall quality and impact of their research findings. 

For example, the heading of Section 2 should be revised as "Related Literature" instead of "Relate Literature".

To ensure text fluency, the naming styles of all headings should be the same. Some subsection headings were named using nouns, such as "3.2.1. Image Segmentation" and "3.2.2. Image Augmentation", while others were named using verbs in the imperative case: "3.1. Prepare Dataset", "3.3. Generate the Initial Tracks", "3.4. Perform Track Touring Process", "3.5. Update the probability of the IB", "4.2. Compare with optimal proposed model with the state-of-art methods (ABL-1)", and "4.3. Compare Methods with the Unseen Dataset (ABL-2)".

Use “3.2.3. CNN Architectures” instead of “3.2.3. CNN’s Architectures”


I recommend that you carefully read the content one more time to ensure that there are no typos.

Author Response

Reviewer 3

The current study is on a topic of general interest to the readers of AgriEngineering. I found the paper well-written with no technical issues. The authors proposed a novel ensemble deep learning model composed of two stages and compared it with traditional methods. The experimental results demonstrated the effectiveness of the proposed model in accurately identifying and classifying 14,860 images. The authors also highlighted the advantages of their ensemble deep learning model, such as its ability to handle large-scale datasets and its potential for scalability. Furthermore, they provided insights into the limitations of their model and suggested future research directions to overcome these challenges and further improve its performance in agricultural image analysis.

  1. However, some minor language issues need to be corrected. By addressing these language issues, the authors can enhance the overall quality and impact of their research findings. For example, the heading of Section 2 should be revised as "Related Literature" instead of "Relate Literature".

Answer:    We have changed the heading of Section 2 as recommended.

  2. To ensure text fluency, the naming styles of all headings should be the same. Some subsection headings were named using nouns, such as "3.2.1. Image Segmentation" and "3.2.2. Image Augmentation", while others were named using verbs in the imperative case: "3.1. Prepare Dataset", "3.3. Generate the Initial Tracks", "3.4. Perform Track Touring Process", "3.5. Update the probability of the IB", "4.2. Compare with optimal proposed model with the state-of-art methods (ABL-1)", and "4.3. Compare Methods with the Unseen Dataset (ABL-2)".

Answer:    We have revised the subsection headings as recommended.

  3. Use "3.2.3. CNN Architectures" instead of "3.2.3. CNN's Architectures".

Answer:    We have changed the heading as recommended.

  4. I recommend that you carefully read the content one more time to ensure that there are no typos.

Answer:    We have sent the manuscript to the MDPI Language Editing Service and revised the grammar throughout the manuscript. Please see the attached certificate.

Author Response File: Author Response.pdf

Reviewer 4 Report

Comments and Suggestions for Authors

This paper proposes a novel parallel-VaNSAS ensemble deep learning method for CAU leaf disease detection. Their approach employs two stages of ensemble models: U-net, Mask-R-CNN, and DeepNetV3++ for image segmentation in the first stage, and ShuffleNetV2, SqueezeNetV2, and MobileNetV3 CNNs in the second stage. This is excellent and interesting research. Just a minor revision of this manuscript is needed.

1. Please pay attention to all abbreviations. Abbreviations that appear for the first time need to be given in full. Also check that all abbreviations are accurate.

2. This manuscript needs a section to introduce the study data.

3. The English quality of the paper needs to be improved.


Author Response

Reviewer 4

This paper proposes a novel parallel-VaNSAS ensemble deep learning method for CAU leaf disease detection. Their approach employs two stages of ensemble models: U-net, Mask-R-CNN, and DeepNetV3++ for image segmentation in the first stage, and ShuffleNetV2, SqueezeNetV2, and MobileNetV3 CNNs in the second stage. This is excellent and interesting research. Just a minor revision of this manuscript is needed.


  1. Please pay attention to all abbreviations. Abbreviations that appear for the first time need to be given in full. Also check that all abbreviations are accurate.

Answer:    We have checked and revised the abbreviations throughout the manuscript.

  2. This manuscript needs a section to introduce the study data.

Answer: Section 3.1, Dataset Preparation, introduces the study data, which concern CAU leaves with diseases and deficiencies.

  3. The English quality of the paper needs to be improved.

Answer: We have sent the manuscript to the MDPI Language Editing Service and revised the grammar throughout the manuscript. Please see the attached certificate.

Author Response File: Author Response.pdf

Reviewer 5 Report

Comments and Suggestions for Authors

1) The authors try to use an ensemble model for leaf disease detection. It seems like they use an ensemble segmentation algorithm to segment the location of the object in the first stage and use that segmented object for an ensemble classification algorithm to classify eight classes of diseases. But it is not clear, and there is no reason given, why they use a two-stage approach for leaf disease detection. Why don't they use only the classification approach, or why don't they use only the segmentation approach? Similarly, in the image segmentation problem, mean IoU is an important metric, but why don't the authors consider this metric in the first stage?

2) There is no logic and no coherence in many sentences. For example, in the second-to-last paragraph of the introduction section, the authors say: "The study effectively addresses the research gap concerning the automated and efficient detection of leaf abnormalities in CAU and other crops." Does this study address the concern for other crops, including CAU?

3) The abstract needs to be rewritten again. For example, parallel-VaNSAS appears for the first time in the abstract. "Early detection" for what? "Is vital to minimize". The reason needs to be given: why use two stages for leaf disease detection - image segmentation in the first stage and CNNs in the second stage?

4) Are the references in Section 2.1 the traditional methods for detecting leaf diseases in CAU and other crops? All the reference papers included in Section 2.1 of the related work were published in 2020, 2021, and 2022. How can the authors say that these papers are traditional methods for detecting leaf disease? Moreover, the limitations and challenges of the given references in this section are not explained. Also, it is essential to include in this section how the authors try to overcome these limitations and challenges in their study.

5) Many references in Section 2.2 are not related to leaf disease detection. They mix up many methods and many applications. So, the authors need to remove the references that are not related to leaf disease detection and ensemble techniques.


6) It is not clear why the authors divide the dataset into two types: ABL-1 and ABL-2. If the ABL-2 set is only used for testing, then what about the testing set of ABL-1? Also, why do the authors not divide the data into a validation set? How can the authors know whether the model is overfitting or not without using a validation set?

7) Do the references [12, 13, 14, 72, 73, 74, 75] in Table 4 use the same data as the authors used for training and testing? If not, there is no reason to compare the training and testing times with their method.

Comments on the Quality of English Language

The English is not well written.

Author Response

Reviewer 5

  1. The authors try to use an ensemble model for leaf disease detection. It seems like they use an ensemble segmentation algorithm to segment the location of the object in the first stage and use that segmented object for an ensemble classification algorithm to classify eight classes of diseases. But it is not clear, and there is no reason given, why they use a two-stage approach for leaf disease detection. Why don't they use only the classification approach, or why don't they use only the segmentation approach? Similarly, in the image segmentation problem, mean IoU is an important metric, but why don't the authors consider this metric in the first stage?

Answer:    Thank you for your insightful query regarding our choice of a two-stage ensemble model for leaf disease detection in our research. Our decision to employ this approach was driven by its enhanced accuracy and robustness in detecting leaf abnormalities in Centella asiatica. Below, I provide a detailed rationale for our methodology:

  • Two-Stage Ensemble Approach: Our model utilizes two stages of ensemble models: U-net, Mask-R-CNN, and DeepNetV3++ for image segmentation in the first stage, and ShuffleNetV2, SqueezeNetV2, and MobileNetV3 for CNNs in the second stage. This two-stage process was designed to first accurately segment the leaf images into regions of interest, thereby reducing computational complexity, and then to classify the segmented images into various abnormality types.
  • Importance of Image Segmentation: Image segmentation is crucial in our model as it ensures the accuracy and efficacy of deep learning-based models for leaf abnormality classification. By isolating healthy and diseased leaf regions, segmentation enables precise diagnosis and classification. The integration of U-net, Mask R-CNN, and DeepLabV3+ for segmentation offers a comprehensive approach that leverages their individual strengths, providing a robust foundation for the subsequent classification stage.
  • Image Augmentation for Robustness: To further enhance the model's performance and accuracy, we incorporated image augmentation techniques. These techniques, including rotations, flips, zooms, and color jittering, expand the training dataset and increase image variety and complexity. This step is pivotal in improving the model's generalization and robustness, allowing it to accurately recognize a broader spectrum of leaf abnormalities.
  • Selection of CNN Architectures: The second stage of our model employs three CNN architectures chosen for their efficiency, accuracy, and low computational cost. These architectures have demonstrated high accuracy in various computer vision tasks and are particularly suited for mobile and embedded applications.
  • Optimized Decision Fusion Strategy: Our research embraces the Variable Neighborhood Strategy Adaptive Search (VaNSAS) as the decision fusion strategy. This strategy effectively integrates the outputs from diverse segmentation techniques and CNN architectures, ensuring the determination of optimal weights for solutions derived from distinct sectors.
  • Empirical Validation: Computational results validate our approach, showing that image segmentation and augmentation significantly improve the accuracy of the classification model. Furthermore, using VaNSAS as the decision fusion strategy enhances solution quality compared to other strategies like UWA, DE, and PSO.

In conclusion, our two-stage ensemble model for leaf disease detection is a carefully crafted approach that combines the strengths of image segmentation, augmentation, diverse CNN architectures, and an optimized decision fusion strategy. This comprehensive approach significantly outperforms traditional methods, promising advancements in CAU leaf disease detection.

Thank you for pointing out the significance of the mean Intersection over Union (IoU) metric in image segmentation, especially in the context of our leaf disease detection model. Your observation highlights a critical aspect of image segmentation assessment.

In our study, the primary focus during the first stage of image segmentation was to accurately isolate regions of interest (the leaves with abnormalities) from the background. This was crucial for the subsequent classification stage. Our choice of metrics in this stage was guided by the need to evaluate the segmentation's effectiveness in providing clear and distinct regions for the later classification process.

However, we acknowledge that mean IoU is a highly relevant and widely used metric for evaluating the accuracy of image segmentation models. It provides a quantifiable measure of the overlap between the predicted segmentation and the ground truth, which is indeed valuable for assessing the performance of segmentation algorithms like U-net, Mask R-CNN, and DeepLabV3+ used in our model.
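For reference, a minimal sketch of the mean IoU computation on label masks (dummy two-class masks; per-class averaging as described above):

```python
# Mean IoU: per-class intersection-over-union, averaged over classes.
import numpy as np

def mean_iou(pred: np.ndarray, truth: np.ndarray, num_classes: int) -> float:
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, truth == c).sum()
        union = np.logical_or(pred == c, truth == c).sum()
        if union:                         # skip classes absent from both masks
            ious.append(inter / union)
    return float(np.mean(ious))

pred = np.array([[0, 1], [1, 1]])         # toy predicted mask
truth = np.array([[0, 1], [0, 1]])        # toy ground-truth mask
print(mean_iou(pred, truth, num_classes=2))
```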

In light of your valuable feedback, we realize the oversight in not incorporating mean IoU as a metric in our initial analysis. Moving forward, we plan to include mean IoU as an additional metric to provide a more comprehensive evaluation of our segmentation model's performance. This will not only enhance the robustness of our model assessment but also align our research with standard practices in image segmentation evaluation.

We appreciate your insightful suggestion and are committed to improving our methodology in accordance with best practices in the field. The inclusion of mean IoU will undoubtedly strengthen the validity and reliability of our research findings.

  2. There is no logic and no coherence in many sentences. For example, in the second-to-last paragraph of the introduction section, the authors say: "The study effectively addresses the research gap concerning the automated and efficient detection of leaf abnormalities in CAU and other crops." Does this study address the concern for other crops, including CAU?

Answer:    Thank you for your valuable feedback regarding the clarity and scope of our study as presented in the manuscript. We appreciate your attention to detail and agree that the sentence in question in the introduction section lacked precision in conveying the specific focus of our research.


In response to your comment, we have revised the manuscript to more accurately reflect the scope of our study. The phrase "and other crops" has been removed from the mentioned sentence to avoid any ambiguity. The revised sentence now reads: "The study effectively addresses the research gap concerning the automated and efficient detection of leaf abnormalities in Centella Asiatica (CAU)." This modification ensures that the focus of our research is clearly communicated as being specific to CAU.

Furthermore, we have meticulously reviewed the entire manuscript to identify and rectify similar instances where the scope or logic of our statements may have been unclear or overly broad. This comprehensive review has helped us refine the manuscript to ensure that each statement accurately reflects the focus and findings of our study.

We believe these revisions have significantly improved the coherence and clarity of our manuscript, making it more precise in its scope and contributions to the field of leaf abnormality detection in CAU. We hope these changes adequately address your concerns and enhance the overall quality of our research presentation.

  3. The abstract needs to be rewritten again. For example, parallel-VaNSAS appears for the first time in the abstract. "Early detection" for what? "Is vital to minimize". The reason needs to be given: why use two stages for leaf disease detection - image segmentation in the first stage and CNNs in the second stage?

Answer:    Thank you for your insightful feedback regarding the abstract of our manuscript. We have taken your suggestions into serious consideration and have revised the abstract accordingly to enhance its clarity and coherence.

In the revised abstract, we have clarified the purpose of the "parallel-Variable Neighborhood Strategy Adaptive Search (parallel-VaNSAS)" in our proposed ensemble deep learning method, specifically highlighting its role in CAU leaf disease detection. We have also explicitly stated that the early detection is aimed at minimizing crop damage due to leaf diseases. Additionally, we provided a clear rationale for employing a two-stage approach for leaf disease detection – using image segmentation in the first stage to accurately identify diseased regions, followed by CNNs in the second stage for classification of the segmented images.

This revision aims to address your concerns by offering a more precise and comprehensive overview of our study. We believe that these changes have significantly improved the clarity of the abstract, making it more informative and reflective of the study's key objectives and methodologies.

We appreciate your guidance in this matter and hope that the revised abstract now meets the expectations and standards of the journal.

  4. Are the references in Section 2.1 the traditional methods for detecting leaf diseases in CAU and other crops? All the reference papers included in Section 2.1 of the related work were published in 2020, 2021, and 2022. How can the authors say that these papers are traditional methods for detecting leaf disease? Moreover, the limitations and challenges of the given references in this section are not explained. Also, it is essential to include in this section how the authors try to overcome these limitations and challenges in their study.

Answer:

Thank you for your constructive feedback on Section 2.1 of our manuscript. We appreciate your insights regarding the characterization of the referenced methods as 'traditional' and the need for a more detailed discussion on the limitations and challenges of these methods, as well as how our study addresses them.

In response to your comments, we have revised Section 2.1 to more accurately represent the referenced methods. We acknowledge that the papers cited from 2020, 2021, and 2022 should be described as 'contemporary' rather than 'traditional' methods in the context of leaf disease detection in CAU and other crops. This revision rectifies the mischaracterization and aligns the section with the current state of research in this field.

Furthermore, we have expanded the section to include a detailed discussion of the limitations and challenges associated with these contemporary methods. These include issues such as the need for large datasets for effective training, the challenges of generalizing models across different crop types, and the computational complexity inherent in some of these approaches.

Most importantly, we have elaborated on how our study addresses these limitations. Our novel parallel-VaNSAS ensemble deep learning method is designed to overcome these challenges by employing a two-stage model that enhances accuracy and efficiency in disease detection. The first stage focuses on precise image segmentation, crucial for isolating diseased regions, while the second stage involves robust CNN architectures for accurate disease classification.

We believe these revisions thoroughly address the points you have raised and enhance the clarity and depth of our manuscript. We are grateful for your guidance in improving the quality of our work.

  5. Many references in Section 2.2 are not related to leaf disease detection. They mix up many methods and many applications. So, the authors need to remove the references that are not related to leaf disease detection and ensemble techniques.

Answer:    Thank you for your insightful feedback on Section 2.2 of our manuscript regarding the relevance of the references to leaf disease detection and ensemble techniques. We have carefully reviewed this section and agree that some of the previously cited references did not directly pertain to the specific context of leaf disease detection in crops like Centella asiatica (CAU).


In response to your valuable comments, we have revised Section 2.2 to focus exclusively on deep learning-based ensemble techniques that are directly relevant to leaf disease detection. This revised section now emphasizes the use of such techniques in plant pathology, particularly how they enhance the accuracy and efficiency of leaf disease detection in CAU. We have removed references that were not specifically related to leaf disease detection or ensemble methods in this context.

Additionally, we have ensured that the section now clearly articulates the advancements and potential applications of these ensemble techniques in leaf disease detection. This includes a more focused discussion on the integration of various CNN architectures and the role of advanced image segmentation techniques, which are crucial components of our proposed method.

We believe that these revisions have significantly improved the clarity and relevance of the section, aligning it more closely with the manuscript's overall theme and objectives. The unrelated references have been removed, and the section now concisely focuses on the methods and applications directly related to our study.

We appreciate your guidance in enhancing the quality of our manuscript and hope that the revised Section 2.2 now meets the expectations of the journal.

  6. It is not clear why the authors divide the dataset into two types: ABL-1 and ABL-2. If the ABL-2 set is only used for testing, then what about the testing set of ABL-1? Also, why do the authors not divide the data into a validation set? How can the authors know whether the model is overfitting or not without using a validation set?

Answer:    Thank you for your insightful query regarding the dataset division into ABL-1 and ABL-2 and the utilization of these datasets in our study. We acknowledge that our initial explanation might not have clearly articulated the rationale behind this methodology. Here, we provide a more detailed clarification:

Rationale for Dividing the Dataset into ABL-1 and ABL-2: The division of the dataset into ABL-1 and ABL-2 was strategically done to enhance the robustness of our model evaluation. ABL-1 was used as the training dataset, containing a wide range of leaf images with various disease manifestations. This dataset was essential for developing a model that is well-trained and capable of identifying a broad spectrum of leaf abnormalities.

Purpose of ABL-2 as an Unseen Dataset: ABL-2, being an unseen dataset, was exclusively used for testing the model. The primary purpose of this was to evaluate the model's performance on completely new data that it had not encountered during the training phase. This approach is crucial in assessing the model's generalization capabilities and its effectiveness in real-world scenarios where it would encounter previously unseen data.

Testing Set for ABL-1 and Overfitting Concerns: Regarding the testing set for ABL-1, we allocated a portion of ABL-1 as a separate testing set to initially evaluate the model's performance post-training. This step was crucial before proceeding to test the model on ABL-2. As for the validation set, we employed a cross-validation technique within the ABL-1 dataset during the training phase. This method involved iteratively using different subsets of ABL-1 as the validation set, enabling us to monitor and prevent overfitting effectively.

Ensuring Model Robustness against Overfitting: To ensure that our model was not overfitting, we closely monitored its performance on the validation subsets within ABL-1 during the training process. The use of cross-validation provided us with insights into the model’s performance across various splits of the data, ensuring that it was learning general patterns rather than memorizing specific data points. This technique, coupled with the final testing on the unseen ABL-2 dataset, provided a comprehensive evaluation of the model's accuracy and generalization ability.
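For illustration, a minimal sketch of such a k-fold scheme over ABL-1; the fold count shown and the training/evaluation stubs are assumptions, not the exact protocol:

```python
# k-fold cross-validation within ABL-1: each fold serves once as validation.
import numpy as np
from sklearn.model_selection import KFold

abl1_indices = np.arange(14860)           # one index per ABL-1 image
kfold = KFold(n_splits=5, shuffle=True, random_state=0)
for fold, (train_idx, val_idx) in enumerate(kfold.split(abl1_indices)):
    # model = train_on(abl1_indices[train_idx])      # training stub (assumed)
    # score = evaluate_on(abl1_indices[val_idx])     # validation stub (assumed)
    print(f"fold {fold}: {len(train_idx)} train / {len(val_idx)} validation images")
```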

In summary, our approach to dividing the dataset into ABL-1 and ABL-2, along with the implementation of cross-validation techniques, was aimed at rigorously evaluating the model's performance and ensuring its applicability in real-world conditions. We believe this methodology offers a robust framework for model development and evaluation in the field of leaf disease detection.

We hope this explanation addresses your concerns and provides clarity on our methodology.

  7. Do the references [12, 13, 14, 72, 73, 74, 75] in Table 4 use the same data as the authors used for training and testing? If not, there is no reason to compare the training and testing times with their method.

Answer:

Thank you for highlighting the importance of comparing training and testing times in Table 4 with the referenced methods. We have updated our citations, changing them from [12, 13, 14, 72, 73, 74, 75] to [13, 14, 15, 54, 55, 56, 57]. We are grateful for the chance to provide further clarity on this part of our research.

In our research, we aimed to provide a comprehensive and fair comparison of the effectiveness of our proposed method against existing methods. To achieve this, we meticulously re-implemented the methodologies as described in references [13, 14, 15, 54, 55, 56, 57]. This re-implementation was conducted with strict adherence to the procedures and algorithms outlined in these studies.

Most importantly, to ensure a valid and equitable comparison, all these methods were tested using our newly collected dataset. This approach allowed us to compare the performance, including training and testing times, under consistent and controlled conditions. By doing so, we aimed to provide an accurate assessment of how our method performs relative to existing techniques when applied to the same dataset.

We believe this approach not only adds credibility to our comparative analysis but also enhances the overall contribution of our study to the field, as it demonstrates the practical effectiveness of our method under conditions that are consistent with those used in well-established research.

We hope this explanation satisfactorily addresses your concern and underscores the rigor and thoroughness of our comparative analysis.


Author Response File: Author Response.pdf

Round 2

Reviewer 2 Report

Comments and Suggestions for Authors

The authors have addressed all the points clearly with good effort.

Author Response

Reviewer 2 :

The authors have addressed all the points clearly with good effort.

Answer : Thank you very much for your thoughtful feedback on our research paper. We greatly appreciate your acknowledgment of our efforts in addressing the points effectively. Ensuring clarity and thoroughness was paramount to us, and we're delighted to hear that you found our presentation satisfactory. Should you have any further suggestions or areas for improvement, we would be more than happy to incorporate them. Thank you once again for your time and valuable input.


Author Response File: Author Response.docx

Reviewer 5 Report

Comments and Suggestions for Authors

One main question I want to ask: during inference, if the ensemble segmentation method in the first stage produces false positives, what will happen to those false positives in the second stage? The authors need to answer this question. Additionally, I want to inquire whether they integrate two different models or not (segmentation and classification) during inference.

Comments on the Quality of English Language

The authors need to check the grammar one more time.

Author Response

Reviewer 5

1. One main question I want to ask: during inference, if the ensemble segmentation method in the first stage produces false positives, what will happen to those false positives in the second stage? The authors need to answer this question.


Answer: Handling False Positives during Inference:

We appreciate the reviewer's inquiry regarding the impact of false positives generated by the ensemble segmentation method in the first stage on the subsequent classification stage. In our methodology, we employ a rigorous validation and testing process to minimize the incidence of false positives. Specifically, our ensemble approach integrates U-net, Mask R-CNN, and DeepLabV3++ segmentation models, leveraging their complementary strengths to accurately delineate diseased regions. To address potential false positives, we implement a post-segmentation validation step that utilizes morphological features and spatial context to filter out inaccuracies before proceeding to the classification stage. Additionally, our classification models are trained on a dataset that includes examples of both true and false positives, enabling them to learn distinguishing features that further reduce the impact of initial false positives on final disease classification accuracy.
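To illustrate the kind of post-segmentation filter described here, a hedged sketch in which the morphological criterion is reduced to a minimum-area rule on connected components (the threshold value is an assumption):

```python
# Drop segmented regions too small to be plausible leaf areas before
# they reach the classification stage.
import numpy as np
from scipy import ndimage

def filter_small_regions(mask: np.ndarray, min_area: int = 500) -> np.ndarray:
    labeled, n_regions = ndimage.label(mask)      # label connected components
    cleaned = np.zeros_like(mask)
    for region in range(1, n_regions + 1):
        component = labeled == region
        if component.sum() >= min_area:           # keep only sizeable regions
            cleaned[component] = 1
    return cleaned
```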

2. Additionally, I want to inquire whether they integrate two different models or not (segmentation and classification) during inference.


Answer: Integration of Segmentation and Classification Models during Inference:

Regarding the integration of segmentation and classification models during inference, our two-stage ensemble model indeed combines these distinct model types in a sequential manner. The first stage focuses on segmentation, employing U-net, Mask R-CNN, and DeepLabV3++ to precisely identify regions of interest. The segmented outputs are then passed to the second stage, which comprises classification models (ShuffleNetV2, SqueezeNetV2, and MobileNetV3) tasked with disease identification. This sequential integration allows for focused and efficient analysis, where the segmentation stage ensures that classification models are provided with targeted, high-quality inputs, thereby enhancing overall system accuracy and performance. Our approach ensures that both model types contribute effectively to the disease detection process, leveraging their respective strengths in a complementary and integrated manner.
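A schematic of this sequential integration during inference; the model objects and the fuse helper below are placeholders for the actual ensemble components:

```python
# Two-stage inference: segmentation ensemble -> fused mask -> crop ->
# classification ensemble -> fused label.
def two_stage_inference(image, segmenters, classifiers, fuse):
    masks = [seg(image) for seg in segmenters]   # e.g., U-net, Mask R-CNN, DeepLabV3+
    region = fuse(masks)                         # fused region of interest
    crop = image * region                        # pass only segmented pixels onward
    votes = [clf(crop) for clf in classifiers]   # e.g., ShuffleNetV2, SqueezeNetV2, MobileNetV3
    return fuse(votes)                           # decision-fusion output
```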


3. The authors need to check the grammar one more time.

Answer: Thank you for your review and for bringing attention to the grammar of our paper. We sincerely appreciate your thorough assessment. We will ensure to carefully review the grammar once more to enhance the clarity and readability of our work. Your feedback is invaluable to us, and we are committed to delivering a polished and professional manuscript. If you have any further suggestions or specific areas that require attention, please don't hesitate to let us know. We are dedicated to addressing all aspects of our paper to meet the highest standards. Thank you for your time and consideration.

Author Response File: Author Response.docx

Round 3

Reviewer 5 Report

Comments and Suggestions for Authors

The authors didn't provide the specific answer I was looking for in the comment.
