Article
Peer-Review Record

Detection of Diseases in Tomato Leaves by Color Analysis

Electronics 2021, 10(9), 1055; https://doi.org/10.3390/electronics10091055
by Benjamín Luna-Benoso 1,*, José Cruz Martínez-Perales 1, Jorge Cortés-Galicia 1, Rolando Flores-Carapia 2 and Víctor Manuel Silva-García 2
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Reviewer 4: Anonymous
Reviewer 5: Anonymous
Reviewer 6: Anonymous
Submission received: 16 March 2021 / Revised: 26 April 2021 / Accepted: 26 April 2021 / Published: 29 April 2021
(This article belongs to the Section Computer Science & Engineering)

Round 1

Reviewer 1 Report

The authors report a study on image processing based on machine learning techniques to detect diseases in tomato leaves. The main conclusions presented in the paper are well supported by the figures and supporting text, and it is therefore recommended that the manuscript be published in Electronics. However, to meet the journal's quality standards, the following minor comments need to be addressed:



1. The abstract should be precise, focusing on the novelty of this work.

2. The writing of the introduction can be improved. The novelty should be highlighted in the introduction, and the authors should clearly set out how their research advances on existing published results.


3. There are some grammatical mistakes and shortcomings in the manuscript. Please improve the English and aim for a more concise presentation.

Author Response

Good afternoon, dear Reviewer. First of all, on behalf of the whole team, we would like to thank you for your valuable comments, which have helped us improve the work. Our responses to your comments are given below.

  1. The abstract should be precise, focusing on the novelty of this work. Answer: The abstract was revised to emphasize the novelty of the work (lines 18-22).
  2. The writing of the introduction can be improved. The novelty should be highlighted in the introduction, and the authors should clearly set out how their research advances on existing published results. Answer: The introduction was revised to highlight the proposal of this work in comparison with other investigations (lines 53-77).
  3. There are some grammatical mistakes and shortcomings in the manuscript. Please improve the English and aim for a more concise presentation. Answer: An exhaustive review of the document was carried out by an expert English-language editor.

Reviewer 2 Report

I was extremely impressed with the quality of the work, well written and well researched. I recommend acceptance.

Author Response

Good afternoon, dear Reviewer. On behalf of the whole team, we would like to thank you for your valuable comments, which help us improve and guide the development of future work. Thank you very much indeed.

 

Reviewer 3 Report

The authors present a machine-learning-based approach to detect three types of disease in tomato plants. They test their method on the PlantVillage dataset and report the obtained results.

 

The general goals, approach, and methodology are sound, so I think the paper has some merit and can be improved and resubmitted. However, in its present form it has too many flaws in my opinion, ranging from relatively minor ones to ones that are hard to judge (I mean, I cannot even guess whether they can be fixed quickly or cannot be fixed at all; hence the final verdict is not positive).

First of all, the authors spend an enormous amount of time discussing well-known methods and formulas of machine learning and image processing. I do not think there is any need to discuss RGB/HSV, Otsu's method, SVM, KNN and other textbook methods. If we add the lengthy descriptions of kurtosis, mean value and other formulas that do not constitute the authors' contribution, they easily take up more than half of the paper. This makes it hard to figure out where exactly the authors' contribution lies and what constitutes existing research.

Presumably, the main results are contained in Tables 2 and 3. What catches the eye immediately is that these tables are named identically and have identical columns. What kind of presentation is that? It can be inferred from the text that the first table classifies leaves into "healthy/sick" and the second one is responsible for the specific disease. However, the text also says that "The respective tables show the comparison with other models" (taken from Weka). Again, it is hard to understand how to read this. The "proposed" model should be the one in boldface below; the other lines are probably Weka, but MLP, SVM and KNN are also listed as parts of the proposed architecture in Table 1. So is "KNN" in Table 2 actually KNN from Weka, or KNN as part of the proposal?

Furthermore, if we take the results at face value, then the best numbers in both tables belong to plain SVM (linear kernel) and MLP, both outperforming "proposed".

The following Discussion/Conclusion sections do not help to clarify this confusion. The authors mention the methods and tasks of "other works" and "related works", but there is no head-to-head numerical comparison. It really looks strange to me to compare one's own methods with general-purpose built-in models of Weka rather than with state-of-the-art specialized algorithms developed within the same domain.

Thus, the biggest problem for me is my inability to assess the authors' contribution. It seems apparent that their method works and achieves a certain high accuracy. However, it is unclear what the novelty/contribution is here. Is this method better than competing approaches (by "competing" I mean approaches discussed in related papers rather than Weka-supplied tools)? Is there any advancement of the state of the art here? These topics are crucial and have to be addressed, I believe.

Author Response

Good afternoon, dear Reviewer. First of all, on behalf of the whole team, we would like to thank you for your valuable comments, which have helped us improve the work. Our responses to your comments are given below.

First of all, the authors spend an enormous amount of time discussing well-known methods and formulas of machine learning and image processing. I do not think there is any need to discuss RGB/HSV, Otsu's method, SVM, KNN and other textbook methods. If we add the lengthy descriptions of kurtosis, mean value and other formulas that do not constitute the authors' contribution, they easily take up more than half of the paper. This makes it hard to figure out where exactly the authors' contribution lies and what constitutes existing research. Response: The abstract (lines 18-22), the introduction (lines 53-77), the discussion (lines 341-342, 347-348, 358-366) and the conclusions (lines 373-376, 378-387, 390-393) have been modified to highlight the contributions of the work.

Presumably, the main results are contained in Tables 2 and 3. What catches the eye immediately is that these tables are named identically and have identical columns. What kind of presentation is that? It can be inferred from the text that the first table classifies leaves into "healthy/sick" and the second one is responsible for the specific disease. However, the text also says that "The respective tables show the comparison with other models" (taken from Weka). Again, it is hard to understand how to read this. The "proposed" model should be the one in boldface below; the other lines are probably Weka, but MLP, SVM and KNN are also listed as parts of the proposed architecture in Table 1. So is "KNN" in Table 2 actually KNN from Weka, or KNN as part of the proposal? Answer: The titles of Tables 2 and 3 were corrected; in addition, Table 4 was added, which shows the results obtained by other authors, highlighting the results obtained by the proposed model. K-NN in Table 2 is the accuracy result obtained using the WEKA platform with the same features used for the proposed model.

Furthermore, if we take the results at face value, then the best numbers in both tables belong to plain SVM (linear kernel) and MLP, both outperforming "proposed".

Answer: Indeed, SVM and MLP outperform the proposed architecture with the same feature vector; however, the results of the proposed model are competitive with other works, such as those shown in the newly attached Table 4 (lines 358-366, 390-393). The document also notes that the works of other authors treat the healthy class as one more class within the group of diseased classes to be classified (lines 347-348), so it is not possible to compare the classification of healthy versus diseased leaves with other works, except with the WEKA platform and, in that case, with the same feature vector.

The following Discussion/Conclusion sections do not help to clarify this confusion. The authors mention the methods and tasks of "other works" and "related works", but there is no head-to-head numerical comparison. It really looks strange to me to compare one's own methods with general-purpose built-in models of Weka rather than with state-of-the-art specialized algorithms developed within the same domain. Answer: Table 4 was added, showing the results obtained by other authors.

Thus, the biggest problem for me is my inability to assess the authors' contribution. It seems apparent that their method works and achieves a certain high accuracy. However, it is unclear what the novelty/contribution is here. Is this method better than competing approaches (by "competing" I mean approaches discussed in related papers rather than Weka-supplied tools)? Is there any advancement of the state of the art here? These topics are crucial and have to be addressed, I believe. Response: The abstract (lines 18-22), the introduction (lines 53-77), the discussion (lines 341-342, 347-348, 358-366) and the conclusions (lines 373-376, 378-387, 390-393) have been modified to highlight the contributions of the work.

Reviewer 4 Report

The article explains the importance of the topic very well, and an approach to solve this particular task is presented. Unfortunately, there are several issues in the present manuscript that require moderate editing.
 
The distinctive features of the manuscript are presented only at the end. As far as I can conclude, the most important feature is that human experts prepared an HSV color table which was used for segmentation of lesions on a leaf. Besides, automated segmentation of the leaf itself from the background was implemented by the authors. I suggest stating the most valuable contribution of the present work in short and concise form in the abstract.


The introduction (lines 57-78) contains a number of statements about prior art which read more like a bullet-point list (or a table) than text. I strongly advise editing the style, preferably also extending this part.


Check the sentences (lines 79-80, line 328). Line 104: the statement "where the values of each RGB component are found in three vertices: cyan, magenta and yellow" is difficult to understand. Line 106: "prism" - it is actually a cone. Eq. (3) - "si" -> "if"? Line 118 - "scale_h,s,v" are not defined. Check the subscripts in Eq. (8): (µ1,2,T), as well as in lines 143-144 and 159.


Most definitions are given in Section 2. However, the definitions of color moments and GLCM are given later. I suggest moving the mathematical definitions (including equations) to Section 2 and leaving only a short description in Section 3. The definitions of GLCM are generally incomplete (Eqs. 18-25) - what is G? P_ij? log_ij? What about the i,j indexes in the GLCM - do they have the same meaning as in the rest of the manuscript and denote pixel positions? Check Eqs. 19 and 27 for misprints.


The description of auto-segmentation of a leaf image in Section 3.1 is questionable. It must be improved.


First, for the RGB-to-grayscale conversion, equal weights for the separate RGB colors are used (lines 181-182). Normally, one would use some color weighting for grayscale conversion. For the purpose of segmenting the leaf out of the background this might be OK, but later the gray-level co-occurrence matrix is applied for image feature description, so the choice of RGB weights for grayscale conversion has implications for the results. Please justify your choice of equal weights for RGB.


Second, it is not clear what the intention of negating the image is. Since at the next step the image is binarized using the Otsu threshold, one could just as well simply change the comparison sign. It is also not clear what is meant by "black" pixels (line 189) - black in the original image? In the negated image? In the binarized (thresholded) image?


Third, the segmentation seems to assume that the background is larger in area than the leaf itself (lines 186-188). What if the area of the leaf in the image is larger than half of the image? What if there is no background at all? What influence does the background color/brightness have? Is there any normalization of image brightness?

Considering the results and discussion parts (mistake in section numbering; they should be Sections 4 and 5), they are surprisingly short. It is not clear why the arguments of the classifiers were chosen the way they are shown in Table 1. Tables 2 and 3 have the same header; moreover, they seem to be presented in the opposite order to that written in lines 300-302 (pay attention to the accuracy values in lines 335-337 and 358-361). Results were compared with other models; however, it is not clear what the WEKA platform is and who obtained those results. If these are third-party results, then they should be referenced.


Most importantly, the authors do not compare the results of their combined approach with the results of the separate underlying models (MLP, SVM and K-NN). Judging by Tables 2/3, the accuracy of the combined approach is actually lower than that of MLP (and at least some SVM models). This obviously raises the question of why one should use the MLP-SVM-K-NN approach if plain MLP or SVM is superior. This question must be addressed by the authors.

Author Response

Good afternoon, dear Reviewer. First of all, on behalf of the whole team, we would like to thank you for your valuable comments, which have helped us improve the work. Our responses to your comments are given below.

The article explains the importance of the topic very well, and an approach to solve this particular task is presented. Unfortunately, there are several issues in the present manuscript that require moderate editing.

The distinctive features of the manuscript are presented only at the end. As far as I can conclude, the most important feature is that human experts prepared an HSV color table which was used for segmentation of lesions on a leaf. Besides, automated segmentation of the leaf itself from the background was implemented by the authors. I suggest stating the most valuable contribution of the present work in short and concise form in the abstract.

Response: The abstract (lines 18-22), the introduction (lines 53-77), the discussion (lines 341-342, 347-348, 358-366) and the conclusions (lines 373-376, 378-387, 390-393) have been modified to highlight the contributions of the work.

The introduction (lines 57-78) contains a number of statements about prior art which read more like a bullet-point list (or a table) than text. I strongly advise editing the style, preferably also extending this part. Response: The introduction was revised to highlight the proposal of this work in comparison with other investigations (lines 53-77).

Check the sentences (lines 79-80, line 328). Answer: The sentences on lines 79-80 were changed (lines 79-82), and line 328 was corrected.

Line 104: the statement "where the values of each RGB component are found in three vertices: cyan, magenta and yellow" is difficult to understand. Answer: The entire paragraph containing this line was modified (lines 99-105).

Line 106: "prism" - it is actually a cone. Answer: The entire paragraph containing this line was modified and corrected (lines 99-105).

Eq. (3) - "si" -> "if"? Answer: Corrected.

Line 118 - "scale_h,s,v" are not defined. Answer: Their meaning was added (lines 114-115).

Check the subscripts in Eq. (8): (µ1,2,T), as well as in lines 143-144 and 159. Answer: The subscripts have been corrected.

Most definitions are given in Section 2. However, the definitions of color moments and GLCM are given later. I suggest moving the mathematical definitions (including equations) to Section 2 and leaving only a short description in Section 3. Answer: The definitions of color moments and GLCM were moved to Section 2 (lines 175-210) and a brief description was left in Section 3 (lines 263-265).

The definitions of GLCM are generally incomplete (Eqs. 18-25) - what is G? P_ij? log_ij? What about the i,j indexes in the GLCM - do they have the same meaning as in the rest of the manuscript and denote pixel positions? Check Eqs. 19 and 27 for misprints. Answer: Equations 18-25 were rewritten in terms of the GLCM co-occurrence matrix, clarifying each of the doubts about the notation (lines 199-210).

The description of the auto-segmentation of a leaf image in Section 3.1 is questionable. It must be improved. Answer: The explanation of the auto-segmentation shown in Section 3.1 (lines 224-233) has been improved.

First, for the RGB-to-grayscale conversion, equal weights for the separate RGB colors are used (lines 181-182). Normally, one would use some color weighting for grayscale conversion. For the purpose of segmenting the leaf out of the background this might be OK, but later the gray-level co-occurrence matrix is applied for image feature description, so the choice of RGB weights for grayscale conversion has implications for the results. Please justify your choice of equal weights for RGB. Answer: Tests were carried out with the average of the RGB planes, i.e., I = (R + G + B)/3; with one where I = R = G = B is considered; and with the weighting suggested by the ITU-R BT.709 standard, I = aR + bG + cB with a = 0.21, b = 0.72 and c = 0.07. However, the best results were obtained with the equal-weight average, which is the conversion presented in this work.
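For illustration, the two conversions being compared can be sketched as follows (a minimal sketch; the function names are ours, and the BT.709 weights are shown at full precision rather than the rounded values quoted above):

```python
def to_gray_equal(r, g, b):
    # Equal-weight conversion used in the paper: I = (R + G + B) / 3
    return (r + g + b) / 3.0

def to_gray_bt709(r, g, b):
    # ITU-R BT.709 luma weighting (full-precision coefficients)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

# A saturated-green pixel shows how much the two conversions can differ,
# which in turn shifts the gray levels fed to the GLCM texture features.
print(to_gray_equal(0, 255, 0))   # 85.0
print(to_gray_bt709(0, 255, 0))   # 182.376
```

The gap is largest for strongly colored pixels, which is exactly where leaf and lesion colors live, so the choice of weights is not merely cosmetic.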

Second, it is not clear what the intention of negating the image is. Since at the next step the image is binarized using the Otsu threshold, one could just as well simply change the comparison sign. Answer: The negative of the image was considered so that, when the Otsu threshold is computed, the pixel values on the leaf lie above the threshold, i.e., they approach white (value 255), while the background is determined by values close to black (value 0), i.e., below the threshold. However, as you note, the comparison sign could simply be changed instead.
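The equivalence the reviewer points out can be checked directly: applying Otsu to the negated image and keeping pixels below the threshold gives the same mask as applying Otsu to the original and flipping the comparison. This is a toy sketch with invented pixel values and a textbook Otsu implementation, not the authors' code:

```python
def otsu_threshold(pixels, levels=256):
    # Textbook Otsu: pick the threshold t that maximizes the
    # between-class variance of the two resulting gray-level classes.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * hist[i] for i in range(levels))
    best_t, best_var, w_b, sum_b = 0, -1.0, 0, 0.0
    for t in range(levels):
        w_b += hist[t]                    # class A = gray levels <= t
        if w_b == 0:
            continue
        w_f = total - w_b                 # class B = gray levels > t
        if w_f == 0:
            break
        sum_b += t * hist[t]
        mean_b = sum_b / w_b
        mean_f = (sum_all - sum_b) / w_f
        var = w_b * w_f * (mean_b - mean_f) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t

pixels = [10, 12, 11, 200, 210, 205]        # dark background, bright leaf
negated = [255 - p for p in pixels]
mask_negate = [n <= otsu_threshold(negated) for n in negated]   # negate, then threshold
mask_flip = [p > otsu_threshold(pixels) for p in pixels]        # flip the comparison
print(mask_negate == mask_flip)             # True
```

Both routes select the same foreground, so the negation is a presentational choice rather than an algorithmic one.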

It is also not clear what is meant by "black" pixels (line 189) - black in the original image? In the negated image? In the binarized (thresholded) image? Response: The wording was changed (lines 224-232).

 

Third, the segmentation seems to assume that the background is larger in area than the leaf itself (lines 186-188). What if the area of the leaf in the image is larger than half of the image? What if there is no background at all? What influence does the background color/brightness have? Is there any normalization of image brightness? Answer: The image bank used is PlantVillage; these images were captured in a controlled environment with a uniform background. If the area of the leaf is greater than half of the image, the segmentation of the leaf is still carried out, since the segmentation procedure (Figure 3.f) eliminates those components whose area is less than a threshold; at the end, only the part corresponding to the background of the color image remains in yellow, as shown in Figure 3.f, and the white region corresponds to the area of the leaf. Because the PlantVillage images are captured in a controlled environment, this work does not consider images in which there is no background. On the other hand, brightness is an important factor; moreover, there are images containing shadow that is confused with damaged areas on the leaf, so in those cases the segmentation is usually wrong. As a possible solution, instead of resorting to the Otsu method, tests could be carried out using a local thresholding method such as Niblack; however, that investigation is outside the scope of the present work, although we believe it is a good way to handle a greater range of images.

Considering the results and discussion parts (mistake in section numbering; they should be Sections 4 and 5), they are surprisingly short. Answer: The content of Sections 4 and 5 was extended.

It is not clear why the arguments of the classifiers were chosen the way they are shown in Table 1. Answer: Several experiments were carried out, including the MLP model with one and two hidden layers, SVM with linear and RBF kernels, and K-NN with k = 1, 3, 7, 9 and 11, using 6, 9, 12, 18 and 21 of the proposed features; the configuration that gave the best results was chosen.

Tables 2 and 3 have the same header; moreover, they seem to be presented in the opposite order to that written in lines 300-302 (pay attention to the accuracy values in lines 335-337 and 358-361). Results were compared with other models; however, it is not clear what the WEKA platform is and who obtained those results. If these are third-party results, then they should be referenced. Answer: The titles of Tables 2 and 3 were corrected, and Table 4 was added, showing the results obtained by other authors. WEKA is a machine learning and data mining software platform written in Java and developed at the University of Waikato. It can be accessed at the following link: https://www.cs.waikato.ac.nz/ml/weka/.

Answer (continued): Table 4 is attached, showing the results obtained by other authors; the accuracy obtained by the proposed work is competitive with those results. This is addressed in the discussion and conclusions sections.

Reviewer 5 Report

  1. In this work, a methodology was presented that allows discriminating between images of healthy and sick tomato leaves. The methodology is divided into three modules: i) segmentation, ii) feature extraction and iii) classification. Segmentation was carried out by methods in the spatial domain and thresholding in the HSV color model, using a range of values validated by an expert in the area of phytopathology. Subsequently, a feature vector is constructed corresponding to 4 color moments for each component of the RGB color model and 9 statistical measurements for texture analysis using the GLCM. Finally, three classifiers are used for classification: MLP, K-NN and SVM. The authors should better highlight the contributions of the proposed method.

 

  2. The authors mentioned: "For the classification module, a voting rule was used to obtain the output corresponding to an input image from the classification obtained by the MLP, K-NN and SVM algorithms." What is the content of the voting rule?

 

  3. The authors must explain all of the variables in Equations 1-14. All the equations are unreadable.

 

  4. The authors must carefully correct the many spelling errors, as follows:

 

         A. Line 257: 3.2 Classification -->3.3 Classification

         B. Line 264 (Figure 6):clasification --> classification

         C. Line 357: k-fol --> k-fold

 

  5. The conclusion is too short. I suggest the authors strengthen the content of the conclusion section.

Author Response

Good afternoon, dear Reviewer. First of all, on behalf of the whole team, we would like to thank you for your valuable comments, which have helped us improve the work. Our responses to your comments are given below.

1. In this work, a methodology was presented that allows discriminating between images of healthy and sick tomato leaves. The methodology is divided into three modules: i) segmentation, ii) feature extraction and iii) classification. Segmentation was carried out by methods in the spatial domain and thresholding in the HSV color model, using a range of values validated by an expert in the area of phytopathology. Subsequently, a feature vector is constructed corresponding to 4 color moments for each component of the RGB color model and 9 statistical measurements for texture analysis using the GLCM. Finally, three classifiers are used for classification: MLP, K-NN and SVM. The authors should better highlight the contributions of the proposed method. Response: The abstract (lines 18-22), the introduction (lines 53-77), the discussion (lines 341-342, 347-348, 358-366) and the conclusions (lines 373-376, 378-387, 390-393) have been modified to highlight the contributions of the work.

2. The authors mentioned: "For the classification module, a voting rule was used to obtain the output corresponding to an input image from the classification obtained by the MLP, K-NN and SVM algorithms." What is the content of the voting rule? Answer: In the document, the word "vote" was changed to "decision" (lines 21, 77, 269, 340, 380) and the rule is explained more clearly. The decision rule consists in that, if two of the three proposed classifiers (MLP, K-NN and SVM) yield the same result, then that result is taken as the final decision.
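The decision rule described above amounts to a simple majority vote over the three classifier outputs. A minimal sketch (the function name and labels are ours; what happens when all three classifiers disagree is not specified in the response, so that case is flagged with None here):

```python
from collections import Counter

def majority_decision(predictions):
    # Final decision rule: if at least two of the three classifiers
    # (MLP, K-NN, SVM) agree on a label, that label is the output.
    label, count = Counter(predictions).most_common(1)[0]
    return label if count >= 2 else None  # None: no majority among the three

print(majority_decision(["sick", "sick", "healthy"]))        # sick
print(majority_decision(["healthy", "sick", "late blight"])) # None
```

With three voters and more than two possible labels, a three-way disagreement is possible, so any implementation needs an explicit tie-breaking policy.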

3. The authors must explain all of the variables in Equations 1-14. All the equations are unreadable. Answer: Some equations were corrected to make them readable, and some changed order. Where the methodology makes use of the equations, this is now written explicitly (lines 222, 237, 288).

4. The authors must carefully correct the many spelling errors, as follows:

A. Line 257: 3.2 Classification -->3.3 Classification Response: Corrected.

B. Line 264 (Figure 6):clasification --> classification Response: Corrected.     

C. Line 357: k-fol --> k-fold Answer: Corrected.      

5. The conclusion is too short. I suggest the authors strengthen the content of the conclusion section. Answer: The conclusion was extended, reinforcing the content with respect to the results obtained (lines 371-373, 376-384, 388-391).

Reviewer 6 Report

The things I appreciated about this paper were that the authors cited previous work and situated their contribution in the context of other work, and that they compared their result with other methods, which is always important.

I had a few questions and suggestions that I think might improve the manuscript.

  1. I found Figure 4 and its explanation confusing. The text says "Figure 4 shows the color thresholds in which a plant is damaged by this disease and the thresholds in which it is healthy" but I just don't understand what that means. I know it's related to equations 11, 12, and 13, but I'm not sure I understand how. It probably doesn't help that I am colorblind :)
  2. I wasn't sure why they chose an average filter instead of an edge-preserving filter. Maybe it doesn't really matter, just wondered.
  3. I appreciated that the authors quantitatively validated the results of the classification step and compared their method with other methods. One thing I would have liked to see more of is quantification and validation of the intermediate steps. In particular, I'm interested to know how well the automatic segmentation with Otsu is working. And in the cases where the final classification fails, is it related to problems with the segmentation, or is it something else? I guess in general I would have liked to see the authors dig more into the images that didn't get classified correctly and try to comment on what their next steps would have to be to get to 100%.

Author Response

Good afternoon, dear Reviewer. First of all, on behalf of the whole team, we would like to thank you for your valuable comments, which have helped us improve the work. Our responses to your comments are given below.

The things I appreciated about this paper were that the authors cited previous work and situated their contribution in the context of other work, and that they compared their result with other methods, which is always important.

I had a few questions and suggestions that I think might improve the manuscript.

  1. I found Figure 4 and its explanation confusing. The text says "Figure 4 shows the color thresholds in which a plant is damaged by this disease and the thresholds in which it is healthy" but I just don't understand what that means. I know it's related to equations 11, 12, and 13, but I'm not sure I understand how. It probably doesn't help that I am colorblind :)

    Response: The text was changed to clarify the meaning of the figure (lines 242-243), and the title of Figure 4 was changed (line 246). Figure 4 shows the color thresholds at which a leaf is healthy or diseased with the presence of late blight. Besides this table of thresholds, we also have threshold tables for septoria leaf spot and mosaic virus in tomato leaves; all of these values were validated by an expert in the phytopathology area, and for practical purposes only the late blight table is included in the article. In this work the HSV color model is used, since it is the closest to how we perceive color; to obtain a segmented image of the lesion present in the leaf, the threshold values from these tables were applied by means of the equations you mention.
  2. I wasn't sure why they chose an average filter instead of an edge-preserving filter. Maybe it doesn't really matter, just wondered. Answer: An averaging (mean) filter was indeed applied, with the aim of smoothing the image; however, there was an error at the time of writing. It has already been corrected (lines 221, 233, 288).
  3. I appreciated that the authors quantitatively validated the results of the classification step and compared their method with other methods. One thing I would have liked to see more of is quantification and validation of the intermediate steps. In particular, I'm interested to know how well the automatic segmentation with Otsu is working. And in the cases where the final classification fails, is it related to problems with the segmentation, or is it something else? I guess in general I would have liked to see the authors dig more into the images that didn't get classified correctly and try to comment on what their next steps would have to be to get to 100%.

    Answer: Regarding misclassification, it may be due to poor segmentation, mostly because there are a number of PlantVillage images in which a shadow is confused with the lesion present on the leaf. In such cases, instead of resorting to the Otsu method, tests could be carried out using a local thresholding method such as Niblack; however, that investigation is outside the scope of the present work, although we believe it is a good way to handle a greater range of images. On the other hand, talking about 100% classification is very risky, since we would be talking about an optimal classifier for this problem, and claiming optimality is something very delicate that has to be demonstrated. If new images are fed to a classifier, they may well have characteristics that cause the algorithm to fail; coupled with this, the No-Free-Lunch theorem establishes that there is no classification algorithm that is optimal in all problem domains.
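As a sketch of the alternative mentioned in this response, Niblack's method computes a separate threshold for each local window, T = m + k·s (window mean plus k times the window standard deviation, with k ≈ -0.2 a commonly used default), instead of Otsu's single global threshold. The windows and pixel values below are invented for illustration:

```python
import statistics

def niblack_threshold(window, k=-0.2):
    # Niblack local threshold: T = mean + k * std, recomputed per window.
    # Because T adapts to local contrast, a low-contrast shadow region and a
    # high-contrast lesion edge get different thresholds even when their
    # average gray level is the same.
    return statistics.mean(window) + k * statistics.pstdev(window)

flat_shadow = [100, 101, 99, 100]   # low local contrast, mean 100
lesion_edge = [40, 160, 45, 155]    # high local contrast, same mean 100
print(round(niblack_threshold(flat_shadow), 2))  # 99.86
print(round(niblack_threshold(lesion_edge), 2))  # 88.49
```

Both windows have the same mean, so a global threshold treats them identically; the local standard deviation is what lets Niblack tell them apart.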

Round 2

Reviewer 3 Report

Well, let's consider this message a 'reply' rather than a 'review'. Let me skip all considerations related to paper organization and other secondary points and focus on the main issue.

The contribution and thus the outcome of the paper really depends on whether the proposed method has any merit that makes it 'better' than the others according to some criteria. I understand the effort invested here, but it's our task to judge the outcome and thus the merit to the reader.

I find this response strange: "not possible to make a comparison of the classification of healthy leaves and diseased leaves with other works except with the WEKA platform".

In other words, the authors knew from day one that all other papers in the field treat "healthy leaves" as yet another class. So they knew from day one that a fair comparison with the state of the art would be impossible. Why so? Is it an omission, or is there some good or bad reason behind it? Why wouldn't they design the experiment in the same way as everyone else to make a fair comparison possible?

Next, let's try to have a rough estimation. Let's say an average competing approach achieves around 90% accuracy.

The proposed method achieves 86.45% accuracy on the first step (healthy/sick), and 97.39% accuracy on the second step (disease type). So it means after step 1 we get 13.55% of total cases wrong. Then after the second step we get 2.61% of the remaining cases wrong (which means 2.61% of the 86.45%, i.e. 2.26% of the original dataset).

Thus, in total 13.55% + 2.26% of the original data is classified wrong, which means that 84.19% of the original data is classified right. It actually places the proposed method at the very bottom of Table 4.
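The two-step arithmetic above can be reproduced with a short check (the percentages are those reported in the paper; the composition formula is the reviewer's estimate, added here for clarity):

```python
step1_acc = 0.8645   # healthy vs. diseased
step2_acc = 0.9739   # disease type, applied to the 86.45% that pass step 1

# A sample is classified right overall only if both steps succeed.
overall_acc = step1_acc * step2_acc
print(round(overall_acc * 100, 2))  # -> 84.19
```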

I might be wrong in my calculations, but I want to hear authors' view on the matter. In any case, the phrase "in each of them, the healthy class is considered as one more class to be classified within the set of diseased leaf classes" placed as a comment to the table is very misleading. It is not just "a note", it is a point that can turn everything, I am afraid.

Author Response

Well, let's consider this message a 'reply' rather than a 'review'. Let me skip all considerations related to paper organization and other secondary points and focus on the main issue.

The contribution and thus the outcome of the paper really depends on whether the proposed method has any merit that makes it 'better' than the others according to some criteria. I understand the effort invested here, but it's our task to judge the outcome and thus the merit to the reader.

I find this response strange: "not possible to make a comparison of the classification of healthy leaves and diseased leaves with other works except with the WEKA platform".

In other words, the authors knew from day one that all other papers in the field treat "healthy leaves" as yet another class. So they knew from day one that a fair comparison with the state of the art would be impossible. Why so? Is it an omission, or is there some good or bad reason behind it? Why wouldn't they design the experiment in the same way as everyone else to make a fair comparison possible?

Next, let's try to have a rough estimation. Let's say an average competing approach achieves around 90% accuracy.

The proposed method achieves 86.45% accuracy on the first step (healthy/sick), and 97.39% accuracy on the second step (disease type). So it means after step 1 we get 13.55% of total cases wrong. Then after the second step we get 2.61% of the remaining cases wrong (which means 2.61% of the 86.45%, i.e. 2.26% of the original dataset).

Thus, in total 13.55% + 2.26% of the original data is classified wrong, which means that 84.19% of the original data is classified right. It actually places the proposed method at the very bottom of Table 4.

I might be wrong in my calculations, but I want to hear authors' view on the matter.

 

Answer: Dear reviewer. First of all, we appreciate your suggestions and comments to improve the work. Let us tell you how this work came about in order to answer your comments. In mid-2019, we were approached by a professor-researcher who teaches Phytopathology in the Department of Parasitology at the Autonomous University of Chapingo (Mexico). His initial proposal was for a system that he could feed with different leaf diseases and that would classify the leaves as healthy or diseased; he also wanted the area of the lesion to be located, leaving the task of identifying the disease to his students. Fortunately, we found a large repository of leaf disease images (PlantVillage) to work with. Later it was decided that the system would not only distinguish between healthy and diseased leaves but also detect the type of disease; however, this was proposed once the solution to the original problem had already been built, so it was decided to continue with the work carried out: after identifying healthy and diseased leaves, the system then detects the type of disease. For practical purposes and to present results, the work was limited to three types of diseases. These were chosen by experts in the area of Phytopathology, essentially because the color thresholds that visually detect these diseases were already being worked on (the tables with which we work in this article). The results of the work are as presented. Nevertheless, we appreciate your comments and, like you, we consider it appropriate as future work to classify the healthy and diseased leaf samples jointly, as the works shown in the state of the art do, in order to compare results based on the threshold table proposed by the experts in the Phytopathology area, since, as you comment, our results are not far from your analytical estimate.
We sincerely thank you for your comments, which we believe are of great help in improving the work. Best regards.

 In any case, the phrase "in each of them, the healthy class is considered as one more class to be classified within the set of diseased leaf classes" placed as a comment to the table is very misleading. It is not just "a note", it is a point that can turn everything, I am afraid.

Answer: The sentence placed as a comment to Table 4 was rewritten (see the part in green).

Reviewer 4 Report

There is still some room for improvement, however, in my opinion the manuscript in current shape deserves to be published.

media filter -> median filter.

Author Response

Reviewer 4. Dear Reviewer, we appreciate the comments and suggestions that you have given us to improve the work. We are grateful to you.

Best regards.

There is still some room for improvement, however, in my opinion the manuscript in current shape deserves to be published.

media filter -> median filter.

Answer: Corrected (lines 221, 233, 288 in green color).

Round 3

Reviewer 3 Report

I get the research history of your paper, which certainly highlights your situation, but please understand that this path is irrelevant to the reader. The reader (as I believe) is only interested in finding the best approach to solve their own problem, not in the successes or failures of others.

Thus, a method that has no clear advantages over other competing methods isn't of high interest to the readers unless it has some specific advantages. In any case, it must be very clearly stated what your method's objective performance measurements are and how it is positioned within other research works in the area.

Author Response

Reviewer 3.

I get the research history of your paper, which certainly highlights your situation, but please understand that this path is irrelevant to the reader. The reader (as I believe) is only interested in finding the best approach to solve their own problem, not in the successes or failures of others.

Thus, a method that has no clear advantages over other competing methods isn't of high interest to the readers unless it has some specific advantages. In any case, it must be very clearly stated what your method's objective performance measurements are and how it is positioned within other research works in the area.

Answer: Dear reviewer. We thank you for your comments. The main contribution of this work is the use of a table of values, validated by an expert in the phytopathology area, which allows thresholding of the areas damaged by any of the three diseases considered; in this way we obtain the segmentation of the damaged areas. The classification was first divided between healthy leaves and diseased leaves; the diseased leaves were then classified into three types of diseases. As mentioned before, the cited works carry out the classification jointly, considering the disease types and the healthy leaves at the same time. Although we will take into account as future work the joint treatment of the images (both healthy and diseased leaves), the present work is limited first to distinguishing healthy from diseased leaves, and then to identifying the type of disease, with the results shown in Tables 2 and 3.

 

Reply to Academic Editor.

Dear Editor.

First of all, on behalf of those who make up the work team, we want to thank you for your valuable comments to improve the proposed work. Below are the responses to each of your comments.

  • block diagram of the proposed method should be added;

Answer: The block diagram (Figure 3) was added, which together with the architecture of the model (Figure 7) makes up the proposed model. In addition, text was written to introduce the block diagram (lines 226-232) and to give continuity to the document afterwards (lines 243-245).

  • Figure 1 should have better quality;

Answer: The quality of figure 1 has been improved.

  • photo of measurements should be added;

Answer: The images used were obtained from the PlantVillage dataset available in reference [34]; you can access the images directly through the following link: https://github.com/spMohanty/PlantVillage-Dataset/tree/master/raw/color/ . PlantVillage is a freely accessible repository of more than 70,000 images of different types of diseases in different types of plants, and it has been used in various classification studies. These images were used for this work, so no prototype was built to capture the images; that is, a photo of the measurements of a prototype does not apply.

  • what is the accuracy of the proposed approach?;

Answer: The model provides two classifications. First, given a tomato leaf, it classifies it as healthy or diseased; in this case its accuracy is 86.45%. It then classifies the set of diseased leaves into three types of diseases (late blight, Septoria leaf spot, and mosaic virus), obtaining 97.39% accuracy.

  • references should be 2019-2021 Web of Science, about 50% or more; 30-40 at least;

Answer: Some references were updated without losing context in the document, and all were rewritten in the format used by the journal, giving a total of 41 references, of which 28 are Web of Science and 22 correspond to publication years between 2019-2021 (references [3-14,18,22,24-27,29-32]).

  • is there a possibility to use the proposed methods for other problems?;

Answer: The proposed method can be divided into three distinguishable modules: 1) segmentation, 2) feature extraction, and 3) classification. Regarding the segmentation module, some general algorithms were applied, but others were developed specifically for the problem addressed; furthermore, thresholding is proposed using values validated by a phytopathology expert. In conclusion, the general segmentation algorithms can be used in other works and, depending on the problem, may need to be adapted. The feature extraction module can be used in other works that require extracting characteristics once the area of interest has been segmented, while the classification module can be applied to any classification problem with supervised learning.
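As a rough illustration of how the segmentation module's color thresholding could be reused in another problem, the following sketch applies a hypothetical RGB threshold table. The bounds shown are invented for the example; the actual values validated by the phytopathology expert are the ones in the paper's tables.

```python
import numpy as np

# Hypothetical per-channel bounds for a "damaged tissue" color class.
# These numbers are illustrative only, NOT the paper's validated table.
LOWER = np.array([100, 40, 20])
UPPER = np.array([200, 120, 80])

def segment_by_color_table(rgb, lower, upper):
    # Mark every pixel whose R, G, and B values all fall inside the
    # threshold table's bounds; the mask is the segmented damaged area.
    return np.all((rgb >= lower) & (rgb <= upper), axis=-1)

# Tiny synthetic image: one "damaged" pixel among "healthy" green pixels.
img = np.zeros((2, 2, 3), dtype=np.uint8)
img[:, :] = [60, 180, 60]   # healthy green
img[0, 0] = [150, 80, 50]   # brownish lesion color
mask = segment_by_color_table(img, LOWER, UPPER)
```

Swapping in a different expert-validated table is all that would change for a different crop or disease set.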

  • for example analysis of thermal imaging, advantages, disadvantages, KNN, neural network etc.

    1)
    Fault diagnosis of electric impact drills using thermal imaging, Measurement, Volume 171, 2021,
    https://doi.org/10.1016/j.measurement.2020.108815

Answer: KNN and neural networks are widely used algorithms in the field of classification problems. Whereas KNN produces the output for an input value by computing the distance to each of the elements in the training set and taking the nearest ones, neural networks build a model using hidden layers and visible layers (an input layer and an output layer), starting from the assignment of initial weights to the input values when the network is trained. KNN is a supervised learning algorithm. Some advantages of KNN are that the algorithm is simple and yields competitive results, and that it can also be used for regression problems, that is, prediction problems. Some disadvantages are that KNN is slow compared with other classification models, since every time the output for an input value is required, the distance to each element of the training set must be computed; therefore, if the training set is very large, KNN will be very slow, and it becomes slower still if the feature vector has high dimensionality. KNN is also local, that is, it assumes that the class of a sample depends only on its k nearest neighbors. On the other hand, some advantages of neural networks are that they are self-organizing, that is, they organize what they have learned, and they are fault tolerant: if some part of the network fails, the rest will work normally. Neural networks can recognize input patterns that have not been learned, as long as they bear a certain similarity to what has been learned. One of their main disadvantages is the time it takes to train the network, which can be considerable. Another disadvantage is that learning for large tasks is complex; that is, the more things that need to be learned, the more difficult it is to train the neural network.
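A minimal sketch of the KNN behavior described above, showing the distance computation to every training sample that makes the method slow for large training sets (a toy example written for this reply, not code from the paper):

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    # Distance from x to EVERY training sample -- this full scan is
    # exactly why KNN slows down as the training set grows.
    dists = np.linalg.norm(X_train - x, axis=1)
    nearest = np.argsort(dists)[:k]
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]   # majority vote among the k neighbors

# Two toy clusters: class 0 near the origin, class 1 near (5, 5).
X = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1])
pred = knn_predict(X, y, np.array([0.4, 0.4]))
```

Nothing is learned in advance: all the cost is paid at query time, in contrast to a neural network, which pays its cost during training.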

Both KNN and neural networks can be applied to works such as the one cited, related to thermal images, where the objective is to detect faults in electric impact drills by means of thermal imaging. For this, a feature vector such as the one proposed in that work (BCAoID) is required in order to build the neural network or to use KNN. From there, the objective is to distinguish between faulty and non-faulty cases by means of thermal images, which is a classification problem.

 
