Article
Peer-Review Record

Detection of Aquatic Invasive Plants in Wetlands of the Upper Mississippi River from UAV Imagery Using Transfer Learning

by Gargi Chaudhuri and Niti B. Mishra
Remote Sens. 2023, 15(3), 734; https://doi.org/10.3390/rs15030734
Submission received: 30 November 2022 / Revised: 16 January 2023 / Accepted: 21 January 2023 / Published: 27 January 2023

Round 1

Reviewer 1 Report

Dear authors,

Upon a careful reading and evaluation of your manuscript, I’m recommending it for a major revision. This is an interesting topic, but I detect some deficiencies in its sections. In this regard, I have detailed my suggestions. I hope these comments are useful to you.

1. The introduction is fine, but I think it will read better if you combine sections 2 and 3. Also, the DL part (section 3) should be introduced first, before the transfer learning description.

2. In Figure 1, is the Purple Loosestrife shown in the imagery colored pixels from the labeling process, or actual flowers? It gives the impression that these plants are dissonant from the rest of the scene, and one could argue that a simple pixel-based analysis would be sufficient to identify them.

3. In the Accuracy Assessment section you described that only IoU was used to measure your accuracy, but why not implement other metrics such as F1-score, recall, precision, and global accuracy? In your introduction, you even used such metrics to describe the results from other studies, so I don't understand why you didn't use them in yours. Also, it is difficult to compare and discuss these results without more metrics. Please consider adding them and improving your table descriptions, which are lacking information. Based on the qualitative view from Figures 7 and 9, the network segmentations appear poorly optimized.

 

4. The results, in general, are below what is expected from DL segmentation models, even for a supposedly difficult task. While you detailed the challenges posed by the characteristics of the areas, were additional tests with other types of networks not conducted? For instance, one could argue that vision transformer-based networks could provide better results (SegFormer, TransUNet, etc.). Why not use or test them? Don't you think that your discussion section could be improved if you detailed these perspectives, even if it's for future investigations?

5. In the general sense of this manuscript, the English language needs to be improved; there are some grammar and sentence structure errors in the text. I advise a careful examination in a subsequent read. Overall, the quality of the manuscript is good.

Author Response

We thank both reviewers for their positive reviews of our manuscript. We highly appreciate Reviewer#1's detailed comments, which helped improve the manuscript. Listed below are the overall changes in the revised version:

  • We thoroughly edited the manuscript to improve the language and readability.
  • We rearranged Sections 2 and 3 as suggested by Reviewer#1.
  • For accuracy assessment, we added the F1-score as suggested by Reviewer#1.
  • We replaced Figure 6 of the original manuscript (a bar graph showing training and testing accuracy) with a table showing the training and testing mean IoU values and F1-scores. We believe that reporting the values in table format makes the results more discernible than the bar graph format. The reported metrics were rounded to two decimal places to fit the table within the page margin.
  • For the sake of readability, only the new additions and updates are shown with highlighting (track changes); deleted text is not shown.

Our responses to Reviewer#1's specific comments are listed below:

  1. The introduction is fine, but I think it will read better if you combine sections 2 and 3. Also, the DL part (section 3) should be introduced first, before the transfer learning description.

Changed

  2. In Figure 1, is the Purple Loosestrife shown in the imagery colored pixels from the labeling process, or actual flowers? It gives the impression that these plants are dissonant from the rest of the scene, and one could argue that a simple pixel-based analysis would be sufficient to identify them.

Yes, they are colored polygons from the labeling process overlaid on the imagery. We added further clarification in the manuscript. As mentioned in Section 2, our early testing was based on pixel-based classification, which did not provide good results.

  3. In the Accuracy Assessment section you described that only IoU was used to measure your accuracy, but why not implement other metrics such as F1-score, recall, precision, and global accuracy? In your introduction, you even used such metrics to describe the results from other studies, so I don't understand why you didn't use them in yours. Also, it is difficult to compare and discuss these results without more metrics. Please consider adding them and improving your table descriptions, which are lacking information. Based on the qualitative view from Figures 7 and 9, the network segmentations appear poorly optimized.

We have added the F1-score to the accuracy assessment, and more details about the accuracy metrics used have been added to the manuscript. As the results show, the F1-score yields higher values than the Intersection over Union (IoU) metric. The two metrics are closely related and lead to the same inferences: if the F1-score says model A is better than model B, the IoU will say the same. However, the exact F1-score values are higher than the IoU values because the IoU penalizes single instances of bad classification in an image more heavily than the F1-score. Therefore, when averaging over all images, the mean IoU is pulled toward the worst scores, while the F1-score stays closer to the average scores. The choice of metric was based on what has predominantly been used in the discipline. We agree that most papers use the F1-score or some variation of it, but some use the IoU as well. Moreover, the IoU is very popular in object detection and is used in almost all types of image classification, including biomedical imagery, photographs, and video segments, not just remote sensing. In our work, we tested different metrics that are suitable for binary classification and widely used in the deep learning literature, and the results were very similar. We therefore initially chose to report only the IoU because it gives the most conservative accuracy values.
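To make the comparison concrete, the following minimal sketch (illustrative only; the synthetic masks and the single deliberately bad prediction are assumptions, not data from the study) shows that for binary masks the two metrics satisfy F1 = 2·IoU/(1 + IoU) per image, so the F1-score is never lower than the IoU, and that one poorly segmented image pulls the mean IoU down more than the mean F1-score:

```python
import numpy as np

def iou(pred, truth):
    """Intersection over Union for a pair of binary masks."""
    inter = np.logical_and(pred, truth).sum()
    union = np.logical_or(pred, truth).sum()
    return inter / union if union else 1.0

def f1(pred, truth):
    """F1 (Dice) score for a pair of binary masks."""
    inter = np.logical_and(pred, truth).sum()
    total = pred.sum() + truth.sum()
    return 2 * inter / total if total else 1.0

rng = np.random.default_rng(0)
truth = rng.random((10, 64, 64)) > 0.5        # synthetic ground-truth masks
preds = truth.copy()
preds[:9] ^= rng.random((9, 64, 64)) > 0.9    # nine good predictions (~10% pixel noise)
preds[9] = rng.random((64, 64)) > 0.5         # one bad prediction (random mask)

ious = np.array([iou(p, t) for p, t in zip(preds, truth)])
f1s = np.array([f1(p, t) for p, t in zip(preds, truth)])

# Per image the two metrics are monotonically related: F1 = 2*IoU / (1 + IoU) >= IoU.
assert np.allclose(f1s, 2 * ious / (1 + ious))
# Averaged over images, the single bad mask lowers the mean IoU more than the mean F1.
print(f"mean IoU = {ious.mean():.2f}, mean F1 = {f1s.mean():.2f}")
```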

 

  4. The results, in general, are below what is expected from DL segmentation models, even for a supposedly difficult task. While you detailed the challenges posed by the characteristics of the areas, were additional tests with other types of networks not conducted? For instance, one could argue that vision transformer-based networks could provide better results (SegFormer, TransUNet, etc.). Why not use or test them? Don't you think that your discussion section could be improved if you detailed these perspectives, even if it's for future investigations?

As mentioned above, the IoU values are lower than the F1-scores reported in other literature; we believe that adding the F1-score makes our results more comparable to the existing literature. In addition to UNet and LinkNet, we did test Keras-based models such as VGG19, PSPNet, and FPN, and the other models did not generate better results. Different types of models have been developed since we started this project, but we focused on existing models that had previously been used successfully for invasive species mapping. We agree that testing more models would probably generate a deeper understanding of the task, but that is beyond the scope of this paper. Therefore, without explicit knowledge of the performance of untested models such as SegFormer or TransUNet, we would prefer not to speculate on how much better or worse those models would have been for the given task.
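For context on what such a transfer-learning comparison can look like, here is a rough sketch using the open-source segmentation_models Keras package (whether the authors used this particular package, the vgg19 backbone, or these settings is an assumption for illustration only, not the paper's exact configuration):

```python
import segmentation_models as sm

sm.set_framework('tf.keras')

def build(arch, backbone='vgg19'):
    """Binary segmentation model with an ImageNet-pretrained encoder."""
    return arch(
        backbone,
        encoder_weights='imagenet',  # transfer learning: reuse pretrained encoder weights
        encoder_freeze=True,         # train only the decoder first, fine-tune later
        classes=1,                   # single class: invasive plant vs. background
        activation='sigmoid',
    )

for arch in (sm.Unet, sm.Linknet, sm.FPN):
    model = build(arch)
    model.compile(
        'Adam',
        loss=sm.losses.bce_jaccard_loss,
        metrics=[sm.metrics.IOUScore(threshold=0.5), sm.metrics.FScore(threshold=0.5)],
    )
    # model.fit(train_images, train_masks, validation_data=(val_images, val_masks), ...)
```

Comparing several architectures over the same frozen, pretrained encoder is one common way to check whether the bottleneck lies in the decoder design or in the data itself.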

 

  5. In the general sense of this manuscript, the English language needs to be improved; there are some grammar and sentence structure errors in the text. I advise a careful examination in a subsequent read. Overall, the quality of the manuscript is good.

We revised the document thoroughly. We highly appreciate your feedback.

Author Response File: Author Response.pdf

Reviewer 2 Report

Detection of aquatic invasive plants in wetland from UAV imagery using transfer learning

 

1. The abstract is understandable and clear.

2. The introduction is very detailed. More than 30 sources are cited.

3. The organization and presentation of the manuscript are correct and well understandable.

4. I have no further comments. I recommend publishing the article in its present form.

Author Response

We thank both reviewers for their positive reviews of our manuscript. We highly appreciate Reviewer#1's detailed comments, which helped improve the manuscript. Listed below are the overall changes in the revised version:

  • We thoroughly edited the manuscript to improve the language and readability.
  • We rearranged Sections 2 and 3 as suggested by Reviewer#1.
  • For accuracy assessment, we added the F1-score as suggested by Reviewer#1.
  • We replaced Figure 6 of the original manuscript (a bar graph showing training and testing accuracy) with a table showing the training and testing mean IoU values and F1-scores. We believe that reporting the values in table format makes the results more discernible than the bar graph format. The reported metrics were rounded to two decimal places to fit the table within the page margin.
  • For the sake of readability, only the new additions and updates are shown in red font (track changes); deleted text is not shown.


Round 2

Reviewer 1 Report

Dear authors,

I'm satisfied with your response to my inquiries and I believe that the manuscript is now suitable for publication.
