Article
Peer-Review Record

Classification of Lakebed Geologic Substrate in Autonomously Collected Benthic Imagery Using Machine Learning

Remote Sens. 2024, 16(7), 1264; https://doi.org/10.3390/rs16071264
by Joseph K. Geisz 1, Phillipe A. Wernette 1,* and Peter C. Esselman 2
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3:
Submission received: 6 September 2023 / Revised: 13 March 2024 / Accepted: 27 March 2024 / Published: 3 April 2024
(This article belongs to the Topic Geocomputation and Artificial Intelligence for Mapping)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The manuscript is devoted to the study of bottom sediments using uninhabited underwater vehicles by two methods. The methods are compared and the advantages of the DNN method are shown. The work is solid, but a few refinements are needed:

1. The abstract should be shortened slightly by removing the introductory phrases. It is better to focus on what is done in the manuscript.

2. The model has been tested on only one lake. How can it be applied to other reservoirs with different bottom parameters?

3. Uninhabited underwater vehicles cannot always be used to study bottom sediments. Have the data from these vehicles been compared with echo sounder data?

4. To compare the data of the two models, it is desirable to provide more visual illustrations.

5. In the conclusion, it would be better to add a few words summarizing what has been done in the article.

Author Response

Please see the attachment

 

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

This manuscript describes an interesting approach for classifying benthic substrate from AUV imagery. The approach is very thorough and demonstrates the accuracy of different machine learning approaches. I have a few suggestions to improve this manuscript:

1)      I suggest including context in the introduction on why RFs and DNNs were selected. It is not obvious why these specific ML approaches were chosen for this study.

2)      I think it would be useful to discuss why it is beneficial to have a high-resolution benthic substrate dataset (i.e., what are the management applications).

3)      It may be worthwhile to have a supplemental table of dates/locations for different data collection missions used for this study.

4)      Can you also include how long it took to train/classify the 6- and 2-class models, perhaps in an existing table?

5)      Is there any evidence that Lake Michigan grain size is skewed towards coarser/finer sized particles? I would imagine this would influence the number of images in each class.

Author Response

Please see the attachment

 

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

The article applies machine learning to AUV images of the lakebed of Lake Michigan to classify the benthic abiotic substrate. The manuscript is very well written and clearly demonstrates the advantage of ML over manual classification.

 

The article has a major methodological weakness that should be corrected before publication: the training did not take into account the different scales of the images. This could easily be remedied by using the distance as an additional training parameter. As the classes are defined according to the size of the structures, the scale is an indispensable parameter. Only if the AUV had flown at a constant distance, as one would conclude from line 155 ("target flying altitude of 1.75 m above the lakebed"), would the scale be irrelevant. However, the distance between AUV and lakebed varied by an order of magnitude, from 0.51 to 4.9 m (lines 338f). This variable distance should be mentioned and explained in the Materials and Methods section rather than the irrelevant target altitude.

Author Response

Please see the attachment

 

Author Response File: Author Response.pdf

Round 2

Reviewer 3 Report

Comments and Suggestions for Authors

I had recommended major revisions of the submitted manuscript due to a major methodological weakness: training and application of the machine learning methods did not take into account the different scales of the images, even though many of the classes to be distinguished are primarily defined by the particle size. This weakness was clear to the authors but was postponed to future work: “Future work should (1) incorporate altitude and/or imaged area into the model” (line 836 of the submitted manuscript). In the revised manuscript, this brief text in the outlook chapter was extended slightly. However, in my opinion it is necessary to include the distance as an additional training parameter already in this paper and not postpone it to future work. Because the reason for recommending major revisions was not addressed substantially, I still cannot recommend publication of the paper.

Furthermore, the following recommendation was not addressed in the authors’ reply at all: “the distance between AUV and lakebed was changing by an order of magnitude from 0.51 to 4.9 m (lines 338f). This variable distance should be mentioned and explained in the Materials and Methods section rather than the irrelevant target altitude.” The authors should delete the misleading sentence in the Materials and Methods section “target flying altitude of 1.75 m above the lakebed” (line 159) and replace it with a description of how the flying altitude was set and how variable it was. Just mentioning an irrelevant target altitude can easily lead to the wrong conclusion that scale is irrelevant, because a constant altitude of 1.75 m seems to mean a constant distance of the AUV to the ground.

The authors write in their response that they ran tests including the distance from the bottom in the RF models, and that these did not improve any model’s performance but instead made all of these models significantly less accurate. These tests are exactly what I had recommended in my review, but they are not mentioned in either the original or revised manuscript. I would really encourage the authors to include them in the paper and not just briefly mention them in their response to my review, even if they are not yet satisfied with the results. Negative results are as important as positive results! And they justify and motivate the described future work.

Author Response

Thank you very much for taking the time to review this manuscript. Because the reviewer comments were all related, we chose to respond to the reviewer comment as a single block. Although the reviewer comments required very extensive re-analysis, including over 30 additional machine learning models to be trained, we believe it has significantly improved the manuscript. To reduce manuscript length and confusion, we chose to include three new tables in an Appendix section. These are all linked in the manuscript, but to expand on them in detail would add significant length to the manuscript and significantly decrease clarity in the text. Please find the detailed responses below and the corresponding revisions/corrections highlighted/in track changes in the re-submitted files.

We have extensively revised the entire analysis to try and account for and/or limit the influence of scale in the ML models. Because the AUV altitude at the time of image acquisition directly affects the ground sample resolution (GSR) and total image footprint of an individual image, we manually cropped the original larger dataset to two different altitude ranges: (a) 1.25 m to 3.00 m and (b) 1.60 m to 2.10 m. These altitude ranges were determined by plotting the histogram of all AUV image altitude values and then (a) cropping only extreme tailing values above or below the curve, and (b) cropping the histogram tails more tightly to include only the central portion of the dataset. Table A1 presents the number of images in each class present in the full dataset as well as these two cropped datasets, and Table A2 presents the corresponding calculated GSR values for the programmed AUV altitude, full dataset, and both cropped datasets. Table 6 was updated to include model results trained on the dataset cropped to 1.60 m to 2.10 m.
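
The altitude cropping described above can be illustrated with a minimal sketch. The record layout, the field name `altitude_m`, the sample values, and the choice of inclusive bounds are all assumptions made for illustration; they are not taken from the manuscript or its dataset.

```python
# Hypothetical image records: names and altitudes are illustrative only.
images = [
    {"name": "img_a", "altitude_m": 0.51},
    {"name": "img_b", "altitude_m": 1.30},
    {"name": "img_c", "altitude_m": 1.60},
    {"name": "img_d", "altitude_m": 1.88},
    {"name": "img_e", "altitude_m": 2.10},
    {"name": "img_f", "altitude_m": 2.75},
    {"name": "img_g", "altitude_m": 4.90},
]


def crop_by_altitude(records, low_m, high_m):
    """Keep records whose altitude lies in [low_m, high_m].

    Inclusive bounds are an assumption; the response does not say
    whether the thresholds themselves are included.
    """
    return [r for r in records if low_m <= r["altitude_m"] <= high_m]


# The two altitude ranges named in the response.
wide_crop = crop_by_altitude(images, 1.25, 3.00)
tight_crop = crop_by_altitude(images, 1.60, 2.10)

print(len(images), len(wide_crop), len(tight_crop))
```

Because the tight range lies inside the wide one, every image in the tight crop also appears in the wide crop, which is why the three datasets can be compared as nested subsets.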

 

The calculated GSR values in Table A2 support using the 1.60 m to 2.10 m cropped dataset as one approach to significantly limit the influence of AUV altitude and scale on the models. Cropping the dataset to images between 1.60 m and 2.10 m from the lakebed reduces the GSR range from 0.139-1.339 mm (full dataset) to only 0.436-0.573 mm (1.60 m – 2.10 m dataset). The full-dataset GSR range (1.200 mm wide) is therefore about 876% of the cropped range (0.137 mm wide), which corresponds to a reduction of roughly 89% and supports using the 1.60 m to 2.10 m dataset for the main analysis throughout the manuscript. All values and tables in the manuscript text have been updated to include results from the RF and DNN models trained on the 1.60 m – 2.10 m dataset.
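
The arithmetic behind this GSR comparison can be checked with a short sketch that uses only the range endpoints quoted above. It reports both the ratio of the two range widths and the percent reduction, since the two figures are easy to conflate. The variable names are illustrative.

```python
# GSR range endpoints in mm, as quoted in the response.
full_gsr_mm = (0.139, 1.339)      # full dataset
cropped_gsr_mm = (0.436, 0.573)   # 1.60 m to 2.10 m dataset

# Width of each range.
full_width = full_gsr_mm[1] - full_gsr_mm[0]
cropped_width = cropped_gsr_mm[1] - cropped_gsr_mm[0]

# How many times wider the full range is than the cropped range.
width_ratio = full_width / cropped_width

# Percent by which cropping narrows the range.
reduction_pct = (1.0 - cropped_width / full_width) * 100.0

print(f"full width:    {full_width:.3f} mm")
print(f"cropped width: {cropped_width:.3f} mm")
print(f"ratio:         {width_ratio:.2f}x")
print(f"reduction:     {reduction_pct:.1f}%")
```

A percent reduction of a positive quantity cannot exceed 100%, so a figure near 876% is best read as the ratio of the two widths (about 8.76 times) rather than as a reduction.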

 

Regarding scale and, more specifically, altitude, we also trained independent models that explicitly included altitude as a variable. Appendix Table A3 presents the complete set of model results for all models with and without altitude for both types of ML models and each of the three dataset crops. This resulted in a total of 30 new models for the revised analysis. Of these 30 new models, 18 of them directly included image altitude in the model. Analysis of Table A3 suggests that neither the RF nor DNN models were improved by explicitly including altitude as an input variable. In fact, the DNN models were significantly worse when altitude was included, suggesting that this new input information only added confusion to the model.
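
One simple way to build the "with altitude" and "without altitude" model inputs for a tabular model such as an RF is to append altitude as an extra column. The sketch below shows only that input-construction step. The function name, the feature values, and the idea that altitude is appended as a single raw column are assumptions for illustration; the response does not describe how altitude was actually supplied to either model type.

```python
def build_inputs(feature_rows, altitudes_m, include_altitude):
    """Return model input rows, optionally with altitude appended.

    feature_rows: per-image feature vectors (hypothetical values here).
    altitudes_m:  per-image altitude above the lakebed, in metres.
    """
    if len(feature_rows) != len(altitudes_m):
        raise ValueError("one altitude is required per feature row")
    if not include_altitude:
        return [list(row) for row in feature_rows]
    return [list(row) + [alt] for row, alt in zip(feature_rows, altitudes_m)]


# Two hypothetical images with three illustrative features each.
features = [[0.2, 0.5, 0.1], [0.7, 0.3, 0.9]]
altitudes = [1.62, 2.05]

without_alt = build_inputs(features, altitudes, include_altitude=False)
with_alt = build_inputs(features, altitudes, include_altitude=True)

print(without_alt)
print(with_alt)
```

Training the same model type on both input sets, with everything else held fixed, isolates the contribution of altitude, which is the comparison Table A3 is described as reporting.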

 

Table A2 supports the use of the 1.60 m to 2.10 m cropped dataset, and Table A3 supports not explicitly including altitude in the RF or DNN models.

 

The manuscript text has been updated to improve clarity regarding the points above. The last paragraph in Section 2.1.3 has been updated to clarify the concern regarding AUV image altitude and its effect on GSR:

“Although the AUV was programmed to travel 1.75 m above the lakebed, the actual altitude varied because of the onboard altitude sensor. As a result, altitude varied by image. To minimize variation in the ground sample resolution (GSR) from one image to another we subsampled a larger labelled dataset of 7282 images to include 4956 images with an altitude between 1.60 m and 2.10 m (Table A1). These thresholds were selected based on the histograms of the altitude values for all images to limit the outside influence of image scaling. The mean altitude of this subset was 1.88±0.13 m. Since image altitude varied, the GSR of a pixel ranged from 0.44 mm to 0.57 mm per image, where images taken farther from the lakebed had a coarser GSR than images taken closer to the lakebed. The GSR of an image at the average altitude of 1.88 m was 0.51 mm. See Table A2 for GSR of images by altitude.”

 

The third paragraph in Section 2.2 has been updated to clarify that RF and DNN models were trained with and without explicitly including altitude as an input variable:

“To explore issues of scale and the importance of the image altitude from the lakebed, two variations of RF and DNN models were trained. One set of RF and DNN models was trained without including image altitude as a model input, while the second set of RF and DNN models did include this value explicitly for every image.”

 

Furthermore, a new paragraph was added to the beginning of Section 3. Results to clarify that (a) explicitly including altitude did not improve the ML models, and (b) the remainder of the manuscript is focused on results using the 1.60 m – 2.10 m cropped dataset:

“Preliminary comparison of the RF and DNN models with and without image altitude suggests that RF models were the same when image altitude was included, while DNN models were significantly less accurate when they included image altitude (Table A3). As a result, the remainder of this manuscript will focus on RF and DNN models that did not utilize image altitude as a model input and were trained on the Geisz, et al. (2024) dataset [32] cropped to images acquired between 1.6 m and 2.1 m above the lakebed.”

 

The fourth paragraph of Section 4.4. Model Comparison was expanded to elaborate on results indicating that model performance was either not affected or adversely affected when altitude was included in the ML models:

“However, when AUV altitude was explicitly included in the RF or DNN models, none of the models improved significantly. The DNN models with altitude even displayed a significant decrease in accuracy compared to DNN models without altitude. Cropping the dataset to include only AUV images acquired between 1.60 m and 2.10 m altitude above the lakebed substantially reduced the altitudinal variation and variability in GSR between images, thereby minimizing the direct effect of varying image altitude on the models and classification results.”

 

Author Response File: Author Response.docx
