Monitoring Harmful Algal Blooms and Water Quality Using Sentinel-3 OLCI Satellite Imagery with Machine Learning
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
In this paper, water quality parameters such as chlorophyll-a (CHLA) were estimated from Sentinel-3 images using a machine learning approach and ground measurements. The paper must be revised carefully according to the following comments.
- Based on the data description, the study covers a period of more than 5 years. Were the 367 samples selected over this whole period? Do you think the trained model can predict the parameters for any given day?
- Please provide maps of the parameters for some days.
- The workflow mentions a K-fold test, but I did not see any results for it.
- I would like to see results of the proposed method when the training samples are changed randomly over different experiments.
- The description of RF is not adequate; please revise it.
- The proposed method must be compared with others in the text. For example, you can use other machine learning approaches to obtain results.
- At some points, a huge error is observed. Why? Please discuss it in the text.
- Also, please present raw Sentinel-3 images in the text.
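The requests above for K-fold results and for repeated experiments with randomly changed training samples could be reported along the following lines. This is only a minimal sketch, assuming scikit-learn and synthetic placeholder data: the sample count of 367 comes from the comment above, while the 21 input bands, the target formula, and all hyperparameters are illustrative assumptions and not the authors' setup.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RepeatedKFold, cross_val_score

# Synthetic stand-in for band reflectances and one water quality proxy.
# ASSUMPTION: 21 input bands and this target formula are placeholders only.
rng = np.random.default_rng(0)
n_samples, n_bands = 367, 21
X = rng.uniform(0.0, 0.1, size=(n_samples, n_bands))
y = 50.0 * X[:, 7] + 20.0 * X[:, 10] + rng.normal(0.0, 0.2, n_samples)

# 5-fold cross-validation repeated 3 times, each repeat with a new random split.
cv = RepeatedKFold(n_splits=5, n_repeats=3, random_state=0)
model = RandomForestRegressor(n_estimators=50, random_state=0)
r2_scores = cross_val_score(model, X, y, cv=cv, scoring="r2")

# The spread across folds and repeats shows how sensitive the model is
# to which samples end up in the training set.
print(f"R2 mean = {r2_scores.mean():.3f}, std = {r2_scores.std():.3f}")
```

Reporting the mean and standard deviation over all folds, rather than a single split, is one way to show that the result does not depend on a particular choice of training samples.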
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
Basic random forest models were used to monitor cyanobacterial harmful algal blooms in water resources using Sentinel-3 OLCI satellite imagery. Among the four proxies, chlorophyll-a as an indicator achieved the best results. Although the topic is interesting, the manuscript lacks significant novelty. Further revision is required.
Figure 5. How are the importance values obtained? Are they the coefficients of the model? Please clarify.
Figure 6. Which band did you select? What is the order of the bands? Is this order the same as in Figure 5? Please clarify. If not, I think ranking the bands based on their importance shown in Figure 5 would be better.
How did you obtain the R² and RMSE values in Figure 6? Are they based on the training set or the testing set? Please clarify.
There are two figures labeled Figure 6; please correct the numbering.
I think Section 4.2 should be included in the methods section.
For the second Figure 6, please clarify the values on the Y-axis and expand the caption to provide more details.
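For context on the Figure 5 and Figure 6 questions: a random forest has no coefficients, so importance values are usually either impurity-based or permutation-based, and R² and RMSE differ depending on whether they are computed on training or testing data. The sketch below illustrates both points. It assumes scikit-learn and synthetic data in which one band (index 7, chosen arbitrarily) drives the target; it does not reproduce how the manuscript's values were obtained.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# ASSUMPTION: synthetic data, 21 hypothetical bands, band index 7 is informative.
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 0.1, size=(367, 21))
y = 50.0 * X[:, 7] + rng.normal(0.0, 0.1, 367)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1
)
rf = RandomForestRegressor(n_estimators=100, random_state=1).fit(X_train, y_train)

# Impurity-based importance: derived from the training data, normalized to sum to 1.
impurity_importance = rf.feature_importances_

# Permutation importance: drop in score when one band is shuffled on held-out data.
perm = permutation_importance(rf, X_test, y_test, n_repeats=5, random_state=1)

def evaluate(X_part, y_part):
    """Return (R2, RMSE) of the fitted forest on the given subset."""
    pred = rf.predict(X_part)
    return r2_score(y_part, pred), float(np.sqrt(mean_squared_error(y_part, pred)))

train_r2, train_rmse = evaluate(X_train, y_train)
test_r2, test_rmse = evaluate(X_test, y_test)

# Rank bands by importance, most important first.
ranking = np.argsort(impurity_importance)[::-1]
print("Top band index:", int(ranking[0]))
print(f"train R2={train_r2:.3f} RMSE={train_rmse:.3f}")
print(f"test  R2={test_r2:.3f} RMSE={test_rmse:.3f}")
```

Stating which of the two importance measures was used, and labeling each metric as training or testing, would answer the clarification requests directly.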
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 3 Report
Comments and Suggestions for Authors
General comment
The manuscript evaluates Sentinel-3 OLCI images for predicting algal bloom proxies in western Lake Erie using machine learning. Four RF models were trained, one each for Chlorophyll-a, Microcystin, Phycocyanin, and Secchi depth. Although the manuscript merits publication, some issues need to be fixed, and the methods should be explained in more detail. My detailed comments follow.
Specific comment
1. How were the water quality data obtained from the in-situ stations matched in space and time with the OLCI band reflectance data used for training? Is the time lag within plus or minus 3 hours?
2. Among the many machine learning models, why did you choose the random forest model for training? Why not train multiple models and ultimately choose the one that performs best?
3. Please describe in detail which parameters are input to model training and how they were chosen.
4. There should be interaction between the Model Prediction and Validation and Model Assessment parts in Figure 2. The model performance can be optimized by adjusting the hyperparameter settings to obtain the best model. In its current form, predictive models are generated directly.
5. The relative error in Figure 4 should be written inside each plot, and each subfigure should be square and marked with a 1:1 line.
6. Figure 7 and related descriptions need to clearly indicate the significance of the regression.
7. Line 350: Figure 8 not found
8. Lines 418-422: Secchi depth depends on many factors and cannot be attributed only to the magnitude of algal blooms. For example, the turbidity of the lake water and the concentration of organic matter also affect Secchi depth.
9. The first half of the article emphasizes the importance of machine learning. It is recommended to include a discussion of the application of machine learning in this work in the discussion section.
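The match-up question in comment 1 can be made concrete with a small sketch. It assumes pandas and entirely hypothetical sample times, overpass times, and scene identifiers; the 3-hour tolerance is the window mentioned in the comment, not a confirmed criterion from the manuscript.

```python
import pandas as pd

# ASSUMPTION: hypothetical in-situ samples and satellite overpasses.
insitu = pd.DataFrame({
    "time": pd.to_datetime(
        ["2019-07-01 14:00", "2019-07-02 09:00", "2019-07-03 17:00"]
    ),
    "chla": [12.0, 30.5, 8.2],
})
overpass = pd.DataFrame({
    "time": pd.to_datetime(
        ["2019-07-01 15:40", "2019-07-02 15:35", "2019-07-03 15:45"]
    ),
    "scene_id": ["A", "B", "C"],
})
# Keep the overpass time as its own column so the lag can be inspected later.
overpass["overpass_time"] = overpass["time"]

# Pair each sample with the nearest overpass, but only within +/- 3 hours.
# Both frames must be sorted by the key column, which they are here.
pairs = pd.merge_asof(
    insitu, overpass, on="time",
    direction="nearest", tolerance=pd.Timedelta(hours=3),
)

# Samples without an overpass inside the window get missing values and are dropped.
matched = pairs.dropna(subset=["scene_id"]).copy()
matched["lag_hours"] = (
    (matched["overpass_time"] - matched["time"]).dt.total_seconds() / 3600.0
)
print(matched[["time", "chla", "scene_id", "lag_hours"]])
```

In this toy example the second sample is more than six hours from the nearest overpass and is excluded, which is the kind of filtering a stated time window implies.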
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 4 Report
Comments and Suggestions for Authors
Based on Sentinel-3 OLCI data and a machine learning algorithm, four water quality indices for the western Lake Erie basin (Chlorophyll-a, Microcystin, Phycocyanin, and Secchi depth) are retrieved, the correlation among the four indices is analyzed, and the inversion accuracy of various models for different water quality parameters is discussed.
The data are reliable, the discussion is clear, and the results are reliable.
The main amendments are suggested as follows:
1) There are many machine learning algorithms besides random forest, such as support vector machines and XGBoost. It is suggested to add a review of the research status of these methods and a comparison of the accuracy with which different algorithms retrieve the water quality parameters;
2) The results and discussion sections are very similar; it is suggested to state the research results and the conclusions more clearly and separately.
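The comparison suggested in comment 1) might look like the sketch below. It assumes scikit-learn and synthetic placeholder data; `GradientBoostingRegressor` is used here as a stand-in for XGBoost to keep the example dependency-free, and none of the models or settings are taken from the manuscript.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# ASSUMPTION: synthetic reflectances (21 hypothetical bands) and target.
rng = np.random.default_rng(2)
X = rng.uniform(0.0, 0.1, size=(367, 21))
y = 50.0 * X[:, 7] + 20.0 * X[:, 10] + rng.normal(0.0, 0.2, 367)

models = {
    "random_forest": RandomForestRegressor(n_estimators=50, random_state=2),
    # SVR is sensitive to feature scale, so it is wrapped with a scaler.
    "svr": make_pipeline(StandardScaler(), SVR(C=10.0)),
    "gradient_boosting": GradientBoostingRegressor(random_state=2),
}

# Use the same folds for every model so the comparison is like for like.
cv = KFold(n_splits=5, shuffle=True, random_state=2)
results = {}
for name, model in models.items():
    neg_rmse = cross_val_score(
        model, X, y, cv=cv, scoring="neg_root_mean_squared_error"
    )
    results[name] = float(-neg_rmse.mean())  # flip sign back to RMSE

for name, rmse in sorted(results.items(), key=lambda item: item[1]):
    print(f"{name}: cross-validated RMSE = {rmse:.3f}")
```

Evaluating all candidates on identical folds keeps differences in the score attributable to the model rather than to the split.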
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
Accept as it is.
Reviewer 2 Report
Comments and Suggestions for Authors
All comments have been resolved.
Reviewer 3 Report
Comments and Suggestions for Authors
My previous question has been answered.