Article
Peer-Review Record

Surface Water Quality Assessment through Remote Sensing Based on the Box–Cox Transformation and Linear Regression

Water 2023, 15(14), 2606; https://doi.org/10.3390/w15142606
by Juan G. Loaiza 1, Jesús Gabriel Rangel-Peraza 1,*, Sergio Alberto Monjardín-Armenta 2, Yaneth A. Bustos-Terrones 3, Erick R. Bandala 4, Antonio J. Sanhouse-García 1 and Sergio A. Rentería-Guevara 5
Reviewer 1:
Reviewer 3:
Reviewer 4: Anonymous
Submission received: 15 June 2023 / Revised: 6 July 2023 / Accepted: 12 July 2023 / Published: 18 July 2023
(This article belongs to the Special Issue Remote Sensing-Based Study on Surface Water Environment)

Round 1

Reviewer 1 Report

Line 181 The DOS method does not generate, it only assumes that the reflectance of dark objects includes a substantial component of atmospheric scattering. Thus the reflectance of a dark object, such as a deep lake, is measured and that value must be subtracted from the image.
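
The subtraction described in this comment can be sketched in a few lines of Python. This is only an illustration: it assumes a band is available as a plain list of reflectance values, and the function name `dark_object_subtract` and the sample values are hypothetical, not taken from the manuscript.

```python
def dark_object_subtract(band, dark_value=None):
    """Subtract the dark-object value from every pixel of one band.

    band: list of reflectance (or radiance) values for a single band.
    dark_value: value measured over a dark object such as a deep lake;
                if omitted, the band minimum is used (an assumption).
    """
    if dark_value is None:
        dark_value = min(band)
    # Clip at zero so that noise cannot produce negative reflectance.
    return [max(v - dark_value, 0.0) for v in band]


# Hypothetical band values; 0.02 plays the role of the dark-object pixel.
corrected = dark_object_subtract([0.02, 0.05, 0.11, 0.30])
```

The method estimates the atmospheric scattering contribution from the dark object and removes it; it does not generate any new quantity.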

 

Line 185 Start sentence with capital letter

 

Line 193 A reference is suggested that supports the mentioned data

 

Line 194 Review the units since TOA's dimensional analysis does not match what is reported here or in the reference article.

 

Line 218 The statement is not entirely accurate. Data normalization involves transforming the data to a standardized format, but it is not necessarily about removing the effects of external influences on a time series. The primary goal of data normalization is to modify the values in the data set to a common scale, without distorting differences in value ranges or losing information. This is usually done to prevent certain features from dominating others due to their scale.

In the context of a time series, normalization can be used to make data more comparable over time (for example, adjusting for inflation when looking at the price of a product over several years). However, it is not specifically about controlling external influences. It would rather be "deseasonalization" or "seasonal adjustment", which are methods used in the analysis of time series to eliminate patterns that are not related to the phenomenon studied.

So, to correct the statement, I suggest:

"The main goal of data normalization is to adjust values to a common scale, achieving a standardized data format, which may facilitate comparison and analysis."
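
As a minimal illustration of normalization in this sense (bringing values to a common scale), the following Python sketch shows min-max scaling and z-score standardization. The function names and the sample values are hypothetical and are not the authors' data.

```python
from statistics import mean, stdev


def min_max_scale(values):
    """Rescale values linearly to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant series carries no scale information.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def z_score(values):
    """Center values on the mean and divide by the sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


# Hypothetical measurements on very different scales.
tds = [120.0, 180.0, 240.0]
chl_a = [0.3, 1.85, 3.4]
tds_scaled = min_max_scale(tds)
chl_scaled = min_max_scale(chl_a)
tds_z = z_score(tds)
```

After scaling, both variables lie in the same range, so neither dominates because of its units.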

 

Line 242 What about the residual or error term? The model does not include ε_i, the i-th independent, identically distributed normal error.

 

Furthermore, it should be noted that the model holds for each observation i = 1, 2, 3, ..., n.
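
Written out with the error term and the observation index, a linear model of this kind would read as follows. This is a sketch in standard notation; the number of predictors p and the variance σ² are symbols introduced here, not the manuscript's.

```latex
y_i = b_0 + b_1 x_{i1} + b_2 x_{i2} + \dots + b_p x_{ip} + \varepsilon_i,
\qquad \varepsilon_i \overset{\text{i.i.d.}}{\sim} \mathcal{N}(0, \sigma^2),
\qquad i = 1, 2, \dots, n
```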

 

Line 244 The format of b_0 is incorrect

 

Line 262. There is a bit of confusion in the formulation. When testing the significance of a coefficient (such as b_1), the model does not remove that variable (in this case x_1). Instead, it tests whether b_1 differs significantly from zero. If the model were to remove x_1 as you have described, it would be testing a different hypothesis.

So, to correct the statement:

"For the model y_i = b_0 + b_1x_1 + b_2x_2 + ... + b_ix_i, if the test is performed for b_1, the significance of the variable x_1 is evaluated while controlling for the presence of the variables x_2, ..., x_i."

This means that we are testing whether b_1 differs significantly from zero, giving us tests of the significance of the variable x_1 in the presence of all other variables.
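
In standard notation, the test described here can be summarized as below. The symbols n (number of observations) and p (number of predictors) are assumptions introduced for this sketch.

```latex
H_0 : b_1 = 0 \quad \text{versus} \quad H_1 : b_1 \neq 0,
\qquad
t = \frac{\hat{b}_1}{\operatorname{SE}(\hat{b}_1)} \sim t_{n-p-1} \ \text{under } H_0
```

The variable x_1 stays in the model; only the hypothesis about its coefficient is tested.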

 

Line 297 “particule” instead of “particulate”

 

Line 303 Incorrect mg/m3 format

 

Line 304 Incorrect mg/m3 format

 

Line 325 When affirming that there is a good adjustment, it is recommended to indicate with respect to which reference

 

Line 329 Here it is noted that they applied a different formula for Box-Cox than the one reported in equation 3. They do not give the values for the λ's either. That is, it is not very clear how they applied the Box-Cox transformation. It is also not clear how they obtained the values 4.57379, 97.6453 and 1.4358 that multiply λ in the denominator of the y_i formula for TOC, TDS and Chl-a, respectively, nor the exponents to which these numbers are raised.
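
For reference, the following Python sketch contrasts the standard Box-Cox transformation with a geometric-mean-scaled variant, in which a constant raised to the power (λ - 1) multiplies λ in the denominator. Whether the constants quoted above are geometric means of the respective samples is an assumption that only the authors can confirm. The function names and sample values are hypothetical.

```python
import math


def box_cox(y, lam):
    """Standard Box-Cox transform of a single positive value y."""
    if y <= 0:
        raise ValueError("Box-Cox requires positive data")
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam


def geometric_mean(values):
    """Geometric mean of positive values, computed through logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def box_cox_scaled(y, lam, geo_mean):
    """Scaled variant: geo_mean ** (lam - 1) multiplies lam in the denominator."""
    if y <= 0:
        raise ValueError("Box-Cox requires positive data")
    if lam == 0:
        return geo_mean * math.log(y)
    return (y ** lam - 1.0) / (lam * geo_mean ** (lam - 1.0))


# Hypothetical sample and exponent, chosen only for easy arithmetic.
sample = [1.0, 4.0]
g = geometric_mean(sample)
plain = box_cox(4.0, 0.5)
scaled = box_cox_scaled(4.0, 0.5, g)
```

Reporting λ and the scaling constant for each parameter would let readers reproduce the transformed values.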

 

Table 3 Format: is it r^2 or R^2? Uppercase or lowercase?

 

Table 3 Recommendation: Standardize the number of decimals that are being used.

 

Table 4. Consider that a p-value > 0.05 in the context of normality tests suggests that there is insufficient evidence to reject the null hypothesis, where the null hypothesis is that the data come from a normal distribution.

 

Statistical tests such as the Shapiro-Wilk test, the Anderson-Darling test, or the Kolmogorov-Smirnov test are often used to test whether a data set is normally distributed. In these tests:

 

1. The null hypothesis (H0) is that the data is normally distributed.

2. The alternative hypothesis (H1) is that the data is not normally distributed.

 

When one of these tests is performed, a p-value is returned as the result. If the p-value is less than the chosen significance level (usually 0.05), the null hypothesis is rejected in favor of the alternative hypothesis. In this case, it would be concluded that the data is not normally distributed.

 

On the other hand, if the p-value is greater than the chosen significance level (for example, p > 0.05), the null hypothesis is not rejected. This means that you do not have sufficient evidence to conclude that the data is not normally distributed. It's a subtle but important distinction: this doesn't necessarily prove that the data is normally distributed, it just suggests that we don't have strong evidence to believe that they aren't.
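
The decision rule described above can be sketched in Python as follows. The example also computes D_n, the Kolmogorov-Smirnov distance between the empirical distribution and a normal distribution fitted to the sample. Estimating the mean and standard deviation from the same sample is a simplification assumed here (strictly, it calls for Lilliefors-type critical values), and the function names and sample values are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def ks_statistic(sample):
    """Return D_n: the largest gap between the empirical CDF and a fitted normal CDF."""
    xs = sorted(sample)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))
    d_n = 0.0
    for i, x in enumerate(xs, start=1):
        cdf = fitted.cdf(x)
        # Check the gap just after and just before the step of the empirical CDF.
        d_n = max(d_n, i / n - cdf, cdf - (i - 1) / n)
    return d_n


def normality_decision(p_value, alpha=0.05):
    """Translate a p-value from a normality test into a hedged conclusion."""
    if p_value < alpha:
        return "reject H0: evidence that the data are not normally distributed"
    return "fail to reject H0: no strong evidence against normality"


# Hypothetical sample and p-values.
d_n = ks_statistic([2.1, 2.5, 2.9, 3.0, 3.4, 3.8, 4.4])
high_p = normality_decision(0.20)
low_p = normality_decision(0.01)
```

Note that the second outcome is worded as a failure to reject, not as proof of normality.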

 

Line 346 In the context of a regression model, you could use a t-test to determine whether each independent variable (in this case, "bands") has a statistically significant relationship with the dependent variable. However, it is not a question of whether the variable has a "statistically significant effect on mathematical models" as such; it is whether the variable has a significant effect on the outcome variable of interest.

 

So, in principle, you could use Discriminant Analysis to reduce the number of bands (if these bands represent different classes of data), and you could use a t-test to test whether each band has a significant effect on your outcome variable. However, the wording as it stands can confuse these different concepts a bit.
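
As a simplified illustration of testing whether one band has a significant effect on the outcome variable, the Python sketch below computes the slope and its t-statistic for a regression with a single predictor. The single-predictor setting, the function name and the data are assumptions made for brevity; the manuscript's models may include several bands.

```python
import math
from statistics import mean


def slope_t_statistic(x, y):
    """Return (slope, t) for the least-squares line y = b0 + b1 * x."""
    n = len(x)
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = my - b1 * mx
    # Residual sum of squares, with n - 2 degrees of freedom.
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    se_b1 = math.sqrt(sse / (n - 2) / sxx)
    return b1, b1 / se_b1


# Hypothetical band reflectance and outcome values.
band = [0.10, 0.20, 0.30, 0.40, 0.50]
outcome = [1.1, 1.9, 3.2, 3.8, 5.0]
slope, t_stat = slope_t_statistic(band, outcome)
```

The t-statistic is then compared with the critical value of the t distribution with n - 2 degrees of freedom (about 3.18 for n = 5 at the 0.05 level, two-sided).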

 

Line 347 This consideration must be supported with a quote.

 

Line 364. This is incorrect. Figure 4b shows TOC.

 

Line 365. Is it not RMSE = 3.2613?

 

Line 376 Says “coincide” in Spanish

 

Line 390 In the graphs an RMSE = 3.1267 is reported for Chl-a

 

Line 415 Predicted or Estimated?

 

Line 433 are Figures 5b and 5c?

 

Line 434 Is it Figure 5a?

 

Figure 5. The labels a, b, c are not indicated

 

Figure 5 TDS comes first, then TOC and finally Chl-a

 

Line 456, how can you know all that just by looking at the images? There is no explanation of the date format (YYYY/MM/DD). Neither are the images separated with respect to a letter, i.e., there is no figure a,b,c, etc. The color bar is also not clear and is not explained in the text.

 

Line 457 Figure 6a is not correctly identified

Lines 466 and 467 Figures 6b and 6c are not identified

 

Figure 6 Figures 6 a, b and c are not identified

Comments for author File: Comments.pdf

I detected only a few incorrect sentences, so I have recommended a minor revision of the English language.

Author Response

Please see the attachment

Author Response File: Author Response.pdf

Reviewer 2 Report

The manuscript by Loaiza et al. is an interesting study utilizing Landsat 8 imagery data to predict the water quality parameters of Chlorophyll a, TOC and TDS using Box-Cox transformation and linear regression.

The manuscript is interesting and generally well written.

It is understood that the model was developed based on 3 years of monitoring data (2015 to 2017). It would have been interesting if the authors had applied their model to other years, for example 2018, 2019 and 2020.

Please find additional comments below:

L 47: “the identification and categorization of water quality”: Please rephrase; do the authors want to identify and categorize water quality parameters?

L 49: “ absorption characteristics of OACs” Please define abbreviations first before using them

L 55: “Normalization facilitates the discrimination of soils and emergent vegetation” Please explain why discrimination of soils and vegetation is relevant for determining water quality parameters using remote sensing

L 61: “chlorophyll-a”: in abstract it was written as “Chlorophyll-a”, please be consistent

L 69: “Normalizing water quality data can eliminate errors, inconsistencies, duplicates, or missing values”: Please explain how normalizing water quality data can eliminate duplicate or missing values. Would the elimination of duplicate or missing data not be a normal data-processing step before normalization takes place?

L 79: “selected to apply” : Please rephrase, correct for grammar.

L 87: “dam wall is 105.5 m height”: Please correct for grammar.

L 122: “meters above sea level (m.a.s.l.)”: meters above sea level has been mentioned before, please introduce the first time it is mentioned.

L 128: “which promote continuous and rapid changes”: These are very negative changes, I am not sure the word “promote” is the right choice here.

L 141: “Imagery pre-processing.”: Is this a header or an incomplete sentence? This applies throughout the manuscript. If it is a header, it should be clear from the positioning and formatting in the manuscript.

L 142: “Satellite imagery was pre-processed for atmospheric and radiometric correctionverting a digital number to a radiance value [31, 32, 33].”: How was this pre-processing done? Please give some details; was specific software used?

Figure 2 text: “Water quality maps generation”: Some of the text in the figure is truncated, please correct this. The arrow from the field “Data for validation” goes straight to “Water quality maps generation”, wouldn’t it also go to “Model validation” ?

Table 1: “Paht/Row”: Correct spelling.

L 178: “Reflectance data extraction.” Please see the previous comment. If this is a header, it should not sit in the main body of the text with the same format. This applies to the whole manuscript.

Equation 1: The equation should be reformatted. The “=” sign seems offset.

2.2.2. Water quality

Please explain how TDS, Chl-a and TOC were measured in the laboratory, what methods were used and what were the detection limits. Please explain how sampling was conducted. How many samples were taken (e.g. Fig 3 as well).

Figure 3: It appears strange that while TOC, TDS are significantly increased in 2017, Chlorophyll a is significantly reduced in 2017 compared to the other years. Normally there should be a correlation between these parameters. Please explain what could be the reason for the drop in Chlorophyll a in 2017 and the increase of TOC and TDS for the same year.

L 216 following and Table 2: Which of the Box Cox transformation approaches was used for this study and why.

Table 3: “Parameter” should be all on the same line. Please correct

Table 4: What does “Dn” in Table 4 represent ? Please explain.

Table 5: “Box-Cox (Cha-a)”: Please correct

L 303-304: “concentration of 3.4 mg/m3 and a minimum concentration of 0.3 mg/m3”: Please correct to mg/m³.

Figure 6: Please reconsider colour scheme for TDS, the colours appear all the same throughout the 3 years.

Results and Discussion and Figure 6. It is understood that the model was developed based on 3 years of monitoring data (2015 to 2017). It would have been interesting if the authors had applied their model to other years, for example 2018, 2019 and 2020.

Conclusion:

L 511-512: “The methodology gives the spatial and temporal views of surface water quality needed for the accurate assessment and management of water bodies.” Please rephrase this sentence

Author Response

Please see the attachment

Author Response File: Author Response.pdf

Reviewer 3 Report

Although the research topic of the article is related to my own research, I cannot provide a fully comprehensive review. After reading it, I find the paper very good: it skillfully uses remote sensing to evaluate water quality and presents a rigorous processing method. If the comments below are addressed, I recommend accepting this article.

(1) Figure 5 compares the measured data with the inversion data; I suggest describing how the measured data were acquired and the procedure that was followed.

(2) Does formula 5 need to be cited?

(3) In line 325, 0.80 should be revised to 0.85.

Author Response

Please see the attachment

Author Response File: Author Response.pdf

Reviewer 4 Report

The authors propose a cost-effective approach for estimating water quality parameters from satellite data. In general, this objective is fine. However, many similar studies have been published since the 2010s. To make clear how the current work improves on previous ones, I strongly recommend that the authors carefully revise the Introduction section.

Also, the authors should make clear in the main text what is novel compared to existing works; this should be in the Discussion section. Typos are found in the text.

Some suggestions are included in the attached file.

Comments for author File: Comments.pdf

Author Response

Please see the attachment

Author Response File: Author Response.pdf

Round 2

Reviewer 4 Report

I agree with the revised version.
