Article
Peer-Review Record

Calibrating a Hydrological Model by Stratifying Frozen Ground Types and Seasons in a Cold Alpine Basin

Water 2019, 11(5), 985; https://doi.org/10.3390/w11050985
by Yi Zhao 1, Zhuotong Nan 1,2,*, Wenjun Yu 3 and Ling Zhang 4
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 3 April 2019 / Revised: 5 May 2019 / Accepted: 8 May 2019 / Published: 10 May 2019
(This article belongs to the Section Hydrology)

Round 1

Reviewer 1 Report


Calibrating a hydrological model by stratifying frozen ground types and seasons in a cold alpine basin by Zhao et al.



Aim of the study:

Authors attempt to calibrate DHSVM based on frozen land types and precipitation seasonality.


Detail comments:

Line 68: The sentence is not clear. Please revise and cite.

Line 88: All acronyms that occur only once should be removed from the text (e.g., SRM, GMHM, SFG, etc.). As it is, the paper is unreadable for a reader not familiar with these acronyms.

Lines 93-97: Formulate your objectives and the organization of this manuscript.

Line 103: Explain more: which anthropogenic factors?

Line 142: What is eFAST? Explain.

Line 161: What is PSO?

Line 162: What is your model calibration period? How do you define whether your model is best calibrated based on NSE? Why have you selected NSE instead of square-root NSE or KGE?

Line 179: What is "next year"? Be specific.

Line 193: As also mentioned in previous comments, what are your calibration and simulation periods?

Line 203: Define BBR. Try to avoid acronyms as much as possible.

Line 215: What is re-sampling? How did you re-sample your DEM? What did you use to re-sample? Explain more.

Line 221: Briefly describe MicroMet approach.

Line 273: This should be in the methods part.

Table 2: Be consistent with the number of digits after the decimal point. It does not make sense to have four digits after the decimal; it is too precise.

Figure 3: I think the deviation from observations is due to the model warming period. So it is better to find the warming period for your model and present results after the warming period.

Line 396: Do you have any supporting arguments for this sentence? Any supporting references?


Minor comments:

The introduction could be shortened. The abstract and conclusion need to be improved and shortened.




Author Response

Responses to the comments of Reviewer #1

Article ID: water-488408

Title: Calibrating a hydrological model by stratifying frozen ground types and seasons in a cold alpine basin

Authors: Yi Zhao, Zhuotong Nan, Wenjun Yu, Ling Zhang

 

Dear reviewer,

Thank you very much for your valuable comments on our manuscript entitled “Calibrating a hydrological model by stratifying frozen ground types and seasons in a cold alpine basin” (ID: Water-488408). As per your comments and suggestions, brief descriptions of eFAST, PSO and MicroMet have been added in this revision, and more details about the model settings were provided. The writing has been improved and unclear sentences have been clarified. Two metrics, RMSE and KGE, in addition to NSE, have been applied to further evaluate the simulation accuracy. Some acronyms (such as BRB) were removed for better legibility, and the parameter values have been rounded to two decimal places. In addition, in response to Reviewer #2, we conducted an extra experiment designed to examine the impact of the calibration scheme that stratifies the parameters by frozen ground type. The experiment took us one week to calibrate and re-run the model.

Intensive revisions have been made. Please see the tracked document provided for all modifications. I believe the quality has been significantly improved by incorporating your valuable comments and suggestions into the MS.

Thank you very much,

The authors

 

Texts in red are the reviewer’s comments; those in black are the authors’ explanations of the reviewer’s comments; and those in blue are the revised texts that appear in the revised manuscript. Along with this point-to-point letter, I also enclosed a track-enabled document (water-488408_revision track) and a clear version document (water-488408_revision). The line numbers appearing in this letter correspond to those in the track-enabled document, which might differ from those in the clear document. Please note that the figures and tables in this letter are numbered in order of appearance, which may not match the numbering in the revised MS.

 

 

 

Reviewer 1:

Authors attempt to calibrate DHSVM based on frozen land types and precipitation seasonality.

 

Point 1: Line 68: The sentence is not clear. Please revise and cite.

Answer: This sentence has been rewritten to make it clear:

“Modeling in cold alpine basin usually needs to consider more complex situations compared to other regions.” (Page 2 Lines 73-74)

 

Point 2: Line 88: All acronyms that occur only once should be removed from the text (e.g., SRM, GMHM, SFG, etc.). As it is, the paper is unreadable for a reader not familiar with these acronyms.

Answer: The unnecessary acronyms that occur only once, such as SRM, GBHM and HBV, have been removed. The acronym ‘SWAT’ is retained because it is used later in the text. (Page 2 Lines 81-85).

“Included are empirical models such as Hydrologiska Byråns Vattenbalansavdelning model [26], semi-distributed models such as TOPMODEL [27] and Snowmelt Runoff Model [28], distributed models such as Soil and Water Assessment Tool (SWAT) [29], Distributed Large Basin Runoff Model [30] and Geomorphology-Based Hydrological Model [31], and land surface models such as Noah [32,33].”

 

Point 3: Lines 93-97: Formulate your objectives and the organization of this manuscript.

Answer: We rewrote this part and introduced the organization at the end of the introduction section (Page 3 Lines 101-112). Now it reads,

“The objective of this study is to develop a new calibration scheme for hydrological modeling in cold alpine basins in which the model parameters are stratified by frozen ground type and season. The approach was evaluated in the Babao River Basin, located within the Qilian Mountains, northwest China, where a grid-based distributed hydrological model, DHSVM, was applied. This paper is organized as follows. Section 2 elaborates the proposed stratified calibration scheme in parallel with the study area, data and the model settings. Section 3 presents the results from the calibration experiments and examines their impacts on the simulation accuracy. Section 4 includes thorough discussion on the effectiveness and rationality of this proposed calibration scheme. The uncertainties and model inadequacy learned through those experiments are also discussed. Finally, the conclusions are drawn in Section 5.”

 

Point 4: Line 103: Explain more: which anthropogenic factors?

Answer: The anthropogenic factors include, but are not limited to, engineering construction such as roads and buildings, and changes in land use. They may strongly influence the hydrological processes. I added some references to those statements. (Page 3 Lines 116-119)

“This model has been widely used in a wide spectrum of studies, such as hydrological forecasting [39,40], effects and responses to climate changes [41,42] or anthropogenic factors such as engineering construction and cultivated land use [43,44].”

 

Point 5: Line 142: What is eFAST? Explain.

Answer: The extended Fourier amplitude sensitivity testing method (eFAST) is a tool for parameter sensitivity analysis in hydrological modelling. It uses a search curve to generate random parameter values and calculates how strongly the simulation results fluctuate as a parameter changes in order to determine that parameter’s sensitivity. The description of eFAST and its settings in this study has been provided in Section 2.2. (Page 4 Lines 160-164)

“The eFAST method was used to perform the analyses for each combination and the mean streamflow is used as the variable for evaluating the sensitivity. The eFAST method provides quantitative parametric sensitivity indices based on conditional variances to rank the sensitivities of parameters. In eFAST, the total effect instead of the first order sensitivity indices is computed.”
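For illustration only, the following is a minimal sketch of how such a variance-based eFAST analysis could be scripted, assuming the SALib Python package; the run_model() wrapper, parameter names and bounds are hypothetical placeholders standing in for DHSVM runs, not the authors' actual workflow.

```python
# Minimal sketch of an eFAST sensitivity analysis, assuming the SALib package.
import numpy as np
from SALib.sample import fast_sampler
from SALib.analyze import fast

problem = {
    "num_vars": 3,
    "names": ["field_capacity", "lateral_conductivity", "exp_decrease"],
    "bounds": [[0.1, 0.4], [1e-4, 0.1], [0.0, 5.0]],
}

def run_model(params):
    # Placeholder for one model run that returns mean streamflow.
    fc, lc, ed = params
    return 10.0 * fc + np.log10(lc + 1e-6) + 0.5 * ed

# 65 samples per search curve and 4 Fourier coefficients, as stated in the letter.
X = fast_sampler.sample(problem, 65, M=4)
Y = np.array([run_model(x) for x in X])

Si = fast.analyze(problem, Y, M=4)
print(Si["S1"])  # first-order indices
print(Si["ST"])  # total-order (total effect) indices used for ranking
```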

 

Point 6: Line 161: What is PSO?

Answer: The particle swarm optimization (PSO) technique is a computational optimization method that maintains a population of candidate solutions and moves these particles (solutions) around the search space according to simple mathematical formulae governing each particle’s position and velocity. More description of PSO was added to Section 2.3. (Page 5 Lines 182-188)

“The PSO is a widely used computational optimization method that maintains a population of candidate solutions, dubbed particles, and moves these particles around in the search space according to simple mathematical formulae over the particle's position and velocity [10]. Each particle's movement is influenced by its local best known position and the best known positions in the search space. Through the iteration of particle positions, the best known position of the whole search space is updated. The PSO has the advantage of requiring fewer iterations compared to other heuristic methods.”
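As a concrete illustration of the mechanics described above, here is a minimal, self-contained PSO sketch; the inertia and acceleration coefficients and the toy objective are assumptions for illustration, not the settings used for the DHSVM calibration.

```python
# Minimal PSO sketch: personal and global best positions guide velocity updates.
import numpy as np

def pso(objective, bounds, n_particles=20, n_iter=100, w=0.7, c1=1.5, c2=1.5):
    rng = np.random.default_rng(0)
    lo, hi = np.array(bounds, dtype=float).T
    dim = len(bounds)
    x = rng.uniform(lo, hi, size=(n_particles, dim))   # particle positions
    v = np.zeros_like(x)                               # particle velocities
    pbest = x.copy()                                   # personal best positions
    pbest_val = np.array([objective(p) for p in x])
    gbest = pbest[np.argmin(pbest_val)].copy()         # global best position
    for _ in range(n_iter):
        r1, r2 = rng.random((2, n_particles, dim))
        # Inertia plus pulls toward the personal and global best positions.
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = np.clip(x + v, lo, hi)
        val = np.array([objective(p) for p in x])
        improved = val < pbest_val
        pbest[improved], pbest_val[improved] = x[improved], val[improved]
        gbest = pbest[np.argmin(pbest_val)].copy()
    return gbest, pbest_val.min()

# Toy usage: in a calibration setting the objective would be, e.g., 1 - NSE.
best_params, best_score = pso(lambda p: np.sum((p - 0.3) ** 2), bounds=[(0, 1), (0, 1)])
```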

 

Point 7: Line 162: What is your model calibration period? How do you define whether your model is best calibrated based on NSE? Why have you selected NSE instead of square-root NSE or KGE?

Answer: The calibration period is from 10/01/2008 to 09/30/2009, including whole dry and wet seasons. The validation period has been modified to cover 10/01/2004 to 09/30/2008, which is now completely independent of the calibration period. (Page 6 Lines 234-239)

“The calibration period covers at least one cycle of dry and wet alternation, which spans 10/01/2008 to 09/30/2009 in the case study. To further verify the applicability of optimized parameters, the parameter values obtained in Steps 1 and 4 are used to simulate a much longer period (such as 10/01/2004 to 09/30/2008 in the case study) than the calibration period, over which the simulated discharges from Steps 1 and 4 are verified against the observed records.”

The Nash-Sutcliffe efficiency (NSE) is widely used in hydrologic modeling to measure performance. The closer the NSE is to 1, the better the simulation performance. The PSO method searches for the most suitable parameter combination (the one with the highest NSE) in the search space.

In this revision, the RMSE and KGE have been added to evaluate the simulation in a more comprehensive way. Brief descriptions and equations for those indices are provided (Page 5 Lines 193-200). The metrics measured in the calibration and validation periods are shown in Table 1 and Table 2, for your reference. They are consistent in measuring the performance of our proposed approach.
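For reference, the three metrics can be computed as in the short sketch below, assuming their standard definitions (NSE, the Kling-Gupta efficiency in its common form, and RMSE); the exact formulations in the revised manuscript may differ in detail.

```python
# NSE  = 1 - sum((sim - obs)^2) / sum((obs - mean(obs))^2)
# KGE  = 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2),
#        alpha = std(sim)/std(obs), beta = mean(sim)/mean(obs)
# RMSE = sqrt(mean((sim - obs)^2))
import numpy as np

def nse(obs, sim):
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return 1.0 - np.sum((sim - obs) ** 2) / np.sum((obs - obs.mean()) ** 2)

def kge(obs, sim):
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    r = np.corrcoef(obs, sim)[0, 1]      # linear correlation
    alpha = sim.std() / obs.std()        # variability ratio
    beta = sim.mean() / obs.mean()       # bias ratio
    return 1.0 - np.sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2)

def rmse(obs, sim):
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return float(np.sqrt(np.mean((sim - obs) ** 2)))
```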

Table 1. The NSE, KGE and RMSE of the simulated daily streamflow with the parameters optimized in the four calibration steps.

         | NSE  | KGE  | RMSE (×10⁶ m³)
Baseline | 0.58 | 0.72 | 0.89
FroCal   | 0.65 | 0.80 | 0.81
SeaCal   | 0.63 | 0.76 | 0.84
FullCal  | 0.69 | 0.83 | 0.77

Table 2. The NSE, KGE and RMSE of the simulated daily streamflow with the parameters optimized in the baseline and full stratification calibration schemes, respectively, in a four-year period (10/01/2004-09/30/2008).

         | NSE  | KGE  | RMSE (×10⁶ m³)
Baseline | 0.48 | 0.68 | 1.08
FullCal  | 0.60 | 0.75 | 0.96

 

Point 8: Line 179: What is "next year"? Be specific.

Answer: In the case study, the dry season refers to 10/01/2008 to 04/30/2009. The sentence now reads: (Page 6 Lines 217-218)

“The entire period is divided into dry (October to next April in the Babao River Basin) and wet seasons (May to September).”

 

Point 9: Line 193: As also mentioned in previous comments, what are your calibration and simulation periods?

Answer: The calibration period is from 10/01/2008 to 09/30/2009. The validation period spans from 10/01/2004 to 09/30/2008, for which we have the necessary data. (Page 6 Lines 234-239)

 

Point 10: Line 203: Define BBR. Try to avoid acronyms as much as possible.

Answer: BRB is short for Babao River Basin. However, it was removed in this revision in response to your suggestion to use fewer acronyms.

 

Point 11: Line 215: What is re-sampling? How did you re-sample your DEM? What did you use to re-sample? Explain more.

Answer: The original resolution of the DEM is 30 m, whereas the resolutions of the soil type and land use type data are about 1 km. In order to bring all data to a common modeling resolution, all data have been re-sampled to a 300 m resolution using the nearest neighbor method. Resampling is a commonly used term in GIS. I provided more details in the revision about the data resolution. (Page 7 Lines 268-273)

“The DEM data for the study basin was subset from the ASTER Global DEM [51], with an original resolution of 30 m. The maps of frozen ground type [52], soil type [53] and land use/cover type [54] reflecting the conditions of around 2010 in the study area were provided by the Cold and Arid Regions Science Data Center at Lanzhou, Northwest China (http://westdc.westgis.ac.cn) and were converted to the grid format. The grid data were then converted to a uniform 300 m model resolution.”
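As an illustration of nearest-neighbor resampling to a coarser model grid, a minimal sketch follows; in practice this is typically done with GIS software, and the array sizes and resolutions below are placeholders rather than the actual datasets.

```python
# Minimal sketch of nearest-neighbor resampling of a gridded dataset to 300 m.
import numpy as np

def resample_nearest(grid, src_res, dst_res):
    """Resample a 2-D array from src_res (m) to dst_res (m) by nearest neighbor."""
    scale = dst_res / src_res
    rows = int(grid.shape[0] // scale)
    cols = int(grid.shape[1] // scale)
    # For each target cell, take the source cell nearest to its center.
    r_idx = np.minimum(((np.arange(rows) + 0.5) * scale).astype(int), grid.shape[0] - 1)
    c_idx = np.minimum(((np.arange(cols) + 0.5) * scale).astype(int), grid.shape[1] - 1)
    return grid[np.ix_(r_idx, c_idx)]

dem_30m = np.random.rand(3000, 3000)           # placeholder 30 m grid
dem_300m = resample_nearest(dem_30m, 30, 300)  # 300 m model grid (300 x 300 cells)
```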

 

Point 12: Line 221: Briefly describe MicroMet approach.

Answer: Brief descriptions about the MicroMet approach have been provided. Now it reads, (Page 7 Lines 277-280)

“MicroMet is an intermediate-complexity, quasi–physically based, meteorological model to produce high-resolution atmospheric data. It is able to provide high quality meteorological data for mountainous areas by considering topographic impacts to climate variables.”

 

Point 13: Line 273: This should be in the methods part.

Answer: That sentence has been removed in this revision, and a new sentence was added (Page 10 Lines 336-337).

“The optimal values for the sensitive parameters obtained in the four steps are presented in Table  and Table .”

 

Point 14: Table 2: Be consistent with the number of digits after the decimal point. It does not make sense to have four digits after the decimal; it is too precise.

Answer: We rounded them to two decimal places, such as in Tables 3 and 4, except LC, which keeps four decimal places; otherwise LC would be rounded to zero.

Table 3. Calibrated values of parameters connected to soil type in four calibration steps. Baseline, FroCal, SeaCal and FullCal stand for calibration steps 1 to 4, respectively; Perm stands for the permafrost area and SFG the seasonally frozen ground area. Dash indicates a non-sensitive parameter. Ranges: ED (≥0), BP (>0), FC (wilting point to porosity), LC (>0). The values are rounded to two decimal places except LC, which keeps four decimal places.

Soil type  | Param | Baseline | FroCal/SFG | FroCal/Perm | SeaCal | FullCal/SFG | FullCal/Perm
Clay Loam  | ED    | 3.16     | 4.78       | 5.00        | 3.16   | 4.80        | 5.00
           | BP    | 0.36     | 0.69       | 0.76        | 0.36   | 0.69        | 0.76
           | FC    | 0.33     | 0.18       | 0.27        | 0.33   | 0.18        | 0.27
           | LC    | -        | 0.0070     | -           | 0.0210 | 0.0050      | -
Loam       | ED    | 3.20     | 0.00       | 2.88        | 3.19   | 0.00        | 2.90
           | BP    | 0.29     | 0.10       | 0.76        | 0.29   | 0.10        | 0.76
           | FC    | 0.18     | 0.18       | 0.18        | 0.18   | 0.18        | 0.18
           | LC    | -        | 0.0002     | -           | 0.0008 | 0.0002      | -
Sandy Loam | ED    | 2.71     | 2.25       | 4.05        | 2.71   | 2.20        | 4.49
           | BP    | 0.39     | 0.10       | 0.10        | 0.39   | 0.10        | 0.10
           | FC    | 0.30     | 0.26       | 0.28        | 0.30   | 0.25        | 0.29
           | LC    | -        | 0.1000     | -           | 0.0600 | 0.1000      | -
 

Table 4. Calibrated values of land cover relevant parameters in four calibration steps. Baseline, FroCal, SeaCal and FullCal stand for calibration steps 1 to 4, respectively; Dry is short for the dry season and Wet the wet season. Dash indicates a non-sensitive parameter. Ranges: LAI (≥0), MinR (>0), Alb (0-1). The values are rounded to two decimal places.

Land cover | Param | Baseline | FroCal | SeaCal/Dry | SeaCal/Wet | FullCal/Dry | FullCal/Wet
Bare soil  | LAI   | 1.20     | 1.20   | -          | 0.50       | -           | 0.50
           | MinR  | 346.17   | 346.17 | -          | 300.00     | -           | 399.37
           | Alb   | 0.18     | 0.18   | 0.10       | 0.14       | 0.18        | 0.20
Grass      | LAI   | 4.46     | 4.46   | -          | 3.12       | -           | 2.00
           | MinR  | 389.02   | 389.02 | -          | 600.00     | -           | 500.00
           | Alb   | 0.14     | 0.14   | 0.20       | 0.12       | 0.20        | 0.10

 

Point 15: Figure 3: I think the deviation from observations is due to the model warming period. So it is better to find the warming period for your model and present results after the warming period.

Answer: Thanks. We have done 20 years of spin-up (model warming) and started the simulations after the spin-up period as you suggested. I am afraid I did not make this clear in the previous version, so I modified the text. (Page 7 Lines 285-288)

“The runs began with a 20-year spin-up consisting of replicated meteorological records from January 2008 to December 2009. Because the DHSVM can output the states at any given time step, the output states on the last October 1, 2008 and May 1, 2009 serve as the initial states of the dry season and wet season, respectively, for the formal runs.”

 

Point 16: Line 396: Do you have any supporting arguments for this sentence? Any supporting references?

Answer: Previous studies have found that the ice present in permafrost all year round prevents the movement of soil water and reduces the base flow in the soil layer. This is consistent with the results in this study. I cited two works (Kane et al., 1983 and Woo et al., 1986) in this revision. (Page 18 Line 525)

 

Minor comments:

Point 17: The introduction could be shortened. The abstract and conclusion need to be improved and shortened.

Answer: Done. We have improved the abstract, introduction and conclusion sections. However, in order to respond to another reviewer’s suggestion, we also added one important point to the abstract and conclusion sections.

In Abstract: (Page 1 Lines 33-35)

“The underestimation in the April streamflow also highlights the importance of improved physics in a hydrological model, without which the model calibration cannot fully compensate the gap.”

In Conclusions: (Page 20 Lines 627-638)

“While the new calibration approach benefits the simulation, it cannot fully compensate the absence of key physics in the model. Owing to the lack of representing freeze-thaw processes in active layer in DHSVM, the simulation calibrated by the proposed approach cannot correctly reproduce runoff around April. It highlights the importance of improved physics in a hydrological model, the need of which cannot be fully circumvented by parametric calibration.”


Author Response File: Author Response.pdf

Reviewer 2 Report

This paper examines whether the performance of a watershed model can be improved by (1) dividing the watershed’s grid cells into different categories and using different parameter values for those categories and (2) dividing the modeling period into different categories and using different parameter values for those categories.  The results suggest that discharge at the basin outlet is estimated with greater accuracy when the divisions are included.  Overall, the paper is clearly written and requires only moderate corrections for English and typos.  For the most part, the authors have clearly explained both the methodology and results.  However, I have a few major concerns.

1.       I think the results of this study are obvious from the beginning.  By allowing different parameter values for different locations and time periods, the authors have provided many additional degrees of freedom in the calibration process.  Improved performance during the calibration is almost certain given the greater flexibility.  Really, the only question is how much the performance will improve.  To facilitate a fair comparison among models with different numbers of parameters, I would suggest that the authors use measures of performance that account for the different number of parameters used in the different models.  Such measures would penalize the models with more parameters.  They would show that the additional parameters are only warranted if the increase in performance is substantial.

2.       A key problem with including many parameters in a model (and especially time-varying parameters) is the possibility that the temporal pattern and values are only applicable for the time period for which they were calibrated.  Thus, an independent evaluation (“validation”) period is essential to assess whether the new parameterization is useful for forecasting.  The present paper does not include a truly independent validation period.  Instead, the authors expand the period and continue to include the calibration period in the validation step.  This approach should be changed.  The authors should also expand how they “validate” their model.  They might consider using a k-fold validation or jackknife procedure.  Also, if the parameters are estimated from only a dry year, are they still applicable to a wet year?  If they are estimated from the accumulation period, are they applicable to the melt period?  I think these tests are necessary given the temporal variations in the parameter values.

3.       Also, I would recommend that the authors read the following article:  Klemeš, V. "Guest editorial: Of carts and horses in hydrologic modeling." Journal of Hydrologic Engineering 2.2 (1997): 43-49.  This article discusses the appropriate purpose of model calibration.  The key point is that calibration is only a curve fitting exercise if the modelers do not use the results of the calibration to return to the actual natural system and learn more about it.  I believe that situation applies to this paper.  As currently presented, the results only provide a better curve fitting approach.  We have not learned anything about the physical system or processes.  The paper would be a much stronger contribution if the calibration results were used to learn how the model fails to represent reality.  What physical processes are not adequately represented?  What assumptions are inappropriate?  What are key pathways to improve the model moving forward so that this type of calibration is not needed in the future?  I think these questions can and should be addressed by rewriting the discussion section.  Currently, that section interprets the results from a narrow perspective (one might argue a curve fitting perspective).  Instead, I think the paper would be a much better contribution if the authors could use the results to identify needed improvements to the model itself.  Otherwise, the paper merely presents a better curve fitting method and provides little improvement in our understanding of the physical system or processes.

Minor comments:

1.       The authors rely on NSE for almost all their model evaluation.  Multiple measures should be used such as RMSE, MBE, etc.

2.       The authors present up to 7 digits for numbers presented in the paper.  That number of digits is well beyond the real number of significant figures.

3.       I think units are missing in some cases in the tables. 

4.       The authors neglect to say what variable they are evaluating the sensitivity of.  Is it mean discharge?  Peak discharge? 

5.       The authors have some algorithmic parameters in their sensitivity analysis and optimization methods.  They should provide a brief justification for the use of those values.


Author Response

Responses to the comments of Reviewer #2

Article ID: water-488408

Title: Calibrating a hydrological model by stratifying frozen ground types and seasons in a cold alpine basin

Authors: Yi Zhao, Zhuotong Nan, Wenjun Yu, Ling Zhang

 

Dear Reviewer,

 

Thank you very much for your inspiring and insightful comments concerning our manuscript entitled “Calibrating a hydrological model by stratifying frozen ground types and seasons in a cold alpine basin” (ID: Water-488408). Following your comments, an additional experiment has been added to examine the impact of the calibration scheme that stratifies the parameters by frozen ground type. We designed a control calibration scheme in which the study area is stratified randomly into two areas with the same areal proportion as the stratification by frozen ground type. By comparing the two spatial stratification calibration schemes, we can evaluate the real effect of the stratified calibration by frozen ground type while excluding the effect of an increased number of model parameters. After a week of work on it, the finding positively supports the effect of stratification by frozen ground type. Please see the point-to-point answers below for details.

 

I also added more discussion on the importance of complete physical processes in the model. Two additional metrics, RMSE and KGE, have been applied to further evaluate the simulation accuracy. Intensive revisions have been made. Please see the tracked document provided for all modifications.

 

I believe the quality has been significantly improved by incorporating your valuable comments and suggestion into the MS.

Thank you very much,

The authors

 

Texts in red are the comments; those in black are the authors’ explanations of the comments; and those in blue are the revised texts that appear in the revised manuscript. Along with this point-to-point letter, I also enclosed a track-enabled document (water-488408_revision track) and a clear version document (water-488408_revision). The line numbers appearing in this letter correspond to those in the track-enabled document, which might differ from those in the clear document. Please note that the figures and tables in this letter are numbered in order of appearance, which may not match the numbering in the revised MS.

 

 

Reviewer 2:

This paper examines whether the performance of a watershed model can be improved by (1) dividing the watershed’s grid cells into different categories and using different parameter values for those categories and (2) dividing the modeling period into different categories and using different parameter values for those categories.  The results suggest that discharge at the basin outlet is estimated with greater accuracy when the divisions are included.  Overall, the paper is clearly written and requires only moderate corrections for English and typos.  For the most part, the authors have clearly explained both the methodology and results.  However, I have a few major concerns.

Thank you for your positive comments.

 

Point 1: I think the results of this study are obvious from the beginning.  By allowing different parameter values for different locations and time periods, the authors have provided many additional degrees of freedom in the calibration process.  Improved performance during the calibration is almost certain given the greater flexibility.  Really, the only question is how much the performance will improve.  To facilitate a fair comparison among models with different numbers of parameters, I would suggest that the authors use measures of performance that account for the different number of parameters used in the different models.  Such measures would penalize the models with more parameters.  They would show that the additional parameters are only warranted if the increase in performance is substantial.

Answer: Many thanks for this valuable comment. Inspired by it, we designed an additional experiment to examine the real effect of spatial stratification by frozen ground type while excluding the effect of an increased number of parameters in the model. We do not use measures that penalize models with more parameters because we believe the comparative experiment is more persuasive. As a number of previous studies on seasonally stratified calibration have already supported the necessity of seasonal division in calibration, we focus on investigating the effect of spatially stratified calibration by frozen ground type.

A control experiment (RanCal) was additionally designed to demonstrate the importance of stratification by frozen ground type. In the control, the entire area is randomly divided into two sets of areas (R1 and R2), for which two sets of soil parameters are created so that the model has the same number of variables as in the scheme stratified by frozen ground type. The areal proportion between the two randomly divided areas is constrained to equal that between permafrost and seasonally frozen ground in the study basin. All data and parameters in the control are kept identical to those in the frozen ground type stratification. (added to the method section, Page 6 Lines 240-247)
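A minimal sketch of how such a random split preserving the permafrost/SFG areal proportion could be generated is given below; the frozen-ground mask and proportion are hypothetical placeholders, not the actual study data.

```python
# Minimal sketch of the random spatial split used in a RanCal-style control:
# grid cells are assigned at random to groups R1 and R2 so that the areal
# proportion matches permafrost vs. seasonally frozen ground.
import numpy as np

rng = np.random.default_rng(42)
frozen_type = rng.integers(0, 2, size=(100, 120))  # placeholder map: 1 = permafrost, 0 = SFG
perm_fraction = float((frozen_type == 1).mean())   # areal proportion to preserve

cells = np.arange(frozen_type.size)
rng.shuffle(cells)
n_r1 = int(round(perm_fraction * cells.size))      # R1 gets the "permafrost" share

random_split = np.zeros(frozen_type.size, dtype=int)
random_split[cells[:n_r1]] = 1                     # 1 = R1, 0 = R2
random_split = random_split.reshape(frozen_type.shape)

# R1 and R2 each receive their own set of soil parameters, so the model has the
# same number of free parameters as the frozen-ground-stratified scheme (FroCal).
```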

The results of the control experiment are presented in Figure 1 and Table 1, for your reference. Compared to the FroCal (the calibration with frozen ground type stratification), in which the parameter values in permafrost areas are consistently larger than those in the SFG areas, the changes in parameter values between the two randomly separated areas (R1 and R2 in Table 1) show no obvious pattern. Figure 1 shows that both RanCal and FroCal improve the agreement with the observed daily streamflow, especially in the dry season, in comparison with the Baseline (the calibration without any stratification). This confirms the potential effect of introducing more parameters into a model. However, the RanCal suffers a larger discrepancy in simulating the early recession limb than the FroCal, whereas the latter agrees much better in the same period. The NSE, KGE and RMSE of the RanCal are 0.61, 0.78 and 0.86×10⁶ m³, better than those of the Baseline but worse than those of the FroCal. We therefore argue that the better performance of the stratified calibration by frozen ground type relative to the control experiment is rooted in the systematic distinction in physiographic characteristics between the permafrost and SFG areas. (added to the result section, Pages 15-16 Lines 446-471)

                                             

Figure 1. Simulated daily streamflow with experimentally applying the parameters calibrated for the permafrost area and for the SFG area to the entire basin.

 

Table 1. Calibrated values of soil-related parameters for the random spatial stratification experiment (RanCal). R1 and R2 represent the two randomly separated areas.

          | Clay Loam                    | Loam                         | Sandy Loam
Param     | ED   | BP   | FC   | LC      | ED   | BP   | FC   | LC      | ED   | BP   | FC   | LC
RanCal/R1 | 0.87 | 0.10 | 0.33 | 0.0308  | 2.26 | 0.10 | 0.18 | 0.0012  | 5.00 | 0.10 | 0.25 | 0.0842
RanCal/R2 | 5.00 | 0.25 | 0.32 | -       | 3.60 | 0.10 | 0.18 | -       | 4.79 | 0.53 | 0.28 | -
 

We totally agree that the purpose of parameter calibration is not only to provide a more reasonable combination of parameters, but also to “return to the actual natural system and learn more about it”, as you point out in a later comment. The comparative experiment shows that increasing the number of parameters helps improve the simulation, although the improvement is smaller than that from stratifying by frozen ground type. However, we found no apparent pattern in the optimized parameter values from the randomly stratified calibration, which provides little knowledge for understanding the relationship between the real world and the parameters due to the lack of a physical basis. In this sense, a scientifically sound spatial stratification, such as by frozen ground type in a typical cold alpine basin, can interpret the responses of parameters to different phenomena as well as improve the simulation accuracy, and helps promote understanding of the real world. (added to the discussion section, Pages 18-19 Lines 557-574)

 

Point 2: A key problem with including many parameters in a model (and especially time-varying parameters) is the possibility that the temporal pattern and values are only applicable for the time period for which they were calibrated.  Thus, an independent evaluation (“validation”) period is essential to assess whether the new parameterization is useful for forecasting.  1) The present paper does not include a truly independent validation period.  Instead, the authors expand the period and continue to include the calibration period in the validation step.  This approach should be changed. The authors should also expand how they “validate” their model.  They might consider using a k-fold validation or jackknife procedure.  2) Also, if the parameters are estimated from only a dry year, are they still applicable to a wet year?  3) If they are estimated from the accumulation period, are they applicable to the melt period?  I think these tests are necessary given the temporal variations in the parameter values.

Answer: Thanks. (1) We changed the validation period to 10/01/2004-09/30/2008, which is now completely independent of the calibration period (10/01/2008 to 09/30/2009). The description has been modified accordingly in Sections 2.3 and 3.2 of the revised manuscript (Page 6 Lines 235-239; Page 13 Lines 407-410) and in Figure 2. The time series can be seen in Figure 2. The performance indices measured for the Baseline and the FullCal (the calibration stratified by both frozen ground type and season) are listed in Table 2 for your reference. Basically, while all statistics become somewhat worse than in the calibration period, the FullCal results remain considerably better than the Baseline: 0.12 and 0.08 higher in NSE and KGE, respectively, and 0.11×10⁶ m³ lower in RMSE.

Considering that the verification period is long enough, we did not take the k-fold or jackknife approach to validate the results. However, we added two metrics, RMSE and KGE (Kling-Gupta efficiency), in addition to NSE for performance evaluation (the equations are listed on Page 5 Lines 193-200).

 


Figure 2. Simulated daily streamflow with the parameters optimized in the baseline and full stratification calibration, respectively, in a four-year validation period (10/01/2004-09/30/2008).

 

Table 2. The values of NSE, KGE and RMSE of the simulated daily streamflow with the parameters optimized in the baseline and full stratification calibration schemes, respectively, in a four-year validation period (10/01/2004-09/30/2008).

         | NSE  | KGE  | RMSE (×10⁶ m³)
Baseline | 0.48 | 0.68 | 1.08
FullCal  | 0.60 | 0.75 | 0.97

 

(2) Based on observational data, the annual precipitation in the Babao River Basin, a mountainous basin, from 1965 to 2012 (presented in Figure 3) is quite stable, mostly between 350 mm and 500 mm. The maximum annual precipitation in the validation period is 494 mm (2008) and the minimum 382 mm (2007); they can roughly represent the peak and valley of precipitation in the basin. The validation over a four-year period independent of the calibration period indicates that the calibrated parameters work well in a long, independent period. There is an extremely wet year, 1998, with annual precipitation close to 600 mm, but unfortunately we do not have observed streamflow for that year.

In addition, given that 90% of the precipitation in the study basin is concentrated in the wet season, the effect of estimating parameters from a dry year or a wet year can be diminished by the seasonally stratified calibration we proposed. In Section 3.3, we tested the applicability of the parameters calibrated for dry or wet seasons to the entire period. We found the dry season parameters are not applicable to the wet season and vice versa, as presented in Section 3.3.

Figure 3. The annual precipitation from 1965 to 2012 in the study area. This figure does not appear in the manuscript.

 

(3) Actually, we did not include the snowpack parameters in the sensitivity analysis and the subsequent experiments. The DHSVM includes sophisticated snow physics. The snow-related parameters only act in dry seasons when snowfall takes place, and the model physics already represents their seasonal roles very well. Therefore, there is no need to further divide them spatially or temporally. In this study, both the accumulation period and the melt period belong to the same dry season, which holds a single set of parameter values.

 

Point 3: Also, I would recommend that the authors read the following article:  Klemeš, V. "Guest editorial: Of carts and horses in hydrologic modeling." Journal of Hydrologic Engineering 2.2 (1997): 43-49.  This article discusses the appropriate purpose of model calibration.  The key point is that calibration is only a curve fitting exercise if the modelers do not use the results of the calibration to return to the actual natural system and learn more about it.  I believe that situation applies to this paper.  As currently presented, the results only provide a better curve fitting approach.  We have not learned anything about the physical system or processes.  The paper would be a much stronger contribution if the calibration results were used to learn how the model fails to represent reality.  What physical processes are not adequately represented?  What assumptions are inappropriate?  What are key pathways to improve the model moving forward so that this type of calibration is not needed in the future?  I think these questions can be and should be addressed by rewriting the discussion section.  Currently, that section interprets the results from a narrow perspective (one might argue a curve fitting perspective).  Instead, I think the paper would be a much better contribution if the authors could use the results to identify needed improvements to the model itself.  Otherwise, the paper merely presents a better curve fitting method and provides little improvement in our understanding of the physical system or processes.

 

Answer: Yes, we have read through the recommended paper (also cited in the discussion) and rewritten the discussion section to strengthen the discussion of what we have learned about the model physics from the calibration experiments. We strongly agree with your opinion on the meaning of model calibration. We have stressed the improvements the stratification by frozen ground type has made from the perspective of the physical meaning of parameters, in comparison to a random spatial stratification scheme. Please refer to the answer to Point 1.

In spite of all the efforts we made to develop a sophisticated calibration approach, it is still impossible for DHSVM to reproduce the peak streamflow in April induced by the thawing active layer in frozen soil areas. These defects can be clearly observed in all figures in both the calibration and validation periods. In Figure 4, we present magnified windows of March to May, 2005-2009, where large discrepancies between simulations and observations appear.


Figure 4. Magnified windows of March to May, 2005-2009, in which the temperature starts to rise across the freezing point, showing the inability of the DHSVM to simulate the peak streamflow produced by a combination of the thawing active layer of frozen ground, snowmelt and glacier melt.

We mentioned in the result section (Page 11 Lines 378-380),

“Meanwhile, all simulations suffer from common weaknesses such as the inability to reproduce the runoff around April and the underestimation of some peak flows in summer.”

and tried to explain the possible reasons in a separate paragraph: (Page 14 Lines 431-439),

“As exhibited in Figure 3 and Figure 4, the simulations, regardless of any calibration approach employed, consistently fail to reproduce streamflow peaks around April. It seems only base flow is simulated in those periods. The observed peak streamflow is produced by combinations of the thawing active layer of frozen ground, snow melt and glacier melt in early spring when air temperature becomes positive (Figure 5). This proportion of runoff accounts for about 8% of the total annual amount. The DHSVM is capable of simulating snow melt but lacks the model physics to represent the freeze and thaw processes occurring within the active layer, as well as glacier melt. The calibration experiments help to detect the inadequacy of the model physics.”

In the discussion section, we added a separate paragraph in dedication to this issue: (Page 19 Lines 575-588)

“This study also underlines the importance of complete model physics. Even with a sophisticated calibration approach as presented, it is still impossible to perfectly simulate the streamflow in April. Those defects are related to the occurrence of freeze and thaw processes in active layer in early spring when surficial frozen ground begins to melt. Those physics are absent in the adopted version of DHSVM. The stratified calibration scheme could strengthen the impacts of the permafrost and the SFG in a cold alpine basin. They cannot completely replace the physical processes in the model. It urges the need of fitting in place a frozen ground module in the DHSVM to compute the freezing and thawing processes recurring with the oscillating soil temperature. Some efforts have been undertaken in the same basin to enhance the freezing and thawing cycle in models so the simulation in spring has been improved [31]. They suffered notable discrepancy in summer, which can be mitigated by undertaking an appropriate stratified calibration as proposed.”

 

Also, we concluded this issue in the Abstract and conclusion section:

In Abstract: (Page 1 Lines 33-35)

“The underestimation in the April streamflow also highlights the importance of improved physics in a hydrological model, without which the model calibration cannot fully compensate the gap.”

In Conclusions: (Page 20 Lines 627-638)

“While the new calibration approach benefits the simulation, it cannot fully compensate the absence of key physics in the model. Owing to the lack of representing freeze-thaw processes in active layer in DHSVM, the simulation calibrated by the proposed approach cannot correctly reproduce runoff around April. It highlights the importance of improved physics in a hydrological model, the need of which cannot be fully circumvented by parametric calibration.”

 

 

 

Minor comments:

Point 4: The authors rely on NSE for almost all their model evaluation.  Multiple measures should be used such as RMSE, MBE, etc.

Answer: We have added RMSE and KGE for evaluating the simulation accuracy. The metrics measured in the calibration and validation periods are shown in Tables 2 and 3. They show consistent performance.

Table 3. The NSE, KGE and RMSE of the simulated daily streamflow with the parameters optimized in the four calibration steps.

         | NSE  | KGE  | RMSE (×10⁶ m³)
Baseline | 0.58 | 0.72 | 0.89
FroCal   | 0.65 | 0.80 | 0.81
SeaCal   | 0.63 | 0.76 | 0.84
FullCal  | 0.69 | 0.83 | 0.77

 

Point 5: The authors present up to 7 digits for numbers presented in the paper.  That number of digits is well beyond the real number of significant figures.

Answer: We rounded them to two decimal places, such as in Tables 1 and 4, except LC, which keeps four decimal places; otherwise LC would be rounded to zero.

Table 4. Calibrated values of parameters connected to soil type in four calibration steps. Baseline, FroCal, SeaCal and FullCal stand for calibration steps 1 to 4, respectively; Perm stands for the permafrost area and SFG the seasonally frozen ground area. Dash indicates a non-sensitive parameter. Ranges: ED (≥0), BP (>0), FC (wilting point to porosity), LC (>0). The values are rounded to two decimal places except LC, which keeps four decimal places.

Soil type  | Param | Baseline | FroCal/SFG | FroCal/Perm | SeaCal | FullCal/SFG | FullCal/Perm
Clay Loam  | ED    | 3.16     | 4.78       | 5.00        | 3.16   | 4.80        | 5.00
           | BP    | 0.36     | 0.69       | 0.76        | 0.36   | 0.69        | 0.76
           | FC    | 0.33     | 0.18       | 0.27        | 0.33   | 0.18        | 0.27
           | LC    | -        | 0.0070     | -           | 0.0210 | 0.0050      | -
Loam       | ED    | 3.20     | 0.00       | 2.88        | 3.19   | 0.00        | 2.90
           | BP    | 0.29     | 0.10       | 0.76        | 0.29   | 0.10        | 0.76
           | FC    | 0.18     | 0.18       | 0.18        | 0.18   | 0.18        | 0.18
           | LC    | -        | 0.0002     | -           | 0.0008 | 0.0002      | -
Sandy Loam | ED    | 2.71     | 2.25       | 4.05        | 2.71   | 2.20        | 4.49
           | BP    | 0.39     | 0.10       | 0.10        | 0.39   | 0.10        | 0.10
           | FC    | 0.30     | 0.26       | 0.28        | 0.30   | 0.25        | 0.29
           | LC    | -        | 0.1000     | -           | 0.0600 | 0.1000      | -
 

Point 6: I think units are missing in some cases in the tables.

Answer: We fixed this issue. The units of each parameter have been provided in the text where they first appear. (Page 9 Lines 302-306)

“Among them, three parameters, i.e., monthly albedo (Alb), monthly leaf area index (LAI, m2/m2) and minimum resistance (MinR, s/m), are closely connected to land cover/vegetation, while four of them, i.e., field capacity (FC, m3/m3), lateral conductivity (LC, m/s), bubbling pressure (BP, m) and exponential decrease (ED, m-1) controlling the decay rate of hydraulic conductivity with depth, are associated to soils.”

 

Point 7: The authors neglect to say what variable they are evaluating the sensitivity of.  Is it mean discharge?  Peak discharge?

Answer:  The mean discharge is used for evaluating the sensitivity. The description has been added. (Page 4 Lines 160-161).

“The eFAST method was used to perform the analyses for each combination and the mean streamflow is used as the variable for evaluating the sensitivity.”

 

Point 8: The authors have some algorithmic parameters in their sensitivity analysis and optimization methods. They should provide a brief justification for the use of those values.

Answer: Thanks. I have provided the references to justify those values. For example:

“The number of samples per search curve and maximum number of Fourier coefficients were set to 65 and 4 in this study, respectively, according to a previous study [7].” (Page 4 Lines 164-165)

“The number of iterations and population size were assigned to 100 and 20, respectively, following the advice of [49].” (Page 5 Lines 188-190)

 

 


Author Response File: Author Response.pdf

Round 2

Reviewer 2 Report

I believe the authors have adequately addressed my comments.
