Open Access Article

Peer-Review Record

Is the LSTM Model Better than RNN for Flood Forecasting Tasks? A Case Study of HuaYuankou Station and LouDe Station in the Lower Yellow River Basin

Water 2023, 15(22), 3928; https://doi.org/10.3390/w15223928

by Yiyang Wang, Wenchuan Wang *, Hongfei Zang and Dongmei Xu

Reviewer 1: Anonymous

Reviewer 2: Anonymous

Submission received: 11 October 2023 / Revised: 28 October 2023 / Accepted: 30 October 2023 / Published: 10 November 2023

(This article belongs to the Special Issue Intelligent Modelling for Hydrology and Water Resources)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

General comments: As the title states, the paper compares LSTM and RNN for flood forecasting from several points of view, using diverse techniques. The paper is very well written and structured. The authors give a detailed introductory description of the models. Likewise, the Results and Discussion section is analysed in depth, and the conclusions are clearly supported by the results. The only section that needs to be extended with respect to the results presented is the Methodology. Lines 314-318 should be moved to the Methodology and discussed there, before the results are shown.

Specific comments:

Title: Consider adding the case study.

Line 18: Define acronyms BOA-RNN, BOA-LSTM, MHAM-RNN, and MHAM-LSTM

Line 224: The letter "h" doesn't appear in previous equations.

Figures 6 and 7: include situation in China and legend with altitude of DEM

Line 358: Delete ":"

Author Response

Reply to questions or reviews given by Reviewer #1:

Comment: The only section which needs to be extended concerning the results presented is Methodology. Lines 314-318 should be moved to methodology and discussed in this section, before the results are shown.

Response: Thank you for your suggestions. We have moved the content from Lines 314-318 to the methodology section and expanded upon it. Additionally, in response to your advice, we have added Section 3.5 to the methodology section to provide an initial analysis of model differences.

Comment: Title: Consider adding the case study.

Response: Thank you very much for your suggestions. We have modified the article's title in the revised manuscript based on your guidance.

Comment: Line 18: Define acronyms BOA-RNN, BOA-LSTM, MHAM-RNN, and MHAM-LSTM

Response: Thank you very much for your comment. We have defined these acronyms, and corrected similar issues, to enhance the readability of the article in the revised manuscript.

Comment: Line 224: The letter "h" doesn't appear in previous equations.

Response: Sincere thanks for your comment. We have removed the erroneous sections in the revised manuscript.

Comment: Figures 6 and 7: include situation in China and legend with altitude of DEM

Response: Thank you for your suggestions. We have redrawn these two figures, adding region locations and altitude, in the revised manuscript.

Comment: Line 358: Delete ":"

Response: Thank you for correcting the errors in the article. We have checked the entire manuscript and made corrections in the revised manuscript.

The above responses are our reply to the reviews about Water-2683916. We look forward to hearing further information from you.

Sincerely yours,

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

This paper explores flood prediction using data from the LouDe and HuaYuankou stations in the Yellow River. Six models, including RNN, LSTM, BOA-RNN, BOA-LSTM, MHAM-RNN, and MHAM-LSTM, are compared using evaluation metrics and gated unit architecture. This study finds that RNN outperforms the other models in most cases. The research highlights that the complexity of a model's structure does not necessarily correlate with the quality of its predictions; instead, the model's architecture should be tailored to the specific characteristics of the target data. While the topic of comparing these six flood prediction models is intriguing, the paper's content structure needs improvement.

Comments

1. The content structure of the article has issues; for example, the introduction of data sources should be placed in the chapter before the model introduction. It would be better to swap Sections 2.2 and 2.3, and to introduce the model before discussing the selection of hyperparameters and parameters, without intermingling them.

2. If there are repeated phrases in the paper, it is best to name them in the following format when they first appear, such as Bayesian optimization algorithm (BOA) (lines 18~19). When the same phrase appears later in the article, you can use the abbreviation directly.

3. It is important to avoid abbreviating the phrases when they first appear, which may confuse the readers (lines 24 and 74). In lines 500, 507, 544, etc., it is generally not necessary to use ":" in the article. If you want to indicate which figure the content refers to, you can use the format “ … (Figure **)” to specify it.

4. The ratio of the training set to the validation set to the test set in line 216 is 70:25:5. The validation set is 5 times larger than the test set. Can such a small test set ensure the accuracy of the results? Alternatively, making greater use of cross-validation could evaluate the model further.

5. Some parts of the article are not described clearly enough, such as the phrase “According to the research conclusion of the paper” in line 108. It is unclear how the conclusion was reached before the experiment was conducted, and the causal sequence is not logical.

6. Some figures in the article only have titles without explanations of what the elements represent, and even lack a colorbar, such as Figures 12 and 25, etc. The analysis of the figures can be explained in the main text.

7. Why is the training set data in matrix format in section 3.3 (31043*4) and (4684*19) respectively? What do the second dimensions of 4 and 19 represent?

Author Response

Reply to questions or reviews given by Reviewer #2:

Comment: The content structure of the article has issues; for example, the introduction of data sources should be placed in the chapter before the model introduction. It would be better to swap Sections 2.2 and 2.3, and to introduce the model before discussing the selection of hyperparameters and parameters, without intermingling them.

Response: Thank you very much for your suggestions on the logical structure of the paper.

Following your suggestions, we have adjusted the order of the data and methodology sections in the article. Additionally, we have moved the explanation of hyperparameters in the methodology section to after the attention mechanism model to enhance the logical flow of the text.

However, in the subsequent discussion section of the article, since hyperparameter optimization does not change the model structure and still falls under the basic model category, we have kept Section 4.3 after Section 4.2.

Comment: If there are repeated phrases in the paper, it is best to name them in the following format when they first appear, such as Bayesian optimization algorithm (BOA) (lines 18~19). When the same phrase appears later in the article, you can use the abbreviation directly.

Response: Thank you very much for your comment. We have revised the abbreviations of technical terms in the revised manuscript to enhance its readability.

Comment: It is important to avoid abbreviating the phrases when they first appear, which may confuse the readers (lines 24 and 74). In lines 500, 507, 544, etc., it is generally not necessary to use ":" in the article. If you want to indicate which figure the content refers to, you can use the format “ … (Figure **)” to specify it.

Response: Thank you very much for your suggestions. We have added definitions where the phrases first appear to avoid reader confusion. Additionally, we have revised the use of colons.

Comment: The ratio of the training set to the validation set to the test set in line 216 is 70:25:5. The validation set is 5 times larger than the test set. Can such a small test set ensure the accuracy of the results? Alternatively, making greater use of cross-validation could evaluate the model further.

Response: Thank you very much for your valuable feedback on the paper. Our initial thoughts coincided with yours.

Just as you have considered, we also had quite a debate during our initial experiments. Regarding the dataset partitioning for the HuaYuankou station, we initially thought about increasing the proportion of the test set. However, due to the influence of watershed confluence at this station, the flood processes have an exceptionally long duration, and even 5% of the data amounts to 1552 time points, which is enough to showcase the model's performance.

Additionally, we initially conducted experiments with a larger test set. However, we found that as the test period lengthened, the NSE metric, which describes data fluctuation, had difficulty highlighting the differences in the models' performance at this site (sometimes requiring precision down to 1e-5). Furthermore, the differences in the MAE and RMSE metrics were not substantial. Hence, we ultimately opted for a 5% test set proportion.
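The split arithmetic and the three metrics discussed in this exchange can be illustrated with a short sketch. It assumes the 70:25:5 ratio is applied in time order to the 31043 formatted rows mentioned elsewhere in this record, and that NSE, MAE and RMSE follow their standard definitions. The function names are hypothetical and are not taken from the manuscript.

```python
import math

def chronological_split(n_samples, ratios=(0.70, 0.25, 0.05)):
    """Return (train, val, test) index ranges in time order, without shuffling."""
    n_test = int(n_samples * ratios[2])
    n_val = int(n_samples * ratios[1])
    n_train = n_samples - n_val - n_test  # the remainder goes to training
    return (range(0, n_train),
            range(n_train, n_train + n_val),
            range(n_train + n_val, n_samples))

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den

def mae(obs, sim):
    """Mean absolute error."""
    return sum(abs(o - s) for o, s in zip(obs, sim)) / len(obs)

def rmse(obs, sim):
    """Root mean squared error."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))

# Assumed total length: the 31043 rows quoted for one of the two datasets.
train, val, test = chronological_split(31043)
print(len(train), len(val), len(test))  # 21731 7760 1552
```

Under these assumptions the test partition holds 1552 points, which matches the figure quoted in the response. NSE divides the squared error by the spread of the observations around their mean, so two models whose errors are both small relative to that spread can produce NSE values that differ only in later decimal places.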

Comment: Some parts of the article are not described clearly enough, such as the phrase “According to the research conclusion of the paper” in line 108. It is unclear how the conclusion was reached before the experiment was conducted, and the causal sequence is not logical.

Response: Thank you for your valuable feedback. We have reworded that section in the revised manuscript to minimize any logical issues.

Comment: Some figures in the article only have titles without explanations of what the elements represent, and even lack a color bar, such as Figures 12 and 25, etc. The analysis of the figures can be explained in the main text.

Response: Thank you very much for pointing out the issues. In the revised manuscript, we have modified the figures with similar problems, added color bars, and provided explanations where needed.

Comment: Why is the training set data in matrix format in section 3.3 (31043*4) and (4684*19) respectively? What do the second dimensions of 4 and 19 represent?

Response: Thank you very much for pointing out the issue. These two matrix formats represent the entire dataset we obtained after performing data formatting. The numbers 4 and 19 represent the number of input factors in the data, while 31043 and 4684 represent the total length of the data, including the 15 time steps that were not removed. Additionally, for improved readability, we have added an explanation to the figure in the revised manuscript.
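One possible reading of the 15 time steps mentioned above is a look-back window: each sample uses the previous 15 rows as input, so the first 15 rows serve only as history. The sketch below illustrates that reading on synthetic data. The `lookback` value, the target column and the function name are assumptions for illustration, not details confirmed by the manuscript.

```python
def make_windows(rows, lookback=15, target_col=0):
    """Turn an (n_rows x n_factors) table into (window, target) pairs."""
    samples = []
    for t in range(lookback, len(rows)):
        window = rows[t - lookback:t]   # the previous `lookback` rows, all factors
        target = rows[t][target_col]    # value to predict at time t
        samples.append((window, target))
    return samples

# Synthetic table: 20 time steps, 4 input factors per step.
rows = [[float(t + f) for f in range(4)] for t in range(20)]
samples = make_windows(rows)

first_window, first_target = samples[0]
print(len(samples), len(first_window), len(first_window[0]))  # 5 15 4
```

With this construction a table of n rows yields n - 15 samples, each of shape 15 by the number of input factors (4 or 19 in the two datasets described).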

The above responses are our reply to the reviews about Water-2683916. We look forward to hearing further information from you.

Sincerely yours,

Author Response File: Author Response.pdf

Round 2

Reviewer 2 Report

Comments and Suggestions for Authors

The authors have responded diligently to my suggested revisions. I recommend accepting this paper.
