Article
Peer-Review Record

Water-Level Prediction Analysis for the Three Gorges Reservoir Area Based on a Hybrid Model of LSTM and Its Variants

Water 2024, 16(9), 1227; https://doi.org/10.3390/w16091227
by Haoran Li 1, Lili Zhang 1,2,*, Yaowen Zhang 1,2, Yunsheng Yao 1, Renlong Wang 1 and Yiming Dai 1
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 22 March 2024 / Revised: 19 April 2024 / Accepted: 23 April 2024 / Published: 25 April 2024

Round 1

Reviewer 1 Report (Previous Reviewer 3)

Comments and Suggestions for Authors

The manuscript presents an investigation into the application and performance of four distinct deep-learning models for predicting water levels in the Three Gorges Reservoir. The study highlights the importance of accurately predicting water levels, particularly in the dam area, for ensuring downstream safety and supporting economic development. The finding that the CNN-Attention-LSTM model outperforms the others on all metrics, achieving an R² of 0.9940, MAE of 0.5296, RMSE of 0.6748, and MAPE of 0.0032, is significant. Additionally, the observation of exceptional predictive accuracy for lower water levels with the CNN-LSTM model underscores the potential utility of deep learning approaches in water level forecasting. Overall, however, the writing and presentation of the article fall short of what is suitable for publication in a high-level journal like “Water”. Below are my issues and suggested revisions for the authors' reference.

General Comments:

1. The Introduction section of the manuscript falls short of the standard expected for a scientific paper, as it mainly recounts the actions of others. It is recommended that the authors restructure the section, bolster the logical flow, and improve the alignment between referenced literature and the arguments presented.

 

2. Section 2.1 is titled "Research Site," but it actually discusses much of the research significance. It is recommended to move this content to the Introduction section.

 

3. The article only provides results for predicting the next 6 hours. It is suggested to extend the forecast horizon, for example to 12, 18, or 24 hours, to test the model's performance.

 

4. “The selected data for this study covers from January 1, 2008, to February 1, 2021, with data collected every 6 hours, totaling 19,124 data points.” However, given the data collection interval of every 6 hours, the total number of data points should be significantly higher than 19,124. What is the reason for this discrepancy in data volume?

 

5. Figures 9 through 12 are repetitive and monotonous. It is suggested to combine them into one figure or change the presentation format.

 

Minor Comments:

1. MAPE is conventionally expressed as a percentage, which is evident from its calculation formula. It is recommended that the authors convert all MAPE data to percentages.

 

2. Shouldn't the rightmost part of Figure 4 be labeled as "output"?

 

3. The labels on the y-axis in Figure 5 are too small, and the three overlapping colors of data are difficult to distinguish. It is suggested to use a different plotting method.

 

4. The capitalization of every word in the legend of Figure 7 is unnecessary.

 

5. Formula 18 in the main text is not used in the validation table, and the RMSE in the table is not explained in the main text.

 

6. The usage of "%" for MAPE is inconsistent between Tables 2 and 3. It is suggested to standardize it.

 

7. "Tab" is an abbreviation for "Table" and should be written as "Tab.".

Author Response

Dear editor and reviewer:

Thank you for carefully reviewing our manuscript entitled “Water Level Prediction Analysis for the Three Gorges Reservoir Area Based on a Hybrid Model of LSTM and Its Variants” (water-2824128) and for providing constructive suggestions. We have studied the comments carefully and tried our best to revise and improve the manuscript. We hope the revised manuscript will meet the requirements of your journal. The serial numbers of the charts in the letter correspond to those in the manuscript.

Our responses to the editor’s and reviewers’ comments are listed below:

 

Response to comments and instructions of editor: Below, the original comments are in red, and our responses are in blue.

General Comments:

1) The Introduction section of the manuscript falls short of the standard expected for a scientific paper, as it mainly recounts the actions of others. It is recommended that the authors restructure the section, bolster the logical flow, and improve the alignment between referenced literature and the arguments presented.

Response: Dear reviewer, thank you for your careful review and valuable feedback on the introduction section of our paper. We understand that the introduction section should not only review existing research, but also clearly demonstrate the innovative points and scientific contributions of the research, while ensuring a close correspondence and smooth logic between the literature citations and the paper's arguments.

Based on your suggestion, we have made some modifications to the introduction section. In this revision, we have strengthened the logical connection between literature review and the objectives and contributions of this study, ensuring that each paragraph closely supports and lays the foundation for our research argument. In addition, we have also improved the relevance and accuracy of literature citations, ensuring that the cited research directly supports our research hypotheses and methodology.

We believe that this revision can improve the quality of the introduction section, better meet the standards of scientific papers, and help readers clearly understand the background, importance, and innovation of this study. Thank you again for your guidance and feedback.

  • Section 2.1 is titled "Research Site," but it actually discusses much of the research significance. It is recommended to move this content to the Introduction section.

Response: Dear reviewer, thank you for your careful review and specific suggestions on our paper. You pointed out that section 2.1 is titled "Research Site", but that much of its content actually concerns the importance and significance of the research, and you suggested moving this part to the introduction section.

Based on your suggestion, we have restructured the article and moved the content on research importance from section 2.1 to the introduction section to ensure consistency and logical coherence among all sections. Now, the introduction section more comprehensively reflects the background and importance of the research, while the "Research Site" section focuses on describing specific geographical and environmental conditions.

I believe that this structural adjustment will make the content of the paper clearer and more coherent, which will help readers better understand the background of the research and its scientific contributions. Thank you again for your valuable feedback and guidance.

  • The article only provides results for predicting the next 6 hours. It is suggested to extend the forecast horizon, for example to 12, 18, or 24 hours, to test the model's performance.

Response: Dear reviewer, thank you for your review and constructive suggestions, especially regarding extending the prediction horizon. Your feedback is of great significance for a comprehensive evaluation of the model performance in our research.

The current research mainly focuses on using multi-input single-output models to predict water levels for the next 6 hours, with the aim of optimizing the accuracy and response speed of the model for near-term hydrological events. However, as you suggested, extending the prediction horizon to 12, 18, or even 24 hours would indeed give us an opportunity to test the accuracy and robustness of the model more comprehensively on longer time scales. This would not only test the model's ability to capture long-term dependencies in time series data, but also provide more robust support for decision-making in reservoir management and flood warning systems.

To address this suggestion, we have included the extension of the predicted time range in our future research plan. We plan to explore and address in detail the challenges faced in expanding prediction in future work, such as adjusting the model structure and optimizing training strategies, in order to further enhance the application value and predictive performance of the model.

We believe that these follow-up studies will greatly enrich and improve the current research results, contributing new perspectives and depth to the development of hydrological models. Thank you again for your valuable feedback and guidance.
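As an illustration of what extending the horizon involves on the data side, the following is a minimal sketch, not the authors' pipeline. It assumes a one-dimensional series sampled every 6 hours, as described in the manuscript; the function name `horizon_targets`, its parameters, and the synthetic values are hypothetical.

```python
STEP_HOURS = 6  # sampling interval stated in the manuscript


def horizon_targets(levels, window, horizons_hours=(6, 12, 18, 24)):
    """Pair each input window with the water level at several lead times.

    levels: list of water levels sampled every STEP_HOURS hours.
    window: number of past steps fed to the model.
    Returns (windows, targets), where targets[i][k] is the level
    horizons_hours[k] hours after the last step of windows[i].
    """
    steps = [h // STEP_HOURS for h in horizons_hours]  # 6 h -> 1 step, 24 h -> 4 steps
    max_step = max(steps)
    windows, targets = [], []
    for start in range(len(levels) - window - max_step + 1):
        last = start + window - 1  # index of the most recent observation
        windows.append(levels[start:start + window])
        targets.append([levels[last + s] for s in steps])
    return windows, targets


# Synthetic values for illustration only, not reservoir data.
levels = [float(i) for i in range(12)]
windows, targets = horizon_targets(levels, window=6)
print(len(windows), targets[0])
```

Each additional lead time removes one usable window from the end of the series, so the number of training samples shrinks slightly as the horizon grows.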

  • “The selected data for this study covers from January 1, 2008, to February 1, 2021, with data collected every 6 hours, totaling 19,124 data points.” However, given the data collection interval of every 6 hours, the total number of data points should be significantly higher than 19,124. What is the reason for this discrepancy in data volume?

Response: Dear reviewer, thank you very much for carefully reviewing the details of our dataset description and raising this key question. The question you raised about the number of data points is entirely reasonable, and we appreciate your attention to this detail.

In response to your question, we have once again verified the calculation of the dataset and time range. According to our dataset, the total duration from January 1, 2008 to February 1, 2021 covers 13 years and 1 month. Data are collected 4 times a day (every 6 hours), resulting in approximately 1460 data points per year (365 days x 4 times per day). By adding up these data points, we did obtain approximately 19124 data points (13 years x 1460 + 31 days (January 2021) x 4), which is consistent with the number reported in our article.

I hope this explanation can clarify your question. Thank you again for your careful review and valuable feedback.
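The count can be checked with a short calculation. The sketch below assumes that every calendar day from January 1, 2008 through February 1, 2021, both endpoints included, contributes four readings; that assumption is not stated explicitly in the record.

```python
from datetime import date

SAMPLES_PER_DAY = 4  # one reading every 6 hours

start = date(2008, 1, 1)
end = date(2021, 2, 1)

# Assumption: both endpoint days contribute a full set of four readings.
days_inclusive = (end - start).days + 1
n_samples = days_inclusive * SAMPLES_PER_DAY

# Back-of-envelope figure that ignores leap days and the final day.
rough = 13 * 365 * SAMPLES_PER_DAY + 31 * SAMPLES_PER_DAY

print(days_inclusive, n_samples, rough)
```

Under that assumption the exact count is 4781 days x 4 = 19124 readings. The rounded formula of 13 x 1460 + 31 x 4 gives 19104; the difference of 20 readings comes from the four leap days (2008, 2012, 2016, 2020) and February 1, 2021.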

  • Figures 9 through 12 are repetitive and monotonous. It is suggested to combine them into one figure or change the presentation format.

Response: Dear reviewer, thank you very much for your feedback. Based on your feedback, we have combined Figures 9 to 12 into a single figure. The combined figure is shown below:

Thank you again for your feedback.

Minor Comments:

  • MAPE is conventionally expressed as a percentage, which is evident from its calculation formula. It is recommended that the authors convert all MAPE data to percentages.

Response: Dear reviewer, thank you for your review and guidance on our use of MAPE (Mean Absolute Percentage Error). You correctly pointed out that MAPE is usually expressed in percentage form, which can be clearly seen from its calculation formula.

Based on your suggestion, we have adjusted all MAPE data in the article to ensure that they are now presented in percentage form. This modification will make the results more intuitive and understandable, while also conforming to conventional expression habits.

We appreciate your contribution to improving the quality of the article and hope that this revision can meet your expectations. Thank you again for your careful review and valuable suggestions.
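For reference, the percentage form follows directly from the usual definition of MAPE. The symbols below ($y_i$ for the observed level, $\hat{y}_i$ for the predicted level, $n$ for the number of test samples) are notation assumed here and are not taken from the manuscript.

```latex
\mathrm{MAPE} = \frac{100\%}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right|
```

Under this definition, the fractional value of 0.0032 quoted in the review corresponds to 0.32%.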

  • Shouldn't the rightmost part of Figure 4 be labeled as "output"?

Response: Dear reviewer, thank you for your detailed review of our chart description and your suggestions. You pointed out that the rightmost part in Figure 4 should be labeled as "output", which is a very important correction.

We have made the corresponding modifications to Figure 4 as per your suggestion, ensuring that the rightmost part is now marked as "output". This change helps to clearly display the flow of data, making the information transmission in the chart more accurate and intuitive.

We greatly appreciate your careful review and substantial suggestions, which are extremely important for improving the quality of our work. Thank you again for your valuable feedback.

  • The labels on the y-axis in Figure 5 are too small, and the three overlapping colors of data are difficult to distinguish. It is suggested to use a different plotting method.

Response: Dear reviewer, thank you for providing valuable suggestions. Regarding the issue with Figure 5, we have completely remade it based on your guidance, as shown in the following figure:

Thank you again for your valuable feedback.

  • The capitalization of every word in the legend of Figure 7 is unnecessary.

Response: Dear reviewer, thank you for your specific feedback on the legend format of Figure 7. We have adjusted the capitalization of the letters in the legend based on your suggestion. Now, we have only capitalized the first letter and used lowercase for the rest to comply with regular formatting conventions.

Thank you again for your careful review and valuable feedback.

  • Formula 18 in the main text is not used in the validation table, and the RMSE in the table is not explained in the main text.

Response: Dear reviewer, thank you very much for your careful review of our article and for pointing out these specific issues. Indeed, Formula 18 was not used in the validation table, and the RMSE in the table was not explained in detail in the main text, which may make the results difficult for readers to understand.

To address this issue, we have made the following modifications:

1) Application of Formula 18: We have updated the validation table to include the results calculated using Formula 18, which can directly demonstrate the actual application effect of the formula and compare it with other results.

2) Explanation of RMSE: We have also supplemented the detailed explanation of RMSE (root mean square error) in the main text, clarifying its importance and calculation method in model evaluation. In addition, we ensure that the calculation of RMSE and the application results of formula 18 are clearly presented and discussed in the text.

Thank you again for your valuable suggestions and guidance.
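For readers comparing the metrics named in this exchange, a minimal sketch of the standard definitions of MAE, RMSE, MAPE (in percent), and R² is given below. It uses only the Python standard library. The record does not say which metric the manuscript's Formula 18 defines, so no correspondence is implied, and the sample values are synthetic.

```python
import math


def mae(obs, pred):
    """Mean absolute error, in the units of the data."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def rmse(obs, pred):
    """Root mean square error; penalises large errors more than MAE."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mape_percent(obs, pred):
    """Mean absolute percentage error, reported in percent."""
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(obs, pred)) / len(obs)


def r2(obs, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


# Synthetic water levels for illustration only.
obs = [150.0, 160.0, 170.0, 160.0]
pred = [151.0, 159.0, 171.0, 159.0]
print(mae(obs, pred), rmse(obs, pred), mape_percent(obs, pred), r2(obs, pred))
```

Because water levels are large relative to their errors, MAPE tends to be very small for this kind of data, which is one reason the unit (fraction or percent) needs to be stated consistently.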

  • The usage of "%" for MAPE is inconsistent between Tables 2 and 3. It is suggested to standardize it.

Response: Dear reviewer, thank you very much for pointing out the inconsistency in our use of MAPE percentage symbols in Tables 2 and 3. Ensuring consistency in reporting standards is crucial for readers to understand and compare data.

We have carefully reviewed these two tables and standardized the representation of MAPE values, ensuring that they are expressed as percentages in all tables and using the "%" symbol. This modification will eliminate potential confusion and improve the professionalism and clarity of data presentation.

Thank you again for your careful review and valuable feedback.

  • "Tab" is an abbreviation for "Table" and should be written as "Tab.".

Response: Dear reviewer, thank you for pointing out the issue with our improper use of abbreviations for "table" in the text. Your guidance has helped us improve the formatting standardization of the article.

Based on your suggestion, we have revised the entire text to ensure that all references to the table use the correct abbreviation "Tab.". This standardized usage will enhance the professionalism of the article and ensure consistency in formatting.

 

We appreciate your careful review and valuable suggestions.

 

The comments from the editor and reviewer are very valuable and helpful for improving our paper. We have studied the comments carefully and revised the manuscript, and we hope that we have satisfactorily addressed the comments that were raised. Thank you again for your time and effort. Please refer to the attachment for the revised manuscript.

 

Sincerely yours,

Lili Zhang

Author Response File: Author Response.pdf

Reviewer 2 Report (New Reviewer)

Comments and Suggestions for Authors

This study demonstrated the application value of deep learning models, especially the CNN-LSTM model, in predicting water levels in the Three Gorges Reservoir and showed how the attention mechanism effectively enhances the model's predictive capabilities. These findings provide new technical paths for hydrological prediction and have essential theoretical and practical significance for optimizing reservoir management and improving flood control decision support systems.

Some comments should be clarified

1) “This provides a more accurate deep learning prediction model for the upstream water level prediction of the Three Gorges Reservoir, which helps to plan hydropower generation more effectively and provides scientific basis for flood control and water resource allocation [28-31].” The authors developed a review by citing references together. Please discuss each reference separately.

2) A deep review was developed, but the authors did not explain the need for this research. Why did the authors develop this research?

3) The goals of the review should be developed.

4) Different models were presented in the methodology, but what is the novelty?

5) A flowchart should be developed to understand the research better.

6) The authors could use this scale to qualify the errors: “Calibrating a flow model in an irrigation network: Case study in Alicante, Spain. Spanish Journal of Agricultural Research (Online), 1 (15), 1 - 13. 10.5424/sjar/2017151-10144”

7) The figures should be improved because they are difficult to read. The scales should be changed and the resolution improved, including auxiliary lines.

8) All figures and tables should be discussed in comparison with other published research to show the improvement offered by this research.

9) The conclusions are vague; the authors should improve them, showing the real novelty and its applicability in a real case study and in future research.

Comments on the Quality of English Language

It is OK, with minor mistakes.

Author Response

Dear editor and reviewers:

Thank you for carefully reviewing our manuscript entitled “Water Level Prediction Analysis for the Three Gorges Reservoir Area Based on a Hybrid Model of LSTM and Its Variants” (water-2824128) and for providing constructive suggestions. We have studied the comments carefully and tried our best to revise and improve the manuscript. We hope the revised manuscript will meet the requirements of your journal. The serial numbers of the charts in the letter correspond to those in the manuscript.

Our responses to the editor’s and reviewers’ comments are listed below:

 

Response to comments and instructions of editor: Below, the original comments are in red, and our responses are in blue.

  • “This provides a more accurate deep learning prediction model for the upstream water level prediction of the Three Gorges Reservoir, which helps to plan hydropower generation more effectively and provides scientific basis for flood control and water resource allocation [28-31].” The authors developed a review by citing references together. Please discuss each reference separately.

Response: Thank you for your valuable feedback on the use of references [28-31] in the paper. I recognize that you suggest discussing each reference individually to enhance the clarity and depth of the literature review. Detailed revisions have been made in the introduction section of the paper.

Thank you once again for your valuable suggestions.

  • A deep review was developed, but the authors did not explain the need for this research. Why did the authors develop this research?

Response: Thank you very much for your review and invaluable feedback. You raised an essential question regarding the motivation for this study, which was not adequately explained at the beginning of the article. Below, I will detail the research motivation and objectives. This study is motivated by the recognition of challenges in current water resource management, particularly in large-scale hydraulic projects such as the Three Gorges Dam. Accurate water level prediction is critical for ensuring efficient operation of hydropower stations, flood safety, and protection of downstream ecosystems. However, due to the complexity of the involved dynamic processes and the uncertainty of environmental inputs, accurate water level prediction poses a significant challenge. Our reasons for choosing LSTM and its variants for our research include:

1) LSTM's Time Series Capability: Long Short-Term Memory networks (LSTM) are neural networks specifically designed for time series data, capable of capturing long-term dependencies, which are crucial for water level prediction.

2) Exploration of Optimized Variants: We have further explored LSTM variants such as BiLSTM, CNN-LSTM, and CNN-Attention-LSTM to assess their performance in capturing the complexities of time series data. BiLSTM considers both past and future information, CNN-LSTM combines the spatial feature extraction capability of Convolutional Neural Networks with LSTM’s time series analysis, and CNN-Attention-LSTM incorporates an attention mechanism to enhance model recognition of critical time steps.

3) Significance of the Three Gorges Reservoir: The Three Gorges Reservoir is among the world's largest hydropower stations and plays a pivotal role in China's energy supply and water resource management. Enhancing the accuracy of water level predictions for this project not only has direct economic benefits but also contributes to regional safety and ecological protection.

While these variant models have been utilized in other studies, research in the Three Gorges area is relatively scarce, and there are differences in model parameters. The article will include additional descriptions of these motivations and necessities in the revisions. Thank you once again for your valuable feedback.
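To make the phrase "recognition of critical time steps" concrete, here is a minimal sketch of attention pooling over a sequence of hidden states. The record does not state which attention form the authors use; dot-product scoring against a single query vector is an assumption made here for illustration, and `attention_pool`, `hidden_states`, and `query` are hypothetical names with made-up values.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Weight each time step by its similarity to a query vector.

    hidden_states: list of T vectors (e.g. recurrent outputs per time step).
    query: vector of the same length (assumed learned in a real model).
    Returns (weights, context): one weight per time step, summing to 1,
    and the weighted sum of the hidden states.
    """
    scores = [sum(h_j * q_j for h_j, q_j in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(query)
    context = [sum(w * h[j] for w, h in zip(weights, hidden_states)) for j in range(dim)]
    return weights, context


# Made-up hidden states for three time steps, two features each.
hidden_states = [[0.1, 0.0], [0.2, 0.1], [0.9, 0.8]]
query = [1.0, 1.0]
weights, context = attention_pool(hidden_states, query)
print(weights, context)
```

The time step whose hidden state aligns best with the query receives the largest weight, so it dominates the context vector passed on to the output layer.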

 

  •  The goals of the review should be developed

Response: Thank you very much for your review and valuable suggestions. We greatly appreciate your comment regarding the need for clearer objectives in our literature review.

In response to your suggestion, the document has been revised to detail the objectives of the review more clearly. Specifically, this review aims to: 1) comprehensively evaluate and compare the applications and scientific contributions of LSTM and its variants (including BiLSTM, CNN-LSTM, and CNN-Attention-LSTM) in predicting water levels in the Three Gorges Reservoir area; 2) identify the strengths and limitations of these models when processing large-scale hydrological data; 3) explore the practical application potential of these technologies in water resource management and flood warning systems.

We hope these revisions more clearly articulate the core objectives of the research and better guide the framework and content of our study. Thank you again for your meticulous review and constructive suggestions, which are crucial to improving the quality of our paper.

Thank you once again for your valuable feedback.

  • Different models were presented in the methodology, but what is the novelty?

Response: Thank you for your feedback and attention to our research methods. Although our study employs well-known deep learning models such as LSTM and its variants (BiLSTM, CNN-LSTM, CNN-Attention-LSTM), our innovation is not limited to the use of these models alone. Our main innovations and contributions are reflected in the following aspects:

Model Parameters: The parameters used in our models differ from those in previous studies, making them more suitable and applicable for predicting water levels in the upstream area of the Three Gorges Reservoir.

Multi-Model Ensemble Method: This research introduces, for the first time, an ensemble application of LSTM, BiLSTM, CNN-LSTM, and CNN-Attention-LSTM for predicting water levels in the upstream region of the Three Gorges Reservoir. This multi-model ensemble strategy effectively combines the strengths of each model, such as LSTM’s long-term memory capabilities, CNN’s spatial feature extraction, and the focused attention capabilities of the attention mechanism, thereby significantly enhancing the accuracy and robustness of the predictions.

Integration of Geographical and Meteorological Factors: The study integrates multidimensional data such as reservoir water levels, dam inflow, and rainfall in key areas to comprehensively capture the complex factors affecting water level changes. This data fusion approach is an important supplement to existing hydrological prediction models, helping to more accurately understand and predict water level fluctuations.

Innovative Application of Region-Specific Data: Selecting Badong, Zigui, and Xingshan as the primary sources of rainfall data is based on their significant impact on water level fluctuations in the Three Gorges Reservoir area. This targeted selection of data not only enhances model sensitivity but also demonstrates our innovation in data application.

We believe these innovative points not only enhance the predictive performance of the model but also open new perspectives and methods for the further development and application of hydrological models. We hope this revision clearly responds to your inquiry about the innovativeness of our research and look forward to your further guidance and suggestions.

Thank you once again for your valuable feedback.

  • A flowchart should be developed to understand better the research

Response: Firstly, thank you for your detailed review and the valuable suggestions you have provided. Regarding your recommendation to add a flowchart for a better understanding of the research process, we have given this suggestion careful consideration. After assessing the existing text and diagrams, we believe they already adequately and clearly describe the methods and steps of our research. Adding a flowchart might lead to redundancy and could potentially compromise the compactness of the paper.

To address your concern about understanding the research process, we have tightened the relationships within the text and enhanced the logic and sequence of each step to ensure that readers can clearly comprehend our research methods and process.

We hope this approach will avoid redundancy while still meeting your expectations for clarity and understandability of the article. Please consider our proposal and we look forward to your further guidance and feedback.

Thank you once again for your attention to our work and your invaluable advice.

  • Authors could use this scale to qualify the errors “Calibrating a flow model in an irrigation network: Case study in Alicante, Spain. Spanish Journal of Agricultural Research (Online), 1 (15), 1 - 13. 10.5424/sjar/2017151-10144”

Response: Firstly, thank you for your careful review and in-depth attention to our research. You suggested referring to the evaluation criteria in "Calibrating a Flow Model in an Irrigation Network: Case Study in Alicante, Spain". We have carefully considered this suggestion and acknowledge the importance and effectiveness, in specific fields, of the indicators used in that study, such as the Nash-Sutcliffe coefficient E, the relative square root error (RRSE), and the percentage deviation (PBIAS).

However, based on the specific needs and conditions of the study, R², MAE, RMSE, and MAPE were chosen as the main performance evaluation indicators. These indicators have not only been widely applied in various hydrological and environmental model evaluations, but also provide sufficient support for a comprehensive evaluation of model predictive performance. They have proven effective for assessing prediction accuracy and analyzing errors.

In order to demonstrate respect for the aforementioned literature and recognition of their contributions, the applicability and potential advantages of these evaluation criteria are cited and discussed in this article. We also plan to adopt this method in our upcoming article to further explore and verify its application effectiveness in different hydrological scenarios. This will provide readers with a comprehensive perspective on different evaluation methods, while emphasizing the rationality of our selection of current evaluation indicators.

Thank you again for your valuable suggestion.
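For context, the three indicators named in this response can be sketched as follows. These are the commonly used textbook forms, not necessarily the exact definitions or rating scale of the cited paper. RRSE is implemented here as the root relative squared error, and PBIAS uses the convention in which positive values indicate underestimation; both choices are assumptions, and the sample values are synthetic.

```python
import math


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the mean of obs."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den


def rrse(obs, sim):
    """Root relative squared error (assumed form): error relative to predicting the mean."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return math.sqrt(num / den)


def pbias(obs, sim):
    """Percent bias; with this sign convention, positive means the model underestimates."""
    return 100.0 * sum(o - s for o, s in zip(obs, sim)) / sum(obs)


# Synthetic values for illustration only.
obs = [150.0, 160.0, 170.0, 160.0]
sim = [149.0, 159.0, 169.0, 159.0]
print(nse(obs, sim), rrse(obs, sim), pbias(obs, sim))
```

In this example the simulated levels are uniformly 1 unit low, so NSE stays high while PBIAS reports a small positive bias that MAE or RMSE alone would not reveal.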

  • The figures should be improved because they are difficult to read. The scales should be changed and the resolution improved, including auxiliary lines.

Response: Thank you for your careful review and valuable suggestions on our work. We apologize for the inconvenience that the poor readability of the figures caused.

Based on your feedback, thorough modifications have been made to graphics with poor readability and insufficient clarity. Specific improvement measures include adjusting the scale and improving the graphics resolution. We are confident that these improvements will significantly improve the clarity and overall quality of graphics, making them easier to understand and analyze.

Thank you again for your feedback and guidance.

  • All figures and tables should be discussed in comparison with other published research to show the improvement offered by this research.

Response: Thank you very much for reviewing our paper and providing constructive feedback. You suggest that we compare all charts and tables with published research to highlight improvements and innovations in this study. We understand the importance of this comparison, especially in verifying the effectiveness of new methods in scientific research.

However, we face some specific challenges that make direct comparison less feasible:

The uniqueness of the research environment: The study focuses on the head area of the Three Gorges Reservoir, a region with unique geographical location, climate conditions, and socio-economic background. The uniqueness of this specific environment makes it difficult to accurately reflect the impact of environmental factors on research results when compared directly with studies in other regions.

Methodological innovation: The study adopted a series of innovative research methods, which have been applied in other fields (such as water quality prediction, electricity prediction, etc.), but there is relatively little research on predicting the water level of large reservoirs. Therefore, the application and effectiveness of these methods have certain pioneering significance in this field.

The incomparability of data: Due to differences in the collection methods, time spans, and processing techniques of datasets, directly comparing our data with data from other studies may not be able to fairly evaluate the strengths and weaknesses of each study.

Given these considerations, the research plan will introduce comparisons with other studies in the discussion section, but will mainly focus on explaining why current methods and data are particularly crucial for this research question, rather than conducting direct quantitative comparisons. This approach not only highlights the uniqueness of our research, but also effectively demonstrates the scientific value and practical significance of our research.

We greatly appreciate your feedback, which is of great significance for us to improve our paper.

  • The conclusions are vague; the authors should improve them, showing the real novelty and its applicability in a real case study and in future research.

Response: Thank you for your valuable suggestions and criticism of our conclusion section. We recognize the importance of clarity and specificity in the conclusion section for showcasing research results, and understand the need to more clearly articulate the innovative points of the research and its practical application value.

Based on your suggestion, the conclusion section has been rewritten to present the core contributions and future application prospects of this study more clearly. Please refer to the conclusion section in the article for specific content.

Thank you again for your feedback.

 

The comments from the editor and reviewer are very valuable and helpful for improving our paper. We have studied the comments carefully and revised the manuscript, and we hope that we have satisfactorily addressed the comments that were raised. Thank you again for your time and effort. Please refer to the attachment for the revised manuscript.

 

Sincerely yours,

Lili Zhang

 

Author Response File: Author Response.pdf

Reviewer 3 Report (New Reviewer)

Comments and Suggestions for Authors

 

Review of the manuscript “Water Level Prediction Analysis for the Three Gorges Reservoir Area Based on a Hybrid Model of LSTM and Its Variants” by Li Haoran, Zhang Lili, Zhang Yaowen, Yao Yunsheng, Wang Renlong and Dai Yiming.

 

The manuscript presents the results of training three different deep learning model structures for predicting the water level of the Three Gorges Reservoir 6 hours in advance, given the information measured in the 36-hour period that precedes the prediction. In the article, the authors show that including an attention layer after the LSTM block improves the performance of the prediction.

However, some unclear parts need to be clarified:

1.- If I understood correctly, the available information is the water level, Q of Eq 14, and the precipitation in Badong, Zigui, and Xingshan. This gives the 5 channels of dimension 2 of line 297. The number 19124 of dimension 0 of line 291 corresponds to the number of available times (4780 days, with 4 observations per day). But I’m not sure what the 6 of dimension 1 (sliding window) is. Is it that X[i,0,:] is the available information, X[i,1,:]=X[i-1,0,:], X[i,2,:]=X[i-2,0,:], and so on?

2.- In the 2.2.3. Model Design section, the text first states that the input variables to the model cover the past 48 hours, but later states that a 36-hour period is used. For the purpose of the following comments, I’ll assume that there is a mistake and the correct value is 36.

3.- It is not clear how the predicted variable is arranged in terms of dimensions. Is it a many-to-many structure or a many-to-one structure? I understand that the input data was split into blocks of [36,6,5], such that 64 of these blocks are passed per batch. However, this input block contains, in channel 0 of dimension 2, the water level, which is the predicted variable. Therefore, I’m not sure whether the predicted variable is a block of [36,1] containing the shifted time series of the water levels, such that the last element holds the water level for the next 6 hours, or whether the output of the model is a single value of the water level for the next 6 hours.

4.- I assume that the input block of [36,6,5] passes to the CNN layer. Is the convolution applied at each time (convolution over a 6x5 matrix) or over the whole input block?

5.- Fig 4 suggests that the width (number of times) of each layer is not equal to the width of the input block (36 times). Also, there is an “expand node” layer, so the number of times that enter the LSTM layer is not equal to 36. Then, Fig 4 suggests that there is a reduction in the number of times passed to the last LSTM layer, and I’m not sure how this is done. Some operations are missing, or the figure should be revised to avoid this confusion about the width of the block.

6.- A summary table with the dimension of the input block to each layer in the model is needed.

7.- Is Eq 15 correct? For the precipitation, Xmin=0, so for days without precipitation, X=0 and Xnorm is NaN.

8.- One major comment concerns the dataset division. The entire data set was split following an 8:1:1 ratio. Does this mean that the total number of [36,6,5] blocks and the corresponding outputs (something like 19088 blocks, 19124-36) was randomly split, or that the first 80% of the total time series was used for training while the last 20% was used for validation and test? If the blocks were randomly assigned to each of these sub-datasets (like de la Fuente et al (2019)*), the train, validation and test datasets are not independent of each other, so the actual performance of the model is not quantified. In particular, if the block that predicts the water level for t=T is in the validation or test dataset, it is very likely that the blocks for predicting t=T+6h and t=T-6h are in the train dataset, so an important amount of the information that is passed for predicting t=T was used for training the model. If this is the case, the models must be re-trained by splitting the entire data set into three independent sets without overlapping information.

9.- It is not clear how Figures 9 to 13 were computed, because the water level for the past hours enters as input for the prediction. Was the entire annual evolution constructed from input blocks that use the actual measurements of the water level, or does the input water level for time t correspond to the predicted water level for time t-1?

10.- Which library and language were used? (TensorFlow, Keras, PyTorch)

 

* de la Fuente, A.; Meruane, V.; Meruane, C. Hydrological Early Warning System Based on a Deep Learning Runoff Model Coupled with a Meteorological Forecast. Water 2019, 11, 1808. https://doi.org/10.3390/w11091808

Author Response

Dear editor and reviewer:

Thank you for carefully reviewing our manuscript entitled “Water Level Prediction Analysis for the Three Gorges Reservoir Area Based on a Hybrid Model of LSTM and Its Variants, water-2824128 ” and for providing constructive suggestions. We have studied the comments carefully and tried our best to revise and improve the manuscript. We hope the revised manuscript will meet the requirements of your journal. The serial numbers of the charts in the letter correspond to those in the manuscript.

Our responses to the editor’s and reviewers’ comments are listed below:

 

Response to comments and instructions of editor: Below, the original comments are in red, and our responses are in blue.

1)- If I understood correctly, the available information is the water level, Q of Eq 14, and the precipitation in Badong, Zigui, and Xingshan. This gives the 5 channels of dimension 2 of line 297. The number 19124 of dimension 0 of line 291 corresponds to the number of available times (4780 days, with 4 observations per day). But I’m not sure what the 6 of dimension 1 (sliding window) is. Is it that X[i,0,:] is the available information, X[i,1,:]=X[i-1,0,:], X[i,2,:]=X[i-2,0,:], and so on?

Response: Thank you for your careful review and valuable feedback. Regarding your question about data structure and sliding window settings, we will provide further explanation here in order to better illustrate our method and data processing flow.

You have correctly understood the dimension setting of our data: the 19124 samples with dimension 0 reflect 4 observation records per day for 4780 days. These records include water level and rainfall in three regions (Badong, Zigui, and Xingshan), as well as runoff flow data in front of the dam, forming five characteristic channels in dimension 2.

Regarding your inquiry about dimension 1, the number 6 does indeed represent the size of the sliding window. In our model, each input sample utilizes data from the past 36 hours (i.e. 6 time points, recorded every 6 hours) for prediction. Specifically, X[i,0,:] contains the most recent observation point data, X[i,1,:] contains the data from the previous time point, and so on until X[i,5,:] contains the data from 36 hours ago. This data structure design is aimed at enabling the model to capture the trends of water level and rainfall over time, thereby improving the accuracy and robustness of predictions.

We chose this sliding window size based on preliminary experimental results, which helped the model achieve good performance on the current dataset. The specific selection and adjustment of sliding windows are based on the performance evaluation of the model on the validation set, to ensure that the model can capture key time series characteristics.
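
For readers of this record, the window layout described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `make_windows`, the toy data, and the assumption that the water level is feature column 0 are all hypothetical. Each sample holds 6 time steps of 5 features, ordered with the most recent step at index 0, and is paired with the water level one step (6 hours) later.

```python
def make_windows(series, window=6, target_col=0):
    """Build (window, n_features) samples with the most recent step first.

    series: list of per-time-step feature lists, one entry every 6 hours.
    Returns (X, y) where X[i][k] is the observation k steps before the
    latest input step, and y[i] is the target column one step later.
    """
    X, y = [], []
    # t is the index of the latest observation in each window.
    for t in range(window - 1, len(series) - 1):
        block = [series[t - k] for k in range(window)]  # newest first
        X.append(block)
        y.append(series[t + 1][target_col])  # value 6 hours ahead
    return X, y


# Toy series: 10 time steps, 5 features each (values are made up).
toy = [[float(t)] * 5 for t in range(10)]
X, y = make_windows(toy)
print(len(X), len(X[0]), len(X[0][0]))  # samples, window length, features
print(X[0][0][0], X[0][5][0], y[0])     # newest input, oldest input, target
```

With this indexing, consecutive samples share five of their six time steps, which is why the way samples are assigned to training, validation, and test sets matters.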

2)- In the 2.2.3. Model Design section, the text first states that the input variables to the model cover the past 48 hours, but later states that a 36-hour period is used. For the purpose of the following comments, I’ll assume that there is a mistake and the correct value is 36.

Response: Dear reviewer, thank you very much for pointing out the inconsistencies in our paper. The description you mentioned about the time span of input variables is indeed incorrect. We appreciate your attention and correction.

The description of 48 hours as an input variable mentioned in section 2.2.3 "Model Design" is incorrect. In fact, our model uses 36 hours of data as input. This error was an oversight that occurred during the writing process, and we will correct it in the final version of the paper to ensure that all relevant descriptions consistently point to the 36 hour time window.

Thank you again for your careful review and valuable feedback. We will carefully review the entire paper to avoid such errors from happening again and ensure the quality and accuracy of the paper. Looking forward to your further guidance and suggestions.

3)- It is not clear how the predicted variable is arranged in terms of dimensions. Is it a many-to-many structure or a many-to-one structure? I understand that the input data was split into blocks of [36,6,5], such that 64 of these blocks are passed per batch. However, this input block contains, in channel 0 of dimension 2, the water level, which is the predicted variable. Therefore, I’m not sure whether the predicted variable is a block of [36,1] containing the shifted time series of the water levels, such that the last element holds the water level for the next 6 hours, or whether the output of the model is a single value of the water level for the next 6 hours.

Response: Dear reviewer, thank you for your valuable feedback and questions, which will help us clarify the predictive structure of the model. We provide a detailed explanation below regarding the dimensions and arrangement of the predicted variable you mentioned.

In our research, the structure of the model is actually many-to-one. Specifically, although our input data is processed in blocks of 36 hours (6 time steps, each containing 5 features), the output goal of the model is to predict the water level within the first 6 hours after the last input time step. Therefore, the output of the model is a single value, not a time series.

This means that although the input data contains multiple time steps (each time step is 6 hours, for a total of 36 hours), the model does not output the water level for each hour of the next 6 hours, but rather outputs the predicted water level at a specific time point during these 6 hours. This structural selection is based on considerations of model performance and prediction task requirements, aiming to provide concise and effective water level prediction.

We will clarify this point more clearly in the paper to avoid any possible misunderstandings and ensure the clarity and accuracy of the paper. Thank you again for your feedback.

4)- I assume that the input block of [36,6,5] passes to the CNN layer. Is the convolution applied at each time (convolution over a 6x5 matrix) or over the whole input block?

Response: Dear reviewer, thank you for your detailed review and questions regarding the model design in our paper. Regarding the specific inquiry about the application of convolutional neural network (CNN) layers, the following is a detailed explanation of our processing method.

In our model, the input data is organized into data blocks with a shape of [6,5], where 6 represents consecutive time steps, each time step represents a 6-hour data interval, and 5 represents the number of features at each time step. We use one-dimensional convolution (Conv1D), which is designed to handle this multi-time-step data structure.

In the CNN layers, convolution is performed on the entire input block rather than on each time step separately. This means that the convolution operation spans all six time steps, enabling the capture of local time-dependent features in the time series data. In this process, each convolution kernel slides along the time dimension while covering all five features, effectively extracting key information from each segment of the time series.
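
To make the sliding behaviour concrete, the following is a small pure-Python sketch of a single-filter one-dimensional convolution over a 6 x 5 block. It is illustrative only and not the authors' implementation: the helper name `conv1d_valid`, the kernel size of 3, the absence of padding, and all numeric values are assumptions. As in deep learning frameworks, the operation is a sliding weighted sum (cross-correlation) along the time axis, and every output value mixes all five features.

```python
def conv1d_valid(block, kernel, bias=0.0):
    """Single-filter 1D convolution along time with no padding.

    block:  list of time steps, each a list of n_features values (e.g. 6 x 5).
    kernel: list of kernel_size rows, each holding n_features weights.
    Every output value mixes all features over kernel_size adjacent steps.
    """
    k = len(kernel)
    out = []
    for start in range(len(block) - k + 1):  # slide along the time axis only
        total = bias
        for dt in range(k):
            for f, w in enumerate(kernel[dt]):
                total += w * block[start + dt][f]
        out.append(total)
    return out


# Toy 6 x 5 input block and a 3 x 5 kernel of ones (made-up values).
block = [[float(t)] * 5 for t in range(6)]
kernel = [[1.0] * 5 for _ in range(3)]
print(conv1d_valid(block, kernel))  # 6 - 3 + 1 = 4 output time steps
```

Under these assumptions the time dimension shrinks from 6 to 4 after the convolution, which is the kind of width change that the reviewer's next comment on Fig 4 asks about.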

Thank you again for your feedback.

5)- Fig 4 suggests that the width (number of times) of each layer is not equal to the width of the input block (36 times). Also, there is an “expand node” layer, so the number of times that enter the LSTM layer is not equal to 36. Then, Fig 4 suggests that there is a reduction in the number of times passed to the last LSTM layer, and I’m not sure how this is done. Some operations are missing, or the figure should be revised to avoid this confusion about the width of the block.

Response: Dear reviewer, thank you very much for your specific feedback on Figure 4 and the questions you raised, which are crucial for ensuring the clarity and accuracy of our research. We have thoroughly redrawn the figure to address the issues you pointed out, namely the inconsistent layer widths and how the time width of the data is reduced as it passes through different layers, so that all information is expressed accurately and is easy to understand.

 

We hope that these modifications can address your concerns and eliminate any confusion that may arise from the original graphics. Thank you again for your careful review and suggestions.

6)- A summary table with the dimension of the input block to each layer in the model is needed.

Response: Dear reviewer, thank you very much for your in-depth review and suggestions on our paper. You suggest that we provide a summary table that lists the input block sizes for each layer in the model, so that readers can better understand the structure of the model.

Based on your suggestion, we have created a detailed table. We hope that the newly added table can meet your requirements and help readers better grasp the detailed structure of the model and the details of data flow. Thank you again for your valuable suggestion. We look forward to your further guidance.

| Model Type | Input Dimensions | Output Dimensions | Key Layer Composition | Optimizer |
|---|---|---|---|---|
| LSTM | (epoch,6,5) | (epoch,1) | 2x LSTM, 1x Dense | TensorFlow Adam with learning rate 0.0001 |
| BiLSTM | (epoch,6,5) | (epoch,1) | 2x BiLSTM, 1x Dense | TensorFlow Adam with learning rate 0.0001 |
| CNN-LSTM | (epoch,6,5) | (epoch,1) | Conv1D, MaxPooling1D, 2x LSTM, 1x Dense | TensorFlow Adam with learning rate 0.0001 |
| CNN-Attention-LSTM | (epoch,6,5) | (epoch,1) | Conv1D, MaxPooling1D, Attention, 2x LSTM, 1x Dense | TensorFlow Adam with learning rate 0.0001 |

 

7)- Is Eq 15 correct? For the precipitation, Xmin=0, so for days without precipitation, X=0 and Xnorm is NaN.

Response: Dear reviewer, thank you for reviewing our research work and pointing out the errors in formula (15). We deeply apologize for this and thank you for your careful correction. You correctly pointed out that in the absence of rainfall, the original formula would cause the denominator to be zero due to X_min=0, resulting in the calculation result of X_norm being NaN. This is a serious technical error that has affected the accuracy and reliability of our model. Based on your guidance, we have revised the formula. The new normalization method will be represented as follows:

 

Thank you again for your feedback.
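
The revised formula is not reproduced in this record. Purely as an illustration, and assuming the intended scheme is standard min-max scaling (an assumption that the source does not confirm), the sketch below shows that a dry-period value of zero maps to 0 rather than NaN whenever the maximum exceeds the minimum. The only degenerate case is a constant series, which the sketch guards explicitly. The helper name `min_max_normalize` and the sample values are hypothetical.

```python
def min_max_normalize(values):
    """Scale values to [0, 1] via (x - x_min) / (x_max - x_min).

    Assumes standard min-max scaling; a constant series (zero range)
    is mapped to all zeros instead of dividing by zero.
    """
    x_min, x_max = min(values), max(values)
    span = x_max - x_min
    if span == 0:
        return [0.0 for _ in values]
    return [(x - x_min) / span for x in values]


# Made-up precipitation record with dry periods (zeros).
rain = [0.0, 0.0, 2.5, 10.0, 0.0, 5.0]
print(min_max_normalize(rain))
```

In practice the minimum and maximum would normally be taken from the training portion only and reused for validation and test data, so that no information from later periods enters the scaling.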

8)- One major comment concerns the dataset division. The entire data set was split following an 8:1:1 ratio. Does this mean that the total number of [36,6,5] blocks and the corresponding outputs (something like 19088 blocks, 19124-36) was randomly split, or that the first 80% of the total time series was used for training while the last 20% was used for validation and test? If the blocks were randomly assigned to each of these sub-datasets (like de la Fuente et al (2019)*), the train, validation and test datasets are not independent of each other, so the actual performance of the model is not quantified. In particular, if the block that predicts the water level for t=T is in the validation or test dataset, it is very likely that the blocks for predicting t=T+6h and t=T-6h are in the train dataset, so an important amount of the information that is passed for predicting t=T was used for training the model. If this is the case, the models must be re-trained by splitting the entire data set into three independent sets without overlapping information.

Response: Dear reviewer, thank you for raising important questions about dataset partitioning. You have correctly pointed out the importance of independence in dataset partitioning for evaluating model performance. In our research, we took specific measures to ensure the independence between datasets and avoid potential data leakage issues.

When dividing the dataset, we did not choose a random segmentation method, but instead adopted a partitioning strategy based on time series order. Specifically, we first divide the dataset into continuous blocks in chronological order, then use the first 80% of the data as the training set, followed by 10% as the validation set, and the last 10% as the testing set. This method ensures the temporal continuity of the data in the training, validation, and testing sets, and prevents data blocks near time points from being segmented into different datasets.

In addition, we also ensure that the data blocks within each window are complete and not broken in the application of sliding windows. This means that for each prediction point t=T, all historical data blocks it relies on come from the same dataset, ensuring that the model does not come into contact with any information from the validation or testing sets during the training process.
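
The strategy described above can be sketched as follows. This is an illustrative example rather than the authors' code: the function names `chronological_split` and `windows_within`, the toy series, and the choice to build windows separately inside each segment are assumptions made for the sketch. The series is cut in time order at 8:1:1, and windows are then formed within each segment so that no window straddles a boundary.

```python
def chronological_split(series, ratios=(8, 1, 1)):
    """Split a time-ordered list into train/validation/test without shuffling."""
    total = sum(ratios)
    n = len(series)
    n_train = n * ratios[0] // total
    n_val = n * ratios[1] // total
    train = series[:n_train]
    val = series[n_train:n_train + n_val]
    test = series[n_train + n_val:]
    return train, val, test


def windows_within(segment, window=6):
    """Build (input window, next value) pairs using only this segment."""
    pairs = []
    for end in range(window, len(segment)):
        pairs.append((segment[end - window:end], segment[end]))
    return pairs


# Toy water-level series of 100 steps (values are just time indices).
levels = list(range(100))
train, val, test = chronological_split(levels)
train_pairs = windows_within(train)
test_pairs = windows_within(test)
print(len(train), len(val), len(test))
print(train_pairs[-1], test_pairs[0])
```

The cost of this design is that the first `window` values of each segment serve only as inputs and receive no prediction of their own, which is usually negligible for a long record.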

Thank you again for your feedback.

9)- It is not clear how Figures 9 to 13 were computed, because the water level for the past hours enters as input for the prediction. Was the entire annual evolution constructed from input blocks that use the actual measurements of the water level, or does the input water level for time t correspond to the predicted water level for time t-1?

Response: Dear reviewer, thank you for your detailed review of our research and for your questions regarding the generation process of Figures 9 to 13. We understand that this part of the content may not have been described clearly enough in the original manuscript. We elaborate below on how these figures were calculated and constructed.

In our model, we used the sliding window method to construct input features, where the input block for each predicted point includes actual water level measurements from the past few hours. Specifically, for any prediction time t, the input of the model includes water level data from t-N to t-1, where N is the size of the window. Therefore, the model predicts the water level at time t, while the input data is the actual measured water level up to time t-1.

Figures 9 to 13 show the comparison between the predicted water level by the model and the actual measured water level at the same time point. The purpose of these charts is to demonstrate the performance of our predictive model over different time periods and how the model can accurately capture the trend of water level changes over time. In order to construct these charts, we collected the predicted results of the model and the actual water level data at the corresponding time points. The two were overlaid and displayed using plotting software, allowing for a visual comparison of prediction accuracy and actual values.
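
The distinction at issue can be illustrated with a toy example. The sketch below is not the authors' model: `toy_model` is a hypothetical stand-in (last value plus last change), and the water levels are made up. `one_step_forecasts` corresponds to the procedure described above, in which every input window contains measured values only, while `recursive_forecasts` shows the long-run alternative in which each prediction is fed back as an input.

```python
def toy_model(window):
    """Hypothetical stand-in for a trained model: last value plus last change."""
    return window[-1] + (window[-1] - window[-2])


def one_step_forecasts(observed, window=6):
    """Every forecast uses measured values only."""
    return [toy_model(observed[t - window:t]) for t in range(window, len(observed))]


def recursive_forecasts(observed, window=6):
    """Long-run alternative: forecasts are fed back as inputs after the start."""
    history = list(observed[:window])
    preds = []
    for _ in range(window, len(observed)):
        nxt = toy_model(history[-window:])
        preds.append(nxt)
        history.append(nxt)  # the prediction replaces the measurement
    return preds


# Made-up water levels: a rise followed by a fall.
obs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 5.0, 4.0, 3.0, 2.0]
print(one_step_forecasts(obs))
print(recursive_forecasts(obs))
```

In this toy case the one-step forecasts follow the turn in the observations, whereas the recursive forecasts keep extrapolating the initial rise, so the two procedures can produce very different curves and it matters which one a figure shows.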

Thank you again for your feedback.

10)- Which library and language were used? (TensorFlow, Keras, PyTorch)

Response: Dear reviewer, thank you for your review and attention to our work. Regarding the technical details you raised, our model was developed in the Python programming language, using the TensorFlow library and its Keras API to build and train the model.

Specifically, we utilized various neural network structures in the model, including Convolutional Neural Networks (CNN), Long Short Term Memory Networks (LSTM), and Attention Mechanisms (Attention), all of which were implemented through the functional modules of TensorFlow and Keras. This method of combining different types of neural networks aims to enhance the model's processing ability for time series data, especially when performing complex multivariate time series prediction tasks such as water level prediction.

The model compilation and training process was also carried out entirely within this framework. We used the Adam optimizer for model training, with mean squared error (MSE) as the loss function, as specified in the code.

Thank you again for your feedback.

* de la Fuente, A.; Meruane, V.; Meruane, C. Hydrological Early Warning System Based on a Deep Learning Runoff Model Coupled with a Meteorological Forecast. Water 2019, 11, 1808. https://doi.org/10.3390/w11091808

Response: Thank you for your constructive feedback and for recommending the article by de la Fuente et al. (2019) on the hydrological early warning system. We agree that this reference is highly relevant to our work and have duly incorporated it into our manuscript to strengthen our discussion on similar systems. We appreciate your insightful suggestion, which has enriched our study.

 

The comments from the editor and reviewer are very valuable and helpful for improving our paper. We have studied the comments carefully and revised the manuscript, and we hope that we have satisfactorily addressed the comments that were raised. Thank you again for your time and effort. Please refer to the attachment for the revised manuscript.

 

Sincerely yours,

Lili Zhang

 

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report (Previous Reviewer 3)

Comments and Suggestions for Authors

The authors have offered thorough and thoughtful responses to my inquiries, and I am content with their answers. I suggest accepting this study.

Author Response

Dear reviewer,

Thank you very much for reviewing our research and providing constructive suggestions. We are pleased to hear that you are satisfied with our response to your previous inquiry and appreciate your recommendation to accept our paper.

We greatly appreciate the detailed feedback you provided during the review process, which has greatly helped to improve the quality and presentation of our research. We will continue to work hard to ensure the scientific and rigorous nature of our research results, and hope to receive your support and guidance in future research.

Thank you again for your positive feedback and recommendation.

Sincerely yours,
Lili Zhang

Reviewer 2 Report (New Reviewer)

Comments and Suggestions for Authors

The authors clarified the various suggestions from the first review.

Comments on the Quality of English Language

It is correct

Author Response

Dear reviewer,
Thank you very much for reviewing our research and providing constructive suggestions. We are pleased to hear that you are satisfied with our response to your previous inquiry.
We greatly appreciate the detailed feedback you provided during the review process, which has greatly helped to improve the quality and presentation of our research. We will continue to work hard to ensure the scientific and rigorous nature of our research results, and hope to receive your support and guidance in future research.
Thank you again for your positive feedback and recommendation.
Sincerely yours,
Lili Zhang

Reviewer 3 Report (New Reviewer)

Comments and Suggestions for Authors

I'd like to thank the authors for their complete response to all of my previous comments.

I have just one small comment that needs to be addressed in the revised and final version of the manuscript. It is important to clarify in the text that Figures 9 and 10 were made with the actual observations and are not a long run of the model, in which the prediction for time t is used as input for predicting time t+6h.

 

Author Response

Dear editor and reviewer:

Thank you for carefully reviewing our manuscript entitled “Water Level Prediction Analysis for the Three Gorges Reservoir Area Based on a Hybrid Model of LSTM and Its Variants, water-2824128 ” and for providing constructive suggestions. We have studied the comments carefully and tried our best to revise and improve the manuscript. We hope the revised manuscript will meet the requirements of your journal. The serial numbers of the charts in the letter correspond to those in the manuscript.

Our responses to the editor’s and reviewers’ comments are listed below:

 

Response to comments and instructions of editor: Below, the original comments are in red, and our responses are in blue.

I have just one small comment that needs to be addressed in the revised and final version of the manuscript. It is important to clarify in the text that Figures 9 and 10 were made with the actual observations and are not a long run of the model, in which the prediction for time t is used as input for predicting time t+6h.

Response: Dear reviewer, first, I would like to express my sincere gratitude for the valuable feedback you provided during the review process. We highly value your feedback and have revised the manuscript based on your suggestions.

In response to your comment on the need to clarify the data sources for Figures 9 and 10, we have provided a detailed explanation in the corresponding section of the paper. Specifically, we have added the following content to enhance the clarity and accuracy of the explanation:

"In this study, the actual values presented in Figures 9 and 10 originate from real observational data collected at hydrological monitoring stations. The forecast values observed are not generated through extended model runs; that is, this study did not use the predictions from a previous time point as inputs to continuously forecast future states. Each data point shown is independent of model predictions and directly reflects the actual observations."

 

We hope that this change can clearly explain the source and usage of the data, ensuring that readers can correctly understand the information displayed in the chart. We appreciate your careful review and suggestions, which have played a crucial role in improving the quality of our research.

Thank you again for your feedback.

 

The comments from the editor and reviewer are very valuable and helpful for improving our paper. We have studied the comments carefully and revised the manuscript, and we hope that we have satisfactorily addressed the comments that were raised. Thank you again for your time and effort.

 

 

Sincerely yours,

Lili Zhang

 

 

This manuscript is a resubmission of an earlier submission. The following is a list of the peer review reports and author responses from that submission.


Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

 

This is an interesting paper that is worth publishing. However, there are some points that the authors should take into account:

 

1) Since they discuss ML methods, they should add some references related to methods such as symbolic regression:

 https://doi.org/10.1029/2020WR027385

https://doi.org/10.1007/s11831-023-09922-z

 

2) Deep learning methods are very interesting, and since time series are employed, it could also be of interest to refer to methods that include information from dynamical system analysis in order to take into account special characteristics of time series such as chaotic behavior (see for example DOI 10.1007/s00521-021-06266-2, https://doi.org/10.1016/j.chaos.2023.113971).

 

3) In all the graphs, the predictions present larger and higher-frequency fluctuations compared to the actual values. Can the authors comment on this?

 

Reviewer 2 Report

Comments and Suggestions for Authors

Please find attached

Comments for author File: Comments.pdf

Comments on the Quality of English Language

Minor English check required.

Author Response

Please check the attachment. 

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

The water level of the Three Gorges Hydropower Station directly influences hydroelectric power generation, flood control, navigation, and ecological preservation. Additionally, its significance for the safety and regional economic development of downstream areas cannot be overstated, making the chosen topic of the article meaningful. However, it is evident that there is much room for improvement in various aspects of the writing. The overall structure of the article needs adjustment, as it currently does not resemble a scientific paper but rather a project proposal. It is recommended that the authors read more literature to gain a better understanding of the structure typical of scientific papers.

Furthermore, the content of the article requires clarification in several areas. The introduction of the four artificial intelligence algorithms is superficial, and there is a lack of detailed explanations regarding the configuration of specific models and the selection of hyperparameters. This leaves readers struggling to gain specific knowledge from the author's work. Numerous low-level errors are present in the article, such as in Formula 15, where its range is evidently not confined to 0—1. Figures 11 and 12 lack citation and introduction in the main text, and the labeling of Figure 13 appears in both the 415th and 410th lines. Given the multitude of issues and details that need attention, it seems that a substantial amount of time will be required for the necessary revisions. Thus, I recommend rejecting the current submission with an invitation to resubmit after significant improvements have been made. 

Author Response

Please check the attachment.

Author Response File: Author Response.pdf
