Article
Peer-Review Record

Overflow Capacity Prediction of Pumping Station Based on Data Drive

Water 2023, 15(13), 2380; https://doi.org/10.3390/w15132380
by Tiantian Guo, Jianzhuo Yan, Jianhui Chen and Yongchuan Yu *
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3:
Submission received: 25 April 2023 / Revised: 16 June 2023 / Accepted: 26 June 2023 / Published: 28 June 2023

Round 1

Reviewer 1 Report

Please refer to the attachment for details

 

Comments for author File: Comments.pdf


Author Response

Response to Reviewer 1 Comments

 

Point 1: Pay attention to the use of tenses, the use of articles and complex words in the article.

Response 1: I have revised lines 18, 26, 116, 119, 120, 312, and others in the paper. Detailed modifications can be found in the attached paper.

 

Point 2: Can the overflow capacity prediction model proposed in this paper be combined with a 3D model for coupled calculation?

Response 2: A 3D model must reproduce the real operating conditions of the pumping station at a given scale, including channel geometry, hydraulic characteristics, and gate parameters, and it places high demands on the physical model and the geometric modeling. For example, Guan Guanghua and others used 3D numerical simulation to analyze gate flow characteristics. This article mainly explores whether data-driven methods are also effective for predicting flow: by analyzing and processing a large amount of historical data, usable prediction results can be obtained. This paper does not review the literature on coupling 3D hydrodynamic models with data-driven models, and combining the two will be a direction for subsequent research.

 

Point 3: Figure 1, Figure 2, Figure 7 are not clear enough.

Response 3: For Figure 1, I redrew the top left image to make it clearer. For Figure 2, I changed the color of the modules and increased the font size. For Figure 7, I removed the image background and set the text in bold. Detailed modifications can be found in the attached paper.

 

Point 4: Explain the meaning of ρx,y in equation 5.

Response 4: The first paragraph of Section 2.3 explains the meaning of the Pearson correlation coefficient. I will change its last sentence to "is used to denote the Pearson correlation coefficient, defined as follows:", which will help the reader understand the meaning better. Detailed modifications can be found in the attached paper.
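
For reference, the standard definition of the Pearson correlation coefficient is given below. This record does not reproduce Equation 5 of the paper, so it is only assumed here that the equation takes this usual form; the symbols \bar{x}, \bar{y}, and n are illustrative.

```latex
\rho_{X,Y} = \frac{\operatorname{cov}(X,Y)}{\sigma_X \, \sigma_Y},
\qquad
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,
          \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

Here r is the estimate of \rho_{X,Y} from n paired observations. Both quantities lie in [-1, 1], and values near 0 indicate little linear relationship.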

 

Point 5: Chapter 2 of the article explains too much about the research field, while the results and conclusions in Chapters 3 and 4 contain too little content; the priorities are not clearly distinguished.

Response 5: I trimmed Section 2.6 of the paper, expanded Chapters 3 and 4, and changed the title of Chapter 3 to highlight its priority. Detailed modifications can be found in the attached paper.

 

Point 6: Many factors affect the overflow, so do the measured-data screening method and the isolation forest algorithm adopted in this paper have absolute accuracy? The Pearson correlation coefficient is more than 0.5, and the data values in Figure 8 are between 0.45 and 0.5. How can the accuracy of data values with such small differences be guaranteed?

Response 6: (1) The isolation forest algorithm selected in this paper mainly handles outliers in the data to reduce their influence on the results, but it does not guarantee that every outlier is identified. The Pearson correlation coefficient is used for variable screening because many factors affect the overflow capacity. Using all of them as input variables would increase the computational burden and might not give good predictions, because different factors influence the overflow capacity to different degrees. A data screening method is therefore needed, which can also be understood as a dimensionality reduction operation.
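
The outlier-handling idea in (1) can be illustrated with a minimal, one-dimensional isolation forest written in plain Python. This is a sketch under stated assumptions, not the authors' implementation: the function names, the tree count, the sample size, and the flow readings are all hypothetical, and a real study would more likely use a library implementation on multivariate data.

```python
import math
import random

EULER_GAMMA = 0.5772156649  # Euler-Mascheroni constant


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Recursively split the points at a random value until isolated or too deep."""
    if depth >= max_depth or len(points) <= 1 or min(points) == max(points):
        return ("leaf", len(points))
    split = rng.uniform(min(points), max(points))
    left = [p for p in points if p < split]
    right = [p for p in points if p >= split]
    return ("node", split,
            build_tree(left, depth + 1, max_depth, rng),
            build_tree(right, depth + 1, max_depth, rng))


def path_length(tree, x, depth=0):
    """Depth at which x lands, plus the expected extra depth of an unsplit leaf."""
    if tree[0] == "leaf":
        return depth + c_factor(tree[1])
    _, split, left, right = tree
    return path_length(left if x < split else right, x, depth + 1)


def anomaly_scores(values, n_trees=100, sample_size=64, seed=0):
    """Return one score per value; scores close to 1 suggest an outlier."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    rng = random.Random(seed)
    sample_size = min(sample_size, len(values))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [build_tree(rng.sample(values, sample_size), 0, max_depth, rng)
             for _ in range(n_trees)]
    norm = c_factor(sample_size)
    return [2.0 ** (-sum(path_length(t, v) for t in trees) / (n_trees * norm))
            for v in values]


# Hypothetical flow readings (m^3/s): 30 ordinary values and one spike.
readings = [10.0 + 0.1 * (i % 5) for i in range(30)] + [50.0]
scores = anomaly_scores(readings)
suspect = readings[scores.index(max(scores))]
print(suspect)
```

Points that are isolated after few random splits receive short average paths and therefore high scores. As the response notes, this ranks suspicious points but does not guarantee that every outlier is found; the cut-off score remains a modeling choice.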

(2) Regarding the rules for grading the degree of correlation with the Pearson correlation coefficient: generally speaking, a coefficient in [0.8, 1.0] indicates a strong correlation, [0.5, 0.8] a moderate correlation, and [0.0, 0.5] a low correlation. If more data are needed, the low-correlation band can be divided into weak correlation and no correlation, so that more variables can be selected. In addition, variables are selected by combining the degree of correlation with the significance level between the variables: when the significance level is less than 0.05, the correlation is statistically significant.
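
The banding rule in (2) can be sketched as a small screening routine. The variable names and data below are invented for illustration, applying the bands to the absolute value of the coefficient and assigning boundary values to the higher band are assumptions, and the significance test mentioned in the response is not implemented here.

```python
import math


def pearson_r(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def correlation_band(r):
    """Map |r| to the bands quoted in the response (boundary handling is assumed)."""
    strength = abs(r)
    if strength >= 0.8:
        return "strong"
    if strength >= 0.5:
        return "moderate"
    return "low"


def screen_variables(candidates, target, keep=("strong", "moderate")):
    """Keep the candidate variables whose band with the target is listed in `keep`."""
    selected = {}
    for name, series in candidates.items():
        r = pearson_r(series, target)
        if correlation_band(r) in keep:
            selected[name] = r
    return selected


# Hypothetical data: names and values are invented for illustration only.
overflow = [1.0, 2.0, 3.0, 4.0, 5.0]
candidates = {
    "water_level": [1.0, 2.0, 3.0, 5.0, 4.0],
    "blade_angle": [2.0, 3.0, 1.0, 5.0, 4.0],
    "temperature": [3.0, 5.0, 1.0, 2.0, 4.0],
}
selected = screen_variables(candidates, overflow)
print(sorted(selected))
```

Passing `keep=("strong", "moderate", "low")` would retain every variable, which corresponds to the option in the response of relaxing the bands when more inputs are needed.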

 

Point 7: In Section 3.2, will different parameters yield different calculation results and conclusions?

Response 7: The first part of Section 3.2 covers the selection of hyperparameters for the BIGRU neural network. The network layer settings, learning rate decay factor, error training settings, and dropout size are set with reference to classical network parameters. Values that are too large or too small adversely affect the experiment, so intermediate values are generally chosen, and an optimizer is then used for parameter selection. The second part covers the ARIMA model parameters, which are the optimal parameters obtained in the experiments and tested against the relevant criteria.

 

Point 8: Figure 11 (a) and (b) show that the four prediction models differ from the original data by about 5 m³/s in overflow and about 10 m in water level, which is a large error.

Response 8: I have modified the result graph so that the difference between the methods is easier to see. The SA-BIGRU-ARIMA hybrid model proposed in this paper shows some improvement over the comparison models BIGRU, ARIMA, and BP, although it still has a large error relative to the original data. Because the current data do not cover the operating parameters of the unit, the extent of aquatic plants and sediment in the channel, or the resistance of the river, we will consider conducting on-site research on the Miyun reservoir storage project to enrich the data and further improve the accuracy of the model. Detailed modifications can be found in the attached paper.

Author Response File: Author Response.pdf

Reviewer 2 Report

1. Content:

(1) Introduction: With so many references cited, the last paragraph should summarize and point out why the method used in this paper was chosen compared to previous studies;

(2) Line 314: ACF and PACF in Section 2.6.2 of the paper should be clearly defined.

(3) Line 321: "The BIC criteria are defined as follows:" does not represent a definition in the paper.

2. Formulas and charts:

(1) Formula labeling is unclear. The label should be located at the far right of the page, while the author should be placed after the formula, and the indentation format of the formula is also not neat;

(2) The depth and thickness of the chart lines in Table 3 vary;

(3) Figure 8 is not clear;

(4) Figure 9 (b), indicating the presence of Chinese;

Comments for author File: Comments.pdf

(1) Line 43, the sentence is too long and contains multiple 'and', making it difficult for readers to understand the meaning of the sentence;

(2) Line 172, "Randomly selected from the data sample n data as samples," the word order is incorrect and the author's meaning is not understood;

Author Response

Response to Reviewer 2 Comments

Point 1: Revisions to contents: (1) Introduction: With so many references cited, the last paragraph should summarize and point out why the method used in this paper was chosen compared to previous studies; (2) Line 314: ACF and PACF in Section 2.6.2 of the paper should be clearly defined. (3) Line 321: "The BIC criteria are defined as follows:" does not represent a definition in the paper.

Response 1: (1) I added "BIGRU not only has a simple structure but also can well solve the gradient disappearance and explosion problems of the traditional RNN, while the ARIMA model can compensate for its insensitivity to linear components. In addition, the innovative and improved RNN network and ARIMA combination algorithm has not been applied to the research of overflow characteristics of pumping stations" at the end of the penultimate paragraph of the introduction, to summarize the reasons for the algorithm chosen in this paper.

(2) I added the definitions of the autocorrelation coefficient (ACF) and the partial autocorrelation coefficient (PACF) in the second paragraph of Section 2.6;

(3) I added the definition of the Bayesian Information Criterion (BIC) in the third paragraph of Section 2.6.

Detailed modifications can be found in the attached paper.
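
The quantities named in this response have standard textbook definitions, sketched below in plain Python. This is an illustration only and is not taken from the paper: the function names and the sample series are hypothetical, the PACF is computed with the Durbin-Levinson recursion, and the BIC is written in its usual form k ln(n) - 2 ln(L), where a lower value is preferred. Fitting the ARIMA models whose likelihoods would be compared is not shown.

```python
import math


def acf(series, max_lag):
    """Sample autocorrelation coefficients for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    denom = sum(d * d for d in dev)
    if denom == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
            for k in range(max_lag + 1)]


def pacf(series, max_lag):
    """Partial autocorrelations for lags 1..max_lag (Durbin-Levinson recursion)."""
    r = acf(series, max_lag)
    result = []
    phi_prev = []  # coefficients of the order-(k-1) autoregression
    for k in range(1, max_lag + 1):
        num = r[k] - sum(phi_prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        phi_prev = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j]
                    for j in range(k - 1)] + [phi_kk]
        result.append(phi_kk)
    return result


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion; the candidate with the lowest value is preferred."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


# Hypothetical series, used only to exercise the functions.
series = [1.0, 2.0, 3.0, 4.0, 5.0]
r = acf(series, 2)
p = pacf(series, 2)
score = bic(log_likelihood=-10.0, n_params=2, n_obs=100)
```

In the usual reading of these plots, the lag at which the ACF cuts off suggests the moving-average order and the lag at which the PACF cuts off suggests the autoregressive order; the BIC then penalizes candidates with more parameters.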

 

Point 2: Revisions to formulas and charts: (1) Formula labeling is unclear. The label should be located at the far right of the page, while the author should be placed after the formula, and the indentation format of the formula is also not neat; (2) The depth and thickness of the chart lines in Table 3 vary; (3) Figure 8 is not clear; (4) Figure 9 (b), indicating the presence of Chinese;

Response 2: The formulas have been modified in the Word document. Detailed modifications can be found in the attached paper.

 

Point 3: Revisions to language: (1) Line 43, the sentence is too long and contains multiple 'and', making it difficult for readers to understand the meaning of the sentence; (2) Line 172, "Randomly selected from the data sample n data as samples," the word order is incorrect and the author's meaning is not understood;

Response 3: See the attached Word document for the modifications. Detailed modifications can be found in the attached paper.

 

Point 4: Revisions to references: Reference 39, there are too many spaces between words.

Response 4: I have adjusted the spacing between the words. Detailed modifications can be found in the attached paper.

 

Author Response File: Author Response.pdf

Reviewer 3 Report

This work developed a methodology for the Overflow capacity prediction of pumping station based on data drive.

Among the main suggestions are:

1. Improve the quality of Figure 1; it is somewhat hazy and has some tiny subtitles

2. In section 2.3 of variable selection, the correlation of random variables is performed. It would be interesting for the authors to expand and relate this section to random sampling, that is, did the sampling have any probability distribution? In what cases can correlation be used?

3. In section 2.6, ARIMA for the time series forecasting model. Was the seasonality of the data observed? Because ARIMA was used, they can complement it with justifications.

4. Translate the caption of Figure 9

5. Improve the presentation of the results from Figure 11. For example, we can split the raw data with the preliminary models and then compare it with the best model.

6. In Figure 12, we can place maximum and minimum water level thresholds

No.

Author Response

Response to Reviewer 3 Comments

Point 1: Improve the quality of Figure 1; it is somewhat hazy and has some tiny subtitles.

Response 1: For Figure 1, I redrew the top left image to make it clearer. Detailed modifications can be found in the attached paper.

 

Point 2: In section 2.3 of variable selection, the correlation of random variables is performed. It would be interesting for the authors to expand and relate this section to random sampling, that is, did the sampling have any probability distribution? In what cases can correlation be used?

Response 2: I have added the corresponding content to Section 2.3. Detailed modifications can be found in the attached paper.

 

Point 3: Figure 1, Figure 2, Figure 7 are not clear enough.

Response 3: I have redrawn Figures 1, 2, and 7. Detailed modifications can be found in the attached paper.

 

Point 4: Translate the caption of Figure 9. 

Response 4: I have retranslated the caption of Figure 9. Detailed modifications can be found in the attached paper.

 

Point 5: Improve the presentation of the results from Figure 11. For example, we can split the raw data with the preliminary models and then compare it with the best model.

Response 5: I have redrawn Figure 11. Detailed modifications can be found in the attached paper.

 

Point 6: In Figure 12, we can place maximum and minimum water level thresholds.

Response 6: I have added the maximum and minimum values in Figure 12. Detailed modifications can be found in the attached paper.

 

 

Author Response File: Author Response.pdf
