Article

Runoff Prediction for Hydrological Applications Using an INFO-Optimized Deep Learning Model

1 College of Electrical Engineering, Henan University of Technology, Zhengzhou 450001, China
2 College of Hydrology and Water Resources, Hohai University, Nanjing 210098, China
* Author to whom correspondence should be addressed.
Processes 2024, 12(8), 1776; https://doi.org/10.3390/pr12081776
Submission received: 18 July 2024 / Revised: 16 August 2024 / Accepted: 19 August 2024 / Published: 22 August 2024
(This article belongs to the Section Advanced Digital and Other Processes)

Abstract
Runoff prediction is essential in water resource management, environmental protection, and agricultural development. Because runoff series are highly random and non-stationary and traditional models handle their nonlinear effects with low prediction accuracy, this study proposes a runoff prediction model in which an improved vector-weighted average algorithm (INFO) optimizes a convolutional neural network (CNN)-bidirectional long short-term memory (Bi-LSTM)-Attention network. First, the historical data are analyzed and normalized. Second, the CNN combined with Attention extracts the deep local features of the input data and optimizes the input weights of the Bi-LSTM. The Bi-LSTM then analyzes the time series features of the data in both the forward and backward directions simultaneously, and INFO optimizes the model parameters to provide the optimal parameter guarantee for the CNN-Bi-LSTM-Attention model. Based on a hydrological station's water level and flow data, the influence of three main models and two optimization algorithms on the prediction accuracy of the CNN-Bi-LSTM-Attention model is compared and analyzed. The results show that the fitting coefficient, R2, of the proposed model is 0.948, which is 7.91% and 3.38% higher than that of Bi-LSTM and CNN-Bi-LSTM, respectively. The R2 of the INFO-optimized model is 0.993, which is 0.61% higher than that of the Bayesian optimization algorithm (BOA) model, indicating that the method adopted in this paper has stronger forecasting ability and can serve as a reliable tool for long-term runoff prediction.

1. Introduction

The simulation and prediction of runoff in a basin play an essential role in protecting water resources, managing the environment, and studying agronomy [1]. Runoff and its changes involve many factors, such as rainfall and atmospheric circulation; in addition, evaporation, temperature, wind speed, and atmospheric pressure affect the runoff [2]. Many scholars have long been committed to studying flow prediction, yet the inherent errors of various hydrological models lead to inaccurate predictions that cannot be analyzed precisely [3,4]. Accurate simulation and prediction of runoff are therefore critical, especially for flood and drought forecasting, irrigation control, and other forms of water resource management, all of which require reliable hydrological modeling results.
Currently, medium- and long-term runoff prediction mainly relies on process-driven and data-driven models [5,6]. The process-driven model builds on the scientific theories of hydrology, hydraulics, and erosion dynamics [7], considers the physical mechanisms of the water circulation system from all aspects, and accounts for factors such as land use, soil type, meteorological change, and water quality. The SWAT, DHSVM, and Xin'anjiang models are used to simulate the runoff generation process [8,9]. Reference [10] further improves runoff simulation by improving the snowmelt source code of the SWAT model, better capturing the daily discharge and the discharge state during the snowmelt period. Reference [11] simulated future runoff under SSP scenarios with SWAT and LSTM and found that the analysis process and results of the SWAT simulation were relatively poor. Reference [12] assessed the performance of dynamically downscaled climate fields against observed historical stream runoff at the basin scale using a physically distributed hydrologic model (DHSVM). Reference [13] proposed a distributed degree-day Xin'anjiang model (DD-XAJ) considering ice and snow thawing, adding a degree-day module describing snow and ice thawing to the previously distributed Xin'anjiang model. By comparing the rainfall–runoff results of TOPMODEL and the Xin'anjiang model, the authors of reference [14] found that the Xin'anjiang model performs slightly better and that both can be used to simulate the rainfall–runoff process in a basin. As the rainfall–runoff process is affected by terrain, rainfall distribution, soil properties, land use, climate change, and other factors, the process-driven model requires a large amount of data for modeling, and insufficient data will impede the successful establishment of the model.
In contrast, the data-driven model requires less data and can be developed quickly. Data-driven methods remain the mainstay of medium- and long-term runoff forecasting [15,16]. Data-driven models aim to establish optimal relationships between data, using mathematical techniques to relate the model's input data to its output targets. Traditional data-driven models include multiple regression models, time series models, mathematical statistics models, etc. [17]. With the development of computer theory, modern data-driven models have made greater use of neural networks, fuzzy mathematics, and support vector machines for hydrological data prediction [18,19]. Generally, data-driven models can be divided into three categories: statistics-based predictive models, machine learning models, and combinatorial models [20]. Reference [21] applied several data-driven models, namely multiple linear regression (MLR), multivariate adaptive regression splines (MARS), support vector machine (SVM), and random forest (RF), to rainfall–runoff prediction in the Gola watershed, located in the south-eastern part of Uttarakhand. Reference [22] presents a prediction model based on LSTM and the seq2seq structure for estimating hourly rainfall–runoff; the results show that the LSTM-seq2seq model has sufficient predictive power and could improve forecast accuracy in short-term flood forecasting applications. The results of reference [23] demonstrate that the selection of input variables significantly influences RNN predictions and that an RNN model with multiple meteorological inputs achieves higher accuracy than one using rainfall data alone. Reference [24] summarizes and compares the applicability and accuracy of different feature selection (FS) methods and ensemble learning (EL) models in medium- and long-term runoff forecasts, concluding that a theoretical framework based on machine learning could be helpful to water managers focusing on medium- and long-term runoff forecasts. Reference [25], using a multi-source information fusion and residual correction method, constructed a coupled forecasting model involving a back propagation neural network (BPNN), an Elman neural network (ENN), and particle swarm optimization–support vector regression (PSO-SVR), which improved the accuracy of predicting the annual runoff time series in the validation period. Reference [26] analyzes machine learning applications for runoff modeling based on literature reviews.
Compared with traditional research methods, current runoff analysis models mainly combine machine learning and deep learning algorithms to simulate processes and predict results [27]. Because their structures do not require overly complex parameters and they can thoroughly learn the patterns present in the studied data, they are favored by scholars. Various runoff prediction methods have been proposed in previous studies. Owing to the powerful nonlinear mapping ability of machine learning methods, many researchers have developed prediction models combined with machine learning to predict runoff and achieved good results [28,29,30,31]. Reference [32] proposes a high-accuracy runoff forecasting model using machine learning; the results also lay an essential foundation for mid-to-long-term runoff forecasting. Reference [33] proposes a hybrid model based on a combination of two-stage decomposition and support vector machine (SVM). Reference [34] implemented a model of near-future global meteorological forecasting coupled with a short-range runoff forecasting system. Reference [35] used a deep learning approach as a post-processor to correct errors associated with hydrologic data; the proposed model uses a long short-term memory model with a sequence-to-sequence structure as a post-processor to improve runoff forecasting. Reference [36] proposes a homogeneous selective ensemble forecasting framework based on a modified differential evolution algorithm (MDE) to elucidate the complex nonlinear characteristics of hydrological time series.
In addition, because of the inherent complexity of hydrological series and the difficulty of optimizing the parameters of neural network models, the prediction error of a single model is significant, so the model should be optimized rather than tuned manually in order to further improve its prediction accuracy [37,38]. Therefore, many scholars have proposed the Bayesian optimization algorithm, particle swarm optimization algorithm, sparrow search algorithm, and ant colony algorithm [39,40] to help the model select optimal parameters and thus improve its prediction accuracy. The vector-weighted average algorithm has the advantages of easy operation and little parameter adjustment, and compared with other algorithms, its search accuracy and convergence speed are higher. In the improved vector-weighted average algorithm (INFO), the update rules and vector combination steps are refined, which strengthens its exploration and exploitation capabilities [41]. In addition, the local search phase helps the algorithm avoid low-precision solutions and improves utilization and convergence, thereby improving the algorithm's optimization performance.
INFO improves exploration and exploitation by updating the position of vectors through three core processes: updating rules, vector combination, and local search. At the same time, the algorithm avoids low-precision solutions and improves utilization and convergence. The vector-weighted average optimization algorithm optimizes weighted averages by dynamically adjusting weights, and data analysis and deep learning can help the model obtain more accurate weighted average results. Convolutional neural networks (CNNs) can effectively reduce the complexity of model training and improve the efficiency of model prediction. Long short-term memory networks (LSTMs) can retain or discard information through a gating mechanism. Bidirectional long short-term memory networks (Bi-LSTMs) have two LSTMs, forward and backward, which handle past and future information, respectively, and can interpret data more thoroughly; their prediction performance is better than that of the one-way LSTM model [42,43,44]. The Attention mechanism assigns weights to the information in different states and hidden vectors according to their importance, which compensates for the limitation of the decoder assigning the same weight to each hidden vector in neural networks. Therefore, this study proposes a runoff prediction model based on INFO optimization of the CNN-Bi-LSTM-Attention network. The CNN extracts data features effectively and quickly, and Attention extracts local deep features of the data and optimizes the input weights for the next step. Meanwhile, Bi-LSTM analyzes the time series features of the data in both the forward and backward directions. The parameters of the CNN-Bi-LSTM-Attention network model were optimized by INFO, and the algorithm's convergence speed and search accuracy were improved to achieve accurate runoff prediction.
The main contributions of this work are briefly described below:
(i)
An improved vector-weighted average algorithm (INFO) is proposed to solve the problems of excessive hyperparameter settings and complex optimization parameter operation. This algorithm has higher search accuracy and convergence speed than other algorithms.
(ii)
Most of the applied runoff prediction combinatorial models are machine learning combinatorial models, and research on deep learning combinatorial models is still in the initial stage. In this study, a novel forecasting method was proposed: the runoff prediction method optimized by the CNN-Bi-LSTM-Attention model based on INFO. By combining CNN-Attention with the rapid extraction of local depth features, the input weights of Bi-LSTM were further optimized, and the time series features were simultaneously analyzed bidirectionally. The neural network uses dropout and L2 regularization to reduce unnecessary structures in the network model, reduce the complexity of model training, and improve the computing power and accuracy of the model.
(iii)
Through comparative experimental analysis against the Bi-LSTM and CNN-Bi-LSTM basic models and the Bayesian optimization model, the fitting coefficient, R2, of the proposed prediction method increased by 7.91%, 3.38%, and 0.61%, respectively. The experimental results show that the method proposed in this paper has a better forecasting effect and can achieve the goal of long-term runoff prediction.

2. Study Area and Data

Xiaolangdi Reservoir, located north of Luoyang City, Henan Province, China, is a crucial water conservancy project in the middle reaches of the Yellow River. The geographical location of the reservoir is of great strategic significance, not only for the management of regional water resources but also for the ecological balance and economic development of the whole Yellow River basin. Its geographical coordinates are about 112° E longitude and 35° N latitude, and it lies in the transition zone between the Loess Plateau and the North China Plain. This unique position gives Xiaolangdi Reservoir distinctive geographical and hydrological characteristics.
The basin of Xiaolangdi Reservoir covers most areas of the Loess Plateau, where plateaus, mountains, and hills dominate the terrain, and the soil type is mainly loess, which leads to severe soil erosion problems in this area. The region has a temperate monsoon climate, with four distinct seasons, an average annual temperature between 12 and 14 °C, and an annual precipitation of about 600–700 mm, which is mainly concentrated in summer, while the winter is relatively dry.
Regarding hydrology, the water level and discharge of Xiaolangdi Reservoir are significantly affected by seasonal precipitation, with a wet season in summer and a dry season in winter. The reservoir’s maximum storage capacity is 12.65 billion cubic meters, the standard storage level is 275 m, and the average annual flow is about 2200 cubic meters per second. These hydrological parameters are essential for understanding reservoirs’ hydrological dynamics and managing water resources.
In this study, the 2018–2021 runoff data of a hydrological station in the lower reaches of Xiaolangdi Reservoir were selected, all of which were observations from the hydrological station, mainly comprising flow and water-level data. Data from local weather stations, including atmospheric pressure, average wind speed, minimum and maximum temperature, and rainfall, were also selected. Pearson correlation analysis was then performed on the data, and inputs with strong correlations were selected for the model. In total, 11,672 meteorological records and 12,165 hydrological records were selected. The first 75% of the selected data were used as the training set of the model, and the remaining 25% were used as the test set to verify the prediction effect of the model. Four groups of data with strong correlation coefficients were selected, and their changes are shown below in Figure 1, Figure 2 and Figure 3.

3. Method

3.1. Pearson Correlation Analysis

Pearson correlation analysis was carried out on the selected hydrological and meteorological elements and runoff using MATLAB 2020, and the results are shown in Table 1. The results show that the selected water level, temperature, rainfall, and runoff are strongly correlated and can be used as input factors for the model. Although air pressure and wind speed have some effect, their correlations are not significant. Therefore, water level, maximum and minimum temperature, rainfall, and runoff, which have considerable effects, were selected as the input factors of the model for further study.
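As a hedged illustration of this screening step (our own sketch, not the authors' MATLAB workflow; the synthetic series, the 0.5 threshold, and the variable names are assumptions), the snippet below computes Pearson coefficients between candidate inputs and runoff and keeps only the strongly correlated ones:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 365
# Hypothetical daily series standing in for the station observations.
runoff = rng.gamma(2.0, 500.0, n)
candidates = {
    "water_level": 0.01 * runoff + rng.normal(0.0, 1.0, n),   # strongly related
    "max_temp":    0.002 * runoff + rng.normal(20.0, 5.0, n), # weakly related
    "pressure":    rng.normal(1010.0, 5.0, n),                # essentially unrelated
}

for name, series in candidates.items():
    r = np.corrcoef(series, runoff)[0, 1]          # Pearson correlation coefficient
    decision = "keep" if abs(r) >= 0.5 else "drop" # illustrative threshold
    print(f"{name:12s} r = {r:+.3f} -> {decision}")
```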
To better study the influence of historical data on runoff prediction, mapminmax normalization was carried out for each data factor, and all data were scaled to (0,1) to reduce the influence of heterogeneous feature scales. The steps were as follows: 1. Data collection: collect the data sets that need to be preprocessed, which may contain multiple features with different value ranges. 2. Determine and select the scaling range, which was (0,1) in this study. 3. Calculate the extreme values for each feature column, finding its maximum ($x_{max}$) and minimum ($x_{min}$) values over the entire data set. 4. Apply Formula (1) to scale each feature value to the specified range. 5. Feature scaling: apply the above formula to each feature in the original data set to obtain the scaled data set, completing the data preprocessing, and use the preprocessed data in the prediction model to achieve runoff prediction. The process is shown in Equation (1):
$y = \frac{(y_{max} - y_{min}) \times (x - x_{min})}{x_{max} - x_{min}} + y_{min}$, (1)
where $x_{min}$ and $x_{max}$ are the minimum and maximum values of the original data set, $x$, respectively; $y_{min}$ and $y_{max}$ are the minimum and maximum values of the normalized data set, $y$, respectively; and × denotes multiplication.
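A minimal sketch of the Equation (1) scaling and its inverse (our own Python example; MATLAB's mapminmax actually defaults to the range (−1, 1), and the small matrix below is made up):

```python
import numpy as np

def minmax_scale(x, y_min=0.0, y_max=1.0):
    # Equation (1): scale each feature column of x to [y_min, y_max].
    x_min, x_max = x.min(axis=0), x.max(axis=0)
    return (y_max - y_min) * (x - x_min) / (x_max - x_min) + y_min

def minmax_inverse(y, x_min, x_max, y_min=0.0, y_max=1.0):
    # Reverse normalization, as applied to the model outputs in Step 6 of Section 3.7.
    return (y - y_min) * (x_max - x_min) / (y_max - y_min) + x_min

data = np.array([[3.0, 200.0],
                 [7.5, 950.0],
                 [1.2, 460.0]])           # two illustrative feature columns
scaled = minmax_scale(data)
restored = minmax_inverse(scaled, data.min(axis=0), data.max(axis=0))
print(scaled)
print(np.allclose(restored, data))        # True: the transform round-trips
```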

3.2. Convolutional Neural Network

The CNN has been widely used for feature extraction from sequence data, and studies [45,46,47] have effectively extracted potential features of runoff data based on the CNN. The convolutional neural network (CNN) comprises three kinds of layers: the convolution layer, the pooling layer, and the fully connected layer. The convolution layer and pooling layer are the core of the CNN. Features of each layer of data are extracted through the convolution kernel to capture local correlations; network parameters are reduced through the pooling layer, thus reducing the computational load and complexity of the model; and data features are integrated and output by the fully connected layer. The CNN layers are computed as follows:
$C_i = R(X_{i-1} * W_c + b_1)$, (2)
$P_i = R(C_i) + b_2$, (3)
$H_i = \sigma(P_i \times W_H + b_3)$, (4)
where $C_i$ is the output of the $i$-th convolution layer; $P_i$ is the output of the pooling layer; $R$ is the ReLU activation function; $\sigma$ is the Sigmoid activation function; $H_i$ is the output of the fully connected layer; $W_c$ and $W_H$ are weight matrices; $b_1$, $b_2$, and $b_3$ are bias terms; and $*$ denotes the convolution operation.
Figure 4 shows the CNN network structure.
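A minimal PyTorch sketch of the convolution, pooling, and fully connected pipeline in Equations (2)-(4) (our own illustration; the channel counts, kernel size, and max pooling are assumptions and do not reproduce the exact settings of Section 4.2):

```python
import torch
import torch.nn as nn

class CNNFeatureExtractor(nn.Module):
    def __init__(self, in_channels=5, hidden=32, out_features=64):
        super().__init__()
        self.conv = nn.Conv1d(in_channels, hidden, kernel_size=3, padding=1)  # Eq. (2)
        self.relu = nn.ReLU()
        self.pool = nn.MaxPool1d(kernel_size=2)                               # Eq. (3)
        self.fc = nn.LazyLinear(out_features)                                 # Eq. (4)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):                        # x: (batch, input factors, time steps)
        c = self.relu(self.conv(x))              # convolution + ReLU
        p = self.pool(c)                         # pooling reduces temporal resolution
        h = self.sigmoid(self.fc(p.flatten(1)))  # fully connected output features
        return h

x = torch.randn(8, 5, 30)                 # 8 samples, 5 input factors, 30 time steps
print(CNNFeatureExtractor()(x).shape)     # torch.Size([8, 64])
```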

3.3. Bidirectional Long–Short-Term Memory Neural Network

The literature has verified the effectiveness of the LSTM model in time series prediction [48,49]. Early approaches to time series processing used recurrent neural networks (RNNs); however, when trained on long sequences, RNNs suffer from vanishing and exploding gradients. To address these defects, researchers proposed the LSTM, a special RNN that solves such problems through three gated units: forget gates, input gates, and output gates. The LSTM structure is shown in Figure 5.
The forget gate indicates that the information of the previous node is selectively forgotten, the input gate selectively inputs the required information to the next state, and the output gate determines which information can be output as the current state. The following is the formula for different cells in LSTM:
$i_t = \sigma(W_3 \times x_t + W_7 \times h_{t-1} + b_4)$, (5)
$f_t = \sigma(W_4 \times x_t + W_8 \times h_{t-1} + b_5)$, (6)
$o_t = \sigma(W_5 \times x_t + W_9 \times h_{t-1} + b_6)$, (7)
$\tilde{c}_t = \tanh(W_6 \times x_t + W_{10} \times h_{t-1} + b_7)$, (8)
$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t$, (9)
$h_t = o_t \odot \tanh(c_t)$, (10)
where $i_t$ is the input gate, $f_t$ is the forget gate, $o_t$ is the output gate, and $x_t$ is the input vector at the current time; $\sigma$ is the sigmoid activation function; $\tanh$ is the hyperbolic tangent activation function; $W_3$, $W_4$, $W_5$, $W_6$ are the weights from the input layer to the different gating mechanisms; $b_4$, $b_5$, $b_6$, $b_7$ are bias terms; $W_7$, $W_8$, $W_9$, $W_{10}$ are the weights from the hidden layer to the different gating mechanisms; $c_t$ is the unit that retains information at the current time; $c_{t-1}$ is the unit that retains information at the previous time; $\tilde{c}_t$ is the candidate storage unit value; and $\odot$ denotes element-wise multiplication of vectors.
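To make the gate Equations (5)-(10) concrete, the NumPy sketch below evaluates a single LSTM cell step by hand (our own example with random weights; in practice the layers are built with a deep learning framework rather than written out like this):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_cell(x_t, h_prev, c_prev, W, U, b):
    # W, U, b hold the input, recurrent, and bias parameters for the
    # input (i), forget (f), output (o), and candidate (g) transforms.
    i = sigmoid(W["i"] @ x_t + U["i"] @ h_prev + b["i"])   # Eq. (5)
    f = sigmoid(W["f"] @ x_t + U["f"] @ h_prev + b["f"])   # Eq. (6)
    o = sigmoid(W["o"] @ x_t + U["o"] @ h_prev + b["o"])   # Eq. (7)
    g = np.tanh(W["g"] @ x_t + U["g"] @ h_prev + b["g"])   # Eq. (8)
    c = f * c_prev + i * g                                 # Eq. (9)
    h = o * np.tanh(c)                                     # Eq. (10)
    return h, c

rng = np.random.default_rng(1)
d_in, d_h = 4, 8                                           # toy dimensions
W = {k: rng.normal(0.0, 0.1, (d_h, d_in)) for k in "ifog"}
U = {k: rng.normal(0.0, 0.1, (d_h, d_h)) for k in "ifog"}
b = {k: np.zeros(d_h) for k in "ifog"}
h, c = lstm_cell(rng.normal(size=d_in), np.zeros(d_h), np.zeros(d_h), W, U, b)
print(h.shape, c.shape)                                    # (8,) (8,)
```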
The LSTM neural network only considers the correlations in the forward input sequence, while Bi-LSTM is composed of forward LSTM units and backward LSTM units, allowing it to use future sequence information as well: it calculates the forward (before-the-current-moment) and backward (after-the-current-moment) information and linearly superimposes the results. Fitting the data from two directions also helps to mitigate the problems of vanishing and exploding gradients. To improve the model's accuracy, the Bi-LSTM network is adopted in this paper, and its structure is shown in Figure 6.
Here, $W_{fx}$ and $W_{gx}$ are the weights from the input layer to each node, and $W_{fh}$ and $W_{gh}$ are the weights from the hidden layer to each node. The output, $h_{Bt}$, after the Bi-LSTM layer can be expressed as follows:
$h_{Bt} = BiLSTM(H_{c,t-1}, H_{c,t})$, (11)
where $H_{c,t}$ is the output of the CNN layer at the current time, and $H_{c,t-1}$ is the output of the CNN layer at the previous time.
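A short PyTorch sketch of the bidirectional layer behind Equation (11) (our own example; the feature dimension, hidden size, and sequence length are illustrative): the CNN-Attention features are read forward and backward, and the two hidden-state sequences are concatenated.

```python
import torch
import torch.nn as nn

bilstm = nn.LSTM(input_size=64, hidden_size=100,
                 batch_first=True, bidirectional=True)
features = torch.randn(8, 30, 64)   # (batch, time, CNN-Attention features)
out, _ = bilstm(features)
print(out.shape)                    # torch.Size([8, 30, 200]): forward + backward states
```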

3.4. Attention Mechanism

The Attention mechanism originated from the study of human vision and is widely used in artificial intelligence, machine learning, and other fields. Its principle is to assign different weights to the input data and obtain the desired output through weighting with model parameters. The Attention mechanism can capture global and local relationships, focus on more important information, and assign importance to information at different times. Therefore, after the Attention mechanism is introduced in the CNN layer, data features can be recalibrated and enhanced by squeezing and exciting the feature channels, enhancing essential features and suppressing unimportant ones, preventing irrelevant information from affecting the results, and ultimately optimizing the model. The Attention layer is computed as follows:
$e_t = \tanh(W \times h_t + b)$, (12)
$a_t = \frac{\exp(e_t \times v)}{\sum_{i=1}^{t} \exp(e_i \times v)}$, (13)
$s_t = \sum_{i=1}^{t} a_i \times h_i$, (14)
where $e_t$ is the distribution of state values; $W$ and $b$ are the weight and bias terms of the Attention layer; $v$ is the Attention value; and $a_t$ is the weight coefficient. The weighted sum of the hidden states with the weights $a_i$ gives the output $s_t$.
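The PyTorch sketch below mirrors Equations (12)-(14) (our own illustration with made-up dimensions): each hidden state is scored, the scores are normalized with softmax over time, and the states are summed with those weights.

```python
import torch
import torch.nn as nn

class AdditiveAttention(nn.Module):
    def __init__(self, hidden_dim):
        super().__init__()
        self.W = nn.Linear(hidden_dim, hidden_dim)      # W, b in Eq. (12)
        self.v = nn.Linear(hidden_dim, 1, bias=False)   # v in Eq. (13)

    def forward(self, h):                    # h: (batch, time, hidden)
        e = torch.tanh(self.W(h))            # Eq. (12): state-value distribution
        a = torch.softmax(self.v(e), dim=1)  # Eq. (13): normalized weights over time
        s = (a * h).sum(dim=1)               # Eq. (14): weighted sum of hidden states
        return s, a.squeeze(-1)

h = torch.randn(8, 30, 200)                  # e.g., Bi-LSTM outputs
context, weights = AdditiveAttention(200)(h)
print(context.shape, weights.shape)          # torch.Size([8, 200]) torch.Size([8, 30])
```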

3.5. Improved Vector Weighted Average Algorithm

INFO is an improved weighted mean method in which the weighted mean idea is applied to entity structures, updating the position of vectors using three core processes: update rules, vector combination, and local search. The update rule phase generates new vectors based on the mean value law and convergence acceleration. In the vector combination stage, the obtained vectors are combined with the update rules to improve the local search ability; improving INFO's update rules and vector combination steps enhances its exploration and exploitation capabilities. In addition, the local search phase helps the algorithm avoid low-precision solutions and improves utilization and convergence.
INFO ensures that the parameters of the CNN-Bi-LSTM-Attention model are optimal by optimizing the parameters of the model. Using the improved vector-weighted average algorithm, INFO can effectively adjust the model's parameters to improve the accuracy and stability of the model in runoff prediction, whereas the effect of the BOA optimization algorithm is not as significant. Specifically, the advantages of the INFO algorithm are as follows: 1. Higher prediction accuracy: the deep learning model optimized by the INFO algorithm has a higher fitting coefficient, R2, which is slightly improved compared with that of the BOA algorithm. 2. More stable performance: the INFO algorithm allows for more stable and reliable long-term runoff prediction, and its optimization yields accurate predictions in different study areas, indicating that the INFO algorithm has stronger robustness, adaptability, and accuracy with respect to complex hydrological changes.
The mathematical definition of a weighted mean is as follows: a weighted mean indicates a unique position in an object or system. The mean of a set of vectors is described as the mean of their positions ($x_i$), weighted by the fitness-based weight of each vector ($w_i$). This concept is used because of its simplicity and ease of implementation; the larger the weight of a solution, the more it contributes to the weighted mean. The expression for the weighted mean ($WM$) is defined in Equation (15):
$WM = \frac{\sum_{i=1}^{N} x_i \times w_i}{\sum_{i=1}^{N} w_i}$, (15)
where N is the number of vectors.
To provide a better explanation, $WM$ can be written for two vectors as shown in Equation (16):
$WM = \frac{x_1 \times w_1 + x_2 \times w_2}{w_1 + w_2}$, (16)
The weight of each vector is calculated based on a wavelet function ($WF$). In general, wavelets are valuable tools for simulating seismic signals through the translation and dilation of a finite-duration oscillation function (i.e., the mother wavelet). This function is used to generate effective fluctuations during optimization. The mother wavelet used in this paper is defined as follows:
$w = \cos(x) \times \exp\left(-\frac{x^2}{\omega}\right)$, (17)
where $\omega$ is a constant called the expansion parameter.
Figure 7a,b show three vectors, and the difference between them is shown in Figure 7c. The weighted mean of the vectors is calculated using Equation (18):
$WM = \frac{w_1 \times (x_1 - x_2) + w_2 \times (x_1 - x_3) + w_3 \times (x_2 - x_3)}{w_1 + w_2 + w_3}$, (18)
where
$w_1 = \cos\left((f(x_1) - f(x_2)) + \pi\right) \times \exp\left(-\frac{|f(x_1) - f(x_2)|}{\omega}\right)$, (19)
$w_2 = \cos\left((f(x_1) - f(x_3)) + \pi\right) \times \exp\left(-\frac{|f(x_1) - f(x_3)|}{\omega}\right)$, (20)
$w_3 = \cos\left((f(x_2) - f(x_3)) + \pi\right) \times \exp\left(-\frac{|f(x_2) - f(x_3)|}{\omega}\right)$, (21)
where $f(x)$ denotes the fitness value of vector $x$.
Figure 7. Weighted average of the three vectors.
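As a hedged sketch of the weighted mean rule in Equations (18)-(21) (our own NumPy example; the toy fitness function, the decaying exponential, and the small epsilon added to the denominator are assumptions, and this covers only the weighting step, not the full INFO update loop):

```python
import numpy as np

def wavelet_weight(df, omega=1.0):
    # Weight form of Eqs. (19)-(21): oscillation times an assumed decaying envelope.
    return np.cos(df + np.pi) * np.exp(-np.abs(df) / omega)

def weighted_mean(x1, x2, x3, fitness, omega=1.0):
    # Eq. (18): combine the pairwise vector differences with their weights.
    w1 = wavelet_weight(fitness(x1) - fitness(x2), omega)
    w2 = wavelet_weight(fitness(x1) - fitness(x3), omega)
    w3 = wavelet_weight(fitness(x2) - fitness(x3), omega)
    return (w1 * (x1 - x2) + w2 * (x1 - x3) + w3 * (x2 - x3)) / (w1 + w2 + w3 + 1e-12)

fitness = lambda x: float(np.sum(x ** 2))     # toy fitness function (lower is better)
rng = np.random.default_rng(2)
x1, x2, x3 = (rng.normal(size=3) for _ in range(3))
print(weighted_mean(x1, x2, x3, fitness))
```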

3.6. Model Evaluation

This paper selected the mean absolute error (MAE), root mean square error (RMSE), fitting coefficient (R2), and mean absolute percentage error (MAPE) as the basis for evaluating model performance, distinguishing the models' prediction performance and describing the accuracy and generalization of the model output. The relevant formulas are as follows:
$MAE = \frac{1}{M} \sum_{k=1}^{M} \left| \hat{y}_k - y_k \right|$, (22)
$RMSE = \sqrt{\frac{1}{M} \sum_{k=1}^{M} \left( \hat{y}_k - y_k \right)^2}$, (23)
$R^2 = 1 - \frac{\sum_{k=1}^{M} (\hat{y}_k - y_k)^2}{\sum_{k=1}^{M} (\bar{y} - y_k)^2}$, (24)
$MAPE = \frac{1}{M} \sum_{k=1}^{M} \frac{\left| \hat{y}_k - y_k \right|}{y_k} \times 100\%$, (25)
where $M$ is the total number of samples, $\hat{y}_k$ is the predicted value, $y_k$ is the observed value, and $\bar{y}$ is the average of all sample observations.
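A compact NumPy version of the four metrics in Equations (22)-(25) (our own sketch; the example values are invented and do not come from the study):

```python
import numpy as np

def evaluate(y_pred, y_obs):
    err = y_pred - y_obs
    mae = np.mean(np.abs(err))                                         # Eq. (22)
    rmse = np.sqrt(np.mean(err ** 2))                                  # Eq. (23)
    r2 = 1.0 - np.sum(err ** 2) / np.sum((y_obs - y_obs.mean()) ** 2)  # Eq. (24)
    mape = np.mean(np.abs(err) / np.abs(y_obs)) * 100.0                # Eq. (25)
    return {"MAE": mae, "RMSE": rmse, "R2": r2, "MAPE(%)": mape}

y_obs = np.array([2100.0, 2350.0, 1980.0, 2600.0])   # invented observed runoff
y_pred = np.array([2050.0, 2400.0, 2010.0, 2550.0])  # invented predicted runoff
print(evaluate(y_pred, y_obs))
```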

3.7. Construction of INFO-CNN-Bi-LSTM-Attention Model

This study selected time series data such as rainfall, maximum temperature, minimum temperature, average wind speed, average pressure, historical water level, and runoff as covariates. The CNN-Bi-LSTM-Attention model was taken as the basic prediction model, and the INFO-CNN-Bi-LSTM-Attention optimization model is a fully coupled deep learning model. The hyperparameters optimized by the improved vector-weighted average algorithm are the learning rate, the number of hidden layer neurons, and the dropout rate. The optimized dropout reduces the complexity of the network, yields a network with a smaller parameter scale, and prevents overfitting. Two further measures were used to prevent overfitting: (1) a longer data set was used; the 48 months of hydrological data collected in this paper make the model more inclined to learn generally applicable patterns and laws rather than relying too much on a small number of specific data points; and (2) L2 regularization adds a penalty term to the model loss function, constraining the size of the model parameters, thereby reducing the complexity of the model and avoiding overfitting.
The data set constructed by the model consisted of two parts. The training set covers a period of 36 months, from January 2018 to December 2020. The test set spanned from January to December 2021, over 12 months. The hydrological data of the hydrological station and the meteorological data of the meteorological station were used as variables to forecast the runoff. The steps were as follows:
Step 1: Pearson correlation analysis was carried out on the hydrological and meteorological elements and runoff, and the data with solid correlations were selected as the input factors of the model. At the same time, to better study the influence of historical data on runoff prediction, mapminmax normalization was carried out for each data factor, and all data were scaled to (0,1) to reduce miscellaneous feature variables.
Step 2: The model initialization parameters were set (see Section 4.2 for specific model parameter settings), the data were divided into training sets and test sets, and the target variables and covariables for the trained sample data were set so that the machine could learn its feature changes.
Step 3: The CNN extracted and reduced the dimensions of the hydrological data. First, the convolutional layer received the model's input data and carried out feature extraction, and the result was passed through the activation function to the pooling layer for secondary extraction and simplification. The Attention mechanism was introduced into the pooling layer, and the similarity between each output hidden layer and input hidden layer was calculated; the deep local features of the data were extracted, and the value of the new hidden layer was obtained through weighted summation, which is the purpose of introducing the Attention mechanism into the CNN model. Finally, the extracted deep local information on the hydrological data was integrated, laying a foundation for the use of Bi-LSTM to extract the temporal features.
Step 4: The data features extracted by CNN-Attention were input into Bi-LSTM as new variables to optimize the weights of its input. Bi-LSTM comprises forward LSTM units and backward LSTM units, which obtain future sequence information simultaneously, calculate forward and backward information, and linearly superimpose the results. In this way, the bidirectional temporal patterns of the runoff were learned, critical information was accurately captured, over-generalization of the Bi-LSTM output layer was prevented, data from two directions were fitted to mitigate the problems of vanishing and exploding gradients, and the model's accuracy was improved.
Step 5: The improved vector-weighted average algorithm was used to optimize CNN-Bi-LSTM-Attention. The lower boundary was set as (0.0001, 5, 0.1), and the upper boundary was set as (0.1, 200, 0.5); these values represent the learning rate, the number of hidden layer neurons, and the dropout rate, respectively.
Step 6: The optimal parameters obtained by the INFO optimization algorithm were put into the runoff prediction model and verified by the runoff data of a hydrological station in Xiaolangdi. The forecast results of the model output were reverse-normalized to obtain the forecast results of the model runoff.
The overall idea of this study is as follows: When predicting runoff, the CNN and Attention are first used to extract the hidden information of input features, and then the extracted information is brought into the Bi-LSTM model for training. Finally, the INFO algorithm is used to optimize the parameters of the CNN-Bi-LSTM-Attention network model. Then, the prediction accuracy of runoff can be improved effectively. The overall structure of the model is shown in Figure 8 below.
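To show how the pieces described in Steps 3 and 4 chain together, the sketch below assembles a CNN-Bi-LSTM-Attention predictor in PyTorch (our own illustration, not the authors' MATLAB implementation; the layer sizes and the single-output regression head are assumptions, and the hidden size, learning rate, and dropout would be the quantities tuned by INFO within the bounds of Step 5):

```python
import torch
import torch.nn as nn

class CNNBiLSTMAttention(nn.Module):
    def __init__(self, n_features=5, hidden=100, dropout=0.3):
        super().__init__()
        self.conv = nn.Sequential(                      # CNN feature extraction
            nn.Conv1d(n_features, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv1d(32, 32, kernel_size=3, padding=1), nn.ReLU())
        self.bilstm = nn.LSTM(32, hidden, batch_first=True, bidirectional=True)
        self.att_W = nn.Linear(2 * hidden, 2 * hidden)  # attention scoring
        self.att_v = nn.Linear(2 * hidden, 1, bias=False)
        self.drop = nn.Dropout(dropout)                 # dropout against overfitting
        self.head = nn.Linear(2 * hidden, 1)            # runoff regression output

    def forward(self, x):                               # x: (batch, time, features)
        c = self.conv(x.transpose(1, 2)).transpose(1, 2)
        h, _ = self.bilstm(c)                           # bidirectional temporal features
        a = torch.softmax(self.att_v(torch.tanh(self.att_W(h))), dim=1)
        s = self.drop((a * h).sum(dim=1))               # attention-weighted summary
        return self.head(s).squeeze(-1)

model = CNNBiLSTMAttention()
print(model(torch.randn(4, 30, 5)).shape)               # torch.Size([4])
```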

4. Experimental Results and Analysis

4.1. Scenario Assumptions

In this paper, we consider three scenarios. Scenario One (S1) serves as a baseline in which the entire data set is available, following standard practice in previous work, and only includes the base models used in this study: Bi-LSTM, CNN-Bi-LSTM, and the improved CNN-Bi-LSTM-Attention model. The three models are compared and verified using the Xiaolangdi hydrological data. For the Xiaolangdi Reservoir data set under S1, the 2018–2020 data are used for training, and the 2021 data are used for testing, because the 2021 hydrological situation is the most complex, with significant floods in both the dry season and the rainy season and with the maximum and minimum runoff values of the whole record. Scenario 2 (S2) builds on S1 by adding two different optimization algorithms and comparing the INFO and BOA algorithms to verify the correctness of the model selection. S2 uses the Xiaolangdi hydrological data set, with 75% of the labeled data used for the training set and the remaining 25% used for the test set to evaluate the proposed optimization method. Scenario 3 (S3) builds on S2. In contrast to S1 and S2, the data set of the Huayuankou hydrometeorological station and its surrounding weather stations is selected; this scenario is established to verify that the model proposed in this study maintains good transferability across different data sets. Hydrological data and meteorological data were combined as training samples, and for the generalization analysis, the data from the Huayuankou hydrology station are subjected to the same analysis. Scenario 1 (S1) corresponds to the model comparison in Section 4.3.1, Scenario 2 (S2) corresponds to the optimization algorithm comparison in Section 4.3.2, and Scenario 3 (S3) corresponds to the model generalization verification in Section 4.4.

4.2. Model Parameter Setting

CNN parameter settings: The convolutional part has two layers. The first layer is as follows: convolutional kernel = (3,1), number 8, stride = (1,1), padding = 1, channel number = 32. The second layer is as follows: convolutional kernel = (3,1), padding = 1, channel number = 32. Global average pooling is selected in the pooling layer, and the Attention mechanism (extracting 1/4 of the number of channels) is added to the fully connected layer below. The fully connected part is also set up with two layers: the ReLU activation function is used in the first layer, and the Sigmoid activation function is used in the second layer. For the Bi-LSTM parameter settings, the optimal parameters of the CNN-Bi-LSTM-Attention prediction model were selected using the improved INFO optimization algorithm; the lower boundary of the hyperparameter setting was (0.0001, 5, 0.1), and the upper boundary was (0.1, 200, 0.5), where the values represent the learning rate, the number of hidden layer neurons, and the dropout rate, respectively. Table 2 lists the network parameter settings.
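As a small illustration of how a candidate within these bounds maps to concrete training settings (our own sketch; the random draw merely stands in for the INFO search step and is not the INFO algorithm itself):

```python
import numpy as np

LOWER = np.array([0.0001, 5.0, 0.1])   # learning rate, hidden neurons, dropout
UPPER = np.array([0.1, 200.0, 0.5])

def decode(candidate):
    # Clip the candidate into the bounds and convert it to usable settings.
    lr, hidden, dropout = np.clip(candidate, LOWER, UPPER)
    return {"learning_rate": float(lr),
            "hidden_neurons": int(round(hidden)),
            "dropout": float(dropout)}

rng = np.random.default_rng(3)
candidate = LOWER + rng.random(3) * (UPPER - LOWER)   # placeholder for an INFO candidate
print(decode(candidate))
```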

4.3. Result Analysis

4.3.1. Model Comparison Analysis

To verify the accuracy of the runoff prediction model used in this paper, MATLAB was used to build a simulation environment, and the CNN-Bi-LSTM, Bi-LSTM, and CNN-Bi-LSTM-Attention models were compared. To avoid any accidental results in model prediction, multiple sets of experiments were conducted for each model, and the optimal prediction results of each model were recorded. The comparison results were as follows:
Figure 9a–d shows the comparison plots for the three models from January to March, April to June, July to September, and October to December, respectively. Comparing the predicted values of each model with the actual values shows that all three models can roughly predict the runoff change, proving that the method of establishing the runoff prediction model in this study is feasible. It can be seen from the figure that all three models show errors, among which the error of the Bi-LSTM model is the largest and is most obvious around the peak and minimum values. The overall and peak predictions of the CNN-Bi-LSTM model are better than those of the Bi-LSTM model; however, its error also increases for the predictions after October, while the CNN-Bi-LSTM-Attention model has a better prediction effect and a higher degree of fit than the other two models. Nevertheless, because the meteorological factors and runoff change with time, the runoff forecast deteriorates as the forecast horizon increases.
As can be seen from the comparison of model evaluation indicators in Table 3, the four evaluation indicators of the CNN-Bi-LSTM-Attention model are superior to those of the other two models, and the R2 of this model is 0.948, which is 3.38% and 7.91% higher than that of the CNN-Bi-LSTM and Bi-LSTM models, respectively. The fitting coefficient, R2, measures the model's ability to explain the data variance; its value ranges from 0 to 1, and the closer it is to 1, the better the model explains the variance in the target variable, that is, the better the predicted values fit the observed values. The RMSE of this model was 0.385, a decrease of 22.37% and 50.06% compared with that of the other two models. The root mean square error (RMSE) measures the average difference between the model's predicted and actual values; the lower the value, the more accurate the model's prediction. The MAE of this model was 0.322, a decrease of 1.22% and 46.06% compared with that of the other two models. The mean absolute error (MAE) measures the mean absolute difference between the predicted and actual values; the lower the value, the higher the prediction accuracy. The MAPE of this model was 0.063, a decrease of 27.58% and 54.01% compared with that of the other two models. The mean absolute percentage error (MAPE) measures the average percentage error between the predicted and actual values; the lower the value, the higher the prediction accuracy. These evaluation indexes show that the CNN-Bi-LSTM-Attention model has higher prediction accuracy and is more suitable for runoff prediction in time series. From the above indicators and analysis, the fitting coefficient, R2, of the CNN-Bi-LSTM-Attention model is the highest, while its RMSE, MAE, and MAPE are the lowest, indicating that this model has the smallest prediction errors and the highest prediction accuracy. Although the Bi-LSTM model has a poorer forecasting effect, it still shows the ability to process runoff data, indicating that the selected models have the forecasting ability required to complete the stated goals. The proposed model uses the CNN to extract input data features and combines it with Attention to extract the deep local features of the data, optimizing the input weights of Bi-LSTM; Bi-LSTM then learns the time series while performing deep feature analysis of the data in both the forward and backward directions, effectively improving the learning ability of the model, which therefore fits the actual runoff better.

4.3.2. Comparative Analysis of Optimization Algorithms

To verify the superiority of the improved vector-weighted average algorithm (INFO) used in this paper, the more commonly used Bayesian optimization algorithm (BOA) was selected for comparison. For the INFO hyperparameter settings, see Section 4.2. For the BOA, the upper boundary is (200, 0.1, 0.5, 128), and the lower boundary is (1, 0.001, 0.1, 8); these hyperparameters represent the number of hidden layer neurons, the initial learning rate, the dropout rate, and the batch size, respectively. With the two optimization algorithms performing extensive deep learning on the model, the optimal fitness of the INFO optimization algorithm is 0.4783, as shown in Figure 10. The corresponding optimal parameters of the algorithm are a learning rate of 0.0009, 186.32 hidden layer neurons, and a dropout of 0.47. The optimal parameters of the Bayesian optimization algorithm are as follows: the number of hidden layer neurons is 199.70, the initial learning rate is 0.0054, the dropout is 0.35, and the batch size is 8.88. The optimal parameters of INFO and BOA are put into their respective models, and a comparison chart of the prediction effect is obtained.
Figure 11 shows the prediction results obtained after optimization of the neural network model by the INFO and BOA optimization algorithms. It can be seen from the figure that the prediction effect of the INFO-CNN-Bi-LSTM-Attention model is better than that of the BOA-CNN-Bi-LSTM-Attention model. The fitting ability of the Bayesian optimization algorithm gradually weakens, especially when the actual value reaches its peak, and the model prediction worsens in the final two months. In contrast, the INFO optimization algorithm improves both the overall prediction and the peak prediction ability.
For the optimized CNN-Bi-LSTM-Attention prediction models, this paper still used the mean absolute error (MAE), root mean square error (RMSE), fitting coefficient (R2), and mean absolute percentage error (MAPE) as the basis for evaluating model performance. From the comparative analysis of evaluation indicators in Table 4, the prediction effect of INFO-CNN-Bi-LSTM-Attention is better than that of BOA-CNN-Bi-LSTM-Attention and CNN-Bi-LSTM-Attention. The R2 of the model with this optimization algorithm is 0.993; compared with the base model and the Bayesian optimization model, this represents an increase of 4.53% and 0.61%, respectively. The RMSE of this optimization algorithm is 0.221, a decrease of 42.59% and 22.72% compared with that of the other two models. The MAE of this optimization algorithm is 0.163, a decrease of 49.38% and 24.18% compared with that of the other two methods. The MAPE of this optimization algorithm is 0.041, a decrease of 53.66% and 28.07%, respectively, compared with that of the other two models. These results show that after the optimization algorithm is added, R2 improves and RMSE, MAE, and MAPE decrease, proving that both INFO and BOA can optimize the model and improve its prediction accuracy, and that the CNN-Bi-LSTM-Attention prediction model based on the improved vector-weighted average algorithm has a better fitting effect, can thoroughly analyze the time series characteristics of runoff, and has better stability and generalization. At the same time, the results show that the optimization algorithm can effectively improve the prediction ability and accuracy of the model and is an effective way to improve the model.

4.4. Model Generalization Ability Verification

The model used in this study improves its generalization ability through data enhancement, regularization technology, and ensemble learning. Data enhancement improves the model's adaptability by increasing the diversity of the data; regularization prevents overfitting by limiting the complexity of the model; and ensemble learning improves the overall generalization performance by combining the predictions of multiple models. The model was first trained and verified on the historical data of the Xiaolangdi hydrology station and then tested on the data of the Huayuankou hydrology station to evaluate its generalization performance. After strict training and verification on the Xiaolangdi data set, the model showed high prediction accuracy. To further test its generalization ability, it was applied to the data set of the Huayuankou hydrology station. The model prediction results are shown in the figure below.
Figure 12 shows a comparison of the prediction results obtained with four models on the Huayuankou hydrological station data set. The results show that the INFO-CNN-Bi-LSTM-Attention model predicts the peak values well and performs well in predicting long time series. The overall prediction effect of the CNN-Bi-LSTM-Attention model is also up to standard, especially over short periods. The model also achieved satisfactory prediction results at the Huayuankou hydrological station, and its test results at two different hydrological stations reached the expected accuracy requirements. This finding confirms that the model performs well at specific hydrological stations and can make compelling predictions under similar hydrological conditions, indicating that the model can adapt to different hydrological conditions and environmental characteristics. The model's generalization ability validates its potential in practical applications and makes it a reliable tool for runoff prediction in different regions.

5. Discussion

Given the complicated data content in the hydrological field and the poor prediction effect of ordinary models, this study proposed a novel optimization model, namely the INFO-optimized CNN-Bi-LSTM-Attention model, which first preprocesses the data and then extracts the deep features of the data through CNN-Attention to further optimize the Bi-LSTM weights. At the same time, the sequential characteristics of the data are processed in both the forward and backward directions. An improved vector-weighted average algorithm optimizes the parameters of the model to further improve its performance. Compared with other basic models (Bi-LSTM, CNN-Bi-LSTM, etc.), this optimization model has higher prediction accuracy and better generalization ability. The following is an analysis of this study's contribution, influence, and limitations.
Influence and contribution in the field of hydrology:
(i)
Improved prediction accuracy: By utilizing deep learning and improved optimization techniques, the proposed model outperforms traditional methods in accuracy, especially for complex and dynamic runoff scenarios. This advance is essential for flood management, water resource planning, and environmental sustainability.
(ii)
Reduce reliance on manual parameter tuning: Adopting data-driven methods and advanced algorithms reduces the dependence on manual parameter selection and improves the efficiency and reliability of runoff prediction models.
(iii)
Practical application: The INFO-CNN-Bi-LSTM-Attention model can accurately predict short- and long-term runoff, providing practical applications in water resources management, agriculture, urban planning, and disaster preparedness. It provides stakeholders with timely and reliable information for the decision-making process.
(iv)
The deep learning model used in this study can be applied not only in the field of hydrology but also to predicting the remaining service life of lithium-ion batteries, network flow, gas well classification, the remaining life of rolling bearings, and so on, where good results have been obtained, in agreement with the literature [45,46,47,48,49].
Analysis of limitations of this study:
(i)
Data limitations: The data used in this study included the observation data of Xiaolangdi Reservoir and its surrounding weather stations, which only cover a specific period in a particular region from 2018 to 2021. There are some differences in the number of daily observation data, which may limit the generalization ability of the model. Future studies could consider expanding the coverage of the data set to include more geographic areas and different climate conditions to verify the model’s adaptability in various environments.
(ii)
Subjectivity of parameter selection: Although the improved INFO algorithm was used to optimize the model in this study, parameter selection may still be affected by researchers’ subjective experience. Future research could explore more automated or data-driven methods to optimize model parameters to improve model stability and reliability.
(iii)
Model generalization ability: Although studies have shown that the INFO-CNN-Bi-LSTM-Attention model performs well on specific prediction tasks, its generalization ability in future time series or unknown environments needs to be further verified. Future studies can evaluate the model’s generalization ability through cross-validation, externally validated data sets, or cross-regional validation.
(iv)
Uncertainty and risk management: Hydrologic forecasting involves complex factors such as natural systems and climate change, so uncertainty in forecast results is inevitable. Future research could strengthen the quantitative analysis of uncertainties and explore how to manage these uncertainties effectively in decision support systems.

6. Conclusions

To address problems such as the low recognition accuracy of traditional machine learning models, the dependence on manual experience in parameter selection, and the inability to screen important data information, this paper proposes a data-driven deep learning model for the runoff field that adopts an improved INFO algorithm to optimize the CNN-Bi-LSTM-Attention network architecture. By forecasting and analyzing the runoff of the data described in this paper, the following conclusions are reached:
(i)
Pearson correlation analysis was used to analyze the data selected in this study. The correlation coefficient shows that water level, temperature, and precipitation are the main factors affecting the prediction of runoff, and other meteorological factors also have certain but relatively small impacts.
(ii)
The three basic deep learning models selected in this study all have high prediction accuracy, especially in short-term runoff prediction, the fitting coefficient, R2, of which is greater than 0.873. The CNN-Bi-LSTM-Attention model has the best prediction effect, with a fitting coefficient, R2, of 0.948. Compared with that of the other two models, this represents an increase of 3.38% and 7.91%, respectively, which indicates that this model can better extract the deep features of data, capture critical hydrological information, and improve prediction accuracy.
(iii)
Two optimization algorithms were selected in this study. By optimizing the hyperparameters of the CNN-Bi-LSTM-Attention model, it was found that, compared with the BOA optimization algorithm, the improved INFO optimization algorithm used in this paper has a better prediction effect, especially for peak values and long time series, and its fitting coefficient, R2, is as high as 0.993. This shows that the improved INFO-CNN-Bi-LSTM-Attention prediction model has better fitting and generalization ability.
According to the prediction results of the INFO-CNN-Bi-LSTM-Attention model established in this paper, on the one hand, its optimal fitness can reach the corresponding value quickly, its convergence speed is faster, and its global search ability is better. On the other hand, the predicted and actual values have the best fitting effect, accurately reflecting the change rules of runoff and providing a new contribution for runoff prediction in medium- and long-term time series.
Potential practical applications and policy implications of this research:
(i)
Water management: Improved runoff forecasting can help water managers optimize reservoir operations, carry out irrigation scheduling, and improve drought management strategies;
(ii)
Adaptation to climate change: Accurate runoff predictions are critical for adapting infrastructure to and creating policies for changing hydrological conditions as climate variability increases;
(iii)
Policymaking: Policymakers can use accurate runoff projections to formulate effective policies related to water resource allocation, environmental protection, and disaster risk reduction.
In summary, as this study demonstrates, integrating deep learning models with advanced optimization techniques significantly advances runoff prediction. It improves the accuracy of forecasts and contributes to informed decision-making and sustainable water management practices across sectors. Based on the paper's analysis of limitations and potential applications, the next research directions should involve testing the feasibility of the model on data sets from different regions and climates, further exploring better methods for parameter optimization, and quantifying the uncertainty of hydrological predictions.

Author Contributions

Conceptualization, W.W. and Y.H.; methodology, W.W.; software, Y.H.; validation, W.W., Y.H. and X.Z. (Xiaozhen Zheng); formal analysis, T.M.; investigation, J.Z.; resources, X.Z. (Xiaozhen Zheng); data curation, Z.C.; writing—original draft preparation, Y.H.; writing—review and editing, W.W.; visualization, Y.H.; supervision, X.Z. (Xiaozhen Zheng); project administration, Y.H.; funding acquisition, X.Z. (Xiaoyuan Zhang). All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Joint Fund of Science and Technology Research and Development Program of Henan Province (grant number 225200810038) and the Natural Science Fund of Henan Province (grant number 232300421207). The APC was funded by Henan University of Technology.

Data Availability Statement

The data sets presented in this article are not readily available because the data are part of an ongoing study.

Acknowledgments

We would like to thank the Yellow River Water Conservancy Commission for the data, Henan University of Technology for the experimental environment, and the Henan Provincial Fund for their support.

Conflicts of Interest

The authors declare no conflicts of interest.

Figure 1. Runoff variation diagram of Xiaolangdi Dam (located in the north of Luoyang City, Henan Province, China) from 2018–2021.
Figure 2. Water-level change diagram of Xiaolangdi Dam from 2018–2021.
Figure 3. Changes in meteorological factors from 2018–2021.
Figure 4. CNN network structure.
Figure 5. Single-cell structure of LSTM.
Figure 6. Bi-LSTM network structure.
Figure 8. Overall structure of the model.
Figure 9. Prediction effect of the three models. (a) January–March forecast chart; (b) April–June forecast chart; (c) July–September forecast chart; (d) October–December forecast chart.
Figure 10. Fitness curve.
Figure 11. Prediction results of two optimization algorithms.
Figure 12. Comparison of prediction effects of four models in Huayuankou hydrology station.
Table 1. Pearson correlation coefficient.

Method  | Water Level | Max Temperature | Min Temperature | Mean Pressure | Mean Wind Speed | Rainfall
Pearson | 0.989       | 0.611           | 0.657           | −0.323        | −0.285          | 0.732
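For readers reproducing the input-factor screening, the sketch below shows one way Pearson coefficients like those in Table 1 could be computed. The DataFrame here is a random placeholder with hypothetical column names; in practice it would hold the Xiaolangdi daily records.

```python
# Minimal sketch of the Pearson screening behind Table 1 (placeholder data).
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 365
df = pd.DataFrame({
    "runoff":          rng.gamma(2.0, 300.0, size=n),
    "water_level":     rng.normal(250.0, 5.0, size=n),
    "max_temperature": rng.normal(20.0, 8.0, size=n),
    "min_temperature": rng.normal(10.0, 8.0, size=n),
    "mean_pressure":   rng.normal(1010.0, 6.0, size=n),
    "mean_wind_speed": rng.gamma(2.0, 1.0, size=n),
    "rainfall":        rng.gamma(1.5, 4.0, size=n),
})

factors = [c for c in df.columns if c != "runoff"]
# DataFrame.corrwith() uses the Pearson coefficient by default
print(df[factors].corrwith(df["runoff"]).round(3))
```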
Table 2. Network parameter settings.

Network Layer         | Output Dimension | Activation Function
Input layer           | (100,1,1)        | -
CNN layer             | (100,1,8)        | ReLU
Fully connected layer | (1,100)          | ReLU, Sigmoid
Attention             | (1,100)          | -
Bi-LSTM layer         | (1,100)          | Sigmoid, tanh
Hidden layer          | (1,200)          | -
Fully connected layer | (1,100)          | Sigmoid
Output layer          | (1,100)          | -
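The following Python sketch illustrates a CNN–Attention–Bi-LSTM stack of the kind summarized in Table 2. It is not the authors' exact configuration: the kernel size, layer widths, and sequence length are assumptions chosen for demonstration (in the paper such parameters are tuned by INFO), and a standard dot-product attention layer stands in for the attention mechanism described in the text.

```python
# Illustrative CNN -> Attention -> Bi-LSTM model in the spirit of Table 2.
# Layer sizes, kernel size, and sequence length are assumptions.
import tensorflow as tf
from tensorflow.keras import layers, models

seq_len, n_features = 100, 1                      # assumed input shape

inputs = layers.Input(shape=(seq_len, n_features))
# 1-D convolution extracts local features from the runoff/water-level sequence
x = layers.Conv1D(filters=8, kernel_size=3, padding="same", activation="relu")(inputs)
# Dot-product self-attention re-weights the convolutional features over time
x = layers.Attention()([x, x])
# Bidirectional LSTM reads the weighted sequence forwards and backwards
x = layers.Bidirectional(layers.LSTM(100))(x)
x = layers.Dense(100, activation="sigmoid")(x)
outputs = layers.Dense(1)(x)                      # predicted runoff value

model = models.Model(inputs, outputs)
model.compile(optimizer="adam", loss="mse")
model.summary()
```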
Table 3. Comparative analysis of model evaluation indicators.

Model                 | R2    | RMSE  | MAE   | MAPE
CNN-Bi-LSTM-Attention | 0.948 | 0.385 | 0.322 | 0.063
CNN-Bi-LSTM           | 0.916 | 0.496 | 0.326 | 0.087
Bi-LSTM               | 0.873 | 0.771 | 0.597 | 0.137
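The four indicators reported in Tables 3 and 4 can be computed directly from the observed and predicted series. The sketch below assumes NumPy arrays and expresses MAPE as a fraction, consistent with the magnitudes shown in the tables.

```python
# Minimal sketch of the evaluation indicators R2, RMSE, MAE, and MAPE.
import numpy as np

def evaluate(y_true, y_pred):
    resid = y_true - y_pred
    ss_res = np.sum(resid ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return {
        "R2":   1.0 - ss_res / ss_tot,
        "RMSE": np.sqrt(np.mean(resid ** 2)),
        "MAE":  np.mean(np.abs(resid)),
        "MAPE": np.mean(np.abs(resid / y_true)),   # fraction, not percent
    }

# toy usage with placeholder values
print(evaluate(np.array([2.1, 3.4, 5.0]), np.array([2.0, 3.6, 4.8])))
```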
Table 4. Comparative analysis of evaluation indicators after adding the optimization algorithms.

Model                   | R2    | RMSE  | MAE   | MAPE
CNN-Bi-LSTM-Attention   | 0.948 | 0.385 | 0.322 | 0.063
BOA optimization model  | 0.987 | 0.286 | 0.215 | 0.057
INFO optimization model | 0.993 | 0.221 | 0.163 | 0.041
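To make the role of the optimization stage concrete, the hedged sketch below wraps a candidate network in an objective that returns validation RMSE. A plain random search stands in for INFO and BOA, whose update rules are not reproduced here, and the data and model builder are synthetic placeholders rather than the configuration used in the paper.

```python
# Hedged sketch: hyper-parameter search around a candidate network.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

rng = np.random.default_rng(0)
seq_len = 50                                               # arbitrary for the demo
x = rng.normal(size=(500, seq_len, 1)).astype("float32")   # synthetic sequences
y = x.mean(axis=(1, 2)) + rng.normal(scale=0.05, size=500).astype("float32")
x_train, x_val, y_train, y_val = x[:400], x[400:], y[:400], y[400:]

def build_model(units, learning_rate):
    # placeholder network; the paper's model is the CNN-Bi-LSTM-Attention stack
    inp = layers.Input(shape=(seq_len, 1))
    h = layers.Bidirectional(layers.LSTM(units))(inp)
    out = layers.Dense(1)(h)
    m = models.Model(inp, out)
    m.compile(optimizer=tf.keras.optimizers.Adam(learning_rate), loss="mse")
    return m

def objective(params):
    m = build_model(int(params["units"]), params["learning_rate"])
    m.fit(x_train, y_train, epochs=5, verbose=0)
    pred = m.predict(x_val, verbose=0).ravel()
    return float(np.sqrt(np.mean((y_val - pred) ** 2)))    # validation RMSE

best = None
for _ in range(10):                                         # small search budget
    cand = {"units": int(rng.integers(16, 128)),
            "learning_rate": float(10 ** rng.uniform(-4, -2))}
    score = objective(cand)
    if best is None or score < best[0]:
        best = (score, cand)
print("best validation RMSE and parameters:", best)
```

Replacing the random draw with INFO's vector-weighted-average update, or with BOA's acquisition step, only changes how the next candidate is proposed; the objective itself stays the same.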
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
