Comparisons of Different Machine Learning-Based Rainfall–Runoff Simulations under Changing Environments

Li, Chenliang; Jiao, Ying; Kan, Guangyuan; Fu, Xiaodi; Chai, Fuxin; Yu, Haijun; Liang, Ke

doi:10.3390/w16020302



by Chenliang Li ^1,2, Ying Jiao ^3, Guangyuan Kan ^1,2,*, Xiaodi Fu ^1,2, Fuxin Chai ^1,2, Haijun Yu ^1,2 and Ke Liang ^4

^1 State Key Laboratory of Simulation and Regulation of Water Cycle in River Basin, Research Center on Flood & Drought Disaster Prevention and Reduction of the Ministry of Water Resources, China Institute of Water Resources and Hydropower Research, Beijing 100038, China

^2 Key Laboratory of Water Safety for Beijing-Tianjin-Hebei Region of Ministry of Water Resources, Beijing 100038, China

^3 China Water Resources Bei Fang Investigation, Design & Research CO. LTD, Tianjin 300222, China

^4 Beijing IWHR Corporation, Beijing 100048, China

^* Author to whom correspondence should be addressed.

Water 2024, 16(2), 302; https://doi.org/10.3390/w16020302

Submission received: 13 December 2023 / Revised: 10 January 2024 / Accepted: 11 January 2024 / Published: 16 January 2024


Abstract:

Climate change and human activities strongly affect the environment and challenge the assumptions that hydrological time series are stable and that observed data are consistent. To investigate the applicability of machine learning (ML)-based rainfall–runoff (RR) simulation methods under a changing environment, several ML-based RR simulation models were constructed in a novel continuous, non-real-time correction manner. The proposed models used categorical boosting (CatBoost), a multi-hidden-layer BP neural network (MBP), and a long short-term memory neural network (LSTM) as the input–output simulators. Daily RR simulations were carried out for the Dongwan catchment of the Yiluo River Basin to verify the models' applicability. Model performance was evaluated with statistical indicators: the coefficient of determination (R²), the peak flow error, and the runoff depth error. The findings indicated that (1) ML-based RR simulation using a consistency-disrupted dataset exhibited significant bias. During the validation phase, the R² of the three models decreased to around 0.6, and the peak flow error increased to over 20%. (2) Identifying the transition point in data consistency through data analysis and conducting staged RR simulations before and after that point can improve simulation accuracy. The R² values of all three models during both the baseline and change periods were above 0.85, with peak flow and runoff depth errors of less than 20%. Among them, the CatBoost model demonstrated superior phased simulation accuracy and smoother simulated hydrographs that closely matched the measured runoff process at high, medium, and low water levels, with daily runoff simulation results surpassing those of the MBP and LSTM models. (3) When simulating the entire dataset without staged treatment, it is impossible to achieve good simulation results by adopting uniform extraction of the training samples. Under this scenario, the MBP exhibited the strongest generalization capability, highest prediction accuracy, better algorithm stability, and superior simulation accuracy compared to the CatBoost and LSTM simulators. This study offers new ideas and methods for enhancing the runoff simulation capabilities of machine learning models in changing environments.

Keywords:

changing environment; rainfall–runoff simulation; CatBoost; multi-hidden-layer BP neural network; long short-term memory neural network

1. Introduction

Accurate and reliable rainfall–runoff (RR) simulation is crucial for flood prevention, disaster reduction, and water resource management. However, the frequent and intense disturbances to natural water resource systems caused by climate change, human activities, and underlying surface evolution have led to continuous changes in the mechanisms and patterns of runoff formation [1,2]. In the process of runoff prediction, numerous uncertainties contribute to the instability of forecast results [3,4]. Constructing RR prediction models that adapt to environmental disturbances is a fundamental challenge [5].

Process-driven models predict runoff from the theories and mechanisms of runoff formation, using physically based governing equations to model the water cycle, and offer strong interpretability and rigorous adherence to physical mechanisms [6,7]. Data-driven models, in contrast, start from the statistical regularities contained in the data; they do not consider the physical causes of runoff generation and directly mimic the relationship between model inputs and outputs to obtain runoff predictions [8]. These models can simulate complex relationships between the driving factors of runoff and runoff time series without prior knowledge of the system’s physical mechanisms [9].

RR models based on data-driven methods primarily involve constructing modeling approaches and selecting input–output mapping models. Data-driven models often use real-time correction modes, utilizing rainfall and the measured prior flow (i.e., measured runoff before the forecast time) as model inputs to extrapolate the flow in the next time step. Real-time correction mode depends on the measured prior flow and rainfall, allowing for only a short-term forecast of one computation period ahead, though with higher accuracy [10]. In contrast, non-real-time correction modes can simulate flow processes continuously based on future rainfall sequences without relying on the measured prior flow, offering longer foresight periods than real-time correction modes but with lower accuracy [11,12].

In recent years, machine learning (ML) technologies, represented by neural networks, have been widely used in data-driven RR prediction research because they handle nonlinear problems well, do not require an understanding of the physical mechanisms linking inputs and outputs, and generalize well. Examples include backpropagation (BP) neural networks [13], long short-term memory (LSTM) neural networks [14,15,16], support vector machines (SVM) [17,18], decision trees (DT), and boosting methods (such as CatBoost, XGBoost, and LightGBM [19,20]). The BP model has strong nonlinear mapping capabilities but suffers from complex structures, susceptibility to local optima during training, and difficulties in hyperparameter optimization. The LSTM model, with its gated unit structure, has a significant advantage in time series prediction with long-range dependencies but has drawbacks such as many internal parameters [21,22,23], slow training convergence, and high computational resource demands [24]. The CatBoost model efficiently handles the categorical features of the RR process, addresses gradient bias and prediction shift issues, reduces the occurrence of overfitting, and improves algorithm accuracy and generalization capability [25].

This study investigates the applicability of ML-based RR prediction under environmental disturbances. A non-real-time correction forecasting mode for RR was constructed, and three types of data-driven RR models were developed: CatBoost, MBP, and LSTM. This study compared the prediction results of these models, analyzed the impact of changing environmental factors on runoff prediction, and employed both phased RR simulation before and after the abrupt change point and cross-sampling methods for RR simulation. These approaches were used to analyze and compare prediction results, exploring methods to enhance the accuracy of runoff simulation under the influence of changing environmental factors. The methods were applied to simulate the daily flood process from 1961 to 2000 in the Dongwan sub-basin of the Yiluo River Basin. The results showed that both phased runoff simulations following the abrupt changes due to environmental disturbances and uniformly extracting training samples can improve simulation accuracy. Among these, the MBP model demonstrated the strongest generalization capability, the highest prediction accuracy, and the best algorithm stability. This paper provides a new method and approach for runoff forecasting using machine learning models under the disturbance of changing environmental factors.

2. Materials and Methods

2.1. Study Area

The Yi River originates at the northern foot of the Funiu Mountains in Luanchuan County, Henan Province, and is an important tributary in the Yellow River’s Xiaolangdi-Huayuankou region [26]. The upper reaches of the Yi River watershed above the Dongwan hydrological station (between 33.5° N and 34.5° N and between 111° E and 112° E) were chosen as the study area. The topographic and hydrographic map of the watershed is shown in Figure 1. The Dongwan hydrological station controls a drainage area of approximately 2656 km², with an average annual runoff of 1.025 billion m³. The recorded maximum peak flow since its establishment was 4200 m³/s on 8 August 1975. The watershed has a warm, temperate, semi-humid continental climate, with an average annual precipitation ranging from 500 to 1100 mm. The interannual variation in precipitation is significant, with the maximum annual precipitation being approximately twice the minimum. Precipitation is unevenly distributed throughout the year, with the majority occurring from July to September, accounting for 60% to 70% of the annual rainfall. The catchment has complex terrain and is influenced by both mid- and low-latitude weather systems, making it prone to heavy, widespread, and prolonged rainfall events.

2.2. Data

2.2.1. Data Description

This study utilized 90 m resolution digital elevation model (DEM) terrain data provided by HYDROSHEDS [27] to extract the catchment above the Dongwan station. In data processing, Arcview 3.2 was employed for tasks such as watershed extraction, river network generation, and sub-watershed delineation within the study area. The research watershed was divided into eight sub-watersheds using Thiessen polygons, as illustrated in Figure 2. The study area contains eight rainfall stations: Taowan, Luanchuan, Miaozi, Baishi, Baitu, Tantou, Heyu, and Dongwan. The rainfall stations are listed in Table 1. Daily rainfall data from 1961 to 2000 (excluding the missing data from 1969) were extracted for modeling, along with observed flow data from the Dongwan station of the same period, to conduct daily model simulations and forecasts.

2.2.2. Data Pre-Processing

Due to differences in numerical scale and unit between flow and rainfall data, this study employs min–max normalization, which rescales all data to the range [0, 1]. Normalizing both input and output node data converts absolute values into relative ones, eliminating the interference of dimensions on model computations [28]. The normalization formulas are as follows:

\hat{P} = \frac{P - P_{\min}}{P_{\max} - P_{\min}}

(1)

\hat{Q} = \frac{Q - Q_{\min}}{Q_{\max} - Q_{\min}}

(2)

where \hat{P} represents the normalized precipitation; P is the original precipitation, mm; P_{\max} and P_{\min} are the maximum and minimum values of the precipitation sequence, mm, respectively; \hat{Q} represents the normalized runoff; Q is the original runoff, m³/s; and Q_{\max} and Q_{\min} are the maximum and minimum values of the runoff sequence, m³/s, respectively.
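Equations (1) and (2) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the function names, the handling of a constant series, and the sample values are assumptions introduced here.

```python
def min_max_normalize(values):
    """Rescale a sequence to [0, 1] following Equations (1) and (2)."""
    v_min, v_max = min(values), max(values)
    span = v_max - v_min
    if span == 0:
        # Constant series: the formula is undefined, so return zeros (an assumption).
        return [0.0 for _ in values], v_min, v_max
    return [(v - v_min) / span for v in values], v_min, v_max


def denormalize(scaled, v_min, v_max):
    """Invert the scaling so that simulated flow is reported in m3/s again."""
    return [s * (v_max - v_min) + v_min for s in scaled]


rain_mm = [0.0, 12.5, 40.0, 3.2]       # illustrative values only
flow_m3s = [15.0, 22.0, 180.0, 60.0]   # illustrative values only

rain_hat, p_min, p_max = min_max_normalize(rain_mm)
flow_hat, q_min, q_max = min_max_normalize(flow_m3s)
flow_back = denormalize(flow_hat, q_min, q_max)
```

Keeping the minimum and maximum alongside the scaled series makes it possible to map model outputs back to physical units before computing peak flow or runoff depth errors.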

2.3. Methods

2.3.1. Novel Continuous Modeling Scheme

This study established a non-real-time correction rainfall–runoff data-driven model, which replaced the observed antecedent flow with the simulated antecedent flow, achieving high-precision continuous simulation under the non-real-time correction mode. The modeling approach can be expressed as follows:

Q_{t}^{\mathrm{sim}} = F_{\mathrm{non-realtime}} (P_{t}, \dots, P_{t - np}, Q_{t - 1}^{\mathrm{sim}}, \dots, Q_{t - nq}^{\mathrm{sim}}) \quad (0 < t < T)

(3)

where Q_{t}^{\mathrm{sim}}, Q_{t - 1}^{\mathrm{sim}}, and Q_{t - nq}^{\mathrm{sim}} represent the predicted outflow at times t, t − 1, and t − nq, respectively; P_{t} and P_{t - np} represent the precipitation at times t and t − np, respectively; T is the number of simulation time steps; F_{\mathrm{non-realtime}} denotes the non-real-time correction flow forecast; and np and nq are integers, representing the orders of the observed rainfall and simulated antecedent flow in the rainfall–runoff relationship, respectively.

At the beginning of the model’s computation, it uses measured prior flow for the initial few steps. After a few steps of warm-up, the computation is carried out using the outlet cross-section flow, calculated by the non-real-time corrected rainfall–runoff data-driven model, as the prior flow [29]. This approach allows for continuous simulation without the need for measured prior flow.
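The recursion in Equation (3), including the warm-up on measured flow, might look like the following sketch. The function `simulate_continuous`, the stand-in `toy_model`, and all numbers are hypothetical; any trained simulator (CatBoost, MBP, or LSTM) could take the place of `toy_model`.

```python
def simulate_continuous(model, rain, observed_flow, n_p, n_q):
    """Run Equation (3): feed simulated flow back in as the antecedent flow.

    model         : callable taking one feature list and returning the flow at time t
    rain          : rainfall series P_0 ... P_{T-1}
    observed_flow : measured flow, used only to seed the first steps
    n_p, n_q      : orders of rainfall and antecedent flow
    """
    start = max(n_p, n_q)
    # Warm-up: the first `start` values are measured flow.
    sim = list(observed_flow[:start])
    for t in range(start, len(rain)):
        rain_part = [rain[t - k] for k in range(n_p + 1)]    # P_t ... P_{t-np}
        flow_part = [sim[t - k] for k in range(1, n_q + 1)]  # Q_sim_{t-1} ... Q_sim_{t-nq}
        sim.append(model(rain_part + flow_part))
    return sim


def toy_model(features):
    """Hypothetical stand-in for a trained simulator (not from the paper)."""
    p_t, p_t1, q_t1 = features
    return 0.6 * q_t1 + 2.0 * p_t + 1.0 * p_t1


rain = [0.0, 5.0, 10.0, 0.0, 0.0]
observed = [10.0, 11.0, 30.0, 25.0, 16.0]
sim = simulate_continuous(toy_model, rain, observed, n_p=1, n_q=1)
```

After the seed values, the observed series is never read again, which is what distinguishes this mode from real-time correction.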

2.3.2. Machine Learning-Based Methods

In the application of machine learning models for time series simulation, the CatBoost model, based on boosting algorithms, is inherently optimized for handling categorical data [19], offers fast training speeds, and is adept at managing large-scale datasets. The LSTM model exhibits strong capabilities in handling long-sequence data with long-term dependencies and has been widely applied with success [30]. The multi-layer BP model, with its simple structure, serves as a foundation for early neural networks and is broadly applied and especially effective in nonlinear time series [31]. Considering these characteristics, this paper selects CatBoost, LSTM, and multi-layer BP models for research.

(1): CatBoost

CatBoost is a type of boosting algorithm and, along with XGBoost and LightGBM, is recognized as one of the three major algorithms under the gradient-boosting decision tree (GBDT) framework. It is an improved implementation within the GBDT algorithm framework. Yandex has demonstrated that CatBoost performs better than XGBoost and LightGBM in terms of algorithm accuracy [32].

CatBoost is a GBDT framework with fewer parameters, designed for high accuracy, and supports categorical variables. It adopts oblivious trees as base learners. Its name reflects its two components: categorical and boosting. It first computes statistics on categorical features, such as the frequency of occurrence of each category, and then, with the addition of hyperparameters, generates new numerical features, so that categorical features are handled efficiently and reasonably. Additionally, CatBoost addresses issues such as gradient bias and prediction shift, reducing the occurrence of overfitting and consequently enhancing the accuracy and generalization capability of the algorithm.

(2): LSTM

LSTM is a novel deep learning neural network built upon the recurrent neural network (RNN). Similar to other neural networks, the structure of the LSTM model consists of an input layer, one or more hidden layers, and an output layer. The neurons contained in its hidden layers not only receive information from the input layer but also receive information perceived by neurons from the previous time step.

The key idea of the LSTM lies in the cell state (see Figure 3), the horizontal line running along the top of the diagram. The cell state functions like a conveyor belt, running directly through the entire chain with only a few linear interactions, so information can easily flow along it unchanged. LSTM can remove information from or add information to the cell state through a carefully designed structure known as “gates”. Each gate comprises a sigmoid neural network layer and a pointwise multiplication operation. The sigmoid layer outputs values between 0 and 1, where 0 means “do not allow any amount to pass”, and 1 means “allow everything to pass”. LSTM incorporates three types of gates: the forget gate Z_{f}^{(t)}, the input gate Z_{i}^{(t)}, and the output gate Z_{o}^{(t)}.
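The gate mechanism can be illustrated with a single-unit LSTM step in plain Python. The sketch uses the standard LSTM update (sigmoid gates, a tanh candidate, and a pointwise cell-state update); the weights in `params` and the input values are hypothetical and are not taken from the paper.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, params):
    """One time step of a single-unit LSTM cell (scalar input, scalar state)."""
    def pre(gate):
        w, u, b = params[gate]
        return w * x + u * h_prev + b

    z_f = sigmoid(pre("forget"))       # share of the old cell state to keep
    z_i = sigmoid(pre("input"))        # share of the candidate to write
    z_o = sigmoid(pre("output"))       # share of the cell state to expose
    cand = math.tanh(pre("candidate"))
    c_new = z_f * c_prev + z_i * cand  # pointwise update of the cell state
    h_new = z_o * math.tanh(c_new)
    return h_new, c_new, (z_f, z_i, z_o)


# Hypothetical weights (input weight, recurrent weight, bias); not from the paper.
params = {
    "forget": (0.5, 0.1, 1.0),
    "input": (0.8, -0.2, 0.0),
    "output": (0.3, 0.4, 0.0),
    "candidate": (1.2, 0.5, 0.0),
}

h, c = 0.0, 0.0
for x in [0.1, 0.7, 0.2]:              # e.g. normalized rainfall
    h, c, gates = lstm_step(x, h, c, params)
```

Because each gate value lies strictly between 0 and 1, the cell state is only ever partially forgotten or partially overwritten at each step.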

(3): Multi-layer BP neural network

The BP neural network is a type of multi-layer feedforward neural network trained using the error backpropagation algorithm. It is currently one of the most widely used neural network models [33]. The basic idea of this network is to consider the nonlinear relationship between input nodes (with M nodes) and output nodes (with P nodes) as a function mapping from an M-dimensional Euclidean space to a P-dimensional Euclidean space. Theoretically speaking, it can fit the mapping relationship between input and output of any complexity without need for explicitly understanding the mathematical equations describing this mapping relationship.

The topological structure of the BP neural network model consists of an input layer, multiple hidden layers, and an output layer. Based on the number of hidden layers, the BP neural network can be divided into single-hidden-layer and multi-hidden-layer networks. Studies have shown that compared to a single hidden layer, a multi-hidden-layer BP (MBP) network has stronger generalization capability and higher prediction accuracy. For complex mapping relationships, increasing the number of hidden layers can significantly improve the network’s forecasting accuracy [34,35]. The schematic diagram of the structure of the MBP model is demonstrated in Figure 4. The training and learning process of the BP neural network consists of two stages: forward propagation of information flow and backward propagation of error signals. First, input information is propagated layer by layer from the input layer through the hidden layers, with each layer of neuron states only affecting the next layer of neurons. It finally reaches the output layer, and the error value is calculated. If the error value does not meet the convergence criterion, it enters the backward propagation stage, where the error is propagated back along the original path, and the partial derivatives of the error with respect to the weights of each neuron are calculated. These derivatives serve as the basis for modifying the weights of each neuron. The process of forward and backward propagation is repeated until the error meets the criterion, concluding the learning process of the BP neural network.
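The two training stages described above can be sketched for a small network with two hidden layers. This is an illustrative toy, not the MBP configuration used in the study: the layer sizes, tanh activations, linear output layer, learning rate, convergence threshold, and data are all assumptions.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Weights (n_out x n_in) and biases for one fully connected layer."""
    w = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    b = [0.0] * n_out
    return w, b


def forward(layers, x):
    """Forward propagation; returns the activations of every layer."""
    acts = [x]
    for idx, (w, b) in enumerate(layers):
        z = [sum(wi * a for wi, a in zip(row, acts[-1])) + bi
             for row, bi in zip(w, b)]
        last = idx == len(layers) - 1
        acts.append(z if last else [math.tanh(v) for v in z])  # linear output layer
    return acts


def backward(layers, acts, target, lr):
    """Backward propagation of the half squared error, then a gradient step."""
    delta = [a - t for a, t in zip(acts[-1], target)]  # error signal at the output
    for idx in range(len(layers) - 1, -1, -1):
        w, b = layers[idx]
        a_prev = acts[idx]
        if idx > 0:
            # Error signal for the previous layer, computed before weights change.
            back = [sum(w[j][k] * delta[j] for j in range(len(w)))
                    for k in range(len(a_prev))]
            prev_delta = [g * (1.0 - a * a) for g, a in zip(back, a_prev)]  # tanh'
        for j in range(len(w)):
            for k in range(len(a_prev)):
                w[j][k] -= lr * delta[j] * a_prev[k]
            b[j] -= lr * delta[j]
        if idx > 0:
            delta = prev_delta


def mse(layers, data):
    return sum((forward(layers, x)[-1][0] - y[0]) ** 2 for x, y in data) / len(data)


rng = random.Random(0)
# Two hidden layers (4 neurons each) between a 2-node input and a 1-node output.
layers = [init_layer(2, 4, rng), init_layer(4, 4, rng), init_layer(4, 1, rng)]
# Toy samples: (normalized rainfall, antecedent flow) -> flow; illustrative only.
data = [([0.0, 0.1], [0.08]), ([0.5, 0.1], [0.30]),
        ([1.0, 0.3], [0.70]), ([0.2, 0.7], [0.55])]

initial_error = mse(layers, data)
for epoch in range(2000):
    for x, y in data:
        backward(layers, forward(layers, x), y, lr=0.05)
    if mse(layers, data) < 1e-4:   # convergence criterion
        break
final_error = mse(layers, data)
```

The loop repeats forward and backward passes until the error criterion is met or the epoch budget runs out, mirroring the stopping rule described in the text.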

2.3.3. Model Performance Evaluation

(1): Criterion for performance evaluation

According to the “Hydrological Information Forecasting Standards” (GB/T 22482-2008) [36], the main parameters for evaluating the simulation effect include the coefficient of determination (R^{2}), the relative error of runoff depth (DE_{P}), and the relative error of peak flow rate (RE_{P}). Runoff depth refers to the depth of water generated from precipitation that does not infiltrate into the soil or get absorbed by vegetation and instead flows over the land surface. Peak flow refers to the maximum rate of water flow occurring at a particular point in a river or stream during a storm or after a heavy rainfall. The calculations for each indicator are as follows:

R^{2} = \frac{\left[\sum_{i = 1}^{N} (Q_{i}^{'} - \bar{Q^{'}}) (Q_{i} - \bar{Q})\right]^{2}}{\sum_{i = 1}^{N} (Q_{i}^{'} - \bar{Q^{'}})^{2} \sum_{i = 1}^{N} (Q_{i} - \bar{Q})^{2}}

(4)

DE_{P} = \frac{h^{'} - h}{h} \times 100\%

(5)

RE_{P} = \frac{Q^{\mathrm{peak}} - Q^{\mathrm{peak}'}}{Q^{\mathrm{peak}}} \times 100\%

(6)

where Q_{i} is the measured flow at the i-th moment, m³/s; Q_{i}^{'} is the simulated flow at the i-th moment, m³/s; N is the number of time steps in the flood process; \bar{Q} is the average measured flow, m³/s; \bar{Q^{'}} is the average simulated flow, m³/s; h is the measured runoff depth, mm; h^{'} is the simulated runoff depth, mm; Q^{\mathrm{peak}} is the measured peak flow, m³/s; and Q^{\mathrm{peak}'} is the simulated peak flow, m³/s.

R^{2} is a statistical metric that reflects the reliability of the variation in the dependent variable in regression analysis. Its value ranges from 0 to 1, with a value of 1 indicating perfect correlation. DE_{P} denotes the relative error between the observed and simulated runoff depth. RE_{P} denotes the relative error between the observed and simulated peak flow discharge. If the RE_{P} value is greater than 0, the simulated peak flow is underestimated; if it is less than 0, the peak flow is overestimated; if it equals 0, there is no forecast error. According to the forecasting standards, the permissible error for the RE_{P} is set at 20%, and the permissible error for the DE_{P} is also set at 20%. The accuracy during the calibration and validation periods is assessed using the validity test method and the permissible error qualification rate method.
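A minimal sketch of the three indicators follows, using R² as the squared correlation between simulated and measured flow and the sign conventions of Equations (5) and (6). The helper names, the conversion from daily mean flow to runoff depth (daily time step, drainage area of 2656 km² from Section 2.1), and the sample series are assumptions made for illustration.

```python
def r_squared(obs, sim):
    """Squared correlation between measured and simulated flow."""
    n = len(obs)
    mean_obs, mean_sim = sum(obs) / n, sum(sim) / n
    cov = sum((s - mean_sim) * (o - mean_obs) for o, s in zip(obs, sim))
    var_sim = sum((s - mean_sim) ** 2 for s in sim)
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    return cov ** 2 / (var_sim * var_obs)


def runoff_depth_mm(flow_m3s, area_km2, dt_seconds=86400.0):
    """Runoff depth: flow volume spread over the catchment area, in mm."""
    volume_m3 = sum(flow_m3s) * dt_seconds
    return volume_m3 / (area_km2 * 1.0e6) * 1000.0


def depth_error_pct(h_obs, h_sim):
    """Relative runoff depth error; positive means the simulated depth is too high."""
    return (h_sim - h_obs) / h_obs * 100.0


def peak_error_pct(obs, sim):
    """Relative peak flow error; positive means the simulated peak is too low."""
    return (max(obs) - max(sim)) / max(obs) * 100.0


obs = [20.0, 100.0, 400.0, 150.0, 60.0]   # illustrative daily flows, m3/s
sim = [25.0, 90.0, 360.0, 170.0, 55.0]

r2 = r_squared(obs, sim)
de_p = depth_error_pct(runoff_depth_mm(obs, 2656.0), runoff_depth_mm(sim, 2656.0))
re_p = peak_error_pct(obs, sim)
qualified = abs(de_p) <= 20.0 and abs(re_p) <= 20.0   # permissible errors of 20%
```

In this toy event the simulated peak is 10% too low and the simulated depth about 4% too low, so both errors fall within the permissible range.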

(2): Empirical flow duration curve

The flow duration curve (FDC) is a cumulative frequency curve that characterizes the relationship between runoff and frequency [37]. FDCs can be calculated for daily, weekly, and monthly runoff data, giving the runoff value that is equaled or exceeded at a given frequency. FDCs offer a simple, comprehensive, and graphical representation of the variability in historical runoff data [38,39]. Each flow value Q corresponds to the probability of exceeding that value. An FDC is a simple plot of the runoff Q_{p} corresponding to a certain probability p, where p is defined by the following formula:

p = 1 - P\{Q \leq Q_{p}\}

(7)

where Q_{p} is a function of runoff data and is often referred to as the empirical quantile function because this function depends on the observed values.

The FDC for daily runoff simulation can be divided into five intervals: [0, 10%], [10%, 40%], [40%, 60%], [60%, 90%], and [90%, 100%]. These intervals correspond to high flow, wet conditions, moderate flow, dry conditions, and low flow, respectively [40].
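An empirical FDC and the five-interval classification can be sketched as follows. The Weibull plotting position m / (n + 1) used to estimate the exceedance probability, the assignment of shared interval boundaries to the wetter class, and the sample flows are assumptions; the text does not specify these details.

```python
def flow_duration_curve(flows):
    """Return (exceedance probability, flow) pairs ordered from high to low flow."""
    ranked = sorted(flows, reverse=True)
    n = len(ranked)
    # Weibull plotting position m / (n + 1); this choice is an assumption.
    return [((m + 1) / (n + 1), q) for m, q in enumerate(ranked)]


# Upper bound of each interval and its label, following the five intervals above.
SEGMENTS = [
    (0.10, "high flow"),
    (0.40, "wet conditions"),
    (0.60, "moderate flow"),
    (0.90, "dry conditions"),
    (1.00, "low flow"),
]


def classify(p):
    """Map an exceedance probability to its FDC interval."""
    for upper, label in SEGMENTS:
        if p <= upper:
            return label
    raise ValueError("probability must lie in [0, 1]")


daily_flow = [5, 12, 300, 45, 8, 20, 75, 150, 3]   # illustrative values, m3/s
fdc = flow_duration_curve(daily_flow)
labels = [(q, classify(p)) for p, q in fdc]
```

Computing the same curve for measured and simulated series allows model skill to be compared interval by interval rather than only through a single score.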

2.3.4. Mann–Kendall Trend Test and Mann–Kendall Change Point Test

The Mann–Kendall (MK) method is an effective approach for testing trend changes in runoff time series. Its fundamental concept is to compare each data point with every later point in the time series, thereby determining whether the series exhibits a trend. To investigate the consistency of the rainfall–runoff data, this study used the MK method to conduct trend analysis and change point detection [41]. The statistic Z is the significance test value for the MK trend analysis. When Z is greater than 0, it indicates an upward trend in the series; if Z is less than 0, it indicates a downward trend. When the absolute value of Z is greater than or equal to the critical values of 1.28, 1.64, and 2.32, the trend in the time series is significant at the 90%, 95%, and 99% confidence levels, respectively [42].
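The trend statistic can be sketched as below, using the standard MK formulation (sum of signs of pairwise differences, normal approximation with continuity correction). The variance omits the correction for tied values, which is a simplifying assumption, and the sample series is invented; the critical values are those quoted above.

```python
import math


def mann_kendall_z(series):
    """Return the MK statistic S and its standardized value Z."""
    n = len(series)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)          # sign of the difference
    # Variance without tie correction (assumes no repeated values).
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    return s, z


def significance(z):
    """Highest confidence level passed, using the critical values in the text."""
    for level, critical in (("99%", 2.32), ("95%", 1.64), ("90%", 1.28)):
        if abs(z) >= critical:
            return level
    return "not significant"


annual = [820, 790, 805, 760, 740, 755, 700, 690, 710, 650]   # illustrative values
s, z = mann_kendall_z(annual)
level = significance(z)
```

A negative Z, as in this toy series, indicates a downward trend; `significance` then reports the strictest confidence level that the statistic passes.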

In change point detection, the statistics UF and UB indicate an upward trend in the data series when positive and a downward trend when negative, and values beyond the critical lines indicate that the trend is significant. If the UF and UB curves intersect between the critical lines, the intersection is considered a change point, dividing the series into a baseline period and a changing period.
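A common way to obtain UF and UB is the sequential MK procedure: UF is computed on the series in time order, and UB is the same statistic computed on the reversed series with its sign flipped. The sketch below assumes that formulation, a critical value of 1.96 (0.05 significance level), and an invented series; the paper does not give these computational details.

```python
import math


def forward_statistic(series):
    """Sequential MK statistic for each prefix of the series."""
    stats = [0.0]          # undefined for a single point; 0 by convention
    s = 0
    for k in range(1, len(series)):
        # Count earlier values that are smaller than the current one.
        s += sum(1 for j in range(k) if series[j] < series[k])
        m = k + 1          # number of points used so far
        mean = m * (m - 1) / 4.0
        var = m * (m - 1) * (2 * m + 5) / 72.0
        stats.append((s - mean) / math.sqrt(var))
    return stats


def mk_change_points(series, critical=1.96):
    """Return UF, UB, and indices where the curves cross inside the critical lines."""
    uf = forward_statistic(series)
    # UB: same statistic on the reversed series, sign flipped and re-reversed.
    ub = [-v for v in forward_statistic(series[::-1])][::-1]
    crossings = []
    for k in range(len(series) - 1):
        d0, d1 = uf[k] - ub[k], uf[k + 1] - ub[k + 1]
        if d0 * d1 < 0:                       # curves cross between k and k + 1
            idx = k if abs(d0) <= abs(d1) else k + 1
            if abs(uf[idx]) < critical and abs(ub[idx]) < critical:
                crossings.append(idx)
    return uf, ub, crossings


annual_runoff = [310, 295, 330, 300, 320, 305, 315, 210, 190, 220, 200, 185]  # illustrative
uf, ub, crossings = mk_change_points(annual_runoff)
```

When several crossings are returned, the test alone does not single out one change point, and additional evidence about the catchment is needed to choose among them.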

2.3.5. Double Mass Curve

The double mass curve (DMC) method is currently the simplest, most intuitive, and most widely used approach for the consistency and long-term trend analysis of hydro-meteorological elements. It plots the cumulative values of one variable against the cumulative values of another variable over the same period in a Cartesian coordinate system. By establishing the DMC, the influence of the reference variable is eliminated, revealing whether another factor has caused a significant trend change in the tested variable. In the rainfall–runoff DMC analysis, the characteristics of precipitation and runoff changes are analyzed after the year of abrupt change has been identified. The DMC method is then used to differentiate the impacts of climate change and human activities on runoff, and, through an analysis tailored to the actual conditions of the watershed, these influences are assessed and the resulting hydrological effects are evaluated.
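The DMC construction amounts to two cumulative sums and a straight-line fit on either side of the change point. The following sketch uses invented annual values in which the runoff coefficient drops from 0.40 to 0.25 halfway through the record; the helper names and the split index are assumptions for illustration.

```python
def cumulative(values):
    """Running total of a series."""
    total, out = 0.0, []
    for v in values:
        total += v
        out.append(total)
    return out


def linear_fit(x, y):
    """Least-squares line; returns slope, intercept, and R2 of the fit."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx, sxy ** 2 / (sxx * syy)


# Illustrative annual rainfall (mm) and runoff (mm); not data from the study.
rain = [800.0, 700.0, 900.0, 750.0, 850.0, 800.0, 700.0, 900.0]
runoff = [0.40 * p for p in rain[:4]] + [0.25 * p for p in rain[4:]]

cum_rain, cum_runoff = cumulative(rain), cumulative(runoff)
split = 4   # index of the first year after the assumed change point
slope_before, _, r2_before = linear_fit(cum_rain[:split], cum_runoff[:split])
slope_after, _, r2_after = linear_fit(cum_rain[split:], cum_runoff[split:])
```

A lower slope after the split means that the same cumulative rainfall now yields less cumulative runoff, which is the kind of break in the curve that the method is designed to expose.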

2.4. Model Development

CatBoost, MBP, and LSTM models were employed for daily runoff simulation. Observed daily RR data from 1961 to 2000 were used for model training and testing and were split in chronological order: 80% of the data was designated as the training set and the remaining 20% as the testing set. Specifically, the data from 1961 to 1992, excluding 1969 (for which no data are available), totaling 31 years, served as the training set, while the data from 1993 to 2000, spanning 8 years, comprised the testing set. Rainfall and antecedent runoff were selected as input variables. Training samples for the CatBoost, MBP, and LSTM models were generated according to the modeling method in Equation (3): for t = 1, Q_{t - 1}^{\mathrm{sim}} through Q_{t - nq}^{\mathrm{sim}} are taken as measured values; for t = 2, 3, …, they are progressively replaced by simulated values; once Q_{t - nq}^{\mathrm{sim}} is entirely based on simulated values, the model’s warm-up is complete, and the model thereafter uses prior simulated flow for continuous simulation. The hyperparameters of the three models were tuned using the Bayesian optimization algorithm combined with five-fold cross-validation. This involved determining the number of iterations, learning rate, and maximum tree depth for the CatBoost model; the number of hidden layer neurons and training iterations for the MBP model; and the number of neural units and training iterations for the LSTM model. The optimal parameters are shown in Table 2. With the optimal parameters, the three models were considered calibrated, and all the training and testing period data were used to drive them for continuous simulation. The simulated outflow discharge sequences were obtained, and evaluation metrics were calculated.
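As a simplified illustration of sample generation and the chronological split, the sketch below assembles lagged input vectors and cuts them at 80%. For brevity it fills the antecedent-flow inputs from the observed series, whereas the procedure described above progressively substitutes simulated values; the function names, lag orders, and data are hypothetical.

```python
def build_samples(rain, flow, n_p, n_q):
    """Assemble (features, target) pairs with rainfall lags 0..n_p and flow lags 1..n_q."""
    start = max(n_p, n_q)
    features, targets = [], []
    for t in range(start, len(rain)):
        row = [rain[t - k] for k in range(n_p + 1)]
        # Simplification: antecedent flow is taken from observations here.
        row += [flow[t - k] for k in range(1, n_q + 1)]
        features.append(row)
        targets.append(flow[t])
    return features, targets


def chronological_split(features, targets, train_fraction=0.8):
    """Keep time order: the earliest samples train, the latest samples test."""
    cut = int(len(features) * train_fraction)
    return (features[:cut], targets[:cut]), (features[cut:], targets[cut:])


# Illustrative normalized series; not data from the study.
rain = [0.0, 0.2, 0.5, 0.1, 0.0, 0.0, 0.3, 0.8, 0.4, 0.0, 0.0, 0.1]
flow = [0.10, 0.12, 0.30, 0.28, 0.20, 0.15, 0.22, 0.60, 0.55, 0.35, 0.25, 0.21]

X, y = build_samples(rain, flow, n_p=2, n_q=1)
(train_X, train_y), (test_X, test_y) = chronological_split(X, y)
```

Splitting by position rather than at random keeps every test sample later in time than every training sample, matching the chronological design of the experiment.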

3. Results and Discussion

3.1. Analysis of Simulation Accuracy

3.1.1. Simulation Results for the Entire RR Dataset

Table 3 presents the accuracy statistics of daily runoff simulation for the three models during the training and testing periods. In the training period, the R² values for the CatBoost, LSTM, and MBP models were all above 0.9, specifically 0.9690, 0.9053, and 0.9428, respectively. The peak flow error and runoff depth error were both within 20%, with peak flow errors of 8.14%, 13.28%, and 10.34% and runoff depth errors of 4.19%, 4.72%, and 7.19%, respectively. According to these evaluation metrics, all three models demonstrated satisfactory daily runoff simulation results during the training period. Among them, the CatBoost model exhibited a higher R² than the LSTM and MBP models, with peak flow errors and runoff depth errors lower than those of the LSTM and MBP models. This result suggests that during the training period, the CatBoost model provided a daily runoff simulation that closely approximated the observed flow process.

However, during the testing period, there is a significant deterioration in the daily runoff simulation performance of the CatBoost, LSTM, and MBP models. As shown in Table 3, the R² values decrease to around 0.6 in the testing period, with peak flow errors increasing to over 20% and exhibiting considerable variations. Meanwhile, runoff depth errors remain below 20%, showing relatively small variations. By comparing the R² values, peak flow relative errors, and runoff depth relative errors during the training and testing periods from 1961 to 2000, as illustrated in Figure 5, it is evident that all three models experienced a noticeable reduction in R² values during the testing period. The decrease is particularly prominent for the CatBoost and MBP models, exceeding that of the LSTM model. Peak flow errors significantly increase, surpassing the 20% reasonable range, while runoff depth errors exhibit some increase but remain below 20%. This indicates that the decrease in R² values during the testing period is primarily attributed to the deviation in peak flow simulation.

In ML, the quality of the underlying data plays a crucial role in determining model accuracy; a precise and detailed modeling approach can effectively enhance accuracy provided that the training data are good. The time series in this study spans four decades, from 1961 to 2000, long enough for the catchment itself to change. A literature review revealed that significant changes occurred in the underlying surface of the basin during this period. In the 1960s and 1970s, soil conservation and small reservoir construction were initiated, leading to sedimentation after the 1970s. In the 1990s, the construction of the Luanchuan ecological park upstream strengthened tree protection.

The simulation results based on the entire rainfall–runoff dataset indicate that the training period accuracy is high for all three models, but the testing period accuracy decreases. Among them, the CatBoost model and MBP neural network model exhibit higher accuracy than the LSTM model. The cause of this issue may lie in the disruption of data consistency, where the quality of rainfall–runoff data significantly impacts the simulation accuracy of the models. It is necessary to conduct reliability and consistency analysis on observed rainfall–runoff data sequences and explore the correlation of rainfall–runoff sequences in the watershed.

3.1.2. Analysis of Variations of the Watershed Hydro-Meteorological Properties

(1): Trend analysis and change point detection

Figure 6 shows the trends in observed rainfall, runoff, and water surface evaporation (measured with the E601 evaporimeter) from 1961 to 2000. Over this period, the Dongwan watershed exhibited decreasing trends in runoff (−2.56 mm/year), rainfall (−3.53 mm/year), and evaporation (−19.05 mm/year). The decrease in precipitation directly led to a reduction in runoff. The MK trend statistics Z for rainfall and runoff are −1.2097 and −1.4032, respectively; in absolute terms, the runoff value exceeds the 90% critical value of 1.28, whereas the rainfall value falls just short of it. The evaporation statistic Z is −6.4921, exceeding the 99% significance level. These results indicate a significant trend change in runoff and evaporation, while the trend change in precipitation is relatively insignificant.

The results of the MK abrupt change test for the runoff series are shown in Figure 7. The values of UF or UB are less than 0, indicating a decreasing trend in runoff. The UF and UB curves intersect at multiple points, all of which lie within the critical lines. Because the MK test alone yielded several candidate change points, the relevant literature on the Dongwan watershed was consulted [43,44]. On this basis, the abrupt change in the runoff series was determined to have occurred in 1989, although it did not exceed the 0.05 significance level, suggesting that the change was not statistically significant. The intersection point corresponds to the time when the abrupt change began. The year of the runoff change coincides with the construction of the Luanchuan ecological park in the watershed. The park enhanced tree protection, increased water storage in the basin, and led to a reduction in runoff. This indicates that human activities started influencing runoff from this point onwards. Based on the abrupt change point, we divided the runoff series into a baseline period (1961–1989) and a changing period (1990–2000).

(2): Double mass curve analysis of the rainfall–runoff data

Double mass curve analysis was conducted on the rainfall–runoff series data in the Dongwan watershed. In the absence of human influence, the double mass curve of cumulative precipitation against cumulative runoff should be a straight line, so a change in its slope reflects the impact of human activities on watershed runoff. The analysis results of the rainfall–runoff double mass curve in the Dongwan watershed are shown in Figure 8a. Before 1989, the fit between cumulative rainfall and runoff was good, with a linear fit determination coefficient R² of 0.9946 (changes in the R² value merely reflect the degree of linear correlation in the rainfall–runoff relationship). This indicates that the watershed runoff process was in a natural state, relatively unaffected by human activities. However, the relationship between cumulative rainfall and runoff changed significantly after 1989, indicating an intensified impact of human activities on runoff after 1989.
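The double mass curve procedure described above can be sketched as follows: accumulate annual rainfall and runoff, fit a straight line to each segment, and compare the slopes before and after the candidate change point. The function names and the use of an ordinary least-squares fit are illustrative assumptions, not the authors' code.

```python
def cumulative(values):
    """Running totals of a sequence."""
    total, out = 0.0, []
    for v in values:
        total += v
        out.append(total)
    return out


def linear_fit(x, y):
    """Ordinary least-squares line; returns (slope, intercept, R^2)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


def double_mass_slopes(rain, runoff, split):
    """Fit cumulative runoff against cumulative rainfall before and after `split`."""
    cum_rain, cum_runoff = cumulative(rain), cumulative(runoff)
    before = linear_fit(cum_rain[:split], cum_runoff[:split])
    after = linear_fit(cum_rain[split:], cum_runoff[split:])
    return before, after
```

A clear difference between the two fitted slopes suggests that the rainfall–runoff relationship shifted at the split index.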

Additionally, the rainfall–runoff relationship curves for the periods 1961–1989 and 1990–2000 were separately plotted, as shown in Figure 8b. It was found that the fitting lines for the two groups of data points deviated to some extent. The linear fit determination coefficient R² was 0.5844 for the period 1961–1989 and 0.8929 for the period 1990–2000, indicating a change in the correlation between rainfall and runoff before and after the abrupt change point.

The analysis of the hydro-meteorological variables in the Dongwan watershed indicates a significant decreasing trend in runoff from 1961 to 2000. Meanwhile, human activities, including changes in vegetation conditions and land use patterns and the construction of water and soil conservation projects in the watershed, have affected the hydrological cycle. This led to a significant change in the runoff data in 1989, disrupting the consistency and reliability of the data. It is therefore unreasonable to build machine learning models on the complete time series while dividing the modeling and validation periods at the change point, and results obtained under such conditions may have significant biases. Instead, based on the change point in 1989, the complete dataset should be divided into a baseline period (1961–1989) and a change period (1990–2000), with machine learning models established for each period to conduct phased rainfall–runoff modeling and simulation.

3.2. Staged Rainfall–Runoff Simulation

3.2.1. Baseline Period

Based on the daily rainfall–runoff data measured during the baseline period from 1961 to 1989, the data are divided into a training set and a testing set at a ratio of 80% to 20%. This means that the 22 years of measured data from 1961 to 1983 (excluding 1969) serve as the training set, and the data from 1984 to 1989, spanning 6 years, form the testing set. In the modeling process, rainfall and prior runoff are selected as model inputs. The parameters of the three models are optimized using the Bayesian optimization algorithm combined with five-fold cross-validation. For the CatBoost model, the number of iterations is 195, with a learning rate of 0.01 and a maximum tree depth of 8; for the MBP neural network model, the number of neurons in the hidden layers is 126 and 176, respectively, with the number of epochs being 106; and for the LSTM model, the number of neurons is 68, with the number of epochs being 136.
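To make the data preparation concrete, the sketch below pairs each day's inputs (that day's rainfall plus prior runoff) with that day's runoff and then splits the samples chronologically by year. The single lag of one day, the function names, and the tuple layout are assumptions made for illustration; the study's actual input window is not stated in this section.

```python
def build_samples(years, rain, runoff, n_lags=1):
    """Build (year, features, target) samples from daily series.

    Features are today's rainfall followed by the previous `n_lags`
    runoff values (assumption: one lag by default).
    """
    samples = []
    for t in range(n_lags, len(runoff)):
        features = [rain[t]] + [runoff[t - k] for k in range(1, n_lags + 1)]
        samples.append((years[t], features, runoff[t]))
    return samples


def split_by_year(samples, last_train_year):
    """Chronological split: years up to `last_train_year` go to training."""
    train = [(f, y) for yr, f, y in samples if yr <= last_train_year]
    test = [(f, y) for yr, f, y in samples if yr > last_train_year]
    return train, test
```

For the baseline period, calling `split_by_year(samples, 1983)` would reproduce the 1961–1983 training and 1984–1989 testing division described above, provided the records for 1969 have already been excluded.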

Table 4 presents the daily runoff simulation results of the CatBoost, MBP, and LSTM models during the training and testing periods. Figure 9, Figure 10 and Figure 11 show the scatter plots of observed versus predicted runoff for the three models. The evaluation metrics and scatter plots indicate that all three models perform well in simulating daily runoff during both periods. The R² values are all above 0.80, and the peak flow and runoff depth errors are within 20%, except for the LSTM peak flow error in the training period (21.73%). The scatter points are evenly distributed around the 45-degree reference line.
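The three indicators reported in the tables can be computed along the following lines. The formulas are assumptions based on common hydrological practice: R² is taken as the deterministic coefficient (one minus the ratio of residual to total variance of the observations), and the two errors are absolute relative differences in peak flow and in total runoff. The study's own definitions are given in its methods section and may differ in detail.

```python
def deterministic_coefficient(obs, sim):
    """1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - s) ** 2 for o, s in zip(obs, sim))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def peak_flow_error(obs, sim):
    """Absolute relative error of the maximum flow, in percent."""
    return abs(max(sim) - max(obs)) / max(obs) * 100.0


def runoff_depth_error(obs, sim):
    """Absolute relative error of the total runoff, in percent."""
    return abs(sum(sim) - sum(obs)) / sum(obs) * 100.0
```

With observed flows `[1, 2, 3, 4, 10]` and simulated flows `[1, 2, 3, 4, 8]`, these functions return 0.92, 20% and 10%, respectively, which shows how a single underestimated peak affects all three indicators.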

During the model testing period, the CatBoost model exhibits an R² of 0.8921, a peak flow error of 9.14%, and a runoff depth error of 2.83%. Its peak flow and runoff depth errors are lower than those of the MBP and LSTM models, although the MBP model attains a higher testing R² (0.9334). The scatter plots indicate that the points from the CatBoost model are more evenly distributed than those of the MBP and LSTM models and lie close to the 45-degree reference line. Overall, the CatBoost model therefore provides a closer fit to the observed runoff process during the baseline period.

3.2.2. Changing Period

Based on the observed daily rainfall–runoff data from the change period of 1990–2000, the data were divided into a training set and a testing set at a ratio of 80% to 20%, with data from 1990 to 1998 (a total of 9 years) serving as the training set and data from 1999 to 2000 (a total of 2 years) serving as the testing set. For modeling, rainfall and prior runoff were selected as model inputs. The parameters of the three models were optimized using the Bayesian optimization algorithm combined with five-fold cross-validation. For the CatBoost model, the number of iterations is 177, the learning rate is 0.05, and the maximum tree depth is 7. For the MBP neural network model, the numbers of neurons in the two hidden layers are 35 and 96, respectively, with 115 epochs; the LSTM model has 108 neurons, with 93 epochs.

Table 5 presents the runoff simulation results of the CatBoost, MBP, and LSTM models during the training and testing periods. In the testing period, the R² values for all three models are greater than 0.87, and the peak flow and runoff depth errors are all below 20%, indicating good daily runoff simulation performance. The R² and peak flow errors of the CatBoost and MBP models are similar, while the peak flow error of the LSTM model is smaller than those of the CatBoost and MBP models. However, the LSTM model has a lower R² and a larger runoff depth error than the other two models. To further assess the daily runoff simulation performance of the three models, a comparison was made with the observed daily runoff processes in 1999 and 2000, as shown in Figure 12 and Figure 13. The simulated daily runoff processes show that the CatBoost and MBP models exhibit good forecasting accuracy, with smooth processes that closely match the observed hydrographs in high, medium, and low flow conditions. In particular, the CatBoost model aligns well with the observed hydrographs in both the rising and falling stages. The simulated daily runoff processes of the LSTM model are slightly less consistent with the observed runoff, especially in low flow conditions and the falling stage. Considering both the evaluation indicators and the simulated daily runoff processes, the CatBoost model's daily runoff simulation is closest to the observed flow processes during the change period.

The comprehensive analysis of the simulation results of the baseline and change periods validates that when there is a breakpoint in the time series, and the training and testing data are precisely split at the breakpoint, it severely impacts the simulation performance of machine learning models. It is essential to separate the datasets according to the breakpoint and build models for simulation independently, which significantly enhances the predictive accuracy. For the baseline and change period datasets in this study, the CatBoost model demonstrates reliability and superiority in simulating daily runoff in the Dongwan watershed. It is deemed suitable for application in the rainfall–runoff simulation field in the Dongwan watershed.

3.3. Rainfall–Runoff Simulation Based on Cross-Sampling of the Entire Dataset

To further investigate the performance of machine learning models in dealing with non-stationary datasets, the entire rainfall–runoff dataset for the years 1961–2000 was divided into training and testing sets, with even-numbered years assigned to training and odd-numbered years to testing. CatBoost, MBP, and LSTM models were employed for simulation. The simulation evaluation results are presented in Table 6. The linear relationship between observed and simulated daily runoff and the flow duration curves (FDCs) are depicted in Figure 14, Figure 15 and Figure 16. Figure 14a,c, Figure 15a,c and Figure 16a,c show the linear relationship between observed and simulated daily runoff during the calibration and validation periods, while Figure 14b,d, Figure 15b,d and Figure 16b,d present the FDCs during the calibration and validation periods.
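The cross-sampling rule described above, with even-numbered years assigned to training and odd-numbered years to testing, can be sketched as follows. The record layout, with the year as the first element of each tuple, and the function name are assumptions for illustration.

```python
def cross_sample_by_year(records):
    """Split records by year parity.

    Each record is assumed to be a tuple whose first element is the year.
    Even years go to training and odd years to testing.
    """
    train = [r for r in records if r[0] % 2 == 0]
    test = [r for r in records if r[0] % 2 == 1]
    return train, test
```

Because both subsets then contain years from before and after 1989, this split mixes the baseline and change regimes in training and testing alike, unlike the chronological split used for the staged simulations.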

As shown in Table 6, the R² values for the training period of the CatBoost, MBP, and LSTM models are all above 0.94, while R² decreases to between 0.73 and 0.75 during the testing period. The decay rate (training period/testing period) for the MBP and LSTM models is similar and less than that of the CatBoost model. The peak flow errors for the three models increase during the testing period, exceeding the 20% error tolerance range. Among them, the peak flow error of the MBP model is 33.17%, which is smaller than that of the CatBoost and LSTM models. The runoff depth error during the testing period shows little change compared to the training period. In a comprehensive evaluation, the MBP neural network model exhibits better runoff simulation performance, indicating its stronger generalization capability in dealing with non-stationary data.

Figure 14a, Figure 15a and Figure 16a reveal that during the training period, the observed and simulated runoff points of CatBoost, MBP, and LSTM models are concentrated around the 45° line, indicating a linear relationship between observed and simulated daily runoff. The larger the flow, the more concentrated the points, indicating a stronger linear relationship. Figure 14c, Figure 15c and Figure 16c show that during the testing period, the observed and simulated runoff points of CatBoost and LSTM models are concentrated around the 45° line when the flow exceeds 10 m³·s⁻¹·d⁻¹, indicating a better linear relationship for flows greater than 10 m³·s⁻¹·d⁻¹. The observed and simulated runoff points of the MBP model are concentrated around the 45° line, indicating a higher linear relationship compared to the CatBoost and LSTM models.

Through examining the FDC curves in Figure 14b,d, Figure 15b,d and Figure 16b,d, the relative change in simulated values compared to observed values can be assessed based on runoff magnitude. Figure 14b, Figure 15b and Figure 16b show that during the training period, the observed FDC divides runoff data into five intervals: high flow (>38.1 m³·s⁻¹·d⁻¹), wet conditions (>10.1 m³·s⁻¹·d⁻¹), medium flow (>6.4 m³·s⁻¹·d⁻¹), dry conditions (>3.19 m³·s⁻¹·d⁻¹), and low flow (<3.19 m³·s⁻¹·d⁻¹). During the testing period, the observed runoff is divided into five intervals: high flow (>34.7 m³·s⁻¹·d⁻¹), wet conditions (>10.5 m³·s⁻¹·d⁻¹), medium flow (>6.94 m³·s⁻¹·d⁻¹), dry conditions (>3.35 m³·s⁻¹·d⁻¹), and low flow (<3.35 m³·s⁻¹·d⁻¹). CatBoost, MBP neural network, and LSTM models show a strong variation in runoff in the high flow and low flow intervals while showing a steady and progressive change in runoff in the wet, medium, and dry condition intervals, consistent with the observed and simulated trends. During the calibration period, the FDCs of CatBoost, MBP, and LSTM models in predicting daily runoff are relatively stable in approximating the observed FDC in the high flow, wet condition, and medium flow intervals but show instability in the dry condition and low flow intervals. During the testing period, the FDCs of CatBoost, MBP, and LSTM models in predicting daily runoff are relatively stable in approximating the observed FDC in the high flow interval but exhibit obvious instability in other intervals. Based on the training and testing period FDC results, the MBP model demonstrates higher stability across the entire FDC range compared to the LSTM and CatBoost models, indicating better runoff simulation performance.
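A flow duration curve of the kind discussed above can be built by sorting the daily flows in descending order and assigning each an exceedance probability. The sketch below uses the Weibull plotting position and divides the curve into five zones at 10%, 40%, 60% and 90% exceedance. Both choices are assumptions made for illustration, since this section reports the resulting flow thresholds rather than the percentages used to derive them.

```python
# Upper exceedance bound (percent) of each zone; assumed breakpoints.
ZONES = [
    (10.0, "high flow"),
    (40.0, "wet conditions"),
    (60.0, "medium flow"),
    (90.0, "dry conditions"),
    (100.0, "low flow"),
]


def flow_duration_curve(flows):
    """Return (exceedance percent, flow) pairs sorted from highest flow."""
    ordered = sorted(flows, reverse=True)
    n = len(ordered)
    # Weibull plotting position: rank / (n + 1).
    return [(100.0 * rank / (n + 1), q) for rank, q in enumerate(ordered, start=1)]


def zone_for(exceedance_pct):
    """Name of the flow zone containing a given exceedance percentage."""
    for upper, name in ZONES:
        if exceedance_pct <= upper:
            return name
    return ZONES[-1][1]
```

Computing the curve separately for observed and simulated flows and comparing them zone by zone gives the kind of interval-wise assessment described in the text.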

4. Conclusions

To address the problem of adaptive runoff prediction in a changing environment, this study established a non-real-time correction modeling approach and constructed runoff models with CatBoost, a multi-hidden-layer BP neural network (MBP), and LSTM as simulators. The daily runoff process in the Dongwan watershed of the Yiluo River Basin was simulated for 1961 to 2000. The simulation results were compared and analyzed to achieve adaptive daily runoff prediction under changing environmental conditions and to provide new ideas and methods for improving the runoff simulation capabilities of machine learning models in changing environments.

From the perspective of daily runoff simulation results based on the overall dataset, the CatBoost model, LSTM model, and MBP model all performed well in the training period, but their simulation performance sharply declined in the testing period. This indicates issues with the reliability and consistency of the dataset.

To address the breakdown of the stationarity assumption for watershed datasets in a changing environment, this study analyzed the hydro-meteorological changes in the Dongwan watershed and their impact on runoff prediction. The long-term trend of watershed runoff was significant, and an abrupt change occurred in 1989, disrupting the consistency and reliability of the data. Building machine learning models on this dataset with the training and testing periods split at the change point is unreasonable and results in significant bias.

Splitting the dataset at the change point and building separate models significantly improved the simulation and prediction effectiveness. For the baseline and changing periods in this study, the CatBoost model demonstrated reliability and superiority in simulating daily runoff in the Dongwan watershed, making it suitable for rainfall–runoff simulation in the study area.

A comprehensive evaluation of the three machine learning models for rainfall–runoff simulation using cross-partitioned datasets revealed that, for non-stationary datasets containing change points, the MBP model achieved better evaluation indicators. It exhibited a stronger linear relationship between observed and simulated daily runoff, more stable FDC performance, and better runoff simulation results. This indicates the advantage of the MBP model in dealing with complex nonlinear simulation problems.

This study employed three widely used models for daily rainfall–runoff simulation and applied them only to the Dongwan watershed. To address these limitations, future research will use more types of machine learning models for daily rainfall–runoff simulation, investigate the identification of change points in real-time rainfall–runoff simulation scenarios, and validate these methods in a greater variety of watersheds.

Author Contributions

Conceptualization, Y.J. and G.K.; methodology, C.L. and G.K.; software, Y.J. and G.K.; validation, G.K., C.L. and X.F.; formal analysis, G.K. and Y.J.; investigation, F.C., X.F. and K.L.; resources, C.L. and H.Y.; data curation, Y.J., C.L. and K.L.; writing—original draft preparation, Y.J., C.L. and G.K.; writing—review and editing, G.K. and K.L.; visualization, C.L.; supervision, F.C. and H.Y.; project administration, G.K.; funding acquisition, C.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by National Key R&D Program of China (2023YFC3010704, 2023YFC3209202); IWHR Research & Development Support Program (JZ0199A032021); Significant Science and Technology Project of Ministry of Water Resources (SKR-2022056); GHFUND A No. ghfund202302018283. We gratefully acknowledge the support from Key Laboratory of Water Safety for Beijing-Tianjin-Hebei Region of Ministry of Water Resources. We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Tesla K40 and TITAN V GPUs used for this research.

Data Availability Statement

Data are contained within the article.

Acknowledgments

The authors also thank the anonymous reviewers for their helpful comments and suggestions.

Conflicts of Interest

Author Ying Jiao was employed by the company China Water Resources Bei Fang Investigation, Design & Research Co., Ltd. Author Ke Liang was employed by the company Beijing IWHR Corporation. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

1. Ahn, K.-H.; Merwade, V. Quantifying the relative impact of climate and human activities on streamflow. J. Hydrol. 2014, 515, 257–266.
2. Chang, J.; Zhang, H.; Wang, Y.; Zhu, Y. Assessing the impact of climate variability and human activities on streamflow variation. Hydrol. Earth Syst. Sci. 2016, 20, 1547–1560.
3. Stergiadi, M.; Di Marco, N.; Avesani, D.; Righetti, M.; Borga, M. Impact of geology on seasonal hydrological predictability in alpine regions by a sensitivity analysis framework. Water 2020, 12, 2255.
4. Liang, Z.; Huang, Y.; Singh, V.P.; Hu, Y.; Li, B.; Wang, J. Multi-source error correction for flood forecasting based on dynamic system response curve method. J. Hydrol. 2021, 594, 125908.
5. Wang, D.; Hejazi, M. Quantifying the relative contribution of the climate and direct human impacts on mean annual streamflow in the contiguous United States. Water Resour. Res. 2011, 47, 1080.
6. Hu, S.; Qiu, H.; Yang, D.; Cao, M.; Song, J.; Wu, J.; Gao, Y. Evaluation of the applicability of climate forecast system reanalysis weather data for hydrologic simulation: A case study in the Bahe River Basin of the Qinling Mountains, China. J. Geogr. Sci. 2017, 27, 546–564.
7. Zhao, Q.; Zhu, Y.; Shu, K.; Wan, D.; Yu, Y.; Zhou, X.; Liu, H. Joint spatial and temporal modeling for hydrological prediction. IEEE Access 2020, 8, 78492–78503.
8. Chen, S.; Dong, S.; Cao, Z.; Guo, J. A compound approach for monthly runoff forecasting based on multiscale analysis and deep network with sequential structure. Water 2020, 12, 2274.
9. Bian, L.; Qin, X.; Zhang, C.; Guo, P.; Wu, H. Application, interpretability and prediction of machine learning method combined with LSTM and LightGBM-a case study for runoff simulation in an arid area. J. Hydrol. 2023, 625, 130091.
10. Kan, G.; Yao, C.; Li, Q.; Li, Z.; Yu, Z.; Liu, Z.; Ding, L.; He, X.; Liang, K. Improving event-based rainfall-runoff simulation using an ensemble artificial neural network based hybrid data-driven model. Stoch. Environ. Res. Risk Assess. 2015, 29, 1345–1370.
11. Kan, G. Study on Application and Comparative of Data-Driven Model and Semi-Data-Driven Model for Rainfall-Runoff Simulation. Acta Geod. Et Cartogr. Sin. 2017, 46, 265.
12. Liang, K.; Kan, G.; Li, Z. Application of A New Coupled Data-driven Model in Rainfall-Runoff Simulation. J. China Hydrol. 2016, 36, 1–7.
13. Yong, D.; Lei, R. Research on diurnal runoff prediction in the middle reaches of the Yellow River based on BP neural network. People’s Yellow River 2020, 42, 5–8. (In Chinese)
14. Qingfang, H.; Shiyuan, C.; Huibin, Y.; Yintang, W.; Lingjie, L.; Lihui, W. Preliminary study on LSTM model for diurnal runoff prediction at Ankang station in Hanjiang River Basin. Prog. Geogr. 2020, 39, 636–642. (In Chinese)
15. Xiang, Z.; Yan, J.; Demir, I. A rainfall-runoff model with LSTM-based sequence-to-sequence learning. Water Resour. Res. 2020, 56, e2019WR025326.
16. Zhang, J.; Zhu, Y.; Zhang, X.; Ye, M.; Yang, J. Developing a Long Short-Term Memory (LSTM) based model for predicting water table depth in agricultural areas. J. Hydrol. 2018, 561, 918–929.
17. Lekuan, M.; Yu, Q.; Yue, Z.; Xue, L.; Yuqiu, W. Research on daily runoff prediction of small watershed based on improved neural network and support vector machine. J. Water Resour. Hydraul. Eng. 2016, 27, 23–27. (In Chinese)
18. Jingguang, H.; Wei, W.; Luyao, C.; Nan, Y.; Bo, C. Prediction of Daily Runoff Combination Based on Wavelet Support Vector Machine Feature Classification: A Case Study of Yichang Three Gorges Reservoir. China Rural Water Resour. Hydropower 2018, 33–39. (In Chinese)
19. Szczepanek, R. Daily Streamflow Forecasting in Mountainous Catchment Using XGBoost, LightGBM and CatBoost. Hydrology 2022, 9, 226.
20. Cui, Z.; Qing, X.; Chai, H.; Yang, S.; Zhu, Y.; Wang, F. Real-time rainfall-runoff prediction using light gradient boosting machine coupled with singular spectrum analysis. J. Hydrol. 2021, 603, 127124.
21. Gao, S.; Huang, Y.; Zhang, S.; Han, J.; Wang, G.; Zhang, M.; Lin, Q. Short-term runoff prediction with GRU and LSTM networks without requiring time step optimization during sample generation. J. Hydrol. 2020, 589, 125188.
22. Sun, W.; Zhou, S.; Yang, J.; Gao, X.; Ji, J.; Dong, C. Artificial Intelligence Forecasting of Marine Heatwaves in the South China Sea Using a Combined U-Net and ConvLSTM System. Remote Sens. 2023, 15, 4068.
23. Ji, Q.; Han, L.; Jiang, L.; Zhang, Y.; Xie, M.; Liu, Y. Short-term prediction of the significant wave height and average wave period based on the variational mode decomposition–temporal convolutional network–long short-term memory (VMD–TCN–LSTM) algorithm. Ocean Sci. 2023, 19, 1561–1578.
24. Wangliang, S.; Jianzhong, Z.; Lihong, P.; Zhanxing, X.; Li, M.; Siman, H.; Feifei, H. Research on DFA_VMD_LSTM Combined Daily Runoff Prediction Model. Hydropower Energy Sci. 2021, 39, 12–15. (In Chinese)
25. Ahmadianfar, I.; Demir, V.; Heddam, S.; Al-Areeq, A.M.; Abba, S.I.; Tan, M.L.; Halder, B.; Marhoon, H.A.; Yaseen, Z.M.; Kilinc, H.C. Daily Scale Streamflow Forecasting Based-Hybrid Gradient Boosting Machine Learning Model. Water Resour. Manag. 2023.
26. Jie, D.; Zhijia, L.; Yuan, G.; Pengnian, H. Using the HEC Model to Analyze the Impact of Underlying Surface Changes on Floods: A Case Study of the East Bay Watershed of the Yi River. J. Lake Sci. 2011, 23, 463–468. (In Chinese)
27. Wickel, B.A.; Lehner, B.; Sindorf, N. HydroSHEDS: A global comprehensive hydrographic dataset. AGU Fall Meet. Abstr. 2007, 2007, H11H-05.
28. Zhaokai, Y.; Weihong, L.; Ruojia, W.; Xiaohui, L. Simulation and prediction of rainfall runoff based on long short-term memory neural network (LSTM). South-North Water Divers. Water Sci. Technol. 2019. (In Chinese)
29. Guangyuan, K.; Zhiyu, L.; Zhijia, L.; Cheng, Y.; Sai, Z. Coupling Xinanjiang runoff generation model with improved BP flow concentration model. Adv. Water Sci. 2012, 23, 21–28. (In Chinese)
30. Wang, X.; Wang, Y.; Yuan, P.; Wang, L.; Cheng, D. An adaptive daily runoff forecast model using VMD-LSTM-PSO hybrid approach. Hydrol. Sci. J. 2021, 66, 1488–1502.
31. He, X.; Luo, J.; Zuo, G.; Xie, J. Daily runoff forecasting using a hybrid model based on variational mode decomposition and deep neural networks. Water Resour. Manag. 2019, 33, 1571–1590.
32. Hancock, J.T.; Khoshgoftaar, T.M. CatBoost for big data: An interdisciplinary review. J. Big Data 2020, 7, 1–45.
33. Zhou, Y.; Guo, S.; Zhang, F. Research on the Application of Artificial Intelligence in Hydrological Forecasting. J. Water Resour. Res. 2019, 8, 1–12.
34. MATLAB Chinese Forum. MATLAB Neural Network Analysis of 30 Cases; Beijing Aerospace University Press: Beijing, China, 2010.
35. Defeng, Z. MATLAB Neural Network Design; Mechanical Industry Press: Beijing, China, 2009.
36. GB/T 22482-2008; General Administration of Quality Supervision, Inspection and Quarantine of the People’s Republic of China, Standardization Administration of the People’s Republic of China. Standard for Hydrological Information and Hydrological Forecasting. Standards Press of China: Beijing, China, 2008. (In Chinese)
37. Song, C.M. Data construction methodology for convolution neural network based daily runoff prediction and assessment of its applicability. J. Hydrol. 2022, 605, 127324.
38. Dingman, S.L. Physical Hydrology; Prentice Hall: Saddle River, NJ, USA, 2020.
39. Vogel, R.M.; Fennessey, N.M. Flow–duration curves 1: New interpretation and confidence intervals. J. Water Resour. Plan. Manag. 1994, 120, 485–504.
40. Cleland, B.R. TMDL development from the “bottom up”–Part II: Using duration curves to connect the pieces. In TMDLS Conference 2002; Water Environment Federation: Alexandria, VA, USA, 2002; pp. 687–697.
41. Chen, Y.; Takeuchi, K.; Xu, C.; Chen, Y.; Xu, Z. Regional climate change and its effects on river runoff in the Tarim Basin, China. Hydrol. Process. 2006, 20, 2207–2216.
42. Yongxin, N.; Zhongbo, Y.; Xizhi, L.; Li, M.; Qiufen, Z.; Jianwei, W. Attribution analysis of runoff evolution in Yiluo River Basin in the past 50 years. J. Water Resour. Transp. Eng. 2022, 1, 59–66. (In Chinese)
43. Guofu, L.; Shengyan, D. Study on the Impact of Climate and Land Use Changes on Runoff Variations: A Case Study of the Upper Yihe River Area in the Yiluo River Basin. Geogr. Sci. 2012, 32, 635–640.
44. Lvliu, L.; Xiujie, W.; Pengfei, Z. Analysis of the Impact of Climate Change and Human Activities on the Runoff of the Yiluo River Based on the SWAT Model. People’s Pearl River 2020, 41, 1–6+75.

Figure 1. Location map of the Dongwan catchment. The upper left plot shows the location of Henan Province in China, the lower left plot shows the location of the Dongwan Basin in Henan Province, and the right plot is the DEM (digital elevation model) map of the Dongwan Basin.

Figure 2. Thiessen polygon of Dongwan catchment.

Figure 3. Schematic diagram of LSTM model structure (* represents multiplication).

Figure 4. Schematic diagram of BP neural network structure.

Figure 5. Comparison of the training and testing of evaluation indicators in 1961–2000. (a) R²; (b) peak relative error; (c) runoff relative error.

Figure 6. Linear trends in hydro-meteorological variables in the Dongwan watershed. (a) Rainfall and runoff; (b) evaporation.

Figure 7. The MK abrupt change test results for the runoff series.

Figure 8. Cumulative rainfall–runoff curve and rainfall–runoff relationship in Dongwan watershed. (a) Rainfall–runoff double accumulation curve; (b) rainfall–runoff relationship.

Figure 9. Scatter plot of observed and predicted runoff for the CatBoost model. (a) Training; (b) testing.

Figure 10. Scatter plot of observed and predicted runoff for the MBP model. (a) Training; (b) testing.

Figure 11. Scatter plot of observed and predicted runoff for the LSTM model. (a) Training; (b) testing.

Figure 12. Simulation of daily runoff in 1999. (a) Observed and simulated scatter plot of 1999; (b) observed and simulated flood process plot of 1999.

Figure 13. Simulation of daily runoff in 2000. (a) Observed and simulated scatter plot of 2000; (b) observed and simulated flood process plot of 2000.

Figure 14. Comparison of observed and simulation runoff by CatBoost model; (a) scatter plot between observed and simulation runoff in training dataset; (b) flow duration curve of the observed and simulation runoff in training dataset; (c) scatter plot between observed and simulation runoff in testing dataset; (d) flow duration curve of the observed and simulation runoff in testing dataset.

Figure 15. Comparison of observed and simulation runoff by MBP model; (a) scatter plot between observed and simulation runoff in training dataset; (b) flow duration curve of the observed and simulation runoff in training dataset; (c) scatter plot between observed and simulation runoff in testing dataset; (d) flow duration curve of the observed and simulation runoff in testing dataset.

Figure 16. Comparison of observed and simulation runoff by LSTM model; (a) scatter plot between observed and simulation runoff in training dataset; (b) flow duration curve of the observed and simulation runoff in training dataset; (c) scatter plot between observed and simulation runoff in testing dataset; (d) flow duration curve of the observed and simulation runoff in testing dataset.

Table 1. Sub-basin rainfall station.

Number	Station	Area (km²)	Areal Weight
1	Dongwan	434	0.151908
2	Heyu	394	0.137907
3	Tantou	603	0.211061
4	Miaozi	338	0.118306
5	Luanchuan	239	0.083654
6	Baishi	354	0.123906
7	Taowan	295	0.103255
8	Baitu	200	0.070004

Table 2. Optimal parameter configurations for the models.

Model Name	Hyperparameter Name	Range	Optimal Value
CatBoost	Iterations	[50, 300]	184
	Learning Rate	[0, 1]	0.44
	Depth	[2, 15]	8
MBP	Number of Neurons in the Hidden Layer 1	[50, 300]	64
	Number of Neurons in the Hidden Layer 2	[50, 300]	184
	Epochs	[20, 200]	126
LSTM	Number of Neurons	[50, 300]	128
	Epochs	[20, 200]	93

Table 3. Simulation accuracy of CatBoost, MBP, and LSTM models.

Evaluation Criterion	R²			Relative Error of Peak Flow			Relative Error of Runoff Depth
Name of model	CatBoost	LSTM	MBP	CatBoost	LSTM	MBP	CatBoost	LSTM	MBP
Training	0.9690	0.9053	0.9428	8.14%	13.28%	10.34%	4.19%	4.72%	7.19%
Testing	0.6148	0.6504	0.6196	64.63%	75.06%	84.32%	8.20%	7.15%	8.74%

Table 4. The prediction accuracy of CatBoost, MBP, and LSTM models (baseline period).

Evaluation Criterion	R²			Relative Error of Peak Flow			Relative Error of Runoff Depth
Name of model	CatBoost	MBP	LSTM	CatBoost	MBP	LSTM	CatBoost	MBP	LSTM
Training	0.9426	0.8562	0.8087	2.46%	15.33%	21.73%	2.81%	6.17%	5.22%
Testing	0.8921	0.9334	0.8583	9.14%	11.01%	15.67%	2.83%	7.83%	7.64%

Table 5. CatBoost, MBP, and LSTM model prediction accuracy (change period).

Evaluation Criterion	R²			Relative Error of Peak Flow			Relative Error of Runoff Depth
Name of model	CatBoost	MBP	LSTM	CatBoost	MBP	LSTM	CatBoost	MBP	LSTM
Training	0.9253	0.8645	0.8971	8.51%	25.09%	12.56%	2.02%	5.89%	4.57%
Testing	0.9185	0.9121	0.8784	15.59%	12.88%	2.33%	1.13%	4.36%	5.38%

Table 6. Evaluation of rainfall–runoff simulation results with cross-partitioned datasets.

Evaluation Indicator	R²			Peak Relative Error			Runoff Relative Error
Name of model	CatBoost	MBP	LSTM	CatBoost	MBP	LSTM	CatBoost	MBP	LSTM
Training	0.9987	0.9655	0.9479	0.42%	8.27%	13.51%	0.54%	5.92%	16.75%
Testing	0.7293	0.7512	0.7359	38.14%	33.17%	34.36%	5.13%	9.84%	15.25%
Training/ Testing	1.3694	1.2852	1.2618	/			/

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

