Investigation of the EWT–PSO–SVM Model for Runoff Forecasting in the Karst Area

Abstract

1. Introduction

2. Methodology

3. Research Areas and Data

4. Results and Discussion

5. Conclusions

Author Contributions

Funding

Institutional Review Board Statement

Informed Consent Statement

Data Availability Statement

Conflicts of Interest

Nomenclature

Appendix A. Design Values of Each System Parameter

References

2.1. Empirical Wavelet Transform (EWT)

2.2. Support Vector Machine (SVM)

2.3. Particle Swarm Optimization (PSO)

2.4. Developed EWT–PSO–SVM Model

2.5. Model Evaluation and Optimization

2.6. Different Data Structures to Verify Model Stability

4.1. Nonstationary Analysis of Runoff Data

4.2. EWT Decomposition

4.3. Determine the Input Variables

4.4. Parameter Setting

4.5. Performance Analysis

4.5.1. Forecasting Results

4.5.2. Forecasting Results with Different Input Models

4.6. Discussion

4.6.1. The Roles of EWT and PSO on SVM Performance Improvement

4.6.2. Compared with Previous Studies in Runoff Forecasting

4.6.3. Innovation and Further Research

by Chongxun Mo ^1,2,3,4, Zhiwei Yan ^1,2,3,4, Rongyong Ma ^1,2,3,4, Xingbi Lei ^1,2,3,4,*, Yun Deng ^5, Shufeng Lai ^1,2,3,4, Keke Huang ^1,2,3,4 and Xixi Mo ^1,2

^1 Key Laboratory of Disaster Prevention and Structural Safety, Ministry of Education, College of Civil Engineering and Architecture, Guangxi University, Nanning 530004, China

^2 College of Architecture and Civil Engineering, Guangxi University, Nanning 530004, China

^3 Guangxi Provincial Engineering Research Center of Water Security and Intelligent Control for Karst Region, Guangxi University, Nanning 530004, China

^4 Guangxi Key Laboratory of Disaster Prevention and Engineering Safety, Guangxi University, Nanning 530004, China

^5 Nanning Survey and Design Institute of Guangxi Pearl River Commission, Nanning 530004, China

^* Author to whom correspondence should be addressed.

Appl. Sci. 2023, 13(9), 5693; https://doi.org/10.3390/app13095693

Submission received: 24 March 2023 / Revised: 26 April 2023 / Accepted: 1 May 2023 / Published: 5 May 2023

Abstract:

Because runoff series exhibit nonlinear and nonstationary characteristics, capturing their embedded periodicity and regularity with a single model is challenging. To account for these characteristics and enhance forecasting precision, this research proposed a new empirical wavelet transform–particle swarm optimization–support vector machine (EWT–PSO–SVM) hybrid model based on “decomposition-forecasting-reconstruction” for runoff forecasting and investigated its effectiveness in the karst area. First, empirical wavelet transform (EWT) was employed to decompose the original runoff series into multiple subseries. Second, a support vector machine (SVM) optimized by particle swarm optimization (PSO) was applied to forecast each subseries. Finally, the predictions of the subseries were combined to reconstruct the final runoff forecast. The developed forecasting model was assessed using the monthly runoff series of the Chengbi River Karst Basin, and a composite rating index combining five metrics was adopted as the performance evaluation tool. The results show that the EWT–PSO–SVM model outperforms both the PSO–SVM model and the SVM model in terms of the composite rating index, reaching 0.68. Furthermore, to verify performance stability, the developed model was also compared with the PSO–SVM and SVM models under different input data structures. The comparison demonstrated that the hybrid EWT–PSO–SVM model had a robust performance advantage and is an effective model that can be applied to runoff forecasting in karst areas.

Keywords:

empirical wavelet transform; particle swarm optimization; support vector machine; runoff forecasting; Chengbi River Karst Basin

Accurate and reliable runoff forecasting provides data support for reservoir scheduling decisions, increases the benefits of reservoirs while ensuring flood safety, and ultimately helps optimize water resources allocation, flood control, and disaster reduction [1]. However, as climate change and human activities intensify, runoff series exhibit more pronounced nonlinearity and nonstationarity, which has increased the complexity of runoff forecasting [2]. Improving the precision of runoff forecasting is therefore a vital challenge for hydrologists [3]. In general, runoff forecasting models are classified by modeling approach into process-driven and data-driven models [4]. Process-driven models analyze the runoff process on the basis of hydrological concepts to construct a physical model, which is then used for runoff forecasting. Building physical models on hydrologic concepts makes process-driven models highly interpretable, but it also leads to systematic biases and a reliance on high-precision data. The accuracy of the underground hydrological characteristic data and meteorological data in the basin is therefore an important factor affecting process-driven model performance [5]. When the underground hydrological characteristics and long-term historical meteorological data cannot be explicitly obtained, the potential of a process-driven model is difficult to realize in full [6]. Unlike process-driven models, data-driven models are fitted to the data to find patterns among them, so there is no need to build a physical model. As a result, although the interpretability of data-driven models is weak, they are better able to explore the regularity among data [7]. Li et al. [4] compared the runoff forecasting results of the data-driven Long Short Term Memory (LSTM) model and the process-driven Gridded Surface Subsurface Hydrologic Analysis (GSSHA) model and demonstrated that the data-driven model exhibited better performance and robustness in forecasting and calibration. Partal and Sezen [8] used a wavelet-based artificial neural network (WANN) for daily runoff forecasting. Their results showed that the data-driven WANN model outperformed the process-driven GR4J model. Owing to their strong generalization ability, data-driven models have gained widespread use in runoff prediction in recent years. So far, the widely used data-driven models include artificial neural networks, extreme learning machines, genetic programming, and support vector machines [9]. Among these, the SVM was developed by Vapnik [10] and is based on the Vapnik–Chervonenkis dimension theory and the principle of structural risk minimization. The SVM model has the theoretical advantages of global optimization, avoidance of the curse of dimensionality, and good performance on small samples, and thus has been widely used in runoff forecasting [11,12]. However, because of the nonlinear and nonstationary characteristics of runoff series, it is hard for a single model to capture their periodicity and regularity, and the predictive ability of a single model is usually limited [13]. Typically, two strategies are used to improve SVM monthly runoff forecasts. The first is to use intelligent algorithms to optimize the SVM parameters, and the second is to apply data preprocessing techniques that decompose the runoff series into its underlying subseries so that the periodicity and regularity of the runoff are captured better.

The commonly used parameter optimization intelligent algorithms are genetic algorithm (GA) and particle swarm optimization (PSO). Because the PSO algorithm has fewer parameters and is easy to implement, it has found broad application in the field of runoff forecasting. Li et al. [14] applied the PSO to search for optimal parameters of the back propagation (BP) neural network and compared the forecasting results of the BP neural network with that of the PSO–BP. The comparison results indicated that the BP model optimized by the PSO algorithm provides better prediction performance. Yang et al. [15] used PSO and LSTM model coupling to predict glacier runoff. According to the results, the PSO–LSTM model exhibited better forecasting accuracy than the LSTM model. Sudheer et al. [16] developed a PSO–SVM model to forecast the flow, which improved the forecasting accuracy of a single SVM model. Therefore, this study adopted the PSO method to search for optimal parameters of the SVM model.

Commonly applied techniques for data preprocessing are empirical mode decomposition (EMD) [17] and ensemble empirical mode decomposition (EEMD) [18]. EMD is a commonly used nonlinear series decomposition method with strong adaptability, but it is prone to mode mixing and end effects. As a solution to the mode mixing problem of EMD, EEMD was presented by Wu and Huang [19]. EEMD is an improved EMD method that essentially suppresses mode mixing by introducing zero-mean, well-characterized white noise. However, EEMD still has some problems: the auxiliary white noise added during decomposition can only be offset by increasing the number of ensemble averages, and the mode components are uncontrollable, resulting in large errors in the model results. To solve these problems, Gilles [20] proposed the empirical wavelet transform (EWT) in 2013. EWT combines EMD and the wavelet transform, so it has the adaptability of EMD and the theoretical completeness of the wavelet transform, while remaining simple and fast to compute. Because of its relatively reliable performance in separating nonlinear and nonstationary signals, EWT is widely used in mechanical fault diagnosis, medical disease diagnosis, intelligent wind speed forecasting, and financial time series forecasting [21,22]. Chegini et al. [23] used EWT to decompose and denoise bearing vibration signals and identify bearing faults. Their experiments showed that denoising after EWT decomposition can detect early faults. Hu et al. [24] decomposed the wind speed series by EWT and effectively extracted the real information in the series, so that the forecasting model achieved more accurate predictions. He et al. [25] used EWT to decompose financial time data into series more suitable for forecasting, which effectively reduced the influence of noise in financial time series on the forecasting results. Many studies have shown that EWT is well suited to the decomposition of time series data and helps models achieve good forecasting results. Therefore, this study adopted EWT to help the SVM model capture the periodicity and regularity embedded in runoff series.

Global karst areas cover about 12% of the earth’s land surface and provide drinking water for almost 25% of the global population, so the study of karst area is significant in economic development [26]. Nevertheless, the geographical structural characteristics of the karst area have serious heterogeneity, which makes the hydrological process of the karst area more complex than that of the non-karst area, and constructing an accurate runoff forecasting model is a challenging task [27]. Therefore, the accuracy of simulation results of single models in karst areas is usually poor, and hybrid models must be explored more urgently.

In summary, although many previous studies have been done on hybrid models for runoff forecasting, there are still some problems. The motivations and contributions of this study are summarized as follows. Firstly, EWT has been widely used in series decomposition studies in other fields, but still less in decomposition of runoff forecasting. Therefore, applying EWT to runoff forecasting studies in different basins can further demonstrate its generalizability. Secondly, previous studies of hybrid models have focused on non-karst basins, so it is necessary to discuss their feasibility in karst basins. Finally, previous studies on the performance of hybrid models have mainly used sequential structured data inputs, while few have discussed their performance when inputting monthly structured data. Therefore, it is meaningful to discuss the stability of the hybrid models under different data structure inputs. To achieve these objectives, the following studies are done: (1) A hybrid EWT–PSO–SVM model based on “decomposition-forecasting-reconstruction” is constructed. (2) Using the runoff data collected from karst area as the input data of the models, the forecasting results of SVM, GA–SVM, PSO–SVM, EMD–PSO–SVM, and EWT–PSO–SVM models with different performance metrics are compared to validate the superiority and feasibility of the developed model. (3) To further identify the stability of the developed model, the forecasting results under different input data structures are compared. The EWT–PSO–SVM hybrid model and its performance investigation schematic diagram are shown in Figure 1.

The remaining structure of the paper is organized as follows. Section 2 describes the model method construction and the results evaluation system. Section 3 presents an overview of the study area and data sources. Section 4 analyzes the predictive performance of the model and further discusses the findings of this study. Section 5 summarizes this work.

The principle of the empirical wavelet transform is to adaptively segment the Fourier spectrum and construct a set of suitable wavelet filters that decompose the signal into its modes [20]. First, assuming that the given runoff time series f(t) is composed of N single-component signals, the largest component is found and the series is normalized. Second, the Fourier spectrum is segmented into N contiguous intervals, which requires N + 1 boundary lines, where ω_{0} = 0 is the first boundary line and ω_{N} = π is the last. Each segment is expressed as Λ_{n} = [ω_{n-1}, ω_{n}], n = 1, 2, \dots, N. To facilitate the later construction of the filters, a transition segment T_{n} centered at ω_{n} and of width 2τ_{n} is defined, where τ_{n} = γω_{n} [20].

Ref. [21] defines the empirical scaling function and the empirical wavelet function as

\hat{ϕ}_{n}(ω) = \begin{cases} 1, & \text{if } |ω| \leq (1 - γ) ω_{n} \\ \cos[\frac{π}{2} β(\frac{1}{2 τ_{n}} (|ω| - ω_{n} + τ_{n}))], & \text{if } (1 - γ) ω_{n} \leq |ω| \leq (1 + γ) ω_{n} \\ 0, & \text{otherwise} \end{cases}

(1)

\hat{ψ}_{n}(ω) = \begin{cases} 1, & \text{if } (1 + γ) ω_{n} \leq |ω| \leq (1 - γ) ω_{n+1} \\ \cos[\frac{π}{2} β(\frac{1}{2 τ_{n+1}} (|ω| - ω_{n+1} + τ_{n+1}))], & \text{if } (1 - γ) ω_{n+1} \leq |ω| \leq (1 + γ) ω_{n+1} \\ \sin[\frac{π}{2} β(\frac{1}{2 τ_{n}} (|ω| - ω_{n} + τ_{n}))], & \text{if } (1 - γ) ω_{n} \leq |ω| \leq (1 + γ) ω_{n} \\ 0, & \text{otherwise} \end{cases}

(2)

where ω_{n} is the nth segment boundary, located from the maxima of the Fourier spectrum, and γ is between 0 and 1. Referring to Ref. [23],

γ < \min_{n} [\frac{ω_{n+1} - ω_{n}}{ω_{n+1} + ω_{n}}], β(x) = x^{4} (35 - 84 x + 70 x^{2} - 20 x^{3}).
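As a minimal illustrative sketch (not the authors' implementation), the transition polynomial β and the scaling-function filter of Equation (1) can be evaluated directly. The function names `beta` and `scaling_filter` and the sample values ω_n = 1.0 and γ = 0.2 are assumptions chosen for illustration; β is clamped to 0 for x ≤ 0 and to 1 for x ≥ 1, which is the usual convention for this transition function.

```python
import math


def beta(x: float) -> float:
    """Transition polynomial beta(x) = x^4 (35 - 84x + 70x^2 - 20x^3), clamped to [0, 1]."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    return x**4 * (35 - 84 * x + 70 * x**2 - 20 * x**3)


def scaling_filter(omega: float, omega_n: float, gamma: float) -> float:
    """Empirical scaling function of Equation (1) evaluated at frequency omega."""
    w = abs(omega)
    tau_n = gamma * omega_n  # half-width of the transition segment
    if w <= (1 - gamma) * omega_n:
        return 1.0  # pass band
    if w <= (1 + gamma) * omega_n:
        # smooth roll-off across the transition segment
        return math.cos(math.pi / 2 * beta((w - omega_n + tau_n) / (2 * tau_n)))
    return 0.0  # stop band


# Illustrative values only: boundary at 1.0 rad, gamma = 0.2
for w in (0.5, 0.9, 1.0, 1.1, 1.5):
    print(f"omega = {w:.1f}: filter = {scaling_filter(w, omega_n=1.0, gamma=0.2):.4f}")
```

At the boundary itself the argument of β is 0.5, β(0.5) = 0.5, and the filter equals cos(π/4) ≈ 0.707, so the scaling function and the adjacent wavelet share the energy equally there.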

Similar to the wavelet transform method, EWT also requires the establishment of an orthogonal basis. As defined in Ref. [20], the detail coefficients, the approximation coefficients, and the reconstruction of the runoff time series are given by

W_{f}^{ε}(n, t) = \langle f, ψ_{n} \rangle = \int f(τ) \overline{ψ_{n}(τ - t)} dτ = F^{-1}(\hat{f}(ω) \overline{\hat{ψ}_{n}(ω)}),

(3)

W_{f}^{ε}(0, t) = \langle f, ϕ_{1} \rangle = \int f(τ) \overline{ϕ_{1}(τ - t)} dτ = F^{-1}(\hat{f}(ω) \overline{\hat{ϕ}_{1}(ω)}),

(4)

and

f(t) = W_{f}^{ε}(0, t) * ϕ_{1}(t) + \sum_{n=1}^{N} W_{f}^{ε}(n, t) * ψ_{n}(t).

(5)

The empirical mode functions f_{k} are obtained from Equation (5) as follows:

f_{0} = W_{f}^{ε}(0, t) * ϕ_{1}(t)

(6)

f_{n} = W_{f}^{ε}(n, t) * ψ_{n}(t).

(7)

In the wavelet analysis of hydrological series, the reasonable selection of wavelet decomposition layers has a great influence on the results. Therefore, this paper uses the adaptive method of wavelet filter decomposition layers proposed by Du et al. [28] to determine the wavelet decomposition layers of hydrological series. After testing and comparison, the performance of the model is optimal when the hydrological series is decomposed into four layers.

Support vector machines are supervised models that use statistical learning theory to improve the generalization ability of machine learning algorithms by pursuing minimum structural risk. The SVM model applies a nonlinear mapping that brings feature vectors of different dimensions to the same dimension, so that the feature vectors of the runoff time series are mapped from a low-dimensional to a high-dimensional space, where the series can be separated by a maximum-margin hyperplane [29]. Following Ref. [12], the support vector machine problem can be rewritten as

\begin{cases} \min \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} y_{i} y_{j} a_{i} a_{j} \exp(-g ‖x_{i} - x_{j}‖^{2}) - \sum_{i=1}^{n} a_{i} \\ \text{s.t. } \sum_{i=1}^{n} y_{i} a_{i} = 0, 0 \leq a_{i} \leq c \end{cases}.

(8)

The performance of the SVM is mainly determined by the penalty factor c and the kernel function parameter g. The parameter c represents the tolerance for sample misclassification: the higher c is, the less misclassification is tolerated and the more prone the model is to overfitting. The parameter g defines the range of influence of a single sample, and the larger g is, the more easily the model overfits. Therefore, selecting an appropriate optimization algorithm to determine c and g is key to improving the forecasting performance of SVM.
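The role of g can be illustrated with the Gaussian kernel term exp(−g‖x_i − x_j‖²) that appears in Equation (8). The sketch below is an assumption-laden illustration rather than part of the study: the function name `rbf_kernel`, the two sample points, and the grid of g values are all hypothetical.

```python
import math


def rbf_kernel(x_i, x_j, g):
    """Gaussian (RBF) kernel exp(-g * ||x_i - x_j||^2) from Equation (8)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return math.exp(-g * sq_dist)


# Two hypothetical input vectors with squared distance 2
x_a, x_b = [0.0, 0.0], [1.0, 1.0]

# As g grows, the similarity between distinct samples shrinks toward zero,
# so each training sample influences a narrower neighbourhood.
for g in (0.01, 0.1, 1.0, 10.0):
    print(f"g = {g:>5}: K(x_a, x_b) = {rbf_kernel(x_a, x_b, g):.6f}")
```

A large g therefore lets the fitted function bend around individual samples, which is consistent with the overfitting tendency described above.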

Particle swarm optimization is a simplified algorithm proposed by James Kennedy and Russell Eberhart, inspired by the regularity of the foraging behavior of birds [30]. PSO uses massless abstract particles to simulate individuals in a flock of birds and uses the globally optimal solution to simulate the location of food. Each particle has only two attributes, velocity and position, and finds an individual optimal solution within its own range. The current global optimal solution is found through repeated interaction and information sharing among the particles. Using the individual optimal solutions and the current global optimal solution, the attributes of each particle are updated, and the particles approach the global optimal solution through continuous iteration. In the computation of the PSO algorithm, the particles are randomly generated at initialization. In the iterative process, each particle identifies its own historical optimal solution and the swarm's historical optimal solution from the fitness values, and then updates its velocity and position according to the following equations from Ref. [30]:

v_{i} (t + 1) = ω v_{i} (t) + c_{1} R_{1} [R_{i}^{b} (t) - x_{i} (t)] + c_{2} R_{2} [R_{g}^{b} (t) - x_{i} (t)]

(9)

x_{i} (t + 1) = x_{i} (t) + φ v_{i} (t + 1)

(10)

where t is the iteration number; ω is the inertia weight; v_{i}(t) is the velocity of the i-th particle; c_{1} and c_{2} are learning factors, which determine the step size of the particle's flight toward the individual optimal solution and the global optimal solution; R_{1} and R_{2} are random numbers uniformly distributed between 0 and 1; x_{i}(t) is the position of the i-th particle; R_{i}^{b}(t) is the optimal position of the i-th particle; R_{g}^{b}(t) is the optimal position of the group; and φ is a contraction factor.
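The update rules in Equations (9) and (10) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: the function name `pso_minimize`, the toy quadratic objective, the search bounds, the clipping of positions to those bounds, and the zero initial velocities are all choices made for this example. The default swarm size, iteration count, inertia weight, and learning factors mirror the settings reported later in this paper, and φ = 1 is assumed because no value is given.

```python
import random


def pso_minimize(objective, bounds, n_particles=25, n_iter=150,
                 w=0.8, c1=1.6, c2=1.6, phi=1.0, seed=0):
    """Minimize `objective` over box `bounds` with the updates of Eqs. (9)-(10)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                    # individual best positions
    pbest_val = [objective(p) for p in pos]
    g_idx = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g_idx][:], pbest_val[g_idx]   # swarm best

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Eq. (9): inertia + pull toward individual best + pull toward swarm best
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Eq. (10), followed by clipping to the search box (an added assumption)
                lo, hi = bounds[d]
                pos[i][d] = min(max(pos[i][d] + phi * vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_objective(p):
    """Hypothetical stand-in for a validation error as a function of (c, g)."""
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2


best, best_val = pso_minimize(toy_objective, bounds=[(-5.0, 5.0), (-5.0, 5.0)])
print("best position:", best, "objective:", best_val)
```

In the hybrid model the objective would be the forecasting error of an SVM trained with a candidate (c, g) pair; the quadratic above only keeps the sketch self-contained.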

The EWT method inherits the advantages of EMD and wavelet analysis. The PSO algorithm is an intelligent optimization algorithm, one of the evolutionary algorithms, and is used here to optimize the SVM parameters. SVM has a simple structure, good robustness, and good classification ability. Combining the advantages of these three methods for predicting runoff series, a hybrid EWT–PSO–SVM forecasting model was constructed. First, the EWT is applied to decompose the original monthly runoff series into IMF components (IMF₁, IMF₂, …, IMF_n) with a certain regularity. Second, the PSO method is used to optimize the SVM parameters c and g. Third, using the PSO–SVM model as the forecasting tool, each extracted IMF component is fed into the PSO–SVM model to obtain the corresponding forecast for that component. Finally, the forecasts of all IMF components are combined to obtain the final runoff forecast. The developed method mainly consists of a decomposition stage and a reconstruction stage. In the decomposition stage, the subseries obtained by EWT decomposition are better suited than the original series for models to find periodicity and regularity. In the reconstruction stage, the information in the forecasts of all subseries is combined to obtain the final results. Through these two stages, the developed model is more likely to have better generalization ability.
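The “decomposition-forecasting-reconstruction” workflow can be outlined as follows. The sketch deliberately replaces both building blocks with simple stand-ins so that it stays self-contained: a trailing moving average plus residual takes the place of EWT, and a seasonal-naive rule takes the place of PSO–SVM. All function names, the 12-month window, and the synthetic series are assumptions for illustration only; the point is the additive structure, not the forecasting skill.

```python
def moving_average(series, window):
    """Trailing moving average (shorter windows at the start of the series)."""
    out = []
    for t in range(len(series)):
        chunk = series[max(0, t - window + 1): t + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def decompose(series, window=12):
    """Stand-in for EWT: a smooth part and a residual that sum to the series."""
    smooth = moving_average(series, window)
    residual = [x - s for x, s in zip(series, smooth)]
    return [smooth, residual]


def forecast_next(subseries, season=12):
    """Stand-in for PSO-SVM: repeat the value observed one season earlier."""
    return subseries[-season] if len(subseries) >= season else subseries[-1]


def hybrid_forecast(series):
    """Decompose, forecast each component separately, then sum the forecasts."""
    components = decompose(series)
    return sum(forecast_next(c) for c in components)


# Synthetic monthly series with a seasonal pattern and a slow drift
runoff = [float((k % 12) * 3 + k * 0.1) for k in range(36)]
print("one-step-ahead forecast:", hybrid_forecast(runoff))
```

Because the components sum to the original series, summing the component forecasts yields a forecast on the original scale, which is the property the reconstruction stage relies on.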

For the purpose of visual, reliable, and accurate reflection of the forecasting effect of each model, five indexes of Nash efficiency coefficient (NSE), root mean square error (RMSE), mean absolute percentage error (MAPE), mean absolute error (MAE), and correlation coefficient (R) were selected as the evaluation criteria [17,31]. The detailed formulas of the five indicators are

NSE = 1 - \frac{\sum_{i=1}^{n} (y_{i} - y_{i}^{*})^{2}}{\sum_{i=1}^{n} (y_{i} - \bar{y})^{2}}

(11)

MAPE = \frac{1}{n} \sum_{i=1}^{n} | \frac{y_{i} - y_{i}^{*}}{y_{i}} | \times 100\%

(12)

RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_{i} - y_{i}^{*})^{2}}

(13)

MAE = \frac{1}{n} \sum_{i=1}^{n} | y_{i} - y_{i}^{*} |

(14)

R = \frac{\sum_{i=1}^{n} (y_{i} - \bar{y}) (y_{i}^{*} - \bar{y^{*}})}{\sqrt{\sum_{i=1}^{n} (y_{i} - \bar{y})^{2} \sum_{i=1}^{n} (y_{i}^{*} - \bar{y^{*}})^{2}}}

(15)

where y_{i} refers to the measured value; y_{i}^{*} refers to the simulated value; \bar{y} refers to the average of the measured values; and \bar{y^{*}} refers to the average of the simulated values.
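Equations (11)–(15) translate directly into code. The following is a minimal sketch; the function name `evaluate` and the four-value sample series are hypothetical and are not data from this study.

```python
import math


def evaluate(obs, sim):
    """Return NSE, MAPE (%), RMSE, MAE and R for measured `obs` and simulated `sim`."""
    n = len(obs)
    mean_obs = sum(obs) / n
    mean_sim = sum(sim) / n
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))          # squared errors
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_sim = sum((s - mean_sim) ** 2 for s in sim)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    return {
        "NSE": 1 - sse / var_obs,                                # Eq. (11)
        "MAPE": 100 / n * sum(abs((o - s) / o) for o, s in zip(obs, sim)),  # Eq. (12)
        "RMSE": math.sqrt(sse / n),                              # Eq. (13)
        "MAE": sum(abs(o - s) for o, s in zip(obs, sim)) / n,    # Eq. (14)
        "R": cov / math.sqrt(var_obs * var_sim),                 # Eq. (15)
    }


# Hypothetical values: every forecast overshoots the measurement by 1
metrics = evaluate(obs=[2.0, 4.0, 6.0, 8.0], sim=[3.0, 5.0, 7.0, 9.0])
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

In this toy case R equals 1 even though NSE is only 0.8, which illustrates why a constant bias is invisible to the correlation coefficient and why several metrics are used together.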

Because each evaluation index analyzes the model performance from a different perspective, a model's forecasting results do not necessarily show optimal performance on all of them. Therefore, this paper adopted the composite rating index method (M_R), which calculates a composite rating for each model by ranking it on every index. M_R can effectively judge the consistency of the evaluation indexes [32]:

M_{R} = 1 - \frac{1}{m n} \sum_{i = 1}^{n} r_{i}

(16)

where m is the number of models, n is the number of evaluation indexes, and r_i is the rank of the model under the i-th evaluation index (i = 1, 2, …, n). When r_i = 1, the model has the best simulation effect under that evaluation index. The value of M_R is between 0 and 1. A larger M_R value indicates that the model has better comprehensive simulation ability.
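A short sketch shows how Equation (16) turns per-index ranks into a single score. The helper names `rank` and `composite_rating` and the NSE values below are hypothetical; ties are not handled in this simplified version.

```python
def rank(values, higher_is_better):
    """Rank models on one index: 1 is best. Ties are broken by position (a simplification)."""
    order = sorted(range(len(values)), key=lambda k: values[k], reverse=higher_is_better)
    ranks = [0] * len(values)
    for position, k in enumerate(order, start=1):
        ranks[k] = position
    return ranks


def composite_rating(ranks_of_model, n_models):
    """Equation (16): M_R = 1 - (sum of the model's ranks) / (m * n)."""
    n_indexes = len(ranks_of_model)
    return 1 - sum(ranks_of_model) / (n_models * n_indexes)


# Hypothetical NSE values for three models (higher is better)
print(rank([0.90, 0.70, 0.80], higher_is_better=True))

# A model ranked first on all five indexes among five models
print(composite_rating([1, 1, 1, 1, 1], n_models=5))
# A model ranked last on all five indexes among five models
print(composite_rating([5, 5, 5, 5, 5], n_models=5))
```

Under this formula a model that ranks first on every index among m models scores 1 − 1/m, and a model that ranks last on every index scores 0.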

Hydrological series are classified into two categories based on the ordering principle: sequential structure and monthly structure [33]. The sequential structure is based on chronological order and is used in the study of many articles to discuss the model performance. The monthly structure is divided based on months, and the runoff series is reordered based on January through December. In this paper, the monthly series are substituted into the model for forecasting in the hope of further validating the stability of the developed model. The schematic diagram of these two data structures is illustrated in Figure 2.
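One possible reading of the two structures is sketched below: the sequential structure is the series in chronological order, and the monthly structure groups the same values by calendar month. This interpretation, the function name `to_monthly_structure`, and the synthetic two-year series are assumptions for illustration; the exact arrangement used in this study is the one shown in Figure 2.

```python
def to_monthly_structure(series, start_month=1):
    """Group a chronological monthly series by calendar month (1 = January)."""
    groups = {month: [] for month in range(1, 13)}
    for k, value in enumerate(series):
        month = (start_month - 1 + k) % 12 + 1   # calendar month of the k-th value
        groups[month].append(value)
    return groups


# Sequential structure: two hypothetical years in chronological order
sequential = list(range(24))

# Monthly structure: all Januaries together, all Februaries together, and so on
monthly = to_monthly_structure(sequential)
print("January values:", monthly[1])
print("December values:", monthly[12])
```

With the monthly structure, each subseries contains only values from the same season, so a model trained on it sees interannual variation rather than the within-year cycle.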

The Chengbi River Karst Basin is located in the northeast of Baise City, Guangxi, and belongs to the Xijiang River system. It covers an area of 2087 km², of which about 1123 km² is karst landform, making it a typical karst basin. Upstream of the middle reaches, the basin consists mostly of soil mountains and karst landforms, while downstream of the middle reaches it consists of alluvial deposits along the river, with relatively flat, hilly terrain [34]. Because of the complexity of the karst basin, its underground hydrological characteristics are difficult to obtain. Meanwhile, the sparse meteorological stations in the Chengbi River basin before 2001 could not provide accurate meteorological input data for a process-driven model. This difficulty in obtaining underground hydrological characteristics and the lack of long-term meteorological data are common in karst basins. Therefore, it is necessary to explore data-driven models based on runoff series. The catchment area above the Chengbi River Reservoir dam site covers 2000 km², which is 95.8% of the total catchment area of the basin. The annual average precipitation is 1560 mm, about 87% of which falls during the flood season. The average annual flow is about 37.8 m³/s, and the basin has a subtropical monsoon climate with long summers and short winters. The basin is hot and rainy in spring and summer because of the influence of the ocean monsoon. In addition, because the basin has many tributaries, floods of long duration often form, and the flood season is long, lasting from April to October. The Chengbi River Reservoir provides residential water supply, flood control, power generation, agricultural irrigation, and other social benefits, which play very important roles in the development of Baise City. Therefore, it is essential to establish an appropriate, high-precision runoff forecasting model that can support the optimal scheduling of the Chengbi River Reservoir and make full use of water resources in the flood season.

A total of 12 telemetric rainfall stations are set up in the Chengbi River Karst Basin. This paper focuses on Bashou station and uses a dataset of 492 monthly runoff records from 1979 to 2019 for runoff forecasting. The runoff data were provided by the Chengbi River Reservoir Authority, and runoff depth data were obtained by dividing the runoff at the designated section by the catchment area of the reservoir. The approximate location of the Chengbi River Karst Basin is depicted in Figure 3.

The hydrological series is complex and changeable, which is mainly affected by two factors: climate change and human activities. As a result, it exhibits the characteristics of nonlinearity and nonstationarity. For obtaining key and useful regularities from the series accurately and quickly, abrupt change analysis, trend analysis, and period analysis are commonly used to analyze runoff series characterization.

The Mann–Kendall mutation test was adopted as the abrupt change analysis method. It is a nonparametric statistical test whose clear advantage is that the data need not follow a particular statistical distribution, which makes it well suited to ordinal and categorical variables [35]. The monthly runoff depth mutation at Bashou hydrological station is shown in Figure 4A. The Mann–Kendall test statistic UF curve of monthly runoff at Bashou station is relatively flat, with no large fluctuations overall. Only in 1993–1994 did the monthly runoff depth trend appear significant. Within the significance level lines of α = 0.05, the UF and UB curves showed mutation points during the period between 2015 and 2017.
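The UF and UB curves of the sequential Mann–Kendall test can be computed as sketched below. This follows the commonly used formulation of the test (rank-count statistic with mean k(k−1)/4 and variance k(k−1)(2k+5)/72) and is an assumption about the procedure, since the paper does not list the formulas; the function names and the short sample series are hypothetical.

```python
import math


def mann_kendall_uf(series):
    """Forward sequential Mann-Kendall statistic UF_k for k = 1..n (UF_1 = 0)."""
    n = len(series)
    uf = [0.0] * n
    s = 0
    for k in range(1, n):                      # k: 0-based index of the newest point
        # count earlier values that the newest value exceeds
        s += sum(1 for j in range(k) if series[k] > series[j])
        m = k + 1                              # number of points included so far
        mean = m * (m - 1) / 4
        var = m * (m - 1) * (2 * m + 5) / 72
        uf[k] = (s - mean) / math.sqrt(var)
    return uf


def mann_kendall_ub(series):
    """Backward statistic: UF of the reversed series, sign-flipped and re-reversed."""
    return [-u for u in mann_kendall_uf(series[::-1])][::-1]


# Hypothetical short series; |UF| > 1.96 would indicate a significant trend at alpha = 0.05
sample = [3.0, 1.0, 4.0, 1.5, 5.0, 9.0, 2.0, 6.0]
print("UF:", [round(u, 3) for u in mann_kendall_uf(sample)])
print("UB:", [round(u, 3) for u in mann_kendall_ub(sample)])
```

An intersection of the UF and UB curves inside the ±1.96 bounds is the usual indication of a candidate mutation point.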

The linear trend method was employed as the trend analysis method. It is suited to series whose rise or fall is approximately linear [36]. The monthly runoff depth trend diagram of the Bashou hydrological station is depicted in Figure 4B. The trend line fitted to the monthly runoff series shows an increasing trend, with monthly runoff increasing at a rate of 0.018 mm/month.
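A linear trend of this kind is typically obtained by ordinary least squares against the time index. The sketch below assumes that approach; the function name `linear_trend` and the synthetic series (a straight line with the reported slope and an arbitrary intercept of 50 mm) are illustrative and are not the Bashou data.

```python
def linear_trend(series):
    """Least-squares slope and intercept of `series` against the index 0, 1, 2, ..."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return slope, intercept


# Synthetic 492-month series rising by 0.018 mm per month
synthetic = [50.0 + 0.018 * t for t in range(492)]
slope, intercept = linear_trend(synthetic)
print(f"slope = {slope:.4f} mm/month, intercept = {intercept:.2f} mm")
print(f"rise over the record = {slope * (len(synthetic) - 1):.2f} mm")
```

For scale, a slope of 0.018 mm/month accumulated over the 491 steps of a 492-month record corresponds to a rise of roughly 8.8 mm along the fitted line.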

The Morlet wavelet analysis was applied as the period analysis method. Wavelet analysis has a powerful function of discriminating multiple scales in analyzing hydrological series, and a good localization function in time and frequency domains [37]. The time–frequency diagram of the monthly runoff depth and the monthly runoff depth variance is shown in Figure 4C,D. The analysis results indicate that the four main cycles of monthly runoff are: 18a, 6a, 3a, and 12a in descending order.

The conclusions of the above three analysis methods show that the monthly runoff depth series has nonstationary characteristics. Therefore, it is worth proposing a hybrid model employing the “decomposition-prediction-reconstruction” mode to increase the single model performance.

EWT is a nonstationary time series processing method that combines the concept of EMD adaptive decomposition with wavelet transform theory. The purpose of EWT is to decompose the original data series into different scales, so the IMF components represent the amplitude and frequency changes of the original runoff series, and each IMF component corresponds to a different physical background. Because the superposition of all IMF components reproduces the original data series, the decomposition retains the original information of the series and intuitively shows its features at different characteristic scales. As a result, the periodicity and regularity of the runoff series can be analyzed more accurately, and a model that is more consistent with actual runoff patterns can be established more easily. The subseries obtained by decomposing the original runoff series using EWT are shown in Figure 5. The EWT decomposition results do not contain false modes, and there is no trend component. The absence of false modes can effectively increase the precision of the forecasting results. Figure 5 shows that the subseries decomposed by EWT present the frequency, amplitude, and wavelength patterns more intuitively. Among them, the IMF₁ component has the smallest frequency and amplitude, and its fluctuation is relatively gentle. The fluctuation range of IMF₁ is 30~75, so it is always positive. The fluctuation ranges of IMF₂, IMF₃, and IMF₄ are −50~38, −110~98, and −55~124, respectively. From IMF₁ to IMF₄, the frequency and amplitude of the subseries gradually increase, while the wavelength gradually decreases.

Before runoff forecasting, selecting the appropriate number of input data can effectively improve the accuracy of model construction. When the input data is insufficient in number, the model training is inadequate to capture the changes and characteristics of the data, resulting in a large difference between the model fitting series and the actual series, and the final forecasting results are not good enough. When the input data is too much, the model training is easy to overfit, and the model fitting series is too close to the input time series, which cannot accurately predict and will also affect the accuracy of the forecasting. This article tests the effect of lag time under different scenarios by referencing previous studies, and the simulation results are compared. Finally, 11 kinds of preferred sets are selected for forecasting.
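The construction of lagged inputs can be sketched as follows: each sample uses the previous `n_lags` values of a subseries to predict the next one. The function name `make_lagged_samples`, the lag of 3, and the short series are hypothetical; the lags actually selected in this study are not reproduced here.

```python
def make_lagged_samples(series, n_lags):
    """Build (inputs, target) pairs where each target is predicted from the n_lags values before it."""
    inputs, targets = [], []
    for t in range(n_lags, len(series)):
        inputs.append(series[t - n_lags:t])   # values at t-n_lags, ..., t-1
        targets.append(series[t])             # value at t
    return inputs, targets


# Hypothetical subseries and lag
subseries = [10.0, 12.0, 15.0, 11.0, 9.0, 14.0, 16.0]
X, y = make_lagged_samples(subseries, n_lags=3)
for row, target in zip(X, y):
    print(row, "->", target)
```

Each additional lag adds one input feature and removes one usable sample from the start of the series, which is one reason the number of inputs has to be balanced against the length of the record.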

In this study, the SVM parameters c and g were optimized using PSO. Based on the data characteristics and repeated testing, the population size was set to 25; the positions and velocities of the initial population particles were set within [0, 1]; the individual learning factor c₁ and the social learning factor c₂ were both set to 1.6; the maximum number of iterations was set to 150; and the inertia weight, which generally ranges from 0.3 to 0.9, was set to 0.8. After several experiments, the simulation performance was found to be most favorable when the ratio of the training set to the validation set was 8:2, i.e., the first 396 months formed the model training period and the last 96 months formed the model validation period. Under these conditions, the runoff series provides enough training data for the model to reach low training errors while also achieving good accuracy on the validation set. The design values of each parameter are shown in Appendix A.
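A minimal global-best PSO using the settings listed above might look like the sketch below. It is not the implementation used in this study: the objective `toy_validation_error` is a hypothetical stand-in for the SVM validation error as a function of (c, g), and the search range [0, 1] for both parameters is an assumption made only to keep the example self-contained.

```python
import random

def pso_minimize(objective, dim, bounds, n_particles=25, n_iter=150,
                 w=0.8, c1=1.6, c2=1.6, seed=0):
    """Basic global-best PSO; returns (best_position, best_value)."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Initial positions and velocities drawn from [0, 1], as in the text.
    pos = [[rng.random() for _ in range(dim)] for _ in range(n_particles)]
    vel = [[rng.random() for _ in range(dim)] for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Keep the particle inside the (assumed) search range.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

def toy_validation_error(params):
    """Hypothetical stand-in for the SVM validation error at (c, g)."""
    c, g = params
    return (c - 0.6) ** 2 + (g - 0.3) ** 2

best_params, best_error = pso_minimize(toy_validation_error, dim=2, bounds=(0.0, 1.0))
```

In a real application the objective would train an SVM on the training period with the candidate (c, g) and return its error, so each call is far more expensive than in this toy case.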

This study evaluates the effectiveness of the EWT–PSO–SVM hybrid model in comparison with the SVM, GA–SVM, PSO–SVM, and EMD–PSO–SVM models. The runoff forecasting results of the five models are presented in Figure 6. The SVM model has a relatively poor forecasting effect, with large errors between the measured and forecasted values in large runoff events. The forecasting results of PSO–SVM are slightly improved compared with the GA–SVM and single SVM models. The high consistency between the forecasted and original runoff series trends for the decomposition-based models indicates that series decomposition leads to higher forecasting accuracy. The EWT–PSO–SVM model significantly improves the forecasting results in large runoff events, and its forecasts are also close to the measured values in small runoff events. Plotting the absolute error of each model better shows the forecasting effect under different events.

Considering that the absolute error (AE) in the model prediction results is not intuitively presented in Figure 6, the AE values of the five models obtained by calculating the difference between the measured values and the predicted values are plotted in Figure 7. In Figure 7, the lower the height of the column chart representing AE, the better the model’s predictive performance. Analysis of Figure 7 shows that when the runoff process line has extreme values, the column height of the EWT–PSO–SVM model is significantly lower than those of the other models. In addition, the maximum positive AE value of the EWT–PSO–SVM model is 106.11 mm, which is reduced by 43.12% and 34.27% compared with the SVM and EMD–PSO–SVM models, respectively. The minimum negative AE value is −69.56 mm, which increases by 24.78% and 22.27%, respectively. This indicates that under high-flow conditions, the hybrid model developed in this study can achieve better predictive performance.
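The AE calculation and the relative reduction of the maximum AE can be sketched as follows. All numbers are toy values chosen for illustration, not data from this study, and the function names are hypothetical.

```python
def absolute_errors(measured, predicted):
    """Signed error per time step: measured value minus predicted value."""
    return [m - p for m, p in zip(measured, predicted)]

def percent_reduction(reference, improved):
    """Reduction of `improved` relative to `reference`, in percent."""
    return (reference - improved) / reference * 100.0

measured = [20.0, 150.0, 310.0, 90.0]       # toy runoff depths, mm
model_a = [25.0, 110.0, 200.0, 100.0]       # hypothetical baseline forecast
model_b = [22.0, 130.0, 260.0, 85.0]        # hypothetical improved forecast

ae_a = absolute_errors(measured, model_a)   # [-5.0, 40.0, 110.0, -10.0]
ae_b = absolute_errors(measured, model_b)   # [-2.0, 20.0, 50.0, 5.0]
drop = percent_reduction(max(ae_a), max(ae_b))
```

A positive AE means the model underestimates the measured runoff, so the maximum positive AE typically occurs at flood peaks.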

To show the fitting effect of each model and the main evaluation indexes more intuitively, the Taylor diagram and the scatter fitting plot were also adopted for performance comparison.

The scatter plot is drawn with the measured values on the horizontal axis and the predicted values on the vertical axis, and the correlation between the predicted and measured results is visualized by a linear fit of the scatter points of each model. Figure 8 shows the scatter plot of the forecasts of each model. The blue line y = x represents an ideal fit to the measured values. The closer the linear trend of a model is to y = x, the better its fitting results. Figure 8 shows that the linear fit of the EWT–PSO–SVM model is the closest to y = x, while the linear fit of the SVM model exhibits the largest deviation from y = x, and the linear trends of the GA–SVM, PSO–SVM, and SVM models are similar. Among the five models, the linear correlation coefficient of the EWT–PSO–SVM model (0.79) is the highest, followed in descending order by EMD–PSO–SVM (0.74), PSO–SVM (0.44), GA–SVM (0.43), and SVM (0.42). This demonstrates that the forecasting results of the EWT–PSO–SVM model are the best, while the forecasting results of the PSO–SVM model show only a slight improvement over those of the SVM model.
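The comparison of a fitted line with y = x can be reproduced with an ordinary least-squares fit. The sketch below uses toy values, not the data behind Figure 8, and the function name is hypothetical.

```python
import math

def linear_fit(x, y):
    """Least-squares slope and intercept of y on x, plus Pearson's r."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r

measured = [10.0, 40.0, 80.0, 160.0, 240.0]     # toy values, mm
predicted = [18.0, 42.0, 70.0, 130.0, 185.0]    # hypothetical forecasts
slope, intercept, r = linear_fit(measured, predicted)
```

A slope below 1 combined with a positive intercept, as in this toy case, means that high flows are underestimated and low flows are overestimated relative to y = x, even when the correlation is high.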

The Taylor diagram is drawn based on three indicators: correlation coefficient, standard deviation, and root mean square error. It can quantify the correlation between model forecasting results and measured results [38]. Figure 9 is a Taylor diagram of the five models. In the Taylor diagram, the five coordinate points represent the models, the radial lines represent the correlation coefficients, the horizontal and vertical axes represent the standard deviations, and the dashed lines represent the root mean square errors. The closer a coordinate point in the Taylor diagram is to the observation point, the better the forecasting performance of the model. Figure 9 indicates that the point representing the EWT–PSO–SVM model is closest to the observation point, which indicates that the EWT–PSO–SVM model has the highest forecasting capability among the five models.
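A single point can encode all three statistics because they are linked by the law of cosines: the squared centered root mean square difference equals σ_obs² + σ_sim² − 2·σ_obs·σ_sim·R. Note that this relation holds for the centered difference, i.e., the root mean square error after the means of both series are removed. The sketch below checks the relation numerically on toy values; the function name is hypothetical.

```python
import math

def taylor_statistics(obs, sim):
    """Return (std_obs, std_sim, r, centered_rmsd) using population statistics."""
    n = len(obs)
    mo, ms = sum(obs) / n, sum(sim) / n
    std_o = math.sqrt(sum((o - mo) ** 2 for o in obs) / n)
    std_s = math.sqrt(sum((s - ms) ** 2 for s in sim) / n)
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim)) / n
    r = cov / (std_o * std_s)
    crmsd = math.sqrt(sum(((s - ms) - (o - mo)) ** 2
                          for o, s in zip(obs, sim)) / n)
    return std_o, std_s, r, crmsd

obs = [10.0, 40.0, 80.0, 160.0, 240.0]   # toy measured values
sim = [18.0, 42.0, 70.0, 130.0, 185.0]   # hypothetical forecasts
std_o, std_s, r, crmsd = taylor_statistics(obs, sim)
# Law of cosines linking the three statistics shown in the diagram.
from_identity = math.sqrt(std_o ** 2 + std_s ** 2 - 2 * std_o * std_s * r)
```

The distance between a model point and the observation point in the diagram is exactly this centered difference, which is why proximity to the observation point summarizes model skill.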

The evaluation results of the five forecasting models under different evaluation indexes are shown in Table 1. Comparing the first three models in the table shows that each has advantages and disadvantages across the five indicators. Compared with the PSO–SVM model, the EWT–PSO–SVM model shows significant improvement in terms of NSE, RMSE, MAE, and R: its RMSE and MAE values are reduced by 46.14% and 40.79%, and its NSE and R values are improved by 70.03% and 19.23%, respectively. The last two rows of the table indicate that the EWT–PSO–SVM model outperforms the EMD–PSO–SVM model in all metrics. However, the MAPE value of the EWT–PSO–SVM model (94.95%) is worse than those of the SVM, GA–SVM, and PSO–SVM models. In summary, the overall simulation effects of the five models need to be further discussed to determine which is optimal.
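The reported percentages for RMSE, MAE, and R can be checked against the rounded values in Table 1 with a short calculation. NSE is left out of this sketch because the two-decimal values in Table 1 are too coarse to reproduce the reported figure exactly. The function name is hypothetical.

```python
def percent_change(baseline, new):
    """Signed change of `new` relative to `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0

# Rounded values from Table 1 (validation set, sequential structure).
pso_svm = {"RMSE": 47.14, "MAE": 28.76, "R": 0.78}
ewt_pso_svm = {"RMSE": 25.39, "MAE": 17.03, "R": 0.93}

changes = {k: round(percent_change(pso_svm[k], ewt_pso_svm[k]), 2)
           for k in pso_svm}
```

Negative changes for RMSE and MAE are reductions in error, while the positive change for R is an improvement in correlation.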

To present the overall performance of the models based on multiple metrics, the composite rating index method is utilized. The ranking results and the final scores of the five models are shown in Table 2. The SVM model has the lowest comprehensive evaluation score of 0.24, followed by the GA–SVM, PSO–SVM, and EMD–PSO–SVM models with 0.28, 0.40, and 0.48, respectively, and the EWT–PSO–SVM model has the highest comprehensive evaluation score of 0.68. Therefore, the EWT–PSO–SVM model has the best overall runoff forecasting performance among the five models.

Furthermore, to verify the stability of the developed model, the model forecasting results under different data structures were compared. The original sequential structure runoff data were reconstructed into monthly structure runoff data, and the models using the new data structure were named SVM (M), PSO–SVM (M), and EWT–PSO–SVM (M). The 12 subseries of the monthly structure runoff data were substituted into the SVM (M), PSO–SVM (M), and EWT–PSO–SVM (M) models for prediction, and the forecasting and evaluation results of the models are shown in Figures 10–12 and Tables 3 and 4.
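One natural reading of the monthly structure is that the sequential series is split into 12 subseries, one per calendar month. The sketch below follows that reading; the function name, the assumption that the record starts in January, and the placeholder values are all illustrative, and the exact restructuring used in this study is the one defined in Section 2.6.

```python
def to_monthly_structure(series, start_month=1):
    """Split a sequential monthly series into 12 subseries, one per calendar month.

    start_month is the calendar month (1-12) of the first value; a record
    starting in January is an assumption of this sketch.
    """
    subseries = {m: [] for m in range(1, 13)}
    for i, value in enumerate(series):
        month = (start_month - 1 + i) % 12 + 1
        subseries[month].append(value)
    return subseries

# 492 months (396 training + 96 validation) correspond to 41 years of record.
sequential = list(range(492))            # placeholder values, not real runoff
monthly = to_monthly_structure(sequential)
```

Each subseries then contains one value per year, so a model trained on it only sees the year-to-year variation of a single calendar month.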

Figure 10 shows the forecasted runoff depth of the models using the monthly structure. It demonstrates that the forecasting results of the EWT–PSO–SVM (M) model are closer to the measured results and better simulate the monthly runoff, which indicates that the EWT–PSO–SVM (M) model outperforms the PSO–SVM (M) model and the SVM (M) model.

Figure 11 shows the scatter plot of the model forecasting results under the monthly structure. It demonstrates that the linear correlation coefficient of the EWT–PSO–SVM (M) model (0.73) is the highest, while those of the PSO–SVM (M) model (0.58) and the SVM (M) model (0.62) are close. EWT–PSO–SVM (M) exhibits the highest performance among the three models.

In addition, Figure 12 displays the Taylor diagram of the model forecasting results under the monthly structure. It shows that the point of the EWT–PSO–SVM (M) model is closest to the observation point, which indicates that the EWT–PSO–SVM (M) model has the highest forecasting accuracy compared with the SVM (M) model and the PSO–SVM (M) model. Meanwhile, the Taylor diagram leads to conclusions similar to those of Figure 10 and Figure 11. Therefore, the three figures indicate that the developed model has good stability.

Table 3 shows the forecasting performance of the three new data structure models under different evaluation indexes. EWT–PSO–SVM (M) achieves the best results in all evaluation indexes. Compared with the PSO–SVM (M) model, the MAPE, RMSE, and MAE values of the EWT–PSO–SVM (M) model are reduced by 22.77%, 26.02%, and 26.75%, respectively, and the NSE and R values are improved by 37.31% and 15.59%, respectively. Compared with the SVM (M) model, the MAPE, RMSE, and MAE values of the EWT–PSO–SVM (M) model are reduced by 25.28%, 28.66%, and 30.25%, respectively, and the NSE and R values are improved by 46.41% and 18.45%, respectively.

Table 4 further evaluates the monthly structure models using the composite rating index method. The M_R of the EWT–PSO–SVM (M) model reaches 0.67 and the model comprehensive performance is optimal.
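The M_R values in Table 4 are consistent with a score of the form 1 minus the sum of a model's metric ranks divided by the product of the number of models and the number of metrics. The sketch below assumes this form, which is inferred from the tabulated values rather than quoted from the definition in Section 2.5; the function name is hypothetical.

```python
def composite_rating(ranks_per_model):
    """Assumed form: M_R = 1 - (sum of metric ranks) / (n_models * n_metrics)."""
    n_models = len(ranks_per_model)
    scores = {}
    for model, ranks in ranks_per_model.items():
        scores[model] = 1.0 - sum(ranks) / (n_models * len(ranks))
    return scores

# Ranks from Table 4, in the order NSE, MAPE, RMSE, MAE, R (1 = best).
table4_ranks = {
    "SVM (M)": [3, 3, 3, 3, 3],
    "PSO-SVM (M)": [2, 2, 2, 2, 2],
    "EWT-PSO-SVM (M)": [1, 1, 1, 1, 1],
}
scores = {m: round(s, 2) for m, s in composite_rating(table4_ranks).items()}
```

Under the same assumed form, a model ranked 1, 4, 1, 1, 1 among five models scores 1 − 8/25 = 0.68, which matches the value reported for the EWT–PSO–SVM model under the sequential structure.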

In conclusion, the hybrid EWT–PSO–SVM model consistently outperforms the other two models across different data structures. This suggests that the hybrid EWT–PSO–SVM model developed in this study is highly stable for runoff forecasting in the Chengbi River Karst Basin.

In this study, the hybrid EWT–PSO–SVM model, built on the “decomposition-prediction-reconstruction” procedure, achieves outstanding fitting results. In the decomposition step, the training efficiency of the model is enhanced by reducing the data dimensionality and decomposing the complex and random original runoff series into relatively stable subseries. The decomposition method used in this study is EWT, which effectively captures the time–frequency information of the original series and allows the forecasting model to better capture its hidden periodicity and regularity. Li et al. constructed an EWT–PSO–SVM model to classify time series for the fault diagnosis of high-voltage circuit breakers [22]. Their results showed that EWT can effectively extract the features of nonstationary time series, which is similar to the conclusion obtained in this study. In the forecasting step, the PSO algorithm was adopted to choose the optimal parameters c and g for the SVM model. The parameter c determines the tolerance for errors allowed by the SVM model, while g influences the number of support vectors. Values of c and g that are either too large or too small have a negative impact on the performance of the SVM model. Choosing the optimal parameters enhances the model’s generalization ability, meaning that the model performs better when dealing with new data. Qin et al. used the PSO method to optimize the SVM model and achieved good prediction results [39]. In the reconstruction step, the final results are obtained by combining the prediction results of all subseries, which leads to a synergistic effect that improves the forecasting performance of the developed model. Both this study and previous research on series forecasting with EWT and PSO have reached similar conclusions, indicating that the two methods can effectively enhance the runoff forecasting capability of the model.

In this study, a comprehensive evaluation index, M_R, was obtained by ranking and combining five evaluation indicators to verify the performance of the models. The M_R of the EWT–PSO–SVM model was 0.68, while that of the SVM model was only 0.24. A larger M_R indicates a better predictive capability. The developed model reduced the maximum absolute forecasting error by 43.12% compared with the SVM model. More accurate forecasting of large flow events can help reservoirs release water before floods occur, ensuring efficient utilization of water resources. Moreover, the forecasting performance of the three models was also investigated under a monthly structure. The M_R of the EWT–PSO–SVM (M) model reached 0.67, higher than those of the PSO–SVM (M) and SVM (M) models (0.33 and 0, respectively), indicating that the developed model has the best forecasting performance under the new data structure input. In previous studies on runoff forecasting in karst basins, process-driven models were commonly used. However, due to the complexity of karst basins, their runoff forecasting ability was limited. Zhou et al. [40] used the Xinanjiang model for runoff forecasting in karst basins, and the NSE value was only 0.847 in Guilin. The NSE of the EWT–PSO–SVM model developed in this study reached 0.86, indicating that the developed model performs well in runoff prediction for karst watersheds.

However, the developed model shows relatively unsatisfactory results in terms of the MAPE index, which differs from the findings of other studies. The study by Yuan et al. [41] indicates that the hybrid model with “decomposition-forecasting-reconstruction” performs better than the single model in terms of the MAPE index. One reason may be that the runoff data in this study come from a karst basin. The unique geological structure of karst watersheds gives them stronger water storage characteristics [42], which means that a small amount of rainfall is difficult to convert directly into runoff. As a result, the low values in the series have no obvious statistical features, which in turn affects the accuracy of the series decomposition. The regularity of the runoff series under small flow events is therefore more difficult to capture, which leads to poor forecasting performance for these events. In addition, LSTM was used in the study by Yuan et al., while SVM was used in this study, and the different model types may have contributed to the difference in results. Accurate runoff forecasting in karst basins thus remains a sustained challenge, and it is worthwhile to further explore the forecasting performance of different hybrid models.
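Why errors on low flows weigh so heavily on MAPE can be seen from a small numerical sketch. It uses the standard definitions of MAE and MAPE and toy values only; the numbers are not taken from this study.

```python
def mae(obs, sim):
    """Mean absolute error."""
    return sum(abs(o - s) for o, s in zip(obs, sim)) / len(obs)

def mape(obs, sim):
    """Mean absolute percentage error, in percent (obs must be nonzero)."""
    return sum(abs((o - s) / o) for o, s in zip(obs, sim)) / len(obs) * 100.0

# Same 4 mm miss at every step, once on high flows and once on low flows.
high_obs, high_sim = [200.0, 250.0, 400.0], [204.0, 246.0, 404.0]
low_obs, low_sim = [5.0, 8.0, 10.0], [9.0, 4.0, 14.0]

mae_high, mae_low = mae(high_obs, high_sim), mae(low_obs, low_sim)  # both 4.0
mape_high, mape_low = mape(high_obs, high_sim), mape(low_obs, low_sim)
```

With identical absolute errors, the MAPE on the low-flow series is many times larger than on the high-flow series, so a model can have a low RMSE and MAE while still showing a high MAPE when small flow events are poorly captured.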

This study developed an EWT–PSO–SVM model to predict runoff in karst basins. Previous studies also used hybrid models to predict runoff, but most of them used EMD and EEMD to preprocess the runoff series. The feasibility of EWT has been effectively verified in many fields [19], but it has rarely been applied to runoff forecasting in karst basins. In this paper, EWT was introduced into the decomposition of runoff series to enhance the precision of runoff forecasting in the Chengbi River Karst Basin. The M_R of the EWT–PSO–SVM model reached 0.68, indicating that the developed model has higher forecasting accuracy than the PSO–SVM model. Moreover, the developed model is significantly improved in the forecasting of large flow events compared with the other two models, which means that applying it can better assist in the optimal allocation of water resources. Meanwhile, this paper also investigated the predictive capability of the three models under the monthly structure. The M_R of the EWT–PSO–SVM (M) model reached 0.67, which shows that the developed model has the highest forecasting performance under the new data structure input. The comprehensive performance of the developed model is always better than that of the other two models with different data structure inputs, which further indicates that its performance advantage is robust.

Given that the EWT–PSO–SVM model has higher prediction accuracy under large flow events but inadequate accuracy under small flow events, future research could consider dividing the runoff series according to the size of the annual water inflow and predicting the parts separately. Moreover, model selection plays a critical role in determining the precision of forecasting results, and different types of models have different forecasting performances. This study constructed an EWT–PSO–SVM model to predict runoff in order to verify the hybrid performance of EWT, PSO, and SVM, and obtained good forecasting results. However, due to the complexity of runoff series in the karst basin, there is still considerable room for research on runoff forecasting. Therefore, it is valuable to further investigate the hybrid performance of EWT and other artificial intelligence or machine learning methods, such as ANN, LSTM, and Random Forest.

Due to the complexity of karst basin structure, runoff forecasting has always faced great challenges. Following the “decomposition-prediction-reconstruction” process, the hybrid EWT–PSO–SVM model forecasts runoff in three steps. First, EWT is employed to split the runoff series into subseries in order to reduce the nonlinearity and nonstationarity of the series. Second, the parameters of SVM are selected using PSO, and the subseries are then substituted into the optimized SVM model for prediction. Finally, the predicted values of all subseries are reconstructed into the final runoff forecasting results. The runoff data from the Chengbi River Karst Basin were substituted into the different models for forecasting to test the superiority of the developed model. The comprehensive evaluation index of the developed model reached 0.68, and the maximum error was reduced by 43.12% compared with the SVM model. Meanwhile, the monthly structure data were fed into the different models for prediction to further verify the stability of the developed model. The composite evaluation index of the developed model under the monthly structure reached 0.67. The results show that the EWT–PSO–SVM model exhibits better performance than the single SVM model under different data structures, which indicates that data decomposition and parameter optimization strategies can effectively enhance the precision of a single SVM model. The developed EWT–PSO–SVM model is therefore a promising method for predicting nonlinear and nonstationary runoff series in karst basins.

Although the developed model has shown good overall forecasting performance, there are still some limitations. The developed model has some prediction errors in predicting low values in the series. Therefore, future research directions could consider separating high and low values in the runoff series for prediction to obtain higher prediction accuracy. Meanwhile, it is valuable to further investigate the hybrid performance of EWT with other artificial intelligence or machine learning methods to enhance the accuracy of runoff forecasting.

Author Contributions: Conceptualization, C.M., Z.Y., X.L. and Y.D.; methodology, Z.Y., X.L. and Y.D.; software, Y.D.; validation, C.M., Z.Y. and S.L.; formal analysis, C.M., Z.Y. and S.L.; investigation, C.M., Z.Y. and S.L.; resources, C.M.; data curation, K.H.; writing—original draft preparation, Z.Y. and Y.D.; writing—review and editing, X.L., S.L. and K.H.; supervision, X.L.; project administration, C.M. and R.M.; funding acquisition, C.M., S.L. and X.M. All authors have read and agreed to the published version of the manuscript.

Funding: This work was supported by the National Natural Science Foundation of China (Grant Nos. 52269002, 51969004), the Science and Technology Award Incubation Project of Guangxi University (Grant No. 2022BZJL023), the Interdisciplinary Scientific Research Foundation of Guangxi University (Grant No. 2022JCC028), and the Guangxi Water Resource Technology Promotion Foundation (SK2021-3-23).

Institutional Review Board Statement: Not applicable.

Informed Consent Statement: Not applicable.

Data Availability Statement: Some data that support the findings of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest: The authors declare no conflict of interest.

Acronyms	Nomenclature
EWT	Empirical wavelet transform
PSO	Particle swarm optimization
SVM	Support vector machine
EWT–PSO–SVM	Empirical wavelet transform–particle swarm optimization–support vector machine
PSO–SVM	Particle swarm optimization–support vector machine
LSTM	Long short-term memory
GSSHA	Gridded surface subsurface hydrologic analysis
WANN	Wavelet-based artificial neural network
GR4J	Génie Rural à 4 paramètres Journalier
BP	Back propagation
EMD	Empirical mode decomposition
EEMD	Ensemble empirical mode decomposition
GA	Genetic algorithm
GA–SVM	Genetic algorithm–support vector machine
EMD–PSO–SVM	Empirical mode decomposition–particle swarm optimization–support vector machine
NSE	Nash–Sutcliffe efficiency coefficient
RMSE	Root mean square error
MAPE	Mean absolute percentage error
MAE	Mean absolute error
R	Correlation coefficient
M_R	Composite rating index
AE	Absolute error

System Parameter	Values
wavelet decomposition layers	4
the population size	25
the position and velocity of the initial population particles	[0, 1]
individual learning factor	1.6
social learning factor	1.6
the maximum number of iterations	150
inertia weight	0.8
the ratio of training set to validation set	8:2

1. Maddu, R.; Pradhan, I.; Ahmadisharaf, E.; Singh, S.K.; Shaik, R. Short-range reservoir inflow forecasting using hydrological and large-scale atmospheric circulation information. J. Hydrol. 2022, 612, 128153.
2. He, C.; Chen, F.; Long, A.; Qian, Y.; Tang, H. Improving the precision of monthly runoff prediction using the combined non-stationary methods in an oasis irrigation area. Agric. Water Manag. 2023, 279, 108161.
3. Zhang, J.; Chen, X.; Khan, A.; Zhang, Y.-k.; Kuang, X.; Liang, X.; Taccari, M.L.; Nuttall, J. Daily runoff forecasting by deep recursive neural network. J. Hydrol. 2021, 596, 126067.
4. Li, W.; Kiaghadi, A.; Dawson, C. High temporal resolution rainfall–runoff modeling using long-short-term-memory (LSTM) networks. Neural Comput. Appl. 2021, 33, 1261–1278.
5. Hunt, K.M.R.; Matthews, G.R.; Pappenberger, F.; Prudhomme, C. Using a long short-term memory (LSTM) neural network to boost river streamflow forecasts over the western United States. Hydrol. Earth Syst. Sci. 2022, 26, 5449–5472.
6. Feng, Z.-k.; Niu, W.-j.; Wan, X.-y.; Xu, B.; Zhu, F.-l.; Chen, J. Hydrological time series forecasting via signal decomposition and twin support vector machine using cooperation search algorithm for parameter identification. J. Hydrol. 2022, 612, 128213.
7. Adamowski, J.; Sun, K. Development of a coupled wavelet transform and neural network method for flow forecasting of non-perennial rivers in semi-arid watersheds. J. Hydrol. 2010, 390, 85–91.
8. Partal, T.; Sezen, C. The utilization of a GR4J model and wavelet-based artificial neural network for rainfall–runoff modelling. Water Supply 2019, 19, 1295–1304.
9. Feng, B.f.; Xu, Y.s.; Zhang, T.; Zhang, X. Hydrological time series prediction by extreme learning machine and sparrow search algorithm. Water Supply 2022, 22, 3143–3157.
10. Vapnik, V.N.; Chervonenkis, A.J. On a class of perceptrons. Avtom. Telemekhanika 1964, 25, 112–120.
11. Liang, Z.; Li, Y.; Hu, Y.; Li, B.; Wang, J. A data-driven SVR model for long-term runoff prediction and uncertainty analysis based on the Bayesian framework. Theor. Appl. Climatol. 2017, 133, 137–149.
12. Wang, P.; Zhang, J.; Wang, M.; Liang, Y.; Li, J. Stochastic simulation of daily runoff in the middle reaches of the Yangtze river based on SVM-Copula model. Syst. Sci. Control. Eng. 2019, 7, 452–459.
13. Chen, S.; Ren, M.M.; Sun, W. Combining two-stage decomposition based machine learning methods for annual runoff forecasting. J. Hydrol. 2021, 603, 126945.
14. Li, Y.; Zhou, L.; Gao, P.; Yang, B.; Han, Y.; Lian, C. Short-Term Power Generation Forecasting of a Photovoltaic Plant Based on PSO-BP and GA-BP Neural Networks. Front. Energy Res. 2022, 9, 824691.
15. Yang, X.; Maihemuti, B.; Simayi, Z.; Saydi, M.; Na, L. Prediction of Glacially Derived Runoff in the Muzati River Watershed Based on the PSO-LSTM Model. Water 2022, 14, 2018.
16. Sudheer, C.; Maheswaran, R.; Panigrahi, B.K.; Mathur, S. A hybrid SVM-PSO model for forecasting monthly streamflow. Neural Comput. Appl. 2013, 24, 1381–1389.
17. Huang, S.; Chang, J.; Huang, Q.; Chen, Y. Monthly streamflow prediction using modified EMD-based support vector machine. J. Hydrol. 2014, 511, 764–775.
18. Bai, Y.; Chen, Z.; Xie, J.; Li, C. Daily reservoir inflow forecasting using multiscale deep feature learning with hybrid models. J. Hydrol. 2016, 532, 193–206.
19. Wu, Z.; Huang, N.E. Ensemble empirical mode decomposition: A noise-assisted data analysis method. Adv. Adapt. Data Anal. 2009, 1, 1–41.
20. Gilles, J. Empirical Wavelet Transform. IEEE Trans. Signal Process. 2013, 61, 3999–4010.
21. Liu, W.; Chen, W. Recent Advancements in Empirical Wavelet Transform and Its Applications. IEEE Access 2019, 7, 103770–103780.
22. Li, B.; Liu, M.; Guo, Z.; Ji, Y. Application of EWT and PSO-SVM in Fault Diagnosis of HV Circuit Breakers. In Proceedings of the 7th International Conference on Communications, Signal Processing, and Systems (CSPS), Dalian, China, 14–16 July 2020; pp. 628–637.
23. Chegini, S.N.; Bagheri, A.; Najafi, F. Application of a new EWT-based denoising technique in bearing fault diagnosis. Measurement 2019, 144, 275–297.
24. Hu, J.; Wang, J.; Ma, K. A hybrid technique for short-term wind speed prediction. Energy 2015, 81, 563–574.
25. He, Y.; Li, J.M.; Ruan, S.; Zhao, S. A Hybrid Model for Financial Time Series Forecasting—Integration of EWT, ARIMA with The Improved ABC Optimized ELM. IEEE Access 2020, 8, 84501–84518.
26. Li, Y.; Song, T.; Lai, Y.; Huang, Y.; Fang, L.; Chang, J. Status, mechanism, suitable distribution areas and protection countermeasure of invasive species in the karst areas of Southwest China. Front. Environ. Sci. 2022, 10, 957216.
27. Meng, X.; Yin, M.; Ning, L.; Liu, D.; Xue, X. A threshold artificial neural network model for improving runoff prediction in a karst watershed. Environ. Earth Sci. 2015, 74, 5039–5048.
28. Du, W.; Zhu, R.; Li, Y. Adaptive determination method of wavelet filter decomposition layers. Optoelectron. Laser 2010, 21, 1408–1411. (In Chinese)
29. Niu, W.-j.; Feng, Z.-k. Evaluating the performances of several artificial intelligence methods in forecasting daily streamflow time series for sustainable water resources management. Sustain. Cities Soc. 2021, 64, 102562.
30. Kennedy, J.; Eberhart, R. Particle swarm optimization. In Proceedings of the ICNN’95-International Conference on Neural Networks 1995, Perth, WA, Australia, 27 November–1 December 1995; Volume 4, pp. 1942–1948.
31. Feng, Z.-K.; Niu, W.-J.; Tang, Z.-Y.; Jiang, Z.-Q.; Xu, Y.; Liu, Y.; Zhang, H.-R. Monthly runoff time series prediction by variational mode decomposition and support vector machine based on quantum-behaved particle swarm optimization. J. Hydrol. 2020, 583, 124627.
32. Mo, C.; Liu, G.; Lei, X.; Zhang, M.; Ruan, Y.; Lai, S.; Xing, Z. Study on the Optimization and Stability of Machine Learning Runoff Prediction Models in the Karst Area. Appl. Sci. 2022, 12, 4979.
33. Zhao, X.; Lv, H.; Wei, Y.; Lv, S.; Zhu, X. Streamflow Forecasting via Two Types of Predictive Structure-Based Gated Recurrent Unit Models. Water 2021, 13, 91.
34. Mo, C.; Ruan, Y.; Xiao, X.; Lan, H.; Jin, J. Impact of climate change and human activities on the baseflow in a typical karst basin, Southwest China. Ecol. Indic. 2021, 126, 107628.
35. Seenu, P.Z.; Jayakumar, K.V. Comparative study of innovative trend analysis technique with Mann-Kendall tests for extreme rainfall. Arab. J. Geosci. 2021, 14, 536.
36. Zhang, X.; Zhao, C.; Yang, J. Quantitative Analysis of Impact of Climate Variability and Human Activities on Water Resources Change in Suzhou City. In Proceedings of the 4th International Conference on Environmental Engineering and Sustainable Development (CEESD), Xiamen, China, 5–7 December 2020.
37. Kisi, O. Wavelet regression model for short-term streamflow forecasting. J. Hydrol. 2010, 389, 344–353.
38. Fu, Q.; Li, L.; Li, M.; Li, T.; Liu, D.; Hou, R.; Zhou, Z. An interval parameter conditional value-at-risk two-stage stochastic programming model for sustainable regional water allocation under different representative concentration pathways scenarios. J. Hydrol. 2018, 564, 115–124.
39. Qin, Y.W.; Lei, Y.J.; Gong, X.Y.; Ju, W.L. A model involving meteorological factors for short- to medium-term, water-level predictions of small- and medium-sized urban rivers. Nat. Hazards 2022, 111, 725–739.
40. Zhou, Q.; Chen, L.; Singh, V.P.; Zhou, J.Z.; Chen, X.H.; Xiong, L.H. Rainfall-runoff simulation in karst dominated areas based on a coupled conceptual hydrological model. J. Hydrol. 2019, 573, 524–533.
41. Yuan, R.; Cai, S.; Liao, W.; Lei, X.; Zhang, Y.; Yin, Z.; Ding, G.; Wang, J.; Xu, Y. Daily Runoff Forecasting Using Ensemble Empirical Mode Decomposition and Long Short-Term Memory. Front. Earth Sci. 2021, 9, 621780.
42. Sagir, C.; Kurtulus, B.; Razack, M. Hydrodynamic Characterization of Mugla Karst Aquifer Using Correlation and Spectral Analyses on the Rainfall and Springs Water-Level Time Series. Water 2020, 12, 85.

Figure 1. EWT–PSO–SVM hybrid model and its performance investigation schematic diagram.

Figure 2. Data structure.

Figure 3. Overview of Chengbi River Karst Basin.

Figure 4. (A) Monthly runoff depth mutation, (B) Monthly runoff depth trend, (C) Monthly runoff depth real part time–frequency diagram, and (D) Monthly runoff depth variance diagram.

Figure 5. EWT decomposition results.

Figure 6. Monthly runoff forecasting results.

Figure 7. The AE of five models.

Figure 8. Scatter plot of forecasting by each model.

Figure 9. Taylor Diagram.

Figure 10. Monthly runoff forecast results under the monthly structure.

Figure 11. Scatter plot of forecasting by each model under the monthly structure.

Figure 12. Taylor Diagram under the monthly structure.

Table 1. Forecasting effect of each model on validation set under the sequential structure.

Model	NSE	MAPE(%)	RMSE	MAE	R
SVM	0.49	79.18	47.79	29.08	0.78
GA–SVM	0.49	78.39	47.73	29.06	0.76
PSO–SVM	0.50	80.45	47.14	28.76	0.78
EMD–PSO–SVM	0.72	125.14	35.59	24.60	0.85
EWT–PSO–SVM	0.86	94.95	25.39	17.03	0.93

Table 2. Ranking of evaluation indexes and overall evaluation scores of the five models.

Table 3. Forecasting effect of each model on validation set under the monthly structure.

Model	NSE	MAPE(%)	RMSE	MAE	R
SVM (M)	0.51	67.57	47.79	29.44	0.74
PSO–SVM (M)	0.55	65.37	44.81	28.03	0.75
EWT–PSO–SVM (M)	0.75	50.49	25.39	20.53	0.87

Table 4. Ranking of evaluation indexes and overall evaluation scores of the three models.

Model	NSE	MAPE	RMSE	MAE	R	M_R
SVM (M)	3	3	3	3	3	0
PSO–SVM (M)	2	2	2	2	2	0.33
EWT–PSO–SVM (M)	1	1	1	1	1	0.67

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

MDPI and ACS Style

Mo, C.; Yan, Z.; Ma, R.; Lei, X.; Deng, Y.; Lai, S.; Huang, K.; Mo, X. Investigation of the EWT–PSO–SVM Model for Runoff Forecasting in the Karst Area. Appl. Sci. 2023, 13, 5693. https://doi.org/10.3390/app13095693
