*Article* **Evaluating Monthly Flow Prediction Based on SWAT and Support Vector Regression Coupled with Discrete Wavelet Transform**

**Lifeng Yuan 1,2 and Kenneth J. Forshay 1,\***


**Abstract:** Reliable and accurate streamflow prediction plays a critical role in watershed water resources planning and management. We developed a new hybrid SWAT-WSVR model based on 12 hydrological sites in the Illinois River watershed (IRW), U.S., that integrated the Soil and Water Assessment Tool (SWAT) model with a Support Vector Regression (SVR) calibration method coupled with discrete wavelet transforms (DWT) to better support modeling watersheds with limited data availability. Wavelet components of the simulated streamflow from the SWAT-Calibration Uncertainty Procedure (SWAT-CUP) and precipitation time series were used as inputs to SVR to build a hybrid SWAT-WSVR. We examined the performance and potential of the SWAT-WSVR model and compared it with observations, SWAT-CUP, and SWAT-SVR using statistical metrics, Taylor diagrams, and hydrography. The results showed that the average of *RMSE*-observation's standard deviation ratio (*RSR*), Nash–Sutcliffe efficiency (*NSE*), percent bias (*PBIAS*), and root mean square error (*RMSE*) from SWAT-WSVR is 0.02, 1.00, <sup>−</sup>0.15, and 0.27 m<sup>3</sup> s −1 in calibration and 0.14, 0.98, <sup>−</sup>1.88, and 2.91 m<sup>3</sup> s −1 in validation on 12 sites, respectively. Compared with the other two models, the proposed SWAT-WSVR model possessed lower discrepancy and higher accuracy. The rank of the overall performance of the three SWAT-based models during the whole study period was SWAT-WSVR > SWAT-SVR > SWAT-CUP. The developed SWAT-WSVR model supplies an additional calibration approach that can improve the accuracy of the SWAT streamflow simulation of watersheds with limited data.

**Keywords:** SWAT; support vector regression; streamflow prediction; wavelet transform; Illinois River watershed

#### **1. Introduction**

A precise and reliable monthly streamflow prediction model is helpful in the planning, management, development, and protection of water resources, such as future flood and drought forecasting, reservoirs, and/or agricultural water management [1–3]. However, many hydrological elements (e.g., precipitation, runoff, sediment, flood, and streamflow) have highly complex, nonlinear, non-stationary, and uncertainty features, which is a challenge for conventional hydrological methods when analyzing and predicting the complex patterns and inherent variabilities of rainfall–runoff relationships [4–8]. Hence, accurate rainfall–runoff prediction became a difficult task in stochastic hydrology; subsequently, new theories and methods, such as machine learning methods [9,10], have been introduced to improve rainfall–runoff forecasting.

Support vector machine (SVM) is a machine learning method, which can be considered a data-driven black-box model [11]. SVM focuses on determining a kernel function and searching for an optimum separating hyperplane based on the kernel function selected.

**Citation:** Yuan, L.; Forshay, K.J. Evaluating Monthly Flow Prediction Based on SWAT and Support Vector Regression Coupled with Discrete Wavelet Transform. *Water* **2022**, *14*, 2649. https://doi.org/10.3390/ w14172649

Academic Editors: Dengfeng Liu, Hui Liu and Xianmeng Meng

Received: 30 June 2022 Accepted: 4 August 2022 Published: 27 August 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

This separating hyperplane determines an optimal parameter combination fitting of the observations. Meanwhile, the search process avoids overfitting and make SVM present better generalization characteristics [12]. Compared to an artificial neural network (ANN), SVM performs better [13] in some hydrological applications because it applies a structural risk minimization principle to obtain a global optimum solution rather than a solution based on the empirical risk minimization as applied in ANN [4,13–15]. SVM can solve the nonlinear problem in a low dimension input space by projecting to a higher dimension feature space where an original nonlinear problem is converted into a linear problem [16,17]. SVM has been widely applied in recent decades to hydrological prediction worldwide. For instance, Shabri and Suhartono (2012) [18] used the least-squares SVM (LSSVM) method to predict streamflow in Peninsular Malaysia to compare performance of different models' including LSSVM, autoregressive integrated moving average (ARIMA), ANN, and regular SVM. Kalteh (2013) [4] compared the prediction accuracy of monthly flow discharge from an artificial neural network (ANN) method and the support vector regression (SVR) model coupled with a wavelet transform in two stations in northern Iran. Chiogna et al. (2018) [19] combined the Soil and Water Assessment Tool (SWAT) model and SVR method to predict hydropeaking for the Upper Adige River watershed in northeast Italy and applied a wavelet method to analyze the price of energy. However, these studies only used a few hydrological stations to test their methods and evaluate the model performance. Nourani et al. (2015) [20] proposed a two-stage SVM method with spatial statistics to simulate monthly river suspended sediment load for 15 sites within the Ajichay River in northwest Iran. Yuan and Forshay (2021) [21] developed a seasonal SWAT model coupled with SVR for 13 hydrological stations using a spatial calibration method to improve the accuracy of monthly streamflow prediction in the Illinois River watershed. From these efforts, it is clear that the SVM method has powerful predictive ability depending on calibration datasets and can accurately capture nonlinear relationships between the input and output variables. However, the SVM prediction method neglects the detailed characteristics and processes of a watershed system [22] and simplifies the complexity of the rainfall–runoff relationship [23].

Wavelet analysis is a mathematical function with an auto-adaptive time-frequency window (i.e., the width of time and frequency may change) and is suited to analyze and calculate stationary or non-stationary time series signals [4]. In the high-frequency period of signals, the size of a frequency window becomes larger while the size of a time window becomes smaller and otherwise occurs in the signals low-frequency period. Wavelet analysis has an excellent capability to reduce data noise, analyze variabilities, periodicities, and trends of hydrological time series and is suitable for handling the non-stationary flow signals [24,25]. To date, the wavelet analysis method has been widely applied in hydrological prediction. For example, Nourani et al. (2014) [7] reviewed applications of hybrid wavelet-Artificial Intelligence (AI) models in hydrology and presented remarkable progress when integrating wavelet analysis and AI models to improve the prediction accuracy of hydrologic models in the recent couple of decades. Noraini and Norhaiza (2017) [13] compared different wavelet denoising techniques and decomposition levels, and input streamflow time series after wavelet denoising into SVR based on a radial basis function (RBF) kernel to improve 1-month-ahead streamflow prediction in the Segamat River basin in southern Malaysia. Sun et al. (2019) [26] integrated multiple methods, such as the autoregressive model, autoregressive moving average (ARMA) model, ANN model, and linear regression (LR) model, with wavelet transform and compared their differences in performance while predicting daily streamflow in the Heihe River basin of northern China, where they found that the wavelet-based method can effectively improve streamflow simulations compared to other models. Nalley et al. (2020) [27] used several wavelet-transform-based methods to improve the performance of extending streamflow records for areas in Canada with limited data available. From the works mentioned above, it was inferred that wavelet transform typically worked as a data pre-processing tool before employing conventional hydrological analysis methods or data-driven models and

could raise the accuracy of the flow prediction. The main concerns regarding wavelet applications in hydrological forecasting are the proper selection of the mother wavelet function and the determination of decomposition levels corresponding to the specific hydrological time series [7,28]. Integrating these machine learning methods with physically based hydrologic models could better describe streamflow by incorporating the constraints of the physical world that include advanced mathematical techniques to discover the complex hidden or obscured signals that are difficult or impossible to model in a physically based modeling system.

Many physically based hydrologic models have been developed and applied to predict streamflow [29]. Among these models, the Soil and Water Assessment Tool (SWAT) is a conceptual, physically based, watershed-scale hydrologic model that has been extensively applied worldwide [30]. SWAT takes a large number of physical processes of hydrology into account. Hence, it requires extensive data and parameter inputs. Often, data are unavailable in certain regions due to time or economic cost or limitations of measurement technologies, especially in some developing countries. Therefore, the unknown values of many parameters in SWAT can only be determined via the procedure of calibration [3]. Calibration is a time-consuming and complicated process since it involves parameterization, optimal algorithm determination, and extensive iterative computation to find optimal value ranges and parameter combinations [11,31,32].

Moreover, the issue of parameter non-uniqueness is that different parameter sets might produce very similar simulated signals with the observed flow time series, which makes effective calibration harder to achieve [31]. To obtain appropriate parameter combinations, it requires researchers to have a profound understanding of hydrological parameters and processes and familiarity of local hydrological conditions and physical features in a study area. To raise the accuracy of hydrological models prediction, especially for a region with limited data available, several efforts have evaluated the performance and potential of SWAT coupling with the SVR methods in streamflow prediction [11,15,33,34], yet few efforts [19,35] have attempted to couple a distributed physically based model and a machine learning method to improve the rainfall–runoff simulation. In some cases, we do not achieve a desirable result of flow prediction with acceptable accuracy, even after conducting comprehensive model calibration. Here, we attempt to improve the accuracy of a commonly used physically based model (SWAT) by integrating discrete wavelet transform functions and support vector machines in a system with complex non-linear rainfall–runoff relationships to support modeling efforts in watersheds with limited data.

The object of this study aims to show how the SVM method with wavelet transforms can be used to improve the monthly flow prediction of the calibrated SWAT hydrological model in the Illinois River watershed (IRW), U.S. This work studied and compared the performance of calibrated SWAT (SWAT-CUP), regular SWAT-SVR without applying wavelet transform, and SWAT-WSVR coupled with wavelet transform for monthly flow prediction for 12 hydrological stations in the IRW. This study is helpful to improve streamflow prediction at a month scale in some areas with sparse data.

#### **2. Methodology**

To improve monthly flow prediction, we developed the hybrid model SWAT-WSVR based on SWAT and SVR with discrete wavelet transforms. First, we developed SWAT models for twelve sites after SWAT-CUP (SWAT Calibration and Uncertainty Program) calibration progresses. Then, the flow at month *t* simulated by SWAT-CUP calibration and corresponding precipitation (i.e., applying the Thiessen polygons divide method to allocate the NCDC meteorological stations to the USGS hydrological sites) of each site served as SVR input variables to predict flow on month *t*. It was a regular SWAT-SVR model. Next, the simulated monthly flow from SWAT-CUP was decomposed using the discrete wavelet transform (DWT) to obtain wavelet coefficients and approximation at different scale resolutions, which were served as inputs of SWAT-WSVR with precipitation data. The results from SWAT-CUP worked as a benchmark compared with the results from

SWAT-SVR and SWAT-WSVR. To estimate the performance of different models, we used a metric such as the Nash–Sutcliffe efficiency (*NSE*), Percent Bias (*PBIAS*), Root Mean Square Error (*RMSE*), and *RMSE*-observations standard deviation ratio (*RSR*) to estimate the model results quantitatively. Finally, we combined calibrated (or training) and validated (or testing) time series data and re-estimated the entire model performance based on Pearson's correlation coefficient (*r*), *RMSE*, and normalized standard deviation (*NSD*) and plotted the Taylor diagram and hydrography to compare their performance difference graphically. Figure 1 showed the construction flowchart of three hydrology models. The results from SWAT-CUP worked as a benchmark compared with the results from SWAT-SVR and SWAT-WSVR. To estimate the performance of different models, we used a metric such as the Nash–Sutcliffe efficiency (*NSE*), Percent Bias (*PBIAS*), Root Mean Square Error (*RMSE*), and *RMSE*-observations standard deviation ratio (*RSR*) to estimate the model results quantitatively. Finally, we combined calibrated (or training) and validated (or testing) time series data and re-estimated the entire model performance based on Pearson's correlation coefficient (*r*), *RMSE*, and normalized standard deviation (*NSD*) and plotted the Taylor diagram and hydrography to compare their performance difference graphically. Figure 1 showed the construction flowchart of three hydrology models.

wavelet transform (DWT) to obtain wavelet coefficients and approximation at different scale resolutions, which were served as inputs of SWAT-WSVR with precipitation data.

*Water* **2022**, *14*, x FOR PEER REVIEW 4 of 19

**Figure 1.** The developing flowchart of different models: SWAT-CUP, SWAT-SVR, and SWAT-**Figure 1.** The developing flowchart of different models: SWAT-CUP, SWAT-SVR, and SWAT-WSVR.

#### WSVR. *2.1. Watershed Description and Data Source*

*2.1. Watershed Description and Data Source*  The IRW (35°31′–36°9′ N, 94°12′–95°2′ W) covers about a 4200 km2 drainage area and crosses Arkansas and Oklahoma, U.S. The average basin slope is about 5.6%. The average annual temperature and precipitation are about 16 °C and 1198 mm, respectively. Monthly statistical discharge data from twelve U.S. Geological Survey (USGS) hydrological sites were downloaded from the official website [21]. These monthly statistics generated from sites are based on USGS-approved daily-mean data. The basic information of 12 hydrologic stations and selected descriptive statistics for monthly flow time series are listed in Table 1. Additionally, Figure 2 shows the spatial distribution of meteorological and hy-The IRW (35◦310–36◦9 0 N, 94◦120–95◦2 <sup>0</sup> W) covers about a 4200 km<sup>2</sup> drainage area and crosses Arkansas and Oklahoma, U.S. The average basin slope is about 5.6%. The average annual temperature and precipitation are about 16 ◦C and 1198 mm, respectively. Monthly statistical discharge data from twelve U.S. Geological Survey (USGS) hydrological sites were downloaded from the official website [21]. These monthly statistics generated from sites are based on USGS-approved daily-mean data. The basic information of 12 hydrologic stations and selected descriptive statistics for monthly flow time series are listed in Table 1. Additionally, Figure 2 shows the spatial distribution of meteorological and hydrological stations, terrain, lakes, and rivers in the IRW.

drological stations, terrain, lakes, and rivers in the IRW.

**Data Period**

**Upstream** 

**No. USGS Sta-**

**tion** 


**Table 1.** Watershed properties and selected descriptive statistics of USGS hydrological stations.

**Average Monthly Streamflow (m3 s−1)** 

*Water* **2022**, *14*, x FOR PEER REVIEW 5 of 19

**Number of Data** 

1 07195800 36.8 1.1995–12.2013 228 0.41 2.90 0.05 0.25 0.44

**Flow Descriptive Statistics (m3 s−1)** 

**Max Min Median Standard Devi-**

**ation** 

**Figure 2.** The geographic distribution of NCDC meteorological stations, USGS hydrological stations, lakes, and rivers in the IRW. **Figure 2.** The geographic distribution of NCDC meteorological stations, USGS hydrological stations, lakes, and rivers in the IRW.

The data used to set up the SWAT model include the following: (1). Digital elevation model (DEM): 1 Arc-Second Global Database from Shuttle Radar Topography Mission (SRTM) were downloaded from the USGS website (https://earthexplorer.usgs.gov/, accessed on 01-28-2018). Its spatial resolution is about 30 m × 30 m. (2). Land use and land cover data: 2011 NLCD dataset (https://www.mrlc.gov/, accessed on 01-31-2018) were applied in this study, and spatial resolution is 100 m × 100 m. (3). Soil data: We downloaded soil data from the SSURGO database (https://websoilsurvey.nrcs.usda.gov/, accessed on 02-05-2018). (4). Climate data: Daily climate data came from the National Climatic Data The data used to set up the SWAT model include the following: (1). Digital elevation model (DEM): 1 Arc-Second Global Database from Shuttle Radar Topography Mission (SRTM) were downloaded from the USGS website (https://earthexplorer.usgs.gov/, accessed on 28 January 2018). Its spatial resolution is about 30 m × 30 m. (2). Land use and land cover data: 2011 NLCD dataset (https://www.mrlc.gov/, accessed on 31 January 2018) were applied in this study, and spatial resolution is 100 m × 100 m. (3). Soil data: We downloaded soil data from the SSURGO database (https://websoilsurvey.nrcs.usda.gov/, accessed on 5 February 2018). (4). Climate data: Daily climate data came from the National Climatic Data Center (NCDC) (https://www.ncdc.noaa.gov/, accessed on 7 February 2018). Due to incomplete precipitation and temperature records from January 1990 to December

Center (NCDC) (https://www.ncdc.noaa.gov/, accessed on 02-07-2018). Due to incomplete

2013, we downloaded alternative Climate Forecast System Reanalysis (CFSR) data from the SWAT official website (https://globalweather.tamu.edu/, accessed on 31 January 2018), then filled missing NCDC data using climate data from the closest CFSR stations.

#### *2.2. Hydrological Model*

SWAT was developed by the U.S. Department of Agriculture Agricultural Research Service (USDA-ARS) and has been extensively used worldwide [30,36]. It is a conceptual, semi-distributed, and physically based hydrologic model used to simulate water cycles, crop growth, sediment yields, and agricultural chemical transport in a large river basin with varying soils, slopes, and land use management conditions [30]. A more detailed description of the SWAT model is available from online documentation [37].

ArcSWAT 2012.10\_4.19 within ArcGIS 10.4.1 was selected to build the SWAT-based model in the study area. The entire IRW watershed was discretized as 86 subwatersheds with 1023 hydrologic response units (HRUs) using a threshold area of 3000 ha. Each HRU consisted of various land use/soil/slope attributes and was defined with a threshold of land use (10%), soil (10%), and slope (5%). The SCS curve number method [38] and the variable storage routing method [37] were applied to calculate the surface runoff and river flow, respectively. A five-year warm-up period (1990–1994) was set up to initialize the model input and stabilize the SWAT model. The SWAT simulation running period is from 1 January 1995 to 31 December 2013.

We used SWAT-CUP with Sequential Uncertainty Fitting (SUFI2) method to conduct sensitivity analysis, calibration, and validation procedures of the SWAT model [31]. The all-at-a-time approach was applied in the procedure of parameterization with 1000 SWAT-CUP simulations. SWAT-CUP was set up for all twelve stations and run at one time with two iterations. The nine sensitive parameters range were determined and their range and fitted values in calibration listed in Table 2. The results from SWAT-CUP were regarded as a benchmark to compare with results from SWAT-SVR and SWAT-WSVR. The optimal parameters combination and sensitivity of the SWAT model depends on precipitation input, interpolation of weather data, and the number of iterations. It has been investigated in previous publications [11,32,39,40] and will not be discussed further in this article.


**Table 2.** The sensitive parameters range and their fitted values in calibration.

Note: "A\_\_", "V\_\_", and "R\_\_" mean an absolute increase, a replacement, and a relative change to the initial parameter values, respectively.

#### *2.3. Support Vector Machine*

SVM is built on the principle of the statistical learning and structural risk minimization theory [41]. When SVM technology is applied in regression analysis, it is called SVR. The SVR function is expressed as below [41]:

$$f(\mathbf{x}) = w \cdot \Phi(\mathbf{x}) + b \tag{1}$$

where *w* is a weight vector, *Φ* is a nonlinear transfer function, and *b* is offset. An SVR function *f*(*x*) can be expressed as the below formulation [17]:

$$\min \quad \frac{1}{2}||w||^2 + \mathbb{C}\sum\_{n=1}^{n} (\mathfrak{f}\_i + \mathfrak{f}\_i^\*)$$

$$y\_i - (w \cdot \Phi(\mathbf{x}\_i) + b) \le \varepsilon + \mathfrak{f}\_i$$

$$\text{subject to } (w \cdot \Phi(\mathbf{x}\_i) + b) - y\_i \le \varepsilon + \mathfrak{f}\_i^\* \tag{2}$$

$$\mathfrak{f}\_{i'} \mathfrak{f}\_{i}^\* \ge 0, \ i = 1, 2, \dots, n$$

where *ξ<sup>i</sup>* and *ξ* ∗ *i* are slack variables that estimate the deviation of training data falling out of the ε-insensitive zone. The *C* is a penalty factor that determines the tradeoff between the flatness of Φ(*xi*) and the amount up to which the deviation *ε* can be tolerated [42].

In application, SVR includes four commonly used kernel functions such as the linear, polynomial, Gaussian radial basis (RBF), and sigmoid. In this paper, we selected the Gaussian RBF kernel function due to its computational efficiency, and its expression is described below [43]:

$$K(\mathbf{x}\_{i\prime}\mathbf{x}\_{j}) = \exp\left(-\gamma \|\mathbf{x}\_{i} - \mathbf{x}\_{j}\|^{2}\right) \tag{3}$$

The most critical three parameters in a SVR ε-regression application based on the RBF kernel include: the penalty error parameter *C* (*C* > 0), the Gaussian RBF kernel parameter *γ*, and the deviation of the error margin *ε* [14]. We applied the grid search and the *k*-fold cross-validation method to optimize these parameters. The parameters value range in the grid-searching was set up as: *C* (begin = 2−<sup>6</sup> , end = 2<sup>8</sup> , step = 1), *γ* (begin = 2<sup>4</sup> , end = 2−<sup>8</sup> , step = <sup>−</sup>1), and *<sup>ε</sup>* (begin = 2−<sup>8</sup> , end = 2−<sup>1</sup> , step = 0.5). The *k*-value in cross-validation was set to 5 for tuning the SVR. Before training the SVR, all input data were normalized to the value range [0, 1] by the formula (*x* − *x*\_*min*)/(*x*\_*max* − *x*\_*min*). Additionally, for each site, we used the first 70% of data to train the model, then applied the remaining 30% subset for validation purposes. The SVR ε-regression model was used to develop both SWAT-SVR and SWAT-WSVR. R version 4.1.0 running on RStudio version 1.4.1717 and the 'e1071' package [44] were used for the development, training, and testing of the hybrid model [45].
