Invasive-Weed-Optimization-Based Extreme Learning Machine for Prediction of Lake Water Level Using Major Atmospheric–Oceanic Climate Scenarios

Murat Can

doi:10.3390/su16177825

State Hydraulic Works of Turkey, 1st District, Bursa 16260, Turkey

Sustainability 2024, 16(17), 7825; https://doi.org/10.3390/su16177825

This article belongs to the Section Air, Climate Change and Sustainability

Abstract

Fresh water lakes are vulnerable assets that need to be protected against natural and man-made challenges such as climate change and anthropogenic activities. This study addresses the predictability of lake water level changes based on knowledge acquired directly from climate data. Two fresh water lakes located in Turkey, Lake Iznik and Lake Uluabat, are addressed. Time series of the lake water levels during October 1990–September 2019 at a monthly scale, along with the corresponding anomalies of 24 Large-Scale Atmospheric–Oceanic Oscillations (LSAOOs) from around the globe, are used in the analysis. The relationships between variables and the structure of the models are initially determined from the dependence between climate indices and lake water levels, judged by the significance of the Spearman rank-order correlation coefficient. Then, the time series are divided into training (80%) and testing (20%) sets. The Extreme Learning Method (ELM), together with its hybrids enhanced with the genetic algorithm (ELM-GA) and Invasive Weed Optimization (ELM-IWO), is then used in the predictive models. Based on the results, Lake Uluabat showed a stronger teleconnection with LSAOOs, while the ELM-GA for Lake Iznik and the ELM-IWO for Lake Uluabat showed the best performance in the prediction of lake water levels. Comparison of the ELM-IWO with the corresponding ELM-GA illustrates that the ELM-IWO yields more acceptable results owing to its flexible nature.

Keywords:

climate change; lake water level; machine learning; prediction

1. Introduction

Fresh water lakes are among the most invaluable water resources that ensure the life cycle and health of the surrounding wetlands [1]. As a result, any changes in their water content cause major damage to the activities that take place in their vicinity, alteration of the coastlines, and changes in the bathymetry and micro-climate of the region [2]. Reports on forthcoming changes in climate patterns and water shortages, however, urge decision makers to develop plans and take actions that help these lakes adapt to climate factors and the ever-increasing demand for fresh water reservoirs [1,2].

In this respect, Angel and Kunkel investigated the effect of low-, moderate-, and high-emissions climate scenarios on Lake Michigan’s water levels [3]. Haghighi and Kløve investigated the effect of climate and river flow on lakes [4]. The so-called degree of lake wetness was introduced, and it was concluded that the lake water level in high-capacity-inflow-ratio systems depends on the climate, while, in systems with low capacity, the inflow ratio mostly is related to the river regime. Woolway et al. reviewed lakes’ physical variables and their responses to climate change around the globe [5]. Bai et al. investigated the extreme water level events in 245 lakes around the world. To perform this, satellite altimetry data were used to study the effect of large-scale climate oscillations on these changes. It was concluded that the proportion of lakes with extreme water level events annually experience a cyclic fluctuation [6]. Aminjafari et al. examined the changes in water levels in 144 lakes in Sweden using a radar altimetry technique. It was concluded that the lakes depict different types of trends and behavior based on their geographical positioning, and this underscores the continuous need for monitoring lake water levels for adaptation strategies in the face of climate change [7]. The interested reader may also refer to the recent studies that addressed the link between lake water level and large-scale climatic events [8,9,10].

Furthermore, the recent development in machine learning methods and artificial intelligence motivated many researchers to use stand-alone or hybrid models in the prediction of different properties of lakes [11,12,13], including lake water level and coast line changes. Some of these models are Artificial Neural Networks (ANNs), Genetic Programming (GP), Extreme Learning Method (ELM), Support Vector Machine (SVM), Elman Neural Network (ENN), Ant Colony (AC), Firefly Algorithm (FA), Kernel Fusion (KF), etc. So, in light of the superiority of these models, one may prefer using these approaches in the prediction of lake water levels. Among these methods, the so-called ELM is a novel approach for addressing single-hidden-layer neural networks, and, in contrast to the conventional solving algorithms, this method is both streamlined and significantly faster. The parameters for the ELM, encompassing weights and biases, are initialized randomly, obviating the need for subsequent adjustment. Although the stand-alone state of the ELM in the prediction of lake water levels was already mentioned in several studies, a more sophisticated version or the hybridization of the main algorithm with different methods is still a promising topic in lake studies [11,14]. Shiri et al. used the ELM in the prediction of the lake water level in Lake Urmia [11], while the same dataset was used in another study conducted by Sales et al. [15]. Yet, according to the obtained results, the hybridization of the ELM with another algorithm reveals more promising prediction [15]. Therefore, in this study, we also seek a fresh hybridization approach which leads us to a better understanding of lake water level prediction, and develop a foundation for further investigation.

When the effect of climatic events on lake water levels is of interest, the role of Large-Scale Atmospheric–Oceanic Oscillations (LSAOOs) stands out as a hot topic. These oscillations are related to changes in pivotal climatic factors such as atmospheric pressure, sea surface temperature (SST), wind speed, and radiation, which are capable of altering precipitation and other hydro-climatic phenomena around the globe. For instance, the North Atlantic Oscillation (NAO) and El Niño–Southern Oscillation (ENSO) indices have been continuously addressed in the past, and it has been concluded that both of them are effective in determining the hydrology of many catchments. Vaheddoost investigated the correspondence between the NAO, Atlantic Multi-Decadal Oscillation (AMO), ENSO, Indian Ocean Dipole (IOD), Arctic Oscillation (AO), Pacific Decadal Oscillation (PDO), and Southern Annular Mode (SAM) and water level oscillations in Lake Urmia [14]. Wang et al. investigated the effect of AMO, NAO, and PDO on ice cover in the Great Lakes, US/Canada [16]. Several studies also used these indices within the structure of models to develop more advanced predictions. Likewise, Komatsu et al. investigated the watershed runoff and reservoir water quality with the help of the Global Climate Model (GCM) in Shimajigawa reservoir, Japan [17]. Fathian and Vaheddoost investigated the role of the Southern Oscillation Index (SOI) and NAO indices in Lake Urmia’s water level with the help of generalized autoregressive conditional heteroscedasticity (GARCH) models. It was concluded that the SOI–water level link is more relevant than the NAO–water level link [18]. Ozdemir et al. reviewed lake water level prediction models and reported that Deep Learning (DL) showed the highest accuracy in terms of the evaluation metrics [19].

As briefly mentioned above, the impact of climate on the behavior of water levels in the Uluabat and Iznik lakes has been insufficiently addressed in the past. This gap is largely attributable to a scarcity of data and significant discontinuities in the existing records, which present considerable challenges for conducting comprehensive hydrological studies. Consequently, our understanding of the hydrological cycles and behaviors of these lakes remains quite limited. Moreover, recent environmental changes and anthropogenic activities in the vicinity of these lakes have disrupted their natural equilibrium, leading to an unpredictable future. To the best of the author’s knowledge, no study to date has employed advanced or hybrid models for predicting water levels in Lake Iznik and Lake Uluabat. In light of this brief discussion, the objectives of this study are to identify the most influential climatic indices affecting water levels in these lakes; develop advanced predictive models, preferably utilizing the ELM, for predicting future lake water levels; and establish a foundation for future hydrological research on these lakes.

2. Materials and Methods

2.1. Study Area

The study area is located in western Turkey to the south of the Marmara Sea, covering a relatively vast area known as the Bursa district (Figure 1). Both Lake Iznik (40°26′6.11″ N, 29°31′19.52″ E) and Lake Uluabat (40°10′3.81″ N, 28°36′5.68″ E) are considered to be permanent fresh water lakes of great importance and are less than 60 km apart. The elevation of the region ranges from 0.00 m Above Sea Level (m ASL) at the coasts of the Marmara Sea to 2543 m ASL at the summit of Mount Uludag. According to the Köppen–Geiger climate classification map [20], the vicinity can be considered as Csa and partly Csb around Lake Iznik, accounting for the warm-temperate climate (C), summer-dry precipitation regime (s), and hot (a)/warm (b) summer temperature. The average annual precipitation in the region is about 709 mm (long-run mean for Bursa region), while the temperature in the region ranges between −20.5 °C and 43.8 °C.

Figure 1. Study area and location of Lake Uluabat and Lake Iznik.

2.2. Data

The data used in this study are all provided by the State Hydraulic Works (DSI in Turkish acronyms). The data consist of lake water level time series records for both lakes, spanning from October 1990 to September 2019 at a monthly scale. The measurements, recorded in m ASL, are used as model outputs after checking for data loss and potential inconsistencies. Subsequently, the time series records of 24 LSAOOs, which were previously normalized or standardized by the provider (expressed as anomalies and listed in Table 1 and Figure 2) [21], are used alongside the lake water level records. Due to the substantial gaps and inconsistencies in the hydro-meteorological data records, which hinder the development of robust hydrological models that adequately account for parameters such as runoff and groundwater information, the model scenarios are all developed solely based on the relationship between LSAOOs and lake water levels. In this regard, the effect of climate anomalies and changes on the selected water bodies (i.e., Lake Uluabat and Lake Iznik) is evaluated and incorporated into the model development without further data manipulation. It is noteworthy that the ENSO, Niño 1 + 2, Niño 3, Niño 3.4, and Niño 4 indices are not always independent of each other, and usually serve as fundamental components of sea surface temperature (SST) in the Pacific Ocean. However, recent studies have examined these indices separately due to various phenomena that may occur globally.

Table 1. LSAOO indices used in this study [21]. (SST: sea surface temperature; SLP: sea level pressure; AP: atmospheric pressure).

Figure 2. Location of large-scale oscillations used in the study.

2.3. Models

In the first step, to select the best combination of input variables and scenarios, the Spearman rank-order correlation coefficient is used. Unlike Pearson’s correlation, which measures linear dependence, the Spearman correlation coefficient measures the monotonic relationship between the selected variables. In other words, it determines whether the output (i.e., lake water level) consistently increases or decreases as the inputs (i.e., climatic oscillations) change. The p-values are then evaluated at the 0.05 critical level to define the significance of the teleconnections. Subsequently, based on the selected climate indices (i.e., input scenarios), the main structures of the models are created. Initially, 80% (288 months) of the total data records (360 months) are used for model development and calibration, while the remaining 20% (72 months) are reserved for the testing phase to confirm the model’s credibility and further assess lake water level predictions. By using 80% of the data, maximum acquisition of historical information is ensured, while the remaining 20% are drawn from recent records to enhance the predictability of trends, seasonality, and other time-dependent factors.
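The screening and splitting steps described above can be illustrated with a minimal sketch. It is not the code used in the study: the function names (`average_ranks`, `spearman`, `select_inputs`, `chronological_split`) are hypothetical, and the p-value relies on a large-sample normal approximation of the t statistic, which is an assumption made here to keep the example free of external libraries.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Return (rho, approximate two-sided p-value) for paired samples."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    rho = cov / (sx * sy)  # Pearson correlation of the ranks
    denom = 1.0 - rho * rho
    if denom <= 0.0:
        return rho, 0.0
    # Assumption: for long series the t statistic is treated as standard normal.
    t = rho * math.sqrt((n - 2) / denom)
    return rho, math.erfc(abs(t) / math.sqrt(2.0))

def select_inputs(indices, water_level, alpha=0.05):
    """Keep the names of indices whose correlation with the level is significant."""
    return [name for name, series in indices.items()
            if spearman(series, water_level)[1] < alpha]

def chronological_split(series, train_fraction=0.8):
    """Split a time-ordered series into an early training and a late testing part."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]
```

The split is chronological rather than random, so the testing set always consists of the most recent records, as in the text.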

2.3.1. Extreme Learning Method (ELM)

The Extreme Learning Method (ELM) is a type of feedforward neural network method with a single hidden layer which generates robust generalization capacity through providing random hidden nodes [22]. In this regard, the single-layer feedforward neural network (SLFFNN) with L hidden neurons can be written as

f_{L}(x) = \sum_{i=1}^{L} β_{i} G(a_{i}, b_{i}, x), \quad x, a_{i}, b_{i} \in \mathbb{R}^{n}

(1)

where β_i is the weight matrix which links the ith hidden and output nodes, a_i and b_i stand for learning parameters, and G(a_i,b_i,x) is the output of the ith node for x as an input parameter. In this phase, various types of activation functions can be implemented during the modeling process, e.g., sigmoid, hard-limit, radial basis, sine, and triangular basis functions. For non-linear problems, the sigmoid function provides better results as it detects the complex pattern within the data [23].

The ELM does not require iterative tuning of the hidden-layer parameters, which significantly reduces the computational complexity and duration of the training stage (performed with 80% of the data in this study). By randomly assigning weights and biases to the hidden nodes, the ELM achieves high-speed learning and generalization ability, making it particularly useful in the determination of model scenarios where swift and accurate predictions are needed. The ELM algorithm operates by initially randomizing the input weights and biases of the hidden layer. These parameters remain fixed, eliminating the need for backpropagation. Once the hidden layer’s output is determined, the algorithm applies a least-squares solution to determine the output weights, ensuring minimal error between the predicted and observed data. This streamlined process not only accelerates training but also enhances the model’s ability to generalize from data, making the ELM a highly efficient alternative to conventional neural network training techniques.
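A minimal sketch of this training procedure is given below, assuming NumPy, a sigmoid activation, and hidden parameters drawn uniformly from [−1, 1]. The function names, the number of hidden nodes, and the sampling range are illustrative assumptions, not the settings of the study.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def elm_fit(X, y, n_hidden=20, seed=0):
    """Train a single-hidden-layer ELM: random hidden layer, least-squares output."""
    rng = np.random.default_rng(seed)
    # Hidden weights and biases are drawn once and never updated (no backpropagation).
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    H = sigmoid(X @ W + b)          # hidden-layer output matrix
    beta = np.linalg.pinv(H) @ y    # least-squares output weights via the pseudo-inverse
    return W, b, beta

def elm_predict(X, W, b, beta):
    """Evaluate f_L(x) as the weighted sum of the hidden-node outputs."""
    return sigmoid(X @ W + b) @ beta

# Synthetic illustration only; these are not lake or climate data.
X_demo = np.linspace(-1.0, 1.0, 50).reshape(-1, 1)
y_demo = np.sin(3.0 * X_demo[:, 0])
W_demo, b_demo, beta_demo = elm_fit(X_demo, y_demo)
train_rmse = np.sqrt(np.mean((elm_predict(X_demo, W_demo, b_demo, beta_demo) - y_demo) ** 2))
```

Because the only fitted quantity is `beta`, training reduces to a single linear solve, which is the source of the speed advantage described above.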

2.3.2. Genetic Algorithm (GA)

First developed by Holland, the genetic algorithm (GA) utilizes a searching approach to find an answer for an optimization problem [24]. It is a type of evolutionary algorithm which applies crossover and mutation in the analysis. The GA was established based on Darwin’s principles to detect an optimum solution relying on natural selection. It solves the problem using a genetic evolution paradigm that models gene evolution. Initially, the GA begins with a basic population size where, at the first stage, the fitness quantity for each chromosome is computed via a fitness function. From those, the one that is closest to the optimal response is selected as a parent to generate offspring chromosomes for the subsequent generation. In order to create a new generation, several formerly generated chromosomes perform mutation and recombination operations to produce new chromosomes.

The GA iteratively generates and examines candidate solutions until a satisfactory outcome is reached. Initially, a population of chromosomes representing random solutions is generated. Each chromosome consists of several individual components called genes, which contribute to the overall solution. A fitness function is then used to evaluate the chromosomes, where fitter chromosomes have a higher probability of being selected for reproduction. Through crossover and mutation, new generations of chromosomes are gradually produced to improve the overall population. Such a method explores effective solutions without evaluating every possibility [24].
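The loop described above can be sketched as a small real-coded GA. The specific operators chosen here (binary tournament selection, arithmetic crossover, Gaussian mutation, and keeping the best chromosome unchanged) are assumptions made for illustration; the text does not state which operators were used, and all parameter values are placeholders.

```python
import random

def genetic_minimize(fitness, n_genes, pop_size=30, generations=50,
                     crossover_rate=0.8, mutation_rate=0.1, seed=0):
    """Minimize `fitness` over real-valued chromosomes of length n_genes."""
    rng = random.Random(seed)
    # Initial population of random chromosomes (candidate solutions).
    pop = [[rng.uniform(-1.0, 1.0) for _ in range(n_genes)] for _ in range(pop_size)]

    def tournament():
        # Fitter chromosomes are more likely to be selected as parents.
        a, b = rng.sample(pop, 2)
        return a if fitness(a) < fitness(b) else b

    for _ in range(generations):
        elite = min(pop, key=fitness)
        children = [elite[:]]  # elitism: the best chromosome survives unchanged
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            if rng.random() < crossover_rate:
                w = rng.random()  # arithmetic crossover blends the two parents
                child = [w * g1 + (1.0 - w) * g2 for g1, g2 in zip(p1, p2)]
            else:
                child = p1[:]
            # Each gene is perturbed with a small probability (mutation).
            child = [g + rng.gauss(0.0, 0.1) if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        pop = children
    return min(pop, key=fitness)
```

For example, `genetic_minimize(lambda c: sum(g * g for g in c), n_genes=3)` searches for the minimum of a simple quadratic function; in the hybrid model the fitness would instead be a prediction error.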

2.3.3. Invasive Weed Optimization (IWO)

Relying on weed adaptability, survival, and propagation, Mehrabian and Lucas proposed the Invasive Weed Optimization (IWO) algorithm as an evolutionary optimization algorithm [25]. The weed’s behavior in searching for a better living environment, adapting to new environments, and resisting change was the inspiration for the development of the IWO algorithm. At first, a weed seeks to generate a higher number of offspring in order to grow in quantity and cover the existing environment. In the first stage, the seeds of the primary population are produced and dispersed. In the second stage, once the seeds grow and evolve into plants, they are dispersed according to their suitability and fitness. In the third stage, offspring seeds are scattered around their parents, and then the previous stages continue iteratively until the population reaches a certain size. Its superiority lies in its adaptive nature and robust search capabilities. IWO dynamically adjusts its search space by simulating the spread and competition of weeds, which allows it to explore a broad solution space efficiently. The algorithm excels at balancing exploration and exploitation, making it particularly effective in solving complex, nonlinear optimization problems. Its ability to escape local optima and converge on global solutions with high precision distinguishes IWO as a powerful tool in optimization tasks.

The IWO algorithm stages include initial population generation, fitness examination, and seed generation. The number of seeds produced by each plant depends on its fitness and can be written as follows:

S_{i} = \mathrm{floor}\left(s_{\min} + \frac{s_{\max} - s_{\min}}{f_{\max} - f_{\min}} \times (f(x_{i}) - f_{\min})\right)

(2)

in which S_i is the number of seeds generated by a weed x_i; f(x_i) is the fitness of x_i; floor denotes rounding down to the nearest integer; f_min and f_max are, respectively, the minimum and maximum of f(x_i) within the population; and s_min and s_max are the numbers of seeds generated by the least desirable and most beneficial weeds within the population, respectively. The produced seeds are dispersed following a normal distribution with zero mean and a varying variance. Once the number of plants reaches the predefined limit, the extra plants with lower fitness are removed. Finally, the previous steps are repeated until the termination condition is satisfied [25].
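As a worked illustration of Equation (2), the sketch below computes the seed count and a shrinking dispersal standard deviation. The function names are hypothetical. The decay schedule in `dispersal_sd` is the form commonly attributed to the original IWO formulation and is included as an assumption, since the text only states that the variance varies; the handling of the degenerate case f_max = f_min is likewise an assumption.

```python
import math

def seed_count(f_i, f_min, f_max, s_min, s_max):
    """Equation (2): seeds grow linearly from s_min (worst weed) to s_max (best weed)."""
    if f_max == f_min:
        return s_min  # assumption: all weeds equally fit, so each gets the minimum
    return math.floor(s_min + (s_max - s_min) / (f_max - f_min) * (f_i - f_min))

def dispersal_sd(iteration, max_iterations, sd_initial, sd_final, exponent=2):
    """Standard deviation of the zero-mean normal dispersal, decaying over iterations."""
    remaining = (max_iterations - iteration) / max_iterations
    return remaining ** exponent * (sd_initial - sd_final) + sd_final

# With s_min = 1 and s_max = 5, a weed halfway between the worst and best fitness
# produces floor(1 + 4 * 0.5) = 3 seeds.
seeds_for_median_weed = seed_count(0.5, 0.0, 1.0, 1, 5)
```

Equation (2) treats a larger f(x_i) as better. When the objective is an error to be minimized, one option is to pass the negative of the error as the fitness.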

2.3.4. Hybrid ELM-GA and ELM-IWO Models

In the current study, to improve the computational ability of the ELM in modeling the lake water level under the LSAOO scenarios, two optimization algorithms, the GA and IWO, are implemented. Through the hybridization process, the weights and biases of the hidden layer are optimized to obtain better generalization performance and to mitigate the errors that typically arise from random initialization. Such an approach provides better computational results than the stand-alone algorithms [26,27,28].

For the case of the ELM-GA model, the hidden-layer parameters are optimized using the GA. The process starts with a population of candidate solutions in which every individual denotes a set of biases and weights. The fitness of each individual is evaluated using the root-mean-squared error. Then, crossover and mutation operations are implemented to produce new solutions, and individuals with better performance are chosen. This process continues for several generations, modifying the solutions until the optimal parameter set is reached. Within the ELM-IWO model development process, IWO is implemented to iteratively enhance the hidden-layer parameters in the ELM. The modeling procedure begins with an initial population of weeds in which each weed represents a potential solution. Weeds reproduce by producing seeds in proportion to their fitness. The seeds are dispersed in the search space with a gradually decreasing variance. The weeds that have better performance are then selected to spread in the subsequent generation. This process is repeated until the optimal solution is reached.

In this regard, the hybridization of the ELM with the GA or the IWO enhances the ELM’s performance, offering different optimization strategies. The ELM-GA hybrid employs the GA’s evolutionary mechanisms, such as selection and mutation, to explore a vast search space, making it ideal for global optimization but potentially slower in convergence. Conversely, the ELM-IWO hybrid uses IWO’s adaptive, nature-inspired approach to dynamically adjust the search space, offering quicker convergence and precise optimization. While the ELM-GA excels in robustness, the ELM-IWO is superior in efficiency and speed.
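The ELM-IWO coupling can be sketched as follows, assuming NumPy. Each weed is a flattened vector of hidden weights and biases, and its cost is the training RMSE obtained after solving the output weights by least squares. All names and parameter values are illustrative placeholders rather than the settings in Table 2, and the decay exponent of 2 for the dispersal standard deviation is an assumption.

```python
import numpy as np

def elm_rmse(params, X, y, n_hidden):
    """Cost of one weed: training RMSE of an ELM with hidden parameters `params`."""
    n_in = X.shape[1]
    W = params[: n_in * n_hidden].reshape(n_in, n_hidden)
    b = params[n_in * n_hidden:]
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))   # sigmoid hidden layer
    beta = np.linalg.pinv(H) @ y              # least-squares output weights
    return float(np.sqrt(np.mean((H @ beta - y) ** 2)))

def iwo_tune_elm(X, y, n_hidden=10, pop_init=5, pop_max=15, iterations=30,
                 s_min=1, s_max=4, sd_initial=0.5, sd_final=0.01, seed=0):
    """Search hidden weights and biases with IWO; return best weed and cost history."""
    rng = np.random.default_rng(seed)
    dim = X.shape[1] * n_hidden + n_hidden
    weeds = [rng.uniform(-1.0, 1.0, dim) for _ in range(pop_init)]
    costs = [elm_rmse(w, X, y, n_hidden) for w in weeds]
    history = [min(costs)]
    for it in range(iterations):
        # Dispersal standard deviation shrinks as the iterations proceed.
        sd = ((iterations - it) / iterations) ** 2 * (sd_initial - sd_final) + sd_final
        worst, best = max(costs), min(costs)
        offspring = []
        for w, c in zip(weeds, costs):
            # Lower RMSE is fitter, so fitness = -cost is used in Equation (2).
            share = 1.0 if worst == best else (worst - c) / (worst - best)
            n_seeds = int(np.floor(s_min + (s_max - s_min) * share))
            offspring += [w + rng.normal(0.0, sd, dim) for _ in range(n_seeds)]
        weeds += offspring
        costs += [elm_rmse(w, X, y, n_hidden) for w in offspring]
        # Competitive exclusion: only the pop_max fittest plants survive.
        keep = np.argsort(costs)[:pop_max]
        weeds = [weeds[i] for i in keep]
        costs = [costs[i] for i in keep]
        history.append(costs[0])
    return weeds[0], history
```

Because parents compete with their offspring, the best training cost never increases from one iteration to the next. The sketch scores weeds on the training data only, so any assessment of generalization would still require the held-out testing set.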

As detailed, the hybrid ELM-GA and ELM-IWO models improve the modeling and prediction ability of the stand-alone ELM using the robustness of the GA and IWO. In this regard, the optimization parameters used in development of GA and IWO are given in Table 2.

Table 2. Setting parameters for ELM-GA and ELM-IWO.

2.3.5. Performance Evaluation Criteria

To assess the effectiveness of the models, the Root Mean Square Error (RMSE), Nash–Sutcliffe Efficiency (NSE), Lin’s Concordance Correlation Coefficient (CCC), determination coefficient (R²), Akaike Information Criterion (AIC), and scatter plots are employed. The NSE evaluates the proportion of variation in the outcomes attributed to the input variables (i.e., the variance explained by the model), while the RMSE quantifies the average discrepancy between predicted and observed values, and the CCC illustrates the concordance between predicted and observed values. The R², however, measures how well the selected model predicts the observed values, while the AIC gives credit to the most parsimonious model. Hence, the RMSE, NSE, CCC, R², and AIC can be calculated as follows:

\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum {(x_{t} - y_{t})}^{2}}

(3)

\mathrm{NSE} = 1 - (\frac{\sum {(x_{t} - y_{t})}^{2}}{\sum {(x_{t} - \bar{x})}^{2}})

(4)

\mathrm{CCC} = \frac{2 r σ_{x} σ_{y}}{{(\bar{x} - \bar{y})}^{2} + σ_{x}^{2} + σ_{y}^{2}}

(5)

R^{2} = {(\frac{\sum (x_{t} - \bar{x}) (y_{t} - \bar{y})}{\sqrt{\sum {(x_{t} - \bar{x})}^{2}} \sqrt{\sum {(y_{t} - \bar{y})}^{2}}})}^{2}

(6)

\mathrm{AIC} = N \ln (\frac{N}{N - n - 1}) σ_{e}^{2} + 2 (n + 1)

(7)

where x and y are observed and predicted values, respectively; t is related to the consecutive number of months; \bar{x} and \bar{y} are the average values of the time series; r is Pearson’s correlation coefficient; n is the number of data in the set in Equation (3) and the number of predictors in Equation (7); N is the total number of data; and, finally, σ_{x}^{2}, σ_{y}^{2}, and σ_{e}^{2} are the variances of the observed, predicted, and residual time series, respectively.
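A minimal sketch of Equations (3)–(6) in plain Python follows. The function names are hypothetical, and the variances in the CCC are computed with a 1/n divisor, which is an assumption consistent with Lin’s usual definition. The AIC of Equation (7) is not covered here.

```python
import math

def _mean(values):
    return sum(values) / len(values)

def rmse(obs, pred):
    """Equation (3): root mean square error."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(obs, pred)) / len(obs))

def nse(obs, pred):
    """Equation (4): Nash-Sutcliffe efficiency."""
    mx = _mean(obs)
    num = sum((x - y) ** 2 for x, y in zip(obs, pred))
    den = sum((x - mx) ** 2 for x in obs)
    return 1.0 - num / den

def pearson_r(obs, pred):
    mx, my = _mean(obs), _mean(pred)
    cov = sum((x - mx) * (y - my) for x, y in zip(obs, pred))
    sx = math.sqrt(sum((x - mx) ** 2 for x in obs))
    sy = math.sqrt(sum((y - my) ** 2 for y in pred))
    return cov / (sx * sy)

def r_squared(obs, pred):
    """Equation (6): squared Pearson correlation."""
    return pearson_r(obs, pred) ** 2

def ccc(obs, pred):
    """Equation (5): Lin's concordance correlation coefficient."""
    n = len(obs)
    mx, my = _mean(obs), _mean(pred)
    vx = sum((x - mx) ** 2 for x in obs) / n    # variance of observations (1/n divisor)
    vy = sum((y - my) ** 2 for y in pred) / n   # variance of predictions (1/n divisor)
    r = pearson_r(obs, pred)
    return 2.0 * r * math.sqrt(vx) * math.sqrt(vy) / ((mx - my) ** 2 + vx + vy)
```

A prediction that is offset from the observations by a constant keeps R² at 1 while lowering the NSE and CCC, which is why the criteria are reported together.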

2.3.6. Summarizing the Modeling

The flowchart of the modeling procedure in Figure 3 summarizes the methodology detailed above. Initially, the teleconnection between lake water levels and the selected LSAOOs is investigated to establish a conceptual hydrologic model for prediction of the lake water levels. Since the data records in the region are not reliable, it is assumed that the indirect effect of LSAOOs such as SST, SLP, and/or AP on the Iznik and Uluabat lakes can be used as the key parameters in the prediction. To achieve this, the Spearman rank-order correlation coefficient is used to ensure the stability in the teleconnections.

Figure 3. Flowchart of the methodology.

Once the best scenario is obtained based on the teleconnections between LSAOOs and lake water levels, the ELM is performed, and results are enhanced once again either using the GA or the IWO algorithm. To perform this, the Spearman rank-order correlation is used in distinguishing the best scenarios, while several performance criteria such as the RMSE, NSE, R², and CCC are used in determining the performance of the models in the training and testing stages. In the training stage, the models are raised to the maximum caliber to depict the highest similarity and concordance between observed and predicted lake water levels. In the testing stage, however, the performance of the models already calibrated in the calibration steps is evaluated once again to monitor the expected bias in the prediction of the lake water levels. Once known, these criteria could be used in determination of the actions needed for future model calibrations and updates, sensitivity, and uncertainty.

3. Results

Initially, the most related LSAOOs are selected and used in determination of the best scenario models using the Spearman rank-order correlation coefficient (Table S1 in Supplementary Materials). The obtained results showed that the Niño 3.4, Niño 1 + 2, Niño 3, WHWP, TNA, NP, and IDO (7 inputs) are the most related LSAOOs to Lake Iznik, while the PDO, Niño 3.4, Niño 1 + 2, Niño 3, WHWP, TSA, EP/NP, NP, TNI, AMM, QBO, CAR, and IDO (13 inputs) are the most related LSAOOs to Lake Uluabat. According to Table S1 (Supplementary Materials), it is obvious that the climate indices are more connected with Lake Uluabat than Lake Iznik, most probably due to the anthropogenic changes or encroachment in the catchment.

Therefore, the selected LSAOOs are used in the prediction of the lake water response to the climatic changes in each lake. Afterward, the time series of the records are divided into training and test sets, which are then used in developing predictive models. The ELM is first calibrated and then updated using the hybrid ELM-GA and ELM-IWO. In this respect, Table 3 presents the performance criteria for the training and the testing stages of all models developed for Lake Iznik. The scatter plots in Figure 4 also depict similar statistics for the developed models. Based on the results, the ELM has the best performance in the training stage, achieving NSE: 0.969, RMSE: 0.064, CCC: 0.984, R²: 0.969, and AIC: 16.033, while the ELM-GA performs better in the testing stage, i.e., NSE: 0.929, RMSE: 0.092, CCC: 0.964, R²: 0.929, and AIC: 16.071. The superior training performance of the stand-alone ELM can be the outcome of overfitting, limited calibration data, bias in calibration, weights associated with the layers, number of neurons, and/or difference in metrics (i.e., in dimensionality). Hence, the hybridization of the ELM with the GA and IWO is effective according to the performance metrics used.

Table 3. Performance of the models in training and testing for Lake Iznik.

Figure 4. Performance of the models developed for Lake Iznik at (a) training and (b) testing stages.

Generally speaking, the training stage shown in Figure 4a is more robust than the testing stage (Figure 4b). But, according to the obtained results, we are confident that it is possible to use them in prediction of the future water levels of Lake Iznik. Therefore, it can be concluded that the GA method enhances the performance of the ELM model and the predictions are quite reasonable.

Results obtained for Lake Uluabat are given in Table 4 and the scatter plots in Figure 5. Similarly, the stand-alone ELM model outperforms the remaining models in the training stage, achieving NSE: 0.884, RMSE: 0.276, CCC: 0.939, R²: 0.884, and AIC: 29.091. However, the ELM-IWO, achieving NSE: 0.926, RMSE: 0.194, CCC: 0.962, R²: 0.927, and AIC: 28.582, showed the best performance in the testing stage (Table 4). These results, however, are weaker than the performance metrics obtained for Lake Iznik. This might be linked to the cyclic behavior of Lake Uluabat, which alternates the phase space, or the ephemeral behavior of the lake compared to Lake Iznik, which is bigger in size. The scatter plots of the training (Figure 5a) and the testing (Figure 5b) stages also confirm the statistics given in Table 4.

Table 4. Performance of the models in training and testing for Lake Uluabat.

Figure 5. Performance of the models developed for Lake Uluabat at (a) training and (b) testing stages.

4. Discussion

The study addresses the effect of large-scale climatic indices on the behavior of the lake water levels in Lake Uluabat and Lake Iznik, located in Turkey. The data records cover October 1990 to September 2019 in monthly record sequences (360 months), which are later divided into training (October 1990 to September 2013) and testing (October 2013 to September 2019) datasets accounting for 80% and 20% of the time series, respectively.

For deeper investigation, 24 well-documented and well-known LSAOOs representing different atmospheric, oceanic, or climatic events such as changes in SST are selected, and each is compared one by one with the lake water levels using the Spearman rank-order correlation. The Spearman rank-order correlation evaluates whether there is a significant monotonic relationship between the input and output, providing insight into how closely they follow each other’s rank over time [29]. The p-value is then used as a selection criterion defining the most relevant LSAOOs. It is concluded that Lake Uluabat is more connected with the climatic oscillations since the Spearman coefficients depicted higher values. More specifically, the Niño 3.4, Niño 1 + 2, Niño 3, WHWP, TNA, NP, and IDO (7 LSAOOs) are the most related LSAOOs to Lake Iznik, while the PDO, Niño 3.4, Niño 1 + 2, Niño 3, WHWP, TSA, EP/NP, NP, TNI, AMM, QBO, CAR, and IDO (13 LSAOOs) are the most related climatic indices to Lake Uluabat. Among these, Lake Uluabat with Niño 1 + 2 and Lake Iznik with Niño 3 and Niño 1 + 2 showed the highest teleconnection. However, none of them showed good concordance with the ENSO as a multivariable index, covering most of what is explained by the Niño series. Therefore, it can be concluded that the lake water level changes have more to do with the SST off the Pacific coast of South America near Peru than with the sea level pressure, zonal and meridional components of the surface wind, or the longwave radiation over the tropical Pacific [21]. These results are in line with the results obtained by Ghanbari and Brav [30] and Guo et al. [31], who, respectively, determined good teleconnection between the Trans-Niño Index (TNI) and Niño 3.4 (SST) and some lakes around the globe.

The performance of the models showed that the GA and IWO are closely competitive, but, generally speaking, the ELM-IWO outperforms the ELM-GA for accurate lake water level modeling and has a relatively lower AIC since it strikes a better balance between exploring and refining potential solutions. This balance allows IWO to find optimal results more quickly and reliably. Its method of adjusting population size and spreading solutions helps maintain diversity, making it less likely to get stuck in local optima, a common issue with GAs. Additionally, IWO is easier to implement and fine-tune, making it a more flexible and efficient choice, especially in complex situations where constraints are involved.

Based on the results, it can be concluded that employing the ELM and its integration with specialized methods represents a valuable approach for establishing the relationship between LSAOOs and lake water levels. Furthermore, in total agreement with the previous studies that have advocated for the use of the ELM, this method can be reliably selected for predicting water levels in lakes or reservoirs [13,14]. However, before the development of the model, it is essential to evaluate the efficacy of the input variables using techniques such as Spearman’s rank-order correlation, factor analysis, or the stepwise method.

The limitations of the study, however, are that it does not apply Representative Concentration Pathways (RCPs) and emission scenarios; does not investigate the teleconnection between LSAOOs and more directly related variables such as precipitation, which can be interpreted as the indirect link between LSAOOs and lake water level; and does not use hydrological parameters directly linked to the properties of the catchments.

5. Conclusions

This study investigated the teleconnection between 24 large-scale climatic indices and lake water level records in Lake Iznik and Lake Uluabat, located in western Turkey. The Spearman rank-order correlation was used to select the most effective oscillations, which were then used in the prediction of lake water levels. It was concluded that 7 and 13 indices are effective for Lake Iznik and Lake Uluabat, respectively. The selected indices were then used as input scenarios in the development of the ELM, ELM-GA, and ELM-IWO models, with 80% of the data used for training and 20% for testing. It was concluded that Lake Uluabat is more responsive to climatic oscillations than Lake Iznik, and that the ELM-IWO showed the best performance overall, although the ELM-GA performed better in predicting Lake Iznik’s water levels.

Supplementary Materials

The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/su16177825/s1: Table S1: Spearman rank-order correlation matrix (values with significant p-values are given in italics).

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are available through formal application to State Hydraulic Works.

Acknowledgments

The author is thankful to the State Hydraulic Works for providing the data used in the analysis. The author also wishes to express his gratitude to Babak Vaheddoost and Mir Jafar Sadegh Safari for their help and for sharing their experience in the development of the scenario models and in conducting this research.

Conflicts of Interest

The author declares no conflicts of interest.

References

1. Xu, N.; Lu, H.; Li, W.; Gong, P. Natural lakes dominate global water storage variability. Sci. Bull. 2024, 69, 1016–1019.
2. Vaheddoost, B.; Fathian, F.; Gul, E.; Safari, M.J.S. Studying the Changes in the Hydro-Meteorological Components of Water Budget in Lake Urmia. Water Resour. Res. 2022, 58, e2022WR032030.
3. Angel, J.R.; Kunkel, K.E. The response of Great Lakes water levels to future climate scenarios with an emphasis on Lake Michigan-Huron. J. Great Lakes Res. 2010, 36, 51–58.
4. Haghighi, A.T.; Kløve, B. A sensitivity analysis of lake water level response to changes in climate and river regimes. Limnologica 2015, 51, 118–130.
5. Woolway, R.I.; Kraemer, B.M.; Lenters, J.D.; Merchant, C.J.; O’Reilly, C.M.; Sharma, S. Global lake responses to climate change. Nat. Rev. Earth Environ. 2020, 1, 388–403.
6. Bai, B.; Mu, L.; Ma, C.; Chen, G.; Tan, Y. Extreme water level changes in global lakes revealed by altimetry satellites since the 2000s. Int. J. Appl. Earth Obs. Geoinf. 2024, 127, 103694.
7. Aminjafari, S.; Brown, I.A.; Frappart, F.; Papa, F.; Blarel, F.; Mayamey, F.V.; Jaramillo, F. Distinctive patterns of water level change in Swedish lakes driven by climate and human regulation. Water Resour. Res. 2024, 60, e2023WR036160.
8. Wu, C.; Liu, G.; Cong, L.; Li, X.; Liu, X.; Liu, Y.; Deyan, W.; Zhang, Y.; Bai, D. ENSO-driven hydroclimate changes in central Tibetan Plateau since middle Holocene: Evidence from Zhari Namco’s lake sediments. Quat. Sci. Rev. 2024, 330, 108593.
9. Fuentes-Aguilera, P.; Rodríguez-López, L.; Bourrel, L.; Frappart, F. Recovery of Time Series of Water Volume in Lake Ranco (South Chile) through Satellite Altimetry and Its Relationship with Climatic Phenomena. Water 2024, 16, 1997.
10. Mologni, C.; Revel, M.; Chaumillon, E.; Malet, E.; Coulombier, T.; Sabatier, P.; Brigode, P.; Herve, G.; Develle, A.L.; Schenini, L.; et al. 50-year seasonal variability in East African droughts and floods recorded in central Afar lake sediments (Ethiopia) and their connections with the El Niño–Southern Oscillation. Clim. Past 2024, 20, 1837–1860.
11. Shiri, J.; Shamshirband, S.; Kisi, O.; Karimi, S.; Bateni, S.M.; Hosseini Nezhad, S.H.; Hashemi, A. Prediction of water-level in the Urmia Lake using the extreme learning machine approach. Water Resour. Manag. 2016, 30, 5217–5229.
12. Zhu, S.; Hrnjica, B.; Ptak, M.; Choiński, A.; Sivakumar, B. Forecasting of water level in multiple temperate lakes using machine learning models. J. Hydrol. 2020, 585, 124819.
13. Zhen, L.; Bărbulescu, A. Comparative Analysis of Convolutional Neural Network-Long Short-Term Memory, Sparrow Search Algorithm-Backpropagation Neural Network, and Particle Swarm Optimization-Extreme Learning Machine Models for the Water Discharge of the Buzău River, Romania. Water 2024, 16, 289.
14. Sales, A.K.; Gul, E.; Safari, M.J.S.; Ghodrat Gharehbagh, H.; Vaheddoost, B. Urmia lake water depth modeling using extreme learning machine-improved grey wolf optimizer hybrid algorithm. Theor. Appl. Climatol. 2021, 146, 833–849.
15. Vaheddoost, B. Spatial analysis of large atmospheric oscillations and annual precipitation in Lake Urmia basin. Eur. Water 2017, 59, 123–129.
16. Wang, J.; Kessler, J.; Bai, X.; Clites, A.; Lofgren, B.; Assuncao, A.; Bratton, J.; Chu, P.; Leshkevich, G. Decadal variability of Great Lakes ice cover in response to AMO and PDO, 1963–2017. J. Clim. 2018, 31, 7249–7268.
17. Komatsu, E.; Fukushima, T.; Harasawa, H. A modeling approach to forecast the effect of long-term climate change on lake water quality. Ecol. Model. 2007, 209, 351–366.
18. Fathian, F.; Vaheddoost, B. Conceptualization of the indirect link between climate variability and lake water level using conditional heteroscedasticity. Hydrol. Sci. J. 2021, 66, 1907–1923.
19. Ozdemir, S.; Yaqub, M.; Yildirim, S.O. A systematic literature review on lake water level prediction models. Environ. Model. Softw. 2023, 163, 105684.
20. Kottek, M.; Grieser, J.; Beck, C.; Rudolf, B.; Rubel, F. World Map of the Köppen-Geiger climate classification updated. Meteorol. Z. 2006, 15, 259–263.
21. National Oceanic and Atmospheric Administration (NOAA). Climate Indices: Monthly Atmospheric and Ocean Time Series. Available online: https://psl.noaa.gov/data/climateindices/list/ (accessed on 1 August 2024).
22. Huang, G.B.; Zhu, Q.Y.; Siew, C.K. Extreme learning machine: A new learning scheme of feedforward neural networks. In Proceedings of the IEEE International Joint Conference on Neural Networks (IEEE Cat. No. 04CH37541), Budapest, Hungary, 25–29 July 2004; Volume 2, pp. 985–990. Available online: https://ieeexplore.ieee.org/xpl/conhome/9486/proceeding (accessed on 1 August 2024).
23. Safari, M.J.S.; Ebtehaj, I.; Bonakdari, H.; Eshaghi, M.S. Sediment transport modeling in rigid boundary open channels using generalize structure of group method of data handling. J. Hydrol. 2019, 577, 123951.
24. Holland, J.H. Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence; MIT Press: Cambridge, MA, USA, 1992; Available online: https://ieeexplore.ieee.org/book/6267401 (accessed on 1 August 2024).
25. Mehrabian, A.R.; Lucas, C. A novel numerical optimization algorithm inspired from weed colonization. Ecol. Inform. 2006, 1, 355–366.
26. Safari, M.J.S.; Mohammadi, B.; Kargar, K. Invasive weed optimization-based adaptive neuro-fuzzy inference system hybrid model for sediment transport with a bed deposit. J. Clean. Prod. 2020, 276, 124267.
27. Vaheddoost, B.; Guan, Y.; Mohammadi, B. Application of hybrid ANN-whale optimization model in evaluation of the field capacity and the permanent wilting point of the soils. Environ. Sci. Pollut. Res. 2020, 27, 13131–13141.
28. Gul, E.; Staiou, E.; Safari, M.J.S.; Vaheddoost, B. Enhancing Meteorological Drought Modeling Accuracy Using Hybrid Boost Regression Models: A Case Study from the Aegean Region, Türkiye. Sustainability 2023, 15, 11568.
29. Zar, J.H. Spearman Rank Correlation. In Biostatistical Analysis, 5th ed.; Pearson Prentice-Hall: Hoboken, NJ, USA, 2005; pp. 388–394.
30. Ghanbari, R.N.; Bravo, H.R. Coherence between atmospheric teleconnections, Great Lakes water levels, and regional climate. Adv. Water Resour. 2008, 31, 1284–1298.
31. Guo, J.; Sun, J.; Chang, X.; Guo, S.; Liu, X. Correlation analysis of NINO3.4 SST and inland lake level variations monitored with satellite altimetry: Case studies of lakes Hongze, Khanka, La-ang, Ulungur, Issyk-kul and Baikal. TAO Terr. Atmos. Ocean. Sci. 2011, 22, 2.

Figure 1. Study area and location of Lake Uluabat and Lake Iznik.

Figure 2. Location of large-scale oscillations used in the study.

Figure 3. Flowchart of the methodology.

Figure 4. Performance of the models developed for Lake Iznik at (a) training and (b) testing stages.

Figure 5. Performance of the models developed for Lake Uluabat at (a) training and (b) testing stages.

Table 1. LSAOO indices used in this study [21]. (SST: sea surface temperature; SLP: sea level pressure; AP: atmospheric pressure).

No.	LSAOO	Full name	Description
1	AMM	Atlantic Meridional Mode	Cross-equatorial meridional difference in SST anomaly in the tropical Atlantic
2	AO	Arctic Oscillation	Back-and-forth shifting of AP between the Arctic and the mid-latitudes of the North Pacific and North Atlantic
3	CAR	Caribbean SST Index	SST anomalies averaged over the Caribbean
4	EA/WR	Eastern Asia/Western Russia	Large-scale anomalies over the Caspian Sea toward western Europe
5	ENSO	Multivariate El Niño–Southern Oscillation Index	Empirical orthogonal function of SLP, SST, zonal, and meridional components of the surface wind and outgoing longwave radiation over the tropical Pacific basin (30° S–30° N and 100° E–70° W)
6	EP/NP	East Pacific/North Pacific Oscillation	Spring–Summer–Fall pattern focused on three anomaly centers
7	IDO	Indian Dipole Oscillation	Anomalous SST gradient between the western equatorial Indian Ocean (50° E–70° E and 10° S–10° N) and the south eastern equatorial Indian Ocean (90° E–110° E and 10° S–0° N)
8	NAO	North Atlantic Oscillations	Difference in normalized pressure between Iceland and the Azores
9	Niño 1 + 2	Extreme Eastern Tropical Pacific SST	The average SST between 0°~10° S and 90°~80° W
10	Niño 3	Eastern Tropical Pacific SST	The average SST between 5° S~5° N and 150°~90° W
11	Niño 3.4	East Central Tropical Pacific SST	The average SST between 5° S~5° N and 170°~120° W
12	Niño 4	Central Tropical Pacific SST	The average SST between 5° S~5° N and 160° E~150° W
13	NOI	Northern Oscillation Index	The difference in SLP between Darwin in Australia and the North Pacific High
14	NP	North Pacific pattern	Sea level differences between 30°~65° N and 160° E~145° W
15	ONI	Oceanic Niño Index	3-month averaged SST in the east-central tropical Pacific (120°–170° W) near the International Dateline
16	PDO	Pacific Decadal Oscillation	Temporal covariance matrix of SST in the Pacific
17	PNA	Pacific North American index	Anomalies in the geopotential height fields of 700 or 500 mb over the western and eastern US
18	QBO	Quasi-Biennial Oscillation	A quasiperiodic oscillation between easterlies and westerlies of the equatorial zonal wind in the tropical stratosphere
19	SOI	Southern Oscillation Index	The difference between normalized pressure in Tahiti and Darwin
20	TNA	Tropical Northern Atlantic index	SST between 15°~57.5° W and 5.5°~23.5° N
21	TNI	Trans-Niño Index	The difference between normalized anomalies of SST between Niño 1 + 2 and Niño 4
22	TSA	Tropical Southern Atlantic index	SST between 30° W~10° E and 20° S~0°
23	WHWP	Western Hemisphere Warm Pool	The SST (when it is hotter than 28.5 °C) in the eastern North Pacific and Atlantic oceans
24	WP	Western Pacific Index	Low-frequency oscillations over the North Pacific

Table 2. Setting parameters for ELM-GA and ELM-IWO.

GA parameter	Value	IWO parameter	Value
Population size	50	Population size	50
Number of generations	100	Number of generations	100
Reproduction method	Crossover and mutation	Reproduction method	Based on fitness
Crossover fraction	0.8	Number of seeds	1–5
Mutation function	mutationadaptfeasible	Sigma	0.001–0.1

Table 3. Performance of the models in training and testing for Lake Iznik.

Model	Stage	RMSE	NSE	CCC	R²	AIC
ELM	Training	0.064	0.969	0.984	0.969	16.033
ELM	Testing	0.208	0.634	0.822	0.692	16.342
ELM-GA	Training	0.092	0.935	0.966	0.935	16.069
ELM-GA	Testing	0.092	0.929	0.964	0.928	16.071
ELM-IWO	Training	0.067	0.966	0.983	0.965	16.036
ELM-IWO	Testing	0.096	0.921	0.960	0.925	16.076

Table 4. Performance of the models in training and testing for Lake Uluabat.

Model	Stage	RMSE	NSE	CCC	R²	AIC
ELM	Training	0.276	0.884	0.939	0.884	29.091
ELM	Testing	0.533	0.443	0.723	0.539	32.159
ELM-GA	Training	0.325	0.839	0.913	0.873	29.514
ELM-GA	Testing	0.289	0.836	0.918	0.839	29.276
ELM-IWO	Training	0.283	0.878	0.935	0.878	29.149
ELM-IWO	Testing	0.194	0.926	0.962	0.927	28.582
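The criteria reported in Tables 3 and 4 can be computed from paired observed and predicted series. The Python sketch below uses common textbook definitions and is offered only as an illustration: the function names and sample values are hypothetical, R² is assumed to be the squared Pearson correlation, CCC is assumed to be Lin's concordance correlation coefficient, and AIC is omitted because its formulation is not given in this section.

```python
import math

def rmse(obs, sim):
    """Root mean square error."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 minus error variance over observed variance."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den

def _moments(obs, sim):
    """Means, population variances, and covariance of the two series."""
    n = len(obs)
    mo, ms = sum(obs) / n, sum(sim) / n
    var_o = sum((o - mo) ** 2 for o in obs) / n
    var_s = sum((s - ms) ** 2 for s in sim) / n
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim)) / n
    return mo, ms, var_o, var_s, cov

def ccc(obs, sim):
    """Lin's concordance correlation coefficient (penalizes bias and scale shift)."""
    mo, ms, var_o, var_s, cov = _moments(obs, sim)
    return 2.0 * cov / (var_o + var_s + (mo - ms) ** 2)

def r_squared(obs, sim):
    """Squared Pearson correlation (insensitive to constant bias)."""
    _, _, var_o, var_s, cov = _moments(obs, sim)
    return cov ** 2 / (var_o * var_s)

# Hypothetical values for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
biased = [o + 1.0 for o in observed]  # perfect pattern, constant offset

scores = {
    "RMSE": rmse(observed, predicted),
    "NSE": nse(observed, predicted),
    "CCC": ccc(observed, predicted),
    "R2": r_squared(observed, predicted),
}
```

The `biased` series shows why the tables report several criteria side by side: a constant offset leaves the squared correlation at 1 while lowering both NSE and CCC, so a model can track the pattern of the lake level well and still misestimate its magnitude.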

© 2024 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
