Salinity Prediction Based on Improved LSTM Model in the Qiantang Estuary, China

Zheng, Rong; Sun, Zhilin; Jiao, Jiange; Ma, Qianqian; Zhao, Liqin

doi:10.3390/jmse12081339

by Rong Zheng ¹, Zhilin Sun ¹, Jiange Jiao ²,*, Qianqian Ma ² and Liqin Zhao ²

¹ College of Civil Engineering and Architecture, Zhejiang University, Hangzhou 310058, China

² College of Mechanical and Electrical Engineering, China Jiliang University, Hangzhou 310018, China

* Author to whom correspondence should be addressed.

J. Mar. Sci. Eng. 2024, 12(8), 1339; https://doi.org/10.3390/jmse12081339 (registering DOI)

Submission received: 17 June 2024 / Revised: 21 July 2024 / Accepted: 23 July 2024 / Published: 7 August 2024

(This article belongs to the Topic Sustainable River and Lake Restoration: From Challenges to Solutions)

Abstract:

Accurate prediction of estuarine salinity can effectively mitigate the adverse effects of saltwater intrusion and help ensure the safety of water resources in estuarine regions. Presently, diverse data-driven models, mainly neural network models, have been employed to predict tidal estuarine salinity with considerable success. To address the nonlinear and nonstationary features of estuarine salinity sequences, this paper proposes a multi-factor salinity prediction model using an enhanced Long Short-Term Memory (LSTM) network. To improve prediction accuracy, input variables of the model were determined through Grey Relational Analysis (GRA) combined with estuarine dynamic analysis, and hyperparameters for the LSTM model were optimized using a multi-strategy Improved Sparrow Search Algorithm (ISSA). The proposed ISSA-LSTM model was applied to predict salinity at the Cangqian and Qibao stations in the Qiantang Estuary of China, based on measured data from 2011–2012. The model performance was evaluated by mean absolute error (MAE), mean absolute percentage error (MAPE), root mean square error (RMSE), and Nash-Sutcliffe efficiency (NSE). The results show that, compared to other models including the Back Propagation neural network (BP), Gated Recurrent Unit (GRU), and LSTM models, the new model has smaller errors and higher prediction accuracy, with NSE improved by 8–32% and the other metrics (MAE, MAPE, RMSE) improved by 15–67%. Meanwhile, compared with the LSTM optimized with the original SSA (SSA-LSTM), the MAE, MAPE, and RMSE values of the new model decreased by 13–16%, 15–16%, and 11–13%, respectively, and the NSE value increased by 5–6%, indicating that the ISSA has a better hyperparameter optimization ability than the original SSA. Thus, the model provides a practical solution for the rapid and precise prediction of estuarine salinity.

Keywords:

saltwater intrusion; salinity prediction; improved long short-term memory; the Qiantang Estuary

1. Introduction

The intrusion of saltwater is a common natural phenomenon in tidal estuarine areas. In recent years, with rising sea levels, extreme weather events, and frequent human activities, the problem of saltwater intrusion has intensified, emerging as a pressing concern in many tidal estuarine regions [1]. Elevated salinity in water bodies not only threatens the safety of coastal city water supplies [2,3] but also leads to the deterioration of estuarine water quality and ecosystems [4,5]. Therefore, accurate estuarine salinity forecasting is of urgent and substantial practical importance. However, the salinity intrusion process is influenced by many temporally and spatially variable factors with complex interactions [6], resulting in high nonlinearity, randomness, and instability of estuarine salinity sequences [7,8,9], which poses challenges for accurate estuarine salinity forecasting.

Generally, models currently used for estuarine salinity prediction can be categorized into two types: numerical and data-driven. Numerical models are extensively employed in saltwater intrusion research [10,11,12,13]. These models typically simulate the dynamic process of saltwater intrusion by solving coupled equations for water flow and material transport. However, the performance of such models relies on a significant amount of observed data and computing power [8,14].

In recent years, with the rapid development of machine learning technology, data-driven models have been widely applied in water quality prediction [15,16,17], with neural network-based deep learning models being the primary representatives. Compared to numerical models, these models can establish relationships between parameters and avoid the constraints imposed by complex boundaries or initial conditions [16,18]. The sequence of estuarine salinity itself is highly nonlinear and nonstationary, with complex nonlinear interactions with different influencing factors, which makes salinity prediction a typical nonlinear problem. Compared to traditional machine learning methods such as the Artificial Neural Network (ANN) [19], deep learning methods can learn deeper nonlinear relations and have a stronger ability to forecast nonlinear and nonstationary data series [20,21]. Among the deep learning models, the Long Short-Term Memory (LSTM) model [22] has a stronger ability to capture nonlinear patterns in time series data, while considering the inherent characteristics of nonstationary time series data [23]. Due to its special gate architecture, the LSTM model can also overcome the vanishing and exploding gradient problems of conventional ANNs [24]. These advantages make the LSTM model more suitable for handling nonlinear, nonstationary, and long-term dependency problems [21], such as estuarine salinity prediction.

Presently, some scholars have employed LSTM models to predict estuarine salinity [25,26,27]. However, these models still rely on empirical selection when choosing input variables and hyperparameters, often requiring extensive experimentation for adjustments. In fact, determining hyperparameters is a crucial step in the development and training of neural networks. Inappropriate choices for hyperparameter values can lead to overfitting or underfitting, resulting in reduced prediction accuracy or deteriorated model performance [28,29]. With increasing dataset volume and model complexity, it will be harder to determine optimal hyperparameter values by subjective experience or extensive experiments. To improve prediction accuracy, swarm intelligence algorithms (SIA) are introduced into neural network models for hyperparameter optimization.

As a kind of bio-inspired heuristic algorithm, SIAs can simulate the evolutionary patterns, behavioral characteristics, or thinking modes of insects, birds, or other populations [30]. Among many commonly used SIAs, the Sparrow Search Algorithm (SSA) [31] is known for its high search accuracy, fast convergence speed, good stability, and robustness. Additionally, SSA requires fewer model parameters, making it a relatively simple algorithm with better global optimization capabilities in complex problem-solving environments [32]. However, original SSA also has some drawbacks, such as poor uniformity and predictability in the initialization of individuals, lack of step control, and individual mutation mechanisms [33]. These issues lead to reduced population diversity and traversal in the later stages of SSA, making it prone to premature convergence and falling into local optima [28].

Furthermore, determining input variables is crucial in developing data-driven prediction models. Appropriate feature selection can prevent the curse of dimensionality, reduce the time required for model training, mitigate the risk of algorithms falling into local optima, and improve model performance [34,35]. Estuarine salinity is influenced by many temporally and spatially variable factors, including runoff, tides, wind, river topography, sea level rise, and human activities [36,37]. For example, short-term factors such as wind and storm surges can bring about an abrupt shift in salinity [37,38]. Moreover, these factors have different time scales and exhibit highly complex and nonlinear relationships among themselves, introducing considerable uncertainty to salinity predictions. Therefore, selecting a reasonable method for correlation analysis is crucial for feature analysis in estuarine salinity prediction models.

The Qiantang Estuary of China is a typical macrotidal estuary with significant saltwater intrusion issues. There have been relatively few studies on the salinity prediction of the Qiantang Estuary based on neural networks to date. Xu et al. [39] established a salinity prediction model for the Qibao station based on the Back Propagation neural network (BP), using upstream discharge and downstream tidal range as control conditions. Li et al. [40] optimized the parameters of a wavelet neural network using Particle Swarm Optimization (PSO) and applied it to simulate salinity at the Ganpu station. Yang and Zhang [41] conducted a single-factor analysis and prediction of salinity based on the daily maximum salinity sequence at the Cangqian station using LSTM. However, so far, no scholar has applied an improved LSTM model with multiple variables to salinity prediction in the Qiantang Estuary.

Therefore, for the first time, this paper proposes a multivariate salinity prediction model based on LSTM optimized with an improved Sparrow Search Algorithm (ISSA-LSTM). The observational data from the Cangqian (CQ) and Qibao (QB) stations in the Qiantang Estuary during 2011–2012 were used for analysis. The prediction results of ISSA-LSTM were compared with those of other commonly used models, including the BP, GRU (Gated Recurrent Unit), and LSTM models, and an LSTM model optimized with the original SSA (SSA-LSTM). All models were trained and tested with the same datasets and inputs. The paper is organized as follows: Section 2 describes the study area and datasets; Section 3 introduces the framework of the ISSA-LSTM model and the methodology of each part; Section 4 presents the experimental results and analysis, and also introduces a sensitivity analysis based on the proposed model; Section 5 concludes the paper.

2. Materials

2.1. Study Area

The Qiantang River is the largest river in Zhejiang Province, China. The Qiantang Estuary extends from the Fuchunjiang power station to the mouth of Hangzhou Bay and exhibits a funnel-shaped plane. The location of the study area is shown in Figure 1a. The average annual runoff upstream is 318.8 × 10⁸ m³, but the average tidal range is 5.62 m, and the maximum tidal range can reach 9 m at Ganpu. Strong tidal force and relatively weak upstream runoff make the study area a world-famous macrotidal estuary but also cause more severe saltwater intrusion problems than other estuaries.

Hangzhou, the capital city of Zhejiang Province, relies heavily on the Qiantang River as its primary source of drinking water, supplying approximately 85% of the water for the Hangzhou municipal waterworks. In recent decades, saltwater intrusion in the Qiantang Estuary has threatened the freshwater supply security of Hangzhou during the dry season and spring tide every year, primarily from August to December [42]. As a result, salinity prediction is both significant and urgent for solving water quality problems due to saltwater intrusion and protecting water resources.

2.2. Data Collection

The Cangqian station (CQ) and Qibao station (QB) are permanent hydrological observation stations situated in the upper-middle section of the Qiantang Estuary, as shown in Figure 1b. These two stations have abundant observed data including continuous multi-year time series of daily salinity values, which can effectively capture the characteristics of salinity change in the Qiantang Estuary. Hence, this paper selected CQ and QB stations as the research sites.

Since the maximum salinity in the upstream river channel reflects the severity of saltwater intrusion, the maximum daily salinity is chosen as the predictive variable (hereinafter referred to as salinity). Salinity in the Qiantang Estuary is influenced not only by tide and salinity from the open sea but also by the upstream runoff discharge and instantaneous surface wind speed. Considering the above factors, relevant data were collected in this paper. Upstream discharge data were collected from the Fuchunjiang hydrological station (FCJ), downstream tidal level data were collected from the Ganpu station (GP), salinity data were collected from CQ, QB, and GP stations, and wind speed data were collected from the Hangzhou meteorological station (HZ). The period of the datasets ranged from 1 January 2011 to 31 December 2012, covering two drought periods. Specific details of each dataset are shown in Table 1.

Due to variations in the measurement frequency of each dataset, all data were transformed into daily sequences to ensure consistent sequence lengths. The daily tidal range was calculated by averaging hourly tidal level data from the GP station. The daily average upstream discharge was calculated by averaging hourly discharge data from the FCJ station. The daily maximum salinity was calculated from multiple daily measurement results at the CQ and QB stations. Daily average wind speed was calculated based on hourly surface wind speed observations. Finally, a total of 731 sets of data were processed.
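The daily aggregation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `to_daily`, the timestamp format, and the sample values are hypothetical and do not come from the station records.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

def to_daily(records, reducer):
    """Group (timestamp, value) pairs by calendar day and reduce each group.

    `records` holds ISO-8601 timestamp strings such as "2011-01-01T13:00"
    (an assumed format); the first 10 characters are the date.
    `reducer` is a function such as mean or max.
    """
    by_day = defaultdict(list)
    for timestamp, value in records:
        by_day[timestamp[:10]].append(value)
    return {day: reducer(values) for day, values in sorted(by_day.items())}

# Hypothetical sample values, for illustration only.
discharge = [("2011-01-01T00:00", 400.0), ("2011-01-01T12:00", 600.0),
             ("2011-01-02T00:00", 500.0)]
salinity = [("2011-01-01T03:00", 0.8), ("2011-01-01T15:00", 1.4),
            ("2011-01-02T03:00", 1.1)]

daily_discharge = to_daily(discharge, mean)    # daily average discharge
daily_max_salinity = to_daily(salinity, max)   # daily maximum salinity

# 1 January 2011 to 31 December 2012 inclusive (2012 is a leap year).
n_days = (date(2012, 12, 31) - date(2011, 1, 1)).days + 1
```

The last line reproduces the 731 daily records stated in the text.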

2.3. Data Preprocessing

Considering the differences in units and magnitudes among input factors of the prediction model, it was necessary to normalize the sample data before training the neural network model, thereby enhancing the convergence and training efficiency of the neural network.

This paper employed min-max normalization to rescale the datasets to the range [0, 1], as shown in Equation (1).

X_{i} = \frac{x_{i} - x_{\min}}{x_{\max} - x_{\min}}

(1)

where x_{i} is the original data series, X_{i} is the normalized series, and x_{\min} and x_{\max} are the minimum and maximum values of series x.
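A minimal Python sketch of Equation (1) follows. The function names are illustrative, and the inverse transform is an addition that is commonly needed to convert normalized predictions back to physical units; it is not described in the text.

```python
def min_max_normalize(series):
    """Rescale a series to [0, 1] following Equation (1)."""
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("a constant series cannot be min-max normalized")
    return [(x - lo) / (hi - lo) for x in series]

def denormalize(normalized, lo, hi):
    """Map normalized values back to the original units (assumed helper)."""
    return [value * (hi - lo) + lo for value in normalized]

# Hypothetical values, for illustration only.
scaled = min_max_normalize([2.0, 4.0, 6.0])
restored = denormalize(scaled, 2.0, 6.0)
```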

3. Methods

3.1. Framework of ISSA-LSTM Model

The model comprised three parts: Part 1 involved data processing and feature selection, Part 2 was the Improved Sparrow Search Algorithm (ISSA), and Part 3 was the LSTM model. Initially, in Part 1, the datasets underwent preprocessing and feature selection. The input datasets were first examined to remove outliers and then normalized using Equation (1). The input variables for the prediction model were determined through Grey Relational Analysis (GRA) and then imported into Part 3. Subsequently, in Part 2, according to the hyperparameter selection range and fitness function provided by Part 3, the ISSA performed several iterative calculations, selected the hyperparameter combination with the best fitness, and returned it to Part 3. Finally, in Part 3, the optimal hyperparameters obtained from ISSA were used to construct the LSTM model, which was then trained to obtain the final prediction results. The overall framework and process of the model are shown in Figure 2.

3.2. Feature Selection Based on GRA

Reasonable feature selection could adequately reflect estuarine system characteristics and enhance the efficiency and accuracy of the prediction model. Principal Component Analysis (PCA) is a commonly used method in feature selection, but it has the problem of high computational cost and memory requirements [43]. GRA is a multifactor statistical analysis method based on grey system theory [44]. Compared to PCA, GRA has lower data requirements, less computational load, and a stronger ability to address nonlinear problems, which means that it is more suitable for feature selection in this paper. Therefore, GRA could better explore the nonlinear relationship between salinity and influencing factors and ultimately improve the performance of the salinity prediction model. The calculation of the grey relational degree γ is shown in Equation (2).

γ(X_{0}, X_{i}) = \frac{1}{n} \sum_{k=1}^{n} \frac{\min_{i} \min_{k} |X_{0}(k) - X_{i}(k)| + ζ \max_{i} \max_{k} |X_{0}(k) - X_{i}(k)|}{|X_{0}(k) - X_{i}(k)| + ζ \max_{i} \max_{k} |X_{0}(k) - X_{i}(k)|}

(2)

where X_{0} and X_{i} are the dimensionless reference series x_{0} and comparative series x_{i} obtained through Equation (1), n is the length of each series, and ζ is the identification coefficient ranging from 0 to 1, normally ζ = 0.5. In this paper, x_{0} is the salinity at the CQ or QB station, and x_{i} (i = 1, 2, \dots, m) are the impact factors, where m is the number of influencing factors.

The value of γ represents the degree of influence that the comparative series exerts on the reference series. A higher γ value suggests a higher level of correlation between the two. Generally, γ > 0.9 indicates a marked influence, γ > 0.8 indicates a relatively marked influence, γ > 0.7 indicates a noticeable influence, and γ < 0.6 indicates a negligible influence [45].
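The grey relational degree can be sketched in Python as follows. This sketch assumes the standard grey relational coefficient, in which the term ζ times the global maximum difference appears in both the numerator and the denominator, and it assumes that all series have already been normalized with Equation (1). The function name and the sample series are hypothetical.

```python
def grey_relational_degrees(reference, factors, zeta=0.5):
    """Return one grey relational degree per comparative series.

    `reference` is the normalized reference series X0 and `factors` is a
    list of normalized comparative series Xi of the same length.
    """
    # Absolute differences |X0(k) - Xi(k)| for every factor i and step k.
    deltas = [[abs(r - c) for r, c in zip(reference, series)]
              for series in factors]
    d_min = min(min(row) for row in deltas)   # global minimum difference
    d_max = max(max(row) for row in deltas)   # global maximum difference
    if d_max == 0:
        # Every factor coincides with the reference series.
        return [1.0 for _ in factors]
    return [
        sum((d_min + zeta * d_max) / (d + zeta * d_max) for d in row) / len(row)
        for row in deltas
    ]

# Hypothetical normalized series: one identical to the reference, one reversed.
reference = [0.0, 0.5, 1.0]
degrees = grey_relational_degrees(reference, [[0.0, 0.5, 1.0], [1.0, 0.5, 0.0]])
```

In this toy case the identical series receives a degree of 1 and the reversed series a degree of 5/9, so the ranking follows the similarity of the curves, as the thresholds above would suggest.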

3.3. Hyperparameter Optimization Using ISSA

The SSA imitates the foraging and anti-predation behavior of sparrows in nature to perform local and global searches. The algorithm divides the sparrow population into Discoverers and Followers. Discoverers search for food and share the food locations with other individuals, while Followers forage by following the Discoverers. Additionally, a subset of sparrows is designated as Vigilantes, responsible for alerting the group to potential dangers. The quality of food that sparrow individuals can obtain is measured by their fitness. The entire sparrow population continually searches for better food, representing the process of solution optimization. The process of SSA is illustrated in Figure 3a.

To overcome the problems of sensitivity to initial values and premature convergence in the original Sparrow Search Algorithm, this paper proposed a multi-strategy ISSA. The ISSA improved the original SSA by modifying the initialization of the population, updating the position formula of the Followers, and introducing random sparrow mutations, which enhanced algorithm stability and global optimization capability and prevented the algorithm from getting stuck in local optima. These strategies strengthened the ability of ISSA to search for the best hyperparameter combination, addressing the poor fitting ability caused by subjective, empirical hyperparameter selection and thereby improving the prediction accuracy of the LSTM model. The process of ISSA is illustrated in Figure 3b, and the specific improvement measures are as follows.

1. Initial population optimization

Chaos mapping possesses characteristics such as ergodicity, randomness, and non-linearity, enabling it to overcome the issue of uneven initial distribution in a population. It is frequently utilized to enhance the global optimization capability of optimization algorithms. Commonly used chaotic mappings include logistic, tent, and sine. Compared to the former two mappings, sine mapping has an unrestricted folding number and higher chaotic characteristics. Therefore, sine mapping was utilized to initialize the sparrow population in this paper, as expressed by Equation (3).

φ_{n+1} = a \sin(π φ_{n}), \quad a \in (0, 1]

(3)

where a is the control parameter and φ_{n} is the value of the mapping function.
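As an illustration, the sine map of Equation (3) could seed a population as sketched below. The starting value 0.37, the choice a = 1, and the linear scaling of each chaotic value onto the search range are assumptions made for this example; the bounds are the five hyperparameter ranges used for the LSTM in Section 3.4.

```python
import math

def sine_map_population(n_individuals, bounds, a=1.0, start=0.37):
    """Build an initial population from the sine map of Equation (3).

    `bounds` is a list of (low, high) pairs, one per search dimension.
    `start` must lie in (0, 1); its value here is an arbitrary assumption.
    """
    phi = start
    population = []
    for _ in range(n_individuals):
        individual = []
        for low, high in bounds:
            phi = a * math.sin(math.pi * phi)             # chaotic value in [0, 1]
            individual.append(low + phi * (high - low))   # scale to the range
        population.append(individual)
    return population

# Search ranges quoted in Section 3.4 (five hyperparameters, 25 sparrows).
BOUNDS = [(1, 100), (1, 100), (16, 64), (10, 100), (0.001, 0.01)]
population = sine_map_population(25, BOUNDS)
```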

2. Followers strategy optimization

In the original SSA, Followers tended to focus on a particular Discoverer, displaying blindness and losing independence. It may have caused them to overlook other Discoverers with better foraging abilities, leading to a state of local optima. To avoid the above phenomenon, a dynamic inertia weight was introduced into the position update formula of Followers, dynamically adjusting the current individual’s influence on the optimization of the next-generation individuals, thereby enhancing the global optimization capability of Followers. The expression of dynamic inertia weight is shown in Equation (4). The improved position update formula of Followers is shown in Equation (5).

ω = \frac{\exp[2(1 - t / iter_{\max})] - \exp[-2(1 - t / iter_{\max})]}{\exp[2(1 - t / iter_{\max})] + \exp[-2(1 - t / iter_{\max})]}

(4)

G_{i,j}^{t+1} = \begin{cases} Q \cdot \exp\left(\frac{G_{worst}^{t} - G_{i,j}^{t}}{i^{2}}\right), & i > n/2 \\ G_{best}^{t+1} + ω |G_{i,j}^{t} - G_{best}^{t+1}| \cdot A^{+} \cdot L, & \text{otherwise} \end{cases}

(5)

where t is the current iteration number and iter_{\max} is the maximum number of iterations. G_{best} and G_{worst} are the positions of the current Discoverers with the highest and lowest fitness values, respectively. A^{+} is a 1 × d matrix with A^{+} = A^{T}(AA^{T})^{-1}. The condition i > n/2 indicates that an individual with lower fitness, experiencing hunger due to insufficient food, needs to move to another place for foraging.

In the early stage of iterations, the ω value was relatively large, allowing the algorithm to conduct global searches with a larger step. In the late stage of iterations, the ω value decreased adaptively, enabling the algorithm to perform local searches with a smaller step and to reach a balance between the global and local optimization capabilities. This could prevent the algorithm from being trapped in local optima during the middle stage and enhance the convergence accuracy and speed in the later stages.
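The behaviour of the dynamic inertia weight can be checked numerically. The quotient of exponentials in Equation (4) is the hyperbolic tangent of 2(1 - t / iter_max), so the sketch below evaluates both forms; the function names are illustrative, and iter_max = 20 is the value used later for the LSTM optimization.

```python
import math

def inertia_weight(t, iter_max):
    """Equation (4) written compactly as tanh(2 * (1 - t / iter_max))."""
    return math.tanh(2.0 * (1.0 - t / iter_max))

def inertia_weight_explicit(t, iter_max):
    """Equation (4) evaluated term by term, for comparison."""
    u = 2.0 * (1.0 - t / iter_max)
    return (math.exp(u) - math.exp(-u)) / (math.exp(u) + math.exp(-u))

iter_max = 20
weights = [inertia_weight(t, iter_max) for t in range(iter_max + 1)]
```

The weight starts at tanh(2), roughly 0.96, and falls monotonically to 0 at the last iteration, which matches the large-step-then-small-step behaviour described above.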

3. Population mutation

Levy’s flight search strategy simulates the random foraging path of insects and birds in nature [46]. It combines high-probability short-distance jumps with low-probability long-distance flights, alternating between the two movement modes. Short-distance jumps enhance the sparrow’s meticulous search ability in the nearby environment, improving the algorithm’s local optimization capability. Long-distance flights help the sparrow escape local areas, expanding the search range.

After updating the sparrow population’s positions, this paper utilized a roulette wheel to select a subset of all sparrow individuals. Then the Levy mutation strategy was introduced to enhance the diversity of the sparrow population, as shown in Equation (6).

G_{i,j}^{t+1} = levy_{d} \cdot G_{i,j}^{t}

(6)

levy_{d} = 0.01 \cdot \frac{σ \cdot r_{1}}{|r_{2}|^{1/β}}

(7)

σ = \left[ \frac{Γ(1 + β) \cdot \sin(πβ/2)}{Γ\left(\frac{1 + β}{2}\right) \cdot β \cdot 2^{(β - 1)/2}} \right]^{1/β}

(8)

where levy_{d} is the random step length generated by the Lévy flight strategy, as expressed in Equation (7); Γ(x) is the gamma function; β is a fixed constant, typically set to 1.5; and r_{1} and r_{2} are random numbers drawn from the standard normal distribution.
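A hedged Python sketch of the Lévy mutation is given below. It assumes the widely used Mantegna form of σ, in which the gamma function in the denominator takes (1 + β)/2 as its argument and the whole quotient is raised to the power 1/β. It also assumes that a separate step is drawn for every coordinate. The function names are illustrative.

```python
import math
import random

def levy_sigma(beta=1.5):
    """Scale factor sigma in the Mantegna form (an assumption, see above)."""
    numerator = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    denominator = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    return (numerator / denominator) ** (1 / beta)

def levy_step(beta=1.5, rng=random):
    """Random step length: 0.01 * sigma * r1 / |r2|^(1/beta)."""
    r1 = rng.gauss(0.0, 1.0)
    r2 = rng.gauss(0.0, 1.0)
    while r2 == 0.0:              # guard against division by zero
        r2 = rng.gauss(0.0, 1.0)
    return 0.01 * levy_sigma(beta) * r1 / abs(r2) ** (1 / beta)

def levy_mutate(position, beta=1.5, rng=random):
    """Scale each coordinate of a sparrow position by its own Levy step."""
    return [levy_step(beta, rng) * g for g in position]

rng = random.Random(0)            # fixed seed so the example is repeatable
mutated = levy_mutate([1.0, 2.0, 3.0], rng=rng)
```

For β = 1.5 the scale factor evaluates to about 0.70. Most steps are small, with occasional large jumps when r2 falls close to zero, which is the mix of short jumps and long flights described above.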

The optimization performance of the SSA and ISSA was compared using six test functions, F1–F6. Among them, F1–F3 are unimodal, emphasizing the algorithm’s capability for local optimization, while F4–F6 are multimodal, focusing on the algorithm’s global optimization ability. The expressions and theoretical optimal values for each test function are shown in Table 2.

The parameters of the algorithms were set as follows: the sparrow population size was 10, the optimization dimension was 50, the maximum number of iterations was 1000, the safety threshold (ST) was 0.8, and the proportions of Followers and Vigilantes were 0.6 and 0.2, respectively. The average value and variance were used to indicate the overall accuracy and stability of the optimization algorithm. The results for each test function are presented in Table 3.

According to Table 3, the ISSA exhibited superior average and variance values compared to the SSA when computing optimal values for both unimodal and multimodal test functions. Therefore, the ISSA demonstrated better testing performance than the SSA, indicating better global optimization capabilities.

3.4. Model Setting and Evaluation Index

The dataset, comprising 731 sets of data, was divided into training and testing sets in a 7:3 ratio, with the testing set including a complete dry season (July to December). The sparrow population size was 25, the maximum number of iterations was 20, ST was 0.6, and the proportions of Vigilantes and Followers were 0.2 and 0.6, respectively. The hyperparameters of the LSTM model optimized by ISSA included the number of neurons in the first hidden layer (L₁), the number of neurons in the second hidden layer (L₂), the batch size (B), the number of training iterations (K), and the base learning rate (lr). The ranges of the five hyperparameters were [1, 100], [1, 100], [16, 64], [10, 100], and [0.001, 0.01], respectively.
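The split and the search ranges can be illustrated with a short Python sketch. It assumes a chronological split, so that the test set is the most recent part of the record, and it assumes that the four count-like hyperparameters are rounded to integers when a sparrow position is decoded. The names `chronological_split`, `decode`, and `BOUNDS` are hypothetical.

```python
# Search ranges quoted in the text: (name, low, high, is_integer).
BOUNDS = [
    ("L1", 1, 100, True),        # neurons in the first hidden layer
    ("L2", 1, 100, True),        # neurons in the second hidden layer
    ("B", 16, 64, True),         # B, read here as the batch size (assumption)
    ("K", 10, 100, True),        # number of training iterations
    ("lr", 0.001, 0.01, False),  # base learning rate
]

def chronological_split(samples, train_fraction=0.7):
    """Keep time order: the first 70% for training, the rest for testing."""
    cut = int(len(samples) * train_fraction)
    return samples[:cut], samples[cut:]

def decode(position):
    """Clip a sparrow position to the ranges and round integer parameters."""
    params = {}
    for value, (name, low, high, is_integer) in zip(position, BOUNDS):
        clipped = min(max(value, low), high)
        params[name] = int(round(clipped)) if is_integer else clipped
    return params

train, test = chronological_split(list(range(731)))
params = decode([120.3, 0.2, 31.6, 55.4, 0.02])  # hypothetical position
```

With 731 daily records, this split yields 511 training days and 220 testing days.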

To evaluate the performance of different models, this paper selected mean absolute error (MAE), mean absolute percentage error (MAPE), root mean square error (RMSE), and Nash-Sutcliffe efficiency (NSE) as the evaluation indexes. The calculation of evaluation index is shown in Equations (9)–(12).

MAE = \frac{1}{n} \sum_{i=1}^{n} |\hat{y}_{i} - y_{i}|

(9)

MAPE = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{\hat{y}_{i} - y_{i}}{y_{i}} \right|

(10)

RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (\hat{y}_{i} - y_{i})^{2}}

(11)

NSE = R^{2} = 1 - \frac{\sum_{i=1}^{n} (\hat{y}_{i} - y_{i})^{2}}{\sum_{i=1}^{n} (y_{i} - \bar{y})^{2}}

(12)

where y_{i} and \hat{y}_{i} are the true and predicted values, \bar{y} is the average of y_{i}, and n is the number of data points. Smaller values for MAPE, RMSE, and MAE, along with higher values for NSE, indicate higher prediction accuracy of the model.
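The four evaluation indexes can be computed with plain Python as sketched below. The NSE denominator uses the spread of the observed values around their mean, which is the standard Nash-Sutcliffe definition; the sample series are hypothetical.

```python
import math

def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(p - o) for o, p in zip(obs, pred)) / len(obs)

def mape(obs, pred):
    """Mean absolute percentage error as a fraction; observations must be nonzero."""
    return sum(abs((p - o) / o) for o, p in zip(obs, pred)) / len(obs)

def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))

def nse(obs, pred):
    """Nash-Sutcliffe efficiency: 1 - residual sum / observed variance sum."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((p - o) ** 2 for o, p in zip(obs, pred))
    spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / spread

# Hypothetical observed and predicted salinity values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 5.0]
scores = (mae(observed, predicted), mape(observed, predicted),
          rmse(observed, predicted), nse(observed, predicted))
```

For this toy case the scores are MAE = 0.25, MAPE = 0.0625, RMSE = 0.5, and NSE = 0.8.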

4. Results

4.1. Result of Feature Selection

The dynamic mechanism of estuarine saltwater intrusion is complex, with salinity influenced not only by hydrodynamic factors such as upstream runoff and downstream tides but also by the instantaneous effect of local surface wind. Additionally, considering the time lag effect among variables [29], the influence of factors such as initial salinity, runoff, and tide from earlier periods also needs to be considered. Therefore, this paper categorized the initial candidate set of input variables into 4 major classes: salinity, water level, runoff, and wind speed, comprising a total of 11 impact factors, as shown in Table 4. The results of the GRA are shown in Table 5.

Table 5 shows the overall ranking of correlation degree values for each factor type as follows: downstream salinity > tidal range > upstream discharge > wind. Moreover, when moving upstream, the influence of runoff discharge on local salinity increases, while the impact of downstream tidal range or salinity diminishes, aligning with estuarine hydrodynamic characteristics. In the salinity category, γ ranges from 0.841 to 0.865, indicating a relatively marked influence. The ranking of correlation values is S₂ > S₁ > S₀, suggesting that the earlier salinity at the downstream GP station (S₂) has the most significant impact on salinity at the CQ and QB stations. In the tidal range category, γ ranges from 0.812 to 0.826, indicating a relatively marked influence, and the correlation values are ranked TR₀ > TR₁ > TR₂. In the runoff category, γ ranges from 0.739 to 0.766, indicating a noticeable influence, and the correlation values are ranked Q₀ > Q₁ > Q₂. Additionally, the correlation of flow-related factors at the QB station is greater than at the CQ station, suggesting that the influence of upstream flow on salinity becomes more substantial when moving upstream. In the wind speed category, all γ values are < 0.6, indicating a weak correlation, and the correlation values are ranked W_WE > W_NS, primarily because the east-west component W_WE is consistent with the overall east-west orientation of the Qiantang River. In summary, S₂, TR₀, Q₀, and W_WE are finally selected as input factors for the prediction model.

4.2. Results of Hyperparameters Optimization

The results of optimized hyperparameters of the LSTM model through SSA and ISSA are shown in Table 6.

4.3. Results of Different Prediction Models

The proposed ISSA-LSTM model was employed to forecast the next day’s maximum salinity at the CQ and QB stations. The prediction results were compared with those of the BP, GRU, LSTM, and SSA-LSTM models, as illustrated in Figure 4 and Figure 5. The model prediction accuracy was evaluated using the four indexes introduced in the previous section, and the results are shown in Table 7 and Table 8.

According to Table 7 and Table 8, in the prediction results for CQ and QB stations, the proposed ISSA-LSTM model exhibits the smallest RMSE, MAPE, and MAE values, along with the highest NSE values compared to other models. At the CQ station, compared with BP, GRU, LSTM, and SSA-LSTM models, the MAE value of salinity prediction results of the ISSA-LSTM model decreased by 35%, 31%, 36%, and 13%, the MAPE value decreased by 64%, 59%, 53%, and 15%, the RMSE values decreased by 20%, 16%, 21%, and 11% while the NSE values increased by 12%, 8%, 10%, and 5%, respectively. At the QB Station, compared with BP, GRU, LSTM, and SSA-LSTM models, the MAE value of salinity prediction results of the ISSA-LSTM model decreased by 36%, 15%, 20%, and 13%, the MAPE value decreased by 67%, 53%, 38%, and 16%, the RMSE values decreased by 46%, 37%, 21%, and 16%, while the NSE values increased by 32%, 10%, 14%, and 6%, respectively. The overall ranking of the prediction performance of each model from highest to lowest is ISSA-LSTM > SSA-LSTM > GRU > LSTM > BP.

Meanwhile, at the CQ station, compared with the original LSTM model, the MAE values of the SSA-LSTM and ISSA-LSTM models decreased by 26% and 36%, MAPE values decreased by 45% and 53%, RMSE values decreased by 11% and 21%, and NSE values increased by 4% and 10%, respectively. At the QB station, compared with the original LSTM, the MAE value of the SSA-LSTM and ISSA-LSTM model decreased by 6% and 21%, MAPE value decreased by 26% and 38%, RMSE value decreased by 9% and 20%, and NSE value increased by 8% and 14%, respectively. The results indicate that ISSA-LSTM exhibits a more significant improvement over the performance of the original LSTM model compared to SSA-LSTM, validating the superior optimization effects of the proposed ISSA on model hyperparameters over SSA.

Moreover, as shown in Figure 4, the curves of the salinity prediction results of the GRU, SSA-LSTM, and ISSA-LSTM models exhibit a generally good fit with the observed values, while the BP model results show significant fluctuations compared to the observed values. In the salinity peak region, the ISSA-LSTM model provides the closest prediction to the peak values, SSA-LSTM predicts slightly lower peak values and the unoptimized LSTM model exhibits a considerable deviation in peak results. It suggests that the LSTM model optimized by ISSA performs better in the salinity peak prediction. The scatter plots shown in Figure 5 also confirm these conclusions.

In summary, concerning the salinity prediction results for CQ and QB stations, the proposed ISSA-LSTM model outperforms other models with the smallest RMSE, MAPE, and MAE values, along with the highest NSE values. It indicates that the ISSA-LSTM model possesses higher prediction accuracy, better fitting capabilities, and more accurate peak predictions. Furthermore, after optimizing LSTM hyperparameters through the sparrow algorithms, a noticeable improvement in the LSTM model’s prediction accuracy is observed. The enhancement achieved by ISSA on the LSTM model surpasses that of SSA, demonstrating that the ISSA proposed in this paper has a better optimization capability compared to the original SSA, and can effectively enhance the model’s prediction performance.

4.4. Further Tests

Section 4.1 showed that upstream runoff is an important factor affecting saltwater intrusion. Therefore, a sensitivity analysis was conducted to investigate the impact of upstream discharge on salinity based on the proposed model.

The period from September to December 2019 was considered a typical dry season, with daily average flows generally below 700 m³/s and significant salinity peaks occurring in October and November. The maximum salinity values at the CQ and QB stations reached 4.34 and 2.30 PSU during this period. Therefore, these four months were selected as the study period.

With the other factors remaining constant, the salinity predictions for the CQ and QB stations were conducted under three different flow conditions: original discharge, discharge increased by 50%, and discharge decreased by 50%. The results are shown in Figure 6.
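The scenario setup described above can be sketched as follows. This is only an illustration: the feature names `Q0`, `Q1`, `Q2`, `S0`, and `TR0` follow the abbreviations of Table 4, the input set actually used by the model is an assumption, and `toy_predict` is a hypothetical stand-in for the trained ISSA-LSTM model, not the authors' model.

```python
def scale_discharge(sample, factor, discharge_keys=("Q0", "Q1", "Q2")):
    """Return a copy of one input sample with only the discharge features scaled."""
    scenario = dict(sample)
    for key in discharge_keys:
        if key in scenario:
            scenario[key] = scenario[key] * factor
    return scenario


def run_scenarios(samples, predict, factors):
    """Predict salinity for every sample under each named discharge factor."""
    return {
        name: [predict(scale_discharge(sample, factor)) for sample in samples]
        for name, factor in factors.items()
    }


# The three flow conditions described in the text.
FACTORS = {"original": 1.0, "plus_50": 1.5, "minus_50": 0.5}


def toy_predict(sample):
    """Hypothetical stand-in for the trained model: salinity falls as discharge rises."""
    return sample["S0"] * 300.0 / (300.0 + sample["Q0"])


# Illustrative sample, not measured data.
samples = [{"S0": 6.0, "TR0": 4.0, "Q0": 300.0, "Q1": 320.0, "Q2": 310.0}]
results = run_scenarios(samples, toy_predict, FACTORS)
print(results)
```

Copying each sample before scaling keeps the non-discharge inputs identical across the three runs, which is what "other factors remaining constant" requires.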

According to Figure 6, salinity at both the CQ and QB stations decreases with increasing runoff discharge, indicating that an increase in upstream discharge can inhibit and weaken saltwater intrusion. Under the original discharge conditions, the peak salinity at the CQ and QB stations is 4.34 and 2.30 PSU, respectively. When the discharge increases by 50%, the peak salinity at the CQ and QB stations decreases to 3.86 and 1.74 PSU, reductions of 11.1% and 24.3%, respectively. Conversely, when the discharge decreases by 50%, the peak salinity at the CQ and QB stations increases to 4.94 and 3.36 PSU, increases of 13.8% and 46.1%, respectively. This indicates that, for the same magnitude of flow change, peak salinity responds more strongly to a decrease in discharge than to an increase.
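The percentage changes quoted above follow directly from the peak values. A short check, using only the peak salinities given in the text (the dictionary keys are illustrative names):

```python
def percent_change(baseline, scenario):
    """Relative change of a scenario value with respect to the baseline, in percent."""
    return (scenario - baseline) / baseline * 100.0


# Peak salinity (PSU) under the three discharge conditions, as quoted in the text.
PEAKS = {
    "CQ": {"original": 4.34, "plus_50": 3.86, "minus_50": 4.94},
    "QB": {"original": 2.30, "plus_50": 1.74, "minus_50": 3.36},
}

changes = {
    station: {
        name: round(percent_change(values["original"], values[name]), 1)
        for name in ("plus_50", "minus_50")
    }
    for station, values in PEAKS.items()
}

for station, change in changes.items():
    print(station, change)
```

Negative values are reductions and positive values are increases relative to the original discharge case.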

At the same time, the change in peak salinity at the upstream QB station is more significant than at the CQ station. In particular, when the upstream discharge decreases, the relative salinity increase at the QB station is about three times that at the CQ station, indicating that salinity upstream is more sensitive to changes in upstream discharge than salinity downstream. This is mainly because the tidal dynamics gradually weaken upstream, so that runoff becomes the main controlling hydrodynamic factor. Therefore, the further upstream the station, the more pronounced the impact of discharge changes on salinity.

Based on the analysis of the variations of salinity peak values with upstream discharge at the CQ and QB stations, it is suggested that increasing upstream discharge through artificial means, such as opening reservoir gates, can effectively inhibit saltwater intrusion and reduce downstream salinity peaks. This measure will be particularly effective during the dry season when saltwater intrusion is more pronounced. However, the effectiveness of reducing salinity peaks through increasing runoff discharge diminishes as upstream flow increases. When the upstream discharge has reached a relatively high level, it requires a substantially larger increase in discharge to achieve the same reduction magnitude in salinity peak values at the CQ and QB stations.

5. Conclusions

Based on the measured data in the Qiantang Estuary from 2011–2012, this paper developed a multi-factor estuarine salinity prediction model based on the improved LSTM model. The proposed model was applied to predict the salinity at the CQ and QB stations and compared with other models including BP, GRU, LSTM, and SSA-LSTM. The conclusions are as follows.

(1) The proposed model adopted Grey Relational Analysis (GRA) combined with estuarine dynamic analysis to determine input variables and used a multi-strategy Improved Sparrow Search Algorithm (ISSA) for hyperparameter optimization to improve prediction accuracy. The results show that, compared to BP, GRU, and the original LSTM model, the ISSA-LSTM model has smaller errors and higher prediction accuracy, with NSE improved by 8–32% and the other metrics improved by 15–67%. Moreover, it can accurately predict salinity peaks. The MAE, MAPE, RMSE, and NSE values of the ISSA-LSTM model are 0.223, 0.681, 0.381, and 0.842 at the CQ station, and 0.081, 0.479, 0.168, and 0.806 at the QB station.

(2) The optimization capability of the SSA is enhanced by integrating multiple improvement strategies, to more effectively optimize the hyperparameters of the LSTM model. Compared with LSTM optimized with the original SSA (SSA-LSTM), MAE, MAPE, and RMSE values of the ISSA-LSTM model decreased by 13–16%, 15–16%, and 11–13% and NSE value increased by 5–6%. This indicates that the ISSA has better hyperparameter optimization ability than the original SSA and can significantly improve the prediction performance of the LSTM model.

(3) Based on the proposed model, a sensitivity analysis of salinity to upstream discharge variations was conducted by predicting salinity at the CQ and QB stations under different flow conditions during the dry season. The results show that peak salinity responds more strongly to a 50% reduction in discharge (increases of 13.8% and 46.1% at the CQ and QB stations) than to a 50% increase in discharge (decreases of 11.1% and 24.3%), and that increasing upstream discharge can effectively reduce salinity peaks in dry seasons. Additionally, the impact of discharge changes is more significant in the upper reaches than in the lower reaches of the Qiantang Estuary, with relative changes in peak salinity at the QB station about 2–3 times those at the CQ station.

In conclusion, this paper proposed a new salinity prediction model with high accuracy and practical effectiveness. It provides a new approach for salinity prediction in estuaries, which can help implement early warning and reduce potential losses caused by saltwater intrusion.

Author Contributions

R.Z.: Methodology, Data Curation, Writing–Original Draft; Z.S.: Supervision; J.J.: Writing, Review, and Editing; Q.M. and L.Z.: Software. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Joint Funds of the Zhejiang Provincial Natural Science Foundation of China (No. LZJWY24E090001) and the Zhejiang Provincial Department of Science and Technology (No. 2023C03119).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Figure 1. Map of the study area. (a,b) Monitoring stations along the Qiantang Estuary. The discharge data are provided by the Fuchunjiang hydrological station (FCJ), the water level data by the Ganpu station (GP), the salinity data by the CQ and QB stations, and the wind speed data by the Hangzhou station (HZ).

Figure 2. Overall framework and flowchart of the ISSA-LSTM model. Part 1 is data preprocessing and feature selection; Part 2 is hyperparameter optimization by ISSA; Part 3 is the LSTM model.

Figure 3. Flowchart of the SSA and ISSA. (a) SSA. (b) ISSA.

Figure 4. Prediction results of different models. (a) CQ station. (b) QB station. The gray color block represents observed values of daily maximum salinity. The green, brown, purple, blue, and red lines represent the predicted results of the BP, GRU, LSTM, SSA-LSTM, and ISSA-LSTM models. The light orange regions (7/15–8/15, 10/15–11/15) are enlarged in the inset windows.

Figure 5. Scatterplot of observed values and predicted values of different models. (a) CQ station. (b) QB station. The green, brown, purple, blue, and red lines represent the results of BP, GRU, LSTM, SSA-LSTM, and ISSA-LSTM models.

Figure 6. Comparison of prediction results under different discharge conditions. (a) CQ station. (b) QB station. The black solid line represents the salinity prediction result for the original discharge, and the yellow and blue dashed lines represent the salinity prediction results for discharge decreased or increased by 50%.

Table 1. Datasets for the salinity prediction model.

Data	Station	Unit	Sampling Frequency	Sample Size
Water level	GP	m	hourly	17,544
Discharge	FCJ	m³/s	daily	731
Salinity	CQ/QB/GP	PSU	at least 2 times per day	1759/1584/2103
Wind	HZ	m/s	hourly	17,544

Table 2. Benchmark functions list.

No.	Function Expression	Range
1	$F_{1} (x) = \sum_{i = 1}^{D} x_{i}^{2}$	[−100, 100]
2	$F_{2}(x) = \sum_{i=1}^{D} |x_{i}| + \prod_{i=1}^{D} |x_{i}|$	[−10, 10]
3	$F_{3}(x) = \max_{i} \{|x_{i}|, 1 \leq i \leq D\}$	[−100, 100]
4	$F_{4}(x) = \sum_{i=1}^{D-1} [100 (x_{i+1} - x_{i}^{2})^{2} + (x_{i} - 1)^{2}]$	[−30, 30]
5	$F_{5}(x) = \sum_{i=1}^{D} [x_{i}^{2} - 10 \cos(2 \pi x_{i}) + 10]$	[−5.12, 5.12]
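For reference, the five benchmark functions can be written in a few lines of Python. The sketch assumes the standard textbook forms of these functions; the names Sphere, Schwefel 2.22, Schwefel 2.21, Rosenbrock, and Rastrigin are the conventional ones and are not used in the table itself. Each function has a global minimum of 0.

```python
import math


def sphere(x):
    """F1: sum of squares."""
    return sum(v * v for v in x)


def schwefel_2_22(x):
    """F2: sum of absolute values plus their product."""
    return sum(abs(v) for v in x) + math.prod(abs(v) for v in x)


def schwefel_2_21(x):
    """F3: largest absolute coordinate."""
    return max(abs(v) for v in x)


def rosenbrock(x):
    """F4: couples neighbouring coordinates, so the sum runs over D - 1 pairs."""
    return sum(
        100.0 * (x[i + 1] - x[i] ** 2) ** 2 + (x[i] - 1.0) ** 2
        for i in range(len(x) - 1)
    )


def rastrigin(x):
    """F5: multimodal; the cosine term creates many local minima."""
    return sum(v * v - 10.0 * math.cos(2.0 * math.pi * v) + 10.0 for v in x)


# Known minimizers: the origin for F1, F2, F3, F5 and the all-ones vector for F4.
print(sphere([0.0] * 5), schwefel_2_22([0.0] * 5), schwefel_2_21([0.0] * 5))
print(rosenbrock([1.0] * 5), rastrigin([0.0] * 5))
```

An optimizer such as SSA or ISSA would be scored by how close its best solution gets to these zero minima, which is what the means and variances in Table 3 summarize.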

Table 3. Results of different benchmark functions.

Function	Mean (SSA)	Mean (ISSA)	Variance (SSA)	Variance (ISSA)
F₁	1.15 × 10⁻³⁴	1.04 × 10⁻⁷⁵	3.64 × 10⁻³⁴	3.14 × 10⁻⁷⁵
F₂	9.12 × 10⁻²²	1.20 × 10⁻³⁷	3.20 × 10⁻²¹	3.52 × 10⁻³⁷
F₃	2.19 × 10⁻²¹	7.36 × 10⁻⁵⁰	3.20 × 10⁻²¹	2.32 × 10⁻⁴⁹
F₄	3.15 × 10⁻¹⁴	8.91 × 10⁻²³	9.56 × 10⁻¹⁴	2.82 × 10⁻²²
F₅	5.15 × 10⁻⁶	4.64 × 10⁻¹⁰	1.12 × 10⁻⁵	7.45 × 10⁻¹⁰

Table 4. List of the predicted target and variables of the initial input candidates set.

Type	Abbreviation	Detail
Salinity related	S_t	Maximum daily salinity at the CQ and QB stations (target)
	S₀	Maximum daily salinity at the GP station
	S₁	Maximum daily salinity 1 day earlier at the GP station
	S₂	Maximum daily salinity 2 days earlier at the GP station
Tidal range related	TR₀	Daily tidal range at the GP station
	TR₁	Daily tidal range 1 day earlier at the GP station
	TR₂	Daily tidal range 2 days earlier at the GP station
Runoff related	Q₀	Daily runoff discharge
	Q₁	Daily runoff discharge 1 day earlier
	Q₂	Daily runoff discharge 2 days earlier
Wind related	W_WE	West-East component of daily surface wind speed
	W_NS	North-South component of daily surface wind speed
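The lagged candidates in Table 4 (subscripts 0, 1, 2 for the current day and one and two days earlier) can be built from daily series as sketched below. The function name, the dictionary layout, and the sample values are illustrative assumptions, not the authors' preprocessing code.

```python
def build_lagged_features(series_by_name, lags=(0, 1, 2)):
    """Build one feature row per day, e.g. S0, S1, S2 from the daily series S."""
    lengths = {len(values) for values in series_by_name.values()}
    if len(lengths) != 1:
        raise ValueError("all series must have the same length")
    n_days = lengths.pop()
    max_lag = max(lags)
    rows = []
    # The first max_lag days are skipped because their lagged values do not exist.
    for t in range(max_lag, n_days):
        row = {}
        for name, values in series_by_name.items():
            for lag in lags:
                row[f"{name}{lag}"] = values[t - lag]
        rows.append(row)
    return rows


# Illustrative daily values, not measured data.
daily = {
    "S": [5.1, 5.4, 6.0, 5.8],          # maximum daily salinity at GP (PSU)
    "TR": [3.9, 4.2, 4.6, 4.4],         # daily tidal range at GP (m)
    "Q": [310.0, 295.0, 280.0, 300.0],  # daily runoff discharge (m^3/s)
}
rows = build_lagged_features(daily)
print(rows[0])
```

With four days of data and a maximum lag of two days, only the last two days yield complete rows.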

Table 5. Correlation analysis results between target and candidate variables (Values and ranks).

Station	Item	S₀	S₁	S₂	TR₀	TR₁	TR₂	Q₀	Q₁	Q₂	W_WE	W_NS
CQ	Value	0.863	0.864	0.865	0.826	0.825	0.825	0.743	0.741	0.739	0.530	0.455
CQ	Rank	3	2	1	4	5	6	7	8	9	10	11
QB	Value	0.841	0.841	0.842	0.812	0.812	0.811	0.766	0.763	0.761	0.536	0.457
QB	Rank	3	2	1	4	5	6	7	8	9	10	11
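A grey relational grade of the kind reported in Table 5 can be computed as sketched below. This follows the commonly used Deng formulation; the min-max normalization and the distinguishing coefficient `rho = 0.5` are assumptions, since the exact preprocessing used in the paper is not given in this section, and the input series are illustrative.

```python
def min_max(values):
    """Scale a series to [0, 1]; a constant series maps to all zeros."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


def grey_relational_grades(target, candidates, rho=0.5):
    """Return one grade per candidate series; a higher grade means a closer relation."""
    reference = min_max(target)
    deltas = [
        [abs(r - c) for r, c in zip(reference, min_max(candidate))]
        for candidate in candidates
    ]
    flat = [d for row in deltas for d in row]
    d_min, d_max = min(flat), max(flat)
    if d_max == 0.0:
        # Every candidate coincides with the target after normalization.
        return [1.0 for _ in candidates]
    return [
        sum((d_min + rho * d_max) / (d + rho * d_max) for d in row) / len(row)
        for row in deltas
    ]


# Illustrative series: the first candidate tracks the target, the second opposes it.
target = [1.0, 2.0, 3.0, 4.0]
candidates = [[2.0, 4.0, 6.0, 8.0], [4.0, 3.0, 2.0, 1.0]]
grades = grey_relational_grades(target, candidates)
ranking = sorted(range(len(grades)), key=lambda i: -grades[i])
print(grades, ranking)
```

Ranking the candidates by grade gives the ordering shown in the rank rows of the table, with rank 1 assigned to the largest grade.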

Table 6. Hyperparameters optimization results of SSA-LSTM and ISSA-LSTM.

Hyperparameter	Range	CQ: SSA-LSTM	CQ: ISSA-LSTM	QB: SSA-LSTM	QB: ISSA-LSTM
L₁	1–100	51	21	67	52
L₂	1–100	47	4	35	9
B	16–64	52	29	39	27
K	10–100	92	62	49	96
lr	0.001–0.01	0.00549	0.00246	0.00502	0.00906
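One way a swarm optimizer's continuous position vector could be mapped onto the search ranges of Table 6 is sketched below. The mapping is an assumption for illustration, not the authors' implementation: the first four hyperparameters are treated as integers because the table reports integer values, and `lr` is kept continuous.

```python
# Search ranges taken from Table 6: (lower bound, upper bound, value type).
SEARCH_SPACE = {
    "L1": (1, 100, int),
    "L2": (1, 100, int),
    "B": (16, 64, int),
    "K": (10, 100, int),
    "lr": (0.001, 0.01, float),
}


def decode(position):
    """Clip a continuous position to the bounds and round the integer-valued entries."""
    if len(position) != len(SEARCH_SPACE):
        raise ValueError("position must have one entry per hyperparameter")
    params = {}
    for value, (name, (low, high, kind)) in zip(position, SEARCH_SPACE.items()):
        clipped = min(max(value, low), high)
        params[name] = int(round(clipped)) if kind is int else float(clipped)
    return params


# Hypothetical position produced by an optimizer; the third entry is out of range.
params = decode([20.6, 3.7, 120.0, 62.2, 0.00246])
print(params)
```

Clipping before rounding keeps every decoded value inside its range even when the optimizer proposes a point outside the bounds.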

Table 7. Performance evaluation results of different models at the CQ station.

Model	MAE	MAPE	RMSE	NSE
BP	0.341	1.869	0.475	0.753
GRU	0.325	1.665	0.452	0.779
LSTM	0.346	1.444	0.480	0.768
SSA-LSTM	0.257	0.798	0.426	0.801
ISSA-LSTM	0.223	0.681	0.381	0.842

Table 8. Performance evaluation results of different models at the QB station.

Model	MAE	MAPE	RMSE	NSE
BP	0.150	1.442	0.263	0.609
GRU	0.127	1.013	0.197	0.730
LSTM	0.102	0.770	0.210	0.704
SSA-LSTM	0.096	0.573	0.192	0.761
ISSA-LSTM	0.081	0.479	0.168	0.806
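The relative improvement of ISSA-LSTM over SSA-LSTM can be recomputed from the values in Tables 7 and 8. The sketch below uses only those table entries; the function and variable names are illustrative.

```python
# SSA-LSTM and ISSA-LSTM scores from Tables 7 (CQ) and 8 (QB).
SCORES = {
    "CQ": {
        "SSA-LSTM": {"MAE": 0.257, "MAPE": 0.798, "RMSE": 0.426, "NSE": 0.801},
        "ISSA-LSTM": {"MAE": 0.223, "MAPE": 0.681, "RMSE": 0.381, "NSE": 0.842},
    },
    "QB": {
        "SSA-LSTM": {"MAE": 0.096, "MAPE": 0.573, "RMSE": 0.192, "NSE": 0.761},
        "ISSA-LSTM": {"MAE": 0.081, "MAPE": 0.479, "RMSE": 0.168, "NSE": 0.806},
    },
}


def relative_improvement(baseline, improved, higher_is_better=False):
    """Improvement in percent: a drop for error metrics, a rise for NSE."""
    change = (improved - baseline) / baseline * 100.0
    return change if higher_is_better else -change


improvements = {
    station: {
        metric: relative_improvement(
            models["SSA-LSTM"][metric],
            models["ISSA-LSTM"][metric],
            higher_is_better=(metric == "NSE"),
        )
        for metric in ("MAE", "MAPE", "RMSE", "NSE")
    }
    for station, models in SCORES.items()
}

for station, values in improvements.items():
    print(station, {metric: round(value, 1) for metric, value in values.items()})
```

After rounding to whole percentages, the results fall within the ranges quoted in the text: 13–16% for MAE, 15–16% for MAPE, 11–13% for RMSE, and 5–6% for NSE.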

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
