Article

Research on Water Resource Carrying Capacity Assessment and Water Quality Forecasting Based on Feature Selection with CNN-BiLSTM-Attention Model of the Min River Basin

1
College of Computer and Information Sciences, Fujian Agriculture and Forestry University, Fuzhou 350002, China
2
Fujian Statistical Information Research Center, Fuzhou 350000, China
3
The Faculty of Business, Economic and Law, The University of Queensland, Brisbane, QLD 4072, Australia
*
Author to whom correspondence should be addressed.
Water 2025, 17(6), 824; https://doi.org/10.3390/w17060824
Submission received: 9 January 2025 / Revised: 9 March 2025 / Accepted: 11 March 2025 / Published: 13 March 2025
(This article belongs to the Special Issue Prediction and Assessment of Hydrological Processes)

Abstract

To assess water resource carrying capacity (WRCC) more accurately, indicators of water resources, social resources, and the ecological environment were selected to construct a WRCC evaluation system based on a combined weighting method. In addition, incorporating the key factors that influence water quality into the prediction improved the performance of the predictive models. Adaptive Lasso Regression was used to select the key factors affecting water quality, and the CatBoost algorithm ranked the importance of the selected factors in the prediction model. The CatBoost Convolutional Neural Network-Bidirectional Long Short-Term Memory-Attention (CNN-BiLSTM-Attention) model was then used to forecast the water quality index (WQI). Together, these steps constitute a new method for WRCC evaluation and water quality prediction. The results show that the average obstacle degrees of the water resources, socio-economic development, and ecological environment system layers were 34.97%, 34.93%, and 30.10%, respectively. Compared with the other system layers of WRCC, the obstacle degree of the ecological environment layer was consistently lower. Total sewage treatment, greening coverage in built-up areas, and per capita green space in parks were the main obstacle factors to WRCC within the Min River Basin. The key factor screening shows that dissolved oxygen is positively correlated with the water quality of the watershed, whereas the other key influencing factors are negatively correlated with the WQI. Total nitrogen had the greatest impact on water quality in the watershed, with a regression coefficient of −1.7532. In the comparison of prediction results, the hybrid model achieved the lowest MAE at 45% of the monitoring points and the lowest RMSE at 35% of the monitoring points. For the remaining prediction models, the shares of monitoring points with the lowest MAE and RMSE were 15% to 20% and 15% to 30%, respectively. Compared with the other prediction models, the MSE and RMSE values of the hybrid model were relatively small, making it better suited to predicting water quality in the Min River Basin.

1. Introduction

Water is a unique substance and resource on Earth. With its unique physical and chemical properties, as well as its recycling and regeneration characteristics, water irreplaceably supports human survival and the development of civilization. It is the fundamental condition for human survival and the most important material basis for production activities [1]. However, the current situation of water resources is severe in China, with both water scarcity and pollution issues coexisting. Industrial production wastewater, daily domestic sewage, agricultural fertilizer abuse, and livestock breeding wastewater are all constantly eroding the water quality of the natural environment. There is an urgent need for water resource management and sustainable development [2]. The water resource challenges currently facing China can be broadly placed into five categories [3,4]. First, while China has abundant total water resources, per capita water resources are limited. Second, there is insufficient water supply for domestic needs, with some cities continuing to face water shortages and water supplies to rural residents being insecure. Third, the utilization of water resources remains nonoptimal and unsustainable, with continued wastage. Fourth, overexploitation of water resources continues in some areas. Finally, limits to water quantity availability are exacerbated by water pollution.
In recent years, China has implemented measures to strengthen the conservation and management of water resources, which have improved carrying capacity and addressed various water-related challenges, but problems persist [5]. Fujian Province has focused on sustainable policies since the first national ecological civilization pilot zone was established there in 2016. The Min River is the mother river of Fujian, and its runoff and pollutant concentrations vary with differences in climate, meteorology, terrain, human activities, industrial production, and agricultural production. As a result, river water quality in the study area shows significant temporal and spatial heterogeneity [6]. Given the importance of the study area in Fujian Province, there is a need to study its water resource carrying capacity (WRCC) and water quality status to identify the timing of pollution and the factors regulating WRCC, in support of sustainable development [7]. Effective monitoring of WRCC and water quality can give regulatory authorities a better basis for decision-making and advance ecologically sustainable development in practice. Moreover, protecting the water environment and achieving sustainable development of water ecology require a range of methods for evaluating WRCC and water quality status [8].
Research on water quality assessment and monitoring has generally centered around water quality status and spatial and temporal distribution. Sedighkia simulated water quality distribution in reservoirs through a coupled remote sensing data processing–machine learning model, which linked lake water quality distribution modeling and reservoir operation optimization for improved environmental management of reservoirs [9]. Akiner used multivariate statistical techniques such as Factor Analysis and Principal Component Analysis (FA/PCA) to monitor and assess the water quality status of the Betwa watershed. Based on this, Akiner explored factors affecting water quality [10].
Moreover, research on WRCC generally focuses on various aspects of water resources and water consumption to reflect social development indicators at present. Zhao conducted a diagnostic study on WRCC and obstacle factors using the combination weighting method of game theory to quantitatively evaluate the WRCC system, as well as to analyze and determine the physical mechanisms of regional WRCC states and state changes [11].
Taking the Yangtze River Economic Belt as the object of analysis, Liu explored the coupled coordination degree of the carbon emission–economy–environment tri-system of 11 provinces and municipalities within the region from a multivariate coupled coordination degree and spatial perspective by using the coupled coordination degree model from 2008 to 2019 [12]. However, the above studies have neglected the important impact of water quality on the values of WRCC. How to give the evaluation indicators more objective weights in the coupled harmonization model is also an important challenge that needs to be overcome in the related research [13,14]. In order to avoid the limitations of a single objective weighting method, the weights of the indicators were assessed using a combined objective weighting method. Water quality, socio-economic development, and environmental data are the result of a combination of factors. If these data are evaluated directly, there will be a certain degree of overlap and masking of the information reflected in the evaluation results. Since the single-weighting method does not reflect the actual situation of the study area, this study chooses to use Criteria Importance Through Intercriteria Correlation (CRITIC) and the entropy weighting method to calculate the combined weights, on the basis of which the WRCC was derived using Grey Relation Analysis (GRA) and Technique for Order Preference by Similarity to an Ideal Solution (TOPSIS).
Uncertainty in water quality data patterns has been an important challenge faced by scholars when exploring water quality evaluation and prediction. The advantages of artificial neural networks in solving complex nonlinear problems make the use of artificial neural networks for water quality prediction significant. In recent years, artificial neural networks have a wide range of applications in the field of water quality prediction, including the Back Propagation Neural Network (BP) [15], Recurrent Neural Network (RNN) [16], and Convolutional Neural Network (CNN) [17]. Moreover, the Long Short-Term Memory Neural Network (LSTM) is used as a special RNN model to predict various water quality indicators [18]. This model can handle the long-range dependency problem that cannot be solved by RNN, but it cannot overcome the disadvantage of using limited time period information for prediction. The Bidirectional Long Short-Term Memory Neural Network (BiLSTM) model was born, which takes into account the positive and negative directions of the time series data on the basis of the traditional LSTM [19]. The current research on water quality prediction using the BiLSTM model focuses on two main aspects. The first is to utilize standard BiLSTM models for water quality prediction [20]. The second is the combination of different models with BiLSTM models to achieve improved prediction accuracy [21]. Among the combined models, the Convolutional Neural Network-Bidirectional Long Short-Term Memory-Attention (CNN-BiLSTM-Attention) model utilizing feature selection is more often used in water quality applications [22]. While using feature selection on top of predictive modeling does not directly solve the problem of vanishing gradients and explosions, selecting more relevant features helps the model to learn the features of the data better, thus mitigating the problem to some extent. 
However, current water quality prediction studies on feature selection have generally ignored the impact of external influences on water quality, which is not conducive to achieving more accurate water quality predictions. Therefore, the study proposes a new method of applying variable selection to water quality prediction that considers the inclusion of factors affecting water quality.
WQI values from 20 major water quality monitoring stations on the Min River from 2017 to 2023 were selected as the raw data to analyze the spatio-temporal changes in water quality. To assess WRCC more accurately and explore the obstacles affecting it in the study area, water quality conditions, which previous studies have neglected, need to be introduced into the assessment system. Indicators of water resources, social resources, and the ecological environment were selected to construct a WRCC system. On the basis of the changes in WRCC, the degree of coupling coordination and the main obstacle factors for each city were explored. In addition, incorporating key water quality influences into water quality prediction improves the performance of predictive models, so features were selected in several stages. Adaptive Lasso Regression was used to select the key factors affecting water quality, and the CatBoost algorithm ranked the importance of the selected factors in the prediction model. Social and natural key factors with importance weights greater than 0.15 were then included in the water quality prediction model, and the CNN-BiLSTM-Attention model was used to forecast WQI. The study thus proposes a new WRCC evaluation method and a feature selection-based water quality prediction method. Compared with existing WRCC evaluation systems, it emphasizes the link between WQI and WRCC when constructing the system. Compared with existing CNN-BiLSTM-Attention prediction models, its innovation lies in identifying the key influencing factors of the prediction target and integrating them into the prediction model, further improving accuracy. This is conducive to a more objective evaluation of WRCC and more accurate prediction in the basin, and it offers guidance for water resource management and water environmental protection in Fujian Province.

2. Materials and Methods

2.1. Study Area

The Min River originates in Jianning, has a total length of 562 km, and drains an area of approximately 6.1 × 10⁴ km². The geographical location and the monitoring points of the Min River Basin are shown in Figure 1, and the specific monitoring location of each river section is listed in Table 1. The Min River Basin falls within a subtropical marine monsoon climate zone with year-round warm and humid conditions [23]. The annual average temperature is between 16 °C and 20 °C, and the average annual precipitation is 1742.1 mm. Precipitation is spatially uneven owing to the influence of terrain, with most precipitation falling in the mountainous areas between Wuyishan, Shaowu, Guangze, and Jianyang and the least in the coastal areas of Ningde and Fuzhou [24]. The annual runoff of the watershed also varies greatly between regions and periods. Low values are mainly found along the coast of the estuarine plain, where annual runoff is mostly less than 700 mm. High values are mainly found in northern Fujian under the influence of the Wuyi Mountain Range, where the multi-year average annual runoff exceeds 1100 mm. During the flood season, the three major tributaries, Jianxi, Futunxi, and Shaxi, carry larger inflows than in the dry period, and the water level rises sharply. With high precipitation, high annual runoff, and high flow, surface pollutants are carried into rivers and lakes, so water quality both upstream and downstream is likely to be degraded. Heavy rainfall in particular may seriously damage water quality within a short period of time.
The construction of river basins is closely related to the production and life of coastal people, urban construction, and social development. The upper and middle reaches mainly flow through Sanming and Nanping. These two cities are important industrial bases in Fujian Province, with well-developed machinery manufacturing and chemical industries, which are susceptible to industrial wastewater pollution. The downstream cities are Ningde and Fuzhou, which are characterized by high population density and urbanization. Compared with the upstream and midstream, the downstream is more vulnerable to domestic wastewater pollution [25]. During the study period, the area of arable land continued to decrease. Cultivated land was mainly shifted to urban–rural, industrial, mining, and residential land use, especially in the north-central and south-central parts of the Upper Min River and the southern and northeastern parts of the Lower Min River. The shift in land use type implies to some extent that rapid urbanization in the study area has encroached on a larger area of agricultural and forestry land. At the same time, the increase in economic level and urbanization level also means that it is easier to generate pollution discharge and damage to the aquatic environment. This is one of the reasons why the water quality in the middle reaches of the basin is better compared to the upper and lower reaches [26]. The cities in the Min River Basin generally have more developed industries, including manufacturing and chemical industries. The study area also hosts relatively well-developed animal husbandry and grain farming. Consequently, this makes the aquatic ecological environment of the water area susceptible to serious industrial wastewater pollution and human activity factors.

2.2. Sample Collection and Preparation

The study used water quality data collected by the automatic surface water monitoring system of Fujian Province. From this system, water quality data recorded every 4 h were collected for 20 major tributaries in the study area from January 2017 to December 2023. The data were compared with the national water quality standards to determine the various water quality indicators. The WRCC indicators were taken from the Water Resources Bulletin of the Fujian Provincial Water Resources Department (Available online: https://slt.fujian.gov.cn/ (accessed on 1 December 2024)) and the Statistical Yearbook of Fujian Province from China National Knowledge Infrastructure (CNKI) (Available online: https://www.cnki.net/ (accessed on 1 December 2024)). Both social and natural activities can affect water quality in the watershed. Data for natural factors affecting water quality were obtained from the Quarterly Hydrological Report of the Department of Water Resources of Fujian Province (Available online: https://slt.fujian.gov.cn/ (accessed on 1 December 2024)) and the Air Quality Monitoring Platform (Available online: https://www.aqistudy.cn/ (accessed on 1 December 2024)). Human factors were obtained from the Statistical Yearbook of Fujian Province from CNKI (Available online: https://www.cnki.net/ (accessed on 1 December 2024)). Outliers were identified and eliminated using the statistical 3σ principle. After the outliers were eliminated, the missing values were estimated separately with cubic spline interpolation and with the data weighting method. Considering both the temporal continuity and the temporal periodicity of the water quality data, an exponential decay method was used to fill the gaps in the natural factors and the water quality data [27]. Because the socio-economic data were considered to follow a linear relationship, gaps in the social factor data were filled by linear interpolation.
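The cleaning steps described above can be illustrated with a minimal sketch. It covers only the 3σ rule and linear interpolation; the exponential decay and spline variants are not reproduced. The function names `mask_outliers_3sigma` and `fill_linear`, the use of the population standard deviation, and the toy series are assumptions made for illustration, not the authors' code.

```python
import statistics

def mask_outliers_3sigma(values):
    """Replace points farther than 3 standard deviations from the mean with None."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation (an assumption)
    return [v if abs(v - mean) <= 3 * std else None for v in values]

def fill_linear(values):
    """Fill None gaps by linear interpolation; gaps at the edges copy the nearest value."""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("no valid observations to interpolate from")
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            frac = (i - left) / (right - left)
            filled[i] = values[left] + frac * (values[right] - values[left])
    return filled

# Toy series: a flat signal with one spike in the middle.
series = [10.0] * 10 + [100.0] + [10.0] * 10
cleaned = fill_linear(mask_outliers_3sigma(series))
```

Note that the 3σ rule needs a reasonably long series: with very few points a single spike inflates the standard deviation enough to hide itself.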

2.3. Framework of Methodology

The study mainly focuses on the water quality and WRCC of the Min River Basin, as well as proposing a new application method. On the one hand, it achieves analysis and higher accuracy prediction of key factors affecting water quality in the watershed. On the other hand, it explores the main obstacles in the watershed through the assessment of WRCC.
The specific steps for water quality assessment and prediction are as follows. First, the water quality status of each monitoring point was evaluated by calculating the WQI value. Second, the Adaptive Lasso Regression Algorithm (Adaptive Lasso) was used to screen the key factors that affect water quality, and their importance was ranked based on the results. Finally, a water quality prediction model based on feature selection using CNN-BiLSTM-Attention was proposed. The steps for evaluating the carrying capacity of water resources in the basin are as follows. The study constructed a WRCC evaluation system and determined the combined weights of indicators using entropy weighting and CRITIC. GRA and TOPSIS were utilized to calculate the degree of coupled coordination. The main obstacle factors of WRCC in the Min River Basin were explored using an obstacle degree model. The specific methodological framework diagram for this study is shown in Figure 2.

2.4. Water Quality Index

WQI is universally applicable to water quality evaluation. The study applied different weights to the different water quality indices, which were combined to calculate the WQI as done in past studies [28,29]:
$$I = \frac{\sum_{i=1}^{n} C_i w_i}{\sum_{i=1}^{n} w_i},$$
where $I$ represents the WQI, $n$ expresses the total number of water quality indicators, $C_i$ is the normalized value of each indicator, and $w_i$ represents the weight of each indicator based on its influence on water quality.
The study selected indicators based on past studies applying the WQI [30,31], including water temperature (X1), pH (X2), dissolved oxygen (X3), conductivity (X4), turbidity (X5), permanganate index (X6), ammonia nitrogen (X7), total phosphorus (X8), and total nitrogen (X9). The weights applied within the WQI were 1, 1, 4, 1, 2, 3, 3, 1, and 2, respectively. Based on past related studies, the study divided water quality into five categories [32]: excellent (94, 100), good (79, 94), medium (64, 79), poor (44, 64), and extremely poor (0, 44).
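As a small worked illustration of the weighted average and the five categories, the sketch below uses the weights listed above. The normalized scores in `example_scores` are made up, and treating each lower class boundary as exclusive is an assumption, since the text does not say which class a boundary value belongs to.

```python
# Weights for X1..X9 as listed in the text.
WEIGHTS = [1, 1, 4, 1, 2, 3, 3, 1, 2]

# Lower bounds of the classes; boundary handling (exclusive lower bound) is an assumption.
CLASSES = [(94, "excellent"), (79, "good"), (64, "medium"), (44, "poor")]

def wqi(normalized_scores, weights=WEIGHTS):
    """Weighted mean of the normalized indicator scores C_i with weights w_i."""
    if len(normalized_scores) != len(weights):
        raise ValueError("one score per indicator is required")
    return sum(c * w for c, w in zip(normalized_scores, weights)) / sum(weights)

def classify(score):
    """Map a WQI score on the 0-100 scale to one of the five categories."""
    for lower, label in CLASSES:
        if score > lower:
            return label
    return "extremely poor"

# Hypothetical normalized scores for X1..X9, for illustration only.
example_scores = [90, 80, 70, 60, 50, 40, 30, 20, 10]
example_wqi = wqi(example_scores)      # 860 / 18, roughly 47.8
example_class = classify(example_wqi)  # "poor"
```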

2.5. Integrated Evaluation of WRCC Using Combined Empowerment and GRA-TOPSIS

The present study conducted a comprehensive evaluation of WRCC using the combined empowerment method, GRA, and TOPSIS, as follows.
  • Step 1: Calculate the weights using the CRITIC method.
After min–max normalization of the positive and negative indicators, the information carrying capacity of each indicator was calculated and used to derive the indicator weights. The general formulas are as follows [33]:
$$\sigma_j = \sqrt{\frac{\sum_{i=1}^{m} (x_{ij} - a)^2}{m - 1}},$$
$$f_j = \sum_{i=1}^{m} (1 - r_{ij}),$$
$$C_j = \sigma_j f_j,$$
$$w_{cj} = \frac{C_j}{\sum_{j=1}^{n} C_j},$$
where $x_{ij}$ is the standardized data, $a$ is the mean of each column of indicator data, $m$ represents the number of objects to be evaluated, $\sigma_j$ represents the comparative strength of the jth indicator, $r_{ij}$ indicates the correlation coefficient between indicator i and indicator j, $f_j$ represents the contradiction between indicator j and the other indicators, and $C_j$ is the information carrying capacity.
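A minimal pure-Python sketch of the CRITIC weights follows. It assumes the input matrix is already min–max normalized, takes the comparative strength as the sample standard deviation, and sums the conflict term over indicators; `critic_weights`, `pearson`, and the `demo` matrix are hypothetical names and data used only for illustration.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equally long lists (fails on constant columns)."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def critic_weights(matrix):
    """matrix[i][j]: normalized value of indicator j for evaluated object i."""
    n_objects, n_indicators = len(matrix), len(matrix[0])
    columns = [[row[j] for row in matrix] for j in range(n_indicators)]
    # Comparative strength: sample standard deviation of each indicator.
    sigma = []
    for col in columns:
        mean = sum(col) / n_objects
        sigma.append(math.sqrt(sum((x - mean) ** 2 for x in col) / (n_objects - 1)))
    # Conflict: sum of (1 - r) between indicator j and every indicator.
    conflict = [
        sum(1.0 - pearson(columns[i], columns[j]) for i in range(n_indicators))
        for j in range(n_indicators)
    ]
    information = [s * f for s, f in zip(sigma, conflict)]
    total = sum(information)
    return [c / total for c in information]

# Two perfectly anti-correlated indicators with equal spread share the weight equally.
demo = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]]
demo_weights = critic_weights(demo)
```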
  • Step 2: Calculate the weights using the entropy weighting method.
Firstly, the weight of each sample value to the indicator was calculated using the normalized data processed in the previous step. Then, the entropy and redundancy of the indicators were calculated from the resulting weights. Finally, the weight of each indicator was calculated. The general calculation formula is as follows [34]:
$$p_{ij} = \frac{x_{ij}}{\sum_{i=1}^{n} x_{ij}},$$
$$e_j = -k \sum_{i=1}^{n} p_{ij} \ln(p_{ij}), \quad k = \frac{1}{\ln(n)} > 0, \quad e_j \geq 0,$$
$$d_j = 1 - e_j,$$
$$w_{sj} = \frac{d_j}{\sum_{j=1}^{m} d_j},$$
where $p_{ij}$ denotes the weight of the ith sample value under the jth indicator, $e_j$ represents the entropy value of the jth indicator, and $d_j$ is the information entropy redundancy.
  • Step 3: Calculate the combination weights.
Previous research on assigning weights to indicators has generally relied on a single objective weighting method, such as the Analytic Hierarchy Process [35] or Principal Component Analysis [36]. However, a single objective weighting method has limitations: its results may contradict the actual situation and can be hard to interpret. Therefore, the study chose to combine weights when calculating the weight of each indicator. The combination weights were calculated from the entropy weighting and CRITIC methods as follows [37,38]:
$$w_{ij} = \frac{w_{sj} w_{cj}}{\sum_{j=1}^{n} w_{sj} w_{cj}},$$
where $w_{sj}$ and $w_{cj}$ are the weights of each indicator according to the entropy weighting and CRITIC methods, respectively.
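The entropy weights and the multiplicative combination can be sketched as follows. This is an illustrative sketch, not the authors' implementation: `entropy_weights` and `combine_weights` are hypothetical names, the convention 0·ln 0 = 0 is assumed for zero shares, and at least two samples with positive column sums are required.

```python
import math

def entropy_weights(matrix):
    """matrix[i][j]: normalized, non-negative value of indicator j in sample i."""
    n_samples, n_indicators = len(matrix), len(matrix[0])
    k = 1.0 / math.log(n_samples)
    redundancy = []
    for j in range(n_indicators):
        column = [row[j] for row in matrix]
        total = sum(column)
        shares = [x / total for x in column]
        # Terms with p = 0 contribute nothing (0 * ln 0 taken as 0).
        entropy = -k * sum(p * math.log(p) for p in shares if p > 0)
        redundancy.append(1.0 - entropy)
    total_redundancy = sum(redundancy)
    return [d / total_redundancy for d in redundancy]

def combine_weights(entropy_w, critic_w):
    """Multiplicative combination of two weight vectors, renormalized to sum to 1."""
    products = [s * c for s, c in zip(entropy_w, critic_w)]
    total = sum(products)
    return [p / total for p in products]

# A uniform column carries no information, so all entropy weight goes to column 2.
demo_entropy = entropy_weights([[1.0, 1.0], [1.0, 3.0]])
demo_combined = combine_weights([0.5, 0.5], [0.2, 0.8])
```

The multiplicative form rewards indicators that both methods consider informative; an indicator with near-zero weight under either method ends up with near-zero combined weight.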
  • Step 4: Combine GRA-TOPSIS to calculate the values of each subsystem.
The weighted decision matrix was then constructed, and the negative and positive ideal solution distances of each index were calculated [39]:
$$Z = (z_{ij})_{m \times n} = (w_{ij} y_{ij})_{m \times n},$$
$$z_j^{+} = \{\max z_{ij} \mid i = 1, 2, \ldots, m\}, \quad z_j^{-} = \{\min z_{ij} \mid i = 1, 2, \ldots, m\},$$
$$D_i^{+} = \sqrt{\sum_{j=1}^{n} (z_{ij} - z_j^{+})^2}, \quad D_i^{-} = \sqrt{\sum_{j=1}^{n} (z_{ij} - z_j^{-})^2},$$
$$s_{ij}^{+} = \frac{\min_i \min_j z_{ij} + \rho \max_i \max_j z_{ij}}{z_{ij}^{+} + \rho \max_i \max_j z_{ij}}, \quad s_{ij}^{-} = \frac{\min_i \min_j z_{ij} + \rho \max_i \max_j z_{ij}}{z_{ij}^{-} + \rho \max_i \max_j z_{ij}},$$
$$z_{ij}^{+} = |z_j^{+} - z_{ij}|, \quad z_{ij}^{-} = |z_j^{-} - z_{ij}|,$$
where $w_{ij}$ is the weight of each indicator; $y_{ij}$ is the standardized value of the original data in the positive and negative directions; $D_i^{+}$ expresses the positive ideal solution distance; $D_i^{-}$ expresses the negative ideal solution distance; $\min_i \min_j z_{ij}$ expresses the bipolar minimum; $\max_i \max_j z_{ij}$ expresses the bipolar maximum; and $\rho$ indicates the resolution factor, usually taken as 0.5.
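Next, the grey relation coefficients were averaged into grey relation degrees, and the distances and grey relation degrees were made dimensionless:
$$K_i^{+} = \frac{1}{n} \sum_{j=1}^{n} s_{ij}^{+}, \quad K_i^{-} = \frac{1}{n} \sum_{j=1}^{n} s_{ij}^{-},$$
$$\varphi_i = \frac{\theta_i}{\max(\theta_i)},$$
where $\theta_i$ stands for the corresponding unscaled quantity ($D_i^{+}$, $D_i^{-}$, $K_i^{+}$, or $K_i^{-}$) and $\varphi_i$ represents the ideal solution distances and grey relations after dimensionless quantization, namely $d_i^{+}$, $d_i^{-}$, $k_i^{+}$, and $k_i^{-}$.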
$$T_i^{+} = e_1 d_i^{-} + e_2 k_i^{+}, \quad T_i^{-} = e_1 d_i^{+} + e_2 k_i^{-} \quad (i = 1, 2, 3, \ldots, m),$$
$$C_i = \frac{T_i^{+}}{T_i^{+} + T_i^{-}} \quad (i = 1, 2, 3, \ldots, m),$$
where $e_1$ and $e_2$ are the degrees of preference of the evaluation value, with $e_1 + e_2 = 1$; they were taken to be 0.6 and 0.4, respectively, following previous studies [40].
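The GRA-TOPSIS step can be sketched in a few lines. Two points are assumptions rather than statements from the text: the two-level minimum and maximum are taken over the absolute deviations from the ideal solution, as is usual in grey relation analysis, and the relative closeness is taken as T⁺/(T⁺ + T⁻), the usual TOPSIS convention. The function name `gra_topsis` and the demo data are hypothetical, and the sketch assumes the evaluated objects are not all identical.

```python
import math

def gra_topsis(y, weights, rho=0.5, e1=0.6, e2=0.4):
    """y[i][j]: normalized value of indicator j for object i; returns closeness per object."""
    m, n = len(y), len(weights)
    z = [[weights[j] * y[i][j] for j in range(n)] for i in range(m)]
    z_pos = [max(z[i][j] for i in range(m)) for j in range(n)]
    z_neg = [min(z[i][j] for i in range(m)) for j in range(n)]

    def distance(ideal):
        return [math.sqrt(sum((z[i][j] - ideal[j]) ** 2 for j in range(n))) for i in range(m)]

    def grey_relation(ideal):
        dev = [[abs(ideal[j] - z[i][j]) for j in range(n)] for i in range(m)]
        lo = min(min(row) for row in dev)
        hi = max(max(row) for row in dev)
        return [sum((lo + rho * hi) / (dev[i][j] + rho * hi) for j in range(n)) / n
                for i in range(m)]

    def scale(values):
        top = max(values)
        return [v / top for v in values]

    d_pos, d_neg = scale(distance(z_pos)), scale(distance(z_neg))
    k_pos, k_neg = scale(grey_relation(z_pos)), scale(grey_relation(z_neg))
    # Far from the negative ideal and strongly related to the positive ideal is good.
    t_pos = [e1 * dn + e2 * kp for dn, kp in zip(d_neg, k_pos)]
    t_neg = [e1 * dp + e2 * kn for dp, kn in zip(d_pos, k_neg)]
    return [tp / (tp + tn) for tp, tn in zip(t_pos, t_neg)]

# Object 0 is best on both indicators, object 1 worst, object 2 exactly in between.
closeness = gra_topsis([[1.0, 1.0], [0.0, 0.0], [0.5, 0.5]], [0.5, 0.5])
```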
  • Step 5: Calculate the degree of coupled coordination and the degree of obstruction.
The coupled coordination and obstacle degree model were used to achieve a comprehensive evaluation of the WRCC and the main influencing factors [12,41,42]:
$$U = \left[\frac{C_1 \times C_2 \times \cdots \times C_m}{\left(\frac{C_1 + C_2 + \cdots + C_m}{m}\right)^{m}}\right]^{\frac{1}{m}},$$
$$T = \sum_{i=1}^{m} a_i C_i \quad (i = 1, \ldots, m),$$
$$D = \sqrt{U T},$$
$$F_{ij} = w_i z_j,$$
$$I_{ij} = 1 - y_{ij},$$
$$p_{ij} = \frac{F_{ij} I_{ij}}{\sum_{j=1}^{n} (F_{ij} I_{ij})} \times 100\%,$$
where $U$ denotes the comprehensive score of the system, $C_i$ is the value of each subsystem in WRCC, $T$ represents the comprehensive evaluation and coordination index of the WRCC system, $a_i$ represents the weight of each system layer, and $D$ is the coupling coordination degree of the system. Moreover, $w_i$ is the indicator-level weight, $z_j$ denotes the system-level weight, $F_{ij}$ represents the contribution of indicators to WRCC, $I_{ij}$ is the deviation degree, and $p_{ij}$ reflects the degree of impact of indicators on WRCC.
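The coupling coordination and obstacle degree calculations can be sketched as below. The coupling term is written in its commonly used form (product of subsystem values divided by the m-th power of their mean, then the m-th root), which is an assumption here; the function names and the numbers in the example are hypothetical and not taken from the study.

```python
import math

def coupling_degree(subsystems):
    """Equals 1 when all subsystem values are equal and shrinks as they diverge."""
    m = len(subsystems)
    mean = sum(subsystems) / m
    return (math.prod(subsystems) / mean ** m) ** (1.0 / m)

def coordination_degree(subsystems, layer_weights):
    """D = sqrt(U * T), with T the weighted sum of the subsystem values."""
    u = coupling_degree(subsystems)
    t = sum(a * c for a, c in zip(layer_weights, subsystems))
    return math.sqrt(u * t)

def obstacle_degrees(contributions, normalized_values):
    """Percent obstacle degree per indicator from contribution F and deviation 1 - y."""
    raw = [f * (1.0 - y) for f, y in zip(contributions, normalized_values)]
    total = sum(raw)
    return [100.0 * r / total for r in raw]

# Hypothetical values for three subsystems and three indicators.
demo_d = coordination_degree([0.5, 0.5, 0.5], [1 / 3, 1 / 3, 1 / 3])
demo_obstacles = obstacle_degrees([0.2, 0.3, 0.5], [0.9, 0.5, 0.5])
```

In the example, the third indicator combines the largest contribution with a large deviation from its ideal value, so it receives the highest obstacle degree.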

2.6. Screening of Key Influencing Factors Using Adaptive Lasso-CatBoost

Many previous studies have applied the Lasso Regression Algorithm (Lasso) for selecting variables in predictive models, including models related to air quality, stock prices, carbon emissions, and medical research [43,44,45]. However, Lasso has rarely been applied for screening factors affecting water quality. Lasso uses the L1 penalty for selecting variables and reducing dimensionality. The algorithm minimizes the sum of squared residuals subject to the constraint that the sum of the absolute values of the regression coefficients is less than a constant. As a result, Lasso can shrink some regression coefficients to exactly 0 [46]. In contrast, Adaptive Lasso assigns different penalties to different coefficients, which reduces the unreasonable results that arise when all coefficients receive the same penalty [47,48]:
$$(\alpha, \beta) = \arg\min \left\{ \sum_{i=1}^{n} \left( y_i - \alpha - \sum_{j=1}^{p} x_{ij} \beta_j \right)^2 \right\}, \quad \sum_{j=1}^{p} |\beta_j| \leq t,$$
where $t \geq 0$ is the tuning parameter. With $t_0 = \sum_j |\beta_j^{\circ}|$, choosing $t < t_0$ shrinks some coefficients toward 0. This process eventually screens out irrelevant or less relevant independent variables, enhancing the accuracy and interpretability of the regression model. Following the application of Adaptive Lasso, the factors influencing the water quality of the Min River were identified, and the key influencing factors were derived.
The CatBoost algorithm is a boosting method based on the Gradient Boosting Decision Tree (GBDT). This algorithm can effectively resolve gradient and prediction biases [49,50]. Following an approach used in previous studies, the present study applied the CatBoost algorithm to the factors selected by Adaptive Lasso to rank the key factors by importance. Because various factors influence water quality, introducing the key influencing factors into the prediction model helps improve the accuracy and effectiveness of the prediction. Therefore, the study added a feature selection step before prediction and incorporated the selected factors into the prediction model.
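To make the screening idea concrete, the sketch below implements a small Adaptive Lasso with coordinate descent in pure Python. It is a minimal illustration under stated assumptions, not the authors' implementation: the penalized (Lagrangian) form is used instead of the constrained form, the initial estimate comes from an unpenalized fit, the penalty for coefficient j is `lam / |initial_j| ** gamma`, and the toy data are synthetic. The CatBoost ranking step is not reproduced here.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return exactly 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def weighted_lasso(X, y, penalties, sweeps=200):
    """Coordinate descent for (1/2n)*||y - Xb||^2 + sum_j penalties[j]*|b_j| on centered data."""
    n, p = len(X), len(X[0])
    x_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_means[j] for j in range(p)] for row in X]
    residual = [v - y_mean for v in y]
    col_sq = [sum(Xc[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    beta = [0.0] * p
    for _ in range(sweeps):
        for j in range(p):
            # Correlation of column j with the residual that excludes its own effect.
            rho = sum(Xc[i][j] * (residual[i] + Xc[i][j] * beta[j]) for i in range(n)) / n
            new = soft_threshold(rho, penalties[j]) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= Xc[i][j] * delta
                beta[j] = new
    return beta

def adaptive_lasso(X, y, lam, gamma=1.0):
    """Penalize each coefficient in inverse proportion to an initial unpenalized estimate."""
    p = len(X[0])
    initial = weighted_lasso(X, y, [0.0] * p)
    penalties = [lam / (abs(b) ** gamma + 1e-12) for b in initial]
    return weighted_lasso(X, y, penalties)

# Synthetic orthogonal design: two strong factors and one nearly irrelevant factor.
x0 = [1, -1, 1, -1, 1, -1, 1, -1]
x1 = [1, 1, -1, -1, 1, 1, -1, -1]
x2 = [1, 1, 1, 1, -1, -1, -1, -1]
noise = [a * b * c for a, b, c in zip(x0, x1, x2)]
X = [[float(a), float(b), float(c)] for a, b, c in zip(x0, x1, x2)]
y = [3 * a - 2 * b + 0.05 * c + 0.1 * e for a, b, c, e in zip(x0, x1, x2, noise)]
beta = adaptive_lasso(X, y, lam=0.1)
```

In this toy example the weak third factor receives a large penalty because its initial estimate is small, so its coefficient is set to exactly zero, while the two strong factors are only slightly shrunk.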

2.7. Water Quality Prediction Based on CNN-BiLSTM-Attention

Artificial neural networks (ANNs) are widely used because of their strong ability to handle nonlinear mapping problems [51]. Among ANNs, RNNs have significant advantages in processing time series [52]. However, a traditional RNN is prone to vanishing gradients during training, which prevents the neuron weights from being updated. LSTM addresses the long-distance dependency problem; by overcoming memory defects and gradient explosion, it achieves more accurate prediction of long sequence data [53]. With continued research on LSTM, its drawback of predicting only from information in a limited time period became apparent, which led to the BiLSTM model. In addition, given the limitations of a single prediction model, the study introduced a CNN and an Attention mechanism on top of BiLSTM. The CNN extracts the input features. The BiLSTM then processes the CNN-extracted features as a time series through multiple forward- and backward-chained LSTM units. The Attention mechanism focuses the BiLSTM output on certain features by assigning weights, thereby further improving the prediction accuracy of WQI. In short, reverse-sequence computation is added to LSTM to form BiLSTM, and the CNN and Attention mechanisms are combined with it. The specific steps of the final prediction model [54] were as follows.
  • Step 1: Normalize the water quality data.
Normalize the WQI data of length T, and use the processed data as the model input, denoted as $X = [x_1, x_2, \ldots, x_T]$.
  • Step 2: Calculate the values of the CNN layer.
The general calculation process of the CNN layer was as follows:
$$C_1 = \mathrm{ReLU}(X W_1 + b_1),$$
$$P_1 = \max(C_1) + b_2,$$
$$H_c = \mathrm{sigmoid}(P_1 W_2 + b_3),$$
where $C_1$ represents the output of the convolutional layer with the ReLU activation function, $P_1$ represents the output of the pooling layer, and $H_c$ represents the output of the fully connected layer, which uses the sigmoid activation function. Moreover, $W_i$ represents weights, and $b_i$ represents biases.
  • Step 3: Calculation of the BiLSTM.
BiLSTM considers both the forward and backward directions of the time series on the basis of traditional LSTM: in addition to the forward pass, it reads future sequence information through a hidden layer that processes the sequence in reverse. The output of the BiLSTM layer is generally denoted as follows:
$$h_t = \mathrm{BiLSTM}(H_{c,t-1}, H_{c,t}), \quad t \in [1, i],$$
where $H_{c,t}$ represents the output of the CNN layer at time t.
  • Step 4: Calculation of the Attention Mechanism.
The Attention Mechanism was calculated as follows:
$$e_t = \tanh(W_h h_t + b_h),$$
$$a_t = \frac{\exp(e_t v)}{\sum_{t} \exp(e_t v)},$$
$$s_t = \sum_{t} a_t h_t,$$
where $e_t$ is the probability distribution, $W_h$ represents the weight of the Attention layer, $b_h$ represents the bias of the Attention layer, $v$ is the value of attention, $a_t$ represents the weight coefficient, and $s_t$ is the output of the Attention layer at time t.
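The attention step can be illustrated without any deep learning framework. The sketch below scores each hidden state with a tanh projection, turns the scores into softmax weights, and returns the weighted sum of hidden states. The function name `attention_pool`, the list-based shapes, and the toy inputs are assumptions for illustration; in the actual model these parameters are learned during training.

```python
import math

def attention_pool(hidden_states, w_h, b_h, v):
    """hidden_states: T vectors of length d; w_h: d x d matrix; b_h and v: length d.

    Returns (weights, context): softmax attention weights over the T time steps
    and the attention-weighted sum of the hidden states.
    """
    d = len(b_h)
    scores = []
    for h in hidden_states:
        e = [math.tanh(sum(w_h[r][c] * h[c] for c in range(d)) + b_h[r]) for r in range(d)]
        scores.append(sum(e_r * v_r for e_r, v_r in zip(e, v)))
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [x / total for x in exps]
    context = [sum(a * h[k] for a, h in zip(weights, hidden_states)) for k in range(d)]
    return weights, context

# Toy inputs: identity projection, zero bias, and two hidden states.
identity = [[1.0, 0.0], [0.0, 1.0]]
weights, context = attention_pool([[0.0, 0.0], [1.0, 1.0]], identity, [0.0, 0.0], [1.0, 1.0])
```

With these toy inputs the second time step receives the larger weight, because its projected state aligns better with `v`.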
  • Step 5: Output the final predicted value.
Taking st as input, the output, the general expression for yt, is obtained by the sigmoid activation function:
y t = s i g m o i d ( w o s t + b o )
where $w_o$ and $b_o$ are the weight and bias of the output layer.
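Steps 4 and 5 above can be illustrated with a minimal, dependency-free sketch. It is not the authors' TensorFlow implementation: the function name `attention_output`, the argument shapes, and the treatment of the score as the dot product of $e_t$ and $v$ are assumptions made here for illustration only.

```python
import math

def attention_output(h, W_h, b_h, v, w_o, b_o):
    """Illustrative attention pooling over BiLSTM hidden states.

    h   : list of T hidden vectors, each of length d
    W_h : d x d weight matrix, b_h : bias vector of length d (assumed shapes)
    v   : attention vector of length d
    w_o : output weights of length d, b_o : scalar output bias
    """
    scores = []
    for h_t in h:
        # e_t = tanh(W_h h_t + b_h)
        e_t = [
            math.tanh(sum(W_h[i][j] * h_t[j] for j in range(len(h_t))) + b_h[i])
            for i in range(len(b_h))
        ]
        # scalar score e_t . v that enters the softmax
        scores.append(sum(e * v_i for e, v_i in zip(e_t, v)))

    # a_t = exp(score_t) / sum_t exp(score_t), shifted by the max for stability
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    a = [x / total for x in exps]

    # s = sum_t a_t h_t (weighted sum of hidden states)
    d = len(h[0])
    s = [sum(a[t] * h[t][k] for t in range(len(h))) for k in range(d)]

    # y = sigmoid(w_o s + b_o)
    y = 1.0 / (1.0 + math.exp(-(sum(w * x for w, x in zip(w_o, s)) + b_o)))
    return a, s, y
```

With all weights set to zero, every time step receives the same attention weight, which is a quick sanity check that the softmax normalization behaves as expected.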
The key factors finally extracted were used to predict the water quality condition of the Min River Basin with the CNN-BiLSTM-Attention model. The algorithm was implemented using the tensorflow, sklearn, and attention packages in PyCharm 2020.1.1. The predictive effectiveness of the CNN-BiLSTM-Attention hybrid model was measured using the mean absolute error (MAE), root mean square error (RMSE), Nash–Sutcliffe Efficiency (NSE), Adjusted R2, Mean Absolute Percentage Error (MAPE), and Symmetric Mean Absolute Percentage Error (SMAPE), which represent the deviation between real and predicted data. For the error metrics (MAE, RMSE, MAPE, and SMAPE), smaller values indicate better predictions, whereas NSE and Adjusted R2 values closer to 1 indicate a better fit.
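The evaluation metrics named above can be written out as a short sketch. The function names are chosen here for illustration, and because several SMAPE variants exist, the form used below (mean of 2|y − ŷ| / (|y| + |ŷ|), in percent) is an assumption rather than the definition confirmed by the study.

```python
import math

def mae(y, p):
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y, p)) / len(y)

def rmse(y, p):
    """Root mean square error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, p)) / len(y))

def nse(y, p):
    """Nash-Sutcliffe Efficiency: 1 - SS_res / SS_tot."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, p))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot

def adjusted_r2(y, p, k):
    """Adjusted R2 for k predictors (requires len(y) > k + 1)."""
    n = len(y)
    r2 = nse(y, p)  # same algebraic form as R2 = 1 - SS_res / SS_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

def mape(y, p):
    """Mean absolute percentage error in percent (observations must be nonzero)."""
    return 100.0 / len(y) * sum(abs((a - b) / a) for a, b in zip(y, p))

def smape(y, p):
    """Symmetric MAPE in percent (one common variant, assumed here)."""
    return 100.0 / len(y) * sum(
        2.0 * abs(a - b) / (abs(a) + abs(b)) for a, b in zip(y, p)
    )
```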

3. Evaluation of WRCC and Influencing Factors

3.1. Mann–Kendall Test

The coefficient of variation is useful for checking the homogeneity of data, evaluating changes in data, and measuring the degree of variation of indicators. According to the coefficient of variation results (Table 2), conductivity had the highest coefficient of variation, reaching 0.551. Water temperature and turbidity also had high coefficients of variation. The specific M-K trend test statistics of the water quality indicators in the Min River Basin are shown in Table 3. It is generally accepted that when the p-value in the M-K trend test is less than 0.05, the indicator shows a significant trend of change. In this study, the permanganate index and ammonia nitrogen had a significant trend during the flood season, while turbidity had a significant fluctuating trend during the dry season. Figure 3 shows the trend of changes in the mass concentration of the water quality indicators in the Min River Basin, with significance levels α of 0.05 and 0.01 and critical values of U0.05 = ±1.96 and U0.01 = ±2.58, respectively. During the flood season, the overall permanganate index and ammonia nitrogen concentration showed a significant downward trend, while turbidity showed a certain downward trend during the dry season. The mutation point of the permanganate index occurred in July 2020, with an upward trend before the mutation point and a downward trend after it, especially after May 2023, when the decline was significant. August 2019 was the mutation point of the ammonia nitrogen concentration. Before this point, there was little fluctuation in the mass concentration of ammonia nitrogen, but afterwards the ammonia nitrogen content in the watershed showed a downward trend.
Except for the permanganate index and ammonia nitrogen during the flood season and turbidity during the dry season, the trends of the other water quality indicators did not pass the significance test at the 95% confidence level. The overall trend of turbidity during the dry season was significant. Prior to December 2020, with the continuous development of the social economy and heavy industry, pollution such as industrial wastewater increased significantly, and suspended solids in water bodies continued to accumulate. This led to a gradual increase in water turbidity and a significant decline in water quality. After the mutation point, with increasing public awareness of ecological and environmental protection and strict government control of heavy industry wastewater discharge, the turbidity showed a significant downward trend.
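The trend part of the M-K test used in this section can be sketched as follows. This is a generic textbook form with a tie correction and a two-sided normal p-value. It covers only the trend statistic, not the sequential forward and backward statistics used to locate mutation points, and the function name `mann_kendall` is an assumption for illustration.

```python
import math

def mann_kendall(x):
    """Return (S, Z, p) for the Mann-Kendall trend test on a sequence x."""
    n = len(x)

    # S counts concordant minus discordant pairs
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = x[j] - x[i]
            s += (diff > 0) - (diff < 0)

    # Variance of S with a correction for tied values
    counts = {}
    for value in x:
        counts[value] = counts.get(value, 0) + 1
    tie_term = sum(t * (t - 1) * (2 * t + 5) for t in counts.values() if t > 1)
    var_s = (n * (n - 1) * (2 * n + 5) - tie_term) / 18.0

    # Continuity-corrected standard normal statistic
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0

    # Two-sided p-value from the standard normal distribution
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return s, z, p
```

A series is read as having a significant trend at α = 0.05 when |Z| exceeds 1.96, or equivalently when p is below 0.05. The sign of Z gives the direction of the trend.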

3.2. Spatio-Temporal Variations in Water Quality

The study applied WQI to evaluate the water quality conditions of the monitored reaches in the watershed (Figure 4). The results reveal that the overall WQI showed an upward trend. The discharge of domestic and industrial wastewater was an important reason for the impact on water quality. Due to the abundant industrial resources and limited sewage treatment capacity in Minhou, the water quality in Zhuqi declined. Moreover, the water quality at the monitoring point located in Nanping was poor, which could be attributed to environmental pollution caused by the development of local animal husbandry [55]. Further deterioration of water quality could be attributed to heavy industries on both sides of the river, resulting in a decrease in water quality at Banzhu Creek Crossing. The reopening of businesses at the end of the COVID-19 outbreak in 2023 contributed to a rapid decline in water quality status. These observations indicate the need for strengthened regulation of industrial wastewater discharge. In addition, the accumulation of a large amount of sediment has led to long-term poor water quality in the downstream area of Lianjiang Guantou. In general, water quality in the lower monitored reaches of the watershed was poor compared to water quality at other monitoring sites.

3.3. Construction of Evaluation System

The previous assessment system for WRCC generally considered factors related to water resources, society, the economy, and the ecological environment [56,57]. The comprehensive system for evaluating the values of WRCC constructed in the study considered three broad categories of factors, namely water resources, socio-economic factors, and the ecological environment (Supplementary Table S1). Combined with the actual state in the research area, the indicators of total water resources, annual precipitation, total water supply, and water consumption were selected to reflect the water resources situation. The socio-economic indicators revealed the current human development in the watershed, including gross regional per capita product, urban water penetration, and total wastewater treatment. Indicators such as forest cover, water quality composite index, and emissions were used to reflect the state of the ecology in the watershed.

3.4. Determination of Indicator Weights

The study applied the entropy weight method and CRITIC method to combine and assign weights to the indicators of the WRCC evaluation system (Figure 5). A comparison of the weighting results reveals that the weights between the indicators derived from the combination method and entropy weighting method had a large difference compared to the small difference in the weights of the indicators in the results of the CRITIC method. In addition, compared with the entropy weight method, the results of the combination weighting method indicate that the weight of industrial water consumption was larger. This was consistent with the level of heavy industry in the basin and may better reflect the carrying capacity of water resources. In the WRCC evaluation system, the weights of each indicator were between 2% and 9.8%. The weight of the socio-economic system was larger than that of the water resources and ecological environment system. Industrial water consumption, the proportion of the tertiary industry, and ammonia nitrogen emissions were assigned the highest weight in the water resources, socio-economic, and ecological environment subsystems, respectively. Among them, the industrial water consumption and total water supply had higher weights in the water resources subsystem, with values of 9.81% and 8.24%. The proportion of the tertiary industry and the level of urbanization were relatively large in the socio-economic system, with values of 8.36% and 7.81%. Moreover, the indicators of ammonia nitrogen emissions and chemical oxygen demand emissions in the ecological environment system had a relatively high weight in WRCC, with values of 6.65% and 6.26%.
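As a rough illustration of the two objective weighting methods named above, the sketch below computes entropy weights and CRITIC weights for min-max normalized indicator columns. The rule the study uses to combine them is not given in this section, so the equal-weight average in `combined_weights` is purely an assumption, as are all function names.

```python
import math

def minmax(col):
    """Min-max normalize one indicator column (benefit-type indicator assumed)."""
    lo, hi = min(col), max(col)
    return [(v - lo) / (hi - lo) for v in col]

def entropy_weights(cols):
    """Entropy weight method on normalized columns (one list per indicator)."""
    n = len(cols[0])
    divergences = []
    for col in cols:
        total = sum(col)
        shares = [v / total for v in col]
        entropy = -sum(s * math.log(s) for s in shares if s > 0) / math.log(n)
        divergences.append(1.0 - entropy)
    norm = sum(divergences)
    return [d / norm for d in divergences]

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    dev_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    dev_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    return cov / (dev_a * dev_b)

def critic_weights(cols):
    """CRITIC: contrast (standard deviation) times conflict (sum of 1 - r)."""
    n = len(cols[0])
    info = []
    for col in cols:
        mean = sum(col) / n
        std = math.sqrt(sum((v - mean) ** 2 for v in col) / (n - 1))
        conflict = sum(1.0 - pearson(col, other) for other in cols)
        info.append(std * conflict)
    norm = sum(info)
    return [c / norm for c in info]

def combined_weights(w_entropy, w_critic):
    """Equal-weight average of the two weight vectors (assumed combination rule)."""
    return [(a + b) / 2.0 for a, b in zip(w_entropy, w_critic)]
```

The entropy method rewards indicators whose values are dispersed across samples, while CRITIC additionally rewards indicators that are weakly correlated with the others, which is why the two sets of weights can differ noticeably for the same data.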

3.5. Analysis of the Coupling Coordination Degree

The research on the coupling coordination model shows that the greater the coupling coordination degree, the higher the coordination degree between various subsystems [56,57]. Figure 6 illustrates the values of WRCC within the watershed area. The coupling coordination degree classification criteria indicate that there were various levels of coordination of WRCC from 2017 to 2023. Except for Ningde, which had a high degree in 2017, the other three cities had relatively poor coupling coordination in that year. There were upward trends in the overall coupling coordination degree of various prefecture-level cities from 2017 to 2019.
While Nanping showed a rising coupling coordination, the remaining three cities showed decreasing trends in the degree of coupling coordination from 2019 to 2020. Moreover, WRCC and the degree of coordination between its subsystems from 2020 to 2022 had been increasing year by year. The resurgence of enterprises after the end of the COVID-19 epidemic contributed to a rapid return of enterprises and accelerated production and socio-economic development. Consequently, the water and ecological environments were prone to some damage, and there was a decreasing trend in the coordination between the subsystems of WRCC in the Min River Basin. In summary, there is still a great deal of room for improvement in the coupling coordination degree of WRCC within the watershed.
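For readers unfamiliar with the coupling coordination model, a minimal sketch of one widely used form is given below. The exact formula and classification thresholds of the study are defined elsewhere in the paper, so the n-subsystem coupling formula, the equal contribution coefficients, and the function name are assumptions for illustration.

```python
def coupling_coordination(u, alpha=None):
    """Return (C, T, D) for subsystem scores u, each in (0, 1].

    C : coupling degree, n * (prod u_i)^(1/n) / sum u_i
    T : comprehensive development index, sum alpha_i * u_i
    D : coupling coordination degree, sqrt(C * T)
    """
    n = len(u)
    if alpha is None:
        alpha = [1.0 / n] * n  # equal contribution of subsystems (assumption)

    product = 1.0
    for value in u:
        product *= value

    c = n * product ** (1.0 / n) / sum(u)
    t = sum(a * value for a, value in zip(alpha, u))
    d = (c * t) ** 0.5
    return c, t, d
```

With three subsystems (water resources, socio-economic, ecological environment), C equals 1 only when all three scores are equal, and D rises when the subsystems are both balanced and individually well developed.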

3.6. Changes in WRCC and Various Subsystems

The study applied GRA-TOPSIS to comprehensively evaluate WRCC and its subsystems. There was an increasing trend in the values of WRCC, with a downward trend occurring only in individual years (Figure 7). The evaluation results of WRCC and its water resources and socio-economic subsystems remain between 0.3 and 0.65. The evaluation results of the ecological environment subsystem range from 0.25 to 0.75. There were decreasing trends in the WRCC of Fuzhou, Sanming, and Ningde in 2020, whereas that of Nanping showed an increasing trend. The above results could be mainly attributed to the successful environmental conservation policies implemented in Nanping while maintaining economic development. Closure of businesses in 2021, due to the COVID-19 epidemic, resulted in further improvements in the urban ecological environment, with a consequent steady increase in WRCC. The resurgence of economic activity as the impact of the epidemic declined was expected to lead to harmonization of socio-economic development and environmental conservation.
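The GRA-TOPSIS evaluation can be sketched as follows. This is one common formulation and not necessarily the study's exact procedure: the input is assumed to be an already weighted, normalized decision matrix of benefit-type indicators, and the resolution coefficient `rho` and the preference coefficient `alpha` are set to 0.5 by assumption.

```python
import math

def gra_topsis(z, rho=0.5, alpha=0.5):
    """Closeness scores for the rows of a weighted normalized matrix z.

    Rows are evaluation objects (for example, city-years); columns are indicators.
    At least two rows must differ, otherwise the grey coefficients are undefined.
    """
    m, n = len(z), len(z[0])
    best = [max(row[j] for row in z) for j in range(n)]   # positive ideal solution
    worst = [min(row[j] for row in z) for j in range(n)]  # negative ideal solution

    def euclid(ref):
        return [math.sqrt(sum((row[j] - ref[j]) ** 2 for j in range(n))) for row in z]

    def grey_degree(ref):
        delta = [[abs(ref[j] - row[j]) for j in range(n)] for row in z]
        d_min = min(min(r) for r in delta)
        d_max = max(max(r) for r in delta)
        return [
            sum((d_min + rho * d_max) / (d + rho * d_max) for d in r) / n
            for r in delta
        ]

    def scale(values):
        top = max(values)
        return [v / top for v in values]

    d_pos, d_neg = scale(euclid(best)), scale(euclid(worst))
    r_pos, r_neg = scale(grey_degree(best)), scale(grey_degree(worst))

    scores = []
    for i in range(m):
        # far from the negative ideal and similar in shape to the positive ideal
        s_pos = alpha * d_neg[i] + (1.0 - alpha) * r_pos[i]
        # far from the positive ideal and similar in shape to the negative ideal
        s_neg = alpha * d_pos[i] + (1.0 - alpha) * r_neg[i]
        scores.append(s_pos / (s_pos + s_neg))
    return scores
```

Scores lie between 0 and 1, and a higher score indicates a better evaluation result for that row.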

3.7. Diagnosis of Disorder Factor Identification

3.7.1. Diagnosis of System Level Barrier Factor Identification

From the WRCC system-level barrier degree results in Figure 8, it can be seen that the barrier degrees of the system layers were, in descending order of average value, water resources, socio-economic factors, and ecological environment. The average barrier levels for water resources, socio-economic factors, and ecological environment were 34.97%, 34.93%, and 30.10%, respectively. The barrier degree at the overall system level fluctuated little. Moreover, the overall trend of the ecological environment system layer showed a W-curve fluctuation. The ecological environment layer was at the lowest level of barrier compared to the other layers. The ranking of the system layer barrier level in 2017 and 2019 was as follows: socio-economic > water resources > ecological environment. In the remaining years, the ranking was the following: water resources > socio-economic > ecological environment. This indicates that socio-economic development and water resources were the main obstacle factors affecting the WRCC of the Min River Basin.

3.7.2. Diagnosis of Indicator Level Barrier Factor Identification

The study applied the barrier factor model to identify the obstacle factors of WRCC, with specific results shown in Figure 9. By sorting the frequency percentage of obstacles in each city, the main obstacles to WRCC within the watershed from 2017 to 2023 can be identified. The socio-economic and ecological environment layer indicators had a greater impact on the values of WRCC in the Min River Basin than the water resource layer indicators. The total quantity of water resources, sewage treatment, and forest coverage were the main factors obstructing WRCC in Fuzhou. Moreover, the main obstacle factor in Sanming, Nanping, and Ningde was the water consumption per 10⁴ Yuan of industrial added value. The results indicate that effectively regulating water consumption per 10⁴ Yuan of industrial added value and increasing sewage treatment capacity are of great significance for increasing the WRCC.
According to the diagnostic method of obstacle factors, the top five indicator layers of obstacle factors and their obstacle degrees that affect the WRCC of the Min River Basin during the research period can be obtained. Table 4 demonstrates the top 5 indicator layer barrier factors for 2017, 2019, 2021, and 2023. During the research period, the total sewage treatment, greening coverage in built-up areas, and per capita green space in parks had the highest probability of being in the top five. The increase in total sewage treatment means an increase in sewage discharge, and the increase in sewage discharge will cause significant damage to the water environment and ultimately lead to a decrease in the WRCC [58]. Moreover, the per capita green space in parks and greening coverage in built-up areas is not only an important indicator reflecting the state of ecological environmental protection and civilization construction in a country or region, but also represents the comprehensive economic strength and modernization level of a city [59]. The increase in per capita green space in parks and greening coverage in built-up areas represents the rapid development of the social economy in the current situation, which can easily cause problems such as sewage discharge and treatment, leading to a decline in the ecological environment quality and WRCC in the region.
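The obstacle degree calculation behind these rankings can be sketched in a few lines. The form below (factor contribution times indicator deviation, normalized over all indicators) is the one commonly used in the literature and is assumed here; the function name and the example values are illustrative only.

```python
def obstacle_degrees(weights, x_norm):
    """Obstacle degree of each indicator for one evaluation object.

    weights : indicator weights (factor contribution degree)
    x_norm  : normalized indicator values in [0, 1]; the deviation is 1 - x
    """
    contributions = [w * (1.0 - x) for w, x in zip(weights, x_norm)]
    total = sum(contributions)
    return [c / total for c in contributions]

# Hypothetical example: three indicators with made-up weights and values
degrees = obstacle_degrees([0.5, 0.3, 0.2], [0.5, 0.0, 1.0])
ranking = sorted(range(len(degrees)), key=lambda j: degrees[j], reverse=True)
```

Sorting the indicators by obstacle degree and keeping the first five reproduces the kind of top-five listing reported for each year. Summing the degrees of the indicators within a subsystem gives the system-level barrier degree.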

4. Water Quality Prediction Modeling Based on Feature Selection

4.1. Screening of Key Influences on Water Quality

The study constructed a holistic indicator system covering both natural and social indicators. Water quality data show the degree of water pollution. In response to the factors affecting water quality, the study selected average temperature and average humidity to represent the meteorological conditions of the city. Population density reflects the population situation. The proportions of primary, secondary, and tertiary industries may reflect changes in industrial structure. Per capita GDP and total investment in soil and water conservation represent the development of the social economy. Industrial wastewater discharge reflects the discharge of industrial pollution. Pesticide utilization denotes agricultural pollution. Chemical oxygen demand emissions, ammonia nitrogen emissions from urban domestic wastewater, the centralized treatment rate of urban sewage treatment plants, and the reduction of soil erosion demonstrate the pollution status and ecological environment level of the city. The most important factors influencing the water quality were screened according to natural conditions, demographic characteristics, and socio-economic factors. Natural factors selected included annual precipitation, average air temperature, and average humidity. Socio-economic factors selected included per capita gross domestic product, the centralized treatment rate of municipal sewage treatment plants, and the proportion of tertiary industry. Supplementary Table S2 provides a summary of the selected indicators.
Supplementary Figure S1 displays the correlation between the water quality within the watershed and the key influencing factors. The correlation analysis results indicate that among the natural factors, dissolved oxygen was positively correlated with water quality, whereas conductivity, turbidity, the permanganate index, ammonia nitrogen, total phosphorus, and total nitrogen were negatively correlated. The WQI was significantly negatively correlated with the proportion of tertiary industry, chemical oxygen demand, population density, ammonia nitrogen emission from urban domestic wastewater, centralized rate of treatment of urban wastewater treatment plants, industrial wastewater emission, and total investment in soil and water conservation. There was a negative relationship between the proportion of the tertiary industry and WQI.
Compared with the secondary industry, the tertiary industry is less polluting to the environment. However, with the rapid development of the tertiary sector, the discharge of pollutants brought about by it has been increasing, and the environment has been continuously damaged, leading to a decline in water quality. Therefore, optimizing the industrial structure appropriately is beneficial for improving water quality [60]. The results of the study show that turbidity had a negative correlation with water quality. This indicates that an increase in turbidity degrades water quality and is detrimental to the protection of the water environment. Turbidity refers to the degree to which suspended solids in water obstruct the passage of light. This means that the turbidity of water is related not only to the suspended solids content in the water, but also to their size, shape, and refractive index. Due to the presence of a large amount of suspended solids, including sediment, organic matter, inorganic matter, and microorganisms in the water body, the turbidity of the water in the watershed was relatively high. There was a negative correlation between dissolved oxygen and water turbidity, and an increase in turbidity may hinder the photosynthesis of aquatic organisms, thereby reducing the replenishment of dissolved oxygen [61]. In addition to the correlation between dissolved oxygen and turbidity, turbidity was also related to many natural (precipitation, annual runoff) and human (land type, pollutant discharge) factors that could cause changes in water quality [62]. On the one hand, higher precipitation and annual runoff could lead to surface pollutants being easily carried into water bodies. This was the main reason for the greater turbidity in the upper and lower reaches of the watershed compared to the middle reaches, which in turn led to a decline in local water quality, especially in the case of heavy rainfall.
On the other hand, the increase in the level of economic development and urbanization had an impact on the changes in the area of cultivated land during the study period. Cultivated land was converted mainly to urban–rural, industrial, mining, and residential land, especially in the upper and lower reaches of the Min River Basin. This was also more likely to cause an increase in the turbidity of water bodies in the upper and lower reaches of the basin. At the same time, the improvement of economic development level and urbanization meant that there was a higher likelihood of increased turbidity and water quality damage in the region. In addition to the above factors affecting water quality, the level of conductivity also has an impact on water quality conditions. Conductivity is an indicator of the concentration of ionic constituents, with a high conductivity representing poor water quality condition. Excessive industrial wastewater in bodies of water can easily cause environmental pollution and aquatic ecological damage, leading to poor water quality in the watershed [63]. Apart from the negative impact of industrial wastewater on water quality, domestic wastewater also had a negative impact on the environment and human health [11]. Therefore, treatment of domestic and industrial wastewater by municipal wastewater treatment plants plays an important role in conservation of the aquatic ecology. Water quality can be improved by increasing the rate of centralized treatment of wastewater by municipal wastewater treatment plants and reducing wastewater discharge. In addition, the increase in total investment in soil and water conservation indicates that the original measures in place were less effective, so the water quality situation in the basin is relatively poor.
Different factor selection models have a significant impact on factor screening. A smaller mean square error (MSE) indicated an improved model selection. Table 5 summarizes the results of the different model choices. The above analysis and the results of Adaptive Lasso Regression screening (Table 6) identified the key factors influencing the water quality to be dissolved oxygen, turbidity, the permanganate index, ammonia nitrogen, total phosphorus, total nitrogen, and the proportion of the tertiary industry. Dissolved oxygen was positively correlated with watershed water quality, while the other key influencing factors were negatively correlated with WQI. Among them, total nitrogen and permanganate index had a significant impact on WQI in the watershed, with regression coefficients of −1.7532 and −1.6642, respectively. Moreover, the next most influential factors were total phosphorus and ammonia nitrogen, with regression coefficients of −1.4217 and −1.3623.
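The idea behind Adaptive Lasso screening can be shown with a small, dependency-free sketch. It is not the study's sklearn-based implementation: it assumes centered predictors and response (no intercept), uses an unpenalized coordinate descent fit as the initial estimate, and the function names and the parameters `lam` and `gamma` are illustrative assumptions.

```python
def soft_threshold(r, t):
    """Soft-thresholding operator used by the lasso coordinate update."""
    if r > t:
        return r - t
    if r < -t:
        return r + t
    return 0.0

def coordinate_descent(X, y, penalties, n_iter=200):
    """Minimize (1/2n)||y - Xb||^2 + sum_j penalties[j] * |b_j|."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0
            for i in range(n):
                # prediction without the contribution of feature j
                partial = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - partial)
            rho /= n
            z = sum(X[i][j] ** 2 for i in range(n)) / n
            beta[j] = soft_threshold(rho, penalties[j]) / z
    return beta

def adaptive_lasso(X, y, lam=0.1, gamma=1.0):
    """Two-stage fit: unpenalized estimate, then a weighted L1 penalty."""
    p = len(X[0])
    initial = coordinate_descent(X, y, [0.0] * p)  # converges to least squares
    eps = 1e-8  # avoids division by zero for exactly-zero initial coefficients
    penalties = [lam / (abs(b) ** gamma + eps) for b in initial]
    return coordinate_descent(X, y, penalties)
```

Because each penalty is inversely proportional to the size of the initial coefficient, weakly related factors are shrunk to exactly zero while strongly related factors are penalized only lightly. This is the behavior that makes the method suitable for screening key influences.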

4.2. Construction of Water Quality Prediction Model

The CatBoost algorithm was used to rank the importance of key factors within the water quality indicators and integrated water quality index prediction model (Supplementary Figure S2 and Figure 10). Indicators with scores exceeding 0.15 were selected and merged with the indicators to be tested, thereby addressing the large mean absolute error and poor fitting that result from using a single prediction model. The feature selection results of the CatBoost algorithm were combined with a prediction model to forecast the water quality data and WQI at each monitoring point in the Min River Basin. Due to the limitations of a single prediction model, the study introduced CNN and Attention models on top of BiLSTM. First, CNN was used to extract the input features. Then, BiLSTM processed the CNN-extracted features as time series data through multiple forward- and backward-chained LSTM units. Finally, the Attention model focused BiLSTM on certain features by assigning weights, thereby further improving the prediction accuracy of WQI. In general, smaller MAE and RMSE values represent better prediction results.
The model results were obtained on two NVIDIA RTX A6000 GPUs (Taibei, Taiwan). The experimental data were randomly divided into training and test sets in a ratio of 8 to 2, and the number of iterations was set to 100. These parameter settings ensure that all the prediction models involved in this study are run under equal conditions, which facilitates further comparison of the models' prediction results and performance. Hybrid models with an Attention layer take longer to run than general predictive models. The hybrid model proposed in this paper therefore runs longer than the other models, but its overall prediction results are better. The CatBoost-CNN-BiLSTM-Attention water quality prediction model was compared with the other three models by means of a paired sample t-test. From the results in Table 7, it is found that there was a difference between the results of the feature selection based model and the other models. Compared to the results predicted by the hybrid model, there was a small difference in the results of LSTM model, a moderate magnitude of difference in the results of LSTM-Attention model, and a significant difference in the results of the CatBoost-CNN-BiLSTM-Attention model. The effectiveness of the water quality prediction models was compared across 20 monitoring sites in the watershed (Figure 11). The values of the three models were relatively centralized, except for the MAE and RMSE values of the LSTM, which were more dispersed.
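The paired sample t-test used for the model comparison can be sketched as follows. The sketch returns only the t statistic, the degrees of freedom, and Cohen's d for paired samples. Reporting the magnitude of the difference through Cohen's d is an assumption made here, as is the function name.

```python
import math

def paired_t(a, b):
    """Paired-sample t statistic for two equal-length lists of errors.

    Returns (t, degrees_of_freedom, cohen_d) computed on the differences a_i - b_i.
    """
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    # sample standard deviation of the paired differences
    sd = math.sqrt(sum((d - mean_d) ** 2 for d in diffs) / (n - 1))
    t = mean_d / (sd / math.sqrt(n))
    cohen_d = mean_d / sd
    return t, n - 1, cohen_d
```

In this setting, `a` and `b` would hold the per-site errors of two models over the same monitoring sites. The p-value then follows from the t distribution with n − 1 degrees of freedom, which is left to a statistics library.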
Supplementary Tables S3 and S4 show the prediction performance of different models and the results of the optimal prediction model. The results show that the overall MAE and RMSE values of the CatBoost-CNN-BiLSTM-Attention water quality prediction model were low, with only a few monitoring points showing higher MAE and RMSE results. The CatBoost-CNN-BiLSTM-Attention water quality prediction model achieved the lowest MAE value at 45% of the monitoring points and the lowest RMSE value at 35% of the monitoring points. Each of the remaining prediction models achieved the lowest MAE value at 15% to 20% of the monitoring points and the lowest RMSE value at 15% to 30% of the monitoring points.
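The percentages above are shares of monitoring points at which a given model attains the lowest error. A minimal sketch of that bookkeeping is shown below. The function name, the model labels, and the numbers in the example are hypothetical and are not taken from the supplementary tables.

```python
def share_of_best(errors):
    """Percentage of sites at which each model has the lowest error.

    errors : dict mapping model name -> list of per-site errors (same site order).
    Ties are credited to the model listed first in the dict.
    """
    models = list(errors)
    n_sites = len(errors[models[0]])
    wins = {m: 0 for m in models}
    for site in range(n_sites):
        best = min(models, key=lambda m: errors[m][site])
        wins[best] += 1
    return {m: 100.0 * wins[m] / n_sites for m in models}

# Hypothetical per-site MAE values for two models at four sites
example = share_of_best({
    "hybrid": [0.10, 0.30, 0.20, 0.15],
    "lstm": [0.20, 0.25, 0.40, 0.30],
})
```

Applied to the MAE and RMSE tables of all four models over the 20 monitoring sites, the same function would yield the kind of percentage breakdown reported in this paragraph.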
Taking monitoring sites N, S, and T as examples (Table 8), the NSE and adjusted R2 results of the hybrid model compare favorably with the other models. In general, the NSE and adjusted R2 results are between 0 and 1, and values closer to 1 mean a better fit. However, there were some limitations. The fits of the hybrid model were relatively low in absolute terms, but still better than those of the other models. Table 9 presents the MAPE and SMAPE results for selected monitoring sites. The MAPE and SMAPE values of the hybrid model at these monitoring points are lower overall than those of the other models. This suggests better predictions based on the CatBoost-CNN-BiLSTM-Attention model at monitoring points C, D, G, and R. In summary, compared with other prediction models, improved water quality predictions were achieved after applying screening for key factors and feature selection. Figure 12 shows the gap between the true and predicted values under different prediction models, further verifying that the prediction model based on feature selection is better. The hybrid model was more suitable for predicting water quality in the Min River Basin.

5. Discussion

5.1. Impact of Social Factors on WRCC and Water Quality

It is accepted that social factors have major influences on WRCC and water quality. The study results indicate that indicators such as the proportion of the tertiary industry, which reflects the socio-economic development status, and the indicators such as industrial wastewater discharge and total sewage treatment, which reflect the pollution discharge and treatment situation, all have an impact on the values of WQI in the Min River Basin. It has been found that the indicators related to pollution emission and socio-economic development generally show negative correlation with WQI. In particular, the increase in the discharge of wastewater and solid waste has led to the deterioration of water quality. Therefore, as an important link in treating industrial and domestic wastewater, urban sewage treatment plants are crucial for protecting aquatic ecology. The WRCC status and management of water resources in the watershed can be improved by increasing the centralized treatment rate of urban sewage treatment plants and reducing wastewater discharge.
With the increase in urbanization and population density, the socio-economic development of Fujian Province has continued. This posed significant challenges to the current situation of WRCC and water environment protection. With the continuous adjustment and optimization of China’s economic structure, tertiary industry has become the leading force driving economic development. As the proportion of tertiary industry continues to increase, its pollution of water quality will gradually expand. In particular, the labor-intensive industries in the tertiary industry had a greater impact on ecological pollution and the values of WRCC. The study proposed relevant strategies to address the main factors contributing to water pollution, such as improving sewage treatment facilities, increasing the rate of centralized sewage treatment, and ensuring management of local pollution challenges. At the same time, modern technology should be fully utilized to address local challenges and ensure the sustainable use of water resources [64]. Moreover, the industrial structure needs to be rationalized and optimized to avoid water pollution and damage to the aquatic ecology, thereby achieving sustainable development of water ecology [65].

5.2. Advantages and Limitations of the Present Study

In previous studies on water quality prediction modeling, LSTM was able to solve the problem of long-distance dependence in prediction. It was widely used in hydrological research. While overcoming the gradient explosion problem, it can realize more accurate prediction of long sequence data [53]. With the continuous research on LSTM, its disadvantage of only being able to make predictions through limited time period information has become obvious. On the basis of the LSTM prediction model, the BiLSTM model adds the calculation of inverse sequence information. Since the CNN model failed to notice the influence of external influences on water quality prediction, the CatBoost method was used as a way of feature selection in this study. The CatBoost algorithm was used to derive the ranking of key factors based on their importance, which further improved the prediction accuracy of WQI. But the hybrid model also has some limitations. Significant fluctuations and variations in water quality indicators within the watershed may have contributed to the low NSE and Adjusted R2 results for all predictive models mentioned in this study.
Due to the limitation of a single-weight calculation method, the single evaluation model was not good enough to assess the values of WRCC objectively [66]. This study chooses to combine the entropy weighting and CRITIC methods to obtain the combined weight and uses it to objectively weight the indicators of each subsystem of WRCC [67]. In addition, Adaptive Lasso was applied to address the challenge of inconsistent variable selection in the original algorithm by using different penalty weights for different variables [68]. However, some flaws in the Adaptive Lasso approach remain, including the failure to consider the correlations between indicators and the impact of lag order on time series modeling [69]. The consideration of natural and social factors in the study was hindered by the complexity of water pollution. Future research should aim to classify pollutants in different water bodies based on their severity and incorporate factors that have a significant impact on WQI at different monitoring points into the construction of water quality assessment models to address the aforementioned uncertainties.

6. Conclusions

The study selected water quality monitoring data from 20 monitoring points in the Min River Basin from January 2017 to December 2023, calculated the water quality index of the basin, and proposed a new method for evaluating WRCC and predicting water quality. It is conducive to more targeted and enhanced water resources management in the future. This study used CRITIC and the entropy weighting method to calculate the combined weights. On this basis, the WRCC evaluation value was calculated by using GRA and TOPSIS methods. A WRCC system was constructed considering water resources, socio-economic factors, and the ecological environment in the research area. The main factors obstructing water quality in each city were explored based on changes in the degree of coupling coordination. Then, the prediction model based on feature selection was used to predict water quality in this study. Adaptive Lasso Regression was used to select key factors affecting water quality, whereas the CatBoost algorithm ranked the importance of the key factors selected by Adaptive Lasso in the prediction model. Indicators with CatBoost algorithm results greater than 0.15 were selected, and the filtered indicators were incorporated into the CNN-BiLSTM-Attention water quality prediction model. This approach constitutes a new prediction method based on key factor screening. The study can provide a reference basis for future water quality protection and water environment management in Fujian Province. Finally, according to the results obtained, this study gives targeted recommendations. The specific findings of the study are set out below.
(1) There was an overall increasing trend in WRCC, with only a few years of decline. The status of WRCC showed a little fluctuation in the overall coupling coordination degree. The ecosystem layer has been at the lowest level of barrier compared to the other layers.
(2) Socio-economic development and water resources were the main obstacle factors affecting the WRCC in the basin. Among them, the main barrier to WRCC of Sanming, Nanping, and Ningde was water consumption of 104 Yuan of industrial added value; those in Fuzhou were the total water resources, the total sewage treatment, and forest cover.
(3) According to the results of Adaptive Lasso Regression screening, the key factors affecting water quality in the Min River Basin were dissolved oxygen, turbidity, the permanganate index, ammonia nitrogen, total phosphorus, total nitrogen, and the proportion of the tertiary industry.
(4) Comparing the predictive performance of the different models shows that the hybrid model achieved the lowest MAE at 45% of the monitoring points and the lowest RMSE at 35% of the monitoring points. Each of the remaining prediction models achieved the lowest MAE at 15% to 20% of the points and the lowest RMSE at 15% to 30% of the points. The CNN-BiLSTM-Attention prediction model based on feature selection proved to be more suitable for water quality prediction in the study area.
This study has some limitations. The Adaptive Lasso method does not consider the correlation between indicators or the effect of lag order on time series modeling. In addition, because water pollution is complex, the treatment of natural and social factors is subject to uncertainty. Future research should address these uncertainties by categorizing pollutants in different water bodies according to their severity and incorporating factors that have a significant effect on WQI at different monitoring points into the construction of water quality evaluation models.
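The percentages in finding (4) count, for each model, the share of monitoring points at which it attains the lowest error. A minimal sketch of that tally is given below; the function name and the four-site MAE values are hypothetical and are not taken from the study's supplementary tables.

```python
from collections import Counter

def share_of_best(errors_by_model):
    """Fraction of sites at which each model has the lowest error.

    errors_by_model maps a model name to a list of per-site errors
    (same site order for every model). Ties go to the model listed first.
    """
    models = list(errors_by_model)
    n_sites = len(errors_by_model[models[0]])
    wins = Counter()
    for site in range(n_sites):
        best = min(models, key=lambda m: errors_by_model[m][site])
        wins[best] += 1
    return {m: wins[m] / n_sites for m in models}

# Hypothetical MAE values at four sites.
mae = {
    "LSTM": [0.50, 0.40, 0.62, 0.30],
    "LSTM-Attention": [0.45, 0.52, 0.60, 0.35],
    "CNN-BiLSTM-Attention": [0.48, 0.47, 0.55, 0.33],
    "CatBoost-CNN-BiLSTM-Attention": [0.41, 0.44, 0.51, 0.31],
}
shares = share_of_best(mae)
```

With 20 sites, a share of 0.45 corresponds to 9 sites, so the reported percentages move in steps of 5 points.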

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/w17060824/s1, Figure S1: Spatial distributions of factors influencing water quality in the study area falling into the following categories: (a) natural factors; (b) social factors; Figure S2: Feature selection results based on CatBoost algorithm in (a) X1; (b) X2; (c) X3; (d) X4; (e) X5; (f) X6; (g) X7; (h) X8; (i) X9; Table S1: Comprehensive evaluation of WRCC; Table S2: Indicator System of Water quality influencing factors; Table S3: Comparison of MAE values for different prediction models; Table S4: Comparison of RMSE values for different prediction models.

Author Contributions

Conceptualization, methodology, Y.X. (Yanglan Xiao) and H.S.; investigation and formal analysis, L.Y.; validation, Y.X. (Yanglan Xiao), H.S. and W.F.; resources, J.N.; data curation and visualization, Y.Z. and H.X.; writing—original draft preparation, Y.X. (Yanglan Xiao) and Y.X. (Yihan Xu); writing—review and editing, T.Y.; supervision, T.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Social Science Foundation of Fujian Province, China (No. FJ2018B063).

Data Availability Statement

The original contributions presented in this study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding author.

Acknowledgments

We thank the reviewers and editors for their suggestions on this work.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Netti, A.M.; Abdelwahab, O.M.; Datola, G.; Ricci, G.F.; Damiani, P.; Oppio, A.; Gentile, F. Assessment of nature-based solutions for water resource management in agricultural environments: A stakeholders’ perspective in Southern Italy. Sci. Rep. 2024, 1, 24668.
  2. Zhao, F.F.; Guo, M.W.; Zhao, X.; Shu, X.Y. Spatio-temporal characteristics and coupling coordination factors of industrial water resource system resilience and utilization efficiency: A case study of the Yangtze River Economic Belt. Ecol. Indic. 2024, 167, 112704.
  3. Totan, G.; Harish, G. Multi-criteria decision making of water resource management problem (in Agriculture field, Purulia district) based on possibility measures under generalized single valued non-linear bipolar neutrosophic environment. Expert Syst. Appl. 2022, 205, 117715.
  4. Ajay, S. Effective management of water resources problems in irrigated agriculture through simulation modeling. Water Resour. Manag. 2024, 8, 2869–2887.
  5. Zhang, Y.Y.; Zhao, Y.; Zhang, H.W.; Cao, J.; Chen, J.S.; Su, C.C.; Chen, Y.P. The impact of land-use composition and landscape pattern on water quality at different spatial scales in the Dan River Basin, Qin Ling mountains. Water 2023, 18, 3276.
  6. Wang, T.R.; Ding, L.Z.; Zhang, D.Y.; Chen, J.P. A hybrid model combined deep neural network and Beluga Whale optimizer for China urban dissolved oxygen concentration forecasting. Water 2024, 20, 2966.
  7. Zhang, P.; Li, L.Y.M.; Wang, Y.S.; Shi, C.C.; Fan, C.C. Influence of riverbed incision and hydrological evolution on water quality and water age based on numerical simulation: A case study of the Minjiang estuary. Int. J. Environ. Res. Public Health 2021, 11, 6138.
  8. Li, Z.; Jiang, S.M.; Jin, J.L.; Shen, R.; Cui, Y. Quantitative diagnosis of water resources carrying capacity obstacle factors based on connection number and TOPSIS in Huaibei Plain. Water 2023, 18, 3217.
  9. Sedighkia, M.; Abdoli, A. Linking remote sensing analysis and reservoir operation optimization for improving water quality management of reservoirs. J. Hydrol. 2022, 613, 128445.
  10. Akiner, M.E.; Chauhan, P.; Singh, S.K. Evaluation of surface water quality in the Betwa River Basin through the water quality index model and multivariate statistical techniques. Environ. Sci. Pollut. Res. Int. 2024, 12, 18871–18886.
  11. Zhao, H.; Zhao, L.L.; Tian, H.; Ding, L.C.; Wang, F.; Wang, Z.R.; Han, Z.H.; Gong, W.Q.; Hou, X.D. Comprehensive evaluation of water resources carrying capacity in Ankang City based on Game Theory Combination Weighting-TOPSIS model. Geol. Resour. 2023, 5, 642–654.
  12. Liu, T.; Xu, C.Y.; Zhang, X. Analysis of carbon emission-economy-environment coupling coordination degree in the Yangtze River economic belt from a dual-carbon perspective. J. Hunan Univ. Technol. 2024, 38, 53–60.
  13. Zhao, Z.Y.; Fan, B.; Zhou, Y.Q.; Wang, D. An effective data-driven water quality modeling and water quality risk assessment method. Eng. Appl. Artif. Intell. 2024, 138, 109457.
  14. Du, L.L.; Niu, Z.R.; Zhang, R.; Zhang, J.Z.; Jia, L.; Wang, L.J. Evaluation of water resource carrying potential and barrier factors in Gansu Province based on game theory combined weighting and improved TOPSIS model. Ecol. Indic. 2024, 166, 112438.
  15. Zeng, Y.; Yang, C.; Wang, X.Y.; Fang, Z.X.; Wu, J. Prediction of water quality of bicarbonate mineral water in Wudallanchi based on BP Neural Network model. E3S Web Conf. 2018, 53, 4017.
  16. Cao, D.D.; Chan, M.K.; Ng, S.C. Modeling and forecasting of nanoFeCu treated sewage quality using Recurrent Neural Network (RNN). Computation 2023, 2, 39.
  17. Vijay, A.M.; Sohitha, C.; Saraswathi, G.N.; Lavanya, G.V. Water quality prediction using CNN. J. Phys. Conf. Ser. 2023, 1, 12051.
  18. Kasiselvanathan, M.; Venkata, S.R.P.C.; Vijay, A.J.; Suresh, A.; Sinduja, M.; Prajna, K.B.; Maheswaran, S. Prediction of ground water quality in western regions of Tamilnadu using LSTM network. Groundw. Sustain. Dev. 2024, 25, 101156.
  19. Shang, X.D.; Duan, Z.X.; Chen, B.S.; Li, T.C. Water quality prediction based on a composite model of bidirectional long short-term memory network. Acta Sci. Circumstantiae 2024, 7, 261–270.
  20. Zhang, Y.T.; Li, T.H. River Water Quality Prediction Based on Long Short-term Memory Neural Network. Environ. Sci. Technol. 2021, 8, 163–169.
  21. Hao, Y.Y.; Zhao, L.; Sun, T.; Qiao, Z. Surface water quality prediction based on RF-LSTM. J. Water Resour. Water Eng. 2021, 6, 41–48.
  22. Huang, Y.H.; Zhang, Z.H.; Chen, Q.; Zhang, L.; Zhang, J.G.; Lan, X.S. Transmission line audible noise prediction based on CNN-BiLSTM-Attention method. J. Phys. Conf. Ser. 2023, 1, 12078.
  23. Zhao, T.Y.; Shi, C.C.; Xie, R.R.; Li, J.B.; Jiang, H.; Chen, J.; Liu, J.H. Change trend of main water quality indices and pollution identification in Minjiang River Basin in recent years. J. Fish. Res. 2022, 44, 324–335.
  24. Wen, Y.Y. Temporal and Spatial Analysis of Water Quality in Minjiang River and Its Sustainable Development. Master’s Thesis, Fujian Agriculture and Forestry University, Fuzhou, China, 2023.
  25. Tang, Y. Spatio-temporal Distribution Characteristics Analysis and Water Quality Prediction in Mingjiang River Basin. Master’s Thesis, Fuzhou University, Fuzhou, China, 2018.
  26. Wang, X.K.; Lin, H.; Xie, X.Q.; Wang, Z.F.; Liu, Y.; Liu, X.Z. Spatiotemporal change characteristics and driving factors of land use in Minjiang River Basin. Ecol. Sci. 2023, 4, 171–181.
  27. Qin, W.H.; Chen, X.Y. Water quality forecast and prediction model based on Long-Short Term Memory network. J. Saf. Environ. 2020, 20, 328–334.
  28. Jin, C.C.; Ou, D.Y.; Zhou, H.Y.; Dong, H.Y. Comprehensive city river water quality assessment based on water quality identification index method. J. Water Resour. Archit. Eng. 2021, 19, 240–245.
  29. Zhou, Y.; Lu, N.; Hu, H.T.; Fu, B.J. Water resource security assessment and prediction in a changing natural and social environment: Case study of the Yanhe Watershed, China. Ecol. Indic. 2023, 154, 110594.
  30. He, T.; Zhang, L.G.; Zeng, Y.; Zuo, C.Y.; Li, J. Water quality comprehensive index method of Eltrix River in Xin Jiang Province using SPSS. Procedia Earth Planet. Sci. 2012, 5, 314–321.
  31. Ye, Y.Z.; Chen, F.; Huang, Y.L. Water quality evaluation of subtropical water source reservoir using water quality index method. Water Resour. Prot. 2022, 38, 116–124.
  32. Hao, M.; Yan, Z.; Huang, Y.H.; Cao, Q.Y.; Cao, Z.Y. Water quality evaluation of centralized drinking water sources in Yan’an city. Res. Agric. Mod. 2023, 44, 736–744.
  33. Zeng, X.T.; Chen, Y.Y.; Yue, S.J.; Xu, D.Q.; Fu, R.J.; Tang, Y.P. Quantitative identification of Q-markers of Euphorbiae Humifusae Herba based on AHP-CRITIC comprehensive weighting method. China J. Chin. Mater. Medica 2022, 19, 5193–5202.
  34. Shi, H.W.; Wang, H.Y.; Xue, S.F.; Feng, S.L.; Li, Y.C. Durability evaluation of iron tailings concrete under freeze-thaw cycles and sulfate erosion based on entropy weighting method. Constr. Build. Mater. 2024, 443, 137747.
  35. Hossain, M.Z.; Adhikary, A.K.; Nath, H.; Kafy, A.A.; Altuwaijri, H.A.; Rahman, M.T. Integrated geospatial and analytical hierarchy process approach for assessing sustainable management of groundwater recharge potential in Barind Tract. Water 2024, 20, 2918.
  36. Xie, L.; Huang, J.J.; Zhu, X.; Yang, F.; Peng, F.Q.; Pang, Q.Q.; Jing, Y.M.; Tian, L.F.; Jin, J.H.; Hu, G.R.; et al. Simplification and simulation of evaluation process for low efficiency constructed wetlands based on principal component analysis and machine learning. Sci. Total Environ. 2024, 955, 176873.
  37. Ye, S.; Li, C.Y.; Qiu, X.; Xiong, B.; Sun, G.C.; Huang, S.J.; Rong, Y. Application of combination weighting based TOPSIS model in fruit quality evaluation. J. Northwest A F Univ. 2017, 45, 111–121.
  38. Wei, Z.Q.; Ji, D.D.; Yang, L. Comprehensive evaluation of water resources carrying capacity in Henan Province based on entropy weight TOPSIS-Coupling coordination-Obstacle model. Environ. Sci. Pollut. Res. Int. 2023, 54, 115820–115838.
  39. Fan, W.G.; Chen, B.X.; Li, Q.X. Research on the evaluation of the formation of new development patterns in each province: Coupling coordination model based on the improved CRITIC entropy method of combining weights. China J. Commer. 2023, 4, 37–41.
  40. Ma, J.M.; Tuo, Y.F.; Wang, Q.; Wang, F.; Zheng, Y.; Du, W.J. Evaluation of water resources carrying capacity in Yunnan province based on GRA-TOPSIS and diagnosis of its obstacle factors. J. Water Resour. Water Eng. 2022, 33, 11–17.
  41. Xie, X.J. Coupling coordination relationship and spatiotemporal evolution characteristics of land use benefit in Sichuan province based on entropy weight TOPSIS and coupling coordination model. J. Soil Water Conserv. 2024, 38, 267–277.
  42. Li, R.Q.; Shan, J.X.; Zhao, J.; Hu, H.; Liu, D.H.; Wang, G.F.; Li, Y.F. Spatio-temporal evolution and obstacle factors of territorial utilization quality in the Bohai Rim. Geogr. Res. 2024, 3, 736–753.
  43. Fan, H.; Wu, Y.; Liu, X.; Chen, J.; Wu, X.J.; Yao, X.D.; Duan, X.Y. Construction and validation of a Lasso regression based nomogram prediction model for active tuberculosis. Chin. J. Clin. Res. 2024, 37, 424–429.
  44. Huang, X.; Zheng, S.Y.; Zhang, Z.Y.; Zhu, D.P.; Fan, X.T.; Du, B.; Liu, S.J. A prediction model involving biomarkers for the risk of metabolic syndrome using the Lasso regression. Chin. J. Conval. Med. 2024, 33, 1–5.
  45. Sailaja, M.; Prema Kumar, M.; Swarna Jyothi, B.; Narasamba Vanguri, G.L.; Manjula, S.; Divya Priya, D. Remote sensing-based UAV imaging in heat pattern analysis impact on climate change detection using fuzzy stacked Lasso elastic-net model. Remote Sens. Earth Syst. Sci. 2024, 24, 158.
  46. Fu, C.; Wang, K.J. Lasso regression assisted by coefficient of determination and correlation coefficient. J. Fujian Norm. Univ. (Nat. Sci. Ed.) 2024, 40, 57–63.
  47. Yuan, B.H.; Lu, Y.; Hu, T.F. Research of feature extraction algorithm based on adaptive Lasso manifold regularization. J. Hunan Univ. Arts Sci. (Sci. Technol.) 2021, 33, 23–26.
  48. Sandberg, J.; Voigtmann, T.; Devijver, E.; Jakse, N. Feature selection for high-dimensional neural network potentials with the adaptive group lasso. Mach. Learn. Sci. Technol. 2024, 2, 25043.
  49. Yang, J.; Ren, G.H.; Wang, Y.X.; Liu, Q.; Zhang, J.M.; Wang, W.Q.; Li, L.Z.; Zhang, W.P. Environmental prediction model of solar greenhouse based on improved Harris Hawks optimization-CatBoost. Sustainability 2021, 16, 2021.
  50. Ge, Z.W.; Feng, S.; Ma, C.C.; Wei, K.; Hu, K.; Zhang, W.J.; Dai, X.J.; Fan, L.F.; Hua, J.H. Quantifying and comparing the effects of key chemical descriptors on metal-organic frameworks water stability with CatBoost and SHAP. Microchemical 2024, 196, 109625.
  51. Zhou, Y.C.; Hu, T.S.; Chen, J.; Xu, J.J.; Zhou, Y.L. Application of neural network model coupled with dynamic equation in water quality prediction. J. Yangtze River Sci. Res. Inst. 2017, 9, 1–5.
  52. Meng, H.N.; Tong, X.Y.; Shi, Y.K.; Zhu, L.; Feng, K.; Hei, X.H. Cloud server aging prediction method based on hybrid model of auto-regressive integrated moving average and recurrent neural network. J. Commun. 2021, 1, 163–171.
  53. Guo, L.J.; Xu, R.W. Application of LSTM model combining improved fruit-fly algorithm after seasonal-trend decomposition using LOESS to water quality prediction. J. Yangtze River Sci. Res. Inst. 2023, 8, 57–63.
  54. Ke, J.S.; Zhao, J.M.; Li, H.F.; Yuan, L.; Dong, G.H.; Wang, G.H. Prediction of protein n-terminal acetylation modification sites based on CNN-BiLSTM-attention model. Comput. Biol. Med. 2024, 174, 108330.
  55. Xu, Y.H. Study on Water Ecological Environment Pollution in the Min River Based on Copula Function. Master’s Thesis, Fujian Agriculture And Forestry University, Fuzhou, China, 2024.
  56. Yin, J.; Wanyan, D.D. Evaluation on the coupling coordination degree between public cultural services and tourism. J. Natl. Libr. China 2024, 33, 61–73.
  57. Cui, M.S.; Liu, R.Q. Study on the coupling coordination of digital economy and green innovation: A case study of cities in the Yangtze River Delta region. East China Econ. Manag. 2024, 38, 25–37.
  58. Yan, X.Y.; Wu, Q.A.; Lv, L.J. Study on influencing factors of water resources carrying capacity in Central China: Based on IGMM empirical analysis. J. Hexi Univ. 2023, 5, 85–92.
  59. Peng, Y.; Tan, X.Y.; Zhu, Z.L.; Liao, J.Y.; Xiang, L.J.; Wu, F. Evaluation of resource and environmental carrying capacity at provincial level in China using a pressure-support-adjustment ternary system. Sustainability 2024, 19, 8607.
  60. Chen, J.F.; Che, Y.J.; Gu, Y.; Ding, T.H. Structural optimization of Jiangsu’s chemical industry constrained by carbon reduction and water ecological carrying capacity. Resour. Ind. 2024, 26, 111–123.
  61. Wang, J.; Wu, Q.; Luo, H.; Sun, L.L.; Li, N.; He, Y.Q. Spatial-temporal distribution and influencing factors of dissolved oxygen in the North Mainstream of Dong River. J. Chang. River Sci. Inst. 2024, 3, 37–44.
  62. Zhang, M.; Yan, R.H.; Gao, J.F.; Yan, S.D.; Yan, J.L. A framework for characterizing spatio-temporal variation of turbidity and drivers in the navigable and turbid river: A case study of Xitiaoxi River. Water 2024, 17, 2503.
  63. Wen, Y.Y.; You, T.G.; Xu, Y.H.; Lin, S.H.; Ning, J.; You, X.M.; Xiao, Y.L. Comprehensive Evaluation of the Level of Water Ecological Civilization Construction in the Min River Basin, China. Sustainability 2022, 14, 15753.
  64. Wang, K.X.; Chen, W. Evaluation of water resources carrying capacity in Shiyan city based on improved fuzzy comprehensive evaluation. Water Resour. Power 2024, 42, 14–17.
  65. Pang, B.W.; Li, Z.J. Carrying capacity evaluation on water resources of Jilin Province based on PCA-GA-Xgboost model. Pearl River 2024, 45, 98–106.
  66. Gao, Y.; Bai, L.Y.; Zhou, K.F.; Kou, Y.F.; Yuan, W.T.; Zhou, X.Z.; Qiu, Z.Y.; Zhao, D.Q.; Lv, Z.H.; Wu, Q.L.; et al. Study on the coupling coordination degree and driving mechanism of “production-living-ecological” space in ecologically fragile areas: A case study of the Turpan-Hami Basin. Sustainability 2024, 20, 9054.
  67. Liu, H.Y.; Xia, J.; Zou, L.; Huo, R. Comprehensive quantitative evaluation of the water resource carrying capacity in Wuhan city based on the “human-water-city” framework: Past, present and future. J. Clean. Prod. 2022, 366, 132847.
  68. Tong, X.W.; He, X.; Sun, L.Q.; Sun, J.G. Variables selection via non-concave penalized likelihood and its oracle properties. J. Am. Stat. Assoc. 2009, 36, 620–635.
  69. Wang, X.Y.; Zhang, Z.Y. Adaptive lasso penalized financial risk warning model with network structure. J. Appl. Stat. Manag. 2021, 40, 888–900.
Figure 1. Location of the study area and monitoring points (SM: Sanming; NP: Nanping; ND: Ningde; FZ: Fuzhou; XM: Xiamen; LY: Longyan; ZZ: Zhangzhou; PT: Putian; QZ: Quanzhou).
Figure 2. Flowchart of research methods in the study.
Figure 3. M-K trend test chart of the study area: (a) Permanganate index during flood season; (b) Ammonia nitrogen during flood season; (c) Turbidity during dry season.
Figure 4. Calculated WQI results in the Min River Basin.
Figure 5. Results of weighting of evaluation indicators.
Figure 6. Coupling and coordinated scheduling of water resources, social economy, and ecological environment.
Figure 7. Results of the comprehensive WRCC evaluation system: (a) Water resources subsystem; (b) Socio-economic subsystem; (c) Ecological environment subsystem; (d) WRCC subsystem.
Figure 8. Obstacle level of WRCC system layer in (a) Water resources; (b) Socio-economic; (c) Ecological environment.
Figure 9. Frequencies of the top three factors obstructing WRCC in (a) Fuzhou; (b) Sanming; (c) Nanping; (d) Ningde.
Figure 10. Feature selection results based on CatBoost algorithm in WQI.
Figure 11. Comparison of prediction model results based on (a) MAE; (b) RMSE (model a: LSTM; model b: LSTM-Attention; model c: CNN-BiLSTM-Attention; model d: CatBoost-CNN-BiLSTM-Attention).
Figure 12. Comparison of observed and predicted water quality for (a) B; (b) K; (c) O; (d) R.
Table 1. Water quality monitoring sites.
| Monitoring Point | Serial Number | Monitoring Point | Serial Number | Monitoring Point | Serial Number | Monitoring Point | Serial Number |
|---|---|---|---|---|---|---|---|
| Fuzhou Wenshanli | A | Jianou Pengdun | F | Yanping Langshi | K | Gutian Reservoir | P |
| Lianjiang Guantou | B | Jianyang Pingzhou Bridge | G | Yanping Nanxi | L | Datian Gaocai | Q |
| Minhou Zhuqi | C | Jiangle Zhangying | H | Yanping Yangkeng | M | Jianning Yuanzhuang | R |
| Minqing Xiongjiang | D | Nanping Shuifen Bridge | I | Zhenghe Xijin | N | Banzhu Creek Crossing | S |
| Jianou Fangcun | E | Wuyishan Xingtian | J | Gutian Huangtian | O | Yongan Ansha Reservoir | T |
Table 2. Coefficient of variation of water quality indicators.
| Indicator Layer | X1 | X2 | X3 | X4 | X5 | X6 | X7 | X8 |
|---|---|---|---|---|---|---|---|---|
| Coefficient of variation | 0.468 | 0.231 | 0.336 | 0.551 | 0.395 | 0.302 | 0.156 | 0.328 |
Table 3. M-K trend test statistical values of various indicators.
| Indicator | Flood Season MK-Z | Flood Season MK-p | Dry Season MK-Z | Dry Season MK-p |
|---|---|---|---|---|
| water temperature (X1) | 0.6520 | 0.5144 | 0.7845 | 0.4327 |
| pH (X2) | 0.9681 | 0.3330 | −0.1626 | 0.8709 |
| dissolved oxygen (X3) | −0.7705 | 0.4410 | −0.2615 | 0.7937 |
| conductivity (X4) | 1.3632 | 0.1728 | 1.7315 | 0.0834 |
| turbidity (X5) | −0.7705 | 0.4410 | −2.0708 | 0.0384 |
| permanganate (X6) | −2.1930 | 0.02831 | −1.5619 | 0.1183 |
| ammonia nitrogen (X7) | −3.1413 | 0.0017 | −1.4771 | 0.1396 |
| total phosphorus (X8) | 0.4149 | 0.6782 | 0.5866 | 0.5575 |
| total nitrogen (X9) | 0.4544 | 0.6495 | 1.3923 | 0.1638 |
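The Z and p columns of Table 3 come from the Mann-Kendall (M-K) trend test. The sketch below shows the standard form of that test with a tie correction and a two-sided p-value from the normal approximation; it is an illustration rather than the authors' code, and the `rising` series is hypothetical. Under this formula, Z = −3.1413 gives p ≈ 0.0017 and Z = 0.6520 gives p ≈ 0.5144, which agrees with the tabulated values.

```python
import math
from collections import Counter

def two_sided_p(z):
    """Two-sided p-value of a standard normal statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))

def mann_kendall(series):
    """Return (Z, p) of the Mann-Kendall trend test for a sequence of numbers."""
    n = len(series)
    # S counts concordant minus discordant pairs over all i < j.
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)
    # Variance of S, reduced by the usual term for each group of tied values.
    tie_sizes = Counter(series).values()
    var_s = (n * (n - 1) * (2 * n + 5)
             - sum(t * (t - 1) * (2 * t + 5) for t in tie_sizes)) / 18.0
    # Continuity correction: shift S one unit towards zero.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    return z, two_sided_p(z)

# Hypothetical monotonically rising series of 10 observations.
rising = [1.0, 1.2, 1.5, 1.9, 2.0, 2.4, 2.9, 3.1, 3.3, 3.8]
z_up, p_up = mann_kendall(rising)
z_down, p_down = mann_kendall(rising[::-1])
```

A negative Z with p below 0.05, as for ammonia nitrogen in the flood season, indicates a significant downward trend at the 5% level.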
Table 4. Top 5 indicator layer barrier factors of WRCC.
| Main Obstacle Factor | 2017 | 2019 | 2021 | 2023 |
|---|---|---|---|---|
| X25 | 0.0507 | - | 0.0508 | - |
| X30 | - | 0.0508 | 0.0510 | 0.0509 |
| X31 | 0.0508 | - | - | - |
| X36 | 0.0508 | 0.0510 | 0.0509 | 0.0509 |
| X37 | 0.0507 | 0.0507 | 0.0508 | 0.0510 |
| X38 | 0.0510 | 0.0513 | 0.0512 | 0.0511 |
Table 5. Comparison of model results for screening different variables based on MSE.
| | LassoCV | LassoLarsCV | Adaptive Lasso |
|---|---|---|---|
| MSE | 2.699 × 10^−1 | 5.848 × 10^−2 | 5.740 × 10^−2 |
Table 6. Results of Adaptive Lasso Regression screening.
| Indicator Layer | X3 | X5 | X6 | X7 | X8 | X9 | X14 |
|---|---|---|---|---|---|---|---|
| Correlation Coefficient | 0.9396 | −0.7414 | −1.6642 | −1.3623 | −1.4217 | −1.7532 | −0.5427 |
Table 7. Paired-sample t-test results of CatBoost-CNN-BiLSTM-Attention.
| Model | t | p | Cohen’s d |
|---|---|---|---|
| LSTM | −5.166 | 0.000 *** | 0.348 |
| LSTM-Attention | −12.166 | 0.000 *** | 0.820 |
| CNN-BiLSTM-Attention | −19.914 | 0.000 *** | 1.343 |
Note: In this table, “***” indicates 1% level of significance.
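For readers who want to reproduce statistics of the kind shown in Table 7, a paired-sample t statistic and a paired Cohen's d can be computed as sketched below. The error values are hypothetical, and the effect size is taken as the mean difference divided by the standard deviation of the differences, which is an assumption about the definition used. In Table 7 the ratio |t|/d is about 14.8 in every row, which is what this definition would produce for a fixed number of pairs; that reading is an inference and is not stated in the source. The p-value is omitted because the Python standard library has no Student t distribution.

```python
import math
from statistics import mean, stdev

def paired_t_and_d(errors_a, errors_b):
    """Paired t statistic and Cohen's d for two matched error samples.

    t has the sign of mean(errors_a - errors_b); d is reported as a magnitude.
    """
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    n = len(diffs)
    d_mean = mean(diffs)
    d_sd = stdev(diffs)  # sample standard deviation of the paired differences
    t = d_mean / (d_sd / math.sqrt(n))
    cohen_d = abs(d_mean) / d_sd
    return t, cohen_d

# Hypothetical absolute errors of a proposed model and a baseline on the same samples.
proposed = [0.41, 0.44, 0.51, 0.31, 0.38]
baseline = [0.50, 0.40, 0.62, 0.30, 0.45]
t_stat, effect = paired_t_and_d(proposed, baseline)
```

A negative t here means the first model has the smaller errors on average; with n pairs, |t| equals d multiplied by the square root of n.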
Table 8. Comparison of NSE and Adjusted R2 results.
| Model | Index | N | S | T | Mean Value |
|---|---|---|---|---|---|
| LSTM | NSE | 0.4884 | 0.1473 | 0.3549 | 0.3208 |
| LSTM | Adjusted R2 | 0.1540 | 0.0870 | 0.0288 | 0.0869 |
| LSTM-Attention | NSE | 0.2046 | 0.1111 | 0.0734 | 0.1262 |
| LSTM-Attention | Adjusted R2 | 0.0646 | 0.0974 | 0.1051 | 0.2487 |
| CNN-BiLSTM-Attention | NSE | 0.2969 | 0.2319 | 0.0980 | 0.2084 |
| CNN-BiLSTM-Attention | Adjusted R2 | 0.0132 | 0.0514 | 0.1004 | 0.0562 |
| CatBoost-CNN-BiLSTM-Attention | NSE | 0.5513 | 0.4414 | 0.0640 | 0.3548 |
| CatBoost-CNN-BiLSTM-Attention | Adjusted R2 | 0.2266 | 0.1053 | 0.3405 | 0.2249 |
Note: In this table, “N”, “S”, and “T” represent monitoring points, and the mean value is the average of the performance metric across the 20 sites.
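The two indices in Table 8 can be computed along the following lines. This sketch uses the textbook definitions: the Nash-Sutcliffe efficiency (NSE) as one minus the ratio of squared prediction error to the variance of the observations about their mean, and the usual degrees-of-freedom adjustment applied to a given R². How R² itself was obtained in the study is not stated, so `adjust_r2` takes it as an input; the sample values are hypothetical.

```python
from statistics import mean

def nse(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the mean of the observations."""
    obs_mean = mean(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - obs_mean) ** 2 for o in observed)
    return 1.0 - sse / sst

def adjust_r2(r2, n_samples, n_features):
    """Adjusted R^2 for a model with n_features predictors fitted on n_samples points."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_features - 1)

# Hypothetical observed WQI values and two sets of predictions.
observed = [1.0, 2.0, 3.0, 4.0]
perfect = list(observed)
mean_only = [2.5, 2.5, 2.5, 2.5]
```

Because the adjustment only lowers R² when at least one predictor is used, adjusted values sit at or below the unadjusted R² for the same fit.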
Table 9. Comparison of MAPE and SMAPE results.
| Model | Index | C | D | G | R | Mean Value |
|---|---|---|---|---|---|---|
| LSTM | MAPE | 0.0263 | 0.1437 | 0.0417 | 0.0503 | 0.0525 |
| LSTM | SMAPE | 0.0066 | 0.0333 | 0.0100 | 0.0129 | 0.0128 |
| LSTM-Attention | MAPE | 0.0383 | 0.0438 | 0.0404 | 0.0240 | 0.0443 |
| LSTM-Attention | SMAPE | 0.0099 | 0.0114 | 0.0099 | 0.0068 | 0.0109 |
| CNN-BiLSTM-Attention | MAPE | 0.0354 | 0.0470 | 0.0442 | 0.0240 | 0.0444 |
| CNN-BiLSTM-Attention | SMAPE | 0.0090 | 0.0121 | 0.0109 | 0.0061 | 0.0109 |
| CatBoost-CNN-BiLSTM-Attention | MAPE | 0.0203 | 0.0257 | 0.0394 | 0.0162 | 0.0326 |
| CatBoost-CNN-BiLSTM-Attention | SMAPE | 0.0050 | 0.0064 | 0.0095 | 0.0041 | 0.0103 |
Note: In this table, “C”, “D”, “G”, and “R” represent monitoring points, and the mean value is the average of the performance metric across the 20 sites.
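As a reference for the two indices in Table 9, the sketch below computes MAPE and SMAPE as fractions rather than percentages. SMAPE has several conventions in the literature; the version here divides twice the absolute error by the sum of the absolute observed and predicted values. The source does not say which convention was used, so values from this sketch need not match the table, and the sample numbers are hypothetical.

```python
def mape(observed, predicted):
    """Mean absolute percentage error as a fraction (observed values must be non-zero)."""
    return sum(abs(o - p) / abs(o) for o, p in zip(observed, predicted)) / len(observed)

def smape(observed, predicted):
    """Symmetric MAPE as a fraction, using 2|o - p| / (|o| + |p|) per sample (assumed convention)."""
    return sum(2.0 * abs(o - p) / (abs(o) + abs(p))
               for o, p in zip(observed, predicted)) / len(observed)

# Hypothetical observed and predicted values.
observed = [100.0, 200.0]
predicted = [110.0, 180.0]
mape_value = mape(observed, predicted)
smape_value = smape(observed, predicted)
```

Both indices are scale-free, which is why they can be averaged across monitoring points with different WQI levels.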
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Xiao, Y.; Shen, H.; You, L.; Zheng, Y.; Xie, H.; Xu, Y.; Fu, W.; Ning, J.; You, T. Research on Water Resource Carrying Capacity Assessment and Water Quality Forecasting Based on Feature Selection with CNN-BiLSTM-Attention Model of the Min River Basin. Water 2025, 17, 824. https://doi.org/10.3390/w17060824

