1. Introduction
The management and protection of water resources are a major concern worldwide. Environmental monitoring and water quality assessment of river systems are key steps in ensuring the sustainable use of water resources. The Yangtze River, as the largest river in China, holds an important geopolitical, economic, and ecological position. It not only provides drinking and irrigation water for millions of people but also serves as a major transportation and trade route. In addition, the Yangtze River Basin supports diverse ecosystems and biodiversity, which have important implications for China’s sustainable development. Therefore, the monitoring and protection of the water quality and ecology of the Yangtze River are of strategic importance. With social and economic development, phosphorus and nitrogen pollution have become the primary factors affecting water quality in the Yangtze River Basin. Nitrogen and phosphorus promote algal growth, and although they do not directly affect the spectral parameters, they can lead to the eutrophication of water bodies, ultimately harming the health of riverine ecosystems. Monitoring river water quality and exploring eutrophication mechanisms are of great significance for the management, control, and treatment of water bodies. The main target for assessing the environmental status of rivers, lakes, groundwater, and coastal waters is the regular detection of water pollutants and the causes of their presence [
1]. Satellite remote sensing technology has developed rapidly in recent years and has gradually been applied to water quality monitoring. From qualitative to quantitative remote sensing, the accuracy of water quality inversion has steadily improved, showing great potential for large-scale water quality monitoring. In water pollution monitoring, remote sensing technology can rapidly identify pollutant sources and provide reliable scientific guidance for prompt issue resolution. Remote sensing monitoring has distinct advantages, including prolonged observation periods, extensive geographical coverage, rapid monitoring cycles, and low cost, which effectively compensate for the limitations of conventional monitoring methods. Currently, remote sensing inversion of the optical properties related to water quality parameters is gradually maturing. Water quality inversion methods can be broadly categorized into four primary types: empirical, analytical, semi-empirical, and artificial intelligence methods. The empirical method develops a statistical regression model that relates measured water quality parameters to reflectance in the optimal band or combination of bands [
2]. Basic empirical methods struggle to meet the accuracy requirements for estimating water quality parameter concentrations, because the relevant spectral features are affected by a complex mix of water constituents, such as phytoplankton pigments and colored dissolved organic matter (CDOM). This complexity poses significant obstacles to water quality inversion [
3]. Empirical methods tend to have limited generalizability due to their dependence on specific regional and temporal conditions. In contrast, analytical methods employ bio-optical models and radiative transfer models to simulate the propagation of light through both the atmosphere and water column. This approach characterizes the intricate relationship between water quality parameters and radiation or reflectance. Gilerson et al. [
4] evaluated the performance of bio-optical inversion models in estimating the chlorophyll concentration using comprehensive synthetic datasets of reflectance spectra and intrinsic optical features associated with various water quality parameters, as well as field datasets. However, the complex composition of water bodies and radiative transfer processes necessitates the consideration of numerous factors, and this method relies on a substantial foundation of mathematical and physical principles. Additionally, the spectral resolution of most satellite sensors often does not align with the spectral resolution of near-ground measurements. Therefore, achieving quantitative remote sensing inversion based on the analytical model is still challenging and limited in practical applications [
5].
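As a minimal sketch of the empirical approach described above, the following Python fits an ordinary least-squares line between a band ratio and measured concentrations. The band choice (near-infrared over red), the function names, and all numbers are illustrative assumptions, not the models or data of the studies cited here.

```python
def band_ratio(red, nir):
    """Element-wise NIR/red reflectance ratio (the band choice is an assumption)."""
    return [n / r for r, n in zip(red, nir)]


def fit_linear(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Apply the fitted line to new band-ratio values."""
    return [slope * xi + intercept for xi in x]


# Made-up reflectances and in situ TP concentrations (mg/L), for illustration only.
red = [0.10, 0.12, 0.08, 0.15, 0.11]
nir = [0.05, 0.09, 0.03, 0.12, 0.07]
measured_tp = [0.08, 0.11, 0.06, 0.13, 0.10]

ratio = band_ratio(red, nir)
slope, intercept = fit_linear(ratio, measured_tp)
estimated_tp = predict(slope, intercept, ratio)
```

Because the fitted coefficients depend entirely on the samples used, a model of this kind would typically need to be refitted for each region and season, which is the generalizability limitation noted above.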
The semi-empirical method combines the empirical and analytical methods. Semi-empirical models can be built from a limited set of field data together with reflectance or radiance values. Some researchers have employed this method for water quality inversion with improved accuracy. For example, Peter et al. [
6] inverted chlorophyll a and phycocyanin using four semi-analytical algorithms from data acquired by CASI-2 (Compact Airborne Spectrographic Imager-2) and AISA (the Airborne Imaging Spectrometer for Applications), verifying the great potential of semi-analytical algorithms for remote sensing inversion. In general, the limited optical properties of total nitrogen (TN) and total phosphorus (TP) make it difficult to capture the straightforward linear relationship between concentrations of water constituents and spectral reflectance or radiance. However, these constituents do influence the spectral characteristics of the water body. When the underlying relationships of obtained data are difficult to describe, neural network (NN) models still work [
7]. Therefore, it becomes imperative to enhance the accuracy of inversion through the incorporation of artificial intelligence methods in remote sensing. Artificial intelligence methods differ from the above three modeling methods in that they can capture both complex linear and nonlinear relationships between features and target variables without requiring explicit equations. A large number of studies have been conducted on artificial intelligence methods, such as neural networks (NNs) and support vector machines (SVMs), for water quality inversion. Guo et al. [
8] compared three machine learning models: random forest (RF), SVMs, and NNs for inversion of the TP, TN, and chemical oxygen demand. Their results showed that the optimized machine learning model and image band selection have significantly improved the performance of these non-optically active parameters’ inversion.
Numerous scholars have extensively studied the remote sensing inversion of TN and TP. For instance, a water quality retrieval model of TP and TN was established and evaluated using the Landsat 5 Thematic Mapper (TM) data and multiple regression algorithms, where the TP concentration was retrieved within a 30% mean relative error and the TN concentration was retrieved within 20% [
9]. Their results showed that routine monitoring of water quality by remote sensing was feasible and effective. Baban [
10] applied regression equations to predict water quality parameters using TM data, including chlorophyll and TP. Isenstein et al. established a remote sensing inversion model for TP and various other water quality parameters in Lake Champlain. Their approach involved a multiple linear regression model that incorporated Landsat’s ETM+ data in conjunction with measured data [
11]. Yue et al. employed IKONOS data to construct a multiple linear regression model and two neural network models and inverted four water quality parameters, including TN and TP, in Huangshi Ci Lake. The neural network models produced more accurate inversion results than the multiple linear regression model [
12]. Liu et al. [
13] extracted 16 spectral parameters from multispectral images and constructed several inversion models for TN, SS, and TUB water quality parameters in the East Lake of Zhejiang. Although the R² of the exponential regression function model for TP inversion reached 0.7829, the model served mainly as a reference for guiding pollution control in small watersheds. Sun [
14] applied twelve machine learning algorithms to estimate the TN and TP concentrations in the Miyun Reservoir and analyzed the changes in the four water quality parameters at the spatial and temporal scales, along with classifying the range of water quality fluctuation. The findings indicated that the Miyun Reservoir consistently maintained a high overall water quality while achieving Class II water quality standards throughout the year. Du et al. conducted TP inversion in Taihu Lake using GOCI images and regression analysis, which achieved a peak accuracy of 0.898 and revealed variations in the TP concentrations across different seasons [
15]. With the development of computer technology, machine learning methods have been increasingly applied in remote sensing inversion. For instance, Zhang et al. developed a new Bayesian probabilistic neural network model and employed it to quantitatively predict water quality parameters such as TN and TP. The model demonstrated high inversion accuracies, achieving R² values of over 0.9. This advancement facilitated large-scale water quality monitoring and pinpointed pollutant sources in the Maozhou River in Guangdong Province, China [
16]. He et al. inverted the TN and TP parameters of a large inland reservoir in Jiangmen City, South China, based on the inversion model of the BP neural network with Landsat-8 images and evaluated the eutrophication of the reservoir [
17]. Karamoutsou and Psilovikos [
18] tested feed-forward deep neural networks (FF-DNNs) of dissolved oxygen (DO) with different structures, and all the well-trained DNNs gave satisfactory results. At present, the short-term prediction of water quality parameters is relatively mature. For example, Sentas et al. [
19] evaluated the short-term prediction capabilities of ARIMA, transfer function (TF), and artificial neural network (ANN) for the water body parameters of Thesaurus Dam, River Nestos, Greece. Some scholars have also conducted inversions of water quality in the mainstream of the Yangtze River. For instance, an empirical (regression-based) model was proposed, and the TN and TP concentrations were retrieved and analyzed in the Yangtze River [
20]. Zhao et al. proposed a joint inversion model based on Landsat-8 and Sentinel-2 and found that the accuracy of multi-source satellite inversion was higher than that of a single satellite [
21]. However, only one model was assumed for the whole mainstream of the Yangtze River, even though water quality differs among the upper, middle, and lower reaches.
The aim of this work is to quantify and analyze the long-term changes in key water quality parameters of the Yangtze River in order to better understand the trends in water quality changes and their impacts on the ecosystem. In this work, inversion models of TN and TP in the upper, middle, and lower reaches of the Yangtze River are constructed and evaluated from Landsat-8 and Sentinel-2 remote sensing images based on two machine learning regression methods: the random forest method and the neural network method. Subsequently, the more effective model is used to invert the TN and TP concentrations in the mainstream of the Yangtze River from January 2016 to December 2022. The spatial and temporal patterns of the water quality changes are further studied, and the possible mechanisms in the nutrient status within the mainstream of the Yangtze River are discussed. In
Section 2, the data and methods are described; the results and analysis are presented in Section 3; the impacts and discussion are given in Section 4; and the conclusions are summarized in Section 5.
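To make the model comparison step concrete, here is a hedged sketch of how two regression models could be scored on held-out samples using the coefficient of determination and the mean relative error. The selection rule (highest R²), the labels "rf" and "nn", and the numbers are assumptions for illustration; the criteria actually used in this work are those given in Section 2 and Section 3.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def mean_relative_error(observed, predicted):
    """Mean of |predicted - observed| / |observed| (observed must be non-zero)."""
    errors = [abs(p - o) / abs(o) for o, p in zip(observed, predicted)]
    return sum(errors) / len(errors)


def better_model(observed, predictions_by_model):
    """Return the name of the model with the highest R^2 on held-out samples."""
    return max(
        predictions_by_model,
        key=lambda name: r_squared(observed, predictions_by_model[name]),
    )


# Made-up validation data (mg/L); "rf" and "nn" are placeholder labels.
observed_tn = [1.8, 2.1, 1.6, 2.4, 2.0]
predictions = {
    "rf": [1.9, 1.9, 1.8, 2.1, 2.0],
    "nn": [1.8, 2.0, 1.7, 2.3, 2.0],
}
best = better_model(observed_tn, predictions)
```

Scoring on samples that were not used for fitting matters here, since a flexible model can reach a high R² on its own training data without generalizing.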
4. Impacts and Discussion
Many factors affect the water quality of the Yangtze River, including temperature, water level, flow, and rainfall. Based on the results from our water quality model inversion, we analyzed the monthly variations in concentrations of TN and TP across six sections of the Yangtze River, including Chongqing, Yichang in Hubei, Yueyang in Hunan, Wuhan in Hubei, Jiujiang in Jiangxi, and Anqing in Anhui (
Figure 7). We then conducted a correlation analysis between the monthly averaged temperature, water level, flow, and rainfall obtained at six sections and the monthly values of TN and TP in the Yangtze River, and the Pearson correlation coefficient was selected as the indicator (
Figure 8).
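The correlation analysis described above can be sketched as follows: a plain-Python Pearson correlation coefficient applied per section to paired monthly series. The helper names, the dictionary layout, and the sample values are hypothetical; only the use of the Pearson coefficient comes from the text.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    # A constant series has zero variance, so the division below would fail.
    return cov / math.sqrt(var_x * var_y)


def correlate_sections(factor_by_section, conc_by_section):
    """Correlate a factor with a concentration separately for each section."""
    return {
        name: pearson_r(factor_by_section[name], conc_by_section[name])
        for name in factor_by_section
    }


# Made-up monthly values for two sections, for illustration only.
temperature = {
    "Chongqing": [9.0, 14.0, 22.0, 28.0, 20.0, 11.0],
    "Yueyang": [6.0, 13.0, 23.0, 29.0, 19.0, 8.0],
}
tn = {
    "Chongqing": [2.0, 1.9, 1.8, 1.7, 1.9, 2.0],
    "Yueyang": [2.2, 2.0, 1.9, 1.8, 2.0, 2.1],
}
r_by_section = correlate_sections(temperature, tn)
```

The coefficient ranges from −1 to 1 and measures only linear association, so a low value does not rule out a nonlinear or lagged relationship between a hydrometeorological factor and TN or TP.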
4.1. Temperature
Temperature directly affects the activities of organisms in water and the health of ecosystems. Differences in temperature affect the density and flow of a water body, and temperature also affects the rate of chemical reactions in the water body, such as the conversion of nitrogen and phosphorus, which may have an effect on TN and TP concentrations. Temperature is often one of the factors that must be considered in water quality assessment and management to ensure the health and sustainability of water bodies.
From 2016 to 2022, the average temperatures in six typical sections were 18.5 °C, 16.9 °C, 18.2 °C, 18.0 °C, 18.1 °C, and 17.6 °C, respectively. As shown in
Figure 8, there is a negative correlation between temperature and TN concentration in the various sections of the Yangtze River mainstream across different months. The negative correlation is most pronounced in the Yueyang section, with a correlation coefficient of −0.28. Specifically, as the temperature increases, the TN concentration shows a downward trend. This negative correlation suggests that temperature may impact aquatic ecosystems, including biological activities and chemical transformation processes of nitrogen. High temperatures may promote microbial activities in the water, potentially leading to increased bioabsorption and degradation rates of TN and ultimately reducing the TN concentration. In contrast, a positive correlation was observed between temperature and TP at the sections. Higher water temperatures may foster the growth of algae and bacteria, thereby increasing the bioabsorption and release rates of TP. The TP concentrations in the Chongqing, Yichang, and Yueyang sections are most affected by temperature and follow the same trend as the temperature (
Figure 9). This shows that, in the upper reaches of the Yangtze River, the TP concentration in winter is significantly lower than that in the summer. The TN and TP concentrations in the middle and lower reaches of the Yangtze River are less affected by temperature than those in the upper reaches of the Yangtze River. The main reason may be that the upper reaches have greater seasonal temperature differences.
4.2. Water Level
The water level is relevant to the water quality of the Yangtze River, because it is one of the important physical parameters of the river and has direct and indirect effects on water quality. Rising and falling water levels can affect the solubility of oxygen in the water column. Changes in the water level can affect wastewater discharge and the dilution of wastewater. Changes in the water level of the Yangtze River can affect the functioning of wetlands and ecological habitats around the water body, which, in turn, affects the water quality.
The water level of the mainstream of the Yangtze River changes seasonally, reaching the highest value in the summer. As shown in
Figure 8, the correlation between the TN parameters and water level data in the six sections of the Yangtze River is low, with correlation coefficients consistently below 0.2. Among them, the correlation between TN and the water level is lowest in the Jiujiang section and highest in the Anqing section. The water level at Anqing Station is relatively stable, with a difference of about 10 m within a year. The TN content at Anqing Station was also relatively stable before 2019 (
Figure 10). However, this study reveals that the water level of the Yangtze River has a greater impact on TP than on TN. The Yichang section exhibits the highest correlation coefficient between TP and water level, with a value of 0.3, and the Yueyang section exhibits the lowest, with a value of 0.1. The concentrations of TN and TP show a weak positive correlation with the water level. Although the direct relationship between water level and TN and TP is limited, changes in the water level may indirectly affect the water quality through factors such as hydrodynamics and dissolved oxygen in the water body.
In addition, the rise in water level during floods will lead to an increase in TN and TP content, especially in the Jiujiang section. As shown in
Figure 10e, in the summers of 2016 and 2020, the water level in the Jiujiang section of the Yangtze River rose while the TN and TP rose sharply. This phenomenon may be attributed to flooding, which caused erosion and overflow of the adjacent land and introduced particulate matter rich in TN and TP into the water body. When the mainstream of the Yangtze River experiences high water levels and a weak water cycle, the input rate of TN and TP may exceed the treatment and discharge capacity of the water body, resulting in the accumulation of TN and TP. Complex hydrological conditions exist in the mainstream of the Yangtze River, including seasonal flood and drought cycles. These hydrologic changes may mask potential associations between water levels and TP, as flooding can cause large fluctuations in TP, and TP concentrations during droughts may be constrained by other factors.
4.3. Flow
The flow of the Yangtze River is intrinsically linked to the concentrations of TN and TP within its waters. Flow is closely related to water quality in the Yangtze River because it is a key controlling factor in river hydrodynamics and water quality dynamics. High flows can accelerate pollutant transport and flushing, which results in pollutants being flushed from the surface and riverbanks into the water column, negatively impacting the water quality. Therefore, changes in flow can play a key role in the transport and distribution of pollutants. To investigate the relationship between flow and TN and TP concentrations in the mainstream of the Yangtze River, we compared and analyzed the monthly average flow and water quality parameter concentrations in six Yangtze River sections from January 2016 to December 2022.
Figure 8c illustrates the correlation analysis between flow and TN concentration at the six selected sections of the Yangtze River. There is a positive correlation between flow and TN concentration. Notably, the Chongqing section exhibits the highest correlation coefficient, with a value of 0.36, indicating a moderate association between flow and TN. In addition, a sharp increase in flow can cause the TN concentration to rise rapidly. For instance, the flow of the Chongqing section increased sharply in the summers of 2018 and 2020, and the total nitrogen content also increased in a short period (
Figure 11). This outcome underscores the influence of flow on the TN concentration within the river. Our findings reveal that an increase in flow is generally accompanied by an increase in TN concentration. This observation suggests that flow may affect nitrogen cycling within aquatic ecosystems. With elevated flow, it is plausible that more nitrogen species are transported from the watershed into the river, resulting in elevated TN concentrations.
Moreover, a discernible correlation between flow and TP is evident. Notably, the correlation coefficients between flow and TP in the Chongqing, Yichang, and Anqing sections all exceed 0.2. This implies that flow may play a substantial role in the transport of TP within the water column. Increased flow over a short period of time can lead to a sharp increase in TP concentrations. For example, in the summer of 2020, the flow in the Yueyang and Jiujiang sections increased, and the TP concentration increased significantly (
Figure 11). It should be noted that the relationship between flow and TP exhibits seasonal and spatial variations. During flood events, flow increases, potentially leading to a greater influx of TP into the water column and, subsequently, higher TP concentrations. Conversely, in the dry season, when flow diminishes, a corresponding decrease in TP concentrations may ensue.
4.4. Rainfall
The amount of rainfall is related to the water quality of the Yangtze River, because rainfall is one of the important environmental factors affecting water quality. Rainfall can wash waste, asphalt, and other pollutants from the surface and carry them into the river. Heavy rainfall or flooding events can introduce large amounts of wastewater, sediment, and pollutants that can negatively impact the water quality. Rainfall can also increase the risk of soil erosion, transporting sediment and suspended solids into streams. This not only increases the turbidity of the water but also allows organic matter and pollutants to attach to sediment particles, further affecting the water quality.
Figure 12 presents the results of an analysis conducted on TN, TP, and rainfall data from the six sections of the Yangtze River. Rainfall in the middle and lower reaches of the Yangtze River is significantly higher than that in the upper reaches. Our examination reveals a discernible correlation between rainfall and TN concentration within the Yangtze River, with pronounced effects observed particularly in the upper reaches of the river. The correlation coefficients between rainfall and TN in the Chongqing and Yichang sections are 0.36 and 0.33, respectively. After rainfall events, the TN concentration usually rises for a short period. This phenomenon may reflect scouring by rainwater, which carries nitrogenous substances from the soil into the river. Therefore, rainfall events may be an important driver of short-term fluctuations in TN concentrations, and there is a positive correlation between rainfall and TN concentration. Among the six selected sections, except for Anqing, the correlation between rainfall and TP is lower than that between rainfall and TN. However, as with TN, the impact of rainfall on TP is greater in the upper reaches of the Yangtze River than in the middle and lower reaches. The correlation coefficients between rainfall and TP in the two upstream sections, Chongqing and Yichang, are 0.22 and 0.30, respectively. It was also found that the relationship between rainfall and TP showed seasonal and spatial variations. In the summer, the increase in rainfall may lead to more TP being transported from the watershed to the river, thus increasing the likelihood of elevated TP concentrations. In contrast, during the dry season, the impact of rainfall on TP concentrations tends to be less pronounced.
5. Conclusions
Based on Landsat-8 and Sentinel-2 satellite images, this study undertook the inversion of monthly TN and TP across the entire mainstream of the Yangtze River. The mainstream was divided into three reaches according to geographic location: the upstream, midstream, and downstream. For each reach, a remote sensing inversion model was established, incorporating principal component analysis for image fusion. Both RF regression and NN regression methods were used and assessed to construct the inversion models. The results demonstrated that NN regression was more suitable for the inversion of the TN and TP water quality parameters in the mainstream of the Yangtze River, with an R² value exceeding 0.67 for all models. The NN model was therefore applied to the fused images to invert the time series of TN and TP in the Yangtze River mainstream from January 2016 to December 2022, yielding comprehensive datasets of the spatial distribution of the TN and TP concentrations. This analysis revealed a decreasing trend in the monthly mean values of TN by 13.7% and of TP by 46.2% over this period.
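The text here does not state how the percentage decreases were computed. One plausible reading, sketched purely as an assumption, is to fit a linear trend to the monthly mean series and compare the fitted values at the first and last months. The function names and the synthetic series below are hypothetical and do not reproduce the study's data.

```python
def linear_trend(values):
    """Least-squares slope and intercept of values against month index 0..n-1."""
    n = len(values)
    mean_t = (n - 1) / 2
    mean_v = sum(values) / n
    stt = sum((t - mean_t) ** 2 for t in range(n))
    stv = sum((t - mean_t) * (v - mean_v) for t, v in enumerate(values))
    slope = stv / stt
    intercept = mean_v - slope * mean_t
    return slope, intercept


def relative_change_percent(values):
    """Percent change between the fitted first and last values of the series."""
    slope, intercept = linear_trend(values)
    start = intercept
    end = intercept + slope * (len(values) - 1)
    return 100.0 * (end - start) / start


# Synthetic 84-month series (January 2016 to December 2022) with a steady decline.
monthly_mean = [2.0 - 0.01 * month for month in range(84)]
change = relative_change_percent(monthly_mean)
```

Comparing fitted endpoints rather than the raw first and last months reduces the influence of seasonal peaks, such as the flood-season increases discussed in Section 4, on the reported change.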
In addition, an examination of the relationship between the monthly average TN and TP concentration changes and hydrometeorological factors such as temperature, water level, flow, and rainfall was conducted in six typical sections of the Yangtze River mainstem. The analysis results indicated that the upper reaches of the Yangtze River exhibited higher sensitivity to hydrometeorological factors than the middle and lower reaches, with rainfall and flow exerting more significant impacts on water quality in the mainstream. Among them, TN has a weak negative correlation with temperature and a positive correlation with water level, flow, and rainfall. TP has a positive correlation with temperature, water level, flow, and rainfall.
This study provides insights into the complexity and long-term trends of water quality problems in the Yangtze River. It is not only important for the management of the Yangtze River Basin but also provides a valuable reference for the management of other river systems and global ecosystems.