Article

Water Quality in the Ma’an Archipelago Marine Special Protected Area: Remote Sensing Inversion Based on Machine Learning

by Zhixin Wang 1,†, Zhenqi Zhang 1,†, Hailong Li 1, Hong Jiang 2, Lifei Zhuo 2, Huiwen Cai 1,*, Chao Chen 3 and Sheng Zhao 1
1 Marine Science and Technology College, Zhejiang Ocean University, Zhoushan 316000, China
2 Zhoushan Marine Workstation, State Oceanic Administration (SOA), Zhoushan 316000, China
3 School of Geographical Sciences and Geomatics Engineering, Suzhou University of Science and Technology, Suzhou 215009, China
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
J. Mar. Sci. Eng. 2024, 12(10), 1742; https://doi.org/10.3390/jmse12101742
Submission received: 12 August 2024 / Revised: 15 September 2024 / Accepted: 26 September 2024 / Published: 3 October 2024
(This article belongs to the Section Ocean Engineering)

Abstract

Due to the increasing impact of climate change and human activities on marine ecosystems, there is an urgent need to study marine water quality. The use of remote sensing for water quality inversion offers a precise, timely, and comprehensive way to evaluate the present state and future trajectories of water quality. In this paper, a remote sensing inversion model utilizing machine learning was developed to evaluate water quality variations in the Ma’an Archipelago Marine Special Protected Area (MMSPA) from a long time series of Landsat images. The concentrations of chlorophyll-a (Chl-a), phosphate, and dissolved inorganic nitrogen (DIN) in the sea area from 2002 to 2022 were inverted and analyzed, and the spatial and temporal characteristics of their variations were investigated. The results indicated that the random forest model could reliably predict Chl-a, phosphate, and DIN concentrations in the MMSPA. Specifically, the inversion results for Chl-a showed a coefficient of determination (R2) of 0.741, a root mean square error (RMSE) of 3.376 μg/L, and a mean absolute percentage error (MAPE) of 16.219%. Regarding spatial distribution, the concentrations of these parameters were notably elevated in the nearshore zones, especially in the northwest, and lower in the offshore and southeast areas. The nearshore regions with higher concentrations were predominantly close to aquaculture zones. Additionally, nutrients originating from land sources and transported via rivers such as the Yangtze River, together with human activities, have shaped this nutrient distribution. Over the long term, the water quality in the MMSPA has shown considerable interannual fluctuations during the past two decades. For a sanctuary, preserving superior water quality and a healthy ecosystem is very important, and efforts in protection, restoration, and management will demand considerable labor.
Remote sensing has demonstrated its worth as a proficient technology for real-time monitoring, capable of supporting the sustainable exploitation of marine resources and the safeguarding of the marine ecological environment.

1. Introduction

As the marine economy progresses, the contrast between marine exploitation and preservation has become more evident. Marine Special Protection Areas, characterized by their distinctive geographic attributes, ecosystems, and the coexistence of biotic and abiotic resources, require robust conservation strategies and management practices based on scientific principles. Establishing marine conservation zones is essential for preserving and maintaining the integrity and vitality of ecosystems. These protected areas play a critical role in conserving marine biodiversity and protecting marine species from extinction and endangerment [1]. Effective marine protected areas significantly enhance biodiversity conservation, promote sustainable utilization of fishery resources, and contribute to climate change mitigation [2]. Simultaneously, they enhance tourism and generate economic benefits. The Ma’an Archipelago Marine Special Protected Area (MMSPA) is located at the estuary of the Yangtze River [3] (Figure 1). The MMSPA is the second national marine special protection area in Zhejiang Province, succeeding Ximen Island in Yueqing. This region, with its multitude of islands, boasts a diverse topography that offers sanctuaries for marine life. The nutrients delivered by the Yangtze River contribute to high levels of primary productivity and a bountiful array of marine fishery resources, including shellfish, seaweed, cephalopods, crustaceans, echinoderms, and the distinctive Larimichthys crocea [3]. Moreover, it serves as a vital habitat and nursery for a variety of fish species, such as Sebastiscus marmoratus, Harpodon nehereus, Pampus argenteus, Nibea albiflora, and Thryssa kammalensis [3,4]. It is the center of the Zhoushan fishing ground and is surrounded by numerous islands and reefs. The marine resources are rich but are affected by a large influx of nutrients from land-based sources and human activities such as overfishing and industrialization. 
Seawater eutrophication is a significant issue, with frequent red tide occurrences impacting the stability of the ecosystem. According to statistics, over 30 red tides were recorded in the MMSPA between 2002 and 2022, as reported by the Zhejiang Marine Disaster Bulletin from 2002 to 2022 [5]. In recent years, Zhejiang Province has stepped up its efforts to combat coastal marine environmental pollution through programs like the “Blue Bay” and “Ecological Reefs” initiatives [6]. Real-time monitoring has been the most important and efficient water quality management method for a long time and will continue to play this role; however, the high costs, time consumption, limitations imposed by specific geographical and weather conditions, and unsuitability for large-scale monitoring are significant challenges. Remote sensing monitoring has presented a lot of benefits: it enables real-time coverage of vast areas, is economically viable, and leverages data from multiple sources. Enabled by satellite and aerial equipment, this technology swiftly surveys large regions, offering a cost-effective means of monitoring. Moreover, the integration of diverse data sets and computational algorithms boosts both the scope and the processing efficiency of the data gathered. Therefore, remote sensing demonstrates unique advantages when rapid and cost-effective data collection is required over vast areas.
Remote sensing has been applied to water quality monitoring studies since the 1970s [7,8]. Over the past decades, numerous studies have conducted inversion models for water quality parameters both domestically and internationally. These models have been applied to monitor environmental variation in the water quality of oceans, nearshore zones, and inland water bodies. Significant progress has been made in estimating optically active parameters such as chlorophyll-a (Chl-a), colored dissolved organic matter, turbidity, and transparency [9,10,11,12]. In recent years, advancements in artificial intelligence have increased the prominence of machine learning in remote sensing for water quality monitoring. Machine learning involves the process of training computers to construct models based on existing data and algorithms, which are subsequently used for analyzing or forecasting new datasets. This methodology boasts numerous benefits, such as adaptability, autonomous learning capabilities, enhanced efficiency, and robustness to errors. Machine learning methods are adept at revealing underlying relationships and trends within datasets, which positions them as highly beneficial for forecasting water quality [13,14,15]. In the work of Fang et al. [10], a random forest (RF) nonparametric regression prediction model was used to estimate suspended sediment concentration in the river section from Yichang to Chenglingji downstream of the Three Gorges Project dam. A support vector machine regression (SVR) algorithm was utilized to construct a model for predicting the concentration of chlorophyll-a (Chl-a) in the area adjacent to Poyang Lake [16]. A coupled HBGC-Convolutional Neural Network (CNN) model was developed to predict the daily surface Chl-a distribution in the Bohai Sea in China [17].
In this work, a remote sensing inversion model based on random forest (RF) has been developed in the MMSPA to investigate the spatial and temporal variation of chlorophyll-a (Chl-a), dissolved inorganic nitrogen (DIN), and phosphate. Chl-a is a crucial pigment involved in algal photosynthesis and serves as a key indicator of eutrophication. Its concentration significantly influences the photosynthetic capacity and primary productivity of algae, as well as the spectral properties of aquatic environments [18]. DIN includes ammonium, nitrate, and nitrite, which are essential nutrients influencing phytoplankton growth and can impact water quality [19]. Phosphate is another critical nutrient that, like DIN, affects phytoplankton productivity and can contribute to eutrophication [20]. The results offered essential insights for the sustainable management of the conservation zone, establishing the foundation for the rational use of natural resources and the implementation of comprehensive ecological studies within the sanctuary.

2. Data and Methods

2.1. Study Area

The MMSPA was formally established in May 2005 by the State Oceanic Administration [4]. The MMSPA is located in the southeast of the Yangtze River estuary, covering the area of 30°31′40″~30°56′40″ N and 122°38′40″~123°23′40″ E (Figure 1). It resides within the subtropical marine monsoon climate region, characterized by distinct seasonal variations. Coastal currents, the Kuroshio Current, and the cold-water masses of the Yellow Sea converge in this area, resulting in a water body with a wide temperature range, high salinity, and high transparency [21,22,23]. The runoff from the Yangtze River into the ocean delivers a significant volume of freshwater, organic material, and plankton to the MMSPA, enriching the area with nutrients. This locale sustains a vast array of more than 100 species of phytoplankton, zooplankton, and benthic life, which in turn fuels its elevated rate of primary productivity [24]. This region also offers excellent conditions for coastal aquaculture and is teeming with fish and shrimp resources, along with a wealth of wild intertidal shellfish and seaweed [3,24]. Moreover, the vicinity of the MMSPA is home to several rare and endangered marine species, including the Chinese white dolphin, the spotted seal, the Chinese sturgeon, and the giant salamander [4]. These species are of considerable ecological and economic importance and necessitate conservation efforts within these protected zones [25].

2.2. Data Sources and Processing

2.2.1. Remote Sensing Data Source and Image Pre-Processing

The remote sensing data were Landsat series satellite images obtained from the United States Geological Survey (USGS) official website. The data have been processed by the Landsat Ecosystem Disturbance Adaptive Processing System (LEDAPS) to correct for atmospheric effects [26]. The primary objective of this processing is to remove atmospheric effects from the satellite data so that the acquired remote sensing images are more accurate and reliable [27].
In order to invert the long-time series evolution of water quality in the MMSPA, the data spanned from 2002 to 2022, where the data from 2002 to 2013 came from the Landsat 5 satellite. The data from 2013 to 2022 came from the Landsat 8 satellite due to the decommissioning of Landsat 5 in June 2013. The orbit of Landsat 5 is a near-circular, sun-synchronous orbit with a revisit period of 16 days and a single-view scanning range of 185 km × 185 km. It carries the Multi-Spectral Scanner (MSS) and the Thematic Mapper (TM). There are seven bands in the TM, and the information on each band is shown in Table 1. The orbit of Landsat 8 is a near-polar, sun-synchronous orbit with a revisit period of 16 days. The scanning range of a single view is 185 km × 185 km. It carries an Operational Land Imager (OLI) and a Thermal Infrared Sensor (TIRS). The OLI has nine bands, including a panchromatic band with a resolution of 15 m and an imaging width of 185 km × 185 km. To avoid atmospheric absorption features, the OLI bands have been adjusted [28]. Additionally, two new bands have been added: the blue band (Band 1 Coastal) is mainly used for coastal zone observation, and the short-wave infrared band (Band 9 Cirrus) with strong water vapor absorption features can be used for cloud detection [29,30]. The information on each band is shown in Table 2. By integrating data from Landsat 5 and Landsat 8, we sought to capitalize on the extended temporal coverage provided by these satellites. This methodology facilitates the examination of trends and changes over a prolonged period, thereby offering a more comprehensive understanding of the study area [31,32]. The combination of these datasets ensures analytical continuity and addresses potential gaps that might arise from relying solely on data from one satellite mission.
For image selection, we ensured that the time interval between the acquisition of in-situ samples and the satellite image capture was as short as possible, typically within 5 days, to minimize temporal discrepancies. To account for the effects of cloud coverage, we selected only high-quality images, ensuring that cloud coverage did not exceed 30% over the study area. Given that Landsat images have undergone geometric and topographic correction, this study proceeds directly with the preliminary processing of remote sensing imagery, including radiometric calibration, atmospheric correction, and image cropping. Regarding atmospheric correction, we applied the FLAASH (Fast Line-of-sight Atmospheric Analysis of Spectral Hypercubes) atmospheric correction method to all satellite images. This method is widely used for correcting atmospheric effects, including scattering and absorption, thereby enhancing the accuracy of the retrieved water quality parameters [33]. All pre-processing was performed by ENVI 5.3 software.
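The two selection criteria described above (a short gap between sampling and image acquisition, and limited cloud cover) can be expressed as a simple filter. The following is a minimal sketch, not the authors' procedure: the record layout, the scene identifiers, and the dates are hypothetical, and the 5-day limit is treated as a hard cutoff even though the text says "typically".

```python
from datetime import date

MAX_GAP_DAYS = 5       # "typically within 5 days" in the text, used here as a hard limit
MAX_CLOUD_PCT = 30.0   # maximum cloud coverage over the study area, in percent


def select_scene(sample_date, scenes):
    """Return the eligible scene closest in time to the sampling date, or None."""
    eligible = [
        s for s in scenes
        if s["cloud_pct"] <= MAX_CLOUD_PCT
        and abs((s["acquired"] - sample_date).days) <= MAX_GAP_DAYS
    ]
    if not eligible:
        return None
    return min(eligible, key=lambda s: abs((s["acquired"] - sample_date).days))


# Hypothetical scene records, for illustration only.
scenes = [
    {"id": "scene_a", "acquired": date(2013, 8, 10), "cloud_pct": 12.0},
    {"id": "scene_b", "acquired": date(2013, 8, 14), "cloud_pct": 45.0},  # too cloudy
    {"id": "scene_c", "acquired": date(2013, 8, 26), "cloud_pct": 5.0},   # too far in time
]

best = select_scene(date(2013, 8, 13), scenes)
print(best["id"] if best else "no matching scene")
```

Filtering on cloud cover before picking the nearest date keeps a cloudy scene from being chosen just because it is closest in time.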

2.2.2. Sampling

The measured data were obtained from field investigations at the 0.5 m layer of the sampling sites in the MMSPA from 2007 to 2018 (Figure 1). The final dataset consisted of 288 samples, with most of the sampling conducted in August. Measured Chl-a concentrations ranged from 1.1 to 38.8 μg/L, with a multi-year average of 8.975 μg/L. DIN concentrations ranged from 0.285 to 0.71 mg/L, with a multi-year average of 0.533 mg/L. Phosphate concentrations ranged from 0.002 to 0.026 mg/L, with a multi-year average of 0.013 mg/L (Table 3). Water samples for laboratory analysis of Chl-a, DIN, and phosphate were collected with a 5 L acid-cleaned (1 N HCl) Niskin bottle. Subsamples for nutrients and Chl-a were filtered immediately through a GF/F filter membrane (0.70 μm) and stored at −20 °C for later laboratory analysis. Acetone was added to the subsamples for Chl-a analysis before storage. After returning to the lab, the water samples were thawed at room temperature; phosphate and DIN were analyzed with a SEAL QuAAtro39-SFA, and Chl-a was determined with a Trilogy (Turner Design Ltd., San Jose, CA, USA).

2.3. Model Construction

2.3.1. RF Model

RF [34], a combination classifier algorithm proposed by Breiman [35], is based on statistical learning theory. Multiple samples are drawn from the original sample using the bootstrap resampling method. Decision tree modeling is performed on each bootstrap sample, and the predictions of the individual decision trees are combined to arrive at the final prediction, by voting for classification or by averaging for regression [36,37]. The Bootstrap Aggregating (Bagging) method is combined with the Classification and Regression Trees (CART) algorithm, along with the random selection of features for attribute splitting. This combination allows for better tolerance of noise, exhibits good robustness, and is suitable for modeling small data samples [38,39].
The RF algorithm has significant advantages in classification accuracy, generalization error, and other performance aspects. It demonstrates high classification accuracy and efficiency, with well-developed theoretical and methodological foundations. Consequently, it has been widely used in many fields and has shown excellent results [35,36].
The RF algorithm is implemented in the following four steps [37,40,41]:
(1) Bootstrap sampling with replacement is used to draw n training sets from the original dataset; each contains, on average, roughly two-thirds of the distinct samples in the original dataset.
(2) A CART regression tree is built for each bootstrap training set, giving n unpruned decision trees that together constitute a “forest”. During the growth of each tree, the optimal attribute for each split is selected from m ≤ M randomly chosen attributes.
(3) The predictions of the n decision trees are combined: voting decides the category of a new sample in classification, and averaging gives the predicted value in regression.
(4) Following steps 1 to 3, a large number of decision trees are established, which together constitute a Random Forest (RF).
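Steps (1) to (3) can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: it uses a single input feature and one-split trees ("stumps"), so the random choice of m ≤ M attributes in step (2) is omitted, and it averages the tree outputs because the task here is regression. The function names and the toy data are assumptions made for the example.

```python
import random
from statistics import mean


def bootstrap_sample(rows, rng):
    """Step (1): draw len(rows) rows with replacement."""
    return [rng.choice(rows) for _ in rows]


def fit_stump(rows):
    """Step (2), simplified: a one-split regression tree on (x, y) pairs."""
    xs = sorted({x for x, _ in rows})
    best = None
    for lo, hi in zip(xs, xs[1:]):
        thr = (lo + hi) / 2
        left = [y for x, y in rows if x <= thr]
        right = [y for x, y in rows if x > thr]
        ml, mr = mean(left), mean(right)
        sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, thr, ml, mr)
    if best is None:  # every x identical: predict the mean of y
        m = mean(y for _, y in rows)
        return lambda x: m
    _, thr, ml, mr = best
    return lambda x: ml if x <= thr else mr


def fit_bagged(rows, n_trees, seed=0):
    """Build one stump per bootstrap sample (the 'forest')."""
    rng = random.Random(seed)
    return [fit_stump(bootstrap_sample(rows, rng)) for _ in range(n_trees)]


def predict(trees, x):
    """Step (3) for regression: average the predictions of all trees."""
    return mean(tree(x) for tree in trees)


# Toy data: a step from 2.0 to 8.0 at x = 1.0 (illustrative values only).
rows = [(0.1 * i, 2.0 if i < 10 else 8.0) for i in range(20)]
trees = fit_bagged(rows, n_trees=25)
print(predict(trees, 0.2), predict(trees, 1.8))
```

A bootstrap sample of size N contains about 63% of the distinct original rows on average; the rows left out of a given sample are the "out-of-bag" rows often used for internal error estimates.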

2.3.2. Model Training

Given the substantial data requirements for effective empirical statistical models and machine learning methods, we do not construct models based on individual sensors. Instead, we integrate the measured data with corresponding Landsat data and apply them simultaneously in the remote sensing inversion model.
To achieve a balance between sufficient training data and reliable model evaluation, 80% of the remote sensing band values as well as the corresponding measured Chl-a, phosphate, and DIN concentrations were allocated as the training dataset, and the remaining 20% for the validation.
Five-fold cross-validation was employed to identify the most suitable hyperparameters for each water quality parameter, ensuring the model’s reliability and consistency [42,43]. First, the dataset of remote sensing band values and the corresponding field measurements was divided into five subsets. In each iteration, four of these subsets were used for training and the remaining subset served as the validation set. The model was built with the training data, and its predictive performance was assessed against the validation data using the coefficient of determination (R2), the root mean square error (RMSE), and the mean absolute percentage error (MAPE). The model’s estimates and the associated prediction errors were recorded, and this process was repeated five times. Finally, the mean of the five prediction errors was computed to give an overall measure of predictive error.
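The five-fold procedure can be sketched as follows. This is a minimal illustration under stated assumptions: the shuffling seed, the function names, and the trivial "predict the training mean" model are hypothetical stand-ins for the RF model, chosen only so that the example runs without third-party libraries.

```python
import random
from statistics import mean


def kfold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs; each sample is validated exactly once."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint, near-equal subsets
    for i, val in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, val


def cross_validate(xs, ys, fit, error, k=5, seed=0):
    """Train on k-1 folds, score on the held-out fold, and average the k errors."""
    fold_errors = []
    for train, val in kfold_indices(len(xs), k, seed):
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        preds = [model(xs[i]) for i in val]
        fold_errors.append(error([ys[i] for i in val], preds))
    return mean(fold_errors)


def fit_mean(train_x, train_y):
    """Placeholder model: always predicts the mean of the training targets."""
    m = mean(train_y)
    return lambda x: m


def rmse(observed, predicted):
    n = len(observed)
    return (sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n) ** 0.5


xs = list(range(20))
ys = [3.0] * 20  # constant target, so the placeholder model is exact
cv_rmse = cross_validate(xs, ys, fit_mean, rmse)
print(cv_rmse)
```

With 288 samples, as in the sampling dataset, this split gives three folds of 58 samples and two of 57.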
The band values from the remote-sensing images were extracted at the locations corresponding to the latitude and longitude of the field-measured data points. The selected band values, as well as band combinations and their ratios, were used as the dataset for the models. The appropriate bands for analysis were selected by comparing the relationship between the remote sensing band values and the measured concentrations of the parameters (Table 4 and Table 5). We calculated the feature importance for each input band in the RF models. Remote sensing bands were initially used as feature inputs, and their significance was determined based on their correlation with water quality parameters. This analysis allowed us to identify the most influential bands for predicting Chl-a, DIN, and phosphate concentrations. For Chl-a prediction, the ratio of B3/B5 was selected due to its strong correlation with Chl-a concentrations. For DIN, the difference between B2 and B4 was used, and for phosphate, the B6/B5 ratio provided the highest predictive accuracy. The grid search method was utilized to optimize several parameters of the RF model, including the number of decision trees and the maximum number of features considered at each split, that is, the number of features evaluated at each node within a decision tree. Additionally, other critical parameters, such as the minimum number of samples required at a leaf node and the maximum depth of the trees, were fine-tuned. Each parameter combination was then subjected to cross-validation to assess its performance, and the combination yielding the best results was selected as the optimal solution within the searched grid. The grid search method evaluates every combination of candidate hyperparameter values and is one of the most commonly used parameter optimization algorithms [44,45]; it was used here to enhance model accuracy.
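The grid search over the four hyperparameters named above amounts to an exhaustive loop over all combinations. The sketch below is illustrative only: the candidate values are hypothetical (the searched grid is not listed in the text), the dictionary keys are names chosen for this example, and `toy_cv_error` is a placeholder for "train an RF with these settings and return its mean cross-validated error", with an arbitrarily placed minimum.

```python
from itertools import product


def grid_search(param_grid, cv_error):
    """Evaluate every combination and return (best_params, best_error)."""
    names = list(param_grid)
    best_params, best_error = None, float("inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        err = cv_error(params)
        if err < best_error:
            best_params, best_error = params, err
    return best_params, best_error


# Hypothetical candidate values, for illustration only.
param_grid = {
    "n_trees": [50, 80, 100],
    "max_features": [1, 2, 3],
    "min_samples_leaf": [1, 2, 5],
    "max_depth": [10, 40, 70],
}


def toy_cv_error(params):
    """Placeholder score with an arbitrary minimum; NOT a real model evaluation."""
    return (
        abs(params["n_trees"] - 80)
        + abs(params["max_depth"] - 40)
        + params["min_samples_leaf"]
        - params["max_features"]
    )


best_params, best_error = grid_search(param_grid, toy_cv_error)
print(best_params, best_error)
```

The cost grows multiplicatively with the number of candidate values (3 × 3 × 3 × 3 = 81 model fits per fold here), which is why the grid is usually kept coarse.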
Test samples were then put into the model to obtain the estimated Chl-a concentrations. The model was evaluated by comparing and analyzing the estimated and measured values, with its efficacy appraised through the use of R2, RMSE, and MAPE metrics. The construction method and steps for the phosphate and DIN models in the MMSPA were the same as those used for the RF model to invert Chl-a. The RF models in this study were constructed and trained using Python 3.8.
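The band combinations named in this section (B3/B5 for Chl-a, B2 − B4 for DIN, and B6/B5 for phosphate) can be computed per sampling location as shown below. This is a sketch under assumptions: the band labels are used exactly as written in the text, without assuming which sensor's numbering they follow, and the input values are hypothetical reflectances rather than data from the study.

```python
def band_features(bands):
    """Return the band combination used as the key predictor for each parameter."""
    if bands["B5"] == 0:
        raise ValueError("B5 is zero; the B3/B5 and B6/B5 ratios are undefined")
    return {
        "chl_a": bands["B3"] / bands["B5"],      # ratio B3/B5
        "din": bands["B2"] - bands["B4"],        # difference B2 - B4
        "phosphate": bands["B6"] / bands["B5"],  # ratio B6/B5
    }


# Hypothetical band values at one sampling location, for illustration only.
pixel = {"B2": 0.08, "B3": 0.06, "B4": 0.05, "B5": 0.02, "B6": 0.01}
features = band_features(pixel)
print(features)
```

Ratios and differences of bands are less sensitive to overall brightness changes between scenes than single-band values, which is one common reason for preferring them as model inputs.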

2.3.3. Model Accuracy

The efficacy of the machine learning models was appraised through the use of R2, RMSE, and MAPE metrics. The coefficient of determination (R2) signifies the level of association between the independent and dependent variables. The value of R2 ranges from 0 to 1. The closer the value is to 1, the higher the degree of explanation of the independent variable to the dependent variable, and the higher the percentage of the total variation caused by the independent variable [46]. The R2 measures the extent to which the model explains the variance in the outcome. While a higher R2 typically indicates a better fit of the model to the data, it is crucial to also consider additional performance metrics and cross-validation results to comprehensively assess model accuracy and robustness. The RMSE measures the standard deviation of the residuals or prediction errors. It is calculated as the square root of the average of the squared differences between observed and predicted values [47]. It serves to quantify the divergence of the observed values from the model’s predictions, thereby gauging the model’s performance. The RMSE is acutely responsive to outliers and errors within a dataset, thus offering a more precise indication of the model’s accuracy [48]. A smaller RMSE value indicates that the predictive model describes the experimental data with better accuracy. The MAPE is a relative error measure that defines the Mean Absolute Deviation (MAD) scale as a percentage unit. Absolute values are employed in the calculation of MAPE to prevent the potential cancellation of positive and negative errors. The normal range of MAPE is between 0% and 100%. The closer the value is to 0%, the more perfect the model is, and a MAPE greater than 100% indicates a poor model [49]. R2, RMSE, and MAPE were defined as follows [46,47,49] Equations (1)–(3):
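$$R^2 = 1 - \frac{\sum_{i=1}^{n} (x_i - y_i)^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \qquad (1)$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - y_i)^2} \qquad (2)$$
$$\mathrm{MAPE} = \frac{100\%}{n} \sum_{i=1}^{n} \left| \frac{x_i - y_i}{x_i} \right| \qquad (3)$$
where $x_i$ means the measured value, $y_i$ means the predicted value, $\bar{x}$ is the mean of the measured values, and $n$ is the number of samples.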
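The three metrics can be computed directly from paired measured and predicted values. The sketch below follows the definitions given here, with x as the measured and y as the predicted value; normalising MAPE by the measured value is the usual convention and is assumed. The sample numbers are made up for the example.

```python
from math import sqrt


def r2(measured, predicted):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_x = sum(measured) / len(measured)
    ss_res = sum((x - y) ** 2 for x, y in zip(measured, predicted))
    ss_tot = sum((x - mean_x) ** 2 for x in measured)
    return 1 - ss_res / ss_tot


def rmse(measured, predicted):
    """Root mean square error, in the units of the measured variable."""
    n = len(measured)
    return sqrt(sum((x - y) ** 2 for x, y in zip(measured, predicted)) / n)


def mape(measured, predicted):
    """Mean absolute percentage error; measured values must be non-zero."""
    n = len(measured)
    return 100.0 / n * sum(abs((x - y) / x) for x, y in zip(measured, predicted))


# Made-up values, for illustration only.
measured = [2.0, 4.0, 8.0]
predicted = [2.0, 5.0, 7.0]
print(r2(measured, predicted), rmse(measured, predicted), mape(measured, predicted))
```

Computed this way, R2 can fall below 0 when the predictions fit worse than the mean of the measured values, and MAPE is undefined if any measured value is zero.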
To provide a comprehensive overview of the methodology employed in this study, we present the technical workflow for the model construction process (Figure 2). This workflow details the essential steps, ranging from data acquisition and pre-processing to model development, validation, and spatial distribution analysis. Figure 2 illustrates the sequence of tasks and the interactions among the different components of the study.

3. Results and Discussion

3.1. Validation of Chl-a, Phosphate and DIN Inversion Models

The R2, RMSE, and MAPE were used to evaluate the accuracy of the RF model on inverting Chl-a, phosphate, and DIN concentrations (Figure 3, Figure 4 and Figure 5).
The optimal parameter configuration for the Chl-a RF model was determined to include 50 decision trees, with a maximum of three features considered at each split, a minimum leaf sample size of one, and a maximum tree depth of 70. With these settings, the model demonstrated strong performance in estimating Chl-a concentrations: the R2 between the estimated and measured Chl-a values of the MMSPA was 0.741, the RMSE was 3.376 μg/L, and the MAPE was 16.219%. For the DIN model, the optimal parameter combination was 50 decision trees, with a maximum of three features considered at each split, a minimum leaf sample size of one, and a maximum tree depth of 10. For the phosphate model, the optimal parameters were 80 decision trees, three maximum features per split, a minimum leaf sample size of one, and a maximum tree depth of 40. The performance of the RF model was also evaluated based on the estimation results for phosphate and DIN. The models tend to underestimate higher concentrations of both Chl-a and DIN, primarily due to the limited number of data points in the high-concentration range, which restricts their ability to accurately predict extreme values. Additionally, environmental factors that could affect the predictions may not have been fully captured by the models, further contributing to the observed discrepancies. In contrast, the models perform well for low to medium concentrations, where more data points were available for training. For phosphate, the error distribution remains relatively uniform, with a slight tendency toward overestimation at higher concentrations. Despite these limitations, the models demonstrate overall reliable performance. Specifically, for phosphate, the R2 was 0.700, with an RMSE of 0.004 mg/L and a MAPE of 28.170%; for DIN, the R2, RMSE, and MAPE were 0.747, 0.016 mg/L, and 2.107%, respectively.
Therefore, the RF model can be used to estimate the Chl-a, phosphate, and DIN concentrations in the MMSPA sea area reliably.
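For reference, the reported settings and validation scores can be collected in one place, and each RMSE can be put in context by dividing it by the multi-year mean of the measured concentrations given in Section 2.2.2. The sketch below makes two assumptions: the setting names follow the argument names of scikit-learn's `RandomForestRegressor` (the text only states that Python 3.8 was used), and the phosphate and DIN scores are assigned in the order in which the two parameters are named in the text.

```python
# Hyperparameters reported for each RF model (key names are an assumption).
RF_SETTINGS = {
    "chl_a":     {"n_estimators": 50, "max_features": 3, "min_samples_leaf": 1, "max_depth": 70},
    "din":       {"n_estimators": 50, "max_features": 3, "min_samples_leaf": 1, "max_depth": 10},
    "phosphate": {"n_estimators": 80, "max_features": 3, "min_samples_leaf": 1, "max_depth": 40},
}

# Reported validation scores; RMSE in μg/L for Chl-a and mg/L for the nutrients.
SCORES = {
    "chl_a":     {"r2": 0.741, "rmse": 3.376, "mape_pct": 16.219},
    "phosphate": {"r2": 0.700, "rmse": 0.004, "mape_pct": 28.170},
    "din":       {"r2": 0.747, "rmse": 0.016, "mape_pct": 2.107},
}

# Multi-year means of the measured concentrations, in the same units as the RMSE.
MEASURED_MEAN = {"chl_a": 8.975, "phosphate": 0.013, "din": 0.533}


def relative_rmse_pct(parameter):
    """RMSE as a percentage of the multi-year mean measured concentration."""
    return 100.0 * SCORES[parameter]["rmse"] / MEASURED_MEAN[parameter]


for name in SCORES:
    print(f"{name}: relative RMSE = {relative_rmse_pct(name):.1f}%")
```

The relative RMSE is a derived figure, not one reported in the text; it is included only to make the three error values comparable across different units and concentration ranges.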

3.2. Spatial and Temporal Evolution of Water Quality in the MMSPA from 2002 to 2022

3.2.1. Spatial and Temporal Evolution of Chl-a

Ecosystems typically require extended periods to respond to environmental changes. Analyzing data at five-year intervals can better capture the response of ecosystems to long-term environmental changes, rather than short-term fluctuations, thereby providing a clear reflection of overall trends. Additionally, the data at five-year intervals balances the depth and breadth of research, allowing for optimized resource allocation and improved research efficiency while ensuring the reliability of the results.
The inversion model presented the spatial distribution of the water parameters very well. There was significant spatial variation for the Chl-a concentration, where it was high in the western and northern parts of the MMSPA, and low in the southern part of the area. The degree of variation exhibited distinct area-specific patterns. In the nearshore region, Chl-a concentrations fluctuated considerably, ranging from 0 to 10 μg/L, whereas spatial differences were more subdued in the offshore region (Figure 6). The reason for this phenomenon may be that the MMSPA is located in the Yangtze River estuary, where a large amount of nutrients from the Yangtze River are transported down to the East China Sea. This effect is particularly pronounced in August when the Yangtze River discharge is considerably high. The northwestern part of the study area, being in closer proximity to the river mouth, consequently experiences significantly higher nutrient concentrations during this period. This nutrient influx promotes the growth and reproduction of plankton and algae in the protected area, leading to frequent occurrences of red tides [50], resulting in a high Chl-a concentration. Additionally, the aquaculture in the nearshore contributed to Chl-a concentrations. The filter-feeding activities of bivalves in these areas removed phytoplankton from the water column, which generally leads to a reduction in Chl-a concentrations due to the decreased abundance of phytoplankton [51,52]. However, bivalves also lead to increased nutrient levels through the excretion of dissolved nutrients such as ammonia nitrogen. This additional nutrient input can stimulate phytoplankton growth, potentially leading to high Chl-a concentrations [52]. Furthermore, the biological deposition activities of bivalves accelerate the sedimentation of suspended particles and alter the mineralization and regeneration processes of elements in the sediments. 
These changes also can impact the phytoplankton growth and Chl-a concentrations in complex ways [53]. Thus, the combined effects of these processes contribute to the significant variability in Chl-a concentrations observed within the nearshore area.
There was a significant decreasing trend in the five-year Chl-a concentration during the summer from 2002 to 2007, followed by an increasing trend from 2007 to 2022 (Figure 6). The highest Chl-a concentration was recorded in the summer of 2002, with an average value of 15.97 μg/L (Figure 6a). The lowest Chl-a concentration was observed in the summer of 2007, with an average value of 11.64 μg/L (Figure 6b). Compared with the other intervals, the change in Chl-a concentration in the MMSPA sea area from 2002 to 2007 was the most significant: the mean Chl-a concentration decreased by 4.33 μg/L. From 2007 to 2013, there was an overall upward trend in the Chl-a concentration within the MMSPA, with an average increase of 1.47 μg/L. Between 2013 and 2018, the MMSPA exhibited a slight increasing trend in Chl-a concentration, with an average increase of 0.45 μg/L. During the period from 2018 to 2022, the Chl-a concentration in the MMSPA generally rose, with an average increase of 2.04 μg/L. The summer mean Chl-a concentrations from 2002 to 2022 therefore rank as follows: 2002 > 2022 > 2018 > 2013 > 2007.
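The reported changes can be chained to check that they are consistent with the stated ranking. In the short sketch below, only the 2002 and 2007 means and the four interval changes come from the text; the 2013, 2018, and 2022 means are implied values obtained by addition, not figures reported by the authors.

```python
YEARS = [2002, 2007, 2013, 2018, 2022]
START_MEAN = 15.97                    # reported summer 2002 mean, μg/L
CHANGES = [-4.33, 1.47, 0.45, 2.04]   # reported change over each interval, μg/L

# Accumulate the changes to obtain the implied mean for each year.
means = [START_MEAN]
for delta in CHANGES:
    means.append(round(means[-1] + delta, 2))

by_year = dict(zip(YEARS, means))
ranking = sorted(by_year, key=by_year.get, reverse=True)

print(by_year)
print(" > ".join(str(year) for year in ranking))
```

The implied 2007 mean matches the reported 11.64 μg/L, and the resulting order agrees with the ranking given in the text.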
Red tides are typically the main factor impacting Chl-a concentration in spring and summer. According to the statistics, more than 30 red tides occurred in the MMSPA between 2002 and 2022, and more than 10 red tides occurred in August. Specifically, there were nine red tides in 2013, four in 2018, and three in 2022, according to the Zhejiang Marine Disaster Bulletin [5]. Significant large areas of high Chl-a concentrations were observed in the study area during the summers of 2002, 2018, and 2022, indicating the possibility of red tides (Figure 6a,d,e). Upon the occurrence of a red tide, there is a rapid proliferation of diatoms and dinoflagellates, in conjunction with algae carried by current, leading to a significant increase in Chl-a concentration [54,55,56,57]. In recent years, the proliferation of human populations in coastal regions, coupled with swift industrial and agricultural expansion and urbanization processes, has exacerbated the eutrophication of water bodies in the proximal waters of the MMSPA. The runoff from urban areas can introduce a wide range of pollutants, including nitrogen and phosphorus compounds, into aquatic systems, leading to nutrient accumulation and eutrophication [58]. Eutrophication not only negatively impacts the marine ecosystem but also may lead to a significant increase in the number of harmful algal blooms [59,60,61].

3.2.2. Spatial and Temporal Evolution of Phosphate

A comparative analysis of the summer phosphate concentration data at five-year intervals from 2002 to 2022 indicates a clear upward trend in phosphate concentrations in the MMSPA (Figure 7). The highest concentration occurred in August 2022, with an average of 0.019 mg/L in the study area (Figure 7e). In contrast, the lowest phosphate concentration was recorded in August 2007, with an average of 0.0136 mg/L (Figure 7b).
From August 2002 to August 2007, the five-year phosphate concentration in the MMSPA decreased significantly, and it then increased from 2007 to 2022. Specifically, from August 2002 to August 2007, the average phosphate concentration fell by 0.001 mg/L. Between August 2007 and August 2013, it increased by 0.0034 mg/L. From August 2013 to August 2018, it increased by 0.001 mg/L. From August 2018 to August 2022, the phosphate concentration in the MMSPA continued its upward trend, with an average increase of 0.001 mg/L. Ranked by August mean phosphate concentration, the five years are ordered as follows: 2022 > 2018 > 2013 > 2002 > 2007. The average August phosphate concentration in the MMSPA from 2002 to 2022 remained within seawater quality standard levels I and II [62].
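The comparison with the seawater quality standard can be illustrated with a short Python sketch. The threshold values are an assumption: they are the active phosphate limits commonly quoted for GB 3097-1997 and should be verified against the standard itself [62]. The 2007 and 2022 means are reported in the text, while the other years are back-calculated from the reported five-year changes.

```python
# Upper limits (mg/L) for active phosphate, ASSUMED from GB 3097-1997.
# Levels II and III share one limit in that standard, so the label "II"
# below also covers level III.
PHOSPHATE_LIMITS = [("I", 0.015), ("II", 0.030), ("IV", 0.045)]


def classify(value_mg_per_l, limits=PHOSPHATE_LIMITS):
    """Return the first (cleanest) level whose upper limit is not exceeded."""
    for level, upper in limits:
        if value_mg_per_l <= upper:
            return level
    return "worse than IV"


# August means (mg/L): 2007 and 2022 are reported directly; 2002, 2013, and
# 2018 are derived from the reported changes between survey years.
phosphate_means = {2002: 0.0146, 2007: 0.0136, 2013: 0.0170, 2018: 0.0180, 2022: 0.0190}

levels = {year: classify(value) for year, value in phosphate_means.items()}
print(levels)
```

Under these assumed limits, 2002 and 2007 fall in level I and the later years in level II, which is consistent with the statement above.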
Phosphate concentrations show varying patterns across the monitoring stations. Offshore waters generally exhibit stable concentrations with minimal variation. In contrast, the coastal waters experience considerable fluctuations in phosphate concentration, ranging from 0.001 to 0.014 mg/L (Figure 7). These fluctuations are predominantly influenced by physical factors, including the dilution effects of the Yangtze River freshwater inflow, as well as wind, wave action, and surface currents. After leaving the estuary, the diluted water from the Yangtze River spreads to the southeast and divides into two branches. One branch moves southeast, hugging the shore and heading south, and is mostly confined to the sea area near Hangzhou Bay and the Zhoushan Islands, which results in a relatively high nutrient concentration in the sea area near the MMSPA [59,61,63]. Furthermore, phosphate concentration is influenced by the growth, reproduction, and aggregation density of phytoplankton, benthos, and algae [64]. In summer, the influx of nutrients from terrestrial sources peaks, and phytoplankton exhibit very intense photosynthetic activity. The average wind speeds are comparatively mild in the summer, yet upwelling remains intense [65,66]. Variations in wind velocity can stir up and re-suspend sediments, thereby liberating trapped nutrients. Moreover, the discharge of nutrients from point sources such as outfalls, the desorption of nutrients from particulate matter within the water column, and the breakdown of phosphorus-rich organic matter in the water or sediment can also lead to elevated phosphate concentrations [67,68]. The MMSPA is characterized by marine aquaculture activities, and its human activity development index is notably high [69]. Both terrestrial runoff and aquaculture operations, particularly through the discharge of organic waste and unconsumed feed, lead to elevated nutrient concentrations in the surrounding waters.
This phenomenon has been well-documented in various studies. For example, research from the EU region demonstrated that the expansion of aquaculture has significantly impacted marine water quality [70], showing that human activities, especially aquaculture, can markedly alter nutrient concentrations. This provides a clear empirical case for distinguishing between natural and anthropogenic nutrient inputs, similar to what is observed in the nearshore areas of MMSPA. We attribute the high nutrient levels observed in the nearshore areas of MMSPA to these human activities.

3.2.3. Spatial and Temporal Evolution of DIN

This study investigates the distribution of DIN concentrations during the summer months at five-year intervals from 2002 to 2022. A significant overall increase in DIN concentration in the MMSPA was observed (Figure 8). This trend aligns with the conclusions of related studies, which indicate that DIN concentration in the Yangtze River estuary and its surrounding waters has been rising over the years [71,72]. Within the study period, the peak concentration was recorded in August 2007 (Figure 8b), averaging 0.5788 mg/L. Conversely, the minimum DIN concentration was noted in August 2013 (Figure 8c), averaging 0.5654 mg/L.
Between August 2002 and August 2022, the DIN concentration fluctuated considerably, increasing gradually in 2002–2007 and 2013–2022 and decreasing in 2007–2013. From August 2002 to August 2007, the average DIN concentration rose by 0.0016 mg/L. From August 2007 to August 2013, it decreased by 0.0134 mg/L. From August 2013 to August 2018, it rose by 0.0031 mg/L, and between August 2018 and August 2022 it rose by a further 0.0027 mg/L. Ranked by August mean DIN concentration, the five years are ordered as follows: 2007 > 2002 > 2022 > 2018 > 2013. Between 2002 and 2022, the average DIN concentration in August not only exceeded seawater quality standard level I but, in every case, was also worse than seawater quality standard level IV [62].
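The DIN figures above can be cross-checked in the same way. The following Python sketch rebuilds the August means from the reported 2007 peak and the reported changes, then expresses each mean as a multiple of the level IV limit. The limit of 0.50 mg/L is an assumption (the inorganic nitrogen value commonly quoted for GB 3097-1997) and should be confirmed against the standard [62]; all variable names are illustrative.

```python
# Reported anchor value and five-year changes (mg/L), taken from the text.
peak_2007 = 0.5788
change_2002_to_2007 = 0.0016
later_changes = [(2013, -0.0134), (2018, +0.0031), (2022, +0.0027)]

# Rebuild the August means; 2002 is obtained by undoing the 2002-2007 rise.
din_means = {2002: round(peak_2007 - change_2002_to_2007, 4), 2007: peak_2007}
current = peak_2007
for year, delta in later_changes:
    current = round(current + delta, 4)
    din_means[year] = current

# ASSUMED level IV upper limit for inorganic nitrogen in GB 3097-1997 (mg/L).
LEVEL_IV_LIMIT = 0.50

# Ratio of each August mean to the assumed level IV limit (> 1 means exceeded).
exceedance = {year: round(value / LEVEL_IV_LIMIT, 3) for year, value in din_means.items()}

for year in sorted(din_means):
    print(year, din_means[year], exceedance[year])
```

The rebuilt 2013 mean equals the reported minimum of 0.5654 mg/L, and under the assumed limit every August mean lies above level IV.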
Spatially, DIN concentrations varied more in the nearshore areas than in the offshore areas (Figure 8). The DIN concentration in the nearshore areas varied widely, with a range of 0.004–0.01 mg/L. In general, the DIN concentration in the MMSPA was relatively high. This high concentration may be primarily attributed to the large number of shellfish aquaculture areas, such as those for purple mussels and deep-water net-pen aquaculture. The purple mussels grow vigorously and metabolize rapidly, generating a large amount of nitrogenous waste in August [73]. Additionally, purple mussels mostly grow in offshore and inland bays with gentle currents and poor seawater exchange, which can easily lead to DIN enrichment and increase the DIN concentration in seawater. The high nutrient concentration and rich phytoplankton communities make the area a major marine aquaculture zone [74,75]. At the same time, with the increase in population and economic development around Zhoushan, the discharge of agricultural, industrial, and domestic sewage has increased. This discharge contains inorganic nitrogen compounds, such as ammonia, that contribute to DIN. The entry of these substances into the ocean raises the DIN concentration of the water body.

3.3. Comparative Analysis of Machine Learning in Remote Sensing Inversion

The RF model is widely applied in the field of water quality inversion. In this paper, it was used to construct an inversion model for water quality parameters in the MMSPA and proved suitable for inverting water quality in this region. Several Chinese scholars inverted the Chl-a of Xingkai Lake, and the results showed that the accuracy of the RF model was higher than that of ten other machine learning models [76]. For estuarine waters, an ensemble machine learning model was developed to estimate coastal water quality parameters using remote sensing imagery. That study estimated Chl-a, turbidity, and dissolved oxygen (DO) levels in Shenzhen Bay, and the results indicated that the degradation of water quality in Shenzhen Bay is predominantly driven by terrestrial pollutant inputs from the estuary [77]. Additionally, researchers employed machine learning techniques to monitor water quality in the Menor Sea using Sentinel-2 satellite data. They applied RF, Support Vector Machine (SVM), Artificial Neural Network (ANN), and Deep Neural Network (DNN) algorithms under various feature selection scenarios, incorporating multiple spectral indices to estimate Chl-a concentrations. That study revealed that when all available predictors were utilized, the RF, SVM (Radial), and DNN models demonstrated superior performance, achieving RMSE values of 0.82, 0.82, and 1.76 mg/m³, respectively [78]. Compared with results from the Great Barrier Reef Marine Park (GBRMP), our models for the MMSPA exhibit relatively good performance. While both regions face unique challenges, our approach in the MMSPA demonstrates comparable accuracy with generally lower error rates. This indicates that our model performs effectively within its specific context, similar to the performance observed in the GBRMP [79]. Compared to lakes, the optical properties of the ocean are more complex and dynamic [80].
The ocean contains more dissolved organic matter and suspended particulate matter, especially in areas like the MMSPA, where diluted water affects the accuracy of water quality inversion. Furthermore, the inversion results of this research were influenced by activities such as marine farming and shipping. A thorough examination of the existing literature suggests that the RF model used in this study provides a high degree of accuracy and dependability for inverting water quality parameters [81,82].
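The accuracy comparisons in this subsection are expressed mainly as RMSE, and the abstract of this paper also reports R2 and MAPE. As a reminder of how these three scores are obtained, a minimal Python sketch follows. The function names and the sample values are hypothetical and are not taken from the study's data.

```python
import math


def r2_score(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean square error, in the units of the observations."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mape(observed, predicted):
    """Mean absolute percentage error (%); observations must be non-zero."""
    ratios = [abs((o - p) / o) for o, p in zip(observed, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical in situ and retrieved Chl-a values (ug/L), for illustration only.
observed = [10.0, 12.0, 15.0, 18.0, 20.0]
predicted = [11.0, 11.0, 16.0, 17.0, 21.0]

r2 = r2_score(observed, predicted)
error_rmse = rmse(observed, predicted)
error_mape = mape(observed, predicted)
print(round(r2, 3), round(error_rmse, 3), round(error_mape, 3))
```

Because RMSE keeps the units of the measured parameter, values reported for different parameters or regions are comparable only when the units and concentration ranges are similar; R2 and MAPE are unitless.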
In addition to RF, SVMs, long short-term memory networks, and least squares algorithms have also been used to model the inversion of water quality parameters [83,84]. The effectiveness of different methods varies in practical research. For modeling with extreme values, algorithms that average the output variables, such as averaged neural networks, RF, Cubist, and K-nearest neighbors, are not recommended. Different machine learning algorithms have different requirements for the number of training samples, so the selection of algorithms should also be based on the sample size. SVM is not suitable for processing large-scale data, whereas deep learning algorithms, such as deep belief networks, require a large number of samples for training [85]. Considering the measured data, the remote sensing data, and the actual conditions of the sea area together, the RF method was found to be more suitable for this study.
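The caution about extreme values can be illustrated with a small Python sketch. It uses a toy one-dimensional K-nearest-neighbors regressor written for this example, not any library implementation, and the training data are hypothetical. Because the prediction is an average of training targets, it can never exceed the largest target seen during training.

```python
def knn_predict(train_x, train_y, query_x, k=3):
    """Average the targets of the k training points closest to query_x."""
    order = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - query_x))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / k


# Hypothetical training set: a predictor (e.g. a band ratio) and a target that
# grows linearly with it, y = 2x, sampled for x = 1..10.
train_x = [float(i) for i in range(1, 11)]
train_y = [2.0 * x for x in train_x]

# A query far outside the training range; the underlying relation gives 50.
extreme_query = 25.0
prediction = knn_predict(train_x, train_y, extreme_query)

print(prediction, max(train_y))
```

Here the three nearest neighbors are the three largest training samples, so the prediction stays below the training maximum of 20 even though the underlying relation would give 50. Tree ensembles that average leaf values are bounded by the training targets in the same way.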

4. Conclusions

A long-term (20-year) series of Landsat remote sensing images and measured data was used to build an RF inversion model for the distribution of Chl-a, phosphate, and DIN concentrations in the MMSPA, and the temporal and spatial variations in water quality over these 20 years were characterized. The bootstrap sampling strategy in the RF model mitigated over-fitting and reduced prediction errors, and the model proved to be versatile and robust. It can be applied to the remote sensing inversion of water quality parameters such as Chl-a, phosphate, and DIN in the MMSPA. The modeled concentrations of Chl-a, phosphate, and DIN in the MMSPA decreased both from nearshore to offshore and from northwest to southeast. Additionally, the relevant water quality parameters showed an inter-annual upward trend, presumably driven by land-based inputs and human activities.
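The bootstrap sampling mentioned above can be sketched in a few lines of Python. This is a generic illustration of drawing one bootstrap sample and identifying its out-of-bag observations, under the assumption of a hypothetical training set of 1000 samples; it is not the configuration used in this study.

```python
import random


def bootstrap_indices(n_samples, rng):
    """Draw n_samples indices with replacement, as done for each tree."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]


rng = random.Random(42)   # fixed seed so the sketch is repeatable
n_samples = 1000          # hypothetical number of training samples

in_bag = bootstrap_indices(n_samples, rng)

# Samples never drawn are "out of bag" for this tree and can serve as an
# internal validation set for it.
out_of_bag = sorted(set(range(n_samples)) - set(in_bag))
oob_fraction = len(out_of_bag) / n_samples

print(len(set(in_bag)), len(out_of_bag), round(oob_fraction, 3))
```

On average, roughly 37% of the samples are left out of each bootstrap draw, so every tree is trained on a different subset; averaging many such trees reduces variance, which is the mechanism by which bagging limits over-fitting.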
The water quality monitoring model developed in this study has significant reference value for real-time monitoring of the water environment in the MMSPA. Furthermore, it holds importance for improving the quality of the marine water environment. The research provides valuable insights for policy-making and marine conservation by offering precise assessments of water quality parameters. These findings can inform targeted pollution control measures, enhance ecosystem health monitoring, and guide effective aquaculture management. By incorporating remote sensing data into regular monitoring practices, this study contributes to the development of sustainable management strategies and policy adjustments. Additionally, it supports public awareness and education initiatives aimed at promoting sustainable practices and fostering community engagement in conservation efforts. However, nearshore waters that comply with seawater quality standard level II exhibit complex optical properties, making it necessary to study the spectral characteristics of different water quality parameters more comprehensively. In the modeling process, it is also essential to improve the inversion accuracy, versatility, and stability of the model. Therefore, future research should focus on further enhancing the water quality monitoring model to achieve more comprehensive and accurate marine water environment monitoring.

Author Contributions

Conceptualization, H.C.; Methodology, Z.Z. and C.C.; Software, Z.Z.; Validation, Z.W. and H.L.; Resources, H.J., L.Z. and H.C.; Data curation, H.J. and L.Z.; Writing—original draft, Z.W. and Z.Z.; Writing—review and editing, Z.W., H.C. and S.Z.; Visualization, Z.W.; Supervision, S.Z.; Funding acquisition, H.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Science and Technology Bureau of Zhejiang (2023C03120, 2022C02040).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Dataset available on request from the authors.

Acknowledgments

We thank the Zhoushan Marine Environment Monitoring and Forecasting Center for providing the investigation data.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Ma, C.; Zhang, X.; Chen, W.; Zhang, G.; Duan, H.; Ju, M.; Li, H.; Yang, Z. China’s special marine protected area policy: Trade-off between economic development and marine conservation. Ocean. Coast Manag. 2013, 76, 1–11.
  2. Arneth, A.; Leadley, P.; Claudet, J.; Coll, M.; Rondinini, C.; Rounsevell, M.D.; Shin, Y.J.; Alexander, P.; Fuchs, R. Making protected areas effective for biodiversity, climate and food. Glob. Chang. Biol. 2023, 29, 3883–3894.
  3. Wang, Y.; Li, X.; Zhao, X.; Chen, J.; Wang, Z.; Chen, L.; Zhang, S.; Wang, K. Assessment of fish diversity in the Ma’an Archipelago special protected area using environmental DNA. Biology 2022, 11, 1832.
  4. Yu, X.; Dong, Y. Local practice of marine protected areas legislation in China: The case of Zhoushan. Mar. Policy 2022, 141, 105084.
  5. Zhejiang Marine Disaster Bulletin; Department of Natural Resources of Zhejiang Province: Hangzhou, China, 2002–2022.
  6. Liu, L.; Zhang, X.; Chen, Z.; Zhou, H.; Li, C.; Chen, Y. Assessment of ecological sustainability for international bays in the context of common prosperity—A case study of sanmen bay in Zhejiang province. Front. Environ. Sci. 2022, 10, 944936.
  7. Peterson, K.T.; Sagan, V.; Sloan, J.J. Deep learning-based water quality estimation and anomaly detection using Landsat-8/sentinel-2 virtual constellation and cloud computing. GISci. Remote Sens. 2020, 57, 510–525.
  8. Yang, H.; Kong, J.; Hu, H.; Du, Y.; Gao, M.; Chen, F. A review of remote sensing for water quality retrieval: Progress and challenges. Remote Sens. 2022, 14, 1770.
  9. Dogliotti, A.I.; Ruddick, K.; Nechad, B.; Doxaran, D.; Knaeps, E. A single algorithm to retrieve turbidity from remotely-sensed data in all coastal and estuarine waters. Remote Sens. Environ. 2015, 156, 157–168.
  10. Fang, X.; Wen, Z.; Chen, J.; Wu, S.; Huang, Y.; Ma, M. Remote sensing estimation of suspended sediment concentration based on random forest regression model. J. Remote Sens. 2019, 23, 756–772.
  11. Guo, J.; Nie, Y.; Sun, B.; Lv, X. Remote sensing of transparency in the China seas from the ESA-OC-CCI data. Estuar. Coast. Shelf Sci. 2022, 264, 107693.
  12. Marinho, R.R.; Martinez, J.-M.; de Oliveira, T.C.S.; Moreira, W.P.; de Carvalho, L.A.S.; Moreira-Turcq, P.; Harmel, T. Estimating the colored dissolved organic matter in the negro river, amazon basin, with in situ remote sensing data. Remote Sens. 2024, 16, 613.
  13. Song, W.; Yinglan, A.; Wang, Y.; Fang, Q.; Tang, R. Study on remote sensing inversion and temporal-spatial variation of hulun lake water quality based on machine learning. J. Contam. Hydrol. 2024, 260, 104282.
  14. Yan, X.; Zhang, T.; Du, W.; Meng, Q.; Xu, X.; Zhao, X. A comprehensive review of machine learning for water quality prediction over the past five years. J. Mar. Sci. Eng. 2024, 12, 159.
  15. Gao, L.; Shangguan, Y.; Sun, Z.; Shen, Q.; Shi, Z. Estimation of non-optically active water quality parameters in Zhejiang province based on machine learning. Remote Sens. 2024, 16, 514.
  16. Huang, H.; Zhang, J. Prediction of chlorophyll a and risk assessment of water blooms in poyang lake based on a machine learning method. Environ. Pollut. 2024, 347, 15.
  17. Li, H.; Li, X.; Song, D.; Nie, J.; Liang, S. Prediction on daily spatial distribution of chlorophyll-a in coastal seas using a synthetic method of remote sensing, machine learning and numerical modeling. Sci. Total Environ. 2024, 910, 168642.
  18. Yuan, X.; Wang, S.; Fan, F.; Dong, Y.; Li, Y.; Lin, W.; Zhou, C. Spatiotemporal dynamics and anthropologically dominated drivers of chlorophyll-a, tn and tp concentrations in the pearl river estuary based on retrieval algorithm and random forest regression. Environ. Res. 2022, 215, 114380.
  19. Zhang, Z.; Liao, Y.; Huang, J. A framework to quantify riverine dissolved inorganic nitrogen exports under changing land-use patterns and hydrologic regimes. Water 2023, 15, 3528.
  20. Duhamel, S.; Diaz, J.M.; Adams, J.C.; Djaoudi, K.; Steck, V.; Waggoner, E.M. Phosphorus as an integral component of global marine biogeochemistry. Nat. Geosci. 2021, 14, 359–368.
  21. Han, X.; Zhang, S.; Wang, Z.; Wang, K.; Lin, J.; Deng, M.; Wu, X. Fish community structure and its relationship with environmental factors in the Ma’an Archipelago and its eastern waters. J. Fish. China 2019, 43, 1483–1497.
  22. Yin, W.; Ma, Y.; Wang, D.; He, S.; Huang, D. Surface upwelling off the Zhoushan islands, East China Sea, from Himawari-8 ahi data. Remote Sens. 2022, 14, 3261.
  23. Ding, W.; Li, C. Algal blooms forecasting with hybrid deep learning models from satellite data in the Zhoushan fishery. Ecol. Inf. 2024, 82, 102664.
  24. Li, X.; Zhao, X.; Yuan, H.; Guo, Y.; Li, J.; Zhang, S.; Chen, J.; Wang, Z.; Wang, K. Diversity and carbon sequestration of seaweed in the Ma’an Archipelago, China. Diversity 2022, 15, 12.
  25. Zhou, X.; Zhao, X.; Zhang, S.; Lin, J. Marine ranching construction and management in East China Sea: Programs for sustainable fishery and aquaculture. Water 2019, 11, 1237.
  26. Schmidt, G.; Jenkerson, C.B.; Masek, J.; Vermote, E.; Gao, F. Landsat Ecosystem Disturbance Adaptive Processing System (LEDAPS) Algorithm Description; 2331-1258; US Geological Survey: Reston, VA, USA, 2013.
  27. Masek, J.G.; Vermote, E.F.; Saleous, N.; Wolfe, R.; Hall, F.G.; Huemmrich, F.; Gao, F.; Kutler, J.; Lim, T.K. LEDAPS Landsat Calibration, Reflectance, Atmospheric Correction Preprocessing Code; ORNL DAAC: Oak Ridge, TN, USA, 2012.
  28. Jang, J.-C.; Park, K.-A. High-resolution sea surface temperature retrieval from Landsat 8 OLI/TIRS data at coastal regions. Remote Sens. 2019, 11, 2687.
  29. Vermote, E.; Justice, C.; Claverie, M.; Franch, B. Preliminary analysis of the performance of the Landsat 8/OLI land surface reflectance product. Remote Sens. Environ. 2016, 185, 46–56.
  30. Liu, Z.; Yao, Z.; Wang, R. Assessing methods of identifying open water bodies using Landsat 8 OLI imagery. Environ. Earth Sci. 2016, 75, 873.
  31. Li, P.; Li, H.; Si, B.; Zhou, T.; Zhang, C.; Li, M. Mapping planted forest age using landtrendr algorithm and Landsat 5–8 on the loess plateau, China. Agric. For. Meteorol. 2024, 344, 109795.
  32. Chen, J.; Zhu, W. Consistency evaluation of Landsat-7 and Landsat-8 for improved monitoring of colored dissolved organic matter in complex water. Geocarto Int. 2022, 37, 91–102.
  33. Li, J.; Gao, S.; Wang, Y. Delineating suspended sediment concentration patterns in surface waters of the Changjiang Estuary by remote sensing analysis. Acta Oceanolog. Sin. 2010, 29, 38–47.
  34. Ho, T.K. The random subspace method for constructing decision forests. IEEE Trans. Pattern Anal. Mach. Intell. 1998, 20, 832–844.
  35. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
  36. Biau, G.; Scornet, E. A random forest guided tour. Test 2016, 25, 197–227.
  37. Rigatti, S.J. Random forest. J. Insur. Med. 2017, 47, 31–39.
  38. Lee, T.-H.; Ullah, A.; Wang, R. Bootstrap Aggregating and Random Forest; Fuleky, P., Ed.; Springer: Cham, Switzerland, 2020; Volume 52, pp. 389–429.
  39. Pino-Mejías, R.; Jiménez-Gamero, M.-D.; Cubiles-de-la-Vega, M.-D.; Pascual-Acosta, A. Reduced bootstrap aggregating of learning algorithms. Pattern Recognit. Lett. 2008, 29, 265–271.
  40. Pal, M. Random forest classifier for remote sensing classification. Int. J. Remote Sens. 2005, 26, 217–222.
  41. Wu, J.; Chen, X.-Y.; Zhang, H.; Xiong, L.-D.; Lei, H.; Deng, S.-H. Hyperparameter optimization for machine learning models based on bayesian optimization. J. Electron. Sci. Technol. 2019, 17, 26–40.
  42. Liu, S.; Tai, H.; Ding, Q.; Li, D.; Xu, L.; Wei, Y. A hybrid approach of support vector regression with genetic algorithm optimization for aquaculture water quality prediction. Math. Comput. Modell. 2013, 58, 458–465.
  43. Rahaman, M.H.; Sajjad, H.; Hussain, S.; Masroor, M.; Sharma, A. Surface water quality prediction in the lower thoubal river watershed, India: A hyper-tuned machine learning approach and dnn-based sensitivity analysis. J. Environ. Chem. Eng. 2024, 12, 112915.
  44. Huang, Q.; Mao, J.; Liu, Y. An improved grid search algorithm of SVR parameters optimization. In Proceedings of the 2012 IEEE 14th International Conference on Communication Technology, Chengdu, China, 9–11 November 2012; pp. 1022–1026.
  45. Wainer, J.; Fonseca, P. How to tune the RBF SVM hyperparameters? An empirical evaluation of 18 search algorithms. Artif. Intell. Rev. 2021, 54, 4771–4797.
  46. Di Bucchianico, A. Coefficient of Determination (r2); Ruggeri, F., Kenett, R.S., Faltin, F.W., Eds.; John Wiley & Sons, Ltd.: Chichester, UK, 2008; Volume 39, pp. 279–285.
  47. Wang, W.; Lu, Y. Analysis of the mean absolute error (MAE) and the root mean square error (RMSE) in assessing rounding model. IOP Conf. Ser. Mater. Sci. Eng. 2018, 324, 012049.
  48. Hodson, T.O. Root mean square error (RMSE) or mean absolute error (MAE): When to use them or not. Geosci. Model Dev. Discuss. 2022, 15, 5481–5487.
  49. Chicco, D.; Warrens, M.J.; Jurman, G. The coefficient of determination r-squared is more informative than smape, MAE, mape, mse and RMSE in regression analysis evaluation. PeerJ Comput. Sci. 2021, 7, e623.
  50. Liu, L.; Zhou, J.; Zheng, B.; Cai, W.; Lin, K.; Tang, J. Temporal and spatial distribution of red tide outbreaks in the Yangtze River Estuary and adjacent waters, China. Mar. Pollut. Bull. 2013, 72, 213–221.
  51. Han, J.; Liu, X.; Pan, K.; Liu, J.; Sun, Y.; Jin, G.; Li, Y.; Li, Y. Impacts of integrated multi-trophic aquaculture on phytoplankton in Sanggou Bay. J. Ocean. Univ. China 2024, 23, 835–843.
  52. Tan, K.; Xu, P.; Huang, L.; Luo, C.; Huang, J.; Fazhan, H.; Kwan, K.Y. Effects of bivalve aquaculture on plankton and benthic community. Sci. Total Environ. 2024, 37, 169892.
  53. Murphy, A.E.; Kolkmeyer, R.; Song, B.; Anderson, I.C.; Bowen, J. Bioreactivity and microbiome of biodeposits from filter-feeding bivalves. Microb. Ecol. 2019, 77, 343–357.
  54. Xu, L.; Yang, D.; Yu, R.; Feng, X.; Gao, G.; Cui, X.; Bai, T.; Yin, B. Nonlocal population sources triggering dinoflagellate blooms in the Changjiang Estuary and adjacent seas: A modeling study. J. Geophys. Res. Biogeosci. 2021, 126, e2021JG006424.
  55. Jing, Y.; Feng, C.; Chen, T.; Zhu, Y.; Li, C.; Tao, B.; Song, Q. Use of GOCI-II images for detection of harmful algal blooms in the East China Sea. Geosci. Lett. 2024, 11, 2.
  56. Xu, Y.; Chen, J.; Yang, Q.; Jiang, X.; Fu, Y.; Pan, D. Trend of harmful algal bloom dynamics from GOCI observed diurnal variation of chlorophyll a off southeast coast of China. Front. Mar. Sci. 2024, 11, 1357669.
  57. Feng, C.; Shen, A.; Zhu, Y.; Xu, Y.; Lu, X. Changes in dinoflagellate and diatom blooms in the East China Sea over the last two decades, under different spatial and temporal scale scenarios. Mar. Pollut. Bull. 2024, 200, 116097.
  58. Guo, J.; Yang, F.; Costa Jr, O.S.; Yan, X.; Wu, M.; Qiu, H.; Li, W.; Xu, G. Nutrient budgets and biogeochemical dynamics in the coastal regions of northern Beibu Gulf, south China sea: Implication for the severe impact of human disturbance. Mar. Environ. Res. 2024, 197, 106447.
  59. Zhang, J.; Liu, S.; Ren, J.; Wu, Y.; Zhang, G. Nutrient gradients from the eutrophic Changjiang (Yangtze River) Estuary to the oligotrophic kuroshio waters and re-evaluation of budgets for the East China Sea shelf. Prog. Oceanogr. 2007, 74, 449–478.
  60. Liu, S.M.; Qi, X.H.; Li, X.; Ye, H.R.; Wu, Y.; Ren, J.L.; Zhang, J.; Xu, W.Y. Nutrient dynamics from the Changjiang (Yangtze River) Estuary to the east China sea. J. Mar. Syst. 2016, 154, 15–27.
  61. Wen, L.; Song, J.; Dai, J.; Li, X.; Ma, J.; Yuan, H.; Duan, L.; Wang, Q. Nutrient characteristics driven by multiple factors in large estuaries during summer: A case study of the Yangtze River Estuary. Mar. Pollut. Bull. 2024, 201, 116241.
  62. GB 3097-1997; Sea Water Quality Standard. Chinese Standard; National Environmental Protection Agency: Beijing, China, 1997.
  63. Beardsley, R.; Limeburner, R.; Yu, H.; Cannon, G. Discharge of the Changjiang (Yangtze River) into the East China sea. Cont. Shelf Res. 1985, 4, 57–76.
  64. Xu, L.; Yang, D.; Greenwood, J.; Feng, X.; Gao, G.; Qi, J.; Cui, X.; Yin, B. Riverine and oceanic nutrients govern different algal bloom domain near the Changjiang Estuary in summer. J. Geophys. Res. Biogeosci. 2020, 125, e2020JG005727.
  65. Zhang, S.; Qiao, L.; Gao, F.; Yao, Z.; Liu, X. Intra-tidal upwelling variability off Zhoushan islands, East China Sea. Estuar. Coast. Shelf Sci. 2024, 298, 108635.
  66. Xiao, T.; Feng, J.; Qiu, Z.; Tang, R.; Zhao, A.; Wong, K.; Tsou, J.Y.; Zhang, Y. Remote-sensing estimation of upwelling-frequent areas in the adjacent waters of Zhoushan (China). J. Mar. Sci. Eng. 2024, 12, 1085.
  67. Huang, C.; Guo, Y.; Yang, H.; Li, Y.; Zou, J.; Zhang, M.; Lyu, H.; Zhu, A.; Huang, T. Using remote sensing to track variation in phosphorus and its interaction with chlorophyll-a and suspended sediment. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2015, 8, 4171–4180.
  68. Lan, F.; Liu, Q.; Ye, W.; Wang, X.; Yin, K. Riverine fluxes of different species of phosphorus in the pearl river estuary. Mar. Pollut. Bull. 2024, 200, 116079.
  69. Chi, Y.; Liu, D.; Qu, Y.; Zhang, Z.; Liu, Z. Archipelagic human-land spatial interrelations: An empirical study in shengsi Archipelago, China. Land Use Policy 2023, 130, 106671.
  70. Alsaleh, M. The impact of aquaculture economics expansion on marine water quality in the EU region. Reg. Stud. Mar. Sci. 2024, 77, 103625.
  71. Fan, X.; Cheng, F.; Yu, Z.; Song, X. Diatom-based dissolved inorganic nitrogen reconstruction in the Changjiang River estuary and its adjacent areas. J. Oceanol. Limnol. 2023, 41, 1464–1480.
  72. Ye, L.; Zhang, H.; Fei, Y.; Liu, L.; Li, D. Nutrient distributions in the east china sea and changes over the last 25 years. Appl. Ecol. Environ. Res. 2020, 18, 973–985.
  73. Zhang, R.; Fang, J.; Zhang, Y.; Qin, X.; Zheng, X.; Zeng, C.; Wang, J. Effects of mussel-phytoplankton interactions on the aquatic environment. Aquacult. Rep. 2024, 37, 102242.
  74. Jiang, H.; Hu, Y. Coastal water quality investigation and evaluation of shengsi Ma’an Archipelago conservation area from 2010 to 2017. J. Guangdong Ocean Univ. 2020, 40, 38–43.
  75. Wu, T.; Xia, L.; Zhuang, M.; Pan, J.; Liu, J.; Dai, W.; Zhao, Z.; Zhang, M.; Shen, X.; He, P. Effects of global warming on the growth and proliferation of attached sargassum horneri in the aquaculture area near gouqi island, China. J. Mar. Sci. Eng. 2022, 11, 9.
  76. Xu, S.; Li, S.; Tao, Z.; Song, K.; Wen, Z.; Li, Y.; Chen, F. Remote sensing of chlorophyll-a in xinkai lake using machine learning and GF-6 WFV images. Remote Sens. 2022, 14, 5136.
  77. Zhu, X.; Guo, H.; Huang, J.J.; Tian, S.; Xu, W.; Mai, Y. An ensemble machine learning model for water quality estimation in coastal area based on remote sensing imagery. J. Environ. Manag. 2022, 323, 116187.
  78. Gómez, D.; Salvador, P.; Sanz, J.; Casanova, J.L. A new approach to monitor water quality in the menor sea (spain) using satellite data and machine learning methods. Environ. Pollut. 2021, 286, 117489.
  79. Patricio-Valerio, L.; Schroeder, T.; Devlin, M.J.; Qin, Y.; Smithers, S. A machine learning algorithm for Himawari-8 total suspended solids retrievals in the great barrier reef. Remote Sens. 2022, 14, 3503.
  80. Mélin, F.; Vantrepotte, V. How optically diverse is the coastal ocean? Remote Sens. Environ. 2015, 160, 235–251.
  81. Wang, F.; Wang, Y.; Zhang, K.; Hu, M.; Weng, Q.; Zhang, H. Spatial heterogeneity modeling of water quality based on random forest regression and model interpretation. Environ. Res. 2021, 202, 111660.
  82. Wong, W.Y.; Al-Ani, A.K.I.; Hasikin, K.; Khairuddin, A.S.M.; Razak, S.A.; Hizaddin, H.F.; Mokhtar, M.I.; Azizan, M.M. Water quality index using modified random forest technique: Assessing novel input features. CMES-Comput. Model. Eng. Sci. 2022, 132, 1011–1038.
  83. Chen, B.; Mu, X.; Chen, P.; Wang, B.; Choi, J.; Park, H.; Xu, S.; Wu, Y.; Yang, H. Machine learning-based inversion of water quality parameters in typical reach of the urban river by uav multispectral data. Ecol. Indic. 2021, 133, 108434.
  84. Bi, J.; Lin, Y.; Dong, Q.; Yuan, H.; Zhou, M. Large-scale water quality prediction with integrated deep neural network. Inf. Sci. 2021, 571, 191–205.
  85. Wang, J.; Wu, X.; Ma, D.; Wen, J.; Xiao, Q. Remote sensing retrieval based on machine learning algorithm: Uncertainty analysis. J. Remote Sens. 2023, 27, 790–801.
Figure 1. Sampling sites in the study area.
Figure 2. Technical workflow depicting the study methods.
Figure 3. Accuracy evaluation of the Chl-a concentration retrieval by the RF model.
Figure 4. Accuracy evaluation of the phosphate concentration retrieval by the RF model.
Figure 5. Accuracy evaluation of the DIN concentration retrieval by the RF model.
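The accuracy evaluations in Figures 3 to 5 rest on the three metrics named in the abstract: R2, RMSE, and MAPE. The following is a minimal sketch of how such metrics are commonly computed from paired measured and retrieved concentrations. The function name `regression_metrics` and the sample values are hypothetical, and R2 is taken here as 1 − SS_res/SS_tot, which may differ from the definition the authors used.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return (R2, RMSE, MAPE in percent) for paired measured/retrieved values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred
    ss_res = np.sum(residuals ** 2)                  # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    rmse = np.sqrt(np.mean(residuals ** 2))
    # MAPE is undefined when a measured value is zero; inputs are assumed nonzero.
    mape = 100.0 * np.mean(np.abs(residuals / y_true))
    return float(r2), float(rmse), float(mape)


# Hypothetical measured and retrieved Chl-a values (ug/L), for illustration only.
measured = [2.0, 4.0, 6.0, 8.0]
retrieved = [2.5, 3.5, 6.5, 7.5]
r2, rmse, mape = regression_metrics(measured, retrieved)
```

RMSE carries the units of the parameter (μg/L for Chl-a, mg/L for phosphate and DIN), while R2 and MAPE are unitless, so only the latter two can be compared directly across the three parameters.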
Figure 6. Distribution of Chl-a concentration in Ma’an Archipelago Marine Special Protected Area from 2002 to 2022. (a) 2002; (b) 2007; (c) 2013; (d) 2018; (e) 2022.
Figure 7. Distribution of reactive phosphate concentrations in the Ma’an Archipelago Marine Special Protected Area from 2002 to 2022. (a) 2002; (b) 2007; (c) 2013; (d) 2018; (e) 2022.
Figure 8. Distribution of DIN concentrations in Ma’an Archipelago Marine Special Protected Area from 2002 to 2022. (a) 2002; (b) 2007; (c) 2013; (d) 2018; (e) 2022.
Table 1. Information on Landsat 5 TM bands.
| Band | Wavelength Range (μm) | Spatial Resolution (m) | Uses |
|---|---|---|---|
| Band 1 Blue | 0.45–0.52 | 30 | Very strong water penetration for land use, soil, and vegetation characteristics |
| Band 2 Green | 0.52–0.60 | 30 | Emphasizing peak vegetation, useful for assessing plant vigor |
| Band 3 Red | 0.63–0.69 | 30 | Discriminating vegetation slopes |
| Band 4 Near IR | 0.76–0.90 | 30 | Identifying crops and highlighting soil/crop, land/water contrasts |
| Band 5 SWIR | 1.55–1.75 | 30 | Monitoring crop drought studies and vegetation growth surveys, distinguishing between cloud, snow, and ice bands |
| Band 6 Thermal | 10.40–12.5 | 120 | Determination of geothermal activity, vegetation classification, vegetation stress analysis, and soil moisture |
| Band 7 SWIR | 2.08–2.35 | 30 | Distinguishing geological formations and identifying hydrothermal alteration zones in rocks |
Table 2. Information on Landsat 8 bands.
| Sensor | Band | Wavelength Range (μm) | Spatial Resolution (m) | Uses |
|---|---|---|---|---|
| OLI | Band 1 Coastal | 0.43–0.45 | 30 | Coastal zone observations |
| OLI | Band 2 Blue | 0.45–0.51 | 30 | Distinguishing between soil and vegetation |
| OLI | Band 3 Green | 0.53–0.59 | 30 | Distinguishing vegetation |
| OLI | Band 4 Red | 0.64–0.67 | 30 | Observation of roads, bare soil, vegetation types |
| OLI | Band 5 NIR | 0.85–0.88 | 30 | Estimating biomass |
| OLI | Band 6 SWIR 1 | 1.57–1.65 | 30 | Vegetation moisture stress, soil moisture, and rock/mineralogy identification |
| OLI | Band 7 SWIR 2 | 2.11–2.29 | 30 | Vegetation analysis and soil moisture |
| OLI | Band 8 Pan | 0.50–0.68 | 15 | High-resolution black and white imagery; enhancing the resolution of other bands through pansharpening techniques |
| OLI | Band 9 Cirrus | 1.36–1.38 | 30 | Cloud and cirrus detection, as well as data quality assessment |
| TIRS | Band 10 TIRS 1 | 10.60–11.19 | 100 | Land surface temperature and soil moisture |
| TIRS | Band 11 TIRS 2 | 11.50–12.51 | 100 | Land surface temperature and soil moisture, with a focus on cloud-free observations |
Table 3. Measured water quality parameters from 2007 to 2018.
| Water Quality Parameter | Max | Min | Average | Aggregate |
|---|---|---|---|---|
| Chl-a | 38.8 μg/L | 1.1 μg/L | 8.975 μg/L | 72 |
| Phosphate | 0.026 mg/L | 0.002 mg/L | 0.013 mg/L | 108 |
| DIN | 0.71 mg/L | 0.285 mg/L | 0.533 mg/L | 108 |
Table 4. The Pearson correlation between the reflectance values derived from single bands of satellite data and in situ measurements of Chl-a, DIN, and phosphate.
| Band | Chl-a | DIN | Phosphate |
|---|---|---|---|
| B1 | 0.271 | 0.133 | 0.427 |
| B2 | 0.757 | 0.598 | 0.113 |
| B3 | 0.731 | 0.389 | 0.279 |
| B4 | 0.700 | 0.371 | 0.113 |
| B5 | −0.046 | 0.045 | 0.019 |
| B6 | −0.054 | −0.182 | 0.505 |
| B7 | −0.128 | −0.193 | 0.500 |
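Each entry in Table 4 is a Pearson correlation coefficient between one band's reflectance at the sampling sites and the matching in situ measurement. As a minimal sketch of that calculation, the block below evaluates r = Σ(x − x̄)(y − ȳ) / √(Σ(x − x̄)² Σ(y − ȳ)²) for each band. The helper `pearson_r` and all numeric values are hypothetical and are not taken from the study's data.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()   # deviations from the mean reflectance
    dy = y - y.mean()   # deviations from the mean concentration
    return float(np.sum(dx * dy) / np.sqrt(np.sum(dx ** 2) * np.sum(dy ** 2)))


# Hypothetical reflectance at five sampling sites, one list per band.
reflectance = {
    "B2": [0.10, 0.12, 0.15, 0.18, 0.20],
    "B5": [0.05, 0.03, 0.06, 0.02, 0.04],
}
# Hypothetical in situ Chl-a concentrations (ug/L) at the same sites.
chl_a = [2.0, 3.0, 5.0, 7.0, 8.0]

band_r = {band: pearson_r(values, chl_a) for band, values in reflectance.items()}
```

The same result is available from `np.corrcoef(x, y)[0, 1]`; the explicit form is shown only to make the formula visible.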
Table 5. The Pearson correlation between the reflectance values derived from band arithmetic, band-ratio of satellite data, and in situ measurements of Chl-a, DIN, and phosphate (only showing more relevant parts).
| Band-Ratio | Chl-a | DIN | Phosphate | Band Arithmetic | Chl-a | DIN | Phosphate |
|---|---|---|---|---|---|---|---|
| B3/B5 | 0.777 | 0.358 | 0.235 | B1, B7 | 0.389 | 0.189 | 0.551 |
| B5/B2 | −0.608 | −0.586 | −0.058 | B2, B3 | 0.753 | 0.517 | 0.201 |
| B6/B5 | 0.098 | −0.190 | 0.684 | B2, B4 | 0.529 | 0.688 | 0.062 |
| B1/B2 | −0.742 | 0.111 | −0.378 | B2, B5 | 0.772 | 0.636 | 0.117 |
| B3/B4 | −0.066 | 0.043 | 0.043 | B2, B7 | 0.720 | 0.546 | 0.218 |
| B4/B6 | 0.716 | 0.473 | −0.199 | B3, B4 | 0.674 | 0.273 | 0.553 |
| B5/B6 | −0.098 | 0.225 | −0.570 | B4, B5 | 0.608 | 0.313 | 0.097 |
| B6/B2 | −0.529 | −0.520 | 0.312 | B4, B7 | 0.710 | 0.425 | −0.033 |
| B6/B7 | 0.517 | 0.219 | −0.010 | B5, B7 | 0.118 | 0.220 | −0.410 |
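The band-ratio half of Table 5 can be read as a screening step: form candidate ratios Bi/Bj, correlate each with the in situ parameter, and keep the strongest. The sketch below illustrates that idea only for ratios, since the exact band-arithmetic operation behind the right-hand columns is not stated in the table. The function `rank_band_ratios` and all values are hypothetical.

```python
from itertools import permutations

import numpy as np


def rank_band_ratios(bands, target):
    """Rank every ordered ratio Bi/Bj by |Pearson r| against the target values."""
    target = np.asarray(target, dtype=float)
    scores = {}
    for num, den in permutations(bands, 2):
        ratio = np.asarray(bands[num], dtype=float) / np.asarray(bands[den], dtype=float)
        scores[f"{num}/{den}"] = float(np.corrcoef(ratio, target)[0, 1])
    # Strongest absolute correlation first, regardless of sign.
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical reflectance at five sampling sites (nonzero, so ratios are defined).
bands = {
    "B2": [0.12, 0.11, 0.13, 0.12, 0.14],
    "B3": [0.20, 0.24, 0.30, 0.36, 0.42],
    "B5": [0.10, 0.10, 0.11, 0.10, 0.11],
}
# Hypothetical in situ Chl-a concentrations (ug/L) at the same sites.
chl_a = [2.0, 3.0, 5.0, 7.0, 8.0]

ranked = rank_band_ratios(bands, chl_a)
```

Ranking by absolute value keeps strongly negative ratios, such as B1/B2 for Chl-a in Table 5, alongside strongly positive ones as candidate model inputs.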
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

