Research on the Inversion of Chlorophyll-a Concentration in the Hong Kong Coastal Area Based on Convolutional Neural Networks

Zhu, Weidong; Liu, Shuai; Luan, Kuifeng; Xu, Yuelin; Liu, Zitao; Cao, Tiantian; Wang, Piao

doi:10.3390/jmse12071119


by Weidong Zhu ¹,², Shuai Liu ¹,*, Kuifeng Luan ¹,², Yuelin Xu ¹, Zitao Liu ¹, Tiantian Cao ¹ and Piao Wang ¹

¹ College of Oceanography and Ecological Science, Shanghai Ocean University, No.999, Huchenghuan Rd, Nanhui New City, Shanghai 201306, China

² Shanghai Estuary Marine Surveying and Mapping Engineering Technology Research Center, Shanghai 201306, China

* Author to whom correspondence should be addressed.

J. Mar. Sci. Eng. 2024, 12(7), 1119; https://doi.org/10.3390/jmse12071119

Submission received: 13 May 2024 / Revised: 3 June 2024 / Accepted: 26 June 2024 / Published: 3 July 2024

(This article belongs to the Special Issue Remote Sensing Applications in Marine Environmental Monitoring)


Abstract: Chlorophyll-a (Chl-a) concentration is a key indicator for assessing the eutrophication level in water bodies. However, accurately inverting Chl-a concentrations in optically complex coastal waters presents a significant challenge for traditional models. To address this, we used Sentinel-2 MSI sensor data and five machine learning models, including a convolutional neural network (CNN), to improve the inversion in the coastal waters near Hong Kong. When validated against in situ data, the CNN model outperformed the other four models (R² = 0.810, RMSE = 1.165 μg/L, MRE = 35.578%). The CNN model was then used to estimate Chl-a concentrations from images captured over the study area in April and October 2022, producing thematic maps of the spatial distribution of Chl-a levels. The maps indicated high Chl-a concentrations in the northeast and southwest areas of Hong Kong Island and low Chl-a concentrations in the southeast facing the open sea. Analysis of patch size effects on CNN model accuracy indicated that 7 × 7 and 9 × 9 patches yielded the best results among the tested sizes. Shapley additive explanations were employed to provide post-hoc interpretations of the best-performing CNN model, highlighting that bands B6, B12, and B8 were the most important features during the inversion process. This study can serve as a reference for developing machine learning models to invert water quality parameters.

Keywords: chlorophyll-a; convolutional neural networks; machine learning; SHAP

1. Introduction

Chlorophyll-a (Chl-a), a pivotal green pigment present in plants, algae, and certain bacteria, plays an essential role in the photosynthesis of marine phytoplankton [1]. Its concentration levels have historically been adopted as a bioindicator for biomass in aquatic and terrestrial ecosystems. The detection of Chl-a is crucial for monitoring water quality and preventing uncontrolled growth of phytoplankton, which can lead to eutrophication [2]. Traditional detection methods, such as spectroscopic analysis, high-performance liquid chromatography (HPLC), and fluorescence techniques, are laborious, costly, and time-consuming due to the requirement for water sample collection and laboratory analysis [3].

With advances in remote sensing technology, satellite data have become a primary tool for assessing the spatial distribution of chlorophyll in coastal waters. Various algorithms have been developed for Chl-a inversion, including the two-band ratio algorithm (BR) [4], three-band algorithm (TBA) [5], four-band algorithm (FBA) [6], baseline subtraction algorithm (BS) [7], and nested band ratio algorithm (NBR) [8]. These algorithms achieve high precision when the other water constituents vary concurrently with phytoplankton. However, polluted inland and coastal water bodies exhibit complex optical properties, with the concentrations of optically active substances varying considerably across regions. When deployed in such waters, these algorithms can suffer significant performance degradation.

Machine learning (ML) methods have demonstrated effectiveness in interpreting and simulating marine water quality parameters derived from satellite data. They excel in nonlinear regression and classification tasks, adapting to complex interdependencies and enhancing the accuracy of water quality parameter retrieval [9]. Commonly adopted ML methods include random forest (RF) [10,11], extreme gradient boosting (XGBoost) [12,13], support vector machine (SVM) [14,15], and deep neural networks (DNN) [16,17]. For example, Shen et al. (2022) modeled the spatial distribution of Chl-a concentrations in a turbid inland lake in eastern China using four prevalent ML techniques (random forest regression, extreme gradient boosting, deep neural networks, and support vector regression), with the random forest method outperforming the others in inversion accuracy [11]. Similarly, Xu et al. (2023) combined in situ Chl-a measurements from Poyang Lake with concurrent satellite imagery and applied a deep neural network, XGBoost, random forest, gradient boosting, decision tree, linear regression, and two empirical models to invert Chl-a concentrations. The results highlighted the superior performance of ML methods compared to traditional techniques [18]. However, these machine learning methods are pixel-based models: Chl-a concentrations are estimated from the variations of the remote sensing reflectance in the spectral bands of a single pixel. In other words, the models do not consider local spatial information [19].

In contrast to pixel-based machine learning approaches, convolutional neural networks (CNNs) process image patches rather than individual pixels, enabling them to capture the spatial context surrounding target pixels and uncover intrinsic patterns within the data [20,21,22,23]. This can improve the accuracy of Chl-a concentration inversion and makes CNNs well suited to inverting Chl-a concentration fields. For instance, in 2019, Pu et al. employed a CNN model to classify water quality in the Erhai and Chaohu lakes, China [24]. The core of water quality parameter inversion, however, is a regression problem rather than a classification problem. In 2021, Aptoula et al. estimated Chl-a concentrations in Balik Lake, Turkey, using Sentinel-2 imagery; 2D-CNNs consistently outperformed 3D-CNNs, confirming the improved use of the spectral dimension achievable with 2D convolution [25]. Also in 2021, Ilteralp et al. proposed a multitask CNN architecture that included a secondary task of classifying unlabeled samples by month, augmented with field measurements to analyze the multi-temporal and spectral characteristics of Sentinel-2 imagery over Balik Lake in Turkey [26]. In 2022, Na et al. developed a CNN-LSTM model that integrated temporal and spatial features for predicting Chl-a concentrations, which improved prediction accuracy and training efficiency [27]. However, the transition from pixel-based to patch-based analysis introduces new considerations, such as the selection of optimal image patch sizes and band combinations, which can affect the model’s accuracy and stability.

Despite these advancements, selecting the optimal image patch sizes and band combinations is made harder by the intricate architecture of CNN models and their many parameters. Traditional machine learning methods often rely on Pearson correlation analysis to examine relationships between pixel bands and inversion values [28,29,30,31,32,33], but the application of CNNs necessitates a different approach. We explored various patch sizes to identify the most effective configurations for Chl-a concentration estimation, which allows us to directly assess how patch size influences the CNN’s ability to learn and generalize from the training data. Our methodology also uses SHAP (SHapley Additive exPlanations) values, a game-theoretic approach to explaining the output of machine learning models [34,35,36]. By quantifying the contribution of each spectral band, we can identify the most influential bands for estimating Chl-a concentrations. This insight aids in refining the CNN model, enhancing its predictive power and reliability for water quality monitoring.

To achieve high-precision inversion of Chl-a concentrations in coastal waters and to gain a deeper understanding of the specific impact of input data on model prediction performance, this study conducts a systematic analysis using the coastal waters of Hong Kong as the research area. By integrating Sentinel-2 satellite imagery with field measurements, we developed a CNN model that surpassed traditional ML models in predicting Chl-a concentrations and conducted inversions for April and October 2022. The structure of this paper is organized as follows: Section 2 describes the data and methodologies utilized in the study. Section 3 conducts a precision analysis of the CNN model alongside four other traditional ML models, culminating in the generation of a concentration distribution map. This section also explores the impact of variations in input data on the performance of the CNN model. Section 4 discusses the applicability of the CNN model and future research directions. Finally, Section 5 summarizes the findings and presents the conclusions drawn from the study.

2. Materials and Methods

2.1. Study Area and Dataset

The offshore waters of Hong Kong are situated in the southern part of China, spanning the geographical coordinates from 22°09′ N to 22°37′ N and from 113°52′ E to 114°30′ E. They border Guangdong Province to the north, face the South China Sea to the south, adjoin the Pearl River Estuary to the west, and are enclosed by other waters to the east. Due to its unique geographic location, the waters of Hong Kong have become a world-renowned busy seaport, serving as a vital transportation hub linking the Pearl River inland and the South China Sea. However, rapid coastal economic growth and industrialization have led to severe pollution from industrial and domestic wastewater. Therefore, monitoring water quality conditions using remote sensing technology is crucial for maintaining the ecological balance and restoring coastal ecosystems.

In this study, we obtained empirical data from the online database of the Hong Kong Environmental Protection Department, including chlorophyll concentration data from monitoring stations. The Environmental Protection Department of Hong Kong has a dedicated marine monitoring vessel equipped with Differential Global Positioning System (DGPS) and advanced Conductivity-Temperature-Depth (CTD) profilers for water quality measurements. Each month, they collect water samples simultaneously from 76 fixed monitoring stations, store the samples in 500 mL Nalgene bottles, and analyze the chlorophyll concentration using the 2-fold spectroscopy method specified by the American Public Health Association (APHA). Based on characteristics of water quality, the Hong Kong Environmental Protection Department has delineated the waters of Hong Kong into ten distinct water environmental zones, within which 76 water quality monitoring points have been established. The data from these monitoring points are detailed in Figure 1, and the boundaries of the water quality zones are illustrated in Figure 2.

Sentinel-2 is a high-resolution multispectral imaging mission whose satellites carry a multispectral imager (MSI) and operate at an altitude of 786 km. Its data are widely used in remote sensing applications and cover 13 spectral bands spanning the visible, near-infrared, and shortwave infrared. Detailed information on each Sentinel-2 band is shown in Table 1. Sentinel-2 adopts a twin-satellite design in which the two satellites operate in tandem and complement each other, shortening the revisit cycle to 5 days.

This design not only enhances the capability to monitor surface changes and dynamic processes but also provides more flexible data support for scientific research and applications. In addition, the bands have ground resolutions of 10 m, 20 m, or 60 m, making the data suitable for remote sensing applications at various scales and providing abundant data resources for environmental monitoring, land planning, agricultural management, and other fields.

2.2. Data Preprocessing

Multispectral images capture the radiation information of the Earth’s surface; however, image quality is significantly influenced by sensor performance and external environmental factors. To acquire precise ground reflectance information, a sequence of preprocessing procedures must be applied to the remote sensing images, including image cropping, atmospheric correction, and dataset construction.

2.2.1. Atmospheric Correction

The total radiance of ground targets measured by a sensor is affected by the absorption, reflection, and scattering of sunlight by aerosols and water vapor particles in the atmosphere, leading to an inaccurate representation of the surface. The primary objective of atmospheric correction is to mitigate or eliminate these radiation errors, thus ensuring accurate ground object reflectance.

In this study, the Sen2Cor plugin provided by the European Space Agency (ESA) for Level-2A data processing was used to perform atmospheric correction on the Level-1C products, upgrading the Sentinel-2A data from Level-1C to Level-2A. Notably, the number of bands decreased from 13 to 12 during this process, because Band 10, the cirrus band, is removed.

2.2.2. Resampling

Image resampling is a crucial step in image processing. It reconstructs an image on a new pixel grid: each output pixel is mapped to a position in the input image according to predefined rules, its brightness value is interpolated from the neighboring input pixels, and the results form a new image matrix. This operation is necessary for ensuring consistency and enabling joint processing of images with different resolutions.

Sentinel-2 satellite data comprise bands with distinct resolutions of 10 m, 20 m, and 60 m. Failure to standardize these resolutions can significantly impede subsequent image analysis and processing. To address this issue, the Sentinel Application Platform (SNAP) software was utilized in this study to resample the atmospherically corrected Sentinel-2 data, unifying all bands’ resolutions to 10 m.
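The effect of unifying band resolutions can be illustrated with a minimal sketch. The interpolation method used in SNAP for this study is not stated, so nearest-neighbour resampling is assumed here purely for illustration; the function name `upsample_nearest` and the reflectance values are hypothetical.

```python
def upsample_nearest(band, factor):
    """Repeat every pixel `factor` times along rows and columns (nearest neighbour)."""
    out = []
    for row in band:
        wide = [value for value in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


# Hypothetical 2 x 2 tile of a 20 m band; a factor of 2 brings it onto the 10 m grid.
band_20m = [[0.02, 0.04],
            [0.06, 0.08]]
band_10m = upsample_nearest(band_20m, 2)

for row in band_10m:
    print(row)
```

A 60 m band would use a factor of 6 under the same assumption. Nearest-neighbour resampling leaves the original reflectance values unchanged, whereas bilinear or cubic interpolation would smooth them.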

Resampling not only enhances data consistency and comparability but also elevates data usability, enabling researchers to efficiently compare data from different bands in subsequent analyses for more accurate land cover classification, change detection, and environmental monitoring. Hence, resampling stands as an indispensable stage in remote sensing data processing, harnessing the potential of Sentinel-2 satellite data to offer superior information and insights across diverse applications.

2.3. Machine Learning Methods

2.3.1. Support Vector Regression

Support vector regression (SVR) is a machine learning algorithm designed specifically for regression tasks, extending the principles of support vector machines (SVMs) introduced in the 1990s by Vapnik [37]. The fundamental concept behind SVR is to determine a function in the feature space that fits the training data points as closely as possible within a specified range while minimizing discrepancies of data points beyond that range. SVMs, on the other hand, aim to identify an optimal decision boundary that demonstrates strong generalization capabilities for unseen data, primarily in classification scenarios. Both SVR and SVMs utilize kernel functions to transform input data into a high-dimensional space, enabling the discovery of decision boundaries or functions that can handle linearly separable as well as intricate nonlinear relationships.

y = f(x) = w \cdot x + b

where y represents the target value, w represents the weight vector, x represents the input feature vector, and b represents the bias term.

| y - f(x) | \leq ε

where ε represents the width of the interval band: data points within the band are not penalized, while data points outside the band are penalized.
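The two expressions above can be sketched in a few lines of Python. This is only an illustration of the linear prediction and the ε-insensitive penalty, not the SVR configuration used in the study; the weights, bias, sample, and ε value are hypothetical.

```python
def linear_predict(w, x, b):
    """f(x) = w . x + b for plain Python sequences."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def epsilon_insensitive_loss(y, y_pred, epsilon):
    """Zero inside the band |y - f(x)| <= epsilon, growing linearly outside it."""
    return max(0.0, abs(y - y_pred) - epsilon)


# Hypothetical weights and a two-feature sample, for illustration only.
w, b = [0.5, -0.2], 1.0
x = [2.0, 1.0]
f_x = linear_predict(w, x, b)  # 0.5 * 2.0 - 0.2 * 1.0 + 1.0

inside = epsilon_insensitive_loss(1.9, f_x, epsilon=0.25)   # within the band
outside = epsilon_insensitive_loss(3.0, f_x, epsilon=0.25)  # beyond the band
print(f_x, inside, outside)
```

A target within ε of the prediction contributes no loss, so only points outside the band influence the fitted function.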

2.3.2. Random Forest

Random forest, an ensemble learning method, accomplishes classification or regression tasks by constructing multiple decision trees [38]. This approach involves random sampling of data and features during training to build independent decision trees, subsequently leveraging their ensemble for predictions.

The central tenet of random forest lies in augmenting model robustness and generalization capacity through the integration of randomness. During the construction of each decision tree, random forest randomly selects a subset of samples from the training set and features for partitioning at each node. This stochastic process aids in mitigating overfitting, enhancing model diversity, and optimizing overall performance.

In the prediction phase, the final prediction arises from aggregating the outcomes of individual trees through voting or averaging. This ensemble technique not only mitigates model variance but also bolsters accuracy and generalizability. Random forest excels in processing extensive features and high-dimensional data while showcasing resilience to noise and outliers. Additionally, it offers feature importance assessments, facilitating feature selection and model interpretation. Due to its exceptional performance and user-friendly nature, random forest finds extensive application in the machine learning domain.

2.3.3. XGBoost

XGBoost (extreme gradient boosting) is an optimized distributed gradient boosting library based on gradient boosting decision tree (GBDT) principles [39]. It is designed to be efficient, flexible, and portable, and it is widely used in data science competitions and industrial settings. XGBoost optimizes the objective function iteratively by adding new trees, with each tree fitting the residuals of the preceding trees. Unlike traditional GBDT, XGBoost explicitly integrates regularization terms into the objective function to control model complexity and avert overfitting.

Moreover, XGBoost supports column subsampling, which reduces both overfitting and the computational burden. Its strong fitting capability and robustness on regression and classification problems, particularly with complex and nonlinear datasets, have made it one of the pivotal algorithms in machine learning.
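The idea that each new tree fits the residuals of the preceding trees can be sketched with plain gradient boosting on one-split regression trees (stumps). This is a simplified stand-in, not XGBoost itself: it omits the regularization terms and column subsampling described above, and the data, learning rate, and function names are hypothetical.

```python
def fit_stump(xs, residuals):
    """Least-squares regression stump (one split) on 1-D inputs."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # every split leaves both sides non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def boost(xs, ys, n_trees, learning_rate=0.3):
    """Additive model: mean of ys plus shrunken stumps fitted to residuals."""
    base = sum(ys) / len(ys)
    trees = []
    preds = [base] * len(xs)
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        tree = fit_stump(xs, residuals)
        trees.append(tree)
        preds = [p + learning_rate * tree(x) for p, x in zip(preds, xs)]
    return lambda x: base + learning_rate * sum(tree(x) for tree in trees)


def mse(model, xs, ys):
    """Mean squared error of a model on the given samples."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical 1-D data with a nonlinear, step-like trend.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.0, 1.2, 1.1, 3.0, 3.2, 6.0, 6.1, 5.9]

mse_1 = mse(boost(xs, ys, n_trees=1), xs, ys)
mse_20 = mse(boost(xs, ys, n_trees=20), xs, ys)
print(mse_1, mse_20)
```

The training error falls as trees are added, which is also why the regularization in XGBoost matters: without it, continued boosting can fit noise in the training data.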

2.3.4. BP Neural Network

The backpropagation (BP) algorithm is central to the training of BP neural networks [40]. It operates on the principle of a nonlinear mapping between inputs and outputs, with the activation function relying on the sigmoid function. This function is described as differentiable, monotonically increasing, and continuous, as detailed below:

F(x) = \frac{1}{1 + e^{-x}}

The architecture of a BP neural network includes an input layer, one or more hidden layers, and an output layer. The learning process encompasses two principal stages: forward propagation and backpropagation. In the forward phase, experimental data are fed into the network, processed through the hidden layer, and ultimately produce an output value set that is sent to the output layer units [41]. The hidden layer employs a threshold activation function to generate these outputs. Backpropagation is the mechanism for error correction when the network’s predictions deviate from the expected values. It involves the transmission of errors from the output layer back to the input layer, culminating in the adjustment of weights to minimize the error. This iterative learning process continues until the output errors fall within the specified tolerance, thereby optimizing the network’s performance over successive iterations.
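The forward and backward stages described above can be sketched for the smallest possible case, a single sigmoid neuron trained by gradient descent on squared error. The toy data, learning rate, and epoch count are hypothetical and unrelated to the BPNN configuration used in the study.

```python
import math


def sigmoid(x):
    """F(x) = 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_derivative(x):
    """F'(x) = F(x) * (1 - F(x)), the factor used when propagating errors backward."""
    s = sigmoid(x)
    return s * (1.0 - s)


def train_single_neuron(samples, lr=0.5, epochs=2000):
    """Fit y ~ sigmoid(w * x + b) by gradient descent on squared error."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, target in samples:
            z = w * x + b
            out = sigmoid(z)                                # forward propagation
            delta = (out - target) * sigmoid_derivative(z)  # backpropagated error
            w -= lr * delta * x                             # weight adjustment
            b -= lr * delta                                 # bias adjustment
    return w, b


# Hypothetical toy data: the output should rise from near 0 to near 1 with x.
samples = [(-2.0, 0.0), (-1.0, 0.0), (1.0, 1.0), (2.0, 1.0)]
w, b = train_single_neuron(samples)
print(sigmoid(w * -2.0 + b), sigmoid(w * 2.0 + b))
```

In a multilayer network the same error term is passed backward layer by layer through the chain rule, with each layer's weights adjusted in proportion to its contribution to the output error.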

2.3.5. Convolutional Neural Network

The convolutional neural network (CNN) is a feedforward neural network known for its strong performance in image recognition and natural language processing [42]. A CNN comprises three core elements: convolutional layers, pooling layers, and fully connected layers. Each convolutional layer contains multiple feature maps.

Neurons in these feature maps are locally connected through convolutional kernels, sliding across with a specific stride for weight sharing. This mechanism enables the network to identify distinct image features like edges or textures without the necessity for learning individual weights per position, effectively reducing parameter count. The primary function of the pooling layer is to downsample local features extracted from the convolutional layer, decreasing free parameters and enhancing feature data robustness. Common techniques such as average and max pooling are employed to maintain crucial information while reducing spatial resolution in feature maps. While a higher downsampling ratio can enhance generalization and decrease network size, an excessively large ratio may lead to vital feature information loss and degrade network performance. Subsequently, the fully connected layer flattens feature maps originating from the pooling layers and integrates them into a multilayer perceptron (MLP) for holistic connectivity. This step aids in further processing extracted image features to deliver specific recognition outcomes.

The convolutional layers of a CNN extract features by applying convolutional kernels to the input signal, and the size and number of these kernels critically influence the network’s performance. Consequently, variations in the kernel size and number, the downsampling size, and the number of convolutional and pooling layers can introduce uncertainty into the Chl-a concentrations inverted by the CNN. This study proposed a 7-layer CNN model for Chl-a concentration inversion, comprising an input layer, four convolutional layers, a fully connected layer, and an output layer. The input layer receives the raw spectral data. Each convolutional layer had a kernel size of 3 and a stride of 1, ensuring no change in size after convolution. The first convolutional layer had 12 input channels and 36 output channels, while the subsequent layers had 64, 128, and 256 output channels, respectively. ReLU activation functions were employed in all convolutional layers. After the convolutional layers, the data were flattened into one dimension before entering the fully connected layer. The fully connected layer used ReLU functions for real-value regression, with dropout applied in each layer. Parameters were optimized with the Adam optimizer, and the mean squared error (MSE) was adopted as the loss function. The model was implemented in Python with the PyTorch deep learning framework on the Windows 10 operating system. The convolutional neural network structure, including layer numbers and details, is shown in Figure 3.
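The layer dimensions described above can be traced with a short calculation. The sketch below assumes 2D convolutions with 3 × 3 kernels and a zero padding of 1, which is one way a kernel size of 3 with a stride of 1 leaves the spatial size unchanged; the padding, the 7 × 7 patch, and the helper names are assumptions, and Figure 3 remains the reference for the actual architecture.

```python
def conv_output_size(size, kernel=3, stride=1, padding=1):
    """Spatial size after one convolution: (size + 2p - k) // s + 1."""
    return (size + 2 * padding - kernel) // stride + 1


def conv_param_count(in_ch, out_ch, kernel=3):
    """Weights plus one bias per output channel for a 2D convolution."""
    return in_ch * out_ch * kernel * kernel + out_ch


# Channel widths from the text: 12 input bands, then 36, 64, 128, and 256.
CHANNELS = [12, 36, 64, 128, 256]
PATCH = 7  # assumed input patch size (7 x 7 pixels)

size = PATCH
layers = []
for in_ch, out_ch in zip(CHANNELS[:-1], CHANNELS[1:]):
    size = conv_output_size(size)  # padding=1 is an assumption, see lead-in
    layers.append((in_ch, out_ch, size, conv_param_count(in_ch, out_ch)))

# Length of the vector handed to the fully connected layer after flattening.
flattened = CHANNELS[-1] * size * size

for in_ch, out_ch, s, n_params in layers:
    print(f"conv {in_ch:>3} -> {out_ch:>3}: {s}x{s} map, {n_params} parameters")
print("flattened features:", flattened)
```

Under these assumptions the feature maps stay at the patch size through all four convolutions, so the flattened vector grows with the square of the patch size, which is one reason patch size affects both model capacity and training cost.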

2.4. Model Evaluation Metrics

Root mean square error (RMSE) is a common metric used to measure the prediction error of a model. The calculation of RMSE involves squaring the prediction error of each data point, averaging them, and then taking the square root. The mathematical representation is as follows:

RMSE = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} (ŷ_{i} - y_{i})^{2}}

where n represents the sample size, ŷ_{i} denotes the actual observed value, and y_{i} stands for the model’s predicted value. A lower RMSE value signifies a reduced prediction error, indicative of higher accuracy in the model. In essence, a decrease in RMSE on the test data reflects improved model fit, resulting in minimized deviations between predicted and actual results.

The coefficient of determination, denoted as R², quantifies the goodness of fit of a regression model. Its computation formula is defined as follows:

R^{2} = 1 - \frac{\sum_{i = 1}^{n} (ŷ_{i} - y_{i})^{2}}{\sum_{i = 1}^{n} (ŷ_{i} - \bar{y})^{2}}

where ŷ_{i} denotes the actual observed value, y_{i} represents the model’s predicted value, \bar{y} signifies the mean of the observed values, and n signifies the sample size. The R² value, typically ranging from 0 to 1, signifies the proportion of the dependent variable’s variance explained by the model. Generally, a higher R² value signifies a superior model fit. An R² of 1 represents a perfect fit, where the predictions match the observed values exactly.

The mean relative error (MRE) serves as a statistical metric to assess the average error between predicted and actual values. Its calculation formula is as follows:

MRE = \frac{100\%}{n} \sum_{i = 1}^{n} \left| \frac{y_{i} - ŷ_{i}}{ŷ_{i}} \right|

where n is the sample size, ŷ_{i} represents the actual observed value, and y_{i} stands for the model’s predicted value. The result of MRE indicates the average percentage of relative error for each sample. A smaller MRE value signifies a smaller relative error in the model, indicating that the predicted values are closer to the true values.
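The three metrics can be sketched as plain Python functions. R² is computed here in its standard form, one minus the ratio of the residual sum of squares to the total sum of squares about the mean of the observed values, and MRE divides each error by the observed value. The sample values are hypothetical and are not taken from the study's dataset.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def mre(observed, predicted):
    """Mean relative error in percent, relative to the observed values."""
    n = len(observed)
    return 100.0 / n * sum(abs((p - o) / o) for o, p in zip(observed, predicted))


# Hypothetical Chl-a concentrations in μg/L, for illustration only.
observed = [1.0, 2.0, 4.0, 8.0]
predicted = [1.5, 2.0, 3.0, 9.0]

print(rmse(observed, predicted))
print(r_squared(observed, predicted))
print(mre(observed, predicted))
```

In this toy example the 0.5 μg/L error on the lowest sample contributes more to MRE than the 1 μg/L error on the highest one, which illustrates why RMSE and MRE can rank models differently.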

3. Results

3.1. Descriptive Statistics of Chl-a Concentration

Since the 1980s, the Environmental Protection Department of Hong Kong has classified the waters of Hong Kong into 10 water control zones based on their geographical locations and characteristics, establishing 76 water quality monitoring stations across these areas. The data for this study, focusing on Chl-a levels, were collected in November and December 2021, as well as January 2022. During the data processing phase, entries obscured by cloud cover were excluded, ultimately resulting in data from 213 sampling points for Chl-a.

The dataset covered a wide range of Chl-a concentrations, from a minimum of 0.2 μg/L to a maximum of 14 μg/L. The mean concentration was 3.01 μg/L, whereas the median was 1.9 μg/L. The skewness value of 5.83 indicated a right-skewed distribution with some exceptionally high Chl-a samples, which is consistent with the mean exceeding the median. Detailed data on Chl-a concentration are shown in Table 2. These statistics characterize the distribution of the Chl-a data and lay a foundation for further research.
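How a few high values pull the mean above the median and produce positive skewness can be sketched with the Python standard library. The values below are hypothetical and only mimic the shape of the distribution; the skewness estimator used in the study is not stated, so the population (moment-based) definition is assumed.

```python
import statistics


def skewness(values):
    """Population skewness: mean cubed deviation divided by the cubed standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum((v - mean) ** 3 for v in values) / (len(values) * sd ** 3)


# Hypothetical Chl-a concentrations in μg/L with one exceptionally high sample.
chl_a = [0.2, 0.8, 1.5, 1.9, 2.4, 3.0, 14.0]

mean_value = statistics.fmean(chl_a)
median_value = statistics.median(chl_a)
skew_value = skewness(chl_a)
print(mean_value, median_value, skew_value)
```

A positive skewness together with a mean above the median is the signature of a right-skewed distribution.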

For model training and testing, the Chl-a dataset was split at random, with 70% of the samples used for training and the remaining 30% reserved for validation. Holding out validation data makes it possible to evaluate the model’s performance on unseen data and thus its ability to generalize.
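A random 70/30 partition of this kind can be sketched as follows. The seed, the rounding rule, and the use of integer sample identifiers are assumptions made for illustration; the study does not describe how its split was implemented.

```python
import random


def train_validation_split(samples, train_fraction=0.7, seed=42):
    """Shuffle a copy of the samples and cut it into training and validation parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed (assumed) for reproducibility
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


# 213 sampling points, as in the dataset described above, identified by index.
sample_ids = list(range(213))
train_ids, val_ids = train_validation_split(sample_ids)
print(len(train_ids), len(val_ids))
```

With 213 samples this rounding rule gives 149 training samples and 64 validation samples, and no sample appears in both sets.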

3.2. Evaluation of CNN Model Accuracy

Sentinel-2 satellite remote sensing imagery was used as the primary data source for developing five inversion models to predict Chl-a concentration in the nearshore waters of Hong Kong: CNN, SVM, RF, XGBoost, and BPNN. Model performance was evaluated using the coefficient of determination (R²), root mean square error (RMSE), and mean relative error (MRE). The relationship between the predicted and actual values for each model is illustrated in Figure 4.

Figure 4 shows the deviation between the inverted and actual Chl-a concentrations at each monitoring point. The red points represent the validation dataset, while the blue points depict the training dataset. The main diagonal represents the ideal case in which predicted values equal the actual values, so the farther a point lies from the diagonal, the larger the prediction error. The distribution of points in the figure illustrates the performance differences among the models in Chl-a concentration estimation.

The CNN model achieved good results across all evaluation metrics, with the highest R² (0.810) and the lowest RMSE (1.165 μg/L) among all models. Its MRE of 35.58%, however, was slightly higher than the 32.61% of the XGBoost model, even though its RMSE was lower. This is because the XGBoost model performs better on data with low to moderate Chl-a concentrations, where a given absolute error produces a larger relative error, while the CNN model performs better on data points with moderate to high Chl-a concentrations. Considering the three metrics together, the CNN was the best-performing model. This may be attributed to the CNN’s capacity to capture local spatial features around target pixels through convolutional operations, so that predictions draw on the spatial neighborhood rather than on the radiance of an individual pixel [43]. The XGBoost model, known for its ensemble learning capabilities, also performed strongly, with an R² of 0.80 and an RMSE of 1.183 μg/L. Its MRE of 32.608% was the lowest of all models, indicating high predictive accuracy, especially for low Chl-a concentration points. In contrast, the SVM, RF, and BPNN models exhibited weaker performance than the CNN and XGBoost models. Their respective RMSE values were 1.421, 1.454, and 1.498 μg/L, approximately 0.25 to 0.33 μg/L higher than that of the CNN model. While this increase was modest, it still indicates notable discrepancies between predicted and actual values. The BPNN model had the lowest R² value, 0.686, which was below the desired threshold of 0.7, suggesting a weaker model fit and predictive capability.

3.3. Inversion of Chl-a Concentration in the Study Area

Overall, a comparison of R², RMSE, and MRE showed that the CNN model outperformed all others in predicting Chl-a concentrations within the study area, followed by the XGBoost, SVM, RF, and BPNN models in order of increasing RMSE. To simulate Chl-a concentrations across the entire study area, we applied the CNN model to high-visibility, low-cloud remote sensing imagery over Hong Kong for April and October 2022. The results of this inversion are shown in Figures 5 and 6.

The CNN inversion results in Figure 5 and Figure 6 showed a consistent trend: Chl-a concentrations were higher in the internal sea areas than in the external regions. Specifically, the southeastern and southern sea areas exhibited lower Chl-a levels, while the northeastern regions, especially within Tolo Harbour, displayed elevated concentrations. This spatial difference was closely associated with marine geographical features and environmental conditions.

The southeastern and southern sea regions are characterized by vast open waters where frequent water exchange occurs, resulting in relatively low chlorophyll levels. In contrast, the northeastern area is a semi-enclosed sea region with slow water circulation and nutrient-rich conditions that enhance the proliferation of aquatic organisms, leading to higher Chl-a concentrations. Tolo Harbour, a slender water body extending inland in the northeast, is linked to external waters solely via the Tolo Channel. This results in slow water exchange and significant pollution impacts, and the harbour is the region with the highest Chl-a concentration in Figure 5.

Additionally, higher Chl-a concentrations were observed in the western buffer zone, Victoria Harbour, Tseung Kwan O, and Ngau Tau Kok regions during April and October. This distribution pattern could be attributed to these areas being enclosed by Hong Kong’s primary islands and situated near urban zones affected by organic-rich water sources such as domestic waste and industrial discharges. Internal sea areas near the urban center receive continuous input from persistent pollution sources, which raises Chl-a levels. Conversely, external water bodies are more mobile and exchange frequently with neighboring sea regions, which reduces the impact of internal pollutant discharge and results in lower Chl-a concentrations. These findings are consistent with actual conditions and support the suitability of the CNN model for mapping Chl-a concentrations in Hong Kong waters.

3.4. Study on the Influence of Input Data on the Accuracy of CNN Model

CNNs were initially developed for color image classification. Building on their success within this domain [44], CNN models have progressively been extended to hyperspectral land cover and land use mapping [45], as well as to multivariate data and regression tasks such as estimating the age of individuals in facial images [46]. However, the inherent complexity, extensive parameterization, and opaque nature of CNN models make it difficult to assess how different input features affect model accuracy when they are used for inversion tasks. Investigating the model’s input data gives a deeper understanding of how CNN models invert Chl-a concentration. This approach not only sheds light on the “black box” aspect of the model but also provides insights for the inversion of other parameters using CNN models.

3.4.1. Influence of Patch Size on CNN Precision

Using a substantial spatial neighborhood around individual pixels as input data provides the network with a richer spatial context, which can enhance its robustness against outliers. However, patches that are too large may contain a diverse array of content, complicating the model’s ability to learn an accurate representation of the study area. Previous studies have used very different sizes: in [24], a patch size of 21 × 21 pixels at a 30 m resolution was employed, while in [47], the average reflectance of a small 3 × 3 pixel neighborhood at a 30 m resolution was used. A single pixel at the 10 m spatial resolution of Sentinel-2 corresponds to an area of 100 m², so the coverage area expands rapidly when multiple pixels are combined into a patch. For example, a 3 × 3-pixel patch covers an area of 900 m², while a 10 × 10-pixel patch covers an area of 10,000 m². This rapid expansion of coverage calls for an empirical investigation into the impact of patch size on the accuracy of model inversion. Understanding how different patch sizes affect model performance is crucial for optimizing the accuracy of Chl-a concentration inversion.

Because the inversion data were resampled to 10 m Sentinel-2 data in this research, each pixel represented an area of 100 m². To facilitate comparison, patches of varying sizes (3 × 3, 5 × 5, 7 × 7, 9 × 9, 11 × 11, 13 × 13, and 15 × 15) were extracted from the images as input data and fed into a convolutional neural network. After 300 epochs of training, the inversion accuracy for each patch size was evaluated. The results of this evaluation are summarized in Table 3.
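The patch extraction step can be sketched as follows. This is a minimal illustration, not the authors' code: the nested-list image layout (bands, rows, columns), the function names, and the choice to skip patches that cross the image edge are all assumptions made for the example.

```python
PIXEL_SIZE_M = 10  # all bands resampled to 10 m, as stated in the text

def patch_area_m2(size):
    """Ground area covered by a size x size patch of 10 m pixels."""
    return (size * PIXEL_SIZE_M) ** 2

def extract_patch(image, row, col, size):
    """Return the size x size neighbourhood centred on (row, col) for every band.

    image is a list of bands, each band a list of rows (hypothetical layout).
    Returns None when the patch would extend past the image edge.
    """
    half = size // 2
    n_rows, n_cols = len(image[0]), len(image[0][0])
    if row - half < 0 or col - half < 0 or row + half >= n_rows or col + half >= n_cols:
        return None
    return [[band_row[col - half:col + half + 1]
             for band_row in band[row - half:row + half + 1]]
            for band in image]

# Toy 12-band, 20 x 20 pixel scene; each value encodes band and position.
image = [[[b * 1000 + r * 20 + c for c in range(20)] for r in range(20)]
         for b in range(12)]

for size in (3, 5, 7, 9, 11, 13, 15):
    patch = extract_patch(image, row=10, col=10, size=size)
    shape = (len(patch), len(patch[0]), len(patch[0][0]))
    print(f"{size} x {size}: area = {patch_area_m2(size)} m², patch shape = {shape}")
```

The printed areas match the Patch Area column of Table 3. In practice the centre pixel would be the location of an in situ sampling station, and a real pipeline would have to decide how to treat patches that overlap land or the scene boundary.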

By training and evaluating models on patches of different sizes, we studied how well the model adapts to different ranges of local information in Chl-a concentration inversion. Model performance differed significantly between patch sizes, which makes the choice of patch size important. In the experiments, patch sizes of 7 × 7 and 9 × 9 gave relatively low RMSE and MRE along with high R². When the patch size moved away from this range, in either direction, the accuracy indicated by R², RMSE, and MRE declined. This was particularly evident at a patch size of 15 × 15 (R² = 0.639, MRE = 73.8%), which covers an area of 22,500 m² and had the lowest R² and highest RMSE of all sizes tested. A likely explanation is that patches that are too small fail to capture broader spatial features in the imagery, while overly large patches lose detail and are more heterogeneous. In summary, the 7 × 7 patch size showed the best performance, with the highest R² and the lowest RMSE and MRE.

In conclusion, our study emphasizes the importance of patch size in Chl-a concentration inversion models. Selecting the appropriate patch size can better capture local features in the water bodies, thereby enhancing the accuracy and generalization capabilities of CNN models. In practical applications, the choice should be tailored to the specific needs of the scene and task to achieve optimal Chl-a concentration inversion results.
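The comparison across patch sizes can be reproduced from the Table 3 values with a short sketch. The rank-sum rule used to combine the three metrics is an assumption introduced here for illustration; the study itself compares the metrics directly and does not describe a combined score.

```python
# (patch size, R², RMSE in μg/L, MRE in %) copied from Table 3
table3 = [
    (3, 0.703, 1.576, 53.4),
    (5, 0.748, 1.468, 44.9),
    (7, 0.810, 1.165, 35.5),
    (9, 0.779, 1.331, 41.9),
    (11, 0.681, 1.461, 59.5),
    (13, 0.676, 1.723, 75.6),
    (15, 0.639, 1.784, 73.8),
]

def rank_sum(rows):
    """Sum each patch size's rank over the three metrics (rank 1 = best)."""
    totals = {row[0]: 0 for row in rows}
    # R² is better when higher; RMSE and MRE are better when lower.
    for index, higher_is_better in ((1, True), (2, False), (3, False)):
        ordered = sorted(rows, key=lambda r: r[index], reverse=higher_is_better)
        for rank, row in enumerate(ordered, start=1):
            totals[row[0]] += rank
    return totals

totals = rank_sum(table3)
for size, total in sorted(totals.items(), key=lambda kv: kv[1]):
    print(f"{size} x {size}: rank sum = {total}")
```

Under this rule the 7 × 7 patch ranks first on every metric and the 9 × 9 patch second, in line with the conclusion above.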

3.4.2. Influence of Input Bands on CNN Precision

The radiance values from various bands of satellite images surrounding Chl-a detection sites and their adjacent regions constitute the primary data source for the inversion model. Sentinel-2 satellite imagery offers a comprehensive spectral range, with a total of 13 bands, consolidated to 12 bands post-resampling, covering the spectrum from visible light to shortwave infrared. SHAP (SHapley Additive exPlanations) is an open-source library designed to interpret machine learning model predictions. Based on Shapley values from cooperative game theory, SHAP assigns a significance value to each feature, thereby clarifying every prediction made by the model. These values serve as an explanatory tool, elucidating changes in model outputs by attributing a specific value to each contributing feature. Utilizing data from each band as input, the average absolute SHAP values per band were computed, as illustrated in the bar graph presented in Figure 7.
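To make the idea of a Shapley value and of the average absolute SHAP value per feature concrete, the following sketch computes exact Shapley values for a toy three-input function. It does not use the SHAP library, and the source does not state which SHAP explainer was applied to the CNN; the toy model, the zero baseline, and all names are hypothetical. Exact enumeration is feasible only for a handful of features, whereas practical explainers rely on approximations.

```python
from itertools import permutations

def shapley_values(model, x, baseline):
    """Exact Shapley values of each feature of x relative to a baseline input.

    Features absent from a coalition are set to their baseline value.
    The cost grows factorially with the number of features.
    """
    n = len(x)
    phi = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)
        previous = model(current)
        for i in order:
            current[i] = x[i]            # add feature i to the coalition
            value = model(current)
            phi[i] += value - previous   # marginal contribution of feature i
            previous = value
    return [p / len(orders) for p in phi]

def mean_abs_shap(model, samples, baseline):
    """Average |Shapley value| per feature over all samples (global importance)."""
    totals = [0.0] * len(baseline)
    for x in samples:
        for i, p in enumerate(shapley_values(model, x, baseline)):
            totals[i] += abs(p)
    return [t / len(samples) for t in totals]

# Hypothetical stand-in for a trained model: three inputs and one interaction term.
def toy_model(v):
    b_a, b_b, b_c = v
    return 4.0 * b_a - 2.0 * b_b + 0.5 * b_a * b_c

samples = [[0.2, 0.1, 0.3], [0.6, 0.4, 0.1], [0.9, 0.8, 0.5]]
baseline = [0.0, 0.0, 0.0]

importance = mean_abs_shap(toy_model, samples, baseline)
print([round(v, 4) for v in importance])
```

For every sample the Shapley values sum to the difference between the model output and the baseline output, which is the additive property that lets them be read as per-feature contributions to a prediction.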

Figure 7 shows that three input bands dominated Chl-a estimation: B6 (average absolute SHAP value = 4.12), B12 (average absolute SHAP value = 3.12), and B8 (average absolute SHAP value = 1.94). Conversely, B7 (average absolute SHAP value = 0.05, central wavelength = 783 nm), B3 (average absolute SHAP value = 0.25, central wavelength = 560 nm), and B4 (average absolute SHAP value = 0.32, central wavelength = 665 nm) made relatively minor contributions in the CNN model. Notably, in 2022, Young Woo Kim et al. employed a light gradient boosting machine (LGBM) and MSI data to estimate Chl-a levels in 78 lakes and estuaries across South Korea [28]. Their SHAP summary chart placed B8 and B6 at the top in absolute SHAP values, in agreement with the findings of this research. These SHAP values offer insight into the decision-making process of the model and help explain how individual Sentinel-2 bands affect Chl-a concentration inversion. An in-depth analysis of band contributions supports more effective model optimization and feature selection, and thereby more precise monitoring of aquatic ecological environments.

The SHAP summary chart in Figure 8 shows that, for B6, B12, and B2, high pixel values lay predominantly to the right of the baseline, corresponding to positive SHAP values, while low pixel values lay predominantly to the left, corresponding to negative SHAP values. High pixel values in the B6, B12, and B2 bands therefore tend to increase the model output. Conversely, for B8, B10, B11, and B4, high pixel values lay primarily to the left of the baseline with negative SHAP values, while low pixel values lay mainly to the right with positive SHAP values, so high values in these bands tend to reduce the model output. The contributions of B3, B8a, B9, B7, and B5 to the model output were less prominent.
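The reading of a summary chart described above (high pixel values paired with positive or with negative SHAP values) can be approximated numerically by the sign of the correlation between a band's pixel values and its SHAP values. The sketch below is one hypothetical way to do this; the value pairs and the 0.3 cut-off are invented for the example and are not taken from Figure 8.

```python
def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5

def effect_direction(pixel_values, shap_values, threshold=0.3):
    """Classify how high pixel values in a band relate to the model output."""
    r = pearson(pixel_values, shap_values)
    if r > threshold:
        return "raises output"
    if r < -threshold:
        return "lowers output"
    return "no clear direction"

# Hypothetical (pixel values, SHAP values) for two bands over four samples.
band_up = ([0.1, 0.2, 0.3, 0.4], [-1.0, -0.3, 0.4, 1.2])
band_down = ([0.1, 0.2, 0.3, 0.4], [0.9, 0.2, -0.3, -0.8])

print("band_up:", effect_direction(*band_up))
print("band_down:", effect_direction(*band_down))
```

A linear correlation only captures monotonic trends, so bands with a non-monotonic effect would still need to be inspected on the chart itself.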

These findings enable the strategic selection of high-performing bands for predicting Chl-a concentration, thereby enhancing the overall model performance and its practical applications. Subsequent investigations could delve into the synergistic effects among different bands and explore methods to enhance the prediction accuracy of Chl-a concentration by integrating multiple bands.

4. Discussion

Chlorophyll-a (Chl-a) levels in the coastal waters near Hong Kong are closely related to the geographical environment and urban distribution, as Figure 5 and Figure 6 illustrate. The internal coastal waters had higher Chl-a concentrations than the external waters. Tolo Harbour and the channels and sheltered areas within the eastern waters of Hong Kong had the highest Chl-a concentrations, while the southeastern waters facing the open sea had the lowest. In general, Chl-a concentrations were higher in the coastal areas than in the offshore regions of Hong Kong. Because the internal coastal waters lie close to the city, they receive long-term inputs of domestic sewage and industrial wastewater rich in organic substances, which leads to higher Chl-a concentrations in the water bodies close to the city. In contrast, the external coastal waters of Hong Kong are more mobile than the internal waters, and exchange with the adjacent sea dilutes the discharged inland water, resulting in relatively lower Chl-a concentrations. This distribution is consistent with the literature [48,49], which supports the effectiveness of the CNN model.

While our model has shown promising results, there are limitations that warrant further investigation. The model’s reliance on specific patch sizes and the temporal resolution of the data may affect its applicability in different environmental conditions or during rapid changes in water quality. As noted in the referenced study, weather conditions and climate change can introduce variability in water quality parameters [50]. Therefore, future research should focus on:

(1): Extended time series analysis: incorporating a more extensive time series of satellite imagery to enhance the model’s adaptability to seasonal and environmental variations;
(2): Integration of additional water quality parameters: expanding the model to include other water quality indicators, such as total nitrogen (TN) and total phosphorus (TP), to provide a more comprehensive assessment of water quality;
(3): Real-time monitoring applications: developing a real-time monitoring framework that integrates the CNN model with in situ measurements to account for the effects of weather and climate on water quality classification.

The CNN model presented in this study offers a robust tool for water quality monitoring in the coastal waters near Hong Kong. By addressing the limitations and pursuing the future research directions outlined above, we aim to enhance the model’s practical value and contribute significantly to environmental protection and water quality management.

5. Conclusions

This study used Sentinel-2 multispectral remote sensing imagery to invert the Chl-a concentration in the waters near Hong Kong using a convolutional neural network and four other machine learning models. The models’ accuracy was analyzed, and a thematic map of the Chl-a distribution in the waters near Hong Kong was created.

The main research conclusions are as follows. The primary objective of this study was to establish a model capable of accurately and efficiently inverting Chl-a in optically complex coastal waters. Among the five machine learning algorithms tested, the CNN (R² = 0.810, RMSE = 1.165 μg/L, MRE = 35.578%) performed best in predicting Chl-a on a large scale. Chl-a concentration inversion with the CNN model was then examined in three respects:

(1): Cross-seasonal Chl-a inversion: inversion of Chl-a concentrations in the waters near Hong Kong for April and October, resulting in the creation of a Chl-a concentration distribution map. According to the concentration distribution map, areas with high Chl-a concentrations were mainly located in the northeast region (closed and semi-closed water areas with slow water exchange) and along the southwest coast of Hong Kong Island (densely populated), while waters facing the open sea in the southeast generally exhibited lower Chl-a concentrations;
(2): Patch size optimization for CNN model accuracy: investigations into the impact of different patch sizes on the accuracy of the CNN model. Patch sizes of 7 × 7 and 9 × 9 achieved favorable results in the Chl-a concentration inversion using the CNN model, while excessively large or small patch sizes could affect the inversion accuracy of the model;
(3): SHAP analysis for band contribution: SHAP was used to interpret the model and explore the effects of different bands on Chl-a concentration inversion. B6, B12, and B8 were identified as the three most important input bands. High values in B6, B12, and B2 mainly increased the model output, whereas high values in B8, B10, B11, and B4 mainly reduced it.

In summary, this study not only confirmed the effectiveness of CNN in inverting Chl-a concentrations in optically complex water bodies but also revealed the significance of band information in Chl-a concentration estimation using the SHAP framework. These findings offer new perspectives and insights into the inversion of water quality parameters. Future research could extend to longer and denser time series to enhance the model’s generalization across different time periods, providing more precise and comprehensive support for water quality monitoring and management.

Author Contributions

Conceptualization, W.Z.; Methodology, T.C.; Validation, Z.L.; Formal analysis, P.W.; Writing—original draft, S.L.; Visualization, Y.X.; Supervision, K.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are available from the corresponding author upon request and subject to the Human Subjects protocol restrictions.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Pahlevan, N.; Smith, B.; Schalles, J.; Binding, C.; Cao, Z.; Ma, R.; Alikas, K.; Kangro, K.; Gurlin, D.; Hà, N. Seamless retrievals of chlorophyll-a from Sentinel-2 (MSI) and Sentinel-3 (OLCI) in inland and coastal waters: A machine-learning approach. Remote Sens. Environ. 2020, 240, 111604.
2. Moss, B. Cogs in the endless machine: Lakes, climate change and nutrient cycles: A review. Sci. Total Environ. 2012, 434, 130–142.
3. Dörnhöfer, K.; Oppelt, N. Remote sensing for lake research and monitoring–Recent advances. Ecol. Indic. 2016, 64, 105–122.
4. Dekker, A.G. Detection of Optical Water Quality Parameters for Eutrophic Waters by High Resolution Remote Sensing. Ph.D. Thesis, Vrije Universiteit, Amsterdam, The Netherlands, 1993.
5. Dall’Olmo, G.; Gitelson, A.A. Effect of bio-optical parameter variability on the remote estimation of chlorophyll-a concentration in turbid productive waters: Experimental results. Appl. Opt. 2005, 44, 412–422.
6. Le, C.; Li, Y.; Zha, Y.; Sun, D.; Huang, C.; Lu, H. A four-band semi-analytical model for estimating chlorophyll a in highly turbid lakes: The case of Taihu Lake, China. Remote Sens. Environ. 2009, 113, 1175–1182.
7. Gower, J.; King, S.; Borstad, G.; Brown, L. Detection of intense plankton blooms using the 709 nm band of the MERIS imaging spectrometer. Int. J. Remote Sens. 2005, 26, 2005–2012.
8. Gons, H.J.; Rijkeboer, M.; Ruddick, K.G. A chlorophyll-retrieval algorithm for satellite imagery (Medium Resolution Imaging Spectrometer) of inland and coastal waters. J. Plankton Res. 2002, 24, 947–951.
9. Lary, D.J.; Alavi, A.H.; Gandomi, A.H.; Walker, A.L. Machine learning in geosciences and remote sensing. Geosci. Front. 2016, 7, 3–10.
10. Yajima, H.; Derot, J. Application of the Random Forest model for chlorophyll-a forecasts in fresh and brackish water bodies in Japan, using multivariate long-term databases. J. Hydroinformatics 2018, 20, 206–220.
11. Shen, M.; Luo, J.; Cao, Z.; Xue, K.; Qi, T.; Ma, J.; Liu, D.; Song, K.; Feng, L.; Duan, H. Random forest: An optimal chlorophyll-a algorithm for optically complex inland water suffering atmospheric correction uncertainties. J. Hydrol. 2022, 615, 128685.
12. Mitra, B.; Tiwari, S.P.; Uddin, M.S.; Mahmud, K.; Rahman, S.M. Decision tree ensemble with Bayesian optimization to predict the spatial dynamics of chlorophyll-a concentration: A case study in Bay of Bengal. Mar. Pollut. Bull. 2024, 199, 115945.
13. Zhang, J.; Meng, F.; Fu, P.; Jing, T.; Xu, J.; Yang, X. Tracking changes in chlorophyll-a concentration and turbidity in Nansi Lake using Sentinel-2 imagery: A novel machine learning approach. Ecol. Inform. 2024, 81, 102597.
14. Zhang, T.; Huang, M.; Wang, Z. Estimation of chlorophyll-a Concentration of lakes based on SVM algorithm and Landsat 8 OLI images. Environ. Sci. Pollut. Res. Int. 2020, 27, 14977–14990.
15. Li, S.; Song, K.; Wang, S.; Liu, G.; Wen, Z.; Shang, Y.; Lyu, L.; Chen, F.; Xu, S.; Tao, H. Quantification of chlorophyll-a in typical lakes across China using Sentinel-2 MSI imagery with machine learning algorithm. Sci. Total Environ. 2021, 778, 146271.
16. Gómez, D.; Salvador, P.; Sanz, J.; Casanova, J.L. A new approach to monitor water quality in the Menor sea (Spain) using satellite data and machine learning methods. Environ. Pollut. 2021, 286, 117489.
17. Yim, I.; Shin, J.; Lee, H.; Park, S.; Nam, G.; Kang, T.; Cho, K.H.; Cha, Y. Deep learning-based retrieval of cyanobacteria pigment in inland water for in-situ and airborne hyperspectral data. Ecol. Indic. 2020, 110, 105879.
18. Xu, J.; Pan, J.; Devlin, A.T. Variations in chlorophyll-a concentration in response to hydrodynamics in a flow-through lake: Remote sensing and modeling studies. Ecol. Indic. 2023, 148, 110128.
19. Syariz, M.A.; Lin, C.-H.; Nguyen, M.V.; Jaelani, L.M.; Blanco, A.C. WaterNet: A convolutional neural network for chlorophyll-a concentration retrieval. Remote Sens. 2020, 12, 1966.
20. Xue, Y.; Zhu, L.; Zou, B.; Wen, Y.-m.; Long, Y.-h.; Zhou, S.-l. Research on inversion mechanism of chlorophyll-a concentration in water bodies using a Convolutional Neural Network model. Water 2021, 13, 664.
21. Fan, D.; He, H.; Wang, R.; Zeng, Y.; Fu, B.; Xiong, Y.; Liu, L.; Xu, Y.; Gao, E. CHLNET: A novel hybrid 1D CNN-SVR algorithm for estimating ocean surface chlorophyll-a. Front. Mar. Sci. 2022, 9, 934536.
22. Abbas, A.; Park, M.; Baek, S.-S.; Cho, K.H. Deep learning-based algorithms for long-term prediction of chlorophyll-a in catchment streams. J. Hydrol. 2023, 626, 130240.
23. Yao, L.; Wang, X.; Zhang, J.; Yu, X.; Zhang, S.; Li, Q. Prediction of Sea Surface Chlorophyll-a Concentrations Based on Deep Learning and Time-Series Remote Sensing Data. Remote Sens. 2023, 15, 4486.
24. Pu, F.; Ding, C.; Chao, Z.; Yu, Y.; Xu, X. Water-quality classification of inland lakes using Landsat8 images by convolutional neural networks. Remote Sens. 2019, 11, 1674.
25. Aptoula, E.; Ariman, S. Chlorophyll-a retrieval from sentinel-2 images using convolutional neural network regression. IEEE Geosci. Remote Sens. Lett. 2021, 19, 6002605.
26. Ilteralp, M.; Ariman, S.; Aptoula, E. A deep multitask semisupervised learning approach for chlorophyll-a retrieval from remote sensing images. Remote Sens. 2021, 14, 18.
27. Na, L.; Shaoyang, C.; Zhenyan, C.; Xing, W.; Yun, X.; Li, X.; Yanwei, G.; Tingting, W.; Xuefeng, Z.; Siqi, L. Long-term prediction of sea surface chlorophyll-a concentration based on the combination of spatio-temporal features. Water Res. 2022, 211, 118040.
28. Kim, Y.W.; Kim, T.; Shin, J.; Lee, D.-S.; Park, Y.-S.; Kim, Y.; Cha, Y. Validity evaluation of a machine-learning model for chlorophyll a retrieval using Sentinel-2 from inland and coastal waters. Ecol. Indic. 2022, 137, 108737.
29. Cui, Z.; Du, D.; Zhang, X.; Yang, Q. Modeling and Prediction of Environmental Factors and Chlorophyll a Abundance by Machine Learning Based on Tara Oceans Data. J. Mar. Sci. Eng. 2022, 10, 1749.
30. Niu, J.; Feng, Z.; He, M.; Xie, M.; Lv, Y.; Zhang, J.; Sun, L.; Liu, Q.; Hu, B.X. Incorporating marine particulate carbon into machine learning for accurate estimation of coastal chlorophyll-a. Mar. Pollut. Bull. 2023, 192, 115089.
31. Ren, J.; Zhou, H.; Tao, Z.; Ge, L.; Song, K.; Xu, S.; Li, Y.; Zhang, L.; Zhang, X.; Li, S. Long-term monitoring chlorophyll-a concentration using HJ-1 A/B imagery and machine learning algorithms in typical lakes, a cold semi-arid region. Opt. Express 2024, 32, 16371–16397.
32. Mozo, A.; Morón-López, J.; Vakaruk, S.; Pompa-Pernía, Á.G.; González-Prieto, Á.; Aguilar, J.A.P.; Gómez-Canaval, S.; Ortiz, J.M. Chlorophyll soft-sensor based on machine learning models for algal bloom predictions. Sci. Rep. 2022, 12, 13529.
33. Chen, B.; Liu, H.; Xiao, W.; Wang, L.; Huang, B. A machine-learning approach to modeling picophytoplankton abundances in the South China Sea. Prog. Oceanogr. 2020, 189, 102456.
34. Liu, Y.; Chen, D.; Ma, A.; Zhong, Y.; Fang, F.; Xu, K. Multiscale U-shaped CNN building instance extraction framework with edge constraint for high-spatial-resolution remote sensing imagery. IEEE Trans. Geosci. Remote Sens. 2020, 59, 6106–6120.
35. Li, Z.; Deng, S.; Hong, Y.; Wei, Z.; Cai, L. A novel hybrid CNN–SVM method for lithology identification in shale reservoirs based on logging measurements. J. Appl. Geophys. 2024, 223, 105346.
36. Parsa, A.B.; Movahedi, A.; Taghipour, H.; Derrible, S.; Mohammadian, A.K. Toward safer highways, application of XGBoost and SHAP for real-time accident detection and feature analysis. Accid. Anal. Prev. 2020, 136, 105405.
37. Cortes, C.; Vapnik, V. Support-vector networks. Mach. Learn. 1995, 20, 273–297.
38. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
39. Chen, T.; Guestrin, C. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794.
40. Rumelhart, D.E.; Hinton, G.E.; Williams, R.J. Learning representations by back-propagating errors. Nature 1986, 323, 533–536.
41. Yue, J.; Pang, B.; Zhang, Y.; Liu, J. Research on Water Quality Inversion of Wide and Shallow Lakes Based on Neural Networks. South North Water Divers. Water Conserv. Technol. 2016, 14, 26–31.
42. Gu, J.; Wang, Z.; Kuen, J.; Ma, L.; Shahroudy, A.; Shuai, B.; Liu, T.; Wang, X.; Wang, G.; Cai, J. Recent advances in convolutional neural networks. Pattern Recognit. 2018, 77, 354–377.
43. Li, A.; Fan, M.; Qin, G.; Wang, H.; Xu, Y. Water Quality Parameter COD Retrieved From Remote Sensing Based on Convolutional Neural Network Model. Spectrosc. Spectr. Anal. 2023, 43, 651–656.
44. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. Commun. ACM 2017, 60, 84–90.
45. Roy, S.K.; Krishna, G.; Dubey, S.R.; Chaudhuri, B.B. HybridSN: Exploring 3-D–2-D CNN feature hierarchy for hyperspectral image classification. IEEE Geosci. Remote Sens. Lett. 2019, 17, 277–281.
46. Dornaika, F.; Bekhouche, S.E.; Arganda-Carreras, I. Robust regression with deep CNNs for facial age estimation: An empirical study. Expert Syst. Appl. 2020, 141, 112942.
47. Peterson, K.T.; Sagan, V.; Sloan, J.J. Deep learning-based water quality estimation and anomaly detection using Landsat-8/Sentinel-2 virtual constellation and cloud computing. GIScience Remote Sens. 2020, 57, 510–525.
48. Dong, S.; He, H.; Fu, B.; Fan, D.; Wang, T. Remote Sensing Inversion of Chlorophyll a Concentration in Hong Kong Coastal Waters Based on Landsat-8 Operational Land Imager and Sentinel-2 Multispectral Imager sensors. Sci. Technol. Eng. 2021, 21, 8702–8712.
49. Zhu, W.-D.; Kong, Y.-X.; He, N.-Y.; Qiu, Z.-G.; Lu, Z.-G. Prediction and Analysis of Chlorophyll-a Concentration in the Western Waters of Hong Kong Based on BP Neural Network. Sustainability 2023, 15, 10441.
50. Du, C.; Wang, Q.; Li, Y.; Lyu, H.; Zhu, L.; Zheng, Z.; Wen, S.; Liu, G.; Guo, Y. Estimation of total phosphorus concentration using a water classification method in inland water. Int. J. Appl. Earth Obs. Geoinf. 2018, 71, 29–42.

Figure 1. Distribution of sampling points in the study area.

Figure 2. Water environmental zones in Hong Kong.

Figure 3. Structure of convolutional neural networks.

Figure 4. Comparison of inversion accuracy of five different models: blue points represent the training set data, while red points represent the validation set data.

Figure 5. Chl-a concentration distribution map of Hong Kong nearshore waters in April 2022.

Figure 6. Chl-a concentration distribution map of Hong Kong nearshore waters in October 2022.

Figure 7. Bar chart of average absolute SHAP values for each band.

Figure 8. SHAP plot of band contributions.

Table 1. Specification of Sentinel-2 MSI.

Sentinel-2 Bands	Central Wavelength (μm)	Resolution (m)
Band 1—Coastal	0.443	60
Band 2—Blue	0.490	10
Band 3—Green	0.560	10
Band 4—Red	0.665	10
Band 5—Red Edge 1	0.705	20
Band 6—Red Edge 2	0.740	20
Band 7—Red Edge 3	0.783	20
Band 8—NIR	0.842	10
Band 8a—Red Edge 4	0.865	20
Band 9—Water vapor	0.945	60
Band 10—Cirrus	1.375	60
Band 11—SWIR-1	1.610	20
Band 12—SWIR-2	2.190	20

Table 2. Descriptive statistics of Chl-a concentration (μg/L).

Minimum	Maximum	Mean Concentration	Standard Deviation	Skewness	Median Concentration
0.2	14	3.01	3.65	5.83	1.9

Table 3. CNN model evaluation metrics for different patch sizes.

Patch Size	Patch Area (m²)	R²	RMSE (μg/L)	MRE (%)
3 × 3	900	0.703	1.576	53.4
5 × 5	2500	0.748	1.468	44.9
7 × 7	4900	0.810	1.165	35.5
9 × 9	8100	0.779	1.331	41.9
11 × 11	12,100	0.681	1.461	59.5
13 × 13	16,900	0.676	1.723	75.6
15 × 15	22,500	0.639	1.784	73.8

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
