A Transfer Learning Technique for Inland Chlorophyll-a Concentration Estimation Using Sentinel-3 Imagery

Syariz, Muhammad Aldila; Lin, Chao-Hung; Heriza, Dewinta; Lasminto, Umboro; Sukojo, Bangun Muljo; Jaelani, Lalu Muhamad

doi:10.3390/app12010203


by Muhammad Aldila Syariz ¹,², Chao-Hung Lin ¹, Dewinta Heriza ¹, Umboro Lasminto ², Bangun Muljo Sukojo ³ and Lalu Muhamad Jaelani ³,*

¹ Department of Geomatics, National Cheng Kung University, Tainan 70101, Taiwan

² Civil Engineering, Institut Teknologi Sepuluh Nopember, Surabaya 60111, Indonesia

³ Geomatics Engineering, Institut Teknologi Sepuluh Nopember, Surabaya 60111, Indonesia

* Author to whom correspondence should be addressed.

Appl. Sci. 2022, 12(1), 203; https://doi.org/10.3390/app12010203

Submission received: 21 November 2021 / Revised: 16 December 2021 / Accepted: 22 December 2021 / Published: 25 December 2021

(This article belongs to the Special Issue Sustainable Agriculture and Advances of Remote Sensing)


Abstract

Chlorophyll-a (Chla) concentration, which serves as a phytoplankton substitute in inland waters, is one of the leading indicators for water quality. Generally, water samples are analyzed in professional laboratories, and Chla concentrations are measured regularly for the purpose of water quality monitoring. However, limited spatial water sampling and the labor-intensive nature of data collection make global and long-term monitoring difficult. The developments of remote-sensing optical sensors and technologies make the long-term monitoring of Chla concentrations for an entire water body more achievable. Many studies based on machine learning techniques, such as regression and artificial neural network (ANN) methods, have recently been proposed for Chla concentration estimation using optical satellite images. The methods based on machine learning can achieve accurate estimation. However, overfitting problems may arise because the in situ Chla dataset is generally insufficient to train a complicated machine learning model, which makes trained models inapplicable. In this study, an ANN model containing three convolutional and two fully connected layers with 4953 unknown parameters is designed. A transfer learning method, consisting of model pretraining, main-training, and fine-tuning stages, is proposed to ease the problem of insufficient in situ samples. In the model pretraining stage, the ANN model is pretrained and initialized using samples derived from an existing Chla concentration model. The pretrained ANN model is then fine-tuned using the proposed transfer learning technique with in situ samples collected in five different campaigns carried out during early 2019 from Laguna Lake, the Philippines. Before the transfer learning, data augmentation and rebalancing methods are conducted to enrich the variability and to near-uniformly distribute the in situ samples in Chla concentration space, respectively. 
To evaluate how well model overfitting was alleviated, the ANN model trained on the in situ dataset from Laguna Lake was tested on an in situ dataset from Lake Victoria, Uganda, obtained in 2019, which has a trophic state similar to that of Laguna Lake. The experimental results from Sentinel-3 imagery indicated that the overfitting problem was significantly alleviated and that the trained ANN model outperformed related models in terms of the root-mean-squared error of the estimated Chla concentrations.

Keywords: chlorophyll-a concentration; artificial neural network; transfer learning; overfitting

1. Introduction

Lakes are land-surrounded water bodies that generally provide freshwater for daily human needs. For instance, water from Lake Biwa, Japan, is used as a drinking water resource for people in Osaka and Kyoto, and the lake has been maintained as a conservation ecosystem with good water quality [1]. In Indonesia, a freshwater treatment plant, namely, PDAM Kabupaten Kerinci, was built around Lake Kerinci in Jambi to take, store, filter, and distribute the water to people living nearby [2]. Meanwhile, the worldwide demand for fish products has steadily increased due to the growing need for protein and the shift in behavior towards the consumption of healthier food [3,4]. The aquaculture industry often adds nutrient fertilizers, which are useful for commercial fish, to the water at locations around the lake body. This procedure can fulfil the consumption demands; however, algal growth may be enhanced when nutrients are oversupplied. Consequently, the penetration of sunlight, which is required for respiration in fish, is limited, which may lead to the extensive deterioration of water quality and the declining availability of freshwater, harming not only the fish but also society. Therefore, the long-term monitoring of the water quality in lakes is necessary for the authorities to develop sustainable management initiatives to prevent water quality degradation and to maintain freshwater supplies in the future.

Chlorophyll-a (Chla), a pigment found in every phytoplankton species, is considered a critical water quality parameter for many environmental issues [5,6,7]. Water quality can be categorized by Chla concentration into four classes based on the trophic state index: oligotrophic (less than 2.6 μg/L), mesotrophic (2.6–20 μg/L), eutrophic (20–56 μg/L), and hypertrophic (more than 56 μg/L) [8]. The water quality condition for each class is described in Table 1. Chla concentrations measured in field surveys are accurate and precise; however, the concentration data are only available at the sampling locations, and taking more measurements across the lake water body is hindered by high labor and financial costs. Remote sensing technology enables researchers to empirically estimate the Chla concentration over the full spatial coverage of the lake water body by regressing the remote-sensing reflectance ($R_{rs}$), or features derived from it, against the in situ data obtained from field surveys. Dall’Olmo and Gitelson [9] utilized band-ratio features combining $R_{rs}$ at wavelengths 443, 490, and 560 nm (denoted as $\lambda_{443}$, $\lambda_{490}$, and $\lambda_{560}$) in a three-band model, in which the in situ samples used in training ranged from 4.4 μg/L to 217.3 μg/L. Al-Shehhi et al. [10] replaced the $R_{rs}$ at $\lambda_{560}$ with that at $\lambda_{645}$, which has been found to represent both water turbidity and algal absorption, for a narrower range of in situ data (0.1–27.8 μg/L). Chen et al. [11] performed local calibration in Chinese waters, resulting in $R_{rs}$ features at $\lambda_{580}$, $\lambda_{600}$, and $\lambda_{692}$. Gitelson et al. [12] and Moses et al. [13] simplified the three-band model to a two-band model by removing the $R_{rs}$ at $\lambda_{443}$ because its sensitivity to absorption is similar to that of $R_{rs}$ at $\lambda_{490}$. Subsequently, Mishra and Mishra [14] proposed a difference index, called the normalized difference chlorophyll index (NDCI), and demonstrated that it outperforms the three-band and two-band models in cross validation. Many researchers [15,16,17,18,19,20] have searched for features that are sensitive to Chla concentrations; however, this search is statistically exhaustive and laborious.
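The band-combination features reviewed above reduce to simple arithmetic on $R_{rs}$ values. The sketch below illustrates the commonly cited functional forms of a three-band feature, a two-band ratio, and the NDCI. It is only an illustration: the function names are hypothetical, the 708 nm and 665 nm bands for the NDCI are an assumption taken from the usual definition of the index rather than from this text, and the regression coefficients that turn each feature into a Chla concentration are not shown.

```python
def three_band_feature(rrs_1: float, rrs_2: float, rrs_3: float) -> float:
    """Three-band feature in its commonly cited form: (1/R1 - 1/R2) * R3."""
    return (1.0 / rrs_1 - 1.0 / rrs_2) * rrs_3


def two_band_ratio(rrs_num: float, rrs_den: float) -> float:
    """Two-band feature: a plain ratio of two reflectances."""
    return rrs_num / rrs_den


def ndci(rrs_708: float, rrs_665: float) -> float:
    """Normalized difference of two reflectances (band choice is an assumption)."""
    total = rrs_708 + rrs_665
    if total == 0:
        raise ValueError("the two reflectances must not sum to zero")
    return (rrs_708 - rrs_665) / total
```

Each feature is then regressed against in situ Chla concentrations, which is the calibration step the cited studies perform on their own datasets.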

Another promising procedure to estimate Chla concentrations is by means of an artificial neural network (ANN). Buckton et al. [21] proposed a fully connected neural network containing one hidden layer that revealed the capability of ANN for Chla concentration estimation. Similar work was also conducted by other researchers [22,23,24]. Hafeez et al. [25] designed several fully connected neural networks and searched for the optimal hyperparameters, including the number of hidden layers and the number of neurons in a layer. The study also revealed that the optimal ANN model outclassed the other machine learning methods, including random forest, cubist regression, and support vector regression, in terms of Chla concentration estimation. Furthermore, several researchers utilized convolutional neural networks (CNNs), which consider neighborhood spectral information in Chla concentration modelling using convolutional layers with 3D kernels [26,27,28,29].

An ANN model requires a large number of labelled data (that is, in situ Chla concentrations as outputs and their corresponding $R_{rs}$ in satellite images as inputs) as well as initial values for the unknown parameters for model training. Pyo et al. [28] constructed a CNN model with more than 2000 unknown parameters. This model was trained using only 238 labelled data. Meanwhile, Aptoula and Ariman [26] utilized 320 labelled data to train a CNN model containing 2432 unknown parameters. In such cases, overfitting problems may arise because insufficient labelled data are used to search for the optimal values of thousands of unknown parameters during model training. Nguyen et al. [30] applied data augmentation to enrich the labelled data; however, they did not consider the data imbalance problem that may affect the estimation accuracy. Furthermore, some researchers utilized simulated datasets instead of in situ Chla concentration data to deal with the labelled data insufficiency [31,32,33]. A simulated dataset means that the Chla concentration information is obtained from an existing known model. With this procedure, the labelled data insufficiency can be solved; however, training a neural network model with a simulated dataset may not reach the global optimum of the defined loss function. Syariz et al. [34] proposed a two-stage training method, in which the model is first pretrained using a simulated dataset, and the pretrained model is then retrained using an in situ dataset. The advantage of this method is that the pretraining process provides good initial values for the unknown parameters before the main training process using the in situ dataset. This training process can train an ANN model rather well for Chla concentration estimation. However, the overfitting problem is not fully alleviated because of the lack of training sample variability and the problem of training sample imbalance.

In this study, the main objectives were (1) to propose a transfer learning technique that improves the two-stage training approach for better Chla concentration estimation accuracy; (2) to enrich and balance the Chla-labelled data by performing data augmentation and rebalancing; and (3) to test the ANN model, trained with the proposed transfer learning approach on an in situ dataset from Laguna Lake, using an in situ dataset acquired from Lake Victoria, Uganda. To evaluate the effectiveness of the proposed model learning methods, an ANN model, namely WaterNet, first proposed by Syariz et al. [34], was adopted. The input to WaterNet was a water-body image patch of the size 7 (width) × 7 (height) × 16 (bands) and the output was an estimated Chla concentration at the center pixel of the input patch. The proposed transfer learning method can increase the accuracy of Chla concentration retrieval in the lake water body, which can later be utilized by governments to better understand the state of the lake water and to develop a management plan to prevent water quality degradation and to maintain freshwater supplies in the future. The remainder of the paper is organized as follows. Section 2 describes the study area, data material, acquisition, and preprocessing. Section 3 elaborates the proposed transfer learning technique, data augmentation, and data rebalancing. Section 4 presents the experimental results, performance, and the comparisons of the trained ANN model and related models, and Section 5 provides the conclusions and future work.

2. Data Materials and Preprocessing

The in situ dataset acquired from Laguna Lake, the Philippines, was used to train the proposed ANN model, while the in situ dataset acquired from Lake Victoria, Uganda, which has a trophic state similar to that of Laguna Lake (i.e., mesotrophic), was utilized to test the trained model. The acquisitions of these two datasets are described in Section 2.1 and Section 2.2. The Sentinel-3 imagery used for Chla estimation and the data preprocessing are described in Section 2.3.

2.1. Laguna Lake of the Philippines

Laguna Lake, with an area of 900 km² and an average depth of 2.5 m, is the largest lake in the Philippines. There are more than 20 million people living in the surrounding areas of Laguna Lake, indicating the importance of the lake in providing freshwater for local daily needs [35]. However, around 17% of the lake water body (~150 km²) is occupied by aquaculture cages, where the nutrients and hazardous substances from industrial activity may pollute Laguna Lake, in addition to the issues of rapid population growth, industrialization, and urbanization [36,37]. In this study, field measurements of Chla concentrations were conducted during five different campaigns in 2019, as shown in Figure 1. Infinity-CLW ACLW2-USB, an optical-based data logger used to measure Chla concentrations, was installed on a boat at a depth of 0.5 m below the water’s surface. The data logger recorded the Chla concentrations once per second during a 5-hour field survey, collecting more than 15,000 records at each campaign. Outlier removal and data down-sampling were conducted to remove noise and to match the Chla concentration sampling resolution with the spatial resolution of the Sentinel-3 images, respectively. After the data pre-processing, 257 in situ Chla samples were obtained from the five field campaigns, as shown in Table 2, and the resulting samples were utilized to train the ANN model and related models for comparison and evaluation.

2.2. Lake Victoria of Uganda

The in situ Chla concentrations from Lake Victoria were obtained from the Mendeley Online Database (https://data.mendeley.com/ (accessed on 3 August 2021)), as provided by Deirmendjan et al. [38]. The study in [38] estimated the dissolved organic matter (DOM) with the support of the project Lake Victoria Greenhouse Gas Dynamics (LAVIGAS). In this project, there were three campaign periods: 29 March to 8 April 2018, 25 October to 4 November 2018, and 7 June to 17 June 2019. In each period, the water samples for Chla concentrations were measured daily at water depths ranging from 1 to 40 m. Four selection criteria were applied: (1) the samples should have the same trophic state as Laguna Lake (2.6–20 μg/L); (2) the measurement depth should be similar to that for Laguna Lake (0.5 m); (3) the Chla sampling time should match the Sentinel-3 image acquisition time; and (4) the Sentinel-3 image pixels corresponding to the collected Chla samples should be cloud-free. Under these criteria, only two in situ samples, shown in Figure 2, could be utilized. These two samples were used to evaluate the inference performance of the trained model and to compare it with related models.
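The sample selection described above can be read as a conjunction of simple filters. The following is a minimal sketch under stated assumptions: the `Sample` record and its field names are hypothetical, the mesotrophic range of 2.6–20 μg/L comes from the text, and the depth and date tolerances (`max_depth_m`, `max_day_gap`) are illustrative values, because the text does not quantify "similar" depth or "matching" time.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Sample:
    chla_ug_l: float   # in situ Chla concentration
    depth_m: float     # sampling depth
    sampled_on: date   # field measurement date
    image_on: date     # acquisition date of the nearest Sentinel-3 image
    cloud_free: bool   # whether the matching image pixels are cloud-free


def is_usable(sample: Sample,
              chla_range=(2.6, 20.0),   # mesotrophic range from the text
              max_depth_m=1.0,          # hypothetical tolerance
              max_day_gap=1) -> bool:   # hypothetical tolerance
    """Return True if the sample passes all four selection criteria."""
    same_trophic_state = chla_range[0] <= sample.chla_ug_l <= chla_range[1]
    similar_depth = sample.depth_m <= max_depth_m
    close_in_time = abs((sample.sampled_on - sample.image_on).days) <= max_day_gap
    return same_trophic_state and similar_depth and close_in_time and sample.cloud_free
```

With a one-day tolerance, a sample taken on 16 June 2019 and paired with an image from 15 June 2019 passes the time criterion.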

2.3. Sentinel-3 Image Dataset

Fifteen level 2 water full resolution (WFR) images of Laguna Lake, acquired by the ocean and land color instrument (OLCI) sensor of Sentinel-3, were utilized. A Sentinel-3 WFR image contains 16 atmospherically corrected bands, excluding bands 13–15 ($\lambda_{761}$, $\lambda_{764}$, $\lambda_{767}$) and bands 18–19 ($\lambda_{885}$, $\lambda_{900}$), which are mainly designed for atmospheric correction [39]. The water-leaving reflectance in Sentinel-3 WFR images is further divided by $\pi$ to derive the remote-sensing reflectance $R_{rs}$. In addition, the Sentinel-3 WFR product also contains several water quality parameters, including the Chla concentrations estimated by using an inverse radiative transfer model–neural network (IRTM-NN) [40]. The Chla concentrations from the IRTM-NN were regarded as a simulated dataset in this study and were used for model pretraining.

In the image data preprocessing, cloud-free water pixels in the $R_{rs}$ images and their neighboring local patches of the spatial size 7 × 7 were extracted. Image patches containing non-water pixels, such as cloud, and pixels with negative $R_{rs}$ values due to imprecise atmospheric correction or cloud shadow, were excluded from the dataset, forming full-water $R_{rs}$ image patches. The summary of the $R_{rs}$ image patches is presented in Table 3. Similarly, the cloud-free Sentinel-3 image patches corresponding to the locations with the simulated Chla concentrations generated by IRTM-NN were extracted. These image patches with simulated Chla concentrations were used in the model pretraining. The $R_{rs}$ water patches and their corresponding simulated Chla data were used as a training set. The training set is denoted as $\{(K_{i}, \mathit{s\_chla}_{i})\}_{i=1}^{n}$, where $n$ denotes the number of simulated labelled data, and $K_{i}$ and $\mathit{s\_chla}_{i}$ represent the $i$-th $R_{rs}$ water patch and its corresponding simulated Chla concentration, respectively. There were a total of 47,231 simulated labelled data. In addition, 275 in situ Chla data over Laguna Lake and their corresponding $R_{rs}$ water patches were used as the retraining dataset. The retraining dataset is denoted as $\{(P_{i}, \mathit{t\_chla}_{i})\}_{i=1}^{m}$, where $m$ represents the number of in situ Chla samples, and $P_{i}$ and $\mathit{t\_chla}_{i}$ represent the $i$-th $R_{rs}$ water patch and its corresponding in situ Chla concentration, respectively. In addition, one Sentinel-3 WFR $R_{rs}$ image of Lake Victoria, acquired on 15 June 2019, was also obtained. An $R_{rs}$ water patch of the size 7 × 7 located at the field measurement point $LV_{1}$ was extracted. As for the field measurement $LV_{2}$, which was taken on 16 June 2019, the water patch was also extracted from the Sentinel-3 image acquired on 15 June 2019. This means that the estimation for $LV_{2}$ was conducted using the image acquired one day before the field measurement.
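The patch extraction described above can be sketched as a sliding window that keeps a patch only when every pixel in it is valid water. This is an illustration rather than the authors' implementation: the function name, the nested-list image layout (`rrs[row][col][band]`), and the boolean `is_water` mask are assumptions made for the example.

```python
def extract_patches(rrs, is_water, size=7):
    """Collect full-water patches of size x size pixels.

    rrs      -- image as nested lists, rrs[row][col] is a list of band values
    is_water -- boolean mask, False for cloud, land, or other non-water pixels
    Returns a list of ((row, col), patch) with (row, col) the center pixel.
    """
    half = size // 2
    rows, cols = len(rrs), len(rrs[0])
    patches = []
    for r in range(half, rows - half):
        for c in range(half, cols - half):
            row_span = range(r - half, r + half + 1)
            col_span = range(c - half, c + half + 1)
            # Reject the patch if any pixel is non-water or has a negative R_rs.
            valid = all(
                is_water[i][j] and min(rrs[i][j]) >= 0
                for i in row_span for j in col_span
            )
            if valid:
                patch = [[rrs[i][j] for j in col_span] for i in row_span]
                patches.append(((r, c), patch))
    return patches
```

Each retained patch is then paired with the Chla value (simulated or in situ) at its center pixel to form one labelled sample.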

Considering the stability of the model training, the water patches from Laguna Lake and Lake Victoria containing $R_{rs}$ at 16 spectral bands were normalized to the range [0, 1] using the minimal and maximal $R_{rs}$ values at each spectral wavelength. The data normalization process was also performed for the in situ and simulated Chla concentration data.
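A per-band min-max normalization of this kind can be sketched as follows. The function names are hypothetical, and the sketch assumes that the minimum and maximum are computed over whatever collection of patches is passed in; the text does not say whether the statistics are taken over both lakes jointly or over the training data only.

```python
def band_min_max(patches):
    """Return per-band minima and maxima over a list of patches.

    Each patch is a nested list indexed as patch[row][col][band].
    """
    n_bands = len(patches[0][0][0])
    mins = [float("inf")] * n_bands
    maxs = [float("-inf")] * n_bands
    for patch in patches:
        for row in patch:
            for pixel in row:
                for b, value in enumerate(pixel):
                    mins[b] = min(mins[b], value)
                    maxs[b] = max(maxs[b], value)
    return mins, maxs


def normalize_patch(patch, mins, maxs):
    """Scale every band of a patch to [0, 1] with the given statistics."""
    def scale(value, lo, hi):
        # A constant band carries no information; map it to 0.
        return 0.0 if hi == lo else (value - lo) / (hi - lo)

    return [
        [[scale(v, mins[b], maxs[b]) for b, v in enumerate(pixel)] for pixel in row]
        for row in patch
    ]
```

The same scaling, applied to scalar Chla values, has to be inverted after prediction to report concentrations in μg/L.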

3. Methodology

3.1. Artificial Neural Network Model

An ANN model, namely WaterNet, proposed by Syariz et al. [34] was adopted. As shown in Figure 3, the input to the model was an image patch of the size 7 × 7 × 16 and the output was an estimated Chla concentration at the center pixel of the patch. The model is an end-to-end network structure consisting of three phases: band expansion, feature extraction, and Chla concentration estimation. In the band expansion phase, there was a convolutional layer with three 1 × 1 × 3 kernel filters. These filters perform convolution in the spectral domain and attempt to augment spectral features from the spectral bands of the input image patch, which is also known as spectral feature extraction via band combination [41,42,43]. Meanwhile, two convolutional layers, containing ten filters of the size 3 × 3 × 42 and five filters of the size 3 × 3 × 10, were utilized in the feature extraction phase. With those filters, the spatial feature information was extracted. The output of this phase was a feature map of the size 3 × 3 × 5, and this output was further flattened and linked to the Chla concentration estimation phase, which contained two fully connected layers. Rectified linear unit (ReLU) and sigmoid functions were used as the activation functions in the convolutional and fully connected layers, respectively. In total, this ANN model contained 4953 unknown parameters.
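The layer sizes quoted above can be cross-checked with a short shape calculation. The sketch assumes unpadded ("valid") convolutions with stride 1 and assumes that the band expansion applies three 1 × 1 × 3 filters along the spectral axis and stacks their outputs; both assumptions are inferred from the 42-channel and 3 × 3 × 5 sizes in the text and are not stated by the authors. The function names are hypothetical.

```python
def conv_valid(size: int, kernel: int) -> int:
    """Output length of an unpadded, stride-1 convolution."""
    return size - kernel + 1


def waternet_shapes(height=7, width=7, bands=16):
    """Trace (height, width, channels) through the three phases."""
    # Band expansion: each of the 3 spectral filters yields bands - 2 channels.
    expanded = 3 * conv_valid(bands, 3)
    # Feature extraction: two 3 x 3 spatial convolutions with 10 and 5 filters.
    h1, w1 = conv_valid(height, 3), conv_valid(width, 3)
    h2, w2 = conv_valid(h1, 3), conv_valid(w1, 3)
    return {
        "input": (height, width, bands),
        "band_expansion": (height, width, expanded),
        "feature_conv_1": (h1, w1, 10),
        "feature_conv_2": (h2, w2, 5),
        "flattened": h2 * w2 * 5,
    }
```

Under these assumptions the 16 input bands expand to 42 channels, which matches the 3 × 3 × 42 filter depth, and the flattened vector passed to the fully connected layers has 45 elements.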

3.2. ANN Model Training

Utilizing insufficient in situ Chla concentration data and unsuitable initialization for the unknown parameters in ANN model training may lead to model overfitting and make it difficult for the loss function to converge. Syariz et al. [34] proposed a two-stage training approach consisting of pretraining and main-training, which has been shown to deal with the aforementioned problems. The first stage provides a better initialization for the unknown parameters before the main stage by pretraining the model with the simulated labelled data $\{(K_{i}, \mathit{s\_chla}_{i})\}_{i=1}^{n}$. Here, the estimation error is large, and backpropagating the error could make the extraction of the spatial features suboptimal. Moreover, the loss function may not converge to its global minimum due to the utilization of the simulated data. However, this allows the model to have suitable initial values of the unknown parameters before the main training stage. Then, the pretrained model is refined with the in situ labelled data $\{(P_{i}, \mathit{t\_chla}_{i})\}_{i=1}^{m}$. This procedure is also known as transfer learning.

In this study, the two-stage training was adopted and the main stage part was improved by the implementation of fine-tuning, another kind of transfer learning technique. Moreover, data augmentation and rebalancing were also proposed and performed before the training in the improved main stage. The aim was to have more in situ labelled data with balanced amounts of samples in Chla concentration distribution space. Details regarding the data augmentation and rebalancing and the proposed transfer learning approach are explained below.

3.2.1. Data Augmentation and Rebalancing

To enrich the variability of the Chla in situ dataset, data augmentation was implemented, because convolutional processing is not invariant to rotation and scale [44,45]; however, augmentation alone does not consider the balance of the data. In this study, the data augmentation was performed on the in situ labelled data $\{(P_{i}, \mathit{t\_chla}_{i})\}_{i=1}^{m}$ by rotating the image patches (with angles of 90°, 180°, and 270°) and flipping the rotated images from left to right. Then, the rotated and flipped image patches were linked to their corresponding Chla concentrations as a new dataset, namely, an augmented dataset $\{(Q_{i}, \mathit{n\_chla}_{i})\}_{i=1}^{q}$, where $q$ is the number of rotated and flipped images (2216 data in total). The augmented dataset was further reclassified into 12 classes, with the first class starting from 6 μg/L, the last class ending at 12 μg/L, and each class covering 0.5 μg/L, as shown in Figure 4. Figure 4a shows the frequency of the in situ Chla in the augmented dataset. As seen, the number of samples differs greatly between the Chla concentration classes, which indicates an imbalanced distribution of the data. Training the model with imbalanced data may reduce the achievable accuracy, and therefore data rebalancing is necessary. For that, a sample rebalancing technique was conducted by randomly removing rotated and flipped in situ labelled data from any class whose frequency was more than 100 sets (see Figure 4b). This kept the number of Chla concentration data in each class equal to or less than 100 sets, thus balancing the data. In total, the data augmentation and rebalancing generated 900 rebalanced data $\{(R_{i}, \mathit{n\_chla}_{i})\}_{i=1}^{r}$, where $r$ denotes the number of rebalanced in situ labelled data, and $R_{i}$ and $\mathit{n\_chla}_{i}$ represent the $i$-th $R_{rs}$ water patch and its corresponding in situ Chla concentration, respectively. This also includes the original data $\{(P_{i}, \mathit{t\_chla}_{i})\}_{i=1}^{m}$. For simplification, the summary of dataset variations is described below.

■ the simulated labelled data $\{(K_{i}, \mathit{s\_chla}_{i})\}_{i=1}^{n}$,
■ the in situ labelled data, which also refer to the original dataset, $\{(P_{i}, \mathit{t\_chla}_{i})\}_{i=1}^{m}$,
■ the augmented dataset $\{(Q_{i}, \mathit{n\_chla}_{i})\}_{i=1}^{q}$, and
■ the rebalanced dataset $\{(R_{i}, \mathit{n\_chla}_{i})\}_{i=1}^{r}$.
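The augmentation and rebalancing steps of this subsection can be sketched as below. This is a minimal illustration, not the authors' code. It assumes that each patch yields eight variants (the patch at 0°, 90°, 180°, and 270°, each with its left-right flip), that the 12 classes are the 0.5 μg/L bins from 6 to 12 μg/L, and that values outside that range are clamped into the edge classes; the function names and the random seed are hypothetical.

```python
import random


def rot90(patch):
    """Rotate a 2D patch (list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*patch[::-1])]


def flip_lr(patch):
    """Flip a 2D patch from left to right."""
    return [list(row[::-1]) for row in patch]


def augment(patch):
    """Return the 8 rotated and flipped variants, starting with the original."""
    variants = []
    current = [list(row) for row in patch]
    for _ in range(4):
        variants.append(current)
        variants.append(flip_lr(current))
        current = rot90(current)
    return variants


def class_index(chla, low=6.0, width=0.5, n_classes=12):
    """Map a Chla value to one of the 0.5 ug/L classes between 6 and 12 ug/L."""
    index = int((chla - low) // width)
    return min(max(index, 0), n_classes - 1)  # clamp out-of-range values


def rebalance(samples, cap=100, seed=0):
    """Randomly drop samples so that no class keeps more than `cap` items."""
    rng = random.Random(seed)
    by_class = {}
    for patch, chla in samples:
        by_class.setdefault(class_index(chla), []).append((patch, chla))
    kept = []
    for index in sorted(by_class):
        members = by_class[index]
        if len(members) > cap:
            members = rng.sample(members, cap)
        kept.extend(members)
    return kept
```

Every variant returned by `augment` keeps the Chla label of its source patch, since rotation and flipping leave the center pixel in place.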

3.2.2. Transfer Learning

In this study, two-stage training was adopted and the main stage was improved by the implementation of fine-tuning. The main training therefore consists of two sub-stages, which are described as follows.

Main-training stage. With the help of the pretraining stage, the ANN model already contains suitable values for the unknown parameters. Training the model further with the rebalanced dataset increases the possibility that the global minimum of the loss function is reached. This also means that the accuracy of the estimation is enhanced, that is, the estimation error is smaller. The error is then backpropagated to update the unknown parameters, and the spatial features become more robust.

Fine-tuning stage. After the previous stage, the extraction of the spatial features is already powerful, and continuing to train the previously trained ANN model with the rebalanced dataset may only endanger the spatial features. Therefore, a fine-tuning technique is performed in this stage by means of “network surgery”. First, the model is split into two parts: the body part, consisting of the first and second phases of the ANN model; and the head part, consisting of the last phase of the ANN model, which is the Chla concentration estimation phase. The head part is then removed, leaving the body part only. Inputting an image patch to the body part alone results in a spatial feature image. In machine learning, the body that remains after the head is split off and removed is known as a feature extractor. Next, a new head part, containing a network similar to the last phase but with random initial values for the unknown parameters, is attached to the body part. Here, if the gradient is allowed to backpropagate from these random values all the way through the network, the powerful spatial features could be at risk. To prevent this problem, the layers in the body part, i.e., in the first and second phases of the model, are frozen (set as untrainable), so that backpropagation during training is performed on the new head only. This allows the network to start learning from the powerful spatial features, and the estimation of the Chla concentration can be optimized. Lastly, all of the layers are unfrozen (set as trainable). However, in contrast to the previous stage and sub-stage, in which the training is conducted with a learning rate of 0.001, the learning rate is now set to a very small value of 0.0001. The aim of setting such a small rate is to obtain a suitable joint adjustment of the body and head parts. For simplification, Figure 5 shows the workflow of the fine-tuning stage.
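The order of operations in the fine-tuning stage can be summarized as a small schedule. The sketch below is framework-free and purely schematic: the `Layer` record, the layer names, and the step labels are hypothetical, and a real implementation would use the trainable or requires-grad flags of a deep learning library. Only the two learning rates (0.001 and 0.0001) and the freeze/unfreeze order are taken from the text.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    part: str            # "body" (phases 1-2) or "head" (phase 3)
    trainable: bool = True


def build_layers():
    """Hypothetical layer list mirroring the three phases of the model."""
    return [
        Layer("band_expansion", "body"),
        Layer("feature_conv_1", "body"),
        Layer("feature_conv_2", "body"),
        Layer("new_fc_1", "head"),   # freshly initialized head
        Layer("new_fc_2", "head"),
    ]


def set_trainable(layers, part, flag):
    for layer in layers:
        if layer.part == part:
            layer.trainable = flag


def fine_tuning_plan(layers):
    """Return (step, learning_rate, trainable layer names) in execution order."""
    plan = []
    # Step 1: freeze the body so only the new head receives gradient updates.
    set_trainable(layers, "body", False)
    plan.append(("train_new_head", 1e-3, [l.name for l in layers if l.trainable]))
    # Step 2: unfreeze everything and adjust body and head with a small rate.
    set_trainable(layers, "body", True)
    plan.append(("train_all_layers", 1e-4, [l.name for l in layers if l.trainable]))
    return plan
```

Freezing first keeps large gradients from the randomly initialized head away from the convolutional filters; the small rate in the second step then lets both parts adapt together.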

For hyperparameters, the Adam optimizer is employed due to its capability of adaptively tuning the learning rate and moment [46], and the mean squared error (MSE) is used as the loss function $L$, defined as follows:

$$L = \frac{1}{m} \sum_{i=1}^{m} \left(\mathit{prChla}_{i} - \mathit{isChla}_{i}\right)^{2}, \qquad (1)$$

where $\mathit{prChla}_{i}$ is the Chla concentration predicted from the input image patch of the $i$-th labelled data and $\mathit{isChla}_{i}$ is the corresponding in situ Chla concentration. Moreover, overfitting is alleviated by adopting two regularization techniques: dropout and L₂ regularization. The dropout rate is set to 0.5, meaning that 50% of the units, and hence the unknown parameters attached to them, are temporarily deactivated at random during training, whereas the L₂ regularization adds a penalty based on the Frobenius norm of the weights to the loss function to penalize large weights during error backpropagation for the tuning of unknown parameters. The maximum number of epochs is set to 30, and the trained network from the epoch with the smallest value of the loss function is stored and used for the Chla concentration estimation.
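The loss in Equation (1), the L₂ penalty, and the selection of the best epoch can be written out in a few lines. This is a sketch under stated assumptions: the function names are hypothetical, the penalty is taken as a factor times the sum of squared weights, and the factor `l2_factor` is an illustrative value because the text does not report it.

```python
import math


def mse(predicted, observed):
    """Mean squared error of Equation (1)."""
    if not predicted or len(predicted) != len(observed):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(predicted)


def rmse(predicted, observed):
    """Root mean squared error, the accuracy measure used for evaluation."""
    return math.sqrt(mse(predicted, observed))


def regularized_loss(predicted, observed, weights, l2_factor=1e-4):
    """MSE plus an L2 penalty on the weights (l2_factor is hypothetical)."""
    return mse(predicted, observed) + l2_factor * sum(w * w for w in weights)


def best_epoch(epoch_losses):
    """Zero-based index of the epoch with the smallest loss."""
    return min(range(len(epoch_losses)), key=epoch_losses.__getitem__)
```

Keeping the weights of `best_epoch` instead of the final epoch corresponds to storing the network with the smallest loss within the 30-epoch budget.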

4. Experimental Results and Discussion

This study proposed a transfer learning technique consisting of model pretraining, main-training, and fine-tuning stages for Chla ANN model training with an insufficient in situ dataset. In addition, data augmentation and rebalancing were integrated with the transfer learning to enrich and rebalance the Chla in situ data. To evaluate the proposed method, a k-fold cross validation was performed with the Chla in situ dataset from Laguna Lake, the Philippines, where k was empirically set to 10. In this section, the results of the proposed transfer learning are presented in Section 4.1, and the effect of data imbalance on the trained ANN model is presented in Section 4.2. In addition, Section 4.3 compares the ANN model trained by the proposed transfer learning with the related models using the dataset from Lake Victoria, Uganda. For accuracy assessment, the root mean squared error (RMSE) is employed, which is obtained by taking the square root of the MSE in Equation (1).
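A 10-fold split of the kind used for this evaluation can be sketched as follows. The function name and the fixed random seed are hypothetical, and the sketch does not reproduce how the authors assigned samples to folds.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of nearly equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test
```

The RMSE is computed on the held-out fold in each round, and the range and average over the ten folds are the figures reported per training stage.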

4.1. Evaluation of the Transfer Learning

To evaluate the proposed transfer learning technique with the processes of data augmentation and rebalancing, the ANN named WaterNet was used for Chla concentration estimation. For details about WaterNet, please refer to Section 3.1. To evaluate the performance of the three training stages in the transfer learning, the hyperparameters, including the batch size, the optimizer, and the number of epochs, were kept the same, and 10-fold cross validation was performed on the rebalanced dataset from Laguna Lake. The evaluation results are presented in Table 4. After the model pretraining, the accuracy of the estimated Chla concentrations was not satisfactory. The range and average of the RMSEs of the folds were 2.070~2.228 μg/L and 2.144 μg/L, respectively. This implies that a poor performance with high estimation errors was obtained when training the ANN model using the simulated Chla data. Although the ANN model at this stage cannot effectively retrieve the Chla concentrations, this training stage can provide suitable initial values of the unknown parameters for the following stage. As a result, a better estimation result was obtained in the main-training stage. The range and average of the RMSEs of the folds decreased to 0.4866~0.6887 μg/L and 0.5819 μg/L, respectively. Moreover, the trained ANN model was further fine-tuned in the next stage. The average RMSE improved from 0.5819 μg/L in the second stage to 0.3724 μg/L in the third stage. This improvement is attributed to setting the layers in the band expansion and feature extraction phases to untrainable and only permitting the backpropagation to work on the layers in the Chla concentration estimation phase.

The ANN model trained by the proposed transfer learning was applied to five Sentinel-3 images, which were acquired on dates close to the field campaigns in Laguna Lake, the Philippines. The Chla concentration maps for the water body, shown in Figure 6, are visualized by colors ranging from yellow (6 μg/L) to red (12 μg/L). In addition, the outputs from the feature extraction phase in the ANN shown in Figure 3 are convolutional feature maps of the size 3 × 3 × 5. The feature maps indicate the importance of spatial features for the Chla estimation. To visualize the feature maps for the whole lake body, the center pixels of the feature maps were extracted and combined to form spatial feature maps. The Chla concentrations of Laguna Lake on 6 April 2019, estimated by the trained ANN, and the spatial feature maps extracted from the trained ANN are shown in Figure 7. The spatial feature maps #1 and #3 show stronger contrast than the others. To examine this, two dashed boxes are drawn on these two maps to mark the areas of interest for discussion. As shown in the brown dashed box, most of the features within this area have smaller values in feature map #1 and higher values in feature map #3. The significant differences between these two feature maps result in high Chla concentrations during the model prediction. As for those in the yellow dashed box, the opposite results are obtained, because the area is homogeneous and the pixels within this area have similar values. This observation revealed that the proposed transfer learning is able to preserve spatial features that are important in Chla concentration estimation.

4.2. Performance of Data Augmentation and Rebalancing

Three datasets are used and tested in this subsection, namely, the original, augmented, and balanced datasets. The original dataset refers to the Chla in situ data acquired from Laguna Lake, the Philippines. The augmented and balanced datasets are the augmented in situ datasets without and with, respectively, the consideration of the imbalance of in situ Chla concentrations. The comparisons of the proposed transfer learning using these three datasets are shown in Figure 8. The results indicate that the RMSEs of the training using the original dataset ranged from 0.5 μg/L to 1.0 μg/L. By using the augmented dataset, the RMSEs of estimated Chla concentrations ranged from 7.5 μg/L to 9.5 μg/L. This is because more Chla samples in the augmented dataset fall in the Chla concentration ranges 7.5~8 μg/L and 9.5~10 μg/L. Consequently, the sample imbalance in Chla concentrations makes the trained model perform worse than the model trained using the original dataset. When the data rebalancing, which considers the distribution of the samples' Chla concentrations in the augmented dataset, is performed, the RMSEs of the estimated Chla concentrations improve to 0.5~0.7 μg/L. Similar statistical results are shown in Figure 9, where the correlation coefficient between the estimated and in situ Chla concentrations improves when the data rebalancing is performed together with data augmentation.
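The idea of rebalancing samples in Chla concentration space can be sketched as below. This is a simplified stand-in and not the procedure used in this study: it bins samples by concentration and duplicates samples from sparse bins, whereas the bin width of 0.5 μg/L, the oversampling by duplication, and all names are assumptions made for illustration.

```python
import math
import random
from collections import defaultdict


def rebalance(samples, bin_width=0.5, seed=0):
    """Oversample so every occupied Chla bin holds the same number of samples.

    samples is a list of (features, chla) pairs with chla in ug/L.
    """
    rng = random.Random(seed)
    bins = defaultdict(list)
    for sample in samples:
        bins[math.floor(sample[1] / bin_width)].append(sample)
    target = max(len(members) for members in bins.values())
    balanced = []
    for key in sorted(bins):
        members = bins[key]
        balanced.extend(members)
        # Draw extra samples with replacement to reach the target count.
        balanced.extend(rng.choice(members)
                        for _ in range(target - len(members)))
    return balanced


# Hypothetical samples: four near 7.5~8 ug/L, one near 9.6, one near 11.2.
samples = [("a", 7.6), ("b", 7.7), ("c", 7.8), ("d", 7.9),
           ("e", 9.6), ("f", 11.2)]
balanced = rebalance(samples)
```

After rebalancing, each occupied bin contributes the same number of samples, so the loss is no longer dominated by the most frequent concentration ranges.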

4.3. Comparisons of Chla Estimation Models

A real model test, in which the machine learning model is trained and tested using two geographically different and independently collected sample datasets, is rarely conducted because of the limited in situ Chla samples and overfitting problems. In this study, an ANN model was trained using the proposed transfer learning with data augmentation and rebalancing. The training Chla sample dataset was collected from Laguna Lake, Philippines. The trained ANN model was then applied to the Chla samples acquired from Lake Victoria, Uganda, for testing and evaluation. In addition, the trained ANN model was compared with related models, including the three-band model [9], the two-band model [13], NDCI [14], and WaterNet [34]. WaterNet is described in Section 3.1, and the other models are presented in Table 5. For fair comparisons, the three-band and two-band models were calibrated with a linear regression model using in situ Chla-labelled data from Laguna Lake. Linear regression was selected to ease overfitting problems. In addition, the hyperparameters, namely the batch size, the optimizer, and the learning rate, used to train WaterNet with the original two-stage training were the same as those in the proposed training. Unlike the other compared models, the NDCI model does not need calibration, as it directly outputs the estimated Chla concentrations. All of the compared models were trained using the dataset from Laguna Lake and then tested using the dataset from Lake Victoria.
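The calibration of a band-ratio model with linear regression can be sketched as follows. The two index formulas follow Table 5; everything else is an assumption for illustration, including the function names, the reflectance values, and the synthetic Chla labels, none of which are data from this study.

```python
def two_band(rrs665, rrs709):
    """Two-band index from Table 5: Rrs(709) / Rrs(665)."""
    return rrs709 / rrs665


def three_band(rrs665, rrs709, rrs754):
    """Three-band index from Table 5: [1/Rrs(665) - 1/Rrs(709)] * Rrs(754)."""
    return (1.0 / rrs665 - 1.0 / rrs709) * rrs754


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical (Rrs665, Rrs709, Rrs754) triples; not measured reflectances.
triples = [(0.010, 0.012, 0.004), (0.012, 0.018, 0.005),
           (0.015, 0.015, 0.006), (0.020, 0.028, 0.007)]
ratios = [two_band(r665, r709) for r665, r709, _ in triples]
chla = [4.0 * r + 3.0 for r in ratios]  # synthetic labels for the demo
slope, intercept = fit_line(ratios, chla)
```

With only two fitted coefficients per model, such a calibration has little capacity to overfit a small in situ dataset, which is consistent with the reason given for choosing linear regression.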

Table 6 compares WaterNet trained using the original two-stage training with the original data, the proposed method (the improved transfer learning with data augmentation and rebalancing), and the related models on the Chla dataset from Lake Victoria. The results indicate that the three-band model (RMSE = 0.588 μg/L) and the two-band model (RMSE = 0.509 μg/L) have similar Chla concentration prediction accuracy. This may be because these two models utilize similar $R_{rs}$ features, namely $R_{rs}$ at $\lambda_{443}$ and at $\lambda_{490}$, which share similar sensitivity to the absorption [13]. WaterNet trained with the original two-stage training and data performed similarly, with RMSE = 0.496 μg/L. Better performance was obtained with NDCI and with WaterNet trained by the proposed method. The RMSEs of WaterNet with the proposed training and of NDCI were 0.229 μg/L and 0.244 μg/L, respectively, so the proposed training was slightly better than NDCI. This suggests that the proposed transfer learning with data augmentation and rebalancing mitigates the overfitting problem and that the trained model outperforms the related models.
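The RMSE values in Table 6 follow from the per-station estimation errors at LV₁ and LV₂. The short sketch below recomputes them; the error values are copied from Table 6, while the dictionary layout and the function name `rmse` are only illustrative.

```python
import math

# Per-station estimation errors (ug/L) at LV1 and LV2, copied from Table 6.
errors = {
    "Three-band model": [0.746, 0.367],
    "Two-band model": [0.653, 0.303],
    "NDCI": [0.304, -0.164],
    "WaterNet": [0.645, 0.277],
    "Proposed method": [0.302, 0.117],
}


def rmse(errs):
    """Root mean square of a list of estimation errors."""
    return math.sqrt(sum(e * e for e in errs) / len(errs))


table_rmse = {name: round(rmse(errs), 3) for name, errs in errors.items()}
for name, value in table_rmse.items():
    print(f"{name}: {value:.3f} ug/L")
```

Because only two stations are available, each RMSE is dominated by the larger of the two errors, so small differences between models should be read with caution.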

5. Conclusions and Future Work

A transfer learning method consisting of model pretraining, main-training, and fine-tuning stages was proposed to train ANN models for Chla concentration estimation using Sentinel-3 images. In addition, data augmentation and rebalancing were performed not only to increase the variability of the training dataset, but also to balance the samples in terms of Chla concentrations. To evaluate the mitigation of overfitting and to compare with related models, the models were trained using the Chla dataset from Laguna Lake and then tested using the Chla dataset from Lake Victoria, which has the same trophic state as Laguna Lake. The quantitative assessments on the Sentinel-3 WFR images demonstrate that the proposed transfer learning method performs better than the original training of WaterNet, and the trained CNN outperforms the related models in terms of Chla estimation accuracy. Considering that the data rebalancing strongly affects the performance of the model, in the near future, WaterNet will be redesigned such that the neural network can be applied to other optical satellite imagery with better spatial resolution, including Sentinel-2 and Landsat 8 images, in order to improve the extraction of important spatial features in lake water bodies. In addition, other water quality parameters, such as turbidity and total suspended matter, will be included in the modelling.

Author Contributions

Conceptualization, M.A.S., C.-H.L. and L.M.J.; data curation, M.A.S.; formal analysis, M.A.S., C.-H.L. and L.M.J.; funding acquisition, C.-H.L.; investigation, M.A.S., D.H., U.L. and B.M.S.; methodology, M.A.S.; project administration, C.-H.L.; software, M.A.S. and D.H.; supervision, C.-H.L., U.L., B.M.S. and L.M.J.; validation, M.A.S. and D.H.; visualization, M.A.S.; writing—original draft, M.A.S., C.-H.L. and L.M.J. All authors have read and agreed to the published version of the manuscript.

Funding

This research was partially funded by the Ministry of Science and Technology, Taiwan (grant numbers MOST 106-2923-M-006-003-MY3 and 109-2923-M-006-001-MY3); and the Indonesian Ministry of Research and Technology/National Agency for Research and Innovation (grant number 1377/PKS/ITS/2020).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

We would like to thank Ariel C. Blanco from the University of the Philippines and Loris Deirmendjian from Paul Sabatier University and colleagues for the collection and sharing of water quality data samples from Laguna Lake and Lake Victoria, respectively. Sentinel-3 imagery courtesy of the European Space Agency.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Kira, T.; Ide, S.; Fukada, F.; Nakamura, M. Lake Biwa Experience and Lessons Learned Brief. In Managing Lakes and Their Basins for Sustainable Use: A Report for Lake Basin Manegers and Stakeholders; International Lake Environment Committee: Otsu, Japan, 2006; Volume 5.
2. Kementerian Lingkungan Hidup. Profil 15 Danau Prioritas Indonesia; Kementerian Lingkungan Hidup: Jakarta, Indonesia, 2011.
3. Ipsos Business Consultant. Indonesia’s Aquaculture-Key Sectors for Future Growth; Ipsos Business Consultant: Jakarta, Indonesia, 2010.
4. World Bank. Fish to 2030 Prospects for Fisheries and Aquaculture; World Bank: Washington, DC, USA, 2013.
5. Cristina, S.; Fragoso, B.; Icely, J.; Grant, J. Aquaspace Project Document; Aquaspace Project: Sagaremisco, Portugal, 2018.
6. Gurlin, D.; Gitelson, A.A.; Moses, W.J. Remote estimation of chl-a concentration in turbid productive waters-Return to a simple two-band NIR-red model? Remote Sens. Environ. 2011, 115, 3479–3490.
7. Moutzouris-Sidiris, I.; Topouzelis, K. Assessment of Chlorophyll-a concentration from Sentinel-3 satellite images at the Mediterranean Sea using CMEMS open source in situ data. Open Geosci. 2021, 13, 85–97.
8. Carlson, R.E. A trophic state index for lakes. Limnol. Oceanogr. 1977, 22, 361–369.
9. Dall’Olmo, G.; Gitelson, A.A. Effect of bio-optical parameter variability and uncertainties in reflectance measurements on the remote estimation of chlorophyll-a concentration in turbid productive waters: Modeling results. Appl. Opt. 2006, 45, 3577.
10. Al Shehhi, M.R.; Gherboudj, I.; Zhao, J.; Ghedira, H. Improved atmospheric correction and chlorophyll-a remote sensing models for turbid waters in a dusty environment. ISPRS J. Photogramm. Remote Sens. 2017, 133, 46–60.
11. Chen, J.; Zhang, X.; Quan, W. Retrieval chlorophyll-a concentration from coastal waters: Three-band semi-analytical algorithms comparison and development. Opt. Express 2013, 21, 9024.
12. Gitelson, A.A.; Dall’Olmo, G.; Moses, W.; Rundquist, D.C.; Barrow, T.; Fisher, T.R.; Gurlin, D.; Holz, J. A simple semi-analytical model for remote estimation of chlorophyll-a in turbid waters: Validation. Remote Sens. Environ. 2008, 112, 3582–3593.
13. Moses, W.J.; Gitelson, A.A.; Berdnikov, S.; Povazhnyy, V. Satellite estimation of chlorophyll-a concentration using the red and NIR bands of MERIS-The azov sea case study. IEEE Geosci. Remote Sens. Lett. 2009, 6, 845–849.
14. Mishra, S.; Mishra, D.R. Normalized difference chlorophyll index: A novel model for remote estimation of chlorophyll-a concentration in turbid productive waters. Remote Sens. Environ. 2012, 117, 394–406.
15. Andrzej Urbanski, J.; Wochna, A.; Bubak, I.; Grzybowski, W.; Lukawska-Matuszewska, K.; Łącka, M.; Śliwińska, S.; Wojtasiewicz, B.; Zajączkowski, M. Application of Landsat 8 imagery to regional-scale assessment of lake water quality. Int. J. Appl. Earth Obs. Geoinf. 2016, 51, 28–36.
16. Niroumand-jadidi, M.; Bovolo, F.; Bruzzone, L. Novel Spectra-Derived Features for Empirical Retrieval of Water Quality Parameters: Demonstrations for OLI, MSI. IEEE Trans. Geosci. Remote Sens. 2019, 57, 10285–10300.
17. Van Nguyen, M.; Lin, C.H.; Chu, H.J.; Jaelani, L.M.; Syariz, M.A. Spectral feature selection optimization for water quality estimation. Int. J. Environ. Res. Public Health 2020, 17, 272.
18. Wang, X.; Zhang, F.; Ding, J. Evaluation of water quality based on a machine learning algorithm and water quality index for the Ebinur Lake Watershed. Sci. Rep. 2017, 7, 12858.
19. Zhang, Y.; Feng, X.; Cheng, X.; Wang, C. Remote estimation of chlorophyll-a concentrations in Taihu Lake during cyanobacterial algae bloom outbreak. In Proceedings of the 2011 19th International Conference on Geoinformatics, Shanghai, China, 24–26 June 2011; pp. 1–6.
20. Guo, Y.; Liu, C.; Ye, R.; Duan, Q. Advances on water quality detection by uv-vis spectroscopy. Appl. Sci. 2020, 10, 6874.
21. Buckton, D.; O’Mongain, E.; Danaher, S. The use of Neural Networks for the estimation of oceanic constituents based on the MERIS instrument. Int. J. Remote Sens. 1999, 20, 1841–1851.
22. Kown, Y.S.; Baek, S.H.; Lim, Y.K.; Pyo, J.C.; Ligaray, M.; Park, Y.; Cho, K.H. Monitoring coastal chlorophyll-a concentrations in coastal areas using machine learning models. Water 2018, 10, 1020.
23. Samli, R.; Sivri, N.; Sevgen, S.; Kiremitci, V.Z. Applying artificial neural networks for the estimation of chlorophyll-a concentrations along the Istanbul coast. Pol. J. Environ. Stud. 2014, 23, 1281–1287.
24. Wang, Q.; Wang, S. A predictive model of chlorophyll a in western lake erie based on artificial neural network. Appl. Sci. 2021, 11, 6529.
25. Hafeez, S.; Wong, M.; Ho, H.; Nazeer, M.; Nichol, J.; Abbas, S.; Tang, D.; Lee, K.; Pun, L. Comparison of Machine Learning Algorithms for Retrieval of Water Quality Indicators in Case-II Waters: A Case Study of Hong Kong. Remote Sens. 2019, 11, 617.
26. Aptoula, E.; Ariman, S. Chlorophyll-a Retrieval From Sentinel-2 Images Using Convolutional Neural Network Regression. IEEE Geosci. Remote Sens. Lett. 2021, 20, 1–5.
27. Choi, J.H.; Kim, J.; Won, J.; Min, O. Modelling Chlorophyll-a Concentration using Deep Neural Networks considering Extreme Data Imbalance and Skewness. In Proceedings of the 2019 21st International Conference on Advanced Communication Technology (ICACT), Pyeongchang, Korea, 17–20 February 2019; pp. 631–634.
28. Pyo, J.C.; Duan, H.; Baek, S.; Kim, M.S.; Jeon, T.; Kwon, Y.S.; Lee, H.; Cho, K.H. A convolutional neural network regression for quantifying cyanobacteria using hyperspectral imagery. Remote Sens. Environ. 2019, 233, 111350.
29. Syariz, M.A.; Lin, C.; Blanco, A.C. Chlorophyll-a Concentration Retrieval using Convolutional Neural Networks in Laugna Lake, Philippines. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2019, 42, 14–15.
30. Van Nguyen, M.; Lin, C.; Syariz, M.A.; Thu, T.; Le, H.; Blanco, A.C. Multi-task Convolution Neural Network for Season-insensitive Chlorophyll-a Estimation in Inland Water. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 10439–10449.
31. Ioannou, I.; Gilerson, A.; Gross, B.; Moshary, F.; Ahmed, S. Deriving ocean color products using neural networks. Remote Sens. Environ. 2013, 134, 78–91.
32. Ioannou, I.; Gilerson, A.; Gross, B.; Moshary, F.; Ahmed, S. Neural network approach to retrieve the inherent optical properties of the ocean from observations of MODIS. Appl. Opt. 2011, 50, 3168.
33. Yu, B.; Xu, L.; Peng, J.; Hu, Z. Global chlorophyll-a concentration estimation from moderate resolution imaging spectroradiometer using convolutional neural networks. J. Appl. Remote Sens. 2021, 14, 034520.
34. Syariz, M.A.; Lin, C.H.; Van Nguyen, M.; Jaelani, L.M.; Blanco, A.C. WaterNet: A convolutional neural network for chlorophyll-a concentration retrieval. Remote Sens. 2020, 12, 1966.
35. Saguin, K. Biographies of fish for the city: Urban metabolism of Laguna Lake aquaculture. Geoforum 2014, 54, 28–38.
36. Herrera, E.; Nadaoka, K.; Blanco, A.C.; Hernandez, E.C. Hydrodynamic investigation of a shallow lake environment (Laguna Lake, Philippines) and associated implications for eutrophic vulnerability. ASEAN Eng. J. Part C 2015, 4, 48–62.
37. Tamayo-Zafaralla, M.; Santos, R.A.V.; Orozco, R.P.; Elegado, G.C.P. The ecological status of Lake Laguna de Bay, Philippines. Aquat. Ecosyst. Health Manag. 2002, 5, 127–138.
38. Deirmendjian, L.; Lambert, T.; Morana, C. Dissolved organic matter composition and reactivity in Lake Victoria, the World’s largest tropical lake. Biogeochemistry 2020, 150, 61–83.
39. European Space Agency. Copernicus Sentinel-3 OLCI Land User Handbook; European Space Agency: Paris, France, 2021.
40. Bricaud, A.; Morel, A.; Babin, M.; Allali, K.; Claustre, H. Variations of light absorption by suspended particles with chlorophyll a concentration in oceanic (case 1) waters: Analysis and implications for bio-optical models. J. Geophys. Res. 1998, 103, 31033–31044.
41. Ha, N.T.T.; Koike, K.; Nhuan, M.T.; Canh, B.D.; Thao, N.T.P.; Parsons, M. Landsat 8/OLI Two bands ratio algorithm for chlorophyll-a concentration mapping in hypertrophic waters: An application to west lake in Hanoi (Vietnam). IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2017, 10, 4919–4929.
42. Ha, N.T.T.; Koike, K.; Nhuan, M.T. Improved accuracy of chlorophyll-a concentration estimates from MODIS Imagery using a two-band ratio algorithm and geostatistics: As applied to the monitoring of eutrophication processes over Tien Yen Bay (Northern Vietnam). Remote Sens. 2013, 6, 421–442.
43. Menon, H.B.; Adhikari, A. Remote Sensing of Chlorophyll-A in Case II Waters: A Novel Approach With Improved Accuracy Over Widely Implemented Turbid Water Indices. J. Geophys. Res. Ocean. 2018, 123, 8138–8158.
44. Kohl, S.A.A.; Romera-Paredes, B.; Meyer, C.; De Fauw, J.; Ledsam, J.R.; Maier-Hein, K.H.; Ali Eslami, S.M.; Rezende, D.J.; Ronneberger, O. A probabilistic U-net for segmentation of ambiguous images. arXiv 2018, arXiv:1806.05034.
45. Patterson, J.; Gibson, A. Deep Learning: A Practitioner’s Approach; O’Reilly Media: Sebastopol, CA, USA, 2017; ISBN 9781491914250.
46. Kingma, D.P.; Ba, J. Adam: A Method for Stochastic Optimization. In Proceedings of the ICLR, San Diego, CA, USA, 7–9 May 2015; pp. 1–15.

Figure 1. Laguna Lake and field campaigns for Chla concentration collection. The routes of the field campaigns and the locations of the collected samples are visualized by colors.

Figure 2. Lake Victoria and in situ samples. The locations of samples are marked by red dots, and the sample information is provided.

Figure 3. Network structure of WaterNet.

Figure 4. Number of in situ Chla samples per concentration range after performing data augmentation (a) without and (b) with the consideration of the balance of data.

Figure 5. Training flow in the main-training stage of the proposed method. Blue and grey boxes represent trained unknown parameters in a layer with a learning rate of 0.001 and 0.0001, respectively; the green box denotes the random reinitialization of unknown parameters in a layer.

Figure 6. Maps of estimated Chla concentrations using the trained ANN model. The Sentinel-3 images are shown in false color combination (R: Band 17; G: Band 5; B: Band 3).

Figure 7. Feature maps in the trained ANN model. Chla concentration estimation map on 6 April 2019 (left) and the corresponding spatial features output by the feature extraction phase in WaterNet (right).

Figure 8. Comparisons of ANN model training using the original, augmented, and balanced datasets.

Figure 9. Performance of ANN model training with and without data rebalancing. Yellow and silver dots represent the estimated Chla concentrations using the augmented dataset with and without the process of data rebalancing.

Table 1. Description of trophic state index [8].

Trophic Class	Chla Concentration Range (in μg/L)	Water Condition
Oligotrophic	0~2.6	A lake with very clear waters and high drinking water quality due to low nutrient content and algal production.
Mesotrophic	2.6~20	Commonly clear water lakes with beds of submerged aquatic plants and medium levels of nutrients.
Eutrophic	20~56	The water body will be dominated either by aquatic plants or algae.
Hypertrophic	More than 56	Highly nutrient-rich lakes characterized by frequent and severe nuisance algal blooms and low transparency.

Table 2. Statistical summary of the in situ samples from Laguna Lake. “Min.”, “Max.”, “Mean”, and “Std.” represent the minimum, maximum, mean, and standard deviation of the Chla concentrations, respectively.

Campaign #	Date (in 2019)	# of Samples	Min. (μg/L)	Max. (μg/L)	Mean (μg/L)	Std. (μg/L)
1	11 Jan	35	9.072	13.235	11.391	0.655
2	29 Mar	74	7.378	8.076	7.906	0.163
3	6 Apr	98	6.980	10.970	8.459	1.218
4	26 Apr	22	6.731	7.692	7.254	0.315
5	30 Apr	48	7.856	11.295	9.613	0.801

Table 3. Summary of Sentinel-3 R_rs image patches from Laguna Lake.

Image Acquisition Day	# of Image Patches (Pretraining Stage)	# of Image Patches (Transfer-Learning Stage)
11 Jan	1008	35
29 Mar	4715	74
6 Apr	5681	98
26 Apr	1908	22
30 Apr	2582	48
15 Jan	3471
22 Jan	2017
7 Feb	3722
8 Feb	3681
19 Feb	2877
2 Mar	1654
10 Mar	3393
26 Mar	2766
10 Apr	3984
21 Apr	3772
Total	47,231	275

Table 4. Performance of training stages in the proposed transfer learning.

Fold	Pretraining Stage (RMSE in μg/L)	Main-Training Stage (RMSE in μg/L)	Fine-Tuning Stage (RMSE in μg/L)
1	2.070	0.689	0.478
2	2.144	0.562	0.229
3	2.113	0.487	0.219
4	2.089	0.606	0.430
5	2.190	0.576	0.414
6	2.120	0.517	0.284
7	2.194	0.609	0.441
8	2.228	0.560	0.387
9	2.216	0.633	0.508
10	2.079	0.581	0.336
Avg.	2.144	0.582	0.372
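The folds in Table 4 come from 10-fold cross validation. A minimal sketch of such a split is given below; the shuffling seed, the interleaved fold assignment, the function name `k_fold_indices`, and the sample count of 275 used in the demo are assumptions for illustration and do not describe the exact split used in this study.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Interleaved assignment keeps fold sizes within one sample of each other.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Hypothetical dataset size; each sample is tested exactly once.
splits = list(k_fold_indices(275, k=10))
fold_sizes = [len(test) for _, test in splits]
```

Each fold trains on roughly 90% of the samples and reports the RMSE on the remaining 10%, which gives one row of the table per fold.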

Table 5. Information of the compared Chla estimation models.

Model Name	Formula	Calibration Model
Three-band model	$[R_{rs}^{-1}(665) - R_{rs}^{-1}(709)] \times R_{rs}(754)$	Linear regression
Two-band model	$R_{rs}(709) \div R_{rs}(665)$	Linear regression
NDCI	$[R_{rs}(665) - R_{rs}(709)] \div [R_{rs}(665) + R_{rs}(709)]$
WaterNet

Table 6. Comparisons of the ANN model trained by the proposed transfer learning with the related models using Chla samples acquired from Lake Victoria, Uganda.

Station Name	Three-Band Model (μg/L)	Two-Band Model (μg/L)	NDCI (μg/L)	WaterNet (μg/L)	Proposed Method (μg/L)
LV₁	0.746	0.653	0.304	0.645	0.302
LV₂	0.367	0.303	−0.164	0.277	0.117
RMSE	0.588	0.509	0.244	0.496	0.229

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
