1. Introduction
Over half of the world’s population lives near the coast, which places increasing pressure on coastal environments [1]. Extensive anthropogenic activities along the coast, such as land reclamation and the disposal of household and industrial waste into coastal waters, significantly degrade coastal water quality [2,3]. Additionally, untreated storm water runoff into coastal areas is a major source of marine water pollution [4]. Thus, effective coastal monitoring programs are needed to protect and maintain coastal ecosystems. Monitoring these dynamic ecosystems requires the integration of in situ data with data of high spatial and temporal resolution, such as remotely-sensed data [5]. Remote sensing datasets provide a synoptic view of coastal areas, with the ability to measure the upwelling radiance in different spectral regions [6,7,8,9,10].
Coastal waters, specifically estuarine coastal waters, are dynamic in nature and are characterized by fluctuating water turbidity. For instance, three major water types exist in the coastal area of Hong Kong: clear water in the eastern region, moderately turbid water in the central region, and highly turbid water in the western region. The western region of Hong Kong receives a large amount of suspended particles from the Pearl River, while the eastern region has a markedly clearer water type due to its connection with the South China Sea. A single model developed for such a complex coastal environment might be associated with high uncertainties. Thus, developing a separate water quality parameter (WQP) estimation model for each water type found in the region could improve the estimation of coastal water parameters. A previous study [11] suggested the existence of five optically distinct water classes in the coastal area of Hong Kong (Figure 1). Different studies [4,12,13] have reported the role of chlorophyll-a (Chl-a) and suspended solids (SS) in aquatic ecosystem health, tourism, shipping, and fisheries, all of which can benefit if estimates of Chl-a and SS are obtained with high accuracy. Additionally, accuracy becomes more important for complex water bodies, such as the South China Sea, which receives a high volume of pollutants and storm water runoff from the Pearl River Delta.
Coastal WQPs have been estimated over a variety of geographical locations and environmental conditions using different remotely-sensed data, including Landsat and Chinese HJ-1 A/B charge-coupled device sensor (hereafter referred to as HJ-1 CCD) imagery [14,15,16,17,18,19]. Remote sensing studies have also been conducted over the Pearl River Delta and Hong Kong coastal area to estimate coastal water constituents. Zhang et al. [20] used the Terra MODIS (Moderate Resolution Imaging Spectroradiometer, 1000 m) and MERIS (Medium Resolution Imaging Spectrometer, 300 m) sensors to estimate Chl-a concentrations. Xi and Zhang [21] developed an empirical model using MERIS imagery to retrieve SS concentrations. Using a two-band model (MERIS bands 6 and 7), they retrieved and mapped the SS concentration with an R² of 0.75 and a root mean square error (RMSE) of 1.69 mg/L. Tian et al. [22] used medium spatial resolution (30 m) HJ-1 CCD sensor data to model SS concentrations for the Deep Bay area of Hong Kong.
Chen et al. [23,24] classified the waters of the Hong Kong and Pearl River Delta region into five classes based on their optical properties, using a Landsat Thematic Mapper (TM) image. They used three classification techniques, namely maximum likelihood (MLH), neural network (NN), and support vector machine (SVM), to recognize spatial patterns in water color, and found similar spatial patterns of spectral reflectance across all techniques, with varying classification accuracies. In classifying the optically different water types, they made imprecise assumptions in (i) obtaining actual reflectance values from Landsat TM, i.e., they used an image-based atmospheric correction method without subsequent validation of the atmospheric correction results; (ii) defining classes based on only one TM image, i.e., a single image cannot capture the variation in coastal water dynamics; (iii) training and validating the classification techniques using satellite-retrieved WQPs, i.e., Chl-a and SS concentrations were retrieved from SeaWiFS (Sea-Viewing Wide Field-of-View Sensor, 1.1 km) and AVHRR (Advanced Very High Resolution Radiometer, 1.1 km) images, respectively; and (iv) considering that the water reflectance was constant over different dates and that flushing in the Pearl River Delta was not strong, i.e., the time difference between the image date and the in situ WQP data collection was 3–12 days (for 66% of the data) and 14–23 days (for 34% of the data). However, it has been reported that water properties in a turbid estuary can vary within ±24 h [25].
In this study, we aimed to use in situ and remotely-sensed data to model two of the most commonly estimated WQPs (i.e., Chl-a and SS concentrations) over the entire coastal area of Hong Kong and within each of its coastal water classes, using empirical predictive and machine learning methods. Empirical methods are straightforward, whereas machine learning methods require a certain level of user expertise but are computationally fast and can handle large datasets [26]. Hence, the results of this study will not only benefit the scientific community, but will also help policy-makers to protect Hong Kong’s coastal environment based on reliable and efficient routine estimates of Chl-a and SS concentrations.
3. Methodology
3.1. Satellite Imagery Pre-Processing
Landsat TM/ETM+ scenes were processed to Standard Terrain Correction (Level 1T) by the provider, i.e., the United States Geological Survey (USGS). The Level 1T product provides systematic geometric accuracy by incorporating ground control points (GCPs) while employing a digital elevation model (DEM) for topographic accuracy. The HJ-1 CCD images, as obtained from the China Centre for Resources Satellite Data and Application (CRESDA), were not geometrically corrected. Therefore, all the HJ-1 CCD images were geometrically corrected using concurrent or most recent Landsat TM/ETM+ image(s) as a reference. As the HJ-1 CCD images have a wider swath (360 km) than a Landsat TM/ETM+ scene (185 km), all the HJ-1 CCD images were subset to the study area (i.e., Hong Kong) before geometric rectification to reduce the processing time and save disk space.
After a sufficient number of GCPs (from 25 to 50) had been collected from the reference Landsat TM/ETM+ image(s), the HJ-1 CCD images were geometrically corrected with an average RMSE of 1 pixel. To minimize the loss of spectral information resulting from image resampling during geometric correction, the nearest neighbor resampling method was used. After the geometric correction, the images were converted to a standard radiometric scale. For the Landsat TM and ETM+ sensors, Equation (1) was used to convert the digital numbers (DN) to the top-of-atmosphere (TOA) radiance (Lsatλ) [27]:
$$L_{sat\lambda} = \left(\frac{L_{max\lambda} - L_{min\lambda}}{Q_{calmax} - Q_{calmin}}\right)\left(Q_{cal\lambda} - Q_{calmin}\right) + L_{min\lambda} \qquad (1)$$
where:
Lsatλ = At-satellite spectral radiance/TOA radiance for band λ (W/(m2 sr μm));
Qcalλ = Quantized calibrated pixel value for band λ (DN);
Qcalmin = Minimum quantized calibrated pixel value corresponding to Lminλ (DN);
Qcalmax = Maximum quantized calibrated pixel value corresponding to Lmaxλ (DN);
Lminλ = Spectral at-sensor radiance, scaled to Qcalmin for band λ (W/(m2 sr μm)); and
Lmaxλ = Spectral at-sensor radiance, scaled to Qcalmax for band λ (W/(m2 sr μm)).
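As a minimal sketch of the linear DN-to-radiance rescaling implied by the terms defined above, the snippet below converts a single quantized pixel value. The function name and the calibration constants in the usage lines are illustrative assumptions, not values from this study; in practice they are read from the image metadata.

```python
def dn_to_radiance(qcal, lmin, lmax, qcalmin=1.0, qcalmax=255.0):
    """Linearly rescale a quantized pixel value (DN) to TOA radiance."""
    # Radiance change per DN step for this band.
    gain = (lmax - lmin) / (qcalmax - qcalmin)
    return gain * (qcal - qcalmin) + lmin


# Hypothetical calibration constants, for illustration only.
LMIN, LMAX = -1.5, 190.0
print(dn_to_radiance(1, LMIN, LMAX))    # lowest DN maps to LMIN
print(dn_to_radiance(255, LMIN, LMAX))  # highest DN maps to LMAX
```

By construction, Qcalmin maps to Lminλ and Qcalmax maps to Lmaxλ, which is a quick sanity check on any set of metadata values.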
For the HJ-1 CCD sensors, Equation (4) (given in the metadata file of the imagery) was used to convert the DN values to TOA radiance (Lsatλ):
$$L_{sat\lambda} = \frac{DN_{\lambda}}{G_{\lambda}} + L_{0\lambda} \qquad (4)$$
where:
DNλ = Quantized calibrated pixel value for band λ;
Gλ = Band-specific gain factor [DN/(W/(m2 sr μm))] for band λ; and
L0λ = Band-specific bias factor [W/(m2 sr μm)] for band λ.
The values of all the above-mentioned parameters, required for the radiometric correction, were obtained from the respective image metadata files.
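The HJ-1 CCD conversion can be sketched in the same way. The form used here (DN divided by the gain, plus the bias) is an assumption inferred from the stated units of the gain (DN per unit radiance) and the bias (radiance); the band names and coefficient values are hypothetical placeholders for what the metadata file would supply.

```python
def hj1_dn_to_radiance(dn, gain, bias):
    """Convert an HJ-1 CCD DN to TOA radiance, assuming L = DN / gain + bias."""
    return dn / gain + bias


# Hypothetical per-band (gain, bias) pairs, for illustration only.
coefficients = {"B1": (0.5, 2.0), "B2": (0.8, 4.0)}
pixel_dn = {"B1": 100, "B2": 120}

radiance = {
    band: hj1_dn_to_radiance(pixel_dn[band], gain, bias)
    for band, (gain, bias) in coefficients.items()
}
print(radiance)
```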
3.2. Cross-Comparison of Sensors
Landsat TM/ETM+ and HJ-1 CCD sensors have similar band designations, data bit levels (8-bit), and spatial resolutions (30 m) [30]. Further, Nazeer and Nichol (2015) [31] compared the image statistics for common homogeneous water areas observed near-simultaneously by the two satellite sensors. They found a high degree of consistency between the two sensors, i.e., a correlation of ≥0.80 was observed for the visible and near-infrared bands (B1, B2, B3, and B4). Therefore, this analysis provides a sound basis for combining the data from the two sensors.
3.3. Atmospheric Correction
Atmospheric correction is a key pre-processing step that needs to be performed when combining data from different sensors, as well as from different dates [32]. Atmospheric correction was, therefore, performed using the Second Simulation of the Satellite Signal in the Solar Spectrum (6S) method [28]. 6S is a radiative transfer model [33] that determines the three coefficients (xa, xb, and xc) required for the estimation of surface reflectance (ρλ) for each band (λ) (Equation (5)). The coefficients (xa, xb, and xc) were calculated by the 6S model based on input parameters including sensor type, image acquisition date and time, solar zenith and azimuth angles, sensor zenith and azimuth angles, atmospheric model, and aerosol optical depth/visibility:
$$\rho_{\lambda} = \frac{x_{a} L_{sat\lambda} - x_{b}}{1 + x_{c}\left(x_{a} L_{sat\lambda} - x_{b}\right)} \qquad (5)$$
where xa is the inverse of the transmittance, xb is the scattering term of the atmosphere, and xc is the reflectance of the atmosphere for isotropic light.
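To illustrate how the three 6S coefficients are applied, the sketch below uses the relation that 6S reports with its output coefficients: y = xa·L − xb, followed by reflectance = y / (1 + xc·y). The function name and the coefficient values are hypothetical; real coefficients come from a 6S run for each band and scene.

```python
def surface_reflectance(radiance, xa, xb, xc):
    """Apply 6S correction coefficients to a TOA radiance value."""
    # Remove the atmospheric path contribution and divide out transmittance.
    y = xa * radiance - xb
    # Account for multiple reflections between surface and atmosphere.
    return y / (1.0 + xc * y)


# Hypothetical coefficients and radiance, for illustration only.
rho = surface_reflectance(radiance=50.0, xa=0.003, xb=0.1, xc=0.2)
print(round(rho, 4))
```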
Information on solar and sensor zenith and azimuth angles was extracted from the image metadata files. The datasets for aerosol optical depth and water vapor were retrieved from the MODIS Terra Daily Level–3 (1° × 1°) global atmospheric product (MOD08_D3.051) and the Ozone Monitoring Instrument’s (OMI) Daily Level–3 (0.25° × 0.25°) global gridded product, respectively [34].
3.4. Satellite and In Situ Data Matching
Water properties can vary rapidly in a turbid coastal environment. Therefore, a narrow time window (±3 h) should be considered when matching satellite and in situ water samples [35]. To capture short-term changes in the coastal waters of Hong Kong, a time window of ±2 h of the image acquisition time (9:00 am to 1:00 pm local time) was used to identify collocated samples. Water sampling locations affected by adjacency effects, clouds, scan line errors on ETM+ images, and ship wake effects were excluded. The mean surface reflectance at each sampling station was extracted from a window of 3 × 3 pixels. These criteria resulted in 240 observations (N = 240), with a Chl-a range of 0.30 to 13.0 µg/L and an SS concentration range of 0.5 to 56.0 mg/L. Table 2 shows the satellite and in situ data match-ups for each of the water classes (defined by [11]), as well as the range of Chl-a and SS concentrations for the match-ups.
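The match-up criteria described above can be sketched as two small helpers: a ±2 h time test and a 3 × 3 pixel mean. Function names are hypothetical, and treating a window with any masked (None) pixel as unusable is an assumption standing in for the exclusion of cloud, scan line, and ship wake pixels.

```python
from datetime import datetime, timedelta


def within_window(sample_time, overpass_time, hours=2.0):
    """True if the in situ sample lies within +/- `hours` of the overpass."""
    return abs(sample_time - overpass_time) <= timedelta(hours=hours)


def mean_3x3(band, row, col):
    """Mean of the 3 x 3 window centred on (row, col).

    Returns None at the image edge or if any pixel in the window is
    masked (None); this masking rule is an illustrative assumption.
    """
    if row < 1 or col < 1 or row > len(band) - 2 or col > len(band[0]) - 2:
        return None
    window = [band[r][c]
              for r in range(row - 1, row + 2)
              for c in range(col - 1, col + 2)]
    if any(value is None for value in window):
        return None
    return sum(window) / 9.0


# Toy reflectance grid and times, for illustration only.
grid = [[0.01, 0.02, 0.03],
        [0.04, 0.05, 0.06],
        [0.07, 0.08, 0.09]]
overpass = datetime(2010, 1, 1, 10, 45)
sample = datetime(2010, 1, 1, 9, 30)
if within_window(sample, overpass):
    print(mean_3x3(grid, 1, 1))
```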
3.5. Modeling of Chl-a and SS Concentrations
Empirical predictive modeling (EPM) and neural network (NN) techniques were used to develop models for the estimation of Chl-a and SS concentrations from Landsat TM/ETM+ and HJ-1 CCD images. Satellite data from different years and months were included to capture the variability of WQPs over the study area. Two types of models were developed, i.e., regional models for the entire coastal area of Hong Kong and local models specific to each water class.
For modeling the entire coastal area of Hong Kong (regional models), 200 observations (from 2000–2010) were used in the model development and 40 observations (from 2011–2012) were used for validation. The mean Chl-a concentration for the model development dataset was 3.23 µg/L with a standard deviation (StDev) of 2.94 µg/L, while the mean for the validation dataset was 1.73 µg/L with a StDev of 1.84 µg/L. The mean SS concentration of the model development dataset was 5.86 mg/L with a StDev of 6.46 mg/L, while the mean of the validation dataset was 5.09 mg/L with a StDev value of 3.52 mg/L. The high standard deviation values are due to the pronounced spatial variability of Chl-a and SS concentrations over the entire area of Hong Kong’s coastal region, from the South China Sea in the east to the Pearl River Delta in the west, reflecting the complexity of Hong Kong waters. This high variability is similar in both the development and validation datasets.
For the local class-specific models, the dataset used for EPM and NN was divided into two segments: a training segment, which included 70% of the data, and a validation segment, which included the remaining 30% of the data and was used to examine the model efficiency. Both subsets were randomly selected from all 57 images and covered the entire observed range of Chl-a and SS concentrations within each class.
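The two partitioning schemes can be sketched as follows: a split by acquisition year for the regional models and a random 70/30 split for the class-specific models. The record layout (a dict with a "year" key), the function names, and the fixed random seed are illustrative assumptions.

```python
import random


def split_by_year(observations, last_development_year=2010):
    """Regional scheme: earlier years for development, later years for validation."""
    development = [o for o in observations if o["year"] <= last_development_year]
    validation = [o for o in observations if o["year"] > last_development_year]
    return development, validation


def random_split(observations, train_fraction=0.7, seed=0):
    """Class-specific scheme: random split into training and validation subsets."""
    shuffled = list(observations)
    random.Random(seed).shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


# Toy records, for illustration only.
records = [{"year": 2000 + i, "chl_a": 1.0 + i} for i in range(13)]
development, validation = split_by_year(records)
train, test = random_split(records[:10])
print(len(development), len(validation), len(train), len(test))
```

A plain random split does not by itself guarantee that both subsets span the full concentration range within a class, so that property would need to be checked after splitting.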
3.5.1. Empirical Predictive Modeling (EPM)
For the EPM of Chl-a and SS concentrations, the correlations of these two parameters with the surface reflectance (after atmospheric correction) in the first four bands of the TM/ETM+ and HJ-1 CCD sensors were determined first. No particular band or band ratio consistently shows a good correlation between in situ-measured WQPs (Chl-a and SS concentrations) and remotely-sensed data for every geographical location. Therefore, the relationship between in situ WQPs and water surface reflectance was explored before developing a model. Band transformations, including addition, subtraction, multiplication, division, averaging, and logarithm, were also considered in the analysis (e.g., [36]).
To select the best independent variables, variables with a p-value of less than or equal to 0.05 and a correlation coefficient (R) of ≥0.50 were considered in the EPM development. An independent variable in empirical predictive modeling was selected based on R, standard error, t-test, p-value, and the residual. If the value of R was close to 1, and the values of standard error (S), t-test, p-value, and residual were close to zero, or negative, then that band, or combination of bands, was marked as potentially meaningful.
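A minimal sketch of the screening step is given below: it builds candidate predictors from single bands and pairwise band transformations, then keeps those whose correlation with the in situ values meets the R ≥ 0.50 threshold. Several details are assumptions made for illustration: the function and band names, the use of the absolute value of R, the restriction to pairwise combinations, and the omission of the p-value, standard error, and residual checks, which would normally come from a statistics package.

```python
import math
from itertools import combinations


def pearson_r(x, y):
    """Pearson correlation coefficient; returns 0.0 for a constant series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0


def candidate_predictors(bands):
    """Single bands, their logarithms, and pairwise band transformations."""
    out = {}
    for name, values in bands.items():
        out[name] = values
        out[f"log({name})"] = [math.log(v) for v in values]
    for (n1, v1), (n2, v2) in combinations(bands.items(), 2):
        out[f"{n1}+{n2}"] = [a + b for a, b in zip(v1, v2)]
        out[f"{n1}-{n2}"] = [a - b for a, b in zip(v1, v2)]
        out[f"{n1}*{n2}"] = [a * b for a, b in zip(v1, v2)]
        out[f"{n1}/{n2}"] = [a / b for a, b in zip(v1, v2)]
        out[f"mean({n1},{n2})"] = [(a + b) / 2 for a, b in zip(v1, v2)]
    return out


def screen(bands, target, r_min=0.5):
    """Keep candidates with |R| >= r_min, strongest first."""
    scored = [(name, pearson_r(values, target))
              for name, values in candidate_predictors(bands).items()]
    kept = [(name, r) for name, r in scored if abs(r) >= r_min]
    return sorted(kept, key=lambda item: -abs(item[1]))


# Toy reflectances and a synthetic target, for illustration only.
bands = {"B1": [0.02, 0.03, 0.04, 0.05], "B3": [0.01, 0.02, 0.04, 0.08]}
target = [2.0 * b1 / b3 + 1.0 for b1, b3 in zip(bands["B1"], bands["B3"])]
for name, r in screen(bands, target):
    print(f"{name}: R = {r:.3f}")
```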
3.5.2. Neural Network Modeling (NN)
A multilayer perceptron NN was adopted for the retrieval of Chl-a and SS concentrations from Landsat TM/ETM+ and HJ-1 CCD images over the study area. To achieve the best possible output, the network was trained using a supervised learning technique, i.e., training with prior knowledge of the desired output (Chl-a and SS concentrations) corresponding to a set of input data. In the training phase, a relationship between the input and the desired output was established based on the weight values for each connection. The weights were iteratively adjusted to minimize the error, computed as the mean square difference between the modeled and the actual output. The training set consisted of in situ-measured Chl-a and SS concentrations and the Landsat TM/ETM+ and HJ-1 CCD reflectance values from bands B1 to B4. The inputs were fed into the network, which calculated an output. This output was compared with the actual output, and the difference between the two, called the network error, was calculated. The network error was then reduced iteratively through back-propagation, i.e., the network adjusted its internal weights, starting with the output layer and working back through the network.
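The training loop described above can be sketched with a small multilayer perceptron written in plain Python. This is not the network used in the study: the architecture (one hidden layer of four tanh units, a linear output), the learning rate, the number of epochs, and the synthetic data are all assumptions chosen only to show the forward pass, the network error, and the back-propagated weight updates.

```python
import math
import random


def train_mlp(X, y, hidden=4, lr=0.05, epochs=500, seed=0):
    """Train a one-hidden-layer perceptron by per-sample back-propagation."""
    rng = random.Random(seed)
    n_in = len(X[0])
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    W2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0

    def forward(x):
        h = [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
             for row, b in zip(W1, b1)]
        return h, sum(w * a for w, a in zip(W2, h)) + b2

    def mse():
        return sum((forward(x)[1] - t) ** 2 for x, t in zip(X, y)) / len(X)

    history = [mse()]  # mean square error before training
    for _ in range(epochs):
        for x, t in zip(X, y):
            h, out = forward(x)
            err = out - t  # network error for this sample
            # Propagate the error back to the hidden layer (tanh derivative).
            d_hidden = [err * W2[j] * (1.0 - h[j] ** 2) for j in range(hidden)]
            # Update the output layer first, then the hidden layer.
            for j in range(hidden):
                W2[j] -= lr * err * h[j]
            b2 -= lr * err
            for j in range(hidden):
                b1[j] -= lr * d_hidden[j]
                for i in range(n_in):
                    W1[j][i] -= lr * d_hidden[j] * x[i]
        history.append(mse())
    return (lambda x: forward(x)[1]), history


# Synthetic four-band inputs (assumed pre-scaled to 0-1) and a made-up target.
data_rng = random.Random(1)
X = [[data_rng.random() for _ in range(4)] for _ in range(20)]
y = [0.5 * x[0] + 0.3 * x[2] for x in X]
predict, history = train_mlp(X, y)
print(f"MSE before training: {history[0]:.4f}, after: {history[-1]:.4f}")
```

Tracking the error per epoch, as `history` does here, is a simple way to confirm that the weight adjustments are actually reducing the network error.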
3.6. Performance Evaluation
The models developed to estimate Chl-a and SS concentrations using EPM and NN were validated. The validation was based on five statistical parameters, namely the Pearson correlation coefficient (R, Equation (7)), RMSE (Equation (8)) [37], mean absolute error (MAE, Equation (9)), bias (ψ, Equation (10)), and scattering of data points (|ψ|, Equation (12)) [38]. R is a measure of the correlation between the observed and the predicted datasets. RMSE measures the difference between observed and predicted values. MAE measures the magnitude of the mean error, ψ determines the bias, and |ψ| indicates the scattering of data points. These parameters were calculated using Microsoft Excel (Microsoft Corporation, Redmond, WA, USA).
In these equations, n is the number of observations, and xi and yi are the estimated and observed concentrations of Chl-a and SS, respectively.
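For readers who prefer a scripted alternative to a spreadsheet, the sketch below computes three of the five statistics (R, RMSE, and MAE) from their standard textbook definitions, with x as the estimated and y as the observed values. The bias ψ and the scatter |ψ| are left out because their defining equations are not reproduced in this text. The function names and the sample values are illustrative assumptions.

```python
import math


def pearson_r(est, obs):
    """Pearson correlation coefficient between estimated and observed values."""
    n = len(est)
    mean_x, mean_y = sum(est) / n, sum(obs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(est, obs))
    sxx = sum((x - mean_x) ** 2 for x in est)
    syy = sum((y - mean_y) ** 2 for y in obs)
    return sxy / math.sqrt(sxx * syy)


def rmse(est, obs):
    """Root mean square error."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(est, obs)) / len(est))


def mae(est, obs):
    """Mean absolute error."""
    return sum(abs(x - y) for x, y in zip(est, obs)) / len(est)


# Made-up estimated and observed concentrations, for illustration only.
estimated = [1.0, 2.0, 3.0, 4.0]
observed = [1.5, 2.5, 2.5, 4.5]
print(pearson_r(estimated, observed), rmse(estimated, observed), mae(estimated, observed))
```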
5. Discussion and Conclusions
The primary objective of this study was to evaluate empirical and machine learning methods for the estimation of coastal water quality parameters. The study used a 13-year dataset from the Landsat TM/ETM+ and HJ-1 CCD sensors, coincident with in situ Chl-a and SS concentration data collected within 2 h of the satellite overpass. Two types of models, based on empirical predictive and neural network approaches, were developed for accurate estimation and mapping of Chl-a and SS concentrations at both regional (representing the entire coastal area of Hong Kong) and local (specific to each water class) scales. For regional estimation, retrieval accuracies of 93% (97%) and 83% (78%) were observed for Chl-a (SS) concentrations using the neural network and empirical predictive models, respectively. For local class-specific retrievals, the neural networks also outperformed the empirical predictive models, with accuracies of 60–94% for Chl-a and 81–94% for SS, compared with 3–63% for Chl-a and 52–87% for SS. Although the regional neural network gave the better overall estimates, the Chl-a and SS concentrations estimated for each class were also well correlated with in situ measurements, suggesting that a class-specific (local) neural network is suitable for remote sensing-based routine monitoring of Chl-a and SS concentrations over the complex coastal environment of Hong Kong.
For all class-specific models (Chl-a or SS), the empirical predictive models performed poorly compared with the neural networks, which performed well in estimating the Chl-a and SS concentrations for each water class. The adjustment of the weight values between the input, hidden, and output layers reduced the network error and led to a satisfactory performance of the neural networks. A comparison of the correlation coefficients of the class-specific neural networks and empirical predictive models is presented in Figure 5. Overall, a higher correlation was observed for the neural network models for both the training and validation datasets. The empirical predictive models show a lower correlation throughout, but there is also a notable decline in Class 4 for the neural network models when estimating Chl-a concentrations (Figure 5a). This decline may result from the higher variance (11.63) in Class 4 compared with the other water classes (the variance for Classes 1, 2, 3, and 5 is 3.54, 7.69, 6.45, and 4.92, respectively), which indicates that the data are spatially variable over this class.
Overall, this study found machine learning methods (i.e., NN) to be 19% and 10% more efficient than empirical predictive models for the estimation of SS and Chl-a, respectively, in complex water bodies. Therefore, this study suggests the use of machine learning methods for more accurate estimation of coastal water quality parameters in routine monitoring. Hence, the method proposed in this study will not only benefit the scientific community but will also help policy-makers to protect oceanic and coastal environments based on reliable estimates of Chl-a and SS concentrations.