1. Introduction
The coastal and marine environment provides valuable natural resources for food, transportation, and recreation. It also hosts an invaluable biodiversity that forms complex ecosystems. Due to near-shore development, coastal and marine ecosystems are being threatened by anthropogenic impacts including fisheries, aquaculture, shipping, urbanization and tourism, resulting in more frequent harmful algal bloom (HAB) events. HABs, also termed red tides, have toxic and harmful effects on humans, fish, birds and marine ecosystems. In recent decades, HABs have increased in frequency and spatial extent worldwide [1]. In China, HABs are a national concern because they have occurred in every coastal province and their occurrences are on the rise [2]. Shenzhen, one of the most developed areas in China, is also seriously affected by HAB occurrences. Algal blooms have occurred more frequently in Shenzhen in recent decades [3,4], due to both natural and anthropogenic sources. Rapid monitoring and accurate forecasting of HABs are important tasks for local management agencies to protect marine resources.
A number of efforts have been made to detect the location and intensity of HABs and to forecast their development. Detection of HABs relies strongly on in situ observations. Benefiting from the rapid development of real-time sensor and communication technologies, many in situ monitoring systems have been developed to detect HABs by measuring marine water and environmental parameters in real time. Remote sensing also plays an important role in detecting HABs. Several water column parameters can be monitored by satellites, such as chlorophyll-a (Chl-a) concentration, phytoplankton biomass, water transparency, and total suspended matter (TSM) concentration. The data obtained from remote sensing of ocean properties are called ocean color data. Chl-a is usually used as a health indicator of coastal and marine waters. Many studies have estimated Chl-a by using ocean color sensors, such as the Sea-Viewing Wide Field-of-View Sensor (SeaWiFS) [5], the Medium-Resolution Imaging Spectrometer (MERIS) [6], and the Moderate-Resolution Imaging Spectroradiometer (MODIS) [7]. These sensors are specially designed for the assessment of ocean water quality parameters; however, their coarse spatial resolution (250–1000 m) limits their use in coastal waters. Landsat sensors, including the Thematic Mapper (TM) [8], Enhanced Thematic Mapper Plus (ETM+) and Operational Land Imager (OLI), are designed for terrestrial applications, but they can also be used for studying coastal waters due to their high spatial resolution (30 m) [9,10].
Short-term forecasting of coastal HABs mainly relies on numerical models. Numerical modelling enables the prediction of occurrence timing, spatial extent and magnitude of a possible HAB event [11]. Many models and commercial software have been developed for numerical ocean modelling, including Finite-Volume Community Ocean Model (FVCOM) [12], MIKE3, Environmental Fluid Dynamics Code (EFDC), Princeton Ocean Model (POM), and Delft3D [13]. Despite the increasing accuracy of HABs forecasts, there is uncertainty in all predictions, resulting from inevitable uncertainties in the initial and boundary conditions and necessary approximations in the construction of a numerical model of the real ocean system. To reduce model uncertainties, data assimilation methods are increasingly used in ocean modeling. The core idea behind data assimilation is to integrate model dynamics and measured data, thereby providing a better simulation that is consistent with the observed fields [14]. The most widely used data assimilation techniques include nudging, optimal interpolation (OI) [15], Ensemble Kalman Filter (EnKF) [16], and Particle Filter [17].
Ocean observation data and modeling results are usually large-volume, multi-source, and multi-dimensional. It is often difficult for end-users to understand, analyze and utilize these complex data sets. Bringing these data sets into one system and visualizing them in a geospatial context would greatly improve the understanding and management of HABs. Recent progress in Web-Based Geographic Information System (GIS) technologies provides a great opportunity to achieve this goal. A number of efforts have been made to develop online systems for coastal water management. For example, Kulawiak et al. introduced a Web-Based GIS system for monitoring and forecasting oil spills [18]. Qin and Lin developed a coastal seiche monitoring and forecasting system based on a Web-Based GIS platform [19]. Zhang et al. developed a system to forecast short-term algal blooms in Lake Taihu [20].
Despite the significant progress that has been made in monitoring techniques, numerical ocean modeling, remote sensing of ocean color and GIS, few systems integrate these techniques and models into a single platform for HAB detection and forecasting. This study introduces the design and development of a Web-Based GIS system named HABs Monitoring and Forecasting System (HMFS). The system is intended to rapidly detect and accurately predict coastal HAB events in Shenzhen city. HMFS integrates in situ observations, a remote-sensing-based model, a hydrodynamic and water quality model, and Web-Based GIS techniques into one environment. The in situ sensors and remote sensing model provide continuous monitoring of the coastal water conditions. The numerical models provide short-term prediction and early warning of HABs. By leveraging Web-Based GIS techniques and Service-Oriented Architecture (SOA), the web portal of HMFS provides a graphical interface for users to view real-time data and remote sensing maps, explore numerical model outputs and get early warning information.
The remainder of this paper is organized as follows. Section 2 introduces the study area. Section 3 introduces the framework of the system, including system architecture and core system components. Section 4 describes the web portal of the system with the application case in Shenzhen. Conclusions are provided in Section 5.
2. Study Area
Shenzhen, a typical coastal city in China, covers an area of 1997 km². Shenzhen is located on the eastern shore of the Pearl River Estuary (PRE) and lies immediately north of Hong Kong. The coastal areas of Shenzhen consist of part of the PRE and three major bays, i.e., Shenzhen Bay, Mirs Bay and Daya Bay. Figure 1 shows the coastal areas of Shenzhen. The coastal environment of Shenzhen is complicated by the large freshwater discharges from the PRE and from polluted urban areas. The city has a subtropical marine climate. The annual average temperature is approximately 22 °C and the annual precipitation is 1933 mm.
In 1980, Shenzhen became the first of China’s five Special Economic Zones. It then became the fastest growing city in China. Due to rapid population growth and economic development, pollutant loadings into the coastal water increased greatly, and algal blooms have occurred more frequently in the last three decades. A total of 162 HAB events were recorded in Shenzhen coastal areas from 1981 to 2014 [4]. The HAB events occurred most frequently from March to April. These algal blooms resulted in a large number of fish deaths and ecological damage. More than 40 algal species have been recorded to form red tides in Shenzhen, but most of them are harmless. The dominant algal species include Skeletonema costatum, Karenia mikimotoi, Gyrodinium instriatum, and Noctiluca scintillans. Algal species that cause harmful effects on fish include Karenia digitata, Chattonella marina and Chattonella marina var. ovata.
3. Framework of the System
3.1. System Architecture
HMFS was designed based on the concept of Service-Oriented Architecture (SOA). SOA is a conceptual architecture built from a collection of web services. These web services are loosely coupled and interoperable in a distributed computing environment. Figure 2 illustrates the service-oriented, multi-layer architecture of the system. The architecture comprises four layers: data layer, model layer, service layer, and presentation layer (or client layer).
The bottom layer contains a variety of data sources. The data sources can be grouped into two categories: satellite images and in situ observations. The satellite images (i.e., Landsat and MODIS) are regularly retrieved from online data repositories using a Python script. The retrieved satellite images are processed and stored on the web server. The in situ observations, measured at buoys, tide stations, hydrologic stations and meteorological stations, are retrieved from the online system and stored in the geodatabase. In particular, real-time data from in situ sensors are stored in the geodatabase following the Observations Data Model (ODM) specification [21]. ODM provides a standard to store and manipulate point observational data in a relational database.
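To make the idea of storing point observations in a relational database concrete, the following is a minimal sketch using Python's built-in sqlite3 module. It is not the HMFS geodatabase: the schema is a heavily simplified, ODM-inspired subset chosen for illustration, and the site code, variable code and values in the usage note are hypothetical.

```python
import sqlite3

# Simplified, ODM-inspired schema (an assumption for illustration; the full
# ODM specification defines many more tables and columns).
SCHEMA = """
CREATE TABLE Sites (
    SiteID    INTEGER PRIMARY KEY,
    SiteCode  TEXT UNIQUE NOT NULL,
    Latitude  REAL,
    Longitude REAL
);
CREATE TABLE Variables (
    VariableID   INTEGER PRIMARY KEY,
    VariableCode TEXT UNIQUE NOT NULL,
    Units        TEXT
);
CREATE TABLE DataValues (
    ValueID     INTEGER PRIMARY KEY,
    SiteID      INTEGER NOT NULL REFERENCES Sites(SiteID),
    VariableID  INTEGER NOT NULL REFERENCES Variables(VariableID),
    DateTimeUTC TEXT NOT NULL,
    DataValue   REAL NOT NULL
);
"""


def create_store():
    """Create an in-memory database holding the simplified schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def add_observation(conn, site_code, variable_code, time_utc, value):
    """Insert one observation, resolving site and variable by their codes."""
    site_id = conn.execute(
        "SELECT SiteID FROM Sites WHERE SiteCode = ?", (site_code,)
    ).fetchone()[0]
    variable_id = conn.execute(
        "SELECT VariableID FROM Variables WHERE VariableCode = ?", (variable_code,)
    ).fetchone()[0]
    conn.execute(
        "INSERT INTO DataValues (SiteID, VariableID, DateTimeUTC, DataValue) "
        "VALUES (?, ?, ?, ?)",
        (site_id, variable_id, time_utc, value),
    )


def time_series(conn, site_code, variable_code):
    """Return (time, value) pairs for one site and variable, ordered by time."""
    return conn.execute(
        "SELECT d.DateTimeUTC, d.DataValue FROM DataValues d "
        "JOIN Sites s ON s.SiteID = d.SiteID "
        "JOIN Variables v ON v.VariableID = d.VariableID "
        "WHERE s.SiteCode = ? AND v.VariableCode = ? "
        "ORDER BY d.DateTimeUTC",
        (site_code, variable_code),
    ).fetchall()
```

After registering a site such as "BUOY01" and a variable such as "CHLA" in the Sites and Variables tables, add_observation stores each incoming reading and time_series returns the ordered series that a time series viewer would plot. Keeping sites, variables and values in separate tables is what lets one query pattern serve every sensor type.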
The model layer forms the backbone of the system. It contains three types of models. The numerical hydrodynamic model coupled with a water quality model is driven by meteorological forecast data. The numerical model performs short-term prediction of hydrodynamic variables (e.g., water levels, currents, salinity, and water temperature) and water quality variables (e.g., Chl-a and dissolved oxygen). The remote-sensing-based Chl-a estimation models estimate the spatial distribution of Chl-a by using machine learning algorithms. Currently, the available machine learning algorithms include Support Vector Regression (SVR) and Artificial Neural Network (ANN). Maps from remote sensing are saved directly in the geodatabase as rasters. The data assimilation model assimilates in situ and remote sensing observations into the numerical model by using the EnKF method, thereby reducing the mismatch between the model and observations.
The service layer provides web services through which the client can communicate with the server and retrieve data from the geodatabase. The geodatabase is a central data repository to store and manage all kinds of data, including time series at in situ stations, remote sensing maps and numerical model outputs. Geospatial data in the geodatabase are shared through ArcGIS Server web services. ArcGIS Server is a map server developed by Environmental Systems Research Institute (ESRI), Inc. It provides high-performance functionalities for end-users and developers to create, analyze and manage massive geospatial data. ArcGIS Server enables clients to access data over the Internet through standard Open Geospatial Consortium (OGC) web services, including Web Map Service (WMS) and Web Feature Service (WFS). HMFS provides a variety of functionalities to query and retrieve point observations in the geodatabase. These functionalities are exposed to clients through Simple Object Access Protocol (SOAP) web services. In HMFS, the web server was built on Microsoft Internet Information Services (IIS) 10.0. Microsoft SQL Server 2016 and Arc Spatial Database Engine (ArcSDE) were used to store and manage geospatial data and time series of observational data.
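As an illustration of how a client might request a map from an OGC WMS endpoint, the sketch below assembles a WMS 1.3.0 GetMap URL from the standard request parameters. The base URL and layer name in the usage note are hypothetical placeholders, not the actual HMFS endpoints.

```python
from urllib.parse import urlencode


def build_getmap_url(base_url, layer, bbox, width, height,
                     crs="EPSG:4326", image_format="image/png"):
    """Assemble a WMS 1.3.0 GetMap request URL.

    bbox is (min1, min2, max1, max2) in the axis order of the CRS. For
    EPSG:4326 under WMS 1.3.0 that order is latitude first, then longitude.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty string requests the server's default style
        "CRS": crs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
        "TRANSPARENT": "TRUE",
    }
    return base_url + "?" + urlencode(params)
```

For example, build_getmap_url("https://example.org/wms", "chla", (22.0, 113.5, 23.0, 114.8), 512, 512) yields a URL that a browser-based map viewer could load as an image overlay. Because the request is a plain HTTP query string, any WMS-capable client can consume the same service.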
The presentation layer contains a web portal. The web portal was developed based on Microsoft ASP.NET. By making use of Rich Internet Applications (RIAs) technique, the web portal provides users with a highly interactive interface. The web portal contains a variety of tools, including map viewer, time series viewer, coastal watch, HABs forecast, and HAB warning bulletin. The web portal can be accessed through a common browser such as Internet Explorer.
3.2. In Situ Observation
The Shenzhen Planning and Natural Resources Bureau (SPNRB) has constructed a multi-sensor monitoring system to monitor coastal water conditions, including tide and water quality. Currently, the monitoring system consists of thirteen surface buoys and four tidal stations that are distributed along the coastline (see Figure 1). The real-time measurements from these in situ sensors are retrieved from the information system of SPNRB and stored in the geodatabase. In the study area, the Hong Kong Environmental Protection Department (EPD) has also operated a water quality monitoring system since 1986. The system consists of 94 routinely sampled stations for marine waters and 60 for bottom sediments, reporting marine water quality on a monthly basis. These monthly water quality measurements are very valuable for studying and understanding coastal and marine waters in Shenzhen and Hong Kong. Thus, these measurements are also routinely retrieved from the EPD website and archived in the geodatabase of the system.
3.3. Remote Sensing Based Estimation of Chlorophyll-a
A potential algal bloom can be detected from anomalously high Chl-a concentrations in coastal waters [22]. Following this idea, Chl-a concentration was used as an indicator to detect and monitor HABs. Two types of remote sensing products are used to monitor HABs in HMFS. The first type adopts the standard MODIS Chl-a data products, which are generated and distributed by the Ocean Biology Processing Group (https://oceancolor.gsfc.nasa.gov) of the National Aeronautics and Space Administration (NASA). The algorithm used to estimate surface water Chl-a in the NASA products combines a three-band difference algorithm for low Chl-a waters [23] with a band-ratio algorithm for high Chl-a waters [24]. The MODIS Chl-a data products have frequent revisit times (1–2 days) but low spatial resolution (1 km).
To provide a Chl-a distribution map with higher spatial resolution, Landsat images were also used in HMFS. Landsat sensors, including the Thematic Mapper (TM) of Landsat-5 (L5), the Enhanced Thematic Mapper Plus (ETM+) of Landsat-7 (L7) and the Operational Land Imager (OLI) of Landsat-8 (L8), were used to estimate Chl-a in coastal waters. All the sensors have a spatial resolution of 30 m. Top of the Atmosphere (TOA) data from L8, L7, and L5 were automatically downloaded from the Earth Explorer website (http://earthexplorer.usgs.gov/) through a Python routine, which is available at https://github.com/olivierhagolle/LANDSAT-Download. To remove atmospheric disturbances, atmospheric correction must be performed on the TOA data. Currently, a number of atmospheric correction methods are available. In HMFS, FLAASH (Fast Line-of-sight Atmospheric Analysis of Spectral Hypercubes) [25,26], an atmospheric correction modeling tool contained in the ENVI (The ENvironment for Visualizing Images) software, was used to retrieve spectral reflectance (SR) from TOA images. FLAASH is one of the most popular atmospheric correction programs. It was developed based on the radiative transfer code MODTRAN 4 (MODerate spectral resolution atmospheric TRANsmittance).
After the atmospheric correction, the spectral reflectance images were used to estimate Chl-a concentration by using machine learning algorithms. The machine learning algorithms establish a relationship between in situ Chl-a concentrations and surface reflectance through training sets and then predict the spatial distribution of Chl-a concentrations via the established relationship. In HMFS, two machine learning models were specially developed for Shenzhen coastal water: Support Vector Regression (SVR) [27] and Artificial Neural Network (ANN). SVR is a popular machine learning algorithm for regression. SVR is able to deal with small sample sizes and with non-linear and high-dimensional problems. It uses the same principle as the well-known Support Vector Machine (SVM), with minor differences. SVR develops a linear dependency between n input variables and one target variable by fitting an optimized hyperplane. The remote sensing scenes over the Shenzhen coastal areas are often contaminated by cloud. Thus, the number of training samples is limited. Nevertheless, SVR has good generalization ability even when the number of training samples is small. In this study, a Gaussian kernel was used in the SVR model. Several key parameters of SVR, including the regularization parameter (C) and the loss function parameter (ε), were tuned through training until the best prediction capability was achieved. ANN is another popular machine learning approach. It has been increasingly used to forecast algal blooms because of its capability to model complex relationships [28,29]. The ANN applied in this study consists of an input layer with Np nodes, a hidden layer with Nh nodes, and an output layer with one node. Np is the number of input variables, which depends on the selected surface reflectance bands. Nh is the number of neuron nodes in the hidden layer. The tangent sigmoid function was chosen as the transfer function between the hidden and output layers.
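The building blocks named above can be sketched in a few lines of plain Python. The block shows a Gaussian kernel, the ε-insensitive loss that SVR conventionally uses, and a forward pass through a network with Np inputs, Nh hidden nodes and one output. This is an illustrative sketch, not the trained HMFS models: the kernel width gamma, all weights and biases are hypothetical, and applying the tangent sigmoid at the hidden nodes with a linear output node is an assumption about how the transfer function is placed.

```python
import math


def gaussian_kernel(x, z, gamma):
    """Gaussian (RBF) kernel between two reflectance vectors.

    gamma is a hypothetical kernel-width parameter.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Loss is zero inside the epsilon tube and grows linearly outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


def ann_forward(reflectance, w_hidden, b_hidden, w_out, b_out):
    """Forward pass: Np inputs -> Nh tanh hidden nodes -> one linear output.

    w_hidden has Nh rows of Np weights; b_hidden and w_out have Nh entries.
    The tangent sigmoid is mathematically identical to tanh.
    """
    hidden = [
        math.tanh(sum(w * x for w, x in zip(row, reflectance)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out
```

In training, C trades off the flatness of the fitted function against the total ε-insensitive loss, which is why both parameters have to be tuned together. The forward pass shows why Np is fixed by the number of selected reflectance bands while Nh remains a free design choice.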
3.4. Numerical Models for HAB Forecasting
The numerical models for HAB forecasting consist of a hydrodynamic model, a water quality model and a data assimilation model. FVCOM (version 3.1.6) was used to establish the hydrodynamic model. FVCOM is an unstructured-grid, finite-volume ocean model with a terrain-following coordinate system in the vertical and a triangular grid in the horizontal. It can be easily customized because it is open source. FVCOM enables modelling of hydrodynamic variables including water level, current, water temperature, and salinity. A great advantage of FVCOM is that its unstructured triangular grids can accurately fit complex coastlines, such as those of the Pearl River Estuary and the Shenzhen coastal areas. FVCOM is numerically solved using a split-mode method. The source code of FVCOM is parallelized, and the Message Passing Interface (MPI) is used as the message passing system. FVCOM has been successfully applied to coastal areas [30,31], estuaries [32], and data assimilation studies [14,33].
The modeling domain and discretized grid in this study are presented in Figure 3. The domain includes the Pearl River Estuary and the coastal areas around Shenzhen and Hong Kong. Horizontally, the grid consists of 15,351 nodes and 28,880 triangular cells. The spatial resolution of the horizontal cells ranges from 400 m to 1500 m. The triangular cells along the coast have finer resolution, while the cells at the offshore open boundary have a coarser resolution. Vertically, the model has nine terrain-following layers. The model bathymetry is derived from the General Bathymetric Chart of the Oceans. Triangular cells with a depth of less than 1 m account for less than 1% of the grid; therefore, the minimum depth was set to 1 m. The modified Mellor and Yamada level-2.5 (MY-2.5) scheme was chosen as the turbulent closure scheme. Parameters of the MY-2.5 scheme and the non-dimensional viscosity parameter in the Smagorinsky formula were set according to those used in other similar FVCOM models. On the basis of the Courant–Friedrichs–Lewy (CFL) criterion, an integration time step of 5 s was set for the external mode and the time step of the internal mode was set to 50 s [34,35].
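The role of the CFL criterion can be illustrated with a simplified shallow-water bound on the external-mode time step, dt ≤ Δx / (√(gD) + U). The sketch below evaluates that bound; it is not necessarily the exact form applied in this study, and the depth and current speed in the usage note are assumed values chosen only for illustration.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def max_external_time_step(min_cell_size_m, max_depth_m, max_current_ms=0.0):
    """Simplified CFL bound for the external (barotropic) mode, in seconds.

    The fastest signal is the surface gravity wave, sqrt(g * D), plus the
    advecting current U. It must not cross the smallest cell in one step.
    """
    signal_speed = math.sqrt(G * max_depth_m) + max_current_ms
    return min_cell_size_m / signal_speed


def split_ratio(internal_step_s, external_step_s):
    """Number of external steps taken per internal step in a split-mode scheme."""
    return internal_step_s / external_step_s
```

With the 400 m minimum cell size from the text and an assumed maximum depth of 50 m and current of 1 m/s, the bound is well above the 5 s external step, which leaves a safety margin. The 50 s internal step and 5 s external step correspond to a split ratio of 10.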
The water quality model used in this study is the Water Analysis Simulation Program (WASP). WASP simulates eight water column state variables: dissolved oxygen (DO), nitrate and nitrite (NO₃ and NO₂), ammonia (NH₃), organic nitrogen (ON), organic phosphorus (OP), inorganic phosphorus (OPO4), carbonaceous biochemical oxygen demand (CBOD), and chlorophyll-a (Chl-a). The governing equations of WASP are described in detail in [22]. WASP was coupled with FVCOM and can be run in an online mode. The parameter values of the WASP model were specified based on literature reviews [4,36,37] and previous WASP applications [38].
The coupled model was driven by meteorological data, including precipitation, air pressure, temperature, relative humidity, and evaporation. The meteorological data were derived from the European Centre for Medium-Range Weather Forecasts (ECMWF) datasets (available at http://www.ecmwf.int). ECMWF produces global numerical weather predictions four times per day. Tidal forcing was considered by specifying eight major tidal constituents at the offshore open boundary. The tidal constituents include M2, N2, S2, K2, K1, O1, P1, and Q1. These tidal constituents were determined by interpolating the 1/6° inverse tidal model results generated by the Oregon State University Tidal Inversion Software (OTIS) [39]. Freshwater and nutrient fluxes from the major river inlets were used as boundary conditions. River flow rate forecasts were generated using machine learning algorithms based on daily discharge observations obtained from the Pearl River Water Resources Commission. Water quality constituent fluxes were derived from weekly measurements at the river inlets, obtained from the Department of Ecology and Environment of Guangdong Province.
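Tidal forcing from harmonic constituents amounts to summing cosine waves, one per constituent, at the open boundary. The sketch below reconstructs a water level from amplitude and phase pairs for the eight constituents named above. The periods are the standard astronomical values; the amplitudes and phases in the usage note are hypothetical, and nodal corrections and the mean water level are deliberately omitted.

```python
import math

# Standard periods (hours) of the eight constituents named in the text.
PERIOD_HOURS = {
    "M2": 12.4206012,
    "S2": 12.0,
    "N2": 12.65834751,
    "K2": 11.96723606,
    "K1": 23.93447213,
    "O1": 25.81933871,
    "P1": 24.06588766,
    "Q1": 26.868350,
}


def tidal_elevation(t_hours, constituents):
    """Sum of harmonic constituents at time t (hours from the reference time).

    constituents maps a constituent name to (amplitude_m, phase_deg).
    Nodal corrections and mean sea level are ignored in this sketch.
    """
    eta = 0.0
    for name, (amplitude_m, phase_deg) in constituents.items():
        omega = 2.0 * math.pi / PERIOD_HOURS[name]  # angular speed, rad/hour
        eta += amplitude_m * math.cos(omega * t_hours - math.radians(phase_deg))
    return eta
```

Calling tidal_elevation(t, {"M2": (0.5, 30.0), "K1": (0.35, 120.0)}) for a sequence of times produces the mixed semidiurnal and diurnal signal that would be imposed at one open-boundary node. In a model, each boundary node receives its own interpolated amplitudes and phases.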
The coupled model was calibrated by comparing simulated hydrodynamic variables (i.e., tidal level and water temperature) and water quality variables (i.e., Chl-a and DO) against observation values. The observational data sets in 2017 were used to perform the calibration. The hydrodynamic model parameters (e.g., roughness coefficients) and the WASP model parameters were adjusted during the calibration until good agreement between the simulation results and the observations was achieved.
For an operational system, it is useful to implement a sequential data assimilation algorithm, through which the observations can be assimilated as they become available. To this end, a popular data assimilation scheme, EnKF, was employed to assimilate the Chl-a concentrations observed at in situ stations and derived from remote sensing into the numerical model. In HMFS, the EnKF ensemble was generated by perturbing the meteorological forcing fields. High-Performance Computing (HPC) environments were employed to run the model, since the numerical models are computationally intensive and data-intensive. Outputs from both the free model run (without assimilation) and the assimilated model run were archived on the HPC cluster in the netCDF (network Common Data Form) file format, which is widely used for storing scientific data. The numerical models produce a 3–5 day forecast of hydrodynamics and Chl-a distribution, and a possible HAB event is identified by comparing the forecasted Chl-a concentration with a predefined threshold. The threshold can be defined by decision-makers. Currently, a potential HAB event in the Shenzhen coastal region is triggered when the Chl-a concentration exceeds 15 μg/L.
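The EnKF analysis step and the threshold check can be sketched in plain Python. The block below implements a stochastic (perturbed-observation) EnKF update for a single scalar observation of one state element, followed by the 15 μg/L check. It is a minimal illustration under stated assumptions rather than the HMFS implementation: the ensemble is passed in ready-made instead of being generated by perturbing meteorological forcing, the observation operator simply picks one state element, and all function names are hypothetical.

```python
import random


def enkf_update(ensemble, obs_index, obs_value, obs_error_var, rng):
    """Stochastic EnKF analysis step for one scalar observation.

    ensemble: list of state vectors (lists of floats), one per member.
    obs_index: index of the state element that is observed directly.
    rng: a random.Random instance used to perturb the observation.
    """
    n = len(ensemble)
    n_state = len(ensemble[0])
    means = [sum(m[j] for m in ensemble) / n for j in range(n_state)]
    # Sample covariance of every state element with the observed element.
    cov_with_obs = [
        sum((m[j] - means[j]) * (m[obs_index] - means[obs_index])
            for m in ensemble) / (n - 1)
        for j in range(n_state)
    ]
    # Kalman gain: covariance divided by (forecast variance + obs error variance).
    gain = [c / (cov_with_obs[obs_index] + obs_error_var) for c in cov_with_obs]
    obs_std = obs_error_var ** 0.5
    analysis = []
    for member in ensemble:
        perturbed_obs = obs_value + rng.gauss(0.0, obs_std)
        innovation = perturbed_obs - member[obs_index]
        analysis.append([x + k * innovation for x, k in zip(member, gain)])
    return analysis


def exceeds_threshold(chla_ug_per_l, threshold=15.0):
    """Flag a potential HAB when Chl-a is above the warning threshold (ug/L)."""
    return chla_ug_per_l > threshold
```

Unobserved state elements are corrected through their ensemble covariance with the observed element, which is how a Chl-a measurement at one station can adjust nearby model cells. Applying exceeds_threshold to each forecasted cell then yields the map of potential HAB locations.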
3.5. Data Post-Processing and Publication
In HMFS, the raw modeling outputs archived on the cluster are pushed to the web server via FTP (File Transfer Protocol) and are automatically processed by a post-processing tool. The tool was developed using C# and deployed on the server side. It converts model outputs in the netCDF format to GIS raster/feature layers and stores the converted data sets in the geodatabase. Table 1 lists the GIS layers automatically generated by the post-processing tool. To provide better visualization, a rendering style is pre-designed for each kind of variable. The raster layers are symbolized by classification/stretch rendering with color ramps. The current field (velocity and direction) is symbolized with arrows. The generated GIS layers are then published via ArcGIS Server.
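Symbolizing a current field with arrows requires a speed and a direction for each cell, which can be derived from the eastward and northward velocity components stored in the model output. The sketch below shows that conversion in Python; the actual post-processing tool was written in C#, and the direction convention used here (degrees clockwise from north, pointing where the water flows to) is an assumption.

```python
import math


def current_speed_direction(u, v):
    """Convert velocity components into arrow attributes.

    u: eastward component (m/s); v: northward component (m/s).
    Returns (speed in m/s, direction in degrees clockwise from north,
    pointing toward where the current flows).
    """
    speed = math.hypot(u, v)
    # atan2(u, v) measures the angle from the north axis toward the east.
    direction = math.degrees(math.atan2(u, v)) % 360.0
    return speed, direction
```

A purely eastward current of 1 m/s gives a direction of 90°, and a purely westward one gives 270°. The speed can drive arrow length or color while the direction sets the arrow rotation.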
5. Conclusions
In this study, we introduced an integrated web-based system for the monitoring and forecasting of coastal HABs. The system provides a web-based environment to monitor and forecast HABs by leveraging remote sensing, numerical modeling, and Web-GIS technologies. The system allows end-users to view the in situ observations, remote sensing maps and numerical model forecasts in a geospatial context. The application of the system can improve the understanding of the hydrodynamic and water quality processes in the Shenzhen coastal areas and may reduce potential economic and health losses arising from the occurrence of HABs events.
The system was successfully applied to detect coastal HAB events using the remote sensing models. The remote sensing models proved effective in identifying anomalously high Chl-a concentrations. They can be used to perform routine monitoring of HAB events in the complex coastal areas of Shenzhen. The forecast results illustrate that the numerical model, in combination with the EnKF assimilation algorithm, is successful in predicting hydrodynamics and water quality processes up to 5 days ahead. By leveraging Web-GIS, RIAs, and SOA technologies, the web portal of the system provides a single map-based interface, in which data of different types (point time series, vector, and raster) can be visualized and analyzed through interactive tools. The source code of the web portal can be found at https://github.com/DeepHydro.
Although the system was specially developed for Shenzhen coastal areas, it can be adapted to other coastal regions where monitoring and forecasting of HABs are required. However, the numerical models have to be established and calibrated for the region of interest, and the software environments (i.e., GIS server and geodatabase) have to be deployed and configured by developers. In future work, routine monitoring of Chl-a based on remote sensing will be improved by employing more sensors, such as the Japanese Himawari-8, the European Space Agency (ESA) Sentinel-2/3 and the Chinese HJ-1 A/B. The numerical models also need to be improved: for example, the hydrodynamic model should better simulate typhoons and tropical storms, and the water quality model should better simulate the marine ecosystem. We noticed that ArcGIS API for Silverlight may not be the best option for the system. ArcGIS API for JavaScript, the new generation of ESRI’s technology for web GIS development, is a better choice since it provides better performance and supports HTML5 (Hypertext Markup Language 5). To meet the needs of mobile Internet applications, we plan to develop a new version of the web portal based on ArcGIS API for JavaScript and HTML5.