Impacts of Missing Buoy Data on LSTM-Based Coastal Chlorophyll-a Forecasting
Abstract
1. Introduction
2. Materials and Methods
2.1. Data Collection and Preprocessing
2.2. Long Short-Term Memory Model
2.3. Patterns of Missing Data
- Edge-missing data: Missing data points at the boundaries of the time series, such as when the most recent buoy data are unavailable for forecasting the Chl concentration on the next day.
- Non-edge-missing data: Missing observations within the time series. For example, if the model uses the last 5 days of data to predict the Chl concentration of the next day and the data on day 3 are missing, this is a case of non-edge-missing data.
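The two patterns can be told apart mechanically from the position of the gaps in the look-back window. The following is a minimal sketch, not the authors' code: the function name `classify_missing` is hypothetical, gaps are assumed to be encoded as `None` or NaN, and a gap is assumed to count as edge-missing when it touches either end of the window.

```python
import math
from typing import Optional, Sequence


def classify_missing(window: Sequence[Optional[float]]) -> str:
    """Label a look-back window (oldest value first) by where its gaps fall.

    Assumption: a gap is "edge" if it is contiguous with the first or the
    last position of the window; every other gap is "non-edge".
    """
    n = len(window)
    is_missing = [v is None or math.isnan(v) for v in window]
    if not any(is_missing):
        return "complete"

    # Length of the missing run that starts at the oldest position.
    lead = 0
    while lead < n and is_missing[lead]:
        lead += 1

    # Length of the missing run that ends at the most recent position,
    # without double counting a window that is missing entirely.
    trail = 0
    while trail < n - lead and is_missing[n - 1 - trail]:
        trail += 1

    interior = sum(is_missing) - lead - trail
    has_edge = (lead + trail) > 0
    if has_edge and interior > 0:
        return "edge and non-edge"
    return "edge" if has_edge else "non-edge"


nan = float("nan")
print(classify_missing([1.2, 0.9, nan, 1.1, 1.4]))  # day 3 of 5 is missing
print(classify_missing([1.2, 0.9, 1.0, 1.1, nan]))  # most recent day is missing
```

With a 5-day window, the first call corresponds to the day-3 example in the list above and the second to the case in which the latest buoy reading is unavailable.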
2.4. Evaluation Criteria
3. Results and Discussion
3.1. Influence of Missing Parameters
3.2. Influences of Discontinuities in Time Series Data
3.3. Importance of Input Parameter Completeness on LSTM Prediction Models
4. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Scenario | No Missing | Missing Chl | Missing DO | Missing Temp | Missing pH |
---|---|---|---|---|---|
r | 0.93 | 0.70 | 0.91 | 0.89 | 0.90 |
RMSE (μg/L) | 1.09 | 2.08 | 1.22 | 1.32 | 1.28 |
Percentage of absolute error <0.5 μg/L | 67% | 36% | 66% | 68% | 66% |
Percentage of absolute error >5 μg/L | 1% | 4% | 1% | 1% | 1% |
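The four criteria reported in the table above (r, RMSE, and the shares of predictions with absolute error below 0.5 μg/L or above 5 μg/L) can be computed from paired observations and predictions. The sketch below is illustrative only: the function name `evaluate` and the sample values are assumptions, and r is taken to be the Pearson correlation coefficient.

```python
import math
from typing import Dict, Sequence


def evaluate(observed: Sequence[float], predicted: Sequence[float]) -> Dict[str, float]:
    """Return r, RMSE, and the error-band shares for paired Chl values (μg/L)."""
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]

    # Root-mean-square error, in the same units as the inputs.
    rmse = math.sqrt(sum(e * e for e in errors) / n)

    # Pearson correlation coefficient (assumed definition of r).
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    r = cov / math.sqrt(var_o * var_p)

    # Fractions of predictions inside the small-error and large-error bands.
    share_small = sum(abs(e) < 0.5 for e in errors) / n
    share_large = sum(abs(e) > 5.0 for e in errors) / n

    return {"r": r, "rmse": rmse, "share_lt_0.5": share_small, "share_gt_5": share_large}


# Made-up values for illustration, not data from the study.
obs = [1.0, 2.0, 3.0, 10.0]
pred = [1.2, 2.1, 2.0, 16.0]
print(evaluate(obs, pred))
```

Reporting the two error-band shares next to RMSE helps separate many small misses from a few large ones, which a single aggregate score would blur.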
Consecutive Missing Days | Measure | RMSE (μg/L) | r | High Value (≥5 μg/L) RMSE | High Value (≥5 μg/L) r | Low Value (<5 μg/L) RMSE | Low Value (<5 μg/L) r |
---|---|---|---|---|---|---|---|
No missing | Evaluation coefficient | 1.09 | 0.93 | 2.65 | 0.76 | 0.57 | 0.85 |
1 day | Evaluation coefficient | 1.22 | 0.91 | 3.00 | 0.73 | 0.60 | 0.84 |
1 day | Improvement rate | −11.9% | −1.4% | −13.6% | −3.3% | −5.5% | −0.9% |
3 days | Evaluation coefficient | 1.31 | 0.89 | 3.30 | 0.64 | 0.58 | 0.85 |
3 days | Improvement rate | −20.0% | −3.5% | −24.8% | −15.4% | −3.0% | −0.2% |
5 days | Evaluation coefficient | 1.35 | 0.89 | 3.43 | 0.63 | 0.56 | 0.86 |
5 days | Improvement rate | −23.1% | −3.8% | −29.7% | −16.9% | 1.6% | 0.3% |
7 days | Evaluation coefficient | 1.38 | 0.89 | 3.47 | 0.64 | 0.59 | 0.85 |
7 days | Improvement rate | −25.8% | −3.8% | −31.0% | −16.2% | −5.1% | −0.5% |
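The improvement rates in the table above appear to be relative changes against the no-missing baseline, signed so that a degradation is negative. The formula is not stated in this excerpt, so the sketch below is an assumption inferred from the tabulated values, and the function name `improvement_rate` is hypothetical.

```python
def improvement_rate(baseline: float, value: float, lower_is_better: bool) -> float:
    """Relative change versus the baseline, in percent.

    Assumed convention: positive means the metric improved. For RMSE a lower
    value is better; for r a higher value is better.
    """
    change = (baseline - value) if lower_is_better else (value - baseline)
    return 100.0 * change / baseline


# Overall RMSE with 1 consecutive missing day versus no missing data.
print(round(improvement_rate(1.09, 1.22, lower_is_better=True), 1))  # -11.9

# Overall r for the same scenario, using the rounded table entries.
print(round(improvement_rate(0.93, 0.91, lower_is_better=False), 1))
```

The RMSE case reproduces the tabulated −11.9%. Other entries recomputed from the rounded coefficients differ somewhat from the published rates (the r case above gives about −2.2% against a tabulated −1.4%). This is consistent with the published rates having been derived from unrounded metrics, although the authors' exact formula may also differ from the one assumed here.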
Scenario | RMSE (μg/L) | r | High Value (≥5 μg/L) RMSE | High Value (≥5 μg/L) r | Low Value (<5 μg/L) RMSE | Low Value (<5 μg/L) r |
---|---|---|---|---|---|---|
Chl parameter elimination | 2.079 | 0.70 | 4.92 | 0.58 | 1.16 | 0.17 |
Chl parameter imputation | 1.462 | 0.88 | 3.73 | 0.59 | 0.60 | 0.85 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Zhang, C.; Ding, W.; Zhang, L. Impacts of Missing Buoy Data on LSTM-Based Coastal Chlorophyll-a Forecasting. Water 2024, 16, 3046. https://doi.org/10.3390/w16213046
Chicago/Turabian Style: Zhang, Caiyun, Wenxiang Ding, and Liyu Zhang. 2024. "Impacts of Missing Buoy Data on LSTM-Based Coastal Chlorophyll-a Forecasting" Water 16, no. 21: 3046. https://doi.org/10.3390/w16213046