Prediction of Algal Chlorophyll-a and Water Clarity in Monsoon-Region Reservoir Using Machine Learning Approaches
Abstract
1. Introduction
2. Materials and Methods
2.1. Study Area
2.2. Analysis of Water Quality Parameters and Rainfall Data
2.3. Regression and Machine Learning Approaches
2.3.1. Multiple Linear Regression (MLR)
2.3.2. Support Vector Machine (SVM)
2.3.3. Artificial Neural Network (ANN)
2.4. K-Fold Cross-Validation and Model Accuracy Metrics
2.4.1. Mean Absolute Error
2.4.2. Root Mean Squared Error
2.4.3. Coefficient of Determination
2.5. Trophic State
2.6. Data Analysis
3. Results
3.1. Water Quality Summary
3.2. Analysis of Hydrology Pattern
3.3. Chlorophyll-a Prediction, Cross-Validation and Trophic State in Different Zones
3.4. Chlorophyll-a Prediction, Cross-Validation and Trophic State in Different Seasons
3.5. Transparency (Secchi Depth) Prediction, Cross-Validation and Trophic State in Different Zones
3.6. Transparency (Secchi Depth) Prediction, Cross-Validation and Trophic State in Different Seasons
4. Discussion
5. Summary
Supplementary Materials
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
Water Quality Parameters | Riverine Zone Mean ± SD (Min–Max) (CV) | Transitional Zone Mean ± SD (Min–Max) (CV) | Lacustrine Zone Mean ± SD (Min–Max) (CV) | Premonsoon Mean ± SD (Min–Max) (CV) | Monsoon Mean ± SD (Min–Max) (CV) | Postmonsoon Mean ± SD (Min–Max) (CV) |
---|---|---|---|---|---|---|
DO (mg/L) | 8.70 ± 2.16 (2.1–17.6) (24.82) | 7.91 ± 2.13 (3.1–13.2) (26.92) | 8.50 ± 2.21 (2.8–15.4) (26) | 9.58 ± 1.71 (3.2–15.4) (17.84) | 7.17 ± 1.91 (2.6–17.6) (26.63) | 7.16 ± 1.96 (2.1–11.5) (27.37) |
BOD (mg/L) | 2.05 ± 0.49 (1.2–4) (23.90) | 2.04 ± 0.51 (1.2–3.6) (25) | 2.12 ± 0.50 (1.2–4) (23.58) | 2.0 ± 0.52 (1.2–4) (26) | 2.12 ± 0.40 (1.2–3.2) (18.86) | 2.16 ± 0.50 (1.3–4) (23.14) |
TSS (mg/L) | 4.51 ± 3.67 (0.5–36) (81.37) | 3.70 ± 1.63 (0.2–19) (44.05) | 3.91 ± 2.90 (0.5–14.4) (74.16) | 3.54 ± 3.63 (0.2–36) (102.54) | 4.98 ± 3.40 (0.5–15.4) (68.27) | 4.31 ± 2.70 (0.3–14.6) (62.64) |
TN (mg/L) | 1.69 ± 0.41 (0.79–2.99) (24.26) | 1.63 ± 0.39 (0.80–3.37) (23.92) | 1.63 ± 0.42 (0.73–2.85) (25.76) | 1.63 ± 0.40 (0.79–3.38) (24.53) | 1.77 ± 0.45 (1.03–2.85) (25.42) | 1.63 ± 0.38 (0.73–2.66) (23.31) |
TP (mg/L) | 0.03 ± 0.01 (0.009–0.17) (33.33) | 0.027 ± 0.01 (0.01–0.10) (37.03) | 0.028 ± 0.01 (0.01–0.14) (35.71) | 0.025 ± 0.007 (0.01–0.057) (28) | 0.035 ± 0.02 (0.015–0.174) (57.14) | 0.03 ± 0.01 (0.01–0.138) (33.33) |
NP ratios | 65.10 ± 23.66 (14.36–179.33) (36.34) | 67.55 ± 25.27 (13.77–169.28) (37.40) | 64.20 ± 23.66 (10.85–165.75) (36.85) | 70.23 ± 24.06 (21.25–169) (34.25) | 61.57 ± 25.01 (13.77–158.83) (40.62) | 60.85 ± 22.84 (10.85–179.33) (37.53) |
WT (°C) | 14.09 ± 7.59 (2–29) (53.86) | 10.13 ± 4.42 (3–19) (43.63) | 13.31 ± 6.88 (2–27) (51.69) | 8.02 ± 4.89 (2–23) (60.97) | 19.36 ± 4.75 (9.8–29) (24.53) | 15.69 ± 4.65 (7–25.4) (29.63) |
Cond. (µS/cm) | 177.42 ± 61.91 (57–404) (34.88) | 163.62 ± 61.29 (4–312) (37.40) | 168.85 ± 57.28 (83–322) (33.87) | 168.69 ± 67.80 (4–404) (40.19) | 184.25 ± 49.60 (114–332) (26.91) | 164.60 ± 52.20 (86–316) (31.71) |
CHL-a (µg/L) | 4.80 ± 4.19 (1.8–23.6) (87.29) | 3.18 ± 2.49 (0.1–12.5) (78.30) | 3.82 ± 2.74 (0.1–18.3) (71.72) | 2.68 ± 1.86 (0.1–15) (69.14) | 6.23 ± 4.91 (0.47–4.91) (78.81) | 4.65 ± 3.45 (0.1–18.3) (74.19) |
SD (m) | 1.89 ± 0.96 (0.1–5) (50.79) | 2.62 ± 1.34 (0.1–6) (51.14) | 2.12 ± 1.05 (0.1–4.5) (49.52) | 2.22 ± 1.18 (0.1–5.5) (53.15) | 2.33 ± 1.17 (0.2–6) (50.21) | 2.12 ± 1.15 (0.1–5.5) (54.24) |
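
The CV column in the table above appears to be the coefficient of variation expressed as a percentage, that is, SD divided by the mean, times 100. This reading is an assumption inferred from the reported values, not something the table states. A minimal Python sketch (the function name `cv_percent` is illustrative) recomputes two riverine-zone entries from the rounded mean and SD:

```python
def cv_percent(mean, sd):
    """Coefficient of variation as a percentage: 100 * SD / mean."""
    if mean == 0:
        raise ValueError("CV is undefined for a zero mean")
    return 100.0 * sd / mean

# Riverine-zone values copied from the table above.
do_cv = cv_percent(8.70, 2.16)    # table reports 24.82
tss_cv = cv_percent(4.51, 3.67)   # table reports 81.37
print(round(do_cv, 2), round(tss_cv, 2))
```

Small differences in the last digit are expected, because the table's mean and SD are themselves rounded before the ratio is taken here.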

Rz: Riverine Zone; Tz: Transitional Zone; Lz: Lacustrine Zone. Before/After: before/after validation.

Model Accuracy Metrics | Rz Before MLR | Rz Before SVM | Rz Before ANN | Rz After MLR | Rz After SVM | Rz After ANN | Tz Before MLR | Tz Before SVM | Tz Before ANN | Tz After MLR | Tz After SVM | Tz After ANN | Lz Before MLR | Lz Before SVM | Lz Before ANN | Lz After MLR | Lz After SVM | Lz After ANN |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
RMSE | 3.37 | 2.86 | 2.83 | 3.61 | 2.57 | 4.96 | 2.07 | 1.58 | 1.56 | 2.18 | 1.31 | 2.93 | 2.25 | 1.77 | 1.72 | 2.31 | 1.59 | 3.40 |
R² | 0.34 | 0.56 | 0.53 | 0.28 | 0.75 | 0.43 | 0.30 | 0.63 | 0.60 | 0.26 | 0.73 | 0.40 | 0.31 | 0.58 | 0.60 | 0.25 | 0.80 | 0.40 |
MAE | 2.30 | 1.52 | 1.96 | 2.49 | 1.24 | 3.20 | 1.54 | 0.96 | 1.15 | 1.62 | 0.62 | 1.98 | 1.65 | 1.07 | 1.28 | 1.73 | 0.68 | 2.25 |
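
The three accuracy metrics reported in the table above (root mean squared error, coefficient of determination, and mean absolute error) can be sketched in plain Python as follows. The function names and the four-point example series are illustrative assumptions, not the study's data. The coefficient of determination is written here as 1 − SS_res/SS_tot; the study may instead have used the squared correlation between observed and predicted values, which can give a different number.

```python
import math

def mae(observed, predicted):
    """Mean absolute error: average of |observed - predicted|."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

def rmse(observed, predicted):
    """Root mean squared error: square root of the mean squared residual."""
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))

def r_squared(observed, predicted):
    """Coefficient of determination as 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Hypothetical observed and predicted values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]

print(mae(observed, predicted), rmse(observed, predicted), r_squared(observed, predicted))
```

MAE and RMSE share the units of the predicted variable, so lower is better; RMSE weights large errors more heavily than MAE, which is why RMSE is never smaller than MAE for the same predictions.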

Premonsoon: January–June; Monsoon: July–August; Postmonsoon: September–December. Before/After: before/after validation.

Model Accuracy Metrics | Premonsoon Before MLR | Premonsoon Before SVM | Premonsoon Before ANN | Premonsoon After MLR | Premonsoon After SVM | Premonsoon After ANN | Monsoon Before MLR | Monsoon Before SVM | Monsoon Before ANN | Monsoon After MLR | Monsoon After SVM | Monsoon After ANN | Postmonsoon Before MLR | Postmonsoon Before SVM | Postmonsoon Before ANN | Postmonsoon After MLR | Postmonsoon After SVM | Postmonsoon After ANN |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
RMSE | 1.60 | 1.35 | 1.42 | 1.64 | 1.04 | 1.99 | 4.34 | 3.25 | 2.85 | 4.77 | 1.51 | 6.04 | 2.65 | 2.15 | 2.09 | 2.83 | 1.80 | 4.88 |
R² | 0.25 | 0.50 | 0.41 | 0.23 | 0.71 | 0.37 | 0.20 | 0.57 | 0.65 | 0.10 | 0.81 | 0.27 | 0.40 | 0.62 | 0.63 | 0.34 | 0.77 | 0.48 |
MAE | 1.15 | 0.81 | 1.04 | 1.19 | 0.40 | 1.29 | 3.09 | 1.73 | 2.10 | 3.37 | 0.78 | 3.78 | 1.98 | 1.26 | 1.64 | 2.14 | 0.80 | 2.59 |
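
The "After Validation" columns above come from K-fold cross-validation (Section 2.4). The general procedure can be sketched as below. This is not the authors' pipeline: the number of folds, the shuffling seed, the synthetic data, and the one-predictor least-squares line standing in for the MLR, SVM, and ANN models are all assumptions made to keep the example self-contained.

```python
import random
from statistics import mean

def k_fold_indices(n, k, seed=0):
    """Yield (train, test) index lists for k folds over n samples."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint, near-equal folds
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]

def fit_line(x, y):
    """Ordinary least squares for a single predictor: returns (slope, intercept)."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return slope, my - slope * mx

def cross_validated_rmse(x, y, k=5, seed=0):
    """Fit on k-1 folds, predict the held-out fold, pool the squared errors."""
    sq_errors = []
    for train, test in k_fold_indices(len(x), k, seed):
        slope, intercept = fit_line([x[i] for i in train], [y[i] for i in train])
        sq_errors.extend((y[i] - (slope * x[i] + intercept)) ** 2 for i in test)
    return (sum(sq_errors) / len(sq_errors)) ** 0.5

# Synthetic, exactly linear data (illustration only, not reservoir measurements).
x = [float(v) for v in range(10)]
y = [2.0 * v + 1.0 for v in x]
print(cross_validated_rmse(x, y, k=5))
```

Because every sample is predicted by a model that never saw it during fitting, the pooled error is an estimate of out-of-sample accuracy rather than goodness of fit on the training data.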

Rz: Riverine Zone; Tz: Transitional Zone; Lz: Lacustrine Zone. Before/After: before/after validation.

Model Accuracy Metrics | Rz Before MLR | Rz Before SVM | Rz Before ANN | Rz After MLR | Rz After SVM | Rz After ANN | Tz Before MLR | Tz Before SVM | Tz Before ANN | Tz After MLR | Tz After SVM | Tz After ANN | Lz Before MLR | Lz Before SVM | Lz Before ANN | Lz After MLR | Lz After SVM | Lz After ANN |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
RMSE | 0.63 | 0.50 | 0.60 | 0.87 | 0.29 | 1.10 | 0.77 | 0.64 | 0.82 | 1.10 | 0.40 | 1.49 | 0.63 | 0.47 | 0.69 | 0.92 | 0.33 | 1.17 |
R² | 0.55 | 0.72 | 0.59 | 0.24 | 0.90 | 0.49 | 0.66 | 0.77 | 0.62 | 0.32 | 0.90 | 0.50 | 0.63 | 0.79 | 0.56 | 0.26 | 0.90 | 0.47 |
MAE | 0.48 | 0.32 | 0.46 | 0.67 | 0.19 | 0.81 | 0.59 | 0.39 | 0.62 | 0.87 | 0.27 | 1.24 | 0.49 | 0.30 | 0.52 | 0.73 | 0.23 | 0.92 |

Premonsoon: January–June; Monsoon: July–August; Postmonsoon: September–December. Before/After: before/after validation.

Model Accuracy Metrics | Premonsoon Before MLR | Premonsoon Before SVM | Premonsoon Before ANN | Premonsoon After MLR | Premonsoon After SVM | Premonsoon After ANN | Monsoon Before MLR | Monsoon Before SVM | Monsoon Before ANN | Monsoon After MLR | Monsoon After SVM | Monsoon After ANN | Postmonsoon Before MLR | Postmonsoon Before SVM | Postmonsoon Before ANN | Postmonsoon After MLR | Postmonsoon After SVM | Postmonsoon After ANN |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
RMSE | 0.88 | 0.61 | 0.69 | 0.95 | 0.41 | 1.01 | 0.85 | 0.62 | 0.57 | 1.00 | 0.39 | 1.18 | 0.94 | 0.65 | 0.79 | 1.01 | 0.31 | 1.27 |
R² | 0.43 | 0.72 | 0.65 | 0.40 | 0.87 | 0.57 | 0.46 | 0.71 | 0.75 | 0.37 | 0.87 | 0.53 | 0.32 | 0.67 | 0.53 | 0.24 | 0.92 | 0.47 |
MAE | 0.68 | 0.42 | 0.56 | 0.71 | 0.23 | 0.80 | 0.66 | 0.37 | 0.44 | 0.80 | 0.21 | 0.78 | 0.76 | 0.40 | 0.62 | 0.82 | 0.19 | 1.27 |
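
One way to read the table directly above is to pick, for each season, the model with the lowest after-validation RMSE. The short sketch below does that with the values copied from the RMSE row; the dictionary layout and the helper name `best_model` are illustrative choices, not part of the study.

```python
# After-validation RMSE values copied from the RMSE row of the table above.
rmse_after = {
    "Premonsoon": {"MLR": 0.95, "SVM": 0.41, "ANN": 1.01},
    "Monsoon": {"MLR": 1.00, "SVM": 0.39, "ANN": 1.18},
    "Postmonsoon": {"MLR": 1.01, "SVM": 0.31, "ANN": 1.27},
}

def best_model(scores):
    """Return the model name with the smallest error value."""
    return min(scores, key=scores.get)

best = {season: best_model(scores) for season, scores in rmse_after.items()}
print(best)  # {'Premonsoon': 'SVM', 'Monsoon': 'SVM', 'Postmonsoon': 'SVM'}
```

The same comparison applied to the R² row would use `max` instead of `min`, since a higher coefficient of determination indicates a better fit.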
© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Mamun, M.; Kim, J.-J.; Alam, M.A.; An, K.-G. Prediction of Algal Chlorophyll-a and Water Clarity in Monsoon-Region Reservoir Using Machine Learning Approaches. Water 2020, 12, 30. https://doi.org/10.3390/w12010030