Hyperspectral Data and Machine Learning for Estimating CDOM, Chlorophyll a, Diatoms, Green Algae and Turbidity
Abstract
1. Introduction
- a detailed description of our water quality measurements with the Biofish multi-sensor system and the PhycoSens fluorometer, which has not been available so far;
- a comprehensive analysis of the potential of an appropriate regression framework based on different regression models, e.g., linear models, tree ensemble methods and artificial neural networks;
- an analysis of two distinct preprocessing methods, combined with a detailed evaluation of the regression performance;
- a detailed visualization of the regression results compared to the real probe measurements based on recorded GPS tracks along the river Elbe.
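The idea of placing a preprocessing step in front of a regressor and scoring it on a held-out subset can be sketched as follows. This is a minimal illustration, not the authors' implementation: the data are synthetic, a k-nearest-neighbour regressor stands in for the regression models named above, min-max scaling stands in for one preprocessing variant, and all function names (`make_synthetic`, `evaluate`, and so on) are hypothetical. The numbers it prints say nothing about the field data.

```python
import math
import random


def min_max_fit(rows):
    """Learn per-feature (min, max) bounds from the training rows only."""
    return [(min(col), max(col)) for col in zip(*rows)]


def min_max_apply(rows, bounds):
    """Scale every feature with the training bounds (roughly to [0, 1])."""
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0 for v, (lo, hi) in zip(row, bounds)]
        for row in rows
    ]


def knn_predict(train_x, train_y, query, k=5):
    """Average the targets of the k nearest training samples (Euclidean distance)."""
    nearest = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def r2_and_rmse(y_true, y_pred):
    """Return the coefficient of determination and the root mean squared error."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot, math.sqrt(ss_res / len(y_true))


def make_synthetic(n, seed=0):
    """Synthetic stand-in data: one informative band and one large-range nuisance band."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        informative = rng.uniform(0.0, 1.0)   # drives the target
        nuisance = rng.uniform(0.0, 100.0)    # unrelated to the target
        xs.append([informative, nuisance])
        ys.append(10.0 * informative + rng.gauss(0.0, 0.1))
    return xs, ys


def evaluate(scale, n_train=200, n_test=100, k=5):
    """Fit on the training subset, score on the test subset."""
    xs, ys = make_synthetic(n_train + n_test)
    train_x, test_x = xs[:n_train], xs[n_train:]
    train_y, test_y = ys[:n_train], ys[n_train:]
    if scale:
        bounds = min_max_fit(train_x)          # bounds come from training data only
        train_x = min_max_apply(train_x, bounds)
        test_x = min_max_apply(test_x, bounds)
    preds = [knn_predict(train_x, train_y, q, k) for q in test_x]
    return r2_and_rmse(test_y, preds)


if __name__ == "__main__":
    for label, scale in (("baseline", False), ("with scaling", True)):
        r2, rmse = evaluate(scale)
        print(f"{label:>12}: R2 = {100 * r2:5.1f} %, RMSE = {rmse:.2f}")
```

Fitting the scaling bounds on the training subset alone keeps information from the test subset out of the preprocessing step, so the reported scores remain an honest estimate of performance on unseen data.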
2. Sensors and Datasets
2.1. Sampling Chlorophyll a, Green Algae and Diatoms
2.2. Sampling CDOM and Turbidity
2.3. Recording Hyperspectral Images
2.4. Elbe Field Campaign Datasets
3. Methodology
3.1. Preprocessing
3.2. Regression Models
4. Results
5. Discussion
5.1. Performance and Applicability of the Regression Framework
5.2. Discussion of the ET’s Feature Importance
6. Conclusions
Author Contributions
Acknowledgments
Conflicts of Interest
Water Quality Parameter | Full Dataset (datapoints) | Training Subset (datapoints) | Test Subset (datapoints) |
---|---|---|---|
CDOM | 802 | 240 | 562 |
Chlorophyll a | 1035 | 339 | 696 |
Green algae | 1028 | 336 | 692 |
Diatoms | 1012 | 332 | 680 |
Turbidity | 802 | 240 | 562 |

Variable | Model | Baseline R² in % | Baseline RMSE | with PCA R² in % | with PCA RMSE | with Scaling R² in % | with Scaling RMSE |
---|---|---|---|---|---|---|---|
CDOM | Linear | 74.3 | 1.05 | 83.2 | 0.85 | 74.3 | 0.12 |
CDOM | PLS | 84.9 | 0.81 | 83.2 | 0.85 | 84.9 | 0.09 |
CDOM | RF | 82.4 | 0.87 | 91.4 | 0.61 | 82.4 | 0.10 |
CDOM | ET | 86.2 | 0.77 | 94.6 | 0.48 | 86.3 | 0.09 |
CDOM | AdaBoost | 80.0 | 0.93 | 91.9 | 0.59 | 79.9 | 0.11 |
CDOM | GB | 80.1 | 0.93 | 91.2 | 0.61 | 80.0 | 0.11 |
CDOM | k-NN | 85.5 | 0.79 | 85.3 | 0.80 | 83.0 | 0.10 |
CDOM | SVM | 91.5 | 0.61 | 91.2 | 0.61 | 85.6 | 0.09 |
CDOM | ANN | 87.2 | 0.74 | 50.8 | 1.44 | 93.7 | 0.06 |
CDOM | SOM | 85.8 | 0.78 | 83.5 | 0.84 | 83.0 | 0.10 |
Chlorophyll a | Linear | 70.2 | 18.45 | 75.5 | 16.70 | 70.2 | 0.14 |
Chlorophyll a | PLS | 73.5 | 17.38 | 75.5 | 16.70 | 73.5 | 0.13 |
Chlorophyll a | RF | 76.5 | 16.36 | 88.7 | 11.35 | 76.6 | 0.12 |
Chlorophyll a | ET | 80.0 | 15.10 | 91.4 | 9.92 | 80.0 | 0.11 |
Chlorophyll a | AdaBoost | 68.7 | 18.89 | 80.0 | 15.11 | 66.7 | 0.14 |
Chlorophyll a | GB | 76.5 | 16.36 | 89.4 | 10.98 | 75.5 | 0.12 |
Chlorophyll a | k-NN | 76.1 | 16.51 | 76.6 | 16.34 | 75.4 | 0.12 |
Chlorophyll a | SVM | 88.0 | 11.69 | 90.0 | 10.71 | 87.6 | 0.09 |
Chlorophyll a | ANN | 67.3 | 19.12 | 90.5 | 10.40 | 89.3 | 0.08 |
Chlorophyll a | SOM | 74.3 | 17.12 | 74.7 | 16.99 | 71.5 | 0.13 |
Green algae | Linear | 49.7 | 14.42 | 62.3 | 12.49 | 49.7 | 0.18 |
Green algae | PLS | 62.6 | 12.44 | 62.3 | 12.49 | 62.6 | 0.15 |
Green algae | RF | 69.6 | 11.21 | 81.6 | 8.73 | 69.6 | 0.14 |
Green algae | ET | 73.1 | 10.55 | 87.5 | 7.20 | 73.2 | 0.13 |
Green algae | AdaBoost | 60.5 | 12.78 | 75.6 | 10.05 | 61.7 | 0.15 |
Green algae | GB | 67.0 | 11.68 | 80.6 | 8.95 | 67.1 | 0.14 |
Green algae | k-NN | 68.8 | 11.35 | 68.6 | 11.40 | 68.0 | 0.14 |
Green algae | SVM | 83.1 | 8.36 | 79.7 | 9.18 | 76.6 | 0.12 |
Green algae | ANN | 56.8 | 13.34 | 81.3 | 8.79 | 75.9 | 0.12 |
Green algae | SOM | 64.8 | 12.06 | 64.3 | 12.15 | 66.2 | 0.15 |
Diatoms | Linear | 55.0 | 10.51 | 62.4 | 9.60 | 55.0 | 0.15 |
Diatoms | PLS | 58.8 | 10.06 | 62.4 | 9.60 | 58.8 | 0.15 |
Diatoms | RF | 68.2 | 8.84 | 81.8 | 6.68 | 68.2 | 0.13 |
Diatoms | ET | 72.7 | 8.19 | 86.4 | 5.78 | 72.7 | 0.12 |
Diatoms | AdaBoost | 56.7 | 10.31 | 76.3 | 7.62 | 56.8 | 0.15 |
Diatoms | GB | 68.0 | 8.87 | 81.5 | 6.73 | 67.2 | 0.13 |
Diatoms | k-NN | 68.6 | 8.78 | 68.6 | 8.78 | 67.7 | 0.13 |
Diatoms | SVM | 80.5 | 6.93 | 78.2 | 7.32 | 80.3 | 0.10 |
Diatoms | ANN | 62.4 | 9.60 | 86.9 | 5.65 | 79.8 | 0.10 |
Diatoms | SOM | 64.0 | 9.40 | 64.2 | 9.38 | 63.8 | 0.14 |
Turbidity | Linear | 45.0 | 0.44 | 70.9 | 0.32 | 44.0 | 0.19 |
Turbidity | PLS | 73.3 | 0.30 | 70.9 | 0.32 | 73.3 | 0.13 |
Turbidity | RF | 68.0 | 0.33 | 84.1 | 0.24 | 67.7 | 0.15 |
Turbidity | ET | 73.6 | 0.30 | 89.9 | 0.19 | 73.0 | 0.14 |
Turbidity | AdaBoost | 67.6 | 0.34 | 85.2 | 0.23 | 66.4 | 0.15 |
Turbidity | GB | 69.2 | 0.33 | 85.5 | 0.22 | 69.3 | 0.14 |
Turbidity | k-NN | 72.5 | 0.31 | 72.8 | 0.31 | 70.7 | 0.14 |
Turbidity | SVM | 80.5 | 0.26 | 74.3 | 0.30 | 73.3 | 0.13 |
Turbidity | ANN | 79.9 | 0.26 | 88.4 | 0.20 | 80.8 | 0.11 |
Turbidity | SOM | 76.0 | 0.29 | 71.6 | 0.31 | 73.0 | 0.14 |
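
In the results table above, several models show an almost identical percentage score in the baseline and scaling columns but a very different RMSE (for CDOM with the linear model, 74.3 in both columns against an RMSE of 1.05 and 0.12). A possible reading, sketched below, rests on two assumptions that the table itself does not state: that the percentage score is the coefficient of determination R², and that "with Scaling" applies min-max scaling to the target variable as well. Under those assumptions R² is unchanged by the rescaling, while the RMSE is divided by the target range and is therefore expressed in normalized units. The values and function names are hypothetical.

```python
import math


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean squared error, in the units of the target."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def min_max_scale(values, lo, hi):
    """Map values linearly so that lo -> 0 and hi -> 1."""
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical measurements and predictions in physical units.
y_true = [2.0, 4.0, 6.0, 8.0, 10.0]
y_pred = [2.5, 3.5, 6.5, 7.0, 10.5]

# Scale both series with the bounds of the measured values.
lo, hi = min(y_true), max(y_true)
y_true_scaled = min_max_scale(y_true, lo, hi)
y_pred_scaled = min_max_scale(y_pred, lo, hi)

print("R2   raw / scaled:", r2_score(y_true, y_pred), r2_score(y_true_scaled, y_pred_scaled))
print("RMSE raw / scaled:", rmse(y_true, y_pred), rmse(y_true_scaled, y_pred_scaled))
print("RMSE raw / range :", rmse(y_true, y_pred) / (hi - lo))
```

If this reading holds, RMSE values should only be compared within one column group of the table, whereas the percentage scores can be compared across all three.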
© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Keller, S.; Maier, P.M.; Riese, F.M.; Norra, S.; Holbach, A.; Börsig, N.; Wilhelms, A.; Moldaenke, C.; Zaake, A.; Hinz, S. Hyperspectral Data and Machine Learning for Estimating CDOM, Chlorophyll a, Diatoms, Green Algae and Turbidity. Int. J. Environ. Res. Public Health 2018, 15, 1881. https://doi.org/10.3390/ijerph15091881