Comparative Analysis of Snowmelt-Driven Streamflow Forecasting Using Machine Learning Techniques
Abstract
1. Introduction
- We compare state-of-the-art deep learning (DL) architectures, the Transformer and the temporal convolutional network (TCN), with a traditional machine learning method, support vector regression (SVR), and an earlier DL technique, long short-term memory (LSTM), for snowmelt-driven streamflow prediction. Notably, our study incorporates a TCN architecture, which has not yet been widely used in comprehensive comparisons for snowmelt forecasting.
- We use nested cross-validation (CV) to evaluate the models' generalization capability in the context of snowmelt prediction, which previous studies have not done.
2. Materials and Methods
2.1. Study Area
2.2. Hydrometeorological Data
2.3. Snow Cover
2.4. Dataset Preparation
The input features are scaled to the range [0, 1] with min–max normalization:

$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$

where:
- $x$ is the input value;
- $x_{\min}$ is the minimum value of the column;
- $x_{\max}$ is the maximum value of the column.
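A minimal sketch of this scaling step, assuming column-wise statistics taken from the training split only so that no test information leaks into the transform (function names are ours, not the paper's):

```python
# Min-max normalization sketch: fit on the training split, apply everywhere.
import numpy as np

def minmax_fit(X_train):
    """Return per-column minima and maxima from the training split."""
    return X_train.min(axis=0), X_train.max(axis=0)

def minmax_apply(X, x_min, x_max):
    """Scale each column to [0, 1] using the training statistics."""
    return (X - x_min) / (x_max - x_min)

X_train = np.array([[2.0, 10.0], [4.0, 30.0], [6.0, 20.0]])  # placeholder data
x_min, x_max = minmax_fit(X_train)
print(minmax_apply(X_train, x_min, x_max))
```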
2.5. Experimental Setup
- SVR;
- LSTM;
- Transformer;
- TCN.
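As noted in the contributions, model selection and evaluation use nested cross-validation. A minimal sketch of the procedure, assuming time-ordered splits and shown here with SVR as the inner estimator; the data are placeholders, and the grid mirrors the SVR search space tabulated later:

```python
# Nested cross-validation sketch: the inner loop tunes hyperparameters,
# the outer loop estimates generalization error on unseen folds.
# TimeSeriesSplit preserves temporal order (an assumption on our part).
import numpy as np
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit, cross_val_score
from sklearn.svm import SVR

X, y = np.random.rand(500, 4), np.random.rand(500)  # placeholder data

inner_cv = TimeSeriesSplit(n_splits=3)
outer_cv = TimeSeriesSplit(n_splits=5)  # tables below report five folds

param_grid = {"C": [0.1, 1, 10], "epsilon": [0.01, 0.1, 0.2], "kernel": ["linear", "rbf"]}
model = GridSearchCV(SVR(), param_grid, cv=inner_cv, scoring="neg_mean_absolute_error")

scores = cross_val_score(model, X, y, cv=outer_cv, scoring="neg_mean_absolute_error")
print("nested-CV MAE per outer fold:", -scores)
```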
2.5.1. Kling–Gupta Efficiency
$$\mathrm{KGE} = 1 - \sqrt{(r - 1)^2 + (\beta - 1)^2 + (\gamma - 1)^2}$$

where:
- $r$ is the Pearson's correlation coefficient;
- $\beta = \mu_s / \mu_o$ is the bias ratio;
- $\mu_o$ is the average observed discharge;
- $\mu_s$ is the average simulated discharge;
- $\gamma = \mathrm{CV}_s / \mathrm{CV}_o$ is the variability ratio;
- $\mathrm{CV}_o = \sigma_o / \mu_o$ is the observed coefficient of variation;
- $\mathrm{CV}_s = \sigma_s / \mu_s$ is the simulated coefficient of variation.
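A direct NumPy transcription of this definition (a sketch, not the authors' code):

```python
# KGE sketch following the definition above (the Kling et al. 2012 variant,
# with the variability term expressed as a ratio of coefficients of variation).
import numpy as np

def kge(obs, sim):
    r = np.corrcoef(obs, sim)[0, 1]                               # correlation
    beta = sim.mean() / obs.mean()                                # bias ratio
    gamma = (sim.std() / sim.mean()) / (obs.std() / obs.mean())   # CV ratio
    return 1.0 - np.sqrt((r - 1) ** 2 + (beta - 1) ** 2 + (gamma - 1) ** 2)
```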
2.5.2. Nash–Sutcliffe Efficiency
$$\mathrm{NSE} = 1 - \frac{\sum_{t=1}^{n} \left( Q_o^t - Q_s^t \right)^2}{\sum_{t=1}^{n} \left( Q_o^t - \bar{Q}_o \right)^2}$$

where:
- $Q_o^t$ is the observed discharge at time $t$;
- $Q_s^t$ is the simulated discharge at time $t$;
- $\bar{Q}_o$ denotes the average observed discharge.
2.5.3. R Square ($R^2$)

$$R^2 = \left( \frac{\sum_{t=1}^{n} \left( Q_o^t - \bar{Q}_o \right)\left( Q_s^t - \bar{Q}_s \right)}{\sqrt{\sum_{t=1}^{n} \left( Q_o^t - \bar{Q}_o \right)^2} \sqrt{\sum_{t=1}^{n} \left( Q_s^t - \bar{Q}_s \right)^2}} \right)^2$$

where:
- $Q_o^t$ is the observed discharge at time $t$;
- $Q_s^t$ is the simulated discharge at time $t$;
- $\bar{Q}_o$ is the average observed discharge;
- $\bar{Q}_s$ is the average simulated discharge.
2.5.4. Root-Mean-Square Error
$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{t=1}^{n} \left( Q_o^t - Q_s^t \right)^2}$$

where:
- $Q_o^t$ is the observed discharge at time $t$;
- $Q_s^t$ is the simulated discharge at time $t$.
2.5.5. Mean Absolute Error
$$\mathrm{MAE} = \frac{1}{n} \sum_{t=1}^{n} \left| Q_o^t - Q_s^t \right|$$

where:
- $Q_o^t$ is the observed discharge at time $t$;
- $Q_s^t$ is the simulated discharge at time $t$.
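The remaining four metrics reduce to a few NumPy lines; a minimal sketch matching the definitions in Sections 2.5.2–2.5.5:

```python
# NSE, R², RMSE, and MAE sketches matching the definitions above.
import numpy as np

def nse(obs, sim):
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

def r_squared(obs, sim):
    return np.corrcoef(obs, sim)[0, 1] ** 2   # squared Pearson correlation

def rmse(obs, sim):
    return np.sqrt(np.mean((obs - sim) ** 2))

def mae(obs, sim):
    return np.mean(np.abs(obs - sim))
```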
3. Methodology
3.1. SVR
$$K(x_i, x_j) = \exp\left( -\gamma \left\| x_i - x_j \right\|^2 \right)$$

where:
- $x_i$ and $x_j$ are the input feature vectors;
- $\gamma$ is a parameter controlling the width of the kernel.
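For reference, a minimal scikit-learn sketch of SVR with the RBF kernel above; the data are placeholders and the hyperparameter values are illustrative picks from the search grid, not the tuned ones:

```python
# SVR with an RBF kernel (sketch; values are illustrative, not tuned).
import numpy as np
from sklearn.svm import SVR

X_train, y_train = np.random.rand(100, 4), np.random.rand(100)  # placeholders
svr = SVR(kernel="rbf", C=1.0, epsilon=0.1, gamma="scale")
svr.fit(X_train, y_train)
print(svr.predict(X_train[:5]))
```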
3.2. LSTM
- Forget gate: $f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)$
- Input gate: $i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)$
- Update vector: $\tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C)$
- Cell state: $C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t$
- Output gate: $o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)$, with hidden state $h_t = o_t \odot \tanh(C_t)$

where:
- $W_f$, $W_i$, $W_C$, and $W_o$ are weights for the forget gate, input gate, cell state, and output gate, respectively;
- $b_f$, $b_i$, $b_C$, and $b_o$ are biases for the forget gate, input gate, cell state, and output gate, respectively;
- $\sigma$ represents the sigmoid activation function;
- $[h_{t-1}, x_t]$ represents the concatenation of the current input and the previous hidden state.
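A minimal Keras sketch of a stacked LSTM regressor consistent with these equations; the layer counts, units, dropout, and look-back window are illustrative picks from the search space, not the tuned configuration:

```python
# Stacked LSTM regressor sketch for one-step-ahead discharge prediction.
from tensorflow import keras
from tensorflow.keras import layers

timesteps, n_features = 30, 4  # e.g., a 30-day lookback with four inputs

model = keras.Sequential([
    layers.Input(shape=(timesteps, n_features)),
    layers.LSTM(64, return_sequences=True),  # first layer feeds sequences onward
    layers.Dropout(0.2),
    layers.LSTM(32),                         # final layer returns one vector
    layers.Dense(1),                         # next-step discharge
])
model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.001), loss="mae")
model.summary()
```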
3.3. Transformer
3.3.1. Self-Attention Mechanism
$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left( \frac{QK^T}{\sqrt{d_k}} \right) V$$

where:
- $Q$ represents the query;
- $K$ represents the key;
- $V$ represents the value;
- $d_k$ represents the dimensionality of the keys.
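A NumPy sketch of the scaled dot-product attention above (self-attention, so $Q$, $K$, and $V$ come from the same sequence):

```python
# Scaled dot-product attention; Q, K, V have shape (sequence_length, d_k).
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)   # similarity of each query to each key
    return softmax(scores) @ V        # attention-weighted sum of values

Q = K = V = np.random.rand(30, 16)    # self-attention over one sequence
print(attention(Q, K, V).shape)       # (30, 16)
```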
3.3.2. Multi-Head Attention
$$\mathrm{MultiHead}(Q, K, V) = \mathrm{Concat}(\mathrm{head}_1, \ldots, \mathrm{head}_h) W^O, \quad \mathrm{head}_i = \mathrm{Attention}(Q W_i^Q, K W_i^K, V W_i^V)$$

where:
- $\mathrm{head}_i$ represents the individual attention heads;
- $W^O$ represents the output projection matrix of the entire multi-head attention mechanism.
3.3.3. Position Encoding
$$PE_{(pos,\,2i)} = \sin\left( \frac{pos}{10000^{2i/d_{\mathrm{model}}}} \right), \quad PE_{(pos,\,2i+1)} = \cos\left( \frac{pos}{10000^{2i/d_{\mathrm{model}}}} \right)$$

where:
- $pos$ is the position in the sequence;
- $d_{\mathrm{model}}$ is the dimensionality of the model.
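A sketch of the sinusoidal encoding above (assumes an even model dimensionality):

```python
# Sinusoidal positional encoding: sines on even dims, cosines on odd dims.
import numpy as np

def positional_encoding(seq_len, d_model):
    pos = np.arange(seq_len)[:, None]         # (seq_len, 1)
    i = np.arange(d_model // 2)[None, :]      # (1, d_model/2)
    angles = pos / np.power(10000, 2 * i / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)              # even dimensions
    pe[:, 1::2] = np.cos(angles)              # odd dimensions
    return pe

print(positional_encoding(30, 16).shape)      # (30, 16)
```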
3.4. TCN
3.4.1. Dilated Causal Convolutions
$$y_t = \sum_{i=0}^{k-1} w_i \cdot x_{t - d \cdot i}$$

where:
- $y_t$ is the output at time $t$;
- $w_i$ is the $i$-th weight of the convolutional filter;
- $x_{t - d \cdot i}$ is the input at time $t - d \cdot i$;
- $d$ is the dilation rate;
- $k$ is the filter size.
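A Keras sketch of a stack of dilated causal convolutions implementing the equation above; filter counts and dilation rates are illustrative:

```python
# Dilated causal convolution stack (TCN-style sketch). padding="causal"
# prevents leakage from future timesteps; doubling the dilation rate per
# layer widens the receptive field exponentially.
from tensorflow import keras
from tensorflow.keras import layers

timesteps, n_features = 30, 4
inputs = keras.Input(shape=(timesteps, n_features))
x = inputs
for d in (1, 2, 4):  # exponentially increasing dilation rates
    x = layers.Conv1D(filters=32, kernel_size=3, dilation_rate=d,
                      padding="causal", activation="relu")(x)
outputs = layers.Dense(1)(x[:, -1, :])   # predict from the last timestep
model = keras.Model(inputs, outputs)
model.summary()
```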
3.4.2. Residual Blocks
3.4.3. Temporal Skip Connection
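Sections 3.4.2 and 3.4.3 name the residual blocks and skip connections that standard TCNs (Bai et al., 2018) wrap around the dilated convolutions. A minimal sketch of such a block, with a 1 × 1 convolution on the skip path when channel counts differ; this is our construction under that standard design, not necessarily the paper's exact block:

```python
# TCN residual block sketch: two dilated causal convolutions plus a skip
# path, with a 1x1 convolution to match channels before the addition.
from tensorflow import keras
from tensorflow.keras import layers

def residual_block(x, filters, kernel_size, dilation_rate, dropout=0.2):
    skip = x
    for _ in range(2):
        x = layers.Conv1D(filters, kernel_size, dilation_rate=dilation_rate,
                          padding="causal")(x)
        x = layers.Activation("relu")(x)
        x = layers.Dropout(dropout)(x)
    if skip.shape[-1] != filters:            # match channels for the sum
        skip = layers.Conv1D(filters, 1, padding="same")(skip)
    return layers.Add()([x, skip])

inputs = keras.Input(shape=(30, 4))
out = residual_block(inputs, filters=32, kernel_size=3, dilation_rate=1)
print(out.shape)                             # (None, 30, 32)
```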
4. Results
4.1. Performance Analysis of ML Models with Four Inputs
4.2. Performance Analysis of ML Models with Three Inputs
4.3. Overall Comparison of ML Models
4.4. Testing Time
5. Discussion
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
| Model | Hyperparameter | Values |
|---|---|---|
| SVR | C | [0.1, 1, 10] |
| SVR | Epsilon | [0.01, 0.1, 0.2] |
| SVR | Kernel | [Linear, RBF] |
| LSTM | LSTM Layers | [1, 2, 3] |
| LSTM | LSTM Units | [32, 64, 128] |
| LSTM | Dropout Rate | [0.2, 0.3, …, 0.5] |
| LSTM | Optimizer | [Adam, Adamax, RMSProp, SGD] |
| LSTM | Learning Rate | [0.0001, 0.001, …, 0.1] |
| Transformer | Transformer Blocks | [2, 4, 6, 8] |
| Transformer | Head Size | [8, 16, …, 256] |
| Transformer | Number of Heads | [2, 4, …, 16] |
| Transformer | FF Dim | [4, 8, …, 64] |
| Transformer | Dropout Rate | [0.1, 0.2, …, 0.6] |
| Transformer | Number of MLP Layers | [1, 2, 3] |
| Transformer | MLP Units | [32, 64, …, 256] |
| Transformer | MLP Dropout | [0.1, 0.2, …, 0.6] |
| Transformer | Optimizer | [Adam, Adamax, RMSProp, SGD] |
| Transformer | Learning Rate | [0.0001, 0.001, …, 0.1] |
| TCN | TCN Layers | [1, 2, 3] |
| TCN | Number of Filters | [32, 64, …, 128] |
| TCN | Kernel Size | [2, 3, 4] |
| TCN | Optimizer | [Adam, Adamax, RMSProp, SGD] |
| TCN | Learning Rate | [0.0001, 0.001, …, 0.1] |
| Model | Fold | MAE | RMSE | KGE | NSE | R² |
|---|---|---|---|---|---|---|
| SVR | 1 | 0.0341 | 0.0451 | 0.9657 | 0.8967 | 0.9657 |
| SVR | 2 | 0.0312 | 0.0401 | 0.9723 | 0.9291 | 0.9723 |
| SVR | 3 | 0.0306 | 0.0406 | 0.9710 | 0.9227 | 0.9710 |
| SVR | 4 | 0.0338 | 0.0450 | 0.9672 | 0.9175 | 0.9672 |
| SVR | 5 | 0.0313 | 0.0395 | 0.9734 | 0.9220 | 0.9734 |
| LSTM | 1 | 0.0161 | 0.0301 | 0.9847 | 0.9727 | 0.9847 |
| LSTM | 2 | 0.0130 | 0.0235 | 0.9905 | 0.9889 | 0.9905 |
| LSTM | 3 | 0.0126 | 0.0239 | 0.9901 | 0.9693 | 0.9903 |
| LSTM | 4 | 0.0130 | 0.0242 | 0.9905 | 0.9713 | 0.9905 |
| LSTM | 5 | 0.0121 | 0.0220 | 0.9918 | 0.9735 | 0.9918 |
| Transformer | 1 | 0.0195 | 0.0308 | 0.9840 | 0.9855 | 0.9840 |
| Transformer | 2 | 0.0177 | 0.0274 | 0.9870 | 0.9880 | 0.9870 |
| Transformer | 3 | 0.0186 | 0.0283 | 0.9860 | 0.9819 | 0.9860 |
| Transformer | 4 | 0.0185 | 0.0281 | 0.9872 | 0.9859 | 0.9872 |
| Transformer | 5 | 0.0177 | 0.0258 | 0.9887 | 0.9883 | 0.9887 |
| TCN | 1 | 0.0126 | 0.0269 | 0.9878 | 0.9914 | 0.9878 |
| TCN | 2 | 0.0110 | 0.0227 | 0.9911 | 0.9927 | 0.9911 |
| TCN | 3 | 0.0107 | 0.0214 | 0.9919 | 0.9913 | 0.9919 |
| TCN | 4 | 0.0109 | 0.0230 | 0.9914 | 0.9926 | 0.9914 |
| TCN | 5 | 0.0098 | 0.0192 | 0.9937 | 0.9931 | 0.9937 |
| Model | Fold | MAE | RMSE | KGE | NSE | R² |
|---|---|---|---|---|---|---|
| SVR | 1 | 0.0317 | 0.0426 | 0.9693 | 0.9316 | 0.9693 |
| SVR | 2 | 0.0300 | 0.0386 | 0.9743 | 0.9403 | 0.9743 |
| SVR | 3 | 0.0288 | 0.0374 | 0.9755 | 0.9405 | 0.9755 |
| SVR | 4 | 0.0300 | 0.0395 | 0.9747 | 0.9335 | 0.9746 |
| SVR | 5 | 0.0284 | 0.0368 | 0.9769 | 0.9369 | 0.9769 |
| LSTM | 1 | 0.0143 | 0.0272 | 0.9875 | 0.9768 | 0.9875 |
| LSTM | 2 | 0.0136 | 0.0236 | 0.9903 | 0.9757 | 0.9903 |
| LSTM | 3 | 0.0112 | 0.0231 | 0.9905 | 0.9812 | 0.9905 |
| LSTM | 4 | 0.0116 | 0.0232 | 0.9912 | 0.9840 | 0.9912 |
| LSTM | 5 | 0.0106 | 0.0206 | 0.9927 | 0.9875 | 0.9927 |
| Transformer | 1 | 0.0178 | 0.0305 | 0.9842 | 0.9663 | 0.9842 |
| Transformer | 2 | 0.0161 | 0.0262 | 0.9881 | 0.9781 | 0.9881 |
| Transformer | 3 | 0.0162 | 0.0266 | 0.9875 | 0.9751 | 0.9875 |
| Transformer | 4 | 0.0126 | 0.0243 | 0.9904 | 0.9826 | 0.9904 |
| Transformer | 5 | 0.0115 | 0.0213 | 0.9922 | 0.9831 | 0.9922 |
| TCN | 1 | 0.0122 | 0.0269 | 0.9877 | 0.9806 | 0.9877 |
| TCN | 2 | 0.0108 | 0.0217 | 0.9918 | 0.9891 | 0.9918 |
| TCN | 3 | 0.0113 | 0.0223 | 0.9913 | 0.9865 | 0.9913 |
| TCN | 4 | 0.0115 | 0.0228 | 0.9915 | 0.9888 | 0.9915 |
| TCN | 5 | 0.0104 | 0.0201 | 0.9931 | 0.9904 | 0.9931 |
| Model | Inputs | MAE | RMSE | KGE | NSE | R² |
|---|---|---|---|---|---|---|
| SVR | M1 | 0.032 | 0.042 | 0.970 | 0.918 | 0.970 |
| SVR | M2 | 0.030 | 0.039 | 0.974 | 0.937 | 0.975 |
| LSTM | M1 | 0.013 | 0.025 | 0.989 | 0.975 | 0.989 |
| LSTM | M2 | 0.012 | 0.024 | 0.990 | 0.981 | 0.990 |
| Transformer | M1 | 0.018 | 0.028 | 0.987 | 0.986 | 0.987 |
| Transformer | M2 | 0.015 | 0.026 | 0.989 | 0.977 | 0.989 |
| TCN | M1 | 0.011 | 0.023 | 0.991 | 0.992 | 0.991 |
| TCN | M2 | 0.011 | 0.023 | 0.991 | 0.987 | 0.991 |
| Persistence | N/A | 0.0143 | 0.0310 | 0.9837 | 0.991 | 0.983 |
| Model | Testing Time (s) |
|---|---|
| LSTM | 0.694 |
| Transformer | 0.652 |
| TCN | 0.372 |