Floods are among the most devastating natural disasters, and predicting their depth and extent remains a global challenge. Machine Learning (ML) models have demonstrated improved accuracy over traditional probabilistic flood mapping approaches. While previous studies have developed ML-based models for specific local regions, this study aims to establish a methodology for estimating flood depth on a global scale using ML algorithms and freely available datasets, a challenging yet critical task. To support model generalization, 45 catchments from diverse geographic regions were selected based on variations in elevation, land use, land cover, and soil type. The datasets were preprocessed by ensuring normality, eliminating outliers, and scaling. The preprocessed data were then split into two subsets, 75% for training and 25% for testing, with six additional unseen catchments from the USA reserved for validation. A sensitivity analysis was performed across several ML models (ANN, CNN, RNN, LSTM, Random Forest, XGBoost), leading to the selection of the Random Forest (RF) algorithm for both the flood inundation classification and flood depth regression models. Three regression models were assessed for flood depth prediction. The pixel-based regression model achieved an R² of 91% for training and 69% for testing. Introducing a pixel clustering regression model improved the testing R² to 75%, with an overall validation R² (for unseen catchments) of 64%. The catchment-based clustering regression model yielded the most robust performance, with an R² of 83% for testing and 82% for validation. The developed ML model is computationally efficient, generating complete flood depth predictions in just 6 min, a 225× speed improvement (90–95% time reduction) over conventional HEC-RAS 6.3 simulations. This rapid processing enables the practical implementation of flood early warning systems. Despite the speed gains, the solution maintains high predictive accuracy, evidenced by statistically robust 95% confidence intervals and strong spatial agreement with HEC-RAS benchmark maps. These findings highlight the critical role of the spatial variability of dependencies in enhancing model accuracy, representing a meaningful step forward in scalable modeling frameworks with potential for global generalization of flood depth prediction.
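As a rough illustration of how holdout scores like those reported above can be computed, the sketch below splits a dataset 75%/25% and evaluates R² on the held-out part. It is not the study's pipeline: the data are synthetic, an ordinary least-squares line stands in for the Random Forest regressor, and every name (`split_75_25`, `r_squared`, `fit_line`, the `feature` and `depth` values) is hypothetical.

```python
import random


def split_75_25(rows, test_fraction=0.25, seed=0):
    """Shuffle the rows and hold out test_fraction of them for testing."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x (stand-in for the real model)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Synthetic (feature, depth) pairs; these are NOT data from the study.
data_rng = random.Random(42)
rows = []
for _ in range(200):
    feature = data_rng.uniform(0.0, 10.0)
    depth = 0.5 * feature + data_rng.gauss(0.0, 0.3)
    rows.append((feature, depth))

train, test = split_75_25(rows)
intercept, slope = fit_line([x for x, _ in train], [y for _, y in train])

# Score on the held-out 25%, reported as a percentage like in the abstract.
test_pred = [intercept + slope * x for x, _ in test]
test_r2 = r_squared([y for _, y in test], test_pred)
print(f"test R2 = {100 * test_r2:.0f}%")
```

Scoring separately on catchments that were never part of the split, as the study does with its six validation catchments, follows the same pattern: fit on the training rows only, then call `r_squared` on the unseen rows.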