### *2.4. Validation*

Results were analyzed using the confusion matrix and several measures derived from it: Accuracy, Kappa, Sensitivity, Specificity, and F1 score.

$$\text{Accuracy } Acc = \frac{TP + TN}{TP + FN + FP + TN} \tag{1}$$

$$\text{Sensitivity } Sn = \frac{TP}{TP + FN} \tag{2}$$

$$\text{Specificity } Sp = \frac{TN}{TN + FP} \tag{3}$$

$$\text{F1 score } F = 2 \times \frac{P \times Sn}{P + Sn} \tag{4}$$

$$\text{Precision (P)} = \frac{TP}{TP + FP} \tag{5}$$

where *TP* is a true positive prediction, *TN* is a true negative, *FP* is a false positive, and *FN* is a false negative prediction. The area under the receiver operating characteristic (ROC) curve (AUC), which offers an overall assessment of the model across all classification thresholds, was also evaluated.
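As an illustration of Equations (1)–(5), the following minimal sketch computes the measures from the four confusion-matrix counts, with *FN* denoting false negatives. The class name `ConfusionMatrix` and the function `roc_auc` are hypothetical names introduced here, not part of the study's code. The kappa formula is the standard Cohen's kappa, which is assumed (not confirmed by the text) to be the kappa reported, and the AUC is computed with the rank-based pairwise formulation rather than by integrating a ROC curve.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Binary confusion-matrix counts (hypothetical helper for illustration)."""
    tp: int
    fp: int
    fn: int
    tn: int

    @property
    def total(self) -> int:
        return self.tp + self.fp + self.fn + self.tn

    def accuracy(self) -> float:      # Equation (1)
        return (self.tp + self.tn) / self.total

    def sensitivity(self) -> float:   # Equation (2)
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:   # Equation (3)
        return self.tn / (self.tn + self.fp)

    def precision(self) -> float:     # Equation (5)
        return self.tp / (self.tp + self.fp)

    def f1(self) -> float:            # Equation (4)
        p, sn = self.precision(), self.sensitivity()
        return 2 * p * sn / (p + sn)

    def kappa(self) -> float:
        # Cohen's kappa (assumed definition): observed agreement corrected
        # for the agreement expected by chance from the row/column totals.
        n = self.total
        p_observed = self.accuracy()
        p_expected = (
            (self.tp + self.fp) * (self.tp + self.fn)
            + (self.tn + self.fn) * (self.tn + self.fp)
        ) / (n * n)
        return (p_observed - p_expected) / (1 - p_expected)


def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Requires at least one positive (1)
    and one negative (0) label; zero denominators are not handled here.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Illustrative counts only; these are not results from the study.
cm = ConfusionMatrix(tp=40, fp=10, fn=5, tn=45)
print(cm.accuracy(), cm.sensitivity(), cm.specificity(), cm.f1(), cm.kappa())
```

The pairwise AUC is quadratic in the number of samples, so it is suited to small checks rather than full rasters.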

### *2.5. Data and Study Area*

Data used in this project are from nationally available datasets provided by a variety of federal government organizations. All data used in this project are publicly accessible and can be found at open.canada.ca (accessed on 10 September 2021) or climate.weather.gc.ca/ (accessed on 20 August 2021). Five sites were selected across Canada, all with a history of flooding and all of which experienced flooding in the past ten years. The sites include southern British Columbia (BC), which experienced flooding due to an atmospheric river in 2021, and an area south of Lake Athabasca in northern Alberta (AB), which flooded due to heavy rainfall. Southern Manitoba (MB), Ontario (ON), and New Brunswick (NB) all have flood events due to the spring freshet along the Red River, Ottawa River, and Saint John River, respectively.

### **3. Results**

In this section, results are presented for (i) the single and multi-region model approaches, (ii) single models and the ensemble model across Canada, and (iii) a comparison of FS predictions to the historic flood events database.

### *3.1. Single and Multi-Region Model*

Results from the parallel Random Forest model (parRF) are shown in Table 2 for the multi-region and national approaches using local, regionally important variables and a compiled national set of important variables. For most of the measures in each of the study areas, the results are nearly identical between the local, self-selected factors and the national list of factors. The ON region is the exception, where an increase in model performance was found in all measures using the national list of important factors. Notably, in ON, overall accuracy increased from 0.89 to 0.92, and there was an increase of 0.07 in kappa and specificity, when the national list of factors was used. In BC, there is an increase of 0.01 in specificity and a decrease of 0.02 in both sensitivity and F1 score between the national and local lists of factors. The average of the regional models showed a slight increase in accuracy, kappa, and specificity when the national factor list was used.


**Table 2.** Metrics of the individual models and national model, parallel random forest results shown.

### *3.2. Single Model vs. Ensemble over Large Distance*

The national model approach was run on all the models listed in Table 1. The four RF models all perform well, with accuracy of 0.91 or 0.92 and ROC-AUC of 0.96–0.97. Similar results are found with the two boosting models, xgbDART and xgbTree, and the C5.0 decision tree model. The poorest results come from the mlp model, with an accuracy of 0.76, kappa of 0.52, and ROC-AUC of 0.85. The MARS earth model has an accuracy of 0.82 and kappa of 0.64, putting it at the lower end of the performance scale. The model outputs range from 0 to 1, where 0 is no flooding and 1 is flooding/wet pixels, and are multiplied by 100 to avoid storing float values in the final dataset.
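The scaling step can be sketched as follows. This is a minimal illustration, assuming the scaled values are rounded to the nearest integer and stored as unsigned bytes; the text states only that outputs are multiplied by 100, so the rounding rule and storage type are assumptions, and `to_percent` is a hypothetical helper name.

```python
from array import array


def to_percent(probabilities):
    """Scale model outputs in [0, 1] to integers in [0, 100].

    Values are stored in an unsigned-byte array (1 byte per value)
    instead of 8-byte floats. Rounding to nearest is an assumption.
    """
    out = array("B")  # "B" = unsigned char, valid range 0..255
    for p in probabilities:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range: {p}")
        out.append(round(p * 100))
    return out


# Illustrative values only; these are not outputs from the study.
scaled = to_percent([0.0, 0.37, 0.5, 1.0])
print(list(scaled), scaled.itemsize)
```

Storing one byte per pixel rather than an 8-byte float keeps a national-scale raster considerably smaller, at the cost of limiting the susceptibility value to a resolution of 0.01.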

To compare how the different ML models performed across the country, in areas distinct from the training sites, the national approach was applied to several ML models and the resulting maps were evaluated (Figure 2). As a reference, the extent of historic flooding for which there is a digital record is shown in Figure 2a. At a national scale, the RF and earth models produce somewhat similar susceptibility maps, though the earth model has higher predictions, especially in Nunavut (NU) and the Northwest Territories (NT), and into Saskatchewan (SK) and Manitoba (MB). The earth model computes very high susceptibility values along the western shore of Hudson Bay. This is not found in the other models, nor is it present in the historic record. In northern NU and NT, the RF model computes a relatively stable prediction without much variation between the islands or along the shorelines in NU. There are a few spots in western Yukon (YT) that have higher predictions, and these correspond to the locations of meteorological stations. All models capture higher susceptibility to flooding along the southern borders of SK and MB, which aligns with the historic record. The nnet model appears to do well with predictions where training data exist and provides relatively low susceptibility values in training-sparse regions: northern Quebec (QC), NU, and the area around the Canadian Rockies have very low susceptibility values. The svmRadial model shows peculiar 'rings' with higher predictions at the outer edges, covering most of NU and eastern NT. These areas may differ more from the labelled data than any others, and it is clear the svmRadial model had difficulty there.

**Figure 2.** (**a**) Historic flooding in Canada (from EGS Flood Archive and national flood hazard data layer), and results of different single ML models using national approach, (**b**–**f**) ensemble result.

### **4. Conclusions**

Developing a FS map across a nation as geographically large and diverse as Canada presents several challenges. In this work, ML algorithms and publicly available national datasets were used to map FS and identify regions more prone to flooding. A test of whether a single national model outperformed a mosaic of regional models found that the single national model produced better predictions. However, when a single ML model was extrapolated across the whole of Canada, limitations were found in several models, including SVM, NN, MLP, and RF. An ensemble approach ultimately produced the best FS map in comparison to historic flood maps, even though the confusion matrix statistics showed that the ensemble was not the best performer, with accuracy and ROC-AUC of 0.89 and kappa of 0.78. The resultant dataset provides the first continuous, national picture of flood susceptibility in Canada, intended to support the identification and priority setting of flood hazard mapping projects and flood awareness communication.

**Author Contributions:** H.M.: conceptualization; data curation; formal analysis; investigation; methodology; project administration; resources; software; supervision; validation; visualization; roles/writing—original draft; writing—review and editing. P.N.G.: data curation; writing—review. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Data are available at https://open.canada.ca/data/en/dataset/df106e11-4cee-425d-bd38-7e51ac674128 (accessed on 15 June 2022).

**Acknowledgments:** Natural Resources Canada Contribution Number: 20220100.

**Conflicts of Interest:** The authors declare no conflict of interest.
