*2.3. Spatial Mapping of Rainfall*

Initially, the relationship between altitude and rainfall height (mm) was evaluated on both an annual and a seasonal basis using different types of trendlines (linear, logarithmic, polynomial, power, and exponential). Moreover, the spatial distribution was determined by applying a multiple regression equation that takes into account not only the altitude but also the longitude and latitude. The multiple linear regression equation has the following form:

$$P = a + b_1 X_1 + b_2 X_2 + b_3 X_3 \tag{5}$$

where *P* represents the rainfall (mm), *a* is a constant, *b*<sub>1</sub> ... *b*<sub>3</sub> are the coefficients obtained for each independent variable, *X*<sub>1</sub> is the longitude (°), *X*<sub>2</sub> is the latitude (°), and *X*<sub>3</sub> is the altitude (m).
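The fitting of Eq. (5) can be sketched with ordinary least squares. The station coordinates and rainfall values below are hypothetical placeholders, not data from the study; real inputs would come from the rain-gauge network.

```python
import numpy as np

# Hypothetical station records: longitude (deg), latitude (deg),
# altitude (m), and annual rainfall (mm) -- illustrative values only.
stations = np.array([
    [21.50, 39.10,  120.0,  650.0],
    [21.62, 39.25,  480.0,  840.0],
    [21.75, 39.05,  900.0, 1050.0],
    [21.90, 39.30, 1350.0, 1310.0],
    [21.40, 39.40,  300.0,  720.0],
    [21.80, 39.15,  700.0,  960.0],
])

X = stations[:, :3]   # X1 = longitude, X2 = latitude, X3 = altitude
P = stations[:, 3]    # observed rainfall (mm)

# Design matrix with an intercept column for the constant a in Eq. (5).
A = np.column_stack([np.ones(len(X)), X])

# Ordinary least squares: solves for [a, b1, b2, b3].
coef, *_ = np.linalg.lstsq(A, P, rcond=None)
a, b1, b2, b3 = coef

def predict(lon, lat, alt):
    """Rainfall estimate (mm) from Eq. (5)."""
    return a + b1 * lon + b2 * lat + b3 * alt
```

The same fit is repeated for each season, yielding one set of coefficients per season.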

Furthermore, the geostatistical interpolation method of ordinary kriging (with a spherical variogram) was employed, using the ArcGIS 10.2 software. It should be noted that geostatistical methods become more reliable as the sample size increases. To this end, points were generated automatically on a 1 km × 1 km grid within the catchments, using the Fishnet command of the ArcGIS 10.2 Data Management toolbox. The rainfall height was then calculated for each point and for all seasons, based on the multiple regression equation described above and the values of the individual variables at each point.
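The grid-point generation can be sketched in numpy as an analogue of the ArcGIS Fishnet command (the study itself used ArcGIS; the bounding box below is a hypothetical catchment extent in a projected coordinate system, in metres).

```python
import numpy as np

# Hypothetical catchment bounding box in a projected CRS (metres).
xmin, ymin, xmax, ymax = 300_000.0, 4_300_000.0, 310_000.0, 4_312_000.0
cell = 1_000.0  # 1 km x 1 km resolution, as in the Fishnet grid

# Cell-centre coordinates, analogous to the points Fishnet generates.
xs = np.arange(xmin + cell / 2, xmax, cell)
ys = np.arange(ymin + cell / 2, ymax, cell)
gx, gy = np.meshgrid(xs, ys)
points = np.column_stack([gx.ravel(), gy.ravel()])

# For each centre, the longitude, latitude, and altitude (from a DEM)
# would then be retrieved and fed into Eq. (5) to estimate rainfall.
```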

Finally, cross-validation was performed in order to compare the rainfall spatial interpolation results derived from ordinary kriging with those of other spatial interpolation methods, such as inverse distance weighting (IDW), radial basis function (RBF), and universal kriging (UK), combined with different variogram models (spherical, exponential), using the Geostatistical Wizard tool of ArcGIS [40].

Cross-validation refers to a family of model validation techniques for assessing how the results of a statistical analysis will generalize to an independent dataset. It is mainly used in settings where the goal is prediction and one wants to estimate how accurately a predictive model will perform in practice. In a prediction problem, a model is typically given a dataset of known data on which training is run (the training dataset) and a dataset of previously unseen data against which the model is tested. The goal of cross-validation is to test the model's ability to predict new data that were not used in estimating it, in order to flag problems such as overfitting and to give an insight into how the model will generalize to an independent dataset.
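For scattered stations, cross-validation is commonly done leave-one-out: each station is withheld in turn and predicted from the others. The sketch below, a simplified stand-in for the ArcGIS Geostatistical Wizard, pairs leave-one-out validation with a basic IDW interpolator (the `power` parameter and the data are illustrative assumptions).

```python
import numpy as np

def idw(xy_known, z_known, xy_query, power=2.0):
    """Inverse distance weighting interpolation at query points."""
    # Pairwise distances between query points and known stations.
    d = np.linalg.norm(xy_known[None, :, :] - xy_query[:, None, :], axis=2)
    d = np.maximum(d, 1e-12)          # guard against division by zero
    w = 1.0 / d ** power
    return (w @ z_known) / w.sum(axis=1)

def loo_cross_validation(xy, z, power=2.0):
    """Leave-one-out CV: predict each station from all the others."""
    preds = np.empty_like(z)
    for i in range(len(z)):
        mask = np.arange(len(z)) != i
        preds[i] = idw(xy[mask], z[mask], xy[i:i + 1], power)[0]
    return preds
```

Comparing the vector of leave-one-out predictions with the observed values yields the error indexes described next.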

The root mean square error (RMSE) and the mean error (ME) were used as evaluation indexes in this case study. The mathematical description of these indexes is given below:

$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - x_i)^2} \tag{6}$$

$$ME = \frac{1}{n} \sum_{i=1}^{n} (y_i - x_i) \tag{7}$$

where *n* is the number of observations, and *x*<sub>i</sub> and *y*<sub>i</sub> are the observed and interpolated rainfall values, respectively, for *i* = 1, 2, ..., *n*. The RMSE is considered one of the most reliable indexes because it depicts the deviation from the true values rather than from the mean value, as in the case of the standard deviation. The RMSE weights the variations (residuals) between the estimated and observed values, whereas the mean error measures the average magnitude of the errors. The mean error is the most natural and unambiguous measure of average error magnitude [41,42]. RMSE, on the other hand, is one of the most widely used error measures [43].
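Eqs. (6) and (7) translate directly to code; the sketch below follows the convention above (*y* interpolated, *x* observed).

```python
import numpy as np

def rmse(observed, interpolated):
    """Eq. (6): root mean square error of the interpolated values."""
    return np.sqrt(np.mean((interpolated - observed) ** 2))

def mean_error(observed, interpolated):
    """Eq. (7): mean error (average bias) of the interpolated values."""
    return np.mean(interpolated - observed)
```

Because positive and negative residuals cancel in Eq. (7), the mean error indicates systematic over- or under-estimation, while the RMSE reflects the overall spread of the residuals.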
