1. Introduction
Reliable regional high-resolution data for precipitation and extreme precipitation have become increasingly important since they are essential for assessing the accuracy and correcting the biases of global and regional climate models, which in turn helps us make more accurate predictions about the future. These predictions are essential for climate change risk and vulnerability assessments, which are legally required under the new EU Taxonomy Regulation and will be used by businesses and lawmakers to make important decisions. Precipitation modelling and downscaling techniques are used to produce higher-resolution datasets, which are useful in predicting precipitation extremes and droughts. The frequency and intensity of precipitation are important to policymakers because extreme precipitation disrupts farming and causes natural hazards like floods and mudslides. Moreover, higher-resolution data can also be used in hydrological and other engineering models to make predictions about the future frequency and intensity of climate change-related hazards.
In the Greek region, precipitation and extreme precipitation are very difficult parameters to simulate due to their distribution and, in Greece’s case, due to their rarity in certain months. The Greek region, despite the large number of islands, is dominated by mountains in the Greek mainland (Figure 1). The large mountain ranges give precipitation in the country a distinct longitudinal shift [1]. Additionally, the country experiences intense interseasonal variability [2], which is in line with the Mediterranean climate. The Mediterranean basin experiences extreme precipitation events [3] in the winter and frequent droughts in the summer [4]. Modelling the interseasonal variability and extreme precipitation events properly requires high resolution: Rauscher et al. [5] found that, in the greater European region, the accuracy of the prediction increases when resolution is increased. In their study, they also found that dry days and precipitation extremes are affected by the resolution of the models to a greater extent than standalone precipitation. This is extremely important for the Greek region because in summer, precipitation is extremely rare, and the Greek region records a large number of consecutive dry days. In winter, on the other hand, mainland Greece records a large number of extreme precipitation events, with most of them occurring in western Greece.
Additionally, in Greece, precipitation exhibits very high spatial variability, as documented by a wealth of research on this subject [6,7]. This makes precipitation modelling in Greece quite hard because of the high spatial resolution needed in order to adequately describe the precipitation variability in the area. The difficulty in modelling precipitation with high spatial variability was also noted by Lee et al. [8]. They found that models with coarser resolutions fail to capture the variability of daily precipitation, and additionally, coarser models were not able to simulate precipitation intensity correctly. The different landscapes and microclimates that appear in the Greek region in combination with the high interseasonal variability create a challenging environment that is perfect for testing the efficacy and performance of climate models and downscaling methods.
Climate change is expected to intensify the hydrological cycle and change the spatial distribution of precipitation [9]. According to IPCC reports [10], the Mediterranean is expected to become warmer and drier by the end of the 21st century. Furthermore, the warming is expected to be more intense in the Mediterranean region than the global mean [11]. In the Greek region, there is a decreasing trend in the precipitation recorded since 1950 [12,13]. More recent research, on the other hand, has revealed an increasing trend in extreme precipitation in the region [14], which is in line with research conducted in the Mediterranean region [15,16].
Extreme precipitation results in flooding, which causes more economic losses than any other natural disaster. Greece has a long history of flooding [17,18], mainly caused by intense rainstorms, as snowmelt floods are not common in the region. In recent years, deforestation and urbanization have played an important role in increasing the severity and destructive power of floods. The Attica region has been extremely urbanized in the last 70 years, and as a result, it records the majority of the damage from floods [19]. Most of these floods are caused by extreme weather phenomena [20], and their destructive power has been amplified by the intense urbanization of the city [21]. Therefore, changes in extreme precipitation frequency and intensity are very important to policymakers in order to construct mitigation plans. For example, high-resolution precipitation datasets can inform urban planners about the capacity and the extent to which they must implement stormwater control and other relevant management systems to avoid flooding [22], while the number of wet days alongside precipitation can be used to analyze droughts [23].
The regression kriging method has been used to simulate precipitation before [24,25,26,27]. Recently, the method was paired with algorithms more complex than a simple regression [28]. Gradient boosting is usually combined with decision trees to estimate susceptibility or risk [29,30] and with regression trees in order to fill gaps in meteorological time series [31] or to make short-term predictions [32]. More recently, further research has indicated that gradient boosting can be used to statistically postprocess weather forecast data [33] to achieve better prediction accuracy or to process satellite data to estimate total precipitable water [34]. Random forests are also used to predict precipitation [35,36] and temperature [37], indicating their efficacy in predicting climatic variables. Taking into account the previous success of gradient boosting and random forests in simulating precipitation in particular, the aim of this study is to pair the algorithms and utilize ERA5 precipitation data to create a high-resolution precipitation dataset. We believe that pairing two state-of-the-art algorithms with the newest ERA5 reanalysis dataset will enable us to provide high-resolution data that improve on the ERA5’s performance over the Greek region. To achieve this, we utilized the land reanalysis, which has a higher resolution, and additionally, we added the AURELHY principal components as independent variables. Furthermore, we also examined extreme precipitation parameters and wet days in order to further analyze specific components of precipitation over the region.
3. Results
The results section is organized according to the variable studied. For each variable, a table shows the improvements made in certain metrics (R², correlation, and RMSE) in the downscaled dataset compared to the standalone ERA5 reanalysis dataset. The metrics were calculated for the downscaled dataset (from here on referred to as HGRP) and the standalone ERA5 reanalysis dataset against data from the gauges, and all formulas used for the calculation of metrics or in our methodology are presented in Formulas (S1)–(S11) in the Supplementary Materials Section. Additionally, maps are given on a seasonal and on an annual basis in order to showcase the differences between the datasets. The maps were constructed by calculating the mean for each dataset for the whole period of 1980–2010. For each station, the RMSE was calculated by the LOOCV methodology, and the difference between the reanalysis RMSE and the downscaled RMSE was mapped for each season and on an annual basis. In the maps, positive values represent an improvement in RMSE, while negative values represent a worsening RMSE.
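The metrics and the sign convention of the RMSE difference maps can be illustrated with a minimal sketch. The exact formulas are given in the Supplementary Materials and are not reproduced here, so the definitions below are assumptions: R² is taken as the coefficient of determination, correlation as the Pearson coefficient, and all station values are hypothetical.

```python
import math

def rmse(obs, sim):
    """Root mean square error of sim against obs."""
    return math.sqrt(sum((s - o) ** 2 for o, s in zip(obs, sim)) / len(obs))

def pearson(obs, sim):
    """Pearson correlation coefficient between obs and sim."""
    n = len(obs)
    mo, ms = sum(obs) / n, sum(sim) / n
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_s = sum((s - ms) ** 2 for s in sim)
    return cov / math.sqrt(var_o * var_s)

def r_squared(obs, sim):
    """Assumed definition: coefficient of determination, 1 - SS_res / SS_tot."""
    mo = sum(obs) / len(obs)
    ss_res = sum((o - s) ** 2 for o, s in zip(obs, sim))
    ss_tot = sum((o - mo) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot

def rmse_difference(obs, era5, hgrp):
    """Reanalysis RMSE minus downscaled RMSE; positive means HGRP improves."""
    return rmse(obs, era5) - rmse(obs, hgrp)

# Hypothetical monthly totals (mm) at one gauge, not taken from the study.
gauge = [100.0, 50.0, 10.0, 80.0]
era5 = [120.0, 70.0, 30.0, 100.0]
hgrp = [105.0, 55.0, 15.0, 85.0]

diff = rmse_difference(gauge, era5, hgrp)
print(f"RMSE difference: {diff:.1f} mm")
print(f"R2 of HGRP: {r_squared(gauge, hgrp):.3f}")
print(f"Correlation of HGRP: {pearson(gauge, hgrp):.3f}")
```

A positive value of `diff` corresponds to a station that would appear as an improvement on the maps.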
3.1. Precipitation Total
In the winter precipitation totals, there are obvious improvements in the resolution of the data, while at the same time, the spatial variability of precipitation is also improved. In particular, the model increases the precipitation in the mountainous Peloponnese region and in the western part of mountainous Crete, while at the same time, it decreases precipitation in the Thessaloniki region. From the RMSE differences map (Figure 3), we see that in the mountainous regions of Crete, where HGRP adds precipitation, there are significant improvements in the RMSE. In the Thessaloniki region, where the model reduces precipitation, there is also an improvement in the RMSE. In the mountainous Peloponnese region, there are not enough stations to validate the increase in precipitation that HGRP simulates. Additionally, in the RMSE differences map, we can see that there are also improvements in the western part of the Pindos mountain range, where the bulk of precipitation usually occurs in the Greek region. From Figure 2, we can observe that HGRP simulates more precipitation in the northwestern mountain range while also improving on the spatial distribution of the precipitation simulated by the ERA5 reanalysis. Most of the increases in RMSE occur in the central part of Greece, where the model reduces precipitation. Additionally, the largest increase in RMSE is recorded on the island of Samos, where HGRP does not seem to have changed the original ERA5 precipitation as much; however, the precipitation simulated on the island is significantly higher than on the rest of the islands near Samos in both the ERA5 and our model.
In spring, the model decreases precipitation in the Greek region. More specifically, precipitation is reduced in mountainous western Greece, in the Katerini region, and in the northern Peloponnese region. In the map of RMSE differences (Figure 3), we observe considerable improvements in the RMSE in those regions, in particular in the northwestern mountainous regions. In the northeastern parts of Greece, precipitation is kept about the same, while the RMSE differences remain mixed. In the islands of the Aegean, precipitation is about the same, with the model increasing precipitation in the islands in the east, in particular Mytilene, Samos, and Rodos. In the Aegean islands, RMSE differences are slightly negative, indicating a slightly worse performance. Finally, in Crete, the model increases precipitation, particularly in the mountainous western regions. In Figure 3, the RMSE differences for Crete are mixed; in particular, in the eastern part of the island, many improvements are observed, while in the western part of the island, there are slightly negative values.
In summer, the overall precipitation is also reduced by the model, in particular in the mountainous regions of northwestern Greece and in the Katerini region. In Thrace, precipitation is also reduced but not to the same extent as in the previously mentioned regions. In the Aegean islands and Crete, the model is in agreement with the ERA5, and precipitation is essentially zero. At this point, it is worth noting that the ERA5 reanalysis dataset and the model perform the worst in these summer months (Table 1), and this can be explained by the overall lack of precipitation in Greece during summer, which makes simulating precipitation very challenging. This is also reflected in the RMSE difference maps, where the results are generally mixed. In the Aegean islands and in the mountainous northwestern regions, there are some slightly negative values, while in the northeastern part of the region, the RMSE suffers most. In the rest of the Greek regions, the values are positive overall. In Table 1, the improvements in R² and correlation are also presented on a seasonal basis. The model’s outperformance over the ERA5 dataset occurs primarily during winter and autumn, coinciding with periods of heavy precipitation. Conversely, during the typically arid summer months, the model’s output closely resembles the reanalysis data due to the limited rainfall.
In autumn, the HGRP and ERA5 precipitation totals are more similar to each other than in the rest of the seasons. The only region where precipitation is meaningfully increased is the Crete region, where the biggest improvements in the RMSE are also recorded. In the rest of Greece, there are improvements in the spatial distribution of precipitation, in particular in the western Peloponnese region. The RMSE differences remain slightly positive in mainland Greece while slightly negative in the Aegean and Ionian islands.
On an annual basis, the results are similar to the ones recorded in autumn. There are improvements in the spatial distribution of precipitation in the Peloponnese region, and there is a meaningful increase in precipitation in the Crete region. Additionally, the biggest RMSE improvement on an annual basis is also recorded in the Crete region. In the mountainous regions of Pindos in northwestern Greece, the overall RMSE differences are positive, while for the Aegean and Ionian islands, the RMSE differences are mostly slightly negative. Samos remains an outlier, as the precipitation simulated by both the model and the ERA5 reanalysis is obviously significantly different from the rest of the islands, and the RMSE difference is also very different from the rest of the islands.
Although significant improvements are recorded at individual stations, the results are not uniform across stations, so the overall metrics remain roughly the same as those of the reanalysis dataset (Table 1 and Table S5). The downscaling method seems to perform worst in the summer months while performing the best on an annual basis in every metric studied.
3.2. Wet Days
In Figure 4, when comparing the ERA5 wet days and the wet days from HGRP in all seasons, we can see a reduction in wet days across all areas of Greece, while at the same time, these reductions are translated into improvements in all areas and in all seasons, as we can see from the RMSE differences maps (Figure 5). More specifically, in winter, the pattern that is simulated by the ERA5 remains in the HGRP; however, the amount simulated is massively different. In the northwestern mountain range, the ERA5 dataset reaches 43 wet days, whereas in the HGRP, the wet days do not surpass 33. The reduction in wet days in the northwestern mountain range is validated by the RMSE differences maps, where RMSE is reduced in that region. Furthermore, east of the Pindos mountain range, in the greater Larissa and Lamia regions, there are some of the largest RMSE improvements, where HGRP also greatly reduced the number of wet days simulated by the ERA5 reanalysis dataset. Both the ERA5 and the HGRP simulate more wet days in the western part of Crete than in the rest of Greece, although the HGRP wet days are much fewer than those of the ERA5. In that particular region, we can see that there are also very large improvements in the RMSE (Figure 5). In the Aegean and Ionian islands, we can also see that there are improvements in the RMSE across all stations, where the number of wet days is also reduced.
In spring and summer, HGRP seems to keep the spatial distribution of wet days from the ERA5 dataset, but the overall volume of wet days is also massively reduced. In the RMSE differences map, we can see massive improvements in both spring and summer across all stations and areas. In spring, RMSE improves across the northwestern mountain ranges the most, while some slightly negative differences are observed in the eastern part of Crete. In spring, the RMSE differences stay mostly positive across the Aegean and the Ionian islands, while in summer, most differences are positive, but there are a lot of differences that are at zero. This is due to the fact that precipitation is extremely rare; therefore, any difference between HGRP and the ERA5 dataset is extremely small.
In autumn, the ERA5 simulates a large volume of wet days across all mainland Greece, with the exception of Attica. In the Aegean islands, the wet days are noticeably fewer than in the rest of the region, with the exception of the western part of Crete. The bulk of wet days occurs in the northwestern region in both the ERA5 and the HGRP. From Figure 5, we can see that RMSE is improved across all Greece. The largest improvements are recorded in the western part of Crete and the greater Lamia region. These areas also showed very large improvements in the winter RMSE differences.
On an annual basis, the ERA5 dataset, similarly to autumn, simulates a lot of wet days across mainland Greece, while in the Aegean islands, with the exception of Crete, the wet days are noticeably fewer. The HGRP also reduces the volume of wet days simulated by the ERA5, as in the rest of the seasons. The spatial distribution of the wet days remains very similar to the ERA5 wet days, but the elevation peaks seem to retain more wet days in HGRP, especially in Crete, where the mountain peaks are better defined in HGRP than in the ERA5 reanalysis. The RMSE differences (Figure 5) show improvements across all Greece. The RMSE is improved the most in the northeastern part of Greece and in the northwestern mountain ranges. There are some slight decreases in RMSE in the eastern part of Crete, where HGRP does not change the number of wet days significantly compared to the reanalysis dataset, especially in comparison to the rest of Greece. In the Aegean and Ionian islands, the RMSE mostly improves but to a lesser extent compared to the rest of Greece. This occurs because the islands also record less precipitation in general.
The downscaling model improves the metrics across all seasons and especially on an annual basis (Table 2 and Table S6). Although the R² and correlation show small improvements, the RMSE improves dramatically. On an annual basis, the RMSE halves, while it also shows significant improvement in every season and month. In comparison to the precipitation totals, the R² and correlation stay more uniform across all months and seasons, whereas in Table 1, there is a notable reduction in the summer months. Additionally, the ERA5 reanalysis dataset has an obvious bias towards more wet days.
3.3. Number of Days Precipitation Exceeds 10 mm
In the number of days that precipitation exceeds 10 mm (from here on P10), shown in Figure 6, we can observe that HGRP changes the spatial distribution of P10 significantly. In winter, the ERA5 reanalysis simulates most of P10 over the northwestern mountainous regions and in the western part of the Peloponnese region. The HGRP adds a lot of P10 over the western part of the Peloponnese region and in Crete; in the RMSE differences maps (Figure 7), we can see that in those particular areas, some of the largest improvements are recorded. In Crete in particular, very large improvements are observed across the whole island, and some of the largest improvements in all of Greece are observed in the mountainous regions of the island. In the Ionian and Aegean islands, the HGRP added some events in limited areas; however, from Figure 7, we observe that the overall results are mixed in terms of RMSE improvements. In the Attica and Evia regions, in Figure 6, we observe an increase in P10 and a change in the spatial distribution, which is translated into positive RMSE differences (Figure 7). In northern Greece and in the greater Larissa and Lamia regions, RMSE differences remain negative, and HGRP performs the worst. The changes in P10 made by HGRP there are not as large as the ones mentioned above, and the RMSE differences are rather small, as they remain less than one in most stations.
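As an illustration of how count-based indices such as P10 are derived from a daily series, the following minimal sketch counts the days above a threshold. The daily values are hypothetical, the 1 mm wet-day threshold is an assumption that is not stated in this section, and treating "exceeds" as strictly greater than the threshold is likewise an assumption.

```python
def days_exceeding(daily_mm, threshold_mm):
    """Count days whose precipitation is strictly greater than threshold_mm."""
    return sum(1 for p in daily_mm if p > threshold_mm)

# Hypothetical daily totals (mm) for one station; not data from the study.
daily = [0.0, 0.4, 12.5, 3.0, 25.1, 0.0, 10.0, 18.2]

WET_DAY_MM = 1.0  # assumed wet-day threshold, not taken from the text

wet_days = days_exceeding(daily, WET_DAY_MM)
p10 = days_exceeding(daily, 10.0)
p20 = days_exceeding(daily, 20.0)  # same helper with the 20 mm threshold

print(wet_days, p10, p20)
```

Because each higher threshold selects a subset of the days counted at the lower one, the counts can only decrease as the threshold rises, which is consistent with the rarer indices being harder to simulate.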
In spring, the overall pattern of P10 is retained in the HGRP dataset; however, the greater resolution aids in capturing the Greek orography better. In Crete and the Aegean islands, HGRP seems to increase the events simulated by the ERA5 dataset. In Figure 7, we observe that the RMSE differences in those regions are positive, and in Crete in particular, we see the best improvements. In the Peloponnese, Attica, and northwestern Greece, the RMSE differences are positive, while in the greater Larissa and Thessaloniki regions, RMSE differences are mostly negative. In these particular areas, although there are no significant changes in the volume of P10, there are changes in the resolution of the data simulated.
In summer, it is worth noting that days where precipitation exceeds 10 mm are extremely rare and mostly occur in the northern mountainous regions of Greece. From Figure 6, we can see that HGRP reduces P10 in the Katerini region and the northeastern part of Greece; however, the station coverage in those areas is not sufficient to draw a conclusion regarding the improvement in RMSE. In the rest of the mainland Greek region, there are improvements in the spatial distribution of P10. In the Peloponnese region, we can see that the P10 generated by HGRP is more spread out compared to the standalone ERA5 dataset. In the Peloponnese region, there are also major improvements in RMSE (Figure 7). In the Aegean islands, the ERA5 dataset and HGRP are in agreement, as P10 is essentially zero on most occasions in those areas.
In autumn, the changes in the spatial distribution made by the HGRP are similar to the ones recorded in winter. HGRP increases P10 in the western Peloponnese region and in Crete, and in those regions, we also observe large RMSE improvements. Increases are also observed in the Katerini region, and there is a better spatial distribution in HGRP in northern Greece. These improvements are also observed in the RMSE differences map, where in most of northern Greece, RMSE improves in the HGRP. In the greater Attica and Evia regions, we observe an increase in P10 in the HGRP, and the RMSE differences are also mostly positive in the region. In the Aegean islands, HGRP slightly increases P10, with the RMSE also improving, as shown in Figure 7.
On an annual basis, the changes made by HGRP are mostly similar to the ones recorded in autumn and winter, which is to be expected since P10 occurs the most in those seasons. From Figure 6, we can see that HGRP adds P10 in the western Peloponnese region and in Crete. In those particular areas, the biggest improvements in RMSE are recorded, similarly to autumn and winter. In the northwestern part of Greece, large improvements in RMSE are also recorded. These improvements are due to the higher spatial resolution of the HGRP and the better spatial distribution achieved. In Attica and Evia, there is also an increase in P10, which leads to improvements in the RMSE in those areas. In the Aegean and Ionian islands, the HGRP adds P10 in most islands and especially in the eastern part of the Aegean. In those areas, Figure 7 also shows positive RMSE differences. In northern Greece and the greater Larissa and Lamia regions, there are fewer changes made by the HGRP, and this is also evident from the very small changes in RMSE recorded in those areas, as shown in Figure 7.
The number of days precipitation exceeds 10 mm also shows improvements in the metrics (Table 3); however, these improvements are not as numerous as the ones for the number of wet days. The improvements are mainly in the RMSE and less so in R² and correlation. The reanalysis dataset here does not have an obvious bias towards more precipitation over 10 mm, in contrast to the number of wet days, where there was an overall bias towards more wet days. Rather, it seems to lack P10 in particular regions, which is further supported by the maps of the RMSE difference, where the mountainous regions of the Peloponnese, Crete, and western Greece record most of the improvements (Figure 7).
3.4. Number of Days Precipitation Exceeds 20 mm
Firstly, days where precipitation exceeds 20 mm (from here on P20) are very rare in the Greek region, and they mostly occur in the winter and autumn months, while in summer in particular, they are almost nonexistent. In winter, as shown in Figure 8, the ERA5 simulates almost all P20 in northwestern Greece and to a smaller extent in the Peloponnese and the northeastern Greek region. In the HGRP, P20 displays a greater spatial extent in all of Greece, and there is an increase in P20 events over the Peloponnese and Crete in particular. In these areas, there is also the largest improvement in RMSE, as we can see in Figure 9. In Attica and the Evia regions, there is an increase in P20 in the HGRP dataset, which improves RMSE across those areas. In the Aegean and Ionian islands, HGRP increases P20 across most islands. Samos is again an outlier, recording much higher amounts of P20 in the ERA5 reanalysis than the rest of the islands. These changes mostly result in improvements in RMSE in the Ionian and Aegean islands.
In spring, HGRP is very different from the standalone ERA5 reanalysis dataset. The events in HGRP are much more spread out than in the ERA5 reanalysis, where most of the events are simulated in the northwestern mountainous regions. In HGRP, the P20 in the northwestern mountainous regions is more defined throughout most of the region, which does improve RMSE in the southern part of the Pindos mountain range (Figure 9). In the Peloponnese region, there is also an increase in P20 in HGRP, especially in areas with higher elevation in the region, which improves RMSE in the area. In Crete, the standalone ERA5 dataset does not capture the orography of the island and, as a result, does not simulate enough P20 on the island, whereas in the HGRP, there is an increase in P20, particularly in higher-elevation areas, which improves RMSE in the area (Figure 9). Additionally, HGRP increases P20 in the northeastern part of Greece and also improves the spatial distribution, capturing the underlying elevation of the region better. As seen in Figure 9, there is an improvement in the RMSE in northeastern Greece from the changes made by the HGRP. In the Aegean and Ionian islands, HGRP adds P20 in most areas, which translates into mostly positive RMSE changes. In summer, HGRP follows the pattern of the rest of the seasons, where P20 is more spread out in the Greek region; however, because P20 events are almost nonexistent in summer, this spreading is erroneous. This is further confirmed by the RMSE differences, which are overwhelmingly negative in summer, making this the only season and parameter where HGRP does not improve on the standalone ERA5 dataset.
In autumn, similarly to winter, most P20 simulated by the ERA5 reanalysis dataset is centered around the northwestern mountainous regions and to a lesser extent in the Peloponnese region. In the HGRP, there is an increase in P20 in the Peloponnese region and Crete, which improves RMSE significantly (Figure 9). In Attica and Evia, there is an increase in P20 in the HGRP and a greater spatial distribution of the events, and this is validated by an improvement in the RMSE metrics in these regions. Additional increases of P20 are also recorded in the northeastern part of Greece, where the RMSE differences remain mixed, and in the Aegean and Ionian islands, where RMSE mostly improves.
On an annual basis, the changes made by HGRP are similar to winter and autumn, which is to be expected as most P20 happens in those seasons, similarly to P10. HGRP increases P20 in the greater Greek region, with the most substantial increases happening in the western part of the mountainous regions of the Peloponnese, in Crete, and in northeastern Greece. In the Peloponnese region, the changes result in improvements in RMSE; similarly, in the mountainous regions of Crete, the RMSE improves the most in Greece. By contrast, in the northeastern part of Greece, the metrics deteriorate because of the increase in P20 in the region. In Attica and Evia, there is also an increase in P20, which results in positive changes in RMSE. In the Ionian and Aegean islands, there are mostly increases by the HGRP, which improve the metrics in those regions. Samos remains an outlier, where the ERA5 dataset simulates much more P20 on the island when compared to the rest of the region, and this is transferred over to the HGRP.
In Table 4, there are improvements in all metrics across all seasons except summer. The metrics recorded for P20 are worse than for the rest of the parameters studied because these events are very rare in the Greek region. It is safe to conclude that the ERA5 reanalysis simulates far fewer days where precipitation exceeds 20 mm than are observed, and this is improved by the HGRP. From Figure 8, we observe that in order to improve the results of the ERA5, HGRP increases P20 in all seasons, and unfortunately, this is not correct in summer, which is why it is the only season where the model underperforms the ERA5 dataset. Additionally, in the rest of the seasons, there are significant improvements in the spatial resolution of the ERA5 dataset, which are not adequately reflected in the metrics.
4. Discussion
Firstly, we can confidently conclude that this research further validates the results from Wu et al. [38] and Jiang et al. [39], who found that the ERA5 overestimates the frequency and duration of precipitation and underestimates its intensity. With our methodology, we were able to significantly improve the accuracy of the predictions by improving both the values simulated by the model and their geographical distribution. The largest improvements were in the number of wet days and the number of days where precipitation exceeded 10 and 20 mm. In contrast, the total precipitation did not exhibit any large improvement in the metrics studied; instead, most of the improvement came from the higher resolution achieved by the downscaling model and the improved geographical distribution of the precipitation.
More specifically, in the number of wet days, there is a wide difference between the simulated and the real values. The main improvement in the number of wet days was the reduction in RMSE and less so in the R². This occurred because the geographical distribution of wet days simulated by the ERA5 is correct, but the quantity of wet days simulated is extremely inflated. In the number of days precipitation exceeded 10 mm, the improvements were smaller compared to the wet days and less conclusive in terms of the bias in the model. The downscaling method does seem to increase the quantity of the events over the whole Greek region and, in particular, western Greece and Crete. Furthermore, the ERA5 reanalysis significantly underestimates the number of days precipitation exceeded 20 mm. Although we were able to achieve significant improvements with our downscaling methodology, the annual R² achieved by our model is still only 0.31 compared to the 0.24 R² of the ERA5 reanalysis. In comparison, the downscaled annual R² is 0.45 and 0.58, and the ERA5 reanalysis R² is 0.38 and 0.56 for number of days precipitation exceeded 10 mm and wet days, respectively. These results indicate that, ultimately, the main driver of the downscaling method is still the ERA5 data and that the rarer and more extreme a parameter is, the harder it is to simulate in general.
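The reasoning that a correct spatial pattern with an inflated quantity mainly affects RMSE can be illustrated with a small numerical sketch. All values are hypothetical, and the example uses the Pearson correlation as a stand-in for a pattern-based metric; whether the study's R² behaves identically depends on the formula in the Supplementary Materials, which is not reproduced here.

```python
import math

def rmse(obs, sim):
    """Root mean square error of sim against obs."""
    return math.sqrt(sum((s - o) ** 2 for o, s in zip(obs, sim)) / len(obs))

def pearson(obs, sim):
    """Pearson correlation coefficient between obs and sim."""
    n = len(obs)
    mo, ms = sum(obs) / n, sum(sim) / n
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_s = sum((s - ms) ** 2 for s in sim)
    return cov / math.sqrt(var_o * var_s)

# Hypothetical wet-day counts at five stations: the simulated pattern follows
# the observations closely but is inflated by roughly 30 days everywhere.
obs = [20.0, 35.0, 50.0, 65.0, 80.0]
sim = [52.0, 63.0, 82.0, 93.0, 112.0]

errors = [s - o for o, s in zip(obs, sim)]
bias = sum(errors) / len(errors)                 # mean overestimate
corrected = [s - bias for s in sim]              # remove the constant bias
error_var = sum((e - bias) ** 2 for e in errors) / len(errors)

# Mean squared error splits into squared bias plus error variance.
mse = rmse(obs, sim) ** 2
print(f"bias = {bias:.1f}, MSE = {mse:.2f}, bias^2 + var = {bias**2 + error_var:.2f}")
print(f"RMSE before = {rmse(obs, sim):.2f}, after = {rmse(obs, corrected):.2f}")
print(f"r before = {pearson(obs, sim):.4f}, after = {pearson(obs, corrected):.4f}")
```

In this toy case, removing the constant overestimate reduces the RMSE sharply while leaving the correlation unchanged, which mirrors the pattern described for the wet days.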
In terms of geographical distribution, we can see that the ERA5 is not able to correctly simulate the precipitation that occurs in the mountainous regions of the Peloponnese and Crete. In Crete in particular, the ERA5 reanalysis underestimated every variable studied, with the exception of the number of wet days. It is important to note that the gauge dataset we used does have a lot of high-elevation stations in the mountainous regions of Crete, which could be one of the reasons that the differences are so pronounced in the region. However, from the maps of the RMSE difference, we can see that some of the biggest improvements occur in that region. The overall results are also further validated by the R² differences maps (Figures S1–S4) presented in the Supplementary Information section and the metrics presented in Tables S5–S8. In the rest of Greece, although there is an adequate number of stations covering the whole region, we could not include any very high-elevation stations in the mountainous areas of western Greece, which could influence the overall results since the bulk of precipitation in Greece occurs in the mountainous regions of western Greece. If such data do become available, they could be the subject of future research.
Overall, the improvements in the metrics can be explained by the increased resolution of our dataset, which matters most in the maritime regions of Greece: Table 5 shows that the bulk of the improvements occur at island stations. The increased resolution allows the model to better depict the geography of the islands, which greatly increases the accuracy of the predictions. Achieving high-accuracy results for the maritime regions of Greece is especially important since they are highly water-stressed areas that are expected to be greatly impacted by future temperature increases.
5. Conclusions
The goal of this study was to create high-resolution (1 km × 1 km) monthly datasets of precipitation totals, number of wet days, and number of days on which precipitation exceeded 10 and 20 mm, using regression kriging with a histogram-based gradient boosting regression tree. To achieve this, we used climatic data from the newest land-based reanalysis dataset, geospatial variables from a high-resolution digital elevation model, the AURELHY principal components, and the North Atlantic Circulation Index as the independent variables. As dependent variables, we used observations from 97 precipitation gauges of the Hellenic National Meteorological Service for the period 1980–2010. To compare the standalone ERA5 dataset with our downscaling methodology, we used iterative leave-one-out cross-validation (LOOCV). The downscaling was carried out on a monthly basis: both the gauge data and the ERA5 data were first aggregated to monthly values and then downscaled.
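The leave-one-out comparison can be sketched as follows. This is only an illustration of the validation loop, not the method used in the study: a one-predictor least-squares line stands in for the gradient boosting and kriging steps, and the station values and the names `era5_wet_days` and `gauge_wet_days` are hypothetical. Each station is withheld in turn, the model is fitted on the remaining stations, and the withheld station is predicted from its reanalysis value.

```python
from math import sqrt

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b * x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def rmse(pred, obs):
    """Root-mean-square error between predictions and observations."""
    return sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))

def loocv_predictions(xs, ys):
    """Predict each station from a model fitted on all other stations."""
    preds = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]   # leave station i out
        train_y = ys[:i] + ys[i + 1:]
        a, b = fit_line(train_x, train_y)
        preds.append(a + b * xs[i])     # predict the withheld station
    return preds

# Hypothetical annual wet-day counts at six stations (not data from the study).
era5_wet_days = [150.0, 130.0, 170.0, 120.0, 160.0, 140.0]
gauge_wet_days = [92.0, 78.0, 101.0, 70.0, 99.0, 85.0]

loocv = loocv_predictions(era5_wet_days, gauge_wet_days)
rmse_raw = rmse(era5_wet_days, gauge_wet_days)   # standalone reanalysis
rmse_cv = rmse(loocv, gauge_wet_days)            # cross-validated correction

print(f"RMSE raw reanalysis: {rmse_raw:.1f}")
print(f"RMSE LOOCV model:    {rmse_cv:.1f}")
```

Because every prediction comes from a model that never saw the target station, the cross-validated RMSE is a fair counterpart to the RMSE of the standalone reanalysis at the same stations.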
Our results confirmed biases that were also observed in previous papers [38,39], whereby the ERA5 reanalysis overestimates the frequency of precipitation and underestimates its intensity. We found that the number of wet days simulated by ERA5 was greatly inflated, while the number of days with precipitation exceeding 10 mm and, in particular, 20 mm was underestimated. One reason for these biases may be the coarse resolution of the model, which does not capture the intense geographical variation that precipitation exhibits in the Greek region. With our methodology and the higher resolution we achieved, we corrected some of these biases, especially in the Greek islands, which recorded most of the increases in accuracy (Table 5). More specifically, for the precipitation totals, the main improvements came from the increased resolution and a better spatial distribution of precipitation; at the island stations, RMSE decreased by 7.7%, while it remained mostly the same in the rest of Greece. In contrast, for the number of wet days and the number of days on which precipitation exceeded 10 mm, there were large improvements in the metrics studied. For the number of wet days, the RMSE halved on an annual basis, with additional large reductions on a monthly and seasonal basis. This was achieved by reducing the number of wet days simulated by the ERA5 dataset. For the number of days on which precipitation exceeded 10 mm, there were improvements in both the metrics studied and the geographical distribution of the events. Finally, for the number of days on which precipitation exceeded 20 mm, the improvements in the metrics were smaller because such events are very rare. However, it is reasonable to conclude that P20 was underestimated in the ERA5 reanalysis and that HGRP was able to improve its accuracy. The differences in the variables studied can be attributed first to the coarse resolution of the ERA5 dataset; in addition, the geographical variability of a small and complex region like Greece poses unique challenges in simulating its climate variables, which cannot be addressed by more generalized models designed to depict the European and global climate. Future versions of the ERA dataset with higher resolution can be expected to yield more accurate results.
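The frequency and intensity biases described above can be made concrete with a small sketch of how the four variables are derived from a daily series. The 1 mm wet-day threshold and the use of "greater than or equal to" for the 10 and 20 mm counts are assumptions (this section does not restate the definitions), and both daily series are hypothetical. The two series have the same total, yet the smoother one records far more wet days and no heavy-precipitation days.

```python
def precipitation_indices(daily_mm, wet_threshold=1.0):
    """Aggregate a daily precipitation series (mm) into four indices.

    Thresholds are assumptions for illustration: a wet day has at least
    `wet_threshold` mm; P10 and P20 count days with at least 10 and 20 mm.
    """
    return {
        "total": sum(daily_mm),
        "wet_days": sum(1 for p in daily_mm if p >= wet_threshold),
        "p10": sum(1 for p in daily_mm if p >= 10.0),
        "p20": sum(1 for p in daily_mm if p >= 20.0),
    }

# Hypothetical ten-day series (not data from the study).
# Gauge: few, intense events. Reanalysis-like: frequent, weak events.
gauge = [0.0, 0.0, 24.0, 0.0, 0.0, 0.0, 15.0, 0.0, 0.0, 3.0]
reanalysis_like = [2.0, 3.0, 9.0, 6.0, 1.5, 2.5, 8.0, 5.0, 2.0, 3.0]

gauge_idx = precipitation_indices(gauge)
rean_idx = precipitation_indices(reanalysis_like)

print("gauge:     ", gauge_idx)
print("reanalysis:", rean_idx)
```

A dataset can therefore match precipitation totals well while still misrepresenting wet days, P10, and P20, which is consistent with the totals showing the smallest improvements after downscaling.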
Geographically, the largest improvements were recorded in Crete, where we found that the ERA5 reanalysis underestimated every variable studied, with the exception of wet days. The mountainous regions of the Peloponnese also recorded a large improvement, while the smallest improvements occurred in western Greece. It is important to note, however, that the gauge dataset had a large number of stations in the mountainous regions of Crete; future research would therefore benefit from additional stations in the mountainous regions of western Greece in particular, where elevations are higher and the bulk of precipitation occurs.
Overall, the main driver of the variables studied continues to be the ERA5 variables; however, our downscaling methodology achieved significant improvements in the metrics compared to the standalone ERA5 dataset. The largest improvements were recorded for wet days and the number of days on which precipitation exceeded 10 mm, while there were small to no improvements in the precipitation totals. The improvements stem mainly from a better spatial distribution of precipitation, wet days, P10, and P20 in the Greek region, which can be attributed to the higher resolution we achieved compared to the standalone dataset, and from a reduction of the wet-day bias clearly exhibited by the ERA5 dataset. The improvements in P10 and P20 indicate that extreme precipitation indices provided by the standalone ERA5 dataset should be bias-adjusted before being used for hazard assessments such as flooding, drought, and landslide susceptibility. Caution is also warranted when using ERA5 for analysis in small island regions, since its resolution may not be adequate depending on the size of the region. Our research offers new insights into the accuracy of the ERA5 dataset in Greece and improves upon it, and our results suggest that the algorithm tested is a good fit for creating precipitation datasets.