1. Introduction
The hydrological cycle is influenced by several factors, such as type of soil, land use and land cover and climatic and weather conditions. Specifically, land cover and climate variability are important features affecting hydrological processes, causing significant changes to overland flow and evapotranspiration [
1,
2]. Increased overland flow can be linked to an increase in floods, as the main reasons behind the increase in flood events (i.e., from around 100 in 1980 to 321 in 2010 in Europe [
3]) are population growth, climate change and human activities such as deforestation and change in land use patterns. Urbanization, which is generally linked to flash floods, is also increasing across the globe, from 30% of urbanized areas in 1950 to 55% in 2018 [
4], while in 2010 75% of Europe was considered urban [
5]. The use of yearly and spatially distributed land cover data that depicts these changes increases the accuracy of hydrological models [
6], which are generally used for water resources and flood risk management. Moreover, for rural and ungauged areas, where the influence of humans on the hydrological cycle is higher [
7], such analysis is important. However, it is costly and time-consuming to obtain updated land cover data in rapidly changing systems. This is the case in Greece, where only 6% of people lived in urbanized areas until 1821 [
8] and by 2017 around 80% of the area was urbanized [
9].
One way to obtain information is through crowdsourcing, a process in which novel low-cost data can be generated with the support of citizens [
10]. The Scent project (
https://scent-project.eu/, accessed on 1 January 2022) is an example of using crowd observed data [
11]. Scent is a European Union (EU) H2020 research project that intends to engage citizens in the collection of data and make them the ‘eyes’ of the decision-makers. Data are also made freely available by different institutions, such as Copernicus (
https://www.copernicus.eu/en, accessed on 1 January 2024), the European Union’s Earth Observation Programme. Copernicus produces various maps at global, European and local levels, of which CORINE (coordination of information on the environment; available at
https://land.copernicus.eu/pan-european/corine-land-cover, accessed on 1 January 2024) is a general land cover map and Urban Atlas is a dedicated sub-product with emphasis on urban areas.
One of the pilot areas considered in the Scent project was the upstream part of the Kifissos catchment, in Greece. In line with the national trend, the urban area in this region increased rapidly; the catchment is now about 68% urbanized, and urbanization continues [
12]. Almost 3.1 million people are living in Athens nowadays [
13], which drives the growth of urban areas and contributes to flooding problems. The catchment’s hydro-meteorological conditions result in low or non-existent river flows for most of the year. However, short, intense precipitation events trigger floods in the catchment [
14]. Therefore, the Kifissos catchment is an example of where updated information is important. Through the Scent project, new data for Kifissos became available and can be applied to improve the hydrological models of the catchment. Similarly, the land cover maps produced by Copernicus can be used and assessed for their ability to provide updated information for models. Hence, the quality and applicability of these data sources need to be compared to determine how well each captures the hydrology of the region.
This research explored the influence of varied land cover data sources in the hydrological modelling of the upstream part of Kifissos catchment by comparing gridded and lumped models. These two representations required different parameterizations, for which land cover sensitivity analysis was carried out.
2. Materials and Methods
2.1. Study Area
The total area of Kifissos catchment is 374.6 km² [
15]. The present study focuses on the upper 136 km², upstream of Athens, Greece (
Figure 1). Several mountains surround the river basin: Parnis (1400 m) in the north, Penteli (1100 m) in the north-east, Hymettos (1000 m) in the east and Aigaleo (400 m) in the west. The basin outlet is at the Saronikos Gulf [
15]; given the elevation of the surrounding mountains, the catchment contains steep slopes. Kifissos catchment has a Mediterranean climate with an average annual rainfall of 332.2 mm. The main river is 33.7 km long, and the river network is sparse [
16]. The climatological, geomorphological and anthropogenic factors of the catchment are such that the river is dry for most of the year; however, it conveys very large volumes of water during short and intense precipitation events [
14].
A study with Landsat imagery, conducted by Chrysoulakis et al. [
17], found that the urban area in the Athens basin increased by about 30% over 20 years (from 1988 to 2007). This basin used to have six natural river networks draining to a lake; however, rapid urbanization did not take these natural water bodies into account, and they were, among other things, canalized underground or built over. This had consequences during severe flood events, such as heavy damage to property and loss of life [
18]. In 2019, 68% of the catchment was urbanized, which further increased the flood risk in Athens. Bathrellos et al. [
16] studied the flood event of 22 February 2013, during which rainfall reached 100 mm in five hours, the highest in the last 50 years, causing blockades on highways, the metro and train stations, electricity supply shortages and fatalities. On 24 October 2014, due to extreme rainfall, the west side of the catchment was heavily affected.
2.2. Overall Methodology
For the study, the Hydrologic Modeling System (HMS) suite, developed by the Hydrologic Engineering Center (HEC) of the USACE (United States Army Corps of Engineers, Davis, CA, USA), was used. HEC-HMS has already been widely tested, as reported in [
19]. Multiple studies show its applicability in different climate regions.
A lumped model uses parameters that represent spatially averaged characteristics of a hydrological system. Such a model represents only the time variation of a catchment; space is reduced conceptually to a single point. When more detailed spatial information can be included in a model, the catchment can be divided into cells (grids), each of which acts as a lumped hydrological model. The contributions of all cells are added together to obtain the response of the whole basin [
21]. In HEC-HMS, differences in the digital elevation model (DEM) may lead to different sub-basin delineations, which would produce different levels of spatial detail. This could lead to more insights about hydrological processes at the local scale. The number of sub-basins (spatial detail) depends on the data availability and modelling objectives in each study.
In total, 12 hydrological models were developed in this study using the HEC-HMS software, six lumped and six gridded. Each of these models was set up with 21 sub-basins to capture the spatial variability of catchment characteristics. The focus here is on variations in land use land cover (LULC) input data and their impact on the model outputs.
Table 1 presents an overview of the specific differences between these models:
Each developed model has an identifier consisting of M0 or M1, a letter for model structure and parametrization (L for lumped and G for gridded), and a letter showing the land cover dataset used in the model (C stands for CORINE, E stands for European Union Urban Atlas and S stands for Scent). Each model instance was built with two different imperviousness maps, hence the identifiers 0 and 1. One imperviousness map was based on LULC data and the other was obtained from online sources. Based on data availability, the simulation period was 1 July 2017 to 30 September 2019.
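The naming scheme can be illustrated with a short sketch that enumerates the 12 identifiers. The function names `model_ids` and `describe` are hypothetical helpers introduced here for illustration only; they are not part of HEC-HMS or of the study's workflow.

```python
from itertools import product

# Codes taken from the identifier description; the helper names are hypothetical.
IMPERVIOUSNESS = ("0", "1")
STRUCTURE = {"L": "lumped", "G": "gridded"}
LAND_COVER = {"C": "CORINE", "E": "Urban Atlas", "S": "Scent"}


def model_ids():
    """Return all identifiers of the form M<imperviousness><structure><land cover>."""
    return [f"M{i}{s}{c}" for i, s, c in product(IMPERVIOUSNESS, STRUCTURE, LAND_COVER)]


def describe(model_id):
    """Decode an identifier such as 'M0LC' into its three components."""
    return {
        "imperviousness_map": model_id[1],
        "structure": STRUCTURE[model_id[2]],
        "land_cover": LAND_COVER[model_id[3]],
    }


if __name__ == "__main__":
    for mid in model_ids():
        print(mid, describe(mid))
```

Two imperviousness maps, two model structures and three land cover datasets give the 12 model instances, six lumped and six gridded.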
2.3. Available Data
2.3.1. Land Use Land Cover (LULC) Maps
The European Union’s Earth Observation Programme (Copernicus) provides the CORINE land cover inventory every four to six years at European level. In this study, the 2018 map was used, with 17 land classes at 100 × 100 m cell size in the case study area. This study also investigates the value of a sub-product provided by the same institution, the Urban Atlas, which contains details of urban features in vector form with 22 land classes. Lastly, the Scent project involved citizens who used a gaming application to take pictures of land cover features, in campaigns organized from September 2018 to June 2019. Together with high-resolution satellite imagery, these pictures were used to produce a land cover map with a deep learning algorithm. The generated map has a cell size of 40 × 40 cm with four land classes: bare soil, forest, agricultural land and concrete. By analyzing these three land cover maps (
Figure 2), it can be seen that the upstream portion of the basin is more covered with forest, vegetation and agriculture, whereas the downstream is mainly urban with built-up areas. The main difference among these datasets is the distribution and area of land classes, as well as the resolution. The Urban Atlas has a more detailed characterization in classes and CORINE has the largest pixel size, whereas Scent has a smaller pixel size but very few land classes.
Using these three LULC datasets, each with different spatial resolutions, affects the models differently depending on the type of model. In the considered lumped models, the spatial resolution impacts the area proportions of each land cover class within sub-basins, which in turn may influence how certain sub-basin parameters are weighted by land cover class, as detailed in
Section 2.5.2. In some cases, the spatial resolution might be irrelevant if both coarse and high-resolution LULC datasets result in similar land cover class proportions. In the gridded models, there are processes calculated within a 500 × 500 m grid, with most sub-basins covered by 2 to 3 grid cells. As a result, increasing the model resolution allows for greater refinement in hydrological process representation. The impact of spatial resolution on model results is further discussed in the comparison between lumped and gridded models.
2.3.2. Other Data Types
A hydrologic soil properties map was taken from soil information provided by the European Soil Data Centre (ESDAC) [
22]. The European Soil Database (ESDB) was developed in collaboration with the European Soil Bureau Network. Information on the soil texture was used for parameterizing the hydrological models.
The National Observatory of Athens makes available daily precipitation information in Greece, and nine stations were identified near the study area. The north-west side of the basin does not have a rainfall station, and only three stations are inside the basin, fairly well distributed. Averaged monthly data for potential evapotranspiration (ET) were taken from the estimation made by Tegos et al. [
23], based on data from a meteorological station located in Athens, around 15 km from the study area. They used varied calculation methods, from which we adopted the Penman–Monteith values since they were the best estimates of potential ET mentioned in their study.
Water depth time series were obtained from three stations by the Scent project (
https://Scent-harm.iccs.gr/, accessed on 1 November 2020); almost 50% of the data are missing for the studied period. The water depth values were converted to discharges using Manning’s equation and measured river cross-sectional data. The telemetric data were recorded at 15 min intervals and were converted to average daily discharge. The locations of the stations (Kokinosmilos, Monastiri and Dekeleia) are presented in
Figure 1. The digital elevation model (DEM) used for the gridded hydrological model has a resolution of 5 × 5 m. It was provided by the Scent project, which obtained it upon request from Greek authorities.
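The conversion from water depth to daily discharge can be sketched as follows. This is a minimal illustration that assumes a rectangular cross-section; the study used measured cross-sections, and the width, slope and roughness values in the usage example are hypothetical, as are the function names.

```python
def manning_discharge(depth_m, width_m, slope, n):
    """Discharge (m3/s) from Manning's equation, Q = (1/n) * A * R^(2/3) * S^(1/2).

    Assumes a rectangular channel (an assumption of this sketch).
    """
    if depth_m <= 0.0:
        return 0.0
    area = width_m * depth_m                    # flow area A (m2)
    wetted_perimeter = width_m + 2.0 * depth_m  # bed plus two banks (m)
    hydraulic_radius = area / wetted_perimeter  # R = A / P (m)
    return (1.0 / n) * area * hydraulic_radius ** (2.0 / 3.0) * slope ** 0.5


def daily_mean(values):
    """Average the readings of one day, skipping missing readings (None)."""
    valid = [v for v in values if v is not None]
    return sum(valid) / len(valid) if valid else None


if __name__ == "__main__":
    # Hypothetical 15 min depth readings (m); None marks a missing record.
    depths = [0.20, 0.25, None, 0.30]
    flows = [None if d is None else manning_discharge(d, 4.0, 0.01, 0.035)
             for d in depths]
    print(daily_mean(flows))
```

Skipping missing readings inside a day is one possible treatment of gaps; a day with no valid readings returns `None` rather than a discharge.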
2.4. Model Setup
The HEC-HMS software has four components in the lumped setup: basin model, meteorological model, control specifications and time series data. The basin model contains the basin’s characteristics and is further composed of canopy, surface, loss, transform and routing components. The canopy component includes the parameters of rainfall storage on the canopy, whereas the surface component deals with the amount of water stored in depressions on the ground. The loss component contains the variables moisture content and deficit of the conceptual linear groundwater reservoir that simulates the base-flow component. Transform converts the excess precipitation to runoff, and routing represents the conversion of surface runoff to the flow. The meteorological model provides the rainfall and ET data, and the control specifications define the start and end date and time of the simulation. Similarly, time series data contain the rainfall, ET and discharges. The gridded set-up has additional components: terrain data, grid regions and grid data (discretization for gridded models). The grid size of the input files was 500 × 500 m, which is the minimum grid size available for the gridded transform component. In this case study, the first lumped model was built based on an existing event-based model [
24], for which the basin component was modified to better fit continuous simulations and the current datasets.
A simple method applicable to both continuous and gridded basin models is the deficit and constant method, whereby most of the parameters could be calculated based on available data. The method considers the rainfall as the main input in the hydrological process and includes distinct treatments for pervious and impervious areas. When rainfall occurs, the canopy stores some of the precipitation, which either evaporates or infiltrates into the ground. For pervious areas, percolation is dependent upon the soil properties and only occurs if the soil is saturated, and similarly, evapotranspiration only occurs when there is no rainfall and there is moisture in the soil. The soil is represented by a single reservoir and base-flow is added separately. Excess precipitation is generated due to soil saturation. When soil gets saturated and there is still rainfall, then surface depressions retain some of the precipitation. The fluctuation in the linear reservoir is dependent upon the moisture content in the soil from the previous day (or initial condition) and addition due to infiltration or deduction due to evapotranspiration. Once evapotranspiration occurs, water is lost permanently. Over impervious areas, precipitation not intercepted by the canopy becomes excess precipitation. Impervious areas are defined by the percentage of imperviousness.
The model variables are defined by the following equations:
when there is no rainfall,
and when there is rainfall,
where
P = precipitation,
Pe = excess precipitation,
imper% = imperviousness percentage,
Ce = excess canopy,
I = infiltration,
MCt = moisture content fluctuation,
MCt-1 = remaining moisture content of previous day or initial condition,
Cs = canopy storage,
Cs-1 = canopy storage of previous day,
Ss = water available for surface storage,
Sm = maximum surface storage,
ET = evapotranspiration,
Pimper = excess precipitation due to imperviousness % and
Psoil = excess precipitation due to soil saturation.
Over pervious areas, Equations (5) and (6) for moisture content are used when the reservoir is not full. Once the reservoir is full, percolation into the groundwater starts to occur, which is another variable for permanent loss of water in this model. In the case of a full reservoir, if the excess canopy available for infiltration is higher than the soil percolation, then Psoil is activated: water becomes available for surface storage, and when the maximum capacity is reached, it starts to produce excess precipitation. Otherwise, the imperviousness alone is responsible for the generation of excess precipitation. In the case of gridded models, the input parameters were given in a gridded form and hydrological processes partially took place over the individual grid. The lumped model, in contrast, takes parameters per sub-basin. For both gridded and lumped models, the results of variables such as excess precipitation and infiltration were analyzed per sub-basin.
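The sequence of processes described above can be illustrated with a simplified daily water balance for one sub-basin. This is a sketch under stated assumptions, not the HEC-HMS implementation: all quantities are depths in mm, ET draws on the canopy before the soil (an assumed ordering), the emptying of surface storage is omitted, and the names `State` and `daily_step` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class State:
    canopy: float = 0.0    # Cs: water held in the canopy (mm)
    moisture: float = 0.0  # MC: moisture in the linear reservoir (mm)
    surface: float = 0.0   # Ss: water held in surface depressions (mm)


def daily_step(state, p, pet, canopy_max, deficit_max, surface_max,
               perc_rate, imperv):
    """Advance one day; return (new_state, excess_precipitation_mm)."""
    if p > 0.0:
        # Wet day: the canopy fills first; the overflow is the excess canopy Ce.
        fill = min(p, canopy_max - state.canopy)
        canopy = state.canopy + fill
        ce = p - fill
        moisture = state.moisture  # no ET while it rains
    else:
        # Dry day: ET draws on the canopy first, then on soil moisture
        # (this ordering is an assumption of the sketch).
        et_canopy = min(pet, state.canopy)
        canopy = state.canopy - et_canopy
        ce = 0.0
        moisture = state.moisture - min(pet - et_canopy, state.moisture)

    # Impervious fraction: all excess canopy becomes excess precipitation.
    p_imper = imperv * ce

    # Pervious fraction (depths per unit pervious area).
    water = ce
    infiltration = min(water, deficit_max - moisture)
    moisture += infiltration
    water -= infiltration
    # Water is left only if the reservoir is full: percolation is then a
    # permanent loss, capped by the constant rate (mm/day here).
    water -= min(water, perc_rate)
    # What percolation cannot absorb fills the surface depressions.
    to_surface = min(water, surface_max - state.surface)
    surface = state.surface + to_surface
    water -= to_surface
    # The remainder is excess precipitation due to soil saturation.
    p_soil = (1.0 - imperv) * water

    return State(canopy, moisture, surface), p_imper + p_soil
```

With hypothetical parameters (2 mm canopy storage, 5 mm maximum deficit, 1 mm surface storage, 1 mm/day percolation and 50% imperviousness), a 10 mm rainfall on an empty system yields 4 mm of excess precipitation from the impervious fraction and 0.5 mm from soil saturation, which mirrors the dominance of imperviousness discussed in the text.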
2.5. Input Parameters
2.5.1. Canopy Method
The simple canopy method was adopted, in which water is stored on leaves when rainfall occurs and interception continues until the maximum capacity is reached. Due to the geographical position of the catchment, evapotranspiration was set to occur only in seasons with warm temperatures, which are dry periods. Rainfall in excess of the maximum leaf storage falls to the surface. The maximum canopy storage was calculated individually for all three (Scent, Urban Atlas and CORINE) land cover datasets. The canopy interception values were adopted from Verbeiren et al. [
25], who give the storage value per land cover type. For gridded models, raster maps were created by attributing canopy values to each land class cell. For lumped models, the area and canopy type of each land class were determined and the weighted average canopy value for each sub-basin was calculated, as described in the following equation:
$$C_{sb} = \frac{\sum_{i} A_{lc_i} \, C_{lc_i}}{A} \tag{7}$$
where
$C_{sb}$ = maximum canopy storage for sub-basin $sb$,
$A_{lc_i}$ = area of land class $i$,
$C_{lc_i}$ = maximum canopy storage of land class $i$,
$A$ = total area of the sub-basin.
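The area-weighted average used for lumped sub-basin parameters can be sketched in a few lines. The land class names, areas and canopy storage values in the usage example are hypothetical placeholders, not values from Verbeiren et al., and `area_weighted_mean` is a helper name introduced here.

```python
def area_weighted_mean(areas, values):
    """Area-weighted mean of a per-land-class parameter within one sub-basin.

    areas  -- mapping of land class to area inside the sub-basin (any area unit)
    values -- mapping of land class to the parameter value of that class
    """
    total_area = sum(areas.values())
    if total_area <= 0.0:
        raise ValueError("sub-basin area must be positive")
    return sum(areas[lc] * values[lc] for lc in areas) / total_area


if __name__ == "__main__":
    # Hypothetical sub-basin: areas in km2, maximum canopy storage in mm.
    areas = {"forest": 3.0, "urban": 1.0}
    canopy_storage = {"forest": 2.0, "urban": 0.4}
    print(area_weighted_mean(areas, canopy_storage))
```

The same helper applies to any parameter that is weighted by land class area, such as the crop coefficient or the imperviousness coefficient.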
For all LULC datasets, the maximum canopy storage is highest in the north-west corner of the basin and lowest in the downstream part.
The crop coefficient is a ratio that is multiplied by the potential ET to obtain the actual ET from the soil. Its value was obtained from Nistor et al. [
26]. The calculation process and the distribution of data followed a similar pattern to canopy maximum storage.
2.5.2. Surface Method and Loss Method
The surface method was represented by a simple surface process. Initially, surface storage was considered as dry, while the maximum was determined based on its relation to the sub-catchment slope type, according to Bennett [
27]. In the downstream part and some of the middle part of the catchment, the surface storage is at its maximum, whereas it is lower in other parts of the basin.
The deficit and constant loss method was used to represent the surface and sub-surface hydrological processes by means of one linear reservoir with four parameters: initial deficit, maximum deficit, imperviousness and percolation (constant) rate. The maximum deficit characterizes the maximum quantity of moisture that the linear reservoir can retain, stated as a depth, and is calculated from the curve number (CN), as per Equation (8).
$$S = \frac{25400}{CN} - 254 \tag{8}$$
where
S = maximum deficit (mm),
CN = curve number.
Curve numbers for different land covers were calculated and estimated from data and from the CN tables of the HEC-HMS technical reference manual [
28]. Initial deficit is the initial condition of the moisture depth of the reservoir, and it was taken as 20% of the maximum deficit. Based on the resulting inputs, the CORINE has a higher maximum deficit in the middle part of the basin, whereas for the Urban Atlas, this is noticed in its eastern area. In contrast, for Scent models, the downstream part of the basin presents the highest values. The maximum deficit for CORINE land cover is higher than the other two datasets.
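As a worked illustration, the sketch below derives the maximum and initial deficit from a curve number. It assumes that Equation (8) is the standard SCS curve number retention relation in millimetres, S = 25400/CN − 254; the function names and the example CN value are hypothetical.

```python
def max_deficit_mm(cn):
    """Maximum deficit S (mm) from the curve number, assuming the standard
    SCS relation S = 25400 / CN - 254."""
    if not 0.0 < cn <= 100.0:
        raise ValueError("curve number must be in (0, 100]")
    return 25400.0 / cn - 254.0


def initial_deficit_mm(cn, fraction=0.2):
    """Initial deficit, taken as a fraction (20% in the text) of the maximum."""
    return fraction * max_deficit_mm(cn)


if __name__ == "__main__":
    cn = 80.0  # hypothetical curve number
    print(max_deficit_mm(cn), initial_deficit_mm(cn))
```

Under this relation, a higher curve number (for example, a more urban land class) gives a smaller reservoir, which is consistent with the small reservoir reported for the urban sub-basin.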
2.5.3. Imperviousness and Constant Rate
All the rainfall occurring over the impervious portion and remaining after canopy storage is converted into direct runoff. As mentioned, two types of imperviousness percentage inputs were assessed in this study. The first was generated from the Copernicus imperviousness map: for lumped models, the average imperviousness of each sub-basin was calculated, and for gridded models, the cell size was upscaled from 100 × 100 m to 500 × 500 m. In this case, the same imperviousness inputs were used regardless of the land cover dataset. In contrast, the USGS imperviousness was calculated individually for each land cover dataset, using the imperviousness coefficients adopted from Tilley and Slonecker [
29]. For the lumped models, the same weighted-average calculation used for the other parameters was followed. For gridded models, raster files were created using the imperviousness coefficients. The land cover was upscaled to 500 × 500 m by averaging all the land class pixels located inside each grid cell and attributing the obtained value to imperviousness. All the maps share the same pattern: the downstream portion of the basin is more impervious than the upstream part.
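The upscaling by averaging can be sketched as a block mean over a raster. The example assumes that the fine raster dimensions are exact multiples of the aggregation factor (a factor of 5 takes 100 m cells to 500 m cells); the GIS tooling actually used in the study is not specified, and `block_mean` is a hypothetical helper.

```python
def block_mean(grid, factor):
    """Upscale a raster (list of rows) by averaging factor x factor blocks."""
    rows, cols = len(grid), len(grid[0])
    if rows % factor or cols % factor:
        raise ValueError("raster size must be a multiple of the factor")
    coarse = []
    for r in range(0, rows, factor):
        coarse_row = []
        for c in range(0, cols, factor):
            # Collect every fine cell that falls inside this coarse cell.
            block = [grid[i][j]
                     for i in range(r, r + factor)
                     for j in range(c, c + factor)]
            coarse_row.append(sum(block) / len(block))
        coarse.append(coarse_row)
    return coarse


if __name__ == "__main__":
    # Hypothetical imperviousness percentages on a 2 x 4 fine grid.
    fine = [[0.0, 100.0, 20.0, 40.0],
            [50.0, 50.0, 60.0, 80.0]]
    print(block_mean(fine, 2))
```

Averaging smooths out small, highly impervious patches, which is one reason the gridded and lumped representations of the same dataset can end up with different imperviousness.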
The constant rate is the percolation rate of the model in mm/h; its values were taken from the soil physical properties reported in the study conducted by Elnesr [
30]. For this analysis, textural information about the soil was needed, and that information was taken from the ESDB map.
2.5.4. Transform Method
The Clark unit hydrograph and ModClark methods were chosen to represent overland flow processes for lumped and gridded models, respectively. The time of concentration was calculated as the lag time, taken from the existing event-based model [
24], divided by 0.6 [
31]. Similarly, storage coefficients, which account for storage effects, were calculated by dividing the average observed flow (of the three observation stations) at inflection points on the falling limbs of several flow peaks by the time derivative of the flow. The time derivative was obtained as the difference between the flow at the point of inflection and the flow on the next day, divided by the time interval (1 day).
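The two transform parameters can be computed as in the following sketch, which follows the description above. The function names and the numerical values in the usage example are hypothetical.

```python
def time_of_concentration(lag_time):
    """Time of concentration from lag time (same time unit), Tc = lag / 0.6."""
    return lag_time / 0.6


def storage_coefficient(q_inflection, q_next, dt=1.0):
    """Clark storage coefficient from one falling limb.

    q_inflection -- flow at the inflection point of the falling limb
    q_next       -- flow one time step later
    dt           -- time step (1 day in the text); the result is in units of dt
    """
    dq_dt = (q_inflection - q_next) / dt  # magnitude of the recession slope
    if dq_dt <= 0.0:
        raise ValueError("flow must decrease after the inflection point")
    return q_inflection / dq_dt


if __name__ == "__main__":
    print(time_of_concentration(3.0))    # hypothetical lag time of 3 h
    print(storage_coefficient(10.0, 6.0))  # hypothetical flows in m3/s
```

In practice, the storage coefficient would be averaged over several flow peaks, as described in the text, rather than taken from a single event.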
2.6. Model Calibration and Validation
As the objective of this study is to investigate the influence of land cover on hydrological processes, parameters defined based on data were not calibrated. The lumped and gridded CORINE-based models (M0LC and M0GC) were calibrated and validated. Calibrated parameters were used in further model instances. Calibration of the following parameters was performed:
Transformation method: time of concentration and storage coefficient
Base-flow method: initial discharge, recession constant and ratio to peak
The total simulation time was about 27 months (1 July 2017 to 28 September 2019). Because many data were missing, the longest period with available data was about six months (28 September 2017 to 31 March 2018); hence, it was chosen for calibration. Similarly, the period of 14 September 2018 to 12 January 2019 was chosen for validation.
Calibration was performed manually, by comparing simulated and observed flow hydrographs at the three stations shown in
Figure 1: Kokinosmilos (J09), Monastiri (J12) and Dekeleia (J08). The calibration was guided by the physical characteristics of the case study and of each sub-basin: steepness, level of urbanization or forestation and depth to the groundwater table (GWT). The final calibrated values for lumped and gridded models are shown in
Table 2, and the results obtained after calibration and validation are shown in
Figure 3. For both approaches, lumped and gridded models, the obtained value for the ratio to peak for all sub-basins is 0.5 and the recession constant is 1. The base flow for lumped and gridded models matched the observations, whereas the observed peak flows are higher than the simulated ones in both the calibration and validation periods. Similarly, the starting time of the peak is the same for all models. The calibration of lumped and gridded models showed that almost all the parameters were the same except the initial discharges, which were higher for gridded sub-basins.
4. Conclusions
The analysis of the hydrology of three different sub-basins having different characteristics gave insights into three main questions, which are detailed below. This section then gives an assessment of the limitations and future perspectives.
How does land cover affect the structure and parameterization of Kifissos hydrological models?
The models describe hydrological processes that allow for canopy storage, infiltration, sub-surface storage with soil percolation (represented by a linear reservoir), evapotranspiration and excess precipitation. Canopy storage is different for each type of land cover; however, its influence in the models is small and does not affect the excess precipitation. All precipitation that is not stored in the canopy infiltrates into the ground through pervious areas. The infiltrated rainfall, actual ET and the maximum deficit (i.e., the reservoir size) control the moisture content of the linear reservoir. All these variables and parameters differ by land class. In a forested sub-basin (70% forest), ET accounted for 65% of the total precipitation, which is 15% more than for a sub-basin with mixed land cover and 4 times more than for an urban sub-basin. These differences in ET and in reservoir size cause soil saturation, and then percolation, to occur at different times. In the sub-basin with mixed land cover, the soil takes longer to saturate than in the urban and forested ones because it has the largest linear reservoir and considerable ET. The small reservoir of the urban sub-basin filled up quickly and remained full for longer, due to lower ET. When the soil is saturated, percolation is dominant and accounts for most of the reservoir losses during wet periods. This process is not influenced by land cover. Further, when the excess canopy was greater than the infiltration because the maximum percolation rate had been reached, surface storage occurred, which accounted for small volumes. The excess precipitation from pervious areas, i.e., from soil saturation, is very small regardless of land cover.
Excess canopy over impervious ground does not allow the water to penetrate the soil and is directly converted into excess precipitation. Therefore, the urban sub-basin, which is highly impervious, generates 69.9% excess precipitation, while the sub-basin which is less impervious (the forested one) produces 11.3% excess precipitation.
For all basins in all models, the generation of excess precipitation is highly dependent on the excess precipitation caused by imperviousness (and, consequently, proportional to imperviousness percentages themselves). The minor differences are caused by excess precipitation from soil saturation, which comes from the perviousness percentage of the ground surface.
How sensitive were the hydrological models to changes in land cover from different data sources?
The parameters that changed with the LULC datasets were canopy storage, imperviousness, maximum deficit, potential evapotranspiration and crop coefficient. Different parameters influenced infiltration, fluctuation in the linear reservoir and excess precipitation. From the results, the maximum deficit and imperviousness percentage play important roles, and both are lower in the Scent data. With the Scent data, reservoirs filled up earlier and generated more excess precipitation from soil saturation than with CORINE, which has a larger reservoir and a higher imperviousness percentage. Urban Atlas and CORINE behave similarly in terms of excess precipitation because their imperviousness percentages are similar; however, the linear reservoir fills up first in Urban Atlas because it is smaller than that of CORINE. Although the land cover datasets influenced hydrological processes in pervious areas by changing the size of the reservoir, the resulting contribution of excess precipitation from saturated soil is insignificant, as discussed for the previous research question. Therefore, in this analysis too, the dominant process for the generation of excess precipitation is imperviousness.
Hence, it is found that the main parameter playing a role for both sensitivity to land cover and parameterization was imperviousness.
What are the key differences between gridded and lumped model representation, under three land use land cover datasets?
The main difference between gridded and lumped representation comes from the semi-distributed and distributed patterns of the sub-basins. In lumped models, a weighted average of all the land classes within one sub-basin was taken, whereas in gridded models a separate value was given to each grid cell for the calculation of some parameters. The ET of the CORINE, Urban Atlas and Scent lumped models is higher than that of the gridded models in all cases, and the linear reservoirs of gridded models fill up earlier than those of the lumped models.
Similarly, due to rasterization to cell sizes of 500 × 500 m, the representation of the land classes deviates in gridded models, which results in different amounts of imperviousness and, consequently, of excess precipitation, especially in the case of Urban Atlas. The Urban Atlas gridded model is less impervious and has less excess precipitation than the lumped one. Furthermore, the excess precipitation due to soil saturation is insignificant for both models compared to that caused by imperviousness.
In conclusion, although the parametrization as a lumped and gridded model also affected the representation of hydrological processes in pervious areas, it was not relevant in terms of excess precipitation.
Future research work and perspectives
The present work shows that if the model were to be used for analyzing hydrological changes under climate change scenarios, the changes would be introduced in the forcings of the model, such as precipitation and temperature (potential evapotranspiration). Considering the focus of our study, which is LULC data, the most sensitive input in all models is imperviousness. Different imperviousness scenarios could be generated by changing the area covered by forest or by paved surfaces. The resulting excess precipitation could then be analyzed and compared across scenarios.
HEC-HMS itself is a physically based model; therefore, artificial intelligence methods cannot be used in the model itself, but HEC-HMS can add input to a hydrological artificial neural network (ANN) model, for example, to provide training data or in a hybrid set-up to provide improved accuracy. Examples of such studies are available in [
32].
Several improvements to the current model need to be explored further. For example, the ModClark transformation method was available only for grid cells of at least 400 m × 400 m (hence the selection of the 500 m × 500 m resolution), and the time step could be reduced from one day to a shorter one, provided that precipitation data are available at a finer time step.
Regular updating of land cover products is important [
33,
34] but challenging for physical process-based models [
35] and is not commonly carried out. Further research into LULC automatic updates and their inclusion in the model along with climate change scenarios would give further insights on the catchment response.