1. Introduction
Based on their spatial scale of process representation, hydrological models are generally categorized from lumped to distributed. In the former, the watershed is considered as a single unit that responds homogeneously to its spatially-averaged meteorological inputs, while the model parameters are only associated with the macroscopic properties of the watershed as a whole. In this context, the only means to establish a lumped modelling scheme of satisfactory predictive capacity is to infer its parameters through calibration.
In contrast, distributed models explicitly account for the spatial heterogeneities of physiographic characteristics, meteorological inputs, hydrological processes and boundary conditions, via discretization of the model domain in finely-resolved computational units (e.g., grid cells). The fundamental laws of hydraulics and semi-empirical hydrological formulae are applied at each spatial unit, which allows, in theory, to estimate all model parameters from field data. Due to their bottom-up basis, which takes advantage of the significant advances in understanding the hydrological processes at the micro (point, continuum)-scale, fully-distributed models are also identified as “physically-based” [
1].
The advantages and disadvantages of the two approaches have been widely discussed in the literature [
2,
3,
4]. The widely-accepted conclusion is that lumped models cannot represent the spatial variability of the basin processes, while fully-distributed models are data and time demanding. Nevertheless, despite their different backgrounds, both approaches suffer from uncertainties related to the estimation of their parameters. In fact, from the introduction of physically- or process-based models, the hydrological community has criticized them as overly complex, over-parameterized and difficult to use, while challenging their ability to apply parameters that are directly measured in the field (more generally, extracted through observable data), without some kind of calibration [
1,
5,
6,
7]. The reasonable argument behind this criticism is that the physical properties are measured at the point scale, while the parameters are assigned to the grid-cell scale, thus by definition conceptual. Hence, given that distributed models are comprised by many parameters, they are vulnerable to the well-known shortcomings of calibration, i.e., the equifinality problem [
8]. However, it is argued that in some situations, the use of process-based models varies from necessary to most appropriate, especially when knowledge of flow paths or distributed state variables and/or preservation of physical constraints is important [
6].
For many problems of everyday practice requiring flow predictions at a relatively small number of points across the river network, semi-distributed models are broadly recognized as a good compromise between lumped and fully-distributed approaches [
2]. The key concept is to divide the watershed into a number of sub-basins and propagate the runoff generated from each sub-basin through the river network (and similarly for the subsurface processes). Although at the sub-basin scale, the modelling remains lumped (except for certain cases, in which sub-basins are further divided into smaller units), at the watershed scale, the heterogeneities are partially accounted for by assigning different meteorological inputs and different properties to each sub-basin. Another advantage of semi-distributed schemes is the representation of flow routing, which is of key importance when fine time steps are employed (e.g., in flood modelling). We note that for flow routing, many approaches exist of varied complexity, i.e., from lumped black-box hydrological methods to advanced hydraulic simulation schemes. Apparently, the selection of routing approach takes into account the available data, which is a crucial aspect of the overall modelling procedure.
The configuration of a semi- or fully-distributed model and the associated level of complexity are determined by the user. This task involves two separate tasks, the discretization of the watershed, i.e., the model schematization, and the model parameterization. The former refers to the delineation of the spatial units (typically, grid cells in the case of distributed and sub-basins in the case of semi-distributed models), while the later refers to the spatial assignment of the model parameters [
4,
9]. Following, a review of current schematization and parameterization practices, emphasizing Hydrological Response Units (HRUs), is provided. This review will provide the background of well-established approaches that have been implemented within several popular hydrological packages, including the Soil and Water Assessment Tool (SWAT) [
10,
11]. All approaches, despite their differences in delineating HRUs, make use of geographical layers of watershed properties that affect runoff dynamics, in order to define areas with approximately homogenous response, and thus parameterization.
In order to provide a systematic and physically-consistent procedure for the delineation of HRUs, we take advantage of the runoff Curve Number concept (CN), which has been widely used as a representative indicator of the catchment’s behavior. Given that the hydrological and engineering community has great experience in estimating the CN parameter on the basis of easily-retrieved geographical information, we propose and test a methodology for delineating HRUs based on distributed CN maps, along with guidelines for its optimal use. Their formulation is an extension of the standard SCS approach, with the use of an empirical expression accounting for three major physiographic characteristics, by means of soil permeability, land use/land cover characteristics and drainage capacity indices. The map of CN classes is eventually used within model parameterization, to identify the essential number and spatial extent of HRUs and, consequently, the number of control variables of the calibration problem. The proposed approach aims, on the one hand, at reducing subjectivity introduced by the definition of HRUs and providing parsimonious modelling schemes, on the other. The key features of the methodology are demonstrated in a hydrological simulation of a heterogeneous river basin. In this example, parametrization approaches of varying complexity will be examined.
2. Overview of Schematization and Parameterization Approaches in Hydrological Modelling
During the history of hydrological models, two topics have been extensively discussed. In particular, the discretization of the watershed predetermines the aggregation patterns of spatial information, i.e., the size and shape of computational elements, and controls the associated topographic parameter values, such as slope, aspect, etc. On the other hand, different delineations of the stream network connectivity and hillslope size affect and can often misrepresent rainfall-runoff processes. Consequently, watershed subdivision can affect the model outputs [
12]. This relates to the ongoing debate about the optimal level of discretization to adequately represent the spatial heterogeneity of a watershed, a topic that has received considerable attention from the early steps of distributed modelling [
13]. For instance, Hromadka [
14] states that “Arbitrary subdivision of the watershed into subareas should generally be avoided. … an increase in the watershed subdivision does not necessarily increase the modelling ‘accuracy’ but rather transfers the model’s reliability from the calibrated unit hydrograph and lag relationships to the unknown reliability of the several flow routing submodels used to link together the several subareas”.
While it is common practice to partition a watershed into smaller units (e.g., sub-basins), it is not easy to identify a strictly optimal spatial scale. Nevertheless, it is accepted that the sub-watershed size should be adapted to the modelling objectives, as the latter determine the dominant hydrological processes to be considered in simulations [
15]. It is feasible to represent the landscape heterogeneity through small sub-watersheds; for large watersheds, however, a fine resolution is often prohibited, mainly by high data requirements. Therefore, in meso- and large-scale watersheds, significant simplifications and reductions are required, in order to aggregate spatial heterogeneity and associated parameters at various levels of watershed subdivision [
16].
Increasing the number of sub-watersheds definitely increases input data preparation, computational time and calibration effort; therefore, the sub-watershed-average inputs generated from gauging stations vary with different watershed subdivisions [
17]. Several authors discuss how variations in the distributed model inputs result in various levels of watershed subdivision. For instance, Li et al. [
18] report that the precipitation input pattern was significantly affected by watershed subdivision. Savenije [
19] proposed an approach between complex distributed and simple lumped modelling, aiming to find the right level of simplicity while avoiding over-simplification. He proposed using topography as means for classification, thus retaining maximum simplicity, whilst accounting for observable landscape characteristics. A new terrain descriptor, the Height Above the Nearest Drainage (HAND), based on SRTM-DEM data [
20], has also been employed in classifications, revealing a strong correlation between soil moisture conditions and topography (Donnelly et al. [
21] found the strongest relationships to be between upstream area, proportion of upstream agricultural land, elevation, mean precipitation and mean temperature). Gharari et al. [
22] assessed the performance and sensitivity of the HAND-based landscape classification framework compared with several other classification indicators. They reported that HAND and surface slope appeared to be stronger indicators for different dominant hydrological processes.
Over the years, many watershed subdivision practices have been developed and tested using a variety of well-established models. An obvious approach is to divide the watershed into its natural sub-watersheds, thus preserving the watershed’s natural boundaries, flow-paths and channels for realistic water routing [
23,
24,
25,
26,
27,
28,
29]. The concepts of critical source area [
30,
31,
32,
33,
34,
35,
36], threshold drainage area [
37] and aggregated simulation area [
38] have also been used to delineate sub-watersheds within semi-distributed models. Moving from semi- to fully-distributed modelling, more detailed concepts are employed, such as grid elements, Representative Elementary Areas (REAs) and Representative Elementary Watershed (REW). In particular, the equally-spaced grid refers to the subdivision of the watershed into grid elements, each representing the dominant land use and soil type. The effects of grid-cell size on surface runoff and model accuracy have been investigated, among others, by Bathurst [
39], Bruneau et al. [
40], Manguerra and Engel [
41], Molnar and Julien [
42], Tao and Kouwen [
43] and Zang and Montgomery [
44]. The REA concept, introduced by Wood et al. [
45] and further investigated by Sasowsky and Gardner [
46], considers a watershed as composed of numerous (infinite) points (REAs), at which evaporation, infiltration and runoff fluctuate. Finally, in the REW approach [
47,
48,
49,
50], the point-scale equations of mass, momentum and energy conservation are integrated over the REW and mapped on a large scale.
In the above approaches, the river basin is represented as an assembly of discrete entities of pre-determined geometry (typically, sub-basins, in the case of semi-distributed schemes, and grid cells, in the case of fully-distributed ones), where the heterogeneity of its physical characteristics is not explicitly accounted for. Therefore, the model parameterization is dictated by the discretization, which may result in unjustifiable complexity, due to the large number of associated parameters [
4,
9,
51]. A well-known example is the HEC-HMS (Hydrologic Engineering Centre—Hydrologic Modeling System) model, which considers different parameters per sub-basin; thus, their number increases linearly with the number of sub-basins.
From the mid-990s, attention has been given to physically-based parameterization approaches through the so-called Hydrologic Response Units (HRUs), which generally represent areas with similar land use and soil characteristics. Their implementation is quite simple, since there is no interaction or topological connection between the HRUs; thus, runoff from each HRU is calculated separately and summed together to determine the total runoff from each sub-basin. HRUs were introduced by Leavesley et al. [
52], who delineated topographic-stream-segment-based elements for storm hydrograph simulations with the PRMS (Precipitation Runoff Modeling System) hydrologic model. In their approach, an HRU was considered to be the equivalent of a flow plane or delineated as several flow planes. However, the key assumption of homogeneity was asserted by Flugel [
53], who defined HRUs as “distributed, heterogeneously structured areas with common land use and pedo-topo-geological associations controlling their unique hydrological dynamics”. This assumption is justified by the fact that the dynamics of hydrological processes within an HRU vary only by a small amount compared to the dynamics among different HRUs. The concept of homogeneous HRUs was applied to the River Bröl Basin, Germany, using the PRMS/MMS (Modular Modeling System) model [
54]. The basin was parameterized into 23 HRUs, over a total area of 216 km
2. It was found that HRUs are a reliable means for regional hydrological watershed modelling, allowing spatial up- and down-scaling [
53,
55]. Bongartz [
56] compared the topographic-based and homogeneous HRUs and reported that, for watersheds with areas less than 200 km
2, homogeneous HRUs provided better representation of the watershed processes.
A variation of the above concept requires threshold specification for land cover, soil and slope classes, which is then used to delineate HRUs [
57,
58]. More specifically, the watershed is divided into several sub-basins that are further divided into discontinuous land masses, delineated through aggregation, according to user-defined thresholds for land use, soil type and slope ranges within each sub-watershed; this is followed by a GIS-based spatial overlay scheme, resulting in a “unique combination” HRUs with homogeneous characteristics, representing percentages of the sub-basin area that contribute differently to the entire watershed responses. The use of thresholds results in apparent loss of information, so it is suggested to be applied only when the number of HRUs delineated results in acceptable computation costs [
59]. This approach has been adopted, among others, in SWAT. While the incorporation of HRUs has allowed SWAT the flexibility to adapt from field plots to entire river basins, the fact that HRUs are used in a non-spatial manner, i.e., they are not identified spatially within simulations, is regarded as a key weakness of the model [
60,
61].
Efstratiadis et al. [
9] proposed an alternative definition of HRUs, as the product of separate partitions of the basin that account for different properties, such as land use, soil permeability, geology, slope, etc. This product is known as the common refinement of the partitions, while the related procedure in GIS will be next referred to as the “union of layers”. This approach has been adopted in their modelling system HYDROGEIOS, in an attempt to make schematization and parameterization two clearly independent procedures. Using an appropriate classification of the key watershed properties, the number of HRUs and, consequently, the number of parameters are manually adjusted. In particular, the number of HRUs is determined as the product of the parent layer classes, i.e.,
N = ∏
nj, where
nj is the number of classes corresponding to the
i-th layer. It is worth mentioning that, in contrast to classical parameterization approaches, HRUs do not represent contiguous geographical areas (while sub-basins are by definition contiguous). Instead, they represent basin partitions with common characteristics, and thus common parameter values. The intersection of the sub-basin and HRU layers represents the minor geographical elements in the modelling procedure, driven by the rainfall and Potential Evapotranspiration (PET) of the corresponding sub-basin and responding according to the parameter values of the corresponding HRU. The shape of this non-contiguous element is of no interest, since at each time step, the runoff generated by HRUs is integrated over sub-basins and propagated to their outlet.
The key advantage of this approach is its flexibility, since there are no restrictions on the configuration of HRUs, while the availability of data (critical for calibration) does not influence the watershed schematization, i.e., the delineation of sub-basins: one may consider a large number of sub-basins, in order to provide a detailed spatial representation of runoff across the watershed, while keeping a parsimonious parameterization. However, the fact that the delineation of HRUs is disengaged from watershed schematization does not necessarily ensure efficient, physically-consistent and parsimonious parameterizations, as the modeler is still allowed to determine any combination of layers. In addition, by definition, the number of HRUs is equal to the product of different classes that are accounted for within the formulation of the aforementioned layers. If one wishes to reduce the number of HRUs, one must manually merge the initial HRUs, thus losing information and physical consistency. The above shortcoming was the motivation to develop an improved procedure for HRU delineation, based on the well-established curve number concept, which is even more flexible and, at the same time, easy to implement within any modelling scheme.
4. Case Study
The proposed method is tested in the hydrological simulation of the Nedontas River Basin, which is one of the aforementioned pilot areas. It is equipped with three flow monitoring stations. The simulation suite used was the HYDROGEIOS model, which is based on the HRU concept and is very flexible in the formulation of HRUs. The formulation of the HRUs was evaluated against alternative parameterizations and their impact on model calibration and predictive capacity. In this context, we employ two calibration experiments. In the first experiment, we use the CN map of the basin to delineate from one up to five HRUs, thus providing configurations of varying complexity, while in the second one, we contrast the proposed CN-based method with two other well-established HRU delineation strategies, i.e., the unique combination and the union of layers.
4.1. Modelling Approach
The modelling approach is based on the most recent version of the HYDROGEIOS software, which is a GIS-based tool implementing conjunctive simulation of surface and groundwater processes across river basins that may also be human-modified; in the latter case, the allocation of hydrological fluxes under anthropogenic interventions is represented through a system-oriented water management scheme. While the software was initially developed for monthly simulations [
4,
9], it has been upgraded to also support daily and hourly time steps, in order to represent fine-scale hydrological processes, particularly floods [
83].
The water fluxes are represented on the basis of a semi-distributed schematization of the river basin, which is divided into sub-basins, which are interconnected through the hydrographic network. The latter propagates the surface runoff generated from sub-basins, where precipitation and PET time series are assigned as meteorological inputs. The model parameterization is based on the HRU concept. At each basin partition, i.e., the intersection of sub-basin and HRU, we employ a conceptual model that transforms precipitation into actual evapotranspiration, percolation to the groundwater system and surface runoff. A brief description of the model processes is given in the
Appendix A; for details, the reader may refer to the associated publications [
4,
9] and technical documents [
83]. We remark that within HYDROGEIOS, we do not make use of the standard NRCS-CN approach, since the latter is not applicable for continuous simulation.
4.2. Study Area and Data
The study area is the Nedontas River Basin, upstream of the monitoring station “Latomeio (quarry) Baka”, which is located in the southeastern part of Peloponnesus, Greece, and drains into the Gulf of Messenia (
Figure 4). The basin is surrounded by Taygetos Mountain and has a drainage area of 118.4 km
2 and an average elevation of 770 m, reaching a maximum elevation of 1715 m and a minimum of 93 m at the Latomeio Baka outlet. Terrain slope ranges from almost flat on the lowland and riverside areas to 76% on the steep mountain slopes, with the mean slope calculated at 22%. Nedontas originates in the western slopes of Taygetos and is mainly formed by three tributaries: its headwaters comprise the Nedousa and Alagonia tributaries, as well as spring flows in their sub-basins; the upper reach of Nedontas is joined by the lower reach of Karveliotis stream emanating from the SE area of the basin. The length of the main river course is about 26 km; the river discharges to the Gulf of Messenia, traversing the city of Kalamata, a regional economic center of SW Peloponnese, with 55,000 inhabitants.
The broader area is characterized by a mild Mediterranean climate influenced by orography, with long dry summers from April–October and wet winters from November–March. Mean annual temperature ranges from 13 °C–19 °C, with July being the hottest month (26.4 °C) and January the coldest (10.2 °C). Mean annual precipitation varies from 600 mm at the south, 1500 mm on the mountain range and 800–1200 mm in the central and northern plains and hilly areas [
84]. The river discharge is characterized by significant seasonal variability, since during the dry period, the flow is occasionally interrupted. On a mean annual basis, the runoff arriving at the outlet is approximately 550 mm (2.1 m
3/s).
In the context of the aforementioned project (DEUCALION), the basin was equipped with a telemetry-based hydro-meteorological network, comprising four meteorological and three flow-gauging stations, installed at the basin outlet (Latomeio Baka) and upstream on the two main tributaries, Alagonia and Karveliotis [
85]. The input time series were meteorological and flow data for a three-year period (September 2011–April 2014), provided at 10- and 15-min intervals, respectively. The input geographical layers were a 25-m resolution DEM, to delineate the river network and associated sub-basins and to calculate their geometrical properties, as well as maps of classified geological formations and land cover. The geographical layers were used to formulate the CN map and define the HRUs.
4.3. Model Setup and Other Assumptions
The study area was divided into seven sub-basins, by setting a flow accumulation threshold of 10 km
2 and by setting additional nodes at the three flow station sites, as illustrated in
Figure 4. Their main properties are summarized in
Table 3. Spatially-averaged precipitation time series of hourly resolution were extracted at the sub-basin scale, using the Thiessen polygon method. For the assignment of PET time series over sub-basins, we employed a parametric radiation-based approach that uses extraterrestrial solar radiation (which is periodic function of latitude and time), temperature and three parameters [
86,
87]. Areal temperature data were also estimated through the Thiessen polygon method, while the model parameters have been derived by fitting the model against Penman–Monteith data from the neighboring meteorological station of Kalamata. Finally, hourly discharge time series, used within calibrations, were constructed based on raw hydrometric data, i.e., 15-min interval river stage observations and sparse flow measurements were used to construct rating curves at the three stations.
The groundwater network was formulated by considering a conceptual groundwater tank underneath each sub-basin and allowing hydraulic connections between adjacent tanks, as well as with a hypothetical tank located downstream of the basin, to implement the underground losses to the sea, which are a quite significant portion of the water balance of the basin.
In all experiments, the simulation period was divided into a 10-month calibration period (September 2011–June 2012) and a 22-month validation period (July 2012–April 2014). Given that the simulation begins on 1 September, the beginning of the wet season in Greece, we assumed negligible initial soil moisture in the upper and lower soil moisture tanks. For the groundwater tanks, we assigned reasonable initial level values, based on preliminary analyses and also taking advantage of macroscopic information (e.g., elevation of springs).
The model parameters were assigned as follows: (a) seven parameters for the rainfall-runoff component of each HRU; (b) one recession parameter for each sub-basin (seven in total); (c) one hydraulic conductivity for each groundwater cell (eight in total); and (d) a single leakage coefficient assigned to the downstream river segment accumulating the runoff losses due to infiltration (for the rest of the segments, we did not considered infiltration losses). In the context of calibration experiments, the number of parameters that were associated with HRUs was varied, while the number of the rest of parameters was kept the same, i.e., 16. For the non-calibrated parameters, approximate estimates were provided, to ensure realistic fluctuations of the associated fluxes; for instance, a common specific yield was assigned to all groundwater cells, equal to 10%.
The unknown parameters of each HRU configuration scheme were calibrated against the observed flows at the three monitoring stations. In order to evaluate the model performance in calibration and validation, we used a common error measure, comprising a weighted sum of the coefficients of efficiency, high-flow efficiencies and a trend penalty to control the generation of unreasonable trends in the groundwater level behavior (cf.
Appendix A). Note that according to common hydrological practices [
88,
89], efficiency values greater than 50% indicate a rainfall-runoff model of satisfactory predictive capacity, while a value of 30% indicates a marginally acceptable model (Freer et al. [
88] proposed this limit as the threshold for distinguishing behavioral solutions within the GLUE (Generalized likelihood uncertainty estimation) method. In our experiments, we considered as “satisfactory” efficiency values above 75–80%, while efficiency values lower than 30% were characterized as “unsatisfactory”.
4.4. Preparation of CN Map
For the preparation of the CN map, to be next considered as the background for the delineation of HRUs, we used maps of classified geological formations, land cover maps, as well as a raster map of terrain slopes. Specifically:
Geological formations and information on the tectonic and lithological elements of the river basin were extracted from geological maps, published by the Institute of Geology & Mineral Exploration of Greece. As shown in
Figure 5a, the basin includes two major geological formations, i.e., Tripolis geotectonic units, on the western part, and Phyllite-Quartzite series, on the eastern. The Tripolis formations that occupy the largest part of the hilly and lowland areas mainly consist of limestones and marbles, alternating with crystalline slate. Limestone is porous, and through the centuries, rain and snow water have caused erosion, forming caves with stalagmites, stalactites and other karst landforms. In this respect, waters that have penetrated the karstified rocks either outflow from springs in the mountainous areas or are conducted to the sea, as underwater estuaries called “Anavoli”. On the other hand, the Phyllite-Quartzite series are extended over the mountain areas (Taygetos) and consists of Permian-Lower Triassic rocks of low to moderate permeability. The other three formations occupy only a small area in the SE part of the basin (Pindos formation, consisting of Upper Cretaceous-Paleocene rocks, Tyros layers, consisting of Carboniferous-Upper Triassic rocks, and Postalpine sediments, consisting of Quaternary, Holocene and Pliocene rocks).
Land-cover classes, illustrated in
Figure 5b, were derived from the CORINE Land Cover 2000 map of the European Environmental Agency. These include broad-leaved forest (37%), coniferous forest (23%), transitional woodland-shrub (16%), pastures (10%), sclerophyllous vegetation (7%), mixed forest (3%), land principally occupied by agriculture, with significant areas of natural vegetation (3%), olive groves (<1%), complex cultivation patterns (<1%) and industrial or commercial units (<1%). In general, vegetation favors interception and infiltration, thus resulting in reduced overland flow. Given that a quite extended part of the basin, mainly in the NE, was burned in 2007, an updated map, provided by the National Observatory of Athens, Institute for Astronomy, Astrophysics, Space Applications & Remote Sensing (C. Contoes, personal communication), was also used for to represent the land cover characteristics during the simulation period. As explained in
Section 3.2, these areas were classified under one category with respect to their original condition.
Finally, for the classification of drainage capacity, we used the terrain slope map (
Figure 5c), which has been automatically extracted through the DEM. The raster map of CNs, shown in
Figure 5d, has been produced by overlaying the raster layers of
,
and
, and then employing Equation (1). The values of the three indices were assigned according to the guidelines of
Table 1, accounting for the dominant classes of permeability, vegetation and drainage capacity, which were estimated on the basis of the aforementioned geological, land use and terrain slope information, respectively. The derived map contains 18 classes, ranging from CN = 37, for areas with high permeability, dense vegetation and very low slope, to CN = 91, for bare areas with low permeability and very high slope.
4.5. Calibration Experiment 1: Varying the Number of HRUs
In order to test our key hypothesis that the number of HRUs should equal the number of the available flow records (in that case, three), we run the calibration problem considering five alternative parameterizations, from one to five HRUs, as shown in
Figure 6. Obviously, regarding the model schematization, the same semi-distributed structure was used, comprising seven sub-basins and associated meteorological inputs. We note that the single-HRU formulation with semi-distributed inputs is referred to as the semi-lumped modelling approach [
3]. For the surface hydrological processes’ representation, seven parameters were assigned to each HRU; thus, for each configuration, the total number of control variables is 7
N, where
N is the number of HRUs. Therefore, the simplest parameterization, the semi-lumped (homogenous) basin, required the estimation of seven parameters, while for the scenario with five HRUs, 35 unknown parameters of the surface hydrological module were considered.
The parameter estimation problem was handled through of a fully-automatic optimization approach, using the evolutionary annealing-simplex algorithm across an extended search space, defined by the physical bounds of the associated set of parameters (7, 14, 21, 28 or 35). To ensure unbiased results, all calibrations were carried out by assigning the same initial conditions and the same values for the rest of the parameters (groundwater conductivities, sub-basin routing rates, infiltration coefficient), as specified through preliminary calibrations. Moreover, to avoid getting trapped in local optima, thus resulting in sub-optimal model performance, several independent optimizations were employed for each parameterization, finally selecting the best solution. To take into account the increasing computational burden of optimization against the number of control variables, we set a maximum budget of 2000
N trials. Thus, for the lumped approach, we allowed up to 2000 function evaluations for each optimization run, while for the more complex configuration (
N = 5), the budget was increased up to 10,000 function evaluations (
Table 4). It is important to note that within this first experiment, no manual interventions were allowed, as done in the context of the second experiment, which provided much improved results (
Section 4.6).
The overall model performance, in terms of the error measure for both the calibration and validation periods, is shown in
Table 4, while analytical results of the optimal efficiency and high flow efficiency values obtained by each simulation at each monitoring station are summarized in
Table 5. It is clear that the best overall model performance is achieved by the simulation with the three HRUs, both in calibration and validation, with overall index errors of 1.985 and 2.675, respectively. Only in two cases did other parameterizations result in more efficient results: at the outlet, the simulation with the four HRUs achieved efficiency and high flow efficiency values of 77.3% and 81.4%, respectively, and the simulation with the two HRUs achieved efficiency and high flow efficiency values of 79.8% and 81.0%, respectively, compared to the lower values of 67.2% and 72.2%, respectively, of the simulation with the three HRUs. However, in all other cases, the efficiency and high flow efficiency values achieved by the simulation with the three HRUs were higher, further confirming our fundamental hypothesis that the best compromise, in terms of model performance against computational effort, is ensured by considering the parameterization with three HRUs, which equals the number of available hydrographs.
4.6. Calibration Experiment 2: Contrasting Alternative HRU Delineation Approaches
The CN-based approach with three HRUs was compared to existing HRU delineation methodologies, herein referred to as unique combination and union of layers. The former accounts for user-defined thresholds for land use, soil type and slope range within each sub-watershed, followed by a spatial overlay scheme, 20%, 10% and 20% thresholds used for land use, soil and slope, respectively [
90]. This means that any land use, soil and slope occupying less than or equal to the area percentage threshold defined in each sub-basin will be lumped with the adjacent dominant cells. A spatial overlay was then performed, and cells having the same combination of land use, soil and slope categories were given a unique HRU combination number, resulting in 31 HRUs (
Figure 7a). In the union of layers delineation, HRUs are defined as the product of separate partitions accounting for different watershed properties. In this case, the product of the two dominant geological formations, corresponding to low and high permeability, and the two dominant land uses, corresponding to forests and meadow-pastures, was considered, thus obtaining four HRUs (
Figure 7b).
Due to the different HRU delineations, the number of parameters associated with the rainfall-runoff module varied greatly (
Table 6), specifically 31 × 7 = 217 for the unique combination (233 in total), 4 × 7 = 28 for the union of layers (44 in total) and 3 × 7 = 21 for the proposed CN approach (37 in total). The calibration problem was carried out through a hybrid strategy, combining human experience and automatic optimization tools [
91,
92,
93,
94]. It was an iterative process seeking progressive improvements of a relatively small group of parameters within a realistic value range and satisfactory predictive capacity for all model responses. Trial calibrations were initially employed that allowed large parameter variations. Following, several optimization runs were carried out by modifying the bounds of the feasible parameter search space, while trying different combinations of criteria weights. Then, we focused on the optimization of the HRU parameters and on the most significant groundwater parameters, to attain a good fit of the hydrograph at the basin outlet, especially during high flows, and a satisfactory fit of the spring discharges. Once an initial good fit was achieved, the calibration focused on the improvement of specific aspects of the model responses, while ensuring a realistic water balance.
The optimized metrics are summarized in
Table 7, whereas
Figure 8,
Figure 9 and
Figure 10 contrast the observed and simulated hydrographs at the three monitoring sites. At the basin outlet (
Figure 8), a very good fit is achieved by all parameterizations, for both calibration and validation period. Overall, the highest efficiency values are obtained by the proposed CN approach, with 81.4% and 80.2% for calibration and validation, respectively, compared to 80.1% and 77.1% obtained by the simulations performed with the union of layers and 76.8% and 71.9% with the unique combination. As shown in
Figure 8, the simulations preserve the important features of the hydrographs, such as the high flows over the winter and zero flows over the summer periods for the calibration periods. Although the high flow efficiency values are very good for the calibration period, none of the models succeeds at simulating the part of the hydrograph between March and June 2013 well (yet, this may be attributed to systematic measurement errors, since the recession limb appearing at the upstream stations is not shown at the outlet). The CN approach and the unique combination parameterization simulate the low flows well, while the union of layers overestimates the low summer flow both for the calibration and the validation periods.
The best performance by all parameterizations was achieved for the Karveliotis station (
Figure 9), located in the eastern part of the basin (calibration value first, validation value second): 89.2% and 83.7% for the CN approach, 90.8% and 81.3% for the unique combination and 65.5% and 76.2% for the union of layers delineation (
Table 7). High flow efficiency with the CN approach was satisfactory both for calibration and validation (78.9% and 57.9%, respectively). Although the models preserved quite well the overall behavior of the hydrographs, some of the high flow events were somewhat underestimated, while the union of layers approach also underestimated the summer flows.
For Alagonia station (
Figure 10), located in the NE part of the basin, the efficiency values, although lower, were still very satisfactory, with 79.7% in calibration and 75.3% in validation for the simulation with the CN approach for the HRU delineation, 75.4% and 68.0%, respectively, with the unique combination delineation, and 65.8% and 72.6%, respectively, with the union of layers delineation (
Table 7). High flow efficiency was satisfactory across the full simulation period only with the CN approach; the other two parameterizations produced mixed results. The CN-based and unique combination approaches better capture the high flows in the calibration period, while all modes simulated the low summer flows well.
Note that although the same calibration strategy was employed for the three HRU approaches, the computational time and human effort spent differed markedly.
Table 6 provides a qualitative expression of this effort, which is mainly associated with the time needed for manual interventions within the hybrid calibration procedure. It is also associated with the number of individual optimizations, using small sub-sets of parameters or different ranges of their feasible bounds, in an attempt to achieve an acceptable model performance and simultaneously ensuring realistic hydrological behaviors. The unique combination delineation required at least triple the effort to achieve satisfactory performance, compared to the union of layers and CN approaches, while preserving a realistic set of parameters turned out to be a very tedious task.
The overall conclusion is that neither an increased number of parameters, nor an increased calibration effort resulted in improved model performance. Therefore, it was confirmed that a too detailed parameterization actually induces more complexity to the calibration problem, which is far from a straightforward task. Actually, as a result of interrelated uncertainties and errors in all aspects of hydrological modelling and calibration [
75], an automatic optimization procedure cannot ensure a good predictive capacity, typically due to algorithmic weaknesses (e.g., trapping to local optima). In this context, it was not surprising that the optimal performance in calibration, and particularly in validation, was achieved by the most parsimonious parameterization, considering three HRUs.
4.7. Investigation of Model Results for CN-Based Parameterization
Table 8 shows the optimal values of the seven parameters assigned to the three corresponding HRUs, derived through the second calibration experiment. As explained in
Section 4.1, a low CN indicates areas with significant permeability, dense vegetation and very low drainage capacity, which do not favor the generation of overland flow. On the other hand, a high CN indicates areas with low water permeability, sparse vegetation and a high drainage capacity, producing significant runoff. Therefore, it is not surprising that the maximum infiltration ratio, which is associated with soil permeability, is higher for the HRUs with low CNs. On the other hand, soil capacity seems to be addressed by terrain slope characteristics, as plain areas of the basin have much larger capacities than the mountainous ones. Recession rates for percolation and the percentage of infiltration to the lower zone are also associated with permeability; thus, the infiltration rates through the high-permeability soils are significant. The good correspondence of the CN values with the physical characteristics of the HRUs allow the user to define a priori suitable parameter bounds and therefore facilitate the calibration procedure, while ensuring physically-consistent parameters.
Table 9a summarizes the annually-averaged water balance of the basin for the entire simulation period, ending in April 2014. As seen in
Table 9b a significant part of precipitation is lost to underground runoff (23.2%), while mean evapotranspiration losses are 34.3%. These values differ by 10–20% at the end of a hydrologic year (mid-autumn), with increased values of actual evapotranspiration and decreased values of underground losses. Percolation reaches 35.9% of precipitation, whereas surface runoff is 22.8%. The mean annual runoff coefficient is estimated to be around 33.0%.
5. Summary and Conclusions
In the context of distributed hydrological modelling, the configuration of HRUs is subject to the conflict between two topics: the accuracy in the representation of process heterogeneity, dictating the required number of HRUs; and model parsimony, associated with the number of parameters to be inferred through calibration. To reduce the subjectivity introduced by the definition of a small number of HRUs in a parsimonious modelling structure, this research provides empirical guidelines for formulating HRUs, based on the widespread known curve number concept. Key innovations of this work are: (1) the formulation of CN maps, accounting for three major physiographic characteristics of the river basin, namely soil permeability, vegetation density and drainage capacity, within a GIS; (2) the use of these maps as input layers for delineating HRUs within hydrological models of various levels of complexity; (3) the use of CN classes as a guide for configuring a specified number of HRUs in semi-distributed models.
A raster map of CN values may have several applications, since this parameter is commonly used in watershed hydrology. For instance, it can provide information on potential maximum soil moisture retention at the grid-scale, in the context of the well-known NRCS-CN model. However, the emphasis here was on the use of CN data as a proxy for delineating HRUs in the context of fully- or semi-distributed hydrological models of any structure, provided that parameters associated with the modelled processes are mapped at the HRU scale. As the level of detail of the model parameterization depends on the number of HRUs, one can determine a specific number of CN classes to be used as the basis for HRU configuration, through aggregating cells of similar CN values.
A very important issue highlighted here is that the model complexity, expressed in terms of the number of CN classes and, consequently, the number of HRUs, should be consistent with the context of available hydrological information. Therefore, a general recommendation is to provide as many HRUs as the number of available discharge observation stations in the basin. This ensures a satisfactory balance between process realism and modelling parsimony and also allows taking advantage of all available information in order to avoid over-parameterized schemes.
The CN-based approach for HRU delineation was tested by using the latest version of HYDROGEIOS. This model was originally built on the HRU concept, following the union of layers delineation approach. A shortcoming of this method is the configuration of a specific number of HRUs, i.e., the product of different layer classes. In contrast, in the CN-based delineation, the number of HRUs is determined a priori. Therefore, the new approach is more objective, more flexible and, ultimately, more parsimonious. Furthermore, it allows explaining the HRU response and associated parameter values in terms of CNs; for instance, it is anticipated that an area with low CN values will generate less surface runoff, and vice versa. The correspondence of an HRU’s response with CN is very advantageous, since the user can easily recognize whether the model parameters are physically consistent or not, which in return facilitates the evaluation of the model robustness both in terms of goodness-of-fit and physical representation.
The new CN approach for HRU delineation was demonstrated in the hydrological simulation of Nedontas River Basin, Greece, where parameterizations of different levels of complexity were employed within HYDROGEIOS. Results showed that the optimal performance was achieved by the most parsimonious parameterization, i.e., the CN approach with three HRUs, as dictated by the number of monitoring stations across the basin. Our experiments showed that the calibration effort (in terms of time needed for manual interventions, as well as the number of individual optimizations considering small sub-sets of parameters), for achieving acceptable model performance and ensuring realistic hydrological behavior, varied markedly with the different HRU configuration approaches. An important conclusion is that the CN approach resulted in optimal parameter values across the three HRUs that are in agreement with their physical interpretation, as also quantified in terms of associated CN classes. Hence, it is anticipated that through a proper classification of CNs, the user will be able to determine a priori a reasonable and relatively narrow range of feasible parameter bounds, thus substantially facilitating the calibration effort.
Future research steps involve testing this approach in a representative number of river basins, as well as its implementation within different modelling schemes.