1. Introduction
Low flow is one of the most frequently used parameters in catchment water management. It is needed for determining permitted water transfers, designing irrigation systems and planning water supply for settlements. Low flow characteristics are also needed for water intake permits, the discharge of treated wastewater and the associated siting of wastewater treatment plants [1,2,3]. Another important use of this parameter is the determination of environmental flows for the protection of wildlife or for hydropower purposes. Increased demand for water during periods of low flow may increase water deficits [4,5,6]. In addition, the application of the Water Framework Directive requires the estimation of low flows to assess the good status of water bodies [2]. As pointed out by [7], rivers characterized by a rainfall-dominated discharge (e.g., the south of France, southwest England, Central Europe) will be prone to droughts in the future. The opposite situation will occur for catchments with a flow regime induced by melting snow [7]. The magnitudes and seasonal timings of low flows vary between geographic regions [8]. Low flows should be considered taking into account the many factors that influence them, such as soil infiltration characteristics, the hydraulic characteristics and extent of aquifers, the rate, frequency and magnitude of recharge, and evapotranspiration, as well as land cover, topography and climatic factors [9,10,11]. On a regional basis, determining the spatial variation of low flow is important not only for controlled, but also for uncontrolled, watercourses [12]. This is of particular relevance to water management, especially for small water supply systems that require a water intake design. The intakes for such systems usually draw water from watercourses located in ungauged catchments. In Poland, as in many other countries, there is therefore a need for methods of estimating low flows in ungauged catchments.
Many existing water intakes have become insufficient for the needs of the population. In order to develop a concept or a water supply permit, it is necessary to have hydrological information on low flows, which is particularly difficult for small systems as they are located in ungauged catchments. Limited water retention capacity in a catchment area can worsen the effects of drought, thus managing water resources in such areas will become a major challenge. Therefore, the knowledge of the occurrence of drought periods allows better preparation for the occurrence of such phenomena [
7]. Additionally, assuming population growth and changing water consumption patterns, an increase in water demand is expected, which could lead to conflicts over water [
13]. An important issue in the analysis of low flows is the seasonality of their occurrence. This is important in relation to the assessment and monitoring of the hydrological regime of rivers, as well as the impact of climate change on low flows [
14,
15]. As a result of climate change, it is estimated that future catchments of different scales in Central Europe, e.g., the Upper Rhine, will experience a regime shift from the winter low flow regime to the summer low flow regime [
16,
17]. At the same time, [
16] indicates that as a result of a warming climate, the variability in the timing of lows will decrease, making drought a more predictable phenomenon. In the case of a seasonal climate, with warm and cold seasons occurring, low flows are influenced by various processes so that the annual series of extreme values are a mixture of summer recessions and winter freezes. Because of the fundamental differences of summer and winter processes, regionalization may take advantage of a separation of summer and winter low flows [
18]. Schreiber and Demuth [
19] analyzed the seasonality of the mean annual 10-day minimum flow, MAM(10), of total runoff in catchments in southwestern Germany. The analysis was performed for 10 regions and for the entire study area. Aschwanden and Kan [20] conducted a Q95 analysis for catchments in Switzerland. In the United States, [21] analyzed river headwaters and identified a distinct variability pattern in the frequency of low flow days. Based on their analysis, they identified three clusters in which low flows occurred mainly during summer and early autumn. They found that in the western case, a precipitation deficit played a role in the occurrence of low flows, while in the central and western regions, which are characterized by dense vegetation, low flows occurred mainly during summer and autumn. This was related to increased evaporation during the summer season [
21]. Another example of low flow analysis is the study by Dingman and Lawlor [
22], conducted for the Vermont and New Hampshire region. Fangman and Haberlandt [
23] studied low flow seasonality in the federal state of Lower Saxony, situated in northwestern Germany. A detailed analysis of the seasonality of low flows was carried out by Laaha [
24] and Laaha and Blöschl [
25,
26]. They conducted such an analysis for low flows q95 in 325 small catchments located in Austria, using seasonality histograms (SHs), a seasonality index (SI) and a seasonality ratio (SR). As a result of this study, they divided the catchments into 10 groups and classified them into summer or winter subregions. A similar analysis was made by Vezza et al. [12], who, like Laaha and Blöschl [25], used three seasonality indices for low flow regionalization in catchments located in northwestern Italy. Vlach et al. [
17] analyzed the long-term variability and seasonality of low flows and streamflow droughts in fifteen headwater catchments of three regions within Central Europe. In their analysis, SI and SR indices were used.
In Poland, one of the first analyses of seasonality was carried out by Rotnicka [27]. She attempted to identify the so-called hydrological season, i.e., a sequence of days in a year characterized by relative similarity of the magnitude, dispersion and character of water levels. Jokiel and Tomalski [28] determined the hydrological seasons of annual flow hydrographs for selected rivers located in central Poland. In another study, Jokiel and Stanisławczyk [15] analyzed the changes and long-term seasonal variability of outflow for catchments located in different parts of Poland.
In gauged catchments, low flow characteristics are estimated using direct statistical methods [29]. The difficulty arises in ungauged catchments, where hydrological information is often needed for water management purposes, e.g., for water supply. Hence, there is a need to develop methods for determining low flow in such catchments that make use of long hydrometric and meteorological measurement series.
The aim of the study is to determine the values of seasonality indices for the regionalization of low flow characteristics in the Upper Vistula river basin. A stepwise multiple regression based on physical catchment characteristics and seasonality indices was used as the predictive model. The results can be used to estimate q95 flows in ungauged catchments and, in practice, for the modernization and design of water intakes for small local water supply systems. The presented analyses are novel for Poland, and comparable international studies are limited.
2. Materials and Methods
The study was performed in 32 selected catchments belonging to the Upper Vistula basin (Figure 1). The source material for the analysis was daily streamflow time series from the period 1951–2016 (Table 1). Only catchments with daily streamflow records of at least 20 years were selected for the analysis. In the analysis, 12 physiographic and meteorological characteristics of catchments were also used (Table 1).
The following physiographic and meteorological parameters of the catchment were used (Table 1): length of the watercourse (L), catchment area (A), mean annual precipitation (P), mean catchment slope (I), land use (U) and soils (S).
The analysis was performed for the Q95 flow quantile, i.e., the flow that is equaled or exceeded on 95% of all days within the observation period. This characteristic was chosen because it is relevant to many aspects of water management, such as the design of water supply systems, and is widely used in Europe. Q95 was then standardized by the catchment area to obtain the specific low flow discharge q95 (dm³·s⁻¹·km⁻²). A map of the specific low flow discharge q95 (dm³·s⁻¹·km⁻²) for the analyzed catchments of the Upper Vistula river basin is presented in Figure 2.
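The two steps above (the quantile, then standardization by area) can be sketched as follows. This is a minimal illustration, not the procedure used in the study: the function names are hypothetical, the linear interpolation between ordered values is only one common quantile convention, and the input flow is assumed to be in m³·s⁻¹.

```python
def flow_quantile_q95(daily_flows):
    """Flow equaled or exceeded on 95% of days (the 5th percentile of daily flows)."""
    ordered = sorted(daily_flows)
    # Assumption: linear interpolation between neighbouring order statistics.
    pos = 0.05 * (len(ordered) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])


def specific_low_flow(q95_m3s, area_km2):
    """Convert Q95 in m3/s to specific discharge q95 in dm3 s-1 km-2."""
    return q95_m3s * 1000.0 / area_km2


# Illustrative synthetic series (not data from the study): flows 1, 2, ..., 101 m3/s.
synthetic_flows = [float(v) for v in range(1, 102)]
q95_example = flow_quantile_q95(synthetic_flows)
q95_specific = specific_low_flow(q95_example, area_km2=66.3)
```

The factor of 1000 converts m³ to dm³; dividing by the catchment area then gives a value that can be compared between catchments of different size.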
Time series of daily data and precipitation were obtained from the Institute of Meteorology and Water Management, National Research Institute in Warsaw. Parameters related to morphometry and landform were determined on the basis of Kondracki [
30] and soils, land cover and catchment development were determined on the basis of the soil map of Poland [
31] and Corine Land Cover 2012 [
32].
Physiographically, the area of the Upper Vistula basin, where the analyzed catchments are located, is situated within three large Carpathian physiographic units: Carpathians, Subcarpathian valleys and Małopolska Upland. The area of the Carpathians and the Upland is the area where most tributaries of the upper Vistula have their sources, while the Subcarpathian valleys are the transit area for the Vistula and the estuary area for the rivers and streams formed in the Subcarpathian and the Subcarpathian Upland [
33]. Additionally, in terms of geological structure, the basin area is diversified [
34], which translates into a diversity of soils in the area. Four types of relief can be distinguished in this area: mountain, upland, foothill and lowland.
For the analysis, catchments differentiated in terms of physiographic parameters were taken (Table 1). The smallest catchment area is 66.3 km² and the largest is 2034 km². The mean slope of the catchments varies from 0.002 for the Pszczynka and Łęg catchments to 0.091 for the Biała catchment (Table 1). The selected catchments are also diversified in terms of land use. They are dominated by arable land, which ranges from 7.4% for the Łęg to 87% for the Szreniawa, and forests, which constitute from 3% (Szreniawa) to 67.6% (Skawica).
The methods applied in this study are similar to those used by Laaha and Blöschl [25] and Vezza et al. [12]. Three seasonality measures were used to determine the seasonality of the occurrence of specific low flow discharge q95: a seasonality ratio (SR), a cyclic seasonality index (SI) and seasonality histograms (SHs).
The seasonality ratio (SR), representing the ratio of summer (q95s) to winter (q95w) low flow characteristics, is given in Equation (1) [26]. First, daily discharge data were divided into a summer discharge series, from 1 May to 31 October, and a winter discharge series, from 1 November to 30 April, in order to differentiate summer low flows caused by precipitation deficit from winter low flow events caused by snow accumulation and frost in highland and mountain areas [17]. Then, the characteristic values of q95s and q95w were calculated for each catchment from the summer and winter discharge time series.

$$SR = \frac{q_{95s}}{q_{95w}} \qquad (1)$$

SR values > 1 indicate the presence of a winter low flow regime, whereas SR values < 1 indicate the presence of a summer low flow regime.
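The seasonality ratio described above can be sketched in a few lines. This is an illustrative sketch only: the function names are hypothetical, the quantile uses linear interpolation as an assumed convention, and each daily flow is assumed to be paired with its calendar month (1 to 12).

```python
def low_flow_quantile(flows):
    """5th percentile of a flow series (flow exceeded on 95% of days)."""
    ordered = sorted(flows)
    pos = 0.05 * (len(ordered) - 1)  # assumed interpolation convention
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (pos - lower) * (ordered[upper] - ordered[lower])


def seasonality_ratio(months, flows):
    """SR = q95 of the summer series (May-Oct) / q95 of the winter series (Nov-Apr)."""
    summer = [q for m, q in zip(months, flows) if 5 <= m <= 10]
    winter = [q for m, q in zip(months, flows) if m >= 11 or m <= 4]
    return low_flow_quantile(summer) / low_flow_quantile(winter)


def regime(sr):
    """Classify the low flow regime from SR using the threshold of 1."""
    return "winter" if sr > 1 else "summer"


# Synthetic example (not study data): lower flows in summer than in winter.
months = list(range(1, 13)) * 30
flows = [2.0 if 5 <= m <= 10 else 4.0 for m in months]
sr = seasonality_ratio(months, flows)
```

Because both seasonal quantiles refer to the same catchment, the ratio is the same whether it is computed from Q95 or from the area-standardized q95.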
Another parameter analyzed was the seasonality index, SI. This index is similar to the one used by Burn [35], Young et al. [36] and Laaha and Blöschl [26]. The SI indicates the mean day of the year on which low flows occur and how strongly low flows are concentrated around that day. The index is based on two parameters, Θ and r, which are calculated from the Julian dates of all days of the observation period on which discharge is equal to or lower than Q95. The parameter Θ is a measure of the average seasonality of low flows, expressing the mean day of low flow occurrence in radians. Its value ranges from 0 to 2π, where Θ = 0 relates to 1 January, π/2 relates to 1 April, π relates to 1 July and 3π/2 relates to 1 October. For each catchment, the days on which the flow was less than Q95 were selected and transformed into Julian dates Dj [17,26].
The directional angle Θj of each low flow day (Equation (2)) is calculated as follows [26,37]:

$$\Theta_j = D_j \cdot \frac{2\pi}{365} \qquad (2)$$

where:
Dj is the Julian date of day j on which the flow is lower than Q95.
In turn, the parameter r is the length of the mean vector of the days of occurrence; it is a dimensionless indicator that describes the seasonal variability of low flows. Its values range between 0 (no seasonality: low flow events are uniformly distributed over the year) and 1 (strong seasonality: all low flow events occur on exactly the same day of the year) [17,26].
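The seasonal concentration index r is calculated as Equation (3) [26]:

$$r = \sqrt{x_\theta^2 + y_\theta^2} \qquad (3)$$

where:
xθ and yθ are the arithmetic means of the Cartesian coordinates of a total of n single days j, with Θj the directional angle of day j from Equation (2), and are calculated as Equation (4) [26]:

$$x_\theta = \frac{1}{n}\sum_{j=1}^{n}\cos\Theta_j, \qquad y_\theta = \frac{1}{n}\sum_{j=1}^{n}\sin\Theta_j \qquad (4)$$

The angle of the mean directional vector is defined as Equations (5) and (6) [26]:

$$\Theta = \arctan\left(\frac{y_\theta}{x_\theta}\right) \quad \text{for } x_\theta > 0 \qquad (5)$$

$$\Theta = \arctan\left(\frac{y_\theta}{x_\theta}\right) + \pi \quad \text{for } x_\theta < 0 \qquad (6)$$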
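The calculation of Θ and r described above can be sketched as follows. This is a minimal sketch under stated assumptions: the function name is hypothetical, a 365-day year is assumed, and `math.atan2` taken modulo 2π is used in place of the quadrant-by-quadrant arctangent; it resolves the quadrant automatically and returns an angle in the range 0 to 2π.

```python
import math


def seasonality_index(julian_days, days_per_year=365):
    """Return (theta, r): mean day of occurrence in radians and its concentration."""
    # Map each low flow day onto the unit circle.
    angles = [d * 2.0 * math.pi / days_per_year for d in julian_days]
    n = len(angles)
    x_mean = sum(math.cos(a) for a in angles) / n
    y_mean = sum(math.sin(a) for a in angles) / n
    r = math.hypot(x_mean, y_mean)                          # length of the mean vector
    theta = math.atan2(y_mean, x_mean) % (2.0 * math.pi)    # mean direction in [0, 2*pi)
    return theta, r


# Synthetic examples (not study data).
theta_same, r_same = seasonality_index([200] * 10)          # all events on one day
theta_unif, r_unif = seasonality_index(list(range(1, 366)))  # one event on every day
mean_day = theta_same * 365 / (2.0 * math.pi)                # back to a day of the year
```

When all events fall on the same day, r is 1 and the mean day equals that day; when events are spread evenly over the year, r approaches 0 and Θ carries no information.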
Lastly, seasonality histograms (SHs) were analyzed, which allowed for a more detailed analysis of the seasonal distribution of low flows [26]. The histograms provide complementary information to the seasonality index (SI): from them it is possible to identify which months are affected by low flow and to determine the shape of the seasonal distribution, for example whether it is multimodal or skewed. The shape of the distribution depends on all days on which the discharge of a catchment falls below the threshold Q95 [26].
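A seasonality histogram of this kind can be sketched as the relative frequency of low flow days per calendar month. The function name is hypothetical, and the input is assumed to be the month number (1 to 12) of every day on which discharge falls below Q95.

```python
def seasonality_histogram(low_flow_months):
    """Relative frequency of low flow days in each calendar month (index 0 = January)."""
    counts = [0] * 12
    for month in low_flow_months:
        counts[month - 1] += 1
    total = sum(counts)
    return [c / total for c in counts]


# Synthetic example (not study data): two days in August, one each in September and January.
histogram = seasonality_histogram([8, 8, 9, 1])
dominant_month = histogram.index(max(histogram)) + 1
```

A histogram with a second, smaller peak in the winter months would point to a mixed regime, which a single mean day Θ cannot show.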
In the next stage of the analysis, groups of catchments similar in terms of the seasonality of low flow occurrence were determined using a non-hierarchical cluster analysis method, the K-means method. Euclidean distance was used as a measure of distance between clusters.
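The grouping step can be illustrated with a small K-means implementation (Lloyd's algorithm with Euclidean distance). This is a sketch under stated assumptions, not the software used in the study: the function name, the choice of (SR, r) as clustering features, the starting centroids and all numbers are hypothetical.

```python
import math


def kmeans(points, initial_centers, max_iter=100):
    """Assign points to the nearest centroid and update centroids until stable."""
    centers = [list(c) for c in initial_centers]
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest centroid by Euclidean distance.
        new_labels = [
            min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments no longer change
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # an empty cluster keeps its previous centroid
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Hypothetical (SR, r) pairs for five catchments; not values from the study.
features = [(0.60, 0.50), (0.70, 0.55), (0.65, 0.60), (1.30, 0.40), (1.40, 0.45)]
labels, centers = kmeans(features, initial_centers=[features[0], features[3]])
```

K-means results depend on the starting centroids and on the scale of each feature, so in practice features are usually standardized and several starts are compared.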
In the final stage of the study, regional models were developed in the form of a multiple regression equation (Equation (7)), in which the dependent variable was the low flow characteristic and the independent variables were morphoclimatic parameters of the analyzed catchments [29].

$$q_{95} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n \qquad (7)$$

where:
xi: morphoclimatic parameters of a catchment,
βi: regression coefficients (β0 is the intercept).
Multiple regression analysis identified the parameters that most strongly influence low flows. The best results were obtained using stepwise regression with Mallows' Cp coefficient as the optimality criterion [25].
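For reference, Mallows' Cp for a candidate model is commonly written as below. The symbols are notation introduced here for illustration and are not taken from the paper: SSE_k is the residual sum of squares of a candidate model with k fitted parameters (including the intercept), \hat{\sigma}^2 is the residual variance estimated from the model containing all candidate predictors, and m is the number of catchments.

```latex
C_p = \frac{SSE_k}{\hat{\sigma}^2} - m + 2k
```

Candidate models with a small Cp that is close to k are generally preferred, since this indicates little bias without unnecessary predictors.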
The following assumptions were checked for the regression model: the absence of multicollinearity, homoscedasticity and normality of residuals. Multicollinearity occurs when an independent variable is highly correlated with one or more of the other independent variables in the multiple regression equation; its presence makes statistical inference less reliable. Homoscedasticity means that the variance of the residuals is constant. If the residual variance is not constant, the standard errors of the regression coefficients are biased, which makes significance tests and confidence intervals unreliable. Normality of residuals is one of the main assumptions of a linear regression model; if the residuals are not normally distributed, model inference (e.g., confidence intervals) might be invalid. The absence of multicollinearity was checked using the variance inflation factor (VIF), and normality of residuals using the Anderson–Darling and Shapiro–Wilk tests and diagnostic plots. White's test was used to check the homoscedasticity of the residuals.
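As a reminder of what the multicollinearity check measures, the variance inflation factor of predictor j is usually defined as below, where R_j^2 (notation introduced here) is the coefficient of determination obtained by regressing predictor j on all other predictors.

```latex
VIF_j = \frac{1}{1 - R_j^2}
```

A VIF of 1 means that predictor j is uncorrelated with the others. Values above roughly 5 to 10 are often treated as a warning sign, although this threshold is a rule of thumb and not a value stated in the study.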
The coefficient of determination R² (at a significance level of α = 0.05) and the adjusted coefficient of determination R²adj (Equation (8)) were calculated in order to determine the consistency of calculated values with observed values.

$$R^2_{adj} = 1 - \left(1 - R^2\right)\frac{p - 1}{p - N - 1} \qquad (8)$$

where:
R²: coefficient of determination,
N: number of predictors,
p: total sample size.
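The adjustment can be checked with a short calculation. The function name is hypothetical; the inputs below are the global-model values reported in this paper (R² = 41%, one predictor, 32 catchments).

```python
def adjusted_r2(r2, n_predictors, sample_size):
    """Adjusted coefficient of determination for a linear regression model."""
    return 1.0 - (1.0 - r2) * (sample_size - 1) / (sample_size - n_predictors - 1)


# Global model: R2 = 0.41 with one predictor (catchment slope) and 32 catchments.
r2_adj_global = adjusted_r2(0.41, n_predictors=1, sample_size=32)
```

The result is approximately 0.39, in line with the reported R²adj of 39%. The penalty grows as predictors are added to a small sample, which matters when models are fitted to subgroups of the 32 catchments.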
In addition, the coefficient of determination R²cv was determined using leave-one-out cross-validation, as described in Equation (9). This was applied in order to avoid the so-called error of the third kind and to select the best forecasting model [25].

$$R^2_{cv} = \frac{\mathrm{var}(q_{95}) - V_{cv}}{\mathrm{var}(q_{95})} \qquad (9)$$

where:
Vcv is the average squared residual error, calculated with Equation (10) [25],
var(q95) is the spatial variance of the observed specific low flow.

$$V_{cv} = \frac{1}{n}\sum_{i=1}^{n}\left(\hat{q}_{95,i} - q_{95,i}\right)^2 \qquad (10)$$

where:
q̂95,i is the estimated value of the i-th dependent variable obtained using a model estimated with all observations except the i-th,
q95,i is the observed value,
n is the number of catchments.
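The leave-one-out procedure can be sketched as follows. For brevity the sketch fits a regression with a single predictor, whereas the study uses stepwise multiple regression; the function names and the synthetic data are hypothetical, and the variance is computed with a 1/n divisor as an assumption.

```python
import math


def fit_line(x, y):
    """Ordinary least squares fit of y = b0 + b1 * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def leave_one_out(x, y):
    """Return (R2cv, RMSEcv) from leave-one-out cross-validation."""
    n = len(x)
    squared_errors = []
    for i in range(n):
        # Refit the model without catchment i, then predict catchment i.
        b0, b1 = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        squared_errors.append((b0 + b1 * x[i] - y[i]) ** 2)
    v_cv = sum(squared_errors) / n
    mean_y = sum(y) / n
    var_y = sum((yi - mean_y) ** 2 for yi in y) / n  # assumed 1/n divisor
    return 1.0 - v_cv / var_y, math.sqrt(v_cv)


# Synthetic example (not study data): an exactly linear relation.
slope_values = [1.0, 2.0, 3.0, 4.0, 5.0]
q95_values = [2.0 * s + 1.0 for s in slope_values]
r2_cv, rmse_cv = leave_one_out(slope_values, q95_values)
```

Because each prediction comes from a model that never saw the catchment being predicted, R²cv is normally lower than R² and is a more honest measure of performance in ungauged catchments.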
The root mean square error (RMSE) [38] was also calculated for each cluster and for cross-validation (RMSEcv). An RMSE value of 0 indicates a perfect fit.
3. Results and Discussion
The values of the seasonality ratio (SR), which classifies the seasonality of low flows for the analyzed catchments of the Upper Vistula river basin, are presented in Figure 3. Using a threshold value of 1, summer and winter low flow regimes can be distinguished. In most catchments of the Upper Vistula river basin (28 catchments), summer low flows dominated in the analyzed period. In four catchments (Skawica, Dłubnia, Dunajec and Stupnica), SR values were higher than 1, which indicates the dominance of winter low flows.
Another parameter analyzed was the seasonality index SI. The results are presented in Figure 4a. The SI values showed that low flows in the analyzed catchments occur in two seasons: in summer, in 28 catchments, and in winter (January–March), in only four catchments. Summer low flows occurred from mid-July onwards and dominated in August and September (10 catchments each), which is related to high air temperature and high evaporation. These results are similar to those obtained by Raczyński and Dyer [39], who determined the seasonal variability of low river stages in south-eastern Poland and found that the maximum frequency of low flows occurred in early August. Assessing the strength of the seasonal variability of low flows according to Guilford's classification [40], strong variability (0.7 < r < 0.9) was observed in three catchments (Mleczka, Czarna and Jasiołka). In contrast, weak variability (0.1 < r < 0.3) was observed only in the Szreniawa catchment. In the remaining 28 catchments, the variability of low flows was assessed as medium. These results are similar to those obtained by Jokiel and Stanisławczyk [
15], who studied seasonality indices in catchments located in different regions of Poland and on the Carpathian rivers. They obtained seasonality indices higher than 50%. Kohnová et al. [
41] analyzed the seasonality of the occurrence of low flow Q95 for Slovak catchments. The study included 198 small and medium-sized catchments from across the country. As in the present study, summer low flows were observed mainly from August to October, although August low flows were dominant in catchments located in the mountainous parts of Slovakia. The magnitude of the seasonal concentration index (r) was higher than in the present study: for summer low flows, it ranged from 0.8 to 0.95 in 84% of the analyzed catchments. Winter low flows, in turn, were observed from December to March in the catchments located in the central and eastern part of Slovakia, with those occurring in January being predominant. For this type of low flow, high values of r were also observed, ranging from 0.8 to 0.9 for most of the catchments.
The SI results obtained by the authors of this paper are similar to those based on SH and SR, thus confirming the type of seasonality observed in the analyzed catchments of the Upper Vistula river basin. For all analyzed seasonality indices, the occurrence of seasonality of winter low flows concerned the same catchments. Knowledge of low flow seasonality is an important aspect during the planning, as well as during the operation of surface water intake. The results show both spatial and temporal variability of low flow, which are important for administrative decisions on the development and management strategy of an area. Water availability, represented by low flows, is used to define the water permit limit for water withdrawal as a tool for water management and planning [
13].
The seasonality histograms (SH) (
Figure 4b) were analyzed to determine in which month of the hydrological year low flows occur. On this basis, a map of the analyzed catchments with the dominant type of low flows was created (
Figure 5). Analyzing the histograms, it can be seen that for 28 catchments, summer low flows dominate. However, on the basis of a more detailed analysis of the histograms, catchments of mixed type can be distinguished, where summer low flows dominate but winter low flows also occur (nine catchments of mixed low flow type). As far as winter low flows are concerned, their occurrence is observed in four catchments, whereas in three catchments, despite the dominance of winter low flows, autumn low flows occurring towards the end of the hydrological year are also observed, mainly in September and October. Seasonality histograms for the analyzed catchments practically coincide with the type of low flows determined from the SR coefficient.
In a study by Schreiber and Demuth [
19], carried out in 169 catchments located in Germany, the seasonality of low flow—MAM(10)—was recorded in practically every month of the year. However, the lowest flows are observed in late autumn, especially in September and October, for most catchments. Vezza et al. [
12] analyzed different grouping methods based on the low flow characteristic q95 for 41 catchments in northwestern Italy. Using the three seasonality indices, they found that dividing low flows into two seasonality groups (summer and winter) best reflected the nature of the area analyzed. Summer low flows were observed in the Apennine–Mediterranean catchments, while winter low flows were found in the catchments of the Alpine region. The coefficient of determination which they obtained for the global model (without taking flow seasonality into account) had the lowest value, as in the present study. In a study of the Rhine river basin, Demirel et al. [16] noted that, on the basis of the SR index, the Alpine catchments have low flows in the winter half-year, while the others have low flows in the summer half-year. Based on the WMOD (which is equivalent to the Θ parameter analyzed in the present study), they found that summer half-year low flows occurred in September and October. The low flows of the winter half-year in the Alpine catchments occurred mainly in January and February [16]. Similar results were obtained for low flows for the catchments analyzed in this paper.
Then, regions with similar seasonality of low flow occurrence were identified using the K-means method. In this method, the first step is to determine the number of clusters. It was assumed that the analysis would be carried out for between two and five clusters of similar catchments. However, dividing the catchments into four and five groups produced separate clusters of catchments that were similar to each other, e.g., in terms of SI. Therefore, the analysis was limited to a maximum of three clusters.
Regression relationships (
Table 2) were determined for all catchments, without clustering, and for separate groups of catchments, for regions corresponding to winter and summer flow seasonality and for three groups of low flow seasonality occurrence: winter, summer and mixed.
There is one feature in the global regression model (Table 2), catchment slope, which affects low flows and has a positive sign. Catchment slope was also the feature affecting low flows in the regression models for two groups. For three groups, on the other hand, soils (cambisols) were the factor influencing q95, with a positive sign for the group with summer low flow seasonality and a negative sign for the mixed group. This is related to the location of the catchments in the Upper Vistula river basin.
The mixed group includes catchments located in the upland, Carpathian foothills and mountainous climate. In terms of water circulation, the catchments of this group are characterized by the occurrence of poorly permeable and impermeable soils. On the other hand, catchments with summer low flow seasonality are located in the upland climate and sub-Carpathians and are characterized by medium and easily permeable soils. In an Austrian study [
25] and in a study for catchments located in Switzerland [
20], the occurrence of low flow seasonality is related to catchment altitude.
A value of R² = 41% (R²adj = 39%) was obtained for the global regression model. This is the lowest value of the coefficient compared with the values calculated for the groups obtained using the K-means method. The calculated values of RMSE and RMSEcv (Table 2) for the global model were above 0.7, which indicates that taking the seasonality of low flow occurrence into account allows a more accurate estimation of low flow characteristics for catchments located in the Upper Vistula basin. Due to the small number of catchments (four), the regression model was not applied to the catchments characterized by winter low flow seasonality. The analysis showed that separating the mixed group gave a much better fit of the regression model than the division into two seasonality groups. This is evidenced, inter alia, by the cross-validated coefficient of determination for the catchments characterized by summer low flow seasonality, which was 66% in the case of three clusters, more than twice as high as in the case of two clusters (R²cv = 30%), while for the global model it was only 23% (Table 2). For the three cluster groups, RMSE and RMSEcv values below 0.5 were obtained, which confirms the better fit of the regression model compared to the two seasonality groups. Therefore, further analysis was conducted for the three groups.
The resulting regression models were tested for homoscedasticity and normality of the residuals. White’s test showed that the residuals were homoscedastic. Based on the Anderson–Darling test (the value of test A is given in
Figure 6) and the Shapiro–Wilk test, the residuals were found to have a normal distribution.