1. Introduction
The peak water demand of residential users represents one of the most onerous operating conditions for an urban Water Distribution System (WDS). The maximum water demand during the day, along with mechanical failure of one or more components of the system (e.g., pipes, pumps, and valves) or fire-fighting flow conditions, is one of the scenarios in which a WDS may experience some deficiency.
Hence, the peak water demand is a working condition that is usually taken into account in the design and management of WDSs (e.g., [1,2]).
In the scientific literature, few papers fully characterize the daily maximum water demand for a considerable number of users (around one thousand inhabitants) following a probabilistic approach. This issue is often tackled by means of more practical deterministic equations (e.g., [3,4,5]), although the random nature of the water demand is generally recognized.
On the other hand, several stochastic models describing the water demand of a few end users (about a dozen dwellings) have been proposed (e.g., [6,7,8,9]). These approaches model the residential water demand as it occurs at the hydro-sanitary devices of the dwellings. Therefore, the demanded flow is generally composed of several overlapping rectangular pulses, which are generated by turning on the hydro-sanitary devices.
The rectangular pulse approaches need to model at least three random phenomena [6]: arrival, duration and intensity of the demand pulses. Hence, these models entail significant computational effort and require detailed knowledge of the users' lifestyle, on which the parameters of the component distributions depend.
However, the rectangular pulse models can be excessive when the target of the study is the estimation of the water requirement in a specific condition, such as the maximum water demand during the day. Examples of these approaches can be found in [10,11], where a very detailed knowledge of the dwelling composition and user habits is required.
On the other hand, when the number of the users is greater than 200, the trend of the water demand during the day becomes smooth and the single pulses are no longer distinguishable [
12,
13]. In these conditions, the water demand can be represented by means of a single random variable that is continuous and positive.
The aim of this work is to obtain a reliable probabilistic approach capable of providing accurate estimates of the maximum residential water demand. To this end, a specific field laboratory, which monitors the actual water demand of a small town in Southern Italy, was set up by the Laboratorio di Ingegneria delle Acque (LIA) of the University of Cassino and Southern Lazio. This monitoring system provided copious data samples for which the number of served inhabitants is known. The data sample was enlarged by considering two further case studies whose users have very different lifestyles with respect to the monitored inhabitants of Southern Lazio. In particular, the peak water demand of two further towns was also taken into account, one in the Netherlands (the case study of Franeker [14]) and one in Northern Italy (the case study of Castelfranco Emilia [7]).
The enlargement of the data sample showed that the daily maximum demand is independent of local habits. Hence, the results of this study are of practical value: the probabilistic models proposed here can be used to obtain a good estimate of the residential peak water demand.
Some effective distributions, able to model the daily maximum demand for residential users, have been proposed by means of statistical inferences on the above-mentioned observed data. In addition, practical and reliable equations have been suggested which allow estimating the parameters of the suggested Cumulative Distribution Functions (CDFs) in relation to the number of users, Nus, where Nus represents the total number of the dwelling occupants.
The temporal resolution is a fundamental choice when modeling water demand (e.g., [15,16,17,18,19]), because the peak demand reduces significantly as the time step increases [20,21]. Therefore, time scaling effects were also investigated, and further equations that take this phenomenon into account are presented. The proposed equations allow estimating the CDFs' parameters when the time step (∆t) varies in the interval 1 min ≤ ∆t ≤ 1 h.
2. The Monitoring Systems and the Data Samples
In the monitoring systems of real WDSs, flow measurements often do not represent the actual user demand, generally because of two main issues.
Hydraulic shortages (first issue), which can arise at different points of a WDS because of undersized pipes or the failure of electro-mechanical components, prevent the flow measurements from being treated as an expression of the water demand. Indeed, in these working conditions, the measurements represent only the water delivery capability of the WDS under specific user requests; hence, the obtained data underestimate the actual water demand. In addition, these shortage conditions are most likely when the daily maximum water demand occurs.
The monitored WDSs have to be redundant in order to prevent the collected data from being compromised by such operating conditions. In this way, the WDSs are able to fully satisfy the users' water demand, even when it is particularly high. Hence, the monitoring system must be composed of both flow meters and pressure probes, to check that the demand data obtained are reliable.
A fundamental parameter to describe the peak phenomenon is Nus (e.g., [5,22]), but for looped WDSs, when the measurements are performed on the links, it is not possible to determine unambiguously the number of users related to each measurement point (second issue). The served users can vary with the working conditions.
In order to solve this issue, the measuring points have to be set on the pipes which connect the Demand Monitoring Areas (DMAs) to the rest of the WDS. A DMA is a part of the entire distribution network which is linked to the rest of the WDS by means of one or a limited number of pipes. In this way, Nus for each DMA is not dependent on the working conditions.
An accurate census must define the number of the resident population in the considered DMA, and it must verify that the users are only residential.
Based on these considerations, a specific monitoring system, which involves the real WDS of a small town, Piedimonte San Germano (PSG), was set up.
In detail, the monitoring system is composed of four measurement points, each with an electromagnetic flow meter and a pressure cell, with all probes connected to a data logger. Flows and pressures were recorded continuously; a variable-frequency data logger allowed an acquisition frequency of up to 1 Hz. The location of the probes allowed the measurement of the water demand for four DMAs, where the number of users was respectively equal to 239, 777, 981 and 1220. More details of the PSG laboratory are provided in [21,23].
In order to expand the data set, the time series of two further case studies, with characteristics similar to PSG in terms of data reliability but characterized by a different user lifestyle, were also used: Castelfranco Emilia (CE) in Italy and Franeker (Fr) in the Netherlands.
The CE monitoring system, which covers 596 users, is part of a wider WDS. Data are collected by means of volumetric measurements (Woltmann volume meter), and they are averaged over one-minute intervals. For the Fr network, continuous per-minute flow measurements are taken on the overall network, which supplies about 1150 users.
Alvisi et al. [7] and Blokker et al. [14] provide detailed information on the CE and Fr field laboratories, respectively.
Table 1 summarizes the principal characteristics of the three in-situ laboratories.
Initially, the analysis assumed a time resolution of one minute, which was increased up to ∆t = 1 h in the following elaborations in order to investigate the effects of the temporal aggregation.
The water demand presents different trends on working days and weekends, which also affect the maximum water demand during the day. Therefore, only the working days, in which the peak water demand is slightly more pronounced, have been considered.
No significant seasonal trend of the water demands was observed for the field data examined, contrary to what happens for residential users in some countries [
24,
25].
On this basis, in order to obtain a more reliable data set, the time series of the three field laboratories were filtered. The daily series were removed when they presented one of the following conditions: holidays/weekends; days with missing data; days with anomalous data (e.g., fire flow, corrupted signals of the probes, and big leakages). The number of days used during the following elaborations is indicated in the last row of
Table 1.
In the following sections, the maximum flow demand during the day is made dimensionless by means of the ratio:
$$C_p = \frac{Q_p}{\mu_Q} \qquad (1)$$
where μQ is the daily mean water demand and Qp is the peak flow during the day. Equation (1) gives the Peak Demand Coefficient, Cp. This work focuses on the dimensionless variable Cp.
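As a minimal sketch of this ratio, the snippet below computes Cp from one day of flow readings. The function name `peak_demand_coefficient` and the sample values are hypothetical, chosen only for illustration; they are not measured data.

```python
def peak_demand_coefficient(flows):
    """Return Cp = Qp / muQ for one day of flow readings (any consistent unit)."""
    if not flows:
        raise ValueError("flows must contain at least one reading")
    mean_flow = sum(flows) / len(flows)   # muQ: daily mean demand
    peak_flow = max(flows)                # Qp: peak flow during the day
    if mean_flow <= 0:
        raise ValueError("mean flow must be positive")
    return peak_flow / mean_flow


# Hypothetical readings in L/s, for illustration only.
day = [1.0, 2.0, 3.0, 2.0]
cp = peak_demand_coefficient(day)  # 3.0 / 2.0 = 1.5
```

Because Cp is a ratio of two flows, the result does not depend on the flow unit, provided the same unit is used for every reading.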
3. Probabilistic Distributions for the Cp
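Extreme events are effectively described by means of the Log-Normal distribution (LN) (e.g., [26])
$$p(x) = \frac{1}{x\,\sigma_y\sqrt{2\pi}}\exp\left[-\frac{1}{2}\left(\frac{y-\mu_y}{\sigma_y}\right)^2\right] \qquad (2)$$
and the Gumbel distribution (Gu) (e.g., [26])
$$p(x) = \alpha\,\exp\left\{-\alpha\,(x-\varepsilon) - \exp\left[-\alpha\,(x-\varepsilon)\right]\right\} \qquad (3)$$
where p(x) is the probability density function; x is the random variable and y = ln x; and μ and σ represent, respectively, the mean and the standard deviation of the variable indicated by the subscript. The parameters α and ε of Gu are given by the relations:
$$\alpha = \frac{1.283}{\sigma_x}, \qquad \varepsilon = \mu_x - 0.45\,\sigma_x \qquad (4)$$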
In the scientific literature, the peak water demand of residential users has been described by means of these probabilistic models (e.g., [20,21,27]).
The LN and Gu distributions have been tested against the observed data of the above-described case studies, assuming a time aggregation ∆t = 1 min. In addition, the Log-Logistic (LL) model was investigated to describe the daily maximum water demand. The LL distribution presents a trend very similar to the LN but, contrary to the Gauss model, its probability density function can be analytically integrated [28,29]. The CDF of the LL distribution has the following expression:
If x = Cp (the dimensionless peak demand of Equation (1)), then y = ln Cp.
The parameters of Equation (5) can be estimated from the statistics of the logarithm of the original sample, or from μ and σ of the original random variable by means of the following equations [28]:
where CVx = σx/μx is the coefficient of variation.
However, a first rough estimate of the standard deviation of the variable y can be obtained by assuming σy ≈ CVx (this assumption is very robust when CVx is less than 0.15).
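The quality of the approximation σy ≈ CVx can be checked numerically. The sketch below uses the exact Log-Normal identity σy = sqrt(ln(1 + CVx²)) as an illustrative stand-in; this choice is an assumption made here for the comparison and is not Equation (6) of the LL model.

```python
import math


def sigma_y_lognormal(cv_x):
    """Exact standard deviation of y = ln x for a Log-Normal x with given CV."""
    return math.sqrt(math.log(1.0 + cv_x ** 2))


def relative_error(cv_x):
    """Relative error made by the shortcut sigma_y ~ CV_x."""
    exact = sigma_y_lognormal(cv_x)
    return (cv_x - exact) / exact


for cv in (0.05, 0.10, 0.15, 0.30):
    print(f"CV = {cv:.2f}  sigma_y = {sigma_y_lognormal(cv):.4f}  "
          f"error = {100 * relative_error(cv):.2f}%")
```

Under this Log-Normal assumption, the shortcut differs from the exact value by less than 1% for CVx ≤ 0.15, and the error grows as CVx increases.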
The diagrams in Figure 1 are the Quantile-Quantile (Q-Q) plots for three of the monitored numbers of users (for the sake of brevity, the plots of PSG Nus = 239, 777 and 1220 are not reported here, but they are similar to the diagrams of Figure 1). More precisely, the Q-Q plots compare the observed and the theoretical Cp quantiles with the same probability of occurrence. The theoretical values were estimated by means of the integral of Equations (2) and (3) (numerical integration for the LN distribution) and Equation (5). The quantiles of the peak demand in Figure 1 fit the bisector line well, which represents the condition where the observed quantiles are equal to the theoretical quantiles.
The effectiveness of the three distributions was checked by means of the Kolmogorov–Smirnov (KS) test. Table 2 summarizes the threshold values for a 2% significance level (D2%, column 3) and the related KS statistics (DKS) for the LN, Gu and LL CDFs (columns 4–6). The goodness-of-fit tests were satisfied (DKS < D2%) for all monitored users. In addition, Table 2 shows the statistics of the data sample (µCp and σCp, columns 7–8). These probability distributions proved to be equally effective in modeling the peak phenomena; thus, the choice is left to the decision maker.
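The KS check described above can be sketched in a few lines. The example below is only illustrative: the Cp sample is hypothetical (not field data), the Gumbel parameters come from the method of moments, and the threshold uses the asymptotic large-sample formula, which is an assumption and may differ from the tabulated D2% values used in Table 2.

```python
import math


def ks_statistic(sample, cdf):
    """Largest distance between the empirical CDF of `sample` and `cdf`."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # Check the empirical step just after and just before x.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def ks_threshold(n, significance=0.02):
    """Asymptotic critical value; adequate only for large n (assumption)."""
    return math.sqrt(-0.5 * math.log(significance / 2.0)) / math.sqrt(n)


def gumbel_cdf(x, mean, std):
    """Gumbel CDF with parameters from the method of moments."""
    alpha = math.pi / (std * math.sqrt(6.0))
    eps = mean - 0.5772156649 / alpha
    return math.exp(-math.exp(-alpha * (x - eps)))


# Hypothetical Cp sample, for illustration only (not field data).
cp_sample = [2.21, 2.30, 2.38, 2.44, 2.49, 2.55, 2.62, 2.71, 2.85, 3.02]
m = sum(cp_sample) / len(cp_sample)
s = math.sqrt(sum((c - m) ** 2 for c in cp_sample) / (len(cp_sample) - 1))
d_ks = ks_statistic(cp_sample, lambda x: gumbel_cdf(x, m, s))
passed = d_ks < ks_threshold(len(cp_sample))
```

The same `ks_statistic` function can be reused for the LN and LL models by passing a different `cdf` callable.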
Equations (2) and (3) allow the Peak Demand Coefficient to be made explicit for a predefined probability of non-exceedance Pr[S]. Hence, the peak coefficient with a predefined probability of non-exceedance following the LL distribution is given by:
while, for the Gu distribution, it is given by the following relation:
The quantile of the LN model can be estimated only by numerical approximation.
Figure 2 shows, as an example, the comparison between the observed data (in this case the peak demand of PSG for Nus = 1220 and Δt = 1 min) and the values of Cp estimated by means of Equations (7) and (8), assuming µx = 2.50 and σx = 0.22. The plots confirm that both the LL and Gu distributions fit the experimental points well.
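A minimal sketch of how such quantiles can be computed is given below. It assumes the standard textbook forms: a Gumbel variable with method-of-moments parameters, and a Log-Logistic variable defined as the exponential of a logistic variable y = ln x. These parameterizations, and the rough choice mu_y ≈ ln(mu_x), are assumptions of this example and may differ in form from Equations (7) and (8).

```python
import math

EULER_GAMMA = 0.5772156649


def gumbel_quantile(p, mean, std):
    """Value not exceeded with probability p for a Gumbel variable."""
    alpha = math.pi / (std * math.sqrt(6.0))   # about 1.283 / std
    eps = mean - EULER_GAMMA / alpha           # about mean - 0.45 * std
    return eps - math.log(-math.log(p)) / alpha


def loglogistic_quantile(p, mu_y, sigma_y):
    """Value not exceeded with probability p when y = ln x is logistic.

    mu_y and sigma_y are the mean and standard deviation of y = ln x.
    """
    scale = sigma_y * math.sqrt(3.0) / math.pi  # logistic scale parameter
    return math.exp(mu_y + scale * math.log(p / (1.0 - p)))


mu_x, sigma_x = 2.50, 0.22          # values quoted in the text for Figure 2
cv_x = sigma_x / mu_x
# Rough assumptions for the log-space parameters: sigma_y ~ CV_x (as in the
# text) and mu_y ~ ln(mu_x), which ignores a small second-order correction.
mu_y, sigma_y = math.log(mu_x), cv_x

for p in (0.50, 0.90, 0.98):
    print(p, round(gumbel_quantile(p, mu_x, sigma_x), 3),
          round(loglogistic_quantile(p, mu_y, sigma_y), 3))
```

Both functions are closed-form inversions of the respective CDFs, which is the practical advantage of the Gu and LL models over the LN model noted above.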
4. Estimation of Parameters
All the proposed distributions (LN, Gu and LL) are bi-parametric models; hence, they require the estimation of two parameters, for instance, the mean and the coefficient of variation.
The PSG demand data, together with the data from the other monitored users (CE and Fr), show that the dimensionless average of the peak water demand varies with the number of users according to the following equation (Figure 3a):
Equation (9) resembles the relation proposed by Babbitt [30], but the coefficient 10 of the power function is significantly smaller than that of the Babbitt equation, as already shown in Tricarico et al. [21]. The deterministic approaches, of which the Babbitt equation represents a classic example, imply an overestimation of the peak demand.
The coefficient of variation of the peak demand can also be estimated in relation to the number of users.
Equation (10) fits the experimental points (Figure 3b), and it highlights a decreasing trend of the coefficient of variation, which tends asymptotically towards 0.1 when the number of users increases. Therefore, it is reasonable to assume a coefficient of variation equal to 0.1 when the number of residential inhabitants is greater than 1000 [21,31].
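One common way to obtain a power-law relation between a statistic and the number of users is a least-squares fit in log-log space. The sketch below illustrates the technique only: the fitting procedure is an assumption (the text does not state how Equations (9) and (10) were fitted), and the input values are synthetic, generated from an assumed law whose exponent is purely illustrative.

```python
import math


def fit_power_law(n_users, values):
    """Least-squares fit of values = a * n_users**b in log-log space."""
    xs = [math.log(n) for n in n_users]
    ys = [math.log(v) for v in values]
    k = len(xs)
    x_bar, y_bar = sum(xs) / k, sum(ys) / k
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(y_bar - b * x_bar)
    return a, b


# User counts of the monitored areas quoted in the text.
n_us = [239, 596, 777, 981, 1150, 1220]
# Synthetic values generated from an assumed law (a = 10, b = -0.2);
# both the exponent and the values are illustrative, not observed data.
mu_cp = [10.0 * n ** -0.2 for n in n_us]

a_fit, b_fit = fit_power_law(n_us, mu_cp)
```

With noise-free synthetic input the fit returns the generating coefficients; with field data the residuals would indicate how well a single power law describes the samples.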
It is worth noting that the investigated distributions and the suggested relations, Equations (9) and (10), are effective for users with completely different habits. Gargano et al. [13] give more details about the daily pattern of the water demand for the CE, Fr and PSG users.
Equations (9) and (10) refine the relations introduced by Tricarico et al. [21], because the equations proposed here have been obtained on the basis of a more copious and detailed data sample. In addition, the water demand data recorded in five suburbs of Melbourne, Australia [32] show trends similar to those of Equations (9) and (10) for the mean and the coefficient of variation, respectively.
For a predefined number of users, the parameters (mean and coefficient of variation) can be estimated by means of Equations (9) and (10), and hence Equations (7) and (8) allow the Peak Demand Coefficient to be defined for assigned probabilities of non-exceedance.
The diagrams of Figure 4 show that higher values of the non-exceedance probability of Cp lead to greater peak demand coefficients, and this effect is more evident when the number of users decreases. Moreover, the comparison of Figure 4a,b shows that the Gu and LL distributions follow similar trends for equal values of Nus and a fixed probability.
Finally, for the investigated number of users (200–1250), Figure 4 shows that the old deterministic equations (e.g., [30]) overestimate the peak phenomena with respect to the probabilistic relations, even when the latter assume high values of the success probability.
It is worth noting that the classic relation of Babbitt [30] has been recalled here only as an example of the deterministic relations; in fact, all of them usually lead to an overestimation of the peak water demand [31]. Several researchers [3,10,22] give an interesting overview of the older deterministic equations for the estimation of the peak phenomena.
The Peak Coefficient curve (Figure 4) for a defined probability of non-exceedance can also be expressed by means of the product:
$$C_p\left(\Pr[S]\right) = \mu_{C_p}\,K_F \qquad (11)$$
where µCp is estimated by means of Equation (9) as a function of the number of users, and KF represents a growth factor (KF > 1), as used in flood frequency analysis (e.g., [33]); therefore, it increases when Pr[S] increases. For LL and Gu, the growth factors are obtained, respectively, from Equations (7) and (8):
KF is estimated once the probability of non-exceedance Pr[S] is fixed (Equations (12) and (13)) and the probabilistic parameters are defined in relation to the number of inhabitants.
The probability of non-exceedance usually is an engineering choice made by the decision makers and should be a consequence of a cost–benefit analysis.
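For the Gumbel case, the growth factor can be sketched by dividing the standard Gumbel quantile by the mean, which leaves a function of Pr[S] and of the coefficient of variation only. The closed form used below follows from the textbook method-of-moments Gumbel quantile; it is an assumption of this example and may be written differently in Equation (13).

```python
import math

EULER_GAMMA = 0.5772156649


def gumbel_growth_factor(p, cv):
    """K_F = quantile / mean for a Gumbel variable with coefficient of variation cv."""
    return 1.0 - (math.sqrt(6.0) / math.pi) * cv * (
        EULER_GAMMA + math.log(-math.log(p)))


def peak_coefficient(p, mean_cp, cv):
    """Cp with non-exceedance probability p as mean times growth factor."""
    return mean_cp * gumbel_growth_factor(p, cv)


# CV = 0.1 is the value the text suggests for more than 1000 inhabitants;
# the mean of 2.50 is the PSG value quoted for Nus = 1220.
for p in (0.90, 0.95, 0.98):
    print(p, round(gumbel_growth_factor(p, 0.1), 3),
          round(peak_coefficient(p, 2.50, 0.1), 3))
```

Separating the mean from the growth factor makes the role of the design choice explicit: the number of users sets the mean and the coefficient of variation, while the selected Pr[S] only scales the result.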
5. Time Aggregation Effects
The results shown in the previous sections are related to a time step equal to ∆t = 1 min, while, in the numerical analysis of WDSs, a less refined time resolution is usually assumed (1 min < ∆t < 60 min).
The time step has noticeable implications for the estimation of the maximum water demand [3,27]; in fact, when ∆t increases, the time averaging effect smooths the peak phenomena.
In the following sections, the effects of the time aggregations are investigated in order to extend the above proposed distributions and relations to other temporal averaging intervals.
5.1. Distributions in Relation to Time Step
The same data samples of the residential water demand, which were considered in previous sections for ∆t = 1 min, were resampled to define the distributions with a different time aggregation (∆t = 5, 10, 15, 30 and 60 min). The effectiveness of the LN, Gu and LL distributions was demonstrated by means of statistical inferences for different time resolutions, from 1 min up to 1 h.
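The resampling step can be illustrated with a simple block average. The sketch below is a minimal example under stated assumptions: the 1-min series is synthetic (a flat demand with a short spike, not a measurement), and non-overlapping block averaging is assumed as the aggregation rule.

```python
def aggregate(flows, step):
    """Average consecutive blocks of `step` readings (1-min data to `step` min)."""
    if step < 1 or len(flows) % step != 0:
        raise ValueError("series length must be a multiple of step")
    return [sum(flows[i:i + step]) / step for i in range(0, len(flows), step)]


def peak_coefficient(flows):
    """Cp = peak flow / mean flow of the series."""
    return max(flows) / (sum(flows) / len(flows))


# Synthetic 1-min series for one day (1440 values): a flat base demand with a
# short morning spike. Values are illustrative, not measurements.
day = [1.0] * 1440
for minute in range(420, 430):      # 10-minute spike starting at 07:00
    day[minute] = 3.0

for step in (1, 5, 10, 15, 30, 60):
    print(step, round(peak_coefficient(aggregate(day, step)), 3))
```

Block averaging preserves the daily mean, so any reduction of Cp at coarser steps comes entirely from the smoothing of the peak once the averaging window becomes longer than the spike.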
For the sake of brevity, Figure 5 reports only the diagrams for the Cp of PSG (Nus = 981) with different time resolutions (∆t = 1, 10, 30 and 60 min). However, similar results have been obtained for all other monitored users with a time resolution of up to one hour.
The Q-Q plots of Figure 5 show the goodness of fit of the three distributions for all investigated time aggregations, which has also been demonstrated more rigorously by means of KS tests.
The tests with a significance level of 2% (the value of D2% is represented by the dashed line in Figure 6a,c,e) were all satisfied for the LN, Gu and LL distributions for any value of ∆t in the interval 1–60 min.
The results of the KS tests did not highlight any trends in relation to the time step; hence, the effectiveness of the proposed distributions seems to be independent of the time resolution. In addition to the KS test, the Mean Squared Error (MSE) was investigated, where MSE is given by:
$$MSE = \frac{1}{n}\sum_{i=1}^{n}\left(C_{p,i}^{obs} - C_{p,i}^{pred}\right)^2 \qquad (14)$$
where $C_{p,i}^{obs}$ and $C_{p,i}^{pred}$ are, respectively, the observed and the predicted Cp values, and n is the total number of observations.
The MSE plots (Figure 6b,d,f) show that the time aggregation, within the investigated range, does not seem to trigger any trend in the effectiveness of the three probability distributions. Therefore, the estimated values of MSE vary independently of Δt and Nus.
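The MSE between observed and predicted quantiles can be sketched as follows. The example pairs each sorted observation with a Gumbel quantile; the Weibull plotting position i/(n + 1) and the method-of-moments Gumbel parameters are assumptions of this sketch, since the text does not state which plotting position was used.

```python
import math


def mse(observed, predicted):
    """Mean squared error between paired observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def gumbel_quantile(p, mean, std):
    """Gumbel quantile with method-of-moments parameters."""
    alpha = math.pi / (std * math.sqrt(6.0))
    eps = mean - 0.5772156649 / alpha
    return eps - math.log(-math.log(p)) / alpha


def qq_mse(sample, mean, std):
    """MSE between sorted observations and Gumbel quantiles.

    The Weibull plotting position i / (n + 1) is an assumption of this sketch.
    """
    xs = sorted(sample)
    n = len(xs)
    theory = [gumbel_quantile(i / (n + 1), mean, std) for i in range(1, n + 1)]
    return mse(xs, theory)
```

A value of `qq_mse` close to zero corresponds to points lying on the bisector of the Q-Q plot.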
5.2. Time Dependence of the Parameters
Although the time scale does not significantly affect the choice of the probabilistic model, it has to be considered when estimating the parameters as ∆t changes. Indeed, the statistical analysis highlighted that the equations proposed to estimate the parameters of the distributions need a reduction coefficient when the time step is greater than 1 min. Therefore, Equation (9) becomes:
where
is the average value of
Cp for ∆
t ≥ 1 min and
represents the corrective coefficient of the mean peak demand coefficient for ∆
t = 1 min (
), which decreases when ∆
t increases (
Figure 7a).
Hence,
is estimated in relation to the users’ number and ∆
t
where the time is counted in min. Equations (17) and (18) are valid for 1 min ≤ ∆
t ≤ 60 min and 250 users <
Nus < 1250 users.
The chart of Figure 7b shows that the peak demand coefficient decreases significantly when ∆t increases from 1 min to 60 min. For instance, it decreases by about 25% for Nus = 1250, and by about 50% for Nus = 250.
Likewise, Equation (10) requires a corrective coefficient
β in order to estimate
when the time aggregation is greater than 1 min:
The coefficient depends on the
Nus and the time step according to Equation (18)
where the time is counted in minutes and Equations (17) and (18) are valid for 1 min ≤ ∆
t ≤ 60 min and 250 users <
Nus < 1250 users.
Figure 8 shows the effectiveness of Equations (17) and (18) in describing the effects of the time aggregation on the coefficient of variation of Cp. Moreover, Figure 8 confirms what has been shown in Figure 3b; in fact, a coefficient of variation equal to 0.1 seems to be a good simplification when the number of users is greater than 250 [31,34].
Equations (9) and (10), combined with Equations (15)–(18), allow estimating the parameters of the suggested distributions, for every value of the time aggregation and users number, when 1 min ≤ ∆t ≤ 60 min and 200 users < Nus < 1250 users.
6. Conclusions
The peak water demand is a very demanding working condition in WDSs and represents an issue of great interest for both design and management problems. The results of the present research give a practical contribution to the characterization of the daily maximum water demand according to a probabilistic approach. The proposed method allows estimating the peak demand through biparametric probability distributions and relations with practical applicability.
By means of statistical inferences based on copious data samples from different residential users, the Log-Normal and Gumbel distributions were verified to be effective in describing the peak water demand during the day. In addition, the Log-Logistic distribution, whose CDF has a closed form (like the Gumbel distribution), is also capable of modeling the maximum daily water demand.
Therefore, some practical equations were suggested, which allow estimating the daily peak demand for predefined values of the success probability.
The investigated case studies are very significant, not for their geographic location, but for the very different behavior of their users. In fact, the three towns present daily demand patterns whose peaks differ in number, in time of occurrence and in relative magnitude (e.g., the ratio of the morning peak to the evening one). The probabilistic models proved to be independent of the daily demand pattern and of the peak occurrence time. In fact, although the lifestyles of the investigated users are completely different, the proposed models fitted all datasets very well.
In addition, statistical analysis allowed obtaining effective equations to estimate the parameters of the proposed distributions in relation to the users’ number. Even if the investigated range of the aggregated users (200–1250) is limited, it is effective to model the water demand at the nodes of a WDS, because in numerical simulations the number of served users often falls in this interval.
The time resolution has been analyzed by investigating the effects of the time step, in the range 1 min–1 h, on the peak phenomena. It was observed that an aggregation of 1 min ≤ ∆t ≤ 1 h does not affect the choice of the distribution. In fact, the three distributions (LN, Gu and LL) proved to be equally effective in modeling the peak water demand for all the investigated ∆t. On the other hand, when estimating the parameters of the distributions, corrective coefficients are needed, and some effective equations have been proposed for this purpose. The latter allow estimating the parameters of the CDFs in relation to the number of users and the value of ∆t. The time resolution analysis considered the range 1 min–1 h because it covers a wide spectrum of engineering problems (e.g., design and management problems).
In the future, it is desirable that the results obtained here be made more robust by increasing the data sample, so that the range of validity of the models can also be extended. To this end, new in-situ laboratories, whose characteristics are consistent with those described in this work, are necessary.