**1. Introduction**

Population growth, urbanization, and climate change are expected to increase the stress on freshwater resources and the burden on urban water systems [1–3]. Adaptive planning and management strategies are thus needed to address seasonal or prolonged water scarcity in drought-prone areas and meet water demands with reduced operational expenditure, overall increasing the resilience of critical urban water network infrastructure systems [4].

In recent decades, demand-side management has increasingly emerged as a key approach to complement traditional water supply operations [5]. Different water demand management strategies (WDMS) have been proposed in the literature to foster water conservation and more efficient water demands [6,7]. These include technological, financial, legislative, maintenance, and educational interventions [8]. The rise of demand-side water management has motivated the development of increasingly sophisticated technologies and mathematical models to monitor, characterize, and predict water demands at different spatial and temporal scales, and to capture the existing relationships between water demand and its potential climatic and socio-demographic determinants [9–11].

At the coarser urban and suburban scales, the state-of-the-art literature is rich with studies focused on improving the efficiency of water distribution network (WDN) operations (e.g., [12–14]). In these studies, water demands are often considered as a stationary or seasonal input to the hydraulic model of the WDN, with a spatial level of aggregation referred to the city or the district scale. Such spatial scales are typically relevant for infrastructure planning, WDN design, and WDN partitioning. More recently, various techniques for water demand forecasting have also been proposed in the literature. They include regression analysis, time series analysis, and techniques based on black box models, including different Artificial Neural Network architectures (e.g., [15]). Demand prediction models have been developed at different spatial and temporal scales, with the majority of the studies focusing on urban and suburban scales, and temporal resolutions spanning from hourly to monthly intervals (e.g., [16–18]). A disruptive phase in the development of water demand studies is represented by the advent of smart metering technologies [8,19]. The development of smart meters allowed gathering water demand data with an unprecedented level of spatiotemporal detail. Water demand data became potentially available at the spatial scale of individual households and data logging intervals of a few seconds [20]. While understanding the full range of potential benefits of smart meters for water utilities and customers is still a topic for active discussion [21], the variety of studies in the literature based upon smart meter data demonstrates the diversity of data-driven opportunities that high-resolution smart meter data opened up in the context of water demand modelling and management. These include, e.g., water demand profiling and customer segmentation [22], post-meter leak detection and water loss management [23], end use studies for fixture-level water demand breakdown and detailed demand forecasting [24], and behavioral studies [25].

**Citation:** Di Mauro, A.; Cominola, A.; Di Nardo, A.; Castelletti, A. Urban Water Consumption at Multiple Spatial and Temporal Scales. A Review of Existing Datasets. *Water* **2021**, *13*, 36. https://dx.doi.org/10.3390/w13010036

Received: 31 October 2020 Accepted: 14 December 2020 Published: 28 December 2020

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

The continuously increasing number of smart meter trials and demand modelling and management studies since the middle of the 1990s [8] suggests that several high-resolution water demand datasets have been recently compiled. The availability of high-resolution datasets opens up several opportunities for advanced applications, including the development of water end use disaggregation algorithms and machine learning techniques for user profiling. Such applications could benefit from open datasets to enhance comparative applications and benchmarking, and to facilitate the development of general algorithms trained on combined datasets with water consumption data from different sources and locations. High-resolution datasets, considered in combination with the more traditional water demand datasets gathered at coarser spatial and temporal resolutions, would represent a valuable resource for researchers and scientific efforts targeting the development and validation of mathematical models of water demand at different spatial and temporal scales, or the development of advanced smart metering analytics.

Yet, information and metadata on individual water demand datasets are scattered in the literature, and to the authors' knowledge, a comprehensive review of the existing datasets is still missing. Existing data are frequently difficult to access or use, and existing literature reviews on urban water consumption focus on demand modelling or other data-driven applications, rather than on analyzing the heterogeneity of existing datasets, their spatial and temporal scales, and accessibility. Motivated by the recent development and availability of datasets gathered with increasingly high spatial and temporal resolution, the aim of this paper is to gather information on the datasets to identify current trends and gaps and help future data-driven research, along with research benchmarking and reproducibility.

This review contributes the first effort of classification and analysis of 92 water demand datasets and 120 related peer-reviewed publications that have been compiled in the last 45 years to monitor urban water consumption data at different spatial and temporal scales and provide data for water demand modelling and management studies. We characterize the reviewed datasets according to their heterogeneous spatial and temporal scales, and investigate their accessibility. Moreover, since digital disruption has transformed the electricity industry earlier and some lessons learned may also apply in the water or multi-utility sectors [26], we additionally explore similarities and differences between the reviewed subset of high-resolution water demand datasets and 57 comparable high-resolution electricity demand datasets.

We thus analyze the reviewed datasets and publications to address the five research questions summarized in Figure 1.


**Figure 1.** State-of-the-art water demand datasets review: summary of the research questions and multi-stage analysis.

The ultimate goal of this review is to compile an updated catalog of the existing water demand datasets and facilitate future research efforts in this rapidly evolving field of investigation. Researchers performing water demand studies could refer to this review to identify data readily available in formats, spatial scales, and temporal scales that suit their research needs. Finally, this review will also help identify water demand datasets that are accessible free of charge, in an attempt to promote further publication of open-access datasets to foster reproducible research, benchmarking, and the development/validation of existing software tools to generate reliable and realistic synthetic data [27–29].

The paper is structured as follows. The dataset review methods and the considered spatial and temporal scales are presented in Section 2; an overview of the dataset search outcomes is presented in Section 3; Sections 4–6 analyze the reviewed datasets in terms of (i) spatial scales, (ii) temporal scales, and (iii) accessibility; Section 7 analyzes similarities and synergies between some of the reviewed water demand datasets and comparable electricity demand datasets; finally, Section 8 draws some final remarks and directions for follow-up research.

#### **2. Datasets Review Methods**

To address the research questions formulated in Figure 1, we searched for water demand datasets collected at different spatial and temporal scales and referenced in the peer-reviewed scientific literature on water demand modelling and management. We searched on different web search engines and scientific databases, namely, Google Scholar (https://scholar.google.com/), Mendeley (https://mendeley.com/), Mendeley Data (https://data.mendeley.com/), and data.world (https://data.world/datasets/). We adopted the following 3-step procedure:


In addition to the datasets retrieved with the above search, we included in this review other high-resolution datasets retrieved from two articles strongly focused on residential water demand, i.e., [30,31].

After compiling an inventory with the datasets and related publications retrieved with the above search methods, we reviewed, classified, and critically analyzed the inventory according to three main criteria: (i) spatial scale (Section 4), (ii) temporal scale (Section 5), and (iii) dataset accessibility/access policy (Section 6).

## *Spatial and Temporal Scales of Interest*

Depending on the spatial scale of interest, different metering and monitoring tools for water consumption data gathering can be adopted. For instance, end use metering usually requires ad hoc, customized solutions [20,32], while household or district water consumption can be monitored with commercial flow meters [33]. Datasets collected at different spatial scales will thus represent different levels of aggregation of water demand and will possibly have implications on data privacy and ownership (e.g., water utilities vs. individual water consumers). Numerous benefits can derive from high-resolution data, both for water utilities and water consumers [21]. Such data enable, for instance, accurate modelling of water demand patterns, peaks, and anomalies (e.g., leaks) [28]. However, large, high-resolution datasets also imply several potential drawbacks, e.g., privacy concerns and the need for cloud resources for data storage and new skills for data analytics [34]. We identified four scales of interest for urban water consumption monitoring and analysis, from the coarser to the finer:


In this review, we take into account the spatial scale dependencies of the reviewed datasets and classify them according to the three suburban scales included in the *city* level: District, Household, and End Use. In the literature, the spatial scale of interest is related to the type of application that requires water demand datasets (WDDs). WDDs at the district scale, for instance, are mainly used to investigate water network partitioning [35,36], compute water balances [37], assess the hydraulic performance of the network system [38], and perform leakage identification and localization [39,40]. The level of aggregation of these WDDs depends on the network configuration and/or DMA design, and often refers to water demands at network nodes [41,42]. At the household scale, WDDs represent domestic water demands and are primarily used to build descriptive and predictive models of water demand, estimate demand peak timing and magnitude to inform water network operations, and inform conservation campaigns and demand management interventions [43,44]. Finally, at the end use scale, WDDs are used to improve our understanding of residential water consumption behaviors, develop disaggregation models to estimate the share of household water consumption of individual fixtures, develop customized water demand management strategies and billing reports, and overall increase customer engagement and help water utilities and customers promote efficient water usage [45,46]. In keeping with the different spatial and temporal scales considered in this study, this review includes both water consumption data retrieved with digital water meters and data measured with low-resolution meters or retrieved from water bills [47–49]. Furthermore, when a dataset or publication considers multiple spatial scales, we classify it according to the finest level of data granularity.
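The finest-granularity rule stated above can be illustrated with a minimal Python sketch. The names `SpatialScale` and `finest_scale` are hypothetical and introduced here only for illustration; they are not part of the review's methodology or of any published tool. The sketch assumes that the three suburban scales can be ordered from coarsest to finest.

```python
from enum import IntEnum
from typing import Iterable


class SpatialScale(IntEnum):
    """Suburban scales, ordered from coarsest (1) to finest (3).

    Hypothetical encoding, assumed for illustration only.
    """
    DISTRICT = 1
    HOUSEHOLD = 2
    END_USE = 3


def finest_scale(scales: Iterable[SpatialScale]) -> SpatialScale:
    """Return the finest spatial scale among those a dataset covers."""
    covered = list(scales)
    if not covered:
        raise ValueError("a dataset must cover at least one spatial scale")
    # A higher enum value means a finer level of data granularity.
    return max(covered)
```

Under these assumptions, a dataset reporting both district and household data, i.e., `finest_scale([SpatialScale.DISTRICT, SpatialScale.HOUSEHOLD])`, would be classified at the household scale.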

Besides the spatial dimension, we also explore how datasets differ in terms of temporal scale (or time sampling frequency). Previous literature has shown that water demand data gathered at monthly or quarterly resolution are mainly used to inform strategic regional planning and to calculate water bills [11], while a number of additional applications, including post-meter leak detection and water end use disaggregation, can be enabled by sub-daily data (e.g., recorded with a time sampling frequency of 1 h or a few minutes/seconds) [28]. Here, we characterize the datasets collected at the district, household, and end use scales according to their time sampling resolution, with primary focus on daily and sub-daily frequencies. We consider datasets to have a *low resolution* when they include data with a daily or lower time sampling frequency (e.g., monthly). In turn, we consider as *high resolution* datasets those gathered with a sub-daily frequency (e.g., hourly, 1 min, 10 s).
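The low/high resolution distinction defined above amounts to a single threshold on the sampling interval. The following Python sketch shows one possible encoding; the function name `classify_temporal_resolution` and the representation of the sampling interval as a `timedelta` are assumptions made for illustration, not part of the review's method.

```python
from datetime import timedelta

ONE_DAY = timedelta(days=1)


def classify_temporal_resolution(sampling_interval: timedelta) -> str:
    """Label a dataset by its time sampling interval.

    'high' means sub-daily sampling (e.g., hourly, 1 min, 10 s);
    'low' means daily or coarser sampling (e.g., monthly).
    """
    if sampling_interval <= timedelta(0):
        raise ValueError("sampling interval must be positive")
    # Sub-daily data are sampled at intervals strictly shorter than one day.
    return "high" if sampling_interval < ONE_DAY else "low"
```

For example, a 10 s or 1 h interval would be labelled `"high"`, while a daily or roughly monthly interval (e.g., `timedelta(days=30)`) would be labelled `"low"`.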
