**8. Discussion and Conclusions**

In recent decades, demand-side water management has emerged as a key strategy to pursue efficient water use and to complement supply-side interventions, enhancing the overall resilience of urban water systems. The rise of demand-side water management, coupled with the development of digital water metering technologies, has fostered the collection of water demand data at increasingly high spatial and temporal resolutions. The availability of water demand data at the spatial scale of individual households or end uses, and with a time sampling resolution of a few seconds or minutes, has opened up unprecedented opportunities to improve our understanding of water consumer behaviors and our ability to model water demand. As a consequence of this transformative process, the literature is now rich with urban water demand datasets collected over time with different spatial and temporal resolutions, and archived with different levels of accessibility.

In this paper, we reviewed 92 water demand datasets and 120 related peer-reviewed publications compiled over the last 45 years. We analyzed the datasets and classified them according to their spatial scale, temporal scale, and level of accessibility. Moreover, we analyzed their domains of application within water demand modelling and management studies, and compared them with similar datasets in the electricity sector. As a result of this review and classification effort, we can summarize the following takeaways and address the research questions introduced in Figure 1.

Q1. How are the existing urban water demand datasets distributed across different spatial scales? We found that the majority of the reviewed datasets were collected at the household (31 datasets) or end use scale (41 datasets). Only 20 datasets were identified at the district scale. This is likely due to the increasing number of water demand studies that were developed after the advent of digital water meters. Moreover, the datasets gathered at the district scale are usually owned by water utilities, which typically make them available to researchers only temporarily and for ad hoc case study analyses.

Q2. How are the existing urban water demand datasets distributed across different temporal scales? Focusing on the finest spatial scales analyzed, i.e., the household and end use scales, we found that most of the analyzed datasets contain data sampled with a time frequency in the range of 1 s to 1 day. Yet, differences exist: most of the end use-scale datasets contain data gathered with a sub-minute resolution, while household-scale data are characterized by time sampling resolutions of 15 min to 1 day. This is primarily due to the high temporal resolution required by residential water end use disaggregation models.
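
To make the difference between these sampling resolutions concrete, the following minimal sketch aggregates sub-minute readings into 15 min totals, i.e., it moves data from a resolution typical of end use datasets to one typical of household-scale datasets. It is an illustration under stated assumptions, not a procedure used in any of the reviewed studies: the function name `aggregate_volumes`, the 10 s sampling interval, and the 0.5 L readings are all hypothetical, and timestamps are assumed to be naive and expressed in a single time zone.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Reference instant used to align bins (assumption: naive timestamps, one time zone).
EPOCH = datetime(1970, 1, 1)


def aggregate_volumes(readings, bin_seconds):
    """Sum per-interval volumes (litres) into fixed-width time bins.

    readings: iterable of (timestamp, litres) pairs, one per sampling interval.
    bin_seconds: width of the target bins in seconds (e.g., 900 for 15 min).
    Returns a dict mapping each bin start time to the total volume in that bin.
    """
    totals = defaultdict(float)
    for timestamp, litres in readings:
        elapsed = int((timestamp - EPOCH).total_seconds())
        # Floor the timestamp to the start of the bin it falls in.
        bin_start = EPOCH + timedelta(seconds=elapsed - elapsed % bin_seconds)
        totals[bin_start] += litres
    return dict(sorted(totals.items()))


# Hypothetical data: 30 minutes of 10 s readings, 0.5 L consumed per interval.
start = datetime(2021, 1, 1, 7, 0, 0)
readings = [(start + timedelta(seconds=10 * i), 0.5) for i in range(180)]

quarter_hourly = aggregate_volumes(readings, 15 * 60)
for bin_start, litres in quarter_hourly.items():
    print(bin_start.isoformat(), litres)
```

Aggregation of this kind is one-directional: the 15 min totals can be derived from the 10 s readings, but not the reverse, which is consistent with end use disaggregation requiring data collected at high temporal resolution in the first place.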

Q3. What are the main domains of application of the reviewed studies, within water demand modelling and management studies? Our review reveals that the datasets reviewed at the district level are mainly used to estimate aggregate demand patterns, which are in turn used in water distribution network models to investigate water network partitioning, hydraulic performance, network anomalies, and leakage detection. Household-scale datasets have been primarily used to develop data-driven models for water demand forecasting, as well as for explorative analysis to identify water demand determinants. Consistent with our findings for Q2, end use datasets are primarily gathered to develop, train, and validate end use disaggregation algorithms. Both household and end use datasets have also been used to inform water conservation/demand management programs and monitor their effectiveness in changing water demand patterns.

Q4. What is the access policy for the reviewed datasets? Most of the reviewed datasets are not open access. Usually, they have restricted access, i.e., they are available for purchase, or can only be obtained by contacting the researchers or water utilities that compiled and own the dataset. However, some household- and end use-scale datasets have become openly available, primarily in the last 5 years. This is an encouraging signal for future data sharing and research reproducibility.

Q5. Is there any synergy with comparable datasets in the electricity sector? Similarities exist in the spatial and temporal scales of interest for both the water and the electricity sector, and the number of reviewed datasets is comparable. Yet, the datasets in these two domains are still very different with regard to their accessibility. Open access datasets are more easily available in the electricity sector, primarily because of the extensive research efforts developed in the last three decades on the problem of electricity end-use disaggregation.

Overall, this paper can provide researchers in the water demand modelling and management sector with useful information to identify data readily available in formats and at spatial and temporal scales that suit their research needs. We also identify a roadmap of priorities to fully unlock the information value of urban water demand datasets. First, the scientific community would benefit from increased accessibility to open data. We acknowledge that water demand data are sensitive and that anonymization and privacy-protection measures need to be undertaken before they can be made openly available. Sharing high-resolution data, consumer data, and sensitive digital data implies potential risks for the privacy and security of private or personal information. Sensitive datasets could potentially be used by third parties for profit and intimidation, or to intrusively track private activities [168]. In response to privacy and security concerns, data protection regulations, such as the General Data Protection Regulation (GDPR) implemented by the EU in 2018 and other policies initiated after it in other countries worldwide, should be established at the regulatory level [192]. When provided in compliance with privacy protection and data security frameworks, an increasing availability of open access datasets would improve research reproducibility, create opportunities for research benchmarking, and foster more transparent and possibly collaborative development and validation of analytic tools.

Second, this review is focused solely on water demand datasets, with primary focus on the household and the end use scales, and only a general overview of possible applications at different temporal and spatial resolutions is provided. Future work could systematically review the different goals of existing urban water demand studies at different suburban and urban scales, including those focused on outdoor water use [193], urban landscape water conservation [194], economics and price influences [91], socioeconomic factors and drivers of water demand [195], and metropolitan water planning [196]. These last categories of studies and applications, in particular, entail cross-domain analyses that combine water consumption data with data from other sources (e.g., socio-economic, climate, and behavioral data). Besides requiring proper analytic tools for data analysis, these studies call for data management and sharing frameworks and protocols designed to facilitate data fusion among private/public water utilities and the other stakeholders involved in inter-sectoral studies.

Third, the reviewed datasets are unevenly spread geographically (some geographical hot spots in the USA, Europe, and Australia were identified) and come with different spatial and temporal resolutions. Research efforts aimed at quantitatively comparing water demand data (water consumption volumes, peaks, patterns) gathered across different scales and geographical contexts would advance the generalization of water demand models and help upscale the findings from currently localized water demand studies. In addition, important aspects related to the use of water consumption data from different meters include data standardization and meter accuracy. Data from various sources need a standardized format to facilitate and improve the use of WDDs and increase data portability, interoperability, and overall data quality [197,198]. Moreover, future research could focus on assessing and comparing the datasets in the catalogue we have built in this work in terms of measurement precision and accuracy.
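
As a simple illustration of what such standardization involves in practice, the sketch below maps meter readings reported in different volume units onto a common record layout expressed in litres. It is only an example under stated assumptions and does not reproduce any format proposed in [197,198]: the field names (`meter_id`, `timestamp`, `volume_l`), the unit labels, and the sample readings are hypothetical.

```python
from datetime import datetime

# Conversion factors to litres. Unit labels are hypothetical; the factors are exact
# by definition (1 m3 = 1000 L, 1 US gallon = 3.785411784 L).
LITRES_PER_UNIT = {
    "L": 1.0,
    "m3": 1000.0,
    "gal_us": 3.785411784,
}


def to_litres(value, unit):
    """Convert a volume to litres, rejecting units that are not explicitly known."""
    if unit not in LITRES_PER_UNIT:
        raise ValueError(f"unknown volume unit: {unit!r}")
    return value * LITRES_PER_UNIT[unit]


def standardize(meter_id, timestamp, value, unit):
    """Return one reading in a common layout: ISO 8601 timestamp, volume in litres."""
    return {
        "meter_id": str(meter_id),
        "timestamp": timestamp.isoformat(),
        "volume_l": to_litres(value, unit),
    }


# Hypothetical readings from two meters that report in different units.
raw = [
    ("A-001", datetime(2021, 1, 1, 7, 0), 0.012, "m3"),
    ("B-417", datetime(2021, 1, 1, 7, 0), 3.0, "gal_us"),
]
records = [standardize(*reading) for reading in raw]
for record in records:
    print(record)
```

Failing on unknown units, rather than guessing, keeps unit mismatches visible when data from several sources are merged, which is one of the data quality concerns raised above.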

Finally, we expect that the current challenges posed to the resilience of interconnected critical infrastructure will foster efforts aimed at overcoming data silos and encourage the development and transfer of multi-sectoral analytic tools to inform resilience planning across sectors (e.g., smart electricity grids, green infrastructure) and scales [26].

**Supplementary Materials:** The complete catalog with the 92 state-of-the-art water demand datasets and 120 publications reviewed in this paper is available on Zenodo (https://doi.org/10.5281/zenodo.4390460 [50]) and in this public GitHub repository: https://github.com/AnnaDiMauro/WDDreview. The complete list and metadata of the additional 57 electricity datasets at the end use and household scales that we reviewed in this paper are reported in Supplementary Tables S1 (end use scale) and S2 (household scale), which are available online at https://www.mdpi.com/2073-4441/13/1/36/s1.

**Author Contributions:** All authors designed the research. A.D.M. compiled the catalog of the reviewed datasets and peer-reviewed publications, and performed the review. A.D.M. and A.C. (Andrea Cominola) analyzed the outcomes of the review. A.C. (Andrea Cominola), A.C. (Andrea Castelletti), and A.D.N. supervised the research. All authors reviewed the manuscript. All authors have read and agreed to the published version of the manuscript.

**Funding:** The research was conducted as part of the activities financed with the awarding of the V:ALERE: 2019 project of the University of Campania Luigi Vanvitelli.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**

