Temporal Segmentation of Urban Water Consumption Patterns Based on Non-Parametric Density Clustering
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
The paper presents the application of multi-clustering to a database including time, water supply and electricity consumption in a real case study. Though the paper has some merits, I have the following major comments.
1 - Material and methods. Subsection 3.3. Typically, an optimization is performed considering an objective function and some decision variables. The Authors should describe their optimization in these canonical terms.
2 - Material and methods. Subsection 3.3. Step 1 is already present in equation 2, so it could be taken out from the optimization.
3 - Material and methods. There are two subsections 3.3 with the same caption. Please, rename the second.
4 - Subsection 3.4 is the really novel part of the method but it is poorly presented and the flowchart in Figure 4 is hardly described.
5 - Results. I do not understand how the five clusters in Figure 6 finally lead to the four time slots in Table 3, passing through the two main clusters in Table 2.
6 - Table 3. The four time slots in Table 3 are a bit trivial; engineering judgment could probably have identified them very easily. What is the additional benefit obtainable by the Authors' method compared to engineering judgment?
Other comments
7 - The paper is full of "Error! Reference source not found" for table and figure citations.
8 - Besides the works of Cominola et al., other works in the context of smart metering could be cited, including those parameterizing residential water demand pulse models through smart meter readings and those making use of smart meter data for modeling and forecasting water demand at the user level.
9 - Many acronyms appear without being defined. I had to wait a long time to understand what DBSCAN and UMAP are about.
Author Response
1 - Material and methods. Subsection 3.3. Typically, an optimization is performed considering an objective function and some decision variables. The Authors should describe their optimization in these canonical terms.
We thank the reviewer for their valuable suggestion. In the revised version, Subsection 3.3 is rewritten in canonical optimization terms. We explicitly state the objective function, decision variables, and constraints. The procedure is now described as finding the optimal number of points that ensures stable cluster formation, and then determining ε from this value.
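For illustration only, the two-stage idea described in this response (fix the number of points first, then derive ε from it) resembles the widely used k-distance heuristic for DBSCAN. The sketch below is not the authors' procedure: the function names, the toy data, and the use of a quantile of the k-distance curve as the cut-off are all assumptions made for the example.

```python
import math

def kth_neighbor_distances(points, k):
    """Distance from each point to its k-th nearest neighbour (the point itself excluded)."""
    result = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[k - 1])
    return result

def epsilon_from_k(points, k, quantile=0.9):
    """Hypothetical rule: take eps as a quantile of the sorted k-distance curve."""
    if not 0 < k < len(points):
        raise ValueError("k must be between 1 and len(points) - 1")
    kd = sorted(kth_neighbor_distances(points, k))
    idx = min(len(kd) - 1, int(quantile * len(kd)))
    return kd[idx]

# Toy data (made up): two tight groups of four points and one isolated point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0),
          (50.0, 50.0)]

# With k = 2, every grouped point has its 2nd neighbour at distance 1.0,
# while the isolated point is far from everything, so a quantile below 1
# keeps eps at the within-group scale.
eps = epsilon_from_k(points, k=2, quantile=0.8)
print(eps)
```

In this toy case the decision variable is k and ε follows from it; how stability of cluster formation is scored as an objective is left open here because the response does not specify it.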
2 - Material and methods. Subsection 3.3. Step 1 is already present in equation 2, so it could be taken out from the optimization.
We thank the reviewer for their comment. We agree that step 1 duplicated equation (2); it has therefore been excluded from the description of the optimization procedure. The corresponding changes have been made to Subsection 3.3.
3 - Material and methods. There are two subsections 3.3 with the same caption. Please, rename the second.
We thank the reviewer for their thoughtful comment. The text did indeed contain two subsections 3.3 with the same title. We renamed the second subsection, eliminating the duplication, and adjusted the numbering.
4 - Subsection 3.4 is the really novel part of the method but it is poorly presented and the flowchart in Figure 4 is hardly described.
We thank the reviewer for their valuable comments. In the revised version of the article, we expanded the description of Subsection 3.4, focusing more on the essence of the proposed method and its differences from the standard approach. We also updated the text explanation for the flowchart in Figure 4.
5 - Results. I do not understand how the five clusters in Figure 6 finally lead to the four time slots in Table 3, passing through the two main clusters in Table 2.
Thank you for your comment. In the original version of the article, there was indeed a discrepancy between the number of clusters in the figure and the number of time intervals in the table. This was due to the fact that clusters with insignificant relative importance were not included in the illustration. In the revised version, we have completely redesigned the presentation of clusters on the timeline.
6 - Table 3. The four time slots in Table 3 are a bit trivial; engineering judgment could probably have identified them very easily. What is the additional benefit obtainable by the Authors' method compared to engineering judgment?
Thank you for your comment. Indeed, the identified time intervals largely align with engineering intuition. However, the goal of the study was not so much to confirm obvious patterns as to develop an algorithmic mechanism for automatically identifying and labeling water consumption patterns. This task is considered an intermediate one, where the algorithm will be used to label patterns in classification and forecasting tasks. Unlike engineering assessment, the proposed approach provides a formalized segmentation method that can be integrated into intelligent models to improve prediction accuracy.
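To make the idea of automatic labeling concrete, the following minimal sketch collapses a sequence of per-interval cluster labels into contiguous time slots. It is an assumption-laden illustration, not the authors' implementation: the 30-minute step, the label names, and the toy label sequence are invented for the example and do not reproduce the values in the manuscript's tables.

```python
from itertools import groupby

def format_clock(minutes):
    """Format minutes since midnight as HH:MM (24:00 wraps to 00:00)."""
    return f"{(minutes // 60) % 24:02d}:{minutes % 60:02d}"

def labels_to_time_slots(labels, step_minutes=30):
    """Merge runs of identical consecutive labels into (start, end, label) slots."""
    slots = []
    start = 0
    for label, run in groupby(labels):
        end = start + len(list(run))
        slots.append((format_clock(start * step_minutes),
                      format_clock(end * step_minutes),
                      label))
        start = end
    return slots

# Toy day of 48 half-hour intervals with made-up labels.
labels = ["low"] * 12 + ["transition"] + ["high"] * 32 + ["transition"] * 3

slots = labels_to_time_slots(labels)
for slot in slots:
    print(slot)
```

A table of time slots produced this way can have fewer or more rows than there are clusters, since one cluster may occupy several separate runs during the day and adjacent intervals with the same label are merged.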
7 - The paper is full of "Error! Reference source not found" for table and figure citations.
We thank the reviewer for their comment. Errors like "Error! Reference source not found" were due to incorrect automatic links to figures and tables. These errors have been corrected in the revised version of the article.
8 - Besides the works of Cominola et al., other works in the context of smart metering could be cited, including those parameterizing residential water demand pulse models through smart meter readings and those making use of smart meter data for modeling and forecasting water demand at the user level.
Thank you for your comment. The following references have been added to the manuscript:
31. Hao, W., Cominola, A., & Castelletti, A. (2025). Short-Term Memory and Regional Climate Drive City-Scale Water Demand in the Contiguous US. Earth’s Future, 13(1). https://doi.org/10.1029/2024ef004415
32. Spinelli, D., Giuliani, M., & Castelletti, A. (2024). Ensemble Forecasts with Blocked K-Fold Cross-Validation in Multi-Objective Water Systems Control. 2024 European Control Conference (ECC), 493–498. https://doi.org/10.23919/ecc64448.2024.10591306
33. Pesantez, J. E., Berglund, E. Z., & Kaza, N. (2020). Smart meters data for modeling and forecasting water demand at the user-level. Environmental Modelling & Software, 125, 104633. https://doi.org/10.1016/j.envsoft.2020.104633
9 - Many acronyms appear without being defined. I had to wait a long time to understand what DBSCAN and UMAP are about.
We thank the reviewer for their comment. In the updated version of the article, all abbreviations used (including DBSCAN and UMAP) are explained at their first mention in the text.
Reviewer 2 Report
Comments and Suggestions for Authors
The manuscript entitled “Temporal Segmentation of Urban Water Consumption Patterns Based on Non-Parametric Density Clustering” focuses on the analysis of intra-daily water consumption in urban supply systems and proposes a modified DBSCAN clustering method to identify high-demand, low-demand, and transitional states. The topic is relevant to intelligent water supply and energy-efficient management, the dataset is authentic, the methodological framework is clearly described, and the results are consistent with actual consumption patterns, which provides certain practical value.
However, the originality and scientific contribution of the study remain limited. The clustering outcomes largely reproduce the obvious day–night consumption cycles, which appear reasonable but offer little novelty, and at times give the impression of being “too coincidental.” Furthermore, the lack of multi-dimensional validation and cross-scenario comparison raises concerns about the robustness and generalizability of the conclusions.
Specific issues include:
- The main clusters (high demand: 6:30–22:30, low demand: 0:30–6:00) almost entirely coincide with common-sense daily routines, representing formal confirmation rather than new discovery.
- Transitional clusters merely describe gradual shifts between day and night, without providing deeper insights.
- No validation is presented against actual pumping schedules or socio-activity data, limiting engineering relevance.
- No comparison is made with alternative methods (e.g., K-means, hierarchical clustering), making it difficult to highlight the advantages of the proposed approach.
- The analysis is based on a single city and one year of data, leaving external validity unclear.
- DBSCAN is inherently sensitive to parameter settings; even with hyperparameter optimization, cross-dataset testing is needed to establish robustness.
- The discussion does not sufficiently distinguish between algorithmic findings and well-known consumption patterns, leading to potential overinterpretation.
Author Response
- The main clusters (high demand: 6:30–22:30, low demand: 0:30–6:00) almost entirely coincide with common-sense daily routines, representing formal confirmation rather than new discovery.
Thank you for your comment. In Section 4, we clarified that the coincidence of the base clusters with obvious diurnal activity cycles confirms the validity of the method. We added an additional analysis to Section 4 that included hydraulic pressure, which allowed us to identify several additional regimes. We would like to note that the primary goal of this study was not to discover new patterns, but to formalize and label regimes for their subsequent use in predictive models.
- Transitional clusters merely describe gradual shifts between day and night, without providing deeper insights.
Thank you for your comment. In the updated version of the article (Section 4.2), we emphasized that transient conditions, although small in proportion (approximately 6-7%), capture the separation of morning and evening intervals, which are important for forecast models. We clarified that such intervals play a role in modeling transient conditions and allow for improved forecast accuracy.
- No validation is presented against actual pumping schedules or socio-activity data, limiting engineering relevance.
Thank you for your comment. In the revised version of the article, we clarified that the data used represent actual pumping station performance indicators (water supply, power consumption, and pressure). To strengthen the engineering interpretation, a new subsection was added to Section 4, introducing a hydraulic factor (average pressure at the pumping station outlet), expanding the structure of the identified modes and clarifying the transition intervals.
- No comparison is made with alternative methods (e.g., K-means, hierarchical clustering), making it difficult to highlight the advantages of the proposed approach.
Thank you for your feedback. Substantial changes have been made to Section 4. We added Section 4.3, where we implemented the proposed method with K-means. It was shown that K-means reproduces general trends but produces smooth and overlapping clusters, while the modified DBSCAN identifies more compact and interpretable modes.
- The analysis is based on a single city and one year of data, leaving external validity unclear.
Thank you for your comment. We clarified in the article that the study is indeed based on data from a single year and a single city. This choice is due to the highly labor-intensive process of collecting and fully digitizing pumping station logs, including flow, power consumption, and pressure data. However, we agree that expanding the analysis to long-term time series and other cities appears to be a promising direction for future research. For the purposes of this paper, the primary focus was on methodological formalization and identifying patterns that we expect to use for labeling time series in water consumption regime problems.
- DBSCAN is inherently sensitive to parameter settings; even with hyperparameter optimization, cross-dataset testing is needed to establish robustness.
Thank you for your comment. In response, we expanded the analysis scenario by introducing an additional parameter: pumping station pressure. This allowed us to test the algorithm's robustness to changes in the feature space dimensionality. As shown in Section 4.2, the main clusters (nighttime minimum and daytime maximum) retained their structure, and subtypes of transient conditions that were previously less pronounced were also identified.
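One simple way to put a number on the statement that clusters "retained their structure" after a feature is added is to compare the two labelings pairwise. The sketch below computes the plain Rand index between two label vectors. It is offered as an assumption: the response does not say which agreement measure, if any, was used, and the label vectors here are invented.

```python
from itertools import combinations

def rand_index(labels_a, labels_b):
    """Fraction of point pairs on which two labelings agree (same/same or different/different)."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    if len(labels_a) < 2:
        raise ValueError("need at least two points")
    agree = 0
    total = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        agree += (same_a == same_b)
        total += 1
    return agree / total

# Made-up labels: clustering without and with an extra feature.
# In the second labeling one point of the second cluster splits off.
labels_without = [0, 0, 0, 1, 1, 1]
labels_with = [0, 0, 0, 1, 1, 2]

ri = rand_index(labels_without, labels_with)
print(round(ri, 3))
```

A value near 1 indicates that the partition is largely preserved. Note that this simple version treats the DBSCAN noise label like any other cluster, which may or may not be appropriate for a given analysis.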
- The discussion does not sufficiently distinguish between algorithmic findings and well-known consumption patterns, leading to potential overinterpretation.
Thank you for your comment. We have taken this into account and completely revised Section 4, adding a clear distinction between stable clusters, which coincide with obvious diurnal properties, and transient regimes identified algorithmically. In particular, we emphasized that the goal of the study was not simply to confirm known patterns, but to develop a formalized mechanism for labeling regimes that could subsequently be used for classification and forecasting purposes.
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
The current citation 31 is out of scope for the paper. A better citation would be about "Parameterizing residential water demand pulse models through smart meter readings".
Reviewer 2 Report
Comments and Suggestions for Authors
The authors have replied well to my comments.
