Article
Peer-Review Record

Detecting Regional Differences in Italian Health Services during Five COVID-19 Waves

Stats 2023, 6(2), 506-518; https://doi.org/10.3390/stats6020032
by Lucio Palazzo 1,*,† and Riccardo Ievoli 2,*,†
Reviewer 1: Anonymous
Reviewer 2:
Reviewer 3:
Submission received: 19 January 2023 / Revised: 11 April 2023 / Accepted: 13 April 2023 / Published: 15 April 2023
(This article belongs to the Special Issue Novel Semiparametric Methods)

Round 1

Reviewer 1 Report

This paper seems to be original. It demonstrates an adequate understanding of the relevant literature in the field and cites an appropriate range of literature sources.

Abstract

- Clearly described

Introduction

- Clearly identifies the aim of the study/research

Materials and Methods

- described step by step

Results

- results are explained in detail and sequentially.

Discussion

The discussion is explained very clearly and in detail

Conclusion

Clearly described

Author Response

Thank you for your kind review and gentle feedback of our manuscript.

Reviewer 2 Report

  1. Why did the paper investigate five waves while the title only mentioned two waves?

  2. How to select the number of K of trimmed k means cluster? The author just mentioned it is purely based on data but do you use any tests?

  3. On page 1, you said “In fact, the territorial imbalances should be neither attributable to well-known economic and infrastructural disparities nor to geographical propagation flows of the virus itself” Do you mean there is no economic disparity among the different regions? Because if there is disparity, I believe it will impact the number of hospitals, number of doctors, etc and it probably will impact the territory imbalance.

  4. In the time series clustering step 2, why are we only storing the coordinates of the first two components? 

  5. When finding similarity scores, what feature do we use? Any feature related to demographic information of the region?

  6. How to choose the weight of wMDS?

Author Response

Let us thank the Reviewer for giving us many helpful suggestions to extend the scope of the paper and also to present the material in a more convincing manner. We have reproduced your comments followed by a discussion in italics (where appropriate) to detail the changes inserted in the current version of the manuscript.

Reply to the Reviewer 2

1. Why did the paper investigate five waves while the title only mentioned two waves?

Reply: Thank you for noticing this inconsistency. Firstly, we corrected the typo and then modified the title following the feedback of another reviewer.

2. How to select the number of K of trimmed k means cluster? The author just mentioned it is purely based on data but do you use any tests?

Reply: Thank you for this note regarding the clustering strategy. We confirm that no formal test is carried out; instead, we apply a conventional clustering diagnostic, i.e., the within-cluster sum of squares computed for different choices of K, also known as the elbow criterion. We clarified the use of this diagnostic in the main text.
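
The elbow diagnostic mentioned in this reply can be illustrated with a minimal sketch. It is written in Python rather than the authors' R, it uses plain (untrimmed) k-means with a deterministic farthest-point start instead of the trimmed k-means of the paper, and it runs on made-up 2-D points standing in for the first two wMDS coordinates. All function and variable names are hypothetical.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def within_cluster_ss(points, k, max_iter=100):
    """Run plain Lloyd k-means and return the within-cluster sum of squares."""
    # Deterministic farthest-point start (an assumption of this sketch):
    # take the first point, then repeatedly add the point farthest from
    # the centres chosen so far.
    centres = [points[0]]
    while len(centres) < k:
        centres.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centres)))

    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centre.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centres[j]))
            clusters[nearest].append(p)
        # Update step: move each centre to the mean of its members
        # (an empty cluster keeps its previous centre).
        updated = [
            tuple(sum(axis) / len(members) for axis in zip(*members)) if members else centres[j]
            for j, members in enumerate(clusters)
        ]
        if updated == centres:
            break
        centres = updated

    return sum(min(sq_dist(p, c) for c in centres) for p in points)


# Toy data (not from the paper): three well-separated groups of three points.
toy_points = [
    (0, 0), (0, 1), (1, 0),
    (10, 0), (10, 1), (11, 0),
    (0, 10), (0, 11), (1, 10),
]

# Within-cluster sum of squares for K = 1, ..., 4.
wss = {k: within_cluster_ss(toy_points, k) for k in range(1, 5)}
for k, value in wss.items():
    print(k, round(value, 3))
```

On this toy data the curve drops sharply up to K = 3 and flattens afterwards, which is the "elbow" one would look for when inspecting the plot.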

3. On page 1, you said “In fact, the territorial imbalances should be neither attributable to well-known economic and infrastructural disparities nor to geographical propagation flows of the virus itself” Do you mean there is no economic disparity among the different regions? Because if there is disparity, I believe it will impact the number of hospitals, number of doctors, etc and it probably will impact the territory imbalance.

Reply: Thank you for pointing out this misleading sentence. Our intention was to underline how the proposed clustering can help identify similarities and differences in the performance of regional health systems, which can obviously be due to the availability of economic resources but also to the governance model adopted in local health care. In the revised version of the manuscript we modified the sentence following your suggestion.

4. In the time series clustering step 2, why are we only storing the coordinates of the first two components?

Reply: Thank you for the question. We highlighted that the percentage of explained variance obtained through the first two components of our weighted MDS, denoted as Mardia's score, is always greater than 0.9. Mardia's rule of thumb suggests accepting values of the score greater than 0.8.
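
As a rough illustration of the score discussed in this reply, the sketch below computes one common form of Mardia's criterion from a list of MDS eigenvalues: the sum of the leading eigenvalues divided by the sum of the absolute values of all eigenvalues. Whether the paper uses exactly this variant is an assumption, and the eigenvalues are invented for the example.

```python
def mardia_score(eigenvalues, n_components=2):
    """Share of total (absolute) eigenvalue mass carried by the leading components."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(abs(value) for value in ordered)
    return sum(ordered[:n_components]) / total


# Hypothetical eigenvalues of a doubly centred dissimilarity matrix;
# a small negative value can appear when the distances are not Euclidean.
toy_eigenvalues = [6.0, 3.0, 0.5, 0.3, -0.2]

score = mardia_score(toy_eigenvalues, n_components=2)
# Rule of thumb quoted in the reply: keep the 2-D solution if the score exceeds 0.8.
acceptable = score > 0.8
print(round(score, 3), acceptable)
```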

5. When finding similarity scores, what feature do we use? Any feature related to demographic information of the region?

Reply: Thank you for this comment. We used the demographic information (number of inhabitants) to obtain the three considered indicators, which are used to carry out the wMDS.

6. How to choose the weight of wMDS?

Reply: Several possibilities can be explored to properly choose the weights of wMDS. Given the scope of the paper, our purpose was to take into account some spatial features of the Regions. For this purpose we selected the weights considering the number of bordering regions of each region. The weight associated with a specific Region i is the ratio between the number of borders of Region i and the total number of borders of all Regions.
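
The weighting rule described in this reply can be sketched as follows. The adjacency map uses four hypothetical regions (A to D), not the Italian regions, and the function name is made up for illustration.

```python
def border_weights(neighbours):
    """Weight of each region = its number of bordering regions / sum over all regions."""
    counts = {region: len(adjacent) for region, adjacent in neighbours.items()}
    total = sum(counts.values())
    return {region: count / total for region, count in counts.items()}


# Hypothetical adjacency map: each region lists the regions it borders.
toy_neighbours = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B"},
    "D": {"B"},
}

weights = border_weights(toy_neighbours)
print(weights)
```

The weights sum to one by construction. Under this formula a region with no bordering regions would receive weight 0; the record does not say how such cases are treated.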

Author Response File: Author Response.pdf

Reviewer 3 Report

Please check the attached pdf.

Comments for author File: Comments.pdf

Author Response

Let us thank the Reviewer for giving us many helpful suggestions to extend the scope of the paper and also to present the material in a more convincing manner. We have reproduced your comments followed by a discussion in italics (where appropriate) to detail the changes inserted in the current version of the manuscript.

Reply to the Reviewer 3

This paper proposes a three-step clustering procedure for time series data. The spatial effect is captured by using neighborhood counts as weights in the MDS. Different distances for time series are compared. For me, the paper is well written and easy to understand.

Comments:

1. In this paper, different distances based on the proposed method are compared. However, time series clustering has many existing methods. For a comprehensive case study, it is necessary to compare the proposed methods with existing methods as the baselines. The advantage and limitation of different methods should also be discussed.

Reply: The purpose of this paper is focused on showing the applicability of wMDS combined with various distance measures in the context of time series related to the effects of a pandemic (or epidemic). It would be possible to perform a broader benchmark of the proposed approach, highlighting the pros and cons, but this would change the purpose of this work. For this reason, a broader comparison of many methods is deferred for further research involving a simulation study.

2. The code is not provided in the paper. It would be good to provide the model code to let others utilize your procedure easily.

Reply: Thank you for this methodological concern. We attached as supplementary material an R-script with an example procedure explaining how to apply the weighted MDS to our considered COVID-19 data.

3. I am wondering whether the title is proper. The paper proposes a time series clustering, but the title says “detecting difference”. Although the distance between time series measures similarity, the distance is only an intermediate step of the procedure.

Reply: Thank you for this note. In the title we refer to regional differences in terms of the geographical impact of COVID-19 on regional health services. According to your suggestion, we clarified this aspect and, therefore, we changed the title of the manuscript. The new title is: “Detecting regional differences in Italian health services during five COVID-19 waves”.

4. Although this is an application and case study paper, a simulation study may still be needed to validate the functionality of the proposed model. Also, the sensitivity of clustering can be studied through different simulation settings.

Reply: Thanks for the remark. We intend to carry out broader research investigating different scenarios through the use of simulated data, but first we need to develop a data generation process that follows the main characteristics of the COVID-19 pandemic or, more generally, of the spread of a pandemic.

5. Another reason for a simulation is that, although the spatial effect is taken into account by using the weighted MDS, it is doubtful that the weight truly models the clustering. The spatial effect will be complicated in public health. Additionally, other covariates are not included in the clustering. Such points should be discussed in the future research section.

Reply: Thank you for this methodological concern. As suggested we expanded the last section to include a discussion regarding these points.

6. I am confused about the relationship between the goodness-of-fit metrics of MDS and different distances. A bad GOF metric may be caused by the data, the types of MDS, and so on. An additional reason should be included to claim that some distance has better performance. Please discuss.

Reply: Thank you for the comment. In fact, our discussion was only based on the bad performance registered for the CID. Following your suggestion we added some considerations regarding the usefulness of shape-based metrics at the beginning of Section 4. Some considerations regarding the limits of this approach are given in the concluding section.

Minor comments:

1. Line 148, please provide the source about the time duration of each wave.

Reply: The time duration of each wave was determined by inspecting the time series plots and identifying the change points. Moreover, the identified windows are in line with other scientific contributions concerning COVID-19 in Italy, see e.g. Boriani et al. (2023), Fig. 1.

2. Details on how to determine the hyper-parameters are missing, examples are:

a. The number of clusters on your trimmed k-mean

b. For DTW, the choice of m, ai, bi

c. α in the trimmed k mean.

It would also be helpful to share the code for the data analysis in addition to the potential package since the data is open source.

Reply: Firstly, following your suggestions we attached as supplementary material an R script including these details. Regarding the other points:

a. We clarified in the paper that K is selected from the within-cluster variance via the elbow criterion.

b. The DTW computes the optimal alignment between two time series, where optimality means minimizing the sum of distances between aligned values. Therefore, the procedure does not require the a priori specification of the parameter ξ or the ai and bi values.

c. We carried out a comparison using α = 0.05, 0.1 and 0.15, which is not included in the paper for the sake of brevity. The illustrated results are obtained using α = 0.1, i.e., two outlying regions.
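
The point made in reply (b) can be illustrated with a minimal dynamic-programming sketch of DTW. It assumes the absolute difference as local cost and no window constraint, which may differ from the settings in the authors' R script. The length of the warping path and the aligned index pairs are never supplied by the user; they result from the minimization.

```python
def dtw_distance(x, y):
    """Minimum total cost of aligning series x with series y (classic DTW)."""
    n, m = len(x), len(y)
    inf = float("inf")
    # cost[i][j] = best cumulative cost of aligning x[:i] with y[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(x[i - 1] - y[j - 1])
            # Extend the cheapest of the three admissible predecessor cells.
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Toy series (not from the paper): the second repeats one value of the first,
# so a perfect alignment exists despite the different lengths.
same_shape = dtw_distance([1, 2, 3], [1, 2, 2, 3])
shifted_level = dtw_distance([0, 0, 0], [1, 1, 1])
print(same_shape, shifted_level)
```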

3. Line 113, the dimension of MDS is m, which is the same as in line 88, the length of DTW. One of them should be replaced with another letter to avoid confusion.

Reply: Thank you for noticing this inconsistency. Following your suggestion we replaced the m in DTW with the letter ξ.

Author Response File: Author Response.pdf

Round 2

Reviewer 2 Report

Please update the format. I saw colored sentences and struck-through text. I did not expect readers to see these as well.

Author Response

Thank you for the insightful comments and for your valuable work.

Reviewer 3 Report

Thanks for the comments, I do not have additional comments.

Author Response

Thank you for the insightful comments and for accepting our manuscript for publication.
