#### *2.2. Outcome Variable*

The model forecasts the risk of a COVID-19 outbreak in each dialysis clinic over a 2-week horizon. A clinic outbreak is defined as the occurrence of two or more PCR-confirmed COVID-19 cases in a given clinic within that horizon. For each clinic registered in the Nephrocare network, the model therefore estimates the probability of an outbreak as a function of a vector of input variables. The study design is shown in Figure 1.
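
As an illustration of the outcome definition, the binary label for one clinic and one index date could be derived as in the minimal sketch below. The function name `outbreak_label`, the example dates, and the window boundaries (exclusive of the index date, inclusive of the last day of the horizon) are assumptions made for this example and are not specified in the text.

```python
from datetime import date, timedelta

def outbreak_label(case_dates, index_date, horizon_days=14, min_cases=2):
    """Return 1 if at least `min_cases` confirmed cases fall within the
    `horizon_days` following `index_date`, otherwise 0.

    Assumption: the window is (index_date, index_date + horizon_days].
    """
    end = index_date + timedelta(days=horizon_days)
    n_cases = sum(1 for d in case_dates if index_date < d <= end)
    return int(n_cases >= min_cases)

# Hypothetical PCR-confirmed case dates for a single clinic.
cases = [date(2020, 10, 3), date(2020, 10, 9), date(2020, 11, 2)]

label_early = outbreak_label(cases, index_date=date(2020, 10, 1))   # two cases in window
label_late = outbreak_label(cases, index_date=date(2020, 10, 20))   # one case in window
```

With these hypothetical dates, the first index date yields an outbreak label and the second does not, because only one case falls within its 2-week horizon.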

**Figure 1.** Study design: Reference timeframe for data collection/calculation is shown.

For illustrative purposes, we established 3 risk categories: (1) low (L), when outbreak risk is less than or equal to 1.5%; (2) medium (M), when risk is greater than 1.5% and less than or equal to 12.5%; (3) high (H), when risk is greater than 12.5%. The action threshold defining the low-risk class was chosen to select a subpopulation of clinics where the risk of outbreak is very small, so that non-pharmacological interventions to prevent the spread of COVID-19 can be temporarily and partially relaxed. In this context, a costly error would be to assign to the low-risk class a clinic that will experience an outbreak in the following two weeks. Such a threshold is useful when it captures a sufficiently large share of clinics (i.e., P(Class = L) is large), so that P(Class = L|Outbreak = No) is high while P(Outbreak = Yes|Class = L) is very small.

At the other end of the spectrum, we selected a more specific action threshold, which defines a high-risk class of clinics. In this risk group, additional non-pharmacological interventions should be initiated, including, for example, formal temperature testing and a thorough physical examination of each patient before entering the clinic, or even periodic screening tests (e.g., once weekly). Since these interventions would require intensive resources, may be constrained by procurement difficulties, and would burden patients with unnecessary testing, the high-risk threshold should ideally define a group where P(Outbreak = Yes|Class = H) is high and both P(Class = H|Outbreak = No) and P(Outbreak = No|Class = H) are low.

It is important to note that the choice and number of action thresholds depend on the intended use of the risk score, the set of interventions available to the organization, the cost of each intervention, and ultimately on the value function ranking the desirability of different health outcomes.
Therefore, the thresholds presented in this paper should not be considered generalizable per se: different institutions may choose different thresholds (or no thresholds at all) depending on the availability, cost, and expected outcomes of COVID-19-related interventions (e.g., email alerts to medical directors, shipments of medical equipment such as face masks or diagnostic kits, delivery of health education modules, or PCR screening). The problem is thus not diagnostic in nature but reduces to optimal ranking (and the longitudinal stability of that risk ranking), so that limited resources can be allocated efficiently and patient risk minimized throughout a continuously changing epidemic landscape.
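
To make the role of the two action thresholds concrete, the following sketch assigns predicted risks to the L/M/H classes using the 1.5% and 12.5% cut-offs and then estimates the conditional quantities discussed above from a set of predictions and observed outcomes. The helper names (`risk_class`, `conditional_rates`) and the example values are hypothetical and serve only to illustrate the calculation; they are not study data.

```python
LOW_MAX = 0.015    # risk <= 1.5%  -> low (L)
HIGH_MIN = 0.125   # risk > 12.5%  -> high (H)

def risk_class(p):
    """Map a predicted outbreak probability to a risk class."""
    if p <= LOW_MAX:
        return "L"
    if p <= HIGH_MIN:
        return "M"
    return "H"

def conditional_rates(risks, outbreaks):
    """Estimate P(Class), P(Outbreak = Yes|Class) and P(Class|Outbreak = No).

    risks: predicted probabilities; outbreaks: observed 0/1 outcomes.
    """
    classes = [risk_class(p) for p in risks]
    no_outbreak = [c for c, y in zip(classes, outbreaks) if y == 0]
    rates = {}
    for c in "LMH":
        in_class = [y for k, y in zip(classes, outbreaks) if k == c]
        rates[c] = {
            "share": len(in_class) / len(classes),
            "p_outbreak_given_class":
                sum(in_class) / len(in_class) if in_class else None,
            "p_class_given_no_outbreak":
                no_outbreak.count(c) / len(no_outbreak) if no_outbreak else None,
        }
    return rates

# Hypothetical predictions and outcomes for eight clinics.
risks = [0.005, 0.010, 0.015, 0.04, 0.10, 0.125, 0.20, 0.30]
outbreaks = [0, 0, 0, 0, 1, 0, 1, 0]
rates = conditional_rates(risks, outbreaks)
```

In this toy example, the low-risk class contains 3 of 8 clinics, none of which experienced an outbreak, which is the pattern a useful low-risk threshold is meant to produce.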

#### *2.3. Input Variables*

The model is computed using aggregated data provided by all the dialysis centers (min: 545; max: 611) located in the 23 countries of the FMC European Nephrocare Network. The final model incorporates 74 variables, each belonging to one of the following categories (Appendix A):

	- a. Epidemic status in the target clinic (prefix: CL): 5 variables;
	- b. Distance-weighted information on the adjacent clinics (prefix: CLS): 5 variables. Adjacent clinics were defined as the 3 centers closest to the target clinic in terms of both latitude and longitude. Measures for the adjacent clinics, including cases and trends, were computed as averages weighted by the inverse of the distance to the target clinic;
	- c. Other parameters related to the target clinic (prefix: CL): 49 parameters.
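
The distance-weighted variables in category b can be sketched as follows. The text does not state which distance metric was applied to latitude and longitude, so the great-circle (haversine) distance used here is an assumption, as are the function names and the example coordinates.

```python
import math

EARTH_RADIUS_KM = 6371.0

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees.

    Assumption: the text only says "distance in terms of latitude and
    longitude"; haversine is one plausible choice, not the confirmed one.
    """
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))

def neighbor_weighted_mean(target, others, k=3):
    """Inverse-distance-weighted mean over the k clinics nearest to `target`.

    target: (lat, lon); others: list of ((lat, lon), value) pairs.
    Assumes no other clinic shares the target's exact coordinates
    (a zero distance would make the weight undefined).
    """
    nearest = sorted(
        ((haversine_km(target, pos), value) for pos, value in others),
        key=lambda pair: pair[0],
    )[:k]
    weights = [1.0 / dist for dist, _ in nearest]
    return sum(w * v for w, (_, v) in zip(weights, nearest)) / sum(weights)

# Hypothetical clinics along the equator; the values could be recent case counts.
target = (0.0, 0.0)
others = [((0.0, 1.0), 10.0), ((0.0, 2.0), 40.0),
          ((0.0, 4.0), 0.0), ((0.0, 30.0), 500.0)]
cls_value = neighbor_weighted_mean(target, others)
```

Here the three nearest clinics receive weights proportional to 1, 1/2, and 1/4, and the distant fourth clinic is ignored, so closer neighbors dominate the resulting value.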

As detailed in Appendix A, each variable can be calculated/collected over different timeframes of the ascertainment period, i.e., the last 7 days (d), previous 7 d, last 14 d, previous 14 d, and previous 28 d.
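
The windowed calculation can be illustrated with a minimal sketch. It assumes that "last N d" denotes the N most recent days of the ascertainment period and that "previous N d" denotes the N days immediately preceding that window; this reading, the helper `window_sum`, and the example counts are assumptions. Because the exact placement of the "previous 28 d" window is not specified here, it is left to the `offset` parameter rather than guessed.

```python
def window_sum(daily_counts, length, offset=0):
    """Sum `length` consecutive daily values ending `offset` days before
    the most recent entry. `daily_counts` is ordered oldest to newest."""
    end = len(daily_counts) - offset
    start = end - length
    if start < 0:
        raise ValueError("not enough history for the requested window")
    return sum(daily_counts[start:end])

# Hypothetical daily values for one variable over 28 days (1, 2, ..., 28).
daily = list(range(1, 29))

features = {
    "last_7d": window_sum(daily, 7),                  # days 22..28
    "previous_7d": window_sum(daily, 7, offset=7),    # days 15..21
    "last_14d": window_sum(daily, 14),                # days 15..28
    "previous_14d": window_sum(daily, 14, offset=14), # days 1..14
}
```

Comparing a "last" window with its "previous" counterpart (for example, `last_7d` against `previous_7d`) gives a simple indication of trend for the same variable.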
