#### *2.3. Measures*

#### 2.3.1. Endpoint Definition

The primary endpoint was KRT initiation within 6 and 24 months. The outcome definition excludes episodes of dialysis treatment for acute and transient kidney derangement.

We defined patients as "lost to follow-up" when the clinical records contained no additional serum creatinine (s-Cr) assessments after the end-of-follow-up date and no notes on the onset of dialysis dependence.

#### 2.3.2. Input Variables

A list of all the variables included in the final model is provided in Table 1. The final model for the 6-month forecast incorporates 28 independent variables, while the model for the 24-month forecast includes 34 variables.


**Table 1.** Variables included in PROGRES-CKD models.



\* Slope of the linear regression of eGFR values over the last 12 months. \*\* Urine Protein-Creatinine Ratio (PCR) was converted to ACR using a conversion coefficient of 0.6, i.e., ACR = 0.6 × PCR (see the Supplementary Material for the conversion table).

We assessed demographic, anthropometric, and lifestyle variables at the index visit. Blood biomarkers were collected and averaged over the 12 months before the index date (i.e., during the ascertainment period), and their slope (i.e., rate of change) was calculated over the same period. Lifetime occurrence of comorbidities was evaluated by abstracting ICD-10 codes [28] from outpatient medical records (Supplementary Material). Finally, etiologies of kidney disease were also recorded.
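The averaging and slope computation described above can be sketched as follows. This is a minimal illustration, not the study code: the function names, the `(days_before_index, value)` input layout, and the choice to express the slope per year are assumptions made for the example.

```python
from statistics import mean


def ols_slope(days, values):
    """Least-squares slope of values against time, expressed per year."""
    if len(set(days)) < 2:
        return None  # slope is undefined with fewer than two distinct time points
    years = [d / 365.25 for d in days]
    t_bar, v_bar = mean(years), mean(values)
    sxy = sum((t - t_bar) * (v - v_bar) for t, v in zip(years, values))
    sxx = sum((t - t_bar) ** 2 for t in years)
    return sxy / sxx


def summarize_biomarker(measurements, window_days=365):
    """measurements: list of (days_before_index, value) pairs (hypothetical layout).

    Returns (mean, slope) over the ascertainment window before the index date.
    """
    # Keep only values inside the window; time is negative before the index date.
    in_window = [(-d, v) for d, v in measurements if 0 <= d <= window_days]
    if not in_window:
        return None, None
    days = [d for d, _ in in_window]
    values = [v for _, v in in_window]
    return mean(values), ols_slope(days, values)


# Three eGFR values falling by 10 units over 360 days; the last one is outside the window.
egfr_mean, egfr_slope = summarize_biomarker([(360, 50), (180, 45), (0, 40), (500, 60)])
```

A negative slope indicates declining kidney function over the ascertainment period.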

#### 2.3.3. Definition of CKD Stages

GFR was estimated in adults using the 2009 CKD-EPI creatinine equation [29]. Patients were classified into one of the following GFR categories (all values in mL/min/1.73 m<sup>2</sup>): (1) G1, normal or high, GFR ≥90; (2) G2, mildly decreased, GFR 60–89; (3) G3a, mildly to moderately decreased, GFR 45–59; (4) G3b, moderately to severely decreased, GFR 30–44; (5) G4, severely decreased, GFR 15–29; (6) G5, kidney failure, GFR <15.
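For illustration, the staging step can be sketched in a few lines. The coefficients are those of the published 2009 CKD-EPI creatinine equation (serum creatinine in mg/dL); the function names are hypothetical, and whether the race coefficient was applied in this study is not stated in the text, so the `black` flag is included only as an optional assumption.

```python
def ckd_epi_2009(scr_mg_dl, age_years, female, black=False):
    """eGFR (mL/min/1.73 m^2) from the 2009 CKD-EPI creatinine equation."""
    kappa = 0.7 if female else 0.9
    alpha = -0.329 if female else -0.411
    ratio = scr_mg_dl / kappa
    egfr = 141 * min(ratio, 1) ** alpha * max(ratio, 1) ** -1.209 * 0.993 ** age_years
    if female:
        egfr *= 1.018
    if black:  # assumption: optional, not confirmed by the source
        egfr *= 1.159
    return egfr


def gfr_category(egfr):
    """Map an eGFR value to the GFR categories G1 to G5 listed above."""
    if egfr >= 90:
        return "G1"
    if egfr >= 60:
        return "G2"
    if egfr >= 45:
        return "G3a"
    if egfr >= 30:
        return "G3b"
    if egfr >= 15:
        return "G4"
    return "G5"


# Hypothetical patient: 65-year-old woman with serum creatinine 1.4 mg/dL.
stage = gfr_category(ckd_epi_2009(1.4, 65, female=True))
```

The thresholds are applied to the continuous eGFR value, so a result such as 59.6 falls into G3a rather than being rounded up to G2.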

#### *2.4. Design and Setting of PROGRES-CKD Validation Studies*

For the validation study, we randomly selected one visit from each patient's history (index date) before the occurrence of the study endpoint. All information collected before the index date was used as input for the model. Patients who died before reaching the endpoint or before the end of follow-up (i.e., 6 or 24 months, depending on the endpoint of interest) were excluded.

Based on this general design setting, we validated PROGRES-CKD models in two independent cohorts.

#### 2.4.1. Study A

The first validation study was performed in the testing cohort derived from 30% partitioning of the clinical data abstracted from the FMC NephroCare cohort.

#### 2.4.2. Study B

A second analysis evaluated PROGRES-CKD performance using data from the German CKD (GCKD) study [21]. Briefly, the GCKD study is an ongoing prospective observational national study that recruited 5217 patients with CKD of various etiologies. The enrolment period started in July 2011 and ended in 2012. Patient recruitment and follow-up are organized through a network of academic nephrology centers collaborating with practicing nephrologists throughout Germany. The main study endpoints were mortality, decline in kidney function, and cardiovascular events. At the time of recruitment, patients were under nephrological care and showed either an eGFR of 30–60 mL/min/1.73 m<sup>2</sup> or overt proteinuria in the presence of an eGFR > 60 mL/min/1.73 m<sup>2</sup>. In our validation analysis, we considered only patients who had a serum creatinine measurement at baseline and were followed for at least 2 years.

#### 2.4.3. Study C

We conducted an impact study assessing the concordance of nephrologists' and PROGRES-CKD-24 ratings of risk. Four experts were asked to forecast KRT initiation risk for 78 CKD patients based on their demographic, anthropometric, and clinical data. These patients were randomly selected from the FMC NephroCare cohort and had a complete clinical history up to 24 months after the index date. Information related to all input variables used by the model was extracted from existing clinical records. The information for each patient had been collected in real-world clinical practice by physicians during outpatient visits.

Doctors were asked to rate KRT risk on a 10-point rating scale anchored at 1 (risk is negligible, almost no patient with these characteristics would require KRT within 2 years), 5 (about 50% of patients with these characteristics would require KRT within 2 years) and 10 (almost 100% of patients with these characteristics would require KRT within 2 years). Risk ratings provided by the physicians were then compared to scores obtained from PROGRES-CKD-24 for the same patients. The comparative analysis included accuracy, sensitivity, and specificity based on the score cut-off that maximized Youden's Index.

Thereafter, we investigated the potential impact of using risk scores provided by either experts or PROGRES-CKD-24 on referral patterns to intensified healthcare prevention programs aimed at delaying CKD progression. We simulated the use of risk estimates in a large, hypothetical population of stage 3–5 CKD patients (*n* = 10,000), assuming an ESKD incidence within 24 months of 4.6% (i.e., *n* = 460 expected ESKD cases) and an intervention effect size of 1.5 (i.e., patients in the standard of care arm would face a 50% higher risk of ESKD compared to those allocated to the intensified healthcare program). The intervention effect size was estimated based on expert opinion and several intensified intervention programs reported in diabetic and non-diabetic CKD [30–32].
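The selection of a cut-off that maximizes Youden's Index (sensitivity + specificity − 1) can be sketched as below. The function names and the toy ratings are invented for the example; the rule "score ≥ cut-off counts as positive" is also an assumption, since the text does not state the direction of the comparison.

```python
def confusion_at(scores, outcomes, cutoff):
    """Counts (tp, fp, tn, fn) when score >= cutoff is treated as a positive call."""
    tp = sum(1 for s, y in zip(scores, outcomes) if s >= cutoff and y == 1)
    fp = sum(1 for s, y in zip(scores, outcomes) if s >= cutoff and y == 0)
    tn = sum(1 for s, y in zip(scores, outcomes) if s < cutoff and y == 0)
    fn = sum(1 for s, y in zip(scores, outcomes) if s < cutoff and y == 1)
    return tp, fp, tn, fn


def youden_cutoff(scores, outcomes):
    """Return (cutoff, J, sensitivity, specificity) maximizing J = sens + spec - 1.

    Requires at least one event and one non-event in `outcomes`.
    """
    best = None
    for cutoff in sorted(set(scores)):
        tp, fp, tn, fn = confusion_at(scores, outcomes, cutoff)
        sens = tp / (tp + fn)
        spec = tn / (tn + fp)
        j = sens + spec - 1
        if best is None or j > best[1]:  # ties keep the lowest cut-off
            best = (cutoff, j, sens, spec)
    return best


# Toy data: 10-point ratings and observed KRT initiation (1 = event).
ratings = [1, 2, 2, 3, 5, 6, 7, 8, 9, 10]
events = [0, 0, 0, 0, 0, 1, 0, 1, 1, 1]
cutoff, j, sens, spec = youden_cutoff(ratings, events)
```

The same routine applies to expert ratings and to model scores, which keeps the two sets of accuracy figures comparable.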

#### *2.5. Statistical Analysis*

We computed the cumulative incidence and the incidence density of KRT initiation events in the study population and their 95% confidence intervals based on the Poisson distribution.
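As an illustration of a Poisson-based interval for an incidence density, the sketch below uses Byar's approximation to the exact Poisson limits. The text does not say which Poisson method was used, so this choice, the function names, and the example counts are all assumptions.

```python
from math import sqrt
from statistics import NormalDist


def poisson_ci_byar(events, level=0.95):
    """Approximate confidence limits for a Poisson count (Byar's method)."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    if events == 0:
        lower = 0.0
    else:
        lower = events * (1 - 1 / (9 * events) - z / (3 * sqrt(events))) ** 3
    k = events + 1
    upper = k * (1 - 1 / (9 * k) + z / (3 * sqrt(k))) ** 3
    return lower, upper


def incidence_density(events, person_years, per=100):
    """Rate and confidence limits per `per` person-years."""
    lower, upper = poisson_ci_byar(events)
    scale = per / person_years
    return events * scale, lower * scale, upper * scale


# Hypothetical example: 10 KRT events over 200 person-years of follow-up.
rate, rate_low, rate_high = incidence_density(10, 200)
```

Byar's approximation is close to the exact limits for moderate counts but is less accurate for very small counts, where an exact method would be preferable.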

Since PROGRES-CKD models are NBCs, no data manipulation was required to explicitly handle missing variables.
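The reason a naïve Bayes classifier tolerates missing inputs is that each variable contributes an independent factor to the posterior, so an unobserved variable can simply be left out of the product. The sketch below shows this for discrete variables; it is a generic illustration with invented probabilities and names, not the PROGRES-CKD model.

```python
from math import exp, log


def nb_posterior(prior, likelihoods, observation):
    """Posterior class probabilities for a discrete naive Bayes classifier.

    prior:        {class: P(class)}
    likelihoods:  {feature: {class: {value: P(value | class)}}}
    observation:  {feature: value, or None when the value is missing}
    """
    log_post = {c: log(p) for c, p in prior.items()}
    for feature, value in observation.items():
        if value is None:
            continue  # missing feature: its factor marginalizes to 1 and drops out
        for c in log_post:
            log_post[c] += log(likelihoods[feature][c][value])
    peak = max(log_post.values())  # subtract the maximum for numerical stability
    unnorm = {c: exp(lp - peak) for c, lp in log_post.items()}
    total = sum(unnorm.values())
    return {c: u / total for c, u in unnorm.items()}


# Invented numbers for two binary inputs.
prior = {"krt": 0.1, "no_krt": 0.9}
likelihoods = {
    "low_egfr": {"krt": {"yes": 0.8, "no": 0.2}, "no_krt": {"yes": 0.2, "no": 0.8}},
    "diabetes": {"krt": {"yes": 0.6, "no": 0.4}, "no_krt": {"yes": 0.3, "no": 0.7}},
}
partial = nb_posterior(prior, likelihoods, {"low_egfr": "yes", "diabetes": None})
```

With every input missing, the function returns the prior unchanged, which is the expected behavior when no evidence is available.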

Model performance was evaluated by concordance statistic and calibration charts in the FMC NephroCare and the GCKD cohorts. Discrimination was quantified by calculating the area under the receiver operating characteristic curve (ROC AUC) [33]. An AUC >0.70 was considered acceptable. Calibration was visually inspected by plotting observed outcome incidence by quintiles of the risk score [34].
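Both performance checks can be sketched compactly. The AUC is computed here as the proportion of event/non-event pairs in which the event has the higher score (ties count one half), and calibration is tabulated by risk-score quintile. Function names and the toy data are assumptions for the example.

```python
def roc_auc(scores, outcomes):
    """ROC AUC as the probability that an event outranks a non-event."""
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def calibration_by_quantile(scores, outcomes, groups=5):
    """Rows of (group, mean predicted score, observed incidence), lowest risk first."""
    ranked = sorted(zip(scores, outcomes))
    n = len(ranked)
    table = []
    for g in range(groups):
        chunk = ranked[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue  # fewer patients than groups
        mean_score = sum(s for s, _ in chunk) / len(chunk)
        observed = sum(y for _, y in chunk) / len(chunk)
        table.append((g + 1, mean_score, observed))
    return table


# Toy data: predicted risks and observed outcomes (1 = KRT initiation).
risk = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95]
krt = [0, 0, 0, 0, 1, 0, 0, 1, 1, 1]
auc = roc_auc(risk, krt)
quintiles = calibration_by_quantile(risk, krt)
```

Plotting the third column of `quintiles` against the second gives the kind of calibration chart described above; a well-calibrated model shows observed incidence rising in step with the predicted score.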

A further analysis investigated non-inferiority (defined as ΔAUC < 0.05) of both PROGRES-CKD-6 and PROGRES-CKD-24 relative to the KFREs [15] calibrated for the European population [16]. Briefly, Tangri's models were developed using Cox proportional hazards regression in stage 3–5 CKD patients. In the present study, the following Tangri equations were used: (1) 4 Variables (4VAR), which includes Age, Gender, eGFR, and Albumin-Creatinine Ratio (ACR); (2) 6 Variables (6VAR), which includes Age, Gender, eGFR, ACR, Diabetes, and Hypertension. We could not apply the 8 Variables (8VAR) equation given the lack of serum bicarbonate assessments in both study cohorts. Non-inferiority was assessed by checking whether the one-sided confidence interval of the AUC difference remained entirely within the non-inferiority margin (0.05). If non-inferiority was achieved, we evaluated superiority of PROGRES-CKD over the benchmark models; superiority was set at ΔAUC ≥ 0.05. Because the tests were performed sequentially in a fixed order, type I error is not inflated by multiple testing. Superiority was tested with the DeLong non-parametric approach [35]. Statistical significance was claimed at an α level of 0.05.
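The sequential comparison can be sketched with a small pure-Python version of the DeLong variance estimate for two correlated AUCs. Several details are assumptions made for this illustration rather than the authors' procedure: ΔAUC is taken as AUC(new) − AUC(reference), non-inferiority is declared when the one-sided lower confidence bound exceeds −margin, and superiority requires both ΔAUC ≥ margin and a two-sided DeLong p-value below α. All function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def _psi(x, y):
    return 1.0 if x > y else 0.5 if x == y else 0.0


def _components(scores, outcomes):
    """AUC plus DeLong structural components for events (v10) and non-events (v01)."""
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    v10 = [sum(_psi(p, q) for q in neg) / len(neg) for p in pos]
    v01 = [sum(_psi(p, q) for p in pos) / len(pos) for q in neg]
    return sum(v10) / len(v10), v10, v01


def _cov(a, a_bar, b, b_bar):
    return sum((x - a_bar) * (y - b_bar) for x, y in zip(a, b)) / (len(a) - 1)


def delong_difference(scores_new, scores_ref, outcomes):
    """Return (AUC_new - AUC_ref, standard error); needs >= 2 events and >= 2 non-events."""
    auc_n, v10_n, v01_n = _components(scores_new, outcomes)
    auc_r, v10_r, v01_r = _components(scores_ref, outcomes)

    def var_part(vn, vr):
        return (_cov(vn, auc_n, vn, auc_n) + _cov(vr, auc_r, vr, auc_r)
                - 2 * _cov(vn, auc_n, vr, auc_r)) / len(vn)

    variance = var_part(v10_n, v10_r) + var_part(v01_n, v01_r)
    return auc_n - auc_r, sqrt(max(variance, 0.0))


def fixed_sequence_test(scores_new, scores_ref, outcomes, margin=0.05, alpha=0.05):
    """Step 1: non-inferiority. Step 2 (only if step 1 passes): superiority."""
    diff, se = delong_difference(scores_new, scores_ref, outcomes)
    lower = diff - NormalDist().inv_cdf(1 - alpha) * se
    non_inferior = lower > -margin
    if se > 0:
        p_value = 2 * (1 - NormalDist().cdf(abs(diff) / se))
    else:
        p_value = 0.0 if diff != 0 else 1.0
    superior = non_inferior and diff >= margin and p_value < alpha
    return {"delta_auc": diff, "se": se, "lower_bound": lower,
            "non_inferior": non_inferior, "superior": superior}


# Toy data: two models scored on the same eight patients.
y = [1, 1, 1, 1, 0, 0, 0, 0]
new = [0.9, 0.8, 0.7, 0.35, 0.4, 0.3, 0.2, 0.1]
ref = [0.9, 0.3, 0.6, 0.2, 0.8, 0.4, 0.5, 0.1]
result = fixed_sequence_test(new, ref, y)
```

Because the superiority test is only evaluated once non-inferiority has been shown, the two steps share a single α, which is the property the fixed-order argument above relies on.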

For study C, the following accuracy parameters were considered: Sensitivity, Specificity, Positive Predictive Value (PPV), and False Omission Rate (FOR). We also calculated the number needed to treat (NNT) in order to avoid 1 KRT event as the reciprocal of the absolute risk difference between the hypothetical prevention program and standard of care for all patients:

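$$\mathrm{NNT} = \frac{N_{\mathrm{tr}}}{\left(N_{\mathrm{tr}} \cdot \mathrm{PPV}\right) - \dfrac{N_{\mathrm{tr}} \cdot \mathrm{PPV}}{\text{effect size}}}$$

where $N_{\mathrm{tr}}$ is the number of patients allocated to the intensified treatment program.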
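A short worked example of the NNT expression above: the number of treated patients appears in both numerator and denominator, so it cancels and the NNT depends only on the PPV and the effect size. The function name and the PPV used below are hypothetical.

```python
def number_needed_to_treat(n_treated, ppv, effect_size):
    """NNT following the expression above.

    n_treated * ppv is the number of events expected among referred patients
    under standard of care; dividing by effect_size gives the number expected
    under the intensified program.
    """
    events_standard = n_treated * ppv
    events_intensified = events_standard / effect_size
    return n_treated / (events_standard - events_intensified)


# Hypothetical PPV of 0.30 with the effect size of 1.5 assumed in the simulation.
nnt = number_needed_to_treat(n_treated=1000, ppv=0.30, effect_size=1.5)
```

With an effect size of 1.5 the expression reduces to NNT = 3 / PPV, so a PPV of 0.30 corresponds to about 10 patients treated per KRT event avoided, and a higher PPV lowers the NNT proportionally.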

Model training was performed using Hugin Explorer. All analyses for the validation study were performed with SAS 9.4®.
