#### *2.4. Image Evaluation and Data Quantification*

PET/MRI images, including the malignancy assessment of LN, were evaluated prospectively in consensus by one radiology specialist and one nuclear medicine specialist, each with at least 8 years of experience in PET and MRI. Anatomical positions of resected LN were identified on PET/MRI images based on their position on SLN SPECT/CT and the surgeon's intraoperative description of their location.

Multiparametric data were collected using dedicated software (syngo.via® 8.2; Siemens Healthineers) and matched retrospectively with histology. Volumes of interest (VOI) were placed manually around every histologically confirmed LN on early PET (standardized uptake value, SUVe) and delayed PET (SUVd). SUV was quantified as SUVmax, SUVpeak, and SUVmean (50% isocontour). Blood pool correction (bpc) of SUV measurements (bpcSUV) was performed by dividing the lesion's SUV by the SUVmean (without isocontour) of an ROI placed in a large venous vessel in the same PET bed position.

$$\text{bpcSUV} = \frac{\text{SUV VOI}}{\text{SUV blood pool}} \tag{1}$$

Dual-time-point [18F]FDG kinetics were calculated using a retention index (RI), as described by Nogami et al. [25], and extended with a blood pool correction.

$$\text{RI} = \frac{\text{bpcSUVd} - \text{bpcSUVe}}{\text{bpcSUVe}} \times 100\% \tag{2}$$

In addition, the absolute difference in bpcSUV between the early and delayed scans was defined as ΔSUV.

$$
\Delta \text{SUV} \, = \text{bpcSUVd} - \text{bpcSUVe} \tag{3}
$$
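The three quantities defined in Equations (1) to (3) can be illustrated with a minimal Python sketch. It is not the software used in the study; the function names and all numeric values are hypothetical and serve only to make the arithmetic concrete. RI is taken here as the relative change of bpcSUV from the early to the delayed scan, expressed in percent.

```python
def blood_pool_corrected_suv(suv_voi: float, suv_blood_pool: float) -> float:
    """Eq. (1): lesion SUV divided by the blood-pool SUVmean."""
    if suv_blood_pool <= 0:
        raise ValueError("blood-pool SUV must be positive")
    return suv_voi / suv_blood_pool


def retention_index(bpc_suv_early: float, bpc_suv_delayed: float) -> float:
    """Eq. (2): relative change from early to delayed scan, in percent."""
    if bpc_suv_early <= 0:
        raise ValueError("early bpcSUV must be positive")
    return (bpc_suv_delayed - bpc_suv_early) / bpc_suv_early * 100.0


def delta_suv(bpc_suv_early: float, bpc_suv_delayed: float) -> float:
    """Eq. (3): absolute difference between delayed and early bpcSUV."""
    return bpc_suv_delayed - bpc_suv_early


if __name__ == "__main__":
    # Hypothetical measurements for one lymph node (not study data).
    bpc_early = blood_pool_corrected_suv(suv_voi=3.0, suv_blood_pool=1.5)
    bpc_delayed = blood_pool_corrected_suv(suv_voi=4.0, suv_blood_pool=1.25)

    print(f"bpcSUVe = {bpc_early:.2f}")
    print(f"bpcSUVd = {bpc_delayed:.2f}")
    print(f"RI      = {retention_index(bpc_early, bpc_delayed):.1f} %")
    print(f"dSUV    = {delta_suv(bpc_early, bpc_delayed):.2f}")
```

Because both time points are normalized to the blood pool of the same bed position before the difference is taken, RI and ΔSUV in this sketch reflect lesion uptake relative to background rather than raw SUV.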

LN diameters were measured along the perpendicular short and long axes in the transaxial plane. Sphericity was defined as the ratio of short-axis to long-axis diameter. Diffusion was quantified manually using an ROI on the apparent diffusion coefficient (ADC) maps in LN ≥ 4 mm.

#### *2.5. Statistical Analysis*

Statistical analysis was performed using SPSS Statistics 25.0 (IBM Inc., Armonk, NY, USA), MedCalc v20.009 (MedCalc Software Ltd., Ostend, Belgium), and R 4.0.3 (R Foundation for Statistical Computing, Vienna, Austria). All acquired parameters were benchmarked against histology as the gold standard. Differences in prevalence were tested for significance using the chi-squared (χ<sup>2</sup>) test. Differences between group means were analyzed using the two-tailed *t*-test.

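Listwise deletion was applied in case of missing values. Optimal cut-off values in receiver operating characteristic (ROC) analyses were set at the Youden optimum, i.e., the threshold that maximizes sensitivity + specificity − 1.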
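As an illustration of how a Youden-optimal cut-off can be obtained, the following minimal Python sketch scans every observed value as a candidate threshold and keeps the one with the largest Youden index J. It is not the SPSS, MedCalc, or R routine used in the study; the function name `youden_cutoff`, the rule that a node is called positive when its value is at or above the threshold, and the example values are assumptions made for illustration.

```python
from typing import Sequence, Tuple


def youden_cutoff(scores: Sequence[float], labels: Sequence[int]) -> Tuple[float, float]:
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    labels: 1 = malignant, 0 = benign.
    A node is classified as positive when score >= cutoff (assumed rule).
    Ties in J are resolved in favor of the lowest cutoff.
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one malignant and one benign case")

    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(scores)):
        # True positives: malignant nodes at or above the threshold.
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
        # True negatives: benign nodes below the threshold.
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
        j = tp / n_pos + tn / n_neg - 1.0
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


if __name__ == "__main__":
    # Hypothetical parameter values and histology labels (not study data).
    values = [0.5, 0.9, 1.2, 1.4, 1.8, 2.2, 2.6, 3.1]
    histology = [0, 0, 0, 1, 1, 0, 1, 1]
    cutoff, j = youden_cutoff(values, histology)
    print(f"cut-off = {cutoff}, Youden J = {j:.2f}")
```

For parameters where lower values indicate malignancy (for example, restricted diffusion on ADC maps), the same sketch applies after negating the values or reversing the comparison.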

The newly defined malignancy score (MS) predicts the probability that a lymph node exhibits malignant histology, based on a mixed logistic regression model that includes the multiparametric imaging measures. Given the included predictors and their covariances in the sample, this model uses the optimally weighted combination for predicting the histological findings, and it incorporates random intercepts for patients to account for dependencies among the individual nodes within a patient. The probabilities are predicted for the current sample without these random effects, as they will not be known in future cases or samples to which the procedure may be applied. The criterion for statistical significance was set at α = 0.05.
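
For orientation, a random-intercept logistic model of this kind can be written in generic notation. The symbols below are assumptions introduced for illustration and are not taken from the fitted model: $y_{ij}$ is the histology of node $j$ in patient $i$ (1 = malignant), $x_{ijk}$ is the $k$-th imaging measure of that node, $\beta_k$ are the fixed-effect coefficients, and $u_i$ is the patient-specific random intercept, conventionally assumed to be normally distributed.

$$
\operatorname{logit} P(y_{ij} = 1 \mid u_i) = \beta_0 + \sum_{k} \beta_k \, x_{ijk} + u_i, \qquad u_i \sim \mathcal{N}(0, \sigma_u^2)
$$

Under these assumptions, predicting without the random effects corresponds to setting $u_i = 0$ and using only the estimated fixed effects:

$$
\text{MS}_{ij} = \frac{1}{1 + \exp\left(-\left(\hat{\beta}_0 + \sum_{k} \hat{\beta}_k \, x_{ijk}\right)\right)}
$$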
