#### *2.3. Radiomics Features Extraction*

Image features were extracted from the PET images using LIFEx 2.20 software (LIFEx, French Alternative Energies and Atomic Energy Commission (CEA), Gif-sur-Yvette, France; http://www.lifexsoft.org, accessed on 10 September 2021) [26], following the same procedure previously described for the extraction of SUV-related parameters, with similar VOIs and after a new segmentation process.

The extraction of radiomics features (RF) was performed without spatial resampling, with an intensity discretization of 64 grey levels and with a neighbor distance of 1 voxel for the extraction of GLCM parameters.
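The two extraction settings named above can be illustrated with a minimal sketch. It is not LIFEx code: the relative (min-max) binning, the restriction to a single horizontal direction on a 2D slice, and the function names `discretize` and `glcm_2d` are assumptions made for the example, and the software's own discretization bounds and 3D neighborhood handling may differ.

```python
from collections import Counter


def discretize(values, n_levels=64):
    """Map intensities to integer grey levels 0..n_levels-1.

    Assumption: relative binning between the minimum and maximum of the
    values passed in (other schemes use fixed absolute bounds).
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / n_levels
    # The maximum value would fall in bin n_levels, so clamp it to the last bin.
    return [min(int((v - lo) / width), n_levels - 1) for v in values]


def glcm_2d(image, distance=1, n_levels=64):
    """Symmetric co-occurrence counts for horizontal neighbors.

    `image` is a list of equally long rows of intensities; pairs are formed
    between each pixel and the pixel `distance` columns to its right.
    """
    n_cols = len(image[0])
    flat = [v for row in image for v in row]
    levels = discretize(flat, n_levels)
    grid = [levels[r * n_cols:(r + 1) * n_cols] for r in range(len(image))]
    counts = Counter()
    for row in grid:
        for c in range(n_cols - distance):
            a, b = row[c], row[c + distance]
            counts[(a, b)] += 1
            counts[(b, a)] += 1  # count both orders to keep the matrix symmetric
    return counts
```

Dividing each count by the total number of counted pairs would give the normalized matrix from which GLCM features such as homogeneity or entropy are usually derived.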

A total of 42 RF were generated (Table 1), divided into first-order statistics (histogram-related and shape-related) and second-order statistics: grey-level co-occurrence matrix (GLCM) related, grey-level run length matrix (GLRLM) related, neighborhood grey-level difference matrix (NGLDM) related and grey-level zone length matrix (GLZLM) related.


**Table 1.** List of semiquantitative parameters and of radiomics features considered in the study.



SUVmax: standardized uptake value body weight max; SUVmean: standardized uptake value body weight mean; SUVlbm: standardized uptake value lean body mass; SUVbsa: standardized uptake value body surface area; MTV: metabolic tumor volume; TLG: total lesion glycolysis; SRE: short-run emphasis; LRE: long-run emphasis; LGRE: Low Gray-level Run Emphasis; HGRE: High Gray-level Run Emphasis; SRLGE: Short-Run Low Gray-level Emphasis; SRHGE: Short-Run High Gray-level Emphasis; LRLGE: Long-Run Low Gray-level Emphasis; LRHGE: Long-Run High Gray-level Emphasis; GLNU: Gray-Level Non-Uniformity; RLNU: Run Length Non-Uniformity; RP: Run Percentage; SZE: Short-zone emphasis; LZE: Long-zone emphasis; LGZE: Low Gray-level Zone Emphasis; HGZE: High Gray-level Zone Emphasis; SZLGE: Short-Zone Low Gray-level Emphasis; SZHGE: Short-Zone High Gray-level Emphasis; LZLGE: Long-Zone Low Gray-level Emphasis; LZHGE: Long-Zone High Gray-level Emphasis; ZLNU: Zone Length Non-Uniformity.

Extraction of RF by LIFEx is only possible for VOIs of at least 64 voxels; therefore, 16 patients were excluded from the study because the volume of the TI uptake was below this limit. As a consequence, the final number of patients included in the study was 221.

#### *2.4. Statistical Analysis*

Statistical analysis was performed using MedCalc Software version 18.1 (8400, Ostend, Belgium) and R (http://www.R-project.org/) software version 4.1.1 (Statistics Department of the University of Auckland, Auckland, New Zealand). In the descriptive analysis, categorical variables were reported as absolute and relative frequencies, while numeric variables were summarized as mean, standard deviation, and range. The kernel density estimates of the RF values obtained on the two scanners were compared qualitatively, and the presence of significant differences was evaluated with the Mann–Whitney test.
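For readers unfamiliar with the Mann–Whitney test, the following sketch shows how the U statistic and a two-sided *p*-value can be obtained for one feature measured on two groups. It is an illustration only, not the MedCalc or R routine used in the study: the function names `midranks` and `mann_whitney_u` are hypothetical, and the normal approximation with tie correction (without continuity correction) is an assumption that is reasonable for groups of the size considered here but differs from exact small-sample calculations.

```python
import math
from collections import Counter


def midranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U statistic of x, two-sided p-value by normal approximation)."""
    n1, n2 = len(x), len(y)
    combined = list(x) + list(y)
    ranks = midranks(combined)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    n = n1 + n2
    # Tie correction: each group of t tied values reduces the variance.
    tie_term = sum(t ** 3 - t for t in Counter(combined).values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return u1, 1.0  # all values identical: no evidence of a difference
    z = (u1 - n1 * n2 / 2) / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return u1, p_value
```

In the setting described above, `x` and `y` would hold the values of a single RF for the patients scanned on scanner 1 and scanner 2, respectively, and the test would be repeated for each feature.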

The statistical analysis was structured in several steps. First, a univariate analysis (logistic regression with 10-fold cross-validation) was performed three times: once for the group of patients evaluated on scanner 1, once for the group of patients evaluated on scanner 2 and once for the entire group of patients (scanner 1 and scanner 2 considered together). The purpose of this first analysis was to evaluate the influence of the two scanners on the ability of RF to correlate with the final clinical outcome.
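The univariate step can be sketched as follows, under several assumptions that are not taken from the study: the logistic model is fitted by plain gradient descent on a standardized predictor (statistical packages typically use other solvers), the AUC is computed on the pooled out-of-fold predictions rather than averaged over folds, and all function names and default parameters are hypothetical.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(xs, ys, lr=0.5, epochs=300):
    """Fit intercept and slope by gradient descent; return a predictor."""
    mean = sum(xs) / len(xs)
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / len(xs)) or 1.0
    zs = [(x - mean) / sd for x in xs]
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for z, y in zip(zs, ys):
            err = sigmoid(b0 + b1 * z) - y
            g0 += err
            g1 += err * z
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return lambda x: sigmoid(b0 + b1 * (x - mean) / sd)


def auc(scores, labels):
    """Probability that a positive case scores higher than a negative one."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute the AUC")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def cross_validated_auc(xs, ys, k=10, seed=0):
    """AUC of pooled out-of-fold predictions from k-fold cross-validation."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = [0.0] * len(xs)
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = fit_logistic([xs[i] for i in train], [ys[i] for i in train])
        for i in fold:
            scores[i] = model(xs[i])  # predicted by a model that never saw case i
    return auc(scores, ys)
```

Calling `cross_validated_auc` once per feature on the scanner 1 patients, the scanner 2 patients and the pooled cohort would reproduce the three-way structure described above.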

Furthermore, a bivariate analysis was performed with the purpose of developing three predictive models (one for scanner 1, one for scanner 2 and one for both scanners considered together), by analyzing all possible pairs of variables (the Cartesian product of semiquantitative parameters, RF and the major clinical features, such as age, gender and ultrasound dimension of the TIs). A bivariate logistic regression model was fitted to each pair, and the pairs were ranked on the basis of the area under the curve (AUC) of the receiver operating characteristic (ROC) curve after 10-fold cross-validation. This bivariate approach was intended to explore the whole space of RF presented in the study. In addition, the accuracy was computed for each pair of variables and, to obtain more complete statistics, the *p*-values were also extracted.
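The exhaustive search over pairs of variables can be outlined as in the sketch below. For brevity it ranks pairs by in-sample AUC and omits the cross-validation, accuracy and *p*-value computations described above; the gradient-descent fit, the function names and the dictionary-based data layout are assumptions made for the illustration and do not reproduce the software used in the study.

```python
import math
from itertools import combinations


def _sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z)) if z >= 0 else math.exp(z) / (1.0 + math.exp(z))


def _standardize(column):
    mean = sum(column) / len(column)
    sd = math.sqrt(sum((v - mean) ** 2 for v in column) / len(column)) or 1.0
    return [(v - mean) / sd for v in column]


def fit_predict_pair(col_a, col_b, labels, lr=0.5, epochs=300):
    """Fit a two-predictor logistic model; return in-sample probabilities."""
    za, zb = _standardize(col_a), _standardize(col_b)
    w0 = wa = wb = 0.0
    n = len(labels)
    for _ in range(epochs):
        g0 = ga = gb = 0.0
        for a, b, y in zip(za, zb, labels):
            err = _sigmoid(w0 + wa * a + wb * b) - y
            g0 += err
            ga += err * a
            gb += err * b
        w0 -= lr * g0 / n
        wa -= lr * ga / n
        wb -= lr * gb / n
    return [_sigmoid(w0 + wa * a + wb * b) for a, b in zip(za, zb)]


def auc(scores, labels):
    """Probability that a positive case scores higher than a negative one."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def rank_pairs(table, labels):
    """Score every unordered pair of variables and sort by decreasing AUC.

    `table` maps a variable name to its list of values, one per patient.
    """
    results = []
    for name_a, name_b in combinations(sorted(table), 2):
        probs = fit_predict_pair(table[name_a], table[name_b], labels)
        results.append((auc(probs, labels), name_a, name_b))
    return sorted(results, reverse=True)
```

With *n* candidate variables the loop fits *n*(*n* − 1)/2 models, so the number of models grows quadratically with the number of features entered.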

Lastly, the best-performing bivariate logistic regression models were selected for scanner 1, scanner 2 and for both scanners considered together. In this setting, an AUC higher than 0.8 was arbitrarily considered optimal to predict the final diagnosis of TIs, while an AUC between 0.6 and 0.8 was considered acceptable. Similarly, a *p*-value < 0.05 was arbitrarily considered statistically significant.
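These selection thresholds amount to a simple decision rule, sketched below. The function name `grade_model`, the label used for AUC values below 0.6 and the treatment of the boundary values (0.6 counted as acceptable, exactly 0.8 not counted as optimal) are assumptions, since the text does not specify them.

```python
def grade_model(auc, p_value, alpha=0.05):
    """Return (AUC grade, whether the p-value is below the significance level)."""
    if auc > 0.8:
        grade = "optimal"
    elif auc >= 0.6:
        grade = "acceptable"
    else:
        grade = "below threshold"  # label chosen for this example only
    return grade, p_value < alpha
```

For example, a model with an AUC of 0.85 and a *p*-value of 0.01 would be graded as optimal and statistically significant.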

#### **3. Results**

#### *3.1. Patients Characteristics*

A total of 221 patients were included in the study (Table 2), with a mean age of 66 years (range 16–88). The majority of the patients were female (*n* = 149, 67%), while 72 (33%) were male. No significant difference in sex was observed between the malignant and benign TI groups (*p* value = 0.07).


**Table 2.** Characteristics of the 221 patients included in the study.



N.: number, SD: standard deviation, mm: millimeters, SUVmax: standardized uptake value body weight max, SUVmean: standardized uptake value body weight mean, SUVlbm: standardized uptake value lean body mass, SUVbsa: standardized uptake value body surface area, MTV: metabolic tumor volume, TLG: total lesion glycolysis.

TIs were most frequently found in the right thyroid lobe, with 123 (56%) subjects, while the incidental uptake was located in the left lobe in 87 (39%) subjects and at the isthmus in only 11 (5%) cases. The site of the TIs was likewise not significantly correlated with the final diagnosis (*p* value = 0.79).

The mean diameter of the TIs, measured on subsequent ultrasound examination, was 17 mm (range 5–75).

Overall, the final diagnosis of TIs was malignant in 71 (32%) patients and benign in 150 (68%) patients. To establish the final diagnosis, 97 (44%) subjects were evaluated with ultrasound examinations only, with a mean follow-up of 24 months (range 12–168). Five (2%) patients underwent 99mTc thyroid scintigraphy, which revealed the presence of a hyperfunctioning adenoma.

For 118 (53%) patients, a cytological examination was performed to establish the diagnosis of the incidental 18F-FDG uptake, and the results were classified according to the Italian Thyroid Cytology Classification System [27]. In particular, the result of the cytological examination was TIR5 in 16 (14%) cases, TIR4 in 13 (11%), TIR3 in 54 (45%) and TIR2 in 35 (30%). Furthermore, of the 54 patients with a TIR3 classification, 24 (44%) had a TIR3a result, while TIR3b was the final cytological result for 30 (56%) patients. A histological diagnosis of the TIs was obtained in 71 (32%) cases, and all of them revealed the presence of malignancy. In particular, anaplastic carcinoma was found in 3 (4%) cases, follicular carcinoma in 7 (10%) and papillary carcinoma in 61 (86%). The ability of semiquantitative PET/CT parameters and of RF to predict the final cytological or histological diagnosis was not evaluated because of the small number of subjects in each of the subgroups mentioned above.

A total of 128 (58%) scans were performed on the Discovery 690 tomograph (scanner 1), while 93 (42%) were acquired on the Discovery STE tomograph (scanner 2). The mean SUVmax of the TIs was 7.9; the mean values of the other parameters were 4.3 for SUVmean, 5.8 for SUVlbm, 2.0 for SUVbsa, 9.2 for MTV and 35.0 for TLG (Figure 1).

Considering the PET/CT acquisitions according to the tomograph used, the incidental uptake was benign in 92 (72%) scans performed on scanner 1, while the final diagnosis was malignancy in 36 (28%) cases (1 anaplastic carcinoma, 4 follicular carcinomas and 31 papillary carcinomas). On scanner 2, the final diagnosis of the incidental uptake was benign in 58 (62%) scans, while malignancy was found in 35 (38%) cases (2 anaplastic carcinomas, 3 follicular carcinomas and 30 papillary carcinomas). No significant difference in final diagnosis was found between the 2 scanners (*p* value = 0.1).

**Figure 1.** (**A**): Axial CT, axial PET and axial fused PET/CT images showing a TI visible as an intense focal uptake of 18F-FDG in the right lobe of the thyroid. The lesion had an SUVmax of 44.47, an MTV of 0.7 and a TLG of 18.1, and the subsequent cytological examination revealed no malignancy (TIR2). (**B**): Axial CT, axial PET and axial fused PET/CT images of another scan, again showing a TI, in this case as a faint uptake in the right lobe of the thyroid. The values of SUVmax, MTV and TLG of the lesion were 2.64, 6.9 and 10.3, respectively. Cytological evaluation (TIR5) and subsequent total thyroidectomy revealed the presence of papillary carcinoma.

#### *3.2. Comparison between the Two Scanners*

The major clinical and epidemiological characteristics of the patients (age, sex, ultrasound dimension and final diagnosis of the TIs) were not significantly different between the two scanners.

Regarding the semiquantitative PET/CT parameters, only SUVmax differed significantly between the 2 scanners (*p* value = 0.046), while the remaining parameters did not. In particular, SUVmax values were higher on scanner 1 than on scanner 2.

Focusing on RF, only 9 of 42 differed significantly between the 2 scanners. In particular, the RF that appeared to depend on the type of scanner used for the acquisition were Histo entropy\_log10, Histo entropy\_log2, Histo Energy, GLRLM LGRE, GLRLM SRLGE, NGLDM busyness, GLZLM SZE, GLZLM SZHGE and GLZLM ZLNU (Table 3). However, the cross-correlation maps of RF for the two scanners were quite similar (Figure 2).

**Table 3.** Comparison of clinical parameters, semiquantitative PET/CT parameters and radiomics features between the two scanners.






**Figure 2.** Correlation maps for first and second order RF between the two scanners. Scanner 1 (Discovery 690) is presented on the left, while scanner 2 (Discovery STE) is presented on the right. Blue means high positive correlation; red means high negative correlation; white means no correlation.
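Correlation maps of this kind can be computed, and their similarity between scanners quantified, along the lines of the following sketch. The use of the Pearson coefficient is an assumption (the type of correlation is not stated here), the mean absolute difference between the two matrices is a summary measure introduced only for this example, and all function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient; NaN if either variable is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(features):
    """Return (sorted feature names, matrix of pairwise correlations).

    `features` maps a feature name to its list of values, one per patient.
    """
    names = sorted(features)
    matrix = [[pearson(features[a], features[b]) for b in names] for a in names]
    return names, matrix


def mean_abs_difference(matrix_1, matrix_2):
    """Average absolute cell-wise difference between two equally sized maps."""
    cells = [
        abs(v1 - v2)
        for row_1, row_2 in zip(matrix_1, matrix_2)
        for v1, v2 in zip(row_1, row_2)
    ]
    return sum(cells) / len(cells)
```

Building one matrix from the scanner 1 patients and one from the scanner 2 patients, with the same feature order, and passing both to `mean_abs_difference` would give a single number close to 0 when the two maps are similar.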
