**5. Conclusions**

In conclusion, classifiers adjusted with adaptive synthetic sampling and feature selection improved the diagnostic performance of CEM and DCE-MRI in differentiating benign from malignant lesions.

**Author Contributions:** Formal analysis, R.F., E.D.B. and A.P. (Adele Piccirillo); Investigation, V.G., M.R.R., T.P., M.L.B., M.M.R., P.V., C.R., C.S., F.A., G.S., M.D.B. and A.P. (Antonella Petrillo); Methodology, R.F., E.D.B., A.P. (Adele Piccirillo), V.G., M.R.R., T.P., M.L.B., R.D.G. and A.P. (Antonella Petrillo); Writing–original draft, R.F.; Writing–review and editing, R.F. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** The study was approved by the Ethics Committee of the National Cancer Institute of Naples Pascale Foundation (Deliberation N. 617 of 9 August 2016).

**Informed Consent Statement:** Each patient provided signed informed consent.

**Data Availability Statement:** Data are available at https://zenodo.org/record/6344730#.YixvazXSK3A (accessed on 20 January 2022).

**Acknowledgments:** The authors are grateful to Alessandra Trocino, librarian at the National Cancer Institute of Naples, Italy. The authors also thank Paolo Pariate, Martina Totaro and Andrea Esposito of the Radiology Division at Istituto Nazionale Tumori IRCCS Fondazione Pascale–IRCCS di Napoli, I-80131 Naples, Italy, for their collaboration and research support.

**Conflicts of Interest:** The authors declare no conflicts of interest.

#### **Appendix A. Definition of Textural Features**

#### *Appendix A.1. First-Order Gray-Level Statistics*

First-order gray-level statistics describe the distribution of gray values within the volume. Let *X* denote the 3-D image matrix with *N* voxels, *P* the first-order histogram, *P*(*i*) the fraction of voxels with intensity level *i*, and *N<sub>l</sub>* the number of discrete intensity levels.

• Mean, the mean gray level of *X*.

$$mean = \frac{1}{N} \sum_{i=1}^{N} X(i)$$

• Standard Deviation (STD)

$$STD = \left(\frac{1}{N-1} \sum_{i=1}^{N} \left(X(i) - \overline{X}\right)^2\right)^{1/2}$$

• Mean Absolute Deviation (MAD), the mean of the absolute deviation of all voxel intensities around the mean intensity value.

$$MAD = \frac{1}{N} \sum_{i=1}^{N} |X(i) - \overline{X}|$$

• Range, the range of intensity values of *X*.

$$\text{range} = \max(X) - \min(X)$$

where max(*X*) is the maximum intensity value of *X* and min(*X*) is the minimum intensity value of *X*.


• Kurtosis

$$kurtosis = \frac{\frac{1}{N} \sum_{i=1}^{N} \left(X(i) - \overline{X}\right)^4}{\left(\sqrt{\frac{1}{N} \sum_{i=1}^{N} \left(X(i) - \overline{X}\right)^2}\right)^4}$$

where $\overline{X}$ is the mean of *X*.

• Variance, the square of the standard deviation.

$$variance = \frac{1}{N-1} \sum_{i=1}^{N} \left( X(i) - \overline{X} \right)^2$$

where $\overline{X}$ is the mean of *X*.

• Skewness

$$skewness = \frac{\frac{1}{N} \sum_{i=1}^{N} \left( X(i) - \overline{X} \right)^{3}}{\left( \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( X(i) - \overline{X} \right)^{2}} \right)^{3}}$$

where $\overline{X}$ is the mean of *X*.
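The first-order definitions can be checked numerically with a few lines of plain Python. The sketch below is illustrative only and is not the authors' implementation: the function name `first_order_features` is hypothetical, the input is assumed to be a flat list of voxel intensities, and kurtosis is assumed to use the usual normalization (fourth central moment divided by the squared second central moment).

```python
import math


def first_order_features(voxels):
    """First-order gray-level statistics of a flat list of voxel intensities.

    Assumes at least two voxels and a non-constant region, otherwise the
    variance-based ratios are undefined.
    """
    n = len(voxels)
    mean = sum(voxels) / n
    dev = [x - mean for x in voxels]
    m2 = sum(d ** 2 for d in dev) / n  # second central moment (1/N)
    m3 = sum(d ** 3 for d in dev) / n  # third central moment
    m4 = sum(d ** 4 for d in dev) / n  # fourth central moment
    variance = sum(d ** 2 for d in dev) / (n - 1)  # 1/(N-1), as in the text
    return {
        "mean": mean,
        "std": math.sqrt(variance),
        "mad": sum(abs(d) for d in dev) / n,
        "range": max(voxels) - min(voxels),
        "variance": variance,
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,  # assumed normalization, see lead-in
    }


features = first_order_features([1, 2, 3, 4, 5])
```

Note that the standard deviation and variance use the 1/(*N* − 1) factor, whereas skewness and kurtosis use 1/*N* moments, mirroring the formulas above.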

#### *Appendix A.2. Gray Level Co-Occurrence Matrix (GLCM)*

A normalized GLCM is defined as *P*(*i*, *j*; *δ*, *α*), a matrix of size *N<sub>g</sub>* × *N<sub>g</sub>* describing the second-order joint probability function of an image, where the (*i*, *j*)th element represents the normalized number of times the combination of intensity levels *i* and *j* occurs in two pixels of the image that are separated by a distance of *δ* pixels in direction *α*, and *N<sub>g</sub>* is the maximum discrete intensity level in the image.


• Energy

$$energy = \sum_{i=1}^{N_g} \sum_{j=1}^{N_g} \left[ P(i, j) \right]^2$$

• Contrast

$$contrast = \sum_{i=1}^{N_g} \sum_{j=1}^{N_g} \left| i - j \right|^2 P(i, j)$$

• Entropy

$$entropy = -\sum_{i=1}^{N_g} \sum_{j=1}^{N_g} P(i,j) \log_2 \left[ P(i,j) \right]$$

• Homogeneity

$$homogeneity = \sum_{i=1}^{N_g} \sum_{j=1}^{N_g} \frac{P(i,j)}{1+|i-j|}$$

• Correlation

$$correlation = \frac{\sum_{i=1}^{N_g} \sum_{j=1}^{N_g} ijP(i,j) - \mu_x\mu_y}{\sigma_x\sigma_y}$$

• Sum Average

$$\text{sum average} = \frac{1}{N_g \times N_g} \sum_{i=1}^{N_g} \sum_{j=1}^{N_g} [iP(i,j) + jP(i,j)]$$

• Dissimilarity

$$dissimilarity = \sum_{i=1}^{N_g} \sum_{j=1}^{N_g} |i - j| P(i, j)$$

• Autocorrelation

$$autocorrelation = \sum_{i=1}^{N_g} \sum_{j=1}^{N_g} ijP(i,j)$$
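To make the GLCM construction concrete, the following minimal sketch builds a normalized co-occurrence matrix for a single pixel offset on a small 2-D image and evaluates several of the features above. It rests on assumptions that are not taken from the paper: gray levels are integers 1..*N<sub>g</sub>*, pairs are counted in one direction only (no symmetrization), and the names `normalized_glcm` and `glcm_features` are hypothetical.

```python
import math


def normalized_glcm(image, n_levels, dx=1, dy=0):
    """Normalized GLCM of a 2-D image with gray levels 1..n_levels.

    (dx, dy) is the pixel offset that encodes distance and direction.
    Assumes at least one pixel pair exists for the chosen offset.
    """
    counts = [[0] * n_levels for _ in range(n_levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c] - 1][image[r2][c2] - 1] += 1
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def glcm_features(P):
    """Evaluate a subset of the GLCM features from a normalized matrix P."""
    n = len(P)
    cells = [(i + 1, j + 1, P[i][j]) for i in range(n) for j in range(n)]
    return {
        "energy": sum(p * p for _, _, p in cells),
        "contrast": sum((i - j) ** 2 * p for i, j, p in cells),
        # Empty cells are skipped because p * log2(p) tends to 0 as p -> 0.
        "entropy": -sum(p * math.log2(p) for _, _, p in cells if p > 0),
        "homogeneity": sum(p / (1 + abs(i - j)) for i, j, p in cells),
        "dissimilarity": sum(abs(i - j) * p for i, j, p in cells),
        "autocorrelation": sum(i * j * p for i, j, p in cells),
    }


image = [[1, 1, 2],
         [2, 2, 1]]
P = normalized_glcm(image, n_levels=2)
features = glcm_features(P)
```

In this toy image each of the four possible level pairs occurs once among the horizontal neighbours, so every entry of `P` equals 0.25.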

#### *Appendix A.3. Gray Level Run-Length Matrix (GLRLM)*

Run-length metrics quantify gray level runs in an image. A gray level run is a set of consecutive pixels that share the same gray level value, and its length is the number of pixels in the set. In a gray level run-length matrix *p*(*i*, *j*|*θ*), the (*i*, *j*)th element is the number of runs of gray level *i* and length *j* in the direction specified by *θ*.


• Short-Run Emphasis (SRE)

$$SRE = \sum_{j=1}^{N_r} \frac{p_r}{j^2}$$

• Long-Run Emphasis (LRE)

$$LRE = \sum_{j=1}^{N_r} j^2 p_r$$

• Gray Level Nonuniformity (GLN)

$$GLN = \sum_{i=1}^{N_g} p_g^2$$

• Run-Length Nonuniformity (RLN)

$$RLN = \sum_{j=1}^{N_r} p_r^2$$

• Run Percentage (RP)

$$RP = \frac{N_s}{N_p}$$

• Low Gray Level Run Emphasis (LGRE)

$$LGRE = \sum_{i=1}^{N_g} \frac{p_g}{i^2}$$

• High Gray Level Run Emphasis (HGRE)

$$HGRE = \sum_{i=1}^{N_g} i^2 p_g$$

• Short-Run Low Gray Level Emphasis (SRLGE)

$$SRLGE = \sum_{i=1}^{N_g} \sum_{j=1}^{N_r} \frac{p(i,j)}{i^2 j^2}$$

• Short-Run High Gray Level Emphasis (SRHGE)

$$SRHGE = \sum_{i=1}^{N_g} \sum_{j=1}^{N_r} \frac{p(i,j)i^2}{j^2}$$

• Long-Run Low Gray Level Emphasis (LRLGE)

$$LRLGE = \sum_{i=1}^{N_g} \sum_{j=1}^{N_r} \frac{p(i,j)j^2}{i^2}$$

• Long-Run High Gray Level Emphasis (LRHGE)

$$LRHGE = \sum_{i=1}^{N_g} \sum_{j=1}^{N_r} p(i,j)i^2j^2$$

• Gray Level Variance (GLV)

$$GLV = \frac{1}{N_g \times N_r} \sum_{i=1}^{N_g} \sum_{j=1}^{N_r} \left(ip(i,j) - \mu_g\right)^{2}$$

• Run-Length Variance (RLV)

$$RLV = \frac{1}{N_g \times N_r} \sum_{i=1}^{N_g} \sum_{j=1}^{N_r} \left( j p(i, j) - \mu_r \right)^{2}$$
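As an illustration of the run-length features, the sketch below builds a GLRLM for horizontal runs only (*θ* = 0) and evaluates the single-sum features as written above, without any additional normalization. Several points are assumptions of this sketch rather than statements from the text: *p<sub>g</sub>* and *p<sub>r</sub>* are taken to be the marginal sums of *p*(*i*, *j*) over run lengths and gray levels respectively, *N<sub>s</sub>* is taken to be the total number of runs, *N<sub>p</sub>* the number of pixels, and the function names are hypothetical.

```python
from itertools import groupby


def run_length_matrix(image, n_levels):
    """GLRLM for horizontal runs of a 2-D image with gray levels 1..n_levels.

    Entry [i - 1][j - 1] counts the runs of gray level i and length j.
    """
    max_run = len(image[0])
    p = [[0] * max_run for _ in range(n_levels)]
    for row in image:
        for level, run in groupby(row):  # consecutive equal values in a row
            p[level - 1][len(list(run)) - 1] += 1
    return p


def glrlm_features(p, n_voxels):
    """Evaluate the single-sum GLRLM features from a run-length matrix p."""
    n_g, n_r = len(p), len(p[0])
    p_g = [sum(p[i]) for i in range(n_g)]  # runs per gray level (assumed)
    p_r = [sum(p[i][j] for i in range(n_g)) for j in range(n_r)]  # runs per length
    n_s = sum(p_g)  # total number of runs (assumed meaning of N_s)
    return {
        "SRE": sum(p_r[j] / (j + 1) ** 2 for j in range(n_r)),
        "LRE": sum(p_r[j] * (j + 1) ** 2 for j in range(n_r)),
        "GLN": sum(v ** 2 for v in p_g),
        "RLN": sum(v ** 2 for v in p_r),
        "RP": n_s / n_voxels,
        "LGRE": sum(p_g[i] / (i + 1) ** 2 for i in range(n_g)),
        "HGRE": sum(p_g[i] * (i + 1) ** 2 for i in range(n_g)),
    }


image = [[1, 1, 2, 2],
         [1, 2, 2, 2]]
p = run_length_matrix(image, n_levels=2)
features = glrlm_features(p, n_voxels=8)
```

The toy image contains four runs: level 1 with lengths 2 and 1, and level 2 with lengths 2 and 3.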

#### *Appendix A.4. Gray Level Size Zone Matrix (GLSZM)*

A gray level size-zone matrix describes the number of homogeneous connected areas of a given size and intensity within the volume. The (*i*, *j*) entry of the GLSZM *p*(*i*, *j*) is the number of connected areas of gray level (i.e., intensity value) *i* and size *j*. GLSZM features therefore characterize homogeneous areas within the tumor volume and capture tumor heterogeneity at a regional scale [5].


• Small Zone Emphasis (SZE)

$$SZE = \sum_{j=1}^{N_z} \frac{p_z}{j^2}$$

• Large Zone Emphasis (LZE)

$$LZE = \sum_{j=1}^{N_z} j^2 p_z$$

• Gray Level Nonuniformity (GLN)

$$GLN = \sum_{i=1}^{N_g} p_g^2$$

• Zone Size Nonuniformity (ZSN)

$$ZSN = \sum_{j=1}^{N_z} p_z^2$$

• Zone Percentage (ZP)

$$ZP = \frac{N_s}{N_p}$$

• Low Gray Level Zone Emphasis (LGZE)

$$LGZE = \sum_{i=1}^{N_g} \frac{p_g}{i^2}$$

• High Gray Level Zone Emphasis (HGZE)

$$HGZE = \sum_{i=1}^{N_g} i^2 p_g$$

• Small Zone Low Gray Level Emphasis (SZLGE)

$$SZLGE = \sum_{i=1}^{N_g} \sum_{j=1}^{N_z} \frac{p(i,j)}{i^2 j^2}$$

• Small Zone High Gray Level Emphasis (SZHGE)

$$SZHGE = \sum_{i=1}^{N_g} \sum_{j=1}^{N_z} \frac{p(i,j)i^2}{j^2}$$

• Large Zone Low Gray Level Emphasis (LZLGE)

$$LZLGE = \sum_{i=1}^{N_g} \sum_{j=1}^{N_z} \frac{p(i,j)j^2}{i^2}$$

• Large Zone High Gray Level Emphasis (LZHGE)

$$LZHGE = \sum_{i=1}^{N_g} \sum_{j=1}^{N_z} p(i,j)i^2 j^2$$

• Gray Level Variance (GLV)

$$GLV = \frac{1}{N_g \times N_z} \sum_{i=1}^{N_g} \sum_{j=1}^{N_z} \left(ip(i,j) - \mu_g\right)^2$$

• Zone Size Variance (ZSV)

$$ZSV = \frac{1}{N_g \times N_z} \sum_{i=1}^{N_g} \sum_{j=1}^{N_z} \left( j p(i, j) - \mu_z \right)^{2}$$
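The size-zone counts can be obtained with a simple flood fill. The sketch below is a hedged 2-D illustration rather than the method used in the study: it assumes 4-connectivity (a 3-D volume would need a 6-, 18- or 26-neighbour rule), stores the nonzero entries of *p*(*i*, *j*) in a counter keyed by (gray level, zone size), takes *N<sub>s</sub>* to be the total number of zones, and uses hypothetical function names.

```python
from collections import Counter


def size_zone_counts(image):
    """Count connected zones of a 2-D image, keyed by (gray level, zone size).

    Zones are 4-connected here, which is an assumption of this sketch.
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    zones = Counter()
    for r0 in range(rows):
        for c0 in range(cols):
            if seen[r0][c0]:
                continue
            level, size, stack = image[r0][c0], 0, [(r0, c0)]
            seen[r0][c0] = True
            while stack:  # iterative flood fill over equal-valued neighbours
                r, c = stack.pop()
                size += 1
                for rn, cn in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                    if (0 <= rn < rows and 0 <= cn < cols
                            and not seen[rn][cn] and image[rn][cn] == level):
                        seen[rn][cn] = True
                        stack.append((rn, cn))
            zones[(level, size)] += 1
    return zones


def glszm_features(zones, n_voxels):
    """Evaluate the single-sum GLSZM features from the zone counts."""
    n_zones = sum(zones.values())  # assumed meaning of N_s
    return {
        "SZE": sum(n / j ** 2 for (_, j), n in zones.items()),
        "LZE": sum(n * j ** 2 for (_, j), n in zones.items()),
        "ZP": n_zones / n_voxels,
        "LGZE": sum(n / i ** 2 for (i, _), n in zones.items()),
        "HGZE": sum(n * i ** 2 for (i, _), n in zones.items()),
    }


image = [[1, 1, 2],
         [1, 2, 2],
         [3, 3, 1]]
zones = size_zone_counts(image)
features = glszm_features(zones, n_voxels=9)
```

Summing over the nonzero (*i*, *j*) entries is equivalent to summing the marginals *p<sub>z</sub>* or *p<sub>g</sub>*, because each feature weights an entry by its size or its gray level only.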

#### *Appendix A.5. Neighborhood Gray Tone Difference Matrix (NGTDM)*

The *i*th entry of the NGTDM *s*(*i*|*d*) is the sum of the gray level differences between voxels with intensity *i* and the average intensity *A<sub>i</sub>* of their neighboring voxels within a distance *d*.


• Coarseness

$$coarseness = \left[\varepsilon + \sum_{i=1}^{N_g} p(i)s(i)\right]^{-1}$$

where *ε* is a small number to prevent coarseness from becoming infinite.

• Contrast

$$contrast = \left(\frac{1}{N_p\left(1 - N_p\right)}\sum_{i=1}^{N_g}\sum_{j=1}^{N_g} p(i)p(j)\left(i - j\right)^2\right)\left(\frac{1}{N}\sum_{i=1}^{N_g} s(i)\right)^2$$

• Busyness

$$busyness = \frac{\sum_{i=1}^{N_g} p(i)s(i)}{\sum_{i=1}^{N_g} \sum_{j=1}^{N_g} \left| ip(i) - jp(j) \right|}, \qquad p(i) \neq 0, \ p(j) \neq 0$$

• Complexity

$$complexity = \sum_{i=1}^{N_g} \sum_{j=1}^{N_g} |i - j| \frac{p(i)s(i) + p(j)s(j)}{N(p(i) + p(j))}, \qquad p(i) \neq 0, \ p(j) \neq 0$$

• Strength

$$strength = \frac{\sum_{i=1}^{N_g} \sum_{j=1}^{N_g} [p(i) + p(j)](i - j)^2}{\varepsilon + \sum_{i=1}^{N_g} s(i)}, \qquad p(i) \neq 0, \ p(j) \neq 0$$

where *ε* is a small number to prevent strength from becoming infinite.
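The NGTDM quantities *p*(*i*) and *s*(*i*) can be sketched for a 2-D image as follows, together with coarseness and busyness. This is an illustrative sketch under stated assumptions, not the authors' code: *p*(*i*) is taken to be the fraction of pixels with gray level *i*, each difference is taken in absolute value, border pixels average over whichever neighbours exist inside the image, busyness uses the commonly used denominator |*i p*(*i*) − *j p*(*j*)|, and all function names are hypothetical.

```python
def ngtdm(image, n_levels, d=1):
    """Return (p, s) for a 2-D image with gray levels 1..n_levels.

    p[i - 1] is the fraction of pixels with level i; s[i - 1] sums |i - A|
    over those pixels, where A is the mean of the neighbours within
    distance d. Border pixels use the neighbours that exist (an assumption).
    """
    rows, cols = len(image), len(image[0])
    counts = [0] * n_levels
    s = [0.0] * n_levels
    for r in range(rows):
        for c in range(cols):
            neighbours = [image[rn][cn]
                          for rn in range(max(0, r - d), min(rows, r + d + 1))
                          for cn in range(max(0, c - d), min(cols, c + d + 1))
                          if (rn, cn) != (r, c)]
            level = image[r][c]
            counts[level - 1] += 1
            if neighbours:
                s[level - 1] += abs(level - sum(neighbours) / len(neighbours))
    n = rows * cols
    return [k / n for k in counts], s


def coarseness(p, s, eps=1e-12):
    """Inverse of the p-weighted sum of s; eps keeps the result finite."""
    return 1.0 / (eps + sum(pi * si for pi, si in zip(p, s)))


def busyness(p, s):
    """Busyness over gray levels that occur; needs at least two such levels."""
    levels = [i for i, pi in enumerate(p, start=1) if pi > 0]
    num = sum(p[i - 1] * s[i - 1] for i in levels)
    den = sum(abs(i * p[i - 1] - j * p[j - 1]) for i in levels for j in levels)
    return num / den


image = [[1, 2],
         [2, 1]]
p, s = ngtdm(image, n_levels=2)
coarse = coarseness(p, s)
busy = busyness(p, s)
```

For a perfectly uniform image every *s*(*i*) is zero, so coarseness reduces to 1/*ε*, which is why the small constant is needed.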
