**Investigating Serum and Tissue Expression Identified a Cytokine**/**Chemokine Signature as a Highly E**ff**ective Melanoma Marker**

**Marco Cesati <sup>1</sup> , Francesca Scatozza <sup>2</sup> , Daniela D'Arcangelo <sup>2</sup> , Gian Carlo Antonini-Cappellini <sup>2</sup> , Stefania Rossi <sup>3</sup> , Claudio Tabolacci <sup>3</sup> , Maurizio Nudo <sup>2</sup> , Enzo Palese <sup>2</sup> , Luigi Lembo <sup>2</sup> , Giovanni Di Lella <sup>2</sup> , Francesco Facchiano 3,\* and Antonio Facchiano 2,\***


Received: 2 November 2020; Accepted: 4 December 2020; Published: 8 December 2020

**Simple Summary:** In this study, we investigated the expression of 27 cytokines/chemokines in the serum of 232 individuals (136 melanoma patients vs. 96 controls). It identified several cytokines/chemokines differently expressed in melanoma patients as compared to the healthy controls, as a function of the presence of the melanoma, age, tumor thickness, and gender, indicating different systemic responses to the melanoma presence. We also analyzed the gene expression of the same 27 molecules at the tissue level in 511 individuals (melanoma patients vs. controls). From the gene expression analysis, we identified several cytokines/chemokines showing strongly different expression in melanoma as compared to the controls, and the 4-gene signature "*IL-1Ra*, *IL-7*, *MIP-1a*, and *MIP-1b*" as the best combination to discriminate melanoma samples from the controls, with an extremely high accuracy (AUC = 0.98). These data indicate the molecular mechanisms underlying melanoma setup and the relevant markers potentially useful to help the diagnosis of biopsy samples.

**Abstract:** The identification of reliable and quantitative melanoma biomarkers may help an early diagnosis and may directly affect melanoma mortality and morbidity. The aim of the present study was to identify effective biomarkers by investigating the expression of 27 cytokines/chemokines in melanoma compared to healthy controls, both in serum and in tissue samples. Serum samples were from 232 patients recruited at the IDI-IRCCS hospital. Expression was quantified by xMAP technology, on 27 cytokines/chemokines, compared to the control sera. RNA expression data of the same 27 molecules were obtained from 511 melanoma- and healthy-tissue samples, from the GENT2 database. Statistical analysis involved a 3-step approach: analysis of the single-molecules by Mann–Whitney analysis; analysis of paired-molecules by Pearson correlation; and profile analysis by the machine learning algorithm Support Vector Machine (SVM). Single-molecule analysis of serum expression identified IL-1b, IL-6, IP-10, PDGF-BB, and RANTES differently expressed in melanoma (*p* < 0.05). Expression of IL-8, GM-CSF, MCP-1, and TNF-α was found to be significantly correlated with Breslow thickness. Eotaxin and MCP-1 were found differentially expressed in male vs. female patients. Tissue expression analysis identified very effective marker/predictor genes, namely, IL-1Ra, IL-7, MIP-1a, and MIP-1b, with individual AUC values of 0.88, 0.86, 0.93, 0.87, respectively. SVM analysis of the tissue expression data identified the combination of these four molecules as the most effective signature to discriminate melanoma patients (AUC = 0.98). Validation, using the GEPIA2 database on an additional 1019 independent samples, fully confirmed these observations. The present study

demonstrates, for the first time, that the *IL-1Ra*, *IL-7*, *MIP-1a*, and *MIP-1b* gene signature discriminates melanoma from control tissues with extremely high efficacy. We therefore propose this 4-molecule combination as an effective melanoma marker.

**Keywords:** melanoma markers; cytokines; machine learning; Support Vector Machine; principal component analysis

#### **1. Introduction**

Melanoma is the most aggressive skin cancer with a good prognosis when early diagnosis is achieved. While relevant advances come from newly available therapies, novel approaches are necessary to improve early diagnosis and therapeutic efficacy. Several studies addressed the complex role that specific cytokines and growth factors may play in melanoma biology, acting either as pro- or anti-proliferation and either positively or negatively regulating the immune response [1]. For instance, CXCL10 (IP-10) exerts both pro-and anti-melanoma effects, mostly due to splice variants of its CXCR3 receptors [2]. Several cytokines/chemokines and corresponding receptors are known to be expressed in melanoma tissue, to regulate the multifaceted machinery coordinating the proliferation rate, the angiogenic response, the inflammatory response, the immune response, and the metastatic diffusion [1,3]. Simultaneous quantification of several cytokine/chemokine analytes has recently become available in serum as well as in tissue samples. Previous studies report gene expression profiles identifying low- vs. high-risk patients. For instance, the 31 GEP prognostic classifier identifies BAP1b, MGP, SPP1, CXCL14, CLCA2, S100A8, BTG1, SAP130, ARG1, KRT6B, GJA1, ID2, EIF1B, S100A9, CRABP2, KRT14, ROBO1, RBM23, TACSTD2, DSC1, SPRR1B, TRIM29, AQP3, TYRP1, PPL, LTA4H, and CST6 [4]. Other studies investigated the expression of orphan receptors as well as known chemokine receptors and chemokine ligands in melanoma metastases, leading to the identification of several molecules differentially expressed in metastatic melanoma, such as GPR18, GPR34, GPR119, GPR160, GPR183, P2RY10, CCR5, CXCR4, CXCR6, CCL4, CCL5, CCL14/15, CXCL8, CXCL9, CXCL14, and XCL1/2 [5]. An additional study reports a significant expression change of six chemokines (namely, CCL2, CCL3, CCL4, CCL5, CXCL9, and CXCL10) related to the lymphocyte infiltration in the melanoma tissue [6]. Statistically significant differential plasma expression in melanoma patients vs. controls has been reported for IL-2, IL-6, and IL-10 [7]. Despite the high statistical significance of the differences, none of these molecules show relevant AUC values according to ROC analyses; therefore, to date, they cannot be proposed as markers with clinical relevance.

Additional studies carried out in melanoma patients identified the serum expression level of proinflammatory cytokines, such as IL-2Ra, IL-12-p40, and IFN-α, as good predictors of relapse-free survival [8]. In another study carried out in 40 patients, the serum expression levels of 115 analytes were investigated, including most of the known cytokines and chemokines, such as IL-6, IL-7, IL-10, IL-16, TNF- α trimer, IL-1b, IFN-γ, IL-4R, IL-18, RANK-L, IL-1b, IL-2R, IL-6R, MPIF-1, Leptin, MIG, GDNF, MIP-1 alpha, MIP-1b, MIP-1 delta, ITAC, GM-CSF, MCP-4, MIP-3a, MIP-3b, MMP-1, SP-C, amphiregulin, RANK, MCP-2, IP-10, OPG, FGF-2, and many others. The serum expression profile of TNF-α receptor II, TGF-a, TIMP-1, and C-reactive protein was identified as a profile with prognostic value to predict overall survival in melanoma patients, with an Area Under the Curve (AUC) of 0.89 reduced to 0.72 when the leave-one-out cross-validation technique was applied [9]. An additional study indicates the expression of IL-1Ra, IL-2, and IFNa2 as pro-inflammatory cytokines related to the cytotoxicity associated with anti-CTL4 and anti-PD1 combined therapy [10]. Tissue expression of CCR6 and its ligand CCL20 (MIP-3a) were identified as progression predictors in primary melanoma patients [11]. Prostate-specific membrane antigen (PSMA) was identified by immunohistochemistry analysis as a good marker of metastatic melanoma, in 41 Stage III/IV melanoma human specimens [12]. The ROC analysis measured an AUC = 0.82, i.e., a good performance but not good enough to allow

its clinical application as a melanoma marker. Consensus among different studies is often difficult given experimental discrepancies on serum/plasma handling or the antibodies' sensibility/specificity. Using multiplex immune-based technology may overcome these issues, at least in part, by measuring many different analytes within the same sample.

We have previously investigated melanoma markers by in vitro screening [13], as well as by investigating the ion channels [14,15], autophagy-related molecules [16,17], or molecules related to lipid metabolism [18] in populations composed of hundreds/thousands of controls and patients. In the present study, we investigated the cytokine/chemokine protein expression in the serum of 232 controls/melanoma patients recruited in our hospital, and the gene expression on 511 melanoma tissues selected from the GENT2 database. We report here, for the first time, significant differences related to gender, age, and Breslow thickness in the serum-expression dataset. In the tissue-expression dataset, we report, for the first time, a highly relevant gene marker combination, discriminating healthy controls from melanoma patients with an extremely high accuracy, and reaching an AUC = 0.982, according to the ROC analysis.

#### **2. Results**

The cytokine/chemokine expression in melanoma patients was analyzed to identify molecules with strong and significant differential expression in patients vs. controls. The cytokine/chemokine protein expression in the serum of 232 patients recruited at the IDI hospital and their RNA expression in tissue biopsies of 511 samples from the GENT2 public database were evaluated. The serum expression and tissue expression of the same 27 human chemokines/cytokines were analyzed as a single-molecule analysis, as a paired-molecule analysis, or as a profile analysis, as reported in the cartoon depicted in Figure 1.

#### *2.1. Serum Expression: Single-Molecule Analysis of the Cytokines*/*Chemokines in Melanoma Patients vs. Controls*

The serum dataset included the following information: histopathological diagnosis (96 pathological subjects versus 136 controls), sex (112 male and 120 female), age (median 46.5 years, and mean 48.54 years), Breslow's depth (minimum value 0 mm, maximum 12 mm, median 0.7 mm, and mean 1.34 mm), and the expression values of the 27 cytokines/chemokines expressed as pg/mL. Table 1 summarizes the information on the serum dataset.


**Table 1.** Descriptive statistics of the population for the serum-expression analysis.

\* The 1 mm limit is consistent with the current threshold used for staging of T1 melanoma patients and allowed the best case distribution. Not all pathological samples report the thickness value.

Tables S1 and S2 report more general data (number of samples for each molecule, minimum value, 25% percentile, median, 75% percentile, maximum, mean, standard deviation, and having passed the normality test (or not)) for all controls and all melanoma patients, respectively.

Table 2 reports the mean values of serum expression of the 27 cytokines/chemokines, the statistical significance of the differences, and the AUC according to the ROC analyses. Five molecules show a

significantly (*p* < 0.05) different expression in melanoma vs. the controls, namely, IL-1b, IL-6, IP-10, PDGF-BB, and RANTES. The ROC analyses indicated that none shows a good ability to act as a serum marker of melanoma; in fact, an AUC < 0.70 was found in all cases. Nevertheless, the following *Cancers*  Breslow-, age-, and gender-specific characterization indicated many statistically significant differences. **2020**, *12*, x FOR PEER REVIEW 4 of 27

**Figure 1.** The cartoon reports the study workflow for the serum and tissue samples. **Figure 1.** The cartoon reports the study workflow for the serum and tissue samples.

**Table 2.** Serum expression. Medians of the expression values of the 27 cytokines/chemokines for the control and melanoma patients. The table also reports the significance of the differences according to Mann–Whitney analyses. For *p*-values < 0.05, the null hypothesis that the control and melanoma patients have the same median should be rejected (significant values reported in bold and are underlined); i.e., a *p* < 0.05 indicates a significant difference. Moreover, the table reports the classification performances as the AUC values from the ROC analyses.


Bold underlined: highlight the result.

The Breslow thickness-related differences are reported in Tables 3 and 4. Table 3 reports the mean expression of the 27 cytokines in all melanoma patients as a function of Breslow thickness <1 mm vs. >1mm. Three molecules, namely IL-8, MCP-1, and RANTES, show a statistically significant differential expression. As a further characterization, the correlation of Breslow thickness with serum expression was then investigated in all melanoma patients. Expression of IL-8, GM-CSF, and MCP-1 on one site, and TNF-α on the other, shows a significant negative and positive correlation, respectively (Table 4). Table S3 reports the correlations with Breslow thickness in male melanoma and in female melanoma patients and shows that the cytokines with significant correlations are different in males vs. females.

**Table 3.** Serum expression as a function of Breslow thickness. Count indicates the number of patients analyzed. The median expression values of the 27 molecules with a Breslow thickness <1 mm or >1 mm are reported. The significance of the difference according to the Mann–Whitney analyses is also reported: for *p*-values > 0.05, the null hypothesis that the two distributions of cytokine expressions have the same median should be rejected (significant values are reported in bold and are underlined). In simpler words, when the *p*-value is < 0.05, the cytokine expressions in patients with a Breslow thickness <1 mm and expression in patients with a thickness >1 mm have significantly different medians.


Bold underlined: highlight the result.



Bold underlined: highlight the result.

A further analysis was carried out as a function of age, in all melanoma patients and all controls. The Spearman's correlation index was computed between the age and the expression value of each cytokine. Seven significant correlations were found in the controls involving IL-7, IL-12(p70), IL-13, IP-10, MIP-1a, MIP-1b, and VEGF. Such correlation were mostly lost in the melanoma patients; in fact, the patients showed only two significant correlations with age, namely, IP-10 and G-CSF (Table 5). A similar finding in male controls vs. male melanoma and in female controls vs. female melanoma is reported in Tables S4 and S5, showing a strong reduction in the correlation with age in melanoma samples compared to the controls.



Bold underlined: highlight the result.

Then, a gender-specific analysis was carried out. Namely, expression levels of the 27 cytokines in male melanoma were compared to female melanoma. Table 6 shows that Eotaxin is significantly increased in male vs. female melanoma, and MCP-1 expression is significantly reduced in male vs. female melanoma, highlighting gender-related differences in cytokine/chemokine serum expression.


**Table 6.** Serum expression in male melanoma compared to female melanoma. For *p*-values < 0.05 (reported in bold and underlined), the null hypothesis (i.e., the two distributions have the same median) should be rejected. Briefly, the cytokine distributions for the control and melanoma patients with *p*-values < 0.05 have significantly different medians.

Bold underlined: highlight the result.

Altogether, the results reported in Tables 2–6 indicate strong and significant differential expression of several cytokines/chemokines, as a function of Breslow thickness, age, and gender. Such differences support the known role of these molecules in controlling proliferation, immune response, chemotaxis, and inflammation in melanoma samples, and provide molecular insights into the systemic response to melanoma (see the Discussion section).

#### *2.2. Serum Expression: Analysis of Paired Molecules by a Correlation Matrix*

A correlation analysis was then carried out. Namely, Spearman's correlations between the expression values of all pairs of molecules were investigated in control and in melanoma patients. Figures 2 and 3 show the molecule pairs exhibiting the highest correlation coefficients in the control and

in melanoma patients, respectively. Specifically, Figure 2A,B shows the heatmap of the intersections having a *p* < 0.05 and Spearman's rank coefficient >0.60, in the control and melanoma samples, respectively. Figure 3A,B show the molecules pair exhibiting a more severe selection, i.e., a correlation with *p* < 0.05 and a coefficient >0.7. In Figure 2, the 27 molecules were roughly clustered according to their biological functions. A higher number of strong correlations appear in the melanoma samples as compared to the controls, involving either immune/inflammatory molecules, chemokines, and angiogenic factors, indicating that the cytokines/chemokines expression network appears to be more reciprocally correlated in the melanoma samples. *Cancers* **2020**, *12*, x FOR PEER REVIEW 10 of 27 *Cancers* **2020**, *12*, x FOR PEER REVIEW 10 of 27

**Figure 2.** Correlation analysis of the serum expression values. The heatmaps in (**A**,**B**) show the **Figure 2.** Correlation analysis of the serum expression values. The heatmaps in (**A**,**B**) show the correlation of pairs of expression values having a *p*-value <0.05 and Spearman's rank coefficient >0.60, in the control and in melanoma samples, respectively. **Figure 2.** Correlation analysis of the serum expression values. The heatmaps in (**A**,**B**) show the correlation of pairs of expression values having a *p*-value <0.05 and Spearman's rank coefficient >0.60, in the control and in melanoma samples, respectively.

correlation of pairs of expression values having a *p*-value <0.05 and Spearman's rank coefficient >0.60,

**Figure 3.** Correlation analysis of the serum expression values of all the control (**A**) and melanoma (**B**) samples. The connected circles show the paired molecules having *p* < 0.05 and Spearman's R **Figure 3.** Correlation analysis of the serum expression values of all the control (**A**) and melanoma (**B**) samples. The connected circles show the paired molecules having *p* < 0.05 and Spearman's R coefficient >0.70, in the control and melanoma samples, respectively. The correlations specifically present in the melanoma samples are highlighted with blue dashed lines. **Figure 3.** Correlation analysis of the serum expression values of all the control (**A**) and melanoma (**B**) samples. The connected circles show the paired molecules having *p* < 0.05 and Spearman's R coefficient >0.70, in the control and melanoma samples, respectively. The correlations specifically present in the melanoma samples are highlighted with blue dashed lines.

as predictors of melanoma state. A 10-fold cross-validated, linear-kernel SVM search method was carried out, and the missing values were handled in two alternative ways, as specified in the Methods section. The SVM analysis of the sera data improved the classification efficacy as compared to the single-molecule analysis reported in Table 2, leading to an AUC value = 0.761 and an average accuracy of 0.724 with a *p* = 0.108 (reported in bold and underlined in Table 7). This is slightly higher

present in the melanoma samples are highlighted with blue dashed lines.

*2.3. Serum Expression: Profile Analysis by SVM*

*2.3. Serum Expression: Profile Analysis by SVM*

coefficient >0.70, in the control and melanoma samples, respectively. The correlations specifically

The serum expression of the 27 chemokines/cytokines was then analyzed as a global profile. The SVM supervised learning algorithm was used, performing a simultaneous analysis of all molecules as predictors of melanoma state. A 10-fold cross-validated, linear-kernel SVM search method was carried out, and the missing values were handled in two alternative ways, as specified in the Methods section. The SVM analysis of the sera data improved the classification efficacy as compared to the single-molecule analysis reported in Table 2, leading to an AUC value = 0.761 and an average accuracy of 0.724 with a *p* = 0.108 (reported in bold and underlined in Table 7). This is slightly higher

#### *2.3. Serum Expression: Profile Analysis by SVM*

The serum expression of the 27 chemokines/cytokines was then analyzed as a global profile. The SVM supervised learning algorithm was used, performing a simultaneous analysis of all molecules as predictors of melanoma state. A 10-fold cross-validated, linear-kernel SVM search method was carried out, and the missing values were handled in two alternative ways, as specified in the Methods section. The SVM analysis of the sera data improved the classification efficacy as compared to the single-molecule analysis reported in Table 2, leading to an AUC value = 0.761 and an average accuracy of 0.724 with a *p* = 0.108 (reported in bold and underlined in Table 7). This is slightly higher than 0.7, i.e., the best AUC value obtained by analysis in the single-molecule data (Table 2). Such a result was obtained by removing the missing values and considering as predictors the age and the expression values of the molecules IL-4, IL-8, IL-9, Eotaxin, FGF-2, IFN-γ, IP10, MIP-1a, MIP-1b, PDGF-BB, RANTES, and VEGF.

**Table 7.** Results of the SVM method applied to the serum expression dataset. Missing values are either removed or assigned as zero. In the first case, some molecules are removed from the predictor values (remaining molecules are listed in the "Molecules" column). Sex and/or age are or are not regarded as predictors. The *p*-value is the probability that the "Accuracy" value is not significantly above the "No Info Rate" value.


\* When the missing values were removed, IL-4, IL-8, IL-9, Eotaxin, FGF-2, IFN-γ, IP-10, MIP-1a, MIP-1b, PDGF-BB, RANTES, and VEGF were simultaneously analyzed by the SVM. *\*\** When the missing values were set to 0, all 27 cytokines/chemokines were simultaneously analyzed by SVM.

The SVM analysis on the serum expression data show that the profile analysis may improve the classification efficacy as compared to the single-molecule analysis, but unfortunately not enough to reach clinically relevant values.

#### *2.4. Tissues Expression: Single-Molecule Analysis of 27 Cytokines*/*Chemokines in Melanoma Patients and Controls*

Gene expression of the 27 chemokines/cytokines was then evaluated in tissue biopsies of melanoma samples and in control samples. Expression data were derived from the skin cancers section of the GENT2 database. The interface available at the link http://gent2.appex.kr/gent2/ presents data from all skin cancers combined, reporting analyses of Basal Cell Carcinoma (BCC) pooled with Squamous Cell Carcinoma (SCC), Merkel carcinoma primary, Merkel carcinoma metastatic, primary and metastatic melanoma data, for a total of 810 samples. We therefore extracted data referring to normal skin, primary melanoma, and all other melanoma data, excluding all other skin cancers from the analysis. After such a selection, 511 samples were considered, namely, 201 normal skins, 83 primary/primary in-transit melanoma patients, and 227 metastatic melanoma patients. The median gene expression values of the 27 molecules are reported in Table 8 for the three categories. No other stratifications (such as sex, age, or Breslow thickness) were carried out, given the database limitations reported in the Material and Methods section. Differences of the medians in the categories assessed by Mann–Whitney analysis revealed that most molecules show significantly different median values (*p* < 0.05). ANOVA analysis

was also carried out on the three groups, indicating similarly strong significant differences for most molecules investigated (see Table S6).

**Table 8.** Tissue expression: medians of the expression values of the 27 chemokines/cytokines for the control and for melanoma patients (all, primary, and metastatic). The table also reports the significance of the difference according to Mann–Whitney analyses (Wilcoxon two-sample rank-sum test): for *p*-values < 0.05 the null hypothesis (i.e., the corresponding sets of samples have the same median) should be rejected (values in bold and underlined).


Bold underlined: highlight the result; Italics: Genes.

A ROC analysis was then carried out for every molecule by comparing the control vs. all melanoma, control vs. primary melanoma, control vs. metastatic, and primary melanoma vs. metastatic melanoma samples. The results are reported in Table 9. Four molecules were found to be very good classifiers of the control vs. melanoma samples, namely, IL-1Ra, IL-7, MIP-1a, and MIP-1b, with AUC values of 0.88, 0.86, 0.93, and 0.87, respectively. The corresponding ROC curves for the control vs. all melanoma samples are shown in Figure 4.


**Table 9.** ROC analysis of the tissue expression data. For each molecule, the AUC values for the ROC analysis was calculated for the following sets of expression values: controls vs. all melanoma, controls vs. primary melanoma, controls vs. metastatic melanoma, and primary melanoma vs. metastatic melanoma samples. For every AUC value, the standard error of the measure obtained by the cross-validation procedure is also shown.

Bold: highlight the result; Italics: Genes.

validation procedure is also shown.

**Cytokines** 

Bold underlined: highlight the result; Italics: Genes.

melanoma samples. For every AUC value, the standard error of the measure obtained by the cross-

**Ctrls vs. Primary**

*IL-1b* 0.62 ± 0.03 0.57 ± 0.04 0.63 ± 0.03 0.54 ± 0.04 *IL-1Ra* **0.88 ± 0.02 0.88 ± 0.03 0.88 ± 0.02** 0.53 ± 0.04 *IL-2* 0.51 ± 0.03 0.54 ± 0.04 0.51 ± 0.03 0.53 ± 0.04 *IL-4* 0.50 ± 0.03 0.55 ± 0.04 0.52 ± 0.03 0.57 ± 0.04 *IL-5* 0.52 ± 0.03 0.51 ± 0.04 0.52 ± 0.03 0.51 ± 0.04 *IL-6* 0.64 ± 0.03 0.59 ± 0.04 0.66 ± 0.03 0.59 ± 0.04 *IL-7* **0.86 ± 0.02 0.91 ± 0.03 0.85 ± 0.02** 0.61 ± 0.04 *IL-8* 0.50 ± 0.03 0.52 ± 0.04 0.51 ± 0.03 0.52 ± 0.04 *IL-9* 0.51 ± 0.03 0.53 ± 0.04 0.52 ± 0.03 0.55 ± 0.04 *IL-10* 0.68 ± 0.03 0.65 ± 0.03 0.69 ± 0.03 0.55 ± 0.04 *IL-12(p70)* 0.78 ± 0.02 0.77 ± 0.03 0.79 ± 0.02 0.58 ± 0.04 *IL-13* 0.50 ± 0.03 0.50 ± 0.04 0.50 ± 0.03 0.50 ± 0.04 *IL-15* 0.52 ± 0.03 0.64 ± 0.01 0.52 ± 0.03 0.61 ± 0.04 *IL-17* 0.61 ± 0.03 0.54 ± 0.04 0.63 ± 0.03 0.59 ± 0.04 *Eotaxin* 0.63 ± 0.03 0.57 ± 0.04 0.65 ± 0.03 0.59 ± 0.04 *FGF-2* 0.52 ± 0.03 0.67 ± 0.04 0.54 ± 0.03 0.66 ± 0.04 *G-CSF* 0.50 ± 0.03 0.63 ± 0.04 0.55 ± 0.03 0.61 ± 0.04 *GM-CSF* 0.51 ± 0.03 0.52 ± 0.04 0.52 ± 0.03 0.54 ± 0.04 *IFN-γ* 0.69 ± 0.03 0.60 ± 0.04 0.72 ± 0.03 0.61 ± 0.04 *IP-10 (CXCL10)* 0.64 ± 0.03 0.56 ± 0.04 0.67 ± 0.03 0.61 ± 0.04 *MCP-1(MCAF)* 0.56 ± 0.03 0.54 ± 0.04 0.59 ± 0.03 0.62 ± 0.04 *MIP-1a (CCL3)* **0.93 ± 0.01 0.91 ± 0.02 0.93 ± 0.02** 0.58 ± 0.04 *MIP-1b (CCL4)* **0.87 ± 0.02 0.87 ± 0.02 0.86 ± 0.02** 0.58 ± 0.04 *PDGF-BB* 0.56 ± 0.03 0.55 ± 0.04 0.60 ± 0.03 0.62 ± 0.04 *RANTES (CCL5)* 0.73 ± 0.03 0.73 ± 0.03 0.72 ± 0.02 0.53 ± 0.04 *TNF-α* 0.77 ± 0.03 0.68 ± 0.03 0.80 ± 0.02 0.62 ± 0.04

**Ctrls vs. all Melanoma**

**AUC ± S.E. of ROC Analysis**

**Ctrls vs. Metastatic** **Primary vs. Metastatic**

**Figure 4.** AUC plots of the IL-1Ra, IL-7, MIP-1a, and MIP-1b gene expression in the controls vs. all melanoma samples, for the tissue-expression values.

These results indicate that analyzing the gene expression of single molecules identifies relevant and significant differences; in this case, the ability to discriminate melanoma from the controls is much higher than the serum-expression data. Nevertheless, such values are below the threshold commonly indicated for potential clinical application. We then carried out the paired-molecule and profile analysis.

#### *2.5. Tissue Expression: Analyzing Paired Molecules by a Matrix Correlation*

The analysis of the paired-molecule correlations was then carried out, similarly to what we have done for the sera dataset. The correlations between the expression values of all pairs of molecules in the control and melanoma patients were analyzed by computing Spearman's rank correlation coefficient.

Figures 5 and 6 show the molecule pairs exhibiting high correlation coefficients in the control and melanoma patients. The heatmap in Figure 5 highlights the pairs with significant correlation coefficients of R > 0.6 with *p* < 0.05. Figure 6 shows a more severe selection, i.e., the pairs with a correlation *p*-value < 0.05 and coefficient >0.70. As observed in the serum dataset, this analysis indicates that many strong correlations appear in the melanoma samples as compared to the controls, involving immune/inflammatory molecules, chemokines, and angiogenic factors.

coefficient.

coefficient.

**Figure 4.** AUC plots of the IL-1Ra, IL-7, MIP-1a, and MIP-1b gene expression in the controls vs. all

*Cancers* **2020**, *12*, x FOR PEER REVIEW 14 of 27

**Figure 4.** AUC plots of the IL-1Ra, IL-7, MIP-1a, and MIP-1b gene expression in the controls vs. all

The analysis of the paired-molecule correlations was then carried out, similarly to what we have done for the sera dataset. The correlations between the expression values of all pairs of molecules in the control and melanoma patients were analyzed by computing Spearman's rank correlation

The analysis of the paired-molecule correlations was then carried out, similarly to what we have done for the sera dataset. The correlations between the expression values of all pairs of molecules in the control and melanoma patients were analyzed by computing Spearman's rank correlation

Figures 5 and 6 show the molecule pairs exhibiting high correlation coefficients in the control and melanoma patients. The heatmap in Figure 5 highlights the pairs with significant correlation coefficients of R > 0.6 with *p* < 0.05. Figure 6 shows a more severe selection, i.e., the pairs with a correlation *p*-value < 0.05 and coefficient >0.70. As observed in the serum dataset, this analysis

Figures 5 and 6 show the molecule pairs exhibiting high correlation coefficients in the control and melanoma patients. The heatmap in Figure 5 highlights the pairs with significant correlation coefficients of R > 0.6 with *p* < 0.05. Figure 6 shows a more severe selection, i.e., the pairs with a correlation *p*-value < 0.05 and coefficient >0.70. As observed in the serum dataset, this analysis

melanoma samples, for the tissue-expression values.

melanoma samples, for the tissue-expression values.

*2.5. Tissue Expression: Analyzing Paired Molecules by a Matrix Correlation* 

*2.5. Tissue Expression: Analyzing Paired Molecules by a Matrix Correlation* 

involving immune/inflammatory molecules, chemokines, and angiogenic factors.

**Figure 5.** Correlations of the tissue expression values. The heatmap shows the correlation among pairs of expression values with a *p* < 0.05 and Spearman's rank coefficient >0.60. **Figure 5.** Correlations of the tissue expression values. The heatmap shows the correlation among pairs of expression values with a *p* < 0.05 and Spearman's rank coefficient >0.60. **Figure 5.** Correlations of the tissue expression values. The heatmap shows the correlation among pairs of expression values with a *p* < 0.05 and Spearman's rank coefficient >0.60.

**Figure 6.** Correlation analysis of the tissue expression values of all the control (**A**) and melanoma (**B**) samples. The connected circles show the paired molecules with a *p* < 0.05 and Spearman's R coefficient **Figure 6.** Correlation analysis of the tissue expression values of all the control (**A**) and melanoma (**B**) samples. The connected circles show the paired molecules with a *p* < 0.05 and Spearman's R coefficient **Figure 6.** Correlation analysis of the tissue expression values of all the control (**A**) and melanoma (**B**) samples. The connected circles show the paired molecules with a *p* < 0.05 and Spearman's R coefficient >0.70, in the control and melanoma samples, respectively. The correlations specifically present in the melanoma samples are highlighted with blue dashed lines.

#### *2.6. Tissue Expression: Profile Analysis by SVM*

As in the case of the serum expression data, the SVM supervised learning algorithm was used to investigate all molecules simultaneously as melanoma predictors. Impressive results were achieved by analyzing the tissue-expression data. The results are shown in Table 10. The average of the AUC values obtained in the 10 iterations of the cross-validation procedure is 0.99, much higher than the highest AUC value obtained in the ROC analysis of the single molecules (namely, 0.93; see Figure 4 and Table 9). The *p*-value is <0.00001; hence, we are highly confident about the statistical significance of this observation.

significance of this observation.

**Table 10.** SVM method applied to the tissue-expression dataset. Sex and age were not considered as predictors for the lack of data in the dataset. The *p*, i.e., the probability that the "Accuracy" value is not significantly higher than the "No Info Rate" value, is lower than 0.00001. Hence, we can safely reject the null hypothesis and we can assume that the accuracy of the predictive model is higher than the value of the No Info Rate (0.61), corresponding to the performance of a dummy, fixed-answer predictor. **Table 10.** SVM method applied to the tissue-expression dataset. Sex and age were not considered as predictors for the lack of data in the dataset. The *p*, i.e., the probability that the "Accuracy" value is not significantly higher than the "No Info Rate" value, is lower than 0.00001. Hence, we can safely reject the null hypothesis and we can assume that the accuracy of the predictive model is higher than

*Cancers* **2020**, *12*, x FOR PEER REVIEW 15 of 27

>0.70, in the control and melanoma samples, respectively. The correlations specifically present in the

As in the case of the serum expression data, the SVM supervised learning algorithm was used

We then conclude that by using all molecules as melanoma classifiers, the accuracy of the

to investigate all molecules simultaneously as melanoma predictors. Impressive results were achieved by analyzing the tissue-expression data. The results are shown in Table 10. The average of the AUC values obtained in the 10 iterations of the cross-validation procedure is 0.99, much higher than the highest AUC value obtained in the ROC analysis of the single molecules (namely, 0.93; see Figure 4 and Table 9). The *p*-value is <0.00001; hence, we are highly confident about the statistical

melanoma samples are highlighted with blue dashed lines.

*2.6. Tissue Expression: Profile Analysis by SVM*


We then conclude that by using all molecules as melanoma classifiers, the accuracy of the prediction is extremely strong. The ROC curve of the predictive model based on the simultaneous analysis of all molecules is shown in Figure 7. **Melanoma Controls Set Size Rate** *<sup>p</sup>***-Value** 310 201 358 153 0.99 0.95 0.61 <0.00001

**Figure 7.** ROC analysis of the SVM-based model, classifying the melanoma samples against the **Figure 7.** ROC analysis of the SVM-based model, classifying the melanoma samples against the controls.

controls. A model based on many predictors (in this case 27) may present some practical issues. We thus performed a Recursive Feature Elimination (RFE) procedure (summarized in the Methods section), to select the most relevant molecules of the predictive SVM-based model. According to this analysis, the most sensible predictors are, in order, *MIP-1a, IL-1RA, IL-7, MIP-1b, IL-12(p70),* and *TNF-α*. The best four molecules correspond to the ones shown in Table 9, obtained with ROC analyses of the A model based on many predictors (in this case 27) may present some practical issues. We thus performed a Recursive Feature Elimination (RFE) procedure (summarized in the Methods section), to select the most relevant molecules of the predictive SVM-based model. According to this analysis, the most sensible predictors are, in order, *MIP-1a, IL-1RA, IL-7, MIP-1b, IL-12(p70),* and *TNF-*α. The best four molecules correspond to the ones shown in Table 9, obtained with ROC analyses of the single molecules. As shown in Figure 8, a model based on two molecules, namely, *MIP-1a* and *IL-1RA*, achieves an AUC value = 0.965, while a 4-predictor model *(MIP-1a, IL-1RA, IL-7,* and *MIP-1b*) reaches an AUC = 0.982. These molecules stably represent the best 4-marker combination with the highest AUC value. Further increasing the number of predictors does not add any relevant improvement (see Figure 8).

We therefore conclude that combined analysis of the expression of the *MIP-1a, IL-1RA, IL-7,* and *MIP-1b* genes represents the best combination within the 27 investigated, able to very effectively discriminate the control from the melanoma samples.

improvement (see Figure 8).

single molecules. As shown in Figure 8, a model based on two molecules, namely, *MIP-1a* and *IL-1RA*, achieves an AUC value = 0.965, while a 4-predictor model *(MIP-1a, IL-1RA, IL-7,* and *MIP-1b*) reaches an AUC = 0.982. These molecules stably represent the best 4-marker combination with the highest AUC value. Further increasing the number of predictors does not add any relevant

We therefore conclude that combined analysis of the expression of the *MIP-1a, IL-1RA, IL-7,* and

*MIP-1b* genes represents the best combination within the 27 investigated, able to very effectively

**Figure 8.** AUC values obtained by the SVM algorithm using one predictor (*MIP-1a*), two predictors (*MIP-1a* + *IL-1RA*), three predictors (*MIP-1a* + *IL-1RA* + *IL-7*), four predictors (*MIP-1a* + *IL-1RA* + *IL-7* **Figure 8.** AUC values obtained by the SVM algorithm using one predictor (*MIP-1a*), two predictors (*MIP-1a* + *IL-1RA*), three predictors (*MIP-1a* + *IL-1RA* + *IL-7*), four predictors (*MIP-1a* + *IL-1RA* + *IL-7* + *MIP-1b*), etc., up to six predictors.

+ *MIP-1b*), etc., up to six predictors. We finally investigated the role of the expression of these four genes as a prognostic factor. According to the survival analysis tool available in the GEPIA2 database, 3 out of 4 show significant Hazard Ratios. Namely, *IL7*, *MIP-1a* (*CCL3*), and MIP-1b (*CCL4*) show a HR of 0.71, 0.65, and 0.5, We finally investigated the role of the expression of these four genes as a prognostic factor. According to the survival analysis tool available in the GEPIA2 database, 3 out of 4 show significant Hazard Ratios. Namely, *IL7*, *MIP-1a* (*CCL3*), and MIP-1b (*CCL4*) show a HR of 0.71, 0.65, and 0.5, respectively, with *<sup>p</sup>* <sup>=</sup> 0.01, 0.002, and 1 <sup>×</sup> <sup>10</sup>−<sup>7</sup> , respectively. These data indicate significantly improved survival in patients with high expression values for these three genes.

#### respectively, with *p* = 0.01, 0.002, and 1 × 10 *2.7. Results Validation*

survival in patients with high expression values for these three genes. *2.7. Results Validation* The expression of the four molecules reported in Figure 6 was then investigated in an The expression of the four molecules reported in Figure 6 was then investigated in an independent database, namely, GEPIA2 (found at http://gepia2.cancer-pku.cn/). Expression was confirmed to be significantly different in melanoma compared to the healthy controls, for *IL-1Ra* (recognized as *ILRn* by GEPIA2), *IL-7*, *MIP-1a* (recognized as *CCL3* by GEPIA2), and *MIP-1b* (recognized as *CCL4* by GEPIA2) (see Figure 9).

−7, respectively. These data indicate significantly improved

independent database, namely, GEPIA2 (found at http://gepia2.cancer-pku.cn/). Expression was confirmed to be significantly different in melanoma compared to the healthy controls, for *IL-1Ra*  (recognized as *ILRn* by GEPIA2), *IL-7*, *MIP-1a* (recognized as *CCL3* by GEPIA2), and *MIP-1b* The combined expression of these four molecules was then subject to a PCA analysis carried out by the "Dimensionality reduction" tool in GEPIA2. The three most relevant components very effectively differentiated melanoma from controls (Figure 10), indicating that the combined analysis of these four molecules may represent an effective melanoma marker.

(recognized as *CCL4* by GEPIA2) (see Figure 9). This observation fully validated the SVM analysis reported in Figure 6.

**3. Discussion**

sensitivity limitations.

results on tissue expression.

**Figure 9.** Tissue expression according to the GEPIA2 database. An asterisk (\*) indicates *p* < 0.0001. **Figure 9.** Tissue expression according to the GEPIA2 database. An asterisk (\*) indicates *p* < 0.0001. *Cancers* **2020**, *12*, x FOR PEER REVIEW 18 of 27

**Figure 10.** PCA analysis on the tissue expression values of *IL-1Ra, IL-7, MIP-1a,* and *MIP-1b*. **Figure 10.** PCA analysis on the tissue expression values of *IL-1Ra, IL-7, MIP-1a,* and *MIP-1b*.

While several serum biomarkers are investigated in melanoma patients [19], diagnostic markers currently applied in clinics are restricted to S-100, HMB-45, Melan-A, and SM5-1 [20,21], and the prognostic markers to monitor melanoma progression are S100B, MART1, PMEL, and S100A13 [22]. As recently reported [23], potential markers in melanoma are mutations (on BRAF, NRAS, KIT, GNA11/GNAQ, NF1, CDKN2AI, immunohistochemical biomarkers (such as PD-11 and PD-L1, as well as mutated BRAF and NY-ESO-1), miRNAs, and other serum molecules. The key role cytokines/chemokines play in the immune/inflammatory response and in proliferation and chemotaxis control has been largely investigated; nevertheless, their role as diagnostic or prognostic markers remains to be elucidated. We and others demonstrated the key role of growth factors such as FGF-2, PDGF, and TNF-α in controlling melanoma growth [24–26] and melanoma aggressiveness [27]. The present study is the first, to our knowledge, presenting a signature of four cytokines/chemokines as an extremely effective melanoma marker, in a large patient collection. The present study measures cytokine/chemokine expression in serum and in tumor biopsies. We did not expect that the same cytokines/chemokines would be modified in serum and in tissues. In fact, the molecules measured in the serum are likely produced as a systemic response, while the molecules measured within the biopsies are directly produced in the tumor or in the regions immediately close to it. Therefore, the cytokines/chemokines measured within the biopsies reflect more directly the tumor biology and its aggressive behavior. On the contrary, the cytokines/chemokines measured in the serum reflect more how the organism responds to the tumor from an inflammatory/immunological point of view. We cannot exclude that molecules produced within the primary tumor may reach the blood. However, such signals may be not measured due to the large dilution in the blood stream and their expression values may fall below the detection limit. We used the xMAP technology for quantification in serum samples, to minimize as much as possible the

For the sake of clarity, we will discuss below the results of serum expression separately from the

Serum expression: Analyzing the serum expression of 27 cytokines/chemokines did not identify any relevant marker when individually analyzed. Nevertheless, significant and strong differential expressions were found in melanoma vs. controls (see Table 2), as well as in melanoma samples as a function of Breslow thickness (see Tables 3 and 4) or age (Table 5). Furthermore, significant genderspecific differences were identified in Eotaxin and MCP-1 expression (Table 6), as well as in GM-CSF,

#### **3. Discussion**

While several serum biomarkers are investigated in melanoma patients [19], diagnostic markers currently applied in clinics are restricted to S-100, HMB-45, Melan-A, and SM5-1 [20,21], and the prognostic markers to monitor melanoma progression are S100B, MART1, PMEL, and S100A13 [22]. As recently reported [23], potential markers in melanoma are mutations (on BRAF, NRAS, KIT, GNA11/GNAQ, NF1, CDKN2AI, immunohistochemical biomarkers (such as PD-11 and PD-L1, as well as mutated BRAF and NY-ESO-1), miRNAs, and other serum molecules. The key role cytokines/chemokines play in the immune/inflammatory response and in proliferation and chemotaxis control has been largely investigated; nevertheless, their role as diagnostic or prognostic markers remains to be elucidated. We and others demonstrated the key role of growth factors such as FGF-2, PDGF, and TNF-α in controlling melanoma growth [24–26] and melanoma aggressiveness [27]. The present study is the first, to our knowledge, presenting a signature of four cytokines/chemokines as an extremely effective melanoma marker, in a large patient collection. The present study measures cytokine/chemokine expression in serum and in tumor biopsies. We did not expect that the same cytokines/chemokines would be modified in serum and in tissues. In fact, the molecules measured in the serum are likely produced as a systemic response, while the molecules measured within the biopsies are directly produced in the tumor or in the regions immediately close to it. Therefore, the cytokines/chemokines measured within the biopsies reflect more directly the tumor biology and its aggressive behavior. On the contrary, the cytokines/chemokines measured in the serum reflect more how the organism responds to the tumor from an inflammatory/immunological point of view. We cannot exclude that molecules produced within the primary tumor may reach the blood. However, such signals may be not measured due to the large dilution in the blood stream and their expression values may fall below the detection limit. We used the xMAP technology for quantification in serum samples, to minimize as much as possible the sensitivity limitations.

For the sake of clarity, we will discuss below the results of serum expression separately from the results on tissue expression.

Serum expression: Analyzing the serum expression of 27 cytokines/chemokines did not identify any relevant marker when individually analyzed. Nevertheless, significant and strong differential expressions were found in melanoma vs. controls (see Table 2), as well as in melanoma samples as a function of Breslow thickness (see Tables 3 and 4) or age (Table 5). Furthermore, significant gender-specific differences were identified in Eotaxin and MCP-1 expression (Table 6), as well as in GM-CSF, TNF-α, IL-9, MIP-1a, IL-8, PDGF-BB, and MCP-1 (Tables S3–S5). This finding indicated the molecular bases possibly underlying the different incidence and different mortality rates in male vs. female melanoma [28–32], as well as the unexpected better response of immunotherapies in men than in women [33].

Serum expression of IL-1b, IL-6, IP-10, PDGF-BB, and RANTES was found to be significantly different in melanoma vs. controls (Table 2), reinforced by a much larger patient cohort, since previous observations were carried out on much smaller patient cohorts [34,35]. These molecules make a proinflammatory milieu previously found in uveal melanoma [36] and such findings agree with recent data showing that serum inflammation markers are strongly associated with melanoma progression [37]. Cytokines and chemokines are closely engaged within a large network where several ligands share few receptors [38], therefore reciprocally modulating their pro-, anti-inflammatory, chemotactic, and angiogenic functions [39–41]. The chemokines network is known to mediate melanoma interaction with the surrounding tissues [42]. Investigating cytokine/chemokine serum expression may therefore reveal the coordinated tissue reaction to the presence of melanoma. In the present study, Spearman's correlation matrix revealed for the first time that strong and significant correlations of the expression values are more numerous in melanoma samples than in healthy controls (Figures 2 and 3). In the melanoma samples, molecules involved in inflammation, chemotaxis, and angiogenesis had strong and significant correlations, namely, according to Figure 3, strong correlations of IL-10 with FGF-2, IL-10 with RANTES, IL-5 with IL-13, IL-6 with TNF-α, and of IFN-γ with MCP-1 appear in melanoma samples. Particularly interesting is the strong correlation of IL-6 to TNF-α in the melanoma samples; these two molecules are known to control the ability to evade the immune system control in a PDL-1-dependent manner [43]. Such correlation data indicated that the cross-talk of cytokines and chemokines is altered in melanoma and this may help in defining a melanoma-specific, correlation-matrix fingerprint. We therefore analyzed the entire panel of 27 cytokines/chemokines with the SVM machine learning algorithm, to investigate simultaneously all molecules as predictors of melanoma state. In other studies, SVM effectively discriminated melanoma on the basis of dermoscopic images [44], ultrasonic and spectrophotometric images [45], BRAF status [46], or dermo-fluorescence spectra [47], with a reported accuracy up to 90%. SVM was previously used for prognostic purposes in melanoma patients [48] but, to our knowledge, the present study is the first applying the SVM analysis to cytokine/chemokine-expression values to discriminate melanoma from controls, both in serum and in tissue, in a large group of controls and patients. The SVM procedure was indeed able to improve the ability to classify the serum samples, from AUC = 0.70 for IL-6 expression (see Table 2) up to AUC = 0.761 for the combined indicators (see Table 7). However, this is still not good enough to propose a clinical diagnostic application. We then concluded that the serum expression data of these molecules, while showing strong and significant differences, may not be good classifiers. Several reasons may underlie this result, such as biological reasons (namely, the large serum dilution) as well as technical reasons (namely, samples storage or antibody cross-specificity). We cannot exclude that the cytokine serum expression will give improved information with an improved technology. Protein quantification is a rapidly evolving technology with the continuous upgrading of antibody combinations, sensitivity improvement, and protocol optimization. As an example, a 2007 report [49] identified several cytokines differently expressed in melanoma serum using multiplex xMAP technology, in 179 melanoma patients and 378 healthy controls. However, those data merit to be re-evaluated in the light of the currently available multiplex xMAP technology and new antibodies. As an alternative, quantitative proteomics approaches, based on mass spectrometry, indicated proteins differently expressed in melanoma compared to the controls [50]. However, the sensitivity of the latter techniques limits their application for cytokine/chemokine quantification, as compared to immunometric methods. Analyzing serum from melanoma patients aimed at identifying markers suitable for the early diagnosis, using a minimally invasive technique, expressions significantly different were indeed identified. However, we could not identify good markers within the 27 molecules investigated. This may depend on the molecules chosen (i.e., we should probably change targets and focus on other molecules), or it may depend on the high dilution factor in the serum samples.

We should address briefly the age-matching issue. As reported in Table 1, the mean age in the healthy groups and melanoma groups is different. Such difference reflects what the reality is, i.e., cancer patients are generally older than healthy controls, since increased age is a specific cancer risk factor. In the present study, individuals were sequentially enrolled, and controls were individuals with a suspect lesion removed and diagnosed by the pathologist as a not-cancer lesion. To have a similar age distribution in patients and in the control groups, one would be forced to remove several young healthy controls from the dataset (to match the rarely present young melanoma patients) and to remove several old melanoma patients (to match the rarely present old healthy controls). Age-matching would, therefore, strongly decrease the number of individuals analyzed, and alter the actual patient and control age distribution. The SVM analysis reported in Table 7 was also carried out on the age-matched groups, and the results are similar to the ones obtained from unmatched groups (See Table S7). Furthermore, the matrix analysis reported in Figure 3 was carried out also on the age-matched groups, and the results are identical to the ones obtained from the unmatched groups. We then conclude that age-matching, required for correct statistical analysis, would strongly reduce the group numerosity; it also would abolish a specific risk factor. In addition, the results achieved on the age-matched groups appear very similar or identical to those obtained on the age-unmatched groups, under our experimental conditions.

Tissue expression: Analyzing the tissue expression was much more effective to discriminate melanoma from controls. This is likely related to biological reasons (i.e., the direct analysis of the melanoma bearing tissue), as well as technical reasons (i.e., using a more stable quantification technique). Very high AUC values were calculated, up to 0.92 (see Figure 4), with the single-molecule analysis. Most of the investigated molecules showed a significant differential expression. This finding indicated that, at the tissue level, most of the cytokines/chemokines investigated are strongly altered in melanoma, prompting us to look for a further improvement of their classifier ability. Analysis of paired molecules reinforced what was observed in the sera dataset, showing that high-correlation pairs appear in melanoma while they are almost absent in the controls (Figures 5 and 6). Particularly interesting are the strong correlations involving the chemokines RANTES, MIP-1a, and MIP-1b (Figures 5 and 6).

Then, an SVM analysis on all molecules simultaneously analyzed was carried out. This analysis strongly improved the classification ability as compared to the single molecules. AUC reached an extremely high value of 0.991 when all 27 cytokines/chemokines were simultaneously considered as melanoma predictors. Use of the Recursive Feature Elimination (RFE) [51] procedure allowed us to identify the four best-performing molecules. The combination of IL-1Ra/IL-7/MIP-1a/MIP-1b shows the relevant AUC = 0.982. Interestingly to notice, the expression of IL-1Ra and IL-7 (known anti-inflammatory cytokines) is significantly reduced in the melanoma samples, while expression of MIP-1a and MIP-1b (known inflammatory chemokines) is significantly increased. Previous data demonstrate that CCL3 (MIP-1α) and CCL4 (MIP-1b) control the infiltration of the immune cells by recruiting antigen-presenting cells, including dendritic cells (DCs), to the tumor site via IFN-γ [52]. However, the specific signature made of the two anti-inflammatory and two pro-inflammatory molecules is a novel finding to our knowledge.

Full validation of these results was achieved on an independent dataset, the GEPIA2 database, reporting expression data from 1019 control and melanoma samples. The four molecules IL-1Ra, IL-7, MIP-1a, and MIP-1b were confirmed to have a significant (*p* < 0.0001) differential expression in melanoma (Figure 9), and their combined analysis with the PCA methodology (a different methodology compared to SVM) was found to effectively discriminate the controls from the melanoma samples (Figure 10).

The identification of the relevant gene markers from the tissue-expression data by using a quantitative technique may help improve histological diagnoses. Identification of a 4-gene signature may be a relevant help for pathologists. Measuring expression of these genes represents a quantitative approach that is operator-independent and may be part of an automatic process useful to identify suspect samples.

#### **4. Materials and Methods**

#### *4.1. Patients Selection and Recruitment*

Melanoma patients were consecutively recruited at the hospital sections of IDI-IRCCS, according to the procedure approved by the IDI Ethics Committee (CE 287/1 approved 7/04/2009) based on a suspect skin lesion. All patients gave written informed consent. Patients under any pharmacological melanoma therapy were excluded. Serum was collected before the biopsy procedure, aliquoted, and stored at −80 ◦C. According to the histological analysis, patients were then assigned to the control arm or to the melanoma arm. A total of 232 patients were recruited in the present study.

#### *4.2. Serum Handling*

A total of 7 mL of peripheral blood were collected from patients. Blood was collected in tubes with no additives of any type. Tubes were taken at room temperature for 2 to 3 h; they were then centrifuged at 15,000 rpm for 15 min; clear yellow color serum was stored in 100 µL aliquots and stored at −80 ◦C. The red color was considered a hemolysis sign and such sera were then not analyzed in the current study.

#### *4.3. Cytokines Quantification in Sera Samples*

Sera were obtained from melanoma patients (*n* = 96) and from patients with non-melanoma suspect lesions (*n* = 136) and were analyzed using xMAP technology on the Luminex platform (X200 Instrumentation equipped with a magnetic washer workstation and software Manager 6.1), which allows the simultaneous quantification of many molecules. The commercial kit used was Bio-Plex Pro human cytokine 27-plex panel (Bio-Rad Laboratories, Hercules, CA, USA), able to measure the following analytes: IL-1Ra, IL-1b, IL-2, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-12(p70), IL-13, IL-15, IL-17, TNF-α, IFN-γ, Eotaxin, Macrophage Inflammatory Protein *(MIP)-1a* (*MIP-1a*; *CCL3*), Macrophage Inflammatory Protein *(MIP)-1b (MIP-1b; CCL4),* Monocyte Chemoattractant Protein *(MCP)-1 (CCL2)*, Granulocyte Colony stimulating factor (G-CSF), GM-CSF, Basic Fibroblast growth factor (FGF-2), Interferon γ-induced protein 10 (IP-10; CXCL10), Regulated on Activation, Normal T cell Expressed and Secreted (RANTES), Platelet-Derived Growth Factor (PDGF-BB), and Vascular Endothelial Growth Factor (VEGF). Samples were handled according to the manufacturer's instructions and as previously reported [27].

#### *4.4. Serum Expression Data*

The serum dataset was composed of 232 records corresponding to 232 different individuals. The recorded data included in each case the histopathological diagnosis, sex, age, Breslow's thickness, and the expression values of the 27 chemokines/cytokines, reported in pg/mL. The expression of a few molecules was undetectable in some patients. Specifically, expressions of IL-2, IL-5, IL-6, IL-10, IL-13, and IL-15 were undetected in most cases and were measured in less than 30 controls and/or less than 30 melanoma samples. We handled missing values in two alternative ways, i.e., we removed the missing data point from the analysis (either the whole cytokine from all samples or the whole sample containing the missing value, depending on the performed analysis while trying to maximize the size of the resulting dataset), or, alternatively, we assumed the missing values are equal to zero, thus assuming that all missing values were caused by expression values lower than the measurement thresholds of the diagnostic kits.

#### *4.5. Tissue Expression Data from GENT2 Database*

The 27 molecules measured in the sera were investigated in control and melanoma samples taken from the GENT2 database, according to transcriptomic data; data of 201 normal skin biopsies were from 11 independent studies (referred to as GSE39612, GSE30355, GSE14905, GSE13355, GSE7553, GSE42109, GSE16161, GSE15605, GSE7307, GSE46239, and GSE7307); data of 83 primary melanoma biopsies were from 4 independent studies (referred to as GSE10282, GSE15605, GSE7553, and GSE62837); data of 227 metastatic melanoma biopsies were from 11 independent studies (referred to as GSE62837, GSE7307, GSE31879, GSE38312, GSE15605, GSE35640, GSE7553, GSE4587, GSE19293, GSE19234, and GSE22968). The tissue dataset was composed of 511 records, each describing a single subject: the recorded data included the histopathological diagnosis, sex, age, melanoma stage, and the expression values of the 27 cytokines. Each record always states whether the subject has been clinically diagnosed as affected by melanoma (310 patients) or not (201 controls). In this dataset, there were no missing values within the expression data. However, sex and age data were not recorded in most cases: only 217 records reported sex (138 males and 79 females) and only 125 records indicated the age (minimum 20 years, maximum 92 years, median 56 years, mean 59.56 years), and the majority of the control subjects had no sex and age data (about 70% of total). Therefore, sex and age stratifications were not carried out in the tissue-expression data.

#### *4.6. Statistical Analyses*

All statistical analyses were carried out on the "R" package version 4.0.0 [53].

#### 4.6.1. Single-Molecule Analysis

Expression data of the single molecules were analyzed as an evaluation of the statistical significance of the median differences, by the Mann–Whitney test with Bonferroni correction. ANOVA analysis of the differences between the controls, primary melanoma, and metastatic melanoma samples in the tissue data was also performed and reported in Supplementary Material. In this case, normal distribution was evaluated by the Shapiro–Wilk test, and homoscedasticity (homogeneity of variance) was evaluated by the nonparametric Levene test. When the ANOVA analysis showed a significant difference between the medians, Mann–Whitney with Bonferroni correction and Tukey's honest significance tests were applied as post-hoc tests.

#### 4.6.2. Paired-Molecule Analysis

A data-matrix analysis investigated whether the expression correlations of all molecule pairs show any relevant difference between the control and melanoma samples. Spearman's rank correlation coefficient was calculated.

#### 4.6.3. Profile Analysis

Profile analysis was based on the Support Vector Machine (SVM) supervised learning algorithm, using a linear kernel [54,55]. Briefly, the method finds the best separation hyperplane between the set of control samples and the set of melanoma samples. Each sample is assumed as a single point in the hyperspace of dimensions n, where n is the number of features that can be used as predictors (specifically, the expression values of the molecules, and optionally age and sex). The result of the SVM algorithm is a separation hyperplane that maximizes the cumulative quadratic distance between the boundary points and the hyperplane itself. A parameter C plays a crucial role when the points are not linearly separable: C represents the tradeoff between decreasing the quadratic distance and ensuring that the boundary points are properly classified. We tuned the parameter C by testing 40 values between approximately 10−<sup>14</sup> and 10<sup>5</sup> and selecting the value yielding the largest Area Under Curve (AUC) of the Receiver Operating Characteristic (ROC).

The missing expression values were removed from the dataset, according to two alternative approaches. In the first approach, we removed either the entire sample or, if more convenient in terms of resulting dataset size, all expression values of a specific molecule from the dataset. In the second approach, we assigned all missing values to zero [56]. The expression values of each molecule were then transformed to have average = 0 and standard deviation = 1, according to standard methods for this kind of analysis [57,58]. To validate the results, a 10-fold cross-validation procedure was applied. The SVM algorithm considers all predictors as coordinates in a multidimensional space, hence the prediction model is based on the whole set of 27 expression values of the molecules. For the analysis of the tissue-expression data, a Recursive Feature Elimination procedure [51] was applied to identify the molecules having the greatest impact. Therefore, the most relevant molecules of the predictive SVM-based model were identified. This method essentially repeats several times the cross-validated SVM analysis by excluding one of the predictors at a time, then discarding the weakest one, and restarts the whole process on the set of remaining predictors. By this procedure, the molecules having the weakest impact on the performance of the SVM model were identified and removed from the feature set.

#### *4.7. Results Validation*

#### 4.7.1. Cross-Validation Procedure

All statistical results involving random selections of samples (namely SVM analyses) were validated using "cross-validation" methods to reduce errors due to overfitting. Overfitting errors are caused by an over-optimization of the parameters of a statistical method that achieves an optimal result on the available dataset but poor results on a dataset built from a different set of observations. In a typical k-fold cross-validation procedure, the dataset is randomly partitioned in k subsets of

approximately equal size. The statistical method is then repeated k times: in every execution, k-1 subsets are used as "training sets" to optimize the parameters of the method, while the remaining "testing"' subset is used to evaluate the performances of the method. Every execution of the method uses a different subset as the "testing" set. Eventually, the performances of the method are taken as the average performances of all k executions. In this work we selected k = 10; thus, every measure is the net results of 10 experimental runs.

#### 4.7.2. Validation

The tissue-expression results obtained from the data collected from the GENT2 database were validated on an independent database, GEPIA2 (available at http://gepia2.cancer-pku.cn/#index), reporting the RNA expression data from 461 controls and from 558 melanoma patients. Expression and dimensionality reduction by PCA analysis were carried out by the specific tools available at the GEPIA2 database.

### **5. Conclusions**

We report here, for the first time, significant differences in cytokine expression as a function of the pathological state and gender, or age, or Breslow thickness, in the serum expression of a large patient dataset. Such differences are likely related to the systemic response to the tumor and may help, at least in part, investigating the known heterogeneity of this tumor. Furthermore, by analyzing gene expression in a large tissue expression dataset, we report, for the first time, a highly relevant 4-gene signature that discriminates the controls from the melanoma patients. We also show here that the machine learning algorithm SVM appears to be very effective in improving the classification ability for potentially diagnostic purposes and clinical applications.

**Supplementary Materials:** The following are available online at http://www.mdpi.com/2072-6694/12/12/3680/s1, Table S1: Serum expression general data in all controls (male + female). Table S2: General data on serum expression in all melanoma patients (male + female). Table S3: Correlation of the serum expression with Breslow thickness in male melanoma and in female melanoma. Table S4: Correlation of the serum expression with age, in female controls compared to female melanoma. Table S5: Correlation of the serum expression with age, in male controls compared to male melanoma. Table S6: Anova analysis of tissue expression data. Table S7: Results of the SVM method applied to the serum expression dataset after age-matching. Results are similar to the analysis performed on the unmatched data (see Table 7).

**Author Contributions:** M.C.: conceptualization; data analysis; draft writing; final writing: F.S.: sample handling; database updating; draft writing; D.D.: sample handling; database updating; draft writing; G.C.A.-C.: patients recruitment; informed-consent; S.R.: expression analysis; C.T.: expression analysis; M.N.: patients recruitment; informed-consent; E.P.: patients recruitment; informed-consent; L.L.: patients recruitment; informed-consent; G.D.L.: patients recruitment; informed-consent; F.F.: conceptualization; study design; draft writing; final writing; A.F.: conceptualization; study design; draft writing; data analysis; final writing; study supervision. All authors have read and agreed to the published version of the manuscript.

**Funding:** MOH: RC2018: RC2019 to A.F. MOH: Programma Italia-USA, and Oncotecnologico to FF.

**Acknowledgments:** We gratefully thank the support from the Facility of Complex Protein Mixture (CPM) analysis at ISS, Rome, and the entire sections of Oncology- Dermatology and Surgery at IDI-IRCSS. T.C. was supported by Fondazione Umberto Veronesi that is gratefully acknowledged.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

*Review*
