*2.4. Chemicals*

All chemical reagents for analysis were of analytical grade. Deuterated solvent (CDCl3, 99.8 atom% D) was purchased from Sigma-Aldrich S.r.l. (Milan, Italy).

#### **3. Results and Discussion**

#### *NMR Spectroscopy and MVA*

Multivariate statistical analysis was applied to all of the NMR data, focusing on possible differences between Italian (from the Tuscany, Sicily and Apulia regions) and foreign (Spain, Portugal, Tunisia, Turkey, Chile and Australia) EVOO samples. The fatty acid composition, together with squalene and β-sitosterol, was calculated from the 1H NMR data by methods previously reported in the literature [29]. One-way analysis of variance (ANOVA) was performed to assess whether the means differed significantly among groups (see Tables S1 and S2). The data indicated that all the investigated oils had a fatty acid composition within the expected range, in particular with regard to the polyunsaturated fatty acids (linolenic and linoleic acids) and oleic acid. A first level of investigation was performed using unsupervised exploratory principal component analysis (PCA), without considering the climatic data, to look for trends among samples, to identify possible outliers (which were excluded from further analyses), and to obtain a general overview of the EVOOs (Figure 1). Starting from the original 221 variables per spectrum, six principal components (PCs) were enough to describe 92.5% of the variance of the entire NMR dataset, giving R2X(cum) = 0.925 and Q2(cum) = 0.822 (with PC1, PC2 and PC3 describing 57.9%, 20.8% and 5.6% of the variance, respectively). The first principal component, PC1, clearly separated the Tunisian samples from the remaining classes, while all the other oils overlapped considerably in the PC1/PC2 scoreplot (Figure 1A). Nevertheless, a certain degree of separation was also observed for Chilean oils, in particular on the third component (PC3), while the European oils (Spanish, Portuguese, Italian) overlapped considerably in the scoreplot. Examining the loadings of the original variables made it possible to identify the molecular components responsible for the observed trend (Figure 1B).
In particular, Tunisian oils were characterized by a high relative content of polyunsaturated fatty acids, such as linolenic acid (1.38 ppm, methylene protons of the unsaturated acyl groups; 2.78 ppm, diallylic protons; 5.40 ppm, linolenic olefinic protons), while a high relative content of monounsaturated fatty acids (1.32 ppm, acyl group of oleic acid) was associated with all other oils.
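The one-way ANOVA step mentioned above can be illustrated with a minimal sketch. It computes only the F statistic and its degrees of freedom using the Python standard library; the function name `one_way_anova_f` and the toy numbers are assumptions made for illustration and are not data or code from the study, which reports its results in Tables S1 and S2.

```python
from statistics import fmean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = fmean(all_values)
    # Between-group sum of squares: group means around the grand mean.
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: observations around their own group mean.
    ss_within = sum((x - fmean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Toy numbers only (hypothetical, NOT measurements from the study).
toy_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_between, df_within = one_way_anova_f(toy_groups)
print(round(f_stat, 3), df_between, df_within)  # 13.0 2 6
```

A large F relative to the F distribution with (df_between, df_within) degrees of freedom suggests that at least one group mean differs; obtaining the p-value would require an F-distribution routine from a statistics package, which this sketch deliberately leaves out.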

**Figure 1.** (**A**) Principal Component Analysis (PCA) t[1]/t[2] scoreplot for the whole Nuclear Magnetic Resonance (NMR) dataset of extra virgin olive oils (EVOOs) (six components give R2X(cum) = 0.925 and Q2(cum) = 0.822). (**B**) Loading plot for the model; the variables indicate the chemical shifts (ppm) in the 1H NMR spectra.

PCA was then applied considering only the EVOO classes with larger sample sizes, in order to increase the statistical significance for each class. Three oil groups were therefore excluded (Australia, Portugal and Turkey, with 4, 14 and 4 samples, respectively) and the PCA was repeated using the remaining four clusters (Italy, Tunisia, Chile and Spain, with 61, 58, 34 and 62 samples, respectively). In the resulting PCA model (Figure 2; 93.9% of the variance with six PCs, R2X(cum) = 0.939 and Q2(cum) = 0.859), the Tunisian samples were again clearly separated on the first principal component (PC1), while Chilean oils appeared clearly distinct from Italian and Spanish EVOOs, in particular when the third component (PC3) was considered. As described above, Tunisian oils were characterized by a high relative content of linolenic acid, while a high relative content of monounsaturated fatty acids (1.32 ppm signal, corresponding to the acyl group of oleic acid) was associated with the other oils. The most scattered clusters, Italian and Spanish oils, remained considerably overlapped in this PCA model, suggesting that further analysis was needed.
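The class selection described above can be sketched as a simple filter on sample counts. The counts are those given in the text; the cut-off `MIN_SAMPLES = 30` and the helper `split_by_size` are hypothetical, since the text states only that the smaller groups were excluded and gives no explicit threshold.

```python
# Sample counts per country of origin, as reported in the text.
sample_counts = {
    "Italy": 61, "Tunisia": 58, "Chile": 34, "Spain": 62,
    "Australia": 4, "Portugal": 14, "Turkey": 4,
}

# Hypothetical cut-off: any value above 14 and up to 34 gives the same split.
MIN_SAMPLES = 30


def split_by_size(counts, min_samples):
    """Split classes into those kept for modelling and those excluded."""
    kept = {c: n for c, n in counts.items() if n >= min_samples}
    dropped = {c: n for c, n in counts.items() if n < min_samples}
    return kept, dropped


kept, dropped = split_by_size(sample_counts, MIN_SAMPLES)
print(sorted(kept))                                # ['Chile', 'Italy', 'Spain', 'Tunisia']
print(sum(kept.values()), sum(dropped.values()))   # 215 22
```

Under this assumption, 215 of the 237 samples remain in the reduced model and 22 are set aside.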

**Figure 2.** (**A**) PCA t[1]/t[3] scoreplot for the whole NMR dataset of EVOOs (six components give R2X(cum) = 0.939 and Q2(cum) = 0.859). (**B**) Loading plot for the model; the variables indicate the chemical shifts (ppm) in the 1H NMR spectra.

*Foods* **2017**, *6*, 96

First, unsupervised PCA and supervised OPLS-DA were applied to analyze in depth the differences between Italian and Tunisian EVOO samples. Indeed, owing to the recent introduction of Tunisian product into the EU olive oil market, and its serious impact on Italian production in particular [30], the differentiation of Tunisian EVOO appears to be a key issue [31]. Taking into account the very different pedoclimatic conditions of the three Italian regions studied (Tuscany, Sicily and Apulia), sub-groups of samples from these regions were considered separately. In the resulting PCA scoreplots of Tunisian vs. Tuscan and Tunisian vs. Sicilian samples (Figure 3A,B, respectively), a very good separation between the clusters was found even in the unsupervised analysis. Moreover, for Apulian vs. Tunisian oils, despite the low number of Apulian samples considered in this work, a good separation between the two clusters was also observed, as already reported in other studies [32]. The samples were then analyzed by OPLS-DA in order to examine more closely the differences observed in the PCA and to assess the goodness of fit (R2X) and of prediction (Q2) of the models. In both cases (Tunisian vs. Tuscan and Tunisian vs. Sicilian samples), good OPLS-DA models were obtained, in which one predictive and two orthogonal components (1 + 2) gave R2X = 0.86, R2Y = 0.91 and Q2 = 0.89, and R2X = 0.85, R2Y = 0.92 and Q2 = 0.905, respectively. Considering the Q2 predictivity parameter, both OPLS-DA models (Figure 4A,B) showed a very high predictive ability (Q2 = 0.89 and Q2 = 0.905, respectively).
Again, Tunisian oils were characterized by a high relative content of polyunsaturated fatty acids, such as linolenic acid (1.38 ppm, methylene protons of the unsaturated acyl groups; 2.78 and 5.40 ppm, linolenic diallylic and olefinic protons, respectively), while a high relative content of oleic acid (5.34, 2.01, 1.32 ppm) was associated with both Tuscan and Sicilian oils.
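For reference, the model quality descriptors quoted above are conventionally defined as follows. The text does not state these formulas, so the expressions below are the commonly used definitions and should be read as an assumption about how the values were obtained; the cross-validation scheme behind PRESS is likewise not specified in the text.

```latex
% Assumed conventional definitions (not stated explicitly in the text).
% X: mean-centred (and scaled) data matrix; E: residual matrix after the fitted components.
% y_i: class membership variable; \hat{y}_i: fitted value;
% \hat{y}_{(i)}: value predicted with sample i left out during cross-validation.
R^2X = 1 - \frac{\lVert \mathbf{E} \rVert_F^2}{\lVert \mathbf{X} \rVert_F^2},
\qquad
R^2Y = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2},
\qquad
Q^2 = 1 - \frac{\mathrm{PRESS}}{\sum_i (y_i - \bar{y})^2},
\quad
\mathrm{PRESS} = \sum_i \bigl(y_i - \hat{y}_{(i)}\bigr)^2
```

With these definitions, Q2 is usually lower than R2Y because PRESS is computed on left-out samples; values close to each other, as for the 0.91/0.89 and 0.92/0.905 pairs above, indicate that the fit carries over to prediction.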

**Figure 3.** PCA scoreplots for EVOOs from Tuscan (**A**) and Sicilian (**B**) Italian regions vs. Tunisian oils.

Finally, in the OPLS-DA models comparing Italian (Sicilian, Tuscan) and Spanish EVOO samples, a good separation was obtained for Tuscan (in particular from Arezzo province, Tuscany region) vs. Spanish oils (Figure 5A) and for Sicilian vs. Spanish oils (Figure 5B). Again, in the case of Apulian (limited samples) vs. Spanish oils, a reasonable separation between the two clusters was also observed, in agreement with results already reported in other studies [32].

Further consideration should be given to the comparison between the OPLS-DA discriminating models and the average cumulative rainfall (mm) and temperature (°C) data for the considered classes. A careful analysis of the Table 1 data and of the quality descriptors of the OPLS-DA models does not indicate a clear correlation between differences in average rainfall or temperature and model discrimination performance (predictivity). Higher differences in average rainfall (Tuscan vs. Tunisian and Tuscan vs. Spanish oils) are generally associated with a more consistent discriminating ability of the studied OPLS-DA models. This trend is also observed when considering average temperature differences. On the other hand, average country or regional temperature, although calculated over a wide time span (2009 to 2012), may not correctly account for the specific pedoclimatic conditions associated with the examined samples. Therefore, conclusive correlations could not be obtained using these simple climate descriptors, which were observed and calculated on a country and/or regional basis and chosen for their ready availability. Further studies, possibly based on detailed climate data collection and analysis, are required in order to obtain sound correlations with specific EVOO metabolic profiles.

**Figure 4.** Orthogonal Partial Least Squares Discriminant Analysis (OPLS-DA) scoreplots for EVOOs from Tuscan (**A**) and Sicilian (**B**) Italian regions vs. Tunisian oils.

**Figure 5.** (**A**) OPLS-DA scoreplot for Italian (from Arezzo province, Tuscany region) and Spanish oils (1 + 4 + 0 components gave R2X = 0.91, R2Y = 0.89 and Q2 = 0.84). (**B**) OPLS-DA scoreplot for Italian (from Sicily region) and Spanish oils (1 + 6 + 0 components gave R2X = 0.9, R2Y = 0.71 and Q2 = 0.41).
