**3. Results**

### *3.1. Physicochemical Properties of Honeys: Reference Values*

The descriptive statistics of the physicochemical data obtained by the reference methods are summarized in Table 1. The data are reported separately for the two groups established for the NIR treatment (calibration and validation).


EC: electrical conductivity; HMF: hydroxymethylfurfural; N: number of samples; SD: standard deviation.

### *3.2. NIR Calibration Equations*

The calibration process was performed with chemometric techniques using the spectra and the physicochemical data of the samples. The samples were split randomly into two groups: a calibration group with 84 samples and an external validation group with 16 samples (Table 1). First, a principal component analysis was carried out on the samples of the calibration group. The spectral variability explained ranged between 99.36% and 99.99%. The number of principal components required for each parameter was 8 for moisture, EC, HMF, the L and b\* coordinates, phenols, and flavonoids; 5 for pH; 7 for diastase index; 9 for the a\* coordinate; and 6 for color (Pfund scale). During the cross-validation process, the outliers identified with the T and H criteria were removed from the calibration set before the equations were developed. According to both criteria, the following numbers of samples were deleted: 13 for moisture, 10 for EC, 8 for pH, 29 for HMF, 11 for diastase index, 11 for Pfund, 15 for the L, a\* and b\* coordinates, 11 for phenols, and 13 for flavonoids.
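As an illustration of how a residual-based (T) criterion can flag outliers, the following minimal Python sketch standardizes the differences between reference and predicted values and marks the samples that exceed a cut-off. The cut-off of 2.5 and the function name `flag_t_outliers` are assumptions made for this example; the text does not state the thresholds used, and the H (spectral distance) criterion is not covered here.

```python
import math

def flag_t_outliers(reference, predicted, threshold=2.5):
    """Return the indices of samples whose standardized residual exceeds `threshold`.

    The threshold of 2.5 is an assumed value for illustration only.
    """
    residuals = [r - p for r, p in zip(reference, predicted)]
    n = len(residuals)
    mean_res = sum(residuals) / n
    # Sample standard deviation of the residuals, used as the scaling factor.
    se = math.sqrt(sum((e - mean_res) ** 2 for e in residuals) / (n - 1))
    return [i for i, e in enumerate(residuals) if abs(e - mean_res) / se > threshold]

# Hypothetical moisture values (%); the last sample is deliberately mispredicted.
reference = [17.0, 17.5, 18.0, 16.8, 17.2, 18.1, 17.7, 16.9, 17.4, 17.9, 18.3, 17.1]
predicted = [16.9, 17.6, 18.0, 16.7, 17.3, 18.1, 17.6, 17.0, 17.4, 17.8, 18.4, 15.1]
outliers = flag_t_outliers(reference, predicted)
```

In an iterative scheme, the flagged samples would be removed and the model refitted before the criterion is applied again.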

Calibrations were performed by MPLS using the spectral and physicochemical data of the honeys. Table 2 shows the best mathematical treatment, the concentration range, the standard deviation, and the calibration descriptors for each parameter. The results indicated that most of the physicochemical parameters of the honey samples could be predicted with the portable microNIR system. The goodness of fit of each calibration was judged by the highest RSQ and the lowest SEC and SECV. Moisture, EC, HMF, Pfund, the L and b\* coordinates of the CIELab scale, phenols, and flavonoids presented high RSQ values (between 0.74 and 0.90), whereas pH, diastase index, and the a\* coordinate of CIELab had RSQ values below 0.70. Nevertheless, the standard errors of calibration (SEC) and of cross-validation (SECV) were acceptable in all cases and differed only minimally, which indicates that the NIR models obtained with the portable microNIR are suitable within the ranges indicated.

**Table 2.** Statistical descriptors of the modified partial least squares (MPLS) calibration models for each physicochemical parameter.


EC: electrical conductivity; HMF: hydroxymethylfurfural; N: number of samples; SD: standard deviation; RSQ: multiple correlation coefficients; SEC: standard error of calibration; SECV: standard error of cross-validation; RPD: ratio performance deviation.

### 3.2.1. Internal Validation of Models

Cross-validation (internal validation) was used to study the predictive capacity of the models obtained. The samples of the calibration set were divided into six subsets in all cases; in each round, one subset was used for prediction and the remaining ones to construct the calibration model. The process was repeated as many times as there were subsets, so that every subset passed through both the calibration set and the prediction set. The results of the internal validation (NIR-predicted values versus reference values) for the physicochemical variables are shown in Figure 2. From the standard error of prediction (SEP) and the corrected SEP (SEP(C)), it was deduced that the calibration models for moisture, EC, HMF, Pfund, the a\* and b\* coordinates, phenols, and flavonoids were adequate. Therefore, these physicochemical parameters could be estimated with good results.
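The rotation of groups described above can be sketched in Python as follows. This is a minimal illustration, not the software routine used in the study: the interleaved assignment of samples to groups, the function names, and the definition of the cross-validation error as the root mean square of the held-out residuals are all assumptions, and `fit` and `predict` stand in for whatever regression model is being validated.

```python
import math

def cross_validation_groups(n_samples, n_groups=6):
    """Yield (calibration_idx, prediction_idx) pairs, one per group.

    Samples are assigned to groups in an interleaved fashion (an assumption).
    """
    indices = list(range(n_samples))
    folds = [indices[k::n_groups] for k in range(n_groups)]
    for k, prediction_idx in enumerate(folds):
        calibration_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield calibration_idx, prediction_idx

def secv(x, y, fit, predict, n_groups=6):
    """Root mean square of the residuals obtained when each group is held out once."""
    squared = []
    for cal_idx, pred_idx in cross_validation_groups(len(y), n_groups):
        model = fit([x[i] for i in cal_idx], [y[i] for i in cal_idx])
        for i in pred_idx:
            squared.append((y[i] - predict(model, x[i])) ** 2)
    return math.sqrt(sum(squared) / len(squared))
```

With this layout every sample is predicted exactly once by a model that was built without it, which is the property the cross-validation error relies on.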

The RPD statistic was used to assess the predictive capacity of the NIR calibrations against the reference methods. Moisture, HMF, Pfund, a\* and b\*, and flavonoids had the highest RPD values (>2.0), which is indicative of a good calibration. EC, pH, and phenols showed an acceptable calibration, with RPD values higher than 1.5. Diastase index and the L coordinate had an RPD value of 1.5, making them the poorest models.
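For readers unfamiliar with the statistic, RPD is commonly computed as the standard deviation of the reference values divided by the standard error of prediction or cross-validation. The short Python sketch below assumes that definition and applies the cut-offs mentioned in the text (above 2.0 and above 1.5); the function names and category labels are chosen for this example, and the input values are hypothetical.

```python
def rpd(sd_reference, standard_error):
    """Ratio of the reference SD to the standard error (assumed definition of RPD)."""
    return sd_reference / standard_error

def rpd_category(value):
    """Label an RPD value using the cut-offs mentioned in the text."""
    if value > 2.0:
        return "good"
    if value > 1.5:
        return "acceptable"
    return "poor"

# Hypothetical parameter with SD = 5.0 and a standard error of 2.0.
example = rpd(5.0, 2.0)
label = rpd_category(example)
```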

### 3.2.2. External Validation of the Models

In the external validation, the robustness of the method was checked with 16 new honey samples that were not used in the calibration models (Table 1). The average spectrum of each sample was taken, the calibration equations were applied, and the NIR values were compared with the reference values by means of the residuals and the root mean square error (RMSE). In general, the results obtained for each physicochemical parameter analysed were satisfactory. The means of the residuals ranged from 0.12 for HMF to 188.37 for EC, and the RMSE values from 0.18 for HMF to 270.36 for EC. The values predicted by the calibration models were compared with the reference data using Student's t-test for paired values at a significance level of 0.05. The test showed no significant differences between the two sets of results: the *p*-values were higher than 0.05 in all cases, ranging from 0.07 for the L coordinate to 0.92 for moisture (Table 3). Therefore, it can be concluded that the method provides data statistically comparable to the reference physicochemical data.
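The comparison described above can be sketched with the Python standard library. This is an illustrative reconstruction under stated assumptions: the mean of the residuals is taken as the mean absolute residual, the function name and the toy values are hypothetical, and instead of a *p*-value the paired t statistic is compared with the two-tailed critical value for 15 degrees of freedom (16 samples) at the 0.05 level.

```python
import math
from statistics import mean, stdev

# Two-tailed critical value of Student's t for alpha = 0.05 and 15 degrees of freedom.
T_CRIT_DF15 = 2.131

def external_validation(reference, predicted):
    """Return mean absolute residual, RMSE, and the paired t statistic.

    Assumes the residuals are not all identical (otherwise stdev is zero).
    """
    residuals = [r - p for r, p in zip(reference, predicted)]
    n = len(residuals)
    rmse = math.sqrt(sum(e * e for e in residuals) / n)
    mean_abs = mean(abs(e) for e in residuals)
    t_stat = mean(residuals) / (stdev(residuals) / math.sqrt(n))
    return {"mean_abs_residual": mean_abs, "rmse": rmse, "t": t_stat}

# Hypothetical data: 16 reference values and predictions that alternate +/- 0.5.
reference = [float(v) for v in range(16, 32)]
predicted = [v + 0.5 if i % 2 else v - 0.5 for i, v in enumerate(reference)]
stats = external_validation(reference, predicted)
no_significant_difference = abs(stats["t"]) < T_CRIT_DF15
```

A value of |t| below the critical value corresponds to *p* > 0.05, that is, no significant difference between predicted and reference values.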

**Figure 2.** Measured reference values versus the NIR values in the prediction set of honey samples.



EC: electrical conductivity; HMF: hydroxymethylfurfural; RMSE: root mean square error.

### *3.3. Botanical Origin of Samples: Pollen Fingerprint*

Eighty-four pollen types corresponding to 50 botanical families were identified in the honey samples. The best represented families were Fagaceae, Leguminosae, Ericaceae, Rosaceae, and Myrtaceae. According to their botanical origin, 10 samples were classified as blackberry honey (*Rubus*), 22 as chestnut honey (*Castanea sativa*), 9 as eucalyptus honey (*Eucalyptus*), 5 as heather honey (*Erica*), 18 as honeydew honey, and 36 as multifloral honey. The principal pollen types of each honey type are shown in Table 4.



Main pollen type: predominant pollen type in samples (frequently dominant pollen but not always). Secondary pollen types: 15–45% of pollen spectra; Other significant pollen types: <15% of pollen spectra.

The blackberry honeys had a mean value of 58.5% for *Rubus* pollen, while *Castanea* was a secondary pollen (26.4%). Other important pollen types in these samples were *Erica*, *Cytisus* type, *Eucalyptus*, *Echium*, and *Trifolium* type. The mean percentage of *Castanea* in chestnut honeys was 76.1%. *Rubus* was also present in all samples, with a mean value of 14.2%. Other important pollen types were *Cytisus* type, *Erica*, *Trifolium* type, and *Echium*. For eucalyptus honeys, *Eucalyptus* pollen had a mean value of 72.8%. Other significant pollen types were *Cytisus* type, *Castanea*, *Rubus*, *Erica*, *Conium maculatum* type, and *Salix*. In heather honeys, *Castanea* was the pollen type with the highest mean value (37.9%) and was present in 100% of the samples, while the mean value for *Erica* was 35.5%, which corresponds to a slightly underrepresented pollen in these samples. *Cytisus* type, *Eucalyptus*, and *Rubus* were also present in all samples.

The predominant pollen type in honeydew honey was *Castanea*, commonly found as the dominant pollen (mean value of 52.3%). *Rubus* was usually a secondary pollen, and *Cytisus* type, *Erica*, *Plantago*, and *Salix* were also well represented.

Multifloral honeys had diverse pollen spectra. *Castanea* was commonly the main pollen type and *Eucalyptus* a secondary pollen, while *Rubus*, *Cytisus* type, *Erica*, *Trifolium* type, *Salix*, *Echium*, and *Cynoglossum* appeared as other significant pollen types.

### *3.4. Discrimination of the Samples by Honey Type*

LDA was used to classify the honey samples according to honey type. The main pollen types (*Castanea sativa*, *Eucalyptus*, *Erica*, and *Rubus*) and the physicochemical characteristics (moisture, EC, pH, HMF, diastase index, color, phenols, and flavonoids) were included.

Five discriminant functions accounted for 100% of the variability of the data in the discriminant analysis (Table 5). The first two linear discriminant functions had a cumulative contribution of 66.07%, the largest fraction of the overall variability in the dataset. The first function yielded Wilks' Lambda = 0.01, Chi-Square = 432.3, DF = 75, *p* < 0.01, and the second function Wilks' Lambda = 0.04, Chi-Square = 296.4, DF = 56, *p* < 0.01. The significant values of Wilks' Lambda (*p* < 0.05) showed that these discriminant functions contributed to the differentiation of the investigated groups. In addition, their higher eigenvalues and canonical correlations showed the high discriminating power of the first two functions.
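The statistics quoted above are related through standard formulas, sketched here in Python. The sketch assumes the usual sequential test for discriminant functions: Wilks' Lambda as the product of 1/(1 + eigenvalue) over the functions not yet removed, Bartlett's chi-square approximation, and degrees of freedom (p - k)(g - k - 1). The counts of p = 15 variables (four pollen types plus eleven physicochemical variables, if color is counted as Pfund, L, a\* and b\*) and g = 6 honey types are inferred from the text rather than stated in it; with those counts the formula gives the reported DF values of 75 and 56.

```python
import math

def wilks_lambda(eigenvalues, k=0):
    """Wilks' Lambda after removing the first k discriminant functions."""
    return math.prod(1.0 / (1.0 + ev) for ev in eigenvalues[k:])

def bartlett_chi_square(lam, n_samples, n_vars, n_groups):
    """Bartlett's chi-square approximation for a given Wilks' Lambda."""
    return -(n_samples - 1 - (n_vars + n_groups) / 2.0) * math.log(lam)

def degrees_of_freedom(n_vars, n_groups, k=0):
    """Degrees of freedom of the test after removing the first k functions."""
    return (n_vars - k) * (n_groups - k - 1)

def cumulative_proportion(eigenvalues):
    """Cumulative share of the discriminating variability per function."""
    total = sum(eigenvalues)
    running, shares = 0.0, []
    for ev in eigenvalues:
        running += ev
        shares.append(running / total)
    return shares

# Inferred (not stated) counts: 15 variables, 6 honey types.
df_first = degrees_of_freedom(15, 6, k=0)
df_second = degrees_of_freedom(15, 6, k=1)
```

Because the Lambda values in the text are rounded to two decimals, inserting them into the chi-square formula will not reproduce the reported chi-square values exactly.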

**Table 5.** Discriminant functions and statistics extracted from the linear discriminant analysis.


*p*: significance level.

Figure 3 shows the first two functions of the discriminant analysis applied to all the honey samples. The graphical projection showed that the honey types were satisfactorily differentiated. The overall correct classification rate was 88.1% for all samples (Table 6). The groups of unifloral samples (heather, eucalyptus, and blackberry) were all correctly classified (100% of samples), the highest discrimination rate. Chestnut and honeydew honeys were correctly classified at rates of 83.4% and 83.3%, respectively. Finally, multifloral honeys had a correct classification rate of 83.3%. Regarding chestnut and honeydew honeys, 3 and 2 samples, respectively, were assigned to the other group. This is possibly due to the closeness of these samples, which were produced in the same biogeographical area and from the same plant species but with different predominance. For this reason, some samples are difficult to classify, and they may be chestnut honeys with honeydew contributions.
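Correct classification rates of this kind are obtained by comparing the assigned group of each sample with its actual group. The following Python sketch shows the calculation on a small set of hypothetical labels; it does not reproduce the data of this study, and the function name is chosen for the example.

```python
from collections import Counter

def correct_classification_rates(actual, predicted):
    """Return (per-group % correct, overall % correct) from paired label lists."""
    totals = Counter(actual)
    hits = Counter(a for a, p in zip(actual, predicted) if a == p)
    per_group = {group: 100.0 * hits[group] / totals[group] for group in totals}
    overall = 100.0 * sum(hits.values()) / len(actual)
    return per_group, overall

# Hypothetical labels: one chestnut sample is assigned to the honeydew group.
actual = ["heather"] * 5 + ["chestnut"] * 4
predicted = ["heather"] * 5 + ["chestnut"] * 3 + ["honeydew"]
per_group, overall = correct_classification_rates(actual, predicted)
```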

**Figure 3.** Plot of the first two functions of linear discriminant analysis based on main pollen types and physicochemical characteristics of honeys.


**Table 6.** Results of classification of honey samples considering the linear discriminant analysis.


N: number of samples; %: correct classification.
