3.2. PCA and SIMCA Results
Figure 5 shows the result of the PCA calculation. Using the first two PCs, which explained 95% of the variance, a separation between pure and adulterated SBH samples was established, especially along the PC1 axis. The addition of HFCS-55 to pure Heterotrigona itama honey shifted the samples to the right of PC1 (PC1 scores became more positive). As a result, all pure SBH samples were clustered on the left of PC1 (PC1 score < 0), and most of the adulterated SBH samples were located on the right of PC1 (PC1 score > 0).
The X-loading spectra of the first three PCs were plotted against wavelength to identify the variables that contributed most (Figure 6). The loading spectrum for PC1 showed that the peak at 391 nm has a significant positive contribution, while other peaks at 384 and 470 nm in PC2 and PC3 had a similar impact in the opposite direction. Two positive peaks at 490 and 540 nm were also observed for PC2 and PC3. These important peaks in the fluorescence spectra of honey are consistent with those reported in previous works. For example, the emission at 391 nm is closely related to phenolic compounds, and the emissions at 470 and 490 nm may well be related to caffeic acid, chlorogenic acid, and ferulic acid in honey [63]. The positive peak at 391 nm can be attributed to the fact that SBH is rich in phenolic compounds, as reported in many studies [63]. The intensity of these phenolic compounds is strongly correlated with the antioxidant activity, color, and sensory features of SBH, and it could provide a valuable marker for differentiation between pure and adulterated SBH [64]. The peaks at 470 and 490 nm obtained in our SBH samples have also been observed in several previous studies. Lang et al. [65] reported a fluorescence peak of phenolic substances, such as chlorogenic acid, caffeic acid, and coumarins, close to 450 nm. The emission band for caffeic acid is in the range between 445 and 460 nm, which corresponds to the chlorogenic and ferulic acids of SBH, as reported by Sergiel et al. [66]. For these reasons, the important emission peaks at 391, 384, 470, 490, and 540 nm were selected as inputs for MLR.
Table 4 shows the result of SIMCA model development for the pure and adulterated SBH classes. The first six principal components (PCs) were obtained in both calibration and validation for both classes. The cumulative percent variance (CPV) of the first four PCs was more than 99% in calibration and validation for both the pure and adulterated SBH classes. It should be noted that a CPV of more than 70–85% is required to establish a reliable SIMCA model [67]. Thus, the developed SIMCA models of the pure and adulterated SBH classes were used to predict the class of new SBH samples in the prediction sample set (n = 52 for pure and adulterated SBH samples). The result is presented in Table 5.
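As an illustration of how a CPV criterion of this kind can be checked, the following minimal Python sketch accumulates per-PC explained variance and reports how many leading PCs are needed to reach a chosen threshold. The function names and the variance values are hypothetical and are not taken from Table 4; the sketch only assumes that CPV is the running sum of explained variance expressed as a percentage of the total.

```python
def cumulative_percent_variance(explained):
    """Return the running total of per-PC explained variance, in percent."""
    total = sum(explained)
    cpv, running = [], 0.0
    for value in explained:
        running += value
        cpv.append(100.0 * running / total)
    return cpv


def components_needed(explained, threshold):
    """Smallest number of leading PCs whose CPV reaches `threshold` percent."""
    for k, value in enumerate(cumulative_percent_variance(explained), start=1):
        if value >= threshold:
            return k
    return len(explained)


# Hypothetical per-PC variances for six PCs (illustrative only, not Table 4).
example_variances = [80.0, 15.0, 3.0, 1.5, 0.3, 0.2]
cpv = cumulative_percent_variance(example_variances)
pcs_for_85 = components_needed(example_variances, 85.0)
pcs_for_99 = components_needed(example_variances, 99.0)
```

With these illustrative numbers, two PCs would satisfy an 85% criterion and four PCs would exceed 99%.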
The pure and adulterated prediction samples were classified with 100% accuracy, sensitivity, specificity, and precision. To confirm the classification result, the model distance between the pure and adulterated SBH classes was calculated. A distance larger than 3 indicates good class separation and a low risk of misclassification. A model distance of 121.6199 was obtained in this study, much larger than 3, which indicates that the pure and adulterated SBH models were highly differentiating. The plot of model distance (Si) versus leverage (Hi) for the pure SBH model is shown in Figure 7, demonstrating that there is no overlap between the pure and adulterated samples (all adulteration levels), with no sample classified as false positive or false negative (FP and FN equal to 0). According to Se et al. [12], the limit of detection (LOD) of the SIMCA model can be calculated based on only the true positive (TP) results. Consequently, the developed SIMCA model with smoothed fluorescence spectra had excellent classification for the adulterated SBH samples at concentrations above 10% (v/v). This LOD value is consistent with the results of previous studies. For example, Se et al. [12] proposed the rapid detection and quantification of adulterants in stingless bee honey (Heterotrigona itama) using an FTIR approach, with LODs of 2% (w/w) and 8% (w/w) for sugar cane and corn syrup adulteration, respectively. Our result was also comparable with the standard analysis technique for honey adulteration detection based on stable carbon isotope ratio analysis (SCIRA). According to this standard method, the direct addition of HFCS to honey is detected when the adulteration is greater than 7%; this is the internationally recognized threshold level for samples to be considered adulterated [68,69].
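The four classification figures of merit quoted above follow directly from the confusion-matrix counts. The sketch below is a minimal illustration using the usual textbook definitions; the function name and the example counts are assumptions (the split of the 52 prediction samples between the two classes is not stated in this section), with the adulterated class treated as "positive".

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, sensitivity, specificity, and precision from counts."""
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "precision": tp / (tp + fp),     # positive predictive value
    }


# Hypothetical split of 52 samples with no misclassification (FP = FN = 0).
perfect = classification_metrics(tp=40, tn=12, fp=0, fn=0)

# Hypothetical counter-example with a few errors, for comparison.
imperfect = classification_metrics(tp=38, tn=11, fp=1, fn=2)
```

Whenever FP and FN are both zero, all four metrics equal 1 (100%) regardless of how the samples are split between classes, which is the situation reported in Table 5.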
3.3. Quantification of SBH Adulteration Level Using Different Regression Methods
Supervised quantification of the SBH adulteration level was studied using three different multivariate regression methods: PLSR, PCR, and MLR. For PLSR and PCR, fluorescence spectra from 357 to 725.5 nm (n = 738 wavelengths) were used as predictors (x-variables). For MLR, eight x-variables were selected based on their important fluorescence information: the peaks at 378, 384, 391, 470, 477, 490, and 540 nm, and the intensity ratio of 378 nm to 477 nm (I378/I477). The adulteration level (0–60%) was assigned as the target (y-variable). To minimize the risk of overfitting, the PLSR, PCR, and MLR models were validated using an external validation sample set (t-test validation). The scatter plots of the developed calibration models and their validation are presented in Figure 8, Figure 9 and Figure 10 for PLSR, PCR, and MLR, respectively. The PLSR calibration model was acceptable, with R² = 0.9637, RMSEC = 4.0792%, and SEC = 4.0907%. In the validation step, good agreement between actual and predicted levels of adulteration was achieved, with R² = 0.9610, RMSEV = 4.2418%, and SEV = 4.2608%. An even better result was obtained for PCR, with lower error and higher R² (RMSEC = 3.7014%, SEC = 3.7119%, and R² = 0.9701 for calibration; RMSEV = 3.8347%, SEV = 3.8506%, and R² = 0.9681 for validation). Interestingly, the calibration and validation results of MLR with fewer x-variables were also acceptable, with R² = 0.9480 for both calibration and validation. The error obtained in MLR was also acceptable (RMSEC = 4.8836% and SEC = 4.8974% for calibration; RMSEV = 4.9007% and SEV = 4.9206% for validation). The RMSEC and RMSEV, along with the SEC and SEV, were quite close (with a difference of less than 1%) for all regression models, indicating that no overfitting occurred. This study also showed that the PCR model, with the highest R² in both calibration and validation, outperformed the PLSR and MLR models.
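For readers who wish to reproduce figures of merit of this kind, the sketch below shows one common way to compute R², RMSE, bias, and the bias-corrected standard error from actual and predicted adulteration levels. The definitions are the usual chemometric ones and are an assumption here, since this section does not state the exact formulas used; the function name and the data are hypothetical.

```python
import math


def regression_metrics(actual, predicted):
    """Return R2, RMSE, bias, and bias-corrected standard error (SE)."""
    n = len(actual)
    residuals = [p - a for a, p in zip(actual, predicted)]
    bias = sum(residuals) / n
    ss_res = sum(r * r for r in residuals)
    rmse = math.sqrt(ss_res / n)
    # SE removes the bias and uses n - 1 degrees of freedom.
    se = math.sqrt(sum((r - bias) ** 2 for r in residuals) / (n - 1))
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - ss_res / ss_tot
    return {"r2": r2, "rmse": rmse, "bias": bias, "se": se}


# Hypothetical adulteration levels in percent (illustrative only).
actual_levels = [0, 10, 20, 30, 40, 50, 60]
predicted_levels = [2, 9, 22, 28, 41, 49, 61]
metrics = regression_metrics(actual_levels, predicted_levels)
```

Under these definitions, the standard error is slightly larger than the RMSE when the bias is close to zero, because of the n − 1 divisor. The same function applies to calibration, validation, and prediction sets.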
The quality of the regression models obtained in this study is comparable with that of previously reported studies. For example, Chen et al. [70] studied the application of three-dimensional fluorescence spectroscopy and PLSR to the authentication of four types of honey (sunflower, longan, buckwheat, and rape) adulterated with rice syrup. They obtained R² = 0.9495–0.9972. Ferreiro-Gonzalez et al. [71] demonstrated the success of visible-near-infrared spectroscopy coupled with PLSR for the prediction of honey adulteration with fructose-rich corn syrup. An R² = 0.9990 and an R² = 0.9855 were obtained for calibration and validation, respectively. UV-VIS (ultraviolet-visible) and NIR (near-infrared) spectroscopy were utilized to predict the fructose corn syrup content in Acacia honey from Croatia (adulteration range: 10–90% w/w). Two regression models were developed for quantification of the adulteration level based on PLSR and MLP ANN (multilayer perceptron artificial neural networks). Using PLSR, R² = 0.9268 and R² = 0.9100 were reported for calibration and validation, respectively [26]. Quantification of glucose syrup adulteration in Acacia honey from Croatia in the range of 10–90% (w/w) was also investigated using NIR spectroscopy; the best PLSR model, built on pre-treated NIR spectra, gave R² = 0.8978 for calibration and R² = 0.8557 for validation [72]. Raypah et al. [73] studied SBH adulteration using visible-near infrared (VIS-NIR) spectroscopy combined with aquaphotomics. The direct adulteration was done by mixing high fructose syrup with the pure SBH samples at adulteration levels from 10 to 90% (v/v). The VIS-NIR spectra from 400 to 1100 nm were pre-treated using smoothing and detrending (first-order polynomial) and were used as x-variables, while the adulteration level (10–90% v/v) was input as the y-variable. The PLSR result was accurate, with R² = 0.98 and RMSEC = 3.93% for calibration and R² = 0.96 and RMSEV = 5.88% for validation. Visible and NIR spectroscopy was used to quantify different types of adulterants (inverted sugar, rice syrup, brown cane sugar, and fructose syrup) added to high-quality honey (Granada Protected Designation of Origin, Spain) at adulteration levels of 5–50% w/w [74]. Individual and global models based on PLSR were developed with R² = 0.964–0.990 and RMSEC = 1.621–3.195%. Laser-induced breakdown spectroscopy (LIBS) along with four different PLSR methods was utilized to quantify HFCS-55 and HFCS-90 in Acacia honey, giving R² = 0.931–0.966 and RMSEC = 5.6–7.9% [75]. In a recent study, NIR spectroscopy was combined with chemometrics to evaluate adulteration in SBH samples [76]. The adulterants were distilled water and apple cider vinegar. The PLSR model was developed with R² = 0.986049, bias = −0.010%, and RMSECV = 1.686% [76]. Raman spectroscopy was recently applied along with PLSR and ANN to quantify the adulteration level of a common single-flowered lychee honey adulterated with four syrups, HFCS, RS (rice syrup), MS (maltose syrup), and BS (blended syrup), at adulteration levels of 5–90% (w/w) [77]. The PLSR model with 6 LVs for HFCS quantification gave R² = 0.9997 and RMSEC = 0.6270% for calibration, and R² = 0.9910 and RMSECV = 3.2633% for validation. The difference of 2.6363% between RMSEC and RMSECV is larger than that in our present study. Wu et al. [78] reported a similar result using Raman spectroscopy and PLSR for adulteration quantification of Acacia honey, litchi honey, and linden honey mixed with four syrups (HFCS-55, RS, MS, and BS) at adulteration concentrations ranging from 5 to 60% (w/w). The quantification of multiple syrups in adulterated samples was performed using PLSR with acceptable results. The performance of our regression models was also comparable to that of honey adulteration quantification using the standard method of HPLC and linear regression, with R² = 0.9835 [69].
The developed PLSR, PCR, and MLR models were used to predict the adulteration level in an independent prediction sample set (n = 52 samples). The slopes for the PLSR, PCR, and MLR models were 0.96, 0.95, and 0.94, respectively (Figure 11).
Table 6 summarizes the prediction performance. All models gave satisfactory prediction results, with a coefficient of determination in prediction (R²p) greater than 0.90 for the PLSR, PCR, and MLR models. The error was low in terms of RMSEP, SEP, and bias. Two parameters, RER and RPD, are frequently used to compare prediction performance with previously reported studies. According to Parrini et al. [79], a model can be considered sufficient for screening if the RPD is between 1.5 and 2.5. An excellent and acceptable prediction result must have an RER of more than 10 and an RPD of more than 2.5, as adopted by several studies [80,81,82]. All developed models met both criteria (RER = 13.16–14.81; RPD = 4.76–5.36), as seen in Table 6. Compared to previous studies on the quantification of honey adulteration using several different spectroscopy methods, our present results are better in terms of RER and RPD values. Nespeca et al. [83] used laser-induced breakdown spectroscopy (LIBS) for the detection and quantification of honey adulteration (HFCS and sugar cane syrup adulteration range: 5–95% w/w) and developed an acceptable PLSR model with RPD = 2.7. Visible and near-infrared (VIS-NIR) spectroscopy and PLSR with a t-test validation method have also been used for the detection of glucose concentration in a mixture of Saudi and imported honey samples (adulteration range: 0–33% w/w), giving an RPD of 2.06 [84]. A similar result was reported for the application of UV-VIS and NIR spectroscopy to the quantification of fructose corn syrup in Acacia honey (adulteration range: 10–90% w/w). The best PLSR model had RPD = 3.3150 and RER = 10.4512 [73]. A quantitative study was conducted using NIR spectroscopy and PLSR to predict glucose syrup adulteration in Acacia honey (10–90% w/w). The best model was obtained for multiplicative scatter correction (MSC) spectral data, with RPD = 2.7601 and RER = 8.7157 [74].
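The RPD and RER thresholds discussed above can be applied with a few lines of code. The sketch below assumes the commonly used definitions (RPD as the standard deviation of the reference values divided by SEP, and RER as the range of the reference values divided by SEP); this section does not spell out the formulas, so these definitions, the function names, and the data are assumptions for illustration only.

```python
import statistics


def rpd(reference_values, sep):
    """Ratio of the sample standard deviation of the reference values to SEP."""
    return statistics.stdev(reference_values) / sep


def rer(reference_values, sep):
    """Ratio of the range of the reference values to SEP."""
    return (max(reference_values) - min(reference_values)) / sep


def meets_criteria(reference_values, sep, min_rpd=2.5, min_rer=10.0):
    """True if both RPD and RER exceed the thresholds quoted in the text."""
    return rpd(reference_values, sep) > min_rpd and rer(reference_values, sep) > min_rer


# Hypothetical reference adulteration levels (percent) and SEP (percent).
reference_levels = [0, 10, 20, 30, 40, 50, 60]
example_sep = 4.0
example_rpd = rpd(reference_levels, example_sep)
example_rer = rer(reference_levels, example_sep)
example_ok = meets_criteria(reference_levels, example_sep)
```

Because both ratios scale with the spread of the reference values, models built over different adulteration ranges are not strictly comparable on RPD or RER alone.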
The LOD of the regression models varied between 12.79 and 14.55%. In recent work, adulteration of Western Australian honey with brown rice syrup, corn syrup, glucose syrup, sugar cane syrup, and wheat syrup (10–50% w/w) was quantified using benchtop ¹H NMR (nuclear magnetic resonance) spectroscopy with an LOD of 5% w/w [85]. Our LOD was also higher than those of the standard analytical methods for honey adulteration, HPLC and SCIRA. Ultrahigh-performance liquid chromatography with quadrupole time-of-flight mass spectrometry (UHPLC-Q-TOF-MS) successfully determined all the honey adulterants simultaneously with a detection range above 10% [86]. SCIRA methods can detect adulteration only down to 7% (w/w), whereas HPLC was found to detect adulterants at levels as low as 2.5% (w/w) [86]. The LOD and LOQ in the present study were higher than those in several previous studies because of the high adulteration range of the SBH samples (10–60% v/v) used in the study. However, as the prediction plot in Figure 11 shows, most predicted samples were very close to the regression line, which indicates that the developed PLSR, PCR, and MLR models could be used to detect HFCS adulteration of pure SBH samples.