#### **1. Introduction**

The economic significance of the olive industry to the European Union is unquestionable. Europe contributed almost 70% of world olive oil production in the 2018–2019 harvest campaign, generating revenue of about five billion euro [1]. This large and continuously expanding industry is also associated with many environmental problems stemming from waste production and inappropriate disposal, soil depletion, and atmospheric emissions [2]. Every phase in the olive chain is characterized by different environmental concerns. In the agronomic phase, the use of pesticides, herbicides, and fertilizers has been identified as the principal contributor to ecological challenges [3]. In the cultivation phase, activities such as irrigation, pruning, soil management, and fertilizer applications can negatively affect the environment. The impacts of these primary phases are minor when compared to olive oil production and its unit operations. Oil extraction generates the most potentially hazardous organic compounds, which accompany olive wastewater and pomace, depending on the techniques [4]. Laudable efforts have been made to adopt sustainable agricultural and industrial practices in the olive value chain to mitigate these problems. For instance, the adoption of organic integrated agricultural systems in the farming and cultivation of olives is an example of sustainable agricultural practice. Industrially, practices such as the two-phase olive extraction method, which reduces water consumption, the extraction of bioactive phytonutrients from by-products, and the overall valorization of the olive production chain have significantly reduced the negative impacts of the industry on the environment [5,6]. However, a rather less emphasized aspect of the sustainable olive system is solvent reduction and replacement strategies during laboratory chemical analyses of olives and olive oils.

**Citation:** Grassi, S.; Jolayemi, O.S.; Giovenzana, V.; Tugnolo, A.; Squeo, G.; Conte, P.; De Bruno, A.; Flamminii, F.; Casiraghi, E.; Alamprese, C. Near Infrared Spectroscopy as a Green Technology for the Quality Prediction of Intact Olives. *Foods* **2021**, *10*, 1042. https://doi.org/10.3390/foods10051042

Academic Editor: Maria D. Guillen

Received: 15 April 2021 Accepted: 6 May 2021 Published: 11 May 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

These chemical analyses are fundamental for monitoring olive ripeness, estimating oil extraction efficiency, and controlling oil quality. Free acidity, moisture, and oil contents are examples of chemical parameters serving as quick tests on olive drupes before extraction [7]. On-field information on these chemical parameters can suggest a suitable harvest time and guide overall orchard management [8,9]. Immediate first-hand knowledge of the moisture and oil content of olive drupes prior to processing can reliably predict the economic viability of the entire production process; informing producers about the raw material composition is therefore of crucial relevance [10,11]. Similarly, prediction of minor constituents of olives, such as phenols, pigments, and antioxidants, can facilitate instant classification of the resultant oils even before production, making compliance with official standards and product consistency easier. Commonly used wet methods, such as Soxhlet extraction, gravimetry, and chromatography, have many limitations that make them unsustainable, such as excess solvent consumption, limited sample size, destructive sample preparation, slow response, and technical demand [7]. Thus, for effective processing and quality control of the olive system, the application of green, sustainable, eco-friendly, energy-efficient, non-destructive, non-invasive, easy-to-use, and inexpensive spectroscopic methods becomes inevitable.

From the technological point of view, the importance of these rapid determinations before oil extraction may lie in the possibility of modulating the extraction systems based on the drupe characteristics and the type of desired product. For instance, operative conditions safeguarding the phenolic content can be adopted if the phenolic content of the drupes is low or, vice versa, the outstanding phenolic content of some drupes can be lowered if the final product is intended for consumers who do not like bitter/pungent oil [12,13]. Knowing how to set the equipment before starting the process, instead of correcting the settings once the oil has been extracted and analyzed, might be of practical interest.

Near infrared spectroscopy (NIRS) has gained prominence in the last decade and has contributed economically to the food and feed industries by ensuring on-time processing and quality control [14,15]. The technology is a formidable green chemistry tool and an environmentally sustainable analytical technique capable of handling a large sample size in solid and liquid forms, and it provides quick answers to quality questions. NIRS, in conjunction with appropriate chemometrics, has become a routine analytical tool for the determination of the moisture and fat contents of intact olive drupes [16,17]. Using a portable Vis/NIR spectral acquisition device equipped with multiple detectors, it was possible to predict several economically important olive mill parameters such as maturity index, moisture, oil content, acidity, and dry matter [18]. Another type of NIRS system with a wavelength selection tool (acousto-optically tunable filter, AOTF) was satisfactorily applied to predict phenolic compounds and to monitor ripening of olives [19,20]. In addition to intact or crushed olive quality assessment, NIRS has been found to be handy in the evaluation of olive oils and olive by-products [21,22]. However, comparative performance evaluations of NIRS using different signal acquisition devices are relatively uncommon, especially for olive drupes. In this study, the results obtained by a Fourier transform (FT)-NIR spectrometer (equipped with both integrating sphere and fiber optic probe) and a Vis/NIR handheld device for the prediction of quality parameters of intact olives of 13 different cultivars collected in three harvest years are discussed. In particular, the objective was to evaluate the different performance of the acquisition systems in the prediction of moisture, oil content, soluble solids, total phenol content, and antioxidant activity, with a view to identifying suitable tools to be applied both in the field and at the mill for quick answers to quality questions in a sustainable way.

#### **2. Materials and Methods**

#### *2.1. Olive Samples*

Samples of olives belonging to 13 different cultivars from Abruzzo, Apulia, Calabria, and Sardinia regions (Italy) were used; sampling was carried out at different ripening degrees during 2016–2018 harvesting years. For each sampling and cultivar, three sample units (500 g each) were picked from different identified trees of the same grove, for a total of 267 sample units. Each unit was independently analyzed for the chemical parameters (moisture, oil content, soluble solid content, total phenolic content, and antioxidant activity). Two aliquots (100 g each) were taken from each sample unit for FT-NIR analysis with the integrating sphere. From each aliquot, 10 olives were selected as representative of the ripening stage [23] and used for analyses with both the FT-NIR and Vis/NIR fiber optical probes.

#### *2.2. Chemical Analyses*

Determination of moisture content (%) was carried out according to the AOAC 934.06 official method [24]. Oil content (% on fresh weight) was determined gravimetrically after the extraction of the oil from 10 g of dehydrated olive paste in a Soxhlet apparatus using petroleum ether as solvent [25]. Total soluble solids content (°Bx) was measured according to a previously published procedure [26]. Briefly, the sugar aqueous solution was prepared by homogenizing olive paste (20 g) in distilled water (40 mL) and stirring for 2 min. After centrifugation (11,000× *g* for 10 min), the supernatant solution was analyzed through a digital refractometer. Total phenol content (TPC) was determined as follows: olive pulp (1 g) was extracted using hexane (3 mL) and methanol:water (70:30 *v*/*v*; 15 mL), by stirring for 10 min at room temperature. After centrifugation (6000× *g* at 4 °C for 10 min), the supernatant phase was collected and further centrifuged (13,600× *g*, 5 min, room temperature). The obtained extracts were filtered through nylon syringe filters (pore size 0.45 μm; LLG Syringe Filter CA, Carlo Erba, Milano, Italy), properly diluted, and spectrophotometrically analyzed at 750 nm using the Folin-Ciocalteau reagent [27]. Calibration curves were made using gallic acid and the results were expressed as grams of gallic acid equivalent per kilogram olive pulp (gGA/kg). Antioxidant activity (% inhibition/mg olive pulp) was determined on the same extracts used for TPC, applying the radical 2,2-diphenyl-1-picrylhydrazyl (DPPH•) method [28]. Briefly, 200 μL extract (previously diluted 1:20 in methanol) was made to react with 2.8 mL DPPH• methanol solution (6 × 10<sup>−5</sup> M) for 1 h at 22 °C, measuring the discoloration at 515 nm. All reagents were from Sigma-Aldrich (St. Louis, MO, USA).

#### *2.3. Spectra Collection*

Spectra were collected by using a benchtop FT-NIR spectrometer (MPA, Bruker Optics, Milan, Italy), equipped with both an integrating sphere and a fiber-optic probe, and a handheld portable Vis/NIR device (Jaz, OceanOptics Inc., Dunedin, FL, USA). The FT-NIR spectra of the two aliquots (100 g each) of each olive sample unit were collected in duplicate in diffuse reflectance by means of the integrating sphere system. The optical fiber was used to acquire, in duplicate, the FT-NIR spectra of the 10 single olives selected from each aliquot based on ripening degree [23]. For both FT-NIR sampling systems, spectra were collected within a 12,500–3600 cm<sup>−1</sup> spectral range, at 8 cm<sup>−1</sup> resolution and with 32 scans. The background for the integrating sphere was performed by closing the internal reference wheel of the module, while for the fiber-optic probe a Spectralon standard was used. Dedicated software (OPUS v. 6.5, Bruker Optics, Ettlingen, Germany) was used to manage the instrument. The same single olives were also analyzed in duplicate by using the Vis/NIR portable device (500–1000 nm, i.e., 20,000–10,000 cm<sup>−1</sup>; 0.3 nm resolution; 5 scans) equipped with a bifurcated optical fiber provided with a cap that standardizes the distance between the head of the probe and the sample (about 2 mm) and reduces the environmental light interference. A white reference (99% reflection) was used to set the maximum reflection. Spectrum acquisition lasted 18 s for both the integrating sphere and the probe of the benchtop FT-NIR spectrometer, and 1 s for the portable Vis/NIR device. Measurements were conducted with both instruments on the same day, thus making sample storage between analyses unnecessary.
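Because the FT-NIR ranges are quoted in wavenumbers and the Vis/NIR range in wavelengths, a short conversion can help when comparing the two instruments. The following is a minimal sketch; the function names are illustrative and not part of any instrument software.

```python
def nm_to_wavenumber(wavelength_nm: float) -> float:
    """Convert a wavelength in nm to a wavenumber in cm^-1 (1 cm = 1e7 nm)."""
    return 1e7 / wavelength_nm


def wavenumber_to_nm(wavenumber_cm1: float) -> float:
    """Convert a wavenumber in cm^-1 to a wavelength in nm."""
    return 1e7 / wavenumber_cm1


# Vis/NIR handheld device: 500-1000 nm expressed in cm^-1
vis_nir_cm1 = (nm_to_wavenumber(500), nm_to_wavenumber(1000))

# FT-NIR acquisition range: 12,500-3600 cm^-1 expressed in nm
ft_nir_nm = (wavenumber_to_nm(12500), wavenumber_to_nm(3600))

print(vis_nir_cm1)  # wavenumber limits of the Vis/NIR device
print(ft_nir_nm)    # wavelength limits of the FT-NIR spectrometer
```

Under this conversion, the acquired ranges of the two instruments overlap only between 10,000 and 12,500 cm<sup>−1</sup> (800–1000 nm).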

#### *2.4. Data Analysis*

Data elaborations were performed using the Unscrambler X software (v. 10.4, CAMO ASA, Oslo, Norway). The replicated spectra were averaged in order to have one spectrum for each sample unit. For the FT-NIR probe and sphere, spectral ranges were reduced to eliminate non-informative and noisy regions (i.e., 3600–4000 and 10,500–12,500 cm<sup>−1</sup>), whereas in the case of the portable Vis/NIR device, the whole spectral range was used. The spectral data were independently pre-processed by standard normal variate (SNV), which removes possible interferences due to light scattering [29]. Chemical variables and all spectral data were merged in a single matrix (267 sample units × 5024 variables) and used to perform principal component analysis (PCA), autoscaling all the variables to overcome the heteroscedastic nature of the data. The coordinate transformation of the merged spectral–chemical data matrix allowed for the selection of a calibration and a prediction data set, using the Kennard–Stone (KS) algorithm [30]. The algorithm partitioned the data in order to have 70% of samples (187 sample units) in the calibration set and 30% (80 sample units) in the prediction set.
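The SNV transform and the Kennard–Stone selection described above can be sketched as follows. This is an illustrative pure-Python version, not the Unscrambler implementation; the toy score coordinates are hypothetical, and the number of principal components on which KS was run is not stated in the text, so two are assumed here.

```python
import math


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own mean and SD."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in spectrum) / (n - 1))
    return [(v - mean) / sd for v in spectrum]


def kennard_stone(points, n_select):
    """Return the indices of n_select points chosen with the Kennard-Stone rule."""
    n = len(points)
    dist = [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]
    # Start from the two most distant points
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: dist[pair[0]][pair[1]],
    )
    selected = [first, second]
    remaining = set(range(n)) - set(selected)
    while len(selected) < n_select:
        # Add the candidate whose nearest already-selected point is farthest away
        nxt = max(sorted(remaining), key=lambda k: min(dist[k][s] for s in selected))
        selected.append(nxt)
        remaining.remove(nxt)
    return selected


# Hypothetical PCA scores (PC1, PC2) for ten sample units
scores = [(0, 0), (1, 0), (0, 1), (5, 5), (5, 6),
          (10, 10), (2, 2), (9, 1), (1, 9), (6, 5)]
n_cal = round(0.7 * len(scores))          # 70% of samples for calibration
calibration = kennard_stone(scores, n_cal)
prediction = [i for i in range(len(scores)) if i not in calibration]
```

With the 267 sample units of the study, the same 70% rule gives `round(0.7 * 267) = 187` calibration units and 80 prediction units.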

Prediction of olive chemical characteristics based on spectral data was performed by applying partial least squares (PLS) regression to the calibration set of each spectral matrix (187 sample units × 1686 variables for the FT-NIR systems; 187 sample units × 1647 variables for the Vis/NIR equipment) using the nonlinear iterative partial least squares (NIPALS) algorithm. Different pre-treatments of spectral data were tested: SNV, first derivative (d1; Savitzky–Golay algorithm, second order polynomial, 11-point window), which allows removal of baseline offset [31], and their combination. After calibration, the models were validated internally through cross-validation (Venetian blind, 10 cancellation segments). The number of components to be considered for each model was determined based on the plot of calibration and cross-validation errors as a function of the number of latent variables (LVs). The optimal number of LVs was chosen as the number that minimized the cross-validation error. Afterwards, the models were externally validated by independently using the prediction set previously created with KS. Model performance was evaluated in terms of determination coefficients for calibration (R<sup>2</sup> cal), cross-validation (R<sup>2</sup> cv), and prediction (R<sup>2</sup> pred), as well as by root mean square error of calibration (RMSEC), cross-validation (RMSECV), and prediction (RMSEP), and standard error of prediction (SEP).
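Three building blocks of this workflow can be illustrated with a short sketch: the 11-point Savitzky–Golay first derivative, the Venetian blind segment assignment, and the prediction error statistics. The code is a hedged illustration rather than the software routine used in the study. It assumes evenly spaced spectral points, the common definitions of RMSEP, bias-corrected SEP and R<sup>2</sup> (1 minus the ratio of residual to total sum of squares), and a bias sign convention of predicted minus reference; the software may use different conventions.

```python
import math


def savgol_first_derivative(y, half_window=5):
    """Savitzky-Golay first derivative, quadratic fit, symmetric window.

    For a first derivative with a first- or second-order polynomial the
    weights are j / sum(j^2). Unit spacing is assumed and edge points are dropped.
    """
    offsets = range(-half_window, half_window + 1)
    norm = sum(j * j for j in offsets)
    return [
        sum(j * y[i + j] for j in offsets) / norm
        for i in range(half_window, len(y) - half_window)
    ]


def venetian_blind_segments(n_samples, n_segments=10):
    """Assign sample i to cross-validation segment i mod n_segments."""
    return [
        [i for i in range(n_samples) if i % n_segments == s]
        for s in range(n_segments)
    ]


def prediction_metrics(reference, predicted):
    """Bias, RMSEP, bias-corrected SEP and R2 for one set of predictions."""
    n = len(reference)
    errors = [p - r for r, p in zip(reference, predicted)]
    bias = sum(errors) / n
    rmsep = math.sqrt(sum(e * e for e in errors) / n)
    sep = math.sqrt(sum((e - bias) ** 2 for e in errors) / (n - 1))
    mean_ref = sum(reference) / n
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    r2 = 1 - sum(e * e for e in errors) / ss_tot
    return {"bias": bias, "RMSEP": rmsep, "SEP": sep, "R2": r2}


# Hypothetical moisture values (%) used only to exercise the function
metrics = prediction_metrics([60, 62, 64, 66, 68], [61, 62, 65, 66, 69])
segments = venetian_blind_segments(187)   # calibration set size from the text
```

In a full cross-validation loop, each of the ten segments would be left out in turn, the PLS model refitted on the remaining samples, and the errors pooled to obtain RMSECV for each number of latent variables.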

Prediction performances of the models obtained by the three spectral acquisition systems were compared by different approaches: (i) comparison of intermediate precisions expressed as standard error of laboratory (SEL); (ii) comparison of SEP with SEL of reference analyses; (iii) statistical tests proposed in the scientific literature [32,33]. SEL of the reference analyses and NIRS acquisition systems was calculated as follows [34]:

$$SEL = \sqrt{\frac{\sum_{1}^{m} \left(x_1 - x_2\right)^2}{m}}$$

where *m* is the number of olive samples and *x*<sub>1</sub> − *x*<sub>2</sub> is the absolute value of the difference between replicate results. In the third approach (i.e., statistical tests), first, the model biases, i.e., differences between the reference method results and those of the models predicting the chemical parameters, were compared by a *t* confidence interval for paired samples at the 95% confidence level. The null hypothesis (H<sub>0</sub>) states that model biases are not different. If the calculated Fisher value is higher than the F critical value, H<sub>0</sub> is rejected and the hypothesis H<sub>1</sub> is accepted (i.e., differences between models are significant) [32]. Furthermore, a pairwise comparison of the model standard deviations was performed by calculating the correlation coefficient (r) between each two sets of prediction errors. Then, the K index is calculated by the following equation:

$$K = 1 + \frac{2\left(1 - r^2\right) t^2_{n-2,\,0.025}}{n - 2} \tag{1}$$

where *t*<sub>n−2,0.025</sub> is the upper 2.5% point of the *t* distribution on n − 2 degrees of freedom. Subsequently, the L index is calculated as follows [33]:

$$L = \sqrt{K + \sqrt{K^2 - 1}} \tag{2}$$

Then, the 95% confidence interval for the ratio of the standard deviations (L-lower and L-upper limits) was calculated. If the L interval includes 1, the standard deviations are not significantly different (*p* > 0.05). The model comparison was performed in MATLAB environment (v. R2017b, The MathWorks, Inc., Natick, MA, USA).
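The SEL formula and the K and L indices of Equations (1) and (2) can be sketched in a few lines. This is an illustrative Python version, not the MATLAB code used in the study. It assumes that the interval limits are obtained by dividing and multiplying the ratio of the two error standard deviations by L, which is the form commonly given for this test; the text does not spell out the limits. The critical *t* value is passed in by the user (for example from statistical tables), and the toy prediction errors are hypothetical.

```python
import math
import statistics


def sel(replicate_1, replicate_2):
    """Standard error of laboratory from paired replicate results."""
    m = len(replicate_1)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(replicate_1, replicate_2)) / m)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def sd_ratio_interval(errors_a, errors_b, t_crit):
    """Return (lower, ratio, upper) for the ratio of two error standard deviations.

    t_crit is the upper 2.5% point of the t distribution on n - 2 degrees of freedom.
    """
    n = len(errors_a)
    r = pearson_r(errors_a, errors_b)
    # Equation (1); r^2 is capped at 1 to guard against rounding error
    k = 1 + 2 * (1 - min(r * r, 1.0)) * t_crit ** 2 / (n - 2)
    # Equation (2)
    l_index = math.sqrt(k + math.sqrt(k * k - 1))
    ratio = statistics.stdev(errors_a) / statistics.stdev(errors_b)
    # Assumed form of the limits: ratio / L and ratio * L
    return ratio / l_index, ratio, ratio * l_index


# Hypothetical prediction errors of two models on the same ten samples
errors_a = [0.5, -0.3, 0.8, -0.6, 0.2, -0.1, 0.4, -0.7, 0.3, -0.2]
errors_b = [0.9, -0.5, 1.1, -1.0, 0.1, -0.4, 0.8, -1.2, 0.6, -0.3]
low, ratio, high = sd_ratio_interval(errors_a, errors_b, t_crit=2.306)  # t for 8 d.f.
different = not (low <= 1.0 <= high)   # True if the interval excludes 1
```

Because the two sets of errors come from the same samples, the width of the interval depends on their correlation: the more strongly the errors are correlated, the closer K and L are to 1 and the narrower the interval becomes.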

#### **3. Results and Discussion**

#### *3.1. Chemical Parameters*

Descriptive statistics of the chemical variables are presented in Figure 1 as box and whisker plots. The box lines represent the first and third quartiles and the median. The mean value is indicated by a cross sign. Whiskers correspond to the minimum and maximum measured values. Genetic, environmental, and cultivation factors affect olive composition, which changes during growth together with the drupe weight [5]. Indeed, the tested cultivars and the different ripening stages and crop seasons accounted for a wide range of variability in all the chemical parameters. This is an important point for the development of prediction models useful for different production sites. Variation ranges of the chemical parameters for the different olive cultivars are reported in Table S1.
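The quantities drawn in such a box and whisker plot can be computed as in the minimal sketch below. The quartile interpolation method used for Figure 1 is not stated, so the "inclusive" method of the Python standard library is an assumption, and the moisture values are hypothetical.

```python
import statistics


def box_plot_summary(values):
    """Minimum, quartiles, median, maximum and mean of one chemical variable."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),        # lower whisker
        "q1": q1,                  # lower box line
        "median": median,          # middle box line
        "q3": q3,                  # upper box line
        "max": max(values),        # upper whisker
        "mean": statistics.fmean(values),  # cross sign
    }


# Hypothetical moisture contents (%) for a handful of sample units
moisture = [39.3, 55.1, 60.4, 63.0, 66.8, 71.2, 87.2]
summary = box_plot_summary(moisture)
```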

**Figure 1.** Box and whisker plots showing the descriptive statistics for the chemical variables tested on olive drupes. TPC: total phenol content; GA: gallic acid equivalent; DPPH•: radical 2,2-diphenyl-1-picrylhydrazyl; inhib.: inhibition.

Moisture represents the main constituent alongside oil. In the considered drupes, moisture content ranged from 39.3 to 87.2%. The obtained results agree with previously published data [18,35], considering that the moisture mean value was 63.3%, while the highest values (>80%) were obtained only in three out of thirteen cultivars, all from Calabria region. Excluding those three cultivars, the maximum value for moisture was 73.7%.

Commonly, olives intended for oil production have approximately 20% oil [36]. The samples here considered had a wide range of oil content (1.9–26.0%), suggesting the high influence of cultivar and ripening degree on this parameter. A general increase in oil content ranging from 2 to 12% was observed over ripening, depending on the considered cultivar.

TPC is an approximate estimation of total phenolic acids, phenolic alcohols, flavonoids, and secoiridoids in olive drupes. These compounds confer the bitter taste and pungent sensation on olive oils and are responsible for the well-known antioxidant properties. TPC values of the samples had a wide range of variation (2.5–60.6 gGA/kg), with the highest levels (>35 gGA/kg) found in three cultivars from Sardinia region. The antioxidant activity too was very different in the various samples, ranging from 2.4 to 165.0% inhibition/mg. Unexpectedly, the highest values (>70% inhibition/mg) were not found in the olives with the highest TPC, but in two cultivars from the Apulia region.

#### *3.2. Spectral Features*

Figure 2 shows the spectra of the olives obtained from the three acquisition systems. Visual features and patterns of the spectra conform with those previously reported for intact olive drupes [37,38].

**Figure 2.** Spectra of olive drupes acquired with: (**a**) FT-NIR integrating sphere; (**b**) FT-NIR fiber-optic probe; (**c**) portable Vis/NIR device.

Aside from the visual differences in band intensities among samples, FT-NIR spectra from the integrating sphere and the fiber-optic probe (Figure 2a,b) are quite similar, with the latter exhibiting higher absorbances in most of the observable peaks.

The low-absorbance band around 8600 cm<sup>−1</sup> represents combined symmetric and asymmetric OH stretching and bending vibrations. This is followed by the second overtone of CH stretching vibrations at 8300 cm<sup>−1</sup>, which corresponds to methyl (-CH3), methylene (-CH2), and olefin (-CH=CH-) bonds [37]. The high water content of the olive drupes (39–87%) explains the two absorption bands at 7500–6100 and 5400–4500 cm<sup>−1</sup>. These bands are designated as the combination of first overtone of symmetric and asymmetric OH-bending and OH-stretching bands (6900 cm<sup>−1</sup>) and combined OH-bending and OH-stretching bands (5200 cm<sup>−1</sup>), respectively [39]. Similarly to the second overtone of CH stretching vibrations at 8300 cm<sup>−1</sup>, the two bands at 5800 and 5650 cm<sup>−1</sup> represent the first overtone of CH-stretching vibrations present in the same CH3, CH2, and CH=CH functional groups. At the far end of the FT-NIR spectral range, two peaks at 4335 and 4262 cm<sup>−1</sup> represent CH and CH2 s overtones, respectively [35]. However, the intermediate bands between the overtones (i.e., 8600, 5800–5650, and 4350–4250 cm<sup>−1</sup>) have been attributed to the oil content of the drupes [40]. Regarding olive fruit phenols, there are no reported NIR correlated bands in the literature. However, a previous study suggested that some regions (i.e., 8700–8300 and 5800–5650 cm<sup>−1</sup>) are correlated with TPC of olives [19].

In the case of Vis/NIR spectra (Figure 2c), more peak variations among samples were observed, especially within the visible (550–680 nm) and near-infrared (700–790 nm) regions. The changes around 550–680 nm correspond to some varying pigment indices. Specifically, the peak around 540 nm has been associated with anthocyanin, while that at 680 nm has been linked to chlorophyll [41]. Thus, changes in reflectance along these peaks may be due to maturation differences among the drupes. Other parameters, such as soluble solids, pH, and firmness, have been implicated within these regions in pears, especially around 340–740 nm [42]. Changes in the two absorption peaks around 750 and 850 nm could be assigned to the third overtone of H2O and C-H functional group, respectively [43].

#### *3.3. Principal Component Analysis*

Figure 3 shows the score and loading plots of the PCA model built on the merged chemical and spectral database. The first two principal components (PCs) represent 59% of total data variance. Applying the KS algorithm after PCA allowed the selection of evenly distributed samples for the calibration and prediction sets, highlighted with different colors in the score plot of Figure 3a. A few samples appeared to be outliers, but they were not removed, to avoid presuming that they would adversely affect the model. In any case, the KS data splitting algorithm retained as much variability as possible within the calibration and prediction sets, which is a prerequisite for model robustness and validity in prediction. The loading plot (Figure 3b) shows a balanced contribution of both the chemical parameters and the three spectral ranges to sample distribution and consequently to the dataset partitioning.

**Figure 3.** PCA results: (**a**) score plot showing the distribution of calibration (blue) and prediction (orange) set samples selected by Kennard-Stone algorithm applied on the merged chemical and spectral dataset of olive drupes; (**b**) loading plot of PC1 (blue) and PC2 (orange).
