#### *2.1. Pollutant Exposure Experiments*

Because experience with in-field spectroscopy for estimating heavy metal contents in grapevine leaves is lacking, we chose to conduct this research in a laboratory-controlled environment. Heavy metal stress treatments were therefore applied to grapevine seedlings. Five treatments with varying levels of Cu, Zn, Pb, Cr, and Cd were considered, and each treatment was replicated four times (84 grapevine seedlings were examined in total).

Note that the objective of this experiment was not to determine the sensitivity of grapevine to pollutants; we only intended to raise the heavy metal content of the grapevines so that their spectra could be compared with those of healthy leaf samples. The common grapevine variety in the study area is *Vitis vinifera* cv. Askari; all seedlings belonged to this variety to eliminate the effect of varietal differences on spectral characteristics [43,48]. Experiments were conducted outdoors in full sun between March and September 2018. Each seedling was placed in an individual plastic pot (length and width 25 cm × 10 cm) and randomly assigned to one of the studied treatments. The seedlings were two years old and were between 20 and 30 cm tall at the beginning of the experiment. All seedlings were kept under the same conditions of soil, pot size, sunlight exposure, watering, temperature, and humidity.

Figure 1 shows a schematic of the applied treatments. The first treatment served as a control to monitor the potential transfer of heavy metals from soil, water, and air to the grapevine seedlings. In the second treatment, the seedlings were stressed with the maximum allowed level (MAL) of Cu, Zn, Pb, Cr, and Cd in irrigation water. In the third, fourth, and fifth treatments, the contamination levels were raised to two, three, and four times the metal MALs in irrigation water, respectively. Stress was applied in treatments 2–5 by dissolving the metal salts (nitrate form) in irrigation water. Metal salts are highly soluble, which facilitates the absorption of the metals by plant organs [34]. According to the Iranian Water Quality Standard (IWQS), the MALs for Cu, Zn, Pb, Cr, and Cd in irrigation water are 200, 2000, 100, and 10 µg/L, respectively. Seedlings were monitored for a period of seven months and were stressed once each month (seven stress applications in total). At the end of the stress period and before the beginning of the fall season (September 2018), a spectrophotometric analysis of the grapevine seedling leaves was performed.
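The treatment layout can be sketched in a few lines of Python. This is only one reading of the design that happens to reproduce the reported total of 84 seedlings: it assumes a single shared control group of four pots plus five metals × four stress levels × four replicates. The names `METALS`, `dose`, and `build_design`, and the MAL value used in the usage line, are hypothetical placeholders rather than values from the experiment.

```python
from itertools import product

METALS = ["Cu", "Zn", "Pb", "Cr", "Cd"]
STRESS_LEVELS = [1, 2, 3, 4]  # multiples of the MAL (treatments 2-5)
REPLICATES = 4


def dose(mal, level):
    """Concentration in irrigation water for a stress level (level 0 = control)."""
    return mal * level


def build_design():
    """List every pot as (metal, level, replicate); assumes one shared control group."""
    reps = range(1, REPLICATES + 1)
    pots = [("control", 0, r) for r in reps]
    pots += [(m, lvl, r) for m, lvl, r in product(METALS, STRESS_LEVELS, reps)]
    return pots


design = build_design()
print(len(design))     # number of pots under this assumed layout
print(dose(100.0, 3))  # placeholder MAL of 100 units at three times the MAL
```

If the control had instead been replicated separately for each metal, the pot count would differ, so the shared-control assumption should be checked against Figure 1.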

**Figure 1.** Schematic design of the treatment for each studied metal (C: control, L1 to L4: level 1 to level 4 stressed, MAL: maximum allowed level) and image of applied grapevine seedling pots.

#### *2.2. Spectra Acquisition*

At least five leaves were collected from each seedling pot for spectroscopy measurements (420 spectral samples in total), and the reflectance spectra were then measured pot by pot. The grapevine foliar spectral reflectance was measured with an ASD FieldSpec 3 spectroradiometer over its full range (350–2500 nm). This instrument comprises three separate spectrometers (first: 350–975 nm, second: 976–1770 nm, and third: 1771–2500 nm). The ASD spectral resolutions in the ranges of 350–1000 nm and 1000–2500 nm are 3 and 10 nm, with sampling intervals of 1.4 and 2 nm, respectively. Following Kumar et al. [49], the electromagnetic spectrum between 350 and 2500 nm can be divided into four regions: visible (VIS), red-edge (RDE), near-infrared (NIR), and mid-infrared (MIR), with ranges of 350–700, 680–750, 700–1300, and 1300–2500 nm, respectively. We performed the spectroscopy experiment in a fully dark room to reduce the effects of wind, water vapor, temperature, and other environmental disturbances [12]. Each spectral sample was recorded in 100 automatic replicates. We then used ViewSpec Pro (version 6.0) to convert the spectral curves into text files for analysis in statistical software.
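As a small illustration of the spectral regions listed above, the following Python sketch labels a wavelength with every region whose range contains it. The region bounds are taken from the text; treating the bounds as inclusive and the function name `regions_for` are assumptions made for this example. Because the ranges overlap, a wavelength can belong to more than one region.

```python
# Region bounds in nm, as listed in the text (after Kumar et al. [49]).
REGIONS = {
    "VIS": (350, 700),
    "RDE": (680, 750),
    "NIR": (700, 1300),
    "MIR": (1300, 2500),
}

# One value per nanometre over the instrument's full range.
WAVELENGTHS = list(range(350, 2501))


def regions_for(wavelength_nm):
    """Return every region whose (inclusive) range contains the wavelength."""
    return [name for name, (lo, hi) in REGIONS.items() if lo <= wavelength_nm <= hi]


print(len(WAVELENGTHS))   # 2151 bands at 1 nm steps
print(regions_for(550))   # ['VIS']
print(regions_for(720))   # ['RDE', 'NIR']
```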

For each sample, the reflectance spectrum was recorded at 2151 wavelengths (350–2500 nm), which yields a large amount of data, not all of which may be useful for the purpose of the study. Therefore, 32 spectral indices were calculated to evaluate their ability to estimate heavy metal contents. Although no spectral indices exist that are specific to heavy metal contamination [1], the indices used in this study were calculated following the method of Mirzaei et al. [12]. Table 1 lists these indices, which have shown sensitivity to small differences in plant characteristics in previous studies [12,46,50].
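To make the idea of a spectral index concrete, the sketch below computes two generic index forms, a normalized difference and a simple ratio, from a reflectance spectrum stored as a list of 2151 values at 1 nm steps. The band positions (800 and 670 nm) and the function names are illustrative assumptions; they are not necessarily among the 32 indices of Table 1.

```python
FIRST_WAVELENGTH_NM = 350  # first band of the 350-2500 nm spectrum (1 nm steps)


def reflectance_at(spectrum, wavelength_nm):
    """Look up reflectance R at a wavelength, given one value per nanometre."""
    return spectrum[wavelength_nm - FIRST_WAVELENGTH_NM]


def normalized_difference(spectrum, band_a_nm, band_b_nm):
    """(R_a - R_b) / (R_a + R_b), the generic normalized-difference form."""
    ra = reflectance_at(spectrum, band_a_nm)
    rb = reflectance_at(spectrum, band_b_nm)
    return (ra - rb) / (ra + rb)


def simple_ratio(spectrum, band_a_nm, band_b_nm):
    """R_a / R_b, the generic simple-ratio form."""
    return reflectance_at(spectrum, band_a_nm) / reflectance_at(spectrum, band_b_nm)


# Synthetic spectrum: low reflectance everywhere except a high value at 800 nm.
spectrum = [0.05] * 2151
spectrum[800 - FIRST_WAVELENGTH_NM] = 0.45
print(normalized_difference(spectrum, 800, 670))
print(simple_ratio(spectrum, 800, 670))
```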


**Table 1.** Characteristics of studied hyperspectral indices [13,46].

R: Reflectance.

#### *2.3. Heavy Metal Laboratory Analysis*

After the foliar reflectance spectra had been obtained, the leaves of each pot were placed in polyethylene bags and transferred separately to the laboratory. The leaf samples were dried in an oven at 45 °C for 24 h to achieve a constant weight [16]. The dried samples were powdered with a stainless-steel mill and stored for further analysis. We then digested one gram of each grapevine sample with HNO3 + HClO4 (3:1 v/v) for about 4 h at a low temperature (about 40 °C) [51]. All digested samples were then filtered and diluted to 25 mL. Finally, a graphite furnace atomic absorption spectrophotometer (GF-AAS; Analytik Jena, Germany) was used to analyze all samples in triplicate. Heavy metal concentrations were expressed in mg/kg dry weight (DW). The instrument detection limits for Zn, Cu, Pb, Cr, and Cd were 0.008, 0.025, 0.01, 0.04, and 0.009 mg/kg, respectively. The relative standard deviation of the analyses was less than 9%. To evaluate the accuracy of the analytical technique, a spike-and-recovery analysis was performed: previously analyzed samples were spiked with varying amounts of standard metal solutions and homogenized. Recoveries ranged from 90% to 108% of the spiked amount [52].
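The quantities reported in this subsection follow standard definitions, which the sketch below spells out. It assumes the usual conversion from a digest concentration (mg/L) to a dry-weight concentration (mg/kg) using the 25 mL final volume and 1 g sample mass given above, together with the conventional formulas for relative standard deviation and spike recovery. The function names and the numbers in the usage lines are hypothetical, not measurements from the study.

```python
from statistics import mean, stdev


def dry_weight_concentration(c_digest_mg_per_l, volume_ml=25.0, mass_g=1.0):
    """Convert a digest concentration (mg/L) to mg/kg dry weight."""
    return c_digest_mg_per_l * (volume_ml / 1000.0) / (mass_g / 1000.0)


def relative_std_percent(replicates):
    """Relative standard deviation (%) of replicate readings."""
    return 100.0 * stdev(replicates) / mean(replicates)


def spike_recovery_percent(spiked, unspiked, added):
    """Percentage of the added standard that is recovered in the spiked sample."""
    return 100.0 * (spiked - unspiked) / added


# Hypothetical readings, for illustration only.
print(dry_weight_concentration(0.4))            # mg/kg DW for a 0.4 mg/L digest
print(relative_std_percent([9.8, 10.0, 10.2]))  # RSD of a triplicate
print(spike_recovery_percent(14.5, 10.0, 5.0))  # recovery of a 5 mg/kg spike
```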

#### *2.4. Feature Selection/Partial Least Squares (PLS)*

In summary, the dependent variables were the contents of Cd, Cr, Cu, Pb, and Zn in grapevine leaves, while the independent variables were wavelengths (count: 2151) and spectral indices (count: 32). However, a large number of independent variables can degrade the performance of models relating spectral data to metal contents. To mitigate this, we used a feature selection process to identify the optimal features (wavelengths and spectral indices) for estimating the concentration of each metal individually. In addition, before statistical operations are applied, it is recommended to scale each variable linearly to the same standard range, especially for machine learning methods [40,53]. The values of wavelengths, spectral indices, and heavy metal concentrations were therefore scaled to the range between 0 and 1, as follows:

$$N_i = \frac{x_i - x_{\min}}{x_{\max} - x_{\min}} \tag{1}$$

where $N_i$ is the normalized value, $x_i$ is the original value, and $x_{\min}$ and $x_{\max}$ are the minimum and maximum values of each variable, respectively.
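Equation (1) can be sketched with NumPy as a column-wise operation, where each column is one variable (a wavelength, an index, or a metal concentration). The function name `min_max_scale` and the handling of constant columns (mapped to zero to avoid division by zero) are choices made for this example, not details given in the text.

```python
import numpy as np


def min_max_scale(X):
    """Scale each column of X to the range [0, 1] following Equation (1)."""
    X = np.asarray(X, dtype=float)
    x_min = X.min(axis=0)
    x_max = X.max(axis=0)
    span = x_max - x_min
    # Assumption: a constant column has zero span; map it to 0 rather than divide by 0.
    span = np.where(span == 0, 1.0, span)
    return (X - x_min) / span


data = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
print(min_max_scale(data))
```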

Given the high-dimensional spectral dataset, multivariate statistical analysis is an appropriate way to identify the optimal features for estimating each metal. PLS is a robust and well-known method for analyzing hyperspectral data that has shown acceptable performance in many studies [12,40]. Instead of using the existing inputs directly, PLS generates new components by least-squares regression. Unlike principal component analysis (PCA), PLS takes the response variables into account in the data reduction process [54]. Other advantages of PLS are that it fits a regression model between input and output variables, copes with highly collinear spectral data, and is computationally fast. The components developed by PLS are capable of explaining the common variance through a simpler structure. Accordingly, the importance of each input variable is given by its factor loading in each component [12]. We therefore selected the optimal independent variables (wavelengths or spectral indices) as those with the maximum factor loading in each developed PLS component. These variables were considered the most representative of the corresponding components. Based on the PLS results, the optimal wavelengths and indices were identified and passed to the next step (modelling). Wold et al. [55] provide more information about the assumptions and applications of PLS.
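The selection rule described above (keep, for each PLS component, the variable with the largest loading) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses a plain NIPALS algorithm for a single response (PLS1) written with NumPy, takes the largest absolute X-loading as the "maximum factor loading", and runs on synthetic data. The function names and the number of components are hypothetical.

```python
import numpy as np


def pls1_x_loadings(X, y, n_components):
    """X-loading matrix (n_features x n_components) of a PLS1 model via NIPALS."""
    Xk = np.asarray(X, dtype=float)
    yk = np.asarray(y, dtype=float)
    Xk = Xk - Xk.mean(axis=0)  # centre the predictors
    yk = yk - yk.mean()        # centre the response
    loadings = []
    for _ in range(n_components):
        w = Xk.T @ yk                 # weights: covariance of each variable with y
        w = w / np.linalg.norm(w)
        t = Xk @ w                    # scores of this component
        tt = t @ t
        p = Xk.T @ t / tt             # X loadings of this component
        q = (yk @ t) / tt             # y loading of this component
        Xk = Xk - np.outer(t, p)      # deflate X
        yk = yk - q * t               # deflate y
        loadings.append(p)
    return np.column_stack(loadings)


def select_by_max_loading(loadings, names):
    """For each component, keep the variable with the largest absolute loading."""
    picks = []
    for k in range(loadings.shape[1]):
        j = int(np.argmax(np.abs(loadings[:, k])))
        if names[j] not in picks:
            picks.append(names[j])
    return picks


# Synthetic example: the response depends almost entirely on variable 3.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = 5.0 * X[:, 3] + 0.1 * rng.normal(size=200)
names = [f"var{j}" for j in range(10)]
P = pls1_x_loadings(X, y, n_components=2)
print(select_by_max_loading(P, names))
```

In practice the inputs would be the scaled wavelengths or indices and one metal concentration at a time, and the number of components would be chosen by validation rather than fixed in advance.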

#### *2.5. Modelling the Relationship Between Spectral Data and Heavy Metal Contents*

After the optimal wavelengths and relevant indices had been identified by PLS, two modelling algorithms, support vector machine (SVM) and multiple linear regression (MLR), were applied to estimate heavy metal concentrations from the hyperspectral data. To assess the estimation performance of each model, two goodness-of-fit indicators were used: the coefficient of determination (R²) and the root mean squared error (RMSE) [40]. All data obtained in this study were randomly split into two sets: 70% for training and 30% for testing. The performance of each developed model was then reported separately for the training and testing sets.
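The evaluation procedure can be sketched as follows, assuming a random 70/30 split, an ordinary least-squares fit for the MLR model, and the standard definitions of R² and RMSE. Only MLR is shown; an SVM regressor from a machine learning library would be evaluated with the same split and metrics. All function names, the random seed, and the synthetic data are assumptions made for this example.

```python
import numpy as np


def split_70_30(X, y, seed=0):
    """Randomly split the samples into 70% training and 30% testing data."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    n_train = int(round(0.7 * len(y)))
    train, test = idx[:n_train], idx[n_train:]
    return X[train], X[test], y[train], y[test]


def fit_mlr(X, y):
    """Least-squares coefficients of y = b0 + b1*x1 + ... (intercept first)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_mlr(coef, X):
    return coef[0] + X @ coef[1:]


def r2(y_true, y_pred):
    """Coefficient of determination."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# Synthetic data standing in for selected features and one metal concentration.
rng = np.random.default_rng(1)
X = rng.uniform(size=(100, 3))
y = 0.2 + X @ np.array([0.5, -0.3, 0.1]) + 0.01 * rng.normal(size=100)

X_tr, X_te, y_tr, y_te = split_70_30(X, y)
coef = fit_mlr(X_tr, y_tr)
for label, Xs, ys in [("train", X_tr, y_tr), ("test", X_te, y_te)]:
    pred = predict_mlr(coef, Xs)
    print(label, r2(ys, pred), rmse(ys, pred))
```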
