*3.2. Modeling the Hyperspectral Wavelengths Through Artificial Neural Network*

The wavelengths were modeled by different machine learning algorithms from the 14th day of the experiment. The ANN model presented here classified the groups better than any of the other algorithms from the first day of measurement (Table 3). In general, the absorbance values offered better accuracy than the reflectance values. The accuracy and other metrics improved with each measurement, indicating an increasing difference between the groups over time.


**Table 3.** Accuracy metrics for each of the machine learning algorithms evaluated in this study.

The other machine learning algorithms also returned similar classification accuracies. Logistic regression presented high accuracy in the first three measurements but declined on the final day. This behavior was noted for the other algorithms as well. The ANN not only maintained consistency over time but also presented its highest accuracy on the final day. Another observation is that, for all machine learning methods applied here, the absorbance values were more efficient in discriminating the plant groups in most of the classifications.

To visualize the differences between the groups, ROC curves for the last day of measurement were used (Figure 6). The ROC curves suggest that the ANN was better at differentiating each of the three groups individually, while the other algorithms performed worse under specific conditions. The ANN also returned a lower false-positive rate than all of the other machine learning algorithms. The confusion matrix of the final measurement day also shows that the ANN had more difficulty predicting the control group (89.6%) than the other groups (94.4% and 94.1% for bacteria and stress, respectively).

**Figure 6.** Receiver operating characteristic (ROC) curve comparison for each group classification and the ANN confusion matrix.
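The per-group percentages and false-positive rates discussed above can be derived from a multi-class confusion matrix by treating each group as a one-vs-rest problem. The following is a minimal sketch of that calculation; the function name `one_vs_rest_rates` and the counts in `matrix` are hypothetical and chosen only for illustration, so they do not reproduce the values reported in this study.

```python
from typing import Dict, List


def one_vs_rest_rates(matrix: List[List[int]],
                      labels: List[str]) -> Dict[str, Dict[str, float]]:
    """Per-class recall and false-positive rate from a confusion matrix.

    Rows are true classes and columns are predicted classes.
    """
    total = sum(sum(row) for row in matrix)
    rates: Dict[str, Dict[str, float]] = {}
    for i, label in enumerate(labels):
        tp = matrix[i][i]                         # correctly predicted as class i
        fn = sum(matrix[i]) - tp                  # class i predicted as another class
        fp = sum(row[i] for row in matrix) - tp   # other classes predicted as class i
        tn = total - tp - fn - fp                 # everything else
        rates[label] = {
            "recall": tp / (tp + fn) if (tp + fn) else 0.0,
            "fpr": fp / (fp + tn) if (fp + tn) else 0.0,
        }
    return rates


# Illustrative counts only; NOT the confusion matrix of this study.
labels = ["control", "bacteria", "stress"]
matrix = [
    [18, 1, 1],   # true control
    [1, 19, 0],   # true bacteria
    [0, 1, 19],   # true stress
]

rates = one_vs_rest_rates(matrix, labels)
for label in labels:
    print(label, round(rates[label]["recall"], 3), round(rates[label]["fpr"], 3))
```

Sweeping a decision threshold over the class scores and recomputing these two rates at each step would trace one point of a ROC curve per threshold.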

Based on this classification, the gain ratio and relief-F were used to evaluate the contribution of individual wavelengths to the ANN model (Figure 7). These metrics suggest that the stress group differed more from the control group than the bacteria group did, making it easily distinguishable by the algorithm. However, there appears to be a larger discrepancy between the bacteria group and the control group in the blue region (380 to 440 nm). Nonetheless, the near-infrared region and the 660 to 730 nm region appear to contribute more to the stress group response.

**Figure 7.** Individual contribution of the wavelengths to modeling the water-stress-induced lettuces in relation to the control group.
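As a rough illustration of how a gain ratio score can rank a single wavelength, the sketch below divides the information gain of a discretized band by its split information. It is written under stated assumptions: the equal-width binning, the function names `entropy` and `gain_ratio`, and the sample values are all hypothetical, and the discretization actually applied in this study may differ.

```python
import math
from collections import Counter
from typing import Sequence


def entropy(items: Sequence) -> float:
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(items)
    return -sum((c / n) * math.log2(c / n) for c in Counter(items).values())


def gain_ratio(values: Sequence[float], labels: Sequence[str], bins: int = 2) -> float:
    """Gain ratio of one band after equal-width binning (an assumed discretization)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0          # avoid division by zero for a flat band
    binned = [min(int((v - lo) / width), bins - 1) for v in values]

    n = len(labels)
    conditional = 0.0                        # class entropy remaining after the split
    for b in set(binned):
        subset = [lab for lab, bb in zip(labels, binned) if bb == b]
        conditional += len(subset) / n * entropy(subset)

    info_gain = entropy(labels) - conditional
    split_info = entropy(binned)             # penalizes splits into many small bins
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical values for two bands; NOT measurements from this study.
groups = ["control"] * 4 + ["stress"] * 4
informative_band = [0.10, 0.12, 0.11, 0.13, 0.30, 0.32, 0.31, 0.29]
uninformative_band = [0.10, 0.30, 0.12, 0.31, 0.11, 0.29, 0.13, 0.32]

score_informative = gain_ratio(informative_band, groups)
score_uninformative = gain_ratio(uninformative_band, groups)
print(score_informative, score_uninformative)
```

A band whose values separate the two groups cleanly scores close to 1, while a band whose values are spread evenly across both groups scores close to 0.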

#### **4. Discussion**

This study evaluated the spectral response of lettuce subjected to water stress while modeling its effects with an ANN and other machine learning algorithms. For that, we separated our data into three groups: control, stress, and bacteria. The rhizobacteria were included to approximate conditions found in greenhouses or horticulture models, as this bacterium is commonly present in soil and commercial seeds [29]. The bacteria group also reinforces our test, as it can act as a middle ground between the stress group and the control group. We first evaluated the biological and physical response to the induced stress and later compared it with the hyperspectral measurements. Lastly, we used the ANN and other machine learning algorithms to classify the groups solely by their spectral response.

Our results indicate that the physiological response to water stress in early-stage lettuce is a reduction in leaf size and an increase in chlorophyll concentration. This behavior was evident both in the stress group and in the bacteria group, although with lower intensity in the latter. Changes in leaf pigmentation are noticeable in cases of plant stress [15]. However, an interesting observation is that the stress did not affect root weight in the bacteria group. This indicates a mitigating effect on this stress, although this was not indicated by the leaf analysis. In the mean spectral curves of each treatment (Figure 2), the amplitude between the curves is small in the red-edge region. The red-edge region is commonly known to indicate the presence of stress [23,34], and this may explain why the bacteria group did not differ much from the control group here.

Regarding the correlation between biophysical parameters and the wavelengths, absorbance wavelengths were more strongly correlated with most biophysical parameters. An important observation is that the strongest correlations were found for chlorophyll and leaf fresh weight (Figure 4 and Table 2). This relationship between absorbance levels and the biophysical parameters persisted throughout the experiment, including when each group's response was modeled with machine learning algorithms (Table 3). Still, the correlation was more pronounced in the green and near-infrared regions (Figure 3). This is also evident in the mean curves of each treatment (Figure 2), where the amplitude between the curves is smaller in the remaining blue, red, and red-edge regions.
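A band-by-band correlation of this kind can be sketched as follows. The example assumes Pearson's r, which may not be the coefficient used in this study; the function names `pearson_r` and `correlate_bands` and all sample values are hypothetical and serve only to show the shape of the calculation.

```python
import math
from typing import Dict, List, Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0                      # a constant series carries no correlation
    return cov / (sd_x * sd_y)


def correlate_bands(spectra: Dict[int, List[float]],
                    trait: List[float]) -> Dict[int, float]:
    """Correlate each wavelength (key, in nm) with one biophysical parameter."""
    return {wavelength: pearson_r(values, trait)
            for wavelength, values in spectra.items()}


# Hypothetical absorbance values for four plants at two bands;
# NOT measurements from this study.
spectra = {
    550: [0.2, 0.4, 0.6, 0.8],
    680: [0.8, 0.6, 0.4, 0.2],
}
chlorophyll = [1.0, 2.0, 3.0, 4.0]      # hypothetical trait values

correlations = correlate_bands(spectra, chlorophyll)
print(correlations)
```

Repeating the call for reflectance and absorbance inputs, and for each biophysical parameter, would produce the kind of per-wavelength comparison described in this paragraph.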

The classification performed by the ANN in this study showed interesting results from the first day of measurement, when the lettuces had been stressed just one day before the actual measurement. This condition is important to mention, as it indicates how powerful hyperspectral analysis can be in conjunction with machine learning algorithms. From the evaluation metrics used in this study, it is evident that the ANN was better at distinguishing the three plant groups. Observing the phenomenon over time shows that the performance of the algorithms increased (Table 3). This can be explained by the increased distinction between the wavelengths of each group: as the stress progressed, the spectral behavior of the groups became distinct from each other. Because this study is unique in this regard, there is little literature to compare it with. Still, the accuracy found here is similar to or even higher than that obtained when modeling the effects of different stresses in plants [21,24–27].

Another contribution of this study is the evaluation of the performance of different algorithms for both reflectance and absorbance wavelengths. Absorbance curves were directly related to changes in biophysical parameters for all treatments (Figure 4 and Table 2). This persisted in the machine learning analysis, where the algorithms performed better at differentiating the three groups when using absorbance values. Thus, we recommend that these effects in lettuce preferably be modeled after converting reflectance to absorbance data. Another observation is that, when evaluating the performance of each algorithm over time, the ANN accuracy peaked on the last measurement day (92.7%), while the other algorithms decreased in performance (from the third to the fourth day of evaluation). This indicates how suitable the ANN was for modeling the water-stress effects in comparison to the others.
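The conversion formula is not restated in this section. A common definition in spectroscopy is the apparent absorbance, A = log10(1/R), where R is the reflectance factor between 0 and 1. The sketch below assumes that definition; the function name `reflectance_to_absorbance`, the `floor` parameter, and the sample values are hypothetical.

```python
import math
from typing import List


def reflectance_to_absorbance(reflectance: List[float],
                              floor: float = 1e-6) -> List[float]:
    """Apparent absorbance A = log10(1 / R) for reflectance factors in (0, 1].

    Values at or below zero are clipped to `floor` (an assumed safeguard)
    so that the logarithm stays defined.
    """
    return [-math.log10(max(r, floor)) for r in reflectance]


# Hypothetical reflectance factors; NOT measurements from this study.
sample_reflectance = [1.0, 0.5, 0.1, 0.01]
sample_absorbance = reflectance_to_absorbance(sample_reflectance)
print(sample_absorbance)
```

Because the transform is logarithmic, it stretches differences between low reflectance values, which is one plausible reason absorbance could separate the groups better in strongly absorbing regions.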

Lastly, the ANN showed high precision and recall values (Table 3) when classifying each group, as shown in Figure 6. The confusion matrix showed a small decrease in performance (89.6%) when differentiating the control group from the others. Regardless, the gain ratio and relief-F metrics (Figure 7) show how each individual wavelength contributed to the ANN model. In the gain ratio analysis, there is a predominance of wavelengths in the blue (380 to 440 nm), red (660 to 730 nm), and near-infrared (790 nm onwards) regions. Despite the similarity between the curves of both groups, there was a smaller amplitude difference in the blue region. This may indicate how much the blue region contributed to differentiating the effects of the stress on the bacteria group. The relief-F curves were similar, with this same region presenting an even higher value than for the group under stress. Chlorophyll absorbs strongly in the blue region, which may indicate how important the stress effect was in this spectral range. Nevertheless, the model also indicated greater contributions in the red, red-edge, and near-infrared regions, which corroborates the earlier results (Figure 3). Beyond testing other species and cultivars, future research could explore additional spectral regions such as the shortwave infrared (SWIR), which is not covered by the Fieldspec HandHeld ASD spectroradiometer.
