Water is fundamental to photosynthesis and nutrient transport in plants. Leaf water status directly affects plant growth and, in turn, biomass and yield. Understanding variations in leaf water potential and leaf water content is therefore essential for understanding plant growth, assessing plant water status, and managing plant water resources rationally [37,38]. Remote sensing allows vegetation water status to be monitored and assessed effectively through parameters such as LWC and EWT. These parameters indicate the physiological condition of vegetation in its natural environment, enabling rapid detection of drought and timely irrigation [39,40].
4.1. The Optimal Spectral Index under Different Screening Strategies
In this study, hyperspectral remote sensing was used to evaluate the empirical/semi-empirical index methods applied by previous researchers, focusing on the inversion of LWC and EWT, the key water parameters of C. camphora. Comparing four screening strategies across three algorithm models yielded the following findings:
The G1 group comprised empirical vegetation indices: eight indices closely related to plant water, chosen from the hundreds of vegetation indices available [23,24,25,26,27,28,29,30]. Among these, the water index (WI), which uses the bands at 900 nm and 970 nm, showed the highest correlation with LWC (R = 0.713). For EWT, the Red Edge Normalized Vegetation Index (RE-NDVI), which uses the bands at 705 nm and 750 nm, showed the highest correlation (R = 0.774). The other empirical vegetation indices performed less well. These findings agree with prior research by Penuelas [24], Alordzinu [41], Wang [25], Song [42], and others. Measuring plant reflectance at 900 nm and 970 nm through WI can improve the speed and precision of plant water assessment, which is valuable for on-site drought evaluation, while RE-NDVI is effective at capturing vegetation water stress. Together, these results confirm the strong predictive performance of WI and RE-NDVI for plant water estimation.
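As a minimal sketch of how the two best G1 indices are computed, the following assumes their commonly cited forms: WI as the ratio of reflectance at 900 nm to reflectance at 970 nm, and RE-NDVI as the normalized difference of the 750 nm and 705 nm bands. The function names and the reflectance values are hypothetical and are not data from this study.

```python
def water_index(r900, r970):
    """WI, assumed here to be the ratio R900 / R970."""
    return r900 / r970


def re_ndvi(r705, r750):
    """RE-NDVI, assumed here to be (R750 - R705) / (R750 + R705)."""
    return (r750 - r705) / (r750 + r705)


# Hypothetical leaf reflectance (fractions between 0 and 1), keyed by wavelength in nm.
leaf = {705: 0.18, 750: 0.45, 900: 0.50, 970: 0.48}

wi = water_index(leaf[900], leaf[970])
rendvi = re_ndvi(leaf[705], leaf[750])
print(f"WI = {wi:.3f}, RE-NDVI = {rendvi:.3f}")
```

Under this form, WI rises as the water absorption feature near 970 nm deepens relative to the 900 nm reference band.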
The G2 group comprised random dual-band vegetation indices, that is, indices built from any two bands in the measured range. Three fundamental dual-band indices (DVI, RVI, NDVI) were chosen first, and fractional-order differentiation, here at order one, was then applied to preprocess the original hyperspectral data. This yielded three first-order differential dual-band vegetation indices (DVI-FD, RVI-FD, NDVI-FD). Research has indicated that using various narrow-band spectral data or different band combinations can often refine indices further and exploit the spectral information more fully [30]. Although plant water prediction has been studied before, fractional differentials are infrequently used for hyperspectral data preprocessing. However, research by Tang [31], Xia [43], and others has demonstrated that fractional-order differentiation can significantly enhance the correlation between original hyperspectral data and selected parameters, ultimately improving model performance. Note that there is typically no linear relationship between the band and the fractional order, and significant band changes occur only when the fractional order is an integer [44]. In this study, first-order differential transformation effectively enhanced the correlation between the dual-band vegetation indices and the water content of C. camphora. The matrix plots in Figure 3 and Figure 4 show the correlation coefficients between leaf water content and all DVI, RVI, and NDVI band combinations across the 1481 bands spanning the 350–1830 nm range. These plots show that first-order differentiation can effectively segment the original spectrum and accentuate the details of its spectral features. The dual-band index with the highest correlation with LWC was DVI (R = 0.739), with bands at 734 nm and 956 nm. After first-order differential transformation (DVI-FD), R increased to 0.766, with the bands shifting to 1009 nm and 774 nm. Similarly, the dual-band index with the highest correlation with EWT was also DVI (R = 0.820), with bands at 700 nm and 1167 nm. After first-order differential transformation (DVI-FD), R increased to 0.828, with the bands shifting to 1182 nm and 1514 nm. Furthermore, the selected random dual-band vegetation indices lay primarily within the red-edge (670–760 nm) and near-infrared (780–2526 nm) wavelength ranges. This agrees with the findings of Chen [45], Kolarik [46], Carter [47], and Zhuang [48]. Within the 400–700 nm range, reflectance is predominantly influenced by pigments such as chlorophyll and carotenoids, which are directly impacted by plant water content. In the 800–1300 nm range, reflectance is mainly attributed to changes in leaf internal structure induced by variations in plant water content. Finally, within the 1300–2500 nm range, reflectance is influenced by the direct absorption of radiation by plant water.
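The exhaustive band-pair screening behind such correlation matrices can be sketched as follows. This is an illustration under stated assumptions, not the study's implementation: the first-order differential is approximated by a simple finite difference between adjacent bands, the helper names (`pearson_r`, `first_difference`, `best_band_pair`) are hypothetical, and the four-band toy spectra stand in for the real 1481-band data.

```python
from itertools import permutations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def first_difference(spectrum):
    """Finite-difference stand-in for a first-order derivative (uniform band spacing)."""
    return [b - a for a, b in zip(spectrum, spectrum[1:])]


# The three basic dual-band forms named in the text.
INDICES = {
    "DVI": lambda a, b: a - b,
    "RVI": lambda a, b: a / b,
    "NDVI": lambda a, b: (a - b) / (a + b),
}


def best_band_pair(spectra, target, index):
    """Scan every ordered band pair and return (|R|, i, j) for the strongest one."""
    formula = INDICES[index]
    best = (0.0, None, None)
    for i, j in permutations(range(len(spectra[0])), 2):
        try:
            values = [formula(s[i], s[j]) for s in spectra]
        except ZeroDivisionError:
            continue  # skip pairs where the index is undefined
        r = abs(pearson_r(values, target))
        if r > best[0]:
            best = (r, i, j)
    return best


# Hypothetical toy data: 4 leaf samples x 4 bands, with a target built from bands 2 and 0.
spectra = [
    [0.10, 0.30, 0.50, 0.40],
    [0.12, 0.28, 0.46, 0.41],
    [0.09, 0.33, 0.55, 0.38],
    [0.11, 0.31, 0.43, 0.44],
]
target = [s[2] - s[0] for s in spectra]

r_raw, i, j = best_band_pair(spectra, target, "DVI")
fd_spectra = [first_difference(s) for s in spectra]
r_fd, p, q = best_band_pair(fd_spectra, target, "DVI")
print(f"raw DVI: |R| = {r_raw:.3f} at bands ({i}, {j})")
print(f"DVI-FD:  |R| = {r_fd:.3f} at bands ({p}, {q})")
```

With 1481 bands there are roughly 2.2 million ordered pairs per index, so a pure-Python loop like this one is only practical for illustration; a vectorized array library would be the usual choice at full scale.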
The G3 group comprised the ‘trilateral’ parameters: spectral reflectance position, amplitude, and area, together with the spectral indices derived from them, all of which describe the spectral characteristics of green vegetation [46]. Eight ‘trilateral’ parameters closely associated with plant moisture were selected [34,35,36]. The ‘trilateral’ parameter with the highest correlation with LWC was the red edge amplitude (Dr), with R = 0.664. For EWT, it was the red edge area (SDr), with R = 0.697. The correlation between the ‘trilateral’ parameters and the water content of C. camphora was lower than that of the empirical vegetation indices and the dual-band indices. This observation agrees with studies by Peng [49], Guo [50], Xie [51], and others. ‘Trilateral’ parameters are more sensitive to factors such as leaf area index, nitrogen content, and chlorophyll content, and their response to plant water content varies across growth stages.
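For illustration, the two red edge parameters can be sketched as below, assuming the commonly used definitions: Dr as the maximum of the first-derivative spectrum within the red edge, and SDr as the sum of the first-derivative values over the same window. The 680–760 nm window, the function name, and the synthetic sigmoid spectrum are assumptions made for this sketch and are not taken from the study.

```python
from math import exp


def red_edge_parameters(wavelengths, reflectance, lo=680, hi=760):
    """Return (Dr, position_nm, SDr) from a finite-difference derivative spectrum.

    The window [lo, hi] is an assumed red edge range; adjust it as needed.
    """
    derivative = []
    for k in range(len(wavelengths) - 1):
        w0, w1 = wavelengths[k], wavelengths[k + 1]
        if lo <= w0 and w1 <= hi:
            slope = (reflectance[k + 1] - reflectance[k]) / (w1 - w0)
            derivative.append((w0, slope))
    # Dr: steepest slope in the window; position: wavelength where it occurs.
    position, dr = max(derivative, key=lambda item: item[1])
    # SDr: sum of slopes, which approximates the area under the derivative at 1 nm sampling.
    sdr = sum(slope for _, slope in derivative)
    return dr, position, sdr


# Synthetic spectrum at 1 nm sampling: a sigmoid-shaped red edge centered at 720 nm.
wavelengths = list(range(650, 801))
reflectance = [0.05 + 0.45 / (1 + exp(-(w - 720) / 10)) for w in wavelengths]

dr, position, sdr = red_edge_parameters(wavelengths, reflectance)
print(f"Dr = {dr:.5f} at {position} nm, SDr = {sdr:.3f}")
```

At uniform 1 nm sampling, SDr reduces to the reflectance difference between the two ends of the window, which is one way to see why it tracks overall red edge contrast rather than water absorption directly.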
Considering the strengths and weaknesses of the G1, G2, and G3 groups, we propose a more comprehensive hybrid group, G4, with the expectation of achieving more accurate inversion results. In the G4 group, all correlation coefficients (R) were compared, and five indices were selected for LWC: WI, OSAVI, DVI (734, 956 nm), DVI-FD (1009, 774 nm), and Dr. Five indices were likewise chosen for EWT: RE-NDVI, SRWI, DVI (700, 1167 nm), DVI-FD (1182, 1514 nm), and SDr. The results, presented in Table 6 and Table 7, show varying degrees of improvement in the accuracy of the three models in the G4 group. The ten indices selected in this study by the empirical/semi-empirical index method as having the highest correlation with the moisture content of C. camphora still leave room for optimization. Collecting more water and spectral data at various growth stages could provide valuable insights, because the growth status of C. camphora can change significantly between stages. Factors such as coverage, leaf area index, and chlorophyll content may also change and thereby affect spectral reflectance. Refining the selection of suitable vegetation indices with the empirical/semi-empirical index method establishes a more comprehensive theoretical foundation for future studies on water management in C. camphora.
4.2. Screening of Different Machine Learning Algorithms
Among the three modeling methods chosen in this study, the RF-based moisture estimation model for C. camphora achieved the highest accuracy, indicating that RF outperforms the other models in the inversion of LWC and EWT. Previous research has shown that the choice of modeling method significantly affects prediction accuracy [52]. In this study, SVM yielded lower prediction accuracy than the RF model. This can be attributed to the inherent difficulty of choosing an appropriate kernel function and associated parameters for SVM: limitations in parameter selection, such as the kernel function and penalty factor, constrain its applicability [53]. The RBFNN model slightly outperformed the SVM model, for which there are several possible reasons. Plant water content is influenced by various factors, including vegetation cover, soil temperature, and air humidity, and multi-path spectral reflectance makes the relationship difficult to characterize with simple linear or exponential models. The RBFNN algorithm, however, uses radial basis functions as the neurons' activation functions, which gives it robust nonlinear mapping capabilities suited to complex nonlinear tasks such as nonlinear function approximation and pattern classification [54,55]. RF, as a machine learning method grounded in ensemble thinking, possesses strong self-learning capabilities. It is a composite model comprising multiple decision trees. Each decision tree is trained on a random subset of the samples and features, and the final prediction is reached through voting (classification) or averaging (regression). This ensemble approach mitigates the risk of overfitting and enhances the model's stability and accuracy [56]. Therefore, RF stands out as the preferred method for monitoring and modeling LWC and EWT in C. camphora. This study thus provides a practical basis for real-time, efficient monitoring of crop water status.
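The ensemble mechanism described above can be illustrated with a deliberately stripped-down sketch. It is not the RF model used in this study: each tree is reduced to a one-split regression stump, the function names are hypothetical, and the feature values (standing in for spectral indices) and water contents are invented. It shows only the three ingredients named in the text: random resampling, a random feature subset per tree, and averaging of the individual predictions.

```python
import random
from statistics import mean


def fit_stump(X, y, features):
    """Find the single split (feature, threshold) with the lowest squared error."""
    best = None
    for f in features:
        values = sorted({row[f] for row in X})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [yi for row, yi in zip(X, y) if row[f] <= t]
            right = [yi for row, yi in zip(X, y) if row[f] > t]
            ml, mr = mean(left), mean(right)
            sse = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
            if best is None or sse < best[0]:
                best = (sse, f, t, ml, mr)
    if best is None:  # every candidate feature is constant in this resample
        m = mean(y)
        return (features[0], float("inf"), m, m)
    return best[1:]


def fit_forest(X, y, n_trees=50, n_features=2, seed=0):
    """Train stumps on bootstrap resamples, each with a random feature subset."""
    rng = random.Random(seed)
    n = len(X)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]        # bootstrap sample
        feats = rng.sample(range(len(X[0])), n_features)  # random feature subset
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return forest


def predict(forest, row):
    """Average the predictions of all stumps (the regression analogue of voting)."""
    return mean(ml if row[f] <= t else mr for f, t, ml, mr in forest)


# Hypothetical data: three index values per leaf and an invented water content.
X = [
    [1.02, 0.20, 0.004],
    [1.03, 0.22, 0.005],
    [1.05, 0.25, 0.006],
    [1.06, 0.27, 0.007],
    [1.08, 0.30, 0.008],
    [1.10, 0.33, 0.009],
]
y = [0.52, 0.55, 0.58, 0.60, 0.63, 0.66]

forest = fit_forest(X, y)
low = predict(forest, X[0])
high = predict(forest, X[-1])
print(f"predicted water content: driest leaf {low:.3f}, wettest leaf {high:.3f}")
```

Because every stump sees a different resample and feature subset, their individual errors partly cancel in the average, which is the stabilizing effect attributed to RF above. A full implementation would grow deeper trees and is usually taken from an established library.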
Several challenges remain in the inversion model for the LWC and EWT of C. camphora based on the preferred vegetation indices. This study used a limited amount of spectral information and only a single spectral preprocessing method. Future research should therefore explore a wider range of spectral index transformations, such as the wavelet transform. It should also incorporate multi-source remote sensing data to exploit the spectral information associated with C. camphora more fully, and integrate more advanced machine learning algorithms to enable faster and more accurate predictions of LWC and EWT for C. camphora.