Utilizing Hyperspectral Reflectance and Machine Learning Algorithms for Non-Destructive Estimation of Chlorophyll Content in Citrus Leaves

Li, Dasui; Hu, Qingqing; Ruan, Siqi; Liu, Jun; Zhang, Jinzhi; Hu, Chungen; Liu, Yongzhong; Dian, Yuanyong; Zhou, Jingjing

doi:10.3390/rs15204934


by Dasui Li ¹, Qingqing Hu ¹, Siqi Ruan ¹, Jun Liu ², Jinzhi Zhang ¹,³, Chungen Hu ¹,³, Yongzhong Liu ¹,³, Yuanyong Dian ¹,⁴ and Jingjing Zhou ¹,⁴,*

¹ College of Horticulture & Forestry Sciences, Huazhong Agricultural University, Wuhan 430070, China

² East China Academy of Inventory and Planning of NFGA, Hangzhou 310000, China

³ National Key Laboratory for Germplasm Innovation & Utilization of Horticultural Crops, Wuhan 430070, China

⁴ Hubei Engineering Technology Research Center for Forestry Information, Wuhan 430070, China

* Author to whom correspondence should be addressed.

Remote Sens. 2023, 15(20), 4934; https://doi.org/10.3390/rs15204934

Submission received: 12 September 2023 / Revised: 9 October 2023 / Accepted: 10 October 2023 / Published: 12 October 2023

(This article belongs to the Special Issue Convolutional Neural Network Applications in Remote Sensing II)


Abstract:

To address the demands of precision agriculture and the measurement of plant photosynthetic response and nitrogen status, advanced methods are needed for estimating chlorophyll content quickly and non-destructively at a large scale. We therefore explored both linear regression and machine learning methods to improve the prediction of leaf chlorophyll content (LCC) in citrus trees from hyperspectral reflectance data collected in a field experiment. The relationship between phenology and LCC estimation was also tested. The LCC of citrus leaves was measured in five growth seasons (May, June, August, October, and December) alongside leaf hyperspectral reflectance. The measured LCC data and spectral parameters were used to estimate LCC with univariate linear regression (ULR), multivariate linear regression (MLR), random forest regression (RFR), K-nearest neighbor regression (KNNR), and support vector regression (SVR). The results revealed the following: in both October and December, the MLR and machine learning models (RFR, KNNR, SVR) performed well in LCC estimation, with a coefficient of determination (R²) greater than 0.70. In August, the ULR model performed the best, achieving an R² of 0.69 and a root mean square error (RMSE) of 8.92. The RFR model demonstrated the highest predictive power for estimating LCC in May, June, October, and December. The best prediction accuracy was obtained with the RFR model using the parameters VOG2 and Carte4 in October, with an R² of 0.83 and an RMSE of 6.67. Our findings show that just a few spectral parameters can efficiently estimate LCC in citrus trees, which holds substantial promise for implementation in large-scale orchards.

Keywords:

leaf chlorophyll content; hyperspectral; citrus; linear regression; machine learning

1. Introduction

Chlorophyll molecules play an important role in the physiology, nitrogen status, and economic productivity of green vegetation [1,2,3] by capturing light energy and generating the biochemical energy essential for the Calvin–Benson cycle [4]. Leaf chlorophyll content (LCC) serves as an indicator of plant physiological status and is intimately linked to leaf photosynthetic capacity, defined by the maximum carboxylation rate [5]. LCC varies across plant species, and within a single species chlorophyll content can fluctuate markedly across leaf developmental stages. These fluctuations are influenced by changes in nutrient and water availability, environmental factors, and plant phenology [2,6,7]. Traditional methods of measuring LCC rely on chemical analysis, which is time-consuming, destructive, and unsuitable for rapid field assessments [8]. Consequently, developing a non-destructive and highly effective approach for estimating LCC is a crucial and pressing challenge.

Remote sensing can use spectra and vegetation indices to predict various biochemical and morphological attributes of plants. This approach holds significant potential for assessing the chlorophyll content of plants rapidly, efficiently, and non-destructively across various scales, including ground-based, airborne, and satellite platforms [9,10,11,12]. In recent years, hyperspectral remote sensing has emerged as a viable solution for acquiring LCC data in crop monitoring and advancing precision agriculture. Its advantages lie in its ability to provide repetitive and high-throughput observations [13,14]. Hyperspectral research has shown that the fluorescence reflectance of chlorophyll in plants predominantly occurs within the near-infrared spectrum (700–1050 nm) due to internal leaf structure. Conversely, the absorption of fluorescence is particularly pronounced in the blue range (400–450 nm) and the red range (650–700 nm), where it is influenced by the presence of pigment groups [15,16,17]. Additionally, certain preprocessing techniques have offered improved solutions for extracting vegetation attributes from hyperspectral reflectance [18,19,20]. Vegetation indices (VIs) have been formulated through various mathematical combinations of hyperspectral data, including simple ratios, differences, normalized differences, and derivatives. These VIs serve to characterize various plant features. In addition, first-order differential spectra (FODS) have been employed to eliminate part of the linear or nearly linear background and reduce the impact of noise on the target spectrum. Another valuable approach uses three-edge parameters (TEPs), which are variables derived from spectral positions, specifically the red edge, blue edge, and yellow edge. TEPs are particularly effective in capturing the spectral attributes of green vegetation and are also sensitive to changes in LCC [10,20,21,22]. The integration of spectral preprocessing techniques with machine learning algorithms has proven valuable in estimating seasonal variability in LCC. This approach is widely applied in precision agriculture and crop phenotyping [23,24].

Over the past few years, machine learning techniques, including random forest regression (RFR), support vector regression (SVR), and K-nearest neighbor regression (KNNR), have gained prominence for building predictive models of plant traits from spectral reflectance. These methods have shown notable improvements in prediction accuracy and model robustness [25,26,27,28,29]. Notably, the RFR technique has proven particularly effective in reducing the root mean square error (RMSE) compared to traditional linear regression methods [22]. Meanwhile, SVR and KNNR are commonly employed for datasets with a large number of features and a high degree of sparsity [30]. Ma et al. [31] used four three-edge parameters and eight vegetation indices to obtain good results in estimating the LCC of citrus trees with RFR, SVR, and XGBoost. Machine learning algorithms can achieve higher accuracy as the number of model inputs increases, but the risk of overfitting and the uncertainty also increase.

Citrus is the most widely cultivated fruit crop globally, and the citrus industry has grown rapidly over the past two decades. While many researchers have employed hyperspectral remote sensing to evaluate LCC in crop species such as winter wheat, rice, and maize [32,33,34], little research has addressed the dynamic monitoring of physiological factors such as chlorophyll content in citrus trees across different growth seasons [11]. Most prominent citrus research has focused on areas such as molecular breeding, stress response, and post-harvest treatment [11]. Saturation effects, which result from the dense canopy with high year-round leaf density and from variation in light conditions, pose major challenges for the accurate estimation of biophysical and biochemical traits in citrus orchards [35]. Many scholars have already begun to estimate physiological factors using hyperspectral information [11,36]. However, a large amount of redundant data often leads to inaccurate prediction of the dependent variable [35], which greatly limits the use of hyperspectral remote sensing for quantitative estimation of biophysical and biochemical traits.

In this study, we collected measurements of both leaf reflectance and leaf chlorophyll content in a field experiment and aimed to (i) monitor the seasonal variations in chlorophyll content at the leaf level of citrus and how phenology influences its estimation; (ii) identify how few spectral parameters are needed to estimate LCC with high accuracy; and (iii) evaluate the performance of linear regression and different machine learning algorithms in predicting the chlorophyll content of citrus leaves. The answers to these questions are instrumental in developing spectral prediction models for the leaf chlorophyll content of citrus. The selection of key spectral information facilitates the dynamic monitoring of LCC on a larger scale.

2. Materials and Methods

2.1. Experimental Site and Experimental Design

In this study, field experiments were carried out from May to December 2022 in the citrus orchards of Huazhong Agricultural University, located in Wuhan, Hubei Province, China (113°41′–115°05′E, 29°58′–31°22′N). Wuhan is among the major cities situated along the upper and middle stretches of the Yangtze River in central China. The climate of this region is characterized by an average annual temperature of 16.9 °C, a mean annual relative humidity of 77%, an annual precipitation of 1259 mm, and an average annual frost-free period of 240 days. Citrus trees from an F2 hybrid population, with Citrus clementina and Poncirus trifoliata (L.) Raf as parents, were sampled in May, June, August, October, and December of 2022. The LCC and spectral data of three healthy citrus leaves in the upper canopy were collected at the same time, with three LCC measurements and one spectral measurement per leaf.

2.2. Measurement of the Hyperspectral Data

In this study, we measured the spectra of citrus leaves using a portable field PSR-3500 spectroradiometer (Spectral Evolution, Inc., Lawrence, KS, USA). This instrument covers a spectral range from 350 to 2500 nm, with a spectral resolution of 1 nm up to 1006 nm and 3.5 nm beyond 1006 nm. To ensure the accuracy of our measurements, a white reference plate was measured for calibration before each spectral measurement. This calibration process enabled us to convert the digital readings into physical signals [37]. Leaf spectra were captured at the middle of each leaf’s surface, with measurements taken under clear skies.

2.3. Measurement of Leaf Chlorophyll Content

LCC was measured in the field with a handheld chlorophyll meter (SPAD-502Plus, Konica Minolta, Tokyo, Japan) at the same points on each leaf where the spectra were sampled. The chlorophyll meter determines chlorophyll content primarily from leaf transmittance in two bands centered at 650 and 940 nm, and SPAD values accurately reflect changes in leaf greenness [10]. For every sample leaf, three measurements were taken and averaged to derive the representative SPAD value for that leaf.

2.4. Extraction of Spectral Parameters

In total, 45 vegetation indices (VIs), 4 three-edge parameters (TEPs), and the first-order differential spectrum (FODS) were preselected as surrogates for LCC estimation; they are summarized in Table 1. The VIs included traditional forms such as normalized difference ratios (e.g., NDVI705), simple ratio indices (e.g., SR), and ratio vegetation indices (e.g., RVI). These VIs simplify the interpretation of intricate vegetation reflectance patterns by establishing indirect connections with plant physiological and structural characteristics [38]. FODS is a common spectral transformation that is sensitive to LCC, while TEPs reflect the spectral attributes of green vegetation and are sensitive to variations in LCC [10]. FODS is the difference in reflectance between adjacent wavelengths [39]: at each wavelength, its value is the reflectance at the next adjacent wavelength minus the reflectance at the current wavelength. All data processing and spectral calculations were performed in Python 3.10.
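The FODS definition above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' code: the function names are hypothetical, assigning each difference to the current (lower) wavelength is an assumption, and reading SR (750,710) as R750/R710 is inferred from the index name only (Table 1 holds the authoritative definitions).

```python
import numpy as np


def first_order_differential(wavelengths, reflectance):
    """Return (wavelengths, FODS) with FODS[i] = R[i+1] - R[i].

    Assumption: each difference is assigned to the current wavelength.
    """
    wavelengths = np.asarray(wavelengths, dtype=float)
    reflectance = np.asarray(reflectance, dtype=float)
    fods = np.diff(reflectance)  # next band minus current band
    return wavelengths[:-1], fods


def simple_ratio(wavelengths, reflectance, numerator_nm, denominator_nm):
    """Ratio of reflectance at the bands nearest to the two wavelengths.

    Assumption: SR (750,710) means R750 / R710, as the name suggests.
    """
    wavelengths = np.asarray(wavelengths, dtype=float)
    reflectance = np.asarray(reflectance, dtype=float)
    i_num = int(np.argmin(np.abs(wavelengths - numerator_nm)))
    i_den = int(np.argmin(np.abs(wavelengths - denominator_nm)))
    return float(reflectance[i_num] / reflectance[i_den])
```

For a measured spectrum stored as two equal-length arrays, `first_order_differential(wl, refl)` returns one value fewer than the number of bands, since the last band has no next neighbor.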

2.5. Dimension Reduction and Parameter Selection

Citrus leaves exhibit varying sensitivity to the extracted spectral parameters. As the number of parameters increases, regression models become more complex and more prone to overfitting, because some parameters may contain noise that does not contribute meaningfully to the output [69]. To reduce dimensionality and achieve better results with fewer parameters, we took three steps. (i) The Pearson correlation coefficient was used to analyze the correlation between the spectral parameters and all the measured LCC. The seven VIs with the highest correlation were selected from the pool of forty-five VIs, the two TEPs with the highest correlation were chosen from the four TEPs, and the FODS band with the highest correlation was added, giving ten sensitive parameters. (ii) The Lasso method was employed to select the optimal combination of three parameters from the ten sensitive parameters for multivariate linear regression. (iii) A random forest model was trained to rank the importance of the sensitive parameters, and the top two parameters were selected for modeling. Lasso is a regularization technique primarily used with linear regression models [70]. It introduces a penalty term into the linear regression objective function, which shrinks the coefficients of less important parameters and sets some of them to exactly zero, thereby removing those features from the model. This simplifies the model and mitigates overfitting. Parameter importance ranking is often associated with ensemble models such as random forests and gradient boosting [71] and is based on the contribution of each parameter to the model’s performance. Parameters with high importance scores are considered more influential in making predictions, which helps to identify and prioritize relevant parameters.
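The three selection steps can be sketched with NumPy and scikit-learn on synthetic data. This is a minimal sketch under stated assumptions, not the authors' pipeline: the helper names, the synthetic predictors `p0` to `p5`, the penalty path in `lasso_select`, and the forest size are all hypothetical, and the text does not say how the Lasso penalty was chosen to keep three parameters.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler


def rank_by_pearson(X, y, names):
    """Return (name, r) pairs sorted by decreasing |Pearson r| with y."""
    r = [np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])]
    order = np.argsort(-np.abs(r))
    return [(names[j], float(r[j])) for j in order]


def lasso_select(X, y, names, n_keep=3):
    """Weaken the L1 penalty step by step until n_keep coefficients are non-zero.

    Assumption: this stopping rule is one simple way to obtain a fixed
    number of parameters; the paper does not describe its own rule.
    """
    Xs = StandardScaler().fit_transform(X)
    coef = np.zeros(X.shape[1])
    for alpha in np.logspace(1, -3, 50):  # strong penalty first
        coef = Lasso(alpha=alpha, max_iter=10000).fit(Xs, y).coef_
        if np.count_nonzero(coef) >= n_keep:
            break
    order = np.argsort(-np.abs(coef))[:n_keep]
    return [names[j] for j in order]


def forest_top(X, y, names, n_keep=2, seed=0):
    """Rank parameters by random forest importance and keep the top n_keep."""
    forest = RandomForestRegressor(n_estimators=100, random_state=seed).fit(X, y)
    order = np.argsort(-forest.feature_importances_)[:n_keep]
    return [names[j] for j in order]


# Synthetic stand-in for spectral parameters: only p0 and p1 drive y.
rng = np.random.default_rng(0)
names = [f"p{j}" for j in range(6)]
X = rng.normal(size=(200, 6))
y = 50 + 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

pearson_ranking = rank_by_pearson(X, y, names)
lasso_pair = lasso_select(X, y, names, n_keep=2)
forest_pair = forest_top(X, y, names, n_keep=2)
```

On real data, `X` would hold the ten sensitive parameters as columns and `y` the SPAD readings; calling `lasso_select(X, y, names, n_keep=3)` and `forest_top(X, y, names, n_keep=2)` mirrors steps (ii) and (iii).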

2.6. Linear Regression Analysis

Univariate linear regression (ULR) and multivariate linear regression (MLR) are well-known statistical models employed for estimating LCC. These models are favored for their simplicity in interpretation and high efficiency in computation [10]. The regression models are:

y = c₀ + c₁x₁    (1)

y = c₀ + c₁x₁ + c₂x₂ + c₃x₃    (2)

Equations (1) and (2) are the models for ULR and MLR, respectively, where x₁, x₂, and x₃ represent the independent variables (spectral parameters); y is the dependent variable (LCC); and c₀, c₁, c₂, and c₃ are the fitted coefficients of the LCC regression models.
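Equations (1) and (2) can be fitted by ordinary least squares. The sketch below uses scikit-learn's `LinearRegression` on noise-free synthetic data so that the recovered coefficients are easy to check; the helper name `fit_linear_model` and the data are hypothetical and only illustrate the model form.

```python
import numpy as np
from sklearn.linear_model import LinearRegression


def fit_linear_model(X, y):
    """Fit y = c0 + c1*x1 (+ c2*x2 + c3*x3) and return (c0, [c1, ...])."""
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:  # a single predictor gives the ULR case
        X = X.reshape(-1, 1)
    model = LinearRegression().fit(X, np.asarray(y, dtype=float))
    return float(model.intercept_), model.coef_


# Synthetic, noise-free example with three predictors.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(50, 3))
y = 20.0 + 30.0 * X[:, 0] - 10.0 * X[:, 1] + 5.0 * X[:, 2]

c0_ulr, c_ulr = fit_linear_model(X[:, 0], y)  # Equation (1)
c0_mlr, c_mlr = fit_linear_model(X, y)        # Equation (2)
```

Because the univariate fit omits x₂ and x₃, its coefficients differ from the generating values, whereas the three-variable fit recovers them.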

2.7. Machine Learning Algorithms

In this study, three machine learning methods were used for LCC estimation: random forest regression (RFR), K-nearest neighbor regression (KNNR), and support vector regression (SVR). The model training process comprised the following steps: data preparation, standardization, data splitting (the field-measured data were randomly divided into a training set comprising 70% and a testing set comprising 30%), model definition, definition of a hyperparameter search space, grid search with cross-validation, selection of the best parameters, model training, and performance evaluation. Grid search exhaustively evaluates a predefined parameter grid to identify the best combination of parameters. The accuracy of the LCC estimation models was evaluated by the coefficient of determination (R²) and root mean square error (RMSE), calculated by Equations (3) and (4), respectively. To visualize the relationship between the predicted and measured LCC values, we plotted measured against predicted values together with the 1:1 line. All algorithms and analyses were implemented with scikit-learn, a widely used machine learning library, in Python v3.10.

R^{2} = 1 - \frac{\sum_{i = 1}^{n} (y_{i} - \hat{y}_{i})^{2}}{\sum_{i = 1}^{n} (y_{i} - \bar{y})^{2}} \qquad (3)

RMSE = \sqrt{\frac{\sum_{i = 1}^{n} (y_{i} - \hat{y}_{i})^{2}}{n}} \qquad (4)

where \hat{y}_{i} is the predicted value of LCC; y_{i} is the value of LCC measured in the field; \bar{y} is the mean of the measured LCC values; and n is the number of validation samples.
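As a worked check of Equations (3) and (4), the sketch below computes both metrics directly from their definitions. The function names and the four sample values are hypothetical and chosen so the arithmetic can be followed by hand: the residual sum of squares is 18 and the total sum of squares is 500.

```python
import numpy as np


def r_squared(y_true, y_pred):
    """Coefficient of determination, Equation (3)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


def rmse(y_true, y_pred):
    """Root mean square error, Equation (4)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


measured = [40.0, 50.0, 60.0, 70.0]   # hypothetical SPAD readings
predicted = [42.0, 48.0, 63.0, 69.0]  # hypothetical model output

r2_value = r_squared(measured, predicted)   # 1 - 18/500 = 0.964
rmse_value = rmse(measured, predicted)      # sqrt(18/4), about 2.12
```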

RFR is an ensemble machine learning technique that relies on decision trees. It constructs numerous small regression trees to make predictions [72]. Two parameters, Ntree and Mtry, are crucial in controlling the performance and complexity of an RF model [11]. Ntree is the number of decision trees in the random forest ensemble, and Mtry is the number of candidate predictors considered at each split. Increasing the number of trees usually increases model accuracy up to a certain threshold, but it also increases computational cost. In this study, Ntree was searched over the candidate values [10, 50, 100, 200].

KNNR is a relatively simple method in which the estimate is computed as a weighted average of the k spectrally nearest neighbors [72]. The KNNR parameters were set as follows: the distance measure was Euclidean distance, the weighting function was uniform, and k was searched over [1, 3, 5, 7, 9, 11, 13, 15].

SVR, grounded in statistical learning theory, is a machine learning method renowned for its ability to generate highly accurate estimates even with a limited quantity of training data [72]. Furthermore, SVR is founded on rigorous mathematical principles, providing a solid basis for its predictive performance [73]. SVR uses kernel functions to transform the data into a new hyperspace where intricate nonlinear patterns can be represented simply [74]. This allows SVR to capture intricate relationships between the input features and the target variable, making it suitable for non-linear data. To optimize SVR performance, its hyperparameters were tuned. The hyperparameters considered were the kernel function (linear, radial basis, or polynomial), the regularization parameter C, and the kernel coefficient γ. C and γ were optimized within [10⁻¹, 1, 10, 100] and [10⁻³, 10⁻², 10⁻¹, 1], respectively.
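The training procedure and search spaces described in this section can be sketched with scikit-learn as follows. This is a sketch under stated assumptions rather than the authors' script: the number of cross-validation folds (5), the R² scoring, the random seeds, and the function name `tune_and_evaluate` are not given in the text, and standardization is placed inside a `Pipeline` so the scaler is fitted only on training folds, which may differ from the order used in the study. Note that γ has no effect on the linear kernel, so those grid points are redundant for it.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Search spaces taken from the text; everything else is an assumption.
SEARCH_SPACES = {
    "RFR": (
        RandomForestRegressor(random_state=0),
        {"model__n_estimators": [10, 50, 100, 200]},
    ),
    "KNNR": (
        KNeighborsRegressor(metric="euclidean", weights="uniform"),
        {"model__n_neighbors": [1, 3, 5, 7, 9, 11, 13, 15]},
    ),
    "SVR": (
        SVR(),
        {
            "model__kernel": ["linear", "rbf", "poly"],
            "model__C": [0.1, 1, 10, 100],
            "model__gamma": [1e-3, 1e-2, 1e-1, 1],
        },
    ),
}


def tune_and_evaluate(X, y, cv=5, seed=0):
    """70/30 split, grid search with cross-validation, then test-set R2 and RMSE."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=seed
    )
    results = {}
    for name, (estimator, grid) in SEARCH_SPACES.items():
        pipe = Pipeline([("scale", StandardScaler()), ("model", estimator)])
        search = GridSearchCV(pipe, grid, cv=cv, scoring="r2")
        search.fit(X_train, y_train)  # refits the best setting on all training data
        y_pred = search.predict(X_test)
        results[name] = {
            "best_params": search.best_params_,
            "r2": float(r2_score(y_test, y_pred)),
            "rmse": float(np.sqrt(mean_squared_error(y_test, y_pred))),
        }
    return results
```

Calling `tune_and_evaluate(X, y)` with a two-column array of selected spectral parameters and the matching SPAD values returns, for each model, the chosen hyperparameters and the test-set R² and RMSE.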

3. Results

3.1. Statistics of Measured LCC

Figure 1 shows the summary statistics for the measured LCC of the citrus leaves. In this study, 275, 389, 451, 279, and 113 leaf samples were selected for model training in May, June, August, October, and December, respectively. The mean LCC increased from a minimum of 49.60 in May to a maximum of 55.87 in June, fell to 52.01 in August, stayed near that level in October (52.46), and then decreased to 50.86 in December. Notably, the mean LCC in June was significantly higher than that in May, August, and December. LCC ranged from 22.30 to 79.70 in May, 20.30 to 87.30 in June, 6.70 to 87.37 in August, 10.80 to 91.67 in October, and 11.87 to 92.70 in December. The coefficient of variation (CV) ranged from 21.41% to 36.52%. In August, October, and December, the CV exceeded 30% and the LCC range was wider, indicating greater variability than in May and June.
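The summary statistics reported here are straightforward to reproduce for any set of readings. The sketch below is illustrative only: the function name and sample values are hypothetical, and using the sample standard deviation (`ddof=1`) for the CV is an assumption, since the text does not state which form was used.

```python
import numpy as np


def summarize_lcc(values):
    """Return n, min, max, mean, and coefficient of variation (%) of LCC readings."""
    v = np.asarray(values, dtype=float)
    mean = v.mean()
    sd = v.std(ddof=1)  # assumption: sample standard deviation
    return {
        "n": int(v.size),
        "min": float(v.min()),
        "max": float(v.max()),
        "mean": float(mean),
        "cv_percent": float(100.0 * sd / mean),
    }


stats = summarize_lcc([40.0, 50.0, 60.0])  # mean 50, SD 10, CV 20%
```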

3.2. Parameters Selection

Figure 2 illustrates the results of the correlation analysis between the spectral parameters (45 VIs, 4 TEPs, and FODS) and all the measured LCC. Ten sensitive parameters (seven VIs, two TEPs, and FODS) were selected for model training in linear regression and machine learning: Datt1, VOG2, MTCI, SR (750,710), Carte4, mNDVI705, RVI, FODS, SDr/SDb, and (SDr − SDb)/(SDr + SDb). Their correlation coefficients with LCC were 0.77, 0.78, 0.80, 0.76, −0.79, 0.76, 0.79, 0.77, 0.53, and 0.75, respectively. The maximum correlation coefficients between LCC and FODS in May, June, August, October, and December were 0.56 at 647.2 nm, 0.80 at 740.7 nm, 0.81 at 740.7 nm, 0.80 at 735.5 nm, and 0.85 at 730.2 nm, respectively (Figure 3).

3.3. Univariate Linear Regression

Each of the 10 sensitive parameters was used in turn as the independent variable, with LCC as the dependent variable, to construct univariate linear regression models, and the best-fitting model was chosen for each growth season. Table 2 shows that the best R² and the corresponding parameter were 0.20 with FODS (647.2) in May, 0.61 with Datt1 in June, 0.69 with MTCI in August, 0.66 with Carte4 in October, and 0.72 with FODS (730.2) in December. FODS (730.2) gave the highest R² overall, with an R² of 0.72 and an RMSE of 9.72 in December. In all growth seasons except May, the R² was greater than 0.60.

3.4. Multivariate Linear Regression

The 10 sensitive parameters were passed to the Lasso for parameter selection. By adding an L1 regularization term to the loss function, the top three sensitive parameters were chosen for the MLR models. The R² and RMSE of the best model were 0.29 and 8.59 in May, 0.64 and 7.93 in June, 0.68 and 8.79 in August, 0.77 and 7.70 in October, and 0.76 and 8.56 in December. The MLR models achieved their highest R² of 0.77 in October. In all growth seasons except May, the R² was greater than 0.60 (Table 3). Compared with the univariate models, the MLR models performed better in May, June, October, and December.

3.5. Machine Learning Algorithms

Figure 4 shows the importance ranking of the ten sensitive parameters, from which the top two parameters were determined for each growth season: Carte4 and FODS (647.2) in May, VOG2 and Carte4 in June, VOG2 and SR (750,710) in August, VOG2 and Carte4 in October, and FODS (730.2) and (SDr − SDb)/(SDr + SDb) in December. These parameters were used to predict LCC. The R² values of RFR were 0.42 in May, 0.69 in June, 0.67 in August, 0.83 in October, and 0.83 in December; those of KNNR were 0.34, 0.62, 0.64, 0.78, and 0.71; and those of SVR were 0.33, 0.58, 0.59, 0.73, and 0.71 in the same months (Table 4). RFR reached its highest R² of 0.83 in both October and December, while KNNR and SVR peaked in October at 0.78 and 0.73, respectively. All three regression models had R² values higher than 0.7 in both October and December (Figure 5, Figure 6 and Figure 7). RFR demonstrated the best LCC estimation performance, with higher R² values than KNNR and SVR in all five growth seasons.

4. Discussion

In this study, we focused on leaf-level hyperspectral data to estimate the chlorophyll content in citrus trees. A correlation analysis of the relationship between the spectral parameters (45 VIs, 4 TEPs, and FODS) and all measured LCC is presented in Figure 2. The absolute values of the correlation coefficients for the top 7 VIs (Datt1, VOG2, MTCI, SR (750,710), Carte4, mNDVI705 and RVI) were all greater than 0.75, those for the top 2 TEPs (SDr/SDb, (SDr − SDb)/(SDr + SDb)) were both greater than 0.50, and that for FODS (740.7) was 0.77. Notably, these parameters are primarily associated with the red edge region (680–780 nm) of the reflectance spectrum, which is influenced by biochemical and biophysical factors. This region can be used effectively to estimate foliar chlorophyll and nitrogen content [75].

We explored regression analysis using established spectral parameters, as well as machine learning algorithms such as random forests, support vector machines, and K-nearest neighbors. The analysis was performed in two ways: (i) considering all available spectral bands and (ii) using selected and optimized spectral parameters as predictor variables. A more detailed discussion of this analysis is provided below.

4.1. Linear Regression Analysis of the Spectral Parameters for LCC Estimation

Univariate linear regression analyses between the 10 selected sensitive parameters and LCC were performed, and the best-performing parameter for each season is presented in Table 2. FODS, the first-order differential spectrum in the red edge region, was the best-performing spectral parameter based on the R² and RMSE (0.72 and 9.72) in December. These results are consistent with prior research in crop analysis, including studies by Guo, Zhang, and Shi [76,77,78], who reported that FODS can effectively estimate chlorophyll content in crops including tobacco, winter wheat, and soybean. This is because FODS contains valuable spectral information linked to chlorophyll content, and the red edge region, which characterizes how plant chlorophyll absorbs red and near-infrared radiation, is particularly sensitive to chlorophyll content [78,79]. The multivariate linear regression results indicate that the three-parameter combination (VOG2, SR (750,710), Carte4) yielded the best LCC estimates, with an R² of 0.77 in October and 0.76 in December, as shown in Table 3. Previous studies have consistently demonstrated that red edge parameters provide the most accurate results for estimating chlorophyll content, whether at the leaf level or the canopy level [80]. Specifically, VOG2 calculated from the original spectra was found to be strongly sensitive to leaf chlorophyll content in winter oilseed rape and sorghum [79,81]. SR (750,710) was shown to provide accurate estimates of chlorophyll content in closed forest canopies [80].

4.2. Performance Evaluation of Machine Learning Algorithms for LCC Estimation

Previous research has demonstrated the potential of machine learning models for predicting the chlorophyll content of citrus leaves [10,30,82]. In this study, we established prediction models for LCC across five growth seasons of citrus trees using RFR, KNNR, and SVR. The KNNR and SVR models, with the parameters VOG2 and Carte4, achieved their highest R² values of 0.78 and 0.73 and their lowest RMSE values of 7.94 and 8.43, respectively, in October. However, the RFR models, with the parameters VOG2 and Carte4, reached the highest R² (0.83) and the lowest RMSE (6.67) in October. The results indicate that RFR outperformed KNNR and SVR as the most effective predictor (Table 4). RFR, a supervised regression technique based on ensemble learning, has been consistently reported to yield high accuracy in estimating LCC in crops [22,30,78,83]. RFR is composed of multiple trees trained through bagging and a random variable selection process [84], so it is robust against outliers and noise. Moreover, it can handle the substantial nonlinearity inherent in the functional relationship between spectral variables and biophysical or biochemical parameters [41]. KNNR, on the other hand, is a simple, nonparametric supervised learning technique that predicts each data point from its nearest neighbors, which renders it attuned to the local data structure [85]. Although it can provide reliable estimates of LCC, KNNR performed worse than RFR in this study. SVR models exhibited slightly inferior performance compared to the other regression models in June, August, and December. This result could potentially be attributed to the tendency of SVR models to overfit.

4.3. Exploring Future Prospects in Citrus Chlorophyll Research

In this study, we established ULR, MLR, and various machine learning models (RFR, KNNR, and SVR) using field-measured hyperspectral data to estimate LCC across five growth seasons. To enhance the applicability of our findings, further analysis is needed that extends beyond leaf-level assessments. Future research can also extend its focus to hyperspectral data acquired by sensors on remote sensing satellites or UAV-based systems. This has the potential to substantially improve the monitoring of horticultural crop growth across large areas and to address key factors such as physiology, nitrogen status, and productivity.

Additionally, it is essential to acknowledge that these models exhibited varying estimation accuracies across the five growth seasons, underscoring the influence of climatic and phenological factors. The fluctuations in chlorophyll content observed, notably the peak in June and the lower values in August, October, and December, can be attributed to reduced photosynthetic activity resulting from decreasing light and temperature levels during these months. Moreover, fluctuations in rainfall between months may affect water availability, which in turn influences leaf chlorophyll content through its effect on plant water use. The growth stages of citrus trees, where June possibly represents a period of vigorous growth and December a dormant phase, also contribute to the observed variations in leaf chlorophyll content. These variations may further be influenced by other meteorological factors such as wind speed, humidity, and ultraviolet radiation, highlighting the multifaceted nature of leaf chlorophyll content variation across changing seasons.

5. Conclusions

Non-destructive approaches for precisely estimating LCC in various growth seasons are essential for quickly monitoring the physiological condition of citrus trees. In this study, we explored the development of univariate and multivariate linear regression and machine learning models (RFR, KNNR, and SVR) using hyperspectral leaf reflectance. Interestingly, RFR models using just two spectral parameters performed the best, with R² values of 0.83 for both October and December, compared to KNNR and SVR. This indicates that the RFR model based on a limited number of parameters remained robust under varying seasonal and meteorological conditions. An obvious influence of phenology on LCC estimation was also observed, even though citrus is an evergreen tree species. This is useful for the cultivation and management of citrus trees as phenology changes. In addition, investigating the changes in LCC throughout the growth seasons could provide insight into the dynamics of the physiological status of citrus trees over time.

Author Contributions

J.Z. (Jinzhi Zhang) proposed the idea, designed the experiment, and revised the paper. D.L. analyzed the experimental data and organized the writing. D.L., Q.H. and S.R. measured all chlorophyll and spectral data. Y.D. provided the technical guidance and supervised the work. J.Z. (Jingjing Zhou), C.H., Y.L. and J.L. provided citrus trees as experimental materials and helped to design experiments. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Key Research and Development Plan (grant number 2019YFD1000104). This study was also supported by two National Natural Fund Projects (nos. 31972356 and 31901963) and an earmarked fund for CARS 26.

Data Availability Statement

The data used in this study are available from the first or corresponding author upon reasonable request.

Conflicts of Interest

The authors declare no conflict of interest.

References

Peng, Y.; Nguy-Robertson, A.; Arkebauer, T.; Gitelson, A. Assessment of Canopy Chlorophyll Content Retrieval in Maize and Soybean: Implications of Hysteresis on the Development of Generic Algorithms. Remote Sens. 2017, 9, 226.
Evans, J.R. Photosynthesis and Nitrogen Relationships in Leaves of C3 Plants. Oecologia 1989, 78, 9–19.
Luo, J.; Zhou, J.-J.; Masclaux-Daubresse, C.; Wang, N.; Wang, H.; Zheng, B. Morphological and Physiological Responses to Contrasting Nitrogen Regimes in Populus Cathayana Is Linked to Resources Allocation and Carbon/Nitrogen Partition. Environ. Exp. Bot. 2019, 162, 247–255.
Porcar-Castell, A.; Tyystjärvi, E.; Atherton, J.; Van Der Tol, C.; Flexas, J.; Pfündel, E.E.; Moreno, J.; Frankenberg, C.; Berry, J.A. Linking Chlorophyll a Fluorescence to Photosynthesis for Remote Sensing Applications: Mechanisms and Challenges. J. Exp. Bot. 2014, 65, 4065–4095.
Croft, H.; Chen, J.M.; Luo, X.; Bartlett, P.; Chen, B.; Staebler, R.M. Leaf Chlorophyll Content as a Proxy for Leaf Photosynthetic Capacity. Glob. Chang. Biol. 2017, 23, 3513–3524.
Sage, R.F.; Pearcy, R.W.; Seemann, J.R. The Nitrogen Use Efficiency of C₃ and C₄ Plants: III. Leaf Nitrogen Effects on the Activity of Carboxylating Enzymes in Chenopodium album (L.) and Amaranthus retroflexus (L.). Plant Physiol. 1987, 85, 355–359.
Luo, J.; Zhou, J.; Li, H.; Shi, W.; Polle, A.; Lu, M.; Sun, X.; Luo, Z.-B. Global Poplar Root and Leaf Transcriptomes Reveal Links between Growth and Stress Responses under Nitrogen Starvation and Excess. Tree Physiol. 2015, 35, 1283–1302.
Qiao, L.; Tang, W.; Gao, D.; Zhao, R.; An, L.; Li, M.; Sun, H.; Song, D. UAV-Based Chlorophyll Content Estimation by Evaluating Vegetation Index Responses under Different Crop Coverages. Comput. Electron. Agric. 2022, 196, 106775.
Aasen, H.; Honkavaara, E.; Lucieer, A.; Zarco-Tejada, P. Quantitative Remote Sensing at Ultra-High Resolution with UAV Spectroscopy: A Review of Sensor Technology, Measurement Procedures, and Data Correction Workflows. Remote Sens. 2018, 10, 1091.
Ta, N.; Chang, Q.; Zhang, Y. Estimation of Apple Tree Leaf Chlorophyll Content Based on Machine Learning Methods. Remote Sens. 2021, 13, 3902.
Zhou, J.-J.; Zhang, Y.-H.; Han, Z.-M.; Liu, X.-Y.; Jian, Y.-F.; Hu, C.-G.; Dian, Y.-Y. Evaluating the Performance of Hyperspectral Leaf Reflectance to Detect Water Stress and Estimation of Photosynthetic Capacities. Remote Sens. 2021, 13, 2160.
Gamon, J.A.; Somers, B.; Malenovský, Z.; Middleton, E.M.; Rascher, U.; Schaepman, M.E. Assessing Vegetation Function with Imaging Spectroscopy. Surv. Geophys. 2019, 40, 489–513.
Zhang, Y.; Hui, J.; Qin, Q.; Sun, Y.; Zhang, T.; Sun, H.; Li, M. Transfer-Learning-Based Approach for Leaf Chlorophyll Content Estimation of Winter Wheat from Hyperspectral Data. Remote Sens. Environ. 2021, 267, 112724.
Zarco-Tejada, P.J.; González-Dugo, M.V.; Fereres, E. Seasonal Stability of Chlorophyll Fluorescence Quantified from Airborne Hyperspectral Imagery as an Indicator of Net Photosynthesis in the Context of Precision Agriculture. Remote Sens. Environ. 2016, 179, 89–103.
Datt, B. Remote Sensing of Chlorophyll a, Chlorophyll b, Chlorophyll a+b, and Total Carotenoid Content in Eucalyptus Leaves. Remote Sens. Environ. 1998, 66, 111–121.
Yamashita, H.; Sonobe, R.; Hirono, Y.; Morita, A.; Ikka, T. Dissection of Hyperspectral Reflectance to Estimate Nitrogen and Chlorophyll Contents in Tea Leaves Based on Machine Learning Algorithms. Sci. Rep. 2020, 10, 17360.
Poobalasubramanian, M.; Park, E.-S.; Faqeerzada, M.A.; Kim, T.; Kim, M.S.; Baek, I.; Cho, B.-K. Identification of Early Heat and Water Stress in Strawberry Plants Using Chlorophyll-Fluorescence Indices Extracted via Hyperspectral Images. Sensors 2022, 22, 8706.
Zheng, Y.; Zhang, T.; Zhang, J.; Chen, X.; Shen, X. Influence of smooth, 1st derivative and baseline correction on the near-infrared spectrum analysis with PLS. Spectrosc. Spectr. Anal. 2004, 24, 1546–1548.
Zhao, T.; Komatsuzaki, M.; Okamoto, H.; Sakai, K. Cover Crop Nutrient and Biomass Assessment System Using Portable Hyperspectral Camera and Laser Distance Sensor. Eng. Agric. Environ. Food 2010, 3, 105–112.
Zhao, T.; Nakano, A.; Iwaski, Y.; Umeda, H. Application of Hyperspectral Imaging for Assessment of Tomato Leaf Water Status in Plant Factories. Appl. Sci. 2020, 10, 4665.
Turpie, K.R. Explaining the Spectral Red-Edge Features of Inundated Marsh Vegetation. J. Coast. Res. 2013, 290, 1111–1117.
Shah, S.H.; Angel, Y.; Houborg, R.; Ali, S.; McCabe, M.F. A Random Forest Machine Learning Approach for the Retrieval of Leaf Chlorophyll Content in Wheat. Remote Sens. 2019, 11, 920.
Cui, B.; Zhao, Q.; Huang, W.; Song, X.; Ye, H.; Zhou, X. A New Integrated Vegetation Index for the Estimation of Winter Wheat Leaf Chlorophyll Content. Remote Sens. 2019, 11, 974.
Zhu, W.; Sun, Z.; Yang, T.; Li, J.; Peng, J.; Zhu, K.; Li, S.; Gong, H.; Lyu, Y.; Li, B.; et al. Estimating Leaf Chlorophyll Content of Crops via Optimal Unmanned Aerial Vehicle Hyperspectral Data at Multi-Scales. Comput. Electron. Agric. 2020, 178, 105786.
Wang, J.; Chen, Y.; Chen, F.; Shi, T.; Wu, G. Wavelet-Based Coupling of Leaf and Canopy Reflectance Spectra to Improve the Estimation Accuracy of Foliar Nitrogen Concentration. Agric. For. Meteorol. 2018, 248, 306–315.
Wang, L.; Zhou, X.; Zhu, X.; Dong, Z.; Guo, W. Estimation of Biomass in Wheat Using Random Forest Regression Algorithm and Remote Sensing Data. Crop J. 2016, 4, 212–219.
Cavallo, D.P.; Cefola, M.; Pace, B.; Logrieco, A.F.; Attolico, G. Contactless and Non-Destructive Chlorophyll Content Prediction by Random Forest Regression: A Case Study on Fresh-Cut Rocket Leaves. Comput. Electron. Agric. 2017, 140, 303–310.
Liu, H.; Li, M.; Zhang, J.; Gao, D.; Sun, H.; Yang, L. Estimation of Chlorophyll Content in Maize Canopy Using Wavelet Denoising and SVR Method. Int. J. Agric. Biol. Eng. 2018, 11, 132–137.
Barman, U.; Sarmah, A.; Sahu, D.; Barman, G.G. Estimation of Tea Leaf Chlorophyll Using MLR, ANN, SVR, and KNN in Natural Light Condition. In Proceedings of the International Conference on Computing and Communication Systems; Maji, A.K., Saha, G., Das, S., Basu, S., Tavares, J.M.R.S., Eds.; Springer: Singapore, 2021; pp. 287–295.
Narmilan, A.; Gonzalez, F.; Salgadoe, A.S.A.; Kumarasiri, U.W.L.M.; Weerasinghe, H.A.S.; Kulasekara, B.R. Predicting Canopy Chlorophyll Content in Sugarcane Crops Using Machine Learning Algorithms and Spectral Vegetation Indices Derived from UAV Multispectral Imagery. Remote Sens. 2022, 14, 1140.
Ma, R.; Tang, T.; Wang, X. Correlation Analysis of Citrus Chlorophyll Content based on Machine Learning. Sci. Technol. Innov. 2023, 72–75.
Kong, W.; Huang, W.; Zhou, X.; Ye, H.; Dong, Y.; Casa, R. Off-Nadir Hyperspectral Sensing for Estimation of Vertical Profile of Leaf Chlorophyll Content within Wheat Canopies. Sensors 2017, 17, 2711.
Jin, X.; Li, Z.; Feng, H.; Xu, X.; Yang, G. Newly Combined Spectral Indices to Improve Estimation of Total Leaf Chlorophyll Content in Cotton. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2014, 7, 4589–4600.
Jin, X.; Wang, K.; Xiao, C.; Diao, W.; Wang, F.; Chen, B.; Li, S. Comparison of Two Methods for Estimation of Leaf Total Chlorophyll Content Using Remote Sensing in Wheat. Field Crops Res. 2012, 135, 24–29.
Ali, A.; Imran, M. Remotely Sensed Real-Time Quantification of Biophysical and Biochemical Traits of Citrus (Citrus sinensis L.) Fruit Orchards—A Review. Sci. Hortic. 2021, 282, 110024.
Sari, M.; Sonmez, N.K.; Karaca, M. Relationship between Chlorophyll Content and Canopy Reflectance in Washington Navel Orange Trees (Citrus sinensis (L.) Osbeck). Pak. J. Bot. 2005, 37, 1093–1102.
Osco, L.P.; Ramos, A.P.M.; Faita Pinheiro, M.M.; Moriya, É.A.S.; Imai, N.N.; Estrabis, N.; Ianczyk, F.; de Araújo, F.F.; Liesenberg, V.; de Castro Jorge, L.A.; et al. A Machine Learning Framework to Predict Nutrient Content in Valencia-Orange Leaf Hyperspectral Measurements. Remote Sens. 2020, 12, 906.
Gerhards, M.; Schlerf, M.; Mallick, K.; Udelhoven, T. Challenges and Future Perspectives of Multi-/Hyperspectral Thermal Infrared Remote Sensing for Crop Water-Stress Detection: A Review. Remote Sens. 2019, 11, 1240.
Li, F.; Wang, L.; Liu, J.; Wang, Y.; Chang, Q. Evaluation of Leaf N Concentration in Winter Wheat Based on Discrete Wavelet Transform Analysis. Remote Sens. 2019, 11, 1331.
Gitelson, A.A.; Merzlyak, M.N.; Chivkunova, O.B. Optical Properties and Nondestructive Estimation of Anthocyanin Content in Plant Leaves. Photochem. Photobiol. 2007, 74, 38–45.
Liang, L.; Qin, Z.; Zhao, S.; Di, L.; Zhang, C.; Deng, M.; Lin, H.; Zhang, L.; Wang, L.; Liu, Z. Estimating Crop Chlorophyll Content with Hyperspectral Vegetation Indices and the Hybrid Inversion Method. Int. J. Remote Sens. 2016, 37, 2923–2949.
Carter, G.A. Ratios of Leaf Reflectances in Narrow Wavebands as Indicators of Plant Stress. Int. J. Remote Sens. 1994, 15, 697–703.
Datt, B. A New Reflectance Index for Remote Sensing of Chlorophyll Content in Higher Plants: Tests Using Eucalyptus Leaves. J. Plant Physiol. 1999, 154, 30–36.
Huete, A.; Justice, C.; Liu, H. Development of Vegetation and Soil Indices for MODIS-EOS. Remote Sens. Environ. 1994, 49, 224–234.
Daughtry, C.S.T.; Walthall, C.L.; Kim, M.S.; de Colstoun, E.B.; McMurtrey, J.E. Estimating Corn Leaf Chlorophyll Concentration from Leaf and Canopy Reflectance. Remote Sens. Environ. 2000, 74, 229–239.
Haboudane, D.; Miller, J.R.; Pattey, E.; Zarco-Tejada, P.J.; Strachan, I.B. Hyperspectral Vegetation Indices and Novel Algorithms for Predicting Green LAI of Crop Canopies: Modeling and Validation in the Context of Precision Agriculture. Remote Sens. Environ. 2004, 90, 337–352.
Marshak, A.; Knyazikhin, Y.; Davis, A.B.; Wiscombe, W.J.; Pilewskie, P. Cloud-Vegetation Interaction: Use of Normalized Difference Cloud Index for Estimation of Cloud Optical Thickness. Geophys. Res. Lett. 2000, 27, 1695–1698.
Merzlyak, M.N.; Gitelson, A.A.; Chivkunova, O.B.; Rakitin, V.Y. Non-Destructive Optical Detection of Pigment Changes during Leaf Senescence and Fruit Ripening. Physiol. Plant. 1999, 106, 135–141.
Roujean, J.-L.; Breon, F.-M. Estimating PAR Absorbed by Vegetation from Bidirectional Reflectance Measurements. Remote Sens. Environ. 1995, 51, 375–384.
Clevers, J.G.P.W. Imaging Spectrometry in Agriculture—Plant Vitality and Yield Indicators. In Imaging Spectrometry—A Tool for Environmental Observations; Eurocourses: Remote Sensing; Hill, J., Mégier, J., Eds.; Springer: Dordrecht, The Netherlands, 1994; Volume 4, pp. 193–219. ISBN 978-0-7923-2965-7.
Vincini, M.; Frazzi, E.; D’Alessio, P. Angular Dependence of Maize and Sugar Beet VIs from Directional CHRIS/Proba Data. In Proceedings of the 4th ESA CHRIS PROBA Workshop, ESRIN, Frascati, Italy, 19–21 September 2006.
Penuelas, J.; Frederic, B.; Filella, I. Semi-Empirical Indices to Assess Carotenoids/Chlorophyll Alpha Ratio from Leaf Spectral Reflectance. Photosynthetica 1995, 31, 221–230.
Lichtenthaler, H.K.; Lang, M.; Stober, F.; Sowinska, M.; Heisel, F.; Miehe, J.A. Detection of Photosynthetic Parameters and Vegetation Stress via a New High Resolution Fluorescence Imaging-System. In Proceedings of the EARSeL, Basel, Switzerland, 4–6 September 1995; p. 103.
McMurtrey, J.E.; Chappelle, E.W.; Kim, M.S.; Meisinger, J.J.; Corp, L.A. Distinguishing Nitrogen Fertilization Levels in Field Corn (Zea mays L.) with Actively Induced Fluorescence and Passive Reflectance Measurements. Remote Sens. Environ. 1994, 47, 36–44.
Gitelson, A.A.; Merzlyak, M.N. Remote Estimation of Chlorophyll Content in Higher Plant Leaves. Int. J. Remote Sens. 1997, 18, 2691–2697.
Zarco-Tejada, P.J.; Miller, J.R. Land Cover Mapping at BOREAS Using Red Edge Spectral Parameters from CASI Imagery. J. Geophys. Res. 1999, 104, 27921–27933.
Sims, D.A.; Gamon, J.A. Relationships between Leaf Pigment Content and Spectral Reflectance across a Wide Range of Species, Leaf Structures and Developmental Stages. Remote Sens. Environ. 2002, 81, 337–354.
Haboudane, D.; Miller, J.R.; Tremblay, N.; Zarco-Tejada, P.J.; Dextraze, L. Integrated Narrow-Band Vegetation Indices for Prediction of Crop Chlorophyll Content for Application to Precision Agriculture. Remote Sens. Environ. 2002, 81, 416–426.
Rondeaux, G.; Steven, M.; Baret, F. Optimization of Soil-Adjusted Vegetation Indices. Remote Sens. Environ. 1996, 55, 95–107.
Broge, N.H.; Leblanc, E. Comparing Prediction Power and Stability of Broadband and Hyperspectral Vegetation Indices for Estimation of Green Leaf Area Index and Canopy Chlorophyll Density. Remote Sens. Environ. 2001, 76, 156–172.
Datt, B. Remote Sensing of Water Content in Eucalyptus Leaves. Aust. J. Bot. 1999, 47, 909.
Blackburn, G.A. Spectral Indices for Estimating Photosynthetic Pigment Concentrations: A Test Using Senescent Tree Leaves. Int. J. Remote Sens. 1998, 19, 657–675.
Gitelson, A.A. Remote Estimation of Canopy Chlorophyll Content in Crops. Geophys. Res. Lett. 2005, 32, L08403.
Fitzgerald, G.; Rodriguez, D.; O’Leary, G. Measuring and Predicting Canopy Nitrogen Nutrition in Wheat Using a Spectral Index—The Canopy Chlorophyll Content Index (CCCI). Field Crops Res. 2010, 116, 318–324.
Dash, J.; Curran, P.J. The MERIS Terrestrial Chlorophyll Index. Int. J. Remote Sens. 2004, 25, 5403–5413.
Gitelson, A.A.; Keydan, G.P.; Merzlyak, M.N. Three-band Model for Noninvasive Estimation of Chlorophyll, Carotenoids, and Anthocyanin Contents in Higher Plant Leaves. Geophys. Res. Lett. 2006, 33, 2006GL026457.
Jordan, C.F. Derivation of Leaf-Area Index from Quality of Light on the Forest Floor. Ecology 1969, 50, 663–666.
Li, X.; Liu, G.; Yang, Y.; Zhao, C.; Yu, Q.; Song, S. Relationship Between Hyperspectral Parameters and Physiological and Biochemical Indexes of Flue-Cured Tobacco Leaves. Agric. Sci. China 2007, 6, 665–672.
Ying, X. An Overview of Overfitting and Its Solutions. J. Phys. Conf. Ser. 2019, 1168, 022022.
Fonti, V.; Belitser, E. Paper in Business Analytics Feature Selection Using LASSO; VU Amsterdam: Amsterdam, The Netherlands, 2017; Volume 30, pp. 1–25.
Qi, Y. Random Forest for Bioinformatics. In Ensemble Machine Learning: Methods and Applications; Zhang, C., Ma, Y., Eds.; Springer: New York, NY, USA, 2012; pp. 307–323. ISBN 978-1-4419-9326-7.
Zhang, C.; Denka, S.; Cooper, H.; Mishra, D.R. Quantification of Sawgrass Marsh Aboveground Biomass in the Coastal Everglades Using Object-Based Ensemble Analysis and Landsat Data. Remote Sens. Environ. 2018, 204, 366–379.
Smola, A.J.; Schölkopf, B. A Tutorial on Support Vector Regression. Stat. Comput. 2004, 14, 199–222.
Were, K.; Bui, D.T.; Dick, Ø.B.; Singh, B.R. A Comparative Assessment of Support Vector Regression, Artificial Neural Networks, and Random Forests for Predicting and Mapping Soil Organic Carbon Stocks across an Afromontane Landscape. Ecol. Indic. 2015, 52, 394–403.
Cho, M.A.; Skidmore, A.K. A New Technique for Extracting the Red Edge Position from Hyperspectral Data: The Linear Extrapolation Method. Remote Sens. Environ. 2006, 101, 181–193.
Guo, T.; Tan, C.; Li, Q.; Cui, G.; Li, H. Estimating Leaf Chlorophyll Content in Tobacco Based on Various Canopy Hyperspectral Parameters. J. Ambient Intell. Hum. Comput. 2019, 10, 3239–3247.
Zhang, X.; Sun, H.; Qiao, X.; Yan, X.; Feng, M.; Xiao, L.; Song, X.; Zhang, M.; Shafiq, F.; Yang, W.; et al. Hyperspectral Estimation of Canopy Chlorophyll of Winter Wheat by Using the Optimized Vegetation Indices. Comput. Electron. Agric. 2022, 193, 106654.
Shi, H.; Guo, J.; An, J.; Tang, Z.; Wang, X.; Li, W.; Zhao, X.; Jin, L.; Xiang, Y.; Li, Z.; et al. Estimation of Chlorophyll Content in Soybean Crop at Different Growth Stages Based on Optimal Spectral Index. Agronomy 2023, 13, 663.
Li, L.; Ren, T.; Ma, Y.; Wei, Q.; Wang, S.; Li, X.; Cong, R.; Liu, S.; Lu, J. Evaluating Chlorophyll Density in Winter Oilseed Rape (Brassica napus L.) Using Canopy Hyperspectral Red-Edge Parameters. Comput. Electron. Agric. 2016, 126, 21–31.
Zarco-Tejada, P.J.; Miller, J.R.; Morales, A.; Berjón, A.; Agüera, J. Hyperspectral Indices and Model Simulation for Chlorophyll Estimation in Open-Canopy Tree Crops. Remote Sens. Environ. 2004, 90, 463–476.
Bhadra, S.; Sagan, V.; Maimaitijiang, M.; Maimaitiyiming, M.; Newcomb, M.; Shakoor, N.; Mockler, T.C. Quantifying Leaf Chlorophyll Concentration of Sorghum from Hyperspectral Data Using Derivative Calculus and Machine Learning. Remote Sens. 2020, 12, 2082.
Angel, Y.; McCabe, M.F. Machine Learning Strategies for the Retrieval of Leaf-Chlorophyll Dynamics: Model Choice, Sequential Versus Retraining Learning, and Hyperspectral Predictors. Front. Plant Sci. 2022, 13, 722442.
An, G.; Xing, M.; He, B.; Liao, C.; Huang, X.; Shang, J.; Kang, H. Using Machine Learning for Estimating Rice Chlorophyll Content from In Situ Hyperspectral Data. Remote Sens. 2020, 12, 3104.
Maimaitijiang, M.; Sagan, V.; Sidike, P.; Daloye, A.M.; Erkbol, H.; Fritschi, F.B. Crop Monitoring Using Satellite/UAV Data Fusion and Machine Learning. Remote Sens. 2020, 12, 1357.
Liu, Y.; Zhang, Y.; Jiang, D.; Zhang, Z.; Chang, Q. Quantitative Assessment of Apple Mosaic Disease Severity Based on Hyperspectral Images and Chlorophyll Content. Remote Sens. 2023, 15, 2202.

Figure 1. Measured LCC across the five growth seasons in this study. A one-way ANOVA was performed on the mean LCC. Different letters (“a” and “b”) indicate a significant difference in mean LCC between seasons. N denotes the number of samples.

Figure 2. Correlation coefficients between all measured LCC values and the spectral parameters. FODS (740.7) indicates that the highest correlation in the first-order differential spectrum occurs at 740.7 nm.

Figure 3. Correlation coefficients between LCC and FODS in each of the five growth seasons. (a–e) represent May, June, August, October, and December, respectively.

Figure 4. Importance ranking of the 10 sensitive parameters. (a–e) represent May, June, August, October, and December, respectively.

Figure 5. Measured versus predicted LCC, plotted against the 1:1 line, for the RFR model. (a–e) represent May, June, August, October, and December, respectively.

Figure 6. Measured versus predicted LCC, plotted against the 1:1 line, for the KNNR model. (a–e) represent May, June, August, October, and December, respectively.

Figure 7. Measured versus predicted LCC, plotted against the 1:1 line, for the SVR model. (a–e) represent May, June, August, October, and December, respectively.

Table 1. The 50 selected spectral parameters examined in this study, along with their band-specific formulations and corresponding principal references.

No.	Name	Formula or Definition	Reference
1	Anthocyanin Reflectance Index 1	ARI1 = 1/R₅₅₀ − 1/R₇₀₀	[40]
2	Anthocyanin Reflectance Index 2	ARI2 = R₈₀₀(1/R₅₅₀ − 1/R₇₀₀)	[41]
3	Green Normalized Difference Vegetation Index hyper 1	GNDVIhyper1 = (R₇₅₀ − R₅₅₀)/(R₇₅₀ + R₅₅₀)	[41]
4	Green Normalized Difference Vegetation Index hyper 2	GNDVIhyper2 = (R₈₀₀ − R₅₅₀)/(R₈₀₀ + R₅₅₀)	[41]
5	Modified Normalized Difference Vegetation Index	mNDVI₇₀₅ = (R₇₅₀ − R₇₀₅)/(R₇₅₀ + R₇₀₅ − 2R₄₄₅)	[41]
6	Canopy Chlorophyll Index	CCI = (R₇₇₇ − R₇₄₇)/R₆₇₃	[41]
7	Vogelmann Index 2	VOG2 = (R₇₃₄ − R₇₄₇)/(R₇₁₅ + R₇₂₆)	[41]
8	Carter1	Carte1 = R₆₉₅/R₄₂₀	[42]
9	Carter2	Carte2 = R₆₉₅/R₇₆₀	[42]
10	Carter3	Carte3 = R₆₀₅/R₇₆₀	[42]
11	Carter4	Carte4 = R₇₁₀/R₇₆₀	[42]
12	Carter5	Carte5 = R₆₉₅/R₆₇₀	[42]
13	Datt1	Datt1 = (R₈₅₀ − R₇₁₀)/(R₈₅₀ − R₆₈₀)	[43]
14	Datt2	Datt2 = R₈₅₀/R₇₁₀	[43]
15	Datt3	Datt3 = R₇₅₄/R₇₀₄	[43]
16	Enhanced Vegetation Index	EVI = 2.5 × (R₈₀₀ − R₆₇₀)/(R₈₀₀ + 6R₆₇₀ − 7.5R₄₇₅ + 1)	[44]
17	Modified Chlorophyll Absorption in Reflectance Index	MCARI = ((R₇₀₀ − R₆₇₀) − 0.2 × (R₇₀₀ − R₅₅₀))(R₇₀₀/R₆₇₀)	[45]
18	Modified Triangular Vegetation Index 1	MTVI1 = 1.2 × (1.2 × (R₈₀₀ − R₅₅₀) − 2.5 × (R₆₇₀ − R₅₅₀))	[46]
19	Normalized Difference Cloud Index	NDCI = (R₇₆₂ − R₅₂₇)/(R₇₆₂ + R₅₂₇)	[47]
20	Plant Senescence Reflectance Index	PSRI = (R₆₇₈ − R₅₀₀)/R₇₅₀	[48]
21	Renormalized Difference Vegetation Index	RDVI = (R₈₀₀ − R₆₇₀)/ $\sqrt{R_{800} + R_{670}}$	[49]
22	Red-Edge Position Linear Interpolation	REP = 700 + 40 × ((R₆₇₀ + R₇₈₀)/2 − R₇₀₀)/(R₇₄₀ − R₇₀₀)	[50]
23	Spectral Polygon Vegetation Index 1	SPVI1 = 0.4 × 3.7 × (R₈₀₀ − R₆₇₀) − 1.2 × |R₅₃₀ − R₆₇₀|	[51]
24	Simple Ratio Pigment Index	SRPI = R₄₃₀/R₆₈₀	[52]
25	Simple Ratio 440/690	SR(440,690) = R₄₄₀/R₆₉₀	[53]
26	Simple Ratio 700/670	SR(700,670) = R₇₀₀/R₆₇₀	[54]
27	Simple Ratio 750/550	SR(750,550) = R₇₅₀/R₅₅₀	[54]
28	Simple Ratio 750/700	SR(750,700) = R₇₅₀/R₇₀₀	[55]
29	Simple Ratio 750/710	SR(750,710) = R₇₅₀/R₇₁₀	[56]
30	Simple Ratio 752/690	SR(752,690) = R₇₅₂/R₆₉₀	[56]
31	Simple Ratio 800/680	SR(800,680) = R₈₀₀/R₆₈₀	[57]
32	Transformed Chlorophyll Absorption in Reflectance Index	TCARI = 3 × ((R₇₀₀ − R₆₇₀) − 0.2 × (R₇₀₀ − R₅₅₀)(R₇₀₀/R₆₇₀))	[58]
33	Optimized Soil Adjusted Vegetation Index	OSAVI = (1 + 0.16) × (R₈₀₀ − R₆₇₀)/(R₈₀₀ + R₆₇₀ + 0.16)	[59]
34	Transformed Chlorophyll Absorption in Reflectance Index/Optimized Soil Adjusted Vegetation Index	TCARI/OSAVI = $\frac{3 \times ((R_{700} - R_{670}) - 0.2 (R_{700} - R_{550}) (R_{700} / R_{670}))}{(1 + 0.16) (R_{800} - R_{670}) / (R_{800} + R_{670} + 0.16)}$	[41]
35	Triangular Vegetation Index	TVI = 0.5 × (120 × (R₇₅₀ − R₅₅₀) − 200 × (R₆₇₀ − R₅₅₀))	[60]
36	Leaf Chlorophyll Index	LCI = $\frac{|R_{850}| - |R_{710}|}{|R_{850}| - |R_{680}|}$	[61]
37	Structure Insensitive Pigment Index 1	SIPI1 = (R₈₀₀ − R₄₄₅)/(R₈₀₀ − R₆₈₀)	[62]
38	Structure Insensitive Pigment Index 2	SIPI2 = (R₈₀₀ − R₅₀₅)/(R₈₀₀ − R₆₉₀)	[62]
39	Structure Insensitive Pigment Index 3	SIPI3 = (R₈₀₀ − R₄₇₀)/(R₈₀₀ − R₆₈₀)	[62]
40	Red-Edge Ratio Vegetation Index	RERVI = R₈₄₀/R₇₁₇	[63]
41	Red-Edge Normalized Difference Vegetation Index	RENDVI = (R₈₄₀ − R₇₁₇)/(R₈₄₀ + R₇₁₇)	[64]
42	Green Ratio Vegetation Index	GRVI = R₈₄₀/R₅₆₀	[63]
43	MERIS Terrestrial Chlorophyll Index	MTCI = (R₇₅₃ − R₇₀₈)/(R₇₀₈ − R₆₈₁)	[65]
44	Chlorophyll Index Green	CI-green = (R₇₈₀/R₅₅₀) − 1	[66]
45	Ratio Vegetation Index	RVI = R₇₆₅/R₇₂₀	[67]
46	FODS	First-order differential spectrum	[39]
47	SDr	First-order differential spectral integration in the wavelength range of 680~760 nm	[68]
48	SDb	First-order differential spectral integration in the wavelength range of 490~530 nm	[68]
49	SDr/SDb	Ratio of the red edge area to the blue edge area	[68]
50	(SDr − SDb)/(SDr + SDb)	Normalized value of the red edge area and the blue edge area	[68]

Note: R denotes spectral reflectance at the subscripted wavelength (nm); r and b denote the red edge and the blue edge, respectively. Nos. 1–45 are the VIs, No. 46 is the FODS, and Nos. 47–50 are the TEPs.
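The definitions in Table 1 translate directly into arithmetic on a reflectance spectrum. The sketch below is illustrative only and is not the authors' processing code. The helper names (`reflectance_at`, `spectral_indices`, `first_derivative`, `edge_area`), the nearest-band lookup, the central-difference derivative, the trapezoidal integration, and the synthetic spectrum are all assumptions made for this example.

```python
from bisect import bisect_left


def reflectance_at(wavelengths, reflectance, target_nm):
    """Return reflectance at the sampled band closest to target_nm.

    Assumes wavelengths (nm) are sorted in ascending order.
    """
    i = bisect_left(wavelengths, target_nm)
    if i == 0:
        return reflectance[0]
    if i == len(wavelengths):
        return reflectance[-1]
    # Choose whichever neighbouring band is closer to the requested wavelength.
    before, after = wavelengths[i - 1], wavelengths[i]
    return reflectance[i] if after - target_nm < target_nm - before else reflectance[i - 1]


def spectral_indices(wavelengths, reflectance):
    """Compute four of the Table 1 indices for a single leaf spectrum."""
    def R(nm):
        return reflectance_at(wavelengths, reflectance, nm)

    return {
        "VOG2": (R(734) - R(747)) / (R(715) + R(726)),
        "Carte4": R(710) / R(760),
        "SR(750,710)": R(750) / R(710),
        "MTCI": (R(753) - R(708)) / (R(708) - R(681)),
    }


def first_derivative(wavelengths, reflectance):
    """First-order differential spectrum at interior bands (central differences)."""
    return [
        (reflectance[i + 1] - reflectance[i - 1]) / (wavelengths[i + 1] - wavelengths[i - 1])
        for i in range(1, len(wavelengths) - 1)
    ]


def edge_area(wavelengths, reflectance, lo_nm, hi_nm):
    """Trapezoidal integral of the first derivative over [lo_nm, hi_nm]."""
    inner = wavelengths[1:-1]  # bands at which the derivative is defined
    d = first_derivative(wavelengths, reflectance)
    area = 0.0
    for i in range(len(inner) - 1):
        if inner[i] >= lo_nm and inner[i + 1] <= hi_nm:
            area += 0.5 * (d[i] + d[i + 1]) * (inner[i + 1] - inner[i])
    return area


# Synthetic spectrum, invented for illustration: flat at 0.05 up to 680 nm,
# a linear red-edge ramp to 0.50 at 760 nm, and flat beyond that.
wl = list(range(400, 901))
refl = [0.05 if w <= 680 else 0.5 if w >= 760 else 0.05 + 0.45 * (w - 680) / 80 for w in wl]

idx = spectral_indices(wl, refl)
sdr = edge_area(wl, refl, 680, 760)  # red edge area (SDr)
sdb = edge_area(wl, refl, 490, 530)  # blue edge area (SDb)
```

On this synthetic spectrum SDb is zero, so SDr/SDb is undefined; with measured spectra a guard against a near-zero denominator would be a sensible addition.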

Table 2. Performance of ULR and selected parameters across five growth seasons.

Growth Seasons	Models	R²	RMSE	Parameter
May	y = 49.680 + 6.1210 × x₁	0.20	8.88	FODS (647.2)
June	y = 55.974 + 10.689 × x₁	0.61	6.85	Datt1
August	y = 52.045 + 14.207 × x₁	0.69	8.92	MTCI
October	y = 52.131 − 14.277 × x₁	0.66	8.81	Carte4
December	y = 50.612 + 15.170 × x₁	0.72	9.72	FODS (730.2)

Note: FODS (647.2) and FODS (730.2) represent the first-order differential spectra at 647.2 nm and 730.2 nm, respectively. x₁ is the parameter used in the best-fitting model, as listed in the Parameter column.
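For readers who want to reproduce the kind of fit summarized in Table 2, the following is a minimal sketch of an ordinary least-squares univariate fit together with R² and RMSE. It assumes that R² is computed as 1 − SS_res/SS_tot and RMSE as the square root of the mean squared residual; this section does not state the exact definitions, the data split, or any parameter scaling the authors used. The function names and the sample values are invented for illustration.

```python
from math import sqrt


def fit_ulr(x, y):
    """Closed-form least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r2_rmse(y_true, y_pred):
    """Return (R², RMSE) for paired observed and predicted values."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot, sqrt(ss_res / n)


# Made-up sample: x stands in for one spectral parameter, y for measured LCC.
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 8.0]

intercept, slope = fit_ulr(x, y)
pred = [intercept + slope * xi for xi in x]
r2, rmse = r2_rmse(y, pred)
```

Computing both metrics on held-out samples rather than on the fitting data gives a fairer picture of predictive accuracy; the sketch evaluates on the fitting data only to stay short.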

Table 3. Performance of MLR and selected parameters across five growth seasons.

Growth Seasons	Models	R²	RMSE
May	y = 49.710 − 7.616 × x₁ − 11.644 × x₂ − 10.097 × x₃	0.29	8.59
June	y = 55.819 + 2.387 × x₁ − 11.107 × x₂ − 2.689 × x₃	0.64	7.93
August	y = 52.875 − 5.880 × x₁ + 13.748 × x₂ − 6.317 × x₃	0.68	8.79
October	y = 52.274 − 29.580 × x₁ − 27.842 × x₂ − 16.227 × x₃	0.77	7.70
December	y = 51.278 − 21.262 × x₁ − 33.347 × x₂ − 27.079 × x₃	0.76	8.56

Note: x₁, x₂, and x₃ represented the parameters of the best-fitting model. In May: x₁ was VOG2, x₂ was SR (750,710), and x₃ was Carte4; in June: x₁ was (SDr − SDb)/(SDr + SDb), x₂ was VOG2, and x₃ was SR (750,710); in August: x₁ was SDr/SDb, x₂ was MTCI, and x₃ was Carte4; in October: x₁ was SR (750,710), x₂ was Carte4, and x₃ was VOG2; in December: x₁ was VOG2, x₂ was SR (750,710), and x₃ was Carte4.
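The fitted equations in Table 3 can be evaluated mechanically once the three parameters are available. The sketch below simply encodes the published intercepts and coefficients; the names `MLR_MODELS`, `MLR_PARAMETERS`, and `predict_lcc` are hypothetical. One caveat is an assumption on our part: this section does not state how the inputs were scaled before fitting, so x₁ to x₃ must be supplied on whatever scale the authors used.

```python
# Intercept and (x1, x2, x3) coefficients, copied from Table 3.
MLR_MODELS = {
    "May": (49.710, (-7.616, -11.644, -10.097)),
    "June": (55.819, (2.387, -11.107, -2.689)),
    "August": (52.875, (-5.880, 13.748, -6.317)),
    "October": (52.274, (-29.580, -27.842, -16.227)),
    "December": (51.278, (-21.262, -33.347, -27.079)),
}

# Order of x1, x2, x3 for each season, as listed in the note to Table 3.
MLR_PARAMETERS = {
    "May": ("VOG2", "SR(750,710)", "Carte4"),
    "June": ("(SDr-SDb)/(SDr+SDb)", "VOG2", "SR(750,710)"),
    "August": ("SDr/SDb", "MTCI", "Carte4"),
    "October": ("SR(750,710)", "Carte4", "VOG2"),
    "December": ("VOG2", "SR(750,710)", "Carte4"),
}


def predict_lcc(season, x1, x2, x3):
    """Evaluate the Table 3 equation for one season.

    Inputs must follow the season's parameter order in MLR_PARAMETERS and
    the (unstated) scaling used when the model was fitted.
    """
    intercept, coefficients = MLR_MODELS[season]
    return intercept + sum(c * x for c, x in zip(coefficients, (x1, x2, x3)))


# With all inputs at zero, the prediction reduces to the intercept.
october_baseline = predict_lcc("October", 0.0, 0.0, 0.0)
```

Because the parameter order changes from season to season, keeping the order next to the coefficients, as `MLR_PARAMETERS` does here, helps avoid feeding a value into the wrong term.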

Table 4. Performance of the machine learning algorithms in estimating LCC with different parameters.

Growth Seasons		RFR	KNNR	SVR	Parameters
May	R²	0.42	0.34	0.33	Carte4, FODS (647.2)
May	RMSE	7.80	8.51	8.38	Carte4, FODS (647.2)
June	R²	0.69	0.62	0.58	VOG2, Carte4
June	RMSE	7.34	8.03	8.52	VOG2, Carte4
August	R²	0.67	0.64	0.59	VOG2, SR (750,710)
August	RMSE	8.97	9.59	9.94	VOG2, SR (750,710)
October	R²	0.83	0.78	0.73	VOG2, Carte4
October	RMSE	6.67	7.94	8.43	VOG2, Carte4
December	R²	0.83	0.71	0.71	FODS (730.2), (SDr − SDb)/(SDr + SDb)
December	RMSE	7.13	9.18	9.36	FODS (730.2), (SDr − SDb)/(SDr + SDb)

Note: FODS (647.2) and FODS (730.2) represent the first-order differential spectra at 647.2 nm and 730.2 nm, respectively.
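Of the three learners compared in Table 4, KNNR is the simplest to spell out in a few lines. The following sketch shows the general technique on two input parameters, matching the two-parameter setups in the table. It is not the authors' implementation: the function name `knn_predict`, the choice of k, the Euclidean distance, the unweighted mean, and the training values are all assumptions, and this section does not report the hyperparameters actually used.

```python
from math import dist


def knn_predict(train_X, train_y, query, k=3):
    """Predict a value as the mean target of the k nearest training samples.

    train_X holds feature tuples (for example, two spectral parameters),
    train_y the matching measured values, and query one feature tuple.
    """
    ranked = sorted(zip(train_X, train_y), key=lambda pair: dist(pair[0], query))
    nearest = ranked[:k]
    return sum(target for _, target in nearest) / len(nearest)


# Invented training set: each tuple stands in for two spectral parameters.
train_X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
train_y = [10.0, 20.0, 30.0, 100.0]

estimate = knn_predict(train_X, train_y, (0.1, 0.1), k=3)
```

Because the prediction depends on raw distances, parameters with very different numeric ranges would normally be rescaled to a common range first; otherwise the parameter with the largest range dominates the neighbour search.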

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
