Regional Scale Inversion of Chlorophyll Content of Dendrocalamus giganteus by Multi-Source Remote Sensing

Xia, Cuifen; Zhou, Wenwu; Shu, Qingtai; Wu, Zaikun; Xu, Li; Yang, Huanfen; Qin, Zhen; Wang, Mingxing; Duan, Dandan

doi:10.3390/f15071211


by Cuifen Xia ¹, Wenwu Zhou ¹, Qingtai Shu ¹,*, Zaikun Wu ¹, Li Xu ¹, Huanfen Yang ¹, Zhen Qin ¹, Mingxing Wang ¹ and Dandan Duan ²

¹ College of Forestry, Southwest Forestry University, Kunming 650224, China

² Information Technology Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China

* Author to whom correspondence should be addressed.

Forests 2024, 15(7), 1211; https://doi.org/10.3390/f15071211

Submission received: 28 May 2024 / Revised: 30 June 2024 / Accepted: 10 July 2024 / Published: 12 July 2024

(This article belongs to the Section Forest Inventory, Modeling and Remote Sensing)


Abstract:

Measuring chlorophyll with a spectrophotometer is costly, time-consuming, laborious and destructive to the plant. Chlorophyll is also lost while samples are transported, and the method yields data only at sample points, which makes estimating chlorophyll content at the regional level difficult. To improve estimation accuracy, this study proposes a new method of collaborative chlorophyll inversion using Landsat 8 and Global Ecosystem Dynamics Investigation (GEDI) data. Specifically, the chlorophyll content data set is combined with factors from the two preprocessed remote-sensing (RS) sources to construct three regression models, a support vector machine (SVM), a BP neural network (BP) and a random forest (RF), and the best model is selected for inversion. In addition, the ordinary Kriging (OK) method is used to interpolate the GEDI point attribute data into surface attribute data for modeling. The results showed the following: (1) The chlorophyll model of a single plant was y = 0.1373x^1.7654. (2) The optimal semi-variance function models of pai, pgap_theta and pgap_theta_a3 are exponential models. (3) The three factors most strongly correlated with the chlorophyll content were B2_3_SM, B2_3_HO and B2_5_EN for Landsat 8, and pai, pgap_theta and pgap_theta_a3 for GEDI. (4) The combination of Landsat 8 imagery and GEDI gave the highest modeling accuracy, and RF performed best, with R², RMSE and P values of 0.94, 0.18 g/m² and 83.32%, respectively. This study shows that Landsat 8 images and GEDI can reliably be used to retrieve the chlorophyll content of Dendrocalamus giganteus (D. giganteus), revealing the potential of multi-source RS data in the inversion of forest ecological parameters.

Keywords:

RS; machine learning; chlorophyll content; Dendrocalamus giganteus; inversion

1. Introduction

Chlorophyll is an important biochemical index of plants. Chlorophyll content directly reflects the photosynthetic capacity, nitrogen fixation capacity and health status of vegetation, and it is an important index for evaluating forest growth and stand yield [1]. Accurately estimating forest chlorophyll content therefore helps in understanding forest ecosystem functions and forest health. Bamboo forests are very important in maintaining the balance of forest ecosystems, mitigating global warming and promoting carbon sequestration [2]. Among bamboos, Dendrocalamus giganteus (D. giganteus) is one of the largest species in the world, so retrieving its chlorophyll content at a regional scale is key to understanding its growth status and ecosystem functions. Traditionally, chlorophyll content is measured in the laboratory with a spectrophotometer. This method destroys plant tissue, and chlorophyll is lost while samples are transported [3]. Moreover, it operates at the leaf scale and yields only sample point data, which limits its use in forestry. By contrast, RS technology is non-destructive, real-time and efficient, and when combined with ground survey data it can greatly improve the efficiency and precision of vegetation chlorophyll content retrieval at a regional scale. Numerous studies have shown that this combination is both effective and feasible [4,5,6].

At present, chlorophyll content inversion mostly relies on optical RS data (such as the Landsat series of the United States [7,8,9]), while few studies have used the spaceborne LiDAR GEDI for this purpose. The reflectance and spectral indices of optical sensors are sensitive to the horizontal distribution of chlorophyll content. Chlorophyll mainly absorbs red and blue-violet light, with absorption maxima in the 420~663 nm range, and strongly reflects green light [10,11]. However, Landsat 8 image acquisition is susceptible to weather conditions and saturation problems [12,13,14], and previous studies have mainly focused on extracting spectral features from RS images [15,16] and on exploring the relationship between vegetation indices and plant chlorophyll concentration, ignoring texture features, which can improve interpretation accuracy [9]. Using multi-spectral Landsat 5 and Landsat 8 data, Yang Y et al. [9] applied RF regression to construct an inversion model relating texture features and spectral indices to lake chlorophyll-a concentration. Their results show that texture features are significantly correlated with lake chlorophyll-a concentration, which suggests that RS image texture features have good potential for estimating biochemical indicators. Compared with ICESat-2 (Ice, Cloud and land Elevation Satellite-2), GEDI has the advantage of wide coverage. Compared with the optical RS satellites above, spaceborne LiDAR has the advantage that the emitted laser beam can penetrate the forest canopy, so the three-dimensional canopy structure and the underlying topography can be determined from the echo waveform data. Its weakness is that the horizontal structure parameters of the forest are difficult to obtain, and because the footprints are discrete, the forest information is spatially incomplete.
At present, GEDI data are mostly used in the inversion of forest canopy height, leaf area index, biomass, etc. Xu L et al. [17] used GEDI spaceborne LiDAR data to estimate the biomass of oak forests in Shangri-La. The research shows that modis_treecover, rv, sensitivity and oak tree biomass have obvious correlations, and the oak forests biomass estimation model established by random forest has the best accuracy. At the same time, it shows that the series of indicators contained in GEDI L2B product data can also show good model interpretation accuracy with the forest structure parameters and the vegetation biochemical parameters.

From the perspective of domestic and foreign research, it is rare to use optical RS data and spaceborne LiDAR data to estimate the chlorophyll content of bamboo. Therefore, in this study, Landsat 8 data and spaceborne LiDAR GEDI data were used to work together. Using the unique optical properties of chlorophyll, the statistical relationship between the chlorophyll content and RS characteristic bands, as well as the statistical relationship between the GEDI L2B product data indicators and the chlorophyll content were analyzed. The inversion model was established by RF to improve the inversion accuracy of the chlorophyll content.

Currently, the inversion methods of the chlorophyll content based on RS data mainly involve empirical models, physical models and coupling models. The empirical model method (including parametric and non-parametric models) is convenient, fast and easy to operate, with high efficiency and good accuracy, but the optimization of feature combinations needs further study [11,18]. The coupling model combines the empirical model and the physical model, which can maximize the advantages of the statistical model, but the operation is time-consuming and inefficient [19]. Machine learning algorithms, as a novel modeling approach, are not constrained by a fixed model framework. They can learn iteratively from feedback errors during model correction, which helps capture the intricate relationship between independent and dependent variables [20]. Chlorophyll content estimation based on machine learning can be divided into two steps: (1) analysis of the relationships between the chlorophyll concentration and the characteristic variables; (2) calculation of the chlorophyll concentration from the fitted functional relationship [21]. Machine learning algorithms usually perform well in chlorophyll concentration inversion because they can handle high-dimensional nonlinear problems [22,23]. For example, models such as RF [24], neural networks [25] and the genetic algorithm-optimized simplified support vector machine (GA-SVM) have been shown to perform well in chlorophyll-a estimation [26]. RF is one of the most popular machine learning methods, and it has performed better than other machine learning methods in forest biochemical parameter estimation [24,25,26,27].
Although there have been many studies on RS monitoring of chlorophyll concentration using machine learning algorithms [26], a general method has not been proposed to achieve long-term monitoring of chlorophyll concentration, and the inversion model has poor universality [27]. However, so far, the cooperative operation of RS satellite and spaceborne LiDAR, along with the use of the more mature RF algorithm to estimate the chlorophyll content of D. giganteus, is rare in forestry applications. Real-time, fast and accurate inversion of the chlorophyll content has become one of the urgent problems for forestry researchers.

Therefore, this study takes Xinping County, Yunnan Province, which has a large number of large, clumped D. giganteus, as the primary test area and uses machine learning to estimate the chlorophyll content and to evaluate the potential of multi-source RS data collaboration in chlorophyll content inversion. The specific objectives of this study are the following: (1) to derive a model for retrieving chlorophyll content from a single D. giganteus plant; (2) to establish an optimal model for inverting the chlorophyll content of D. giganteus; (3) to create a distribution map of D. giganteus plants in the study area, utilizing the optimal attributes derived from multispectral Landsat 8 and LiDAR GEDI satellite data. The feasibility of estimating the chlorophyll content of D. giganteus through the collaborative use of multi-source RS data is evaluated, providing a reference for the inversion of chlorophyll content of D. giganteus at medium and large regional scales.

2. Materials and Methods

2.1. Study Area

Xinping Yi and Dai Autonomous County (Xinping County, 23°38′15″–24°26′05″ N, 101°16′30″–102°16′50″ E) belongs to Yuxi City in Yunnan Province and is located at the eastern foot of the middle section of Ailao Mountain (Figure 1) [28]. The terrain is mainly mountainous, high in the northwest and low in the southeast. The highest altitude in the county is 3165.9 m and the lowest is 422 m. Because of this altitude difference, Xinping County has three climate types: a dry-hot valley high-temperature area, a semi-mountain warm-temperature area and an alpine cold-temperature area [29]. The annual precipitation is 869 mm, the annual maximum and minimum temperatures are 32.8 °C and 1.3 °C, respectively, and the annual average temperature is 18.1 °C. The county belongs to the subtropical low-latitude plateau monsoon climate zone. D. giganteus grows well at altitudes of 300~1200 m, where the daily average surface temperature is approximately 18 to 26 °C [30]. The forest coverage rate of Xinping County is 61.99%, and the forest land area is 3.18 × 10⁵ hm². Of this, arbor forest land covers 2.78 × 10⁵ hm² (87.47%), bamboo forest land 1.49 × 10⁴ hm² (4.67%), shrub land 2.27 × 10⁴ hm² (7.14%) and other forest land 2.29 × 10³ hm² (0.72%). D. giganteus, which belongs to the family Poaceae [31], is one of the largest bamboo species in the world, and Yunnan is one of its main distribution areas. Its average height is about 30 m, its diameter at breast height (DBH) is about 15 cm and its wall thickness is about 14 mm. The area of bamboo forest in China is about 7.02 × 10⁶ hm². Therefore, accurate estimation of the chlorophyll content of D. giganteus at regional scale can help monitor vegetation and ecological environment changes, as well as promote ecological protection, resource management and disaster monitoring. 
In addition, it can provide a scientific basis for local governments to guide forestry production and land planning, promoting sustainable development.

2.2. Acquisition and Processing of Measured Data

The 35 measured chlorophyll content values of D. giganteus used in this study were obtained from a survey of standard sample plots of 30 m × 30 m (0.09 hm²) in Jiasa Town, Shuitang Town and Laochang Township in Xinping County (Figure 1). In Yunnan, the rainy season lasts from May to October, and the dry season from November to April of the following year. Rainfall is highest from mid-June to mid-August, accounting for about 60% of the annual total. With abundant rainfall and a suitable climate, this is the main growth period of D. giganteus and the most representative period for changes in its chlorophyll content. After weighing the weather, road safety and the representativeness and typicality of the field plots, and finding that the weather was sunny from 11 July to 16 July, we carried out the field survey and collected the experimental data from 12 July to 14 July. The DBH, coordinates and other factors were measured, and the materials required for the experiment (including standard D. giganteus and standard leaves) were collected. The diameter of the DBH was set to 5 cm, and the D. giganteus with a DBH of 10 cm was used as the standard plant. The coordinates of each square sample plot were measured using the southern surveying and mapping Real-Time Kinematic (RTK) differential locator in the fixed solution state. The latitude, longitude and altitude of the plot center were recorded once the error was less than 2 cm.

2.2.1. Selection and Measurement of Standard D. giganteus Plants

In the 35 sample plots within the distribution range of D. giganteus, 137 single plants with good growth and no pests, diseases or mechanical damage were randomly selected as standard samples (about 4 plants per plot on average). Each selected standard plant was cut at the base, and its DBH was measured. The leaves collected from the 137 plants were preserved and brought back to the laboratory for weighing. The measurement indices of individual D. giganteus plants are shown in Table 1.

2.2.2. Standard Leaf Selection and Sample Determination

A total of 49 plants were randomly selected from the 137 standard samples of D. giganteus, and new, young and old leaves of different sizes were taken from the upper, middle and lower parts as standard sample leaves. The surface of each sampled leaf was wiped clean, and tissue from the upper, middle and lower parts on both sides of the midrib (with the midrib removed) was evenly sampled, weighed, cut into pieces and ground to powder in liquid nitrogen. Then 80% acetone (C₃H₆O) was added, and the mixture was ground to a homogenate, filtered and brought to a constant volume. The chloroplast pigment extract was poured into a cuvette with a 1 cm optical path. With 80% acetone as the blank control, the absorbance was measured with a spectrophotometer at 663 nm and 645 nm. The statistics of the chlorophyll content in the leaves of D. giganteus are shown in Table 2. The specific method of chlorophyll determination in standard leaves of D. giganteus is based on the principle and technology of plant physiological and biochemical experiments [32]. The calculation formulas are as follows:

C_{a} = 12.72 A_{663} - 2.59 A_{645}

(1)

C_{b} = 22.88 A_{645} - 4.67 A_{663}

(2)

C_{T} = C_{a} + C_{b} = 20.29 A_{645} + 8.05 A_{663}

(3)

where A₆₆₃ and A₆₄₅ are the absorbance of chlorophyll solution at 663 nm and 645 nm, C_a, C_b and C_T are chlorophyll-a, chlorophyll-b and total chlorophyll content, respectively, in mg/g.
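As a minimal sketch, Equations (1)–(3) can be applied to a pair of absorbance readings as follows. The function name and the example readings are hypothetical and serve only to show that the combined coefficients in Equation (3) follow from adding Equations (1) and (2).

```python
def chlorophyll_concentrations(a663: float, a645: float) -> dict:
    """Apply Equations (1)-(3) to absorbance readings at 663 nm and 645 nm."""
    c_a = 12.72 * a663 - 2.59 * a645  # chlorophyll-a, Equation (1)
    c_b = 22.88 * a645 - 4.67 * a663  # chlorophyll-b, Equation (2)
    c_t = c_a + c_b                   # total chlorophyll, Equation (3)
    return {"C_a": c_a, "C_b": c_b, "C_T": c_t}


# Hypothetical absorbance readings, not taken from the study.
result = chlorophyll_concentrations(a663=0.50, a645=0.20)
```

For these readings, C_T agrees with 20.29 × 0.20 + 8.05 × 0.50, the right-hand side of Equation (3).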

2.2.3. Measurement of Chlorophyll of D. giganteus at Plot Scale

The chlorophyll content of 137 individual plants of D. giganteus was used as the dependent variable and DBH as the independent variable to establish a power function relationship between the two. Based on the allometric growth equation, the chlorophyll–DBH model was constructed as the basic model of chlorophyll of D. giganteus, and then the chlorophyll content of D. giganteus in 35 plots was calculated by this model (Table 3).

2.3. RS Image Data Acquisition and Information Extraction

2.3.1. Landsat 8 OLI Image Data

NASA launched the Landsat 8 satellite in February 2013. It carries the Operational Land Imager (OLI) and the Thermal Infrared Sensor (TIRS) and is among the most widely used civilian Earth observation satellites. Landsat 8 has 11 bands in total: bands 1–7 and 9 (OLI) are multispectral bands (30 m), band 8 (OLI) is the panchromatic band (15 m), and bands 10 and 11 (TIRS) are thermal infrared bands (100 m). In this study, the visible bands (B2~B4) and the near-infrared band (B5) were used. The Landsat 8 OLI image (24 September 2021, path/row 130/43), which covers the entire study area, was obtained from the Geospatial Data Cloud (http://www.gscloud.cn/search, accessed on 5 July 2023). The image was preprocessed in ENVI 5.6 by radiometric calibration, FLAASH atmospheric correction and geometric correction. Subsequently, 73 RS factors were extracted [33]. Together with three topographic factors extracted from DEM data [34], this gave 76 characteristic variables, which were used to construct the initial characteristic variable set at the regional scale (Table 4). Texture features were extracted with the gray-level co-occurrence matrix (GLCM) method, yielding the mean (ME), variance (VA), homogeneity (HO), contrast (CO), dissimilarity (DI), entropy (EN), second moment (SM) and correlation (CR). The processing windows were 3 × 3 and 5 × 5, the step size was 1 × 1, and the gray levels were quantized to 64 levels.
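To illustrate how GLCM texture statistics arise from a moving window, the sketch below builds a co-occurrence matrix for one small window and derives five of the eight features. It is an illustration only, not the ENVI implementation: the symmetric counting, the horizontal offset, the 4 gray levels and the window values are all assumptions.

```python
import math


def glcm(window, levels, dx=1, dy=0):
    """Symmetric, normalized gray-level co-occurrence matrix for offset (dx, dy)."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(window), len(window[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = window[r][c], window[r2][c2]
                counts[i][j] += 1.0
                counts[j][i] += 1.0  # count each pair in both directions
    total = sum(sum(row) for row in counts)
    return [[v / total for v in row] for row in counts]


def texture_features(p):
    """A few GLCM statistics computed from the normalized matrix p."""
    cells = [(i, j, v) for i, row in enumerate(p) for j, v in enumerate(row)]
    return {
        "HO": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),      # homogeneity
        "CO": sum(v * (i - j) ** 2 for i, j, v in cells),            # contrast
        "DI": sum(v * abs(i - j) for i, j, v in cells),              # dissimilarity
        "EN": -sum(v * math.log(v) for _, _, v in cells if v > 0),   # entropy
        "SM": sum(v * v for _, _, v in cells),                       # second moment
    }


# Hypothetical 3 x 3 window already quantized to 4 gray levels (the study used 64).
window = [[0, 0, 1],
          [0, 1, 1],
          [2, 2, 3]]
features = texture_features(glcm(window, levels=4))

# A perfectly uniform window has maximal homogeneity and zero contrast and entropy.
flat = texture_features(glcm([[2, 2, 2]] * 3, levels=4))
```

In practice the window slides across the band pixel by pixel, and each feature value is written to the window's center pixel.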

2.3.2. GEDI Data

GEDI was launched to the International Space Station (ISS) on 5 December 2018 and collects data between 51.6° N and 51.6° S. The GEDI instrument consists of three lasers that produce eight beam ground tracks. Each track is sampled by footprints about 25 m in diameter, spaced about 60 m apart along the track. Adjacent tracks are about 600 m apart in the across-track direction, giving a swath width of about 4.2 km. GEDI data are distributed at four product levels. L1 provides geolocated return energy waveform data, while L2 includes geolocated surface elevation and canopy height information. Additionally, L3 offers gridded vegetation structure details, and L4 consists of footprint-level and gridded aboveground biomass data.

This study uses the GEDI L2B product, which provides richer information than L2A, such as pai, leaf_on_doy and cover. The GEDI data were downloaded from the Earthdata website (https://search.earthdata.nasa.gov/, accessed on 27 June 2023) and were acquired from 1 January 2022 to 31 December 2022. All 35 orbital data beams in the study area were selected based on Xinping County’s vector boundary. To retain only high-quality footprints with complete information, invalid footprints were filtered out using five GEDI indicators (Table 5), namely lon_lowestmode, lat_lowestmode, quality_flag, sensitivity and degraded_flag, following previous research [17].

After the target points were screened out by the above 5 indicators, there remained 58,421 valid footprint points in the study area, of which 1689 were D. giganteus woodlands and 56,732 non-D. giganteus woodlands (Figure 2). GEDI L2B product data information can be accessed from GEDI L2B product data dictionary, containing 35 modelling alternatives and 5 quality screening indicators.
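The screening step can be sketched as a simple per-footprint filter. The thresholds below are assumptions, since Table 5 is not reproduced here: a quality flag of 1, a degraded flag of 0 and a sensitivity of at least 0.9 are commonly used choices, and a bounding box stands in for the county's vector boundary. The dictionary keys follow the indicator names used in the text, and the example records are hypothetical.

```python
# Bounding box of Xinping County in decimal degrees, converted from the
# coordinates in Section 2.1: (lon_min, lat_min, lon_max, lat_max).
STUDY_AREA = (101.2750, 23.6375, 102.2806, 24.4347)


def is_valid_footprint(fp, bounds=STUDY_AREA, min_sensitivity=0.9):
    """Return True if a footprint passes all five screening indicators.

    Thresholds are assumed values, not the ones listed in Table 5.
    """
    lon_min, lat_min, lon_max, lat_max = bounds
    return (
        lon_min <= fp["lon_lowestmode"] <= lon_max
        and lat_min <= fp["lat_lowestmode"] <= lat_max
        and fp["quality_flag"] == 1       # assumed: 1 marks a usable waveform
        and fp["degraded_flag"] == 0      # assumed: 0 marks non-degraded geolocation
        and fp["sensitivity"] >= min_sensitivity
    )


# Hypothetical footprint records for illustration only.
footprints = [
    {"lon_lowestmode": 101.90, "lat_lowestmode": 24.00,
     "quality_flag": 1, "degraded_flag": 0, "sensitivity": 0.96, "pai": 2.3},
    {"lon_lowestmode": 101.95, "lat_lowestmode": 24.05,
     "quality_flag": 0, "degraded_flag": 0, "sensitivity": 0.97, "pai": 1.1},
    {"lon_lowestmode": 103.00, "lat_lowestmode": 24.10,
     "quality_flag": 1, "degraded_flag": 0, "sensitivity": 0.98, "pai": 3.0},
]
valid = [fp for fp in footprints if is_valid_footprint(fp)]
```

Only the first record survives: the second fails the quality flag and the third lies outside the bounding box.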

In this study, we also used the 2016 secondary inventory data of forest resources in Xinping County for masking the extraction of D. giganteus forest land.

2.4. Research Method

The main steps for constructing a single D. giganteus chlorophyll model based on traditional methods of chlorophyll estimation and conducting region-scale chlorophyll inversion are as follows: (1) construction of chlorophyll model of single D. giganteus; (2) selection and accuracy evaluation of interpolation model; (3) correlation analysis and importance ranking of feature parameters; (4) construction and accuracy evaluation of the chlorophyll content estimation model (Figure 3). The specific process involves fitting the basic model of chlorophyll content in D. giganteus using the basic information of individual D. giganteus. The chlorophyll content at the plot level is calculated by the basic model, which is used as the training sample (dependent variable) for modeling. The characteristic value of the characteristic variable factor at the corresponding sample site is extracted from the remote sensing data as the modeling sample (independent variable), and the chlorophyll content prediction model of D. giganteus at the regional scale is constructed using the dependent variable and the independent variable, and then the chlorophyll content of D. giganteus in the study area is inverted.

2.4.1. Construction of Chlorophyll Model of Single D. giganteus Plant

The allometric growth equation (also known as the relative growth equation) was used to establish a chlorophyll model for a single bamboo plant, and this model was taken as the baseline for estimating the chlorophyll content of single D. giganteus plants in the study area. The total chlorophyll of the 35 plots was then calculated and used for regional-scale chlorophyll inversion. The relative growth equation is as follows:

{Chl}_{T} = a \cdot {DBH}^{b}

(4)

where Chl_T represents the chlorophyll content of a single D. giganteus plant, DBH represents the diameter at breast height of a single D. giganteus plant, and a and b are the parameters to be estimated for the single-plant model.
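One common way to estimate a and b in Equation (4) is ordinary least squares on the log-transformed variables, since ln(Chl_T) = ln(a) + b ln(DBH). The sketch below assumes that approach, which the text does not specify. The DBH values are hypothetical, and the chlorophyll values are synthetic, generated without noise from the model reported in the abstract (y = 0.1373x^1.7654) so that the fit can be checked against known parameters.

```python
import math


def fit_power_law(dbh, chl):
    """Estimate a and b in Chl_T = a * DBH**b by least squares in log-log space."""
    xs = [math.log(d) for d in dbh]
    ys = [math.log(c) for c in chl]
    n = len(xs)
    x_mean, y_mean = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    b = sxy / sxx                       # slope in log-log space
    a = math.exp(y_mean - b * x_mean)   # back-transformed intercept
    return a, b


# Hypothetical DBH values; chlorophyll is synthetic and noise-free.
dbh = [6.0, 8.0, 10.0, 12.0, 15.0]
chl = [0.1373 * d ** 1.7654 for d in dbh]
a, b = fit_power_law(dbh, chl)
```

With real, noisy measurements a log-log fit minimizes relative rather than absolute error, so a direct nonlinear least-squares fit can give slightly different parameters.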

2.4.2. Geostatistics Method

Semi-variogram

The semi-variogram, also known as the semi-variance function, is a basic tool in geostatistics. It was first proposed by Matheron in 1963, who defined it as the mathematical expectation of the square of the increment of a regionalized variable. The research object of this paper is chlorophyll content. The chlorophyll content of each pixel can be considered as a regionalized variable affected by its characteristic factors. The functional relationship between independent variables and the chlorophyll content can be used to describe the spatial distribution characteristics of chlorophyll [35,36]. Therefore, in this study, GS+ 9.0 software was used to fit the optimal semi-variance model of each parameter. The semi-variance function is defined as follows:

r (h) = \frac{1}{2 N (h)} \sum_{i = 1}^{N (h)} {\{z (x_{i}) - z (x_{i} + h)\}}^{2}

(5)

where r(h) is the value of the semi-variance function, N(h) is the number of point pairs separated by the distance h, and z(x_i) is the value of the regionalized variable (here, the pixel chlorophyll) at location x_i.
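Equation (5) can be evaluated empirically by grouping point pairs into distance bins. The following sketch assumes planar coordinates and equal-width lag bins. The function name, bin width and sample values are hypothetical, and GS+ may bin pairs differently.

```python
import math


def empirical_semivariogram(points, values, lag, n_lags):
    """Return (mean distance, r(h), N(h)) for each non-empty lag bin, per Equation (5)."""
    sq_sums = [0.0] * n_lags
    dist_sums = [0.0] * n_lags
    counts = [0] * n_lags
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = math.dist(points[i], points[j])
            k = int(d // lag)  # index of the lag bin this pair falls into
            if k < n_lags:
                sq_sums[k] += (values[i] - values[j]) ** 2
                dist_sums[k] += d
                counts[k] += 1
    return [
        (dist_sums[k] / counts[k], sq_sums[k] / (2 * counts[k]), counts[k])
        for k in range(n_lags)
        if counts[k] > 0
    ]


# Four hypothetical sample points on a line, one distance unit apart.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
values = [1.0, 2.0, 4.0, 7.0]
bins = empirical_semivariogram(points, values, lag=1.0, n_lags=3)
```

A theoretical model (linear, spherical, exponential or Gaussian) is then fitted to these binned values.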

Ordinary Kriging interpolation method

The Ordinary Kriging (OK) interpolation technique is one of the most commonly utilized methods in geostatistical analysis. In this study, ArcGIS Pro 2.8 software was utilized to provide unbiased estimations by combining the structural characteristics of sampling data and the semi-variance function of regionalized variables. According to the principle of minimum variance, the weighting coefficient of unsampled points is determined, and the data of unsampled points are obtained by the interpolation method [37]. This involves using different sample data in the study area to estimate the unmeasured data in the study area, defined as the following formula:

Z_{θ}^{*} (x_{0}) = \sum_{i = 1}^{n} λ_{i} Z (x_{i})

(6)

where Z_θ^*(x_0) is the Kriging estimate at the unsampled location x_0, n is the number of GEDI footprints, λ_i is the weight assigned to the i-th footprint, x_i is the location of the i-th footprint, and Z(x_i) is the known value at that sample point.
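The weights λ_i in Equation (6) come from a linear system built from the semi-variance model, with one extra row that forces the weights to sum to one (the unbiasedness condition). The sketch below is a small standard-library illustration of that system, not the ArcGIS Pro implementation. The exponential model parameters, footprint coordinates and pai values are hypothetical.

```python
import math


def exponential_model(h, nugget, sill, a):
    """Exponential semi-variance model with range parameter a; zero at h = 0."""
    return 0.0 if h == 0 else nugget + (sill - nugget) * (1.0 - math.exp(-h / a))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def ordinary_kriging(points, values, target, gamma):
    """Return the OK estimate at target and the weights, per Equation (6)."""
    n = len(points)
    A = [[gamma(math.dist(p, q)) for q in points] + [1.0] for p in points]
    A.append([1.0] * n + [0.0])  # unbiasedness constraint: weights sum to 1
    b = [gamma(math.dist(p, target)) for p in points] + [1.0]
    weights = solve(A, b)[:n]    # the last unknown is the Lagrange multiplier
    return sum(w * z for w, z in zip(weights, values)), weights


# Hypothetical footprints at the corners of a 60 m square, with made-up pai values.
points = [(0.0, 0.0), (60.0, 0.0), (0.0, 60.0), (60.0, 60.0)]
values = [2.1, 2.4, 1.9, 2.6]
gamma = lambda h: exponential_model(h, nugget=0.0, sill=1.0, a=500.0)
estimate, weights = ordinary_kriging(points, values, (30.0, 30.0), gamma)
```

Because the target sits at the center of the square, all four footprints receive the same weight and the estimate is the plain average of the four values.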

Interpolation accuracy evaluation

In this study, GS+ 9.0 software was used to derive the optimal semi-variance model for each parameter. The fit is judged first by the R² and the residual sum of squares (RSS), then by the nugget effect, and finally by the range and sill. The larger the R² and the range, and the smaller the RSS, the nugget and the nugget effect, the better the semi-variance function model fits [38].

Then, in the geostatistics module of ArcGIS Pro 2.8, OK interpolation was used to convert the GEDI point attribute data into surface attribute data. The interpolation was evaluated by the mean error (ME), root mean square error (RMSE), mean standardized error (MSE), root mean square standardized error (RMSSE), average standard error (ASE) and the coefficient of determination (R²) [39]. The evaluation indices are defined as follows:

ME = \frac{\sum_{i = 1}^{n} [\hat{Z}(x_{i}) - Z(x_{i})]}{n}

(7)

RMSE = \sqrt{\frac{\sum_{i = 1}^{n} {[\hat{Z}(x_{i}) - Z(x_{i})]}^{2}}{n}}

(8)

MSE = \frac{\sum_{i = 1}^{n} [\hat{Z}(x_{i}) - Z(x_{i})] / \hat{σ}(x_{i})}{n}

(9)

RMSSE = \sqrt{\frac{\sum_{i = 1}^{n} {\{[\hat{Z}(x_{i}) - Z(x_{i})] / \hat{σ}(x_{i})\}}^{2}}{n}}

(10)

ASE = \sqrt{\frac{\sum_{i = 1}^{n} {\hat{σ}}^{2}(x_{i})}{n}}

(11)

where \hat{Z}(x_i) is the predicted value at location x_i, Z(x_i) is the observed value at location x_i, \hat{σ}(x_i) is the prediction standard error at location x_i, and n is the number of footprint points.
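The five cross-validation indices can be computed together from the predicted values, the observed values and the prediction standard errors. The sketch below uses the usual definitions, with a square root in RMSSE and squared standard errors in ASE. The function name and the example numbers are hypothetical.

```python
import math


def interpolation_metrics(predicted, observed, std_errors):
    """Cross-validation indices ME, RMSE, MSE, RMSSE and ASE."""
    n = len(observed)
    errors = [p - o for p, o in zip(predicted, observed)]
    standardized = [e / s for e, s in zip(errors, std_errors)]
    return {
        "ME": sum(errors) / n,
        "RMSE": math.sqrt(sum(e * e for e in errors) / n),
        "MSE": sum(standardized) / n,
        "RMSSE": math.sqrt(sum(s * s for s in standardized) / n),
        "ASE": math.sqrt(sum(s * s for s in std_errors) / n),
    }


# Hypothetical cross-validation results at three footprints.
metrics = interpolation_metrics(
    predicted=[2.0, 3.0, 5.0],
    observed=[1.0, 3.0, 6.0],
    std_errors=[1.0, 2.0, 1.0],
)
```

A well-calibrated interpolation has ME and MSE near 0, RMSSE near 1, and ASE close to RMSE.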

2.4.3. Establishing a Regional Chlorophyll Estimation Model

In order to use different model methods to compare the estimation accuracy of two different RS data and the synergy of two different RS data, this study selected the more traditional non-parametric SVM, BP and RF model, combined with RS characteristic variables and field sample survey data, to develop a chlorophyll content estimation model for D. giganteus in the study location.

Support vector machine regression (SVR) is a regression technique derived from the SVM. The basic idea is to find a function whose predictions deviate from the true values of all training samples by no more than a given threshold and which, subject to this condition, is as smooth as possible. This study trained the SVM regression model with the fitrsvm function in the Statistics and Machine Learning Toolbox of MATLAB R2023a. A Gaussian kernel was selected, and the hyperparameters were optimized by random search [40]: candidate values are sampled randomly within predefined ranges, and the performance of each sampled point is evaluated by cross-validation.

BP neural network is a commonly used feed-forward artificial neural network. It trains parameters by the back-propagation algorithm, so as to realize tasks such as pattern recognition and function approximation. The BP network consists of an input layer, a hidden layer and an output layer, with each layer containing multiple neuron nodes that are connected to all nodes in the previous layer. The main training process includes two stages: forward propagation and back propagation. In this study, the MLPRegressor module under the Neural Network Toolbox built in MATLAB R2023a software was used to realize the regression modeling of BP neural network.

RF is an ensemble learning algorithm based on decision trees, which averages or votes over the results of many different, independently grown sub-trees to produce the final result of the model. The randomness in RF mainly comes from the random selection of samples and feature variables [41]. In the training phase, several new training sets are drawn from the original training set by bootstrap sampling (sampling with replacement), each with the same number of samples as the original. A given sample may therefore be selected several times or not at all, so each tree is trained on only part of the original data, which reduces the risk of overfitting [42]. Two important parameters control the model: the number of sub-trees (n_estimators) and the maximum number of features (max_features). The number of sub-trees affects the accuracy of the model. When the amount of data is large, increasing the number of sub-trees makes the model's accuracy more reliable. When the amount of data is small, too many sub-trees add computation and reduce the efficiency of the model. Similarly, the maximum number of features should be adjusted to the feature dimension of the training samples. The modeling parameters should therefore be selected according to the number of samples [43]. In this study, the RF model was built with the RandomForestRegressor class of scikit-learn (sklearn), together with pandas and NumPy, in Python 3.10, and was used to construct the estimation model and invert the chlorophyll content of the bamboo in the study area.

2.4.4. Model Evaluation

To verify the prediction accuracy of the RS estimation model of chlorophyll content, leave-one-out cross-validation (LOOCV) was used. Because the number of ground survey samples is limited, this method reduces the potential error introduced by splitting the data into training and validation samples. In LOOCV, each sample in turn is held out for validation while the remaining samples are used for training, so every sample is used for both modeling and validation without ever appearing in both sets at the same time, and the problem of local optimum in the fitting model is effectively avoided [44,45]. Compared with K-fold cross-validation, LOOCV involves no random partition, so its results are reproducible and more robust, which helps in detecting over-fitting or under-fitting of the estimation model [45].
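The LOOCV procedure can be sketched independently of the model being validated. In the example below, a one-feature linear regression stands in for the SVM, BP and RF models so that the block needs only the standard library. The helper names and the data are hypothetical.

```python
def loocv_predictions(X, y, fit, predict):
    """Hold out each sample once, train on the rest, and predict the held-out sample."""
    predictions = []
    for i in range(len(y)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        model = fit(X_train, y_train)
        predictions.append(predict(model, X[i]))
    return predictions


def fit_line(X, y):
    """Stand-in model: least-squares line on the first feature."""
    xs = [row[0] for row in X]
    n = len(xs)
    x_mean, y_mean = sum(xs) / n, sum(y) / n
    slope = (sum((x - x_mean) * (t - y_mean) for x, t in zip(xs, y))
             / sum((x - x_mean) ** 2 for x in xs))
    return slope, y_mean - slope * x_mean


def predict_line(model, x):
    slope, intercept = model
    return slope * x[0] + intercept


# Hypothetical, noise-free data lying exactly on y = 2x + 1.
X = [[1.0], [2.0], [3.0], [4.0], [5.0]]
y = [3.0, 5.0, 7.0, 9.0, 11.0]
preds = loocv_predictions(X, y, fit_line, predict_line)
```

Any regressor can be plugged in through the same two callables. With 35 plots the loop trains 35 models, each on 34 samples.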

In this study, to evaluate the feasibility and performance of the RS estimation model of chlorophyll content in D. giganteus, three indicators were selected: R², the root mean square error (RMSE) and the overall prediction accuracy (P). The closer the R² value is to 1, the better the goodness of fit of the model; the smaller the RMSE value and the larger the P value, the higher the accuracy of the model. Otherwise, the model fit is poor. The evaluation indices are defined as follows:

R^{2} = \frac{\sum_{i = 1}^{n} {({\hat{y}}_{i} - \bar{y})}^{2}}{\sum_{i = 1}^{n} {(y_{i} - \bar{y})}^{2}}

(12)

\mathrm{RMSE} = \sqrt{\frac{\sum_{i = 1}^{n} {(y_{i} - {\hat{y}}_{i})}^{2}}{n}}

(13)

P = \left(1 - \frac{\mathrm{RMSE}}{\bar{y}}\right) \times 100\%

(14)

where n is the number of samples, {\hat{y}}_{i} is the predicted value of the model, y_{i} is the measured value of chlorophyll, and \bar{y} is the mean of the measured values, \bar{y} = \frac{1}{n} \sum_{i = 1}^{n} y_{i}.
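Equations (12) to (14) can be written directly in numpy. The sketch below follows the formulas as printed, with ȳ taken as the mean of the measured values; the function names and the three-value example are illustrative assumptions.

```python
import numpy as np

def r2_eq12(y, y_hat):
    """Eq. (12): explained sum of squares over total sum of squares."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    y_bar = y.mean()
    return np.sum((y_hat - y_bar) ** 2) / np.sum((y - y_bar) ** 2)

def rmse_eq13(y, y_hat):
    """Eq. (13): root mean square error."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return np.sqrt(np.mean((y - y_hat) ** 2))

def p_eq14(y, y_hat):
    """Eq. (14): overall prediction accuracy in percent."""
    y = np.asarray(y, float)
    return (1.0 - rmse_eq13(y, y_hat) / y.mean()) * 100.0

# Illustrative values in g/m^2 (not measurements from this study).
measured = [0.40, 0.50, 0.60]
perfect = [0.40, 0.50, 0.60]
print(r2_eq12(measured, perfect), rmse_eq13(measured, perfect), p_eq14(measured, perfect))
```

Note that Eq. (12) in this form is not the same as 1 − SSE/SST for non-linear learners, so values computed with it may differ from those reported by library functions such as a generic R² score.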

3. Results and Analysis

3.1. Single Plant D. giganteus Model

In this study, a power function model was constructed with the DBH of a single D. giganteus plant as the independent variable and its chlorophyll content as the dependent variable. A scatter plot of single-plant chlorophyll against diameter was drawn from the modeling data (Figure 4), and the fitted model was y = 0.1373x^1.7654. The R² value indicated a high degree of fit, and the RMSE and P values indicated high accuracy. The chlorophyll content of individual D. giganteus plants was positively correlated with the DBH; that is, under given growth conditions, the chlorophyll content increases as the DBH increases, which is in line with the actual growth of D. giganteus.
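A power function y = a·x^b becomes a straight line after taking logarithms, ln y = ln a + b·ln x, so it can be fitted by ordinary least squares on the log-transformed data. The sketch below uses that approach, which is an assumption (the fitting procedure is not stated in this section), and checks it against the coefficients reported in the abstract. The DBH values are hypothetical.

```python
import numpy as np

def fit_power(x, y):
    """Fit y = a * x**b by least squares on ln(y) = ln(a) + b * ln(x)."""
    slope, intercept = np.polyfit(np.log(x), np.log(y), 1)
    return np.exp(intercept), slope

# Hypothetical DBH values; chlorophyll generated from the reported model.
dbh = np.array([6.0, 8.0, 10.0, 12.0, 14.0])
chl = 0.1373 * dbh ** 1.7654

a, b = fit_power(dbh, chl)
print(f"a = {a:.4f}, b = {b:.4f}")
```

With real, noisy measurements the log-log fit and a direct non-linear fit generally give slightly different coefficients, because they minimise errors on different scales.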

3.2. Selection of Semivariogram Models

In this study, GS+ 9.0 software was used to fit the semi-variance function with linear, spherical, exponential and Gaussian models, and the nugget, sill, nugget effect and range were calculated for each parameter. To ensure model accuracy, the semi-variance function model with the highest R² and the smallest residual sum of squares (RSS) was selected as optimal. The fitted model parameters are shown in Table 6. A nugget effect of less than 25% indicates strong spatial autocorrelation, and one greater than 75% indicates weak spatial autocorrelation [46]. In this study, except for the linear model, the spherical, exponential and Gaussian models of pai, pgap_theta and pgap_theta_a3 all had a nugget effect of less than 25%. The extracted GEDI parameters therefore have strong spatial autocorrelation and can be analyzed by OK interpolation.
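The model comparison described here can be sketched as follows. The three functions use common textbook forms with nugget c0, partial sill c and range parameter a; software packages differ in how they scale the range parameter, so these forms are an assumption rather than the exact GS+ definitions. The "moderate" class for 25% to 75% and all numeric values are also illustrative.

```python
import numpy as np

def spherical(h, c0, c, a):
    """Spherical model: reaches the sill c0 + c at distance a."""
    h = np.asarray(h, dtype=float)
    rising = c0 + c * (1.5 * h / a - 0.5 * (h / a) ** 3)
    return np.where(h < a, rising, c0 + c)

def exponential(h, c0, c, a):
    """Exponential model: approaches the sill asymptotically."""
    h = np.asarray(h, dtype=float)
    return c0 + c * (1.0 - np.exp(-h / a))

def gaussian(h, c0, c, a):
    """Gaussian model: parabolic near the origin."""
    h = np.asarray(h, dtype=float)
    return c0 + c * (1.0 - np.exp(-(h / a) ** 2))

def nugget_effect(c0, c):
    """Nugget as a percentage of the sill (c0 + c)."""
    return 100.0 * c0 / (c0 + c)

def autocorrelation_class(ratio_percent):
    if ratio_percent < 25.0:
        return "strong"
    if ratio_percent > 75.0:
        return "weak"
    return "moderate"  # intermediate class assumed for completeness

def rss(observed, fitted):
    return float(np.sum((np.asarray(observed) - np.asarray(fitted)) ** 2))

# Pretend empirical semivariances at four lag distances (generated, not measured).
lags = np.array([50.0, 100.0, 200.0, 400.0])
gamma_obs = exponential(lags, 0.1, 0.9, 120.0)
models = {"spherical": spherical, "exponential": exponential, "gaussian": gaussian}
scores = {name: rss(gamma_obs, f(lags, 0.1, 0.9, 120.0)) for name, f in models.items()}
best = min(scores, key=scores.get)
print(best, autocorrelation_class(nugget_effect(0.1, 0.9)))
```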

3.3. Verification of Kriging Interpolation Results

The Kriging interpolation results of pai, pgap_theta and pgap_theta_a3 were fully evaluated by the cross-validation method, and the results showed that all indexes met the evaluation principles (Table 7). This shows that the OK interpolation method performs well in accuracy and spatial prediction, revealing the feasibility of using the OK interpolation method to interpolate point data into polygon surface attribute information (Figure 5).
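To make the interpolation step concrete, the sketch below solves the ordinary Kriging system for a single target location with numpy. It assumes an exponential semivariogram with hypothetical parameters and four hypothetical sample points; it is an illustration of the technique, not a reproduction of the ArcGIS workflow used in the study.

```python
import numpy as np

def exp_variogram(h, nugget, psill, a):
    """Exponential semivariogram with gamma(0) = 0."""
    h = np.asarray(h, dtype=float)
    return np.where(h > 0, nugget + psill * (1.0 - np.exp(-h / a)), 0.0)

def ordinary_kriging(xy, z, xy0, nugget, psill, a):
    """Return (estimate, kriging variance, weights) at one location xy0."""
    xy, z, xy0 = np.asarray(xy, float), np.asarray(z, float), np.asarray(xy0, float)
    n = len(z)
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    # Left-hand side: semivariances between samples plus the unbiasedness row/column.
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = exp_variogram(d, nugget, psill, a)
    A[n, n] = 0.0
    # Right-hand side: semivariances between samples and the target.
    b = np.ones(n + 1)
    b[:n] = exp_variogram(np.linalg.norm(xy - xy0, axis=1), nugget, psill, a)
    sol = np.linalg.solve(A, b)
    weights, mu = sol[:n], sol[n]
    return float(weights @ z), float(weights @ b[:n] + mu), weights

# Four hypothetical footprints on the corners of a unit square.
pts = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
vals = [1.0, 2.0, 3.0, 4.0]
est_centre, var_centre, w = ordinary_kriging(pts, vals, [0.5, 0.5], 0.1, 0.9, 2.0)
est_at_sample, _, _ = ordinary_kriging(pts, vals, [0.0, 0.0], 0.1, 0.9, 2.0)
print(est_centre, w.sum(), est_at_sample)
```

The weights sum to one by construction, and the estimate at a sampled location returns the sampled value, which are two quick checks on any Kriging implementation.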

3.4. Variable Screening Results and Importance

Considering the small sample size in this study, the three characteristic parameters most strongly correlated with the chlorophyll content were selected from each of the two data sources as modeling variables. Figure 6 and Figure 7 show the top three parameters for Landsat 8 and GEDI, with a significance level above 0.01: B2_3_SM, B2_3_HO and B2_5_EN for Landsat 8, and pai, pgap_theta and pgap_theta_a3 for GEDI. Among the Landsat 8 parameters, B2_3_SM had the highest correlation with the chlorophyll content, at 0.503. Among the GEDI parameters, pai had the highest correlation, at 0.346.
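The screening step can be sketched as a ranking by Pearson correlation. The example ranks by absolute correlation, which is an assumption, and uses four made-up feature columns (f0 to f3) rather than the study's variables.

```python
import numpy as np

def top_k_by_pearson(features, target, names, k=3):
    """Rank feature columns by |Pearson r| with the target and keep the top k."""
    r = np.array([np.corrcoef(features[:, j], target)[0, 1]
                  for j in range(features.shape[1])])
    order = np.argsort(-np.abs(r))[:k]
    return [(names[j], float(r[j])) for j in order]

# Made-up data: three features related to the target and one that is not.
t = np.arange(10, dtype=float)
alternating = np.array([1.0, -1.0] * 5)
features = np.column_stack([t, t ** 2, alternating, t ** 3])
names = ["f0", "f1", "f2", "f3"]

selected = top_k_by_pearson(features, t, names, k=3)
print(selected)
```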

After the optimal parameters were selected by Pearson correlation analysis, the feature importance of the variables was evaluated by random forest. It can be seen from Figure 8 that the importance of variable characteristics using single Landsat 8 or GEDI data is different from that using two kinds of data. The contribution of B2_5_EN and B2_3_SM is the largest, and the contribution of pgap_theta is the lowest. The importance of the characteristics of the two data sources, from high to low, is B2_5_EN, B2_3_SM, pai, pgap_theta_a3, B2_3_HO, pgap_theta. The contribution rates were 29.503%, 24.852%, 16.873%, 10.701%, 9.389% and 8.682%, respectively.
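A sketch of how such contribution rates can be obtained from a fitted forest is given below. It uses the impurity-based feature_importances_ attribute of scikit-learn, which is an assumption about the importance measure; the data are synthetic, so the resulting ranking is not the study's. The last lines only check that the six reported rates add up to 100%.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

names = ["B2_3_SM", "B2_3_HO", "B2_5_EN", "pai", "pgap_theta", "pgap_theta_a3"]

# Synthetic predictors and response (assumed), one column per variable name.
rng = np.random.default_rng(2)
X = rng.normal(size=(35, len(names)))
y = 0.5 + 0.3 * X[:, 2] + 0.2 * X[:, 0] + rng.normal(scale=0.05, size=35)

rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

# Importances are normalised to sum to 1; express them as percentages.
rates = 100.0 * rf.feature_importances_
ranking = sorted(zip(names, rates), key=lambda item: -item[1])
for name, rate in ranking:
    print(f"{name}: {rate:.3f}%")

# Contribution rates reported in the text.
reported = [29.503, 24.852, 16.873, 10.701, 9.389, 8.682]
reported_total = sum(reported)
```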

3.5. Accuracy Evaluation of Model

3.5.1. Analysis of Regional-Scale Chlorophyll Content Model Estimation Results

With the help of SVM, BP neural network and RF algorithms, the optimal chlorophyll content estimation model was constructed by using the characteristic variables selected by Landsat 8 and GEDI, respectively (Figure 9). The modeling results showed that the RF regression model is better than the SVM and BP neural network models. The R² of RF (c), (f) and (i) were 0.81, 0.88 and 0.94, respectively. The RMSE values were 0.12 g/m², 0.09 g/m² and 0.08 g/m², respectively. The P values were 80.19%, 82.45% and 83.32%, respectively. Among them, the model fitting effect and accuracy of GEDI feature variables participating in modeling were better than those of Landsat 8, and the results of Landsat 8 and GEDI collaborative participation in modeling were better than those of the first two. Therefore, this study verified the reliability of the collaborative operation of Landsat 8 and GEDI and the use of random forests to establish a chlorophyll estimation model.

3.5.2. Spatial Distribution of Chlorophyll Content

The optimal model established by Landsat 8 and GEDI RS data was used to estimate the chlorophyll content of D. giganteus in Xinping County (Table 8). The distribution range of chlorophyll content in D. giganteus was 0.24~1.02 g/m². The average chlorophyll content was about 0.52 g/m², accounting for about 50.67% of the total area of D. giganteus in the study area. The maximum chlorophyll content was about 1.02 g/m², and the minimum was about 0.24 g/m². Throughout the study area, the distribution of the highest and lowest chlorophyll content alternated, indicating significant regional variations (Figure 10). The area with the highest chlorophyll content is mainly concentrated in the border area of three townships, namely, Shuitang Township, Laochang Township and Gasa Township, as well as in the area close to the Gasa River and the western part of the Mosha River, while the eastern area, which is inhabited by a large number of people and with a small distribution of D. giganteus, has a lower value of chlorophyll content. In addition, from the spatial distribution map, the chlorophyll content range corresponding to different chroma bands can be observed to be mainly concentrated between 0.40 and 0.70. The number and proportion of pixels in this interval are the highest, suggesting that the growth status of D. giganteus in Xinping County is relatively good.
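Pixel counts and proportions per chlorophyll class, of the kind summarised in Table 8, can be derived from the predicted raster by binning. In the sketch below the pixel values are random numbers within the reported range, and the class boundaries of 0.40 and 0.70 g/m² are hypothetical, chosen only because the text mentions that interval.

```python
import numpy as np

# Synthetic pixel values within the reported range of 0.24 to 1.02 g/m^2.
rng = np.random.default_rng(3)
chl = rng.uniform(0.24, 1.02, size=10_000)

# Hypothetical class boundaries; the actual boundaries are those of Table 8.
breaks = np.array([0.40, 0.70])
labels = ["low", "medium", "high"]

cls = np.digitize(chl, breaks)            # 0: < 0.40, 1: 0.40 to < 0.70, 2: >= 0.70
counts = np.bincount(cls, minlength=len(labels))
share = 100.0 * counts / counts.sum()

for label, n_pix, pct in zip(labels, counts, share):
    print(f"{label}: {n_pix} pixels ({pct:.2f}%)")
```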

4. Discussion

This study examined the collaborative use of multi-source RS data to estimate the chlorophyll content of D. giganteus. A power function model of the chlorophyll content of single D. giganteus plants was established, the chlorophyll content of each plot was calculated with the single-plant model, and the chlorophyll content in the study area was estimated with the RF machine learning algorithm. This approach can reduce costs, improve efficiency and estimation accuracy, and provide a reference for the long-term monitoring of vegetation chlorophyll content. Combining spaceborne LiDAR (such as GEDI, ICESat-2) and optical (such as Landsat, Sentinel) RS data offers researchers a novel approach to estimating vegetation chlorophyll content. The primary challenges in estimating the chlorophyll content with GEDI data and Landsat 8 OLI data were the discreteness of the GEDI footprint points and the resolution mismatch between GEDI and Landsat 8 images. This study addressed both difficulties to improve the accuracy of chlorophyll content inversion. The results show that multi-source RS data have great potential for collaboratively estimating chlorophyll content at the county scale and provide a reference for application at medium and large regional scales.

4.1. The Potential of Multi-Source RS Data to Estimate Chlorophyll Content

A single RS data source can no longer meet the requirements of medium- and large-scale chlorophyll content retrieval. Optical RS is susceptible to light saturation effects [18,43]. High-resolution images (such as GF, QuickBird, IKONOS) are costly to acquire and differ greatly in resolution from the spaceborne LiDAR GEDI data [47]; discrepancies in pixel scale among image data, and between the pixel scale and the plot scale, introduce uncertainty into the estimation of chlorophyll content and reduce its accuracy. Although spaceborne LiDAR is less affected by the light saturation effect and can capture the vertical structure of the forest, its light-spot data are discrete and discontinuous in space [47,48]. In view of this, this study used Landsat 8 OLI optical satellite data and GEDI L2B spaceborne LiDAR data to invert the chlorophyll content in the study area, which compensated for the limitations of a single data source and addressed the large differences between pixel scales and between pixel and sample scales.

In view of the discrete and discontinuous distribution of spaceborne LiDAR GEDI L2B product data, this study used a more mature geostatistical method for OK interpolation. In order to ensure the validity of the model fitting results, the effective light spots were divided according to the ratio of 8:2 before OK interpolation; that is, 80% of the light-spot data were used as the training set, and the remaining 20% of the light-spot data were used as the verification set for OK interpolation results. Firstly, the semi-variance function analysis was carried out on 80% of the training set data [49], and the most common linear, spherical, exponential and Gaussian models were used to fit the semi-variance function. In order to ensure the accuracy of the model, the model with strong spatial autocorrelation (a nugget effect less than 25%), the highest coefficient of determination (R²) and the smallest residual (RSS) was selected as the optimal semi-variance function model [50]. In this study, except for the linear model, the spherical, exponential and Gaussian models of pai, pgap_theta and pgap_theta_a3 all satisfied the optimal range of a nugget effect of less than 25%. According to the principle of maximum R² and minimum RSS, the exponential model of pai and pgap_theta is selected as the optimal semi-variance function model. In addition, considering the above factors, according to the principle of maximum R², pgap_theta_a3 finally chooses the exponential model as the optimal semi-variance function model. Secondly, OK interpolation is performed under ArcGIS 10.8. Finally, the interpolation results are verified by the cross-validation method. According to the index evaluation principle of Bostan P A et al. [39], the pai, pgap_theta and pgap_theta_a3 selected in this study all had values of ME and MSE close to 0. The values of RMSE and ASE were close to each other, and the RMSSE value was close to 1. The R² value was 0.63~0.71. 
This result is consistent with the verification results of Bargaoui et al. [51] and Qiao et al. [52], both of which studied biomass. Therefore, this study not only overcomes the discreteness of GEDI light-spot data but also confirms the feasibility of OK interpolation. Figure 9 shows that collaborative modeling with the two RS data sources fits better and is more accurate than models built from the variables of a single data source. The R² of the model increased from 0.81~0.89 to 0.94, the RMSE decreased from 0.09~0.12 g/m² to 0.08 g/m², and P increased from 80.19%~82.45% to 83.32%, laying a foundation for accurately estimating the chlorophyll status of the forest area and for further understanding the function and health of the forest ecosystem. In addition, this study confirmed that GEDI L2B data can be used not only to estimate and invert structural parameters such as canopy closure, biomass and carbon storage, but also to estimate and invert biochemical parameters such as chlorophyll content. This provides a research case for the estimation and inversion of chlorophyll content at meso and large scales and a scientific basis for the health monitoring of global forest ecosystems.
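The cross-validation indices named in this discussion (ME, MSE, RMSE, ASE and RMSSE) can be computed from the held-out observations, the Kriging predictions and the Kriging standard errors. The definitions below are those commonly used in geostatistical software and are an assumption here, since the section does not state them; in particular, MSE is taken as the mean standardized error. The three sample values are made up.

```python
import numpy as np

def kriging_cv_indices(observed, predicted, std_error):
    """Cross-validation diagnostics from predictions and kriging standard errors."""
    obs = np.asarray(observed, float)
    pred = np.asarray(predicted, float)
    se = np.asarray(std_error, float)
    err = pred - obs
    z = err / se  # standardized errors
    return {
        "ME": float(err.mean()),                      # should be close to 0
        "MSE": float(z.mean()),                       # mean standardized error, close to 0
        "RMSE": float(np.sqrt(np.mean(err ** 2))),    # should be close to ASE
        "ASE": float(np.sqrt(np.mean(se ** 2))),      # average standard error
        "RMSSE": float(np.sqrt(np.mean(z ** 2))),     # should be close to 1
    }

# Made-up validation values for three footprints.
idx = kriging_cv_indices([1.0, 2.0, 3.0], [1.1, 1.9, 3.0], [0.1, 0.1, 0.1])
print(idx)
```

An RMSSE below 1 suggests that the standard errors overstate the prediction uncertainty, and a value above 1 suggests that they understate it.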

4.2. Analysis of the Influence of Parameter Selection on Model Accuracy

In this study, the parameter selection includes the independent variable selection of the chlorophyll content model of a single plant and the independent variable selection of the chlorophyll content estimation model at the regional scale. The selection of independent variables for the chlorophyll content model of a single plant of D. giganteus is almost unexplored in previous studies, but scholars have performed similar research in the field of tree biomass. There is a significant correlation between the above-ground biomass of trees and their diameter at breast height. For field measurements, there is usually a large error in the measurement results of tree height, so height is not a better modeling parameter [50,53]. In order to avoid the error caused by including the height of D. giganteus and to improve the efficiency and feasibility of field measurement, it was found that the regression model with DBH as a single variable could more accurately reflect the trend in the aboveground biomass of different bamboo varieties [53,54,55]. Therefore, in this study, the DBH of D. giganteus was used as a single variable to establish a chlorophyll content model of D. giganteus. Compared with the traditional destructive sampling and estimation of chlorophyll content, the model established by the allometric growth equation has better universality, providing a favorable reference value for estimating the chlorophyll content of large, clustered D. giganteus and for forest health monitoring in the future.

Aiming at the selection of independent variables for the estimation model of the chlorophyll content of D. giganteus at the regional scale, a large number of previous studies have been limited to the study of single-band reflectance and band combination information in chlorophyll inversion [7]. These studies have often ignored the texture feature information in RS images, which is conducive to improving interpretation accuracy. In addition, they have overlooked the application of GEDI L2B RS data, which contains rich feature information useful in chlorophyll content inversion. From the results of this study (Figure 6), the correlation between texture features and chlorophyll content is better than the relationship between the vegetation index, single-band reflectance and chlorophyll content, being consistent with the results of Yang Y et al. [9]. The model established by GEDI L2B feature parameters is more accurate than the model established by Landsat 8 OLI feature parameters. As shown in Figure 9c,f, the model R² established by GEDI L2B feature parameters is 0.89, while the model R² established by Landsat 8 OLI feature parameters is 0.81, indicating that the optical RS data themselves have light saturation effects. At the same time, this study also confirms that GEDI L2B product data are not limited to the study of tree biomass and carbon storage. They can also be used in the application of mesoscale and large-scale chlorophyll content inversion.

4.3. Model Selection in Uncertainty Evaluation of Chlorophyll Content Estimation Accuracy

This study involves selecting a basic model for estimating the chlorophyll content of individual D. giganteus plants and a regional-scale model for estimating chlorophyll content. The allometric growth equation was chosen as the basic model of chlorophyll content per plant of D. giganteus. In previous studies, the allometric growth equation was mostly used to estimate the biomass, net output productivity and biogeochemical cycle budget in forest ecosystems [56]. A small number of scholars have established regression models of the relationship between the leaf area index and the DBH of a single tree to predict changes in the productivity of Robinia pseudoacacia forest [57]. Figure 4 shows a significant allometric growth relationship between the independent variable, DBH, and the dependent variable, chlorophyll content per plant. This indicates that the allometric growth equation can also serve as the basic model for estimating the chlorophyll content of individual plants.

For the selection of a regional-scale chlorophyll estimation model for D. giganteus, this study considered whether the selected model matches the number of available samples. The representativeness of a model is related to the number of modeling samples: the more samples there are, the more representative the estimation model is and the lower its uncertainty. However, once the number of modeling samples reaches a certain critical value, the accuracy of the estimation model no longer changes significantly. Therefore, to save manpower, material and financial resources while meeting the small sample principle (30) and the accuracy requirements of field investigation [54], this study surveyed 35 sample plots for RS modeling. According to previous research on the estimation of chlorophyll content, the more mature RF algorithm estimates chlorophyll content more accurately than other common parametric models (such as the partial least squares model or the multiple linear regression model) and non-parametric models (such as the SVM model or the K-nearest neighbor algorithm) [24,58]. The results of this study also indicate that the RF model provides the most accurate estimation of chlorophyll content. Therefore, it was selected as the RS model to estimate the chlorophyll content of D. giganteus in Xinping County. The chlorophyll content of D. giganteus in the study area was 0.24~1.02 g/m². At present, there are few studies on the chlorophyll content of bamboo plants, especially of D. giganteus. Compared with the results of Jin et al. [59] on the RS estimation of total chlorophyll content in wheat leaves, where R² was 0.868 and RMSE was 0.384 g/m², the estimation accuracy of this study is higher. Compared with the results of Richardson et al. [60] and Gitelson et al.
[61], which only studied the chlorophyll content of single leaves of higher plants, this study extrapolated the chlorophyll content of single plant leaves to the RS estimation of chlorophyll content in the study area, which provided an important reference value for the assessment of forest health and for the scientific management of forest resources. In addition, for the three levels of D. giganteus chlorophyll content, from the highest to the lowest, the number of samples was 14, 14 and 7, respectively, and the proportion of graded pixels to the total pixels was 22.55%, 66.93% and 10.62%, respectively (Table 8). These values indicate that the distribution of samples and pixels was relatively reasonable. At the same time, this reflects the representativeness of sampling and the rationality of the modeling results, so as to reduce the uncertainty and error transmission caused by sampling.

4.4. Limitations of Estimation of Chlorophyll Content in D. giganteus

In this study, the size of the sample plot was 30 m × 30 m, so the Landsat 8 data with a resolution of 30 m and the GEDI data with a spot footprint radius of 25 m were selected. In order to ensure consistency, this study resampled the spatial resolution of GEDI data to 30 m by Kriging interpolation [8]. Although the GEDI L2B has chlorophyll-related leaf area index, canopy cover and waveform vegetation energy values, its spot footprint points are discontinuous, and the amount of data is large. In order to solve this problem, the poor quality spots need to be eliminated before Kriging interpolation [62] and then interpolated into surface data to obtain the characteristic parameter information of the whole study area.
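The two preparatory steps mentioned here, removing poor-quality footprints and defining a 30 m target grid for interpolation, can be sketched as follows. The 0/1 quality flag, the footprint values and the bounding box are hypothetical; the actual screening rule used for the GEDI L2B footprints is not given in this section.

```python
import numpy as np

def keep_good_footprints(values, quality_flag):
    """Keep only footprints whose (hypothetical) quality flag equals 1."""
    values = np.asarray(values, float)
    quality_flag = np.asarray(quality_flag)
    return values[quality_flag == 1]

def grid_centres(xmin, ymin, xmax, ymax, cell=30.0):
    """Centres of square cells of size `cell` covering the bounding box."""
    xs = np.arange(xmin + cell / 2, xmax, cell)
    ys = np.arange(ymin + cell / 2, ymax, cell)
    gx, gy = np.meshgrid(xs, ys)
    return np.column_stack([gx.ravel(), gy.ravel()])

# Hypothetical pai values and quality flags for four footprints.
pai = np.array([1.2, 0.0, 2.5, 1.8])
flag = np.array([1, 0, 1, 1])
good = keep_good_footprints(pai, flag)

# A 90 m x 60 m box in projected coordinates gives 3 x 2 cells of 30 m.
centres = grid_centres(0.0, 0.0, 90.0, 60.0)
print(good, centres.shape)
```

Each row of centres is a location at which the Kriging estimate would be evaluated, so the interpolated surface shares the 30 m cell size of the Landsat 8 pixels.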

Furthermore, this study used only three common estimation models to estimate the chlorophyll content of D. giganteus. In future research, an appropriate parametric model or an optimized non-parametric model can be selected according to the number of samples, and the parameter selection can also be taken into account. It should also be assessed whether the parameters of the two RS data sources interfere with each other, so that the collaborative model remains better than a model built from a single RS data source. The optimal model can then be selected as the inversion model for chlorophyll content. To meet the demand for high-precision, high-resolution and large-scale chlorophyll content inversion, high-resolution GF, IKONOS, QuickBird, Sentinel-2 and hyperspectral data can be combined with GEDI or ICESat-2 data and then modeled and inverted at a unified resolution.

5. Conclusions

The purpose of this study was to explore the feasibility of using RS data from two different sensors, Landsat 8 and spaceborne LiDAR GEDI, to estimate the chlorophyll content of large sympodial D. giganteus in Xinping County. The results show that collaborative modeling with the two RS data sources gives a better model fit and higher accuracy than modeling with a single RS data source. The spatial distribution map of chlorophyll content is consistent with the sub-compartment data of the 2016 second-class survey of forest resources in Xinping County and with the distribution of large clustered D. giganteus resources in the study area. This shows, on the one hand, that multi-source RS data provide more comprehensive and more accurate vegetation information than single-source RS data and are better suited to improving the accuracy and reliability of the estimated chlorophyll content of D. giganteus. On the other hand, RS technology can quickly monitor large areas of forest vegetation around the world. Compared with traditional destructive sampling, it is more efficient, convenient and environmentally friendly, and the data are universal, which ensures the repeatability of chlorophyll content estimation. This provides a scientific basis for the monitoring and management of global vegetation ecosystems, promotes the application and development of RS technology for the ecological environment, and promotes the progress of global RS.

Author Contributions

C.X.: Conceptualization, formal analysis, investigation, methodology, software, writing—original draft. W.Z.: validation, methodology, software, resources visualization, writing—review and editing. Q.S.: data curation, funding acquisition, supervision, project administration, writing—review and editing. Z.W.: conceptualization, formal analysis, investigation, methodology, software. L.X. and M.W.: validation, visualization, writing—review and editing. H.Y. and Z.Q.: conceptualization, investigation, methodology, software. D.D.: funding acquisition, supervision. All authors have read and agreed to the published version of the manuscript.

Funding

This study was supported by the National Key Research and Development Program of China (No. 2023YFD2201205), the Joint Agricultural Project of Yunnan Province (No. 202301BD070001-002) and the National Natural Science Foundation of China (Nos. 31860205 and 31460194).

Data Availability Statement

The data needed for this study can be shared with the relevant authors as per the current circumstances.

Acknowledgments

We are grateful for the resources and efforts of our instructors and all authors. We also sincerely thank the editors and reviewers for their valuable comments.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Qi, M.; Zhang, C. Research Progress on Hyper-spectral Remote Sensing Retrieval for Forest Physical and Chemical Parameters. World For. Res. 2016, 29, 52–57. [Google Scholar] [CrossRef]
Ni, H.; Chu, H.; Su, W.; Fan, S. Effects of management intensities on soil aggregate stability and carbon, nitrogen, phosphorus distribution in Phyllostachys edulis forests. Chin. J. Appl. Ecol. 2023, 34, 928–936. [Google Scholar] [CrossRef]
Madeira, A.C.; Mentions, A.; Ferreira, M.E.; Taborda, M.d.L. Relationship between spectroradiometric and chlorophyll measurements in green beans. Commun. Soil Sci. Plant Anal. 2000, 31, 631–643. [Google Scholar] [CrossRef]
Ahmad, I.; Zhu, G.; Zhou, G.; Song, X.; Hussein Ibrahim, M.E.; Ibrahim Salih, E.G. Effect of N on growth, antioxidant capacity, and chlorophyll content of sorghum. Agronomy 2022, 12, 501. [Google Scholar] [CrossRef]
Qiao, L.; Tang, W.; Gao, D.; Zhao, R.; An, L.; Li, M.; Sun, H.; Song, D. UAV-based chlorophyll content estimation by evaluating vegetation index responses under different crop coverages. Comput. Electron. Agric. 2022, 196, 106775. [Google Scholar] [CrossRef]
Wu, Q.; Zhang, Y.; Zhao, Z.; Xie, M.; Hou, D. Estimation of relative chlorophyll content in spring wheat based on multi-temporal UAV remote sensing. Agronomy 2023, 13, 211. [Google Scholar] [CrossRef]
Houborg, R.; McCabe, M.; Cescatti, A.; Gao, F.; Schull, M.; Gitelson, A. Joint leaf chlorophyll content and leaf area index retrieval from Landsat data using a regularized model inversion system (REGFLEC). Remote Sens. Environ. 2015, 159, 203–221. [Google Scholar] [CrossRef]
Vermote, E.; Justice, C.; Claverie, M.; Franch, B. Preliminary analysis of the performance of the Landsat 8/OLI land surface reflectance product. Remote Sens. Environ. 2016, 185, 46–56. [Google Scholar] [CrossRef] [PubMed]
Yang, Y.; Zhang, X.; Gao, W.; Zhang, Y.; Hou, X. Improving lake chlorophyll-a interpreting accuracy by combining spectral and texture features of remote sensing. Environ. Sci. Pollut. Res. 2023, 30, 83628–83642. [Google Scholar] [CrossRef]
Carder, K.L.; Chen, F.; Cannizzaro, J.; Campbell, J.; Mitchell, B. Performance of the MODIS semi-analytical ocean color algorithm for chlorophyll-a. Adv. Space Res. 2004, 33, 1152–1159. [Google Scholar] [CrossRef]
Sun, D.; Li, Y.; Wang, Q. A unified model for remotely estimating chlorophyll a in Lake Taihu, China, based on SVM and in situ hyperspectral data. IEEE Trans. Geosci. Remote Sens. 2009, 47, 2957–2965. [Google Scholar] [CrossRef]
Barraza-Moraga, F.; Alcayaga, H.; Pizarro, A.; Félez-Bernal, J.; Urrutia, R. Estimation of chlorophyll-a concentrations in Lanalhue Lake using Sentinel-2 MSI satellite images. Remote Sens. 2022, 14, 5647. [Google Scholar] [CrossRef]
Chen, L.; Ren, C.; Zhang, B.; Wang, Z.; Man, W.; Liu, M. Improved Object-Based Mapping of Aboveground Biomass Using Geographic Stratification with GEDI Data and Multi-Sensor Imagery. Remote Sens. 2023, 15, 2625. [Google Scholar] [CrossRef]
Liu, H.; Fan, W.; Xu, Y.; Lin, W. Research Progress in Forest Information Extraction Based on Multi-source Data Collaboration Operation. World For. Res. 2020, 33, 33–37. [Google Scholar] [CrossRef]
Boucher, J.; Weathers, K.C.; Norouzi, H.; Steele, B. Assessing the effectiveness of Landsat 8 chlorophyll a retrieval algorithms for regional freshwater monitoring. Ecol. Appl. 2018, 28, 1044–1054. [Google Scholar] [CrossRef] [PubMed]
Cao, Z.; Ma, R.; Duan, H.; Pahlevan, N.; Melack, J.; Shen, M.; Xue, K. A machine learning approach to estimate chlorophyll-a from Landsat-8 measurements in inland lakes. Remote Sens. Environ. 2020, 248, 111974. [Google Scholar] [CrossRef]
Xu, L.; Lai, H.; Yu, J.; Luo, S.; Guo, C.; Gao, Y.; Zhou, W.; Wang, S.; Shu, Q. Carbon Storage Estimation of Quercus aquifolioides Based on GEDI Spaceborne LiDAR Data and Landsat 9 Images in Shangri-La. Sustainability 2023, 15, 11525. [Google Scholar] [CrossRef]
Sims, D.A.; Gamon, J.A. Relationships between leaf pigment content and spectral reflectance across a wide range of species, leaf structures and developmental stages. Remote Sens. Environ. 2002, 81, 337–354. [Google Scholar] [CrossRef]
Xu, X.; Lu, J.; Zhang, N.; Yang, T.; He, J.; Yao, X.; Cheng, T.; Zhu, Y.; Cao, W.; Tian, Y. Inversion of rice canopy chlorophyll content and leaf area index based on coupling of radiative transfer and Bayesian network models. ISPRS J. Photo-Grammetry Remote Sens. 2019, 150, 185–196. [Google Scholar] [CrossRef]
Lary, D.J.; Alavi, A.H.; Gandomi, A.H.; Walker, A.L. Machine learning in geosciences and remote sensing. Geosci. Front. 2016, 7, 3–10. [Google Scholar] [CrossRef]
Blix, K.; Eltoft, T. Machine learning automatic model selection algorithm for oceanic chlorophyll-a content retrieval. Remote Sens. 2018, 10, 775. [Google Scholar] [CrossRef]
Abba, S.; Hadi, S.J.; Abdullahi, J. River water modelling prediction using multi-linear regression, artificial neural network, and adaptive neuro-fuzzy inference system techniques. Procedia Comput. Sci. 2017, 120, 75–82. [Google Scholar] [CrossRef]
Jimeno-Sáez, P.; Senent-Aparicio, J.; Cecilia, J.M.; Pérez-Sánchez, J. Using machine-learning algorithms for eutrophication modeling: Case study of Mar Menor Lagoon (Spain). Int. J. Environ. Res. Public Health 2020, 17, 1189. [Google Scholar] [CrossRef] [PubMed]
Shah, S.H.; Angel, Y.; Houborg, R.; Ali, S.; McCabe, M.F. A random forest machine learning approach for the retrieval of leaf chlorophyll content in wheat. Remote Sens. 2019, 11, 920. [Google Scholar] [CrossRef]
Lu, F.; Chen, Z.; Liu, W.; Shao, H. Modeling chlorophyll-a concentrations using an artificial neural network for precisely eco-restoring lake basin. Ecol. Eng. 2016, 95, 422–429. [Google Scholar] [CrossRef]
Su, J.; Wang, X.; Zhao, S.; Chen, B.; Li, C.; Yang, Z. A structurally simplified hybrid model of genetic algorithm and support vector machine for prediction of chlorophyll a in reservoirs. Water 2015, 7, 1610–1627. [Google Scholar] [CrossRef]
Sonobe, R.; Sano, T.; Horie, H. Using spectral reflectance to estimate leaf chlorophyll content of tea with shading treatments. Biosyst. Eng. 2018, 175, 168–182. [Google Scholar] [CrossRef]
Kang, Q.; Xu, W.; Wang, L.; Hong, z.; Liu, Y. Extraction of Sugarcane Plantation in Mountainous Areas Based on Landsat-8 and Sentinel-2 Time-series Synthetic. Chin. J. Trop. Crops 2023, 44, 1276–1287. [Google Scholar]
Azadeh, A.; Ghavami, K.; García, J.J. The influence of heat on mechanical properties of Dendrocalamus giganteus bamboo. J. Build. Eng. 2021, 43, 102613. [Google Scholar] [CrossRef]
Li, X. The planting status and industrial development suggestions of cigar tobacco in Xinping County of Yunnan Province. Agric. Eng. Technol. 2022, 42, 18+20. [Google Scholar] [CrossRef]
Zhao, X.; Hui, C.; Zhu, S.; Liu, W.; Cai, C.; Zhang, W.; Tu, D.; Zhu, L. Response of Soil Organic Matter Content and Its Infrared Spectral Characteristics to Different Planting Durations of Dendrocalamus brandisii. Acta Agric. Univ. Jiang-Xiensis 2022, 44, 1448–1456. [Google Scholar] [CrossRef]
Wang, X.; Huang, J. Principles and Techniques of Plant Physiological Biochemical Experiment; Higher Education Press: Beijing, China, 2015. [Google Scholar]
Mancino, G.; Ferrara, A.; Padula, A.; Nolè, A. Cross-comparison between Landsat 8 (OLI) and Landsat 7 (ETM+) derived vegetation indices in a Mediterranean environment. Remote Sens. 2020, 12, 291. [Google Scholar] [CrossRef]
Wu, Q.; Jin, Y.; Fan, H. Evaluating and comparing performances of topographic correction methods based on multi-source DEMs and Landsat-8 OLI data. Int. J. Remote Sens. 2016, 37, 4712–4730. [Google Scholar] [CrossRef]
Curran, P.J.; Atkinson, P.M. Geostatistics and remote sensing. Prog. Phys. Geogr. 1998, 22, 61–78. [Google Scholar] [CrossRef]
Ji, W.; Adamchuk, V.I.; Chen, S.; Su, A.S.M.; Ismail, A.; Gan, Q.; Shi, Z.; Biswas, A. Simultaneous measurement of multiple soil properties through proximal sensor data fusion: A case study. Geoderma 2019, 341, 111–128. [Google Scholar] [CrossRef]
Hengl, T.; Heuvelink, G.B.; Rossiter, D.G. About regression-kriging: From equations to case studies. Comput. Geosci. 2007, 33, 1301–1315. [Google Scholar] [CrossRef]
Li, Y.; Li, M.; Liu, Z.; Li, C. Combining kriging interpolation to improve the accuracy of forest aboveground biomass estimation using remote sensing data. IEEE Access 2020, 8, 128124–128139. [Google Scholar] [CrossRef]
Bostan, P.; Heuvelink, G.B.; Akyurek, S. Comparison of regression and kriging techniques for mapping the average annual precipitation of Turkey. Int. J. Appl. Earth Obs. Geoinf. 2012, 19, 115–126. [Google Scholar] [CrossRef]
Zhang, T.; Huang, M.; Wang, Z. Estimation of chlorophyll-a concentration of lakes based on SVM algorithm and Landsat 8 OLI images. Environ. Sci. Pollut. Res. 2020, 27, 14977–14990. [Google Scholar] [CrossRef]
Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef]
Li, J.; Mao, X. Comparison of canopy closure estimation of plantations using parametric, semi-parametric, and non-parametric models based on GF-1 remote sensing images. Forests 2020, 11, 597. [Google Scholar] [CrossRef]
Cao, Y. Estimation of Aboveground Biomass of Regional Forest Based on Multi-Source Remote Sensing. Master’s Thesis, Zhejiang Agricultural and Forestry University, Hangzhou, China, 2021. [Google Scholar]
Marchetti, F. The extension of Rippa’s algorithm beyond LOOCV. Appl. Math. Lett. 2021, 120, 107262. [Google Scholar] [CrossRef]
Ma, J.; Zhang, W.; Ji, Y.; Huang, J.; Huang, G.; Wang, L. Total and component forest aboveground biomass inversion via LiDAR-derived features and machine learning algorithms. Front. Plant Sci. 2023, 14, 1258521. [Google Scholar] [CrossRef]
Tao, Y.; Xu, K.; Yi, Z.; Luo, X.; Gao, Y. A Semi-Variogram-based Analysis of Spatial Heterogeneity of Urban Heat Islands. J. Southwest Univ. (Nat. Sci.) 2018, 40, 145–152. [Google Scholar] [CrossRef]
Dorado-Roda, I.; Pascual, A.; Godinho, S.; Silva, C.A.; Botequim, B.; Rodríguez-Gonzálvez, P.; González-Ferreiro, E.; Guerra-Hernández, J. Assessing the accuracy of GEDI data for canopy height and aboveground biomass estimates in Mediterranean forests. Remote Sens. 2021, 13, 2279. [Google Scholar] [CrossRef]
Liu, X.; Su, Y.; Hu, T.; Yang, Q.; Liu, B.; Deng, Y.; Tang, H.; Tang, Z.; Fang, J.; Guo, Q. Neural network guided interpolation for mapping canopy height of China’s forests by integrating GEDI and ICESat-2 data. Remote Sens. Environ. 2022, 269, 112844. [Google Scholar] [CrossRef]
Cai, C.; Cao, S.; Kong, F.; Hu, L.; Liu, T.; Sun, W.; Wang, L. A dataset of spatial distribution of spruce aboveground biomass in Western Tianshan Mountains, Xinjiang in 2014. China Sci. Data 2022, 7, 250–263. [Google Scholar]
Araújo, T.M.; Higuchi, N.; de Carvalho Júnior, J.A. Comparison of formulae for biomass content determination in a tropical rain forest site in the state of Pará, Brazil. For. Ecol. Manag. 1999, 117, 43–52. [Google Scholar] [CrossRef]
Bargaoui, Z.K.; Chebbi, A. Comparison of two kriging interpolation methods applied to spatiotemporal rainfall. J. Hydrol. 2009, 365, 56–73. [Google Scholar] [CrossRef]
Qiao, P.; Lei, M.; Yang, S.; Yang, J.; Guo, G.; Zhou, X. Comparing ordinary kriging and inverse distance weighting for soil As pollution in Beijing. Environ. Sci. Pollut. Res. 2018, 25, 15597–15608. [Google Scholar] [CrossRef]
Chave, J.; Riéra, B.; Dubois, M.-A. Estimation of biomass in a neotropical forest of French Guiana: Spatial and temporal variability. J. Trop. Ecol. 2001, 17, 79–96. [Google Scholar] [CrossRef]
Ji, X.; Luo, Q.; Ding, Y.; Wang, Y.; Zhao, J.; Wang, S. A Study on the Aboveground Biomass Model of Dendrocalamus brandisii. J. Bamboo Res. 2015, 34, 49–53. [Google Scholar]
Yang, Q.; Su, G.; Duan, Z.; He, K.; Guo, Y.; Wang, Z.; Sun, Q.; Peng, Z. Biomass structure and its regression models of Dendrocalamus hamiltonii Nees et Arn. ex Munro population. J. Northwest Agric. For. Univ. (Nat. Sci. Ed.) 2008, 36, 127–134. [Google Scholar] [CrossRef]
Wang, C. Biomass allometric equations for 10 co-occurring tree species in Chinese temperate forests. For. Ecol. Manag. 2006, 222, 9–16. [Google Scholar] [CrossRef]
Wang, T. Characteristics of Leaf Area Index of Artificial Robinia pseudoacacia Forest in Loess Hilly Region. Master’s Thesis, Beijing Forestry University, Beijing, China, 2019. [Google Scholar]
Lv, J. Hyperspectral Remote Sensing Inversion Models of Crop Chlorophyll Content Based on Machine Learning and Radiative Transfer Models. Ph.D. Thesis, China University of Geosciences (Beijing), Beijing, China, 2012. [Google Scholar]
Jin, X.-l.; Wang, K.-r.; Xiao, C.-h.; Diao, W.-y.; Wang, F.-y.; Chen, B.; Li, S.-k. Comparison of two methods for estimation of leaf total chlorophyll content using remote sensing in wheat. Field Crops Res. 2012, 135, 24–29. [Google Scholar] [CrossRef]
Richardson, A.D.; Duigan, S.P.; Berlyn, G.P. An evaluation of noninvasive methods to estimate foliar chlorophyll content. New Phytol. 2002, 153, 185–194. [Google Scholar] [CrossRef]
Gitelson, A.A.; Merzlyak, M.N. Remote estimation of chlorophyll content in higher plant leaves. Int. J. Remote Sens. 1997, 18, 2691–2697. [Google Scholar] [CrossRef]
Li, Z.; Xuan, F.; Dong, Y.; Huang, X.; Liu, H.; Zeng, Y.; Su, W.; Huang, J.; Li, X. Performance of GEDI data combined with Sentinel-2 images for automatic labelling of wall-to-wall corn mapping. Int. J. Appl. Earth Obs. Geoinf. 2024, 127, 103643. [Google Scholar] [CrossRef]

Figure 1. Location of the study area (Xinping County) in China (a) and in Yunnan Province (b). Panel (c) shows the digital elevation model (DEM) and the locations of the field sampling plots in Xinping County.

Figure 2. (a) Distribution of all GEDI footprint locations. (b) Distribution of the footprints retained after quality filtering.

Figure 3. Technical workflow of the study.

Figure 4. Chlorophyll model of a single Dendrocalamus giganteus plant.

Figure 5. Spatial prediction maps of the GEDI variables. Note: (a), (b) and (c) show the interpolation results for pai, pgap_theta and pgap_theta_a3, respectively.

Figure 6. Correlation matrix between the Landsat 8 variables and the chlorophyll content. Note: B2_3_SM and B2_3_HO represent the second-moment and homogeneity texture features, respectively, calculated in a 3 × 3 window on band 2 of the Landsat 8 imagery. B2_5_EN represents the information entropy texture feature calculated in a 5 × 5 window on band 2 of the Landsat 8 imagery.

Figure 7. Correlation matrix between the GEDI variables and the chlorophyll content.

Figure 8. Relative importance of the variables in the RF model.

Figure 9. Scatter plots of the three models. Note: (a–c) models built with the Landsat 8 feature variables, (d–f) models built with the GEDI feature variables, (g–i) models built with both the Landsat 8 and GEDI feature variables.

Figure 10. The spatial distribution map of chlorophyll content in D. giganteus.

Table 1. Statistics of DBH and leaf weight of single Dendrocalamus giganteus plants.

Name	Sample Size	Minimum	Maximum	Average	SD
DBH (cm)	137	3.2	12.8	8.5	2.37
Total leaf fresh weight (kg)	137	0.05	6.54	1.90	1.34

Table 2. The statistics of chlorophyll content of D. giganteus leaves.

Chl (mg/g)	Sample Size	Minimum	Maximum	Average	SD
C_a	49	1.78	2.67	2.36	0.26
C_b	49	0.35	2.73	1.27	0.54
C_T	49	2.13	5.34	3.63	0.78

Table 3. Statistics of chlorophyll content of D. giganteus in the sample plots.

Name	Sample Size	Minimum	Maximum	Average	SD
Chl-ab (g/m²)	35	0.16	1.21	0.51	0.27

Table 4. Summary of remote-sensing factor information extraction from Landsat 8 OLI data.

Variables	Amount	Description
$ρ_{B_{i}}$	4	Original reflectance of the $i$th band $(i = 2, 3, 4, 5)$.
$DVI$	1	Difference vegetation index: $DVI = ρ_{NIR} - ρ_{R}$, where $ρ_{NIR}$ and $ρ_{R}$ are the reflectance of the near-infrared band and the red band, respectively.
$RVI$	1	Ratio vegetation index: $RVI = ρ_{NIR} / ρ_{R}$.
$NDVI$	1	Normalized difference vegetation index: $NDVI = (ρ_{NIR} - ρ_{R}) / (ρ_{NIR} + ρ_{R})$.
$SAVI$	1	Soil-adjusted vegetation index: $SAVI = \frac{1.5 (ρ_{NIR} - ρ_{R})}{ρ_{NIR} + ρ_{R} + 0.5}$.
$EVI$	1	Enhanced vegetation index: $EVI = 2.5 \frac{ρ_{NIR} - ρ_{R}}{ρ_{NIR} + 6.0 ρ_{R} - 7.5 ρ_{B} + 1}$, where $ρ_{B}$ is the reflectance of the blue band.
$B_{i\_N\_T}$	64	Texture features extracted from band $i$ with texture filter $T$ in an $N \times N$ window, where $i = 2, 3, 4, 5$; $N = 3, 5$; and $T$ is one of mean (ME), variance (VA), homogeneity (HO), contrast (CO), dissimilarity (DI), information entropy (EN), second moment (SM) and correlation (CR).
Elevation	1	Elevation
Slope	1	Slope extracted from the DEM
Aspect	1	Slope aspect extracted from the DEM
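
The five vegetation index formulas in Table 4 can be checked with a few lines of Python. This is a minimal sketch for a single pixel, not the processing chain used in the study; the function name and the reflectance values are hypothetical and chosen only for illustration.

```python
def vegetation_indices(nir, red, blue):
    """Return the five vegetation indices of Table 4 for one pixel.

    nir, red, blue are surface reflectances (unitless, 0 to 1).
    """
    dvi = nir - red
    rvi = nir / red
    ndvi = (nir - red) / (nir + red)
    savi = 1.5 * (nir - red) / (nir + red + 0.5)
    evi = 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)
    return {"DVI": dvi, "RVI": rvi, "NDVI": ndvi, "SAVI": savi, "EVI": evi}


# Hypothetical reflectances of a vegetated pixel (not taken from the study).
indices = vegetation_indices(nir=0.40, red=0.08, blue=0.04)
for name, value in indices.items():
    print(f"{name}: {value:.4f}")
```

In practice the same expressions would be applied band-wise to whole reflectance rasters rather than to scalar values.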

Table 5. GEDI footprint quality filtering criteria.

Parameters	Retention Value	Retention Basis
lon_lowestmode	101–103° E	The geographical coordinate range of the study area.
lat_lowestmode	23–25° N	The geographical coordinate range of the study area.
quality_flag	1	Indicates that the waveform meets the high-quality standard.
sensitivity	≥0.9	Indicates high footprint quality.
degrade_flag	0	A value of 1 indicates a degraded acquisition state that yields inaccurate data, so only footprints with a value of 0 are retained.
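
The criteria in Table 5 amount to a simple row filter. The sketch below applies them to footprints held as plain dictionaries; it assumes the attributes have already been read from the GEDI product files into records keyed by the parameter names in the table (written in lowercase), and the sample footprints are invented for illustration.

```python
def keep_footprint(fp):
    """Return True if a footprint record passes every criterion in Table 5."""
    return (
        101.0 <= fp["lon_lowestmode"] <= 103.0   # study-area longitude (deg E)
        and 23.0 <= fp["lat_lowestmode"] <= 25.0  # study-area latitude (deg N)
        and fp["quality_flag"] == 1
        and fp["sensitivity"] >= 0.9
        and fp["degrade_flag"] == 0
    )


# Hypothetical footprint records (not data from the study).
footprints = [
    {"lon_lowestmode": 101.9, "lat_lowestmode": 24.1, "quality_flag": 1,
     "sensitivity": 0.95, "degrade_flag": 0},   # passes all criteria
    {"lon_lowestmode": 101.9, "lat_lowestmode": 24.1, "quality_flag": 1,
     "sensitivity": 0.85, "degrade_flag": 0},   # sensitivity too low
    {"lon_lowestmode": 102.2, "lat_lowestmode": 23.8, "quality_flag": 1,
     "sensitivity": 0.97, "degrade_flag": 1},   # degraded state
    {"lon_lowestmode": 104.0, "lat_lowestmode": 24.0, "quality_flag": 1,
     "sensitivity": 0.96, "degrade_flag": 0},   # outside the study area
]

kept = [fp for fp in footprints if keep_footprint(fp)]
print(f"{len(kept)} of {len(footprints)} footprints retained")
```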

Table 6. Fitting results of the four semi-variance function models.

Parameter Name	Model	Nugget	Sill	Nugget Effect	Range	R²	RSS
pai	Linear	3.45	3.71	0.93	61,536.40	0.56	0.07
	Spherical	0.19	3.61	0.05	3700.00	0.62	0.06
	Exponential	0.42	3.61	0.12	3600.00	0.66	0.06
	Gaussian	0.59	3.61	0.16	3117.69	0.62	0.06
pgap_theta	Linear	0.06	0.07	0.94	61,536.40	0.40	3.51 × 10⁻⁵
	Spherical	0.00	0.07	0.05	3700.00	0.68	1.86 × 10⁻⁵
	Exponential	0.01	0.07	0.12	3900.00	0.73	1.60 × 10⁻⁵
	Gaussian	0.01	0.07	0.17	3290.90	0.68	1.90 × 10⁻⁵
pgap_theta_a3	Linear	0.06	0.06	0.96	61,536.40	0.32	1.77 × 10⁻⁵
	Spherical	0.00	0.06	0.06	3600.00	0.75	6.57 × 10⁻⁶
	Exponential	0.01	0.06	0.11	3300.00	0.77	5.90 × 10⁻⁶
	Gaussian	0.01	0.06	0.18	3117.69	0.75	6.83 × 10⁻⁶

Note: pai is the total plant area index. pgap_theta is the Pgap (theta) of the L2B product, calculated from the L2A product. pgap_theta_a3 is the same quantity, where a3 denotes the third algorithm of GEDI.
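
To make the columns of Table 6 concrete, the following sketch evaluates an exponential semi-variance function with the fitted pai parameters. Two assumptions are made that the table does not state: the Sill column is taken as the total sill (nugget plus partial sill), which is consistent with the Nugget Effect column equalling Nugget/Sill, and the Range column is taken as the practical range at which the model reaches about 95% of the sill. Software packages differ on the second convention, so the curve is illustrative only.

```python
import math


def exponential_semivariogram(h, nugget, sill, practical_range):
    """Exponential model; assumes `sill` is the total sill and the range is
    the practical range (about 95% of the sill is reached at that lag)."""
    if h == 0:
        return 0.0
    partial_sill = sill - nugget
    return nugget + partial_sill * (1.0 - math.exp(-3.0 * h / practical_range))


def nugget_effect(nugget, sill):
    """Nugget-to-sill ratio, as reported in the Nugget Effect column."""
    return nugget / sill


# Exponential-model parameters for pai, copied from Table 6.
pai = {"nugget": 0.42, "sill": 3.61, "practical_range": 3600.0}

ratio = nugget_effect(pai["nugget"], pai["sill"])
gamma_at_range = exponential_semivariogram(3600.0, **pai)
print(f"nugget effect: {ratio:.2f}")
print(f"semi-variance at the range: {gamma_at_range:.2f}")
```

A small nugget-to-sill ratio such as this one is usually read as strong spatial dependence, which is the property that makes Kriging interpolation of the variable worthwhile.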

Table 7. Cross-validation results of the OK interpolation.

Parameter Name	ASE	RMSE	RMSSE	R²	Model
pai	1.70	1.69	1.00	0.63	Exponential
pgap_theta	0.25	0.27	1.08	0.71	Exponential
pgap_theta_a3	0.27	0.27	1.01	0.69	Exponential
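
The three error statistics in Table 7 can be computed from the cross-validation output as sketched below. The definitions follow a common geostatistical convention (ASE as the root of the mean Kriging variance, RMSSE as the root mean square of the errors standardized by the Kriging standard error); the article does not spell out its formulas, so these are assumptions, and the input values are invented.

```python
import math


def cross_validation_stats(observed, predicted, std_errors):
    """Return (ASE, RMSE, RMSSE) for leave-one-out Kriging predictions.

    std_errors are the Kriging standard errors at the validation points.
    """
    n = len(observed)
    errors = [p - o for p, o in zip(predicted, observed)]
    ase = math.sqrt(sum(s ** 2 for s in std_errors) / n)
    rmse = math.sqrt(sum(e ** 2 for e in errors) / n)
    rmsse = math.sqrt(sum((e / s) ** 2 for e, s in zip(errors, std_errors)) / n)
    return ase, rmse, rmsse


# Hypothetical validation data (not values from the study).
observed = [2.0, 3.0, 4.0, 5.0]
predicted = [2.5, 2.5, 4.5, 4.5]
std_errors = [0.5, 0.5, 0.5, 0.5]

ase, rmse, rmsse = cross_validation_stats(observed, predicted, std_errors)
print(f"ASE={ase:.2f}, RMSE={rmse:.2f}, RMSSE={rmsse:.2f}")
```

Under these definitions, an ASE close to the RMSE and an RMSSE close to 1 indicate that the Kriging standard errors describe the actual prediction errors well, which matches the pattern in Table 7.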

Table 8. Pixel statistics by chlorophyll content class in the study area.

Chl-ab (g/m²)	Number of Pixels	Proportion (%)
0.24~0.40	36,625	22.55
0.40~0.55	65,814	40.52
0.55~0.70	42,897	26.41
0.70~0.85	10,128	6.23
0.85~1.02	6,974	4.29
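
The Proportion column of Table 8 follows directly from the pixel counts. A short check, using only the counts printed in the table:

```python
# Pixel counts per chlorophyll class, copied from Table 8.
pixel_counts = {
    "0.24~0.40": 36625,
    "0.40~0.55": 65814,
    "0.55~0.70": 42897,
    "0.70~0.85": 10128,
    "0.85~1.02": 6974,
}

total_pixels = sum(pixel_counts.values())

# Share of each class as a percentage of all classified pixels.
proportions = {
    label: 100.0 * count / total_pixels
    for label, count in pixel_counts.items()
}

for label, share in proportions.items():
    print(f"{label} g/m²: {share:.2f}%")
```

The two lowest classes together cover roughly 63% of the classified pixels, so most of the mapped area has a chlorophyll content below 0.55 g/m².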

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Xia, C.; Zhou, W.; Shu, Q.; Wu, Z.; Xu, L.; Yang, H.; Qin, Z.; Wang, M.; Duan, D. Regional Scale Inversion of Chlorophyll Content of Dendrocalamus giganteus by Multi-Source Remote Sensing. Forests 2024, 15, 1211. https://doi.org/10.3390/f15071211

