Article

Application of Random Forest Method Based on Sensitivity Parameter Analysis in Height Inversion in Changbai Mountain Forest Area

1
College of Forestry, Beijing Forestry University, Beijing 100083, China
2
Beijing Key Laboratory of Precision Forestry, Beijing Forestry University, Beijing 100083, China
3
Beijing Ocean Forestry Technology Co., Ltd., Beijing 100083, China
*
Author to whom correspondence should be addressed.
Forests 2024, 15(7), 1161; https://doi.org/10.3390/f15071161
Submission received: 6 June 2024 / Revised: 29 June 2024 / Accepted: 2 July 2024 / Published: 4 July 2024
(This article belongs to the Section Forest Inventory, Modeling and Remote Sensing)

Abstract

The vertical structure of forests, including canopy height, helps researchers understand forest characteristics such as density and growth stage. Canopy height is one of the key variables for estimating forest biomass and is crucial for accurately monitoring changes in forest carbon storage. However, current technologies struggle to measure canopy height accurately and cost-effectively over large areas. This study introduces a method for extracting accurate forest canopy height from Global Ecosystem Dynamics Investigation (GEDI) data and applies it in a large-scale analysis. Before mapping, the accuracy of GEDI data extraction and its sensitivity to parameters such as slope, aspect, and vegetation coverage are verified and analyzed, which aids assessment and decision-making and improves inversion accuracy. On this basis, a random forest method based on parameter sensitivity analysis is developed for forest canopy height inversion. Unlike traditional methods, which treat all parameters uniformly, the sensitivity analysis differentiates the effects of the influencing parameters across land use types, thereby improving the precision of height inversion. Potential factors affecting the accuracy of GEDI data, such as vegetation cover density, terrain complexity, and data acquisition conditions, are also analyzed and discussed. Large-scale forest canopy height estimation is then conducted by integrating the Normalized Difference Vegetation Index (NDVI) as a measure of vegetation cover, the solar altitude angle, and terrain data, among other variables, and accuracy is validated using airborne LiDAR data. With an R² of 0.64 and an RMSE of 8.62, the mapping accuracy indicates that the proposed method is robust for delineating forest canopy height in the Changbai Mountain forest area.

1. Introduction

The average tree height of a stand is one of the most important stand characteristics in forest planning. With the increasing importance of forest monitoring and resource management, the demand for accurate tree height data has become urgent. However, traditional methods for measuring tree height are time-consuming, labor-intensive, and spatially limited, which restricts the ability to monitor and manage large forest areas [1]. In 2018, the National Aeronautics and Space Administration (NASA) launched the Global Ecosystem Dynamics Investigation (GEDI) in collaboration with the University of Maryland. The primary instrument of the mission is a full-waveform Light Detection and Ranging (LiDAR) system deployed on the International Space Station (ISS). It is the first spaceborne LiDAR system expressly designed for measuring vegetation structure. GEDI collects data between 51.6° N and 51.6° S latitude. The lasers of the GEDI instrument produce coverage beams and power beams. The ground track generated by beam jitter includes footprint samples of 30 m [2]. The laser footprint diameter for each ground track is 25 m, with a spacing of 60 m along the track [3].
The GEDI data products provide gridded datasets describing three-dimensional features of the Earth. Since their release, numerous scholars have used GEDI data in studies of forest vertical structure and biomass [4,5], forest carbon storage monitoring [6], forest growth dynamics assessment [7], and stand volume estimation [8], among other applications.
As a spaceborne LiDAR dataset, GEDI possesses characteristics of large-scale coverage and multi-temporal observations, providing essential foundational data for wide-area ground observations and forest height measurements. However, due to factors such as ISS position, attitude, and atmospheric conditions, the accuracy of GEDI data for ground observations and forest height measurements still needs improvement [9]. Many scholars have compared GEDI data observations with airborne LiDAR data to assess factors influencing the accuracy of GEDI data observations. For example, Parra et al. integrated GEDI and airborne LiDAR data to assess tree growth dynamics, showcasing the efficacy of GEDI data in detecting alterations in forest vertical structure [10]. The studies by Dong and Kutchartt, among others, also employed airborne radar data to validate the performance of GEDI data in inferring forest structure and understory terrain [11,12]; Liu et al. introduced a technique known as Neural Network Guided Interpolation (NNGI), which, in conjunction with GEDI and remote sensing imagery data, generated a 30 m resolution forest canopy height map for China in 2019. They validated this method using airborne LiDAR data, demonstrating its feasibility [13]; Liu et al. verified GEDI L2A data accuracy through the utilization of high-resolution, locally calibrated airborne LiDAR products. The study found that for terrain height information extraction, the Root Mean Square Error (RMSE) at mid to low latitudes was 4.03 m. For canopy height inversion, the RMSE was 5.02 m [3]. Wang et al. conducted a comparative analysis between ground elevation and relative height (RH) derived from GEDI L2A data and simulated airborne laser scanning (ALS) waveforms generated from discrete point cloud LiDAR. The results underscored GEDI’s remarkable accuracy in determining ground elevation and RH100, with root mean square error (RMSE) values of 1.38 m and 2.62 m, respectively [14].
The aforementioned studies indicate that the inversion accuracy of canopy height varies in different regions due to factors such as terrain and climate. However, conventional algorithms typically treat these factors uniformly, lacking specificity.
This study adopted a method of precision verification before mapping. Considering the characteristics of the Changbai Mountain forest area, sensitivity analysis of each parameter’s impact on accuracy was conducted, and the selection of parameters was restricted accordingly. Consequently, a random forest method based on sensitivity analysis of parameters was constructed. This method was applied to canopy height inversion in the Changbai Mountain forest area, and a canopy height map was generated.

2. Materials and Methods

2.1. Study Area

The research site is situated within the Changbai Mountain National Nature Reserve in Jilin Province, China. The topography in this region is typified by mountainous hills, accompanied by a temperate continental monsoon climate. The elevation ranges from 700 to 1000 m above sea level. The yearly mean temperature and precipitation amount to 5.14 °C and 636.14 mm, respectively. The predominant forest types include natural deciduous broad-leaf forests and mixed broad-leaf conifer forests [15]. As per the official announcement in the “Boundary Turning Table of Changbai Mountain National Nature Reserve in Jilin Province” by the Management Committee of Changbai Mountain Protection and Development Zone in Jilin Province (2023), a boundary map delineating the area was created. The total area of the reserve is 196,618 hectares, with geographic coordinates ranging from 127°28′ E to 128°16′ E longitude and 41°42′ N to 42°25′ N latitude. The locations of our chosen study areas are shown in Figure 1.

2.2. GEDI L2A Data

The main scientific objectives of GEDI are as follows: (1) high-resolution laser ranging observations are conducted to analyze the three-dimensional structure of the Earth [16]; (2) precise measurements of forest canopy height, vertical structure, and surface elevation are acquired with accuracy [14]; (3) research on carbon/water cycle processes, biodiversity, and habitat characteristics [17,18]. The main parameters of the ISS/GEDI system are listed in Table 1.
According to different stages of data processing, GEDI data are divided into four levels. Level 1 products consist of positioned waveforms; Level 2 products include footprint canopy height and profile metrics; Level 3 products comprise gridded canopy height and its variations; whereas Level 4A and 4B products offer ground-level carbon estimates, presented in both footprint and gridded formats [20].
This study utilized the second version of GEDI L2A-level products released in 2019 [21]. GEDI L2A data result from processing with algorithms for ground elevation and canopy height extraction, making it a crucial product associated with forest canopy height estimation. Compared to other data versions, L2A data undergo more preprocessing and parameter adjustments, enabling more accurate reflection of ground elevation and canopy height information. It includes modeling and processing of factors such as terrain, vegetation cover, and canopy height, making it more specialized and targeted.

2.3. Airborne LiDAR Data

The LiDAR data for this study were collected in June 2021 via a Bell helicopter outfitted with a LiDAR scanner and an aerial camera. The helicopter flew at an altitude of 500 m over the study area, achieving a side overlap rate of about 45% and a forward overlap rate of roughly 65%. Studies have shown that high-density UAV LiDAR point cloud data can be effectively used to acquire ground truth points for forest canopy height [11,22,23,24]. In our research, LiDAR360 V5.2 software was initially used to create a Canopy Height Model (CHM). Following this, geographic location corrections were made using the coordinates of the sample plots. This adjusted data set was utilized to validate the canopy height inversion. Detailed specifications of the LiDAR sensor are provided in Table 2.
The schematic diagram of the airborne LiDAR data point cloud for the study area is shown in Figure 2.
Additionally, we randomly collected 20 sample points from the local area, recording the latitude, longitude, canopy height, and ground point height for each sample. Using the latitude and longitude data, we matched the sample points with the airborne data to calculate the accuracy of the airborne measurements. As shown in Figure 3, the correlation coefficients for ground height and canopy height extracted from the airborne data are both greater than or equal to 0.9, indicating a high correlation. Therefore, our airborne data have high accuracy and can be used to validate the forest height data.

2.4. Other Auxiliary Data

The auxiliary data consist of three types: Sentinel-2 satellite images provide spectral information, Shuttle Radar Topography Mission (SRTM) data provide terrain information, and WorldClim data provide annual mean temperature and precipitation, which are used to explore whether climate affects the accuracy of forest canopy height inversion from GEDI L2A.

2.4.1. Sentinel-2 Data

The Sentinel-2 mission, developed by the European Space Agency, is an Earth observation system comprising two identical satellites, Sentinel-2A and Sentinel-2B. Sentinel-2A was launched in June 2015, followed by Sentinel-2B in March 2017. The orbit altitude of the Sentinel-2 satellites is 786 km, with a swath width of 290 km. The repeat cycle of the twin satellites is 5 days. Each satellite carries a multispectral imaging instrument (MSI) that captures imagery in 13 spectral bands at three spatial resolutions: 10 m, 20 m, and 60 m. The mission offers high spatial resolution, strong spectral imaging capabilities, and short revisit periods. It also includes three vegetation red-edge bands, which can be used for inverting forest vegetation coverage and leaf area index over large areas [25]. This study uses the data to compute vegetation coverage in order to investigate the influence of Fractional Vegetation Coverage (FVC) on the forest canopy height retrieval model. FVC was calculated with the SNAP application.

2.4.2. SRTM-DEM Data

The Shuttle Radar Topography Mission (SRTM) is a collaborative endeavor involving NASA, the United States Department of Defense, as well as space agencies from Germany and Italy. SRTM satellite data were collected from 11 February to 22 February 2000. The data coverage extends from 60° N to 56° S latitude, encompassing a total area of over 190 million square kilometers of radar imagery. In this study, SRTM DEM data were acquired from the Google Earth Engine (GEE) platform. Since the elevation from SRTM data is orthometric height, referenced to the Earth Gravitational Model (EGM 96) geoid, we used the egm96 geoid function in MATLAB 2021a (MathWorks) to convert it to ellipsoidal height. This ensures that the SRTM elevation has the same reference as the GEDI elevation [21].
The ground elevation values extracted from the SRTM DEM data serve as the reference ground elevation values for the pixel-scale forest canopy height retrieval model used in this study. To enhance the accuracy of this model, cloud-contaminated footprints were removed from the GEDI data obtained for the Changbai Mountain National Nature Reserve. Only footprints for which the absolute difference between the reference ground elevation and the GEDI-estimated ground elevation was less than 50 m were retained. Footprints satisfying the following Formula (1) were deleted:
$$\left| \text{elev\_lowestmode} - \text{TanDEM-X} \right| > 50\ \mathrm{m} \qquad (1)$$
Furthermore, slope and aspect factors were extracted from the SRTM DEM data to serve as input parameters for both the pixel-scale forest canopy height retrieval model and the extrapolation model for forest canopy height.
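The elevation-difference screening described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary keys `elev_lowestmode` and `dem` and the sample values are hypothetical, and both elevations are assumed to share the same vertical reference.

```python
from typing import Iterable

MAX_ABS_DIFF_M = 50.0  # threshold stated in the text


def keep_footprint(elev_lowestmode: float, dem_elevation: float,
                   max_abs_diff: float = MAX_ABS_DIFF_M) -> bool:
    """Return True when the GEDI ground elevation agrees with the reference DEM."""
    # Footprints with |difference| > 50 m are deleted, as in Formula (1).
    return abs(elev_lowestmode - dem_elevation) <= max_abs_diff


def filter_footprints(footprints: Iterable[dict]) -> list:
    # Each footprint is a dict with hypothetical keys 'elev_lowestmode' and 'dem'.
    return [fp for fp in footprints
            if keep_footprint(fp["elev_lowestmode"], fp["dem"])]


# Made-up sample values for illustration only.
sample = [
    {"shot": 1, "elev_lowestmode": 812.4, "dem": 805.0},   # difference 7.4 m
    {"shot": 2, "elev_lowestmode": 1490.0, "dem": 900.0},  # difference 590 m
    {"shot": 3, "elev_lowestmode": 760.0, "dem": 811.0},   # difference 51 m
]
kept = filter_footprints(sample)
print([fp["shot"] for fp in kept])
```

Only the first footprint survives, because the other two differ from the reference elevation by more than 50 m.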

2.4.3. WorldClim Data

The growth status of forests is closely related to climatic conditions such as temperature and precipitation in their respective regions, and we use climate data as one of the potential climate drivers of canopy height. Many previous studies have introduced climate data into forest height estimation models and obtained high accuracy [26,27]. WorldClim is a global climate dataset with high spatial resolution which includes variables such as maximum temperature, minimum temperature, average temperature, precipitation, and bioclimatic variables [28]. In this study, the annual mean temperature (MAT) and precipitation (MAP) data from WorldClim were utilized for sensitivity analysis. The sensitivity index values determined whether these variables should be included as parameters in the forest canopy height inversion model [29,30,31].

3. Methods

This study proposes a sensitivity analysis-based random forest method that integrates GEDI data, airborne LiDAR data, and other auxiliary data for estimating canopy height in the Changbai Mountain forest area. This approach offers an effective strategy for cost-effective and accurate canopy height estimation. The method consists of the following steps:
(1)
Preprocessing of spaceborne LiDAR and airborne LiDAR data: GEDI data products are stored in the Hierarchical Data Format version 5 (HDF5). The study converts the HDF5 format to the CSV format using the optimal threshold spatial method and selects the best dataset for subsequent analysis and forest canopy height inversion [32]. Parameters are extracted from airborne LiDAR data to obtain the Canopy Height Model (CHM) and terrain factor data for further analysis. To ensure consistency between DEM data from airborne LiDAR and GEDI footprint diameter, the DEM is resampled to a 30 m resolution.
(2)
Sensitivity analysis: A random forest (RF) algorithm is used to establish relationships between vegetation factors, terrain factors, climate factors, beam type factors, and GEDI ground elevation errors [33,34]. The sensitivity of each influencing factor is determined by the percentage increase in the mean squared error (%IncMSE) parameter. A higher %IncMSE value indicates higher sensitivity of the influencing factor and greater impact on accuracy.
(3)
Establishment of canopy height inversion model based on sensitivity analysis: Parameters with significant sensitivity identified in Step (2)—namely vegetation cover, slope, solar altitude angle, beam type, and CHM—were used as input variables. Sensitivity indices were assigned to these parameters to enhance the performance of the random forest algorithm. Canopy height was used as the ideal output data for forest canopy height inversion on the prediction set samples. Finally, height data extracted from airborne LiDAR were used as ground truth data for model accuracy validation.
(4)
Mapping of canopy height: Using the established forest canopy height inversion model, the linear stepwise regression equation of the Canopy Height Model (CHM) was derived, and buffers around the discrete footprint points were mapped and interpolated in ArcMap 10.2 to achieve spatially continuous pixel-scale coverage across the forest region [35,36,37]. Using footprint points as samples and combining Sentinel-2 optical remote sensing data, terrain factors, climate factors, etc., a regression prediction was performed using the random forest algorithm to generate a forest height map.
The research methodology and technology roadmap are shown in Figure 4.

3.1. Extraction and Optimal Selection of Parameters for Spaceborne LiDAR

The spaceborne full-waveform LiDAR GEDI product is stored in the Hierarchical Data Format version 5 (HDF5), a self-descriptive file format developed by the National Center for Supercomputing Applications in the United States. This format lets users store and manipulate scientific data seamlessly across operating systems and machines. In this study, the GEDI_Subsetter.py script provided on NASA’s official website was used to process the downloaded GEDI L2A data in HDF5 format; the parameters were extracted and stored in GeoJSON format for easy subsequent processing.

3.1.1. Extraction of GEDI L2A Parameters

GEDI L2A calculates forest canopy height by subtracting the ground elevation extracted from the received waveform from the detected highest return height (canopy top height). The specific formula is shown in Equation (2):
$$\mathrm{CHM} = \text{elev\_highestreturn} - \text{elev\_lowestmode} \qquad (2)$$
GEDI L2A also retrieves forest canopy height for each energy percentile from the received waveform. This study extracts quality-related parameters such as quality_flag, degrade_flag, and sensitivity, as well as the ground elevation (elev_lowestmode) and the Relative Height (RH) metrics, which are reported at 1% intervals (for example, rh92 and rh94) [38].

3.1.2. Primary Screening of GEDI L2A Data

GEDI L2A includes parameters for assessing data quality such as quality flag parameters (quality_flag_a<n>) and sensitivity parameters, which can be used to eliminate invalid footprints affected by poor geolocation accuracy, poor waveform quality, clouds, and other surface conditions, retaining high-quality GEDI footprints for each algorithm setting. To obtain high-quality GEDI footprints within the study area, footprints within the study area are first selected using the longitude (lon_lowestmode_a<n>) and latitude (lat_lowestmode_a<n>) parameters of GEDI L2A, where a<n> represents the algorithm setting used. Subsequently, based on quality-related parameters included in GEDI L2A data, some invalid GEDI footprints are filtered out. The specific screening steps are outlined in Table 3 [7,39].
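As a rough illustration of this primary screening, the sketch below selects footprints inside the reserve's bounding box (taken from Section 2.1) and then applies quality checks. The sensitivity threshold of 0.9 and the flag conditions are assumptions made for the example; the criteria actually used are those listed in Table 3. The per-algorithm suffix `a<n>` is omitted from the field names for brevity.

```python
# Bounding box of the reserve from Section 2.1, converted to decimal degrees.
LON_MIN, LON_MAX = 127 + 28 / 60, 128 + 16 / 60
LAT_MIN, LAT_MAX = 41 + 42 / 60, 42 + 25 / 60

# Assumed threshold for illustration; Table 3 holds the values actually used.
MIN_SENSITIVITY = 0.9


def is_valid(fp: dict) -> bool:
    """Spatial selection followed by quality screening of one footprint."""
    in_area = (LON_MIN <= fp["lon_lowestmode"] <= LON_MAX
               and LAT_MIN <= fp["lat_lowestmode"] <= LAT_MAX)
    # Assumed conditions: valid waveform flag and no degraded geolocation.
    good_quality = fp["quality_flag"] == 1 and fp["degrade_flag"] == 0
    return in_area and good_quality and fp["sensitivity"] >= MIN_SENSITIVITY


# Made-up footprints for illustration only.
footprints = [
    {"id": "a", "lon_lowestmode": 127.9, "lat_lowestmode": 42.0,
     "quality_flag": 1, "degrade_flag": 0, "sensitivity": 0.97},
    {"id": "b", "lon_lowestmode": 127.9, "lat_lowestmode": 42.0,
     "quality_flag": 0, "degrade_flag": 0, "sensitivity": 0.97},  # poor waveform
    {"id": "c", "lon_lowestmode": 129.5, "lat_lowestmode": 42.0,
     "quality_flag": 1, "degrade_flag": 0, "sensitivity": 0.97},  # outside area
    {"id": "d", "lon_lowestmode": 127.9, "lat_lowestmode": 42.0,
     "quality_flag": 1, "degrade_flag": 0, "sensitivity": 0.80},  # low sensitivity
]
valid_ids = [fp["id"] for fp in footprints if is_valid(fp)]
print(valid_ids)
```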

3.1.3. Optimal Algorithm Selection for GEDI L2A Data

GEDI L2A provides terrain and canopy height values calculated using six algorithms: AmpSim, AmpSDE, WavHgt, RH, AnomHeight, and Stat. Previous studies have often used the results of the default algorithm, AmpSim, directly, because the default optimal algorithm setting is provided for each beam. However, other algorithm settings can yield higher accuracy in ground elevation and canopy height than the default. Each algorithm has its own methodology and application areas for estimating surface height, vegetation height, and other surface features. To enhance the accuracy of subsequent canopy height inversion, this study conducted a sensitivity analysis on the six height datasets mentioned above, as depicted in Figure 5. The results indicated that Algorithm 5 exhibited the highest data sensitivity and was the most suitable for the study area. Some points in the figure appear to be outliers, but they are valid observations: the data were screened before the sensitivity analysis to ensure that all data met the required quality standards. Overall, the boxplot shows that Algorithm 5 has the highest sensitivity. Therefore, this study selected the data from GEDI L2A Algorithm 5 [40].

3.2. Sensitivity Analysis

In the random forest algorithm, the Gini index is used to evaluate feature importance [41]. Here, the sensitivity index is denoted as $SeI$ and the Gini index as $GI$. Assuming there are $J$ features $X_1, X_2, \ldots, X_J$, $I$ decision trees, and $C$ categories, we calculate the Gini-based score $SeI_j^{(Gini)}$ for each feature $X_j$. This score represents the average change in node-splitting impurity for the $j$th feature across all decision trees in the random forest [42,43].
The Gini index of node $q$ in the $i$th tree is determined by the following formula:
$$GI_q^{(i)} = \sum_{c=1}^{C} \sum_{c' \neq c} P_{qc}^{(i)} P_{qc'}^{(i)} = 1 - \sum_{c=1}^{C} \left( P_{qc}^{(i)} \right)^2$$
where $C$ denotes the count of categories and $P_{qc}$ denotes the proportion of category $c$ in node $q$.
The significance of feature $X_j$ at node $q$ in the $i$th tree, that is, the alteration in the Gini index before and after the splitting of node $q$, is expressed as follows:
$$SeI_{jq}^{(Gini)(i)} = GI_q^{(i)} - GI_l^{(i)} - GI_r^{(i)}$$
where $GI_l^{(i)}$ and $GI_r^{(i)}$ denote the Gini indices of the two new nodes after the splitting process.
When the nodes of decision tree $i$ in which feature $X_j$ appears form the set $Q$, the importance of $X_j$ in the $i$th tree is expressed as follows:
$$SeI_j^{(Gini)(i)} = \sum_{q \in Q} SeI_{jq}^{(Gini)(i)}$$
We suppose there are $I$ trees in the random forest (RF); then
$$SeI_j^{(Gini)} = \sum_{i=1}^{I} SeI_j^{(Gini)(i)}$$
Finally, we normalize all the calculated importance scores:
$$SeI_j^{(Gini)} = \frac{SeI_j^{(Gini)}}{\sum_{j'=1}^{J} SeI_{j'}^{(Gini)}}$$
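To make the Gini-based score concrete, the following sketch evaluates the node Gini index, the change caused by one split, and the final normalization. The class proportions and the raw per-feature scores are made-up numbers, and the function names are illustrative rather than part of any library.

```python
def gini(proportions):
    """Gini index of a node: 1 minus the sum of squared class proportions."""
    return 1.0 - sum(p * p for p in proportions)


def split_gain(parent, left, right):
    """Change in the Gini index caused by a split, using the unweighted
    difference GI_q - GI_l - GI_r given in the text."""
    return gini(parent) - gini(left) - gini(right)


def normalise(scores):
    """Divide every accumulated score by the total so the scores sum to 1."""
    total = sum(scores.values())
    return {name: value / total for name, value in scores.items()}


# One split of a balanced two-class node into two fairly pure children.
gain = split_gain(parent=[0.5, 0.5], left=[0.9, 0.1], right=[0.1, 0.9])
print(round(gain, 4))  # 0.5 - 0.18 - 0.18 = 0.14

# Hypothetical scores already summed over all nodes and all trees.
raw_scores = {"fvc": 0.42, "slope": 0.28, "sun_elevation": 0.21, "beam_type": 0.09}
sei = normalise(raw_scores)
print(sei)
```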
The sensitivity analysis conducted with the random forest (RF) algorithm aimed to discern the relationship between vegetation, terrain, climate factors, beam type, and the error in GEDI ground elevation. The percentage increase in the mean squared error (%IncMSE) parameter was employed to gauge the sensitivity of each influencing factor. A higher %IncMSE value indicates greater sensitivity, implying a larger impact on accuracy. The results are illustrated in Figure 6.
The findings reveal that vegetation coverage exerts the most substantial influence on elevation extraction accuracy, with slope, solar elevation angle, and beam type also demonstrating significant effects. Therefore, an analysis was conducted on these four parameters individually to discuss further screening methods aimed at improving accuracy [38,44,45].
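The %IncMSE measure originates from R's randomForest package. A loosely analogous quantity can be sketched in Python with scikit-learn by permuting one feature at a time on held-out data and recording the relative increase in mean squared error. The data below are synthetic, the feature names are placeholders, and the procedure is an approximation of the idea rather than the computation used in this study.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 400
names = ["fvc", "slope", "sun_elevation", "beam_type", "noise"]
X = np.column_stack([
    rng.uniform(0, 1, n),      # fractional vegetation coverage
    rng.uniform(0, 40, n),     # slope in degrees
    rng.uniform(-60, 60, n),   # solar elevation in degrees
    rng.integers(0, 2, n),     # beam type (0 = coverage, 1 = power)
    rng.normal(size=n),        # pure noise, should rank low
])
# Synthetic ground-elevation error driven by coverage and slope only.
y = 8 * (1 - X[:, 0]) + 0.2 * X[:, 1] + rng.normal(scale=0.5, size=n)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
rf = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_tr, y_tr)
base_mse = mean_squared_error(y_te, rf.predict(X_te))


def pct_inc_mse(col, repeats=5):
    """Mean percentage increase in test MSE after shuffling one column."""
    increases = []
    for _ in range(repeats):
        X_perm = X_te.copy()
        X_perm[:, col] = rng.permutation(X_perm[:, col])
        mse = mean_squared_error(y_te, rf.predict(X_perm))
        increases.append(100 * (mse - base_mse) / base_mse)
    return float(np.mean(increases))


importance = {name: pct_inc_mse(i) for i, name in enumerate(names)}
ranking = sorted(importance, key=importance.get, reverse=True)
print(ranking)
```

Features whose permutation barely changes the error (here the noise column) are candidates for exclusion from the inversion model.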

3.2.1. The Impact of GEDI Vegetation Coverage on Ground Elevation

As depicted in Figure 7, this study categorizes vegetation coverage, computed from airborne LiDAR data, into four distinct groups. The RMSE values of ground elevation errors extracted by GEDI decline with increasing vegetation coverage. Within the 0%–25% range of vegetation coverage, the RMSE of ground elevation errors stands at 10.09 m. Conversely, when vegetation coverage surpasses 25%, the RMSE of ground elevation errors exhibits a rapid decline, plummeting to 6.29 m. In the vegetation coverage range of 75%–100%, the ground elevation error is minimized, with an RMSE value of 2.83 m.
GEDI L2A ground elevation is extracted from LiDAR waveforms. In areas with low vegetation coverage such as grasslands, deserts, and bare rocks, the laser pulses can reach the ground and be reflected back directly, yielding more accurate ground elevation data. However, in areas with high vegetation coverage such as forests and shrubs, the laser energy may be absorbed or scattered by vegetation, leading to errors in ground elevation data.
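The per-class RMSE reported above can be computed as in the following sketch. The four equal-width classes follow the 0%–25% and 75%–100% ranges named in the text; the intermediate class edge at 50% and the sample elevations are assumptions for illustration.

```python
import math
from collections import defaultdict

CLASS_LABELS = ["0-25%", "25-50%", "50-75%", "75-100%"]


def coverage_class(fvc_percent):
    """Map a vegetation coverage value in percent to one of four classes."""
    if not 0 <= fvc_percent <= 100:
        raise ValueError("coverage must lie in [0, 100]")
    # min() places a coverage of exactly 100 % in the last class.
    return CLASS_LABELS[min(int(fvc_percent // 25), 3)]


def rmse_by_class(records):
    """records: iterable of (coverage %, GEDI elevation, airborne elevation)."""
    squared = defaultdict(list)
    for fvc, gedi_elev, als_elev in records:
        squared[coverage_class(fvc)].append((gedi_elev - als_elev) ** 2)
    return {label: math.sqrt(sum(v) / len(v)) for label, v in squared.items()}


# Made-up elevations in metres.
records = [
    (10, 803.0, 800.0),  # error +3 m
    (20, 796.0, 800.0),  # error -4 m
    (80, 901.0, 900.0),  # error +1 m
    (90, 899.0, 900.0),  # error -1 m
]
result = rmse_by_class(records)
print(result)
```

The same grouping function can be reused for slope by replacing the class edges.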

3.2.2. The Impact of GEDI Slope on Ground Elevation

As illustrated in Figure 8, this study categorizes slope values calculated from airborne LiDAR data into five groups. The RMSE values of ground elevation errors extracted by GEDI escalate as terrain slope increases. For terrain slopes less than 5°, the RMSE of ground elevation errors amounts to 2.58 m. However, when the terrain slope exceeds 20°, the ground elevation error increases rapidly, with an RMSE value of 8.92 m. This may be due to the reduced return rate of LiDAR data in areas with steep slopes, leading to larger errors in ground elevation data extraction.

3.2.3. The Impact of GEDI Solar Elevation Angle on Ground Elevation

The GEDI footprints are segregated into two categories determined by the solar elevation angle: nighttime data when the solar elevation angle is less than 0, and daytime data when the solar elevation angle is greater than or equal to 0. Table 4 presents the differences in ground elevation extraction accuracy by GEDI L2A at different acquisition times. The estimation accuracy during daytime (RMSE = 4.23) is lower than that during nighttime (RMSE = 3.79). This is because solar noise during the day reduces waveform quality, leading to higher accuracy in ground elevation estimation with nighttime GEDI data.

3.2.4. The Impact of GEDI Beam Type on Ground Elevation

In this study, all footprints are divided into two groups based on beam type: coverage beams and full power beams. Table 5 presents the differences in ground elevation extraction accuracy by GEDI L2A for different beam types. The results indicate that ground elevation extracted from both coverage beams and full power beams shows a very high correlation (R² = 1.00) with ground elevation extracted from airborne LiDAR data. However, there is a slight difference in root mean square error (RMSE): the estimation accuracy for coverage beams (RMSE = 5.33 m) is lower than that for full power beams (RMSE = 4.24 m). This is because coverage beams are only suitable for penetrating forests with up to 95% tree canopy coverage under “average” conditions, whereas full power beams can penetrate forests with up to 98% tree canopy coverage, so more energy is detected from the ground.

3.3. Random Forest Algorithm for Forest Canopy Height Inversion Based on Sensitivity Analysis

The sensitivity analysis above shows that vegetation coverage, slope, solar elevation angle, and beam type are closely linked to the precision of canopy height. Traditional random forest methods may overfit on regression problems with significant noise and may perform poorly on datasets with few features [46,47], which adversely affects the accuracy of canopy height estimation. In this study, based on the results of the sensitivity analysis in Section 3.2, we improved the method by assigning weights to parameter indicators.
The entropy weight method was used to determine the weights. Entropy weight (EW) is based on information entropy theory and reflects the useful information content provided by each variable. A judgement matrix $\gamma$ containing $m$ assessment objects and $n$ variables is constructed for calculating EW:
$$\gamma = \left( y_{ij} \right)_{m \times n}, \quad i = 1, 2, \ldots, n; \; j = 1, 2, \ldots, m$$
When $\gamma$ is normalized to the standard matrix $X = \left( x_{ij} \right)_{m \times n}$ ($i = 1, 2, \ldots, n$; $j = 1, 2, \ldots, m$), the effect of the dimensionality and numerical range of the variables is eliminated, and the entropy value of the variables, $H_i$, is calculated as
$$H_i = -\frac{1}{\ln m} \sum_{j=1}^{m} f_{ij} \ln f_{ij}, \quad i = 1, 2, \ldots, n; \; j = 1, 2, \ldots, m$$
where $f_{ij} = x_{ij} / \sum_{j=1}^{m} x_{ij}$ and $0 \le H_i \le 1$. EW can be calculated as
$$w_i = \frac{1 - H_i}{n - \sum_{i=1}^{n} H_i}$$
where EW should satisfy the condition $\sum_{i=1}^{n} w_i = 1$. Smaller entropy values are associated with larger EW, suggesting that the variable is more critical.
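The entropy weight calculation can be sketched directly from these formulas. The input matrix below is made up, rows are variables and columns are assessment objects, and the convention $0 \ln 0 = 0$ is assumed for zero entries.

```python
import math


def entropy_weights(x):
    """x[i][j]: normalised, non-negative value of variable i for object j.

    Returns (weights, entropies), following H_i and w_i as defined in the text.
    """
    n, m = len(x), len(x[0])
    entropies = []
    for row in x:
        total = sum(row)
        h = 0.0
        for value in row:
            f = value / total
            if f > 0:  # assumed convention: 0 * ln(0) = 0
                h -= f * math.log(f)
        entropies.append(h / math.log(m))
    denominator = n - sum(entropies)
    if denominator == 0:
        raise ValueError("all variables are uniform; weights are undefined")
    weights = [(1 - h) / denominator for h in entropies]
    return weights, entropies


# Three variables observed on four objects (illustrative numbers).
x = [
    [0.25, 0.25, 0.25, 0.25],  # uniform: carries no information
    [0.70, 0.10, 0.10, 0.10],  # strongly uneven: most informative
    [0.40, 0.30, 0.20, 0.10],  # mildly uneven
]
w, H = entropy_weights(x)
print([round(v, 3) for v in w])
```

The uniform variable has entropy 1 and therefore receives a weight of essentially zero, while the most uneven variable receives the largest weight.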
Following this, a random forest model was developed for canopy height in the forest area, utilizing inputs such as Canopy Height Model (CHM), normalized difference vegetation index (NDVI), slope, and solar altitude angle data. Canopy height was employed as the ideal output data, and the random forest model was utilized to infer forest canopy height for the prediction set samples. The training set for the model comprised 80% of the CHM100 data (refers to data obtained by combining CHM data with RH100 data for correlation analysis and screening) selected randomly, while the remaining 20% was reserved for model accuracy validation, aiming to identify the optimal model. The specific method flow chart is shown in Figure 9.
The specific steps are as follows:
(1)
Combining sensitivity analysis to assign sensitivity indices to important variables as weights for the forest height model.
(2)
Ensuring that the dependent variable (y) consists of forest canopy height values extracted from airborne LiDAR data and the independent variable (x) comprises forest canopy height values extracted from GEDI data.
(3)
Incrementally adding feature variables and calculating R² and RMSE to evaluate the model’s estimation performance.
Lastly, the accuracy of the model was validated using the height data extracted from airborne LiDAR, which served as the ground truth data.
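Step (3) above, adding feature variables one at a time and tracking R² and RMSE on a held-out 20% of the samples, can be sketched with scikit-learn as follows. All data are synthetic and the feature names are placeholders; the sketch uses a plain random forest and does not reproduce the sensitivity weighting of the proposed method.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n = 500
features = {
    "gedi_rh100": rng.uniform(5, 35, n),       # GEDI canopy height in metres
    "ndvi": rng.uniform(0.2, 0.9, n),
    "slope": rng.uniform(0, 40, n),            # degrees
    "sun_elevation": rng.uniform(-60, 60, n),  # degrees
}
# Synthetic stand-in for canopy height from airborne LiDAR.
als_height = (0.8 * features["gedi_rh100"] + 5 * features["ndvi"]
              - 0.1 * features["slope"] + rng.normal(scale=1.5, size=n))

order = list(features)
results = []
for k in range(1, len(order) + 1):
    used = order[:k]
    X = np.column_stack([features[name] for name in used])
    # Same random_state gives the same 80/20 split for every feature set.
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, als_height, test_size=0.2, random_state=0)
    model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_tr, y_tr)
    pred = model.predict(X_te)
    rmse = float(np.sqrt(mean_squared_error(y_te, pred)))
    results.append((used, float(r2_score(y_te, pred)), rmse))

for used, r2, rmse in results:
    print(f"{len(used)} feature(s): R2 = {r2:.2f}, RMSE = {rmse:.2f} m")
```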

3.4. Canopy Height Mapping of Changbai Mountain Forest Area

Due to the existence of along-track and across-track spacing, GEDI footprint points are discretely distributed. One of the key research focuses of this study is how to achieve the transformation from discrete to continuous model construction. Although it is difficult to accurately obtain information about forest vertical structure using optical remote sensing technology, it can acquire spectral reflectance information which is crucial for constructing models that extrapolate crown height from points to surfaces. Therefore, by combining airborne LiDAR data with continuous optical image data, the successful transformation from discrete points to continuous surfaces can be achieved. This study combines the estimation set of spot-scale forest canopy height inversion model and image feature parameter set to construct a point-to-surface forest canopy height extrapolation model.
Random forest is an ensemble method consisting of multiple decision trees. It uses bootstrap resampling to draw multiple samples from the original dataset, builds a decision tree for each bootstrap sample, and then combines the trees' predictions into the final prediction, by majority vote for classification or by averaging for regression. Random forest is relatively robust to outliers and noise, which supports high prediction accuracy. Moreover, previous studies have shown that the RF algorithm is stable when modeling forest parameters from GLAS data and other remote sensing images. Because of the multicollinearity among image feature parameters and the nonlinear relationship between canopy height and these parameters, the random forest regression algorithm, enhanced through sensitivity analysis, is used to construct the forest canopy height extrapolation model. The steps are as follows:
(1) Parameter settings: The number of decision trees (ntree) and the number of features randomly sampled at each split (mtry) are the two key parameters of the random forest algorithm. In this study, ntree and mtry are determined through iterative optimization.
(2) Ranking of feature importance (the parameter sensitivity index): Using a large number of covariates in the extrapolation may cause the RF model to overfit. To reduce this risk, the extrapolation model uses only the feature variables ranked as highly sensitive in the sensitivity analysis. The procedure is as follows. Feature selection: the feature variables are ranked by sensitivity. Scale extrapolation using the image feature parameters and the forest height inversion model: a buffer zone is established around each data point, and the mean is calculated where buffers overlap. Forest height in non-forest areas is set to null to create the forest height map.
(3) Forest height inversion model construction: 80% of the GEDI L2A forest canopy height samples, together with the corresponding remote sensing image feature parameters, are randomly selected for random forest regression training. A further 10% of the samples are used to validate the forest height inversion model, and the remaining 10% are reserved for subsequent accuracy verification of the forest canopy height map.
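The bootstrap-and-aggregate idea behind random forest regression can be illustrated with a deliberately simplified sketch. It bags one-split regression stumps on a single feature and averages their predictions. This is not a full random forest (there is no per-split feature subsampling, i.e. no mtry, and the trees have depth one), and all names and sample values are hypothetical.

```python
import random


def fit_stump(xs, ys):
    """Fit a one-split regression stump on one feature by minimising squared error."""
    best = None
    for threshold in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        if not right:  # splitting at the maximum x leaves nothing on the right
            continue
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:  # all x values identical: predict the mean everywhere
        mean = sum(ys) / len(ys)
        return (float("inf"), mean, mean)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_bagged(xs, ys, n_trees=25, seed=0):
    """Train n_trees stumps, each on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    n = len(xs)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return forest


def predict_bagged(forest, x):
    """Regression ensembles average the trees' outputs instead of voting."""
    return sum(predict_stump(s, x) for s in forest) / len(forest)


# Illustrative values only: NDVI as the single feature, canopy height (m) as target.
ndvi = [0.2, 0.3, 0.4, 0.6, 0.7, 0.8]
height = [5.0, 6.0, 8.0, 20.0, 22.0, 25.0]
forest = fit_bagged(ndvi, height)
print(predict_bagged(forest, 0.25), predict_bagged(forest, 0.75))
```

Here `n_trees` plays the role of ntree. In practice a library implementation with full-depth trees and feature subsampling would be used.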

3.5. Accuracy Verification

The Canopy Height Model (CHM) is a surface model that expresses the height of vegetation above the ground, typically calculated as the difference between the Digital Surface Model (DSM) and the Digital Elevation Model (DEM). The airborne point cloud data are first preprocessed (denoising and filtering) and then classified into ground points and vegetation points using a support vector machine. Finally, the heights of the ground points and vegetation points are calculated separately to produce the DEM and CHM (Figure 10).
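The DSM minus DEM relationship can be sketched on small co-registered grids as below. This is a minimal illustration under stated assumptions: the grids are plain nested lists, missing cells are marked with `None`, and clamping negative differences to zero is a choice made here, not something the paper specifies.

```python
def canopy_height_model(dsm, dem, nodata=None):
    """Cell-by-cell DSM - DEM; negative differences are clamped to 0 (assumption),
    and a cell missing in either grid stays missing."""
    chm = []
    for dsm_row, dem_row in zip(dsm, dem):
        row = []
        for top, ground in zip(dsm_row, dem_row):
            if top is nodata or ground is nodata:
                row.append(nodata)
            else:
                row.append(max(0.0, top - ground))
        chm.append(row)
    return chm


# Illustrative elevations in metres (not data from the study).
dsm = [[812.0, 830.5], [None, 799.0]]
dem = [[800.0, 805.5], [798.0, 800.0]]
print(canopy_height_model(dsm, dem))  # [[12.0, 25.0], [None, 0.0]]
```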

4. Results

4.1. Accuracy Verification of the Selected Optimal GEDI Data

(1) Verification of GEDI Ground Elevation Accuracy
According to the optimal selection mechanism provided in this study, 1084 valid GEDI footprints were selected to extract ground elevation parameters. The ground elevation data extracted from airborne LiDAR point cloud data were used as the reference ground elevation to validate the accuracy of ground elevation extracted by GEDI L2A.
The verification results for ground elevation extracted by GEDI under different land cover types are shown in Figure 11. Whether all footprints are considered or only those in forest or bare ground areas, the R² value is 0.97, indicating a strong correlation between the ground elevation extracted from GEDI data and the reference ground elevation. Nevertheless, the ground elevation extracted from all footprints still contains some error, with an RMSE of 2.58 m. Therefore, it is necessary to further analyze the sources of error in GEDI-estimated ground elevation.
(2) Accuracy Verification of GEDI Forest Canopy Height Values
Based on the optimal selection mechanism provided in this study, a total of 607 footprints were obtained within the study area and used for accuracy verification and error analysis of forest canopy height. The RH parameters in GEDI L2A data are relative height metrics: RHn is the height above the detected ground at which n% of the cumulative returned waveform energy is reached, so the upper percentiles approach the top of the canopy and are closely related to canopy height. Therefore, RH values from RH90 to RH100 were extracted from the GEDI L2A product and matched with the forest canopy height data extracted based on Algorithm 5. The percentile with the highest matching accuracy was selected as the optimal forest canopy height value for accuracy verification.
The matching results between the forest height percentile parameters extracted from GEDI L2A and forest canopy height are shown in Figure 12. As the height percentile increases, its correlation with forest canopy height gradually increases, with the highest correlation, 0.9674, observed for RH100. From RH93 onwards, the correlations of the different height metrics become very close. This suggests that the data points in the higher percentiles are more consistent and less variable, leading to similar correlation values. These high percentiles represent the upper canopy structure, which tends to be more homogeneous than the lower canopy layers. As the canopy height increases, the structural complexity and variation decrease, resulting in more stable and comparable height measurements across different algorithms.
Furthermore, the similar correlations at these higher percentiles indicate that the different algorithms converge in their effectiveness in capturing the uppermost canopy characteristics. This convergence is important as it demonstrates the reliability of using any of these algorithms for studying the upper canopy structure.
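The percentile selection described above amounts to correlating each RH metric with the reference canopy height and keeping the best one. The sketch below assumes Pearson correlation as the matching criterion (the paper does not name the statistic) and uses hypothetical function names and made-up sample values.

```python
import math


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def best_rh_metric(rh_by_percentile, reference_heights):
    """Return the RH metric name with the highest correlation, plus all scores."""
    scores = {
        name: pearson_r(values, reference_heights)
        for name, values in rh_by_percentile.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Illustrative values only (metres), one entry per footprint.
reference = [10.0, 15.0, 20.0, 25.0]  # reference canopy heights
rh = {
    "rh90": [6.0, 14.0, 15.0, 23.0],
    "rh95": [8.0, 14.5, 18.0, 24.0],
    "rh100": [11.0, 16.0, 21.0, 26.0],
}
best, scores = best_rh_metric(rh, reference)
print(best)  # rh100
```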
The accuracy verification results for forest canopy height extracted by GEDI are presented in Figure 13. Whether all footprints are considered or only those in forested or bare land areas, the R² value is 0.99, indicating a strong correlation between the forest canopy height extracted from GEDI data and the reference forest canopy height. However, the forest canopy height extracted for all footprints still contains some error, with an RMSE of 2.58 m.

4.2. Accuracy Verification of Canopy Height

This study uses spaceborne GEDI L2A data as the main data source. Following the steps described above, optimal spot extraction is performed on the GEDI L2A data, and the result is combined with aerial data and image feature parameters extracted from optical remote sensing data. A forest canopy height retrieval model is then developed using the random forest regression algorithm. By extrapolating spot-scale canopy height with image features, a forest height map with 30 m spatial resolution is generated for the Changbai Mountain region. Finally, accuracy validation is conducted using a portion of the spaceborne GEDI L2A data and airborne LiDAR forest canopy height data.
The resulting forest canopy height map of the Changbai Mountain forest area, with a spatial resolution of 30 m, is shown in Figure 14. Overall, canopy height values in the northern forest region of Changbai Mountain are higher than in other areas.
The forest canopy height map with 30 m spatial resolution for the Changbai Mountain forest region was validated using airborne LiDAR data. Forest canopy height in the region mainly ranges from 10 m to 40 m. The R² is 0.64, the RMSE is 8.62 m, and the bias is 1.16 m. These results indicate a relatively high mapping accuracy but an overall overestimation of the actual canopy height. This could be attributed to the complex terrain of the Changbai Mountain forest region, which significantly affects the accuracy of forest canopy height extraction using spaceborne LiDAR. Additionally, the species richness and complex ecological structure may make it difficult for optical imagery to fully reflect forest characteristic parameters.
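The link between a positive bias and overestimation depends on the sign convention. A minimal sketch, assuming bias is the mean of (mapped minus reference) height, which the paper does not state explicitly, with a hypothetical function name and illustrative values:

```python
def mean_bias(reference, predicted):
    """Mean signed error (predicted - reference); positive means overestimation."""
    return sum(p - r for r, p in zip(reference, predicted)) / len(reference)


# Illustrative values only (metres).
lidar_heights = [18.0, 25.0, 31.0]   # airborne LiDAR reference
mapped_heights = [20.0, 26.0, 31.5]  # heights read from the 30 m map
print(round(mean_bias(lidar_heights, mapped_heights), 3))  # 1.167
```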

5. Conclusions

This study explored the accuracy of ground elevation and canopy height extracted from GEDI L2A data in relation to regional features. It quantified the effect of terrain features on extraction accuracy by analyzing the sensitivity indices of the extraction parameters. In addition, the canopy height of the Changbai Mountain forest area was inverted by combining GEDI L2A data with other auxiliary data. The resulting canopy height distribution map, with a spatial resolution of 30 m, achieved an R² of 0.64 and an RMSE of 8.62 m when validated against airborne LiDAR data.
(1) By analyzing the sensitivity indices of various characteristic variables and using sensitivity analysis to determine the optimal algorithm, we effectively screened GEDI data of higher quality and greater relevance to regional characteristics. This improved the validation accuracy for subsequent analyses.
(2) Using a sensitivity analysis-based random forest algorithm for canopy height inversion in the Changbai Mountain forest region improved the accuracy of the canopy height inversion model at the footprint scale, owing to the introduction of variables such as slope, vegetation cover, and solar altitude angle.
(3) By combining active and passive remote sensing data, a forest canopy height model built on the sensitivity analysis-integrated random forest algorithm was extrapolated to produce a continuous canopy height map with 30 m spatial resolution for the Changbai Mountain forest region, demonstrating high accuracy.
Forest canopy height is a crucial component in studying forest carbon cycling. However, its complex structural composition and multiple influencing factors pose challenges for research. In addition, the screened and clipped GEDI L2A data do not evenly cover the entire Changbai Mountain forest area, which affects the accuracy of the forest height mapping to some extent. More ground-measured data will be collected in the future to further verify the accuracy of the forest height map. Furthermore, as GEDI continues to collect data, the GEDI L2A data will be fully utilized in future updates of the forest height maps.

Author Contributions

Conceptualization, X.W. and R.W.; methodology, X.W.; software, X.W. and R.W.; validation, X.W., S.W. and S.X.; writing—original draft preparation, X.W.; writing—review and editing, X.W. and S.W.; visualization, X.W., S.W. and S.X.; supervision, X.W. and S.W.; project administration, R.W.; funding acquisition, R.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (NSFC) project ‘biomass precision estimation model research for large-scale region based on multi-view heterogeneous stereographic image pair of forest’ (41971376).

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

Author Shi Wei was employed by the company Beijing Ocean Forestry Technology Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Næsset, E. Determination of mean tree height of forest stands using airborne laser scanner data. ISPRS J. Photogramm. Remote Sens. 1997, 52, 49–56.
  2. Dubayah, R.; Blair, J.B.; Goetz, S.; Fatoyinbo, L.; Hansen, M.; Healey, S.; Hofton, M.; Hurtt, G.; Kellner, J.; Luthcke, S.; et al. The Global Ecosystem Dynamics Investigation: High-resolution laser ranging of the Earth’s forests and topography. Sci. Remote Sens. 2020, 1, 100002.
  3. Liu, A.; Cheng, X.; Chen, Z. Performance evaluation of GEDI and ICESat-2 laser altimeter data for terrain and canopy height retrievals. Remote Sens. Environ. 2021, 264, 112571.
  4. Potapov, P.; Li, X.; Hernandez-Serna, A.; Tyukavina, A.; Hansen, M.C.; Kommareddy, A.; Pickens, A.; Turubanova, S.; Tang, H.; Silva, C.E.; et al. Mapping global forest canopy height through integration of GEDI and Landsat data. Remote Sens. Environ. 2021, 253, 112165.
  5. Li, Y.; Wang, R.; Shi, W.; Yu, Q.; Li, X.; Chen, X. Research on Accurate Estimation Method of Eucalyptus Biomass Based on Airborne LiDAR Data and Aerial Images. Sustainability 2022, 14, 10576.
  6. Liang, M.; González-Roglich, M.; Roehrdanz, P.; Tabor, K.; Zvoleff, A.; Leitold, V.; Silva, J.; Fatoyinbo, T.; Hansen, M.; Duncanson, L. Assessing protected area’s carbon stocks and ecological structure at regional-scale using GEDI lidar. Glob. Environ. Chang. 2023, 78, 102621.
  7. Guerra-Hernández, J.; Pascual, A. Using GEDI lidar data and airborne laser scanning to assess height growth dynamics in fast-growing species: A showcase in Spain. For. Ecosyst. 2021, 8, 14.
  8. Chen, L.; Ren, C.; Zhang, B.; Wang, Z.; Liu, M.; Man, W.; Liu, J. Improved estimation of forest stand volume by the integration of GEDI LiDAR data and multi-sensor imagery in the Changbai Mountains Mixed forests Ecoregion (CMMFE), northeast China. Int. J. Appl. Earth Obs. Geoinf. 2021, 100, 102326.
  9. Liu, L.; Wang, C.; Nie, S.; Zhu, X.; Xi, X.; Wang, J. Analysis of the influence of different algorithms of GEDI L2A on the accuracy of ground elevation and forest canopy height. J. Univ. Chin. Acad. Sci. 2022, 39, 502.
  10. Parra, A.; Simard, M. Evaluation of Tree-Growth Rate in the Laurentides Wildlife Reserve Using GEDI and Airborne-LiDAR Data. Remote Sens. 2023, 15, 5352.
  11. Kutchartt, E.; Pedron, M.; Pirotti, F. Assessment of Canopy and Ground Height Accuracy from Gedi Lidar over Steep Mountain Areas. ISPRS Ann. Photogramm. Remote Sens. Spat. Inf. Sci. 2022, V-3-2022, 431–438.
  12. Dong, H.; Yu, Y.; Fan, W. Vertification of performance of understory terrain inversion from spaceborne lidar GEDI data. J. Nanjing For. Univ. 2022, 47, 141.
  13. Liu, X.; Su, Y.; Hu, T.; Yang, Q.; Liu, B.; Deng, Y.; Tang, H.; Tang, Z.; Fang, J.; Guo, Q. Neural network guided interpolation for mapping canopy height of China’s forests by integrating GEDI and ICESat-2 data. Remote Sens. Environ. 2022, 269, 112844.
  14. Wang, C.; Elmore, A.J.; Numata, I.; Cochrane, M.A.; Shaogang, L.; Huang, J.; Zhao, Y.; Li, Y. Factors affecting relative height and ground elevation estimations of GEDI among forest types across the conterminous USA. GIScience Remote Sens. 2022, 59, 975–999.
  15. Zhang, Y.; Ma, T.; Liu, B.; Liang, Y.; Huang, C.; Wu, M.; Jiang, S. Assessment of forest ecosystem integrity dynamics in Changbai Mountain National Nature Reserve. Chin. J. Ecol. 2021, 40, 2251–2262.
  16. Adam, M.; Urbazaev, M.; Dubois, C.; Schmullius, C. Accuracy Assessment of GEDI Terrain Elevation and Canopy Height Estimates in European Temperate Forests: Influence of Environmental and Acquisition Parameters. Remote Sens. 2020, 12, 3948.
  17. Schneider, F.D.; Ferraz, A.; Hancock, S.; Duncanson, L.I.; Dubayah, R.O.; Pavlick, R.P.; Schimel, D.S. Towards mapping the diversity of canopy structure from space with GEDI. Environ. Res. Lett. 2020, 15, 115006.
  18. Gwenzi, D. Lidar remote sensing of savanna biophysical attributes: Opportunities, progress, and challenges. Int. J. Remote Sens. 2016, 38, 235–257.
  19. Chen, X.; Wang, R.; Shi, W.; Li, X.; Zhu, X.; Wang, X. An Individual Tree Segmentation Method That Combines LiDAR Data and Spectral Imagery. Forests 2023, 14, 1009.
  20. Quiros, E.; Polo, M.-E.; Fragoso-Campon, L. GEDI Elevation Accuracy Assessment: A Case Study of Southwest Spain. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 5285–5299.
  21. Asner, G.P.; Mascaro, J. Mapping tropical forest carbon: Calibrating plot estimates to a simple LiDAR metric. Remote Sens. Environ. 2014, 140, 614–624.
  22. Liu, Q.; Fu, L.; Wang, G.; Li, S.; Li, Z.; Chen, E.; Pang, Y.; Hu, K. Improving Estimation of Forest Canopy Cover by Introducing Loss Ratio of Laser Pulses Using Airborne LiDAR. IEEE Trans. Geosci. Remote Sens. 2020, 58, 567–585.
  23. Dhargay, S.; Lyell, C.S.; Brown, T.P.; Inbar, A.; Sheridan, G.J.; Lane, P.N.J. Performance of GEDI Space-Borne LiDAR for Quantifying Structural Variation in the Temperate Forests of South-Eastern Australia. Remote Sens. 2022, 14, 3615.
  24. Urbazaev, M.; Hess, L.L.; Hancock, S.; Sato, L.Y.; Ometto, J.P.; Thiel, C.; Dubois, C.; Heckel, K.; Urban, M.; Adam, M.; et al. Assessment of terrain elevation estimates from ICESat-2 and GEDI spaceborne LiDAR missions across different land cover and forest types. Sci. Remote Sens. 2022, 6, 100067.
  25. Zhang, Z.; Dong, X.; Tian, J.; Tian, Q.; Xi, Y.; He, D. Stand density estimation based on fractional vegetation coverage from Sentinel-2 satellite imagery. Int. J. Appl. Earth Obs. Geoinf. 2022, 108, 102760.
  26. Hu, T.; Zhang, Y.; Su, Y.; Zheng, Y.; Lin, G.; Guo, Q. Mapping the Global Mangrove Forest Aboveground Biomass Using Multisource Remote Sensing Data. Remote Sens. 2020, 12, 1690.
  27. Huang, W.; Min, W.; Ding, J.; Liu, Y.; Hu, Y.; Ni, W.; Shen, H. Forest height mapping using inventory and multi-source satellite data over Hunan Province in southern China. For. Ecosyst. 2022, 9, 100006.
  28. Adrah, E.; Wan Mohd Jaafar, W.S.; Omar, H.; Bajaj, S.; Leite, R.V.; Mazlan, S.M.; Silva, C.A.; Chel Gee Ooi, M.; Mohd Said, M.N.; Abdul Maulud, K.N.; et al. Analyzing Canopy Height Patterns and Environmental Landscape Drivers in Tropical Forests Using NASA’s GEDI Spaceborne LiDAR. Remote Sens. 2022, 14, 3172.
  29. Wang, Y.; Li, G.; Ding, J.; Guo, Z.; Tang, S.; Wang, C.; Huang, Q.; Liu, R.; Chen, J.M. A combined GLAS and MODIS estimation of the global distribution of mean forest canopy height. Remote Sens. Environ. 2016, 174, 24–43.
  30. Klein, T.; Randin, C.; Korner, C. Water availability predicts forest canopy height at the global scale. Ecol. Lett. 2015, 18, 1311–1320.
  31. Zhongxuan Si, C.Y. Vegetation Coverage Inversion Based on Pixel Dichotomy Model. Adv. Geosci. 2023, 13, 865–878.
  32. Agca, M.; Daloglu, A.I. Local Geoid height calculations with GNSS, airborne, and spaceborne Lidar data. Egypt. J. Remote Sens. Space Sci. 2023, 26, 85–93.
  33. Speiser, J.L.; Miller, M.E.; Tooze, J.; Ip, E. A comparison of random forest variable selection methods for classification prediction modeling. Expert Syst. Appl. 2019, 134, 93–101.
  34. Gislason, P.O.; Benediktsson, J.A.; Sveinsson, J.R. Random Forest Classification of Multisource Remote Sensing and Geographic Data. In Proceedings of the IGARSS 2004—2004 IEEE International Geoscience and Remote Sensing Symposium, Anchorage, AK, USA, 20–24 September 2004; Volume 2, pp. 1049–1052.
  35. Simard, M.; Pinto, N.; Fisher, J.B.; Baccini, A. Mapping forest canopy height globally with spaceborne lidar. J. Geophys. Res. 2011, 116, G04021.
  36. Molto, Q.; Hérault, B.; Boreux, J.J.; Daullet, M.; Rousteau, A.; Rossi, V. Predicting tree heights for biomass estimates in tropical forests—A test from French Guiana. Biogeosciences 2014, 11, 3121–3130.
  37. Holmgren, J.; Nilsson, M.; Olsson, H. Estimation of Tree Height and Stem Volume on Plots Using Airborne Laser Scanning. For. Sci. 2003, 49, 419–428.
  38. Lahssini, K.; Baghdadi, N.; le Maire, G.; Fayad, I. Influence of GEDI Acquisition and Processing Parameters on Canopy Height Estimates over Tropical Forests. Remote Sens. 2022, 14, 6264.
  39. Healey, S.P.; Yang, Z.; Gorelick, N.; Ilyushchenko, S. Highly Local Model Calibration with a New GEDI LiDAR Asset on Google Earth Engine Reduces Landsat Forest Height Signal Saturation. Remote Sens. 2020, 12, 2840.
  40. Wu, D.; Fan, W. Forest canopy height estimation using LiDAR and optical multi-angler data. J. Beijing For. Univ. 2014, 36, 8–15.
  41. Pal, M. Random forest classifier for remote sensing classification. Int. J. Remote Sens. 2007, 26, 217–222.
  42. Han, H.; Guo, X.; Yu, H. Variable selection using Mean Decrease Accuracy and Mean Decrease Gini based on Random Forest. In Proceedings of the 2016 7th IEEE International Conference on Software Engineering and Service Science (ICSESS), Beijing, China, 26–28 August 2016; pp. 219–224.
  43. Archer, K.J.; Kimes, R.V. Empirical characterization of random forest variable importance measures. Comput. Stat. Data Anal. 2008, 52, 2249–2260.
  44. Tang, H.L.; Huang, H.B.; Zheng, Y.; Qin, P.; Xu, Y.F.; Ding, S. Improved GEDI Canopy Height Extraction Based on a Simulated Ground Echo in Topographically Undulating Areas. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5705915.
  45. Shendryk, Y. Fusing GEDI with earth observation data for large area aboveground biomass mapping. Int. J. Appl. Earth Obs. Geoinf. 2022, 115, 103108.
  46. Freeman, E.A.; Moisen, G.G.; Coulston, J.W.; Wilson, B.T. Random forests and stochastic gradient boosting for predicting tree canopy cover: Comparing tuning processes and model performance. Can. J. For. Res. 2016, 46, 323–339.
  47. Wang, H.; Seaborn, T.; Wang, Z.; Caudill, C.C.; Link, T.E. Modeling tree canopy height using machine learning over mixed vegetation landscapes. Int. J. Appl. Earth Obs. Geoinf. 2021, 101, 102353.
Figure 1. Study area. (a) Illustration of experimental area. (b) Processed remote sensing dataset, including Global Ecosystem Dynamics Investigation (GEDI) Light Detection and Ranging (LiDAR). (c) Slope. (d) Aspect. (e) Canopy Height Model.
Figure 2. Schematic diagram of GEDI L2A data and aerial image location.
Figure 3. Airborne data accuracy validation.
Figure 4. Flowchart.
Figure 5. Comparison chart of the sensitivity of six algorithms of GEDI L2A data.
Figure 6. Parameter sensitivity index.
Figure 7. Influence error of forest cover on ground elevation extracted by GEDI L2A.
Figure 8. Influence error of slope on ground elevation extracted by GEDI L2A.
Figure 9. Flow chart of the random forest model.
Figure 10. (a) DEM in forest area and (b) CHM in forest area low.
Figure 11. GEDI extraction accuracy of ground elevation and canopy height.
Figure 12. Comparison of forest height percentiles and canopy height extracted from GEDI data.
Figure 13. Accuracy of forest canopy height extraction from GEDI L2A data.
Figure 14. (a) Forest canopy height map with 30 m spatial resolution in Changbai Mountain forest and (b) Slope schematic.
Table 1. Description of the main parameters of GEDI [19].

| Parameter Name | Parameter Value | Parameter Name | Parameter Value |
| --- | --- | --- | --- |
| Operating Altitude | Approximately 400 km | Reference System | WGS84 |
| Coverage Range | 51.6° S~51.6° N | Beam Diameter | ≈25 m |
| Emission Frequency | 242 Hz | Along-Track Distance | 60 m |
| Laser Wavelength | 1064 nm | Across-Track Distance | 600 m |
| Pulse Width | 14 ns | Scan Width | 4.2 km |
| Pulse Intensity | 10 mJ | Number of Tracks | 8 |
Table 2. Airborne LiDAR data parameters.

| Parameters | Value |
| --- | --- |
| Flight altitude (m) | 500 |
| Laser wavelength (nm) | 1064 |
| Scanning angle (°) | 10–60 |
| Pulse frequency (kHz) | 50–100 |
| Point cloud average density (pts/m²) | 160 |
Table 3. Parameter filtering criteria.

| Parameter | Screening Criteria |
| --- | --- |
| lon_lowestmode_a<n> | |
| lat_lowestmode_a<n> | |
| quality_flag_a<n> | value = 1 |
| \|elev_lowestmode-TanDEM-X\| | value > 50 m |
| degrade_flag | value = 0 |
| sensitivity | value ≥ 0.9 |
| rx_assess_flag | value = 0 |
| rx_algrunflag | value = 1 |
Table 4. Differences in ground elevation accuracy extracted at different collection times.

| GEDI L2A | Number of Spots | R² | RMSE (m) |
| --- | --- | --- | --- |
| Solar_elevation ≥ 0 (Day) | 584 | 1.00 | 4.23 |
| Solar_elevation < 0 (Night) | 537 | 1.00 | 3.79 |
Table 5. Differences in ground elevation accuracy extracted by different beam types.

| GEDI L2A | Number of Spots | R² | RMSE (m) |
| --- | --- | --- | --- |
| Coverage beam | 433 | 1.00 | 5.33 |
| Full power beam | 651 | 1.00 | 4.24 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Wang, X.; Wang, R.; Wei, S.; Xu, S. Application of Random Forest Method Based on Sensitivity Parameter Analysis in Height Inversion in Changbai Mountain Forest Area. Forests 2024, 15, 1161. https://doi.org/10.3390/f15071161