**1. Introduction**

The Asian temperate mixed forest in northeastern China is one of the three major temperate mixed forests in the world (i.e., northeastern North America, Europe, and East Asia) [1], which is of great strategic importance to the carbon trading of China. The forests of Northeast China have experienced three periods of excessive timber harvesting in the last century, including the period of Russian and Japanese aggression (1896–1945), the period of encouraging excessive harvesting for timber production (1950–1977), and the period of national economic reforms and the broadening of international relations (1978–1998) [2]. The excessive logging and neglected cultivation of forests nearly exhausted exploitable forest reserves in the region [3]. Since the Natural Forest Conservation Program (NFCP) was put into practice in 1998, there was a profound shift in focus from timber production to environmental protection by rehabilitating damaged forest ecosystems, afforesting desertified and degraded areas, and banning logging in natural forests [2]. In this context, natural secondary forests (NSFs) are gradually expanding and gaining importance. NSFs, which account for as much as 70% of the forests of northeastern China, refer to as the natural-regeneration forests after stand-replacing disturbances of primary forests by anthropogenic activities or by extreme natural events [4,5]. Nowadays, the NSFs

**Citation:** Du, C.; Fan, W.; Ma, Y.; Jin, H.-I.; Zhen, Z. The Effect of Synergistic Approaches of Features and Ensemble Learning Algorithms on Aboveground Biomass Estimation of Natural Secondary Forests Based on ALS and Landsat 8. *Sensors* **2021**, *21*, 5974. https://doi.org/ 10.3390/s21175974

Academic Editors: Moulay A. Akhloufi and Mozhdeh Shahbazi

Received: 8 August 2021 Accepted: 2 September 2021 Published: 6 September 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

of northeastern China are gradually recovering from the excessive logging of the 20th century, which led to an extraordinary reduction in the quality of the forest ecosystem. NSFs are of significance to China not only for timber supply, but also for a vital reservoir of biodiversity, potential carbon sequestration, a destination of ecological tourism, and a broad ecological shelter for northeastern China [2].

The accurate estimation of forest aboveground biomass (AGB) has a critical effect on the understanding of forest quality and recovery in the NSFs of northeastern China. AGB is defined as the dry mass of live or dead matter from tree or shrub life forms, typically expressed as a mass per area density (e.g., Mg/ha) [6]. In general, AGB could be obtained by (1) direct harvest method; (2) allometric equation-based method; (3) biomass expansion factor (BEF)-based method; (4) process-based biogeochemical modeling; (5) remote sensingbased estimation method. Although the direct harvest method is the most exact among these methods, it is time-consuming, destructive, and labor-intensive. It is only suitable for AGB estimation of a small area or of individual trees with a small sample size [7] and is usually applied as reference data to establish allometric equations of AGB (e.g., [8,9]). An allometric equation-based method is more flexible and feasible than the direct harvest method to estimate AGB on both individual tree and plot levels. A variety of allometric equations are developed for diverse tree species by modeling the relationship between AGB and various physical parameters of trees, such as diameter at breast height (DBH), tree height, crown diameter, etc. (e.g., [8,10–13]). Similar to an allometric equation-based method, the BEF-based method applied BEF defining as the ratio of all stand biomass to growing stock volume to convert timber volume to biomass [14]. However, the allometric equation-based and BEF-based methods are still time-consuming and expensive because both of them are based on the acquisition of field measurements (such as DBH, tree height), and still limited to plot-level or individual tree-level AGB estimations. Process-based biogeochemical models consider the processes including photosynthesis, absorption, and carbon allocation, and generally couple biology, soil, climate, hydrology, and anthropogenic effects [15]. To some degree, these models could improve the conventional, point-based estimation of biomass over large areas [16]. However, the high uncertainties in biomass estimation due to constraints in data source, spatial resolution, homogeneous assumption, and inaccuracy of models greatly limit the usage of process-based biogeochemical models [15]. The remote sensing-based method is exceedingly appealing for estimating forest biomass on a large scale (e.g., local, regional or global) because of its unique characteristics such as repetitive data acquisition, large coverage, digital format, and so on [15], and of the capability of providing spatially explicit AGB estimates for every pixel location, instead of only the mean or total biomass within a given inventory unit [17,18]. Nowadays, it becomes the most commonly used method for large-scale AGB estimation [19–21].

In the last three decades, researchers have attempted a variety of remotely sensed data sources to estimate AGB. With a relatively long history of data availability, optical satellite imagery (such as Landsat, MODIS, etc.) has become a primary data source for biomass estimation (e.g., [22–25]). In particular, Landsat series satellite imagery is the most commonly used data source for AGB estimation (e.g., [26–30]), mainly because of the continuous, long-term, medium spatial resolution, and cross-calibrated data for global surface observations, and free access policy [31]. However, it is of significance to notice the data saturation in Landsat imagery, which refers to the phenomenon that spectral reflectance values are not sensitive to the change in biomass of mature forest or advanced successional forests even if AGB varies significantly [32,33]. For example, Steininger [34] found that the canopy reflectance in Landsat imagery saturates when the AGB approaches 15 kg/m<sup>2</sup> or over 15 years of age in Brazilian tropical secondary forests. Zhao et al. [33] examined the saturation values in Landsat imagery for different vegetation types in a subtropical region, and found the AGB saturation values for pine forest, mixed forest, Chinese fir forest, broadleaf forest, bamboo forest, and shrub were 159, 152, 143, 123, 75, and 55 Mg/ha, respectively. Data saturation in optical imagery like Landsat significantly lowers the accuracy and increases the uncertainties of AGB estimation [15]. Data saturation

still exists in RADAR (Radio Detection and Ranging) data like SAR (Synthetic Aperture Radar) [35]. Generally speaking, saturation values could be higher obtained by longer wavelengths (such as L and P bands) and lower by shorter wavelengths (such as C bands), and also vary for different forest structures [36]. Until now, the data saturation problem caused by remote sensing signals is still one of the biggest obstacles to applying optical imagery and RADAR data for AGB estimation [15,37,38].

Since the 1990s, it has been found that LiDAR (Light Detection and Ranging) is more advantageous than optical imagery for AGB estimation because it is more relative to tree height and produces less estimation error [39]. Meanwhile, LiDAR is unaffected by the data saturation problem, even for high AGB values (>1000 Mg/ha) [40]. Thus, LiDAR data is widely used in AGB estimation in the last two decades. According to the format of return signals, LiDAR can be classified into discrete and continuous LiDAR; according to platforms, LiDAR can be classified into spaceborne, airborne, UAV(Unmanned Aerial Vehicle), terrestrial, backpack/handheld LiDAR; according to the size of the footprint, LiDAR can be classified into small footprint (footprint size <1 m), mid footprint (footprint size: 10–30 m), and large footprint LiDAR (footprint >50 m) [41]. In recent years, Airborne Laser Scanning (ALS) data, a kind of discrete, multiple returns, and small footprint LiDAR data captured from an aerial platform, has received much scientific and operational attention for AGB estimation than any of the other remote sensing data [42]. ALS emits laser pulses towards the ground and receives the pulses reflected from the tree canopy, branches, leaves, trunk, shrub, and then ground to form a three-dimensional profile of forest structure. ALS is far more capable than optical and RADAR sensors in estimating forest parameters and is considered the premier tool for large-scale AGB estimation (e.g., [43–47]). It is beneficial to estimate AGB by capturing both two-dimensional spectral information of the upper canopy and three-dimensional structural information of the canopy. However, the spectral characteristics of vegetation provided by ALS are very limited since most LiDAR systems only work at a single wavelength [48]. Thus, the integration of optical imagery and ALS data has become the most promising approach for large-scale AGB estimation (e.g., [48–53]).

Features are the most direct representation or manifestation of data sources. Feature extraction and selection could greatly influence the accuracy of AGB estimation [54]. A variety of spectral-related features including band combinations, textures, diverse vegetation indices, leaf area index, fraction of vegetation cover, and so on were derived from optical imagery for AGB estimation (e.g., [29,55–58]). Similar, diverse point-based features including height statistics (e.g., mean, maximum, variance, skewness, etc.), canopy-based quantile estimators, canopy relief ratio, laser penetration rates, canopy closure, and so on were extracted from LiDAR data for AGB estimation (e.g., [25,46,52,59,60]). Some researchers directly combined optical imagery and LiDAR features (e.g., [21,61,62]) while a few of them designed novel features derived from optical imagery and LiDAR data to improve AGB estimation. For example, Zhang et al. [48] developed two novel groups of features (i.e., *COLI*1 and *COLI*2) using seven vegetation indices derived from Landsat 8 and the best-performing LiDAR variable (i.e., mean of height). The *COLI*1 and *COLI*2 were generated by the multiplication and ratio combinations of the best-performing LiDAR variable and each vegetation index, respectively. They found that the stacked sparse autoencoder network model with the combination of all *COLI*1, optical, and LiDAR features yielded the highest accuracy of AGB estimation for the coniferous and broadleaf mixed forest of southeast China. However, whether it is more efficient to use novel features extracted from both data than directly combine all features is still needed to be further investigated.

In addition to data sources and features, it is vital to establish a reliable and suitable model to estimate AGB. Currently, most remote sensing-based AGB estimation methods use data-driven empirical models, which can be divided into parametric and nonparametric models [63]. Parametric models explicitly determine parameterized expressions of independent variables (e.g., spectral bands) and the dependent variable of interest (e.g., AGB) assuming the probability distributions of the variables being assessed [63]. Multiple linear regression, a classic parametric model with normality assumption, was

the most widely used method in previous AGB studies due to their simplicity and interpretability (e.g., [53,64,65]). Other parametric models, like non-linear regression (e.g., an exponential, power, or polynomial fitting function), were also applied for AGB estimations (e.g., [59,66,67]). Unlike parametric models, nonparametric models are distribution-free methods in which the predictor does not take a predetermined form but is constructed according to information derived from the data. Most machine learning models belongs to non-parametric, such as artificial neural network (ANN), random forest (RF), k-nearest neighbor (KNN), support vector machine (SVM), cubist (CB), classification and regression tree (CART), convolutional neural networks (CNN) and so on. Without the assumption of distribution, the non-parametric machine learning models are extremely flexible and capable of capturing the complex relationships between remote sensing variables and AGB, and widely applied in AGB estimation (e.g., [43,68–73]).

Ensemble learning, a branch of machine learning, is designed to learn tasks by constructing and then integrating multiple learners to produce a strong learner for improving accuracy [74,75]. There are three basic categories of ensemble learning: bagging, boosting, and stacking. RF and adaptive boosting (AdaBoost) algorithms are classic representatives of bagging and boosting algorithms, respectively. RF builds trees using subsamples and a random subset of predictors and can be very effective for estimating AGB due to its robustness to overfitting and noise in the training dataset [43,76,77]. Adaptive boosting is an iterative boosting algorithm that adaptively changes the distribution of the training set based on the performance of previous learners. Another boosting algorithm, called extreme gradient boosting (XGBoost), has been demonstrated to show great advantages in decreasing overestimation of low AGB values and underestimation of high AGB values for a forest type-based biomass estimation using continuous forest inventory data and Landsat 8 imagery [54]. Stacking, first proposed by Wolpert [78], is another method for combining multiple models but is less used than bagging and boosting. Unlike the RF algorithm that the base learner is homogeneous (e.g., regression tree), stacking are heterogeneous ensemble algorithms that could integrate diverse base learners to generate a stronger learner. The stacking algorithm was used to estimate canopy height in forestry (e.g., [79]), however, its potential has not been fully explored in AGB estimation.

Although the synergistic utilization of ALS and optical passive imagery was proved to improve AGB estimation [48], the synergistic approach (i.e., features) has not been fully investigated, especially for NSFs with complex structures. For example, is it more efficient to apply a novel feature extracted from passive imagery and LiDAR data (e.g., *COLI*1 and *COLI*2 in [48]) or directly combine all the features from the two data sources (like [61])? In addition, will ensemble learning algorithms improve the accuracy of AGB estimation for NSFs? Inspired by these questions, this study aimed at exploring the effects of different synergistic approaches of features and ensemble learning algorithms on AGB estimation of NSFs of northeastern China based on ALS and Landsat 8 OLI (Operational Land Imager) imagery. Specifically, the objectives of this study were (1) to investigate the effects of different data sources and classic machine learning algorithms on AGB estimation of a natural secondary forest; (2) to grope for a highly effective approach to combine ALS and Landsat 8 OLI imagery on AGB estimation of a natural secondary forest; (3) to explore the performances of ensemble learning algorithms in estimating AGB of a natural secondary forest; (4) to generate an accurate wall-to-wall AGB map of a natural secondary forest for future forest resources management.

#### **2. Materials and Methods**

#### *2.1. Study Area*

The study area is located in Maoershan Experimental Forest Farm of Northeast Forestry University (NEFU), Shangzhi, Heilongjiang Province, China, ranging from 127◦29 to 127◦44 E and 45◦14 to 45◦29 (Figure 1). The landform of the forest farm belongs to a low mountain and hilly area. The terrain gradually rises from south to north, with an

average elevation of 300 m. The highest mountain is Maoer Mountain, with an elevation of 805 m.

**Figure 1.** The location of study area: (**a**) The location of Maoershan Experimental Forest Farm within Heilongjiang Province; (**b**) the locations of 195 plots (20 m × 30 m) within Maoershan (Background: Landsat 8 OLI image).

The total area of the forest farm is 26,496 ha, which belongs to a typical natural secondary forest in northeastern China. The vegetation in the Maoershan area is a part of Changbai plant flora, with the original zonal top-level community of Korean pine broad-leaved forest. Due to the destruction in the last century, the original vegetation has undergone reverse succession. It has formed a forest landscape in which natural secondary forests are dominated by precious broad-leaved forests, poplar and birch forests, oak forests, and so on, and plantations such as red pine and larch are inlaid. The main species include *Betula platyphylla*, *Quercus mongolica*, *Populus davidiana*, *Larix olgensis*, *Pinus sylvestris*, and *Pinus koraiensis*, etc. The average forest coverage rate is 95%, and the total stock is approximately 3.5 million m3.

#### *2.2. Data Collection*

#### 2.2.1. Remotely Sensed Data

The remotely sensed data utilized in this study include ALS data and Landsat 8 OLI imagery. ALS data were obtained in September 2015. It is a secondary product scanned by the LiDAR sensor (Riegl LMS-Q680i) carried by the LiCHY system of the Chinese Academy of Forestry. The maximum frequency of the laser pulse of the LiDAR sensor is 400 kHz, with a wavelength of 1550 nm, a scanning angle of ±30◦, a sampling interval of 1 ns, and vertical accuracy of 0.15 m. The sidelap of this flight strip was designed to be greater than 60%, with an average point cloud density of 3.6 points·m<sup>−</sup>2.

To be consistent with ALS data in time, the Landsat 8 OLI imagery acquired on 13 September 2015 was applied in this study (downloaded from https://earthexplorer.usgs. gov/ (accessed on 1 September 2021)). The scene ID is LC81170282015256LGN01 (L1T-level product), with cloudiness of 1.35%, sun elevation angle of 45.28◦, and sun azimuth angle of 154.91◦. Seven multispectral bands (band1–band7) of 30 m nominal spatial resolution were utilized in this study. The radiometric resolution of the imagery is 12 bits and the swath width is 185 km × 185 km.

#### 2.2.2. Reference Data

The 195 fixed plots data of continuous forest resources inventory obtained in 2016 was applied as reference data in this study (see Figure 1b). The plot size was 20 m × 30 m and the center of each plot was correctly determined using a GPS (accuracy ±5 m). The diameter at breast height (DBH) of the trees larger than 5 cm and the tree species of each plot were recorded.

The AGB of individual trees was calculated using the species-specific allometric growth equations with DBH. In this study, the allometric growth models developed by [80,81] for the major species of trees and understory in northeastern China were employed to calculate the AGB of individual trees. The allometric growth equation was showed as Equation (1) and the parameters of major species of trees and understory were listed in Table 1.

$$\mathcal{W} = \begin{array}{c} a \cdot D^b \end{array} \tag{1}$$

where *W* represents aboveground biomass (kg), *D* represents DBH (cm), *a* and *b* are estimated parameters of different species in [80,81]. The AGB of the plot was the cumulative summation of the AGB of individual trees of each plot.


**Table 1.** Estimated parameters (*a* and *b*) of the allometric growth models of different species applied in this study.

<sup>1</sup> Represents plantations; otherwise are natural forests. <sup>2</sup> represent arbor-like mixed species of understory that do not have a specific Latin name.

#### *2.3. Methods*

To investigate the effects of different synergistic approaches of features and ensemble learning algorithms on AGB estimation of NSFs, a five-step methodology with three experiments of features (Feature experiments I-III) was implemented in this study, including (1) data preprocessing, (2) feature extraction and selection, (3) establishment and evaluation

of classic machine learning models, (4) establishment and evaluation of ensemble learning models, (5) wall-to-wall AGB prediction using the most effective algorithm and features. Feature experiment I was designed to explore the effects of features from different data sources (ALS, optical imagery, and combined data) on AGB estimation based on a variety of machine learning algorithms; Feature experiment II was designed to investigate how to efficiently combine the best-performing ALS feature (a unique feature) with several spectral features for AGB estimation, is it better to use novel extracted features or directly combine all the features?; Feature experiment III aims to compare the performance of combining all features for AGB estimation. The feature experiment design and logic of this study were shown in Table 2 and Figure 2, respectively.


**Table 2.** Feature experiments designed in this study.

<sup>1</sup> Number of features was determined by the procedure described in Section 2.3.2. <sup>2</sup> *COLI*1 and *COLI*2 were calculated using Equations (3) and (4) described in Section 2.3.2. <sup>3</sup> Feature 3 is the best performing ALS feature.

**Figure 2.** The flowchart of this study. Note: the number in parentheses represents feature number. Feature 1: optimal ALS features; Feature 2: optimal Landsat 8 features; Feature 3: the best performing ALS feature; Feature 4: all *COLI*1s; Feature 5: all *COLI*2s.

#### 2.3.1. Preprocessing of Remotely Sensed Data

The preprocessing of the ALS data includes (1) noise elimination (such as air points, low points, and isolated points). The radius of a fitting plane and the multiples of standard deviation were set to 0.5 m and 1, respectively. The algorithm will automatically calculate the standard deviation of the surrounding fitting plane of a point. If the distance from this point to that plane is less than multiples of standard deviation, this point will be kept. (2) classification of ground and non-ground points. The ground points were classified by improved progressive triangulated irregular network densification (IPTD) filtering algorithm developed in [82]. The maximum building size and maximum terrain angle were set to 20 m and 88◦, respectively. (3) normalization of point clouds. A digital terrain model (DTM) with a resolution of 0.5m was generated based on ground points using the inverse distance weighted (IDW) interpolation method. The power of the distance between sampling points and an unknown point was set to 2, and the smallest number of points

used for interpolation was 12. Then, the point clouds were normalized by subtracting the DTM value from the elevation of all points. The preprocessing of the ALS data was implemented using LiDAR 360 V3.2 of GreenValley International.

Preprocessing of the Landsat 8 OLI imagery including radiometric calibration, atmospheric correction, and topographic correction was implemented using ENVI 5.3 software. The Fast Line-of-sight Atmospheric Analysis of Spectral Hypercube (FLAASH) radiative transfer model was implemented for atmospheric correction and conversion to surface reflectance in the EVNI environment. The topographic correction was conducted with the well-known Sun Canopy Sensor + C correction (SCS + C) approach using the extension tool of "Topographic Correction\_V5.3\_4\_S1". The SCS + C correction approach reduces overcorrection and is an effective topographic correction method in forested and mountainous terrain [83,84]. The SCS + C topographic correction model can be expressed by Equation (2).

$$L\_t = L \cdot \left(\frac{\cos \theta \cdot \cos \alpha + \mathcal{C}}{\cos i + \mathcal{C}}\right) \tag{2}$$

where *Lt* is the corrected pixel radiance value of the image; *L* is the uncorrected pixel radiance value of the image; *i* is the incidence angles on a horizontal surface; *θ* is the solar zenith angle; α is the slope angle; C is the semi-empirical parameter. DTM generated from ALS data was applied for topographic correction in this study.

2.3.2. Feature Extraction and Selection
