1. Introduction
A taper equation is a mathematical function that relates the diameter at any point on the stem to the height of that point, the total tree height (TH), and the diameter at breast height (DBH). Taper functions have a wide range of applications in tree stem volume estimation, 3D spatial model reconstruction, yield estimation, forest planning, and timber production simulation and optimization [1,2]. A taper equation can be used to estimate the diameter at any height on the stem, the height at any diameter, the total stem volume, and the merchantable volume of different specifications [3,4,5]. A simplified Kozak variable-exponential taper function can improve the predictive ability of the model by adding an auxiliary diameter through the mixed-effects method [6]. The fixed-effects model is simpler than the mixed-effects model. However, the mixed model accounts for the error correlation and heterogeneity of hierarchical data, which improves the prediction accuracy of the model and helps explain the source of random errors [7]. A study by Doyog N D of the compatible Max and Burkhart (1976) model showed that the diameter at the base of the trunk was overestimated, while the middle and upper diameters were underestimated; the overall height and volume of the stem at a given diameter were overestimated [8]. A compatible system composed of a taper equation, a total volume equation and a merchantable volume equation has been used for modeling, and the fitting accuracy of the optimal model reached 98% [9]. Abiotic factors such as climate, soil, water, and altitude, as well as biological factors, affect the trend of the stem curve [5,10]. In addition, the relationship between the taper equation and the crown width (CW), stand density, thinning intensity, and site quality has been explored [11]. Jiang et al. [12] adopted a nonlinear mixed model to fit the stem taper equation of Larix gmelinii with crown characteristics. Their model shows that stem taper is related to the crown ratio, and that trees with a large crown ratio have poor stem form. Cai Jian et al. [13] reported that trees under moderate thinning (35.7% of the plants removed) had relatively full stems.
LiDAR is an active remote sensing technique that measures a target and detects its position, shape, height and other parameters by emitting and receiving laser pulses. Terrestrial laser scanning (TLS) emerged in the 1990s. TLS instruments enable the nondestructive, rapid and precise digitization of physical scenes into three-dimensional (3D) point clouds [14]. A modern TLS system can densely scan the surroundings underneath the canopy within a few hundred meters with high accuracy [15]. Compared to traditional field inventories, TLS can obtain relatively complete 3D coordinate information, which can be reconstructed in the form of point clouds to faithfully restore the overall structure and morphological characteristics of the scanned objects [16]. The point cloud data obtained by TLS are dense and accurate and have the potential for automatic processing. Moreover, TLS captures the forest canopy at high resolution without damaging the forest. Liang et al. [17] reported that, with field-measured stem diameters as reference, the evaluation of extraction accuracy showed a mean RMSE of 1.13 cm per tree, as low as that of manual extraction from TLS data. The 2D Hough transform and circle fitting are commonly used in forestry to identify single trees and obtain their positions and DBHs [18]. Some researchers have proposed an octree-based ground point filtering algorithm for woodland scenes, which enables automatic, high-accuracy filtering of ground points. These scholars also proposed an algorithm for identifying trees according to the projection density of voxels, which improves the efficiency and accuracy of trunk recognition [19]. Bauwens et al. used a hand-held laser scanner consisting of a TLS scanner and additional sensors, which can be referred to as a kinematic TLS system [20]. Walking with it through the forest produced point clouds from which positions and other single-tree properties could be derived. Terrestrial LiDAR technology has been used for tree crown geometry reconstruction, biomass estimation, forest parameter inversion, single-tree factor extraction, and leaf area simulation. These studies show that TLS data offer high accuracy, good simulation performance, and strong estimation ability, with great advantages for improving the efficiency of forest inventories.
TLS can yield 3D information on standing trees accurately, quickly and nondestructively, which greatly facilitates forest sustainability research. Previous studies have shown that the DBH and TH extracted by TLS achieve high accuracy, but there are few reports on the extraction of upper stem diameters. Xi et al. reported that the treetops are heavily shaded by leaves and branches at heights above 10 m [21]. The loss of stem points in upper stems above 10 m was also reported by Henning et al. [16]. Yang et al. extracted the diameter up to a height of 6 m [22]. TLS measurement still has problems with regard to data accuracy and the verification of results in large-scale information acquisition [23]. In this study, to test the extraction accuracy along the entire stem, a taper equation was established using the extracted diameters. The aim was to provide a new data acquisition method for estimating growing stock and to enable TLS to play a greater role in forestry.
2. Materials and Methods
2.1. Data Source and Processing
In October 2019, nine larch fixed sample plots were selected in the Mengjiagang Forest Farm of Jiamusi city, Heilongjiang Province, covering different site qualities, densities, ages and thinning intensities. The DBH, height, CW and relative coordinates of each tree were measured. The trees in each plot were divided into five DBH grades according to the equal cross-sectional area method for selecting sample trees, and the average DBH of each grade was calculated. Because of the weather, stem analysis was conducted on only six of the nine plots. For each of these six plots, five sample trees of different sizes were chosen near the plot according to the average DBH of each grade, giving a total of 30 trees for stem analysis. For each stem analysis tree, the DBH, TH, CW, and height to crown base (HCB) were measured. Relative heights (RH) were calculated as percentages of the total tree height, and the diameter was measured at each of the following positions: 0 m, RH2, RH4, RH6, RH8, RH10, RH20, RH25, RH30, RH40, RH50, RH60, RH70, RH75, RH80 and RH90.
TLS data were collected with a Trimble TX8 scanner (narrow infrared laser beam) before stem analysis. Five stations were scanned for each plot, three stations were scanned for each stem analysis tree, and the scanning time at each station was 3 min. After data preprocessing, a total of 580 larch trees were obtained from the 9 plots, and 7887 diameters at different heights were extracted; a total of 451 diameters were extracted from the 30 stem analysis trees. Seventy-five percent of the sample trees were used for model parameterization, and the remainder (25%) were used to evaluate model performance. The parameters of the Trimble TX8 are listed in Table 1. A summary of the stem analysis data is provided in Table 2.
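The 75%/25% division can be illustrated with a short sketch. The text does not state how trees were assigned to each set, so the random assignment by tree ID, the fixed seed, and the function name `split_trees` are all assumptions made for illustration. Splitting by tree (rather than by individual diameter) keeps all measurements of one stem in the same set.

```python
import math
import random

def split_trees(tree_ids, train_fraction=0.75, seed=0):
    """Split unique tree IDs into a fitting set and an evaluation set.

    Assumption: random assignment at the tree level; the original
    assignment procedure is not described in the text.
    """
    ids = sorted(set(tree_ids))
    rng = random.Random(seed)
    rng.shuffle(ids)
    n_train = math.ceil(train_fraction * len(ids))
    return set(ids[:n_train]), set(ids[n_train:])

# 30 stem analysis trees, numbered 1..30 (hypothetical IDs)
fit_ids, eval_ids = split_trees(range(1, 31))
print(len(fit_ids), len(eval_ids))
```

With 30 trees this gives 23 trees for fitting and 7 for evaluation.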
2.2. Processing Point Cloud Data
The original scanned data were in TZF format. Trimble RealWorks 11.1 and LiDAR360 software were used for registration, denoising, subsampling at a minimum point spacing of 0.01 m, classification of ground points with an improved progressive TIN densification filtering algorithm, establishment of the digital elevation model, normalization by ground class, point cloud segmentation, tree identity matching and other processing. The diameters at different heights were obtained by least-squares circle fitting. To reduce the influence of the tree height on the extraction accuracy, the absolute height was replaced by the relative height. The positions of stem diameter extraction are shown in Figure 1. When extracting a diameter, the slice thickness was 10 cm. For example, to extract the diameter at 1.3 m, the points with heights of 1.25–1.35 m were selected for circle fitting.
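The slice-and-fit step can be sketched as follows. This is a minimal illustration, not the routine used in LiDAR360: it assumes a height-normalized point cloud stored as an N×3 NumPy array and uses the algebraic (Kåsa) least-squares circle fit, which is one of several least-squares formulations. The function names and the synthetic stem are hypothetical.

```python
import numpy as np

def slice_xy(points, height, thickness=0.10):
    """Return the x, y coordinates of points within a horizontal slice."""
    z = points[:, 2]
    keep = np.abs(z - height) <= thickness / 2.0
    return points[keep, :2]

def fit_circle(xy):
    """Algebraic least-squares circle fit.

    Solves x^2 + y^2 = 2*cx*x + 2*cy*y + c for (cx, cy, c),
    where c = r^2 - cx^2 - cy^2. Returns center and diameter.
    """
    x, y = xy[:, 0], xy[:, 1]
    design = np.column_stack([2.0 * x, 2.0 * y, np.ones_like(x)])
    target = x**2 + y**2
    (cx, cy, c), *_ = np.linalg.lstsq(design, target, rcond=None)
    radius = np.sqrt(c + cx**2 + cy**2)
    return cx, cy, 2.0 * radius

# Synthetic, noise-free stem: radius 0.15 m, centered at (2, 3)
rng = np.random.default_rng(0)
theta = rng.uniform(0.0, 2.0 * np.pi, 500)
z = rng.uniform(1.0, 1.6, 500)
points = np.column_stack([2.0 + 0.15 * np.cos(theta),
                          3.0 + 0.15 * np.sin(theta), z])

xy = slice_xy(points, height=1.3)      # points between 1.25 and 1.35 m
cx, cy, diameter = fit_circle(xy)
print(round(diameter, 3))              # diameter in meters
```

On real scans the slice also contains branch and noise points, so an outlier filter or a robust fit would usually be applied before this step.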
In this study, the stem analysis data at each relative height of the 30 felled trees were used for verification, which also provided a basis for constructing the taper equation. The stem analysis diameters of the 30 trees were taken as the reference values, and the corresponding TLS-derived diameters were taken as the predicted values. The extraction performance was evaluated by the coefficient of determination (R²), root mean square error (RMSE), absolute error (bias) and extraction accuracy (P%).
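A minimal sketch of three of these statistics is given below, using their most common definitions. Two points are assumptions: bias is taken here as the mean of (predicted − reference), and P% is left out because its formula is not given in this section. The sample values are invented for illustration.

```python
import numpy as np

def evaluate(reference, predicted):
    """Return R^2, RMSE and mean bias of predicted vs. reference values."""
    ref = np.asarray(reference, dtype=float)
    pred = np.asarray(predicted, dtype=float)
    error = pred - ref
    ss_res = np.sum(error**2)
    ss_tot = np.sum((ref - ref.mean())**2)
    r2 = 1.0 - ss_res / ss_tot
    rmse = np.sqrt(np.mean(error**2))
    bias = np.mean(error)          # assumption: signed mean error
    return r2, rmse, bias

# Hypothetical diameters in cm: stem analysis (reference) vs. TLS (predicted)
r2, rmse, bias = evaluate([10.0, 20.0, 30.0], [11.0, 19.0, 31.0])
print(r2, rmse, bias)
```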
2.3. Basic Taper Function
Referring to the latest edition of “Forest Mensuration” (Fourth Edition) [24] and related literature worldwide, six commonly used taper equations of different types were selected as candidate models in this study. The relative height values at the lower and upper inflection points of the segmented taper equation were set to RH9 and RH77, respectively [25]. Among the 30 stem analysis trees, 75% (23 trees) were used for model fitting and the remaining 25% (7 trees) for evaluation. The TLS data were split in the same way.
- (1) Simple taper equation:
- (2) Segmented taper equation:
Max and Burkhart (1976) [27]:
- (3) Variable-exponential taper equation:
Weisheng Zeng, Zhiyun Liao (1997) [28]:
Kozak (2002)-II [4]:
where
is the stem diameter at a height of
;
is the diameter at breast height;
is the total tree height;
is the measuring point height;
are the parameters of the model;
; and
.
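For reference, the segmented model of Max and Burkhart (1976) is commonly written in the form below. The symbols are chosen here for illustration and are assumptions: $d$ is the stem diameter at height $h$, $D$ is the DBH, $H$ is the total tree height, $z = h/H$ is the relative height, $b_1,\dots,b_4$ are the parameters, and $a_1$ and $a_2$ are the upper and lower join points. Under the setting described in this section, the join points would be fixed at $a_1 = 0.77$ and $a_2 = 0.09$.

```latex
\frac{d^{2}}{D^{2}} = b_{1}(z - 1) + b_{2}(z^{2} - 1)
                    + b_{3}(a_{1} - z)^{2} I_{1}
                    + b_{4}(a_{2} - z)^{2} I_{2},
\qquad
I_{i} =
\begin{cases}
1, & z \le a_{i} \\
0, & z > a_{i}
\end{cases}
\quad (i = 1, 2)
```

Each indicator switches on one quadratic correction below its join point, so the curve is made of three smoothly joined segments for the butt, the middle stem, and the top.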
2.4. Mixed Effects Models
Because the main purpose of our study was to provide the best taper model to predict the upper stem diameter for point cloud data, the fixed-effects modelling approach guided the model selection procedure. The stem diameter data have a hierarchical structure (several diameter measurements within a single sample tree) and great spatial variability. The taper equation established by the trees in the sample plots can better represent the trends in the stem curve of the forest stand and can also more accurately predict the upper diameter of the trunk and estimate the volume. Once the best model was selected, a nonlinear mixed model with the tree effect was fitted and compared with the fixed-effects model. The mixed model expression is as follows:
where
is the dependent variable of the
th observation of the
th tree, which refers to the diameter at the relative height in this paper;
is the number of samples;
is the number of observations of the
th tree;
is the relative height;
is a
dimensional vector of the fixed parameters;
is a
random effects parameter vector;
and
are the corresponding design matrices;
is a variance-covariance matrix for the error terms;
is an n-dimensional vector of the residuals; and
is the variance value.
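A nonlinear mixed-effects model matching these definitions is often written in the general form below. This is a sketch of the usual formulation rather than the exact notation of the expression referred to above; all symbols are assumptions: $y_{ij}$ is the $j$th observation of the $i$th tree, $x_{ij}$ the relative height, $\beta$ the fixed parameters, $u_i$ the random effects of tree $i$, $A_i$ and $B_i$ the design matrices, and $\varepsilon_{ij}$ the residual.

```latex
y_{ij} = f(\phi_{i}, x_{ij}) + \varepsilon_{ij},
\qquad i = 1, \dots, M, \quad j = 1, \dots, n_{i}

\phi_{i} = A_{i}\beta + B_{i}u_{i},
\qquad u_{i} \sim N(0, D),
\qquad \varepsilon_{i} \sim N(0, R_{i})
```

In this form, the tree-specific parameter vector $\phi_i$ equals the population value $A_i\beta$ plus a tree-level deviation $B_i u_i$, which is how the tree effect enters the taper function $f$.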
The following three steps are used to construct the mixed effect model [29]:
- (1) Determine random parameters. Fit the taper models with different combinations of random parameters and compare the fit statistics. Too many random parameters may lead to overparameterization or nonconvergence of the model. Therefore, this study only fits combinations of 1–2 random parameters and compares the Akaike information criterion (AIC), Bayesian information criterion (BIC) and log-likelihood values of the fitted models. Models with smaller AIC and BIC values generally have a larger log-likelihood and a better fit.
- (2) Determine the within-tree variance-covariance structure (R). The data in this study have a hierarchical structure, with several diameter measurements within each tree; therefore, we needed to address the correlation and heteroscedasticity of the within-tree errors. To account for autocorrelation, this study used the first-order autoregressive structure AR(1). The structure was calculated following Davidian (1995), an approach that is common in forestry research [30]:
where
refers to the error variance value of the model;
is the time series correlation structure, i.e., the error correlation structure within the tree; and
is the diagonal matrix of the variance heterogeneity.
- (3) Determine the structure of the inter-tree variance-covariance (D). The inter-tree variance-covariance structure reflects the variation among groups. Taking the general positive-definite matrix commonly used in forestry as an example, the variance-covariance structure for 2 random parameters is:

$$D = \begin{bmatrix} \sigma_{u}^{2} & \sigma_{uv} \\ \sigma_{uv} & \sigma_{v}^{2} \end{bmatrix}$$

where $\sigma_{u}^{2}$ is the variance of the random parameter $u$, $\sigma_{v}^{2}$ is the variance of the random parameter $v$, and $\sigma_{uv}$ is the covariance of random parameters $u$ and $v$.
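The quantities used in steps (1) and (2) can be sketched in a few lines. The AIC and BIC formulas are the standard ones. For the within-tree structure, the sketch assumes the form $R_i = \sigma^2 G_i^{0.5} \Gamma_i G_i^{0.5}$ that is widely used with the Davidian approach, with $\Gamma_i$ an AR(1) correlation matrix and $G_i$ a diagonal matrix of variance weights; the weights, function names, and numeric values below are hypothetical.

```python
import math
import numpy as np

def aic(loglik, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2.0 * n_params - 2.0 * loglik

def bic(loglik, n_params, n_obs):
    """Bayesian information criterion: k ln n - 2 ln L."""
    return n_params * math.log(n_obs) - 2.0 * loglik

def ar1_corr(n, rho):
    """AR(1) correlation matrix with entries rho^|j - k|."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])

def within_tree_cov(sigma2, rho, weights):
    """Assumed form R_i = sigma2 * G^0.5 @ Gamma @ G^0.5.

    weights: diagonal of G (variance heterogeneity), one per observation.
    """
    g_half = np.diag(np.sqrt(np.asarray(weights, dtype=float)))
    return sigma2 * g_half @ ar1_corr(len(weights), rho) @ g_half

# Hypothetical fit: log-likelihood -100 with 5 parameters and 100 observations
print(aic(-100.0, 5), bic(-100.0, 5, 100))

# Hypothetical tree with three observations
R = within_tree_cov(sigma2=0.25, rho=0.5, weights=[1.0, 4.0, 9.0])
print(R)
```

When candidate random-parameter combinations are compared, the model with the smaller AIC and BIC would be preferred, provided it converges.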
2.5. Evaluation and Test Models
This study uses independent data that were not involved in modeling for testing. Testing the fixed-effects parameters of the mixed-effects model is equivalent to the traditional regression analysis test, whereas the random-effects component requires secondary sampling to calculate the random parameter values. Five samples were randomly selected from each tree, and the method of Vonesh was used to calculate the random parameter values [31].
where
is the variance-covariance matrix of the random effect parameters,
is the variance-covariance structure in the sample tree,
is the design matrix, and
is the actual value minus the predicted value calculated with fixed effect parameters.
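The calculation can be sketched with the estimator that is commonly used for this purpose in the mixed-model literature, $\hat{u}_i = \hat{D} Z_i^{T} (Z_i \hat{D} Z_i^{T} + \hat{R}_i)^{-1} \hat{e}_i$. Treating this as the form referred to above is an assumption, as are the symbol names; the numbers are invented and use a single random parameter with five observations per tree.

```python
import numpy as np

def predict_random_effects(D, R, Z, resid):
    """Estimate tree-level random effects from a few calibration observations.

    D: variance-covariance matrix of the random effects (q x q)
    R: within-tree variance-covariance matrix (n x n)
    Z: design matrix of the random effects (n x q)
    resid: observed minus fixed-effects prediction (n,)
    """
    V = Z @ D @ Z.T + R                       # marginal covariance of resid
    return D @ Z.T @ np.linalg.solve(V, resid)

# Hypothetical tree: one random parameter, five sampled diameters
D = np.array([[4.0]])
R = np.eye(5)
Z = np.ones((5, 1))
resid = np.ones(5)                            # each observation is 1 cm above the fixed-effects curve

u_hat = predict_random_effects(D, R, Z, resid)
print(u_hat)
```

Here the estimate is 20/21, slightly less than the mean residual of 1, because the estimator shrinks the tree effect toward zero in proportion to the residual variance.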
2.6. Application of the Model
2.6.1. Prediction of the Stem Diameter
The taper function can be used to estimate the diameter at any height on the tree. The mixed model was established using 580 trees from 9 plots; it is therefore highly representative and simulates the stem curve well. In this study, it was used to derive the upper stem diameters that cannot be obtained by TLS. Predicting the upper diameters with the mixed model not only solves the problem of missing point cloud data but also allows the volume to be estimated accurately.
2.6.2. Estimated Volume
The ultimate purpose of constructing the taper equation is to calculate the volume. This study adopted three methods to calculate the volume of the 30 trees. In Method 1, the optimal taper model with tree effects was integrated numerically to compute the volume. In Method 2, the extracted diameters at different relative heights were used directly to calculate the volume with the mid-area sectional measurement method; upper stem diameters that could not be extracted were replaced by the values predicted by the taper equation. In Method 3, the values measured at each meter in the stem analysis were used to calculate the volume. The measured volume (Method 3) was taken as the reference value and compared with the volumes from the other two methods to determine their prediction accuracies.
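The two computational ideas behind these methods, numerical integration of a taper function and summation of sections by their mid-point area, can be sketched as follows. The conical taper function, the section length of 1 m, and the function names are hypothetical stand-ins; the fitted mixed model and the relative-height sections of the study are not reproduced here.

```python
import numpy as np

def cone_taper(h, base_diameter_cm=30.0, total_height=20.0):
    """Hypothetical taper: diameter (cm) decreasing linearly to zero at the tip."""
    return base_diameter_cm * (1.0 - np.asarray(h, dtype=float) / total_height)

def volume_by_integration(taper, total_height, n_points=2001):
    """Integrate the cross-sectional area along the stem (trapezoid rule), in m^3."""
    h = np.linspace(0.0, total_height, n_points)
    area = np.pi / 4.0 * (taper(h) / 100.0) ** 2      # cm -> m
    return float(np.sum((area[1:] + area[:-1]) / 2.0 * np.diff(h)))

def volume_by_mid_sections(taper, total_height, section_length=1.0):
    """Sum section volumes as mid-point area times section length, in m^3."""
    edges = np.append(np.arange(0.0, total_height, section_length), total_height)
    mids = (edges[:-1] + edges[1:]) / 2.0
    mid_area = np.pi / 4.0 * (taper(mids) / 100.0) ** 2
    return float(np.sum(mid_area * np.diff(edges)))

H = 20.0
v_int = volume_by_integration(cone_taper, H)
v_mid = volume_by_mid_sections(cone_taper, H)
v_exact = np.pi / 4.0 * 0.30**2 * H / 3.0             # analytical cone volume
print(v_int, v_mid, v_exact)
```

For this cone the integrated volume matches the analytical value closely, while the sectional sum is slightly smaller because the mid-point area underestimates each tapering section.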
4. Discussion
Conventionally, the data for constructing a taper equation are obtained by felling trees, which is time-consuming and labor-intensive. This research provides a new nondestructive TLS scanning method, which provides not only field data for verification but also large-scale plot data for modeling. Compared with Sun's study (198 sample trees in 8 plots, 16 of which were subjected to stem analysis) [32], this study involved more data (a total of 580 trees in 9 plots, with another 30 trees subjected to stem analysis) and therefore better represents the actual state of the stand. Most studies on diameter acquisition by TLS are limited to the DBH, while there are few studies on the upper stem. Xi et al. [21] reported that occlusion by the crown introduces biases above 12 m in stem form extraction. In this study, diameters were extracted at relative heights, which reduced the influence of the tree height, and the diameters that could not be extracted were derived through the taper model.
According to the diameter extraction at different relative heights in the stem analysis, the accuracy over the entire tree decreases with height. Notably, the accuracy at RH75 is lower than that at RH80. During TLS scanning, wind-induced movement of the tree or station registration errors cause angular offsets or ghost points in the upper part of the stem between stations. Such registration errors between the point clouds of different stations are shown in Figure 2a; in this study, they occurred at heights from RH75 to RH80. The visual error, shown in blue in Figure 2b, may lead to overestimation or underestimation of the diameter. At RH75, the point clouds of the tree are mixed together, so the stem is not easy to interpret. At RH80, the stem point clouds are separated, which makes the stem easy to interpret with little visual error. At 0.9H, the point cloud density decreases, the stem diameter is difficult to interpret, and the extraction error increases. Beam divergence of the scanner is the key determinant of the increasing error with height. To mitigate this problem, we scanned the trees from multiple perspectives. Xi et al. [21] reported plots scanned with a Leica HDS6100 by the Finnish Geodetic Institute (FGI), and the results showed that the DBH extraction accuracy was 0.97 (r²) and 0.90 cm (RMSE). In our study, the DBH extraction accuracy was 0.9958 (R²) and 0.3726 cm (RMSE). Li Dan et al. [33] used a Riegl VZ-400 to scan 5 plots, and the average RMSEs of the automatically extracted DBH and TH were 2.53 cm and 3.80 m, respectively (a lower accuracy than ours). In addition, the stand density, crown density, tree height, branch growth condition, understory vegetation, and the number and distance of stations all affect the accuracy.
Internationally, TLS-based forest measurement processes such as tree detection and the estimation of stem position, DBH, TH, wood volume and biomass have reached a high degree of automation [34]. The DBH and TH can be obtained automatically by algorithms and compared with traditional measurements; for example, in the study by Cabo C, 85% of the DBH deviations were lower than 1 cm and 92% of the TH differences were less than 0.5 m [35]. This study used manual extraction, and all diameter errors, except at 0 m, were less than 1 cm. With regard to data quality, the accuracy in this study was higher, but the processing time was longer and the workload was greater. Regardless of the extraction method, errors remain where the lower portions of the trees are obscured by understory vegetation or by other trees and branches, and where scanning points are lacking at the treetops.
According to the fitting analysis of the three types of taper equations, the variable-exponential taper equation has the highest accuracy, and the segmented taper equation also has a high accuracy, which is consistent with previous studies. The results of fitting the basic taper equations indicate that RH70 is the maximum height at which TLS should be used to obtain diameter data. In our view, diameters up to this position have a higher accuracy, and limiting extraction to this height reduces the time spent processing redundant data.
Relevant studies have shown that mixed models with random effects at the tree level have the better predictive ability [36]; therefore, this study only considered the tree effect and not the plot effect. Stem taper is sensitive to tree crown characteristics. Whether the basic model is a segmented taper equation [12] or a variable-exponential taper equation, the mixed model with the tree effect can better handle the hierarchical structure of the data. Another innovation of this study is that the diameters and TH used to establish the mixed taper model were obtained entirely by TLS, without cutting down trees. The RD1000 has also been adopted for nondestructive tree measurement; it showed the largest abrupt change at relative heights of 0.64–0.8, similar to the results of our previous study (RH75) [37]. Sun et al. [32] used a modified Schumacher equation to fit a taper equation for Populus L. from TLS data, which yielded a fitting R² = 0.96; this is similar to that of the basic model in our study but lower than that of the mixed model (R² = 0.98). Other studies have recommended the cross-sectional area rather than the diameter as the variable for calculating volume, because it exhibits better performance [38]. The prediction quality of the taper and volume model can also be calibrated with an additional diameter measured between 40% and 60% of the total tree height [36,39,40]. When the calibration diameter is measured at a height of 7 m, the average deviation of the merchantable volume is only 0.63% [41].
5. Conclusions
In this study, TLS was used to obtain forest parameters, a small amount of field data was used to verify its accuracy, and a mixed model was established to improve the predictive performance of the taper equation. The results show that nondestructive TLS measurements are suitable for deriving stem diameters and trunk volumes and are feasible for forest inventories.
The optimal upper limit for diameter extraction is RH70; the accuracy up to this height meets the requirements of forestry surveys, and limiting extraction to this height removes much of the point cloud processing work. If the extraction limit is too high, the diameter accuracy in the upper tree will be low and a significant amount of redundant data will be produced. Conversely, if the extraction limit is too low, too few data will be acquired, which will affect the stem curve. In addition, the amount and accuracy of the data affect the fitting of the taper model, which in turn affects the simulation of the stem curve and the volume estimation.
The taper equation fitted to both datasets showed a good fit and high accuracy (model (6): R² > 0.97, P > 98.94%), so this optimal basic taper equation was selected to establish the mixed model with the tree effect. The accuracy of the mixed model with the tree effect reached 99.72%. The mixed model can accurately predict the diameter of the upper stem, which resolves the problem of occlusion by the crown. The accuracy of the diameters derived from the mixed model reached 99%, and the bias was <1.5 cm; the model also estimated the stem volume with high accuracy.
The taper equation established from TLS data is representative and practical. In future field measurements, this method can be used to conduct high-accuracy forest surveys without felling trees.