Forest Management Type Identification Based on Stacking Ensemble Learning

Liu, Jiang; Chen, Jingmin; Chen, Shaozhi; Wu, Keyi

doi:10.3390/f15050887


¹ Research Institute of Forestry Policy and Information, Chinese Academy of Forestry, Beijing 100091, China

² Liaoning Zhanggutai National Nature Reserve Management Center, Fuxin 123100, China

³ Chinese Academy of Forestry, Beijing 100091, China

* Author to whom correspondence should be addressed.

Forests 2024, 15(5), 887; https://doi.org/10.3390/f15050887

Submission received: 28 April 2024 / Revised: 15 May 2024 / Accepted: 17 May 2024 / Published: 20 May 2024

(This article belongs to the Special Issue Economy and Sustainability of Forest Natural Resources)


Abstract:

Forest management is the fundamental approach to continuously improve forest quality and achieve the quadruple functions of forests. The identification of forest management types is the basis of forest management and a key technical link in the formulation of forest management plans. However, due to insufficient application of forestry informatization and digitization, there are problems in the organization and application of management types, such as inaccurate identification, diversified standards, long organizational cycles, and low decision-making efficiency. Typical technical models are difficult to widely promote and apply. To address these challenges, this study proposes the Stacking Ensemble Forest Management Type Identification (SEFMTI) method based on Stacking ensemble learning. Initially, four typical forest management types from the sustainable forest management pilot of the Yichun Forestry Group were selected as research subjects, and 19 stand parameters were chosen to form the research data, training various recognition models. Subsequently, the Least Absolute Shrinkage and Selection Operator (LASSO) regression and random forest (RF) methods were used to analyze key decision-making indicators for forest management type recognition and compare the performance of different models. The results show that (1) the SEFMTI model achieved an accuracy rate of 97.14%, effectively improving the accuracy of forest management type recognition while ensuring stability; (2) average age (AG), age group (AGG), crown density (CD), and stand origin (SO) are key decision-making indicators for recognizing forest management types; and (3) after feature selection, the SEFMTI model significantly enhanced the efficiency of model training while maintaining a high accuracy rate. 
The results validate the feasibility of the SEFMTI identification method, providing a basis for the gradual implementation of sustainable forest management pilots and aiding in the precise improvement of forest quality.

Keywords:

sustainable forest management; forestry informatization and digitization; Yichun Forestry Group; key decision-making indicator; stacking; feature selection; identification

1. Introduction

Scientific forest management is the fundamental approach to continuously improving forest quality and achieving the multifunctionality of forests [1]. In China, forest management plans serve as action guides for forest management units to implement scientific forest management. They are also the basis for supervising forest management activities by competent authorities and institutional documents to safeguard the rights and interests of stakeholders [2]. The classification and identification of forest management types are the foundation of forest management and a key technical step in the formulation of forest management plans [3]. Therefore, research on forest management type identification not only plays an important role in improving the speed and accuracy of forest management plan formulation but also serves as a crucial step in promoting the sustainable utilization of forest resources.

Accurate identification of forest management types can promote the scientific and standardized management of forests. The selection of forest management methods is considered essential for short-term, medium-term, and long-term forestry decisions and in formulating forest policies that support regional or national interests [4]. With the increasing refinement of the entire forest management process and the growing demand for differentiated management, identifying forest management types has become an indispensable part of forest management. Precise identification of forest management types can help to formulate forest management strategies that are more responsive to actual needs and to better design and implement targeted management measures. On the one hand, it can effectively promote ecosystem restoration and increase carbon sequestration capacity [5,6]. On the other hand, large-scale state-owned forest areas and forest management units can significantly improve forestry production efficiency and enhance ecological function maintenance [7,8]. However, traditional top-down forest management and forest management type organizations are too macroscopic to explain local complexity and the resulting uncertainty [9]. The division of management types is mainly based on two-category stand survey data, and its usability and accuracy are limited by stand data [10]. Additionally, management type division reflects forest dominant functions rather than management characteristics [11,12]. Furthermore, inadequate informatization and digital application have led to problems such as inaccurate identification of forest management types, diversification of standards, long organization cycles, slow information feedback, and inefficient decision-making during the organization of forest management types in forest management plans [10]. 
The typical forest management theories and technical models are also limited by geographical environment, climatic conditions, and forest stand characteristics, and it is also difficult to realize large-scale promotion [11], which seriously affects the precise improvement of forest quality.

Inspired by the trend towards intelligent and digitized artificial intelligence technology, researchers have begun to explore the application of machine learning methods in forest management. This mainly involves forest management [13,14,15], forest fire management and prevention [16], forest growth and harvest prediction [17], forest management optimization decision-making [18,19], forest quality assessment [20], forest management monitoring and evaluation [21,22], forest management visualization simulation [23,24], and tree species classification [25], providing technical support for the formulation of scientifically reasonable forest sustainable management plans. Ensemble learning, as one of the machine learning methods, is a very effective strategy for solving complex machine learning problems. It has significant advantages in improving model accuracy, robustness, and generalization ability, as well as reducing model overfitting and handling imbalanced data [26]. Currently, ensemble learning algorithms have been widely used in forestry scenarios, but there is limited research on applying machine learning and ensemble learning methods to forest management type identification and forest management plan formulation. Tajik et al. [27] utilized ensemble learning models to predict spatial variations in soil organic carbon (SOC) concentrations in deciduous forests in northern Iran. Barreras et al. [28] employed ensemble learning models for each forest type in Mexico to comprehensively predict tree height and stand density across a 1 km grid. Wen et al. [29] applied three widely used algorithms—Bagging, Boosting, and Stacking—to map wetland distributions in the Manning River estuary on the Australian coast, addressing key issues in periodic wetland mapping. Chen et al. [30] proposed a tree crown profile modeling and prediction technique based on ensemble learning and validated its effectiveness. 
Hence, constructing a forest management type identification model using ensemble learning methods to predict forest management types on a large scale is a feasible innovative attempt.

In summary, the aim of this study is to overcome the bottleneck of inaccurate identification of forest management types and the difficulty of large-scale popularization and application of typical technology models, as well as to accurately improve the quality of forests. This study proposes to use data based on a field forest survey and the judgment of professional technicians in the 2023 forest sustainable management pilot project of Yichun Forestry Group to build a forest management type identification model using an ensemble learning method. Based on this foundation, the authors further utilized random forest and LASSO regression methods to evaluate the importance of each indicator, analyze the key indicators of forest management type identification and compare the efficiency after feature selection. The results of this study not only enrich the existing forest management type identification system but also provide a theoretical reference and practical basis for improving the information feedback and decision-making efficiency of forest management type organization, enhancing the accuracy and level of forest management plan formulation and continuously promoting sustainable forest management.

2. Organizational Foundation for Forest Management Types

Forest management type (FMT), also known as operational level, refers to a grouping of forest stands that have similar or identical management objectives, management cycles, management levels, site qualities, and technical characteristics and that are managed with relatively consistent methods and measures [10]. Internationally, a similar concept is known as Forest Development Types (FDTs), which describe the long-term development goals of forests to achieve specific functions at specific locations (climate and soil conditions). Serving as a guide for future forest management activities, Forest Development Types help forest managers steer actual stands in the desired direction [9]. In China, management types are spatial management units composed of many stands that are not necessarily contiguous but share the same management direction, objectives, and measures [7]. The design of management types mainly involves three stages. First, the management objectives of forest stands are determined from site conditions (with priority given to high-quality sites), tree species composition, tree species value, tree growth and development stages, and individual differences among trees, in combination with socio-economic conditions. Second, the management technology system is determined according to the management objectives. Finally, the management measures needed for the current forest stands are determined [31].

In recent years, sustainable forest management (SFM) has received support and encouragement from a growing number of countries as a guiding principle for forest management [32]. The theoretical framework of forest management is also maturing, and a governance model built on the systematic governance of mountains, rivers, forests, farmlands, lakes, grasslands, and deserts and on the integrated protection and restoration of ecosystems has been explored in practice [33]. This has gradually evolved into a modern forest management system that encompasses close-to-nature forest management [34,35], full-cycle forest management [36,37], multi-objective forest management [38,39,40], systematic forest management, and sustainable forest management. Meanwhile, starting from the demand for improving forest quality and the actual stand conditions of management units, various management types have been proposed, including target tree management [41,42,43,44,45], whole-stand management based on target trees [46], homogeneous forest management [47], low-quality and low-efficiency forest conversion management [48], natural secondary forest management [49], degraded forest restoration management [50], landscape forest management [51], and so on. Based on this foundation, our team has summarized the relevant definitions, applicable conditions, and suitable stands of various management types through extensive practice and proposed management objectives for each type around national ecological security, reserve forest construction, “dual carbon” goals, and other requirements. This has ultimately led to the development of four technical systems, as shown in Table 1: target tree management (TTM), aimed at cultivating large-diameter timber; homogeneous forest management (HFM), focused on the cultivation of medium- and small-diameter timber; rehabilitation of low-quality and inefficient forests (RLQIFs); and conversion of degraded forests (CDFs).

3. Materials and Methods

3.1. Study Area

This paper focuses on the identification of forest management types in the forestry sector under the jurisdiction of the Yichun Forestry Group in Heilongjiang Province. Yichun Forestry Group is located in the Xiaoxing’an Mountains forest area (127°37′~130°46′ E, 46°24′~49°24′ N). The area has a north temperate continental monsoon climate with low temperatures and is the cold and rainy part of the Xiaoxing’an Mountains. The average annual temperature is 1~1.2 °C from north to south, the average annual cumulative temperature is between 1700 °C and 2200 °C, and the frost-free period is short, at 87~120 days from north to south. Precipitation is relatively abundant, with an annual average of 750~820 mm. The landform is commonly described as “eight mountains, half water, half grass, a field”: the terrain is high in the northwest and low in the southeast, steeper in the south, moderately steep in the central part, and flatter in the north. The average altitude is 600 m, and the highest altitude is 1429 m. The soil is dominated by dark-brown forest soil, meadow soil, and swamp soil, with a high humus content of about 6%~8%, a pH value of about 5~5.5, and a soil layer that is generally up to 70 cm thick. Yichun has four distinct seasons: April and May are spring, which is short, with rapid rises and falls in temperature, a large temperature difference between day and night, and variable cold and warm spells; June, July, and August are summer, which is hot, humid, and rainy; September and October are autumn, which has climatic characteristics similar to those of spring; and January, February, March, November, and December are winter, with heavy snowfall and severe cold. The research regions are shown in Figure 1.

Yichun Forestry Group has a total area of 3,512,400 ha under its jurisdiction, of which 2,659,000 ha is concentrated in the territory of Yichun City, and the rest is distributed in the cities of Hegang, Heihe, Suihua, and Harbin, with a forest coverage rate of 87.6 percent. There are 17 Forestry Bureau companies and 195 forestry branches under its jurisdiction. Natural forests account for 91.32% of the forest resources, while plantation forests account for 8.68%; young and middle-aged forests account for more than 80% of the forest resources. The forest vegetation mainly consists of Pinus koraiensis Siebold, Larix gmelinii Kuzen., Picea asperata Mast., Abies fabri Craib, Populus davidiana Dode., Betula platyphylla Sukaczev, Juglans mandshurica Maxim., Fraxinus mandshurica Rupr., etc.

3.2. Experimental Data

In forest management, accurate data acquisition is crucial to developing scientific and reasonable management strategies. The data acquisition for this study mainly relies on manual collection, including sample plot establishment, tree measurement, and data recording and organization. The sample plot deployment was based on forest sub-compartment data and distribution maps and followed the standards of the “National Forest Sustainable Management Pilot Monitoring Sample Plot Survey Technical Guidelines”. First, according to the management tasks assigned by higher authorities, sub-compartments suitable for sustainable forest management are selected within the management units. Second, following the principle of contiguous blocks, dispersed plots are excluded to ensure the continuity and representativeness of the sample plots. Third, under the premise of prioritizing plots with good site conditions and convenient operational management, operational and control sample plots are established using typical sampling methods, with fixed corner posts set at the four corners of each sample plot for subsequent monitoring and survey work. Each monitoring sample plot covers an area of 0.067 hectares, and each sub-compartment sample plot is 3% of the total area of the sub-compartment. Within each sample plot, key parameters of each tree, including diameter at breast height (DBH), tree height, species, and health status, are measured and recorded. These data are collected by trained technicians using height meters, measuring tapes, and calipers, with a monitoring frequency of once every 2–3 years. All collected data are then organized and recorded in paper or electronic forms to ensure data completeness and accuracy.

The research data in this paper were obtained through forest surveys in Yichun from September 2022 to May 2023 by the Forest Multi-Objective Management Team of the Institute of Forestry Science and Technology Information, Chinese Academy of Forestry, with the cooperation and support of the Forestry Bureaus of Yichun Forestry Group, Heilongjiang Province. The forest management types were determined by forestry experts after comprehensive judgment through field visits and sample plot remeasurement, and the data sources are authentic and reliable. The forest stand parameters obtained include 19 indicators: slope, aspect, slope position, soil type, soil thickness, soil moisture, gravel content, soil pH, site class, stand origin, age group, age class, average age, stand mean height, crown density, species composition, mean diameter at breast height, number of trees per hectare, and volume per hectare. The distribution of forest sustainable management pilots is shown in Figure 1, the area and number of sub-compartments managed by the 10 Forestry Bureaus are shown in Table 2, and the number, area, and percentage of sub-compartments of the four management types are shown in Table 3.

As shown in Table 3, the percentages of different management types of target tree management, homogeneous forest management, rehabilitation of low-quality and inefficient forests, and conversion of degraded forests in the data obtained in this paper are 57.79%, 34.06%, 4.39%, and 3.76%, respectively. Among them, TTM has the greatest number and area, with 738 sub-compartments and an area of 7578.53 ha, HFM is the second largest, and RLQIF and CDF have relatively small numbers and areas.

3.3. Experimental Environment

The experiments in this study were run on the Windows 10 64-bit Education operating system with the following hardware: CPU: Intel Core [email protected] GHz; GPU: Intel Iris Xe Graphics; RAM: 16 GB; storage: 512 GB SSD. The main software platforms used include ArcMap 10.8, Anaconda 3, PyCharm 2021.1.1, and others. The programming language used is Python, and the libraries used are NumPy, Matplotlib, Pandas, and Scikit-learn.

3.4. Methods

3.4.1. Ensemble Learning

Ensemble learning (EL) is a machine learning approach that trains multiple base learners and combines their predictions to achieve higher performance and better generalization than a single base learner [52,53]. The concept was first introduced in 1979 by Dasarathy and Sheela [54]. In 1990, Hansen and Salamon showed that the generalization error of a neural network can be reduced by using an ensemble of neural networks. In the same year, Schapire showed that several weak learners can be combined into a strong learner in the probably approximately correct (PAC) sense. Since then, ensemble learning has been widely used in many fields and has repeatedly been shown to improve on the performance of a single model. There are three main ensemble learning strategies: the Bagging algorithm [55], the Boosting algorithm [56], and the Stacking algorithm [57].

3.4.2. Decision Tree

A decision tree (DT) is a tree-structured model that consists of nodes and branches [58]. Each node represents a feature of the sample to be classified, and each branch leaving the node corresponds to a subset of the values that feature can take. Decision tree analysis has been used in many fields due to its simplicity, ease of understanding, ability to be presented through visualization, and accuracy for a wide range of data forms [59]. Decision tree algorithms come in several variants, such as Iterative Dichotomiser 3 (ID3), its successor C4.5, Classification and Regression Tree (CART), and Multivariate Adaptive Regression Splines (MARS). Among them, the ID3 and C4.5 algorithms, which produce multiway trees, can only handle classification, not regression. The CART algorithm can be used for both classification and regression tasks. The CART classification tree algorithm commonly uses the Gini coefficient to select features. The Gini coefficient represents the impurity of the data at a node: a smaller Gini coefficient means lower impurity and therefore a better feature [60]. For a dataset $D$, the impurity can be measured by the Gini coefficient:

$$\mathrm{Gini}(D) = \sum_{i=1}^{n} p(x_i)\,\bigl(1 - p(x_i)\bigr) = 1 - \sum_{i=1}^{n} p(x_i)^2 \tag{1}$$

where $p(x_i)$ is the probability that class $x_i$ occurs and $n$ is the number of classes. $\mathrm{Gini}(D)$ reflects the probability that two samples randomly selected from dataset $D$ do not have the same class label. Therefore, a smaller $\mathrm{Gini}(D)$ indicates that the impurity of dataset $D$ is lower, that is, its purity is higher.
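As a minimal sketch of Equation (1), the function below computes the Gini coefficient of a list of class labels, and a second helper computes the size-weighted Gini of a binary split, which is the quantity a CART classification tree tries to make small. The function names and the toy label list are assumptions made for illustration; they are not taken from the study's code or data.

```python
from collections import Counter


def gini(labels):
    """Gini coefficient of Eq. (1): 1 - sum_i p(x_i)^2 over the class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0  # convention chosen here for an empty node
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_gini(left, right):
    """Size-weighted Gini of a binary split into a left and a right child node."""
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


# Hypothetical node: labels are management types, counts are invented.
node = ["TTM"] * 6 + ["HFM"] * 3 + ["RLQIF"] * 1

print("Gini of the node:", round(gini(node), 4))
print("Gini after splitting off the six TTM samples:",
      round(split_gini(node[:6], node[6:]), 4))
```

A node that contains a single class gives a Gini coefficient of 0, so a split that isolates one class lowers the weighted value, which is why lower values mark better candidate features.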

3.4.3. Support Vector Machine

Support Vector Machine (SVM) is a powerful machine learning algorithm proposed by Vapnik and Cortes in 1995 [61] and is widely applied to classification, regression, and other prediction tasks. The SVM is very effective with high-dimensional, heterogeneous, and almost unlabeled datasets. It can also be successfully customized for specific applications [62]. It largely overcomes issues such as the “curse of dimensionality” and “overfitting” [63]. The mechanism of the SVM involves finding an optimal classification hyperplane that meets the classification requirements [63]. This hyperplane maximizes the margin, or the blank space on either side, while ensuring classification accuracy. Theoretically, the SVM can achieve optimal classification for linearly separable data. However, in many real-world applications, the data are not linearly separable. To address this, SVMs introduce kernel functions, which allow the algorithm to operate effectively in a high-dimensional space without explicitly mapping the data points to this space. The kernel function represents the inner product of two data points after they have been mapped to the feature space. Commonly used kernel functions include the polynomial kernel and the radial basis function (RBF) kernel. This paper employs the RBF kernel to build the classification model.
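To make the kernel idea concrete, the sketch below evaluates the RBF kernel in its usual form, K(x, z) = exp(-gamma * ||x - z||^2), for two small vectors. The value of `gamma` and the example vectors are assumptions chosen for illustration; the paper does not state them at this point.

```python
import math


def rbf_kernel(x, z, gamma=0.5):
    """RBF kernel K(x, z) = exp(-gamma * ||x - z||^2); gamma is an assumed width."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


# Two hypothetical, already standardized stand-parameter vectors.
a = [0.2, -1.0, 0.5]
b = [0.1, -0.8, 1.5]

print("K(a, a) =", rbf_kernel(a, a))            # identical points give the maximum, 1
print("K(a, b) =", round(rbf_kernel(a, b), 4))  # similarity decays with distance
```

The kernel value acts as a similarity score between 0 and 1, so the classifier can separate classes with a nonlinear boundary while only ever computing pairwise kernel values.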

3.4.4. Random Forest

Random forest (RF) is a machine learning algorithm based on decision trees, proposed by Breiman in 2001 [64]. It enhances the overall accuracy and stability of the model by combining the predictions of multiple decision trees. Because it handles multicollinearity, trains quickly, is relatively insensitive to overfitting, and processes high-dimensional data effectively [65], it is widely used in forestry. A random forest consists of multiple decision trees, each of which is built independently. When the response variable is numerical, the generated random forest model is a multivariate nonlinear regression model, and the model’s prediction is the average of the predictions from the individual decision trees [66]. In classification tasks, random forests typically use voting to determine the final class label. Bootstrap aggregating and random feature selection are its two core principles. The bootstrap resampling method is used to draw multiple samples from the original data; a decision tree is fitted to each bootstrap sample, and the predictions of the trees are then combined. Random feature selection means that, when splitting a node, the random forest does not consider all possible features but randomly selects a subset of features and chooses the best feature from this subset. This not only reduces the correlation between trees, enhancing the model’s diversity, but also improves the overall model’s accuracy and robustness. Another advantage of random forests is the ease of measuring the relative importance of each feature in prediction. Therefore, this paper employs random forests to construct the primary classification model and to measure the importance of forest stand parameters.
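The two ideas above, bootstrap aggregating and random feature subsets, and the resulting feature importance ranking can be sketched with Scikit-learn as follows. The data are synthetic (generated with `make_classification`, 19 features and 4 classes to mirror the shape of the problem), and the hyperparameter values are assumptions, not the tuned values of Table 4.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the 19 stand parameters and 4 management types.
X, y = make_classification(
    n_samples=500, n_features=19, n_informative=6, n_redundant=2,
    n_classes=4, random_state=0,
)

rf = RandomForestClassifier(
    n_estimators=200,     # number of trees (assumed value)
    max_features="sqrt",  # random feature subset considered at each split
    bootstrap=True,       # each tree is fitted to a bootstrap sample
    random_state=0,
)
rf.fit(X, y)

# Impurity-based importances; they are non-negative and sum to 1.
importances = rf.feature_importances_
ranking = importances.argsort()[::-1]
for idx in ranking[:5]:
    print(f"feature_{idx}: {importances[idx]:.3f}")
```

With real data, the column indices would be replaced by the stand parameter names, so the ranking can be read directly as a list of candidate key indicators.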

3.4.5. Least Absolute Shrinkage and Selection Operator Regression

Least Absolute Shrinkage and Selection Operator (LASSO) regression is a shrinkage and variable selection method for regression models [67]. This method aims to identify the variables and their corresponding regression coefficients to build a model that minimizes prediction error. In recent years, due to its capability to handle high-dimensional datasets, this method has attracted attention from researchers in various fields, especially in statistics and machine learning [68]. The core principle is to add an L1 regularization term to the traditional linear regression loss function, which shrinks the regression coefficients toward zero and sets some of them exactly to zero. Variables whose coefficients are reduced to zero are excluded from the model, thereby reducing the complexity of the model. The mathematical expression for LASSO regression is as follows:

$$\min_{\beta} \left\{ \sum_{i=1}^{n} \Bigl( y_i - \beta_0 - \sum_{j=1}^{p} \beta_j x_{ij} \Bigr)^2 + \lambda \sum_{j=1}^{p} |\beta_j| \right\} \tag{2}$$

where $y_i$ is the response variable, $x_{ij}$ is the explanatory variable, $\beta_j$ is the coefficient, $\beta_0$ is the intercept term, $n$ is the number of samples, $p$ is the number of features, and $\lambda$ is the regularization strength parameter. The degree of complexity adjustment in LASSO regression is controlled by the parameter $\lambda$. A larger $\lambda$ imposes a stronger penalty on linear models with many variables, thereby resulting in a model with fewer features. Given that LASSO regression can reduce model complexity to prevent overfitting, identify the most influential variables for classification, and make the model easier to interpret and understand, this paper employs LASSO for building the primary classification model and for feature selection of the acquired data indicators.
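The effect of the regularization strength can be illustrated with a small NumPy sketch that minimizes the objective of Equation (2) directly. It uses iterative soft-thresholding (ISTA), which is a simple textbook solver chosen here for readability and not the solver used in the study; the synthetic data, the function names, and the tested values of lambda are all assumptions.

```python
import numpy as np


def soft_threshold(v, t):
    """Shrink each entry of v toward zero by t; entries within [-t, t] become 0."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def lasso_ista(X, y, lam, n_iter=2000):
    """Minimise the objective of Eq. (2) by iterative soft-thresholding (ISTA)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean  # centring absorbs the unpenalised intercept
    step = 1.0 / (2.0 * np.linalg.norm(Xc, 2) ** 2)  # 1 / Lipschitz constant
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = -2.0 * Xc.T @ (yc - Xc @ beta)  # gradient of the squared-error term
        beta = soft_threshold(beta - step * grad, step * lam)
    beta0 = y_mean - x_mean @ beta
    return beta0, beta


# Synthetic data (assumed): only features 0 and 1 influence the response.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

# For this objective, every coefficient is zero once lambda reaches this value.
lam_max = 2.0 * np.max(np.abs((X - X.mean(axis=0)).T @ (y - y.mean())))

for lam in (1.0, 0.5 * lam_max, 1.01 * lam_max):
    _, beta = lasso_ista(X, y, lam)
    print(f"lambda={lam:9.2f}  non-zero coefficients: {np.count_nonzero(beta)}")
```

Library implementations often scale the squared-error term differently (for example by 1/(2n)), so a value of lambda that suits Equation (2) is not directly interchangeable with a library's regularization parameter.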

3.4.6. Standardization Treatment

Due to the variation in dimensions and magnitude of the indices describing forest stand conditions, direct analysis using raw data may lead to slow convergence or difficulty in achieving global optimality during model training. Therefore, to eliminate the impact of dimensions, enhance numerical stability, and improve algorithm performance, it is necessary to standardize the raw data to obtain more reliable results. Common standardization methods include min-max scaling and Z-Score Standardization [69]. Min-max scaling scales the data between a specified minimum and maximum value, usually 0 and 1. Z-Score Standardization adjusts the data so that their mean is 0 and their standard deviation is 1. Proper standardization of data ensures the efficiency of machine learning models during training and good generalization capabilities on new data. In this study, the Z-Score Standardization method is used to determine the relative position of a data point within the group. The formula for this calculation is

$$z = \frac{x - \mu}{\sigma} \tag{3}$$

where $\mu$ is the sample mean and $\sigma$ is the sample standard deviation.
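Equation (3) can be applied column by column, as in the short NumPy sketch below. The small matrix of stand parameters is invented for illustration, and `ddof=1` is an assumption that follows the wording "sample standard deviation"; some libraries divide by n instead of n - 1, which gives slightly different values.

```python
import numpy as np


def z_score(X, ddof=1):
    """Column-wise Eq. (3): subtract the mean, divide by the standard deviation."""
    mu = X.mean(axis=0)
    sigma = X.std(axis=0, ddof=ddof)  # ddof=1 gives the sample standard deviation
    return (X - mu) / sigma


# Hypothetical rows: average age (years), crown density, volume per hectare (m^3).
X = np.array([
    [35.0, 0.6, 120.0],
    [52.0, 0.8, 210.0],
    [18.0, 0.4, 45.0],
    [70.0, 0.7, 305.0],
])

Z = z_score(X)
print(Z.round(3))
```

After the transformation, every column has mean 0 and unit standard deviation, so parameters measured in years, fractions, and cubic meters contribute on a comparable scale.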

3.4.7. Feature Selection

It is well known that the higher the dimensionality of data, the more noisy, irrelevant, and redundant information there will be, which can lead to overfitting in models and increase the error rate of learning algorithms [70]. Therefore, a common approach for high-dimensional data involves using feature selection methods to filter out the most influential features from the original high-dimensional feature set. This achieves the goals of reducing data dimensions, lowering noise, decreasing model complexity, and preventing model overfitting. Feature selection, also known as variable selection, attribute selection, or variable subset selection, is a process in machine learning and statistical modeling used to select the features that significantly affect a model's predictive performance. As a data preprocessing strategy, the effectiveness and efficiency of feature selection have been proven in numerous data mining and machine learning challenges [71]. Currently, common feature selection methods can be broadly categorized into three types: filter, wrapper, and embedded methods. To effectively measure the importance of various forest stand parameters and enhance model performance, this study chooses the embedded methods of random forests and LASSO regression for feature selection on the acquired data.

3.4.8. SEFMTI Model

Compared to individual machine learning methods, ensemble learning methods have better accuracy [72]. Wang et al. [73] proposed an FL-Stacking model based on hybrid feature selection and ensemble learning, achieving high accuracy in estimating the stock volume of forest units and validating the potential value of the Stacking ensemble strategy in multiple forestry scenarios. Accordingly, this paper adopts the Stacking ensemble method to construct a Stacking Ensemble Forest Management Type Identification (SEFMTI) model. The ensemble model is built with a two-tier Stacking approach to ensure data independence. As an efficient ensemble technique, the Stacking strategy combines different models into an integrated model [74], with a framework composed of primary learners and a meta-learner. In this study, DT, SVM, and RF are selected as primary learners for the Stacking ensemble learning, and SVM is chosen as the meta-learner to construct the SEFMTI model. The flowchart of the model is shown in Figure 2.
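A two-tier Stacking structure of this kind can be sketched with Scikit-learn's `StackingClassifier`, using DT, SVM, and RF as primary learners and an SVM as the meta-learner. This is only an illustration under stated assumptions: the data are synthetic, the hyperparameters are library defaults or invented values rather than those of Table 4, and the stratified split and 5-fold internal cross-validation are choices made here, not details confirmed by the paper.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in: 19 features, 4 classes.
X, y = make_classification(
    n_samples=600, n_features=19, n_informative=8, n_classes=4, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0  # 7:3 split
)

# First tier: the three primary learners.
primary_learners = [
    ("dt", DecisionTreeClassifier(random_state=0)),
    ("svm", make_pipeline(StandardScaler(), SVC(kernel="rbf"))),
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
]

# Second tier: an SVM meta-learner trained on out-of-fold predictions of the
# primary learners, so it never sees predictions made on their training folds.
stack = StackingClassifier(
    estimators=primary_learners,
    final_estimator=SVC(kernel="rbf"),
    cv=5,
)
stack.fit(X_train, y_train)

accuracy = stack.score(X_test, y_test)
print(f"test accuracy on synthetic data: {accuracy:.3f}")
```

The internal cross-validation is what keeps the two tiers independent: the meta-learner is fitted on predictions for samples that the corresponding primary learner did not train on.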

3.5. Model Evaluations

The evaluation of a classification model is generally based on the validation set, and an overall assessment is obtained by calculating the confusion matrix, the Kappa coefficient, and similar measures. In addition, to comprehensively evaluate the performance of the model in different aspects, indicators such as accuracy, precision, recall, and F1-score are often used [75]. Especially for datasets with imbalanced classes, accuracy alone may not reflect the true performance of the model. In such cases, precision, recall, and F1-score become even more important.
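The indicators named above can all be derived from the confusion matrix, as the NumPy sketch below shows. The label vectors are invented, the per-class scores are macro-averaged (an assumption, since the averaging scheme is not stated here), and the sketch assumes every class appears at least once among both the true and the predicted labels.

```python
import numpy as np


def evaluate(y_true, y_pred, n_classes):
    """Accuracy, macro precision/recall/F1, and Cohen's kappa from a confusion matrix."""
    cm = np.zeros((n_classes, n_classes), dtype=int)  # rows: true, columns: predicted
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    n = cm.sum()
    tp = np.diag(cm).astype(float)
    precision = tp / cm.sum(axis=0)  # correct / predicted, per class
    recall = tp / cm.sum(axis=1)     # correct / actual, per class
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = tp.sum() / n
    # Agreement expected by chance, from the row and column totals.
    expected = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / n ** 2
    kappa = (accuracy - expected) / (1 - expected)
    return {
        "confusion_matrix": cm,
        "accuracy": accuracy,
        "precision": precision.mean(),
        "recall": recall.mean(),
        "f1": f1.mean(),
        "kappa": kappa,
    }


# Hypothetical labels for three classes (0, 1, 2).
y_true = [0, 0, 0, 0, 1, 1, 1, 2, 2, 2]
y_pred = [0, 0, 0, 1, 1, 1, 2, 2, 2, 2]

scores = evaluate(y_true, y_pred, n_classes=3)
for name, value in scores.items():
    print(name, value, sep=": ")
```

Because macro averaging weights each class equally, a model that ignores a small class is penalized in precision, recall, and F1-score even when its overall accuracy stays high.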

3.6. Model Parameters

The accuracy, complexity, and training time of machine learning methods are strongly affected by their parameters. The default parameters of the models in Scikit-learn are not always optimal, and different parameters differ greatly in their impact on model performance. To improve model performance, prevent overfitting or underfitting, and enhance generalization ability, the model-related parameters were optimized first. GridSearchCV was mainly used to optimize the parameters of the LASSO regression, decision tree, Support Vector Machine, and random forest methods, and the optimized parameters are shown in Table 4. GridSearchCV was likewise used to set the Stacking-related parameters.
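A grid search of this kind can be sketched as follows for the SVM with an RBF kernel. The candidate values in `param_grid`, the 5-fold cross-validation, and the synthetic data are assumptions for illustration only; the values actually selected in the study are those reported in Table 4.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in: 19 features, 4 classes.
X, y = make_classification(
    n_samples=300, n_features=19, n_informative=8, n_classes=4, random_state=0
)

# Scaling lives inside the pipeline so it is refitted on each training fold.
pipe = make_pipeline(StandardScaler(), SVC(kernel="rbf"))

# Hypothetical candidate values; "svc__" addresses the SVC step of the pipeline.
param_grid = {
    "svc__C": [0.1, 1, 10],
    "svc__gamma": ["scale", 0.01, 0.1],
}

search = GridSearchCV(pipe, param_grid, cv=5, scoring="accuracy")
search.fit(X, y)

print("best parameters:", search.best_params_)
print("best cross-validated accuracy:", round(search.best_score_, 3))
```

The same pattern applies to the other learners by swapping the estimator and the grid, for example tree depth for the decision tree or the number of trees for the random forest.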

4. Results

4.1. Data Pre-Processing Results

According to the acquired data, the discrete variables were converted to numerical values with reference to the technical protocol for continuous forest inventory. Species richness refers to the number of species in an ecosystem or community and is the most direct measure of diversity [76]. Tree species composition, which refers to the types of trees in a forest stand and their proportions, is difficult to quantify under a unified indicator. Therefore, this study calculates the species richness of each plot from its tree species composition. The names, definitions, maximum values, minimum values, means, and standard deviations of each forest stand parameter are presented in Table 5. According to the results in Table 3, the standard deviations of the 19 forest stand parameters range from 0.11 to 370.13, indicating large differences in scale between indicators. Therefore, to make the indicators comparable in dimension, the data are standardized according to the method described in Section 3.4.6. The data are then split into a training set and a test set at a ratio of 7:3 for subsequent model training and testing.
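The standardization and 7:3 split can be sketched as below, assuming that the standardization in question is the z-score transform (subtract the column mean, divide by the column standard deviation). The toy plots, the labels, the fixed seed, and the helper names are hypothetical.

```python
import random
import statistics


def zscore_columns(rows):
    """Standardize each column to zero mean and unit (population) standard deviation."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    stds = [statistics.pstdev(col) for col in columns]
    return [
        [(v - m) / s if s > 0 else 0.0 for v, m, s in zip(row, means, stds)]
        for row in rows
    ]


def split_7_3(rows, labels, seed=0):
    """Shuffle the indices reproducibly and split them at a 7:3 ratio."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    cut = round(0.7 * len(indices))
    train, test = indices[:cut], indices[cut:]
    return (
        [rows[i] for i in train],
        [labels[i] for i in train],
        [rows[i] for i in test],
        [labels[i] for i in test],
    )


# Hypothetical toy plots: (average age in years, crown density).
plots = [
    [20.0, 0.4], [35.0, 0.6], [50.0, 0.7], [65.0, 0.8], [80.0, 0.9],
    [95.0, 0.5], [40.0, 0.3], [55.0, 0.6], [70.0, 0.7], [25.0, 0.5],
]
labels = ["TTM", "HFM", "TTM", "HFM", "RLQIF", "CDF", "TTM", "HFM", "RLQIF", "TTM"]

scaled = zscore_columns(plots)
X_train, y_train, X_test, y_test = split_7_3(scaled, labels)
print(len(X_train), len(X_test))
```

The sketch follows the order given in the text (standardize, then split). A common alternative is to compute the means and standard deviations on the training portion only and reuse them for the test portion.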

4.2. Comparison of BFMTI Model Results

Using the preprocessed dataset as input, the study selected LASSO, DT, SVM, and RF classification methods based on the parameter settings in Table 4 to build the Basic Forest Management Type Identification (BFMTI) models: BFMTI-LASSO, BFMTI-DT, BFMTI-SVM, and BFMTI-RF. The models were evaluated based on accuracy, precision, recall, and F1-score, as shown in Table 6.

Table 6 shows that, on the same dataset, the BFMTI-LASSO, BFMTI-DT, BFMTI-SVM, and BFMTI-RF models built in this paper achieved accuracies of 87.76%, 87.76%, 96.61%, and 95.31%, respectively. BFMTI-LASSO and BFMTI-DT had the same accuracy and recall, but BFMTI-DT had a slightly higher F1-score and precision. BFMTI-DT was therefore preferred over BFMTI-LASSO, and BFMTI-DT, BFMTI-RF, and BFMTI-SVM were selected as primary classifiers for the SEFMTI model. Among the four models, BFMTI-SVM achieved the best results, with an accuracy 8.85% higher than BFMTI-LASSO and BFMTI-DT and 1.3% higher than BFMTI-RF. Its F1-score, precision, and recall were relatively stable and higher than those of the other three models. Therefore, SVM was chosen as the meta-learner to build the SEFMTI model.

4.3. Comparison of Different Ensemble Models

The ensemble models were built with the preprocessed dataset as input. Firstly, following the flowchart shown in Section 3.4.8, the BFMTI-DT, BFMTI-RF, and BFMTI-SVM models built with the DT, RF, and SVM methods were selected as primary learners, with SVM used as the meta-learner, to construct the SEFMTI model. Secondly, to provide a baseline for the Stacking model, this study used the relative majority voting method commonly used in the Bagging ensemble strategy to integrate the identification results of the BFMTI-DT, BFMTI-RF, and BFMTI-SVM models, yielding the Bagging ensemble forest management type identification (BEFMTI) model. The results are presented in Table 7.
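The relative majority (plurality) voting step can be illustrated with a few lines of Python. The predictions below are hypothetical, and the tie-breaking rule (the first model's label wins when all three disagree) is an assumption, since the text does not specify one.

```python
from collections import Counter


def plurality_vote(predictions_per_model):
    """Combine per-model label lists by taking the most frequent label per sample."""
    voted = []
    for votes in zip(*predictions_per_model):
        # most_common keeps first-seen order among ties, so the first model wins a tie.
        voted.append(Counter(votes).most_common(1)[0][0])
    return voted


# Hypothetical predictions of three primary models for four plots.
dt_pred = ["TTM", "HFM", "RLQIF", "CDF"]
rf_pred = ["TTM", "CDF", "TTM", "HFM"]
svm_pred = ["TTM", "HFM", "TTM", "HFM"]

ensemble_pred = plurality_vote([dt_pred, rf_pred, svm_pred])
print(ensemble_pred)
```

Unlike Stacking, this combination rule has no trainable second tier: each primary model contributes one equally weighted vote.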

Table 7 shows that the accuracy, precision, recall, and F1-score of the SEFMTI model are 3.39%, 3.56%, 3.48%, and 3.39% higher than those of the BEFMTI model, respectively. Compared with the primary identification models, the accuracy of the SEFMTI model is 9.38%, 9.38%, 0.52%, and 1.82% higher than that of BFMTI-LASSO, BFMTI-DT, BFMTI-SVM, and BFMTI-RF, respectively. The SEFMTI model therefore outperforms both the BEFMTI model and the individual identification models, which verifies the advantage of the Stacking ensemble learning method in recognizing forest management types.

4.4. Comparison of Feature Selection Results

4.4.1. Calculation Results of Feature Importance

The analysis of key decision indicators can effectively eliminate redundant and irrelevant variables, significantly reducing the complexity and cost of data collection. Additionally, it enhances the interpretability of the model, making it more transparent, helping decision-makers understand the key factors affecting forest management and formulate policies scientifically. At the same time, it reduces model complexity, accelerates the decision-making process, and improves overall decision-making efficiency. LASSO regression automatically performs variable selection through L1 regularization, simplifying the model and improving interpretability. Random forest provides robust feature importance measures by constructing multiple decision trees, capturing nonlinear relationships and interactions between features. The combination of both methods simplifies the model and enhances its robustness and accuracy. Therefore, this study employs random forest and LASSO regression methods to analyze the key decision factors for forest management types. The model parameters are described in Section 3.6. The feature importance results are shown in Figure 3, with scores ranging from 0 to 1, indicating the relative importance of each indicator in the classification task.

The results indicate that in the RF method, the importance of the indicators ranged from 0 to 0.24, with AG having the highest importance at 0.24 and GC the lowest at 0. In the LASSO method, six attributes were selected, specifically AG, AGG, CD, NTPH, SO, and STY. Among these, AGG had a high importance of 0.16, while NTPH had the lowest at 0.01. A comparison between the two feature selection methods reveals that the attributes AG, AGG, CD, and SO have higher importance, each exceeding the 0.05 level, making them key decision indicators for identifying forest management types. Additionally, combining the feature selection results from both models, the study used STY as a dividing line. After excluding four indicators (ASP, SLP, SM, GC) from the preprocessed data, the remaining 15 indicators were organized into a feature selection dataset to conduct further analysis.
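The selection rule described above (an indicator is treated as key when its importance exceeds 0.05 under both methods) can be written as a short filter. Only four of the scores below are quoted in the text (AG = 0.24 and GC = 0 for RF; AGG = 0.16 and NTPH = 0.01 for LASSO); all other values are placeholders chosen for illustration and are not results from the study.

```python
THRESHOLD = 0.05

# RF importance scores: AG and GC are quoted in the text, the rest are placeholders.
rf_importance = {
    "AG": 0.24, "AGG": 0.12, "CD": 0.09, "SO": 0.07,
    "NTPH": 0.06, "STY": 0.03, "GC": 0.0,
}
# LASSO importance scores: AGG and NTPH are quoted in the text, the rest are placeholders.
lasso_importance = {
    "AG": 0.10, "AGG": 0.16, "CD": 0.08, "SO": 0.06,
    "NTPH": 0.01, "STY": 0.02, "GC": 0.0,
}

# Keep an indicator only if it exceeds the threshold under both methods.
key_indicators = [
    name
    for name in rf_importance
    if rf_importance[name] > THRESHOLD
    and lasso_importance.get(name, 0.0) > THRESHOLD
]
print(key_indicators)
```

Requiring agreement between the two methods is stricter than using either ranking alone, so an indicator that scores well under only one method is not promoted to a key indicator.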

4.4.2. Comparison of Model Results after Feature Selection

The BFMTI-LASSO, BFMTI-DT, BFMTI-SVM, BFMTI-RF, BEFMTI, and SEFMTI models were rebuilt with the post-feature-selection data as input, and the results are shown in Table 8.

According to Table 6 and Table 8, after feature selection, the accuracies of the BFMTI-LASSO and BFMTI-DT models improved by 0.78% and 1.04%, respectively, and their training efficiencies increased by 25.25% and 18.47%, respectively. The accuracy of BFMTI-RF remained unchanged, but its training efficiency increased by 18.48%. The accuracy of BFMTI-SVM decreased by 0.78%, but its efficiency improved by 2.27%. Table 7 and Table 8 show that, after feature selection, the BEFMTI model's accuracy, precision, recall, and F1-score improved by 1.04%, 1.17%, 1.02%, and 1.04%, respectively, and its training efficiency improved by 14.97%. The accuracy of the SEFMTI model decreased by 2.34%, but its efficiency increased by 26.96%. These results indicate that feature selection can significantly enhance the efficiency of model training while maintaining high accuracy, providing a scientific basis for later applying the models developed in this study in actual production.

5. Discussion

5.1. Analysis of Key Decision-Making Indicators

In China’s traditional forest management activities, stands are typically classified into forest management types based on five main indicators: stand origin (SO), tree species composition (TSC), age group (AGG), tree growth and development stage (TGDS), and site conditions (SCOs) [10]. Stand origin describes how the trees in the stand originated and serves as a basis for analyzing stand growth and determining technical management measures. Tree species composition describes the tree species present in the stand and their proportions. Stands composed of a single tree species are termed pure stands, while those composed of two or more tree species are termed mixed stands. Stand age is the average age of the dominant tree species (groups) in the stand and is an important indicator for determining management measures, especially in artificial forest management. The age group represents the stage of growth and development of the stand and determines the management measures needed at each stage. Site conditions (aspect, slope, soil, etc.) determine the priority sequence of management and the capacity to support a given management intensity and set of objectives. Stands with good site conditions are given priority for management, and management intensity can be increased accordingly.

According to the results of Section 4.4.1, the importance of four indicators (AG, AGG, CD, and SO) exceeds 0.05 in both the random forest and LASSO regression methods, making them key decision indicators for forest management type identification, which is consistent with the main indicators used in traditional forest type classification. However, two differences from the traditional criteria emerge: the TSC and SCO indicators used in the traditional criteria do not appear among the key indicators, whereas CD, which has received little attention in previous research, does. To further analyze the relationship between TSC, CD, and other stand parameters, this study first calculates the cumulative importance of the nine indicators related to SCO, such as SLO, STH, SPH, SC, ASP, SLP, STY, and GC, under the random forest and LASSO regression methods and takes the average of the two to obtain the SCO Importance Index. Similarly, the Comprehensive Importance Index of ten indicators, including AG, VPH, AGG, CD, NTPH, SO, SMH, SR, MDBH, and AGC, is calculated, and the results are plotted as a histogram in Figure 4.
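One way to read the aggregation described above is given below. The notation is introduced here for illustration and does not appear in the original: S denotes the set of SCO-related indicators, and w_j^RF and w_j^LASSO denote the importance of indicator j under the random forest and LASSO regression methods. The second expression assumes that the index of an individual indicator is the mean of its two importance scores.

```latex
I_{\mathrm{SCO}} = \frac{1}{2}\left(\sum_{j \in S} w_j^{\mathrm{RF}} + \sum_{j \in S} w_j^{\mathrm{LASSO}}\right),
\qquad
I_i = \frac{1}{2}\left(w_i^{\mathrm{RF}} + w_i^{\mathrm{LASSO}}\right)
```

Under this reading, the SCO index accumulates many individually small scores, so it can rank highly even when no single site-condition indicator does.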

Figure 4 shows that the top five indicators by Comprehensive Importance Index are AG, SCO, AGG, CD, and SO. After the nine indicators belonging to SCO are integrated, the Comprehensive Importance Index of SCO is 0.145, ranking second. The Comprehensive Importance Index of SR, which is quantified from TSC, is 0.02, a relatively low level. The Comprehensive Importance Index of CD is 0.08, ranking fourth. The results indicate that the key decision indicators analyzed in this study are consistent with the traditional classification criteria, with some differences in TSC and CD. The higher Comprehensive Importance Index of CD may be related to significant differences in CD among the four management types: TTM, HFM, RLQIF, and CDF. Therefore, future research on forest management type identification should consider more diverse indicators of biodiversity to improve identification accuracy [22]. Additionally, besides SO, TSC, AGG, TGDS, and SCO, CD should also be given priority consideration.

5.2. Comparison of Different Model Confusion Matrices

To analyze the differences in the identification of different management types, this study plotted the confusion matrices of the BFMTI-LASSO, BFMTI-DT, BFMTI-SVM, BFMTI-RF, BEFMTI, and SEFMTI models on the test set after feature selection, as shown in Figure 5 and Figure 6.

Figure 5a shows that the BFMTI-LASSO model misclassified a proportion of 0.33 of the RLQIF plots as TTM and 0.67 of the CDF plots as HFM. Figure 5b shows that the BFMTI-DT model misclassified 0.33 of the RLQIF plots as TTM and 0.5 of the CDF plots as HFM. Figure 5c shows that the BFMTI-RF model misclassified 0.27 of the RLQIF plots as TTM and 0.17 of the CDF plots as HFM. Figure 5d shows that the BFMTI-SVM model misclassified 0.13 of the RLQIF plots as TTM and 0.07 of the RLQIF plots as HFM. The results indicate that identification performance depends on both the forest management type and the classification algorithm, with RLQIF and TTM, as well as CDF and HFM, being pairs of management types prone to confusion. The misclassification between RLQIF and TTM may arise because both management types correspond to stands with poorly uniform tree distribution and uneven ages, whereas the misclassification between CDF and HFM might be due to similar stand densities within these management types; however, these explanations still require further substantiation.
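The proportions discussed above are entries of a row-normalized confusion matrix, in which each row (actual type) sums to 1. The sketch below shows the calculation. The counts are illustrative values chosen so that the proportions resemble those quoted for one model; they are not taken from the study's test set.

```python
from collections import Counter


def row_normalized_confusion(y_true, y_pred, labels):
    """Return {actual: {predicted: proportion}} with each non-empty row summing to 1."""
    counts = Counter(zip(y_true, y_pred))
    matrix = {}
    for actual in labels:
        row_total = sum(counts[(actual, predicted)] for predicted in labels)
        matrix[actual] = {
            predicted: (counts[(actual, predicted)] / row_total if row_total else 0.0)
            for predicted in labels
        }
    return matrix


labels = ["TTM", "HFM", "RLQIF", "CDF"]
# Illustrative counts: 15 RLQIF plots (2 predicted as TTM, 1 as HFM) and 6 CDF plots.
y_true = ["RLQIF"] * 15 + ["CDF"] * 6
y_pred = ["RLQIF"] * 12 + ["TTM"] * 2 + ["HFM"] * 1 + ["CDF"] * 6

confusion = row_normalized_confusion(y_true, y_pred, labels)
print(round(confusion["RLQIF"]["TTM"], 2), round(confusion["RLQIF"]["HFM"], 2))
```

Because each row is normalized by the number of plots of that actual type, a single misclassified plot in a small class changes the proportion far more than it would in a large class.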

As shown in Figure 6a, the BEFMTI model identifies 0.13 of the RLQIF plots as TTM and 0.17 of the CDF plots as HFM. Figure 6b shows that the SEFMTI model identifies 0.13 of the RLQIF plots as TTM, while its identification accuracy for CDF reaches 100%. The results show that the SEFMTI model proposed in this paper can effectively improve the identification accuracy of small-sample types while maintaining high overall accuracy, which provides a reference for further analysis of small-sample data.

5.3. Limitations

Based on the data from the sustainable forest management pilot project of Yichun Forest Industry Group, this study utilized the Stacking ensemble learning algorithm to construct the SEFMTI ensemble model, successfully identifying four management types: TTM, HFM, RLQIF, and CDF. Despite achieving positive results, the study still has some limitations. Firstly, due to the significant regional differences in China and the complex and diverse forest types, the management types investigated in the current research are relatively limited. The study area is solely focused on Yichun, lacking verification of applicability to other regions. In future work, we need to explore more forest management types based on the specific forest conditions of different regions. Additionally, we will further validate the applicability of the SEFMTI model in different forest ecosystems and optimize the model to meet the forest management needs of various regions. Secondly, the 19 indicators currently used may not be the optimal choice for management type identification, and they need to be dynamically adjusted based on the forest management requirements and the actual conditions of the stands. Thirdly, the model built in this study is based on traditional survey data, which are susceptible to constraints such as inaccurate and discontinuous data. Therefore, in the context of the continuous transformation of forest management concepts and the trend towards a “digital” and “intelligent” era with remote sensing technology, UAV-LiDAR, artificial intelligence, etc., it is necessary to continuously deepen the research on intelligent technology in the monitoring system, combined with the integration of “sky–ground” multimodal data.

6. Conclusions

This paper proposes a Stacking Ensemble Forest Management Type Identification (SEFMTI) model. Using the Stacking ensemble strategy, this model combines the advantages of several classification methods, namely decision trees, Support Vector Machines, and random forests, and achieves strong results in forest management type identification. The identification accuracy for the management types in the study area reached 97.14%, and the response time for management type identification and information feedback was only 0.639 s, effectively shortening the organization cycle and significantly improving decision-making efficiency in the process of forest management type organization in China. By analyzing the key decision indicators (AG, AGG, CD, SO), not only can the complexity and cost of data collection be reduced, but the interpretability of the model can also be enhanced, helping decision-makers formulate more scientific policies. This also provides a scientific basis for the formulation of forest management plans, enhancing the accuracy and reliability of decisions. In summary, the SEFMTI model addresses problems such as long organization cycles, inaccurate type identification, and low decision-making efficiency in forest management type organization, and it improves decision-making efficiency and accuracy, supporting the formulation and scientific management of forest management plans. Moreover, this model has broad application prospects and promotion value in the large-scale promotion of typical technical models, cost saving in management, and the advancement of intelligent and precise forestry development. In the future, the application of the SEFMTI model will further promote the modernization of forest management in China, improve the scientific and effective management of forest resources, and provide strong technical support for achieving sustainable forest management.

Author Contributions

Development of the idea, J.L. and S.C.; data collection, J.L. and J.C.; writing—original draft preparation, J.L.; writing—review and editing, J.C., K.W. and S.C.; supervision of the project, K.W. and S.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the National Key Research and Development Program of China: Intelligent Multifunctional Management Decision-making Technology for Larch Plantation Forests (No. 2023YFD2200804); the Cooperative Forestry Science and Technology Project of Zhejiang Provincial Academy: Study on carbon sequestration and sink enhancement technology of forest ecosystem in Zhejiang Province and the realization path of carbon sink value (No. 2023SY02).

Data Availability Statement

The research data in this paper were obtained by the Forest Multi-Objective Management Team of the Institute of Forestry Science and Technology Information, Chinese Academy of Forestry, in Yichun, Heilongjiang Province, through in situ surveys from September 2022 to May 2023, with the cooperation and support of the forestry bureaus of Yichun Forestry Group. The data presented in this study are available on request from the corresponding author. The data are not publicly available due to the confidentiality of the data set.

Acknowledgments

The sustainable forest management data are provided by the Yichun Forest Industry Group. We gratefully acknowledge their invaluable cooperation in this work.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Zhang, H.R.; Lei, X.D.; Zhang, C.Y.; Zhao, X.H.; Hu, X.F. Research on theory and technology of forest quality evaluation and precision improvement. J. Beijing For. Univ. BJFU 2019, 41, 1–18.
2. Tang, X.P.; Ouyang, J.X. Summary on the Development of the Forest Management Plan. For. Resour. Manag. 2022, S1, 8–18.
3. Chen, Q.Y. Division of Working Group of Silviculture and Working Measures Difference. For. Eng. 2009, 25, 17–20.
4. Duncker, P.S.; Barreiro, S.M.; Hengeveld, G.M.; Lind, T.; Mason, W.L.; Ambrozy, S.; Spiecker, H. Classification of Forest Management Approaches: A New Conceptual Framework and Its Applicability to European Forestry. Ecol. Soc. 2012, 17, art51.
5. Chazdon, R.L.; Brancalion, P.H.S.; Laestadius, L.; Bennett-Curry, A.; Buckingham, K.; Kumar, C.; Moll-Rocek, J.; Vieira, I.C.G.; Wilson, S.J. When Is a Forest a Forest? Forest Concepts and Definitions in the Era of Forest and Landscape Restoration. Ambio 2016, 45, 538–550.
6. Chazdon, R.L.; Broadbent, E.N.; Rozendaal, D.M.A.; Bongers, F.; Zambrano, A.M.A.; Aide, T.M.; Balvanera, P.; Becknell, J.M.; Boukili, V.; Brancalion, P.H.S.; et al. Carbon Sequestration Potential of Second-Growth Forest Regeneration in the Latin American Tropics. Sci. Adv. 2016, 2, e1501639.
7. Qing, X.H. Classification methods on forest working group of noncommercial forest in collective forest are. J. South. Agric. 2014, 45, 266–273.
8. Hong, Y.; Yu, B.; Ding, L.; Hu, W.L.; Zheng, X.X. Technical Research of Multi-functional Management Types on Oka Forest in Mountainous Region of Eastern Liaoning Province. For. Resour. Manag. 2018, 3, 75–80.
9. Larsen, J.B.; Nielsen, A.B. Nature-Based Forest Management—Where Are We Going? For. Ecol. Manag. 2007, 238, 107–117.
10. Li, T.T.; Chen, S.Z.; Wu, S.R.; Wu, K.Y.; Lan, Q. The Application of Close-to-nature Forest Development Types in Forest Management Type Improvement in China. World For. Res. 2016, 29, 1–5.
11. Hou, S.M.; Liu, D.L.; Zheng, X.X. Division of forest management types based on GIS in Jiangle Forest Farm. J. Cent. South Univ. For. Technol. 2014, 34, 66–71.
12. Pukkala, T. Which Type of Forest Management Provides Most Ecosystem Services? For. Ecosyst. 2016, 3, 9.
13. Pang, L.; Wang, G.; Sharma, R.P.; Lu, J.; Tang, X.; Fu, L. Simulation of Thinning by Integrating Tree Competition and Species Biodiversity for Target Tree-Based Management of Secondary Forests. Forests 2023, 14, 1896.
14. Raihan, A. Artificial Intelligence and Machine Learning Applications in Forest Management and Biodiversity Conservation. Nat. Resour. Conserv. Res. 2023, 6, 3825.
15. Liu, T.; Sun, Y.; Wang, C.; Zhang, Y.; Qiu, Z.; Gong, W.; Lei, S.; Tong, X.; Duan, X. Unmanned Aerial Vehicle and Artificial Intelligence Revolutionizing Efficient and Precision Sustainable Forest Management. J. Clean. Prod. 2021, 311, 127546.
16. Muhammad, A.; Khloud, K.A.; Salma, A.S.; Samar, O.A.; Mashael, E.A.; Maram, A.A.; Maryam, A. Role of Machine Learning Algorithms in Forest Fire Management: A Literature Review. J. Robot. Autom. 2021, 5, 212–226.
17. Taylor, P.; Almeida, A.C.; Kemmerer, E.; Abreu, R.O.D.S. Improving Spatial Predictions of Eucalypt Plantation Growth by Combining Interpretable Machine Learning with the 3-PG Model. Front. For. Glob. Chang. 2023, 6, 1181049.
18. Pecchi, M.; Marchi, M.; Burton, V.; Giannetti, F.; Moriondo, M.; Bernetti, I.; Bindi, M.; Chirici, G. Species Distribution Modelling to Support Forest Management. A Literature Review. Ecol. Model. 2019, 411, 108817.
19. Kaya, A.; Bettinger, P.; Boston, K.; Akbulut, R.; Ucar, Z.; Siry, J.; Merry, K.; Cieszewski, C. Optimisation in Forest Management. Curr. For. Rep. 2016, 2, 1–17.
20. Zhao, Q.; Yu, S.; Zhao, F.; Tian, L.; Zhao, Z. Comparison of Machine Learning Algorithms for Forest Parameter Estimations and Application for Forest Quality Assessments. For. Ecol. Manag. 2019, 434, 224–234.
21. Camarretta, N.; Harrison, P.A.; Bailey, T.; Potts, B.; Lucieer, A.; Davidson, N.; Hunt, M. Monitoring Forest Structure to Guide Adaptive Management of Forest Restoration: A Review of Remote Sensing Approaches. New For. 2020, 51, 573–596.
22. Oettel, J.; Lapin, K. Linking Forest Management and Biodiversity Indicators to Strengthen Sustainable Forest Management in Europe. Ecol. Indic. 2021, 122, 107275.
23. Falcão, A.O.; Santos, M.P.D.; Borges, J.G. A Real-Time Visualization Tool for Forest Ecosystem Management Decision Support. Comput. Electron. Agric. 2006, 53, 3–12.
24. Haara, A.; Pykäläinen, J.; Tolvanen, A.; Kurttila, M. Use of Interactive Data Visualization in Multi-Objective Forest Planning. J. Environ. Manag. 2018, 210, 71–86.
25. Liu, P.; Ren, C.; Wang, Z.; Jia, M.; Yu, W.; Ren, H.; Xia, C. Evaluating the Potential of Sentinel-2 Time Series Imagery and Machine Learning for Tree Species Classification in a Mountainous Forest. Remote Sens. 2024, 16, 293.
26. Dong, X.; Yu, Z.; Cao, W.; Shi, Y.; Ma, Q. A Survey on Ensemble Learning. Front. Comput. Sci. 2020, 14, 241–258.
27. Tajik, S.; Ayoubi, S.; Zeraatpisheh, M. Digital Mapping of Soil Organic Carbon Using Ensemble Learning Model in Mollisols of Hyrcanian Forests, Northern Iran. Geoderma Reg. 2020, 20, e00256.
28. Barreras, A.; Alanís de la Rosa, J.A.; Mayorga, R.; Cuenca, R.; Moreno-G, C.; Godínez, C.; Delgado, C.; Soriano-Luna, M.D.L.Á.; George, S.; Aldrete-Leal, M.I.; et al. Spatial Predictions of Tree Density and Tree Height across Mexico Forests Using Ensemble Learning and Forest Inventory Data. Ecol. Evol. 2023, 13, e10090.
29. Wen, L.; Hughes, M. Coastal Wetland Mapping Using Ensemble Learning Algorithms: A Comparative Study of Bagging, Boosting and Stacking Techniques. Remote Sens. 2020, 12, 1683.
30. Chen, Y.; Dong, C.; Wu, B. Crown Profile Modeling and Prediction Based on Ensemble Learning. Forests 2022, 13, 410.
31. Zou, W.T.; Chen, S.Z.; Wu, S.R.; Li, T.T. Forest Management Type Classification Based on Forestry Survey and Geographic Spatial Data: A Case Study of Qinshui County, Shanxi Province. For. Sci. Technol. 2019, 11, 16–21.
32. MacDicken, K.G.; Sola, P.; Hall, J.E.; Sabogal, C.; Tadoum, M.; De Wasseige, C. Global Progress toward Sustainable Forest Management. For. Ecol. Manag. 2015, 352, 47–56.
33. Zhen, Y.; Zhuang, G.Y. System Governance on the Mountain-River-Forest-Farmland-Lake-Grassland: Theoretical Framework and Approaches. J. Eco-Civ. Stud. 2020, 9, 12–27.
34. Lei, J.P.; Li, H.Q.; Jiang, Z.P. An Analysis on the Application of Close-to-Nature Forestry in China. World For. Res. 2007, 05, 63–67.
35. Lan, Q.; Wu, S.R.; Wu, K.Y.; Chen, S.Z.; Li, T.T. Theory and Practice of Close-to-Nature Forest Management in Small Watershed. World For. Res. 2016, 29, 7–11.
36. Liu, S.R.; Yang, Y.J.; Wang, H. Development strategy and management countermeasures of planted forests in China: Transforming from timber-centered single objective management towards multi-purpose management for enhancing quality and benefits of ecosystem services. Acta Ecol. Sin. 2018, 38, 1–10.
37. Chen, W.J. Innovative Forestry Management Model Achieving Sustainable Forest Management Across the Entire Lifecycle. Guangxi For. 2019, 11, 14–15.
38. Hubbard, W.; Latt, C.; Long, A. Forest Terminology for Multiple-Use Management; 2006. Available online: https://ufdcimages.uflib.ufl.edu/IR/00/00/18/11/00001/FR06300.pdf (accessed on 28 March 2024).
39. Song, J.W.; Fan, B.M.; Li, Z.Y. Historical Evolution of the Idea of Multi-functional Forestry in China. World For. Res. 2011, 24, 8–13.
40. Zeng, X.W.; Fan, B.M.; Zhang, H.Q.; Lei, X.D. Study on the Theory and Strategy of Multifunctional Forest Management in China. For. Resour. Manag. 2013, 2, 10–16.
41. Abetz, P.; Kladtke, J. The Target Tree Management System. Die Z-Baum-Kontrollmethode. Forstwiss. Cent. 2002, 121, 73–82.
42. Lu, Y.C.; Zhang, S.G.; Lei, X.D.; Ning, J.K.; Wang, Y.X. Theoretical Basis and Implementation Techniques on Close-to-natural Transformation of Plantations. World For. Res. 2009, 22, 20–27.
43. Guo, S.Y.; Forster, H.; Chen, X.L. Target Tree Management: German Experience and Hubei Practices. World For. Res. 2021, 34, 14–20.
44. Zhou, C.F.; Feng, L.Y.; He, X.; Zhang, H.R.; Lei, X.D.; Lu, J.; Zhang, X.H.; Wang, J.Z. Study on Effectiveness of Crop Tree Management for Individual Tree Growth in Spruce-Fir Coniferous-Broadleaved Mixed Forest. For. Res. 2022, 35, 19–27.
45. Pommerening, A.; Maleki, K.; Haufe, J. Tamm Review: Individual-Based Forest Management or Seeing the Trees for the Forest. For. Ecol. Manag. 2021, 501, 119677.
46. Wu, K.Y.; Chen, S.Z.; Xu, C.L.; Wu, S.R. Planted Forest Management and Conversion of Natural Secondary Forest Based on the Crop Tree System: A Case Study in Mulan Forest. For. Econ. 2015, 37, 56–61.
47. Wang, H.; Chen, A.T.; Xu, L. Effects of Different Tending Methods on Tree Growth and Soil Nutrients in Secondary Birch Forest. For. Resour. Manag. 2022, S1, 85–90.
48. Ji, H.X.; Zhang, J.F. Causes of low-quality and low-efficiency forests and countermeasures for their enhancement and transformation. Xiandai Nongye Keji 2021, 23, 99–100.
49. Feng, Q.Y.; Chen, C.F.; Qi, L.; He, Y.T.; Wang, P.; Duan, Y.X.; Wang, Y.F.; He, Y.J. Effects of Different Management Models on Stand Structure and Plant Diversity of Natural Secondary Forests of Quercus Mongolica. Sci. Silvae Sin. 2018, 54, 12–21.
50. Lan, Q.; Chen, S.Z.; Wu, K.Y.; Li, T.T.; Lu, J. Progress of Degraded Forests Restoration Research. World For. Res. 2021, 34, 50–57.
51. Dong, L.B.; Liu, Z.G.; Li, F.R. Spatial Point Patterns and Associations of Forest Landscapes in Pangu Forest Farm in Daxing’an Mountains. Sci. Silvae Sin. 2015, 51, 28–36.
52. Dietterich, T.G. Ensemble Methods in Machine Learning. In Multiple Classifier Systems; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2000; Volume 1857, pp. 1–15. ISBN 978-3-540-67704-8.
53. Mienye, I.D.; Sun, Y. A Survey of Ensemble Learning: Concepts, Algorithms, Applications, and Prospects. IEEE Access 2022, 10, 99129–99149.
54. Dasarathy, B.V.; Sheela, B.V. A Composite Classifier System Design: Concepts and Methodology. Proc. IEEE 1979, 67, 708–713.
55. Breiman, L. Bagging Predictors. Mach. Learn. 1996, 24, 123–140.
56. Ferreira, A.J.; Figueiredo, M.A.T. Boosting Algorithms: A Review of Methods, Theory, and Applications. In Ensemble Machine Learning; Zhang, C., Ma, Y., Eds.; Springer: New York, NY, USA, 2012; pp. 35–85. ISBN 978-1-4419-9325-0.
57. Zhang, Y.; Ma, J.; Liang, S.; Li, X.; Liu, J. A Stacking Ensemble Algorithm for Improving the Biases of Forest Aboveground Biomass Estimations from Multiple Remotely Sensed Datasets. GIScience Remote Sens. 2022, 59, 234–249.
58. Navada, A.; Ansari, A.N.; Patil, S.; Sonkamble, B.A. Overview of Use of Decision Tree Algorithms in Machine Learning. In Proceedings of the 2011 IEEE Control and System Graduate Research Colloquium, Shah Alam, Malaysia, 27–28 June 2011; IEEE: Shah Alam, Malaysia, 2011; pp. 37–42.
59. Charbuty, B.; Abdulazeez, A. Classification Based on Decision Tree Algorithm for Machine Learning. J. Appl. Sci. Technol. Trends 2021, 2, 20–28.
60. Rutkowski, L. The CART Decision Tree for Mining Data Streams. Inf. Sci. 2014, 266, 1–15.
61. Maruejols, L.; Wang, H.; Zhao, Q.; Bai, Y.; Zhang, L. Comparison of Machine Learning Predictions of Subjective Poverty in Rural China. China Agric. Econ. Rev. 2023, 15, 379–399.
62. Salcedo-Sanz, S.; Rojo-Álvarez, J.L.; Martínez-Ramón, M.; Camps-Valls, G. Support Vector Machines in Engineering: An Overview. WIREs Data Min. Knowl. Discov. 2014, 4, 234–267.
63. Ding, S.F.; Qi, B.J.; Tan, H.Y. An Overview on Theory and Algorithm of Support Vector Machines. J. Univ. Electron. Sci. Technol. China 2011, 40, 2–10.
64. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
65. Belgiu, M.; Drăguţ, L. Random Forest in Remote Sensing: A Review of Applications and Future Directions. ISPRS J. Photogramm. Remote Sens. 2016, 114, 24–31.
66. Gao, R.N.; Xie, Y.S.; Lei, X.D.; Lu, Y.C.; Su, X.Y. Study on prediction of natural forest productivity based on random forest model. J. Cent. South Univ. For. Technol. 2019, 39, 39–46.
67. Ranstam, J.; Cook, J.A. LASSO Regression. Br. J. Surg. 2018, 105, 1348.
68. Adhikari, A.; Montes, C.R.; Peduzzi, A. A Comparison of Modeling Methods for Predicting Forest Attributes Using Lidar Metrics. Remote Sens. 2023, 15, 1284.
69. Andrade, C. Z Scores, Standard Scores, and Composite Test Scores Explained. Indian J. Psychol. Med. 2021, 43, 555–557.
70. Venkatesh, B.; Anuradha, J. A Review of Feature Selection and Its Methods. Cybern. Inf. Technol. 2019, 19, 3–26.
71. Li, J.; Cheng, K.; Wang, S.; Morstatter, F.; Trevino, R.P.; Tang, J.; Liu, H. Feature Selection: A Data Perspective. ACM Comput. Surv. 2018, 50, 1–45.
72. Xu, R.; Lin, H.; Lu, K.; Cao, L.; Liu, Y. A Forest Fire Detection System Based on Ensemble Learning. Forests 2021, 12, 217.
73. Wang, J.; Xu, J.; Peng, Y.; Wang, H.; Shen, J. Prediction of Forest Unit Volume Based on Hybrid Feature Selection and Ensemble Learning. Evol. Intell. 2020, 13, 21–32.
74. Ribeiro, M.H.D.M.; Dos Santos Coelho, L. Ensemble Approach Based on Bagging, Boosting and Stacking for Short-Term Prediction in Agribusiness Time Series. Appl. Soft Comput. 2020, 86, 105837.
75. Liu, J.; Zhang, Y.; Lv, D.; Lu, J.; Xie, S.; Zi, J.; Yin, Y.; Xu, H. Birdsong Classification Based on Ensemble Multi-Scale Convolutional Neural Network. Sci. Rep. 2022, 12, 8636.
76. Wang, X.; Dallimer, M.; Scott, C.E.; Shi, W.; Gao, J. Tree Species Richness and Diversity Predicts the Magnitude of Urban Heat Island Mitigation Effects of Greenspaces. Sci. Total Environ. 2021, 770, 145211.

Figure 1. The research regions and forest sustainable management pilots involved.

Figure 2. Flowchart of the SEFMTI.

Figure 3. The importance of each indicator.

Figure 4. The Comprehensive Importance Index for each indicator.

Figure 5. Confusion matrices for the different BFMTI models. (a) the BFMTI-LASSO model; (b) the BFMTI-DT model; (c) the BFMTI-RF model; (d) the BFMTI-SVM model.

Figure 6. Confusion matrices for the BEFMTI and SEFMTI models. (a) the BEFMTI model; (b) the SEFMTI model.

Table 1. The definition of forest management types and conditions of application.

FMT	Definition	Adaptation Conditions and Stands	Management Objective
TTM	TTM refers to a forest management method in which all management activities and technical applications throughout the entire cultivation cycle of the forest are closely centered around the cultivation of target trees.	(1) Planted forests that have undergone severe natural differentiation. (2) Natural forests with poor uniformity in tree distribution and varied ages. (3) Forests that are difficult to manage due to their remote location, inconvenient access, and poor entry conditions. (4) Forests with poor site conditions and low output in which a certain number of target trees can still be selected for subsequent conversion. (5) Forests requiring individual cultivation of special trees (e.g., forests used for both timber and pine nuts).	Cultivate high-value forests composed of large-diameter trees that feature extensive area-based stock, high growth rates, rich biodiversity, and complete ecological functions.
HFM	HFM is characterized by selecting retention trees based on the value of the tree species and the quality of the trees, focusing on quality rather than uniformity, and is typically applied to stands that have surpassed the optimal period for selecting target trees, with the goal of cultivating medium- to large-diameter timber. In homogeneous management, intermediate cutting is conducted based on tree grading, progressively harvesting inferior trees while retaining superior ones, which supports highly sustainable forest management and overall good returns.	(1) Planted middle-aged forests with the same species and age group (excluding forests used for both fruit and timber). (2) Low-value broad-leaf forests such as planted Populus L. forests. (3) Natural Betula platyphylla forests. (4) Landscape forests. (5) Stands composed of pioneer species.	Utilize the rapid growth phase of trees during their middle age, where diameter and height grow relatively quickly, to maintain optimal density through scientific management, achieving maximum growth and optimal ecological, diameter class, and social values for the stand.
RLQIF	The RLQIF refers to the silvicultural method of gradually improving forest quality and enhancing forest functions without drastically altering the forest habitat, through practices such as thinning and supplementary planting, artificially inducing natural regeneration and progressive species replacement.	Affected by human or natural factors, the structure and stability of these forest stands are disrupted, and their ecological functions, forest product yield or biomass (or stock) are significantly lower than the average level of similar stands under the same site conditions, but they continue to develop in the original succession direction as low-quality, inefficient stands.	Improve the quality and efficiency of the stand to reach normal levels.
CDF	The CDF refers to a management approach used in forest stands where backward succession occurs due to either poor suitability of tree species to site conditions in artificial forests or excessive logging in natural forests. This method involves artificially replanting or promoting natural regeneration to increase the presence of top-tier tree species or long-term associated species, thus curbing degradation and fostering positive ecological succession.	Forests that have regressed under external forces and that natural processes cannot quickly restore, including the following: (1) Degraded planted forests in which the tree species are poorly matched to the site conditions. (2) Degraded natural forests affected by excessive logging, a lack of renewable high-quality seed sources, and the resulting inverse succession. (3) Forests degraded by severe soil erosion caused by land damage, or by monoculture in artificial forests that blocks nutrient cycling. (4) Forests degraded by climate change or drastic changes in the forest ecological environment.	Change the regressive succession trend of degraded forests, promote stand growth and positive succession, enhance biodiversity within the forest, and cultivate a healthy and stable forest ecosystem that continuously enhances carbon sequestration.

Table 2. Area of forest management and number of sub-compartments in 10 forestry bureaus.

Forestry Bureau Name	Area (ha)	Number of Management Sub-Compartments
Dailing Forestry Bureau	1333.69	106
Hongxing Forestry Bureau	1333.34	104
Jinshantun Forestry Bureau	2000.06	213
Langxiang Forestry Bureau	1339.24	123
Meixi Forestry Bureau	1173.66	155
Nancha Forestry Bureau	666.67	74
Shangganling Forestry Bureau	1095.15	108
Wuyiling Forestry Bureau	1333.34	102
Xinqing Forestry Bureau	2005.04	214
Youhao Forestry Bureau	647.32	78
Total	12,927.51	1277

Table 3. The distribution of the four forest management types.

Management Type	Number of Sub-Compartments	Area (ha)	Percent of Sub-Compartments
TTM	738	7578.43	57.79%
HFM	435	4104.28	34.06%
RLQIF	56	734.13	4.39%
CDF	48	510.66	3.76%
All	1277	12,927.51	100%
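
The Percent column of Table 3 can be reproduced from the sub-compartment counts rather than from the areas. The short check below is a sketch that uses only the figures printed in Table 3; the helper name `shares` is introduced here for illustration.

```python
# Counts and areas copied from Table 3.
counts = {"TTM": 738, "HFM": 435, "RLQIF": 56, "CDF": 48}
areas_ha = {"TTM": 7578.43, "HFM": 4104.28, "RLQIF": 734.13, "CDF": 510.66}


def shares(values):
    """Return each entry's share of the total, in percent, rounded to 2 decimals."""
    total = sum(values.values())
    return {name: round(100 * value / total, 2) for name, value in values.items()}


count_share = shares(counts)    # reproduces the Percent column of Table 3
area_share = shares(areas_ha)   # area-based shares, which differ slightly

for name in counts:
    print(f"{name}: {count_share[name]:.2f}% of sub-compartments, "
          f"{area_share[name]:.2f}% of area")
```

The two smallest classes, RLQIF and CDF, together account for roughly 8% of the sub-compartments, so the per-class results in the confusion matrices are worth reading alongside the overall accuracy.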

Table 4. The model parameter settings.

Methods	Parameters	Parameter Interpretation
GridSearchCV	cv = 5, scoring = 'accuracy', n_jobs = -1	cv is the number of cross-validation folds; scoring is the evaluation metric; n_jobs is the number of parallel jobs (-1 uses all available CPU cores)
LASSO regression	random_state = 42	random_state is the seed that controls randomness
Decision tree	min_samples_split = 2, random_state = 42	min_samples_split is the minimum number of samples required to split an internal node
Random forest	n_estimators = 20, min_samples_split = 2, random_state = 42	n_estimators is the number of trees; min_samples_split is the minimum number of samples required to split an internal node
Support Vector Machine	probability = True, C = 10, kernel = 'rbf', random_state = 42	kernel is the kernel function; C is the regularization parameter; probability indicates whether probability estimates are enabled
Stacking	final_estimator = 'Support Vector Machine', cv = 5	final_estimator is the meta-learner; cv is the number of cross-validation folds used to generate the meta-learner's training inputs
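
The settings in Table 4 map onto scikit-learn estimators, as the GridSearchCV entry suggests. The following is a minimal sketch under stated assumptions, not the authors' implementation: the choice of base learners in the stack, the default hyperparameters of the meta-learner, the z-score scaling step, the 70/30 split, and the function name `build_stacking_sketch` are all assumptions. The LASSO learner is left out because this section does not say how it was configured as a classifier, and the data are synthetic.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def build_stacking_sketch():
    """Stacking classifier using the Table 4 values; the layout is an assumption."""
    base_learners = [
        ("dt", DecisionTreeClassifier(min_samples_split=2, random_state=42)),
        ("rf", RandomForestClassifier(n_estimators=20, min_samples_split=2,
                                      random_state=42)),
        ("svm", SVC(probability=True, C=10, kernel="rbf", random_state=42)),
    ]
    # Meta-learner: an SVM, as named in Table 4. Its hyperparameters are not
    # listed there, so scikit-learn defaults are used here (assumption).
    stack = StackingClassifier(estimators=base_learners,
                               final_estimator=SVC(random_state=42), cv=5)
    # Standardize features before the models (z-score scaling is assumed).
    return make_pipeline(StandardScaler(), stack)


if __name__ == "__main__":
    # Synthetic stand-in data: 19 features and 4 classes, mirroring only the
    # shape of the study data, not its content.
    X, y = make_classification(n_samples=400, n_features=19, n_informative=8,
                               n_classes=4, random_state=42)
    # The 70/30 stratified split is a hypothetical choice for this demo.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=42)
    model = build_stacking_sketch()
    model.fit(X_train, y_train)
    print(f"accuracy on synthetic data: {model.score(X_test, y_test):.3f}")
```

With cv = 5, the meta-learner is trained on out-of-fold predictions of the base learners, which keeps it from simply memorizing base-learner outputs on data they have already seen.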

Table 5. Variable definitions and descriptive statistics.

Variable Name	Abbreviation	Variable Definition	Max	Min	Mean	Standard Deviation
Slope	SLO	Continuous variable	52	0	8.59	4.91
Aspect	ASP	north = 1; northeast = 2; east = 3; southeast = 4; south = 5; southwest = 6; west = 7; northwest = 8; no aspect = 9	9	1	4.64	2.40
Slope position	SLP	upper part = 2; middle part = 3; lower part = 4; valley section = 5; flat ground = 6	6	2	3.30	0.89
Soil type	STY	dark brown soil = 1; gray desert soil = 2; brown soil = 3; swamp soil = 4; black soil = 5	5	1	1.91	1.00
Soil thickness	STH	Continuous variable	55	5	36.81	8.26
Soil moisture	SM	tide = 1; dry = 2; moist = 3	3	1	1.45	0.83
Gravel content	GC	less stone = 1; medium stone = 2	2	1	1.04	0.19
Soil pH	SPH	strong acid = 1; acidic = 2; slightly alkaline = 3; slightly acidic = 4; neutral = 5	5	0	4.46	0.57
Site class	SC	I = 1; I a = 2; II = 3; III = 4; IV = 5; V = 6	6	1	3.51	0.98
Stand origin	SO	planted forest = 1; natural forest = 2	2	1	1.19	0.39
Age group	AGG	young forest = 1; half-mature forest = 2; pre-mature forest = 3; mature forest = 4; over-mature forest = 5	4	1	1.81	0.41
Age class	AGC	I = 1; II = 2; III = 3; IV = 4; V = 5	5	1	2.69	0.68
Average age	AG	Continuous variable	85	12	38.10	13.53
Stand mean height	SMH	Continuous variable	26	3	14.06	2.57
Crown density	CD	Continuous variable	0.9	0.2	0.71	0.11
Species richness	SR	Continuous variable	17	1	7.28	3.07
Mean diameter at breast height	MDBH	Continuous variable	32	8	15.80	3.56
Number of trees per hectare	NTPH	Continuous variable	3128	277	1146.56	370.13
Volume per hectare	VPH	Continuous variable	516.9	11.2	136.93	56.26
Forest Management Type	FMT	TTM, HFM, RLQIF, CDF.
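
The variables in Table 5 sit on very different scales (crown density ranges from 0.2 to 0.9, trees per hectare from 277 to 3128), which is the usual motivation for a standardization step. The sketch below assumes z-score standardization, as the cited reference on Z scores suggests, and applies it to two values from Table 5; the names `TABLE5_STATS` and `z_score` are introduced here for illustration.

```python
# Means and standard deviations copied from Table 5.
TABLE5_STATS = {
    "AG": {"mean": 38.10, "sd": 13.53},        # average age
    "NTPH": {"mean": 1146.56, "sd": 370.13},   # number of trees per hectare
}


def z_score(value, mean, sd):
    """Standardize a raw value: subtract the mean, divide by the standard deviation."""
    return (value - mean) / sd


# Table 5 maxima for each variable, expressed in standard-deviation units.
ag_max_z = z_score(85, **TABLE5_STATS["AG"])
ntph_max_z = z_score(3128, **TABLE5_STATS["NTPH"])
print(f"AG = 85     -> z = {ag_max_z:.2f}")
print(f"NTPH = 3128 -> z = {ntph_max_z:.2f}")
```

After this transformation each variable has mean 0 and unit standard deviation, so distance-based learners such as the SVM are not dominated by the variables with the largest raw ranges.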

Table 6. The results of the different BFMTI models.

Model	Accuracy (%)	F1-Score (%)	Precision (%)	Recall (%)	Time (s)
BFMTI-LASSO	87.76	87.33	87.59	87.76	0.0165
BFMTI-DT	87.76	87.59	87.84	87.76	0.0050
BFMTI-RF	95.31	95.16	95.45	95.31	0.1003
BFMTI-SVM	96.61	96.61	96.78	96.61	0.0295
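
In every row of Table 6 the recall equals the accuracy while precision and F1 differ. That pattern is consistent with support-weighted averaging over the four classes, because the weighted recall is algebraically identical to the accuracy; the averaging scheme itself is not stated in this section, so this reading is an assumption. The sketch below computes the weighted metrics from a confusion matrix; the matrix values are hypothetical and are not taken from Figure 5 or Figure 6.

```python
def weighted_metrics(cm):
    """Accuracy and support-weighted precision, recall and F1.

    cm[i][j] is the number of samples of true class i predicted as class j.
    """
    n_classes = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[k][k] for k in range(n_classes)) / total
    precision = recall = f1 = 0.0
    for k in range(n_classes):
        support = sum(cm[k])                                  # true samples of class k
        predicted = sum(cm[i][k] for i in range(n_classes))   # predictions of class k
        p_k = cm[k][k] / predicted if predicted else 0.0
        r_k = cm[k][k] / support if support else 0.0
        f_k = 2 * p_k * r_k / (p_k + r_k) if (p_k + r_k) else 0.0
        weight = support / total
        precision += weight * p_k
        recall += weight * r_k
        f1 += weight * f_k
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical confusion matrix for four classes (illustration only).
example_cm = [
    [210, 10, 1, 1],
    [8, 120, 2, 1],
    [2, 3, 12, 0],
    [1, 2, 0, 11],
]
metrics = weighted_metrics(example_cm)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.2f}%")
```

Because the weights are the class supports, the majority classes dominate these averages; per-class precision and recall for the small classes have to be read from the confusion matrices directly.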

Table 7. The results of the BEFMTI and SEFMTI models.

Model	Accuracy (%)	F1-Score (%)	Precision (%)	Recall (%)	Time (s)
BEFMTI	93.75	93.60	93.81	93.75	0.1120
SEFMTI	97.14	97.16	97.29	97.14	0.8750

Table 8. Identification results of each model after feature selection.

Model	Accuracy (%)	F1-Score (%)	Precision (%)	Recall (%)	Time (s)
BFMTI-LASSO	88.54	88.15	88.36	88.54	0.0124
BFMTI-DT	88.80	89.13	89.94	88.80	0.0041
BFMTI-RF	95.83	95.83	95.97	95.83	0.0817
BFMTI-SVM	95.31	95.24	95.43	95.31	0.0288
BEFMTI	94.79	94.77	94.83	94.79	0.0953
SEFMTI	94.79	94.81	94.94	94.79	0.6390
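
The effect of feature selection can be read off by comparing Table 8 with Tables 6 and 7. The sketch below does that arithmetic using only the accuracy and time values printed in those tables; the function name `compare` is introduced here for illustration.

```python
# (accuracy %, time s) before feature selection (Tables 6 and 7)
# and after feature selection (Table 8).
before = {
    "BFMTI-LASSO": (87.76, 0.0165), "BFMTI-DT": (87.76, 0.0050),
    "BFMTI-RF": (95.31, 0.1003), "BFMTI-SVM": (96.61, 0.0295),
    "BEFMTI": (93.75, 0.1120), "SEFMTI": (97.14, 0.8750),
}
after = {
    "BFMTI-LASSO": (88.54, 0.0124), "BFMTI-DT": (88.80, 0.0041),
    "BFMTI-RF": (95.83, 0.0817), "BFMTI-SVM": (95.31, 0.0288),
    "BEFMTI": (94.79, 0.0953), "SEFMTI": (94.79, 0.6390),
}


def compare(model):
    """Return (accuracy change in percentage points, time reduction in percent)."""
    acc_before, time_before = before[model]
    acc_after, time_after = after[model]
    acc_change = round(acc_after - acc_before, 2)
    time_reduction = round(100 * (time_before - time_after) / time_before, 1)
    return acc_change, time_reduction


for model in before:
    acc_change, time_reduction = compare(model)
    print(f"{model}: accuracy {acc_change:+.2f} points, time -{time_reduction:.1f}%")
```

For SEFMTI the time falls by about 27% while the accuracy drops by 2.35 percentage points; the LASSO, DT, RF, and BEFMTI models gain accuracy after feature selection, and the SVM loses 1.30 points.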

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

