1.1. Research Significance and Background
Forests, the cornerstone of terrestrial ecosystems, sustain humans and diverse organisms [1] and carry economic, environmental, social, and cultural value. Yet human activities continue to diminish forest coverage and resources, resulting in land use changes, ecosystem fragmentation, and biodiversity loss [2]. Comprehensive forest resource surveys are a critical foundation for effective management strategies and conservation policies, and are imperative for fostering sustainable forest development. These surveys are therefore of paramount importance for the scientific management, sustainable utilization, and protection of forest resources.
Diameter at breast height (DBH) is a crucial factor in studying tree structure and a pivotal variable for forest growth models and management strategies [3]. DBH aids in assessing tree niches and growth status, estimating forest biomass and productivity, monitoring forest health, and determining forest structure and species composition. It thus provides fundamental data and decision-making references for scientific forest resource management and conservation [4].
Forest ecosystems, among Earth’s most complex, play a pivotal role in maintaining global biodiversity and ecological balance. DBH data serve as indicators of tree growth and health status, so analyzing them allows for the assessment of forest ecosystem stability, health, and vegetation dynamics. Moreover, DBH is a crucial parameter for estimating tree volume. Evaluating the DBH distribution in a forest enables calculations of total wood quantity and specific wood levels within a forest area, significantly impacting wood resource assessment, wood industry planning, and economic evaluation [5].
Before advanced remote sensing and digital techniques became available, forestry resource surveys relied primarily on field investigations and sampling. These methods were expensive, imprecise, and limited in coverage. In contrast, remote sensing enables a more precise and efficient collection of extensive forest resource information. Combined with advanced remote sensing and artificial intelligence algorithms [6], UAV technology allows for high-resolution, multi-temporal, and dynamic monitoring of forest resources, which improves their evaluation and management.
With continual advancements in remote sensing technology and the widespread use of large-scale remote sensing images [7], DBH detection from remote sensing images has become feasible. Remote sensing technology provides a comprehensive, large-scale, and non-destructive means of data acquisition that captures the spatial distribution patterns of DBH. Using it for DBH detection improves work efficiency while reducing the demand for human resources.
In recent years, advances in machine learning have found extensive applications across various domains, such as industry, medicine [8], and finance. To address the need for precise, real-time detection of forestry resources [9], researchers have begun exploring machine learning techniques for tree parameter extraction. However, the field is still in its early stages owing to the complexity and variety of the algorithms, which leaves many research possibilities open.
In summary, using machine learning to extract tree parameters shows significant potential for forestry resource surveys. This study uses UAV remote sensing images of mixed-tree forests and the BlendMask segmentation model to identify individual trees and measure crown widths. The contour of each crown is extracted to calculate the crown area. Finally, a Bayesian neural network is used to build a model that predicts DBH from the crown areas measured in the UAV remote sensing images.
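The contour-to-area step can be illustrated with a minimal sketch. It assumes that a crown contour is available as an ordered list of pixel coordinates and that the ground sampling distance of the orthophoto is known; the function name `crown_area_m2`, the parameter `gsd_m`, and the example values are hypothetical and are not taken from the study's implementation.

```python
def crown_area_m2(contour_px, gsd_m):
    """Area enclosed by a closed contour, computed with the shoelace formula.

    contour_px: ordered (x, y) vertices in pixel coordinates.
    gsd_m: ground sampling distance in metres per pixel (assumed known).
    """
    n = len(contour_px)
    if n < 3:
        return 0.0  # fewer than three vertices enclose no area
    twice_area = 0.0
    for i in range(n):
        x1, y1 = contour_px[i]
        x2, y2 = contour_px[(i + 1) % n]  # wrap around to close the polygon
        twice_area += x1 * y2 - x2 * y1
    area_px = abs(twice_area) / 2.0
    # Each pixel covers gsd_m * gsd_m square metres on the ground.
    return area_px * gsd_m ** 2


# Hypothetical 40 x 30 pixel rectangular contour at 0.05 m per pixel.
rect = [(0, 0), (40, 0), (40, 30), (0, 30)]
print(crown_area_m2(rect, gsd_m=0.05))
```

The absolute value makes the result independent of whether the contour is traversed clockwise or counter-clockwise.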
1.2. Research Landscape
1.2.1. Research Status of Crown Width Extraction
Around 2004, digital aerial photogrammetry began to be used for extracting tree crowns [10]. High-resolution remote sensing images served as the data source, and image processing and computer vision techniques were used to derive crown shape and contour information. Common approaches included threshold-based segmentation [11], edge-based detection [12], and pixel-based classification [13], although their precision was limited.
By the early 2010s, the introduction of UAV technology facilitated the acquisition of high-resolution image data for crown extraction [14]. Researchers increasingly employed unmanned aerial vehicles (UAVs) for their flexibility and maneuverability, which enabled fine tree details to be captured more precisely.
In 2016, Guo Yushan [15] and colleagues used high-spatial-resolution imagery to find tree crowns in both sparse and dense forest areas. They applied the marker-controlled watershed segmentation method, using image gradients to improve the accuracy of the extracted crown outlines. Their tests showed an extraction accuracy of 87.8% for sparse forests and 65.5% for dense forests.
Internationally, researchers have conducted studies on crown width extraction, focusing on several key areas:
Lidar-based Crown Width Extraction [16]: Lidar technology allows for the highly precise collection of three-dimensional information about ground and canopy surfaces. It is widely used in crown width extraction. Various algorithms, including altitude-threshold-based [17], topological-relation-based [18], and morphological-operation-based [19] approaches, analyze laser point cloud data to extract tree crown information.
Image-processing-based Crown Width Extraction [20]: This method uses remote sensing images to extract crown width. By analyzing color, texture, and shape attributes within remote sensing images, the automatic extraction of crown width is achieved.
Machine-learning-based Crown Width Extraction [21]: Recent advancements in machine learning algorithms have led to increased exploration of these methods for crown width extraction. Researchers create training sample sets and use supervised learning algorithms such as support vector machines [22] and random forests [23] to enable the automatic detection and segmentation of crowns.
These studies go beyond extracting crown width and also explore the interconnections between crown width and various forest structure parameters. This involves examining how crown width correlates with factors such as tree height, diameter at breast height (DBH), and tree density, and how these factors affect forest ecosystem functionality and biodiversity. Such insights are useful for managing forest resources and evaluating ecological environments. Both domestic and international research efforts have not only enriched crown width extraction methodologies but also broadened the applications of these data in forest resource research.
To enhance crown width extraction accuracy, researchers continually optimize and improve algorithms and models. They integrate multiple data sources, such as merging lidar data with remote sensing imagery, to obtain comprehensive and accurate information about tree crowns [24]. The increasing use of deep learning methods such as convolutional neural networks aids in precisely identifying and segmenting tree crowns. Crown extraction technology is widely used in forest resource management, ecological conservation, climate change research, and related fields. Accurate crown information facilitates the assessment of forest structure and biodiversity and supports forest health monitoring, decision-making, and resource management.
1.2.2. Research Status of Deep Learning in Forestry Segmentation
As deep learning evolves, several models are now used for image recognition, classification, and localization, where convolutional neural networks (CNNs) excel in analyzing two-dimensional images. In addition, CNNs can work with 3D data when the input is converted into a regular form. In 2020, Brage et al. conducted extensive forest surveys using high-resolution satellite imagery. Their use of the Mask R-CNN algorithm for tree crown detection and segmentation achieved precision, recall, and F1-scores of 0.91, 0.81, and 0.86, respectively. This method shows promise in aiding forest resource surveys, planning, and execution. Another study in 2021 by Huang Xinxi focused on ginkgo trees, establishing a dataset of individual ginkgo tree crowns through UAV remote sensing imagery. Utilizing Mask R-CNN and orthophoto maps, they detected tree crowns in different urban settings, achieving a precision rate of 93.90%, a recall rate of 89.53%, an F1-score of 91.66%, and an average precision of 90.86%. Similarly, Huang Yanxiao et al. in 2021 used drones to capture orthographic images of two distinct Metasequoia forest plots. They improved the Faster R-CNN method for crown identification and width extraction, resulting in a more accurate model with 92.92% accuracy and a determination coefficient of 0.84, showing enhancements over the original model.
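The F1-scores quoted above follow from the reported precision and recall values, since F1 is their harmonic mean. A short check, using only the figures stated in the text (the function name `f1_score` is illustrative):

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units in, same units out)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# Mask R-CNN on satellite imagery: precision 0.91, recall 0.81.
print(round(f1_score(0.91, 0.81), 2))    # 0.86

# Mask R-CNN on ginkgo crowns: precision 93.90%, recall 89.53%.
print(round(f1_score(93.90, 89.53), 2))  # 91.66
```

Both results agree with the F1-scores reported for the two studies.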
BlendMask uses a Mask R-CNN-based framework that merges object detection and instance segmentation. During object detection, BlendMask uses two simultaneous branches: one for creating the object’s bounding box and another for generating a rough segmentation mask. In the segmentation phase, BlendMask selectively extracts the target using the bounding box network and aligns feature maps of various scales to a fixed size via ROIAlign. Subsequently, the fusion mask generator refines feature maps into precise segmentation masks using a sequence of convolution operations, with each layer integrating a fusion module to enhance feature expression. Ultimately, BlendMask utilizes the predicted bounding box and segmentation mask to produce detection and segmentation outcomes. Presently, there is limited documentation or practical use of the BlendMask model in the forestry segmentation domain, presenting substantial research opportunities in this field.
1.2.3. Research Status of DBH Prediction of Trees
In forestry resource estimation, individual-tree diameter at breast height (DBH) is a critical evaluation metric. Yet field surveys have long been difficult to carry out and slow to yield data. Traditional forestry surveys often rely on empirically derived formulas relating crown area to DBH to estimate actual tree diameter. In reality, however, crown area and DBH vary significantly among tree species, which makes general correlations difficult to establish [25]. In recent years, the achievements of machine learning across various domains have drawn widespread attention, and its applications in forestry are gradually expanding. Models based on neural networks offer new insights into the relationship between tree crown area and DBH.
Accurate DBH data play a crucial role in forestry resource surveys. For instance, Fu Kaiting used an EBEE unmanned aerial vehicle equipped with a digital camera to capture high-resolution images within the Tanli management area at the Nanning Arboretum in Guangxi [26]. Following orthographic image processing, a DBH–crown width regression model was established. This model correlated individual tree crown width with measured DBH, enabling a swift estimation of stand volume by integrating individual tree volume models [27]. Similarly, Shi Jieqing and collaborators developed a comprehensive forest resource survey system using UAV remote sensing images, integrating various UAV photography techniques and post-processing technologies alongside GIS. The reported errors in tree number density and volume were 2.68% and 4.01%, respectively [28].
In international studies, tree DBH prediction often relies on traditional regression models. Methods such as linear regression, multiple linear regression [29], and generalized linear models [30] are frequently used to model and predict the relationship between DBH and explanatory variables such as tree height, age, and growth environment. As machine learning technologies evolve, an increasing number of researchers are exploring machine learning algorithms for tree DBH prediction. Remote sensing data [31] also play a significant role in DBH prediction. From high-resolution remote sensing images and lidar data of forest areas, relevant geographical, morphological, and structural features can be extracted and used as input variables for constructing and optimizing tree DBH prediction models.
To enhance the accuracy of DBH prediction, researchers also attempt to fuse different types of data. Integrating multiple data sources, such as ground survey data, remote sensing data, and environmental factors [32], can reveal the factors affecting DBH more comprehensively and improve the accuracy and stability of prediction models. In recent years, deep learning methods have seen increased application in tree DBH prediction. Convolutional neural networks (CNNs) [33] and recurrent neural networks (RNNs) [34] can extract image features and time-series features, respectively, in support of precise DBH predictions.
1.3. Primary Research Focus
Among crown segmentation techniques, threshold-based segmentation is a common method. It separates an image into regions by applying preset thresholds to pixel grayscale values. However, it can be inaccurate under complex backgrounds, varying lighting, and occlusion, and it struggles especially with irregular or overlapping crown shapes [35].
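A minimal sketch of fixed-threshold segmentation on a grayscale patch is given below. The patch values and the threshold of 128 are hypothetical, chosen only to show the mechanism; no claim is made about the thresholds used in the cited work.

```python
def threshold_segment(gray, threshold):
    """Return a binary mask: 1 where the pixel value exceeds the threshold."""
    return [[1 if v > threshold else 0 for v in row] for row in gray]


# Hypothetical 3 x 4 grayscale patch (0-255); brighter values stand in for canopy.
patch = [
    [30, 40, 200, 210],
    [35, 180, 220, 50],
    [20, 45, 190, 60],
]
mask = threshold_segment(patch, threshold=128)
for row in mask:
    print(row)
```

Because every pixel is judged against the same preset value, a shaded crown pixel that falls below the threshold is labeled as background, which is the lighting sensitivity noted above.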
Another approach involves using features and a classifier for segmentation. This method extracts texture, shape, and color features from images, and then uses a classifier to label pixels belonging to the crown region. Yet, accurately segmenting crowns in complex backgrounds remains challenging due to limitations in feature selection and classifier design [36]. Additionally, manual parameter adjustments are often needed for different tree species and environments.
The third method treats crown segmentation as a general image segmentation problem and applies region-based or edge-based algorithms. However, these methods can be computationally expensive on large datasets and complex scenes [37], and may produce false segmentations where trees overlap or boundaries are unclear. The fourth method, multi-sensor fusion-based segmentation, combines data sources such as remote sensing images and lidar data to provide more comprehensive information for crown segmentation. However, challenges persist in data fusion and registration, and the method requires sophisticated multi-sensor data collection and processing [38].
To improve the computational efficiency and increase the recognition accuracy, we adopted a more streamlined and efficient model, the BlendMask model [39], for crown segmentation.
For calculating tree DBH, linear regression equations are widely used in the industry, but they often fit the data poorly. In this study, we opted for a Bayesian neural network to model the relationship between crown area and DBH. Building on the application of convolutional neural networks to aerial remote sensing images, we devised a precise individual-tree crown segmentation method based on high-resolution UAV images and the BlendMask network. This method enabled the extraction of crown information for trees within the forest farm. The specific research objectives include:
Using the orthophoto map of Beijing Jingyue Ecological Forest Farm as experimental data, and applying the BlendMask network to segment individual crowns and count Pinus tabulaeformis trees.
Assessing the prediction results of the model using relevant accuracy evaluation metrics.
Fitting an optimal relationship model between the DBH and crown width of trees using a Bayesian neural network, leveraging DBH measurements of sample trees collected in the field and the calculated crown mask area obtained from segmentation.
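To illustrate why a Bayesian treatment is attractive for the crown area to DBH relationship, the sketch below fits a Bayesian linear regression, a far simpler relative of the Bayesian neural network used in this study and not the study's model. It assumes a straight-line relationship, a zero-mean Gaussian prior on the weights with precision `alpha`, and a known noise precision `beta`; all names and the data are hypothetical, and the data are synthetic values for illustration only, not field measurements.

```python
def fit_bayes_linear(xs, ys, alpha=1e-6, beta=1.0):
    """Posterior over (intercept, slope) for y = w0 + w1 * x + Gaussian noise.

    alpha: precision of the zero-mean Gaussian prior on the weights (assumed).
    beta: precision (1 / variance) of the observation noise (assumed known).
    Returns the posterior mean (m0, m1) and the 2 x 2 posterior covariance.
    """
    n = len(xs)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # Posterior precision A = alpha * I + beta * Phi^T Phi, with Phi rows [1, x].
    a00 = alpha + beta * n
    a01 = beta * sx
    a11 = alpha + beta * sxx
    det = a00 * a11 - a01 * a01
    # Posterior covariance S = A^-1 (closed form for a symmetric 2 x 2 matrix).
    s00, s01, s11 = a11 / det, -a01 / det, a00 / det
    # Posterior mean m = beta * S * Phi^T y.
    m0 = beta * (s00 * sy + s01 * sxy)
    m1 = beta * (s01 * sy + s11 * sxy)
    return (m0, m1), ((s00, s01), (s01, s11))


def predict(x, mean, cov, beta=1.0):
    """Predictive mean and variance of y at a new input x."""
    m0, m1 = mean
    (s00, s01), (_, s11) = cov
    # Noise variance plus the uncertainty contributed by the weights.
    var = 1.0 / beta + s00 + 2 * s01 * x + s11 * x * x
    return m0 + m1 * x, var


# Synthetic crown areas (m^2) and DBH values (cm), for illustration only.
areas = [4.0, 6.0, 9.0, 12.0, 16.0]
dbhs = [16.0, 19.0, 23.5, 28.0, 34.0]
mean, cov = fit_bayes_linear(areas, dbhs)
for area in (9.4, 30.0):
    pred, var = predict(area, mean, cov)
    print(area, round(pred, 2), round(var, 2))
```

Unlike an ordinary least-squares fit, the prediction comes with a variance that grows for crown areas far from the training samples, which is the kind of uncertainty information a Bayesian neural network also provides for non-linear relationships.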