1. Introduction
Camellia oleifera is a significant indigenous oil plant with diverse applications, including edible and medicinal uses, light chemical raw materials, soil and water conservation, and ecological climate regulation, and it is typically cultivated in subtropical alpine and hilly regions [1,2]. As the world’s population and its demand for food grow, obtaining edible oil from Camellia oleifera can reduce the acreage devoted to oil crops such as soybeans, so that more land can be used to grow food crops. At present, the planting area of Camellia oleifera in China has reached 70 million hectares, and it is increasing year by year. Canopy extraction plays a pivotal role in monitoring the health of Camellia oleifera trees, estimating yields, and facilitating mechanical harvesting. UAV and remote sensing technology enables the precise and efficient extraction of individual tree canopy information for Camellia oleifera and aids in implementing precision operations based on plant variations [3,4,5], thereby significantly enhancing the level of intelligent management in Camellia oleifera plantations.
Owing to its advantages in data resolution, flight flexibility, high efficiency, and convenience, UAV-based proximal sensing technology has emerged as a prominent trend for acquiring high-resolution remote sensing data using LiDAR [6,7,8,9]. By capturing images, the UAV can improve detection efficiency and accuracy [10,11]; analyzing these images yields information such as topography and the growth status of the canopy, which supports precise management and monitoring of an orchard [12,13,14]. LiDAR extracts canopy information by transmitting a pulsed laser and receiving echo signals, generating laser point clouds with wide coverage, high precision, and efficient processing capabilities [15,16,17]. Obtaining individual tree canopy parameters such as plant density, diameter at breast height, tree height, and crown size and position is a crucial step toward achieving precise individual tree canopy segmentation. However, when airborne LiDAR data are used for segmentation, over- or under-segmentation may arise owing to adjacent or overlapping canopies as well as variations in their shapes. To address these issues, numerous methods have been developed for individual tree canopy segmentation using airborne LiDAR data; these primarily include approaches based on canopy height models (CHMs) and normalized point cloud analysis.
In the CHM-based individual tree canopy segmentation method [18,19,20], the canopy height model is first constructed from ground elevation data and vegetation elevation information. By applying a height threshold, areas of the CHM below this value are identified as non-canopy regions. Subsequently, morphology, region growing, and other algorithms are employed to further partition the vegetation area into individual tree canopies [18]. Ding et al. [19] achieved promising outcomes in segmenting leaves in multispectral images of tomato canopies by integrating the wavelet transform with a watershed algorithm, with an average error rate below 8%. However, with complex backgrounds and varying light intensities, the average error rate for tomato canopy leaves increased to 21%. To address the challenge that complex backgrounds pose for overlapping crowns, Liu et al. [20] proposed the ITCD algorithm integrated with the CHM and introduced the multi-scale local maximum (LM) algorithm to enhance the segmentation accuracy of individual tree canopies. However, this method tends to misidentify crown edges as separable areas, leading to over-segmentation. Ayrey et al. [21] developed the layer stacking algorithm to improve the segmentation accuracy of individual tree canopies in deciduous or leafless conditions. Considering the difficulties in detection and the information loss in complex forested areas, Paris et al. [22] proposed a method that combines the CHM with point cloud spatial clustering to delineate the boundaries of dominant tree crowns, achieving 97% accuracy with high-density point cloud data and 92% accuracy with low-density data, thereby greatly improving understory vegetation detection and segmentation accuracy.
The normalized point cloud individual tree segmentation technique [23,24,25] normalizes the point cloud data by mapping the coordinates of all points to a uniform range. This approach employs clustering algorithms, such as density-based DBSCAN and connectivity-based methods, to group the normalized point cloud data into clusters, thereby enabling the identification and segmentation of individual tree canopies based on their shape, density, and other characteristics. Unlike the CHM segmentation algorithm, this method directly analyzes the original point cloud data without relying on elevation information. Hence, it is well suited to point cloud data generated by vegetation types with complex canopies and minimal height differences. Yang et al. [26,27] conducted high-resolution UAV image experiments in larch forests using the Structure-from-Motion (SfM) method, yielding outstanding results in the visual interpretation of orthophoto images and in various automatic segmentation methods based on images and point clouds. The overall detection rate exceeded 91%, while the accuracy of crown width extraction surpassed 81%. Fu et al. [28] implemented tree segmentation from Terrestrial Laser Scanning (TLS) data using DBSCAN and an improved Distance Distribution Matrix (DDM). Through a combination of detection, correction, and layer-by-layer clustering, the evaluation results on plantation and mixed forest datasets demonstrated the superiority of the proposed method over traditional DBSCAN in terms of recall, accuracy, and F1 score. The method automatically extracts optimal parameters and accurately segments small trees under tall canopies. Yan et al. [29,30] employed an adaptive mean shift segmentation approach that divides the three-dimensional space into sectors from global maximum points to simulate canopy surfaces, iteratively identifying potential boundaries within specified areas. The results indicated accurate segmentation rates of 95% for simple samples and 80% for complex environments.
The referenced literature covers various approaches to individual tree canopy segmentation; however, a solution for segmenting the Camellia oleifera tree canopy is still lacking because of challenges such as the uneven distribution of the canopy, complex terrain, and significant canopy overlap. This study employs an airborne LiDAR data acquisition system to acquire point cloud data for Camellia oleifera. Representative research areas differing in terrain and planting type were selected, and the algorithm parameters were optimized separately for each to improve the segmentation accuracy of the Camellia oleifera tree canopy. The approach combines the cloth simulation filter (CSF) algorithm with CHM segmentation and point cloud clustering segmentation. Additionally, the impacts of different segmentation parameters and growth environments on the segmentation accuracy of the Camellia oleifera canopy were analyzed to determine the optimal parameters for individual tree canopy segmentation in mountainous and hilly regions.
2. Materials and Methods
2.1. Experimental Materials
The airborne LiDAR data were acquired using the LiAir VH2 LiDAR scanning system mounted on a DJI M300 RTK UAV (DJI Technology Co., Ltd., Shenzhen, China), which was equipped with an RTK-GNSS system and a BMI088 IMU sensor. According to Table 1, the LiDAR field of view (FOV) is 70.4° horizontally and 4.5° vertically, with an accuracy of 5 cm@70 m. The RTK-GNSS achieved an accuracy of 1 cm + 1 ppm horizontally and 1.5 cm + 1 ppm vertically, and the IMU had a data sampling frequency of 200 Hz, with up to 2000 Hz supported. The LiDAR worked with a multi-thread and repeated scanning mode, recording three echoes for each pulse. In complex Camellia oleifera environments, the use of multi-echo LiDAR proved suitable for acquiring point cloud information, effectively enhancing data quality, increasing point cloud density, and enabling improved information fusion capabilities. As the terrain of the Camellia oleifera forest is undulating and the highest elevation is close to 30 m, the flight altitude was set to 83 m to ensure flight safety and data uniformity in each research area. During data collection, the RTK position information and the IMU attitude information of the UAV were recorded at the same time. To improve the fusion and matching accuracy of this information, the IMU frequency was set to 200 Hz. A relatively low speed of 3 m/s was adopted to improve the attitude stability of the UAV during flight and to increase the amount of data in the point cloud, further improving the point cloud matching accuracy. An overlap rate of 90% increases the amount of point cloud data, captures more comprehensive ground details, reduces the influence of occlusion, and improves the matching accuracy. In this research, the flight altitude was set at 82 m, with an IMU data frequency of 200 Hz and a flight speed of 3 m/s. Figure 1 illustrates the airborne data acquisition system, while Table 1 presents the specific parameters.
2.2. Data Acquisition
The research area is situated in Xinyu City, Jiangxi Province, the main growing area of Camellia oleifera in China. Considering the growth characteristics of Camellia oleifera, the branches and leaves of Camellia oleifera were the most lush in June, and the data collection period was from 27 May to 2 June 2023. Within this forest land, a diverse range of growth differentiation can be observed among Camellia oleifera trees. This study focused on investigating the sparsity between Camellia oleifera trees, canopy size variations, terrain slope characteristics, and different spatial grid sizes as the primary research objectives. The aim was to analyze the applicability of three segmentation algorithms for effectively delineating Camellia oleifera tree boundaries.
In order to investigate the impact of different terrains and Camellia oleifera planting types on segmentation effectiveness, three representative areas were selected within the forest land for study. Research area 1 was characterized by flat terrain, uniform canopy width, and consistent spacing (approximately 1.0 m) between Camellia oleifera plants, and it consisted of 88 samples covering an area of 1082.20 m². Research area 2 was characterized by an irregular and uneven distribution of canopy widths with sparse growth of Camellia oleifera plants on a gentle slope ranging from 15° to 25°; the spacing between Camellia oleifera plants was approximately 1.2 m, and the area consisted of 94 samples covering 1194.13 m². Research area 3 featured overlapping Camellia oleifera canopies on steeper slopes ranging from 25° to 60°, along with significant variations in canopy height as well as inconsistent and uneven planting spacing. The spacing between Camellia oleifera plants was approximately 1.2 m, and the area consisted of 95 samples spanning approximately 1610.66 m².
In this study, tree numbers and heights were obtained by field investigations and measurements. As the canopy width is difficult to measure directly in the field, we used the “Measure” tool of the CloudCompare (version 2.13.0) software to measure it on the stitched point cloud map. Notably, our investigation on canopy segmentation of Camellia oleifera trees in mountainous and hilly regions served as a representative case. The specific research areas are illustrated in Figure 2, while Table 2 presents the corresponding statistical data for Camellia oleifera trees within each area.
2.3. Data Preprocessing
The quality of data preprocessing significantly influences subsequent processing and outcomes. Canopy segmentation preprocessing primarily involves denoising, resampling, and ground point separation steps. When the laser scanning system acquires point cloud data, various types of noise are inevitably introduced, including instrument noise errors, environmental noise, and irregular reflective surfaces. Eliminating these noises can establish a more accurate foundation for data analysis and provide support for subsequent analyses. To ensure the density distribution of point cloud data adheres to specifications and is better suited for accurate segmentation and feature extraction of camellia canopies, we can enhance the calculation efficiency by resampling while retaining sufficient information to accurately represent the target object. For camellia canopy segmentation, filtering algorithms were employed to eliminate interference from ground point cloud data, including DEM extraction and ground point cloud classification methods. By removing the ground points, the remaining point cloud data primarily captures the structural characteristics of camellia trees’ canopies, thereby facilitating high-precision canopy segmentation.
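The resampling step can be illustrated with a minimal sketch. The paper does not state which resampling scheme was used, so the voxel-centroid approach below, the function name `voxel_downsample`, and the 0.5 m voxel size are all assumptions chosen for illustration.

```python
from collections import defaultdict


def voxel_downsample(points, voxel_size):
    """Replace all points that fall in the same cubic voxel with their centroid.

    points: iterable of (x, y, z) tuples in metres.
    voxel_size: voxel edge length in metres (hypothetical value, not from the paper).
    """
    if voxel_size <= 0:
        raise ValueError("voxel_size must be positive")
    buckets = defaultdict(list)
    for x, y, z in points:
        # Integer voxel index along each axis.
        key = (int(x // voxel_size), int(y // voxel_size), int(z // voxel_size))
        buckets[key].append((x, y, z))
    thinned = []
    for key in sorted(buckets):
        pts = buckets[key]
        n = len(pts)
        thinned.append((
            sum(p[0] for p in pts) / n,
            sum(p[1] for p in pts) / n,
            sum(p[2] for p in pts) / n,
        ))
    return thinned


# Two nearby returns collapse into one point; the distant return is kept.
points = [(0.05, 0.05, 1.0), (0.15, 0.05, 1.2), (1.05, 0.05, 2.0)]
thinned = voxel_downsample(points, voxel_size=0.5)
```

A larger voxel reduces the point count further but also discards more canopy detail, which is the trade-off between efficiency and retained information described above.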
Furthermore, the integration of DEM, DSM, and CHM plays a pivotal role in the preprocessing stage of Camellia oleifera canopy segmentation. DEM provides crucial ground elevation information for eliminating ground point cloud data, while DSM offers comprehensive vegetation elevation details that aid in the identification and segmentation of vegetation. Additionally, CHM supplies essential vertical structure information to facilitate accurate identification and segmentation of the vegetation canopy. The combined utilization of DEM, DSM, and CHM enables a more thorough analysis of vertical distribution patterns and growth status within the Camellia oleifera canopy segmentations by providing fundamental data. Insufficient removal of ground point clouds may occur owing to lower DEM resolution, thereby impacting segmentation accuracy; conversely, higher DSM smoothing factor may result in the loss of vegetation elevation information, consequently affecting the accuracy of vegetation segmentation. Additionally, CHM resolution directly influences canopy segmentation accuracy; excessively low resolution leads to detail loss, while excessively high resolution introduces noise. Field detection and analysis revealed that setting the DEM resolution at 1 m yielded optimal results in this study. When the parameter was set below 1 m, complete removal of non-canopy point clouds could not be achieved; on the other hand, when the parameter exceeded 1 m, small areas of Camellia oleifera canopy were mistakenly identified as ground point clouds and removed, thus further increasing the error.
The DSM smoothing parameter ranged from 0.3 to 1.0. Excessively high parameter values resulted in a rough TIN model, leading to over-segmentation, while excessively low values caused the TIN model to become overly smooth, resulting in potential under-segmentation during the segmentation process. The CHM grid size varied between 0.3 m and 0.7 m. Comparative analysis of Camellia oleifera canopy segmentation accuracy using different grid sizes (0.3 m, 0.5 m, and 0.7 m) revealed that the optimal segmentation accuracy was achieved with a grid size of 0.5 m. A grid size smaller than this optimum may reduce the segmented area, resulting in potential under-segmentation; conversely, a grid size larger than this optimum may expand the segmented area and cause potential over-segmentation.
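The relationship between the three surfaces (CHM = DSM − DEM per grid cell) can be sketched as follows. This is a simplified illustration under stated assumptions: the function names are hypothetical, a single grid size is used for all three rasters (the study used a 1 m DEM and a 0.3 to 0.7 m CHM grid), ground points are assumed to be already classified, and cells without a ground return are skipped rather than interpolated.

```python
def rasterize(points, cell, reducer):
    """Bin (x, y, z) points into square cells and reduce the z values per cell."""
    cells = {}
    for x, y, z in points:
        key = (int(x // cell), int(y // cell))
        cells.setdefault(key, []).append(z)
    return {key: reducer(zs) for key, zs in cells.items()}


def build_chm(ground_points, all_points, cell=0.5):
    """Return {cell index: canopy height} computed as DSM minus DEM."""
    dem = rasterize(ground_points, cell, min)  # lowest ground return per cell
    dsm = rasterize(all_points, cell, max)     # highest return per cell
    chm = {}
    for key, top in dsm.items():
        if key in dem:  # assumption: no interpolation for cells lacking ground points
            chm[key] = max(0.0, top - dem[key])
    return chm


# Hypothetical returns: two ground points and two canopy points.
ground = [(0.1, 0.1, 100.0), (0.6, 0.1, 100.2)]
canopy = [(0.2, 0.2, 102.5), (0.7, 0.3, 101.2)]
chm = build_chm(ground, ground + canopy, cell=0.5)
```

Changing `cell` shows the resolution effect discussed above: larger cells merge neighbouring crowns into one height value, while smaller cells leave more empty or noisy cells.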
2.4. Individual Tree Canopy Segmentation Method
Based on the diversity of topography and canopy structure, three segmentation methods were employed in this study (CHM segmentation, point cloud clustering segmentation, and layer stacking fitting segmentation) by optimizing parameters to achieve high-precision segmentation.
The CHM segmentation algorithm [27] achieves precise canopy segmentation of Camellia oleifera by calculating the height difference between the crown center vertex and the ground. This algorithm partitions the CHM into distinct vegetation objects through threshold settings or watershed algorithms, making it suitable for low-complexity segmentation tasks specific to Camellia oleifera. Crucial parameters such as the minimum tree height, Gaussian smoothing coefficient, and Gaussian smoothing radius significantly impact the performance of this method. In this study, field measurements provided tree height data, and a minimum tree height of 1.2 m was set to exclude Camellia oleifera trees below this threshold during segmentation. Additionally, a Gaussian radius coefficient ranging from 0.5 to 1.0 was selected. Adjusting these parameters helps alleviate over- or under-segmentation in different sample environments; smaller values may result in overly refined segmentations, while larger values may lead to insufficient differentiation among vegetation objects.
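How the minimum tree height and the Gaussian smoothing parameters interact can be shown with a small treetop-detection sketch. It is not the software routine used in the study: the function names, the separable kernel, the 3 × 3 local-maximum window, and the toy CHM values are all assumptions, and `sigma` and `radius` are expressed in grid cells rather than in the units of the paper's coefficients.

```python
import math


def gaussian_kernel(sigma, radius):
    """Normalized 1D Gaussian weights for offsets -radius..radius (in cells)."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(grid, sigma=1.0, radius=1):
    """Separable Gaussian blur of a 2D list, clamping indices at the edges."""
    kernel = gaussian_kernel(sigma, radius)
    rows, cols = len(grid), len(grid[0])

    def clamp(i, n):
        return min(max(i, 0), n - 1)

    offsets = range(-radius, radius + 1)
    horiz = [[sum(kernel[k + radius] * grid[r][clamp(c + k, cols)] for k in offsets)
              for c in range(cols)] for r in range(rows)]
    return [[sum(kernel[k + radius] * horiz[clamp(r + k, rows)][c] for k in offsets)
             for c in range(cols)] for r in range(rows)]


def find_treetops(chm, min_height=1.2, sigma=1.0, radius=1):
    """Return (row, col) cells that are strict 3x3 local maxima of the smoothed
    CHM and whose smoothed height is at least min_height (metres)."""
    smoothed = smooth(chm, sigma, radius)
    rows, cols = len(chm), len(chm[0])
    tops = []
    for r in range(rows):
        for c in range(cols):
            h = smoothed[r][c]
            if h < min_height:
                continue
            neighbours = [smoothed[rr][cc]
                          for rr in range(max(r - 1, 0), min(r + 2, rows))
                          for cc in range(max(c - 1, 0), min(c + 2, cols))
                          if (rr, cc) != (r, c)]
            if all(h > n for n in neighbours):
                tops.append((r, c))
    return tops


# Toy CHM (metres) with two crowns of different height.
chm = [
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 2.0, 2.0, 2.0, 0.0, 1.5, 1.5, 1.5, 0.0],
    [0.0, 2.0, 3.0, 2.0, 0.0, 1.5, 2.5, 1.5, 0.0],
    [0.0, 2.0, 2.0, 2.0, 0.0, 1.5, 1.5, 1.5, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
]
tops = find_treetops(chm, min_height=1.2)
tall_only = find_treetops(chm, min_height=2.0)
```

Because the height threshold is applied after smoothing in this sketch, raising `min_height` to 2.0 m drops the lower crown, which mirrors how the minimum tree height excludes small trees from segmentation.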
The point cloud clustering segmentation method [31] primarily extracts features such as spatial relationships and colors from point cloud data and uses clustering or segmentation algorithms to partition the data into distinct regions or individual tree canopies. The results of the clustering segmentation process were controlled by adjusting the segmentation parameters. Setting a value that is too small may result in noise points or small non-canopy areas being clustered as individual tree canopies, while setting a value that is too large may lead to incorrect clustering of small individual tree canopy areas. In this study, we set the threshold for the average sample distance of Camellia oleifera trees between 0.3 and 0.7, with a field search radius of 0.5 and an average density range of Camellia oleifera point clouds of 15–30%.
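The effect of a distance threshold on clustering can be illustrated with a simplified stand-in. The sketch below is not the algorithm of [31]: it links points whose horizontal distance is within `max_dist` and discards groups smaller than `min_points` as noise. The function name, both parameters, and the sample coordinates are hypothetical.

```python
import math
from collections import deque


def cluster_points(points, max_dist=0.5, min_points=3):
    """Group (x, y, z) points into clusters of mutually reachable neighbours.

    Two points are linked when their horizontal (XY) distance is <= max_dist.
    Groups with fewer than min_points members are treated as noise and dropped.
    Returns a list of clusters, each a sorted list of point indices.
    """
    unvisited = set(range(len(points)))
    clusters = []
    while unvisited:
        seed = min(unvisited)
        unvisited.remove(seed)
        queue = deque([seed])
        members = [seed]
        while queue:
            i = queue.popleft()
            near = [j for j in unvisited
                    if math.dist(points[i][:2], points[j][:2]) <= max_dist]
            for j in near:
                unvisited.remove(j)
                queue.append(j)
                members.append(j)
        if len(members) >= min_points:
            clusters.append(sorted(members))
    return clusters


# Hypothetical returns: two small crowns and one isolated noise point.
points = [
    (0.0, 0.0, 2.0), (0.3, 0.0, 2.1), (0.3, 0.3, 1.9), (0.0, 0.3, 2.2),
    (2.0, 0.0, 1.8), (2.3, 0.0, 1.7), (2.2, 0.3, 1.9),
    (5.0, 5.0, 0.4),
]
clusters = cluster_points(points, max_dist=0.5, min_points=3)
merged = cluster_points(points, max_dist=2.0, min_points=3)
```

With `max_dist=0.5` the two crowns stay separate, while `max_dist=2.0` merges them into one cluster, which is the under-segmentation risk of an overly large threshold noted above. This brute-force search is quadratic in the number of points and is meant only to show the principle.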
The layer stacking fitting segmentation method [32] mainly combines the point cloud clustering algorithm and the region growing method. In the point cloud data, seed points representing individual tree canopies were selected manually or by feature extraction. Starting from each seed point, the region growing algorithm gradually added adjacent points to the same region until the growth conditions were no longer met. In the segmentation of the Camellia oleifera canopy, region growth can be judged according to the spatial relationship and color similarity between points. The size of the CHM spatial grid used for the layer stacking algorithm was 0.3–0.5 m, and the Gaussian smoothing coefficient and Gaussian radius were consistent with those of the CHM segmentation algorithm, namely 0.3–1.0 and 0.5–1.0, respectively.
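The seed-and-grow step described above can be sketched on a CHM grid. This covers only region growing, not the full layer stacking method of [32]. The growth condition used here (a neighbouring cell joins a region if it is at least `min_height` and no higher than the cell it is reached from) is an assumption standing in for the spatial and color criteria mentioned in the text, and the function name and toy values are hypothetical.

```python
from collections import deque


def grow_regions(chm, seeds, min_height=1.2):
    """Label CHM cells by breadth-first growth from seed cells.

    chm: 2D list of canopy heights in metres.
    seeds: list of (row, col) treetop cells; seed k receives label k + 1.
    Returns a 2D list of labels, with 0 meaning unassigned.
    """
    rows, cols = len(chm), len(chm[0])
    labels = [[0] * cols for _ in range(rows)]
    queue = deque()
    for label, (r, c) in enumerate(seeds, start=1):
        labels[r][c] = label
        queue.append((r, c))
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            rr, cc = r + dr, c + dc
            if not (0 <= rr < rows and 0 <= cc < cols):
                continue
            if labels[rr][cc] != 0:
                continue
            h = chm[rr][cc]
            # Assumed growth condition: canopy cell that does not rise
            # above the cell it is reached from.
            if min_height <= h <= chm[r][c]:
                labels[rr][cc] = labels[r][c]
                queue.append((rr, cc))
    return labels


# Toy CHM with two adjacent crowns separated by a shallow dip.
chm = [
    [0.0, 1.5, 1.6, 1.3, 1.8, 1.4, 0.0],
    [0.0, 2.0, 3.0, 1.4, 2.5, 1.5, 0.0],
    [0.0, 1.5, 1.6, 1.3, 1.8, 1.4, 0.0],
]
labels = grow_regions(chm, seeds=[(1, 2), (1, 4)])
```

In this toy case the cells in the dip between the crowns are claimed by whichever seed reaches them first, which is one way boundary errors arise when canopies overlap.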
2.5. Evaluation of Segmentation Accuracy
The integrity of Camellia oleifera canopy segmentation can be assessed through the visual interpretation of high-resolution UAV images, three-dimensional morphological analysis of laser point clouds, and ground survey data from forestry lands. Each canopy point cloud segmentation algorithm exhibits unique performance characteristics and adaptability. In the case of orthophoto images obtained by UAV, areas exceeding 80% of the canopy diameter at breast height were typically classified as correct detection and segmentation (true positive, TP), while areas surpassing the threshold but not accurately detected and segmented were considered false positives (FP), and those below the threshold that remained undetected and unsegmented were regarded as false negatives (FN). To evaluate the accuracy of the results produced by the three segmentation algorithms, we calculated the weighted harmonic mean F from the detection recall (r) and precision (p). The calculations of r and p are shown in Equations (1) and (2) [33].

$$ r = \frac{TP}{TP + FN} \quad (1) $$

$$ p = \frac{TP}{TP + FP} \quad (2) $$

Figure 3 illustrates the corresponding segmentation criteria. The weighted harmonic mean F was calculated according to Equation (3):

$$ F = \frac{2 \times r \times p}{r + p} \quad (3) $$

where TP represents the number of correctly detected and segmented Camellia oleifera canopies, FP represents the number of incorrectly detected Camellia oleifera canopies, and FN represents the number of undetected Camellia oleifera canopies.
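As a worked illustration of these metrics, the sketch below computes recall, precision, and F from TP, FP, and FN counts using the standard definitions r = TP/(TP + FN), p = TP/(TP + FP), and F = 2rp/(r + p). The function name and the counts in the example are hypothetical and are not results from this study.

```python
def detection_scores(tp, fp, fn):
    """Return (recall, precision, F) for canopy detection counts."""
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = recall + precision
    f_score = 2.0 * recall * precision / denom if denom else 0.0
    return recall, precision, f_score


# Hypothetical plot with 88 reference trees: 80 matched, 8 missed, 5 spurious.
r, p, f = detection_scores(tp=80, fp=5, fn=8)
print(f"recall={r:.3f} precision={p:.3f} F={f:.3f}")
```

F can equivalently be written as 2TP/(2TP + FP + FN), so a missed canopy and a spurious canopy lower the score by the same amount.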
3. Results
In this research, the accuracy of three canopy segmentation algorithms for Camellia oleifera was compared in three distinct research areas, and the impact of the CSF filtering algorithm on canopy segmentation accuracy was analyzed. The findings revealed that the performance of the canopy segmentation algorithms for Camellia oleifera was primarily influenced by sample plant density, canopy width, spatial grid size, threshold setting, and edge overlap. Among the three algorithms, point cloud clustering segmentation exhibited the highest overall accuracy, followed by CHM segmentation and layer stacking segmentation.
In the three research areas, when the grid size was smaller than 0.3 m or larger than 0.7 m, the accuracies of the three segmentation algorithms in the canopy segmentation of Camellia oleifera were markedly lower. When the grid was 0.3 m, the F-scores of CHM segmentation, point cloud clustering segmentation, and layer stacking fitting segmentation in area 1 were 75%, 82%, and 69%, respectively; those in area 2 were 70%, 82%, and 69%, respectively; and those in area 3 were 58%, 66%, and 60%, respectively. When the grid was 0.5 m, the F-scores in area 1 were 88%, 93%, and 79%, respectively; in area 2 they were 88%, 88%, and 83%, respectively; and in area 3 they were 85%, 90%, and 84%, respectively. When the grid was 0.7 m, the F-scores in area 1 were 78%, 91%, and 72%, respectively; in area 2 they were 70%, 82%, and 71%, respectively; and in area 3 they were 70%, 83%, and 80%, respectively. When the grid was 0.5 m, the segmentation accuracy achieved the highest level of 90%.
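The reported F-scores can be summarized with a short calculation. The sketch below takes the values listed above and computes an unweighted mean over the three research areas for each algorithm and grid size. Averaging across areas is an assumption made here for illustration; the paper does not state how its overall figures were derived.

```python
# F-scores (%) from the text, ordered as (area 1, area 2, area 3).
f_scores = {
    0.3: {"CHM": (75, 70, 58), "clustering": (82, 82, 66), "layer stacking": (69, 69, 60)},
    0.5: {"CHM": (88, 88, 85), "clustering": (93, 88, 90), "layer stacking": (79, 83, 84)},
    0.7: {"CHM": (78, 70, 70), "clustering": (91, 82, 83), "layer stacking": (72, 71, 80)},
}


def mean_scores(table):
    """Unweighted mean F-score over the areas for every grid size and algorithm."""
    return {grid: {alg: sum(vals) / len(vals) for alg, vals in algs.items()}
            for grid, algs in table.items()}


means = mean_scores(f_scores)
best = max(((grid, alg) for grid in means for alg in means[grid]),
           key=lambda key: means[key[0]][key[1]])

for grid in sorted(means):
    row = ", ".join(f"{alg}: {score:.1f}" for alg, score in means[grid].items())
    print(f"grid {grid} m -> {row}")
print("best combination:", best)
```

Under this averaging, point cloud clustering at a 0.5 m grid gives the highest mean (about 90.3%), and every algorithm scores higher at 0.5 m than at 0.3 m or 0.7 m.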
In addition, after CSF filtering of the ground point cloud, the accuracy of the canopy segmentation of Camellia oleifera was significantly improved. According to the actual measurement results, the improvement was greatest when the threshold coefficient of the CSF algorithm was set to 0.55 and the number of iterations was set to 1000. The CSF filtering algorithm not only improved the segmentation efficiency but also increased the segmentation accuracy by 21%, which verifies its positive role in the canopy segmentation of Camellia oleifera.
5. Conclusions
In this study, we used a UAV-LiDAR system to collect laser point cloud information for Camellia oleifera. Three research areas were then selected according to terrain and planting type, and three canopy segmentation algorithms (CHM segmentation, point cloud clustering segmentation, and layer stacking fitting segmentation) were evaluated under different parameters. F-scores were used to assess the accuracy and effectiveness of the experimental protocol. The experimental results show that the point cloud clustering segmentation algorithm had the highest performance, followed by CHM segmentation and layer stacking segmentation. The segmentation accuracy was highest when the grid size was 0.5 m, and the CSF filtering algorithm also had a positive effect on the canopy segmentation of Camellia oleifera.
By comparing and analyzing the accuracy of the different segmentation algorithms, we determined the best segmentation parameters. To further enhance the accuracy of canopy segmentation algorithms for Camellia oleifera, future research should focus on optimizing the sampling method for plant density, improving canopy width measurement technology, and fine-tuning threshold settings and edge overlap parameters. Additionally, exploring alternative filtering algorithms may also improve the effectiveness of Camellia oleifera canopy segmentation. These endeavors will facilitate a deeper understanding of the structure and characteristics of Camellia oleifera canopies while providing a scientific foundation for managing and conserving Camellia oleifera.