1. Introduction
Forest species classification [1] plays a vital role in forest resource monitoring [2], forest management [3], biodiversity assessment [4], and carbon storage [5], among others. Surveying tree species [6] relies on the visual inspection and measurement of individual trees through various parameters, such as tree height, canopy width, trunk diameter (diameter at breast height), morphological structure, leaf shape, and bark texture. Gathering this detailed information requires significant manpower, time, and prior knowledge, rendering it unsuitable for large-scale tree species surveys. With technological advancements, remote sensing has gradually been applied to tree species classification [7]. This approach first extracts features from the data and then combines them with traditional supervised classification methods, such as support vector machines [8], maximum likelihood [9], and random forest [10], to classify tree species. However, due to spatial resolution constraints, early remote sensing images were only suitable for regional-scale assessments and could not achieve individual-tree-level classification [11]. With the emergence of high-resolution remote sensing [12] and hyperspectral remote sensing [13,14], the resolution and accuracy of tree species classification have significantly improved. However, passive remote sensing images retain inherent limitations, such as difficulty in acquiring information on tree species below the canopy, reliance on sunlight, and susceptibility to meteorological conditions and time of acquisition [15].
In comparison, Light Detection and Ranging (LiDAR) [16,17,18], as a form of active remote sensing technology, autonomously emits light and receives the reflected signals, which exhibit distinctively high reflectivity when interacting with vegetation [19]. The quality of its signals is not affected by meteorological conditions or time of day. LiDAR allows the signal transmitter position to be changed so that information about forest trees can be obtained from various angles. It offers higher spatial resolution for complex forest terrains and vegetation structures, along with superior penetration that enables it to capture information beneath the canopy. Furthermore, LiDAR can collect three-dimensional (3D) point cloud data under a wide range of environmental conditions. Consequently, it has gradually become a research hotspot for the classification of tree species [20,21,22].
One challenge in applying point cloud data to tree species classification lies in its unordered nature [23]. Because each point in a point cloud is collected independently in space, the arrangement of the points in the dataset is random and unrelated to their physical locations. This disorder means that point cloud data cannot be fed directly into traditional machine learning methods. To address this issue, unordered point cloud data must typically be transformed into a format that traditional machine learning classifiers can process. Currently, the most prevalent approach is feature-based classification, which extracts a series of features from the point cloud data and uses them as inputs to conventional machine learning classifiers for tree species identification. In this manner, the inherent disorder of point cloud data is converted, through feature extraction and selection, into ordered information suitable for machine learning classifiers, enabling effective identification of different tree species. Xiaoyi et al. [24] used an optimal feature parameter set based on point cloud distribution characteristics for tree species classification, achieving an average classification accuracy of 58.8%. Cao et al. [25] used full-waveform LiDAR data to achieve an overall classification accuracy of 68.6% for six subtropical forest tree species, including Pinus massoniana and Cunninghamia lanceolata. In feature-based classification, the selection of features typically relies heavily on deep prior knowledge, and classification accuracy is highly sensitive to which feature categories are selected, which greatly limits the effectiveness of this method. Moreover, although point cloud data provide comprehensive 3D spatial information about trees, feature-based methods often fail to fully exploit the features and 3D information inherent in point cloud data.
Simultaneously, given the strong correlation between a tree species and its morphological structure [26,27,28], image-based tree classification stands as one of the traditional methods for tree species classification [29]. However, because it demands intensive human labour and time, it is not suitable for current large-scale tree species surveys. The advent of LiDAR technology has addressed the previous difficulty of obtaining tree images: by segmenting individual trees from point cloud data and projecting them, images exhibiting the complete morphological structure of the trees can be acquired. These two-dimensional (2D) images also circumvent the disorder inherent in point cloud data. Tree species classification can therefore be based on point cloud projection images. Hamid et al. [30] converted point cloud data into a 2D projection image dataset for individual trees and used a convolutional neural network (CNN) to classify the crowns of 124 conifers, achieving an average accuracy of 87% despite the limited tree features provided by canopy information. Mizoguchi et al. [31] converted point cloud data from the trunk sections of cedar and cypress trees into images and used a CNN to classify the two types of trunks, achieving an average accuracy of 89%.
In recent years, image classification algorithms have made significant progress, evolving from early machine learning feature extraction methods to today's advanced deep learning techniques. The mainstay of current image classification is the CNN [32], which is often used for feature extraction and dimensionality reduction, making it a crucial component of modern image classification. Another significant development is the visual geometry group (VGG) network [33], characterised by its use of small convolutional kernels; this design enhances classification effectiveness, especially on detailed and complex image content. The advent of the residual network (ResNet) [34,35] marked a considerable advancement: by introducing cross-layer residual connections, ResNet addresses the vanishing gradient problem encountered in deep neural networks, significantly enhancing the learning capability of deep networks. Furthermore, the densely connected convolutional network (DenseNet) introduced dense connections, another leap forward in the evolution of image classification algorithms. These methods continuously explore the potential of image classification, offering innovative perspectives for the classification of individual tree species.
For the aforementioned reasons, this study focused on tree species classification based on images projected from point cloud data. The main emphasis lies in exploring the impact of different projection directions, of various classification models, and of incorporating colour information as a means of restoring the depth information lost when 3D point cloud data are transformed into 2D projection images. The feasibility of these methods for mitigating dimensional information loss during projection and for enhancing classification accuracy in tree species identification from point cloud projection images was investigated, with the aim of informing and benefiting future research.
4. Discussion
The results of this study show that projecting point cloud data into 2D images can effectively address the lack of order in such data. This approach transforms the originally unordered points into pixels of a 2D image with explicit adjacency relationships and order, allowing existing machine learning techniques to be employed for classification. However, this transformation carries a genuine risk of information loss. To mitigate this risk, this study adopted several strategies, which are discussed below together with their impact on the experimental results.
The quality of the information contained in projection images derived from different projection directions varies, resulting in different classification accuracies. Among the X-, Y-, and Z-direction projection images selected for this study, the average classification accuracies obtained in the X- and Y-directions were superior to those in the Z-direction. Comparison of the projection images showed that only tree crown information, including shape, area, and degree of closure, was obtained from the Z-direction projection. Because the selected trees were concentrated in one region with similar climatic conditions, crown differentiation was insufficient for some tree species, ultimately leading to low information content in this direction. The classification model therefore achieved relatively low accuracy in this direction, and the Z-axis projection model was not discussed further. In comparison, the X- and Y-direction projection images were often richer in information.
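To make the projection step concrete, the following is a minimal sketch, in plain Python, of projecting a 3D point cloud into a binary 2D image by discarding one coordinate axis. The function name, image size, and min-max normalisation are illustrative assumptions, not details taken from the study.

```python
def project_points(points, drop_axis=0, size=64):
    """Project 3D points to a size x size binary image by discarding
    one axis (drop_axis=0/1/2 gives the X-/Y-/Z-direction view)."""
    axes = [a for a in range(3) if a != drop_axis]  # the two axes kept
    us = [p[axes[0]] for p in points]
    vs = [p[axes[1]] for p in points]

    def to_pixels(vals):
        # Min-max normalise coordinates to integer pixel indices.
        lo, hi = min(vals), max(vals)
        span = (hi - lo) or 1.0
        return [int((v - lo) / span * (size - 1)) for v in vals]

    ui, vi = to_pixels(us), to_pixels(vs)
    img = [[0] * size for _ in range(size)]
    for u, v in zip(ui, vi):
        img[size - 1 - v][u] = 1  # flip v so larger values appear higher
    return img
```

Under this convention, `drop_axis=2` produces the Z-direction (crown, top-down) view discussed above, while `drop_axis=0` and `drop_axis=1` produce the side views that retain trunk and branch silhouettes.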
Compared to single-direction projection images, XY dual-direction projection images provided the classification models with shape contour information from two directions while effectively increasing the sample size. Consequently, dual-direction classification outperformed single-direction classification. This strategy not only expanded the training set but also added a further information dimension, enhancing the model's generalisation ability and accuracy. It allowed the model to understand the 3D coordinate information of the point clouds more comprehensively and to acquire more precise positional information, improving classification accuracy and robustness. The average precision increased from 81.34% to 85.43%.
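As a sketch of how such a dual-direction training set might be assembled (the function and variable names are illustrative assumptions; the study's actual data pipeline is not specified), each tree contributes one X-direction and one Y-direction view, and both views carry the same species label:

```python
def xy_views(points):
    """Return the X- and Y-direction views of one tree: each view keeps
    the two coordinates perpendicular to the projection direction."""
    x_view = [(y, z) for x, y, z in points]  # project along X
    y_view = [(x, z) for x, y, z in points]  # project along Y
    return x_view, y_view

def build_training_set(labelled_clouds):
    """Turn each (points, species) pair into two labelled samples,
    one per view, doubling the effective sample size."""
    samples = []
    for points, species in labelled_clouds:
        for view in xy_views(points):
            samples.append((view, species))
    return samples
```

Keeping both views under a single label is what expands the training set while still teaching the model that the two silhouettes describe the same tree class.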
By using colouration to restore part of the information compressed along the projection dimension, the classification models could obtain more spatial information, helping them better capture the 3D features of the point cloud data and the distance relationships between points. Accordingly, the accuracy of the classification models improved significantly, with the average precision increasing from 81.72% to 84.86%. This result indicates that adding depth information benefits point cloud projection image classification tasks and that exploiting spatial information increases the accuracy of the classification models.
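One way to realise this colouration idea is to encode the discarded coordinate as pixel intensity, so that each pixel retains an approximation of the depth lost in the projection. The sketch below is a grayscale variant under stated assumptions: the study's actual colour mapping is not specified, and the names, resolution, and keep-the-largest-depth rule are illustrative choices.

```python
def project_with_depth(points, drop_axis=2, size=64):
    """Project 3D points to a size x size image whose pixel intensity
    (1-255) encodes the coordinate discarded by the projection;
    0 means no point fell on that pixel."""
    axes = [a for a in range(3) if a != drop_axis]

    def norm(vals, top):
        # Min-max normalise values onto [0, top].
        lo, hi = min(vals), max(vals)
        span = (hi - lo) or 1.0
        return [(v - lo) / span * top for v in vals]

    ui = [int(t) for t in norm([p[axes[0]] for p in points], size - 1)]
    vi = [int(t) for t in norm([p[axes[1]] for p in points], size - 1)]
    # Map the discarded coordinate to 1..255 so depth 0 is distinguishable
    # from empty background.
    depth = [int(t) + 1 for t in norm([p[drop_axis] for p in points], 254.0)]
    img = [[0] * size for _ in range(size)]
    for u, v, d in zip(ui, vi, depth):
        row = size - 1 - v  # flip so larger v appears higher in the image
        img[row][u] = max(img[row][u], d)  # keep the deepest point per pixel
    return img
```

A colour variant would simply route the same normalised depth value through a colormap into the image's RGB channels instead of a single grayscale channel.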
Comparing the training results of the four models showed that the simpler CNN structure had the lowest accuracy. The deeper VGG model improved accuracy by nearly 12% over the ordinary CNN, reaching 85.83%. ResNet retained the advantages of VGG at lower time cost and resource occupation, and its classification accuracy was the highest. DenseNet, while incurring greater computational cost and storage requirements, achieved a precision slightly lower (by 3.03%) than that of ResNet.
In line with initial expectations, the strategies employed successfully mitigated the problem of dimensional information loss during the projection process and enhanced classification accuracy. Compared to previous related studies [30,31], the present study expanded the classification from two species to nine. While this undoubtedly increased the complexity of the classification task, a peak classification accuracy of 91.46% was still achieved. This result provides substantial evidence for the efficacy and feasibility of the research method presented here and holds significant implications for advancing the study of tree species classification using point cloud projection images.
There exists a significant correlation between the morphological structure of trees and their corresponding species. Each tree has unique growth patterns and morphological features, which, in most instances, are directly associated with the species. For instance, some species might exhibit rapid vertical growth, resulting in slender, erect trunks, whereas others might lean towards lateral expansion, forming expansive canopies. These characteristics are intrinsic attributes of trees, manifesting as distinct dendritic structures. Moreover, these differences in dendritic structure are reflected within point cloud data. Specifically, by analysing and interpreting point cloud data, detailed 3D information about trees can be obtained, encompassing various aspects such as trunk thickness, leaf distribution, and canopy shape. This information can significantly aid in the accurate determination of a tree’s species. In other words, to some extent, dendritic structure provides pivotal clues for species identification.
During the tree species classification process, the model's ability to learn the dendritic structural features specific to each species can be enhanced by incorporating additional viewing-angle information, augmenting depth information, and adjusting the classification model's architecture and parameters. These methods aim to achieve effective tree species classification. However, certain misclassification issues persisted. For instance, council trees were misclassified as camphor trees, mango trees as bodhi trees, and wingleaf soapberries as council trees. Manual comparison with the original point clouds attributed these misclassifications to specific individual morphologies or to point cloud quality. For example, council trees with poor growth may exhibit morphological similarities to camphor trees, mango trees with fewer lateral branches were erroneously identified as bodhi trees, and point clouds of inferior quality led to wingleaf soapberries being misclassified as council trees. These cases underscore that while the majority of trees develop similar morphological structures during growth, free growth in natural environments may produce distinct morphological deviations, leading machines to misidentify them as other species. Nevertheless, these discrepancies still fall within an acceptable margin of error.
These challenges suggest the potential for further improving the classification accuracy of point cloud projection images. Given the complexity of these issues, it may be necessary to delve into deeper-level features, optimise the model parameters of the classification methods, or employ superior techniques for observing the understory, such as integrating data from airborne laser scanning (ALS) [46], terrestrial laser scanning [47], and backpack laser scanning [48]. These strategies could offer more effective solutions to these challenging classification problems.
There were some limitations to this study that need to be addressed in future investigations. First, the tree segmentation method required manual assistance, and the sampling areas were relatively concentrated. In the future, we will explore the effects of more projection angles on tree classification and investigate the optimal combination of projection angles to improve the speed and efficiency of classification and to minimise redundant samples. Additionally, we will introduce classification models, such as the multi-view CNN [49], which treat multiple views of an object as the same object, avoiding the problem of treating different projections of the same object as separate entities. For example, Silva et al. [50] achieved 95% accuracy in tree species classification by using microscopic images of three major anatomical parts of wood combined with a multi-view random forest model, in contrast to the traditional approach of using cross-sectional images alone. In future research, we will explore the optimal combination of multi-view images and multi-view classification models, along with the point cloud data currently in use, to further probe the upper limit of tree species classification using multi-view projection images, which can be obtained quickly and conveniently. Furthermore, considering the differences in tree growth patterns between terrains and regions, we will further validate classification performance in different areas and climates and on different mountain slopes (shady versus sunny) to enhance the reliability and generalisability of this method in practical applications. Finally, the point cloud data processed in this study cover only nine specific tree species, so the applicability of the trained classification model is currently limited to identifying and classifying these nine tree types. To augment the model's versatility and robustness and to address the issue of parameter generalisation [51], future research will focus on collecting and processing point cloud data from a broader array of tree species across different geographical areas and time frames. This expansion will broaden the model's applicability, further elevating its comprehensiveness and effectiveness in practical forestry applications.