Article

A Building Point Cloud Extraction Algorithm in Complex Scenes

1 School of Computer and Software Engineering, Xihua University, Chengdu 610039, China
2 School of Resources and Environment, University of Electronic Science and Technology of China, Chengdu 611731, China
3 Communications and Information Technology Headquarters, Sichuan Provincial Public Security Department, Chengdu 610041, China
* Author to whom correspondence should be addressed.
Remote Sens. 2024, 16(11), 1934; https://doi.org/10.3390/rs16111934
Submission received: 6 April 2024 / Revised: 8 May 2024 / Accepted: 16 May 2024 / Published: 28 May 2024

Abstract

Buildings are significant components of digital cities, and their precise extraction is essential for the three-dimensional modeling of cities. However, it is difficult to extract building features accurately in complex scenes, especially where trees adhere tightly to buildings. This paper proposes a highly accurate, two-stage building point cloud extraction method based solely on the geometric information of points. The coarsely extracted building point cloud from the first stage is iteratively refined in the second stage with the help of mask polygons and the region growing algorithm. To enhance accuracy, this paper combines the Alpha Shape algorithm with the neighborhood expansion method to generate mask polygons, which help recover the boundary points missed by the region growing algorithm. In addition, this paper performs mask extraction on the original points rather than on non-ground points, solving the problem of the cloth simulation filtering algorithm misidentifying facade points near the ground. The proposed method shows excellent extraction accuracy on the Urban-LiDAR and Vaihingen datasets. Specifically, it outperforms the PointNet network by 20.73% in precision for roof extraction on the Vaihingen dataset and achieves performance comparable to the state-of-the-art HDL-JME-GGO network. The proposed method also demonstrates high accuracy in extracting building points, even in scenes where buildings are closely adjacent to trees.


1. Introduction

Building points are widely used in a variety of fields, including urban planning, cultural preservation, and disaster management, due to their capacity to capture detailed geometric features [1,2]. With the rapid development of cities, the surrounding environment of buildings has become complicated, making accurate building extraction a difficult task [3,4,5,6].
Building point cloud extraction methods can be classified into two categories based on data sources: single-source methods and multi-source methods. Single-source methods only use LiDAR data to extract building points. Zou et al. [7] proposed an adaptive strips approach for extracting buildings, which used adaptive-weight polynomials to classify each point and extract the edge points of buildings based on the regional clustering relationship among the points. This method only utilized the three-dimensional coordinate values of LiDAR data without the need for other auxiliary information to successfully identify buildings. Huang et al. [8] developed a top-down method based on the object entity to extract building points. Ground points were separated from non-ground points, and non-ground points were split to identify smooth zones. The building regions were then distinguished from smooth regions by top-level processing using their geometric and penetrating properties. Lastly, employing topological, geometric, and penetrating properties, the down-level processing was used to eliminate non-building points surrounding structures from each building region. The method produced good results in terms of area-based and object-based quality. Hui et al. [9] developed a multi-constraint graph segmentation method that converted point-based building extraction into object-based building extraction through multi-constraint graph segmentation and then utilized the spatial geometry information of objects and a multi-scale progressive growth algorithm to obtain building points. These methods perform well in extracting buildings in general urban environments and enable automated building recognition. However, when dealing with tree points closely attached to buildings, there is a possibility of misclassifying them as buildings.
The multi-source methods integrate LiDAR, aerial images, and ground planning maps into building point extraction, typically employing traditional and deep learning techniques. In the traditional technique, Qin and Fang [10] proposed a hierarchical building extraction method from high-resolution multispectral aerial images and Digital Surface Model (DSM) data. The method began with shadow detection using the morphological index, followed by the calculation of NDVI for correction. Subsequently, the top-hat reconstruction of DSM was combined with the NDVI to create the initial building mask data. Finally, the extracted building data was optimized using graph segmentation based on an improved super-pixel method. Acar et al. [11] introduced a building roof extraction algorithm that incorporated multiple data sources. Initially, the NDVI was calculated using spectral information, followed by applying a threshold to distinguish between vegetation and non-vegetation data. Subsequently, the Triangular Mesh Progressive Encoder Filter algorithm was employed to separate ground data. Lastly, the random sample consensus algorithm was utilized to extract the planar information of buildings. The algorithm achieved an average accuracy of 95%, completeness of 98%, and quality of 93%. Hron and Halounová [12] introduced a method for autonomously creating topologically correct roof-building models using building footprints and vertical aerial images. The method enabled the detection and categorization of roof edges in orthophotos by leveraging spatial relationships and height data from a digital surface model. This strategy enabled buildings with complicated designs to be divided into small portions that could be treated separately.
In the deep learning technique, Ghamisi et al. [13] proposed a fusion approach that combines extinction curves and convolutional neural networks for spectral-spatial classification of LiDAR and hyperspectral data. Firstly, extinction curves were extracted from different attributes to capture elevation and spatial information from both LiDAR and hyperspectral data. Afterwards, the extracted features were merged through either feature concatenation or graph feature fusion. Finally, the merged features were fed into a deep learning-based classifier for generating classification maps. Using optical imagery and unregistered airborne LiDAR data, Nguyen et al. [14] proposed an unsupervised and fully autonomous snake model without manual beginning points or training data to extract buildings. It was demonstrated that the method could recover buildings of different colors from intricate surroundings with a high degree of overall accuracy. Yuan et al. [15] proposed an end-to-end fully convolutional neural model based on residual networks for handling high-resolution aerial imagery and LiDAR data. The residual network effectively extracted high-level features, thus reducing the performance degradation associated with increasing network depth. The network demonstrated excellent performance, achieving an IoU of 93.19% and an OA of 97.56% on the WHU dataset and an IoU of 94.72% and an OA of 97.84% on the Boston dataset.
Combining LiDAR with aerial images and other data can significantly enhance the accuracy of building extraction. However, registering data from different sources in the same reference coordinate system remains challenging.
To improve building extraction accuracy, this paper proposes a highly accurate building point cloud extraction method based solely on the geometric information of the points. The method is divided into two stages: coarse extraction and fine extraction. In the coarse extraction stage, a coarsely extracted building point cloud is obtained using the cloth simulation filtering algorithm and the region growing algorithm. In the fine extraction stage, the coarsely extracted building point cloud is iteratively refined using mask polygons and the region growing algorithm. This step-by-step refinement process allows for a more accurate extraction of the building point cloud. The proposed method is evaluated on the Urban-LiDAR and Vaihingen datasets, demonstrating excellent extraction accuracy. The main contributions of this paper are summarized as follows:
  • This paper combines the Alpha Shape algorithm with the neighborhood expansion method to compensate for the shortcomings of the region growing algorithm in the coarse extraction stage, thereby obtaining more complete building points.
  • To address the issue of misidentifying facade points near the ground, we perform mask extraction on the original points instead of non-ground points. This approach allows us to obtain more comprehensive facade points within the mask polygons compared to the ones obtained using the cloth simulation filtering algorithm.
  • Even in cases where buildings are closely adjacent to trees, the proposed method can successfully separate and extract building points from tree points, thereby improving accuracy and reliability.

2. Methods

This section introduces the proposed method for building extraction in complex scenes in detail. Our method is divided into two stages, namely coarse extraction and fine extraction, to achieve accurate extraction of the building point cloud.
In the coarse extraction stage of the building point cloud, our proposed method identifies non-ground points in the point cloud using the cloth simulation filtering (CSF) algorithm and uses a region growing algorithm to obtain the coarse extraction of the building point cloud. At this stage, the region growing algorithm may fail to identify some building boundary points.
In the fine extraction stage of the building point cloud, our proposed method obtains mask polygons based on the coarsely extracted building points by applying the Alpha Shape algorithm and the neighborhood expansion method. The building point cloud is enlarged and replaced by non-ground points within mask polygons. Discrete tree points are removed from the building point cloud using the region growing algorithm and the Euclidean clustering algorithm. The building point cloud is then upgraded by merging with the facade point cloud near the ground. Noise points are removed using the radius filtering algorithm to obtain the final building point cloud. The detailed workflow and visualization flowchart for the building point cloud extraction are shown in Figure 1 and Figure 2.

2.1. Coarse Extraction of the Building Point Cloud

Due to the large terrain undulations and uneven density distribution of points, traditional filtering algorithms have difficulty obtaining high-accuracy non-ground points. In order to remove ground points with high accuracy, this paper uses the CSF algorithm to separate non-ground points from ground points.
The basic idea of the CSF algorithm is to invert the original points and use a cloth model composed of spring-connected cloth particles to simulate the filtering process [16]. The position of particles on grid nodes in space determines the shape of the fabric [17]. According to Newton’s Second Law, the relationship between particle position and force can be expressed as follows [18]:
$$ m \frac{\partial^{2} X(t)}{\partial t^{2}} = F_{e}(X, t) + F_{i}(X, t), \tag{1} $$
where $m$ is the mass of the particle, $X(t)$ is the position of the particle at time $t$, $F_{e}(X, t)$ is the external force on the particle, and $F_{i}(X, t)$ is the internal force on the particle at position $X$ at time $t$.
According to Equation (1), we first only calculate the influence of gravity on each particle, resulting in the position of each particle [18]:
$$ X(t + \Delta t) = 2X(t) - X(t - \Delta t) + \frac{G}{m}\,\Delta t^{2}, \tag{2} $$
where $G$ is the gravity, $X(t)$ is the position of the particle at time $t$, and $\Delta t$ is the time step.
Next, consider the internal forces between particles to limit their displacement in the void area of the inverted points. The displacement of each particle is calculated as follows [18]:
$$ \vec{d} = \frac{1}{2}\, b \left( \vec{p}_{k} - \vec{p}_{0} \right) \cdot \vec{n}, \quad k = 1, 2, 3, \ldots \tag{3} $$
where $\vec{d}$ is the displacement vector of the particle; $b$ is a parameter that determines whether a particle can move ($b = 1$ indicates it can move; $b = 0$ indicates it cannot move); $\vec{p}_{k}$ is the position of a particle adjacent to $\vec{p}_{0}$; and $\vec{n} = (0, 0, 1)^{T}$.
Finally, the relative position of particles is adjusted based on the internal forces between them and the fabric stiffness parameters. If the distance between the actual point and the simulated particles is less than the pre-set threshold, it is considered a ground point; otherwise, it is considered a non-ground point (Figure 3).
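As a concrete illustration of the particle dynamics in Equations (2) and (3), the short Python sketch below performs the gravity-only Verlet update and the neighbor-coupling displacement for a single cloth particle; unit mass and a 0.65 time step are assumptions for illustration, not values taken from the paper.

```python
# Minimal sketch of the CSF particle update (Equations (2) and (3)) for a
# single cloth particle; unit mass and a 0.65 time step are assumed.
def gravity_step(x_t, x_prev, g=-9.8, dt=0.65):
    """Verlet update under gravity only: X(t+dt) = 2X(t) - X(t-dt) + (G/m)dt^2."""
    return 2.0 * x_t - x_prev + g * dt ** 2

def internal_displacement(p0_z, pk_z, b=1):
    """Displacement of movable particle p0 toward neighbor pk along n = (0,0,1)^T."""
    return 0.5 * b * (pk_z - p0_z)   # Equation (3); b = 0 pins the particle

# A particle falls for three steps, then is pulled halfway toward a neighbor.
z_prev, z = 10.0, 10.0
for _ in range(3):
    z_prev, z = z, gravity_step(z, z_prev)
z += internal_displacement(z, pk_z=5.0)
```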
After identifying non-ground points in the point cloud, we use the region growing algorithm to obtain the coarse extraction of the building point cloud from non-ground points. The algorithm selects the point with the minimum curvature as the initial seed point. Given a neighboring point A of a seed point B, if the angle between the normal vector of A ($N_{neighbor}$) and that of B ($N_{seed}$) is less than a given threshold $\theta$ (Equation (4)) and the curvature value of A ($\sigma_{neighbor}$) is less than a given threshold $\sigma$ (Equation (5)), point A is considered a new seed point. The region continues to grow until all points are processed (Figure 4) [19].
$$ \arccos\left( \frac{N_{seed}}{\left\| N_{seed} \right\|} \cdot \frac{N_{neighbor}}{\left\| N_{neighbor} \right\|} \right) < \theta, \tag{4} $$
$$ \sigma_{neighbor} < \sigma. \tag{5} $$
Here, θ and σ are usually set small to avoid misidentifying approximately planar non-building points as building points. As a result, the region growing algorithm may fail to extract some building boundary points, because the local normal vectors of adjacent boundary points form large angles (Figure 5b).
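For concreteness, the sketch below implements this smoothness-constrained growing in Python with NumPy and SciPy. It assumes per-point unit normals and curvatures have already been estimated (e.g., by local PCA); the default thresholds follow the Urban-LiDAR column of Table 1. This is an illustrative sketch of the standard algorithm, not the authors' exact implementation.

```python
import numpy as np
from scipy.spatial import cKDTree

def region_growing(points, normals, curvatures,
                   theta=np.deg2rad(5.0), sigma=0.05, k=20):
    """Grow smooth regions from minimum-curvature seeds (Equations (4) and (5))."""
    tree = cKDTree(points)
    unlabeled = set(range(len(points)))
    regions = []
    while unlabeled:
        seed = min(unlabeled, key=lambda i: curvatures[i])  # min-curvature seed
        unlabeled.discard(seed)
        region, queue = [], [seed]
        while queue:
            cur = queue.pop()
            region.append(cur)
            _, neighbors = tree.query(points[cur], k=k)
            for nb in neighbors:
                if nb not in unlabeled:
                    continue
                # Equation (4): angle between (unoriented) unit normals < theta.
                cos_angle = np.clip(abs(np.dot(normals[cur], normals[nb])), 0.0, 1.0)
                if np.arccos(cos_angle) < theta:
                    unlabeled.discard(nb)
                    if curvatures[nb] < sigma:   # Equation (5): becomes a new seed
                        queue.append(nb)
                    else:
                        region.append(nb)
        regions.append(region)
    return regions
```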

2.2. Fine Extraction of the Building Point Cloud

Considering that the region growing algorithm may fail to include the boundary points of the buildings during the coarse extraction stage, the building point cloud is enlarged and replaced with the help of mask polygons.
In this paper, mask polygons are used to identify the points located within them. To obtain mask polygons, we first project the coarsely extracted building point cloud onto the XOY plane. Then, we use the Alpha Shape algorithm [20] to extract edge points from the projected points and finally extend the edge points through the neighborhood expansion method based on corresponding multipliers.
Mask polygons are extracted in the following steps (Figure 6):
(1)
All possible pairs of projected points are processed in the same way. For any pair of points $P_1(x_1, y_1)$ and $P_2(x_2, y_2)$ from the point cloud $S$ projected onto the XOY plane, the center $P_3(x_3, y_3)$ of a circle of radius $\alpha$ passing through $P_1$ and $P_2$ is calculated based on the distance intersection method (Figure 7) [21]:
$$ \begin{cases} x_3 = x_1 + \dfrac{1}{2}(x_2 - x_1) + H\,(y_2 - y_1) \\[4pt] y_3 = y_1 + \dfrac{1}{2}(y_2 - y_1) - H\,(x_2 - x_1), \end{cases} \tag{6} $$
where
$$ H = \sqrt{\frac{\alpha^{2}}{S_{P_1 P_2}^{2}} - \frac{1}{4}}, \qquad S_{P_1 P_2}^{2} = (x_1 - x_2)^{2} + (y_1 - y_2)^{2}. \tag{7} $$
(2)
The distance $d$ between each point in $S$ and $P_3$ is calculated. If $d$ is less than $\alpha$, the point is considered to be inside the circle; otherwise, it lies outside the circle. If, for a pair $P_1$ and $P_2$, no other points fall inside the circle, then $P_1$ and $P_2$ are defined as edge points, and $P_1 P_2$ is defined as a boundary line. Edge points are collected in this way until all point pairs in $S$ have been processed.
(3)
The centroid $Cen_{point}$ of all edge points, the distance $Dis_{point}$ from each edge point to $Cen_{point}$, and the direction vector $Dir_{point}$ from $Cen_{point}$ to each edge point are calculated. $Multi$ denotes the corresponding expansion multiplier. The expanded edge point $Exp_{point}$ is as follows:
$$ Exp_{point} = Cen_{point} + Dir_{point} \times Multi \times Dis_{point}. \tag{8} $$
(4)
Edge points are sorted based on their polar angles and connected to form a closed polygon, which is used to extract the points lying within it.
The steps for connecting edge points are as follows: First, the center point of all edge points is calculated. Then, the edge points are sorted based on their polar angles relative to the center point in a counterclockwise direction in ascending order. Finally, all edge points are connected in counterclockwise order to create a closed polygon (Figure 8).
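To make steps (1)–(4) concrete, the following Python sketch builds a mask polygon from 2D projected points: it finds edge points by the distance-intersection test of Equations (6) and (7), expands them per Equation (8), and orders them by polar angle. The brute-force O(n²) pair loop is for clarity only, and the values of alpha and multi are illustrative assumptions.

```python
import numpy as np

def mask_polygon(pts2d, alpha=2.0, multi=1.05):
    """Alpha Shape edge points, neighborhood expansion, and polar-angle ordering."""
    n = len(pts2d)
    edge = set()
    for i in range(n):
        for j in range(i + 1, n):
            p1, p2 = pts2d[i], pts2d[j]
            d2 = float(np.sum((p1 - p2) ** 2))        # squared |P1P2|
            if d2 == 0.0 or d2 > 4.0 * alpha ** 2:    # no circle of radius alpha exists
                continue
            h = np.sqrt(alpha ** 2 / d2 - 0.25)       # Equation (7)
            mid = 0.5 * (p1 + p2)
            perp = np.array([p2[1] - p1[1], p1[0] - p2[0]])
            for center in (mid + h * perp, mid - h * perp):   # Equation (6)
                dists = np.linalg.norm(pts2d - center, axis=1)
                if np.sum(dists < alpha - 1e-9) == 0:  # no point strictly inside
                    edge.update((i, j))                # P1, P2 are edge points
    edge_pts = pts2d[sorted(edge)]
    centroid = edge_pts.mean(axis=0)
    expanded = centroid + multi * (edge_pts - centroid)       # Equation (8)
    rel = expanded - centroid
    order = np.argsort(np.arctan2(rel[:, 1], rel[:, 0]))      # counterclockwise sort
    return expanded[order]
```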
After the mask polygons are obtained based on the coarsely extracted building point cloud, the building point cloud is enlarged and replaced by all non-ground points within the mask polygons. Due to the possibility of adding certain tree points to the building point cloud, we use the region growing algorithm and the Euclidean clustering algorithm [22] to filter out some discrete tree points from the building point cloud.
The specific operation process of the Euclidean clustering algorithm is as follows:
(1)
The K nearest neighbor points for any point P in space are found using the KD-Tree nearest neighbor search algorithm.
(2)
For the K nearest neighbor points, the Euclidean distance between each point and P is calculated.
(3)
If there are points within the K nearest neighbors that have a distance smaller than the set threshold, these points are clustered into a set Q.
(4)
The above process is repeated until the number of elements in set Q no longer increases.
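A minimal sketch of these four steps follows, using SciPy's KD-tree for the radius search; the defaults follow the Urban-LiDAR column of Table 1, and the sketch is illustrative rather than the authors' implementation.

```python
from scipy.spatial import cKDTree

def euclidean_cluster(points, tolerance=0.58, min_size=80, max_size=100_000):
    """Cluster points whose mutual gaps are below the tolerance (growth of set Q)."""
    tree = cKDTree(points)
    unvisited = set(range(len(points)))
    clusters = []
    while unvisited:
        queue = [unvisited.pop()]       # start a new cluster from any point P
        cluster = []
        while queue:
            idx = queue.pop()
            cluster.append(idx)
            # Add every neighbor within the tolerance to the growing set Q.
            for nb in tree.query_ball_point(points[idx], r=tolerance):
                if nb in unvisited:
                    unvisited.discard(nb)
                    queue.append(nb)
        if min_size <= len(cluster) <= max_size:
            clusters.append(cluster)
    return clusters
```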
At this stage, the threshold values for the normal vector and curvature in the region growing algorithm are relatively large to include the boundary points of the buildings.
Subsequently, the building point cloud is upgraded by merging it with the facade point cloud near the ground, which is obtained by performing mask extraction on the original points instead of the non-ground points and applying an appropriate Z-axis threshold to keep only points within a certain height above the ground (Figure 9). Given that this facade point cloud may overlap with the existing building point cloud, duplicate points are removed from the merged building point cloud. Finally, we use the radius filtering algorithm to remove discrete noise points from the building point cloud.
The main idea of the radius filtering algorithm is to assume that each point in the original points contains at least a certain number of neighboring points within a specified radius neighborhood [23]. When this assumption is satisfied, the point is considered a valid point and retained. On the contrary, if the conditions are not met, it will be identified as a noise point and removed. As an example, Figure 10 specifies a radius of d. If at least one adjacent point is specified within this radius, only the blue points in the figure will be removed from the point cloud. If at least two adjacent points are specified within the radius, both the purple and black points will be removed.
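The corresponding check is short; the sketch below keeps only points with at least min_neighbors other points within radius d, matching the Figure 10 example (the radius and the neighbor count are scene-dependent choices, not values prescribed by the paper).

```python
import numpy as np
from scipy.spatial import cKDTree

def radius_filter(points, d=1.0, min_neighbors=2):
    """Keep points with >= min_neighbors other points inside radius d."""
    tree = cKDTree(points)
    # query_ball_point includes the query point itself, hence the -1.
    counts = np.array([len(tree.query_ball_point(p, r=d)) - 1 for p in points])
    return points[counts >= min_neighbors]
```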

3. Experiment Settings

3.1. Study Areas

To evaluate the performance of our proposed method, we conducted experiments on two datasets: the Urban-LiDAR dataset (https://www.lidar360.com/ accessed on 2 May 2022) and the Vaihingen dataset (http://www2.isprs.org/ accessed on 7 April 2022). The Urban-LiDAR dataset consists of 719,823 points and includes various types of objects, such as buildings, trees, and ground points, as shown in Figure 11. The terrain in this area undulates significantly, with dense vegetation and tall buildings.
The Vaihingen dataset contains 411,722 points. It is divided into two parts, Vaihi-1 and Vaihi-2, which are processed separately in this paper, as shown in Figure 12 (colored by elevation). In the Vaihingen dataset, the non-ground points comprise buildings, powerlines, low vegetation, cars, fences, hedges, shrubs, and trees; the ground points comprise impervious surfaces. The dataset was collected by the Leica ALS50 system with a point density of 4–8 points/m². The terrain in this area is relatively flat, with sparse vegetation and low buildings.

3.2. Parameter Settings

In the process of extracting the building point cloud, this paper involves some important algorithms, including the CSF algorithm, the region growing algorithm, and the Euclidean clustering algorithm. In this article, the parameters we set are mainly based on the density of points and terrain undulations. The specific parameter settings are shown in Table 1, where the parameter settings of the region growing algorithm are used for the coarse extraction stage of building points.
When using the CSF algorithm to separate ground and non-ground points, the following key parameters play an important role: (1) cloth_resolution represents the size of the grid covering the terrain, i.e., the grid resolution, which affects the precision of the generated digital terrain model (DTM); a larger cloth resolution usually leads to a rougher DTM; (2) max_iterations represents the maximum number of iterations; (3) classification_threshold represents the distance threshold between an actual point and the simulated terrain, used to divide the point cloud into ground and non-ground points.
In the coarse extraction stage of the building point cloud, the region growing algorithm is used to extract building points from non-ground points. The region growing algorithm involves the following key parameters: (1) theta_threshold represents the smoothing threshold; (2) curvature_threshold represents the curvature threshold; (3) neighbor_number represents the number of neighborhood search points; (4) min_pts_per_cluster represents the minimum number of points for each cluster; and (5) max_pts_per_cluster represents the maximum number of points in each cluster.
When using the Euclidean clustering algorithm to filter discrete tree points and obtain building points, the Euclidean clustering algorithm involves several important parameters: (1) tolerance represents the search radius of nearest neighbor search, which is the minimum Euclidean distance between two different clusters; (2) min_cluster_size represents the minimum number of cluster points; (3) max_cluster_size represents the maximum number of cluster points.
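For reference, the Urban-LiDAR settings of Table 1 can be gathered into a single configuration object; the dictionary layout below is only an illustrative convention, not part of the published method.

```python
# Urban-LiDAR parameter settings from Table 1, organized per algorithm.
URBAN_LIDAR_PARAMS = {
    "csf": {
        "cloth_resolution": 1.0,
        "max_iterations": 500,
        "classification_threshold": 2.0,
    },
    "region_growing": {
        "theta_threshold": 5,            # degrees
        "curvature_threshold": 0.05,
        "neighbor_number": 20,
        "min_pts_per_cluster": 100,
        "max_pts_per_cluster": 10_000,
    },
    "euclidean_clustering": {
        "tolerance": 0.58,
        "min_cluster_size": 80,
        "max_cluster_size": 100_000,
    },
}
# Vaihi-1 and Vaihi-2 use the same structure with the values listed in Table 1.
```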

3.3. Evaluation Indicators

This paper uses precision, recall, and the F1 score as evaluation indicators to verify the effectiveness of the proposed method in extracting building points.
Precision represents the proportion of correctly predicted building points to all predicted building points [24]:
$$ Precision = \frac{TP}{TP + FP}, \tag{9} $$
Recall represents the proportion of correctly predicted building points to actual building points [24]:
$$ Recall = \frac{TP}{TP + FN}, \tag{10} $$
The $F_1$ score is the harmonic mean of precision and recall, which is closer to the smaller of the two values [24]:
$$ F_1 = \frac{2 \times Precision \times Recall}{Precision + Recall}, \tag{11} $$
where TP represents the number of correctly predicted building points, FP the number of non-building points incorrectly predicted as building points, and FN the number of building points incorrectly predicted as non-building points.
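Computed from boolean masks of predicted and reference building points, Equations (9)–(11) reduce to a few lines; the small arrays below are a made-up example for illustration.

```python
import numpy as np

def prf(pred, truth):
    """Precision, recall, and F1 per Equations (9)-(11)."""
    tp = np.sum(pred & truth)
    fp = np.sum(pred & ~truth)
    fn = np.sum(~pred & truth)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

pred  = np.array([True, True, False, True, False])   # predicted building points
truth = np.array([True, False, False, True, True])   # reference building points
print(prf(pred, truth))   # (0.667, 0.667, 0.667): TP=2, FP=1, FN=1
```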

3.4. Benchmark Algorithm

To verify the effectiveness of the proposed method, building points obtained through manual interactive recognition were used as the reference. On the Urban-LiDAR dataset, this paper mainly analyzes the building point cloud obtained through manual interactive recognition. On the Vaihingen dataset, this paper compares the proposed method with the PointNet [25], PointNet++ [26], and HDL-JME-GGO [27] networks, which estimate test data by learning from training data (Figure 13).
The basic idea of the PointNet network is to utilize a multi-layer perceptron to capture the feature information of the point, followed by the use of maximum pooling to aggregate these point features into a global feature representation. The PointNet network is able to directly process unordered point cloud data without considering the order of points.
The PointNet++ network incorporates a hierarchical structure comprising a sampling layer, a grouping layer, and a feature extraction layer. This structure allows for the organization of each point and its surrounding neighborhood into local regions, which are then processed using the PointNet network to extract features from the corresponding point cloud. By employing this hierarchical structure, the network becomes capable of effectively learning local feature information as the context scale expands.
The HDL-JME-GGO network utilizes layered data to enhance deep feature learning using the PointNet++ network. It incorporates a joint learning method based on nonlinear manifolds to globally optimize and embed deep features into a low-dimensional space, taking into account the contextual information of spatial and deep features. It effectively addresses artifacts caused by partitioning and sampling in the processing of large-scale datasets. This network achieves global regularization by optimizing initial labels to ensure spatial regularity, resulting in locally continuous and globally optimal classification results.

4. Results

We evaluated the building extraction performance of the proposed method on the Urban-LiDAR dataset and the Vaihingen dataset. The building point cloud could be divided into two non-overlapping point clouds: the facade point cloud and the roof point cloud. The separation of facade points and roof points was achieved based on the normal vector threshold in the Z direction. The extraction results of the proposed method on Urban-LiDAR, Vaihi-1, and Vaihi-2 data are shown in Figure 14, Figure 15 and Figure 16, respectively. It was evident from the figures that the proposed method achieved a high level of accuracy in extracting building points.
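The roof/facade separation mentioned above amounts to thresholding the vertical component of each point's unit normal. A minimal sketch follows; the threshold of 0.7 is an illustrative assumption, since the paper does not report the exact value used.

```python
import numpy as np

def split_roof_facade(building_pts, normals, nz_threshold=0.7):
    """Split building points by the Z component of their unit normals."""
    is_roof = np.abs(normals[:, 2]) > nz_threshold   # near-vertical normal -> roof
    return building_pts[is_roof], building_pts[~is_roof]
```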

5. Discussion

This paper evaluated the extraction results of the proposed method on Urban-LiDAR data, as shown in Table 2. For the roofs, the proposed method yielded a precision of 98.74%, a recall of 98.47%, and an F1 score of 98.60%. For the facades, the values were 97.98%, 70.94%, and 82.30%, respectively.
In addition, we analyzed the roof extraction accuracy on the Urban-LiDAR data. As Table 3 shows, the highest precision, recall, and F1 score all reached 100% (Roof 14 and Roof 29). The lowest precision was 79.57%, with a recall of 89.13% and an F1 score of 84.08% (Roof 28). The experimental results show that the proposed method exhibits high accuracy and completeness in roof segmentation.
Although the CSF algorithm can effectively separate ground points from non-ground points, it may mistakenly identify facade points close to the ground as ground points. To solve this problem, this paper extracts masks based on the original points rather than the non-ground points and applies an appropriate Z-axis threshold to obtain the facade point cloud near the ground. As shown in Figure 17c, the facade points within the mask polygons in the original points are more complete than those in the non-ground points obtained using the CSF algorithm.
In addition, we evaluated the effectiveness of building point extraction in two different scenes from the Urban-LiDAR dataset: a complex scene and a low-density scene. Figure 18c displays the building point cloud extracted by the proposed method in the complex scene; the precision, recall, and F1 score of the roof were 98.82%, 98.38%, and 98.60%, respectively, demonstrating that the proposed method can extract building points accurately in complex scenes. Figure 19c shows the extraction results of the proposed method in the scene with low point density. The recall of the roof was only 92.02%, but the precision was 99.41% and the F1 score 95.57%. The loss occurs because the points at the building edges fluctuate significantly; even when processed with the region growing algorithm, points at those locations can still be lost.
Our proposed method was compared with three segmentation networks, PointNet, PointNet++, and HDL-JME-GGO, on the Vaihingen dataset; the performance indicators are listed in Table 4. The proposed method performed outstandingly in roof extraction, achieving a precision 20.73% higher than that of the PointNet network, while its F1 score was only 0.28% lower than that of the HDL-JME-GGO network. For facade extraction, the precision of the proposed method was 49.63% higher than that of the PointNet network and 16.53% higher than that of the PointNet++ network, and only 3.87% lower than that of the HDL-JME-GGO network. While our proposed method achieved slightly lower accuracy than the HDL-JME-GGO network, it considerably outperformed the PointNet and PointNet++ networks in extracting building points based on geometric information alone.
Because the Vaihingen dataset is composed of the Vaihi-1 and Vaihi-2 point clouds, we conducted a detailed analysis of the extraction results on the two point clouds. For roof extraction, the proposed method achieved a precision, recall, and F1 score of 91.49%, 92.32%, and 91.90% on the Vaihi-1 point cloud and 96.27%, 83.93%, and 89.68% on the Vaihi-2 point cloud, respectively (Table 5).
Furthermore, we selected 21 buildings and analyzed the roof extraction accuracy for both the Vaihi-1 point cloud and the Vaihi-2 point cloud (Table 6). For the Vaihi-1 point cloud, the proposed method achieved the highest precision, recall, and F1 score, all reaching 100%. The proposed method yielded the lowest precision, recall, and F1 score at 71.91%, 81.51%, and 76.41%, respectively. Regarding the Vaihi-2 point cloud, the proposed method achieved the highest precision (99.90%), recall (98.39%), and F1 score (99.04%). Conversely, the proposed algorithm exhibited the lowest precision (86.80%), recall (55.14%), and F1 score (71.05%). These results indicate the proposed method’s capability to achieve high-accuracy results in roof extraction.
Although the proposed method achieved high accuracy in extracting the Vaihi-1 point cloud and the Vaihi-2 point cloud, there were still some shortcomings. Due to the limitations of the CSF algorithm, it may have difficulty extracting certain roof points close to the ground, such as those points shown in the white circle in Figure 20b. In addition, it was difficult to extract building points solely based on geometric information for some roofs with significant undulations, as shown in the black circle of building points in Figure 20b and Figure 21b.

6. Conclusions

This paper proposes a highly accurate building point cloud extraction method based solely on the geometric information of points. The method is divided into two stages: coarse extraction and fine extraction. In the coarse extraction stage, a coarsely extracted building point cloud is obtained using the cloth simulation filtering algorithm and the region growing algorithm. In the fine extraction stage, the coarsely extracted building point cloud is iteratively refined using mask polygons and the region growing algorithm. The proposed method has shown excellent extraction accuracy on the Urban-LiDAR and Vaihingen datasets. On the Urban-LiDAR dataset, the method achieved a precision of 98.74%, a recall of 98.47%, and an F1 score of 98.60% for roof extraction; for facade extraction, the precision, recall, and F1 score were 97.98%, 70.94%, and 82.30%, respectively. On the Vaihingen dataset, the proposed method outperformed the PointNet network by 20.73% in roof extraction precision and achieved performance comparable to the HDL-JME-GGO network. For facade extraction, the method surpassed the PointNet network by 49.63% in precision and the PointNet++ network by 16.53%, falling only 3.87% behind the HDL-JME-GGO network. Additionally, the proposed method can still extract building points with high accuracy even when buildings are closely adjacent to trees. However, relying solely on geometric information remains challenging for roofs with strong undulations or for scenes with low point density. In future work, we will introduce additional feature information, such as color or texture, to achieve more accurate and complete building extraction.

Author Contributions

Z.S., J.P., D.F. and G.Z. designed and performed the experiments. Z.S., J.P., D.F., S.L., Y.Y. and G.Z. contributed to the manuscript writing. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (42271427), the Second Tibetan Plateau Scientific Expedition and Research (2022QZKK0101), and the Science and Technology Program of the Ministry of Public Security of China (2022JSZ09).

Data Availability Statement

Urban-LiDAR data were obtained from https://www.lidar360.com/ (accessed on 2 May 2022), and Vaihingen data were acquired from http://www2.isprs.org/ (accessed on 7 April 2022).

Acknowledgments

The authors would like to thank the anonymous referees for constructive criticism and comments.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Wang, X.; Li, P. Extraction of urban building damage using spectral, height and corner information from VHR satellite images and airborne LiDAR data. ISPRS J. Photogramm. Remote Sens. 2020, 159, 322–336. [Google Scholar] [CrossRef]
  2. Adamopoulos, E.; Rinaudo, F.; Ardissono, L. A critical comparison of 3D digitization techniques for heritage objects. ISPRS Int. J. Geo-Inf. 2020, 10, 10. [Google Scholar] [CrossRef]
  3. Xu, Y.; Stilla, U. Toward building and civil infrastructure reconstruction from point clouds: A review on data and key techniques. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 2857–2885. [Google Scholar] [CrossRef]
  4. Schrotter, G.; Hürzeler, C. The digital twin of the city of Zurich for urban planning. PFG–J. Photogramm. Remote Sens. Geoinf. Sci. 2020, 88, 99–112. [Google Scholar] [CrossRef]
  5. Tarsha Kurdi, F.; Gharineiat, Z.; Campbell, G.; Awrangjeb, M.; Dey, E.K. Automatic filtering of lidar building point cloud in case of trees associated to building roof. Remote Sens. 2022, 14, 430. [Google Scholar] [CrossRef]
  6. Martín-Jiménez, J.; Del Pozo, S.; Sánchez-Aparicio, M.; Lagüela, S. Multi-scale roof characterization from LiDAR data and aerial orthoimagery: Automatic computation of building photovoltaic capacity. Autom. Constr. 2020, 109, 102965. [Google Scholar] [CrossRef]
  7. Zou, X.; Feng, Y.; Li, H.; Zhu, J. An Adaptive Strips Method for Extraction Buildings From Light Detection and Ranging Data. IEEE Geosci. Remote Sens. Lett. 2017, 14, 1651–1655. [Google Scholar] [CrossRef]
  8. Huang, R.; Yang, B.; Liang, F.; Dai, W. A top-down strategy for buildings extraction from complex urban scenes using airborne LiDAR point clouds. Infrared Phys. Technol. 2018, 92, 203–218. [Google Scholar] [CrossRef]
  9. Hui, Z.; Li, Z.; Cheng, P.; Ziggah, Y.Y.; Fan, J.L. Building extraction from airborne lidar data based on multi-constraints graph segmentation. Remote Sens. 2021, 13, 3766. [Google Scholar] [CrossRef]
  10. Qin, R.; Fang, W. A hierarchical building detection method for very high resolution remotely sensed images combined with DSM using graph cut optimization. Photogramm. Eng. Remote Sens. 2014, 80, 37–47. [Google Scholar] [CrossRef]
  11. Acar, H.; Karsli, F.; Ozturk, M.; Dihkan, M. Automatic detection of building roofs from point clouds produced by the dense image matching technique. Int. J. Remote Sens. 2018, 40, 138–155. [Google Scholar] [CrossRef]
  12. Hron, V.; Halounová, L. Automatic reconstruction of roof models from building outlines and aerial image data. Acta Polytech. 2019, 59, 448–457. [Google Scholar] [CrossRef]
  13. Ghamisi, P.; Höfle, B.; Zhu, X.X. Hyperspectral and lidar data fusion using extinction profiles and deep convolutional neural network. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2017, 10, 3011–3024. [Google Scholar] [CrossRef]
  14. Nguyen, T.H.; Daniel, S.; Gueriot, D.; Sintes, C.; Caillec, J.M.L. Unsupervised Automatic Building Extraction Using Active Contour Model on Unregistered Optical Imagery and Airborne LiDAR Data. In Proceedings of the PIA19+MRSS19—Photogrammetric Image Analysis & Munich Remote Sensing Symposium, Munich, Germany, 18–20 September 2019; Volume XLII-2/W16. pp. 181–188. [Google Scholar] [CrossRef]
  15. Yuan, Q.; Shafri, H.Z.H.; Alias, A.H.; Hashim, S.J. Multiscale semantic feature optimization and fusion network for building extraction using high-resolution aerial images and LiDAR data. Remote Sens. 2021, 13, 2473. [Google Scholar] [CrossRef]
  16. Li, F.; Zhu, H.; Luo, Z.; Shen, H.; Li, L. An adaptive surface interpolation filter using cloth simulation and relief amplitude for airborne laser scanning data. Remote Sens. 2021, 13, 2938. [Google Scholar] [CrossRef]
  17. Provot, X. Deformation constraints in a mass-spring model to describe rigid cloth behaviour. In Graphics Interface; Canadian Information Processing Society: Mississauga, ON, Canada, 1995; p. 147. Available online: http://www-rocq.inria.fr/syntim/research/provot/ (accessed on 3 May 2022).
  18. Zhang, W.; Qi, J.; Wan, P.; Wang, H.; Xie, D.; Wang, X.; Yan, G. An easy-to-use airborne LiDAR data filtering method based on cloth simulation. Remote Sens. 2016, 8, 501. [Google Scholar] [CrossRef]
  19. Su, Z.; Gao, Z.; Zhou, G.; Li, S.; Song, L.; Lu, X.; Kang, N. Building Plane Segmentation Based on Point Clouds. Remote Sens. 2022, 12, 95. [Google Scholar] [CrossRef]
  20. Dos Santos, R.C.; Galo, M.; Carrilho, A.C. Building boundary extraction from LiDAR data using a local estimated parameter for alpha shape algorithm. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2018, 42, 127–132. [Google Scholar] [CrossRef]
  21. Shen, W.; Li, J.; Chen, Y.; Deng, L.; Peng, G. Algorithms study of building boundary extraction and normalization based on LiDAR data. J. Remote Sens. 2008, 05, 692–698. [Google Scholar] [CrossRef]
  22. Sun, Z.; Li, Z.; Liu, Y. An improved lidar data segmentation algorithm based on euclidean clustering. In Proceedings of the 11th International Conference on Modelling, Identification and Control, Tianjin, China, 13–15 July 2019; Springer: Singapore, 2020; pp. 1119–1130. [Google Scholar] [CrossRef]
  23. Xu, Z.; Yan, W. The Filter Algorithm Based on Lidar Point Cloud. Inf. Commun. 2018, 3, 80–82. [Google Scholar] [CrossRef]
  24. Li, W.; Wang, F.; Xia, G. A geometry-attentional network for ALS point cloud classification. ISPRS J. Photogramm. Remote Sens. 2020, 164, 26–40. [Google Scholar] [CrossRef]
  25. Charles, R.Q.; Su, H.; Kaichun, M.; Guibas, L.J. PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 77–85. [Google Scholar] [CrossRef]
  26. Charles, R.Q.; Yi, L.; Su, H.; Guibas, L.J. PointNet++: Deep hierarchical feature learning on point sets in a metric space. In Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA, 4–9 December 2017; Volume 30. [Google Scholar] [CrossRef]
  27. Huang, R.; Xu, Y.; Hong, D.; Yao, W.; Ghamisi, P.; Stilla, U. Deep point embedding for urban classification using ALS point clouds: A new perspective from local to global. ISPRS J. Photogramm. Remote Sens. 2020, 163, 62–81. [Google Scholar] [CrossRef]
Figure 1. Workflow of the building point cloud extraction.
Figure 2. Visualization flowchart of the building point cloud extraction.
Figure 3. The point cloud is divided into ground points and non-ground points using the CSF filtering algorithm (ground points are displayed in dark yellow, and non-ground points in blue).
Figure 4. Plane segmentation results using the region growing algorithm.
Figure 5. (a) Ground truth; (b) coarse extraction results using the region growing algorithm, with buildings in red, trees in green, and ground points in dark yellow.
Figure 6. Mask polygon extraction using a combination of the Alpha Shape algorithm and the neighborhood expansion method.
Figure 7. Calculation of the center coordinates of a circle based on the distance intersection method.
Figure 8. Polygonal connection based on the polar angles.
Figure 9. Misclassification of building points using the CSF algorithm within the red circle, with ground points in dark yellow and non-ground points in blue.
Figure 10. Radius filtering algorithm.
Figure 11. Urban-LiDAR dataset.
Figure 12. Vaihingen dataset. (a) Vaihi-1 data; (b) Vaihi-2 data.
Figure 13. Training data. Ground points are in dark yellow; the facades are in purple; the roofs are in red; and other elements are in green.
Figure 14. Urban-LiDAR's extraction results: ground points are in dark yellow; tree points in green; facades in purple; roofs in red. (a) Ground truth; (b) the extraction results using the proposed method.
Figure 15. Vaihi-1's extraction results: ground points are in dark yellow; tree points in green; facades in purple; roofs in red. (a) Ground truth; (b) the extraction results using the proposed method.
Figure 16. Vaihi-2's extraction results: ground points are in dark yellow; facades in purple; roofs in red. (a) Ground truth; (b) the extraction results using the proposed method.
Figure 17. (a) Facade points within mask polygons in the original points; (b) facade points within mask polygons in the non-ground points; (c) the overlay of (a,b).
Figure 18. Building extraction results in the complex scene: (a) original data; (b) label data; (c) the building extraction results using the proposed method.
Figure 19. Extraction of buildings with low point density: (a) original point cloud; (b) manually delineated reference building points; the integration of texture information into data collected by unmanned aerial vehicles (UAVs) may introduce errors, as exemplified by the points highlighted in blue, which should ideally be categorized as building points; (c) the extracted building points using the proposed method.
Figure 20. Vaihi-1 data. (a) Label of Vaihi-1; (b) Vaihi-1's extraction results using the proposed method.
Figure 21. Vaihi-2 data. (a) Label of Vaihi-2; (b) Vaihi-2's extraction results using the proposed method.
Table 1. Parameter settings of some important algorithms.

| Algorithm | Parameter | Urban-LiDAR | Vaihi-1 | Vaihi-2 |
|---|---|---|---|---|
| CSF algorithm | cloth_resolution | 1.0 | 0.3 | 1.0 |
| | max_iterations | 500 | 500 | 500 |
| | classification_threshold | 2.0 | 1.5 | 2.2 |
| Region growing algorithm | theta_threshold | 5 | 30 | 10 |
| | curvature_threshold | 0.05 | 0.05 | 0.03 |
| | neighbor_number | 20 | 15 | 30 |
| | min_pts_per_cluster | 100 | 40 | 50 |
| | max_pts_per_cluster | 10,000 | 10,000 | 10,000 |
| Euclidean clustering algorithm | tolerance | 0.58 | 1.5 | 1.25 |
| | min_cluster_size | 80 | 180 | 180 |
| | max_cluster_size | 100,000 | 10,000 | 15,000 |
Table 2. Accuracy assessment of Urban-LiDAR's extraction (%).

| | Precision | Recall | F1 Score |
|---|---|---|---|
| Roof | 98.74 | 98.47 | 98.60 |
| Facade | 97.98 | 70.94 | 82.30 |
Table 3. Accuracy assessment of Urban-LiDAR's roof extraction (%).

| ID | Precision | Recall | F1 Score |
|---|---|---|---|
| 0 | 99.54 | 99.77 | 99.66 |
| 1 | 98.25 | 98.92 | 98.58 |
| 2 | 99.80 | 98.42 | 99.11 |
| 3 | 96.05 | 98.00 | 97.02 |
| 4 | 97.19 | 98.56 | 97.87 |
| 5 | 95.22 | 95.62 | 95.42 |
| 6 | 99.85 | 99.80 | 99.82 |
| 7 | 100 | 98.14 | 99.06 |
| 8 | 84.08 | 91.31 | 87.55 |
| 9 | 98.72 | 98.88 | 98.80 |
| 10 | 98.68 | 97.35 | 98.01 |
| 11 | 98.82 | 98.38 | 98.60 |
| 12 | 98.00 | 98.52 | 98.26 |
| 13 | 99.50 | 97.70 | 98.59 |
| 14 | 100 | 100 | 100 |
| 15 | 99.12 | 96.64 | 97.86 |
| 16 | 98.79 | 97.65 | 98.22 |
| 17 | 99.94 | 99.32 | 99.63 |
| 18 | 96.67 | 98.28 | 97.47 |
| 19 | 88.47 | 93.46 | 90.90 |
| 20 | 93.29 | 96.37 | 94.80 |
| 21 | 99.87 | 97.78 | 98.81 |
| 22 | 99.62 | 99.17 | 99.39 |
| 23 | 97.69 | 97.96 | 97.82 |
| 24 | 99.41 | 92.02 | 95.57 |
| 25 | 97.46 | 92.42 | 94.87 |
| 26 | 96.18 | 98.06 | 97.11 |
| 27 | 98.91 | 98.68 | 98.79 |
| 28 | 79.57 | 89.13 | 84.08 |
| 29 | 100 | 100 | 100 |
| 30 | 92.76 | 95.66 | 94.19 |
Table 4. Accuracy assessment of Vaihingen's extraction (%).

| Algorithm | Indicator | Roof | Facade |
|---|---|---|---|
| PointNet | Precision | 73.0 (↑20.73) | 10.7 (↑49.63) |
| | Recall | 82.2 | 0.1 |
| | F1 score | 77.6 | 5.4 |
| PointNet++ | Precision | 92.8 | 43.8 (↑16.53) |
| | Recall | 81.0 | 38.3 |
| | F1 score | 86.9 | 41.0 |
| HDL-JME-GGO | Precision | 92.8 | 64.2 (↓3.87) |
| | Recall | 89.3 | 24.2 |
| | F1 score | 91.1 (↓0.28) | 44.2 |
| The proposed method | Precision | 93.73 | 60.33 |
| | Recall | 88.08 | 27.33 |
| | F1 score | 90.82 | 37.62 |
Table 5. Accuracy assessment of Vaihi-1 and Vaihi-2's extraction (%).

| | Precision (Vaih-1) | Precision (Vaih-2) | Recall (Vaih-1) | Recall (Vaih-2) | F1 Score (Vaih-1) | F1 Score (Vaih-2) |
|---|---|---|---|---|---|---|
| Roof | 91.49 | 96.27 | 92.32 | 83.93 | 91.90 | 89.68 |
| Facade | 58.33 | 61.45 | 17.77 | 38.36 | 27.24 | 47.23 |
Table 6. Accuracy assessment of Vaihi-1 and Vaihi-2's roof extraction (%).

| Roof ID | Precision (Vaih-1) | Precision (Vaih-2) | Recall (Vaih-1) | Recall (Vaih-2) | F1 Score (Vaih-1) | F1 Score (Vaih-2) |
|---|---|---|---|---|---|---|
| 0 | 100 | 86.80 | 100 | 91.83 | 100 | 89.24 |
| 1 | 88.94 | 98.04 | 93.87 | 90.77 | 91.34 | 94.27 |
| 2 | 100 | 92.65 | 99.80 | 92.45 | 99.90 | 92.55 |
| 3 | 97.83 | 97.91 | 99.77 | 95.02 | 98.79 | 96.44 |
| 4 | 99.45 | 99.90 | 99.73 | 94.43 | 99.59 | 97.09 |
| 5 | 97.75 | 99.88 | 97.61 | 78.13 | 97.68 | 87.68 |
| 6 | 99.39 | 94.78 | 95.46 | 84.80 | 97.39 | 89.51 |
| 7 | 100 | 99.88 | 95.58 | 55.14 | 97.74 | 71.05 |
| 8 | 99.02 | 99.41 | 99.18 | 68.67 | 99.10 | 81.23 |
| 9 | 71.91 | 99.72 | 81.51 | 94.67 | 76.41 | 97.13 |
| 10 | 98.52 | 99.29 | 98.89 | 94.40 | 98.70 | 96.78 |
| 11 | 100 | 97.21 | 100 | 93.43 | 100 | 95.28 |
| 12 | 98.14 | 99.70 | 86.17 | 98.38 | 91.77 | 99.04 |
| 13 | 98.18 | 96.57 | 96.83 | 97.70 | 97.50 | 97.13 |
| 14 | 98.96 | 99.55 | 95.65 | 98.31 | 97.28 | 98.93 |
| 15 | 99.43 | 99.84 | 99.15 | 87.41 | 99.29 | 93.21 |
| 16 | 100 | 99.07 | 99.76 | 98.05 | 99.88 | 98.56 |
| 17 | 99.29 | 99.23 | 99.29 | 90.44 | 99.29 | 94.63 |
| 18 | 100 | 99.16 | 89.25 | 98.39 | 94.32 | 98.77 |
| 19 | 96.92 | 97.68 | 98.43 | 91.75 | 97.67 | 94.62 |
| 20 | 97.35 | 96.92 | 96.89 | 97.55 | 97.12 | 97.23 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
