**1. Introduction**

Significant acceleration of the urbanization rate is contributing to a persistent vertical expansion in many cities in order to meet the increasing demand for living and working space [1]. In this context, the three-dimensional form of a city has important implications for a city's sustainability, efficiency, and resilience [2]. Considering the expected urban expansion in the near future across the globe [3], understanding the relation between the three-dimensional urban form, critical infrastructure operations, and quality-of-life measures is one of the most important research and technical challenges. The growing interest in generating 3D city models is motivated by a broad range of applications, such as estimation of solar irradiation, energy demand estimation, classification of building types, visibility analysis, 3D cadastre, visualization for navigation, urban planning, emergency response, computation fluid dynamics, change detection, flooding, archaeology, forest management, and virtual tours [4].

Moreover, a decrease in the cost of remote sensing technology and data storage in recent years has contributed to the expansion of urban morphology studies [4,5]. In particular, the emergence of airborne laser, a leading technology for the extraction of information about physical surfaces, enables substantial advances of in-depth availability of data on buildings and infrastructure, as well as over large scales. By directly providing measurements of surface heights with high point density and high level of accuracy, light detection and ranging (LIDAR) technology can improve the automation level in accurate and e fficient 3D reconstruction of urban models [6–9]. The 3D information about buildings and related structures that can be retrieved from the data acquired by airborne LIDAR is usually characterized by directness and simplicity. However, practice shows that the massive number of points requires introducing some level of organization into the data before extraction of information can become e ffective (e.g., aggregating points with similar features into segments in the form of surfaces) [10]. Therefore, producing 3D building reconstruction by manual or semi-automatic methods could be a very time consuming and challenging task. Hence, the generation of 3D building models in a simple and quick way is becoming attractive [11]. Indeed, in the last decade, the automatic 3D reconstruction of buildings from airborne data is an active area of research among photogrammetry, computer graphics, and remote sensing communities [3,5,12–14].

Among the grea<sup>t</sup> variety of reconstruction methods from airborne laser scanning (ALS) proposed in literature (see [3] for a complete review), the data-driven polyhedral method is one of the most commonly used rooftop modeling techniques and can be adapted for generating building models with both simple and complex roof topology [5]. Moreover, polyhedron buildings are quite common in urban areas [7]. These methods use a bottom-up approach that begins with the extraction of primitives (e.g., planes, cylinders, cones and spheres), followed by analysing primitive topology in 3D space, extracting and grouping them to form building models. In summary, the problem of building rooftop reconstruction based on a data-driven framework is transformed into a problem of consistency maintenance of topological relationships among rooftop primitives, primitive boundaries or their combinations: The common assumption is that the building has only planar roofs, based on which various model detection methods in pattern recognition can be adopted for building extraction.

The 3D reconstruction process from airborne LIDAR data is principally based on the segmentation of the raw data sets into building points; the segmentation refers to the task of dividing data into non-overlap homogeneous regions that constitute the complete data sets [7]. The e fficiency and accuracy of the segmentation method results are one of the major challenges in 3D building reconstruction. Region growing methods and the clustering algorithm are the most used in literature for segmentation purpose. Region growing approaches (e.g., [15]) usually start with a selected seed point, calculate its properties, and compare them with adjacent points based on certain connectivity measures to form the region. Alternatively, the cluster techniques (e.g., [7,11]) first summarize the variability in the data by computing the attributes for all points and then group data that cluster together. Each point in the point cloud is classified into one of the clusters of predefined number based on its distances to the clusters' centroid. Other approaches, such as the Hough Transform [16] and random sample consensus (RANSAC) [17], can be used to extract straight lines from boundary points. Finally, the voxel-based algorithm (e.g., [18]) divides a point cloud into voxels with equal size, then the neighboring voxels with elevation di fferences of less than a threshold are classified iteratively into the same subset and segmented from other points.

This study aims to present an investigation and critical comparison of two di fferent fully automatic approaches for roof segmentation used in 3D building reconstruction. In particular, we present a stable solution approach (a), described in Section 2.1, for building roof extraction based on a fuzzy c-means [19] that uses a potential-based clustering method for initial clusters center determination and clusters number determination [20,21]. At the end of the clustering processes, a density-based

and connectivity analysis, as proposed by [7], is used to improve the results of the above clustering process through separation of the planar and coplanar planes. Moreover, a second approach (b), based on a region growing segmentation method [22] using smoothness constrain and curvature consistency refined with application of RANSAC [23] to remove any potential over-segmentation issues, is described in Section 2.2.

The roofs extracted by these two segmentation approaches were used for 3D building reconstruction. After the extraction of the boundary points, as described in [24], a 2.5D dual-contouring approach, proposed by [25], was adopted to create vertical walls connecting the extracted rooftops to the ground.

Both the proposed approaches have been evaluated (Section 3) in terms of geometry accuracy against the real measurements in two di fferent case studies (in terms of types of urban development and ALS data input data resolution) in Matera (Italy) and Toronto (Canada). The results indicate that both approaches have precisely reconstructed the geometric features of the test building preserving topology. In particular, the approach (b) based on region growing segmentation has exhibited slightly better performance but required a computational time that is double that of the clustering-based approach (a). Finally, Section 4 presents the conclusion and main remarks on the investigation presented in this paper.

## **2. Materials and Methods**

This section describes in detail the data used in this study (see Section 2.1) and the steps of the workflow pipeline used in the two proposed fully automatic segmentation approaches (see Sections 2.2 and 2.3) that process airborne LIDAR point cloud data for the purpose of building modeling and 3D reconstruction, as described in the Section 2.3.

## *2.1. LIDAR Data Set*

The first data set was captured over downtown Toronto (Canada). Optech's ALTM-ORIONM was used to acquire the ALS data at a flying height of 650 m in six strips with a point density of about 6 points/m2. The area contains a mixture of low- and high-storey (58) buildings, showing various degrees of complexity in rooftop structures. The reference for building detection and for 3D building reconstruction was generated by stereo plotting. The accuracy of well-defined points is 20 cm in planimetry and 15 cm in height. For more details, refer to [26] and the web site of the test [27]. The scene also contains trees and other urban objects.

The second data set was captured over a complex building in downtown Matera (Italy). The LIDAR survey in the historical center of Matera (Italy) (Figure 1a,b) was carried out by GEOCART S.p.A. using a full-waveform scanner [28], RIEGL LMS-Q560 on board a helicopter to obtain a higher spatial resolution. The flight was operated with a share of around 400 m, a speed of 25.7 m/s, and an opening angle at 60◦. The scanner acquired data in the direction SN–EW, with a divergence of the radius 0.5 mrad, and a pulse repetition rate at 180,000 Hz. The average point density value of the dataset was about 30 points/m2. The accuracy was 25 cm in x, y and 10 cm in z (altitude). The raw data of a small tile extracted by the survey have been orthorectified and radiometrically corrected in order to provide a ready-to-use point cloud to realize as output a group of watertight mesh models that could be used for various applications, such as energy demand estimation, classification of building types, visibility analysis, 3D cadastre, visualization for navigation, urban planning, emergency response, or flooding [4]. The LIDAR data is provided as a group of unorganized discrete points in which each individual point has x, y and z value, plus the intensity value that represents the reflective proprieties of surface encountering (Figure 1b).

**Figure 1.** Clustering and segmentation of roof light detection and ranging (LIDAR) points. (**a**) orthopotho provided by GEOCART; (**b**) LIDAR points; (**c**) clustered points; (**d**) Planar segments.

#### *2.2. Roof Segmentation Clustering Approach*

In 3D building modeling, the segmentation process generally aims to find which LIDAR cloud points belong to which specific rooftop segments and to represent them with as many details as possible. In particular, the segmentation process begins with exploration of the proprieties of local distribution of the points' normal vector (*Ni*) that uniquely determines the direction of a roof plan in order to return the planes with the same normal vectors.

This section describes an approach in which this is treated as a cluster problem, as widely proposed in literature studies (e.g., [6,7,19,20]). In particular, the fuzzy c-means method was used to determine the clusters (see Figure 1c). This method belongs to the partitioning methods clustering category, relocating iteratively data points among various clusters until they achieve the greatest distinction. In this method, a data point does not belong to only one cluster entirely. Instead, it can belong to any of the clusters at a certain degree of belonging estimated measuring the similarity, that is, the inverse distance measure of each data point to the cluster centers.

The fuzzy c-means algorithm requires determination of the number of clusters and their approximate centers in order to start the iterative computation. In this context, the LIDAR data is pre-processed for c-means clustering using a potential-based clustering approach. The point with high potential (i.e., the highest number of data points within its (fixed-distance) sphere of influence) is used to determine the first cluster center; the potential (*Pfi*) of data point (*Ni*) is calculated as:

$$P\_i^f = \sum\_{i=1}^j e^{\frac{\|\mathbf{-}\mathbf{-}\_f^\mathbf{H} \| \mathbf{N}\_i - \mathbf{N}\_j \|^2 \|}{\tau}},\tag{1}$$

where *j* is the number of data points, *Nj* is the *j*th data point and *rf* is the radius of the point *Ni* sphere of influence.

The other cluster center's potential (*Poi* ) is then estimated based on the distance to the previously selected cluster centre(s) to reduce the possibility of two cluster centers being close:

$$P\_i^o = P\_i^f - P\_f^\* e^{\left\|{-\frac{4}{r\_o^2} \left\|{N\_i - N\_f^\*}\right\|^2} \right\|},\tag{2}$$

where *N*∗*f* and *P*∗*f* are the previously selected center and its potential, respectively. To avoid obtaining closely spaced cluster centers, we set *r*0 to be somewhat greater than *rf* , i.e., *r*0 = 1.5 *rf* .

This process stops when the cumulative potential *Poi* reaches a threshold that is below 15% of the first potential *Pfi* . If the potential falls between 15% and 50% of the first potential, we check if the data point provides a good trade-off between having a sufficient potential and being sufficiently far from existing cluster centers, as evaluated by [19].

This algorithm has been implemented iteratively, changing the value of the radius of the sphere of influence. This results in a set of scenarios depending on the radius *r* of the sphere of influence utilized, where the final number of the clusters is inversely proportional with the magnitude of the *r*. Selection of the best approximation value of the cluster numbers and the cluster center positions was determined by the likelihood (i.e., compactness) of each cluster, calculated as in the following equation:

$$d = \sum\_{i=1}^{c} \frac{d^i}{c} \, \, ^i \, \tag{3}$$

where *di* is the mean distance in cluster *i* of data points to its respective clusters and *c* the number of clusters.

Finally, rooftop segmentation was refined by the separation of parallel and coplanar planes as well, as proposed in [7], because the planes may have roof segments that are parallel to each other or roof segments that are mathematically the same but are separated spatially (see Figure 1d).

The flow of the proposed approach is depicted in Figure 2.

**Figure 2.** Schematic representation of the workflow of the clustering segmentation approach described in Section 2.2.

#### *2.3. Roof Segmentation Region Growing Approach*

The second approach uses the region growing segmentation method proposed by [24] in order to describe each individual building rooftop with the best spatial detail possible (see Figure 3). This region growing segmentation process uses the point normals (*Ni*) and their curvatures (*Ci*) to detect every significant feature on the rooftop:

$$\mathcal{C} = \frac{\lambda\_1}{\lambda\_1 + \lambda\_2 + \lambda\_3} \tag{4}$$

where λ is the eigenvalue of the LIDAR points subdivided by their three dimensionalities.

**Figure 3.** Region growing approach. (**a**) Segmentation of roof light detection and ranging (LIDAR) points displayed in Figure 1b. (**b**) 3D building reconstruction.

The process examines the points surface smoothness and picks the point with the smallest curvature value as a seed point. The algorithm then examines the local connectivity among the neighboring points, grouping the points with direction similar to the seed point normal, that is, lesser than a predetermined threshold (the angular difference threshold applied here is equal to 4◦).

Among those points which have been grouped together by the seed point, points with curvature values lower than a predetermined threshold (equal to 0.01) are chosen as future seed points. The procedure continues in the same fashion and stops when all points have been visited. Finally, for each segmented region, RANSAC is applied to fit a virtual plane from the candidate points and then the points are forced to move on to this estimated plane in order to assign a perfect flatness property to each surface [24]. The main steps of the proposed approach are depicted in Figure 4.

**Figure 4.** Schematic representation of the steps of the region-growing segmentation approach described in Section 2.3.
