1. Introduction
Road safety has been a major public safety concern for many years. According to the 2015 global status report on road safety, the number of road traffic deaths worldwide remains unacceptably high at 1.25 million per year [1]. To support road safety inspections in Europe, the European Union adopted a road infrastructure safety management directive [2]. The EU will eventually have a role in the safety management of the roads belonging to European transportation networks, which are set to encompass 90,000 km of motorway and high-quality roads by 2020, through safety audits at the design stage and regular safety inspections of the network [3]. Similar policies have also been made in the U.S. In 2014, the U.S. Department of Transportation developed a system named Model Inventory of Roadway Elements (MIRE) to inventory road furniture and improve roadway safety [4]. Road safety can be enhanced by the inventory of road furniture, which strongly depends on road furniture detection [5].
Another topic of rising general interest is autonomous driving, which improves driving safety and makes life more convenient. In addition, it enables people who cannot drive, such as the aged and disabled, to use a vehicle [6]. Although autonomous driving systems do not fully rely on precise 3D maps, such maps are still crucial for improving the safety and stability of these systems. Road furniture detection, which consists of road detection, curbstone detection, pole-like road furniture interpretation, and so forth, plays an important role in both road furniture inventory and highly precise 3D mapping.
Currently, road furniture inventory mainly relies on visual inspection or semi-manual interpretation, which is time-consuming and tedious. To facilitate this procedure, methods for automatic road furniture detection relying on high-quality data are needed. Three-dimensional road scene data are captured with Mobile Mapping Systems (MMS), which have developed rapidly in recent years. They are often mounted on a vehicle and mainly composed of four parts: Light Detection and Ranging (LiDAR) sensors that collect 3D point clouds, cameras that capture 2D imagery, an accurate Global Positioning System (GPS) that records the position of the previous two sensors, and an Inertial Measurement Unit (IMU) that measures the pose of the sensors. Compared with optical image acquisition, Mobile Laser Scanning (MLS) data collection is not restricted to favorable illumination conditions such as good weather and daytime. Moreover, MLS data can be obtained rapidly and accurately. Significant progress has been made in laser scanning research related to road furniture inventory, including road detection and modelling [7,8,9], curbstone mapping [10], railway modelling [11,12], pole-like road furniture interpretation [13,14,15,16,17,18,19,20,21], tree inventory [22], and building detection [23,24,25,26]. MLS data analysis has become popular and essential for urban road environment analysis.
Individual road furniture can be interpreted as a single object or as multiple connected parts. For instance, a streetlight with an attached traffic sign can be recognized as a streetlight at the object level. It can also be interpreted as multiple connected parts: a pole connected to a traffic sign and a streetlight. Numerous studies have been carried out on road furniture interpretation on the level of single objects, while only a little attention has been paid to the more detailed segmentation of road furniture based on spatial relations between object parts. Such a detailed segmentation, however, is necessary since it provides more detailed partition information for the interpretation of road furniture. The objective of this paper is to propose an automatic framework for road furniture decomposition.
The rest of the paper is organized as follows. Section 2 reviews related work on road furniture interpretation and the state of the art in point cloud semantic segmentation. Section 3 describes our method in three main stages: the initial road furniture detection, decomposition, and final road furniture detection. Section 4 introduces the test sites, analyzes the experimental results, and compares them with other work. Section 5 presents the conclusions and possible future work.
2. Related Work
The reliability of MLS systems has improved remarkably in recent years. The accuracy of advanced mobile laser scanning systems is reported to be as high as 5 mm at ranges under 30 m [27]. Their ability to acquire 3D structural information of road objects, which 2D optical cameras cannot directly capture, is a key advantage. An important property of MLS data is the varying point density, which impacts the performance of 3D point cloud interpretation algorithms because it is difficult to find a reasonable neighborhood size for feature extraction under these conditions. Two factors affect the point density of MLS data. The first is the distance between the scanner and the recorded object, which changes the point density along the scanline direction. The second is the vehicle speed, which adds to the variation of the point density between different scanlines. Accounting for these properties, numerous studies on MLS data interpretation have been performed. Work on 3D object interpretation is summarized here in two categories: unsupervised methods and supervised methods.
Much research in recent years has focused on road furniture interpretation in point clouds using unsupervised methods [13,14,16,17,28,29].
Pu et al. [14] proposed a percentile-based method to recognize pole-like structures from mobile laser scanning data for road inventory studies. First, they used road parts to partition the unorganized MLS data. Then they divided the point clouds into three parts: ground points, points connected to the ground, and off-ground points. Finally, they used knowledge-based methods to recognize the above-ground segments. The method was able to detect 61–87% of poles in the point clouds. Among all the subclasses, the detection rate of trees was the lowest (29.5–63.5%). However, this method has problems detecting jointly connected pole-like objects, such as trees connected with pole-like objects. Li and Oude Elberink [16] optimized the method of Reference [14] by adding reflectance information. Because of the use of reflectance and pulse count information, the rates of street sign detection and tree detection were largely improved. However, connected pieces of road furniture cannot be detected and recognized by these two methods.
Lehtomäki et al. [13] represent an early attempt to detect vertical pole objects by using scan line segmentation and cylinder fitting. First, they extracted potential sweeps from the data, removing long segments and keeping short segments. Then the segments were grouped based on their profile information and horizontal plane position, and clusters belonging to one pole were merged by a distance and orientation check. Finally, these clusters were classified into poles and non-poles according to their properties, such as height, shape, and orientation. The detection rate of the poles was 77.7% and the correctness was 81.0%. This method has difficulty detecting complex pole-like objects that contain many points in the outer cylinder, such as slanted pole-like road furniture or traffic signs that consist of many signboards. With a similar cylinder model mask, Cabo et al. [17] proposed a voxel-based algorithm to detect pole-like road furniture objects from MLS data. In order to make the point clouds more uniform, they first voxelized the point clouds into grids. Then they analyzed the three-dimensional information using two concentric cylinders. Finally, pole-like objects were classified from the point clouds. Although this method achieved good results, some limitations remain, such as the detection of poles that are too close to bushes or guardrails. Nurunnabi et al. [28] utilized robust diagnostic principal component analysis (RDPCA) in combination with region growing to segment road furniture into different parts. However, the saliency features used in this method strongly rely on a dense point cloud.
Several scholars have put effort into road furniture recognition by introducing supervised approaches [18,30,31,32,33,34,35,36].
Golovinsky et al. [30] utilized a normalized cut method to localize objects of interest. Different machine learning techniques in combination with shape features were used to classify these above-ground objects into sixteen categories. In the work of Munoz et al. [31], a functional gradient approach was proposed to label mobile mapping data by using Max-Margin Markov Networks (M3Ns). This method was tested for both 3D point cloud classification and geometric surface estimation in 2D images. However, it still has difficulty separating and classifying conjunctions that contain multiple objects.
Different from the methods mentioned above, Huang and You [18] proposed a method that uses a Support Vector Machine (SVM) to classify road furniture into four categories. In order to localize pole-like objects, they implemented point cloud slicing, clustering, pole seed generation, and bucket augmentation. In the following stage, ground points were removed and the pole-like objects were extracted. In the end, an SVM trained on six features classified road furniture into four types. The detection rate of pole-like road furniture was 75%.
Soilán et al. [34] used mobile laser scanning data in combination with images to recognize traffic signs. In the first stage, the ground points were removed by using height and intensity constraints. Traffic signs were extracted with a reflectance threshold estimated by a Gaussian Mixture Model (GMM). Then, geometric parameters of the extracted traffic signs were generated. Based on these geometric inventories, the extracted components were projected onto the corresponding camera systems, after which traffic signs could be found on the corresponding 2D images. In the end, an SVM trained on Histogram of Oriented Gradients (HOG) features was used to recognize traffic signs. More than 85% of pole-like traffic signs were detected. However, the detection of traffic signs in this method strongly relies on the reflectance information.
Hackel et al. [35] described an efficient and effective method for point-wise semantic classification, which can deal with point clouds captured by LiDAR or derived from photogrammetric reconstruction with strong density variations. Instead of computing optimal neighborhoods for each point, they down-sample the entire point cloud to generate a multi-scale pyramid with decreasing point density and compute features for every voxel at every scale level. The features are then used for training and the discriminative ones are selected. Finally, Random Forest (RF) is used for semantic labeling. As a drawback, RF cannot make use of contextual information. The precision of pole-like road furniture recognition is less than 35%.
The Bag of Words (BoW) and Deep Boltzmann Machine (DBM) methods were applied by Yu et al. [36] to detect and recognize traffic signs in MLS data. The authors first constructed a visual word vocabulary by using features encoded by a DBM model to detect traffic signs. Similar to Li et al. [29], they separated the poles and traffic signs, which were projected to 2D images afterward. Lastly, the cropped traffic sign images were recognized by a pre-trained DBM model. More than 90% of pole-like road furniture was detected in the high-quality MLS data. A 3D convolutional neural network was proposed by Huang and You [33] to interpret urban scenes into seven categories. They first used small grids to densely voxelize the original point cloud. Then a convolutional neural network was trained on the resulting feature maps to label the point clouds. In total, 87% of pole-like road furniture was identified.
Compared to unsupervised methods, supervised methods are more flexible in multi-class identification problems. Normally, supervised methods need a lot of training data. For one-class detection problems, unsupervised methods are more practical, especially when there is a limited amount of training data. How to leverage these two methods or combine them still needs to be explored. In this paper, we use a knowledge-driven method—in which generic rules are defined—to detect and decompose road furniture.
All in all, significant progress has been achieved on 3D object classification and 3D scene labeling. Nevertheless, no attention has been paid to the road furniture partitioning based on shape constraints of different components, which is the innovation presented by this paper.
3. Methodology
In this section, the three main stages of our framework are presented: the initial road furniture detection (Section 3.1), decomposition (Section 3.2), and the final road furniture detection (Section 3.3). First, road furniture is extracted from unorganized mobile laser scanning points. In the following stage, the detected road furniture objects are decomposed into poles and attachments based on their spatial relations. Road furniture detection is then refined by the feedback of road furniture decomposition (Section 3.3). In addition, we introduce an approach to evaluate the result of the decomposition, which is explained in Section 3.4. The workflow of our framework is illustrated in Figure 1.
3.1. Initial Detection of Road Furniture
The main objective of this paper is to segment road furniture within a certain distance from the laser scanners. Points captured far from the scanners are difficult to segment because of their low point density. Therefore, we defined a distance threshold to remove points that are far away from the road (trajectory line), which also helps to reduce the computation cost. At this stage, we started with pre-processing, which divided the unstructured data into road parts, removed the ground points, and obtained the above-ground components. Then, an initial classification was performed to remove dynamic objects and detect building components. In the last step, a slice-based method extracted pole-like road furniture, trees, and pole-like road furniture connected with trees, and objects inside buildings were excluded by occlusion analysis. The workflow of the initial pole-like road furniture detection is shown in Figure 2.
3.1.1. Pre-Processing
The original mobile laser point cloud files are often very large and cause computation and memory problems when processed in one go. To circumvent these difficulties, we split the unorganized point cloud into road parts along the trajectory line as described in Reference [14]. In this paper, the length and width of a road part are specified as 50 m and 40 m, respectively. Because a piece of road furniture could otherwise be cut into two parts at a boundary, neighboring road parts overlap by 5 m.
Ground points are the main connection between different above-ground objects. In order to separate different objects, ground points should be removed first. After splitting the point cloud into road parts, the ground points were removed in each road part. As road surfaces are smooth, the height difference between nearby points within a certain neighborhood is small. In contrast, for above-ground objects, there is a large height difference between nearby points within a certain neighborhood. We calculated the height difference as the difference between the maximum height and the minimum height within a point's neighborhood [37]. In this paper, the neighborhood size was set to be quite large (e.g., 100 nearest neighboring points) because of the high point density of the ground points. When the point cloud is very sparse, it should be set lower. The height difference threshold was set to 0.15 m. Points whose height difference was below this threshold were removed as ground points. Compared to surface growing, this method proved to be more efficient.
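The height-difference test described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `remove_ground`, the brute-force neighbour search, and the toy scene are hypothetical, and only the 0.15 m threshold and the default of 100 neighbours come from the text. It requires Python 3.8 or later for `math.dist`.

```python
import math

def remove_ground(points, k=100, max_height_diff=0.15):
    """Keep points whose k-nearest-neighbour height range reaches max_height_diff."""
    above_ground = []
    for p in points:
        # Brute-force k nearest neighbours in 3D; the point itself is included.
        nearest = sorted(points, key=lambda q: math.dist(p, q))[:k]
        heights = [q[2] for q in nearest]
        if max(heights) - min(heights) >= max_height_diff:
            above_ground.append(p)
    return above_ground

# Toy scene (hypothetical): a flat 5 x 5 ground patch and a 2 m pole set apart from it.
ground = [(float(x), float(y), 0.0) for x in range(5) for y in range(5)]
pole = [(10.0, 10.0, i * 0.1) for i in range(20)]
# A small k is used here only because the toy scene is tiny.
remaining = remove_ground(ground + pole, k=5)
```

A real point cloud would need a spatial index (for example a k-d tree) instead of the quadratic search used here.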
After the ground points were removed, the remaining above-ground points were clustered by a connected component analysis. This resulted in the above-ground objects used for the initial classification.
3.1.2. Initial Classification
During the collection of MLS data, many moving objects and pedestrians were scanned. These unrelated objects should be removed to reduce false positive detections of road furniture. As mentioned in the introduction, our objects of interest were road-side objects which were assumed to have traffic functionalities. However, large building façades also frequently occur in road environments, and removing them reduces the computational effort for road furniture detection. On the other hand, building façades can provide useful information for pole-like road furniture detection. For instance, some pole-like objects inside buildings can be eliminated by using façade information. In this research, dynamic objects, buildings, and fences that are not related to road furniture were labeled as such at this stage and removed before decomposition.
In the initial classification, we used the method described in Reference [37] to remove dynamic objects. A component was labeled as a dynamic object if more than 90% of its points only had k-neighbors from the same scanner. This criterion can only be used for data collected by multiple laser scanners. A detailed explanation is given in Reference [37].
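As a rough sketch of the 90% rule above, the following function counts the points whose k nearest neighbours all carry the point's own scanner label. The function name `is_dynamic`, the value of k, and the restriction of the neighbour search to the component itself are assumptions made for illustration; Reference [37] defines the actual procedure.

```python
import math

def is_dynamic(points, scanner_ids, k=10, min_fraction=0.9):
    """True if more than min_fraction of the points have k neighbours from their own scanner only."""
    single_scanner = 0
    for i, p in enumerate(points):
        # Indices of all points ordered by distance to p (brute force).
        order = sorted(range(len(points)), key=lambda j: math.dist(p, points[j]))
        neighbours = [j for j in order if j != i][:k]
        if all(scanner_ids[j] == scanner_ids[i] for j in neighbours):
            single_scanner += 1
    return single_scanner / len(points) > min_fraction

# Hypothetical toy data: ten points on a line, labelled by the scanner that recorded them.
line = [(i * 0.1, 0.0, 0.0) for i in range(10)]
one_scanner = is_dynamic(line, [0] * 10, k=2)                     # recorded by one scanner only
two_scanners = is_dynamic(line, [i % 2 for i in range(10)], k=2)  # recorded by both scanners
```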
A large number of building points in the point cloud would lead to an unnecessarily high computation time. Compared to other road objects, a façade plane is perpendicular to the ground, has a large area, and extends above a certain height. Similar to the method in References [23,24], the orientation, height, and area of the façade were selected as distinctive features to detect the façade components. In this step, surface growing was utilized to extract the planes from the components. The large vertical planes were then retained based on an area threshold and an angle threshold.
3.1.3. Slice-Based Pole-Like Road Furniture Detection
In this step, pole-like road furniture, vegetation, and pole-like road furniture connected to vegetation were extracted. Many types of vegetation have small branches and massive leaves. When a laser pulse hits small branches or leaves, it usually splits into multiple pulses before reaching the receiver sensor. In contrast, most points of other objects exclusively have a single pulse count attribute [38]. Consequently, the ratio of points with a multi-pulse count is useful for tree detection. However, points on the edges of road furniture can have multiple counts as well: if we use the number of returns, the points that belong to the edges of above-ground objects are also labeled with multiple returns. Therefore, the value of the return number was utilized instead of the number of returns as a feature to extract trees. In our research, the ratio of points with the first return in above-ground components was also used as a feature to detect trees. In order to extract pole-like road furniture, Pu et al. [14] proposed a framework which allows for the general interpretation of road furniture. Their work represents an early attempt to utilize a percentile-based algorithm to detect pole-like road furniture. However, it has difficulty detecting road furniture with a large number of attached components, for instance, traffic lights that are connected to many traffic signs and street signs. To overcome these difficulties, we present a slice-based method to detect pole-like road furniture. Occlusion analysis is used afterward to remove pole-like objects behind façades. The workflow of this method is shown in Figure 3.
First, every individual above-ground component was cut into horizontal slices (Figure 4). Then a connected component analysis was performed for every slice to produce separated components. The center point of every separated slice component was computed and a 2D connected component analysis was applied to connect the components which were very close to each other in the horizontal plane. Three constraints were then applied to compute the number of pole slices for the connected slice components in every individual above-ground component. The first constraint was the displacement of the center points of two neighboring slices (Figure 4): there should be no large displacement between two neighboring slices. The second was the difference in diameter between two neighboring slices, where the diameter of a slice is the largest 2D distance between two points in this slice. This difference should be small (e.g., 0.2 m), assuming that a part of the pole has no attached objects. The third constraint was the diameter of a slice (Figure 4), which should be smaller than a pre-determined threshold. Finally, the number of pole slices was checked for every individual above-ground component. If the number of pole slices was larger than a specified threshold (set to 3) and the ratio was smaller than a threshold (0.05), this component was labeled as pole-like road furniture. If the number of pole slices was larger than the specified threshold and the ratio was larger than the threshold, this component was labeled as pole-like road furniture connected to trees. If both of them were smaller than their corresponding thresholds, this component was labeled as trees.
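The three slice constraints can be illustrated with a short sketch. It is a simplification under stated assumptions: the per-slice connected component analysis is skipped, a "pole slice" is counted per pair of neighbouring slices that passes all three checks, and the displacement and diameter thresholds (`max_shift`, `max_diameter`) are hypothetical values. Only the 0.2 m diameter difference, the 0.3 m slice size, and the threshold of 3 pole slices appear in the text.

```python
import math
from itertools import combinations

def slice_stats(points, slice_height=0.3):
    """Cut a component into horizontal slices; return (index, centre, diameter) per slice."""
    z_min = min(p[2] for p in points)
    slices = {}
    for x, y, z in points:
        slices.setdefault(int((z - z_min) / slice_height), []).append((x, y))
    stats = []
    for index in sorted(slices):
        xy = slices[index]
        centre = (sum(p[0] for p in xy) / len(xy), sum(p[1] for p in xy) / len(xy))
        # Diameter: largest 2D distance between two points of the slice.
        diameter = max((math.dist(a, b) for a, b in combinations(xy, 2)), default=0.0)
        stats.append((index, centre, diameter))
    return stats

def count_pole_slices(stats, max_shift=0.1, max_diameter_diff=0.2, max_diameter=0.4):
    """Count neighbouring slice pairs that satisfy all three constraints."""
    count = 0
    for (i0, c0, d0), (i1, c1, d1) in zip(stats, stats[1:]):
        if i1 - i0 != 1:
            continue  # a vertical gap: the two slices are not neighbours
        if (math.dist(c0, c1) <= max_shift
                and abs(d1 - d0) <= max_diameter_diff
                and max(d0, d1) <= max_diameter):
            count += 1
    return count

# Hypothetical toy pole: two points per height level, spanning five 0.3 m slices.
heights = [0.0, 0.1, 0.2, 0.4, 0.5, 0.7, 0.8, 1.0, 1.1, 1.3, 1.4]
pole = [(x, 0.0, z) for z in heights for x in (0.0, 0.05)]
n_pole_slices = count_pole_slices(slice_stats(pole))
is_pole_candidate = n_pole_slices > 3  # threshold of 3 taken from the text
```

The final labeling additionally uses the return-based ratio described in the text, which is not modelled in this sketch.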
Among the detected pole-like road furniture candidates, there were incorrectly detected objects located behind façades. To exclude them, an occlusion analysis was performed. In the initial classification step, façade planes were obtained through surface growing. These façade planes were then used as constraints to determine whether the pole candidates were located outside of the façade planes (Figure 5). If they were positioned outside of the façade, they were labeled as pole-like road furniture; otherwise, they were not.
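One possible way to test whether a candidate lies outside a façade is to check that it is on the same side of the façade plane as the scanner. The sketch below assumes exactly that reading; the text does not specify how "outside" is computed, and the function names, the scanner position, and the unbounded plane are illustrative assumptions.

```python
def signed_distance(point, plane_point, normal):
    """Signed, unnormalised distance of a point to the plane through plane_point."""
    return sum((p - q) * n for p, q, n in zip(point, plane_point, normal))

def is_outside_facade(pole, scanner, plane_point, normal):
    """True if the pole lies on the same side of the façade plane as the scanner."""
    return (signed_distance(pole, plane_point, normal)
            * signed_distance(scanner, plane_point, normal)) > 0

# Hypothetical façade: the vertical plane x = 5, with the scanner on the street side.
facade_point, facade_normal = (5.0, 0.0, 0.0), (1.0, 0.0, 0.0)
scanner = (0.0, 0.0, 2.0)
street_pole = is_outside_facade((3.0, 1.0, 0.0), scanner, facade_point, facade_normal)
indoor_pole = is_outside_facade((7.0, 1.0, 0.0), scanner, facade_point, facade_normal)
```

A practical version would also limit the test to the actual extent of the façade segment rather than the infinite plane.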
3.2. Road Furniture Decomposition
To address the problem that some road furniture has multiple functions, we present a new method to decompose road furniture into different components based on their spatial relations. We analyzed the relationship between attachments and vertical and horizontal poles. Therefore, it is crucial to know which points belong to which pole. The remaining points have spatial relations to either horizontal or vertical poles. Based on these relations, we present an optimal segmentation procedure. The overall workflow of this method is shown in Figure 6. There is a large variety of structures and shapes in pole-like road furniture, and it is difficult to extract all the poles using a single method. Therefore, we first identified some key properties of the road furniture, based on which the most suitable pole extraction method was selected (Section 3.2.1). Next, the poles were removed and the attached components were separated into different parts (Section 3.2.2). In the end, rule-based splitting and merging were carried out to refine the decomposition.
3.2.1. Pole-Extraction
Most pole-like road furniture consists of poles and other attached components. These poles are the link between different attached components. The motivation of this step is to extract this link. A framework for pole extraction is presented in our previous work [29].
The structures and shapes of pole-like road furniture vary a lot, which makes it difficult to use a single method to extract poles. For example, in the case of many attachments connected with poles, it is difficult to directly extract poles based on their linear features. Thus, 2D point density can be adopted as a feature to extract poles. We categorized the pole-like road furniture into three typical types. The first type was road furniture with many attachments (the left image in Figure 7). The second one was road furniture with horizontal poles (the middle image in Figure 7), and the third type was normal road furniture (the right image in Figure 7). For these three different types of road furniture, three corresponding pole extraction methods were proposed.
In order to select the method to be used for pole extraction, we first performed a pre-identification step to obtain information about the road furniture, such as the length and width of its bounding box. To capture the structure of the road furniture, we cut the pole-like road furniture into slices and computed the width of every slice. In the previous phase, we already cut the above-ground components into slices to check if they were pole-like. We did not reuse that slicing because its slice width is rather large (e.g., 0.3 m), as it was used to connect small fragments together. In this phase, the slice width was set to 0.1 m. The bounding box of every individual road furniture item was then calculated. The median width of the slices was obtained from the statistics of all slices. The maximum variation of the distance between the points of one road furniture item in the XY direction was calculated based on its bounding box.
According to the properties of road furniture calculated in the pre-identification phase, the corresponding pole extraction method was selected (Figure 8). If more than half of the pole component consists of attached components, a 2D point density-based method was utilized. This can be determined by comparing the smallest quartile width and the median width of the slices: if there was a large difference, we assumed that many components were attached to the pole. Otherwise, Random Sample Consensus (RANSAC)-based line fitting was utilized if horizontal poles were included in the road furniture. These can be detected by checking the bounding box of the piece of road furniture: if the length or width of the bounding box in the horizontal direction was larger than a threshold, the road furniture was considered likely to contain horizontal poles. For normal road furniture, there are more robust features for extracting poles than 2D point density. If there are not many attachments to the road furniture, line fitting is more robust than extracting points with high 2D point density. Consequently, we used the slice cutting based method to fit the center lines of poles and perform pole extraction. Finally, when there were no attachments to the pole, there was no need for decomposition.
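The selection logic of this paragraph can be written as a small decision function. The sketch below is illustrative only: the function name, the returned labels, and all three numeric thresholds (`width_gap`, `horizontal_extent`, `flat_tolerance`) are assumptions, since the text gives no values, and the test for "no attachments" is a hypothetical stand-in.

```python
import statistics

def select_pole_extraction(slice_widths, bbox_length, bbox_width,
                           width_gap=0.3, horizontal_extent=1.5, flat_tolerance=0.05):
    """Pick a pole extraction method from slice widths and the horizontal bounding box."""
    lower_quartile = statistics.quantiles(slice_widths, n=4)[0]
    median = statistics.median(slice_widths)
    if median - lower_quartile > width_gap:
        return "2d_point_density"      # many attachments along the pole
    if max(bbox_length, bbox_width) > horizontal_extent:
        return "ransac_line_fitting"   # likely contains horizontal poles
    if max(slice_widths) - lower_quartile <= flat_tolerance:
        return "no_decomposition"      # bare pole without attachments
    return "slice_cutting"             # normal road furniture

# Hypothetical slice widths in metres (one value per 0.1 m slice).
many_attachments = select_pole_extraction([0.1] * 10 + [1.0] * 12, 1.0, 1.0)
horizontal_pole = select_pole_extraction([0.1] * 20 + [0.5] * 2, 3.0, 0.3)
normal_sign = select_pole_extraction([0.1] * 15 + [0.6] * 5, 0.6, 0.6)
bare_pole = select_pole_extraction([0.1] * 20, 0.1, 0.1)
```

The order of the checks follows the text: the quartile comparison comes first, then the bounding box test, and the slice cutting method is the fallback.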
A brief explanation of these three pole extraction methods is given as follows. In the 2D point density-based method, we first calculated the 2D point density around every point. Then we used a region growing method to extract the clusters of points with high 2D point density (Figure 9a), which were recognized as poles. In the second pole extraction method, points with high linearity were first extracted by calculating the eigenvalues of every point's neighboring points. Then the RANSAC algorithm was adopted to extract lines from these points with high linearity. Poles were extracted by detecting points within a radius of the fitted lines (Figure 9b). In the slice cutting method, we first extracted the center lines by applying the RANSAC algorithm to the center points of the cut slices. Similar to the second method, the poles were then extracted based on the distance between the points and the fitted lines (Figure 9c). A more detailed description of these three methods is given by Li et al. (2016).
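For readers unfamiliar with RANSAC line fitting, the following is a minimal generic sketch of fitting a 3D line to candidate points and collecting the points within a radius of it. It is not the implementation of Li et al.; the function names, the inlier radius, the iteration count, and the fixed random seed are assumptions chosen for the example.

```python
import math
import random

def point_line_distance(p, a, b):
    """Distance from point p to the infinite 3D line through a and b."""
    ab = [b[i] - a[i] for i in range(3)]
    ap = [p[i] - a[i] for i in range(3)]
    cross = (ab[1] * ap[2] - ab[2] * ap[1],
             ab[2] * ap[0] - ab[0] * ap[2],
             ab[0] * ap[1] - ab[1] * ap[0])
    return math.hypot(*cross) / math.hypot(*ab)

def ransac_line(points, radius=0.05, iterations=200, seed=0):
    """Return (a, b, inliers) for the sampled line with the most points within radius."""
    rng = random.Random(seed)
    best = (None, None, [])
    for _ in range(iterations):
        a, b = rng.sample(points, 2)
        if math.dist(a, b) == 0.0:
            continue  # degenerate sample: identical points define no line
        inliers = [p for p in points if point_line_distance(p, a, b) <= radius]
        if len(inliers) > len(best[2]):
            best = (a, b, inliers)
    return best

# Hypothetical input: ten slice centres on a vertical pole plus three stray points.
centres = [(0.0, 0.0, i * 0.1) for i in range(10)]
strays = [(1.0, 0.0, 0.5), (0.0, 1.5, 0.2), (-2.0, 0.3, 0.9)]
a, b, pole_points = ransac_line(centres + strays)
```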
Poles extracted by the first two methods (the 2D point density method and the RANSAC line fitting method) were often not accurate. For the 2D point density-based method, some points of attachments that are near the poles have a high 2D point density and can therefore be categorized as poles. In the second method, when points with high linearity are not extracted accurately enough, the center lines of the poles cannot be estimated correctly using the RANSAC algorithm. If a pole is inaccurately generalized to a line, the pole is extracted imprecisely. For example, if the generalized center line inclines towards the street side (Figure 10a), the pole points that are far away from the center line are not extracted at this stage. To tackle these problems, pole extraction was optimized by re-estimating the center lines of the poles. Similar to the slice cutting method, we cut the extracted pole points into slices, computed their center points, and used the RANSAC algorithm to extract lines from these center points. In this way, poles can be extracted more accurately by using the more precisely fitted center lines. The re-estimated center lines are shown in Figure 10b.
3.2.2. Decomposition into Poles and Attachments
In order to separate the attachments which are connected to the poles, we removed the extracted poles and performed a connected component analysis [29]. As the components can be very close to each other, the maximum distance and the size of the nearest neighborhood should be chosen properly. Considering the distance between the scanlines and the point distribution on a single scanline, the neighborhood size cannot be very small. Otherwise, points on different scanlines would not be connected. The maximum distance between two scanlines here is 0.05 m. For a point, the number of its neighboring points within a distance of 0.05 m on a single scanline was typically 10. Here, we set the maximum distance for the connected components to 0.15 m and the neighborhood size to 15 points. The separated components are shown in Figure 11. The best method for each piece of road furniture was chosen automatically. Figure 11a shows the partitioning of the remaining points after 2D point density-based pole extraction. Pole extraction in Figure 11b used the RANSAC line fitting method, and pole extraction in Figure 11c,d used the slice cutting based method.
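A connected component analysis with a maximum distance and a neighborhood size can be sketched as below. The defaults of 0.15 m and 15 neighbours come from the text; the function name, the brute-force neighbour search, and the choice to make the neighbour graph symmetric are assumptions of this illustration.

```python
import math
from collections import deque

def connected_components(points, max_dist=0.15, k=15):
    """Label points so that chains of close neighbours share one component label."""
    n = len(points)
    adjacency = [set() for _ in range(n)]
    for i in range(n):
        # The k nearest neighbours of point i, nearest first (brute force).
        close = sorted((math.dist(points[i], points[j]), j) for j in range(n) if j != i)
        for dist, j in close[:k]:
            if dist <= max_dist:
                adjacency[i].add(j)
                adjacency[j].add(i)  # keep the graph symmetric
    labels = [-1] * n
    current = 0
    for seed in range(n):
        if labels[seed] != -1:
            continue
        labels[seed] = current
        queue = deque([seed])
        while queue:  # breadth-first flood fill from the seed
            i = queue.popleft()
            for j in adjacency[i]:
                if labels[j] == -1:
                    labels[j] = current
                    queue.append(j)
        current += 1
    return labels

# Hypothetical attachment points: two groups separated by 0.8 m.
points = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.2, 0.0, 0.0), (1.0, 0.0, 0.0), (1.1, 0.0, 0.0)]
labels = connected_components(points)
```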
Because of occlusion, low point density, and noisy points, it is sometimes difficult to connect all the points that belong to the same attachment with the parameters of the aforementioned connected component analysis. Examples are marked in Figure 11. To address this problem, we defined a set of generic rules to split and merge the components. We first applied merging rules to the components attached to horizontal poles. Then the components connected to vertical poles were analyzed to see whether they could be merged or split. Finally, the detached components were checked to see whether they could be merged with their nearby components.
As shown in the red circle of Figure 11b, one component attached to a horizontal pole can be separated into two parts during the connected component analysis. If two components were attached to the same horizontal pole and their positions overlapped in the XY-plane, these two components were merged. The merging analysis of the components attached to horizontal poles was repeated until no such components were found. Figure 12 shows the merged components after applying the merging rule.
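The merging rule for components on the same horizontal pole can be sketched as an iterative merge. Here "overlap in the XY-plane" is interpreted as overlapping axis-aligned bounding boxes, which is an assumption; the function names are hypothetical, and all input components are taken to be attached to the same horizontal pole.

```python
def xy_box(points):
    """Axis-aligned XY bounding box as (x_min, x_max, y_min, y_max)."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return min(xs), max(xs), min(ys), max(ys)

def boxes_overlap(a, b):
    """True if two XY bounding boxes intersect."""
    return a[0] <= b[1] and b[0] <= a[1] and a[2] <= b[3] and b[2] <= a[3]

def merge_overlapping(components):
    """Merge components with overlapping XY boxes until no such pair is left."""
    merged = [list(c) for c in components]
    changed = True
    while changed:
        changed = False
        for i in range(len(merged)):
            for j in range(i + 1, len(merged)):
                if boxes_overlap(xy_box(merged[i]), xy_box(merged[j])):
                    merged[i].extend(merged[j])
                    del merged[j]
                    changed = True
                    break
            if changed:
                break  # restart the scan with the updated boxes
    return merged

# Hypothetical components hanging from one horizontal pole.
part_a = [(0.0, 0.0, 5.0), (1.0, 0.2, 5.0)]
part_b = [(0.5, 0.1, 4.5), (1.5, 0.3, 4.5)]
part_c = [(3.0, 0.0, 5.0), (4.0, 0.2, 5.0)]
result = merge_overlapping([part_a, part_b, part_c])
```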
In another situation, attachments might be connected to each other because of noisy points or imprecise pole extraction. An example is shown in Figure 11c. In this situation, we increased the width of the pole extraction. If a component could then be separated into two parts that were not at the same height, it was split into two parts. This method is similar to erosion followed by dilation.
Suppose a component is attached to a vertical pole when a given width is used for pole extraction. We increased this width step by step, with the step defined by the width of the cut slice, and performed a connected component analysis after each increase. Once the component broke into two components that were not at the same height, they were separated into two parts. This operation continued until the total increment of the width reached a predefined value, which was empirically set to 0.15 m for the splitting analysis in this paper.
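The splitting analysis can be illustrated with the sketch below. Several details are assumptions made for the example and are not taken from the paper: the pole is modeled as a vertical line through `pole_xy`, the extraction width is treated as a diameter, "not at the same height" is tested as non-overlapping height ranges, and points removed by the widened pole are returned to the nearest part afterwards (the dilation-like step). The function names are hypothetical.

```python
import math

def group_points(points, max_dist):
    """Single-linkage grouping: points closer than max_dist share a group."""
    remaining = list(points)
    groups = []
    while remaining:
        group = [remaining.pop()]
        frontier = list(group)
        while frontier:
            p = frontier.pop()
            near = [q for q in remaining if math.dist(p, q) <= max_dist]
            remaining = [q for q in remaining if math.dist(p, q) > max_dist]
            group.extend(near)
            frontier.extend(near)
        groups.append(group)
    return groups

def z_range(points):
    zs = [p[2] for p in points]
    return min(zs), max(zs)

def try_split(component, pole_xy, width, step, max_increment=0.15, max_dist=0.15):
    """Return two parts if widening the removed pole region splits the
    component into two groups at different heights, otherwise None."""
    n_steps = int(max_increment / step + 1e-9)
    for k in range(1, n_steps + 1):
        radius = (width + k * step) / 2.0  # assumption: width is a diameter

        def off_pole(p):
            return math.hypot(p[0] - pole_xy[0], p[1] - pole_xy[1]) > radius

        kept = [p for p in component if off_pole(p)]        # erosion-like step
        removed = [p for p in component if not off_pole(p)]
        parts = group_points(kept, max_dist)
        if len(parts) != 2:
            continue
        (lo_a, hi_a), (lo_b, hi_b) = z_range(parts[0]), z_range(parts[1])
        if hi_a < lo_b or hi_b < lo_a:  # height ranges do not overlap
            # Dilation-like step: return removed points to the nearest part.
            cores = [list(part) for part in parts]
            for p in removed:
                idx = min(range(2),
                          key=lambda i: min(math.dist(p, q) for q in cores[i]))
                parts[idx].append(p)
            return parts
    return None
```

For example, two signs mounted one above the other and bridged by a few noisy points close to the pole form one component at the initial width, but fall apart once the widened pole region swallows the bridging points.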
A single attachment might also be separated into two parts (
Figure 11a). This case occurs when the connection distance used in the connected component analysis is not large enough. In this case, we conducted a merging analysis. We continued increasing the width of the pole extraction up to a maximum value, which we set to the typical size of a traffic sign, for example, 0.5 m. Once two components were wrapped by their connected vertical pole, were at the same height, and the line connecting their center points came very near to the pole line, they were merged. The merging analysis was iterated until no more components could be merged. The components were split and merged as shown in the red circle of
Figure 13.
Components that were not connected to poles because of low point density or occlusion (
Figure 11d) were treated as detached components. We assumed that if a detached component and an attachment were at the same height, on the same side of the pole, and their connection line was very close to the pole, they belonged to one component and were merged. The merged component was then included in the subsequent merging analysis of the remaining detached components. The detached components were merged as illustrated in
Figure 14.
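The three conditions for merging a detached component can be sketched as below. The paper does not give numeric tolerances or the exact geometric tests, so the use of centroids, the tolerances `z_tol` and `line_tol`, the dot-product test for "same side", and the function names are all assumptions made for illustration; the pole is modeled as a vertical line through `pole_xy`.

```python
import math

def centroid(points):
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def should_merge(detached, attachment, pole_xy, z_tol=0.2, line_tol=0.1):
    """Check the three conditions: same height, same side of the pole,
    and a connection line that passes close to the pole (in XY)."""
    cd, ca = centroid(detached), centroid(attachment)
    same_height = abs(cd[2] - ca[2]) <= z_tol
    vd = (cd[0] - pole_xy[0], cd[1] - pole_xy[1])
    va = (ca[0] - pole_xy[0], ca[1] - pole_xy[1])
    same_side = vd[0] * va[0] + vd[1] * va[1] > 0
    # Distance from the pole to the line through both centroids.
    dx, dy = cd[0] - ca[0], cd[1] - ca[1]
    length = math.hypot(dx, dy)
    if length == 0:
        return same_height and same_side
    cross = abs(dx * (pole_xy[1] - ca[1]) - dy * (pole_xy[0] - ca[0]))
    return same_height and same_side and cross / length <= line_tol

def merge_detached(attachments, detached, pole_xy):
    """Merge detached components into attachments until nothing changes.
    Returns the updated attachments and the components left detached."""
    attachments = [list(a) for a in attachments]
    pending = [list(d) for d in detached]
    changed = True
    while changed and pending:
        changed = False
        for d in list(pending):
            for a in attachments:
                if should_merge(d, a, pole_xy):
                    a.extend(d)  # the grown attachment is reused next pass
                    pending.remove(d)
                    changed = True
                    break
    return attachments, pending
```

Because a merged component is kept in the list of attachments, a second detached fragment farther from the pole can still be merged in a later pass.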
3.3. Final Detection of Pole-Like Road Furniture
The results of the decomposition were fed back into the road furniture detection stage. If no pole was extracted from a road furniture item, the item was not labeled as pole-like. For example, the pillar of a building can be detected as a pole-like road furniture item at the detection stage. When such a pillar is decomposed using RANSAC line fitting, a pole will probably not be extracted, because the percentage of points belonging to the pole is low when many points with high linearity come from the edges of façades or fragments.
3.4. Road Furniture Decomposition Evaluation
This section presents an approach to evaluate the accuracy of the decomposition. The components of pole-like road furniture were manually labeled based on their spatial relations. For example, street signs can be manually labeled as “pole and attachments”, as shown in
Figure 15.
To assess the results of the decomposition, we used a two-level evaluation method: a point-based evaluation and a component-based evaluation. Completeness and correctness were used to quantify the results.
In the point-based evaluation, we first identified the true positive points of each manually labeled ground truth component. Each ground truth component was matched to the decomposition result by selecting the largest decomposed segment within it. The true positive points of an attachment are the points that belong to both this attachment and its corresponding manually labeled attachment. The completeness of every component was then computed as the ratio of the number of points of the largest segment in this component to the number of points of this component:

$$\text{Completeness} = \frac{TP}{TP + FN},$$

where $TP$ is the number of correctly decomposed points in the manually labeled component and $FN$ is the number of incorrectly decomposed points in the manually labeled component. For example, in
Figure 16, the true positive component of the streetlight head is the decomposed component marked by the red circle 1 in the left figure. Thus, the completeness of this component is the ratio of the number of points in the overlap of the component labeled with the red circle 1 in the left figure and the component labeled with the green circle 1 in the right figure, to the number of points of the component labeled with the green circle 1 in the right figure.
The correctness of a separated attachment is the number of true positive points divided by the number of points of this separated attachment:

$$\text{Correctness} = \frac{TP}{TP + FP},$$

where $TP$ is the number of correctly decomposed points and $FP$ is the number of incorrectly decomposed points in the components produced by our algorithm. For example, in
Figure 16, the correctness of this component is the ratio of the number of points in the overlap of the component labeled with the red circle 1 in the left figure and the component labeled with the green circle 1 in the right figure, to the number of points of the component labeled with the red circle 1 in the left figure. The points labeled with the green circle 1 in the right figure are also included in the segment labeled with the red circle 1 in the left figure.
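The point-based measures can be written compactly when each component is represented by the set of indices of its points. The sketch below makes that assumption; the names `best_match`, `completeness_correctness`, `tp`, `fn`, and `fp` are chosen for the example and stand for the counts of true positive points, ground truth points missed by the matched segment, and points of the matched segment that lie outside the ground truth component.

```python
def best_match(truth_ids, decomposed):
    """Pick the decomposed component that shares the most points
    with the given ground truth component."""
    truth = set(truth_ids)
    return max(decomposed, key=lambda comp: len(truth & set(comp)))

def completeness_correctness(truth_ids, result_ids):
    """Point-based completeness and correctness of one matched pair."""
    truth, result = set(truth_ids), set(result_ids)
    tp = len(truth & result)   # points in both components
    fn = len(truth - result)   # ground truth points that were missed
    fp = len(result - truth)   # decomposed points outside the ground truth
    completeness = tp / (tp + fn) if truth else 0.0
    correctness = tp / (tp + fp) if result else 0.0
    return completeness, correctness


if __name__ == "__main__":
    truth = list(range(10))
    decomposed = [list(range(8)) + [20, 21], [8, 9]]
    match = best_match(truth, decomposed)
    print(completeness_correctness(truth, match))
```

In this small example the matched segment covers 8 of the 10 ground truth points and contains 2 foreign points, so both measures are 8/10.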
In the component-based evaluation, the largest segment in every manually labeled ground truth component was selected as the segment corresponding to this component. We then calculated the completeness and correctness of every component using the point-based evaluation. If the completeness was lower than the completeness threshold, the component was labeled as over-decomposed. If the correctness was lower than the correctness threshold, the component was labeled as under-decomposed. Otherwise, the component was considered an applicable decomposed component. Both thresholds were set to 0.6. For example, in the left image of
Figure 16, component 2 labeled with a green circle is under-decomposed with low correctness. In contrast, component 3 is over-decomposed with low completeness.
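The component-based labeling reduces to two threshold checks applied in the order given in the text. In the sketch below, the function name `label_component` and the parameter names are assumptions; the default value of 0.6 for both thresholds is taken from the text.

```python
def label_component(completeness, correctness,
                    t_completeness=0.6, t_correctness=0.6):
    """Classify one matched component from its point-based scores."""
    if completeness < t_completeness:
        return "over-decomposed"    # the ground truth was cut into pieces
    if correctness < t_correctness:
        return "under-decomposed"   # the segment swallowed foreign points
    return "applicable"


if __name__ == "__main__":
    for scores in [(0.5, 0.9), (0.9, 0.5), (0.8, 0.8)]:
        print(scores, label_component(*scores))
```

Note that a component failing both checks is reported as over-decomposed here, because the completeness check comes first; the text does not state how such a case is handled.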
5. Conclusions
In this paper, we proposed a new framework to detect pole-like road furniture and decompose it into different components based on their logical relations. The framework was tested at two test sites. With this framework, road furniture items were detected and interpreted by logical relations, which can be used for precise semantic labeling. The proposed framework could potentially be used for high-definition 3D mapping. In this framework, we improved road furniture detection by combining dynamic object removal, pole slicing, and occlusion analysis. The completeness and correctness values of the pole-like road furniture detection were higher than 90% in both the Enschede dataset and the Paris dataset. The main contribution was the decomposition of the road furniture and its evaluation. Compared with our previous work, the current framework is completely automatic, and the performance of the decomposition has been improved by applying the defined rules.
The next stage of our work will focus on the classification of decomposed road furniture. Even though logically separated components were obtained, the meaning of every component has not been assigned yet. Therefore, in order to make use of the components for further applications such as mapping, these components will be semantically labeled based on their features.
In our research, road furniture items were decomposed into components using only geometric features. Color information has also proven beneficial to detection and decomposition, and many image semantic labeling techniques, such as convolutional neural networks, could be applied to our research. 2D image data were also captured by two cameras mounted on the moving vehicle. The detection, decomposition, and classification of road furniture could therefore benefit from color information associated with the 3D point cloud.