3.2. Synthetic Data
For synthetic data generation, we considered characteristics of real data such as noise and occlusion. Specifically, all synthetic data were generated with noise that follows the Gaussian assumption introduced in Section 2.1. Occlusion was introduced by excluding the point clouds of the occluded planes from the point clouds of the cuboid. In addition to noise and occlusion, the density of the point clouds was considered because it varies with the SLAM algorithm and the sensor used.
Synthetic data were generated hierarchically, as shown in Figure 6a. First, the 500 cuboids were divided equally into five cases according to the number of generated faces, because occluded faces contain no points. Second, the 100 cuboids of each case were further divided into five types according to the noise characteristics, namely the standard deviation and the noise ratio. The standard deviation and noise ratio were chosen arbitrarily within the ranges of 0.02–0.04 m and 50–100%, respectively. The standard deviation is a parameter of the Gaussian model, and the noise ratio is the proportion of points that follow the Gaussian model among all points of the cuboid. Each cuboid has unique parameters and point density.
To evaluate the proposed algorithm thoroughly, cuboids with 2–6 faces, i.e., from the minimum to the maximum number of faces, were created for cuboid modeling, even though measured point clouds generally contain at most five faces. Moreover, for cuboids with 2–4 faces, we included two spatial combinations of the reconstructed faces, as shown in Figure 6b.
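The generation procedure described above can be illustrated with a minimal sketch. It is not the generator used in the experiments: the function name `sample_cuboid`, the axis-aligned cuboid centered at the origin, the fixed number of points per face, and the choice to apply the Gaussian noise along the face normal are all assumptions made for illustration. Only the parameter ranges (0.02–0.04 m and 50–100%) and the treatment of occlusion are taken from the text.

```python
import random

# Face index -> (axis, sign): the six faces of an axis-aligned cuboid.
FACES = [(0, +1), (0, -1), (1, +1), (1, -1), (2, +1), (2, -1)]

def sample_cuboid(size, visible_faces, points_per_face, sigma, noise_ratio, rng):
    """Sample points on the visible faces of a cuboid centered at the origin.

    A fraction `noise_ratio` of the points is displaced along the face normal
    by zero-mean Gaussian noise with standard deviation `sigma` (an assumed
    noise model). Occluded faces contribute no points at all.
    """
    half = [s / 2.0 for s in size]
    points = []
    for face in visible_faces:
        axis, sign = FACES[face]
        for _ in range(points_per_face):
            p = [rng.uniform(-h, h) for h in half]
            p[axis] = sign * half[axis]        # place the point on the face
            if rng.random() < noise_ratio:     # only some points are noisy
                p[axis] += rng.gauss(0.0, sigma)
            points.append(tuple(p))
    return points

rng = random.Random(0)
sigma = rng.uniform(0.02, 0.04)       # standard deviation in meters
noise_ratio = rng.uniform(0.5, 1.0)   # 50-100 % of the points are noisy
# A three-face cuboid: the three remaining faces are treated as occluded.
cloud = sample_cuboid((0.4, 0.3, 0.2), visible_faces=[0, 2, 4],
                      points_per_face=200, sigma=sigma,
                      noise_ratio=noise_ratio, rng=rng)
```

Varying `points_per_face` per cuboid would reproduce the different point densities mentioned above.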
When the method of Wei et al. [10] generated multiple cuboids, a cost was calculated for each cuboid from its orientation, size, and center errors, and the cuboid with the lowest cost was selected. Specifically, the size error is the absolute difference between the corresponding edge lengths of the modeling result and the ground truth. The center error is the two-norm distance between the centers of the modeling result and the ground truth. The orientation error is computed as the minimum angle required to rotate the obtained orientation onto the ground-truth orientation.
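The three error terms can be sketched as follows. This is an illustrative reading of the definitions in the text rather than the evaluation code itself: orientations are assumed to be given as 3×3 rotation matrices, the orientation error is taken as the rotation angle of the relative rotation, and the symmetry of the cuboid (equivalent axis permutations) is ignored for brevity. All function names are hypothetical.

```python
import math

def center_error(c_est, c_gt):
    """Two-norm distance between the estimated and ground-truth centers."""
    return math.dist(c_est, c_gt)

def size_error(s_est, s_gt):
    """Absolute difference between corresponding edge lengths."""
    return [abs(a - b) for a, b in zip(s_est, s_gt)]

def orientation_error_deg(r_est, r_gt):
    """Rotation angle (degrees) of the relative rotation R_est^T R_gt.

    Uses trace(R_est^T R_gt) = sum_ij R_est[i][j] * R_gt[i][j] and
    angle = arccos((trace - 1) / 2). Cuboid symmetries are not handled.
    """
    trace = sum(r_est[i][j] * r_gt[i][j] for i in range(3) for j in range(3))
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))  # guard rounding
    return math.degrees(math.acos(cos_theta))

def rot_z(deg):
    """Rotation matrix about the z axis, used only for the example."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

angle = orientation_error_deg(rot_z(30.0), rot_z(0.0))
```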
Robustness is evaluated by the number of correctly modeled cuboids and the narrowness of the error range. To count the correctly modeled cuboids, we defined an incorrect cuboid model as either a false solution or a no solution. A case in which the cuboid parameter estimation fails is a no solution, and a case in which the estimated parameters differ greatly from the ground truth is a false solution. Specifically, a cuboid model is considered a false solution if any of the size, angle, or center errors exceeds its threshold. Among the cuboid parameters, the size was evaluated as the volume, i.e., the product of the three edge lengths. This is because the individual size errors are difficult to compare consistently: the error along each axis varies with the occlusion of the planes associated with that axis.
Table 1 reports the cuboid modeling results in terms of the number of false and no solutions. The thresholds were calculated from the interquartile range of all cuboid errors produced on the synthetic data by the proposed method and that of Wei et al. [10]. The volume, center, and angle error thresholds were set at 1.5, 1.5, and 5 interquartile ranges, respectively. The units and calculated threshold values are listed in the second row of the table. According to Table 1, the proposed method is more robust than the method of Wei et al. [10] in most cases. First, there is no case in which the proposed method fails to obtain the cuboid parameters, whereas the method of Wei et al. [10] cannot estimate the cuboid parameters for 50 data in the two-face case, which reflects severe occlusion. Second, both methods accurately estimate the angle of the cuboid; however, the proposed method outperforms the method of Wei et al. [10] in estimating the volume and center for every number of faces.
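The classification into correct, false, and no solutions can be sketched as below. The text states only that the thresholds correspond to 1.5, 1.5, and 5 interquartile ranges; the sketch assumes a Tukey-style upper fence, i.e., the third quartile plus the given multiple of the interquartile range, which may differ from the exact rule used. The names `iqr_threshold`, `thresholds_from_pool`, and `classify` are hypothetical.

```python
import statistics

# Multiples of the interquartile range stated in the text.
IQR_FACTORS = {"volume": 1.5, "center": 1.5, "angle": 5.0}

def iqr_threshold(errors, k):
    """Assumed rule: third quartile plus k times the interquartile range."""
    q1, _, q3 = statistics.quantiles(errors, n=4)
    return q3 + k * (q3 - q1)

def thresholds_from_pool(pooled_errors):
    """pooled_errors maps an error name to the errors of both methods."""
    return {name: iqr_threshold(pooled_errors[name], k)
            for name, k in IQR_FACTORS.items()}

def classify(errors, thresholds):
    """errors is None when parameter estimation failed entirely."""
    if errors is None:
        return "no solution"
    if any(errors[name] > thresholds[name] for name in thresholds):
        return "false solution"
    return "correct"

# Toy pooled errors, only to show the mechanics of the rule.
pool = {name: [1, 2, 3, 4, 5, 6, 7] for name in IQR_FACTORS}
limits = thresholds_from_pool(pool)
```

Pooling the errors of both methods before computing the quartiles, as the text describes, gives both methods the same thresholds.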
The average error between the estimated parameters and the ground truth, together with the number of false and no solutions, is listed in Table 2. The average error was calculated using only the correctly modeled cuboids. The center and orientation errors were calculated in the same manner as in the selection of a cuboid from multiple results. The volume error was calculated as a percentage, namely the volume of the result divided by the volume of the ground truth. The units of each error are given in parentheses. Values in bold indicate better performance, and a hyphen indicates that no cuboid was correctly modeled.
The proposed method outperformed the method of Wei et al. [10] under most conditions, as shown in Table 2. This suggests that the proposed method can model a cuboid under various noise and occlusion conditions with a lower average error. In some cases, the method of Wei et al. [10] performs better than the proposed method, for example, in the volume and center errors for type 2 with two faces and in the volume error for type 3 with three faces. However, the proposed method correctly estimates 10 and 5 more cuboids than the method of Wei et al. [10] for type 2 with two faces and type 3 with three faces, respectively. Because the averages are thus computed over different sets of cuboids, the data used to calculate the average error must be matched to compare the error ranges accurately.
To evaluate the robustness accurately from the perspective of the error range, the average error and standard deviation of the data obtained using both methods are listed in Table 3. In the two-face case, the data are insufficient to calculate the average and standard deviation because the method of Wei et al. [10] succeeded with only one cuboid. Therefore, only the results for the cases of 3–6 faces are provided. The number of cuboids correctly modeled by each method is reported in the used data row.
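One way to match the data before comparing error ranges is sketched below. It assumes that "matching" means restricting both methods to the cuboids that each of them modeled correctly, and that the sample standard deviation is used; neither detail is stated explicitly in the text. The function name `matched_stats` and the dictionary layout are hypothetical.

```python
import statistics

def matched_stats(errors_a, errors_b):
    """Mean and standard deviation over cuboids solved by both methods.

    errors_a, errors_b: dict mapping a cuboid id to its error, containing
    only the cuboids that the respective method modeled correctly.
    Returns None when fewer than two cuboids are shared, because a standard
    deviation cannot be computed (as in the two-face case).
    """
    common = sorted(errors_a.keys() & errors_b.keys())
    if len(common) < 2:
        return None
    stats = {}
    for name, errs in (("a", errors_a), ("b", errors_b)):
        values = [errs[i] for i in common]
        stats[name] = (statistics.mean(values), statistics.stdev(values))
    return common, stats

# Toy example: only cuboids 2 and 3 were solved by both methods.
result = matched_stats({1: 1.0, 2: 3.0, 3: 5.0}, {2: 2.0, 3: 4.0, 4: 9.0})
```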
Table 3 lists the mean and standard deviation of the angle, volume, and center errors for both methods. According to Table 3, the proposed method achieves a smaller mean error and standard deviation for all cuboid parameters in most cases. That is, the proposed method is more robust than the previous method because it achieves a lower and more consistent error across the various cases. These results are shown as boxplots in Figure 7 for easy comparison. Here, the proposed method, whose results are colored green, outperforms the previous method, whose results are colored red.
3.3. Real Data
As discussed in Section 3.2, we verified that the proposed method is robust under various noise and occlusion conditions by comparing its modeling results with the ground truth. Although the synthetic data were designed to cover various cases of point clouds, the conditions of real data are more difficult because point clouds measured by the SLAM algorithm are more dynamic than synthetic data. Therefore, real data offer more challenging conditions for the proposed method.
We designed an environment and a sensor system for data acquisition. The method of Lee et al. [6] was implemented as the SLAM algorithm to scan the environment, and the sensor system shown in Figure 8a was used to measure the surroundings. We set up a space in which 78 boxes were arbitrarily placed, as shown in Figure 8b. The space was scanned five times, and all boxes were randomly rearranged for each scan.
For the synthetic data, robustness was evaluated by the number of correctly modeled cuboids and the error range. However, because there is no ground truth for real data, we used metrics such as P, R, and F, which were used by Wei et al. [10], to evaluate the results instead of the error range. Here, P denotes the proportion of uniformly sampled points of the cuboid model that lie within a certain distance of the measured points. R denotes the proportion of measured points that lie within a certain distance of the uniformly sampled points of the cuboid model. F is the harmonic mean of P and R. However, these metrics may be unsuitable for comparing the performance of cuboid models because the point clouds measured in this study were highly occluded. A practical example of this phenomenon is shown in Figure 9.
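The effect of occlusion on these scores can be seen in a small sketch. It treats P and R as ratios, in line with the description of the R score in the text, and uses a brute-force nearest-neighbor search for clarity (a k-d tree would normally be used for real point clouds). The function names and the distance threshold `tau` are illustrative assumptions, not the implementation of Wei et al. [10].

```python
import math

def nearest_distances(src, dst):
    """Distance from every point in src to its closest point in dst."""
    return [min(math.dist(p, q) for q in dst) for p in src]

def prf(model_pts, measured_pts, tau):
    """P, R, and F for a distance threshold tau (assumed to be in meters)."""
    d_model = nearest_distances(model_pts, measured_pts)
    d_meas = nearest_distances(measured_pts, model_pts)
    p = sum(d <= tau for d in d_model) / len(model_pts)
    r = sum(d <= tau for d in d_meas) / len(measured_pts)
    f = 2.0 * p * r / (p + r) if p + r > 0 else 0.0
    return p, r, f

# The model is sampled on two faces, but only one face was measured.
model = [(0, 0, 0), (1, 0, 0), (0, 0, 1), (1, 0, 1)]
measured = [(0, 0, 0), (1, 0, 0)]
p, r, f = prf(model, measured, tau=0.1)
```

In this toy case the model fits the measured points perfectly, yet P drops to 0.5 simply because half of the model surface has no measurements, which is the kind of bias that heavy occlusion introduces.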
Therefore, we developed a metric based on the concept of the R score and on the distance between the measured points and the points uniformly sampled from the cuboid model. The R score focuses on the ratio of the number of points located within a certain distance to the total number of points. By contrast, our metric focuses on the distance at which a certain ratio of the points become inliers. Specifically, we uniformly sampled points from the cuboid model using open-source software [28]. Subsequently, for each measured point, we determined the distance to the closest sampled point. These closest distances were then sorted in ascending order. In this ordering, k% of the points lie within the k-th percentile distance of the surface of the cuboid modeling result. Therefore, we obtained the 75th, 80th, and 85th percentile distances as the threshold distances within which 75%, 80%, and 85% of the points are located near the surface of the cuboid.
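The percentile-distance metric can be sketched in a few lines. The sketch follows the steps described above (closest distance per measured point, ascending sort, percentile lookup) but makes two assumptions: the nearest-rank definition of the percentile, and a brute-force nearest-neighbor search in place of whatever the open-source software [28] provides. The function name `percentile_distance` is hypothetical.

```python
import math

def percentile_distance(measured_pts, model_pts, k):
    """Distance within which k % of the measured points lie from the model.

    measured_pts: points measured by the sensor.
    model_pts: points uniformly sampled from the cuboid model surface.
    k: percentile as an integer, e.g. 75, 80, or 85.
    """
    distances = sorted(min(math.dist(p, q) for q in model_pts)
                       for p in measured_pts)
    # Nearest-rank percentile: smallest distance covering k % of the points.
    rank = math.ceil(k * len(distances) / 100)
    return distances[max(rank, 1) - 1]

# Toy example: 20 measured points at distances 1..20 from one model point.
model = [(0.0, 0.0, 0.0)]
measured = [(float(i), 0.0, 0.0) for i in range(1, 21)]
d75, d80, d85 = (percentile_distance(measured, model, k) for k in (75, 80, 85))
```

Unlike P, this quantity depends only on the measured points, so faces of the model without any measurements do not penalize the score.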
Moreover, we tested the proposed method without the backtracking line search on real data to validate the benefit of this step. The results of the proposed method without the backtracking line search are denoted as ‘w/o BTLS’ in Table 4, Table 5, and Figure 10 to show the improvement in robustness.
Robustness was compared based on the number of correctly modeled cuboids and the mean and standard deviation of the results, in a manner similar to that used for the synthetic data. The numbers of false and no solutions are listed in Table 4. The method of Wei et al. [10] fails to model the correct cuboid for 218 of the 354 real data, which is approximately 61% of all cuboids. This percentage lies between the failure rates of the two-face and three-face cases of the synthetic data, which are 99% and 36%, respectively, indicating that the conditions of the real data are difficult. This interpretation is supported by the fact that the proposed method yielded two no-solution results for the first time in the experiments. The proposed method without the backtracking line search failed to model the correct cuboid for 47 of the 354 real data. By contrast, the proposed method with the backtracking line search failed for only 6 of the 354 real data. These results show not only a significant improvement over the previous method but also the benefit of the backtracking line search. Therefore, it can be concluded that the proposed method is the most robust of the compared methods under high noise and occlusion conditions.
The robustness evaluation based on the distance metric is summarized in Table 5. Each value in the table is the distance from the surface of the cuboid model within which 75%, 80%, or 85% of the points are distributed. A lower value therefore indicates better performance because the points are located closer to the surface of the cuboid model. According to Table 5, the proposed method outperforms the other methods in most cases. Specifically, the points are, on average, closer to the model surface obtained by the proposed method. Moreover, the proposed method achieved more consistent results, with a lower standard deviation than the previous methods. These results are also visualized as boxplots in Figure 10 for easy comparison. Here, the proposed method, whose results are colored green, shows improved performance compared with the previous method, whose results are colored red.