Peer-Review Record

Fast Registration of Terrestrial LiDAR Point Clouds Based on Gaussian-Weighting Projected Image Matching

Remote Sens. 2022, 14(6), 1466; https://doi.org/10.3390/rs14061466
by Biao Xiong 1, Dengke Li 1, Zhize Zhou 2 and Fashuai Li 2,*
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Reviewer 4: Anonymous
Submission received: 14 February 2022 / Revised: 12 March 2022 / Accepted: 16 March 2022 / Published: 18 March 2022

Round 1

Reviewer 1 Report

#1. There are also papers that perform registration using 2D images, such as "A novel point cloud registration using 2D image features". What is the biggest difference between your method and these?
#2. There are many other algorithms, such as SURF. What is the special reason for using the patented SIFT algorithm?
#3. Are there reasonable results for buildings with long corridors?
#4. In Section 2.1, is the appropriate grid size for each situation obtained empirically?
#5. It is necessary to explain in detail to the reader how the Gaussian-weighting function (Equations 1 and 2) physically plays a role.
#6. Line 191 needs an explanation of s.
#7. The grayscale in Figure 3 needs an explanation.
#8. The table on line 224 seems to be an image, but it is necessary to specify whether it is a table or a figure.
#9. The typo in Eq. (5) needs to be corrected.
#10. Similarly for the table on line 275, it is necessary to specify whether it is a table or an image.
#11. Eqs. (12) and (13) need more explanation. Since this is an error-minimization process, please explain why the error term takes this form.
#12. How was the weight matrix (Sigma matrix) constructed in Eq. (14)?
#13. Errors in lines 430 and 470 need to be corrected.

Author Response

Response to Reviewer 1 Comments

 

Point 1: There are also papers that perform registration using 2D images, such as "A novel point cloud registration using 2D image features". What is the biggest difference between your method and these?

Response 1:

Thanks for your comment. In this paper, we project the point cloud onto images without losing the salient structure information and extract more stable feature points with highly informative descriptors. We also address the challenge of filtering out incorrect matches. To this end, we propose a Gaussian-weighting function to calculate the point density of each 2D grid cell, and use an endpoint validation to filter out incorrect image matches.

 

Point 2: There are many other algorithms, such as SURF. What is the special reason for using the patented SIFT algorithm?

Response 2:

Thanks for your comment. SIFT performs better under rotation, whereas SURF handles brightness changes better and is faster to compute. From this point of view, SIFT is more suitable for our problem.

 

Point 3: Are there reasonable results for buildings with long corridors?

Response 3:

Thanks for your comment. Unfortunately, none of the six datasets in our experiments (IPSN-2016, IPSN-2017, WHU-CAM, WHU-RES, WHUT-WK and WHUT-BSZ) contains buildings with long corridors. However, IPSN-2016 and IPSN-2017 were collected in large indoor buildings; IPSN-2017 includes an exhibition area larger than 100 m × 100 m, whose length is substantial. In the future, we will collect more data with long corridors and run further experiments on them.

 

Point 4: In Section 2.1, is the appropriate grid size for each situation obtained empirically?

Response 4:

Thanks for your comment. In Section 3.3, we analyze the effect of the cell size on the final performance. The trade-off is between computational efficiency and accuracy; we finally set the cell size to 0.03–0.06 m for the indoor datasets and 0.7–1.6 m for the outdoor datasets.

 

Point 5: It is necessary to explain in detail to the reader how the Gaussian-weighting function (Equations 1 and 2) physically plays a role.

Response 5:

Thanks for your comment. We have added an explanation of Eqs. 1 and 2 according to your suggestion. Eq. 2 calculates the distance between a cell center and a nearby point, and Eq. 1 assigns a weight to that point based on this distance. The Gaussian function thus makes nearer points more important and farther points less important, which reflects the real point density better than simply counting the points in a cell. With this processing, the aliasing effect is dramatically reduced, as shown in Fig. 3.
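
For illustration, here is a minimal Python sketch of such a Gaussian-weighted density grid. It assumes a weight of the form exp(-d²/(2σ²)); the function names, the per-cell accumulation, and the grayscale normalization are our own simplifications rather than the paper's exact implementation.

```python
import numpy as np

def gaussian_weighted_density(points_xy, cell_size, sigma):
    """Accumulate a Gaussian-weighted point density on a 2D grid.

    Each point contributes exp(-d^2 / (2 * sigma^2)) to its cell,
    where d is the distance from the point to the cell center, so
    points near a cell center count more than points near its edge.
    """
    min_xy = points_xy.min(axis=0)
    ij = np.floor((points_xy - min_xy) / cell_size).astype(int)
    density = np.zeros(ij.max(axis=0) + 1)
    centers = (ij + 0.5) * cell_size + min_xy        # cell centers
    d2 = np.sum((points_xy - centers) ** 2, axis=1)  # cf. Eq. 2
    w = np.exp(-d2 / (2.0 * sigma ** 2))             # cf. Eq. 1
    np.add.at(density, (ij[:, 0], ij[:, 1]), w)
    return density

def to_grayscale(density):
    """Normalize the density to an 8-bit image (cf. the grayscale of Fig. 3)."""
    return np.uint8(255 * density / density.max())
```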

 

Point 6: Line 191 needs an explanation of s.

Response 6:

Thanks for your comment. s is the cell size of the grid, defined at Line 173. We have added this explanation accordingly.

 

Point 7: The grayscale in Figure 3 needs explanation

Response 7:

Thanks for your comment. We have moved Fig. 3 and its related content to the end of Section 2.1 to make the section clearer, since the grayscale is calculated with Eq. 4. We have also added, in the paragraph above Fig. 3, that the grayscale represents the normalized point density.

 

Point 8: The table on line 224 seems to be an image, but it is necessary to specify whether it is a table or a figure.

Response 8:

Thanks for your comment. We have revised it accordingly.

 

Point 9: The typo in Eq. (5) needs to be corrected.

Response 9:

Thanks for your comment. We have corrected it accordingly.

 

Point 10: Similarly for the table on line 275, it is necessary to specify whether it is a table or an image.

Response 10:

Thanks for your comment. We have revised it accordingly.

 

Point 11: Eqs. (12) and (13) need more explanation. Since this is an error-minimization process, please explain why the error term takes this form.

Response 11:

Thanks for your comment. We have revised the paragraph above Eq. 12 to explain the equation more clearly. We have also added the definition of the relevant symbol to further explain Eq. 13.

 

Point 12: How was the weight matrix (Sigma matrix) constructed in Eq. (14)?

Response 12:

Thanks for your comment. We simply set the weight matrix to the identity matrix. In future work, we will consider different weight matrices to enhance the optimization.
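
For context, Eq. 14 presumably expresses a weighted least-squares objective of the following kind, with the weight matrix set to the identity as described (this is our reading, not the paper's exact notation):

```latex
\min_{\{T_i\}} \sum_{(i,j)} e_{ij}^{\top}\, \Sigma_{ij}^{-1}\, e_{ij},
\qquad \Sigma_{ij} = I
```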

 

Point 13: Errors in lines 430 and 470 need to be corrected

Response 13:

Thanks for your comment. We have corrected them accordingly (now Lines 452 and 497).

Reviewer 2 Report

Brief summary:

The manuscript proposes a method for coarse registration of terrestrial laser scanner point clouds in the xy-plane (3 degrees of freedom), assuming that the translations in the z direction and the rotations around the x- and y-axes are provided by a levelling instrument of the equipment. The method is based on matching density images of projections of point clouds onto the xy-plane, using RANSAC with SIFT keypoints and descriptors. False matches are removed by thresholding distances between corresponding endpoints of straight line segments extracted from the images by Hough transform and non-maximum suppression. Pairwise registration is followed by a global optimization of transformations from each point cloud to a base point cloud. The experiments illustrate the performance of the method.

General comments:

1) The assumptions and limitations of the method should be stated more clearly already in the introduction. The method is apparently aimed at, and limited to, building reconstruction with planar surfaces, so that there are straight lines in the density images. The densities must also vary within the image coverage in order to detect SIFT features. It is not clear whether the method is applicable to the registration of point clouds of other types of targets, such as statues.

2) The approach includes novel improvements over previous methods. The approach is correct in principle, but there are many mistakes in the technical details, which must be corrected.

Detailed comments:

Lines 14-16 and 55-57: The 2D grids are already projections of 3D data onto the xy-plane while in the second step, the x,y-coordinates are only shifted and scaled to the image coordinates.

Lines 62-63: It is mentioned in the introduction that the ICP algorithm is used to refine the pairwise registrations but it has not been included in the method description before global registration in Section 2.4.

Line 81: There exist many other automatic feature matching algorithms in addition to deep learning-based ones.

Line 85: "Handcrafted" is a misleading word, because all the methods described here are automated.

Figure 1c: It looks like all the SIFT keypoints are located on high-density areas (walls of the building). Is that true, and why? Aren't there any variations in the density elsewhere in the images?

Line 185 and elsewhere: A grid is the whole raster, while a cell is an element of the grid. So, "grid c" should be "cell c" throughout the manuscript. Also, grid size should be cell size.

Equations 1 and 2 and lines 186-188: It is true that the terms in Eq. 1 are Gaussian functions. However, the notation is unconventional, because the standard deviation is usually denoted by sigma (and not the variance) and the coefficient before the exponent is typically the square root of what appears in Eq. 1 (for a normally distributed variable), although this coefficient is unnecessary as it is the same for all the cells of the grid. Please fix Eq. 1 if it should represent the normal distribution. Equation 2 for the distance is incorrect, because there is the distance squared on the left hand side and the distance on the right hand side of the equation.

Lines 196-199 and Figs. 3 and 1c: The endpoints are incorrectly denoted as feature points in Fig. 3. The feature points are SIFT keypoints according to Fig. 1c and line 154. 

Lines 218-219: It may be unnecessary to filter out the image pairs. The endpoint validation could be performed after each RANSAC iteration as an additional check on line 12 of Algorithm 1, to filter out an incorrectly matched pair of keypoints. The endpoints can be computed before RANSAC.

Algorithm 1, line 4: The restriction that i and j cannot be 1 or n seems unnecessary.

Algorithm 1, line 11: The addition of a matrix R_{IM} and a vector T_{IM} is not mathematically correct. The coordinates of the points should be included in the transformation similarly as in corrected Eq. 8.

Lines 230-232 and Eqs. 6 and 11: A distance is always non-negative, while the translation may have also negative components. There should be the average differences in the coordinates instead of the average distances. It would be better to use subscripts u and v in the differences in Eqs. 6 and 11, because they are given in the image coordinates and not in the x,y coordinates.

Lines 245-246: The feature points detected by the SIFT algorithm are most likely not unreliable, but the matched pairs between two scans may be unreliable. If they are unreliable, can you prove that it is caused by occlusions?

Line 271: If only two largest groups of lines are kept, then it restricts the applicability of the method to certain type of targets.

Equation 8: Since the dimensions of the images are not 2 by 1, Eq. 8 is not mathematically correct.

Lines 314-317 and Eq. 12: Please clarify whether transformation T_i is from cloud i to the base or the other way round. According to Eq. 12, it seems that T_{i,j} is the transformation from cloud j to cloud i.

Equation 14: Matrix Sigma_{ij} has not been explained.

Equation 17: The equation is incorrect, because "n" should be under the square root.

Section 3.2.2: The efficiency appears in the title but it has not been defined in this subsection nor elsewhere.

Line 400: Sensitivity (recall, true positive rate) is an incorrect word here.

Lines 407-408 and Fig. 12: There is a conflict between the text and the figure. According to Fig. 12, WHUT-WK 0-5 images could be aligned for the cell sizes of 0.07 and 0.08. The units of the cell sizes are missing from all the plots in Fig. 12 and partly from the text.

Table 2: Column 3 is the same as column 1. The results for global optimization + no endpoint verification are missing. The units are missing.

Lines 511-513: It should be mentioned that the improvement was tested against algorithms based on 4PCS while no comparison was carried out to projection-based algorithms listed in paragraph C of Section 1.2.

Line 610: H. Date is missing from the authors of [32].

Author Response

Response to Reviewer 2 Comments

 

Brief summary:

The manuscript proposes a method for coarse registration of terrestrial laser scanner point clouds in the xy-plane (3 degrees of freedom), assuming that the translations in the z direction and the rotations around the x- and y-axes are provided by a levelling instrument of the equipment. The method is based on matching density images of projections of point clouds onto the xy-plane, using RANSAC with SIFT keypoints and descriptors. False matches are removed by thresholding distances between corresponding endpoints of straight line segments extracted from the images by Hough transform and non-maximum suppression. Pairwise registration is followed by a global optimization of transformations from each point cloud to a base point cloud. The experiments illustrate the performance of the method.

General comments:

Point 1: The assumptions and limitations of the method should be stated more clearly already in the introduction. The method is apparently aimed at, and limited to, building reconstruction with planar surfaces, so that there are straight lines in the density images. The densities must also vary within the image coverage in order to detect SIFT features. It is not clear whether the method is applicable to the registration of point clouds of other types of targets, such as statues.

Response 1:

Thanks for your comment. Unfortunately, our method currently cannot register point clouds of irregular statues. In the future, we would like to extend our work to the registration of point clouds of irregular scenes.

 

Point 2: The approach includes novel improvements over previous methods. The approach is correct in principle, but there are many mistakes in the technical details, which must be corrected.

Response 2:

Thanks for your comment. According to your comments, we have corrected the technical errors and improper expressions. The modified content is marked in red.

 

 

Detailed comments:

Point 3: Lines 14-16 and 55-57: The 2D grids are already projections of 3D data onto the xy-plane while in the second step, the x,y-coordinates are only shifted and scaled to the image coordinates.

Response 3:

Thanks for your comment.

We have corrected the description as you suggested.

 

Point 4: Lines 62-63: It is mentioned in the introduction that the ICP algorithm is used to refine the pairwise registrations but it has not been included in the method description before global registration in Section 2.4.

Response 4:

Thanks for your comment.

We have added the description, as you suggested, in Section 2.3 before the global registration. Since ICP is a well-established algorithm, we mention it without going into detail.

 

Point 5: Line 81: There exist many other automatic feature matching algorithms in addition to deep learning-based ones.

Line 85: "Handcrafted" is a misleading word, because all the methods described here are automated.

Response 5:

Thanks for your comment.

As you mentioned, “handcrafted” is a misleading word, so we have changed it to “manually-designed”. As you pointed out, the matching methods based on manually-designed features are indeed automated.

 

Point 6: Figure 1c: It looks like that all the SIFT keypoints are located on high density areas (walls of the building). Is it true and why? Aren't there any variations in the density elsewhere in the images?

Response 6:

Thanks for your comment.

The SIFT feature points are normally located in areas of high contrast. The high point density at the walls of the building and the low density in the space beside the walls form a high contrast, which makes these areas easy for SIFT to capture. There are indeed density variations elsewhere, but they are normally not strong enough to be captured by SIFT.
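
As a quick way to observe this behavior, one can run SIFT on a projected density image with OpenCV; this is our own hedged sketch (the file names are placeholders), not the paper's code.

```python
import cv2

# Load the 8-bit projected density image (placeholder file name).
gray = cv2.imread("density.png", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
keypoints = sift.detect(gray, None)

# Keypoints cluster where local contrast is high, e.g. along the
# high-density wall pixels against the low-density space beside them.
vis = cv2.drawKeypoints(gray, keypoints, None,
                        flags=cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
cv2.imwrite("keypoints.png", vis)
```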

 

Point 7: Line 185 and elsewhere: A grid is the whole raster, while a cell is an element of the grid. So, "grid c" should be "cell c" throughout the manuscript. Also, grid size should be cell size.

Response 7

Thanks for your comment.

We have corrected the description as you suggested.

 

Point 8: Equations 1 and 2 and lines 186-188: It is true that the terms in Eq. 1 are Gaussian functions. However, the notation is unconventional, because the standard deviation is usually denoted by sigma (and not the variance) and the coefficient before the exponent is typically the square root of what appears in Eq. 1 (for a normally distributed variable), although this coefficient is unnecessary as it is the same for all the cells of the grid. Please fix Eq. 1 if it should represent the normal distribution. Equation 2 for the distance is incorrect, because there is the distance squared on the left hand side and the distance on the right hand side of the equation.

Response 8:

Thanks for your comment.

We have corrected the description as you suggested.
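
For reference, a plausible corrected form consistent with the reviewer's remarks, with sigma denoting the standard deviation and the squared distance kept consistent on both sides (the revised manuscript's exact symbols may differ):

```latex
w(p) = \frac{1}{\sqrt{2\pi}\,\sigma}
       \exp\!\left(-\frac{d^{2}}{2\sigma^{2}}\right),
\qquad
d^{2} = (x_{p}-x_{c})^{2} + (y_{p}-y_{c})^{2}
```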

 

Point 9: Lines 196-199 and Figs. 3 and 1c: The endpoints are incorrectly denoted as feature points in Fig. 3. The feature points are SIFT keypoints according to Fig. 1c and line 154. 

Response 9:

Thanks for your comment.

We have corrected the description as you suggested. The feature points in Fig. 3 should indeed be endpoints, as you pointed out.

 

Point 10: Lines 218-219: It may be unnecessary to filter out the image pairs. The endpoint validation could be performed after each RANSAC iteration as an additional check on line 12 of Algorithm 1, to filter out an incorrectly matched pair of keypoints. The endpoints can be computed before RANSAC.

Response 10:

Thanks for your comment.

In the calculation of the endpoint pairs, we first need the final, i.e. the “correct”, transformation matrix T_final in order to establish the correspondence between endpoints in two different images. Within RANSAC, therefore, we could only use the endpoints as additional feature points in the calculation, which is not the same as the endpoint validation performed afterwards.
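
To illustrate the ordering issue, here is a minimal sketch of the endpoint validation as we understand it: the final rotation R and translation t from RANSAC are needed before any endpoint correspondences can be checked. The function and threshold names are our own.

```python
import numpy as np

def endpoint_validation(endpoints_a, endpoints_b, R, t, dist_thresh):
    """Accept an image match only if every transformed line-segment
    endpoint from image A has a nearby counterpart in image B.

    R, t: the final 2D rotation and translation estimated by RANSAC;
    without them the endpoint correspondences cannot be established.
    """
    transformed = endpoints_a @ R.T + t  # map A's endpoints into B's frame
    for p in transformed:
        nearest = np.min(np.linalg.norm(endpoints_b - p, axis=1))
        if nearest > dist_thresh:
            return False                 # this endpoint has no counterpart
    return True
```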

 

Point 11: Algorithm 1, line 4: The restriction that i and j can not be 1 or n seems unnecessary.

Response 11:

Thanks for your comment. We have revised it accordingly.

 

Point 12: Algorithm 1, line 11: The addition of a matrix R_{IM} and a vector T_{IM} is not mathematically correct. The coordinates of the points should be included in the transformation similarly as in corrected Eq. 8.

Response 12:

Thanks for your comment. We have revised it accordingly.

 

Point 13: Lines 230-232 and Eqs. 6 and 11: A distance is always non-negative, while the translation may have also negative components. There should be the average differences in the coordinates instead of the average distances. It would be better to use subscripts u and v in the differences in Eqs. 6 and 11, because they are given in the image coordinates and not in the x,y coordinates.

Response 13:

Thanks for your comment. We have revised it accordingly.

 

Point 14: Lines 245-246: The feature points detected by the SIFT algorithm are most likely not unreliable, but the matched pairs between two scans may be unreliable. If they are unreliable, can you prove that it is caused by occlusions?

Response 14:

Thanks for your comment. The original sentence was not clear. The unreliable feature points from SIFT can arise in many situations; heavy occlusion is just one of them. We have revised our wording according to your advice.

 

Point 15: Line 271: If only two largest groups of lines are kept, then it restricts the applicability of the method to certain type of targets.

Response 15:

Thanks for your comment. You are correct: the applicability is in fact restricted to regular architecture. Currently we aim to design a method that suits most cases, though unfortunately not all of them. We will consider a more general process to cover more situations in future work.

 

Point 16: Equation 8: Since the dimensions of the images are not 2 by 1, Eq. 8 is not mathematically correct.

Response 16:

Thanks for your comment. We have revised it accordingly.

 

Point 17: Lines 314-317 and Eq. 12: Please clarify whether transformation T_i is from cloud i to the base or the other way round. According to Eq. 12, it seems that T_{i,j} is the transformation from cloud j to cloud i.

Response 17:

Thanks for your comment. We have revised it accordingly.

 

Point 18: Equation 14: Matrix Sigma_{ij} has not been explained.

Response 18:

Thanks for your comment. We have added the explanation accordingly.

 

Point 19: Equation 17: The equation is incorrect, because "n" should be under the square root.

Response 19:

Thanks for your comment. We have revised it accordingly.
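
Assuming Eq. 17 is a root-mean-square error, as the comment suggests, the corrected form would place n under the square root:

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} e_{i}^{2}}
```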

 

Point 20: Section 3.2.2: The efficiency appears in the title but it has not been defined in this subsection nor elsewhere.

Response 20:

Thanks for your comment. We have added the related content in Section 3.2.2 accordingly.

 

Point 21: Line 400: Sensitivity (recall, true positive rate) is an incorrect word here.

Response 21:

Thanks for your comment. We have changed “sensitivity” to “effect” accordingly.

 

Point 22: Lines 407-408 and Fig. 12: There is a conflict between the text and the figure. According to Fig. 12, WHUT-WK 0-5 images could be aligned for the cell sizes of 0.07 and 0.08. The units of the cell sizes are missing from all the plots in Fig. 12 and partly from the text.

Response 22:

Thanks for your comment. We have changed the wording of the sentence at Line 408 (now Line 418) accordingly. We have also added the units of the cell sizes.

 

Point 23: Table 2: Column 3 is the same as column 1. The results for global optimization + no endpoint verification are missing. The units are missing.

Response 23:

Thanks for your comment. We have revised it accordingly. The units of the average error and of the number of aligned stations have been added in the first row of the table.

 

Point 24: Lines 511-513: It should be mentioned that the improvement was tested against algorithms based on 4PCS while no comparison was carried out to projection-based algorithms listed in paragraph C of Section 1.2.

Response 24:

Thanks for your comment. We have added a comparison with [33], a representative projection-based method, in Section 3.4.2 accordingly.

 

Point 25: Line 610: H. Date is missing from the authors of [32].

Response 25:

Thanks for your comment. We have added the author accordingly.

Reviewer 3 Report

Dear Authors,

I reviewed the paper entitled "Fast Registration of Terrestrial LiDAR Point Clouds Based on Gaussian-weighting Projected Image Matching". The paper presents “an automatic point cloud registration method based on Gaussian-weighting projected image matching”. I have some issues that I would like the Authors to address.

  1. In the Abstract, the Authors should describe the results, such as the accuracy or comparison results. This is very important, because otherwise a potential reader is not aware of the quality of the work before downloading the entire paper.
  2. Line 75. For me, “reviews” sounds more like opinions. Please change it to “literature review”, which is a piece of academic writing.
  3. As I understand it, your method creates images of a point cloud, then searches for homologous points and aligns the clouds into a single point cloud, based on Gaussian-weighted projected image matching. Why, then, do the Authors speak of ‘our Gaussian weighting function’ in the conclusion? I find this unclear, because there should be information on what part of this Gaussian weighting is the Authors’ own.
  4. Line 430. There is an error.
  5. There is no information about the software the Authors used. Did they use some specific software, or did they use their own scripts? It is very important to note this in the article, because otherwise it could be difficult for other researchers to repeat the experiment.

From my perspective, the article is an interesting study, but it is unclear how to reuse the information from the experiment. The Authors should pay special attention to this.

Author Response

Response to Reviewer 3 Comments

 

Dear Authors,

I reviewed the paper entitled "Fast Registration of Terrestrial LiDAR Point Clouds Based on Gaussian-weighting Projected Image Matching". The paper presents “an automatic point cloud registration method based on Gaussian-weighting projected image matching”. I have some issues that I would like the Authors to address.

Point 1: In the Abstract, the Authors should describe the results, such as the accuracy or comparison results. This is very important, because otherwise a potential reader is not aware of the quality of the work before downloading the entire paper.

Response 1:

Thanks for your comment. We have added a quantitative description of the results in the Abstract accordingly.

 

Point 2: Line 75. For me, “reviews” sounds more like opinions. Please change it to “literature review”, which is a piece of academic writing.

Response 2:

Thanks for your comment. As you suggested, we have corrected the expression in the manuscript.

 

Point 3: As I understand it, your method creates images of a point cloud, then searches for homologous points and aligns the clouds into a single point cloud, based on Gaussian-weighted projected image matching. Why, then, do the Authors speak of ‘our Gaussian weighting function’ in the conclusion? I find this unclear, because there should be information on what part of this Gaussian weighting is the Authors’ own.

Response 3

Thanks for your comment. The Gaussian function is of course not our own invention. ‘Our Gaussian weighting function’ simply means that, in this paper, we designed a method that uses a Gaussian function to assign a weight to each point, so as to calculate the point density reliably.

 

Point 4: Line 430. There is an error.

Response 4:

Thanks for your comment. We have corrected it accordingly.

 

Point 5: There is no information about the software the Authors used. Did they use some specific software, or did they use their own scripts? It is very important to note this in the article, because otherwise it could be difficult for other researchers to repeat the experiment.

Response 5:

Thanks for your comment. We implemented our algorithm in C++ in Visual Studio. We have added this description in the first paragraph of Section 3.2.

 

Point 6: From my perspective, the article is an interesting study, but it is unclear how to reuse the information from the experiment. The Authors should pay special attention to this.

Response 6:

Thanks for your comment. We have made every effort to make the paper clearer; the manuscript has been revised, especially in the method and experimental sections, for this purpose. In our experiments, we found that projection-based methods are much faster than 4PCS-based methods. This paper therefore aims to improve the projection-based approach in three respects: 1. the proposed Gaussian-weighting method is rotation invariant and preserves the structural information of terrestrial point clouds well; 2. the endpoint validation effectively filters out incorrect matches between images; 3. the global optimization provides more robust matching results. The ablation study further validates the effectiveness of the endpoint validation and global optimization. In the future, we will extend our framework to the registration of terrestrial point clouds of irregular building scenes.

Reviewer 4 Report

The paper deals with a topic of great interest, and the methodology is adequate. The graphics could be improved.

Author Response

Response to Reviewer 4 Comments

 

Point 1: The paper deals with a topic of great interest, and the methodology is adequate. The graphics could be improved.

Response 1:

Thanks for your comment. We have modified the unclear figures and tables in the manuscript.

Round 2

Reviewer 1 Report

Many changes have been made in order to make the paper easier for readers to follow.
It would be a fascinating issue to investigate how to set up an appropriate weight matrix in Eq. 14.
Finally, please double-check your grammar and spelling.

Author Response

Point 1: Many changes have been made in order to make the paper easier for readers to follow. It would be a fascinating issue to investigate how to set up an appropriate weight matrix in Eq. 14. Finally, please double-check your grammar and spelling.

Response 1:

Thanks for your comment. We will proofread our paper carefully and revise the grammatical and spelling issues accordingly. Your suggestion of properly setting the weight matrix is valuable, and we will investigate how to set an appropriate weight matrix in future work.

Reviewer 3 Report

Dear Authors, 

Thank you for your revised manuscript. I think that the paper can be considered for publication.

Author Response

Point 1: Thank you for your revised manuscript. I think that the paper can be considered for publication.

Response 1:

Thanks for your comment. We highly appreciate your affirmation. We will proofread the manuscript carefully and revise the grammatical and spelling issues.
