Open Access Article

Peer-Review Record

Data Preparation Impact on Semantic Segmentation of 3D Mobile LiDAR Point Clouds Using Deep Neural Networks

Remote Sens. 2023, 15(4), 982; https://doi.org/10.3390/rs15040982

by Reza Mahmoudi Kouhi ¹,*, Sylvie Daniel ¹ and Philippe Giguère ²

Reviewer 1:

Jun Xiao

Reviewer 2:

Francesca Matrone

Reviewer 3:

Beatriz Marcotegui

Submission received: 28 November 2022 / Revised: 3 February 2023 / Accepted: 5 February 2023 / Published: 10 February 2023

(This article belongs to the Special Issue Semantic Segmentation Algorithms for 3D Point Clouds)

Round 1

Reviewer 1 Report

This paper explores the impact of data preparation on deep-learning-based segmentation, which is a useful reference for the further development and exploration of deep learning. Here are my comments and suggestions.

1. Data preparation is the core concept of this article. So what is data preparation? The article should give a clear definition. Data preparation in this paper mainly involves data sampling. However, can data preparation include other operations? Or are the two concepts of data preparation and data sampling completely equivalent?

2. I don't quite agree with the statement that this article proposes new insights. In fact, many studies try to use more meaningful points as input data. Otherwise, the authors need to emphasize the new insights and clearly explain the innovation. In addition, I have some concerns about the innovation of this article. In my view, this paper only proposes different sampling strategies during data preparation. Although sufficient experiments have been carried out, the innovation may still not be enough.

3. In Figure 2, it seems that all methods use the same seed points, but different methods obtain seed points in different ways. Is this reasonable?

4. This paper proposes two new data preparation methods. Do these two methods generalize? Or is it better to use more targeted methods according to different types of data?

5. Can the two data preparation methods proposed completely replace other existing methods, or do you still need to select appropriate data preparation methods according to actual needs?

6. Compared with farthest point sampling, does random sampling lead to non-uniform sampling results? Why do the two proposed methods choose different seed point acquisition methods?

7. In the experimental stage, it is recommended to include a figure displaying the experimental results, to better explain in which regions and under which circumstances the proposed method gets better results.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

The paper investigates how data preparation methods can affect deep learning-based results for the semantic segmentation task.
The topic is interesting and is well developed and described; however, some changes are needed:
1. In the initial part, it would be appropriate to separate the introduction from a section devoted to the literature review (see attached file).
2. In the Methodology and Results parts, an additional dataset should be tested. One dataset is not enough to justify the results or to ensure full generalization of the proposed method. Semantic3D, S3DIS or the ArCH dataset are recommended. Furthermore, they should at least be mentioned in the state of the art.
3. In the discussion, the pros and cons of the methodology should be detailed, even if they are already partially present.
4. The conclusions should be extended (e.g. future developments and final considerations).

Comments for author File: Comments.pdf

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report

This paper addresses the first stage of a 3D point cloud semantic
segmentation pipeline: the selection of 3D points to be presented to
the network. While being an interesting topic, with potential
significant gains, there is room for improving the paper. My main
concerns are the following:

- Authors state, lines 119-120, that "seed points are spread all over
the point cloud" and "each point is assigned to at least two groups"
but I do not see how this is reached with algorithm 1 (random
sampling and K-nearest neighbors of each seed).

- The density-based method relies on a classification of points into 3
categories: low, medium and high density. But this classification
process is not given.

- PointNet++ and KPConv are two state-of-the-art methods. It would
have been interesting to include RandLA-Net, which contains a random
sampling step in the process. RandLA-Net is a relevant reference that
should be included.

- line 264: how is each block downsampled before applying FPS?

- line 244: authors state that R-KNN is able to generate groups with
points of minority classes, such as buildings. However, cars are 3
times more frequent than buildings (0.74 compared to 0.2) and they
are not present. Moreover, other classes with about the same
frequency, such as poles, are detected by both methods.

- What is the difference between the number of groups and the number
of seeds?

- Line 222:

"""In KPConv, the authors claim that using randomly selected spheres
(like FR) could result in a better mIoU than using KNN (like
R-KNN)."""

Fixed Radius (FR) is different from the technique used by KPConv, in
the sense that KPConv includes a grid.

In conclusion, data preparation is an interesting topic and it is
worth addressing. However, the methods proposed are not fully
described or justified. Moreover, they rely on some parameters that are
not discussed.

Some details:

- line 25: ...have no information such as semantic information
or... -> ...have no semantic information or ...

- line 69: figure 1.a -> figure 1.b

- line 113: prepossessing -> preprocessing

- line 247 cloud not be created -> COULD not be created

- line 271 cites figure 4.c that is missing

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

I have no other suggestions for this article

Author Response

We very much appreciate the revision and comments given by the reviewer. They were very helpful in improving the quality of our manuscript. We have responded to each comment item by item in blue, and the corrections are highlighted in yellow in the new version of the manuscript. Please note that the lines mentioned in the responses correspond to the revised version of the paper.

Author Response File: Author Response.docx

Reviewer 2 Report

The paper has been improved; however, I think that it is still not suitable for publication.

Comment #1: I have read the template provided by MDPI Remote Sensing, and it is not mandatory to keep the literature review in a single section, as shown by other contributions to this Journal (e.g. see https://www.mdpi.com/2072-4292/15/3/621). I would suggest creating a subsection 1.1 of the Introduction.

Comment #2: I still do not agree with the authors. What has been added in lines 193-197 partially justifies the choice of only one dataset. The question that now arises is: so why is Paris-Lille-3D not used? The addition of at least one other scene, also acquired by the authors, would give robustness to the methodology. Unfortunately, this element limits your method to a data niche. Moreover, as mentioned above, its generalization is not fully demonstrated. It could be argued that the scenes in the KITTI dataset were acquired with the same sensors. Does the proposed method also work with data acquired from other sensors? Please, insert the references for the datasets cited.

Author Response

Author Response File: Author Response.docx

Reviewer 3 Report

Most of the raised issues have been addressed, so this version is an improvement. However, at least comments 1 and 5 require more attention. Moreover, I think that including RandLA-Net in the benchmark would be a great improvement, but I would not object to accepting the paper if it is not done.

"""
We applied corrections in lines 126-127 as follows: “There is a high probability that each point is assigned to at least two groups if the number of seed points is calculated using (1)”.
We rely on the following demonstration to make this statement. Given that the cloud has 1 million points and the number of nearest neighbors to each seed point is 8192, 244 seed points are selected according to equation (1). Considering algorithm 1, each seed point gives rise to a group that consists of 8192 points. Therefore, all the groups involve a total of 8192*244 ≈ 2 million points. Since the point cloud consists of 1 million points, it means each point is assigned, on average, to at least 2 nearby seed points.
"""

I disagree with this explanation. Statistically, "on average each point is assigned to 2 groups" is not the same as "there is a high probability that each point is assigned to *at least* two groups". The authors say that the groups involve 8192*244 points. That would be true if they were all different. With algorithm 1, you can only state that "on average each point is assigned to 2 groups".
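
The distinction between an average of two groups per point and at least two groups per point can be checked numerically. The sketch below is an illustration only, not the authors' Algorithm 1: it assumes 2D points drawn uniformly at random, brute-force K-nearest neighbours, and a seed count of 2N/K, which is inferred from the quoted figures (1 million points, K = 8192, 244 seeds) rather than taken from equation (1) itself. All function and variable names are hypothetical.

```python
import random


def knn_group_coverage(points, k, n_seeds, rng):
    """Return, for every point, the number of K-NN groups it belongs to."""
    seeds = rng.sample(range(len(points)), n_seeds)  # random seed selection
    coverage = [0] * len(points)
    for s in seeds:
        sx, sy = points[s]
        # Brute-force K nearest neighbours of the seed (the seed is its own
        # nearest neighbour, at distance 0).
        nearest = sorted(
            range(len(points)),
            key=lambda i: (points[i][0] - sx) ** 2 + (points[i][1] - sy) ** 2,
        )[:k]
        for i in nearest:
            coverage[i] += 1
    return coverage


rng = random.Random(0)
n_points, k = 2000, 100
points = [(rng.random(), rng.random()) for _ in range(n_points)]
n_seeds = 2 * n_points // k  # ASSUMED form of equation (1): 2N / K

coverage = knn_group_coverage(points, k, n_seeds, rng)
mean_coverage = sum(coverage) / n_points
share_at_least_two = sum(c >= 2 for c in coverage) / n_points
share_uncovered = sum(c == 0 for c in coverage) / n_points

print(f"mean groups per point:     {mean_coverage:.2f}")
print(f"share with >= 2 groups:    {share_at_least_two:.2%}")
print(f"share with no group at all: {share_uncovered:.2%}")
```

By construction the mean is exactly 2 (each of the 40 groups holds 100 of the 2000 points), but because the groups of randomly placed seeds overlap, the share of points covered at least twice stays below 100% and some points can be left without any group.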

"""The threshold for low-density areas is set to 30% and to 70% for high-density areas. The interval between 30% and 70% is considered as medium-density areas."""
30% of the maximum density? If that is the case, it should be stated. Or the 70% of points with the highest density? If that is the case, maybe 30% is too high? The sentence should be clarified and somehow justified.
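
To show why the wording matters, the following sketch applies both readings of the 30% / 70% thresholds to the same toy list of density values. It is only an illustration under stated assumptions: the density values are invented, the way density is measured is left open (for example, a neighbour count per point), and both function names are hypothetical. Neither function is claimed to be the authors' procedure.

```python
def classify_by_fraction_of_max(densities, low=0.30, high=0.70):
    """Reading 1: thresholds are fractions of the maximum density."""
    d_max = max(densities)
    labels = []
    for d in densities:
        if d < low * d_max:
            labels.append("low")
        elif d > high * d_max:
            labels.append("high")
        else:
            labels.append("medium")
    return labels


def classify_by_percentile(densities, low=0.30, high=0.70):
    """Reading 2: thresholds are percentiles of the density distribution."""
    ranked = sorted(densities)
    n = len(ranked)
    low_cut = ranked[int(low * (n - 1))]    # value at the 30th percentile
    high_cut = ranked[int(high * (n - 1))]  # value at the 70th percentile
    labels = []
    for d in densities:
        if d < low_cut:
            labels.append("low")
        elif d > high_cut:
            labels.append("high")
        else:
            labels.append("medium")
    return labels


# Hypothetical densities with one very dense outlier.
densities = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
by_max = classify_by_fraction_of_max(densities)
by_pct = classify_by_percentile(densities)

for name, labels in (("fraction of max", by_max), ("percentile", by_pct)):
    counts = {c: labels.count(c) for c in ("low", "medium", "high")}
    print(name, counts)
```

With a single dense outlier, the first reading labels nine of the ten points as low density, while the percentile reading splits them into 2 low, 5 medium and 3 high, so the two interpretations lead to very different groupings.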

""" in our opinion, the objective of the research work is already met using two different network architecture,"""
With RandLA-Net included, the paper would be much more impactful. The code is available in Torch Points3D.

"""“The downsampling is done using a dropping point technique with a stride of 32”."""
You mean that you take 1 point out of 32 and that it depends on the order of points?
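
The order dependence raised in this question can be illustrated with a minimal sketch. It assumes that "dropping points with a stride of 32" means keeping every 32nd point of the stored sequence, which is one plausible reading and is not confirmed by the quoted response. The function name and the toy data are hypothetical.

```python
def drop_points(points, stride=32):
    """Keep one point out of every `stride`, following storage order."""
    return points[::stride]


# Stand-in for a block of 128 points, identified by their index.
points = list(range(128))

kept_original_order = drop_points(points)        # [0, 32, 64, 96]
kept_reversed_order = drop_points(points[::-1])  # [127, 95, 63, 31]

print(kept_original_order)
print(kept_reversed_order)
```

The same 128 points yield two disjoint subsets depending only on how they are ordered, so under this reading the result is tied to the acquisition or storage order of the block.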

""" Answer to comment 5"""

I understand that random sampling may miss some objects. I also understand the differences between KNN and spherical neighborhoods. My comment pointed out a counter-example to the authors' statement about the presence of minority classes with the R-KNN technique. I am not convinced by this statement. What is said about buildings does not hold for cars, which are a more frequent class. In my opinion, this is because the method relies on an initial random sampling. Would minority classes not be under-represented due to the random sampling?

""" Answer to comment 7"""

Even if KP-Conv applies to random spheres, a density parameter is used in order to avoid too many points in high density areas.
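
One common way to limit the number of points in dense areas is grid (voxel) subsampling, where all points falling in the same cell are replaced by their barycenter. The sketch below shows that general technique only. Whether it matches the exact density control in KPConv is an assumption, and the function name, cell size and toy points are hypothetical.

```python
import math
from collections import defaultdict


def grid_subsample(points, cell):
    """Replace all points sharing a grid cell by their barycenter."""
    cells = defaultdict(list)
    for p in points:
        # Integer cell index along each axis.
        key = tuple(math.floor(c / cell) for c in p)
        cells[key].append(p)
    return [
        tuple(sum(axis) / len(members) for axis in zip(*members))
        for members in cells.values()
    ]


# Four points packed in one cell (dense area) and one isolated point.
points = [
    (0.1, 0.1, 0.0),
    (0.2, 0.1, 0.0),
    (0.1, 0.2, 0.0),
    (0.2, 0.2, 0.0),
    (1.5, 0.1, 0.0),
]
subsampled = grid_subsample(points, cell=1.0)
print(len(points), "->", len(subsampled))
```

The dense cluster collapses to a single representative while the isolated point is kept, so the output density is bounded by the cell size regardless of how the input was sampled.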

Author Response

Author Response File: Author Response.docx
