Instance Segmentation of LiDAR Point Clouds with Local Perception and Channel Similarity
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
Figures 8 and 9 Visualization
The current descriptions of the visualization results in Figures 8 and 9 appear somewhat brief in the main text. To enhance reader comprehension, we recommend providing more detailed explanations that will enable readers to quickly and accurately interpret these visual representations.
Table 5 Comparative Methods
Table 5 presents multiple comparative methods. To improve academic rigor and facilitate reference tracking, please consider adding corresponding citation numbers immediately following each algorithm name.
Algorithm Efficiency Analysis
Given the important application background of intelligent vehicle driving in this study, computational efficiency represents a crucial performance factor. While mIoU and PQ are indeed appropriate evaluation metrics, we still suggest incorporating efficiency metrics (e.g., inference speed or computational complexity) into the experimental analysis to provide a more comprehensive performance evaluation.
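As an illustration of the kind of efficiency figure requested here, the following is a minimal sketch of how mean per-scan latency and throughput could be measured. It is not the authors' evaluation code: `measure_latency` and `dummy_infer` are hypothetical names, and `dummy_infer` merely stands in for one forward pass of the network on one LiDAR scan.

```python
import time


def measure_latency(infer, warmup=3, runs=20):
    """Return (mean latency in ms, scans per second) for a zero-argument callable."""
    for _ in range(warmup):
        infer()  # warm-up calls are excluded from the timed region
    start = time.perf_counter()
    for _ in range(runs):
        infer()
    elapsed = time.perf_counter() - start
    mean_ms = 1000.0 * elapsed / runs
    fps = runs / elapsed
    return mean_ms, fps


def dummy_infer():
    # Hypothetical stand-in for one forward pass on one LiDAR scan.
    return sum(i * i for i in range(10_000))


mean_ms, fps = measure_latency(dummy_infer)
print(f"{mean_ms:.3f} ms per scan, {fps:.1f} scans/s")
```

When the model runs on a GPU, execution is typically asynchronous, so a device synchronization before each clock read would be needed for the numbers to be meaningful.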
Author Response
For detailed responses, please refer to the attachment.
Author Response File: Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
This paper addresses the LiDAR point cloud instance segmentation task in applications such as autonomous driving by proposing an end‑to‑end network called LCPSNet (LiDAR Channel‑Aware Point Segmentation Network). The authors first review the current landscape of point cloud instance segmentation, noting the limitations of existing top‑down and bottom‑up approaches in local feature extraction, cross‑scale feature fusion, and feature redundancy. To tackle these issues, three core innovations are introduced: LPM, ICCM, and position-aligned multi-scale fusion. I think this is good work. Once the following issues are resolved, the paper can be accepted. I have the following questions:
1. Could the authors provide more technical details on the position‑wise weighting strategy in LPM, such as the convolution kernel sizes, activation functions, and specific fusion operations used to produce the global spatial saliency map and group‑related components?
2. The literature references should be more complete and up to date. For example, the works on a collimator-assisted high-precision calibration method for event cameras and on event-based multi-view photogrammetry for high-dynamic, high-velocity target measurement could serve as valuable references. Including them would give the paper a more comprehensive view of the research field.
3. How are empty or sparse regions handled during the resampling of features to the unified BEV/polar coordinate grid? Was any specific interpolation or filling strategy applied?
4. Regarding the channel similarity matrix in ICCM, how is numerical stability ensured? Could the softmax over high‑dimensional features lead to gradient vanishing or explosion?
5. In the ablation studies, LPM and ICCM individually improve mIoU and PQ. Has their compatibility been tested in lighter backbone networks?
6. For generalization capability, was LCPSNet tested on tasks beyond autonomous driving, such as indoor scans or robotic navigation?
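To make the concern in question 4 concrete, the sketch below shows one common way to keep a row-wise softmax over a channel similarity matrix numerically stable: subtracting each row's maximum before exponentiation. It rests on assumptions, since the manuscript's exact formulation is what the question asks about: the similarity is taken here to be a plain Gram matrix between channel vectors, and `channel_similarity`, `stable_softmax_rows`, and the `scale` parameter are hypothetical names.

```python
import math


def channel_similarity(features):
    """Gram matrix between channels; features is a list of C vectors of length N.

    Assumption: similarity is a plain dot product between channel vectors.
    """
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in features]
            for ci in features]


def stable_softmax_rows(matrix, scale=1.0):
    """Row-wise softmax with max subtraction to avoid overflow in exp."""
    out = []
    for row in matrix:
        scaled = [v * scale for v in row]
        row_max = max(scaled)
        # After the shift the largest exponent is 0, so exp never overflows.
        exps = [math.exp(v - row_max) for v in scaled]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


# Large-magnitude features: a naive math.exp(50000.0) would raise OverflowError.
features = [[100.0, 200.0], [300.0, -50.0]]
attention = stable_softmax_rows(channel_similarity(features))
print(attention)
```

The shift does not change the softmax output mathematically. It does not by itself prevent saturation, however: with very large similarity values the rows still collapse toward one-hot vectors with near-zero gradients, which is why a scaling factor or feature normalization is often applied as well.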
Author Response
For detailed responses, please refer to the attachment.
Author Response File: Author Response.pdf
Reviewer 3 Report
Comments and Suggestions for Authors
The manuscript addresses the practical challenges of sparse and irregular data in point cloud segmentation, with point cloud density varying with distance from the sensor, and proposes the LiDAR Channel-Aware Point Segmentation Network (LCPSNet). Overall, the article is well-structured, with reasonable experimentation. However, several issues remain that require author clarification and revisions:
1. The abstract and conclusion sections are overly lengthy, with repetitive descriptions, particularly in the first paragraph of the conclusion, which mirrors the network structure discussion in the abstract. This is not acceptable. The contributions and experimental details in the abstract are also overly verbose. Please revise and significantly condense both sections.
2. As mentioned in the introduction, the authors highlight three contributions: "The Local Perception Module (LPM), The Inter-Channel Correlation Module (ICCM), and Multi-Scale Fusion with Positional Alignment." These should be the primary focus of the subsequent experimental sections. However, in the ablation study, the authors only conduct basic experiments on the current network without comparing it to similar algorithms. For instance, ablation experiments should also include classic point cloud processing encoders like PointNet (Qi et al., 2017) and PointNet++ (Qi et al., 2017). Additionally, the loss functions should be compared with alternatives such as IoU-Loss and mAcc-Loss (Zheng et al., 2020). I recommend that the authors add more comprehensive explanations and comparisons.
3. The literature review in the Related Work section is neither in-depth nor comprehensive. Given that the paper focuses on instance segmentation in autonomous driving scenarios, a thorough review of point cloud segmentation algorithms is necessary, extending beyond deep learning-based methods. Traditional statistical operator-based algorithms should also be considered. Recent top-tier conference and journal papers should be reviewed, including:
   1. Zhao, L., Hu, Y., Yang, X., Dou, Z., & Kang, L. (2024). Robust Multi-Task Learning Network for Complex LiDAR Point Cloud Data Preprocessing. *Expert Systems with Applications, 237*, 121552.
   2. Kolodiazhnyi, M., Vorontsova, A., Konushin, A., & Rukhovich, D. (2024). OneFormer3D: One Transformer for Unified Point Cloud Segmentation. In *Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition* (pp. 20943-20953).
   3. Zou, T., Qu, S., Li, Z., Knoll, A., He, L., Chen, G., & Jiang, C. (2024). HGL: Hierarchical Geometry Learning for Test-Time Adaptation in 3D Point Cloud Segmentation. In *European Conference on Computer Vision* (pp. 19-36). Cham: Springer Nature Switzerland.
4. The proposed network is named LiDAR Channel-Aware Point Segmentation Network (LCPSNet). Is it appropriate to abbreviate LiDAR in this context? Please reconsider.
5. The organization of the experimental section is somewhat disordered. I recommend placing the ablation study at the end, and the types of instances for segmentation need to be expanded. The content of the current ablation experiments could be moved, except for the network module comparisons, to the qualitative/quantitative experiments section.
Author Response
For detailed responses, please refer to the attachment.
Author Response File: Author Response.pdf
Round 2
Reviewer 3 Report
Comments and Suggestions for Authors
I believe the authors have addressed my concerns, and the manuscript has been revised accordingly. I recommend the acceptance of this paper.