Peer-Review Record

Three-Dimensional Reconstruction of Zebra Crossings in Vehicle-Mounted LiDAR Point Clouds

Remote Sens. 2024, 16(19), 3722; https://doi.org/10.3390/rs16193722

by Zhenfeng Zhao^1,2,3, Shu Gan^1, Bo Xiao^4, Xinpeng Wang^5,6 and Chong Liu^2,*

Reviewer 1: Anonymous

Reviewer 2: Anonymous

Reviewer 3: Anonymous

Reviewer 4: Anonymous

Reviewer 5: Anonymous

Submission received: 5 August 2024 / Revised: 16 September 2024 / Accepted: 29 September 2024 / Published: 7 October 2024

Round 1

Reviewer 1 Report (New Reviewer)

Comments and Suggestions for Authors

This paper presents an innovative method based on energy functions and template matching for 3D reconstruction of zebra crossings using vehicle-mounted LiDAR point cloud data. The paper is logically coherent, with clear illustrations and a well-structured layout. The research methodology is scientifically sound, and the experimental design is meticulous, utilizing data from different cities to ensure the method's broad applicability and reliability. Detailed statistical data and comparative analyses are provided, clearly demonstrating the method's efficiency and accuracy. Additionally, the authors conduct an in-depth discussion of the experimental results, identifying the method's limitations and proposing future research directions, which reflects critical thinking and foresight. The use of charts and visualizations effectively aids in conveying the research content, enhancing the paper's readability and comprehensibility. Overall, the paper excels in both technical depth and research quality, making a significant contribution to the development of high-precision mapping and intelligent transportation systems.

However, several minor issues require clarification:

1. Dependence on Manual RoI Selection: While manual RoI selection effectively reduces data interference and error rates, it also decreases the method's automation level. Could this reliance on manual intervention become a bottleneck in large-scale applications? The paper would benefit from discussing how to reduce manual involvement to improve automation.

2. Potential Subjectivity in Manual Selection: The manual selection process might introduce subjective differences that could impact the results. It is recommended to discuss strategies to minimize or eliminate these subjective biases to enhance result consistency and reliability.

3. Performance in Dynamic Scenarios: The algorithm's effectiveness in dynamic scenarios, such as when pedestrians or vehicles are present on zebra crossings, remains unclear. Further exploration of the algorithm's performance in these situations and possible additional measures to address such complexities would be beneficial.

4. Insufficient Analysis of Anomalous Results: While the paper presents numerous experimental results, it lacks in-depth analysis of cases where the method fails (e.g., low IoU or significant positioning errors). Adding discussions on these failure cases would provide a more comprehensive understanding of the method's limitations and areas for improvement.

5. Limited Discussion on Computational Efficiency and Scalability: The paper does not adequately address the scalability of the algorithm to larger datasets, particularly in terms of performance under increased data volumes. Additionally, the differences in runtime across various hardware environments are not discussed. Including hardware specifications and discussing performance across different setups would provide a more thorough evaluation.

6. Lack of Comparison with Other Methods: The comparative experiments do not include direct comparisons with other advanced methods, such as deep learning approaches or image-based zebra crossing detection techniques. Supplementing the paper with such comparisons would better highlight the proposed method's advantages or disadvantages relative to existing methods.

7. Inadequate Robustness Analysis: Although some experimental results are discussed, there is a lack of in-depth analysis of failure cases, particularly in dense traffic scenarios or under low reflectivity conditions. Adding such discussions would strengthen the robustness analysis.

Author Response

Comments 1: [ Dependence on Manual RoI Selection: While manual RoI selection effectively reduces data interference and error rates, it also decreases the method's automation level. Could this reliance on manual intervention become a bottleneck in large-scale applications? The paper would benefit from discussing how to reduce manual involvement to improve automation.]

Response 1: Many thanks for your comment. Indeed, in the research of extracting road markings, deep learning algorithms have achieved breakthrough results. Once the model is well-trained, the level of automation is relatively high. In our previous work [1], we also made efforts in this area, using deep learning algorithms to automatically acquire marking areas and orientations. This experience made us acutely aware that the current research results still need improvements in accuracy and completeness, making direct application to practical production challenging.

In this context, focusing more on operability in practical applications, we developed a semi-automatic method for extracting markings, where the RoI is initially selected through human-machine interaction. This work can be seen as an effective and necessary supplement to fully automated methods: for areas where the automated extraction results underperform, users can apply our method to re-extract the markings and achieve the desired outcome. The software developed from our method has proven effective in the actual production of high-definition maps.

In today’s large-scale production of high-definition maps, human intervention remains a reality we must face. Fortunately, through repeated testing, we have significantly reduced the workload requiring manual intervention. In our preset parameters, considering the general characteristics of MLS point clouds, values for Q_t, K, ∆t, T_Z, T_L, and α, β, γ can be embedded within the algorithm, with almost no need for further manual intervention to adjust them. The template matching process is also automatically completed by the algorithm. The robustness of the algorithm is demonstrated in our experiments with MLS point cloud data obtained from other cities on different platforms.

For point clouds obtained via TLS and ALS, we did not conduct comparative studies due to differences in point cloud density and intensity contrast. This is clarified in the discussion (page number 18, section 5, and line 556-559). Additionally, our algorithm is designed to fully exploit the limited information collected by most vehicle-based LiDAR systems, without considering the registered RGB information (some vehicle-based LiDAR point clouds do not contain synchronized color attributes). Future research could construct point cloud color distribution histograms based on these texture features, combining intensity information to derive stripe widths and spacing, reducing manual parameter adjustments, and increasing the level of automation. This prospect is discussed in our outlook (page number 19, section 6, and line 592-597).

We also clearly understand that, similar to image-based object recognition tasks, our future research still faces challenges such as point cloud colorization errors from coordinate inconsistencies between system components, color distortion and low contrast due to image quality and lighting changes, color loss due to complex road conditions and wear, and misjudgments from environmental interference leading to color similarities.

[1] Mi, X.; Yang, B.; Dong, Z.; Liu, C.; Zong, Z.; Yuan, Z. A two-stage approach for road marking extraction and modeling using MLS point clouds. ISPRS Journal of Photogrammetry and Remote Sensing 2021, 180, 255-268.

Comments 2: [ Potential Subjectivity in Manual Selection: The manual selection process might introduce subjective differences that could impact the results. It is recommended to discuss strategies to minimize or eliminate these subjective biases to enhance result consistency and reliability.]

Response 2: Thank you for pointing this out. As a semi-automatic method, the selection of the RoI does not need to be highly precise; it only needs to roughly encompass all zebra stripes. This is because determining the number of stripes within the RoI only requires the approximate location of their central lines. Subsequently, determining stripe length and width, as well as template matching, no longer depends on the RoI. In practical production, for areas where LiDAR point cloud quality is ideal, results from previous work [1] can be used. In cases where model generalization is poor, point cloud quality is suboptimal, or zebra stripes are in particularly challenging areas, our semi-automatic method is employed to ensure the completeness of the extraction results.

Comments 3: [ Performance in Dynamic Scenarios: The algorithm's effectiveness in dynamic scenarios, such as when pedestrians or vehicles are present on zebra crossings, remains unclear. Further exploration of the algorithm's performance in these situations and possible additional measures to address such complexities would be beneficial. ]

Response 3: Many thanks for your comment. The algorithm presented in this paper extracts and constructs stripe-level instantiated 3D boundaries of zebra crossings from large-scale static LiDAR point clouds obtained via MLS. It is primarily intended for the production of high-definition maps in the remote sensing field, rather than for real-time traffic marking recognition in autonomous driving. Therefore, the study does not consider effectiveness in dynamic scenes. During the actual production of high-definition maps, pedestrians and vehicles on zebra crossings are filtered out during the ground filtering phase. The resulting voids in the point cloud are effectively handled by our algorithm (page 16, section 5, lines 495-510).

Comments 4: [ Insufficient Analysis of Anomalous Results: While the paper presents numerous experimental results, it lacks in-depth analysis of cases where the method fails (e.g., low IoU or significant positioning errors). Adding discussions on these failure cases would provide a more comprehensive understanding of the method's limitations and areas for improvement.]

Response 4: We appreciate your constructive comments. In real-world scenarios, factors such as sparse point clouds at boundaries, low intensity contrast, unusual zebra stripe shapes, irregular painting, and heavy traffic occlusion can lead to low IoU or significant positioning errors in extraction results. We have added a focused analysis of failure cases in the paper (page 17-18, section 5, lines 528-531, figures 11, 12 and 13). In future research, we plan to incorporate road boundary information to address the issue where zebra stripes near curved road boundaries, which should have gradually decreasing lengths, are instead optimized to a uniform length.
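For readers unfamiliar with the IoU metric mentioned in this exchange, the following is a minimal sketch of how intersection over union could be computed for one reconstructed stripe against its manual annotation. It assumes both rectangles have already been expressed in a shared local frame aligned with the stripe direction, so they can be treated as axis-aligned; the `Rect` type and the `iou` function are hypothetical names for illustration, and the evaluation in the paper may instead use oriented polygons.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    """Axis-aligned rectangle in a stripe-aligned local frame (assumed, in metres)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def area(r: Rect) -> float:
    """Area of a rectangle; degenerate rectangles count as zero."""
    return max(0.0, r.x_max - r.x_min) * max(0.0, r.y_max - r.y_min)


def iou(pred: Rect, truth: Rect) -> float:
    """Intersection over union of two axis-aligned rectangles."""
    inter_w = min(pred.x_max, truth.x_max) - max(pred.x_min, truth.x_min)
    inter_h = min(pred.y_max, truth.y_max) - max(pred.y_min, truth.y_min)
    inter = max(0.0, inter_w) * max(0.0, inter_h)
    union = area(pred) + area(truth) - inter
    return inter / union if union > 0.0 else 0.0


if __name__ == "__main__":
    # Hypothetical stripe: the reconstruction is shifted 1 m along the stripe axis.
    reconstructed = Rect(1.0, 0.0, 5.0, 1.0)
    annotated = Rect(0.0, 0.0, 4.0, 1.0)
    print(f"IoU = {iou(reconstructed, annotated):.2f}")
```

A low value from such a check would flag the kind of failure case discussed above, for example a stripe whose length was optimized to a uniform value near a curved road boundary.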

Comments 5: [ Limited Discussion on Computational Efficiency and Scalability: The paper does not adequately address the scalability of the algorithm to larger datasets, particularly in terms of performance under increased data volumes. Additionally, the differences in runtime across various hardware environments are not discussed. Including hardware specifications and discussing performance across different setups would provide a more thorough evaluation. ]

Response 5: Thank you for highlighting this aspect. As previously noted, this method serves as an essential and effective complement to fully automated machine learning algorithms, optimally fulfilling the practical production needs for comprehensive 3D reconstruction of all zebra stripes. The algorithm exclusively utilizes LiDAR point clouds as a data source, does not require rasterization or extensive data labeling, and circumvents the challenges of normalizing large volumes of point cloud intensity and density. It performs parallel computations within each pre-selection box, with the total reconstruction process for all zebra stripes taking only 1 second (page 14, section 4, lines 464-465). The computational demands are minimal, enabling efficient execution of detection and reconstruction algorithms on a CPU without the need for a GPU, making it highly suitable for production environments. As an interactive semi-automatic algorithm, it typically does not require processing large datasets, and the mainstream computer hardware and software configurations of high-definition map production companies are sufficient to support its operation. Therefore, this aspect is not discussed further in the paper.

Comments 6: [Lack of Comparison with Other Methods: The comparative experiments do not include direct comparisons with other advanced methods, such as deep learning approaches or image-based zebra crossing detection techniques. Supplementing the paper with such comparisons would better highlight the proposed method's advantages or disadvantages relative to existing methods. ]

Response 6: Many thanks for your comment. During the research process, we had plans to conduct direct comparison experiments but, after careful consideration, decided against it for the following reasons:

1. Image-based zebra crossing detection techniques either rarely have high-precision elevation data or do not even consider planar coordinate information. The different research fields and focal points reduce the significance of such comparisons.

2. Deep learning methods generally have poor generalization. Our semi-automatic method is an effective and necessary supplement to the results extracted by fully automated deep learning models. The algorithm runs in parallel within the RoI, focusing on the intensity distribution and on completing the reconstruction within each small RoI. The accuracy and completeness of reconstruction are much higher than those of ordinary deep learning methods. Therefore, we believe that simply comparing the two approaches is unfair.

Additionally, it is important to note that in the study of road marking extraction and reconstruction using MLS point clouds, even the most advanced deep learning frameworks still face formidable challenges with generalization due to the complexity of global point cloud density and intensity contrast. These frameworks continue to struggle to meet the demands of actual high-definition map production. Therefore, we selected zebra crossings as our first research focus and conducted in-depth explorations with practical application value. Our extensive review of the literature revealed a scarcity of research directly applicable to the actual production of zebra crossings in high-definition maps, further inspiring us to pursue this line of inquiry.

Comments 7: [ Inadequate Robustness Analysis: Although some experimental results are discussed, there is a lack of in-depth analysis of failure cases, particularly in dense traffic scenarios or under low reflectivity conditions. Adding such discussions would strengthen the robustness analysis. ]

Response 7: Thank you for pointing this out. As stated in response 4, we have added an analysis of the failure cases. (page number 17-18, section 5, and line 528-531, figure 11, 12 and 13)

Author Response File: Author Response.pdf

Reviewer 2 Report (New Reviewer)

Comments and Suggestions for Authors

Summary:

The manuscript "3D reconstruction of zebra crossings in vehicle mounted LiDAR point clouds" proposes a 3D reconstruction method for zebra crossings in vehicle mounted LiDAR point clouds based on energy function and template matching. The article first introduces the importance of zebra crossings in transportation and the shortcomings of existing methods for obtaining their information, and then highlights the advantages of using LiDAR point clouds for reconstruction. Next, the data, research framework, and specific methods used in the experiment were elaborated in detail, including the calculation of the number of zebra stripes, rough pose localization, template matching, and 3D reconstruction. Finally, the experimental results were presented, and the performance of the algorithm was evaluated and discussed. Its advantages and limitations were analyzed, and future research directions were proposed.

Strength:

1. The beginning of the manuscript clearly explains the importance of zebra crossings in smart city infrastructure management, intelligent driving, and other fields, as well as the necessity of realizing their 3D instantiation in high-precision map production, clarifying the significance and value of the research.

2. In section 2.3.1, it was mentioned that the experimental data was selected from the annotated WHU Urban3D dataset, which covers multiple data types and has undergone coordinate unification and registration. The selected point cloud data of eight roads in Shanghai can represent practical application scenarios and has high reliability and universality.

3. The 3D reconstruction method based on energy function and template matching proposed in Section 3.5 is innovative and can effectively process LiDAR point cloud data, achieve fast extraction and reconstruction of zebra crossings, and does not require a large number of sample labels and training, avoiding the process of converting to raster images. It has strong innovation and pertinence.

4. Clear parameters were carefully designed at different stages of the algorithm, and the optimal settings were determined through repeated experiments, enabling the algorithm to effectively extract and reconstruct various atypical features of zebra crossings, such as fewer stripes, irregular lengths, or parallelograms. In the stages of identifying the number and determining the length of zebra stripes, the algorithm has been carefully designed with preset intervals, iterative filtering parameters, segmentation intervals, etc., to adapt to different situations of zebra stripes.

5. This method has significant potential application value in fields such as intelligent transportation systems and autonomous driving. It can accurately and timely update the location and status of zebra crossings, improve the accuracy of vehicle positioning and path planning, and contribute to safer and more efficient navigation. The article introduces an innovative approach using natural markers for photogrammetry, offering a fresh perspective to address the cumbersome process of sticker attachment and removal in traditional photogrammetry.

Weakness:

1. When introducing the construction of energy functions in section 3.5.1, the explanation of some formulas and parameters is not detailed enough, which may cause readers to have difficulty understanding. For example, the specific meanings and functions of α, β, and γ in formula (10) are not clearly explained in the text.

2. In the fifth section, although the article discussed the limitations of the algorithm in special situations, such as when the cement cover or white warning paint is close to the zebra stripes and away from the external area of the scanning trajectory line, there was not sufficient discussion on other factors that may affect the performance of the algorithm, such as complex traffic environments, different weather conditions, etc.

3. The comparative experiments in sections 4-5 are relatively simple. The article mainly compares the accuracy of the algorithm with manually annotated results, but lacks detailed comparative experiments with other related algorithms or methods, making it difficult to fully demonstrate the superiority of this method.

4. When introducing the specific steps of the algorithm in sections 4.3 and 3.4, some of the content lacks corresponding charts to visually display them. For example, in the process of calculating the number and length of zebra stripes, some schematic diagrams can be added to help readers better understand the establishment of local coordinate systems, the distribution of candidate point sets, and the process of energy calculation.

5. The research scope can be further expanded: The article mainly focuses on the extraction and reconstruction of zebra crossings, and does not cover other related road facilities or traffic signs. In the future, the research scope can be expanded to construct a more complete road scene model.

Author Response

Comments 1: [When introducing the construction of energy functions in section 3.5.1, the explanation of some formulas and parameters is not detailed enough, which may cause readers to have difficulty understanding. For example, the specific meanings and functions of α, β, and γ in formula (10) are not clearly explained in the text. ]

Response 1: Thank you for pointing this out. In the revised manuscript, we have added further explanations of the meaning and function of these parameters and formulas (page 10, section 3.5.1, lines 354-356; page 13, section 4, lines 457-458). In formula (10), α, β, and γ represent the weight coefficients of each term in the energy function. After applying max-min normalization to each energy term, the greater the entropy of the intensity information in the point cloud, the smaller the value of the energy function. Therefore, a negative sign is used before β.
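The role of the weights described in this response can be illustrated with a small sketch. It is not the paper's formula (10), which is not reproduced in this record: the only parts taken from the response are the three weights α, β, γ, the max-min normalization of each term, and the negative sign on the entropy term. The first and third terms (`term_a`, `term_c`), the histogram binning, and all function names are hypothetical placeholders.

```python
import math
from typing import List, Sequence


def min_max(values: Sequence[float]) -> List[float]:
    """Max-min normalization to [0, 1]; a constant term maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def intensity_entropy(intensities: Sequence[float], bins: int = 16,
                      lo: float = 0.0, hi: float = 255.0) -> float:
    """Shannon entropy (bits) of an intensity histogram (binning is an assumption)."""
    counts = [0] * bins
    for x in intensities:
        idx = min(bins - 1, max(0, int((x - lo) / (hi - lo) * bins)))
        counts[idx] += 1
    n = len(intensities)
    return -sum(c / n * math.log2(c / n) for c in counts if c)


def combined_energy(term_a: Sequence[float], entropy: Sequence[float],
                    term_c: Sequence[float], alpha: float = 1.0,
                    beta: float = 1.0, gamma: float = 1.0) -> List[float]:
    """Weighted energy per candidate: alpha*A - beta*H + gamma*C after normalization.

    The minus sign on the entropy term means higher intensity entropy
    lowers the energy, as the response describes.
    """
    a, h, c = min_max(term_a), min_max(entropy), min_max(term_c)
    return [alpha * x - beta * y + gamma * z for x, y, z in zip(a, h, c)]


if __name__ == "__main__":
    # Three hypothetical candidate poses with identical placeholder terms,
    # so only the entropy term distinguishes them.
    energies = combined_energy([0.2, 0.2, 0.2], [1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
    best = min(range(len(energies)), key=energies.__getitem__)
    print(energies, "-> lowest-energy candidate:", best)
```

Under these assumptions, the candidate with the highest intensity entropy receives the lowest energy, and the relative sizes of α, β, and γ decide how strongly each normalized term influences the selection.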

Comments 2: [ In the fifth section, although the article discussed the limitations of the algorithm in special situations, such as when the cement cover or white warning paint is close to the zebra stripes and away from the external area of the scanning trajectory line, there was not sufficient discussion on other factors that may affect the performance of the algorithm, such as complex traffic environments, different weather conditions, etc.]

Response 2: We are grateful for your detailed review. One of the advantages of our work is the selection of vehicle-mounted LiDAR point clouds as the data source for reconstructing 3D zebra stripes. Unlike traditional image- and video-based methods, LiDAR point clouds have the prominent advantage of being minimally affected by lighting and weather conditions (page 6, section 2.3, lines 220-224). Of course, the quality of the point clouds varies depending on the data collection platform and traffic conditions. To address this, in addition to nearly 10 km of urban roads in Shanghai, we also selected LiDAR point clouds collected at different times, locations, and with different data collection platforms as our study objects. As shown in Figure 10 and Table 4, the robustness of our algorithm is evident. In fact, the software developed based on this method has been successfully applied to the production of high-definition maps for thousands of kilometers of highways, demonstrating the effectiveness and robustness of the method presented in this paper. The data presented in the paper represent only a portion of the preliminary experiments.

For complex local traffic environments, where data loss occurs due to obstructions like pedestrians and vehicles, we have included a discussion on optimizing the extraction results (page 16, section 5, lines 495-509, figure 11).

Comments 3: [The comparative experiments in sections 4-5 are relatively simple. The article mainly compares the accuracy of the algorithm with manually annotated results, but lacks detailed comparative experiments with other related algorithms or methods, making it difficult to fully demonstrate the superiority of this method. ]

Response 3: Thank you very much for pointing this out. During the research process, we had plans to conduct direct comparison experiments but, after careful consideration, decided against it for the following reasons:

Comments 4: [When introducing the specific steps of the algorithm in sections 4.3 and 3.4, some of the content lacks corresponding charts to visually display them. For example, in the process of calculating the number and length of zebra stripes, some schematic diagrams can be added to help readers better understand the establishment of local coordinate systems, the distribution of candidate point sets, and the process of energy calculation.]

Response 4: Many thanks for your comment. We used Figure 4 to illustrate the establishment of the local coordinate system and the distribution of candidate points, and Figure 6 to depict the energy calculation process. Figure 5 represents the calculation of zebra stripe lengths. After thorough consideration, we optimized Figure 4(b) to better illustrate the selection and distribution of candidate points. This enhancement helps readers understand the principles of our algorithm more clearly.

Comments 5: [The research scope can be further expanded: The article mainly focuses on the extraction and reconstruction of zebra crossings, and does not cover other related road facilities or traffic signs. In the future, the research scope can be expanded to construct a more complete road scene model.]

Response 5: We appreciate your constructive comments. In fact, following the main ideas of this paper, we have begun using energy functions and template matching methods to extract road markings and parking spaces in areas with high roadside obstructions. We have achieved some preliminary success and hope that these results will better address practical issues in high-definition map production.

Author Response File: Author Response.pdf

Reviewer 3 Report (New Reviewer)

Comments and Suggestions for Authors

1. The zebra crossing 3D reconstruction algorithm proposed in this article has set too many parameters, and the optimal values of the parameters were finally determined through repeated experiments. However, due to the limited experimental dataset, further verification is needed to determine whether the empirical values of the parameters provided in Table 2 can adapt to laser point cloud data collected by various vehicle mounted sensors and different urban road scenes.

2. This paper focuses on the reconstruction of zebra crossings based on vehicle mounted laser point cloud data, which mainly relies on the intensity and morphological features of the point cloud, while ignoring the most significant texture features of the zebra crossings. Therefore, the stability and reliability of zebra crossing reconstruction are limited. It is recommended to combine the texture features of zebra crossings in vehicle mounted panoramic images for zebra crossing segmentation and extraction to improve the applicability and robustness of zebra crossing reconstruction.

Comments on the Quality of English Language

The overall expression in this paper

Author Response

Comments 1: [The zebra crossing 3D reconstruction algorithm proposed in this article has set too many parameters, and the optimal values of the parameters were finally determined through repeated experiments. However, due to the limited experimental dataset, further verification is needed to determine whether the empirical values of the parameters provided in Table 2 can adapt to laser point cloud data collected by various vehicle mounted sensors and different urban road scenes.]

Response 1: Many thanks for your comment. Considering the common characteristics of MLS point clouds and the geometric features of zebra crossings, we preset six key parameters in our algorithm: Q_t, K, ∆t, T_Z, T_L, and α, β, γ. We determined their optimal values through repeated experiments. To verify the robustness of the algorithm, we conducted comparative experiments across three cities, three major platforms, 27 zebra crossing areas, and 448 zebra stripes, as shown in Figure 10 and Table 4. These data are representative and demonstrate that our predetermined parameter values require almost no manual intervention for adjustments. In fact, beyond nearly 10 km of urban roads in Shanghai, we also produced high-definition maps using LiDAR point clouds collected at different times and locations, and under various traffic conditions. The software developed based on this method has been successfully applied to the production of thousands of kilometers of highway high-definition maps, confirming the effectiveness and robustness of our approach. The data presented in the paper represent only a portion of our preliminary experiments.

Comments 2: [This paper focuses on the reconstruction of zebra crossings based on vehicle mounted laser point cloud data, which mainly relies on the intensity and morphological features of the point cloud, while ignoring the most significant texture features of the zebra crossings. Therefore, the stability and reliability of zebra crossing reconstruction are limited. It is recommended to combine the texture features of zebra crossings in vehicle mounted panoramic images for zebra crossing segmentation and extraction to improve the applicability and robustness of zebra crossing reconstruction.]

Response 2: We appreciate your insightful feedback. The initial purpose of our algorithm is to address situations where only point cloud data is available, without other data sources. It aims to fully exploit the limited information collected by most vehicle-mounted LiDAR systems and does not rely on synchronized RGB data, since some of these systems do not record color attributes. In fact, the intensity information from the point clouds we use partially represents the material, geometry, and color properties of the target objects, and it is also a type of texture feature.

In our future research, we will construct histograms of color distribution in point clouds based on color information and combine them with intensity data to derive stripe widths and spacing. This approach can reduce manual parameter adjustments, enhancing the algorithm’s automation, applicability, and robustness. In response, we mentioned at the end of the paper (page 19, section 6, lines 592-597) the idea of using color attributes in point clouds to assist zebra stripe detection, which is a prospect for future work. We also clearly understand that, similar to image-based object recognition tasks, our future research still faces challenges such as point cloud colorization errors from coordinate inconsistencies between system components, color distortion and low contrast due to image quality and lighting changes, color loss due to complex road conditions and wear, and misjudgments from environmental interference leading to color similarities.

Author Response File: Author Response.pdf

Reviewer 4 Report (New Reviewer)

Comments and Suggestions for Authors

To address the issue of traffic marking extraction accuracy in practical production, which is affected by degradation, occlusion, and non-standard variations, this paper proposes a 3D reconstruction method based on energy functions and template matching, using zebra crossings in vehicle-mounted LiDAR point clouds as an example. This is a very meaningful work, but there are still some minor issues.

1. Extracting road signs is a long-term task, and many authors have proposed methods in the past. Please add relevant previous works, such as “Accurate Road Marking Detection from Noisy Point Clouds Acquired by Low-Cost Mobile LiDAR Systems”.

2. The analysis problem in Introduction does not strongly point out the research space of using LiDAR data for zebra crossing extraction. The description of innovation can be further strengthened to make it easier for readers to accept.

3. In the Related Work from lines 98 to 210, three methods related to data sources are cited. There is no logical progressive relationship between the cited papers, which cannot clearly express why these papers are cited, which appears to be stacked and redundant, and fails to point out the shortcomings of the citations, which cannot provide strong support for the author's research.

Author Response

Comments 1: [Extracting road signs is a long-term task, and many authors have proposed methods in the past. Please add relevant previous works, such as “Accurate Road Marking Detection from Noisy Point Clouds Acquired by Low-Cost Mobile LiDAR Systems”.]

Response 1: We appreciate your comment and have added these references in the related work. Detecting road markings from noisy point clouds obtained by low-cost multi-beam mobile LiDAR, using intensity gradients and statistical histograms, is currently a significant research focus with notable achievements. In our literature review, we initially overlooked this aspect, but we have included it in the revised manuscript (page 5, section 2.3, lines 198-202). Thank you again for your valuable feedback.

Comments 2: [ The analysis problem in Introduction does not strongly point out the research space of using LiDAR data for zebra crossing extraction. The description of innovation can be further strengthened to make it easier for readers to accept.]

Response 2: We are grateful for your detailed review. We have thoroughly reviewed the introduction, detailing the current research status and the limitations of various methods point by point. We highlighted the advantages of using LiDAR point clouds for zebra crossing extraction and identified areas for improvement in existing studies (page 2, section 1, lines 46-70). We also summarized and enhanced the contributions of the paper in the introduction (page 3, section 1, lines 93-108), emphasizing the necessity and innovation of our algorithm to make it more accessible to readers. Thank you again for your suggestions.

Comments 3: [ In the Related Work from lines 98 to 210, three methods related to data sources are cited. There is no logical progressive relationship between the cited papers, which cannot clearly express why these papers are cited, which appears to be stacked and redundant, and fails to point out the shortcomings of the citations, which cannot provide strong support for the author's research. ]

Response 3: Many thanks for your comment. In our review of existing related work, we focused on "zebra crossing extraction" and conducted a systematic literature survey. This was presented in a review-like manner to provide readers and fellow researchers with a comprehensive understanding of the current research landscape. Logically, we summarized the development of each data source following a general progression from conventional classic methods, to deep learning network architectures and optimization, to 3D information acquisition, while considering the publication order. We removed redundant papers with high similarity. Additionally, we briefly summarized the shortcomings of the cited papers to better support our research. (page 4, section 2.1, lines 136-137, 146-148; page 4-5, section 2.2, lines 161-163, 176-180; page 5, section 2.3, lines 197-198, 216-219)

Author Response File: Author Response.pdf

Reviewer 5 Report (New Reviewer)

Comments and Suggestions for Authors

Dear authors,

Thank you for your efforts in producing this paper. Below, you can find my suggestions and questions:

Line 15: The authors state, "This method guarantees the accuracy of the mapping results," but what accuracy does it guarantee? It should be factually stated here.

Figure 1 – what is the unit of the Intensity value? Perhaps it would be ideal to include the difference in reflectivity between the road surface and the parts where the zebra is.

Lines 220-221 and Figure 2 (a), I recommend giving at least a basic characteristic of the AS-900HL scanning system, as mentioned in the text and figure.

Lines 230-231: The manual selection of the ROI can be considered a major disadvantage of the algorithm. First, how precisely does it need to be selected? Moreover, it would significantly advance if it were at least partially automated.

Line 251: How is the candidate point set Q created? Are all the points from the pre-selected box added to this set? If so, wouldn't it be better to exclude some points that have a lower intensity than some selected values? In this way, it would not be necessary to "test" so many points in the segmentation process.

Lines 320-321: Does this method work only in the case of a zebra crossing shaped as a rectangle (or parallelogram)? Nowadays, in different areas of the world, it is a trend to give 3D zebras; in that case, the zebra crossings' shape can vary. Would it be possible to detect these types of crossings with the depicted algorithm?

Lines 368-370: I need clarification on why it is needed to obtain 3D coordinates. If we are talking about a point cloud, each point has its 3D coordinates, is it not possible to use these coordinates?

General comment: As far as the results and conclusion are concerned, a comparison with some other similar method (state-of-the-art) and, thus, a demonstration of the novelty of the algorithm needs to be improved. In addition, as I mentioned above, manual selection of ROI at the beginning of the algorithm can be a major disadvantage, if they want to deploy it on a large area with a huge number of points of the point cloud.

Regards

Comments on the Quality of English Language

Minor editing of English language required.

Author Response

Comments 1: [Line 15: The authors state, "This method guarantees the accuracy of the mapping results," but what accuracy does it guarantee? It should be factually stated here.]

Response 1: Thank you for pointing this out. In the revised manuscript, we have provided a clear and quantitative description of the results of our method, adhering to scientific writing conventions. (page number 1, line 15-18)

Comments 2: [ Figure 1 – what is the unit of the Intensity value? Perhaps it would be ideal to include the difference in reflectivity between the road surface and the parts where the zebra is.]

Response 2: Many thanks for your comment. The reflection intensity of each point in a LiDAR-scanned point cloud is typically a relative value without a unified physical unit. It represents the proportion of laser pulse energy reflected from the object's surface back to the sensor, relative to the emitted energy, rather than an absolute physical quantity.

Laser intensity is an important physical parameter reflecting the surface characteristics of targets, indicating their spectral reflection properties. Different target surfaces have varying reflectance at specific laser wavelengths, resulting in differences in the laser intensity values we obtain [1-3]. In the context of this study, although different LiDAR systems may define and measure intensity differently, the difference in reflection intensity between zebra crossings and road surfaces remains quite distinct.

[1] Li X, Ma L, Xu L. Empirical modeling for non-Lambertian reflectance based on full-waveform laser detection[J]. Optical Engineering, 2013, 52(11): 116110-116110.

[2] Kashani A G, Olsen M J, Parrish C E, et al. A Review of LiDAR radiometric processing: From Ad Hoc intensity correction to rigorous radiometric calibration[J]. Sensors, 2015, 15(11): 28099-28128.

[3] Khan S, Wollherr D, Buss M. Modeling laser intensities for simultaneous localization and mapping[J]. IEEE Robotics and Automation Letters, 2016, 1(2): 692-699.

Comments 3: [ Lines 220-221 and Figure 2 (a), I recommend giving at least a basic characteristic of the AS-900HL scanning system, as mentioned in the text and figure.]

Response 3: Many thanks for your comment. We have supplemented the manuscript with a table detailing the relevant basic characteristics of the AS-900HL scanning system. (page number 6, section 3.1, and Table 1)

Comments 4: [Lines 230-231: The manual selection of the ROI can be considered a major disadvantage of the algorithm. First, how precisely does it need to be selected? Moreover, it would significantly advance if it were at least partially automated.]

Response 4: Many thanks for your comment. In previous research, we used machine learning methods to automatically determine the positions of various objects of interest, including zebra crossings [4]. Although this approach achieved groundbreaking results, it unfortunately still falls short of meeting the demands for completeness and accuracy in practical applications, especially in terms of completeness. To address this, we developed the algorithm presented in this paper. Compared to the fully automated machine learning approach, our semi-automatic method has a different starting point, placing greater emphasis on practical implementation. In real-world operations, fully automated machine learning algorithms require substantial computational resources. Furthermore, their poor generalization across different cities often results in omissions. After extraction, a manual visual check of all data is necessary, followed by the use of a semi-automatic algorithm to complete targets not effectively extracted. The low computational demands of the semi-automatic algorithm make it more user-friendly for general production environments, ensuring task completeness and complementing fully automated methods. The software developed based on this method has been successfully applied to the production of thousands of kilometers of highway high-definition maps, confirming the effectiveness and robustness of our approach. The data presented in the paper represent only a portion of our preliminary experiments.

Regarding algorithm accuracy and semi-automation, the selection of the RoI doesn't need to be highly precise; it only needs to roughly encompass all the zebra stripes. This is because determining the number of stripes within the RoI only requires the approximate location of their central lines (Response 5 elaborates on this further). Subsequently, determining stripe length and width, as well as template matching, no longer depends on the RoI. In practical production, for areas where LiDAR point cloud quality is ideal, results from previous work [4] can be used. In cases where model generalization is poor, point cloud quality is suboptimal, or zebra stripes are in particularly challenging areas, our semi-automatic method is employed to ensure the completeness of the extraction results. (page number 16, section 5, lines 495-509, and Figure 11)

Additionally, our algorithm is designed to fully exploit the limited information collected by most vehicle-based LiDAR systems, without considering the registered RGB information (some vehicle-based LiDAR point clouds do not contain synchronized color attributes). Future research could construct point cloud color distribution histograms based on these texture features, combining intensity information to derive stripe widths and spacing, reducing manual parameter adjustments, and increasing the level of automation. This prospect is discussed in our outlook (page number 19, section 6, and line 592-597). We also clearly understand that, similar to image-based object recognition tasks, our future research still faces challenges such as point cloud colorization errors from coordinate inconsistencies between system components, color distortion and low contrast due to image quality and lighting changes, color loss due to complex road conditions and wear, and misjudgments from environmental interference leading to color similarities.

[4] Mi, X.; Yang, B.; Dong, Z.; Liu, C.; Zong, Z.; Yuan, Z. A two-stage approach for road marking extraction and modeling using MLS point clouds. ISPRS Journal of Photogrammetry and Remote Sensing 2021, 180, 255-268.

Comments 5: [Line 251: How is the candidate point set Q created? Are all the points from the pre-selected box added to this set? If so, wouldn't it be better to exclude some points that have a lower intensity than some selected values? In this way, it would not be necessary to "test" so many points in the segmentation process.]

Response 5: Many thanks for your comment. The main objective of Section 3.3 is to determine the number of stripes within the pre-selection box. The candidate point set Q is not composed of LiDAR points; instead, it comprises points along the y-axis of the pre-selection box. The process begins by selecting points every 0.05m along the y-axis, based on the precision requirements of high-definition map production (planar accuracy of 0.05m), to form the candidate set Q. Subsequently, we calculate the sum of intensities of all LiDAR points in the neighborhood of each candidate point. By selecting the first maximum value, q_max, and adding it to the Q_max set, we apply the principle of non-maximum suppression to remove other candidate points in the surrounding neighborhood. This step is repeated to select subsequent maximum values until all maximum points are identified. Consequently, the lines extended through points in Q_max essentially represent the central axes of each stripe in the pre-selection box, with the number of points in Q_max indicating the number of stripes. Thus, we did not "test" all points during segmentation. To enhance the readers' understanding of our algorithm, we have redrawn Figure 4(b). (page number 8, section 3.3, and figure 4(b))
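
The stripe-counting step described in Response 5 can be illustrated with a minimal Python sketch. It is not the authors' implementation. Only the 0.05 m candidate spacing comes from the response; the function name `stripe_centers`, the one-dimensional neighborhood along the y-axis, the `window` and `suppress` radii, and the `min_ratio` stopping rule are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def stripe_centers(
    points: Sequence[Tuple[float, float]],  # (y, intensity) in the RoI frame
    y_min: float,
    y_max: float,
    step: float = 0.05,      # candidate spacing stated in the response
    window: float = 0.15,    # assumed half-width of the neighborhood (m)
    suppress: float = 0.40,  # assumed suppression radius (m)
    min_ratio: float = 0.5,  # assumed stop threshold relative to the best score
) -> List[float]:
    """Return estimated stripe centre positions along the RoI y-axis."""
    # Candidate set Q: positions sampled every `step` metres along the y-axis.
    n = int(round((y_max - y_min) / step)) + 1
    candidates = [y_min + i * step for i in range(n)]

    # Score each candidate by the summed intensity of nearby LiDAR points.
    scores = [
        sum(inten for y, inten in points if abs(y - c) <= window)
        for c in candidates
    ]
    best = max(scores, default=0.0)
    if best <= 0.0:
        return []

    alive = [True] * n
    centers: List[float] = []
    while True:
        # Pick the strongest candidate that has not been suppressed yet.
        idx = max(
            (i for i in range(n) if alive[i]),
            key=scores.__getitem__,
            default=None,
        )
        if idx is None or scores[idx] < min_ratio * best:
            break
        centers.append(candidates[idx])  # this is one element of Q_max
        # Non-maximum suppression: drop every candidate near the chosen one.
        for i in range(n):
            if abs(candidates[i] - candidates[idx]) <= suppress:
                alive[i] = False
    return sorted(centers)
```

Under these assumptions, `len(stripe_centers(points, y_min, y_max))` plays the role of the stripe count, and each returned value approximates the position of one stripe's central axis. The suppression radius has to stay below the stripe spacing, otherwise neighboring stripes would be merged.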

Comments 6: [Lines 320-321: Does this method work only in the case of a zebra crossing shaped as a rectangle (or parallelogram)? Nowadays, in different areas of the world, it is a trend to give 3D zebras; in that case, the zebra crossings' shape can vary. Would it be possible to detect these types of crossings with the depicted algorithm?]

Response 6: We appreciate your constructive comments. Our algorithm primarily detects "black and white" zebra crossing types. It is designed to support high-definition map production and road asset inventory and maintenance by enabling stripe-level instantiated 3D zebra crossing modeling, rather than merely identifying zebra crossing areas. Rectangles and parallelograms are used as templates based on the main shape characteristics of zebra stripes for 3D reconstruction. The algorithm adapts well to variations in stripe length and can be adjusted for other relatively uniform simple shapes. However, it struggles with more complex, creative zebra crossing designs. In future research, we plan to develop a new deep learning framework to detect zebra crossing areas in these special cases.

Comments 7: [Lines 368-370: I need clarification on why it is needed to obtain 3D coordinates. If we are talking about a point cloud, each point has its 3D coordinates, is it not possible to use these coordinates?]

Response 7: We appreciate your insightful feedback. The (x, y) coordinates here refer to the ordered node coordinates of the template used to match the optimal position and orientation of zebra stripes, and these positions may not necessarily contain point cloud data. In practical production, we typically denoise and filter the point cloud to reduce data volume and obtain ground points. Then, using a fitting algorithm, we derive the road surface equation. Based on this, we substitute the template node (x, y) coordinates into the road surface equation, elevating the template from 2D to the 3D inclined plane space.
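
The lifting step in Response 7 can be sketched in Python as follows. This is a hedged illustration rather than the authors' code: it assumes the road surface equation is a plane z = a*x + b*y + c fitted to the ground points by ordinary least squares, and the names `fit_plane` and `lift_template` are hypothetical.

```python
from typing import List, Sequence, Tuple

Point2 = Tuple[float, float]
Point3 = Tuple[float, float, float]


def fit_plane(ground: Sequence[Point3]) -> Tuple[float, float, float]:
    """Least-squares fit of z = a*x + b*y + c to filtered ground points."""
    n = len(ground)
    if n < 3:
        raise ValueError("at least three ground points are required")
    # Centre the data so that large map coordinates do not hurt precision.
    mx = sum(p[0] for p in ground) / n
    my = sum(p[1] for p in ground) / n
    mz = sum(p[2] for p in ground) / n
    sxx = sum((p[0] - mx) ** 2 for p in ground)
    syy = sum((p[1] - my) ** 2 for p in ground)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in ground)
    sxz = sum((p[0] - mx) * (p[2] - mz) for p in ground)
    syz = sum((p[1] - my) * (p[2] - mz) for p in ground)
    # Solve the 2x2 normal equations for the two slopes.
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("ground points are collinear; plane is undefined")
    a = (sxz * syy - syz * sxy) / det
    b = (syz * sxx - sxz * sxy) / det
    c = mz - a * mx - b * my
    return a, b, c


def lift_template(
    nodes: Sequence[Point2], plane: Tuple[float, float, float]
) -> List[Point3]:
    """Substitute template node (x, y) into the plane to obtain 3D nodes."""
    a, b, c = plane
    return [(x, y, a * x + b * y + c) for x, y in nodes]
```

A typical use would be `lift_template(template_nodes, fit_plane(ground_points))`, where `template_nodes` are the ordered 2D corners of a matched stripe. This gives every node an elevation even where no LiDAR return exists at that exact position, which is the situation the response describes.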

General comment: [ As far as the results and conclusion are concerned, a comparison with some other similar method (state-of-the-art) and, thus, a demonstration of the novelty of the algorithm needs to be improved. In addition, as I mentioned above, manual selection of ROI at the beginning of the algorithm can be a major disadvantage, if they want to deploy it on a large area with a huge number of points of the point cloud.]

Response: Thank you for highlighting this aspect. During the research process, we had plans to conduct direct comparison experiments but, after careful consideration, decided against it for the following reasons:

The core motivation of our algorithm design is to provide an effective and essential supplement to deep learning outcomes, focusing on the completeness and rationality of extraction and reconstruction results, rather than fully automated detection and recognition of large datasets. Our research has demonstrated practicality and innovation in solving real-world production issues. As an interactive semi-automatic algorithm, it generally does not require processing large datasets, and the mainstream computer hardware and software configurations used by high-definition map production companies are sufficient to support its operation. Therefore, this aspect is not discussed in the paper.

Author Response File: Author Response.pdf

Round 2

Reviewer 2 Report (New Reviewer)

Comments and Suggestions for Authors

Good job!

Reviewer 5 Report (New Reviewer)

Comments and Suggestions for Authors

Dear authors,

Thank you for addressing all of my comments. After the revision, the manuscript has been improved.

I have no more comments.

Regards

Comments on the Quality of English Language

Minor editing of English language required.

This manuscript is a resubmission of an earlier submission. The following is a list of the peer review reports and author responses from that submission.

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The comments are attached.

Comments for author File: Comments.pdf

Comments on the Quality of English Language

Standard English grammar should be used throughout the manuscript. The manuscript should be thoroughly checked to eliminate grammatical errors.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

This paper mainly introduces a 3D reconstruction method based on energy function and template matching. It lies within the scope of the journal. Overall, the research is solid and interesting. And its presentation is clear and straightforward.

My comments are as follows.

1. Suggest providing detailed information on the characteristics, scale, and processing of cloud datasets to meet experimental requirements, including preprocessing steps and any filtering conditions.

2. It is possible to quantify and discuss the performance stability of the algorithm in the face of different environmental conditions (such as lighting changes, weather effects) and different quality point clouds, in order to support the statement of algorithm robustness.

3. What preprocessing operations were performed before using LiDAR point clouds for 3D reconstruction to ensure data quality, such as noise removal, outlier detection, or data alignment?

4. The article points out that this method can effectively address the problems of uneven point cloud density and intensity distribution. What specific techniques or algorithm steps are used to correct these problems?

5. What is the demand for computing resources for the proposed method in practical applications? How much time and computational resources does it take to perform a 3D reconstruction?

6. In the conclusion section, in addition to summarizing the research results, the potential value of this method in practical applications should also be emphasized, such as its application prospects in intelligent transportation systems, autonomous driving map updates, and so on.

7. How will the idea of using point cloud color distribution to assist in identifying zebra crossing width and spacing be implemented at the end of the article? Are there any theoretical or technical difficulties that need to be overcome?

Comments on the Quality of English Language

The quality of English language should be improved.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

The paper presents a method for 3D, Lidar based, zebra crossing recognition for high precision maps.

The method starts with the manual selection of the ROI where the Zebra crossing is positioned inside the LiDAR point cloud, then the number of zebra stripes is determined and for each zebra stripe the width and length are calculated. After this a template matching procedure based on energy functions is performed and the results are refined in an iterative manner.

The whole method seems to be very geometric oriented and is not based on newer, state of the art methods (machine learning based). The authors do not provide any information on how the exact localization of the Zebra crossings is done in the scene, nor how the high definition maps are calculated.

By using only LiDAR information the method is error prone to outlier, lack of density in LiDAR data, etc.

What is the value of w_zebra (the width of the Zebra stripe) used in the paper?

In section 3.4 rough pose positioning it is not clear what is the chosen interval for computing the length of the Zebra stripe (figure 2b, 2c).

How are the width and length of the Zebra stripe determined, when the Zebra stripe is not rectangular or parallelogram shape, i.e. there are some occlusions in the zebra stripes (manholes, stains, etc.) resulting in irregular shapes?

The data-set used for the experimental results is very small: ~12 Zebra crossings in Shanghai, 8 in Wuhan. This is too small for a serious research paper. If you manually labeled the zebra crossings, why not use a machine learning solution which would be more robust to outliers? At least one could try to fuse image and LiDAR for obtaining the region of interest!

What is the size of the analyzed point cloud inside the region of interest for each Zebra crossing case? 1 Second processing time in C++ on an x64 machine is a huge runtime for automotive use-cases.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 2 Report

Comments and Suggestions for Authors

The author has addressed the questions in detail and made the appropriate revision in the article.

Comments on the Quality of English Language

This article is ready for publication with some language modifications.
