Peer-Review Record

Enhancing Integrated Sensing and Communication (ISAC) Performance for a Searching–Deciding Alternation Radar-Comm System with Multi-Dimension Point Cloud Data

Remote Sens. 2024, 16(17), 3242; https://doi.org/10.3390/rs16173242
by Leyan Chen 1,2, Kai Liu 1,2, Qiang Gao 1,2, Xiangfen Wang 3,* and Zhibo Zhang 1,2
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Reviewer 4: Anonymous
Submission received: 2 July 2024 / Revised: 23 August 2024 / Accepted: 30 August 2024 / Published: 1 September 2024

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

In order to reveal the contributions of the study, the following revisions need to be made:

1- The introduction is overly long. It is necessary to create a separate "Related Works" section to explicitly present competitive methods.

2- There is no apparent relationship between the main image and the scenes depicted in Figure 2. The figure is not sufficiently explanatory. The main image and the scenario images appear to represent different situations, making it difficult to establish a connection between them.

3- While the researchers have created and labeled the datasets themselves, quantitative information about these datasets is missing (e.g., the number of points, the number of vehicles in each scene).

4- The study compares results only using YOLO and PointNet architectures. To enhance the robustness of the findings, it is necessary to compare the proposed method with additional algorithms, such as PointPillars and VoxelNet. For example, the following studies can be examined:

"B. Xu et al., "RPFA-Net: a 4D RaDAR Pillar Feature Attention Network for 3D Object Detection," 2021 IEEE International Intelligent Transportation Systems Conference (ITSC), Indianapolis, IN, USA, 2021, pp. 3061-3066, doi: 10.1109/ITSC48978.2021.9564754."

"Nobis, F.; Shafiei, E.; Karle, P.; Betz, J.; Lienkamp, M. Radar Voxel Fusion for 3D Object Detection. Appl. Sci. 2021, 11, 5598. https://doi.org/10.3390/app11125598"

 

Suggestion:

5- Provide more information regarding the statistical significance of the experimental results (e.g., ROC curves, ANOVA analysis).
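For illustration of the kind of analysis this point asks for, the following is a minimal sketch of turning per-detection confidence scores and ground-truth labels into a ROC curve and an AUC value; the scores, labels, and function names are hypothetical placeholders, not material from the manuscript under review.

```python
# Minimal ROC/AUC sketch with hypothetical scores and labels; illustrative only,
# not the evaluation code of the manuscript under review.
import numpy as np

def roc_curve(scores, labels):
    """False-positive and true-positive rates swept over all score thresholds."""
    order = np.argsort(-scores)                 # sort detections by descending confidence
    labels = labels[order]
    tps = np.cumsum(labels)                     # cumulative true positives
    fps = np.cumsum(1 - labels)                 # cumulative false positives
    tpr = np.concatenate(([0.0], tps / max(labels.sum(), 1)))
    fpr = np.concatenate(([0.0], fps / max((1 - labels).sum(), 1)))
    return fpr, tpr

def auc(fpr, tpr):
    """Trapezoidal area under the ROC curve."""
    return float(np.sum((fpr[1:] - fpr[:-1]) * (tpr[1:] + tpr[:-1]) / 2.0))

scores = np.array([0.95, 0.90, 0.70, 0.60, 0.40, 0.20])   # hypothetical confidences
labels = np.array([1, 1, 0, 1, 0, 0])                      # hypothetical ground truth
fpr, tpr = roc_curve(scores, labels)
print("AUC =", auc(fpr, tpr))
```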

Author Response

Thank you very much for your helpful comments. We have made point-by-point modifications according to your suggestions and hope they meet with your approval. Please see the attachment for our response and revised manuscript. Once again, we extend our gratitude for your generous investment of time in enhancing our submission.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

This paper presents a novel approach to enhancing integrated sensing and communication (ISAC) systems using 4D radar point cloud data for vehicle detection in intelligent transportation systems. The proposed scheme includes a 6-channel self-attention neural network architecture that effectively classifies and segments vehicle targets across various scenes. Extensive experiments on real-world datasets demonstrate that this method significantly improves vehicle detection accuracy and communication performance compared to traditional algorithms like YOLO and PointNet.

 

Major Concern

1. please include specific details about the data collection process for the 4D radar point cloud datasets. Mention the conditions, equipment, and protocols used, such as sensor types, configurations, environmental conditions, and preprocessing steps.

2. Provide a more detailed explanation of the performance metrics used to evaluate the proposed scheme. Specifically, include a clearer discussion of the significance of mean average precision (mAP) and mean intersection over union (mIoU) in the context of your experiments (an illustrative sketch of the IoU computation is given after this list).

3. please provide a detailed strategy for mitigating the impact of environmental factors on target detection accuracy. While the paper acknowledges these issues, it would benefit from exploring and discussing more robust methods to handle adverse weather conditions and variable lighting.

4. Include a comprehensive discussion on the real-time performance of the proposed neural network architecture. Address how the system performs under different loads and in real-time applications.

5. Provide more detailed implementation information. Include the exact neural network parameters, training protocols, and computational resources required. This will help others replicate and build upon your work.

6. Conduct more rigorous robustness testing of the proposed scheme under different noise levels, occlusions, and varying traffic densities. Additionally, provide statistical analysis to show the confidence of detection, ensuring reliability in diverse real-world scenarios and strengthening the validity and applicability of your findings.
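Regarding point 2 above, the following is a small self-contained sketch of the intersection-over-union computation that the mIoU metric averages over classes; it is a generic textbook illustration with made-up label maps, not the authors' evaluation pipeline.

```python
# Generic IoU / mIoU sketch on made-up label maps; illustrative only.
import numpy as np

def iou_per_class(pred, target, num_classes):
    """Intersection over union for each class, given integer label maps."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        ious.append(inter / union if union > 0 else float("nan"))
    return ious

def mean_iou(pred, target, num_classes):
    """mIoU: average over classes present in prediction or ground truth."""
    ious = [v for v in iou_per_class(pred, target, num_classes) if not np.isnan(v)]
    return sum(ious) / len(ious)

# Hypothetical per-point labels: 0 = clutter/background, 1 = vehicle.
pred   = np.array([0, 1, 1, 0, 1, 0])
target = np.array([0, 1, 0, 0, 1, 1])
print("mIoU =", mean_iou(pred, target, num_classes=2))
```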

Author Response

Thank you very much for your helpful comments. We have made point-by-point modifications according to your suggestions and hope they meet with your approval. Please see the attachment for our response and revised manuscript. Once again, we extend our gratitude for your generous investment of time in enhancing our submission.

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

The article focuses on an intelligent transportation system based on integrated sensing and communication (ISAC) technology. This technology has become an effective and promising method for vehicle-road services and can enhance traffic safety and efficiency through real-time interaction between vehicles and roads. The article lacks a detailed description of the concept, where the exact utilization of the system would be described. In the article, the authors extensively describe vehicle detection from 4D radar point clouds and propose their own neural network, which they compare with the PointNet approach in the final part of the article. The contributions of the article can be summarized as the proposal to transform radar data into 4D point clouds and the design of a new 6-channel neural network for vehicle detection from radar data, which is subsequently tested on real-world data.

In the introduction of the article, I miss a solution concept that would briefly explain the entire process from the moment of data acquisition to the final results. Explain how the results will be further used, and also add information about whether the processing is real-time and whether it works on a public dataset or on data from your own measurements. Also state where the radar is located (directly on vehicles or at specific road infrastructure locations). Does your system only address vehicle detection, or does it also allow detecting motorcycles or possibly pedestrians? Please clarify for what purpose the system is to be used, provide some examples of use, and describe the problematic situations you are addressing.

 

The article is very interesting, but a large part of it is dedicated to the mathematical model, which is described in a rather complicated manner. Besides the above-mentioned comment, I have the following questions and comments regarding the article:

 

At the beginning, in the affiliation section under point 2, I would not mention the university again, as it is already listed under point 1. In point 2, I would leave only the Hangzhou Innovation Institute. In the author list, you already refer to affiliations 1 and 2 together anyway.

 

In lines 22-24: Is this the intended use for your system, as it is not mentioned anywhere in the article? The purpose of the system influences various constraints such as real-time application, accuracy, system cost, reliability, and so on.

 

Line 42 – ‘To tackle the above challenges,’ – can you please clarify these challenges? Do they relate to vehicle detection?

 

Line 46 – “quality constraints of datasets” – this depends on the lidar used.

 

Line 47 – “high costs” – a solution based on lidar should not be more expensive than a camera-based solution.

 

Lines 48-50 “lidar, millimeter-wave (mmWave) radar...” – please explain where the radar is located. Do you assume radar placement directly on the vehicle or on the road infrastructure (intersections, overpasses, roadside)? Alternatively, do you consider specific radar placement that eliminates certain disruptive influences?

 

Line 52 – “remarkable precision” – please specify the accuracy of the radar.

 

Line 57 – “simultaneously identifying multiple targets” – lidar can do that as well.

 

Line 66 – are you detecting vehicles through the spectrogram from FFT?

 

Line 67 – “traditional threshold methods” – please provide at least a reference to the literature for these methods.

 

Line 76 – “virtual point cloud” – please explain this term.

 

Line 77 – “human pose estimation” – is the subject of detection people or vehicles?

 

Lines 80-81 – “The experiment demonstrates that radar point cloud data outperforms RV spectrograms in performance and efficiency metrics.” – This needs to be explained better.

 

Lines 137-138 – “Based on real-world conditions, this study gathers echo data from mmWave radar and transforms them into the 4D point cloud.” – Do you have data from your own experiment, or did you use an existing dataset? This needs to be clarified at the beginning, and if it is a dataset, provide a reference from where it was downloaded.

 

Line 141 – “Point cloud data is not contingent on camera image quality” – you need to specify that you are writing about radar point clouds. For example, this does not apply to point clouds from a stereo camera.

 

Figure 1:

- The texts in the image are too small,

- Move the image to the text where it is described (in Chapter 2). Here it would be good to at least briefly explain the individual blocks in the image, so the reader gets an understanding of the system.

- The image shows a simulation example; did you use a simulation or real data?

 

Line 144 – “classify and segment targets across various scenes” – please explain what targets were detected and under what conditions.

Line 148 – “competitive communication performance” – please explain what this means.

Line 171 – “fd is the frequency” – frequency of what?

Line 184 – “contained in (6) is in variable l” – variable l is not in equation 6. Don't you mean equation 5?

 

Equation 8 – add an explanation for the variable gamma.

 

Line 192 – the equation for the guiding vector that you have in the text should be given as a separate equation.

Line 194 – are there any conditions for the position and orientation of the antennas when determining the angle? Which angle are you determining – horizontal or vertical?

Line 196 – it would be good to explain the terms range, velocity, and angle with a practical example, or add a diagram where you illustrate the vehicle, radar, and individual parameters.

Line 200 – Do clutter points really have low SNR?

 

Equations in lines 206, 207, 208 – present them as separate equations.

 

Line 214 – when you mention targets, I assume you mean vehicles. Are there any restrictions on the detected vehicles, such as shape, size, etc.? Are pedestrians also subject to detection or only vehicles?

 

Line 223 – after the term “channel capacity,” I would add the symbolic notation as you have it in equation 14.

 

Line 231 – how long does this method of vehicle detection take? Is it possible to deploy in a real-time application?

- What are the limitations regarding vehicles – shape, speed of movement, number of vehicles. Is it possible to detect only in sections with reduced speed? Please comment on this.

- Do you have your own application written, or do you use commercial software? If you have your own application, in which programming language was it created?

 

Line 238 – in equation 19.

 

Equation 20 – explain the symbol “+”.

 

Figure 2.

- The texts in the image are too small,

- Move the image to the text where it is described (beginning of subsection 3.2),

- Why is the legend inserted into the large image when the lines from the legend are used only in the details (in the scenes)?

- The description of the image is not precise. Try to expand it so that it is clear what the image represents. For example, vehicle detection in urban scenes.

- Please explain what “vehicle fleet” means and what purpose this detection serves – does it refer to the detection of vehicle columns? How does the radar handle cars driving behind each other when they overlap? Do you also use vehicle tracking with a Kalman filter for detection, in the sense that you predict the position of detected vehicles from their speed and position and then look for the current position based on the new detection?

 

Line 245 – from equation 22 .....

Line 251 – please add the definition for “4D radar point cloud” and what parameters are defined for individual points.
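As an example of the kind of definition requested in the comment on line 251 above, a 4D radar point is often stored as a few per-point attributes; the field names below are assumptions made purely for illustration and should be replaced by whatever attributes the authors actually use.

```python
# Hypothetical per-point structure for a 4D radar point cloud; the actual
# attributes (e.g. range/velocity/angle/SNR versus x/y/z plus Doppler) must be
# defined in the manuscript - these names are placeholders.
from dataclasses import dataclass

@dataclass
class RadarPoint:
    range_m: float    # range to the reflector [m]
    velocity: float   # radial (Doppler) velocity [m/s]
    angle_deg: float  # azimuth angle [deg]
    snr_db: float     # signal-to-noise ratio [dB]

cloud = [RadarPoint(23.4, -3.1, 12.0, 18.5),
         RadarPoint(24.1, -3.0, 12.4, 16.2)]
print(len(cloud), "points; first point range:", cloud[0].range_m, "m")
```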

 

Line 264 – “with the distance between vehicle targets and the mmWave radar” – you don't have the radar shown in the scenes.

- Please explain better what formations you mean. Scene I looks like a representation of a vehicle column with various types of vehicles. What is the result of the detection, and how is it supposed to be used further?

- It would be beneficial to specify this for a specific purpose, what is detected (vehicles, number of vehicles, type of vehicles, speed), and what will be done with it. Could you provide a small model situation? I imagine something like, at time t, I have the detection result for radar XY, and based on that, I know that there is a vehicle column on section Z, with a significant presence of trucks, which means that the delay will be greater.

 

Separate question:

- Nowhere in the text did I find information about difficult conditions that certainly occur during vehicle detection. Here are some of them:

- Please explain how you handle the situation when the speed of the detected vehicle changes. How do you identify that it is the same vehicle? Do you use the Kalman filter and predict the new position of the vehicle (a generic sketch of such a predict/update step is given after this list)?

- How do you handle the situation when multiple vehicles are moving side by side at the same speed?

- How is a vehicle column handled when the vehicles are covered behind each other with respect to the radar?

- It would be appropriate to add a diagram or at least explain what the radar density must be to achieve the desired accuracy and to address, for example, the above-mentioned problematic situations. Or are you considering a combination of radars located in the road infrastructure with radars placed directly in cars?
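To make the Kalman-filter question above concrete, here is a generic constant-velocity predict/update sketch in one dimension (range and radial velocity); the frame interval, noise values, and measurements are arbitrary assumptions, and this is not a description of the authors' method.

```python
# Generic 1-D constant-velocity Kalman filter sketch; illustrative only.
import numpy as np

dt = 0.1                                    # assumed radar frame interval [s]
F = np.array([[1.0, dt], [0.0, 1.0]])       # state transition: range' = range + v*dt
H = np.array([[1.0, 0.0]])                  # only range is measured
Q = np.eye(2) * 0.01                        # process noise (arbitrary tuning)
R = np.array([[0.5]])                       # measurement noise (arbitrary tuning)

x = np.array([[10.0], [15.0]])              # initial state: 10 m, 15 m/s
P = np.eye(2)                               # initial state covariance

def predict(x, P):
    return F @ x, F @ P @ F.T + Q

def update(x, P, z):
    y = z - H @ x                           # innovation
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)          # Kalman gain
    return x + K @ y, (np.eye(2) - K @ H) @ P

for z in [11.4, 13.1, 14.3]:                # hypothetical range measurements [m]
    x, P = predict(x, P)
    x, P = update(x, P, np.array([[z]]))
    print("estimated range %.2f m, radial velocity %.2f m/s" % (x[0, 0], x[1, 0]))
```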

 

Figure 3.

- The texts in the image are too small,

- If possible, in part of the image 3a) I would move the time under the label "camera image" and make it a contrasting color to the background - e.g., yellow,

- For the image 3b), you could add axes,

- in part of the image 3c) add car numbers – car1, car2, car3,...

- Can your algorithm identify a specific vehicle in the spectrogram even if it changes speed?

 

Lines 287-288 – In image correlation, how do you mark the vehicle that is to be searched for by correlation? Do you do it manually, or do you have a prepared set of vehicle images – in which case the correlations would be low?

Line 296 – “the z-coordinate representing angle (A)” – this does not match the image. According to image 4b, ranges (R) should be on the vertical Z-axis and angles (A) on the Y-axis. Do you determine only the horizontal angle or also the elevation angle?

 

Figure 4.

- Identified objects in the 4D point cloud are barely visible, try adding labels A, B, C, D... or C1, C2, C3 as car1, car2, ....

- I assume that the images are to scale. Then, you need to display the axes in the image normally and add tick marks and axis labels.

- The color of the points represents SNR. Then you need to add some legend to the SNR values, at least a hypsometric scale.

- Why isn't the position of the radar marked in the image, or is it meant as the origin of the coordinate system?

 

Line 306 – “Figure 5(b) displays point labels in the 4D radar point clouds” – sorry, but isn't it the other way around, that in figure 5b you have a 3D point cloud displayed, where the vehicle is represented as a target object in red? In figure 5a, you have 4D display, where the fourth dimension is understood as the SNR data expressed by color. Maybe I misunderstood it, but this way it makes sense to me. Please clarify this.

 

Figure 5.

- The detected objects in the image are very poorly visible; try adding details to the left and right of the main image.

- Adjust the axes according to the comments on the previous image and unify the labels with the text (comment – line 306).

- Please check where the 3D and 4D display are. I understand the fourth dimension as SNR.

- Please, if possible, align the images so that the origin is set the same. Now, when looking at the detected objects below, there is a visible shift of objects in the amplitude direction. I assume you want to capture the state at the same time. Then it is just a matter of setting the range (limits) for the A axis.

 

Line 313 – “multiple reflections between vehicles” – how do you handle this problem?

Line 330 – in the 6-channel vector F’, do the first two parameters x, y represent the position of the object?

Line 336 – why are the parameters R, V, A in a different order in the 4-channel vector F compared to the 6-channel vector F’?

 

Figure 6.

- The images in the 1st block are unreadable + add a note under the image that these are images previously presented in Figures 4 and 5.

- The block at the bottom left has no title?

 

Algorithm 1:

Point 8 “repeat” – What does this repeat refer to? Where is the beginning of this loop, because the for loop goes from line 5 to line 10. Then I do not understand where the algorithm is supposed to return?

 

Line 388 – Please add basic information about the experiment:

- How did you obtain the input data? – Your experimental measurement or was the input a dataset (add a reference to the dataset),

- Add basic parameters of the radar used,

- How were the same conditions ensured when comparing the methods – Yolo, PointNet,

- In which programming language was the calculation performed and what were the parameters of the computer used for processing,

- Is it possible to perform the calculation in real-time?

 

Lines 396-397 – Using up to 80% of the data for training and only 20% for testing. Doesn't this risk overfitting the model to the input data?

When comparing with other approaches (PointNet, Yolo), did you recalculate the same dataset with the compared methods, or how did you ensure the same conditions?
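One generic way to address the 80/20 concern and keep the comparison fair is a fixed k-fold split that every compared method reuses; the sketch below uses an arbitrary sample count and illustrates only the idea, not the authors' pipeline.

```python
# Generic k-fold split sketch: the same folds would be reused for every compared
# method (proposed network, PointNet, YOLO) so all models face identical
# train/test conditions. Illustrative only; the sample count is arbitrary.
import numpy as np

def k_fold_indices(n_samples, k=5, seed=0):
    idx = np.random.default_rng(seed).permutation(n_samples)
    return np.array_split(idx, k)

folds = k_fold_indices(n_samples=1000, k=5)
for i, test_idx in enumerate(folds):
    train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
    # Train and evaluate each compared model on exactly these indices;
    # averaging over folds reduces the chance that one fixed 80/20 split
    # happens to favour a particular method.
    print("fold", i, "train size", len(train_idx), "test size", len(test_idx))
```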

 

Line 409 – Please explain whether this is data from your measurement or if you used a dataset. If a dataset was used, add a reference to this dataset. Additionally, explain how the same conditions were ensured for the compared methods Yolo and PointNet. Did you recalculate the same input data using these methods?

 

Figure 7.

- The texts in the image are very small and unreadable. The graphs need to be enlarged and use the full width of the page.

- One legend for each graph. Could you create a common legend for both graphs where the line type distinguishes between loss value and accuracy?

- If the graph title is in the figure description, then above the graph, I would only leave the label a) or b).

- The tick labels need to be enlarged,

- Add a description to the Y-axis. Is it necessary to have 4 Y-axes? Try to adjust the graphs so that you compare accuracy in both phases of the calculation (testing and training phase) in one graph and compare the loss function in the other graph. Then you would need only one Y-axis for each graph. The Y-axis needs to be labeled.

 

Table 1.

- Please explain the missing metrics in the table. How do you compare these selected methods without them?

- Why do you compare only your approach with PointNet in the graphs and not with Yolo as well?

- Doesn't your approach have a specific name? Only "Ours"?

 

Line 437 – “Scene I” – Please explain how you obtained the input data for scenes I, II, and III. Was it your manual selection and division of data from the dataset/experiment into these scenes, or how was it done?

 

Figure 8.

- Leave only the labels a), b), and c) in the image titles. Their full titles will remain only in the figure description.

- Consider the adjustment proposed in Figure 7 – combine the plotting of accuracy for the train and test phases, and the same for the loss function.

- Add a description to the Y-axis.

- Add a legend to image 8c. Aren't there too many detected vehicles in the image?

 

Line 446 – “the clutter points” – could you at least briefly explain the source of the clutter points? The arrangement of these clutter points is interesting. It would be good if you put image 8c in a separate row and next to it an RGB image for illustration. Then you could enlarge images 8a and 8b, and from image 8d (RGB image) it would be clear what is causing the clutter.

 

Figure 9.

Comments as in Figure 8:

- Leave only the labels a), b), and c) in the image titles. The full titles will remain only in the figure description.

- Consider the adjustment proposed in Figure 7 – combine the plotting of accuracy for the train and test phases, and the same for the loss function.

- Add a description to the Y-axis.

- Add a legend to image 9c. Can you explain why the clutter points are only in one line behind the correctly detected vehicles?

- Consider putting image 9c in a separate row and next to it an RGB image for illustration. Then you could enlarge images 9a and 9b, and from image 9d (RGB image) it would be clear what is causing the clutter.

 

Figure 10.

Comments as in Figure 8:

- Leave only the labels a), b), and c) in the image titles. The full titles will remain only in the figure description.

- Consider the adjustment proposed in Figure 7 – combine the plotting of accuracy for the train and test phases, and the same for the loss function.

- Add a description to the Y-axis.

- Add a legend to image 10c. Can you explain why the clutter points are only in one line in the background behind the correctly detected vehicles?

- Consider putting image 10c in a separate row and next to it an RGB image for illustration. Then you could enlarge images 10a and 10b. From image 10d (RGB image), it would be clear what is causing the clutter.

 

Lines 472-474 – As already mentioned, it is necessary to explain how you selected the input data for individual scenes. Did you do it manually, or did you have an algorithm for that?

- It would also be good to comment on the clutter points in the individual scenes and overall address the issues that can arise in vehicle detection, such as clustering of vehicles and their mutual overlap, change in vehicle speed, identification of vehicles in a column, and so on.

- Can you comment on the accuracy of determining individual parameters such as amplitude, range, velocity, and angle? How will the results be used further? This question still has not been answered. Write the concept of your solution from the input data to the results and the possibilities of their further deployment – traffic monitoring, traffic management, and so on. How is the position of radars considered – placement outside the detected vehicles or directly on the vehicles?

Line 482 – Why was Yolo dropped from the comparison of methods? There are only comparisons between your approach and PointNet everywhere.

 

Figure 11.

- The texts in the image are too small,

- You have not explained the metric mIoU,

- Stretch the image to the width of the page.

 

Line 493 – How did you determine the reference length values?

Lines 497-498 – Please explain the parameters (power level and water level value) and how they are used in the optimization process.
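For reference, the "water level" mentioned in these lines usually refers to the classical water-filling power allocation over parallel sub-channels; the sketch below is the textbook version with made-up channel gains and a made-up power budget, not the authors' exact optimization, and the function name is chosen only for this illustration.

```python
# Textbook water-filling sketch: allocate a total power budget over parallel
# sub-channels so that p_i = max(mu - N_i/g_i, 0), where mu is the water level.
# Illustrative only; gains and power budget are made up.
import numpy as np

def water_filling(inv_gains, total_power, iters=100):
    """inv_gains[i] = noise-to-gain ratio N_i/g_i; bisection on the water level mu."""
    lo, hi = 0.0, float(np.max(inv_gains)) + total_power
    for _ in range(iters):
        mu = (lo + hi) / 2.0
        powers = np.maximum(mu - inv_gains, 0.0)
        if powers.sum() > total_power:
            hi = mu                      # water level too high, exceeds the budget
        else:
            lo = mu
    return np.maximum(mu - inv_gains, 0.0), mu

inv_gains = np.array([0.1, 0.5, 1.0, 2.0])            # hypothetical N_i/g_i values
powers, level = water_filling(inv_gains, total_power=2.0)
capacity = np.sum(np.log2(1.0 + powers / inv_gains))  # sum rate over sub-channels
print("water level:", round(level, 3),
      "powers:", powers,
      "capacity [bit/s/Hz]:", round(float(capacity), 3))
```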

 

Figure 12

- The texts in the image are extremely small and unreadable, especially the axis labels and legend,

- Is the second Y-axis in image 12a really necessary? If so, add a description to the axis.

- What does the blue line represent – the description in the legend as a fraction is not suitable. I would rather use text or a combination of text and the fraction.

- In image 12b), for quick orientation, connect the lower left edge of the detail with the main graph using a dashed line.

- Please explain what you mean by “The total channel capacity versus the total transmit power.”

 

Figure 13.

- Sorry, but since all the lines in the graph overlap, the informational value of the figure is zero.

- A compromise needs to be found where the differences will be visible. For example, plot it by scenes or limit it to a specific section that you enlarge considerably to show the differences. As it is now, I don't see any difference between the given lines. Alternatively, try adding extreme close-up details and placing them on the sides of the graph. See the last comment as a possible solution.

- Also, increase the font size in the legend and for the tick labels.

 

- Why are all the symbols, except for the black squares, only present from a detection probability greater than 0.84? If it is supposed to be this way, then compare only this final section and add enlargements to express the differences.

Author Response

Thank you very much for your helpful comments. We have made point-by-point modifications according to your suggestions and hope they meet with your approval. Please see the attachment for our response and revised manuscript. Once again, we extend our gratitude for your generous investment of time in enhancing our submission.

Author Response File: Author Response.pdf

Reviewer 4 Report

Comments and Suggestions for Authors

1. The paper lacks a clear definition of its task.

2. Many variables in the formulas are unexplained, and it is unclear where the formulas are taken from. Are they derived by the authors, taken from references, etc.?

3. Please explain at the beginning what data are used. In the algorithm there is a camera, where YOLO looks fine, but then the radar-obtained data has no place; the identification of objects by the radar is left unexplained.

4. Figures are not referenced in the text, so their importance is doubtful, for example Fig. 5.

5. Experiment – absolutely no equipment, no methodology, no data processing, and no accuracy analysis. The experiment and the methodology do not fit together.

6. I stop commenting here; the authors should present a coherent and clear research description. This version cannot be published.

Comments on the Quality of English Language

English looks acceptable.

Author Response

Thank you very much for your helpful comments. We have made point-by-point modifications according to your suggestions and hope they meet with your approval. Please see the attachment for our response and revised manuscript. Once again, we extend our gratitude for your generous investment of time in enhancing our submission.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

The authors have positively implemented all of the major corrections suggested in the review report.

Author Response

Comment: [The authors have positively implemented all of the major corrections suggested in the review report.]

Response: [We would like to express our sincere gratitude for your time and efforts in reviewing our manuscript. Your insightful comments and suggestions were invaluable in helping us improve the quality and clarity of our work. We greatly appreciate your feedback and are pleased that the revisions have addressed your concerns.]

Reviewer 2 Report

Comments and Suggestions for Authors

No more comments.

Author Response

Comment: [No more comments.]

Response: We would like to express our sincere gratitude for your time and efforts in reviewing our manuscript. Your insightful comments and suggestions were invaluable in helping us improve the quality and clarity of our work. We greatly appreciate your feedback and are pleased that the revisions have addressed your concerns.

Reviewer 4 Report

Comments and Suggestions for Authors

1. There are unclear issues in Figures 5 and 6. The figures remain uninformative and beg for improvement.

2. Please provide a short description of the experimental research methodology – equipment, how you measure, etc. At present, your statements are spread throughout the text and hard to comprehend.

Comments on the Quality of English Language

English looks acceptable.

Author Response

Thank you very much for your helpful comments. We have made point-by-point modifications according to your suggestions and hope they meet with your approval. Please see the attachment for our response and revised manuscript. Once again, we extend our gratitude for your generous investment of time in enhancing our submission.

Author Response File: Author Response.pdf
