Peer-Review Record

A Semantically Aware Multi-View 3D Reconstruction Method for Urban Applications

Appl. Sci. 2024, 14(5), 2218; https://doi.org/10.3390/app14052218
by Rongke Wei 1,2,3, Haodong Pei 1,3, Dongjie Wu 1,2,3, Changwen Zeng 1,2,3, Xin Ai 1,2,3 and Huixian Duan 1,3,*
Reviewer 1: Anonymous
Reviewer 2:
Reviewer 3: Anonymous
Reviewer 4:
Submission received: 29 December 2023 / Revised: 24 February 2024 / Accepted: 4 March 2024 / Published: 6 March 2024

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

In this work, the authors address issues in the traditional multi-view 3D reconstruction process by proposing a semantic-based optimization method. To support their approach, they provide an experimental validation in complex urban scenarios, demonstrating the superiority of their semantic optimization. They adapt the open-source Cityscapes dataset by adjusting its categories, reducing the original 19 categories to 10, first to identify and separate dynamic targets in urban environments that significantly affect reconstruction accuracy, and second to distinguish architecturally significant buildings needing detailed reconstruction from other elements in the scene. This is certainly an interesting approach, since adapting existing knowledge to best suit the objectives of a specific scientific effort is a known optimization method. Furthermore, based on the new classification, several networks are tested and compared, and the results indicate the superiority of the Mask2Former network, which exhibits the highest accuracy.

The paper is well written and the scientific methods followed are sound. There are, however, issues that can be addressed to improve the quality of the final manuscript:

1. The coverage of relevant literature should be improved. The authors should cite relevant methods, papers, and optimization methods in more detail to present the current state of the art against which this work is compared.

2. The conclusion section is weak, since the authors only present their findings and do not compare them with similar works in the domain. For example, the sentence “By comparing with ground truth data from LiDAR scanning, it is evident that our semantically optimized sSGM reconstruction achieves an overall 36% increase in accuracy compared to the original SGM reconstruction results” neither names nor cites the other methods being compared.

3. I would suggest that the authors revisit the abstract and make it more targeted to the contributions of this work.

4. It would be helpful if the authors enhanced the presentation of their contribution (end of Section 1) with more information or a comparison with other methods.

 

 

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

The manuscript appears to propose a method to reconstruct a 3D scene from multi-view images using a modified SfM. However, the modification appears to be marginal, and the proposed method offers only a slightly modified mechanism. The authors need to clearly present a mechanism that is significant enough to contribute to 3D scene reconstruction.

More figures need to be added, in addition to Figure 1, to explain the proposed algorithm.

The deep learning architecture proposed in the manuscript needs to be provided to explain Equations (1)-(4).

Equations (5)-(11) need to be explained in detail, with additional figures that clearly illustrate the mechanism of the proposed method.

Equations (12)-(17) need to be explained with illustrations so that readers can clearly understand the proposed mechanism.

The experiment section needs to be improved with more cases. Figures 4 to 8 need more example cases. Tables 5-7 need more example cases.

In the conclusions, the authors state that the proposed method identifies and separates dynamic targets. However, the mechanism for this is not presented clearly in the method and experiment sections. The authors need to provide figures and experiments that clearly support the statements in the conclusion section.



Comments on the Quality of English Language

English needs to be improved.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

The paper is well written and is significant from a scientific point of view.

Some comments are provided below:

- The abstract must be clearly improved. The novelty of the manuscript should also be outlined. Finally, the most interesting conclusions must be included.

- It would be interesting to better discuss the potential fields of application of your outcomes.

- In the introduction, a discussion of machine learning algorithms for identifying buildings could help highlight the potential novelties and differences of your work. For example, consider:

Guo, Z.; Liu, H.; Pang, L.; Fang, L.; Dou, W. DBSCAN-based point cloud extraction for Tomographic synthetic aperture radar (TomoSAR) three-dimensional (3D) building reconstruction. Int. J. Remote Sens. 2021, 42, 2327–2349.

Mele, A.; Vitiello, A.; Bonano, M.; Miano, A.; Lanari, R.; Acampora, G.; Prota, A. On the joint exploitation of satellite DInSAR measurements and DBSCAN-based techniques for preliminary identification and ranking of critical constructions in a built environment. Remote Sens. 2022, 14, 1872.

- A graphical flowchart of the overall procedure could be inserted.

- More details about your applied case study are needed.

- Please give more details about the precision/reliability of your methodology.

- Check the English spelling.

In the conclusions, please better highlight the potential and limitations of your methodology.

Comments on the Quality of English Language

Minor editing of English language required

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 4 Report

Comments and Suggestions for Authors

 

General Comments

The manuscript presents a novel semantic-aware multi-view 3D reconstruction approach for urban scenes, addressing the complexity and challenges of urban environments. The research focuses on integrating semantic information into both sparse and dense reconstruction processes to enhance precision and provide high-level semantic information in the reconstructed point clouds. The approach involves scene object classification, semantic-based static and dynamic separation, and a semantic-based optimization algorithm. The study demonstrates the potential of this methodology for applications such as autonomous driving, digital twin technology, and urban planning.

Specific Comments

Line 4. The statement “visual 3D reconstruction is the most widely applied technique in the field” should have a reference that supports it.

Line 135-136 – The sentence makes no sense. Maybe you mean “The existing classification standards classify objects not included in the list into two categories: static and dynamic.”

Line 213 – Wrong formula number; it should be formula 13.

Line 230 – A short explanation of why this method is insufficient for determining whether the disparity is continuous could help.

Line 225-240 – Many values in the equations were fixed by you. Why did you choose those specific values?

Line 341 – How were these functions in OpenMVG rewritten? Is it worth describing it?

Line 355-360 – Maybe the word “Was” before reconstructed should be eliminated.

Line 434-436 – Maybe the word “Was” before reconstructed should be eliminated.

Line 449 – “We” instead of “I”

Line 515 – Maybe the other two methods should be named.

Line 591+592. Reference is not formatted correctly.

Line 593. Reference is not formatted correctly.

Line 603. Reference is not formatted correctly. Check the Citation section of the README.md file in the GitHub repository.

 

 

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 2 Report

Comments and Suggestions for Authors

The manuscript does not clearly present the main algorithm proposed by the authors. Figure 1 shows the flowchart of the approach, but it is not enough to describe the key ideas that show the proposed method is significant.

The newly added figures, Figures 2 and 3, are not enough to clearly describe the proposed approach.

Figures 8 and 9 are not enough to clearly describe the strengths of the proposed approach. More figures from experiments need to be added to show the advantages of the proposed method.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 3

Reviewer 2 Report

Comments and Suggestions for Authors

Figure 1 is the only figure that shows the mechanism of the proposed approach. This is not enough to show the details of the approach clearly. More figures need to be added to show the mechanism of the proposed method.

Figures 8-11 are not enough to show the significance of the proposed method. The authors need to present well-designed experiments with good illustrations or figures.

Comments on the Quality of English Language

English is good

Author Response

Please see the attachment.

Author Response File: Author Response.pdf
