Article
Peer-Review Record

Monocular Depth Estimation Using Res-UNet with an Attention Model

Appl. Sci. 2023, 13(10), 6319; https://doi.org/10.3390/app13106319
by Abdullah Jan and Suyoung Seo *
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3:
Submission received: 29 March 2023 / Revised: 17 May 2023 / Accepted: 20 May 2023 / Published: 22 May 2023

Round 1

Reviewer 1 Report

Review of Applied Sciences Manuscript ID: 2342826

Monocular Depth Estimation Using Res-UNet with Attention Model

 

This reviewer finds that the paper addresses the important problem of using deep learning techniques in computer vision to generate a depth map from a single image, also called monocular depth mapping. Depth mapping is an ill-posed problem since there are multiple possible solutions for the same image under different lighting conditions. To address this, the authors augment the Res-UNet model with a spatial attention model to produce the depth map. The authors demonstrate that: (1) the number of additional parameters due to the attention model ranges from none to a small number, (2) the generated depth maps are of high quality, (3) model training is fast, requiring few iterations, and (4) the proposed model performs well compared to existing state-of-the-art methods on the benchmark NYU-Depth v2 dataset. This reviewer finds this paper to be an original contribution to the depth mapping literature and recommends that it be accepted for publication after the authors address the suggestions for improvement below.

 

Strengths:

1.     The paper is well written, with an appropriate review of the literature for context.

2.     The proposed approach is well described and well-motivated.

3.     The proposed method is compared to multiple existing methods.

4.     Appropriate ablation studies for the proposed components are also provided.

 

Suggestions for further improvement:

1.     The authors should add an ablation study over the components of the loss function. It is unclear which terms are important and which ones are not.

2.     The authors take high-resolution imagery and downsample it to low resolution for training and testing. Have the authors instead tried creating smaller patches from the images and using those patches for training and testing? This reviewer suspects that downsampling causes a loss of vital information that may be very useful in applications such as self-driving.

3.     The authors mention in section 4.4.3 that they have not performed a full hyperparameter ablation study. This reviewer recommends that the authors resubmit the paper only after doing a full ablation study, as is usual practice. This would also help future readers use the proposed model as a benchmark for future studies.
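
To make the contrast in suggestion 2 concrete, a minimal sketch of patch-based input preparation versus downsampling might look as follows. It is illustrative only and is not taken from the manuscript: the function name `extract_patches`, the random 480 x 640 RGB frame, the 240 x 320 non-overlapping patch size, and the naive stride-2 subsampling are all assumptions chosen for the example.

```python
import numpy as np


def extract_patches(image, patch_h, patch_w, stride_h=None, stride_w=None):
    """Tile an H x W (x C) array into full-resolution patches.

    Hypothetical helper for illustration. Strides default to the patch
    size, which gives non-overlapping tiles; any border region that does
    not fit a whole patch is dropped.
    """
    stride_h = stride_h or patch_h
    stride_w = stride_w or patch_w
    h, w = image.shape[:2]
    patches = []
    for top in range(0, h - patch_h + 1, stride_h):
        for left in range(0, w - patch_w + 1, stride_w):
            patches.append(image[top:top + patch_h, left:left + patch_w])
    return np.stack(patches)


# Assumed input: a random stand-in for one 480 x 640 RGB frame.
frame = np.random.rand(480, 640, 3)

# Option A: naive downsampling by keeping every second pixel.
# Three quarters of the original pixels are discarded.
downsampled = frame[::2, ::2]

# Option B: four non-overlapping patches of the same size as the
# downsampled image. Every original pixel is kept, but each patch
# sees only a quarter of the scene.
patches = extract_patches(frame, 240, 320)
```

Both options feed the network inputs of the same size, so the trade-off is between preserving fine detail (patches) and preserving global scene context (downsampling), which is the point the reviewer raises.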

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Reviewer 2 Report

The authors have presented a novel deep-learning model that overcomes shortcomings of some previous models for estimating depth maps. The network could easily be trained for any depth-sensing or segmentation task, which would be of interest to readers. I can recommend publication of this manuscript.

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Reviewer 3 Report

The paper is devoted to the investigation of new ways to construct depth maps.

It was a great pleasure to review it.

However, I have some questions for the authors.

 

1. Why was attention used in the decoder only? Do you have a reason for that?

2. Attention and ResNet blocks were used together. Why could LSTM blocks not be used for this purpose instead of these blocks?

3. Do you think state-of-the-art algorithms such as vision transformers could be useful for depth map construction? Perhaps a paragraph should be added on where further research might be headed.

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Round 2

Reviewer 1 Report

This reviewer is satisfied with the responses from the authors and recommends acceptance of the paper.

Author Response

Please see the attachment.

Author Response File: Author Response.docx
