Improving Monocular Depth Estimation with Learned Perceptual Image Patch Similarity-Based Image Reconstruction and Left–Right Difference Image Constraints
Round 1
Reviewer 1 Report
This paper presents a new approach to self-supervised monocular depth estimation in which image reconstruction is learned by training the model on stereo image data. The approach makes the reconstructed image converge gradually toward the target image during training, which is a great advantage. However, stereo images, lighting conditions, and camera calibration errors vary. Because far pixel values tend to approach zero in the difference image derived from the left and right source images of a stereo pair, the left-right difference image loss drives the far pixel values of the reconstructed difference image gradually toward zero. The use of this loss has therefore proven effective in reducing distortions in distant regions while improving overall performance. In this sense, the mathematical background of this study is sufficient, and the results are satisfactory. However, Equations 11-17 need further explanation, and the aim of the study should be clearly emphasized in the abstract.
Author Response
Please see the attachment.
Author Response File: Author Response.docx
Reviewer 2 Report
This paper proposes an approach for self-supervised monocular depth estimation with LPIPS-based image reconstruction and left-right difference image constraints. Although it is well described and technically feasible, I still have some concerns, mainly about the experiments.
1. How were the models of the comparative methods trained (datasets, hyper-parameters, etc.)? Were these settings consistent with those of the proposed method, and how is fairness in the comparisons ensured?
2. The training evolution process should be provided, and more details should be analyzed.
3. An ablation analysis should be provided and discussed, for example for the parameters in Eq. (21).
4. It seems from Figure 2 that Monodepth2 performs better with clearer edges. More examples and explanations need to be added.
Author Response
Please see the attachment.
Author Response File: Author Response.docx
Round 2
Reviewer 2 Report
Most of the problems have been resolved, except that the training evolution process should be provided with figures, as an experiment demonstrating that the error gradually decreases.
Author Response
As you suggested, we have added Table 4 to present the incremental improvement in performance over the course of the learning process. An explanatory statement has been added above the table to provide context. We greatly appreciate your insightful comment, as it has contributed significantly to a clearer presentation of our prospective research trajectory.