Peer-Review Record

A Two-Stage Network Based on Transformer and Physical Model for Single Underwater Image Enhancement

J. Mar. Sci. Eng. 2023, 11(4), 787; https://doi.org/10.3390/jmse11040787

by Yuhao Zhang, Dujing Chen, Yanyan Zhang^*, Meiling Shen and Weiyu Zhao

Reviewer 1:

Eduardo Quevedo-Gutiérrez

Reviewer 2:

Mourad Lahdir

Submission received: 7 March 2023 / Revised: 30 March 2023 / Accepted: 3 April 2023 / Published: 5 April 2023

(This article belongs to the Special Issue Earth System Modeling, Data Assimilation, Artificial Intelligence, Deep Learning and Ocean Information Engineering II)

Round 1

Reviewer 1 Report

This paper proposes a two-stage network, based on deep learning and an underwater physical imaging model, for image enhancement in underwater environments.

It is a complete and well-written paper that considers three categories of underwater image enhancement. The paper includes a thorough review of the state of the art in this respect, but several additional references should be considered:

* The SSIM metric should be cited using the original paper: Zhou Wang, A. C. Bovik, H. R. Sheikh and E. P. Simoncelli, "Image quality assessment: from error visibility to structural similarity," in IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600-612, April 2004, doi: 10.1109/TIP.2003.819861.

* Underwater reconstruction based on Super-Resolution, for instance:

- Chen Yuzhang et al., Super-resolution reconstruction for underwater imaging, Opt. Appl. (2011)

- Quevedo et al., Underwater video enhancement using multi-camera super-resolution, Optics Communications, Volume 404, 2017.

Moreover, the quantitative comparison is based mainly on PSNR and SSIM, but computational time must also be considered where possible.
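As an illustration of the comparison requested here, the following is a minimal sketch of reporting PSNR alongside average runtime. It is not taken from the manuscript: the function names `psnr` and `mean_runtime`, the flat pixel-sequence representation, and the 8-bit peak value of 255 are all assumptions made for the example.

```python
import math
import time


def psnr(reference, enhanced, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences.

    Assumes flat sequences of pixel intensities and an 8-bit peak value
    by default (both are illustrative assumptions).
    """
    if len(reference) != len(enhanced) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between corresponding pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, enhanced)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def mean_runtime(enhance, image, repeats=5):
    """Average wall-clock seconds per call of `enhance` on `image`."""
    start = time.perf_counter()
    for _ in range(repeats):
        enhance(image)
    return (time.perf_counter() - start) / repeats
```

Reporting both values per method, measured on the same hardware and image size, would make the trade-off between quality and cost visible in a single table.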

Finally, the standard of English is not yet satisfactory for publication. Authors whose primary language is not English are advised to seek help in preparing the paper.

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Reviewer 2 Report

In this paper, the authors propose to enhance underwater images by combining deep learning and a physical imaging model in a two-stage network called WaterFormer. The first stage, the Soft Reconstruction Network (SRN), reconstructs underwater images using the Jaffe-McGlamery model, and the second stage, the Hard Enhancement Network (HEN), further enhances the images. The authors use the Transformer framework to capture long-range dependencies between pixels and propose the Local Intent Multilayer Perceptron (LIMP) and the Channel Self-Attention Module (CSA) to process local and channel information. The joint parameter estimation (JPE) method is introduced to avoid additional errors when estimating multiple parameters. The proposed method uses a combination of L2 loss and SSIM loss to better capture structure and texture details.
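To make the loss combination mentioned in this summary concrete, here is a minimal sketch of an L2 term added to an SSIM-based term. It is not the authors' implementation: the weight `ssim_weight`, the `1 - SSIM` form of the structural term, and the single global window (rather than the local sliding windows of the original SSIM paper) are assumptions chosen to keep the example short. Pixel values are assumed to lie in [0, 1].

```python
def global_ssim(x, y, data_range=1.0):
    """SSIM computed over one global window (a simplification of the
    usual local-window SSIM) for two equally sized pixel sequences."""
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    # Population variances and covariance over the whole sequence.
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    # Stabilising constants from the SSIM definition (K1 = 0.01, K2 = 0.03).
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def combined_loss(pred, target, ssim_weight=0.5):
    """L2 (mean squared error) plus a weighted (1 - SSIM) term.

    `ssim_weight` is a hypothetical hyperparameter, not a value from the paper.
    """
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    return mse + ssim_weight * (1.0 - global_ssim(pred, target))
```

The L2 term penalises per-pixel intensity error, while the SSIM term rewards agreement in local mean, contrast and structure, which is why the two are often paired when texture fidelity matters.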

The paper is well structured and easy to follow. The authors clearly motivate their work and explain the limitations of existing methods. Overall, this paper is a valuable contribution to the field of underwater image enhancement and proposes a new approach that integrates deep learning and the physical imaging model to address the limitations of existing methods.

- The proposed method is novel and addresses the limitations of existing methods by integrating deep learning and the physical imaging model.

- The use of the Transformer architecture and new modules such as LIMP and CSA is innovative for the field.

- The joint parameter estimation (JPE) method is also a valuable contribution, as it addresses the limitations of estimating multiple parameters separately in the Jaffe-McGlamery model.

- The abstract could be improved by including more information about the techniques used and their contributions to the overall method. While the abstract mentions the Transformer structure, the Local Intent Multilayer Perceptron (LIMP), and the Channel Self-Attention Module (CSA), it does not explain how these techniques are integrated into the network or why they are effective. Adding these details would make the abstract clearer and help readers understand the proposed method. The abstract would also benefit from more specific information about the experimental results.

- The conclusion highlights the experimental results, which demonstrate that the proposed method effectively restores color and texture details in various underwater scenes. It is always useful to discuss future perspectives for the work, both to suggest ideas for later studies and to demonstrate the importance of the research performed. In the case of WaterFormer, for example, what perspectives could be considered?

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Round 2

Reviewer 1 Report

Once the required aspects have been considered, the paper can be accepted.
