WEDM: Wavelet-Enhanced Diffusion with Multi-Stage Frequency Learning for Underwater Image Enhancement

Chen, Junhao; Ye, Sichao; Ouyang, Xiong; Zhuang, Jiayan

doi:10.3390/jimaging11040114

Open AccessArticle

WEDM: Wavelet-Enhanced Diffusion with Multi-Stage Frequency Learning for Underwater Image Enhancement

¹

Faculty of Mechanical Engineering & Mechanics, Ningbo University, Ningbo 315211, China

²

Ningbo Institute of Materials Technology & Engineering, Chinese Academy of Sciences, Ningbo 315201, China

³

Faculty of Electrical Engineering and Computer Science, Ningbo University, Ningbo 315211, China

^*

Author to whom correspondence should be addressed.

J. Imaging 2025, 11(4), 114; https://doi.org/10.3390/jimaging11040114

Submission received: 10 March 2025 / Revised: 28 March 2025 / Accepted: 1 April 2025 / Published: 9 April 2025

(This article belongs to the Special Issue Underwater Imaging (2nd Edition))

Download

Browse Figures

Versions Notes

Abstract

:

Underwater image enhancement (UIE) is inherently challenging due to complex degradation effects such as light absorption and scattering, which result in color distortion and a loss of fine details. Most existing methods focus on spatial-domain processing, often neglecting the frequency-domain characteristics that are crucial for effectively restoring textures and edges. In this paper, we propose a novel UIE framework, the Wavelet-based Enhancement Diffusion Model (WEDM), which integrates frequency-domain decomposition with diffusion models. The WEDM consists of two main modules: the Wavelet Color Compensation Module (WCCM) for color correction in the LAB space using discrete wavelet transform, and the Wavelet Diffusion Module (WDM), which replaces traditional convolutions with wavelet-based operations to preserve multi-scale frequency features. By combining residual denoising diffusion with frequency-specific processing, the WEDM effectively reduces noise amplification and high-frequency blurring. Ablation studies further demonstrate the essential roles of the WCCM and WDM in improving color fidelity and texture details. Our framework offers a robust solution for underwater visual tasks, with promising applications in marine exploration and ecological monitoring.

Keywords:

underwater image enhancement; wavelet transform; frequency-domain processing; diffusion models

1. Introduction

As humanity advances in the exploration and utilization of marine resources, underwater imaging technology has become increasingly critical in domains such as marine biology monitoring and seabed resource exploration [1,2,3]. However, due to the absorption and scattering properties of water, underwater images often suffer from severe degradation, including color distortion, loss of structural detail, and noise interference [4]. These degradations not only impair visual perception but also significantly hinder downstream analysis tasks, such as detection, segmentation, and tracking [5]. Consequently, underwater image enhancement (UIE) aims to restore visual fidelity by compensating for scattering effects and correcting color shifts, thereby improving the utility of underwater images in diverse computer vision applications [6].

To enhance the quality of underwater images, researchers have developed both traditional and deep learning-based approaches. Traditional UIE methods typically rely on prior knowledge, heuristic assumptions, or simplified physical models, such as histogram equalization [7] and white balance correction [8]. While these techniques can partially improve image contrast and color balance, their effectiveness is inherently constrained by their reliance on overly simplistic assumptions that fail to capture the complex optical properties of underwater environments. Consequently, when applied to real-world underwater scenes, traditional methods often introduce visual artifacts and color inaccuracies. Moreover, they generally disregard frequency-domain characteristics, resulting in suboptimal preservation of fine details and structural information [9].

In recent years, deep learning-based UIE methods have garnered significant attention. These methods leverage convolutional neural networks (CNNs) or generative adversarial networks (GANs) to learn the mapping between degraded and clear images from extensive datasets, significantly enhancing image quality. However, most CNN-based methods adopt direct regression strategies. Although these streamline the processing pipeline and enhance clarity and contrast, they generate deterministic images, lacking flexibility and diversity [10]. Consequently, they often exhibit inconsistent performance in complex underwater scenarios and struggle to adapt to various water bodies, lighting conditions, and degradation factors. In contrast, GAN-based methods employ adversarial training to address stability issues and generate more diverse and realistic images [11]. However, GANs often face challenges such as mode collapse and gradient vanishing during training, limiting their widespread application in UIE tasks.

With increasing interest in generative models, diffusion-based models [12,13] have emerged as a prominent research focus, demonstrating exceptional performance in the generation of high-quality and diverse images. These models have gradually become the state-of-the-art (SOTA) benchmark in image generation. Through a gradual denoising process, diffusion models effectively capture the uncertainty and diversity inherent in UIE image generation, overcoming the limitations of GANs in terms of training stability and mode diversity. Compared to GANs, diffusion models are more robust, yielding finer and more realistic images while avoiding the risk of mode collapse. Nevertheless, most existing diffusion models focus on global feature modeling in the spatial domain, neglecting the processing of frequency-domain information and fine-grained details. In high-noise underwater environments, traditional diffusion models struggle to restore high-frequency details and may amplify noise during denoising, ultimately diminishing the quality of generated images [14].

To address these challenges and enhance underwater image quality, we propose a novel UIE framework based on the Conditional Residual Denoising Diffusion Model (RDDM), named the WEDM. This framework combines frequency domain information with the attributes of diffusion models to generate high-quality and diverse samples through a conditional guidance mechanism. Specifically, the WEDM framework enhances images in two stages. In the first stage, we introduce the Wavelet Color Compensation Module (WCCM), which uses discrete wavelet transform (DWT) to decompose the image into low- and high-frequency components for enhancement. This module compensates for color distortion and detail loss in underwater images, with the enhanced image serving as a conditional input for the subsequent generation process. In the second stage, we present the Wavelet Diffusion Module (WDM). The WDM replaces traditional convolution operations with wavelet convolution (WTConv) and integrates this module deeply into the U-Net structure of the diffusion model. This approach strengthens the ability of the model to restore details and textures of the image. The WDM uses WTConv to extract multiscale features in the frequency domain, effectively preventing the oversmoothing of high-frequency details during denoising, a common issue in traditional diffusion models. Our method combines the powerful feature representation of diffusion models with the precision of frequency-domain processing, improving noise robustness and significantly enhancing image enhancement results and generalization ability. In addition, Figure 1 demonstrates the enhancement results of the degraded images from different datasets by the WEDM.

In summary, the main contributions of our method are as follows:

We propose a novel UIE framework, the WEDM, based on WTConv and diffusion models. This framework addresses the frequency domain characteristics of underwater images and the degradation mechanism of diffusion models in high-frequency information, successfully implementing frequency domain enhancement and diffusion adjustment. This significantly improves the model’s ability to restore underwater image details and textures.
We introduce the Wavelet Color Compensation Module (WCCM), which uses discrete wavelet transform for frequency domain decomposition and enhancement fusion. This effectively compensates for the degradation in underwater images. In the WEDM, the image processed by the color compensation module serves as a strong conditional guide, driving the diffusion model’s denoising process to accurately restore high-frequency details and reduce recovery bias caused by degradation.
We propose the WTConv Residual Diffusion Adjustment Module (WDM), which deeply explores the potential of diffusion models in frequency domain modeling. This significantly improves the restoration of image details and textures while enhancing the model’s robustness to noise and generalization capabilities.
Experimental results show that the WEDM outperforms previous UIE methods. Extensive ablation experiments validate the effectiveness of our contributions.

2. Related Work

2.1. Traditional Underwater Image Enhancement Method

Traditional UIE methods are predominantly based on either physical models or image enhancement strategies. Physical model-based methods aim to reverse the degradation process by constructing estimation models for underwater imaging parameters such as attenuation coefficients and background light. Peng et al. [18] improved the dark channel prior to separate scattering light and perform color correction. However, these methods are vulnerable to color distortion, particularly due to errors in depth map estimation in complex scenes. Liang et al. [19] proposed a multi-constraint optimization model that incorporates priors such as the gray world assumption. However, these models are highly sensitive to non-uniform illumination in underwater environments. While these methods are physically interpretable, their reliance on manually specified priors limits their ability to adapt to the dynamic and complex optical conditions encountered underwater, often resulting in artifacts and local over-enhancement.

In contrast, image enhancement methods work by improving visual quality through pixel-level operations. For instance, the Retinex fusion method proposed by Zhang et al. [20] improves contrast but exacerbates high-frequency noise. Ancuti et al. [21] implemented a multi-strategy fusion approach to achieve color balance, but their method neglects the wavelength-dependent nature of underwater scattering, leading to imbalance in channel compensation. It is crucial to note that existing enhancement techniques face a trade-off: global operations such as histogram equalization [22] may damage local details, whereas block-based variational methods [23] can enhance texture but introduce block effects. This disconnect between low-level signal processing and high-level semantic understanding hinders the practical application of traditional methods in real underwater scenarios.

2.2. Deep Learning Underwater Image Enhancement Methods

With the rise of deep learning, data-driven UIE methods have increasingly become the standard approach. These methods capitalize on the powerful feature representation and non-linear mapping capabilities of neural networks, which automatically extract relevant features from training data. CNN architectures model degradation patterns through end-to-end mappings. For instance, UWCNN [24] adopts a multi-branch architecture to adapt to varying water qualities, while Ucolor enhances discriminability by fusing features across different color spaces. However, deterministic regression frameworks often lead to local optima, resulting in limited output diversity and instability, especially in scenarios involving extreme degradation. To address these limitations, adversarial generation mechanisms have been introduced. Water-Net [25] developed a multi-task GAN framework for joint dehazing and color restoration, while CycleGAN-based methods [26,27] tackle the issue of data scarcity through unsupervised domain adaptation. While GANs excel at generating fine details, the dynamic interplay between the discriminator and the generator can lead to mode collapse, producing structural artifacts in the generated images.

In recent years, several complex frameworks have been proposed and have achieved state-of-the-art performance [28,29]. For instance, Yang et al. [30] proposed a multi-scale progressive restoration network that is aware of reflected light. This network is capable of producing images with both color equalization and rich texture in a variety of underwater scenes. Huang et al. [31] proposed a mean teacher-based semi-supervised network, which effectively leverages the knowledge from unlabeled data. Recent studies suggest that current methods predominantly focus on spatial domain feature learning, whereas the low-frequency energy decay and high-frequency phase distortion characteristics of underwater images in the frequency domain remain underexplored [32], which represents a critical bottleneck limiting the improvement of model performance.

2.3. Diffusion Models

Diffusion models have achieved significant breakthroughs in generation quality and diversity by utilizing a progressive denoising mechanism, offering a novel paradigm for UIE. Saharia et al. [33] proposed Palette, which has demonstrated the excellent performance of diffusion models in the field of conditional image generation, including colorization, in-painting, and JPEG restoration. Ref. [34] were the first to apply this approach to underwater image enhancement, restoring details through iterative noise correction. However, traditional diffusion models exhibit two primary limitations: first, high-frequency details tend to be smoothed out during multiple noise injections (as shown in Figure 2); second, global denoising strategies struggle to distinguish between signal and complex underwater noise, which leads to local contrast imbalances.

Recent studies have begun to explore frequency-domain-guided diffusion mechanisms. WF-Diff [35] employs wavelet decomposition for multi-scale generation, but directly applying this approach to UIE tasks introduces issues related to color channel coupling. Therefore, there is a pressing need to design a collaborative enhancement framework that combines frequency-domain and spatial-domain features, tailored to the specific characteristics of underwater imaging.

3. Methodology

3.1. Overall Framework

This study is dedicated to developing a network capable of concurrently eliminating color bias and enhancing the details of underwater images. The proposed Wavelet-based Underwater Image Enhancement Diffusion Model (WEDM), as depicted in Figure 3, seamlessly integrates the advantages of frequency-domain information with the formidable capabilities of diffusion models.

The proposed WEDM framework comprises two synergistic modules: the Wavelet Color Compensation Module (WCCM) for color restoration and the Wavelet Diffusion Module (WDM) for detail enhancement. Underwater images typically suffer from frequency-dependent degradations; for example, global color casts and scattering predominantly affect low-frequency components, whereas detail blurring manifests as high-frequency information loss. To address this, the WCCM applies discrete wavelet transform (DWT) to decompose the input image into sub-bands. Through LAB color space transformation and frequency-domain operations, it generates a pre-enhanced conditional image that corrects low-frequency color distortion while retaining high-frequency structure.

The WDM adopts the RDDM network [13] as its denoising backbone. As illustrated in Figure 4, it utilizes a U-Net-like architecture where traditional convolution is replaced with WTConv. This enables effective multi-scale feature extraction in the frequency domain. The residual diffusion process enhances high-frequency components while avoiding the over-smoothing typical in conventional models. Overall, the cooperative design of the WCCM and WDM ensures both global structure recovery and fine detail preservation, significantly boosting enhancement performance under complex underwater degradation scenarios.

To justify the use of wavelet transform, we begin by modeling the physical characteristics of underwater imaging. These images often suffer from scattering and absorption effects, which can be modeled as:

I_{i n} = α I_{c l e a n} + β I_{d i s t o r t i o n s}

(1)

where

I_{c l e a n}

denotes the latent clean image,

I_{d i s t o r t i o n s}

represents distortions caused by the underwater environment, and

α

,

β

are coefficients determined by water’s optical properties.

Next, discrete wavelet transform (DWT) is applied to decompose the image into frequency-specific components:

I_{L L}, {I_{L H}, I_{H L}, I_{H H}} = D W T (I)

(2)

where,

I_{L L}

contains low-frequency components typically associated with color degradation, while

{I_{L H}, I_{H L}, I_{H H}}

capture high-frequency details susceptible to blur. The biorthogonal nature of DWT ensures lossless decomposition, enabling degradation-specific processing in the frequency domain.

In contrast, standard convolutions process all frequencies jointly, leading to frequency aliasing and mixing of structural and detail information. This often results in blurred textures and edge loss.

To overcome this, wavelet convolution (WTConv) is employed to independently process each sub-band, effectively decoupling the optimization objectives.

From the perspective of diffusion modeling, denoising is formulated as a conditional probability process:

q (x_{t - 1} | x_{t}, x_{0}) = N (x_{t - 1}; \tilde{μ} (x_{t}, x_{0}), {\tilde{β}}_{t} I)

(3)

Integrating WTConv into the diffusion framework enables sub-band-specific noise prediction. This reduces problem complexity, constrains the solution space, and enhances prediction accuracy. By guiding denoising in separate frequency domains, the model avoids high-frequency detail loss and improves robustness. Recent studies [36] have demonstrated the effectiveness of similar frequency-guided strategies, further validating our approach.

3.2. Wavelet Color Compensation Module

The purpose of the WCCM module is to perform color correction on the low-frequency components and enhance high-frequency details of the underwater image through frequency-domain operations. The steps involved are as follows:

LAB conversion: Convert the RGB image to the LAB color space, which allows for the separation of luminance (L) and chrominance (a, b) channels. This separation facilitates the independent adjustment of color components, where the luminance information is less affected by the scattering effects of water.
Mask generation: Based on the luminance values, a binary mask $M (x)$ is generated using a thresholding approach. Specifically, pixels with luminance $L (x)$ greater than 0.847 are assigned a value of zero, and others are assigned one. This mask enables discriminative processing of regions with higher and lower luminance, ensuring that color compensation is applied appropriately based on the image’s luminance distribution.

$M (x) = I (L (x) < 0.847) \cdot G (σ = 1.5)$

(4)

where $G (σ = 1.5)$ represents a Gaussian blur applied to the mask for smoothing purposes. The threshold value of 0.847 is determined based on the Color Channel Compensation (3C) method [37], which experimentally shows that luminance values above 0.85 usually correspond to strong light sources or overexposed regions. Such areas are unsuitable for aggressive color correction, as it may introduce additional distortion. Considering the unique luminance distribution of underwater images, we slightly adjusted this threshold to 0.847 to better suit underwater scenes.
Wavelet-based correction: After decomposition of the image into wavelet subbands, color correction is applied to the low-frequency subbands of the a and b chrominance channels. This step adjusts for the color bias introduced by the underwater environment.

$W {(I_{c})}_{L L} \leftarrow W {(I_{c})}_{L L} - κ_{c} \cdot M \cdot G (W {(I_{c})}_{L L})$

(5)

where $c \in {a, b}$ and $κ_{c}$ controls the strength of color correction. The Gaussian-blurred mask M is used to modify the low-frequency components of the chrominance channels, helping to mitigate color distortion while preserving essential image structures. The parameter $κ_{c}$ is set with reference to the 3C method [37], which demonstrates that moderate values yield optimal results in standard imaging conditions. In underwater scenarios, where the red channel often experiences severe attenuation, increasing $κ_{c}$ to around 1.0 significantly improves compensation. This study follows the empirical guideline while adjusting $κ_{c}$ for underwater characteristics.
Reconstruction: After color correction is applied to the low-frequency subbands, the image is reconstructed by merging all subbands using the inverse discrete wavelet transform (IDWT). The final image is then converted back to the RGB color space to obtain the color-corrected result.

By working in the wavelet domain, this module allows for fine-grained control over color compensation while maintaining the integrity of the image’s structural information.

3.3. Wavelet Diffusion Module

The objective of the WDM is to combine the advantages of multi-scale feature extraction in the wavelet domain with the strong representation capabilities of diffusion models to refine the image’s high-frequency and low-frequency components. The WDM performs both forward and reverse diffusion processes, detailed as follows:

Forward Diffusion: The forward diffusion process is modeled as a Markov chain, where noise is gradually added to the image. The forward step for a single noise addition can be written as

$q (I_{t} | I_{t - 1}, I_{r e s}) = N (I_{t}; I_{t - 1} + α_{t} I_{r e s}, β_{t}^{2} I)$

(6)

Here, $I_{t}$ represents the image at time step t, and $I_{r e s} = I_{i n} - I_{0}$ is the residual between the degraded image $I_{i n}$ and the clean image $I_{0}$ . The parameters $α_{t}$ and $β_{t}$ control the influence of the residual and the noise respectively.
Wavelet Convolution Parameter Setting and Sensitivity Analysis: The WTConv module leverages wavelet transform to effectively avoid the frequency aliasing problem inherent in conventional convolution, while achieving a large receptive field without significantly increasing computational cost. We adopt the db1 wavelet (Haar wavelet), which provides clear frequency localization and minimal computational complexity in image processing tasks. In addition, the wavelet convolution kernel size is set to 5 × 5. Experimental results show that, compared with smaller kernels, the 5 × 5 kernel is more effective in capturing long-range dependencies among image features, thereby better preserving and restoring high-frequency details. Following the experimental study of Finder et al. [38], we set the wavelet decomposition level to 2, which yields superior multi-scale feature representations.
Reverse Denoising: The reverse diffusion employs L1 loss for residual prediction to preserve high-frequency details. The reverse denoising process is modeled as follows:

$p_{θ} (I_{t - 1} | I_{t}, I_{i n}) = N (I_{t - 1}; μ_{θ} (I_{t}, I_{i n}, t), σ_{t}^{2} I)$

(7)

where $I_{i n}$ is the condition image, and $μ_{θ} (I_{t}, I_{i n}, t)$ represents the predicted mean at step t. The variance $σ_{t}^{2}$ is calculated based on the previous time step. The network uses this mean and variance to predict the denoised residuals.
Feature Reconstruction: After the WTConv operations, the processed high-frequency subbands are scaled to control their influence. The feature map is then reconstructed by merging the high-frequency and low-frequency components and applying inverse wavelet transform (IWT):

$\hat{X} = I_{W T} ({X_{L L}, {\hat{X}}_{L H}, {\hat{X}}_{H L}, {\hat{X}}_{H H}})$

(8)

due to the biorthogonal property of DWT, the reconstructed feature map is then added as a residual to the original input feature map, ensuring that the details and structures are preserved without causing distortion:

$X_{o u t} = X + α \cdot \hat{X}$

(9)

where $α$ is a coefficient that controls the influence of the residual, typically set to 0.1 to prevent over-adjustment.
Training Process and Loss Function: During the reverse diffusion process, the model predicts both the residuals $I_{r e s}$ and the noise $ϵ$ . The training objective is to minimize the Kullback–Leibler (KL) divergence between the true posterior and the predicted posterior, which is simplified to the following L1 loss:

$L (θ) = E_{(t, I_{t}, I_{r e s})} [∥ I_{r e s} - I_{r e s}^{θ} (I_{t}, t) ∥_{1}]$

(10)

This simplified loss function ensures that the model learns to predict the residuals effectively, leading to better performance in underwater image enhancement tasks.

4. Experiments

4.1. Setup

Implementation Details. The proposed WEDM method was trained on an NVIDIA RTX A5000 GPU with 1000 diffusion steps. The Adam optimizer and L1 loss function were used for a total of 240,000 iterations. The initial learning rate was set to 8 ×

10^{- 5}

, and the batch size was fixed at 8. The training dataset consisted of 3900 randomly selected underwater image pairs from the LSUI dataset [15], which were cropped to 256 × 256 pixels before being fed into the network. During inference, full-resolution testing was performed using three diffusion steps for all tasks.

Datasets. The WEDM model was trained on 3900 randomly selected image pairs from the LSUI dataset. The remaining 379 images were set aside as the test set (referred to as TEST-L400). Additionally, the generalization performance of the WEDM was evaluated on real-world underwater scenes from the UIEB dataset [16], the Challenge60 dataset, and the U45 dataset [17].

Comparison Methods. We compared the WEDM with seven SOTA UIE methods, which include one image restoration method (UDCP [39]), one image enhancement method (WWPF [40]), and five deep learning-based methods (PUIE-Net [41], FUnIE-GAN [29], LENet [42], Shallow-UWnet [43], and DM-water [34]). Among these, DM-water is a diffusion-model-based method. To ensure a fair and rigorous comparison, we used the source code provided by the respective authors and adhered to the same experimental settings for evaluation.

Evaluation Metrics. To assess the performance of different UIE methods, we used full-reference image quality metrics, PSNR [44] and SSIM [45], to quantify the enhancement effects of the WEDM on the LSUI and UIEB datasets. These metrics were computed based on the Y channel in the YCbCr color space. Additionally, we used non-reference image quality metrics, UIQM [46] and UCIQE [47], to evaluate the performance of the model on non-reference benchmarks such as the U45 and Challenge60 datasets.

4.2. Results and Comparisons

4.2.1. Qualitative Comparison

For datasets with reference data, we compared the WEDM with other methods on the TEST-L400 and UIEB datasets to evaluate its effectiveness in natural underwater image enhancement. The results are presented in Table 1. Compared to existing methods, the WEDM demonstrates a clear advantage in preserving the structural information and details of the enhanced images, achieving the highest PSNR and SSIM scores. Relative to the second-best method, the WEDM shows an 18.33% improvement in PSNR and a 1.58% increase in SSIM. The WEDM method emphasizes preserving image structure and texture details rather than merely minimizing mean squared error, highlighting its effectiveness in enhancing underwater images that are sensitive to structural details. This further confirms the robustness of the proposed WEDM method.

Additionally, Figure 5 and Figure 6 present visual comparisons of images enhanced by the WEDM and other methods across different degraded underwater scenes. The experimental results show that FUnIE-GAN and Shallow-UWnet underperform in enhancing degraded images due to limited parameters, with FUnIE-GAN introducing noticeable noise. PUIE-NET and LENet improve image clarity but suffer from color distortion, and the enhanced images require further adjustment of saturation and contrast. DM-water achieves good results in detail enhancement but still requires improvement in color quality. WWPF produces enhanced images with good color quality and contrast by adjusting contrast and sharpening, but it also leads to unnatural artifacts, loss of details, and amplified noise. The visual enhancement results in Figure 5 and Figure 6 demonstrate the effectiveness of the WEDM in enhancing degraded underwater images across various scenarios. Compared to other methods, the WEDM shows significant advantages in restoring image details, color, and contrast. Moreover, while some comparison methods perform better on hazy or blue-tinted images, none provide satisfactory results across all scenarios. The WEDM demonstrates strong generalization capabilities, maintaining image structure and texture details, with processed images closely resembling reference images and achieving realistic restoration. Therefore, based on both qualitative and quantitative metrics, the WEDM outperforms current SOTA models.

On one hand, underwater imaging is an ill-posed problem, meaning that restoration outcomes are not unique, and no definitive ground truth exists. On the other hand, non-reference image quality assessment (NR-IQA) better reflects human visual perception compared to full-reference IQA (FR-IQA). Therefore, we tested the WEDM on non-reference datasets (Challenge60 and U45) and used NR-IQA metrics to quantify the enhancement results, demonstrating the WEDM’s superiority. The results are shown in Table 2.

Table 2 demonstrates that the WEDM achieves the highest UIQM and UCIQE scores on every dataset. Evaluations on the Challenge60 and U45 datasets show that the WEDM performs exceptionally well on both UIQM and UCIQE metrics, highlighting its significant advantages in enhancing image quality and restoring accurate colors. These results further validate the effectiveness and feasibility of the WEDM method for UIE tasks. Figure 7 and Figure 8 present visual comparisons of these datasets. The DCP method improves image contrast by introducing artificial colors and reducing blur, but the enhanced images exhibit darker tones and unrealistic appearances. The WWPF method uses a locally adaptive strategy to improve image quality, but its performance is limited by underwater imaging characteristics and cannot achieve optimal enhancement under certain conditions. The Shallow method partially restores details but has limited enhancement, often introducing red shift when processing turbid, complex underwater images. PUIE-Net exhibits limited generalization across different underwater environments, resulting in unnatural color reproduction. In contrast, the WEDM delivers superior underwater image restoration quality, with better clarity, contrast, color fidelity, and naturalness. It achieves consistent restoration of color and detail that aligns with real underwater environments and meets human visual standards.

Figure 9 compares the local magnification effects of various methods on underwater image enhancement. Key regions (such as fish eyes, tentacles, and sea turtle edges) are annotated and magnified to showcase each method’s performance in detail restoration and color reproduction. When magnified, the original underwater images often exhibit low contrast, noise, blurring, distortion, and color shifts. The DCP method produces a greenish tint after enhancement, while WWPF, despite increasing overall brightness, results in blurred local details and lacks depth. This approach struggles to handle severe degradation in complex scenes, leading to unnatural results. LENet alleviates color distortion but fails in recovering fine details, especially in high-frequency texture areas, where blurring or information loss is common. PUIE-Net performs well in adjusting contrast and brightness, but its limited adaptability to complex underwater environments leads to oversaturation or color shifts in localized areas. Both FUnIE and Shallow methods handle overall color shifts well, but they have clear limitations in detail restoration, often producing blurred edges, artifacts, or color deviations. DM-water, a diffusion-based method, performs better than traditional and CNN-based methods in adapting to lighting and color shifts. However, it underperforms in restoring high-frequency details, leading to blurred edges and introducing unnatural texture artifacts in severely degraded scenes. In contrast, the proposed WEDM method effectively combines low-frequency color compensation with high-frequency detail enhancement through frequency domain decomposition, producing enhanced images that significantly outperform other methods in terms of color accuracy, detail, and visual naturalness, closely resembling the reference images.

4.2.2. Visual Evaluation of Enhancement Results

Underwater images often suffer from diverse and uncontrollable degradations, such as color distortion, low contrast, turbidity, light scattering, and non-uniform illumination. These degradations vary across different scenes, and enhancement methods may behave inconsistently under such variations. Furthermore, common test datasets are often biased toward certain image styles, which may limit the reliability of purely quantitative evaluations.

To better assess the generalization ability of the WEDM across diverse underwater conditions, we uniformly sampled a subset of real-world underwater images from several public datasets, including representative scenarios such as high turbidity, dominant green-blue tones, low lighting, and specular reflections.

Figure 10 shows six representative examples with enhancement results from different methods. To support perceptual comparison, we ranked the outputs in each row based on visual factors including color accuracy, detail preservation, and naturalness.

As illustrated in Figure 10, the WEDM consistently ranks among the top performers in most scenarios. It produces visually natural outputs with restored colors and clear textures, demonstrating strong adaptability and generalization to various underwater image styles.

4.3. Applicability Analysis

The enhanced images exhibit improved brightness, contrast, and color fidelity, which are beneficial for high-level vision tasks. These enhancements effectively improve both task adaptability and performance, making them more suitable for underwater vision applications. To verify whether the enhanced images enrich visual features, we conduct object detection experiments on both degraded and enhanced images. The YOLOv9 algorithm is employed to assess the impact of enhancement on detection accuracy. As illustrated in Figure 11, the proposed WEDM framework significantly improves detection performance in terms of precision, recall, and mAP metrics.

Additionally, Figure 12 presents visual comparisons, demonstrating that enhanced images provide clearer edges and more distinguishable features than degraded ones. These improvements confirm the practical effectiveness of our method in real-world underwater detection scenarios.

Ablation Tests

To further demonstrate the effectiveness of our proposed method and analyze the impact of each module on underwater image enhancement, we conducted ablation experiments as follows: (1) the Wavelet Color Compensation Module (WCCM) was removed (denoted as w/o WCCM), using only the degraded image y as the conditional guide; (2) the Wavelet Convolution Residual Diffusion Adjustment Module (WDM) was removed (denoted as w/o WDM).

Figure 13 shows a visual comparison among the proposed WEDM, the baseline model (Base), the model without the WCCM (w/o WCCM), and the model without the WDM (w/o WDM) in the underwater image restoration task. The GT image represents the ideal, distortion-free underwater scene and serves as the benchmark for restoration quality. Table 3 presents the quantitative results of the ablation experiments on the TEST-L400 dataset.

Based on the results presented in Table 3, we can confirm the effectiveness of both the proposed WCCM and WDM modules, with the WDM achieving the best performance across all metrics. As shown in Figure 13, the baseline model, which uses only the basic diffusion process, exhibits significant color distortion and detail blurring. The model without the WCCM shows some improvement in color correction; however, the recovery of high-frequency details remains insufficient, leading to limited enhancements in contrast and clarity. In contrast, the model without the WDM demonstrates notable color restoration but falls short in recovering details and textures compared to the full WEDM method. Specifically, the WCCM leverages discrete wavelet transform for frequency domain decomposition and enhancement fusion, effectively compensating for color distortion and detail loss in underwater images. Meanwhile, the WDM integrates frequency domain features deeply into the diffusion model via WTConv, further enhancing the restoration of image details and textures. The combination of these modules enables the WEDM to outperform in underwater image restoration tasks, thus further validating the efficacy of the proposed modules in enhancing image quality.

5. Conclusions

In this paper, we propose a novel UIE framework, named the WEDM, which effectively combines wavelet-based frequency-domain processing with diffusion models for underwater image enhancement. By fully leveraging the frequency domain characteristics, our Wavelet Color Compensation Module (WCCM) and Wavelet Diffusion Module (WDM) enable precise color correction and efficient frequency-specific feature extraction. The proposed framework, the WEDM, outperforms existing methods on various UIE benchmarks, demonstrating significant improvements in both PSNR and SSIM metrics. Extensive ablation studies confirm the effectiveness of each component, particularly the residual learning strategy, which ensures stable training and robust generalization.

However, due to the use of two diffusion models, our approach does not offer advantages in terms of inference speed. As shown in the results, the WEDM’s computational cost remains higher than recent methods. In the future, we aim to optimize the sampling process and explore lightweight wavelet pruning techniques to enhance the real-time applicability of our framework, while also considering its potential for video-based underwater enhancement and multi-modal sensor fusion applications.

Author Contributions

Conceptualization, J.C. and J.Z.; methodology, J.C., S.Y., X.O. and J.Z.; writing—original draft preparation, J.C.; writing—review and editing, J.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Ningbo Major Research and Development Plan Project, grant number 2024Z114.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data sharing applicable.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Zhou, W.; Zheng, F.; Yin, G.; Pang, Y.; Yi, J. YOLOTrashCan: A deep learning marine debris detection network. IEEE Trans. Instrum. Meas. 2023, 72, 1–12. [Google Scholar]
Qi, Q.; Zhang, Y.; Tian, F.; Wu, Q.J.; Li, K.; Luan, X.; Song, D. Underwater image co-enhancement with correlation feature matching and joint learning. IEEE Trans. Circuits Syst. Video Technol. 2022, 32, 1133–1147. [Google Scholar]
Sun, S.; Guo, H.; Wan, G.; Dong, C.; Zheng, C.; Wang, Y. High precision underwater acoustic localization of the black box utilizing an autonomous underwater vehicle based on the improved artificial potential field. IEEE Trans. Geosci. Remote Sens. 2023, 61, 1–10. [Google Scholar]
Li, C.-Y.; Guo, J.-C.; Cong, R.-M.; Pang, Y.-W.; Wang, B. Underwater image enhancement by dehazing with minimum information loss and histogram distribution prior. IEEE Trans. Image Process. 2016, 25, 5664–5677. [Google Scholar]
Cao, X.; Ren, L.; Sun, C. Dynamic target tracking control of autonomous underwater vehicle based on trajectory prediction. IEEE Trans. Cybern. 2023, 53, 1968–1981. [Google Scholar]
Guan, M.; Xu, H.; Jiang, G.; Yu, M.; Chen, Y.; Luo, T.; Zhang, X. DiffWater: Underwater image enhancement based on conditional denoising diffusion probabilistic model. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2023, 17, 2319–2335. [Google Scholar] [CrossRef]
Hummel, R. Image enhancement by histogram transformation. Comput. Graph. Image Process. 1975, 6, 184–195. [Google Scholar] [CrossRef]
Liang, Z.; Ding, X.; Wang, Y.; Yan, X.; Fu, X. GUDCP: Generalization of underwater dark channel prior for underwater image restoration. IEEE Trans. Circuits Syst. Video Technol. 2022, 32, 4879–4884. [Google Scholar]
Luan, X.; Fan, H.; Wang, Q.; Yang, N.; Liu, S.; Li, X.; Tang, Y. FMambaIR: A Hybrid State Space Model and Frequency Domain for Image Restoration. IEEE Trans. Geosci. Remote Sens. 2025, 63, 1234–1245. [Google Scholar] [CrossRef]
Sarkar, P.; De, S.; Gurung, S.; Dey, P. UICE-MIRNet guided image enhancement for underwater object detection. Sci. Rep. 2024, 14, 22448. [Google Scholar] [CrossRef]
Fabbri, C.; Islam, M.J.; Sattar, J. Enhancing underwater imagery using generative adversarial networks. In Proceedings of the 2018 IEEE International Conference on Robotics and Automation (ICRA), Brisbane, QLD, Australia, 21–25 May 2018; pp. 7159–7165. [Google Scholar]
Ho, J.; Jain, A.; Abbeel, P. Denoising diffusion probabilistic models. Adv. Neural Inf. Process. Syst. 2020, 33, 6840–6851. [Google Scholar]
Liu, J.; Wang, Q.; Fan, H.; Wang, Y.; Tang, Y.; Qu, L. Residual denoising diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 16–22 June 2024; pp. 2773–2783. [Google Scholar]
Shi, X.; Wang, Y.G. CPDM: Content-preserving diffusion model for underwater image enhancement. Sci. Rep. 2024, 14, 31309. [Google Scholar] [CrossRef]
Peng, L.; Zhu, C.; Bian, L. U-Shape Transformer for underwater image enhancement. IEEE Trans. Image Process. 2023, 32, 3066–3079. [Google Scholar]
Li, C.; Anwar, S.; Hou, J.; Cong, R.; Guo, C.; Ren, W. Underwater image enhancement via medium transmission-guided multi-color space embedding. IEEE Trans. Image Process. 2021, 30, 4985–5000. [Google Scholar] [PubMed]
Li, H.; Li, J.; Wang, W. A fusion adversarial underwater image enhancement network with a public test dataset. arXiv 2019, arXiv:1906.06819. [Google Scholar]
Peng, Y.-T.; Cao, K.; Cosman, P.C. Generalization of the dark channel prior for single image restoration. IEEE Trans. Image Process. 2018, 27, 2856–2868. [Google Scholar] [CrossRef]
Liang, Z.; Zhang, W.; Ruan, R.; Zhuang, P.; Li, C. GIFM: An image restoration method with generalized image formation model for poor visible conditions. IEEE Trans. Geosci. Remote Sens. 2022, 60, 4110616. [Google Scholar]
Zhang, W.; Dong, L.; Xu, W. Retinex-inspired color correction and detail preserved fusion for underwater image enhancement. Comput. Electron. Agric. 2022, 192, 106585. [Google Scholar]
Ancuti, C.O.; Ancuti, C.; De Vleeschouwer, C.; Garcia, R. Locally adaptive color correction for underwater image dehazing and matching. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Honolulu, HI, USA, 21–26 July 2017; pp. 997–1005. [Google Scholar]
Fu, X.; Cao, X. Underwater image enhancement with global–local networks and compressed-histogram equalization. Signal Process. Image Commun. 2020, 86, 115892. [Google Scholar]
Zhuang, P.; Wu, J.; Porikli, F.; Li, C. Underwater image enhancement with hyper-Laplacian reflectance priors. IEEE Trans. Image Process. 2022, 31, 5442–5455. [Google Scholar] [CrossRef]
Li, C.; Anwar, S.; Porikli, F. Underwater scene prior inspired deep underwater image and video enhancement. Pattern Recognit. 2020, 98, 107038–107049. [Google Scholar]
Li, C.; Guo, C.; Ren, W.; Cong, R.; Hou, J.; Kwong, S.; Tao, D. An underwater image enhancement benchmark dataset and beyond. IEEE Trans. Image Process. 2020, 29, 4376–4389. [Google Scholar]
Zhu, J.-Y.; Park, T.; Isola, P.; Efros, A.A. Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 2242–2251. [Google Scholar]
Li, J.; Skinner, K.A.; Eustice, R.M.; Johnson-Roberson, M. WaterGAN: Unsupervised generative network to enable real-time color correction of monocular underwater images. IEEE Robot. Autom. Lett. 2018, 3, 387–394. [Google Scholar] [CrossRef]
Peng, Y.T.; Cosman, P.C. Underwater image restoration based on image blurriness and light absorption. IEEE Trans. Image Process. 2017, 26, 1579–1594. [Google Scholar]
Islam, M.J.; Xia, Y.; Sattar, J. Fast underwater image enhancement for improved visual perception. IEEE Robot. Autom. Lett. 2020, 5, 3227–3234. [Google Scholar]
Yang, J.; Li, C.; Li, X. Underwater image restoration with light-aware progressive network. In Proceedings of the ICASSP 2023—2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece, 4–10 June 2023; pp. 1–5. [Google Scholar]
Huang, S.; Wang, K.; Liu, H.; Chen, J.; Li, Y. Contrastive semi-supervised learning for underwater image restoration via reliable bank. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 18145–18155. [Google Scholar]
Wang, Q.; Li, B.; Li, N.; Xie, J.; Wang, X.; Wang, X.; Chen, Y. Domain adaptive multi-frequency underwater image enhancement network. J. Electron. Imaging 2024, 33, 053035. [Google Scholar]
Saharia, C.; Chan, W.; Chang, H.; Lee, C.; Ho, J.; Salimans, T.; Fleet, D.; Norouzi, M. Palette: Image-to-image diffusion models. In Proceedings of the ACM SIGGRAPH 2022 Conference Proceedings, Vancouver, BC, Canada, 7–11 August 2022; pp. 1–10. [Google Scholar]
Tang, Y.; Kawasaki, H.; Iwaguchi, T. Underwater Image Enhancement by Transformer-Based Diffusion Model with Non-Uniform Sampling for Skip Strategy. In Proceedings of the 31st ACM International Conference on Multimedia, Ottawa, ON, Canada, 29 October–3 November 2023; pp. 5419–5427. [Google Scholar]
Zhao, C.; Cai, W.; Dong, C.; Hu, C. Wavelet-based Fourier information interaction with frequency diffusion adjustment for underwater image restoration. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 16–22 June 2024; pp. 8281–8291. [Google Scholar]
Phung, H.; Dao, Q.; Tran, A. Wavelet Diffusion Models Are Fast and Scalable Image Generators. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 18–22 June 2023; pp. 10199–10208. [Google Scholar]
Ancuti, C.O.; Ancuti, C.; De Vleeschouwer, C.; Sbert, M. Color Channel Compensation (3C): A Fundamental Pre-Processing Step for Image Enhancement. IEEE Trans. Image Process. 2019, 29, 2653–2665. [Google Scholar]
Finder, S.E.; Amoyal, R.; Treister, E.; Freifeld, O. Wavelet Convolutions for Large Receptive Fields. In Proceedings of the European Conference on Computer Vision (ECCV), Milan, Italy, 22–26 September 2024; Springer Nature: Cham, Switzerland, 2024; pp. 363–380. [Google Scholar]
Drews, P., Jr.; do Nascimento, E.; Moraes, F.; Botelho, S.; Campos, M. Transmission estimation in underwater single images. In Proceedings of the 2013 IEEE International Conference on Computer Vision Workshops, Sydney, Australia, 2–8 December 2013; pp. 825–830. [Google Scholar]
Zhang, W.; Zhou, L.; Zhuang, P.; Li, G.; Pan, X.; Zhao, W.; Li, C. Underwater image enhancement via weighted wavelet visual perception fusion. IEEE Trans. Circuits Syst. Video Technol. 2023, 34, 2469–2483. [Google Scholar] [CrossRef]
Fu, Z.; Wang, W.; Huang, Y.; Ding, X.; Ma, K.K. Uncertainty inspired underwater image enhancement. In European Conference on Computer Vision; Springer Nature: Cham, Switzerland, 2022; pp. 465–482. [Google Scholar]
Zhang, S.; Zhao, S.; An, D.; Li, D.; Zhao, R. LiteEnhanceNet: A lightweight network for real-time single underwater image enhancement. Expert Syst. Appl. 2024, 240, 122546. [Google Scholar]
Naik, A.; Swarnakar, A.; Mittal, K. Shallow-uwnet: Compressed model for underwater image enhancement (student abstract). Proc. Aaai Conf. Artif. Intell. 2021, 35, 15853–15854. [Google Scholar] [CrossRef]
Korhonen, J.; You, J. Peak signal-to-noise ratio revisited: Is simple beautiful? In Proceedings of the 2012 Fourth International Workshop on Quality of Multimedia Experience, Melbourne, VIC, Australia, 5–7 July 2012; pp. 37–38. [Google Scholar]
Wang, Z.; Bovik, A.C.; Sheikh, H.R.; Simoncelli, E.P. Image quality assessment: From error visibility to structural similarity. IEEE Trans. Image Process. 2004, 13, 600–612. [Google Scholar] [CrossRef] [PubMed]
Yang, M.; Sowmya, A. An underwater color image quality evaluation metric. IEEE Trans. Image Process. 2015, 24, 6062–6071. [Google Scholar] [PubMed]
Panetta, K.; Gao, C.; Agaian, S. Human-visual-system-inspired underwater image quality measures. IEEE J. Ocean. Eng. 2016, 41, 541–551. [Google Scholar]

Figure 1. Enhanced results (bottom) of our WEDM for several raw images (top). (a,b) are derived from the LSUI [15], (c,d) are derived from the UIEB [16], (e,f) are derived from the U45 [17], and (g,h) are derived from the Challenge60. Our WEDM obtained satisfactory visual results for different degraded images from different datasets.

Figure 2. Decay process of high-frequency details under multiple noise injections. (Left) This section illustrates the evolution of the image after noise is added. Gaussian blur is used to simulate intermediate states during the diffusion process. As noise gradually increases, the high-frequency details of the image become progressively blurred and eventually disappear. (Right) This graph shows the trend of high-frequency energy in the image. As noise is gradually injected, the high-frequency energy continuously decays, reflecting the gradual disappearance of high-frequency details.

Figure 3. Architecture of the proposed WEDM method. It contains two detachable networks, the Wavelet Color Compensation Module (WCCM) and the Wavelet Diffusion Module (WDM). The degraded underwater image

\tilde{y}

obtains the conditional image y through the WCCM.

Figure 3. Architecture of the proposed WEDM method. It contains two detachable networks, the Wavelet Color Compensation Module (WCCM) and the Wavelet Diffusion Module (WDM). The degraded underwater image

\tilde{y}

obtains the conditional image y through the WCCM.

Figure 4. Description of the U-Net architecture with skip connections. The architecture consists of downsampling and upsampling paths, incorporating wavelet convolutions and residual blocks for feature extraction. The intermediate stage integrates an attention mechanism to capture long-range dependencies and enhance feature extraction.

Figure 5. Visual Comparison of real underwater images. Five original images (left column) are drawn from the TEST-L400 subset of the LSUI dataset. The PSNR metric is displayed in the upper right corner, with the best score highlighted in red.

Figure 6. Visual comparison of real underwater images. Five original images (left column) are drawn from the UIEB dataset. The PSNR metric is indicated in the upper right corner, with the best value highlighted in red.

Figure 7. Visual Comparison of real underwater images. Five original images (left column) are from the U45 dataset. The UIQM metric is indicated in the upper right corner, with the best value highlighted in red.

Figure 8. Visual comparison of real visual comparison of real underwater images. Five original images (left column) are drawn from the Challenge60 dataset. The UIQM metric is displayed in the upper right corner, with the best value highlighted in red.

Figure 9. Visual comparison on real underwater images. The red boxes indicate enlarged regions for detail comparison.

Figure 10. Comparison of image enhancement effects for different styles.

Figure 11. Performance comparison on object detection between raw and enhanced underwater images using YOLOv9.

Figure 12. Qualitative comparison of object detection results: (left) raw image, (middle) enhanced image using the WEDM, and (right) ground truth annotations.

Figure 13. Ablation results comparison. The PSNR metric is indicated in the upper right corner.

Table 1. Comparison of PSNR and SSIM for enhanced results.

Method	TEST-L400		UIEB
Method	PSNR	SSIM	PSNR	SSIM
UDCP	14.53	0.656	12.64	0.617
WWPF	18.29	0.759	19.04	0.823
PUIE-Net	28.53	0.917	22.47	0.883
LENet	26.64	0.929	23.37	0.891
Shallow	23.26	0.878	19.45	0.754
FUnIE	23.11	0.823	20.16	0.819
DM-water	29.95	0.946	23.19	0.893
WEDM	35.44	0.961	24.23	0.910

Table 2. Comparison of UIQM and UCIQE for enhanced results.

Method	Challenge60		U45		Enhanced Time
Method	UIQM	UCIQE	UIQM	UCIQE	Enhanced Time
UDCP	1.36	0.55	2.30	0.59	1.120 s
WWPF	2.34	0.58	2.80	0.60	0.228 s
PUIE-Net	2.53	0.56	3.15	0.57	0.015 s
LENet	2.58	0.57	3.07	0.59	0.010 s
shallow	2.30	0.50	2.89	0.52	0.050 s
FUnIE	2.37	0.54	3.22	0.58	0.057 s
DM-water	2.56	0.58	2.96	0.60	0.862 s
WEDM	2.70	0.59	3.28	0.62	0.110 s

Table 3. Metrics of ablation experiments for comparison.

Baselines	PSNR	SSIM
Base	29.12	0.865
w/o WCCM	32.80	0.947
w/o WDM	31.54	0.958
Full model	35.44	0.963

“w/o” refers to the model without the specified module.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Chen, J.; Ye, S.; Ouyang, X.; Zhuang, J. WEDM: Wavelet-Enhanced Diffusion with Multi-Stage Frequency Learning for Underwater Image Enhancement. J. Imaging 2025, 11, 114. https://doi.org/10.3390/jimaging11040114

AMA Style

Chen J, Ye S, Ouyang X, Zhuang J. WEDM: Wavelet-Enhanced Diffusion with Multi-Stage Frequency Learning for Underwater Image Enhancement. Journal of Imaging. 2025; 11(4):114. https://doi.org/10.3390/jimaging11040114

Chicago/Turabian Style

Chen, Junhao, Sichao Ye, Xiong Ouyang, and Jiayan Zhuang. 2025. "WEDM: Wavelet-Enhanced Diffusion with Multi-Stage Frequency Learning for Underwater Image Enhancement" Journal of Imaging 11, no. 4: 114. https://doi.org/10.3390/jimaging11040114

APA Style

Chen, J., Ye, S., Ouyang, X., & Zhuang, J. (2025). WEDM: Wavelet-Enhanced Diffusion with Multi-Stage Frequency Learning for Underwater Image Enhancement. Journal of Imaging, 11(4), 114. https://doi.org/10.3390/jimaging11040114

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

WEDM: Wavelet-Enhanced Diffusion with Multi-Stage Frequency Learning for Underwater Image Enhancement

Abstract

1. Introduction

2. Related Work

2.1. Traditional Underwater Image Enhancement Method

2.2. Deep Learning Underwater Image Enhancement Methods

2.3. Diffusion Models

3. Methodology

3.1. Overall Framework

3.2. Wavelet Color Compensation Module

3.3. Wavelet Diffusion Module

4. Experiments

4.1. Setup

4.2. Results and Comparisons

4.2.1. Qualitative Comparison

4.2.2. Visual Evaluation of Enhancement Results

4.3. Applicability Analysis

Ablation Tests

5. Conclusions

Author Contributions

Funding

Institutional Review Board Statement

Informed Consent Statement

Data Availability Statement

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI