Article

LDPC-Net: A Lightweight Detail–Content Progressive Coupled Network for Single-Image Dehazing with Adaptive Feature Extraction Block

1 School of Computer Science and Technology, Tiangong University, Tianjin 300087, China
2 School of Electrical Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
3 School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China
* Author to whom correspondence should be addressed.
Electronics 2024, 13(10), 1867; https://doi.org/10.3390/electronics13101867
Submission received: 6 April 2024 / Revised: 3 May 2024 / Accepted: 8 May 2024 / Published: 10 May 2024

Abstract: Image dehazing is an effective means to enhance the quality of images captured in foggy or hazy weather. However, existing dehazing methods either fail to produce satisfactory restoration results or require a large number of parameters, which limits their deployment on resource-limited platforms. To overcome these limitations, we propose a lightweight yet effective image-dehazing method, named the lightweight detail–content progressive coupled network (LDPC-Net). Within the framework of LDPC-Net, we propose a progressive coupling dehazing paradigm: we first estimate the detail and content information of the haze-free image and then fuse these estimations through progressive coupling. This dehazing framework markedly enhances the operational efficiency of the model. Meanwhile, considering both the effectiveness and efficiency of the network, we also design a lightweight adaptive feature extraction block that serves as the basic feature extraction module of the proposed LDPC-Net. Extensive experimental results demonstrate the effectiveness of our LDPC-Net, which outperforms state-of-the-art methods, reaching a PSNR of 38.57 dB with only 0.708 M parameters.

1. Introduction

In foggy or hazy conditions, images captured by a sensor often deteriorate in quality, exhibiting reduced visibility, color distortion, and lowered contrast [1,2]. This degradation severely affects fields that rely on computer vision, such as autonomous driving and robotic systems [3,4]. To address these problems, image dehazing technology has emerged, aiming to recover a clear, haze-free representation from hazy images. Image dehazing leads to a noteworthy enhancement in image quality, thereby benefiting diverse visual applications such as semantic segmentation and object detection. Consequently, image dehazing has received considerable attention in the computer vision community over the preceding two decades [5,6].
According to the atmosphere scattering model (ASM) [7,8,9], the formation of haze is the result of the interaction between the transmission map and atmospheric light. However, in the task of image dehazing, only the degraded haze image is known. This makes dehazing an ill-posed problem, as it involves estimating multiple unknowns from a single observable input.
In addressing this inherently ill-posed problem, researchers have leveraged physical, geometrical, and statistical cues [5,10,11,12,13,14] to estimate the unknown model parameters. However, such hand-crafted cues are inherently insufficient to capture the full characteristics of real-world images. Recent advancements in deep learning have brought remarkable successes in various computer vision tasks [15,16,17]. Consequently, researchers have sought to develop deep-learning-based methodologies for image dehazing.
Earlier deep learning approaches [18,19,20] primarily use convolutional neural networks (CNNs) to estimate parameters of the atmosphere scattering model rather than relying on hand-crafted priors. In contrast, several contemporary dehazing algorithms [2,21,22,23,24,25,26] forego the traditional atmosphere scattering model altogether and directly infer clear images through end-to-end CNN architectures. Regrettably, CNN-based dehazing methods are still confronted with the following challenges:
(1) How to control the parameter amount and calculation amount of the model. With the rise of deep learning, many image-dehazing methods have made significant progress. However, most existing methods often face problems such as high computational complexity and numerous model parameters when processing haze images, which limits their feasibility in practical applications.
(2) How to efficiently restore the detail information and the content information of an image at the same time. Image dehazing requires recovering both the content and the details of the image. Estimating the content information requires capturing the global information of the image, whereas estimating the detail information requires operating on its local information. These two tasks therefore favor different network structures, and designing an effective architecture that can concurrently restore detail and content information remains an open problem.
To solve the above problems, in this paper, we propose a lightweight detail–content progressive coupled network (LDPC-Net). Within the LDPC-Net, we propose a progressive coupling dehazing framework and a lightweight adaptive feature extraction block. Specifically, the progressive coupling dehazing framework first estimates the detail and content information of the haze-free image and then gradually fuses these estimations. This framework greatly improves the dehazing efficiency of the model, enabling it to estimate haze-free images efficiently. The proposed lightweight adaptive feature extraction block can adaptively extract features from images with few parameters, which improves the computational efficiency of the model.
The main contributions of our work are three-fold as follows:
1. We propose the lightweight detail–content progressive coupled network (LDPC-Net) as a novel solution for lightweight single-image dehazing. The proposed framework first estimates the detail information and the content information of the image separately and then couples the two step by step. This framework greatly improves the dehazing efficiency of the model, allowing it to achieve satisfactory performance with a small number of parameters.
2. We propose a lightweight adaptive feature extraction block that can adaptively extract features from images with few parameters, which improves the computational efficiency of the model.
3. Extensive experiments are conducted to validate that the proposed LDPC-Net performs favorably against state-of-the-art methods on both synthetic and real-world hazy images. Concurrently, an ablation study is conducted to demonstrate the efficacy of the major modules in the proposed network.
The structure of the rest of this paper is organized as follows: Section 2 provides a summary of related works concerning single image dehazing. Section 3 then meticulously details the architecture and specifics of LDPC-Net. Subsequently, Section 4 discusses a series of tests conducted to assess the performance of LDPC-Net. Finally, Section 5 concludes this paper.

2. Related Work

Existing single-image dehazing methodologies can be broadly categorized into two classes: prior-based dehazing methods and learning-based dehazing methods.

2.1. Prior-Based Dehazing

The majority of prior-based techniques aim to acquire haze-free images by estimating model parameters within the physical scattering model equation utilizing image statistic priors. Fattal [13], under the assumption of local uncorrelation between transmission map and surface shading, estimated the transmission map utilizing surface shading information. The dark channel prior (DCP), introduced in [5], facilitated haze removal based on pixel intensity statistics from haze-free outdoor images. Zhu et al. [10] employed the color attenuation prior (CAP) to devise a prior-based dehazing method. Berman et al. [14] imposed additional constraints from nonlocal color prior on scene transmission and atmospheric light to attain haze-free images. While these prior-based approaches demonstrate promising performance in less intricate hazy scenes, they exhibit reduced effectiveness in challenging and complex hazy environments where the assumptions or statistical priors may not universally hold.

2.2. Learning-Based Dehazing

Two primary categories of data-driven dehazing methods emerge: indirect end-to-end and direct end-to-end dehazing methods. Indirect end-to-end approaches predominantly leverage convolutional neural networks (CNNs) to estimate parameters of the atmospheric scattering model and then obtain dehazing results by substituting the estimated parameters into the model. Cai et al. [19] introduced DehazeNet, a four-layer network structure model that estimates the transmission map. Ren et al. [18] proposed the multi-scale CNN (MSCNN), a multi-scale deep dehazing model employing convolutional neural networks to predict transmission at varying scales. Li et al. [6] innovatively merged parameters within the atmospheric scattering model and subsequently estimated the merged parameter using the all-in-one dehazing network (AOD-Net). Pang et al. [27] presented the binocular image dehazing network, utilizing information captured by left and right lenses to estimate the transmission of degraded haze images.
Recognizing that the basic atmospheric scattering model may not encompass all hazy image formation processes, scholars have conducted comprehensive research on the mapping relationship between hazy and haze-free images. For instance, Ren et al. [22] proposed the gated fusion network (GFN), which processes the original hazy image by learning a confidence map over several preprocessed versions of the input, ultimately achieving image dehazing. Qu et al. [28] introduced an enhanced Pix2pix dehazing network, employing generative adversarial concepts to construct an end-to-end image conversion model. Liu et al. [24] designed the multi-scale neural network GridDehazeNet (GDN), which eschews reliance on the atmospheric scattering model and achieves image dehazing through the introduction of an attention mechanism. Inspired by knowledge distillation technology, Hong et al. [21] proposed a knowledge distillation dehazing network, which utilizes a teacher–student architecture to enable dehazing models to learn how to obtain features of clear images. Qin et al. [25] proposed a feature fusion attention network (FFA-Net) for single-image dehazing, demonstrating notable dehazing effects on synthetic datasets. Dong et al. [23] devised a multi-scale enhanced dehazing network with dense feature fusion, effectively harnessing nonadjacent features in U-Net and achieving commendable dehazing results. Li et al. [29] incorporated semisupervised methods into their models to further enhance dehazing effectiveness. The autoencoder and contrastive regularization network (AECR-Net) [2] integrates contrastive learning into an autoencoder-like framework, enhancing the model's dehazing capabilities through the formulation of a contrastive loss between hazy and haze-free images. Inspired by work on attention mechanisms [30], Zhang et al. [31] proposed a spatial dual-branch attention dehazing network (SDBAD-Net). Similarly, Guo et al. [32] proposed a self-paced semi-curricular attention network (SCA-Net), which is mainly used to deal with non-homogeneous haze distributions. With the development of transformer technology, some researchers have applied transformers to image dehazing. Guo et al. [33] proposed an image-dehazing transformer with transmission-aware 3D position embedding (Dehamer), which successfully introduced the transformer into the haze removal task. Subsequently, Song et al. [34] improved the existing transformer technology and proposed a vision transformer architecture for image dehazing, called Dehazeformer. According to different parameter configurations, Dehazeformer variants of different model sizes are denoted Dehazeformer-T, Dehazeformer-S, Dehazeformer-M, Dehazeformer-B, and Dehazeformer-L. In addition, to enable models to process ultra-high-resolution images and real hazy images, Zheng et al. [35] and Chen et al. [36], respectively, proposed a dehazing model for ultra-high-definition images and a dehazing model training framework based on transfer learning.
While researchers have proposed representative dehazing algorithms based on convolutional neural networks, most current work on deep-learning-based dehazing focuses on increasing the depth and width of the network without considering the size of the model. Figure 1 shows the performance and model parameters of some state-of-the-art dehazing models. As can be seen from Figure 1, most existing dehazing methods cannot balance efficiency and accuracy, and a new method is needed to reconcile this trade-off.

3. The Proposed Method

According to the atmosphere scattering model (ASM) [7,8,9], the hazing phenomenon inherent in a hazy image, denoted as I ( x ), can be formulated as follows:
I(x) = J(x) t(x) + A(x) (1 − t(x)),
where x is the pixel location, J(x) signifies the haze-free image, A(x) is the global atmospheric light, and t(x) is the transmission map. In this formulation, the terms J(x), A(x), and t(x) collectively contribute to the inherent ambiguity of image dehazing, rendering it a markedly ill-posed problem in computational vision.
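As a concrete illustration of this model, the sketch below synthesizes a hazy image from a clear one under the common assumption that the transmission map follows t(x) = exp(−β d(x)) for a scene depth d(x); the depth map, the scattering coefficient β, and the atmospheric light value used here are illustrative and not part of LDPC-Net.

```python
import numpy as np

def synthesize_haze(clear, depth, beta=1.0, A=0.9):
    """Apply the atmosphere scattering model I = J * t + A * (1 - t).

    clear : H x W x 3 haze-free image with values in [0, 1]
    depth : H x W scene depth map
    beta  : scattering coefficient controlling haze density
    A     : global atmospheric light (scalar or RGB triplet)
    """
    t = np.exp(-beta * depth)        # transmission map t(x) = exp(-beta * d(x))
    t = t[..., None]                 # broadcast over the color channels
    return clear * t + A * (1.0 - t)

# Example: a dummy clear image with depth increasing from left to right
J = np.random.rand(256, 256, 3)
d = np.tile(np.linspace(0.5, 3.0, 256), (256, 1))
I = synthesize_haze(J, d, beta=0.8, A=0.95)
```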
In addressing this inherently ill-posed problem, in this paper, we propose the lightweight detail–content progressive coupled network (LDPC-Net). The presented model, as illustrated in Figure 2, comprises three distinct modules: the initial feature extraction module, the progressive coupled module, and the joint estimation module. The initial feature extraction module is responsible for extracting content features and detail features from the input image. Subsequently, the progressive coupled module methodically integrates these two types of features. Finally, the joint estimation module utilizes both detail and content features to estimate the haze-free image.

3.1. Initial Feature Extraction Module

As shown in Figure 3, the initial feature extraction module consists of a detail feature extraction module and a content feature extraction module. The two modules are introduced in detail below.

3.1.1. Detail Feature Extraction Module

To mitigate the loss of fine-grained information, the detail feature extraction module employs a non-subsampling structure. It comprises a feature mapping module and two additional feature extraction modules. The feature mapping module maps the input image into the feature space; it is a depth-separable convolution with a kernel size of 3 and 16 output channels. The feature extraction modules are two cascaded lightweight adaptive feature extraction blocks, each a variant of the standard residual dense block (RDB) [37]. The lightweight adaptive feature extraction block is described in detail in Section 3.4.
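A minimal PyTorch sketch of this branch is given below. The lightweight adaptive feature extraction blocks are replaced by plain depth-separable residual blocks, since their full design is only introduced in Section 3.4; the class and argument names are illustrative.

```python
import torch
import torch.nn as nn

def dsconv(in_ch, out_ch, k=3):
    """Depth-separable convolution: depthwise k x k followed by a pointwise 1 x 1."""
    return nn.Sequential(
        nn.Conv2d(in_ch, in_ch, k, padding=k // 2, groups=in_ch),
        nn.Conv2d(in_ch, out_ch, 1),
    )

class DetailFeatureExtractor(nn.Module):
    """Non-subsampled branch: feature mapping + two cascaded extraction blocks."""
    def __init__(self, channels=16):
        super().__init__()
        self.mapping = dsconv(3, channels, k=3)   # image -> 16-channel feature map
        # Stand-ins for the two lightweight adaptive feature extraction blocks (Section 3.4)
        self.block1 = nn.Sequential(dsconv(channels, channels), nn.ReLU(inplace=True))
        self.block2 = nn.Sequential(dsconv(channels, channels), nn.ReLU(inplace=True))

    def forward(self, x):
        f = self.mapping(x)
        f = f + self.block1(f)   # residual refinement at full resolution
        f = f + self.block2(f)
        return f

# Example usage
feat = DetailFeatureExtractor()(torch.randn(1, 3, 256, 256))  # -> (1, 16, 256, 256)
```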

3.1.2. Content Feature Extraction Module

As shown in Figure 3b, this study employs a multi-scale information fusion framework for the extraction of content features. More precisely, the initial seven layers of the EfficientNet [38] architecture are selected as the feature encoder, aiming to capture image features across various scales, specifically at 1/4, 1/8, and 1/16 of the input resolution. Subsequently, a multi-scale feature fusion module is introduced, with its primary objective being the synthesis of features from different scales to enhance the extraction of image content features effectively. The fusion module is composed of three principal components: a channel conversion module, a resolution adjustment module, and a feature fusion module. Initially, the features at each scale are adjusted to a standardized channel count of 32. Following this adjustment, the features undergo further modification to achieve a uniform resolution of 1/4. Ultimately, by employing element-wise addition, an effective fusion of features across different scales is accomplished.
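The sketch below mirrors the multi-scale fusion described above. For self-containedness, a small strided-convolution encoder stands in for the first seven layers of EfficientNet, and the stage channel widths are assumptions; only the channel conversion to 32, the resizing to 1/4 resolution, and the element-wise addition follow the text.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContentFeatureExtractor(nn.Module):
    """Multi-scale content branch: encoder features at 1/4, 1/8, and 1/16 resolution
    are projected to 32 channels, resized to 1/4 resolution, and summed."""
    def __init__(self, enc_channels=(24, 40, 80), fused_channels=32):
        super().__init__()
        c1, c2, c3 = enc_channels
        # Stand-in encoder; LDPC-Net uses the first seven layers of EfficientNet instead.
        self.stage1 = nn.Sequential(nn.Conv2d(3, c1, 3, stride=4, padding=1), nn.ReLU(True))
        self.stage2 = nn.Sequential(nn.Conv2d(c1, c2, 3, stride=2, padding=1), nn.ReLU(True))
        self.stage3 = nn.Sequential(nn.Conv2d(c2, c3, 3, stride=2, padding=1), nn.ReLU(True))
        # Channel conversion module: bring every scale to a common width of 32
        self.proj = nn.ModuleList([nn.Conv2d(c, fused_channels, 1) for c in enc_channels])

    def forward(self, x):
        f4 = self.stage1(x)    # 1/4 resolution
        f8 = self.stage2(f4)   # 1/8 resolution
        f16 = self.stage3(f8)  # 1/16 resolution
        feats = [self.proj[i](f) for i, f in enumerate((f4, f8, f16))]
        # Resolution adjustment: resize every scale to 1/4, then fuse by addition
        target = feats[0].shape[-2:]
        feats = [F.interpolate(f, size=target, mode="bilinear", align_corners=False)
                 for f in feats]
        return sum(feats)

content = ContentFeatureExtractor()(torch.randn(1, 3, 256, 256))  # -> (1, 32, 64, 64)
```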

3.2. Progressive Coupled Module

The progressive coupled module incorporates multiple cross-scale information interaction blocks (CI2Bs) to facilitate comprehensive interaction between detail and content features with minimal computational expense. The CI2B encompasses four processes, enumerated below (a schematic sketch follows the list): enhancement and refinement of content features, alongside enhancement and refinement of detail features.
(1) Enhancement of Content Features: Initially, detail features undergo downsampling and channel transformation. Subsequently, these transformed features are amalgamated with the content features. This is followed by the application of two depth-separable convolutions to further refine the combined features. Finally, the resultant features are integrated with the original content features to yield the enhanced content features.
(2) Refinement of Content Features: To facilitate global information exchange, two lightweight adaptive feature extraction blocks are deployed to optimize the content features.
(3) Enhancement of Detail Features: This process begins with the upsampling and channel transformation of content features, followed by their integration with detail features. Subsequent to this integration, two depth-wise separable convolutions are employed to further refine the amalgamated features. The refined features are then added back to the detail features to achieve enhanced detail features.
(4) Refinement of Detail Features: For enhanced detail extraction, the detail features undergo additional optimization through the application of a lightweight adaptive feature extraction block.
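The sketch below puts the four steps together for one CI2B. Fusing the two branches by simple addition, the bilinear upsampling, and the use of plain depth-separable convolutions in place of the lightweight adaptive feature extraction blocks are simplifying assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def dsconv(ch):
    """Depth-separable convolution that preserves the channel count."""
    return nn.Sequential(nn.Conv2d(ch, ch, 3, padding=1, groups=ch), nn.Conv2d(ch, ch, 1))

class CI2B(nn.Module):
    """Cross-scale information interaction block (illustrative sketch)."""
    def __init__(self, detail_ch=16, content_ch=32):
        super().__init__()
        self.d2c = nn.Conv2d(detail_ch, content_ch, 3, stride=4, padding=1)  # downsample + channel change
        self.c2d = nn.Conv2d(content_ch, detail_ch, 1)                       # channel change before upsampling
        self.content_enh = nn.Sequential(dsconv(content_ch), dsconv(content_ch))
        self.detail_enh = nn.Sequential(dsconv(detail_ch), dsconv(detail_ch))
        self.content_refine = nn.Sequential(dsconv(content_ch), dsconv(content_ch))  # stands in for two adaptive blocks
        self.detail_refine = dsconv(detail_ch)                                       # stands in for one adaptive block

    def forward(self, detail, content):
        # (1) Enhance content features with downsampled detail features
        content = content + self.content_enh(content + self.d2c(detail))
        # (2) Refine content features (global information exchange)
        content = content + self.content_refine(content)
        # (3) Enhance detail features with upsampled content features
        up = F.interpolate(self.c2d(content), size=detail.shape[-2:],
                           mode="bilinear", align_corners=False)
        detail = detail + self.detail_enh(detail + up)
        # (4) Refine detail features
        detail = detail + self.detail_refine(detail)
        return detail, content

d, c = CI2B()(torch.randn(1, 16, 256, 256), torch.randn(1, 32, 64, 64))
```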

3.3. Joint Estimation Module

The joint estimation module comprises three submodules: the detail image estimation module, the content image estimation module, and the final haze-free image estimation module. The detail image estimation module is responsible for deriving a haze-free image based on the detail features, whereas the content image estimation module performs a similar function using the content features. The final haze-free image estimation module integrates the outputs from the previous modules by initially merging the detail and content haze-free images. This integration is followed by a sequence of operations involving a convolution layer, lightweight adaptive feature extraction block, and another convolution layer, to compute the ultimate haze-free image.
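A rough sketch of the joint estimation module follows. Merging the two estimates by channel concatenation and the specific channel widths are assumptions, and a single depth-separable residual block again stands in for the lightweight adaptive feature extraction block of Section 3.4.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class JointEstimation(nn.Module):
    """Estimate detail/content haze-free images, then fuse them into the final output."""
    def __init__(self, detail_ch=16, content_ch=32, mid_ch=16):
        super().__init__()
        self.detail_head = nn.Conv2d(detail_ch, 3, 3, padding=1)    # detail image estimation
        self.content_head = nn.Conv2d(content_ch, 3, 3, padding=1)  # content image estimation
        self.fuse_in = nn.Conv2d(6, mid_ch, 3, padding=1)           # convolution over the merged estimates
        self.fuse_block = nn.Sequential(                            # stand-in adaptive block
            nn.Conv2d(mid_ch, mid_ch, 3, padding=1, groups=mid_ch),
            nn.Conv2d(mid_ch, mid_ch, 1), nn.ReLU(inplace=True))
        self.fuse_out = nn.Conv2d(mid_ch, 3, 3, padding=1)

    def forward(self, detail_feat, content_feat):
        g_detail = self.detail_head(detail_feat)
        g_content = F.interpolate(self.content_head(content_feat),
                                  size=g_detail.shape[-2:],
                                  mode="bilinear", align_corners=False)
        f = self.fuse_in(torch.cat([g_detail, g_content], dim=1))   # merge the two estimates
        g_final = self.fuse_out(f + self.fuse_block(f))
        return g_detail, g_content, g_final

gd, gc, gf = JointEstimation()(torch.randn(1, 16, 256, 256), torch.randn(1, 32, 64, 64))
```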

3.4. Lightweight Adaptive Feature Extraction Block

The RDB is a simple and effective feature extraction module that exhibits strong feature extraction ability with few network parameters. Therefore, many dehazing methods use the RDB as the basic feature extraction module of their networks. However, the RDB has the following shortcomings: (1) it has a fixed receptive field, whereas different objects in hazy images require different receptive fields; and (2) its dense connections introduce a large number of parameters and calculations. To address these shortcomings, we propose an improved RDB called the lightweight adaptive feature extraction block. As shown in Figure 4, compared with the original RDB, the proposed block incorporates the following two improvements (a code sketch follows the list below):
(1) Dynamic receptive field. In order to make the feature extraction module have dynamic receptive field, we performed two operations. First, we changed the last two convolutions in the RDB block to a convolution kernel size of 7. Secondly, we added channel-wise self-attention before the residual operation, so that the features of different receptive fields could be dynamically fused.
(2) Lightweight parameters and calculation. To reduce the number of parameters and calculations, we replaced the original convolution with a depth-separable convolution.
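Under the two modifications above, the block can be sketched as an RDB-style densely connected stack of depth-separable convolutions whose last two layers use 7 × 7 kernels, with channel-wise self-attention applied before the residual addition. The number of layers, the growth rate, and the squeeze-and-excitation form of the attention are assumptions made for illustration.

```python
import torch
import torch.nn as nn

def dsconv(in_ch, out_ch, k):
    """Depth-separable convolution used to keep the parameter count low."""
    return nn.Sequential(
        nn.Conv2d(in_ch, in_ch, k, padding=k // 2, groups=in_ch),
        nn.Conv2d(in_ch, out_ch, 1),
    )

class LightweightAdaptiveBlock(nn.Module):
    """RDB variant with dense connections, mixed 3x3/7x7 receptive fields,
    and channel-wise self-attention before the residual addition."""
    def __init__(self, channels=16, growth=16):
        super().__init__()
        kernels = (3, 3, 7, 7)                 # the last two convolutions use 7x7 kernels
        self.layers = nn.ModuleList()
        in_ch = channels
        for k in kernels:
            self.layers.append(nn.Sequential(dsconv(in_ch, growth, k), nn.ReLU(inplace=True)))
            in_ch += growth                    # dense connection: inputs are concatenated
        self.local_fusion = nn.Conv2d(in_ch, channels, 1)
        # Channel-wise self-attention (squeeze-and-excitation style) that re-weights
        # features coming from different receptive fields before the residual.
        self.attn = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // 4, 1), nn.ReLU(inplace=True),
            nn.Conv2d(channels // 4, channels, 1), nn.Sigmoid())

    def forward(self, x):
        feats = [x]
        for layer in self.layers:
            feats.append(layer(torch.cat(feats, dim=1)))
        fused = self.local_fusion(torch.cat(feats, dim=1))
        return x + fused * self.attn(fused)

out = LightweightAdaptiveBlock()(torch.randn(1, 16, 128, 128))
```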

3.5. Loss Function

We simultaneously supervised the dehazing results based on detail features, the dehazing results based on content features, and the final dehazing results. The loss can be expressed as
L = L_1(J, G_final) + 0.2 L_CR(J, G_final) + 0.1 L_1(J, G_detail) + 0.1 L_1(J, G_content),
where G_detail and G_content are the dehazing results based on detail features and content features, respectively, L_1 is the L1 loss, G_final is the final dehazing result, and L_CR is the contrast loss.
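A direct transcription of this loss is sketched below; the contrastive term L_CR is passed in as a callable, since its internal feature extractor (adopted from the contrastive regularization of [2]) is not spelled out here.

```python
import torch
import torch.nn.functional as F

def total_loss(J, G_final, G_detail, G_content, contrast_loss):
    """L = L1(J, G_final) + 0.2 L_CR(J, G_final) + 0.1 L1(J, G_detail) + 0.1 L1(J, G_content)."""
    return (F.l1_loss(G_final, J)
            + 0.2 * contrast_loss(J, G_final)      # contrastive regularization term L_CR
            + 0.1 * F.l1_loss(G_detail, J)
            + 0.1 * F.l1_loss(G_content, J))

# Example with a trivial stand-in for the contrastive term
J = torch.rand(1, 3, 64, 64)
outs = [J + 0.01 * torch.randn_like(J) for _ in range(3)]
loss = total_loss(J, *outs, contrast_loss=lambda a, b: F.l1_loss(a, b))
```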

4. Experimental Results

4.1. Experimental Settings

(1) Datasets: The assessment of the introduced method was conducted using a synthetic dataset, real-world datasets, and a collection of real-world hazy images. Within the synthetic dataset REalistic Single-Image Dehazing (RESIDE) [1], the indoor training set (ITS) and the synthetic objective testing set (SOTS) were utilized as the training and testing sets, comprising 13,990 and 500 samples, respectively. For real-world scenarios, the NH-HAZE [39] and DENSE-HAZE [40] datasets were employed. DENSE-HAZE and NH-HAZE contain 45 images with dense haze and 45 images with nonhomogeneous haze, respectively, along with their corresponding ground-truth images. Both datasets provide five pairs of images for validation and another five pairs for testing.
(2) Implementation Details: Our proposed model was trained on all datasets using the Adam optimizer [41], with parameters β1 = 0.9 and β2 = 0.9, over 100 epochs. The initial learning rate of 0.001 was uniformly applied across all layers and dynamically adapted using a cosine annealing scheduler. Training was performed using PyTorch 1.7.1 on an Nvidia GeForce RTX 3090 GPU (Nvidia Corporation, Santa Clara, CA, USA), with a patch size of 256 × 256 and a batch size of 8.
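For reference, the optimizer and scheduler configuration described above can be set up as in the sketch below; the placeholder network, the per-epoch scheduler stepping, and T_max = 100 are assumptions, as the paper does not spell them out.

```python
import torch
from torch.optim import Adam
from torch.optim.lr_scheduler import CosineAnnealingLR

model = torch.nn.Conv2d(3, 3, 3, padding=1)           # placeholder standing in for LDPC-Net
optimizer = Adam(model.parameters(), lr=1e-3, betas=(0.9, 0.9))
scheduler = CosineAnnealingLR(optimizer, T_max=100)    # cosine decay over 100 epochs

for epoch in range(100):
    # Inner loop (omitted): sample 256x256 hazy/clear patch pairs with batch size 8,
    # compute the loss from Section 3.5, then zero_grad(), backward(), and step().
    scheduler.step()
```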
(3) Evaluation Metric: To assess the efficacy of our approach, we utilized the peak signal-to-noise ratio (PSNR), structural similarity index (SSIM) and Haar perceptual similarity index (PSI) [42] as benchmarks.
Peak Signal-to-Noise Ratio (PSNR): This is a widely used quantitative measure for assessing the quality of reconstructed images, particularly in the fields of image processing. It evaluates the fidelity of a reconstructed image compared to its original, unaltered version. The formula for PSNR is given as
PSNR(x, y) = 10 · log_10 ( MAX_I^2 / MSE(x, y) ),
where x and y are the two images being compared, MAX_I represents the maximum possible pixel value of the image, and MSE(x, y) is the mean squared error between the original and reconstructed images.
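A minimal implementation of this formula (assuming 8-bit images, i.e., MAX_I = 255) is shown below.

```python
import numpy as np

def psnr(x, y, max_val=255.0):
    """Peak signal-to-noise ratio between two images with values in [0, max_val]."""
    mse = np.mean((x.astype(np.float64) - y.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")   # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

# Example: a restored image close to its ground truth yields a high PSNR
gt = np.random.randint(0, 256, (256, 256, 3))
restored = np.clip(gt + np.random.randint(-3, 4, gt.shape), 0, 255)
print(round(psnr(restored, gt), 2))
```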
Structural Similarity Index Measure (SSIM): This is a metric used to measure the similarity between two images. The SSIM index incorporates luminance, contrast, and structure comparison functions, which are more aligned with the human visual system’s characteristics. The formula for SSIM is defined as
SSIM(x, y) = [ (2 μ_x μ_y + c_1)(2 σ_xy + c_2) ] / [ (μ_x^2 + μ_y^2 + c_1)(σ_x^2 + σ_y^2 + c_2) ],
where x and y are the two images being compared; μ_x and μ_y are the mean values of x and y; σ_x^2 and σ_y^2 are the variances of x and y; σ_xy is the covariance of x and y; and c_1 and c_2 are constants used to stabilize the division with weak denominators.
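For illustration, a simplified SSIM using global image statistics is shown below; the standard SSIM averages this expression over local (e.g., 11 × 11 Gaussian-weighted) windows rather than computing it once per image, and the constants follow the common choice c_1 = (0.01 MAX)^2 and c_2 = (0.03 MAX)^2.

```python
import numpy as np

def ssim_global(x, y, max_val=255.0):
    """Simplified SSIM computed from global image statistics."""
    x = x.astype(np.float64)
    y = y.astype(np.float64)
    c1, c2 = (0.01 * max_val) ** 2, (0.03 * max_val) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    return ((2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)) / \
           ((mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))
```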
Haar Perceptual Similarity Index (PSI): PSI is a metric that evaluates the perceptual similarity of images based on the Haar wavelet transform. It prioritizes coefficients in regions where human visual perception is more sensitive to detail, making it particularly effective for assessing perceived image quality. The HaarPSI value is computed as follows:
PSI(x, y) = ( Σ_{i=1}^{N} w_i · sim(c_i, c_i′) ) / ( Σ_{i=1}^{N} w_i ),
where c_i and c_i′ are the wavelet coefficients of the original and distorted images, respectively; w_i are weights reflecting the perceptual significance of each coefficient; and sim(·, ·) is a similarity function comparing the coefficients.
Additionally, the computational cost is evaluated in terms of the number of parameters (Param).
(4) Compared Methods: Our dehazing framework is benchmarked against cutting-edge techniques, encompassing methods grounded in priors (e.g., DCP [5]), approaches driven by the ASM (e.g., DehazeNet [19] and AOD-Net [6]), and end-to-end image translation strategies (e.g., GFN [22], GDN [24], FFA-Net [25], KDDN [21], AECR-Net [2], Dehamer [33], Dehazeformer-S [34], SCA-Net [32], and SDBAD-Net [31]).

4.2. Qualitative Comparison

Figure 5 illustrates the performance of various dehazing models on the synthetic objective testing set (SOTS). It is observed that the dark channel prior (DCP) method often suffers from significant color distortion, attributed to its underlying assumptions, which may not hold in all scenarios. AOD-Net and DehazeNet mitigate these color distortion and saturation issues to a considerable extent but do not completely remove haze, leading to variations in image brightness. While GDN, GFN, and AECR-Net show enhancements in dehazing results, they are not devoid of visual imperfections, including minor dark spots and blurred edges. In stark contrast, our proposed model generates images that are visually more appealing, free from color distortions, brightness loss, or any noticeable defects. Overall, our method produces results that closely emulate the quality of haze-free images, especially in terms of color accuracy and the clarity of object details.
Figure 6 presents the dehazing results on the NH-HAZE dataset, where the uneven distribution of haze poses additional challenges. Owing to this heterogeneity, the performance of the compared models on NH-HAZE does not match their efficacy on the SOTS dataset. Nevertheless, consistent with our expectations, our model achieves the best visual results among the compared techniques.
Figure 7 showcases comparative results of different models in authentic foggy scenarios. The DCP model exhibits color distortions, while models like AOD-Net and GDN leave behind remnants of haze. The AECR-Net model tends to over-dehaze, removing essential image details. Our model outperforms others by achieving the most effective dehazing, balancing the removal of haze while preserving the natural appearance of the scene.

4.3. Quantitative Evaluations

In Table 1, Table 2 and Table 3, we offer a detailed comparative analysis of our proposed LDPC-Net framework against contemporary state-of-the-art (SOTA) dehazing methods, utilizing the RESIDE, NH-HAZE, and the DENSE-HAZE datasets. The analysis reveals that the DCP method, which relies on hand-crafted priors, delivers the lowest performance metrics. This observation highlights the intrinsic limitations of the assumptions underpinning the DCP algorithm. In contrast, AOD-Net and DehazeNet exhibit enhanced PSNR and SSIM metrics, illustrating the benefits of adopting deep learning methodologies. Furthermore, GDN, FFA-Net, AECR-Net, and SDBAD-Net demonstrate superior performance in both PSNR and SSIM. This improvement is largely due to their ability to directly infer haze-free images, circumventing the reliance on the atmospheric scattering model. Notably, our LDPC-Net outperforms the majority of existing dehazing models in terms of restoration quality, achieving the best experimental results across both synthetic and real-world datasets with a comparatively lower parameter count.

4.4. Ablation Analysis

4.4.1. Effectiveness of the Lightweight Adaptive Feature Extraction Block

In order to verify the effectiveness of the proposed lightweight adaptive feature extraction block, a comparative analysis is carried out with various established feature extraction modules, including the following:
(a) A residual block (RB), which consists of two convolutional layers with a residual link.
(b) A residual dense block (RDB): combined with residual dense operation, the convolution number in the block is set to 4 and the growth rate is 16.
For comparison, we used these feature extraction blocks separately to replace the lightweight adaptive feature extraction block in LDPC-Net. The resulting models were named LDPC-Net-RB and LDPC-Net-RDB, respectively. All the models were trained in the same way, and their dehazing results are shown in Table 4. It can be seen that, compared with LDPC-Net-RB and LDPC-Net-RDB, our model has fewer parameters and better haze removal ability.

4.4.2. Effectiveness of the Joint Estimation Module

First, we verified the effectiveness of the joint estimation module. Specifically, we removed the joint estimation module and directly used the detail features to estimate the final dehazing result. For a fair comparison, the variant was trained in the same way as the full model on SOTS. As can be seen from Table 5, removing the joint estimation module reduces the number of parameters by only 0.007 M, but the PSNR value drops by nearly 1 dB. This fully demonstrates the effectiveness of the joint estimation module.

4.4.3. Effectiveness of the Cross-Scale Information Interaction Block

In order to verify the effectiveness of the cross-scale information interaction block, we varied the number of cross-scale information interaction blocks in the progressive coupled module. For a fair comparison, all the variants were trained in the same way as the full model on SOTS. As can be seen from Table 6, as the number of blocks increases, the performance of the model continues to improve. To balance the parameters and performance of the model, we set the number of cross-scale information interaction blocks to 4.

4.4.4. Effectiveness of the Loss Function

In order to verify the effectiveness of the loss function, a comparative analysis is carried out with variations of the loss function, including the following:
(a) L1 loss, which removed the contrast loss.
(b) L2+contrast loss, which replaced the L1 loss function with the L2 loss function.
(c) L2 loss, which removed the contrast loss and replaced the L1 loss function with the L2 loss function.
For comparison, we used these loss functions to replace the original loss function in LDPC-Net. The resulting models were named LDPC-Net-L1, LDPC-Net-L2-C, and LDPC-Net-L2, respectively. All the models were trained in the same way, and their dehazing results are shown in Table 7. It can be seen that, compared with LDPC-Net-L1, LDPC-Net-L2-C, and LDPC-Net-L2, our model has the best haze removal ability.

5. Conclusions

In this study, a lightweight detail–content progressive coupled network (LDPC-Net) was proposed for lightweight image dehazing. Existing dehazing methods either cannot obtain satisfactory recovery results or have large numbers of model parameters, which limits their application on resource-limited platforms. To overcome these limitations, we proposed LDPC-Net. By integrating three well-designed modules (initial feature extraction, progressive coupling, and joint estimation), LDPC-Net achieves high-efficiency haze removal with very few parameters. Meanwhile, a lightweight adaptive feature extraction block was designed to serve as the basic feature extraction module of the proposed LDPC-Net. Extensive experiments were conducted to verify that the proposed LDPC-Net outperforms state-of-the-art methods on synthetic and real-world hazy images. At the same time, ablation studies were conducted to demonstrate the effectiveness of the main modules in the proposed network.

Author Contributions

Conceptualization, L.D.; methodology, S.L.; validation, H.L.; formal analysis, L.D.; investigation, L.D.; resources, L.D.; data curation, S.L.; writing—original draft preparation, S.L.; writing—review and editing, L.D.; visualization, L.D.; project administration, H.L. All authors have read and agreed to the published version of the manuscript.

Funding

This study was supported by the CCF Opening Project of Information System CCFIS2018G02G04.

Data Availability Statement

The data presented in this study are made publicly available for research purposes. These data are available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

References

  1. Li, B.; Ren, W.; Fu, D.; Tao, D.; Feng, D.; Zeng, W.; Wang, Z. Benchmarking single-image dehazing and beyond. IEEE Trans. Image Process. 2019, 28, 492–505. [Google Scholar] [CrossRef] [PubMed]
  2. Wu, H.; Qu, Y.; Lin, S.; Zhou, J.; Qiao, R.; Zhang, Z.; Xie, Y.; Ma, L. Contrastive Learning for Compact Single Image Dehazing. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; pp. 10551–10560. [Google Scholar]
  3. Fu, C.; Yuan, H.; Xu, H.; Zhang, H.; Shen, L. TMSO-Net: Texture adaptive multi-scale observation for light field image depth estimation. J. Vis. Commun. Image Represent. 2023, 90, 103731. [Google Scholar] [CrossRef]
  4. Qi, F.; Tan, X.; Zhang, Z.; Chen, M.; Xie, Y.; Ma, L. Glass Makes Blurs: Learning the Visual Blurriness for Glass Surface Detection. IEEE Trans. Ind. Inform. 2024, 20, 6631–6641. [Google Scholar] [CrossRef]
  5. He, K.; Sun, J.; Tang, X. Single image haze removal using dark channel prior. IEEE Trans. Pattern Anal. Mach. Intell. 2011, 33, 2341–2353. [Google Scholar]
  6. Li, B.; Peng, X.; Wang, Z.; Xu, J.; Feng, D. AOD-Net: All-in-one dehazing network. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 4770–4778. [Google Scholar]
  7. Cantor, A. Optics of the atmosphere–Scattering by molecules and particles. IEEE J. Quantum Electron. 1978, 14, 698–699. [Google Scholar] [CrossRef]
  8. Narasimhan, S.G.; Nayar, S.K. Chromatic framework for vision in bad weather. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Hilton Head, SC, USA, 15 June 2000. [Google Scholar]
  9. Narasimhan, S.G.; Nayar, S.K. Vision and the atmosphere. Int. J. Comput. Vis. 2002, 48, 233–254. [Google Scholar] [CrossRef]
  10. Zhu, Q.; Mai, J.; Shao, L. A fast single image haze removal algorithm using color attenuation prior. IEEE Trans. Image Process. 2015, 24, 3522–3533. [Google Scholar] [PubMed]
  11. Fattal, R. Dehazing Using Color-Lines. ACM Trans. Graph. 2014, 34, 1–14. [Google Scholar] [CrossRef]
  12. Tan, R.T. Visibility in bad weather from a single image. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Anchorage, AK, USA, 23–28 June 2008; pp. 1–8. [Google Scholar]
  13. Fattal, R. Single image dehazing. ACM Trans. Graph. 2008, 27, 1–9. [Google Scholar] [CrossRef]
  14. Berman, D.; Treibitz, T.; Avidan, S. Non-local image dehazing. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 1674–1682. [Google Scholar]
  15. Cui, Z.; Sheng, H.; Yang, D.; Wang, S.; Chen, R.; Ke, W. Light Field Depth Estimation for Non-Lambertian Objects via Adaptive Cross Operator. IEEE Trans. Circuits Syst. Video Technol. 2024, 34, 1199–1211. [Google Scholar] [CrossRef]
  16. Zhou, G.; Liu, X. Orthorectification Model for Extra-Length Linear Array Imagery. IEEE Trans. Geosci. Remote Sens. 2022, 60, 4709710. [Google Scholar] [CrossRef]
  17. Xing, J.; Yuan, H.; Hamzaoui, R.; Liu, H.; Hou, J. GQE-Net: A Graph-Based Quality Enhancement Network for Point Cloud Color Attribute. IEEE Trans. Image Process. 2023, 32, 6303–6317. [Google Scholar] [CrossRef] [PubMed]
  18. Ren, W.; Liu, S.; Zhang, H.; Pan, J.; Cao, X.; Yang, M.H. Single image dehazing via multi-scale convolutional neural networks. In Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands, 11–14 October 2016; pp. 154–169. [Google Scholar]
  19. Cai, B.; Xu, X.; Jia, K.; Qing, C.; Tao, D. DehazeNet: An End-to-End System for Single Image Haze Removal. IEEE Trans. Image Process. 2016, 25, 5187–5198. [Google Scholar] [CrossRef] [PubMed]
  20. Zhang, H.; Patel, V.M. Densely connected pyramid dehazing network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–22 June 2018; pp. 3194–3203. [Google Scholar]
  21. Hong, M.; Xie, Y.; Li, C.; Qu, Y. Distilling Image Dehazing With Heterogeneous Task Imitation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 14–19 June 2020; pp. 3459–3468. [Google Scholar]
  22. Ren, W.; Ma, L.; Zhang, J.; Pan, J.; Cao, X.; Liu, W.; Yang, M.H. Gated fusion network for single image dehazing. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–22 June 2018; pp. 3253–3261. [Google Scholar]
  23. Dong, H.; Pan, J.; Xiang, L.; Hu, Z.; Zhang, X.; Wang, F.; Yang, M.H. Multi-Scale Boosted Dehazing Network With Dense Feature Fusion. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 14–19 June 2020; pp. 2154–2164. [Google Scholar]
  24. Liu, X.; Ma, Y.; Shi, Z.; Chen, J. Griddehazenet: Attention-based multi-scale network for image dehazing. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019; pp. 7313–7322. [Google Scholar]
  25. Qin, X.; Wang, Z.; Bai, Y.; Xie, X.; Jia, H. FFA-Net: Feature Fusion Attention Network for Single Image Dehazing. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; pp. 11908–11915. [Google Scholar]
  26. Chen, D.; He, M.; Fan, Q.; Liao, J.; Zhang, L.; Hou, D.; Yuan, L.; Hua, G. Gated Context Aggregation Network for Image Dehazing and Deraining. In Proceedings of the IEEE Winter Conference on Applications of Computer Vision (WACV), Waikoloa Village, HI, USA, 7–11 January 2019; pp. 1375–1383. [Google Scholar]
  27. Pang, Y.; Nie, J.; Xie, J.; Han, J.; Li, X. BidNet: Binocular Image Dehazing Without Explicit Disparity Estimation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 14–19 June 2020; pp. 5930–5939. [Google Scholar]
  28. Qu, Y.; Chen, Y.; Huang, J.; Xie, Y. Enhanced Pix2pix Dehazing Network. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 8152–8160. [Google Scholar]
  29. Li, L.; Dong, Y.; Ren, W.; Pan, J.; Gao, C.; Sang, N.; Yang, M.H. Semi-Supervised Image Dehazing. IEEE Trans. Image Process. 2020, 29, 2766–2779. [Google Scholar] [CrossRef] [PubMed]
  30. Yang, M.; Wang, H.; Hu, K.; Yin, G.; Wei, Z. IA-Net: An Inception–Attention-Module-Based Network for Classifying Underwater Images from Others. IEEE J. Ocean. Eng. 2022, 47, 704–717. [Google Scholar] [CrossRef]
  31. Zhang, G.; Fang, W.; Zheng, Y.; Wang, R. SDBAD-Net: A Spatial Dual-Branch Attention Dehazing Network Based on Meta-Former Paradigm. IEEE Trans. Circuits Syst. Video Technol. 2024, 34, 60–70. [Google Scholar] [CrossRef]
  32. Guo, Y.; Gao, Y.; Liu, R.W.; Lu, Y.; Qu, J.; He, S.; Ren, W. SCANet: Self-Paced Semi-Curricular Attention Network for Non-Homogeneous Image Dehazing. In Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Vancouver, BC, Canada, 17–24 June 2023; pp. 1885–1894. [Google Scholar]
  33. Guo, C.; Yan, Q.; Anwar, S.; Cong, R.; Ren, W.; Li, C. Image Dehazing Transformer with Transmission-Aware 3D Position Embedding. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 5802–5810. [Google Scholar]
  34. Song, Y.; He, Z.; Qian, H.; Du, X. Vision Transformers for Single Image Dehazing. IEEE Trans. Image Process. 2023, 32, 1927–1941. [Google Scholar] [CrossRef] [PubMed]
  35. Zheng, Z.; Ren, W.; Cao, X.; Hu, X.; Wang, T.; Song, F.; Jia, X. Ultra-High-Definition Image Dehazing via Multi-Guided Bilateral Learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; pp. 16185–16194. [Google Scholar]
  36. Chen, Z.; Wang, Y.; Yang, Y.; Liu, D. PSD: Principled Synthetic-to-Real Dehazing Guided by Physical Priors. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Virtual, 19–25 June 2021; pp. 7180–7189. [Google Scholar]
  37. Zhang, Y.; Tian, Y.; Kong, Y.; Zhong, B.; Fu, Y. Residual Dense Network for Image Restoration. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 43, 2480–2495. [Google Scholar] [CrossRef] [PubMed]
  38. Tan, M.; Le, Q.V. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. arXiv 2020, arXiv:1905.11946v5. [Google Scholar]
  39. Ancuti, C.O.; Ancuti, C.; Timofte, R. NH-HAZE: An Image Dehazing Benchmark with Non-Homogeneous Hazy and Haze-Free Images. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, Seattle, WA, USA, 14–19 June 2020; pp. 1798–1805. [Google Scholar]
  40. Ancuti, C.O.; Ancuti, C.; Sbert, M.; Timofte, R. Dense-Haze: A Benchmark for Image Dehazing with Dense-Haze and Haze-Free Images. In Proceedings of the IEEE International Conference on Image Processing, Salt Lake City, UT, USA, 22–25 September 2019; pp. 1014–1018. [Google Scholar]
  41. Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980. [Google Scholar]
  42. Reisenhofer, R.; Bosse, S.; Kutyniok, G.; Wiegand, T. A Haar wavelet-based perceptual similarity index for image quality assessment. Signal Process. Image Commun. 2018, 61, 33–43. [Google Scholar] [CrossRef]
Figure 1. The performance and model parameters of some state-of-the-art dehazing models.
Figure 2. The proposed progressive feature fusion network (LDPC-Net).
Figure 3. The details of the initial feature extraction module.
Figure 4. The proposed lightweight adaptive feature extraction block.
Figure 5. Qualitative comparisons of different methods on SOTS data.
Figure 6. Qualitative comparisons of different methods on NH data.
Figure 7. Qualitative comparisons of different methods on real-world hazy images.
Table 1. Comparison of PSNR, SSIM, and PSI performance with state-of-the-art dehazing methods on the RESIDE dataset.

Method | Reference | PSNR (dB) | SSIM | PSI | Param
DCP [5] | TPAMI 11 | 16.16 | 0.855 | 0.648 | -
DehazeNet [19] | TIP 16 | 19.82 | 0.821 | 0.869 | 0.009 M
AOD-Net [6] | ICCV 17 | 20.15 | 0.816 | 0.797 | 0.002 M
GFN [22] | CVPR 18 | 24.91 | 0.919 | 0.879 | 0.499 M
GDN [24] | ICCV 19 | 32.16 | 0.984 | 0.976 | 0.956 M
FFA-Net [25] | AAAI 20 | 36.39 | 0.989 | - | 4.456 M
KDDN [21] | CVPR 20 | 34.72 | 0.985 | - | 2.4 M
AECR-Net [2] | CVPR 21 | 37.17 | 0.990 | 0.983 | 2.61 M
Dehamer [33] | CVPR 22 | 36.63 | 0.988 | 0.973 | 29.44 M
Dehazeformer-S [34] | TIP 23 | 36.82 | 0.992 | 0.982 | 1.28 M
SDBAD-Net [31] | TCSVT 24 | 37.87 | 0.988 | - | 2.23 M
(Ours) LDPC-Net | - | 38.57 | 0.992 | 0.983 | 0.708 M
Table 2. Comparison of PSNR, SSIM, and PSI performance with state-of-the-art dehazing methods on the NH-HAZE dataset.

Method | Reference | PSNR (dB) | SSIM | PSI | Param
DCP [5] | TPAMI 11 | 12.30 | 0.448 | 0.288 | -
DehazeNet [19] | TIP 16 | 11.76 | 0.399 | 0.294 | 0.009 M
AOD-Net [6] | ICCV 17 | 16.10 | 0.536 | 0.396 | 0.002 M
GFN [22] | CVPR 18 | 17.17 | 0.597 | 0.464 | 0.499 M
GDN [24] | ICCV 19 | 18.14 | 0.641 | 0.501 | 0.956 M
FFA-Net [25] | AAAI 20 | 19.87 | 0.692 | - | 4.456 M
KDDN [21] | CVPR 20 | 17.39 | 0.590 | - | 2.4 M
AECR-Net [2] | CVPR 21 | 19.88 | 0.717 | 0.574 | 2.61 M
Dehamer [33] | CVPR 22 | 20.66 | 0.684 | 0.578 | 29.44 M
Dehazeformer-S [34] | TIP 23 | 20.47 | 0.713 | - | 1.28 M
SCA-Net [32] | CVPRW 23 | 19.52 | 0.649 | 0.569 | 2.39 M
SDBAD-Net [31] | TCSVT 24 | 19.89 | 0.743 | - | 2.23 M
(Ours) LDPC-Net | - | 20.23 | 0.665 | 0.572 | 0.708 M
Table 3. Comparison of PSNR, SSIM, and PSI performance with state-of-the-art dehazing methods on the DENSE-HAZE dataset.

Method | Reference | PSNR (dB) | SSIM | PSI | Param
DCP [5] | TPAMI 11 | 9.26 | 0.447 | 0.218 | -
DehazeNet [19] | TIP 16 | 9.48 | 0.438 | 0.220 | 0.009 M
AOD-Net [6] | ICCV 17 | 13.77 | 0.468 | 0.251 | 0.002 M
GFN [22] | CVPR 18 | 17.47 | 0.461 | 0.393 | 0.499 M
GDN [24] | ICCV 19 | 15.23 | 0.510 | 0.501 | 0.956 M
FFA-Net [25] | AAAI 20 | 14.39 | 0.452 | - | 4.456 M
KDDN [21] | CVPR 20 | 14.28 | 0.486 | - | 2.4 M
AECR-Net [2] | CVPR 21 | 15.80 | 0.466 | 0.436 | 2.61 M
Dehamer [33] | CVPR 22 | 16.62 | 0.560 | 0.401 | 29.44 M
Dehazeformer-S [34] | TIP 23 | 16.29 | 0.510 | - | 1.28 M
(Ours) LDPC-Net | - | 16.83 | 0.581 | 0.431 | 0.708 M
Table 4. Quantitative comparison of LDPC-Net with different feature extraction blocks on the SOTS dataset.

Models | PSNR (dB) | SSIM | Param
LDPC-Net-RB | 36.87 | 0.9897 | 1.01 M
LDPC-Net-RDB | 37.59 | 0.9913 | 1.03 M
LDPC-Net | 38.57 | 0.9923 | 0.708 M
Table 5. Quantitative comparison of LDPC-Net and its variants on the SOTS dataset.

Models | PSNR (dB) | SSIM | Param
LDPC-Net | 38.57 | 0.9923 | 0.708 M
LDPC-Net (w/o JES) | 37.71 | 0.9911 | 0.701 M
Table 6. Quantitative comparison of LDPC-Net with different numbers of cross-scale information interaction blocks.

Number of blocks | PSNR (dB) | SSIM | Param
2 | 37.21 | 0.9905 | 0.530 M
3 | 38.01 | 0.9914 | 0.619 M
4 | 38.57 | 0.9923 | 0.708 M
5 | 38.64 | 0.9924 | 0.797 M
Table 7. Quantitative comparison of LDPC-Net with different loss functions.

Models | LDPC-Net-L1 | LDPC-Net-L2 | LDPC-Net-L2-C | LDPC-Net
PSNR (dB) | 36.95 | 36.73 | 38.23 | 38.57
SSIM | 0.9862 | 0.9855 | 0.9913 | 0.9923
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
