Article

PerNet: Progressive and Efficient All-in-One Image-Restoration Lightweight Network

1 State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China
2 University of Chinese Academy of Sciences, Beijing 100049, China
3 School of Automation and Electrical Engineering, Shenyang Ligong University, Shenyang 110159, China
* Author to whom correspondence should be addressed.
Electronics 2024, 13(14), 2817; https://doi.org/10.3390/electronics13142817
Submission received: 7 June 2024 / Revised: 13 July 2024 / Accepted: 16 July 2024 / Published: 17 July 2024
(This article belongs to the Special Issue Deep Learning-Based Image Restoration and Object Identification)

Abstract

Existing image-restoration methods are effective only for specific degradation tasks, yet the type of degradation encountered in practical applications is unknown, and a mismatch between the model and the actual degradation leads to a decline in performance. Attention mechanisms play an important role in image-restoration tasks; however, existing attention mechanisms struggle to effectively exploit the continuous correlation information of image noise. To address these problems, we propose a Progressive and Efficient All-in-one Image-Restoration Lightweight Network (PerNet). The network is built around a Plug-and-Play Efficient Local Attention Module (PPELAM), which is composed of multiple Efficient Local Attention Units (ELAUs). PPELAM effectively exploits the global information and the horizontal and vertical spatial correlations of image degradation features, reducing information loss while keeping the parameter count small. PerNet learns the degradation properties of images well, which allows it to reach an advanced level in image-restoration tasks. Experiments show that PerNet delivers excellent results on typical restoration tasks (image deraining, image dehazing, image desnowing and underwater image enhancement), and the strong performance of ELAU combined with a Transformer in the ablation experiments further demonstrates the efficiency of ELAU.

1. Introduction

The purpose of image-restoration is to enhance image quality and recover image information, thereby improving visual perception and recognition capabilities and providing better image-processing and analysis results for various application scenarios. Images captured by cameras are often affected by noise such as snowflakes, rain streaks, fog and underwater blur. Such noise can significantly reduce the clarity and quality of the images, thereby affecting the accuracy of downstream tasks such as object detection [1], image classification [2], stereo matching [3] and image recognition [4]. This useless noise not only degrades the image quality but also misleads deep learning models [5,6,7,8], leading to incorrect decisions and analysis.
To address the limitations imposed by image noise on downstream tasks, early researchers employed various traditional methods to remove noise from images. These methods include statistical models [9], total variation [10], least squares [11], nonlinear methods [12] and dictionary-based approaches [13]. Although these traditional methods can achieve image-restoration to a certain extent, they still have drawbacks such as insufficient modeling capability for complex image structures, sensitivity to noise and missing data, high computational complexity and low generalization ability.
In contrast, deep learning methods in image-restoration tasks typically capture high-level features and complex structures better, offering stronger generalization capability and adaptability. In recent years, with the rapid development of deep learning, image-restoration technologies [14] have also made significant progress. However, these methods are limited to handling specific types of image degradation, making it difficult to achieve ideal restoration effects for various degradation types.
Therefore, it is highly necessary to develop a universal network capable of handling multiple types of image degradation. Different types of image degradation possess unique characteristics, and the mapping relationship between these characteristics and the model is often difficult to match. Since most degradations exhibit continuous spatial correlations both horizontally and vertically, the spatial information of these various degradation features is challenging to utilize effectively, making it difficult for the network to fully optimize. Consequently, unified image-restoration technology for different degradation features faces numerous challenges.
Figure 1 illustrates the horizontal and vertical correlation features of rain streaks. This paper focuses on the learning process of rain streak degradation features. The restoration result of our network, obtained by subtracting the rain streak features in Figure 1b from the image features in Figure 1a, is shown in Figure 1d. It is evident from Figure 1b that the spatial information of rain streaks is coherent, presenting a regular continuity and correlation between pixel points rather than a scattered distribution. We found that this correlation is prevalent across various degradation types. If the network can fully exploit this characteristic of degradation, it can restore the image better. Because our network considers the continuous correlation of degradation features in both the horizontal and vertical directions, it can also predict other types of degradation present in the real image. Therefore, compared with the real image in Figure 1c, the restoration result in Figure 1d offers a better visual experience. Specifically, although Figure 1c is the real clear image, it is not as visually satisfying as our restoration result in Figure 1d. In addition to removing rain streaks and coming close to the clear real image in Figure 1c, our restoration result in Figure 1d also eliminates the blurry degradation present in the real clear image in Figure 1c. It is worth mentioning that in the snowflake example (e–h), our method also predicts other degradation features in the snowflake-degraded image. Specifically, the enlarged image block e is a small portion of the label image (e) containing the snowflake degradation; similarly, the enlarged image block f is a small portion of the learned degradation image (f), the enlarged image block g is a small portion of the clear label image (g) and the enlarged image block h is a small portion of the restored image (h) of image (e). Black degradation is still present in small block g of the clear label image but is not preserved in small block h, suggesting that our network can also eliminate other types of degradation; moreover, the color and color saturation of small block h are closer to the original label image (e).
According to the analysis above, we propose a progressive and efficient all-in-one image-restoration lightweight network, PerNet, with the following main contributions:
  • We design a simple and efficient adaptive progressive network architecture, which has excellent progressive stability and can be easily plugged with any module.
  • We devise a PPELAM, composed of multiple ELAUs, which can fully exploit the continuous spatial correlation of degradation in both horizontal and vertical directions, thereby achieving a high match with different types of degradation.
  • Our method shows excellent recovery effects on seven types of image degradation datasets, and our model achieves good lightweight effects.

2. Related Works

Image-restoration refers to the process of recovering the original or near-original image from a damaged or degraded image. In cases of image damage or degradation, various factors such as noise, blur, distortion and degradation may affect the image. The goal of image-restoration is to use algorithms or techniques to minimize or eliminate these adverse factors as much as possible, thereby restoring the quality of the image in terms of clarity, contrast, details, color and other aspects.

2.1. Image-Restoration

Haris et al. [15] introduced Deep Back-Projection Networks for efficient image super-resolution, which constructs multi-level back-projection structures and adopts an end-to-end training strategy to effectively enhance resolution while preserving image details. Zhang et al. [16] presented a Residual Dense Network (RDN), an efficient structure containing dense connections and a residual learning mechanism. RDN enhances feature reuse through dense connections and alleviates training difficulties through residual learning, achieving efficient image super-resolution.
These methods have achieved satisfactory results in image-restoration, but they share a common problem: a large number of parameters. To address this problem, a series of lightweight image-restoration methods have been developed.

2.2. Lightweight Image-Restoration

Lightweight design has become a mature approach that significantly reduces model parameters and computational complexity, enabling models to be deployed on small servers. This approach has been widely applied across various domains of deep learning. Fu et al. [17] proposed a network with fewer parameters, shallow depth and a simple structure by combining convolutional neural networks with classic pyramid techniques. The network integrates multiscale techniques, recursion and residual learning while removing batch normalization. Hu et al. [18] utilized dilated convolutions and the Convolutional Block Attention Module (CBAM) to construct a backbone network, which enlarges the receptive field and gradually extracts spatial information from local to global scales. A lightweight CBAM is selected to guide rain streak removal in both the channel and spatial dimensions. The backbone network consists of five blocks and two convolutional operations, each block comprising a CBAM module and a dilated convolution module for rain streak extraction and removal.
Lightweight operations can indeed significantly reduce the number of model parameters, but they generally pay insufficient attention to local information. Attention-based image-restoration methods can focus closely on local, specific information, allowing the model to learn it well.

2.3. Attention-Based Image-Restoration

Mou et al. [19] proposed a novel network architecture that utilizes windowed attention to mimic the selective focusing mechanism of the human eye. By dynamically adjusting the receptive field, it effectively integrates information from various sources (long sequences, local and global regions, feature dimensions and positional dimensions). The LongIR attention mechanism achieves a balance between efficiency and performance in long-sequence image-restoration to address restoration challenges. Sen et al. [20] introduced two sub-networks with integrated loss functions for the first time. Channel attention combines Squeeze-and-Excitation (SE) operations with residual blocks to fully leverage spatial contextual information for rain streak learning. The Generative Image Inpainting with Contextual Attention model proposed by Yu et al. [21] uses the self-attention mechanism to generate more natural restoration results by learning the contextual relationships of images. On this basis, Wang et al. [22] introduced a multi-scale and cross-attention mechanism to further improve the quality and detail fidelity of image-restoration.
Attention-based image-restoration addresses the insufficient attention to image-specific information in common methods, but these methods still do not achieve good results across different types of image degradation. To solve this problem, a series of all-in-one image-restoration methods have been proposed.

2.4. All-in-One Image-Restoration

All-in-one image-restoration refers to integrating multiple image-restoration tasks into a unified framework, handling different types of image degradation problems through a single model or network. This approach aims to address the limitations of existing methods, which are often tailored to specific types of image degradation, thus enhancing the model’s generalization ability and applicability. All-in-one image-restoration methods typically combine various techniques and models to address various issues in images such as noise, blur, distortion, etc., thereby achieving more comprehensive and accurate image-restoration results. Siddiqua et al. [23] utilized learning-based hints to enable a single model to effectively handle multiple image degradation tasks. They designed multiple modules to aggregate multiscale features and adaptively restore various types of degradation efficiently. Mei et al. [24] proposed a reference-based task-adaptive degradation modeling method. By introducing additional external reference images, they achieved adaptive construction of different degradation matrices, enhancing modeling accuracy. They also designed a degradation prior emission mechanism to further bridge the semantic gap between target images and reference images. Chen et al. [25] introduced Neural Degradation Representation to represent the latent features and statistical characteristics of various degradations. Zheng et al. [21] proposed a joint framework capable of simultaneous image denoising and restoration, enabling multi-task learning through a shared encoder-decoder structure. Liu et al. [26] further proposed a unified model that enables knowledge sharing between different tasks, improving the performance of individual tasks.
Our method combines the advantages of the above approaches in PerNet, which focuses attention on local, specific degradation information, learns different types of degradation and at the same time remains lightweight.

3. Method

We will begin by presenting our comprehensive network architecture in Section 3.1, followed by a detailed exploration of our PPELAM in Section 3.2 and finally our loss function and evaluation metrics in Section 3.3.

3.1. Overall Network Architecture

The overall network comprises a progressive structure and the PPELAM, as depicted in Figure 2. To facilitate computation, the input image is pre-processed into multiple small patches. To thoroughly investigate the impact of PPELAM on the restoration task, the internal structure of the network uses only one convolutional layer for coarse feature extraction. Subsequently, the features pass through the PPELAM module consisting of multiple ELAUs, followed by another convolutional operation. To prevent network degradation, the features after convolution are combined with the coarsely extracted features to form a residual structure. These combined features then pass through another convolutional layer, which integrates and compresses the spatial information of the earlier hierarchical feature maps into a more global and comprehensive representation suited to the outputs of different degradation tasks. Finally, the output is element-wise subtracted from the network input, which guides training to emphasize learning the degradation rather than other features.
The network is concise and efficient, and its stable parameter settings together with its progressive nature enhance its stability. The plug-and-play design increases the flexibility of the architecture, allowing experimentation with various network models. The PPELAM module, being plug-and-play in our architecture, fully considers the continuous correlation of degradation in both the spatial horizontal and vertical directions. Our PerNet can be represented mathematically as follows:
$F_{\mathrm{out}} = X - F_{\mathrm{de}},$
where $F_{\mathrm{out}}$ represents our final restoration result, $X$ denotes the input image feature containing degradation characteristics and $F_{\mathrm{de}}$ represents the learned and predicted degradation features of the input image. The mathematical expression for $F_{\mathrm{de}}$ is as follows:
$F_{\mathrm{de}} = \mathrm{conv2d}\Big(\mathrm{PReLU}\big(\mathrm{conv2d}(x)\big) + \mathrm{BN}\Big(\mathrm{conv2d}\big(\mathrm{PPELAM}\big(\mathrm{PReLU}(\mathrm{conv2d}(x))\big)\big)\Big)\Big),$
where conv2d represents a 2D convolution operation and PReLU denotes an activation function that, unlike a traditional ReLU, allows a small slope for negative inputs instead of zeroing them out. BatchNorm2d (BN) refers to batch normalization, which enhances the model's generalization capability by normalizing the mean and variance of each input feature, thereby accelerating convergence. PPELAM represents our plug-and-play efficient local attention module. The mathematical expressions for PReLU and BN are as follows:
$\mathrm{PReLU}(x) = \max(0, x) + \alpha \min(0, x),$
$\mathrm{BN}(x) = \gamma \frac{x - \mu}{\sqrt{\sigma^2 + \varepsilon}} + \beta,$
where $x$ represents the input feature and $\alpha$ is a learnable parameter that gives negative inputs a small slope. Usually, $\alpha$ is initialized to a small positive value of 0.25. When the input is negative, the output of PReLU is a small negative value instead of exactly 0, which helps improve the model's expressive power and generalization ability. $\mu$ represents the mean of the input feature, while $\sigma^2$ represents its variance. $\gamma$ is a learnable scale factor with an initial value of 1, $\beta$ is a learnable offset with an initial value of 0 and $\varepsilon$ is a very small constant used to prevent division by zero, initialized to 0.00001. It is worth noting that $\gamma$, $\beta$ and $\varepsilon$ are handled by the underlying PyTorch code: $\gamma$ and $\beta$ are updated automatically as the network is trained, while $\varepsilon$ remains fixed, so we simply use them. The mathematical expressions for the mean $\mu$ and variance $\sigma^2$ of the input feature are as follows:
$\mu = \frac{1}{N}\sum_{i=1}^{N} x_i,$
$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2,$
where $N$ represents the number of input features and $x_i$ denotes the $i$-th value of the input feature.
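To make the progressive residual pipeline above concrete, the following is a minimal PyTorch sketch of the outer PerNet structure implied by the two equations for F_out and F_de. The channel width, kernel sizes and the way PPELAM is passed in are our illustrative assumptions rather than the authors' exact configuration; any nn.Module (for example, the PPELAM sketched in Section 3.2, or nn.Identity for a quick smoke test) can be supplied.

```python
import torch
import torch.nn as nn

class PerNetSketch(nn.Module):
    """Minimal sketch of the outer PerNet pipeline (hyperparameters are assumptions)."""

    def __init__(self, ppelam: nn.Module, channels: int = 64):
        super().__init__()
        # Coarse feature extraction: PReLU(conv2d(x)).
        self.head = nn.Sequential(nn.Conv2d(3, channels, 3, padding=1), nn.PReLU())
        # Plug-and-play attention module built from stacked ELAUs (Section 3.2).
        self.ppelam = ppelam
        # Convolution followed by BatchNorm, applied to the attention output.
        self.mid = nn.Sequential(nn.Conv2d(channels, channels, 3, padding=1),
                                 nn.BatchNorm2d(channels))
        # Final convolution mapping features back to the predicted degradation F_de.
        self.tail = nn.Conv2d(channels, 3, 3, padding=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feat = self.head(x)                                   # PReLU(conv2d(x))
        f_de = self.tail(feat + self.mid(self.ppelam(feat)))  # predicted degradation
        return x - f_de                                       # F_out = X - F_de

# Smoke test with an identity module standing in for PPELAM.
out = PerNetSketch(nn.Identity())(torch.randn(1, 3, 64, 64))
```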

3.2. Plug-and-Play Efficient Local Attention Module

Figure 3 illustrates the structure of each identical small ELAU within the PPELAM. To ensure the lightness and efficiency of the model, the module first performs two convolutions for coarse feature extraction. The batch size, channel count, height and width are then read from the tensor size; these four values are not computed sequentially but are obtained only to facilitate the later operations. Here, the feature map is handled by PyTorch, so its shape is (B, C, H, W) rather than the (B, H, W, C) layout of a normal image. The feature tensor is then averaged separately along its height and its width to obtain two averaged tensors. These two averaged tensors are reshaped into three-dimensional form, and a convolution is performed on each using a one-dimensional grouped convolutional layer. One of the two reshaped 3D tensors focuses only on the degradation features along the height of the image (vertically), and the other only on the degradation features along the width of the image (horizontally). Therefore, our network further enhances the model's utilization of the continuous correlation in the horizontal and vertical directions of the image, and at the same time, focusing on specific features also contributes to the lightness and efficiency of the model. To maintain consistency of the feature distribution across channels, we apply group normalization to these two feature maps separately. Next, to focus the model more on the horizontal and vertical directions of the image, we use the Sigmoid activation function to map the features into the range of 0 to 1. The processed features are reshaped into four-dimensional form for element-wise multiplication. Finally, the two feature maps are multiplied element-wise, multiplied with the coarsely extracted input tensor and then added to the tensor before coarse extraction to form a residual structure, preventing network degradation. This gives the output of the ELAU.
The ELAU is a small unit of PerNet designed to fully exploit the continuous correlations of image degradation in the spatial horizontal and vertical directions. The design intention of this unit is to address the locality of degraded features in image-restoration tasks, as degradation often manifests as continuous changes in the horizontal and vertical directions in actual images. By focusing on these adjacent and correlated regions in the image, the unit can more effectively capture subtle changes and structures in the image, thereby providing more accurate and refined processing for image-restoration tasks. This ELAU enables the network to gain a deeper understanding of the degradation in the image, thus better guiding the restoration process and improving the accuracy and effectiveness of image-restoration. Within the entire image-restoration network, ELAUs are used multiple times, adding reliability and stability to the final image-recovery results.
The ELAU has four sets of parameters: T, B, S and L. Here, we employ L, with specific distinctions explained in the ablation experiments. The mathematical expression for the PPELAM can be represented as follows:
$F_n = f(f(f(f(x_0)))),$
$f(x_0) = h(x_0) \times w(x_0) \times \mathrm{conv2d}\big(\mathrm{conv2d}(x_0)\big) + x_0,$
$h(x_0) = \mathrm{view}_{11}\Big(\mathrm{Sigmoid}\Big(\mathrm{GN}\Big(\mathrm{conv1d}\big(\mathrm{view}_1\big(\mathrm{mean}_1(\mathrm{conv2d}(\mathrm{conv2d}(x_0)))\big)\big)\Big)\Big)\Big),$
$w(x_0) = \mathrm{view}_{22}\Big(\mathrm{Sigmoid}\Big(\mathrm{GN}\Big(\mathrm{conv1d}\big(\mathrm{view}_2\big(\mathrm{mean}_2(\mathrm{conv2d}(\mathrm{conv2d}(x_0)))\big)\big)\Big)\Big)\Big),$
where $F_n$ represents the final output of the PPELAM, $x_0$ denotes the input features, $f(x_0)$ is the output of one pass of the ELAU applied to the input $x_0$, conv2d denotes two-dimensional convolution, $h(x_0)$ and $w(x_0)$ represent features learned in the spatial vertical and horizontal directions, $\mathrm{mean}_1$ and $\mathrm{mean}_2$ compute mean values along the width and height, $\mathrm{view}_1$ and $\mathrm{view}_2$ reshape the features into different shapes, conv1d denotes one-dimensional grouped convolution, GN represents group normalization, Sigmoid denotes the activation function and $\mathrm{view}_{11}$ and $\mathrm{view}_{22}$ reshape the features back along different dimensions, with their computation methods being the same. For the function mean, its mathematical expression is as follows:
$\mathrm{mean}(y) = \frac{1}{n}\sum_{i=1}^{n} y_i,$
where $y$ is the input tensor, $n$ is the total number of elements in the tensor and $y_i$ is the $i$-th element of the tensor. For the functions $\mathrm{view}_1$ and $\mathrm{view}_2$, their mathematical expressions are as follows:
$\mathrm{view}_1 = x.\mathrm{view}(B, C \times W, H),$
$\mathrm{view}_2 = x.\mathrm{view}(B, C \times H, W),$
where $B$ represents the batch size, $C$ the number of channels, $H$ the height, $W$ the width and $x$ the original tensor. $\mathrm{view}_1$ and $\mathrm{view}_2$ are the reshaped tensors, with shapes $(B, C \times W, H)$ and $(B, C \times H, W)$, respectively.
For the GN function, given an input tensor $x$ with shape $(N, C, H, W)$, where $N$ is the batch size, $C$ is the number of channels and $H$ and $W$ are the height and width of the image, respectively, the $C$ channels are divided into $G$ groups, with each group containing $C/G$ channels. Let $x_{n,c,h,w}$ denote the value of the input tensor $x$ for the $n$-th sample, $c$-th channel, $h$-th row and $w$-th column. The output $y_{n,c,h,w}$ of the GN function is computed as follows.
$y_{n,c,h,w} = \gamma_c \, \hat{x}_{n,c,h,w} + \beta_c,$
where $\gamma_c$ and $\beta_c$ are learnable scale and shift parameters of shape $(1, C, 1, 1)$; $\gamma_c$ scales the normalized value and $\beta_c$ shifts it, allowing the network to learn appropriate feature representations. $\hat{x}_{n,c,h,w}$ represents the result of normalizing each pixel value $(h, w)$ using the mean $\mu_c$ and variance $\sigma_c^2$ computed for each channel $c$. The calculation formula for $\hat{x}_{n,c,h,w}$ is as follows:
$\hat{x}_{n,c,h,w} = \frac{x_{n,c,h,w} - \mu_c}{\sqrt{\sigma_c^2 + \varepsilon}},$
where $x_{n,c,h,w}$ represents the pixel value in the $c$-th channel, $h$-th row and $w$-th column of the $n$-th sample in the input tensor $x$, and $\varepsilon$ is a small constant used to prevent division by zero. $\sigma_c^2$ and $\mu_c$ respectively denote the variance and mean calculated for each channel $c$ and each sample $n$, as follows:
$\sigma_c^2 = \frac{1}{HW}\sum_{h=1}^{H}\sum_{w=1}^{W} (x_{n,c,h,w} - \mu_c)^2,$
$\mu_c = \frac{1}{HW}\sum_{h=1}^{H}\sum_{w=1}^{W} x_{n,c,h,w},$
where $H$ and $W$ represent the height and width of the image, respectively, and $x_{n,c,h,w}$ is the pixel value at the $h$-th row and $w$-th column of the $c$-th channel of the $n$-th sample in the input tensor $x$.
In the feature extraction stage, we use regular convolution. In the ELAU, one-dimensional grouped convolution is employed to extract features from the one-dimensional feature tensors in the spatial horizontal and vertical directions. The concepts of regular convolution and grouped convolution are illustrated in Figure 4, with a convolutional stride of 1 and no padding applied to the edges. Figure 4a depicts the detailed process of regular convolution, while Figure 4b gives a brief overview of regular convolution. Figure 4c demonstrates the brief process of grouped two-dimensional convolution, and Figure 4d presents the brief process of one-dimensional grouped convolution. In Figure 4a, we perform feature extraction using 6, 64 and 64 convolutional kernels of size (3 × 3) in three example convolutional layers, producing 6, 64 and 64 feature tensors, respectively. Figure 4b shows the input image before and after the extraction of 6 and 64 feature tensors using 6 (3 × 3) and 64 (3 × 3) convolutional kernels, respectively. In Figure 4c,d, we group the features obtained from one convolutional layer into pairs and perform convolution individually within each group. The difference in Figure 4d is that the grouped feature is not a two-dimensional tensor but a one-dimensional vector reshaped from a two-dimensional tensor. At the core of the ELAU, our network uses a one-dimensional grouped convolution as in Figure 4d; because its input is the output of multiple ordinary convolutions such as those in Figure 4a,b, this input is a block containing multiple feature maps.
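As a concrete illustration of the steps just described, the following PyTorch sketch reconstructs one ELAU and stacks identical units to form the PPELAM. The kernel size and group count of the 1D grouped convolution, the number of groups in the group normalization and the stacking depth are our assumptions for illustration rather than values taken from the paper; the unit follows the pooling, reshaping and directional-attention flow of Figure 3 and the equations above.

```python
import torch
import torch.nn as nn

class ELAU(nn.Module):
    """Sketch of one Efficient Local Attention Unit (kernel/group settings assumed)."""

    def __init__(self, channels: int = 64, kernel_size: int = 7, groups: int = 16):
        super().__init__()
        # Two plain convolutions for coarse feature extraction.
        self.coarse = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        # One-dimensional grouped convolution applied along the height and width axes.
        self.conv1d = nn.Conv1d(channels, channels, kernel_size,
                                padding=kernel_size // 2, groups=groups)
        self.gn = nn.GroupNorm(groups, channels)
        self.act = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.size()
        feat = self.coarse(x)
        # Average along the width -> (B, C, H); average along the height -> (B, C, W).
        x_h = feat.mean(dim=3)
        x_w = feat.mean(dim=2)
        # Grouped 1D convolution + GroupNorm + Sigmoid in each direction,
        # then reshape to 4D so the two maps can be broadcast-multiplied.
        a_h = self.act(self.gn(self.conv1d(x_h))).view(b, c, h, 1)
        a_w = self.act(self.gn(self.conv1d(x_w))).view(b, c, 1, w)
        # Directional attention applied to the coarse features, plus the residual input.
        return a_h * a_w * feat + x

class PPELAM(nn.Sequential):
    """PPELAM as a stack of identical ELAUs, i.e. F_n = f(f(...f(x0)))."""

    def __init__(self, channels: int = 64, num_units: int = 16):
        super().__init__(*[ELAU(channels) for _ in range(num_units)])

# Example: plug into the outer network sketch from Section 3.1.
# net = PerNetSketch(PPELAM(channels=64, num_units=16))
```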

3.3. Loss Function and Evaluation Metrics

To encourage minimal pixel-wise differences between predicted results and ground truth, conducive to producing clearer and more faithful images, we utilize the L 1 loss function to train the network. This loss function also aids in preserving edge information in the images, which is beneficial for our image-restoration task. The mathematical expression is as follows:
$L = \frac{1}{N}\sum_{i=1}^{N} \left| I_i - GT_i \right|,$
where $I_i$ represents the predicted value of the $i$-th image during the training phase of our model, and $GT_i$ represents the ground-truth value of the $i$-th image.
During the testing phase, we evaluate the performance of our experimental results using Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index (SSIM). PSNR is a metric used to measure the quality of images or videos and is commonly employed to assess the performance of compression algorithms, image-restoration algorithms and image-processing algorithms. PSNR quantifies the level of distortion in an image, which reflects the similarity between the original image and the processed image. Generally, higher PSNR values indicate better image quality. PSNR is measured in decibels (dB), and values between 30 dB and 50 dB are considered good image quality, while values exceeding 50 dB are indicative of lossless image quality. SSIM is a metric used to measure the similarity between two images, with values ranging from −1 to 1. A value closer to 1 indicates greater similarity between the two images. SSIM considers three aspects of the images: luminance, contrast and structure, thereby better reflecting the characteristics of image quality perceived by the human eye. The mathematical expression for PSNR is as follows:
$\mathrm{PSNR} = 10 \times \log_{10}\!\left(\frac{r^2}{\mathrm{MSE}}\right),$
where $r$ represents the range of the image data, and in our case $r$ is 255. MSE stands for Mean Squared Error, whose formula is as follows:
$\mathrm{MSE} = \frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N}\big(I_{\mathrm{true}}(i, j) - I_{\mathrm{test}}(i, j)\big)^2,$
where $M$ and $N$ respectively denote the height and width of the image, and $I_{\mathrm{true}}(i, j)$ and $I_{\mathrm{test}}(i, j)$ represent the grayscale values at row $i$ and column $j$ of the original and test images. For SSIM, our mathematical expression is as follows:
$\mathrm{SSIM}(x, y) = \frac{(2\mu_x \mu_y + C_1)(2\sigma_{xy} + C_2)}{(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)},$
where $x$ and $y$ represent the restored image and the ground-truth image, respectively, $\mu_x$ and $\mu_y$ are the luminance means of images $x$ and $y$, $\sigma_x^2$ and $\sigma_y^2$ are the luminance variances of images $x$ and $y$, $\sigma_{xy}$ is the luminance covariance between images $x$ and $y$, and $C_1$ and $C_2$ are constants used to prevent division by zero. The calculations for $\mu_x$ and $\mu_y$ are the same. The formula for $\mu_x$ is as follows:
$\mu_x(i, j) = \frac{1}{N}\sum_{k=-r}^{r}\sum_{l=-r}^{r} x(i+k, j+l),$
where $\mu_x(i, j)$ is the pixel value at $(i, j)$ in the output image, $x(i+k, j+l)$ is the pixel value at position $(i+k, j+l)$ in the input image, $N$ is the number of pixels within the window and $r$ is the radius of the window. The calculations for $\sigma_x^2$ and $\sigma_y^2$ are the same. The formula for $\sigma_x^2$ is as follows:
$\sigma_x^2(i, j) = \mu_{x^2}(i, j) - \mu_x^2(i, j).$
The formula for the covariance $\sigma_{xy}$ is as follows:
$\sigma_{xy}(i, j) = \mu_{xy}(i, j) - \mu_x(i, j)\,\mu_y(i, j).$
The calculation formulas for the constants $C_1$ and $C_2$ are the same, as follows:
$C_1 = (K_1 \times R)^2,$
$C_2 = (K_2 \times R)^2.$
$R$ represents the data range, which is 255 in this case, while the constants $K_1$ and $K_2$ are used for stability and are typically set to 0.01 and 0.03, respectively.
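As a practical reference, the metrics above can be computed with common Python libraries instead of re-implementing the windowed statistics by hand. The sketch below assumes 8-bit RGB images with a data range of 255 and uses scikit-image's implementations as a cross-check; this is our convenience choice, not necessarily the evaluation code used by the authors.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate_pair(restored: np.ndarray, ground_truth: np.ndarray):
    """Compute PSNR/SSIM for one 8-bit RGB image pair of shape (H, W, 3)."""
    restored = restored.astype(np.float64)
    ground_truth = ground_truth.astype(np.float64)

    # PSNR = 10 * log10(r^2 / MSE) with r = 255 for 8-bit images.
    mse = np.mean((ground_truth - restored) ** 2)
    psnr = 10.0 * np.log10((255.0 ** 2) / mse)

    # Library implementations for comparison; channel_axis=-1 treats the last
    # axis as the color channel (recent scikit-image versions).
    psnr_lib = peak_signal_noise_ratio(ground_truth, restored, data_range=255)
    ssim = structural_similarity(ground_truth, restored, data_range=255, channel_axis=-1)
    return psnr, psnr_lib, ssim
```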

4. Experiments and Analysis

In Section 4.1, we will present the preparation for our experiments. Subsequently, in Section 4.2, subjective evaluations of the model will be conducted for tasks such as rain removal, snow removal, fog removal and underwater enhancement. Objective evaluations for these tasks will be performed in Section 4.3. Lightweighting experiments will be showcased in Section 4.4. Additionally, the results of our ablation experiments will be presented in Section 4.5.

4.1. Experimental Setup

In Section 4.1.1, we will introduce our datasets, while our execution details will be presented in Section 4.1.2.

4.1.1. Datasets

We used seven public datasets, which are divided into training sets and test sets, as shown in Table 1.
For the Rain200H dataset, the first 1800 of the total 2000 image pairs (degraded and clean) were used as the training dataset, and the remaining 200 pairs were used as the test dataset. We split the Rain200L dataset in the same way as the Rain200H dataset. For the Rain800 dataset, we used the first 700 of the total 800 image pairs (degraded and clean) as the training dataset and the remaining 100 pairs as the test dataset. For the Snow100K dataset, we used 50,000 of the total 100,000 image pairs (degraded and clean) as the training dataset and the remaining 50,000 pairs as the test dataset, which were further divided into 16,801, 16,588 and 16,611 pairs according to the "L", "M" and "S" splits, respectively. For the CSD dataset, we used 1000 image pairs (degraded and clean) as the test dataset and the remaining 7000 pairs as the training dataset. For the RSID dataset, we used 900 of the total 1000 image pairs (degraded and clean) as the training dataset and the remaining 100 pairs as the test dataset. For the EUVP dataset, we used 11,435 paired images (degraded and clean) as the training dataset and the remaining 515 pairs as the test dataset.

4.1.2. Execution Details

Training Setup and Optimization. During the training phase, we utilized the Adam optimizer to update, adjust and optimize the model parameters. The Adam optimizer, known for its combination of momentum and adaptive learning rate properties, effectively adjusts the learning rate for each parameter, leading to faster convergence to the optimal solution.
Hardware Configuration. Our training was conducted using an NVIDIA 3060 GPU and an Intel i5-12490f CPU. Additionally, the memory and SSD used were sufficient to support the GPU training process.
Training Strategy. In order to speed up the training and improve the generalization ability of the model, we divide the input image into multiple small chunks, and then set the step size. Multiple blocks are processed in each batch, which enhances the diversity of input images, thereby improving the generalization ability of the model. The training process consists of 150 epochs with an initial learning rate set at 0.001, an empirically chosen value that helps to quickly converge to a relatively good solution. At the 80th epoch we reduced the learning rate to 0.0001.
Learning Rate Scheduling. In addition to using a fixed learning rate, we employed a learning rate scheduler to dynamically adjust the learning rate, further enhancing the model's performance and convergence speed. We utilized the MultiStepLR scheduler, an epoch-based learning rate decay strategy that uses milestone epochs as adjustment points. Specifically, at the 80th epoch we reduced the learning rate to 0.1 times its previous value, i.e., from the initial learning rate of 0.001 to 0.0001. This step-decay strategy keeps the learning rate appropriate throughout training, gradually decreasing it to stabilize the model's convergence to the optimal solution. By implementing the MultiStepLR scheduler, we managed the learning rate changes effectively, preventing instability from a high learning rate and slow training from a low learning rate. This approach significantly improved training efficiency and performance, resulting in superior evaluation outcomes.
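The training configuration described above maps onto a standard PyTorch setup. The loop below is a minimal sketch: it assumes that `model` is an instance of PerNet and that `train_loader` yields (degraded, clean) patch pairs, and it uses Adam's default betas, which the paper does not specify.

```python
import torch
import torch.nn as nn

# Assumed to exist: model (PerNet) and train_loader yielding (degraded, clean) patches.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
# MultiStepLR with a single milestone: lr drops from 1e-3 to 1e-4 at epoch 80.
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[80], gamma=0.1)
l1_loss = nn.L1Loss()  # pixel-wise L1 loss used for training

for epoch in range(150):
    for degraded, clean in train_loader:
        optimizer.zero_grad()
        restored = model(degraded)
        loss = l1_loss(restored, clean)
        loss.backward()
        optimizer.step()
    scheduler.step()  # decay the learning rate according to the milestone schedule
```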

4.2. Subjective Evaluation

In Section 4.2.1, we will analyze the subjective evaluation of the task of deraining. Section 4.2.2 will be dedicated to the subjective evaluation of the task of desnowing. The subjective evaluation of the task of dehazing will be covered in Section 4.2.3. Lastly, the subjective analysis of the underwater image enhancement task will be presented in Section 4.2.4.

4.2.1. Deraining Task

Datasets. For the deraining task, we utilized three public datasets: Rain200L [27], Rain200H [27] and Rain800 [28].
Comparison Models. For the deraining task, we compared the PerNet model against several state-of-the-art methods. The comparative methods on the Rain200L, Rain200H and Rain800 datasets include DSC [33], DiG-CoM [34], DerainCycleGAN [35], SPD-Net [36], NLCL [37], Syn2Real [38], SIRR [39], JRGB [40], DDN [41], Air-Net [42], DID-MDN [43], RESCAN [44], RainDiffusion [45], PreNet [46] and MSPFN [47].
Results. The test results on Rain200H are shown in Figure 5, Rain200L results in Figure 6 and Rain800 results in Figure 7. We randomly selected some recovery results, which clearly show that our algorithm exhibits excellent restoration performance across all three datasets. Particularly on the Rain200H dataset, PerNet demonstrates superior image-restoration results, maintaining color saturation close to the original image and achieving high-quality recovery. In contrast, DSC performs poorly on Rain200H, while DiG-CoM and DerainCycleGAN also show subpar results. DiG-CoM tends to produce darker images overall, while DerainCycleGAN results in brighter images overall. On the Rain200L and Rain800 datasets, recent advanced algorithms generally perform well. However, the experimental results on Rain200L indicate that DSC still performs poorly. In DiG-CoM, SPD-Net and NLCL, black streaks appear near the fish’s head region, and DSC introduces colored artifacts on the cloud near the penguin’s head. Our algorithm, on the other hand, achieves visually pleasing results without such artifacts. Nevertheless, all algorithms struggle to completely remove the rain streaks near the fish’s head. The Rain800 dataset shows that earlier algorithms such as DSC, DiG-CoM, DerainCycleGAN and SPD-Net fail to effectively remove rain streaks from the sky and lawn areas. Notably, in the park image, the original rain-free image’s upper right corner features white misty blobs, which are removed by both our algorithm and MSPFN’s algorithm. While this removal improves visual quality, it does not benefit the PSNR values. The RESCAN algorithm on Rain800 introduces scattered misty blobs in the sky of the park image and white streaks in the air of the city image. In the climbing image, all algorithms retain large white streaks in the annotated region, even though the rain-free label image does not contain such large streaks. DSC and DiG-CoM even retain two such large streaks.

4.2.2. Desnowing Task

Datasets. For the desnowing task, we conducted experiments using two public datasets: Snow100K-L [29] and CSD [30].
Comparison Models. For the desnowing task, we compared the PerNet model against a range of state-of-the-art methods. The comparative methods on the Snow100K and CSD datasets include CycleGAN [48], RESCAN [44], DesnowNet [29], ALL in one [49], JSTASR [50], HDCW-Net [30], DDMSNet [51], MPRNet [52], TransWeather [53], SMGARN [54], TKL [55], WeatherDiff128 [56], MSP-Former [57], Uformer [58], WeatherDiff64 [56], Restormer [59], SnowDiff128 [56], NAFNet [60], DGUNet [61], SnowDiff64 [56] and GridFormer-S [62].
Results. In this phase, we performed desnowing experiments on the Snow100K-L and CSD datasets. The experimental results on the Snow100K dataset are shown in Figure 8. and the results on the CSD dataset are shown in Figure 9. According to the experimental results, on the Snow100K dataset, the methods CycleGAN, RESCAN, DesnowNet, ALL in one, JSTASR and HDCW-Net showed insufficient snow-restoral capabilities, leaving noticeable large areas of snow. In the image of the tree branches, apart from our algorithm, other methods retained noticeable snowflakes on the branches. In the skiing image, all algorithms exhibited a large snow patch on the palm, with our algorithm showing the least severe snow patch. In the ocean image, all methods struggled to distinguish between white waves and white snowflakes, resulting in thin local snow streaks. In the lighthouse image, both SnowDiff64 and our algorithm removed the snowflakes cleanly, leaving only a small patch of snow at the base of the lighthouse. Various methods produced small black spots in the restored results of the lighthouse image. On the CSD dataset, all methods showed less than ideal snow-restoral performance in the sky of the road image, retaining block-like areas resembling localized haze. In the city image of the CSD dataset, Restormer, SnowDiff64, MSP-Former, SMGARN and our algorithm achieved the best snow removal in the sky, while other methods left small patches of light snow. Notably, in the arch image of the CSD dataset, CycleGAN, RESCAN and ALL in one introduced a small black shadow in the upper left corner.

4.2.3. Dehazing Task

Datasets. For the dehazing task, we conducted experiments using the RSID [31] dataset.
Comparative Models. For the defogging task, we compared the PerNet model against a range of state-of-the-art methods. The comparative methods on the RSID dataset include Cycle-SNSPGAN [63], ZID [64], FCTF-Net [65], FFA-Net [66], TCN [67], EVPM [68], IDeRs [69], GRS-HTM [70], SDCP [71], UHD [72], DeHamer [73], Dehaze-cGAN [74], STD [75], Zero-restore [76] and ROP [77].
Results. In this phase, we performed defogging experiments on the RSID dataset. The experimental results on the RSID dataset are shown in Figure 10. The results demonstrate that our PerNet model achieves very good performance in the defogging task. While it may not completely restore the original colors, it reaches a high standard of quality. First Image: Cycle-SNSPGAN introduces a purple color bias, FFA-Net introduces a light blue color bias, ZID introduces a brown color bias and DeHamer, IDeRs and TCN result in an overall whitish appearance. Our method's restored result is the closest to the original clear label image, although the overall color is slightly darker. Second Image: Cycle-SNSPGAN and FFA-Net exhibit severe purple tinting, and their haze-removal effect is not satisfactory. ZID, FCTF-Net, TCN and EVPM perform poorly in haze removal. DeHamer shows an overall cyan color bias.

4.2.4. Underwater Enhancement Task

Datasets. For the underwater enhancement task, we conducted experiments using the EUVP [32] public dataset.
Comparison Models. For the underwater enhancement task, our PerNet model was compared with a range of state-of-the-art methods. The comparative methods on the EUVP dataset include PRWNet [78], ShallowUW [79], UWCNN [80], FUnIE-GAN [32], UT-UIE [81], WaterNet [82], RAUNE-Net [83], CPDM [84], SyreaNet [85], SGUIE-Net [86] and Cycle-GAN [87].
Results. In this phase, we conducted underwater enhancement experiments on the EUVP dataset. The experimental results on the EUVP dataset are shown in Figure 11. The results indicate that our algorithm achieves the best overall enhancement performance. First Image: PRWNet, ShallowUW and FUnIE-GAN all exhibit a red color bias. UWFormer shows good color restoration but is slightly blurry. The results of other methods are similar to ours, with our method producing the best balance of clarity and color accuracy. Second Image: The enhancement results of all methods are relatively ideal, with UWFormer displaying colored water droplets at the corner of the fish’s mouth and the tip of its back, which is not present in other methods. Third Image: PRWNet, ShallowUW and UT-UIE exhibit a slight white tint in their overall color. Other methods, including ours, achieve satisfactory enhancement results. Fourth Image: FUnIE-GAN’s color is the closest to the clear label image but does not achieve the best enhancement result. UT-UIE, RAUNE-Net, SyreaNet, Cycle-GAN and our method all produce very clear results, sometimes appearing even cleaner than the ground truth image. This phenomenon can be attributed to the removal of some pseudo-degradation, which, although beneficial in subjective evaluation, may not be ideal from an objective evaluation perspective.

4.3. Objective Evaluation

In Section 4.3.1, we will present the objective evaluation of the deraining task. In Section 4.3.2, we will discuss the objective evaluation of the desnowing task. In Section 4.3.3, we will delve into the objective evaluation of the dehazing task. Finally, in Section 4.3.4, we will elaborate on the objective evaluation of the underwater enhancement task.

4.3.1. Deraining Task

We conducted a quantitative comparison on the Rain200L [27], Rain200H [27] and Rain800 [28] public datasets, selecting 200 pairs of images from Rain200L and Rain200H each as test samples and 100 pairs of images from the Rain800 dataset as test samples. The compared methods include DSC [33], DiG-CoM [34], DerainCycleGAN [35], SPD-Net [36], NLCL [37], Syn2Real [38], SIRR [39], JRGB [40], DDN [41], Air-Net [42], DID-MDN [43], RESCAN [44], RainDiffusion [45], PreNet [46] and MSPFN [47].
As shown in Table 2, early traditional deraining methods, such as DSC, exhibit relatively low values in deraining tasks. This is mainly due to limitations in utilizing image priors and interpolation methods, which fail to learn detailed rain streaks as effectively as deep learning approaches. These methods also struggle to distinguish between rain streaks and non-rain-streak details, leading to suboptimal deraining results that fall short of the performance achieved by deep deraining methods. From the numerical results, it is evident that our algorithm achieves the highest SSIM values across all three datasets. Additionally, our method achieves the highest PSNR values on the Rain200H and Rain200L datasets, while RainDiffusion achieves the highest PSNR value on the Rain800 dataset. On the Rain200H dataset, PerNet improves the PSNR value by 2.6 percent and the SSIM value by 0.6 percent compared with MSPFN. On the Rain200L dataset, PerNet improves the PSNR value by 0.75 percent and the SSIM value by 1 percent compared with MSPFN. On the Rain800 dataset, PerNet improves the SSIM value by 1.6 percent compared to RainDiffusion.

4.3.2. Desnowing Task

In the snow-restoral task phase, we used three public datasets: Snow100K-S [29], Snow100K-L [29] and CSD [30]. We selected 16,801 pairs of images from Snow100K-L, 16,611 pairs from Snow100K-S and 2000 pairs from the CSD dataset as test samples. The compared methods include CycleGAN [48], RESCAN [44], DesnowNet [29], ALL in one [49], JSTASR [50], HDCW-Net [30], DDMSNet [51], MPRNet [52], TransWeather [53], SMGARN [54], TKL [55], WeatherDiff128 [56], MSP-Former [57], Uformer [58], WeatherDiff64 [56], Restormer [59], SnowDiff128 [56], NAFNet [60], DGUNet [61], SnowDiff64 [56] and GridFormer-S [62].
As shown in Table 3, our PerNet demonstrates outstanding performance on the CSD, Snow100K-S and Snow100K-L datasets. Our method achieves the highest PSNR and SSIM values across all three test sets. It is worth noting that other methods also show their respective strengths. On the Snow100K-S test set, the second-best PSNR and SSIM values are achieved by GridFormer-S and DGUNet, respectively. On the Snow100K-L test set, Uformer and NAFNet exhibit the second-best PSNR and SSIM values, respectively. On the CSD test set, the second-best PSNR and SSIM values are achieved by Restormer and SnowDiff64, respectively. On the Snow100K-S dataset, PerNet improves the PSNR value by 0.82 percent compared to GridFormer-S and improves the SSIM value by 0.3 percent compared to DGUNet. On the Snow100K-L dataset, PerNet improves the PSNR value by 1 percent compared to Uformer and improves the SSIM value by 1.4 percent compared to NAFNet. On the CSD dataset, PerNet improves the PSNR value by 1.2 percent compared to Restormer and improves the SSIM value by 0.3 percent compared to SnowDiff64.

4.3.3. Dehazing Task

For the dehazing task, we utilized the RSID [31] dataset, from which we selected 100 pairs of images denoted as R100. The compared methods include Cycle-SNSPGAN [63], ZID [64], FCTF-Net [65], FFA-Net [66], TCN [67], EVPM [68], IDeRs [69], GRS-HTM [70], SDCP [71], UHD [72], DeHHamer [73], Dehaze-cGAN [74], STD [75], Zero-restore [76] and ROP [77].
As shown in Table 4, our PerNet achieved the highest PSNR and SSIM values on the RSID test set compared to the other methods. Our PerNet exhibited excellent performance in the dehazing task. The method with the second-best PSNR and SSIM values was UHD. On the RSID dataset, PerNet improves the PSNR value by 0.5 percent and the SSIM value by 1.3 percent compared to UHD.

4.3.4. Underwater Enhancement Task

For the underwater enhancement task, our testing dataset comprised 515 pairs of test samples from the EUVP [32] dataset. The comparison methods included PRWNet [78], ShallowUW [79], UWCNN [80], FUnIE-GAN [32], UT-UIE [81], WaterNet [82], RAUNE-Net [83], CPDM [84], SyreaNet [85], SGUIE-Net [86] and Cycle-GAN [87].
As shown in Table 5, our PerNet achieved excellent restoration performance on the EUVP dataset, with the highest SSIM value of 0.913 and the second-best PSNR value of 25.592. The method with the best PSNR value was RAUNE-Net, while CPDM exhibited the second-best SSIM results. On the EUVP dataset, PerNet improves the SSIM value by 1.3 percent compared to CPDM.

4.4. Lightweight Experiment

Our PerNet leverages the ELAU, which endows it with outstanding lightweight capabilities, effectively reducing the model's parameter count. We compared PerNet with several networks to demonstrate its advantage in parameter count. The comparative results are presented in Figure 12, where "Param" denotes the parameter count of the network model. From Figure 12, it can be observed that our network exhibits strong lightweight characteristics. With only 1.29 million parameters, PerNet has the lowest parameter count among the image-restoration methods listed in Figure 12.
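For reproducibility, the parameter count reported in Figure 12 can be obtained for any PyTorch model with a simple count over its trainable parameters; the helper below is generic and assumes only that the model is an nn.Module.

```python
import torch.nn as nn

def count_parameters_m(model: nn.Module) -> float:
    """Return the number of trainable parameters in millions."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad) / 1e6

# Example with the sketches from Section 3:
# print(f"{count_parameters_m(PerNetSketch(PPELAM())):.2f} M parameters")
```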

4.5. Ablation Experiment

To validate the impact of ELAU on our network's performance and explore the most suitable settings for image-restoration tasks, we conducted ablation experiments by varying the four parameter groups (T, B, S and L) of ELAU along with the number of ELAUs. We performed these experiments on the Rain800 dataset; in the tables, red indicates the best values and blue indicates the second-best values. The experimental results are presented in Table 6 and Table 7, where Table 6 focuses on PSNR and Table 7 on SSIM. According to the experimental results, when the number of ELAUs is 8, parameter group T exhibits the best PSNR and SSIM values, reaching 25.362 and 0.852, respectively. On the other hand, parameter group L shows the poorest PSNR and SSIM values, at 24.992 and 0.843, respectively. When the number of ELAUs is increased to 16, parameter group T achieves PSNR and SSIM values of 25.444 and 0.856, respectively, indicating the worst performance, whereas parameter group L achieves PSNR and SSIM values of 25.993 and 0.889, respectively, demonstrating the best performance. As the number of ELAUs increases further, parameter group T consistently exhibits the worst PSNR and SSIM values, while parameter group L consistently demonstrates the best. This suggests that parameter group T is most suitable for lightweight small-scale image-restoration networks, while parameter groups B and S are suitable for non-large-scale deep image-restoration networks and parameter group L is suitable for extra-large-scale image-restoration networks.
To explore the advantages of the ELAU, we separately added the ELAU, the Convolutional Block Attention Module (CBAM) [88] and the Squeeze-and-Excitation (SE) [89] module to our network and then compared their combinations with the Sparse Transformer (ST) [90]. We set the ELAU parameter group to L, with 16 ELAU modules. The results are shown in Table 8. Red values indicate the best values, while blue values indicate the second-best values. From the table, it can be seen that when attention modules are added individually, ELAU achieves the best performance, with PSNR and SSIM values of 25.993 and 0.889, respectively, both being the highest. The second-best performance is observed when CBAM is added alone, with PSNR and SSIM values of 25.675 and 0.856, respectively. When combined with the ST module, the combination of ELAU and ST shows the best performance, with PSNR and SSIM values of 26.291 and 0.912, respectively. The combination of CBAM and ST exhibits the second-best performance, with PSNR and SSIM values of 26.186 and 0.908, respectively. Overall, ELAU performs the best in image-restoration tasks, followed by CBAM, while SE performs the worst. This indicates that ELAU has the potential to replace CBAM in certain domains.

5. Conclusions

This paper introduces a versatile lightweight image-restoration network called PerNet, designed to effectively balance efficiency and accuracy in image-restoration tasks. The network leverages an efficient local attention mechanism, thoroughly exploring the continuous correlations in both horizontal and vertical spatial dimensions of images. To better adapt to different types of image degradation, a PPELAM module is designed, which effectively matches the model to various types of degradation. The innovation of PerNet lies in combining the efficient local attention mechanism with a progressive mode, allowing the network to accurately capture image details while maintaining a lightweight structure. Specifically, the efficient local attention mechanism significantly enhances the network’s performance in handling complex scenes, while the progressive mode refines features layer by layer to gradually restore image details. Furthermore, the introduction of the PPELAM module enables PerNet to highly match different types of image degradation, further enhancing the network’s applicability and performance in practical scenarios. To validate PerNet’s performance, we conducted numerous ablation experiments. The results demonstrate that integrating PPELAM with a Transformer yields significantly better restoration effects compared to other methods, proving the efficiency and applicability of PPELAM. Especially in addressing complex image degradation issues, PerNet exhibits outstanding performance, showing excellent restoration results in tasks such as de-raining, de-snowing, de-hazing and underwater enhancement. Overall, PerNet effectively addresses the balance between efficiency and accuracy in image-restoration tasks. By combining the advantages of an efficient local attention mechanism and progressive processing, it offers a feasible and efficient solution for image-restoration.

Author Contributions

Conceptualization, W.L.; methodology, G.Z.; software, S.L.; validation, W.L. and S.L.; formal analysis, W.L.; investigation, Y.T.; resources, S.L.; data curation, G.Z.; writing—original draft preparation, W.L.; writing—review and editing, G.Z.; visualization, W.L.; supervision, S.L.; project administration, Y.T.; funding acquisition, Y.T. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Major Program of National Natural Science Foundation of China (Grant Number: 61991413).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The Rain200H, Rain200L and Rain800 datasets are open datasets and can be downloaded at https://github.com/nnUyi/DerainZoo/blob/master/DerainDatasets.md (accessed on 1 May 2024). The Snow100K dataset is an open dataset and can be downloaded at https://sites.google.com/view/yunfuliu/desnownet (accessed on 1 May 2024). The CSD dataset is an open dataset and can be downloaded at https://ccncuedutw-my.sharepoint.com/:u:/g/personal/104501531_cc_ncu_edu_tw/EfCooq0sZxxNkB7F8HgCyKwB-sJQtVE59_Gpb9soatYi5A?e=5NjDhb (accessed on 1 May 2024). The RSID dataset is an open dataset and can be downloaded at https://github.com/Shan-rs/DCI-Net (accessed on 1 May 2024). The EUVP dataset is an open dataset and can be downloaded at http://irvlab.cs.umn.edu/resources/euvp-dataset (accessed on 1 May 2024).

Conflicts of Interest

The authors declare no conflicts of interest.

Nomenclatures

DSC: Discriminative Sparse Coding
DiG-CoM: Directional Gradient, Constraints-based Model
SPD-Net: Structure-Preserving Deraining Network
NLCL: Non-Local Contrastive Learning
SIRR: Single Image Rain Removal
JRGB: Joint Rain Generation and removal for Both the real and synthetic image
DDN: Deep Detail Network
Air-Net: Auxiliary image reconstruction Network
DID-MDN: DensIty-aware De-raining using a Multi-stream Dense Network
RESCAN: RecurrEnt Squeeze-and-excitation Context Aggregation Network
PreNet: Progressive image deraining Networks
MSPFN: Multi-Scale Progressive Fusion Network
JSTASR: Joint Size and Transparency-Aware Snow Removal
HDCW-Net: Hierarchical Dual-tree Complex Wavelet representation Network
DDMSNet: Deep Dense Multi-Scale Network
MPRNet: Multi-stage Progressive image-restoration Net
SMGARN: Snow Mask Guided Adaptive Residual Network
TKL: Two-stage Knowledge Learning
MSP-Former: Multi-Scale Projection transFormer
Uformer: U-shaped transformer
NAFNet: Nonlinear Activation Free Network
DGUNet: Deep Generalized Unfolding Networks
Cycle-SNSPGAN: Cycle Spectral Normalized Soft likelihood estimation Patch GAN
ZID: Zero-shot Image Dehazing
FCTF-Net: First-Coarse-Then-Fine Network
FFA-Net: Feature Fusion Attention Network
TCN: Triple-Convolutional Network
EVPM: dEhazing Values Prior Model
IDeRs: Iterative Dehazing method for single Remote sensing image
GRS-HTM: Ground Radiance Suppressed Haze Thickness Map
SDCP: Sphere model improved Dark Channel Prior
UHD: Ultra-High-Definition
DeHamer: DeHazing transformer
STD: Structure layer according To the Distribution
Zero-restore: Zero-shot single image-restoration
ROP: Rank-One Prior
PRWNet: Progressively Refine Wavelet Network
ShallowUW: Shallow UnderWater
UWCNN: UnderWater image enhancement Convolutional Neural Network
FUnIE-GAN: Fast underwater Image Enhancement Generative Adversarial Network
UT-UIE: U-shape Transformer for Underwater Image Enhancement
Water-Net: underWater image enhancement Network
RAUNE-Net: Residual and Attention-driven Underwater eNhancEment Network
CPDM: Content-Preserving Diffusion Model
SyreaNet: Synthetic and real images Network
SGUIE-Net: Semantic attention Guided Underwater Image Enhancement Network
Cycle-GAN: Cycle-consistent Generative Adversarial Networks
CSD: Comprehensive Snow Dataset
RSID: Remote Sensing Image Dataset
EUVP: Enhancing Underwater Visual Perception

References

  1. Liu, Q.; Liu, Y.; Lin, D. Revolutionizing Target Detection in Intelligent Traffic Systems: YOLOv8-SnakeVision. Electronics 2023, 12, 4970. [Google Scholar] [CrossRef]
  2. Zhou, X.; Duan, Y.; Ding, R.; Wang, Q.; Wang, Q.; Qin, J.; Liu, H. Bit-Weight Adjustment for Bridging Uniform and Non-Uniform Quantization to Build Efficient Image Classifiers. Electronics 2023, 12, 5043. [Google Scholar] [CrossRef]
  3. Hirschmuller, H.; Scharstein, D. Evaluation of cost functions for stereo matching. In Proceedings of the 2007 IEEE Conference on Computer Vision and Pattern Recognition, Minneapolis, MN, USA, 17–22 June 2007; pp. 1–8. [Google Scholar]
  4. Hu, H.; Zhang, Z.; Xie, Z.; Lin, S. Local relation networks for image recognition. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 3464–3473. [Google Scholar]
  5. Li, P.; Tian, J.; Tang, Y.; Wang, G.; Wu, C. Model-based deep network for single image deraining. IEEE Access 2020, 8, 14036–14047. [Google Scholar] [CrossRef]
  6. Kamilaris, A.; Prenafeta-Boldú, F.X. Deep learning in agriculture: A survey. Comput. Electron. Agric. 2018, 147, 70–90. [Google Scholar] [CrossRef]
  7. Guo, Y.; Liu, Y.; Oerlemans, A.; Lao, S.; Wu, S.; Lew, M.S. Deep learning for visual understanding: A review. Neurocomputing 2016, 187, 27–48. [Google Scholar] [CrossRef]
  8. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444. [Google Scholar] [CrossRef] [PubMed]
  9. Zhang, J.; Zhao, D.; Xiong, R.; Ma, S.; Gao, W. Image-restoration using joint statistical modeling in a space-transform domain. IEEE Trans. Circuits Syst. Video Technol. 2014, 24, 915–928. [Google Scholar] [CrossRef]
  10. Chambolle, A.; Lions, P.L. Image recovery via total variation minimization and related problems. Numer. Math. 1997, 76, 167–188. [Google Scholar] [CrossRef]
  11. Podilchuk, C.I.; Mammone, R.J. Image recovery by convex projections using a least-squares constraint. JOSA A 1990, 7, 517–521. [Google Scholar] [CrossRef]
  12. Chen, Y.; Pock, T. Trainable nonlinear reaction diffusion: A flexible framework for fast and effective Image-restoration. IEEE Trans. Pattern Anal. Mach. Intell. 2016, 39, 1256–1272. [Google Scholar] [CrossRef]
  13. Liu, Q.; Wang, S.; Ying, L.; Peng, X.; Zhu, Y.; Liang, D. Adaptive dictionary learning in sparse gradient domain for image recovery. IEEE Trans. Image Process. 2013, 22, 4652–4663. [Google Scholar] [CrossRef]
  14. Yu, H.; Yuan, X.; Jiang, R.; Feng, H.; Liu, J.; Li, Z. Feature Reduction Networks: A Convolution Neural Network-Based Approach to Enhance Image Dehazing. Electronics 2023, 12, 4984. [Google Scholar] [CrossRef]
  15. Haris, M.; Shakhnarovich, G.; Ukita, N. Deep back-projection networks for super-resolution. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 1664–1673. [Google Scholar]
  16. Zhang, Y.; Tian, Y.; Kong, Y.; Zhong, B.; Fu, Y. Residual dense network for image super-resolution. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 2472–2481. [Google Scholar]
  17. Fu, X.; Liang, B.; Huang, Y.; Ding, X.; Paisley, J. Lightweight pyramid networks for image deraining. IEEE Trans. Neural Netw. Learn. Syst. 2019, 31, 1794–1807. [Google Scholar] [CrossRef] [PubMed]
  18. Hu, M.; Yang, J.; Ling, N.; Liu, Y.; Fan, J. Lightweight single image deraining algorithm incorporating visual saliency. IET Image Process. 2022, 16, 3190–3200. [Google Scholar] [CrossRef]
  19. Mou, C.; Zhang, J.; Fan, X.; Liu, H.; Wang, R. COLA-Net: Collaborative attention network for Image-restoration. IEEE Trans. Multimed. 2021, 24, 1366–1377. [Google Scholar] [CrossRef]
  20. Deng, S.; Wei, M.; Wang, J.; Feng, Y.; Liang, L.; Xie, H.; Wang, F.L.; Wang, M. Detail-recovery image deraining via context aggregation networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 14–19 June 2020; pp. 14560–14569. [Google Scholar]
  21. Yu, J.; Lin, Z.; Yang, J.; Shen, X.; Lu, X.; Huang, T.S. Generative image inpainting with contextual attention. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 5505–5514. [Google Scholar]
  22. Wang, Y.; Tao, X.; Qi, X.; Shen, X.; Jia, J. Image inpainting via generative multi-column convolutional neural networks. Adv. Neural Inf. Process. Syst. 2018, 329–338. [Google Scholar]
  23. Siddiqua, M.; Belhaouari, S.B.; Akhter, N.; Zameer, A.; Khurshid, J. MACGAN: An all-in-one Image-restoration under adverse conditions using multidomain attention-based conditional GAN. IEEE Access 2023, 11, 70482–70502. [Google Scholar] [CrossRef]
  24. Mei, Y.; Fan, Y.; Zhang, Y.; Yu, J.; Zhou, Y.; Liu, D.; Fu, Y.; Huang, T.S.; Shi, H. Pyramid attention network for image-restoration. Int. J. Comput. Vis. 2023, 131, 3207–3225. [Google Scholar] [CrossRef]
  25. Chen, S.; Ye, T.; Liu, Y.; Chen, E. Dual-former: Hybrid self-attention transformer for efficient image restoration. Digit. Signal Process. 2024, 149, 104485. [Google Scholar] [CrossRef]
  26. Liu, G.; Reda, F.A.; Shih, K.J.; Wang, T.C.; Tao, A.; Catanzaro, B. Image inpainting for irregular holes using partial convolutions. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 85–100. [Google Scholar]
  27. Yang, W.; Tan, R.T.; Feng, J.; Liu, J.; Guo, Z.; Yan, S. Deep joint rain detection and removal from a single image. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 1357–1366. [Google Scholar]
  28. Zhang, H.; Sindagi, V.; Patel, V.M. Image de-raining using a conditional generative adversarial network. IEEE Trans. Circuits Syst. Video Technol. 2019, 30, 3943–3956. [Google Scholar] [CrossRef]
  29. Liu, Y.F.; Jaw, D.W.; Huang, S.C.; Hwang, J.N. Desnownet: Context-aware deep network for snow removal. IEEE Trans. Image Process. 2018, 27, 3064–3073. [Google Scholar] [CrossRef] [PubMed]
  30. Chen, W.T.; Fang, H.Y.; Hsieh, C.L.; Tsai, C.C.; Chen, I.; Ding, J.J.; Kuo, S.Y. All snow removed: Single image desnowing algorithm using hierarchical dual-tree complex wavelet representation and contradict channel loss. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, BC, Canada, 11–17 October 2021; pp. 4196–4205. [Google Scholar]
  31. Zhang, L.; Wang, S. Dense haze removal based on dynamic collaborative inference learning for remote sensing images. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5631016. [Google Scholar] [CrossRef]
  32. Islam, M.J.; Xia, Y.; Sattar, J. Fast underwater image enhancement for improved visual perception. IEEE Robot. Autom. Lett. 2020, 5, 3227–3234. [Google Scholar] [CrossRef]
  33. Luo, Y.; Xu, Y.; Ji, H. Removing rain from a single image via discriminative sparse coding. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 3397–3405. [Google Scholar]
  34. Ran, W.; Yang, Y.; Lu, H. Single image rain removal boosting via directional gradient. In Proceedings of the 2020 IEEE International Conference on Multimedia and Expo (ICME), London, UK, 6–10 July 2020; pp. 1–6. [Google Scholar]
  35. Wei, Y.; Zhang, Z.; Wang, Y.; Xu, M.; Yang, Y.; Yan, S.; Wang, M. Deraincyclegan: Rain attentive cyclegan for single image deraining and rainmaking. IEEE Trans. Image Process. 2021, 30, 4788–4801. [Google Scholar] [CrossRef] [PubMed]
  36. Yi, Q.; Li, J.; Dai, Q.; Fang, F.; Zhang, G.; Zeng, T. Structure-preserving deraining with residue channel prior guidance. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, BC, Canada, 11–17 October 2021; pp. 4238–4247. [Google Scholar]
  37. Ye, Y.; Yu, C.; Chang, Y.; Zhu, L.; Zhao, X.L.; Yan, L.; Tian, Y. Unsupervised deraining: Where contrastive learning meets self-similarity. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 19–24 June 2022; pp. 5821–5830. [Google Scholar]
  38. Yasarla, R.; Sindagi, V.A.; Patel, V.M. Syn2real transfer learning for image deraining using gaussian processes. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 14–19 June 2020; pp. 2726–2736. [Google Scholar]
  39. Wang, H.; Yue, Z.; Xie, Q.; Zhao, Q.; Zheng, Y.; Meng, D. From rain generation to rain removal. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 19–25 June 2021; pp. 14791–14801. [Google Scholar]
  40. Ye, Y.; Chang, Y.; Zhou, H.; Yan, L. Closing the loop: Joint rain generation and removal via disentangled image translation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 19–25 June 2021; pp. 2053–2062. [Google Scholar]
  41. Fu, X.; Huang, J.; Zeng, D.; Huang, Y.; Ding, X.; Paisley, J. Removing rain from single images via a deep detail network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 3855–3863. [Google Scholar]
  42. Gui, D.; Song, Q.; Song, B.; Li, H.; Wang, M.; Min, X.; Li, A. AIR-Net: A novel multi-task learning method with auxiliary image reconstruction for predicting EGFR mutation status on CT images of NSCLC patients. Comput. Biol. Med. 2022, 141, 105157. [Google Scholar] [CrossRef]
  43. Zhang, H.; Patel, V.M. Density-aware single image de-raining using a multi-stream dense network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 695–704. [Google Scholar]
  44. Li, X.; Wu, J.; Lin, Z.; Liu, H.; Zha, H. Recurrent squeeze-and-excitation context aggregation net for single image deraining. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 254–269. [Google Scholar]
  45. Wei, M.; Shen, Y.; Wang, Y.; Xie, H.; Qin, J.; Wang, F.L. Raindiffusion: When unsupervised learning meets diffusion models for real-world image deraining. arXiv 2023, arXiv:2301.09430. [Google Scholar]
  46. Ren, D.; Zuo, W.; Hu, Q.; Zhu, P.; Meng, D. Progressive image deraining networks: A better and simpler baseline. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 3937–3946. [Google Scholar]
  47. Jiang, K.; Wang, Z.; Yi, P.; Chen, C.; Huang, B.; Luo, Y.; Ma, J.; Jiang, J. Multi-scale progressive fusion network for single image deraining. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 14–19 June 2020; pp. 8346–8355. [Google Scholar]
  48. Zhu, J.Y.; Park, T.; Isola, P.; Efros, A.A. Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2223–2232. [Google Scholar]
  49. Li, R.; Tan, R.T.; Cheong, L.F. All in one bad weather removal using architectural search. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 14–19 June 2020; pp. 3175–3185. [Google Scholar]
  50. Chen, W.T.; Fang, H.Y.; Ding, J.J.; Tsai, C.C.; Kuo, S.Y. JSTASR: Joint size and transparency-aware snow removal algorithm based on modified partial convolution and veiling effect removal. In Proceedings of the Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, 23–28 August 2020; Proceedings, Part XXI 16. Springer: Berlin/Heidelberg, Germany, 2020; pp. 754–770. [Google Scholar]
  51. Zhang, K.; Li, R.; Yu, Y.; Luo, W.; Li, C. Deep dense multi-scale network for snow removal using semantic and depth priors. IEEE Trans. Image Process. 2021, 30, 7419–7431. [Google Scholar] [CrossRef] [PubMed]
  52. Zamir, S.W.; Arora, A.; Khan, S.; Hayat, M.; Khan, F.S.; Yang, M.H.; Shao, L. Multi-stage progressive image-restoration. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 19–25 June 2021; pp. 14821–14831. [Google Scholar]
  53. Valanarasu, J.M.J.; Yasarla, R.; Patel, V.M. Transweather: Transformer-based restoration of images degraded by adverse weather conditions. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 2353–2363. [Google Scholar]
  54. Cheng, B.; Li, J.; Chen, Y.; Zeng, T. Snow mask guided adaptive residual network for image snow removal. Comput. Vis. Image Underst. 2023, 236, 103819. [Google Scholar] [CrossRef]
  55. Chen, W.T.; Huang, Z.K.; Tsai, C.C.; Yang, H.H.; Ding, J.J.; Kuo, S.Y. Learning multiple adverse weather removal via two-stage knowledge learning and multi-contrastive regularization: Toward a unified model. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 17653–17662. [Google Scholar]
  56. Özdenizci, O.; Legenstein, R. Restoring vision in adverse weather conditions with patch-based denoising diffusion models. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 45, 10346–10357. [Google Scholar] [CrossRef]
  57. Chen, S.; Ye, T.; Liu, Y.; Liao, T.; Jiang, J.; Chen, E.; Chen, P. Msp-former: Multi-scale projection transformer for single image desnowing. In Proceedings of the ICASSP 2023–2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece, 4–10 June 2023; pp. 1–5. [Google Scholar]
  58. Wang, Z.; Cun, X.; Bao, J.; Zhou, W.; Liu, J.; Li, H. Uformer: A general u-shaped transformer for image-restoration. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 17683–17693. [Google Scholar]
  59. Zamir, S.W.; Arora, A.; Khan, S.; Hayat, M.; Khan, F.S.; Yang, M.H. Restormer: Efficient transformer for high-resolution image restoration. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 5728–5739. [Google Scholar]
  60. Chen, L.; Chu, X.; Zhang, X.; Sun, J. Simple baselines for image-restoration. In Proceedings of the European Conference on Computer Vision, Tel Aviv, Israel, 23–27 October 2022; Springer: Berlin/Heidelberg, Germany, 2022; pp. 17–33. [Google Scholar]
  61. Mou, C.; Wang, Q.; Zhang, J. Deep generalized unfolding networks for image-restoration. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 17399–17410. [Google Scholar]
  62. Wang, T.; Zhang, K.; Shao, Z.; Luo, W.; Stenger, B.; Lu, T.; Kim, T.K.; Liu, W.; Li, H. Gridformer: Residual dense transformer with grid structure for image restoration in adverse weather conditions. arXiv 2023, arXiv:2305.17863. [Google Scholar] [CrossRef]
  63. Wang, Y.; Yan, X.; Guan, D.; Wei, M.; Chen, Y.; Zhang, X.P.; Li, J. Cycle-snspgan: Towards real-world image dehazing via cycle spectral normalized soft likelihood estimation patch gan. IEEE Trans. Intell. Transp. Syst. 2022, 23, 20368–20382. [Google Scholar] [CrossRef]
  64. Li, B.; Gou, Y.; Liu, J.Z.; Zhu, H.; Zhou, J.T.; Peng, X. Zero-shot image dehazing. IEEE Trans. Image Process. 2020, 29, 8457–8466. [Google Scholar] [CrossRef] [PubMed]
  65. Li, Y.; Chen, X. A coarse-to-fine two-stage attentive network for haze removal of remote sensing images. IEEE Geosci. Remote Sens. Lett. 2020, 18, 1751–1755. [Google Scholar] [CrossRef]
  66. Qin, X.; Wang, Z.; Bai, Y.; Xie, X.; Jia, H. FFA-Net: Feature fusion attention network for single image dehazing. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; Volume 34, pp. 11908–11915. [Google Scholar]
  67. Shin, J.; Park, H.; Paik, J. Region-based dehazing via dual-supervised triple-convolutional network. IEEE Trans. Multimed. 2021, 24, 245–260. [Google Scholar] [CrossRef]
  68. Han, J.; Zhang, S.; Fan, N.; Ye, Z. Local patchwise minimal and maximal values prior for single optical remote sensing image dehazing. Inf. Sci. 2022, 606, 173–193. [Google Scholar] [CrossRef]
  69. Xu, L.; Zhao, D.; Yan, Y.; Kwong, S.; Chen, J.; Duan, L.Y. IDeRs: Iterative dehazing method for single remote sensing image. Inf. Sci. 2019, 489, 50–62. [Google Scholar] [CrossRef]
  70. Liu, Q.; Gao, X.; He, L.; Lu, W. Haze removal for a single visible remote sensing image. Signal Process. 2017, 137, 33–43. [Google Scholar] [CrossRef]
  71. Li, J.; Hu, Q.; Ai, M. Haze and thin cloud removal via sphere model improved dark channel prior. IEEE Geosci. Remote Sens. Lett. 2018, 16, 472–476. [Google Scholar] [CrossRef]
  72. Zheng, Z.; Ren, W.; Cao, X.; Hu, X.; Wang, T.; Song, F.; Jia, X. Ultra-high-definition image dehazing via multi-guided bilateral learning. In Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 19–25 June 2021; pp. 16180–16189. [Google Scholar]
  73. Guo, C.L.; Yan, Q.; Anwar, S.; Cong, R.; Ren, W.; Li, C. Image dehazing transformer with transmission-aware 3d position embedding. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 5812–5820. [Google Scholar]
  74. Li, R.; Pan, J.; Li, Z.; Tang, J. Single image dehazing via conditional generative adversarial network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 8202–8211. [Google Scholar]
  75. Mi, Z.; Li, Y.; Jin, J.; Liang, Z.; Fu, X. A generalized enhancement framework for hazy images with complex illumination. IEEE Geosci. Remote Sens. Lett. 2021, 19, 3079456. [Google Scholar] [CrossRef]
  76. Kar, A.; Dhara, S.K.; Sen, D.; Biswas, P.K. Zero-shot single image-restoration through controlled perturbation of koschmieder’s model. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 19–25 June 2021; pp. 16205–16215. [Google Scholar]
  77. Liu, J.; Liu, R.W.; Sun, J.; Zeng, T. Rank-one prior: Real-time scene recovery. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 45, 8845–8860. [Google Scholar] [CrossRef]
  78. Huo, F.; Li, B.; Zhu, X. Efficient wavelet boost learning-based multi-stage progressive refinement network for underwater image enhancement. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, BC, Canada, 11–17 October 2021; pp. 1944–1952. [Google Scholar]
  79. Naik, A.; Swarnakar, A.; Mittal, K. Shallow-uwnet: Compressed model for underwater image enhancement (student abstract). In Proceedings of the AAAI Conference on Artificial Intelligence, Vancouver, BC, Canada, 2–9 February 2021; Volume 35, pp. 15853–15854. [Google Scholar]
  80. Li, C.; Anwar, S.; Porikli, F. Underwater scene prior inspired deep underwater image and video enhancement. Pattern Recognit. 2020, 98, 107038. [Google Scholar] [CrossRef]
  81. Peng, L.; Zhu, C.; Bian, L. U-shape transformer for underwater image enhancement. IEEE Trans. Image Process. 2023, 29, 4376–4389. [Google Scholar]
  82. Li, C.; Guo, C.; Ren, W.; Cong, R.; Hou, J.; Kwong, S.; Tao, D. An underwater image enhancement benchmark dataset and beyond. IEEE Trans. Image Process. 2019, 29, 4376–4389. [Google Scholar] [CrossRef] [PubMed]
  83. Peng, W.; Zhou, C.; Hu, R.; Cao, J.; Liu, Y. RAUNE-Net: A Residual and Attention-Driven Underwater Image Enhancement Method. arXiv 2023, arXiv:2311.00246. [Google Scholar]
  84. Shi, X.; Wang, Y.G. CPDM: Content-Preserving Diffusion Model for Underwater Image Enhancement. arXiv 2024, arXiv:2401.15649. [Google Scholar]
  85. Wen, J.; Cui, J.; Zhao, Z.; Yan, R.; Gao, Z.; Dou, L.; Chen, B.M. Syreanet: A physically guided underwater image enhancement framework integrating synthetic and real images. In Proceedings of the 2023 IEEE International Conference on Robotics and Automation (ICRA), London, UK, 29 May–2 June 2023; pp. 5177–5183. [Google Scholar]
  86. Qi, Q.; Li, K.; Zheng, H.; Gao, X.; Hou, G.; Sun, K. SGUIE-Net: Semantic attention guided underwater image enhancement with multi-scale perception. IEEE Trans. Image Process. 2022, 31, 6816–6830. [Google Scholar] [CrossRef]
  87. Li, C.; Guo, J.; Guo, C. Emerging from water: Underwater image color correction based on weakly supervised color transfer. IEEE Signal Process. Lett. 2018, 25, 323–327. [Google Scholar] [CrossRef]
  88. Woo, S.; Park, J.; Lee, J.Y.; Kweon, I.S. Cbam: Convolutional block attention module. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 3–19. [Google Scholar]
  89. Hu, J.; Shen, L.; Sun, G. Squeeze-and-excitation networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 7132–7141. [Google Scholar]
  90. Chen, X.; Li, H.; Li, M.; Pan, J. Learning a sparse transformer network for effective image deraining. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 18–22 June 2023; pp. 5896–5905. [Google Scholar]
Figure 1. (a) Image with rain streaks, (b) Extracted rain degradation features, (c) Clear image, (d) Restored result from the image with rain streaks. (e) Image with snowflakes, (f) Extracted snowflake degradation features, (g) Clear image, (h) Restored result from the image with snowflakes.
Figure 2. Schematic diagram of the overall network. The output of the formula for the important stages is marked in red font in the figure, corresponding to the formula to be introduced below.
Figure 3. Schematic diagram of the ELAU. The output of the formula for the important stages is marked in red font in the figure, corresponding to the formula to be introduced below.
Figure 4. Schematic diagram of the grouped convolution. (a,b) depict standard convolution. (c,d) illustrate grouped convolution. The same color represents features within the same group. In (a,b), all features are of one color, indicating no grouping. In (c,d), the features are grouped in pairs, so every two feature maps share the same color.
Figure 5. Restoration results on Rain200H dataset. The colored boxes in the figure represent some local information in the image. These local details are magnified to clearly display the image-restoration effects of different algorithms.
Figure 6. Restoration results on Rain200L dataset. The colored boxes in the figure represent some local information in the image. These local details are magnified to clearly display the image-restoration effects of different algorithms.
Figure 7. Restoration results on Rain800 dataset. The colored boxes in the figure represent some local information in the image. These local details are magnified to clearly display the image-restoration effects of different algorithms.
Figure 8. Restoration results on Snow100K-L dataset.
Figure 9. Restoration results on CSD dataset.
Figure 10. Restoration results on RSID dataset. The colored boxes in the figure represent some local information in the image. These local details are magnified to clearly display the image-restoration effects of different algorithms.
Figure 11. Restoration results on EUVP dataset.
Figure 12. Comparison of the number of parameters of different methods.
Table 1. The seven datasets are divided into two parts according to training and testing.

Datasets | Training Set/Pairs | Test Set/Pairs
Rain200H [27] | 1800 | 200
Rain200L [27] | 1800 | 200
Rain800 [28] | 700 | 100
Snow100K [29] | 50,000 | 50,000
CSD [30] | 7000 | 1000
RSID [31] | 900 | 100
EUVP [32] | 11,435 | 515
Table 2. Average PSNR and SSIM values for the deraining task. The arrows in the table indicate that higher PSNR and SSIM values correspond to better performance. Red and blue values represent the best and second-best results, respectively.

Methods | Rain200L PSNR↑ | Rain200L SSIM↑ | Rain200H PSNR↑ | Rain200H SSIM↑ | Rain800 PSNR↑ | Rain800 SSIM↑
DSC [33] | 27.163 | 0.866 | 14.735 | 0.382 | 14.935 | 0.468
DiG-CoM [34] | 30.782 | 0.854 | 19.332 | 0.767 | 22.535 | 0.833
DerainCycleGAN [35] | 31.491 | 0.936 | 24.321 | 0.842 | 24.293 | 0.859
SPD-Net [36] | 31.591 | 0.919 | 26.071 | 0.857 | 24.372 | 0.861
NLCL [37] | 31.741 | 0.935 | 22.312 | 0.728 | 24.461 | 0.821
Syn2Real [38] | 34.391 | 0.965 | 25.761 | 0.837 | 23.741 | 0.799
SIRR [39] | 34.471 | 0.969 | 26.551 | 0.846 | 24.361 | 0.859
JRGB [40] | 34.512 | 0.967 | 24.621 | 0.849 | 24.621 | 0.828
DDN [41] | 34.683 | 0.976 | 26.053 | 0.806 | 24.234 | 0.468
Air-Net [42] | 34.901 | 0.969 | 25.482 | 0.829 | 23.771 | 0.833
DID-MDN [43] | 35.401 | 0.961 | 25.612 | 0.854 | 21.891 | 0.795
RESCAN [44] | 36.094 | 0.970 | 26.751 | 0.835 | 24.332 | 0.823
RainDiffusion [45] | 36.851 | 0.972 | 26.021 | 0.862 | 26.491 | 0.875
PReNet [46] | 37.802 | 0.866 | 14.735 | 0.382 | 14.935 | 0.468
MSPFN [47] | 38.581 | 0.983 | 29.361 | 0.903 | 23.332 | 0.803
PerNet | 39.591 | 0.989 | 29.582 | 0.912 | 25.993 | 0.889
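As a reminder of how the two metrics reported in Tables 2-5 are defined, the snippet below is a minimal sketch assuming 8-bit RGB images and scikit-image (version 0.19 or newer, for the channel_axis argument). The paper does not publish its evaluation script, so this is only the standard textbook computation, not the authors' code.

```python
import numpy as np
from skimage.metrics import structural_similarity  # scikit-image >= 0.19 assumed


def psnr(reference: np.ndarray, restored: np.ndarray, max_val: float = 255.0) -> float:
    """PSNR = 10 * log10(MAX^2 / MSE), the first metric reported in the tables."""
    mse = np.mean((reference.astype(np.float64) - restored.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10((max_val ** 2) / mse)


def ssim(reference: np.ndarray, restored: np.ndarray) -> float:
    # channel_axis=-1 treats the last dimension as the RGB channels.
    return structural_similarity(reference, restored, channel_axis=-1, data_range=255)


# Example with dummy 8-bit RGB images; a real evaluation would loop over a test set
# and average the two scores per dataset, as done for each column of the tables.
gt = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)
out = np.clip(gt.astype(np.int16) + np.random.randint(-5, 6, gt.shape), 0, 255).astype(np.uint8)
print(psnr(gt, out), ssim(gt, out))
```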
Table 3. Average PSNR and SSIM values for the desnowing task. The arrows in the table indicate that higher PSNR and SSIM values correspond to better performance. Red and blue values represent the best and second-best results, respectively.

Methods | Snow100K-S PSNR↑ | Snow100K-S SSIM↑ | Snow100K-L PSNR↑ | Snow100K-L SSIM↑ | CSD PSNR↑ | CSD SSIM↑
CycleGAN [48] | 28.513 | 0.902 | 23.596 | 0.883 | 20.981 | 0.801
RESCAN [44] | 31.512 | 0.903 | 26.080 | 0.811 | 22.031 | 0.812
DesnowNet [29] | 32.332 | 0.950 | 27.173 | 0.898 | 20.131 | 0.812
ALL in one [49] | 31.231 | 0.923 | 28.331 | 0.882 | 26.312 | 0.873
JSTASR [50] | 31.401 | 0.901 | 25.321 | 0.808 | 27.961 | 0.883
HDCW-Net [30] | 31.542 | 0.952 | 27.236 | 0.886 | 29.061 | 0.910
DDMSNet [51] | 34.342 | 0.995 | 28.851 | 0.877 | 30.201 | 0.923
MPRNet [52] | 35.872 | 0.962 | 31.023 | 0.913 | 33.981 | 0.972
TransWeather [53] | 32.512 | 0.934 | 29.312 | 0.888 | 31.761 | 0.932
SMGARN [54] | 33.854 | 0.950 | 29.312 | 0.890 | 31.931 | 0.952
TKL [55] | 35.213 | 0.963 | 31.001 | 0.919 | 33.891 | 0.963
WeatherDiff128 [56] | 35.023 | 0.952 | 29.582 | 0.849 | 33.463 | 0.968
MSP-Former [57] | 35.421 | 0.936 | 30.312 | 0.913 | 33.751 | 0.961
Uformer [58] | 35.512 | 0.963 | 31.301 | 0.923 | 33.801 | 0.961
WeatherDiff64 [56] | 35.831 | 0.957 | 30.092 | 0.904 | 33.631 | 0.962
Restormer [59] | 36.081 | 0.959 | 30.281 | 0.912 | 35.431 | 0.972
SnowDiff128 [56] | 36.092 | 0.955 | 30.283 | 0.900 | 35.134 | 0.974
NAFNet [60] | 36.123 | 0.970 | 31.263 | 0.924 | 35.132 | 0.973
DGUNet [61] | 36.312 | 0.971 | 31.204 | 0.922 | 34.741 | 0.973
SnowDiff64 [56] | 36.591 | 0.963 | 30.431 | 0.915 | 35.231 | 0.976
GridFormer-S [62] | 36.681 | 0.960 | 30.782 | 0.917 | 33.903 | 0.963
PerNet | 36.982 | 0.974 | 31.623 | 0.937 | 35.861 | 0.979
Table 4. Average PSNR and SSIM values for the dehazing task. The arrows in the table indicate that higher PSNR and SSIM values correspond to better performance. Red and blue values represent the best and second-best results, respectively.

Methods | R100 PSNR↑ | R100 SSIM↑
Cycle-SNSPGAN [63] | 18.344 | 0.729
ZID [64] | 18.992 | 0.727
FCTF-Net [65] | 19.306 | 0.856
FFA-Net [66] | 24.052 | 0.899
TCN [67] | 14.208 | 0.606
EVPM [68] | 15.579 | 0.689
IDeRs [69] | 13.604 | 0.644
GRS-HTM [70] | 14.800 | 0.519
SDCP [71] | 16.055 | 0.691
UHD [72] | 26.659 | 0.923
DeHamer [73] | 23.752 | 0.899
Dehaze-cGAN [74] | 18.703 | 0.743
STD [75] | 16.258 | 0.559
Zero-restore [76] | 16.648 | 0.717
ROP [77] | 15.575 | 0.750
PerNet | 26.794 | 0.935
Table 5. Average PSNR and SSIM values for underwater enhancement task. The arrows in the table indicate that higher PSNR and SSIM values correspond to better performance. Red and blue values represent the best and second-best results, respectively.

Methods | EUVP(515) PSNR↑ | EUVP(515) SSIM↑
PRWNet [78] | 25.441 | 0.843
ShallowUW [79] | 24.551 | 0.852
UWCNN [80] | 17.725 | 0.704
FunIE-GAN [32] | 24.077 | 0.794
UT-UIE [81] | 25.214 | 0.813
Water-Net [82] | 25.285 | 0.833
RAUNE-Net [83] | 26.331 | 0.845
CPDM [84] | 23.243 | 0.901
SyreaNet [85] | 17.721 | 0.743
SGUIE-Net [86] | 19.187 | 0.760
Cycle-GAN [87] | 17.963 | 0.709
PerNet | 25.592 | 0.913
Table 6. Ablation experiments on the Rain800 dataset regarding the parameters T, B, S, L and the number of ELAU modules, evaluated in terms of PSNR. Red and blue values represent the best and second-best results, respectively.

Parameter | ELAU = 8 | ELAU = 16 | ELAU = 24 | ELAU = 32
T | 25.362 | 25.444 | 25.597 | 25.662
B | 25.271 | 25.685 | 25.768 | 25.832
S | 25.021 | 25.791 | 25.862 | 25.883
L | 24.992 | 25.993 | 26.012 | 26.241
Table 7. Ablation experiments on the Rain800 dataset regarding the parameters T, B, S, L and the number of ELAU modules, evaluated in terms of SSIM. Red and blue values represent the best and second-best results, respectively.

Parameter | ELAU = 8 | ELAU = 16 | ELAU = 24 | ELAU = 32
T | 0.852 | 0.856 | 0.862 | 0.868
B | 0.848 | 0.869 | 0.879 | 0.887
S | 0.844 | 0.873 | 0.884 | 0.894
L | 0.843 | 0.889 | 0.893 | 0.899
Table 8. Ablation experiments of ELAU, SE and CBAM on the Rain800 dataset. Red and blue values represent the best and second-best results, respectively.

Method | PSNR↑ | SSIM↑
ELAU | 25.993 | 0.889
SE | 25.586 | 0.852
CBAM | 25.675 | 0.856
ELAU+ST | 26.291 | 0.912
SE+ST | 26.021 | 0.901
CBAM+ST | 26.186 | 0.908
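For context on the baselines in Table 8, the following is a minimal sketch of the SE block [89] that ELAU is compared against. SE attends only to channels through a global average pool and therefore discards the row/column structure that a directional unit keeps; the reduction ratio of 16 follows the original SE paper and is an assumption here, not a value taken from this article.

```python
import torch
import torch.nn as nn


class SEBlock(nn.Module):
    """Squeeze-and-Excitation channel attention [89], one of the baselines in Table 8."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)      # squeeze: global spatial average
        self.fc = nn.Sequential(                 # excitation: bottleneck MLP
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w                             # channel-wise re-weighting


# Usage example on a dummy feature map.
se = SEBlock(channels=64)
y = se(torch.randn(1, 64, 128, 128))
```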
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
