Article

DAMNet: A Dual Adjacent Indexing and Multi-Deraining Network for Real-Time Image Deraining

1 School of Computer Science and Engineering, Faculty of Innovation Engineering, Macau University of Science and Technology, Taipa, Macau 999078, China
2 Research Institute of Tsinghua University in Shenzhen (RITS), High-Tech Industrial Park, Shenzhen 518057, China
3 Computer Engineering Technical College (Artificial Intelligence College), Guangdong Polytechnic of Science and Technology, Zhuhai 519090, China
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
Fractal Fract. 2023, 7(1), 24; https://doi.org/10.3390/fractalfract7010024
Submission received: 26 September 2022 / Revised: 16 December 2022 / Accepted: 17 December 2022 / Published: 26 December 2022
(This article belongs to the Section Engineering)

Abstract
Image deraining is increasingly critical in computer vision. However, fast deraining algorithms for multiple images that lack temporal and spatial correlation are scarce. To fill this gap, an efficient image-deraining algorithm based on dual adjacent indexing and multi-deraining layers is proposed. The deraining operation rests on two proposals: the dual adjacent method and a joint training method based on multi-deraining layers. The dual adjacent structure indexes pixels from adjacent features of the previous layer to merge with the features produced by the deraining layers, and the merged features are reshaped in preparation for the loss computation. The joint training method builds on multi-deraining layers, which use the pixelshuffle operation to prepare diverse deraining features for the multi-loss functions. The multi-loss functions jointly compute the structural similarity from the reshaped and deraining features, and the features produced by the four deraining layers are concatenated along the channel dimension to obtain the total structural similarity and mean square error. In experiments, the proposed deraining model is efficient on primary rain datasets, reaching more than 200 fps, and maintains impressive results on single and crossing datasets, demonstrating that our model ranks among the most advanced in the rain-removal domain.

1. Introduction

Rain traces or drops caused by outdoor weather are nonnegligible for computer vision, especially when capturing images or videos outdoors [1,2], because the resulting image-quality degradation destabilises the performance of other computer vision algorithms [3,4,5,6]. Rain removal is a basic but important task whose performance deeply affects the algorithms built on top of rain removal models, such as object detectors and segmentation networks in camera-based autonomous driving or maritime video surveillance systems [7,8,9]. Although some deraining methods attain high values of peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM), the corresponding algorithms target rain-streak removal in a single image and achieve only elementary real-time speed even with a high-performance Nvidia Graphics Processing Unit (GPU). To reach high real-time speed for multi-image deraining, efficient and robust deraining structures are critical for giving rain removal models an efficient deraining ability, which is also fundamental for advanced computer vision processing.
Witnessing the impressive benefits of deep learning, several state-of-the-art deraining methods have been proposed. Setting computational cost aside, various deraining methods achieve high performance on a single image [10,11,12,13,14,15,16], which is also practical for image reconstruction and super-resolution. Well-designed iterative multi-stage processes have a nonnegligible impact on anti-degradation, and attention methods and residual structures accompany this multi-stage processing built on carefully designed assumptions. Attention methods help decide “what” or “where” pixels and channels are relatively critical at affordable computational cost [11,17,18], and residual structures promote the cooperation of different layers [10,11,12,13,14,15,16] with bearable GPU-memory and computational burdens. U-Net, containing global down-up sampling, is commonly used to combine detailed and abstract features [19]. However, we find that existing deraining algorithms mainly attend to physical rain-removal models or rely on repetitive structures of attention blocks, residual connections, and local-global U-Nets, lacking commonality and universality and limiting the freedom of the deraining model. In addition, elaborate designs based on seemingly natural assumptions impose visible restrictions on real-time requirements. Deraining models built on such elaborate multi-strategies focus excessively on local patterns of structural similarity [10,11,12,13,14,15,16,17,18,20]. Although these designs and local-pattern assumptions deliver impressive deraining results for a single image, the corresponding methods fail to meet real-time deraining for multiple images even with GPU acceleration. Overall, single-image deraining models mainly use repetitive multi-stage or multi-scale operations [10,11,12,13,14,15,21,22] designed around local features and show visible deraining effects, but their computational burden and inefficiency are nonnegligible, leading to low inference speed. The model-free approach adopted by EfDeRain [23] is a progressive attempt: it abandons multi-stages and multi-strategies based on locally enhanced features and, through end-to-end networks, releases the subsequent optimisation and computational burdens. Thanks to the concise networks of model-free methods, practical real-time inference becomes possible. What the algorithms above have in common is the loss calculation: the loss function is computed from one deraining layer or from fused features of multiple layers. Even when the loss relies on a composite function, the inputs to the loss functions are the same or derived from a single feature. We call this loss computation a “one-off” loss. A one-off loss measures the global difference over the total input while ignoring differences between local features. Its direct consequence for model-free deraining algorithms is degraded generalisation on crossing public datasets, where the training and testing images derive from different rain datasets. Therefore, a novel model-free deraining method that further balances inference and performance is of great importance for image deraining.
This paper pursues a highly real-time method for multi-image deraining while achieving impressive performance on public rainy datasets. Our proposed method is a model-free algorithm with multi-deraining layers, free of rain-streak assumptions and subsequent optimisation. The proposed multi-deraining layers are entirely different from the multi-stage or multi-strategy schemes used by single-image deraining models [10,11,12,13,14,15,21,22], in which multiple input scales enhance rain patterns or multiple stages of local dimension reduction and expansion are built on additional assumptions. Our major contributions are as follows:
  • We propose a dual adjacent method that produces new pixels and stores them in a container by indexing pixels from two adjacent features at a periodic interval step. The container holds the new pixels until all of them are produced and is then reshaped for fusion with the deraining features derived from the multi-deraining layers, which benefits the diversity and efficiency of the subsequent pixel fusion. The dual adjacent method avoids repetitive multi-strategy methods and bridges an efficient connection between the existing U-Net and the deraining layers.
  • We offer a joint training method that computes the loss over five deraining layers. The multi-deraining layers use pixelshuffle operations to produce deraining features while keeping the original volume of pixels. The multi-loss functions jointly compute the losses between the ground truth (GT) and each merged feature, generated from the features of the multi-deraining layers and the reshaped image of the dual adjacent operations. By using pixelshuffle, the joint training method avoids the pixel multiplication caused by PyTorch's common up-sampling operations (nn.Upsample or F.interpolate) while avoiding the defect of the “one-off” loss, as the short sketch after this list illustrates.
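As a toy illustration of this pixel-count point (our own snippet, not code from the paper), pixelshuffle rearranges values while plain upsampling multiplies them:

```python
import torch
import torch.nn as nn

x = torch.randn(1, 16, 32, 32)          # (N, C, H, W): 16*32*32 = 16,384 values

ps = nn.PixelShuffle(2)(x)              # rearranges pixels -> (1, 4, 64, 64)
up = nn.Upsample(scale_factor=2)(x)     # interpolates     -> (1, 16, 64, 64)

print(ps.numel())                       # 16384: same pixel volume, just dislocated
print(up.numel())                       # 65536: 4x more values for later layers
```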

2. Related Works

Deraining methods fall into two categories according to the processed data: video and single-image deraining models. Video deraining models mainly focus on space-time features [5,24,25,26], intrinsic characteristics or repetitive local patterns [27,28,29], and deep learning networks [30,31,32]. Although video deraining methods, especially the deep learning models, perform well on videos, applying them directly to remove rain streaks from separate images is inappropriate because temporal features are absent in multiple or single images. Deraining separate images is more complicated than deraining image sequences and videos. In this paper, we confine our discussion to several classic deep-learning-based deraining methods for separate images.
Various image-deraining methods have been proposed along with the development of deep learning. Pioneering attempts [33,34] exploit the automatic feature extraction of the convolutional neural network (CNN), outperforming traditional deraining methods based on the low-frequency part (LFP) and high-frequency part (HFP) [35,36,37]. Recently, multi-stage and multi-scale strategies have been carefully designed to boost the performance of single-image deraining models. The spatial attentive network (SPANet) [14] is built on a multi-directional recurrent neural network (RNN), in which several residual structures and attention methods extract useful features and iteratively identify rain traces during multi-stage processing. A semi-supervised rain removal method (SIRR) [16] elaborately calculates residuals and correlations between inputs and derained images. A rain convolutional dictionary network (RCDNet) [10] removes single-image degradation caused by rainy weather: inner CNN-based dictionaries encode rain traces, and the encoded features are processed by multi-stage structures that iteratively remove the degradation. Joint rain detection and removal (JORDERE) [21] mitigates the adverse impact of rain with multi-task networks that jointly process features extracted from binary rain images and clean, rain-free images. Conditional variational image deraining (CVID) [12] builds on a conditional variational autoencoder (CVAE), generating multiple predictions of the derained image to produce the corresponding clean backgrounds and rain traces. The recurrent context aggregation network (RESCAN) [20] likewise tackles the inverse effects of rain in multiple stages, using RNN layers to extract and purify rain patterns and build a memory container that aggregates useful features. The progressive recurrent network (PReNet) [15] constructs repetitive residual networks with a recurrent structure that captures the correlations of features across stages. More recently, the attentive image deraining network (AIDNet) [11] relies on residual attention modules and correlations between ground truth and derained outputs, realised through adversarial supervision and a wavelet-space loss. The multi-scale hourglass extraction block (MHEB) [13] derains single images with local and global features extracted by multi-scale processing, in which a hierarchical attentive distillation block (HADB) and residual projected feature fusion (RPFF) exploit and aggregate the features. CKT&MCR [22] offers an elaborately designed multi-strategy structure that further pushes rain-removal performance towards practical usage. EfficientDeRain (EfDeRain) [23] is a model-free method requiring relatively little human intervention about what the model should focus on.
Comprehensively, these multi-strategy methods come in three kinds, as shown in Figure 1: multi-sampling networks, multi-scale input networks, and multi-residual networks. Their common characteristic is repetitive stages or multiple operations in the middle or global processing of the algorithm. Although multi-strategy methods and local-pattern assumptions achieve impressive single-image performance, the corresponding methods devote most of their attention to repetitive structures, inserted several times into the middle or global processing, and thus fail to meet efficient, high real-time requirements for multiple images. Computational cost and inefficiency are nonnegligible, resulting in low inference speed even with GPU acceleration. We argue that elaborate prior knowledge and assumptions potentially create barriers that hinder the balance between inference speed and performance. Although EfDeRain [23] is model-free, its composite loss function evaluates the features from different operations at one time (a one-off loss), yielding performance that slightly mismatches its inherent advantage of high-speed processing. In this paper, we exploit a seemingly simple deraining method for multiple images, utilising pixelshuffle [38] to maintain the number of original pixels during the upsampling stages, which avoids repetitive designs built on prior assumptions.

3. Methodologies

The dual adjacent indexing operation indexes and fuses the adjacent degraded pixels to form a container, which is reshaped to a size compatible with the deraining features. Multi-deraining layers generate deraining features for fusion with the reshaped features. The joint training method with multi-loss drives the training of DAMNet and accounts for the relationship between local-global features and the GT. DAMNet is a model-free deraining method with a seemingly simple structure, free of elaborate designs and assumptions.

3.1. Structures

The core of a deraining method is a reasonable formulation of the learning between ground truth and degradation, presented as “DF(X) versus GT”, where GT is the ground truth (clear images) and the outputs of DF(X) are images predicted from the degraded images X. The function DF is the platform realising the deraining method. To construct efficient and concise networks, we first formalise our rain-removal method to construct our DF, as shown in Figure 2. Figure 2a is the dual adjacent method, consisting of dual adjacent indexing and reshape operations; the dual adjacent indexing operations fuse pixels from adjacent channels to prepare the fusion with the corresponding deraining layer. Figure 2b is the training feedback mechanism, in which five deraining layers join the loss calculations based on each deraining layer and the reshaped feature. SSIM_i and SSIM_total are the loss computations based on the structural similarity index measure (SSIM), and MSE is the global loss computation based on the mean square error (MSE). Based on the formalisation of Figure 2, we use relatively simple CNN layers to build the backbone and propose a dual adjacent indexing and multi-deraining network for real-time image deraining (DAMNet). The structure of DAMNet, representing the forward propagation, is shown in Figure 3. The underlying and head networks of DAMNet each consist of nine CNN layers. “×” denotes the multiplication of scalars and matrices, “3 × Conv2d” represents three CNN layers, and “Pixelshuffle 2” denotes PyTorch's nn.PixelShuffle [38] with an upsampling factor of 2, with similar denotations for the other “PixelShuffle” items. DAMNet uses the existing upsampling of the head network to perform the multi-deraining work. The container stores the new pixels produced by the dual adjacent method shown in Figure 4, and the multi-deraining operations process the pixels sampled from the container. Figure 4 depicts the production of new pixels, which are produced by isometric trilinear interpolation based on periodical and previous pixels. OriginF, DAF_1, … form the basis for producing the next new pixels; the number of DAF_i depends on the size ratio between the original images and the features. Before the production of each DAF_i, new pixels are stored in a pixel container at periodic locations, as in the red frame of Figure 2a. We use pixelshuffle operations to upsample the features, which helps maintain the original number of pixels [38] while indirectly compressing the channels compared with nn.Upsample [23] and deconvolution. Each deraining layer uses pixelshuffle [38] to generate compressed deraining features that combine with the reshaped features derived from the dual adjacent operations. The final deraining layer is the combination of derain_0 to derain_3 in the channel dimension.
The multiple loss calculation based on multi-deraining layers is entirely different from the “one-off” loss used in recent deraining models [11,12,13,14,15,21,39], which calculate the loss from single features or from similar inputs of composite functions. Multiple loss computations over different deraining features mean that not only the global difference between the GT and the combined deraining features is attended to, but also the contrast between the GT and local deraining features. Multi-deraining layers avoid the simple and crude learning of one composite function, such as the loss calculation of EfDeRain [23]. In addition, the combination of multi-deraining layers helps merge the detailed and abstract information in the underlying and head networks and promotes the inherent advantages of CNNs. Comprehensively, our method is a multi-deraining structure that uses the pixelshuffle upsampling of the head networks and relatively simple underlying networks to avoid multi-stage strategies [11,12,13,14,15,21,22,39]. The multi-deraining structure differs from repetitive multi-strategy methods: our multiple deraining layers are used directly for the deraining output at the end of the global network, whereas repetitive multi-strategy methods enhance or accumulate feature information in the middle or global processing of the algorithm, as shown in Figure 1. The primary workflow of DAMNet is shown in Algorithm 1, which, from another angle, directly shows the avoidance of repetitive multi-stage methods containing repeated local down-up sampling and multiple information accumulations.
Algorithm 1: The deraining processing of DAMNet.
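Algorithm 1 appears in the paper only as a figure; the following Python sketch is our own hypothetical reading of the flow described above. The names backbone, stages, derain_heads, and dual_adjacent are assumptions, and each head is assumed to bring its branch to a common resolution so the channel-wise concatenation is valid:

```python
import torch

def damnet_forward(x, backbone, stages, derain_heads, dual_adjacent):
    """Hypothetical sketch of Algorithm 1: each deraining head lifts its stage
    features to a common resolution via pixelshuffle; derain_4 concatenates them."""
    rs = dual_adjacent(x)                    # reshaped container of indexed pixels
    feats, derains = backbone(x), []         # underlying network (9 conv layers)
    for stage, head in zip(stages, derain_heads):
        feats = stage(feats)                 # head-network stage (Figure 3)
        derains.append(head(feats))          # pixelshuffle branch: derain_0..derain_3
    d_total = torch.cat(derains, dim=1)      # derain_4: channel-wise concatenation
    return derains, d_total, rs
```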

3.2. The Dual Adjacent Method

The dual adjacent method preprocesses the degraded images to prepare the fusion with the deraining features produced by each deraining layer. The merge operation, the multiplication mentioned in Figure 2, opens adjustable space between the degraded pixels and the pixels predicted by the deraining layers, increasing the model's freedom to change. The overall workflow of the dual adjacent method is shown in the red frame of Figure 2a, consisting of dual adjacent indexing and reshape operations. The dual adjacent indexing operations fuse pixels from adjacent channels to prepare the processing of the corresponding deraining layer. New pixels are produced by dual adjacent indexing interpolation based on periodical and previous pixels from the dual adjacent channels, as shown in Figure 4. Before the production of each DAF_i, the fused pixels are stored in the container at a periodic interval step based on the previous features. The container is indexed to sample the pixels at similar steps, and new features are formed from the fused pixels, increasing the diversity of the indexed pixels and the capability for deep fusion. The initial location of the periodic interval steps ranges from 0 to step−1, which gains a dilation attribute and indirectly squeezes the computation. Finally, the features built on the new pixels are reshaped to be compatible with the multiplication with the multi-deraining layers.
The isometric interpolation used in the dual adjacent method, based on periodical and previous pixels, differs from PyTorch's up-sampling operations (F.interpolate and nn.Upsample). Those operations upsample features to twice or several times their size by interpolation, whereas our isometric interpolation does not upsample features: we use it to produce new pixels based on periodical and previous pixels, as shown in Figure 4. The interpolation for each generated new pixel is given in Equations (1) and (2), which display the dual adjacent operations for generating new pixels.
Comprehensively, the dual adjacent method avoids repetitive down-up sampling structures and multi-scale operations [10,11,21,22], in which CNN blocks with repetitive down-up sampling iteratively process features and enhanced images join the accumulation process. The dual adjacent method acts as a middle operation bridging an efficient connection between the existing U-Net and the deraining layers.
$$f_{R1} = \frac{x_2 - x}{x_2 - x_1} V_{11} + \frac{x - x_1}{x_2 - x_1} V_{21}, \qquad f_{R2} = \frac{x_2 - x}{x_2 - x_1} V_{12} + \frac{x - x_1}{x_2 - x_1} V_{22}$$
$$f'_{R1} = \frac{x'_2 - x'}{x'_2 - x'_1} V'_{11} + \frac{x' - x'_1}{x'_2 - x'_1} V'_{21}, \qquad f'_{R2} = \frac{x'_2 - x'}{x'_2 - x'_1} V'_{12} + \frac{x' - x'_1}{x'_2 - x'_1} V'_{22} \quad (1)$$
where $V_{11\ldots22}$ and $V'_{11\ldots22}$ represent the eight pixel values in the dual adjacent features, $x_{1,2}$ and $x'_{1,2}$ denote the locations of the corresponding eight pixels, and $x$ and $x'$ are the positions of the new pixels periodically projected onto the dual adjacent features.
$$\mathrm{Container} \xleftarrow{\text{indexing}} d \cdot \frac{(y_2 - y)\, f_{R1} + (y - y_1)\, f_{R2}}{y_2 - y_1} + d' \cdot \frac{(y'_2 - y')\, f'_{R1} + (y' - y'_1)\, f'_{R2}}{y'_2 - y'_1} \quad (2)$$
where $y_{1,2}$ and $y'_{1,2}$ denote the normalised positions of the new pixels projected onto the dual adjacent features, $y$ is the location of the new pixel to be generated, and $d$ and $d'$ are adjusting parameters that weigh the scales between the dual adjacent features, usually both set to 0.5. Each generated pixel is temporarily stored in a “Container”.
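The following is a loose sketch of Equations (1) and (2) under the stated defaults d = d' = 0.5. The pairing of neighbours at a fixed interval step is our simplifying assumption about the periodic indexing, not the authors' exact scheme:

```python
import torch

def dual_adjacent_container(f, f_prime, d=0.5, d_prime=0.5, step=2):
    """Fuse periodically indexed neighbour pairs from two adjacent features
    (Eq. (1)), then mix the two interpolants into the container (Eq. (2))."""
    v1, v2 = f[..., 0::step], f[..., 1::step]              # V_11, V_21 analogue
    w1, w2 = f_prime[..., 0::step], f_prime[..., 1::step]  # V'_11, V'_21 analogue
    f_r = 0.5 * (v1 + v2)               # Eq. (1): new pixel midway between x1 and x2
    f_r_prime = 0.5 * (w1 + w2)
    return d * f_r + d_prime * f_r_prime    # Eq. (2): pixels stored in the container

# toy usage: two adjacent 1x8x8 features
c = dual_adjacent_container(torch.randn(1, 8, 8), torch.randn(1, 8, 8))
```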

3.3. Joint Training Method

The loss calculation of DAMNet consists of two components, MSE and SSIM. PSNR and SSIM are popular criteria for measuring performance. As shown in Equation (3), PSNR is a function of MSE, and both measure the differences between individual pixels, but the computational complexity of PSNR is $O(\log N^2)$ times higher than that of MSE, leading to relatively low training speed. Under numerical normalisation, PSNR also tends to encounter non-differentiable cases. Therefore, we choose MSE as the loss function for measuring changes in individual pixels. Considering the instability of MSE during the initial training steps, we apply MSE only to the final deraining layer. SSIM measures the global difference between degraded images and inputs, and its relative stability makes it suitable for combining with each deraining layer. We therefore choose SSIM and MSE as the two components of the joint training method shown in Figure 2b, where multi-loss functions compute the losses of five deraining layers.
$$PSNR = f(MSE) = 10 \log_{10} \frac{N L^2}{\sum_{i}^{N} \left(x_i - y_i\right)^2} \quad (3)$$
where N is the number of pixels, L is the maximum pixel value, and $x_i$ and $y_i$ are the corresponding pixel values. PSNR is the peak signal-to-noise ratio, and SSIM is the structural similarity index measure.
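Written out, Equation (3) reduces to a one-line helper (max_val here plays the role of L):

```python
import math

def psnr_from_mse(mse, max_val=1.0):
    """PSNR as a function of MSE (Equation (3)) for pixel values in [0, max_val]."""
    return 10.0 * math.log10(max_val ** 2 / mse)

print(psnr_from_mse(1e-3))   # ~30 dB for a normalised MSE of 0.001
```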
The element of the multi-loss function used directly in each deraining layer is the structural similarity index measure (SSIM), shown in Equation (4) in the simplified form that holds under pixel-value normalisation. Because of data-type conversions, individual pixel values are inevitably affected, causing non-negligible changes to $c_1$ and $c_2$. Therefore, MSE is used in the final deraining layer to weaken the adverse effect of changing data types and to offset the overall computational complexity [23].
The deraining layers using pixelshuffle and the joint multi-loss function combining the losses of Figure 2 are inseparable; the joint training method rests on both. Deraining layers use pixelshuffle [38] to generate various deraining features, as shown in Figure 3. If we adopted the common up-sampling approach taken by recent deraining models, the number of pixels would be multiplied, imposing multiplied computational burdens on the following structures. Pixelshuffle orderly dislocates the original pixels to obtain deraining features of severalfold size while maintaining the original number of pixels, which makes the joint training method efficient. The calculation producing the deraining features is Equation (5), in which pixels are orderly dislocated along the height and width dimensions. DR denotes the generated deraining features prepared for the multi-loss computations.
EfDeRain [23] has an elaborate network and fast inference speed, but its feedback loss depends only on the final layer of the head network. Although this gives a relatively fast training speed, the “one-off” loss ignores the local patterns stemming from different features. The multi-loss function of the joint training method helps train the overall network and combine abstract and detailed features. The multi-loss function is defined in Equation (6), in which the coefficients of all SSIM [40] computations sum to one and SSIM and MSE [41] are globally equally weighted. Although the multi-loss function rests on multiple deraining layers, incurring more computation than EfDeRain, we use pixelshuffle [38] to maintain the original pixel volume during upsampling, so the following processing needs a quarter of the computation compared with the nn.Upsample used in EfDeRain [23]. The following experiments show that the multi-loss function and joint training based on multi-deraining layers improve performance on single and crossing datasets while maintaining high real-time speed.
$$SSIM_x = \frac{\left(2\,\overline{DF_x}\,\overline{GT} + c_1\right)\left(2\,\sigma_{DF_x GT} + c_2\right)}{\left(\overline{DF_x}^2 + \overline{GT}^2 + c_1\right)\left(\sigma_{DF_x}^2 + \sigma_{GT}^2 + c_2\right)}, \qquad MSE = \frac{1}{M \times N} \sum_{0 \le i \le N} \sum_{0 \le j \le M} \left(f_{ij} - f'_{ij}\right)^2 \quad (4)$$
where x ∈ {0, 1, 2, 3, 4} and $SSIM_x$ is the SSIM between the x-th DF and the inputs; the x-th DF is the prediction from one of the five deraining layers. $\overline{DF_x}$ and $\overline{GT}$ are the mean values of the x-th DF and the inputs, $\sigma_{DF_x}^2$ and $\sigma_{GT}^2$ are the corresponding variances, and $\sigma_{DF_x GT}$ is the covariance of the x-th DF and the inputs. $c_1$ and $c_2$ are two constants used to maintain stability. N and M are the height and width of the derained images or inputs, and $f_{ij} - f'_{ij}$ is the per-pixel difference entering the mean square error.
$$F = W_l \times \left(W_{l-1} \times F_{l-1} + b_{l-1}\right) + b_l, \qquad \mathbf{DR} = F[\ldots, ::2, 1::2] \uplus F[\ldots, 1::2, ::2] \uplus F[\ldots, ::2, ::2] \uplus F[\ldots, 1::2, 1::2] \quad (5)$$
where $W_l$, $F_l$, and $b_l$ represent the trainable kernel weights, the previous features, and the adjustable constants, with the same denotations applying to $W_{l-1}$, $F_{l-1}$, and $b_{l-1}$. ⊎ denotes the union set (∪) with interval steps, in which pixels are orderly dislocated along the height and width dimensions. $F[\ldots, ::2, 1::2]$ and the related expressions use Python indexing of the feature matrix. DR is the generated deraining feature prepared for the loss computations.
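The pixel-preserving property of Equation (5) is easy to verify; this toy check (ours) confirms that the four strided slices partition F without creating or losing pixels:

```python
import torch

F_l = torch.randn(1, 4, 8, 8)       # toy stand-in for F in Equation (5)

dr_parts = [F_l[..., ::2, 1::2],    # the four interleaved sub-grids of Eq. (5)
            F_l[..., 1::2, ::2],
            F_l[..., ::2, ::2],
            F_l[..., 1::2, 1::2]]

assert sum(p.numel() for p in dr_parts) == F_l.numel()  # no pixel created or lost
```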
$$Loss = -\sum_{x \in \{0,1,2,3\}} 0.1 \times SSIM_x\!\left(\mathbf{DR}_x \times RS,\ GT\right) - 0.5 \times SSIM_{total}\!\left(\mathbf{DR}_{total} \times RS,\ GT\right) + MSE \quad (6)$$
where $SSIM_{total}$ is the SSIM of derain_4 and the inputs, and $SSIM_x$ is the structural similarity of the x-th deraining layer, with x taking the four values that index the four deraining layers. $\mathbf{DR}_x$ and $\mathbf{DR}_{total}$ represent each deraining feature and the final concatenated deraining feature, RS is the reshaped feature produced by the dual adjacent operations, and GT denotes the clear, undegraded images.
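Equation (6) maps directly onto code. In the sketch below (ours), ssim stands for any callable returning the structural similarity (higher is better), the sign reconstruction follows Equation (6) as given above, and the tensor shapes are assumed broadcast-compatible:

```python
import torch
import torch.nn.functional as F

def joint_loss(derains, d_total, rs, gt, ssim):
    """Equation (6): negated 0.1-weighted per-layer SSIM terms, a negated
    0.5-weighted total term on the concatenated derain_4, plus final-layer MSE."""
    loss = -sum(0.1 * ssim(d * rs, gt) for d in derains)   # SSIM_x, x = 0..3
    loss = loss - 0.5 * ssim(d_total * rs, gt)             # SSIM_total
    return loss + F.mse_loss(d_total * rs, gt)             # MSE on the final output
```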

4. Experiments and Results

4.1. Settings

Experimental condition: The core software comprises PyTorch 1.7.1 and CUDA 10.2, and the core hardware is an Nvidia Titan V, whose performance is comparable to that of the NVIDIA Quadro P6000 GPU.
Datasets: Rain1400 [34], Rain100H [21], the Real-world SPA dataset [14], and the Raindrop dataset [42] are used to train and test the related models; each high real-time model trained on one of these datasets is retested on the other three, and the results directly demonstrate generalisation ability.
Training tricks: We train the proposed model for 5000 epochs with a batch size of 16 on Rain100H, 1200 epochs with a batch size of 16 on Rain1400, 5000 epochs with a batch size of 16 on Raindrops, and 15 epochs with a batch size of 12 on the Real-world dataset. The optimiser is Adam, and the learning rate is controlled by an adjusting function: it stays unchanged for the first half of training and then decreases linearly to zero at the end. No data augmentation or fancy training tricks are used; the training recipe is simple.
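The stated schedule (constant for the first half, then a straight-line decay to zero) corresponds to a LambdaLR factor in PyTorch; the base learning rate and the placeholder model below are our assumptions, not values reported in the paper:

```python
import torch

model = torch.nn.Conv2d(3, 3, 3)                      # placeholder model
total_epochs = 5000                                   # e.g., the Rain100H setting
opt = torch.optim.Adam(model.parameters(), lr=1e-3)   # base LR is an assumption

half = total_epochs // 2
sched = torch.optim.lr_scheduler.LambdaLR(
    opt, lambda e: 1.0 if e < half else 1.0 - (e - half) / (total_epochs - half))

# per epoch: train, call opt.step() inside the loop, then sched.step()
```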
Metrics: The metrics comprise objective and subjective evaluations. The objective metrics are SSIM and PSNR, which quantify the performance of deraining models numerically; both are popular and widely accepted, and larger values indicate better results. The PSNR calculation is shown in Equation (7). The subjective metrics [43], taken as an auxiliary way to further distinguish the results by eye, evaluate colour changes, preserved detail, and deformations, which should be considered because adverse effects must be carefully restrained during the deraining processing.
$$PSNR = 10 \cdot \log_{10}\!\left(\frac{MAX_I^2}{MSE}\right) = 20 \cdot \log_{10}\!\left(\frac{MAX_I}{\sqrt{MSE}}\right) \quad (7)$$
where $MAX_I$ is the maximum value of the pixels.

4.2. Comparison on Public Datasets

We compare DAMNet with several classic deraining methods on Rain1400 [34], Rain100H [21], and Real-world SPA [14]. All testing results are obtained after acceleration on a single GPU of comparable performance. We divide the algorithms into three categories based on the average time per derained image: high real-time (1000/fps ≤ 10 ms), real-time (10 ms < 1000/fps ≤ 40 ms), and low real-time (1000/fps > 40 ms).
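The average time per image underlying these categories can be measured as follows; this is a generic GPU-timing sketch of ours, not the authors' benchmarking code:

```python
import time
import torch

@torch.no_grad()
def mean_latency_ms(model, images, device="cuda"):
    """Average per-image latency in ms, i.e., the 1000/fps figure used above."""
    model = model.to(device).eval()
    for img in images[:5]:                 # warm-up to exclude one-off CUDA costs
        model(img.to(device))
    torch.cuda.synchronize()
    start = time.perf_counter()
    for img in images:
        model(img.to(device))
    torch.cuda.synchronize()
    return 1000.0 * (time.perf_counter() - start) / len(images)

# high real-time if mean_latency_ms(...) <= 10, i.e., >= 100 fps
```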
The comparison on Rain100H is shown in Table 1. Although DAMNet is a relatively simple, model-free network, it attains performance on par with RCDNet [10], an elaborate and intricate model built on inner dictionaries of rain traces. By avoiding repetitive multi-stages and exploiting the pixelshuffle structure of the head networks, DAMNet achieves high real-time inference, far faster than RCDNet, while maintaining impressive deraining effects. In contrast, EfDeRain [23] has a comparable inference speed, but DAMNet achieves better performance thanks to the joint training based on multi-loss functions. Although the five deraining layers incur more inference burden than the single deraining layer of EfDeRain, our model maintains high real-time speed by taking advantage of the dual adjacent method. In addition, we confirm that single-image deraining models have low inference speed even with GPU acceleration; since they belong to the low real-time series, when the exact inference speed cannot be confirmed, the time per image is conservatively assessed as “>40 ms”. Compared with the single-image deraining models [11,12,14,21], DAMNet achieves the fastest inference speed while producing impressive results, suggesting that our multi-deraining method is practically effective despite its relatively simple networks. The visual comparison in Figure 5 shows that DAMNet removes rain sufficiently while preserving visible detail. Overall, our model better balances inference speed and performance on Rain100H [21], whose rain backgrounds are more complex than those of Rain100L [21].
The performance on Rain1400 is shown in Table 2. MHEB [13] achieves the best results there, but its network disregards computational burden and considers only the single-image deraining effect, leading to a low inference speed. Conversely, DAMNet better balances deraining effect and inference speed with a model-free structure. Although our model outperforms EfDeRain only by a narrow margin on Rain1400 [34], the practical deraining effect of DAMNet is more visible and its restraint of adverse effects is recognisable, as shown in Figure 6: ear-colour changes and deformation appear in the results of most deraining models, whereas the deraining effect of DAMNet is satisfactory while these adverse effects are well restrained.
The results on the Real-world dataset are shown in Table 3. Although achieving the best deraining effect, RCDNet [10] removes rain streaks at a low inference speed even with GPU acceleration. DAMNet performs on par with MHEB [13] while balancing performance and inference speed, demonstrating the advantage of its structure. Although DAMNet outperforms EfDeRain [23] only by a narrow margin when judged solely by PSNR and SSIM, it performs better at constraining adverse deraining effects, as shown in Figure 7. Although relatively thicker greens may look visually better, refraining from adding a colour attribute that does not exist is another manifestation of an algorithm's ability: in Figure 7, DAMNet best maintains the original colours.
The practical effects of raindrop removal are shown in Table 4. Raindrops differ from Rain100H, Rain1400, and Real-world. Except for the proposed DAMNet and EfDeRain [23], the models in Table 4 are algorithms specific to raindrop removal. DAMNet and EfDeRain are applied directly to Raindrops and achieve satisfactory effects, even better than JORDERE [21] and EIGEN [44]. Although the exact inference speeds on our platform cannot be confirmed, we assign those models to “>40 ms”, considering their complex structures and the times reported in the published papers. Compared with EfDeRain, one of the high real-time models in Table 4, DAMNet achieves a better balance between performance and inference speed. The raindrop-removal results of DAMNet and EfDeRain [23] demonstrate that model-free structures, avoiding multi-strategies built on elaborate assumptions, have strong generalisation ability and enough freely adjustable capacity to fit the degradation pattern. Although both are model-free methods, DAMNet achieves higher SSIM and PSNR values than EfDeRain [23], which adopts the “one-off” loss. Besides, DAMNet curbs the side effects to the fullest extent and demonstrates satisfactory generalisation ability, as shown in Figure 8.
Comprehensively, DAMNet reaches high real-time speed on all four datasets while maintaining stable performance. Although the five deraining layers cost more computation than the single deraining layer of EfDeRain, DAMNet uses the pixelshuffle operation to form a fast U-Net and maintain a parallel inference speed with EfDeRain. Our model is among the most efficient on single datasets, where the training and testing images differ but stem from the same dataset. A visual comparison on single datasets is given in Figure 9: the deraining effects of DAMNet outperform EfDeRain, and the adverse deraining effects are better restrained.
Different datasets have different properties, and training and testing on different (crossing) datasets is a practical way to demonstrate the generalisation of high real-time deraining models. Therefore, we compare DAMNet with EfDeRain on crossing datasets in which the training and testing images derive from different datasets. The generalisation of our model outperforms EfDeRain [23] by an impressive margin, as shown in Table 5. Each model's suffix indicates its training dataset, and the values highlighted in black indicate where DAMNet outperforms EfDeRain under identical training conditions. DAMNet achieves a higher mean SSIM on all crossing datasets while keeping an overall advantage in PSNR. Comprehensively, our model performs relatively stably, which is why we adopt the dual adjacent indexing and joint training methods. Although the multi-deraining layers add an affordable computing burden compared with EfDeRain's single deraining layer, the generalisation ability of DAMNet improves in practice, demonstrating that our method is a more efficient model-free algorithm. Visualised results on crossing datasets are shown in Figure 10: although the derained images retain visible rain streaks, a clear tendency to remove rain tracks is evident.

4.3. Ablation Study

Rain100H is the dataset most severely disturbed by rain streaks compared with Rain1400 and Real-world, as reflected in Figure 5, Figure 6 and Figure 7. Ablations of DAMNet on Rain100H therefore evaluate the relationship between the dual adjacent method and the joint training method that computes the loss of each deraining layer. The ablation results on Rain100H are shown in Table 6. DAMNet_finalLoss scores lower than DAMNet, indicating that the “one-off” loss computation definitely weakens the deraining effect; this is also why DAMNet outperforms EfDeRain on single and crossing datasets. Comparing DAMNet_noDual with DAMNet confirms that the dual adjacent method offers more useful information than direct reshape operations: when the dual adjacent method is absent, the deraining layers alone can hardly maintain DAMNet's performance. Compared with DAMNet04, which keeps both the first and final deraining layers, the models that keep only the first or only the final deraining layer show an impressive decrease in deraining effect, dropping approximately 38.2% in PSNR, which indicates that the combination of the first and final deraining layers is critically important for DAMNet. Comparing DAMNet04 with DAMNet, abandoning the middle deraining layers weakens the performance, a phenomenon that also occurs in DAMNet_noMSE and DAMNet_noSSIMtotal. The PSNR gap between DAMNet_noDual and DAMNet exceeds 1%, larger than the gaps in Table 1, Table 2, Table 3 and Table 4, showing that the dual adjacent method is also nonnegligible for improving performance. Comprehensively, DAMNet performs best, indicating that each deraining layer and the dual adjacent method are indispensable for the efficiency and robustness of DAMNet, and that the joint training of multi-loss over the combination of multi-deraining layers is critical to boosting performance.

5. Discussion

Rain streaks are a nonnegligible degradation factor for computer vision: the image-quality degradation destabilises advanced algorithms, such as those in camera-based autonomous driving or maritime video surveillance systems. Although some deraining methods maintain satisfactory deraining effects, and the multi-strategies used by single-image deraining models achieve impressive results, the corresponding algorithms target rain-streak removal in a single image, emphasising performance while neglecting computational burden. Model-free methods abandon these seemingly elaborate designs and release the constraints raised by multi-strategies. To reach high real-time speed for multi-image deraining, not just a single image, DAMNet is proposed based on the dual adjacent method and the joint training of multi-deraining layers. DAMNet improves the deraining effect while maintaining high real-time speed. However, DAMNet bears slightly more computational burden than EfDeRain, which adopts only one deraining layer. Although it achieves high real-time speed, DAMNet is not a lightweight deraining model and is not fully compatible with edge devices. And although the seemingly simple structure of DAMNet achieves high real-time inference compared with elaborately designed deraining models, the container prebuilt for the dual adjacent method trades additional memory for the avoidance of multiple upsampling operations. We plan to prepare more lightweight models compatible with edge devices while maintaining the balance between performance and inference speed. In addition, the side effects raised by over-deraining are non-negligible when enhancing image quality; a better balance between rain removal and its side effects is our further aim.

6. Conclusions

To reach high real-time speed for multi-image deraining, not just a single image, DAMNet is presented as an efficient deraining algorithm for multiple separate images that avoids the iterative multi-strategy methods used in single-image deraining. The dual adjacent indexing is introduced to improve the efficiency of new-pixel production and match the number of deraining features. The joint training method is proposed to jointly train DAMNet and improve deraining efficiency while ensuring high speed. Experiments show that DAMNet runs at a high real-time speed of more than 200 fps and maintains an immediate and stable deraining effect on single and crossing datasets. In addition, DAMNet has a practical rain-removal effect while suppressing the side effects of rain removal to a relatively best level. Comprehensively, DAMNet reaches the advanced level of highly efficient deraining models.

Author Contributions

P.Z. is the main contributor to this paper and related experiments. S.T. and H.Z., with equal guidance for this work, provide practical suggestions for writing expressions. Z.C. provides computational sources to assist the experiments. Y.L. is the corresponding author for this paper. All authors have read and agreed to the published version of the manuscript.

Funding

Jianping Wu (Department of Civil Engineering, Tsinghua University) provides very important research funding and a work platform to support this work. This work is supported in part by Science and Technology Development Fund of Macau: 0025/2019/AKP, 0004/2020/A1, 0070/2021/AMJ, and Guangdong Provincial Key R&D Programme: 2019B010148001.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Li, Y.; Tan, R.T.; Guo, X.J.; Lu, J.B.; Brown, M.S. Rain Streak Removal Using Layer Priors. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 2736–2744.
  2. Garg, K.; Nayar, S.K. When does a camera see rain? In Proceedings of the 10th IEEE International Conference on Computer Vision (ICCV 2005), Washington, DC, USA, 17–20 October 2005; pp. 1067–1074.
  3. Luo, Y.; Xu, Y.; Ji, H. Removing rain from a single image via discriminative sparse coding. In Proceedings of the IEEE International Conference on Computer Vision, Washington, DC, USA, 7–13 December 2015; pp. 3397–3405.
  4. Li, M.; Cao, X.; Zhao, Q.; Zhang, L.; Meng, D. Online rain/snow removal from surveillance videos. IEEE Trans. Image Process. 2021, 30, 2029–2044.
  5. Garg, K.; Nayar, S.K. Vision and rain. Int. J. Comput. Vis. 2007, 75, 3–27.
  6. Fu, X.; Qi, Q.; Zha, Z.J.; Zhu, Y.; Ding, X. Rain streak removal via dual graph convolutional network. In Proceedings of the AAAI Conference on Artificial Intelligence, Washington, DC, USA, 2–9 February 2021; pp. 1–9.
  7. Bahnsen, C.H.; Moeslund, T.B. Rain removal in traffic surveillance: Does it matter? IEEE Trans. Intell. Transp. Syst. 2018, 20, 2802–2819.
  8. Liang, X.D.; Wang, T.R.; Yang, L.N.; Xing, E.R. CIRL: Controllable Imitative Reinforcement Learning for Vision-Based Self-driving. In Proceedings of the 15th European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; Volume 11211, pp. 604–620.
  9. Kondapalli, C.P.T.; Vaibhav, V.; Konda, K.R.; Praveen, K.; Kondoju, B. Real-time rain severity detection for autonomous driving applications. In Proceedings of the 2021 IEEE Intelligent Vehicles Symposium (IV), Nagoya, Japan, 11–17 July 2021; pp. 1451–1456.
  10. Wang, H.; Xie, Q.; Zhao, Q.; Meng, D.Y. A Model-driven Deep Neural Network for Single Image Rain Removal. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 3100–3109.
  11. Cui, X.; Shang, W.; Ren, D.; Zhu, P.; Gao, Y. Semi-Supervised Single Image Deraining with Discrete Wavelet Transform. In PRICAI 2021: Trends in Artificial Intelligence; Springer: Cham, Switzerland, 2021; pp. 265–278.
  12. Du, Y.; Xu, J.; Zhen, X.; Cheng, M.M.; Shao, L. Conditional variational image deraining. IEEE Trans. Image Process. 2020, 29, 6288–6301.
  13. Chen, X.; Huang, Y.F.; Xu, L.; Soc, I.C. Multi-Scale Hourglass Hierarchical Fusion Network for Single Image Deraining. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; pp. 872–879.
  14. Wang, T.Y.; Yang, X.; Xu, K.; Chen, S.Z.; Zhang, Q.; Lau, R.W.H.; Soc, I.C. Spatial Attentive Single-Image Deraining with a High Quality Real Rain Dataset. In Proceedings of the 32nd IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 16–20 June 2019; pp. 12262–12271.
  15. Ren, D.W.; Zuo, W.M.; Hu, Q.H.; Zhu, P.F.; Meng, D.Y.; Soc, I.C. Progressive Image Deraining Networks: A Better and Simpler Baseline. In Proceedings of the 32nd IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 16–20 June 2019; pp. 3932–3941.
  16. Wei, W.; Meng, D.Y.; Zhao, Q.; Xu, Z.B.; Wu, Y.; Soc, I.C. Semi-supervised Transfer Learning for Image Rain Removal. In Proceedings of the 32nd IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 16–20 June 2019; pp. 3872–3881.
  17. Zhao, H.; Jia, J.; Koltun, V. Exploring self-attention for image recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 10076–10085.
  18. Dai, Y.M.; Gieseke, F.; Oehmcke, S.; Wu, Y.Q.; Barnard, K. Attentional Feature Fusion. In Proceedings of the IEEE Winter Conference on Applications of Computer Vision (WACV), Waikoloa, HI, USA, 3–8 January 2021; pp. 3559–3568.
  19. Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Proceedings of the Medical Image Computing and Computer-Assisted Intervention—MICCAI 2015, Munich, Germany, 5–9 October 2015; pp. 234–241.
  20. Li, X.; Wu, J.L.; Lin, Z.C.; Liu, H.; Zha, H.B. Recurrent Squeeze-and-Excitation Context Aggregation Net for Single Image Deraining. In Proceedings of the 15th European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; Volume 11211, pp. 262–277.
  21. Yang, W.; Tan, R.T.; Feng, J.; Guo, Z.; Yan, S.; Liu, J. Joint rain detection and removal from a single image with contextualized deep networks. IEEE Trans. Pattern Anal. Mach. Intell. 2019, 42, 1377–1393.
  22. Chen, W.T.; Huang, Z.K.; Tsai, C.C.; Yang, H.H.; Ding, J.J.; Kuo, S.Y. Learning Multiple Adverse Weather Removal via Two-Stage Knowledge Learning and Multi-Contrastive Regularization: Toward a Unified Model. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 19–20 June 2022; pp. 17653–17662.
  23. Guo, Q.; Sun, J.Y.; Juefei-Xu, F.; Ma, L.; Xie, X.F.; Feng, W.; Liu, Y.; Zhao, J.J. EfficientDeRain: Learning Pixel-wise Dilation Filtering for High-Efficiency Single-Image Deraining. In Proceedings of the 35th AAAI Conference on Artificial Intelligence, Virtually, 2–9 February 2021; Volume 35, pp. 1487–1495.
  24. Garg, K.; Nayar, S.K.; Society, I.C. Detection and removal of rain from videos. In Proceedings of the Conference on Computer Vision and Pattern Recognition, Washington, DC, USA, 27 June–2 July 2004; pp. 528–535.
  25. Gu, S.H.; Meng, D.Y.; Zuo, W.M.; Zhang, L. Joint Convolutional Analysis and Synthesis Sparse Representation for Single Image Layer Separation. In Proceedings of the 16th IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 1717–1725.
  26. Barnum, P.C.; Narasimhan, S.; Kanade, T. Analysis of rain and snow in frequency space. Int. J. Comput. Vis. 2010, 86, 256–274.
  27. Chen, Y.L.; Hsu, C.T. A Generalized Low-Rank Appearance Model for Spatio-Temporally Correlated Rain Streaks. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Sydney, Australia, 1–8 December 2013; pp. 1968–1975.
  28. Kim, J.H.; Sim, J.Y.; Kim, C.S. Video deraining and desnowing using temporal correlation and low-rank matrix completion. IEEE Trans. Image Process. 2015, 24, 2658–2670.
  29. Li, M.H.; Xie, Q.; Zhao, Q.; Wei, W.; Gu, S.H.; Tao, J.; Meng, D.Y. Video Rain Streak Removal By Multiscale Convolutional Sparse Coding. In Proceedings of the 31st IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–23 June 2018; pp. 6644–6653.
  30. Chen, J.; Tan, C.H.; Hou, J.; Chau, L.P.; Li, H. Robust Video Content Alignment and Compensation for Rain Removal in a CNN Framework. In Proceedings of the 31st IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–23 June 2018; pp. 6286–6295.
  31. Liu, J.; Yang, W.; Yang, S.; Guo, Z. D3r-net: Dynamic routing residue recurrent network for video rain removal. IEEE Trans. Image Process. 2018, 28, 699–712.
  32. Liu, J.Y.; Yang, W.H.; Yang, S.; Guo, Z.M. Erase or Fill? Deep Joint Recurrent Rain Removal and Reconstruction in Videos. In Proceedings of the 31st IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–23 June 2018; pp. 3233–3242.
  33. Fu, X.; Huang, J.; Ding, X.; Liao, Y.; Paisley, J. Clearing the Skies: A Deep Network Architecture for Single-Image Rain Removal. IEEE Trans. Image Process. 2017, 26, 2944–2956.
  34. Fu, X.; Huang, J.; Zeng, D.; Huang, Y.; Ding, X.; Paisley, J. Removing Rain from Single Images via a Deep Detail Network. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 1715–1723.
  35. Ding, X.; Chen, L.; Zheng, X.; Huang, Y.; Zeng, D. Single image rain and snow removal via guided L0 smoothing filter. Multimed. Tools Appl. 2016, 75, 2697–2712.
  36. Voronin, V.; Semenishchev, E.; Zhdanova, M.; Sizyakin, R.; Zelenskii, A. Rain and snow removal using multi-guided filter and anisotropic gradient in the quaternion framework. In Artificial Intelligence and Machine Learning in Defense Applications; SPIE: Bellingham, WA, USA, 2019; Volume 11169.
  37. Kim, J.H.; Lee, C.; Sim, J.Y.; Kim, C.S. Single-image deraining using an adaptive nonlocal means filter. In Proceedings of the 20th IEEE International Conference on Image Processing (ICIP), Melbourne, Australia, 15–18 September 2013; pp. 914–917.
  38. Shi, W.; Caballero, J.; Huszar, F.; Totz, J.; Aitken, A.P.; Bishop, R.; Rueckert, D.; Wang, Z. Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 1874–1883.
  39. Wang, K.; Wang, M. Multistage Feature Complimentary Network for Single-Image Deraining. J. Robot. 2021, 2021.
  40. Wang, Z.; Bovik, A.C.; Sheikh, H.R.; Simoncelli, E.P. Image quality assessment: From error visibility to structural similarity. IEEE Trans. Image Process. 2004, 13, 600–612.
  41. Zhao, H.; Gallo, O.; Frosio, I.; Kautz, J. Loss functions for image restoration with neural networks. IEEE Trans. Comput. Imaging 2016, 3, 47–57.
  42. Qian, R.; Tan, R.T.; Yang, W.H.; Su, J.J.; Liu, J.Y. Attentive Generative Adversarial Network for Raindrop Removal from A Single Image. In Proceedings of the 31st IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–22 June 2018; pp. 2482–2491.
  43. Baudin, E.; Bucher, F.X.; Chanas, L.; Guichard, F. DXOMARK objective video quality measurements. Electron. Imaging 2020, 2020, 166.
  44. Eigen, D.; Krishnan, D.; Fergus, R. Restoring An Image Taken Through a Window Covered with Dirt or Rain. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Sydney, Australia, 1–8 December 2013; pp. 633–640.
  45. Xia, H.; Lan, Y.; Song, S.; Li, H. Raindrop removal from a single image using a two-step generative adversarial network. Signal Image Video Process. 2022, 16, 677–684.
  46. Zhang, K.; Li, D.; Luo, W.; Ren, W. Dual Attention-in-Attention Model for Joint Rain Streak and Raindrop Removal. IEEE Trans. Image Process. 2021, 30, 7608–7619.
  47. Isola, P.; Zhu, J.Y.; Zhou, T.H.; Efros, A.A. Image-to-Image Translation with Conditional Adversarial Networks. In Proceedings of the 30th IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 5967–5976.
Figure 1. Multi-strategy methods utilised for single image deraining models. Rectangular modules represent one or more layers of neural networks. (a) Multi-sampling networks contain repetitive down-up sampling processes and related multiple stages [13,14,22]. (b) Multi-scale input networks are based on enhanced features or accumulated information [3,4,11,12,15,16,17,21,22]. (c) Multi-residual networks are based on various residual connections [6,10,13,15,21,22]. Although multi-stragedy methods have visible deraining performance, elaborately designed structures, suitable for a single image, rarely reach real-time inference speed.
Figure 2. The formalisation of the proposed model-free algorithm. (a) is the indexing operation utilising dual adjacent channels, whose computation is shown in detail in Figure 4; "G" denotes the reshaped features from the ground truth. (b) is the training feedback mechanism, which calculates the loss from each deraining layer and the reshaped feature obtained from (a). GT is the clear image corresponding to the Input. "D" represents the channel-wise concatenation of the deraining features "d1, d2, d3, and d4"; "nn.Concated()" in the figure labels this concatenation (torch.cat in PyTorch). SSIM_i and SSIM_total are loss terms based on the structural similarity index measure (SSIM), and MSE is the global loss term based on the mean square error (MSE).
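The joint multi-loss in (b) can be sketched in a few lines of PyTorch. This is an illustration under our own assumptions rather than the authors' released code: the `ssim` helper from the third-party pytorch_msssim package stands in for the paper's SSIM criterion, the terms are weighted uniformly, and the ground-truth feature G is assumed to be repeated channel-wise to match the concatenated features D.

```python
import torch
import torch.nn.functional as F
from pytorch_msssim import ssim  # third-party SSIM implementation, assumed available


def joint_multi_loss(derain_feats, G, derained, gt):
    """derain_feats: outputs d1..d4 of the four deraining layers, each (N, C, H, W);
    G: reshaped ground-truth feature matching each d_i; derained: the final
    derained image; gt: the clear ground-truth image."""
    # Per-layer SSIM_i terms (SSIM is a similarity, so 1 - SSIM acts as a loss).
    loss = sum(1.0 - ssim(d, G, data_range=1.0) for d in derain_feats)
    # SSIM_total on the channel-wise concatenation of d1..d4 (torch.cat),
    # compared against G repeated along the channel dimension.
    D = torch.cat(derain_feats, dim=1)
    G_rep = G.repeat(1, len(derain_feats), 1, 1)
    loss = loss + (1.0 - ssim(D, G_rep, data_range=1.0))
    # Global MSE term between the final derained image and the ground truth.
    return loss + F.mse_loss(derained, gt)
```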
Figure 3. The structure of DAMNet.
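Since the multi-deraining layers rely on the pixelshuffle operation [38], a minimal sketch of one such layer head may help. The channel width and upscale factor below are illustrative assumptions, not DAMNet's exact configuration.

```python
import torch
import torch.nn as nn


class DerainHead(nn.Module):
    """Maps C-channel features to an upscaled 3-channel derained output."""

    def __init__(self, in_ch: int = 32, scale: int = 2):
        super().__init__()
        # Produce 3 * scale^2 channels, then rearrange them into a
        # (3, H*scale, W*scale) image via sub-pixel convolution [38].
        self.conv = nn.Conv2d(in_ch, 3 * scale ** 2, kernel_size=3, padding=1)
        self.shuffle = nn.PixelShuffle(scale)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.shuffle(self.conv(x))


# Usage: four such heads would yield the deraining features d1..d4.
# DerainHead()(torch.randn(1, 32, 64, 64)).shape == (1, 3, 128, 128)
```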
Figure 4. New pixels produced by the dual adjacent method.
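As rough intuition for the dual adjacent indexing (our own toy approximation, not the paper's exact computation, which Figure 4 depicts), each channel can be merged with the features indexed from its two adjacent channels in the previous layer:

```python
import torch


def dual_adjacent_index(feat: torch.Tensor) -> torch.Tensor:
    """feat: (N, C, H, W). Returns the features merged with their two
    channel-adjacent neighbours (edge channels clamped)."""
    n, c, h, w = feat.shape
    idx = torch.arange(c, device=feat.device)
    left = feat[:, (idx - 1).clamp(min=0)]       # channel c-1
    right = feat[:, (idx + 1).clamp(max=c - 1)]  # channel c+1
    # Merge the two adjacent views with the original features -> (N, 3C, H, W).
    return torch.cat([left, feat, right], dim=1)
```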
Figure 5. The visualisation results on Rain100H. Judging from the preserved details of the face and hair, our model (DAMNet) performs impressively. In particular, DAMNet retains more textural detail in the hair under high-contrast conditions.
Figure 6. The visualisation results on Rain1400. Judging from the details and skin tones of the faces and ears, our model (DAMNet) performs satisfactorily while best suppressing adverse effects such as skin-tone change and ear deformation.
Figure 7. The deraining effects of the models above. The results appear similar at first glance, but DAMNet best preserves the original colour of the tree.
Figure 8. The visualisation results on Raindrops. Although ARNet [42] achieves the best PSNR in Table 4, the details and colour of the windows are best restored by DAMNet.
Figure 9. The visualisation comparison on Raindrops, Rain100H, Realworld, and Rain1400. DAMNet performs better than EfDeRain in terms of removing rain streaks and preventing side effects.
Figure 10. The comparison of DAMNet and EfDeRain [23] on the crossing-rain datasets, in which DAMNet and EfDeRain share the same training settings while the training and testing images are drawn from different datasets. Each model's suffix names the dataset on which it was trained. The larger images give a visual demonstration of rain removal, and the zoomed crops further highlight the final effect. The first row shows EfDeRain's results; the second shows DAMNet's. Although the derained images still contain visible rain streaks, DAMNet's stronger tendency to remove rain traces is clearly identifiable.
Table 1. Experimental comparison on Rain100H [21].
| Derainer | PSNR (dB) | SSIM | Speed (ms) |
|---|---|---|---|
| DAMNet | 31.49 | 0.914 | 4.16 |
| EfDeRain [23] | 30.35 | 0.883 | 4.05 |
| AIDDWT [11] | 28.90 | 0.910 | 504.12 |
| RCDNet [10] | 31.50 | 0.910 | 424.80 |
| CVID [12] | 27.90 | 0.855 | 284.80 |
| JORDERE [21] | 30.10 | 0.890 | 14.40 |
| SPANet [14] | 25.01 | 0.825 | 10.05 |
| PReNet [15] | 30.12 | 0.907 | 50.01 |
| SIRR [16] | 22.50 | 0.725 | 354.80 |
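For reference, the PSNR values in Tables 1–4 follow the standard definition PSNR = 10·log10(MAX²/MSE); a minimal NumPy version (the textbook formula, not code from the paper) is shown below. The speed column converts directly to throughput: at 4.16 ms per image, DAMNet runs at roughly 1000/4.16 ≈ 240 fps, consistent with the real-time claim.

```python
import numpy as np


def psnr(derained: np.ndarray, ground_truth: np.ndarray, max_val: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between a derained image and its ground truth."""
    mse = np.mean((derained.astype(np.float64) - ground_truth.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)
```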
Table 2. Experimental comparison on Rain1400 [34].
| Derainer | PSNR (dB) | SSIM | Speed (ms) |
|---|---|---|---|
| DAMNet | 32.40 | 0.931 | 4.40 |
| EfDeRain [23] | 32.29 | 0.927 | 4.06 |
| MHEB [13] | 33.51 | 0.954 | >40 |
| CKT&MCR [22] | 33.13 | 0.930 | >40 |
| AIDDWT [11] | 31.00 | 0.925 | 580.21 |
| RCDNet [10] | 33.14 | 0.947 | 549.00 |
| CVID [12] | 29.10 | 0.937 | 240.90 |
| JORDERE [21] | 32.04 | 0.930 | 11.22 |
| SPANet [14] | 29.80 | 0.914 | 8.12 |
| PReNet [15] | 32.34 | 0.945 | 60.31 |
| SIRR [16] | 28.51 | 0.890 | 400.00 |
Table 3. Experimental comparison on Realworld [14].
| Derainer | PSNR (dB) | SSIM | Speed (ms) |
|---|---|---|---|
| DAMNet | 42.48 | 0.984 | 3.70 |
| EfDeRain [23] | 41.09 | 0.982 | 3.62 |
| MHEB [13] | 42.72 | 0.987 | >40 |
| RCDNet [10] | 41.20 | 0.982 | 425.50 |
| JORDERE [21] | 40.80 | 0.981 | 18.10 |
| SPANet [14] | 40.22 | 0.980 | 12.23 |
| PReNet [15] | 40.10 | 0.981 | 75.30 |
| SIRR [16] | 35.50 | 0.943 | 362.00 |
Table 4. Experimental comparison on Raindrops [42].
| Method | PSNR (dB) | SSIM | Speed (ms) |
|---|---|---|---|
| DAMNet | 29.69 | 0.911 | 4.10 |
| EfDeRain [23] | 28.48 | 0.897 | 4.06 |
| HSNet [45] | 26.73 | 0.831 | >40 |
| DAM [46] | 30.26 | 0.914 | >40 |
| ARNet [42] | 31.57 | 0.902 | >40 |
| JORDERE [21] | 27.52 | 0.824 | 12.45 |
| EIGEN [44] | 28.59 | 0.673 | >40 |
| Pix2Pix [47] | 30.14 | 0.829 | >40 |
Table 5. The comparison on crossing-rain datasets. Each model's suffix names its training dataset. DAMNet is the proposed multi-deraining model, and EfDeRain [23] is the model with only one deraining layer. Numbers highlighted in bold denote that DAMNet outperforms EfDeRain on the same crossing datasets under the same conditions.
| Model | Rain100H (PSNR) | Rain1400 (PSNR) | SPA (PSNR) | Raindrops (PSNR) | MEAN |
|---|---|---|---|---|---|
| DAMNet_rain100H | **31.49** | **26.36** | **32.89** | **23.43** | **28.543** |
| EfDeRain_rain100H | 30.35 | 25.55 | 31.83 | 23.18 | 27.728 |
| DAMNet_rain1400 | **15.06** | **32.38** | 30.71 | 23.46 | 25.403 |
| EfDeRain_rain1400 | 14.73 | 32.29 | 31.08 | 23.52 | 25.405 |
| DAMNet_SPA | 12.96 | **25.48** | **42.48** | 23.50 | **26.105** |
| EfDeRain_SPA | 12.97 | 25.29 | 41.09 | 24.13 | 25.870 |
| DAMNet_raindrops | **12.53** | **23.05** | **27.21** | **29.69** | **23.120** |
| EfDeRain_raindrops | 11.89 | 21.60 | 26.30 | 28.48 | 22.068 |

| Model | Rain100H (SSIM) | Rain1400 (SSIM) | SPA (SSIM) | Raindrops (SSIM) | MEAN |
|---|---|---|---|---|---|
| DAMNet_rain100H | **0.914** | **0.865** | **0.933** | **0.823** | **0.884** |
| EfDeRain_rain100H | 0.8834 | 0.854 | 0.930 | 0.818 | 0.871 |
| DAMNet_rain1400 | **0.460** | **0.931** | 0.924 | 0.817 | **0.783** |
| EfDeRain_rain1400 | 0.444 | 0.927 | 0.926 | 0.820 | 0.779 |
| DAMNet_SPA | 0.398 | **0.820** | **0.984** | **0.822** | **0.756** |
| EfDeRain_SPA | 0.399 | 0.811 | 0.9825 | 0.821 | 0.753 |
| DAMNet_raindrops | **0.353** | **0.752** | **0.877** | **0.911** | **0.723** |
| EfDeRain_raindrops | 0.352 | 0.741 | 0.873 | 0.8971 | 0.716 |
DAMNet_rain100H and EfDeRain_rain100H mean that the training images of DAMNet and EfDeRain are derived from Rain100H; Raindrops (PSNR) and Raindrops (SSIM) denote results obtained when testing on Raindrops [42]. The other names follow the same convention.
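As a sanity check on Table 5, the MEAN column is simply the arithmetic mean of the four per-dataset scores, e.g., for the PSNR row of DAMNet_rain100H:

```python
# Rain100H, Rain1400, SPA, Raindrops
scores = [31.49, 26.36, 32.89, 23.43]
mean = sum(scores) / len(scores)
print(mean)  # ≈ 28.5425, reported as 28.543 in the table
```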
Table 6. The ablation of DAMNet on Rain100H, examining the contribution and interaction of the proposed components.
| Metric | DAMNet | DAMNetfinalLoss | DAMNet_noDual | DAMNet_noMSE | DAMNet_noSSIMtotal | DAMNet1234 |
|---|---|---|---|---|---|---|
| PSNR | 31.49 | 30.50 | 30.36 | 29.50 | 30.10 | 18.28 |
| SSIM | 0.914 | 0.901 | 0.885 | 0.880 | 0.890 | 0.565 |

| Metric | DAMNet34 | DAMNet0123 | DAMNet012 | DAMNet01 | DAMNet04 | DAMNet234 |
|---|---|---|---|---|---|---|
| PSNR | 18.12 | 18.46 | 18.30 | 17.912 | 26.65 | 18.26 |
| SSIM | 0.501 | 0.588 | 0.5838 | 0.5285 | 0.8403 | 0.55 |
DAMNet_noDual is the model with the dual adjacent indexing operation removed, passing features directly to the reshaping and merging stage. DAMNet_noMSE and DAMNet_noSSIMtotal omit the MSE and SSIM_total terms, respectively, from the loss computation of the final deraining layer. DAMNetfinalLoss is the proposed deraining model with a single, one-off loss computation, identical to that of EfDeRain. Models with numeric suffixes are DAMNet variants retaining only the corresponding deraining layers.