Two-Stage and Two-Channel Attention Single Image Deraining Network for Promoting Ship Detection in Visual Perception System

Liu, Ting; Zhou, Baijun; Luo, Peiqi; Zhang, Yuxin; Niu, Longhui; Wang, Guofeng

doi:10.3390/app12157766


by Ting Liu *, Baijun Zhou, Peiqi Luo, Yuxin Zhang, Longhui Niu and Guofeng Wang

College of Marine Electrical Engineering, Dalian Maritime University, Dalian 116026, China

* Author to whom correspondence should be addressed.

Appl. Sci. 2022, 12(15), 7766; https://doi.org/10.3390/app12157766

Submission received: 5 July 2022 / Revised: 28 July 2022 / Accepted: 29 July 2022 / Published: 2 August 2022



Featured Application

Environmental perception for unmanned surface vessel (USV).

Abstract

Image deraining restores the visual quality of images and thereby promotes ship detection in the visual perception systems of unmanned surface vessels. However, because they capture insufficient rain streak features and global information, current image deraining methods often leave residual rain streaks and blur the image. The visual perception system captures the same useful information on rainy and hazy days; only the way in which the image degrades is different. In addition, rain is usually accompanied by haze. Motivated by these observations, a two-stage and two-channel attention single image deraining network is proposed in this paper. Firstly, a subpixel convolution up-sampling module is introduced to increase the range of captured features and improve the image clarity. Secondly, the attention mechanism is integrated with a pyramid multi-scale pooling layer, so that the network can accumulate context information in a local-to-global way and avoid the loss of global information. In addition, a new composite loss function is designed, in which a regularization term is introduced to maintain smoothness and a perceptual loss function is employed to overcome the problem of large differences in the output of the loss function due to outliers. Extensive experimental results on both synthetic and real-world datasets demonstrate the superiority of our model over other state-of-the-art methods in both quantitative assessments and visual quality. Furthermore, when the proposed deraining network is incorporated into the visual perception system, the detection accuracy of ships on rainy seas is effectively improved.

Keywords:

image deraining; image dehazing; convolutional neural network; subpixel convolution module

1. Introduction

Adverse weather conditions such as rain, snow, and haze can alter the contents and colors of digital images and thus degrade their visual quality [1,2]. This hampers the tasks performed by the visual perception systems of unmanned surface vessels (USVs), such as object detection, segmentation, tracking, surveillance [3], and driving assistance [4], because the underlying algorithms are mostly developed under clear weather conditions [5]. In particular, rain streaks usually have multiple directions and present different intensities in digital images [1]. Large water particles from the rain streaks randomly occlude the scene, resulting in a locally dense negative impact on the image [6]. The rainy image is commonly formulated as:

D = I + R, \tag{1}

where rain image D can be expressed using a linear combination of rain-free background image I and rain streak R. The ultimate goal of deraining is to recover clean image I by removing rain streak R from D. Obviously, image deraining is a highly ill-posed problem [7], because we cannot find the unique solution of I from the given D [6]. In addition, fine particles generated from the rain streaks distort and blur the image. Hence, it is an essential and challenging task to remove the rain streaks from the image so vision systems will be robust to the maritime environment [2].
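The ill-posedness of Equation (1) can be illustrated with a minimal sketch. The three-pixel "images" and the function name `compose` below are hypothetical and chosen only for illustration: two different background/rain pairs produce the same rainy observation, so D alone does not determine I.

```python
def compose(background, rain):
    """Additive rain model of Equation (1): D = I + R, element-wise."""
    return [i + r for i, r in zip(background, rain)]

# Two different (background, rain) decompositions (illustrative values).
I_a, R_a = [0.20, 0.50, 0.30], [0.00, 0.40, 0.00]
I_b, R_b = [0.10, 0.60, 0.30], [0.10, 0.30, 0.00]

D_a = compose(I_a, R_a)
D_b = compose(I_b, R_b)

# Both pairs explain the same observation up to floating-point rounding.
same_observation = all(abs(a - b) < 1e-12 for a, b in zip(D_a, D_b))
print(same_observation, I_a != I_b)
```

Priors (in traditional methods) or learned mappings (in CNN-based methods) are what select one plausible background among the many that fit the observation.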

Early methods are mostly based on subjective prior knowledge and the linear characteristics of rainy images [1,8,9,10], and they work by estimating the rain streaks in the image. Prior information about the background image and the rain streaks is designed to constrain the solution space [11]. A single image deraining algorithm based on a nonlinear physical raindrop model is designed in [12], which can accurately separate the image layer from the rain layer. However, this method also shows some limitations, such as artifacts around the areas containing rain streaks and incomplete raindrop removal. Methods based on sparse coding and dictionary learning regard deraining as the process of dividing the image into a rain-free layer and a rain layer. In [13], the rain image is decomposed into low-frequency and high-frequency parts by using bilateral filters, and the high-frequency part is then decomposed into rain components and non-rain components through sparse coding and dictionary learning, so that rain streaks can be removed from the image. However, deraining algorithms based on sparse coding and dictionary learning blur some details of the original image while removing rain streaks. In summary, these prior-based methods are often ineffective in real scenes, especially when the rain is highly diverse, because prior knowledge is difficult to obtain precisely. Besides, a prior-based method can be regarded as the optimization of a cost function, which usually has a high time cost [7]. Thus, it cannot easily meet the design requirements of mobile and embedded vision applications [7].

Recently, with the rise of deep learning and the growth of computing performance, supervised convolutional neural network (CNN) algorithms have emerged with promising performance by designing various network structures with unique nonlinear mapping transformations [14,15,16,17]. These works can automatically extract the features of the rain layer and separate them from the original image to obtain a clear background [7]. The network-based approach is data-driven; its quality is determined by the data, the rain model, and the network architecture [11]. Popular deep networks are generally constructed with complicated branches and numerous layers to complete different deraining tasks and further improve performance. The LPNet deraining network designed in [18] is based on fully supervised learning; a Gaussian pyramid is combined with the network to improve the deraining effect and make the model lightweight. The network was trained with synthetic datasets and achieved good deraining performance. However, it still has problems in image color restoration. To solve this problem, a deraining network based on a multi-scale progressive fusion strategy was designed in [19], which gradually fuses the complementary information of similar rain patterns at different scales. It therefore retains the color contrast of the image more comprehensively while removing rain streaks. In addition, the attention mechanism also plays an important role in image rain removal, because it makes the network adaptively pay more attention to important features. For example, both [20] and [21] add an attention mechanism to the network to improve its performance. At the same time, MATNet [22] proposed an asymmetric attention block called motion-attentive transition (MAT), which can produce accurate segmentation with clear boundaries and may therefore be helpful in image rain removal. The performance of fully supervised methods depends heavily on the datasets. However, it is difficult to obtain a large-scale real dataset for network training. Thus, semi-supervised deraining methods add real data for joint training on top of full supervision, such as the most commonly used semi-supervised transfer learning algorithm SIRR [23]. Weakly supervised networks such as those in [24,25] are trained on real rainy datasets, and the corresponding rain-free images are produced with image-editing software. Introducing real rainy images to train the network model significantly alleviates the problem of insufficient training samples. However, the quality of the dataset is still a key factor in the performance of the model.

Fortunately, transfer learning is commonly used to solve small-sample learning problems. Since dehazing and deraining algorithms aim at the same kind of final image, it is of great interest to study how deraining can be implemented effectively with a dehazing network based on the idea of transfer learning. Many scholars have used image dehazing networks to study image deraining algorithms [26,27,28]. Based on the above discussion, a two-channel and two-stage image deraining network is proposed in this paper. The remainder of this paper is organized as follows. Section 2 introduces the overall framework of our network and its details. In Section 3, the experimental results show the superiority of our method over state-of-the-art methods. The ship detection performance with the deraining method is analyzed in Section 4. Finally, conclusions are presented in Section 5.

2. Materials and Methods

In this section, the newly designed deraining network is introduced in detail. As the rain removal network in this paper is designed on the basis of our previous dehazing network, the network structure of the original dehazing network and all the modifications are demonstrated.

2.1. Double Channel and Double Stage Image Dehazing Network

The framework of our previous two-channel and two-stage image dehazing network (DTDNet) is shown in Figure 1. The main modules of the network structure include the RDB module, the basic module, up-down sampling, and the attention mechanism. Specifically, the RDB module is mainly composed of two convolution layers of different sizes to better obtain the hazy image features. In order to reuse features and avoid information loss, the output of each convolution layer of the RDB module is taken as the input of the subsequent convolution layer. A convolution layer of size 3 and a ReLU activation function constitute the basic module. Bilinear interpolation up-sampling is introduced to expand the network channel, and the attention mechanism is used to weight the expanded features to improve the effectiveness of the network. The visual effect and quantitative indexes demonstrate that the designed dehazing network has strong dehazing performance. For more details, please refer to our paper [29].

2.2. The Proposed Deraining Network

On the basis of the above dehazing network, an image deraining network is proposed in this paper. In particular, the bilinear interpolation up-sampling module is replaced with subpixel convolution up-sampling to further improve the image resolution. In addition, a multi-scale pyramid pooling layer is introduced to preserve the global information of the features processed by the attention mechanism and to avoid the loss of detail information. Finally, the total variation (TV) loss function is introduced to better constrain the noise of the image. The whole structure of the designed image deraining network is shown in Figure 2.

2.2.1. Subpixel Convolution Up-Sampling

At the input stage of the rain removal network designed in this paper, the input image is uniformly segmented into ten small images to enlarge the patterns and details of the rain streaks in the image, allowing the network to extract and utilize more global features. However, the resolution is reduced as the rain streaks are enlarged. Therefore, subpixel convolution up-sampling is incorporated to replace the bilinear interpolation of DTDNet and improve the resolution of the image.

Subpixel convolution up-sampling is an end-to-end up-sampling module. By setting the up-sampling ratio, it obtains a high-resolution image at a specified rate from a low-resolution image. Typical up-sampling fills in new pixels from neighboring pixels, which mainly takes spatial factors into account and ignores the channel dimension; in contrast, the subpixel convolution up-sampling module also uses the information in the convolution channel dimension. It can achieve super resolution and generate more realistic images, which effectively solves the problem of reduced image resolution in image deraining [30]. The structure of subpixel convolution up-sampling is shown in Figure 3.
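The rearrangement step at the core of subpixel convolution can be sketched in a few lines of plain Python. This is an illustrative sketch, not the network's implementation: the convolution that produces the C·r² channels is omitted, the function name `pixel_shuffle` is hypothetical, and the channel ordering is assumed to follow the convention of PyTorch's `nn.PixelShuffle`.

```python
def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) nested list into (C, H*r, W*r).

    Each group of r*r channels supplies the r x r sub-pixels of one
    output channel, so channel information becomes spatial resolution.
    """
    channels_in = len(x)
    if channels_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c_out = channels_in // (r * r)
    h, w = len(x[0]), len(x[0][0])
    out = [[[0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(r):          # vertical sub-pixel offset
            for j in range(r):      # horizontal sub-pixel offset
                src = x[c * r * r + i * r + j]
                for y in range(h):
                    for z in range(w):
                        out[c][y * r + i][z * r + j] = src[y][z]
    return out

# Four 1x2 feature maps become one 2x4 map for an up-sampling ratio of 2.
features = [[[1, 2]], [[3, 4]], [[5, 6]], [[7, 8]]]
upsampled = pixel_shuffle(features, 2)
print(upsampled)  # [[[1, 3, 2, 4], [5, 7, 6, 8]]]
```

Unlike interpolation, no output value is computed from spatial neighbors; every output pixel is taken from a learned channel, which is why the preceding convolution determines the quality of the up-sampled result.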

Subpixel convolution up-sampling is used in both stages. In contrast to the first stage, however, the second stage does not use the attention mechanism and the pyramid multi-scale pooling layer to fuse the output features of the two channels again. Moreover, the number of RDB modules in the extension channel is reduced and the connection mode of this module is changed to a step-by-step connection. This structure reduces the depth of the network and, combined with the activation function, effectively alleviates overfitting.

2.2.2. Attention Mechanism and Multi-Scale Pooling Layer of Pyramid

The input image is passed through two channels, and rich multi-scale features are extracted from different depths. However, the resulting increase in feature dimensionality and feature redundancy means that these features are not fully utilized in the later image reconstruction. Therefore, the attention mechanism and the pyramid multi-scale pooling layer are connected to perform global feature fusion effectively for better feature representation.

The channel attention mechanism is introduced in DTDNet, which allows the model to perform feature recalibration by fusing image features captured by the two channels in proportion to their weight values. This can strengthen important information features and suppress useless features. However, the rain streaks are globally distributed in the image and a large number of raindrops are also present in the edge areas of the image. The global information will be lost by channel attention mechanism-based feature fusion. Therefore, the pyramid multi-scale pooling layer is introduced to process the output of the channel attention mechanism module.

The pyramid multi-scale pooling layer is a verified and effective global context prior module, which uses pooling layers and convolution layers at different scales to retain global information at different scales and avoid information loss [31]. The pyramid pooling layer adopted in this paper has four different pyramid scales: 1 × 1, 2 × 2, 3 × 3, and 6 × 6. Firstly, the feature maps are pooled to the target size. Then, a 1 × 1 convolution is performed on each pooled result to reduce the number of channels to a quarter of the original. Next, bilinear interpolation is used to up-sample each feature map from the previous step to the same size as the original feature map. The original feature map and the four up-sampled feature maps are then concatenated along the channel dimension, so the obtained channel width is twice that of the original feature map. Finally, a 1 × 1 convolution is used to reduce the channel width to its original size as the output of the pyramid multi-scale pooling layer. In this way, the global information is enhanced while the output characteristics of the attention mechanism are retained. The structure of the pyramid multi-scale pooling layer is shown in Figure 4.
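The pooling step and the channel bookkeeping of the pyramid layer can be sketched as follows. This is a simplified illustration on a single 6 × 6 feature map: the 1 × 1 convolutions and the bilinear up-sampling are omitted, and the helper names `adaptive_avg_pool` and `pyramid_channels` are hypothetical.

```python
import math

def adaptive_avg_pool(fmap, bins):
    """Average-pool a 2-D list to bins x bins using evenly spread windows."""
    h, w = len(fmap), len(fmap[0])
    pooled = []
    for by in range(bins):
        y0, y1 = (by * h) // bins, math.ceil((by + 1) * h / bins)
        row = []
        for bx in range(bins):
            x0, x1 = (bx * w) // bins, math.ceil((bx + 1) * w / bins)
            window = [fmap[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(window) / len(window))
        pooled.append(row)
    return pooled

def pyramid_channels(c_in, scales=(1, 2, 3, 6)):
    """Channel count after concatenating the input with all pooled branches."""
    per_branch = c_in // len(scales)  # each branch is reduced to C/4 channels
    return c_in + per_branch * len(scales)

# One illustrative 6x6 feature map with values 0..35.
fmap = [[float(6 * y + x) for x in range(6)] for y in range(6)]
pooled = {s: adaptive_avg_pool(fmap, s) for s in (1, 2, 3, 6)}

print(pooled[1])             # [[17.5]]: a single global average
print(pooled[2][0][0])       # 7.0: mean of the top-left 3x3 block
print(pyramid_channels(64))  # 128: twice the input width before the final 1x1 conv
```

The 1 × 1 scale summarizes the whole map while the 6 × 6 scale keeps local detail, which is how the layer accumulates context from local to global.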

2.2.3. Improvement of Loss Function

The MSE loss function is the most commonly used loss function in image processing networks, and it avoids the low stability and unstable output of the L1 loss function. The MSE loss function is defined as:

L_{MSE} = \frac{1}{N} \sum_{i=1}^{N} (Y_{true} - Y_{pred})^{2}, \tag{2}

where Y_{true} and Y_{pred} represent the real value and the predicted value, respectively, and N is the number of values. However, the MSE loss function is sensitive to outliers and can produce incorrect predictions when they are present. Therefore, a perceptual loss function is incorporated to improve this situation. The perceptual loss function quantifies the visual differences between images through the pre-trained VGG16 network, which reduces the possibility of abnormal differences in the original MSE loss function. The perceptual loss function is defined as follows:

L_{A} = \frac{1}{C \times W \times H} \sum_{c=1}^{C} \sum_{w=1}^{W} \sum_{h=1}^{H} \| V(Y_{true}) - V(Y_{pred}) \|_{2}^{2}, \tag{3}

where C, W, and H are the number of channels, the width, and the height of the output feature maps, respectively, V represents the pre-trained VGG16 model, and V( ) is the output of the VGG16 model.

In addition, slight noise in the image may have a great impact on the restoration result in image deraining. Regularization terms therefore need to be added to the optimization problem to maintain image smoothness. Thus, on the basis of the improved MSE loss function, the total variation (TV) loss function is also introduced in this paper. The TV loss function is a commonly used regularization term, which constrains noise by reducing the difference between adjacent pixel values and is used in coordination with other loss functions. The expression of the TV loss function is as follows:

L_{TV} = \sum_{i,j} \left( (x_{i,j-1} - x_{i,j})^{2} + (x_{i+1,j} - x_{i,j})^{2} \right)^{\frac{β}{2}}, \tag{4}

where x denotes the pixel values of the image, and β is fixed to 2 by default. Finally, the total loss function of the deraining network designed in this paper is defined as:

L_{OUR} = L_{MSE} + α L_{A} + δ L_{TV}, \tag{5}

where α and δ are the weights of the perceptual loss function and the TV loss function, respectively, which were set to 0.04 and 0.02 after a series of training adjustments.
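How the terms of Equations (2), (4), and (5) combine can be sketched on a tiny single-channel image. This sketch makes several assumptions: the perceptual term of Equation (3) requires a pre-trained VGG16 and is replaced here by a hypothetical stand-in callable, border pixels without a neighbor are skipped in the TV sum, and all function names are illustrative.

```python
def mse_loss(pred, target):
    """Equation (2): mean squared error over all pixels."""
    diffs = [(t - p) ** 2 for rp, rt in zip(pred, target) for p, t in zip(rp, rt)]
    return sum(diffs) / len(diffs)

def tv_loss(img, beta=2.0):
    """Equation (4): total variation (terms lacking a neighbor are skipped)."""
    h, w = len(img), len(img[0])
    total = 0.0
    for i in range(h - 1):
        for j in range(1, w):
            dx = img[i][j - 1] - img[i][j]
            dy = img[i + 1][j] - img[i][j]
            total += (dx * dx + dy * dy) ** (beta / 2)
    return total

def total_loss(pred, target, perceptual, alpha=0.04, delta=0.02):
    """Equation (5): L_MSE + alpha * L_A + delta * L_TV."""
    return (mse_loss(pred, target)
            + alpha * perceptual(pred, target)
            + delta * tv_loss(pred))

# Stand-in for the VGG16 feature distance (assumption: identity features).
stand_in_perceptual = mse_loss

pred = [[0.0, 1.0], [0.0, 0.0]]    # one bright outlier pixel
target = [[0.0, 0.0], [0.0, 0.0]]
loss = total_loss(pred, target, stand_in_perceptual)
print(round(loss, 6))  # 0.3 = 0.25 + 0.04 * 0.25 + 0.02 * 2.0
```

With the reported weights, the MSE term dominates, and the perceptual and TV terms act as comparatively small corrections.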

3. Results

In this section, the datasets and the experimental settings are presented. Experiments are first conducted to evaluate our algorithm against state-of-the-art methods on both synthetic benchmark datasets and real-world data. All these experiments are assessed with both quantitative evaluation indicators and qualitative visual effects. Finally, ablation studies are conducted to verify the significance of each component of our design.

All our experiments in this paper are performed on a workstation configured in the laboratory; the operating system was Windows 10 Professional Edition, the processor was an Intel (R) Xeon (R) Silver 4210R and the graphics card was an NVIDIA GeForce RTX 2080 Ti. The software used was PyCharm 2020.1 version and the internal environment was PyTorch with Python 3.7.

3.1. Datasets and Evaluation Metrics

The data quality is important for deep-learning based methods [6]. The commonly used open-source datasets for single image rain removal mainly include Rain100H [32], Rain100L [32], Rain800 [33], and Rain14000 [14]. Compared with other datasets, the Rain14000 dataset fully considers the distribution of rain streaks. The inability of composite images to reflect the characteristics of real images has been compensated from multiple angles. Therefore, the Rain14000 dataset was selected to train and compare the rain removal network designed in this paper and all the comparisons.

The Rain14000 dataset is an open-source dataset published in 2017, which contains a total of 14,000 images. The training set of 12,600 images consists of 900 real clear images, each paired with 14 synthetic rain images of different raindrop densities and tilts (900 × 14 = 12,600). Correspondingly, the test set consists of 1400 images: 100 real clear images, each paired with 14 synthetic rain images of varying degrees of clarity (100 × 14 = 1400). An example of the Rain14000 dataset is shown in Figure 5, which shows a rain-free image and its 14 corresponding synthetic rain images.

For synthetic images, peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) were selected as evaluation metrics to assess the performance quantitatively; for both, a larger value indicates a better-quality image. For real-world rainy images, visual performance is the criterion used to validate the robustness and generalization of our method. Besides, two quantitative indicators, image information entropy and image contrast, were additionally selected to evaluate the experimental results. Information entropy is the overall characteristic of an information source in the average sense, and its unit is bits/pixel. A particular source has only one information entropy. Image information entropy was used to measure the average amount of information carried by each image [34].
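Two of these metrics, PSNR and image information entropy, can be sketched with the Python standard library. The sketch assumes 8-bit grayscale images stored as nested lists and the standard Shannon entropy over the gray-level histogram; the function names are hypothetical, and SSIM and contrast are left out because the text does not specify their exact formulation.

```python
import math
from collections import Counter

def psnr(reference, restored, peak=255.0):
    """Peak signal-to-noise ratio in dB between two same-sized images."""
    pairs = [(a, b) for ra, rb in zip(reference, restored) for a, b in zip(ra, rb)]
    mse = sum((a - b) ** 2 for a, b in pairs) / len(pairs)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)

def entropy_bits_per_pixel(image):
    """Shannon entropy of the gray-level histogram, in bits/pixel."""
    pixels = [p for row in image for p in row]
    n = len(pixels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(pixels).values())

clean = [[0, 255], [255, 0]]
derained = [[0, 255], [255, 10]]   # one pixel is off by 10 gray levels

print(round(psnr(clean, derained), 2))  # 34.15
print(entropy_bits_per_pixel(clean))    # 1.0: two equally likely gray levels
```

PSNR needs a ground-truth reference and therefore applies only to the synthetic images, whereas entropy is computed from a single image and can also be reported for real-world results.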

3.2. Comparison between the Other Deraining Methods

The deraining models were evaluated on Rain14000 against two traditional methods [12,13] and four rain removal convolutional neural networks: SIRR [23], LPNet [18], MSPFN [19], and DTDNet. Specifically, in order to show the advantages of the rain removal network in this paper over DTDNet, we trained DTDNet on the same dataset. The network was trained for 100 epochs with a batch size of 64 and a learning rate fixed at 0.001. The Adam optimizer was used to accelerate the training, with its parameters β_{1} and β_{2} kept at the default values of 0.9 and 0.999. To ensure the fairness of the experiments, all the comparison methods used their publicly released code, were retrained on the same Rain14000 dataset, and otherwise followed their original settings.
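To make the role of the reported optimizer settings concrete, the following sketch applies the standard Adam update rule to a single scalar parameter. It is a textbook illustration, not the training code: the function name `adam_step`, the toy objective f(w) = w², and the stabilizing constant `eps = 1e-8` are assumptions that are not stated in the text.

```python
import math

# Settings reported in the text.
LEARNING_RATE = 0.001
BETA_1, BETA_2 = 0.9, 0.999
EPOCHS, BATCH_SIZE = 100, 64  # listed for completeness; unused in this toy example

def adam_step(param, grad, m, v, t, lr=LEARNING_RATE,
              beta1=BETA_1, beta2=BETA_2, eps=1e-8):
    """One Adam update for a scalar parameter at step t (t starts at 1)."""
    m = beta1 * m + (1 - beta1) * grad          # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad * grad   # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                # bias correction
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v

# Toy objective f(w) = w^2 with gradient 2w, starting from w = 1.
w, m, v = 1.0, 0.0, 0.0
w, m, v = adam_step(w, 2.0 * w, m, v, t=1)
print(round(w, 6))  # 0.999: the first step has magnitude close to the learning rate
```

Because the bias-corrected ratio is close to ±1 on the first step, the initial update size is governed almost entirely by the learning rate of 0.001.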

The quantitative performance of our model and the comparative models are shown in Table 1. As Table 1 shows, the rain removal network designed in this paper is significantly better than the traditional rain removal method and other rain removal networks in quantitative index comparison.

Moreover, qualitative comparisons are shown in Figure 6. The four test images were selected from Rain100H and Rain100L and have different rain streak textures and densities. In addition to the comparison with the original picture, the information entropy and contrast were calculated for each image and are noted below the image in the format "Entropy/Contrast". The experimental results show that the rain removal network designed in this paper is superior to the other comparison methods, especially in terms of visual effects. In addition, the overall performance of the rain removal methods based on convolutional neural networks is better than that of the traditional image rain removal methods. Specifically, the two traditional rain removal methods could only remove a small part of the rain streaks, and there were obvious removal artifacts. Besides, the regions from which rain streaks were removed did not blend with the image background. Compared with the traditional methods, SIRR and LPNet removed more rain streaks and restored the image background better. However, many visible rain streaks still remained and the resolution of the output image was limited. MSPFN had a considerably better rain removal effect, but the second picture shows that some obvious rain streaks still remained. No obvious rain streak residues were found in the results of the rain removal network designed in this paper. Moreover, the information entropy and contrast of MSPFN were lower than those of the proposed rain removal network. It can be seen that the proposed network has a good rain removal effect and strong applicability on synthetic images.

To further verify the generality of the proposed method, additional experiments were conducted on four real-world images collected from the internet, including portraits, still life, and so on. The rain removal results are shown in Figure 7. In addition, the information entropy and contrast were used to further evaluate the rain removal performance quantitatively. It can be seen from Figure 7 that the deraining results of the two traditional methods were limited, and a large number of rain streaks could not be removed. For the third image, SIRR, LPNet, and MSPFN all had a good rain removal effect visually, but their values of information entropy and contrast were lower than those of the rain removal network designed in this paper. For the other three images, more rain streaks remained with SIRR and LPNet than with MSPFN and the proposed network. In particular, the second and fourth images show that the proposed method left fewer residual raindrops than MSPFN, and its information entropy and contrast were higher than those of MSPFN. Therefore, the network designed in this paper also has a good rain removal effect and applicability on real-world rainy images. However, the first picture shows that there are still obvious raindrops in the part illuminated by strong light, and a small number of rain streaks could not be removed completely in the fourth picture. The rain removal network designed in this paper therefore still has room for improvement.

3.3. Ablation Experiment

An ablation experiment was conducted to evaluate the performance of the proposed strategies. The fusion modules of the attention mechanism and the pyramid pooling layer, the channel expansion modules of the first and second stages, and the improved loss function were verified item by item. The corresponding improved networks were trained and compared respectively. The results are shown in Table 2. The experimental results show that each key module had a positive effect on the final rain removal performance of the network.

4. Ship Detection

The effect of the deraining model on the performance of ship detection in the visual perception system is verified in this section. Because rain streaks degrade the visibility of ships under complex weather conditions, incorporating effective image enhancement is helpful for vision models. Therefore, an improved YOLO V5 network was selected to perform the ship detection task jointly with our deraining model. The ship detection dataset used for network training was constructed by using a UAV to photograph an independently designed unmanned surface vessel, "Lan Xin", as shown in Figure 8a. The rainy images were obtained by synthesizing rain streaks, as shown in Figure 8b.

To test the effectiveness of the image enhancement method in this paper for ship detection, the ship detection network was trained with the rainy dataset and with the corresponding clear images processed by the image rain removal network. The performance of DTDNet and the proposed network was also compared. The training results are shown in Figure 9. As Figure 9 shows, the accuracy of the detection network trained with the clear images processed by the rain removal network increased steadily, and its fluctuation was smaller than that of the network trained with the blurred images. Besides, the recall rate reached its highest value earlier and then stabilized. This shows that the deraining network provides better stability in the training of a ship detection network.

Quantitative results for the improvement of the ship detection accuracy in addition to the deraining performance are reported in Figure 10. Among them, (a) and (c) are the detection results of the rain image, (b) and (d) are the corresponding detection results of the deraining image processed by the designed rain removal network, (e) and (f) are the detection results of rainy images before and after rain removal processing with DTDNet.

The detection results show that although the ships in (a), (c), and (f) are correctly detected, the confidence is significantly lower than the detection confidence of the detection network for the clear image. Meanwhile, the confidence in comparison (b) is significantly higher than that in comparison (f), indicating that the rain removal network in this paper is more suitable for ship target detection tasks on rainy days than DTDNet. It is proved that the rain removal network designed in this paper has a strong applicability in ship detection, which improves the stability of the training process of the ship detection network and also improves the detection accuracy.

5. Conclusions

In this paper, a two-channel and two-stage image deraining network based on a convolutional neural network is proposed for single image deraining. The proposed deraining network is based on DTDNet, which is our previous version for image dehazing. Specifically, a sub-pixel up-sampling module is introduced to expand the network channel to acquire more reasonable multi-scale features and improve the image resolution. Meanwhile, attention mechanism is combined with the multi-scale pooling layer of a pyramid to realize adaptive fusion of multi-scale features on the basis of preserving the global information. Finally, perceptual loss function, TV loss function, and MSE loss function are combined to effectively control the noise on the basis of improving the unstable output of MSE loss function due to outliers. Qualitative and quantitative experiments on both synthesized and real-world images demonstrated that the proposed model outperforms existing state-of-the-art CNN-based methods. Furthermore, additional ship detection experiments demonstrate that our model contributes to ship detection by enhancing images degraded under bad weather conditions.

However, there are still some limitations for the network designed in this paper. Although previous research has shown that deraining networks trained on synthetic datasets can be effectively used in real-world image processing, a certain degree of rain streak residue is still present in the rain removal results of the real-world images compared to the synthetic images. Therefore, more reasonable datasets are needed for network model training. Besides, the network model structure needs to be further optimized to meet the practical applications. Finally, whether the number of parameters of the network meets the requirements of the hardware platform and the running time of the algorithm are also issues that need to be considered in practical applications including sea surface ship detection.

Author Contributions

Conceptualization, T.L.; Funding acquisition, T.L.; Investigation, L.N.; Methodology, B.Z.; Supervision, G.W.; Writing—original draft, T.L., P.L. and Y.Z.; Writing—review & editing, T.L. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by China Postdoctoral Science Foundation (2019M661076).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data is contained within the article.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Gu, Y.; Gao, Y.; Liu, H. Multi-directional rain streak removal based on infimal convolution of oscillation TGV. Neurocomputing 2022, 486, 61–76.
2. Park, Y.; Jeon, M.; Lee, J.; Kang, M. MCW-Net: Single Image Deraining with Multi-level Connections and Wide Regional Non-local Blocks. Elsevier Signal Process. Image Commun. 2022, 105, 116701.
3. Li, T.; Chen, X.; Zhu, F.; Zhang, Z.; Yan, H. Two-stream deep spatial-temporal autoencoder for surveillance video abnormal event detection. Neurocomputing 2021, 439, 256–270.
4. Carranza-García, M.; Lara-Benítez, P.; García-Gutiérrez, J.; Riquelme, J.C. Enhancing object detection for autonomous driving by optimizing anchor generation and addressing class imbalance. Neurocomputing 2021, 449, 229–244.
5. Carion, N.; Massa, F.; Synnaeve, G.; Usunier, N.; Kirillov, A.; Zagoruyko, S. End-to-end object detection with transformers. In Proceedings of the European Conference on Computer Vision, Glasgow, UK, 23–28 August 2020.
6. Wja, A.; Tkk, B.; Hdc, C. Remove and Recover: Deep End-to-End Two-stage Attention Network for Single-shot Heavy Rain Removal. Neurocomputing 2022, 481, 216–227.
7. Gao, F.; Mu, X.; Ouyang, C.; Yang, K.; Ji, S.; Guo, J.; Wei, H.; Wang, N.; Ma, L.; Yang, B. MLTDNet: An efficient multi-level transformer network for single image deraining. Neural Comput. Appl. 2022, 156, 1–15.
8. Kang, L.W.; Lin, C.W.; Fu, Y.H. Automatic single-image-based rain streaks removal via image decomposition. IEEE Trans. Image Process. 2011, 21, 1742–1755.
9. Rubinstein, R.; Bruckstein, A.M.; Elad, M. Dictionaries for sparse representation modeling. Proc. IEEE 2010, 98, 1045–1057.
10. Bi, X.A.; Chen, Z.B.; Yue, J.C. LRP-Net: A Lightweight Recursive Pyramid Network for Single Image Deraining. Neurocomputing 2022, 490, 181–192.
11. Liang, S.; Meng, X.; Su, Z.; Zhou, F. Multi-receptive Field Aggregation Network for single image deraining. J. Vis. Commun. Image R. 2022, 84, 103469.
12. Luo, Y.; Xu, Y.; Ji, H. Removing rain from a single image via discriminative sparse coding. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; Available online: https://pan.baidu.com/s/1AztZ5BSNKWmxr9PzZwpGDw (code: d229) (accessed on 6 June 2020).
13. Li, Y.; Tan, R.T.; Guo, X.; Lu, J.; Brown, M.S. Rain streak removal using layer priors. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 2736–2744. Available online: https://github.com/yu-li/LPDerain (accessed on 6 June 2020).
14. Fu, X.; Huang, J.; Zeng, D.; Huang, Y.; Ding, X.; Paisley, J. Removing rain from single images via a deep detail network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017.
15. Fu, X.; Huang, J.; Ding, X.; Liao, Y.; Paisley, J. Clearing the skies: A deep network architecture for single-image rain removal. IEEE Trans. Image Process. 2017, 26, 2944–2956.
16. Zhang, H.; Patel, V.M. Density-aware single image de-raining using a multi-stream dense network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018.
17. Li, S.; Araujo, I.B.; Ren, W.; Wang, Z.; Tokuda, E.K.; Junior, R.H.; Cesar-Junior, R.; Zhang, J.; Guo, X.; Cao, X. Single image deraining: A comprehensive benchmark analysis. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019.
18. Fu, X.; Liang, B.; Huang, Y. Lightweight pyramid networks for image deraining. IEEE Trans. Neural Netw. Learn. Syst. 2019, 31, 1794–1807. Available online: https://xueyangfu.github.io/projects/LPNet.html (accessed on 22 July 2019).
19. Jiang, K.; Wang, Z.; Yi, P. Multi-scale progressive fusion network for single image deraining. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; Available online: https://github.com/kuihua/MSPFN (accessed on 11 March 2021).
20. Jiang, K.; Wang, Z.; Yi, P. Decomposition makes better rain removal: An improved attention-guided deraining network. IEEE Trans. Circuits Syst. Video Technol. 2020, 31, 3981–3995.
21. Yin, H.; Deng, H. RAiA-Net: A Multi-Stage Network with Refined Attention in Attention Module for Single Image Deraining. IEEE Signal Process. Lett. 2022, 29, 747–751.
22. Zhou, T.; Li, J.; Wang, S. Matnet: Motion-attentive transition network for zero-shot video object segmentation. IEEE Trans. Image Process. 2020, 29, 8326–8338.
23. Wei, W.; Meng, D.; Zhao, Q. Semi-supervised transfer learning for image rain removal. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; Available online: https://github.com/wwzjer/Semi-supervised-IRR (accessed on 10 March 2020).
24. Lin, H.; Li, Y.; Fu, X.; Ding, X.; Huang, Y.; Paisley, J. Rain O’er Me: Synthesizing Real Rain to Derain with Data Distillation. IEEE Trans. Image Process. 2020, 29, 7668–7680.
25. Zhou, T.; Li, L.; Li, X.; Feng, C.M.; Li, J.; Shao, L. Group-Wise Learning for Weakly Supervised Semantic Segmentation. IEEE Trans. Image Process. 2022, 31, 799–811.
26. Wang, C.; Li, Z.; Wu, J. Deep residual haze network for image dehazing and deraining. IEEE Access 2020, 8, 9488–9500.
27. Chen, D.; He, M.; Fan, Q. Gated context aggregation network for image dehazing and deraining. In Proceedings of the 2019 IEEE Winter Conference on Applications of Computer Vision (WACV), Waikoloa, HI, USA, 7–11 January 2019.
28. Liang, X.; Li, R.; Tang, J. Selective Attention network for Image Dehazing and Deraining. In Proceedings of the ACM Multimedia Asia, Beijing, China, 15–18 December 2019.
29. Liu, T.; Zhou, B. Dual-Channel and Two-Stage Dehazing Network for Promoting Ship Detection in Visual Perception System. Math. Probl. Eng. 2022, 2022, 1–15.
30. Shi, W.; Caballero, J.; Huszár, F. Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016.
31. Zhao, H.; Shi, J.; Qi, X. Pyramid scene parsing network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017.
32. Yang, W.; Tan, R.T.; Feng, J.S.; Liu, J.Y.; Guo, Z.M.; Yan, S.C. Deep joint rain detection and removal from a single image. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017.
33. Zhang, H.; Sindagi, V.; Vishal, M.P. Image deraining using a conditional generative adversarial network. IEEE Trans. Circuits Syst. Video Technol. 2020, 30, 3943–3956.
34. Li, X.; Liu, G.; Ni, J. Autofocusing of ISAR images based on entropy minimization. IEEE Trans. Aerosp. Electron. Syst. 1999, 35, 1240–1252.

Figure 1. The framework of the dehazing network [29].

Figure 2. The framework of the deraining network.

Figure 3. The structure of subpixel convolution up-sampling [30].

Figure 4. The structure of the pyramid multi-scale pooling layer [31].

Figure 5. Example of Rain14000.

Figure 6. Qualitative comparison of virtual images.

Figure 7. Qualitative comparison of real-world images.

Figure 8. Example of “Lan Xin” dataset. (a) Original image of “Lan Xin”. (b) Synthesized rainy image.

Figure 9. Comparison of the training results of the rainy image and the processed image in the ship detection network. (a) Training result for the rainy image. (b) Training result for the derained image.

Figure 10. Detection results of ship images based on the rain removal network. (a) Detection result of rainy image 1. (b) Detection result of the derained image of (a) using the proposed method. (c) Detection result of rainy image 2. (d) Detection result of the derained image of (c) using the proposed method. (e) Detection result of rainy image 1. (f) Detection result of the derained image of (e) using DTDNet.

Table 1. Quantitative comparison on the Rain14000 dataset.

Metric	Ref. [12]	Ref. [13]	SIRR	LPNet	MSPFN	DTDNet	Ours
PSNR	24.33	26.19	27.94	28.79	30.72	31.17	31.43
SSIM	0.81	0.79	0.85	0.88	0.90	0.90	0.91
Processing time (s)	2.57	155.59	4.30	1.95	2.33	1.67	2.16
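
As a reading aid for the PSNR row of Table 1, the following minimal sketch computes PSNR under the standard definition, 10·log10(MAX²/MSE), assuming 8-bit images with a peak value of 255. The function name `psnr`, the flat-list image representation, and the sample pixel values are illustrative assumptions; the article does not give its evaluation code, so this is not the authors' implementation.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Both images are flat sequences of pixel intensities (an assumption
    made here for brevity; real images would be flattened first).
    """
    if len(reference) != len(restored) or len(reference) == 0:
        raise ValueError("images must be non-empty and the same size")
    # Mean squared error over all pixels.
    mse = sum((r - d) ** 2 for r, d in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images have unbounded PSNR
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 4-pixel example: a clean patch and its derained estimate.
clean = [52, 55, 61, 59]
derained = [54, 55, 60, 57]
print(f"PSNR: {psnr(clean, derained):.2f} dB")
```

A higher value means the derained output is closer to the rain-free reference, which is the sense in which the PSNR row of Table 1 ranks the methods.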

Table 2. Results of ablation experiment on Rain14000.

Attention Mechanism and Pyramid Pooling Layer	Phase 1 Dual Channel Module	Phase 2 Dual Channel Module	Improved Loss Function	PSNR/SSIM
√	√	√	√	31.43/0.9126
√	√	×	×	29.934/0.891
×	×	√	×	29.413/0.877
×	×	×	√	28.739/0.843
√	√	√	×	30.47/0.906

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

