1. Introduction
With developments in science and technology, Internet of Things (IoT) technology has been introduced into advanced underwater vision tasks, such as autonomous underwater navigation, ocean scene analysis, and fisheries. Through the intelligent analysis of real-time photographs taken underwater, data such as the marine environment, marine density, and seafloor pathways are monitored to understand the state of the underwater environment and the growth and health of underwater life. However, the accuracy of the intelligent analysis results is greatly influenced by the quality of the underwater images; the complex imaging environment causes color casts and a loss of detail in the captured images [
1,
2]. As a result, it is critical to achieve clarity and improve the details in underwater images.
Underwater image enhancement is developing rapidly via both traditional and deep learning methods. Among traditional methods, Drews et al. [
3], influenced by the dark channel prior (DCP [
4]), proposed transmission estimation in underwater single images (UDCP), which excludes the red channel from the estimation but is prone to over-enhancement. Ma et al. [
5] proposed restoring underwater images using a combination of an improved dark channel prior and the gray world method. Ancuti et al. [
6] used multi-scale fusion to generate an image that was clear after white balance and gamma correction had been performed on the damaged image. To improve underwater image quality, Liang et al. [
7] combined color correction based on attenuation maps with a detail retention and haze removal method based on multi-scale decomposition. Marques et al. [
8] derived an effective atmospheric illumination model from local contrast information based on human observations; from this model, they generated one enhanced image highlighting details and another removing darkness, and the underwater image is then enhanced via multi-scale fusion. Traditional methods offer clear interpretability, but their enhancement effect still needs improvement.
In recent years, the application of deep learning methods in underwater image processing has become increasingly prominent, especially methods based on generative adversarial networks (GANs). For example, Li et al. [
9] proposed Water-GAN to enhance underwater images. Synthetic underwater images are utilized as datasets for training the neural network to perform underwater image color correction. Fabbri et al. [
10] suggested the enhancement of underwater imagery using generative adversarial networks. They first applied CycleGAN to paired images to create degraded underwater images. The underwater image pairs were then selected as datasets for further network training. Guo et al. [
11] designed a multi-level intensive generative adversarial network, containing two multi-scale dense blocks that can correct color differences and enhance image details. Islam et al. [
12] proposed FUnIE-GAN, a fast underwater image enhancement model for improved image perception based on U-Net, which improves image detail clarity by using residual connections in the generator. Chen et al. [
13] developed GAN-RS, which uses a multi-branch discriminator to increase the quality of underwater images; however, its numerous training parameters require careful tuning, and training with poorly chosen parameters produces artifacts in the resulting images. Huang et al. [
14] proposed Semi-UIR, a mean-teacher-based semi-supervised underwater image restoration model that enhances performance by constructing a reliable bank and employing contrastive learning. Compared with traditional methods, deep learning methods better address image color distortion and offer superior portability and learning ability in image processing.
The above methods focus on enhancing underwater images, as shown in
Figure 1. However, these algorithms are not well-suited to intricate scenarios, as they pay insufficient attention to the color casts and detail loss caused by the imaging environment. In addition, most available methods improve the network by increasing its depth; however, this leads to problems such as vanishing gradients, training difficulties, and unstable parameters [
15].
To solve the problems above, we propose the Dense Residual Generative Adversarial Network (DRGAN). The primary contributions are as follows:
- (1)
A multi-scale feature extraction module is proposed to extract image detail information and expand the receptive field.
- (2)
A dense residual block is proposed to fuse feature maps into clear images, not only fully utilizing all layers with local dense connections but also adding residual connections to reuse information.
- (3)
We combine multiple loss functions to facilitate the learning of the generator regarding the generation of clear images. The experimental results show that DRGAN outperforms the state-of-the-art methods in terms of qualitative and quantitative indicators.
The remainder of this work is structured as follows.
Section 2 discusses related work, such as dense residual theory and GANs.
Section 3 describes our proposed method in detail, and
Section 4 presents and discusses the experimental results and analysis. Finally,
Section 5 concludes the paper.
4. Experiment
To verify the effectiveness of DRGAN, we first set out the experimental details. Then, we compared DRGAN with several representative methods: Fusion [
6], ICCB [
20], L^2UWE [
8], FUnIE-GAN [
12] (replaced with FUnIE below), Semi-UIR [
14], and UWCNN [
21]. Finally, to validate the components of DRGAN, we performed ablation experiments. Furthermore, we conducted experiments such as feature point matching and edge detection to validate the usefulness of our approach in real-world applications.
4.1. Experimental Details
We conducted experiments on the Underwater ImageNet [
10] dataset and RUIE [
22] dataset. The details are as follows. (1) From the Underwater ImageNet dataset, we randomly selected 4000 pairs of underwater images for training and 2000 pairs for testing. (2) We used the model trained on the Underwater ImageNet dataset to test on the RUIE dataset, which demonstrates the generalization ability of DRGAN. We trained DRGAN with the Adam optimizer and set the training and test image size to 256 × 256 × 3, the batch size to 2, and the number of epochs to 50. TensorFlow was used as the deep learning framework on an Ubuntu 18.04 machine with 32 GB of RAM and a GTX 1070 Ti (8 GB) GPU.
4.2. Subjective Evaluation
The complex underwater imaging environment degrades the colors of an otherwise undistorted swatch image. Therefore, the color restoration capability of DRGAN can be efficiently tested through color recovery experiments on a color card [
23].
As can be seen in
Figure 8, the Fusion method reduces the contrast between the yellow and pink color blocks while deepening the overall hue of the color card image, and the image processed via the ICCB algorithm suffers from color distortion. Although the Semi-UIR algorithm achieves color correction, the visual effect is negatively affected by an overall redness in the processed image. The image processed via the L^2UWE algorithm shows low discrimination, as evidenced by the dark purple and green color cards being visually close to black. The color cards processed via the UWCNN algorithm suffer from poor color correction, as shown by their bluish hue, while the FUnIE algorithm tends to make the color cards appear red. On the contrary, our method achieves promising visual results on the color card images, especially when dealing with indistinguishable color patches (specifically black, purple, and dark green), validating the superiority of its color correction capability.
Next, the method was applied to images from a complex underwater environment. The input image was affected by different degrees of color distortion, low brightness, and turbidity, resulting in various degradation phenomena.
Figure 9 illustrates the processing results for each method. Images 1–2 are the normal degraded images, Images 3–4 are the atomized images, and Images 5–6 and Images 7–8 are green and blue partial images, respectively.
As can be seen in
Figure 9, the Fusion algorithm fails to improve the sharpness and quality of low-brightness and color-distorted images. The ICCB algorithm has some success in improving brightness and color correction, but the vividness of the image colors is greatly reduced. The L^2UWE algorithm fails to improve the green, blue, and normally degraded images; although it mitigates the fogging problem, the generated images appear insufficiently bright. The FUnIE algorithm solves the problem of low brightness, but color distortion remains: the fogged image processed via FUnIE shows an obvious reddish tint that is inconsistent with the real image. The image processed via the UWCNN algorithm cannot achieve a good visual effect due to its overall bluish color. The Semi-UIR method achieves some success in defogging and color correction, but the overall brightness of the final image is low. In addition, as shown in
Figure 5,
Figure 6,
Figure 7 and
Figure 8, the ICCB method is unable to perform effective deblurring, as evidenced by the severe color distortion. On the contrary, our method produces brighter and clearer images than all the tested comparison algorithms. These results show that the algorithm can address degradation in complex underwater environments (color casts, low brightness, high turbidity, etc.) and exhibits strong robustness. The subjective evaluation demonstrates that our method produces clearer results for images with different degrees of degradation than other recent methods.
4.3. Objective Evaluation
The image quality when applying our method was further evaluated through five objective evaluation indexes: UCIQE, UIQM, SSIM, PSNR, and CIEDE2000.
- (1)
The underwater color image quality evaluation index [
24] (UCIQE) is proportional to underwater image quality, and the formula for calculating the index is as follows:

$\mathrm{UCIQE} = c_1 \times \sigma_c + c_2 \times \mathrm{con}_l + c_3 \times \mu_s,$

where $\sigma_c$ is the standard deviation of chroma, $\mathrm{con}_l$ represents the contrast of luminance, $\mu_s$ represents the average saturation, and $c_1$, $c_2$, and $c_3$ are weighting coefficients.
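As a rough illustration, the weighted combination above can be sketched in Python. This is a simplified, NumPy-only approximation (HSV-style saturation and a max-minus-min chroma proxy rather than the CIELab quantities used in the original metric), and the default coefficients are the commonly reported values for UCIQE, not values taken from this paper:

```python
import numpy as np

def uciqe(rgb, c1=0.4680, c2=0.2745, c3=0.2576):
    """Simplified UCIQE sketch: combines sigma_c (std of chroma),
    con_l (luminance contrast), and mu_s (mean saturation) with
    weighting coefficients c1, c2, c3."""
    img = rgb.astype(np.float64) / 255.0
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    lum = 0.299 * r + 0.587 * g + 0.114 * b            # luminance channel
    mx = img.max(axis=-1)
    mn = img.min(axis=-1)
    sat = np.where(mx > 0, (mx - mn) / mx, 0.0)        # HSV-style saturation
    sigma_c = (mx - mn).std()                          # crude chroma proxy
    con_l = np.quantile(lum, 0.99) - np.quantile(lum, 0.01)  # luminance spread
    mu_s = sat.mean()
    return c1 * sigma_c + c2 * con_l + c3 * mu_s
```

A colorful, well-contrasted image scores higher on all three terms, which is why the index rises with perceived underwater image quality.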
- (2)
The underwater image quality measurement [
25] (UIQM) is a no-reference underwater image quality indicator based on human visual system excitation. The calculation formula is as follows:

$\mathrm{UIQM} = c_1 \times \mathrm{UICM} + c_2 \times \mathrm{UISM} + c_3 \times \mathrm{UICONM},$

where $c_1$ is set to 0.0282, $c_2$ is set to 0.2953, and $c_3$ is set to 3.5735. The UIQM is a linear combination of the underwater image colorfulness measure (UICM), underwater image sharpness measure (UISM), and underwater image contrast measure (UICONM). The higher the UIQM, the better the image's color balance, sharpness, and contrast.
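The linear combination is straightforward to express in code; the three component measures (UICM, UISM, UICONM) are assumed to be computed elsewhere:

```python
def uiqm(uicm, uism, uiconm, c1=0.0282, c2=0.2953, c3=3.5735):
    """UIQM as the weighted linear combination of its three component
    measures, using the coefficients given in the text."""
    return c1 * uicm + c2 * uism + c3 * uiconm
```

Note the large weight on UICONM: contrast dominates the score, so deblurring and dehazing tend to raise UIQM the most.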
- (3)
The structural similarity index measurement [
26] (SSIM) is an index for determining how similar two images are. When two images, $x$ and $y$, are given, the calculation formula is:

$\mathrm{SSIM}(x, y) = \dfrac{(2\mu_x\mu_y + C_1)(2\sigma_{xy} + C_2)}{(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)},$

where $\mu_x$ and $\mu_y$ are the averages of $x$ and $y$, respectively; $\sigma_x^2$ and $\sigma_y^2$ are the variances of $x$ and $y$; $\sigma_{xy}$ is the covariance of $x$ and $y$; and $C_1$ and $C_2$ are constants to maintain stability.
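A minimal sketch of this formula follows. For simplicity it computes a single global SSIM over the whole image; the standard implementation averages the same quantity over local (e.g., Gaussian-weighted) windows:

```python
import numpy as np

def ssim_global(x, y, data_range=255.0, k1=0.01, k2=0.03):
    """Single-window SSIM over the whole image (a simplification of
    the windowed standard definition)."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    c1 = (k1 * data_range) ** 2        # stability constants C1, C2
    c2 = (k2 * data_range) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    return ((2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))
```

When `x` and `y` are identical, the numerator and denominator coincide and the index equals 1, its maximum.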
- (4)
The peak signal-to-noise ratio (PSNR) is an index to measure image quality. The calculation formula for the mean square error (MSE) is:

$\mathrm{MSE} = \dfrac{1}{mn}\sum_{i=0}^{m-1}\sum_{j=0}^{n-1}\left[I(i,j) - K(i,j)\right]^2,$

where two $m \times n$ images, $I$ and $K$, are compared. The PSNR is then obtained from the MSE as:

$\mathrm{PSNR} = 10 \log_{10}\left(\dfrac{\mathrm{MAX}_I^2}{\mathrm{MSE}}\right),$

where $\mathrm{MAX}_I$ is the maximum possible pixel value of the image (255 for 8-bit images).
- (5)
The CIEDE2000 evaluation index [
27], which has a range of [0, 100], measures the color change between the standard color card and each processed color block; a lower value indicates a smaller color difference. For the evaluation in
Figure 8, we used the CIEDE2000 evaluation index.
Table 1 displays the results.
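The full CIEDE2000 formula involves lightness, chroma, and hue weighting terms and is lengthy; as a simpler illustration of the same family of CIELab color-difference metrics, the earlier CIE76 ΔE*ab is just the Euclidean distance between two Lab colors (CIEDE2000 refines this idea with additional perceptual corrections):

```python
import numpy as np

def delta_e_cie76(lab1, lab2):
    """CIE76 color difference: Euclidean distance between two CIELab
    colors. CIEDE2000 builds on this with lightness, chroma, and hue
    weightings for better perceptual uniformity."""
    d = np.asarray(lab1, dtype=np.float64) - np.asarray(lab2, dtype=np.float64)
    return float(np.sqrt(np.sum(d ** 2)))
```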
Comparing the data in
Table 1, we can see that, like DRGAN, both FUnIE and Semi-UIR achieve good results; FUnIE adds residual connections to the generator to enhance network performance. DRGAN's average CIEDE2000 result is the lowest, showing that our technique performs best in terms of color recovery.
We used UCIQE to evaluate the images in
Figure 9, and the results are shown in
Table 2. The results show that the average value for the DRGAN algorithm is higher than that for the other algorithms. For Images 1 and 2, the UCIQE of DRGAN was lower than that of L^2UWE because the original images were only slightly degraded, and the recovery by Semi-UIR was better than the enhancement by DRGAN. However, ICCB, despite its higher UCIQE, showed significant color aberration in Image 6 and unnatural color restoration in Image 8.
The UIQM results from
Figure 9 are shown in
Table 3; the average UIQM of DRGAN is higher than that of the other algorithms. The light degradation of Images 1 and 3 leads to a higher UIQM for the ICCB restoration algorithm than for the DRGAN enhancement, and Image 2 shows a yellow color cast when processed via FUnIE. Although Semi-UIR achieves good enhancement results, its detail processing is not thorough enough, as shown in Image 8.
As the RUIE dataset has no ground truth, we chose the UCIQE and UIQM metrics for comparison with other algorithms on it, and we used the UCIQE, UIQM, SSIM, and PSNR metrics on the Underwater ImageNet dataset. The test results for the Underwater ImageNet and RUIE datasets are shown in
Table 4 and
Table 5.
We verified the effectiveness of DRGAN on the Underwater ImageNet dataset and applied the model trained on it to the RUIE dataset. On the Underwater ImageNet dataset, DRGAN's PSNR, UIQM, and UCIQE outperformed those of the other algorithms, indicating that DRGAN's enhancement results are closer to the real images. On the RUIE dataset, DRGAN also achieved better results on average. Although FUnIE also adds residual connections, it surpasses DRGAN only on the SSIM indicator for the Underwater ImageNet dataset and in the Green and Atomization environments of the RUIE dataset. From the above, it can be concluded that the combination of dense and residual connections in DRGAN yields a greater performance improvement and a better generalization ability.
4.4. Ablation Study
We conducted module ablation experiments using the Underwater ImageNet dataset. Firstly, we evaluated the images processed via different modules using PSNR, SSIM, UCIQE, and UIQM.
Table 6 shows the objective metric scores for the ablation experiments, where w/o MSFEM denotes the removal of the multi-scale feature extraction module, w/o DRB denotes the removal of the dense residual block, w/o RES denotes the removal of residual connectivity in the DRB, and w/o DEN denotes the removal of dense connectivity in the DRB.
Table 6 shows the performance w/o MSFEM and w/o DRB; it can be seen that the performance w/o MSFEM on UCIQE and UIQM is higher than that w/o DRB. Removing either the dense connections or the residual connections influences the model performance. This result fully demonstrates the importance of the two modules we adopted, the MSFEM and DRB, to the overall performance of the network.
Then, we randomly selected an image for subjective comparison.
Figure 10 shows that the image processed w/o DRB has artifacts and is accompanied by a yellow color cast, while the image processed w/o MSFEM is subjectively better than that w/o DRB, but still has a small amount of color cast. The image enhanced via the full processing model is the best and the most visually natural.
Figure 10 also demonstrates that image color recovery is poor and artifacts appear after the removal of the residual connections from the dense residual block. On the contrary, after the removal of the dense connections from the dense residual block, the image is over-enhanced.
4.5. Additional Experiments
The limited feature information in underwater images makes underwater image detection more challenging. As shown in
Figure 11 and
Figure 12, several images were selected for SURF feature point matching and Canny edge detection experiments, which verify that our method can enhance edges and other feature information in underwater images.
Figure 11 shows the results of the SURF feature point matching; it can be seen that the processed image has significantly more feature points than the original underwater image. These experiments show that the proposed algorithm successfully enriches the features of underwater images, making subsequent information processing much easier.
Figure 12 shows the results of the Canny operator; after processing with our method, more details of the image can be detected (such as coral patterns). Compared with the degraded images, DRGAN clearly shows the contour information of the picture, making the detection and tracking of features of interest by underwater robots much less taxing.