1. Introduction
Underwater images have tremendous usage in marine engineering. However, poor underwater image quality caused by wavelength-dependent light absorption and scattering [1] often hinders their use. Underwater image dehazing is an approach to combat this, in which underwater images are processed to improve their quality, thereby increasing their applicability in the marine environment. The processing focuses on reducing the effects of wavelength-dependent light absorption and scattering. According to Alenezi et al. [1,2], an underwater dehazing model can be defined as:
$$I_c(x) = \big(J_c(x) \ast h(x)\big)\, t_c(x) + B_c \big(1 - t_c(x)\big), \quad c \in \{R, G, B\}, \tag{1}$$
where $I_c(x)$ denotes the intensity of the $c \in \{R, G, B\}$ color channel at the pixel $x$ in an input underwater image, $J_c(x)$ denotes the scene radiance image, $B_c$ denotes the ambient light, $t_c(x)$ is the direct transmission, representing the scene radiance attenuated by transmission, $h(x)$ denotes a point spread function at pixel $x$, and $\ast$ denotes convolution. Similar to in-air dehazing models [3], underwater dehazing models aim to reduce the effect of haze in images. However, unlike the atmospheric model, the underwater model presented in (1) considers the scene radiance, $J_c(x)$, as a function of the point spread to take into consideration the effects of wavelength-dependent light absorption. This makes underwater image dehazing a complex problem that requires the continual exploration of effective techniques to improve the quality and usability of underwater images.
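To make the forward model in (1) concrete, the following minimal NumPy sketch synthesizes a hazy observation from a clear scene. The Gaussian point spread function, the per-channel transmission values and the ambient light used here are illustrative assumptions, not values from [1,2]:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def underwater_forward_model(J, t, B, psf_sigma=1.5):
    """Synthesize a hazy underwater image following Eq. (1):
    I_c(x) = (J_c * h)(x) t_c(x) + B_c (1 - t_c(x)).

    J : (H, W, 3) clear scene radiance in [0, 1]
    t : (H, W, 3) per-channel direct transmission in (0, 1]
    B : (3,)      per-channel ambient light
    psf_sigma : width of the (assumed Gaussian) point spread function h
    """
    I = np.empty_like(J)
    for c in range(3):  # R, G, B color channels
        blurred = gaussian_filter(J[..., c], sigma=psf_sigma)  # J_c * h
        I[..., c] = blurred * t[..., c] + B[c] * (1.0 - t[..., c])
    return I

# Illustrative values: red attenuates fastest underwater, so t_R is smallest,
# and the ambient light is bluish-green.
J = np.random.rand(64, 64, 3)  # stand-in for a clear scene
t = np.stack([np.full((64, 64), v) for v in (0.3, 0.7, 0.8)], axis=-1)
B = np.array([0.1, 0.5, 0.6])
I_hazy = underwater_forward_model(J, t, B)
```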
Recent years have seen underwater image dehazing attract attention, leading to many suggested techniques. Traditional methods, such as image restoration and enhancement, estimate the dehazing model's parameters to reduce the effect of haze. Such models include He et al.'s [4] dark channel prior (DCP), which estimates the transmission map and atmospheric light from a hazy image to recover the underwater image. The simplicity of the DCP has enabled its modification to address the severe attenuation of the red color in water and attain images with improved color. Such techniques include that of Galdran et al. [5], who recovered short-wavelength-based colors via a red channel prior (RCP) to restore image contrast. Peng et al. [6] used a DCP to present an underwater dehazing technique in which they considered the green and blue color channels to restore the underwater image. Chiang and Chen [7] enhanced underwater images via a compensation-based light attenuation dehazing algorithm. Peng and Cosman [8] used depth estimators, image blurriness and light absorption to restore underwater images. These methods have been popular due to their ability to significantly reduce blur and color cast in underwater images. However, they also estimate many parameters, making their results inflexible and sometimes inferior to those of more complex methods, such as those of [9,10]. On the other hand, the most recently proposed underwater image dehazing models reduce the effect of haze by disregarding the underwater modeling parameters and improving the image's visual quality by adjusting the image pixel values. Such methods include that of Ancuti et al. [11], where the contrast of the underwater images was improved via a fusion-based method. Fu et al. [12] proposed an effective retinex-based method to enhance underwater images. Though effective at improving underwater image quality in many instances, these methods have one major shortcoming: they fail to consider the underwater physical parameters, making them ineffective at recovering high-quality images.
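To illustrate how DCP-style restoration estimates the model parameters, the sketch below computes a dark channel, an atmospheric-light estimate and a transmission map in the spirit of [4]. The patch size, the top-pixel fraction and the constant omega are common illustrative choices rather than values taken from the cited works:

```python
import numpy as np
from scipy.ndimage import minimum_filter

def dark_channel(I, patch=15):
    """Per-pixel minimum over the color channels and a local patch."""
    return minimum_filter(I.min(axis=2), size=patch)

def estimate_atmospheric_light(I, dark, top_frac=0.001):
    """Average the input pixels at the haziest dark-channel locations."""
    n = max(1, int(top_frac * dark.size))
    idx = np.argsort(dark.ravel())[-n:]  # indices of the haziest pixels
    return I.reshape(-1, 3)[idx].mean(axis=0)

def estimate_transmission(I, A, omega=0.95, patch=15):
    """t(x) = 1 - omega * dark_channel(I / A), as in DCP-style methods."""
    return 1.0 - omega * dark_channel(I / A, patch)
```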
Deep neural network-based methods have addressed the problems of DCP-related models and image pixel-based models. Such techniques include image segmentation by Zhang et al. [13], pattern recognition by Gedamu et al. [14] and image dehazing by Liang et al. [15]. Some methods utilize deep neural networks to exploit similarities between clear (ground-truth) and hazy images, using similar network structures to achieve a higher image quality. These methods' major shortcoming arises when the images come from a harsh and complex scenario where attaining ground-truth images is impossible. To solve such problems, researchers have explored pairing in-air images with hazy images and then used the same analogy to handle underwater images from harsh and complex scenarios. Such models include the generative adversarial network (GAN) for underwater images (WaterGAN) [16], among others. The technique corrects the color of underwater images by pairing in-air images and using the attained depth information to simulate underwater images. Fabbri [17] generated paired training data by employing the cycle-consistent GAN (CycleGAN) of [18] to formulate an underwater GAN (UGAN), which simulated the degradation process; the pix2pix model was then used to reduce the effect of haze in underwater images. These techniques were later exploited by Guo et al. [19] and Fabbri [17] to develop a more sophisticated model based on a dense multiscale GAN, boosting performance and rendering more details in the final underwater images than the previous methods. Li et al. [20] recently proposed an underwater convolutional neural network (UWCNN), which uses underwater scene priors to generate satisfactorily clear underwater images. The major shortcoming of these deep neural networks is their reliance on in-air images to attain clear underwater images. This is not always directly usable in underwater applications and is regarded as an extension of in-air dehazing networks; thus, their results may be misleading.
The general approach of underwater dehazing based on neural networks (NN) has not yielded accurate images due to an over-reliance on scene depth and on atmospheric light estimated from image pixels. However, if considered in terms of the red, green and blue (RGB) color channels, these pixels can yield images whose visual appearance is closer to that of the raw images. Inspired by this fact, the proposed technique approaches the underwater image dehazing problem based on the difference in pixel arrangements within the RGB color channels. Thus, a novel triple-dual end-to-end NN, a triple-dual-path recurrent network (TDPRN), is proposed to model scene radiance and ambient light. The proposed TDPRN consists of a feature extraction block, a transmission map estimation block, a TDPRN block with a parallel interaction function, image reconstruction and the softmax function for image fusion. The network is modified from the existing network of [21]. Given a hazy underwater image, the network decomposes the image into the RGB channels, and the TDPRN then uses the feature extraction block and the transmission map estimation block to extract features from the color channels of the hazed underwater image. These features are then fed into the dual-path block via three parallel branches to restore the image features and improve the color of the dehazed images. Unlike the structure of [21], the proposed structure has three convolutional long short-term memory (ConvLSTM) units in each branch and a convolution layer based on the corresponding color channel's pixels. ConvLSTM is able to learn and store pixel-correlation information from the input image and compare it with the output. The communication between the interacting layers enables a comparison of the correlation patterns, thus enhancing the extraction of features in the output images. This communication and comparison help approximate the infinite impulse response (IIR) model already proposed in [21,22]. A parallel interaction function is also proposed to fuse the intermediary features between the branches. Thus, the basic features and information of the dehazed image are recovered alternately. The corresponding features based on each color channel are then processed stepwise to obtain the ultimate dehazed image via a series of softmax-weighted fusions, which are discussed in detail by Zhao et al. [23].
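Since the exact layer configuration is detailed later in the paper, the following highly simplified PyTorch sketch only conveys the triple-branch idea: one branch per RGB channel, ConvLSTM-based recurrent refinement in each branch and softmax-weighted fusion of the branch features. The layer sizes, the minimal ConvLSTM cell and the number of recurrent steps are illustrative assumptions, and the transmission map estimation block and parallel interaction function are omitted:

```python
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    """Minimal convolutional LSTM cell; all four gates from one convolution."""
    def __init__(self, ch, hidden, k=3):
        super().__init__()
        self.gates = nn.Conv2d(ch + hidden, 4 * hidden, k, padding=k // 2)

    def forward(self, x, state):
        h, c = state
        i, f, o, g = self.gates(torch.cat([x, h], dim=1)).chunk(4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, c

class TripleBranchSketch(nn.Module):
    """Three parallel branches (one per RGB channel) with ConvLSTM units,
    fused by per-pixel softmax weights before reconstruction."""
    def __init__(self, feat=16):
        super().__init__()
        self.extract = nn.ModuleList(
            [nn.Conv2d(1, feat, 3, padding=1) for _ in range(3)])
        self.recur = nn.ModuleList(
            [ConvLSTMCell(feat, feat) for _ in range(3)])
        self.weight = nn.Conv2d(3 * feat, 3, 1)  # logits for softmax fusion
        self.recon = nn.Conv2d(feat, 3, 3, padding=1)

    def forward(self, x, steps=3):
        feats = []
        for b in range(3):  # decompose the input into R, G, B branches
            f = torch.relu(self.extract[b](x[:, b:b + 1]))
            h = torch.zeros_like(f)
            c = torch.zeros_like(f)
            for _ in range(steps):  # recurrent refinement of the features
                h, c = self.recur[b](f, (h, c))
            feats.append(h)
        w = torch.softmax(self.weight(torch.cat(feats, dim=1)), dim=1)
        fused = sum(w[:, b:b + 1] * feats[b] for b in range(3))
        return self.recon(fused)

dehazed = TripleBranchSketch()(torch.rand(1, 3, 64, 64))  # (1, 3, 64, 64)
```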
The proposed technique, presented in Figure 1, can produce an output image with improved visual perception. Figure 1 summarizes the visual perception improvement of the proposed method compared to the input images: the top row contains the raw (hazed) images, and the bottom (second) row shows the corresponding output of the proposed method. The summary presented in Figure 1 indicates that the proposed technique can learn and reduce the effects of haze in the output images.
Contribution
This paper makes the following significant contributions:
The input image is decomposed according to the RGB color channels, and the features of each color channel are decomposed into two units based on pixel similarities via k-means clustering, described in detail in [24,25] (see the k-means sketch after this list). This guarantees the ease of adaptability and identification of similar pixels and thus, by extension, removes pixels with a weak correlation, leaving only pixels with a higher correlation.
The structure’s triple-dual and parallel interaction allows a comprehensive comparison; hence, even minor features, i.e., pixels with the weakest correlations, are considered. This improves the visual perception of the final image.
The use of softmax-weighted fusion in the arrangement of the proposed structure also preserves color, which explains why the color of the proposed result is very similar to the input color. This is achieved via adaptive learning based on the confidence levels of the pixel contribution variation in each color channel during the subsequent fusions (a fusion sketch follows after this list).
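For the channel decomposition in the first contribution, the sketch below shows one way to split a color channel's pixels into two units by intensity similarity using scikit-learn's k-means; the two-unit setting follows the text, while treating raw intensities as the clustered feature is an assumption:

```python
import numpy as np
from sklearn.cluster import KMeans

def split_channel_by_kmeans(channel, n_units=2):
    """Partition one color channel's pixels into `n_units` groups of
    similar intensity, returning one masked map per unit."""
    pixels = channel.reshape(-1, 1).astype(np.float64)
    labels = KMeans(n_clusters=n_units, n_init=10).fit_predict(pixels)
    labels = labels.reshape(channel.shape)
    return [np.where(labels == u, channel, 0.0) for u in range(n_units)]

# Example: split the red channel of an RGB image into two units.
img = np.random.rand(64, 64, 3)
red_unit_a, red_unit_b = split_channel_by_kmeans(img[..., 0])
```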
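Likewise, the softmax-weighted fusion in the third contribution can be sketched as below; the per-branch confidence maps are learned inside the full network, so random arrays stand in for them here:

```python
import numpy as np

def softmax_weighted_fusion(features, logits):
    """Fuse per-branch feature maps with softmax-normalized confidences.

    features : (3, H, W) dehazed estimates from the R, G, B branches
    logits   : (3, H, W) per-pixel confidence scores for each branch
    """
    e = np.exp(logits - logits.max(axis=0, keepdims=True))  # stable softmax
    w = e / e.sum(axis=0, keepdims=True)                    # weights sum to 1
    return (w * features).sum(axis=0)                       # (H, W) fusion

fused = softmax_weighted_fusion(np.random.rand(3, 64, 64),
                                np.random.rand(3, 64, 64))
```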