1. Introduction
In recent years, with the increased demands of photography, transportation, military, aerospace and other fields, image dehazing has gradually become a popular area of research in image processing.
Researchers generally deal with haze from the perspectives of image enhancement and image restoration. The former focuses on enhancing low-level features of images, such as contrast, sharpness, and edges, through techniques including low-light stretching, histogram equalization, and homomorphic filtering [1,2,3,4,5]. Building on the physical degradation model of foggy images, the latter recovers the image information lost in the fogging process by optimally estimating the haze-free image [6,7,8]. However, estimating the parameters of the degradation model is a challenging task, as it is an ill-conditioned problem given only a single hazy image. Therefore, some image restoration methods require additional information or assumptions [9,10,11].
Dong [12] used a haziness flag to measure the degree of haziness, obtaining an adaptive initial transmission value by establishing the relationship between image contrast and the haziness flag. The method has superior haze removal and color balancing capabilities for images with different haze densities. In addition, Fattal [13] estimated the scene irradiance and deduced the transmittance image based on the assumption that the transmittance and the surface shading in the scene are locally uncorrelated. To deal with outdoor images in sand-dust environments, Park [14] used successive color balance with a coincident chromatic histogram to adjust the pixels of each color component based on the statistical characteristics of the green component. Yan et al. [15] improved the dark channel prior theory and applied contrast-limited adaptive histogram equalization (CLAHE) to enhance the optimized transmittance image. This method achieves significant improvements over the original DCP method and can be applied to the defogging of dense-fog infrared images.
Han [16] analyzed the causes of underwater image degradation and surveyed state-of-the-art intelligent algorithms, such as deep learning, for underwater image dehazing and restoration, demonstrating the performance of different methods for underwater dehazing and color restoration. The paper introduced an underwater image color evaluation metric and provided an overview of the major underwater imaging applications, offering great convenience for follow-up research. Zhang et al. [17] designed a fully end-to-end network for single-image dehazing named the dense residual and dilated dehazing network (DRDDN). A dilated, densely connected block was designed to fully exploit multi-scale features through an adaptive learning process, and deep residual connections were used to propagate low-level features to high-level layers. Li et al. [18] used an adversarial game between a pair of neural networks to accomplish end-to-end photorealistic dehazing. The generator learned to simultaneously restore the haze-free image and capture the non-uniformity of the haze to avoid uniform contrast enhancement, and a task-driven training strategy was proposed to optimize object detection on dehazed images without updating the parameters of the object detector. Qin et al. [19] proposed an end-to-end feature fusion attention network (FFA-Net) to directly restore the haze-free image. The network is composed of three parts: (1) a novel feature attention (FA) module, (2) a basic block consisting of local residual learning and feature attention, and (3) an attention-based different-level feature fusion structure. The feature weights are adaptively learned by the FA module, giving more weight to important features. The experimental results demonstrated strong performance in both indoor and outdoor defogging.
All of the above deep learning-based defogging methods achieve good defogging effects, but they also have certain drawbacks. These models may perform badly under certain lighting conditions, such as strong sunlight and shadows. They require a large amount of training data, and a reduced training set can lead to overfitting and poor generalization. Meanwhile, because training these models requires substantial computing resources and data, deployment costs can be high. Most undesirable of all is the incompleteness of fog removal at varied haze densities for deep-learning-based dehazing methods, especially in situations with high haze density, which will be specifically shown in Section 3. Traditional defogging algorithms do not have such problems.
The dark channel prior (DCP) method proposed by He et al. [20] is one of the most famous single-image dehazing methods. DCP imposes the assumption that, in every color channel of the image, there exists an extremely dark pixel in each local non-sky patch [21]. Because implementing DCP with the soft-matting method [22] is computationally expensive, some studies [23,24,25] have employed guided filtering, bilateral filtering, and edge substitution to replace the soft-matting process, which significantly improves the efficiency of DCP. Salazar-Colores [26] proposed a methodology based on depth approximations through DCP, local Shannon entropy, and the fast guided filter to reduce artifacts and improve recovery in sky regions, with a significant decrease in calculation time. Peng [27] used depth-dependent color variation, scene ambient light differences, and an adaptive color-corrected image formation model (IFM) to better restore degraded images; this method produces satisfying restoration and enhancement results, and the approach has been shown to unify and generalize a wide variety of DCP-based methods for underwater, nighttime, haze, and sandstorm images. Singh [28] proposed a haze removal technique that integrates the dark channel prior with CLAHE to remove haze from color images; a bilateral filter was used to reduce noise, showing quite effective results in noise reduction and in correcting uneven illumination.
In this paper, in order to overcome the problems of overly dark results and incomplete defogging caused by inaccurate estimation of the atmospheric transmittance, the hybrid dark channel prior (HDCP) is proposed. In HDCP, Retinex [29] is first utilized to remove the interference of the illumination component and improve the brightness of the image, and the atmospheric light intensity is further refined iteratively. Then, a tolerance-improved DCP is introduced to obtain the dehazed image. In the algorithm, a variant genetic algorithm (VGA) [30] is proposed to enhance the grayscale of the original image, which is used as the guidance image in guided filtering to optimize the transmittance. To verify the algorithm, the public datasets O-HAZE [31] and NYU2 [32] are used as the experimental images. Compared with other DCP-based algorithms, the average MSE of the proposed method decreases by 26.98%, the average SSIM increases by 10.298%, and the average entropy increases by 7.780%. Compared with conventional DCP, the result of the proposed algorithm has higher brightness and a more complete degree of fog removal, with no serious image or color distortions.
2. Materials and Methods
Images with haze are characterized by uneven illumination, low visibility, and low contrast. The atmospheric scattering model describes the degradation process of foggy images [33] and is expressed as:
I(x) = J(x)t(x) + A(1 − t(x)), (1)
where I(x) represents the original hazy image, x represents a pixel, J(x) is the clear image restored by dehazing, A is the atmospheric luminance intensity, and t(x) is the transmittance. The unknowns A and t(x), the keys to obtaining a clear image J(x), are generally estimated by algorithms such as DCP.
DCP is designed based on a basic assumption, which is that, in most non-sky local areas, there are some pixels with very low values in at least one color channel, approaching 0. Therefore, the relevant parameters are estimated as follows.
The atmospheric luminance intensity A is normally estimated from the pixels with the highest fog concentration and highest brightness in the image [34]. The transmittance t(x) is calculated as [20]:
t(x) = 1 − ω · min_{y∈Ω(x)} ( min_c ( I^c(y) / A^c ) ),
where Ω(x) is a local patch centered at x, c is a color channel of I, and ω = 0.95 is a constant that preserves a natural image perception: a small amount of haze is retained in the result, especially for distant objects. In effect, the dual min operation computes the dark channel of the normalized hazy image, which directly provides the estimate of the transmission.
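Once A and t(x) are estimated, the scattering model in Equation (1) can be inverted for J(x). A minimal sketch in Python (the lower bound t0 follows common DCP practice; its value here is illustrative):

```python
import numpy as np

def recover(I, A, t, t0=0.1):
    """Invert the atmospheric scattering model:
    J(x) = (I(x) - A) / max(t(x), t0) + A.
    t0 is a lower bound that avoids amplifying noise where t is tiny."""
    t = np.maximum(t, t0)[..., None]  # broadcast over the color channels
    return (I - A.reshape(1, 1, 3)) / t + A.reshape(1, 1, 3)
```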
DCP has good defogging effects for landscape photos. However, its disadvantage is color distortion in bright areas or in areas dominated by gray and white, which results in dark dehazed images [35].
In order to solve the above problems, this paper proposes a hybrid dark channel prior (HDCP) algorithm.
In Figure 1, Retinex is used in preprocessing to remove the interference of the illumination component. In the modified DCP, the atmospheric light intensity Ai of each of the RGB channels is estimated iteratively after the dark channel is determined. Then, the transmittance t(x) is optimized by guided filtering, with the grayscale image of R(x), enhanced by the variant genetic algorithm (VGA), serving as the guidance image in the filter. Finally, the fog-free image J(x) is obtained according to the atmospheric scattering model.
2.1. Retinex Algorithm
Based on Retinex, the original image I is expressed as:
I(x) = R(x) · L(x),
where I(x) is the original image, R(x) is the reflectance component, and L(x) is the illumination component.
The purpose of image enhancement based on Retinex is to estimate L(x) from I(x) and thereby decompose R(x). In the meantime, the effect of uneven brightness is eliminated, and the visual effect of the image improves.
Conventional Retinex produces halo effects in areas with large brightness differences. In this paper, McCann’s Retinex [29] is employed, as it is suitable for enhancing images with shallow shading or uneven illumination. The reflectance component for the center pixel in the window is expressed as:
where Rc is the final reflectance component estimate for the center pixel, R0 is the largest pixel value, Ic and Im are the logarithmic values of the center point and the selected point, respectively, i indexes the different points, and m is the total number of selected pixels.
2.2. Modified DCP
Computing the dark channel first requires calculating the minimum of the RGB components for each pixel and saving it in a grayscale image of the same size as the original image. This grayscale image is then processed by a minimum filter.
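These two steps can be sketched as follows (the patch size is an illustrative choice; SciPy’s minimum filter stands in for the minimum filtering step):

```python
import numpy as np
from scipy.ndimage import minimum_filter

def dark_channel(image, patch_size=15):
    """Compute the dark channel of an RGB image (H x W x 3, floats in [0, 1])."""
    # Per-pixel minimum across the three color channels.
    min_rgb = image.min(axis=2)
    # Minimum filter over the local patch Omega(x).
    return minimum_filter(min_rgb, size=patch_size)
```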
In order to improve the estimation accuracy of the atmospheric light intensity A, an iterative method is introduced that treats the RGB channels separately. The scheme for Ai is shown in Figure 2. Ai is specified as:
where the candidate set comprises the top 0.2% of points by brightness value in the dark channel of R(x), in descending order. The value of Ai is obtained by repeatedly comparing it with, and updating it from, the average of the corresponding pixel points in this candidate set, so each iteration is compared against the current estimate. Through the iteration, points whose dark-channel brightness is not especially prominent are also taken into account.
In this way, the atmospheric light intensity Ai corresponding to each of the RGB channels is obtained.
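The paper’s exact iterative update follows Figure 2 and is not reproduced here; the common baseline that this scheme refines, averaging each RGB channel over the brightest 0.2% of dark-channel pixels, can be sketched as:

```python
import numpy as np

def atmospheric_light(image, dark, top_fraction=0.002):
    """Estimate per-channel atmospheric light A_i from the brightest
    dark-channel pixels. `image` is H x W x 3, `dark` is H x W."""
    h, w = dark.shape
    n_top = max(1, int(h * w * top_fraction))
    # Indices of the top 0.2% brightest dark-channel pixels.
    flat_idx = np.argsort(dark.ravel())[-n_top:]
    pixels = image.reshape(-1, 3)[flat_idx]
    # Average each RGB channel over the candidate set.
    return pixels.mean(axis=0)
```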
In place of the computationally expensive soft-matting method [22] for optimizing the transmittance, guided filtering is employed: the grayscale image of the original image is used as the guidance image, and the transmittance map is filtered to preserve the edges. The fog-free image J(x) is finally obtained according to Equation (1).
2.3. Variant Genetic Algorithm (VGA)
The accuracy of the grayscale image used as the guidance image affects the transmittance optimization. In this paper, a variant genetic algorithm (VGA) is used to obtain the transfer function, whose role is to map the original image to its corresponding high-contrast image. The guidance image Rg(x) for guided filtering is thus obtained. The scheme of the VGA is shown in Figure 3.
The illumination components from Retinex are first converted into a grayscale image, which is used as the guidance image. The VGA updates the transfer function by varying the parameter set. The feedback function is also updated in every round for the subsequent update of the transfer function, and the fitness function is used to verify the quality of the current transfer function. A new parameter set is generated through crossover and mutation to obtain a new transfer function.
The low-contrast image (pixel values ranging from Iin-min to Iin-max) is converted into a high-contrast image (0–255) by the mapping of the transfer function. The generated transfer function must remain monotonically increasing. All points less than Iin-min are set to 0, and all points greater than Iin-max are set to 255. The transfer function is traced out by an exploration point that travels from the lower-left point (Iin-min, 0) to the upper-right point (Iin-max, 255), with three possible movement directions (up, right, and upper right); the whole process resembles drawing a curve from the bottom left to the top right. The next derived point is selected by the roulette-wheel technique, with the selection probability P(i) calculated from the neighborhood points as:
where G(i) is the set of neighborhood points around the exploration point; i, taking the values 1, 2, and 3, represents the upper, right, and upper-right neighbors, respectively. τi is the magnitude of the feedback function, determined by the previous iteration and used to control the corresponding probability: the larger the feedback function, the greater the probability of moving in that direction. ηi is the heuristic value. ki is associated with the current exploration point and records the distance it has traveled in the horizontal and vertical directions. γ, α, and β are constants that can be changed by the VGA; among these, α and β control the weights of the feedback function and the heuristic value, and the combination of γ and ki controls the probabilities of moving up or right.
The purpose of the heuristic value is to obtain a monotonically increasing transfer function. The specific settings of ηi are η1 = Cup, η2 = Cright, and η3 = 1, with the values for all other neighbors being 0. The specific settings of ki are k1 = Iin − Iin-min, k2 = Iout, and k3 = constant, with ki set to 0 for any other neighbor. Therefore, for P(1) in the upper direction, k1 gives the distance that the exploration point has moved to the right. Similarly, for P(2) in the right direction, k2 gives the distance that the exploration point has moved upward. The parameters α, β, γ, Cup, and Cright are determined by the variant genetic algorithm.
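The direction choice itself uses standard roulette-wheel selection over the three direction weights. A generic sketch (the weights would come from the probability expression above, so the values passed in here are placeholders):

```python
import random

def roulette_select(weights, rng=random):
    """Pick index i with probability weights[i] / sum(weights)."""
    total = sum(weights)
    r = rng.uniform(0, total)
    acc = 0.0
    for i, w in enumerate(weights):
        acc += w
        if r <= acc:
            return i
    return len(weights) - 1  # guard against floating-point round-off
```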
After 20 exploration points have been selected to update the transfer function, the feedback function is updated as below:
where ρ is the reduction rate of the feedback function, set to 0.4, and the increment term is the magnitude of the feedback-function update contributed by the l-th exploration point between points i and j, which equals F/(30 × BF). F is the fitness value of the l-th exploration point, and BF is the best fitness value, used to normalize the feedback function. F is defined as below:
where STD and ENTROPY are the global standard deviation and the information entropy of the grayscale image enhanced by the transfer function, respectively. Their specific expressions are as follows:
where the mean term is the average value of Rg over the pixels, and Pi refers to the proportion of pixels whose gray value is i in the image. Additionally, SOBEL is the average intensity of the gradient image obtained by applying the vertical and horizontal Sobel operators [36]. SOBEL is defined as:
where the two filtered images are obtained by applying the vertical and horizontal Sobel operators, respectively, and the mean(·) operator denotes averaging.
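A sketch of such a fitness evaluation follows; the exact combination of STD, ENTROPY, and SOBEL in the paper’s equation is not reproduced here, so their product is used as a placeholder:

```python
import numpy as np
from scipy.ndimage import sobel

def fitness(gray):
    """Fitness of an enhanced grayscale image (uint8, H x W), combining
    global standard deviation, information entropy, and mean Sobel
    gradient intensity. The product of the three terms is a placeholder
    for the paper's actual combination."""
    std = gray.std()
    # Information entropy from the gray-level histogram.
    hist = np.bincount(gray.ravel(), minlength=256) / gray.size
    entropy = -np.sum(hist[hist > 0] * np.log2(hist[hist > 0]))
    # Mean gradient magnitude from vertical and horizontal Sobel operators.
    gx = sobel(gray.astype(float), axis=1)
    gy = sobel(gray.astype(float), axis=0)
    sobel_mean = np.mean(np.hypot(gx, gy))
    return std * entropy * sobel_mean
```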
In VGA, the reproduction stage is carried out by crossover and mutation, with a population size of 20. Reproduction encodes the parent parameter set (i.e., α, β, γ, Cup, and Cright) as a binary sequence (also called a chromosome). This paper adopts a uniform crossover method with a probability of 85%. Mutation perturbs the code with a probability of 0.05; in this algorithm, a mutation changes only one of the 5 parameters in the set and is limited to 10% of the original value.
VGA controls the generation process of the transfer function. In the initial stage, VGA needs crossovers and mutations in each iteration to achieve fast optimization, but in subsequent iterations, the numbers of crossovers and mutations need to be reduced. Through experiments, setting the number of VGA iterations to 10 is most appropriate, considering the final effect and the processing speed. The GA participates in iterations 1, 2, 4, 6, and 9.
After the transfer function is obtained, the grayscale image of R(x) can be enhanced to obtain the guidance image Rg(x), and guided filtering can be used to optimize t(x).
The problem of image color distortion is often caused by inaccurate estimation of t(x) [37]. In this paper, a tolerance K is divided by the difference between the pixel value Ri(x) and the atmospheric light intensity Ai to further ensure that the color of the restoration result is not distorted:
The tolerance K is a constant whose value is between 0 and 1. Ri(x) and Ai are normalized values, and the quotient K/|Ri(x) − Ai| multiplies the transmittance t(x) to amplify it.
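A minimal sketch of this tolerance correction, assuming (as in tolerance-based DCP variants) that the amplification factor K/|Ri(x) − Ai| is floored at 1 and the corrected transmittance is clamped; the K and t0 values are illustrative:

```python
import numpy as np

def correct_transmission(t, image, A, K=0.3, t0=0.1):
    """Amplify t(x) where a pixel is close to the atmospheric light
    (|R_i(x) - A_i| < K), which otherwise causes color distortion.
    K and t0 are illustrative values, not the paper's."""
    # Per-pixel distance to the atmospheric light, max over channels.
    diff = np.abs(image - A.reshape(1, 1, 3)).max(axis=2)
    # Amplification factor K / |R - A|, never below 1 (no attenuation).
    factor = np.maximum(K / np.maximum(diff, 1e-6), 1.0)
    return np.clip(t * factor, t0, 1.0)
```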
3. Experimental Results and Analysis
In order to verify the effectiveness of the proposed algorithm, the public O-HAZE dataset, with varied haze densities, is selected for comparison. In O-HAZE, the fog is real haze generated by a professional haze machine, and the same visual contents are recorded in both hazy and haze-free conditions under the same lighting. The dehazing methods of Salazar [26], Peng [27], Yan [15], and Qin [19] are cited as comparisons. The results are shown in Figure 4. DCP is the conventional dark channel prior dehazing algorithm; FDCP is the method of Salazar [26]; GDCP is the method of Peng [27]; MDCP is the method of Yan [15]; FFA is the method of Qin [19]; HDCP is the proposed method.
Group (a) has abundant colors. The results are entirely distinct. The restored pictures of DCP and FDCP are too dark. The restored picture of GDCP has too much exposure, which leads to the loss of the original color. Meanwhile, the fog in the distant woods in the upper left of the picture is not completely removed. In order to show the line and color details of the restored image, enlarged contrasts in group (a) are added at the bottom of the figure, which focuses on enlarging the color cards placed in the figure. From the results, it can be seen that the processed pictures of DCP and FDCP are too dark, and the color contrast is low, which does not achieve the ideal situation. The results of GDCP are seriously blurred, and the details are markedly lost. The results of MDCP have been greatly improved, but the fog has not been completely removed. The FFA results are relatively complete overall, but the image is a bit dark. The quality of the image restored by the proposed method is significantly improved. The colors are richer and more realistic. The contrast and clarity are better. The texture details are clear.
In group (b), the white of the chairs and the grey of the ground occupy a large part of the picture. The fog concentration of the original picture is relatively high, which makes the implementation of the dehazing algorithms more difficult. Additionally, the whole picture is blurred, and the textures are notably missing. The result of DCP directly suffers from severe color distortion. The result of FDCP has been improved, but the problems of dark picture brightness and insufficient texture details still exist. The results of GDCP still have the same problem. The brightness is too high, and the fog removal is incomplete. The result of MDCP is more realistic, but the problem of incomplete mist removal still exists. The result of FFA is generally grayish so that the entire photo appears unnatural. The results of the proposed algorithm are better. The fog removal is more complete. The contrast is higher. The picture is more realistic, where the contrast color card on the chair is more clearly visible.
Group (c) mixes chairs, columns, and woods with more texture details behind them. The colors of the columns and chairs are not uniform, and the details of the woods are missing. The results of DCP are still dark and distorted. The overall appearance is bluish, especially for the pillars on both sides. The results of FDCP are similar to DCP. However, the color tone of the columns is more realistic and reasonable. GDCP still has high brightness and incomplete edge details. The resulting color of MDCP is biased towards situations similar to grayscale images, probably due to the modification of the prior theory of dark channels. The images of FFA have been improved significantly, but the contrast is slightly insufficient. The detailed information on the ground is incomplete. The result of the proposed method is more similar to reality, with significantly improved contrast and clarity. The colors of the columns and chairs are more uniform and similar to the original image.
In group (d), the concentration of the fog in the picture is not uniform. There is more fog in the upper middle of the picture. Additionally, this group of pictures has the richest texture details. The picture of DCP is still dull, and there are more black blocks in the grass, such as where the red box shows. The result of FDCP has been improved to a certain extent. However, the texture details in the grass are still insufficient. The result of GDCP has been completely degraded, and no valid information can be obtained. MDCP improves the darkening of the image, but the green and yellow parts of the bushes are missing, and the overall color is lost. In the result of FFA, the fog removal is incomplete, but the texture details are relatively complete. The result of HDCP is more in accord with the real situation. The fog removal is more complete. The texture details are rich. Additionally, there are fewer black blocks as interference.
It can be seen that the method proposed in this paper has a wide range of applicability, and it can also play a good role in defogging in the case of complex environments and uneven fog concentration.
In order to further verify the performance of the proposed algorithm and objectively evaluate the enhancement quality, this paper adopts the mean squared error (MSE), peak signal-to-noise ratio (PSNR), structural similarity (SSIM), and information entropy (E) of the image as evaluation criteria.
(1) Mean squared error (MSE):
MSE = (1/(m·n)) Σ_{i=1..m} Σ_{j=1..n} [I(i, j) − J(i, j)]²,
where I(i, j) and J(i, j) are the original and restored images with sizes of m by n.
(2) Peak signal-to-noise ratio (PSNR):
PSNR = 10 · log10(MAX_I² / MSE),
where MAX_I is the maximum possible pixel value of the image; for uint8 data, the maximum pixel value is 255.
(3) Structural similarity index (SSIM):
SSIM(x, y) = (2·μx·μy + c1)(2·σxy + c2) / ((μx² + μy² + c1)(σx² + σy² + c2)),
where x and y are the two images, μx and μy are the pixel means of x and y, σx² and σy² are the variances of x and y, σxy is the covariance of x and y, and c1 and c2 are two constants used to maintain stability and avoid division by zero.
(4) Information entropy (E):
where i, j, and k index the image dimensions, and pijk is the probability of occurrence of the pixel value pij in channel k.
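The first two criteria can be computed directly from their definitions; a small sketch:

```python
import numpy as np

def mse(I, J):
    """Mean squared error between original I and restored J (same shape)."""
    return np.mean((I.astype(float) - J.astype(float)) ** 2)

def psnr(I, J, max_val=255.0):
    """Peak signal-to-noise ratio in dB; max_val is 255 for uint8 images."""
    m = mse(I, J)
    return float('inf') if m == 0 else 10.0 * np.log10(max_val ** 2 / m)
```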
The different algorithms are compared as follows. The best-performing indexes are shown in bold.
In Table 1, the proposed method has the smallest MSE, indicating that its results deviate the least from the ground-truth images.
In Table 2, there is no significant difference, either statistically or numerically, among DCP, FDCP, and GDCP: the conventional DCP and the two improved DCP algorithms have almost the same PSNRs. The PSNRs of MDCP and FFA are relatively close. The proposed HDCP has the highest PSNR, indicating that its results exhibit the least distortion.
As shown in Table 3, in terms of SSIM, FFA has the best results, and the gap between HDCP and FFA is not significant, indicating that the processed images are highly similar to the original images.
For the entropy in Table 4, the method in this paper has a clear advantage; the larger the value, the richer the information contained in the restored image.
For generality, this paper processed all the images in the O-HAZE dataset, comprising 45 pairs of foggy and fog-free photos. The average MSE, PSNR, SSIM, and entropy are listed in Table 5.
Table 5 shows that the average PSNRs of all six methods are relatively close. The average SSIM of HDCP is in second place, and the gap between the first two algorithms is small. Compared with the other DCP-based algorithms, the average MSE of the proposed method decreases by 26.98%, the average SSIM increases by 10.298%, and the average entropy increases by 7.780%.
In order to show defogging abilities for different haze densities, the public dataset of NYU2 was then selected as the experimental object. The results are as follows:
In Figure 5, the first row shows images with continuously increasing fog densities relative to the original image, and the following rows show the corresponding restored images. In the results of FDCP, as the haze density increases, the fog is not removed cleanly. The results of GDCP are unstable. The results of MDCP show a certain stability, except for the last image. FFA yields results similar to FDCP, but its objective evaluation indicators deteriorate seriously. The proposed method can still completely remove the influence of the fog, and its processing results remain basically stable even as the haze density increases.
For generality, all 1750 photos of the public dataset of NYU2 were processed. Among them, there were 250 individual scenes with 7 levels of haze densities. The evaluation criteria were calculated, and the statistics of the results are as follows.
Table 6 shows that the proposed method retains a clear advantage across the different haze densities: the performance of HDCP exceeds the values of the compared algorithms on every criterion. Among them, the average MSE decreases by 49.29%, and the average information entropy increases by 3.029%.
In summary, the proposed algorithm achieves better performance in improving the quality of hazy images, with improvements in contrast and sharpness and more abundant details. Moreover, the colors of the images enhanced by this algorithm are closer to reality, and the color fidelity is higher. Finally, when faced with images of different fog densities, the method presented in this paper exhibits strong stability in terms of both subjective visual quality and objective evaluation indicators.