Article

Single Image Haze Removal from Image Enhancement Perspective for Real-Time Vision-Based Systems

1 Department of Electronics Engineering, Dong-A University, Busan 49315, Korea
2 Faculty of Electronics and Telecommunication Engineering, The University of Danang—University of Science and Technology, Danang 550000, Vietnam
* Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Sensors 2020, 20(18), 5170; https://doi.org/10.3390/s20185170
Submission received: 3 August 2020 / Revised: 3 September 2020 / Accepted: 8 September 2020 / Published: 10 September 2020
(This article belongs to the Section Electronic Sensors)

Abstract

Vision-based systems operating outdoors are significantly affected by weather conditions, notably those related to atmospheric turbidity. Accordingly, haze removal algorithms, actively researched over the last decade, have come into use as a pre-processing step. Although numerous approaches have been proposed, an efficient method coupled with a fast implementation is still in great demand. This paper proposes a single image haze removal algorithm with a corresponding hardware implementation for facilitating real-time processing. Contrary to methods that invert the physical model describing the formation of hazy images, the proposed approach mainly exploits computationally efficient image processing techniques such as detail enhancement, multiple-exposure image fusion, and adaptive tone remapping. Therefore, it possesses low computational complexity while achieving good performance compared to other state-of-the-art methods. Moreover, the low computational cost also brings about a compact hardware implementation capable of handling high-quality videos at an acceptable rate, that is, greater than 25 frames per second, as verified with a Field Programmable Gate Array chip. The software source code and datasets are available online for public use.

1. Introduction

Images or videos taken outdoors usually suffer from an apparent loss of contrast and details owing to the inevitable adverse effects of bad weather conditions. Spatially varying degradation sharply decreases the performance of computer vision and consumer applications, such as surveillance cameras, autonomous driving vehicles, traffic sign detecting systems, and notably face recognition, which is widely adopted on smartphones and Internet-of-Things devices [1]. Haze removal, also known as dehazing or defogging, addresses this problem by eliminating the undesirable effects of the transmission medium and restoring clear visibility.
In general, haze removal algorithms fall into two categories—single- and multiple-image algorithms. Even though the latter is no longer of interest to researchers, its superior performance is worthy of attention. According to Rayleigh’s scattering law [2], the scattering of incoming light in the atmosphere is inversely proportional to the fourth power of the wavelength, and therein lies the cause of the wavelength-dependent haze distribution. Thus, the polarizing filter, which allows light waves of a specific polarization to pass through while blocking light waves of other polarizations, was widely used in photography as a first attempt to remove haze. However, polarization filtering alone could not remove the hazy effects because the atmospheric scattering phenomenon occurs across all wavelengths. As a result, Schechner et al. [3] developed a method that took at least two images captured under different polarization angles to estimate the airlight and the scaled depth. Subsequently, they reversed the hazy image formation process to restore the original haze-free scene radiance. On the other hand, Narasimhan and Nayar [4] proposed an approach that could function properly with as few as two images taken in different weather conditions, and Kopf et al. [5] exploited existing georeferenced digital terrain and urban models to remove haze from outdoor images. Notwithstanding the impressive dehazing performance, researchers lost interest in multiple-image algorithms due to the difficulties of the input acquisition process, consequently leading to the rapid development of single-image haze removal algorithms.
Researchers usually approach removing haze from a single input image from two perspectives: image restoration and image enhancement. In the former, a physical model, referred to as the Koschmieder atmospheric scattering model [6], describes the formation of hazy images in the atmosphere. The incoming light is not only scattered by microscopic particles like dust or water droplets, but it is also attenuated when traversing the transmission medium. Therefore, solving the Koschmieder model with the two unknown variables representing the phenomena mentioned earlier is considered problematic. He et al. [7] proposed the dark channel prior (DCP), which states that there exist some pixels whose intensity is very low in at least one color channel of non-sky local patches. DCP provided an efficient way to estimate the patch-based extinction coefficients of the transmission medium. Subsequently, the computationally expensive soft-matting [8] was employed to suppress the block artifacts, giving rise to the DCP’s time-consuming drawback. They later proposed the guided image filter [9], which could replace soft-matting to accelerate the DCP at the cost of a certain degradation in image quality. The DCP has also been developed in many directions to overcome its shortcomings of slow processing rate and unpredictable performance in sky regions [10,11,12,13,14,15,16]. A fast dehazing method proposed by Kim et al. [15] and its corresponding 4K-capable intellectual property (IP) presented in Reference [17] are cases in point. By exploiting the modified hybrid median filter, which possesses excellent edge-preserving characteristics, to estimate the atmospheric veil, their method eliminated the need for soft-matting refinement, resulting in considerably faster processing speed, albeit with a slight degradation in image quality. Based on the observation that the scene depth positively correlates with the difference between the image’s brightness and saturation, Zhu et al. [18] developed a linear model known as the color attenuation prior (CAP). They then estimated its parameters in a supervised learning manner. Since the transmission decays exponentially with the scene depth, CAP provided a fast way to solve the Koschmieder model. However, according to a detailed investigation conducted by Ngo et al. [19], it suffered from a few shortcomings, such as background noise, color distortion, and reduced dynamic range. Tang et al. [20] utilized another machine learning technique called random forest regression to calculate the extinction coefficients from a set of multi-scale image features including DCP, local maximum contrast, hue disparity, and local maximum saturation. Ngo et al. [21] exploited the simplex-based Nelder-Mead optimization to find the optimal transmission map. They additionally devised an adaptive atmospheric light to account for the heterogeneity of lightness. Although the dehazing power of the methods proposed by Tang et al. [20] and Ngo et al. [21] is quite impressive, their highly expensive computations result in several practical difficulties. Recently, deep learning techniques, notably convolutional neural networks (CNNs), have been used to learn the unknown variables of the Koschmieder model from collected data. Cai et al. [22] developed a shallow-but-efficient CNN dubbed DehazeNet, taking a hazy image and producing its corresponding extinction coefficients.
Other studies [23,24,25,26] have attempted to improve performance by either increasing the receptive field via deeper networks or developing a more sophisticated loss function as a surrogate for the universally used mean squared error. However, they all share the same problem of lacking a real training dataset comprising pairs of hazy and haze-free images. This drawback imposes a limit on their dehazing power. Interested readers are referred to a comprehensive review provided by Li et al. [27].
Since estimating two unknown variables in the Koschmieder model is somewhat computationally expensive, researchers have attempted to dehaze images employing image enhancement techniques. Instead of relying on a physical model, they first attempted to enhance low-level features such as contrast, sharpness, and brightness to alleviate the adverse effects of haze. Low-light stretch [28], unsharp masking [29,30,31], and histogram equalization [32,33] are cases in point. Nonetheless, since these methods did not take the underlying cause of haze-relevant distortion into account, they were merely appropriate for images obscured by thin haze. With recent efforts in exploiting image fusion techniques, this category of haze removal algorithms has become haze-aware. Ancuti et al. [34] developed a method in which the fusion was performed in a multi-scale manner with weight maps derived from luminance, chrominance, and salience information. Choi et al. [35] investigated haze-relevant features such as contrast energy, image entropy, normalized dispersion, and colorfulness, to name but a few. They then developed a more sophisticated weighting scheme for selectively blending regions with good visibility into the restored image. Galdran [36] also removed the hazy effects utilizing multi-scale fusion, but the weight maps solely comprised contrast and saturation. As haze-relevant image features constituted the weight maps, these approaches achieved improved performance and extended their applicability to a wide variety of images with different haze densities. However, the multi-scale fusion process, represented by the Laplacian pyramid, is quite expensive because of the image buffers and line memories that result from up- and down-sampling operations.
In this paper, we present a novel and simple image enhancement-based haze removal method capable of producing satisfactory results. Based on the observation that haze often obscures image details and increases brightness, a set of detail-enhanced and under-exposed images derived from a single hazy image is employed as inputs to image fusion. The corresponding weight maps are calculated according to DCP, which is well recognized as a good haze indicator. Then, the fusion process is simply equivalent to a weighted sum of images and weight maps. Finally, a post-processing method known as adaptive tone remapping is employed for expanding the dynamic range. Thus, the proposed algorithm is computationally efficient and haze-aware, while its compact hardware counterpart is capable of handling videos in real-time. Figure 1 depicts a general classification of haze removal algorithms as a summary of this section.
The remainder of this paper is organized as follows. Section 2 introduces the Koschmieder model and explains the relation between under-exposure and haze removal. Section 3 presents the proposed algorithm by sequentially describing individual main computations. Section 4 conducts a comparative evaluation with other state-of-the-art methods. Section 5 describes a hardware architecture for facilitating real-time processing, while Section 6 concludes the paper.

2. Preliminaries

The Koschmieder model is used here to highlight the importance of under-exposure in haze removal. This physical model describes the formation of hazy images by taking the atmospheric scattering phenomenon into account; hence, it can be exploited to derive the relation between hazy and clear images. Because the haze-free image contains lower intensity values than its hazy counterpart, the dehazing effect can be achieved by fusing a set of under-exposed images.

2.1. Koschmieder Model

The light waves reflected from the object, also called the object radiance, usually suffer from two main distortion types. The first one is direct attenuation and represents the gradual extinction of the object radiance in the transmission medium. In contrast, the second one is airlight and represents the scattering phenomenon when the reflected light waves encounter the atmospheric aerosols. As shown in Figure 2, they are both dependent on the distance from the observer (e.g., camera) to the object. Hence, the formation of hazy images is expressed in terms of proportionally diminishing attenuation and increasing airlight as follows:
$$ I(x) = J(x) \cdot e^{-\beta(\lambda) d(x)} + A \cdot \left( 1 - e^{-\beta(\lambda) d(x)} \right) $$
where I, J, d, and A denote the captured image, the clear image, the scene depth, and the global atmospheric light, respectively; x represents the spatial coordinates of image pixels; β stands for the extinction coefficient of the atmosphere, and λ is the wavelength. As mentioned earlier in Section 1 about multiple-image haze removal, the wavelength-dependent β(λ) causes the corresponding wavelength-dependent haze distribution. However, this dependency is widely assumed to be negligible in virtually all haze removal algorithms. Thus, by letting t(x) = exp(−β(λ)d(x)) be the transmission map, Equation (1) is re-written as follows:
$$ I(x) = J(x) \cdot t(x) + A \cdot \left( 1 - t(x) \right) $$
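As a concrete illustration of Equation (2), the following minimal numpy sketch synthesizes a hazy image from a clear image; the atmospheric light and extinction coefficient values are arbitrary assumptions chosen only for demonstration.

```python
import numpy as np

def synthesize_haze(J, d, A=0.9, beta=1.0):
    """Apply the Koschmieder model of Equation (2) to a clear image.

    J    : clear image, float array in [0, 1], shape (H, W, 3)
    d    : scene depth map, float array in [0, inf), shape (H, W)
    A    : global atmospheric light (scalar, assumed wavelength-independent)
    beta : extinction coefficient of the atmosphere
    """
    t = np.exp(-beta * d)          # transmission map t(x) = exp(-beta * d(x))
    t = t[..., np.newaxis]         # broadcast over the three color channels
    I = J * t + A * (1.0 - t)      # direct attenuation + airlight
    return np.clip(I, 0.0, 1.0)

# Example: a constant-depth scene becomes uniformly brighter and lower in contrast.
J = np.random.rand(4, 4, 3)
I = synthesize_haze(J, d=np.full((4, 4), 2.0))
```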

2.2. Pertinence of Under-Exposure to Haze Removal

The term exposure is used in photography to indicate the amount of light that reaches the electronic image sensors, and it is determined by shutter speed (i.e., exposure time) and lens aperture. Since the range of image intensities recorded is limited, for example, 256 possible intensity values of 8-bit RGB data, correct exposure is of great importance. If the exposure time is too short, an image suffers from a loss of shadow details, as shown in the desk depicted in Figure 3a. In contrast, if the digital sensors are exposed to light for too long, important bright parts of an image appear as clipped whites, as depicted in the window of Figure 3b. Thus, image fusion, also called image blending, that makes use of both under-exposed and over-exposed images is widely employed to provide a precisely exposed image. Since the atmospheric scattering phenomenon increases the amount of light entering the lens aperture, reducing the exposure time effectively restores bright details faded by haze at the cost of losing some shadow details. Hence, selectively blending clear areas from images of different exposures is similar to removing the hazy effects from the captured image.
In order to show that the deduction mentioned above is mathematically valid, we first derive another formula for the transmission map by rearranging Equation (2). Then, the exponential relation between t(x) and d(x) is used to demonstrate that d(x) ∈ [0, ∞) leads to t(x) ∈ (0, 1]. Applying the condition t(x) ≤ 1 to Equation (3) results in J(x) ≤ I(x) because the global atmospheric light is generally larger than most image pixels. The relation between J(x) and I(x) implies that the scene radiance increases in intensity due to the atmospheric scattering phenomenon. Therefore, by blending several under-exposed images derived from the single hazy input, clear visibility can be restored.
$$ t(x) = \frac{I(x) - A}{J(x) - A} $$
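For clarity, the rearrangement and the resulting inequality can be written out explicitly (a short derivation sketch consistent with Equations (2) and (3)):

```latex
% Rearranging Equation (2):
I(x) - A = \bigl( J(x) - A \bigr)\, t(x)
\;\Longrightarrow\;
t(x) = \frac{I(x) - A}{J(x) - A},
\qquad
J(x) = A + \frac{I(x) - A}{t(x)}.
% Since t(x) \in (0, 1] and I(x) \le A for most pixels, dividing the non-positive
% quantity I(x) - A by t(x) can only decrease it, hence J(x) \le I(x):
% the haze-free radiance is darker than the observed hazy image.
```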

3. Proposed Algorithm

As under-exposing an image requires human intervention for adjusting either shutter speed or lens aperture, it cannot be attained in an automated manner. Accordingly, a simple and efficient technique called gamma correction is exploited to mimic the physical under-exposure. However, since gamma correction simply applies the same amount of decrease to the entire image, bright details faded by haze still remain obscure in the artificially under-exposed images. To overcome this issue, the sole input of a hazy scene is pre-processed by a detail enhancement algorithm to restore faded details. For accurately blending haze-free areas into the fused image, weight maps are first calculated with regard to dark channels and then normalized to avoid the out-of-range problem. Nevertheless, the fused image is darker than the hazy input as a result of fusing a set of under-exposed images. Hence, a post-processing algorithm called adaptive tone remapping is employed to enhance the luminance and emphasize the chrominance. Figure 4 depicts the overall block diagram of the proposed algorithm. Individual computing processes are described in the following subsections.

3.1. Detail Enhancement

Detail enhancement, also called sharpness enhancement, is an image processing technique that involves three main steps: (i) decomposing the input image into background and detail signals, (ii) multiplying the latter (i.e., detail signal) by an adequate factor, and (iii) adding the enhanced details back to the background signal. However, since decomposition usually involves applying an edge-preserving filter iteratively [29,31], the hardware realization is quite cumbersome. A variant of detail enhancement that can circumvent this issue using the Laplacian operator is a viable alternative [30], given by:
$$ Y' = \kappa(v) \cdot \left( h_v * Y + h_h * Y \right) + Y $$
where Y denotes the luminance channel of the input image; h_v and h_h represent the vertical and horizontal Laplacian operators, respectively; * stands for the convolution operation; κ(v) refers to the adaptive scaling factor calculated using the image's local variance v, and Y′ is the enhanced luminance. In previous work, Ngo et al. [30] proposed a scaling factor that uses three distinct gains for clear, slightly degraded, and heavily degraded areas. However, this type of weighting scheme is prone to visual artifacts due to the abrupt transition between gain values. In this paper, we therefore develop a weighting scheme with a linear form. The formula for calculating the proposed scaling factor is given in Equation (5), where (v_1, v_2) and (κ_1, κ_2) are user-defined parameters utilized to specify the range of the linear transformation.
$$ \kappa(v) = \begin{cases} \kappa_1, & v < v_1 \\ \dfrac{v_2 \kappa_1 - v_1 \kappa_2}{v_2 - v_1} - \dfrac{\kappa_1 - \kappa_2}{v_2 - v_1} \cdot v, & v_1 \le v \le v_2 \\ \kappa_2, & v > v_2 \end{cases} $$
Figure 5 illustrates the block diagram of the detail enhancement module. The input image is converted to the YCbCr color space to enhance the luminance channel. The reason for this is that this channel contains much more high-frequency edge information than each of the R, G, and B channels. In addition, the YCbCr 4:2:2 format is exploited instead of the standard YCbCr 4:4:4 to reduce the computing resources related to the algorithm's data movement. Subsequently, the image data are converted back to the RGB color space before being fed to the gamma correction module.
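The following numpy sketch summarizes this subsection; the 3 × 3 Laplacian kernels and the 3 × 3 local-variance window are illustrative assumptions rather than the exact coefficients of the hardware design, and the color-space conversion is omitted.

```python
import numpy as np
from scipy.ndimage import convolve, uniform_filter

def kappa(v, v1=0.001, v2=0.010, k1=2.5, k2=1.0):
    """Piece-wise linear scaling factor of Equation (5)."""
    slope = (k2 - k1) / (v2 - v1)
    k = k1 + slope * (v - v1)                   # linear segment between (v1, k1) and (v2, k2)
    return np.clip(k, min(k1, k2), max(k1, k2)) # flat outside [v1, v2]

def enhance_details(Y):
    """Equation (4): add scaled vertical/horizontal Laplacian responses back to Y."""
    h_v = np.array([[0, -1, 0], [0, 2, 0], [0, -1, 0]], dtype=float)   # vertical Laplacian (assumed kernel)
    h_h = np.array([[0, 0, 0], [-1, 2, -1], [0, 0, 0]], dtype=float)   # horizontal Laplacian (assumed kernel)
    detail = convolve(Y, h_v) + convolve(Y, h_h)
    local_var = uniform_filter(Y**2, 3) - uniform_filter(Y, 3)**2      # 3x3 local variance
    return np.clip(Y + kappa(local_var) * detail, 0.0, 1.0)
```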

3.2. Gamma Correction

To obtain images with different exposures, either the shutter speed or lens aperture must be adjusted to control the amount of light reaching the image sensors. However, these actions cannot be performed automatically as they are pertinent to physical devices. Thus, in this paper, we exploit a simple technique named gamma correction to artificially under-expose the captured image. This is a nonlinear image processing operation that is usually defined by the following power-law expression:
$$ I_u^c(x) = \left[ I^c(x) \right]^{\gamma} $$
where the superscript c denotes a color channel of the input image (i.e., c ∈ {R, G, B}), the subscript u represents under-exposure, and γ is a constant representing the exposure degree. Given normalized image data in the range [0, 1], over-exposure and under-exposure are represented by γ < 1 and γ > 1, respectively, as depicted in Figure 6. Specifically, γ = 1 represents the 'identity line', where the input intensity is left unchanged. Let K be the number of artificially under-exposed images generated by gamma correction. The empirical values for the corresponding set of γ_i, i ∈ [1, K], must satisfy γ_i ≥ 1. Section 4.1 will delve deeply into the empirical settings of the employed parameters.
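The artificial under-exposure step then reduces to a one-line power law per gamma value, as in the sketch below (the gamma set matches the empirical values reported in Section 4.1).

```python
import numpy as np

def under_expose(I, gammas=(1.000, 1.900, 1.950, 2.000)):
    """Equation (6): generate K artificially under-exposed images by gamma correction.

    I : detail-enhanced hazy image, float array in [0, 1], shape (H, W, 3).
    Returns a list of K images; gamma = 1 reproduces the input itself.
    """
    return [np.power(I, g) for g in gammas]
```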

3.3. Weight Calculation and Normalization

The selection of an appropriate weighting scheme in image fusion depends on the purpose of the designed algorithm. For example, Mertens et al. [37] utilized three image quality measures, including saturation, contrast, and well-exposedness. The reason for this is that their proposed algorithm fused a sequence of multi-exposure images into a high-quality image. Galdran [36] employed saturation and contrast in his fusion-based dehazing algorithm because these two features correlate with the haze distribution to a certain extent. In this paper, based on the comprehensive evaluation conducted by Ancuti et al. [38], the well-performing dark channel prior discovered by He et al. [7] lays a firm basis for deriving a haze-aware weighting scheme.
Through extensive observations of clear images, He et al. [7] found that the clear visibility and colorfulness of captured scenes lead to the existence of dark pixels, whose intensity is close to zero on at least one color channel. This observation was then applied in a patch-based manner to define the dark channel, as follows:
$$ I^{DCP}(x) = \min_{y \in \Omega(x)} \left( \min_{c \in \{R, G, B\}} I^c(y) \right) $$
where y stands for pixel coordinates within the square window Ω(x) centered at x. From Equation (7), it is clear that haze-free patches possess extremely low DCP values, while hazy patches exhibit large values due to the hazy effects. However, as the sky region has considerably high values in all its color channels, DCP does not hold there. This problem is its widely recognized drawback and is left aside for now. To develop a weighting scheme from DCP, it is necessary to invert Equation (7) so that haze-free patches are assigned large weights and vice versa. Also, the previously mentioned shortcoming of DCP gives rise to the assignment of small weights to sky regions even though they are not obscured by haze; therein lies the cause of the darkened sky after image fusion, as can be seen in Figure 4. To solve this problem, a post-processing step, described in Section 3.5, is employed to judiciously enhance the luminance and color of the fused image. Formulas for the DCP-based weighting scheme and weight normalization are shown in Equations (8) and (9), respectively, where normalization is carried out to prevent an out-of-range problem.
$$ W^{DCP}(x) = 1 - I^{DCP}(x) $$
$$ W_i(x) = \frac{W_i^{DCP}(x)}{\sum_{i=1}^{K} W_i^{DCP}(x)} $$
Additionally, Galdran [36] stated that DCP was more suitable than a combination of saturation and contrast to guide the fusion process in haze removal. However, he also assumed that DCP was not computationally friendly because it is usually post-processed by a large guided image filter. This is only true for a large Ω(x) (e.g., 15 × 15), whose block artifacts are noticeable. In this paper, the block artifacts are negligible because a small Ω(x) (e.g., 3 × 3) is used, so the guided image filter can be excluded. Therefore, a 3 × 3 minimum filter and a simple multiplexing circuit suffice to compute the DCP. Conversely, Galdran's method [36] requires a 3 × 3 Laplacian filter and a complex square rooter to compute contrast and saturation. Thus, the proposed DCP-based weighting scheme is both computationally efficient and beneficial for dehazing.
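A compact numpy sketch of Equations (7)–(9) with the 3 × 3 window discussed above is shown below; the small epsilon guarding the division is an implementation assumption.

```python
import numpy as np
from scipy.ndimage import minimum_filter

def dark_channel(I, patch=3):
    """Equation (7): channel-wise minimum followed by a patch-wise minimum filter."""
    return minimum_filter(I.min(axis=2), size=patch)

def fusion_weights(under_exposed, patch=3, eps=1e-6):
    """Equations (8) and (9): DCP-based weights, normalized across the K images."""
    w = np.stack([1.0 - dark_channel(I, patch) for I in under_exposed])  # W^DCP per image
    return w / (w.sum(axis=0, keepdims=True) + eps)                      # per-pixel normalization
```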

3.4. Image Fusion

Because multi-scale fusion based on the Laplacian pyramid is costly in terms of memory usage, image fusion in this paper is conducted at a single scale as a simple weighted sum of under-exposed images and corresponding weight maps. In Equation (10), J is the dehazed image, I_u^i is one of the under-exposed images {I_u^1, I_u^2, ..., I_u^K}, and W_i is one of the corresponding weight maps {W_1, W_2, ..., W_K}.
$$ J(x) = \sum_{i=1}^{K} W_i(x) \cdot I_u^i(x) $$
A detailed interpretation is then in order to support the use of single-scale image fusion. Assuming that there are two under-exposed images {I_1^1, I_1^2} derived from the single input image, two corresponding weight maps {W_1^1, W_1^2} are calculated based on the dark channel prior. In Figure 7, u_2(·) and d_2(·) denote the up-sampling and down-sampling operations by a factor of two, respectively. Moreover, the number of times u_2(·) and d_2(·) are applied is limited to two for simplicity. Accordingly, applying the Laplacian decomposition to I_1^1 and I_1^2 results in two sets {L_1^1, L_2^1, L_3^1} and {L_1^2, L_2^2, L_3^2}. As shown in Figure 7, performing image fusion in the multi-scale manner is quite involved because weighted sums are performed at individual scales, and the fused image is then obtained through a final step of up-samplings and summations. Also, from the hardware designer's point of view, several image buffers and line memories are required for the up-sampling and down-sampling operations. In contrast, the single-scale fusion scheme in the proposed algorithm is simple, as it solely comprises multiplications and a summation, excluding the sampling operations that require large memories.
The computational flows illustrated in Figure 7 were programmed in the MATLAB environment, and a simple evaluation was conducted to verify the performance of the two fusion schemes. The two input images {I_1^1, I_1^2} were created by applying gamma correction with {γ_1, γ_2} = {1, 2} to a single hazy image in the FRIDA2 [39], O-HAZE [40], and I-HAZE [41] datasets. The corresponding weight maps {W_1^1, W_1^2} were calculated using Equations (7)–(9). Table 1 summarizes the evaluation results, wherein descriptions of the three employed metrics are available in Section 4.3. Table 1 demonstrates that the difference in performance is negligible (i.e., less than 0.5% for all three datasets) even though multi-scale image fusion is far more complicated than single-scale image fusion. This observation, coupled with the interpretation above, supports the use of single-scale image fusion.
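With the under-exposed images and normalized weight maps available, Equation (10) amounts to a pixel-wise weighted sum, as in the brief numpy sketch below (it consumes the outputs of the gamma-correction and weighting sketches given earlier).

```python
import numpy as np

def fuse(under_exposed, weights):
    """Equation (10): pixel-wise weighted sum of the K under-exposed images.

    under_exposed : list of K images, each of shape (H, W, 3)
    weights       : normalized weight maps of shape (K, H, W)
    """
    stack = np.stack(under_exposed)                       # shape (K, H, W, 3)
    return (weights[..., np.newaxis] * stack).sum(axis=0) # fused image, shape (H, W, 3)
```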

3.5. Dynamic Range Extension

The darkened sky after image fusion requires luminance enhancement, and so does the rest of the image. Although the use of under-exposed images is beneficial to haze removal, it has the unwanted side effect of significantly darkening the entire image. Thus, the adaptive tone remapping (ATR) algorithm proposed by Cho et al. [42] is employed to post-process the fused image. ATR is expressed by the following equations:
$$ E_L(x) = L(x) + G_L(x) \cdot W_L(x) $$
$$ E_C(x) = C(x) + G_C(x) \cdot W_C(x) + 0.5 $$
where L and E_L denote the input luminance and the enhanced luminance, respectively; G_L is the luminance gain, and W_L represents the adaptive luminance weight. A similar interpretation holds for Equation (12) for color emphasis, wherein the constant 0.5 is an offset because the chrominance is subtracted by 0.5 in advance to be zero-centered. ATR exploits the cumulative distribution function of the input luminance to locate the adaptive limit point, which constitutes the nonlinear power function G_L. W_L takes the form of a linear function with W_L as the dependent variable and L as the independent variable. Since the degree of color emphasis depends on the enhancement degree of the luminance, G_C is the product of the ratio E_L/L and the input color C. The last term, W_C, is a piece-wise linear function comprising three line segments. Interested readers are referred to Cho et al. [42] for a more detailed explanation.
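To make the structure of Equation (11) concrete, a minimal sketch is given below. The exact gain G_L and weight W_L in Cho et al. [42] are derived from the adaptive limit point of the luminance CDF; the power-based gain and linear weight used here are simplified placeholders that only mirror the form E_L = L + G_L · W_L.

```python
import numpy as np

def adaptive_tone_remap_luma(L, limit=0.8, strength=0.5):
    """Structure of Equation (11): E_L = L + G_L * W_L.

    L        : fused luminance in [0, 1]
    limit    : placeholder for the adaptive limit point derived from the CDF in [42]
    strength : placeholder exponent controlling the nonlinear gain
    """
    G_L = np.power(L, strength) - L        # placeholder nonlinear luminance gain (boosts dark pixels)
    W_L = np.clip(L / limit, 0.0, 1.0)     # placeholder linear weight in L
    return np.clip(L + G_L * W_L, 0.0, 1.0)
```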

4. Experiments

This section presents a comparative evaluation of the proposed algorithm and four benchmarking methods, namely those proposed by He et al. [7], Zhu et al. [18], Kim et al. [15], and Galdran [36]. As mentioned in Section 1, although the recent method proposed by Ngo et al. [21] achieves good image quality, its costly computations leave a real-time implementation as a topic for future research. Therefore, it is excluded from our list of benchmarking methods. However, we will demonstrate in Section 4.3 that the proposed method is comparable to that of Ngo et al. [21], using their reported results.
The evaluation involves both a synthetic dataset and real datasets for thorough performance verification. FRIDA2 is a computer-generated dataset designed for advanced driver-assistance systems, and it consists of 66 ground-truth images of road scenes. These images, coupled with their corresponding depth maps, produce 264 hazy images covering four different haze types—homogeneous, heterogeneous, cloudy homogeneous, and cloudy heterogeneous. O-HAZE and I-HAZE are employed to assess the dehazing performance on real datasets. While O-HAZE comprises 45 pairs of outdoor hazy and haze-free images, I-HAZE comprises 30 pairs of indoor hazy and haze-free images. In both cases, the hazy effects were generated by a specialized vapor generator.

4.1. Experimental Setup

The proposed algorithm and four benchmarking methods were all programmed in MATLAB R2018b and tested on a computer with an Intel Core i5-7500 ( 3.4 GHz) CPU and 16 GB RAM. Default settings publicly provided by He et al. [7], Zhu et al. [18], Kim et al. [15], and Galdran [36] are those resulting in the best performance, as mentioned in their study. Thus, we used these parameter settings to configure the corresponding algorithms. Table 2 presents the empirically determined values of user-defined parameters employed in our work.
Generally, utilizing more under-exposed images increases the dehazing performance. However, because our ultimate goal is to provide a real-time IP of the proposed algorithm, the number of under-exposed images (i.e., K) is constrained by the limited hardware resources. As a result, K = 4 is a feasible setting that maintains the hardware design's simplicity. Concerning the detail enhancement step, the determined values of (v_1, v_2) must best divide a hazy image into three separate regions: (i) a dense haze region whose local variance is less than v_1, (ii) a moderate haze region whose local variance lies between v_1 and v_2, and (iii) a haze-free region whose local variance is greater than v_2. For this reason, we empirically chose (v_1, v_2) = (0.001, 0.010). After that, (κ_1, κ_2) were set to (2.500, 1.000) to enhance the detail information according to the piece-wise scaling factor in Equation (5). The reason for this is that when κ_1 was greater than 2.5, over-enhanced pixels appeared as clipped whites, resulting in image quality degradation. Finally, the four gamma values in the gamma correction step were determined based on two observations: (i) the set of under-exposed images should contain the hazy input, and (ii) the gamma values must not be too high, to compensate for the limited hardware resources. Thus, (γ_1, γ_2, γ_3, γ_4) = (1.000, 1.900, 1.950, 2.000) is a feasible setting.

4.2. Qualitative Evaluation

Figure 8 illustrates a real hazy scene of a tree obscured by moderate haze, used to visually assess the five algorithms' dehazing power. Hereafter, these algorithms will be referred to interchangeably by their corresponding author lists. Figure 8 demonstrates that the algorithms proposed by He et al. [7], Zhu et al. [18], Kim et al. [15], and Galdran [36] have limitations. He et al. [7] suffers from visual artifacts in the background, Zhu et al. [18] exhibits too-weak dehazing power, Kim et al. [15] has a significant drawback of color distortion, and Galdran [36] appears to be a slightly under-exposed version of Zhu et al. [18] because the block-based contrast limited adaptive histogram equalization (CLAHE) employed therein does not bring about significant enhancement. In contrast, the proposed algorithm produces a satisfactory result due to the effective use of detail enhancement before image under-exposure and a DCP-based weighting scheme to guide the fusion process. Additionally, the well-known weakness of DCP in He et al. [7] and the darkening effect due to the use of under-exposed images are effectively resolved through adaptive tone remapping in post-processing.
Our method's superior performance is confirmed in Figure 9, which depicts a real hazy scene of a mountain. Likewise, He et al. [7] suffers from visual artifacts, including a yellowish sky and a bluish mountain; Zhu et al. [18] turns both the sky and the mountain bluish; Kim et al. [15] exhibits color distortion in the mountains; and Galdran [36] leaves a small portion of haze on the profile of the mountain. Only ours is capable of both removing the hazy effects and assuring high image quality. Figure 10 further compares the proposed method's dehazing performance with the other benchmarking methods on various real hazy scenes. More evaluation results can be found online at: https://datngo.webstarts.com/blog/.

4.3. Quantitative Evaluation

This section utilizes three evaluation metrics, namely structural similarity (SSIM) [43], the tone-mapped image quality index (TMQI) [44], and feature similarity extended to color images (FSIMc) [45], to assess the five algorithms quantitatively. SSIM takes the luminance of both a dehazed image and a ground-truth reference as inputs and produces a value in [0, 1] representing the degree of similarity in structural information; a higher SSIM implies a greater degree of similarity. Supposing that X and Y denote the luminance channels of the dehazed and ground-truth reference images, respectively, Equation (13) demonstrates the calculation of the SSIM measure of two images.
$$ SSIM(X, Y) = \frac{\left( 2\mu_x \mu_y + C_1 \right) \left( 2\sigma_{xy} + C_2 \right)}{\left( \mu_x^2 + \mu_y^2 + C_1 \right) \left( \sigma_x^2 + \sigma_y^2 + C_2 \right)} $$
where μ_x, μ_y and σ_x, σ_y represent the local averages and standard deviations of X and Y, respectively; σ_xy denotes the correlation between the mean-subtracted X − μ_x and Y − μ_y, and C_1 and C_2 are stabilizing constants.
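As a reference implementation of Equation (13), the sketch below evaluates the global (single-window) form; the published SSIM applies the same formula within a sliding local window and averages the resulting map, which is omitted here for brevity.

```python
import numpy as np

def ssim_global(X, Y, C1=0.01**2, C2=0.03**2):
    """Equation (13) evaluated over the whole luminance plane (no sliding window).

    X, Y : luminance images normalized to [0, 1]; C1, C2 follow the usual choice for unit range.
    """
    mu_x, mu_y = X.mean(), Y.mean()
    var_x, var_y = X.var(), Y.var()
    cov_xy = ((X - mu_x) * (Y - mu_y)).mean()
    num = (2 * mu_x * mu_y + C1) * (2 * cov_xy + C2)
    den = (mu_x**2 + mu_y**2 + C1) * (var_x + var_y + C2)
    return num / den
```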
The second metric, TMQI, works on the luminance of images and assesses a multi-scale fidelity measure based on the structural fidelity (S) and the naturalness (N), as presented in Equation (14), where 0 ≤ a ≤ 1 is a constant to adjust the relative importance of the two terms, and φ and ϕ are exponents controlling their corresponding sensitivities. As with the SSIM mentioned above, X and Y denote the dehazed and ground-truth reference images.
$$ TMQI(X, Y) = a S^{\varphi} + (1 - a) N^{\phi} $$
The structural fidelity is calculated based on the modified SSIM (S_local), as shown in Equations (15) and (16). In these two equations, x_i and y_i are the i-th local patches in the two images X and Y, respectively; P_q is the number of local patches at the q-th scale; ψ_q is the weight corresponding to the q-th scale, and Q is the total number of scales. σ_x and σ_y are passed through a nonlinear sigmoid function to produce the mapped σ′_x and σ′_y; the reason for this is to account for the visual sensitivity to contrast reported in the visual psychophysics literature.
$$ S = \prod_{q=1}^{Q} \left[ \frac{1}{P_q} \sum_{i=1}^{P_q} S_{local}\left( x_i, y_i \right) \right]^{\psi_q} $$
$$ S_{local}(x, y) = \frac{2\sigma'_x \sigma'_y + C_1}{{\sigma'_x}^2 + {\sigma'_y}^2 + C_1} \cdot \frac{\sigma_{xy} + C_2}{\sigma_x \sigma_y + C_2} $$
To calculate the naturalness, Yeganeh et al. [44] first fitted the means and standard deviations of 3000 gray-scale images to the Gaussian and Beta distributions. Then, they defined the naturalness measure as follows:
$$ N = \frac{P_m P_d}{Z} $$
where P_m and P_d denote the Gaussian and Beta probability density functions, respectively, and Z = max{P_m P_d} denotes the normalization factor that constrains N between 0 and 1. Since both the structural fidelity and the naturalness are upper-bounded by 1, TMQI is also upper-bounded by 1, wherein a higher score is favorable to haze removal.
The third metric, FSIMc, can be considered an improvement upon SSIM since it extends its calculation to the chrominance. Equation (18) demonstrates the calculation of FSIMc, wherein X and Y are now two color images, that is, a dehazed image and a ground-truth reference image, S L is the combined similarity measure of the gradient magnitude and the phase congruency similarities between X and Y, S C is the chrominance similarity measure, P C m is the weighting coefficient, Γ is a positive constant for adjusting the importance of the chrominance component, and Ω is the whole image domain. FSIMc also takes on values between 0 and 1, wherein a higher score implies a better dehazing performance.
$$ FSIMc(X, Y) = \frac{\sum_{i \in \Omega} S_L(i) \cdot \left[ S_C(i) \right]^{\Gamma} \cdot PC_m(i)}{\sum_{i \in \Omega} PC_m(i)} $$
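Once the similarity maps are available, the pooling in Equation (18) is a single weighted average. The sketch below assumes S_L, S_C, and PC_m have already been computed (the gradient-magnitude and phase-congruency extraction are outside the scope of this illustration), and the exponent value is illustrative.

```python
import numpy as np

def fsimc_pool(S_L, S_C, PC_m, gamma=0.03):
    """Equation (18): phase-congruency-weighted pooling of the similarity maps.

    S_L, S_C, PC_m : precomputed maps of the same shape (assumed given)
    gamma          : importance of the chrominance term (illustrative value)
    """
    num = (S_L * np.power(S_C, gamma) * PC_m).sum()
    return num / PC_m.sum()
```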
Table 3 and Table 4 show the quantitative evaluation results on the FRIDA2, O-HAZE, and I-HAZE datasets, where the best results appear in boldface. Regarding the synthetic FRIDA2 dataset, Table 3 demonstrates that the proposed algorithm exhibits the best dehazing power in terms of SSIM and FSIMc. The employed detail enhancement and ATR post-processing primarily contribute to the high SSIM and FSIMc scores. More specifically, the former accentuates the objects' profiles, and the latter performs both luminance enhancement and color emphasis. The fact that He et al. [7] shows the lowest performance is owing to the contents of the FRIDA2 dataset, which comprises road scenes covered by a broad sky. Zhu et al. [18] is better than He et al. [7], albeit unsatisfactory owing to the drawbacks of over-dehazing, background noise, and color distortion, as mentioned in Reference [19]. Likewise, Kim et al. [15] is slightly better than He et al. [7], mainly due to the modified hybrid median filter for estimating the atmospheric veil. However, as pointed out in Reference [19], Kim et al. [15] is prone to noticeable background noise and color distortion. Galdran [36] shares top performance with the proposed method, and its TMQI is the highest, primarily because the block-based CLAHE aligns well with the multi-scale structural fidelity underlying TMQI. However, for the O-HAZE dataset, this observation is reversed. Galdran [36] is the best method under SSIM and FSIMc, while ours exhibits the best performance under TMQI. Dataset dependence is likely responsible for this observation. Since O-HAZE consists of outdoor images with homogeneous lighting conditions, the block-based CLAHE can bring about a significant enhancement in image contrast without any noticeable artifacts, subsequently resulting in the top performance of Galdran [36]. Nevertheless, as the I-HAZE dataset comprises indoor images with heterogeneous lighting conditions, the proposed method is superior to the four benchmarking ones in all three employed metrics.
To conclude this section, a brief comparison between the proposed method and that of Ngo et al. [21] is in order. According to their reported results, the pairs of {SSIM, TMQI} scores were {0.7297, 0.7993}, {0.7630, 0.8244}, and {0.7727, 0.8522} on the FRIDA2, O-HAZE, and I-HAZE datasets, respectively. It is evident from Table 3 and Table 4 that the proposed method always attains slightly higher scores than those of Ngo et al. [21], while incurring a significantly lower computational cost.

5. Real-Time Processing

In this section, a hardware architecture for the proposed algorithm is presented for facilitating real-time processing. The synthesis results then prove that the proposed hardware design is compact and capable of handling high-quality videos in real-time.

5.1. Hardware Implementation

The hardware architecture for implementing the proposed algorithm in a Field Programmable Gate Array (FPGA) chip [46] is described in the same order as presented in Section 3. The hazy input is first fetched to the detail enhancement module, in which color conversion (i.e., RGB to YCbCr 4:2:2 and vice versa) is realized by using basic arithmetic operations. However, it should be noted that multiplications involving large multiplicands and multipliers (e.g., bit widths greater than 16) are accomplished by means of split multipliers, which exploit the associative and distributive properties to improve the maximum operating frequency of the real-time hardware. The division in Equation (5) is implemented with serial dividers because the user-defined parameters, unlike the real-time input image data, do not change from pixel to pixel.
The detail-enhanced images are subjected to gamma correction for generating under-exposed images, where Equation (6) is realized efficiently by means of look-up tables (LUTs). The minimum filter in the weight calculation and the vertical Laplacian operator in the detail enhancement involve convolution, so they are implemented using line memories and their corresponding memory controllers. In this study, the proposed hardware is designed to handle a maximum video resolution of 4K. Thus, each line memory is designed for 4096 pixels, giving rise to the need for an efficient design to calculate the weight maps. Sequentially following Equations (6)–(9) results in the direct implementation illustrated in the top half of Figure 11. Because four under-exposed images are generated, four sets of a 3-input channel-wise minimum followed by a 3 × 3 minimum filter are required to find the dark channels in Equation (7). Accordingly, eight line memories (two for each minimum filter times four filters) are utilized solely for calculating the dark channels. In an attempt to reduce the number of line memories, the monotonicity of Equation (6) is worthy of attention. For γ ≥ 1, gamma correction darkens the image but remains a monotonically increasing mapping of pixel intensity; it therefore preserves the ordering of pixel values, so the locations of the minimum pixels within the three color channels and within each local patch are unchanged. Hence, the efficient implementation in the bottom half of Figure 11 exploits this monotonicity to reduce the number of line memories to two at the cost of doubling the number of LUTs. However, utilizing more LUTs is not problematic because a LUT is considerably simpler than a line memory. Consequently, by calculating the weight maps via the efficient implementation in the proposed algorithm, the number of requisite line memories has been reduced from eight to two, that is, 75% more efficient than the direct implementation in terms of memory usage. In Figure 11, it is worth noting that the delay modules ensuring correct pipelined operation have been omitted for ease of illustration. Also, the set of a 3-input channel-wise minimum followed by a 3 × 3 minimum filter is included in the detail enhancement module, and the two sets of gamma correction LUTs are represented as a single gamma correction module in Figure 12, which shows the whole hardware architecture.
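The line-memory saving rests on the fact that the patch-wise and channel-wise minimum commutes with any monotonically increasing point operation, so the dark channel can be computed once before gamma correction. A small numpy check of this property is shown below (an illustrative sanity check, not part of the hardware design).

```python
import numpy as np
from scipy.ndimage import minimum_filter

# Sanity check of the monotonicity argument: applying gamma correction (gamma >= 1)
# before or after the channel-wise / 3x3 patch-wise minimum yields the same dark channel.
rng = np.random.default_rng(0)
I = rng.random((32, 32, 3))
gamma = 2.0

dc_then_gamma = np.power(minimum_filter(I.min(axis=2), size=3), gamma)
gamma_then_dc = minimum_filter(np.power(I, gamma).min(axis=2), size=3)
assert np.allclose(dc_then_gamma, gamma_then_dc)
```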
Furthermore, because the real-time input data are required to be processed continuously, the division in Equation (9) for weight normalization is achieved by means of parallel dividers. Image fusion on a single scale is simply accomplished by using multipliers and adders. Finally, the IP for the ATR designed by Cho et al. [42] is exploited to post-process the fused image.

5.2. Synthesis and Comparison

The hardware architecture proposed in Figure 12 was designed using the Verilog hardware description language (IEEE Standard 1364-2005) [47] and synthesized using a Xilinx Design Analyzer. The synthesis results are summarized in Table 5. Our design utilized 30,676 registers, 36,357 LUTs, and 48 block RAMs, which occupy 7.02%, 16.63%, and 8.81% of the available resources on the target FPGA chip, respectively. The maximum attainable processing rate is 242.718 MHz or, equivalently, 242.718 Mpixels/s. Using this information, the maximum processing speed (MPS) in frames per second (fps) can be derived as follows:
$$ MPS = \frac{f_{max}}{(W + HB) \cdot (H + VB)} $$
where f_max denotes the maximum operating frequency; W and H represent the width and height of the input image, respectively, and HB and VB represent the corresponding horizontal and vertical blanking periods. The proposed hardware is designed to function properly with a minimum HB and VB of one pixel and one line, respectively. Table 6 presents the maximum processing speeds for different video resolutions, demonstrating that our hardware implementation is capable of handling DCI 4K video at 27 fps. More specifically, 8,853,617 (= 4097 × 2161) clock cycles are required to process one video frame of size 4096 × 2160. Therefore, substituting this value into Equation (19) results in an MPS of 27 (≈ 242.718 × 10^6 / 8,853,617).
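The frame rate quoted above follows directly from Equation (19); a short numeric check is given below.

```python
# Worked example of Equation (19) for DCI 4K with minimum blanking.
f_max = 242.718e6            # maximum operating frequency in Hz (from Table 5)
W, H = 4096, 2160            # DCI 4K frame size
HB, VB = 1, 1                # minimum horizontal/vertical blanking supported by the design

cycles_per_frame = (W + HB) * (H + VB)   # 4097 * 2161 = 8,853,617 clock cycles
mps = f_max / cycles_per_frame           # ~27.4 frames per second
print(round(mps))                        # 27
```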
Table 7 shows the synthesis results of the proposed design and two other methods side-by-side to illustrate the compactness and efficiency of our dehazing hardware. Park et al. [48] developed a fast execution scheme for the algorithm proposed by He et al. [7]. Their design comprises 53,400 registers, 64,000 LUTs, 32 digital signal processing (DSP) slices, and 3.2 Mbits of memory. Notwithstanding the maximum processing rate of 88.700 Mpixels/s, the hardware was designed to operate at fixed frame sizes of 320 × 240, 640 × 480, and 800 × 600. Accordingly, the maximum video resolution that the design of Park et al. [48] can handle is only super video graphics array (SVGA). Ngo et al. [17] developed a 4K-capable IP of the algorithm proposed by Kim et al. [15], and it consists of 70,864 registers, 56,664 LUTs, and 1.5 Mbits of memory. Its highest attainable processing speed is 236.290 MHz, which is responsible for its 4K capability. Compared to these two designs, our haze removal hardware is quite compact and fast. More specifically, when compared to the design of Park et al. [48], the utilization of registers, LUTs, and memory is reduced by 42.6%, 43.2%, and 59.4%, respectively, and the processing speed is increased nearly threefold. Compared to the design of Ngo et al. [17], the reduction rates in utilized registers, LUTs, and memory are 56.7%, 35.8%, and 13.3%, respectively, and the processing speed is slightly improved. Thus, coupled with the algorithm performance evaluation in Section 4, it can be concluded that the proposed algorithm is highly attractive owing to its strong dehazing performance and an efficient hardware prototype capable of handling high-quality video streams in real time.

6. Conclusions

A computationally efficient haze removal algorithm and its corresponding hardware implementation were presented in this paper. It was observed that dehazing methods based on the Koschmieder model are computationally expensive, mainly due to the inevitable estimation of its unknown variables, that is, the transmission map and the atmospheric light. Therefore, we first exploited Koschmieder's law to deduce the use of under-exposed images for haze removal. Then, simple image processing techniques such as detail enhancement, gamma correction, single-scale image fusion, and adaptive tone remapping were employed to put our deduction into effect. The use of detail enhancement before artificial under-exposure by gamma correction effectively mimicked the physical exposure adjustment, while the DCP-based weighting scheme accurately guided the fusion process to blend image areas with clear visibility into the fused image. The adaptive tone remapping post-processing then enhanced the darkened result obtained after fusing the under-exposed images. Moreover, a compact hardware design capable of processing the DCI 4K video standard was provided to facilitate the integration of the proposed method into existing real-time systems.

Author Contributions

Conceptualization, B.K., G.-D.L., and T.M.N.; software, D.N. and S.L.; validation, Q.-H.N.; data curation, D.N., S.L., and Q.-H.N.; writing—original draft preparation, D.N. and S.L.; writing—review and editing, B.K., G.-D.L., T.M.N., D.N., S.L., and Q.-H.N.; visualization, D.N. and S.L.; supervision, B.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by research funds from Dong-A University, Busan, Korea.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Adjabi, I.; Ouahabi, A.; Benzaoui, A.; Taleb-Ahmed, A. Past, Present, and Future of Face Recognition: A Review. Electronics 2020, 9, 1188.
  2. Young, A.T. Rayleigh scattering. Appl. Opt. 1981, 20, 533–535.
  3. Schechner, Y.; Narasimhan, S.; Nayar, S. Instant dehazing of images using polarization. In Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2001), Kauai, HI, USA, 8–14 December 2001; Volume 1, pp. I-325–I-332.
  4. Narasimhan, S.; Nayar, S. Chromatic framework for vision in bad weather. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2000), Hilton Head Island, SC, USA, 15 June 2000; Volume 1, pp. 598–605.
  5. Kopf, J.; Neubert, B.; Chen, B.; Cohen, M.; Cohen-Or, D.; Deussen, O.; Uyttendaele, M.; Lischinski, D. Deep photo: Model-based photograph enhancement and viewing. ACM Trans. Graph. 2008, 27, 116.
  6. Lee, Z.; Shang, S. Visibility: How Applicable is the Century-Old Koschmieder Model? J. Atmos. Sci. 2016, 73, 4573–4581.
  7. He, K.; Sun, J.; Tang, X. Single Image Haze Removal Using Dark Channel Prior. IEEE Trans. Pattern Anal. Mach. Intell. 2011, 33, 2341–2353.
  8. Levin, A.; Lischinski, D.; Weiss, Y. A Closed-Form Solution to Natural Image Matting. IEEE Trans. Pattern Anal. Mach. Intell. 2008, 30, 228–242.
  9. He, K.; Sun, J.; Tang, X. Guided Image Filtering. IEEE Trans. Pattern Anal. Mach. Intell. 2013, 35, 1397–1409.
  10. Gibson, K.B.; Vo, D.T.; Nguyen, T.Q. An Investigation of Dehazing Effects on Image and Video Coding. IEEE Trans. Image Process. 2012, 21, 662–673.
  11. Lee, S.; Yun, S.; Nam, J.H.; Won, C.S.; Jung, S.W. A review on dark channel prior based image dehazing algorithms. EURASIP J. Image Video Process. 2016, 2016, 4.
  12. Zhu, Y.; Tang, G.; Zhang, X.; Jiang, J.; Tian, Q. Haze removal method for natural restoration of images with sky. Neurocomputing 2018, 275, 499–510.
  13. Park, Y.; Kim, T.H. Fast Execution Schemes for Dark-Channel-Prior-Based Outdoor Video Dehazing. IEEE Access 2018, 6, 10003–10014.
  14. Tufail, Z.; Khurshid, K.; Salman, A.; Fareed Nizami, I.; Khurshid, K.; Jeon, B. Improved Dark Channel Prior for Image Defogging Using RGB and YCbCr Color Space. IEEE Access 2018, 6, 32576–32587.
  15. Kim, G.J.; Lee, S.; Kang, B. Single Image Haze Removal Using Hazy Particle Maps. Commun. Comput. Sci. 2018, 101, 1999–2002.
  16. Peng, Y.T.; Lu, Z.; Cheng, F.C.; Zheng, Y.; Huang, S.C. Image Haze Removal Using Airlight White Correction, Local Light Filter, and Aerial Perspective Prior. IEEE Trans. Circuits Syst. Video Technol. 2020, 30, 1385–1395.
  17. Ngo, D.; Lee, G.D.; Kang, B. A 4K-Capable FPGA Implementation of Single Image Haze Removal Using Hazy Particle Maps. Appl. Sci. 2019, 9, 3443.
  18. Zhu, Q.; Mai, J.; Shao, L. A Fast Single Image Haze Removal Algorithm Using Color Attenuation Prior. IEEE Trans. Image Process. 2015, 24, 3522–3533.
  19. Ngo, D.; Lee, G.D.; Kang, B. Improved Color Attenuation Prior for Single-Image Haze Removal. Appl. Sci. 2019, 9, 4011.
  20. Tang, K.; Yang, J.; Wang, J. Investigating Haze-Relevant Features in a Learning Framework for Image Dehazing. In Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 2995–3002.
  21. Ngo, D.; Lee, S.; Kang, B. Robust Single-Image Haze Removal Using Optimal Transmission Map and Adaptive Atmospheric Light. Remote Sens. 2020, 12, 2233.
  22. Cai, B.; Xu, X.; Jia, K.; Qing, C.; Tao, D. DehazeNet: An End-to-End System for Single Image Haze Removal. IEEE Trans. Image Process. 2016, 25, 5187–5198.
  23. Li, C.; Guo, J.; Porikli, F.; Fu, H.; Pang, Y. A Cascaded Convolutional Neural Network for Single Image Dehazing. IEEE Access 2018, 6, 24877–24887.
  24. Yang, X.; Li, H.; Fan, Y.L.; Chen, R. Single Image Haze Removal via Region Detection Network. IEEE Trans. Multimedia 2019, 21, 2545–2560.
  25. Golts, A.; Freedman, D.; Elad, M. Unsupervised Single Image Dehazing Using Dark Channel Prior Loss. IEEE Trans. Image Process. 2020, 29, 2692–2701.
  26. Ren, W.; Pan, J.; Zhang, H.; Cao, X.; Yang, M.H. Single Image Dehazing via Multi-scale Convolutional Neural Networks with Holistic Edges. Int. J. Comput. Vis. 2020, 128, 240–259.
  27. Li, B.; Ren, W.; Fu, D.; Tao, D.; Feng, D.; Zeng, W.; Wang, Z. Benchmarking Single-Image Dehazing and Beyond. IEEE Trans. Image Process. 2019, 28, 492–505.
  28. Ngo, D.; Kang, B. Preprocessing for High Quality Real-time Imaging Systems by Low-light Stretch Algorithm. J. Inst. Korean. Electr. Electron. Eng. 2018, 22, 585–589.
  29. Polesel, A.; Ramponi, G.; Mathews, V. Image enhancement via adaptive unsharp masking. IEEE Trans. Image Process. 2000, 9, 505–510.
  30. Ngo, D.; Kang, B. Image Detail Enhancement via Constant-Time Unsharp Masking. In Proceedings of the 2019 IEEE 21st Electronics Packaging Technology Conference (EPTC), Singapore, 4–6 December 2019; pp. 743–746.
  31. Ngo, D.; Lee, S.; Kang, B. Nonlinear Unsharp Masking Algorithm. In Proceedings of the 2020 International Conference on Electronics, Information, and Communication (ICEIC), Barcelona, Spain, 19–20 January 2020; pp. 1–6.
  32. Sengee, N.; Sengee, A.; Choi, H.K. Image contrast enhancement using bi-histogram equalization with neighborhood metrics. IEEE Trans. Consum. Electron. 2010, 56, 2727–2734.
  33. Tan, S.F.; Isa, N.A.M. Exposure Based Multi-Histogram Equalization Contrast Enhancement for Non-Uniform Illumination Images. IEEE Access 2019, 7, 70842–70861.
  34. Ancuti, C.O.; Ancuti, C. Single Image Dehazing by Multi-Scale Fusion. IEEE Trans. Image Process. 2013, 22, 3271–3282.
  35. Choi, L.K.; You, J.; Bovik, A.C. Referenceless Prediction of Perceptual Fog Density and Perceptual Image Defogging. IEEE Trans. Image Process. 2015, 24, 3888–3901.
  36. Galdran, A. Image dehazing by artificial multiple-exposure image fusion. Signal Process. 2018, 149, 135–147.
  37. Mertens, T.; Kautz, J.; Reeth, F.V. Exposure Fusion. In Proceedings of the 15th Pacific Conference on Computer Graphics and Applications (PG’07), Maui, HI, USA, 29 October–2 November 2007; pp. 382–390.
  38. Ancuti, C.; Ancuti, C.O.; De Vleeschouwer, C. D-HAZY: A dataset to evaluate quantitatively dehazing algorithms. In Proceedings of the 2016 IEEE International Conference on Image Processing (ICIP), Phoenix, AZ, USA, 25–28 September 2016; pp. 2226–2230.
  39. Tarel, J.P.; Hautiere, N.; Caraffa, L.; Cord, A.; Halmaoui, H.; Gruyer, D. Vision Enhancement in Homogeneous and Heterogeneous Fog. IEEE Intell. Transp. Syst. Mag. 2012, 4, 6–20.
  40. Ancuti, C.O.; Ancuti, C.; Timofte, R.; De Vleeschouwer, C. O-HAZE: A Dehazing Benchmark with Real Hazy and Haze-Free Outdoor Images. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Salt Lake City, UT, USA, 18–22 June 2018; pp. 867–8678.
  41. Ancuti, C.O.; Ancuti, C.; Timofte, R.; De Vleeschouwer, C. I-HAZE: A dehazing benchmark with real hazy and haze-free indoor images. arXiv 2018, arXiv:1804.05091.
  42. Cho, H.; Kim, G.J.; Jang, K.; Lee, S.; Kang, B. Color Image Enhancement Based on Adaptive Nonlinear Curves of Luminance Features. J. Semicond. Technol. Sci. 2015, 15, 60–67.
  43. Wang, Z.; Bovik, A.; Sheikh, H.; Simoncelli, E. Image quality assessment: From error visibility to structural similarity. IEEE Trans. Image Process. 2004, 13, 600–612.
  44. Yeganeh, H.; Wang, Z. Objective Quality Assessment of Tone-Mapped Images. IEEE Trans. Image Process. 2013, 22, 657–667.
  45. Zhang, L.; Zhang, L.; Mou, X.; Zhang, D. FSIM: A Feature Similarity Index for Image Quality Assessment. IEEE Trans. Image Process. 2011, 20, 2378–2386.
  46. Zynq-7000 SoC Data Sheet: Overview (DS190). Available online: https://www.xilinx.com/support/documentation/data_sheets/ds190-Zynq-7000-Overview.pdf (accessed on 12 May 2019).
  47. IEEE Standard for Verilog Hardware Description Language; IEEE Std 1364-2005; IEEE: Piscataway, NJ, USA, 2006; pp. 1–590.
  48. Park, Y.; Kim, T.H. A video dehazing system based on fast airlight estimation. In Proceedings of the 2017 IEEE Global Conference on Signal and Information Processing (GlobalSIP), Montreal, QC, Canada, 14–16 November 2017; pp. 779–783.
Figure 1. A general classification of haze removal algorithms.
Figure 2. An illustration of the atmospheric scattering phenomenon.
Figure 3. Images with different exposures: (a) under-exposure and (b) over-exposure.
Figure 4. Overall block diagram of the proposed algorithm.
Figure 5. Block diagram of the detail enhancement module.
Figure 6. Artificial exposure by gamma correction.
Figure 7. A visual illustration of multi-scale image fusion and single-scale image fusion employed in the proposed algorithm.
Figure 8. Qualitative comparison with other dehazing methods on a real hazy scene of a tree.
Figure 9. Qualitative comparison with other dehazing methods on a real hazy mountainous scene.
Figure 10. Qualitative comparison with other dehazing methods on various real hazy scenes.
Figure 11. The proposed efficient implementation for calculating weight maps by exploiting the monotonicity of gamma correction.
Figure 12. Hardware architecture of the proposed algorithm.
Table 1. Average structural similarity (SSIM), tone-mapped image quality index (TMQI), and feature similarity extended to color images (FSIMc) scores on the FRIDA2, O-HAZE, and I-HAZE datasets for evaluating multi-scale and single-scale image fusions.

Dataset | Fusion Scheme | SSIM | TMQI | FSIMc
FRIDA2 | Multi-scale | 0.7431 | 0.6800 | 0.8002
FRIDA2 | Single-scale | 0.7428 | 0.6795 | 0.8000
FRIDA2 | Difference (%) | 0.0404 | 0.0735 | 0.0250
O-HAZE | Multi-scale | 0.6799 | 0.8184 | 0.7551
O-HAZE | Single-scale | 0.6768 | 0.8172 | 0.7529
O-HAZE | Difference (%) | 0.4559 | 0.1466 | 0.2914
I-HAZE | Multi-scale | 0.7170 | 0.7397 | 0.8104
I-HAZE | Single-scale | 0.7159 | 0.7389 | 0.8096
I-HAZE | Difference (%) | 0.1534 | 0.1082 | 0.0987
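The "Difference (%)" rows in Table 1 are consistent with a relative difference taken with respect to the multi-scale score. A minimal sketch of that calculation, under that assumed convention:

```python
# Relative difference (%) between multi-scale and single-scale fusion scores,
# computed against the multi-scale value (assumed convention; it reproduces Table 1).
def relative_difference_percent(multi_scale: float, single_scale: float) -> float:
    return (multi_scale - single_scale) / multi_scale * 100.0

# FRIDA2 SSIM row: (0.7431 - 0.7428) / 0.7431 * 100 ≈ 0.0404 %
print(round(relative_difference_percent(0.7431, 0.7428), 4))
# O-HAZE SSIM row: (0.6799 - 0.6768) / 0.6799 * 100 ≈ 0.4559 %
print(round(relative_difference_percent(0.6799, 0.6768), 4))
```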
Table 2. Empirical values of user-defined parameters in the proposed algorithm.

Parameter | Description | Value
K | Number of under-exposed images | 4
{v1, v2, κ1, κ2} | Parameters controlling the detail enhancement step | {0.001, 0.010, 2.500, 1.000}
{γ1, γ2, γ3, γ4} | Gamma values in the gamma correction step | {1.000, 1.900, 1.950, 2.000}
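For illustration, the sketch below shows how the K = 4 artificially under-exposed inputs parameterized in Table 2 could be produced by gamma correction (cf. Figure 6). The function name and the convention of raising a [0, 1]-normalized image to the power γ are assumptions, not the authors' reference code:

```python
import numpy as np

# Gamma values from Table 2; γ = 1 keeps the original exposure, while γ > 1
# darkens a [0, 1]-normalized image, giving artificially under-exposed versions.
GAMMAS = (1.000, 1.900, 1.950, 2.000)  # K = 4 exposures

def artificial_under_exposures(hazy_rgb: np.ndarray, gammas=GAMMAS):
    """Return a list of K gamma-corrected (under-exposed) images.

    hazy_rgb: float array in [0, 1], shape (H, W, 3).
    """
    return [np.power(hazy_rgb, g) for g in gammas]
```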
Table 3. Average SSIM, TMQI, and FSIMc scores on the FRIDA2 dataset. The boldface numbers indicate the best performance.

Method | Haze Type | SSIM | TMQI | FSIMc
He et al. [7] | Homogeneous | 0.6653 | 0.7639 | 0.8168
He et al. [7] | Heterogeneous | 0.5374 | 0.6894 | 0.7251
He et al. [7] | Cloudy Homogeneous | 0.5349 | 0.6849 | 0.7222
He et al. [7] | Cloudy Heterogeneous | 0.6500 | 0.7781 | 0.8343
He et al. [7] | Overall Average | 0.5969 | 0.7291 | 0.7746
Zhu et al. [18] | Homogeneous | 0.5651 | 0.7533 | 0.7947
Zhu et al. [18] | Heterogeneous | 0.5519 | 0.7254 | 0.7845
Zhu et al. [18] | Cloudy Homogeneous | 0.5310 | 0.7080 | 0.7764
Zhu et al. [18] | Cloudy Heterogeneous | 0.5412 | 0.7674 | 0.8117
Zhu et al. [18] | Overall Average | 0.5473 | 0.7385 | 0.7918
Kim et al. [15] | Homogeneous | 0.5949 | 0.7320 | 0.8048
Kim et al. [15] | Heterogeneous | 0.6245 | 0.7037 | 0.7805
Kim et al. [15] | Cloudy Homogeneous | 0.6124 | 0.7015 | 0.7751
Kim et al. [15] | Cloudy Heterogeneous | 0.6078 | 0.7343 | 0.8135
Kim et al. [15] | Overall Average | 0.6099 | 0.7179 | 0.7935
Galdran [36] | Homogeneous | 0.7200 | 0.7397 | 0.7958
Galdran [36] | Heterogeneous | 0.7213 | 0.7436 | 0.7909
Galdran [36] | Cloudy Homogeneous | 0.6921 | 0.7250 | 0.7800
Galdran [36] | Cloudy Heterogeneous | 0.7595 | 0.7588 | 0.8183
Galdran [36] | Overall Average | 0.7232 | 0.7418 | 0.7963
Proposed Algorithm | Homogeneous | 0.7545 | 0.7295 | 0.8125
Proposed Algorithm | Heterogeneous | 0.7345 | 0.7204 | 0.7991
Proposed Algorithm | Cloudy Homogeneous | 0.7423 | 0.7235 | 0.7963
Proposed Algorithm | Cloudy Heterogeneous | 0.7278 | 0.7172 | 0.7902
Proposed Algorithm | Overall Average | 0.7398 | 0.7227 | 0.7995
Table 4. Average SSIM, TMQI, and FSIMc scores on the O-HAZE and I-HAZE datasets. The boldface numbers indicate the best performance.

Dataset | Method | SSIM | TMQI | FSIMc
O-HAZE | He et al. [7] | 0.7709 | 0.8403 | 0.8423
O-HAZE | Zhu et al. [18] | 0.6647 | 0.8118 | 0.7738
O-HAZE | Kim et al. [15] | 0.4702 | 0.6509 | 0.6869
O-HAZE | Galdran [36] | 0.7877 | 0.8401 | 0.8468
O-HAZE | Proposed Algorithm | 0.7753 | 0.8991 | 0.8350
I-HAZE | He et al. [7] | 0.6580 | 0.7319 | 0.8208
I-HAZE | Zhu et al. [18] | 0.6864 | 0.7512 | 0.8252
I-HAZE | Kim et al. [15] | 0.6424 | 0.7026 | 0.7879
I-HAZE | Galdran [36] | 0.7547 | 0.7613 | 0.8558
I-HAZE | Proposed Algorithm | 0.7779 | 0.8077 | 0.8583
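Tables 3 and 4 report full-reference scores computed against the ground-truth haze-free images. Purely as an illustration, an average SSIM over a folder of results might be scripted as below; the directory layout and the use of scikit-image are assumptions, and TMQI [44] and FSIMc [45] would require their authors' implementations:

```python
import glob
import numpy as np
from skimage import io
from skimage.metrics import structural_similarity

def average_ssim(dehazed_dir: str, gt_dir: str) -> float:
    """Average SSIM between dehazed results and ground-truth images.

    Assumes matching, sorted file names in the two directories (hypothetical layout).
    Requires scikit-image >= 0.19 for the channel_axis argument.
    """
    scores = []
    for dehazed_path, gt_path in zip(sorted(glob.glob(dehazed_dir + "/*.png")),
                                     sorted(glob.glob(gt_dir + "/*.png"))):
        dehazed = io.imread(dehazed_path)
        gt = io.imread(gt_path)
        # channel_axis=-1 treats the last axis as the RGB color channels.
        scores.append(structural_similarity(dehazed, gt, channel_axis=-1))
    return float(np.mean(scores))
```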
Table 5. Hardware synthesis result of the proposed hardware design.

Xilinx Design Analyzer 1
Device: xc7z045-2ffg900

Slice Logic Utilization | Available | Used | Utilization
Slice Registers (#) | 437,200 | 30,676 | 7.02%
Slice LUTs (#) | 218,600 | 36,357 | 16.63%
Used as Memory (#) | 70,400 | 529 | 0.75%
RAM36E1/FIFO36E1s | 545 | 48 | 8.81%

Minimum Period: 4.120 ns
Maximum Frequency: 242.718 MHz

1 The EDA Tool was supported by the IC Design Education Center.
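The utilization percentages and the maximum frequency in Table 5 follow directly from the reported figures (utilization = used/available, maximum frequency = 1/minimum period). A small arithmetic check, not part of the paper's tool flow:

```python
# Arithmetic check of Table 5 (values taken from the synthesis report).
registers_used, registers_avail = 30_676, 437_200
luts_used, luts_avail = 36_357, 218_600
min_period_ns = 4.120

print(f"Register utilization: {registers_used / registers_avail:.2%}")  # ~7.02%
print(f"LUT utilization:      {luts_used / luts_avail:.2%}")            # ~16.63%
print(f"Max frequency:        {1e3 / min_period_ns:.3f} MHz")           # ~242.718 MHz
```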
Table 6. Maximum processing rate for various video resolutions.

Video Resolution | Frame Size | Required Clock Cycles (#) | Processing Speed (fps)
Full HD (FHD) | 1920 × 1080 | 2,076,601 | 116
Quad HD (QHD) | 2560 × 1440 | 3,690,401 | 65
UW4K | 3840 × 1600 | 6,149,441 | 39
4K UHD TV | 3840 × 2160 | 8,300,401 | 29
DCI 4K | 4096 × 2160 | 8,853,617 | 27
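The processing speeds in Table 6 are consistent with dividing the maximum synthesized frequency of 242.718 MHz (Table 5) by the clock cycles required per frame, truncated to an integer. A minimal sketch of that calculation:

```python
# Frame rate = maximum clock frequency / clock cycles required per frame (Table 6).
MAX_FREQ_HZ = 242.718e6  # maximum frequency from Table 5

CYCLES_PER_FRAME = {
    "Full HD (FHD)": 2_076_601,
    "Quad HD (QHD)": 3_690_401,
    "UW4K":          6_149_441,
    "4K UHD TV":     8_300_401,
    "DCI 4K":        8_853_617,
}

for resolution, cycles in CYCLES_PER_FRAME.items():
    fps = int(MAX_FREQ_HZ / cycles)  # truncation matches the values reported in Table 6
    print(f"{resolution}: {fps} fps")
# DCI 4K still reaches 27 fps, i.e., above the 25 fps real-time threshold cited in the paper.
```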
Table 7. Comparison with other hardware designs.

Hardware Utilization | Park et al. [48] | Ngo et al. [17] | Proposed Design
Registers (#) | 53,400 | 70,864 | 30,676
LUTs (#) | 64,000 | 56,664 | 36,357
DSPs (#) | 42 | 0 | 0
Memory (Mbits) | 3.2 | 1.5 | 1.3
Maximum Frequency (MHz) | 88.700 | 236.290 | 242.718
Maximum Video Resolution | SVGA | DCI 4K | DCI 4K
