The difference of *T* is less than 0.04.

**Figure 4.** The images with different *T* values.

#### *4.2. Transmission Refinement Based on Spatial Structure*

We estimate the optimal transmission based on the assumption that all pixels in a block have the same transmission. However, scene depths may vary spatially within a block, and the block-based transmission map usually has a blocking-artifact problem. Therefore, an edge-preserving filter is adopted to refine the block-based transmission map.

The single-image dehazing method using the dark channel prior [9] employs the soft matting technique [30] to refine the block-based transmission map, which imposes an enormous computational burden. In this paper, the guided filter [31] is adopted instead to refine the transmission map at a much lower computational cost. The filtered transmission *t̂*(*p*) is an affine function of the guidance image *I*(*p*), as shown in Equation (13):

$$
\hat{t}(p) = s^T I(p) + \psi \tag{13}
$$

where *s* = (*s<sub>r</sub>*, *s<sub>g</sub>*, *s<sub>b</sub>*)<sup>T</sup> is a scaling vector and *ψ* is an offset determined for each block. For a block in an image, the optimal parameters *s*<sup>∗</sup> and *ψ*<sup>∗</sup> can be obtained by minimizing the difference between the transmission *t*(*p*) and the filtered transmission *t̂*(*p*) using the least squares method, as in Equation (14):

$$\left(s^*,\psi^*\right) = \arg\min_{(s,\psi)}\sum_{p\in\Omega} \left(t(p) - \hat{t}(p)\right)^2\tag{14}$$
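The least-squares fit of Equation (14) has the same closed form as the guided filter of He et al. [31]. A minimal sketch follows, assuming a grayscale guidance image for simplicity (the paper uses the RGB scaling vector *s*); the function names and the regularizer `eps` are illustrative assumptions, not the paper's code.

```python
def fit_window(guidance, transmission, eps=1e-4):
    """Return (s, psi) minimizing sum_p (t(p) - (s*I(p) + psi))^2 over a window.

    `eps` is a small regularizer that keeps the fit stable in flat regions,
    as in the guided filter [31].
    """
    n = len(guidance)
    mean_i = sum(guidance) / n
    mean_t = sum(transmission) / n
    cov_it = sum(i * t for i, t in zip(guidance, transmission)) / n - mean_i * mean_t
    var_i = sum(i * i for i in guidance) / n - mean_i * mean_i
    s = cov_it / (var_i + eps)   # scaling coefficient
    psi = mean_t - s * mean_i    # offset
    return s, psi

def filter_window(guidance, s, psi):
    """Apply the affine model of Eq. (13): t_hat(p) = s*I(p) + psi."""
    return [s * i + psi for i in guidance]
```

Because the offset `psi` is recovered from the window means, the filtered transmission follows edges of the guidance image while matching the block-based transmission on average.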

If the transmission is too small, noise will be amplified in the restored image [9]; thus, the lower limit of the transmission is set to 0.1. If a window slides pixel by pixel over the entire image, multiple windows overlap at each pixel position. A simple choice is the centered window scheme, which sets the final transmission at each pixel to the average of all associated refined transmission values. However, this averaging blurs the final transmission map, especially around object boundaries, where depths change abruptly: the window centered on each pixel may contain multiple objects with different depths, which leads to unreliable depth estimation.

To overcome this problem, the shiftable window scheme [32] is employed instead of the centered window scheme. The window is shifted within a 40 × 40 block, and the optimal shift position is selected as the one with the smallest variation of pixel values within the window. Even though a shiftable window is selected for each specific pixel, the number of overlapping windows still varies across positions: windows in smooth regions are selected more often than those spanning rough boundary regions. Thus, the shiftable window scheme reduces the effect of unreliable transmission values derived from rough boundary regions, thereby alleviating the blurring artifacts.
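The window-selection step of the shiftable scheme can be sketched in one dimension: for a given pixel, every window inside the search range that still covers the pixel is a candidate, and the one with the smallest variance wins. The names and the 1-D simplification are illustrative assumptions, not the paper's implementation.

```python
def variance(vals):
    """Population variance of a list of pixel values."""
    m = sum(vals) / len(vals)
    return sum((v - m) ** 2 for v in vals) / len(vals)

def best_shift(signal, pixel, w):
    """Return the start index of the lowest-variance window of size `w`
    that still contains `pixel` (the 1-D analogue of the shiftable window)."""
    candidates = range(max(0, pixel - w + 1), min(len(signal) - w, pixel) + 1)
    return min(candidates, key=lambda start: variance(signal[start:start + w]))
```

A pixel next to an intensity edge is thereby assigned a window that stays on the smooth side of the edge, which is exactly why boundary blurring is reduced.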

#### *4.3. Lane Separation for Traffic Videos*

After analyzing the spatial characteristics of traffic video, we found that traffic lanes form a salient structure. In a traffic video detection system, the detected objects are mostly concentrated in the driveway regions, while the areas outside the lanes are not regions of interest in traffic video processing. Therefore, haze removal can be applied only to the driveway region of a traffic video to reduce computing time.

However, the estimations of atmospheric light and transmission are based on the whole image. If these values are estimated only from the driveway regions, deviations may occur, especially when the sky occupies a large area of the image, as in the cases shown in Table 2. The larger the sky region is, the greater the deviation of the value of *T* ∗ *X*. Therefore, the separated lane is used only in the last step, to restore the pixels of the driveway regions.


**Table 2.** Global image and driveway.

We adopt a straight-line extraction algorithm based on the Hough transform to detect the lanes and separate the driveway region from the global image. The process of haze removal combined with the driveway region separation is described as follows:


Step 1: Obtain the edge information in the video through edge detection.

Step 2: Remove obviously wrong-angle lines by Hough linear fitting, and obtain lane candidates, as shown in Figure 5b.

Step 3: Find the far left lane and the far right lane, and set them as the driveway boundaries, then find the intersection of these two lines, as shown in Figure 5c.

Step 4: Identify a rectangular area as the driveway region, which is composed of the boundary of the image and a horizontal line across the intersection, as shown in Figure 5c. If the intersection is outside the image, take the whole image area as the driveway region.

Step 5: Use the original pixel values and the optimal transmission of the driveway region in the dehazing model to restore the image in the driveway region.
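The geometric part of Steps 2–4 can be sketched as follows, assuming the Hough transform has already produced candidate line segments as `(x1, y1, x2, y2)` tuples. The angle limits, helper names, and the rectangle convention are illustrative assumptions, not the paper's code.

```python
import math

def angle_deg(line):
    """Slope angle of a segment in [0, 180) degrees."""
    x1, y1, x2, y2 = line
    return math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0

def lane_candidates(lines, lo=20, hi=80):
    """Step 2: keep only lines whose slope angle is plausible for a lane,
    dropping near-horizontal and near-vertical segments."""
    return [l for l in lines
            if lo <= min(angle_deg(l), 180 - angle_deg(l)) <= hi]

def intersection(l1, l2):
    """Step 3: intersection point of the two (infinite) boundary lines."""
    (x1, y1, x2, y2), (x3, y3, x4, y4) = l1, l2
    d = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    px = ((x1 * y2 - y1 * x2) * (x3 - x4) - (x1 - x2) * (x3 * y4 - y3 * x4)) / d
    py = ((x1 * y2 - y1 * x2) * (y3 - y4) - (y1 - y2) * (x3 * y4 - y3 * x4)) / d
    return px, py

def driveway_region(lines, width, height):
    """Steps 3-4: rectangle bounded by the image border and a horizontal
    line through the intersection of the outermost lanes."""
    cands = lane_candidates(lines)
    left = min(cands, key=lambda l: min(l[0], l[2]))    # far left lane
    right = max(cands, key=lambda l: max(l[0], l[2]))   # far right lane
    _, y = intersection(left, right)
    if not (0 <= y <= height):          # intersection outside the image
        return (0, 0, width, height)    # use the whole image as driveway
    return (0, int(y), width, height)   # (x0, y0, x1, y1)
```

In image coordinates (y growing downward), the two lane boundaries converge toward the vanishing point near the horizon, so everything below the intersection's scanline is kept as the driveway region.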

**Figure 5.** Lane space separation: (**a**) original Image; (**b**) lane candidates; (**c**) driveway boundary; (**d**) result for lane separation.

In a traffic video detection system, each camera is mounted at a fixed position and captures the same traffic scene for a long time. Based on this temporal continuity, the lane space separation result obtained from the initial frame of a traffic video can be reused over a long period. Lane space separation decreases the area to be dehazed and improves the efficiency of the dehazing algorithm. Figure 6 shows the haze removal results with and without lane separation. In this scene, dehazing 2000 frames takes 35.301 s without lane separation and 32.74 s with lane separation (the lane space separation itself takes 0.182 s). Although lane separation requires some time, it runs only on the first frame, so its cost is shared by all frames of the video; as the number of frames increases, the efficiency gain becomes more significant. Moreover, the smaller the driveway region is relative to the whole image, the greater the reduction in processing time. When real-time processing is required, even a small reduction in processing time is of practical significance.
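The amortization argument above can be checked with the reported numbers; the variable names are illustrative, and only the timings quoted in the text are used.

```python
# Back-of-the-envelope check of the timings reported above: the one-off
# lane-separation cost (0.182 s, first frame only) is shared by all frames,
# so its per-frame share is far below the per-frame time saved.

frames = 2000
without_sep = 35.301   # seconds for 2000 frames, no lane separation
with_sep = 32.740      # seconds for 2000 frames, with lane separation
sep_cost = 0.182       # one-off lane space separation time

per_frame_saving = (without_sep - with_sep) / frames   # ~1.28 ms per frame
sep_share = sep_cost / frames                          # ~0.09 ms per frame

print(f"saving per frame: {per_frame_saving * 1000:.3f} ms")
print(f"amortized separation cost: {sep_share * 1000:.4f} ms/frame")
```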

**Figure 6.** Results for video dehazing with lane separation: (**a**) before haze removal; (**b**) haze removal without lane separation; (**c**) haze removal with lane separation.

#### *4.4. Optimization Based on Spatial Distribution of Cameras*

With the increasingly complex layout of transportation networks, the number of traffic monitoring cameras also grows, and there are sometimes multiple cameras on the same section of road. Cameras located in close physical proximity usually have the same hardware specifications. In a traffic video detection system, multiple cameras are connected to one system, and these cameras have similar characteristics according to their spatial distribution. Weather also exhibits spatial characteristics; that is, the degree of haze is similar in nearby regions. Thus, we can use the spatial distribution information of cameras to speed up dehazing and optimize the performance of the traffic video detection system.

Figure 7 shows images captured at the same time by four surveillance cameras on the elevated freeways in Hangzhou City. The locations of these cameras are shown in Figure 8; the distance between adjacent cameras is about 500 to 600 m. Table 3 shows the initial transmission values of these four videos, with the haziness flag values *T* calculated from each video listed in the first column. We obtain a proper initial transmission correction value *X* using the method proposed in Section 3 and then determine the initial transmission value *T* ∗ *X*. The resulting initial transmission values are numerically very close, so sharing one camera's value should have no obvious influence on the restored images.

In traffic video dehazing, the cameras are divided into regions according to their locations, and one camera in each region is set as the calibration camera. The images from the calibration camera are used to calculate the initial transmission value, which is then applied to the other cameras in the same region. This avoids repeatedly calculating the values of *T*, *C*, and *X* for every camera, thus improving the efficiency of haze removal. The result of haze removal using the initial transmission value obtained from the calibration camera is shown in Figure 9b, and the result using the initial transmission value obtained from the image itself is shown in Figure 9c. The two results are clearly very similar. Calculating the initial transmission value takes 0.033 s, which is saved by reusing that of the calibration camera.
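The calibration-camera scheme amounts to computing the initial transmission once per region and caching it. A minimal sketch, with all class and function names assumed for illustration (the estimation itself stands in for the method of Section 3):

```python
class Region:
    """A group of nearby cameras sharing one calibration camera's T*X value."""

    def __init__(self, calibration_camera, estimate_initial_transmission):
        self.calibration_camera = calibration_camera
        self._estimate = estimate_initial_transmission  # computes T, C, X
        self._cached = None

    def initial_transmission(self, camera):
        # `camera` is accepted for interface symmetry; every camera in the
        # region reuses the value computed from the calibration camera.
        if self._cached is None:
            self._cached = self._estimate(self.calibration_camera)
        return self._cached
```

With this structure, the 0.033 s estimation cost is paid once per region instead of once per camera.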

**Figure 7.** Example images of the nearby regions.

**Figure 8.** The locations of cameras.



**Figure 9.** Results of haze removal with and without calibration camera: (**a**) original image; (**b**) initial transmission value for calibration camera is 0.596; (**c**) initial transmission value for image itself is 0.578.
