1. Introduction
Short-wave infrared (SWIR) cameras have become increasingly popular due to their long operating range, controllable cost, and better transmission through fog. These advantages are important in optoelectronic pod [1] applications, which typically integrate imaging systems with a long focal length and a large F-number. Infrared (IR) images are generally processed by nonuniformity correction (NUC), bad pixel correction (BPC), detail and contrast enhancement, and dynamic range compression before being displayed [2]; detail and contrast enhancement are crucial to improving image quality. It is well established that detail information is represented in the form of edges (dramatic intensity changes within a spatial neighborhood) [
3], so detail improvement can be regarded as edge enhancement in IR image processing. The dominant IR image detail improvement methods can be divided into mapping-based approaches, gradient-domain algorithms, and decomposition-based methods [
Gradient-domain algorithms [4] are typically computationally expensive and unstable, while the mapping-based approaches [
5,
6] provide relatively limited improvements in the details. Consequently, the existing high-performance detail enhancement methods for IR image processing are decomposition-based methods [
7], which separate a detail layer from the input image and assign it a higher weight before merging, providing outstanding detail and contrast enhancement performance.
In 2009, Branchitta et al. [
8] first applied the bilateral filter to the IR image processing and proposed decomposition-based dynamic range partitioning (BF&DRP), but a careful tuning of several parameters is required in different applications, and serious gradient reversal artifacts caused by the bilateral filter are inevitable in some scenarios. To optimize the work in [
8], Zuo et al. [
9] proposed introducing the bilateral filter [
10,
11] for display and detail enhancement (BF&DDE), and an adaptive Gaussian filter was applied following the bilateral filter to suppress noise and artifacts. The BF&DDE achieved impressive enhancement performance, but at a high computational cost. In 2014, Liu et al. [
12] proposed a method known as GF&DDE, which has a similar framework, but with the bilateral filter replaced by the guided image filter [
13]. In our previous work [
7], we achieved a good balance between detail enhancement and blinking pixel suppression by combining the fast guided filter [
14] and plateau equalization [
6] (FGF&PE), with a running time far shorter than those of the baseline methods, including the BF&DDE and the GF&DDE.
However, the characteristics of SWIR images are different from those of medium-wave infrared (MWIR) and long-wave infrared (LWIR) images, especially images based on indium gallium arsenide (InGaAs) detectors [
15]. The MWIR and LWIR images in long-range imaging systems are typically acquired by mercury cadmium telluride (MCT) IR detectors, which are susceptible to flickering noise (blinking pixels) [
16]. The flickering noise of the detector is characteristic of random telegraph signals (RTSs), caused by defects in semiconductor crystals. The locations of the blinking pixels change over time, and their response state is unstable, meaning they cannot be detected and corrected as ordinary defective pixels. Consequently, the existing decomposition-based IR image enhancement approaches focus on blinking-pixel suppression performance. SWIR images are typically acquired by InGaAs detectors, which have fewer blinking pixels than MCT detectors. Because blinking pixels are generally suppressed at the cost of detail enhancement performance, the existing decomposition-based methods [
7,
8,
9,
12] used for MWIR and LWIR enhancement cannot yield suitable SWIR image processing results.
More importantly, SWIR images contain more detail information than MWIR and LWIR images with the same pixel resolution and optical parameters, because the optical diffraction limit (the radius of the Airy disk) is strongly related to the wavelength; the requirement for detail enhancement is therefore higher for SWIR than for MWIR and LWIR images.
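The diffraction-limit argument can be made concrete with the first-zero radius of the Airy disk, r = 1.22 λ F#. The band-center wavelengths and F-number in the sketch below are illustrative assumptions, not values taken from the paper:

```python
def airy_radius_um(wavelength_um, f_number):
    # First-zero radius of the Airy disk at the focal plane: r = 1.22 * lambda * F#.
    return 1.22 * wavelength_um * f_number

# Illustrative band-center wavelengths (assumed values) at an assumed F# of 4;
# the shorter SWIR wavelength yields a smaller blur spot, hence finer detail.
radii = {band: airy_radius_um(lam, 4.0)
         for band, lam in [("SWIR", 1.5), ("MWIR", 4.0), ("LWIR", 10.0)]}
```

With these assumed values, the SWIR Airy radius is several times smaller than the MWIR and LWIR radii, which is consistent with SWIR images resolving finer detail on the same focal plane.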
In addition, the full-scale frame rate of SWIR (InGaAs) detectors (up to around 300 fps) is much higher than that of MCT detectors (up to around 100 fps). The existing IR image enhancement methods [
7,
8,
9,
12] are typically based on edge-preserving filters, including the bilateral [
10,
11] and guided filters [
13], which have difficulty meeting the requirements of applications with high data rates [
17].
In this paper, we propose a simple and robust SWIR image-enhancement method based on the difference of Gaussian (DoG) filter [
18] and plateau equalization. Compared with our previous work [
19], this paper focuses on describing and analyzing strategies for FPGA porting, which is helpful for practical imaging-system applications. The primary contributions are as follows:
To the best of our knowledge, the DoG filter has not previously been applied to IR image enhancement. We demonstrate that it has many advantages and strong potential for edge and detail extraction in SWIR imaging systems.
The proposed method runs faster than the comparison methods by a large margin, with a frame rate of around 50 fps for high-definition (HD) SWIR images, and can be simply and significantly accelerated using a pipeline architecture, which we describe in
Section 3.3.
The results of FPGA implementation and laptop CPU processing demonstrate that our method has the potential for engineering applications.
This paper is organized as follows:
Section 2 provides background information about the DoG filter and plateau equalization.
Section 3 describes the details of the proposed method and provides application guidance for FPGA-based systems.
Section 4 discusses the results with different parameters and running modes, and the proposed method is compared with other methods. Finally, we conclude with the merits and limitations of our study, and outline future directions in
Section 5.
4. Experimental Results, Comparison, and Discussion
To evaluate the enhancement performance of the proposed method, we applied it to HD SWIR images (including the image shown in
Figure 3a) acquired by a self-developed SWIR camera (shown in
Figure 6) with a resolution of 1280 × 1024 and a focal length of 600 mm. The brief characteristics of the test images are described in
Table 1; the original test images were blurry due to the atmospheric turbulence and point spread function (PSF) of the optical lens.
Because there is no universal objective criterion for IR image quality assessment, and several blind image quality assessment metrics perform inconsistently on IR images [
3], we performed a comparison using the average gradient (AG), which is widely used to evaluate the edge detail, contrast, and sharpness characteristics of an image [
The AG can be calculated as follows:

AG = \frac{1}{R \, C} \sum_{r=1}^{R} \sum_{c=1}^{C} \sqrt{\frac{\left(\nabla_x I(r,c)\right)^2 + \left(\nabla_y I(r,c)\right)^2}{2}}

where I(r,c) is the pixel value of the image; R and C denote the number of rows and columns of the image, respectively; and \nabla_x I and \nabla_y I represent the horizontal and vertical gradients, respectively.
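For reference, the average gradient (AG) metric can be computed with a short NumPy sketch; the function name and the forward-difference gradient choice are illustrative assumptions:

```python
import numpy as np

def average_gradient(img):
    """Average gradient (AG): mean of sqrt((gx^2 + gy^2) / 2) over the image.

    img: 2-D array of pixel values; gradients are taken as forward
    differences, horizontal (gx) and vertical (gy), on the shared region.
    """
    img = np.asarray(img, dtype=np.float64)
    gx = img[:-1, 1:] - img[:-1, :-1]  # horizontal gradient
    gy = img[1:, :-1] - img[:-1, :-1]  # vertical gradient
    return np.mean(np.sqrt((gx ** 2 + gy ** 2) / 2.0))
```

A sharper image has stronger local intensity changes, and therefore a higher AG value.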
4.1. Performance Comparison of Different Parameters
To compare the edge extraction performance of different DoG filter parameters, we compared the filtering results obtained using different window sizes and edge scales (defined by the two standard deviations in Equation (5)), with the gain factor and the PE clipping threshold fixed at 16 and 0.01%, respectively. The DoG filtering result and the final enhancement result produced by the proposed method on test image 1 are shown in
Figure 7 and
Figure 8, respectively.
Figure 7 and
Figure 8 demonstrate that the window size of the DoG operator has a significantly stronger impact on image sharpness than the standard deviation values, while the extraction results did not differ notably among the preset edge scales. Because an operator with a small window is sensitive to noise, as shown in
Figure 7, and a large window may widen the edge transition area and negatively impact image quality, we set the window size to seven in the subsequent experiments. The edge scale, which defines the standard deviation, was set to four because this setting worked robustly in most of our experiments.
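As an illustration of the parameter roles discussed above, here is a minimal DoG filtering sketch in NumPy. Equation (5) is not reproduced in this section, so the construction below (the difference of two normalized Gaussians truncated to the same window, with assumed standard deviations where the wider sigma loosely plays the role of the edge scale) is a sketch, not the paper's exact kernel:

```python
import numpy as np

def gaussian_kernel(size, sigma):
    # Normalized 2-D Gaussian truncated to a size x size window.
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    k = np.outer(g, g)
    return k / k.sum()

def dog_filter(img, size=7, sigma_narrow=1.0, sigma_wide=4.0):
    # Detail layer: difference between a narrow and a wide Gaussian blur.
    # size=7 follows the window size chosen above; the sigma values are
    # assumptions. The kernel is symmetric, so correlation equals convolution.
    k = gaussian_kernel(size, sigma_narrow) - gaussian_kernel(size, sigma_wide)
    img = np.asarray(img, dtype=np.float64)
    pad = size // 2
    padded = np.pad(img, pad, mode="edge")  # replicate borders
    out = np.zeros_like(img)
    R, C = img.shape
    for i in range(size):
        for j in range(size):
            out += k[i, j] * padded[i:i + R, j:j + C]
    return out
```

Because the two Gaussians are each normalized, the DoG kernel sums to zero, so flat regions map to (approximately) zero and only intensity transitions survive in the detail layer.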
4.2. Performance between CPU and FPGA
To evaluate the implementation performance on an FPGA, we applied the DoG filter to the test images using floating-point numbers and integers, as shown in Equations (6) and (7), respectively. A comparison between the results enhanced on the CPU (floating-point numbers) and the FPGA (integers) is shown in
Figure 9. The difference is slight, showing that the proposed method does not suffer from serious performance degradation when implemented on an embedded system such as an FPGA. The average gradient values for different filtering parameters are provided in
Table 2, which further illustrates that the difference is slight, meaning the proposed method has application potential.
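The float-versus-integer comparison can be illustrated with a kernel-quantization sketch. Equations (6) and (7) are not reproduced in this section, so the power-of-two scaling below is an assumed fixed-point scheme, not the paper's exact formulation:

```python
import numpy as np

def quantize_kernel(kernel, frac_bits=8):
    # Scale the floating-point kernel by 2**frac_bits and round to integers,
    # so hardware can use integer multiply-accumulate plus one final shift.
    return np.rint(np.asarray(kernel) * (1 << frac_bits)).astype(np.int64)

def filter_int(img, kernel, frac_bits=8):
    # Integer-only filtering: quantized multiply-accumulate, then an
    # arithmetic right shift that undoes the 2**frac_bits scaling,
    # mimicking a typical FPGA datapath.
    qk = quantize_kernel(kernel, frac_bits)
    img = np.asarray(img, dtype=np.int64)
    n = qk.shape[0]
    pad = n // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros_like(img)
    R, C = img.shape
    for i in range(n):
        for j in range(n):
            out += qk[i, j] * padded[i:i + R, j:j + C]
    return out >> frac_bits
```

The quantization error is bounded by the rounding step of each kernel tap, which is consistent with the slight CPU/FPGA differences reported above.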
To further evaluate the resource requirement of the proposed method, we implemented our method on Xilinx Artix-7 FPGA (XC7A100T-2SBG484I); and the hardware use (with a clock frequency of 108 MHz), including the lookup table (LUT), slices, flip-flops (FFs), block random access memory (BRAM), and digital signal processing (DSP) slices, is shown in
Table 3.
4.3. Performance among Different Methods
For the enhancement performance comparison between the proposed method and others, we applied classic decomposition-based methods, including BF&DDE [
9], GF&DDE [
12], and FGF&PE [
7], as well as PE [
6], as baselines to demonstrate the effect of detail extraction and fusion on image quality.
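For context on the PE baseline, here is a minimal plateau-equalization sketch: histogram equalization with each histogram bin clipped at a plateau value. The integer-input assumption, threshold handling, and 8-bit output range are illustrative choices, not the exact formulation of the cited PE method:

```python
import numpy as np

def plateau_equalization(img, plateau, out_levels=256):
    # Histogram equalization with each bin clipped at `plateau`, which keeps
    # large uniform backgrounds from dominating the intensity mapping.
    img = np.asarray(img, dtype=np.int64)
    hist = np.bincount(img.ravel(), minlength=int(img.max()) + 1)
    hist = np.minimum(hist, plateau)           # clip at the plateau value
    cdf = np.cumsum(hist).astype(np.float64)   # cumulative clipped histogram
    cdf = (cdf - cdf.min()) / max(cdf.max() - cdf.min(), 1.0)
    lut = np.rint(cdf * (out_levels - 1)).astype(np.int64)
    return lut[img]                            # apply the lookup table
```

With a very large plateau the mapping reduces to ordinary histogram equalization; smaller plateau values flatten the influence of dominant gray levels.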
The enhancement results on Image 1 are shown in
Figure 10. The input image was acquired in low-light conditions, and all of the methods significantly improved the dynamic range. The proposed method enhanced details more effectively at the cost of slightly higher noise. The result processed by BF&DDE exhibited obvious artifacts, while PE, GF&DDE, and FGF&PE produced limited detail enhancement. The proposed method achieved a good balance between detail enhancement and noise, with a better visual effect.
Image 2 was captured under strong sunlight. The distance of the tower crane was around 4.5 km, and the original image was obviously blurry due to the point spread function of the optical lens and atmospheric turbulence. The results in
Figure 11 demonstrate that the dynamic range improvement achieved by the five methods did not differ significantly, but our method produced a significant sharpening effect on edges, especially in areas containing text (zoom in on
Figure 11 for details), outperforming the comparison methods by a large margin.
Image 3 was acquired with a short integration time, so the signal-to-noise ratio (SNR) of the input image was relatively low. The proposed method yielded the best performance at an acceptable noise level, and its improvement in detail was significantly greater than that of all comparison methods, as shown in
Figure 12.
Image 4 was captured under strong sunlight, and there were distinct bright and dark areas in the image. The results in
Figure 13 demonstrate that our method had a promising enhancement performance on dense edges in both bright and dark areas.
The average gradient results of different methods are given in
Table 4. The decomposition-based methods typically produced better enhancement performance than PE, while the proposed method yielded the best AG results, with a mean value much higher than those of the other methods. More importantly, the enhancement performance achieved by applying the DoG filter before PE (the proposed method) was significantly better than that of PE alone, with clear improvements in both visual effect and average gradient.
For a fair comparison, we ran all of the methods on a laptop with an Apple M1 Pro and 16 GB of RAM; the mean running times were obtained by running each method 10 times. The results shown in
Table 5 demonstrate that the proposed method ran quickly, with a running time slightly longer than that of PE but significantly shorter than those of the other decomposition-based methods. Notably, we ran our method in serial mode, which is much slower than a pipelined implementation. Although the frame rate of the proposed method is around 50 fps, it could be accelerated following the pipelined-architecture guidance in
Section 3.3, which can be readily applied in FPGA-based imaging systems. The frame rate is then limited only by the output data rate of the IR detector and the bandwidth of the imaging system.
4.4. Limitations
Although our method is simple, efficient, and practical, it has some drawbacks that can be the focus of future research. First, DoG filters are susceptible to noise, especially flickering noise caused by blinking pixels. Current InGaAs SWIR detectors have relatively few blinking pixels, but the proposed method cannot be directly applied to MWIR and LWIR image enhancement because MCT-based detectors suffer from blinking pixels. Second, the dynamic range reprojection performance of PE can be further optimized and improved in the future.