1. Introduction
Thermal infrared (TIR) imaging can determine the nature, state, and change patterns of ground objects by measuring the differences in infrared properties reflected or radiated by the ground. In addition to its importance to global energy transformations and sustainable development, it has been extensively researched in the fields of surface temperature inversion, the urban heat island effect, forest fire monitoring, prospecting, and geothermal exploration [1,2]. Due to the limitations of remote sensors, the thermal infrared band generally has a coarser spatial resolution than the visible bands, which limits the accuracy of these applications. As a result, improving the spatial resolution of thermal infrared images is of great significance and value.
To produce synthetic TIR images, TIR images can be fused with reflectance bands of higher spatial resolution using image fusion techniques. Most current image fusion methods assume a significant correlation between panchromatic (PAN) and multispectral (MS) images, and the data fusion of PAN and MS images has been widely used to create fused images with higher spatial and spectral resolution [3]. Although there are several image fusion methods available for MS and PAN images, only a few in the literature claim to be applicable to TIR and reflectance data, for example, pixel block intensity modulation [4], nonlinear transform and multivariate analysis [5], and the optimal scaling factor [6]. There are two problems with the current methods: (i) since the TIR spectral range is far from the reflectance spectral range, the correlation between TIR and reflectance data is generally weak, resulting in blurred images or significant spectral distortions; (ii) current TIR and reflectance data fusion models are subject to strong subjective influences on parameter selection for different scenes.
Previous research on PAN and MS image fusion methods revealed that multi-resolution analysis (MRA) is widely used due to its high computational efficiency and excellent fusion performance. MRA methods primarily use wavelet transforms, Laplacian pyramids, etc. The aim is to extract spatial-structure information that is affected by spatial resolution and inject it into the lower-resolution image to enhance its spatial detail. Inspired by this, we propose a Generalized Laplacian Pyramid model with Modulation Transfer Function-matched filters (MTF-GLP-TAM) to fuse the thermal infrared bands (30 m) and multispectral bands (10 m) of SDGSAT-1. SDGSAT-1 carries three payloads: a thermal infrared spectrometer, a glimmer (low-light) imager, and a multispectral imager. The synergistic, round-the-clock observation performed by these three payloads provides short-time-phase, high-resolution, and high-precision image data for the fine portrayal of human traces, offshore ecology, the urban heat island effect, and polar environments. The multispectral imager (MSI) is one of its main optical payloads, containing a total of seven bands, mainly in 380~900 nm, with a spatial resolution of 10 m. The thermal infrared spectrometer (TIS) mainly collects three thermal infrared bands with a spatial resolution of 30 m [7,8,9,10]. The low resolution of the thermal infrared bands compared with the MS bands limits their further application. Thus, it is necessary to integrate SDGSAT-1 MS images with TIS images. It is important to preserve the spatial information of the multispectral bands while maintaining the spectral properties of the three thermal infrared bands in the fused images.
This paper is organized as follows: In Section 2, we review different methods for remote sensing image fusion and analyze their applicability to thermal infrared and multispectral data fusion. In Section 3, we describe the whole framework of the image fusion algorithm in detail. In Section 4, we comprehensively compare the results of the proposed method with those of other methods and select several scenes to demonstrate its performance after fusion. In Section 5, we discuss the fusion performance of the proposed algorithm on Landsat series satellite images and compare it with those of other advanced algorithms. Finally, Section 6 summarizes the main conclusions.
2. Related Works
Remote sensing image fusion algorithms have a wide range of applications and a variety of data sources [10]. The purpose of this section is to present recent studies relating to the application of TIR and MS fusion algorithms in remote sensing. Furthermore, we present some fusion algorithms between PAN and MS bands and analyze their potential application to thermal infrared data.
In the early days, Liu et al. developed a pixel block intensity modulation (PBIM) method to add spatial details to Landsat Thematic Mapper (TM) thermal band images at 120 m resolution using spatial information in the reflectance spectral bands at 30 m resolution [4]. However, the PBIM method can only improve the topographic resolution of thermal images, not their spectral resolution. University of Lausanne researchers proposed a generalized Bayesian data fusion (BDF) method for improving the spatial resolution of ASTER thermal images [11]. In this method, the variation in support is explicitly taken into account when combining information from the visible and near-infrared (VNIR) bands (15 m) with the thermal bands (90 m). The fused image retains the local spectral values of the original image while adding spatial detail from the 15 m VNIR bands, but it exhibits local blurring in some areas. In urban areas with spectral and spatial diversity, the Landsat TM TIR band has a spatial resolution of 120 m, which is too coarse to depict surface temperatures. University of York researchers addressed this problem by proposing an algorithm that uses nonlinear transformations and multivariate analysis to fuse 30 m resolution reflectance band data with Landsat TM thermal infrared data [5]. Meanwhile, Seoul National University researchers proposed an effective method to fuse Landsat 8 PAN and TIR images using an optimal scale factor to control the trade-off between spatial detail and thermal information [6,12]. In addition, the authors emphasized that the method can also be used to fuse (1) VNIR and TIR images from ASTER or MODIS data products and (2) PAN and MIR (mid-infrared) images from Kompsat-3A. The optimal scale factor method is, however, subject to subjective factors when setting parameters, and the model is not generalizable. Infrared channel data from geostationary meteorological satellites can be used for meteorological research and applications, but their spatial resolution is poor. As a result, Ocean University of China researchers proposed a correction method based on thermophysical properties for fusing geostationary meteorological satellite infrared (4 km) and visible (1 km) images [13]. However, this method requires high-quality data, which is highly dependent on the solar elevation angle at the time of data acquisition. Chonnam National University researchers investigated an efficient method for fusing Landsat 7 PAN and TIR images using the sparse representation (SR) technique [14]. The missing details of TIR images are estimated using the SR algorithm to enhance their spatial features. However, the optimal parameters for fusion using the SR algorithm are not consistent across different regions. University of Tehran researchers quantitatively and qualitatively evaluated the performance of TIR and PAN band fusion using a wavelet transform and different filters for the Landsat 8 satellite [15]. Several deep learning-based image fusion algorithms also perform excellently in remote sensing image fusion, since they generally have strong nonlinear mapping capabilities. However, they require large amounts of computational resources and training data, which are not easily accessible in the image fusion field due to the lack of ground truth. Since deep learning methods generally use synthetic data for training, their performance in the fusion of real data from novel satellites is limited [16,17,18].
Based on the above advancements in integrating thermal infrared images with other bands, it is evident that the main difficulties with current methods relate to the preservation of spatial and spectral information, as well as the generality of model parameters. The most studied remote sensing image fusion approach is the fusion of MS images with PAN images, a process called panchromatic sharpening (pansharpening). In pansharpening, MS images are merged with PAN images to achieve the same spatial resolution as the PAN images while retaining the spectral resolution of the MS images [19]. The pansharpening method has been applied to many Earth observation satellites, such as IKONOS, QuickBird, GeoEye-1, WorldView-2, and ZiYuan-3, which are capable of acquiring both high-resolution PAN images and low-resolution MS images [20]. Object detection [21], land cover classification [22], and other applications can benefit from high-resolution MS images obtained with fusion. Even though pansharpening is well studied, few studies have applied these algorithms to the fusion of thermal infrared and multispectral data. On the one hand, this is because of the large spectral distance between the two; on the other hand, previous thermal infrared remote sensing instruments have low spatial resolution (for example, 90 m for Terra/ASTER and 100 m for Landsat 8/TIRS); therefore, it is difficult to use classical pansharpening methods directly when spatial enhancement is needed.
Component substitution (CS) algorithms and multi-resolution analysis (MRA) methods constitute the two classical families in the field of generalized sharpening. The CS approaches are also referred to as spectral methods. They are based on the projection of the original MS image into a transformed domain [23]. This class includes algorithms such as intensity–hue–saturation (IHS) [24], principal component analysis (PCA) [25], and Gram–Schmidt (GS) spectral sharpening [26]. CS-class fusion algorithms exploit the differences between linear combinations of the PAN and MS image channels to extract details. However, applying this to the TIR and MS bands requires establishing a nonlinear synthesis relationship between the two sets of channels, which is hard to achieve. The other family, the multi-resolution analysis (MRA) method, uses spatially invariant linear filters to extract spatial details from high-resolution images and add them to multispectral images [27]. Based on the MRA method, we extracted the spatial details from the MS band and injected them into the three SDGSAT-1 TIS bands. While maintaining the original thermal infrared spectral information, the fused image introduces spatial details to increase thermal spatial resolution.
3. Methodologies
We refined the method based on multi-resolution analysis and applied it to multispectral and thermal infrared remote sensing image fusion. The contribution of the multispectral image to the spatial detail of the final fusion product is obtained by calculating the difference between the higher-resolution multispectral image and its low-pass component. The method obtains the spatial details with the multi-scale decomposition of the high-spatial-resolution multispectral image and injects them into the thermal infrared bands, which are up-sampled to the multispectral image size. The main advantages of the fusion technique based on multi-resolution analysis are as follows: (1) good temporal coherence; (2) strong spectral consistency; and (3) robustness to aliasing under appropriate conditions. The flow chart of our proposed algorithm is shown in Figure 1. Specifically, the fusion algorithm in this paper can be decomposed into the following sequential processes: (1) up-sample the thermal infrared image according to the dimensions of the multispectral image; (2) calculate the low-pass component of the multispectral image with filters matched to the R-fold sampling ratio; (3) calculate the injection gains; and (4) inject the extracted details.
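The four sequential steps above can be sketched in NumPy/SciPy. This is a minimal illustration, not the exact implementation: it assumes a 3:1 resolution ratio, uses bilinear up-sampling for brevity, and substitutes a plain Gaussian low-pass (with a placeholder `sigma`) for the MTF-matched filter described later.

```python
import numpy as np
from scipy import ndimage

def mtf_glp_fuse(tir_band, ms_band, ratio=3, sigma=1.5):
    """Fuse one low-resolution TIR band with one high-resolution MS band.

    tir_band : 2-D array, thermal infrared band (e.g., 30 m).
    ms_band  : 2-D array, `ratio`-times finer MS band (e.g., 10 m).
    sigma    : placeholder std of the Gaussian standing in for the MTF filter.
    """
    # (1) Up-sample the TIR band to the MS grid (bilinear here for brevity).
    tir_up = ndimage.zoom(tir_band, ratio, order=1)
    # (3) Histogram-match (moment-match) the MS band to the TIR band.
    ms_hm = (ms_band - ms_band.mean()) * (tir_up.std() / ms_band.std()) + tir_up.mean()
    # (2) Low-pass the matched MS band, decimate, and re-expand so that only
    #     detail finer than the TIR resolution survives the subtraction.
    ms_low = ndimage.gaussian_filter(ms_hm, sigma)
    ms_low = ndimage.zoom(ms_low[::ratio, ::ratio], ratio, order=1)
    # (4) Inject the extracted details additively (global gain g = 1).
    return tir_up + (ms_hm - ms_low)
```

A production implementation would replace the bilinear interpolation with a higher-order kernel and tune the Gaussian to the sensor MTF, but the data flow is the same.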
The thermal infrared image has three bands, which we denote as $\mathrm{TIR}_k$, where $k = 1, 2, 3$. The multispectral image has seven bands, which we denote as $\mathrm{MS}_i$, where $i = 1, \dots, 7$. We find the multispectral band with the maximum correlation coefficient with the thermal infrared bands by computing the correlation coefficients, and we denote this band as $\mathrm{MS}^{*}$. The goal of our algorithm is to inject the high-resolution details from the $\mathrm{MS}^{*}$ image into the three thermal infrared bands. Accordingly, a formula describing this fusion process is given by Expression (1):

$$\widehat{\mathrm{TIR}}_k = \widetilde{\mathrm{TIR}}_k + g_k\!\left[\delta_k\right], \quad k = 1, \dots, B, \tag{1}$$

where subscript $k$ (ranging from 1 to $B$) indicates the spectral band and $B$ is the number of TIR bands. $\widetilde{\mathrm{TIR}}_k$ and $\widehat{\mathrm{TIR}}_k$ are the $k$th channels of the TIR image up-sampled to the $\mathrm{MS}^{*}$ size and of the fused product, respectively. $\delta_k$ indicates the $\mathrm{MS}^{*}$ details, obtained as the difference between the $\mathrm{MS}^{*}$ image and its low-resolution version. The specific formula is shown in (2):

$$\delta_k = \mathrm{MS}^{*}_{(k)} - \mathrm{MS}^{*}_{(k),L}, \tag{2}$$

where $\mathrm{MS}^{*}_{(k)}$ is the result of the histogram matching of $\mathrm{MS}^{*}$ with each TIR band, as shown in Equation (3), and $\mathrm{MS}^{*}_{(k),L}$ is its low-pass version:

$$\mathrm{MS}^{*}_{(k)} = \left(\mathrm{MS}^{*} - \mu_{\mathrm{MS}^{*}}\right)\frac{\sigma_{\mathrm{TIR}_k}}{\sigma_{\mathrm{MS}^{*}}} + \mu_{\mathrm{TIR}_k}. \tag{3}$$

In Equation (3), $\mu$ denotes the mean value of the image, and $\sigma$ denotes the standard deviation. Finally, $g_k[\cdot]$ are the functions that modulate the injection of the $\mathrm{MS}^{*}$ details into the TIR bands and, together with the method used for producing $\mathrm{MS}^{*}_{(k),L}$, distinguish one MRA approach from another.
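The band-selection step described above (choosing the MS band most correlated with the TIR band) can be sketched as follows; this is a minimal illustration assuming both inputs are already on the same grid, e.g., after up-sampling the TIR band.

```python
import numpy as np

def select_ms_band(ms_bands, tir_up):
    """Return the index of the MS band with the maximum correlation
    coefficient with the (up-sampled) TIR band.

    ms_bands : list of 2-D arrays, all with the same shape as tir_up.
    """
    corrs = [np.corrcoef(band.ravel(), tir_up.ravel())[0, 1] for band in ms_bands]
    return int(np.argmax(corrs))
```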
In fact, Equation (1) is a generalization of the MRA approach, where each band is treated independently. Almost all classical approaches employ a linear function $g_k[\cdot]$, which is obtained through the pointwise multiplication (indicated by $\circ$) of the $\mathrm{MS}^{*}$ details by a coefficient matrix $\mathbf{G}_k$, as shown in Equation (4):

$$g_k\!\left[\delta_k\right] = \mathbf{G}_k \circ \delta_k. \tag{4}$$

There are different ways of obtaining the low-pass component $\mathrm{MS}^{*}_{(k),L}$ and of defining the form of $\mathbf{G}_k$.
Two forms are commonly used. (i) Global gain coefficients: for all $k$, $\mathbf{G}_k$ is a matrix of appropriate size with all elements equal to a fixed constant; this definition is the so-called additive injection scheme. (ii) Pixel-level gain coefficients: for all $k$, $\mathbf{G}_k = \widetilde{\mathrm{TIR}}_k \,/\, \mathrm{MS}^{*}_{(k),L}$ (element-wise division). In this case, the details are weighted by the ratio between the up-sampled thermal infrared image and the low-pass-filtered multispectral image in order to reproduce the local intensity contrast of the multispectral image in the fused image. However, the local intensity contrast of the multispectral image does not reflect the true thermal contrast, so we use global gain coefficients here.
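The two injection schemes can be sketched as follows; a minimal illustration, with a hypothetical `eps` guard added to avoid division by zero in the pixel-level (ratio-based) variant.

```python
import numpy as np

def inject_global(tir_up, details, g=1.0):
    """Additive scheme: one constant gain for the whole band."""
    return tir_up + g * details

def inject_pixelwise(tir_up, details, ms_low, eps=1e-6):
    """Ratio-based scheme: details weighted by the local ratio between the
    up-sampled TIR band and the low-pass-filtered MS band (eps is a
    hypothetical safeguard against zero-valued pixels)."""
    return tir_up + (tir_up / (ms_low + eps)) * details
```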
In this paper, we use the classical MTF-GLP-based model, which relies on MTF-matched Gaussian filters for detail extraction and an additive injection model, where $\mathbf{G}_k = \mathbf{1}$ for each $k$. Therefore, the final fusion equation of the TIR image and the $\mathrm{MS}^{*}$ image is shown in (5):

$$\widehat{\mathrm{TIR}}_k = \widetilde{\mathrm{TIR}}_k + \left(\mathrm{MS}^{*}_{(k)} - \mathrm{MS}^{*}_{(k),L}\right), \quad k = 1, 2, 3. \tag{5}$$

In order to better show the spectral changes of the three thermal infrared bands after fusion, the three fused bands $\widehat{\mathrm{TIR}}_k$ are pseudo-colored according to the RGB channels, and, finally, a color image is obtained.
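The final pseudo-coloring step can be sketched as below; the per-band min–max stretch is an illustrative choice, not prescribed by the method.

```python
import numpy as np

def pseudo_color(band_r, band_g, band_b):
    """Stack the three fused TIR bands into an RGB composite,
    stretching each band independently to [0, 1]."""
    def stretch(b):
        return (b - b.min()) / (b.max() - b.min() + 1e-12)
    return np.dstack([stretch(band_r), stretch(band_g), stretch(band_b)])
```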
5. Discussion
In this section, the application of our proposed algorithm to other satellites is discussed, and two advanced multi-sensor fusion algorithms are selected for visual comparison. The results of the experiments are shown in Figure 11, Figure 12 and Figure 13, where Figure 11a shows the thermal infrared data taken by the Landsat 8 TIRS payload and Figure 11b shows the panchromatic band data taken by the OLI payload; Figure 12a and Figure 13a show the thermal infrared data taken by the Landsat 9 TIRS payload, and Figure 12b and Figure 13b show the panchromatic band data taken by the OLI payload; and Figure 11c–e, Figure 12c–e and Figure 13c–e show the fusion results of OSF, SRT, and our proposed algorithm, respectively. The OSF algorithm has been successfully applied to Landsat 8 and KOMPSAT-3A satellites; it controls the trade-off between spatial details and thermal information through an optimal scaling factor. The SRT algorithm has mainly been applied to Landsat 7 satellites; it uses the sparse representation technique for the fusion of panchromatic and thermal infrared bands. From the experimental results, we could see that the OSF fusion algorithm produced clearer ground details but preserved the spectral properties of the thermal infrared bands more poorly, as seen in the circular building in Figure 11 and the airport runway in Figure 13. This approach controls the trade-off between spatial detail and thermal information by introducing a scaling factor, with the disadvantage that its optimal scale factor needs to be re-estimated for each set of images; for different scenes, the optimal value changes. However, it performs better in specific application scenarios, for example, when more spatial detail is needed for military applications of remote sensing, which can be achieved by increasing the scale factor. The results of the SRT fusion algorithm were visually close to those of our proposed algorithm, but the SRT algorithm requires human judgment of the best fusion parameters for each scene. In addition, we calculated some statistics of the grayscale values of the thermal infrared images before and after fusion, mainly including the maximum, minimum, mean, and standard deviation; the results are shown in Table 3.
Overall, the method proposed in this paper achieved the best results in terms of both subjective visual evaluation and objective statistical metrics. This performance is due to the contribution of the multispectral image to the spatial detail of the final fusion results, obtained by calculating the difference between the high-resolution image and its low-pass components. The differences between successive levels of the Gaussian pyramid that we employ define the Laplacian pyramid. The Gaussian filter can be tuned to match the sensor's modulation transfer function by adjusting its response at the Nyquist frequency. This facilitates the extraction of details from high-resolution images that cannot be captured by thermal infrared sensors due to their low spatial resolution, and it can effectively improve the performance of the fusion algorithm. In this case, the only parameter characterizing the entire distribution is the standard deviation of the Gaussian, which is determined using sensor-based information (usually the amplitude response value at the Nyquist frequency provided by the manufacturer or measured in orbit).
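As an illustration of how the standard deviation follows from the amplitude response at the Nyquist frequency, the following sketch assumes a Gaussian frequency response $H(f) = \exp(-f^2 / (2\sigma_f^2))$; the gain value 0.3 in the usage example is illustrative, not an SDGSAT-1 specification.

```python
import math

def gaussian_sigma_from_mtf(g_nyq, ratio):
    """Spatial std (in fine-grid pixels) of a Gaussian whose frequency
    response drops to g_nyq at the coarse sensor's Nyquist frequency.

    g_nyq : amplitude response at Nyquist (0 < g_nyq < 1).
    ratio : resolution ratio between the fine and coarse grids.
    """
    # Nyquist frequency of the coarse band, expressed on the fine grid.
    f_nyq = 1.0 / (2.0 * ratio)
    # Solve exp(-f_nyq**2 / (2 * sigma_f**2)) = g_nyq for sigma_f.
    sigma_f = f_nyq / math.sqrt(-2.0 * math.log(g_nyq))
    # Convert the frequency-domain std to the spatial-domain std.
    return 1.0 / (2.0 * math.pi * sigma_f)
```

For example, `gaussian_sigma_from_mtf(0.3, 3)` gives a spatial standard deviation of roughly 1.48 fine-grid pixels.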
Using this image fusion method, the thermal infrared bands can be improved from 30 m to 10 m resolution. Thermal infrared remote sensing with higher spatial resolution can better address many practical environmental problems. In surface temperature inversion, thermal details can be obtained at 10 m resolution. Fused images can finely portray the spatial distribution of high-energy sites and residence types in urban areas. The method can also be applied to detecting the precise movement of volcanic lava, monitoring the radioactive exposure of nuclear power plants, and land cover classification, among other applications.
6. Conclusions
Thermal infrared images record radiometric information radiated from features that is invisible to the naked eye, and this information is used to identify features and invert surface parameters (e.g., temperature and emissivity). However, their low spatial resolution severely limits their potential applications. Image fusion techniques can be used to fuse TIR images with higher-spatial-resolution reflectance bands to produce synthetic TIR images. The multi-sensor fusion of MS and TIR images is a good example of improved observability. In this paper, a fusion algorithm based on the MTF-GLP model is proposed for fusing TIR images of SDGSAT-1 with MS images. The fusion method was tested on real images and on simulated images with a three-fold degradation in spatial resolution. Compared with existing image fusion methods, the synthesized TIR images performed better visually and no longer suffered from noticeable spectral distortion. The proposed method achieved the best performance in quantitative evaluation metrics such as CC, SAM, RMSE, UIQI, and ERGAS. Finally, we successfully applied the algorithm to the fusion of thermal infrared data from Landsat series satellites with panchromatic band data and obtained better results in visual evaluation compared with several advanced fusion algorithms.