1. Introduction
Spatial and spectral information are both important in remote sensing applications such as land classification, change detection, and road extraction. However, owing to imaging-quality constraints, high-frequency spatial information is separated from spectral information during satellite imaging [1], and typical optical remote sensing satellites, such as QuickBird, WorldView-2, GF-1, and GF-2, provide only high-spatial-resolution panchromatic (PAN) images and low-spatial-resolution multispectral (MS) images. Fusing PAN and MS images effectively overcomes this separation of high-frequency spatial information from spectral information.
According to the technique used for high-frequency information injection, PAN-MS fusion methods can be divided into two categories: spectral and spatial [2]. The spectral methods are based on component substitution: the spectral information component (SIC) is separated from the spatial structure component (SSC) by projecting the MS image into another vector space; the SSC is then replaced by the PAN image to incorporate high-frequency spatial information; finally, the fused image is obtained through the inverse transformation. Typical component substitution (CS) methods include principal component analysis (PCA) and the Gram–Schmidt process (GS). In recent years, deep-learning-based methods [3,4] have also achieved good results, but their computational complexity is high and they are not suitable for large-scale remote sensing images; this paper therefore does not discuss them in depth.
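To make the CS idea concrete, the following minimal sketch implements PCA-based component substitution with NumPy. It assumes the MS image has already been upsampled to the PAN grid; the function name and the simple mean/variance matching of PAN to PC1 are illustrative choices, not part of any cited method.

```python
import numpy as np

def pca_pansharpen(ms, pan):
    """PCA-based component substitution (a minimal sketch).

    ms  : (H, W, B) multispectral cube, already upsampled to PAN size.
    pan : (H, W) panchromatic image.
    Returns the fused (H, W, B) cube.
    """
    h, w, b = ms.shape
    x = ms.reshape(-1, b).astype(np.float64)
    mean = x.mean(axis=0)
    xc = x - mean
    # Eigen-decomposition of the band covariance matrix.
    cov = xc.T @ xc / (x.shape[0] - 1)
    vals, vecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
    vecs = vecs[:, ::-1]                  # principal components first
    pcs = xc @ vecs                       # project pixels onto the PCs
    # Match PAN to PC1's mean and variance, then substitute it for PC1.
    pc1 = pcs[:, 0]
    p = pan.reshape(-1).astype(np.float64)
    p = (p - p.mean()) / (p.std() + 1e-12) * pc1.std() + pc1.mean()
    pcs[:, 0] = p
    # Inverse transform back to the original band space.
    fused = pcs @ vecs.T + mean
    return fused.reshape(h, w, b)
```

When the PAN image carries more high-frequency detail than PC1, that detail is spread back over all bands by the inverse transform, which is exactly the source of the spectral distortion discussed below.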
The spatial methods include multi-resolution analysis (MRA), which decomposes the MS and PAN images at multiple scales. The high-frequency components are fused with the low-frequency components using different rules and finally inverted back into the fused image. Typical MRA methods include the wavelet transform [5], curvelet transform [6], contourlet transform [7,8], non-subsampled contourlet transform (NSCT) [9], and non-subsampled shearlet transform (NSST) [10]. Among these, the wavelet transform is the most widely used, but its directional selectivity is limited and it cannot achieve a stable fusion effect. The curvelet and contourlet transforms lack translation invariance, so the fusion result may be affected by noise or by the alignment accuracy of the source images. NSCT has high computational complexity, making it unsuitable for large images. NSST has good directional selectivity, can extract more information from the source images, and involves no down-sampling during decomposition, thus effectively reducing the pseudo-Gibbs phenomena caused by limited registration accuracy.
The CS approaches have good spatial quality but severe spectral distortion, while the MRA methods have high spectral fidelity but poor spatial quality. These two types of method are complementary [11], which has given rise to many coupled methods. The conventional coupling model is shown in Figure 1a: (1) project the MS image into another vector space to separate the spectral information (MS_SIC) from the spatial information (MS_SSC); (2) fuse MS_SSC and PAN using an MRA-like method to obtain a new spatial structure component (New-SSC); and (3) invert New-SSC and MS_SIC back to the original space to obtain the fused image. Using the SSC reduces the information mismatch between PAN and MS, and thus the spectral distortion. However, because the SSC is obtained directly from the MS image, it lacks high-frequency spatial information, which reduces the sharpness of the result. Thus, although the coupling method overcomes the spectral distortion of CS and the spatial distortion of MRA, its spatial quality (sharpness) is inferior to that of CS and its spectral quality (color) is inferior to that of MRA.
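The conventional coupling pipeline above can be sketched as follows, with GIHS playing the role of the projection in steps (1) and (3) and a simple box-filter frequency split standing in for the MRA stage. The box filter is an illustrative simplification only; a real implementation would use wavelets or NSST.

```python
import numpy as np

def box_blur(img, k):
    """Separable box filter with edge padding (stand-in for an MRA low-pass)."""
    pad = k // 2
    p = np.pad(img, pad, mode="edge")
    ker = np.ones(k) / k
    tmp = np.apply_along_axis(lambda r: np.convolve(r, ker, mode="valid"), 1, p)
    return np.apply_along_axis(lambda c: np.convolve(c, ker, mode="valid"), 0, tmp)

def gihs_coupled_fusion(ms, pan, ksize=5):
    """Conventional coupled fusion, steps (1)-(3), as a minimal sketch.

    ms  : (H, W, B) multispectral cube upsampled to the PAN grid.
    pan : (H, W) panchromatic image.
    """
    ms = ms.astype(np.float64)
    pan = pan.astype(np.float64)
    # (1) GIHS separation: the intensity (SSC) is the band mean; the
    #     per-band residuals carry the spectral information (SIC).
    intensity = ms.mean(axis=2)
    # (2) MRA-like fusion: keep the low frequencies of the MS intensity
    #     and inject the high frequencies of PAN.
    new_ssc = box_blur(intensity, ksize) + (pan - box_blur(pan, ksize))
    # (3) GIHS inverse: add the new SSC back to every band.
    return ms + (new_ssc - intensity)[..., None]
```

Note that the intensity in step (1) comes from the MS image alone, which is precisely the limitation the proposed strategy addresses.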
It is therefore of practical significance to optimize the coupling method so as to improve both the spatial and the spectral quality of the fused images. Through a linear transformation of the data, PCA can concentrate part of the spatial information shared by the bands into the first principal component. In this paper, a new fusion strategy is proposed, as shown in Figure 1b: (1) project the MS image into another vector space to separate the spectral information (MS_SIC) from the spatial information (MS_SSC); (2) apply a PCA transformation to the combined PAN and MS data and use the first principal component (PC1) as the spatial component (PC1_SSC); (3) fuse PC1_SSC and PAN with an MRA-like method to obtain New-SSC; and (4) invert New-SSC and MS_SIC back to the original space to obtain the fused image. The difference between the new and conventional modes lies in how the spatial structure component is obtained: the conventional mode extracts it directly from the MS image via color space transformations and similar techniques, whereas the new mode extracts it from both PAN and MS via a PCA transformation.
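Step (2), extracting PC1 jointly from PAN and MS, can be sketched as follows. The MS image is assumed to be upsampled to the PAN grid, and the sign convention for the eigenvector is an illustrative choice.

```python
import numpy as np

def joint_pc1(ms, pan):
    """Extract the shared spatial component (PC1_SSC) jointly from PAN
    and MS, as in step (2) of the proposed strategy (a sketch).

    ms  : (H, W, B) multispectral cube upsampled to the PAN grid.
    pan : (H, W) panchromatic image.
    Returns PC1 as an (H, W) image.
    """
    h, w, b = ms.shape
    # Stack PAN with the MS bands and run PCA over the (B+1)-dim pixels.
    stack = np.concatenate([pan[..., None], ms], axis=2).reshape(-1, b + 1)
    xc = stack - stack.mean(axis=0)
    cov = xc.T @ xc / (xc.shape[0] - 1)
    vals, vecs = np.linalg.eigh(cov)
    v1 = vecs[:, -1]          # eigenvector of the largest eigenvalue
    # Fix the sign so PC1 correlates positively with PAN.
    if v1[0] < 0:
        v1 = -v1
    return (xc @ v1).reshape(h, w)
```

Because PAN participates in the covariance estimate, the high-frequency structure it shares with the MS bands is concentrated in PC1, unlike in the conventional mode where the SSC is computed from MS alone.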
In the subsequent experiments, we selected the generalized intensity–hue–saturation (GIHS) algorithm from the CS methods and NSST from the MRA methods. For the fusion rules, this paper proposes a new low-frequency rule that constructs the weight coefficients using gradient-domain singular value decomposition (SVD) [12] and local structure descriptors, and applies guided filtering [13] to the weights to increase their spatial continuity; the high-frequency coefficients are fused under the guidance of local spatial frequencies.
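As a sketch of the high-frequency rule, the local spatial frequency of a subband can be computed as SF = sqrt(RF^2 + CF^2), where RF and CF are squared row and column first differences averaged over a small window, and coefficients chosen pixel-wise by the larger SF. The window size and the winner-takes-all selection here are illustrative assumptions, not the paper's exact rule.

```python
import numpy as np

def local_spatial_frequency(coef, win=3):
    """Local spatial frequency of a high-frequency subband (a sketch)."""
    dr = np.zeros_like(coef, dtype=np.float64)
    dc = np.zeros_like(coef, dtype=np.float64)
    dr[1:, :] = (coef[1:, :] - coef[:-1, :]) ** 2   # row differences
    dc[:, 1:] = (coef[:, 1:] - coef[:, :-1]) ** 2   # column differences
    pad = win // 2

    def local_mean(x):
        # Average over a win x win neighbourhood with edge padding.
        p = np.pad(x, pad, mode="edge")
        out = np.zeros_like(x)
        for i in range(win):
            for j in range(win):
                out += p[i:i + x.shape[0], j:j + x.shape[1]]
        return out / (win * win)

    return np.sqrt(local_mean(dr) + local_mean(dc))

def fuse_highpass(c_a, c_b, win=3):
    """Pick, pixel-wise, the coefficient with the larger local SF."""
    mask = local_spatial_frequency(c_a, win) >= local_spatial_frequency(c_b, win)
    return np.where(mask, c_a, c_b)
```

Applied per NSST subband, this favors whichever source carries more local texture at each pixel.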
5. Discussion
As demonstrated by the experimental results, the proposed method achieved good results under both the reference-based and the no-reference evaluation systems and was clearly better than the comparison methods, especially on the indexes measuring the retention of spatial structure information. Conventional spectral and spatial methods mainly use color space transformation and similar techniques to obtain the spatial structure component from the MS image alone. In contrast, the method proposed in this paper uses a PCA transformation to extract the spatial structure component jointly from the PAN and MS images, which better preserves and fuses the available spatial information; the experiments above verified this point. Compared with the conventional methods, the proposed method retained more spatial detail, and compared with a single spectral or spatial method, it combines the advantages of both. The optimized coupling method can thus improve both the spatial and the spectral quality of the fused image. Specifically, PCA was used to concentrate the spatial information shared by the bands into the first principal component, obtaining the spatial component through a linear transformation of the data and making full use of the spatial information of all the bands; on this basis, the spectral information extracted from the MS image was fused with this component to obtain the final result. However, because the proposed method uses a PCA transformation to extract the spatial structure information, and this transformation concentrates only the main information in the first component, part of the spatial structure information is inevitably lost.
In the future, we plan to study how deep learning could be used to extract the spatial structure information directly from the original images and inject it into the low-frequency spectral component, thereby reducing the influence of manually designed rules.
6. Conclusions
For the fusion of PAN and MS images, a fusion framework combining GIHS, NSST, and PCA was proposed in this paper. The GIHS method was adopted for its concise formulas and high execution efficiency, and it places no limit on the number of input bands. The constructed fusion algorithm incorporates more of the spatial structure information of the MS and PAN images while retaining the spectral information of the MS image. PCA is applied jointly to the PAN image and the bands of the MS image to obtain the first principal component, which is then fused with the PAN image via NSST decomposition. Finally, the fused component replaces the original intensity component, which enhances the fusion effect and reduces the spectral distortion. Compared with the traditional algorithms, the proposed algorithm obtains more spatial structure information from PAN and MS and can preserve the spectral information with high fidelity while effectively retaining the spatial structure. For low-frequency coefficient fusion, this paper proposed a new rule based on gradient-domain SVD, using a local structure descriptor to obtain the initial fusion weights and guided filtering to increase their spatial continuity. Four scenes (urban, plants and water, farmland, and desert) from GeoEye-1, WorldView-4, Gaofen-7, and GFDM were used as experimental data. The method was compared with five other fusion algorithms using both reference-based indexes (average gradient, structural similarity, correlation coefficient, universal image quality index, spectral angle mapper, and relative global error) and no-reference indexes (the spectral distortion index, the spatial distortion index, and a comprehensive evaluation index). The results showed that the proposed method achieved outstanding performance in both spectral preservation and spatial information incorporation.