4.1. Experiment and Results
After data preprocessing, all 295 GF-5 bands have a 30 m resolution, whereas four 10 m Sentinel-2A bands (B2, B3, B4, and B8) are retained for the fusion experiments. Their central wavelengths are 490 nm (B2), 560 nm (B3), 665 nm (B4), and 842 nm (B8).
Because GF-5 has the coarser spatial resolution of the two sensors, fusing it with Sentinel-2A can enhance its spatial resolution and facilitate subsequent lithology classification.
Considering the extensive data volume of the study area, a lithologically diverse region was chosen for the fusion experiments.
For this purpose, GF-5 images from the southern region of Tuanjie Peak were segmented into 150 × 150-pixel sub-images, while Sentinel-2 images were divided into 450 × 450-pixel sub-images. The remote sensing images of the two datasets are shown in Figure 3.
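The two tile sizes cover the same ground footprint: 150 pixels × 30 m and 450 pixels × 10 m both span 4.5 km. The following minimal sketch shows how co-located tile pairs could be cut. It assumes the two images are already co-registered, share the same upper-left corner, and are stored as (bands, rows, cols) arrays; the function name and array layout are illustrative and are not taken from the authors' code.

```python
import numpy as np

GF5_RES_M = 30   # GF-5 pixel size after preprocessing
S2_RES_M = 10    # Sentinel-2A pixel size for bands 2, 3, 4, and 8
SCALE = GF5_RES_M // S2_RES_M  # 3 Sentinel-2A pixels per GF-5 pixel


def paired_tiles(hsi, msi, tile=150, scale=SCALE):
    """Yield co-located (GF-5 tile, Sentinel-2A tile) pairs.

    Assumes hsi is (bands, H, W) and msi is (bands, H*scale, W*scale),
    with both arrays starting at the same ground position (hypothetical layout).
    """
    _, h, w = hsi.shape
    for r in range(0, h - tile + 1, tile):
        for c in range(0, w - tile + 1, tile):
            lr = hsi[:, r:r + tile, c:c + tile]
            hr = msi[:, r * scale:(r + tile) * scale,
                     c * scale:(c + tile) * scale]
            yield lr, hr
```

Each yielded pair has shapes (bands, 150, 150) and (bands, 450, 450); partial tiles at the image edge are skipped in this sketch.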
The GSA, SFIM, CNMF, and HySure fusion experiments were conducted in MATLAB R2022a, using each algorithm to read and fuse the GF-5 and Sentinel-2A sub-images. For NonRegSRNet, the prepared sub-images were used as test data and the remaining areas as training data, with no overlap between the test and training areas. In total, 80% of the data were used for training and 20% for testing. These experiments were implemented in Python 3.7 with the PyTorch framework.
To account for spectral characteristic discrepancies in multi-source imagery of identical land objects, relative radiometric normalization and Spectral Response Function (SRF) computation were prerequisite steps in the NonRegSRNet fusion algorithm.
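As a rough illustration of these two preprocessing steps, the sketch below fits a per-band linear gain and offset for relative radiometric normalization and estimates an SRF matrix by ordinary least squares. It assumes the Sentinel-2A image is first averaged to the GF-5 grid and that the SRF is a linear map from hyperspectral to multispectral bands. All function names are hypothetical, and the procedure actually used with NonRegSRNet may differ (for example, by adding non-negativity constraints).

```python
import numpy as np


def block_mean_cube(msi, scale=3):
    """Average scale x scale blocks so a 10 m cube matches the 30 m grid."""
    b, h, w = msi.shape
    return msi.reshape(b, h // scale, scale, w // scale, scale).mean(axis=(2, 4))


def relative_normalize(src_band, ref_band):
    """Linearly map src_band onto ref_band (least-squares gain and offset)."""
    gain, offset = np.polyfit(src_band.ravel(), ref_band.ravel(), 1)
    return gain * src_band + offset


def estimate_srf(hsi, msi_lr):
    """Least-squares SRF matrix R such that msi_lr ≈ R @ hsi at every pixel.

    hsi is (hs_bands, H, W); msi_lr is (ms_bands, H, W) on the same grid.
    Returns R with shape (ms_bands, hs_bands).
    """
    x = hsi.reshape(hsi.shape[0], -1).T        # (pixels, hs_bands)
    y = msi_lr.reshape(msi_lr.shape[0], -1).T  # (pixels, ms_bands)
    r, *_ = np.linalg.lstsq(x, y, rcond=None)  # (hs_bands, ms_bands)
    return r.T
```

With 295 hyperspectral bands, the least-squares problem needs many more pixels than bands to be well conditioned; in practice a regularized or band-limited fit is a common alternative.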
Table 5 lists the parameters; unless otherwise noted, each method's default settings were used.
Training loss curves, visualized using the Visdom tool, typically stabilized between 300 and 400 training iterations, which made the loss trends easy to observe.
The fusion outcomes obtained with GSA, SFIM, CNMF, HySure, and NonRegSRNet are depicted in Figure 4.
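To make one of the compared methods concrete, the sketch below applies the basic SFIM ratio to a single band pair: the upsampled low-resolution band is multiplied by the high-resolution band and divided by a smoothed copy of that band. It is a simplified illustration only. Nearest-neighbour upsampling and a block mean stand in for the interpolation and smoothing filter, the pairing of each GF-5 band with a Sentinel-2A band is left out, and none of this is the MATLAB code used in the experiments.

```python
import numpy as np


def upsample_nearest(band, scale):
    """Repeat every pixel scale x scale times."""
    return np.repeat(np.repeat(band, scale, axis=0), scale, axis=1)


def block_mean(band, scale):
    """Mean of each scale x scale block of a 2-D band."""
    h, w = band.shape
    return band.reshape(h // scale, scale, w // scale, scale).mean(axis=(1, 3))


def sfim_band(lr_band, hr_band, scale=3, eps=1e-12):
    """SFIM ratio for one band pair: LR_up * HR / smoothed(HR)."""
    lr_up = upsample_nearest(lr_band, scale)
    hr_smooth = upsample_nearest(block_mean(hr_band, scale), scale)
    return lr_up * hr_band / (hr_smooth + eps)
```

In this simplified form, averaging the fused band back to the coarse grid returns the original low-resolution band, which is why the ratio injects spatial detail without shifting the coarse-scale radiometry.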
4.1.1. Visual Evaluation
As observed in Figure 5, the SFIM technique fails to significantly enhance spatial resolution, resulting in relatively blurred fusion outcomes where lithological and geological features are not distinctly portrayed. In contrast, the GSA, HySure, and CNMF methods markedly improve the spatial resolution, rendering clearer lithological textures, albeit with minor variations in color fidelity. NonRegSRNet, however, produces nearly monochromatic land object representations and fails to depict lithological variations and characteristics accurately.
Consequently, GSA, HySure, and CNMF deliver the clearest lithological detail, whereas SFIM remains blurred and NonRegSRNet shows limited color diversity. These differences are examined quantitatively below using specific evaluation metrics.
4.1.2. Indicator Assessment
First, this article uses eight indices for comprehensive evaluation: PSNR, SAM, ERGAS, a fourth reference-based index, $D_s$ (spatial distortion), $D_\lambda$ (spectral distortion), QNR (no-reference image evaluation index), and AG (average gradient). The results are shown in Table 6 and are analyzed as follows.
Post-fusion image quality is evaluated with these eight indices from both reference-based and no-reference perspectives. PSNR, SAM, ERGAS, and the fourth index are reference-based (using the original hyperspectral image as the reference); $D_s$, $D_\lambda$, QNR, and AG are no-reference indices, with QNR determined by $D_\lambda$ and $D_s$.
For PSNR, GSA achieved the highest value, indicating the best spatial reconstruction. GSA also had the lowest SAM, suggesting the best spectral quality. NonRegSRNet had the smallest ERGAS, implying the best global statistical measure of fused data quality. GSA was closest to 1 in the fourth reference-based index, indicating good spectral fidelity and minimal spatial distortion. For both $D_s$ and $D_\lambda$, NonRegSRNet was closest to 0, making it the best method on each. Because NonRegSRNet had the best results in both $D_s$ and $D_\lambda$, it was also the best in the QNR index. GSA performed best in the AG index, showing the highest clarity.
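For orientation, the sketch below implements commonly used definitions of PSNR, SAM, ERGAS, AG, and QNR for image cubes stored as (bands, rows, cols). These are textbook forms, not necessarily the exact variants behind Table 6: the PSNR peak value, the ERGAS resolution ratio, and the QNR exponents are assumptions, and the computation of $D_\lambda$ and $D_s$ themselves is omitted.

```python
import numpy as np


def psnr(ref, fused):
    """Mean per-band PSNR in dB; the peak is each reference band's maximum."""
    mse = np.mean((ref - fused) ** 2, axis=(1, 2))
    peak = ref.max(axis=(1, 2))
    return float(np.mean(10 * np.log10(peak ** 2 / mse)))


def sam_degrees(ref, fused, eps=1e-12):
    """Mean spectral angle in degrees between per-pixel spectra."""
    a = ref.reshape(ref.shape[0], -1)
    b = fused.reshape(fused.shape[0], -1)
    norms = np.linalg.norm(a, axis=0) * np.linalg.norm(b, axis=0)
    cos = (a * b).sum(axis=0) / (norms + eps)
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))).mean())


def ergas(ref, fused, ratio=1 / 3):
    """ERGAS; ratio is high-res pixel size / low-res pixel size (assumed 10/30)."""
    rmse2 = np.mean((ref - fused) ** 2, axis=(1, 2))
    mean2 = np.mean(ref, axis=(1, 2)) ** 2
    return float(100 * ratio * np.sqrt(np.mean(rmse2 / mean2)))


def average_gradient(img):
    """Average gradient (AG): mean of sqrt((gx^2 + gy^2) / 2) over all bands."""
    gx = img[:, :-1, 1:] - img[:, :-1, :-1]
    gy = img[:, 1:, :-1] - img[:, :-1, :-1]
    return float(np.mean(np.sqrt((gx ** 2 + gy ** 2) / 2)))


def qnr(d_lambda, d_s, alpha=1.0, beta=1.0):
    """QNR = (1 - D_lambda)^alpha * (1 - D_s)^beta; equals 1 with no distortion."""
    return (1 - d_lambda) ** alpha * (1 - d_s) ** beta
```

Higher PSNR, AG, and QNR are better, while lower SAM, ERGAS, $D_\lambda$, and $D_s$ are better, which matches the direction of the comparisons in the text.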
Second, we compared the spectra of the main lithologies and features in the study area before and after fusion, with the results shown in Figure 6. Figure 6 shows that the spectral curves of the five main lithologies in the images fused by GSA are closest to the pre-fusion spectral curves. The spectral distortions in the NonRegSRNet and HySure fused spectra mainly manifest as slight increases in reflectance, while the overall spectral shape remains fairly consistent with the original spectra. This is in line with the conclusion from the SAM index evaluation that GSA has the best spectral quality. Therefore, the combined index and spectral analyses show that GSA has the best fusion effect in this experiment.
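The distinction drawn here between a reflectance offset and a change in spectral shape can be quantified separately. The sketch below is a hypothetical helper, not the authors' procedure: it takes the mean spectrum of a lithology from a boolean pixel mask and reports the mean reflectance bias and the spectral angle between the pre- and post-fusion spectra.

```python
import numpy as np


def mean_spectrum(img, mask):
    """Mean spectrum of the pixels selected by a boolean (H, W) mask.

    img is (bands, H, W); the result has one value per band.
    """
    return img[:, mask].mean(axis=1)


def compare_spectra(before, after):
    """Return (mean reflectance bias, spectral angle in degrees)."""
    bias = float(np.mean(after - before))
    cos = np.dot(before, after) / (np.linalg.norm(before) * np.linalg.norm(after))
    angle = float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))
    return bias, angle
```

A fused spectrum that is a uniformly brighter copy of the original gives a positive bias but a spectral angle near zero, which is the pattern described for NonRegSRNet and HySure.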
4.1.3. Evaluation of Lithology Classification Performance
In this study, the Random Forest (RF) classifier was used to compare lithological classifications between the individual and fused images. RF classification was run with the scikit-learn (sklearn) library [58]. A grid search was used for parameter optimization, which set n_estimators to 120 and max_depth to 2; all other parameters kept their default values. To improve lithology classification accuracy, samples were selected using both point-based and area-based (surface) approaches. The distribution of training and test rock samples is given in Table 7. The samples were split automatically in Python at a 70:30 training-to-testing ratio. The classification was then applied to the pre-fusion GF-5 and Sentinel-2 images, as well as to the images produced by the five fusion methods, with the outcomes presented in Figure 7 and Figure 8.
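A workflow of this kind can be sketched with scikit-learn as follows. The candidate grid, the 3-fold cross-validation, the stratified split, and the random seed are assumptions made for illustration; the text only states the selected values (n_estimators = 120, max_depth = 2) and the 70:30 ratio.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split


def fit_lithology_rf(X, y, seed=0):
    """Split samples 70:30, grid-search an RF, and score it on the held-out 30%.

    X is (n_samples, n_bands) pixel spectra; y holds lithology labels.
    """
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.30, stratify=y, random_state=seed)
    # Hypothetical candidate grid; it includes the values reported in the text.
    grid = {"n_estimators": [60, 120], "max_depth": [2, 4]}
    search = GridSearchCV(RandomForestClassifier(random_state=seed), grid, cv=3)
    search.fit(X_tr, y_tr)
    test_accuracy = search.score(X_te, y_te)
    return search, test_accuracy, (X_tr.shape[0], X_te.shape[0])
```

After fitting, `search.best_params_` holds the selected parameter combination and `search.best_estimator_` can be applied to every pixel of an image to produce a classification map.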
We first examined the classified imagery presented in Figure 7 and Figure 8. In the surface-sample results, the boundaries of each classified category mostly align well, although some categories contain noise points and patches of other classes. The point-sample results were moderate, with notable misclassification in certain categories.
Figure 9 and Figure 10 indicate that accuracy was generally balanced among the categories for the surface samples. Except for the Sentinel-2A-only data and the NonRegSRNet fusion result, accuracy across the categories was high. In contrast, point samples showed considerable variability in accuracy and tended to be less accurate than surface samples.
The overall classification accuracy and kappa coefficient, shown in Figure 11 and Figure 12, help clarify the differences between surface and point samples. Both sample types are therefore discussed and analyzed in detail below.
Evaluating the classification accuracy of individual lithologies, we observed that surface samples maintained relatively stable accuracy across categories, with the notable exception of the Sentinel-2A-only data and the NonRegSRNet fusion result. Point samples showed more pronounced fluctuations in accuracy and underperformed their surface counterparts. For surface samples, every lithology category achieved high accuracy with the GSA fusion method. In particular, the second section of the Jurassic Longshan Formation, characterized by grey-black and grey microcrystalline limestone, performed best with the GSA fusion method. While the GF-5 data showed some discrepancies in lithology classification, the Quaternary Holocene stood out as the best-performing category. In point samples, the same section of the Jurassic Longshan Formation performed best with GF-5 data, and the Quaternary Holocene achieved its highest accuracy with the GSA fusion method. This suggests that the recognition accuracy of these two units is high in both point and surface samples within the studied region.
In evaluating the classification accuracy of individual lithologies through various fusion methods, our analysis revealed that surface samples using the GSA fusion method, standalone GF-5 data, and SFIM fusion exhibit a commendable performance. The majority of lithologies in these samples surpass the average user accuracy. For point samples, the GSA and CNMF fusion methods stand out, delivering superior results, whereas the sole use of Sentinel-2A data is markedly less effective, showing substantial variability in lithology classification across different datasets.
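User accuracy for each lithology can be read off a confusion matrix. The sketch below assumes the matrix has reference classes in rows and predicted classes in columns (the layout returned by scikit-learn's `confusion_matrix`); the function names are illustrative.

```python
import numpy as np


def users_and_producers_accuracy(cm):
    """Per-class user's and producer's accuracy from a confusion matrix.

    Assumes cm[i, j] counts samples of reference class i predicted as class j.
    """
    cm = np.asarray(cm, dtype=float)
    correct = np.diag(cm)
    users = correct / cm.sum(axis=0)      # correct / all samples mapped to the class
    producers = correct / cm.sum(axis=1)  # correct / all reference samples of the class
    return users, producers


def above_average(users):
    """Boolean flag per class: does its user's accuracy exceed the class average?"""
    users = np.asarray(users, dtype=float)
    return users > users.mean()
```

Counting the flags returned by `above_average` gives the number of lithologies that surpass the average user accuracy for a given dataset.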
Upon examining the data sources, it is evident that the GSA-fused data consistently offer the most accurate lithology classifications. The un-fused GF-5 data rank second, outperforming the un-fused Sentinel-2A data, which show weaker results across all lithologies. Notably, the classifications from GF-5 data, especially for the Quaternary Holocene, biotite granite K, the second section of the Middle Jurassic Longshan Formation, and the first section of the Lower Permian Shenxianwan Formation, are more precise than those from certain fused datasets. This could be attributed to spectral distortions introduced by the fusion process. For instance, NonRegSRNet scores well on some fusion evaluation metrics but suffers from inadequate spatial reconstruction and significant spectral distortion, which reduces the classification accuracy of individual lithologies. Similarly, SFIM produces images of inferior spectral and spatial quality, which lowers classification accuracy relative to the pre-fusion GF-5 data. The overall lower accuracy of the Sentinel-2A data further shows that high spatial resolution alone is inadequate for accurate lithology classification in this region; high spectral resolution is also essential.
Lithology classification outcomes were examined in terms of classification accuracy and the kappa coefficient, as detailed in Table 8 and Figure 9. The overall lithology classification accuracy in Figure 11 and Figure 12 follows the same trend as the single-lithology classification accuracy. Notably, the GSA method attained the highest overall lithology classification accuracy for surface samples at 92.02%, with a kappa coefficient of 0.9143. For point samples, the HySure fusion method achieved the highest overall classification accuracy. Across both point and surface samples, the Sentinel-2A-only data consistently had the lowest classification accuracy of all the datasets compared.
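Both summary figures follow directly from a confusion matrix. The sketch below uses the standard definitions of overall accuracy and Cohen's kappa; the example matrix in the usage note is made up for illustration and is unrelated to the values reported above.

```python
import numpy as np


def overall_accuracy_and_kappa(cm):
    """Overall accuracy and Cohen's kappa from a square confusion matrix."""
    cm = np.asarray(cm, dtype=float)
    n = cm.sum()
    p_observed = np.trace(cm) / n
    # Agreement expected by chance, from the row and column marginals.
    p_expected = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / n ** 2
    kappa = (p_observed - p_expected) / (1 - p_expected)
    return p_observed, kappa
```

For the toy matrix `[[8, 2], [1, 9]]`, the observed agreement is 17/20 = 0.85 and the chance agreement is (10·9 + 10·11)/400 = 0.5, so kappa is 0.35/0.5 = 0.7. Kappa is lower than overall accuracy because it discounts the agreement expected by chance.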
Furthermore, Figure 11 shows that the overall lithology classification accuracy and kappa coefficient of the un-fused GF-5 image were only marginally surpassed by those of the GSA result, whereas the metrics for Sentinel-2A were distinctly at the lower end. This underscores the strength of GF-5 in lithology classification, which is primarily attributed to its higher spectral resolution. Despite its high spatial resolution, the Sentinel-2A data alone were limited by their lower spectral resolution in lithology recognition.