4.6.2. Full-Resolution Experiments

To evaluate the generality of our method, we also conducted full-resolution experiments on the QB and WV4 datasets. Table 7 shows the average quantitative results of these experiments. Since EXP does not inject the details of the PAN image into the LRMS image, its result shows spectral features similar to those of the LRMS image. It can therefore be regarded as a reference for evaluating spectral preservation [9,59] and is excluded from the comparison.

As shown in Table 7, our method is mediocre in *Dλ* on both the QB and WV4 datasets but achieves the best *Ds* on both, and its QNR is the second highest on each dataset (0.8910 on QB and 0.9012 on WV4), which demonstrates that our method still achieves satisfactory fusion results. One potential reason for the mediocre *Dλ* is that the spatial attention of the SSA module changes the spectral features excessively during the interaction process, reducing the fidelity of the original spectral information.
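For readers who want to see what *Dλ* measures, the following is a minimal sketch of the spectral distortion index as it is commonly defined in the QNR protocol: the average absolute difference between inter-band Q-index values of the fused image and those of the LRMS image. It rests on several assumptions that are not stated in this section: the exponent `p = 1`, a global Q index instead of the usual sliding-window average, and the hypothetical function names `q_index` and `d_lambda`. It is not the evaluation code used for Table 7.

```python
import numpy as np

def q_index(x, y):
    """Global universal image quality index (Q) between two single bands.

    Assumption: computed over the whole band; common implementations
    average Q over sliding windows instead.
    """
    x = np.asarray(x, dtype=np.float64).ravel()
    y = np.asarray(y, dtype=np.float64).ravel()
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    denom = (vx + vy) * (mx ** 2 + my ** 2)
    # Hypothetical convention: a degenerate denominator counts as agreement.
    return 1.0 if denom == 0 else 4.0 * cov * mx * my / denom

def d_lambda(fused, lrms, p=1):
    """Spectral distortion between a fused image and the LRMS image.

    Both inputs have shape (bands, H, W); the spatial sizes may differ
    because Q is only ever computed between bands of the same image.
    """
    n = fused.shape[0]
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                q_fused = q_index(fused[i], fused[j])
                q_lrms = q_index(lrms[i], lrms[j])
                total += abs(q_fused - q_lrms) ** p
    return (total / (n * (n - 1))) ** (1.0 / p)
```

Under this definition, a fused image whose inter-band relationships match those of the LRMS image yields a value of zero, and any change to the relationship between bands raises the index.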

| **Methods** | **QB** *Dλ*↓ | **QB** *Ds*↓ | **QB QNR**↑ | **WV4** *Dλ*↓ | **WV4** *Ds*↓ | **WV4 QNR**↑ |
|---|---|---|---|---|---|---|
| **EXP** | **0** | **0.1016** | **0.8984** | **0** | **0.0819** | **0.9181** |
| GSA | 0.0875 | 0.1743 | 0.7584 | 0.0766 | 0.1576 | 0.7803 |
| PRACS | **0.0465** | 0.1096 | 0.8510 | **0.0305** | 0.0975 | 0.8758 |
| BDSD-PC | 0.0622 | 0.1515 | 0.7998 | 0.0478 | 0.1258 | 0.8350 |
| MTF-GLP | 0.1261 | 0.2004 | 0.7056 | 0.0914 | 0.1332 | 0.7907 |
| PNN | 0.0622 | 0.1115 | 0.8374 | 0.0473 | 0.0612 | 0.8944 |
| PanNet | 0.0604 | 0.0990 | 0.8502 | 0.0326 | 0.0620 | **0.9076** |
| MSDCNN | 0.0572 | 0.1025 | 0.8493 | 0.0449 | 0.0665 | 0.8927 |
| TFNET | 0.0492 | 0.0728 | 0.8840 | 0.0569 | 0.0562 | 0.8905 |
| GGPCRN | 0.0509 | 0.0688 | 0.8858 | 0.0555 | 0.0581 | 0.8902 |
| MUCNN | 0.0488 | 0.0886 | 0.86 | 0.0611 | 0.0591 | 0.8847 |
| MDA-Net | 0.0473 | 0.0656 | **0.8921** | 0.0560 | 0.0607 | 0.8873 |
| SSIN | 0.0532 | **0.0609** | 0.8910 | 0.0483 | **0.0534** | 0.9012 |

**Table 7.** Quantitative comparison of different methods on the QB and WV4 datasets in the full-resolution experiments. The best results are in bold and the second-best results are underlined.
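The relation between the three columns can be illustrated with the commonly used definition of QNR, which combines the two distortion indices as QNR = (1 − *Dλ*)^α (1 − *Ds*)^β. The sketch below assumes α = β = 1, which is the usual choice but is not stated in this section, and the function name `qnr` is hypothetical.

```python
def qnr(d_lambda, d_s, alpha=1.0, beta=1.0):
    """QNR from spectral (d_lambda) and spatial (d_s) distortion.

    Assumption: alpha = beta = 1, the usual default weights.
    """
    return (1.0 - d_lambda) ** alpha * (1.0 - d_s) ** beta

# EXP row of Table 7: with d_lambda = 0, QNR reduces to 1 - d_s.
exp_qb = qnr(0.0, 0.1016)
exp_wv4 = qnr(0.0, 0.0819)
```

Because Table 7 reports averages over the test images, applying this formula to the averaged *Dλ* and *Ds* does not in general reproduce the averaged QNR exactly. The EXP row is an exception: with *Dλ* = 0 the formula is linear in *Ds*, so the tabulated values 0.8984 and 0.9181 follow directly.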

Figures 10 and 11 show the visualized results of different methods on the QB and WV4 datasets in the full-resolution experiments, respectively. In addition, we enlarge the region marked by the red box in the fused images for better subjective evaluation. To make the differences between methods easier to observe, the residual images between the EXP result and the results of the other methods in Figures 10 and 11 are shown in Figures 12 and 13, respectively.
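A residual image of this kind can be sketched as a per-pixel, per-band difference between a fused result and the EXP result. The snippet below is only an illustration under stated assumptions: the signed difference, the min-max scaling for display, and the function names `residual_image` and `to_display` are all hypothetical choices, since this section does not specify how the residuals in Figures 12 and 13 were computed or rendered.

```python
import numpy as np

def residual_image(fused, exp):
    """Signed per-pixel difference between a fused image and the EXP image.

    Assumption: both arrays share the same shape, e.g. (H, W, bands).
    """
    fused = np.asarray(fused, dtype=np.float64)
    exp = np.asarray(exp, dtype=np.float64)
    if fused.shape != exp.shape:
        raise ValueError("fused and EXP images must have the same shape")
    return fused - exp

def to_display(residual):
    """Min-max scale a residual to [0, 1] for viewing (hypothetical choice)."""
    lo, hi = residual.min(), residual.max()
    if hi == lo:
        return np.zeros_like(residual)
    return (residual - lo) / (hi - lo)
```

In such a residual, injected spatial detail appears as edge-like structure, whereas spectral changes appear as colored regions, which is how the residual figures are read in the discussion that follows.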

As shown in Figure 10, the results of PNN, PanNet, and MSDCNN suffer from obvious spectral distortion. The results of BDSD-PC, GSA, MTF-GLP, and PRACS exhibit different levels of spatial distortion compared with the PAN image. Specifically, they produce thicker strip structures. The other methods produce better visual results. From the residual images in Figure 12, we can see that the traditional methods inject fewer details than the DL-based methods but have better spectral preservation. Among the DL-based methods, the residual images of PNN, PanNet, and MSDCNN show obvious spatial distortion, because noise pixels are spread over the whole residual image. The residual images of the remaining methods, however, are difficult to tell apart.

**Figure 10.** Full-resolution experimental results of different methods on the QB dataset.

**Figure 11.** Full-resolution experimental results of different methods on the WV4 dataset.

Figure 11 shows the results for a full-resolution sample from WV4. The results of GSA and PRACS exhibit serious spectral distortion. Specifically, the grass in the lower-left corner of their results is lighter in color than in the EXP result. The results of BDSD-PC, MTF-GLP, and PRACS are blurred and present serious spatial distortion. As for the DL-based methods, MSDCNN, MUCNN, PanNet, PNN, TFNET, and MDA-Net produce obvious artifacts and spatial distortion to varying degrees, whereas GGPCRN and SSIN generate relatively clearer images. In the enlarged area, the results of PNN, PanNet, TFNET, MUCNN, and MDA-Net contain obvious spectral distortions, evidenced by pixels whose colors differ from those in the EXP result. As we can see in Figure 13, MTF-GLP and BDSD-PC inject fewer details than the DL-based methods. However, GSA injects more details than PNN, PanNet, MSDCNN, and MDA-Net. PRACS clearly injects spatial details, but with significant spectral distortion, as its residual image shows obvious color deviation. Furthermore, the residual images of PNN, PanNet, MSDCNN, TFNET, MUCNN, MDA-Net, and SSIN exhibit different levels of spectral distortion, as they contain many pixels of distinct colors. Although our SSIN suffers from some spectral distortion and is slightly worse than BDSD-PC, GSA, and MTF-GLP in spectral preservation, it injects more edges into the fusion result.

**Figure 12.** Residual results of the enlarged region of Figure 10.

**Figure 13.** Residual results of the enlarged region of Figure 11.

Overall, the results of both the reduced- and full-resolution experiments show that our SSIN achieves favorable and promising performance in spatial detail injection and spectral preservation.
