*4.1. Datasets*

We conduct experiments on subsets of the datasets provided in [37]. To evaluate the performance of SSIN, we selected data from three different satellites: QuickBird (QB), WorldView-4 (WV4), and WorldView-2 (WV2). Detailed information on the datasets is given in Table 1. For each dataset, we randomly split the data into training and test sets at a ratio of 8:2, and 20% of the test set is held out as the validation set. Because the data volume is limited, data augmentation (random cropping, random horizontal flipping, and rotation) is used to generate training samples. After augmentation, 400 × 2, 380 × 2, and 400 × 2 image pairs are used as training samples for QB, WV4, and WV2, respectively. The PAN and LRMS training patches are 64 × 64 and 16 × 16 pixels, respectively. For testing, 80 reduced-scale and 80 full-scale image pairs are used. The reduced-scale PAN and LRMS images are 256 × 256 and 64 × 64 pixels, respectively, and the full-scale PAN and MS images are 1024 × 1024 and 256 × 256 pixels, respectively. All deep learning methods are trained and tested on the same data.
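The augmentation above must be applied consistently to each PAN/LRMS pair so that the two images stay spatially aligned across the 4× resolution gap. The paper does not give implementation details, so the following is only a minimal sketch of one plausible scheme; the function name, crop logic, and use of 90-degree rotations are our assumptions, not the authors' code.

```python
import numpy as np

def augment_pair(pan, lrms, pan_crop=64, scale=4, rng=None):
    """Randomly crop, flip, and rotate a PAN/LRMS pair consistently.

    pan  : 2-D array (H, W), panchromatic image
    lrms : 3-D array (h, w, C), low-resolution multispectral image,
           with H = scale * h and W = scale * w (scale = 4 in the paper)

    NOTE: this is an illustrative sketch, not the authors' pipeline.
    """
    rng = rng or np.random.default_rng()
    lr_crop = pan_crop // scale  # 64 // 4 = 16, matching the patch sizes above

    # Random crop: pick the LRMS corner first, then scale it up for the PAN,
    # so both patches cover the same ground area.
    h, w = lrms.shape[:2]
    y = int(rng.integers(0, h - lr_crop + 1))
    x = int(rng.integers(0, w - lr_crop + 1))
    lrms_patch = lrms[y:y + lr_crop, x:x + lr_crop]
    pan_patch = pan[y * scale:(y + lr_crop) * scale,
                    x * scale:(x + lr_crop) * scale]

    # Random horizontal flip, applied to both patches.
    if rng.random() < 0.5:
        pan_patch = np.flip(pan_patch, axis=1)
        lrms_patch = np.flip(lrms_patch, axis=1)

    # Random rotation by a multiple of 90 degrees (keeps shapes square-safe).
    k = int(rng.integers(0, 4))
    pan_patch = np.rot90(pan_patch, k)
    lrms_patch = np.rot90(lrms_patch, k, axes=(0, 1))

    return pan_patch.copy(), lrms_patch.copy()
```

Cropping on the LRMS grid first and multiplying by the scale factor guarantees the PAN patch boundary never falls between LRMS pixels, which would otherwise break the alignment the network relies on.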

**Table 1.** The detailed information of the datasets.


Following the Wald protocol [52], we downsample the PAN and MS images by a factor of four using spatial degradation based on the sensor's modulation transfer function (MTF). The degraded images then serve as the network input, and the original HR MS images serve as the ground truth for training.
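A common way to realize MTF-matched degradation is to low-pass filter with a Gaussian whose frequency response at the Nyquist frequency of the decimated grid equals the sensor's published MTF gain, then decimate. The sketch below illustrates this idea under our own assumptions: the function name is ours, the Nyquist gain of 0.29 is only a placeholder (real values are band- and sensor-specific), and a separable Gaussian is a simplification of the filters used in standard MTF toolboxes.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def mtf_degrade(img, ratio=4, nyquist_gain=0.29):
    """Approximate MTF-matched degradation: Gaussian low-pass + decimation.

    A spatial Gaussian with std sigma has frequency response
    H(f) = exp(-2 * pi^2 * sigma^2 * f^2). Setting H(1/(2*ratio)) equal to
    `nyquist_gain` gives sigma = ratio * sqrt(-2 * ln(gain)) / pi.

    NOTE: illustrative sketch only; `nyquist_gain=0.29` is a placeholder,
    not a value taken from the paper.
    """
    sigma = ratio * np.sqrt(-2.0 * np.log(nyquist_gain)) / np.pi
    if img.ndim == 3:
        # Filter each spectral band independently.
        blurred = np.stack(
            [gaussian_filter(img[..., b], sigma) for b in range(img.shape[-1])],
            axis=-1,
        )
    else:
        blurred = gaussian_filter(img, sigma)
    # Decimate by keeping every `ratio`-th sample in each spatial dimension.
    return blurred[::ratio, ::ratio]
```

Applying this to the original MS images yields the LRMS network inputs, while the untouched MS images remain available as ground truth, exactly the reduced-resolution setting the Wald protocol prescribes.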
