a. Datasets

In our experiment, eight pairs of optical and SAR images are used to test the ROS-PC; these pairs are referred to as Pairs A-H. Table 2 lists the information for the test images.


**Table 2.** Information for the test images.

Several factors are considered in the selection of test images, including the SAR sensor, acquisition date, resolution, and image size. The optical images of the eight pairs are obtained from Google Earth, and the SAR images comprise one satellite SAR image and seven airborne SAR images. To verify the robustness of the ROS-PC, the images contain different features, as shown in Figure 13.

Pair A includes images of an airport in Tucson, AZ, USA; there are slight rotation and translation differences in this pair. Pair B is obtained in Zhengzhou, Henan, China, and it includes images of a small village. There are slight rotation and translation differences in this pair, and because of the arc-shaped roofs, there is a large radiation difference over the houses. Some houses are even difficult to recognize in the SAR image. Pair C is also obtained in Zhengzhou, Henan, China, and it includes images of a field whose features vary significantly with the date. Some obvious features exist in the SAR image but not in the optical image. The remaining five pairs of images are obtained in Weinan, Shaanxi, China. Pair D includes images of a large scene that includes a river, small buildings, multiple fields, and several roads. There are no scale or rotation changes in this pair, and the features exhibit a slight temporal difference. This pair of images is used for the rotation and scale variation experiments. Pair E mainly includes images of a lake. There are certain scale variations and time differences. Pair F includes images of a terrace. The fields and terraces in the image are divided into two parts by a road. The feature intensity of the terraces is stronger than that of the fields. Pair G includes images of a scene with a river and some fields; there is a large time span between the two images. Pair H includes images of a complex scene with buildings of different structures.

**Figure 13.** Eight pairs of test images and enlarged view of the main features. (**a**) Pair A; (**b**) Pair B; (**c**) Pair C; (**d**) Pair D; (**e**) Pair E; (**f**) Pair F; (**g**) Pair G; (**h**) Pair H.

b. Parameter Settings

The parameter settings of the UMPC-Harris detector are described in Section 3.1.2. The proposed ROS-PC method contains three parameters: *no*, *np*, and *m*. Parameters *no* and *np* determine the dimension of the descriptor, so they should not be too large. Parameter *m* is the size of the local region used for feature description. If the local region is too small, it contains too little information to reflect the differences between features. Conversely, if the local region is too large, the amount of calculation increases and the descriptor becomes more susceptible to geometric distortion. In the feature descriptor, the parameters are set as *no* = 6, *np* = 4, and *m* = 96. Therefore, the dimension of the feature descriptor of a keypoint is 384. The parameter settings of the comparative algorithms OS-SIFT and RIFT follow References [28,39]. For a fair comparison, the keypoint detection thresholds are adjusted to obtain similar numbers of keypoints (approximately 1000–1200).
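This section does not spell out how the 384 dimensions arise. One factorization consistent with the stated values is given below. It rests on our assumption, not confirmed in this section, that the local region is divided into an *np* × *np* grid, that each grid cell holds an *no*-bin histogram, and that the histograms are concatenated over the four LGF scales used by the HOSMI descriptor. The symbol $n_s$ for the number of scales is introduced here only for illustration.

$$
n_p \times n_p \times n_o \times n_s = 4 \times 4 \times 6 \times 4 = 384
$$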

For feature matching, the sum of squared differences (SSD) is selected as the matching metric. If the distance between the two feature vectors is less than the threshold, a pair of keypoints is considered as a potential match, and the threshold is set to 3 pixels. Generally, the potential matches contain many false matches, so the FSC algorithm is used to remove them.
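The SSD-based search for potential matches can be illustrated with a minimal sketch. It assumes brute-force nearest-neighbour search over plain Python lists; the function names and the `max_ssd` threshold are hypothetical, and the text does not specify how its 3-pixel threshold maps onto descriptor distance, so the value used here is purely illustrative. The FSC outlier-removal step is not shown.

```python
def ssd(u, v):
    """Sum of squared differences between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def match_by_ssd(desc_ref, desc_sen, max_ssd):
    """Return (i, j, distance) for every reference descriptor whose nearest
    sensed descriptor has an SSD below max_ssd (a hypothetical threshold)."""
    if not desc_sen:
        return []
    matches = []
    for i, d_ref in enumerate(desc_ref):
        # Brute-force nearest neighbour under the SSD metric.
        j, dist = min(
            ((k, ssd(d_ref, d_sen)) for k, d_sen in enumerate(desc_sen)),
            key=lambda pair: pair[1],
        )
        if dist < max_ssd:
            matches.append((i, j, dist))
    return matches


# Toy 3-dimensional descriptors (the real descriptor has 384 dimensions).
desc_optical = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.5, 0.5, 0.5]]
desc_sar = [[1.0, 0.1, 0.0], [0.0, 0.9, 0.1], [9.0, 9.0, 9.0]]
potential = match_by_ssd(desc_optical, desc_sar, max_ssd=0.1)
```

In this toy example the third optical descriptor has no sufficiently close SAR descriptor and is therefore left unmatched; the surviving pairs would then be passed to an outlier-removal step such as FSC.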

#### 3.2.3. Comparison of Experimental Results and Discussion

To evaluate the optical and SAR image registration performance of the ROS-PC, the algorithm is compared with OS-SIFT and RIFT. The OS-SIFT utilizes two different operators to calculate the gradients for SAR and optical images. Multiple image patches are aggregated to construct a gradient location orientation histogram-like descriptor. It is an advanced gradient-based method. The RIFT is a radiation-insensitive feature matching method based on PC and MIM, which is considerably more robust to nonlinear radiation distortions than traditional gradient maps. The registration results of eight pairs of optical and SAR images are shown in Figures 14–21.

**Figure 17.** Registration results of Pair D. (**a**) OS-SIFT; (**b**) RIFT; (**c**) ROS-PC.

**Figure 21.** Registration results of Pair H. (**a**) OS-SIFT; (**b**) RIFT; (**c**) ROS-PC.

Eight groups of images with different features are selected to verify the robustness of the ROS-PC algorithm. The ROS-PC has the best performance among the three methods, owing to the advantages of the proposed UMPC-Harris detector and HOSMI descriptor. For images with date and season differences, shown in Figures 14 and 16, the ROS-PC shows better robustness and obtains matched keypoints despite the time difference. For images with large radiation differences, shown in Figures 15 and 21, the ROS-PC can still obtain correctly matched keypoints that are well-distributed in the image. For images with multiple objects, shown in Figures 17–20, the ROS-PC can extract matching keypoints from each object, and they are uniformly distributed, which ensures the accuracy of registration. To sum up, the ROS-PC is a robust algorithm that is suitable for optical and SAR image registration.

To further observe the registration accuracy of the ROS-PC, the checkerboard mosaic images and enlarged sub-images of each pair are displayed in Figure 22.

**Figure 22.** Checkerboard mosaic images and enlarged sub-images of ROS-PC. (**a**) Pair A; (**b**) Pair B; (**c**) Pair C; (**d**) Pair D; (**e**) Pair E; (**f**) Pair F; (**g**) Pair G; (**h**) Pair H.

Each sub-image is an enlarged view of a tile intersection in the checkerboard mosaic image, where the common features in the optical and SAR images are displayed clearly. In each pair, three sub-images with different features are selected.
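A mosaic of this kind can be built as in the following minimal sketch, which assumes the two images are already registered, have identical dimensions, and are stored as nested lists of pixel values. The function name and the tile size are hypothetical.

```python
def checkerboard_mosaic(img_a, img_b, tile):
    """Interleave two equally sized images in tile x tile blocks."""
    rows, cols = len(img_a), len(img_a[0])
    if len(img_b) != rows or any(len(r) != cols for r in img_b):
        raise ValueError("images must have identical dimensions")
    mosaic = []
    for r in range(rows):
        row = []
        for c in range(cols):
            # Blocks whose (row block + column block) index is even
            # are taken from img_a, the others from img_b.
            use_a = ((r // tile) + (c // tile)) % 2 == 0
            row.append(img_a[r][c] if use_a else img_b[r][c])
        mosaic.append(row)
    return mosaic


# Toy 4x4 images: 1 marks optical pixels, 2 marks registered SAR pixels.
optical = [[1] * 4 for _ in range(4)]
sar_registered = [[2] * 4 for _ in range(4)]
mosaic = checkerboard_mosaic(optical, sar_registered, tile=2)
```

If the registration is accurate, linear features such as roads and field boundaries continue without visible offsets across the tile borders.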

Comparisons of RMSE, NCM, and the running time of the eight pairs are presented in Table 3.


**Table 3.** Comparison of root mean square error (RMSE), number of correct matches (NCM), and time for different methods on eight pairs of test images.
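The exact definitions of RMSE and NCM are not restated in this section. The sketch below uses a common convention as an assumption: a match is counted as correct when the sensed keypoint, mapped by a reference affine transformation, falls within a pixel tolerance of its reference keypoint, and the RMSE is computed over the correct matches only. The function names, the 2×3 matrix layout, and the default tolerance of 3 pixels are illustrative choices.

```python
import math


def apply_affine(T, pt):
    """Apply a 2x3 affine matrix T = [[a, b, tx], [c, d, ty]] to (x, y)."""
    x, y = pt
    return (T[0][0] * x + T[0][1] * y + T[0][2],
            T[1][0] * x + T[1][1] * y + T[1][2])


def ncm_and_rmse(matches, T, tol=3.0):
    """matches: list of ((x_ref, y_ref), (x_sen, y_sen)) keypoint pairs.

    Assumed criterion: a match is correct when the sensed point mapped
    by T lies within tol pixels of the reference point.
    """
    sq_residuals = []
    for ref, sen in matches:
        px, py = apply_affine(T, sen)
        sq = (px - ref[0]) ** 2 + (py - ref[1]) ** 2
        if sq < tol ** 2:
            sq_residuals.append(sq)
    ncm = len(sq_residuals)
    rmse = math.sqrt(sum(sq_residuals) / ncm) if ncm else float("nan")
    return ncm, rmse


# Hypothetical reference transformation: a pure translation by (10, -5).
T_shift = [[1.0, 0.0, 10.0], [0.0, 1.0, -5.0]]
matches = [
    ((20.0, 15.0), (10.0, 20.0)),  # residual 0 pixels: correct
    ((33.0, 29.0), (20.0, 30.0)),  # residual 5 pixels: rejected
    ((41.0, 35.0), (30.0, 40.0)),  # residual 1 pixel: correct
]
ncm, rmse = ncm_and_rmse(matches, T_shift)
```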

In the eight pairs of test images, most features of suburban areas, such as airports, houses, fields, terraces, roads, rivers, and lakes, are included. The optical and SAR images exhibit nonlinear radiation distortion, which leads to intensity differences or gradient inversion. Next, we analyze and discuss the registration results of each pair of images.

For Pair A, all three methods can successfully achieve optical and SAR image registration because of the high resolution (HR) and low noise. The ROS-PC performs best on the RMSE and NCM, benefiting from the high repeatability rate of keypoints and the robustness of the feature description method. For Pair B, the RIFT fails to register the two images correctly because of the serious nonlinear radiometric difference. This is because the houses in the image have arc-shaped roofs, which induce intensity differences between the optical and SAR images. Although several correctly matched keypoints are detected by the OS-SIFT, their number and accuracy are significantly lower than those of the ROS-PC. For Pair C, the OS-SIFT fails to register the two images because of the gradient difference and obvious scattering in the SAR image. However, the ROS-PC is suitable for describing similar local information based on the PC orientation and the multi-scale MIMs of the keypoints. For Pair D, the image contains many features, and there is little scale, rotation, or date difference. The ROS-PC remains superior to the other two methods, as it is robust to nonlinear radiation differences and noise. For Pairs E and F, the images contain two types of features. The ROS-PC can obtain correctly matched keypoints from each object, and they are well-distributed. For Pair G, owing to the time difference, there are many unmatched keypoints along the river in the image, which causes extra difficulties in feature matching. However, the ROS-PC still obtains a larger NCM and achieves higher accuracy. For Pair H, the radiation difference between the optical and SAR images is large because there are many buildings in the image, which leads to the failure of the other two algorithms. To sum up, the ROS-PC has the most uniform distribution, the largest NCM, and the best RMSE among the three algorithms. Therefore, the ROS-PC is more robust to noise and scattering in SAR images and to the radiation difference between optical and SAR images.

Gradient-based descriptors such as OS-SIFT are more sensitive to nonlinear radiation differences because they rely on a linear relationship between images; therefore, they are not appropriate for significant nonlinear intensity differences caused by radiation distortion. The speckle noise and scattering in SAR images pose significant challenges in image registration. The RIFT performs better than the gradient-based descriptors because it uses PC to capture MIM. The RIFT uses the MIM to express the shape and structure information of objects and, therefore, it is robust to nonlinear radiation distortion. However, the repeatability rate of the corner detector in RIFT is not as good as that of UMPC-Harris, and the descriptor in RIFT has limited significance and robustness to noise and scattering in SAR images. Therefore, our ROS-PC yields the smallest RMSE and the largest NCM across all eight pairs for the two reasons listed below.

• The UMPC-Harris can obtain a higher repeatability rate of keypoints than SAR-Harris and m + M-Harris between SAR and optical images.

• The HOSMI descriptor uses four-scale and six-orientation LGFs to capture the multi-scale max index and orientation feature information of PC, which is robust to nonlinear radiation variations of optical and SAR images. Further, it can effectively overcome the noise and scattering of SAR images.

Comparing the running times in Table 3 shows that the ROS-PC is the most time-consuming. First, the algorithm is based on the principle of PC, which is slow by nature. Second, in the process of feature detection, the overlapping block and voting strategies require additional calculation compared with the other methods. Third, the descriptor is constructed over four scales, so its dimension is larger. This paper focuses on a robust registration method for optical and SAR images, and the running time is not the focus. Therefore, reducing the computation time and improving the efficiency of the algorithm is a problem we need to study in the future. Moreover, computational efficiency can be further improved by optimizing the algorithm and implementing the ROS-PC in C/C++.

#### *3.3. Influence of Rotation and Scale Variations on the Proposed ROS-PC*

The previous experimental results show that the algorithm is robust to the radiation distortion between optical and SAR images; however, the ROS-PC is not designed for scale and rotation deformations. The large-angle rotation between remote sensing images can be corrected using sensor geographic information. Further, by employing remote sensing image ground resolution information, remote sensing images can be assigned to the same scale by resampling. Then, the ROS-PC could be used for fine matching, which can handle slight rotation and scale differences between optical and SAR images. In this subsection, the influence of rotation and scale variation on our algorithm is evaluated based on the NCM for Pair D.
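The coarse scale alignment described above can be sketched as follows, assuming the ground resolution of each image is known as a ground sampling distance (GSD) in metres per pixel. The function names and the numerical values are hypothetical and are not taken from the test data.

```python
def resample_factor(gsd_source, gsd_target):
    """Resize factor that brings an image from gsd_source to gsd_target.

    Both values are ground sampling distances in metres per pixel.
    A factor above 1 means the image must be enlarged.
    """
    if gsd_source <= 0 or gsd_target <= 0:
        raise ValueError("ground sampling distances must be positive")
    return gsd_source / gsd_target


def resampled_size(width, height, factor):
    """Pixel dimensions of the image after resampling by factor."""
    return (max(1, round(width * factor)), max(1, round(height * factor)))


# Hypothetical example: a 1.0 m/pixel SAR image is brought to the
# 0.5 m/pixel grid of the optical image, so it is enlarged by 2.
factor = resample_factor(gsd_source=1.0, gsd_target=0.5)
new_size = resampled_size(800, 600, factor)
```

After such a resampling step, only the residual scale difference remains for the fine matching stage to absorb.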

#### 3.3.1. Rotation Experiments of the Proposed ROS-PC

We test the effect of rotation changes on the ROS-PC. The optical image remains unchanged, and the SAR image is rotated from −12° to 16°. The optical and SAR image registration results of the rotation variation are shown in Figure 23. The relationship between the NCM and the rotation angle is listed in Table 4.


**Table 4.** NCM with different rotation angles.

Figure 23 and Table 4 indicate that the ROS-PC can tolerate rotations between optical and SAR images below 9°, which is sufficient for images that have been corrected by sensor geographic information.
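Because the rotation is applied synthetically, the transformation between the original and rotated images is known for each test angle and can serve as a reference when counting correct matches. The sketch below builds such a rotation about the image centre for the angles shown in Figure 23. The function names, the 512 × 512 image size, and the sign convention of the angle are assumptions made for illustration.

```python
import math


def rotation_about_center(angle_deg, width, height):
    """2x3 affine matrix rotating points by angle_deg about the image centre."""
    theta = math.radians(angle_deg)
    c, s = math.cos(theta), math.sin(theta)
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0
    # p' = R (p - centre) + centre, expanded into a 2x3 matrix.
    return [[c, -s, cx - c * cx + s * cy],
            [s, c, cy - s * cx - c * cy]]


def apply_affine(T, pt):
    """Apply a 2x3 affine matrix to a point (x, y)."""
    x, y = pt
    return (T[0][0] * x + T[0][1] * y + T[0][2],
            T[1][0] * x + T[1][1] * y + T[1][2])


# Rotation angles (degrees) shown in the panels of Figure 23.
angles = [-12, -9, -6, -3, 0, 4, 8, 12, 16]
# Hypothetical image size of 512 x 512 pixels.
transforms = {a: rotation_about_center(a, 512, 512) for a in angles}
center_after = apply_affine(transforms[16], (255.5, 255.5))
```

The image centre is a fixed point of every such transformation, which gives a quick sanity check on the matrix construction.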

#### 3.3.2. Scale Experiments of the Proposed ROS-PC

We test the robustness of the ROS-PC to scale changes. The optical image in Pair D remains unchanged, and the SAR image is resized by scale factors from 0.6 to 1.4 with an interval of 0.1. The optical and SAR image registration results of the scale variation are shown in Figure 24. The relationship between the NCM and the scale factor is listed in Table 5.

**Figure 23.** Registration results of optical and SAR images with different rotation angles (degrees). (**a**) −12°, (**b**) −9°, (**c**) −6°, (**d**) −3°, (**e**) 0°, (**f**) 4°, (**g**) 8°, (**h**) 12°, and (**i**) 16°.

**Figure 24.** Registration results of optical and SAR images with different scales. (**a**) 0.6, (**b**) 0.7, (**c**) 0.8, (**d**) 0.9, (**e**) 1.0, (**f**) 1.1, (**g**) 1.2, (**h**) 1.3, and (**i**) 1.4.

**Table 5.** NCM with different scale factors.


Figure 24 and Table 5 indicate that the ROS-PC can tolerate scale differences between optical and SAR images in the range of 0.7–1.2, which is sufficient for images that have been assigned a similar scale by resampling.
