**1. Introduction**

The rapid development of sensor technology has provided a wide variety of remote sensing images for Earth observation. Optical images are easy to interpret because they resemble human vision; however, they are easily affected by the weather. Synthetic aperture radar (SAR) is an active microwave imaging system that compensates for the shortcomings of optical imaging systems and operates irrespective of the time of day and weather conditions. Optical and SAR images therefore provide complementary information, which is valuable in applications such as image fusion [1,2], pattern recognition [3], and change detection [4,5]. Image registration is a prerequisite for these applications. It refers to aligning two or more images of the same scene acquired at different times, from different viewpoints, or by different sensors. Registration accuracy strongly affects the results of these applications. Optical and SAR registration remains a challenging task owing to the speckle noise of SAR and the large radiation differences between optical and SAR images [6,7].

Generally, image registration methods can be roughly divided into two categories, namely area-based methods and feature-based methods [8]. In area-based methods, which are also known as intensity-based methods, a template is first defined, and the geometric transformation model is then estimated by optimizing a similarity measure between the SAR and optical images, such as mutual information [9,10], normalized cross-correlation [11], or cross-cumulative residual entropy [12]. Area-based methods deliver high accuracy because all of the intensity information is used. However, owing to their high computational load and sensitivity to geometric and radiation differences, their application to optical and SAR image registration is limited.
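
As a rough illustration of the area-based idea described above, the sketch below exhaustively searches for the translation that maximizes normalized cross-correlation (NCC) between a template and a reference image. It is a minimal NumPy example under simplifying assumptions (pure translation, synthetic data, a linear intensity change); the function names `ncc` and `best_shift` are hypothetical and are not taken from the cited works.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation of two equally sized patches."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def best_shift(reference, template):
    """Exhaustively search the (row, col) offset that maximizes NCC."""
    H, W = reference.shape
    h, w = template.shape
    best_score, best_pos = -np.inf, (0, 0)
    for r in range(H - h + 1):
        for c in range(W - w + 1):
            score = ncc(reference[r:r + h, c:c + w], template)
            if score > best_score:
                best_score, best_pos = score, (r, c)
    return best_pos, best_score

# Synthetic data (an assumption for illustration, not real optical/SAR imagery).
rng = np.random.default_rng(0)
reference = rng.random((40, 40))
# The template is a crop of the reference with a linear intensity change,
# which NCC is invariant to.
template = 2.0 * reference[12:22, 7:17] + 0.5
pos, score = best_shift(reference, template)
```

The nested search evaluates every candidate offset, which hints at the computational load of area-based methods. NCC also normalizes only linear intensity changes, so it is not expected to cope with the nonlinear radiation differences between optical and SAR images.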

Feature-based methods are mainly composed of three steps: feature detection, feature description, and feature matching. First, features such as points [13], edges [14,15], and contours [16] are extracted from the input images. Then, a distinctive feature descriptor is designed. Finally, the transformation model is estimated by establishing correspondences between the features. Feature-based methods are recommended for optical and SAR image registration because they process images using their significant features rather than all of the intensity information, thereby achieving high precision and robustness to geometric and radiation differences.

The most representative feature-based method is the scale-invariant feature transform (SIFT), owing to its efficient performance and invariance to scale and rotation [17]. Subsequently, a variety of improved methods have been reported. To improve matching efficiency, principal component analysis (PCA) was applied to reduce the dimension of the descriptor [18]. To reduce computation time, speeded-up robust features (SURF) uses the determinant of the Hessian matrix to detect feature points and an integral image to accelerate the operation [19]. Affine SIFT simulates the parameters of the affine transformation to achieve full affine invariance and considerably expands the scope of application of image registration [20]. A uniform robust SIFT was proposed to extract uniformly distributed and robust feature points [21]. Adaptive binning SIFT was proposed to increase the distinctiveness and robustness of descriptors [22].

However, speckle noise in SAR images and the intensity difference between optical and SAR images make it difficult to obtain good results when these methods are applied directly to optical and SAR image registration. Numerous scholars have therefore proposed improved methods for this task. An improved SIFT was applied to optical and SAR satellite image registration by exploring the spatial relationship between the images [23]. An automatic coarse-to-fine SAR and optical image registration method based on SIFT features was proposed [24]. A novel gradient definition dedicated to SAR images, which yields an orientation and a magnitude that are robust to speckle noise, was introduced [25]. Further, to overcome the difference in image intensity between remote sensing image pairs and increase the number of correct correspondences, a new gradient definition and an enhanced feature matching method that combines the position, scale, and orientation of each keypoint were proposed [26]. In [27], the gradients in the descriptor are computed by a multiscale Gabor odd filter (GOF)-based ratio operator, and a GOF-based descriptor is formed for the SIFT features. Xiang et al. proposed a robust SIFT-like algorithm (OS-SIFT) to register high-resolution optical and SAR images, in which consistent gradient magnitudes in the SAR and optical images are computed using a multi-scale ratio of exponentially weighted averages (ROEWA) operator and a multi-scale Sobel operator, respectively [28].

Although numerous methods have improved gradient definition and descriptor construction, the matching performance of feature descriptors based on gradient information is not ideal for optical and SAR images with large nonlinear radiation differences, and many mismatches remain. Recently, registration methods based on phase congruency (PC) information have been widely used for multi-sensor images, because PC has been confirmed to be an illumination- and contrast-invariant measure of features [29–31].

An image descriptor based on the PC concept and PCA, namely the histogram of oriented phase (HOP), was presented; it is more robust to image scale variations and to contrast and illumination changes [32]. Ye et al. proposed a novel feature descriptor named the histogram of oriented phase congruency (HOPC) for multimodal image registration [33]. Further, they proposed a local phase-based invariant feature for remote sensing image matching, which consists of a feature detector called minimum moment of PC (MMPC)-Lap and a feature descriptor called the local HOPC (LHOPC) [34]. Similar to gradients, PC also reflects the significance of the features of local image regions. Chen et al. proposed an optical and SAR image registration method that combines a new Gaussian-Gamma-shaped bi-windows-based gradient operator and the histogram of oriented gradient pattern [35]. To address large geometric differences and speckle noise in SAR images, a novel optical-to-SAR image registration algorithm was proposed using a new structural descriptor [36]. A dense descriptor named the histograms of oriented magnitude and phase congruency was proposed to register multi-sensor images. It is based on the combination of the magnitude and PC information of local regions, and successfully captures the common features of images with nonlinear radiation changes [37]. A novel image registration method, which combines nonlinear diffusion and PC structural descriptors, has been proposed for the registration of SAR and optical images [38]. To overcome nonlinear radiation distortions, Li et al. [39] proposed a radiation invariant feature transform (RIFT) algorithm to register multi-sensor images, including optical and SAR images. RIFT uses PC instead of image intensity for feature point detection and introduces a maximum index map (MIM) for feature description. RIFT thus not only largely improves the stability of feature detection but also overcomes the limitation of gradient information for feature description.
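
The index-map idea mentioned above can be illustrated with a small sketch. It assumes a stack of per-orientation filter amplitudes is already available (here replaced by random numbers rather than real log-Gabor responses), takes the per-pixel index of the strongest orientation, and concatenates histograms of those indices over a grid of cells around a keypoint. The function names, patch size, and grid size are hypothetical choices for illustration, and this is a simplified reading of the idea rather than the RIFT implementation of [39].

```python
import numpy as np

def max_index_map(amplitudes):
    """Per-pixel index of the orientation with the largest amplitude.

    amplitudes: array of shape (n_orient, H, W), assumed non-negative.
    """
    return np.argmax(amplitudes, axis=0)

def index_histogram_descriptor(index_map, center, n_orient, patch=16, grid=4):
    """Concatenate per-cell histograms of index values around `center`.

    Assumes the patch lies fully inside the image.
    """
    r, c = center
    half = patch // 2
    window = index_map[r - half:r + half, c - half:c + half]
    cell = patch // grid
    feats = []
    for i in range(grid):
        for j in range(grid):
            block = window[i * cell:(i + 1) * cell, j * cell:(j + 1) * cell]
            feats.append(np.bincount(block.ravel(), minlength=n_orient))
    vec = np.concatenate(feats).astype(float)
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

# Stand-in amplitudes (random values, an assumption for illustration only).
rng = np.random.default_rng(1)
amps = rng.random((6, 64, 64))
mim = max_index_map(amps)
desc = index_histogram_descriptor(mim, center=(32, 32), n_orient=6)

# Scaling every amplitude by the same positive factor leaves the map unchanged.
mim_scaled = max_index_map(3.0 * amps)
```

Because the map stores only which orientation is strongest, not how strong it is, a global contrast change of the amplitudes does not alter the descriptor in this sketch.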

Although a number of PC-based image registration methods have been proposed in the past few years, there are limitations that cannot be ignored when these methods are applied to optical and SAR image registration with large radiation differences. These limitations are listed below.


In this paper, we address the above limitations by developing a robust optical and SAR image registration method based on PC (ROS-PC). The proposed method consists of the following two main components.

First, a uniform Harris feature detection method based on the multi-moment of the phase congruency map (PCM), named UMPC-Harris, is proposed. In UMPC-Harris, corners and edge points are taken as keypoints. Edge structures are highly similar between optical and SAR images and are more resistant to radiation differences [30,36,39]; thus, extracting feature points on edges ensures a sufficient number of features and robustness to radiation differences. In addition, corner features increase the number of homologous points. Therefore, the multi-moment of the PCM is constructed from the maximum and minimum moment maps, and the Harris operator is applied to the multi-moment to detect corners and edge points. Finally, an overlapping block strategy and a voting strategy are introduced to detect uniformly distributed and reliable keypoints.

**Figure 1.** Comparison of keypoint detection results in optical and synthetic aperture radar (SAR) images (top row: optical image; bottom row: SAR image). (**a**) Original images; (**b**) Harris on the original images; (**c**) Harris on the minimum moment of the phase congruency map (PCM); (**d**) Harris on the maximum moment of the PCM.
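
To make the idea of block-wise, uniformly distributed detection more concrete, the sketch below computes a standard Harris response on an arbitrary 2-D map and keeps the strongest positive response in each block. It is only a simplified illustration: it uses non-overlapping blocks and no voting step, it runs on a synthetic image rather than on a PC moment map, and the function names and parameters (`box_sum`, `harris_response`, `uniform_keypoints`, `block`, `per_block`) are assumptions that do not come from the paper.

```python
import numpy as np

def box_sum(img, radius):
    """Sum of img over a (2*radius+1)^2 window, using edge padding."""
    k = 2 * radius + 1
    padded = np.pad(img, radius, mode="edge")
    out = np.zeros(img.shape, dtype=float)
    for dr in range(k):
        for dc in range(k):
            out += padded[dr:dr + img.shape[0], dc:dc + img.shape[1]]
    return out

def harris_response(img, radius=2, k=0.04):
    """Harris response R = det(M) - k * trace(M)^2 from the structure tensor M."""
    gy, gx = np.gradient(img.astype(float))
    sxx = box_sum(gx * gx, radius)
    syy = box_sum(gy * gy, radius)
    sxy = box_sum(gx * gy, radius)
    return sxx * syy - sxy ** 2 - k * (sxx + syy) ** 2

def uniform_keypoints(response, block=16, per_block=2):
    """Keep up to `per_block` strongest positive responses in each block."""
    points = []
    H, W = response.shape
    for r0 in range(0, H, block):
        for c0 in range(0, W, block):
            tile = response[r0:r0 + block, c0:c0 + block]
            order = np.argsort(tile, axis=None)[::-1][:per_block]
            for idx in order:
                r, c = np.unravel_index(idx, tile.shape)
                if tile[r, c] > 0:
                    points.append((r0 + int(r), c0 + int(c)))
    return points

# Synthetic test map with two bright squares (stand-in for a moment map).
img = np.zeros((64, 64))
img[8:24, 8:24] = 1.0
img[40:56, 40:56] = 1.0
points = uniform_keypoints(harris_response(img), block=32, per_block=1)
```

Limiting the number of keypoints per block prevents strongly textured regions from dominating the keypoint set, which is the intuition behind requiring a uniform distribution.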

Second, since PC is not suitable for constructing descriptors directly, the feature descriptor of a keypoint is derived from the histogram of phase congruency orientation on multi-scale max amplitude index maps (HOSMI). The HOSMI descriptor uses the MIM instead of the PCM because the MIM is more robust to radiation distortions than the PCM [39]. Furthermore, in remote sensing images, many salient features appear at different scales [38]. Therefore, we construct both phase congruency orientation maps and max amplitude index maps. In the local region of each keypoint, the histograms of phase congruency orientation on the multi-scale max amplitude index maps are calculated. Finally, the descriptor is constructed by concatenating the feature vectors of all patches in order. Compared with state-of-the-art methods, the main contributions of this study can be summarized as follows:


The rest of this paper is organized as follows. Section 2 reviews PC theory and then introduces ROS-PC in detail, including the UMPC-Harris feature detector and the HOSMI feature descriptor. Section 3 evaluates and discusses, through several experiments, the repeatability rate of the keypoints detected by UMPC-Harris, the robustness of ROS-PC, and the sensitivity of ROS-PC to scale and rotation changes. Finally, conclusions are provided in Section 4.
