**2. Methodology**

PC has been shown to be robust to nonlinear radiometric differences and can capture the features common to multi-sensor images [37,39,40]. The ROS-PC method is based on PC. This section first briefly reviews PC theory and then presents the design of the UMPC-Harris detector and the HOSMI descriptor.

#### *2.1. Review of PC Theory*

According to Kovesi's approach, PC can be computed by convolving an image with a log-Gabor filter (LGF) to extract local phase information. The LGF is efficient for detecting features over multiple scales and orientations. In the frequency domain, LGF is defined as:

$$LGF(\omega) = \exp\left(\frac{-\left(\log\left(\omega/\omega_0\right)\right)^2}{2\left(\log\left(\kappa/\omega_0\right)\right)^2}\right) \tag{1}$$

where ω0 is the central frequency of the filter and κ is the width parameter of the filter, which varies with ω0 so that κ/ω0 remains constant.
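As an illustrative sketch only, the transfer function can be evaluated for a single radial frequency as follows. The sketch uses the standard log-Gabor form, in which the logarithm of the bandwidth ratio in the denominator is squared. The function name `log_gabor`, the argument `ratio` (standing for κ/ω0), and its default value of 0.55 are assumptions made for this example and are not taken from the text.

```python
import math

def log_gabor(omega, omega0, ratio=0.55):
    """Log-Gabor transfer function at radial frequency omega.

    ratio stands for kappa / omega0 and is kept constant across scales.
    The default of 0.55 is an illustrative assumption.
    """
    if omega <= 0.0:
        # log(0) is undefined; the log-Gabor filter has no DC component
        return 0.0
    log_ratio = math.log(omega / omega0)
    return math.exp(-(log_ratio ** 2) / (2.0 * math.log(ratio) ** 2))
```

The response peaks at ω = ω0 and is symmetric on a logarithmic frequency axis, so that, for example, 2ω0 and ω0/2 receive the same gain.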

The filter is transformed from the frequency to the spatial domain using an inverse Fourier transform. In the spatial domain, the 2-D LGF is represented as:

$$LGF(x, y) = LGF_{s,o}^{\text{even}}(x, y) + i \times LGF_{s,o}^{\text{odd}}(x, y). \tag{2}$$

For an input image *I*(*x*, *y*), the convolution responses *e<sub>s,o</sub>*(*x*, *y*) and *o<sub>s,o</sub>*(*x*, *y*) at scale *s* and orientation *o* are obtained by convolving the image with the even- and odd-symmetric wavelets, which form the response array:

$$[e_{s,o}(x, y), o_{s,o}(x, y)] = \left[ I(x, y) \ast LGF^{\text{even}}_{s,o}, I(x, y) \ast LGF^{\text{odd}}_{s,o} \right], \tag{3}$$

where *LGF*<sup>even</sup><sub>*s*,*o*</sub> and *LGF*<sup>odd</sup><sub>*s*,*o*</sub> refer to the even-symmetric (cosine) and odd-symmetric (sine) wavelets of the LGF at scale *s* and orientation *o*, respectively, and *e<sub>s,o</sub>*(*x*, *y*) and *o<sub>s,o</sub>*(*x*, *y*) are the corresponding convolution responses.

The corresponding amplitude *A<sub>s,o</sub>*(*x*, *y*) and phase *ϕ<sub>s,o</sub>*(*x*, *y*) at scale *s* and orientation *o* are given by:

$$A_{s,o}(x, y) = \sqrt{e_{s,o}(x, y)^2 + o_{s,o}(x, y)^2}, \tag{4}$$

$$
\varphi_{s,o}(x, y) = \arctan\left(o_{s,o}(x, y), e_{s,o}(x, y)\right). \tag{5}
$$
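A minimal per-pixel sketch of the amplitude and phase computation is given below. It assumes that the even and odd responses at one pixel, scale, and orientation are already available as real numbers, and it reads the two-argument arctangent as the quadrant-aware `atan2`. The function name `amplitude_and_phase` is hypothetical.

```python
import math

def amplitude_and_phase(even, odd):
    """Amplitude and phase of one (even, odd) filter response pair."""
    amplitude = math.hypot(even, odd)   # sqrt(even^2 + odd^2)
    phase = math.atan2(odd, even)       # quadrant-aware arctangent, in (-pi, pi]
    return amplitude, phase
```

With the two-argument form, a purely odd response maps to a phase of π/2 instead of triggering a division by zero.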

To account for the negative effect of image noise, the improved PC measure (denoted *PC*<sub>2</sub>) and the phase deviation function are defined, respectively, as [41]:

$$PC_2 = \frac{\sum_{o} \sum_{s} W_{o}(x, y) \left\lfloor A_{s,o}(x, y) \Delta\varphi_{s,o}(x, y) - T \right\rfloor}{\sum_{o} \sum_{s} A_{s,o}(x, y) + \varepsilon}, \tag{6}$$

$$\Delta\varphi_{s,o}(x, y) = \cos\left(\varphi_{s,o}(x, y) - \overline{\varphi}_{s,o}(x, y)\right) - \left| \sin\left(\varphi_{s,o}(x, y) - \overline{\varphi}_{s,o}(x, y)\right) \right|, \tag{7}$$

where *W<sub>o</sub>*(*x*, *y*) is the weighting function, *T* is the estimated noise threshold, ε is a small constant that prevents division by zero, and $\overline{\varphi}_{s,o}(x, y)$ is the mean phase angle. The operator ⌊·⌋ denotes that the enclosed quantity is equal to itself when its value is positive, and zero otherwise. *PC*<sub>2</sub> denotes the PC magnitude map of the input image.
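The following sketch evaluates the noise-compensated PC measure and the phase deviation at a single pixel. It rests on several assumptions that are not stated in the text: the mean phase angle is taken, per orientation, as the direction of the response vector summed over scales; the weights and the noise threshold are supplied by the caller; and the names `pc2_at_pixel`, `noise_t`, and `eps` are hypothetical.

```python
import math

def pc2_at_pixel(responses, weights, noise_t=0.0, eps=1e-4):
    """PC measure at one pixel.

    responses[o][s] is the (even, odd) response pair at orientation o, scale s.
    weights[o] is the weighting function value for orientation o.
    """
    numerator = 0.0
    denominator = eps
    for per_scale, w in zip(responses, weights):
        # Assumption: mean phase = direction of the summed response vector
        sum_even = sum(e for e, _ in per_scale)
        sum_odd = sum(o for _, o in per_scale)
        mean_phase = math.atan2(sum_odd, sum_even)
        for even, odd in per_scale:
            amp = math.hypot(even, odd)
            diff = math.atan2(odd, even) - mean_phase
            # Phase deviation: cos(diff) - |sin(diff)|
            delta = math.cos(diff) - abs(math.sin(diff))
            # Positive part: negative energies are clipped to zero
            numerator += w * max(amp * delta - noise_t, 0.0)
            denominator += amp
    return numerator / denominator
```

When all scales are in phase the measure approaches 1, whereas responses in quadrature or responses below the noise threshold drive it toward 0.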

Further, to capture how PC varies with orientation *o* in the image, phase congruency is calculated independently in each orientation. Thus, several PCMs, one per orientation angle, are obtained [42]:

$$PC_2 = \sum_{o} PC(\theta_o), \tag{8}$$

where θ<sub>*o*</sub> denotes the angle corresponding to orientation *o*, and *PC*(θ<sub>*o*</sub>) represents the PCM at orientation angle θ<sub>*o*</sub>. The moments of PC are calculated from the following intermediate quantities:

$$a = \sum_{o} \left( PC(\theta_o) \cos(\theta_o) \right)^2, \tag{9}$$

$$b = 2\sum_{o} \left( PC(\theta_o) \cos(\theta_o) \right) \cdot \left( PC(\theta_o) \sin(\theta_o) \right), \tag{10}$$

$$c = \sum_{o} \left( PC(\theta_o) \sin(\theta_o) \right)^2. \tag{11}$$

The maximum moment *max*ψ and the minimum moment *min*ψ of PC are defined as:

$$
\max\nolimits_{\psi} = \frac{1}{2}\left(a+c+\sqrt{b^2+(a-c)^2}\right),
\tag{12}
$$

$$
\min\nolimits_{\psi} = \frac{1}{2}\left(a+c-\sqrt{b^2+(a-c)^2}\right).\tag{13}
$$

The maximum and minimum moments of the PCM represent the edge and corner strength maps, respectively.
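The moment computation can be sketched at a single pixel as follows, assuming that the PC value for each orientation angle is already available. The function name `pc_moments` and the input layout (a list of angle and PC value pairs) are choices made for this illustration.

```python
import math

def pc_moments(pc_by_orientation):
    """Maximum and minimum moments of PC at one pixel.

    pc_by_orientation is a list of (theta_o, PC(theta_o)) pairs,
    with theta_o in radians.
    """
    a = sum((pc * math.cos(t)) ** 2 for t, pc in pc_by_orientation)
    b = 2.0 * sum((pc * math.cos(t)) * (pc * math.sin(t))
                  for t, pc in pc_by_orientation)
    c = sum((pc * math.sin(t)) ** 2 for t, pc in pc_by_orientation)
    root = math.sqrt(b * b + (a - c) ** 2)
    max_psi = 0.5 * (a + c + root)
    min_psi = 0.5 * (a + c - root)
    return max_psi, min_psi
```

A pixel that responds in only one orientation has a large maximum moment and a zero minimum moment, which is the edge-like case. A pixel that responds equally in two perpendicular orientations has both moments large, which is the corner-like case.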

#### *2.2. The Proposed UMPC-Harris Feature Detector*

Keypoints with high repeatability and a uniform distribution yield sufficient matches, thus improving the image registration accuracy [38]. This subsection presents a novel feature detector, UMPC-Harris, which combines a voting strategy, Harris detection on the multi-moment of the PCMs, and an overlapping block strategy to detect corners and edge points. The purpose of the UMPC-Harris detector is to detect sufficient, reliable, and well-distributed keypoints in optical and SAR images. Figure 2 presents the main process of the UMPC-Harris detector, which contains three steps.

**Figure 2.** Main process of UMPC-Harris detector.

First, the input image is divided into *Sn* × *Sm* blocks. To avoid missing feature information on the block boundaries, an overlap region of *nop* pixels is added between adjacent blocks. The choice of *Sn* and *Sm* is a tradeoff between the amount of computation and the uniformity of the keypoint distribution: dividing the image into more blocks makes the keypoints more uniformly distributed but increases the computational cost. The size and local complexity of the image should be considered when selecting *Sn* and *Sm*.
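One possible way to generate the overlapping blocks is sketched below. It assumes that *Sn* and *Sm* are the numbers of blocks along the rows and columns, that the image is split as evenly as integer division allows, and that each block is extended by *nop* pixels on every side that borders another block. The text does not specify these details, and the function name `overlapping_blocks` is hypothetical.

```python
def overlapping_blocks(height, width, s_n, s_m, n_op):
    """Return (row_start, row_end, col_start, col_end) per block, ends exclusive."""
    blocks = []
    for i in range(s_n):
        # Nominal row range of block row i, widened by n_op and clipped to the image
        r0 = max(i * height // s_n - n_op, 0)
        r1 = min((i + 1) * height // s_n + n_op, height)
        for j in range(s_m):
            c0 = max(j * width // s_m - n_op, 0)
            c1 = min((j + 1) * width // s_m + n_op, width)
            blocks.append((r0, r1, c0, c1))
    return blocks
```

For a 100 × 200 image with 2 × 2 blocks and a 5-pixel extension, the first block covers rows 0 to 54 and columns 0 to 104, so adjacent blocks share a 10-pixel-wide strip under this reading.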

Second, as shown in Figure 2, we take block (1,2) as an example to illustrate the construction of the multi-moment of the PCM. Following the definition of the maximum and minimum moments, the moment of the PCM, *Mk*, is defined as:

$$M_k = \frac{1}{2}(a+c) + \frac{k_t}{2}\sqrt{b^2 + (a-c)^2}, \tag{14}$$

where *kt* is a parameter between −1 and 1. This family of moment maps includes the maximum and minimum moment maps as special cases, and the equation above can be rewritten in terms of *max*ψ and *min*ψ as:

$$M_k = \frac{1}{2} \left( \max\nolimits_{\psi} + \min\nolimits_{\psi} \right) + \frac{k_t}{2} \left( \max\nolimits_{\psi} - \min\nolimits_{\psi} \right), \tag{15}$$

where *Mk* represents the moment of the PCM with parameter *kt*. If *kt* is set to −1, *Mk* is the minimum moment map (*Mk* = *min*ψ), and if *kt* is set to 1, *Mk* is the maximum moment map (*Mk* = *max*ψ). The number of moments is *n*, and the step *h* between successive values of *kt* is 2/(*n* − 1).
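The construction of the multi-moment can be illustrated for one pixel as follows. The sketch assumes that the *n* values of *kt* are evenly spaced over [−1, 1], so that the step is *h* = 2/(*n* − 1), and that *n* ≥ 2. The function name `multi_moments` is hypothetical.

```python
def multi_moments(max_psi, min_psi, n):
    """Return the n moments M_k for k_t evenly spaced from -1 to 1 (n >= 2)."""
    h = 2.0 / (n - 1)  # assumed step between successive k_t values
    moments = []
    for i in range(n):
        k_t = -1.0 + i * h
        m_k = 0.5 * (max_psi + min_psi) + 0.5 * k_t * (max_psi - min_psi)
        moments.append(m_k)
    return moments
```

With a maximum moment of 1.0, a minimum moment of 0.0, and *n* = 3, the values of *kt* are −1, 0, and 1, and the resulting moments interpolate linearly from the minimum to the maximum.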

Third, the points detected by Harris on the maximum and minimum moments of the PCM represent edge points and corners, respectively. Because edge features are highly similar between optical and SAR images and are more resistant to radiometric differences, extracting feature points on edges ensures both a sufficient number of features and robustness to radiometric differences. In addition, corner features increase the number of homologous points. We therefore combine corners and edge points as keypoints. However, corner features are sensitive to SAR speckle noise, and edge points have a poor repeatability rate, so treating all of them as keypoints could introduce unreliable keypoints. Instead, we extract Harris corners on each moment map of the multi-moment of the PCMs and take the points that appear multiple times as the final keypoints. This voting strategy yields stable and reliable keypoints.
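A minimal sketch of such a voting step is shown below. It assumes that Harris detection has already produced one list of pixel positions per moment map, that two detections count as the same point only when their coordinates coincide exactly, and that a point is kept once it reaches a caller-chosen number of votes. The text does not give the matching rule or the threshold; the names `vote_keypoints` and `min_votes` are hypothetical.

```python
from collections import Counter

def vote_keypoints(detections, min_votes):
    """Keep positions detected on at least min_votes moment maps.

    detections is a list with one entry per moment map, each entry being
    a list of (row, col) positions returned by the corner detector.
    """
    votes = Counter()
    for points in detections:
        # A moment map votes at most once for a given position
        votes.update(set(points))
    return sorted(p for p, v in votes.items() if v >= min_votes)
```

In practice a small spatial tolerance may be preferable to exact coincidence, since a corner can shift by a pixel between moment maps.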
