Robust Non-Rigid Feature Matching for Image Registration Using Geometry Preserving

Zhu, Hao; Zou, Ke; Li, Yongfu; Cen, Ming; Mihaylova, Lyudmila

doi:10.3390/s19122729

Open AccessArticle

Robust Non-Rigid Feature Matching for Image Registration Using Geometry Preserving

¹

Key Laboratory of Intelligent Air-Ground Cooperative Control for Universities in Chongqing, and Automotive Electronics and Embedded System Engineering Research Center, College of Automation, Chongqing University of Posts and Telecommunications, Chongqing 400065, China

²

Department of Automatic Control and Systems Engineering, University of Sheffield, Mappin Street, Sheffield S1 3JD, UK

^*

Author to whom correspondence should be addressed.

Sensors 2019, 19(12), 2729; https://doi.org/10.3390/s19122729

Submission received: 24 April 2019 / Revised: 10 June 2019 / Accepted: 14 June 2019 / Published: 18 June 2019

(This article belongs to the Collection Multi-Sensor Information Fusion)

Download

Browse Figures

Versions Notes

Abstract

:

In this paper, a robust non-rigid feature matching approach for image registration with geometry constraints is proposed. The non-rigid feature matching approach is formulated as a maximum likelihood (ML) estimation problem. The feature points of one image are represented by Gaussian mixture model (GMM) centroids, and are fitted to the feature points of the other image by moving coherently to encode the global structure. To preserve the local geometry of these feature points, two local structure descriptors of the connectivity matrix and Laplacian coordinate are constructed. The expectation maximization (EM) algorithm is applied to solve this ML problem. Experimental results demonstrate that the proposed approach has better performance than current state-of-the-art methods.

Keywords:

image registration; non-rigid feature matching; local structure descriptor; Gaussian mixture model

1. Introduction

Image registration is a fundamental task in many fields, such as computer vision, robotics, medical image processing, and remote sensing [1,2,3,4]. The main purpose of image registration is to align two or more images of the same scene taken from different viewpoints, at different times, and/or by different sensors.

Many algorithms have been developed for image registration. It can be roughly divided into area-based and feature-based methods. Area-based methods match image intensity values directly. They mainly include the cross-correlation (CC) methods, the Fourier methods, and mutual information (MI) methods. The normalized CC method is a classic in the area-based methods [5]. The similarity of window pairs from two images are computed and the maximum is considered as a correspondence. A method based on wavelet decomposition and correlation is proposed for image registration [6]. The Fourier methods find the Fourier representation of the images and a subpixel phase correlation with Gaussian mixture model (GMM) is used to register the images [7]. The MI method provides a measure of dependence between two images, and a deterministic explanation for MI-based image registration is proposed in [8].

Feature-based methods extract the salient structures, i.e., features, from the images. The extracted features are called control points [9], and the feature-based methods are considered as point set registration. The traditional for feature-based image registration approach uses a two-step strategy. In the first step, the distances of feature points from local descriptors, such as scale-invariant Fourier transform (SIFT), speeded-up robust features (SURF), or shape context (SC), are computed to obtain putative correspondences. The putative correspondences may contain some true matches, false matches, and outliers. The true matches from the putative correspondences are then distinguished by some geometrical constraint methods. In [10], an improved random sample consensus (RANSAC) algorithm is proposed to get correct matches, and a common visual pattern is detected for feature matching in pairwise images [11]. In [12], the progressive vector field consensus (VFC) is used to establish the feature points correspondences between images.

The feature-based image registration is formulated to estimate the correspondence matrix and the transformation function. This strategy can be categorized as distance-based methods and probability-based methods [13]. Two-step schemes are employed in distance-based methods. The two point sets distance with the computed correspondence is minimized to estimate the transformation. Widely used methods include the iterative closest point (ICP) algorithm [14] and its variants [15]. Using least squares (LS) minimization, the closest point is considered as the correspondence and the transformation is estimated in the ICP algorithm. In [16], the similarity measures for matching and registration are used as thin-plate spline (TPS) functions in the non-rigid case. In [17], a Havrda–Charvat–Tsallis entropy is proposed to register medical images.

For the probability-based methods, feature points are represented as probability density functions. These methods have attracted considerable attention [18,19,20,21,22,23,24]. In the CPD method [18], one point set is represented by Gaussian mixture model (GMM) centroids, and the other point set is fitted to the first point set by moving coherently. In [19], the point sets from medical images are represented as a GMM, the Jensen–Shannon (JS) divergence between these Gaussian densities is minimized for image registration. In [20], both point sets are modeled as GMM, the Euclidean distance between these GMMs is minimized to achieve registration. The above-mentioned methods only consider the global structure between two point sets. In [21], a local descriptor, which is called as a local linear embedding (LLE), is proposed to preserve local structure between two point sets. In [22], a locally linear transforming (LLT), which is similar to LLE, is used to match feature points for remote sensing image. In [24], a dual-feature based point set registration with global-local structural preservation is proposed.

In this paper, we propose a novel GMM-based non-rigid point set registration method to perform image feature matching with structure constraints. The feature points of one image are represented by GMM centroids, and the feature points of the other image are considered as data points. The GMM centroids are moved coherently to fit the data points using a non-rigid transformation by maximizing the likelihood, which encodes the global structure of the pairwise image feature points. For a smooth non-rigid transformation, the coherent constraint is applied by regularization of the displacement field. The shape context is used for the membership probabilities initialization. Furthermore, two local structure descriptors are constructed. The first descriptor is inspired by the idea that the local neighbors in the point set could be preserved after the non-rigid transformation. The Laplacian coordinate is used in the second descriptor to keep the size of the neighborhood structure. Therefore, the objective function is composed of a global distance item, non-rigid transformation constraint item, and two local structure constraint items. An expectation-maximization (EM) algorithm is then applied to perform the maximum likelihood (ML) optimization.

The contribution of this paper can be summarized as follows. (1) Two local structure descriptors of the connectivity matrix and Laplacian coordinates are constructed to preserve the local structure of feature points. The connectivity matrix-based constraint is proposed to retain the local neighborhood relationship, and the Laplacian coordinate-based constraint is employed to encode the local neighborhood scale. The proposed method preserves the local structure in terms of the neighborhood relationship and neighborhood scale, and it is more flexible. (2) The likelihood of image registration is formulated and the EM algorithm is proposed for feature-based image registration.

The paper is organized as follows. The problem of pairwise image registration is formulated in Section 2. The unknown parameters of the proposed method are estimated by the EM algorithm in Section 3. In Section 4, experiments are carried out to validate the performance of the proposed method. Finally, conclusions are given in Section 5.

2. Problem Formulation

Without loss of generality, the extracted feature points from pairwise images are represented as

X = {x_{i} |x_{i} \in ℜ^{D}}_{i = 1}^{N}

and

Y = {y_{j} |y_{j} \in ℜ^{D}}_{j = 1}^{M}

, respectively, where D denotes the dimension of the feature points. Our goal is to find a suitable transformation and to establish the correct correspondence between

X

and

Y

. We set the non-rigid transformation of point set

X

as the GMM centroids and the other point set

Y

as the data point generated by the GMM. That is,

p (Y) = \prod_{j = 1}^{M} \sum_{i = 1}^{N} π_{j i} N (y_{j} |f (x_{i}), σ^{2} I_{D})

(1)

where

N

denotes the Gaussian distribution, f means the non-rigid transformation,

σ^{2}

represents the equal isotropic covariances, and

I

is the identity matrix. We introduce an indicator

Z = [z_{1}, z_{2}, \dots, z_{M}]

, where

z_{j}

is a

1 \times M

binary vector with elements

z_{j i}

for

i = 1, 2, \dots N

. The

z_{j i}

satisfy

z_{j i} \in {0, 1}

and

\sum_{i} z_{j i} = 1

. It means that only one element in vector

z_{j}

is equal to 1 and all the other elements in vector

z_{j}

are equal to 0. We have

p (z_{j} |π) = \prod_{i = 1}^{N} {π_{j i}}^{z_{j i}}

(2)

p (y_{j} |x_{i}, σ^{2}, z_{j}) = \prod_{i = 1}^{N} N {(y_{j} |f (x_{i}), σ^{2} I_{D})}^{z_{j i}}

(3)

where

π = {π_{j i}}_{i = 1 : N}^{j = 1 : M}

. To account for outliers in

Y

, a uniform component

p (y_{j} |N + 1) = \frac{1}{M}

with weight w is added to the mixture model, where w is a weight of the uniform distribution for noise and outlier. Assuming independent and identically distributed data in

Y

, we have

p (y_{j}) = w \frac{1}{M} + (1 - w) \sum_{i = 1}^{N} π_{j i} N (y_{j} |f (x_{i}), σ^{2} I_{D})

(4)

The negative log-likelihood function of Equation (1) can be represented as:

\begin{matrix} Q & = \sum_{j = 1}^{M} \sum_{i = 1}^{N} q (z_{j i}) \frac{{∥y_{j} - f (x_{i})∥}^{2}}{2 σ^{2}} \\ + \frac{D}{2} log (σ^{2}) \sum_{j = 1}^{N} \sum_{i = 1}^{N} q (z_{j i}) \end{matrix}

(5)

where

q (z_{j i})

is the posterior probability.

The non-rigid transformation f is chosen as a displacement function [18]

f (X) = X + GW

(6)

where

G

denotes a

N \times N

dimensional Gaussian kernel matrix with element

g_{i j} = exp (- \frac{1}{2} {∥\frac{x_{i} - x_{j}}{τ}∥}^{2})

,

τ

denotes the width of the Gaussian kernel, and

W

is a

N \times D

dimensional weight matrix of the Gaussian kernel. In order to enforce the motion coherence for preserving the global topology, the constraint on the weight matrix

W

given as reference is used [18]

E_{w} (W) = Tr (W^{T} GW)

(7)

where

Tr

denotes the trace of matrix and

W^{T}

is the transposition of matrix

W

. The objective function of feature-based image registration can then be written as

\begin{matrix} Q & = \sum_{j = 1}^{M} \sum_{i = 1}^{N} q (z_{j i}) \frac{{∥y_{j} - f (x_{i})∥}^{2}}{2 σ^{2}} \\ + \frac{D}{2} log (σ^{2}) \sum_{j = 1}^{N} \sum_{i = 1}^{N} q (z_{j i}) + α Tr (W^{T} GW) \end{matrix}

(8)

where

α

is the trade-off parameter.

In Equation (1), the membership probability

π_{j i}

is chosen in advance. Some methods set the membership probability to be equal [18,25,26,27]. For robustness, the feature of local shape is proposed here for membership probability initialization. In the 2D case, the shape context (SC) [28] is used as feature descriptor and the Hungarian method is used to perform point matching. In the 3D case, the fast point feature histograms (FPFH) [29] and a sample consensus initial alignment method are used for matching.

Then, two local structure descriptors are constructed to preserve the local structure of point set

X

. By computing the Euclidean distance between each point and its neighbors in

X

, the K nearest neighbors of each point in

X

can be obtained. Each point in

X

can be represented as a weighted linear combination of its K nearest neighbors. Let

L = {L_{i j}}_{i = 1 : N}^{j = 1 : N}

be an

N \times N

weighted matrix. If a point

x_{j}

does not belong to the K nearest neighbors of a point

x_{i}

, then

L_{i j}

is set as 0. The matrix

L

can be obtained by minimizing the cost function:

e (L) = \sum_{i = 1}^{N} {∥x_{i} - \sum_{j = 1}^{N} L_{i j} x_{j}∥}^{2}

(9)

where the sum of each of row of

L

is equal to 1. After the non-rigid transformation, the local structure can be preserved by minimizing the transformed cost function

\begin{matrix} E_{l} (L) \\ = \sum_{j = 1}^{M} {\sum_{i = 1}^{N} q (z_{j i}) ∥f (x_{i}) - \sum_{j = 1}^{N} L_{i j} f (x_{j})∥}^{2} \\ = \sum_{j = 1}^{M} {\sum_{i = 1}^{N} q (z_{j i}) ∥x_{i} + G (i, .) W - \sum_{j = 1}^{N} L_{i j} (x_{i} + G (i, .) W)∥}^{2} \end{matrix}

(10)

where

G (i, .)

means the

i^{t h}

row of

G

. The objective function of feature-based image registration in Equation (8) can be expressed as:

\begin{matrix} Q_{1} \\ = Q + β E_{l} (L) \\ = \sum_{j = 1}^{M} \sum_{i = 1}^{N} q (z_{j i}) \frac{{∥y_{j} - f (x_{i})∥}^{2}}{2 σ^{2}} \\ + \frac{D}{2} log (σ^{2}) \sum_{j = 1}^{M} \sum_{i = 1}^{N} q (z_{j i}) + α Tr (W^{T} GW) \\ + β \sum_{j = 1}^{M} {\sum_{i = 1}^{N} q (z_{j i}) ∥x_{i} + G (i, .) W - \sum_{j = 1}^{N} L_{i j} (x_{i} + G (i, .) W)∥}^{2} \end{matrix}

(11)

where

β

is the trade-off parameter. For the second structure descriptor, the Laplacian coordinate is proposed to preserve the size of neighborhood structure. For a graph

(V, E)

, where

V

and

E

is the set of vertices and the set of edges in the graph, respectively, the Laplacian coordinate is expressed as

J (v_{i}) = \sum_{(i, j) \in E} h_{i j} (v_{i} - v_{j})

(12)

where

(i, j)

means the edge between vertices

v_{i}

and vertices

v_{j}

and

h_{i j}

is the weight coefficient. The graph of point set X is constructed as follows: the vertex set is set as

V = {x_{i}}_{i = 1 : N}

and the edges

(x_{i}, x_{j})

if and only if

{∥x_{i} - x_{j}∥}^{2} < ε

, where

ε

is the threshold parameter to construct the neighborhood graph.

h_{i j}

is chosen as

h_{i j} = e^{- \frac{1}{ε} {∥x_{i} - x_{j}∥}^{2}}

(13)

After the non-rigid transformation, the Laplacian regularization can be preserved by minimizing the following term:

\begin{matrix} E_{h} (h) = \sum_{j = 1}^{M} \sum_{i = 1}^{N} q (z_{j i}) {∥J (x_{i}) - J (f (x_{j}))∥}^{2} \\ = \sum_{j = 1}^{M} \sum_{i = 1}^{N} q (z_{j i}) {∥J (x_{i}) - J (x_{i} + G (i, .) W)∥}^{2} \end{matrix}

(14)

The objective function of the feature-based image registration in Equation (11) can be updated as:

\begin{matrix} Q_{2} \\ = Q_{1} + γ E_{h} (h) \\ = \sum_{j = 1}^{M} \sum_{i = 1}^{N} q (z_{j i}) \frac{{∥y_{j} - f (x_{i})∥}^{2}}{2 σ^{2}} \\ + \frac{D}{2} log (σ^{2}) \sum_{j = 1}^{M} \sum_{i = 1}^{N} q (z_{j i}) + α Tr (W^{T} GW) \\ + β \sum_{j = 1}^{M} {\sum_{i = 1}^{N} q (z_{j i}) ∥x_{i} + G (i, .) W - \sum_{j = 1}^{N} L_{i j} (x_{i} + G (i, .) W)∥}^{2} \\ + γ \sum_{j = 1}^{M} \sum_{i = 1}^{N} q (z_{j i}) {∥J (x_{i}) - J (x_{i} + G (i, .) W)∥}^{2} \end{matrix}

(15)

where

γ

is the trade-off parameter.

3. EM for the Proposed Method

In order to estimate the

Θ = [W, σ^{2}, Z]

, the EM algorithm is proposed. There are two steps in the EM algorithm.

(1).: $E_{L} (Θ, Θ^{(m)}) = Q_{2}$
(2).: $Θ^{(m + 1)} = max E_{L} (Θ, Θ^{(m)})$

where m refers to the mth iteration. By iterating these two steps, the parameters

Θ

are determined while the likelihood function can also be increased.

3.1. E-Step

We use

q (z_{j i})

to denote

p (z_{j i} = 1 |Y, X)

, which can be found using the Bayes’ theorem

q (z_{j i}) = \frac{π_{j i} N (y_{j} |f (x_{i}), σ^{2} I_{D})}{\sum_{i} π_{j i} N (y_{j} |f (x_{i}), σ^{2} I_{D}) + \frac{w}{1 - w} \times \frac{N}{M}}

(16)

3.2. M-Step

The

E_{L} (Θ, Θ^{(m)})

can be rewritten as

\begin{matrix} E_{L} (Θ, Θ^{(m)}) \\ = \frac{1}{2 σ^{2}} \{Tr (Y d (P 1) Y) - 2 Tr ({(P^{T} Y)}^{T} (X + GW)) \\ + Tr ({(X + GW)}^{T} d (P^{T} 1) (X + GW))\} \\ + \frac{D}{2} log (σ^{2}) \sum_{j = 1}^{M} \sum_{i = 1}^{N} q (z_{j i}) + α Tr (W^{T} GW) \\ + β Tr (X^{T} BX) + 2 β Tr (X^{T} BGW) \\ + β Tr (W^{T} GBGW) \\ + γ Tr (W^{T} G S^{T} d (P^{T} 1) SGW) \end{matrix}

(17)

where

d (.)

indicates diagonal matrix,

P

is the

M \times N

matrix with element

q (z_{j i})

,

B = {(I - L)}^{T} d (P^{T} 1) (I - L)

, 1 is the all-one column vector,

I

is the identity matrix, S denotes the

N \times N

Laplacian matrix,

S = C - H

, H is the

N \times N

adjacency matrix with element

h_{i j}

, and

C

is the diagonal matrix whose i-th entry is the sum of

{(h_{i j})}_{j = 1, 2, \dots N}

.

The estimates of

σ_{k}^{2}

and

W

are updated iteratively by taking the corresponding partial derivative of the expected log likelihood. That is,

\begin{matrix} \frac{\partial E_{L} (Θ, Θ^{(m)})}{\partial σ^{2}} \\ = - \frac{1}{2 σ^{4}} \{Tr (Y d (P 1) Y) - 2 Tr ({(P^{T} Y)}^{T} (X + GW)) \\ + Tr ({(X + GW)}^{T} d (P^{T} 1) (X + GW))\} \\ + \frac{D \sum_{j = 1}^{M} \sum_{i = 1}^{N} q (z_{j i})}{2 σ^{2}} \\ = 0 \end{matrix}

(18)

We have:

\begin{matrix} σ^{2} = \frac{1}{D \sum_{j = 1}^{M} \sum_{i = 1}^{N} q (z_{j i})} \{Tr (Y d (P 1) Y) \\ - 2 Tr ({(P^{T} Y)}^{T} (X + GW)) \\ + Tr ({(X + GW)}^{T} d (P^{T} 1) (X + GW))\} \end{matrix}

(19)

Similarly,

\begin{matrix} \frac{\partial E_{L} (Θ, Θ^{(m)})}{\partial W} \\ = \frac{1}{2 σ^{2}} (- 2 G P^{T} Y + 2 G d (P^{T} 1) X + 2 G d (P^{T} 1) GW) \\ + 2 GW + 2 β GBX + 2 β GBGW \\ + 2 γ G S^{T} d (P^{T} 1) SGW \\ = 0 \end{matrix}

(20)

W

can be obtained by solving the following system

\begin{matrix} (d (P^{T} 1) GW + 2 σ^{2} I + 2 σ^{2} β BG + 2 σ^{2} γ S^{T} d (P^{T} 1) SG) W \\ = P^{T} Y - d (P^{T} 1) X - 2 σ^{2} β BX \end{matrix}

(21)

The proposed feature-based image registration algorithm is summarized in Algorithm 1.

Algorithm 1: The proposed non-rigid feature-based image registration algorithm

Require:

The feature point

X = {x_{i} |x_{i} \in ℜ^{D}}_{i = 1}^{N}

and

Y = {y_{j} |y_{j} \in ℜ^{D}}_{j = 1}^{M}

, parameters w,

α

,

β

, and

γ

.

1:: Initialize $W = 0$ , $P = I_{N \times N}$
2:: Initialize $W = 0$ , $P = I_{N \times N}$
3:: Search the K nearest neighbors for each point in $X$ .
4:: Perform $L$ by minimizing the Equation (9)
5:: Compute ${h_{i j}}_{i = 1 : N}^{j = 1 : N}$ as Equation (13)
6:: while converged do
7:: E-step:
8:: Extract the local shape to assign the membership probability $π_{j i}$
9:: Update the $q (z_{j i})$ as Equation (16)
10:: M-step:
11:: Update $σ^{2}$ as Equation (19)
12:: Update $W$ by solving the linear system as Equation (21)
13:: end while

Ensure:

The aligned point set is

f (X) = X + GW

The probability of correspondence is given by

P

4. Performance Validation

In this section, experiments are carried out to evaluate the performance of the proposed method. The state-of-the-art algorithms ICP method http://www.cvlibs.net/software/libicp/ [14,30], TPS-RPM method https://www.cise.ufl.edu/~anand/publications.html [16], CPD method https://sites.google.com/site/myronenko/research/cpd [18], and CPD-GL method https://sites.google.com/site/jiayima2013/ [31] are used for comparison. The criteria for performance evaluation is chosen as

e r r o r = \sqrt{\frac{1}{|S|} \sum_{(n, m) \in S} {∥f (x_{n}) - y_{m}∥}^{2}}

(22)

where S is the true matches between the two point sets. The

e r r o r

refers to the registration error.

4.1. Parameter Settings

In the proposed method, there are seven parameters: w,

τ

, K,

ε

,

α

,

β

, and

γ

. Parameter w is a weight of the uniform distribution for noise and outlier. Parameter

τ

is the width of the Gaussian kernel for non-rigid transformation function. Parameter K is the number of neighbors used to perform the matrix

L

in Equation (9) for the first local structure descriptor. Parameter

ε

is a threshold used to construct the neighborhood graph for the second local structure descriptor. Parameters

α

,

β

, and

γ

are three trade-off parameters for global and local constraints. The other four parameters w,

τ

, K,

ε

are selected as [31]. The model selection is shown in Figure 1. In this experiment, the fish dataset is performed and the registration error

e r r o r

is chosen as the metric. It is observed that the proposed method has almost the same performance when

α \in [9, 11]

,

β \in [200, 400]

, and

γ \in [20, 40]

. Therefore, we set w = 0.1,

τ

= 2, K = 5,

ε

= 0.05,

α

= 10,

β

=340, and

γ

= 24.

4.2. Synthesized Point Set Registration

This test is based on the synthesized dataset, which is constructed by Chui and Rangarajan [16]. It consists of two point sets: Chinese character and fish shape. For each model, we designed five sets of data to measure the robustness of registration algorithms with respect to different degrees of deformation and noise. In the deformation test, we use Gaussian radial basis functions to generate deformations, where the coefficients are sampled from a Gaussian distribution with zero mean and standard deviation ranging from 0.02 to 0.08. For the noise test, the standard deviation of the Gaussian noise added to the the original data ratio ranges from 0.01 to 0.05. In each test, one of the above distortions is applied to the model set to create an observed data set, and 100 samples are generated for each degradation level. Comparisons on the fish and Chinese data with varying noise degradation are shown in Figure 2. It is observed that the proposed method with

β = 0

and

γ = 0

has almost the same performance as the CPD-GL method, and the proposed method with

β = 340

and

γ = 0

can register the point set under different degradation slightly better than CPD-GL, whereas the proposed method with

β = 340

and

γ = 24

has the best performance. The metrics under different noise and deformation in fish and Chinese dataset are shown in Figure 3. Apparently, the proposed method with

β = 340

and

γ = 24

has the best performance among all. Furthermore, in order to show the smoothness of the transformation using the proposed method, the experiment of the standard deviation

0.03

for the Gaussian noise added to the Chinese data is performed and 100 samples are generated. The performances of smoothness of the transformation by CPD, CPD-GL, and the proposed method are given in Figure 4, where

N P

and

S D

denote the points in the Chinese dataset and the number of samples, respectively,

| f (X) |

is defined as

| f (X) | = \sqrt{f_{x} {(X)}^{2} + f_{y} {(X)}^{2}}

,

f_{x} (X)

and

f_{y} (X)

denote the 1st and 2nd dimensions of

f (X)

, respectively. As can be observed, the proposed method achieves smoother performance than other methods.

4.3. IMM Hand Landmark Registration

A hand landmark experiment is used to evaluate the performance of the proposed method. The benchmark IMM Hand Database http://www2.imm.dtu.dk/pubdb/views/publication_details.php?id=403 is used. It consists of 40 images of 4 groups of the different human left hand. Images were acquired with resolution

800 \times 600

JPG format in December 2001. Each group has 10 samples with different poses. The hand structures in the dataset have 56 landmarks with the ground truth correspondences. In this experiment, we use 4 groups for evaluation. For each group, the first hand shape is used as the model point set, another 9 hand shapes are considered as data point sets. Therefore, the ground truth correspondences of nine hand landmark pairs for each group are constructed. The proposed method is used to align these face landmark pairs. Comparisons are given in Figure 5. It is observed that the proposed method with

β = 340

and

γ = 24

outperforms than the other methods. Then, comparing the registration error are given in Figure 6, which the performance of the proposed method with

β = 340

and

γ = 24

has the best performance.

With respect to synthesized point set and IMM hand landmark, CPD-GL outperforms ICP, CPD, and TPS, whereas the proposed method with

β = 0

and

γ = 0

has the almost same performance with CPD-GL method. Moreover, the proposed method with

β = 340

and

γ = 0

has slightly better performance than the CPD-GL method, and the proposed method with

β = 340

and

γ = 24

has the best performance. Then, the computational complexity of the proposed method is analyzed. The computational complexity for searching the K nearest neighbors for each local track in

X

needs

O ((K + N) log N)

by using the

k - d

tree, the computational complexity for performing the matrix

L

needs

O (K^{3} N)

, the EM algorithm needs almost

O (N^{3})

. Therefore, the total computational complexity of the proposed algorithm is

O (N^{3})

. All algorithms are implemented in MATLAB, and the tests are performed on a Core i7-4790 3.6GHz with 8GB RAM. The run-times of these algorithms are given in Table 1, which illustrates that the proposed method has almost the same computation time with CPD-GL method.

4.4. 4DCT_75 Dataset

The 4DCT_75 dataset of the benchmark data from the DIR-lab http://www.dir-lab.com/index.html is used. The 4DCT_75 data contains ten (“case1” to “case10”) thoracic 4D CT images. Each 4D CT images have ten 3D CT images. In particular, we chose “4DCT_75_T00” as a model point set. The other five point sets (“4DCT_75_T10” to “4DCT_75_T50”) are set as the scene point sets. Thus, we have 50 pairs of point sets. The example of the registration result of the proposed method are shown in Figure 7, which the proposed method with

β = 340

and

γ = 24

can align these point pairs. Furthermore, the registration errors of CPD method, CPD-GL method, and the proposed method are shown in Figure 8. It is shown that the proposed method with

β = 340

and

γ = 24

has a better performance than the other methods.

4.5. Real Image Feature Matching

In this experiments, the real image pairs are obtained from the data of Jiayi Ma [31], Tuytelaars [32], the Oxford buildings dataset http://www.robots.ox.ac.uk/~vgg/data/oxbuildings, and the Paris dataset http://www.robots.ox.ac.uk/~vgg/data/parisbuildings. The SIFT algorithm is used to extract the feature points of each image, and the putative correspondence is created by nearest neighbor matching. The goal is to find correspondences/matches between two sets of feature points. The performance is evaluated by precision, recall, and F1, where the F1 measure is used to evaluate the balance between recall and precision. The precision, recall, and F1 are defined by:

Precision = \frac{N_{L}}{N_{C}}

,

Recall = \frac{N_{L}}{N_{P}}

, and

F 1 = \frac{2 \cdot Recall \cdot Precision}{Recall + Precision}

, where

N_{L}

,

N_{C}

, and

N_{P}

denote the preserved inlier number, the preserved correspondence number, and the inlier number contained in the putative correspondences, respectively. The ground truth is established in advance manually. The results of matching these image pairs are illustrated in Figure 9. It is observed that the proposed method can distinguish inliers from the outliers. Furthermore, the performance of CPD, CPD-GL and the proposed method in different scenarios is shown in Figure 10. It is observed that the proposed method achieves slightly better performance in precision, recall and F1 metrics compared to other methods. Furthermore, the registration error

e r r o r

in pixels is proposed to evaluate the performance of these algorithms. Figure 10d demonstrates that the proposed method performs better than the CPD method and CPD-GL method. Based on the above evaluation results, the proposed method has better performance than other methods.

5. Discussion

In this paper, the proposed method is compared with the CPD and CPD-GL methods. The CPD registration method only preserves the global structure and ignores the local structure of point set. Although the CPD-GL algorithm preserves the global and local structure, it only introduces one local structure descriptor. In the proposed method, two local structure descriptors are constructed to preserve the local structure of feature points. The first one is the connectivity matrix-based constraint, which retains the local neighborhood relationship, while the other is the Laplacian coordinate-based constraint, which encodes the local neighborhood scale. Therefore, the proposed method preserves the local structure in terms of the neighborhood relationship and neighborhood scale, and it is more flexible. To illustrate the advantages of this method, three experiments were performed based on the proposed method and compared with the state-of-the-art registration algorithms. In the first experiment, for the 2D point set registration, the experimental results show that the proposed algorithm outperforms other algorithms and has almost the same computation time with CPD-GL algorithm. In the second experiment, registration error is employed to estimate the proposed algorithm on the 3D data, and the proposed algorithm outperforms other algorithms. Furthermore, the proposed algorithm is tested on several image pairs under many different evaluation metrics. The performances of these registration algorithms are evaluated by precision, recall, F1, and registration error. From the above experimental results, the proposed method outperforms the other methods. In particular, it is better to use the proposed method which provides stable and accurate transformation estimation for handling complicated non-rigid deformation. When deformation is not significant, the CPD, CPD-GL, and the proposed method obtain almost similar results, and the CPD has lower computation complexity.

There is room for further development of the proposed framework, especially proposing a fast algorithm and developing more evaluation metrics for point set registration such as smoothness of the transformation. It is interesting to investigate some real-life scenarios, where point set correspondences are weaker such as clouds of points with different amounts of points, missing correspondences or points. Furthermore, the proposed point set registration framework can be applied to perform track-to-track association with sensor bias in distributed multi-target tracking [33].

6. Conclusions

In this paper, a GMM-based probabilistic method for image feature matching is proposed. The feature points of one image are represented by GMM centroids, and the feature points of the other image are fitted to the first image feature points by maximizing the likelihood. To preserve the local structure of these feature points, two local structure descriptors are constructed. The EM algorithm is used to perform image feature matching. Computer simulations confirm the effectiveness of the proposed method in registering different types of images compared to the state-of-the-art registration algorithms.

Author Contributions

Conceptualization, H.Z.; Software, K.Z.; Supervision, L.M.; Writing—original draft, H.Z.; Writing—review & editing, Y.L. and M.C.

Funding

This work is jointly supported by the Scientific and Technological Research Program of Chongqing Municipal Education Commission (Grant No. KJ1704078), by the Research Funds of Chongqing Science and Technology Commission (Grant No. cstc2017jcyjAX0293), by the Chongqing Key Technology Innovation Project (Grant No. cstc2017rgzn-zdyfX0039), by the Chongqing Key Technology Innovation Project (cstc2017zdcy-zdyfX0004), by the National Natural Science Foundation of China (Grant No. 61773082), by the Key Project of Crossing and Emerging Area of CQUPT (Grant No. A2018-02), by the Research Fund of young-backbone university teacher in Chongqing province, by Chongqing Overseas Scholars Innovation Program, by Wenfeng Talents of Chongqing University of Posts and Telecommunications, by Innovation Team Project of Chongqing Education Committee (Grant No.CXTDX201601019), by the Research Funds of Chongqing University of Posts and Telecommunications (Grant No. A2018-174), by the National Key Research and Development Program (Grant No. 2016YFB0100906), by the Research and Innovation of Chongqing Postgraduate Project, by the Lilong Innovation and Entrepreneurship Fund of Chongqing University of Posts and Telecommunications (Grant No. 2019-1-01).

Conflicts of Interest

The authors declare no conflict of interest.

References

Zhu, H.; Li, Y.; Yu, J.; Leung, H.; Li, Y. Ensemble Registration of Multisensor Images by a Variational Bayesian Approach. IEEE Sens. J. 2014, 14, 2698–2705. [Google Scholar] [CrossRef]
Li, Y.; He, Z.; Zhu, H.; Zhang, W.; Wu, Y. Jointly registering and fusing images from multiple sensors. Inf. Fusion 2016, 27, 85–94. [Google Scholar] [CrossRef]
Zhu, H.; Yuen, K.V.; Mihaylova, L.; Leung, H. Overview of Environment Perception for Intelligent Vehicles. IEEE Trans. Intell. Transp. Syst. 2017, 18, 2584–2601. [Google Scholar] [CrossRef]
Li, Y.; Tang, C.; Peeta, S.; Wang, Y. Nonlinear Consensus-Based Connected Vehicle Platoon Control Incorporating Car-Following Interactions and Heterogeneous Time Delays. IEEE Trans. Intell. Transp. Syst. 2019, 20, 2209–2219. [Google Scholar] [CrossRef]
Gonzalez, R.; Woods, R.E. Digital Image Processing, 2nd ed.; Prentice Hall: Upper Saddle River, NJ, USA, 2002. [Google Scholar]
Moigne, J.L.; Campbell, W.J.; Cromp, R.F. An automated parallel image registration technique based on the correlation of wavelet features. IEEE Trans. Geosci. Remote Sens. 2002, 40, 1849–1864. [Google Scholar] [CrossRef]
Dong, Y.; Long, T.; Jiao, W.; He, G.; Zhang, Z. A Novel Image Registration Method Based on Phase Correlation Using Low-Rank Matrix Factorization With Mixture of Gaussian. IEEE Trans. Geosci. Remote Sens. 2017, 56, 446–460. [Google Scholar] [CrossRef]
Tagare, H.D.; Rao, M. Why Does Mutual-Information Work for Image Registration? A Deterministic Explanation. IEEE Trans. Pattern Anal. Mach. Intell. 2015, 37, 1286–1296. [Google Scholar] [CrossRef]
Zitova, B.; Flusser, J. Image registration methods: A survey. Image Vis. Comput. 2003, 21, 977–1000. [Google Scholar] [CrossRef]
Wu, Y.; Ma, W.; Gong, M.; Su, L.; Jiao, L. A Novel Point-Matching Algorithm Based on Fast Sample Consensus for Image Registration. IEEE Geosci. Remote Sens. Lett. 2015, 12, 43–47. [Google Scholar] [CrossRef]
Liu, H.; Yan, S. Common visual pattern discovery via spatially coherent correspondences. In Proceedings of the 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Francisco, CA, USA, 13–18 June 2010; pp. 1609–1616. [Google Scholar]
Ma, J.; Ma, Y.; Zhao, J.; Tian, J. Image Feature Matching via Progressive Vector Field Consensus. IEEE Signal Process. Lett. 2015, 22, 767–771. [Google Scholar] [CrossRef]
Zhu, H.; Guo, B.; Zou, K.; Li, Y.; Yuen, K.V.; Mihaylova, L.; Leung, H. A Review of Point Set Registration: From Pairwise Registration to Groupwise Registration. Sensors 2019, 19, 1191. [Google Scholar] [CrossRef]
Besl, P.J.; McKay, N.D. A method for registration of 3-D shapes. IEEE Trans. Pattern Anal. Mach. Intell. 1992, 14, 239–256. [Google Scholar] [CrossRef]
Yang, J.; Li, H.; Campbell, D.; Jia, Y. Go-ICP: A Globally Optimal Solution to 3D ICP Point-Set Registration. IEEE Trans. Pattern Anal. Mach. Intell. 2016, 38, 2241–2254. [Google Scholar] [CrossRef] [PubMed]
Chui, H.; Rangarajan, A. A new point matching algorithm for non-rigid registration. Comput. Vis. Image Underst. 2003, 89, 114–141. [Google Scholar] [CrossRef]
Tustison, N.J.; Awate, S.P.; Song, G.; Cook, T.S.; Gee, J.C. Point Set Registration Using Havrda-Charvat-Tsallis Entropy Measures. IEEE Trans. Med. Imaging 2011, 30, 451–460. [Google Scholar] [CrossRef] [PubMed]
Myronenko, A.; Song, X. Point Set Registration: Coherent Point Drift. IEEE Trans. Pattern Anal. Mach. Intell. 2010, 32, 2262–2275. [Google Scholar] [CrossRef]
Wang, F.; Vemuri, B.C.; Rangarajan, A.; Eisenschenk, S.J. Simultaneous Nonrigid Registration of Multiple Point Sets and Atlas Construction. IEEE Trans. Pattern Anal. Mach. Intell. 2008, 30, 2011–2022. [Google Scholar] [CrossRef]
Jian, B.; Vemuri, B.C. Robust Point Set Registration Using Gaussian Mixture Models. IEEE Trans. Pattern Anal. Mach. Intell. 2011, 33, 1633–1645. [Google Scholar] [CrossRef]
Ge, S.; Fan, G.; Ding, M. Non-rigid Point Set Registration with Global-Local Topology Preservation. In Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops, Columbus, OH, USA, 23–28 June 2014; pp. 245–251. [Google Scholar]
Ma, J.; Zhou, H.; Zhao, J.; Gao, Y.; Jiang, J.; Tian, J. Robust Feature Matching for Remote Sensing Image Registration via Locally Linear Transforming. IEEE Trans. Geosci. Remote Sens. 2015, 53, 6469–6481. [Google Scholar] [CrossRef]
Bai, L.; Yang, X.; Gao, H. Nonrigid Point Set Registration by Preserving Local Connectivity. IEEE Trans. Cybern. 2018, 48, 826–835. [Google Scholar] [CrossRef]
Zhang, S.; Yang, K.; Yang, Y.; Luo, Y.; Wei, Z. Non-rigid point set registration using dual-feature finite mixture model and global-local structural preservation. Pattern Recognit. 2018, 80, 183–195. [Google Scholar] [CrossRef]
Horaud, R.; Forbes, F.; Yguel, M.; Dewaele, G.; Zhang, J. Rigid and articulated point registration with expectation conditional maximization. IEEE Trans. Pattern Anal. Mach. Intell. 2011, 33, 587–602. [Google Scholar] [CrossRef]
Nguyen, T.M.; Wu, Q.M.J. Multiple Kernel Point Set Registration. IEEE Trans. Med. Imaging 2016, 35, 1381–1394. [Google Scholar] [CrossRef]
Zhou, Z.; Zheng, J.; Dai, Y.; Zhou, Z.; Chen, S. Robust Non-Rigid Point Set Registration Using Student’s-t Mixture Model. PLoS ONE 2014, 9, e91381. [Google Scholar] [CrossRef]
Salve, S.G.; Jondhale, K.C. Shape matching and object recognition using shape contexts. In Proceedings of the 2010 3rd International Conference on Computer Science and Information Technology, Chengdu, China, 9–11 July 2010; Volume 9, pp. 471–474. [Google Scholar]
Rusu, R.B.; Blodow, N.; Beetz, M. Fast Point Feature Histograms (FPFH) for 3D registration. In Proceedings of the 2009 IEEE International Conference on Robotics and Automation, Kobe, Japan, 12–17 May 2009; pp. 3212–3217. [Google Scholar]
Zhang, Z. Iterative point matching for registration of free-form curves and surfaces. Int. J. Comput. Vis. 1994, 13, 119–152. [Google Scholar] [CrossRef]
Ma, J.; Zhao, J.; Yuille, A.L. Non-Rigid Point Set Registration by Preserving Global and Local Structures. IEEE Trans. Image Process. 2016, 25, 53–64. [Google Scholar]
Tuytelaars, T.; Gool, L.V. Matching widely separated views based on affine invariant regions. Int. J. Comput. Vis. 2004, 59, 61–85. [Google Scholar] [CrossRef]
Zhu, H.; Wang, M.; Yuen, K.; Leung, H. Track-to-Track Association by Coherent Point Drift. IEEE Signal Process. Lett. 2017, 24, 643–647. [Google Scholar] [CrossRef]

Figure 1. Model selection of the regularization parameters

α

,

β

, and

γ

for structure constraints. (a) model selection of the regularization parameters

α

and

β

, (b) and (c) are the different perspectives of (a), (d) model selection of the regularization parameters

β

and

γ

, (e) and (f) are different perspectives of (d).

Figure 1. Model selection of the regularization parameters

α

,

β

, and

γ

for structure constraints. (a) model selection of the regularization parameters

α

and

β

, (b) and (c) are the different perspectives of (a), (d) model selection of the regularization parameters

β

and

γ

, (e) and (f) are different perspectives of (d).

Figure 2. Registration results of point set registration algorithms on the fish and Chinese data with vary noise degradation. (a) The data with different gradation level; (b) iterative closest point (ICP); (c) thin-plate spline (TPS)-RPM; (d) CPD; (e) CPD-GL; (f) the proposed method with

β = 0

and

γ = 0

; (g) the proposed method with

β = 340

and

γ = 0

; (h) the proposed method with

β = 340

and

γ = 24

.

Figure 2. Registration results of point set registration algorithms on the fish and Chinese data with vary noise degradation. (a) The data with different gradation level; (b) iterative closest point (ICP); (c) thin-plate spline (TPS)-RPM; (d) CPD; (e) CPD-GL; (f) the proposed method with

β = 0

and

γ = 0

; (g) the proposed method with

β = 340

and

γ = 0

; (h) the proposed method with

β = 340

and

γ = 24

.

Figure 3. Registration error on Chinese character and fish shape under vary deformation and noise degradation. Sub-figures (a) and (b) show the registration error on Chinese character and fish shape under vary deformation, respectively, Sub-figures (c), and (d) show the registration error on Chinese character and fish shape under vary noise degradation, respectively.

Figure 4. The performances of smoothness of the transformation by CPD, CPD-GL, and the proposed methods. Sub-figures (a), (b), and (c) show the performances of smoothness of the transformation using CPD, CPD-GL, and the proposed methods, respectively, while (d), (e), and (f) depict the same figures in another perspective.

Figure 5. Four examples of IMM hand landmark registration results for each group. (a) four hand landmark images (b) target hand shape for each group (c) ICP (d) TPS-RPM (e) CPD (f) CPD-GL (g) the proposed method with

β = 0

and

γ = 0

(h) the proposed method with

β = 340

and

γ = 0

(i) the proposed method with

β = 340

and

γ = 24

.

Figure 5. Four examples of IMM hand landmark registration results for each group. (a) four hand landmark images (b) target hand shape for each group (c) ICP (d) TPS-RPM (e) CPD (f) CPD-GL (g) the proposed method with

β = 0

and

γ = 0

(h) the proposed method with

β = 340

and

γ = 0

(i) the proposed method with

β = 340

and

γ = 24

.

Figure 6. Registration error on IMM hand landmark.

Figure 7. A example of the registration result of the 4DCT_75 data, while the left column denotes the initialization, the right column denotes the registration result using the proposed method with

β = 340

and

γ = 24

.

Figure 7. A example of the registration result of the 4DCT_75 data, while the left column denotes the initialization, the right column denotes the registration result using the proposed method with

β = 340

and

γ = 24

.

Figure 8. Registration error on “4DCT_75”.

Figure 9. Different scenes on real image data. (a) different image pairs extracted feature points from scale-invariant Fourier transform (SIFT) algorithm (b) manual real correspondence, with results from (c) CPD method, (d) CPD-GL method, (e) proposed method. Each line segment corresponds to the position of the corresponding feature point in the two images (blue = true positive, green = false negative, and red = false positive). For visibility, 100 pairs are selected for some pair of images as presented, and the true negatives are not shown.

Figure 10. The results of different evaluation metrics of the CPD method, CPD-GL method, and the proposed method in different scenes. (a) Precision (b) Recall (c) F1 (d) Registration error (in pixels).

Table 1. Runtime of these point set registration algorithms on different datasets.

Method	Fish	Chinese	IMM Hand
ICP	0.4896s	0.3343s	0.1283s
TPS	2.3457s	1.6755s	0.6231s
CPD	0.1853	0.1278s	0.0954s
CPD-GL	2.2667s	1.4249s	0.3449s
Proposed method	2.4665s	1.7751s	0.4624s

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Zhu, H.; Zou, K.; Li, Y.; Cen, M.; Mihaylova, L. Robust Non-Rigid Feature Matching for Image Registration Using Geometry Preserving. Sensors 2019, 19, 2729. https://doi.org/10.3390/s19122729

AMA Style

Zhu H, Zou K, Li Y, Cen M, Mihaylova L. Robust Non-Rigid Feature Matching for Image Registration Using Geometry Preserving. Sensors. 2019; 19(12):2729. https://doi.org/10.3390/s19122729

Chicago/Turabian Style

Zhu, Hao, Ke Zou, Yongfu Li, Ming Cen, and Lyudmila Mihaylova. 2019. "Robust Non-Rigid Feature Matching for Image Registration Using Geometry Preserving" Sensors 19, no. 12: 2729. https://doi.org/10.3390/s19122729

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Robust Non-Rigid Feature Matching for Image Registration Using Geometry Preserving

Abstract

1. Introduction

2. Problem Formulation

3. EM for the Proposed Method

3.1. E-Step

3.2. M-Step

4. Performance Validation

4.1. Parameter Settings

4.2. Synthesized Point Set Registration

4.3. IMM Hand Landmark Registration

4.4. 4DCT_75 Dataset

4.5. Real Image Feature Matching

5. Discussion

6. Conclusions

Author Contributions

Funding

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI