Article

Tensor Affinity Learning for Hyperorder Graph Matching

School of Communications and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210003, China
*
Author to whom correspondence should be addressed.
Mathematics 2022, 10(20), 3806; https://doi.org/10.3390/math10203806
Submission received: 17 July 2022 / Revised: 24 September 2022 / Accepted: 12 October 2022 / Published: 15 October 2022

Abstract

Hypergraph matching has attracted considerable attention in computer vision applications in recent years. Interference from external factors, such as squeezing, pulling, occlusion, and noise, results in the same target displaying different image characteristics under different influencing conditions. After extracting image feature point descriptors, traditional methods compare them directly using distance measures such as the Euclidean, cosine, or Manhattan distance; these measures lack sufficient generalization ability and negatively impact the accuracy and effectiveness of matching. This paper proposes a metric-learning-based hypergraph matching (MLGM) approach that employs metric learning to express the similarity relationship between high-order image descriptors and learns a new metric function based on scene requirements and target characteristics. The experimental results show that our proposed method performs better than state-of-the-art algorithms on both synthetic and natural images.

1. Introduction

Graph matching has been applied in a variety of fields, including biological applications [1], remote sensing image recognition [2], and image retrieval [3]. The key to graph matching is to find correspondences between image visual features using particular algorithms. Graph matching is typically viewed as a quadratic assignment problem (QAP) [4], and since the quadratic objective function is non-convex, obtaining the global optimum is challenging [5]. Various approximation algorithms have been developed to solve it under relatively relaxed conditions. Ref. [6] proposed a matching approach based on linear programming. In [7], semidefinite programming was used to solve such a problem, and [8] adopted a similar strategy. However, these algorithms are only locally optimal in the discrete domain, and discretization can introduce extra errors. Other methods based on tree search focus on suboptimality; for instance, Sanfeliu improved an earlier method by considering the joint probability of points and edges in [9]. In [10], random walk-based models were shown to greatly enhance graph topological features. A. Robles-Kelly [11] introduced a novel algorithm based on the relationship between the adjacency matrices of the two graphs and their stationary distributions.
Matching-based techniques have been adopted in a variety of research fields. Early classification based on sparse representation (SRC) [12] does not handle occlusion satisfactorily. With the development of multiview non-negative matrix factorization (NMF) methods [13], local geometry is preserved while a global representation is obtained under a global alignment strategy. However, these methods are still affected by various kinds of noise and cannot highlight target characteristics. In [14], Ou et al. proposed a method that uses adaptively estimated occlusion information and robustly selected features to improve facial recognition performance. The K-nearest neighbor (KNN) classifier is a non-parametric classifier widely used in pattern recognition, but its performance is severely affected by sensitivity to the neighborhood size, especially when the sample size is small and outliers are present. Refs. [15,16] address this problem by weighting and averaging, and achieve robust and effective classification performance.
In recent years, high-order graph matching algorithms have focused on better fusing structural similarities in order to improve matching accuracy. Zass and Shashua [17] proposed a probabilistic hypergraph matching approach that uses an iterative successive projection process to find the global optimal solution. Lee et al. [18] extended the reweighted random walk approach to hypergraph matching and probabilistically reinterpreted the idea of a random walk on hypergraphs. However, these matching algorithms employ the Euclidean distance to generate the affinity matrix, treating each feature attribute as independent of the others. Traditional methods lack a metric tailored to the feature description, their performance varies greatly across data types, and their overall accuracy is low. In this paper, we present an improved hypergraph matching algorithm based on metric learning theory. By learning from a training dataset, it obtains a Mahalanobis matrix that is used to refine the affinity formulation. Since the Mahalanobis distance accounts for the correlations between feature dimensions as well as their scaling, the resulting assignment matrix is closer to the ground truth; indeed, experiments show that our algorithm improves the accuracy of the matching results. The main contributions of this manuscript are as follows:
  • A novel metric-learning-based approach for graph matching is proposed, in which the correlations between different features are explicitly considered, leading to better performance.
  • An information-theoretic metric learning (ITML) method is applied to solve the learning task under high-order graph matching.
  • The metric learning algorithm is improved by parallel computation, which greatly reduces its communication demands.
  • Compared with other state-of-the-art algorithms in experiments on test datasets, our proposed method obtains more accurate results efficiently.
The rest of this article is structured as follows. Section 2 gives an overview of related work on graph matching. Section 3 presents the proposed model as well as the generic formulation of graph matching. Section 4 develops the MLGM approach for optimizing the proposed graph matching model. Section 5 evaluates and analyzes the experimental results of the proposed method on synthetic and natural image benchmarks. The final section concludes the paper.

2. Related Works

In the last few years, spectral methods have developed into one of the most representative classes of graph matching algorithms. The eigenvalues of a matrix do not change when its rows and columns are shuffled; this fact can be exploited to find adjacency matrices with identical eigenvalues between similar images. Early work applied the spectral method to feature matching [19]. Ref. [20] introduced a method incorporating the grey level information around the feature points to improve matching accuracy. Leordeanu et al. [8] proposed a matching algorithm that builds an affinity matrix of the feature points and considers the effect of different weight functions in point matching. In another direction, grouping methods can also improve the effectiveness of matching. Egozi et al. [21] proposed a probabilistic interpretation of spectral matching schemes and developed a unique probabilistic matching (PM) scheme that outperforms earlier methods. Feature matching by alternating the embedding and matching of the adjacency spectrum was introduced in [22], and, in [23], a relaxation scheme with matching constraints was proposed. Duchenne et al. [9] proposed a class of algorithms that use a tensor to represent similarities between higher-order features, so that the graphs can be matched after a rank-one decomposition of the similarity tensor. This extends the spectral method to hypergraphs and has been further improved in subsequent research [24]. The adjacency spectrum optimization of undirected weighted graphs [25] and the approximation of the proximal matrix spectrum of undirected weighted graphs [26] have been developed in recent years, with results in image processing applications.
The graph edit distance (GED), which represents the matching relationship between the nodes and edges of two graphs, has also been utilized to solve graph matching. For example, Ref. [27] proposed a self-organizing mapping algorithm to learn the distance so that the distance between similar images becomes smaller; this method was improved in [28]. Serratosa proposed a method based on an adaptive learning paradigm, which was improved in [29]. Andreas Fischer and Kaspar Riesen presented an algorithm [30] combining Hausdorff matching with greedy allocation to improve the quadratic-time approximation of GED.
Metric learning has been widely used in face recognition [31,32], image retrieval [33], re-identification [34], and other fields. For traditional metrics such as the Euclidean distance, it is challenging to capture the structure of diverse datasets. To increase the performance of classification models, it is important to learn a measure specific to each dataset, which is the objective of metric learning. Algorithms based on the Mahalanobis distance remain the primary focus of metric learning research. Bohné et al. [35] proposed dividing the data and learning a metric for each cluster, and Wang et al. [36] suggested learning a set of basis metrics and a set of weights for each sample.

3. Problem Statement

As a mathematical expression of relationships, a graph model [37] is composed of a node set and an edge set. Finding the correspondence between two graphs is the objective of graph matching. Generally, it seeks the relationship between nodes in graphs, where a node may be a pixel, an image region, or a feature point. In studies of graph matching and hypergraph matching algorithms, the structural information of the graph model is used to represent the problem so that the relationships between graph features can be expressed more comprehensively. Figure 1 depicts an example graph matching diagram.
We now consider two sets of feature points $P = \{p_1, p_2, \ldots, p_m\}$ and $Q = \{q_1, q_2, \ldots, q_n\}$, extracted from graphs A and B, respectively. The numbers of points obtained in the two graphs, m and n, can be the same or different. Unlike previous methods, the high-order graph matching algorithm [9] matches tuples of points instead of single points or pairs of points; k denotes the number of points in each tuple. High-order graph matching is robust under unfavorable conditions such as noise, deformation, outliers, and rotation [38], but it requires more space and time. The third order can reflect invariance to similarity transformations in computer vision, and, as the smallest higher-order topology, it can measure subtle differences between high-order graphs. For convenience, only third-order graph matching is discussed in this paper; the generalization to higher-order potentials is straightforward.
The matching problem of the two graphs is to compute an optimal assignment relationship between points. Mathematically, this is equivalent to finding an m × n assignment matrix X. If feature point $p_i$ in P matches $q_j$ in Q, then the corresponding $X_{i,j}$ equals 1; otherwise, it is 0. In this paper, we assume that each feature point in P can match zero or more feature points in Q, but each point in Q can match only one point in P. As a result, the set of assignment matrices, denoted $\mathcal{X}$, is
$$
\mathcal{X} = \left\{ X \,\middle|\, X \in \{0,1\}^{m \times n},\ \sum_{j=1}^{n} X_{i,j} = 1 \right\}
\tag{1}
$$
where $i \in [1, m]$.
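For concreteness, the constraint set in (1) can be checked with a short numpy sketch; this is our own illustration, not part of the paper's implementation:

```python
import numpy as np

def is_valid_assignment(X):
    """Check X against Eq. (1): binary entries, and each point of P
    assigned exactly once (each row sums to 1)."""
    binary = np.isin(X, (0, 1)).all()
    rows_ok = (X.sum(axis=1) == 1).all()
    return bool(binary and rows_ok)

X = np.zeros((3, 4), dtype=int)
X[0, 1] = X[1, 0] = X[2, 3] = 1
print(is_valid_assignment(X))  # True
```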
The universal second-order graph matching model can be used to solve X as
$$
\max_{X}\ \mathrm{Score}(X) = \sum_{i_1, i_2, j_1, j_2} M_{i_1, i_2, j_1, j_2}\, X_{i_1, j_1} X_{i_2, j_2}
\tag{2}
$$
where M is an affinity tensor representing the affinity between the point pairs $(i_1, j_1)$ and $(i_2, j_2)$. $\mathrm{Score}(X)$ is the sum of the affinity values of all matched tuples; the higher the value, the more precise the matching result. Establishing the affinity tensor M for two graphs A and B requires taking into account the similarity between pairs of nodes and pairs of edges:
$$
M_{i_1, i_2, j_1, j_2} = \exp\!\left( -\gamma \left\| f_{i_1 j_1} - f_{i_2 j_2} \right\| \right)
\tag{3}
$$
where $f$ is the feature of each tuple, represented in second-order graph matching by the Euclidean distance between points, and $\gamma$ is a scaling parameter [9].
However, the second-order graph matching model can only express pairwise relations, which are not robust to scale changes and cannot express higher-order feature information. Considering the high-order relations of feature points, we describe the similarity between feature point sets based on the relations between point tuples. Given two point sets P and Q, the affinity tensor can be expressed as
$$
M_{i_1, i_2, j_1, j_2, k_1, k_2} = \exp\!\left( -\xi \left\| f_{i_1, j_1, k_1} - f_{i_2, j_2, k_2} \right\|^2 \right)
\tag{4}
$$
where $(i_1, j_1, k_1)$ is a point tuple in point set P of graph A, $(i_2, j_2, k_2)$ is a candidate point tuple to be matched in point set Q of graph B, $\xi$ is a constant that controls the distribution of the affinity tensor values, and $f_{i_1, j_1, k_1}$ and $f_{i_2, j_2, k_2}$ are vectors constructed from the feature information of the point tuples in P and Q, respectively. There are numerous ways to represent this feature information; for ease of calculation, we use the sines of the inner angles of the triangle formed by each point tuple.
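To make the feature construction concrete, the following is a minimal numpy sketch of the sine-of-inner-angle feature and the affinity in (4); the function names and the choice of $\xi$ are our own assumptions, not the authors' code:

```python
import numpy as np

def sine_feature(p1, p2, p3):
    """Feature vector of a point triplet: the sines of the three
    inner angles of the triangle (p1, p2, p3)."""
    pts = [np.asarray(p, dtype=float) for p in (p1, p2, p3)]
    feats = []
    for i in range(3):
        a, b, c = pts[i], pts[(i + 1) % 3], pts[(i + 2) % 3]
        u, v = b - a, c - a                   # the two edges meeting at vertex a
        cos_t = u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
        feats.append(np.sqrt(max(0.0, 1.0 - cos_t ** 2)))  # sin(theta) >= 0
    return np.array(feats)

def affinity(f1, f2, xi=1.0):
    """Eq. (4): exp(-xi * ||f1 - f2||^2)."""
    return float(np.exp(-xi * np.sum((f1 - f2) ** 2)))

# A triangle and a uniformly scaled copy have identical inner angles,
# so the affinity is 1: the feature is invariant to similarity transforms.
f1 = sine_feature((0, 0), (1, 0), (0, 1))
f2 = sine_feature((0, 0), (2, 0), (0, 2))
print(affinity(f1, f2))  # 1.0
```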
Similar to model (2), the high-order graph matching model can be formulated as
$$
\max_{X}\ \mathrm{Score}(X) = \sum_{i_1, i_2, j_1, j_2, k_1, k_2} M_{i_1, i_2, j_1, j_2, k_1, k_2}\, X_{i_1, i_2} X_{j_1, j_2} X_{k_1, k_2}
\tag{5}
$$
In (5), $X_{i_1, i_2} X_{j_1, j_2} X_{k_1, k_2}$ equals 1 only when the point tuple $(i_1, j_1, k_1)$ matches $(i_2, j_2, k_2)$ point by point. This is an optimization problem: by finding the assignment matrix corresponding to the maximum value of $\mathrm{Score}(X)$, the matching relation between tuples is obtained. We can also write (5) as (6) using tensor–vector multiplication notation:
$$
\max_{\tilde{X}}\ \mathrm{Score}(\tilde{X}) = \tilde{M} \otimes_3 \tilde{X} \otimes_2 \tilde{X} \otimes_1 \tilde{X}
\tag{6}
$$
where $\tilde{X}$ is the vector obtained by stacking the columns of X, $\tilde{M}$ is the symmetric matrix produced by tensor unfolding, and I, J, and K are the three dimensions of the tensor M.
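Assuming the nonzero entries of M are stored sparsely as a dictionary keyed by index tuples (a storage choice of ours, for illustration only), the score in (5) can be evaluated as:

```python
import numpy as np

def high_order_score(entries, X):
    """Eq. (5): sum the affinity values of all matched triplets.
    `entries` maps an index tuple (i1, i2, j1, j2, k1, k2) to the
    affinity M[i1, i2, j1, j2, k1, k2]; a term contributes only when
    all three point correspondences are active in X."""
    return sum(m * X[i1, i2] * X[j1, j2] * X[k1, k2]
               for (i1, i2, j1, j2, k1, k2), m in entries.items())

# Toy example: three points per graph, one stored triplet entry.
X = np.eye(3, dtype=int)
entries = {(0, 0, 1, 1, 2, 2): 0.9}
print(high_order_score(entries, X))  # 0.9
```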
In the traditional affinity measure (4), every feature is considered equally important. However, because features may have different correlations with sample categories, their weights need to be reconsidered; in other words, a distance or similarity measure adapted to the sample feature space should be used to measure differences between samples. Owing to its two characteristics, decoupling and dimensional independence, the Mahalanobis distance [39] is an excellent measurement function for image processing and computer vision. In this paper, we use the Mahalanobis distance to measure the affinity of feature vectors and build the corresponding metric learning model.

4. Tensor Affinity Learning for Hyperorder Graph Matching

4.1. A Short Introduction to Metric Learning

The study of metric learning has significant theoretical implications. Metric learning is concerned with developing an accurate function model for an input feature vector and obtaining an accurate similarity measure by learning the model's parameters. It can enhance classifier performance by generating highly accurate similarity relationships [40]. However, how to accurately measure the similarity of samples affected by different external factors is often overlooked. Classical learning algorithms preprocess data samples with simple normalization and then measure similarity with the Euclidean distance. These normalization and measurement methods are crude, and the resulting classifier's performance is easily degraded by noise and interference.
Euclidean distance is a representative distance metric function, defined as
$$
d(x_1, x_2) = \sqrt{(x_1 - x_2)^{T}(x_1 - x_2)}
\tag{7}
$$
where $x_1, x_2$ are a pair of samples. Although the Euclidean distance is simple to understand, it has several flaws. It treats all components of the feature vector identically, which is incompatible with the application requirements of high-order graph matching. Another limitation is that it cannot handle coupled data: when calculating the similarity of point tuples, for example, it is necessary to consider how points and edges relate to each other through the global structure they form.
To improve the deficiencies of traditional distance measurement, the (squared) Mahalanobis distance was used to measure similarity in this paper, which is defined as
$$
d_M(x_1, x_2) = (x_1 - x_2)^{T} W (x_1 - x_2)
\tag{8}
$$
where W is the Mahalanobis matrix [39], a positive semidefinite symmetric matrix. The purpose of the metric learning process is to obtain, from a given training dataset, a positive semidefinite symmetric matrix W that establishes the similarity measure between sample features. In other words, it aims to bring similar features closer together under the learned metric while pushing dissimilar features farther apart.
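A small numpy illustration (ours, with arbitrary example matrices chosen for demonstration) of how the Mahalanobis distance in (8) reweights and couples feature dimensions, whereas W = I recovers the squared Euclidean distance:

```python
import numpy as np

def mahalanobis_sq(x1, x2, W):
    """Eq. (8): squared Mahalanobis distance under the metric W."""
    d = np.asarray(x1, dtype=float) - np.asarray(x2, dtype=float)
    return float(d @ W @ d)

x1, x2 = np.array([1.0, 2.0]), np.array([2.0, 1.0])
print(mahalanobis_sq(x1, x2, np.eye(2)))              # 2.0: plain squared Euclidean
print(mahalanobis_sq(x1, x2, np.diag([2.0, 1.0])))    # 3.0: first feature weighted more
print(mahalanobis_sq(x1, x2, np.array([[1.0, 0.9],
                                       [0.9, 1.0]]))) # 0.2: correlated features
```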

4.2. The Establishment of Training Constraints

For supervised graph matching, the assignment matrix represents the correspondence between the points of two graphs. In hypergraph matching, the matching relation between tuples is easily obtained from the point-level relationships represented by the assignment matrix.
To consider the correlation of feature vectors in hypergraph matching and find the Mahalanobis matrix, we used the binary tuple constraint [41] to represent the similarity relation of training samples.
$$
\{ (x_i, x_j),\ w_{ij} \}
\tag{9}
$$
where $w_{ij}$ indicates whether the two training samples $x_i$ and $x_j$ are similar. If $w_{ij}$ equals 1, $(x_i, x_j)$ belongs to the set of similar samples, and the pair should be close under the learned distance metric. Similarly, when $(x_i, x_j)$ belongs to the dissimilar pairs set, $w_{ij}$ equals −1, and the pair should be far apart under the learned metric. For a training dataset, $w_{ij}$ is easily obtained from the assignment matrix. Each tuple is stored as a feature vector; to reduce the distance between similar pairs and increase the distance between dissimilar pairs during metric learning, we further constrain the similar pairs set S and the dissimilar pairs set D by thresholds:
$$
S = \{ (f_i, f_j) : d_M(f_i, f_j) \le g \}, \qquad
D = \{ (f_i, f_j) : d_M(f_i, f_j) \ge h \}
\tag{10}
$$
where g and h are constants, and $(f_i, f_j)$ denotes a pair of feature vectors.
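As an illustration of how the binary tuple constraints can be derived from a ground-truth assignment matrix, here is a sketch in which a pair of triplets is labeled similar only when all three of its point correspondences hold; the helper name and data layout are our assumptions:

```python
import numpy as np

def build_constraints(feats_p, feats_q, triplets_p, triplets_q, X_gt):
    """Split triplet-feature pairs into the similar set S and the
    dissimilar set D using the ground-truth assignment X_gt.
    feats_p[t] is the feature vector of triplet triplets_p[t]."""
    S, D = [], []
    for t, tp in enumerate(triplets_p):
        for s, tq in enumerate(triplets_q):
            similar = all(X_gt[i, j] == 1 for i, j in zip(tp, tq))
            (S if similar else D).append((feats_p[t], feats_q[s]))
    return S, D

# Toy example with an identity correspondence between 3 points.
X_gt = np.eye(3, dtype=int)
tri = [(0, 1, 2)]
f = [np.array([1.0, 0.7, 0.7])]
S, D = build_constraints(f, f, tri, tri, X_gt)
print(len(S), len(D))  # 1 0
```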

4.3. Metric Learning Algorithm

Given a set of distance constraints as described in (10), our aim is to learn a positive-definite matrix W that parameterizes the corresponding Mahalanobis distance. To improve computational efficiency, the information-theoretic metric learning (ITML) approach [42] is introduced. It uses a natural information-theoretic formulation to handle constraints on the distance function while minimizing the relative entropy between two multivariate Gaussians. There is a straightforward bijection between the set of Mahalanobis distances and the set of equal-mean multivariate Gaussian distributions, and the multivariate Gaussian corresponding to a Mahalanobis distance parameterized by W can be stated as follows:
$$
p(x; W) = \frac{1}{z} \exp\!\left( -\frac{1}{2}\, d_M(x, \mu) \right)
\tag{11}
$$
where $\mu$ is the mean value and $z$ is the normalization factor. The relative entropy between the corresponding multivariate Gaussians is used to measure the distance between two Mahalanobis distance functions parameterized by $W_0$ and W:
$$
KL\big(p(x; W_0)\,\|\,p(x; W)\big) = \int p(x; W_0) \log \frac{p(x; W_0)}{p(x; W)}\, dx
\tag{12}
$$
where $KL(\cdot)$ stands for the relative entropy, known as the Kullback–Leibler divergence [43], which we use to represent the difference between two probability distributions. Given a similar pair set S and a dissimilar pair set D, the distance metric learning problem can be transformed into:
$$
\begin{aligned}
\min_{W \succeq 0}\quad & KL\big(p(x; W_0)\,\|\,p(x; W)\big) \\
\text{s.t.}\quad & d_M(f_i, f_j) \le g, \quad (f_i, f_j) \in S \\
& d_M(f_i, f_j) \ge h, \quad (f_i, f_j) \in D
\end{aligned}
\tag{13}
$$
where g and h are constants.
It has been demonstrated that the Mahalanobis distance between mean vectors and the LogDet divergence between covariance matrices can be combined convexly to express the differential relative entropy between two multivariate Gaussians [44]. To solve this optimization function, the Logdet distance D l d ( · ) for measuring the difference of the matrix was introduced to calculate:
$$
KL\big(p(x; W_0)\,\|\,p(x; W)\big) = \tfrac{1}{2} D_{ld}\big(W_0^{-1}, W^{-1}\big), \qquad
D_{ld}(W, W_0) = \mathrm{tr}\big(W W_0^{-1}\big) - \log\det\big(W W_0^{-1}\big) - d
\tag{14}
$$
where d is the number of rows in W.
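For reference, a direct numpy computation of the LogDet divergence in (14); this is our own sketch with example matrices:

```python
import numpy as np

def logdet_div(W, W0):
    """Eq. (14): D_ld(W, W0) = tr(W W0^{-1}) - log det(W W0^{-1}) - d."""
    A = W @ np.linalg.inv(W0)
    _, logdet = np.linalg.slogdet(A)
    return float(np.trace(A) - logdet - W.shape[0])

W0 = np.eye(3)
print(logdet_div(np.diag([1.0, 2.0, 0.5]), W0))  # 0.5
print(logdet_div(W0, W0))                        # 0.0: vanishes at W = W0
```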
To facilitate the solution over a wider feasible region, the ITML algorithm introduces a slack variable $\xi$, initialized to $\xi_0$, and rewrites (13) as follows:
$$
\begin{aligned}
\min_{W \succeq 0,\ \xi}\quad & D_{ld}(W, W_0) + \rho\, D_{ld}\big(\mathrm{diag}\{\xi\}, \mathrm{diag}\{\xi_0\}\big) \\
\text{s.t.}\quad & \mathrm{tr}\big(W (f_i - f_j)(f_i - f_j)^{T}\big) \le \xi_{i,j}, \quad (f_i, f_j) \in S \\
& \mathrm{tr}\big(W (f_i - f_j)(f_i - f_j)^{T}\big) \ge \xi_{i,j}, \quad (f_i, f_j) \in D
\end{aligned}
\tag{15}
$$
where $\rho$ is the equilibrium parameter. Following the principle of LogDet distance optimization in [45], the iterative optimization formula is obtained:
$$
W_{t+1} = W_t + \beta\, W_t (f_i - f_j)(f_i - f_j)^{T} W_t
\tag{16}
$$
where $W_t$ is the metric matrix from the t-th iteration, $\beta$ is the mapping parameter, and $f_i$ and $f_j$ form the constraint pair.

4.4. Parallel Learning Algorithm

Applying the ITML method directly to distance metric learning with high-dimensional training data is not feasible. The complexity of the ITML algorithm grows with the square of the data dimension, which makes processing high-dimensional data expensive. Furthermore, the ITML method learns a full-rank metric matrix that scales quadratically with the number of input dimensions, imposing a significant computational overhead on the learning process.
Typically, real high-dimensional datasets are contaminated with noise or contain redundant information, so the algorithm cannot learn an effective metric matrix. Therefore, when the dimension of the training samples is large, the metric matrix learned by the ITML algorithm cannot effectively suppress noise, and it also suffers from a low solving efficiency and vulnerability to inadequate training data. To address these challenges, we improve the metric learning algorithm through parallel computing.
We now describe the parallel computing process. Using the principle in [46] that every positive semidefinite matrix can be decomposed into a linear combination of rank-one matrices, we may reconstruct the Mahalanobis matrix W as $W = I + \sum_i \alpha_i z_i z_i^T$, where $z_i \in \mathbb{R}^d$. Clearly, $W_t (f_i - f_j)$ is d-dimensional, and $W_t (f_i - f_j)(f_i - f_j)^T W_t^T$ is a rank-one matrix. We can concretize the expression for W by adding the Bregman projections [47] of all pairs of constraints:
$$
W_{t+1} = I + \sum_{i=1}^{C} \beta_i^{(t)} z_i^{(t)} z_i^{(t)T}
\tag{17}
$$
where I is the d-dimensional identity matrix, C is the number of constraint pairs, $\beta$ denotes the learning rate (the mapping parameters), and $z_i^{(t)} = W_t c_i$ with $c_i = f_j - f_k$ corresponding to constraint pair $(f_j, f_k)$. In the algorithm framework, only z is stored instead of $W_t$, avoiding the problem that W grows as d gets larger. The iteration therefore becomes an update formula for z:
$$
z_k^{(t+1)} = W_{t+1} c_k = \left( I + \sum_{i=1}^{C} \beta_i^{(t)} z_i^{(t)} z_i^{(t)T} \right) c_k
\tag{18}
$$
According to the original algorithm, $\beta_i^{(t)}$ is determined by the upper or lower bound constraint on the measured distance. The key step in the update is to calculate the actual distance. Combining with Equation (17), the real distance of the k-th constraint $c_k$ can be expressed by Equation (19):
$$
p_k^{(t)} = c_k^{T} W_t c_k
= c_k^{T} \left( I + \sum_{i=1}^{C} \beta_i^{(t)} z_i^{(t)} z_i^{(t)T} \right) c_k
= c_k^{T} c_k + \sum_{i=1}^{C} \beta_i^{(t)} \big(c_k^{T} z_i^{(t)}\big)\big(z_i^{(t)T} c_k\big)
\tag{19}
$$
Due to the decomposable nature of W, the task of updating z and p is assigned to C work units (workers), realizing parallel execution. In our framework, worker k receives all z values generated by previous iterations from the other workers and then carries out the next update. Each worker only needs to send its own vector $z_k$ and receive the other C − 1 vectors instead of the entire metric matrix, so the amount transferred per step is reduced from $O(d^2)$ to $O(d)$. When d exceeds the number of constraints C, the transfer requirements are reduced significantly, which greatly lowers the communication cost.
We define the affinity $\Omega$ in terms of the Mahalanobis distance instead of (4); by learning from the training set, it better accounts for the correlations between tuples:
$$
\Omega_{i_1, i_2, j_1, j_2, k_1, k_2} = \exp\!\left( -\frac{(f_1 - f_2)^{T} W (f_1 - f_2)}{\gamma} \right),
\qquad f_1 = f_{i_1, j_1, k_1}, \quad f_2 = f_{i_2, j_2, k_2}
\tag{20}
$$
Then, M is defined as the following:
$$
M_{i_1, i_2, j_1, j_2, k_1, k_2} =
\begin{cases}
\Omega_{i_1, i_2, j_1, j_2, k_1, k_2}, & \text{if } \| f_1 - f_2 \| \le \sigma \\
0, & \text{otherwise}
\end{cases}
\tag{21}
$$
The value of the parameter $\sigma$ corresponds to the tolerated degree of triplet deformation; a larger value of $\sigma$ reduces the sensitivity of the matching. The resulting procedure is given as Algorithm 1.
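Before presenting the pseudocode, the truncated affinity defined by Equations (20) and (21) can be sketched as follows; this is our own illustration, with `gamma` and `sigma` corresponding to the parameters $\gamma$ and $\sigma$ above:

```python
import numpy as np

def learned_affinity(f1, f2, W, gamma=1.0, sigma=1.0):
    """Eqs. (20)-(21): Mahalanobis-based affinity, set to 0 when the
    feature distance exceeds sigma (this keeps the tensor M sparse)."""
    d = np.asarray(f1, dtype=float) - np.asarray(f2, dtype=float)
    if np.linalg.norm(d) > sigma:
        return 0.0
    return float(np.exp(-(d @ W @ d) / gamma))
```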
Algorithm 1 Parallel Metric Learning

Input: S: similar pairs; D: dissimilar pairs; u, l: distance thresholds; γ: slack parameter
Output: W: Mahalanobis matrix

1:  W = I, C = |S| + |D|
2:  for each constraint (x_p, x_q)_k, k ∈ {1, 2, …, C} do
3:      λ_k ← 0
4:      d_k ← u if (x_p, x_q)_k ∈ S, otherwise d_k ← l
5:      c_k ← (x_p − x_q)_k, z_k ← c_k
6:  end for
7:  while β does not converge do
8:      for all workers k ∈ {1, 2, …, C} do in parallel
9:          z_k ← c_k + Σ_{i=1}^{C} β_i z_i (z_i^T c_k)
10:         p ← c_k^T z_k
11:         if (x_p, x_q)_k ∈ S then
12:             α ← min(λ_k, (1/2)(1/p − γ/d_k))
13:             β_k ← α / (1 − α p)
14:             d_k ← γ d_k / (γ + α d_k)
15:         else
16:             α ← min(λ_k, (1/2)(γ/d_k − 1/p))
17:             β_k ← −α / (1 + α p)
18:             d_k ← γ d_k / (γ − α d_k)
19:         end if
20:         λ_k ← λ_k − α
21:         z_k ← (I + Σ_{i=1}^{C} β_i z_i z_i^T) c_k
22:         send z_k to the other workers
23:     end for
24: end while
25: W = I + Σ_{i=1}^{C} β_i z_i z_i^T
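The listing below is a sequential Python simulation of Algorithm 1 that we provide for illustration: each constraint plays the role of one worker, the "send" step is implicit in shared memory, and all names are our own rather than the authors' code. The branch updates follow the ITML projection rules from [42].

```python
import numpy as np

def parallel_metric_learning(S, D, u, l, gamma=1.0, max_iter=50, tol=1e-5):
    """Sketch of Algorithm 1. S/D are lists of (x_p, x_q) feature pairs,
    u/l are the distance thresholds, gamma is the slack parameter."""
    C_list = [np.asarray(xp, dtype=float) - np.asarray(xq, dtype=float)
              for xp, xq in S + D]                  # constraint vectors c_k
    is_sim = [True] * len(S) + [False] * len(D)
    C = len(C_list)
    d_thr = np.array([u if s else l for s in is_sim], dtype=float)
    lam = np.zeros(C)                               # dual variables lambda_k
    beta = np.zeros(C)                              # rank-one coefficients
    Z = [c.copy() for c in C_list]                  # z_k, initialized to c_k

    for _ in range(max_iter):
        beta_old = beta.copy()
        for k in range(C):                          # "for all workers ... in parallel"
            c_k = C_list[k]
            # z_k = (I + sum_i beta_i z_i z_i^T) c_k, as in Eq. (18)
            z_k = c_k + sum(beta[i] * Z[i] * (Z[i] @ c_k) for i in range(C))
            p = float(c_k @ z_k)                    # current distance, Eq. (19)
            if p <= 0:
                continue                            # degenerate constraint, skip
            if is_sim[k]:
                alpha = min(lam[k], 0.5 * (1.0 / p - gamma / d_thr[k]))
                beta[k] = alpha / (1.0 - alpha * p)
                d_thr[k] = gamma * d_thr[k] / (gamma + alpha * d_thr[k])
            else:
                alpha = min(lam[k], 0.5 * (gamma / d_thr[k] - 1.0 / p))
                beta[k] = -alpha / (1.0 + alpha * p)
                d_thr[k] = gamma * d_thr[k] / (gamma - alpha * d_thr[k])
            lam[k] -= alpha
            Z[k] = c_k + sum(beta[i] * Z[i] * (Z[i] @ c_k) for i in range(C))
        if np.max(np.abs(beta - beta_old)) < tol:
            break

    W = np.eye(len(C_list[0]))                      # final W, line 25
    for i in range(C):
        W += beta[i] * np.outer(Z[i], Z[i])
    return W

# Toy usage: two similar pairs and one dissimilar pair in R^3.
rng = np.random.default_rng(0)
S = [(rng.random(3), rng.random(3)) for _ in range(2)]
D = [(rng.random(3), rng.random(3) + 2.0)]
print(np.round(parallel_metric_learning(S, D, u=0.5, l=4.0), 3))
```

Note that each worker stores only its vector z_k and the scalars broadcast by the others, mirroring the O(d) per-step communication discussed above.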

5. Experiments

In the following, we compare our method with advanced hypergraph matching algorithms on benchmark datasets in which the original information for each sample graph is its feature point set. For convenience, the proposed method is denoted MLGM. We compare MLGM against the following methods: spectral matching (SM) [8], max-pooling matching (MPM) [22], the integer projected fixed point method (IPFP) [29], probabilistic graph matching (HGM) [17], tensor matching (TM) [9], reweighted random walk hypergraph matching (RRWHM) [18], block coordinate ascent graph matching (BCAGM) [23], and alternating direction graph matching (ADGM) [48]. We introduced noise and distortion into several datasets to distinguish our method's performance from that of the other approaches, and compared the results in terms of accuracy and matching score. Accuracy was computed as the ratio of the number of correct matches to the total number of points, and the score was computed by Equation (6). The parameter settings for all of the state-of-the-art algorithms were identical to those suggested in their respective articles.
In our method, the dimension of the feature vector for each point tuple was set to 3. We used Equation (20) to compute the affinity tensor M, with γ set as in [9]. During the calculation, we simply selected N × m random triplets from the graph model, where N is a user-defined parameter (set to 50 in this paper). For the best empirical performance, only the K nearest matching tuples for each triplet in the target image were retained, with K set to 300 in this paper.
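A brief sketch of this sampling-and-pruning step; the helper names are ours, and the plain Euclidean nearest-neighbor search is a simplification of the candidate selection:

```python
import numpy as np

def sample_triplets(m, N=50, rng=None):
    """Draw N * m random point triplets from a graph with m points."""
    rng = np.random.default_rng() if rng is None else rng
    return np.array([rng.choice(m, size=3, replace=False) for _ in range(N * m)])

def k_nearest_tuples(f_src, F_tgt, K=300):
    """Indices of the K target triplets whose feature vectors are
    closest to f_src; only these candidates receive nonzero entries
    in the affinity tensor."""
    dists = np.linalg.norm(F_tgt - f_src, axis=1)
    return np.argsort(dists)[:min(K, len(dists))]

triplets = sample_triplets(m=98)          # e.g., the Fish model with 98 points
F_tgt = np.random.rand(4900, 3)           # stand-in target triplet features
print(k_nearest_tuples(np.array([0.5, 0.5, 0.5]), F_tgt, K=5))
```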

5.1. Synthetic Dataset

In this section, we use the popular synthetic benchmark datasets Blessing and Fish [49], which are reliable benchmarks for evaluating graph matching algorithms.
In Figure 2 and Figure 3, we show examples from the synthetic database (the Chinese character "blessing" and a tropical fish). The model shape is shown in the first column; the Blessing and Fish images consist of 105 and 98 points, respectively. To validate the robustness of the proposed algorithm under noise, deformation, outliers, and rotation, we conducted four sets of experiments. Column b contains examples of noisy targets produced by adding Gaussian random noise. Column c contains examples of deformed targets created by applying nonrigid deformation to the model points. Column d contains examples of targets with outliers, created by adding random points drawn from a normal distribution of unit variance, together with moderate rotation. Column e contains examples of targets with large rotations and moderate Gaussian noise. We then ran each group of experiments and evaluated the robustness of the compared methods. The results are shown in Figure 4, Figure 5, Figure 6 and Figure 7.
For the deformation experiments, the degree of deformation was varied from 0.02 to 0.1; as it increases, the matching accuracy of all algorithms decreases correspondingly. For each graph pair, we fixed the points of one image and used the algorithms to find the corresponding points in the other image. Each reported result is the average of multiple parallel runs to ensure reliability. The results show that the ITML-based method has an obvious advantage in this situation. For the noise experiments, the target points were obtained by adding Gaussian random noise from σ = 0.01 to σ = 0.05; Figure 4 and Figure 6 show that the high-order graph matching methods achieve a higher accuracy because the internal information of the image topology is used in the feature description. Across these groups of experiments on the synthetic database, our algorithm obtains more accurate matching results; in particular, it achieves 100% accuracy on the datasets with outliers. Figure 8 and Figure 9 also display the matching scores under varying experimental conditions. As interference increases, the matching score remains steady at a high level, demonstrating that the affinity metric of the feature in our method is invariant to massive affine deformations and strong Gaussian noise. With rotation added, the results show that the rotation angle has little effect on the matching results, except that at 90 degrees the accuracy of our method declines slightly; we believe this is mainly because a 90-degree rotation affects an algorithm that focuses on correlation. The experiments show that, after metric learning on the dataset, the matching results are improved.

5.2. Face Dataset and Duck Dataset

In this section, we compare the performance of our method with other methods on the Face and Duck datasets, which are sub-datasets of Caltech-256 [50]. These datasets contain images from specific classes: 109 face images and 50 duck images, with the ground truth known for each graph pair. We chose 70 pairs of faces at random from the dataset for testing, manually picked 10 feature points from each image, and randomly chose 20 images from each class as the training dataset for metric learning. The baseline was varied from 10 to 80 frames, and the accuracy and matching scores of all algorithms were averaged over this range. To make the comparison more intuitive, we show several examples of matching results in Figure 10 and Figure 11; it can be seen that our algorithm performs better. Figure 12 and Figure 13 show that the MLGM method achieves the largest score values and obtains more accurate matching results than the other tested methods, which are more easily affected by noise and distortion. The MLGM method obtains better matching results over the entire dataset.

6. Conclusions

In this paper, we proposed a tensor graph matching model based on metric learning that uses the Mahalanobis distance as the affinity measure and makes full use of the distribution and geometric information of hypergraphs. To solve the proposed model, a parallel distance metric learning approach was used, which can learn appropriate metrics from high-dimensional data without resorting to low-rank approximation. Experimental results on several databases, including the synthetic Blessing and Fish datasets and the Face and Duck datasets, indicate that the proposed method performs better than existing ones. In the future, we may consider combining this strategy with deep learning.

Author Contributions

Methodology, Z.W. and Y.W.; Project administration, F.L.; Supervision, F.L.; Writing—original draft, Z.W.; Writing—review & editing, Y.W. All authors have read and agreed to the published version of the manuscript.

Funding

This work is supported by the National Natural Science Foundation of China under Grant No. 62072256, the Natural Science Foundation of Nanjing University of Posts and Telecommunications (Grant No. NY221057 and NY220003) and the Postgraduate Research & Practice Innovation Program of Jiangsu Province, China (Grant No. SJCX19_0248).

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Tian, Y.; Mceachin, R.C.; Santos, C.; States, D.J.; Patel, J.M. SAGA: A subgraph matching tool for biological graphs. Bioinformatics 2007, 23, 232–239. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  2. Chaudhuri, B.; Demir, B.; Chaudhuri, S.; Bruzzone, L. Multilabel remote sensing image retrieval using a semisupervised graph-theoretic method. IEEE Trans. Geosci. Remote. Sens. 2017, 56, 1144–1158. [Google Scholar] [CrossRef]
  3. Yang, X.; Latecki, L.J. Affinity learning on a tensor product graph with applications to shape and image Retrieval. In Proceedings of the CVPR 2011, Colorado Springs, CO, USA, 20–25 June 2011; pp. 2369–2376. [Google Scholar]
  4. Lawler, E.L. The quadratic assignment problem. Manag. Sci. 1963, 9, 586–599. [Google Scholar] [CrossRef]
  5. Gold, S.; Rangarajan, A. A graduated assignment algorithm for graph matching. IEEE Trans. Pattern Anal. Mach. Intell. 1996, 18, 377–388. [Google Scholar] [CrossRef] [Green Version]
  6. Almohamad, H.; Duffuaa, S. A linear programming approach for the weighted graph matching problem. IEEE Trans. Pattern Anal. Mach. Intell. 1993, 15, 522–525. [Google Scholar] [CrossRef] [Green Version]
  7. Torr, P.H. Solving markov random fields using semi definite programming. In Proceedings of the International Workshop on Artificial Intelligence and Statistics PMLR, Key West, FL, USA, 3–6 January 2003; pp. 292–299. [Google Scholar]
  8. Leordeanu, M.; Hebert, M. A Spectral Technique for Correspondence Problems Using Pairwise Constraints; The Robotics Institute, Carnegie Mellon University: Pittsburgh, PA, USA, 2005; pp. 1482–1489. [Google Scholar]
  9. Duchenne, O.; Bach, F.; Kweon, I.S.; Ponce, J. A tensor-based algorithm for high-order graph matching. IEEE Trans. Pattern Anal. Mach. Intell. 2011, 33, 2383–2395. [Google Scholar] [CrossRef] [Green Version]
  10. Medasani, S.; Krishnapuram, R.; Choi, Y. Graph matching by relaxation of fuzzy assignments. IEEE Trans. Fuzzy Syst. 2001, 9, 173–182. [Google Scholar] [CrossRef]
  11. Nocedal, J.; Wright, S. Numerical Optimization; Springer: Berlin/Heidelberg, Germany, 2006. [Google Scholar]
  12. Ou, W.; You, X.; Tao, D.; Zhang, P.; Tang, Y.; Zhu, Z. Robust face recognition via occlusion dictionary learning. Pattern Recognit. 2014, 47, 1559–1572. [Google Scholar] [CrossRef]
  13. Ou, W.; Yu, S.; Li, G.; Lu, J.; Zhang, K.; Xie, G. Multi-view non-negative matrix factorization by patch alignment framework with view consistency. Neurocomputing 2016, 204, 116–124. [Google Scholar] [CrossRef]
  14. Ou, W.; Luan, X.; Gou, J.; Zhou, Q.; Xiao, W.; Xiong, X.; Zeng, W. Robust discriminative nonnegative dictionary learning for occluded face recognition. Pattern Recognit. Lett. 2018, 107, 41–49. [Google Scholar] [CrossRef]
  15. Gou, J.; Qiu, W.; Yi, Z.; Shen, X.; Zhan, Y.; Ou, W. Locality constrained representation-based K-nearest neighbor classification. Knowl. Based Syst. 2019, 167, 38–52. [Google Scholar] [CrossRef]
  16. Gou, J.; Ma, H.; Ou, W.; Zeng, S.; Rao, Y.; Yang, H. A generalized mean distance-based k-nearest neighbor classifier. Expert Syst. Appl. 2019, 115, 356–372. [Google Scholar] [CrossRef]
  17. Zass, R.; Shashua, A. Probabilistic graph and hypergraph matching. In Proceedings of the 2008 IEEE Conference on Computer Vision and Pattern Recognition, Anchorage, AK, USA, 23–28 June 2008; pp. 1–8. [Google Scholar]
  18. Lee, J.; Cho, M.; Lee, K.M. Hyper-graph matching via reweighted random walks. In Proceedings of the CVPR 2011, Colorado Springs, CO, USA, 20–25 June 2011; pp. 1633–1640. [Google Scholar]
  19. Byrd, R.H.; Lu, P.; Nocedal, J.; Zhu, C. A limited memory algorithm for bound constrained optimization. SIAM J. Sci. Comput. 1995, 16, 1190–1208. [Google Scholar] [CrossRef]
  20. Ni, Q.; Yuan, Y.x. A subspace limited memory quasi-Newton algorithm for large-scale nonlinear bound constrained optimization. Math. Comput. 1997, 66, 1509–1520. [Google Scholar] [CrossRef] [Green Version]
  21. Egozi, A.; Keller, Y.; Guterman, H. A probabilistic approach to spectral graph matching. IEEE Trans. Pattern Anal. Mach. Intell. 2012, 35, 18–27. [Google Scholar] [CrossRef]
  22. Cho, M.; Sun, J.; Duchenne, O.; Ponce, J. Finding matches in a haystack: A max-pooling strategy for graph matching in the presence of outliers. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 2083–2090. [Google Scholar]
  23. Nguyen, Q.; Gautier, A.; Hein, M. A flexible tensor block coordinate ascent scheme for hypergraph matching. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 5270–5278. [Google Scholar]
  24. Jiang, B.; Tang, J.; Ding, C.; Luo, B. A local sparse model for matching problem. In Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, Austin, TX, USA, 25–30 January 2015. [Google Scholar]
  25. Gillis, D.B.; Bowles, J.H. Hyperspectral image segmentation using spatial-spectral graphs. In Proceedings of the Algorithms and Technologies for Multispectral, Hyperspectral, and Ultraspectral Imagery XVIII. SPIE, Baltimore, MD, USA, 23–27 April 2012; Volume 8390, pp. 527–537. [Google Scholar]
  26. Meng, D.; Fazel, M.; Mesbahi, M. Proximal alternating direction method of multipliers for distributed optimization on weighted graphs. In Proceedings of the 2015 54th IEEE Conference on Decision and Control (CDC), Osaka, Japan, 15–18 December 2015; pp. 1396–1401. [Google Scholar]
  27. Liu, Z.Y.; Qiao, H.; Yang, X.; Hoi, S.C. Graph matching by simplified convex-concave relaxation procedure. Int. J. Comput. Vis. 2014, 109, 169–186. [Google Scholar] [CrossRef]
  28. Chertok, M.; Keller, Y. Efficient high order matching. IEEE Trans. Pattern Anal. Mach. Intell. 2010, 32, 2205–2215. [Google Scholar] [CrossRef]
  29. Leordeanu, M.; Hebert, M.; Sukthankar, R. An integer projected fixed point method for graph matching and map inference. Adv. Neural Inf. Process. Syst. 2009, 1114–1122. [Google Scholar]
  30. Cho, M.; Lee, J.; Lee, K.M. Reweighted random walks for graph matching. In Proceedings of the European Conference on Computer Vision, Heraklion, Crete, Greece, 5–11 September 2010; pp. 492–505. [Google Scholar]
  31. Xu, Y.; Wu, L.; Jian, M.; Zheng, W.S.; Ma, Y.; Wang, Z. Identity-constrained noise modeling with metric learning for face anti-spoofing. Neurocomputing 2021, 434, 149–164. [Google Scholar] [CrossRef]
  32. Yu, J.; Hu, C.H.; Jing, X.Y.; Feng, Y.J. Deep metric learning with dynamic margin hard sampling loss for face verification. Signal Image Video Process. 2020, 14, 791–798. [Google Scholar] [CrossRef]
  33. Cao, R.; Zhang, Q.; Zhu, J.; Li, Q.; Li, Q.; Liu, B.; Qiu, G. Enhancing remote sensing image retrieval using a triplet deep metric learning network. Int. J. Remote Sens. 2020, 41, 740–751. [Google Scholar] [CrossRef]
  34. Jin, Y.; Li, C.; Li, Y.; Peng, P.; Giannopoulos, G.A. Model latent views with multi-center metric learning for vehicle re-identification. IEEE Trans. Intell. Transp. Syst. 2021, 22, 1919–1931. [Google Scholar] [CrossRef]
  35. Bohné, J.; Ying, Y.; Gentric, S.; Pontil, M. Large margin local metric learning. In Proceedings of the European Conference on Computer Vision, Zurich, Switzerland, 6–12 September 2014; pp. 679–694. [Google Scholar]
  36. Wang, J.; Kalousis, A.; Woznica, A. Parametric local metric learning for nearest neighbor classification. Adv. Neural Inf. Process. Syst. 2012, 25, 1601–1609. [Google Scholar]
  37. Kilgour, D.M.; Hipel, K.W.; Fang, L. The graph model for conflicts. Automatica 1987, 23, 41–55. [Google Scholar] [CrossRef]
  38. Zhu, H.; Cui, C.; Deng, L.; Cheung, R.C.; Yan, H. Elastic net constraint-based tensor model for high-order graph matching. IEEE Trans. Cybern. 2019, 51, 4062–4074. [Google Scholar] [CrossRef]
  39. De Maesschalck, R.; Jouan-Rimbaud, D.; Massart, D.L. The mahalanobis distance. Chemom. Intell. Lab. Syst. 2000, 50, 1–18. [Google Scholar] [CrossRef]
  40. Wang, X.; Han, X.; Huang, W.; Dong, D.; Scott, M.R. Multi-similarity loss with general pair weighting for deep metric learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 5022–5030. [Google Scholar]
  41. Xing, E.; Jordan, M.; Russell, S.J.; Ng, A. Distance metric learning with application to clustering with side-information. Adv. Neural Inf. Process. Syst. 2002, 15. [Google Scholar]
  42. Davis, J.V.; Kulis, B.; Jain, P.; Sra, S.; Dhillon, I.S. Information-theoretic metric learning. In Proceedings of the 24th International Conference on Machine Learning, New York, NY, USA, 20–24 June 2007; pp. 209–216. [Google Scholar]
  43. Kullback, S.; Leibler, R.A. On information and sufficiency. Ann. Math. Stat. 1951, 22, 79–86. [Google Scholar] [CrossRef]
  44. Davis, J.; Dhillon, I. Differential entropic clustering of multivariate gaussians. Adv. Neural Inf. Process. Syst. 2006, 19. [Google Scholar]
  45. Kulis, B.; Sustik, M.A.; Dhillon, I.S. Low-Rank Kernel Learning with Bregman Matrix Divergences. J. Mach. Learn. Res. 2009, 10. [Google Scholar]
  46. Shen, C.; Kim, J.; Wang, L.; Van Den Hengel, A. Positive semidefinite metric learning using boosting-like algorithms. J. Mach. Learn. Res. 2012, 13, 1007–1036. [Google Scholar]
  47. Bregman, L.M. The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming. USSR Comput. Math. Math. Phys. 1967, 7, 200–217. [Google Scholar] [CrossRef]
  48. Lê-Huu, D.K.; Paragios, N. Alternating direction graph matching. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 4914–4922. [Google Scholar]
  49. Chui, H.; Rangarajan, A. A new point matching algorithm for non-rigid registration. Comput. Vis. Image Underst. 2003, 89, 114–141. [Google Scholar] [CrossRef]
  50. Griffin, G.; Holub, A.; Perona, P. Caltech-256 Object Category Dataset; Technical Report 7694; Caltech: Pasadena, CA, USA, 2007. [Google Scholar]
Figure 1. Graph matching schematic diagram.
Figure 2. (a) shows model fish point sets, and (b–e) show point sets added with deformation, noise, outliers, and rotation, respectively.
Figure 3. (a) shows model blessing point sets, and (b–e) show point sets added with deformation, noise, outliers, and rotation, respectively.
Figure 4. Accuracy comparison on the Fish dataset. (a) Accuracy with different degree of deformation. (b) Accuracy with different noise level. (c) Accuracy with different number of outliers. (d) Accuracy with different rotation angle.
Figure 5. The assignment matrix obtained by MLGM from matching results on the Fish dataset with different degree of deformation (a–e).
Figure 6. Accuracy comparison on the Blessing dataset. (a) Accuracy with different degree of deformation. (b) Accuracy with different noise level. (c) Accuracy with different number of outliers. (d) Accuracy with different rotation angle.
Figure 7. The assignment matrix obtained by MLGM from matching results on the Blessing dataset with different degree of deformation (a–e).
Figure 8. Matching score comparison on the Fish dataset. (a) Matching score with different degree of deformation. (b) Matching score with different noise level. (c) Matching score with different number of outliers. (d) Matching score with different rotation angle.
Figure 9. Matching score comparison on the Blessing dataset. (a) Matching score with different degree of deformation. (b) Matching score with different noise level. (c) Matching score with different number of outliers. (d) Matching score with different rotation angle.
Figure 10. Example results of experiments on the Face dataset, in which red and yellow lines denote correct and incorrect matching results.
Figure 11. Example results of experiments on the Duck dataset.
Figure 12. Trend chart of matching accuracy and score of the Face dataset. (a) Accuracy of the Face dataset. (b) Matching score of the Face dataset.
Figure 13. Trend chart of matching accuracy and score of the Duck dataset. (a) Accuracy of the Duck dataset. (b) Matching score of the Duck dataset.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
