Article

Graph-Regularized, Sparsity-Constrained Non-Negative Matrix Factorization with Earth Mover’s Distance Metric

Shunli Li, Linzhang Lu, Qilong Liu and Zhen Chen
1 School of Mathematical Sciences, Guizhou Normal University, Guiyang 550025, China
2 College of Mathematics and Information Science, Guiyang University, Guiyang 550005, China
3 School of Mathematical Sciences, Xiamen University, Xiamen 361005, China
* Author to whom correspondence should be addressed.
Mathematics 2023, 11(8), 1894; https://doi.org/10.3390/math11081894
Submission received: 22 March 2023 / Revised: 13 April 2023 / Accepted: 14 April 2023 / Published: 17 April 2023

Abstract

Non-negative matrix factorization (NMF) is widely used as a powerful matrix factorization tool in data representation. However, traditional NMF, with its approximation error measured by the Euclidean distance or the Kullback–Leibler divergence, neither takes into account the geometric information implicit in the dataset nor measures the distance between samples as well as possible. To remedy these defects, in this paper we propose an NMF method with the Earth mover's distance as its metric, GSNMF-EMD for short. It combines graph regularization and $L_{1/2}$ sparsity constraints. The GSNMF-EMD method takes into account the intrinsic geometric information of the dataset and can produce sparser and more stable local solutions. Experiments on two image datasets showed that the proposed method outperforms related state-of-the-art methods.

1. Introduction

The non-negative matrix factorization (NMF) method has been widely used in various fields of feature learning and has become one of the most popular methods, with applications including text clustering [1,2,3], digital image processing [4,5,6,7,8,9], face recognition [10], and signal analytics [11,12]. Owing to the broad practical utility of NMF, numerous scholars have proposed improvements to the original NMF, typically by imposing different constraints on the two factor matrices. For example, P. O. Hoyer [13] proposed NMF with sparseness constraints, which improves the accuracy of the parts-based representation compared to the basic NMF. Cai et al. [14] proposed a manifold-based NMF (GNMF) method that respects the geometric structure information hidden inside the dataset. He et al. [15] proposed a robust NMF method with sparse constraints in order to deal with both sparse and Gaussian noise. Kong et al. [16] added an $L_{21}$-norm constraint and proposed a robust non-negative matrix factorization algorithm. Huang et al. [17] proposed a new unsupervised learning model, called robust NMF with structure regularization, that takes into account both the global and local structure of the data space. Pan et al. [18] introduced an orthogonal non-negative matrix factorization. Luo et al. [19] applied NMF to the collaborative filtering problem and proposed a regularized single-element-based model (RSNMF) that reduces computational complexity and improves accuracy on large industrial datasets. Sun et al. [20] analyzed the generalization performance of the NMF algorithm from the perspective of algorithmic stability and gave bounds on the generalization error.
The vast majority of the various NMF-based variants mentioned above use the Euclidean norm or the K-L divergence [21] to measure the minimization distance between the product of two factor matrices and the original matrix. A shortcoming of the above NMF-based approaches is that either the intrinsic structure of the dataset or the distance between samples is ignored. To remedy the shortcoming, many researchers have tried to adopt new metrics. Recently, correntropy has proven to be a very effective approximation measure by virtue of its stability against outliers or noise [22,23,24,25,26]. For example, Yu et al. [26] proposed the correntropy-based, hypergraph-regularized NMF (CHNMF) method using correlation entropy. The method is used for clustering and feature extraction of multi-cancer integrated data with better robustness. Another measure, called the Earth mover’s distance (EMD), has also attracted great interest. EMD not only provides a better measure of the distance between samples but also has good robustness [27,28]. In [28], the authors showed that EMD has good performance in tasks involving large distortions, such as geometric deformations, illumination changes, or heavy intensity noise.
Rubner et al. [29] constructed an image comparison framework that better accounts for perceptual similarity than other previously proposed methods by combining EMD with a vector quantization-based distribution representation scheme. Sandler et al. [30,31] proposed an EMDNMF algorithm that minimizes the EMD error between the data and the product of factor matrices. The main advantage of this method is enhanced robustness. In order to improve the efficiency and accuracy of approximate EMD calculations, Atasu et al. provided new theoretical and practical results in [32]. Qu et al. [33] proposed a novel EMD method to detect false data injection attacks in smart grids to improve power-system security.
In this paper, the NMF algorithm is improved based on the EMD distance together with a graph-regularization term and a sparsity constraint, which can discover the geometric structure information inherent in the dataset and produce a smooth and more accurate solution. We call it graph-regularized, $L_{1/2}$ sparsity-constrained NMF with the EMD metric (GSNMF-EMD). In view of the numerous studies showing that $L_{1/2}$-NMF provides sparser and more accurate results than those obtained with the $L_1$ norm [34,35,36,37,38], we also chose the $L_{1/2}$ norm as the sparsity constraint. The graph-regularization constraint uncovers the implied semantics while preserving information about the intrinsic geometric structure of the dataset. Furthermore, update rules and convergence proofs for the GSNMF-EMD algorithm are presented. Experimental results on real datasets demonstrate the validity and accuracy of our proposed multiplicative update algorithm.
The rest of the paper is organized as follows: In Section 2, we present some of the related work. In Section 3, we propose a multiplicative update rule for the GSNMF-EMD model and prove its convergence. We report some experimental results on two image datasets in Section 4. The parameter selection is provided in Section 5. Finally, we briefly summarize in Section 6.

2. Related Work

In this section, we review the EMD metric and NMF with the Earth mover's distance metric (EMDNMF for short). We use bold capital letters to denote matrices and lowercase letters to denote vectors. For example, $A$ is an $m \times n$ matrix, $a_j$ is its $j$-th column vector, and $a_{ij}$ is the $(i,j)$-th entry of the matrix $A$. Let $x, y \in \mathbb{R}^m$, and let $\widetilde{KL}(x \,\|\, y) = x^\top \log(x \oslash y) - \mathbf{1}^\top x + \mathbf{1}^\top y$ be the generalized K-L divergence between $x$ and $y$, where $\oslash$ represents element-wise division and $(\cdot)^\top$ denotes the transpose operation.

2.1. EMD

EMD was first proposed for certain vision problems by Peleg et al. [39]. EMD is an efficient metric based on the solution of an optimal transportation problem; to be precise, it is the minimum cost that must be paid to convert one distribution into another. Since it can handle variable-length representations of distributions, it effectively avoids quantization and other typical histogram-binning problems, making it more robust than histogram-matching techniques.
Definition 1 
([29]). Let $x, y \in \mathbb{R}^m$ be two normalized histograms with $x^\top \mathbf{1}_m = 1$ and $y^\top \mathbf{1}_m = 1$. Let $M$ be an $m \times m$ distance metric matrix, and let $T = (t_{pq}) \in \mathbb{R}^{m \times m}$ be a transport matrix. We call
$$D_M(y, x) = \min_{t_{pq} \ge 0} \sum_{p,q=1}^m m_{pq} t_{pq} \qquad \text{s.t.} \quad T \mathbf{1}_m = y, \;\; T^\top \mathbf{1}_m = x,$$
the Earth mover's distance (EMD) between $x$ and $y$.
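To make Definition 1 concrete, the following Python sketch (our own illustration, not code from the paper) computes the EMD between two small histograms by solving the transportation linear program with SciPy; the helper name emd and the toy 3-bin histograms are assumptions, and a dedicated optimal-transport solver would scale better for large m.

```python
import numpy as np
from scipy.optimize import linprog

def emd(x, y, M):
    """EMD of Definition 1: minimize sum_pq m_pq t_pq over transport plans T >= 0
    with row sums T 1 = y and column sums T^T 1 = x."""
    m = len(x)
    c = M.reshape(-1)                          # cost vector over the flattened plan T
    A_eq = np.zeros((2 * m, m * m))
    for p in range(m):
        A_eq[p, p * m:(p + 1) * m] = 1.0       # row-sum constraint: sum_q t_pq = y_p
    for q in range(m):
        A_eq[m + q, q::m] = 1.0                # column-sum constraint: sum_p t_pq = x_q
    b_eq = np.concatenate([y, x])
    res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=(0, None), method="highs")
    return res.fun

# Toy example: shifting a 3-bin histogram by one bin.
x = np.array([0.5, 0.5, 0.0])
y = np.array([0.0, 0.5, 0.5])
M = np.abs(np.subtract.outer(np.arange(3), np.arange(3))).astype(float)
print(emd(x, y, M))   # each half unit of mass moves one bin, so the cost is 1.0
```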
Despite EMD having many good properties, computing it via the well-known solution of the transportation problem is very expensive. In recent years, some scholars have done fruitful work to improve its computational efficiency. Based on Equation (1), Marco Cuturi [40] proposed a maximum-entropy perspective: he smoothed the solution of the EMD problem by adding an entropy-regularization term. Building on Cuturi's results, Frogner [41] replaced the equality constraints with soft penalties in terms of the K-L divergence and obtained an unconstrained approximate transport problem. Its specific form is:
$$D_M^{\lambda,\gamma}(y, x) = \min_{t_{pq} \ge 0} \Big\{ \sum_{p,q=1}^m m_{pq} t_{pq} + \frac{1}{\lambda} \widetilde{H}(T) + \gamma \big( \widetilde{KL}(T \mathbf{1} \,\|\, y) + \widetilde{KL}(T^\top \mathbf{1} \,\|\, x) \big) \Big\},$$
where $\widetilde{H}(T) = \sum_{p,q}^m t_{pq} \log t_{pq}$ is the entropy of $T$, and $\lambda$ and $\gamma$ are regularization parameters. When $x$ and $y$ are normalized vectors, Equation (2) closely approximates Cuturi's algorithm for large enough $\gamma$. Moreover, Frogner obtains the optimal solution of Equation (2) by borrowing Cuturi's results; that is, the optimal solution is a diagonal scaling of the matrix $K = e^{-\lambda M - 1}$. The specific form is:
$$T^* = \mathrm{diag}(u)\, K\, \mathrm{diag}(v),$$
$$u = (y)^{\frac{\gamma\lambda}{\gamma\lambda+1}} \odot (K v)^{-\frac{\gamma\lambda}{\gamma\lambda+1}}, \qquad v = (x)^{\frac{\gamma\lambda}{\gamma\lambda+1}} \odot (K^\top u)^{-\frac{\gamma\lambda}{\gamma\lambda+1}},$$
where $\odot$ denotes element-wise multiplication and the powers are taken element-wise. The gradient of Equation (2) with respect to $y$ is given by
$$\nabla_y D_M^{\lambda,\gamma}(y, x) = \gamma \big( \mathbf{1} - (T^* \mathbf{1}) \oslash y \big).$$
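A minimal sketch of the relaxed distance and its diagonal-scaling solution described above, under the reconstruction given here: the plan is obtained by alternating the two scaling updates, and the last line evaluates the gradient with respect to y. The function name relaxed_emd, the iteration count, and the small eps guard against division by zero are our choices, not part of the paper.

```python
import numpy as np

def relaxed_emd(x, y, M, lam=100.0, gamma=0.1, n_iter=200, eps=1e-12):
    """Smoothed, KL-relaxed EMD: returns the transport plan T*, an approximate
    value of the relaxed objective, and the gradient with respect to y."""
    K = np.exp(-lam * M - 1.0)                  # kernel K = e^{-lam*M - 1}
    p = gamma * lam / (gamma * lam + 1.0)       # exponent gamma*lam / (gamma*lam + 1)
    u, v = np.ones_like(y), np.ones_like(x)
    for _ in range(n_iter):                     # alternate the fixed-point scaling updates
        u = (y / (K @ v + eps)) ** p
        v = (x / (K.T @ u + eps)) ** p
    T = np.diag(u) @ K @ np.diag(v)             # T* = diag(u) K diag(v)

    def gen_kl(a, b):                           # generalized K-L divergence of Section 2
        return np.sum(a * np.log((a + eps) / (b + eps)) - a + b)

    ones = np.ones(len(y))
    value = (np.sum(M * T)                      # transport cost
             + np.sum(T * np.log(T + eps)) / lam
             + gamma * (gen_kl(T @ ones, y) + gen_kl(T.T @ ones, x)))
    grad_y = gamma * (1.0 - (T @ ones) / (y + eps))   # gradient with respect to y
    return T, value, grad_y
```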

2.2. EMDNMF

Consider $n$ non-negative histograms with $m$ bins each, collected in matrix form as $X = [x_{ij}] = [x_1, x_2, \ldots, x_n] \in \mathbb{R}^{m \times n}$, where the $j$-th histogram is the $j$-th column of the matrix. The matrix $X$ may be decomposed into a product of $U = [u_{ij}] = [u_1, u_2, \ldots, u_r] \in \mathbb{R}^{m \times r}$ and $V = [v_{ij}] = [v_1, v_2, \ldots, v_r] \in \mathbb{R}^{n \times r}$ (i.e., $X \approx UV^\top$), where $U$ and $V$ are called the basis and coefficient matrices, respectively. In many cases, this low-dimensional approximation is more valuable than an exact decomposition. Let $Y = UV^\top = [y_1, y_2, \ldots, y_n] \in \mathbb{R}^{m \times n}$, with $y_j = \sum_k u_k v_{jk}$. Optimizing the exact EMD throughout the iterative solution process is expensive and makes it difficult to guarantee that $\sum_i x_{ij} = \sum_i y_{ij}$. Equation (2) describes a regularized approximation that can be computed quickly and efficiently, even for unnormalized data. Fortunately, $X \approx Y$ implies that $\sum_{j=1}^n D_M^{\lambda,\gamma}(x_j, y_j)$ is the sum of distances between the feature histograms. Based on Equation (2), the objective function of EMDNMF can be expressed as:
$$\digamma_1 = \min_{U,V} \sum_{j=1}^n D_M^{\lambda,\gamma}(y_j, x_j).$$
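As a small illustration (reusing the hypothetical relaxed_emd helper sketched in Section 2.1), the EMDNMF objective can be evaluated by summing the relaxed column-wise distances between X and Y = U V^T:

```python
def emdnmf_loss(X, U, V, M, lam=100.0, gamma=0.1):
    """Sum of relaxed EMDs between the columns of X and of Y = U V^T."""
    Y = U @ V.T
    return sum(relaxed_emd(X[:, j], Y[:, j], M, lam, gamma)[1]
               for j in range(X.shape[1]))
```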

3. GSNMF-EMD

3.1. The Objective Function

To explain how the geometric structure information implied by the dataset is discovered, let $v_i$ denote the low-dimensional representation of the original data point $x_i$ with respect to the basis matrix $U$. A reasonable assumption is that if two data points $x_i$ and $x_j$ are close to each other in the intrinsic geometry of the dataset, then their representations $v_i$ and $v_j$ are also close to each other with respect to the new basis [42]. For each data point $x_i$, we can use a 0-1 weighting scheme to generate a k-nearest-neighbor graph, which in turn generates a weight matrix $W$ according to graph theory [43,44]. Given a graph with $n$ vertices, where each vertex corresponds to a data point, the edge weight matrix $W$ is defined as follows:
$$w_{ij} = \begin{cases} 1, & \text{if } x_i \in \widetilde{N}_k(x_j) \text{ or } x_j \in \widetilde{N}_k(x_i), \\ 0, & \text{otherwise,} \end{cases}$$
where $\widetilde{N}_k(x_j)$ denotes the set of the $k$ nearest neighbors of the data point $x_j$. Thus, the smoothness of the low-dimensional representation can be measured by
$$\min_V \; \frac{1}{2} \sum_{j,s=1}^n w_{js} \| v_j - v_s \|^2.$$
Combining Equations (6) and (7), we get another objective function:
$$\digamma_2 = \min_{U,V} \sum_{j=1}^n D_M^{\lambda,\gamma}(y_j, x_j) + \frac{1}{2}\alpha \sum_{j,s=1}^n w_{js} \| v_j - v_s \|^2,$$
where $\alpha \ge 0$ is the regularization parameter.
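A plain NumPy sketch of the 0-1 weighting scheme used to build W above; the helper name and the use of squared Euclidean distances between data columns are our assumptions.

```python
import numpy as np

def knn_weight_matrix(X, k=5):
    """0-1 weight matrix W of the k-nearest-neighbor graph for the columns of X."""
    n = X.shape[1]
    sq = np.sum(X ** 2, axis=0)
    D = sq[:, None] + sq[None, :] - 2.0 * (X.T @ X)   # pairwise squared distances
    np.fill_diagonal(D, np.inf)                       # a point is not its own neighbor
    W = np.zeros((n, n))
    for j in range(n):
        nn = np.argsort(D[:, j])[:k]                  # k nearest neighbors of x_j
        W[nn, j] = 1.0
    return np.maximum(W, W.T)                         # symmetrize: the "or" in the definition
```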
To obtain a sparser and more accurate solution, the first issue to be addressed is whether the sparsity constraint should be imposed on the basis matrix $U$ or the coefficient matrix $V$. In this paper, we impose the $L_{1/2}$ sparsity constraint on the coefficient matrix $V$. Numerous results have shown that $L_{1/2}$ regularization is easier to solve than $L_0$ regularization, yields sparser and more stable solutions than $L_1$ regularization [35,36,37], and is also computationally efficient. Based on the above graph-regularization term and $L_{1/2}$ sparsity constraint, the final objective function of graph-regularized, sparsity-constrained NMF with the Earth mover's distance metric (GSNMF-EMD) is given as:
$$\digamma = \min_{U,V} \sum_{j=1}^n D_M^{\lambda,\gamma}(y_j, x_j) + \frac{1}{4}\alpha \sum_{j,s=1}^n w_{js} \| v_j - v_s \|^2 + \beta \| V \|_{1/2},$$
where $\| V \|_{1/2} = \sum_{j=1}^n \sum_{s=1}^r v_{js}^{1/2}$ is the $L_{1/2}$ regularization term and $\beta$ is the corresponding regularization parameter.
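For monitoring the objective, the two regularization terms can be evaluated with the Laplacian identity $\frac{1}{2}\sum_{j,s} w_{js}\|v_j - v_s\|^2 = \mathrm{tr}(V^\top L V)$, $L = D - W$; the following sketch (our naming) computes them:

```python
import numpy as np

def regularizers(V, W, alpha, beta):
    """Graph term (alpha/4) * sum_js w_js ||v_j - v_s||^2 and sparsity term beta * ||V||_{1/2}."""
    L = np.diag(W.sum(axis=1)) - W                    # graph Laplacian L = D - W
    graph_term = 0.5 * alpha * np.trace(V.T @ L @ V)  # equals (alpha/4) * sum_js w_js ||v_j - v_s||^2
    sparsity_term = beta * np.sum(np.sqrt(np.maximum(V, 0.0)))
    return graph_term, sparsity_term
```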

3.2. Multiplicative Update Rules

The objective function in Equation (9) is not jointly convex in $U$ and $V$, so finding the global minimum is a difficult task; fortunately, we can use an iterative update algorithm to obtain a locally optimal solution. Similar to [14,45], we obtained a two-step multiplicative update rule that maintains non-negativity and finds a local minimum.
Theorem 1. 
The objective function ϝ in Equation (9) has the following update rules:
$$u_{ik} \leftarrow u_{ik}\, \frac{\sum_s v_{sk} \sum_t T^*_{s,it} / y_{is}}{\sum_s v_{sk}},$$
$$v_{jk} \leftarrow v_{jk}\, \frac{\gamma \sum_s u_{sk} \sum_t T^*_{j,st} / y_{sj} + \alpha \sum_{s \neq j} w_{js} v_{sk}}{\gamma \sum_s u_{sk} + \alpha v_{jk} \sum_{s \neq j} w_{js} + \frac{\beta}{2} v_{jk}^{-1/2}},$$
where $T^*_{s,it}$ is the $(i,t)$-entry of the optimal transportation matrix between $x_s$ and $y_s$ defined in Equation (3). Furthermore, the objective function $\digamma$ is nonincreasing under these update rules, and it remains unchanged if and only if $U$ and $V$ are at a stationary point.
Now the specific procedure for finding the locally optimal $U$ and $V$ of GSNMF-EMD is summarized in Algorithm 1.
Algorithm 1 GSNMF-EMD algorithm.
Input: Data matrix $X \in \mathbb{R}^{m \times n}$, metric matrix $M \in \mathbb{R}^{m \times m}$, weight matrix $W \in \mathbb{R}^{n \times n}$; parameters $\lambda$, $\gamma$, $\alpha$, $\beta$.
Output: $U \in \mathbb{R}^{m \times r}$, $V \in \mathbb{R}^{n \times r}$.
1: Initialization: $U^0$, $V^0$, $Y^0 = U^0 (V^0)^\top$, $k = 0$;
2: repeat
3:   Calculate the optimal transport matrix $T^*$;
4:   Update $U$ according to $u_{ik} \leftarrow u_{ik}\, \dfrac{\sum_s v_{sk} \sum_t T^*_{s,it} / y_{is}}{\sum_s v_{sk}}$;
5:   Update $V$ according to $v_{jk} \leftarrow v_{jk}\, \dfrac{\gamma \sum_s u_{sk} \sum_t T^*_{j,st} / y_{sj} + \alpha \sum_{s \neq j} w_{js} v_{sk}}{\gamma \sum_s u_{sk} + \alpha v_{jk} \sum_{s \neq j} w_{js} + \frac{\beta}{2} v_{jk}^{-1/2}}$;
6:   $k \leftarrow k + 1$;
7: until convergence or the maximum number of iterations is reached.
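The following Python sketch mirrors the structure of Algorithm 1, reusing the hypothetical relaxed_emd helper from Section 2.1. The random initialization, the eps safeguard, and the vectorized bookkeeping are our choices, so it should be read as an illustration of the update structure rather than the authors' reference implementation.

```python
import numpy as np

def gsnmf_emd(X, M, W, r, lam=100.0, gamma=0.1, alpha=20.0, beta=1.8,
              n_iter=100, eps=1e-12):
    """Sketch of Algorithm 1: alternate the multiplicative updates for U and V."""
    m, n = X.shape
    rng = np.random.default_rng(0)
    U = rng.random((m, r)) + eps
    V = rng.random((n, r)) + eps
    d = W.sum(axis=1)                                  # degrees of the k-NN graph

    def transported_ratio(U, V):
        """R[i, j] = (T*_j 1)_i / y_ij for every column j, with Y = U V^T."""
        Y = U @ V.T + eps
        R = np.empty((m, n))
        for j in range(n):
            Tj = relaxed_emd(X[:, j], Y[:, j], M, lam, gamma)[0]
            R[:, j] = (Tj @ np.ones(m)) / Y[:, j]
        return R

    for _ in range(n_iter):
        R = transported_ratio(U, V)
        # Multiplicative update for U (first rule of Theorem 1).
        U *= (R @ V) / (V.sum(axis=0, keepdims=True) + eps)
        R = transported_ratio(U, V)
        # Multiplicative update for V: EMD part, graph part, and L_{1/2} penalty.
        num = gamma * (R.T @ U) + alpha * (W @ V)
        den = (gamma * U.sum(axis=0, keepdims=True)
               + alpha * d[:, None] * V
               + 0.5 * beta / np.sqrt(V + eps))
        V *= num / (den + eps)
    return U, V
```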

3.3. Convergence Analysis

In this section, we draw on the ideas of [14,45] and cite the relevant theory as Definition 2 and Lemma 1.
Definition 2 
([45]). $\Psi(h, h')$ is an auxiliary function for $\widetilde{\digamma}(h)$ if the conditions $\Psi(h, h') \ge \widetilde{\digamma}(h)$ and $\Psi(h, h) = \widetilde{\digamma}(h)$ are satisfied.
Lemma 1 
([45]). If $\Psi$ is an auxiliary function of $\widetilde{\digamma}$, then $\widetilde{\digamma}$ is nonincreasing under the update
$$h^{t+1} = \arg\min_h \Psi(h, h^t).$$
By fixing the matrix $U$, we can rewrite the objective function $\digamma$ as a function of $V$:
$$\digamma(V) = \sum_{j=1}^n D_M^{\lambda,\gamma}\Big( \sum_k v_{jk} u_k,\; x_j \Big) + \frac{1}{4}\alpha \sum_{j,s=1}^n w_{js} \| v_j - v_s \|^2 + \beta \| V \|_{1/2}.$$
With $V$ fixed, $\digamma(U)$ can be rewritten similarly. We only prove that $\digamma(V)$ is nonincreasing under the update rule of Equation (11); the rule of Equation (10) can be proved in a similar way.
Lemma 2. 
Let $\varphi_{ik_i} = \dfrac{u_{ik_i} v_{jk_i}^{(q)}}{\sum_k u_{ik} v_{jk}^{(q)}}$ and $\varphi = [\varphi_{1k_1}, \varphi_{2k_2}, \ldots, \varphi_{mk_m}]^\top$. Then, the function
$$\Psi(V, V^{(q)}) = \sum_{j} \sum_{k_1, \ldots, k_m} \Big( \prod_i \varphi_{ik_i} \Big)\, D_M^{\lambda,\gamma}\Big( \big( u_{ik_i} v_{jk_i} / \varphi_{ik_i} \big)_{i=1}^{m},\; x_j \Big) + \frac{1}{4}\alpha \sum_{j,s=1}^n w_{js} \| v_j - v_s \|^2 + \beta \| V \|_{1/2}$$
is an auxiliary function for $\digamma(V)$.
Proof. 
When $V = V^{(q)}$ and $\varphi_{ik_i} = \frac{u_{ik_i} v_{jk_i}}{\sum_k u_{ik} v_{jk}}$, we have
$$\sum_{j} \sum_{k_1, \ldots, k_m} \Big( \prod_i \varphi_{ik_i} \Big)\, D_M^{\lambda,\gamma}\Big( \big( u_{ik_i} v_{jk_i} / \varphi_{ik_i} \big)_{i=1}^{m},\; x_j \Big) = \sum_{j=1}^n D_M^{\lambda,\gamma}\Big( \sum_k v_{jk} u_k,\; x_j \Big).$$
Comparing Equation (12) with Equation (13) shows that $\Psi(V, V) = \digamma(V)$.
As shown in [41], $D_M^{\lambda,\gamma}$ is convex. Applying this convexity for $i = 1, \ldots, m$ one by one, we obtain
$$\sum_{j} \sum_{k_1, \ldots, k_m} \Big( \prod_i \varphi_{ik_i} \Big)\, D_M^{\lambda,\gamma}\Big( \big( u_{ik_i} v_{jk_i} / \varphi_{ik_i} \big)_{i=1}^{m},\; x_j \Big) \ge \sum_{j=1}^n D_M^{\lambda,\gamma}\Big( \sum_k v_{jk} u_k,\; x_j \Big),$$
which implies $\Psi(V, V^{(q)}) \ge \digamma(V)$. Therefore, Lemma 2 is proved. □
Proof. 
(Proof of Theorem 1.) The gradient $\nabla_y D_M^{\lambda,\gamma}(y, x)$ is given by Equation (5). Setting the gradient of $\Psi(V, V^{(q)})$ with respect to $V$ to zero, i.e., $\frac{\partial \Psi(V, V^{(q)})}{\partial v_{jk}} = 0$, we obtain $V^{(q+1)}$:
$$v_{jk} \leftarrow v_{jk}\, \frac{\gamma \sum_s u_{sk} \sum_t T^*_{j,st} / y_{sj} + \alpha \sum_{s \neq j} w_{js} v_{sk}}{\gamma \sum_s u_{sk} + \alpha v_{jk} \sum_{s \neq j} w_{js} + \frac{\beta}{2} v_{jk}^{-1/2}}.$$
Similarly, we can construct $\Psi(U, U^{(q)})$ and apply the same procedure to $U$, which gives
$$u_{ik} \leftarrow u_{ik}\, \frac{\sum_s v_{sk} \sum_t T^*_{s,it} / y_{is}}{\sum_s v_{sk}}.$$
Combining Lemma 1 and Lemma 2, Theorem 1 is obtained. □

4. Experiments

In this section, we compare data-clustering results on two popular datasets against well-known related methods, namely K-means, NMF [45], GNMF [14], and EMDNMF [30], to evaluate the performance of the proposed GSNMF-EMD algorithm.

4.1. Datasets

We evaluated the clustering performance on two widely used datasets, COIL20 and MNIST. The basic information of these two datasets is presented in Table 1.
  • COIL20 (https://www.cs.columbia.edu/CAVE/software/softlib/coil-20.php, accessed on 26 June 2022): The COIL20 dataset contains 1440 grayscale images of 20 objects, that is, 72 images of each object acquired from different angles. The images we use here are resized to 32 × 32.
  • MNIST (http://yann.lecun.com/exdb/mnist/, accessed on 26 June 2022): The MNIST handwritten digit dataset comes from Yann LeCun's web page and contains a training set of 60,000 examples and a test set of 10,000 examples. The size of the images we used here was 28 × 28.

4.2. Evaluation Metric

We used two of the most popular evaluation metrics to assess the clustering performance of the GSNMF-EMD algorithm: clustering accuracy (ACC) and normalized mutual information (NMI). ACC measures the proportion of correctly classified data points in a clustering task, and NMI measures the mutual information between the true labels of the data points and the cluster assignments. These two metrics are commonly used in clustering tasks because they provide an easy-to-understand measure of the quality of the clustering results. In the context of NMF, ACC and NMI are used to evaluate the quality of the clusters produced by the algorithm. ACC compares the clustering label of each sample with the label provided by the dataset and is defined, as in [5,9,11,42], as follows:
$$ACC = \frac{\sum_{i=1}^n \delta(s_i, \mathrm{map}(r_i))}{n},$$
where $\delta(x, y)$ is the delta function, which equals one if $x = y$ and zero otherwise; $n$ is the total number of samples; $s_i$ and $r_i$ are the true label and the obtained cluster label of the $i$-th sample, respectively; and $\mathrm{map}(r_i)$ is the permutation mapping function that maps each cluster label $r_i$ to the equivalent label from the data corpus.
The mutual information $MI(C, C')$ between two sets of clusters $C$ and $C'$ is defined, as in [5,10,42], as follows:
$$MI(C, C') = \sum_{c_i \in C, \, c_j' \in C'} p(c_i, c_j') \log_2 \frac{p(c_i, c_j')}{p(c_i)\, p(c_j')},$$
where $C$ is the set of true labels and $C'$ is the set of clustering labels obtained from a specific clustering algorithm; $p(c_i, c_j')$ denotes the joint probability that an arbitrarily selected document belongs to cluster $c_i$ and cluster $c_j'$ at the same time, whereas $p(c_i)$ and $p(c_j')$ denote the probabilities that a document belongs to cluster $c_i$ or cluster $c_j'$, respectively. The normalized mutual information (NMI) [5,9,11,42] is then expressed as
$$NMI(C, C') = \frac{MI(C, C')}{\max\big( \widetilde{H}(C), \widetilde{H}(C') \big)},$$
where $\widetilde{H}(C)$ and $\widetilde{H}(C')$ are the entropies of $C$ and $C'$, respectively.
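Both metrics can be computed as in the following sketch (our naming); the label mapping in ACC is obtained with the Hungarian algorithm via SciPy's linear_sum_assignment, and NMI is implemented directly from the formulas above.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def clustering_accuracy(true_labels, pred_labels):
    """ACC: best one-to-one mapping of predicted cluster labels to true labels."""
    true_ids, pred_ids = np.unique(true_labels), np.unique(pred_labels)
    overlap = np.zeros((len(pred_ids), len(true_ids)))
    for i, p in enumerate(pred_ids):
        for j, t in enumerate(true_ids):
            overlap[i, j] = np.sum((pred_labels == p) & (true_labels == t))
    row, col = linear_sum_assignment(-overlap)        # maximize the matched samples
    return overlap[row, col].sum() / len(true_labels)

def normalized_mutual_info(true_labels, pred_labels):
    """NMI: mutual information normalized by the larger of the two label entropies."""
    n = len(true_labels)
    true_ids, pred_ids = np.unique(true_labels), np.unique(pred_labels)
    p_true = np.array([np.mean(true_labels == t) for t in true_ids])
    p_pred = np.array([np.mean(pred_labels == p) for p in pred_ids])
    mi = 0.0
    for i, t in enumerate(true_ids):
        for j, p in enumerate(pred_ids):
            joint = np.sum((true_labels == t) & (pred_labels == p)) / n
            if joint > 0:
                mi += joint * np.log2(joint / (p_true[i] * p_pred[j]))
    h_true = -np.sum(p_true * np.log2(p_true))
    h_pred = -np.sum(p_pred * np.log2(p_pred))
    return mi / max(h_true, h_pred)
```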

4.3. Performance Evaluations and Comparisons

To demonstrate the performance of our proposed method in improving image clustering, we compared the GSNMF-EMD algorithm with the following four classical clustering algorithms.
  • K-means: Canonical K-means clustering method (K-means in short).
  • NMF [45]: The original NMF, considered the baseline algorithm; it only imposes non-negativity constraints on the two factor matrices.
  • GNMF [42]: Graph-regularized NMF (GNMF for short) with Euclidean distance. It adds a graph-regularization constraint to NMF, taking into account the geometric structure of the data space. The regularization parameter α was set to 10.
  • EMDNMF [30]: NMF with EMD (EMDNMF for short). We set λ and γ to 100 and 0.1, respectively; meanwhile, we used the 2D distances between pixel locations in the image as the ground metric.
  • GSNMF-EMD: Graph-regularized, $L_{1/2}$ sparsity-constrained NMF with the Earth mover's distance metric (GSNMF-EMD for short). We set λ, γ, α, and β to 100, 0.1, 20, and 1.8, respectively. Meanwhile, we used the 2D distances between pixel locations in the image as the ground metric (a construction sketch is given after this list).
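For the ground metric mentioned in the last two items, the following sketch (our helper names) builds M from the Euclidean distances between 2D pixel locations of an h × w image and normalizes images into histogram columns:

```python
import numpy as np

def pixel_ground_metric(h, w):
    """Ground metric M: Euclidean distance between the 2D locations of every pixel pair."""
    rows, cols = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    coords = np.stack([rows.ravel(), cols.ravel()], axis=1).astype(float)   # (h*w, 2)
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))                                # (h*w, h*w)

def images_to_histograms(images):
    """Flatten images into columns and normalize each column to sum to one."""
    X = images.reshape(images.shape[0], -1).T.astype(float)                 # (pixels, n_images)
    return X / (X.sum(axis=0, keepdims=True) + 1e-12)
```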
In order to demonstrate the clustering performance, we compared our method with the K-means, NMF, GNMF, and EMDNMF methods under the same number of iterations on the same well-known datasets. In all experiments, we used the 0-1 weighting scheme to construct the k-nearest-neighbor graph with $k = 5$. Table 2 and Table 3 show the ACC and NMI results on the COIL20 and MNIST datasets, respectively, for GSNMF-EMD and the four comparison algorithms. For each given cluster number k, 20 test runs of 100 iterations each were conducted on different randomly chosen clusters, and the average performance is reported in Table 2 and Table 3. Figure 1 and Figure 2 show the clustering results for the COIL20 and MNIST datasets in graphical form; these experiments show that our algorithm performs well.
The GSNMF-EMD and GNMF methods are better than the NMF and EMDNMF methods because graph regularization respects the geometric information implied by the dataset. Meanwhile, the $L_{1/2}$ sparsity constraint in GSNMF-EMD can produce more accurate and smoother solutions. In addition, the GSNMF-EMD method has good perceptual similarity and robustness, thereby performing better than GNMF and the other three methods.

5. Parameter Selection

GSNMF-EMD has four essential regularization parameters: λ, γ, α, and β. When both α and β are equal to zero, the model degrades to the EMDNMF method [30]. In this experiment, based on the graph-construction procedure described above, we set the number of nearest neighbors on all datasets to five. As in [14,17,42], it is interesting and meaningful to explore how changes in these four regularization parameters influence the clustering results. Of course, there are many possible combinations of these four parameters, which would require a large number of numerical experiments. In this section, we mainly investigate the sensitivity of λ, γ, α, and β, with the number of iterations set to 100 and the number of clusters set to eight. Figure 3, Figure 4, Figure 5, Figure 6, Figure 7 and Figure 8 show the ACC and NMI of GSNMF-EMD with different λ and γ, or α and β, on COIL20.
As shown in Figure 3, Figure 4, Figure 5, Figure 6, Figure 7 and Figure 8, the performance of the algorithm is influenced by the parameters. It can be seen that the performance of GSNMF-EMD is relatively stable with respect to the parameter β; in general, β can be selected from the range [0.001, 10]. The effect of γ on GSNMF-EMD is small. When the value of α is within [0.1, 1], GSNMF-EMD achieves consistently good performance; however, when α is greater than 10, the performance of our algorithm degrades. When the value of λ is within [1, 10,000], the performance of GSNMF-EMD improves as λ increases. In particular, when λ is greater than 10, the performance of GSNMF-EMD stabilizes because the improvement slows, so the range of λ can be set to [10, 10,000]. Overall, the performance of the GSNMF-EMD algorithm remains relatively stable as the parameters vary, and in most cases, it is more accurate than any other algorithm we tested. Based on the experimental results, we conclude that all the regularization parameters in the algorithm affect the clustering performance; therefore, if certain parameters are not set to reasonable values, the clustering performance may be relatively low.
Figure 9a,b show reconstructions of object images from the COIL20 database and handwritten digits from the MNIST database, respectively. From Figure 9, we can see that our algorithm has good reconstruction performance.

6. Conclusions

We proposed a new algorithm, called graph-regularized, $L_{1/2}$ sparsity-constrained NMF with the Earth mover's distance metric (GSNMF-EMD). GSNMF-EMD builds on the geometric structure of the data distribution and on sparsity constraints, incorporating them into EMDNMF as additional regularization terms. The experimental results showed that it has more clustering power than many NMF-based algorithms. Meanwhile, the results also showed that GSNMF-EMD is more stable and works more accurately and rapidly than the original EMDNMF.
In the end, there are two issues that may lead to interesting future work. Firstly, the normalization assumption on the histograms is not essential for GSNMF-EMD, because the approximate EMD formulation used here is not limited to the normalized case. Therefore, GSNMF-EMD has the potential to deal with occlusions, which is an important problem for local descriptors. Secondly, using our algorithm for hyperspectral analysis is also a promising research direction. Finally, as indicated in [29], GSNMF-EMD can also be modeled as a network flow problem, which would be a worthwhile line of further research.

Author Contributions

Conceptualization, S.L. and Q.L.; methodology, S.L. and L.L.; software, Q.L. and Z.C.; writing—original draft preparation, S.L.; writing—review and editing, Q.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was partially funded by the National Natural Science Foundation of China under Grants 12161020 and 12061025; and partially funded by the Natural Science Foundation of the Educational Commission of Guizhou Province under Grant Qian-Jiao-He KY Zi [2021]298, and Guizhou Provincial Science and Technology Projects (QKHJC-ZK[2023]YB245), GYU-KYZ(2019-2020)PT06-04, under the Guiyang Municipal Bureau of Science and Technology (No. K1930000701225).

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Shahnaz, F.; Berry, M.W.; Pauca, V.P.; Plemmons, R.J. Document clustering using nonnegative matrix factorization. Inf. Process Manag. 2006, 42, 373–386. [Google Scholar] [CrossRef]
  2. Pei, X.; Wu, T.; Chen, C. Automated graph regularized projective nonnegative matrix factorization for document clustering. IEEE Trans. Cybern. 2014, 44, 1821–1831. [Google Scholar]
  3. Chen, Z.; Li, L.; Peng, H.; Liu, Y.; Yang, Y. Attributed community mining using joint general non-negative matrix factorization with graph Laplacian. Phys. A 2018, 495, 324–335. [Google Scholar] [CrossRef]
  4. Yang, J.; Yang, S.; Fu, Y.; Li, X.; Huang, T. Non-negative graph embedding. In Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition, Anchorage, Alaska, 23–28 June 2008; pp. 1–8. [Google Scholar]
  5. Dai, X.; Su, X.; Zhang, W.; Xue, F.; Li, H. Robust Manhattan non-negative matrix factorization for image recovery and representation. Inf. Sci. 2020, 527, 70–87. [Google Scholar] [CrossRef]
  6. Li, Z.; Tang, J.; He, X. Robust structured nonnegative matrix factorization for image representation. IEEE Trans. Neural. Netw. Learn. Syst. 2017, 29, 1947–1960. [Google Scholar] [CrossRef]
  7. Gong, M.; Jiang, X.; Li, H.; Tan, K.C. Multiobjective sparse non-negative matrix factorization. IEEE Trans. Cybern. 2018, 49, 2941–2954. [Google Scholar] [CrossRef]
  8. Guan, N.; Tao, D.; Luo, Z.; Yuan, B. Online nonnegative matrix factorization with robust stochastic approximation. IEEE Trans. Neural. Netw. Learn. Syst. 2012, 23, 1087–1099. [Google Scholar] [CrossRef]
  9. Chen, Z.; Li, L.; Peng, H.; Liu, Y.; Yang, Y. A novel digital watermarking based on general non-negative matrix factorization. IEEE Trans. Multimedia 2018, 20, 1973–1986. [Google Scholar] [CrossRef]
  10. Chen, Z.; Li, L.; Peng, H.; Liu, Y.; Zhu, H.; Yang, Y. Sparse general non-negative matrix factorization based on left semi-tensor product. IEEE Access. 2019, 7, 81599–81611. [Google Scholar] [CrossRef]
  11. Fu, X.; Huang, K.; Sidiropoulos, N.D.; Ma, W.K. Nonnegative matrix factorization for signal and data analytics: Identifiability, algorithms, and applications. IEEE Signal. Proc. Mag. 2019, 36, 59–80. [Google Scholar] [CrossRef]
  12. Vaswani, N.; Bouwmans, T.; Javed, S.; Narayanamurthy, P. Robust subspace learning: Robust PCA, robust subspace tracking, and robust subspace recovery. IEEE Signal. Proc. Mag. 2018, 35, 32–55. [Google Scholar] [CrossRef]
  13. Hoyer, P.O. Nonnegative matrix factorization with sparseness constraints. J. Mach. Learn. Res. 2004, 5, 1457–1469. [Google Scholar]
  14. Qian, W.; Hong, B.; Cai, D.; He, X.; Li, X. Non-Negative Matrix Factorization with Sinkhorn Distance. In Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence (IJCAI-16), New York, NY, USA, 9–15 July 2016; pp. 1960–1966. [Google Scholar]
  15. He, W.; Zhang, H.; Zhang, L. Sparsity-regularized robust non-negative matrix factorization for hyperspectral unmixing. IEEE J. Sel. Top. Appl. Earth. Obs. Remote. Sens. 2016, 9, 4267–4279. [Google Scholar] [CrossRef]
  16. Kong, D.; Ding, C.; Huang, H. Robust nonnegative matrix factorization using l21-norm. In Proceedings of the 20th ACM International Conference on Information and Knowledge Management, Glasgow, UK, 24–28 October 2011; pp. 673–682. [Google Scholar]
  17. Huang, Q.; Yin, X.; Chen, S.; Wang, Y.; Chen, B. Robust nonnegative matrix factorization with structure regularization. Neurocomputing 2020, 412, 72–90. [Google Scholar] [CrossRef]
  18. Pan, J.; Ng, M.K. Orthogonal nonnegative matrix factorization by sparsity and nuclear norm optimization. SIAM J. Matrix Anal. Appl. 2018, 39, 856–875. [Google Scholar] [CrossRef]
  19. Luo, X.; Zhou, M.; Xia, Y.; Zhu, Q. An efficient non-negative matrix-factorization-based approach to collaborative filtering for recommender systems. IEEE Trans. Industr. Inform. 2014, 10, 1273–1284. [Google Scholar]
  20. Sun, H.C.; Yang, J. The Generalization of Non-Negative Matrix Factorization Based on Algorithmic Stability. Electronics 2023, 12, 1147. [Google Scholar] [CrossRef]
  21. Venkatesan, R.C.; Plastino, A. Deformed statistics Kullback—Leibler divergence minimization within a scaled Bregman framework. Phys. Lett. A 2011, 375, 4237–4243. [Google Scholar] [CrossRef]
  22. He, R.; Zheng, W.S.; Hu, B.G. Maximum correntropy criterion for robust face recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2010, 33, 1561–1576. [Google Scholar]
  23. Wang, J.J.Y.; Wang, X.; Gao, X. Non-negative matrix factorization by maximizing correntropy for cancer clustering. BMC Bioinform. 2013, 14, 1–11. [Google Scholar] [CrossRef]
  24. Wang, Y.; Pan, C.; Xiang, S.; Zhu, F. Robust hyperspectral unmixing with correntropy-based metric. IEEE Trans. Image Process. 2015, 24, 4027–4040. [Google Scholar] [CrossRef]
  25. Peng, S.; Ser, W.; Lin, Z.; Chen, B. Robust sparse nonnegative matrix factorization based on maximum correntropy criterion. In Proceedings of the IEEE International Symposium on Circuits and Systems, Florence, Italy, 27–30 May 2018; pp. 1–5. [Google Scholar]
  26. Yu, N.; Wu, M.J.; Liu, J.X.; Zheng, C.H.; Xu, Y. Correntropy-based hypergraph regularized NMF for clustering and feature selection on multi-cancer integrated data. IEEE Trans. Cybern. 2020, 51, 3952–3963. [Google Scholar] [CrossRef] [PubMed]
  27. Pele, O.; Werman, M. Fast and robust earth mover’s distances. In Proceedings of the 2009 IEEE 12th International Conference on Computer Vision, Kyoto, Japan, 29 September–2 October 2009; pp. 460–467. [Google Scholar]
  28. Ling, H.; Okada, K. An efficient earth mover’s distance algorithm for robust histogram comparison. IEEE Trans. Pattern Anal. Mach. Intell. 2007, 29, 840–853. [Google Scholar] [CrossRef]
  29. Rubner, Y.; Tomasi, C.; Guibas, L.J. The earth mover’s distance as a metric for image retrieval. Int. J. Comput. Vision. 2000, 40, 99–121. [Google Scholar] [CrossRef]
  30. Sandler, R.; Lindenbaum, M. Nonnegative matrix factorization with earth mover’s distance metric. In Proceedings of the Computer Vision and Pattern Recognition Conference, Miami, FL, USA, 20–25 June 2009; pp. 1873–1880. [Google Scholar]
  31. Sandler, R.; Lindenbaum, M. Nonnegative matrix factorization with earth mover’s distance metric for image analysis. IEEE Trans. Pattern Anal. Mach. Intell. 2011, 33, 1590–1602. [Google Scholar] [CrossRef]
  32. Atasu, K.; Mittelholzer, T. Linear-complexity data-parallel earth mover’s distance approximations. ICML 2019, 97, 364–373. [Google Scholar]
  33. Qu, Z.W.; Yang, J.C.; Lang, Y.S.; Wang, Y.J.; Han, X.M.; Guo, X.Y. Earth Mover Distance based detection of false data injection attacks in smart grids. Energies 2022, 15, 1733. [Google Scholar] [CrossRef]
  34. Lu, X.; Wu, H.; Yuan, Y.; Yan, P.; Li, X. Manifold regularized sparse NMF for hyperspectral unmixing. IEEE Trans. Geosci. Remote 2012, 51, 2815–2826. [Google Scholar] [CrossRef]
  35. Qian, Y.; Jia, S.; Zhou, J.; Robles-Kelly, A. Hyperspectral unmixing via L1/2 sparsity-constrained nonnegative matrix factorization. IEEE Trans. Geosci. Remote 2011, 49, 4282–4297. [Google Scholar] [CrossRef]
  36. Xu, Z.B.; Zhang, H.; Wang, Y.; Xu, Y.; Yong, L. L1/2 regularizer. Sci. China Inf. Sci. 2010, 53, 1159–1169. [Google Scholar] [CrossRef]
  37. Wang, W.; Qian, Y.; Tang, Y.Y. Hypergraph-regularized sparse NMF for hyperspectral unmixing. IEEE J. Sel. Topics Appl. Earth Obs. Remote Sens. 2016, 9, 681–694. [Google Scholar] [CrossRef]
  38. Bioucas-Dias, J.M.; Plaza, A.; Dobigeon, N.; Parente, M.; Du, Q.; Gader, P.; Chanussot, J. Hyperspectral unmixing overview: Geometrical, statistical, and sparse regression-based approaches. IEEE J. Sel. Topics Appl. Earth Obs. Remote Sens. 2012, 5, 354–379. [Google Scholar] [CrossRef]
  39. Werman, M.; Peleg, S.; Rosenfeld, A. A distance metric for multidimensional histograms. Comput. Vis. Image. Underst. 1985, 32, 328–336. [Google Scholar] [CrossRef]
  40. Cuturi, M. Sinkhorn distances: Lightspeed computation of optimal transport. Proc. Adv. Neural Inf. Process. Syst. 2013, 26, 2292–2300. [Google Scholar]
  41. Frogner, C.; Zhang, C.; Mobahi, H.; Araya, M.; Poggio, T.A. Learning with a Wasserstein loss. Proc. Adv. Neural Inf. Process. Syst. 2015, 28, 2044–2052. [Google Scholar]
  42. Cai, D.; He, X.; Han, J.; Huang, T.S. Graph regularized nonnegative matrix factorization for data representation. IEEE Trans. Pattern Anal. Mach. Intell. 2010, 33, 1548–1560. [Google Scholar]
  43. Chung, F.R. Spectral Graph Theory; American Mathematical Society: Providence, RI, USA, 1997. [Google Scholar]
  44. Belkin, M.; Niyogi, P. Laplacian eigenmaps and spectral techniques for embedding and clustering. In Proceedings of the Advances in Neural Information Processing Systems, Denver, CO, USA, 3–6 November 2001; Volume 14, pp. 585–591. [Google Scholar]
  45. Lee, D.; Seung, H.S. Algorithms for non-negative matrix factorization. In Proceedings of the International Conference on Neural Information Processing Systems, Denver, CO, USA, 28–30 November 2000; Volume 13, pp. 556–562. [Google Scholar]
Figure 1. Clustering performance comparisons on the COIL20 dataset.
Figure 2. Clustering performance comparisons on the MNIST dataset.
Figure 3. With fixed λ = 100 and γ = 1, ACC and NMI of GSNMF-EMD with different α and β and the same number of clusters on the COIL20 dataset.
Figure 4. With fixed λ = 100 and β = 2, ACC and NMI of GSNMF-EMD with different α and γ and the same number of clusters on the COIL20 dataset.
Figure 5. With fixed α = 10 and β = 2, ACC and NMI of GSNMF-EMD with different γ and λ and the same number of clusters on the COIL20 dataset.
Figure 6. With fixed γ = 1 and β = 2, ACC and NMI of GSNMF-EMD with different α and λ and the same number of clusters on the COIL20 dataset.
Figure 7. With fixed α = 10 and γ = 1, ACC and NMI of GSNMF-EMD with different β and λ and the same number of clusters on the COIL20 dataset.
Figure 8. With fixed α = 10 and λ = 100, ACC and NMI of GSNMF-EMD with different β and γ and the same number of clusters on the COIL20 dataset.
Figure 9. Some of the reconstructed images. (a): Reconstructions of object images from the COIL20 database. (b): Reconstructions of handwritten digits from the MNIST database.
Table 1. Statistics of the datasets used in our experiment.

Data Sets    Size      Features    Classes
COIL20       1440      1024        20
MNIST        70,000    784         10
Table 2. Clustering performance on the COIL20 dataset.

        Accuracy (%)                                        Normalized Mutual Information (%)
k       K-Means  NMF      GNMF     EMD-NMF  GSNMF-EMD       K-Means  NMF      GNMF     EMD-NMF  GSNMF-EMD
4       53.819   52.083   67.361   54.514   96.528          33.341   32.487   61.494   33.366   92.384
5       60.556   61.944   63.056   58.611   73.333          49.56    51.166   66.436   49.325   67.782
6       90.972   93.519   97.917   85.185   97.685          85.280   89.561   95.585   78.465   95.452
7       69.643   63.294   76.786   60.317   88.095          63.951   63.951   79.354   58.235   84.485
8       73.438   65.104   88.021   52.951   87.847          72.469   73.52    91.678   58.092   91.672
9       65.741   63.889   90.432   61.111   89.969          66.802   63.611   87.646   58.933   87.821
10      67.222   73.611   93.611   75.972   83.472          75.336   75.768   93.049   75.691   87.653
11      57.828   66.793   77.652   62.247   77.778          62.304   67.408   83.464   64.313   87.510
12      67.708   64.583   79.282   62.847   79.630          74.297   68.273   86.313   67.503   88.222
13      72.543   70.192   80.021   66.026   80.449          75.162   74.933   87.055   71.762   86.964
14      69.048   66.468   78.968   55.456   85.218          75.613   73.247   87.700   64.102   89.273
15      61.667   65.556   78.889   58.889   77.500          71.048   73.327   85.515   65.449   84.420
16      63.542   63.368   79.861   59.983   79.688          71.752   72.15    85.899   68.569   84.599
17      63.235   61.601   73.611   59.150   72.876          72.187   71.148   86.209   69.008   86.298
18      67.130   55.247   75.926   53.781   78.086          75.907   68.560   87.027   66.846   86.954
19      59.137   62.792   77.485   54.605   79.313          73.307   72.659   87.899   67.241   87.061
20      67.083   63.403   75.903   55.764   73.958          76.674   72.049   87.108   66.723   86.653
Avg.    66.489   65.497   79.693   61.024   82.437          69.117   68.460   84.672   63.743   86.777
Table 3. Clustering performance on the MNIST dataset.

        Accuracy (%)                                        Normalized Mutual Information (%)
k       K-Means  NMF      GNMF     EMD-NMF  GSNMF-EMD       K-Means  NMF      GNMF     EMD-NMF  GSNMF-EMD
2       20.082   20.902   19.945   19.809   20.219          10.099   10.051   14.094   10.878   14.992
3       30.601   24.772   27.140   25.319   30.328          25.187   13.428   20.438   12.115   25.352
4       39.481   27.254   41.120   24.658   41.120          35.416   19.965   42.834   17.543   39.852
5       39.071   30.710   35.191   26.721   43.770          32.256   26.791   40.075   19.289   37.957
6       35.246   33.197   54.508   36.566   52.505          35.468   31.005   50.292   26.079   47.111
7       44.223   41.491   48.673   44.145   59.133          41.302   38.712   53.542   33.950   57.559
8       50.615   46.619   50.854   38.251   48.770          46.927   44.749   54.334   32.038   48.106
9       54.614   50.213   64.511   48.482   64.359          51.401   44.901   65.380   40.234   59.547
Avg.    39.242   34.395   42.743   32.994   45.026          34.757   28.700   42.623   24.016   41.310
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
