1. Introduction
In recent years, the problem of recovering an unknown low-rank matrix from a limited number of observed entries has been actively studied in many scientific applications, such as the Netflix problem [1], image processing [2], system identification [3], video denoising [4], signal processing [5], subspace learning [6], and so on. Mathematically, this problem can be modeled as the following low-rank matrix completion problem:
$$\min_{X\in\mathbb{R}^{m\times n}}\ \operatorname{rank}(X)\quad \text{s.t.}\quad X_{ij}=M_{ij},\ (i,j)\in\Omega, \qquad (1)$$
where $M\in\mathbb{R}^{m\times n}$ is the matrix to be recovered and $\Omega\subseteq\{1,\dots,m\}\times\{1,\dots,n\}$ is the set of indices of observed entries. Without loss of generality, throughout this paper, we assume that $m\ge n$. If we summarize the observed entries via $P_\Omega(M)$, where the projection $P_\Omega:\mathbb{R}^{m\times n}\to\mathbb{R}^{m\times n}$ is defined by
$$[P_\Omega(X)]_{ij}=\begin{cases} X_{ij}, & (i,j)\in\Omega,\\ 0, & \text{otherwise},\end{cases}$$
the low-rank matrix completion problem (1) can be rewritten as
$$\min_{X\in\mathbb{R}^{m\times n}}\ \operatorname{rank}(X)\quad \text{s.t.}\quad P_\Omega(X)=P_\Omega(M). \qquad (2)$$
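As a quick illustration (not from the original paper), the sampling operator $P_\Omega$ can be implemented as a boolean mask; the function name and the toy data below are illustrative only.

```python
import numpy as np

def P_Omega(X, mask):
    """Keep the entries of X indexed by Omega (mask == True); zero out the rest."""
    return np.where(mask, X, 0.0)

# Toy example: a 2 x 3 matrix with three observed entries.
M = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
mask = np.array([[True, False, True],
                 [False, True, False]])
print(P_Omega(M, mask))
```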
Unfortunately, the problem (2) is NP-hard, and all known algorithms for solving it exactly require time doubly exponential in the dimension of the matrix, in both theory and practice [1,7]. To overcome this difficulty, many researchers (e.g., [1,3,5,8,9,10]) have suggested relaxing the rank function $\operatorname{rank}(X)$ to the nuclear norm $\|X\|_*$, which leads to the following nuclear norm minimization problem:
$$\min_{X\in\mathbb{R}^{m\times n}}\ \|X\|_*\quad \text{s.t.}\quad P_\Omega(X)=P_\Omega(M), \qquad (3)$$
where $\|X\|_*:=\sum_{i=1}^{n}\sigma_i(X)$ denotes the nuclear norm of the matrix $X$, and $\sigma_i(X)$, $i=1,\dots,n$, are the singular values of $X$.
The nuclear norm $\|X\|_*$, as a convex relaxation of the rank function $\operatorname{rank}(X)$, can be considered the best convex approximation of the rank function. In theory, Candès et al. [1,11] have proved that, if the observed entries are selected uniformly at random, the unknown low-rank matrix can be exactly recovered with high probability by solving the problem (3). As a convex optimization problem, the problem (3) can be solved by semidefinite programming solvers such as SeDuMi [12] and SDPT3 [13]. However, these solvers are unsuitable for large-scale problems due to their high computational cost and memory requirements. Different from the semidefinite programming approach, the regularized version of the problem (3), i.e.,
$$\min_{X\in\mathbb{R}^{m\times n}}\ \lambda\|X\|_*+\frac{1}{2}\|P_\Omega(X)-P_\Omega(M)\|_F^2, \qquad (4)$$
has been frequently studied in the literature for large-scale matrix completion, where $\lambda>0$ is a regularization parameter that balances the data-fitting error against the nuclear norm. Many algorithms have been proposed to solve the problem (4). The most popular ones include the singular value thresholding (SVT) algorithm [8], the accelerated proximal gradient (APG) algorithm [14], the linearized augmented Lagrangian and alternating direction methods [15], and so on. All of these algorithms rely on the soft thresholding operator [16] to recover the unknown low-rank matrix. Although the problem (4) possesses many algorithmic advantages, these iterative soft thresholding algorithms shrink all the singular values of the matrix by the same amount, which introduces a bias and sometimes results in over-penalization, just as the $\ell_1$-norm does in compressed sensing.
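For concreteness, the following sketch (not from the paper) evaluates the regularized objective in (4), assuming it takes the standard form $\lambda\|X\|_*+\frac{1}{2}\|P_\Omega(X)-P_\Omega(M)\|_F^2$ as reconstructed above; the function names are illustrative.

```python
import numpy as np

def nuclear_norm(X):
    """Nuclear norm of X: the sum of its singular values."""
    return np.linalg.svd(X, compute_uv=False).sum()

def regularized_objective(X, M, mask, lam):
    """Objective of (4), assuming the form lam * ||X||_* + 0.5 * ||P_Omega(X - M)||_F^2."""
    residual = np.where(mask, X - M, 0.0)
    return lam * nuclear_norm(X) + 0.5 * np.linalg.norm(residual, "fro") ** 2
```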
To reduce this bias, in this paper we design a new thresholding operator, called the designed thresholding operator, which introduces less bias than the soft thresholding operator when recovering low-rank matrices. Numerical experiments on several low-rank matrix completion problems show that the proposed thresholding operator recovers low-rank matrices efficiently.
This paper is organized as follows. In Section 2, we give the definition of the designed thresholding operator and study some of its properties. In Section 3, an iterative thresholding algorithm based on the designed thresholding operator is developed to recover low-rank matrices. In Section 4, some numerical experiments are presented to verify the effectiveness of the proposed algorithm. Finally, we draw some conclusions in Section 5.
2. Designed Thresholding Operator
In this section, we first review the classical soft thresholding operator [16], and then introduce the new designed thresholding operator for recovering low-rank matrices.
2.1. Soft Thresholding Operator
For any fixed $\lambda>0$ and $t\in\mathbb{R}$, the soft thresholding operator [16] is defined as
$$S_\lambda(t)=\operatorname{sign}(t)\max\{|t|-\lambda,\,0\},$$
which can be explained by the fact that it is the proximity operator of $\lambda|\cdot|$, i.e.,
$$S_\lambda(t)=\arg\min_{x\in\mathbb{R}}\left\{\lambda|x|+\frac{1}{2}(x-t)^2\right\}.$$
The soft thresholding operator plays a significant role in solving the problem (4). However, it is a biased estimator and sometimes results in over-penalization. The behavior of the soft thresholding operator is plotted in Figure 1, from which we can see that, for $|t|>\lambda$, the bias between $t$ and the soft thresholding operator $S_\lambda(t)$ equals $\lambda$.
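A minimal NumPy sketch of the soft thresholding operator (not from the paper), illustrating that the bias equals $\lambda$ for every $|t|>\lambda$; the function name is illustrative.

```python
import numpy as np

def soft_threshold(t, lam):
    """Soft thresholding operator: sign(t) * max(|t| - lam, 0)."""
    return np.sign(t) * np.maximum(np.abs(t) - lam, 0.0)

t = np.array([-3.0, -0.5, 0.0, 0.8, 2.0, 10.0])
s = soft_threshold(t, lam=1.0)
print(np.abs(t) - np.abs(s))   # equals lam (= 1.0) wherever |t| > lam, i.e. the bias
```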
2.2. Designed Thresholding Operator
To reduce the bias, in this subsection we design a new thresholding operator, namely, the designed thresholding operator, to recover low-rank matrices.
Definition 1 (Designed thresholding operator). For any fixed $\lambda>0$, $c>0$ and $t\in\mathbb{R}$, the designed thresholding operator $d_{\lambda,c}(t)$ is defined as
According to Definition 1, we immediately obtain the following Property 1, which shows that the designed thresholding operator introduces less bias for the larger coefficients than the soft thresholding operator does.
Property 1. For any fixed $\lambda>0$ and $c>0$, the bias $|t|-|d_{\lambda,c}(t)|$ approaches zero as the magnitude of $t$ increases.
Proof. Without loss of generality, we assume . Then, we have for any , and for any . It is easy to verify that if . The proof is thus complete. □
In addition, we can also observe that the designed thresholding operator $d_{\lambda,c}$ approximates the soft thresholding operator $S_\lambda$ in a limiting case of the parameter $c$. The behaviors of the designed thresholding operator for several values of $c$ with $\lambda$ fixed are plotted in Figure 2. By varying the parameter $c$, the designed thresholding operator can apply less bias to the larger coefficients.
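The exact formula of the designed thresholding operator is given in Definition 1 and is not reproduced here. Purely as an illustration of a thresholding rule whose bias vanishes for large $|t|$, the sketch below implements the classical firm thresholding rule; it is a stand-in, not the operator proposed in this paper.

```python
import numpy as np

def firm_threshold(t, lam, mu):
    """Firm thresholding (a classical low-bias rule, used here ONLY as a stand-in
    for the designed thresholding operator of Definition 1): zero for |t| <= lam,
    linear interpolation for lam < |t| < mu, and the identity for |t| >= mu (mu > lam)."""
    t = np.asarray(t, dtype=float)
    mid = np.sign(t) * mu * (np.abs(t) - lam) / (mu - lam)
    return np.where(np.abs(t) <= lam, 0.0, np.where(np.abs(t) >= mu, t, mid))

t = np.linspace(-4.0, 4.0, 9)
print(firm_threshold(t, lam=1.0, mu=2.0))   # no bias at all once |t| >= mu
```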
According to ([17], Theorem 1), we immediately obtain the following Lemma 1, which shows that the designed thresholding operator is in fact the proximal mapping of a non-convex penalty function.
Lemma 1. The designed thresholding operator is the proximal mapping of a penalty function $g$, i.e.,
$$d_{\lambda,c}(t)=\arg\min_{x\in\mathbb{R}}\left\{g(x)+\frac{1}{2}(x-t)^2\right\},$$
where $g$ is even, nondecreasing and continuous on $[0,\infty)$, differentiable on $(0,\infty)$, and nondifferentiable at $0$. Moreover, $g$ is concave on $[0,\infty)$ and satisfies the triangle inequality.
Definition 2 (Matrix designed thresholding operator). Given a matrix $X\in\mathbb{R}^{m\times n}$, let $X=U\Sigma V^{\top}$ be the singular value decomposition (SVD) of $X$, where $U\in\mathbb{R}^{m\times m}$ and $V\in\mathbb{R}^{n\times n}$ are unitary matrices, $\sigma_i(X)$ is the $i$-th largest singular value of $X$, and $\Sigma\in\mathbb{R}^{m\times n}$ consists of the diagonal matrix of the singular values of $X$ arranged in descending order ($\sigma_1(X)\ge\sigma_2(X)\ge\cdots\ge\sigma_n(X)\ge 0$) together with a zero block. For any fixed $\lambda>0$ and $c>0$, the matrix designed thresholding operator, denoted by $D_{\lambda,c}(\cdot)$, is defined as
$$D_{\lambda,c}(X)=U\Sigma_{\lambda,c}V^{\top},$$
where $\Sigma_{\lambda,c}$ is obtained from $\Sigma$ by applying the scalar operator $d_{\lambda,c}$ of Definition 1 to each singular value on its diagonal.
For any matrix $B\in\mathbb{R}^{m\times n}$, define the function $f_B:\mathbb{R}^{m\times n}\to\mathbb{R}$ as
$$f_B(X)=\frac{1}{2}\|X-B\|_F^2+\sum_{i=1}^{n}g(\sigma_i(X)),$$
where $g$ is the penalty function given in Lemma 1. Then we can get the following result.
Theorem 1. For any fixed $\lambda>0$ and $c>0$, suppose that $X^*$ is the optimal solution of the problem
$$\min_{X\in\mathbb{R}^{m\times n}}\ f_B(X). \qquad (11)$$
Then $X^*$ can be expressed as $X^*=D_{\lambda,c}(B)$.
Before we give the proof of Theorem 1, we need to prepare the following Lemma 2, which plays the key role in proving Theorem 1.
Lemma 2 (von Neumann's trace inequality). For any matrices $X,B\in\mathbb{R}^{m\times n}$,
$$\operatorname{tr}(X^{\top}B)\le\sum_{i=1}^{n}\sigma_i(X)\sigma_i(B),$$
where $\sigma_i(X)$ and $\sigma_i(B)$ are the singular values of $X$ and $B$, respectively. The equality holds if and only if there exist unitary matrices $U$ and $V$ such that $X=U\Sigma_X V^{\top}$ and $B=U\Sigma_B V^{\top}$ are the SVDs of $X$ and $B$, simultaneously, where $\Sigma_X$ and $\Sigma_B$ are the corresponding diagonal matrices of singular values.
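As a quick numerical sanity check of Lemma 2 (not part of the paper), one can verify the inequality on random matrices:

```python
import numpy as np

rng = np.random.default_rng(0)
for _ in range(5):
    X = rng.standard_normal((6, 4))
    B = rng.standard_normal((6, 4))
    lhs = np.trace(X.T @ B)
    # numpy returns singular values sorted in descending order.
    rhs = np.sum(np.linalg.svd(X, compute_uv=False) * np.linalg.svd(B, compute_uv=False))
    assert lhs <= rhs + 1e-12
    print(f"tr(X^T B) = {lhs: .4f}  <=  {rhs: .4f}")
```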
Now, we give the proof of Theorem 1.
Proof (of Theorem 1). By Lemma 2, we have
$$\|X-B\|_F^2=\|X\|_F^2-2\operatorname{tr}(X^{\top}B)+\|B\|_F^2\ \ge\ \sum_{i=1}^{n}\bigl(\sigma_i(X)-\sigma_i(B)\bigr)^2.$$
Notice that the equality holds if and only if $X$ and $B$ share the same left and right unitary matrices in their SVDs; we may therefore assume that $X=U\Sigma_X V^{\top}$ and $B=U\Sigma_B V^{\top}$ are the SVDs of $X$ and $B$, simultaneously. Therefore, it holds that
$$f_B(X)=\sum_{i=1}^{n}\left[\frac{1}{2}\bigl(\sigma_i(X)-\sigma_i(B)\bigr)^2+g(\sigma_i(X))\right],$$
and the problem (11) reduces to
$$\min_{\sigma_1(X),\dots,\sigma_n(X)\ge 0}\ \sum_{i=1}^{n}\left[\frac{1}{2}\bigl(\sigma_i(X)-\sigma_i(B)\bigr)^2+g(\sigma_i(X))\right]. \qquad (13)$$
The objective function in (13) is separable; hence, solving the problem (13) is equivalent to solving the following $n$ problems: for each $i=1,\dots,n$,
$$\min_{\sigma_i(X)\ge 0}\ \frac{1}{2}\bigl(\sigma_i(X)-\sigma_i(B)\bigr)^2+g(\sigma_i(X)). \qquad (14)$$
Let $\sigma_i(X^*)$ be the optimal solution of the problem (14); by Lemma 1, $\sigma_i(X^*)$ can be expressed as
$$\sigma_i(X^*)=d_{\lambda,c}(\sigma_i(B)),\quad i=1,\dots,n.$$
Therefore, we can get the optimal solution $X^*$ of the problem (11) as follows:
$$X^*=U\Sigma_{\lambda,c}V^{\top},$$
where $\Sigma_{\lambda,c}$ is obtained from $\Sigma_B$ by replacing each $\sigma_i(B)$ with $d_{\lambda,c}(\sigma_i(B))$, namely, $X^*=D_{\lambda,c}(B)$. This completes the proof. □
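To make the construction in Definition 2 and Theorem 1 concrete, the following generic sketch (not from the paper) applies an arbitrary scalar thresholding rule to the singular values of a matrix; the scalar designed thresholding operator $d_{\lambda,c}$ of Definition 1 would be passed in as `scalar_op`, and soft thresholding is used below only as a placeholder.

```python
import numpy as np

def matrix_threshold(B, scalar_op):
    """Apply a scalar thresholding rule to the singular values of B:
    B = U diag(sigma) V^T  ->  U diag(scalar_op(sigma)) V^T (cf. Definition 2)."""
    U, sigma, Vt = np.linalg.svd(B, full_matrices=False)
    return (U * scalar_op(sigma)) @ Vt

# Placeholder scalar rule (soft thresholding); Definition 1's d_{lambda,c} would go here.
B = np.random.default_rng(1).standard_normal((5, 4))
X_star = matrix_threshold(B, lambda s: np.maximum(s - 0.5, 0.0))
print(np.linalg.matrix_rank(X_star))
```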
3. Iterative Matrix Designed Thresholding Algorithm
In this section, we present an iterative thresholding algorithm to recover the low-rank matrices using the proposed designed thresholding operator.
Consider the following minimization problem:
$$\min_{X\in\mathbb{R}^{m\times n}}\ \sum_{i=1}^{n}g(\sigma_i(X))+\frac{1}{2}\|P_\Omega(X)-P_\Omega(M)\|_F^2, \qquad (17)$$
where $g$ is the penalty function given in Lemma 1. We can verify that the optimal solution of the problem (17) can be analytically expressed by the designed thresholding operator.
For any fixed $\lambda>0$, $c>0$ and $Z\in\mathbb{R}^{m\times n}$, let
$$C(X)=\sum_{i=1}^{n}g(\sigma_i(X))+\frac{1}{2}\|P_\Omega(X)-P_\Omega(M)\|_F^2 \qquad (18)$$
and
$$C(X,Z)=C(X)+\frac{1}{2}\|X-Z\|_F^2-\frac{1}{2}\|P_\Omega(X)-P_\Omega(Z)\|_F^2. \qquad (19)$$
The function $C(X)$ defined in (18) is the objective function of the problem (17), and the function $C(X,Z)$ defined in (19) is a surrogate function of $C(X)$. Clearly, $C(X,X)=C(X)$. In view of (19), we expect to minimize the problem (17) by minimizing the surrogate function $C(X,Z)$.
Lemma 3. For any fixed $\lambda>0$, $c>0$ and $Z\in\mathbb{R}^{m\times n}$, if $X_s$ is the minimizer of $C(\cdot,Z)$ on $\mathbb{R}^{m\times n}$, then $X_s$ satisfies
$$X_s=D_{\lambda,c}\bigl(Z+P_\Omega(M)-P_\Omega(Z)\bigr).$$
Proof. By definition, the function $C(X,Z)$ can be rewritten as
$$C(X,Z)=\frac{1}{2}\bigl\|X-\bigl(Z+P_\Omega(M)-P_\Omega(Z)\bigr)\bigr\|_F^2+\sum_{i=1}^{n}g(\sigma_i(X))+T(Z),$$
where $T(Z)$ collects the terms that do not depend on $X$. This means that, for any fixed $\lambda>0$, $c>0$ and $Z\in\mathbb{R}^{m\times n}$, minimizing the function $C(\cdot,Z)$ on $\mathbb{R}^{m\times n}$ is equivalent to solving the following minimization problem:
$$\min_{X\in\mathbb{R}^{m\times n}}\ \frac{1}{2}\bigl\|X-\bigl(Z+P_\Omega(M)-P_\Omega(Z)\bigr)\bigr\|_F^2+\sum_{i=1}^{n}g(\sigma_i(X)).$$
By Theorem 1, the minimizer $X_s$ of $C(\cdot,Z)$ on $\mathbb{R}^{m\times n}$ can be expressed as
$$X_s=D_{\lambda,c}\bigl(Z+P_\Omega(M)-P_\Omega(Z)\bigr),$$
which completes the proof. □
Lemma 4. For any fixed $\lambda>0$ and $c>0$, if $X^*$ is the optimal solution of the problem (17), then $X^*$ also solves the minimization problem
$$\min_{X\in\mathbb{R}^{m\times n}}\ C(X,X^*),$$
that is, for any $X\in\mathbb{R}^{m\times n}$, $C(X^*,X^*)\le C(X,X^*)$.
Proof. Since $\|P_\Omega(X)-P_\Omega(X^*)\|_F\le\|X-X^*\|_F$, we have
$$C(X,X^*)=C(X)+\frac{1}{2}\|X-X^*\|_F^2-\frac{1}{2}\|P_\Omega(X)-P_\Omega(X^*)\|_F^2\ \ge\ C(X).$$
Therefore, we can get that
$$C(X,X^*)\ \ge\ C(X)\ \ge\ C(X^*)=C(X^*,X^*),$$
which completes the proof. □
By Lemmas 3 and 4, we can derive that the problem (17) permits a thresholding representation of its optimal solution.
Theorem 2. For any fixed $\lambda>0$ and $c>0$, if $X^*$ is the optimal solution of the problem (17), then $X^*$ can be analytically expressed as
$$X^*=D_{\lambda,c}\bigl(X^*+P_\Omega(M)-P_\Omega(X^*)\bigr). \qquad (24)$$
With the representation (24), an algorithm for solving the problem (17) can be naturally given by
$$X^{k+1}=D_{\lambda,c}(B^k), \qquad (25)$$
where $B^k=X^k+P_\Omega(M)-P_\Omega(X^k)$. In this paper, we call the iteration (25) the iterative matrix designed thresholding (IMDT) algorithm, which is summarized in Algorithm 1.
Algorithm 1 Iterative matrix designed thresholding (IMDT) algorithm
Input: $P_\Omega(M)$, $\lambda$, $c$;
Initialize: $X^0$, $k=0$;
while not converged, do
  $B^k=X^k+P_\Omega(M)-P_\Omega(X^k)$;
  $X^{k+1}=D_{\lambda,c}(B^k)$;
  $k=k+1$;
end while
return: $X^{k}$
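A minimal sketch of the IMDT iteration (25) as reconstructed above (not from the paper), with the scalar designed thresholding operator of Definition 1 left abstract and passed in as `scalar_op`; the stopping test and the initialization are illustrative assumptions.

```python
import numpy as np

def imdt(M_obs, mask, scalar_op, max_iter=500, tol=1e-4):
    """Sketch of the IMDT iteration (25):
        B^k = X^k + P_Omega(M) - P_Omega(X^k),   X^{k+1} = D(B^k),
    where D thresholds the singular values of B^k with the scalar rule `scalar_op`
    (the designed thresholding operator of Definition 1, not reproduced here)."""
    X = np.where(mask, M_obs, 0.0)                 # illustrative initialization X^0
    for _ in range(max_iter):
        B = X + np.where(mask, M_obs - X, 0.0)     # B^k
        U, s, Vt = np.linalg.svd(B, full_matrices=False)
        X_new = (U * scalar_op(s)) @ Vt            # X^{k+1}
        if np.linalg.norm(X_new - X, "fro") <= tol * max(1.0, np.linalg.norm(X, "fro")):
            return X_new
        X = X_new
    return X
```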
Similar to ([
18], Theorem 4.1), we can immediately derive the convergence of the IMDT algorithm.
Theorem 3. Let $\{X^k\}$ be the sequence generated by the IMDT algorithm. Then
(i) the sequence $\{C(X^k)\}$ is monotonically decreasing and converges to $C(X^\star)$, where $X^\star$ is any accumulation point of $\{X^k\}$;
(ii) the sequence $\{X^k\}$ is asymptotically regular, i.e., $\lim_{k\to\infty}\|X^{k+1}-X^k\|_F=0$;
(iii) any accumulation point of the sequence $\{X^k\}$ is a stationary point of the problem (17).
It is worth mentioning that the quality of the solution generated by the IMDT algorithm depends heavily on the setting of the regularization parameter $\lambda$. However, selecting the optimal regularization parameter $\lambda$ is a very hard problem, and there is no universal rule for this selection in general. Nevertheless, when some prior information (e.g., the rank) is known about the optimal solution of the problem (17), the regularization parameter can be set more reasonably. Following [18,19], we can give a useful parameter-setting rule to select the regularization parameter $\lambda$ for the IMDT algorithm. The details of the parameter-setting rule are described as follows.
Suppose that the optimal solution $X^*$ of the problem (17) has rank $r$, so that its singular values are arranged as
$$\sigma_1(X^*)\ge\sigma_2(X^*)\ge\cdots\ge\sigma_r(X^*)>\sigma_{r+1}(X^*)=\cdots=\sigma_n(X^*)=0.$$
By Theorem 2, we have $X^*=D_{\lambda,c}\bigl(X^*+P_\Omega(M)-P_\Omega(X^*)\bigr)$, which implies that exactly the $r$ largest singular values of $B^*:=X^*+P_\Omega(M)-P_\Omega(X^*)$ survive the thresholding; this yields an estimate (26) of the admissible range of $\lambda$ in terms of the singular values of $B^*$. The estimation (26) can help to set the optimal regularization parameter $\lambda$ for the IMDT algorithm, and a reliable selection (27) can be chosen within this range. In practice, we approximate the unknown real optimal solution $X^*$ by the current iterate $X^k$, that is, we take the corresponding value $\lambda_k$ given by (28) in each iteration of the IMDT algorithm.
When the regularization parameter $\lambda$ is set using Equation (28), the IMDT algorithm adaptively selects the regularization parameter at each iteration. In this paper, we call the IMDT algorithm with the parameter-setting rule (28) the adaptive iterative matrix designed thresholding (AIMDT) algorithm, which is summarized in Algorithm 2.
Algorithm 2 Adaptive iterative matrix designed thresholding (AIMDT) algorithm
Input: $P_\Omega(M)$, $c$;
Initialize: $X^0$, $k=0$;
while not converged, do
  $B^k=X^k+P_\Omega(M)-P_\Omega(X^k)$;
  compute the SVD of $B^k$;
  select $\lambda_k$ by the rule (28);
  $X^{k+1}=D_{\lambda_k,c}(B^k)$;
  $k=k+1$;
end while
return: $X^{k}$
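A skeleton of the AIMDT algorithm (not from the paper). The per-iteration rule (28) is not reproduced in this sketch; `select_lambda` is a placeholder that should implement Equation (28) from the singular values of $B^k$, the target rank $r$, and the parameter $c$, and `scalar_op` stands for the operator of Definition 1.

```python
import numpy as np

def aimdt(M_obs, mask, r, c, select_lambda, scalar_op, max_iter=500, tol=1e-4):
    """Skeleton of the AIMDT algorithm: IMDT with the regularization parameter
    re-selected at every iteration. `select_lambda(sigma, r, c)` is a placeholder
    for the paper's rule (28); `scalar_op(sigma, lam, c)` is a placeholder for the
    scalar designed thresholding operator of Definition 1."""
    X = np.where(mask, M_obs, 0.0)
    for _ in range(max_iter):
        B = X + np.where(mask, M_obs - X, 0.0)           # B^k
        U, s, Vt = np.linalg.svd(B, full_matrices=False)
        lam = select_lambda(s, r, c)                     # adaptive rule (28), not shown
        X_new = (U * scalar_op(s, lam, c)) @ Vt          # X^{k+1}
        if np.linalg.norm(X_new - X, "fro") <= tol * max(1.0, np.linalg.norm(X, "fro")):
            return X_new
        X = X_new
    return X
```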
4. Numerical Experiments
In this section, we carry out numerical experiments to test the performance of the AIMDT algorithm on some grayscale image inpainting problems, and compare it with the classical SVT algorithm [8]. The freedom ratio is defined as $\mathrm{FR}=r(m+n-r)/s$, where $s$ is the cardinality of the observation set $\Omega$ and $r$ is the rank of the matrix $M$. For $\mathrm{FR}>1$, it is impossible to recover the original low-rank matrix [9].
The algorithms are terminated when a prescribed stopping criterion is satisfied.
Given the original low-rank matrix $M$, the accuracy of the solution $X_{\mathrm{opt}}$ generated by the algorithms is measured by the relative error (RE), defined as
$$\mathrm{RE}=\frac{\|X_{\mathrm{opt}}-M\|_F}{\|M\|_F}.$$
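The two quality measures can be computed as follows (not from the paper); the freedom ratio uses the standard definition $r(m+n-r)/s$ assumed above.

```python
import numpy as np

def freedom_ratio(m, n, r, s):
    """Freedom ratio FR = r * (m + n - r) / s (degrees of freedom over observed entries)."""
    return r * (m + n - r) / s

def relative_error(X_opt, M):
    """Relative error RE = ||X_opt - M||_F / ||M||_F."""
    return np.linalg.norm(X_opt - M, "fro") / np.linalg.norm(M, "fro")
```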
In all the numerical experiments, we use the same parameter settings for the AIMDT algorithm. All the experiments are conducted on a personal computer with the MATLAB platform.
In the numerical experiments, the algorithms are tested on three grayscale images (Lena, Boat and Fingerprint). It is well known that a grayscale image can be expressed as a matrix. In grayscale image inpainting, the grayscale values of some pixels of the image are missing, and we want to fill in these missing pixels. If the image is of low rank, or numerically of low rank, we can solve the image inpainting problem as a matrix completion problem (2). We first use the SVD to obtain the approximated low-rank images with rank 50. The original images and their corresponding approximated low-rank images are displayed in Figure 3, Figure 4 and Figure 5, respectively. Then, we mask some pixels of these three low-rank images: 3.81% of the elements of each image are masked, and the masked images are displayed in Figure 6.
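The data preparation described above (rank-50 approximation by truncated SVD, followed by masking a fraction of the pixels) can be sketched as follows; the random masking pattern is an illustrative assumption and may differ from the one used in the paper.

```python
import numpy as np

def low_rank_approximation(img, rank=50):
    """Best rank-`rank` approximation of a grayscale image via truncated SVD."""
    U, s, Vt = np.linalg.svd(img, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank, :]

def random_mask(shape, missing_fraction, seed=0):
    """Boolean mask of observed pixels; `missing_fraction` of entries are hidden
    uniformly at random (illustrative; the paper's masking pattern may differ)."""
    rng = np.random.default_rng(seed)
    return rng.random(shape) >= missing_fraction

# Example: low_rank = low_rank_approximation(img); mask = random_mask(img.shape, 0.0381)
# masked = np.where(mask, low_rank, 0.0)
```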
Table 1 reports the numerical results of the algorithms for the image inpainting problems. For the AIMDT algorithm, three different values of the parameter c are considered. From Table 1, we can make the following observations: (i) the AIMDT algorithm is superior to the SVT algorithm in terms of RE for all three values of the parameter c; in particular, the RE generated by the AIMDT algorithm decreases as the parameter c decreases; (ii) the AIMDT algorithm is more time-consuming than the SVT algorithm for all three values of c, and its computation time increases as the parameter c decreases. In a word, the AIMDT algorithm recovers the low-rank grayscale images more accurately than the SVT algorithm, and the AIMDT algorithm with the smallest of the three values of c performs best. We display the recovered low-rank Lena, Boat and Fingerprint images in Figure 7, Figure 8 and Figure 9, respectively. From these figures, we can see that the AIMDT algorithm with the smallest value of c has the best performance in recovering the low-rank grayscale images.