1. Introduction
The problem of high-dimensional matrix restoration from a noisy observation under low-rank conditions arises in a variety of applications. Examples include collaborative filtering and online recommendation systems [1], system identification [2], face recognition [3], statistics [4], as well as engineering and optimal control [5,6]. One well-known example is the Netflix recommendation system, a matrix completion (MC) problem in which only a small set of entries of an unknown matrix can be observed. The mathematical model of MC can be expressed as follows:
$$\min_{X\in\mathbb{R}^{m\times n}}\ \operatorname{rank}(X)\quad \text{s.t.}\quad X_{ij}=M_{ij},\ (i,j)\in\Omega, \qquad (1)$$
where M is the unknown matrix with some available sampled entries, X is an unknown low-rank matrix, and Ω is a set of index pairs (i, j)
for the known sampled entries. The MC problem in a general form is represented by the following affine rank minimization problem:
$$\min_{X}\ \operatorname{rank}(X)\quad \text{s.t.}\quad \mathcal{A}(X)=b, \qquad (2)$$
where $\mathcal{A}:\mathbb{R}^{m\times n}\to\mathbb{R}^{p}$ is a linear map and $b\in\mathbb{R}^{p}$ is an observed measurement vector. Usually, the observed entries may be perturbed by noise; the corresponding formulation reads
$$\min_{X}\ \operatorname{rank}(X)\quad \text{s.t.}\quad \|\mathcal{A}(X)-b\|_{2}\le\delta, \qquad (3)$$
where $\delta\ge 0$ is the noise level.
It is widely acknowledged that the rank minimization problems (1)–(3) are generally NP-hard [7]. A widely adopted strategy is to employ the nuclear norm as a convex relaxation of the rank function [8,9,10], so problem (3) can be reformulated as the following nuclear norm minimization problem
$$\min_{X}\ \|X\|_{*}\quad \text{s.t.}\quad \|\mathcal{A}(X)-b\|_{2}\le\delta, \qquad (4)$$
or its equivalent least squares regularization form
$$\min_{X}\ \mu\|X\|_{*}+\tfrac{1}{2}\|\mathcal{A}(X)-b\|_{2}^{2}, \qquad (5)$$
where $\mu>0$ is the trade-off parameter, which balances the two terms in the objective function. Assuming $\sigma_{1}\ge\cdots\ge\sigma_{r}>0$ are the r positive singular values of the matrix X, the nuclear norm is defined as $\|X\|_{*}=\sum_{i=1}^{r}\sigma_{i}$, which is the best convex approximation of the rank function over the unit ball of matrices with spectral norm at most one [7].
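As a concrete illustration (ours, not from the paper), the rank and the nuclear norm of a matrix can be computed side by side from its singular values with NumPy:

```python
import numpy as np

def nuclear_norm(X):
    """Nuclear norm ||X||_*: the sum of the singular values of X."""
    return np.linalg.svd(X, compute_uv=False).sum()

# The rank counts the nonzero singular values; the nuclear norm sums
# them, giving the convex surrogate used throughout the paper.
X = np.diag([3.0, 2.0, 0.0])
print(np.linalg.matrix_rank(X))  # -> 2
print(nuclear_norm(X))           # approx 5.0
```

This makes the relaxation tangible: lowering the nuclear norm shrinks the singular values toward zero, which in turn tends to lower the rank.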
The nuclear norm minimization problems are convex optimization problems, for which numerous efficient algorithms have been proposed, including SeDuMi [11] and SDPT3 [12], singular value thresholding (SVT) [13], the accelerated proximal gradient (APG) algorithm [14], the fixed-point continuation with approximate SVD (FPCA) method [7], the proximal point algorithm (PPA) [15], and ADMM-type algorithms [16,17,18].
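Several of the listed solvers (SVT, APG, FPCA) are built around the singular value thresholding operator, the proximal mapping of the nuclear norm. A minimal NumPy sketch of this operator (illustrative, not code from any of the cited packages):

```python
import numpy as np

def svt(Y, tau):
    """Singular value thresholding: the prox of tau*||.||_* at Y.
    Each singular value is shrunk by tau; values below tau vanish."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

Y = np.diag([5.0, 1.0])
print(svt(Y, 2.0))  # approx diag(3, 0): one singular value survives
```

Because small singular values are zeroed out, each application of the operator produces a lower-rank iterate, which is why it appears as the core step in these nuclear-norm solvers.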
It is observed that most of the above noisy models use the ℓ2-norm for the data fidelity term, which is particularly effective for dealing with Gaussian noise. Nevertheless, both theoretical analyses and numerical experiments indicate that the ℓ2-norm fidelity term is less effective in handling non-Gaussian additive noise, because it tends to amplify the effect of noise [19]. Thus, an alternative formulation is highly needed to overcome this limitation of the ℓ2-norm fidelity term. In [20], it was demonstrated that the ℓ1-norm fidelity term is very suitable for handling non-Gaussian additive noise, such as impulsive noise. The advantage of ℓ1-norm fitting over ℓ2-norm fitting in handling non-Gaussian noise lies in its robustness to outliers: the ℓ1-norm, also known as the least absolute deviations loss, penalizes large deviations less severely than the ℓ2-norm (least squares), which makes ℓ1-norm fitting less sensitive to the influence of outliers such as impulsive noise or heavy-tailed non-Gaussian noise. Consequently, many researchers have studied ℓ1-fidelity models for image and signal restoration. Elsener and Geer [
4] observed that numerous results have been established for diverse nuclear norm penalized estimators in the context of the uniform-sampling matrix completion problem, but these estimators are not robust; they therefore studied robust nuclear norm penalized estimators using the absolute value loss and derived the asymptotic behavior of the estimators. They also pointed out that the least squares estimator performs very well when the errors follow a light-tailed distribution, such as i.i.d. Gaussian errors, but that ratings are susceptible to heavy fraud. Udell et al. [21] presented a factorization formulation of the low-rank matrix with ℓ1 error loss. Zhao et al. [22] exploited a bilinear factorization formulation and developed a novel algorithm that fully utilizes parallel computing resources; both of these methods require the rank to be given in advance. Jiang et al. [23] formulated matrix completion as a feasibility problem and presented an alternating projection algorithm to find a feasible point in the intersection of the low-rank constraint set and the fidelity constraint set. Guennec et al. [24] locally modeled the structure as gradient-sparse and the texture as having low patch rank, and proposed a rule based on theoretical results for sparse and low-rank matrix recovery in order to automatically tune the model according to the local content. Liang [25] proposed a novel robust low-rank matrix completion model that adds an ℓ1-norm penalty directly to the rank function in the objective in order to alleviate row-structured noise under an equality constraint; the ADMM was adapted to solve this nonconvex and discontinuous model directly, with a convergence guarantee. Wong and Lee [26] applied the celebrated Huber function from the robust statistics literature to down-weight the effect of outliers and developed a practical algorithm for matrix completion with noisy entries and outliers, but did not address the general affine low-rank minimization problem. Moreover, as noted in [27], impulsive noise, Gaussian noise, and their mixtures widely exist in corrupted images, videos, and collected data. In particular, [28] studied robust video denoising via low-rank matrix completion, dealing with heavy Gaussian noise mixed with impulsive noise; however, that work did not use a robust model and did not handle large-scale problems. Given the above analysis, it is highly necessary to design an efficient algorithm for robust low-rank matrix estimation. The general nuclear norm minimization problem with the ℓ1-norm fidelity term is a popular robust model for the low-rank matrix estimation problem. Thus, in this paper, we aim to design an efficient algorithm for solving the following general nuclear norm minimization problem with the ℓ1-norm fidelity term:
$$\min_{X}\ \|X\|_{*}+\lambda\,\|\mathcal{A}(X)-b\|_{1}, \qquad (6)$$
where the observation b might contain some noise and $\lambda>0$ is a penalty parameter. It is well known that when $\lambda$ is larger than a certain threshold, the ℓ1-norm fidelity term makes problem (6) an exact penalty problem for the corresponding equality-constrained formulation.
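The robustness of ℓ1 fitting discussed above can be seen in a one-dimensional toy example (ours, purely illustrative): fitting a constant to data containing one impulsive outlier, where the ℓ2 fit is the mean and the ℓ1 fit is the median.

```python
import numpy as np

# Fit a constant c to data whose last entry is an impulsive outlier.
b = np.array([1.0, 1.1, 0.9, 1.0, 100.0])
c_l2 = b.mean()      # minimizes sum_i (b_i - c)^2 : the mean
c_l1 = np.median(b)  # minimizes sum_i |b_i - c|   : the median
print(c_l2)  # approx 20.8 -- dragged far from 1 by the single spike
print(c_l1)  # 1.0 -- essentially unaffected by the spike
```

The same mechanism explains why the ℓ1 fidelity term in model (6) tolerates impulsive measurement errors that would dominate an ℓ2 fit.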
It is clear that problem (6) is not easy to solve due to the two nonsmooth terms in the objective function. Wang et al. [29] proposed a penalty decomposition method for solving (6) by minimizing a quadratic penalty function, which obtains a solution of (6) only as the penalty parameter goes to infinity. This is an inexact method, and the penalty parameter is difficult to choose appropriately because it greatly affects the efficiency of the algorithm. Thus, in this paper, we transform the convex, nonsmooth objective function into a variable-separated form by introducing an auxiliary variable, and we develop a semi-proximal ADMM to solve the primal problem (6) and its dual problem. Each resulting subproblem has a closed-form solution, which makes model (6) efficiently solvable.
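To make the variable-splitting idea concrete, the following sketch (our own simplification, not the paper's algorithm) applies a plain ADMM to the special case of model (6) with $\mathcal{A}$ equal to the identity, $\min_X \|X\|_{*}+\lambda\|X-B\|_{1}$; both subproblems then have closed-form solutions via singular value thresholding and componentwise soft thresholding. All parameter values are assumptions.

```python
import numpy as np

def svt(Y, tau):
    """Prox of tau*||.||_*: soft-threshold the singular values of Y."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def soft(V, tau):
    """Prox of tau*||.||_1: componentwise soft thresholding."""
    return np.sign(V) * np.maximum(np.abs(V) - tau, 0.0)

def admm_l1_denoise(B, lam=0.5, beta=1.0, iters=500):
    """ADMM for min_X ||X||_* + lam*||X - B||_1, with the splitting
    Z = X - B, i.e. the constraint X - B - Z = 0."""
    X = B.copy()
    Z = np.zeros_like(B)
    Y = np.zeros_like(B)  # multiplier for X - B - Z = 0
    for _ in range(iters):
        X = svt(B + Z - Y / beta, 1.0 / beta)   # nuclear-norm step
        Z = soft(X - B + Y / beta, lam / beta)  # l1 fidelity step
        Y = Y + beta * (X - B - Z)              # dual update
    return X, Z

rng = np.random.default_rng(0)
M = np.outer(rng.standard_normal(20), rng.standard_normal(20))  # rank 1
B = M.copy()
B[3, 7] += 10.0  # one impulsive corruption
X, Z = admm_l1_denoise(B)
print(np.linalg.norm(X - B - Z))  # primal residual, near zero at convergence
```

The paper's semi-proximal ADMM additionally handles a general linear map $\mathcal{A}$ by adding semi-proximal terms so that each subproblem remains in closed form.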
Our main contributions are to develop a semi-proximal ADMM that successfully solves the proposed model (6) and its dual problem, and that can deal not only with a single non-Gaussian noise but also with Gaussian noise or their mixture. Moreover, to the best of our knowledge, this is the first time the dual problem has been considered. The presented algorithms also come with a convergence guarantee. Most importantly, model (6) has a more general form than the MC model and exhibits good properties from a numerical point of view: for example, the parameter λ can be controlled relatively easily, and the estimation of the corrupted low-rank matrix can attain an approximately exact solution with very high accuracy.
The remaining parts of this paper are organized as follows. Section 2 contains three subsections: Section 2.1 introduces some key notations, Section 2.2 provides basic concepts to facilitate our later discussions, and Section 2.3 reviews several types of ADMM for later developments. In Section 3, we develop a semi-proximal ADMM for solving (6) and also apply the semi-proximal ADMM to its dual problem; the convergence of the proposed methods is given in that section. In Section 4, we illustrate the robustness of model (6) and the effectiveness of both presented algorithms by conducting numerical experiments. Finally, we conclude this paper in Section 5.
4. Numerical Experiments
In this section, we discuss an important issue in choosing denoising models for recovering a low-rank matrix. In many practical applications, measured data are contaminated by different kinds of noise or their mixtures. It is well known that models (4) and (5) are widely used to solve the matrix rank minimization problem with noise. Both apply the ℓ2-norm for the data-fitting term and usually deal successfully with Gaussian noise, but not with non-Gaussian noise. Thus, in this section, through several kinds of numerical experiments, we demonstrate that model (6) with ℓ1 fidelity can handle several noise cases and performs better than the models with ℓ2 fidelity, and that the proposed algorithms can solve the primal and dual problems with approximately exact fidelity.
Now, we first give the parameter selections, the meaning of the notation, the stopping criterion, and the running environment as follows. m and n denote the row number and the column number of the matrix, respectively. r denotes the rank of the original matrix, which is far less than min(m, n). Let sr and p be the sample ratio and the number of measurements, respectively, where p is set to be round(sr · mn). Let d_r be the number of degrees of freedom of a real-valued rank-r matrix, which is d_r = r(m + n − r). We denote by M the real low-rank matrix, generated as M = M_L M_R, where the matrices M_L and M_R have independent identically distributed Gaussian entries (produced with Matlab's randn). Ω is an index set of known elements for the matrix completion problem, selected uniformly at random. The linear map A in the general matrix rank minimization problem is usually taken to be the partial discrete cosine transform (PDCT) operator. b is the given measurement vector, b = A(M) + ω, where ω is noise. A Gaussian noise of mean zero and standard deviation σ can be generated with Matlab's randn scaled by σ. An impulsive noise is placed at N random positions of b, where N is a prescribed fraction of p, and several fractions can be chosen, respectively. λ is an exact penalty parameter, which can be chosen slightly greater than the reciprocal of its corresponding Lagrangian multiplier. The proximal parameters are chosen approximately: although the relevant spectral constant equals one for matrix completion problems and PDCT measurements, we set these parameters slightly greater than one because, by experiment, this accelerates the convergence rate, as can also be seen in [17]. Let X* represent the optimal solution produced by the proposed method.
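The experimental data generation described above can be sketched in NumPy as follows (our own translation of the Matlab description; the particular values of sr, σ, and the impulsive-noise fraction and amplitude are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, r = 100, 100, 5
sr = 0.4                    # sample ratio (assumed value)
p = int(round(sr * m * n))  # number of measurements

# Low-rank ground truth, as in the paper: i.i.d. Gaussian factors.
M = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))

# Matrix-completion measurements: p entries sampled uniformly at random.
idx = rng.choice(m * n, size=p, replace=False)
b = M.ravel()[idx].copy()

# Gaussian noise of standard deviation sigma, plus impulsive spikes of
# amplitude +-1 at a fraction rho of positions (sigma, rho assumed).
sigma, rho = 0.01, 0.1
b += sigma * rng.standard_normal(p)
spikes = rng.choice(p, size=int(rho * p), replace=False)
b[spikes] += rng.choice([-1.0, 1.0], size=spikes.size)

dof = r * (m + n - r)  # degrees of freedom of a rank-r matrix
print(p, dof)  # -> 4000 975
```

Comparing p with dof indicates how oversampled a test instance is; recovery is only plausible when p comfortably exceeds dof.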
We terminate the process when the optimal solution X* produced by the proposed method satisfies the following criterion:
$$\mathrm{RelErr}:=\frac{\|X^{*}-M\|_{F}}{\|M\|_{F}}<\mathrm{Tol},$$
where the relative error RelErr measures the quality of X* with respect to the original M. The matrix M is regarded as successfully recovered by X* if the corresponding RelErr is less than the prescribed tolerance, as used in [7,13,38]. So, we usually take the RelErr and the maximum number of iterations as the termination conditions.
As noted, the computation of a matrix singular value decomposition (SVD) is needed at each iteration of nuclear norm minimization, which may be expensive. So, for all tests, we apply the PROPACK package [39] for partial SVD. However, PROPACK cannot automatically compute only those singular values greater than a threshold; it requires a predetermined number sv_k of singular values to be computed at the k-th iteration. As in [14], after initializing sv_0, if svp_k < sv_k, we set sv_{k+1} = svp_k + 1; if svp_k = sv_k, we increase sv_{k+1} by a larger increment beyond svp_k, where svp_k represents the number of positive singular values of the current matrix iterate.
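This update heuristic can be sketched as follows (the increment used when the guess saturates is an assumed value here; [14] specifies the exact constants):

```python
def next_sv(sv_k, svp_k, increment=5):
    """Choose how many singular values to request from a partial SVD at
    the next iteration. sv_k is the current request; svp_k is how many
    positive singular values were actually found. The `increment` used
    when the request saturated is an assumed illustrative value."""
    if svp_k < sv_k:
        # The request was large enough: track the observed rank closely.
        return svp_k + 1
    # The request saturated: the rank may be larger, so probe further.
    return svp_k + increment

print(next_sv(10, 6))   # -> 7: shrink toward the observed rank
print(next_sv(10, 10))  # -> 15: probe a few more singular values
```

The point of the heuristic is to keep the partial SVD cheap when the iterates have low rank, while still detecting a rank that grows between iterations.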
All the experiments are performed under Windows 10 and MATLAB R2021a, running on a Lenovo laptop with an Intel Core CPU at 4.6 GHz and 32 GB of memory.
4.1. Matrix Completion Problems
In this subsection, we solve nuclear norm matrix completion problems with different kinds of noise in order to compare the behavior of the different noisy models. Numerical experiments are conducted on the following two matrix completion problems. One is the nuclear norm matrix completion problem with ℓ1 fidelity,
$$\min_{X}\ \|X\|_{*}+\lambda\,\|P_{\Omega}(X)-b_{\Omega}\|_{1}, \qquad (32)$$
and the other is the nuclear norm matrix completion problem with ℓ2 fidelity,
$$\min_{X}\ \mu\,\|X\|_{*}+\tfrac{1}{2}\,\|P_{\Omega}(X)-b_{\Omega}\|_{2}^{2}, \qquad (33)$$
where $P_{\Omega}$ denotes the sampling operator and $b_{\Omega}$ the (noisy) observed entries. In the following tests, we use the proposed sPADMM and sPDADMM to solve model (32); model (33) is solved by the state-of-the-art algorithm ADMM-NNLS [17].
Test 1: The numerical results are shown in Figure 1. We test the two problems (32) and (33) under three noisy cases: impulsive noise only, mixed Gaussian and impulsive noise, and Gaussian noise only. Moreover, we set the maximum number of iterations to 200 and use the relative error as the stopping rule in order to observe the performance more clearly.
From the first row of Figure 1, we can see that the proposed sPADMM and sPDADMM for model (32) obtain higher accuracy than the ADMM-NNLS for model (33); moreover, the sPDADMM attains this accuracy faster than the sPADMM. The second row shows the case containing both Gaussian and impulsive noise. It is clear that model (32) recovers the matrix successfully, whereas model (33) becomes inefficient as soon as the measurement contains even a little impulsive noise. In other words, no matter how small the percentage of impulsive noise contained in the measurement b, model (32) performs better than model (33). In the third row, we test the case with Gaussian noise only, where the noise level σ varies over a range of values. From the bottom row of Figure 1, we can see that model (33) achieves a faster convergence rate than model (32) and attains a similar solution accuracy. This illustrates that model (33) can efficiently solve the case containing Gaussian noise only, while the proposed model (32) is more efficient and robust for cases with non-Gaussian noise.
To sum up, these numerical results support the following conclusions for the nuclear norm minimization problem. Firstly, whenever the observed data contain impulsive noise, Gaussian noise, or their mixture, model (32) performs better than model (33) or its variants. In particular, data corrupted by impulsive noise only can be recovered exactly by model (32) when the parameter is chosen appropriately, whereas model (33) is inefficient. Secondly, without impulsive noise, the ℓ1-fitting model (32) does not harm the quality of the solution as long as the measured data do not contain a large amount of Gaussian noise. The above analysis illustrates that model (32) has a broader scope of applicability. Finally, when the data contain a large amount of Gaussian noise, the ℓ2-fitting model (33) performs better, but high accuracy cannot be obtained no matter which method is chosen.
Test 2: The numerical results are shown in Figure 2. This test observes the performance of models (32) and (33) under different sample ratios sr and ranks r, where all cases contain both Gaussian and impulsive noise.
From the first row of Figure 2, we can see that model (32) performs better than model (33) for the matrix completion problem with mixed noise when the sample ratio sr is relatively high. Observing the second row of Figure 2, it is clear that both the sPADMM and the sPDADMM for (32) perform well as r increases. Moreover, model (32) obtains higher accuracy within 100 iterations than model (33). This numerical analysis shows the robustness of model (32) and the efficiency of the proposed algorithms.
Test 3: The numerical results on recovering real gray images are shown in Figure 3. Firstly, we apply the matrix SVD to obtain low-rank images. Then, we randomly select a portion of the elements from the low-rank image and add different kinds of noise to obtain corrupted images. Finally, the corrupted images are recovered by using the proposed sPADMM and sPDADMM to solve model (32). Here, we also use the relative error RelErr to measure the quality of the recovered images. From panels (d) and (e) in Figure 3, we can see that the proposed sPADMM and sPDADMM successfully recover the image corrupted by impulsive and Gaussian noise. Similarly, panels (a4) and (a5) in Figure 4 show that the sPADMM and sPDADMM can recover a corrupted image containing impulsive noise only, without randomly selected samples. In a word, these tests show that the proposed sPADMM and sPDADMM perform well on recovering real corrupted images.
4.2. Nuclear Norm Minimization with ℓ1 Fidelity Term
In this subsection, we report results of the sPADMM and sPDADMM for solving problem (6), illustrating the robustness of model (6) and the efficiency of both algorithms.
Test 4: The numerical results are shown in Table 1. We test the sPADMM and sPDADMM for solving (6) with impulsive noise. From Table 1, we can see that all situations are recovered with high accuracy, which can be regarded as successful recovery. In addition, the sPDADMM is faster than the sPADMM when the scale is smaller; when the scale becomes larger, the Z-subproblem requires a full SVD, which becomes expensive. Moreover, some larger-scale problems could be solved if the CPU memory capacity were large enough. These numerical results illustrate that the proposed sPADMM and sPDADMM are very efficient for solving the nuclear norm minimization problem with impulsive noise.
Test 5: The numerical results are shown in Table 2. To further illustrate the efficiency of the sPADMM and sPDADMM and the robustness of the proposed model, we test both algorithms on problem (6) with different noises: impulsive noise, Gaussian noise, and their mixtures, respectively. Here, the termination conditions are the relative error tolerance and the maximum number of iterations. From Table 2, we observe that the proposed sPADMM successfully solves almost all problems except the case with only impulsive noise at the highest corruption ratio, which is highly challenging because the noise level is so high. The proposed sPDADMM, however, successfully handles all cases and, moreover, attains high accuracy in the impulsive-noise-only case. Additional experiments are reported in Table 3 to illustrate the aforementioned results. The sampling rate sr has a great influence on whether a problem can be solved successfully. Observing the results in Table 3, the sPADMM fails for the cases with the lowest sampling rates; as the sampling rate increases, so does the efficiency of both algorithms. Moreover, the sPDADMM succeeds even at the lowest sampling rate and is superior to the sPADMM in all cases. These numerical results illustrate that the sPDADMM for solving the dual problem of (6) is more robust, which reflects our main contribution: an effective and robust method for the nonsmooth convex optimization problem (6) from the dual perspective. As the level of Gaussian noise becomes high, the sPADMM obtains higher accuracy than the sPDADMM. In particular, both proposed algorithms are efficient in the noiseless case.
Test 6: The numerical results are displayed in Figure 5. We test the sPADMM and sPDADMM for solving problem (6) with different values of λ. Observing Figure 5, we can see that both algorithms perform better as λ increases and can attain even higher accuracy. Meanwhile, when the parameter is chosen appropriately, an approximately exact solution of the matrix nuclear norm minimization problem can be obtained.
These numerical results show that the above conclusions for the matrix completion problem (32) also apply to the general problem (6). Moreover, if the measurements contain various kinds of noise, it is more effective to use the nuclear norm minimization problem with the ℓ1-norm fidelity term. Furthermore, these tests illustrate that the proposed algorithms are robust and efficient for solving cases with non-Gaussian noise.