1. Introduction
Sparse representation of signals in dictionary domains has been widely studied and has provided promising performance in numerous signal processing tasks such as image denoising [
1,
2,
3,
4,
5], super resolution [
6,
7,
8], inpainting [
9,
10] and compression [
11,
12]. It is well known that images are represented by a linear combination of certain atoms of a dictionary. Overcomplete sparse representation is the overcomplete system with a sparse constraint. Common overcomplete systems differ from the traditional bases, such as DCT, DFT and Wavelet, because they offer a wider range of generating elements; potentially, this wider range allows more flexibility and effectiveness in signal sparse representation. However, it is a severely under-constrained illposed problem to find the underlying overcomplete representation due to the redundancy of the systems. When the underlying representation is sparse and the overcomplete systems have stable properties, the ill-posedness will disappear [
13]. Sparse models are generally classified into two categories: Synthesis sparse models and analysis sparse model [
14]. The commonly referred to sparse models are synthesis sparse models. The analysis ones characterize the signal by multiplying it with an analysis overcomplete dictionary, leading to a sparse outcome. A variety of effective sparse models have been investigated and established such as the classical synthesis sparse model [
9,
15], the classical analysis sparse model [
14], the nonlocal sparse model [
16,
17] and the 2D sparse model [
18]. Unfortunately, these models ignore the stability recovery property which claims that once a sufficient sparse solution is found, all alternative solutions necessarily reside very close to it [
9]. Recently, the stable recovery of sparse representation has drawn attention in signal processing theory. Generally speaking, stable recovery can be guaranteed by two properties: Sufficient sparsity and a favorable structure of the dictionary [
19]. Donoho defines the concept of mutual incoherence of the dictionary and applies it to prove some possibility of stable recovery [
19]. The authors of [
20] proposea sparsity-based orthogonal dictionary learning method to minimize the mutual incoherence. The authors of [
21] propose an incoherent dictionary learning scheme by integrating a low rank gram matrix of the dictionary into the dictionary learning model.
A more powerful stable recovery guarantee developed by Candes and Tao, termed Restricted Isometry Property (RIP), makes consequent analysis easy [
22]. A matrix
is said to satisfy the RIP of order
k if there exists a constant
such that
holds for all
k-sparse vectors
.
is defined as the smallest constant which satisfies the above inequalities and is called the restricted isometry constant of
.
Most RIP research substantially investigates applying RIP as a stablility analysis instrument [
17,
23,
24] or finding optimal RIP constant [
25,
26] which are all theoretical analyses rather than practical applications. According to the research of [
21], the intrinsic property of a dictionary has a direct influence on its performance. All familiar algorithms are staggeringly unstable with a coherent or degenerate dictionary [
19]. Recognizing the gap between theoretical analyses and practical applications of RIP, this paper aims to build a stable sparse model satisfying the RIP.
Recently, the frame as a stable overcomplete system has drawn some attention in signal processing as the given signal can be represented by its canonical expansion in a manner similar to conventional bases under the frame. Some data-driven approaches are proposed in [
1,
27,
28,
29,
30]. The authors of [
27,
29,
30] utilize redundant tight frame in compressed sensing and [
28] applies tight frame to few-view image reconstruction. Study [
1] presents a data-driven method that the dictionary atoms associated with the tight frame are generated by filters. These approaches achieve much better image processing performance than previous methods, and meanwhile the tight frame condition which requires the frame almost-orthogonality will limit the flexibility in sparse representation. Study [
31] derives stable recovery result for
-analysis minimization in redundant, possibly non-tight frames. Inspired by this result and the relationship between RIP and frame, we aim to establish a stable sparse model with RIP based on non-tight frame.
We call a sequence
a frame if and only if there exist two positive numbers
A and
B such that
Here, A and B are called the bounds of the frame. We find that every submatrix satisfied RIP is a non-tight frame with and as its frame bounds with a given k. Obviously, there is an essential connection between the non-tight frame and the RIP.
In this paper we focus on a stable sparse model and more specifically on the development of an algorithm that would learn a pair of non-tight frame based dictionaries from a set of signal examples. We propose a stable sparse model via applying the non-tight frame condition to approximate the RIP. This model shares the favorite overcomplete structures with the common sparse models, and meanwhile it contains RIP and closed-form sparse coefficient expression which ensure stable recovery. Recognizing that the optimal framebounds are essentially the maximum and minimum singular values of the frame, RIP is actually enforced on the dictionary pair (the frame and its dual frame) by constraining the singular values of them. We also formulate a dictionary pair learning model via applying the second-order truncated Taylor series to approximate the inverse frame operator. Then we present an efficient algorithm to learn the dictionary pair via a two-phase iterative approach. To summarize, this paper makes the following contributions:
We propose a stable sparse model along with a dictionary pair learning model. Non-tight frame condition is utilized to develop a relaxation of RIP to guarantee stable recovery of sparse representation. Moreover, the sparse coefficients are also modeled, which leads to a more stable recovery especially for seriously noisy image.
It is nearly impossible to solve the dictionary pair learning model in a straightforward way since the inverse frame operator is involved. We provide an effective way to modify the model via applying a second-order truncated Taylor series to approximate the inverse frame operator, and provide an efficient algorithm for the modified one.
We present the stability analysis of the proposed model and demonstrate it on natural and synthetic image denoising, super resolution and image inpainting. The denoising results show that the proposed approach outperforms synthesis models such as the KSVD and the data-driven tight frame based methods for natural image case in terms of average PSNR. Moreover, it also gains comparable performance to the Analysis KSVD for a piecewise-constant (PWC) image in terms of average PSNR. The meaningful structures in the trained dictionary pairs for natural images and a PWC image are observed. The super resolution results show that the SSM-NTF produces better performance than the Bicubic interpolation method and the method in [
32]. The inpainting results show that our model is able to eliminate text of fonts completely.
This paper is organized as follows:
Section 2 reviews the related work on frame, synthesis sparse model and analysis sparse model.
Section 3 presents our stable sparse model with non-tight frame SSM-NTF along with a dictionary pair learning model.
Section 4 proposes the corresponding dictionary pair learning algorithm.
Section 5 proposes the image restoration method of our proposed SSM-NTF model. In
Section 6 we analyze the computational complexity of our proposed algorithm. In
Section 7, we demonstrate the the effectiveness of our SSM-NTF model by analyzing the convergence of the corresponding algorithm, denoising natural and piecewise constant images, super resolution and image inpainting. Finally,
Section 8 concludes this paper.
2. Related Work
In this section, we briefly review the related work on frame, synthesis sparse model and analysis sparse model.
Frame: A frame
is called a tight frame if the frame bounds are equal in the Equation (
2) [
32]. There are two associated operators can be defined between the Hilbert space
and Square Integrable Space
once a frame is defined. One is the analysis operator
defined by
and the other is its adjoint operator
which is called the synthesis operator
then, the frame operator can be defined as the following canonical expansion
In Euclidean space, a given frame
can be represent in manner of matrix with its columns of it as the frame elements. Then one of its adjoint operator can be representated as
[
32]. Let
be an arbitrary vector, a reconstruction function can be expressed as the following form
Synthesis sparse model: The conventional synthesis sparse model represents a vector
by the linear combination of a few atoms from a large dictionary
, denoted as
, where
L is the sparsity of
. The computational techniques for approximating sparse coefficient
under a given dictionary
and
includes greedy pursuit (e.g., OMP [
9]) and convex relaxation optimization, such as Lasso [
33] and FISTA [
8]. In order to improve the performance of sparse representation, some modified models such as the nonlocal sparse model [
16], the frame based sparse model [
21], and the MD sparse model [
18] are also investigated.
Analysis sparse model: The analysis sparse model is defined as:
where
is a linear operator (also called as a dictionary), and
l denotes the co-sparsity of the signal
. The analysis representative vector
is sparse with
l zeros. The zeros in
denote the low-dimensional subspace to which the signal
belongs. The analysis sparse coding [
14] and dictionary learning [
34] approach are also been proposed.
However, all these models ignore the stability recovery property which provides stable reconstruction of the signals in presence of noise.
Dictionary learning methods: The dictionaries include analytical dictionaries, such as DCT, DWT, curvelets and contourlets and learned dictionaries. Some dictionary learning method are proposed, such as the classical KSVD [
9] algorithm, the efficient sparse coding which convert the original dictionary learning problem to two least squares problem by applying the Lagrange dual [
3], the non-local sparse model [
16] which learns a set of PCA sub-dictionaries by cluster the samples into K clusters using image nonlocal self-similarity prior and its improved version which using the
-norm to instead the
-norm in order to handle different image contents. With the realization of stability, some mutual-coherence based methods are proposed. In [
20] a sparsity-based orthogonal dictionary learning method is proposed to minimize the mutual-coherence of the dictionary. The authors of [
21] propose an in coherent dictionary learning scheme by integrating a low rank gram matrix of the dictionary into the dictionary learning model. However, these methods only concern the capability of the dictionary without modeling the sparse coefficients which still has some probability of instability.
3. The Proposed SSM-NTF
In this section, we present the stable sparse model with non-tight frame, (
Section 3.1), the stability analysis of the proposed model, (
Section 3.2) and the dictionary pair (the frame pair) learning model, (
Section 3.3).
3.1. Stable Sparse Model with Non-Tight Frame
In this section, we derive our stable sparse model with non-tight frame where the non-tight frame condition serves as an approximation to the RIP.
According to [
35], a
k-th RIP constant can be express as
where
The Equation (
7) provides a new perspective in integrating the RIP to sparse model via applying
and
instead of the RIP constant
. The difficulty in building a stable sparse model decreases. However, the sparsity
k varies with the noise level, and also, in a feasible numerical calculation method, it is impossible to sweep through all the samples satisfying
to pursue an unknown dictionary
.
Let
be a signal vector, the frame reconstruction function can be formulated as
where
is a dual frame of
. Adding a reasonable sparsity prior to the signal
over
domain, we can derive
Denoting the optimal frame bounds of
as
A and
B, the frame condition of
can be formulated as
. Then a pair of bounds for Equation (
10) can be obtained as
. A formula similar to Equation (
9) is derived as
,
where
is the data set. Imitating Equation (
7), we can obtain a RIP-like constant expression
where
. Obviously,
can be regarded as an approximation of the RIP constant which benefits the computation due to the ignorance on sparsity degree. In a word, the RIP constraint can be satisfied by constraining the frame bounds. Thus, a stable overcomplete system with a sparsity prior can be established.
Now we discuss the characteristic of the frame bounds
A and
B. The Frame Condition (
2) has a more compact form
where
denotes any singular value of
. More specifically,
,
where
and
denote the maximum and minimum singular values of
, respectively. Then, we can obtain
. It is easy to know that
. Obviously,
is a reasonable relaxation of
as
is slightly exceed
but resides very close to it as long as the data is not seriously degraded. Therefore, the RIP constraint can be enforced on the frames by limiting the maximum and minimum singular values.
In this paper, we integrate non-frame to traditional sparse model to establish a stable sparse model with RIP. Let
be a signal vector. Under the assumption of the sparsity prior of
, we apply a soft thresholding operator
(which shall be defined in the next subsection) on it such that
where
is a vector with elements
corresponding to
,
. Therefore, we propose the stable sparse model with non-tight frame (SSM-NTF) as follows
Here, the correlation between the frame and its dual frame is formulated as . The frame operator is formulated as which is indeed a gram matrix of . The singular values of are constrained by to satisfy the RIP. Actually, by constraining the singular values of , the elements of the gram matrix are also bounded which meets the theory of mutual coherence.
In order to be consistent with the traditional sparse models, we refer to the frame and its dual frame as dictionary and its dual dictionary.
3.2. The Stability Analysis of the Proposed Model
In sparse representation problem, a given noiseless signal
, can be formulated formulated as
where
is the sparse representation dictionary and
is the sparse coefficients. While
is an underdetermined linear system, the problem
has the unique solution
as soon as it satisfies the uniqueness property which is formulated as
where
is the mutual-coherence of
[
9]. However, the signals are usually acquired with noise, then the problem
should be relaxed to the problem
which is expressed as
where
is an error-tolerant which exists due to the noise. The problem
will no longer maintain the uniqueness of solution as
is an inequality system. Thus, the notion of Uniqueness Property (
15) is replaced by the notion of stability which claims that all the alternative solutions reside very close to the ideal solution. Under the stable guarantee, we can yet ensure that the recovery results of our methods produce meaningful solutions. Assume that
is the ideal solution to the problem
and
is the candidate one, the traditional sparse model has a stability claim of the form [
9]
where
is the mutual coherence which is formulated as
. Apparently, the error bound of Equation (
17) can only be determined with given sparsity
and the mutual coherence
. However, the mutual coherence of an unknown dictionary is very difficult to calculate which lead to a result that we can not ensure the stability in the dictionary learning case. In contrast, we derive a similar stability claim of our proposed SSM-NTF model.
Defining
with
as the ideal solution to the model, we have that
. From the previous subsection, we have know that the frame
satisfies the RIP with the corresponding parameter
. Thus, using this property and exploiting the lower-bound part in Equation (
1), we get
where
. Thus, we get a stability claim of the form
Obviously, the error bound of the SSM-NTF is determined by , the ratio of the upper bound to the lower bound of the frame, rather than the specific values of A and B. Thus, for the convenience of numerical experiments, we usually set A to a fixed value. A main advantage of standard orthogonal transformations is that they maintain the energy of the signals in the transform domain as its frame bounds A and B are equal to 1. However, the standard orthogonal basis is non-redundant that limits its performance in sparse representation. In order to make a trade off between the represent accuracy and the degree of redundant, we usually set the lower frame bound A to a value a little smaller than 1 but not over-small as A is the minimum singular value of which determines the condition number of . Thus, once the tolerance error is given, the value of B can be easily calculated. Further, a pair of dictionaries conform to the given error can be obtained using the proposed SSM-NTF model. On the other hand, if the value of B is given by experience, the error bound of our model can be measured.
3.3. Learning Model of Dictionary Pair
Assuming
is the training data with signal vectors
, as its columns. The dictionary pair learning model can be written as
However, the Problem (
20) is difficult to solve. First, the inverse of the frame operator
has no closed-form explicit expression. Secondly, the thresholding operator is a highly nonlinear operator which makes the optimization with respect to
hard to optimize.
Apparently, the Problem (
20) is difficult to solve as the existence of the inverse of
. Fortunately, the matrix
can be expressed as a convergent series [
36] which is formulated as
Here, we truncated the series at
to make a tradeoff between computational complexity and approximation accuracy. It is formulated as
In this way, once the frame bounds are given, the inverse of
can be calculated easily. Then the optimization problem for training RIP-dictionary pair is formulated as
where
is the elementwise thresholding operator. There are two basic thresholding methods: The hard thresholding method whose thresholding operator defines as
and the soft thresholding whose operator is defined as
. Both of the two operator are are non-convex and highly discontinuous which lead to big challenges to solve Problem (
23). The mean reason is the fact that the update of the thresholding values
causing non-smooth changes to the cost function. To solve this difficulty, we design an alternative direction method via global search and least square that will be introduce in
Section 4.1.
4. Dictionary Pair Learning Algorithm
In this subsection, we propose the two-phase iterative algorithm for dictionary pair learning by dividing Problem (
23) into two subproblems: The sparse coding phase which updates the sparse coefficients
and thresholding values
, and the dictionary pair update phase which computes
and
.
4.1. Sparse coding phase
In this subsection, we discuss how to calculate the sparse coefficients and the threshold values with given and under our SSM-NTF model.
Given a pair of dictionaries
and
, calculating
and
from
is formulated as:
We pursue the two variables alternatively. Firstly, with fixed
, we obtain the sparse coefficients
by solving Problem (
24) through OMP [
9] as it can be easily convert to the classical synthesis sparse expression
where
and
.
Secondly, the pursue of
is equivalent to solving the following problem
which can be decomposed into
M individual optimization problems
where
is the column of
. From the definition of soft thresholding operator, we can know that the function of Problem (
26) is discrete. By denoting the data indices set that remains intact after the thresholding as
, we split the data
into two parts:
and
such that
where
is a supplementary to the intact indices
which turn the all elements to zero. It is clear to know that the variables
and
are both functions of
without explicit expressions which leads to a large challenge in optimization.
In order to solve Problem (
26), an intermediate variable
is necessarily to introduced to separate the whole problem into two parts: The update of the indices
and
(determined by
) and the update of the explicit thresholding value
. Then Problem (
26) can be transformed to another optimization problem:
where
and
are two functions of the intermediate variable
.
At the
k-th step, to obtain
, we solve Problem (
29) with
fixed and denote the functions as
where
,
,
. Optimizing this expression is obviously non-trivial as the target function is non-convex and highly discontinuous. Actually, with
fixed, the minimization of
can be globally solved due to its discrete finite nature. In another word, if a series of candidate terms of
are given, the global search is guaranteed to succeed.
Once a is given, the will be a piecewise constant function. It means that the function values remain unchanged within a series of intervals which are determined by . Therefore, can be taken as a portion of candidate terms of . For the function , it is clear that it minimizes at and monotonically increases with the increasing distance between and the given . So, to minimize , we only need to choose the closest point in the feasible region.
Without loss of generality, we assume that all the
are ascending ordered and the corresponding signals are in the same order. We compute all the possible values of
by
where
. Sort
in descending order of
, and every two adjacent values form an interval on which the function value remains unchanged. In another word, the objective function
is minimized at the point closest to
in the interval. Thus, compute all the minimizer values on every interval and the minimum must be the optimal result.
With
fixed, we solve the following problem in order to pursue
:
This is a standard continuous convex function that can be easily sovled by least square.
We summarize our sparse coding method in Algorithm 1.
Algorithm 1 Sparse coding algorithm |
Input and Initialization: Training data , iteration number r, initial value . Output: Sparse coefficients , and thresholding values - 1:
Compute the sparse coefficients via Problem ( 24) according to the OMP algorithm. - 2:
Sort the columns of and in increasing order of . - 3:
For p=1:r For j=1:L Compute all the possible values for by
where . Denote them as a vector . End for - 4:
Sort the elements of in descending order of . Denote the intervals bounded as . - 5:
compute every where is the point closest to in . - 6:
. - 7:
Compute via Problem ( 31). End for
|
4.2. Dictionary Pair Update Phase
To obtain
, we solve the following problem with all other variables fixed:
Such problem is a highly nonlinear optimization due to the definition of . Here we solve columnwisely by updating each column of .
For each
, we solve the following subproblem:
We denote
and
as the indices set as before. Set the elements of
corresponding to the indices
to be zeros and denote the new vector as
. This operation leads to a consequence that
≈0. Then we solve the following quadratic optimization problem that is easy to solve with least squares.
The optimization problem to pursue
is formulated as
where the frame operator
is given by
and
is defined as Equation (
22). The target function then becomes
which is denoted by
. We apply the gradient descent method to unconstraint version of Problem (
35) and then project the solution to the feasible space. The gradient is given by a very complicated form as follows
In order to reduce the complexity, the gradient can also be computed with the fixed
calculated in the previous step of the ADM. Then at the
k-th iteration, the gradient can be written as
where
. The descent step length can be obtained by optimizing the problem
with fixed
, which is given by
where
. For the frame condition
, we apply a SVD decomposition
and map the singular values to the interval
linearly. We denote the mapped singular matrix as
and reconstruct
by
.
We summarize our algorithm in Algorithm 2.
Algorithm 2 Dictionary pair learning algorithm |
Input and Initialization: Training data , frame bound A,B, iteration , gradient descent iterations r. Build frame and , either by using random entries, or using M randomly chosen data. Output: Frame , Sparse coefficients , and thresholding values For l=1: Sparse Coding Step:- 1:
Compute the sparse coefficients and the thresholding values via Algorithm 1. Frame Update Step: - 2:
Update in column-wise. Compute For Denote as the indices of zeros in the i-th column of W. Set . Compute via Equation ( 34). End For - 3:
Update via Gradient Descent and singular value map. For k=1:r Calculate Gradient via Equation ( 39) and calculate the descent step via Equation ( 40) End for - 4:
Apply SVD decomposition , map to obtain and reconstruct . End for
|
5. Restoration
The image restoration aims to reconstruct a high-quality image
from its degraded (e.g., noisy, blurred and/or downsampled) version
, denoted by
, where
H represents a blurring filter,
S the downsampling operator, and
is a noisy signal. For the signal satisfies the SSM-NTF, the restoration model based on SSM-NTF is formulated as
where
is an operator that extracts the
i-th patch of the image
and
is the
i-th column of
.
denotes a vector
with
operating on the
j-th element of
. On the right side of Equation (
41), the first term is the global force that demands the proximity between the degraded image
, and its high-quality version
. The rest terms are the local constraints to make sure every patch at location
i satisfies the SSM-NTF.
To solve Problem (
41), we apply Algorithm 1 to obtain the sparse coefficients
and the threshold values
. We mainly state the iterative method to obtain
. Assume the sign of
will not change much between two steps, we set it in the
k-th step by
. where
is the sign function. Denote
. We set
as an index set that satisfies
. Set
as a vector with elements
Then the non-convex and non-smooth threshold can be removed with the substitution that
Thus, in the
k-th step, the problem needs to be solved is expressed as
where ⊙ is point multiplication. This convex problem can be easily solved by gradient descent algorithm.
We summarized the restoration algorithm in Algorithm 3.
Algorithm 3 Restoration algorithm |
Input Training dictionaries , , iteration number r, a degraded image , set . Output: The high quality image - 1:
Compute and via the method in Algorithm 1. For k=1:r - 2:
compute . Set . set as an index set that satisfies . Set . - 3:
Solving the Problem ( 42) via gradient descent algorithm. End for
|
6. Complexity Analysis
In this section, we discuss the computational complexity of our sparse coding and dictionary pair learning algorithms with regard to those of conventional sparse model counterparts.
We first analyze complexities of the main components of the sparse coding (SC) and dictionary updating (DU) algorithms. In terms of SC, given a set of training samples, , the complexity of BtOMP of calculating is where K is the target sparsity and the complexity of threshold of calculating is , which cost most of time in SC step at each iteration. The sparse coefficients and the the threshold values are computed with fixed dictionaries and . Correspondingly, the traditional sparse coefficients is sparse approximated by dictionary and the computational complexity is .
In terms of DU, with the given training samples
, we learn a pair of dictionary
and
. We update
via Problem (
34) with a computational complexity of
. In order to update
. we need to calculate the gradient via Equation (
39) with a computational complexity of
and the step size via Equation (
40) with a computational complexity of
where
r is the iteration number of the gradient descent. For the traditional dictionary learning, the corresponding training set is
and the dictionary
is updated by SVD decomposition of rank-1 with a computational complexity of
.
8. Conclusions
In this paper, we propose a stable sparse model with non-tight frame (SSM-NTF) and further formulate a dictionary pair learning model to stably recover the signals. We theoretically analyze the rationality of the approximation for RIP with the non-tight frame condition. The proposed SSM-NTF has RIP and the closed-form expression of the sparse coefficients that ensure the stable recovery especially for seriously noise images. The proposed SSM-NTF contains both a synthesis sparse and an analysis system which share the common sparse coefficients without taking into account the thresholding. We also propose an efficient dictionary pair learning algorithm via developing an explicit analytical expression of the inherent relation between the dictionary pair. The proposed algorithm is capable of approximating structures of signals via a pair of adaptive dictionaries. The effectiveness of our proposed SSM-NTF and its corresponding algorithms are demonstrated in image denoising, image super-resolution and image inpainting. The results of numerical experiments show that the proposed SSM-NTF achieves superior to the compared methods in objective and subjective quality on most of the cases.
On the other hand, our proposed SSM-NTF is actually a 1D sparse model. The 1D sparse model suffers from high memory as well as high computational costs especially when handling high dimensional data. MD frame can be expressed as the kronecker product of a series of 1D frames. Benefitting from this good characteristic, in future work, we will extend our stable sparse model to propose an MD stable sparse model. Moreover, the proposed SSM-NTF is not effective enough to remove other kinds of noise (e.g., salt and pepper noise) as the loss function of SSM-NTF is gaussian. We would like to improve the performance of our model by changing the loss function.