1. Introduction
The purpose of image restoration is to reconstruct high-quality images from degraded images. This is a typical inverse problem, and its mathematical expression is

y = Hx + n, (1)

where x is the original image, y is the degraded observation, H denotes the degradation operator, and n is usually assumed to be zero-mean Gaussian white noise. Under different settings, Equation (1) can represent different image processing tasks. When H denotes the identity matrix, Equation (1) represents the image denoising task [1,2]; when H denotes a diagonal matrix whose diagonal entries are 1 or 0, Equation (1) represents an image inpainting task [3,4]; when H denotes a blurring operator, Equation (1) represents an image deblurring task [5,6]. In this paper, we focus on the image restoration task.
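As a minimal illustration (not part of the original experiments), the three degradation settings of Equation (1) can be simulated in NumPy as follows; the image size, noise level, and box-blur kernel are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random((32, 32))          # stand-in "original" image, values in [0, 1]
sigma = 0.05                      # noise standard deviation (arbitrary)

# Denoising: H is the identity, so y = x + n.
y_denoise = x + rng.normal(0.0, sigma, x.shape)

# Inpainting: H is a diagonal 0/1 mask (here roughly 20% of pixels dropped).
mask = (rng.random(x.shape) > 0.2).astype(float)
y_inpaint = mask * x + rng.normal(0.0, sigma, x.shape)

# Deblurring: H is a blurring operator (here a 3x3 box blur via convolution).
def box_blur(img):
    out = np.zeros_like(img)
    padded = np.pad(img, 1, mode="edge")
    for di in range(3):
        for dj in range(3):
            out += padded[di:di + img.shape[0], dj:dj + img.shape[1]]
    return out / 9.0

y_deblur = box_blur(x) + rng.normal(0.0, sigma, x.shape)
```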
In order to obtain high-quality reconstructed images, image prior knowledge is usually used to regularize the solution space. In general, image restoration can be expressed as the following minimization problem:

x̂ = argmin_x (1/2)‖y − Hx‖₂² + λΨ(x), (2)

where the first term (1/2)‖y − Hx‖₂² represents data fidelity, the second term Ψ(x) depends on the image prior, and λ is a regularization parameter that balances the two terms. Due to the ill-posed nature of image restoration, prior knowledge of the image plays an important role in improving the performance of an image restoration algorithm. In the past decades, various image prior models have been proposed, such as total variation [7], sparse representation [3,8,9,10,11], and deep convolutional neural networks (CNNs) [2,12,13].
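As a concrete instance of this minimization, the sketch below solves the data-fidelity-plus-prior problem for an inpainting operator H (a diagonal 0/1 mask) using the iterative shrinkage-thresholding algorithm (ISTA). The pixel-domain ℓ1 prior and the step size are illustrative choices only, not the priors studied in this paper:

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of the l1 norm: shrink each entry toward zero by t.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista_inpaint(y, mask, lam=0.01, step=1.0, iters=200):
    # Minimize 0.5*||y - mask*x||^2 + lam*||x||_1 by ISTA.
    # For a 0/1 mask, H^T(Hx - y) = mask * (mask*x - y).
    x = y.copy()
    for _ in range(iters):
        grad = mask * (mask * x - y)
        x = soft_threshold(x - step * grad, step * lam)
    return x
```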
Sparse representation is a commonly used technique in image processing. Sparse representation models are usually divided into two categories: analytical sparse representation models [14,15] and synthetic sparse representation models [3]. The analytical sparse representation model represents the signal by multiplying it with an analytical over-complete dictionary to produce a sparse effect. In this paper, we mainly study the synthetic sparse representation model. Generally speaking, synthetic sparse representation models in image processing can be further divided into two categories: patch-based sparse representation (PSR) [16,17] and group-based sparse representation (GSR) [3,9,10,11]. The PSR model assumes that each patch of an image can be modeled by a sparse linear combination of atoms from a learnable dictionary, which is usually learned from images or image datasets. Compared with traditional analytical dictionaries, such as the discrete cosine transform and wavelets, dictionaries learned directly from images can improve sparsity and are superior in adapting to the local structure of images. For example, K-SVD-based dictionary learning [17] not only shows good image denoising effects, but has also been extended to many image processing and computer vision tasks [18,19]. However, the PSR model uses an over-complete dictionary, which usually produces undesirable visual artifacts in image restoration [20]. Moreover, the PSR model ignores the correlation between similar patches [3,21], which usually leads to degraded restoration quality.
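To make the PSR pipeline concrete, the following NumPy sketch builds an over-complete 2-D DCT dictionary and sparsely codes a vector with orthogonal matching pursuit (OMP). The sizes (8×8 patches, 256 atoms) are arbitrary illustrative values, and the fixed DCT dictionary is a stand-in for the K-SVD learned dictionaries discussed above:

```python
import numpy as np

def dct_dictionary(patch=8, atoms=16):
    # Over-complete 2-D DCT dictionary: columns are vectorized patch x patch atoms.
    k = np.arange(patch)
    basis = np.cos(np.pi * (k[:, None] + 0.5) * np.arange(atoms)[None, :] / atoms)
    basis /= np.linalg.norm(basis, axis=0)
    D = np.kron(basis, basis)                  # shape (patch^2, atoms^2)
    return D / np.linalg.norm(D, axis=0)

def omp(D, y, k):
    # Orthogonal matching pursuit: greedily pick k atoms, refit by least squares.
    residual, support = y.copy(), []
    for _ in range(k):
        support.append(int(np.argmax(np.abs(D.T @ residual))))
        sub = D[:, support]
        coef, *_ = np.linalg.lstsq(sub, y, rcond=None)
        residual = y - sub @ coef
    code = np.zeros(D.shape[1])
    code[support] = coef
    return code
```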
Inspired by the success of the nonlocal self-similarity (NSS) prior [22], the GSR model was proposed. The GSR model uses the patch group instead of the image patch as the basic unit of sparse representation and shows great potential in various image processing tasks [3,8,9,11,23,24,25,26,27]. Dabov et al. [27] proposed the BM3D method, which combines transform-domain filtering with the NSS prior and is still one of the most effective denoising methods. Elad et al. [23] proposed an image denoising algorithm based on an improved K-SVD learned dictionary and non-local self-similarity, which combined a correlation-coefficient matching criterion with a dictionary clipping method. Mairal et al. [28] proposed learned simultaneous sparse coding (LSSC) for image restoration, improving the recovery performance of K-SVD [17] through GSR. Zhang et al. [24] used non-locally similar patches as data samples and estimated statistical parameters based on PCA training. Zhang et al. [3] proposed a group-based sparse representation model for image restoration, which is essentially equivalent to a low-rank minimization model. Dong et al. [25] developed structured sparse coding with a Gaussian-scale-mixture prior for image restoration. Zha et al. [8] proposed a joint model that integrates the PSR model and the GSR model, establishing a unified model for image restoration in the field of sparse representation. Wu et al. [11] proposed structured analysis sparsity learning (SASL), which combines structured sparse priors learned from the given degraded image and from reference images in an iterative and trainable manner. Zha et al. [9] introduced a group sparse residual constraint, further defining and simplifying the image restoration problem by reducing the group sparse residual. Zha et al. [26] proposed an image recovery method that uses NSS priors of both internal and external image data to develop the GSR model. Despite the great success of GSR models in various image restoration tasks, images restored by the traditional GSR model are prone to an over-smoothing effect [29]. At the same time, the traditional GSR model and its various improved variants only use the patch groups of the degraded image to minimize the approximation error, which produces over-fitting, especially when the degraded image is highly damaged.
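The grouping step that distinguishes GSR from PSR (collecting mutually similar patches via the NSS prior) can be sketched as follows; the patch size, search-window radius, and group size are illustrative values, not the settings used in this paper:

```python
import numpy as np

def patch_group(img, ref_ij, p=6, num_similar=8, search=5, stride=1):
    # Collect the num_similar patches closest (in squared L2 distance) to the
    # reference patch inside a local search window, stacking them as the
    # columns of a matrix: this patch group is the basic unit of GSR.
    i0, j0 = ref_ij
    ref = img[i0:i0 + p, j0:j0 + p].ravel()
    H, W = img.shape
    lo_i, hi_i = max(0, i0 - search), min(H - p, i0 + search)
    lo_j, hi_j = max(0, j0 - search), min(W - p, j0 + search)
    cands, dists = [], []
    for i in range(lo_i, hi_i + 1, stride):
        for j in range(lo_j, hi_j + 1, stride):
            patch = img[i:i + p, j:j + p].ravel()
            cands.append(patch)
            dists.append(np.sum((patch - ref) ** 2))
    order = np.argsort(dists)[:num_similar]
    return np.stack([cands[t] for t in order], axis=1)   # (p*p, num_similar)
```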
Therefore, we propose a hybrid sparse representation model. The model uses the NSS priors of both the degraded image and an external image dataset to perform image restoration more effectively. On this basis, a joint sparse representation model is introduced. This model integrates the PSR model and the GSR model into one model, which not only retains the advantages of both models but also reduces their shortcomings, so that the models in the sparse representation field are unified. For convenience of description, the proposed hybrid sparse representation model is called the HSR model; the NSS priors of the degraded image are called internal NSS priors, and the NSS priors of the external image dataset are called external NSS priors.
Figure 1 shows how the HSR model can repair degraded images. The contributions of this paper are summarized as follows:
(1) We propose a hybrid sparse representation model that combines the NSS priors of the degraded image and an external image dataset to make full use of the specific structure of the degraded image and the common characteristics of natural images;
(2) The introduction of the joint model into the HSR not only retains the advantages of the PSR model and the GSR model, but also alleviates their respective disadvantages.
The rest of this paper is organized as follows. Section 2 describes related work on sparse representation. Section 3 introduces how to learn the NSS prior from an external image corpus. Section 4 introduces the proposed hybrid sparse representation model. Section 5 employs an iterative algorithm based on the alternating direction method of multipliers (ADMM) framework to solve the proposed model. Section 6 presents the experimental results. Section 7 concludes the paper.
4. The Proposed Hybrid Sparse Representation Model
As mentioned above, the traditional sparse representation model only uses the internal NSS priors of the degraded image, which leads to over-fitting in the image restoration process. Therefore, this paper uses both the internal NSS priors of the degraded image and the external NSS priors of an external image dataset. At the same time, the PSR model usually produces some undesirable visual artifacts, and the GSR model leads to over-smoothing effects in various image processing tasks. In order to overcome their shortcomings and improve the image restoration effect, we introduce a joint model [8] based on both internal and external NSS priors. This model integrates the PSR model and the GSR model instead of using Equations (4) and (6) separately. Combining Equations (4), (6) and (17), the proposed hybrid sparse representation model is expressed as Equation (18), where the internal sparse coefficient and the internal joint dictionary belong to the joint sparse representation model, and the external dictionary is learned from the image groups of the external image dataset using the external NSS prior [26,30]. Two non-zero constants act as balance factors that make the solution of Equation (18) more feasible, and three regularization parameters balance the corresponding sparse coefficient terms. The patch-level sparse coefficient corresponds to the sparsity of image patches while maintaining the local consistency of the image, which reduces the over-smoothing effect; the group-level sparse coefficient corresponds to the sparsity of patch groups while maintaining the non-local consistency of the image, which suppresses undesirable visual artifacts. For specific details of the joint sparse representation model, please refer to [8]. Based on the above analysis, the proposed hybrid sparse representation model not only uses the internal and external NSS priors, but also unifies the sparse representation model.
The hybrid sparse representation model is applied to the task of image restoration by combining Equations (1) and (18), yielding Equation (19). In Equation (19), the internal dictionary and its sparse coefficient belong to the joint sparse representation model, while the external dictionary and the external sparse coefficient come from the external NSS prior. The hybrid sparse representation model in Equation (19) not only jointly considers the NSS priors of the internal image and the external image database, which provide mutually complementary information for image reconstruction, but also unifies the sparse representation model by combining the PSR model and the GSR model.
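Since Equations (18) and (19) appear only in the full paper, the following toy sketch illustrates just the general structure they describe: an "internal" dictionary D and an "external" dictionary E (both hypothetical stand-ins), weighted by balance factors tau and kappa and penalized by separate ℓ1 terms, minimized with a proximal-gradient (ISTA-style) iteration. This is not the ADMM-based solver developed in the paper:

```python
import numpy as np

def soft(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def hybrid_sparse_code(y, D, E, tau=0.5, kappa=0.5, lam1=0.01, lam2=0.01, iters=300):
    # Proximal gradient on:
    #   0.5*||y - (tau*D@a + kappa*E@b)||^2 + lam1*||a||_1 + lam2*||b||_1
    # which is ISTA on the stacked variable [a; b] with dictionary [tau*D, kappa*E].
    a = np.zeros(D.shape[1])
    b = np.zeros(E.shape[1])
    A = np.hstack([tau * D, kappa * E])
    step = 1.0 / np.linalg.norm(A, 2) ** 2    # 1/L, L = squared spectral norm
    for _ in range(iters):
        r = tau * D @ a + kappa * E @ b - y   # shared residual for both blocks
        a = soft(a - step * tau * (D.T @ r), step * lam1)
        b = soft(b - step * kappa * (E.T @ r), step * lam2)
    return a, b
```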
6. Experimental Results
In this section, the experimental results of the proposed HSR model and seven comparison methods are given, including the SALSA [40], BPFA [41], GSR [3], JPG-SR [8], GSRC-NLP [9], IR-CNN [42], and IDBP [43] methods. All experiments were carried out on a PC with an Intel(R) Core(TM) i7-6700 CPU @ 3.40 GHz under the MATLAB 2018b environment. The source code of all competing methods is openly available, and we used the default parameter settings. The 13 images used for the experimental tests are shown in Figure 2. In order to evaluate the quality of the restored images, an experimental comparative analysis was performed from both objective and subjective aspects. For objective evaluation, the peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) [44] metrics were used. The PSNR is calculated as shown in Equations (49) and (50):

MSE = (1/(M × N)) Σ_{i=1}^{M} Σ_{j=1}^{N} (x(i, j) − x̂(i, j))², (49)
PSNR = 10 log₁₀((2ⁿ − 1)² / MSE), (50)

where x and x̂ denote the original image and the restored image, respectively, and M × N denotes the size of the image. Equation (49) computes the mean squared error (MSE) between the original image x and the restored image x̂, and Equation (50) is the calculation formula of PSNR, where n is the number of bits per pixel. A larger value of PSNR indicates less image distortion. The calculation of SSIM is shown in Equations (51)–(55):
SSIM(x, x̂) = l(x, x̂) · c(x, x̂) · s(x, x̂), (51)
l(x, x̂) = (2μ_x μ_x̂ + C₁) / (μ_x² + μ_x̂² + C₁), (52)
c(x, x̂) = (2σ_x σ_x̂ + C₂) / (σ_x² + σ_x̂² + C₂), (53)
s(x, x̂) = (σ_xx̂ + C₃) / (σ_x σ_x̂ + C₃), (54)
σ_xx̂ = (1/(M × N)) Σ_{i=1}^{M} Σ_{j=1}^{N} (x(i, j) − μ_x)(x̂(i, j) − μ_x̂). (55)

In Equation (51), SSIM measures similarity in terms of luminance l(x, x̂), contrast c(x, x̂), and image structure s(x, x̂), where μ_x and μ_x̂ denote the means of the original image x and the restored image x̂ of size M × N, respectively; σ_x² and σ_x̂² denote the variances of x and x̂, respectively; and σ_xx̂ denotes the covariance of x and x̂. C₁, C₂, and C₃ are constants introduced to avoid a zero denominator. The SSIM metric is closer to human subjective perception, and its value range is [0, 1]. The larger the value of SSIM, the more similar the two images and the better the image restoration effect.
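Both metrics can be computed in a few lines of NumPy. Note that `ssim_global` below is a single-window simplification of the SSIM index (taking C₃ = C₂/2, which merges the contrast and structure terms into one factor); practical SSIM implementations such as [44] average the index over local windows:

```python
import numpy as np

def psnr(x, x_hat, bits=8):
    # Mean squared error over the M x N image, then peak-to-error ratio in dB.
    mse = np.mean((x.astype(float) - x_hat.astype(float)) ** 2)
    peak = (2 ** bits - 1) ** 2
    return 10.0 * np.log10(peak / mse)

def ssim_global(x, x_hat, bits=8):
    # Single-window SSIM with C3 = C2/2 (contrast and structure terms merged).
    L = 2 ** bits - 1
    c1, c2 = (0.01 * L) ** 2, (0.03 * L) ** 2
    x, x_hat = x.astype(float), x_hat.astype(float)
    mu_x, mu_y = x.mean(), x_hat.mean()
    var_x, var_y = x.var(), x_hat.var()
    cov = ((x - mu_x) * (x_hat - mu_y)).mean()
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / \
           ((mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))
```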
For color images, this paper focuses only on the restoration of the luminance channel in YCrCb space. In the group-based GMM learning phase, the training patch groups used in the experiments were collected from the Kodak PhotoCD dataset, which contains 24 natural images.
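As a stand-in for the group-based GMM learning phase (the actual training uses patch groups collected from the Kodak PhotoCD images), the following NumPy sketch fits a diagonal-covariance Gaussian mixture with EM to generic training vectors; the component count, iteration budget, and deterministic initialization are illustrative choices:

```python
import numpy as np

def gmm_em(X, k=3, iters=50):
    # Minimal diagonal-covariance GMM fitted with EM; each row of X is one
    # training vector (e.g. a vectorized patch).
    n, d = X.shape
    mu = X[np.linspace(0, n - 1, k).astype(int)].copy()  # deterministic init
    var = np.tile(X.var(axis=0) + 1e-6, (k, 1))
    pi = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step: responsibilities under each diagonal Gaussian component.
        logp = -0.5 * (((X[:, None, :] - mu[None]) ** 2 / var[None]).sum(-1)
                       + np.log(var).sum(-1)[None] + d * np.log(2 * np.pi))
        logp += np.log(pi)[None]
        logp -= logp.max(axis=1, keepdims=True)
        resp = np.exp(logp)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate mixture weights, means, and variances.
        nk = resp.sum(axis=0) + 1e-12
        pi = nk / n
        mu = (resp.T @ X) / nk[:, None]
        var = (resp.T @ X ** 2) / nk[:, None] - mu ** 2 + 1e-6
    return pi, mu, var
```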
6.1. Objective Evaluation
In the image restoration task, results are given for four masks, i.e., 80%, 70%, 60%, and 50% random pixel loss. The parameters of the HSR model used for image restoration are set as follows: the number of similar patches is set to 60, with the search window size, the image patch size, and the balance parameters fixed across all experiments. We compared the proposed HSR model with seven restoration methods: SALSA [40], BPFA [41], GSR [3], JPG-SR [8], GSRC-NLP [9], IR-CNN [42], and IDBP [43]. Among these seven methods, SALSA [40], BPFA [41], GSR [3], JPG-SR [8], and GSRC-NLP [9] are traditional image restoration algorithms. The GSR [3], JPG-SR [8], and GSRC-NLP [9] methods are based on the traditional GSR model and belong to the same type of model as our proposed HSR model, while SALSA [40] and BPFA [41] are not based on GSR. In order to comprehensively evaluate the performance of the proposed model, the HSR model was also compared with deep-learning-based algorithms [42,43].
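The four masks can be generated as diagonal 0/1 operators with the desired loss rates; the seed and image size below are arbitrary illustrative choices:

```python
import numpy as np

def random_loss_mask(shape, loss_rate, seed=0):
    # 0/1 mask with approximately `loss_rate` of the pixels dropped,
    # matching the 80%/70%/60%/50% random-pixel-loss settings.
    rng = np.random.default_rng(seed)
    return (rng.random(shape) >= loss_rate).astype(float)

masks = {rate: random_loss_mask((256, 256), rate) for rate in (0.8, 0.7, 0.6, 0.5)}
```

Applying a mask elementwise to an image (`mask * x`) then yields the degraded observation of Equation (1) for the inpainting setting.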
The SALSA model [40] proposes an algorithm belonging to the augmented Lagrangian family to handle constrained problems. In this method, the optimal regularization parameters are tuned by manual trial and error, which requires considerable time and effort. The BPFA model [41] uses a non-parametric Bayesian dictionary learning method for sparse image representation and takes image patches as the basic unit of sparse representation, ignoring the similarity between image patches. In terms of the average PSNR, the proposed HSR model is 4.74 dB and 6.19 dB higher than the SALSA and BPFA methods, respectively.
The GSR method [3] is a typical representative of the traditional GSR model, and the JPG-SR [8] and GSRC-NLP [9] methods are both improved methods based on the GSR model. These three methods only utilize the internal NSS prior, whereas the HSR model proposed in this paper combines internal and external NSS priors. In terms of the average PSNR, the proposed HSR model improves by 1.47 dB, 1.43 dB, and 1.06 dB over the GSR, JPG-SR, and GSRC-NLP methods, respectively. The IR-CNN [42] and IDBP [43] methods are deep-learning-based recovery methods that use the powerful prior knowledge of deep neural networks. In terms of the average PSNR, the proposed HSR model improves by 3.66 dB and 3.01 dB over the IR-CNN and IDBP methods, respectively.
As shown in Tables 1–4, the PSNR of the proposed HSR model on images with pixel loss rates of 80%, 70%, 60%, and 50% is higher than that of SALSA, BPFA, GSR, JPG-SR, GSRC-NLP, IR-CNN, and IDBP. The SSIM values in Tables 5–8 show that the HSR model is better than the other methods in most cases. The experimental results in Tables 1–8 demonstrate that the proposed HSR model is effective and gives good restoration results compared with the competing methods.
6.2. Subjective Assessment
The visual comparison between the proposed HSR model and the SALSA [40], BPFA [41], GSR [3], JPG-SR [8], GSRC-NLP [9], IR-CNN [42], and IDBP [43] methods on the image Mickey with an 80% pixel missing rate is given in Figure 3. It can be observed from Figure 3 that the SALSA [40] and BPFA [41] methods cannot recover sharp edges and fine details. The GSR [3] method recovers details better, but produces an over-smoothing effect. The JPG-SR [8] method can obtain better visual quality than the GSR [3] method. However, the objective evaluation results in Tables 1–8 show that, although the JPG-SR [8] method has a higher mean PSNR than the GSR [3] method in Tables 1–4, in the actual restoration process the PSNR and SSIM values of some restored images are lower than those of the GSR [3] method. The image restoration effect of the JPG-SR [8] method is thus unstable, and only some of its restoration results are better than those of the GSR [3] method. The GSRC-NLP [9] method can obtain visual effects similar to our proposed HSR model, which are difficult to distinguish with the naked eye; however, according to the experimental results in Tables 1–8, our proposed HSR model has better objective evaluation results. The visual results of our proposed method are also better in recovering details than those of IR-CNN [42] and IDBP [43]. The visual results in Figure 3 show that our proposed HSR model retains clear edges and details, especially at higher pixel missing rates, and produces the results with the best visual quality.
6.3. Running Time
In this section, we compare the proposed HSR method with the other methods in terms of running time. Taking the image Butterfly as an example, the running times of all methods are compared at a 50% pixel loss rate. As can be seen from Table 9, the processing time of the proposed HSR method is 5000.22 s for this image, which is less than the 5027.67 s of the GSRC-NLP method. The proposed HSR method uses NSS to construct internal and external image groups and needs to learn the corresponding dictionaries, which requires a higher computational workload and therefore consumes more time. To reduce the processing time in future work, the external NSS priors will be learned in advance through one-time learning from the Kodak PhotoCD dataset, and the pre-learned priors will then be applied to speed up the proposed HSR method.
Table 9, the processing time of HSR method proposed in this paper is 5000.22 s for the image, which is less than 5027.67 s of the GSRC-NLP method. The proposed HSR method utilizes NSS to construct internal and external image groups and needs to learn the corresponding dictionaries, which requires higher computational workload and therefore consumes more time. To reduce processing time in our future work, learning external NSS priors from the external data set will be done in advance in the Kodak photoCD data set. Through one-time learning from Kodak photoCD data set, the external NSS priors are obtained. The priors learned in advance are applied to speed up the proposed HSR method.