Article

Non-Local Prior Dense Feature Distillation Network for Image Compressive Sensing

School of Information and Electronic Engineering, Zhejiang University of Science and Technology, Hangzhou 310023, China
*
Author to whom correspondence should be addressed.
Information 2024, 15(12), 773; https://doi.org/10.3390/info15120773
Submission received: 8 November 2024 / Revised: 24 November 2024 / Accepted: 27 November 2024 / Published: 3 December 2024
(This article belongs to the Special Issue Computer Vision for Security Applications)

Abstract

Deep learning-based image compressive sensing (CS) methods often suffer from high computational complexity and significant loss of image details in reconstructions. A non-local prior dense feature distillation network (NPDFD-Net) is proposed for image CS. First, the non-local priors of images are leveraged to enhance high-frequency information in the measurements. Second, a discrete wavelet decomposition learning module and an inverse discrete wavelet reconstruction module are designed to reduce information loss and significantly lower computational complexity. Third, a feature distillation mechanism is incorporated into residual dense blocks to improve feature transmission efficiency. Finally, a multi-scale enhanced spatial attention module is proposed to strengthen feature diversity. Experimental results indicate that, compared to MR_CSGAN, OCTUF, and DPC-DUN, the proposed method achieves an average PSNR improvement of 1.52%, 2.35%, and 0.93%, respectively, on the Set5 dataset. The image reconstruction running time is reduced by 93.93%, 71.76%, and 40.74%, respectively. Furthermore, the proposed method exhibits significant advantages in restoring fine texture details in the reconstructed images.

1. Introduction

With the development of chip technology and the artificial intelligence industry, images have been widely utilized as a significant signal carrier. However, the resources required for transmitting and storing massive amounts of image data face increasing challenges. CS [1] is a vital signal theory that overcomes the limitations of the traditional Nyquist sampling theorem and holds potential research value in addressing this challenge. Research progress indicates that CS theory has achieved remarkable results in various image applications, such as single-pixel cameras [2], rapid magnetic resonance imaging (MRI) [3], snapshot compressive imaging [4], and medical imaging [5].
CS alleviates the pressure of data transmission and storage by designing an observation matrix at the compression end, which allows data to be sampled at rates far below the Nyquist rate. At the reconstruction end, CS recovers the image by designing reconstruction algorithms that solve convex optimization problems. Consequently, extensive exploration is required to find sparse representations of natural images and uncorrelated observation matrices. Due to strict mathematical constraints, emerging breakthroughs have been limited [6]. Moreover, the iterative optimization process of CS reconstruction is computationally intensive and inefficient. Mun et al. [7] introduced the concept of block compressive sensing (BCS) to address this issue. BCS first segments the image into non-overlapping blocks and then performs independent block sampling and reconstruction. However, BCS remains within the realm of classical CS methodologies.
In recent years, deep learning (DL) has achieved significant progress in computer vision [8,9]. It has been introduced into the compressed sensing (CS) field, yielding some promising results [10,11]. Compared with traditional CS methods, DL can adaptively learn semantic features of high-dimensional image data, achieving more precise feature extraction than conventional CS observation matrices while providing higher-resolution image reconstruction capabilities [12,13,14,15,16,17,18,19,20,21,22,23,24]. Traditional CS methods often rely on iterative solving processes, such as iterative hard thresholding or greedy algorithms, which have high computational complexity and slow reconstruction speed. In contrast, DL-based methods perform inference directly through end-to-end network structures. Once the model is trained, the reconstruction process only requires forward propagation, making it computationally very fast and suitable for real-time applications. Furthermore, the network structures in deep learning can organically integrate compression and sensing reconstruction in CS, fundamentally enabling end-to-end system learning in the CS domain [15,16].
DL-based CS methods can be divided into two categories: data-driven and deep unfolding methods. Using deep neural networks, data-driven methods establish a nonlinear mapping between the reconstructed image and the ground truth. These methods rely on large datasets for training and can automatically learn complex image features to achieve high-quality reconstruction. Typical examples include ReconNet [12], Deep Residual Reconstruction Network (DR2-Net) [14], and CSNet [15]. However, model complexity also rises as the network size increases, resulting in significantly higher computational costs and storage requirements. Some deep unfolding methods [13,16,17,19,21,22,23,24] have been proposed to enhance the interpretability of network models. The core idea of deep unfolding methods is to map the iterative process of traditional CS reconstruction algorithms to deep neural network structures. Specifically, deep unfolding methods treat each iteration step of classical algorithms as a layer in a neural network, unfolding the entire iteration process into a deep network and optimizing network parameters through end-to-end training. This approach retains the interpretability of traditional iterative algorithms while incorporating the representational power of DL, thereby significantly improving reconstruction efficiency. However, since deep unfolding methods simulate the classical iterative optimization process, they often require longer inference times, especially when handling complex images or multiple iterations, making computation speed a potential bottleneck.
Most deep learning-based compressed sensing (CS) methods primarily focus on image reconstruction, typically replacing traditional handcrafted measurement matrices with convolutional layers [15], where the sampling matrix is learned through convolutional kernels during training. However, the sampling matrix generated by convolutional layers tends to capture low-frequency information of images, resulting in the loss of high-frequency details and causing aliasing artifacts during image reconstruction [25]. Therefore, effectively improving the quality of image reconstruction and preserving texture details while balancing computational efficiency and model complexity remains a pressing challenge.
To address the aforementioned issues, this paper proposes an image compressive sensing method based on a non-local prior dense feature distillation network. The contributions of this paper are as follows:
  • To reduce network complexity, a discrete wavelet decomposition learning module and an inverse discrete wavelet reconstruction module were designed to replace traditional downsampling and upsampling methods. This approach significantly reduces computational complexity while maintaining high reconstruction quality and better extracts and restores multi-scale image features, thereby enhancing the details and overall visual quality of the final reconstruction. Furthermore, a dense feature distillation block was proposed, which dynamically adjusts the number of feature reuses based on the distance between features to achieve efficient feature reuse and reduce redundancy in the network.
  • To enhance texture information in the reconstructed image, a non-local prior sampling module was first designed to fully leverage the original image’s non-local information. This module enables the network to more comprehensively capture the dependencies between distant pixels in the original image, thereby improving the recovery of texture details. Additionally, a multi-scale enhanced spatial attention module was introduced to enrich the diversity of feature representations, allowing the model to reconstruct the image’s texture information more accurately.
  • Experimental results show that NPDFD-Net demonstrates a significant advantage in running speed compared to other methods. Additionally, this model can produce more realistic texture details in image reconstruction and exhibits robust resistance to noise, providing an efficient and high-performance solution for CS image reconstruction.
In the remainder of this paper, related work is presented in Section 2. Section 3 describes the model structure and working principles of the proposed NPDFD-Net method. Detailed experiments, which comprehensively analyze the performance of NPDFD-Net and other methods, are presented in Section 4. Finally, the conclusion of our work is provided in Section 5.

2. Related Works

2.1. Theory of CS

For an original image $x \in \mathbb{R}^{N}$, CS theory projects $x$ into a low-dimensional space $y \in \mathbb{R}^{M}$ using an observation matrix $\Phi \in \mathbb{R}^{M \times N}$. CS theory can be expressed as:
$$y = \Phi x \tag{1}$$
where $y$ is referred to as the CS measurements. Due to $M \ll N$, the inverse reconstruction process of $x$ via $y$ is typically underdetermined, meaning that $x$ has infinitely many solutions. To ensure the uniqueness of $x$, CS theory requires that $x$ be sparse or sparse in a certain transform domain. It can be expressed as:
$$x = \Psi \alpha \tag{2}$$
where $\Psi \in \mathbb{R}^{N \times N}$ represents the sparse transform basis, and $\alpha \in \mathbb{R}^{N}$ denotes the sparse coefficients of the original signal after the sparse transformation. CS theory requires that $x$ satisfies $K$-sparsity in the transform domain $\Psi$, meaning that $x$ has $K$ ($K \ll N$) non-zero coefficients in the transform domain $\Psi$.
By combining Equations (1) and (2), the CS process can be uniformly expressed as:
$$y = \Phi x = \Phi \Psi \alpha = A \alpha$$
where $A$ is referred to as the sensing matrix, and the observation matrix $\Phi$ must satisfy the restricted isometry property (RIP), which can be expressed as:
$$(1 - \delta)\|x\|_2^2 < \|\Phi x\|_2^2 < (1 + \delta)\|x\|_2^2$$
where $\delta \in (0, 1)$ represents the restricted isometry constant. Since determining whether a measurement matrix satisfies RIP is highly complex, an equivalent condition is typically required, namely that the observation matrix is incoherent with the sparse transform basis.
In classical CS reconstruction methods, the reconstruction of the signal is essentially an $\ell_1$-norm minimization problem:
$$x = \min_{x} \|x\|_1 \quad \text{s.t.} \quad y = \Phi x = \Phi \Psi \alpha$$
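The measurement model and the sparsity-driven recovery described in this subsection can be illustrated with a small numerical sketch. It assumes that Ψ is the identity (so x itself is sparse) and that Φ is a random Gaussian matrix, and it swaps the equality-constrained problem for the closely related l1-regularized least-squares form, solved here with plain ISTA. The sizes, the seed, and the weight `lam` are illustrative choices, not values from this paper.

```python
import numpy as np

def soft_threshold(v, t):
    """Element-wise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def objective(Phi, y, x, lam):
    """0.5 * ||y - Phi x||_2^2 + lam * ||x||_1."""
    r = y - Phi @ x
    return 0.5 * float(r @ r) + lam * float(np.abs(x).sum())

def ista(Phi, y, lam=0.01, n_iter=500):
    """Plain ISTA with step size 1/L, where L is the squared spectral norm of Phi."""
    L = np.linalg.norm(Phi, 2) ** 2
    x = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        grad = Phi.T @ (Phi @ x - y)          # gradient of the data-fidelity term
        x = soft_threshold(x - grad / L, lam / L)
    return x

# Illustrative sizes (assumptions): N-dimensional signal, M << N measurements, K non-zeros.
rng = np.random.default_rng(0)
N, M, K = 64, 32, 3
x_true = np.zeros(N)
x_true[rng.choice(N, size=K, replace=False)] = rng.normal(size=K)
Phi = rng.normal(size=(M, N)) / np.sqrt(M)    # random Gaussian observation matrix
y = Phi @ x_true                              # CS measurements, y = Phi x
x_hat = ista(Phi, y)
```

Each ISTA iteration is one gradient step followed by one shrinkage step. This per-iteration structure is what deep unfolding methods map onto network layers.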

2.2. Deep Learning-Based CS

ReconNet [12] was the first deep learning method to use a convolutional neural network (CNN) for compressive sensing; its image reconstruction quality and speed significantly surpass those of classical iterative CS reconstruction algorithms. Zhang et al. [13] mapped the iterative shrinkage-thresholding algorithm (ISTA) into a network to enhance its interpretability. Yao et al. [14] proposed DR2-Net, which reconstructs images from CS observations through linear mapping and residual networks. Shi et al. [15] designed a sampling sub-network for training and learning the observation matrix. Furthermore, they proposed an image CS framework using convolutional neural networks, which jointly optimizes image CS through sampling and reconstruction networks. Zhang et al. [16] incorporated the constraints of traditional observation matrices into the sampling network and reduced blocking artifacts in image reconstruction by leveraging the correlation between image patches. AMP-Net [17] combines the Approximate Message Passing (AMP) denoising algorithm with a CNN, and jointly trains the measurement matrix and the network to enhance reconstruction performance. Tian et al. [18] proposed the MR_CSGAN model with multi-scale structural features. Recently, researchers have introduced Transformer models to further enhance reconstruction quality. Shen et al. [19] mapped ISTA into a Transformer and combined it with a CNN to form a hybrid architecture. Ye et al. [20] leveraged the Transformer's strength in capturing global context information to combine a CNN and a Transformer, proposing the CSformer model. Song et al. [21] proposed an optimization-inspired cross-attention transformer module. The dynamic path-controllable deep unfolding network (DPC-DUN) adjusts the complexity through a path-controllable selector [22]. Zhang et al. [23] proposed a U-shaped transformer that allocates measurement resources based on the sparsity of image patches. Li et al. [24] proposed the Dual-Domain Deep Convolutional Encoding Network (D3C2-Net), which effectively transmits high-capacity adaptive convolutional features.

2.3. Sampling Network

At the sampling end of CS methods, researchers have found that multi-scale sampling can extract image features at various levels [25]. Yin et al. [26] proposed using multi-level wavelet transforms to sample from sparse signals at the sampling end. Shi et al. [27] proposed a multi-scale deep network for image compressive sensing, where the same deep reconstruction network is used to recover images at different sampling rates, thereby reducing storage requirements. Zhang et al. [28] introduced an adaptive multi-scale image CS network in the wavelet domain, where images are decomposed using discrete wavelet transforms. Since low-frequency information is more crucial than high-frequency information for final reconstruction quality, higher sampling rates are allocated to low-frequency sub-bands to improve reconstruction performance.
The original image often contains rich structural and texture information in compressive sensing image reconstruction. If this prior information can be effectively exploited and utilized, it can significantly enhance the reconstruction accuracy and visual quality. However, most existing methods focus on extracting generic features, neglecting the modeling and application of priors specific to the original image. This limitation affects the reconstruction quality to some extent, particularly in restoring high-frequency details and textures.

3. Methodology

3.1. The Proposed Network Architecture

The proposed NPDFD-Net method is an end-to-end network architecture comprising a non-local prior sampling sub-network, initial reconstruction sub-network, and deep reconstruction sub-network, as illustrated in Figure 1. The input image is sampled using a non-local prior sampling sub-network to obtain the measurements, and the initial reconstruction sub-network transforms the measurements from a low-dimensional to a high-dimensional space to complete the initial image reconstruction. The deep reconstruction sub-network performs high-quality reconstruction of the initial reconstructed image. In this section, we will present detailed information about our NPDFD-Net.

3.2. Non-Local Prior Sampling and Initial Reconstruction Sub-Network

To capture the non-local similarity information in the original image, we designed a non-local prior (NP) sampling sub-network based on cross-attention [29], as shown in Figure 2. First, the input image $X \in \mathbb{R}^{H \times W \times 1}$ is passed through three convolutional layers with kernel sizes of 1 × 1, 3 × 3, and 5 × 5 to generate three feature maps: Q, K, and V. Next, Q, K, and V are reshaped to obtain $Q \in \mathbb{R}^{HW \times 1}$, $K^{T} \in \mathbb{R}^{1 \times HW}$, and $V \in \mathbb{R}^{HW \times 1}$. Then, $Q$ and $K^{T}$ are multiplied to obtain the similarity matrix S, where S describes the similarity between different positions in the input image.
Subsequently, the similarity matrix S is normalized using the softmax function to generate an attention map At. The attention map At is multiplied by the feature map V to obtain an enhanced feature result. The multiplied result is then restored to the same size as the input image X through a reshape operation. Finally, the result is fused with the input image via a skip connection to obtain the final output $\hat{X}$, forming a residual unit that effectively preserves the low-frequency information of the original input while enhancing the capture of high-frequency features. This process can be described by the following equation:
$$\hat{X} = R(\mathrm{softmax}(Q K^{T}) V) + X$$
where softmax represents the softmax function, and R denotes the reshape operation.
The NP sampling sub-network effectively exploits the non-local similarity information in the input image through the cross-attention mechanism, enhancing its ability to model global dependencies and retain more high-frequency information in the measurements. Consequently, this improves image reconstruction quality, particularly showing significant advantages in restoring complex texture details.
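The attention computation of the NP sampling sub-network can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the three convolutions use random single-channel kernels instead of learned weights, zero padding is assumed so that the spatial size is preserved, and the helper names (`conv2d_same`, `non_local_prior`) are hypothetical.

```python
import numpy as np

def conv2d_same(img, kernel):
    """Single-channel 'same' correlation with zero padding (odd kernel sizes only)."""
    k = kernel.shape[0]
    p = k // 2
    padded = np.pad(img, p)
    out = np.zeros_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(padded[i:i + k, j:j + k] * kernel)
    return out

def softmax(s, axis=-1):
    """Numerically stable softmax along one axis."""
    s = s - s.max(axis=axis, keepdims=True)
    e = np.exp(s)
    return e / e.sum(axis=axis, keepdims=True)

def non_local_prior(x, wq, wk, wv):
    """Residual non-local attention: reshape(softmax(Q K^T) V) + X."""
    h, w = x.shape
    q = conv2d_same(x, wq).reshape(h * w, 1)   # Q: (HW, 1)
    k = conv2d_same(x, wk).reshape(h * w, 1)   # K: (HW, 1), so K^T is (1, HW)
    v = conv2d_same(x, wv).reshape(h * w, 1)   # V: (HW, 1)
    attn = softmax(q @ k.T, axis=-1)           # attention map: (HW, HW)
    enhanced = (attn @ v).reshape(h, w)        # reshape back to the image size
    return enhanced + x, attn                  # skip connection keeps the input

# Toy input and random kernels (assumptions for illustration only).
rng = np.random.default_rng(0)
x = rng.random((8, 8))
wq = rng.normal(size=(1, 1))
wk = rng.normal(size=(3, 3))
wv = rng.normal(size=(5, 5))
x_hat, attn = non_local_prior(x, wq, wk, wv)
```

The attention map has HW × HW entries, so its memory cost grows quadratically with the number of pixels in the input.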
In BCS, the input image is first divided into non-overlapping blocks of size B × B. The working principle of the NP sampling sub-network can be represented by the following equation:
$$y_i = \Phi_B \ast x_i$$
where $x_i \in \mathbb{R}^{N}$ denotes the $i$-th sub-block of the input image after segmentation, $y_i \in \mathbb{R}^{M}$ represents the measurements output by the NP sampling sub-network for the $i$-th sub-block, $\ast$ denotes the convolution operation, and $\Phi_B$ represents the observation matrix, which is implemented as a convolutional layer whose kernel size and stride are both B × B. Here N = B × B, and the sampling ratio is defined as r = M/N. The parameters of the NP sampling sub-network are learned adaptively, and no bias or activation function is used, in order to preserve the fine-grained information of the image.
The initial reconstruction sub-network is composed of convolution and pixelshuffle layers. The initial reconstruction process can be represented by the following equation:
$$X_{init} = \mathrm{PixelShuffle}(\Phi_B^{T} y_i)$$
where $\Phi_B^{T} \in \mathbb{R}^{N \times M}$ corresponds to a convolutional layer with $B^2$ filters of size $rB^2 \times 1 \times 1$. PixelShuffle(·) denotes the pixelshuffle layer, which reshapes each $B^2 \times 1 \times 1$ tensor into a $1 \times B \times B$ tensor and forms the initial reconstruction image $X_{init}$.
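Block sampling and the initial reconstruction can be sketched together in NumPy. The sketch below writes the B × B strided convolution as a matrix product over flattened blocks, and the 1 × 1 convolution with B² filters as a product with the transposed matrix. The function names are hypothetical, and the rounding rule used to derive M from r is an assumption, since the text only defines r = M/N.

```python
import numpy as np

def num_measurements(r, B):
    """M = round(r * B^2); the rounding rule is an assumption."""
    return int(round(r * B * B))

def to_blocks(img, B):
    """Split an (H, W) image into non-overlapping BxB blocks, one flattened block per column."""
    H, W = img.shape
    blocks = img.reshape(H // B, B, W // B, B).transpose(0, 2, 1, 3)  # (H/B, W/B, B, B)
    return blocks.reshape(-1, B * B).T                                # (N, num_blocks)

def sample(img, Phi, B):
    """y_i = Phi x_i for every block; same layout as a conv with kernel and stride B."""
    H, W = img.shape
    Y = Phi @ to_blocks(img, B)                                       # (M, num_blocks)
    return Y.reshape(Phi.shape[0], H // B, W // B)                    # (M, H/B, W/B)

def pixel_shuffle(t, B):
    """(B*B, h, w) -> (h*B, w*B): each B^2-vector becomes one BxB block."""
    _, h, w = t.shape
    return t.reshape(B, B, h, w).transpose(2, 0, 3, 1).reshape(h * B, w * B)

def initial_reconstruction(Y, Phi, B):
    """Apply Phi^T as a 1x1 convolution with B^2 filters, then pixel-shuffle."""
    M, h, w = Y.shape
    feats = (Phi.T @ Y.reshape(M, h * w)).reshape(B * B, h, w)
    return pixel_shuffle(feats, B)

# Toy check with B = 4 and an identity observation matrix (M = N), so that the
# round trip must return the input exactly. Learned matrices have M < N.
B = 4
img = np.arange(64, dtype=float).reshape(8, 8)
Phi = np.eye(B * B)
Y = sample(img, Phi, B)
X_init = initial_reconstruction(Y, Phi, B)
```

With B = 32, as used in the experiments, `num_measurements` gives 10, 41, 102, and 256 measurements per block for r = 0.01, 0.04, 0.10, and 0.25 under the assumed rounding.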

3.3. Discrete Wavelet Decomposition Learning Module

Traditional downsampling methods in CNNs, such as max pooling and average pooling, can lead to the loss of fine-grained information, such as image boundaries and textures, which is detrimental to fine image reconstruction. To our knowledge, researchers have not proposed an effective downsampling and upsampling method in image CS reconstruction. In contrast, wavelet transforms can better preserve these details. Moreover, wavelets possess visual capabilities such as multi-scale analysis and global feature representation [30,31]. We designed the discrete wavelet decomposition learning module (DWDLM) based on the Haar wavelet, as illustrated in Figure 3 (where the symbol marked 2 denotes a 2× downsampling operation), which includes two processes: the subband feature decomposition stage and the channel expansion stage.
In the subband feature decomposition stage, we employ four filters from the two-dimensional discrete wavelet transform (2D-DWT) algorithm based on Haar wavelets to decompose the initial reconstructed image into four distinct subbands: the approximate subband $X_{LL}$, horizontal detail subband $X_{HL}$, vertical detail subband $X_{LH}$, and diagonal detail subband $X_{HH}$. Notably, the length and width of these subbands are each half the size of the initial reconstructed image. Therefore, the computational load of the network model is reduced to 1/4 of the original. After decomposition, the features from these four subband channels are concatenated into a single feature map with a channel dimension of 4. The four subband filters $f_{LL}$, $f_{HL}$, $f_{LH}$, and $f_{HH}$ used in 2D-DWT are defined as follows:
$$f_{LL} = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix},\quad f_{HL} = \begin{bmatrix} -1 & 1 \\ -1 & 1 \end{bmatrix},\quad f_{LH} = \begin{bmatrix} -1 & -1 \\ 1 & 1 \end{bmatrix},\quad f_{HH} = \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}$$
In the channel expansion stage, a 1 × 1 convolution layer is applied to expand the dimensions of the feature map to 64, aiming to enhance the model’s feature representation capability.
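The two DWDLM stages can be sketched as follows. The sketch assumes unnormalized 2 × 2 Haar filters with one common sign convention, applied with stride 2, and it uses random weights for the 1 × 1 channel-expansion convolution; the names `haar_dwt2` and `conv1x1` are hypothetical.

```python
import numpy as np

# Unnormalized 2x2 Haar filters. The sign convention is an assumption.
HAAR_FILTERS = {
    "LL": np.array([[1, 1], [1, 1]]),
    "HL": np.array([[-1, 1], [-1, 1]]),
    "LH": np.array([[-1, -1], [1, 1]]),
    "HH": np.array([[1, -1], [-1, 1]]),
}

def haar_dwt2(img):
    """Apply each 2x2 filter with stride 2; returns (4, H/2, W/2) in LL, HL, LH, HH order."""
    H, W = img.shape
    patches = img.reshape(H // 2, 2, W // 2, 2).transpose(0, 2, 1, 3)  # (H/2, W/2, 2, 2)
    return np.stack([(patches * f).sum(axis=(2, 3)) for f in HAAR_FILTERS.values()])

def conv1x1(feats, weight):
    """1x1 convolution: weight has shape (C_out, C_in), feats has shape (C_in, h, w)."""
    return np.einsum("oc,chw->ohw", weight, feats)

# A constant image has no detail, so only the LL subband is non-zero.
img = np.full((4, 4), 3.0)
subbands = haar_dwt2(img)                        # (4, 2, 2)

rng = np.random.default_rng(0)
expand_weight = rng.normal(size=(64, 4))         # random stand-in for learned weights
features = conv1x1(subbands, expand_weight)      # (64, 2, 2)
```

Because each subband has half the height and width of the input, every later convolution at this resolution visits a quarter of the spatial positions.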

3.4. Multi-Scale Enhanced Spatial Attention

The enhanced spatial attention (ESA) [32] has certain limitations in enhancing the representational capacity of the network, primarily due to its relatively fixed receptive field, which makes it challenging to capture the rich and diverse features in images. To further enhance the network's representational capability and better learn diverse features that increase the texture details in reconstructed images, we improved the ESA and propose the multi-scale enhanced spatial attention (MESA), as shown in Figure 4.
First, a 1 × 1 convolution layer reduces the number of channels, achieving a lightweight design and lowering computational complexity. Next, a strided convolution with a 3 × 3 kernel and a stride of 2 expands the receptive field. This approach provides a broader range of local information, enhancing the network’s effectiveness in local mapping. Generally, a larger receptive field enhances the network’s ability to capture features in local regions, resulting in more accurate mappings. To further expand the receptive field and capture multi-scale features, a maxpooling operation with a window size of 7 and strides of 3, 5, and 7 was first applied, creating an abstract representation of the input features at multiple levels of receptive fields. Subsequently, these multi-scale features were learned through the Convgroups module to extract deeper semantic information. An upsampling layer was then used to restore the spatial dimensions of the features, aligning them with the original resolution. Next, the features from the three scales were fused to integrate information from various receptive fields. Simultaneously, a skip connection mechanism combined high-resolution features from before spatial downsampling with the fused features, forming a residual unit.
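The spatial sizes produced by the three pooling branches can be worked out with the standard output-size formula. The sketch below assumes no padding in the strided convolution and in the pooling layers, which the text does not specify, and it takes a 48 × 48 feature map as input (a 96 × 96 training patch halved by the DWDLM). The function names are hypothetical.

```python
def conv_out(n, kernel, stride, padding=0):
    """Output length of a convolution or pooling layer along one dimension."""
    return (n + 2 * padding - kernel) // stride + 1

def mesa_branch_sizes(n, pool_window=7, pool_strides=(3, 5, 7)):
    """Side length after the stride-2 3x3 convolution, then after each max-pooling branch."""
    reduced = conv_out(n, kernel=3, stride=2)   # the 1x1 conv before it keeps the size
    return reduced, [conv_out(reduced, pool_window, s) for s in pool_strides]

reduced, branches = mesa_branch_sizes(48)
print(reduced, branches)  # 23 [6, 4, 3]
```

Under these assumptions the three branches see the input at 6 × 6, 4 × 4, and 3 × 3, which is why an upsampling layer is needed before the branches can be fused with the high-resolution skip features.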

3.5. Dense Feature Distillation Block

In the residual dense block (RDB) [33], the input to each convolution layer consists of the output feature maps from all previous convolution layers. While this indiscriminate dense feature reuse helps capture more information, it also leads to redundancy in the network structure and increased computational complexity. To address this issue, we introduced the concept of feature distillation [32] into the RDB and proposed the dense feature distillation block (DFDB), as shown in Figure 5. Specifically, feature distillation is achieved through a 1 × 1 convolution layer, dynamically adjusting the number of features reused by the receiving convolutional layer. The reuse strategy is determined by the distance between the output features of each convolutional layer and the receiving convolutional layer, enabling effective selection and reuse of features. This effectively reduces the reuse of output features from neighboring convolution layers, lowering the redundancy among features. At the same time, to ensure sufficient feature learning, the receiving convolution layer still receives all 64 output feature maps from the immediately preceding convolution layer. If the receiving convolution layer is $F_n$, it can be expressed as:
$$F_n = \sum_{v=1}^{n-1} \omega_v \times F_{n-v}$$
where $\omega_v$ is the weight coefficient, refined through a 1 × 1 convolution layer to distill the features $F_{n-v}$ at a distance $v$ from the receiving convolution layer, and × denotes the multiplication operation.
The theory of hybrid dilated convolution (HDC) [34] provides a flexible method to adjust the network's receptive field without increasing the number of parameters, effectively addressing the grid effect that often arises when using dilated convolutions with the same dilation rate. To this end, the DFDB employs dilated convolutions that adhere to HDC theory as the fundamental operator. Specifically, the dilation rates d of the three dilated convolutions are set to 1, 2, and 3, respectively, allowing for greater flexibility in meeting the diverse feature extraction needs at different scales and ensuring that information across various scales can be fully utilized. Additionally, we insert the proposed MESA after the second convolution layer and the activation function. This approach helps optimize the features extracted by the preceding convolution layers, enabling subsequent convolution layers to learn based on more representative and effective features, thereby further improving the quality and detail representation of the reconstructed image. The principle of DFDB can be expressed as:
$$F_1^{64} = \sigma(W_{3,d=1} Z_{n-1}^{64})$$
$$F_2^{64} = \sigma(W_{3,d=2}(\mathrm{Concat}(Z_{n-1}^{16}, F_1^{64})))$$
$$F_2^{64} = \Gamma_{MESA}(F_2^{64}) \times F_2^{64}$$
$$F_3^{64} = \sigma(W_{3,d=3}\,\mathrm{Concat}(Z_{n-1}^{32}, F_1^{16}, F_2^{64}))$$
$$Z_n^{64} = \mathrm{Concat}(Z_{n-1}^{48}, F_1^{32}, F_2^{16}, F_3^{64})\, W_1 + Z_{n-1}^{64}$$
where $Z_{n-1}^{64}$ represents the input feature of the $n$-th DFDB; $Z_{n-1}^{j}$ (j = 16, 32, 48) represents the j features distilled from the input feature map of the $n$-th DFDB using a 1 × 1 convolution layer; $W_{3,d=k}$ represents the 3 × 3 convolution layer with a dilation rate of k; σ represents the ReLU activation function; Concat denotes the concatenation operation; $F_i^{j}$ (i = 1, 2, 3; j = 16, 32, 48) denotes the j output features distilled, using a 1 × 1 convolution layer, from the features produced by the $i$-th convolution layer and ReLU activation function; $\Gamma_{MESA}(\cdot)$ represents the MESA module; $W_1$ represents the 1 × 1 convolution layer; and $Z_n^{64}$ represents the output feature map of the $n$-th DFDB.
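The channel bookkeeping implied by the DFDB equations can be checked with a short calculation. The sketch below uses the weights ω = 1, 0.25, 0.5, 0.75 for distances v = 1, 2, 3, 4 that are reported in the experimental settings, and compares the input widths with those of a plain RDB with the same 64-feature layers. The function names are hypothetical, and the receptive-field formula assumes stride-1 3 × 3 convolutions.

```python
FEATURES = 64
OMEGA = {1: 1.0, 2: 0.25, 3: 0.5, 4: 0.75}   # weight per distance v

def distilled(v):
    """Number of features reused from a source at distance v."""
    return int(FEATURES * OMEGA[v])

def dfdb_input_channels():
    """Input channels of conv 1, conv 2, conv 3, and the final 1x1 fusion layer."""
    # Receiving layer n sees the block input and all earlier outputs, at distances 1..n.
    return [sum(distilled(v) for v in range(1, n + 1)) for n in range(1, 5)]

def rdb_input_channels():
    """Same layers with indiscriminate dense reuse (every source contributes 64)."""
    return [FEATURES * n for n in range(1, 5)]

def receptive_field(dilations, kernel=3):
    """Receptive field of stacked stride-1 dilated convolutions."""
    return 1 + sum((kernel - 1) * d for d in dilations)

print(dfdb_input_channels())       # [64, 80, 112, 160]
print(rdb_input_channels())        # [64, 128, 192, 256]
print(receptive_field((1, 2, 3)))  # 13
```

The counts 80 = 16 + 64, 112 = 32 + 16 + 64, and 160 = 48 + 32 + 16 + 64 match the Concat arguments in the equations above. Distillation shrinks the fusion layer's input from 256 to 160 channels.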

3.6. Inverse Discrete Wavelet Reconstruction Module

We design the inverse discrete wavelet reconstruction module (IDWRM) to perform the upsampling operation by restoring low-resolution wavelet subbands to high-resolution feature maps. As shown in Figure 6, the IDWRM consists of two processes: the channel reduction stage and the reconstruction output stage. The channel reduction stage uses a 1 × 1 convolution layer to reduce the channel dimension of the wavelet subbands from 64 to 4. In the reconstruction output stage, the two-dimensional inverse discrete wavelet transform (2D-IDWT) algorithm restores the wavelet subbands into high-resolution feature maps, followed by a 3 × 3 convolution layer that outputs the final reconstructed image.
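The 2D-IDWT step can be sketched as the exact inverse of an unnormalized Haar decomposition. The sketch assumes the same filter sign convention as a standard unnormalized Haar 2D-DWT (an assumption, since normalization and signs vary between implementations), and it omits the learned 1 × 1 and 3 × 3 convolutions of the IDWRM. The function names are hypothetical.

```python
import numpy as np

def haar_dwt2(img):
    """Unnormalized Haar 2D-DWT; returns (4, H/2, W/2) in LL, HL, LH, HH order."""
    a, b = img[0::2, 0::2], img[0::2, 1::2]   # top-left, top-right of each 2x2 patch
    c, d = img[1::2, 0::2], img[1::2, 1::2]   # bottom-left, bottom-right
    return np.stack([a + b + c + d, -a + b - c + d, -a - b + c + d, a - b - c + d])

def haar_idwt2(subbands):
    """Inverse of haar_dwt2: four (h, w) subbands -> one (2h, 2w) map."""
    ll, hl, lh, hh = subbands
    h, w = ll.shape
    out = np.empty((2 * h, 2 * w), dtype=float)
    # The filter rows are mutually orthogonal with squared norm 4, hence the 1/4.
    out[0::2, 0::2] = (ll - hl - lh + hh) / 4
    out[0::2, 1::2] = (ll + hl - lh - hh) / 4
    out[1::2, 0::2] = (ll - hl + lh - hh) / 4
    out[1::2, 1::2] = (ll + hl + lh + hh) / 4
    return out

rng = np.random.default_rng(0)
img = rng.random((6, 8))
restored = haar_idwt2(haar_dwt2(img))   # lossless round trip
```

Unlike pooling followed by interpolation, this decomposition and reconstruction pair discards nothing by itself. Any loss comes from the learned layers in between.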

4. Experiment

4.1. Experimental Settings

We select the DIV2K [35] dataset as the training image set. The DIV2K dataset contains 800 images, expanded to 96,000 images through data augmentation techniques such as random cropping, translation, and rotation. We use the Set11 [12] dataset for validation and the Set5 [36], Set14 [37], and BSD100 [38] datasets as test sets. All experiments are conducted using the PyTorch 1.12 deep learning framework. The experimental platform consists of an Intel Core i7-13700KF CPU and an RTX4070Ti GPU. The input batch size is set to 64, and the size of the image patches is set to 96 × 96. We use four commonly used experimental sampling ratios of r = 0.01, 0.04, 0.10 and 0.25. Other parameters are set to B = 32 and $\omega_{v=1,2,3,4} = 1, 0.25, 0.5, 0.75$. We use an adaptive moment estimation (Adam) [39] optimizer to optimize all network parameters, with a learning rate set to 0.0004. The model is trained for 80 epochs to achieve better convergence, with 3000 iterations per epoch. The learning rate is halved every ten epochs. We chose peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM) as the objective metrics to evaluate the quality of reconstructed images. The loss function used is the Mean Absolute Error (MAE).
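The training schedule and the simpler metrics above can be written down compactly. This is a sketch under stated assumptions: the learning rate is taken to drop at the start of epochs 10, 20, and so on (counting from 0), and PSNR is computed with a peak value of 255, which the text does not state. SSIM is omitted, and the function names are hypothetical.

```python
import math
import numpy as np

def learning_rate(epoch, base_lr=4e-4, step=10, gamma=0.5):
    """Step decay: the rate is halved every `step` epochs (epochs counted from 0)."""
    return base_lr * gamma ** (epoch // step)

def mae_loss(pred, target):
    """Mean absolute error, the training loss."""
    return float(np.mean(np.abs(pred - target)))

def psnr(pred, target, peak=255.0):
    """Peak signal-to-noise ratio in dB; the peak value is an assumption."""
    mse = float(np.mean((pred - target) ** 2))
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)

EPOCHS, ITERS_PER_EPOCH = 80, 3000
total_iterations = EPOCHS * ITERS_PER_EPOCH          # 240000 parameter updates
final_lr = learning_rate(EPOCHS - 1)                 # 4e-4 / 2**7 = 3.125e-6

# Toy images that differ by one gray level everywhere (MSE = 1, MAE = 1).
target = np.zeros((4, 4))
pred = np.ones((4, 4))
toy_psnr = psnr(pred, target)                        # 20 * log10(255) dB
toy_mae = mae_loss(pred, target)
```

In PyTorch, a schedule of this shape is commonly expressed with a step-based learning-rate scheduler. The source does not state which mechanism the authors used.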

4.2. Performance of Different Numbers of DFDBs

Experiments were conducted to assess the impact of the number of DFDBs on the quality of reconstructed images. The results are shown in Table 1, where the average values represent the mean PSNR across the four sampling ratios and red marks the best results. The results from the three datasets in Table 1 indicate that the network structure with 7 DFDBs outperforms the other configurations in terms of image quality metrics. The network with 8 DFDBs surpasses the 7-DFDB configuration only slightly and only at a few sampling ratios, which suggests that it may be experiencing some degree of overfitting. In contrast, the configuration with 6 DFDBs exhibits a significant overall disadvantage compared to the configuration with 7 DFDBs. Based on these experimental findings, subsequent experiments employ the deep reconstruction sub-network with 7 DFDBs.

4.3. Ablation Experiment Analysis of the Reconstruction Network Structure

In this section, we designed four variant models for ablation experiments to verify the effectiveness of the proposed improvements. The four variant models are as follows. N1: The ESA module is used instead of the MESA to evaluate the effect of the improved MESA. N2: The RDB module is used instead of the DFDB to validate the contribution of the DFDB to the overall network performance. N3: Traditional max pooling and nearest-neighbor interpolation upsampling modules replace the DWDLM and IDWRM to assess the impact of the improved wavelet decomposition and reconstruction modules on performance. N4: Both DWDLM and IDWRM are removed to further analyze the combined effect of these two modules. Through comparative experiments with these variant models, we can systematically evaluate the specific contributions of each improved module, thereby validating the effectiveness of the overall improvement scheme and its impact on the quality of the reconstructed images. The four variants are tested at a sampling ratio of r = 0.10. The number of parameters (Params) and floating-point operations (FLOPs) for a 256 × 256 input image during forward propagation are also measured. The results of the ablation experiment are shown in Table 2, where red marks the best results.
The data in Table 2 indicate that introducing MESA enhances the network's ability to recover critical information while adding only a minimal amount of Params and FLOPs compared to ESA. When the distillation strategy is applied to the RDB (i.e., the RDB is replaced by the DFDB), PSNR and SSIM increase by 0.48 dB and 0.0129, 0.32 dB and 0.0176, and 0.19 dB and 0.0111 on the three datasets, respectively. Additionally, FLOPs and Params decrease by 7.44 G and 503.89 K, respectively. This indicates that distillation enhances the efficiency of feature reuse in the RDB and reduces the complexity of the model. Compared to traditional upsampling and downsampling methods, the proposed DWDLM and IDWRM incur lower accuracy loss. Compared to not using downsampling and upsampling modules at all, the introduction of DWDLM and IDWRM dramatically reduces the computational complexity of the model, with FLOPs decreasing by 57.08 G, thus significantly improving the network's inference speed. Furthermore, the PSNR and SSIM metrics for the three datasets improve by 0.18 dB and 0.0021, 0.16 dB and 0.0034, and 0.09 dB and 0.0027, respectively, indicating that the application of DWDLM and IDWRM helps alleviate network overfitting and enhances the quality of the reconstructed images.

4.4. Effectiveness Analysis of the NP Sampling Sub-Network

To evaluate the impact of the proposed NP sampling sub-network on the quality of reconstructed images, a comparative experiment was conducted against a baseline that samples using only a convolutional layer with a kernel size of 32 × 32 and a stride of 32 × 32, at a sampling ratio of r = 0.25. The experimental results are shown in Figure 7, where the horizontal axis represents the training epochs, and the vertical axis represents the PSNR metric. The results indicate that the proposed NP sampling sub-network significantly improves the PSNR of the reconstructed image.
Furthermore, to further validate the effectiveness of the proposed method, we tested an example image, as shown in Figure 8. The regions within the red boxes are enlarged to display the local texture details more clearly. The results demonstrate that the image reconstructed using the NP sampling sub-network exhibits finer and more realistic texture details.

4.5. Comparisons with State-of-the-Art Methods

In this section, we compare the objective quality of the images reconstructed by NPDFD-Net with that of other state-of-the-art methods: ISTA-Net+ [13], DR2-Net [14], OPINE-Net+ [16], AMP-Net [17], TransCS [19], MR_CSGAN [18], OCTUF [21], DPC-DUN [22], and D3C2-Net [24]. The experimental results on the Set5, Set14, and BSD100 datasets are shown in Table 3, Table 4 and Table 5, respectively. For ease of comparison, the red and blue results represent the best and second-best results, respectively. The results in Table 3 demonstrate that on the Set5 dataset, NPDFD-Net shows a significant advantage over the other methods. Except for the PSNR value at the sampling ratio r = 0.25, which is slightly lower than that of the MR_CSGAN and OCTUF methods, NPDFD-Net achieves the best PSNR and SSIM metrics at the other three sampling ratios. In particular, at the low sampling ratio r = 0.01, the PSNR and SSIM are improved by 0.13 dB and 0.0242 over MR_CSGAN and by 0.20 dB and 0.0050 over DPC-DUN. The critical challenge for CS-based image reconstruction is to achieve high-quality reconstruction from as few measurements as possible. This is of significant practical importance for resource-constrained platforms, such as vehicle-mounted systems and embedded mobile devices, where transmission bandwidth and storage capacity are limited. Additionally, NPDFD-Net consistently outperforms the other methods in average reconstruction quality across all four sampling ratios. Specifically, the average PSNR and SSIM of NPDFD-Net are improved by 0.94 dB and 0.0139, 0.47 dB and 0.0166, 0.72 dB and 0.0122, and 0.29 dB and 0.0109 compared to TransCS, MR_CSGAN, OCTUF, and DPC-DUN, respectively.
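The average gains quoted above are differences between the "Average" columns of Table 3, which in turn are means over the four sampling ratios. As a small worked check, the sketch below recomputes the average PSNR and the resulting gains for three of the compared methods from the per-ratio Set5 values; the data layout and function names are illustrative.

```python
# PSNR/dB at r = 0.01, 0.04, 0.10, 0.25, copied from Table 3 (Set5).
SET5_PSNR = {
    "MR_CSGAN": (24.40, 28.83, 32.81, 37.77),
    "OCTUF": (23.12, 28.71, 33.18, 37.78),
    "DPC-DUN": (24.33, 29.28, 33.51, 37.39),
    "NPDFD-Net": (24.53, 29.63, 33.77, 37.74),
}


def average(values):
    """Mean over the sampling ratios, rounded to two decimals as in the tables."""
    return round(sum(values) / len(values), 2)


def average_gain(method, ours="NPDFD-Net", table=SET5_PSNR):
    """Average PSNR of `ours` minus average PSNR of `method`, in dB."""
    return round(average(table[ours]) - average(table[method]), 2)


if __name__ == "__main__":
    for name in ("MR_CSGAN", "OCTUF", "DPC-DUN"):
        print(f"{name}: average {average(SET5_PSNR[name]):.2f} dB, "
              f"gain {average_gain(name):.2f} dB")
```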
The results in Table 4 indicate that on the Set14 dataset, the proposed method demonstrates an overall advantage over the state-of-the-art methods TransCS, MR_CSGAN, OCTUF, and DPC-DUN. The average PSNR increases by 0.65 dB, 0.16 dB, 0.41 dB, and 0.06 dB, while the average SSIM increases by 0.0099, 0.0086, 0.0068, and 0.0104, respectively. Additionally, our method is evaluated on the larger BSD100 dataset, with results shown in Table 5. The proposed method achieves the best or second-best PSNR and SSIM metrics at the sampling ratios r = 0.01, 0.04, and 0.10. At the higher sampling ratio r = 0.25, it is slightly inferior to the OCTUF and MR_CSGAN methods. Overall, the proposed method maintains a clear advantage. The average PSNR and SSIM are improved by 0.36 dB and 0.0068 compared to the OCTUF method, and by 0.09 dB and 0.0071 compared to the MR_CSGAN method.
We randomly chose two example images and tested them at the sampling ratios r = 0.04 and 0.25, respectively. The results are shown in Figure 9 and Figure 10, where the local details within the red box are enlarged. Both objective evaluation metrics, PSNR and SSIM, demonstrate that NPDFD-Net has a significant advantage over the other methods. Additionally, from the perspective of subjective visual perception, the images reconstructed by NPDFD-Net exhibit higher overall quality and the best reconstruction of local texture details.
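For reference, the PSNR values reported throughout this section follow the standard definition PSNR = 10 · log10(MAX² / MSE). The sketch below implements that definition for greyscale images stored as lists of rows. It assumes an 8-bit peak value of 255; the exact evaluation protocol of the paper (for example, which colour channel is used) is not stated in this section.

```python
import math


def psnr(reference, reconstructed, peak=255.0):
    """PSNR in dB between two equally sized greyscale images (lists of rows)."""
    errors = [(a - b) ** 2
              for row_a, row_b in zip(reference, reconstructed)
              for a, b in zip(row_a, row_b)]
    mse = sum(errors) / len(errors)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


if __name__ == "__main__":
    original = [[100, 100], [100, 100]]
    degraded = [[110, 90], [100, 100]]
    print(f"{psnr(original, degraded):.2f} dB")
```

Because the scale is logarithmic, a fixed gain in dB corresponds to a fixed ratio of mean squared errors, regardless of the absolute PSNR level.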

4.6. Comparison of Noise Robustness

We add random Gaussian noise with a mean of 0 and standard deviations of 0.001, 0.005, 0.010, and 0.020 to the images in the Set14 dataset. At a sampling ratio of r = 0.10, we experimentally compare the noise robustness of NPDFD-Net with that of AMP-Net [17], TransCS [19], MR_CSGAN [18], DPC-DUN [22], OCTUF [21], and D3C2-Net [24], as shown in Figure 11.
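The noise model of this experiment can be sketched as follows. This is an illustrative helper, not the authors' test script: it treats the four listed values as standard deviations, as the discussion of Figure 11 does, and it assumes that pixel values are normalised to [0, 1] and clipped after the noise is added. The function name and the `seed` parameter are hypothetical.

```python
import random


def add_gaussian_noise(image, sigma, seed=None):
    """Add zero-mean Gaussian noise with standard deviation `sigma`.

    Assumes pixel values are normalised to [0, 1]; results are clipped
    back into that range.
    """
    rng = random.Random(seed)
    return [[min(1.0, max(0.0, pixel + rng.gauss(0.0, sigma))) for pixel in row]
            for row in image]


if __name__ == "__main__":
    clean = [[0.5] * 8 for _ in range(8)]
    for sigma in (0.001, 0.005, 0.010, 0.020):  # noise levels used above
        noisy = add_gaussian_noise(clean, sigma, seed=0)
        worst = max(abs(n - c)
                    for row_n, row_c in zip(noisy, clean)
                    for n, c in zip(row_n, row_c))
        print(f"sigma={sigma:.3f}  max |noise| = {worst:.4f}")
```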
Experimental results show that the proposed NPDFD-Net demonstrates superior noise robustness. Specifically, as the standard deviation of the Gaussian noise increases, the PSNR gap between NPDFD-Net and D3C2-Net gradually narrows. When the standard deviation reaches 0.020, our method surpasses D3C2-Net in terms of PSNR, exhibiting superior reconstruction quality. Further analysis of the PSNR slope between adjacent noise standard deviations shows that the PSNR of NPDFD-Net declines more gently as the noise standard deviation increases. In other words, the image quality of NPDFD-Net degrades more slowly as the noise level rises, which demonstrates significant noise robustness. This robustness is crucial for compressed sensing reconstruction tasks, especially in real-world applications where noise interference is unavoidable. We attribute this strong robustness primarily to the feature learning capability of NPDFD-Net in the wavelet domain. The wavelet transform disperses noise energy across different frequency bands, with most noise concentrated in high-frequency regions, while the image’s primary structure and essential details are preserved in low- and mid-frequency regions.
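The wavelet argument above can be illustrated with a one-level 2-D Haar transform. The sketch below is a toy demonstration under stated assumptions: the Haar wavelet is chosen only for simplicity and is not necessarily the wavelet used by DWDLM, the test images are synthetic, and the sub-band names LL, LH, HL, and HH follow one common convention. It compares the share of energy that falls into the three detail sub-bands for a smooth image and for pure Gaussian noise.

```python
import random


def haar_dwt2(image):
    """One-level orthonormal 2-D Haar transform of an even-sized image.

    Returns the sub-bands (LL, LH, HL, HH), each half the input size.
    """
    ll, lh, hl, hh = [], [], [], []
    for i in range(0, len(image), 2):
        r_ll, r_lh, r_hl, r_hh = [], [], [], []
        for j in range(0, len(image[0]), 2):
            a, b = image[i][j], image[i][j + 1]
            c, d = image[i + 1][j], image[i + 1][j + 1]
            r_ll.append((a + b + c + d) / 2)  # local average
            r_lh.append((a - b + c - d) / 2)  # left-right difference
            r_hl.append((a + b - c - d) / 2)  # top-bottom difference
            r_hh.append((a - b - c + d) / 2)  # diagonal difference
        ll.append(r_ll)
        lh.append(r_lh)
        hl.append(r_hl)
        hh.append(r_hh)
    return ll, lh, hl, hh


def energy(band):
    """Sum of squared coefficients of one sub-band."""
    return sum(v * v for row in band for v in row)


def high_frequency_share(image):
    """Fraction of the total energy carried by the LH, HL and HH sub-bands."""
    ll, lh, hl, hh = haar_dwt2(image)
    detail = energy(lh) + energy(hl) + energy(hh)
    return detail / (energy(ll) + detail)


if __name__ == "__main__":
    rng = random.Random(0)
    smooth = [[(i + j) / 126 for j in range(64)] for i in range(64)]
    noise = [[rng.gauss(0.0, 0.02) for _ in range(64)] for _ in range(64)]
    print("smooth image:", round(high_frequency_share(smooth), 4))
    print("pure noise:  ", round(high_frequency_share(noise), 4))
```

For white noise, the energy is expected to split roughly evenly over the four sub-bands, so about three quarters of it lands in the detail sub-bands, whereas a smooth image keeps almost all of its energy in LL. A network that processes the sub-bands separately can therefore treat noise-dominated coefficients differently from structure-dominated ones.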

4.7. Comparison of Running Time

We further evaluate the average running time of different methods for a 256 × 256 image at a sampling rate of r = 0.10. The results are shown in Table 6, where the red and blue results represent the best and second-best results, respectively. All methods are tested on the same hardware and software platform, except for DR2-Net, whose result is taken from its original paper. Table 6 shows that the average running time of our method is only 0.0096 s, the shortest among all compared methods. Among the other methods, DPC-DUN and MR_CSGAN exhibit higher reconstruction quality. However, our method surpasses both DPC-DUN and MR_CSGAN in reconstruction quality while also achieving a shorter average running time.
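The running times in Table 6 can be turned into relative reductions, and an average running time of this kind can be measured with a simple timing loop. The sketch below shows both under stated assumptions: the helper names are hypothetical, the warm-up call and the number of repeats are illustrative choices rather than the authors' protocol, and the timed workload is a stand-in for a reconstruction network.

```python
import time

# Average running time in seconds for a 256 x 256 image at r = 0.10 (Table 6).
RUNTIME_S = {
    "MR_CSGAN": 0.1189,
    "OCTUF": 0.0340,
    "DPC-DUN": 0.0162,
    "NPDFD-Net": 0.0096,
}


def runtime_reduction(method, ours="NPDFD-Net", table=RUNTIME_S):
    """Percentage by which `ours` shortens the running time of `method`."""
    return round(100.0 * (table[method] - table[ours]) / table[method], 2)


def average_runtime(func, repeats=10):
    """Mean wall-clock time of `func()` over `repeats` calls, in seconds."""
    func()  # warm-up call, excluded from the average
    start = time.perf_counter()
    for _ in range(repeats):
        func()
    return (time.perf_counter() - start) / repeats


if __name__ == "__main__":
    for name in ("MR_CSGAN", "OCTUF", "DPC-DUN"):
        print(f"{name}: {runtime_reduction(name):.2f}% shorter running time")
    # Stand-in workload; a real measurement would call the reconstruction network.
    print(average_runtime(lambda: sum(i * i for i in range(10_000))))
```

When the network runs on a GPU, the device has to be synchronised before the clock is read, because kernel launches return before the computation has finished.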

5. Conclusions

This paper proposes a novel image compressive sensing method based on non-local priors and dense feature distillation to enhance texture details in image reconstruction and improve computational efficiency. The NP sampling sub-network captures long-range dependencies in the original image, thereby retaining more high-frequency information and significantly enhancing the detail representation of the reconstructed images. The DFDB effectively reduces feature redundancy through selective distillation, enhances feature reuse efficiency, and lowers model complexity. HDC and MESA further improve the network’s ability to model multi-scale features and essential information. Compared to traditional upsampling and downsampling methods, the proposed DWDLM and IDWRM effectively reduce information loss, lower computational costs, and significantly enhance the network’s inference speed, while demonstrating strong robustness against noise interference.
Although NPDFD-Net has demonstrated excellent experimental performance, it lacks sufficient theoretical interpretability. The ‘black box’ nature of deep learning models makes their internal mechanisms difficult to explain, which can be challenging for applications requiring high transparency. In the future, we plan to combine traditional optimization algorithms with our method to better balance interpretability and reconstruction performance. Additionally, our method provides a new idea for lightweight CS models, reducing computational complexity and storage requirements and promoting applications in resource-constrained environments, such as embedded devices and mobile platforms.

Author Contributions

Conceptualization, M.F.; methodology, X.H.; software, M.F.; validation, M.F. and X.H.; formal analysis, K.Z.; investigation, M.F. and X.H.; data curation, M.F.; writing—original draft, X.H.; writing—review & editing, K.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Basic Public Welfare Research Program of Zhejiang Province under Grant No. LGF22F020017.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The code for this study is available from https://github.com/hanzhitangxi/NPDFD-Net/tree/master (accessed on 10 November 2024).

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Donoho, D.L. Compressed sensing. IEEE Trans. Inf. Theory 2006, 52, 1289–1306.
  2. Duarte, M.F.; Davenport, M.A.; Takhar, D.; Laska, J.N.; Sun, T.; Kelly, K.F.; Baraniuk, R.G. Single-pixel imaging via compressive sampling. IEEE Signal Process. Mag. 2008, 25, 83–91.
  3. Lustig, M.; Donoho, D.; Pauly, J.M. Sparse MRI: The application of compressed sensing for rapid MR imaging. Magn. Reson. Med. 2007, 58, 1182–1195.
  4. Liu, Y.; Yuan, X.; Suo, J.L.; Brady, D.J.; Dai, Q.H. Rank minimization for snapshot compressive imaging. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 41, 2990–3006.
  5. Liu, Y.; Wu, S.; Huang, X.; Chen, B.; Zhu, C. Hybrid CS-DMRI: Periodic time-variant subsampling and omnidirectional total variation based reconstruction. IEEE Trans. Med. Imaging 2017, 36, 2148–2159.
  6. Zhang, T. Sparse recovery with orthogonal matching pursuit under RIP. IEEE Trans. Inf. Theory 2011, 57, 6215–6221.
  7. Mun, S.; Fowler, J.E. Block Compressive sensing of images using directional transforms. In Proceedings of the 2009 IEEE International Conference on Image Processing (ICIP), Cairo, Egypt, 7–10 November 2009; pp. 3021–3024.
  8. Munsif, M.; Khan, N.; Hussain, A.; Kim, M.J.; Baik, S.W. Darkness-adaptive action recognition: Leveraging efficient tubelet slow-fast network for industrial applications. IEEE Trans. Industr Inform. 2024, in press.
  9. Munsif, M.; Khan, S.U.; Khan, N.; Hussain, A.; Kim, M.J.; Baik, S.W. Contextual visual and motion salient fusion framework for action recognition in dark environments. Knowl. Based Syst. 2024, 304, 112480.
  10. Zhang, J.; Chen, B.; Xiong, R.; Zhang, Y. Physics-inspired compressive sensing: Beyond deep unrolling. IEEE Signal Process. Mag. 2023, 40, 58–72.
  11. Machidon, A.L.; Pejović, V. Deep learning for compressive sensing: A ubiquitous systems perspective. Artif. Intell. Rev. 2023, 56, 3619–3658.
  12. Kulkarni, K.; Lohit, S.; Turaga, P.; Kerviche, R.; Ashok, A. ReconNet: Non-Iterative Reconstruction of Images from Compressively Sensed Measurements. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 449–458.
  13. Zhang, J.; Ghanem, B. ISTA-Net: Interpretable optimization-inspired deep network for image compressive sensing. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–23 June 2018; pp. 1828–1837.
  14. Yao, H.; Dai, F.; Zhang, S.; Zhang, Y.; Tian, Q. DR2-Net: Deep residual reconstruction network for image compressive sensing. Neurocomputing 2019, 359, 483–493.
  15. Shi, W.; Jiang, F.; Zhang, S.; Zhao, D. Image compressed sensing using convolutional neural network. IEEE Trans. Image Process. 2019, 29, 375–388.
  16. Zhang, J.; Zhao, C.; Gao, W. Optimization-inspired compact deep compressive sensing. IEEE J. Sel. Top. Signal Process. 2020, 14, 765–774.
  17. Zhang, Z.; Liu, Y.; Liu, J.; Wen, F.; Zhu, C. AMP-Net: Denoising-based deep unfolding for compressive image sensing. IEEE Trans. Image Process. 2020, 30, 1487–1500.
  18. Tian, J.; Yuan, W.; Tu, Y. Image compressed sensing using multi-scale residual generative adversarial network. Vis. Comput. 2021, 38, 4193–4202.
  19. Shen, M.; Gan, H.; Ning, C.; Hua, Y.; Zhang, T. TransCS: A transformer-based hybrid architecture for image compressive sensing. IEEE Trans. Image Process. 2022, 31, 6991–7005.
  20. Ye, D.; Ni, Z.; Wang, H.; Zhang, J.; Wang, S.; Kwong, S. Csformer: Bridging convolution and transformer for compressive sensing. IEEE Trans. Image Process. 2023, 32, 2827–2842.
  21. Song, J.; Mou, C.; Wang, S.; Ma, S.W.; Zhang, J. Optimization-Inspired Cross-Attention Transformer for Compressive Sensing. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 17–24 June 2023; pp. 6174–6184.
  22. Song, J.; Chen, B.; Zhang, J. Dynamic path-controllable deep unfolding network for compressive sensing. IEEE Trans. Image Process. 2023, 32, 2202–2214.
  23. Zhang, K.; Hua, Z.; Li, Y.; Zhang, Y.; Zhou, Y. Uformer-ICS: A U-Shaped Transformer for Image Compressive Sensing Service. IEEE Trans. Serv. Comput. 2023, 17, 2974–2988.
  24. Li, W.; Chen, B.; Liu, S.; Zhao, S.; Du, B.; Zhang, Y.; Zhang, J. D3C2-Net: Dual-Domain Deep Convolutional Coding Network for Compressive Sensing. IEEE Trans. Circuits Syst. Video Technol. 2024, 34, 9341–9355.
  25. Canh, T.N.; Jeon, B. Multi-scale deep compressive sensing network. In Proceedings of the 2018 IEEE Visual Communications and Image Processing (VCIP), Taichung, Taiwan, 9–12 December 2018; pp. 1–4.
  26. Yin, Z.; Shi, W.; Wu, Z.; Zhang, J. Multilevel wavelet-based hierarchical networks for image compressive sensing. Pattern Recognit. 2022, 129, 108758.
  27. Shi, W.; Jiang, F.; Liu, S.; Zhao, D. Multi-scale deep networks for image compressive sensing. In Proceedings of the IEEE International Conference on Image Processing (ICIP), Athens, Greece, 7–10 October 2018; pp. 46–50.
  28. Zhang, K.; Hua, Z.; Li, Y.; Chen, Y.; Zhou, Y. Ams-net: Adaptive multi-scale network for image compressive sensing. IEEE Trans. Multimed. 2022, 25, 5676–5689.
  29. Huang, Z.; Wang, X.; Huang, L.; Huang, C.; Wei, Y.; Liu, W. Ccnet: Criss-cross attention for semantic segmentation. In Proceedings of the IEEE/CVF International Conference On Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019; pp. 603–612.
  30. Zhao, X.; Huang, P.; Shu, X. Wavelet-Attention CNN for image classification. Multimed. Syst. 2022, 28, 915–924.
  31. Duan, Y.; Liu, F.; Jiao, L.; Zhao, P.; Zhang, L. SAR image segmentation based on convolutional-wavelet neural network and Markov random field. Pattern Recognit. 2017, 64, 255–267.
  32. Liu, J.; Zhang, W.; Tang, Y.; Wu, G. Residual feature aggregation network for image super-resolution. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 2359–2368.
  33. Zhang, Y.; Tian, Y.; Kong, Y.; Zhong, B.; Fu, Y. Residual dense network for image super-resolution. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–22 June 2018; pp. 2472–2481.
  34. Wang, P.; Chen, P.; Yuan, Y.; Liu, D.; Huang, Z.; Hou, X.; Cottrell, G. Understanding convolution for semantic segmentation. In Proceedings of the 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), Lake Tahoe, NV, USA, 12–15 March 2018; pp. 1451–1460.
  35. Timofte, R.; Agustsson, E.; Gool, V.L.; Yang, M.; Zhang, L. Ntire 2017 challenge on single image super-resolution: Methods and results. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Honolulu, HI, USA, 21–26 July 2017; pp. 114–125.
  36. Bevilacqua, M.; Roumy, A.; Guillemot, C.; Morel, A. Low-Complexity Single Image Super-Resolution Based on Nonnegative Neighbor Embedding. In Proceedings of the 23rd British Machine Vision Conference (BMVC), Surrey, UK, 3–7 September 2012; pp. 1–10.
  37. Zeyde, R.; Elad, M.; Protter, M. On Single Image Scale-Up Using Sparse-Representations. In Proceedings of the Curves and Surfaces: 7th International Conference, Avignon, France, 24–30 June 2010; pp. 711–730.
  38. Martin, D.; Fowlkes, C.; Tal, D.; Malik, J. A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Vancouver, BC, Canada, 7–14 July 2001; pp. 416–423.
  39. Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980.
Figure 1. The architecture of the proposed NPDFD-Net.
Figure 2. Principle schematic diagram of NP sampling and initial reconstruction sub-network.
Figure 3. Principle schematic diagram of DWDLM.
Figure 4. Principle schematic diagram of MESA.
Figure 5. Principle schematic diagram of DFDB.
Figure 6. Principle schematic diagram of IDWRM.
Figure 7. Comparison of average PSNR among different sampling sub-networks on the Set5 dataset.
Figure 8. Comparison of reconstructed image quality using different sampling sub-networks at a sampling rate r = 0.25.
Figure 9. Comparison of reconstructed image quality using various methods at a sampling rate r = 0.04.
Figure 10. Comparison of reconstructed image quality using various methods at a sampling rate r = 0.25.
Figure 11. Comparison of robustness for different noise intensities added to the Set14 dataset.
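Table 1. The impact of varying numbers of DFDBs on the quality of reconstructed images.

| Datasets | Number of DFDBs | PSNR/dB (r = 0.01) | SSIM (r = 0.01) | PSNR/dB (r = 0.04) | SSIM (r = 0.04) | PSNR/dB (r = 0.10) | SSIM (r = 0.10) | PSNR/dB (r = 0.25) | SSIM (r = 0.25) | PSNR/dB (Average) | SSIM (Average) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Set5 | 8 | 24.55 | 0.6672 | 29.56 | 0.8578 | 33.63 | 0.9287 | 37.71 | 0.9641 | 31.36 | 0.8545 |
| Set5 | 7 | 24.53 | 0.6686 | 29.63 | 0.8573 | 33.77 | 0.9306 | 37.74 | 0.9651 | 31.42 | 0.8554 |
| Set5 | 6 | 23.44 | 0.6613 | 29.50 | 0.8507 | 33.60 | 0.9291 | 37.77 | 0.9653 | 31.08 | 0.8516 |
| Set14 | 8 | 23.06 | 0.5460 | 26.77 | 0.7403 | 29.85 | 0.8440 | 33.52 | 0.9241 | 28.30 | 0.7636 |
| Set14 | 7 | 23.01 | 0.5698 | 26.77 | 0.7388 | 29.88 | 0.8461 | 33.66 | 0.9276 | 28.33 | 0.7706 |
| Set14 | 6 | 23.02 | 0.5617 | 26.61 | 0.7329 | 29.85 | 0.8459 | 33.54 | 0.9261 | 28.26 | 0.7667 |
| BSD100 | 8 | 23.87 | 0.5460 | 26.53 | 0.7001 | 28.81 | 0.8107 | 32.19 | 0.9074 | 27.84 | 0.7411 |
| BSD100 | 7 | 23.87 | 0.5481 | 26.52 | 0.6986 | 28.82 | 0.8108 | 32.28 | 0.9102 | 27.87 | 0.7419 |
| BSD100 | 6 | 23.82 | 0.5413 | 26.45 | 0.6942 | 28.80 | 0.8112 | 32.32 | 0.9106 | 27.85 | 0.7368 |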
Table 2. Ablation experiments on each module in the reconstruction sub-network.

| Variants | FLOPs/G | Params/K | PSNR/dB (Set5) | SSIM (Set5) | PSNR/dB (Set14) | SSIM (Set14) | PSNR/dB (BSD100) | SSIM (BSD100) |
|---|---|---|---|---|---|---|---|---|
| N1 | 19.04 | 1628.53 | 33.65 | 0.9220 | 29.75 | 0.8423 | 28.78 | 0.8088 |
| N2 | 26.49 | 2136.70 | 33.29 | 0.9177 | 29.56 | 0.8285 | 28.63 | 0.7997 |
| N3 | 18.81 | 1632.81 | 33.65 | 0.9290 | 29.77 | 0.8438 | 28.80 | 0.8098 |
| N4 | 76.13 | 1632.81 | 33.59 | 0.9285 | 29.72 | 0.8427 | 28.73 | 0.8081 |
| NPDFD-Net | 19.05 | 1632.81 | 33.77 | 0.9306 | 29.88 | 0.8461 | 28.82 | 0.8108 |
Table 3. Comparison of reconstructed image quality using different methods on the Set5 dataset.

| Methods | PSNR/dB (r = 0.01) | SSIM (r = 0.01) | PSNR/dB (r = 0.04) | SSIM (r = 0.04) | PSNR/dB (r = 0.10) | SSIM (r = 0.10) | PSNR/dB (r = 0.25) | SSIM (r = 0.25) | PSNR/dB (Average) | SSIM (Average) |
|---|---|---|---|---|---|---|---|---|---|---|
| ISTA-Net+ [13] | 18.55 | 0.4408 | 23.45 | 0.6619 | 28.61 | 0.9315 | 34.17 | 0.9272 | 26.20 | 0.7404 |
| DR2-Net [14] | 18.50 | 0.4527 | 22.74 | 0.6177 | 26.56 | 0.7571 | 31.01 | 0.8676 | 24.70 | 0.6620 |
| OPINE-Net+ [16] | 21.89 | 0.5939 | 27.95 | 0.8280 | 32.51 | 0.9150 | 36.78 | 0.9565 | 30.03 | 0.8234 |
| AMP-Net [17] | 22.42 | 0.6183 | 27.81 | 0.8172 | 32.10 | 0.9024 | 36.79 | 0.9532 | 29.78 | 0.8228 |
| TransCS [19] | 24.32 | 0.6644 | 28.14 | 0.8280 | 32.47 | 0.9142 | 37.02 | 0.9595 | 30.49 | 0.8415 |
| MR_CSGAN [18] | 24.40 | 0.6444 | 28.83 | 0.8304 | 32.81 | 0.9153 | 37.77 | 0.9651 | 30.95 | 0.8388 |
| OCTUF [21] | 23.12 | 0.6398 | 28.71 | 0.8452 | 33.18 | 0.9243 | 37.78 | 0.9634 | 30.70 | 0.8432 |
| DPC-DUN [22] | 24.33 | 0.6636 | 29.28 | 0.8397 | 33.51 | 0.9169 | 37.39 | 0.9576 | 31.13 | 0.8445 |
| D3C2-Net [24] | 24.07 | 0.6682 | 29.17 | 0.8561 | 33.44 | 0.9286 | - | - | - | - |
| NPDFD-Net (Ours) | 24.53 | 0.6686 | 29.63 | 0.8573 | 33.77 | 0.9306 | 37.74 | 0.9651 | 31.42 | 0.8554 |
Table 4. Comparison of reconstructed image quality using different methods on the Set14 dataset.

| Methods | PSNR/dB (r = 0.01) | SSIM (r = 0.01) | PSNR/dB (r = 0.04) | SSIM (r = 0.04) | PSNR/dB (r = 0.10) | SSIM (r = 0.10) | PSNR/dB (r = 0.25) | SSIM (r = 0.25) | PSNR/dB (Average) | SSIM (Average) |
|---|---|---|---|---|---|---|---|---|---|---|
| ISTA-Net+ [13] | 18.22 | 0.4014 | 22.08 | 0.5708 | 26.00 | 0.7289 | 30.62 | 0.8700 | 24.23 | 0.6428 |
| DR2-Net [14] | 18.31 | 0.4149 | 21.33 | 0.5373 | 24.44 | 0.6644 | 28.13 | 0.7997 | 23.05 | 0.6041 |
| OPINE-Net+ [16] | 21.36 | 0.5262 | 25.50 | 0.7122 | 28.77 | 0.8294 | 33.12 | 0.9196 | 27.18 | 0.7469 |
| AMP-Net [17] | 21.65 | 0.6183 | 25.49 | 0.7004 | 28.76 | 0.8182 | 33.21 | 0.9144 | 27.47 | 0.7628 |
| TransCS [19] | 23.03 | 0.5708 | 25.51 | 0.7132 | 28.81 | 0.8343 | 33.38 | 0.9244 | 27.68 | 0.7607 |
| MR_CSGAN [18] | 23.05 | 0.5613 | 26.44 | 0.7239 | 29.38 | 0.8347 | 33.80 | 0.9280 | 28.17 | 0.7620 |
| OCTUF [21] | 21.99 | 0.5481 | 26.04 | 0.7303 | 29.48 | 0.8454 | 34.18 | 0.9312 | 27.92 | 0.7638 |
| DPC-DUN [22] | 22.95 | 0.5769 | 26.64 | 0.7206 | 29.79 | 0.8269 | 33.72 | 0.9174 | 28.27 | 0.7602 |
| D3C2-Net [24] | 22.73 | 0.5823 | 26.58 | 0.7361 | 29.97 | 0.8544 | - | - | - | - |
| NPDFD-Net (Ours) | 23.01 | 0.5698 | 26.77 | 0.7388 | 29.88 | 0.8461 | 33.66 | 0.9276 | 28.33 | 0.7706 |
Table 5. Comparison of reconstructed image quality using different methods on the BSD100 dataset.

| Methods | PSNR/dB (r = 0.01) | SSIM (r = 0.01) | PSNR/dB (r = 0.04) | SSIM (r = 0.04) | PSNR/dB (r = 0.10) | SSIM (r = 0.10) | PSNR/dB (r = 0.25) | SSIM (r = 0.25) | PSNR/dB (Average) | SSIM (Average) |
|---|---|---|---|---|---|---|---|---|---|---|
| ISTA-Net+ [13] | 19.36 | 0.4074 | 22.23 | 0.5403 | 25.09 | 0.6843 | 29.04 | 0.8405 | 23.93 | 0.6181 |
| DR2-Net [14] | 19.25 | 0.4281 | 21.72 | 0.5271 | 24.04 | 0.6375 | 27.23 | 0.7774 | 23.06 | 0.5926 |
| OPINE-Net+ [16] | 21.90 | 0.5002 | 25.00 | 0.6675 | 27.55 | 0.7906 | 31.21 | 0.8984 | 26.42 | 0.7152 |
| AMP-Net [17] | 22.28 | 0.5273 | 25.12 | 0.6641 | 27.63 | 0.7836 | 31.38 | 0.9023 | 26.60 | 0.7193 |
| TransCS [19] | 23.92 | 0.5494 | 25.14 | 0.6765 | 27.63 | 0.8005 | 31.37 | 0.9099 | 27.01 | 0.7317 |
| MR_CSGAN [18] | 23.84 | 0.5403 | 26.31 | 0.6867 | 28.57 | 0.8008 | 32.41 | 0.9114 | 27.78 | 0.7348 |
| OCTUF [21] | 22.63 | 0.5247 | 25.40 | 0.6809 | 27.99 | 0.8090 | 31.73 | 0.9151 | 26.94 | 0.7324 |
| DPC-DUN [22] | 23.82 | 0.5538 | 26.41 | 0.6779 | 28.81 | 0.7895 | 32.39 | 0.8987 | 27.86 | 0.7300 |
| D3C2-Net [24] | 23.01 | 0.5415 | 25.66 | 0.6949 | 28.26 | 0.8025 | - | - | - | - |
| NPDFD-Net (Ours) | 23.87 | 0.5481 | 26.52 | 0.6986 | 28.82 | 0.8108 | 32.28 | 0.9102 | 27.87 | 0.7419 |
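Table 6. Comparison of average running time for different methods.

| Methods | Average Running Time/s | Programming Language |
|---|---|---|
| DR2-Net [14] | 0.0565 | Matlab + Caffe |
| ISTA-Net+ [13] | 0.0185 | Python 3.9 + PyTorch 1.12 |
| OPINE-Net+ [16] | 0.0156 | Python 3.9 + PyTorch 1.12 |
| AMP-Net [17] | 0.0143 | Python 3.9 + PyTorch 1.12 |
| TransCS [19] | 0.0252 | Python 3.9 + PyTorch 1.12 |
| MR_CSGAN [18] | 0.1189 | Python 3.9 + PyTorch 1.12 |
| OCTUF [21] | 0.0340 | Python 3.9 + PyTorch 1.12 |
| DPC-DUN [22] | 0.0162 | Python 3.9 + PyTorch 1.12 |
| D3C2-Net [24] | 0.0260 | Python 3.9 + PyTorch 1.12 |
| NPDFD-Net (Ours) | 0.0096 | Python 3.9 + PyTorch 1.12 |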
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Feng, M.; Han, X.; Zheng, K. Non-Local Prior Dense Feature Distillation Network for Image Compressive Sensing. Information 2024, 15, 773. https://doi.org/10.3390/info15120773

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers.