Article

Backtracking Reconstruction Network for Three-Dimensional Compressed Hyperspectral Imaging

1 Key Laboratory of Photoelectronic Imaging Technology and System of Ministry of Education of China, School of Optics and Photonics, Beijing Institute of Technology, Beijing 100081, China
2 Beijing Institute of Technology Chongqing Innovation Center, Chongqing 401120, China
* Author to whom correspondence should be addressed.
Remote Sens. 2022, 14(10), 2406; https://doi.org/10.3390/rs14102406
Submission received: 30 March 2022 / Revised: 14 May 2022 / Accepted: 15 May 2022 / Published: 17 May 2022

Abstract

Compressed sensing (CS) has been widely used in hyperspectral (HS) imaging to obtain hyperspectral data at a sub-Nyquist sampling rate, improving the efficiency of data acquisition. Yet, reconstructing the acquired HS data via iterative algorithms is time-consuming, which hinders the real-time application of compressed HS imaging. To alleviate this problem, this paper makes the first attempt to adopt convolutional neural networks (CNNs) to reconstruct three-dimensional compressed HS data by backtracking the entire imaging process, leading to a simple yet effective network, dubbed the backtracking reconstruction network (BTR-Net). Concretely, we leverage the divide-and-conquer method to divide the imaging process based on the coded aperture tunable filter (CATF) spectral imager into steps, and build a subnetwork for each step to specialize in its reverse process. Consequently, BTR-Net comprises multiple built-in subnetworks, which perform spatial initialization, spatial enhancement, spectral initialization, and spatial–spectral enhancement in an independent and sequential manner. Extensive experiments show that BTR-Net reconstructs compressed HS data quickly and accurately, outperforming leading iterative algorithms both quantitatively and visually, while offering superior resistance to noise.


1. Introduction

Hyperspectral (HS) images have been widely used in remote sensing, biomedical and environmental monitoring, and other fields [1,2,3,4,5] due to their rich spatial and spectral information. However, such plentiful information is accompanied by a rapid growth in data volume, and requires dense sampling and long imaging times for HS image acquisition. Compressed sensing (CS) theory reconstructs the original signal from sub-Nyquist-sampled measurements [6,7,8], which can effectively reduce the density and duration of data acquisition, and is thus becoming increasingly popular in HS imaging [5,9,10,11,12].
The coded aperture tunable filter (CATF) spectral imager is a compressive spectral imaging system based on liquid crystal tunable filters (LCTFs), which can effectively improve the spectral and spatial resolution over traditional LCTF-based spectral imagers without changing the structure of the LCTF or the detector [13]. The CATF spectral imager simultaneously modulates the spatial and spectral domains to obtain three-dimensionally encoded compressive spectral images. Spatial encoding is implemented by a digital micromirror device (DMD), which can load arbitrary coded patterns, making it feasible to improve system performance simply through coding optimization [14,15]. The LCTF, in turn, serves as a spectral modulator rather than, as is common, a narrowband filter. Under the CS framework, by designing the coded patterns on the DMD and precisely measuring the transmission functions of the LCTF, HS images with higher spatial resolution than the detector and more spectral bands than the LCTF channels can be reconstructed.
Like other computational imaging systems, the CATF spectral imager essentially shifts effort from data acquisition to reconstruction. Conventional solutions use iterative algorithms to reconstruct HS images, such as the popular TwIST [16], GPSR [17], and GAP-TV [18]. Yet, reconstructing a 1280 × 1280-sized HS image with 196 spectral bands takes over 5 h on a CPU. Such long reconstruction times greatly hamper the application of the CATF spectral imager to real-time imaging. Moreover, iterative algorithms are rather sensitive to variations in the sensing matrix (the product of the measurement matrix and the sparse transform basis), yet the measurement matrix in real optical systems can hardly achieve the designed effect, and signals are usually not strictly sparse on the chosen basis [19,20,21]. These issues further weaken the practicability of iterative algorithms for reconstructing real HS images.
In recent years, advanced reconstruction algorithms for HS images have also emerged [22,23,24,25]. Compared with popular iterative algorithms, they take the features of HS images into account and achieve better reconstruction results. However, these elaborate algorithms usually contain multiple hyperparameters, which reduces their transferability across different imaging models. In addition, they remain fundamentally iteration-based and thus cannot escape the limitations of iterative algorithms.
In the past decade, with the advent of deep learning, convolutional neural networks (CNNs) have achieved huge success in computer vision, pattern recognition, and related fields [26,27,28,29,30,31,32,33]. Recently, CNNs have also offered promising solutions for reconstructing measurements from compressed HS imaging, demonstrating superiority in reconstruction quality and speed over traditional iterative algorithms. HyperReconNet [34] is a pioneering spectral image reconstruction network for coded aperture snapshot spectral imaging (CASSI) systems; it optimizes the entries of the coded aperture as network parameters to obtain both optimized coded apertures and reconstructed spectral images. Other reconstruction networks [35,36,37,38] have been proposed for CASSI systems, but suffer from limited spectral resolution in the reconstructed results. DeepCubeNet [39] incorporates pseudo-inverse operators and 3D convolutions to perform spectral reconstruction for the compressive sensing miniature ultra-spectral imager (CS-MUSI), achieving reconstructed images of high spectral resolution. However, all previous methods are designed for specific imaging systems and cannot be applied to systems such as the CATF spectral imager, whose measurements are compressed in both the spectral and spatial dimensions.
To overcome the limitations of iterative algorithms, this paper makes the first attempt to design a CNN-based framework for three-dimensional compressed HS data reconstruction. Instead of directly applying a well-established CNN backbone to forcibly map compressed measurements to the original HS data, training the network as a black box with limited insight from the CS domain, we pay more attention to network design to achieve a more reasonable and interpretable solution. If we take optical imaging and reconstruction as processes of encoding and decoding, respectively, then a simple yet effective way to reconstruct is to reverse the imaging process step by step. Following this intuition, we propose a backtracking reconstruction network (BTR-Net) to reconstruct three-dimensional compressed HS data for the CATF spectral imager.
Concretely, BTR-Net performs spatial initialization, spatial enhancement, spectral initialization, and spatial–spectral enhancement in a step-wise manner through multiple built-in subnetworks. The spatial initialization subnet exploits channel–spatial relations to extend the spatial resolution of the input compressed data. The spatial enhancement subnet enriches spatial details through residual learning [40]. The spectral initialization subnet captures long-range dependencies among the sampled spectra to increase the spectral resolution. The spatial–spectral enhancement subnet lifts image quality from the spatial and spectral perspectives collaboratively to obtain the final reconstructed results.
For evaluation, we conducted both quantitative and qualitative comparisons of BTR-Net with widely used iterative algorithms, along with robustness tests at varying noise levels. The experimental results demonstrate that BTR-Net achieves higher reconstruction quality and stronger noise immunity, while running two orders of magnitude faster than iterative algorithms. In addition, we built a real optical system and verified the effectiveness of BTR-Net on real data.
The rest of this paper is organized as follows. Section 2 introduces the imaging model of the CATF spectral imager. Section 3 develops a CNN-based backtracking reconstruction framework for the CATF spectral imager. Section 4 reports the performance of the proposed framework. Section 5 discusses the parameter settings. Section 6 presents our conclusions.

2. CATF Spectral Imager

Figure 1 shows the schematic of the CATF spectral imager, with its optical structure presented in the dashed box. Light reflected from the object is modulated by the LCTF and DMD in turn, and received by the detector. Concretely, the imaging lens first focuses the reflected light on the LCTF, which modulates the object's spectral information by adjusting the amplitudes of the selected channel transmission functions. The spectrally modulated scene is then imaged on the DMD by the first relay lens to undergo spatial modulation with a random coded aperture pattern. Finally, the compressed measurements are projected onto the detector by the second relay lens.
Unlike traditional imaging systems, the detector here receives compressed measurements that must be further reconstructed by algorithms to obtain the final HS images. The original HS cube is denoted by $\mathbf{F} \in \mathbb{R}^{N_\lambda \times N_x \times N_y}$, with a spatial resolution of $N_x \times N_y$ and $N_\lambda$ spectral bands. Multiple sampling in both the spectral and spatial dimensions is required to achieve accurate HS data reconstruction. Let $L$ denote the spectral sampling number (i.e., the number of LCTF channels) and $K$ denote the spatial sampling number (i.e., the number of coded patterns on the DMD). The compressed measurements acquired on the detector can then be expressed as $\mathbf{G} \in \mathbb{R}^{L \times K \times M_x \times M_y}$, where $M_x \times M_y$ denotes the dimensions of the detector. Since the spatial resolution of the detector is usually much smaller than that of the coded aperture, the scaling factor is $R = N_x / M_x = N_y / M_y$. Note that we only consider the case where the coded aperture matches the detector, i.e., $R$ is an integer.
Mathematically, let $\mathbf{f} \in \mathbb{R}^{N_\lambda N_x N_y}$ denote the vectorized representation of the original HS cube. The process of obtaining the vectorized compressed measurements $\mathbf{g} \in \mathbb{R}^{L K M_x M_y}$ on the detector can be formulated as:
$$\mathbf{g} = \boldsymbol{\Phi} \mathbf{f}, \tag{1}$$
where $\boldsymbol{\Phi} \in \mathbb{R}^{L K M_x M_y \times N_\lambda N_x N_y}$ is the measurement matrix of the system, which can be regarded as the product of a spectral and a spatial measurement matrix determined by the transmission functions of the LCTF and the patterns of the coded aperture, respectively.
The massive amount of information contained in an HS image inevitably incurs a high computational cost for reconstruction, posing a great challenge to reconstructing whole HS images with either iterative algorithms or deep networks. Hence, we adopt a block-compressed sensing (BCS) framework [41] to alleviate the computational complexity. Assume that the original HS cube is spatially divided into several $N_\lambda \times B \times B$-sized subcubes, each of which corresponds to an $M_B \times M_B$-sized region on the detector ($M_B = B / R$). The measurement matrix of the CATF spectral imager can then be written in block-diagonal form:
$$\boldsymbol{\Phi} = \begin{bmatrix} \boldsymbol{\Phi}^B & & \\ & \ddots & \\ & & \boldsymbol{\Phi}^B \end{bmatrix}, \tag{2}$$
where $\boldsymbol{\Phi}^B$ is the measurement matrix for each subcube, which can be written in the following Kronecker product form:
$$\boldsymbol{\Phi}^B = \boldsymbol{\Phi}_\lambda \otimes \boldsymbol{\Phi}^B_{xy} \in \mathbb{R}^{L K M_B M_B \times N_\lambda B B}, \tag{3}$$
where $\boldsymbol{\Phi}_\lambda \in \mathbb{R}^{L \times N_\lambda}$ and $\boldsymbol{\Phi}^B_{xy} \in \mathbb{R}^{K M_B M_B \times B B}$ are the spectral and spatial measurement matrices for the subcube, respectively. Based on the above analysis, Equation (1) can be decomposed into a number of subproblems:
$$\mathbf{g}^B = \boldsymbol{\Phi}^B \mathbf{f}^B, \tag{4}$$
where $\mathbf{g}^B \in \mathbb{R}^{L K M_B M_B}$ are the vectorized compressed measurements for the subcube $\mathbf{f}^B \in \mathbb{R}^{N_\lambda B B}$. Iterative reconstruction algorithms rely on the sparsity of HS images and thus transform the solution of Equation (4) into an $\ell_1$-norm optimization problem:
$$\hat{\boldsymbol{\theta}}^B = \arg\min_{\boldsymbol{\theta}^B} \|\boldsymbol{\theta}^B\|_1 \quad \text{subject to} \quad \|\mathbf{g}^B - \boldsymbol{\Phi}^B \boldsymbol{\Psi}^B \boldsymbol{\theta}^B\|_2 \leq \varepsilon, \tag{5}$$
where $\boldsymbol{\Psi}^B \in \mathbb{R}^{N_\lambda B B \times N_\lambda B B}$ is the sparse basis of the sub-HS cube, $\boldsymbol{\theta}^B \in \mathbb{R}^{N_\lambda B B}$ is the sparse coefficient vector, and $\varepsilon$ is the reconstruction error bound.
In addition, Equation (4) can also be solved as a total variation (TV) minimization problem:
$$\mathbf{f}^B = \arg\min_{\mathbf{f}^B} \mathrm{TV}(\mathbf{f}^B) \quad \text{subject to} \quad \mathbf{g}^B = \boldsymbol{\Phi}^B \mathbf{f}^B, \tag{6}$$
where $\mathrm{TV}(\mathbf{f}^B)$ denotes the TV norm of $\mathbf{f}^B$.
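As a concrete illustration of Equations (3) and (4), the following NumPy sketch builds a toy subcube measurement matrix via the Kronecker product and applies it to a vectorized subcube. All dimensions are small illustrative values rather than the paper's settings, and the random matrices stand in for measured LCTF transmittance curves and DMD coded patterns.

```python
import numpy as np

# Toy dimensions for illustration (not the paper's settings):
# N_lambda bands, BxB block, L LCTF channels, K coded patterns, scaling R.
N_lambda, B, L, K, R = 16, 8, 4, 6, 2
M_B = B // R  # detector block size

# Spectral measurement matrix Phi_lambda (L x N_lambda): LCTF transmittances.
Phi_lambda = np.random.rand(L, N_lambda)

# Spatial measurement matrix Phi_xy (K*M_B*M_B x B*B): each row integrates
# the coded RxR neighborhood that maps onto one detector pixel.
Phi_xy = np.zeros((K * M_B * M_B, B * B))
for k in range(K):
    pattern = np.random.rand(B, B)  # random coded aperture, values in [0, 1]
    for i in range(M_B):
        for j in range(M_B):
            mask = np.zeros((B, B))
            mask[i*R:(i+1)*R, j*R:(j+1)*R] = pattern[i*R:(i+1)*R, j*R:(j+1)*R]
            Phi_xy[k * M_B * M_B + i * M_B + j] = mask.ravel()

# Equation (3): the subcube measurement matrix is a Kronecker product.
Phi_B = np.kron(Phi_lambda, Phi_xy)  # shape (L*K*M_B*M_B, N_lambda*B*B)

# Equation (4): compress a vectorized subcube f_B into measurements g_B.
f_B = np.random.rand(N_lambda * B * B)
g_B = Phi_B @ f_B
print(Phi_B.shape, g_B.shape)  # (384, 1024) (384,)
```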

3. Methodology

In this section, we first briefly describe the design inspiration and representation for network-based reconstruction, and then elaborate on the design of the proposed BTR-Net. The BCS framework is also used in the design of BTR-Net; the superscript $B$ is omitted for notational simplicity.

3.1. Design Inspiration and Representation

The intention of the proposed reconstruction network is to reverse the imaging process of spectral imager step by step. Figure 2 illustrates the overall workflow of BTR-Net with the example of acquiring and reconstructing compressed one-pixel measurements on the detector.
For the acquisition process, the input HS subcube $\mathbf{F} \in \mathbb{R}^{N_\lambda \times B \times B}$, with a spatial resolution of $B \times B$ and $N_\lambda$ spectral bands, is first spectrally modulated by the LCTF with $L$ channels to produce a multispectral (MS) image $\mathbf{M} \in \mathbb{R}^{L \times B \times B}$. The MS image is further spatially encoded by the DMD with $K$ different coding patterns to obtain shrunken measurements on an $M_B \times M_B$-sized detector, resulting in the modulated output $\mathbf{G} \in \mathbb{R}^{(L \times K) \times M_B \times M_B}$.
Conversely, the reconstruction process aims to learn a reverse mapping function $\Theta: \mathbf{G} \rightarrow \mathbf{M} \rightarrow \mathbf{F}$. It first maps the compressed measurement $\mathbf{G}$ to an MS image $\mathbf{M}$ by spanning the spatial resolution (spatial initialization) and enriching fine-grained details (spatial enhancement); it then extends the spectral resolution (spectral initialization) and jointly promotes the image quality spatially and spectrally (spatial–spectral enhancement), leading to the final reconstructed result $\mathbf{F}$.

3.2. BTR-Net Architecture

Figure 3 shows the network architecture of the proposed BTR-Net. It is composed of four subnetworks: a spatial initialization, a spatial enhancement, a spectral initialization, and a spatial–spectral enhancement subnet, mapping the compressed measurement to the original HS data step by step in an interpretable and unified manner. In the following, we use $[\text{batch size}, \text{channel}, \text{height}, \text{width}]$ to denote data sizes in BTR-Net.
Spatial Initialization Subnet: This subnet aims to acquire the spatial initialization $\hat{\mathbf{M}}$ from the compressed measurement $\mathbf{G}$ by spanning the spatial resolution. Its function can be formulated as:
$$\hat{\mathbf{M}} = PS\left(\gamma\left(W_2\,\gamma\left(W_1\,R_1(\mathbf{G})\right)\right)\right), \tag{7}$$
where $R_1$ represents a reshape operation, $W_1$ and $W_2$ are the trainable weights of the first and second convolutional layers, respectively, and each convolutional layer is followed by a ReLU activation $\gamma$. $PS$ is a periodic shuffling operator called a sub-pixel convolution layer, first introduced in [30]. More specifically, $R_1$ merges the spectral measurements of the input data $\mathbf{G}$ into the batch-size dimension (i.e., $[BS, L \times K, M_B, M_B] \rightarrow [BS \times L, K, M_B, M_B]$), so that the following operations focus on spatial information. The subsequent convolutional layers extract features at low spatial resolution and ensure that the number of feature maps fed to $PS$ is $R^2$ (where $R = B / M_B$). Finally, $PS$ rearranges the $R^2$ features of resolution $M_B \times M_B$ into a single $B \times B$ image (i.e., $[BS \times L, R^2, M_B, M_B] \rightarrow [BS \times L, 1, B, B]$).
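A minimal PyTorch sketch of this subnet under Equation (7) is given below. The 3 × 3 kernels and the 64 hidden feature maps are illustrative assumptions, as the paper does not list layer widths; the reflect padding follows the padding choice described at the end of this section.

```python
import torch
import torch.nn as nn

class SpatialInitSubnet(nn.Module):
    """Sketch of the spatial initialization subnet, Equation (7)."""
    def __init__(self, K: int, R: int, feats: int = 64):
        super().__init__()
        self.K = K
        self.conv1 = nn.Conv2d(K, feats, 3, padding=1, padding_mode='reflect')
        self.conv2 = nn.Conv2d(feats, R * R, 3, padding=1, padding_mode='reflect')
        self.relu = nn.ReLU(inplace=True)
        self.ps = nn.PixelShuffle(R)  # rearranges R^2 maps into one BxB image

    def forward(self, g):
        # R_1: fold the L spectral channels of [BS, L*K, M_B, M_B] into batch.
        bs, lk, mb, _ = g.shape
        x = g.reshape(bs * (lk // self.K), self.K, mb, mb)
        x = self.relu(self.conv1(x))   # features at low spatial resolution
        x = self.relu(self.conv2(x))   # R^2 feature maps for periodic shuffling
        return self.ps(x)              # PS: [BS*L, R^2, M_B, M_B] -> [BS*L, 1, B, B]
```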
Spatial Enhancement Subnet: This subnet is designed to obtain the predicted MS image $\mathbf{M}$ from the spatial initialization $\hat{\mathbf{M}}$ via feature refinement. Mathematically, it learns the following function:
$$\mathbf{M} = R_2\left(H\left(\hat{\mathbf{M}}, \{F_n\}_{n=1}^{N_r}\right)\right), \tag{8}$$
where $R_2$ performs the reverse operation of $R_1$ (i.e., $[BS \times L, 1, B, B] \rightarrow [BS, L, B, B]$), and $H$ takes the spatial initialization $\hat{\mathbf{M}}$ as input and contains $N_r$ residual learning blocks (Resblocks) to be learned. The $n$th Resblock is defined as:
$$\mathbf{M}_n = \begin{cases} F_n\left(\hat{\mathbf{M}}, W_i^n\right) + \hat{\mathbf{M}}, & n = 1, \\ F_n\left(\mathbf{M}_{n-1}, W_i^n\right) + \mathbf{M}_{n-1}, & 1 < n \leq N_r, \end{cases} \tag{9}$$
where $W_i^n$ denotes the trainable weights of the $n$th residual mapping $F_n$, with $i$ indexing its convolutional layers. Each Resblock takes the output of the previous Resblock as input and adds the learned residual mapping to that input to form its own output. The structure of the Resblock follows the setting in [42] and contains three convolutional layers. The residual mapping is formulated as:
$$F_n\left(x, W_i^n\right) = W_3^n\,\gamma\left(W_2^n\,\gamma\left(W_1^n\,x\right)\right), \tag{10}$$
where $x$ represents the input and $W_i^n$ is the weight of the $i$th convolutional layer of the residual mapping. The first two convolutional layers are followed by the ReLU activation $\gamma$.
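Under Equations (9) and (10), one Resblock may be sketched as follows. The 9-1-5 kernel sizes match the setting adopted in Section 5.3, the reflect padding follows the note at the end of this section, and the hidden channel width is an illustrative assumption.

```python
class Resblock(nn.Module):
    """Three-layer residual mapping, Equations (9) and (10)."""
    def __init__(self, feats: int = 64):
        super().__init__()
        # Kernel sizes 9-1-5 follow the choice made in Section 5.3;
        # the hidden width `feats` is an illustrative assumption.
        self.body = nn.Sequential(
            nn.Conv2d(1, feats, 9, padding=4, padding_mode='reflect'),
            nn.ReLU(inplace=True),
            nn.Conv2d(feats, feats, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(feats, 1, 5, padding=2, padding_mode='reflect'),
        )

    def forward(self, x):
        # Add the learned residual to the block input, Equation (9).
        return self.body(x) + x
```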
Spectral Initialization Subnet: This subnet is designed to obtain the initialization of the HS image, $\hat{\mathbf{F}}$, by extending the spectral resolution of the predicted MS image $\mathbf{M}$, which can be formulated as:
$$\hat{\mathbf{F}} = \gamma\left(W_3\,\mathbf{M}\right), \tag{11}$$
where $W_3$ represents the trainable weights. A convolutional layer generates $N_\lambda$ feature maps to preliminarily reconstruct the spectra (i.e., $[BS, L, B, B] \rightarrow [BS, N_\lambda, B, B]$). As in the previous designs, a ReLU activation $\gamma$ is added.
Spatial–Spectral Enhancement Subnet: This subnet jointly promotes the image quality of the initialized HS image $\hat{\mathbf{F}}$ spatially and spectrally, yielding the final reconstructed HS image $\mathbf{F}$. Its function can be expressed as:
$$\mathbf{F} = \varsigma\left(W_4\,\hat{\mathbf{F}}\right), \tag{12}$$
where $W_4$ represents the trainable weights. A Sigmoid activation $\varsigma$ limits the output to between 0 and 1.
It is worth noting that feeding the divided image blocks into the network is likely to cause distinct block artifacts in the reassembled reconstruction when zero-padding is used. Reflect-padding replaces the zeros with pixel values from the feature map, so that convolution results at the edges are not pulled down. We therefore use reflect-padding in every convolutional layer, which effectively mitigates the artifacts caused by block-wise processing [43].
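Assembling the four subnets, a compact sketch of the end-to-end forward pass might look as follows. It reuses the SpatialInitSubnet and Resblock sketches above; the 3 × 3 kernels in the last two stages are illustrative assumptions, since the paper specifies only single convolutional layers with ReLU and Sigmoid activations there.

```python
class BTRNet(nn.Module):
    """Sketch of BTR-Net: spatial init -> spatial enhance -> spectral init
    -> spatial-spectral enhance, mirroring Figure 3."""
    def __init__(self, L=22, K=25, R=8, N_lambda=196, N_r=2):
        super().__init__()
        self.L = L
        self.spatial_init = SpatialInitSubnet(K, R)
        self.spatial_enh = nn.Sequential(*[Resblock() for _ in range(N_r)])
        self.spectral_init = nn.Sequential(  # Equation (11)
            nn.Conv2d(L, N_lambda, 3, padding=1, padding_mode='reflect'),
            nn.ReLU(inplace=True),
        )
        self.ss_enh = nn.Sequential(         # Equation (12)
            nn.Conv2d(N_lambda, N_lambda, 3, padding=1, padding_mode='reflect'),
            nn.Sigmoid(),
        )

    def forward(self, g):                    # g: [BS, L*K, M_B, M_B]
        bs = g.shape[0]
        m_hat = self.spatial_enh(self.spatial_init(g))    # [BS*L, 1, B, B]
        m = m_hat.reshape(bs, self.L, *m_hat.shape[-2:])  # R_2 reshape
        return self.ss_enh(self.spectral_init(m))         # [BS, N_lambda, B, B]
```

With the paper's settings, feeding a random tensor of size [2, 22 × 25, 8, 8] through BTRNet() yields a [2, 196, 64, 64] cube, matching the block dimensions described in Section 4.1.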

3.3. Loss Function

We optimize the network parameters by minimizing the pixel-wise mean square error (MSE), i.e.,
$$L(\Lambda) = \min_{\Lambda} \frac{1}{N_\lambda B^2} \sum_{i=1}^{N_\lambda} \sum_{j=1}^{B} \sum_{k=1}^{B} \left( \mathbf{F}_{i,j,k} - \bar{\mathbf{F}}_{i,j,k} \right)^2, \tag{13}$$
where $\Lambda$ denotes the trainable parameters of BTR-Net, and $\mathbf{F}$ and $\bar{\mathbf{F}}$ represent the HS image predicted by BTR-Net and the original HS image, respectively.

4. Results

We trained the networks from scratch for 30 epochs on a single NVIDIA GeForce GTX 1080, with batch size $BS = 10$ and an initial learning rate of $10^{-3}$. We gradually decreased the learning rate by an order of magnitude after every 10 epochs, and used two Resblocks in BTR-Net throughout the experiments.
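This schedule can be reproduced with a few lines of PyTorch; the optimizer below is an assumption (the paper does not state which one was used), and `train_loader` is a placeholder for a loader over the blocked image pairs described in Section 4.1.

```python
import torch

model = BTRNet().cuda()
criterion = torch.nn.MSELoss()  # pixel-wise MSE of Equation (13)
# Optimizer assumed: the paper does not specify which one was used.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
# Drop the learning rate by an order of magnitude every 10 epochs.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)

for epoch in range(30):
    for g, f_gt in train_loader:  # placeholder loader of blocked image pairs
        optimizer.zero_grad()
        loss = criterion(model(g.cuda()), f_gt.cuda())
        loss.backward()
        optimizer.step()
    scheduler.step()
```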

4.1. Dataset and Evaluation Metrics

We carried out experiments on a public HS dataset, the BGU iCVL Hyperspectral Image Dataset [44]. This dataset consists of HS images with 519 spectral bands ranging from 400 to 1000 nm, with a spectral interval of about 1.25 nm. We used only $N_\lambda = 196$ bands (488 to 730 nm) to make the spectral range of the input HS images consistent with that of the LCTF. We randomly selected 32 HS images for training and 8 for testing, and normalized the pixel values of all images to $(0, 1)$. Through the blocking operation, more than 12,000 image pairs were used for network training. When generating input data, the following key points should be noted: (i) the original HS images were divided into $196 \times 64 \times 64$-sized image blocks, from which input cubes of size $(22 \times 25) \times 8 \times 8$ were produced by spectral filtering and spatial coding; (ii) $L = 22$ real measured LCTF transmittance curves were utilized to simulate spectral filtering; and (iii) instead of using the same coded patterns for every image block, $K = 25$ random coded patterns (random matrices with values between 0 and 1) were generated for each image block to simulate spatial coding.
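Under these settings, the input-generation pipeline can be sketched as follows; `lctf_curves` is a placeholder for the measured transmittance curves (not available here), and summing each $R \times R$ coded neighborhood onto one detector pixel is our reading of the scaling factor $R = 8$.

```python
import numpy as np

N_lambda, B, L, K, R = 196, 64, 22, 25, 8
M_B = B // R  # 8x8 detector region per block

def make_pair(hs_block, lctf_curves):
    """hs_block: (196, 64, 64) normalized HS block;
    lctf_curves: (22, 196) measured transmittances (a placeholder here)."""
    # (ii) spectral filtering: project 196 bands onto 22 LCTF channels.
    ms = np.tensordot(lctf_curves, hs_block, axes=([1], [0]))  # (22, 64, 64)
    # (iii) spatial coding: a fresh set of K random patterns per block,
    # then integrate each RxR neighborhood onto one detector pixel.
    patterns = np.random.rand(K, B, B)
    coded = ms[:, None] * patterns[None]                       # (22, 25, 64, 64)
    g = coded.reshape(L, K, M_B, R, M_B, R).sum(axis=(3, 5))   # (22, 25, 8, 8)
    return g.reshape(L * K, M_B, M_B), hs_block  # network input and target
```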
For a comprehensive evaluation of the reconstructed results, we adopted the mean peak signal to noise ratio (MPSNR), mean structural similarity index measure (MSSIM), mean relative absolute error (MRAE), and mean spectral angle mapper (MSAM) as evaluation metrics. The lower the MRAE and MSAM, and the higher the MPSNR and MSSIM, the better the reconstructed images.
The MPSNR, which measures the difference between two images, is defined as:
$$\mathrm{MPSNR} = \frac{1}{N_\lambda} \sum_{i=1}^{N_\lambda} 10 \times \log_{10}\left(\frac{1}{\mathrm{MSE}_i}\right), \tag{14}$$
where $N_\lambda$ denotes the number of spectral bands, and $\mathrm{MSE}_i$ is the MSE between the reconstructed and original HS images at the $i$th spectral band.
The MSSIM, which evaluates the structural similarity between the reconstructed and original images, is defined as:
$$\mathrm{MSSIM} = \frac{1}{N_\lambda} \sum_{i=1}^{N_\lambda} \frac{\left(2\mu_{F_i}\mu_{\bar{F}_i} + c_1\right)\left(2\sigma_{F_i\bar{F}_i} + c_2\right)}{\left(\mu_{F_i}^2 + \mu_{\bar{F}_i}^2 + c_1\right)\left(\sigma_{F_i}^2 + \sigma_{\bar{F}_i}^2 + c_2\right)}, \tag{15}$$
where $F_i$ (with mean $\mu_{F_i}$ and variance $\sigma_{F_i}^2$) and $\bar{F}_i$ (with mean $\mu_{\bar{F}_i}$ and variance $\sigma_{\bar{F}_i}^2$) denote the reconstructed and original HS images at the $i$th spectral band, respectively, $\sigma_{F_i\bar{F}_i}$ is the covariance of $F_i$ and $\bar{F}_i$, and $c_1$ and $c_2$ are two stabilizing constants.
The MRAE, which describes the proportion of the reconstruction error of each pixel relative to the original value, is defined as:
$$\mathrm{MRAE} = \frac{1}{N_\lambda N_x N_y} \sum_{i=1}^{N_\lambda} \sum_{j=1}^{N_x} \sum_{k=1}^{N_y} \frac{\left|F_{i,j,k} - \bar{F}_{i,j,k}\right|}{\bar{F}_{i,j,k}}, \tag{16}$$
where $F_{i,j,k}$ and $\bar{F}_{i,j,k}$ denote the points at the $i$th spectral band with spatial coordinates $(j, k)$ on the reconstructed and original HS images, respectively, and $N_x$ and $N_y$ denote the spatial resolution of the HS image.
The MSAM, which calculates the average angle between the spectra of the reconstructed and original images across all spatial positions, is defined as:
$$\mathrm{MSAM} = \frac{1}{N_x N_y} \sum_{j=1}^{N_x} \sum_{k=1}^{N_y} \cos^{-1}\left(\frac{\sum_{i=1}^{N_\lambda} F_{i,j,k}\,\bar{F}_{i,j,k}}{\sqrt{\sum_{i=1}^{N_\lambda} F_{i,j,k}^2}\,\sqrt{\sum_{i=1}^{N_\lambda} \bar{F}_{i,j,k}^2}}\right). \tag{17}$$
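For reference, a compact NumPy sketch of two of these metrics (MPSNR, Equation (14), and MSAM, Equation (17)) is given below, assuming both cubes have shape (N_lambda, N_x, N_y) and values normalized to [0, 1].

```python
import numpy as np

def mpsnr(rec, gt):
    """Mean PSNR over spectral bands, Equation (14); inputs in [0, 1]."""
    mse = ((rec - gt) ** 2).mean(axis=(1, 2))  # per-band MSE
    return float((10 * np.log10(1.0 / mse)).mean())

def msam(rec, gt, eps=1e-12):
    """Mean spectral angle over all spatial positions, Equation (17)."""
    dot = (rec * gt).sum(axis=0)
    norms = np.linalg.norm(rec, axis=0) * np.linalg.norm(gt, axis=0)
    return float(np.arccos(np.clip(dot / (norms + eps), -1.0, 1.0)).mean())
```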

4.2. Comparison with Iterative Algorithms

We compare the proposed BTR-Net with popular iterative algorithms, including TwIST [16], GPSR [17], and GAP-TV [18]. TwIST and GPSR seek the sparse solution of the HS data, as in Equation (5), with the sparse basis defined as the DCT basis; GAP-TV is a TV-based algorithm, as in Equation (6).
Table 1 provides quantitative comparisons of the reconstructed results from our BTR-Net and the iterative algorithms. One can see that BTR-Net outperforms the three iterative algorithms in terms of all four metrics on each testing image (without noise). For instance, BTR-Net gains roughly 4 dB or more over the iterative algorithms in average MPSNR.
Table 2 shows the running time required for each method to reconstruct an HS image with a spatial resolution of 1280 × 1280 and 196 spectral bands. BTR-Net demonstrates a significant decrease in running time compared with the iterative algorithms; specifically, it runs two orders of magnitude faster than the iterative algorithms on a CPU. Moreover, BTR-Net supports GPU acceleration, which makes it run about 38 times faster than on a CPU.
Figure 4 shows qualitative comparisons of the reconstructed results in RGB projection. The red, green, and blue channels of the RGB image are taken from three spectral bands of the HS image at 660 nm, 550 nm, and 500 nm, respectively. For a clear comparison of the reconstructed details, the image region in the red square is enlarged at the lower left corner of each RGB image. The RGB images indicate that BTR-Net is superior to the conventional iterative algorithms in color reproduction and detail recovery.
Figure 5 compares the spectral curves reconstructed by the four methods. The second and third columns draw the spectra of two representative pixels whose positions are marked on the RGB image in the first column, where the x-axis and y-axis represent wavelength and normalized intensity, respectively. The SAM is labeled in the legend to evaluate the quality of the reconstructed spectra. The spectra suggest that BTR-Net performs well in spectrum reconstruction, while the conventional iterative algorithms perform poorly at the edges of the spectrum.
To further demonstrate the wavelength-dependent performance variation, we quantitatively compare the results at different wavelengths (taking Scene 1 as an example), as shown in Figure 6. The proposed BTR-Net exhibits excellent global performance and stable reconstruction across wavelengths. By comparison, the reconstructed results of the iterative algorithms change significantly with wavelength, probably because the spectra are not strictly sparse on the given sparse basis.
We studied the noise immunity of BTR-Net by adding white Gaussian noise to the compressive measurements (i.e., the input data of BTR-Net). The network was trained in the absence of noise and tested in the presence of noise. The three iterative algorithms were also tested with noise added to the compressive measurements. Table 3 compares the experimental results at noise levels of 40 dB, 30 dB and 20 dB. Taking Scene 1 as an example, Figure 7 compares the visual quality of the four methods at different noise levels, and Figure 8 shows the performance of the reconstructed spectra. The proposed BTR-Net surpasses the iterative algorithms in both reconstruction performance and noise immunity. Concretely, the results of BTR-Net are almost impervious to 40 dB and 30 dB noise; 20 dB noise degrades the performance of BTR-Net, but the results remain acceptable. By contrast, the performance of TwIST declines noticeably with 30 dB noise, while the performance of GPSR and GAP-TV degrades but remains acceptable. At a noise level of 20 dB, TwIST and GPSR can hardly reconstruct the data at all; the noise immunity of GAP-TV is better than that of TwIST and GPSR, but its reconstruction is still unsatisfactory.

4.3. Real Experiments

We constructed an experimental prototype of the CATF spectral imager, as shown in Figure 9. The testbed consisted of a fiber ring illuminator (Thorlabs FRI61F50 and OSL2), an imaging lens with a focal length of 50 mm, a visible LCTF with a range of 500–710 nm, two relay lenses with a focal length of 75 mm, a DMD (Texas Instruments DLP9500), and a monochromatic CMOS camera (Basler acA2040-90um).
We took a set of compressive measurements with the prototype as the input to the trained BTR-Net. Consistent with the parameter settings in the simulations, $L = 22$ LCTF channels and $K = 25$ random coded patterns were used in the real experiments, and the scaling factor was set to $R = 8$. The object occupied $384 \times 384$ pixels on the DMD, which corresponded to $48 \times 48$ pixels on the detector. The measurements were spatially divided into $8 \times 8$ image blocks to meet the input-size requirements of BTR-Net.
In the real experiments, we used the CATF spectral imager to obtain the measurements but could not obtain the ground truth of the corresponding HS data, which made it impossible to train BTR-Net on a dataset acquired in the real experiments. We therefore applied the BTR-Net trained on simulated data to the real experiment. This is challenging, since BTR-Net was not trained with real data and the measurement matrix realized in the real experiments is degraded compared with the matrix used to generate the training data.
To compare the results qualitatively, Figure 10 shows the RGB projections of the results reconstructed from the real experimental data by the three iterative algorithms and the proposed BTR-Net. The colors of the RGB images are similar, which reflects that the spectral reconstruction capabilities of the four methods are comparable. Perceptually, BTR-Net is superior to the iterative algorithms in detail recovery.
Figure 11 draws the reflection spectra of two representative points whose positions are marked on the RGB image, where the x-axis and y-axis represent wavelength and reflectivity, respectively. The PSNR is labeled in the legend to evaluate the quality of the reconstructed spectra. The original reflection spectra were measured by a grating spectrometer (Ocean Optics Maya2000pro). The results show that the reconstructed spectra of BTR-Net are in the best agreement with the original spectra at P1, while the four methods show comparable spectral reconstruction performance at the background point P2.
In the real experiments, iterative algorithms tend to obtain a solution with high sparsity in order to combat noise (that is, the number of non-zero coefficients of $\boldsymbol{\theta}^B$ in Equation (5) is extremely small). The reconstructed results of the iterative algorithms therefore lose a large proportion of high-frequency components, resulting in a lack of spatial details and relatively smooth spectra. Our BTR-Net learned the characteristics of the HS dataset during training; thus, the details in its RGB projections are richer and its spectra contain more fluctuations. The results demonstrate the feasibility of the proposed BTR-Net, although they are not as good as in the simulations.

5. Discussion

In this section, we further discuss our design choices and parameter settings in the proposed BTR-Net.

5.1. Effect of Up-Sampling Methods

We adopt sub-pixel convolution to up-sample the low-resolution measurements in the spatial initialization subnet. To validate this design, we compared it with a model variant in which sub-pixel convolution is replaced by an alternative up-sampling approach, i.e., bilinear interpolation. Figure 12 and Table 4 depict the training error curves and quantitative comparisons of the reconstructed results, respectively. The model converges quickly with either up-sampling approach, demonstrating the universality and rationality of our network design. Furthermore, the model with sub-pixel convolution achieves a lower training error and better quantitative results, showing its superiority in enlarging spatial resolution while retaining beneficial local structures.

5.2. Effect of Resblock Number

The Resblock infers the residual between the spatially initialized image and the MS image, thereby ameliorating image quality. We tested the effect of incorporating different numbers of Resblocks in the spatial enhancement subnet. Figure 13 presents the trend of reconstruction quality and running speed with an increasing number of Resblocks. Generally, incorporating more Resblocks brings higher reconstruction quality at the cost of lower running speed. We observe that reconstruction quality improves as the block number $N_r$ increases; however, the curves are almost flat when $N_r \geq 4$. Considering the tradeoff between network complexity and reconstruction performance, we set the default block number to $N_r = 2$ for BTR-Net. It is worth noting that BTR-Net outperformed the iterative algorithms even without any Resblocks for feature refinement, evidencing the effectiveness of the backtracking reconstruction strategy itself.

5.3. Effect of Kernel Size in Resblock

Each Resblock in the spatial enhancement subnet contains three convolutional layers with different kernel sizes $K_s$ to extract multiscale features. The Resblock in DR2-Net [42], from which ours is adapted, applies a combination of relatively large kernel sizes, i.e., $K_s$ = 11-1-7, resulting in added computation. To balance complexity against reconstruction performance, we tested the model with shrunken convolutional kernels in the Resblock. Table 5 lists quantitative comparisons of the reconstructed results for $K_s$ = 11-1-7, 9-1-5 and 7-1-3, together with the time complexity of one Resblock under each setting. Shrinking the kernels degrades the reconstruction performance while reducing time and complexity. Weighing the gains and losses, we set $K_s$ = 9-1-5 for our Resblock.

5.4. Cross-Validation

We performed K-fold cross-validation experiments to verify the performance of the proposed BTR-Net. The original HS images were randomly divided into 5 groups, each containing 8 HS images. In each experiment, we selected one group as the testing set and the other groups as the training set. Through the blocking operation, more than 12,000 image pairs were used for training in each experiment. The cross-validation process was repeated 5 times, and the results are shown in Table 6. They show that the proposed BTR-Net is reliable and stable.

6. Conclusions

Motivated by the shortcomings of traditional iteration-based algorithms, we proposed a backtracking reconstruction network, BTR-Net, to solve the reconstruction problem in three-dimensional compressed HS imaging. We decomposed the imaging process of the CATF spectral imager into steps and designed a series of subnetworks to reverse these steps. Specifically, we built four subnetworks in sequence (the spatial initialization, spatial enhancement, spectral initialization, and spatial–spectral enhancement subnets) to obtain a reverse mapping from compressed measurements to HS data. Experimental results show that the proposed BTR-Net outperforms traditional iteration-based algorithms in reconstruction performance and running speed, while exhibiting strong noise resistance.
The BTR-Net shows obvious advantages over iterative algorithms, yet there are several aspects requiring further study:
1. Multiple reshaping operations are adopted in BTR-Net, which means that the size of the input data is strictly fixed once the network is trained. Follow-up work will be directed towards the design of a fully convolutional network capable of accepting input data of variable dimensions.
2. BTR-Net takes compressed measurements as inputs to reconstruct HS data, which ignores characteristics of the imaging system itself, such as the transmittance curves of the LCTF and the coded patterns of the DMD. Further effort is needed to design a network that rationally utilizes this prior information to make the network more interpretable.

Author Contributions

Conceptualization, T.X.; methodology, X.W. and J.L.; software, X.W.; validation, C.X.; formal analysis, A.F.; investigation, Y.Z. and C.X.; resources, T.X.; data curation, A.F.; writing—original draft preparation, X.W.; writing—review and editing, Y.Z. and J.L.; visualization, X.W.; supervision, T.X.; project administration, T.X.; funding acquisition, T.X. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded in part by the National Natural Science Foundation of China under grant number 61527802, and in part by the National Key Laboratory Foundation of China under grant numbers TCGZ2020C004 and 202020429036.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Acknowledgments

We would like to thank anonymous reviewers whose comments resulted in a noticeable improvement of our paper.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
HS        hyperspectral
CS        compressed sensing
BTR-Net   backtracking reconstruction network
CATF      coded aperture tunable filter
LCTF      liquid crystal tunable filter
DMD       digital micromirror device
CNN       convolutional neural network
CASSI     coded aperture snapshot spectral imaging
BCS       block compressed sensing
TV        total variation
MS        multispectral
Resblock  residual learning block
MPSNR     mean peak signal to noise ratio
MSSIM     mean structural similarity index measure
MRAE      mean relative absolute error
MSAM      mean spectral angle mapper

References

1. Zhang, B.; Wu, D.; Zhang, L.; Jiao, Q.; Li, Q. Application of hyperspectral remote sensing for environment monitoring in mining areas. Environ. Earth Sci. 2012, 65, 649–658.
2. Roberts, D.A.; Quattrochi, D.A.; Hulley, G.C.; Hook, S.J.; Green, R.O. Synergies between VSWIR and TIR data for the urban environment: An evaluation of the potential for the Hyperspectral Infrared Imager (HyspIRI) Decadal Survey mission. Remote Sens. Environ. 2012, 117, 83–101.
3. Li, W.; Du, Q. A survey on representation-based classification and detection in hyperspectral remote sensing imagery. Pattern Recognit. Lett. 2016, 83, 115–123.
4. Jay, S.; Guillaume, M.; Minghelli, A.; Deville, Y.; Chami, M.; Lafrance, B.; Serfaty, V. Hyperspectral remote sensing of shallow waters: Considering environmental noise and bottom intra-class variability for modeling and inversion of water reflectance. Remote Sens. Environ. 2017, 200, 352–367.
5. Yu, C.; Yang, J.; Song, N.; Sun, C.; Wang, M.; Feng, S. Microlens array snapshot hyperspectral microscopy system for the biomedical domain. Appl. Opt. 2021, 60, 1896–1902.
6. Donoho, D. Compressed sensing. IEEE Trans. Inf. Theory 2006, 52, 1289–1306.
7. Neifeld, M.A.; Ke, J. Optical architectures for compressive imaging. Appl. Opt. 2007, 46, 5293–5303.
8. Duarte, M.F.; Davenport, M.A.; Takhar, D.; Laska, J.N.; Sun, T.; Kelly, K.F.; Baraniuk, R.G. Single-pixel imaging via compressive sampling. IEEE Signal Process. Mag. 2008, 25, 83–91.
9. Wagadarikar, A.; John, R.; Willett, R.; Brady, D. Single disperser design for coded aperture snapshot spectral imaging. Appl. Opt. 2008, 47, B44.
10. Lin, X.; Liu, Y.; Wu, J.; Dai, Q. Spatial-spectral encoded compressive hyperspectral imaging. ACM Trans. Graph. 2014, 33, 1–11.
11. Wang, L.; Xiong, Z.; Shi, G.; Wu, F.; Zeng, W. Adaptive Nonlocal Sparse Representation for Dual-Camera Compressive Hyperspectral Imaging. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 2104–2111.
12. Ren, W.; Fu, C.; Wu, D.; Xie, Y.; Arce, G.R. Channeled compressive imaging spectropolarimeter. Opt. Express 2019, 27, 2197–2211.
13. Wang, X.; Zhang, Y.; Ma, X.; Xu, T.; Arce, G.R. Compressive spectral imaging system based on liquid crystal tunable filter. Opt. Express 2018, 26, 25226–25243.
14. Zhang, Y.; Xu, T.; Wang, X.; Pan, C.; Hao, J.; Huang, C. Real-time adaptive coded aperture: Application to the compressive spectral imaging system. In Proceedings of the Optics, Photonics and Digital Technologies for Imaging Applications VI, Online, 6–10 April 2020; Schelkens, P., Kozacki, T., Eds.; SPIE: Bellingham, WA, USA, 2020; Volume 11353, pp. 280–289.
15. Xu, C.; Xu, T.; Yan, G.; Ma, X.; Zhang, Y.; Wang, X.; Zhao, F.; Arce, G.R. Super-resolution compressive spectral imaging via two-tone adaptive coding. Photon. Res. 2020, 8, 395–411.
16. Bioucas-Dias, J.M.; Figueiredo, M.A.T. A New TwIST: Two-Step Iterative Shrinkage/Thresholding Algorithms for Image Restoration. IEEE Trans. Image Process. 2007, 16, 2992–3004.
17. Figueiredo, M.A.T.; Nowak, R.D.; Wright, S.J. Gradient Projection for Sparse Reconstruction: Application to Compressed Sensing and Other Inverse Problems. IEEE J. Sel. Top. Signal Process. 2007, 1, 586–597.
18. Yuan, X. Generalized alternating projection based total variation minimization for compressive sensing. In Proceedings of the 2016 IEEE International Conference on Image Processing (ICIP), Phoenix, AZ, USA, 25–28 September 2016; pp. 2539–2543.
19. Wagadarikar, A.A.; Pitsianis, N.P.; Sun, X.; Brady, D.J. Spectral image estimation for coded aperture snapshot spectral imagers. In Proceedings of the Image Reconstruction from Incomplete Data V, San Francisco, CA, USA, 28 August 2008; SPIE: Bellingham, WA, USA, 2008; Volume 7076, pp. 9–23.
20. Mousavi, A.; Baraniuk, R.G. Learning to invert: Signal recovery via Deep Convolutional Networks. In Proceedings of the 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), New Orleans, LA, USA, 5–9 March 2017; pp. 2272–2276.
21. Metzler, C.A.; Maleki, A.; Baraniuk, R.G. From Denoising to Compressed Sensing. IEEE Trans. Inf. Theory 2016, 62, 5117–5144.
22. Wang, L.; Feng, Y.; Gao, Y.; Wang, Z.; He, M. Compressed Sensing Reconstruction of Hyperspectral Images Based on Spectral Unmixing. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2018, 11, 1266–1284.
23. Xue, J.; Zhao, Y.; Liao, W.; Chan, J.C.W. Nonlocal Tensor Sparse Representation and Low-Rank Regularization for Hyperspectral Image Compressive Sensing Reconstruction. Remote Sens. 2019, 11, 193.
24. Chen, Y.; Huang, T.; He, W.; Yokoya, N.; Zhao, X. Hyperspectral Image Compressive Sensing Reconstruction Using Subspace-Based Nonlocal Tensor Ring Decomposition. IEEE Trans. Image Process. 2020, 29, 6813–6828.
25. Takeyama, S.; Ono, S.; Kumazawa, I. A Constrained Convex Optimization Approach to Hyperspectral Image Restoration with Hybrid Spatio-Spectral Regularization. Remote Sens. 2020, 12, 3541.
26. Schmidhuber, J. Deep learning in neural networks: An overview. Neural Netw. 2015, 61, 85–117.
27. Rastegari, M.; Ordonez, V.; Redmon, J.; Farhadi, A. XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks. In Proceedings of the Computer Vision, ECCV 2016, 14th European Conference, Amsterdam, The Netherlands, 11–14 October 2016; Leibe, B., Matas, J., Sebe, N., Welling, M., Eds.; Springer International Publishing: Cham, Switzerland, 2016; pp. 525–542.
28. Moeskops, P.; Viergever, M.A.; Mendrik, A.M.; de Vries, L.S.; Benders, M.J.N.L.; Išgum, I. Automatic Segmentation of MR Brain Images With a Convolutional Neural Network. IEEE Trans. Med. Imaging 2016, 35, 1252–1261.
29. Liskowski, P.; Krawiec, K. Segmenting Retinal Blood Vessels With Deep Neural Networks. IEEE Trans. Med. Imaging 2016, 35, 2369–2380.
30. Shi, W.; Caballero, J.; Huszár, F.; Totz, J.; Aitken, A.P.; Bishop, R.; Rueckert, D.; Wang, Z. Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 1874–1883.
31. Chen, L.C.; Papandreou, G.; Kokkinos, I.; Murphy, K.; Yuille, A.L. DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 40, 834–848.
32. Li, X.; Zhang, G.; Qiao, H.; Bao, F.; Deng, Y.; Wu, J.; He, Y.; Yun, J.; Lin, X.; Xie, H.; et al. Unsupervised content-preserving transformation for optical microscopy. Light Sci. Appl. 2021, 10, 44.
33. Zhang, W.; Song, H.; He, X.; Huang, L.; Zhang, X.; Zheng, J.; Shen, W.; Hao, X.; Liu, X. Deeply learned broadband encoding stochastic hyperspectral imaging. Light Sci. Appl. 2021, 10, 108.
34. Wang, L.; Zhang, T.; Fu, Y.; Huang, H. HyperReconNet: Joint Coded Aperture Optimization and Image Reconstruction for Compressive Hyperspectral Imaging. IEEE Trans. Image Process. 2019, 28, 2257–2270.
35. Miao, X.; Yuan, X.; Pu, Y.; Athitsos, V. λ-Net: Reconstruct Hyperspectral Images From a Snapshot Measurement. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Korea, 27 October–2 November 2019; pp. 4058–4068.
36. Wang, L.; Sun, C.; Fu, Y.; Kim, M.H.; Huang, H. Hyperspectral Image Reconstruction Using a Deep Spatial-Spectral Prior. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 8024–8033.
37. Wang, L.; Sun, C.; Zhang, M.; Fu, Y.; Huang, H. DNU: Deep Non-Local Unrolling for Computational Spectral Imaging. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 1658–1668.
38. Yang, Y.; Xie, Y.; Chen, X.; Sun, Y. Hyperspectral Snapshot Compressive Imaging with Non-Local Spatial-Spectral Residual Network. Remote Sens. 2021, 13, 1812.
39. Gedalin, D.; Oiknine, Y.; Stern, A. DeepCubeNet: Reconstruction of spectrally compressive sensed hyperspectral images with deep neural networks. Opt. Express 2019, 27, 35811–35822.
40. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
41. Gan, L. Block Compressed Sensing of Natural Images. In Proceedings of the 2007 15th International Conference on Digital Signal Processing, Wales, UK, 1–4 July 2007; pp. 403–406.
42. Yao, H.; Dai, F.; Zhang, S.; Zhang, Y.; Tian, Q.; Xu, C. DR2-Net: Deep Residual Reconstruction Network for image compressive sensing. Neurocomputing 2019, 359, 483–493.
43. Alsallakh, B.; Kokhlikyan, N.; Miglani, V.; Yuan, J.; Reblitz-Richardson, O. Mind the Pad–CNNs can Develop Blind Spots. arXiv 2020, arXiv:2010.02178.
44. Arad, B.; Ben-Shahar, O. Sparse Recovery of Hyperspectral Signal from Natural RGB Images. In Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands, 11–14 October 2016; Springer: Berlin/Heidelberg, Germany, 2016; pp. 19–34.
Figure 1. Schematic of the coded aperture tunable filter (CATF) spectral imager. Light reflected from the object is modulated by the liquid crystal tunable filter (LCTF) and digital micromirror device (DMD). Each block on the DMD represents a micromirror unit. A micromirror unit in the ON state (in white) reflects the light along the main optical axis to the detector, while a unit in the OFF state (in black) reflects the light away from the main optical axis.
Figure 2. The imaging model of the CATF spectral imager and the reconstruction process (diagram of one pixel on the detector). The upper part of the figure describes the imaging process of the CATF spectral imager. The original hyperspectral (HS) cube is first filtered by the LCTF with L channels to generate a multispectral (MS) image (without loss of spatial resolution). It is then spatially modulated by the DMD with K coded apertures. Finally, L × K measurements are obtained on the detector. The reconstruction process, in the lower part of the figure, backtracks the imaging process step by step. The detector measurements are used as the input; the MS image with the same spatial resolution as the DMD is then produced via spatial initialization and enhancement. The reconstructed HS cube is finally acquired after spectral initialization and spatial–spectral information enhancement.
Figure 3. Overall architecture of the proposed backtracking reconstruction network (BTR-Net). The arrows denote different operations. Each box corresponds to a feature map, with the batch size denoted on top, the number of channels at the upper left edge, and the x–y size at the bottom.
Figure 4. RGB projections of the reconstructed results.
Figure 5. Comparisons of spectrum reconstruction.
Figure 6. Performance of BTR-Net and iterative algorithms at different wavelengths, taking PSNR (left), SSIM (middle) and RAE (right) as evaluation metrics.
Figure 7. Visual quality comparisons of reconstructed results at different noise levels.
Figure 8. Comparisons of the reconstructed spectra at different noise levels.
Figure 9. Testbed of the CATF spectral imager.
Figure 10. Visual quality comparison of the reconstructed results.
Figure 11. The reflection spectra of reconstructed results for two representative points whose positions are marked on the RGB image (left), P1 (middle) and P2 (right).
Figure 12. Training error of the model using sub-pixel convolution and bilinear interpolation.
Figure 13. Performance with different numbers of Resblocks (average over all test images).
Table 1. Comparisons of the reconstruction quality of BTR-Net and the iterative algorithms. Each HS image has a spatial resolution of 1280 × 1280 and 196 spectral bands. We adopted the mean peak signal to noise ratio (MPSNR), mean structural similarity index measure (MSSIM), mean relative absolute error (MRAE), and mean spectral angle mapper (MSAM) as evaluation metrics.
Methods   Metrics   Scene 1   Scene 2   Scene 3   Scene 4   Scene 5   Scene 6   Scene 7   Scene 8   Average
TwIST     MPSNR     25.9461   25.3603   29.0709   26.4064   25.9091   23.3570   25.3866   25.7167   25.8941
          MSSIM     0.7651    0.7555    0.8446    0.7913    0.7389    0.5745    0.7795    0.7741    0.7529
          MRAE      0.1440    0.1618    0.1548    0.1531    0.1400    0.1304    0.1730    0.1696    0.1533
          MSAM      0.1925    0.1779    0.3029    0.1903    0.1701    0.1683    0.2730    0.3021    0.2221
GPSR      MPSNR     27.7491   26.6125   30.9366   28.0081   28.1142   25.4694   26.6935   27.0328   27.5770
          MSSIM     0.8587    0.8359    0.9056    0.8754    0.8525    0.7100    0.8544    0.8451    0.8422
          MRAE      0.1012    0.1294    0.1106    0.1130    0.1018    0.1053    0.1312    0.1358    0.1160
          MSAM      0.1473    0.1422    0.2434    0.1415    0.1341    0.1469    0.2113    0.2525    0.1774
GAP-TV    MPSNR     27.0290   26.0617   29.6778   27.3291   27.1311   26.1558   25.3088   25.8458   26.8174
          MSSIM     0.8977    0.8728    0.9032    0.8977    0.8959    0.9187    0.8442    0.8459    0.8845
          MRAE      0.1445    0.1582    0.1781    0.1526    0.1368    0.1230    0.1959    0.1920    0.1601
          MSAM      0.2090    0.1938    0.3237    0.2079    0.1861    0.1830    0.2947    0.3230    0.2402
BTR-Net   MPSNR     31.4361   29.2063   35.0696   30.2514   32.4901   33.7753   30.2386   29.9614   31.5536
          MSSIM     0.9354    0.9065    0.9682    0.9242    0.9356    0.9485    0.9160    0.9009    0.9294
          MRAE      0.0473    0.0786    0.0544    0.0679    0.0454    0.0254    0.0724    0.0780    0.0587
          MSAM      0.0257    0.0337    0.0458    0.0316    0.0225    0.0199    0.0371    0.0491    0.0332
Table 2. Comparisons of running time (average time to reconstruct an HS image with a spatial resolution of 1280 × 1280 and 196 spectral bands).
Methods                      TwIST              GPSR               GAP-TV             BTR-Net
Running time (s), CPU/GPU    1.35 × 10^4 / –    1.15 × 10^4 / –    1.08 × 10^4 / –    1.29 × 10^2 / 3.36
Table 3. Performance at different noise levels (average over all test images).
Methods   Metrics   None      40 dB     30 dB     20 dB
TwIST     MPSNR     25.8941   25.2734   22.7043   15.9897
          MSSIM     0.7529    0.7225    0.5942    0.2820
          MRAE      0.1533    0.1625    0.2048    0.4066
          MSAM      0.2221    0.2277    0.2617    0.4585
GPSR      MPSNR     27.5770   27.4652   25.8821   12.0305
          MSSIM     0.8422    0.8385    0.7816    0.1633
          MRAE      0.1160    0.1175    0.1245    0.5493
          MSAM      0.1774    0.1779    0.1893    0.6204
GAP-TV    MPSNR     26.8174   26.6058   25.3251   20.5241
          MSSIM     0.8845    0.8736    0.8011    0.5357
          MRAE      0.1601    0.1755    0.1889    0.2654
          MSAM      0.2402    0.2418    0.2512    0.3299
BTR-Net   MPSNR     31.5536   31.5209   31.3158   29.9170
          MSSIM     0.9294    0.9294    0.9292    0.9278
          MRAE      0.0587    0.0594    0.0625    0.0820
          MSAM      0.0332    0.0332    0.0332    0.0334
Table 4. Quantitative comparisons of sub-pixel convolution and bilinear interpolation (metrics averaged over all test images).
Methods                  MPSNR (dB)   MSSIM    MRAE     MSAM
Sub-pixel Convolution    31.5536      0.9294   0.0586   0.0332
Bilinear Interpolation   30.6836      0.9226   0.0706   0.0362
Table 5. Performance and time complexity of using different kernel sizes in the Resblock (metrics averaged over all test images).
Kernel Size   MPSNR (dB)   MSSIM    MRAE     MSAM     Time Complexity
11-1-7        31.6100      0.9302   0.0585   0.0333   4.6 × 10^7
9-1-5         31.5536      0.9294   0.0586   0.0332   3.3 × 10^7
7-1-3         31.3566      0.9269   0.0605   0.0332   2.2 × 10^7
Table 6. Cross-validation of BTR-Net.
Testing Set   MPSNR     MSSIM    MRAE     MSAM
1             30.9160   0.9173   0.0563   0.0316
2             30.8934   0.9116   0.0589   0.0315
3             30.9825   0.9136   0.0568   0.0327
4             31.6278   0.9209   0.0518   0.0289
5             31.5536   0.9294   0.0587   0.0332
Average       31.1947   0.9186   0.0565   0.0316
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Wang, X.; Xu, T.; Zhang, Y.; Fan, A.; Xu, C.; Li, J. Backtracking Reconstruction Network for Three-Dimensional Compressed Hyperspectral Imaging. Remote Sens. 2022, 14, 2406. https://doi.org/10.3390/rs14102406
