Article

Synthetic Aperture Radar Image Compression Based on Low-Frequency Rejection and Quality Map Guidance

1 Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China
2 Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100190, China
3 School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, Beijing 100190, China
* Author to whom correspondence should be addressed.
Remote Sens. 2024, 16(5), 891; https://doi.org/10.3390/rs16050891
Submission received: 19 January 2024 / Revised: 25 February 2024 / Accepted: 28 February 2024 / Published: 2 March 2024
(This article belongs to the Special Issue SAR Data Processing and Applications Based on Machine Learning Method)

Abstract

Synthetic Aperture Radar (SAR) images are widely utilized in the field of remote sensing. However, there is a limited body of literature specifically addressing the learning-based compression of SAR images. To address the escalating volume of SAR image data for storage and transmission, which necessitates more effective compression algorithms, this paper proposes a novel framework for compressing SAR images. Initially, we introduce a novel two-stage transformation-based approach aimed at suppressing the low-frequency components of the input data, thereby achieving a high information entropy and minimizing quantization losses. Subsequently, a quality map guidance image compression algorithm is introduced, involving the fusion of the input SAR images with a target-aware map. This fusion involves convolutional transformations to generate a compact latent representation, effectively exploring redundancies between focused and non-focused areas. To assess the algorithm's performance, experiments are carried out on both the low-resolution Sentinel-1 dataset and the high-resolution QiLu-1 dataset. The results indicate that the low-frequency suppression algorithm significantly outperforms traditional processing algorithms by 3–8 dB when quantizing the input data, effectively preserving image features and improving objective quality metrics. Furthermore, the quality map guidance image compression algorithm demonstrates a superior performance compared to the baseline model.

1. Introduction

Synthetic Aperture Radar (SAR) images are assuming an increasingly pivotal role in the realm of remote sensing. In recent years, a myriad of SAR systems of various types have been developed globally, with a continuous increase in the research and deployment of SAR satellites. The utilization of SAR images is progressively expanding into diverse domains, including aerospace, and there is a concurrent enhancement in both the data volume and the quality of SAR images [1,2,3].
SAR images exhibit intrinsic features of complex data structures and concentrated pixel distributions. Due to the continuous development of imaging technology, SAR images are attaining higher spatial and spectral resolutions, resulting in the creation of a significant volume of data characterized by features such as a high resolution, extensive imaging coverage, and a large data size [4,5,6]. In certain scenarios, especially over maritime surfaces where focal targets such as ships are relatively scarce and non-focal areas like open sea are extensive, the associated costs for the transportation, storage, and management of SAR images become notably elevated. Consequently, these challenges present significant obstacles to downstream tasks such as detection, identification, and segmentation in SAR image analysis [7,8].
Traditional image compression can be broadly categorized into two main types: lossless compression and lossy compression. Lossless compression refers to the process where the reconstructed image, after compression, retains all information from the original image, ensuring the preservation of image details without any loss. Despite the absence of information loss in lossless compression, its compression ratio typically hovers around 10:1 [9]. On the other hand, lossy compression involves a certain degree of information loss between the reconstructed and original images. However, it achieves compression ratios of up to 100:1 or even higher [10].
Lossless compression finds common application in tasks requiring precise capture of detailed features, such as spectral feature extraction and medical image detection. Prominent algorithms for lossless compression encompass run-length encoding, arithmetic coding, and Huffman coding [11,12]. Termed as entropy coding, this approach is distinguished by the preservation of information entropy during image reconstruction and restoration. While these methods can compress images without loss, their compression ratios often do not meet ideal standards. Conversely, lossy compression is frequently employed in scenarios where detailed image reconstruction is not paramount, such as in tasks like image segmentation, classification, and target extraction [13,14,15,16,17]. Across various domains, including military and civilian applications, the demand for lossy compression has been on the rise. Prominent lossy compression algorithms include JPEG and JPEG2000 [18,19]. Despite their proficiency in achieving commendable image reconstruction at low compression ratios, issues such as perceptible information loss, subjective deviations from the original image, and artifacts like block effects, ringing effects, and blurring tend to manifest at higher compression ratios [20].
In recent years, researchers have introduced numerous innovative compression algorithms for Synthetic Aperture Radar (SAR) images. These approaches primarily fall into two categories: those based on traditional image processing and those leveraging machine learning techniques. JPEG is considered one of the classical traditional compression algorithms; however, it is not very effective on SAR images affected by significant multiplicative noise. Kozhemiakin et al. considered the specific characteristics of SAR images and applied the JPEG2000 and SPIHT [21] algorithms to SAR images. The results indicated that, under equivalent compression ratios, the JPEG2000 method exhibited a performance comparable to SPIHT [22]. Li and Chang optimized the wavelet transform and proposed an enhanced SAR image compression model based on tower-shaped wavelet decomposition [23]. Subsequently, Zemliachenko et al. developed a compression ratio prediction algorithm for a discrete cosine transform (DCT) encoder, utilizing remote sensing images. Building upon the discrete wavelet transform (DWT), Dheepa et al. introduced a directional lifting wavelet transform (DLWT) [24,25]. By constructing a transformation matrix, implementing internal quantization, and employing a general function for the encoding and decoding processes, they aimed to enhance the coding efficiency and clustering capabilities. Experimental results demonstrated that the PSNR performance of DLWT surpassed that of the discrete wavelet transform (DWT) [26]. In exploring efficient strategies for image compression, Du and Fowler demonstrated the feasibility of effectively compressing hyperspectral images by combining JPEG2000 and principal component analysis (PCA), offering an effective approach for handling the high data volumes of remote sensing images [27]. On another note, Bai et al. (2017) explored PolSAR image compression based on online sparse K-SVD dictionary learning, proposing a novel strategy to address the unique challenges of PolSAR images [28]. Furthermore, Li et al. (2018) introduced an innovative method for compressing remote sensing images in the visible/near-infrared range using heterogeneous compressive sensing, providing new perspectives and technical support for the compression of remote sensing images [29]. In practical applications of remote sensing image compression, the demand for various image compression qualities becomes apparent. Consequently, research focusing on quality control has gained prominence. Ieremeiev and Makarichev conducted studies on image quality control and successfully implemented a compressive model with controllable quality [30,31].
In recent years, there has been significant development in learning-based image processing methods, including detection, recognition, and segmentation, and these methods have found successful applications in the field of remote sensing [13,14,15,16,17]. Concurrently, the demand for image compression in remote sensing has witnessed a steady increase. Consequently, numerous researchers have dedicated their efforts to exploring learning-based algorithms for compressing remote sensing images [32,33,34,35,36,37,38,39,40]. These learning-based approaches in remote sensing image compression use extensive sample learning to extract key features that leverage the spatial characteristics of images. Compared with traditional image compression algorithms, learning-based algorithms use Convolutional Neural Networks (CNNs) to deeply explore image features and map high-dimensional features into compact low-dimensional representations.
In 2016, the application of CNNs to image compression emerged, with Ballé et al. introducing an end-to-end image compression framework based on variational autoencoder CNNs [32]. Subsequently, in 2018, Ballé et al. introduced further enhancements to the end-to-end CNN compression framework. This framework employed variational autoencoders for data processing and introduced a hyperprior network to capture latent data structures [33]. Expanding on this work, Minnen et al. improved the entropy coding stage, introducing an enhanced context model for entropy coding. This marked a significant advancement, representing the first instance of a deep learning compression method surpassing the performance of the widely used BPG in objective image evaluations [34]. Li and Liu leveraged CNNs for feature extraction from multispectral remote sensing images. The baseline network employed a simple two-layer CNN and achieved overall remote sensing image compression through the DCT transformation and entropy coding. The experimental results demonstrated superiority over methods based on BPG [35]. In a parallel development, Xu et al. proposed a variational autoencoder model for SAR image compression, incorporating a priori models. By combining residual blocks with transformations, they enhanced the depth of the network for improved image feature extraction. The results demonstrated superior performance compared to JPEG, JPEG2000, and Li’s method [36]. Building on Xu’s foundation, Zhang et al. further refined the model by introducing a hybrid Gaussian model for fitting and estimating model parameters. This modification, validated on ICEYE and Sandia datasets, demonstrated superiority over traditional compression methods and learning-based algorithms [37]. In a separate contribution [38], a compression algorithm featuring pyramid features and quality enhancement was proposed. Utilizing a variational autoencoder–decoder network as the baseline model, this approach combined the conditional Gaussian with universal quantization for SAR remote sensing image compression. Validation on Sandia National Laboratories and ICEYE datasets confirmed the effectiveness of the proposed method. Fu et al. endeavored to explore spatial redundancy by incorporating both local and global context information. Multiple residual modules were introduced to enhance the model’s feature extraction performance [39]. In the same year, Fu et al. proposed a model employing multiple prior networks, combining a CNN-based prior network with one based on transformer modules. This cascade approach, although increasing the complexity, resulted in a thorough exploration of spatial redundancy, achieving favorable experimental results on high-resolution remote sensing image (HRRSI) datasets [40].
In practical applications, most models often overlook errors at data input [21,22,23,24,25,26,27,28,29,30,31,35,36,37,38,39,40]. Commonly, traditional techniques such as linear stretching are employed for quantization processing [41]. Regrettably, most preprocessing methods based on conventional approaches result in the loss of crucial input features. Additionally, existing models often concentrate solely on spatial redundancy at both global and local levels, without distinguishing between target and non-target regions [36,37,38,39,40,41]. This lack of consideration may adversely affect the compression performance of the models.
By addressing the loss of image features, aiming to enhance information preservation in SAR image inputs, and exploring redundancy in spatial structures unrelated to the target, this paper introduces a SAR image compression model based on two-stage low-frequency suppression and quality map guidance. The primary contributions of our work are summarized as follows:
  • The paper proposes an SAR image compression model that utilizes two-stage low-frequency suppression and quality map guidance, validated through experiments conducted on Sentinel-1 low-resolution images and QiLu-1 high-resolution images.
  • To address the substantial losses incurred at the data input stage, the paper constructs two-stage transformation operators to suppress the low-frequency input data, achieving a higher peak signal-to-noise ratio and minimizing the quantization loss in the input data.
  • To explore the redundancy between focused and non-focused targets, we establish a compression model guided by a quality map, directing the allocation of compression bit rates. This method results in a higher level of information fidelity in the compressed model focused on target perception.
The remainder of this paper is organized as follows: Section 2 provides a review of the related work, including an introduction to baseline network principles and formulas. Section 3 details the algorithm proposed in this paper. Experimental results and analyses are presented in Section 4. Section 5 discusses the algorithm, and in Section 6, we conclude with the main findings of this paper.

2. Materials and Methods

This section introduces the relevant work on remote sensing image compression. It encompasses an overview of the baseline network model, the principles of the hyperprior network, and an exposition of the parameters and formulas employed in this paper.
Most compression models obtained through learning typically include components such as an encoder, a decoder, quantization coding, and an entropy coding network. Built on the foundation of the variational autoencoder model, the introduction of a hyperprior network captures structural redundancy among latent representation feature maps. Consequently, these networks exhibit high precision, a superior compression performance, and reduced loss of complex data. The optimization problem of the algorithm can be modeled as a variational autoencoder [42], with the prior of the latent layer representation corresponding to the entropy model and the side information generated by the hyperprior network serving as a prior for the entropy model. In general, the quantized data probability model is regarded as a joint known distribution. This model is subsequently employed for entropy coding and is applied to tasks such as image storage and transmission. The overall network framework is illustrated in Figure 1, with the main parameters and formulas described as follows:
$$
\begin{aligned}
y &= g_a(x), & \hat{y} &= U|Q(y), \\
z &= h_a(y), & \hat{z} &= U|Q(z), \\
prior &= h_s(\hat{z}), & \hat{x} &= g_s(\hat{y}), \\
R_y &= R(\hat{y} \mid prior), & R_z &= R(\hat{z})
\end{aligned}
$$
where $x$ and $\hat{x}$ represent the original and reconstructed images, $y$ and $\hat{y}$ represent the latent and quantized latent variables, and $z$ and $\hat{z}$ represent the hyperprior latent and quantized hyperprior latent variables. The prior is the output of the hyperprior decoder, representing the parameter $\sigma$ within a zero-mean single Gaussian model, or $\mu$ and $\sigma$ for the component Gaussians within a mixture Gaussian model. Here, $g_a$ and $g_s$ represent the encoder and decoder transformations, while $h_a$ and $h_s$ represent the transformations of the hyperprior encoder and decoder. Because uniform quantization disrupts gradient backpropagation during training, the training process incorporates additive uniform noise instead of quantization; accordingly, we use $U|Q$ to refer to the quantizer. $R$ represents entropy coding, such as arithmetic coding (AE), and $R_y$ and $R_z$ respectively denote the bit rates (BPP, bits per pixel) after entropy coding of the latent representation and the hyperprior latent representation.
The entropy model constitutes a pivotal element in learning-based lossy image compression algorithms. By leveraging the additional information introduced through the hyperprior, which occupies only a minute bit rate, a more precise entropy model can be constructed, thereby significantly enhancing the compression performance. The hyperprior model essentially estimates the parameters of the true distribution of $y$. Given the inherent discrepancy arising from the unknown nature of the actual distribution, the primary aim is to minimize this disparity by aligning the probability model distribution as closely as possible with the true distribution of $y$. During entropy coding, effective image compression coding can be achieved through arithmetic coding once the parameters of the probability model of $\hat{y}$ are discerned. To reduce the divergence between the probability model and the actual model, the introduction of the side information $z$ through the hyperprior encoding network $h_a$ facilitates the precise estimation of the probability model [43]. Through quantization, the arithmetic encoder (AE), and the arithmetic decoder (AD), the hyperprior latent representation $z$ is redefined as $\hat{z}$. Using the parameter generation model, i.e., the hyperprior decoding network $h_s$, to conduct multi-Gaussian modeling, the estimation of the distribution of $y$ can be expressed as:
$$
p_{\hat{y} \mid \hat{z}}(\hat{y} \mid \hat{z}) \approx h_s(\hat{z}; \theta_h)
$$
where $\theta_h$ represents the parameters of the hyperprior decoding network $h_s$, and $\approx$ denotes that the model provides an estimation. If a Gaussian mixture model is employed for the estimation process [44,45], the model can be characterized as follows:
$$
p_{\hat{y} \mid \hat{z}}(\hat{y} \mid \hat{z}) \approx \sum_{k} \mathcal{N}_k(\mu_k, \sigma_k^2)
$$
Here, $\mu_k$ and $\sigma_k$ represent the estimations of the mean and standard deviation of each Gaussian component, with a practical choice of $k$ set to 3. The process of parameter fitting and estimation involves the utilization of a three-component Gaussian mixture model. After processing the data with the estimated parameters, they are input into the primary decoder to achieve image reconstruction. Concerning the optimization of the entire network’s parameters [46,47], we continue to employ the overarching rate distortion function:
$$
L = R + \lambda D
$$
where $D$ represents the distortion between the original image and the reconstructed image, usually measured by the MSE, PSNR, or MS-SSIM [48,49], and $R$ represents the overall compression bit rate of the framework. In fact, $R$ in Equation (4) comprises the two bit-rate terms shown in Equation (1):
$$
R = R_y + R_z
$$
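To make the pipeline in Equation (1) concrete, the following is a minimal PyTorch sketch of a hyperprior compression model and its rate–distortion objective. The layer sizes, channel counts, and the single-Gaussian entropy model are illustrative assumptions, not the exact architecture used in this paper; the hyperprior rate $R_z$, which in practice requires a separate factorized prior, is omitted for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HyperpriorModel(nn.Module):
    """Minimal sketch of the transforms g_a, g_s, h_a, h_s of Equation (1)."""
    def __init__(self, n=128):
        super().__init__()
        # g_a / g_s: main analysis and synthesis transforms (4x down/upsampling)
        self.g_a = nn.Sequential(
            nn.Conv2d(1, n, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(n, n, 5, stride=2, padding=2))
        self.g_s = nn.Sequential(
            nn.ConvTranspose2d(n, n, 5, stride=2, padding=2, output_padding=1), nn.ReLU(),
            nn.ConvTranspose2d(n, 1, 5, stride=2, padding=2, output_padding=1))
        # h_a / h_s: hyperprior transforms; h_s outputs the Gaussian parameters (the "prior")
        self.h_a = nn.Sequential(
            nn.Conv2d(n, n, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(n, n, 3, stride=2, padding=1))
        self.h_s = nn.Sequential(
            nn.ConvTranspose2d(n, n, 3, stride=2, padding=1, output_padding=1), nn.ReLU(),
            nn.ConvTranspose2d(n, 2 * n, 3, stride=2, padding=1, output_padding=1))

    @staticmethod
    def quantize(t):
        # U|Q of Equation (1): additive uniform noise replaces rounding during training
        return t + torch.empty_like(t).uniform_(-0.5, 0.5)

    def forward(self, x):
        y = self.g_a(x)
        z = self.h_a(y)
        y_hat, z_hat = self.quantize(y), self.quantize(z)
        mu, sigma = self.h_s(z_hat).chunk(2, dim=1)   # prior = h_s(z_hat)
        x_hat = self.g_s(y_hat)
        return x_hat, y_hat, mu, F.softplus(sigma)    # softplus keeps sigma positive

def gaussian_rate(y_hat, mu, sigma):
    """R_y of Equation (1): bits under the Gaussian entropy model, via the
    probability mass of the rounding interval [y_hat - 0.5, y_hat + 0.5]."""
    d = torch.distributions.Normal(mu, sigma)
    p = d.cdf(y_hat + 0.5) - d.cdf(y_hat - 0.5)
    return -torch.log2(p.clamp_min(1e-9)).sum()

# Rate-distortion objective of Equations (4) and (5); lambda is illustrative.
model = HyperpriorModel()
x = torch.rand(1, 1, 256, 256)                        # stand-in for an 8-bit SAR patch
x_hat, y_hat, mu, sigma = model(x)
bpp = gaussian_rate(y_hat, mu, sigma) / x.numel()     # R_y in bits per pixel
loss = bpp + 0.01 * F.mse_loss(x_hat, x)              # L = R + lambda * D
```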

3. Proposed Algorithm

This section mainly introduces the low-frequency suppression algorithm and the target perception model. In Section 3.1, we explore the motivation and principles of the two-stage low-frequency suppression algorithm, and in Section 3.2, we introduce the quality map guidance model algorithm. Figure 2 illustrates the overall design framework of the model, including modules for pre-processing and post-processing for the low-frequency suppression algorithm, as well as the quality map guidance image compression model. The latter includes modules for quality map extraction and fusion, with the baseline network being the hyperprior network described in Section 2.

3.1. The Two-Stage Low-Frequency Suppression Algorithm

The overall algorithmic process of the low-frequency suppression algorithm is illustrated in Figure 3. Firstly, the distribution of raw SAR data needs to be fitted. Subsequently, the low-frequency suppression algorithm is applied to obtain a bag-of-words model for the selection of parameters k and t. Two algorithmic modes have been designed to cater to different requirements: a rough selection mode, directly completed using interpolation, and a refined selection mode that utilizes the low-frequency suppression algorithm for precise parameter selection around the target parameters. With the selected parameters, the original data undergo transformations, achieving processing at the data input level. The processed image will be compressed and reconstructed through the compression model. Finally, the reconstructed data undergo post-processing to achieve overall data reconstruction.

3.1.1. Background and Motivation

For the accurate preservation of SAR signals, most primary and secondary products derived from SAR images are stored in a 16-bit format [50,51]. However, in many downstream tasks of learning-based SAR image processing, such as compression, detection, and recognition [13,14,15,16,17], the utilized data are in 8-bit JPEG and PNG formats. According to [52], SAR image samples from MSTAR-O are generated by employing simple linear enhancement of official MSTAR raw data using MATLAB, while SAR image samples from MSTAR-P are directly downloaded from an online personal blog as post-processed “.jpeg” images. In many learning-based SAR processing tasks, the utilization of 16-bit data input in the network is challenging due to limitations in hardware conditions, model size, and resource consumption. Therefore, quantizing SAR data from 16-bit to 8-bit is a crucial yet frequently overlooked aspect in many learning-based SAR processing tasks.
When converting 16-bit SAR raw data to 8-bit through linear quantization, a substantial portion of the grayscale tends to be concentrated in the lower intensity range. For example, in the experimental data, the Sentinel-1 dataset displays a distribution primarily within the range of 0–30, while the QiLu-1 dataset is concentrated within 0–3. This direct linear quantization results in a significant quantization loss. The primary objective of established quantization enhancement techniques is to mitigate this loss and improve the preservation of image information. Considering the characteristic concentration of SAR image distributions, there is a need to design a transformation operator that suppresses the low-frequency components and expands the high-frequency components of the data, thereby minimizing information loss at the input level. Traditional methods for low-frequency suppression include linear stretching, histogram equalization, and the power-law transformation.
The method of linear stretching expands the dynamic range of the data in a linear fashion. Pixels exceeding a certain threshold are excluded, and the quantized result is used as the input data. While this method preserves most of the information, it leads to a significant loss for pixels with higher grayscale values, resulting in substantial information loss at the input data level.
Histogram equalization’s central idea is to suppress low-frequency pixel intensities and expand high-frequency pixel intensities. However, the degree of suppression and expansion varies with the number of pixels within a specific range. Like linear stretching within a certain range, this method necessitates recording additional and larger parameters, which deviates from the original intention of image compression.
The power-law transformation strikes a balance between linear stretching and histogram equalization. Requiring only one parameter, this method efficiently suppresses low-frequency pixel intensities. However, the extent of expansion for high-frequency components can become excessive with different internal model parameters.
To address these issues, we propose a two-stage low-frequency suppression operator built upon existing methods. It involves linear stretching for the primary high-frequency pixel components to achieve high-frequency expansion. Simultaneously, it compresses the secondary high-frequency and low-frequency pixel components to achieve low-frequency suppression. This design aims to minimize information loss at the data input level while introducing fewer parameters.

3.1.2. Model Design and Construction

The SAR image distribution generally follows a Rayleigh distribution. The raw data can be accurately described by fitting them to the Rayleigh distribution, which requires only one parameter, σ, to represent the probability distribution of grayscale values. The probability distribution function of the Rayleigh distribution can be expressed as follows:
$$
y = \frac{x}{\sigma^2}\, e^{-\frac{x^2}{2\sigma^2}}
$$
In most SAR images, high-frequency components correspond to low grayscale values, while low-frequency components correspond to high grayscale values. Typically, under the conditions of compressing high grayscale values and stretching low grayscale values, methods such as the power transformation and histogram equalization can achieve a more optimal SAR image quantization and reconstruction than the linear transformation. However, depending on the introduced parameter values, the power transformation may lead to an excessive stretching of low grayscale values. Additionally, histogram equalization introduces extra parameters and is not conducive to the reconstruction of quantized data. To keep the number of introduced parameters low and to suit the characteristics of the SAR distribution, a linear-and-power two-stage low-frequency suppression constructor $g(x)$ is constructed:
$$
g(x) = \begin{cases} g(t)\,x/t, & x < t \\[4pt] 255 \log\!\left(\dfrac{x}{k}+1\right) \Big/ \log\!\left(\dfrac{255}{k}+1\right), & x \ge t \end{cases}
$$
where $x \in [0, 255]$. The constructed function $g$ introduces two additional parameters, $t$ and $k$, where $t \in [0, 255]$ and $k \in (0, \infty)$. Here, $t$ represents the threshold: pixels smaller than $t$ are considered high-frequency points and undergo linear stretching, while pixels larger than $t$ are considered low-frequency points and undergo power compression. The parameter $k$ represents the degree of compression in the power-law transformation. Specifically, the inflection point $g(t)$ at $x = t$ is expressed as:
$$
g(t) = 255 \log\!\left(\frac{t}{k}+1\right) \Big/ \log\!\left(\frac{255}{k}+1\right)
$$
It is apparent that for $t = 0$, the constructed function $g$ reduces to the power transformation method, governed by the sole parameter $k$. When $t = 255$, the function simplifies to $g(x) = x$, mapping any point $x$ to itself. As $k$ approaches 0, $g(t)$ approaches 255, and the function approximates a linear stretch of the $0$ to $t$ range onto the full output range, effectively discarding the $t$ to 255 range. As $k$ approaches infinity, $g(t)$ converges towards $t$, and the function $g$ likewise approximates $g(x) = x$, mapping any point $x$ to itself. The inverse of the transformation function $g(x)$ is:
$$
g^{-1}(x) = \begin{cases} t\,x/g(t), & x < g(t) \\[4pt] k\left(10^{\,x \log\left(\frac{255}{k}+1\right)/255} - 1\right), & x \ge g(t) \end{cases}
$$
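For illustration, the two-stage operator of Equations (7)–(9) can be implemented in a few lines of NumPy. This is a sketch under the stated conventions (base-10 logarithms, grayscale values already linearized to [0, 255]); the parameter values in the usage example are those reported for Sentinel-1 in Section 4.2.

```python
import numpy as np

def g(x, t, k):
    """Two-stage low-frequency suppression operator of Equation (7)."""
    x = np.asarray(x, dtype=np.float64)
    gt = 255.0 * np.log10(t / k + 1.0) / np.log10(255.0 / k + 1.0)  # inflection point, Eq. (8)
    linear = gt * x / t                                              # stretch [0, t)
    power = 255.0 * np.log10(x / k + 1.0) / np.log10(255.0 / k + 1.0)  # compress [t, 255]
    return np.where(x < t, linear, power)

def g_inv(y, t, k):
    """Inverse transform of Equation (9), used in post-processing."""
    y = np.asarray(y, dtype=np.float64)
    gt = 255.0 * np.log10(t / k + 1.0) / np.log10(255.0 / k + 1.0)
    linear = t * y / gt
    power = k * (10.0 ** (y * np.log10(255.0 / k + 1.0) / 255.0) - 1.0)
    return np.where(y < gt, linear, power)

# Quantize-and-restore round trip with the Sentinel-1 parameters of Section 4.2
x = np.linspace(0.0, 255.0, 1000)            # data linearized to [0, 255]
x8 = np.round(g(x, t=27, k=1e-8))            # 8-bit input after low-frequency suppression
x_rec = g_inv(x8, t=27, k=1e-8)              # reconstruction via the inverse transform
```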

3.1.3. Quantitative Loss Analysis

SAR raw data are initially stored in a 16-bit format. For subsequent calculations, we linearize them to a floating-point representation within the 0–255 range without quantization, minimizing loss. In traditional quantization methods, the Mean Squared Error (MSE) quantization loss function for directly quantizing SAR images can be defined as:
$$
loss_{raw} = \int_0^{255} y(x)\,\big(x - \mathrm{round}(x)\big)^2\,dx = \sum_{m=0}^{255} \int_{m-0.5}^{m+0.5} y(x)\,(x-m)^2\,dx
$$
where $loss_{raw}$ can be conceptualized as the sum of the MSE losses across the intervals $[m-0.5, m+0.5]$, where $m$ ranges from 0 to 255. Here, $y(x)$ denotes the distribution function of the SAR image. The loss for each interval $[m-0.5, m+0.5]$ can be defined as $loss_m$:
$$
loss_m = \int_{m-0.5}^{m+0.5} y(x)\,(x-m)^2\,dx
$$
Within the interval $[m-0.5, m+0.5]$, the distribution function $y(x)$ is continuous and smooth. For computational convenience, we can approximate $y$ as a linear segment that is symmetric about the point $(m, y(m))$. Hence, the $y(x)$ in Equation (11) can be simplified to the constant $y(m)$. Through overall simplification of the MSE loss function, the loss reduces to an expression that depends only on $\sigma$:
$$
loss_{raw} = \sum_{m=0}^{255} y(m) \int_{m-0.5}^{m+0.5} (x-m)^2\,dx \approx \int_0^{255} \frac{y}{12}\,dx = \frac{1 - e^{-\frac{255^2}{2\sigma^2}}}{12}
$$
The method of directly quantizing high-frequency and low-frequency points without distinction undoubtedly leads to significant losses. Therefore, it is imperative to introduce a front-end preprocessing step to mitigate these losses. In this preprocessing stage, we apply the constructed function $g$. Following this transformation, $g(x)$ is quantized within the interval $[m-0.5, m+0.5]$, where $m$ represents the index ranging from 0 to 255. Subsequently, restoration is performed through $g^{-1}(Q(g(x)))$, and this process can be conceptualized as quantizing the actual distribution $y(x)$ within the range $[g^{-1}(m-0.5), g^{-1}(m+0.5)]$, mapping it to $g^{-1}(m)$. After the two-stage low-frequency suppression transformation, the quantized MSE loss function, denoted as $loss_{pro}$, can be expressed as:
$$
loss_{pro} = \sum_{m=0}^{255} \int_{g^{-1}(m-0.5)}^{g^{-1}(m+0.5)} y(x)\,\big(x - g^{-1}(m)\big)^2\,dx
$$
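The two losses can also be compared numerically. Below is a small Monte Carlo sketch, reusing the functions g and g_inv from the earlier sketch, that estimates $loss_{raw}$ and $loss_{pro}$ by sampling the Rayleigh distribution, quantizing, and measuring the MSE; the parameter values are the Sentinel-1 examples from Section 4.2, not a general recommendation.

```python
import numpy as np

def quant_mse(sigma, transform=None, inverse=None, samples=2_000_000, seed=0):
    """Monte Carlo estimate of the quantization MSE over a Rayleigh-distributed
    image: loss_raw when no transform is given, loss_pro otherwise."""
    rng = np.random.default_rng(seed)
    x = np.clip(rng.rayleigh(scale=sigma, size=samples), 0.0, 255.0)
    if transform is None:
        x_rec = np.round(x)                        # direct quantization, Eq. (10)
    else:
        x_rec = inverse(np.round(transform(x)))    # Eq. (13): g, round, then g^-1
    return np.mean((x - x_rec) ** 2)

# Direct quantization vs. two-stage low-frequency suppression
sigma, t, k = 6.42, 27, 1e-8
loss_raw = quant_mse(sigma)
loss_pro = quant_mse(sigma, lambda v: g(v, t, k), lambda v: g_inv(v, t, k))
```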

3.1.4. Function Parameter Optimization

In fact, the problem can be simplified to a quantization loss problem of stretching within $0 \sim t$ and compression within $t \sim 255$. Referring to the optimization method in Equations (10)–(13), the distribution $y(x)$ is regarded as a straight-line segment within $[g^{-1}(m-0.5), g^{-1}(m+0.5)]$, and the following simplified loss can be obtained:
$$
loss_{pro} = \sum_{m=0}^{g(t)} y\big(g^{-1}(m)\big) \int_{g^{-1}(m-0.5)-g^{-1}(m)}^{g^{-1}(m+0.5)-g^{-1}(m)} x^2\,dx + \sum_{m=g(t)+1}^{255} y\big(g^{-1}(m)\big) \int_{g^{-1}(m-0.5)-g^{-1}(m)}^{g^{-1}(m+0.5)-g^{-1}(m)} x^2\,dx
$$
The integral in Equation (14) can be factored out separately to obtain an expression related only to the transformation function $g$:
$$
\int_{g^{-1}(m-0.5)-g^{-1}(m)}^{g^{-1}(m+0.5)-g^{-1}(m)} x^2\,dx = \frac{\big(g^{-1}(m+0.5)-g^{-1}(m)\big)^3 - \big(g^{-1}(m-0.5)-g^{-1}(m)\big)^3}{3}
$$
The first part of $loss_{pro}$ is the linear stretching loss, and the second part is the power compression loss. These losses can be expressed as follows:
$$
loss_1 = \sum_{m=0}^{g(t)} \frac{y\big(g^{-1}(m)\big)}{12\,\big(g(t)/t\big)^3} = \sum_{m=g(0)}^{g(t)} \frac{y\big(g^{-1}(m)\big)}{12\,l^3} = \frac{1 - e^{-\frac{t^2}{2\sigma^2}}}{12\,l^3}
$$
$$
loss_2 = \frac{e^{-\frac{t^2}{2\sigma^2}} - e^{-\frac{255^2}{2\sigma^2}}}{12\,p^3} \approx \frac{e^{-\frac{t^2}{2\sigma^2}}}{12\,p^3}
$$
For ease of expression, we define two variables $l$ and $p$, where $l$ is the average derivative of the constructor function $g$ within $0 \sim t$, and $p$ is the average derivative of $g$ from $t$ to 255. The derivative of the constructor function $g$ is expressed as:
$$
g'(x) = \begin{cases} g(t)/t, & x < t \\[4pt] \dfrac{255}{\log\left(\frac{255}{k}+1\right) \ln(10)\,(x+k)}, & x \ge t \end{cases}
$$
where $l$ and $p$ can be calculated as:
$$
l = \frac{g(t)}{t}, \qquad p = \frac{255 - g(t)}{255 - t} = \frac{255 - l\,t}{255 - t}
$$
The final simplified loss function $loss_{pro}$ can be expressed as:
$$
loss_{pro} = \frac{1 - e^{-\frac{t^2}{2\sigma^2}}}{12\,l^3} + \frac{(255 - t)^3\, e^{-\frac{t^2}{2\sigma^2}}}{12\,(255 - l\,t)^3}
$$
where $loss_{pro}(l, t, \sigma)$ is an analytically tractable loss function over the range $1 < l < 255/t$, $t > 0$. The entire parameter optimization problem can ultimately be reduced to minimizing $loss_{pro}$ for a given $\sigma$ over the parameters $l$ and $t$:
$$
\min_{l,\,t \in D} \; loss_{pro}(l, t, \sigma) \quad \text{subject to} \quad 1 < l < \frac{255}{t}, \;\; t > 0, \;\; l = \frac{g(t)}{t}
$$
Through the optimization of parameters, we can determine the minimum quantization loss parameters $l$ and $t$. It is important to note that in the formulation of Equation (7), the constructed function $g$ introduces two parameters, $k$ and $t$; the parameter $k$ is embedded in the expression $g(t)$ presented in Equation (8). Despite the different expressions, the use of $l$ is simply for the sake of a concise description; in practical terms, the experiment continues to employ the values of $k$ and $t$ for parameter selection. According to the determined $\sigma$, the optimal values of $k$ and $t$ are identified within the feasible domain to minimize the loss function $loss_{pro}$. This process leads to the establishment of a bag-of-words model corresponding to $\sigma$, from which we can choose the optimal parameters and implement the input data processing.
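As an illustration of this parameter selection, the following sketch scans a grid of $(t, k)$ values for a given $\sigma$, evaluates the closed-form loss of Equation (20) through $l = g(t)/t$, and keeps the feasible minimizer. The grid ranges are assumptions for illustration, not the paper's actual sampling scheme.

```python
import numpy as np

def loss_pro(l, t, sigma):
    """Closed-form simplified loss of Equation (20)."""
    e = np.exp(-t**2 / (2.0 * sigma**2))
    return (1.0 - e) / (12.0 * l**3) + (255.0 - t)**3 * e / (12.0 * (255.0 - l * t)**3)

def select_params(sigma, t_grid=np.arange(1, 255), k_grid=np.logspace(-8, 2, 200)):
    """Grid search for Equation (21): returns (minimal loss, t*, k*)."""
    best = (np.inf, None, None)
    for t in t_grid:
        for k in k_grid:
            gt = 255.0 * np.log10(t / k + 1.0) / np.log10(255.0 / k + 1.0)
            l = gt / t
            if not (1.0 < l < 255.0 / t):   # feasibility constraint of Eq. (21)
                continue
            cur = loss_pro(l, t, sigma)
            if cur < best[0]:
                best = (cur, t, k)
    return best
```

Repeating this search over a sampled range of $\sigma$ values yields the bag-of-words lookup model described above.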

3.2. The Quality-Map-Guided Image Compression Model

Current research on deep-learning-based SAR image compression mainly involves global compression and reconstruction of the entire image using encoding and decoding networks. However, these methods tend to focus primarily on the overall compression performance and metrics at a global level [36,37,38,39,40,41]. Unfortunately, achieving a higher level of information fidelity for specific local targets remains a challenging task, resulting in redundant information in non-target regions. To address this limitation, we propose a novel compression model incorporating a target-aware map, enabling the preservation of a high information content in specific regions of interest and thereby reducing the redundancy in non-target areas. The proposed model is structured as a hyperprior network, consisting mainly of the main encoding–decoding network and the hyperprior encoding–decoding network. Rate allocation is performed utilizing the quality map as side information to effectively minimize the redundancy in non-target regions. Figure 4 illustrates the architecture of the overall network, wherein the importance guidance map is introduced into both the main encoder–decoder and the hyperprior encoder–decoder. Lateral concatenation is performed whenever there is a change in dimensionality, followed by dimension transformation and reduction. The importance guidance feature map is extracted using a pre-trained ViT network [53], guiding rate allocation based on pre-trained feature maps to emphasize the rate weights of critical regions. The spatial feature transform (SFT) [54,55] is employed to spatially fuse the importance guidance map with the input data. Through upsampling, downsampling, and spatial fusion of feature maps and guidance maps at different scales, the model deepens the influence of the guidance map within the network, thereby achieving overall quality-guided image compression.
The framework employed in this study is grounded in the Vision Transformer (ViT) model, utilized as a pre-trained model for the extraction of multi-class image features. The multi-head attention output is extracted as $n$ feature sub-maps before the classification module of the ViT model, and the mean of these feature maps yields the required quality map. The pre-trained model not only extracts image features more effectively but also compensates for the tendency of conventional learning-based networks with too few layers to under-fit. In Figure 4, the quality map is fused with the original data and input into the encoder to deepen the importance of the guidance map in the network. The SFT module used is shown in Figure 5. Compared with the model in [55], the number of stacked layers is reduced to achieve a more lightweight network model. The fusion module mainly comprises two convolutional branches with activation functions, producing the parameters $\alpha$ and $\beta$, which are linearly combined with the input features to obtain the feature fusion result. The model loss function is given by Equations (4) and (5).
$$
X_i = \alpha f_i + \beta
$$
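A minimal PyTorch sketch of this fusion module is given below; the channel counts and the use of LeakyReLU are illustrative assumptions, and the quality map is assumed to have been resized to the feature resolution beforehand.

```python
import torch
import torch.nn as nn

class SFTFusion(nn.Module):
    """Lightweight SFT-style fusion of Figure 5 and Equation (22)."""
    def __init__(self, feat_ch=128, map_ch=1, hidden=64):
        super().__init__()
        # two small convolutional branches predict the affine parameters alpha and beta
        self.alpha = nn.Sequential(
            nn.Conv2d(map_ch, hidden, 3, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(hidden, feat_ch, 3, padding=1))
        self.beta = nn.Sequential(
            nn.Conv2d(map_ch, hidden, 3, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(hidden, feat_ch, 3, padding=1))

    def forward(self, f, q):
        # Equation (22): X_i = alpha * f_i + beta, with alpha, beta from the quality map q
        return self.alpha(q) * f + self.beta(q)

# Example: fuse a feature map with a single-channel quality map of the same spatial size
fuse = SFTFusion()
f = torch.rand(1, 128, 64, 64)
q = torch.rand(1, 1, 64, 64)
out = fuse(f, q)   # -> [1, 128, 64, 64]
```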

4. Experimental Results and Analysis

4.1. Dataset and Indicators

The primary datasets employed in this study consist of SAR data from Sentinel-1 and QiLu-1 satellites. Sentinel-1 data cover a diverse range of terrains, including both marine and terrestrial landscapes. The detailed aspects include entities such as ships and houses, while the textured components comprise features like roads and ocean ripples. Sentinel-1 data have an extensive coverage, a high complexity, and a resolution of approximately 5 m, establishing them as quintessential examples of low-resolution SAR imagery. Therefore, we undertook studies utilizing low-resolution data from Sentinel-1. Similarly, QiLu-1 data encompass diverse terrains, including roads, deserts, and vegetation, demonstrating a high complexity and extensive coverage. With an azimuth resolution of better than 0.2 m, QiLu-1 data stand out as a valuable resource for high-resolution image studies. Therefore, this study strategically uses both low-resolution Sentinel-1 data and high-resolution QiLu-1 data for experimental validation, ensuring a comprehensive evaluation of the proposed model.
The assessment of the information-preservation capabilities of the compression–reconstruction model primarily relies on image evaluation metrics. These metrics encompass subjective evaluation, objective evaluation metrics, and application-oriented evaluation metrics. Subjective evaluation of image compression involves visual assessment of both the original and reconstructed images. Objective evaluation metrics for image compression predominantly include the PSNR, MS-SSIM, and BPP rate.
In the realm of image compression, application-oriented evaluation metrics are tailored to SAR image characteristics, assessing indicators for subsequent SAR image applications, such as recognition rates for maritime vessel identification. This study predominantly utilizes the PSNR metric, quantifying the distortion between the original and reconstructed images, as articulated in the following formula:
$$
PSNR = 10 \log_{10} \frac{MAX^2}{MSE}
$$
where $MAX$ and $MSE$ represent the maximum possible pixel value of the image and the Mean Squared Error between the original and reconstructed images, respectively. The PSNR is measured in decibels (dB), and a higher PSNR value indicates a smaller difference between the original and reconstructed images.
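As a reference implementation, the PSNR formula above maps directly to a few lines of NumPy (assuming 8-bit images, so $MAX = 255$):

```python
import numpy as np

def psnr(orig, rec, max_val=255.0):
    """PSNR in dB between an original and a reconstructed image."""
    mse = np.mean((np.asarray(orig, np.float64) - np.asarray(rec, np.float64)) ** 2)
    return 10.0 * np.log10(max_val**2 / mse)
```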

4.2. Experimental Results and Analysis of the Low-Frequency Suppression Algorithm

In the process of the low-frequency suppression algorithm, the first step involves estimating the data distribution parameter σ, thus representing each data point with a single σ parameter. The fitted results are depicted in Figure 6, with example curves for Sentinel-1 images at σ = 6.42 and QiLu-1 images at σ = 0.32.
According to the low-frequency suppression algorithm model described in Section 3.1, we performed fitting under a given σ and selected the optimal parameters t and k to minimize the quantization loss (i.e., maximize the PSNR). The results are illustrated in Figure 7a,b, depicting the loss surface plots for low-resolution Sentinel-1 experimental images on the left (with t = 27 and k = 1 × 10−8) and high-resolution QiLu-1 experimental images on the right (with t = 1.0 and k = 1 × 10−0.7). The curve depicted in Figure 6 can be interpreted as the probability density function (PDF) of the original data, whereas Figure 7c,d represent the PDF after undergoing low-frequency suppression transformations. It is evident that the contrast has been enhanced, indicating a higher degree of preservation of information. By using this method for optimal parameter selection, we finally obtain the bag-of-words model of t and k corresponding to σ. Thus, based on the results of the sampled bag-of-words model, the optimal k and t parameters corresponding to a given σ can be directly identified and interpolated. For tasks demanding a higher precision, a more refined selection of the optimal values of t and k can be conducted within a smaller interval around the target point through the low-frequency suppression algorithm.
Conventional processing methods encompass linear transformations, power transformations, and histogram equalization. However, histogram equalization introduces an excessive number of parameters, and its inverse transformation results in substantial information loss, rendering it impractical. This study compares traditional linearization, power transformation, and the proposed algorithms. Experimental trials were conducted using images from Sentinel-1 and QiLu-1, and the preprocessing outcomes of the three methods are depicted in Figure 8 and Figure 9. The recommended algorithm, in comparison to the other two, notably enhances the preservation of image information.
We employed the PSNR metric for an objective quantitative assessment, and the results of the PSNR values of the three processing methods are presented in Table 1. The proposed algorithm consistently achieves superior PSNR results for both low-resolution and high-resolution images, outperforming traditional algorithms by 3–22 dB. This demonstrates the reduced quantization loss and enhanced information preservation in data processing.

4.3. Experimental Results and Analysis of the Quality-Map-Guided Image Compression Model

We conducted experiments on a selected subset of Sentinel-1 images, applying our low-frequency suppression algorithm followed by quality map extraction and experimentation with the compression model. In the actual perception map extraction process, a pre-trained ViT model was employed to extract the multi-head attention feature maps, and the resulting quality map was obtained by applying weighted averaging to the 12 sub-maps. Figure 10 illustrates the feature maps of both the traditional methods and the low-frequency suppression algorithm. The experimental results demonstrate the efficacy of this approach in preserving and integrating image features, consequently enhancing the overall image quality.
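A sketch of this aggregation step is given below. Extracting the attention tensor itself depends on the particular ViT implementation and is assumed available; the sketch only shows the averaging, upsampling, and normalization that produce the quality map.

```python
import torch
import torch.nn.functional as F

def build_quality_map(attn_maps, out_hw):
    """Average n multi-head attention sub-maps ([n, h, w]) from a pre-trained ViT
    and upsample to the input resolution, yielding a [1, 1, H, W] quality map."""
    q = attn_maps.mean(dim=0, keepdim=True).unsqueeze(0)          # [1, 1, h, w]
    q = F.interpolate(q, size=out_hw, mode="bilinear", align_corners=False)
    q = (q - q.min()) / (q.max() - q.min() + 1e-8)                # normalize to [0, 1]
    return q

# Example with the 12 attention heads mentioned above
attn = torch.rand(12, 16, 16)              # stand-in for extracted attention sub-maps
quality_map = build_quality_map(attn, (256, 256))
```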
In our subsequent experiments, we employed a hyperprior model as our baseline network and integrated the quality map with the input data to perform spatial feature fusion. Throughout the training phase, we utilized the PyTorch framework and trained the model end-to-end using the Adam optimizer. The batch size was set to 8, with an initial learning rate of 0.0001. Training samples were randomly cropped to a size of 256 × 256 pixels. All experiments were conducted on an NVIDIA GeForce RTX 3080 Ti GPU, and the training process extended over 100 epochs. To aid in assessing the computational requirements and efficiency of the framework, Table 2 reports the computational complexity of the proposed algorithm.
Figure 11 presents a comparative illustration of the compression reconstruction performance across different methods and scenarios. The second column illustrates more detailed sub-images compared to the first column. The entire model is evaluated using a BPP of 1.2. While the PSNR obtained with the JPEG method is 17.32 dB, our test results demonstrate a higher PSNR of 19.97 dB, indicating a superior image quality. Notably, the second column of detailed sub-images visibly demonstrates this superior performance.
The quantitative results of the experiments, employing the PSNR metric, are presented in Table 3. The findings suggest that at around a BPP of 1.2, the recommended model demonstrates a PSNR performance approximately 3 dB better than that of JPEG. Additionally, the purpose of this algorithm is to investigate a universally applicable quality map guidance network; therefore, the research was conducted on a foundational universal model. The results substantiate the effectiveness of this approach in preserving information and improving the performance of the image compression model.

5. Discussion

This paper introduces an SAR image compression model based on low-frequency suppression and quality map guidance, validated through experiments conducted on Sentinel-1 low-resolution images and QiLu-1 high-resolution images. The primary innovations include the construction of two-stage transformation operators for the input data, aiming to suppress low-frequency input data, achieve an optimal PSNR, and minimize the quantization loss in the data input. Simultaneously, a quality map guidance compression model is developed to guide the allocation of compression bit rates, exploring the redundancy between focused and non-focused targets to attain higher levels of information fidelity.
However, in low-frequency suppression, it is important to note that some simplifications of the loss calculations may introduce theoretical errors. Researchers are encouraged to conduct more in-depth investigations into the loss calculations and formula derivations to establish a more comprehensive and theoretically sound experimental model. Additionally, concerning quality map guidance, this paper primarily implements a more lightweight quality map guidance network on the baseline network. While many existing models increase network complexity, together with time and resource costs, through layer stacking to achieve optimal performance, this paper leverages features extracted by a pre-trained model for data fusion, which mitigates the drawbacks of a complex baseline model. It is worth mentioning that the pre-trained model utilized in this paper is essentially a natural image feature extraction model, resulting in a suboptimal outcome. We hope that researchers will propose more pre-trained models in the SAR field for reference and improvement in the future, and that more in-depth research will be conducted on feature fusion modules.

6. Conclusions

Firstly, we meticulously designed a two-stage transformation operator for input data, enabling effective low-frequency suppression. This approach aims to optimize the PSNR and minimize the quantization loss. It is crucial to emphasize that our proposed algorithm transcends its application solely for compression tasks; it also demonstrates versatility in detection, recognition, and various other applications. This adaptability is achieved through meticulous data preprocessing, ensuring superior information retention in the input data.
Subsequently, we formulated a compression model guided by a quality map, emphasizing the focus on the feature components extracted from the quality map. This guidance aids in the judicious allocation of compression bit rates and delves into the redundancy between focused and non-focused targets. The ultimate objective is to attain a heightened level of information fidelity in compressing images with a specific emphasis on local targets of interest. The experimental results unequivocally affirm the efficacy of this method, demonstrating its proficiency in mitigating losses in the input data in SAR images and improving the overall compression performance of the model.
The paramount significance of this study lies in the adept application of deep learning techniques in SAR image processing. Our novel contributions include the introduction of a two-stage low-frequency suppression algorithm and a quality map guidance image compression model. The efficacy of these contributions is substantiated through rigorous experimental validation.

Author Contributions

J.D. performed all experiments and wrote the paper; L.H. suggested the directions of the experiments; J.D. and L.H. performed the results analysis and edited the manuscript; and J.D. and L.H. reviewed the results and reviewed the manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Youth Innovation Promotion Association No. 2019127, Chinese Academy of Sciences.

Data Availability Statement

The data is unavailable due to privacy or ethical restrictions.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Hahn, J.; Debes, C.; Leigsnering, M.; Zoubir, A.M. Compressive sensing and adaptive direct sampling in hyperspectral imaging. Digit. Signal Process. 2014, 26, 113–126. [Google Scholar] [CrossRef]
  2. Liu, F.; Wu, J.; Li, L.; Jiao, L.; Hao, H.; Zhang, X. A hybrid method of SAR speckle reduction based on geometric-structural block and adaptive neighborhood. IEEE Trans. Geosci. Remote Sens. 2017, 56, 730–748. [Google Scholar] [CrossRef]
  3. Guo, T.; Luo, F.; Zhang, L.; Zhang, B.; Tan, X.; Zhou, X. Learning structurally incoherent background and target dictionaries for hyperspectral target detection. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2020, 13, 3521–3533. [Google Scholar] [CrossRef]
  4. DeGraaf, S.R. SAR imaging via modern 2-D spectral estimation methods. IEEE Trans. Image Process. 1998, 7, 729–761. [Google Scholar] [CrossRef]
  5. Pestel-Schiller, U.; Ostermann, J. Subjective evaluation of compressed SAR images using JPEG and HEVC intra coding: Sometimes, compression improves usability. In Proceedings of the 2018 15th European Radar Conference (EuRAD), Madrid, Spain, 26–28 September 2018; pp. 154–157. [Google Scholar]
  6. Wu, Z.; Hou, B.; Jiao, L. Multiscale CNN with autoencoder regularization joint contextual attention network for SAR image classification. IEEE Trans. Geosci. Remote Sens. 2020, 59, 1200–1213. [Google Scholar] [CrossRef]
  7. Zhou, S.; Deng, C.; Zhao, B.; Xia, Y.; Li, Q.; Chen, Z. Remote sensing image compression: A review. In Proceedings of the 2015 IEEE International Conference on Multimedia Big Data, Beijing, China, 20–22 April 2015; pp. 406–410. [Google Scholar]
  8. Rusyn, B.; Lutsyk, O.; Lysak, Y.; Lukenyuk, A.; Pohreliuk, L. Lossless image compression in the remote sensing applications. In Proceedings of the 2016 IEEE First International Conference on Data Stream Mining & Processing (DSMP), Lviv, Ukraine, 23–27 August 2016; pp. 195–198. [Google Scholar]
  9. Weinberger, M.J.; Seroussi, G.; Sapiro, G. The LOCO-I lossless image compression algorithm: Principles and standardization into JPEG-LS. IEEE Trans. Image Process. 2000, 9, 1309–1324. [Google Scholar] [CrossRef]
  10. Said, A.; Pearlman, W.A. An image multiresolution representation for lossless and lossy compression. IEEE Trans. Image Process. 1996, 5, 1303–1310. [Google Scholar] [CrossRef]
  11. Huffman, D.A. A method for the construction of minimum-redundancy codes. Resonance 2006, 11, 91–99. [Google Scholar] [CrossRef]
  12. Wang, H.; Babacan, S.D.; Sayood, K. Lossless hyperspectral-image compression using context-based conditional average. IEEE Trans. Geosci. Remote Sens. 2007, 45, 4187–4193. [Google Scholar] [CrossRef]
  13. Luo, F.; Zou, Z.; Liu, J.; Lin, Z. Dimensionality reduction and classification of hyperspectral image via multistructure unified discriminative embedding. IEEE Trans. Geosci. Remote Sens. 2021, 60, 1–16. [Google Scholar] [CrossRef]
  14. Henry, C.; Azimi, S.M.; Merkle, N. Road segmentation in SAR satellite images with deep fully convolutional neural networks. IEEE Geosci. Remote Sens. Lett. 2018, 15, 1867–1871. [Google Scholar] [CrossRef]
  15. Pu, W. Shuffle GAN with autoencoder: A deep learning approach to separate moving and stationary targets in SAR imagery. IEEE Trans. Neural Netw. Learn. Syst. 2021, 33, 4770–4784. [Google Scholar] [CrossRef] [PubMed]
  16. Wang, R.; Xu, F.; Pei, J.; Wang, C.; Huang, Y.; Yang, J.; Wu, J. An improved faster R-CNN based on MSER decision criterion for SAR image ship detection in harbor. In Proceedings of the IGARSS 2019 IEEE International Geoscience and Remote Sensing Symposium, Yokohama, Japan, 28 July–2 August 2019; pp. 1322–1325. [Google Scholar]
  17. Wang, P.; Zhang, H.; Patel, V.M. SAR image despeckling using a convolutional neural network. IEEE Signal Process. Lett. 2017, 24, 1763–1767. [Google Scholar] [CrossRef]
  18. Datcu, M.; Schwarz, G.; Schmidt, K.; Reck, C. Quality evaluation of compressed optical and SAR images: JPEG vs. wavelets. In Proceedings of the 1995 International Geoscience and Remote Sensing Symposium, IGARSS’95. Quantitative Remote Sensing for Science and Applications, Firenze, Italy, 10–14 July 1995; Volume 3, pp. 1687–1689. [Google Scholar]
  19. Rabbani, M.; Joshi, R. An overview of the JPEG 2000 still image compression standard. Signal Process. Image Commun. 2002, 17, 3–48. [Google Scholar] [CrossRef]
  20. Zhou, P.; Zhao, B. Novel scheme for SAR image compression based on JPEG2000. In Proceedings of the 2009 9th International Conference on Electronic Measurement & Instruments, Beijing, China, 16–19 August 2009; pp. 4–206. [Google Scholar]
  21. Said, A.; Pearlman, W.A. A new, fast, and efficient image codec based on set partitioning in hierarchical trees. IEEE Trans. Circuits Syst. Video Technol. 1996, 6, 243–250. [Google Scholar] [CrossRef]
22. Kozhemiakin, R.; Abramov, S.; Lukin, V.; Djurović, B.; Djurović, I.; Simeunović, M. Strategies of SAR image lossy compression by JPEG2000 and SPIHT. In Proceedings of the 2017 6th Mediterranean Conference on Embedded Computing (MECO), Bar, Montenegro, 11–15 June 2017; pp. 1–6.
23. Li, J.; Chang, L. A SAR image compression algorithm based on Mallat tower-type wavelet decomposition. Optik 2015, 126, 3982–3986.
24. Zemliachenko, A.N.; Kozhemiakin, R.A.; Abramov, S.K.; Lukin, V.V.; Vozel, B.; Chehdi, K.; Egiazarian, K.O. Prediction of compression ratio for DCT-based coders with application to remote sensing images. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2017, 11, 257–270.
25. Jakka, T.K.; Reddy, Y.M.; Rao, B.P. GWDWT-FCM: Change detection in SAR images using adaptive discrete wavelet transform with fuzzy C-mean clustering. J. Indian Soc. Remote Sens. 2019, 47, 379–390.
26. Dheepa, B.; Nithya, R.; Nishavithri, N.; Vinoth, K.; Balaji, K. Directional lifting wavelet transform based SAR image compression. Int. J. Adv. Trends Comput. Sci. Eng. 2020, 9, 1–8.
27. Du, Q.; Fowler, J.E. Hyperspectral image compression using JPEG2000 and principal component analysis. IEEE Geosci. Remote Sens. Lett. 2007, 4, 201–205.
28. Bai, J.; Liu, B.; Wang, L.; Jiao, L. PolSAR image compression based on online sparse K-SVD dictionary learning. Multimed. Tools Appl. 2017, 76, 24859–24870.
29. Li, J.; Fu, Y.; Li, G.; Liu, Z. Remote sensing image compression in visible/near-infrared range using heterogeneous compressive sensing. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2018, 11, 4932–4938.
30. Ieremeiev, O.; Lukin, V.; Okarma, K.; Egiazarian, K. Full-reference quality metric based on neural network to assess the visual quality of remote sensing images. Remote Sens. 2020, 12, 2349.
31. Makarichev, V.; Vasilyeva, I.; Lukin, V.; Vozel, B.; Shelestov, A.; Kussul, N. Discrete atomic transform-based lossy compression of three-channel remote sensing images with quality control. Remote Sens. 2021, 14, 125.
32. Ballé, J.; Laparra, V.; Simoncelli, E.P. End-to-end optimized image compression. arXiv 2016, arXiv:1611.01704.
33. Ballé, J.; Minnen, D.; Singh, S.; Hwang, S.J.; Johnston, N. Variational image compression with a scale hyperprior. arXiv 2018, arXiv:1802.01436.
34. Minnen, D.; Ballé, J.; Toderici, G.D. Joint autoregressive and hierarchical priors for learned image compression. Adv. Neural Inf. Process. Syst. 2018, 31, 1–10.
35. Li, J.; Liu, Z. Multispectral transforms using convolution neural networks for remote sensing multispectral image compression. Remote Sens. 2019, 11, 759.
36. Xu, Q.; Xiang, Y.; Di, Z.; Fan, Y.; Feng, Q.; Wu, Q.; Shi, J. Synthetic aperture radar image compression based on a variational autoencoder. IEEE Geosci. Remote Sens. Lett. 2021, 19, 1–5.
37. Zhang, L.; Pan, T.; Huang, Y.; Qu, L.; Liu, Y. SAR image compression using discretized Gaussian adaptive model and generalized subtractive normalization. IEEE Geosci. Remote Sens. Lett. 2022, 19, 1–5.
38. Di, Z.; Chen, X.; Wu, Q.; Shi, J.; Feng, Q.; Fan, Y. Learned compression framework with pyramidal features and quality enhancement for SAR images. IEEE Geosci. Remote Sens. Lett. 2022, 19, 1–5.
39. Fu, C.; Du, B.; Zhang, L. SAR image compression based on multi-resblock and global context. IEEE Geosci. Remote Sens. Lett. 2023, 20, 1–5.
40. Fu, C.; Du, B. Remote sensing image compression based on the multiple prior information. Remote Sens. 2023, 15, 2211.
41. Ross, T.D.; Worrell, S.W.; Velten, V.J.; Mossing, J.C.; Bryant, M.L. Standard SAR ATR evaluation experiments using the MSTAR public release data set. In Proceedings of the Algorithms for Synthetic Aperture Radar Imagery V, Orlando, FL, USA, 14–17 April 1998; Volume 3370, pp. 566–573.
42. Sun, Y.; Li, L.; Ding, Y.; Bai, J.; Xin, X. Image compression algorithm based on variational autoencoder. J. Phys. Conf. Ser. 2021, 2066, 012008.
43. Wu, C.P.; Kuo, C.C.J. Efficient multimedia encryption via entropy codec design. In Proceedings of the Security and Watermarking of Multimedia Contents III, San Jose, CA, USA, 22–25 January 2001; Volume 4314, pp. 128–138.
44. Murphy, K.P. Machine Learning: A Probabilistic Perspective; MIT Press: Cambridge, MA, USA, 2012.
45. McLachlan, G.J.; Rathnayake, S. On the number of components in a Gaussian mixture model. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 2014, 4, 341–355.
46. Ortega, A.; Ramchandran, K. Rate-distortion methods for image and video compression. IEEE Signal Process. Mag. 1998, 15, 23–50.
47. Berger, T. Rate-distortion theory. In Wiley Encyclopedia of Telecommunications; Wiley: Hoboken, NJ, USA, 2003.
48. Sara, U.; Akter, M.; Uddin, M.S. Image quality assessment through FSIM, SSIM, MSE and PSNR—A comparative study. J. Comput. Commun. 2019, 7, 8–18.
49. Hore, A.; Ziou, D. Image quality metrics: PSNR vs. SSIM. In Proceedings of the 2010 20th International Conference on Pattern Recognition, Istanbul, Turkey, 23–26 August 2010; pp. 2366–2369.
50. Drusch, M.; Del Bello, U.; Carlier, S.; Colin, O.; Fernandez, V.; Gascon, F.; Hoersch, B.; Isola, C.; Laberinti, P.; Martimort, P.; et al. Sentinel-2: ESA's optical high-resolution mission for GMES operational services. Remote Sens. Environ. 2012, 120, 25–36.
51. Zhou, G.; Liu, M.; Xu, Z.; Wang, M.; Zhang, B.; Wu, Y. Azimuth ambiguities suppression using group sparsity and nonconvex regularization for sliding spotlight mode: Results on QILU-1 SAR data. In Proceedings of the IGARSS 2022—2022 IEEE International Geoscience and Remote Sensing Symposium, Kuala Lumpur, Malaysia, 17–22 July 2022; pp. 1660–1663.
52. Geng, Z.; Xu, Y.; Wang, B.N.; Yu, X.; Zhu, D.Y.; Zhang, G. Target recognition in SAR images by deep learning with training data augmentation. Sensors 2023, 23, 941.
53. Han, K.; Wang, Y.; Chen, H.; Chen, X.; Guo, J.; Liu, Z.; Tang, Y.; Xiao, A.; Xu, C.; Xu, Y.; et al. A survey on vision transformer. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 45, 87–110.
54. Liu, S.; Huang, D.; Wang, Y. Learning spatial fusion for single-shot object detection. arXiv 2019, arXiv:1911.09516.
55. Song, M.; Choi, J.; Han, B. Variable-rate deep image compression through spatially-adaptive feature transform. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 11–17 October 2021; pp. 2380–2389.
Figure 1. The framework of the hyperprior compression algorithm.
Figure 2. The framework of the designed compression algorithm.
Figure 3. The framework of the low-frequency suppression algorithm model.
Figure 4. The network of the quality-map-guided image compression model.
Figure 5. SFT feature fusion module.
Figure 6. Fitting curves. (a) Sentinel-1 image with σ = 6.42; (b) QiLu-1 image with σ = 0.32.
Figure 7. Loss surface chart results and PDF of Figure 6 after low-frequency suppression transformation. (a) Loss surface chart results of Sentinel-1; (b) loss surface chart results of QiLu-1; (c) PDF of Sentinel-1; (d) PDF of QiLu-1.
Figure 8. Preprocessing result of Sentinel-1. (a,d) Traditional linear method; (b,e) traditional power method; (c,f) proposed method.
Figure 9. Preprocessing result of QiLu-1. (a,d) Traditional linear method; (b,e) traditional power method; (c,f) proposed method.
Figure 10. Attention feature maps. (a–d) Attention features of raw data, the traditional linear method, the traditional power method, and the proposed method.
Figure 11. Visual display of compression model results. (a,d) Ground truth and subgraphs; (b,e) JPEG and subgraphs; (c,f) proposed method and subgraphs.
Table 1. PSNR of the traditional linear method, traditional power method, and proposed method.

Algorithm                   Sentinel-1 PSNR (dB)   QiLu-1 PSNR (dB)
Traditional linear method   42.93                  68.32
Traditional power method    58.92                  87.46
Proposed method             65.38                  90.53
Table 2. Model complexity of the proposed algorithm.

Algorithm         Parameters (M)   FLOPs (G)
Proposed method   5.06             133.28
Table 3. Compression performance comparison.

Algorithm         Sentinel-1 PSNR (dB)
JPEG              18.58
Proposed method   21.51
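Tables 1 and 3 report quality in terms of PSNR in decibels. For reference, the sketch below shows the standard PSNR computation for 8-bit imagery as commonly defined in the image-quality literature [48,49]; it is a minimal illustration, and the function name, variable names, and the default peak value of 255 are assumptions for this example rather than details taken from the paper.

```python
import numpy as np

def psnr(reference: np.ndarray, distorted: np.ndarray, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio (dB) between two images of equal shape."""
    # Mean squared error in double precision to avoid integer overflow.
    mse = np.mean((reference.astype(np.float64) - distorted.astype(np.float64)) ** 2)
    if mse == 0.0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak * peak / mse)

# Illustrative usage on synthetic data (not the paper's datasets).
rng = np.random.default_rng(0)
x = rng.integers(0, 256, size=(256, 256), dtype=np.uint8)
noisy = np.clip(x.astype(np.float64) + rng.normal(0.0, 5.0, x.shape), 0, 255).astype(np.uint8)
print(f"PSNR: {psnr(x, noisy):.2f} dB")
```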