Technical Note

Efficient Aero-Optical Degraded Image Restoration via Adaptive Frequency Selection

Yingjiao Huang, Qingpeng Zhang, Xiafei Ma and Haotong Ma
1 National Key Laboratory of Optical Field Manipulation Science and Technology, Chinese Academy of Sciences, Chengdu 610209, China
2 Key Laboratory of Optical Engineering, Chinese Academy of Sciences, Chengdu 610209, China
3 The Institute of Optics and Electronics, Chinese Academy of Sciences, Chengdu 610209, China
4 University of Chinese Academy of Sciences, Beijing 100039, China
* Author to whom correspondence should be addressed.
Remote Sens. 2025, 17(7), 1122; https://doi.org/10.3390/rs17071122
Submission received: 15 January 2025 / Revised: 5 March 2025 / Accepted: 13 March 2025 / Published: 21 March 2025

Abstract

During high-speed flight, an aircraft rapidly compresses the surrounding air, creating a complex turbulent flow field. This high-speed flow field interferes with the optical transmission of imaging systems, resulting in high-frequency random displacement, blurring, intensity attenuation, or saturation of the target scene. Aero-optical effects therefore severely degrade imaging quality and target recognition capabilities. Based on the spectral characteristics of aero-optical degraded images and a deep learning approach, this paper proposes an adaptive frequency selection network (AFS-NET) for correction. To learn multi-scale and accurate features, we develop cascaded global and local attention mechanism modules to capture long-distance dependencies and extensive contextual information. To fully exploit the frequency components, an adaptive frequency separation and fusion strategy is proposed to guide the image restoration. By integrating both spatial and frequency domain processing and learning the residual representation between the observed data and the underlying ideal data, the proposed method restores aero-optical degraded images and significantly improves the quality and efficiency of image reconstruction.

1. Introduction

With the development of aviation technology, aircraft are continuously surpassing the previous altitude and speed limits. During high-speed flight, the optical window is subjected to notable aero-optical effects due to spatial variations in the airflow density field near the aircraft. The inhomogeneity in air density induces high-frequency, random fluctuations in the local refractive index, resulting in distortion or optical aberration as the target scene traverses the fluctuating medium [1,2,3]. Consequently, aero-optical effects can adversely affect high-speed imaging performance and, in extreme cases, cause a complete loss of imaging and detection system functionality.
In recent years, substantial progress has been made in the accurate quantification and mitigation of aero-optical effects, largely fueled by significant advances in numerical simulation and imaging technologies. As detailed in [4,5], aero-optical effects can be addressed through three primary methods: (1) turbulent flow control; (2) adaptive optics-electronic correction; (3) digital image restoration. Firstly, turbulent flow control systems often struggle to keep up with the rapid variations in the aerodynamic flow field, resulting in unstable or delayed control; consequently, the system cannot obtain clear images in real time once the imaging becomes blurred or severely distorted. Secondly, the quality of detection and imaging during high-speed flight usually fails to reach the desired levels, owing to the complexity of the control technology and the prohibitively high cost of adaptive optics (AO). In contrast, digital image restoration is both component-free and economically feasible, providing an effective approach for correcting aero-optical degradation. Such methods can generally be divided into two categories: optimization-based methods and learning-based methods. Some optimization methods restore the degraded image based on constraints on the Point Spread Function (PSF) through iterative refinement between the spatial and frequency domains, for example, blind iterative restoration [6], Wiener filtering [7], and the Richardson-Lucy algorithm [8]. However, these earlier deblurring algorithms are sensitive to noise and to fluctuations in the PSF, and noise amplification and ringing effects may occur during restoration; moreover, effective estimation of the PSF is difficult to ensure in high-speed dynamic imaging scenarios. Other optimization methods therefore leverage image prior constraints, such as the dark channel [9] and the gradient distribution [10], to construct optimization models for restoring degraded aero-optical images. These methods avoid explicit estimation of the PSF but rely heavily on hand-crafted features and parameter selection (e.g., sparse l_p gradient regularization weights). Extensive experiments have demonstrated that these methods can effectively enhance degraded image quality without significant noise amplification, but sharp edges in the restored image cannot be maintained and the ringing effect remains a problem [11,12]. In addition, optimization-based methods are too time-consuming for high-speed dynamic imaging.
In view of the powerful representation ability of deep learning and the availability of large image datasets, an increasing number of attempts have been made to address turbulence degradation with network models. In 2019, Su et al. [13] proposed a single-frame recovery algorithm based on a generative adversarial network (GAN) to deal with image deformation and blurring caused by the aero-transmission effect and the aero-heat radiation effect. In 2022, Gao et al. [14] adopted a mean filter to remove additive noise from the image and then employed a GAN based on MobileNet-v2 to restore images blurred by aero-optical effects. These GANs adopt conventional Convolutional Neural Networks (CNNs) to generate image details. However, Li [15] has indicated that using CNNs to counteract turbulence degradation leads to over-smoothing and loss of detail. Moreover, the lack of physical constraints makes it difficult to obtain reliable results. Ref. [16] used a modified Transformer block as the baseline and proposed that increasing the network depth to restore high-quality images provides a possible solution to the problem of aero-optical image restoration. All of these approaches operate purely in the spatial domain and fail to consider the variations in the low- and high-frequency components caused by aero-optical effects. In fact, we observe that different aero-optical distortions may affect the image content in different frequency subbands, as illustrated in Figure 1. The images in Figure 1a,b are dominated by low-frequency degradation, whereas larger high-frequency discrepancies arise in Figure 1c,d. This indicates the need to treat each degraded image according to its specific frequency characteristics.
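The residual-spectrum analysis behind Figure 1 is straightforward to reproduce. The short NumPy sketch below (our own illustration, not code from the paper) computes the log-magnitude Fourier spectrum of the difference between a sharp image and its degraded counterpart, which is how such subband discrepancies can be visualized.

```python
import numpy as np

def residual_spectrum(sharp: np.ndarray, degraded: np.ndarray) -> np.ndarray:
    """Log-magnitude Fourier spectrum of the residual between a sharp image
    and its aero-optically degraded counterpart (both HxW, float in [0, 1])."""
    residual = degraded - sharp                        # spatial-domain residual
    spectrum = np.fft.fftshift(np.fft.fft2(residual))  # center the zero frequency
    return np.log1p(np.abs(spectrum))                  # compress dynamic range for display
```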
In this paper, we introduce a deep learning approach inspired by the Fourier transform to mitigate aero-optical effects on degraded images [17]. Drawing on the aero-optical imaging model and deep learning principles, we propose AFS-NET to learn alterations in both the spatial and frequency domains. Specifically, unlike optimization algorithms that require prior information about the corruption present in the input images, our model generates a learnable mask block to separate the low-frequency and high-frequency components according to the residual spectral weight distribution. The most informative frequency information is then selectively extracted and emphasized in the subsequent fusion process. In addition, by analyzing the architectures of advanced methods and extracting their fundamental components, we establish a baseline with lower system complexity. An Efficient Channel Attention (ECA) module with negligible additional cost is also introduced to further enhance model performance [18]. These specifically designed modules allow our method to achieve state-of-the-art aero-optical degraded image restoration performance while maintaining computational efficiency. The main contributions include:
(1) We propose the Composite Channel Learning Block (CCL-Block) as the baseline to minimize information loss during feature mapping in the network transfer process; it employs cascaded global and local attention mechanism modules to enhance the network's focus on multi-scale features within the channels.
(2) We design a Frequency Separation and Fusion Block (FSF-Block) by analyzing the residual spectral characteristics of degraded and clear images. This block adaptively adjusts the feature weights of degraded images across different frequency subbands, which enhances the high-frequency restoration performance on aero-optical degraded images.
(3) Experiments show that the proposed method outperforms the comparison algorithms on the simulated dataset while also shortening the network running time.

2. Materials and Methods

2.1. Degradation Model

In order to quantify aero-optical effects, one needs to use the ray-tracing method to evaluate the aerodynamic fluctuations around the aircraft that induce time-varying wavefront aberrations in the exit pupil of the optical system. The computed Optical Path Length (OPL) directly reflects the air's unsteady density field in the flow, as the air's refractive index n is directly influenced by the fluctuating density according to the Gladstone-Dale relationship [19],
OPL = \sum_{i} n_i z_i = \sum_{i} \left[ 1 + K_{GD}(\lambda)\, \rho_i \right] z_i
where n_i, z_i, and ρ_i represent the index of refraction, the propagation distance, and the density at the i-th step along the ray, respectively.
The Gladstone-Dale coefficient K_{GD}(λ) depends only on the wavelength λ and can be approximated by:
K_{GD}(\lambda) = 2.22 \times 10^{-4} \times \left( 1 + 6.71 \times 10^{-8} / \lambda^{2} \right)
Indeed, the variation in OPL, referred to as the Optical Path Difference (OPD), is more significant for assessing the intensity of aero-optical effects. The OPD is given by [19]:
OPD_k = OPL_k - \overline{OPL}
where the overline represents the spatially averaged component, and the corresponding aberration is defined as φ = 2π × OPD/λ. To investigate the impact of the high-speed incoming flow on the performance of a hemispherical optical cover and its internal imaging optical window, COMSOL Multiphysics 6.2 is used to build the simulation model, as shown in Figure 2. The model includes a hemispherical optical cover mounted on the underside of the aircraft, with a 100 mm diameter imaging optical window installed inside. The external flow conditions are set to a subsonic flow at Mach 0.8 and 0° angle of attack. To minimize the influence of the computational boundaries on the flow field, the computational domain is designed as a 1 m × 1 m × 1 m rectangular region consisting of 400,000 mesh elements. The surface temperature distribution and density distribution on the optical window are then obtained through simulation to calculate the wavefront distribution. The wavefront aberration is expressed as a series of orthogonal Zernike polynomials to facilitate the subsequent simulation of the imaging system [20,21,22]:
\varphi(x, y) = \sum_{k} a_k \cdot z_k(x, y)
where a_k is the weighting coefficient corresponding to the Zernike polynomial z_k. The point spread function (PSF) caused by the wavefront aberration is then computed by:
PSF(x, y) = \left| \mathcal{F}\left\{ P(x, y) \exp\left[ i \varphi(x, y) \right] \right\} \right|^{2}
where \mathcal{F}\{\cdot\} represents the Fourier transform and P(x, y) is the optical pupil function of the system. Thus, the degradation model can be mathematically formulated as:
g(x, y) = f(x, y) \ast PSF(x, y) + n(x, y)
where g(x, y) is the degraded image, f(x, y) is the sharp image, \ast denotes the convolution operation, and n(x, y) is the additive noise. We aim to recover the sharp image f(x, y) from the degraded image g(x, y).
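To make the degradation model concrete, the following sketch chains the pupil, PSF, and observation equations above in NumPy. It is a minimal illustration under simplifying assumptions (a circular pupil, a precomputed phase map φ such as a Zernike sum, FFT-based circular convolution, and Gaussian noise); it is not the authors' simulation pipeline.

```python
import numpy as np

def degrade(sharp: np.ndarray, phase: np.ndarray, noise_std: float = 0.01) -> np.ndarray:
    """Apply the aero-optical degradation model g = f * PSF + n.

    sharp : HxW image in [0, 1].
    phase : HxW wavefront aberration phi(x, y) in radians (e.g., a Zernike sum).
    """
    h, w = sharp.shape
    yy, xx = np.mgrid[-h // 2:h // 2, -w // 2:w // 2]
    pupil = (np.hypot(xx / (w / 2), yy / (h / 2)) <= 1.0).astype(float)  # pupil P(x, y)

    # PSF(x, y) = |F{ P(x, y) exp(i phi(x, y)) }|^2, normalized to unit energy
    field = pupil * np.exp(1j * phase)
    psf = np.abs(np.fft.fftshift(np.fft.fft2(field))) ** 2
    psf /= psf.sum()

    # g(x, y) = f(x, y) * PSF(x, y) + n(x, y), convolution done in the Fourier domain
    blurred = np.real(np.fft.ifft2(np.fft.fft2(sharp) * np.fft.fft2(np.fft.ifftshift(psf))))
    return blurred + np.random.normal(0.0, noise_std, sharp.shape)
```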

2.2. Restoration Model

Here, we propose AFS-NET to restore aero-optical degraded images; its architecture is illustrated in Figure 3. AFS-NET consists of two key sub-blocks, the Composite Channel Learning Block (CCL-Block) and the Frequency Separation and Fusion Block (FSF-Block): (1) A basic CCL-Block consists of two residual modules, integrated with Global channel attention (GCA) and Focal channel attention (FCA), respectively. Local residual learning allows less important information, such as low-frequency content, to be bypassed through multiple local residual connections, letting the learning process focus on more efficient feature propagation. Furthermore, the propagated features are refined through channel attention mechanisms, enhancing the CCL-Block's ability to capture essential details. (2) A novel FSF-Block integrates context modeling, feature transformation, and feature fusion. Context modeling is first applied to separately extract the high- and low-frequency information from the image. Feature transformation then processes these frequency components independently, reducing noise interference while preserving fine details. Feature fusion is finally employed to gradually reconstruct the image by integrating the refined features. By treating different frequency components and feature representations unequally, the FSF-Block provides additional flexibility in dealing with complex types of aero-optical degradation, expanding the representational ability of CNNs.
Figure 2. (a) Simulated aerodynamic flow field model. (b) Density field distribution near the optical window. (c) Aero-optical degraded wavefront. (d) Original image. (e) Aero-optical degraded PSF. (f) Degraded image. * denotes convolution operation.
Main Backbone. AFS-NET employs a simple U-Net architecture as the backbone, with the CCL-Block as the building block. As shown in Figure 3, we allocate more building blocks to the lower encoder stage-4 for efficiency, i.e., from encoder to decoder, the number of CCL-Blocks is set to [1, 1, 1, 10] and [1, 1, 1, 2], respectively, and the latent level contains one CCL-Block. A single CCL-Block incurs a small computational overhead of 0.030 M parameters and 1.69 G MACs (Params and MACs are evaluated with the stage-1 input tensor). We then insert an FSF-Block after the CCL-Block to progressively reconstruct the high-resolution clean output from latent stage-5 to decoder stage-2, i.e., AFS-NET uses a single FSF-Block in the latent level and the number of FSF-Blocks in each sub-decoder is [1, 1, 1, 0]. A single FSF-Block introduces a modest computational overhead of 0.234 M parameters and 14.72 G MACs (evaluated with a stage-2 input tensor). Depending on the task complexity, we adjust the default network width by setting the number of channels C to 64 for deblurring.
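For reference, per-block Params/MACs figures of this kind are typically obtained with a profiling utility; the sketch below shows one way to do this with PyTorch and the thop package, where CCLBlock is a placeholder module name (not the authors' code) and the 256 × 256 input mirrors the stage-1 tensor mentioned above.

```python
import torch
from thop import profile  # pip install thop

def count_cost(block: torch.nn.Module, channels: int, size: int = 256):
    """Report parameters (M) and MACs (G) of a block for one input tensor."""
    x = torch.randn(1, channels, size, size)
    macs, params = profile(block, inputs=(x,), verbose=False)
    return params / 1e6, macs / 1e9

# Example (hypothetical): a stage-1 CCL-Block with C = 64 channels
# p, m = count_cost(CCLBlock(64), channels=64)
# print(f"{p:.3f} M params, {m:.2f} G MACs")
```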

2.2.1. CCL-Block Introduction

Recently, the channel attention mechanism has exhibited substantial potential for improving the performance of deep learning networks [23,24]. Motivated by this, we propose the CCL-Block, which incorporates two residual modules combined with GCA and FCA, respectively. GCA is responsible for processing the global channel information, while FCA focuses on modeling local inter-channel dependencies. When an intermediate feature X is input, the whole process is formulated as:
X' = GCA(X) + X, \qquad X'' = FCA(X') + X'
Global channel attention (GCA) unit: This unit is introduced by NAFNET [13], and its details are shown in Figure 3a. Given the input X ∈ R^{H×W×C}, GCA is computed as:
X' = f^{dwc}_{3 \times 3}\left( f^{c}_{1 \times 1}\left( \mathrm{LN}(X) \right) \right), \quad [X_1, X_2] = X', \quad GCA(X) = f^{c}_{1 \times 1}\left( SCA(X_1 \odot X_2) \right)
where f^{dwc}_{3×3}(·) denotes the learnable depth-wise convolution, f^{c}_{1×1}(·) represents a linear projection, and LN(·) is the layer-norm layer. Both X_1 and X_2 lie in R^{H×W×C/2}, where C denotes the hidden dimension in GCA, and ⊙ denotes element-wise multiplication. SCA denotes the simplified channel attention, which is introduced into GCA to capture global information. It extracts the global features of each channel through global average pooling and then uses a 1 × 1 convolution to directly transform the channel features, thus suppressing less useful features and only allowing more informative ones to pass further; however, it does not capture dependencies between channels. Formally, SCA is defined as:
SCA(X) = X \cdot f^{c}_{1 \times 1}\left( \mathrm{GAP}(X) \right)
where GAP denotes the adaptive average pooling operation, which extracts a scale factor for each channel, and · represents the channel-wise product operation.
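A minimal PyTorch rendering of SCA as defined above might look as follows; it is written directly from the formula rather than taken from NAFNET or the authors' implementation.

```python
import torch
import torch.nn as nn

class SCA(nn.Module):
    """Simplified channel attention: rescale each channel by a factor
    predicted from its globally averaged response."""
    def __init__(self, channels: int):
        super().__init__()
        self.gap = nn.AdaptiveAvgPool2d(1)               # GAP: (B, C, H, W) -> (B, C, 1, 1)
        self.proj = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * self.proj(self.gap(x))                # channel-wise product
```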
Focal cross-channel attention (FCA) unit: Previous studies [25,26] usually introduce single-scale depth-wise convolutions into the regular feed-forward network (FFN) to improve locality. However, the redundant information in channels hinders feature expression competence (an ablation study is provided in the experiments section). The motivation for this part is to build a more efficient and effective FFN mechanism, so a novel FCA module is proposed. As illustrated in Figure 3a, FCA first performs a convolution operation, followed by element-wise multiplication. Second, the channel-wise attention vector is obtained by using the Efficient Channel Attention (ECA) module to extract the attention of the feature map at different scales. Finally, another convolution operation is applied to refine the features. The overall process of FCA is summarized as:
X' = f^{c}_{1 \times 1}\left( \mathrm{LN}(X) \right), \quad [X_3, X_4] = X', \quad FCA(X) = f^{c}_{1 \times 1}\left( ECA(X_3 \odot X_4) \right)
Note that ECA guarantees both efficiency and effectiveness by appropriately capturing focal cross-channel interactions. The calculation is presented as follows:
ECA(X) = X \cdot \sigma\left( f^{k}_{1 \times 1}\left( \mathrm{GAP}(X) \right) \right)
where σ(·) denotes the sigmoid function and f^{k}_{1×1}(·) indicates a 1-D convolution involving only k parameters. The kernel size k represents the extent of local cross-channel interaction, i.e., how many neighbors participate in the attention prediction of one channel. The parameter k is calculated as follows:
t = \mathrm{int}\left( \left| \frac{\log_2(C) + b}{\gamma} \right| \right), \qquad k = \begin{cases} t, & \text{if } t \text{ is odd} \\ t + 1, & \text{otherwise} \end{cases}
where C is the number of input channels, and γ and b are empirically set to 2 and 1 as in [15], respectively. Finally, our FCA can capture more precise channel information and mitigate the redundancy of channel features in the FFN layers. Moreover, the proposed AFS-NET inherits the advantages of the CCL-Block, and thus has strong multi-scale representation capabilities and can adaptively recalibrate the cross-dimension channel-wise weights.
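The ECA unit and its adaptive kernel size can be sketched as follows. This mirrors the publicly described ECA-Net formulation (a 1-D convolution over the channel descriptor with an odd kernel size derived from C), with γ = 2 and b = 1 as stated above; it is an illustrative sketch, not the authors' exact module.

```python
import math
import torch
import torch.nn as nn

class ECA(nn.Module):
    """Efficient channel attention with an adaptively sized 1-D convolution."""
    def __init__(self, channels: int, gamma: int = 2, b: int = 1):
        super().__init__()
        t = int(abs((math.log2(channels) + b) / gamma))
        k = t if t % 2 else t + 1                      # force an odd kernel size
        self.conv = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)
        self.gap = nn.AdaptiveAvgPool2d(1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = self.gap(x)                                # (B, C, 1, 1) channel descriptor
        w = self.conv(w.squeeze(-1).transpose(1, 2))   # 1-D conv across channels
        w = torch.sigmoid(w.transpose(1, 2).unsqueeze(-1))
        return x * w                                   # rescale each channel
```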

2.2.2. FSF-Block Introduction

Since complex aero-optical degradations affect image content across diverse frequency subbands (as shown in Figure 1), we specifically design the FSF-Block to extract both low- and high-frequency components from the input features and then modulate them to emphasize the corresponding subbands for each aero-optical degraded image. As shown in Figure 3c, given as input both the degraded image I ∈ R^{H×W×1} and the CCL-Block output features X ∈ R^{H×W×C}, the FSF-Block adaptively segregates the degradation content from the input image content in the frequency domain, and subsequently assists in refining the intermediate features X in the spatial domain for effective image restoration. The FSF-Block mainly performs frequency processing, which is realized in three parts: context modeling, feature transformation, and feature fusion.
Context modeling: For the adaptive extraction of different frequency bands from the degraded image I ∈ R^{H×W×1}, a Mask Generation Block (MGB) is used to generate a mask M that serves as the boundary for frequency separation. This boundary is adjusted according to the frequency distribution of each degraded image I. Notably, a 3 × 3 convolutional layer is applied to the degraded input image in the FSF-Block, aiming to align it with the feature channel dimensions of each CCL-Block stage. As shown in Figure 4a, the resulting feature map is processed by GAP, two 1 × 1 convolution layers, and a GELU activation function, producing factors ranging from 0 to 1 that define the width and height of the mask. The process is formally expressed as follows:
[\alpha, \beta] = \sigma\left( f^{c}_{1 \times 1}\left( \mathrm{GELU}\left( f^{c}_{1 \times 1}\left( \mathrm{GAP}(X) \right) \right) \right) \right)
where the reduction ratios of the two f^{c}_{1×1}(·) layers are r_1 and C/(2 r_1), respectively, progressively down-sampling the channel dimension to 2. Subsequently, the binary mask M_l ∈ {0, 1}^{H×W} is generated by checking whether the normalized distance is less than or equal to 1:
M_l(x, y) = \begin{cases} 1, & \text{if } D(x, y) \le 1 \\ 0, & \text{otherwise} \end{cases}
where the normalized distance from each pixel to the center is D(H, W) = \left( \frac{H - H_{\mathrm{center}}}{k_h} \right)^2 + \left( \frac{W - W_{\mathrm{center}}}{k_w} \right)^2, the coordinates of the center point are [H_{\mathrm{center}}, W_{\mathrm{center}}] = [H/2, W/2], and [k_h, k_w] = [\alpha H / n, \beta W / n], where n is set to 128, as the curve junction in Figure 1 is relatively minor. Accordingly, the mask for the high frequencies M_h can be obtained as M_h = 1 - M_l. Next, we obtain the adaptively decoupled features by applying the learned masks to the spectra via element-wise multiplication and then using the inverse Fourier transform.
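A possible implementation of the elliptical low-/high-frequency masks is sketched below. The per-image factors α and β are assumed to come from the MGB, and the semi-axis scaling αH/n, βW/n follows our reading of the formula above; variable names are ours, not the authors'.

```python
import torch

def elliptical_masks(alpha, beta, height: int, width: int, n: int = 128):
    """Build low-/high-frequency masks for a centered Fourier spectrum.

    alpha, beta : per-image scalars in (0, 1) predicted by the mask generator.
    Returns (M_low, M_high), each of shape (H, W), with M_high = 1 - M_low.
    """
    k_h = alpha * height / n + 1e-6                  # semi-axes of the ellipse (assumed scaling)
    k_w = beta * width / n + 1e-6
    yy = torch.arange(height).view(-1, 1) - height / 2
    xx = torch.arange(width).view(1, -1) - width / 2
    dist = (yy / k_h) ** 2 + (xx / k_w) ** 2         # normalized elliptical distance D(x, y)
    m_low = (dist <= 1.0).float()                    # 1 inside the ellipse (low frequencies)
    return m_low, 1.0 - m_low
```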
Feature transform: To capture the inter-channel dependencies, we pass the frequency-decoupled descriptors through multi-Dconv head transposed cross attention, resulting in new attention features (as displayed in Figure 4c). The feature transform process is defined as:
X^{*} = \mathrm{SOFTMAX}\left( Q K^{T} / \gamma \right) V
where the query, key, and value tensors (denoted as Q, K, and V, respectively) are generated as Q = f^{c}_{3×3}( f^{c}_{1×1}( \mathrm{IFFT}( \mathrm{FFT}(I) \odot M_{*} ) ) ), K = f^{c}_{3×3}( f^{c}_{1×1}(X) ), and V = f^{c}_{3×3}( f^{c}_{1×1}(X) ), with * ∈ {l, h} indexing the low-/high-frequency branch and γ a learnable scale factor. Aero-optical blur essentially attenuates or even eliminates high-frequency information (such as edges and sharp transitions). To address this, we leverage the low-frequency information X_l ∈ R^{H×W×C} to enrich the high-frequency features X_h, which lets AFS-NET pay more attention to effective information such as high-frequency textures and details via an ultra-lightweight unit (L-H).
\hat{X}_h = X_h \odot A_{l \to h}
where A_{l \to h} = \sigma\left( f^{c}_{1 \times 1}\left( \mathrm{GELU}\left( f^{c}_{1 \times 1}\left( \mathrm{GAP}(X_l) \right) \right) \right) \right), with A_{l \to h} ∈ R^{H×W×C}.
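The L-H converter admits a very compact implementation; the sketch below gates the high-frequency branch with channel statistics drawn from the low-frequency branch, following the formula above. The channel-reduction ratio is our assumption, and the code is illustrative rather than the authors' module.

```python
import torch
import torch.nn as nn

class LowToHigh(nn.Module):
    """Ultra-lightweight unit: enrich high-frequency features X_h with an
    attention map derived from the low-frequency branch X_l."""
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.net = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                       # GAP over the low-frequency branch
            nn.Conv2d(channels, channels // reduction, 1),
            nn.GELU(),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x_low: torch.Tensor, x_high: torch.Tensor) -> torch.Tensor:
        return x_high * self.net(x_low)                    # broadcast channel attention
```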
Feature fusion: The low-frequency features X_l and the reconstructed high-frequency features \hat{X}_h are combined and processed via a 1 × 1 convolution. Cross attention is then used to aggregate the contextual features X_m into each position of the intermediate features X, following the attention process defined above, where Q is produced from X, while K and V are generated from X_m.
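For completeness, a channel-wise ("transposed") cross attention of the form SOFTMAX(QK^T/γ)V, with queries from one feature map and keys/values from another, can be sketched as follows; it follows the common multi-Dconv-head formulation and is not the authors' exact block (the number of heads and the projection layout are assumptions).

```python
import torch
import torch.nn as nn

class ChannelCrossAttention(nn.Module):
    """Cross attention over channels: SOFTMAX(Q K^T / gamma) V, with Q, K, V
    flattened to (heads, C/heads, H*W) so the attention map is C' x C'."""
    def __init__(self, channels: int, heads: int = 1):
        super().__init__()
        self.heads = heads
        self.gamma = nn.Parameter(torch.ones(1))           # learnable scale factor
        def proj():
            return nn.Sequential(
                nn.Conv2d(channels, channels, 1),
                nn.Conv2d(channels, channels, 3, padding=1, groups=channels))  # depth-wise 3x3
        self.to_q, self.to_k, self.to_v = proj(), proj(), proj()
        self.out = nn.Conv2d(channels, channels, 1)

    def forward(self, query_src: torch.Tensor, context: torch.Tensor) -> torch.Tensor:
        b, c, h, w = context.shape
        q = self.to_q(query_src).reshape(b, self.heads, c // self.heads, h * w)
        k = self.to_k(context).reshape(b, self.heads, c // self.heads, h * w)
        v = self.to_v(context).reshape(b, self.heads, c // self.heads, h * w)
        attn = torch.softmax(q @ k.transpose(-2, -1) / self.gamma, dim=-1)  # (b, heads, c', c')
        out = (attn @ v).reshape(b, c, h, w)
        return self.out(out)
```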
The proposed FSF-Block boasts several advantages. Primarily, it effectively analyzes degradation information present in the degraded input images, thereby enhancing feature representation during subsequent up-sampling in the encoder and promoting more stable model training. Moreover, the FSF-Block adaptively modulates both low- and high-frequency contents of the input features, ensuring the fusion of the most useful information, and thereby enhancing aggregation features for more efficient high-quality image reconstruction.

3. Results

In this section, we describe the details of image datasets, loss function and network implementation. Then, we provide a comprehensive evaluation of the results produced by our AFS-NET and compare it with the state-of-the-art methods.

3.1. Image Datasets

The simulation is implemented for an aero-imaging system with a 1 m aperture imaging window mounted on an aircraft flying at Mach 0.8 at an altitude of 10 km. The degradation data simulated over 5 s are then applied to the clear images of the MAR20 [27] dataset, one of the largest remote sensing image datasets for aircraft recognition, resulting in degraded images. Each image pair consists of a clear image and its corresponding degraded image. The first 3420 image pairs are used for training and the last 380 pairs for testing.
For scene generalization testing, we extensively evaluate the network on multiple individual scenes from the NWPU-RESISC45 [28] dataset and the AID [29] dataset. The sample images from these two datasets exhibit high scene diversity and stylistic variation. Considering typical terrestrial features found in human-inhabited environments, we construct ten scene image test sets with aero-optical degradation. Specifically, four scene test sets are constructed from the NWPU dataset, including harbors, storage facilities, runways, and overpasses; each contains 700 image pairs. Additionally, six scene test sets are constructed from the AID dataset. Following the settings in [29], the numbers of test image pairs for Playground, School, Square, Center, Church, and Stadium are 370, 300, 330, 260, 240, and 290, respectively.
For robustness testing under different scenarios and noise conditions, we assess the network on the single-scene MAR20 test set and the multi-scene NWPU test set. The MAR20 test set contains 380 image pairs, while the NWPU test set consists of 2800 image pairs across four scene categories. Different levels of Gaussian noise are then added to both datasets with aero-optical degradation.
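The noise corruption used for these robustness tests corresponds to zero-mean Gaussian noise with the stated variances added to images normalized to [0, 1]; a minimal helper (our own, not the paper's data pipeline) is:

```python
import numpy as np

def add_gaussian_noise(image, variance, seed=None):
    """Add zero-mean Gaussian noise of the given variance to an image in [0, 1]."""
    rng = np.random.default_rng(seed)
    noisy = image + rng.normal(0.0, np.sqrt(variance), image.shape)
    return np.clip(noisy, 0.0, 1.0)

# Robustness test levels used in Section 3.3.3
# for var in (0.01, 0.03, 0.05, 0.08, 0.10):
#     noisy = add_gaussian_noise(degraded_image, var)
```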

3.2. Loss Function

In order to train AFS-NET, three kinds of loss functions are adopted:
1. Frequency Reconstruction loss [30]. As shown in Figure 1, the degraded image experiences severe blurring due to the impact of aero-optical effects, with a significant loss of detail information. Since the purpose of mitigating these effects is mainly to restore the lost high-frequency components, it is crucial to minimize the difference between the restored image and the sharp image in the frequency domain. To this end, we employ the Frequency Reconstruction loss \mathcal{L}_{fft} as follows:
\mathcal{L}_{fft} = \left\| \mathcal{F}(\tilde{S}_k) - \mathcal{F}(S_k) \right\|_1
where \mathcal{F} represents the FFT operation.
2. Charbonnier loss [24]. Similar to other deblurring networks, we adopt the Charbonnier loss \mathcal{L}_{char} [24,31]; we found that it produces better results than the L1 loss for our network. It is formulated as follows:
\mathcal{L}_{char} = \sqrt{ \left\| \tilde{S}_k - S_k \right\|^2 + \epsilon^2 }
3. Edge loss [24]. Recent studies suggest adding auxiliary loss terms to the Charbonnier loss for performance improvement [31,32]. Consequently, we introduce an additional Edge loss \mathcal{L}_{edge} to constrain the high-frequency components between the sharp image and the restored image, defined as:
\mathcal{L}_{edge} = \sqrt{ \left\| \Delta(\tilde{S}_k) - \Delta(S_k) \right\|^2 + \epsilon^2 }
where Δ denotes the Laplacian operator. Overall, the total loss function for AFS-NET is \mathcal{L}_{total} = \mathcal{L}_{fft} + \mathcal{L}_{char} + \lambda \mathcal{L}_{edge}, where λ is a tradeoff parameter set to 0.1. This loss function ensures that the network emphasizes detailed features in both the frequency and spatial domains, thereby restoring a sharper image.
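Under the definitions above, the total objective can be written directly in PyTorch; the sketch below uses torch.fft for the frequency term and a fixed Laplacian kernel for the edge term (the value of ε and the 3 × 3 kernel are our assumptions for illustration).

```python
import torch
import torch.nn.functional as F

_LAPLACIAN = torch.tensor([[0., 1., 0.], [1., -4., 1.], [0., 1., 0.]]).view(1, 1, 3, 3)

def total_loss(restored: torch.Tensor, sharp: torch.Tensor,
               lam: float = 0.1, eps: float = 1e-3) -> torch.Tensor:
    """L_total = L_fft + L_char + lambda * L_edge for (B, 1, H, W) tensors."""
    # Frequency reconstruction loss: L1 distance between FFT spectra
    l_fft = (torch.fft.fft2(restored) - torch.fft.fft2(sharp)).abs().mean()
    # Charbonnier loss: smooth L1 surrogate
    l_char = torch.sqrt((restored - sharp) ** 2 + eps ** 2).mean()
    # Edge loss: Charbonnier distance between Laplacian responses
    kernel = _LAPLACIAN.to(restored.device, restored.dtype)
    lap_r = F.conv2d(restored, kernel, padding=1)
    lap_s = F.conv2d(sharp, kernel, padding=1)
    l_edge = torch.sqrt((lap_r - lap_s) ** 2 + eps ** 2).mean()
    return l_fft + l_char + lam * l_edge
```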

3.3. Implementation Details

The proposed architecture is end-to-end trainable and requires no pre-training of any sub-modules. The training parameters, consistent across all experiments, are as follows. Our framework is implemented on two NVIDIA GeForce RTX 3090 GPUs. AFS-NET is trained with the Adam optimizer for 100,000 iterations, with an initial learning rate of 1 × 10^-3, which is steadily reduced to 1 × 10^-6 using a cosine annealing strategy. The batch size is set to 8 and, for data augmentation, we perform horizontal and vertical flips along with rotation. All images are resized to 256 × 256 to ensure consistency and compatibility with the network architecture. Figure 5 illustrates the loss curve over 100,000 iterations. The loss drops sharply at the start and stabilizes around 0.09, indicating effective convergence of the model.
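The optimizer and schedule described above correspond to a standard Adam plus cosine-annealing configuration; a minimal sketch (the model and training loop are placeholders) is:

```python
import torch

def build_training(model: torch.nn.Module, total_iters: int = 100_000):
    """Adam with cosine annealing from 1e-3 down to 1e-6 over all iterations."""
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
        optimizer, T_max=total_iters, eta_min=1e-6)
    return optimizer, scheduler

# Inside the training loop (one step per iteration):
#   loss = total_loss(model(degraded), sharp)
#   loss.backward(); optimizer.step(); optimizer.zero_grad(); scheduler.step()
```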

3.3.1. The Test of Image Restoration

To evaluate the feasibility of our proposed method for clear imaging through aero-optical turbulent flow, we conduct a comparative analysis with a representative iteration-based method, IBD [7], and several recent deep learning-based image restoration algorithms. The comparison includes turbulence removal algorithms such as CGAN [12] and DETURNET [14], as well as low-level vision methods, namely the spatial-domain NAFNET [15] and Improved Restormer [33] and the frequency-domain DEEPRFT [31] and LoFormer [34]. As presented in Table 1, the traditional IBD algorithm yields a negligible improvement due to the severe blur caused by aero-optical degradation. In contrast, our approach achieves the best performance on the same dataset, with a PSNR of 27.42 dB and an SSIM of 0.8504. Specifically, our method outperforms the second-ranked NAFNET by 0.80 dB in PSNR on aero-optical degraded images while incurring 13.05 G fewer MACs, demonstrating the high efficiency of our frequency separation and fusion unit. Additionally, AFS-NET processes each image in just 0.026 s, which significantly outperforms the other methods in inference speed and makes it well suited for real-time applications.
The visual comparisons on the aero-optical degraded image dataset are illustrated in Figure 6. The IBD and CGAN methods consistently exhibit significant residual aero-optical blur and substantial detail loss, thereby failing to delineate a coherent target outline. DeturNET, Improved Restormer and DEEPRFT lack fine details, showing slight blurring and artifacts around the target. LoFormer and NAFNET perform well in aero-optical restoration but do not effectively recover sharp edges in the results. In contrast, AFS-NET significantly reduces aero-optical residuals, and the corrected images exhibit sharper edges, higher image contrast, and clearer targets. These results demonstrate that our method is more effective in reconstructing high-quality restored images than the other competitors.

3.3.2. Scene Generalization Testing of AFS-NET

To extensively evaluate the generalization performance of our network, we train AFS-NET on the single-scene MAR20 training set and assess it on ten scene image testing sets from the NWPU and AID datasets. Each class represents a unique environment with varying structural features, object density, and texture complexity, which challenge the ability of the model to accurately restore scene content. Initially, we analyzed the four scene image datasets constructed by NWPU. As shown in Figure 7, in the enlarged details of storage and runway scenes, key elements such as storage tanks and runway markings are successfully reconstructed, and the restored images of complex scenes like overpasses and harbors retain the approximate shapes of their features, which demonstrates that our proposed method can restore the target image well in unknown scenarios. Furthermore, we also present the restoration results for six scene image datasets constructed by AID. As depicted in Figure 8, while some minor detail loss is observed in scenes like School, Square, Church, and Stadium, the overall contours of each scene image are well restored. These observations highlight the strong generalization capability of our network across diverse scenes.
From a quantitative analysis perspective, our method significantly improves both PSNR and SSIM across all tested scenes, as shown in Table 2. In the four scene test sets from the NWPU dataset, significant gains are observed in the image quality assessment metrics. In the harbor scene, PSNR increased by 12.00 dB and SSIM by 0.4491. The storage scene saw a PSNR increase of 10.91 dB and an SSIM improvement of 0.4570. The runway scene experienced a PSNR improvement of 12.30 dB and an SSIM increase of 0.2839. For the overpass scene, PSNR rose by 13.29 dB and SSIM improved by 0.4633. The results across the six AID scene test sets also show notable improvements: as shown in Table 2, the Playground, School, Square, Center, Church, and Stadium sets achieved PSNR improvements of 12.96 dB, 12.37 dB, 11.38 dB, 8.99 dB, 11.68 dB, and 8.76 dB, respectively, while SSIM increased by 0.3505, 0.5347, 0.4706, 0.4142, 0.5010, and 0.4117. These substantial improvements demonstrate that our method consistently elevates image quality and that the overall reconstruction effectiveness remains robust across different scene datasets.

3.3.3. The Robustness of AFS-NET on Different Noises

To further assess the effectiveness of aero-optical restoration, we evaluated the single-scene degraded MAR20 test set and the multi-scene degraded NWPU test set by adding Gaussian noise with a mean of 0 and variances of 0.01, 0.03, 0.05, 0.08, and 0.10. The restoration results on the MAR20 test set, shown in Figure 9, indicate that higher Gaussian noise variances progressively reduce image clarity. The aircraft details remain distinguishable at lower noise levels (variances 0.01, 0.03, and 0.05), which preserves key features essential for accurate recognition. However, at higher noise variances (0.08 and 0.10), some targets in the image become smoothly blurred, which results in a slight loss of fine details while maintaining the overall image structure. The results on the multi-scene NWPU test set, shown in Figure 10, further demonstrate the robustness of our network. At lower noise levels (0.01, 0.03, and 0.05), the lines within the scene images are restored, preserving the overall structural clarity. As the noise variance increases, particularly at higher levels (0.08 and 0.10), the key target contours are still retained. Overall, AFS-NET maintains high restoration quality even under different scenarios and noise conditions, which demonstrates its potential applicability in real-world environments.
Table 3 reports denoising results on different datasets with five distinct noise variances. As the noise variance increases, PSNR and SSIM values for restorations gradually decrease. Nevertheless, our method consistently maintains favorable performance across all noise levels.

3.3.4. Ablation Studies

We conduct ablation experiments on the MAR20 dataset with aero-optical degradation to verify the effectiveness of each module in AFS-NET. All ablation experiments follow the experimental setup of the previous section. First, we investigate different choices for the FSF-Block. As shown in Table 4, the rectangular mask achieves only 27.19 dB PSNR, which is 0.23 dB lower than our proposed elliptical mask. To further illustrate this improvement, we analyze the residual frequency spectra of the restored images under different masks, as depicted in Figure 11. In contrast to the rectangular mask, the elliptical mask yields a more naturally distributed frequency restoration and reduces directional bias near the segmentation boundaries, thus minimizing prominent residual components. Furthermore, the L-H unit brings a gain of 0.11 dB PSNR over using only the elliptical mask, resulting in the highest performance, as presented in Table 4. Second, we quantitatively evaluate AFS-NET without ECA and/or the FSF-Block in Table 5. The FSF-Block significantly boosts PSNR by 0.74 dB, although using only the FSF-Block in AFS-NET leads to an SSIM drop. ECA improves the results by 0.08 dB without introducing additional parameters or inference runtime. Finally, we experimentally verify the effectiveness of our total loss function; the results are shown in Table 6. Combining the frequency loss and the Charbonnier loss with a weight ratio of 1:1 achieves a PSNR of 27.32 dB and an SSIM of 0.8494. When an additional edge loss with a weight of 0.1 is included, the PSNR improves to 27.42 dB and the SSIM increases to 0.8504, which is higher than the other versions. These results demonstrate the effectiveness of our design.

4. Discussion

The proposed Adaptive Frequency Selection Network (AFS-NET) introduces an innovative approach for restoring aero-optical degraded images by integrating spatial and frequency domain processing. Through extensive experimentation, we have demonstrated that AFS-NET effectively mitigates aero-optical degradation, outperforming both traditional optimization-based methods and contemporary deep learning-based solutions.
As the basic building block of AFS-Net, the CCL-Block incorporates global and focal channel attention mechanisms to significantly enhance feature extraction efficiency. This design enables the network to capture multi-scale contextual information, ultimately improving the clarity and sharpness of the restored images.
Moreover, a key strength of AFS-NET lies in its ability to adaptively separate and fuse frequency components. Unlike conventional spatial-domain methods that primarily focus on pixel-wise restoration, our approach explicitly models the frequency characteristics of degraded images. By leveraging the FSF-Block, our method selectively enhances high-frequency details while preserving the overall structural integrity, ensuring superior reconstruction quality.
The generalization capability of AFS-NET is another notable advantage. Our evaluation across multiple simulated datasets of aero-optical degraded images confirms that AFS-NET performs robustly across a wide range of typical terrestrial features commonly found in human-inhabited environments. The ability to maintain high PSNR and SSIM scores across various scenarios highlights the model's adaptability to diverse and unseen environments. Additionally, AFS-NET exhibits strong resilience to noise on both single-scene and multi-scene datasets. Under varying levels of Gaussian noise, the model consistently achieves effective reconstruction with notable improvements, demonstrating its robustness in noisy scenes.
Despite its promising results, there are still potential areas for improvement. First, while our method achieves high-quality performance, the computational cost of frequency-domain processing remains a consideration for real-time applications. Future work could explore lighter-weight architectures or hardware-accelerated implementations to further reduce inference time. Second, our approach primarily addresses blur and intensity distortions, but additional adversarial learning strategies could be integrated to enhance fine-grained texture restoration, particularly in extreme turbulence conditions.

5. Conclusions

Aero-optical effects are a major cause of image quality degradation for optical imaging systems on aircraft in high-speed flight. In this paper, we have introduced AFS-NET, an efficient method for restoring aero-optical degraded images via adaptive frequency selection. This method shows that the negative impact of aero-optical effects on the propagation of target information through turbulence can be mitigated by learning-based digital image processing. The main CCL-Block utilizes channel attention mechanisms to focus on preserving precise spatial details, and the supplementary FSF-Block integrates spectral information to enhance the contextual feature representation. The Frequency loss, Charbonnier loss, and Edge loss are combined to enhance the performance of the network. The simulation results show that AFS-NET is able to restore complex textures and achieves more competitive restoration results than other state-of-the-art networks, which provides a new direction for mitigating the impact of aero-optical effects. Additionally, the accomplishments of this study in high-quality and efficient reconstruction lay a foundation for subsequent wind tunnel experimental research.

Author Contributions

Conceptualization, Y.H. and X.M.; methodology, Y.H.; validation, Q.Z.; formal analysis, Y.H.; resources, Q.Z.; writing—original draft preparation, Y.H.; data curation, Y.H.; writing—review and editing, X.M.; supervision, H.M. and X.M.; project administration, X.M.; funding acquisition, H.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by National Natural Science Foundation of China (62175243).

Data Availability Statement

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Sutton, G.W.; Hwang, Y.F.; Pond, J.; Hwang, Y. Hypersonic interceptor aero-optics performance predictions. J. Spacecr. Rocket. 1994, 31, 592–599. [Google Scholar] [CrossRef]
  2. Yin, X.L. New subdiscipline of contemporary optics: Aero-optics. Eng. Sci. 2005, 7, 1–6. [Google Scholar]
  3. Zhang, T.X.; Hong, H.Y.; Zhang, X.Y. Aero-Optical Effect Correction: Principles, Methods and Applications; University of Science and Technology of China Press: Hefei, China, 2014. [Google Scholar]
  4. Zhang, L.Q.; Fei, J.D. Research on photoelectric correction method of aero-optical effect. Infrared Laser Eng. 2004, 6, 580–583. [Google Scholar]
  5. Richardson, W.H. Bayesian-based iterative method of image restoration. J. Opt. Soc. Am. 1972, 62, 55–59. [Google Scholar] [CrossRef]
  6. Ayers, G.R.; Dainty, J.C. Iterative blind deconvolution method and its applications. Opt. Lett. 1988, 13, 547–549. [Google Scholar] [CrossRef]
  7. Zhao, J.L.; Wang, Y. An accurate parameter estimation algorithm for fuzzy images based on quadratic Wiener filtering. Small Microcomput. Syst. 2014, 35, 1180–1183. [Google Scholar]
  8. Fish, D.A.; Brinicombe, A.M.; Pike, E.R.; Walker, J.G. Blind deconvolution by means of the Richardson–Lucy algorithm. J. Opt. Soc. Am. 1995, 12, 58–65. [Google Scholar] [CrossRef]
  9. Hong, H.; Shi, Y.; Zhang, T.; Liu, Z. A correction method for aero-optics thermal radiation effects based on gradient distribution and dark channel. Optoelectron. Lett. 2019, 15, 374–380. [Google Scholar] [CrossRef]
  10. Wang, Y.; Sui, X.; Wang, Y.; Liu, T.; Zhang, C.; Chen, Q. Contrast enhancement method in aero thermal radiation images based on cyclic multi-scale illumination self-similarity and gradient perception regularization. Opt. Express 2024, 32, 1650–1668. [Google Scholar] [CrossRef]
  11. You, Y.L.; Kaveh, M. A regularization approach to joint blur identification and image restoration. IEEE Trans. Image Process. 1996, 5, 416–428. [Google Scholar]
  12. Yuan, L.; Sun, J.; Quan, L.; Shum, H.-Y. Image deblurring with blurred/noisy image pairs. ACM Trans. Graph. 2007, 26, 1-es. [Google Scholar] [CrossRef]
  13. Su, Y.B.; Wang, Z.Y.; Liang, S.; Zhang, T.X. CGAN for simulation and digital image correction of aero transmission effect and aero heat radiation effect. In Proceedings of the Third International Conference on Photonics and Optical Engineering, Xi’an, China, 5–8 December 2018; Volume 11052, pp. 405–412. [Google Scholar]
  14. Gao, X.; Wang, S.; Cui, Y.; Wu, Z. Aero-optical image and video restoration based on mean filter and adversarial network. In Proceedings of the 2022 3rd International Conference on Computer Vision, Image and Deep Learning & International Conference on Computer Engineering and Applications (CVIDL & ICCEA), Changchun, China, 20–22 May 2022; pp. 528–532. [Google Scholar]
  15. Li, X.; Liu, X.; Wei, W.; Zhong, X.; Ma, H.; Chu, J. A deturnet-based method for recovering images degraded by atmospheric turbulence. Remote Sens. 2023, 15, 5071. [Google Scholar]
  16. Chen, L.; Chu, X.; Zhang, X.; Sun, J. Simple baselines for image restoration. In European Conference on Computer Vision; Springer Nature Switzerland: Cham, Switzerland, 2022; pp. 17–33. [Google Scholar]
  17. Cui, Y.; Zamir, S.W.; Khan, S.; Knoll, A.; Shah, M.; Khan, F.S. AdaIR: Adaptive All-in-One Image Restoration via Frequency Mining and Modulation. arXiv 2024, arXiv:2403.14614. [Google Scholar]
  18. Wang, Q.; Wu, B.; Zhu, P.; Li, P.; Zuo, W.; Hu, Q. ECA-Net: Efficient channel attention for deep convolutional neural networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 14–19 June 2020; pp. 11534–11542. [Google Scholar]
  19. Gladstone, J.H.; Dale, T.P. Researches on the Refraction, Dispersion, and Sensitiveness of Liquids. Philos. Trans. R. Soc. Lond. 1862, 12, 448–453. [Google Scholar]
  20. Fender, J.S. Synthetic apertures: An overview. Synthetic Aperture Systems I. 1984, 440, 2–7. [Google Scholar]
  21. Meinel, B. Aperture synthesis using independent telescopes. Appl. Opt. 1970, 9, 2501. [Google Scholar] [CrossRef]
  22. Gardner, J.P.; et al. The James Webb Space Telescope. Space Sci. Rev. 2006, 123, 485–606. [Google Scholar]
  23. Hu, J.; Shen, L.; Sun, G. Squeeze-and-excitation networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 7132–7141. [Google Scholar]
  24. Zamir, S.W.; Arora, A.; Khan, S.; Hayat, M.; Khan, F.S.; Yang, M.H.; Shao, L.; bin Zayed, M. Multi-stage progressive image restoration. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 14821–14831. [Google Scholar]
  25. Zamir, S.W.; Arora, A.; Khan, S.; Hayat, M.; Khan, F.S.; Yang, M.H. Restormer: Efficient transformer for high-resolution image restoration. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 5728–5739. [Google Scholar]
  26. Han, K.; Wang, Y.; Chen, H.; Chen, X.; Guo, J.; Liu, Z. A survey on vision transformer. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 45, 87–110. [Google Scholar]
  27. Yu, W.Q.; Cheng, G.; Wang, M.J.; Yao, Y.; Xie, X.; Yao, X.; Han, J. MAR20: A Benchmark for Military Aircraft Recognition in Remote Sensing Images. Natl. Remote Sens. Bull. 2022, 27, 2688–2696. [Google Scholar]
  28. Wang, Q.; Gao, J.; Lin, W.; Li, X. NWPU-crowd: A large-scale benchmark for crowd counting and localization. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 43, 2141–2149. [Google Scholar]
  29. Xia, G.S.; Hu, J.; Hu, F.; Shi, B.; Bai, X.; Zhong, Y. AID: A benchmark data set for performance evaluation of aerial scene classification. IEEE Trans. Geosci. Remote Sens. 2017, 55, 3965–3981. [Google Scholar]
  30. Cho, S.-J.; Ji, S.-W.; Hong, J.-P.; Jung, S.-W.; Ko, S.-J. Rethinking coarse-to-fine approach in single image deblurring. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, BC, Canada, 11–17 October 2021; pp. 4641–4650. [Google Scholar]
  31. Mao, X.; Liu, Y.; Liu, F.; Li, Q.; Shen, W.; Wang, Y. Intriguing findings of frequency selection for image deblurring. Proc. AAAI Conf. Artif. Intell. 2023, 37, 1905–1913. [Google Scholar]
  32. Jiang, K.; Wang, Z.; Yi, P.; Chen, C.; Huang, B.; Luo, Y.; Ma, J.; Jiang, J. Multi-scale progressive fusion network for single image deraining. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 8346–8355. [Google Scholar]
  33. Akmaral, A.; Zafar, M.H. Efficient Transformer for High Resolution Image Motion Deblurring. arXiv 2025, arXiv:2501.18403. [Google Scholar]
  34. Mao, X.; Wang, J.; Xie, X.; Li, Q.; Wang, Y. LoFormer: Local frequency transformer for image deblurring. In Proceedings of the 32nd ACM International Conference on Multimedia, Melbourne, VIC, Australia, 28 October–1 November 2024; pp. 10382–10391. [Google Scholar]
Figure 1. From top to bottom: sharp images, degraded images, and the Fourier spectra of residual images obtained by subtracting the degraded images from the original images. As observed, different aero-optical distortion scenes emphasize different frequency subbands. For example, in Scene 1 and Scene 2, there are larger discrepancies in the low-frequency components between the sharp and degraded image pairs. The spectra are all 256 × 256.
Figure 3. Framework of the proposed AFS-NET. (a) Details of the CCL-Block. Each CCL-Block consists of two residual blocks, which mainly contain global channel attention (GCA) with simplified channel attention (SCA) and focal channel attention (FCA) with efficient channel attention (ECA). SCA is proposed in NAFNET. (b) Efficient Channel Attention (ECA). (c) Details of the FSF-Block. Each FSF-Block further contains context modeling, feature transformation, and feature fusion.
Figure 4. (a) Mask Generation Block (MGB). (b) L-H is a converter that enriches features from the low- to the high-frequency branch. (c) Cross Attention (CA).
Figure 5. Training loss curve versus iteration process for the model.
Figure 6. Comparison of aero-optical effects restoration on simulated degraded images. Red boxes highlight restored details, which are magnified for better visualization.
Figure 7. Examples of scene generalization test results from NWPU testing sets. The left three columns represent the original images, the simulated aero-optical images and the restored images, respectively. The right three columns show the corresponding enlarged details.
Figure 8. Examples of scene generalization test results from AID testing sets. The first three columns on the left correspond to the scene types Playground, School, and Square, including the original images, the simulated aero-optical images, and the corresponding restored results. The last three columns on the right correspond to the scene types Center, Church, and Stadium, including the original images, the simulated aero-optical images, and the corresponding restored results.
Figure 9. Visual robustness test results on the single-scene data set under different levels of Gaussian noise. The first and third rows show the noisy images. The second and fourth rows show the restored images.
Figure 10. Visual robustness test results on the multi-scene data set under different levels of Gaussian noise. The first and third rows show the noisy images. The second and fourth rows show the restored images.
Figure 11. From left to right: The frequency spectra of residual images obtained by subtracting the degraded images from the original images, the frequency spectra of residual images obtained by subtracting the rectangular mask restored images from the original images, and the frequency spectra of residual images obtained by subtracting the elliptical mask restored images from the original images. The details within the black bounding boxes have been enlarged for clearer viewing. The colors indicate the magnitude of frequency differences, where deeper red represents larger residual values, and lighter blue represents smaller residual values.
Table 1. Average restoration results of different methods on 380 test images (best results are highlighted in bold and second-best are underlined). MACs and inference times are computed on an input tensor shape of (1, 256, 256).

Method             | PSNR  | SSIM   | MACs (G) | Time (s)
Degraded           | 13.66 | 0.5317 | -        | -
IBD                | 13.68 | 0.5448 | -        | 15.05
CGAN               | 22.82 | 0.6951 | 78.30    | 0.077
DeturNET           | 25.14 | 0.8348 | 88.70    | 0.033
Improved Restormer | 25.88 | 0.8033 | 64.46    | 0.097
DEEPRFT            | 26.22 | 0.8488 | 63.46    | 0.089
LoFormer-B         | 26.32 | 0.8375 | 73.04    | 0.132
NAFNET-64          | 26.62 | 0.8423 | 63.21    | 0.044
Ours               | 27.42 | 0.8504 | 50.16    | 0.026
Table 2. Comparison of the ten scene data sets.

Dataset | Scene      | Test PSNR | Test SSIM | Restoration PSNR | Restoration SSIM
NWPU    | Storage    | 13.90     | 0.3526    | 24.81            | 0.8096
NWPU    | Runway     | 12.22     | 0.5710    | 24.52            | 0.8549
NWPU    | Overpass   | 10.27     | 0.3305    | 23.56            | 0.7938
NWPU    | Harbor     | 11.51     | 0.3615    | 23.51            | 0.8106
AID     | Playground | 12.74     | 0.4767    | 25.70            | 0.8272
AID     | School     | 12.10     | 0.2650    | 24.47            | 0.7997
AID     | Square     | 12.72     | 0.3280    | 24.10            | 0.7986
AID     | Center     | 15.08     | 0.3724    | 24.07            | 0.7866
AID     | Church     | 12.00     | 0.2608    | 23.68            | 0.7618
AID     | Stadium    | 14.77     | 0.3933    | 23.53            | 0.8050
Table 3. Comparison of the MAR20 and NWPU datasets under the same experimental settings for noise robustness testing.

Dataset | Noise Variance | Test PSNR | Test SSIM | Restoration PSNR | Restoration SSIM
MAR20   | 0.01           | 13.65     | 0.4804    | 24.13            | 0.7200
MAR20   | 0.03           | 13.61     | 0.4144    | 24.04            | 0.7003
MAR20   | 0.05           | 13.58     | 0.3707    | 23.83            | 0.6880
MAR20   | 0.08           | 13.53     | 0.3254    | 23.54            | 0.6746
MAR20   | 0.10           | 13.50     | 0.3027    | 23.37            | 0.6665
NWPU    | 0.01           | 11.96     | 0.3650    | 20.97            | 0.6604
NWPU    | 0.03           | 11.93     | 0.3135    | 20.64            | 0.6388
NWPU    | 0.05           | 11.90     | 0.2790    | 20.40            | 0.6243
NWPU    | 0.08           | 11.85     | 0.2428    | 20.14            | 0.6078
NWPU    | 0.10           | 11.82     | 0.2248    | 20.02            | 0.5988
Table 4. Different configurations of FSF-Block components in AFS-NET. The ✔ symbol indicates that the corresponding component is included in the FSF-Block configuration.

Net | L-H | Rect Mask | Ellip Mask | PSNR  | SSIM
a   | ✔   | ✔         |            | 27.19 | 0.8493
b   |     |           | ✔          | 27.31 | 0.8487
c   | ✔   |           | ✔          | 27.42 | 0.8504
Table 5. Ablation studies on the impact of ECA and/or FSF-Block in AFS-NET. The ✔ symbol indicates that the corresponding component is included in the network configuration.

Net | ECA | FSF-Block | PSNR  | SSIM   | Params (M) | FLOPs (G)
a   |     |           | 26.85 | 0.8495 | 34.50      | 35.69
b   | ✔   |           | 26.99 | 0.8556 | 34.50      | 35.74
c   |     | ✔         | 27.25 | 0.8466 | 53.80      | 50.11
d   | ✔   | ✔         | 27.42 | 0.8504 | 53.80      | 50.16
Table 6. Design choices for loss functions in AFS-NET.

Loss Method                   | PSNR  | SSIM
L_char                        | 26.42 | 0.8308
L_char + 0.1 L_fft            | 27.24 | 0.8493
L_char + L_fft                | 27.32 | 0.8494
L_char + L_fft + 0.1 L_edge   | 27.42 | 0.8504
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
