Article

Self-Attention GAN for Electromagnetic Imaging of Uniaxial Objects

1 Department of Electrical and Computer Engineering, Tamkang University, Tamsui 251301, Taiwan
2 School of Engineering, San Francisco State University, San Francisco, CA 94117-1080, USA
* Author to whom correspondence should be addressed.
Appl. Sci. 2025, 15(12), 6723; https://doi.org/10.3390/app15126723
Submission received: 14 May 2025 / Revised: 13 June 2025 / Accepted: 13 June 2025 / Published: 16 June 2025

Abstract: This study introduces a Self-Attention (SA) Generative Adversarial Network (GAN) framework that applies artificial intelligence techniques to microwave sensing for electromagnetic imaging. The approach illuminates anisotropic objects with Transverse Magnetic (TM) and Transverse Electric (TE) electromagnetic waves while sensing antennas collect the scattered field data. To simplify the training process, a Back Propagation Scheme (BPS) is first employed to calculate a preliminary permittivity distribution, which is then fed into the GAN with SA for image reconstruction. The proposed GAN with SA offers superior performance and higher resolution than a conventional GAN, along with enhanced generalization capability. The methodology consists of two main steps. First, TM waves are used to estimate the initial permittivity distribution along the z-direction using the BPS. Second, TE waves are used to estimate the permittivity distributions along the x- and y-directions. The estimated permittivity values serve as inputs to train the GAN with SA. In our study, we add 5% and 20% noise to compare the performance of the GAN with and without SA. Numerical results indicate that the GAN with SA achieves higher efficiency and resolution, as well as better generalization capability. Our innovation lies in the successful reconstruction of various uniaxial objects using a generator integrated with a self-attention mechanism, achieving reduced computational time and real-time imaging.

1. Introduction

Electromagnetic imaging involves transmitting electromagnetic waves toward a target and reconstructing its size, shape, material properties, and location by analyzing the scattered signals produced by wave–object interactions. Sensing antennas play a crucial role in this process by capturing these scattered signals, which are then used to generate detailed reconstructions of the target. This technology has widespread applications in biomedical imaging, geophysical exploration, remote sensing, and medical diagnostics. Despite its rapid development and widespread adoption, electromagnetic imaging faces several challenges, such as poor penetration leading to a low signal-to-noise ratio, nonlinearity, ill-posedness, the curse of dimensionality, computational complexity, and parameter selection issues. Broadly speaking, two primary approaches exist for solving electromagnetic imaging problems: traditional algorithms [1,2,3] and Artificial Intelligence (AI)-based methods [4,5,6,7,8,9,10,11,12,13,14,15,16,17]. Traditionally, Inverse Scattering Problems (ISPs) are reformulated as optimization problems. However, conventional algorithms require significant computational resources and time, limiting their ability to achieve high-resolution reconstructions. AI-based methods, in turn, can be divided into U-Net-based [4,5,6,7,8,9,10,11] and GAN-based approaches [12,13,14,15,16,17].
In 2020, Xiao proposed a Three-Dimensional Electromagnetic Inversion (3D-EMI) method based on the Born Approximation (BA) and a 3D U-Net. The method improved the initial image using Monte Carlo simulations before training the 3D U-Net. In comparison with the Variational Born Iteration Method (VBIM), this method demonstrated superior performance [4]. In 2021, Li demonstrated that a multi-frequency U-Net Convolutional Neural Network (CNN) performed well under high-contrast and complex conditions, showing adaptability to different frequency bands [5]. In 2022, Yang introduced a deep learning approach combining U-Net with an attention mechanism to solve ISPs. The attention mechanism improved accuracy and efficiency compared with the traditional U-Net [6]. In the same year, Ao proposed a deep learning-based down-sampling and up-sampling model to solve the Electromagnetic Inverse Scattering (EMIS) problem. The down-sampling operation extracted relevant feature segments from the measured scattered field and compressed these features, while the up-sampling operation retrieved and expanded these features to reconstruct the dielectric parameters of the unknown scatterer. Numerical results indicated that this method was both effective and feasible for solving EMIS problems [7]. In 2023, Liu integrated Total Variation (TV) as a regularization term into a Subspace-based Optimization Method (SOM) to preprocess the data, which improved U-Net model performance [8]. In 2023, Saladi proposed a two-stage deep learning strategy to solve the ISP. In the first stage, a Deep Convolutional Neural Network (DCNN) extracted features from the scattered field data. In the second stage, an improved Attention U-Net refined the reconstructed images. The results indicated that this method outperformed other approaches in terms of image quality and reconstruction accuracy [9].
In 2024, Hu proposed an optimization technique combining U-Net and Cycle-GAN for the EMIS, which significantly refined rough reconstructions generated by U-Net alone [10]. In 2024, Si proposed a two-stage method to solve the Two-Dimensional Full-Wave (2D Full-Wave) ISP for deriving the dielectric properties of scatterers. First, a BP scheme estimated an initial approximation of the dielectric constant. Then, a U-Net model further optimized the reconstruction quality. Experimental results confirmed that this method was effective and accurate, demonstrating the potential for real-time quantitative imaging applications [11].
Ye applied GANs to solve ISPs with inhomogeneous backgrounds in 2020. The adversarial loss function enhanced generator learning, thereby improving reconstruction quality [12]. In 2021, Guo introduced a GAN that enhanced image resolution, achieving superior results compared with conventional optimization techniques, particularly in peak signal-to-noise ratio and correlation coefficient [13]. In 2021, Ma proposed a non-iterative Forward Induced Current Learning Method (FICLM) based on GAN Pix2Pix to solve the electromagnetic scattering problem. Comparisons with other neural networks, such as the U-Net, demonstrated that, thanks to the adversarial framework of Pix2Pix, FICLM excelled in handling complex scatterers. This method provided a novel deep-learning-based approach for efficiently solving electromagnetic scattering problems [14]. In 2022, Ye proposed a Super-Resolution Generative Adversarial Network (SR-GAN) for quantitative imaging of 2D biaxial anisotropic scatterers. Additionally, the Visual Geometry Group (VGG) loss was introduced to extract high-level features of the target instead of traditional low-level pixel error measurements. Numerical results validated the effectiveness of the proposed method [15]. In 2023, Xu integrated GANs and SA while introducing a weighted loss function, proposing the Learning-Assisted Inversion Method (LAIM) to further enhance the model’s generalization capability and stability. Experimental data tested on the FoamDielExt Profile (FDEP) demonstrated that LAIM’s combination of SA and the weighted loss function significantly outperformed other methods in image quality and detail restoration, particularly for high-frequency data [16]. In 2024, Yao proposed a deep learning framework based on the Conditional Deep Convolutional Generative Adversarial Network (CDCGAN) to address the EMIS problem. 
The generator learned the distribution between scattered field data and scatterer contrast, while the discriminator assessed sample authenticity. This method accurately resolved high-contrast scatterer problems, with numerical experiments verifying its high precision and feasibility [17]. This paper introduces an advanced AI technique that integrates SA into a GAN to reconstruct electromagnetic images. To date, no research has explored the application of SA-enhanced GANs to the imaging of anisotropic objects. In the proposed framework, the Super Resolution Neural Network (SRNN) generator adopts a CNN architecture, which consists of a contracting network on the left and an expanding network on the right. The contracting network incorporates repeated 3 × 3 convolution layers, Batch Normalization (BN) layers, and Rectified Linear Unit (ReLU) layers. Additionally, a 2 × 2 max pooling layer is introduced in the pooling stage of the contracting network. In the expanding network, 3 × 3 up-convolution layers, BN layers, and ReLU layers are added, followed by a 1 × 1 convolution layer serving as the fully connected layer. SA is incorporated at the output of the SRNN to enhance the quality of the reconstructed image. The contributions of this paper are summarized as follows:
  • By separating TE and TM wave incidence, the GAN with SA simultaneously reconstructs the electromagnetic images of anisotropic objects in the x, y, and z directions, making the reconstruction process more challenging.
  • The vector sum of the scattered field for the TE-polarized wave introduces greater complexity compared with that of the TM-polarized wave.
  • The strong nonlinear interaction between the dielectric material and the applied electric field induces significant directionality in the TE waves, leading to less accurate reconstruction in certain directions.
  • In the presence of a discriminator network, integrating an SA mechanism at the end of the generator improves learning performance compared with the absence of SA.
The remainder of this paper is organized as follows. Section 2 presents the theoretical formulation of the direct and inverse problems. Section 3 details the architecture and design of the proposed GAN with SA. Section 4 provides a comprehensive set of numerical experiments under varying noise conditions. Finally, Section 5 offers a summary of the key findings and concluding remarks.

2. Theoretical Formulation

2.1. Direct Problem

This study focuses on a uniaxial cylindrical dielectric object situated in free space, with its central axis aligned along the z-direction. The corresponding relative permittivity tensor $\bar{\bar{\varepsilon}}_1(\bar{r})$ and the magnetic permeability $\mu_0$ are depicted in Figure 1. In the Cartesian coordinate system $(x, y, z)$, the permittivity tensor takes the form of a diagonal matrix, as defined in Equation (1). All directional components $\varepsilon_{1x}(x,y)$, $\varepsilon_{1y}(x,y)$, and $\varepsilon_{1z}(x,y)$ are modeled as complex functions to reflect both the anisotropic nature of the medium and potential material losses.

$$\bar{\bar{\varepsilon}}_1(\bar{r}) = \begin{bmatrix} \varepsilon_{1x}(x,y) & 0 & 0 \\ 0 & \varepsilon_{1y}(x,y) & 0 \\ 0 & 0 & \varepsilon_{1z}(x,y) \end{bmatrix} \quad (1)$$
This study centers on the analysis of uniaxial dielectric scatterers subjected to electromagnetic wave excitation. The incident field is assumed to exhibit a time-harmonic behavior, mathematically expressed as $e^{j\omega t}$.
Then, we consider the following two cases.

2.1.1. TM Waves

Let $\bar{E}^i(\bar{r})$ denote the incident electric field:

$$\bar{E}^i(\bar{r}) = E^i_{1z}(x,y)\,\hat{z} = e^{-jk_0(x\cos\vartheta + y\sin\vartheta)}\,\hat{z} \quad (2)$$

Here, $k_0$ refers to the free-space wave number, while $\vartheta$ denotes the angle at which the wave is incident. For the TM mode, only TM-polarized scattered fields are generated, so the scattered field contains only a z-component. Accordingly, the total electric field $\bar{E} = E_z\hat{z}$ and the scattered field $\bar{E}^s = E^s_z\hat{z}$ are represented by the following expressions:

$$E_z(\bar{r}) = \iint_S G(\bar{r},\bar{r}')\left[\varepsilon_{1z}(\bar{r}')-1\right]E_z(\bar{r}')\,ds' + E^i_{1z}(\bar{r}), \quad \bar{r} \in S \quad (3)$$

$$E^s_z(\bar{r}) = \iint_S G(\bar{r},\bar{r}')\left[\varepsilon_{1z}(\bar{r}')-1\right]E_z(\bar{r}')\,ds', \quad \bar{r} \notin S,\ \bar{r}' \in S \quad (4)$$

where $G(\bar{r},\bar{r}') = \frac{jk_0^2}{4}H_0^{(2)}(k_0|\bar{r}-\bar{r}'|)$ is the two-dimensional free-space Green's function and $H_0^{(2)}$ is the zeroth-order Hankel function of the second kind.
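As a minimal numerical sketch (not part of the paper's implementation), the off-diagonal entries of a Green's function matrix between discretization cell centers can be evaluated directly from the definition above using `scipy.special.hankel2`; the singular self terms ($m = n$) require the analytic cell integration given later in the matrix formulation and are simply left as zero here:

```python
import numpy as np
from scipy.special import hankel2

def greens_matrix_2d(points, k0):
    """Off-diagonal entries of the 2-D free-space Green's function
    G(r, r') = (j k0^2 / 4) H0^(2)(k0 |r - r'|) between cell centers.
    Singular self terms (m = n) need analytic cell integration and
    are left as zero in this sketch."""
    pts = np.asarray(points, dtype=float)
    rho = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    G = np.zeros(rho.shape, dtype=complex)
    off = ~np.eye(len(pts), dtype=bool)       # mask out the diagonal
    G[off] = 1j * k0**2 / 4 * hankel2(0, k0 * rho[off])
    return G

# Three cells spaced 0.1 m apart at free-space wavenumber k0 = 2*pi.
G = greens_matrix_2d([[0.0, 0.0], [0.1, 0.0], [0.2, 0.0]], 2 * np.pi)
```

Because the kernel depends only on the distance $|\bar r - \bar r'|$, the resulting matrix is symmetric, which is the reciprocity property noted later in the paper.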

2.1.2. TE Waves

The incident field components $E^i_{1x}(\bar{r})$ and $E^i_{1y}(\bar{r})$ can be expressed as follows:

$$E^i_{1x}(\bar{r}) = \sin\vartheta\, e^{-jk_0(x\cos\vartheta + y\sin\vartheta)} \quad (5)$$

$$E^i_{1y}(\bar{r}) = -\cos\vartheta\, e^{-jk_0(x\cos\vartheta + y\sin\vartheta)} \quad (6)$$

Due to the interdependence between the $E_x$ and $E_y$ components, the incident field is represented using the vector potential approach as $\bar{E}^i(\bar{r}) = E^i_{1x}(\bar{r})\hat{x} + E^i_{1y}(\bar{r})\hat{y}$. In a similar manner, the total electric field is described by $\bar{E}(\bar{r}) = E_x(\bar{r})\hat{x} + E_y(\bar{r})\hat{y}$, and the scattered field in the exterior region is given as $\bar{E}^s(\bar{r}) = E^s_x(\bar{r})\hat{x} + E^s_y(\bar{r})\hat{y}$. The detailed formulations for these components are provided below:

$$E_x(\bar{r}) = \left(\frac{\partial^2}{\partial x^2}+k_0^2\right)\iint_S G(\bar{r},\bar{r}')\left[\varepsilon_{1x}(\bar{r}')-1\right]E_x(\bar{r}')\,ds' + \frac{\partial^2}{\partial x\,\partial y}\iint_S G(\bar{r},\bar{r}')\left[\varepsilon_{1y}(\bar{r}')-1\right]E_y(\bar{r}')\,ds' + E^i_{1x}(\bar{r}) \quad (7)$$

$$E_y(\bar{r}) = \frac{\partial^2}{\partial x\,\partial y}\iint_S G(\bar{r},\bar{r}')\left[\varepsilon_{1x}(\bar{r}')-1\right]E_x(\bar{r}')\,ds' + \left(\frac{\partial^2}{\partial y^2}+k_0^2\right)\iint_S G(\bar{r},\bar{r}')\left[\varepsilon_{1y}(\bar{r}')-1\right]E_y(\bar{r}')\,ds' + E^i_{1y}(\bar{r}) \quad (8)$$

$$E^s_x(\bar{r}) = \left(\frac{\partial^2}{\partial x^2}+k_0^2\right)\iint_S G(\bar{r},\bar{r}')\left[\varepsilon_{1x}(\bar{r}')-1\right]E_x(\bar{r}')\,ds' + \frac{\partial^2}{\partial x\,\partial y}\iint_S G(\bar{r},\bar{r}')\left[\varepsilon_{1y}(\bar{r}')-1\right]E_y(\bar{r}')\,ds' \quad (9)$$

$$E^s_y(\bar{r}) = \frac{\partial^2}{\partial x\,\partial y}\iint_S G(\bar{r},\bar{r}')\left[\varepsilon_{1x}(\bar{r}')-1\right]E_x(\bar{r}')\,ds' + \left(\frac{\partial^2}{\partial y^2}+k_0^2\right)\iint_S G(\bar{r},\bar{r}')\left[\varepsilon_{1y}(\bar{r}')-1\right]E_y(\bar{r}')\,ds' \quad (10)$$
Given the known spatial distribution of the dielectric permittivity tensor, the total electric field $\bar{E}$ is computed using Equations (3), (7), and (8), while the scattered field $\bar{E}^s$ is subsequently obtained from Equations (4), (9), and (10). To enable numerical simulation of the forward scattering problem, the object domain is discretized into N small regions, within which the material properties and field values are assumed to remain constant. Similarly, the boundary surface S is subdivided into N equally sized elements to maintain spatial uniformity of both the electric field and the permittivity in each portion. For the n-th region, the dielectric coefficients along the x-, y-, and z-axes are represented by $\varepsilon_{1x}(n)$, $\varepsilon_{1y}(n)$, and $\varepsilon_{1z}(n)$, respectively.
The method of moments is applied to solve Equations (3)–(10), employing pulse basis functions for expansion and the Dirac delta function for testing. By doing so, Equations (3)–(10) are converted into matrix form to facilitate numerical computations:
$$(\bar{E}^i_{1z}) = \left([I] - [G_1][\tau_z]\right)(\bar{E}_z) \quad (11)$$

$$(\bar{E}^s_z) = [G_2][\tau_z](\bar{E}_z) \quad (12)$$

$$\begin{bmatrix}(\bar{E}^i_{1x})\\(\bar{E}^i_{1y})\end{bmatrix} = \left(\begin{bmatrix}[I]&0\\0&[I]\end{bmatrix} - \begin{bmatrix}[G_3]&[G_4]\\ [G_4]&[G_5]\end{bmatrix}\begin{bmatrix}[\tau_x]&0\\0&[\tau_y]\end{bmatrix}\right)\begin{bmatrix}(\bar{E}_x)\\(\bar{E}_y)\end{bmatrix} \quad (13)$$

$$\begin{bmatrix}(\bar{E}^s_x)\\(\bar{E}^s_y)\end{bmatrix} = \begin{bmatrix}[G_6]&[G_7]\\ [G_7]&[G_8]\end{bmatrix}\begin{bmatrix}[\tau_x]&0\\0&[\tau_y]\end{bmatrix}\begin{bmatrix}(\bar{E}_x)\\(\bar{E}_y)\end{bmatrix} \quad (14)$$
The Green’s function matrix can be expressed as:
G 1 m n = j π k 0 a n 2 J 1 k 0 a n H 0 2 k 0 ρ m n , m n j 2 π k 0 a n H 1 2   k 0 a n 2 j , m = n
G 2 m n = j π k 0 a n 2 J 1 k 0 a n H 0 2 k 0 ρ m n
G 3 = j π a n J 1 k 0 a n 2 ρ 3 m n × k ρ m n y m y n 2 H 0 2 k 0 ρ m n + x m x n 2 y m y n 2 H 1 2 k 0 ρ m n ,   m n   j 4 π k 0 a n H 1 2 k 0 a n 4 j , m = n
G 4 = j π a n J 1 k 0 a n 2 ρ 3 m n x m x n y m y n × 2 H 1 2 k 0 ρ m n k 0 ρ m n H 0 2 k 0 ρ m n ,   m n   0 , m = n
G 5 = j π a n J 1 k 0 a n 2 ρ 3 m n × k 0 ρ m n x m x n 2 H 0 2 k 0 ρ m n + y m y n 2 x m x n 2 ,   m n   j 4 π k 0 a n H 1 2 k 0 a n 4 j , m = n
G 6 = j π a n J 1 k 0 a n 2 ρ 3 m n × k 0 ρ m n y m y n 2 H 0 2 k 0 ρ m n + x m x n 2 y m y n 2 H 1 2 k 0 ρ m n
G 7 m n = j π a n J 1 k 0 a n 2 ρ 3 m n x m x n y m y n 2 H 1 2 k 0 ρ m n ρ m n H 0 2 k 0 ρ m n
G 8 m n = j π a n J 1 k 0 a n 2 ρ 3 m n k 0 ρ m n x m x n 2 H 0 2 k 0 ρ m n + y m y n 2 x m x n 2 H 1 2 k 0 ρ m n
The parameter $\rho_{mn} = \sqrt{(x_m-x_n)^2+(y_m-y_n)^2}$ denotes the Euclidean distance between the observation point $(x_m, y_m)$ and the source point $(x_n, y_n)$, and $a_n$ is the equivalent radius of the n-th cell. Here, $H_0^{(2)}$ and $H_1^{(2)}$ represent the zeroth- and first-order Hankel functions of the second kind, respectively, while $J_1$ refers to the Bessel function of the first kind of order one.
The total field column vectors, denoted as $(\bar{E}_x)$, $(\bar{E}_y)$, and $(\bar{E}_z)$, each consist of N elements, as do the incident field column vectors $(\bar{E}^i_{1x})$, $(\bar{E}^i_{1y})$, and $(\bar{E}^i_{1z})$. Meanwhile, the scattered field column vectors $(\bar{E}^s_x)$, $(\bar{E}^s_y)$, and $(\bar{E}^s_z)$ are composed of M elements, where M corresponds to the number of measurement points. The system matrices $[G_1]$, $[G_3]$, $[G_4]$, and $[G_5]$ are $N \times N$ square matrices, while $[G_2]$, $[G_6]$, $[G_7]$, and $[G_8]$ are $M \times N$ matrices. The diagonal matrices $[\tau_x]$, $[\tau_y]$, and $[\tau_z]$ are constructed from the dielectric coefficients, specifically $[\tau_x]_{nn} = \varepsilon_{1x}(\bar{r})-1$, $[\tau_y]_{nn} = \varepsilon_{1y}(\bar{r})-1$, and $[\tau_z]_{nn} = \varepsilon_{1z}(\bar{r})-1$. The identity matrix $[I]$ has dimensions $N \times N$. The forward scattering problem under TM polarization is solved by applying Equations (11) and (12), whereas the corresponding formulation for the TE case is derived using Equations (13) and (14).
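The TM forward solve implied by Equations (11) and (12) can be sketched in a few lines: given the contrast, invert the linear system for the total field, then map it to the measurement points. The sketch below (not the paper's code) uses random, non-physical matrices purely to exercise the shapes; real $[G_1]$ and $[G_2]$ would come from the Green's matrix formulas above.

```python
import numpy as np

def tm_forward(G1, G2, tau_z, E_inc):
    """TM forward solve in the matrix form of Equations (11) and (12):
    solve (I - G1 diag(tau_z)) E_tot = E_inc for the total field, then
    form the scattered field E_sct = G2 diag(tau_z) E_tot."""
    T = np.diag(tau_z)
    E_tot = np.linalg.solve(np.eye(len(tau_z)) - G1 @ T, E_inc)
    return E_tot, G2 @ T @ E_tot

# Toy, non-physical matrices: N = 16 object cells, M = 8 receivers.
rng = np.random.default_rng(0)
N, M = 16, 8
G1 = 0.1 * (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N)))
G2 = rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))
tau = 0.5 * rng.random(N)                 # contrast eps_1z - 1 in [0, 0.5]
E_i = np.exp(-1j * 2 * np.pi * rng.random(N))
E_t, E_s = tm_forward(G1, G2, tau, E_i)
```

The TE case follows the same pattern with the $2\times 2$ block matrices of Equations (13) and (14) in place of $[G_1]$ and $[G_2]$.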

2.2. Inverse Problem

In solving the inverse problem, the scattered field is collected from the region exterior to the object. It is noteworthy that, for the TM mode, the only unknown parameter is the dielectric component ε 1 z , whereas for the TE mode, both ε 1 x and ε 1 y must be reconstructed. As a result, we treat the estimation of the dielectric distribution separately, recovering ε 1 z under TM illumination and ε 1 x , ε 1 y under TE excitation. The method of moments is employed to discretize the integral formulation into a matrix system. Subsequently, the BPS and Dominant Current Scheme (DCS) are used to generate an initial guess for the dielectric tensor. This coarse reconstruction is then refined using a U-Net-based deep learning architecture, allowing for improved accuracy in estimating the spatial permittivity distribution.
BPS
In this section, the measured scattered field data is utilized to estimate an initial permittivity distribution using the BPS, which helps streamline the subsequent network training. Our findings indicate that the BPS is particularly effective in reconstructing the scattered field for weak scatterers. This approach operates under the assumption that the back-propagated field maintains a proportional relationship with the induced currents $(I^b_z)$, $(I^b_x)$, and $(I^b_y)$:

$$(I^b_z) = \Upsilon_m \cdot [G_2]^H (\bar{E}^s_z) \quad (23)$$

where $H$ denotes the conjugate transpose. Similarly, the induced currents $(I^b_x)$ and $(I^b_y)$ can be written as:

$$\begin{bmatrix}(I^b_x)\\(I^b_y)\end{bmatrix} = \Upsilon_e \cdot \begin{bmatrix}[G_6]&[G_7]\\ [G_7]&[G_8]\end{bmatrix}^H \begin{bmatrix}(\bar{E}^s_x)\\(\bar{E}^s_y)\end{bmatrix} \quad (24)$$

Based on Equation (12), the loss function can be defined as:

$$L^b_m(\Upsilon_m) = \left\|(\bar{E}^s_z) - [G_2]\cdot\Upsilon_m\cdot[G_2]^H(\bar{E}^s_z)\right\|^2 \quad (25)$$

Similarly, based on Equation (14), the loss function $L^b_e$ can be defined as:

$$L^b_e(\Upsilon_e) = \left\|\begin{bmatrix}(\bar{E}^s_x)\\(\bar{E}^s_y)\end{bmatrix} - \begin{bmatrix}[G_6]&[G_7]\\ [G_7]&[G_8]\end{bmatrix}\cdot\Upsilon_e\cdot\begin{bmatrix}[G_6]&[G_7]\\ [G_7]&[G_8]\end{bmatrix}^H\begin{bmatrix}(\bar{E}^s_x)\\(\bar{E}^s_y)\end{bmatrix}\right\|^2 \quad (26)$$

Each loss function reaches its minimum when its gradient equals zero. Under this condition, the closed-form solutions for $\Upsilon_m$ and $\Upsilon_e$ can be derived as follows:

$$\Upsilon_m = \frac{(\bar{E}^s_z)^T\cdot\left([G_2][G_2]^H(\bar{E}^s_z)\right)^*}{\left\|[G_2][G_2]^H(\bar{E}^s_z)\right\|^2} \quad (27)$$

$$\Upsilon_e = \frac{\begin{bmatrix}(\bar{E}^s_x)\\(\bar{E}^s_y)\end{bmatrix}^T\cdot\left(\begin{bmatrix}[G_6]&[G_7]\\ [G_7]&[G_8]\end{bmatrix}\begin{bmatrix}[G_6]&[G_7]\\ [G_7]&[G_8]\end{bmatrix}^H\begin{bmatrix}(\bar{E}^s_x)\\(\bar{E}^s_y)\end{bmatrix}\right)^*}{\left\|\begin{bmatrix}[G_6]&[G_7]\\ [G_7]&[G_8]\end{bmatrix}\begin{bmatrix}[G_6]&[G_7]\\ [G_7]&[G_8]\end{bmatrix}^H\begin{bmatrix}(\bar{E}^s_x)\\(\bar{E}^s_y)\end{bmatrix}\right\|^2} \quad (28)$$

where $T$ represents the transpose operation, while the asterisk $*$ indicates the complex conjugate. Once $\Upsilon_m$ is determined, the corresponding induced current can be evaluated using Equation (23). The back-propagated total field $(\bar{E}^b_z)$ is subsequently calculated from Equation (11) as follows:

$$(\bar{E}^b_z) = (\bar{E}^i_{1z}) + [G_1](I^b_z) \quad (29)$$

In a similar manner, the induced currents are derived from Equation (24). The back-propagated total field components $(\bar{E}^b_x)$ and $(\bar{E}^b_y)$ are then computed based on Equation (13), given by:

$$\begin{bmatrix}(\bar{E}^b_x)\\(\bar{E}^b_y)\end{bmatrix} = \begin{bmatrix}(\bar{E}^i_{1x})\\(\bar{E}^i_{1y})\end{bmatrix} + \begin{bmatrix}[G_3]&[G_4]\\ [G_4]&[G_5]\end{bmatrix}\begin{bmatrix}(I^b_x)\\(I^b_y)\end{bmatrix} \quad (30)$$

The relationship between $(I^b_{z,p})$ and the contrast $\tau^b_z$ is shown below:

$$(I^b_{z,p}) = \operatorname{diag}(\tau^b_z)\,(\bar{E}^b_{z,p}) \quad (31)$$

while the relationship between $(I^b_{x,p})$ and $(I^b_{y,p})$ and the contrasts $\tau^b_x$ and $\tau^b_y$ can be expressed as:

$$\begin{bmatrix}(I^b_{x,p})\\(I^b_{y,p})\end{bmatrix} = \begin{bmatrix}\operatorname{diag}(\tau^b_x)&0\\0&\operatorname{diag}(\tau^b_y)\end{bmatrix}\begin{bmatrix}(\bar{E}^b_{x,p})\\(\bar{E}^b_{y,p})\end{bmatrix} \quad (32)$$

where the subscript $p$ indexes the incident waves, and $\tau^b_z$, $\tau^b_x$, and $\tau^b_y$ correspond to the dielectric coefficient distributions derived through back-propagation. By applying the least squares method to all incident cases in Equation (31), we can derive the analytical solution for the n-th element of the contrast $\tau^b_z$:

$$\tau^b_z(n) = \frac{\sum_{p=1}^{N_i} I^b_{z,p}(n)\cdot\left(E^b_{z,p}(n)\right)^*}{\sum_{p=1}^{N_i}\left|E^b_{z,p}(n)\right|^2} \quad (33)$$

Likewise, incorporating all incidences from Equation (32) yields a least squares problem. The n-th components of the contrast terms $\tau^b_x$ and $\tau^b_y$ can be explicitly obtained as follows:

$$\tau^b_x(n) = \frac{\sum_{p=1}^{N_i} I^b_{x,p}(n)\cdot\left(E^b_{x,p}(n)\right)^*}{\sum_{p=1}^{N_i}\left|E^b_{x,p}(n)\right|^2}, \qquad \tau^b_y(n) = \frac{\sum_{p=1}^{N_i} I^b_{y,p}(n)\cdot\left(E^b_{y,p}(n)\right)^*}{\sum_{p=1}^{N_i}\left|E^b_{y,p}(n)\right|^2} \quad (34)$$

where $N_i$ is the number of incident waves.
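The TM branch of the BPS pipeline above (back-propagate each measurement, form the closed-form scaling factor, rebuild the total field, then least-squares-fit the contrast over all incidences) can be sketched as follows. This is an illustrative reimplementation, not the paper's code, and it again uses toy matrices only to exercise the shapes:

```python
import numpy as np

def bps_tm(G1, G2, E_inc, E_sct):
    """Initial TM contrast estimate via the Back Propagation Scheme.
    E_inc is (Ni, N) and E_sct is (Ni, M): one row per incident wave."""
    num = np.zeros(E_inc.shape[1], dtype=complex)
    den = np.zeros(E_inc.shape[1])
    for p in range(E_sct.shape[0]):
        Es = E_sct[p]
        v = G2 @ (G2.conj().T @ Es)                         # [G2][G2]^H Es
        gamma_m = (Es @ v.conj()) / np.linalg.norm(v) ** 2  # closed-form scaling
        I_b = gamma_m * (G2.conj().T @ Es)                  # back-propagated current
        E_b = E_inc[p] + G1 @ I_b                           # back-propagated total field
        num += I_b * E_b.conj()                             # least-squares numerator
        den += np.abs(E_b) ** 2                             # least-squares denominator
    return num / den                                        # tau_z^b per cell

# Toy shapes: Ni = 4 incidences, N = 9 cells, M = 6 receivers.
rng = np.random.default_rng(1)
G1 = 0.05 * (rng.standard_normal((9, 9)) + 1j * rng.standard_normal((9, 9)))
G2 = rng.standard_normal((6, 9)) + 1j * rng.standard_normal((6, 9))
E_inc = np.exp(-1j * 2 * np.pi * rng.random((4, 9)))
E_sct = rng.standard_normal((4, 6)) + 1j * rng.standard_normal((4, 6))
tau_b = bps_tm(G1, G2, E_inc, E_sct)
```

The TE branch follows the same steps with the $2\times 2$ block Green's matrices and two field components per cell.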
The forward model in this study is rigorously derived from Maxwell’s equations, and the scattered field is computed using the Method of Moments (MoM). This numerical formulation inherently satisfies the physical principles of reciprocity and continuity, owing to the mathematical properties of the Green’s function and the enforcement of boundary conditions. The initial estimate of the permittivity distribution, obtained via the Back Propagation Scheme (BPS), is directly based on these physically consistent fields, thereby ensuring adherence to energy conservation at the initialization stage.

3. Neural Network

CNN is a type of feedforward architecture that operates based on three core concepts: sparse connectivity, parameter sharing, and equivariance to translation. Its standard structure includes convolutional layers, pooling layers, activation functions such as ReLU, and fully connected layers. Among these, the convolutional layer is essential for feature extraction from the input data. The process typically begins by representing the input image in matrix form, allowing the network to learn spatial patterns through localized filters. Various convolutional kernels slide across this matrix, performing calculations that yield a transformed image. Each resulting feature map retains information from the original input image, while parameters such as kernel size and stride length significantly influence the operation of this layer.

Pooling layers down-sample the spatial resolution of feature maps, effectively decreasing the dimensionality of the data. This reduction leads to fewer trainable parameters, which in turn helps prevent overfitting during model training. Pooling operations, such as max pooling, extract the most significant values from regions within the feature maps, minimizing redundant information while ensuring that the sliding windows do not overlap. In the fully connected layer, all neurons are interconnected, allowing the network to perform classification based on the extracted features. The ReLU layer enhances the network's ability to model nonlinear relationships. This activation function introduces nonlinearity, aiding in better function approximation during training, though its direct impact on final results may not always be highly pronounced. A notable example of CNN application is the Fully Convolutional Network (FCN) approach introduced by the University of Freiburg in 2015. Their image segmentation method demonstrated that, despite training with a limited dataset, the approach could still achieve highly precise segmentation results [18,19].
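The three operations described above (convolution, ReLU, non-overlapping max pooling) can be sketched in plain NumPy; this is a didactic toy, not the paper's network, and the 2 × 2 edge kernel is purely illustrative:

```python
import numpy as np

def conv2d(image, kernel, stride=1):
    """Valid 2-D convolution (cross-correlation, as in CNN frameworks):
    the kernel slides across the input matrix and each output value is
    the sum of an element-wise product with the covered patch."""
    kh, kw = kernel.shape
    H = (image.shape[0] - kh) // stride + 1
    W = (image.shape[1] - kw) // stride + 1
    out = np.empty((H, W))
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(
                image[i*stride:i*stride+kh, j*stride:j*stride+kw] * kernel)
    return out

def relu(x):
    """Element-wise nonlinearity: negative responses are clipped to zero."""
    return np.maximum(x, 0.0)

def max_pool(x, size=2):
    """Non-overlapping max pooling: keeps the largest value per window."""
    H, W = x.shape[0] // size, x.shape[1] // size
    return x[:H*size, :W*size].reshape(H, size, W, size).max(axis=(1, 3))

img = np.arange(36, dtype=float).reshape(6, 6)   # toy 6x6 input "image"
kern = np.array([[-1.0, -1.0], [1.0, 1.0]])      # horizontal-edge kernel
feat = max_pool(relu(conv2d(img, kern)))         # conv -> ReLU -> 2x2 pooling
```

On this ramp-like input the kernel responds uniformly (every 2 × 2 patch has the same vertical gradient), so the pooled feature map is constant, which makes the shape bookkeeping of the three stages easy to follow.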
The selection of the U-Net architecture is motivated by the following considerations:
  • U-Net exhibits strong generative capabilities, allowing it to achieve favorable training outcomes even with limited training data.
  • To address the strong correlation between input and output features, skip connections are introduced to directly link corresponding layers. This technique helps alleviate the vanishing gradient problem commonly encountered during network training.
  • The down-sampling shrinking network in U-Net enhances receptive field coverage, thereby improving the accuracy of pixel-wise predictions.
  • Batch normalization layers accelerate training by stabilizing gradient updates and reducing sensitivity to parameter initialization, leading to improved network performance.
  • As established in [20], under the TM configuration, the reconstruction of ε 1 z is feasible when the incident field consists solely of the E 1 z i component. In contrast, the TE case involves incident fields E 1 x i and E 1 y i . For uniaxial objects, where ε 1 x and ε 1 y are equal, these components can be individually recovered, provided that their corresponding incident fields are sufficiently strong to excite the desired responses.
This section provides a detailed explanation of the GAN architecture shown in Figure 2. We begin by generating an initial estimate of the image, denoted as $\varepsilon_1$. This estimated image is fed into the generator to produce the output image $G_\theta(\varepsilon_1)$. Subsequently, the discriminator evaluates the difference between the real image $Y$ and the generated image $G_\theta(\varepsilon_1)$, which is used to compute the loss function. Here, $\theta$ represents the unknown parameters of the generative network. Moreover, $D_\phi$ represents the discriminator network, where $\phi$ denotes the unknown parameters of the discriminator. The output of this network is a discriminative matrix. The generative network and the discriminator are trained alternately in a competitive manner, with each network updating in response to the other.

The GAN is composed of an SRNN generator with an SA mechanism and a discriminator. First, the overall architecture of the SRNN with the SA mechanism, shown in Figure 3, takes the image input and passes it through the encoder. The encoded image and the image processed by the SRNN are then combined and fed into the discriminator. Next, the structure of the SRNN generator with SA in Figure 4 is introduced. The SRNN generator adopts a CNN architecture consisting of a contracting network on the left and an expanding network on the right. The contracting network incorporates repeated 3 × 3 convolution layers, BN layers, and ReLU layers, along with 2 × 2 max pooling layers. The expanding network adds 3 × 3 up-convolution layers, BN layers, and ReLU layers, followed by a 1 × 1 convolution layer serving as the fully connected layer. SA is incorporated at the output of the SRNN to enhance the quality of the reconstructed image.

Next, the SA mechanism shown in Figure 5 is described in detail. It consists of three 3 × 3 convolution layers, two of which pass through a Matrix Multiplication Layer (MML) and a SoftMax layer. The MML serves as a fully connected layer. The results generated by the generator are fed into the discriminator for evaluation. Finally, the architecture of the GAN discriminator in Figure 6 is presented. It consists of an initial 3 × 3 convolution layer followed by three additional 3 × 3 convolution layers with BN and ReLU layers. The discriminator receives as input the image generated by the generative network and assigns a score based on its evaluation. This score determines whether the generative network needs to adjust its training weights. The process continues iteratively until a satisfactory equilibrium is reached. In GANs, the loss function of the generative network, denoted as $L^G_{GAN}$, is defined in Equation (35).
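The self-attention computation just described can be sketched in NumPy. The sketch below is a simplification and not the paper's layer: the three convolution branches are replaced by plain linear (1 × 1-style) projections, and the residual connection at the output is an assumption typical of SA-GAN designs rather than something stated in the text:

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable SoftMax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(feat, Wq, Wk, Wv):
    """Self-attention over a flattened feature map feat of shape (H*W, C).
    The three projections stand in for the three convolution branches;
    two branches are combined by matrix multiplication (the MML) and
    normalized by SoftMax to form an attention map, which then reweights
    the third (value) branch."""
    Q, K, V = feat @ Wq, feat @ Wk, feat @ Wv
    attn = softmax(Q @ K.T, axis=-1)     # (H*W, H*W) attention weights
    return attn @ V + feat               # residual connection (assumed)

rng = np.random.default_rng(2)
x = rng.standard_normal((16, 8))         # a 4x4 map with 8 channels, flattened
Wq, Wk, Wv = [0.1 * rng.standard_normal((8, 8)) for _ in range(3)]
y = self_attention(x, Wq, Wk, Wv)
```

Because every output position is a SoftMax-weighted sum over all positions, the layer lets the generator correlate distant pixels, which is the mechanism credited here with sharpening the reconstructed permittivity maps.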
$$L^G_{GAN}(\theta|\phi) = L_{RMSE}(\theta) + \gamma L_A(\theta|\phi) \quad (35)$$

where $L_{RMSE}(\theta)$ represents the error between the reconstructed image and the reference image. The parameter $\gamma$ serves as a weighting factor to balance the two loss components. $L_A(\theta|\phi)$ denotes the adversarial loss function of the discriminator network, which is formulated as follows:

$$L_A(\theta|\phi) = \frac{1}{N}\sum_{i=1}^{N}\left\|D_\phi\!\left(G_\theta(\varepsilon_1^i)\right) - 1\right\| \quad (36)$$

$L_A$ represents the scoring mechanism of the discriminator network, which evaluates whether the reconstructed image accurately reflects the true image. $N$ denotes the number of data points included in each training batch.

The loss function associated with the discriminator network in the GAN is formulated as follows:

$$L^D_{GAN}(\phi|\theta) = \frac{1}{2N}\sum_{i=1}^{N}\left[\left\|D_\phi(Y_i)-1\right\|_2^2 + \left\|D_\phi\!\left(G_\theta(\varepsilon_1^i)\right)\right\|_2^2\right] \quad (37)$$

where $\phi$ represents the unknown parameters of the discriminator and $\theta$ denotes the weight parameters of the generator. $Y_i$ and $\varepsilon_1^i$ refer to the ground-truth image and the initial estimate fed to the generator, respectively. As $D_\phi$ and $G_\theta$ are alternately optimized in a competitive fashion, the output of the generator $G_\theta(\varepsilon_1^i)$ comes to closely resemble the real image, making it indistinguishable from the actual data to the discriminator $D_\phi$. Through repeated cycles of generation and discrimination, the model eventually reaches a Nash equilibrium, at which point both the generator and discriminator are sufficiently robust to accurately reconstruct the dielectric permittivity tensor of anisotropic objects.
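The generator and discriminator objectives above reduce to a few lines of arithmetic once the discriminator scores are available. The sketch below is illustrative only; the weighting value `gamma=0.01` is a hypothetical placeholder, not the paper's setting:

```python
import numpy as np

def adversarial_loss(d_fake):
    """Adversarial term: distance of the discriminator's scores on
    generated images from the 'real' label 1."""
    return np.mean(np.abs(np.asarray(d_fake, dtype=float) - 1.0))

def generator_loss(recon, target, d_fake, gamma=0.01):
    """Generator objective: reconstruction RMSE plus the weighted
    adversarial term (gamma is a hypothetical weighting value)."""
    err = np.asarray(recon, dtype=float) - np.asarray(target, dtype=float)
    return np.sqrt(np.mean(err ** 2)) + gamma * adversarial_loss(d_fake)

def discriminator_loss(d_real, d_fake):
    """Discriminator objective: least-squares terms pushing scores on
    real images toward 1 and scores on generated images toward 0."""
    d_real = np.asarray(d_real, dtype=float)
    d_fake = np.asarray(d_fake, dtype=float)
    return 0.5 * np.mean((d_real - 1.0) ** 2 + d_fake ** 2)

# Scores near the adversarial equilibrium: real ~ 1, fake ~ 0.
loss_d = discriminator_loss([0.9, 1.1], [0.1, -0.1])
loss_g = generator_loss([1.2, 1.4], [1.2, 1.5], [0.1, -0.1])
```

Note how the discriminator loss shrinks as real scores approach 1 and fake scores approach 0, while the generator's adversarial term shrinks only when the fake scores are pushed toward 1, which is the opposing pressure that drives the alternating training.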
Notably, the incorporation of the SA mechanism significantly improves image reconstruction accuracy.

4. Numerical Results

This section outlines the simulation framework designed to investigate the inverse scattering characteristics of uniaxial dielectric objects situated in free space. The primary objective is to reconstruct the spatial distribution of the relative permittivity within anisotropic media. To accomplish this, both TM- and TE-polarized plane waves are continuously applied to illuminate the domain containing the target, thereby collecting a comprehensive set of scattered field data necessary for the reconstruction process.
Since each network operates independently during training, GPU parallelization is highly suitable and computationally efficient. The ADAM optimizer is configured with an initial learning rate of $10^{-4}$, which is reduced by a factor of 10 at epoch 70. The mini-batch size is 16, and training runs for up to 200 epochs. The training data is shuffled at the beginning of each epoch to ensure robust learning. The learning rate is kept small because higher values result in unstable training dynamics and poor convergence, whereas lower values lead to sluggish learning and suboptimal reconstruction performance. The batch size of 16 balances training stability and computational efficiency: empirically, smaller batch sizes introduce noisy gradient updates and destabilize the training process, while larger batch sizes impose higher memory demands and prolong convergence. The training parameter configuration is shown in Table 1.
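The stated schedule amounts to a piecewise-constant learning rate with one drop. A minimal sketch (function and key names are hypothetical; only the numeric values come from the text) is:

```python
# ADAM with initial learning rate 1e-4, dropped by a factor of 0.1 at
# epoch 70, mini-batches of 16, up to 200 epochs, shuffling every epoch.
def learning_rate(epoch, initial_lr=1e-4, drop_epoch=70, drop_factor=0.1):
    """Piecewise-constant schedule: a single drop at drop_epoch."""
    return initial_lr * (drop_factor if epoch >= drop_epoch else 1.0)

training_config = {
    "optimizer": "adam",
    "initial_learning_rate": 1e-4,
    "learning_rate_drop_epoch": 70,
    "learning_rate_drop_factor": 0.1,
    "mini_batch_size": 16,
    "max_epochs": 200,
    "shuffle_every_epoch": True,
}

lr_early, lr_late = learning_rate(0), learning_rate(100)
```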
To evaluate the effectiveness of each method, we utilize the Root Mean Square Error (RMSE), defined as:
$$RMSE = \frac{1}{M_t}\sum_{i=1}^{M_t}\frac{\left\|\bar{\bar{\varepsilon}}_r - \bar{\bar{\varepsilon}}_r^r\right\|_F}{\left\|\bar{\bar{\varepsilon}}_r\right\|_F} \quad (38)$$

where $\bar{\bar{\varepsilon}}_r$ denotes the ground truth of the relative permittivity distribution, whereas $\bar{\bar{\varepsilon}}_r^r$ refers to the reconstructed result. The parameter $M_t$ indicates the total number of test samples, and $\|\cdot\|_F$ represents the Frobenius norm. To provide a more comprehensive assessment of reconstruction quality across different methods, the Structural Similarity Index Measure (SSIM) is also utilized, mathematically expressed as follows:

$$SSIM = \frac{\left(2\mu_{\tilde{y}}\mu_y + C_1\right)\left(2\sigma_{\tilde{y}y} + C_2\right)}{\left(\mu_{\tilde{y}}^2 + \mu_y^2 + C_1\right)\left(\sigma_{\tilde{y}}^2 + \sigma_y^2 + C_2\right)} \quad (39)$$

where $\tilde{y}$ and $y$ represent the reconstructed and ground-truth permittivity distributions, respectively. The terms $\mu_{\tilde{y}}$ and $\mu_y$ denote the mean values of $\tilde{y}$ and $y$, $\sigma_{\tilde{y}}^2$ and $\sigma_y^2$ correspond to their variances, and $\sigma_{\tilde{y}y}$ indicates the covariance between $\tilde{y}$ and $y$. The constants $C_1 = (K_1 D)^2$ and $C_2 = (K_2 D)^2$ are added to prevent division by zero, with $K_1 = 0.01$, $K_2 = 0.03$, and $D$ the dynamic range of pixel intensities in the target image $y$ [21].
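The two metrics above are straightforward to compute; the sketch below (illustrative, not the paper's evaluation code) implements the per-sample relative Frobenius error and a global SSIM evaluated over the whole image rather than a sliding window:

```python
import numpy as np

def rmse_single(eps_true, eps_rec):
    """Relative Frobenius-norm error for one sample; the reported RMSE
    averages this quantity over the M_t test samples."""
    return np.linalg.norm(eps_true - eps_rec) / np.linalg.norm(eps_true)

def ssim_global(y_rec, y_true, K1=0.01, K2=0.03):
    """Global SSIM over the whole image (no sliding window), with D the
    dynamic range of the ground-truth image."""
    D = y_true.max() - y_true.min()
    C1, C2 = (K1 * D) ** 2, (K2 * D) ** 2
    mu_r, mu_t = y_rec.mean(), y_true.mean()
    cov = ((y_rec - mu_r) * (y_true - mu_t)).mean()
    return ((2 * mu_r * mu_t + C1) * (2 * cov + C2)) / \
           ((mu_r**2 + mu_t**2 + C1) * (y_rec.var() + y_true.var() + C2))

truth = np.linspace(1.0, 1.5, 64).reshape(8, 8)   # toy permittivity map
noisy = truth + 0.01                              # slightly biased "reconstruction"
```

A perfect reconstruction gives RMSE 0 and SSIM 1, and any mean bias or decorrelation pulls the SSIM below 1, which is why the two metrics complement each other in Tables 2 and 3.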

4.1. The Permittivity Is Between 1 and 1.5 with 20% Noise

In this simulation, the dielectric permittivity values are randomly assigned within the range of 1.0 to 1.5, with 20% Gaussian noise introduced to simulate measurement uncertainty. The scatterers are assumed to possess one of 10 distinct permittivity distributions and can be located at any of 50 predefined positions within the observation domain. Consequently, each case produces a dataset of 500 images (10 patterns × 50 positions). The dataset is divided into 80% for training and 20% for testing. To generate preliminary estimates of the dielectric distribution, we apply the BPS method and feed the resulting data into the GAN both with and without SA. Finally, we evaluate and compare the reconstruction performance of both methods.
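The dataset construction just described (10 patterns × 50 positions, split 80/20) can be sketched as follows. The map size, scatterer footprint, and position grid below are hypothetical placeholders chosen only to make the counts work out; the paper does not specify them in this section:

```python
import numpy as np

rng = np.random.default_rng(3)

# 10 permittivity patterns x 50 positions = 500 samples. Each toy sample
# is a 12x12 map holding a 2x2 scatterer with relative permittivity
# drawn uniformly from [1.0, 1.5] (sizes are illustrative assumptions).
patterns = rng.uniform(1.0, 1.5, size=(10, 2, 2))
positions = [(i, j) for i in range(10) for j in range(5)]   # 50 spots

samples = []
for pat in patterns:
    for (i, j) in positions:
        img = np.ones((12, 12))        # free-space background (eps = 1)
        img[i:i+2, j:j+2] = pat        # place the scatterer
        samples.append(img)
samples = np.stack(samples)

# 80/20 train/test split with shuffling, as used before GAN training.
idx = rng.permutation(len(samples))
train, test = samples[idx[:400]], samples[idx[400:]]
```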
Figure 7a,b show the original ε 1 z and ε 1 x dielectric coefficient distributions of the object with 20% noise. Figure 8a,b present the ε 1 z and ε 1 x distributions reconstructed by the GAN with 20% noise, using the initial estimates obtained through BPS. The results in Figure 8 indicate that the general shape of the dielectric distribution is identifiable. Figure 9a,b show the ε 1 z and ε 1 x distributions reconstructed by the GAN with SA under 20% noise. The reconstructions in Figure 9 demonstrate improved visual quality. The corresponding RMSE and SSIM values are summarized in Table 2, which clearly shows that the GAN with SA provides better reconstruction performance than the GAN.

4.2. The Permittivity Is Between 1.5 and 2 with 5% Noise

In this case, the dielectric coefficient is assigned values ranging from 1.5 to 2, and 5% Gaussian noise is added to the simulation environment. The training and testing sets remain the same as in Section 4.1. To generate preliminary estimates of the dielectric distribution, we apply the BPS method and feed the resulting data into the GAN both without and with SA. Finally, we evaluate and compare the reconstruction performance of both methods.
Figure 10a,b illustrate the ground truth distributions of the ε 1 z and ε 1 x dielectric coefficients, respectively, with 5% Gaussian noise added. In Figure 11a,b, the corresponding ε 1 z and ε 1 x reconstructions generated by the standard GAN model under the same noise conditions are shown. As observed in Figure 11, the reconstructed dielectric profiles tend to be blurry and lack structural clarity. In contrast, Figure 12a,b present the reconstruction results obtained using the proposed GAN with SA approach, again under 5% noise, for both ε 1 z and ε 1 x cases. The reconstructions in Figure 12 are noticeably clearer than those in Figure 11. The corresponding RMSE and SSIM values are summarized in Table 3, which clearly shows that the GAN with SA mechanism provides better reconstruction performance than the GAN.

4.3. The Permittivity Is Between 2 and 2.5 with 5% Noise

In this case, the dielectric coefficient is assigned values ranging from 2 to 2.5, and the simulation setup incorporates 5% Gaussian noise to account for measurement uncertainty. For input data, a subset of handwritten digits (0–9) is randomly selected from the Modified National Institute of Standards and Technology (MNIST) database, a widely used benchmark dataset in the field of handwriting recognition. Each MNIST image consists of a 28 × 28 grayscale pixel array, and the dataset includes approximately 70,000 samples, roughly 7,000 per digit. To introduce variability, every group of 50 images corresponds to a distinct handwriting style, and each style is further augmented by applying 50 different rotation angles. Owing to its simplicity and representativeness, MNIST has become a standard resource for training and evaluating image-based neural network models.
For each digit class, 50 samples are selected, resulting in a dataset comprising 500 images in total ( 10 × 50 ). The dataset is partitioned with 80 % allocated for training and the remaining 20 % reserved for testing. BPS is first applied to preprocess the data, which is subsequently used to train both the baseline GAN and the GAN with SA models. The reconstruction performance of both architectures is then quantitatively evaluated and compared.
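The rotation-based augmentation of the digit images might look like the following sketch. Here `rotate_nn` is a stand-in nearest-neighbour rotation (in practice a library routine such as scipy.ndimage.rotate would be used), and the placeholder arrays substitute for real MNIST samples.

```python
import numpy as np

def rotate_nn(img, deg):
    """Nearest-neighbour rotation about the image centre; an
    illustrative stand-in for a library rotation routine."""
    h, w = img.shape
    cy, cx = (h - 1) / 2, (w - 1) / 2
    th = np.deg2rad(deg)
    ys, xs = np.mgrid[0:h, 0:w]
    # Inverse mapping: find the source pixel that lands at (y, x).
    sy = np.round(cy + (ys - cy) * np.cos(th) + (xs - cx) * np.sin(th)).astype(int)
    sx = np.round(cx - (ys - cy) * np.sin(th) + (xs - cx) * np.cos(th)).astype(int)
    out = np.zeros_like(img)
    ok = (sy >= 0) & (sy < h) & (sx >= 0) & (sx < w)
    out[ok] = img[sy[ok], sx[ok]]
    return out

# 50 samples per style, each assigned one of 50 rotation angles.
angles = np.linspace(0, 360, 50, endpoint=False)
digits = np.stack([np.eye(28) for _ in range(50)])   # placeholder images
augmented = np.stack([rotate_nn(d, a) for d, a in zip(digits, angles)])
```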
Figure 13a,b show the original ε 1 z and ε 1 x dielectric coefficient distributions of the object with 5% noise. Figure 14a,b present the ε 1 z and ε 1 x distributions reconstructed by the GAN with 5% noise. The results in Figure 14 indicate that the general shape of the dielectric distribution is recognizable. Figure 15a,b display the ε 1 z and ε 1 x distributions reconstructed by the GAN with SA under 5% noise. The reconstructions in Figure 15 demonstrate that the image is clearly identifiable from the permittivity distribution and, more importantly, closely resembles the original distribution. The corresponding RMSE and SSIM values are summarized in Table 4, which clearly shows that the GAN with SA provides better reconstruction performance than the GAN.
Based on the simulation results from the three scenarios discussed above, we discover that lower dielectric coefficient values result in better reconstruction outcomes. Moreover, a more uniform distribution of the dielectric coefficient leads to improved reconstruction performance.
The experiments are conducted on a desktop equipped with an Intel Core i7-14700K CPU and 64 GB RAM. With the SA module enabled, training takes approximately 78 min with peak RAM usage of 1.3 GB. Without the SA module, training time is 76 min with 1.2 GB RAM usage. For both configurations, inference on a single sample takes less than 0.3 s, demonstrating that the trained model is suitable for near real-time image reconstruction.
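The sub-0.3 s inference budget reported above can be verified by averaging wall-clock time over repeated runs, as in this hypothetical helper; `model_fn` stands in for the trained generator's forward pass.

```python
import time

def time_inference(model_fn, sample, n_runs=10):
    """Average wall-clock inference time per sample (hypothetical
    helper; model_fn is any callable taking one input)."""
    t0 = time.perf_counter()
    for _ in range(n_runs):
        model_fn(sample)
    return (time.perf_counter() - t0) / n_runs

# A trivial placeholder callable keeps the sketch self-contained;
# with the real generator, time_inference(generator, x) < 0.3
# would match the reported budget.
latency = time_inference(lambda x: [v * 2 for v in x], [1.0, 2.0, 3.0])
```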

5. Conclusions

This paper proposes an electromagnetic imaging method for uniaxial objects based on a pixel basis. TM and TE waves are respectively emitted toward the uniaxial object, and the scattered fields are received. The initial estimate of the permittivity distribution, obtained via the BPS method, is subsequently input into both the standard GAN and the GAN with SA models to perform real-time reconstruction and comparative analysis. Numerical simulations indicate that, under identical noise conditions and parameter configurations, the GAN with SA yields more accurate reconstructions than the conventional GAN. This advantage becomes more pronounced in the TE case, where the vectorial characteristics of the electric field and the complexity of the associated Green’s function render the reconstruction task significantly more challenging than in the TM scenario. Furthermore, the permittivity in the cross-sectional plane is represented as a tensor, further increasing the difficulty of reconstruction. Although the BPS method alone provides only a straightforward, computationally efficient initial estimate of the permittivity distribution, integrating an SA mechanism at the end of the generator, together with the discriminator network, clearly enhances learning performance compared with the standalone GAN.
A key limitation of the proposed method arises when the target is embedded in more complex environments, such as half-space media or multilayered dielectric backgrounds. In such cases, the measured scattered fields are significantly influenced by the background stratification, which degrades the accuracy of the initial estimate provided by the BPS; this deterioration in the initial reconstruction may ultimately lead to suboptimal performance of the overall imaging system. Nevertheless, the proposed method holds strong potential for real-world applications, including biomedical imaging, security screening, and remote sensing. In future studies, there remains significant potential for AI to enhance electromagnetic imaging; for instance, implementing lightweight neural network architectures or segmenting scatterers into finer elements to enable high-resolution image reconstruction is expected to become a key area of research in the near term. Although the proposed framework has been validated using synthetic datasets with varying levels of noise and permittivity distributions, it is well suited for extension to practical settings, and in future work we intend to test the method on measured scattered field data collected from physical microwave imaging systems. The modular two-stage design, comprising a physics-based initialization and a self-attention-enhanced GAN, offers good adaptability to realistic measurement constraints such as limited view angles, background inhomogeneities, and hardware noise. Such adaptations will be essential for practical deployment in biomedical imaging, nondestructive testing, and security screening scenarios.

Author Contributions

Conceptualization, C.-C.C.; Methodology, P.-H.C.; Software, P.-H.C.; Validation, P.-H.C.; Formal analysis, Y.-H.C.; Resources, H.J.; Writing—original draft, Y.-H.C.; Writing—review & editing, C.-C.C.; Visualization, Y.-H.C.; Supervision, C.-C.C.; Project administration, H.J.; Funding acquisition, H.J. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Yin, T.; Pan, L.; Chen, X. Subspace-Based Distorted-Rytov Iterative Method for Solving Inverse Scattering Problems. IEEE Trans. Antennas Propag. 2023, 71, 8173–8183. [Google Scholar] [CrossRef]
  2. Wang, M.; Sun, S.; Dai, D.; Zhang, Y.; Su, Y. Cross-Correlated Subspace-Based Optimization Method for Solving Electromagnetic Inverse Scattering Problems. IEEE Trans. Antennas Propag. 2024, 72, 8575–8589. [Google Scholar] [CrossRef]
  3. Zhou, H.; Zhao, Y.; Wang, Y.; Han, D.; Hu, J. An Anderson Acceleration Inspired VBIM for Solving the Electromagnetic Inverse Scattering Problems. IEEE Trans. Antennas Propag. 2025, 73, 2585–2595. [Google Scholar] [CrossRef]
  4. Xiao, J.; Li, J.; Chen, Y.; Han, F.; Liu, Q.H. Fast electromagnetic inversion of inhomogeneous scatterers embedded in layered media by Born approximation and 3-D U-Net. IEEE Geosci. Remote Sens. Lett. 2020, 17, 1677–1681. [Google Scholar] [CrossRef]
  5. Li, H.; Chen, L.; Qiu, J. Convolutional neural networks for multifrequency electromagnetic inverse problems. IEEE Antennas Wirel. Propag. Lett. 2021, 20, 1424–1428. [Google Scholar] [CrossRef]
  6. Yang, L.; Wang, H.; Pang, M.; Jiang, Y.; Lin, H. Deep learning with attention mechanism for electromagnetic inverse scattering. In Proceedings of the 2022 IEEE International Symposium on Antennas and Propagation and USNC-URSI Radio Science Meeting (AP-S/URSI), Denver, CO, USA, 10–15 July 2022; pp. 1712–1713. [Google Scholar]
  7. Peng, A.; Zuo, Y.; Liu, W.; Guo, L. An electromagnetic inverse scattering imaging method based on U-net structure. In Proceedings of the 2022 International Applied Computational Electromagnetics Society Symposium (ACES-China), Xuzhou, China, 9–12 December 2022. [Google Scholar]
  8. Liu, J.; Wei, Z. Deep learning model of nonlinear electromagnetic inverse problem based on total variation. In Proceedings of the 2023 International Applied Computational Electromagnetics Society Symposium (ACES-China), Hangzhou, China, 15–18 August 2023; pp. 1–3. [Google Scholar]
  9. Saladi, P.; Kalepu, Y. Electromagnetic inverse scattering problem solved by DConvNet and adapted attention U-Net. In Proceedings of the 2023 IEEE 12th International Conference on Communication Systems and Network Technologies (CSNT), Bhopal, India, 8–9 April 2023. [Google Scholar]
  10. Hu, W.; Xie, G.; Dai, J.; Hou, G.; Li, X.; Huang, Z. An optimization method based on deep learning for electromagnetic inverse scattering problems. In Proceedings of the 2024 International Applied Computational Electromagnetics Society Symposium (ACES-China), Xi’an, China, 16–19 August 2024; pp. 1–3. [Google Scholar]
  11. Si, A.; Dai, D.; Wang, M.; Fang, F. Two steps electromagnetic quantitative inversion imaging based on convolutional neural network. In Proceedings of the 2024 5th International Conference on Geology, Mapping and Remote Sensing (ICGMRS), Wuhan, China, 12–14 April 2024; pp. 28–32. [Google Scholar]
  12. Ye, X.; Bai, Y.; Song, R.; Xu, K.; An, J. An inhomogeneous background imaging method based on generative adversarial network. IEEE Trans. Microw. Theory Technol. 2020, 68, 4684–4693. [Google Scholar] [CrossRef]
  13. Guo, L.; Song, G.; Wu, H. Complex-valued Pix2pix—Deep neural network for nonlinear electromagnetic inverse scattering. Electronics 2021, 10, 752. [Google Scholar] [CrossRef]
  14. Ma, Z.; Xu, K.; Song, R.; Wang, C.F.; Chen, X. Learning-based fast electromagnetic scattering solver through generative adversarial network. IEEE Trans. Antennas Propag. 2021, 69, 2194–2208. [Google Scholar] [CrossRef]
  15. Ye, X.; Du, N.; Yang, D.; Yuan, X.; Song, R.; Sun, S.; Fang, D. Application of generative adversarial network-based inversion algorithm in imaging 2-D lossy biaxial anisotropic scatterer. IEEE Trans. Antennas Propag. 2022, 70, 8262–8275. [Google Scholar] [CrossRef]
  16. Xu, K.; Qian, Z.; Zhong, Y.; Su, J.; Gao, H.; Li, W. Learning-assisted inversion for solving nonlinear inverse scattering problem. IEEE Trans. Microw. Theory Techn. 2023, 71, 2384–2395. [Google Scholar] [CrossRef]
  17. Yao, H.M.; Jiang, L.; Ng, M. Enhanced deep learning approach based on the conditional generative adversarial network for electromagnetic inverse scattering problems. IEEE Trans. Antennas Propag. 2024, 72, 6133–6138. [Google Scholar] [CrossRef]
  18. Neapolitan, R.E.; Jiang, X. Artificial Intelligence: With an Introduction to Machine Learning; CRC Press: Boca Raton, FL, USA, 2018. [Google Scholar]
  19. Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional networks for biomedical image segmentation. In Proceedings of the 18th International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), Munich, Germany, 5–9 October 2015; pp. 234–241. [Google Scholar]
  20. Wei, Z.; Chen, X. Deep-learning schemes for full-wave nonlinear inverse scattering problems. IEEE Trans. Geosci. Remote Sens. 2019, 57, 1849–1860. [Google Scholar] [CrossRef]
  21. Huang, Y.; Song, R.; Xu, K.; Ye, X.; Li, C.; Chen, X. Deep learning-based inverse scattering with structural similarity loss functions. IEEE Sens. J. 2021, 21, 4900–4907. [Google Scholar] [CrossRef]
Figure 1. Typical schematic of problem in the plane.
Figure 2. Overall GAN architecture.
Figure 3. Overall SRNN architecture with SA mechanism.
Figure 4. Typical schematic of the SRNN generator with SA mechanism.
Figure 5. SA mechanism architecture.
Figure 6. The GAN’s discriminator architecture.
Figure 7. Ground truth permittivity ranging from 1 to 1.5: (a) ε 1 z and (b) ε 1 x ,   ε 1 y
Figure 8. Reconstructed permittivity ranging from 1 to 1.5 for the GAN without SA input under 20% noise: (a) ε 1 z and (b) ε 1 x ,   ε 1 y
Figure 9. Reconstructed permittivity ranging from 1 to 1.5 for the GAN with SA input under 20% noise: (a) ε 1 z and (b) ε 1 x ,   ε 1 y
Figure 10. Ground truth permittivity ranging from 1.5 to 2: (a) ε 1 z and (b) ε 1 x ,   ε 1 y
Figure 11. Reconstructed permittivity ranging from 1.5 to 2 for the GAN without SA input under 5% noise: (a) ε 1 z and (b) ε 1 x ,   ε 1 y
Figure 12. Reconstructed permittivity ranging from 1.5 to 2 for the GAN with SA input under 5% noise: (a) ε 1 z and (b) ε 1 x ,   ε 1 y
Figure 13. Ground truth permittivity ranging from 2 to 2.5 with 5% noise: (a) ε 1 z and (b) ε 1 x ,   ε 1 y
Figure 14. Reconstructed permittivity ranging from 2 to 2.5 for the GAN without SA input under 5% noise: (a) ε 1 z and (b) ε 1 x ,   ε 1 y
Figure 15. Reconstructed permittivity ranging from 2 to 2.5 for the GAN with SA input under 5% noise: (a) ε 1 z and (b) ε 1 x ,   ε 1 y
Table 1. Parameter configuration table.

Parameter Name        Parameter Setting
Learning rate         10⁻⁴
Max epoch             200
Mini-batch size       16
Drop epoch            70
Drop learning rate    10⁻¹
Table 2. RMSE and SSIM for permittivity between 1 and 1.5 with 20% noise.

Reconstruction Performance      GAN       GAN with SA
TE   RMSE                       3.48%     1.21%
     SSIM                       95.39%    99.77%
TM   RMSE                       3.37%     2.76%
     SSIM                       95.58%    97.25%
Table 3. RMSE and SSIM for permittivity between 1.5 and 2 with 5% noise.

Reconstruction Performance      GAN       GAN with SA
TE   RMSE                       6.27%     3.8%
     SSIM                       94.66%    97.84%
TM   RMSE                       3.29%     2.64%
     SSIM                       96.45%    97.45%
Table 4. RMSE and SSIM for permittivity between 2 and 2.5 with 5% noise.

Reconstruction Performance      GAN       GAN with SA
TE   RMSE                       9.11%     7.98%
     SSIM                       88.16%    98.18%
TM   RMSE                       7.05%     5.81%
     SSIM                       94.38%    98.47%
