Diagonalized Macromodels in Finite Element Method for Fast Electromagnetic Analysis of Waveguide Components

Nyka, Krzysztof

doi:10.3390/electronics8030260

Faculty of Electronics, Telecommunications and Informatics, Gdansk University of Technology, Narutowicza 11/12, 80-233 Gdansk, Poland

Electronics 2019, 8(3), 260; https://doi.org/10.3390/electronics8030260

Submission received: 23 January 2019 / Revised: 14 February 2019 / Accepted: 22 February 2019 / Published: 27 February 2019

(This article belongs to the Section Microwave and Wireless Communications)

Abstract

A new technique of local model-order reduction (MOR) in 3-D finite element method (FEM) for frequency-domain electromagnetic analysis of waveguide components is proposed in this paper. It resolves the problem of increasing solution time of the reduced-order system assembled from macromodels created in the subdomains, into which an analyzed structure is partitioned. This problem becomes particularly relevant for growing size and count of the macromodels, and when they are cloned in multiple locations of the structures or are used repeatedly in a tuning and optimization process. To significantly reduce the solution time, the diagonalized macromodels are created by means of the simultaneous diagonalization and subsequently assembled in the global system. For the resulting partially diagonal matrix, an efficient dedicated solver based on the Schur complement technique is proposed. The employed MOR method preserves frequency independence of the macromodels, which is essential for efficient diagonalization, as it can be performed once for the whole analysis bandwidth. The numerical validation of the proposed procedures with respect to accuracy and speed was carried out for varying size and count of macromodels. An exemplary finite periodical waveguide structure was chosen to investigate the influence of macromodel cloning on the resultant efficiency. The results show that the use of the diagonalized macromodels provided a significant solution speedup without any loss of accuracy.

Keywords:

finite element method; model-order reduction (MOR); macromodels; diagonalization; computational electromagnetics; matrix algebra

1. Introduction

Waveguide components, such as filters, diplexers, power dividers/combiners, junctions, and resonators, have always been used in microwave engineering. Although they are gradually being replaced by monolithic and hybrid integrated circuits, they are still indispensable in space applications and in millimeter-wave bands due to their low losses and high power-handling capabilities. Since waveguide components have a size comparable to the wavelength and cannot be described by means of currents and voltages, they can be accurately characterized only by rigorous full-wave electromagnetic analysis. The finite element method (FEM) is widely recognized for its ability to accurately solve frequency-domain electromagnetic problems governed by the Maxwell equations in structures having arbitrary geometry, such as waveguide components [1]. It is also known that in complex 3-D domains, especially those comprising small geometric features, the discretization mesh may grow dramatically, leading to very large systems of linear equations [2]. With a rapidly increasing number of unknowns, they become very costly to solve in terms of both memory demand and computational time. An extensive mesh refinement is needed to conform the discretization mesh to both complex shapes and strong field variations within the structure. However, the resulting number of unknowns, regarded as the degrees of freedom (DOF) or the states of the FEM model, largely exceeds the number of states needed by a model that captures the electromagnetic behavior of the structure with respect to its external ports only. Such a reduced-order black-box model, meant to serve as a surrogate for the original FEM problem, is constructed in a process called model-order reduction (MOR). It adopts the approach, formalism and selected techniques used in MOR for linear time-invariant (LTI) systems [3], which have been widely employed in the analysis of large RLC networks (consisting of resistors, inductors and capacitors) [4,5,6].

This paper deals with MOR performed locally in subregions, into which the analyzed structure is partitioned. The resulting reduced-order models are called macromodels and this technique is referred to as macromodeling [7,8]. The macromodels for each of the subregions are subsequently combined in the final system representing the whole structure. The advantage of macromodeling over the model-order reduction in the entire domain is that the matrix problems solved during the generation of macromodels are smaller and the reduction orders can be decreased selectively in each subdomain, providing an essential reduction of memory demand and computation time. In this respect, macromodeling can be considered as a combination of MOR and domain decomposition [9] and it shares the benefits of both.

One of the earliest works on macromodeling in FEM is presented in [10], in which macromodels, also called macro-elements, are developed in selected regions only, in order to cover small geometric features requiring strong mesh refinement. Each macromodel represents a transfer function between the electric and magnetic fields on its boundary and is built in the form of a general impedance matrix (GIM), order-reduced by means of PRIMA (passive reduced-order interconnect macromodeling algorithm) [11]. A formulation of macromodels for fully segmented structures is proposed in [12] as a technique called SFELP (segmentation approach/finite elements/Lanczos-Pade). The Pade approximation via the Lanczos process (PVL) [13] is employed to build the macromodels defined as general admittance matrices (GAM). Based on the same MOR framework, a method of direct decomposition is developed in [14]. Although in the aforementioned methods the most time-consuming computations are performed once in an analysis bandwidth, allowing for fast frequency sweeping (FFS), the GIM and GAM macromodels are frequency dependent. Consequently, when used in eigenproblems, these methods require computationally demanding nonlinear eigensolvers.

The MOR employed in this paper is based on the technique that alleviates this problem. It is proposed in [15] for 2-D problems and extended to a 3-D formulation in [16]. Unlike the reduction of GIM and GAM, this MOR procedure is applied to a transfer function between fields on the boundary of macromodel region and in its interior. Each macromodel is represented by a couple of matrices related to the mass and stiffness matrix of the FEM system. The whole MOR process is performed for just one expansion frequency and, what is more, does not introduce any frequency dependent terms in the macromodel matrices. Besides facilitating the solution of eigenproblems, this approach enables very efficient FFS, especially if a subsequent diagonalization of macromodels is intended, because it can be carried out entirely outside the frequency loop of the analysis bandwidth.

A very important advantage of macromodeling arises when optimization is employed in a design process. It involves a sequence of repeated simulations preceded by some structure modifications, which are usually carried out within small subregions that may be represented by macromodels. As presented in [14,17,18], in each step, only a single macromodel that is in the subregion being currently modified needs to be re-generated, while the rest remain unchanged. Since the generation of macromodels is the most costly part of the simulations, the total optimization time can be significantly reduced. Regarding this as a temporal macromodel reuse, one may also consider a similarly beneficial spatial reuse. That strategy is called macromodel cloning and is proposed in [10]. If the analyzed structure comprises repeated subregions of the same geometry and materials, the macromodels need to be created only once for each group of them and they will be multiply copied in the final system.

The macromodels are represented by dense matrices being significantly smaller than their corresponding sparse FEM matrix blocks. Therefore, the overall time of the solution obtained by means of MOR is much shorter than in the original FEM problem, so that most of the computation cost is shifted to the preparation of macromodels itself. However, the residual solution cost of the final reduced-order system becomes significant in larger structures with increasing number of macromodels. Moreover, if the subdomains are large, the macromodels require higher reduction orders, which immediately increases their size and additionally slows down the solution. Consequently, the solution time becomes comparable to the reduction time and may even largely exceed it, particularly if any form of macromodel reuse is applied, as in the case of cloning or optimization. Another situation when the solution time of the reduced system may be comparable to the time of MOR is an automatic selection of reduction order q [19], in which the system is solved repeatedly at each try of increasing value of q.

To mitigate these effects, a new approach to the model-order reduction in FEM problems is proposed in this paper, which leads to the final system with a diagonal matrix, and thus significantly accelerates its solution. The existing diagonalization procedures are not satisfactory for the reasons explained below. A direct diagonalization of the original global FEM system involving orthogonal decomposition of large sparse matrices is possible but practically useless, because it would produce intermediate matrices that are equally large but dense, causing huge memory demand. Among the techniques to overcome this problem, the mass lumping [20] is most popular, but originally restricted to FEM in the time domain. As performed element-wise, the diagonalization is very efficient but limited to the mass matrix only. This is sufficient, however, in the time domain, because it is only the mass matrix that needs to be solved at every time step. An attempt to adopt the mass lumping technique in the frequency domain has been very recently reported in [21], where the time-domain FEM formulation with diagonalized mass matrix is transformed directly to the frequency domain only in order to improve convergence of the iterative solution to the FEM system by efficiently limiting its spectral radius to less than one. It results in good convergence speedup and accuracy, but has been demonstrated for a very narrow bandwidth, limited to less than 1% only.

The novel procedure proposed in this paper is intended for the direct solution of the system of equations, which is more suitable for a reduced system consisting of small and dense matrices. It brings the substantial advantage of diagonalization to frequency-domain FEM analysis by adopting the simultaneous diagonalization of the mass and stiffness matrices in the model-order reduction framework. To keep the diagonalization cost as low as possible, it is carried out in the subdomains after the local MOR. As the macromodel matrices are already dense and small, the problem of excess memory demand does not occur. Owing to the accordingly chosen MOR technique, they are also frequency independent, and therefore the diagonalization is needed only once in the whole bandwidth. The resulting diagonalized macromodels are combined in the final system in a way that allows for cloning. Although only a part of the resulting system matrix becomes diagonal, that part dominates, so a dedicated solver, efficiently adopting the Schur complement technique, is proposed to achieve additional solution speedup. Apart from the above-mentioned main novelty aspects, a new simultaneous diagonalization algorithm, alternative to that of Laub [22], is proposed. Moreover, three different methods of matrix orthogonal decomposition are considered for the simultaneous diagonalization and compared in order to propose an optimal combination of them with respect to diagonalization time.

The outline of the rest of the paper, which also reflects the sequence of steps in the proposed procedure, is as follows. Section 2 starts with a brief formulation of FEM for the wave equation as a background of the presented analysis. Then, the procedure of domain partitioning is presented in the perspective of the subsequent MOR and diagonalization in separate subdomains. The final assembly of the global system is not included, as it will be done only after diagonalization. Section 3 presents the port compression and MOR techniques adopted from Fotyga et al. [16] in such a way that the macromodels diagonalized in the next step will be disconnected and thus ready for cloning. At the end of this section, it is shown how the diagonalized macromodels are created by means of the simultaneous diagonalization algorithms and subsequently assembled in the global reduced system. A dedicated solver based on the Schur complement technique for the resulting partially diagonal system is also proposed. The results of numerical validation with respect to accuracy and speed are presented and discussed in Section 4.

2. Finite Element Method in Partitioned Domain

2.1. FEM Formulation

Consider a waveguide component being a source-free, arbitrarily shaped region $Ω$ bounded by a surface $Γ$. $Ω$ is loaded with lossless media of arbitrary distribution of electric permittivity $ε$ and magnetic permeability $μ$. The time-harmonic electric and magnetic fields $\vec{E}$ and $\vec{H}$ at angular frequency $ω$ in $Ω$ are governed by the Maxwell equations:

\nabla \times \vec{E} = - j ω μ \vec{H}, \nabla \times \vec{H} = j ω ε \vec{E}

(1)

For the finite element method used in the presented analysis, they are transformed to the vector wave equation with respect to $\vec{E}$:

\nabla \times \frac{1}{μ} \nabla \times \vec{E} - ω^{2} ε \vec{E} = 0 .

(2)
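
The step from Equation (1) to Equation (2) can be written out explicitly. The following short derivation is only a sketch; it assumes that $μ$ may vary in space (so it is kept inside the outer curl) and uses no symbols beyond those defined above:

```latex
\begin{aligned}
\vec{H} &= -\frac{1}{jωμ}\,\nabla\times\vec{E}
  && \text{(first equation of (1))} \\
\nabla\times\Bigl(-\frac{1}{jωμ}\,\nabla\times\vec{E}\Bigr) &= jωε\,\vec{E}
  && \text{(substituted into the second equation of (1))} \\
\nabla\times\frac{1}{μ}\,\nabla\times\vec{E} &= -(jω)^{2}\,ε\,\vec{E} = ω^{2}ε\,\vec{E}
  && \text{(both sides multiplied by } -jω\text{)}
\end{aligned}
```

Moving the right-hand side to the left gives Equation (2).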

In the boundary-value problems with excitations, which is the case in this analysis, $Γ$ comprises $N_{P n}$ input/output ports denoted as $P_{n}$. They are referred to as external ports, on which the fields related to excitations and loads are defined as the boundary conditions. On the remaining parts of $Γ$, denoted as $Γ_{0}$, a perfect electric conductor (PEC) is assumed.

In the finite element method based on Galerkin approach, the following weak formulation is derived from Equation (2) and the boundary condition in the external ports [12,23]:

\begin{matrix} \int \int \int_{Ω} (\nabla \times \vec{w} \cdot \frac{1}{μ_{r}} \nabla \times \vec{E} - k_{0}^{2} \vec{w} \cdot ε_{r} \vec{E}) d V = j ω μ_{0} \sum_{n = 1}^{N_{P n}} \int \int_{P_{n}} \vec{w} \cdot ({\vec{n}}_{n} \times {\vec{H}}_{t n}) d S, \end{matrix}

(3)

where ${\vec{n}}_{n}$ is a normal vector on the nth port, ${\vec{H}}_{t n}$ is the distribution of the tangential magnetic field on the nth port, $μ_{r}$, $ε_{r}$ are the relative permeability and permittivity, $k_{0} = ω \sqrt{μ_{0} ε_{0}}$ is the wavenumber, and $\vec{w}$ is a vector testing function. The right-hand side represents the excitation of the analyzed structure. The distribution of the electric field $\vec{E}$ is sought as an approximate solution to Equation (3) in a finite-dimensional subspace of vector basis functions ${\vec{w}}_{i}$. The testing functions $\vec{w}$ are chosen to be the same as these basis functions, and thus the electric field $\vec{E}$ is approximated by the following general expansion:

\vec{E} = \sum_{i = 1}^{N} e_{i} {\vec{w}}_{i},

(4)

where $e_{i}$ are the coefficients being the unknowns in the linear system of N equations to which the FEM procedure eventually leads.

The basis functions ${\vec{w}}_{i}$ are defined piecewise in finite elements, into which the entire domain is divided. Usually, tetrahedrons are used, as they are the simplest shapes able to approximate arbitrarily complex 3-D geometries. Further steps of the FEM procedure thus begin with the discretization of $Ω$ by means of a 3-D tetrahedral mesh. At the same time, a 2-D triangular mesh is also defined as a collection of faces of the tetrahedrons adjacent to all surfaces present in the domain. They include the surfaces of all physical features and also any fictitious boundaries between subdomains into which $Ω$ will be partitioned. First-order finite elements are assumed in this analysis; however, the proposed technique with diagonalized macromodels is also applicable to higher-order FEM. In a single tetrahedron, six first-order vector basis functions are associated with the six edges and are defined as:

{\vec{w}}_{m} = L_{m} (α_{i} \nabla α_{j} - α_{j} \nabla α_{i}),

(5)

where m is an edge number, $α_{i}$ and $α_{j}$ are the simplex coordinates associated with nodes i and j of the edge m, and $L_{m}$ is the length of this edge [24]. Equation (4) splits into separate tetrahedrons $Ω^{(t)}$, and the 3-D electric field vectors in each of their volumes are approximated as:

{\vec{E}}^{(t)} = \sum_{m = 1}^{6} e_{m} {\vec{w}}_{m} .

(6)

This expansion can also represent the 2-D distribution of the tangential field components on surfaces if three-element subsets of ${\vec{w}}_{m}$ for the corresponding triangular faces of tetrahedrons are considered.
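
The edge basis functions of Equation (5) can be evaluated numerically as in the minimal sketch below. The local edge numbering `EDGES` and the function names are assumptions made for illustration only; they are not taken from the paper or from any FEM library.

```python
import numpy as np

# Assumed local edge numbering: edge m connects local nodes (i, j).
EDGES = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]

def simplex_data(verts, r):
    """Return simplex coordinates alpha (4,) at point r and their gradients (4, 3)."""
    # alpha_n(r) = a_n + g_n . r, with alpha_n(vertex_m) = delta_nm
    A = np.hstack([np.ones((4, 1)), verts])          # rows: [1, x, y, z] of each vertex
    coeffs = np.linalg.inv(A)                        # column n holds [a_n, g_n]
    alpha = coeffs.T @ np.concatenate([[1.0], r])
    grad = coeffs[1:, :].T                           # constant inside the tetrahedron
    return alpha, grad

def edge_basis(verts, r):
    """Evaluate the six first-order edge functions of Equation (5) at point r."""
    alpha, grad = simplex_data(verts, r)
    w = np.empty((6, 3))
    for m, (i, j) in enumerate(EDGES):
        L_m = np.linalg.norm(verts[j] - verts[i])    # edge length
        w[m] = L_m * (alpha[i] * grad[j] - alpha[j] * grad[i])
    return w

verts = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
w = edge_basis(verts, np.array([0.3, 0.0, 0.0]))     # a point on edge 0
tangent = np.array([1.0, 0.0, 0.0])                  # unit tangent of edge 0
tangential = w @ tangent
```

On edge 0, only the function associated with that edge has a non-zero tangential component, which is the property that makes the coefficients $e_{m}$ shared between neighboring elements enforce tangential continuity.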

Substituting the field expansion in Equation (6) into Equation (3) yields the set of linear systems of equations for a collection of tetrahedrons $Ω^{(t)}$, which can be rewritten in matrix form:

(K^{(t)} - k_{0}^{2} M^{(t)}) e = b^{(t)},

(7)

where $e$ is the element vector of unknown coefficients $e_{m}$ in Equation (6). The element stiffness and mass matrices $K^{(t)}$ and $M^{(t)}$, respectively, and the right-hand side vector of excitation $b^{(t)}$ are given by the following integrals:

\begin{matrix} K_{i j}^{(t)} & = & \frac{1}{μ_{r}} \int \int \int_{Ω^{(t)}} \nabla \times {\vec{w}}_{i} \cdot \nabla \times {\vec{w}}_{j} d V, \\ M_{i j}^{(t)} & = & ε_{r} \int \int \int_{Ω^{(t)}} {\vec{w}}_{i} \cdot {\vec{w}}_{j} d V, \\ b_{i}^{(t)} & = & j ω μ_{0} \int \int_{S_{n}^{(t)}} {\vec{w}}_{i} \cdot (\vec{n} \times {\vec{H}}_{t n}) d S \end{matrix}

(8)

where $i, j = 1, 2, \dots 6$ are the matrix indices corresponding to tetrahedron edges, ${\vec{w}}_{i}$, ${\vec{w}}_{j}$ are the testing functions as defined in Equation (5), $\vec{n}$ is a unit normal vector, and $S_{n}^{(t)}$ are the faces of tetrahedrons that belong to the external port $P_{n}$ in which the distribution of the tangential magnetic vector ${\vec{H}}_{t n}$ is defined. For the elements not adjoining any of the external ports, $b^{(t)} = 0$.
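
As an illustration of Equation (8), the sketch below computes the element stiffness and mass matrices of one tetrahedron. It is not the paper's implementation: the edge numbering, the function name, and the choice of a 4-point quadrature rule (exact for the quadratic integrand of the mass matrix) are assumptions. It relies on the fact that the curl of each edge function of Equation (5) is constant, $\nabla \times {\vec{w}}_{m} = 2 L_{m} \nabla α_{i} \times \nabla α_{j}$.

```python
import numpy as np

# Assumed local edge numbering: edge m connects local nodes (i, j).
EDGES = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]

def element_matrices(verts, eps_r=1.0, mu_r=1.0):
    """Return the 6x6 element stiffness K and mass M of Equation (8)."""
    A = np.hstack([np.ones((4, 1)), verts])
    grad = np.linalg.inv(A)[1:, :].T                 # gradients of simplex coordinates
    vol = abs(np.linalg.det(A)) / 6.0                # tetrahedron volume
    L = np.array([np.linalg.norm(verts[j] - verts[i]) for i, j in EDGES])

    # Stiffness: the curls are constant, so the integral is volume times a dot product.
    curls = np.array([2.0 * L[m] * np.cross(grad[i], grad[j])
                      for m, (i, j) in enumerate(EDGES)])
    K = (vol / mu_r) * curls @ curls.T

    # Mass: 4-point rule with barycentric points (a, b, b, b) and equal weights vol/4.
    a, b = 0.5854101966249685, 0.1381966011250105
    M = np.zeros((6, 6))
    for q in range(4):
        alpha = np.full(4, b)
        alpha[q] = a
        w = np.array([L[m] * (alpha[i] * grad[j] - alpha[j] * grad[i])
                      for m, (i, j) in enumerate(EDGES)])
        M += (vol / 4.0) * w @ w.T
    return K, eps_r * M

verts = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
K, M = element_matrices(verts)
```

The stiffness matrix has rank 3 because gradient fields, which are representable in the edge basis, have zero curl; the mass matrix is symmetric positive definite.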

2.2. Domain Partitioning

To create M macromodels in the process of model-order reduction, $Ω$ is divided into M non-overlapping subdomains $Ω_{k}$, as shown in Figure 1a for $M = 3$. They are separated by the interface surfaces $P_{m}$, which we call internal ports. Although $P_{m}$ are internal in $Ω$, their edges may connect to $Γ_{0}$. The internal ports are not physical boundaries but only fictitious geometrical objects introduced in order to partition the domain. Therefore, the only boundary conditions to be considered on the $P_{m}$ express the field continuity, exactly as on the interfaces between individual finite elements.

The mesh can be generated according to two different strategies: in a single run for the whole structure or separately in the subregions. In the former, the triangular mesh on the internal ports as well as the FEM basis functions on their tetrahedron faces are naturally shared by the adjacent subdomains. Consequently, the corresponding expansion coefficients of the fields are the same on $P_{m}$, which guarantees continuity of the tangential electric vectors $\vec{E_{t}}$. In addition, the normal components of the electric induction ${\vec{D}}_{n} = ε {\vec{E}}_{n}$ remain continuous on the internal ports, which is due to the natural boundary conditions at the interfaces between tetrahedrons inherently present in the FEM weak formulation. To proceed with FEM and MOR separately in each subdomain, the internal ports have to be split, as shown in Figure 1b, in such a way that two identical copies of the surface mesh are assigned to the disconnected subdomains. Despite this, the aforementioned boundary conditions for the continuity of the fields will be fulfilled also on the separated interfaces without the need to impose them explicitly, because the internal ports will eventually reconnect during the assembly of the final system of equations.

The second meshing approach is preferred when cloning of macromodels is intended. The subdomains are separated at the internal ports before mesh generation. However, entirely independent 3-D meshing in each subregion would result in non-conforming 2-D meshes to meet when the ports are reconnected. To avoid such situations and have the boundary conditions fulfilled as straightforwardly as in the first strategy, the procedure is modified and divided into two steps. In the initial step, only 2-D mesh is created on all surfaces in the entire structure. If needed, the meshes on the selected internal ports are copied into the places where the cloned macromodels are to be embedded later. The mesh on the internal ports is duplicated to build separate boundary of 2-D mesh enclosing each unique subregion. The other subregions are regarded as non-unique, and do not require 3-D meshing, because the macromodels will be cloned into them. Next, these surface meshes are used to initiate generation of 3-D mesh in the enclosed volumes. This procedure can be easily applied using the advancing front technique of mesh generation implemented in the NETGEN [25] software that is chosen for this analysis.

2.3. Local Matrix Assembly

The element matrices and vectors in Equation (7) are assembled over each disconnected subdomain $Ω_{k}$ separately, leading to the following M local FEM matrix equations with large sparse matrices:

(K_{k} - k_{0}^{2} M_{k}) e_{k} = b_{k}

(9)

for $k = 1, 2, \dots M$. The term local refers here to the subdomains, not just individual finite elements, and is used in contrast to the global system of equations in the standard FEM procedure for an unpartitioned domain. Although not necessary in the proposed macromodeling technique, the FEM global system can be assembled from Equation (9) as well.

To proceed with the subsequent model-order reduction, the local FEM matrices in Equation (9) are reordered in such a way that the unknowns associated with all of the subdomain's ports are grouped together. As a result, the matrices in Equation (9) split into the following blocks:

([\begin{matrix} K_{P k} & S_{K k}^{T} \\ S_{K k} & K_{Ω k} \end{matrix}] - k_{0}^{2} [\begin{matrix} M_{P k} & S_{M k}^{T} \\ S_{M k} & M_{Ω k} \end{matrix}]) \cdot [\begin{matrix} e_{P k} \\ e_{Ω k} \end{matrix}] = [\begin{matrix} b_{k} \\ 0 \end{matrix}] .

(10)
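
The reordering that leads to Equation (10) amounts to extracting index-based sub-blocks of a local matrix. A minimal sketch, assuming dense NumPy arrays for brevity (the local FEM matrices are sparse in practice) and a hypothetical helper name `split_port_interior`:

```python
import numpy as np

def split_port_interior(A, port_dofs):
    """Split a symmetric local matrix into port, coupling and interior blocks.

    Returns (A_P, S, A_Omega) such that the reordered matrix reads
    [[A_P, S.T], [S, A_Omega]], as in Equation (10).
    """
    port = np.asarray(port_dofs)
    interior = np.setdiff1d(np.arange(A.shape[0]), port)   # all remaining unknowns
    A_P = A[np.ix_(port, port)]
    S = A[np.ix_(interior, port)]
    A_Omega = A[np.ix_(interior, interior)]
    return A_P, S, A_Omega

rng = np.random.default_rng(0)
B = rng.standard_normal((7, 7))
A = B + B.T                                # stand-in for K_k or M_k
A_P, S, A_Omega = split_port_interior(A, [5, 1])
```

The same index sets would be applied to both the stiffness and the mass matrix of a subdomain so that their blocks stay aligned.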

The matrices $K_{P k}$, $S_{K k}$, $M_{P k}$, and $S_{M k}$, being a common representation of all ports on the boundary of $Ω_{k}$ ($\partial Ω_{k}$) combined together, split further into the blocks corresponding to each separate port $P_{i k}$ in the following way:

\begin{matrix} K_{P k} = \mathrm{diag} (\dots, K_{P i k}, \dots), & M_{P k} = \mathrm{diag} (\dots, M_{P i k}, \dots), \\ S_{K k} = [\dots S_{K i k} \dots], & S_{M k} = [\dots S_{M i k} \dots], \end{matrix} \quad \text{for } i : P_{i k} \subset \partial Ω_{k},

(11)

where the indices $i k$ correspond to a port whose global number is i and which belongs to the boundary $\partial Ω_{k}$. The port numbers i represent a consecutive global numbering of the external and internal ports $P_{n}$ and $P_{m}$ such that $i \in \{n, m\}$, where $n = 1, 2, \dots P_{0}$ and $m = P_{0} + 1, \dots, P$. For $i = m$, the internal port $P_{m}$ splits into two ports $P_{i k_{1}}$ and $P_{i k_{2}}$ associated with the adjacent subdomains $Ω_{k_{1}}$ and $Ω_{k_{2}}$, respectively. For $i = n$, an external port $P_{n}$ remains assigned to a single subdomain so that $P_{i k} \equiv P_{n}$. Using the definitions of matrix blocks in Equation (11), Equation (10) can be rewritten in the final form:

([\begin{matrix} ⋱ & 0 & 0 & ⋮ \\ 0 & K_{P i k} & 0 & S_{K i k}^{T} \\ 0 & 0 & ⋱ & ⋮ \\ \dots & S_{K i k} & \dots & K_{Ω k} \end{matrix}] - k_{0}^{2} [\begin{matrix} ⋱ & 0 & 0 & ⋮ \\ 0 & M_{P i k} & 0 & S_{M i k}^{T} \\ 0 & 0 & ⋱ & ⋮ \\ \dots & S_{M i k} & \dots & M_{Ω k} \end{matrix}]) \cdot [\begin{matrix} ⋮ \\ e_{P i k} \\ ⋮ \\ e_{Ω k} \end{matrix}] = [\begin{matrix} ⋮ \\ b_{n k} \\ ⋮ \\ 0 \end{matrix}] .

(12)

Let us illustrate this procedure with the exemplary domain partitioning scheme presented in Figure 1a. There are two external ports $P_{n} \in \{P_{1}, P_{2}\}$ and two internal ports $P_{m} \in \{P_{3}, P_{4}\}$, which connect three subdomains. The split ports $P_{m}$ on the separated subdomains are shown in Figure 1b. The stiffness matrices $K_{k}$ derived for each subdomain take the following form:

K_{1} = [\begin{matrix} K_{P 11} & 0 & S_{K 11}^{T} \\ 0 & K_{P 31} & S_{K 31}^{T} \\ S_{K 11} & S_{K 31} & K_{Ω 1} \end{matrix}], \quad K_{2} = [\begin{matrix} K_{P 32} & 0 & S_{K 32}^{T} \\ 0 & K_{P 42} & S_{K 42}^{T} \\ S_{K 32} & S_{K 42} & K_{Ω 2} \end{matrix}], \quad K_{3} = [\begin{matrix} K_{P 23} & 0 & S_{K 23}^{T} \\ 0 & K_{P 43} & S_{K 43}^{T} \\ S_{K 23} & S_{K 43} & K_{Ω 3} \end{matrix}]

(13)

For the mass matrices $M_{k}$, the above formulas apply if the letter K is replaced by M.

3. Model-Order Reduction with Diagonalized Macromodels

The entire model-order reduction process is performed locally in each subdomain. As a result, the reduced-order macromodels are created to replace the corresponding matrix blocks in the local FEM system of equations in Equation (12). Consider a single subdomain, as defined in Section 2.2, enclosed by the physical boundary $Γ_{0}$ and the ports, both external and internal. For the sake of clarity, we temporarily, until the final matrix assembly, omit the global numbering, and thus any subdomain under consideration is now denoted as $Ω$ instead of $Ω_{k}$. The local ports in each $Ω$ are denoted as $P_{l}$, for $l = 1, 2, \dots L$, while P represents the collection of all of them. In this perspective, Equation (10), from which the subscripts k are removed, is the starting point for the presented procedure.

3.1. Port Compression

The size of a macromodel and the computational cost of the MOR procedure grow with the total size p of all its ports, counted as their overall number of degrees of freedom (DOF). It is, therefore, desirable to decrease p before the subsequent steps of MOR. A geometrical reduction of DOF by local mesh coarsening at the ports is presented in [15] for 2-D problems. That technique is simple but inefficient in 3-D FEM analysis, for which a different approach based on orthogonal projection, referred to as port compression, is proposed in [16]. The 2-D FEM basis on each port surface is projected onto a new subspace defined by a basis of orthogonal functions $F_{l}$. As they span the entire port surface, far fewer functions are needed in $F_{l}$ than in the FEM basis $W_{l}$. To simplify the definition of the new basis functions, the ports are selected as surfaces conforming to basic coordinate systems, such as Cartesian, cylindrical or spherical. Each port may use its own 2-D local coordinates $(q_{1}, q_{2})$ being a subset of a locally chosen 3-D system.

The tangential electric vector in port $P_{l}$ is expanded into a series of the orthogonal functions from the basis $F_{l} = \{{\vec{f}}_{1}, \dots, {\vec{f}}_{j}, \dots\}$. For ${\vec{f}}_{j} \in \{\vec{i_{1}} f_{1 j} (q_{1}, q_{2}), \vec{i_{2}} f_{2 j} (q_{1}, q_{2})\}$, where $({\vec{i}}_{1}, {\vec{i}}_{2})$ are the unit vectors of the coordinates $(q_{1}, q_{2})$, $j = 1, \dots, N_{P l}^{'}$ and $N_{P l}^{'} = N_{1} + N_{2}$, this expansion is as follows:

{\vec{E}}_{P l}^{'} (q_{1}, q_{2}) = \sum_{i = 1}^{N_{1}} \vec{i_{1}} e_{1 i} f_{1 i} (q_{1}, q_{2}) + \sum_{i = 1}^{N_{2}} \vec{i_{2}} e_{2 i} f_{2 i} (q_{1}, q_{2}) = [\dots {\vec{f}}_{j} \dots] e_{P l}^{'},

(14)

where $e_{P l}^{'}$ is the vector of coefficients $\{e_{1 i}, e_{2 i}\}$ corresponding to the DOF in the compressed port $P_{l}$. Similarly, the tangential fields have also been expanded by means of the FEM basis $W_{l} = \{{\vec{w}}_{1}, \dots, {\vec{w}}_{i}, \dots\}$, where ${\vec{w}}_{i}$ are the 2-D FEM basis functions on the surface of port $P_{l}$ as defined in Equation (5), $i = 1, \dots, N_{P l}$, and the vector $e_{P l}$ represents the DOF in the original FEM port:

{\vec{E}}_{P l} = \sum_{i = 1}^{N_{P l}} e_{i} {\vec{w}}_{i} = [\dots {\vec{w}}_{i} \dots] e_{P l} .

(15)

Then, the port compression can be written as the following projection:

e_{P l} ⟶ e_{P l}^{'} \quad \text{such that} \quad e_{P l}^{'} = F_{l}^{T} e_{P l} .

(16)

The projection basis $F_{l}$ is an $N_{P l} \times N_{P l}^{'}$ matrix of the $N_{P l}^{'}$ functions in the continuous basis $F_{l}$ discretized on the $N_{P l}$-element FEM basis $W_{l}$:

F_{l} = [\dots f_{j} \dots], f_{j} = M_{P l}^{- 1} u_{j}, u_{i j} = \int \int_{P_{l}} {\vec{w}}_{i} \cdot {\vec{f}}_{j} d S .

(17)

$M_{P l}$ is a local mass matrix defined as $M_{P k}$ in Equations (8) and (11). For a collection of all ports P on $\partial Ω$, the projection in Equation (16) reads as:

\begin{matrix} e_{P} = {[\dots e_{P l}^{T} \dots]}^{T} ⟶ e_{P}^{'} = {[\dots e_{P l}^{' T} \dots]}^{T} \\ e_{P}^{'} = F^{T} e_{P}, F = {[\dots F_{l}^{T} \dots]}^{T} \end{matrix}

(18)

and is applied to all matrix blocks in Equation (10), yielding the system with compressed ports P:

([\begin{matrix} K_{P}^{'} & {S^{'}}_{K}^{T} \\ S_{K}^{'} & K_{Ω} \end{matrix}] - k_{0}^{2} [\begin{matrix} M_{P}^{'} & {S^{'}}_{M}^{T} \\ S_{M}^{'} & M_{Ω} \end{matrix}]) [\begin{matrix} e_{P}^{'} \\ e_{Ω} \end{matrix}] = [\begin{matrix} b_{P}^{'} \\ 0 \end{matrix}],

(19)

where

K_{P}^{'} = F^{T} K_{P} F, M_{P}^{'} = F^{T} M_{P} F, S_{K}^{'} = F^{T} S_{K}, S_{M}^{'} = F^{T} S_{M} .

(20)

Although the subscripts k have been omitted for the sake of clarity, this equation applies to each separate subdomain $Ω_{k}$. The overall number of DOF in the original and compressed ports of a single subdomain is $N_{P} = \sum_{l} N_{P l}$ and $p = N_{P}^{'} = \sum_{l} N_{P l}^{'}$, respectively, where $N_{P}^{'} ≪ N_{P}$. Assuming that each subdomain has $k_{p}$ ports compressed to $p_{0}$ DOF each, $p = k_{p} \cdot p_{0}$.

On the ports whose surfaces are constrained by physical boundaries of the structure, a trigonometric expansion with sine and cosine functions can be used in Equation (14). A typical example of such a case is the cross section of a rectangular waveguide, where Equation (14) is equivalent to the modal expansion by means of $p_{0}$ waveguide modes. The sine and cosine functions are also a natural choice on a cylinder or sphere along full turns of the angular coordinates. For surfaces without prescribed boundary conditions, such as the walls of a floating cube box, one may resort to Lagrange polynomials. More details on the port compression by means of the aforementioned expansions are given in [16,26].
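
The algebra of the port compression in Equations (16), (17) and (20) can be sketched as follows. This is only an illustration under stated assumptions: the port mass matrix and the overlap matrix `U` (with entries $u_{i j}$) are filled with random stand-in data, dense arrays are used, and the function names are hypothetical. Only the port-port blocks are shown.

```python
import numpy as np

def compression_basis(M_P, U):
    """Equation (17): columns f_j = M_P^{-1} u_j, computed by a solve, not an inverse."""
    return np.linalg.solve(M_P, U)

def compress_port_blocks(K_P, M_P, F):
    """Equation (20) for the port-port blocks: congruence transforms with F."""
    return F.T @ K_P @ F, F.T @ M_P @ F

def compress_dofs(F, e_P):
    """Equation (16): compressed port coefficients."""
    return F.T @ e_P

rng = np.random.default_rng(1)
n_port, n_comp = 20, 4                         # N_P and N_P' (assumed sizes)
B = rng.standard_normal((n_port, n_port))
M_P = B @ B.T + n_port * np.eye(n_port)        # symmetric positive definite stand-in
C = rng.standard_normal((n_port, n_port))
K_P = C + C.T                                  # symmetric stand-in
U = rng.standard_normal((n_port, n_comp))      # stand-in for the overlaps u_ij

F = compression_basis(M_P, U)
K_c, M_c = compress_port_blocks(K_P, M_P, F)
e_c = compress_dofs(F, rng.standard_normal(n_port))
```

The compressed blocks are small ($N_{P}^{'} \times N_{P}^{'}$) and remain symmetric, so the later steps of MOR operate on far fewer port unknowns.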

3.2. Model-Order Reduction

The FEM model-order reduction procedure used in the proposed macromodeling technique was previously proposed by Fotyga et al. [15] and Fotyga et al. [16] for 2-D and 3-D problems, respectively. It utilizes the techniques developed for second-order linear time-invariant (LTI) systems [3], where the response is defined as a matrix transfer function $H (s)$ between inputs and internal states in the Laplace domain. To adopt this approach in FEM for the local MOR, this transfer function is derived from the system in Equation (19) for each subdomain as follows:

\begin{matrix} (K_{Ω} - k_{0}^{2} M_{Ω}) e_{Ω} = - (S_{K}^{'} - k_{0}^{2} S_{M}^{'}) e_{P}^{'} \\ H (s) = - {(K_{Ω} + s^{2} M_{Ω})}^{- 1} s (s^{- 1} S_{K}^{'} + s S_{M}^{'}), e_{Ω} = H (s) e_{P}^{'} \end{matrix}

(21)

for the complex frequency parameter

s = j k_{0} = j ω / c

.
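
As a brief worked check of Equation (21), using only the symbols defined above and assuming that the matrix in brackets is invertible at the frequency of interest: with s = j k_{0}, and therefore s^{2} = - k_{0}^{2}, the two factors of H (s) reduce to the operators of the first row,

K_{Ω} + s^{2} M_{Ω} = K_{Ω} - k_{0}^{2} M_{Ω}, s (s^{- 1} S_{K}^{'} + s S_{M}^{'}) = S_{K}^{'} + s^{2} S_{M}^{'} = S_{K}^{'} - k_{0}^{2} S_{M}^{'},

so that solving the first row for the internal unknowns gives

e_{Ω} = - {(K_{Ω} - k_{0}^{2} M_{Ω})}^{- 1} (S_{K}^{'} - k_{0}^{2} S_{M}^{'}) e_{P}^{'} = H (s) e_{P}^{'} .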

The objective of MOR is to create the reduced-order model representing the following transfer function:

\tilde{H} (s) = - {({\tilde{K}}_{Ω} + s^{2} {\tilde{M}}_{Ω})}^{- 1} s (s^{- 1} {\tilde{S}}_{K}^{'} + s {\tilde{S}}_{M}^{'}), {\tilde{e}}_{Ω} = \tilde{H} (s) e_{P}^{'} .

(22)

\tilde{H} (s)

is supposed to approximate the transfer function in Equation (21) in a limited frequency range in order to capture the behavior of the fields in

Ω

with respect to its ports P by significantly fewer unknowns

{\tilde{e}}_{Ω}

. To this end, the original vector of unknowns

e_{Ω}

, whose length is

N_{Ω}

, is projected onto a new

{\tilde{N}}_{Ω}

-dimensional solution space by means of an appropriately constructed basis

V

being a

N_{Ω} \times {\tilde{N}}_{Ω}

matrix of orthonormal vectors such that

{\tilde{N}}_{Ω} ≪ N_{Ω}

. The

{\tilde{N}}_{Ω} \times {\tilde{N}}_{Ω}

matrices

{\tilde{K}}_{Ω}

and

{\tilde{M}}_{Ω}

become dense but are significantly smaller than

K_{Ω}

and

M_{Ω}

. They are regarded as the macromodel, which represents the transfer function

\tilde{H} (s)

when considered together with their respective coupling matrices

{\tilde{S}}_{K}^{'}

and

{\tilde{S}}_{M}^{'}

. The macromodel size is denoted as

N_{m} \equiv {\tilde{N}}_{Ω}

. Equation (22) is transformed back to the matrix form (Equation (19)) yielding the reduced system:

([\begin{matrix} K_{P}^{'} & {\tilde{S^{'}}}_{K}^{T} \\ {\tilde{S^{'}}}_{K} & {\tilde{K}}_{Ω} \end{matrix}] - k_{0}^{2} [\begin{matrix} M_{P}^{'} & {\tilde{S^{'}}}_{M}^{T} \\ {\tilde{S^{'}}}_{M} & {\tilde{M}}_{Ω} \end{matrix}]) [\begin{matrix} e_{P}^{'} \\ {\tilde{e}}_{Ω} \end{matrix}] = [\begin{matrix} b_{P}^{'} \\ 0 \end{matrix}],

(23)

where the projection by means of

V

is performed as:

{\tilde{K}}_{Ω} = V^{T} K_{Ω} V, {\tilde{M}}_{Ω} = V^{T} M_{Ω} V, {\tilde{S^{'}}}_{K} = V^{T} S_{K}^{'}, {\tilde{S^{'}}}_{M} = V^{T} S_{M}^{'}, {\tilde{e}}_{Ω} = V^{T} e_{Ω} .

(24)
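
The projection in Equation (24) can be illustrated with a minimal Python sketch. It is not the implementation used in this work (which is in MATLAB): the helper names `matmul`, `transpose` and `project` are hypothetical, the matrices are toy data, and the basis V is a trivially chosen orthonormal one rather than a basis generated by ENOR.

```python
def matmul(A, B):
    """Product of two dense matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    """Transpose of a dense matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]

def project(V, K, M, S_K, S_M):
    """Projection of Equation (24): V^T K V, V^T M V, V^T S_K, V^T S_M."""
    Vt = transpose(V)
    return (matmul(matmul(Vt, K), V), matmul(matmul(Vt, M), V),
            matmul(Vt, S_K), matmul(Vt, S_M))

# Toy data (assumed for illustration): 3 internal DOF, 1 compressed port DOF.
K = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
M = [[4.0, 1.0, 0.0], [1.0, 4.0, 1.0], [0.0, 1.0, 4.0]]
S_K = [[-1.0], [0.0], [0.0]]
S_M = [[1.0], [0.0], [0.0]]

# Orthonormal basis with 2 columns (the reduced size), chosen by hand.
V = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]

K_red, M_red, S_K_red, S_M_red = project(V, K, M, S_K, S_M)
```

With this particular V the reduced blocks are simply the leading 2 × 2 parts of K and M, which makes the effect of the projection easy to inspect by hand.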

To generate the reduction basis

V

, we employ the Efficient Nodal Order Reduction (ENOR) algorithm proposed in [27] for multiport RLC circuits, which refers to the transfer function of the form:

H_{E} (s) = {(Γ + s G + s^{2} C)}^{- 1} s B .

(25)

This algorithm can be applied to the FEM transfer function in Equation (21) for the following substitution:

Γ = K_{Ω}

,

C = M_{Ω}

,

G = 0

,

B = - (s^{- 1} S_{K}^{'} + s S_{M}^{'})

.

The columns in

V

are combined as the first q block moments of

e_{Ω}

expanded around a frequency

s_{0}

, each of them being a

N_{Ω} \times p

matrix. Increasing the reduction order q improves the accuracy of the macromodel, but at the price of a higher numerical cost of its generation and a larger macromodel size

N_{m} = q p

. Higher orders q are needed for larger subregions and stronger field variations, the latter usually being due to discontinuities and small geometric features within the subdomain.

As the projection basis

V

is computed for a single expansion frequency

s = s_{0}

, it is frequency independent and valid within a certain bandwidth depending on the required accuracy. The projection in Equation (24) is also performed only once in this bandwidth and does not introduce any frequency-dependent terms in the macromodel matrices. For this reason, and since the macromodel matrices are very small, the high cost of macromodel generation can be compensated by fast frequency sweeps during the solution of the final system of equations, which eventually leads to a significant analysis speedup.

In structures which comprise repeating subregions, cloning of macromodels is possible. It means that the reduction is not performed in the subdomains represented by the macromodels that have already been created elsewhere and can be copied to different locations. The matrices and vectors corresponding to the cloned macromodels are added in the global system after the diagonalization. This allows for substantial reduction of the overall computational cost of MOR and diagonalization.

3.3. Diagonalization

In the time-domain FEM, only the mass matrix needs to be diagonalized in order to accelerate the solution time stepping. This technique, known as mass lumping [20], involves an efficient diagonalization performed on the element level. For the diagonalization to accelerate the system solution in the case of the frequency-domain FEM, one must resort to more complex algorithms, which diagonalize both the mass and stiffness matrices. They are known in the literature as simultaneous diagonalization [22,28] and usually refer to generalized eigenvalue problems. As they all involve orthogonal decomposition of matrices in the global system, which produces dense intermediate matrices, their numerical cost in the case of large sparse FEM systems would exceed the solution cost of the original FEM problem. Based on the same generic framework, a new approach is proposed, which brings the substantial advantage of diagonalization to the frequency-domain FEM analysis by combining it with local model-order reduction. To this end, the diagonalization is performed locally in separate subdomains, prior to the global matrix assembly, and applies only to the macromodel matrix blocks

{\tilde{K}}_{Ω}

and

{\tilde{M}}_{Ω}

in Equation (23). The remaining blocks (

K_{P}^{'}

,

{\tilde{S^{'}}}_{K}

,

M_{P}^{'}

,

{\tilde{S^{'}}}_{M}

) can be left out of the diagonalization, because they are much smaller than the macromodel blocks. Although they are present in the system matrix after diagonalization, their influence on the solution time is rather small and will be reduced even further by an appropriate decomposition of the final system with the Schur complement technique. It should be noted that, due to the frequency independence of the macromodels, the diagonalization is extremely efficient, as it is performed only once for the whole analysis bandwidth.

As a result of the diagonalization, Equation (23) transforms into the following system:

([\begin{matrix} K_{P}^{'} & S_{K D}^{T} \\ S_{K D} & D_{K} \end{matrix}] - k_{0}^{2} [\begin{matrix} M_{P}^{'} & S_{M D}^{T} \\ S_{M D} & D_{M} \end{matrix}]) [\begin{matrix} e_{P}^{'} \\ e_{D} \end{matrix}] = [\begin{matrix} b_{P}^{'} \\ 0 \end{matrix}],

(26)

where

D_{K}

and

D_{M}

are diagonal matrices representing the diagonalized macromodel. Along with other blocks denoted by subscript D, which are new in this system, they are computed in one of the following algorithms.

Diag-I

(1) Define matrix $Z = {\tilde{M}}_{Ω}^{\frac{1}{2}}$ and compute $Z_{1} = Z^{- 1}$ .
(2) Derive $D_{M} = Z_{1} (Z \cdot Z) Z_{1} = I_{M}$ as a result of left multiplication of Equation (23) by $diag (I_{P}, Z_{1})$ and make the following substitutions in Equation (26): $Z_{1} {\tilde{S}}_{M}^{'} \to S_{M D}$ , $Z_{1} {\tilde{S}}_{K}^{'} \to S_{K D}$ , $Z {\tilde{e}}_{Ω} \to e_{D}$ , where $I_{P}$ and $I_{M}$ are unit matrices of the size of $M_{P}^{'}$ and ${\tilde{M}}_{Ω}$ , respectively.
(3) Compute $K_{Ω Z} = Z_{1} {\tilde{K}}_{Ω} Z_{1}$ .
(4) Compute $Q_{K}$ and $Λ_{K}$ as the orthogonal decomposition $K_{Ω Z} = Q_{K} Λ_{K} Q_{K}^{T}$ .
(5) Derive and calculate the following matrices as a result of left multiplication of Equation (26) by $diag (I_{P}, Q_{K}^{T})$ : $D_{M} = Q_{K}^{T} I_{M} Q_{K} = I_{M}$ , $D_{K} = Λ_{K}$ , $S_{K D} = Q_{K}^{T} Z_{1} {\tilde{S}}_{K}^{'}$ , $S_{M D} = Q_{K}^{T} Z_{1} {\tilde{S}}_{M}^{'}$ , $e_{D} = Q_{K}^{T} Z {\tilde{e}}_{Ω}$ .

Diag-II

(1) Perform the orthogonal decomposition ${\tilde{M}}_{Ω} = Q_{M} Λ_{M} Q_{M}^{T}$ and compute matrix $X = {(Q_{M} Λ_{M}^{\frac{1}{2}})}^{- 1} = Λ_{M}^{- \frac{1}{2}} Q_{M}^{T}$ .
(2) Derive $D_{M} = X {\tilde{M}}_{Ω} X^{T} = (Λ_{M}^{- \frac{1}{2}} Q_{M}^{T}) {\tilde{M}}_{Ω} (Q_{M} Λ_{M}^{- \frac{1}{2}}) = Λ_{M}^{- \frac{1}{2}} Λ_{M} Λ_{M}^{- \frac{1}{2}} = I_{M}$ as a result of left multiplication of Equation (23) by $diag (I_{P}, X)$ and make the following substitutions in Equation (26): $X {\tilde{S}}_{M}^{'} \to S_{M D}$ , $X {\tilde{S}}_{K}^{'} \to S_{K D}$ , $X {\tilde{e}}_{Ω} \to e_{D}$ , where $I_{P}$ and $I_{M}$ are unit matrices of the size of $M_{P}^{'}$ and ${\tilde{M}}_{Ω}$ , respectively.
(3) Compute $K_{Ω X} = X {\tilde{K}}_{Ω} X^{T}$ .
(4) Compute $Q_{K}$ and $Λ_{K}$ as the orthogonal decomposition $K_{Ω X} = Q_{K} Λ_{K} Q_{K}^{T}$ .
(5) Derive and calculate the following matrices as a result of left multiplication of Equation (26) by $diag (I_{P}, Q_{K}^{T})$ : $D_{M} = Q_{K}^{T} I_{M} Q_{K} = I_{M}$ , $D_{K} = Λ_{K}$ , $S_{K D} = Q_{K}^{T} X {\tilde{S}}_{K}^{'}$ , $S_{M D} = Q_{K}^{T} X {\tilde{S}}_{M}^{'}$ , $e_{D} = Q_{K}^{T} X {\tilde{e}}_{Ω}$ .

Diag-I is an original algorithm, while Diag-II adopts an approach similar to those presented in [22,28] for generalized eigenvalue problems. Both consist of the same Steps (3)–(5), which provide diagonalization of stiffness matrix

{\tilde{K}}_{Ω}

. The main difference between these algorithms lies in Steps (1) and (2), where Diag-I uses the inverse of the square root of the mass matrix

{\tilde{M}}_{Ω}

instead of its orthogonal decomposition. It should be noted, however, that algorithms for the matrix square root usually involve an orthogonal decomposition. In this respect, both procedures should have comparable computational costs, to which the orthogonal decompositions contribute most. The orthogonal decomposition can be performed by means of one of the following algorithms:

eigendecomposition (EVD) $Q Λ Q^{T}$ , where diagonal matrix of eigenvalues $Λ$ and orthogonal matrix of eigenvectors $Q$ are directly used in the presented algorithms;
singular value decomposition (SVD) $U Σ V^{T}$ with the substitution $Λ = Σ$ and $Q = U$ ; and
Schur decomposition (SchD) $Q U Q^{- 1}$ with $Q$ being used directly and the diagonal of triangular matrix $U$ being substituted for $Λ$ .
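
The simultaneous diagonalization can be sketched in Python as follows. This is an illustrative sketch of the Diag-II steps with EVD only, not the MATLAB code used in this work: the function names are hypothetical, the symmetric eigendecomposition is a textbook cyclic Jacobi iteration standing in for a library routine, and the 3 × 3 matrices are toy data with a positive definite mass matrix.

```python
import math

def matmul(A, B):
    """Product of two dense matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def jacobi_eigh(A, sweeps=100, tol=1e-26):
    """EVD A = Q diag(lam) Q^T of a real symmetric matrix (cyclic Jacobi)."""
    n = len(A)
    A = [list(row) for row in A]
    Q = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(A[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if A[p][q] == 0.0:
                    continue
                diff = A[q][q] - A[p][p]
                if diff == 0.0:
                    theta = math.copysign(math.pi / 4, A[p][q])
                else:
                    theta = 0.5 * math.atan(2.0 * A[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # A <- A G and Q <- Q G (columns p, q)
                    A[k][p], A[k][q] = c * A[k][p] - s * A[k][q], s * A[k][p] + c * A[k][q]
                    Q[k][p], Q[k][q] = c * Q[k][p] - s * Q[k][q], s * Q[k][p] + c * Q[k][q]
                for k in range(n):  # A <- G^T A (rows p, q)
                    A[p][k], A[q][k] = c * A[p][k] - s * A[q][k], s * A[p][k] + c * A[q][k]
    return [A[i][i] for i in range(n)], Q

def diag_ii(M_red, K_red):
    """Return (lam_K, T) such that T M T^T = I and T K T^T = diag(lam_K)."""
    lam_M, Q_M = jacobi_eigh(M_red)                    # Step (1): M = Q_M Lam_M Q_M^T
    X = [[v / math.sqrt(lam) for v in row]             # X = Lam_M^(-1/2) Q_M^T
         for lam, row in zip(lam_M, transpose(Q_M))]
    K_X = matmul(matmul(X, K_red), transpose(X))       # Step (3)
    lam_K, Q_K = jacobi_eigh(K_X)                      # Step (4)
    return lam_K, matmul(transpose(Q_K), X)            # Step (5): T = Q_K^T X

# Toy macromodel blocks (assumed data, not taken from the test structure).
M_red = [[4.0, 1.0, 0.0], [1.0, 4.0, 1.0], [0.0, 1.0, 4.0]]
K_red = [[2.0, -1.0, 0.0], [-1.0, 3.0, -1.0], [0.0, -1.0, 4.0]]
lam_K, T = diag_ii(M_red, K_red)
D_M = matmul(matmul(T, M_red), transpose(T))   # numerically the unit matrix
D_K = matmul(matmul(T, K_red), transpose(T))   # numerically diag(lam_K)
```

The coupling blocks would be transformed with the same matrix, i.e., multiplied from the left by T, which corresponds to the substitutions listed in Steps (2) and (5).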

3.4. Assembly and Solution of the Global System

The matrices of the global system of equations are finally assembled from Equation (26), rewritten for each subdomain

Ω_{k}

by restoring subscripts k and i to indicate subdomain and its ports in the same form as in Equation (12). The resulting system takes the same form as Equation (26), in which all terms are denoted with superscript

(a)

. They are defined as the following matrices:

D_{K}^{(a)} = diag (\dots, D_{K k}, \dots), D_{M}^{(a)} = diag (\dots, D_{M k}, \dots), e_{D}^{(a)} = {[\dots e_{D k}^{T} \dots]}^{T},

(27)

\begin{matrix} K_{P}^{' (a)} = diag (\dots, K_{P i}^{'}, \dots), & M_{P}^{' (a)} = diag (\dots, M_{P i}^{'}, \dots), \\ K_{P i}^{'} = K_{P i k 1}^{'} + K_{P i k 2}^{'}, & M_{P i}^{'} = M_{P i k 1}^{'} + M_{P i k 2}^{'} & for P_{i} \subset \partial Ω_{k 1} \land P_{i} \subset \partial Ω_{k 2}, \\ K_{P i}^{'} = K_{P i k}^{'}, & M_{P i}^{'} = M_{P i k}^{'} & for P_{i} \subset \partial Ω_{k}, \\ e_{P}^{' (a)} = {[\dots e_{P i}^{' T} \dots]}^{T}, & b_{P}^{' (a)} = {[\dots b_{P i}^{' T} \dots]}^{T}, \end{matrix}

(28)

S_{K D}^{(a)} = [\begin{matrix} S_{K D 11} & \dots & S_{K D i 1} & \dots \\ \dots & \dots & \dots & \dots \\ S_{K D 1 k} & \dots & S_{K D i k} & \dots \\ \dots & \dots & \dots & \dots \end{matrix}], S_{M D}^{(a)} = [\begin{matrix} S_{M D 11} & \dots & S_{M D i 1} & \dots \\ \dots & \dots & \dots & \dots \\ S_{M D 1 k} & \dots & S_{M D i k} & \dots \\ \dots & \dots & \dots & \dots \end{matrix}],

(29)

where

i = 1, \dots, P

,

k = 1, \dots, M

and

S_{K D i k} = S_{M D i k} = 0

for

P_{i} ⊄ Ω_{k}

.

The same assembly procedure can be applied to the local FEM matrices in Equation (12). Although not required, this would provide a standard FEM system useful as a reference for investigation of the efficiency improvement and accuracy of the proposed procedures of MOR and diagonalization.

If there is a group of identical subdomains, macromodel cloning is recommended. The 3-D mesh, FEM matrices and the diagonalized macromodel need to be created in only one of them. The remaining subdomains are represented by multiple copies of this macromodel, which are introduced as their respective submatrices in Equations (27)–(29). For example, if in the structure in Figure 1 all subdomains are assumed identical,

Ω_{1}

may be selected as the one where the primary diagonalized macromodel is to be created. Subdomains

Ω_{2}

and

Ω_{3}

are represented by the matrices which are copied from

Ω_{1}

in the following scheme depicted for the stiffness matrix

K

(it is the same for the mass matrix

M

):

D_{K 1} \to {D_{K 2}, D_{K 3}}

,

K_{P 11}^{'} \to {K_{P 32}^{'}, K_{P 43}^{'}}

,

K_{P 31}^{'} \to {K_{P 42}^{'}, K_{P 22}^{'}}

,

S_{K D 11} \to {S_{K D 32}, S_{K D 23}}

,

S_{K D 31} \to {S_{K D 42}, S_{K D 43}}

.

The resulting assembled system is solved in the following compact form:

[\begin{matrix} P & S^{T} \\ S & D \end{matrix}] \cdot [\begin{matrix} e_{P} \\ e_{D} \end{matrix}] = [\begin{matrix} b_{P} \\ b_{D} \end{matrix}],

(30)

where

e_{P} = e_{P}^{' (a)}

,

e_{D} = e_{D}^{(a)}

,

b_{P} = b_{P}^{' (a)}

,

b_{D} = 0

,

P = K_{P}^{' (a)} - k_{0}^{2} M_{P}^{' (a)}

,

D = D_{K}^{(a)} - k_{0}^{2} D_{M}^{(a)}

and

S = S_{K D}^{(a)} - k_{0}^{2} S_{M D}^{(a)}

. It is worth mentioning that this is the first time the frequency sweeping is involved, which means that computationally demanding operations of MOR and diagonalization have been performed just once for the whole frequency range. For the same reason, it also means that the subsequent procedures of solving Equation (30) are inside the frequency loop, and therefore should be thoroughly optimized.

The size of square blocks

P

and

D

is

N_{P}^{(a)} = \sum_{i = 1}^{P} N_{P i}^{'}

and

N_{D}^{(a)} = \sum_{k = 1}^{M} {\tilde{N}}_{Ω k}

, respectively. The total number of unknowns after diagonalization is equal to that after MOR and, for structures represented by cascaded macromodels (

k_{p} = 2

) it is:

\tilde{N} = N_{D}^{(a)} + N_{P}^{(a)} = 2 M q p_{0} + (M + 1) p_{0} .

(31)

Diagonal matrix

D

is the largest block in the system and usually

N_{D}^{(a)} ≫ N_{P}^{(a)}

approximately by the factor

2 q

. If, for instance, the analyzed structure consists of

M = 10

macromodels of order

q = 10

and 11 ports compressed to

p_{0} = 10

DOF each, then

N_{P}^{(a)} = 110

,

N_{D}^{(a)} = 2000

and

\tilde{N} = 2110

.
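
The count in Equation (31) can be reproduced with a few lines of Python. This is a minimal sketch for cascaded macromodels with two ports each; the function name `unknown_counts` is hypothetical.

```python
def unknown_counts(M, q, p0):
    """Block sizes of Equation (30) for M cascaded macromodels (k_p = 2).

    Returns (N_P, N_D, N_total) following Equation (31).
    """
    n_ports = (M + 1) * p0      # M + 1 ports, each compressed to p0 DOF
    n_diag = 2 * M * q * p0     # M macromodels of size N_m = 2 q p0
    return n_ports, n_diag, n_ports + n_diag

# Example from the text: M = 10, q = 10, p0 = 10.
n_ports, n_diag, n_total = unknown_counts(10, 10, 10)
ratio = n_diag / n_ports        # close to 2 q for large M
```

For M = 10, q = 10 and p0 = 10 the function returns 110, 2000 and 2110, in agreement with the values quoted above.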

To take full advantage of the fact that the dominating block in the system is just a diagonal matrix, we propose to partition Equation (30) and solve it with respect to

e_{P}

and

e_{D}

separately by means of the Schur complement technique [28] in the following procedure:

Schur-complement based solver (Schur solver).

(1) Compute the Schur complement $Σ = P - S^{T} D^{- 1} S = P - S^{T} S_{D}$ , where $S_{D i j} = S_{i j} / D_{i i}$ for $i = 1, \dots, N_{D}^{(a)}, j = 1, \dots, N_{P}^{(a)}$ .
(2) Derive the right-hand side vector of the Schur complement equation $Σ e_{P} = b_{P} - S^{T} D^{- 1} b_{D} = b_{P}$ .
(3) Solve $Σ e_{P} = b_{P} \to e_{P}$ .
(4) (optional) Solve $D e_{D} = - S e_{P} \to e_{D} = - D^{- 1} S e_{P}$ .

The inversion of the diagonal matrix

D

in Steps (1) and (2) is as trivial as a scalar division by its entries. Thus, time complexity of the Schur complement creation reduces to approximately one matrix multiplication

S^{T} S_{D}

involving very narrow matrices. Due to significantly smaller size of

P

and

e_{P}

compared to

D

and

e_{D}

, the solution of the system in Step (3) is even less costly. Step (4) is denoted as optional, because in most cases only the response with respect to the ports of analyzed structure is sought after. Therefore, computation of

e_{D}

can be omitted to provide additional savings, provided that the field distribution inside the domain volume is not needed.
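
The four steps of the Schur solver can be sketched in Python for a single frequency point. This is an illustrative sketch with dense lists and toy data, not the MATLAB implementation used in this work; the names `solve_dense` and `schur_solve` are hypothetical, and a plain Gaussian elimination stands in for the small dense solve of Step (3).

```python
def solve_dense(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(A)
    aug = [list(A[i]) + [b[i]] for i in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (aug[r][n] - sum(aug[r][k] * x[k] for k in range(r + 1, n))) / aug[r][r]
    return x

def schur_solve(P, S, d, b_P, want_e_D=False):
    """Solve [[P, S^T], [S, diag(d)]] [e_P; e_D] = [b_P; 0]."""
    n_D, n_P = len(S), len(P)
    # Step (1): S_D = D^-1 S is a row-wise scalar division; Sigma = P - S^T S_D.
    S_D = [[S[i][j] / d[i] for j in range(n_P)] for i in range(n_D)]
    Sigma = [[P[r][c] - sum(S[i][r] * S_D[i][c] for i in range(n_D))
              for c in range(n_P)] for r in range(n_P)]
    # Step (2): with b_D = 0 the right-hand side remains b_P.
    # Step (3): small dense solve for the port unknowns.
    e_P = solve_dense(Sigma, b_P)
    # Step (4), optional: e_D = -D^-1 S e_P = -S_D e_P.
    e_D = None
    if want_e_D:
        e_D = [-sum(S_D[i][j] * e_P[j] for j in range(n_P)) for i in range(n_D)]
    return e_P, e_D

# Toy system (assumed data): 2 port unknowns, 3 diagonalized unknowns.
P = [[4.0, 1.0], [1.0, 3.0]]
S = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
d = [2.0, 4.0, 5.0]
b_P = [1.0, 2.0]
e_P, e_D = schur_solve(P, S, d, b_P, want_e_D=True)
```

In a frequency sweep, P, S and d would be re-evaluated from the stored frequency-independent blocks at each k_0, while the structure of the solver stays the same.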

It should be noted that the presented algorithm of the Schur solver does not involve direct domain partitioning and is performed in the global system at once. This is because the diagonal matrix

D

dominates in the system and its inversion scales perfectly with the number of subdomains, while

P

is small enough not to be worth partitioning. The only operation where some kind of partitioning can be successfully applied is the matrix multiplication

S^{T} S_{D}

. Since the coupling block

S

consists of sparsely distributed small dense matrices

S_{i k} = S_{K D i k} - k_{0}^{2} S_{M D i k}

(

S_{D}

has the same structure), a dedicated block multiplication is proposed instead of the generic MATLAB multiplication. Unlike the MATLAB routine, this procedure takes advantage of the fact that there are many empty blocks in

S

and omits them during the multiplication. The occupancy ratio of nonzero blocks in

S

for cascaded macromodels is

2 M / (M (M + 1))

, which for

M = 10

equals 18%. It is therefore expected to provide additional time savings, especially for large macromodels.
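
The idea of the dedicated block multiplication can be sketched in Python as follows. The data layout is an assumption made for illustration only: the nonzero blocks of S are stored in a dictionary keyed by (subdomain, port), and the names `occupancy` and `block_schur_update` are hypothetical. The actual procedure used in this work is implemented in MATLAB and may be organized differently.

```python
from collections import defaultdict

def occupancy(M):
    """Share of nonzero blocks in S for M cascaded macromodels with M + 1 ports."""
    return 2 * M / (M * (M + 1))

def block_schur_update(S_blocks, d_blocks):
    """Nonzero blocks of S^T D^-1 S, visiting only the stored blocks of S.

    S_blocks[(k, i)] couples subdomain k with port i (list of rows);
    d_blocks[k] holds the diagonal of D for subdomain k.
    """
    ports_of = defaultdict(list)
    for k, i in S_blocks:
        ports_of[k].append(i)
    out = {}
    for k, ports in ports_of.items():
        d = d_blocks[k]
        for i in ports:
            for j in ports:
                A, B = S_blocks[(k, i)], S_blocks[(k, j)]
                blk = [[sum(A[r][a] * B[r][b] / d[r] for r in range(len(d)))
                        for b in range(len(B[0]))] for a in range(len(A[0]))]
                if (i, j) in out:  # a port shared by two subdomains collects both terms
                    blk = [[x + y for x, y in zip(r1, r2)]
                           for r1, r2 in zip(out[(i, j)], blk)]
                out[(i, j)] = blk
    return out

# Toy cascade (assumed data): 2 subdomains, 3 ports, N_m = 2, p0 = 1.
S_blocks = {
    (0, 0): [[1.0], [2.0]], (0, 1): [[3.0], [4.0]],
    (1, 1): [[1.0], [1.0]], (1, 2): [[2.0], [0.0]],
}
d_blocks = [[1.0, 2.0], [4.0, 5.0]]
update = block_schur_update(S_blocks, d_blocks)
share = occupancy(10)   # 2/11, i.e. about 18% for M = 10
```

Ports 0 and 2 do not belong to a common subdomain, so no block is generated for this pair, whereas the shared port 1 receives contributions from both subdomains.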

4. Numerical Results

All numerical experiments were performed on a PC with i7-4510U CPU @ 2.6 GHz and 16 GB RAM (model PORTEGE Z30-A-1E1, Toshiba, Tokyo, Japan). The proposed procedures were implemented in MATLAB (R2016a, MathWorks Inc., Natick, MA, USA). The built-in matrix operations and the linear algebra procedures from MATLAB libraries were utilized, the most important of which are: mldivide (generic direct solver for sparse systems of linear equations which for the systems being investigated uses block LDL factorization), eig (eigendecomposition), svd (singular value decomposition), schur (Schur decomposition), and lu (LU factorization).

As a test structure, the rectangular waveguide loaded with periodically distributed pairs of metallic cylindrical posts in E-plane was chosen (Figure 2). It is divided into M cascaded subdomains in such a way that

Ω_{1}

and

Ω_{M}

are empty sections of the waveguide and each of

Ω_{2} \dots Ω_{M - 1}

comprises one symmetrically placed pair of posts, where

D_{k} = (L_{k} + L_{k + 1}) / 2

. The dimensions in millimeters are:

a = 22.86

,

b = 10.16

,

d = 3.0

,

L_{1} = L_{M} = 15.1

,

g_{k} = 17.0

, and

L_{k} = 14.5

for

k = 2 \dots M - 1

. The posts create

M - 3

resonators so that the structure behaves as a band pass filter, which provides large variations in the frequency response. This makes the structure suitable for an accuracy analysis of the presented methods. Moreover, it allows for easy changing of the number of macromodels and cloning. For maximal cloning,

Ω_{2} \dots Ω_{M - 1}

are kept the same and so are

Ω_{1}

and

Ω_{M}

. In this case, there are

M_{0} = 2

unique subdomains

Ω_{1}

and

Ω_{2}

, so the mesh as well as the diagonalized macromodels need to be created only therein (see Figure 2b). The remaining subdomains are represented by the macromodels that are copied from these unique ones and introduced in the global system matrix during the final assembly. To this end, 2-D meshes on the ports of

Ω_{1}

and

Ω_{2}

are the same. Denoting the macromodel in

Ω_{i}

as

{MM}_{i}

, the cloning scheme reads as follows:

{MM}_{1} \to {MM}_{M}

and

{MM}_{2} \to {{MM}_{3} \dots {MM}_{M - 1}}

.

The test structure for varying number of macromodels (

M = 4 \dots 12

) was analyzed to obtain the S-parameters for the excitation mode

{TE}_{01}

in the frequency band covering all fundamental resonances—from 7 GHz, which is right above the waveguide cut-off, to 16 GHz. Frequency sweeping with 201 frequency points and 45 MHz step was chosen to capture correctly the S-characteristics with sharp resonances. The 3-D mesh consists of 4457 and 12,071 tetrahedrons in

Ω_{1}

and

Ω_{2}

, respectively, while the 2-D mesh has 238 triangles in each port. The port compression for the model-order reduction of the order q was performed using modal expansion by means of the first

p_{0}

analytically defined waveguide TE modes.

To demonstrate accuracy of the model-order reduction, the plots of

S_{11}

and

S_{21}

for FEM with MOR (FEM-MOR) are compared with the plain FEM in Figure 3a. For this test, the structure with six pairs of posts and

M = 8

subdomains was chosen. A more detailed insight into the accuracy of MOR is given in Figure 3b as S-parameter error plots, with the error defined as follows:

S_{i j\, \mathrm{err}} [dB] = 20 \log_{10} (| S_{i j}^{\mathrm{FEM\text{-}MOR}} - S_{i j}^{\mathrm{FEM}} |)

(32)

It expresses the distance in dB on a complex plane between

S_{i j}

for the analyses being compared. For the reduction order

q = 10

and the port size

p_{0} = 10

, the errors are below −45 dB in the full frequency band (mostly below −55 dB), which is sufficient to make the S-parameter plots indistinguishable. The number of unknowns in the FEM system is

N =

80,326 and

\tilde{N} = 1690

for FEM-MOR, which results in the reduction ratio

N / \tilde{N} = 47.5

.
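
The error measure of Equation (32) can be evaluated as in the following Python sketch. The function name `s_error_db` and the sample values are hypothetical and serve only to illustrate the definition.

```python
import math

def s_error_db(s_mor, s_fem):
    """Distance between two complex S-parameter values in dB, Equation (32).

    The two values must differ, since the logarithm of zero is undefined.
    """
    return 20.0 * math.log10(abs(s_mor - s_fem))

# Assumed sample values: a deviation of 0.001 on the complex plane.
err = s_error_db(0.5 + 0.001j, 0.5 + 0.0j)   # about -60 dB
```

On this scale, an error below −45 dB corresponds to a distance of less than about 0.0056 on the complex plane.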

To investigate the accuracy of diagonalization, the S-parameters for FEM-MOR with diagonalization (FEM-MOR-Diag) were compared with FEM-MOR. The error defined similarly to Equation (32) was below −240 dB for both diagonalization algorithms and each of the possible orthogonal decompositions (EVD, SVD, and SchD), meaning that the diagonalization adds only a negligible contribution to the error introduced by MOR itself. Consequently, the plots in Figure 3 would look exactly the same if FEM-MOR were replaced by FEM-MOR-Diag.

The subsequent tests focused on the performance of macromodel diagonalization in terms of computational time. The diagram in Figure 4 defines the execution times of the steps leading from the original FEM system to the solution of the systems of equations at different stages of the procedure involving MOR and diagonalization. The blocks MM and MM-Sol represent the standard MOR procedure presented in [16] to which the proposed MOR with diagonalized macromodels was compared in the following tests.

In the initial approach, only the solution times

t_{R S}

,

t_{D S}

and

t_{D S S}

were considered to show direct effects of diagonalization and the Schur solver. The block matrix multiplication was assumed in the Schur solver if not stated otherwise. The times

t_{R S}

and

t_{D S S}

are compared for

M = 8

macromodels in Figure 5 as the plots versus port size

p_{0}

and reduction order q. They both grew with the problem size

\tilde{N}

(dependent on q and

p_{0}

) similarly; however, the diagonalization combined with the Schur solver clearly brought a substantial decrease in the solution time.

For better insight into these effects, we analyzed solution speedup (speedup ratio), referred to as gain, which is defined as follows: diagonalization gain

G_{D S} = t_{R S} / t_{D S S}

, Schur solver gain

G_{S} = t_{D S} / t_{D S S}

. The latter is a speedup component in

G_{D S}

contributed by the Schur solver used instead of the MATLAB solver. Since the diagonalization gain depends on two factors determining the problem size

\tilde{N}

—macromodel size

2 q p_{0}

and the number of macromodels M— it is depicted as a pair of plots in Figure 6, versus

p_{0}

and M, both with q as a parameter. The

G_{D S}

plots show a significant speedup, which grows almost proportionally with q and

p_{0}

; however, the influence of q on

G_{D S}

is roughly twice as large as that of

p_{0}

. It makes the proposed diagonalization procedure particularly attractive for problems with strong field variations within subdomains. What is more,

G_{D S}

was almost independent of M: it deteriorated very slowly as M increased from 4 to 12, and thus diagonalization is suitable even for analyses with a large number of macromodels.

The Schur solver gain plots versus

p_{0}

are shown in Figure 7 for two options of matrix multiplication. The dedicated block matrix multiplication improved the Schur solver performance more for larger values of

p_{0}

(Figure 7a), which makes it more appealing than the standard MATLAB matrix multiplication (Figure 7b). Although the latter brought higher gain for small

p_{0}

, it is much less relevant, as the corresponding time savings expressed in absolute values were much smaller than those for large

p_{0}

. When comparing

G_{D S}

with the gain of diagonalization involving the standard MATLAB solver, equal to

G_{D S} / G_{S} = t_{R S} / t_{D S}

, one may notice that the Schur solver brought an important contribution to the overall speedup. For instance, for the largest values of q and

p_{0}

, the Schur solver increased the diagonalization gain from 9.2 to 40.5.

For a more complete and practical view of performance of the presented procedures, the computational costs of MOR

t_{R}

and diagonalization

t_{D}

have to be taken into account. To this end, the following definitions of effective gains are introduced:

Effective diagonalization gain: $G_{D S eff} = t_{R S} / (t_{D} + t_{D S S})$
Effective MOR gain: $G_{R eff} = t_{F S} / (t_{R} + t_{R S})$
Effective MOR and diagonalization gain: $G_{R D S eff} = t_{F S} / (t_{R} + t_{D} + t_{D S S})$ .

When referring to the diagram in Figure 4, these effective gains correspond to different routes leading from different starting systems to all possible solutions.

G_{D S eff}

compared the times needed to reach MM-Sol and DiagMM-SchurSol from the box MM (DiagMM-Sol was disregarded as always

t_{D S S} < t_{D S}

). When starting from the original FEM system, two routes were compared with

t_{F S}

: one leading via MM and the other one via MM and DiagMM. They defined

G_{R eff}

and

G_{R D S eff}

, respectively, and expressed the overall performance improvement brought by MOR and MOR with diagonalized macromodels as compared to the plain FEM solution.
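
The three effective gains can be written down directly from their definitions. The following Python sketch uses placeholder times that are not measured values; the function name `effective_gains` is hypothetical.

```python
def effective_gains(t_FS, t_R, t_RS, t_D, t_DSS):
    """Effective gains as defined above (all times in the same unit)."""
    return {
        "G_DS_eff": t_RS / (t_D + t_DSS),          # diagonalization vs. plain MOR
        "G_R_eff": t_FS / (t_R + t_RS),            # MOR vs. plain FEM
        "G_RDS_eff": t_FS / (t_R + t_D + t_DSS),   # MOR with diagonalization vs. plain FEM
    }

# Placeholder times in seconds (assumed for illustration, not measured).
gains = effective_gains(t_FS=100.0, t_R=2.0, t_RS=8.0, t_D=1.0, t_DSS=1.0)
```

With these placeholder values the gains are 4, 10 and 25, which shows how a moderate diagonalization gain translates into a larger overall gain once the MOR cost is shared by both routes.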

The solution time of the original FEM system by means of the MATLAB solver mldivide

t_{F S}

was 104, 372 and 638 for 4, 8 and 12 subdomains, respectively. All times are given in seconds if not stated otherwise. The reduction time for the unique subdomains

Ω_{1}

and

Ω_{2}

is

t_{R 1} = 0.51

and

t_{R 2} = 1.82

, respectively, for

q = 10

,

p_{0} = 10

. Since the diagonalization time depended only on the macromodel size

N_{m} = 2 q p_{0}

regardless of the physical properties of its corresponding subdomain and was the same for all macromodels, it is presented for just one of them and denoted as

t_{D 1}

. Figure 8a compares

t_{D 1}

for all possible combinations of the orthogonal decompositions—SVD, SchD, and EVD—used in Steps (1) and (4) of the diagonalization algorithm Diag-I. As the plots in Figure 8b show for the best two options—SchD-SVD and EVD-SVD—both algorithms Diag-I and Diag-II had almost the same performance. For further analysis, the configuration SchD-SVD-Diag-II was chosen.

The overall MOR and diagonalization times

t_{R}

and

t_{D}

depended on how much macromodel cloning was involved in the analysis, which eventually influenced the aforementioned effective gains. To demonstrate the role of cloning, the following two options were considered:

Maximal cloning possible in the analyzed structure: only $M_{0} = 2$ original macromodels were generated in $Ω_{1}$ and $Ω_{2}$ out of all M macromodels used: $t_{D} = 2 t_{D 1}$ , $t_{R} = t_{R 1} + t_{R 2}$ .
No cloning: MOR and diagonalization were performed in all M subdomains: $t_{D} = M t_{D 1}$ , $t_{R} = 2 t_{R 1} + (M - 2) t_{R 2}$ .
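
The two cloning options can be compared with a short Python sketch. It uses the reduction times quoted above for q = 10 and p0 = 10 (0.51 s and 1.82 s); the function names `mor_cost` and `diag_cost` are hypothetical.

```python
def mor_cost(M, t_R1, t_R2, cloning):
    """Total MOR time t_R for the test structure with M subdomains."""
    if cloning:
        return t_R1 + t_R2                  # only the two unique subdomains
    return 2 * t_R1 + (M - 2) * t_R2        # two empty sections, M - 2 sections with posts

def diag_cost(M, t_D1, cloning):
    """Total diagonalization time t_D for macromodels of equal size."""
    return 2 * t_D1 if cloning else M * t_D1

# Reduction times from the text for q = 10, p0 = 10, in seconds.
t_R_cloned = mor_cost(8, 0.51, 1.82, cloning=True)     # 2.33 s
t_R_full = mor_cost(8, 0.51, 1.82, cloning=False)      # 11.94 s
```

For M = 8 the MOR cost thus grows roughly fivefold when cloning is not used.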

The computation times defined in Figure 4 and the derived gains are summarized in Table 1 for small, medium and large macromodels (

N_{m} = 72, 200

and 360, respectively) and the above-mentioned cloning options. The influence of the macromodel count (

M = 8

and 12) is presented for their medium size, which corresponds to the S-parameters characteristics in Figure 3. More detailed profiles of the resultant gain of the proposed analysis

G_{R D S eff}

are presented in Figure 9 for maximal cloning and Figure 10 for the case without cloning.

The influence of diagonalization cost on the effective diagonalization gain depended mostly on cloning—

G_{D S eff}

decreased with respect to

G_{D S}

less than 1.2 times with cloning and not more than 2 times without it, thus

G_{R D S eff} = 20.1

for the largest macromodels. The most appealing result is that the overall effective speedup of MOR with diagonalization

G_{R D S eff}

could reach extremely high rates for small macromodels, i.e., as large as 369.5 times for

N_{m} = 2 q p_{0} = 72

(

q = 6

,

p_{0} = 6

), if the maximal possible cloning was involved. It increased from 183.2 times for MOR only, owing to the diagonalization. Without cloning,

G_{R D S eff}

dropped to 81.4; however, this was still a very high speedup rate. This effect was mostly related to the loss of

G_{R eff}

itself, because, although the lack of cloning affected equally the resultant costs per macromodel of both MOR and diagonalization, the influence of

t_{R}

prevailed as it is much larger than

t_{D}

.

More details regarding the influence of cloning on the resultant effective speedup can be seen by comparing the plots in Figure 9 and Figure 10. Although cloning apparently improved the overall performance, it nearly did not change the degree of influence of the macromodel size on

G_{R D S eff}

. The plots in Figure 9a and Figure 10a look almost the same, differing by a scale factor approximately equal to

M / M_{0}

. The analogous relations regarding the number of macromodels M were rather different. In the case of cloning,

G_{R D S eff}

increased considerably with M, because the cost of MOR and diagonalization per macromodel decreased (Figure 9b). Without cloning, this did not occur and

G_{R D S eff}

remained almost constant with respect to M with only slight tendency to increase (Figure 10b). This is an important property, as it shows that, even without cloning, which was the worst case, the resultant performance was not affected by the number of macromodels, and thus the proposed approach is advantageous even for the structures partitioned into an increasing number of subdomains.

Another important observation can be made concerning the dependence of the effective gains on macromodel size.

G_{R D S eff}

decreased severely due to the growth of the macromodel generation cost

t_{R}

and the solution time

t_{R S}

. This effect was inherited from

G_{R eff}

but was partly compensated by the diagonalization, owing to two factors:

G_{D S eff}

was large enough that

t_{D S S} + t_{D} ≪ t_{R S}

and it grew with

N_{m}

. As a result, for

N_{m}

changing from 72 to 360, the initial decrease rate of

G_{R eff}

equalled 21 times (with cloning) or 11.3 times (without cloning), reducing to 6 times for

G_{R D S eff}

. Favoring larger macromodels, the diagonalization brought larger relative speedup where it implied more absolute time savings, and thus was needed most. For instance, in the case of the results presented in Figure 3 (

q = 10

,

p_{0} = 10

,

M = 8

, cloning with

M_{0} = 2

), which involved medium size macromodels, the initial time of the FEM solution was reduced from over 6 min (

t_{F S}

) to less than 3 s for MOR with diagonalization (

t_{R D S} = t_{R} + t_{D} + t_{D S S}

). It may be concluded that the diagonalization was particularly beneficial if macromodels represented large subdomains having complex shapes. This effect was even stronger if cloning was applied: the improvement ratio of the effective gain due to the diagonalization

G_{R D S eff} / G_{R eff}

for the largest macromodels (

N_{m} = 360

) and

M = 8

was 2.3 and 7.1 without and with cloning, respectively.

To show the performance of the proposed methods in the case of tuning, the following scenarios were investigated. The structure presented in Figure 2 for $M = 8$ subdomains was regarded as an initial design of a waveguide bandpass filter, which may be tuned by modifying the subdomain lengths $L_{k}$ and the distances between the posts $g_{k}$. The structure was assumed symmetrical with respect to its external ports, thus the following subdomains and their macromodels were identical: $Ω_{2} = Ω_{7}$, $Ω_{3} = Ω_{6}$, $Ω_{4} = Ω_{5}$. Consequently, there were six possible independent tuning parameters: $L_{2} = L_{7}$, $g_{2} = g_{7}$, $L_{3} = L_{6}$, $g_{3} = g_{6}$, $L_{4} = L_{5}$, $g_{4} = g_{5}$. Due to the symmetry, half of all M macromodels were cloned, thus $M_{0} = 4$. The MOR parameters were $p_{0} = 10$ and $q = 10$, corresponding to the S-parameter characteristics depicted in Figure 3a. Two scenarios of tuning were considered: (A) with all $N_{v} = 6$ tuning parameters in all $M_{v} = 3$ modified subdomains; and (B) with $N_{v} = 2$ in a single subdomain ($M_{v} = 1$). The tuning procedure involved a simplified gradient optimization algorithm. In each iteration (out of K), two steps were performed: (1) estimation of the goal function gradient by small perturbations of all $N_{v}$ tuning parameters; and (2) calculation of a new parameter vector. In the case of MOR, Step (1) required $N_{v}$ solutions performed in just one subdomain, whereas, in Step (2), all $M_{v}$ macromodels were recalculated.
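The two-step iteration described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the function `tune`, the perturbation `delta`, the step size `step` and the quadratic stand-in `quadratic_goal` are all hypothetical; in the actual procedure every goal evaluation would involve a FEM or FEM-MOR solution of the perturbed structure.

```python
from typing import Callable, List


def tune(goal: Callable[[List[float]], float],
         x0: List[float],
         iterations: int,
         delta: float = 1e-4,
         step: float = 0.1) -> List[float]:
    """Simplified gradient descent with a forward-difference gradient."""
    x = list(x0)
    for _ in range(iterations):  # K tuning iterations
        f0 = goal(x)
        # Step (1): estimate the gradient by perturbing each of the
        # N_v tuning parameters in turn.
        grad = []
        for i in range(len(x)):
            perturbed = list(x)
            perturbed[i] += delta
            grad.append((goal(perturbed) - f0) / delta)
        # Step (2): calculate the new parameter vector.
        x = [xi - step * gi for xi, gi in zip(x, grad)]
    return x


# Hypothetical stand-in for the filter goal function with N_v = 2.
TARGET = [1.0, -2.0]


def quadratic_goal(x: List[float]) -> float:
    return sum((xi - ti) ** 2 for xi, ti in zip(x, TARGET))


tuned = tune(quadratic_goal, [0.0, 0.0], iterations=100)
print(tuned)
```

In this sketch each iteration costs $N_{v} + 1$ goal evaluations, so the time of a single solution is multiplied accordingly when the costs are cumulated over K iterations.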

The results are presented in Table 2 for two different iteration counts K. The computation times and gains were defined as in the previous tests, but the times were cumulated over all K tuning iterations. Consequently, the resultant cost of MOR, diagonalization and solution of the final system was $t_{FS}$ for the plain FEM, $t_{RRS} = t_{R} + t_{RS}$ for FEM-MOR and $t_{RDS} = t_{R} + t_{D} + t_{DSS}$ for FEM-MOR with diagonalized macromodels. The corresponding effective gains show that the speedup brought by MOR and by MOR with diagonalization was significant and comparable to that of a single simulation for maximal cloning ($M_{0} = 2$) with the same $p_{0}$, $q$, $M$, as presented in Table 1. Importantly, the speedup depended only weakly on the iteration count and the number of tuned parameters, which makes the proposed methods robust and thus suitable for CAD applications.
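As a quick consistency check, the cumulated totals and effective gains of Table 2 can be recomputed from its component times. The sketch assumes the gains are the ratios of $t_{FS}$ to the respective totals (an assumption that matches the tabulated values); variable names are illustrative.

```python
# Rows of Table 2: (scenario, K, t_FS, t_R, t_D, t_RS, t_DSS), cumulated seconds.
TABLE2 = [
    ("A", 5, 13396.0, 87.9, 1.77, 407.0, 15.2),
    ("A", 100, 260842.0, 1644.0, 32.7, 7932.0, 295.1),
    ("B", 5, 5954.0, 33.3, 0.69, 181.0, 6.7),
    ("B", 100, 112002.0, 552.0, 11.0, 3406.0, 126.7),
]

gains = []
for scenario, k, t_fs, t_r, t_d, t_rs, t_dss in TABLE2:
    t_mor = t_r + t_rs               # total cost of FEM-MOR
    t_mor_diag = t_r + t_d + t_dss   # total cost with diagonalized macromodels
    g_r_eff = t_fs / t_mor
    g_rds_eff = t_fs / t_mor_diag
    gains.append((g_r_eff, g_rds_eff))
    print(f"scenario {scenario}, K={k}: "
          f"G_R_eff={g_r_eff:.1f}, G_RDS_eff={g_rds_eff:.1f}")
```

The recomputed values agree with the last two columns of Table 2 to within the rounding of the listed times.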

In the final test, the structure depicted in Figure 2 for $M = 8$ was simulated by means of two commercial software packages, FEKO (version 2018-309, Altair Engineering, Troy, MI, USA) and EMPro (version 2019, Keysight Technologies, Santa Rosa, CA, USA), which are among the leading CAD tools for electromagnetic analysis based on FEM. A similar mesh size, first-order FEM basis functions and a 201-point frequency sweep were set to make the results comparable with the analysis presented in this paper.

In Figure 11, the S-parameter characteristics are compared and a view of the structure meshed in FEKO is depicted. The plots are almost indistinguishable, which further confirms the very good accuracy of the proposed method. The solution times for the commercial software and this analysis are compared in Table 3. For the FEM-MOR-Diag, the solution time was $t_{R} + t_{D} + t_{DSS}$. The results show that even in-house FEM software based on the proposed method of MOR with diagonalized macromodels could significantly outperform commercial FEM software packages.

5. Conclusions

A new technique of local model-order reduction (MOR) in the 3-D finite element method (FEM) for electromagnetic analysis of waveguide components has been proposed to resolve the problem of increasing solution time of the reduced-order system assembled from the macromodels in a decomposed domain. To this end, the diagonalized macromodels created by means of simultaneous diagonalization are used to build the global system. To the best of the author’s knowledge, diagonalized macromodels have not been used previously in FEM analysis to speed up the solution of the system obtained in MOR. The proposed approach is very efficient for two reasons: the diagonalization is performed on small macromodel matrices, and it can be carried out just once for the whole bandwidth owing to the frequency independence of the macromodels. Although the resulting matrix is only partially diagonal, it can be solved very efficiently by a dedicated solver based on the Schur complement technique, which has also been proposed. The numerical validation of the proposed procedures with respect to accuracy and speed was carried out for an exemplary finite periodic waveguide structure partitioned into macromodels of varying size and count and for different options of macromodel cloning. The results show that the introduction of diagonalized macromodels in this work provided a significant solution speedup without accuracy degradation. This is an essential performance improvement over the work in [16], where similar methods of MOR and port compression are used. The solution time is reduced to such an extent that the resultant efficiency of the entire analysis becomes determined almost solely by the cost of MOR. This means that the proposed technique ultimately improves the robustness of the model-order reduction with respect to macromodel size.
Although the overall speedup of MOR with diagonalized macromodels still decreases with the growth of macromodels, the decrease rate is lower than that without diagonalization. The proposed technique is particularly beneficial when the system solution time becomes comparable to the reduction time, which occurs for growing size and count of the macromodels. This also occurs when they are cloned in multiple locations of the structure or are used repeatedly in a tuning and optimization process, which makes the proposed technique desirable in CAD applications.
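The reason a partially diagonal system is cheap to solve can be illustrated with a generic Schur-complement elimination. The sketch below is not the dedicated solver of this paper: it assumes a block system with a diagonal leading block `d` (standing in for the diagonalized macromodel unknowns), coupling blocks `b_blk` and `c_blk`, and a small dense block `a_blk` for the remaining unknowns. All names and the small real-valued test matrices are made up for illustration.

```python
from typing import List, Tuple

Matrix = List[List[float]]
Vector = List[float]


def solve_dense(a: Matrix, b: Vector) -> Vector:
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        acc = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - acc) / aug[r][r]
    return x


def schur_solve(d: Vector, b_blk: Matrix, c_blk: Matrix, a_blk: Matrix,
                rhs_d: Vector, rhs_p: Vector) -> Tuple[Vector, Vector]:
    """Solve [[diag(d), B], [C, A]] [x_d; x_p] = [rhs_d; rhs_p]."""
    nd, n_p = len(d), len(rhs_p)
    # Applying D^{-1} costs only elementwise divisions because D is diagonal.
    dinv_b = [[b_blk[i][j] / d[i] for j in range(n_p)] for i in range(nd)]
    dinv_r = [rhs_d[i] / d[i] for i in range(nd)]
    # Schur complement S = A - C D^{-1} B and the reduced right-hand side.
    s = [[a_blk[i][j] - sum(c_blk[i][k] * dinv_b[k][j] for k in range(nd))
          for j in range(n_p)] for i in range(n_p)]
    r = [rhs_p[i] - sum(c_blk[i][k] * dinv_r[k] for k in range(nd))
         for i in range(n_p)]
    x_p = solve_dense(s, r)
    # Back-substitution for the unknowns of the diagonal block.
    x_d = [dinv_r[i] - sum(dinv_b[i][j] * x_p[j] for j in range(n_p))
           for i in range(nd)]
    return x_d, x_p


# Made-up test system whose exact solution is x_d = [1, 2, 3], x_p = [-1, 0.5].
d = [2.0, 4.0, 5.0]
b_blk = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
c_blk = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
a_blk = [[3.0, 1.0], [1.0, 3.0]]
x_d, x_p = schur_solve(d, b_blk, c_blk, a_blk, [1.0, 8.5, 14.5], [1.5, 5.5])
print(x_d, x_p)
```

Only the small Schur complement requires a dense factorization in this sketch; everything involving the diagonal block reduces to elementwise divisions, which is the property a solver for a partially diagonal matrix can exploit.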

Acknowledgments

This work was supported by the AFarCloud project (www.afarcloud.eu), which received funding from the ECSEL Joint Undertaking (JU) under grant agreement No 783221. The JU receives support from the European Union’s Horizon 2020 research and innovation program and Austria, Belgium, Czech Republic, Finland, Germany, Greece, Italy, Latvia, Norway, Poland, Portugal, Spain, and Sweden.

Conflicts of Interest

The author declares no conflict of interest.

References

1. Jin, J.M. The Finite Element Method in Electromagnetics, 3rd ed.; John Wiley & Sons: Hoboken, NJ, USA, 2014.
2. Salazar-Palma, M.; Djordjevic, A.; Sarkar, T.K.; García-Castillo, L.E.; Roy, T. Iterative and Self-Adaptive Finite-Elements in Electromagnetic Modeling; Artech House: Norwood, MA, USA, 1998.
3. Bai, Z. Krylov subspace techniques for reduced-order modeling of large-scale dynamical systems. Appl. Numer. Math. 2002, 43, 9–44.
4. Feldmann, P.; Freund, R.W. Efficient linear circuit analysis by Padé approximation via the Lanczos process. IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 1995, 14, 639–649.
5. Freund, R.W. Krylov-subspace methods for reduced-order modeling in circuit simulation. J. Comput. Appl. Math. 2000, 123, 395–421.
6. Nguyen, T.S.; Le Duc, T.; Tran, T.S.; Guichon, J.M.; Chadebec, O.; Meunier, G. Adaptive multipoint model order reduction scheme for large-scale inductive PEEC circuits. IEEE Trans. Electromagn. Compat. 2017, 59, 1143–1151.
7. Cangellaris, A.C. Electromagnetic macro-modeling: An overview of current successes and future opportunities. In Proceedings of the Computational Electromagnetics International Workshop, Izmir, Turkey, 10–13 August 2011; pp. 1–6.
8. Kulas, L.; Mrozowski, M. A fast high-resolution 3-D finite-difference time-domain scheme with macromodels. IEEE Trans. Microw. Theory Tech. 2004, 52, 2330–2335.
9. Wu, H.; Cangellaris, A.C. A finite-element domain-decomposition methodology for electromagnetic modeling of multilayer high-speed interconnects. IEEE Trans. Adv. Packag. 2008, 31, 339–350.
10. Zhu, Y.; Cangellaris, A.C. Macro-elements for efficient FEM simulation of small geometric features in waveguide components. IEEE Trans. Microw. Theory Tech. 2000, 48, 2254–2260.
11. Odabasioglu, A.; Celik, M.; Pileggi, L.T. PRIMA: Passive reduced-order interconnect macromodeling algorithm. IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 1998, 17, 645–654.
12. Rubio, J.; Arroyo, J.; Zapata, J. SFELP-an efficient methodology for microwave circuit analysis. IEEE Trans. Microw. Theory Tech. 2001, 49, 509–516.
13. Freund, R.W.; Feldmann, P. Reduced-order modeling of large passive linear circuits by means of the SyPVL algorithm. In Proceedings of the 1996 IEEE-ACM International Conference on Computer-Aided Design, San Jose, CA, USA, 10–14 November 1996; pp. 280–287.
14. de la Rubia, V.; Zapata, J. Microwave circuit design by means of direct decomposition in the finite-element method. IEEE Trans. Microw. Theory Tech. 2007, 55, 1520–1530.
15. Fotyga, G.; Nyka, K.; Kulas, L. A new type of macro-elements for efficient two-dimensional FEM analysis. IEEE Antennas Wirel. Propag. Lett. 2011, 10, 270–273.
16. Fotyga, G.; Nyka, K.; Mrozowski, M. Efficient model order reduction for FEM analysis of waveguide structures and resonators. Prog. Electromagn. Res. 2012, 127, 277–295.
17. Czarniewska, M.; Fotyga, G.; Mrozowski, M. Local Mesh Deformation for accelerated parametric studies based on the Finite Element Method. In Proceedings of the 2017 IEEE MTT-S International Conference on Numerical Electromagnetic and Multiphysics Modeling and Optimization for RF, Microwave, and Terahertz Applications (NEMO), Seville, Spain, 17–19 May 2017; pp. 284–286.
18. Czarniewska, M.; Fotyga, G.; Lamecki, A.; Mrozowski, M. Parametrized Local Reduced-Order Models with Compressed Projection Basis for Fast Parameter-Dependent Finite-Element Analysis. IEEE Trans. Microw. Theory Tech. 2018, 66, 3656–3667.
19. Fotyga, G.; Nyka, K.; Mrozowski, M. Automatic reduction order selection for finite-element macromodels. IEEE Microw. Compon. Lett. 2018, 28, 278–280.
20. Fisher, A.; Rieben, R.N.; Rodrigue, G.H.; White, D.A. A generalized mass lumping technique for vector finite-element solutions of the time-dependent Maxwell equations. IEEE Trans. Antennas Propag. 2005, 53, 2900–2910.
21. Zeng, K.; Jiao, D. Frequency-domain method having a diagonal mass matrix in arbitrary unstructured meshes for efficient electromagnetic analysis. In Proceedings of the 2017 IEEE International Symposium on Antennas and Propagation USNC/URSI National Radio Science Meeting, San Diego, CA, USA, 9–14 July 2017; pp. 1579–1580.
22. Laub, A.J. Matrix Analysis For Scientists And Engineers; Society for Industrial and Applied Mathematics: Philadelphia, PA, USA, 2004.
23. Zhu, Y.; Cangellaris, A.C. Multigrid Finite Element Methods for Electromagnetic Field Modeling; John Wiley & Sons: Hoboken, NJ, USA, 2006; Volume 28.
24. Reddy, C.J.; Deshpande, D.M.; Cockrell, C.R.; Beck, F.B. Finite Element Method for Eigenvalue Problems in Electromagnetics; Technical Report 3485; NASA: Pasadena, CA, USA, 1994.
25. Schöberl, J. NETGEN An advancing front 2D, 3D-mesh generator based on abstract rules. Comput. Vis. Sci. 1997, 1, 41–52.
26. Fotyga, G.; Nyka, K.; Mrozowski, M. Multilevel model order reduction with generalized compression of boundaries for 3-D FEM electromagnetic analysis. Prog. Electromagn. Res. 2013, 139, 743–759.
27. Sheehan, B.N. ENOR: Model order reduction of RLC circuits using nodal equations for efficient factorization. In Proceedings of the IEEE 36th Design Automation Conference, New Orleans, LA, USA, 21–25 June 1999; pp. 17–21.
28. Banerjee, S.; Roy, A. Linear Algebra and Matrix Analysis for Statistics; CRC Press: Boca Raton, FL, USA, 2014.

Figure 1. A source-free domain $Ω$ with external ports $P_{1}, P_{2}$ and internal ports $P_{3}, P_{4}$ interfacing subdomains $Ω_{1}, Ω_{2}, Ω_{3}$ (a). Partitioning of $Ω$ into separate subdomains bounded by split internal ports $P_{3} \to \{P_{31}, P_{32}\}$ and $P_{4} \to \{P_{42}, P_{43}\}$ (b).

Figure 2. The test structure: rectangular waveguide loaded with pairs of metallic cylindrical posts in E-plane and divided into M subdomains $Ω_{1} \dots Ω_{M}$. The dimensions are shown in x-z plane view (a). The waveguide height is b. The perspective view (b) for $M = 6$ shows that the mesh is generated in $Ω_{1}$ and $Ω_{2}$ only.

Figure 3. S-parameters (a); and MOR error $S_{11 err}$ and $S_{21 err}$ (b) of the structure in Figure 2 for $M = 8$ subdomains and macromodels, port size $p_{0} = 10$ and reduction order $q = 10$.

Figure 4. Diagram defining the time cost of MOR, diagonalization and solution of the systems of equations. Abbreviations: FEM, initial FEM system; MM, system with macromodels; DiagMM, system with diagonalized macromodels; -Sol, solution by means of the MATLAB solver mldivide; -SchurSol, solution by means of the Schur solver.

Figure 5. Solution time of the final system for $M = 8$ macromodels: original macromodels (a); and diagonalized macromodels and the Schur solver (b).

Figure 6. Gain of diagonalization with the Schur solver $G_{DS}$: versus port size $p_{0}$ for $M = 8$ (a); and versus number of macromodels M for $p_{0} = 10$ (b).

Figure 7. Gain of the Schur solver over the standard MATLAB solver for the system comprising diagonalized macromodels. The Schur solver is in two options: with (a) or without (b) the dedicated block matrix multiplication.

Figure 8. Diagonalization time $t_{D1}$ for one macromodel of the size $N_{m} = 2 q p_{0}$ for the following configurations: algorithm Diag-I with all possible combinations of orthogonal decompositions (a); and both algorithms Diag-I and Diag-II with the best performing orthogonal decompositions (b).

Figure 9. Effective gain of MOR with diagonalization $G_{RDS eff}$, with cloning: versus port size $p_{0}$ for $M = 8$ (a); and versus number of macromodels M for $p_{0} = 10$ (b).

Figure 10. Effective gain of MOR and diagonalization $G_{RDS eff}$, without cloning: versus port size $p_{0}$ for $M = 8$ (a); and versus number of macromodels M for $p_{0} = 10$ (b).

Figure 11. S-parameters obtained in the commercial software and this analysis (a); and a view of the tested structure meshed in FEKO (b).

Table 1. The times of MOR, diagonalization and solution of the systems of equations (in seconds) along with MOR and diagonalization gains for selected combinations of q, $p_{0}$, M and two options of cloning.

Cloning	q	$p_{0}$	M	$t_{FS}$	$t_{R}$	$t_{D}$	$t_{RS}$	$t_{DSS}$	$G_{DS}$	$G_{DS eff}$	$G_{R eff}$	$G_{RDS eff}$
yes	6	6	8	372	0.85	0.008	1.2	0.144	8.1	7.7	183.2	369.5
	10	10	8	372	2.33	0.052	11.3	0.421	26.9	23.9	27.3	132.9
	12	15	8	372	4.94	0.168	37.6	0.928	40.5	34.3	8.7	61.7
	10	10	12	638	2.33	0.052	16.4	0.642	25.6	23.7	34.0	211.4
no	6	6	8	372	4.38	0.044	1.2	0.144	8.1	6.2	66.9	81.4
	10	10	8	372	11.94	0.289	11.3	0.421	26.9	15.9	16.0	29.4
	12	15	8	372	25.34	0.943	37.6	0.928	40.5	20.1	5.9	13.7
	10	10	12	638	18.78	0.434	16.4	0.642	25.6	15.3	18.1	32.2

Table 2. The cumulated times of MOR, diagonalization and solution of the systems of equations (in seconds) along with MOR and diagonalization gains for two tuning scenarios, $q = 10$, $p_{0} = 10$, $M = 8$, $M_{0} = 4$, and for varying numbers of tuning parameters $N_{v}$, modified subdomains $M_{v}$ and iterations K.

Scenario	$N_{v}$	$M_{v}$	K	$t_{FS}$	$t_{R}$	$t_{D}$	$t_{RS}$	$t_{DSS}$	$t_{RRS}$	$t_{RDS}$	$G_{R eff}$	$G_{RDS eff}$
A	6	3	5	13,396	87.9	1.77	407	15.2	495	105	27.1	127.8
A	6	3	100	260,842	1644	32.7	7932	295.1	9576	1972	27.2	132.3
B	2	1	5	5954	33.3	0.69	181	6.7	214	41	27.8	146.3
B	2	1	100	112,002	552	11.0	3406	126.7	3958	690	28.3	162.4

Table 3. The solution times for the commercial software and this analysis.

Software	FEKO	FEKO	EMPro	This FEM	FEM-MOR-Diag	FEM-MOR-Diag with Cloning
solver	precond. Bi-CGSTAB	direct	direct	direct	direct	direct
no. of FEM DOF	81,331	81,331	80,119	80,326	80,326	80,326
solution time [s]	813	664	416	372	12.6	2.8

© 2019 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Nyka, K. Diagonalized Macromodels in Finite Element Method for Fast Electromagnetic Analysis of Waveguide Components. Electronics 2019, 8, 260. https://doi.org/10.3390/electronics8030260
