1. Introduction
Multilayer composite structures are widely used in various industries, including aerospace, automotive, and construction, where the optimization of parameters such as load-bearing capacity, fatigue resistance, and vibration damping is essential [1,2,3,4]. The design of multilayer composite structures poses a significant engineering challenge, particularly in the selection of materials with different mechanical properties and the consideration of economic factors [5]. This process requires the precise selection of materials for individual layers so that the resulting structure meets specified strength, technological, and operational requirements while minimizing costs. The complexity of this issue makes classical single-objective optimization methods insufficient, leading to the necessity of applying a multi-criteria approach that allows the simultaneous analysis of multiple design aspects, including both mechanical and economic parameters.
One of the effective tools for solving complex optimization problems is the application of genetic algorithms (GAs) [6,7,8,9]. These methods enable the efficient exploration of the solution space to minimize the objective function while considering both mechanical and economic constraints. Genetic algorithms, inspired by evolutionary processes, iteratively improve a population of candidate solutions through selection, crossover, and mutation. However, the use of GAs involves repeated objective function evaluations, which, in the case of advanced analyses based on the Finite Element Method (FEM), leads to high computational costs.
To reduce computational time, surrogate models are used to quickly approximate FEM analysis results [10,11,12]. A key challenge is building a sufficient dataset to train the surrogate model. A multi-fidelity (MF) approach is often applied in this process, utilizing FEM models with varying levels of accuracy [13]. Lower-resolution numerical models (low-fidelity, LF) enable fast data generation but require correction techniques to improve their reliability. Various methods exist for integrating results obtained from models of different fidelity levels, including statistical methods and machine learning-based algorithms. The choice of an appropriate strategy affects both the training time and the accuracy of the surrogate model.
Different methods, such as Kriging and co-Kriging, statistical models, and deep neural networks, are commonly used to develop surrogate models [11,12,14,15,16,17,18,19,20]. Each of these approaches offers varying levels of accuracy and computational complexity, making it crucial to assess not only the effectiveness of the surrogate model itself but also its impact on optimization results. Kriging, as one of the widely used interpolation methods, enables the precise modeling of nonlinear relationships between input parameters and FEM analysis results. On the other hand, deep neural networks can model highly complex nonlinear dependencies but require substantial computational resources and large training datasets [21,22,23].
The concept of Curriculum Learning (CL) originates from human cognition, where learning involves acquiring knowledge through exposure to successive samples in a structured sequence—progressing from the simplest to the most complex examples. This idea was first introduced into machine learning by Bengio et al. [24] in 2009, who suggested that such an approach not only accelerates the training process but also improves the quality of the local minima obtained. In subsequent years, the concept of progressive learning was applied across various domains with numerous modifications and extensions. Despite these differences, a common feature of these approaches is the emphasis on refining models by focusing on increasingly challenging or problematic examples. For instance, Hsiao and Chang [25] utilized CL for constructing surrogate models to describe chemical processes (namely, an amine scrubbing process), demonstrating its effectiveness in improving model accuracy.
The concept has also been extended to continual learning, where models are incrementally trained on new data while retaining previously acquired knowledge, as explored by Fayek et al. [26]. Another advanced adaptation, referred to as adaptive continual learning, was proposed by Kong et al. [27], in which each learning step was dynamically adjusted based on results obtained in preceding steps, further enhancing model performance. A notable application of CL to highly complex systems includes its use in modeling unsteady hypersonic flows in chemical nonequilibrium, as demonstrated by Scherding et al. [28]. This study highlights the potential of curriculum-based approaches in computationally demanding simulations, reinforcing the broader utility of this learning paradigm across various scientific and engineering disciplines.
The effectiveness of CL has been demonstrated across various complex applications, where structured, progressive training significantly enhances model performance. In many challenging problems, the approach described in this study as CL provides substantial improvements in result quality while simultaneously reducing computational effort, making it a valuable tool for optimizing surrogate modeling and numerical simulations.
The verification of solution quality cannot rely solely on comparing the surrogate model with FEM results but must also consider the correctness of the optimization at a global level. This necessitates evaluating the influence of the surrogate model on results obtained using GA-based optimization. Furthermore, the optimization process must account for modeling uncertainties, which may require applying probabilistic methods for result analysis.
Due to the multi-criteria nature of the problem and the need for a continuous comparison of optimization results, appropriate indicators are used to assess solution quality [29,30]. One of the widely used tools is the analysis of Pareto fronts, enabling the evaluation of trade-offs between different optimization criteria. The Pareto front identifies a set of non-dominated solutions, where each represents an optimal compromise between multiple optimization objectives. This allows for determining optimal material configurations and evaluating their effectiveness concerning predefined design criteria.
Addressing these challenges is a key aspect of effective composite structure optimization, enabling the development of design methodologies that provide optimal solutions both in technical and economic terms. By integrating computational methods, optimization algorithms, and machine learning techniques, it is possible to create more efficient tools to support the design process of multilayer composites. Modern optimization approaches also consider sustainability and environmental constraints, which can serve as additional factors in the design analysis.
Queipo et al. [31] explored surrogate-based methods for analysis and optimization, addressing key aspects such as loss function selection, regularization criteria, experimental design strategies, sensitivity analysis, and convergence assessment. Their study also illustrated state-of-the-art applications through a multi-objective optimization case involving a liquid rocket injector.
In their comprehensive review, Forrester and Keane [12] examined the latest advancements in surrogate model construction and their integration into optimization strategies. Their work provided a detailed evaluation of the advantages and limitations of various surrogate modeling techniques, offering practical guidelines for their implementation. Additionally, Hwang and Martins [32] analyzed the behavior of several popular surrogate modeling approaches when applied to problems requiring thousands of training samples.
The optimization of the dynamic behavior of shell structures has been widely studied, with numerous algorithms proposed in recent research to tackle this challenge. For example, Jing et al. [33] introduced a sequential permutation search algorithm aimed at optimizing the stacking sequence of doubly curved laminated composite shallow shells to maximize the fundamental frequency. Similarly, Chen et al. [34] developed a multivariate improved sparrow search algorithm to enhance the fundamental frequency of composite laminated cylindrical shells while minimizing vibrational resonance. Chaudhuri et al. [35] performed a numerical investigation into the free vibration response of composite stiffened hypar shells with cutouts, utilizing an FE analysis. Their optimization relied on parametric tuning based on the Taguchi approach to achieve the desired frequency response. Another study by Serhat [36] focused on optimizing the eigenfrequencies of circular cylindrical laminates by examining the influence of parameters such as cylinder length, radius, thickness, and boundary conditions. Likewise, Alshabatat [37] explored the optimization of natural frequencies in circular cylindrical shells using axially functionally graded materials. Collectively, these studies contribute to advancing optimization methodologies for improving the dynamic performance of composite structures.
This study aims to address the challenges associated with optimizing the dynamic properties of multilayer composite structures while minimizing computational costs. The proposed methodology integrates MF FE models with deep neural network-based surrogate modeling, enabling efficient and accurate multi-objective optimization.
The novelty of this research lies in the systematic use of surrogate models within a CL framework specifically tailored for multi-objective optimization. Unlike traditional surrogate modeling approaches, where training is performed using a fixed dataset, the proposed method dynamically improves the surrogate model by incorporating new high-fidelity samples in successive CL iterations. This iterative refinement enhances the predictive accuracy of the surrogate model while ensuring better convergence of the optimization process. By progressively increasing the quality of the surrogate model, the CL-based approach enables a more reliable identification of the Pareto front, leading to improved trade-off solutions between competing objectives while maintaining computational efficiency.
Furthermore, this study explores different architectures for the surrogate model, comparing three distinct configurations. The effectiveness of these variants is assessed using Pareto front quality indicators, providing a comprehensive evaluation of their impact on optimization performance.
By incorporating these innovations, the proposed methodology offers a robust and scalable solution for optimizing composite structures, demonstrating its applicability to engineering problems requiring a balance between accuracy and computational feasibility.
2. Materials and Methods
2.1. Vibration Problem
In dynamic structural analysis, an essential issue is determining the system’s natural frequencies and mode shapes. The equation of motion describing the system’s dynamics can be written as:
$\mathbf{M}\ddot{\mathbf{q}}(t) + \mathbf{C}\dot{\mathbf{q}}(t) + \mathbf{K}\mathbf{q}(t) = \mathbf{F}(t),$
where
$\mathbf{M}$ is the mass matrix;
$\mathbf{C}$ is the damping matrix;
$\mathbf{K}$ is the stiffness matrix;
$\mathbf{q}(t)$ is an $n$-element vector of nodal displacements;
$\mathbf{F}(t)$ is an $n$-element vector of external forces at the nodes;
$n$ is the number of dynamic degrees of freedom.
For the free-vibration analysis, when external forces are absent and damping is neglected, the equation simplifies to:
$\mathbf{M}\ddot{\mathbf{q}}(t) + \mathbf{K}\mathbf{q}(t) = \mathbf{0}.$
Solving this system leads to the so-called eigenvalue problem:
$\mathbf{K}\boldsymbol{\Phi} = \mathbf{M}\boldsymbol{\Phi}\boldsymbol{\Lambda},$
where $\boldsymbol{\Phi}$ represents the matrix of mode shapes $\boldsymbol{\phi}_i$ (stored in successive columns of matrix $\boldsymbol{\Phi}$), and $\boldsymbol{\Lambda}$ is the diagonal matrix of eigenvalues $\omega_i^2$. The angular frequencies $\omega_i$ divided by $2\pi$ yield the natural frequencies $f_i$ corresponding to the vibration shapes $\boldsymbol{\phi}_i$. Determining the system’s eigenvalues and eigenvectors allows for the analysis of the dynamic properties of the structure, which is crucial for designing and optimizing structures subjected to dynamic excitation.
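For illustration, the generalized eigenvalue problem above can be solved numerically as in the following minimal Python/SciPy sketch; the small mass and stiffness matrices are placeholders for this example and do not come from the FE model of the analyzed shell.

```python
import numpy as np
from scipy.linalg import eigh

# Illustrative 3-DOF mass and stiffness matrices (placeholders, not the shell FE model)
M = np.diag([2.0, 1.5, 1.0])
K = np.array([[ 400.0, -200.0,    0.0],
              [-200.0,  350.0, -150.0],
              [   0.0, -150.0,  150.0]])

# Generalized eigenvalue problem K @ Phi = M @ Phi @ Lambda
# eigh returns eigenvalues lam = omega^2 in ascending order and M-orthonormal eigenvectors
lam, Phi = eigh(K, M)

omega = np.sqrt(lam)          # angular frequencies [rad/s]
f = omega / (2.0 * np.pi)     # natural frequencies [Hz]

print("Natural frequencies [Hz]:", np.round(f, 3))
print("Mode shape matrix Phi:\n", np.round(Phi, 3))
```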
2.2. Analysis of Dynamic Parameters to Avoid the Resonance Phenomenon
In the analysis of structures subjected to dynamic loads, a key aspect is optimizing their dynamic properties to prevent resonance, which can lead to catastrophic consequences. Resonance occurs when the excitation frequency coincides with or is very close to one of the system’s natural frequencies, resulting in a rapid increase in vibration amplitude, which may lead to structural failure. To avoid this, it is necessary to shape the system’s natural frequency spectrum appropriately at the design stage.
If the excitation frequency is known, optimizing the natural frequency spectrum involves maximizing the separation of natural frequencies from this value, creating a frequency gap around the excitation frequency. This approach significantly reduces the risk of resonance. It is also crucial for low-stress structures, where even minor vibrations can cause premature damage or degradation of functional properties.
Shaping the natural frequency spectrum can be performed as part of an optimization procedure with a properly defined objective function. In its basic form, the objective function can be formulated to maximize the distance between the natural frequencies and the excitation frequency:
$F_1(\mathbf{x}) = -\min_i \left| f_i(\mathbf{x}) - f_{ex} \right|,$
where the vector $\mathbf{f}(\mathbf{x})$ gathers the natural frequencies $f_i(\mathbf{x})$ of the investigated model obtained for specific values of the design parameters collected in vector $\mathbf{x}$, and $f_{ex}$ stands for the considered excitation frequency. In this paper, $f_{ex} = \dots$ Hz.
If an additional criterion, such as minimizing the structure’s cost, is considered, the optimization problem becomes multi-objective. In this case, the second objective function can be expressed as:
$F_2(\mathbf{x}) = \frac{V}{8}\sum_{i=1}^{8} c_i,$
where $V$ is the total volume of the structure, and $c_i$ is the cost per unit volume of the material used in layer $i$ (with eight equal-thickness layers, each layer occupies a volume of $V/8$).
In this case, the multi-objective optimization aims to minimize both objective functions simultaneously. The standard formulation of the multi-objective optimization problem—finding the values of the arguments collected in an $m$-element vector of the structure’s control parameters for which the two considered objective functions attain the lowest possible values—is given by:
$\min_{\mathbf{x} \in \mathcal{D}} \; \left\{ F_1(\mathbf{x}),\, F_2(\mathbf{x}) \right\},$
where $\mathbf{x}$ is the vector of structure parameters, and $\mathcal{D}$ is the $m$-dimensional space of the decision parameters gathered in vector $\mathbf{x}$.
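The two objective functions can be illustrated with the short Python sketch below; the excitation frequency, layer costs, and material coding are placeholder assumptions, and in the actual procedure the natural frequencies come from the FE or surrogate models described later.

```python
import numpy as np

F_EX = 50.0  # excitation frequency in Hz (placeholder value, not taken from the paper)

def objective_f1(natural_freqs_hz, f_ex=F_EX):
    """Negative distance of the closest natural frequency to the excitation frequency.
    Minimizing this value maximizes the frequency gap around f_ex."""
    return -np.min(np.abs(np.asarray(natural_freqs_hz) - f_ex))

def objective_f2(layer_materials, unit_costs, total_volume):
    """Material cost of the laminate: eight equal-thickness layers, each V/8 in volume.
    layer_materials - material index chosen for each layer,
    unit_costs      - cost per unit volume for each available material."""
    layer_volume = total_volume / len(layer_materials)
    return layer_volume * sum(unit_costs[m] for m in layer_materials)

# Example usage with made-up numbers
freqs = [32.1, 48.7, 61.3, 90.0]          # Hz, e.g. from a surrogate prediction
materials = [0, 1, 2, 0, 1, 2, 0, 1]      # CFRP=0, GFRP=1, tFRP=2 (assumed coding)
costs = {0: 120.0, 1: 40.0, 2: 80.0}      # cost per unit volume (illustrative)
print(objective_f1(freqs), objective_f2(materials, costs, total_volume=1.0))
```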
The solution to the multi-objective optimization problem is the so-called Pareto front. The Pareto front represents the set of non-dominated solutions in a multi-objective optimization problem. A solution is considered non-dominated if no other solution exists that improves one objective without worsening at least one other. In practical applications, the Pareto front provides decision-makers with a range of optimal trade-offs between competing objectives, allowing for a more informed selection of the most suitable design configuration.
To compare results obtained from different optimization approaches, numerical measures of Pareto front quality must be introduced. Pareto front indicators assess the distribution and diversity of solutions. One example is the hypervolume indicator, which measures the volume of space enclosed by the Pareto front concerning a reference point. The greater the value of this indicator, the better the quality of the obtained solutions regarding the distribution of trade-offs among objective functions. Another commonly used metric is the distance of generated solutions from the theoretically optimal Pareto solution, which helps evaluate the accuracy of the optimization process.
Considering these aspects in the design process allows us to obtain a system with optimized dynamic properties while simultaneously minimizing production costs and reducing the risk of damage due to uncontrolled dynamic excitations.
2.3. The Analyzed Structure
The axisymmetric structure analyzed in this study was generated by rotating a flat hyperbola (marked in blue in Figure 1) around a fixed axis. This hyperbola had a predefined fixed start point A and end point C (with prescribed radial coordinates in cm; the overall length equaled 600 cm), while its middle point B could change its position (given by the parameter $d$) along the axis perpendicular to the axis of rotation, allowing control over the shape of the generated shell. This geometry enabled a broad range of structural configurations with varying dynamic and mechanical properties.
The shell was asymmetrically supported—one end was fixed, meaning all degrees of freedom were constrained, while the other end remained free. These boundary conditions led to specific dynamic properties of the structure, directly affecting its natural frequency spectrum and susceptibility to resonance phenomena. The structure is shown in Figure 1.
The analyzed structure was made of a composite material with a constant thickness of 16 mm and consisted of eight layers. Each layer had the same thickness but could be made from one of three available composite materials. Additionally, each layer had a unique fiber orientation, meaning that the orientation of fibers in each layer differed, affecting the mechanical and dynamic properties of the shell.
The complete configuration of the structure was described by the parameter vector $\mathbf{x}$, which consisted of 17 variables: eight fiber orientation angles, material selections for each of the eight layers, and one coordinate defining the position of the middle point of the base hyperbola, $d$; see Equation (7). This set of parameters allowed for a precise modeling of the shell and its optimization concerning various criteria, including structural dynamics, stiffness, and material and manufacturing costs.
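As an illustration of this parameterization, the following Python sketch shows one possible encoding of the 17-variable design vector; the variable bounds and the material coding (CFRP/GFRP/tFRP as 0/1/2) are assumptions made for the example, not values taken from the paper.

```python
from dataclasses import dataclass
import numpy as np

N_LAYERS = 8

@dataclass
class DesignVector:
    """17-variable description of the shell: 8 fiber angles, 8 material indices, shape parameter d."""
    angles_deg: np.ndarray      # fiber orientation angle of each layer, e.g. in [-90, 90]
    materials: np.ndarray       # material index per layer: 0=CFRP, 1=GFRP, 2=tFRP (assumed coding)
    d: float                    # position of the hyperbola middle point B (shape control)

    def to_array(self) -> np.ndarray:
        """Flatten into the 17-element vector x used by the GA and the surrogate model."""
        return np.concatenate([self.angles_deg, self.materials.astype(float), [self.d]])

    @staticmethod
    def from_array(x: np.ndarray) -> "DesignVector":
        return DesignVector(angles_deg=np.asarray(x[:N_LAYERS]),
                            materials=np.rint(x[N_LAYERS:2 * N_LAYERS]).astype(int),
                            d=float(x[-1]))

# Example: a random but feasible design (bounds are illustrative)
rng = np.random.default_rng(0)
x = DesignVector(angles_deg=rng.uniform(-90, 90, N_LAYERS),
                 materials=rng.integers(0, 3, N_LAYERS),
                 d=rng.uniform(50.0, 150.0))
print(x.to_array().shape)  # (17,)
```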
The materials used to construct the shell included two real composite materials: Carbon Fiber-Reinforced Polymer (CFRP) and Glass Fiber-Reinforced Polymer (GFRP). Additionally, a theoretical material, the theoretical Fiber-Reinforced Polymer (tFRP), was introduced for optimization purposes. The parameters of this material were calculated as the average values of the properties of the CFRP and GFRP. The introduction of this material increased the complexity of the optimization task by introducing an additional value for one of the decision variables.
Table 1 presents a summary of the mechanical and physical properties of the available materials.
2.4. Finite Element Models
This study employed two FE models that differed only in their FE size, which effectively means variations in mesh density. Each model consisted of four-node MITC4 multilayered shell elements, which are based on the first-order shear deformation theory [39].
Each layer of the shell structure corresponded to a single composite layer, with potentially different material properties and fiber orientation angles. The maximum side length of an approximately square finite element, denoted as h, for the primary FE model (referred to here as M1), was selected to be small; slight variations existed in both the circumferential and longitudinal directions, and also at different locations along the shell’s axis. In addition to the M1 model, one coarse model, labeled as M5, was introduced, featuring considerably larger elements.
The high-fidelity M1 model served as the basis for constructing a pseudo-experimental model. Meanwhile, the lower-fidelity M5 model contributed to expanding the dataset for training the surrogate model. Given that the element size in the coarser model M5 was four times larger than that in M1, the computational cost was reduced substantially. However, this efficiency gain came at the expense of accuracy—the errors of M5 increased accordingly. While this loss of precision was substantial, the proposed methodology was designed to account for and mitigate this issue.
The pseudo-experimental model was derived from the M1 FE model, where the computed natural frequencies underwent a nonlinear transformation of the form $\mathbf{f}_e = T(\mathbf{f}_{M1})$, where $\mathbf{f}_{M1}$ represents the vector of natural frequencies (in Hz) obtained from the M1 model, corresponding to specific mode shapes within the mode shape matrix $\boldsymbol{\Phi}$ (see [40]). Unlike a typical approach where the frequency vector contains the lowest natural frequencies in sequential order, in this study, it included only frequencies corresponding to selected mode shapes. To enable such selection, the mode shapes obtained from numerical simulations first had to be identified and subsequently filtered to retain only the eleven most relevant ones [40].
This strategy enhanced the accuracy of the surrogate model by focusing on the most meaningful vibrational modes and eliminating unnecessary information that could introduce noise into the learning process. As a result, the optimization procedure benefited from improved convergence and solution quality, as demonstrated in the authors’ previous studies [40].
The transformed vector $\mathbf{f}_e$ represents the pseudo-experimental model’s natural frequencies, and the transformation $T$ mimics experimental testing procedures. The neural network-based approximation of $\mathbf{f}_e$—the surrogate model—is denoted as $\hat{\mathbf{f}}_e$.
It is important to note that the transformation $T$ does not stem from actual experimental research but is instead an attempt to model discrepancies between numerical simulations and laboratory experiments. The authors’ previous studies relied entirely on numerical analyses; thus, incorporating the pseudo-experimental model into the optimization framework enables the consideration of potential deviations encountered in experimental studies. Furthermore, this approach helps address practical limitations related to the number of feasible experimental tests.
2.5. Optimization Strategy Using Genetic Algorithms, Surrogate Models, and Curriculum Learning
The optimization problems given by Equation (6) were herein solved using the Non-dominated Sorting Genetic Algorithm II (NSGA-II) [41], a GA-based, derivative-free multi-objective search method. Genetic algorithms are widely used in complex engineering problems, particularly where traditional optimization methods prove insufficient [6,40]. They operate on a population of candidate solutions and combine deterministic computations with random number generation. The GA’s advantage, crucial for the problem at hand, is its ability to search the entire solution space in pursuit of a global minimum. However, this requires repeated evaluations of the objective function, which is computationally expensive when the FEM is applied. In the proposed optimization procedure, the objective functions were evaluated using a surrogate model instead of FEM calculations; therefore, the GA procedure ran extremely fast.
However, one of the key challenges associated with GAs is the need to repeatedly evaluate the objective functions. This process can be computationally expensive, especially when the objective functions require time-consuming numerical analyses, such as FEM simulations. To significantly mitigate this issue, the present approach employed surrogate models based on deep neural networks (DNNs).
The use of DNNs as surrogate models enables the rapid approximation of analysis results, replacing costly simulations with near-instantaneous predictions. This allows for large-scale optimization while drastically reducing computation time. The effectiveness of this approach was confirmed in the authors’ previous studies, where it was demonstrated that using a DNN for objective function prediction led to a significant reduction in computational burden compared to conventional methods [42].
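The coupling of the GA with the surrogate can be sketched as below using the pymoo implementation of NSGA-II; pymoo is used here only as an illustrative choice (the paper does not name a specific library), and the surrogate call, variable bounds, and cost values are placeholders.

```python
# pip install pymoo
import numpy as np
from pymoo.algorithms.moo.nsga2 import NSGA2
from pymoo.core.problem import ElementwiseProblem
from pymoo.optimize import minimize

F_EX = 50.0  # placeholder excitation frequency [Hz]

def surrogate_predict(x):
    """Stand-in for the trained DNN surrogate: returns 11 'pseudo-experimental' frequencies.
    A smooth analytic dummy is used here instead of the real network."""
    return 20.0 + 10.0 * np.abs(np.sin(x[:11]))

class LaminateShellProblem(ElementwiseProblem):
    def __init__(self):
        # 17 design variables: 8 angles, 8 (relaxed) material indices, 1 shape parameter d
        xl = np.concatenate([np.full(8, -90.0), np.zeros(8), [50.0]])
        xu = np.concatenate([np.full(8,  90.0), np.full(8, 2.0), [150.0]])
        super().__init__(n_var=17, n_obj=2, xl=xl, xu=xu)

    def _evaluate(self, x, out, *args, **kwargs):
        freqs = surrogate_predict(np.asarray(x))
        f1 = -np.min(np.abs(freqs - F_EX))                 # frequency-gap objective
        costs = np.array([120.0, 40.0, 80.0])              # illustrative unit costs
        f2 = np.sum(costs[np.rint(x[8:16]).astype(int)])   # cost objective (equal layer volumes)
        out["F"] = [f1, f2]

res = minimize(LaminateShellProblem(), NSGA2(pop_size=100), ("n_gen", 50), seed=1, verbose=False)
print("Pareto front size:", len(res.F))
```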
The process of selecting DNN parameters required a thorough evaluation of network errors, taking into account the following aspects:
The number of input variables, denoted as I;
The number of layers;
The number of neurons H within each hidden layer (kept the same across all hidden layers within a specific network);
The number of output nodes, denoted as O;
The choice of learning algorithms and regularization techniques, along with other contributing factors;
The choice of activation and loss functions.
A summary of the different network parameter values considered is presented in Table 2. It is worth noting that the architecture 17-50-50-50-11, in combination with the Tanh activation function and the RMSProp learning algorithm, provided optimal performance in over 80% of the evaluated DNNs. This configuration was frequently used alongside Batch Normalization (BN) for regularization and Early Stopping strategies. Also, the best results were achieved using the MAE as the loss function.
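A minimal sketch of such a surrogate in TensorFlow/Keras is shown below, reflecting the best-performing settings of Table 2 (17-50-50-50-11 architecture, Tanh activations, RMSProp, Batch Normalization, Early Stopping, MAE loss); the framework choice and the hyperparameters not listed in Table 2 (learning rate, patience, batch size) are assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers, callbacks

def build_surrogate(n_inputs=17, n_outputs=11, n_hidden=3, width=50):
    """17-50-50-50-11 feed-forward surrogate with Tanh activations and Batch Normalization."""
    model = tf.keras.Sequential([layers.Input(shape=(n_inputs,))])
    for _ in range(n_hidden):
        model.add(layers.Dense(width, activation="tanh"))
        model.add(layers.BatchNormalization())
    model.add(layers.Dense(n_outputs))  # linear output: 11 selected natural frequencies
    model.compile(optimizer=tf.keras.optimizers.RMSprop(learning_rate=1e-3), loss="mae")
    return model

# Early stopping monitors the validation loss, in line with the regularization strategy of Table 2
early_stop = callbacks.EarlyStopping(monitor="val_loss", patience=50, restore_best_weights=True)

# model = build_surrogate()
# history = model.fit(X_train, Y_train, validation_split=0.2,
#                     epochs=2000, batch_size=64, callbacks=[early_stop], verbose=0)
```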
Preparing surrogate models in the form of a DNN requires generating an appropriate dataset for training. To achieve this, an MF approach was introduced to limit the number of calls to the high-fidelity M1 model during data generation. The less accurate M5 model was employed, allowing the acquisition of a large number of training samples at the cost of reduced accuracy. In the authors’ previous study [43], it was demonstrated that increasing the FE size $h$ resulted in a substantial reduction in computational time. However, this simplification came at the expense of accuracy, as the numerical error increased accordingly. This trade-off underscores the necessity of incorporating correction mechanisms, such as auxiliary neural networks, to mitigate the errors introduced by the lower-fidelity M5 model. The number of cases computed using the M1 model (which also provided pseudo-experimental data samples) was an order of magnitude smaller than the number of cases evaluated with the M5 model. To further enhance the accuracy of the surrogate model, auxiliary neural networks were incorporated to compensate for the errors introduced by the lower-fidelity M5 model. The numbers of M1 and M5 model evaluations were denoted as $N_{M1}$ and $N_{M5}$, respectively.
Table 2. Architecture, algorithms, functions, and methods used in DNN simulations [44]. Entries marked with an asterisk (*) provided the best performance.
DNN architecture | * 17-50-50-50-11 |
| … |
| … |
| … |
Learning algorithms | ADAM |
| * RMSProp |
| SGD |
Regularization methods | * Early Stopping |
| Regularization |
| Dropout |
| * Batch Normalization |
Activation functions | SoftMax |
| * Tanh |
| ReLU |
| Sigmoid |
Loss functions | MSE |
| * MAE |
| ArcSin |
Within this framework, two FEM models of varying accuracy were utilized: a high-fidelity model (M1) and a low-fidelity model (M5). The M1 model served as a reference and was used to generate pseudo-experimental data by introducing the nonlinear perturbation function $T$. This function aimed to account for potential discrepancies between numerical results and real experimental data, thereby enabling optimization under conditions closer to real-world scenarios. This introduced an additional verification step, allowing for the assessment of the robustness of the applied optimization methods against inevitable errors and uncertainties present in experimental data. The low-fidelity model M5, on the other hand, facilitated the rapid estimation of preliminary values while significantly reducing computational costs.
The integration of MF modeling with deep neural networks enhanced the efficiency of the surrogate model training process, allowing for more precise representation of dependencies in the design space, while maintaining an acceptable computation time. The following sections provide a detailed discussion of three different approaches, each varying in the construction of surrogate models and their integration with the optimization procedure.
Regardless of the applied variant, the primary objective of the surrogate model remained unchanged. Its purpose was to predict the pseudo-experimental frequency values $\hat{\mathbf{f}}_e$ based on a given vector of model parameters $\mathbf{x}$. Ultimately, regardless of the methodology adopted for constructing and training the surrogate model, its operation can be symbolically depicted as in Figure 2.
The optimization procedure, whose concept is presented in Figure 3, was based on an iterative approach involving multiple refinements of the surrogate model within the framework of CL.
The process begins with data generation, which includes a large number of samples obtained using the simplified M5 model ($N_{M5}$) and a significantly smaller number of samples derived from the pseudo-experimental Me(M1) model ($N_{M1}$). This approach allows for the collection of a comprehensive dataset while simultaneously limiting the computational cost associated with the high-fidelity M1 model.
Based on the generated dataset, a surrogate model in the form of a deep neural network is constructed and appropriately trained. Once the training process is completed and the surrogate model is prepared, the optimization procedure is initiated using a GA. At that stage, the surrogate model plays a crucial role in enabling the efficient and rapid estimation of the objective function values.
After completing the first optimization cycle, the obtained results are validated and subsequently used to build an additional dataset. The new samples focus on regions of the design space located in the vicinity of the optimal solution, facilitating the further refinement of the surrogate model.
In the subsequent steps, the surrogate model is retrained based on the newly generated data, and the optimization process is restarted, this time utilizing the improved surrogate model. The iterative refinement cycles of the surrogate model form the core of the CL approach, where x represents the number of performed iterations.
The procedure terminates after reaching a predefined number of CL cycles, ensuring a systematic improvement in the quality of the surrogate model and yielding the final optimized solution.
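The overall CL loop can be summarized by the following schematic Python sketch; all components (pseudo-experimental response, surrogate training, GA) are replaced by simple runnable stand-ins, so the code illustrates only the data flow of Figure 3, not the actual models used in the study.

```python
import numpy as np

rng = np.random.default_rng(1)

# --- Placeholder components (stand-ins for the FE models, the DNN surrogate, and NSGA-II) ---
def pseudo_experiment(X):
    """Dummy HF response: 11 'frequencies' per design (stand-in for the Me(M1) model)."""
    return 20.0 + 10.0 * np.abs(np.sin(X[:, :11]))

def train_surrogate(X, F):
    """Dummy 'training': store the data and predict by nearest neighbour (stand-in for the DNN)."""
    def predict(x):
        return F[np.argmin(np.linalg.norm(X - x, axis=1))]
    return predict

def run_ga(surrogate, n_candidates=500):
    """Stand-in for NSGA-II: pick the candidate with the largest frequency gap around 50 Hz."""
    C = rng.uniform(-1.0, 1.0, size=(n_candidates, 17))
    gaps = [np.min(np.abs(surrogate(c) - 50.0)) for c in C]
    return C[np.argmax(gaps)]

# --- CL-driven optimization loop (concept of Figure 3) ---
X = rng.uniform(-1.0, 1.0, size=(200, 17))            # initial designs
F = pseudo_experiment(X)                               # initial (pseudo-experimental) responses
surrogate = train_surrogate(X, F)

for cycle in range(3):                                 # x = 3 CL iterations
    best = run_ga(surrogate)                           # optimize on the current surrogate
    X_new = best + 0.05 * rng.normal(size=(50, 17))    # new samples near the current optimum
    F_new = pseudo_experiment(X_new)                   # validate with the HF model
    X, F = np.vstack([X, X_new]), np.vstack([F, F_new])
    surrogate = train_surrogate(X, F)                  # retrain on the enriched dataset

print("Final design (first 3 vars):", np.round(run_ga(surrogate)[:3], 3))
```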
2.5.1. Variant I
In the first approach variant, an auxiliary surrogate model was first developed to generate training data for the primary surrogate model. The purpose of the auxiliary model was to refine the results obtained from the low-fidelity M5 model so that they would closely match the values derived from the pseudo-experimental model Me(M1). To achieve this, FEM calculations were performed for a limited number of cases using both the high-fidelity M1 model and the low-fidelity M5 model. Based on the collected data, an auxiliary model was trained. Its inputs consisted of the structural design parameters, gathered in the vector $\mathbf{x}$, along with a vector $\mathbf{f}_{M5}$ of eleven selected natural frequencies obtained from the M5 model. The neural network was trained to accurately estimate the pseudo-experimental frequencies $\mathbf{f}_e$, which served as approximations of real experimental measurements (see Figure 4a).
Upon completion of the training process, the trained auxiliary surrogate model was used to predict pseudo-experimental frequency values $\hat{\mathbf{f}}_e$ based on the results from rapid calculations using the M5 model only (see Figure 4b).
This approach enabled the generation of a large dataset, which was subsequently used to train the primary surrogate model (see Figure 5a). The role of this final surrogate model was to predict the pseudo-experimental frequency values $\hat{\mathbf{f}}_e$ solely based on the design parameter vector $\mathbf{x}$, eliminating the need for any additional numerical simulations (see Figure 5b).
This methodology significantly reduced the necessity of repeatedly utilizing the computationally expensive M1 model (as well as the pseudo-experimental model). Moreover, it facilitated the development of an accurate and efficient primary surrogate model. The large number of training samples generated by the auxiliary model allowed for precise predictions while maintaining a low computational cost.
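A compact sketch of the variant I pipeline is given below; the network builder mirrors the Table 2 settings, while the datasets are random placeholders standing in for the M1/M5 FE results and the pseudo-experimental targets.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

def dense_net(n_in, n_out, width=50, depth=3):
    """Helper: a Tanh network with Batch Normalization (architecture style of Table 2)."""
    model = tf.keras.Sequential([layers.Input(shape=(n_in,))])
    for _ in range(depth):
        model.add(layers.Dense(width, activation="tanh"))
        model.add(layers.BatchNormalization())
    model.add(layers.Dense(n_out))
    model.compile(optimizer="rmsprop", loss="mae")
    return model

# --- Variant I data pipeline (illustrative shapes; real data come from the M1/M5 FE models) ---
N_HF, N_LF = 300, 3000
X_hf = np.random.uniform(-1, 1, (N_HF, 17));  F_m5_hf = np.random.rand(N_HF, 11)
F_e_hf = np.random.rand(N_HF, 11)             # pseudo-experimental targets for the HF subset
X_lf = np.random.uniform(-1, 1, (N_LF, 17));  F_m5_lf = np.random.rand(N_LF, 11)

# 1) Auxiliary model: (x, f_M5) -> f_e, trained on the small high-fidelity subset
aux = dense_net(n_in=17 + 11, n_out=11)
aux.fit(np.hstack([X_hf, F_m5_hf]), F_e_hf, epochs=5, verbose=0)

# 2) Use the auxiliary model to label the large low-fidelity dataset
F_e_lf = aux.predict(np.hstack([X_lf, F_m5_lf]), verbose=0)

# 3) Primary surrogate: x -> f_e, trained on the enlarged dataset
primary = dense_net(n_in=17, n_out=11)
primary.fit(np.vstack([X_hf, X_lf]), np.vstack([F_e_hf, F_e_lf]), epochs=5, verbose=0)
```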
2.5.2. Variant II
In the second approach, a different architecture was employed for the auxiliary surrogate model, while the primary surrogate model remained unchanged from the first variant. The key modification introduced in this version was the division of the auxiliary neural network structure into two distinct modules: one dedicated exclusively to processing linear dependencies and the other responsible for capturing nonlinear components of the mapping. Despite its more complex architecture, the auxiliary surrogate model remained a single neural network.
This architectural choice for the auxiliary model was based on the assumption that for functions that can be decomposed into linear and nonlinear components, processing these elements separately should yield more accurate approximation results [45,46,47]. By structuring the auxiliary model in this manner, it was possible to better align its design with the characteristics of the data, thereby improving its ability to capture the relationships between structural parameters and the resulting pseudo-experimental frequencies.
The training procedure of the auxiliary surrogate model (see Figure 6a), its application phase (see Figure 6b), and its objective remained identical to those in the first variant. The precomputed values from the simplified M5 model were still utilized and subsequently corrected using the trained network to best match the values obtained from the pseudo-experimental model. The refined data were then used to construct the main surrogate model, whose purpose was to estimate the pseudo-experimental frequency values $\hat{\mathbf{f}}_e$ based solely on the design parameter vector $\mathbf{x}$, eliminating the need for multiple costly numerical computations (see Figure 5b).
A similar modular architecture to the one described above for the auxiliary surrogate model (see Figure 6c) was also tested for the primary surrogate model. The goal was to examine whether separating linear and nonlinear processing could enhance the accuracy of pseudo-experimental frequency predictions. However, the results obtained with this configuration did not show significant improvements over the standard approach, and in some cases even led to increased approximation errors in the surrogate model. Consequently, this approach was abandoned.
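The idea of separating linear and nonlinear processing can be sketched with the Keras functional API as follows; the branch sizes and the way the two modules are combined (summation here) are assumptions made for illustration, not the paper's exact architecture.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

def build_split_auxiliary(n_in=17 + 11, n_out=11, width=50, depth=3):
    """Two-branch auxiliary surrogate: a purely linear path plus a nonlinear (Tanh) path,
    summed at the output. A sketch of the 'separate linear/nonlinear processing' idea."""
    inp = layers.Input(shape=(n_in,))

    # Linear module: a single Dense layer with no activation captures the linear dependencies
    linear_out = layers.Dense(n_out, activation=None, name="linear_branch")(inp)

    # Nonlinear module: stacked Tanh layers capture the remaining nonlinear component
    h = inp
    for _ in range(depth):
        h = layers.Dense(width, activation="tanh")(h)
    nonlinear_out = layers.Dense(n_out, activation=None, name="nonlinear_branch")(h)

    out = layers.Add()([linear_out, nonlinear_out])  # total prediction = linear + nonlinear parts
    model = Model(inputs=inp, outputs=out)
    model.compile(optimizer="rmsprop", loss="mae")
    return model

# model = build_split_auxiliary()
# model.fit(np.hstack([X_hf, F_m5_hf]), F_e_hf, epochs=200, verbose=0)
```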
2.5.3. Variant III
The third variant of the approach differed significantly from the two previous methods. It still utilized two surrogate models; however, their role and application underwent substantial modifications. Unlike variants I and II, where the auxiliary surrogate model was used solely for preparing training data for the primary surrogate model, in this approach, both models were employed simultaneously and actively participated in the entire optimization process.
The first surrogate model was designed to replace computations performed using the simplified M5 model. Its primary function was to directly estimate the selected natural frequencies obtained originally from the M5 model based on the vector of design parameters $\mathbf{x}$. This eliminated the need for the repeated use of the M5 model during the optimization process.
The second surrogate model, in turn, was responsible for estimating the pseudo-experimental frequencies $\hat{\mathbf{f}}_e$, which are essential for the optimization. Its input consisted of an extended vector comprising both the design parameter vector $\mathbf{x}$ and the vector of frequencies $\hat{\mathbf{f}}_{M5}$ obtained from the first surrogate model. As a result, this model accounted for both the structural characteristics and the dynamic properties derived from the analysis of the M5 model (or, more precisely, from the first surrogate model). The training and application of both surrogate models are presented in Figure 7 and Figure 8.
With this configuration, both surrogate models were utilized at every stage of the optimization process.
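A minimal sketch of the variant III chain is shown below; both networks are untrained placeholders built with a simplified version of the Table 2-style architecture, and only the data flow of the chained evaluation is illustrated.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

def dense_net(n_in, n_out, width=50, depth=3):
    """Simplified Tanh network (same style as the earlier sketches)."""
    model = tf.keras.Sequential([layers.Input(shape=(n_in,))])
    for _ in range(depth):
        model.add(layers.Dense(width, activation="tanh"))
    model.add(layers.Dense(n_out))
    model.compile(optimizer="rmsprop", loss="mae")
    return model

# Surrogate A replaces the M5 model: x -> f_M5_hat
surrogate_A = dense_net(n_in=17, n_out=11)
# Surrogate B predicts the pseudo-experimental frequencies: [x, f_M5_hat] -> f_e_hat
surrogate_B = dense_net(n_in=17 + 11, n_out=11)

def predict_pseudo_experimental(x):
    """Chained evaluation used at every optimization step in variant III."""
    x = np.atleast_2d(x)
    f_m5_hat = surrogate_A.predict(x, verbose=0)
    return surrogate_B.predict(np.hstack([x, f_m5_hat]), verbose=0)

# Example call with a random (untrained) chain, just to show the data flow
print(predict_pseudo_experimental(np.random.uniform(-1, 1, 17)).shape)  # (1, 11)
```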
2.6. Indicators: Pareto Front Quality Metrics
The multi-objective optimization problem analyzed in this study involved two objective functions and therefore yielded a two-dimensional Pareto front.
For an objective assessment of the quality of solutions obtained through multi-objective optimization, appropriate evaluation metrics must be introduced. While visual inspection of several Pareto fronts is effective for distinguishing qualitative differences, it becomes insufficient when variations between the compared fronts are merely quantitative. In such cases, the repeated comparison of Pareto fronts necessitates the definition of numerical quality metrics. These indicators allow for an objective evaluation of various characteristics of the analyzed fronts. Audet et al. [30] reviewed a total of 57 performance indicators and categorized them based on the evaluated parameters into four groups: (i) cardinality indicators, (ii) convergence indicators, (iii) distribution and spread indicators, and (iv) convergence and distribution indicators. Alternatively, Tian et al. [48] proposed a more simplified classification, distinguishing only between (i) diversity indicators (assessing the evenness and spread of the Pareto front) and (ii) convergence indicators.
In this study, four indicators were selected. The first was the hypervolume indicator, denoted as $HV$, and the second was the relative hypervolume indicator, denoted as $HV_R$. The hypervolume indicators are classified as convergence and distribution indicators in [30] or as convergence and diversity indicators in [48]. The hypervolume indicator $HV$ is recognized as the most widely used metric [29]. The third metric utilized was the Epsilon ($\varepsilon$) indicator [49], referred to as $I_{\varepsilon}$. It is classified as a convergence indicator in [30] and ranks as the third most frequently used indicator according to [29]. The second most common metric, the Generational Distance indicator $GD$, was applied in this study as the fourth indicator.
Originally introduced by Zitzler [50], the hypervolume indicator measures the area covered by the examined Pareto front $A$ relative to a suitably chosen reference point. When comparing two fronts, $A$ and $B$, this indicator can be adapted as the difference $HV(A) - HV(B)$. If one of the compared fronts represents the true Pareto front (TPF), meaning the optimal front sought during the optimization process, the indicator can be redefined as a unary metric: $HV(TPF) - HV(A)$. The relative hypervolume indicator used in this study is given by:
$HV_R(A) = \dfrac{HV(A)}{HV(TPF)},$
where $HV(TPF)$ and $HV(A)$ denote the areas covered by the TPF and the examined Pareto front $A$, respectively. The true Pareto front was defined in this study as the envelope of the results obtained from all examined approaches and variants considered in the analysis. Therefore, it did not represent a fully legitimate TPF, which should ideally be derived analytically. Instead, it served as the most accurate possible approximation of the true optimal front within the scope of this study.
The third selected indicator, $I_{\varepsilon}(A,B)$, represents the smallest scalar $\varepsilon$ that scales Pareto front $B$ so that every point in the scaled front is dominated by at least one point in $A$. If the second Pareto front corresponds to the TPF, this metric can be treated as a unary indicator, denoted as $I_{\varepsilon}(A)$, which was applied in this form in the present study.
The fourth selected indicator, the Generational Distance indicator $GD$ [51], measures the average distance of the obtained Pareto front solutions from the TPF and is defined as:
$GD(A) = \dfrac{1}{|A|} \left( \sum_{i=1}^{|A|} d_i^{\,2} \right)^{1/2},$
where $|A|$ is the number of solutions in the examined front $A$ and $d_i$ is the Euclidean distance, in the objective space, between the $i$-th solution of $A$ and the nearest point of the TPF.
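For reference, these indicators can be computed for two-objective minimization fronts with the following self-contained NumPy sketch; the fronts and the reference point are illustrative, and the epsilon indicator is implemented in its multiplicative form assuming strictly positive objective values.

```python
import numpy as np

def hypervolume_2d(front, ref):
    """Area dominated by a 2D minimization Pareto front w.r.t. reference point `ref`.
    Assumes the front is non-dominated and every point is better than `ref` in both objectives."""
    F = np.asarray(sorted(front, key=lambda p: p[0]))   # sort by the first objective
    hv = 0.0
    for i, (f1, f2) in enumerate(F):
        next_f1 = F[i + 1, 0] if i + 1 < len(F) else ref[0]
        hv += (next_f1 - f1) * (ref[1] - f2)
    return hv

def epsilon_indicator(front, tpf):
    """Multiplicative epsilon indicator of `front` w.r.t. the TPF (strictly positive objectives)."""
    A, T = np.asarray(front, float), np.asarray(tpf, float)
    return np.max(np.min(np.max(A[:, None, :] / T[None, :, :], axis=2), axis=0))

def generational_distance(front, tpf):
    """Generational Distance: averaged Euclidean distances of `front` solutions to the nearest TPF point."""
    A, T = np.asarray(front), np.asarray(tpf)
    d = np.min(np.linalg.norm(A[:, None, :] - T[None, :, :], axis=2), axis=1)
    return np.sqrt(np.sum(d ** 2)) / len(A)

# Illustrative fronts (minimization) and a reference point worse than all solutions
tpf   = np.array([[1.0, 5.0], [2.0, 3.0], [4.0, 1.5]])
front = np.array([[1.2, 5.2], [2.5, 3.1], [4.5, 1.8]])
ref   = np.array([6.0, 7.0])

hv_r = hypervolume_2d(front, ref) / hypervolume_2d(tpf, ref)
print(round(hv_r, 3), round(epsilon_indicator(front, tpf), 3), round(generational_distance(front, tpf), 3))
```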