1. Introduction
The optimization of shape and topology is one of the areas of rapid growth in the automotive industry. Many practical applications of sophisticated optimization algorithms have been published in recent years, such as the SIMP method [
1], which uses relative densities of the material as design variables (hence the name solid isotropic material with penalization). A method dealing with similar limit-load optimization problems using the SIMP scheme has been described in [
2], highlighting the drawbacks of using this scheme for buckling prediction problems. Another method, bi-directional evolutionary structural optimization (BESO), has been used in these applications, in accordance with [
3]. One of the most recently developed methods for structural optimization is the level-set method (LSM), broadly described in [
4]. The LSM does not use element-based densities as design variables; instead, it changes the shape (the boundary) of the body based on an auxiliary field. Most of these popular methods do not require explicit parametrization; therefore, their solution does not require the user to specify many parameters of the algorithm. On the other hand, parametric optimization, in which the parameters (variables) represent physical quantities, such as dimensions, material properties, loads, etc., requires a detailed definition of the optimization algorithm parameters. This may be seen especially in genetic algorithms, where the population size (number of islands, number of individuals per island, number of generations) may be adjusted freely by the designer, in addition to the probabilities of migration, mutation, and crossover, the elite size, etc. Different combinations of the aforementioned algorithm parameters may show a broad scatter in the convergence rate or the final fitness function (objective function) during optimization. Evolutionary optimization is still an actively developing field. In [
5], the authors proposed a modified version of the genetic algorithm, in which the genetic operators are applied in separate generations, i.e., crossover is performed in every 2n-th generation, while mutation in every (2n + 1)-th generation. The authors utilized a large number of individuals in the population combined with only a few generations. The great extinction genetic algorithm (GEGA) modification proposed in [
6] incorporated nature-based phenomena, such as global disasters, to remove part of the individuals from the population, which are later randomly replaced by new ones. The authors focused on minimizing the number of objective function calls, which may be treated as a method to mitigate one of the major drawbacks of GA—excessive computational effort associated with simulation-based objective function evaluation. To capture the actual global optimum with limited resources, a gradient-based optimization procedure was incorporated after the evolutionary optimization. With a similar number of individuals, yet distributed over different populations, the authors of [
7] were able to optimize the automotive suspension layout with a limited number of generations and small population sizes in the Multi-Island version of the genetic algorithm (MIGA). A recent review of genetic algorithms [
8] points out that the foreseen progress of GAs should cover the tuning of appropriate degrees of crossover and mutation operators. In addition, the hybridization of algorithms is marked as one of the potential solutions to the computationally intensive objective function evaluations associated with GAs. A combination of a global-search GA with local-search techniques has proven to provide an efficient optimization tool, assuring the balance between diversification and computational effort. The authors of [
8] provide numerous references supporting the efficiency of this hybrid approach in many fields of science. In general, different algorithms can be combined sequentially or in parallel, making a hybrid optimization scheme. It is up to the designer which algorithm to apply and in which form (algorithm settings). There is generally no limit to algorithm combinations. In [
9], the authors combined genetic algorithms with simulated annealing, even though both of these algorithms are considered global-search strategies. The minimization of the compensation error of a magnetometer was addressed in their study, and the combination of a population-based GA with single-individual SA helped achieve this goal. The review of hybrid evolutionary optimization [
10] shows that the hybrid approach suffers from problems similar to those of pure GAs. Finding an efficient population size, along with the proper tuning of genetic operators, is key. The authors pointed out that the mutation parameter seems to change its role in the hybrid approach (compared to a GA-only approach), as the purpose of the GA is shifted toward exploration; therefore, the role of mutation should be to steer the algorithm from one basin of attraction to another by assigning it a higher probability. Other hybridization options were highlighted, i.e., pairing the GA with an additional algorithm assuring the feasibility of solutions. In the case of LSM-based topology optimization, this may be achieved by an additional, one-variable optimization of the threshold parameter. However, this method could increase the final optimization time when the univariable optimization is carried out on a very nonlinear and multimodal landscape. In real-world industrial conditions, the time or computational effort available for optimization is limited. In these cases, finding the actual optimum may not be possible; therefore, the designers’ task can be reformulated into finding the best quasi-optimum in the given time or using the available computational resources. Then, having an algorithm with well-tuned parameters is beneficial and desired, as presented, e.g., in [
11,
12]. How to obtain the proper parameter values is one of the engineering challenges, especially when the solution requires time-consuming FEM-based simulation. In these cases, it is beneficial to use the surrogate modeling approach to replace costly numerical simulations with quick metamodel calculations. This approach is the aim of the authors’ research and will be explained in detail in the following sections.
Surrogate modeling, also referred to as metamodeling, is a constantly growing field of science. It allows not only for the replacement of the underlying, costly simulations and physical experiments, but it may also give some insight into the properties of the simulated or tested process, especially when dealing with normalized values, as the coefficients of the metamodel (such as the polynomial regression model) represent the sensitivity or importance of particular terms or parameters [
13]. Recent review papers [
14,
15] show that the majority of the literature studies involve three types of metamodels: artificial neural networks (ANNs), Kriging, and polynomial regression, which, paired with statistical analysis, is referred to as the response surface method (RSM). These surrogates account for 24%, 15%, and 17% of all surrogates in recent papers, respectively. Moreover, the combination of metamodeling with hybrid optimization is an increasingly common practice among researchers. All of the aforementioned models were proposed in the 20th century; however, recent developments focus, e.g., on reducing the number of hyperparameters that need to be optimized during the Kriging fitting process or on the architecture of the ANNs. More complex Kriging models, such as Universal Kriging incorporating a trend model, even though more sophisticated, do not always bring significant improvement due to limitations associated with the response of the underlying regression model (the trend). Nevertheless, part of the research is devoted not to the metamodels themselves, but to techniques that minimize the number of costly training samples needed to fit an efficient surrogate, or even to guide the sampling procedure toward areas of high metamodel uncertainty [
16]—this technique is known as the infill process. Nonetheless, numerous papers utilize surrogates to replace costly computer simulations; e.g., the authors of [
17] minimized the mass of jet engine components using an ANN, which was 40% faster than the optimization carried out on the numerical–analytical model. Moreover, the authors incorporated the infill process during the optimization. In [
18], the Kriging method was used to solve a vibroacoustic problem in which the simulation-based objective function evaluation was too costly to perform standard optimization. In [
19], the authors incorporated the RSM to optimize the titanium alloy sheet forming process for medical applications, similar to the study in [
20], where the energy system optimization was conducted using the RSM.
The paper discusses and presents methods typical for problems where the calculation of the objective function is very time-consuming. Typically, the biggest problem is the determination of the parameters of the optimization algorithms. This is in most cases carried out by running the optimization algorithm multiple times with different parameters. Unfortunately, this is inefficient, and the total calculation time is sometimes unacceptable. Using the methodology presented in this article, it is possible to select the parameters of optimization algorithms at a significantly lower cost, which allows for an effective, time- and cost-acceptable approach to solving the optimization problem utilizing the metamodeling technique. Metamodeling itself is usually used to replace the computer simulation during the optimization process, but for extremely nonlinear and multimodal problems, a sufficiently precise metamodel may not exist. That is why the authors use the surrogates to mimic the behavior of these complex systems and to perform the tuning of the optimization algorithm parameters, which will then be used to perform constrained mass minimization on FEM models.
The description of the optimization problem is given in
Section 2, highlighting the design variables, optimization objectives, and limitations of the FEM-based process.
Section 2 also covers the methods used to solve the optimization problem, including the metamodeling part: the sampling plan for the surrogate training and validation process, the metamodel types, and the quality metrics are described. The last part of
Section 2 concerns optimization algorithms and their parameters, which will be subjected to the tuning process. The first part of
Section 3 shows the results of the metamodeling approach, including a comparative study of the quality metrics. Then, the efficiency of optimization algorithms is shown depending on the algorithm parameters, which are then tuned to provide the most efficient optimization scenario.
Section 3 is concluded with an example of the FEM-based optimization utilizing the best and worst algorithm parameter combinations for comparison and benchmark purposes. All conclusions and results are summarized in
Section 4.
2. Materials and Methods
The shock absorber is a part mounted in every vehicle’s suspension system. Its role is not only to provide the damping response, but also frequently to prevent the wheel from colliding with the (car) body. Therefore, it must withstand severe axial loads, especially when the vehicle goes through an obstacle, such as a pothole. As the energy generated by this impact is usually very high, the loads transferred by the shock absorber may cause it to bend, crack, or buckle, the last being addressed in this paper. In addition, the loads coming from the assembly process or the modular spring (
Figure 1a) may induce significant stress in the structure, affecting the overall strength performance. The engineer’s work is to reduce the mass of this component, utilizing one of many topology or shape optimization techniques.
During the design phase, it is important to predict the proper mechanical response and to ensure that the damper will survive the given impact. This can be achieved by constraining the minimum force at which stability is lost, as shown in
Figure 1b. As a result, the following optimization problem may be formulated:
Objective function:

$$\min_{\mathbf{x}} \; m(\mathbf{x}) = \rho \, V(\mathbf{x})$$

subjected to the constraint:

$$F_{max}(\mathbf{x}) \geq F_{min}$$

where
$\mathbf{x}$ is a vector of design variables controlling the shape of the structure;
$\rho$—material density;
$V$—the volume of the part being optimized;
$F_{max}$—maximum force achieved by the shock absorber before losing stability;
$F_{min}$—the minimum force at which the shock absorber is allowed to lose stability.
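To make the formulation concrete, a minimal Python sketch of the objective and constraint evaluation is given below. The function run_fem() is only a dummy placeholder standing in for the nonlinear FEM analysis, and the density and force values are illustrative assumptions, not the values used in the study.

```python
# Minimal sketch of the objective/constraint evaluation; run_fem() is a dummy
# placeholder for the nonlinear FEM analysis, and all numbers are illustrative.
RHO = 7.85e-9      # assumed steel density, t/mm^3
F_MIN = 30.0e3     # assumed minimum allowed buckling force, N

def run_fem(x_design):
    """Placeholder returning (retained volume, limit load) for a design vector."""
    avg = sum(x_design) / len(x_design)
    volume = 1.0e5 * (1.0 + 0.1 * avg)           # dummy volume, mm^3
    f_max = 40.0e3 * (1.0 - 0.2 * abs(avg))      # dummy limit load, N
    return volume, f_max

def evaluate(x_design):
    """Return (mass, F_max, feasibility flag) for one design vector."""
    volume, f_max = run_fem(x_design)
    mass = RHO * volume                          # objective to be minimized
    feasible = f_max >= F_MIN                    # stability constraint F_max >= F_min
    return mass, f_max, feasible

print(evaluate([0.1] * 27))
```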
The method for obtaining the value of the limit load $F_{max}$ is usually a simulation using the finite element method. In each iteration of the design process, this constraint must be fulfilled. As the mass of the part is minimized, the fulfillment of the stability constraint is ensured by the optimization algorithm. The mass minimization task itself may be resolved in many ways; however, the authors focus on the application of the derivative-free level-set method to nonlinear 3D structures. The shape of the part is controlled by changing the design variable vector $\mathbf{x}$, which affects the distribution of the auxiliary scalar field $\Phi$. For each of the finite elements, the value of the calculated field $\Phi$ is compared with the assumed threshold value $\Phi_t$ in the following way:

$$\text{element } e: \quad \begin{cases} \text{active (material)}, & \Phi_e \geq \Phi_t \\ \text{disabled (removed)}, & \Phi_e < \Phi_t \end{cases}$$

Using this approach, the material can be removed from the structure by adjusting the knot values at each of the radial basis function centers. A more detailed explanation of this method can be found in [
21], where the authors developed a methodology suitable for the optimization of unstable structures. The exemplary distribution of the knots is shown in
Figure 2.
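A minimal sketch of how such an auxiliary field could be evaluated at the element centroids and thresholded is given below. The Gaussian basis, its width, and the threshold value are illustrative assumptions; this is not the authors' implementation.

```python
import numpy as np

def rbf_field(centroids, knot_centers, knot_values, width=1.0):
    """Evaluate the auxiliary scalar field at element centroids as a sum of
    Gaussian radial basis functions; knot_values play the role of design variables."""
    # pairwise distances between element centroids and RBF knot centers
    d = np.linalg.norm(centroids[:, None, :] - knot_centers[None, :, :], axis=2)
    basis = np.exp(-(d / width) ** 2)           # Gaussian RBF
    return basis @ knot_values                   # field value per element

def element_states(phi, threshold=0.0):
    """Boolean mask: True = element kept (active), False = removed."""
    return phi >= threshold

# toy usage: 27 knots in a unit cube, random design vector, dummy element centroids
rng = np.random.default_rng(0)
knots = rng.random((27, 3))
x_design = rng.uniform(-1.0, 1.0, 27)
centroids = rng.random((500, 3))
phi = rbf_field(centroids, knots, x_design)
active = element_states(phi)
print(f"{active.sum()} of {active.size} elements kept")
```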
The main disadvantage of this approach is the time-consuming nonlinear finite element analysis needed to obtain information about the stability response of the system for each design vector. Therefore, the authors decided to use surrogate models to represent the actual nonlinear system. This allowed for the verification of hundreds of thousands of design combinations in a small fraction of the time required for a single FEM simulation, for the purpose of obtaining the best parameters of the optimization algorithm (tuning the optimization algorithm parameters).
The finite element method was used to generate the samples that then fed the metamodels (surrogate models). Two data sets were created to train and validate the metamodels—the artificial neural network and the Kriging.
As stated above, the method for obtaining the structural response of the system was a simulation using the finite element method (FEM). Because the problem incorporates large displacements (geometrical nonlinearities), the interaction between bodies (contact), and elastic–plastic material constitutive models (physical nonlinearities), the solution was derived using the commercial software Abaqus/Standard [
22]. The method for obtaining the new shape for FEM analysis was developed by the authors and consisted of a Python routine, which modified the Abaqus input deck based on all 27 optimization input parameters. These parameters are the coefficients of the radial basis functions with fixed center positions. Based on the calculated auxiliary field values and the assumed threshold, each of the finite elements was judged as active or disabled (taking into account elements removed prior to the numerical analysis), and a new numerical model was created. In each FEM analysis, pinned boundary conditions were used at both ends of the shock absorber. In the first step of the simulations, the assembly loads were applied (bracket clamping and suspension spring preload). The incremental compression bumper load was applied in the second step, up to the point at which the structure loses stability.
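The listing below is a heavily simplified, hypothetical illustration of this kind of input-deck regeneration: it only drops the connectivity lines of removed elements from *Element blocks and ignores the bookkeeping (element sets, section assignments, loads) that a real Abaqus deck requires. It is not the authors' routine.

```python
# Simplified, hypothetical sketch: regenerate the FE model as plain-text input-deck
# editing, keeping only the connectivity lines of elements flagged as active.

def filter_element_block(inp_lines, active_ids):
    """Copy an input deck line by line, dropping connectivity lines of removed
    elements inside *Element blocks; all other lines are passed through unchanged."""
    out, in_element_block = [], False
    for line in inp_lines:
        stripped = line.strip()
        if stripped.lower().startswith("*element"):
            in_element_block = True
            out.append(line)
            continue
        if stripped.startswith("*"):               # any other keyword ends the block
            in_element_block = False
        if in_element_block and stripped:
            elem_id = int(stripped.split(",")[0])  # element ID is the first entry
            if elem_id not in active_ids:
                continue                           # element removed by the level-set field
        out.append(line)
    return out

# usage (illustrative file names):
# lines = open("model.inp").readlines()
# open("model_new.inp", "w").writelines(filter_element_block(lines, set(active_element_ids)))
```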
The considered example of the optimization consisted of 27 design parameters (variables), as shown in
Figure 2 with RBF knots. The sampling plan in this case must be carefully chosen, in order not to make the design matrix excessively large and to cover the design space in the most efficient manner. The main limitation is the total number of function calls (simulations) that can be used to create the sampling plan. Based on the available resources and the average calculation time, the maximum number of samples could not exceed 6750. Additionally, the sampling plan was divided in a 4:1 ratio; therefore, for every four training samples, one validation sample was kept separately for further surrogate quality assessment (validation). In this case, two plans were created:
- 5400 samples for the training set and 1350 samples for the validation set;
- 1350 samples for the training set and 340 samples for the validation set.
In both cases, the design matrices were independent of each other, i.e., two sampling plans were created separately for training and validation data. With the first sets ready (5400 training and 1350 validation samples), an additional small set was created to investigate the effect of sample size on the quality of the metamodel. Therefore, a new sample plan with 340 samples was used for validation purposes, whereas the previous validation set of 1350 samples was used as a training set. In total, three sampling plans were used to create two sets of data for metamodeling purposes.
There are numerous sampling plans available in the literature, among which full-factorial design of experiment (DOE), fractional–factorial DOE, Latin hypercube sampling (LHS), optimal Latin hypercube sampling (OLHS), orthogonal arrays, Sobol sequences, Box–Behnken sampling, and Monte Carlo methods are the most commonly used. The full-factorial DOE approach forces the design matrix to grow exponentially; namely, a full-factorial DOE for 27 factors on two levels requires $2^{27}$, i.e., over 134 million, analysis runs. In addition, only extreme values of the parameters are investigated; therefore, the actual nonlinear relations for parameters within their variation range can be lost. The schematic representation of different sampling plans for a small set of 2D data is shown in
Figure 3. In the case of the factorial design, a uniform spread of design points is observed, even though the projection of the samples onto the axes is sparse. On the other hand, the Monte Carlo approach yields a quite uniform distribution of samples projected onto the axes, yet there may be regions with a high density of design points and regions without any points at all. The best saturation of design points projected onto the axes is visible for the LHS designs. Moreover, the OLHS maximizes the uniformity of the design point distribution over the design area, so that no overlapping points or unsampled areas are observed when utilizing this sampling technique.
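For illustration only, Latin hypercube plans of the sizes used in the study can be generated, e.g., with SciPy's quasi-Monte Carlo module; the variable bounds below are placeholders, and the authors' actual sampling tool is not specified here.

```python
import numpy as np
from scipy.stats import qmc

N_VARS = 27                      # design variables (RBF knot values)
N_TRAIN, N_VALID = 5400, 1350    # sampling plan sizes used in the study

# Latin hypercube samples in the unit hypercube, generated independently
unit_train = qmc.LatinHypercube(d=N_VARS, seed=1).random(n=N_TRAIN)
unit_valid = qmc.LatinHypercube(d=N_VARS, seed=2).random(n=N_VALID)

# scale to assumed variable bounds (symmetric bounds, purely illustrative)
lower, upper = -1.0 * np.ones(N_VARS), 1.0 * np.ones(N_VARS)
x_train = qmc.scale(unit_train, lower, upper)
x_valid = qmc.scale(unit_valid, lower, upper)
print(x_train.shape, x_valid.shape)   # (5400, 27) (1350, 27)
```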
To overcome the excessiveness of a sampling plan, and to achieve multidimensional stratification (i.e., a situation in which the projections of individual samples onto the axes are spread uniformly), Latin hypercube sampling is introduced. Generally, the LHS is generated in a random manner, and thus the uniform space-filling property may not be reached. Therefore, the optimal Latin hypercube sampling was chosen as the source of the sample data. The difference between LHS and OLHS is the optimization phase, in which the even spread of points is achieved using the $\Phi_q$ criterion, which is based on the maximin distance criterion (the minimum distance between any two sample points is maximized), as described in [
23]:
$$\max_{X} \; \min_{1 \leq i < j \leq n} d(\mathbf{x}_i, \mathbf{x}_j)$$

where $d(\mathbf{x}_i, \mathbf{x}_j)$ is the distance measured for any two sample points $\mathbf{x}_i$ and $\mathbf{x}_j$, calculated as:

$$d(\mathbf{x}_i, \mathbf{x}_j) = \left( \sum_{l=1}^{k} \left| x_i^{(l)} - x_j^{(l)} \right|^{t} \right)^{1/t}$$

with $t = 1$ or $t = 2$ depending on the norm used (1 for the Manhattan norm, and 2 for the Euclidean norm).
Once all distances are calculated, they are sorted in ascending order, $d_1 \leq d_2 \leq \ldots \leq d_m$, forming a distance vector. At the same time, an index list $J = (J_1, J_2, \ldots, J_m)$ is created, which consists of elements $J_j$ specifying how many pairs of points in the sampling plan are separated by the distance $d_j$. Then, the optimal sampling plan is achieved if it minimizes the following criterion [
24]:

$$\Phi_q(X) = \left( \sum_{j=1}^{m} J_j \, d_j^{-q} \right)^{1/q}$$
OLHS assists in generating more evenly spread plans, which is not only better for the generality of the sampling plan, but also resolves problems with fitting the Kriging model, which will be discussed in more detail later in the paper. For the training and validation data sets, separate OLHSs were created, each minimizing its own $\Phi_q$ criterion. The matrix optimization was terminated once the assumed time limit was reached.
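A sketch of this criterion, paired with a crude random-restart search standing in for the time-limited OLHS optimization, could look as follows; the plan size and the number of restarts are illustrative.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import qmc

def phi_q(samples, q=2.0, t=1):
    """Morris-Mitchell space-filling criterion Phi_q: lower is better.
    samples: (n, k) array in the unit hypercube; t=1 Manhattan, t=2 Euclidean."""
    d = pdist(samples, metric="cityblock" if t == 1 else "euclidean")
    d_unique, counts = np.unique(d, return_counts=True)   # distances d_j and counts J_j
    return float(np.sum(counts * d_unique ** (-q)) ** (1.0 / q))

# crude stand-in for the time-limited OLHS search: keep the best of several random LHS plans
best_plan, best_score = None, np.inf
for seed in range(20):
    plan = qmc.LatinHypercube(d=27, seed=seed).random(n=200)
    score = phi_q(plan)
    if score < best_score:
        best_plan, best_score = plan, score
print(f"best Phi_q over 20 random plans: {best_score:.4f}")
```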
Two types of metamodels were used in the study, based on their ability to approximate (or interpolate) complex continuous and nonlinear functions without any prior assumptions regarding the function shape (as would be the case for polynomial models) and due to their universality. Studies utilizing similar metamodels were carried out, e.g., in [
25].
The first metamodel is an artificial neural network (ANN) in the multilayer form. The foundations for this model were created in the 1940s by Warren McCulloch and Walter Pitts [
26] as a single artificial neuron (
Figure 4a). Gathering multiple neurons in several layers together formed a simple neural network (
Figure 4b), whose popularity started to grow significantly after the backpropagation learning algorithm was implemented [
27].
There are many variants of ANNs, among which networks with radial basis function in the hidden layer are commonly used for metamodeling purposes [
28,
29], due to their clear architecture and simpler learning process. On the other hand, multilayer perceptron networks may have a complex architecture with numerous activation functions, hidden layer numbers, and features, such as regularization [
30,
31] or output normalization. In addition, there is a broad spectrum of weight and bias optimizers, among which stochastic gradient descent (SGD), SGD with momentum (e.g., NAdam—Nesterov-accelerated adaptive moment estimation), and the limited-memory Broyden–Fletcher–Goldfarb–Shanno (L-BFGS) algorithm are among the most common and well-known in the literature [
32]. This complexity may seem an obstacle; however, it provides a strong, flexible tool to adjust the network to achieve its highest efficiency. The main disadvantage, in this case, is the time-consuming process of finding the optimal network architecture and its set of properties. Once the artificial neural network is trained on a given set of data, the objective function for unknown design vectors is estimated using the weights and biases optimized during the training procedure. The implementation of ANN and FEA in problems similar to the one presented in this paper was described in [
33], where the critical loads for the structures were investigated. The authors created the ANN metamodel using the PyTorch framework for the Python programming language, which enables efficient parallelization, significantly reducing the model training time.
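As an illustration only, a small PyTorch regression network for the 27 design variables might be set up as below; the architecture, activation functions, optimizer, and training loop are assumptions and do not reproduce the authors' network.

```python
import torch
from torch import nn

# illustrative architecture: 27 inputs (design variables) -> 2 outputs (e.g., mass, limit load)
class SurrogateMLP(nn.Module):
    def __init__(self, n_in=27, n_hidden=64, n_out=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_in, n_hidden), nn.Tanh(),
            nn.Linear(n_hidden, n_hidden), nn.Tanh(),
            nn.Linear(n_hidden, n_out),
        )

    def forward(self, x):
        return self.net(x)

def train(model, x_train, y_train, epochs=500, lr=1e-3):
    opt = torch.optim.NAdam(model.parameters(), lr=lr)   # Nesterov-accelerated Adam
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(x_train), y_train)
        loss.backward()
        opt.step()
    return model

# dummy normalized data standing in for the 5400-sample FEM training set
x = torch.rand(5400, 27)
y = torch.rand(5400, 2)
surrogate = train(SurrogateMLP(), x, y)
```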
The second metamodel is called Kriging, after Daniel G. Krige, who developed this interpolation metamodel for geostatistics in his master's thesis [
34]. Although it was designed for mining purposes, such as gold exploration using sampling in a two-dimensional space (map), the method may be adapted for spaces with more dimensions. Kriging models are widely used as surrogate models for costly FEM-based simulations, even when severe nonlinearities are present [
35,
36].
There are several types of Kriging models, including simple, ordinary, universal, and blind [
37], with the first being used in this paper for its generality and ease of implementation (scikit-learn Python framework). The general form of the Kriging predictor is the following:

$$\hat{y}(\mathbf{x}) = \hat{\mu} + \boldsymbol{\psi}^{T} \boldsymbol{\Psi}^{-1} \left( \mathbf{y} - \mathbf{1}\hat{\mu} \right)$$

where: $\hat{\mu}$ is the estimated mean of the underlying process, $\boldsymbol{\Psi}$ is the correlation (covariance) matrix of the training samples, $\boldsymbol{\psi}$ is the vector of correlations between the prediction point $\mathbf{x}$ and the training samples, $\mathbf{y}$ is the vector of observed responses, and $\mathbf{1}$ is a vector of ones.
Most commonly, the Gaussian correlation function is incorporated in the elements of the covariance matrix (defining $\boldsymbol{\Psi}$):

$$\Psi_{ij} = \exp\left( -\sum_{l=1}^{k} \theta_l \left| x_i^{(l)} - x_j^{(l)} \right|^{2} \right)$$

where $\theta_l$ is the theta parameter scaling the width of the basis function in the $l$-th dimension, and $k$ is the number of design variables (dimensions).
Once the training data are shown to the Kriging model, the hyperparameters of the basis functions are optimized by maximizing the ln-likelihood function. The predictions on the unseen data are calculated using both the training data set and the new, unseen data.
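A minimal scikit-learn sketch of such a Kriging (Gaussian process) surrogate with an anisotropic Gaussian kernel is shown below; the data are small random stand-ins for the OLHS plans, and the kernel settings are illustrative rather than those used in the study.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
# small random stand-ins for the training/validation plans (kept small for speed)
x_train, x_valid = rng.random((300, 27)), rng.random((100, 27))
y_train, y_valid = np.sin(x_train.sum(axis=1)), np.sin(x_valid.sum(axis=1))

# anisotropic Gaussian (RBF) kernel: one length-scale per design variable;
# fitting maximizes the log-marginal likelihood of the training data
kernel = ConstantKernel(1.0) * RBF(length_scale=np.ones(27))
krig = GaussianProcessRegressor(kernel=kernel, normalize_y=True, n_restarts_optimizer=2)
krig.fit(x_train, y_train)

print("validation R^2:", r2_score(y_valid, krig.predict(x_valid)))
```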
Based on the previous studies, the authors decided to focus on a combination of two algorithms, both belonging to the group of evolutionary optimization methods: the genetic algorithm in its multi-island version (MIGA) and the evolution strategy (EVOL). This mixture of algorithms is not unique, as other tailor-made optimization schemes utilize a similar strategy [
38].
Gradient methods were not included in this study, as they are not efficient in optimizing systems whose response is effectively quantized, i.e., the objective function is not continuous over the whole parameter domain but takes constant values over wide ranges of parameter values, making the gradient calculation incorrect, as presented in
Figure 5.
The multi-island version of the genetic algorithm is an extension of the work carried out by J. Holland in 1975. Compared to the basic version of the algorithm, the MIGA consists of several autonomous populations, referred to as islands [
39]. Genetic operations, such as selection, crossover, and mutation, are performed (with a certain probability) only among individuals from a single island. Once a fixed number of generations has passed (the migration interval), migration occurs for a certain percentage of individuals from each island (the migration rate). As a result, the populations are mixed, providing more diverse and potentially better-fitted individuals. Elitism is introduced, ensuring that the best individuals are not excluded from the population by the genetic operators and are thus preserved throughout the algorithm run.
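A compact, self-contained sketch of the island scheme with ring migration is given below; it illustrates the mechanism only, not the MIGA implementation used by the authors, and all operator settings are placeholders (fitness calls are not cached, so the 1600-individual budget noted in the comment is nominal).

```python
import random

def evolve_island(pop, fitness, elite=1, p_cx=0.9, p_mut=0.1):
    """One generation on a single island: elitism, tournament selection,
    uniform crossover, and Gaussian mutation (all settings are placeholders)."""
    ranked = sorted(pop, key=fitness)
    new_pop = [ind[:] for ind in ranked[:elite]]                  # elitism
    while len(new_pop) < len(pop):
        a, b = (min(random.sample(ranked, 2), key=fitness) for _ in range(2))
        child = ([x if random.random() < 0.5 else y for x, y in zip(a, b)]
                 if random.random() < p_cx else a[:])
        child = [g + random.gauss(0.0, 0.1) if random.random() < p_mut else g
                 for g in child]
        new_pop.append(child)
    return new_pop

def miga(fitness, n_vars=27, n_islands=4, pop_size=10, generations=40,
         migration_interval=5, migration_rate=0.2):
    """Nominally 4 islands x 10 individuals x 40 generations = 1600 individuals,
    matching the GA share of the evaluation budget."""
    islands = [[[random.uniform(-1.0, 1.0) for _ in range(n_vars)]
                for _ in range(pop_size)] for _ in range(n_islands)]
    for gen in range(generations):
        islands = [evolve_island(pop, fitness) for pop in islands]
        if (gen + 1) % migration_interval == 0:                   # ring migration
            n_mig = max(1, int(migration_rate * pop_size))
            migrants = [sorted(pop, key=fitness)[:n_mig] for pop in islands]
            for i in range(n_islands):
                islands[(i + 1) % n_islands][-n_mig:] = [m[:] for m in migrants[i]]
    return min((ind for pop in islands for ind in pop), key=fitness)

best = miga(lambda x: sum(g * g for g in x))   # toy objective standing in for the surrogate
```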
The evolution strategy (EVOL) is another variant of biology-inspired algorithms; in contrast to genetic algorithms, mutation is the only operator used to create new generations [
40]. The mutation rate is automatically controlled based on the number of (un)successful mutations in past generations, making the process self-adaptive. In this paper, the EVOL algorithm is a (1 + λ) variant, which means that λ offspring are generated by mutation, and the best-fitted of these λ individuals becomes the parent for the next generation.
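The following sketch illustrates a (1 + λ) strategy in which the mutation step is adapted with a simple 1/5-success-type rule; the adaptation constants and starting point are assumptions made for illustration, not the authors' settings.

```python
import random

def evol_1_plus_lambda(fitness, x0, budget=400, lam=4, sigma=0.2):
    """(1 + lambda) evolution strategy with a simple success-based rule for
    self-adapting the mutation step size (an illustrative choice of rule)."""
    parent, f_parent = x0[:], fitness(x0)
    evals = 1
    while evals + lam <= budget:
        offspring = [[g + random.gauss(0.0, sigma) for g in parent] for _ in range(lam)]
        scored = [(fitness(c), c) for c in offspring]
        evals += lam
        f_best, best = min(scored, key=lambda t: t[0])
        if f_best < f_parent:                  # successful generation
            parent, f_parent = best, f_best
            sigma *= 1.22                      # expand the search
        else:
            sigma *= 0.82                      # contract the search
    return parent, f_parent

# usage: refine the best GA individual with the remaining 400 evaluations
x_start = [random.uniform(-1.0, 1.0) for _ in range(27)]
best_x, best_f = evol_1_plus_lambda(lambda x: sum(g * g for g in x), x_start)
```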
The aim of the algorithm parameter optimization is to maximize the optimization performance with a limited number of FEM simulations. The total number of function calls during the bracket mass optimization (described in Section 2) is restricted to 2000, due to the time constraint considered in the paper. Therefore, all of the studies performed on surrogate models were also limited to 2000 calls, of which 80% (1600) were used for the genetic algorithm and 20% (400) for the evolution strategy, creating the hybrid optimization flow, as shown in
Figure 6.
With the fixed optimization scheme, one can start to adjust the parameters of the optimization algorithms. To investigate the influence of these parameters (and their interactions) on the final objective function value, the authors decided to use a statistical approach, namely, design of experiment (DOE). The total number of analyzed parameters in the case considered in this paper is eight; however, the study is divided into two groups to reduce the size of the testing plan. In the first study, the single-population genetic algorithm was examined along with the evolution strategy parameters, and, based on the outcome of the analysis, the optimal values (those minimizing the objective function) were used in the second study of the multi-island version of the GA. More details are presented in the following section, highlighting the hybrid algorithm parameters and their levels.
The hybrid approach with a genetic algorithm that incorporates a single population (the most widely used version of the genetic algorithm—GA)—
Table 1.
Parameter A—Population ratio R—is an artificial parameter controlling the ratio between the number of individuals and the number of generations that make up the total population. It is simply calculated as follows:

$$R = \frac{n_{individuals}}{n_{generations}}$$
Knowing that the total number of function calls (individuals in the whole GA process) is fixed at 1600, one can easily modify the population architecture using the Population Ratio parameter.
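Assuming the ratio defined above and the fixed budget of 1600 calls, the population architecture follows directly, as in the short sketch below; the example ratio values are illustrative.

```python
import math

TOTAL_CALLS = 1600   # fixed GA budget (80% of the 2000 evaluations)

def architecture(ratio_r):
    """Derive (individuals per generation, number of generations) from R,
    with n_ind * n_gen = TOTAL_CALLS and R = n_ind / n_gen."""
    n_gen = round(math.sqrt(TOTAL_CALLS / ratio_r))
    n_ind = round(TOTAL_CALLS / n_gen)
    return n_ind, n_gen

for r in (0.25, 1.0, 4.0):            # example ratio levels (illustrative)
    print(r, architecture(r))          # e.g., R = 1.0 -> 40 individuals x 40 generations
```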
The hybrid approach with a genetic algorithm having multiple populations (multi-island version of the algorithm—MIGA), with genetic operators and penalty function parameters fixed based on results from the first study and parameters given in
Table 2.
Parameters D and E from the first table—Penalty multiplier and Penalty exponent—are the only parameters linked to both of the considered optimization algorithms, where the penalty function is added to the objective value once any of the constraints is violated. The level of penalization may or may not prevent infeasible designs from being used for further genetic operations, depending on the penalty characteristic. The penalty formula is the following:

$$f_p(\mathbf{x}) = f(\mathbf{x}) + M \left[ \max\left( 0, \, F_{min} - F_{max}(\mathbf{x}) \right) \right]^{E}$$

where $M$ is the penalty multiplier and $E$ is the penalty exponent.
There are variants of the penalty function that include a constant term for infeasible designs, but they may act quite conservatively in close proximity to the feasible region; therefore, in this study, the constant term was excluded.
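A minimal sketch of such a penalized objective, assuming the reconstructed form above and an illustrative constraint threshold, is given below.

```python
F_MIN = 30.0e3   # assumed stability constraint threshold, N (illustrative value)

def penalized_objective(mass, f_max, multiplier=1.0e3, exponent=2.0):
    """Mass objective plus an exterior penalty for violating F_max >= F_MIN.
    No constant offset is added, matching the variant used in the study."""
    violation = max(0.0, F_MIN - f_max)
    return mass + multiplier * violation ** exponent

# feasible design: no penalty; infeasible design: penalty grows with the violation
print(penalized_objective(mass=2.5, f_max=35.0e3))   # 2.5
print(penalized_objective(mass=2.1, f_max=28.0e3))   # 2.1 + 1e3 * 2000.0**2
```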
4. Conclusions
The conducted study has shown that the usage of metamodels may greatly increase the performance of optimization algorithms. The authors were able to tune the optimization parameters to increase the efficiency of the proposed algorithms by 12% and 5%, respectively, for the single- and multi-island versions of the hybrid algorithm (comparing the resultant optimal average objective function to the average obtained over all of the analyzed runs). Moreover, the proposed approach reduces the total number of parameter combinations to study, as the experiments are divided between the single- and multi-population implementations of the genetic algorithm (carried out separately, yet sequentially).
To conduct this study, one does not necessarily require an excessive sampling plan for the metamodel training phase. The authors have shown that, even for a significant number of variables, the performance of the metamodels trained on the smaller data set is only mildly decreased ($R^2$ drop of 10% for the response $F_{LIMIT}$) or almost the same ($R^2$ reduced by 2% for the response Mass) compared to the four times larger sampling plan used in the considered work. Nonetheless, the performance of the metamodel is strongly dependent on the underlying nature of the problem. The authors presented a nonlinear buckling example, which included both geometrical and material nonlinearities, making the constraint function extremely sensitive to the design variables.
Compared to other studies described in the literature, this one handles both the evolutionary operators and the penalty-related ones (as a constrained optimization problem is considered). This can lead to surprising conclusions, such as the fact that a proper penalty function may have a more significant impact on the optimization efficiency than the number of individuals or generations of the genetic algorithm. However, this observation may be problem-related, as the authors considered a fixed number of objective function calls, which may not guarantee the attainment of the global optimum. Yet, in real-world engineering optimization problems, one cannot find a single algorithm that will assure this global optimum each time. Nonetheless, the presented method of parameter tuning is universal in its nature and may be adopted for almost all types of optimization algorithms.
In stochastic processes, such as genetic algorithms and evolution strategies, it is beneficial to increase the number of repeated algorithm runs, which is relatively inexpensive when working on quick-to-run metamodels rather than the actual FEM-based or other time-consuming simulations. In this case, the noise associated with the natural spread of evolutionary methods is mitigated, allowing valid conclusions to be drawn regarding the algorithm optimization process.
Finally, tests performed on the actual FEM-based optimization with the proposed hybrid approach show that the conclusions drawn from the metamodel-based study are valid and that they indeed helped in increasing the efficiency of the optimization process. The same observations have been made regarding the level of spread in the single- and multi-population versions of the genetic algorithm in both the FEM- and metamodel-based optimizations, i.e., the MIGA approach yielded a narrower distribution of best individuals compared to the single-population version of the algorithm.
This study may be extended to verify the performance of adaptive genetic operators, such as those in [
44], or nonconstant migration properties, such as those in [
45]. In addition, the usage of elemental responses (such as, e.g., the Huber–Mises–Hencky equivalent stress) may serve as a useful heuristic to steer the optimization process. Moreover, the trained metamodels may be useful for finding areas of low objective function values, from which local-search algorithms based on the actual FEM simulation may be initiated—this could resolve the issue of finding the starting point for deterministic optimizations. Generally, further improvement of the optimization capabilities for the presented industrial problem is the main goal of the authors in subsequent work. Nonetheless, the presented method for parameter tuning of hybrid optimization is universal and may be adapted to numerous fields of engineering.