**Preface to "Numerical and Evolutionary Optimization 2020"**

This volume was inspired by the 8th International Workshop on Numerical and Evolutionary Optimization (NEO), hosted by the Universidad de la Sierra Juárez, Oaxaca, Mexico, the Universidad Veracruzana, Xalapa, Mexico, and the Cinvestav-IPN, Mexico City, Mexico. The workshop was held on November 18 and 19, 2020, as an online-only event and was attended by a total of around 70 researchers, plus another 130 students from the Research Experience Day of NEO 2020.

Solving real-world scientific and engineering problems is currently a very complicated task, which is why the development of powerful search and optimization techniques is of great importance. Two well-established fields focus on this duty: (i) traditional numerical optimization techniques and (ii) bio-inspired metaheuristic methods. Both general approaches have unique strengths and weaknesses, allowing researchers to solve some challenging problems while still failing in others. The goal of NEO is to gather people from both fields to discuss, compare, and merge these complementary perspectives. Collaborative work allows researchers to maximize the strengths and minimize the weaknesses of both paradigms. NEO also intends to help researchers in these fields to understand and tackle real-world problems such as pattern recognition, routing, energy, production lines, prediction, and modeling, among others.

This Special Issue consists of 16 research papers. In the first paper, https://doi.org/10.3390/mca26010005, K. Deb et al. survey surrogate modeling approaches for the numerical treatment of multi-objective optimization problems. Moreover, the authors propose an adaptive switching-based metamodeling approach yielding results that are highly competitive with state-of-the-art approaches.

The following ten papers are devoted to the design of new algorithms for particular optimization problems. In the second paper, https://doi.org/10.3390/mca26020031, M. Berkemeier and S. Peitz present a local trust region descent algorithm for unconstrained and convexly constrained multi-objective optimization problems. The method targets problems in which at least one objective function is computationally expensive. Convergence of the derivative-free method to a Pareto critical point is proven. In the third paper, https://doi.org/10.3390/mca26020028, M. Perez-Villafuerte proposes a new hybrid multi-objective evolutionary optimization algorithm, called P-HMCSGA, that allows decision makers' preferences to be incorporated already in early stages of the optimization process. The strength of the novel method is illustrated on real-size multi-objective project portfolio problems. In the fourth paper, https://doi.org/10.3390/mca26020027, A. Castellanos-Alvarez et al. propose a method, NSGA-III-P, for the integration of preferences into a multi-objective evolutionary algorithm using ordinal multi-criteria classification. Numerical results show that the new method is capable of identifying the region of interest specified by the decision maker. In the fifth paper, https://doi.org/10.3390/mca26020035, T. Macias-Escobar et al. propose a new interactive recommendation system for the decision-making process based on the characterization of cognitive tasks. The system focuses on a user–system interaction that guides the search towards the best solution considering a decision maker's preferences. The developed prototype has been assessed by several test users, leading to a satisfying score and broad overall acceptance.

In the sixth paper, https://doi.org/10.3390/mca25040072, J.-Y. Guzmán-Gaspar et al. present an empirical comparison of standard differential evolution (DE) against three random sampling methods for solving particular robust optimization problems in dynamic environments. The findings indicate that DE is a suitable algorithm for this type of dynamic search space when a survival time approach is considered. In the seventh paper, https://doi.org/10.3390/mca26020039, J. P. Sánchez-Hernández et al. address the protein folding problem. To this end, they present the algorithm GRSA-SSP, a hybrid of golden ratio simulated annealing with a secondary structure prediction. Numerical results show that the new algorithm is competitive with the state of the art on small peptides, except when predicting the largest peptides. In the eighth paper, https://doi.org/10.3390/mca26020036, A. Estrada-Padilla et al. propose a new methodology to deal with uncertainties in multi-objective portfolio optimization problems using fuzzy numbers. The results show a significant difference in performance favoring the proposed steady-state algorithm based on the fuzzy adaptive multi-objective evolutionary (FAME) methodology. In the ninth paper, https://doi.org/10.3390/mca26010008, J. Frausto-Solis et al. propose two multi-objective job shop scheduling metaheuristics based on simulated annealing: Chaotic Multi-Objective Simulated Annealing (CMOSA) and Chaotic Multi-Objective Threshold Accepting (CMOTA). Numerical results indicate that the two novel methods are highly competitive with the state of the art. In the tenth paper, https://doi.org/10.3390/mca26010013, L. G. de la Fraga analyzes the use of 16-bit numbers in the conventional differential evolution (DE) algorithm. It is shown that the additional use of fixed-point arithmetic can speed up the evaluation time of the objective function. In the eleventh paper, https://doi.org/10.3390/mca25040080, F. Beltran et al. deal with a continuation method for the numerical treatment of multi-objective optimization problems. More precisely, the Pareto Tracer is extended to treat general inequalities, which greatly enhances its applicability.

The last five papers of this Special Issue deal with the numerical treatment of particular real-world applications. In the twelfth paper, https://doi.org/10.3390/mca26020046, preliminary results are presented of a study of 17 interconnected power generation plants situated in eastern Mexico. The study shows that fossil fuel plants, besides emitting greenhouse gases that affect human health and the environment, incur maintenance expenses even without operation. In the thirteenth paper, https://doi.org/10.3390/mca26020029, J. Frausto-Solis et al. propose a new method for confirmed cases of COVID-19 in the United States, Mexico, Brazil, and Colombia, based on Component Transformation and Convolutional Neural Networks. Numerical results show that it consistently achieves highly competitive results in terms of the MAPE metric. In the fourteenth paper, https://doi.org/10.3390/mca25040073, X. Cai et al. propose and analyze a novel framework for multi-objective risk-informed decision support systems for the drainage rehabilitation problem. This study shows that the conventional framework can be significantly improved in terms of calculation speed and cost-effectiveness by removing the constraint function and adding more objective functions. In the fifteenth paper, https://doi.org/10.3390/mca26010006, I. Bahreini Toussi et al. investigate the impact forces caused by liquid storage tanks, which can lead to structural damage as well as economic and environmental losses. To this end, an OpenFOAM numerical model is used to simulate various tank sizes with different liquid heights. The last contribution of this Special Issue is given by the sixteenth paper, https://doi.org/10.3390/mca25040076. P. R. Castañeda-Aviña et al. design an analog circuit, a voltage-controlled oscillator (VCO), optimized using Differential Evolution. It is shown that the suggested approach yields highly robust solutions.

Finally, we thank all participants of NEO 2020 and hope that this book serves as a contemporary reference for the field of numerical and evolutionary optimization and its exciting applications.

## **Marcela Quiroz-Castellanos, Juan Gabriel Ruiz Ruiz, Luis Gerardo de la Fraga, Oliver Schütze** *Editors*

## *Review* **Surrogate Modeling Approaches for Multiobjective Optimization: Methods, Taxonomy, and Results**

**Kalyanmoy Deb \*,†, Proteek Chandan Roy † and Rayan Hussein †**

Computational Optimization and Innovation (COIN) Laboratory, Michigan State University, East Lansing, MI 48824, USA; royproteekchandan@gmail.com (P.C.R.); husseinr@egr.msu.edu (R.H.)

**\*** Correspondence: kdeb@egr.msu.edu

† These authors contributed equally to this work.

**Abstract:** Most practical optimization problems comprise multiple conflicting objectives and constraints whose evaluation involves time-consuming simulations. Constructing metamodels of objectives and constraints from a few high-fidelity solutions and subsequently optimizing the metamodels to find in-fill solutions in an iterative manner remains a common metamodeling-based optimization strategy. The authors have previously proposed a taxonomy of 10 different metamodeling frameworks for multiobjective optimization problems, each of which constructs metamodels of objectives and constraints independently or in an aggregated manner. Of the 10 frameworks, five follow a generative approach, in which a single Pareto-optimal solution is found at a time, while the other five were proposed to find multiple Pareto-optimal solutions simultaneously. Two of the frameworks (M3-2 and M4-2), involving multimodal optimization methods, are detailed here for the first time. In this paper, we also propose an adaptive switching-based metamodeling (ASM) approach that switches among all 10 frameworks in successive epochs using a statistical comparison of the metamodeling accuracy of all 10 frameworks. On 18 problems with three to five objectives, the ASM approach performs better than any individual framework alone. Finally, the ASM approach is compared with three other recently proposed multiobjective metamodeling methods, and superior performance of the ASM approach is observed. With growing interest in metamodeling approaches for multiobjective optimization, this paper evaluates existing strategies and proposes a viable adaptive strategy, portraying the importance of using an ensemble of metamodeling frameworks for more reliable multiobjective optimization under a limited budget of solution evaluations.

**Keywords:** surrogate modeling; multiobjective optimization; evolutionary algorithms; kriging method; ensemble method; adaptive algorithm

#### **1. Introduction**

Practical problems often require expensive simulation of accurate high-fidelity models. To get close to the optimum of these models, most multiobjective optimization algorithms need to compute a large number of solution evaluations. In practice, however, only a handful of solution evaluations are allowed due to the overall time budget available for solving such problems. Researchers usually resort to surrogate models, or metamodels, constructed from a few high-fidelity solution evaluations to replace computationally expensive models in driving an optimization task [1–3]. For example, Gaussian process models (Kriging) or response surface methods are commonly used. The Kriging method is of particular interest, since it provides both an approximated function and an estimate of the uncertainty of its prediction [4].
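The dual output of Kriging mentioned above (a prediction plus an uncertainty estimate) can be illustrated with a minimal Gaussian-process regression sketch. This is not the paper's implementation (the authors use a Kriging library [56]); it is a bare-bones version with a fixed RBF kernel and hypothetical length-scale and noise parameters, showing that the predictive standard deviation vanishes at sampled points and grows away from them:

```python
import numpy as np

def kriging_fit_predict(X, y, Xq, length_scale=1.0, noise=1e-10):
    """Minimal Gaussian-process (Kriging) regression with an RBF kernel.
    Returns the predictive mean and standard deviation at query points Xq."""
    def rbf(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-0.5 * d2 / length_scale**2)

    K = rbf(X, X) + noise * np.eye(len(X))   # training covariance matrix
    Kq = rbf(Xq, X)                          # query/training covariance
    mean = Kq @ np.linalg.solve(K, y)        # predictive mean k^T K^{-1} y
    v = np.linalg.solve(K, Kq.T)             # K^{-1} k(Xq)
    var = 1.0 - np.sum(Kq * v.T, axis=1)     # predictive variance
    return mean, np.sqrt(np.maximum(var, 0.0))

# Fit a 1-D toy function; query at a training point (0 uncertainty)
# and at an unsampled point (positive uncertainty).
X = np.array([[0.0], [1.0], [2.0]])
y = np.sin(X).ravel()
mean, std = kriging_fit_predict(X, y, np.array([[1.0], [0.5]]))
```

At the training point `x = 1.0` the model reproduces `sin(1)` with near-zero standard deviation, while at the unsampled point `x = 0.5` the standard deviation is strictly positive; it is exactly this uncertainty signal that metamodel-based optimizers exploit.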

In extending the metamodeling concept to multiobjective optimization problems, an obvious issue is that multiple objective and constraint functions are required to be metamodeled before proceeding with the optimization algorithm. Despite this challenge of multiple metamodeling efforts, a good number of studies have been made to solve

**Citation:** Deb, K.; Roy, P.C.; Hussein, R. Surrogate Modeling Approaches for Multiobjective Optimization: Methods, Taxonomy, and Results. *Math. Comput. Appl.* **2021**, *26*, 5. https://doi.org/10.3390/mca26010005

Received: 25 October 2020 Accepted: 27 December 2020 Published: 31 December 2020

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/ licenses/by/4.0/).

computationally expensive multiobjective optimization problems using metamodeling-based evolutionary algorithms [5–10]. However, most of these studies ignored constraints, and extending an unconstrained optimization algorithm to constrained optimization is not trivial [11]. In any case, the structure of most of these methods is as follows. Starting with an initial archive of solutions obtained by a usual Latin hypercube sampling, a metamodel for each objective and constraint function is built independently [12,13]. Then, in an *epoch* (one cycle of metamodel development and its use to obtain a set of *in-fill* solutions), an evolutionary multiobjective optimization (EMO) algorithm is used to optimize the metamodeled objectives and constraints to find one or more in-fill points. Thereafter, the in-fill points are evaluated using high-fidelity models and saved into the archive. Next, new metamodels are built using the augmented archive members, and the procedure is repeated over several epochs until the allocated number of solution evaluations is consumed [5,14–19]. Many computationally expensive optimization problems involve noisy high-fidelity simulation models. Noise can come from the inputs, from stochastic processes of the simulation, or from the output measurements. In this paper, we do not explicitly discuss the effect of noise in handling metamodeling problems, but we recognize that it is an important matter in solving practical problems.
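The epoch-based loop just described can be sketched under strong simplifying assumptions: a hypothetical one-variable, single-objective problem, uniform sampling instead of Latin hypercube sampling, a quadratic least-squares fit standing in for Kriging, and a grid search standing in for the EMO algorithm. Only the archive/metamodel/in-fill cycle itself is faithful to the text:

```python
import numpy as np

rng = np.random.default_rng(0)

def expensive_f(x):                      # stand-in for a costly simulation
    return (x - 0.3) ** 2

# 1. Initial archive of high-fidelity evaluations (LHS in practice).
X = rng.uniform(0.0, 1.0, size=8)
Y = expensive_f(X)

SE_MAX, PER_EPOCH = 20, 3                # total budget, in-fills per epoch
while len(X) < SE_MAX:
    # 2. Build a cheap surrogate from the archive.
    surrogate = np.poly1d(np.polyfit(X, Y, deg=2))
    # 3. Optimize the surrogate to propose in-fill points.
    grid = np.linspace(0.0, 1.0, 1001)
    infill = grid[np.argsort(surrogate(grid))[:PER_EPOCH]]
    # 4. High-fidelity evaluation of in-fills; augment the archive.
    X = np.concatenate([X, infill])
    Y = np.concatenate([Y, expensive_f(infill)])

best = X[np.argmin(Y)]                   # archive minimizer after the budget
```

The loop terminates exactly when the evaluation budget `SE_MAX` is consumed, and the final answer is read off the archive, mirroring the epoch structure used throughout the paper.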

In a recent taxonomy study [20], the authors categorized different plausible multiobjective metamodeling approaches into 10 frameworks, of which the above-described popular method falls within the first two (M1-1 or M1-2, depending on whether a single or multiple nondominated in-fill solutions are found in each epoch). The other eight frameworks are not straightforward extensions of single-objective metamodeling approaches to multiobjective optimization and hence were not explored in the past. Moreover, the final two frameworks (M5 and M6) attempt to metamodel an EMO algorithm's implicit overall fitness (or selection) function directly, instead of metamodeling aggregate or individual objective and constraint functions. One advantage of formulating a taxonomy is that any foreseeable future metamodeling method can also be categorized as falling within one of the 10 frameworks. Moreover, the taxonomy provides new insights into other, currently unexplored ways of handling metamodels within a multiobjective optimization algorithm.

So far, each framework has been applied alone in one complete optimization run to solve a problem, but in a recent study [21], manually switching from one framework to another after 50% of the allocated solution evaluations produced improved results. An optimization process passes through different features of the multiobjective landscape, and it is natural that different metamodeling frameworks may be efficient at different phases of a run. These studies are the genesis of the current study, in which we propose an adaptive switching-based metamodeling (ASM) approach that automatically finds the best-performing of the 10 frameworks at the end of each epoch through a detailed statistical study, thereby establishing a self-adaptive and efficient overall metamodeling-based optimization approach.
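The core of such an adaptive switch is a per-epoch selection rule. The following sketch is a deliberately simplified stand-in: where the ASM approach uses a detailed statistical comparison of metamodeling accuracy, this hypothetical `select_framework` helper just compares mean error over hypothetical per-framework error samples (e.g., leave-one-out errors on the archive):

```python
import numpy as np

def select_framework(errors):
    """Given per-framework metamodeling error samples measured on the
    current archive, return the index of the framework to use in the
    next epoch. A statistical test would compare the samples; here the
    mean error serves as a simple stand-in."""
    return int(np.argmin([np.mean(e) for e in errors]))

# Hypothetical error samples for three competing frameworks.
errors = [np.array([0.90, 1.10, 1.00]),
          np.array([0.20, 0.30, 0.25]),
          np.array([0.50, 0.60, 0.40])]
chosen = select_framework(errors)   # index of the lowest-error framework
```

Calling this once at the end of every epoch, with the error samples refreshed from the augmented archive, yields the switching behavior that the ASM approach formalizes statistically.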

In the remainder of the paper, Section 2 summarizes recent related work. Section 3 provides a brief description of each of the 10 metamodeling frameworks for multiobjective optimization. The proposed ASM approach is described in Section 4. Our extensive results on unconstrained and constrained test problems, for each framework alone and for the ASM approach, are presented in Section 5. A comparative study of the ASM approach with three recent existing algorithms is presented in Section 5.5. We summarize our study of the switching-framework-based surrogate-assisted optimization along with future research directions in Section 6.

#### **2. Past Methods of Metamodeling for Multiobjective Optimization**

We consider the following original multi- or many-objective optimization problem (*P*), involving *n* real-valued variables (**x**), *J* inequality constraints (**g**) (equality constraints, if any, are assumed to be converted to two inequality constraints), and *M* objective functions (**f**):

$$\begin{array}{ll}\text{Minimize} & (f\_1(\mathbf{x}), f\_2(\mathbf{x}), \dots, f\_M(\mathbf{x})),\\ \text{Subject to} & g\_j(\mathbf{x}) \le 0, \quad j = 1, 2, \dots, J, \\ & x\_i^{(L)} \le x\_i \le x\_i^{(U)}, \quad i = 1, 2, \dots, n. \end{array} \tag{1}$$

In this study, we assume that all objective and constraint functions are *computationally expensive* to compute and that they need to be computed independently of each other for every new solution **x**. To distinguish them from the original functions, the respective metamodeled functions are represented with a tilde (such as $\widetilde{f}\_i(\mathbf{x})$ or $\widetilde{g}\_j(\mathbf{x})$). The resulting metamodeled problem is denoted here as *MP*; it is formed with the developed metamodels of individual objectives and constraints or their aggregates. *In-fill* solutions are defined as optimal solutions of problem *MP*. It is assumed here that constructing the metamodels and comparing them with each other consumes much less time than evaluating the objectives and constraints exactly; hence, if the metamodels are close to the original functions, the process can end up with a huge saving in computational time without much sacrifice in solution accuracy. Naturally, in-fill solutions (obtained from metamodels) need to be evaluated using the original objectives and constraints (termed here "high-fidelity" evaluations) and can be used to refine the metamodels for their subsequent use within the overall optimization approach.

A number of efficient metamodeling frameworks have been proposed recently for multiobjective optimization [10,22–28], including a parallel implementation concept [29]. These frameworks use different metamodeling methods to approximate objective and constraint functions, such as radial basis functions (RBFs), Kriging, Bayesian neural networks, support vector regression, and others [30]. Most of these methods propose a separate metamodel for each objective and constraint function, akin to our framework M1. Another study has used multiple spatially distributed surrogate models for multiobjective optimization [31]. Clearly, this requires a lot of metamodeling effort, and metamodeling errors from different models can accrue and make the overall optimization highly error-prone. As will become clear later, these methods fall under our M1-2 framework.

Zhang et al. [14] proposed the MOEA/D-EGO algorithm which metamodeled each objective function independently. They constructed multiple expected global optimization (EGO) functions for multiple reference lines of the MOEA/D approach to find a number of trade-off solutions in each optimization task. No constraint handling procedure was suggested. Thus, this method falls under our M1-2 framework.

Chugh et al. [23] proposed a surrogate-assisted adaptive reference vectors guided evolutionary algorithm (K-RVEA) for computationally expensive optimization problems with more than three objectives. Since all objectives and constraints are metamodeled separately, this method also falls under our M1-2 framework. While no constraint handling was proposed with the original study, a later version included constraint handling [32].

Zhao et al. [24] classified the sample data into clusters based on their similarities in the variable space. Then, a local metamodel was built for each cluster of the sample data. A global metamodel is then built from these local metamodels, considering their contributions in different regions of the variable space. Due to the construction and optimization of multiple metamodels, one for each cluster, this method belongs to our M3 framework; the use of a global metamodel combining all local cluster-wise metamodels qualifies it under the M3-2 framework. No constraint handling method is suggested.

Bhattacharjee et al. [25] used an independent metamodel for each objective and constraint, using different metamodeling methods: RBF, Kriging, first- and second-order response surface models, and multilayer perceptrons. The NSGA-II method is used to optimize the metamodeled version of the problem. Clearly, this method falls under our M1-2 category.

Wang et al. [26] used independent metamodeling of objectives but combined them using a weighted-sum approach, and proposed an ensemble-based model management strategy for a surrogate-assisted evolutionary algorithm. Thus, due to the modeling of a combined objective function, this method falls under our M3-1 framework. A global model management strategy inspired by committee-based active learning (CAL) was developed, searching for the best and most uncertain solutions according to a surrogate ensemble using a particle swarm optimization (PSO) algorithm. In addition, a local surrogate model is built around the best solution obtained so far. Then, a PSO algorithm searches the local surrogate to find its optimum and evaluates it. The evolutionary search using the global model management strategy switches to the local search once no further improvement is observed, and vice versa.

Pan et al. [33] proposed a classification based surrogate-assisted evolutionary algorithm (CSEA) for solving unconstrained optimization problems by using an artificial neural network (ANN) as a surrogate model. The surrogate model aims to learn the dominance relationship between the candidate solutions and a set of selected reference solutions. Due to a single metamodel to find the dominance structure involving all objective functions, this algorithm falls under our M3-2 framework.

Deepti et al. [34] suggested a reduced and simplified model of each objective function in order to reduce the computational efforts.

Recent studies on nonevolutionary methods for multiobjective optimization using trust-region methods [35,36] and decomposition methods [37] have been proposed as well.

A recent study [38] reviewed multiobjective metamodeling approaches and suggested a taxonomy of existing methods based on how well the surrogate-assisted values match the original function values. Three broad categories were suggested: (i) algorithms that do not use any feedback from the original function values, (ii) algorithms that use a fixed amount of feedback, and (iii) algorithms that adaptively decide which metamodeled solutions must be checked against the original function values. This extensive review reported that most existing metamodeling approaches used a specific EMO algorithm, NSGA-II [39]. While a check on the accuracy of a metamodel is important for its subsequent use, this is true for both single- and multiobjective optimization, and no issues specific to multiobjective optimization were discussed in the review paper.

Besides the algorithmic developments, a number of studies have applied metamodeling methods to practical problems with a limited budget of solution evaluations [40–47], some restricted to a few hundred [48].

Despite all these wide-ranging developments, the issues that most distinguish surrogate modeling in multiobjective optimization from its single-objective counterpart have not been addressed well: (i) how to fundamentally handle multiple objectives and constraints, either through separate modeling of each or in an aggregated fashion, and (ii) how to make use of the best of multiple different surrogate modeling approaches adaptively within an algorithm. In 2016, Rayan et al. [5] proposed a taxonomy in which 10 metamodeling frameworks were introduced to address the first question. This paper addresses the second question in a comprehensive manner, using the proposed 10 metamodeling frameworks within an ensemble method.

Ensemble methods have been used in surrogate-assisted optimization for solving expensive problems [49–53], but in most of these methods, an ensemble of different metamodeling methods, such as RBF, Kriging, and response surfaces, is considered in order to choose a single suitable method. While such studies are important, depending on their use of objectives and constraints, each such method falls into one of the first eight frameworks presented in this paper. No effort has been made to consider an ensemble of metamodeling frameworks that combine multiple objectives and constraints differently and to choose the most suitable one for optimization. In this paper, we use an ensemble of the 10 metamodeling frameworks [5,20] described in the next section and propose an adaptive scheme for selecting one in an iterative manner.

#### **3. A Taxonomy for Multiobjective Metamodeling Frameworks**

With *M* objectives and *J* constraints to be metamodeled, there exist many plausible ways to develop a metamodeling-based multiobjective optimization method. Thus, there is a need to classify the different methods into a few finite clusters so that they can be compared and contrasted with each other. Importantly, such a classification or taxonomy study can provide information about methods that are still unexplored. A recently proposed taxonomy study [20] put forward 10 different frameworks based on whether the objective and constraint functions are metamodeled individually or in an aggregated manner, as illustrated in Figure 1.

**Figure 1.** The proposed taxonomy of 10 different metamodeling frameworks for multi- and many-objective optimization. (Taken from [20]).

We believe most ideas for collectively metamodeling all objectives and constraints can be classified into one of these 10 frameworks. We describe each of the 10 frameworks below in detail for the first time.

We explain each framework using a two-variable, two-objective SRN problem [54,55] as an example:

$$\begin{array}{ll}\text{Minimize} & f\_1(\mathbf{x}) = 2 + (x\_1 - 2)^2 + (x\_2 - 1)^2, \\ \text{Minimize} & f\_2(\mathbf{x}) = 9x\_1 - (x\_2 - 1)^2, \\ \text{Subject to} & g\_1(\mathbf{x}) = x\_1^2 + x\_2^2 - 225 \le 0, \\ & g\_2(\mathbf{x}) = x\_1 - 3x\_2 + 10 \le 0, \\ & -20 \le x\_1, x\_2 \le 20. \end{array} \tag{2}$$

The Pareto-optimal (PO) solutions are known to be $x\_1^\* = -2.5$ and $x\_2^\* \in [2.5, 14.79]$. To apply a metamodeling approach, one simple idea is to metamodel all four functions. The functions and the respective PO solutions are marked on the $f\_1$ and $f\_2$ plots shown in Figure 2a,b, respectively. The feasible regions for $g\_1$ and $g\_2$ are shown in Figure 2c,d.
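A direct evaluation of the SRN problem in Equation (2) makes the running example concrete. The sketch below evaluates both objectives and both constraints at a point on the known PO set (taking the hypothetical value $x\_2 = 5.0$ from the PO range $[2.5, 14.79]$) and confirms its feasibility:

```python
def srn(x1, x2):
    """Evaluate the two objectives and two constraints of the SRN
    problem of Equation (2); g <= 0 means the constraint is satisfied."""
    f1 = 2.0 + (x1 - 2.0) ** 2 + (x2 - 1.0) ** 2
    f2 = 9.0 * x1 - (x2 - 1.0) ** 2
    g1 = x1 ** 2 + x2 ** 2 - 225.0      # circle constraint
    g2 = x1 - 3.0 * x2 + 10.0           # linear constraint
    return (f1, f2), (g1, g2)

# A known PO point: x1* = -2.5, with x2* chosen inside [2.5, 14.79].
(f1, f2), (g1, g2) = srn(-2.5, 5.0)
```

At this point both constraints are strictly satisfied ($g\_1 = -193.75$, $g\_2 = -7.5$), so the PO set indeed lies inside the combined feasible region shown in Figure 2.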

**Figure 2.** Two objectives and two constraints are shown for SRN problem. The combined feasible region is shown in the contour plot. PO solutions lie on the black line marked inside the feasible region.

#### *3.1. M1-1 and M1-2 Frameworks*

Most existing multiobjective metamodeling approaches are found to fall in these two frameworks [20]. In M1-1 and M1-2, a total of (*M* + *J*) metamodels (*M* objectives and *J* constraints) are constructed. The metamodeling algorithm for M1-1 and M1-2 starts with an archive of an initial population ($A\_0$ of size $N\_0$) created using the Latin hypercube sampling (LHS) method on the entire search space, or by using any other heuristics of the problem. Each objective function ($f\_i(\mathbf{x})$, for $i = 1, \dots, M$) is first normalized to obtain a normalized function $\underline{f}\_i(\mathbf{x})$ using the minimum ($f\_i^{\min}$) and maximum ($f\_i^{\max}$) values over all high-fidelity evaluations of archive members, so that the minimum and maximum values of $\underline{f}\_i(\mathbf{x})$ are zero and one, respectively:

$$\underline{f}\_i(\mathbf{x}) = \frac{f\_i(\mathbf{x}) - f\_i^{\min}}{f\_i^{\max} - f\_i^{\min}}. \tag{3}$$
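Applied to an archive of high-fidelity evaluations, the normalization of Equation (3) is a column-wise min-max rescaling. A minimal sketch, assuming the archive objective values are stored as a matrix with one row per solution and one column per objective:

```python
import numpy as np

def normalize_objectives(F):
    """Min-max normalize each objective column of the archive matrix F
    (rows: solutions, columns: objectives), per Equation (3)."""
    fmin, fmax = F.min(axis=0), F.max(axis=0)
    return (F - fmin) / (fmax - fmin)

# Hypothetical archive of three solutions with two objectives.
F = np.array([[10.0, 5.0],
              [20.0, 1.0],
              [15.0, 3.0]])
Fn = normalize_objectives(F)
```

Each column of `Fn` now spans exactly [0, 1], so objectives with very different scales contribute comparably to the aggregated functions used later.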

Then, metamodels $\widetilde{f}\_i(\mathbf{x})$ are constructed for each of the *M* normalized objective functions $\underline{f}\_i(\mathbf{x})$, $i \in \{1, 2, \dots, M\}$, independently using a chosen metamodeling method. For all implementations here, we use the Kriging metamodeling method [56] for all frameworks of this study.

Each constraint function ($g\_j(\mathbf{x})$, for $j = 1, \dots, J$) is first normalized to obtain a normalized constraint function $\underline{g}\_j(\mathbf{x})$ using standard methods [57], and then metamodeled separately to obtain an approximate function $\widetilde{g}\_j(\mathbf{x})$ using the same metamodeling method (the Kriging method is adopted here) used for the objective functions.

In M1-1, all metamodeled normalized objectives are combined into a single *aggregated* function and optimized with all separately metamodeled constraints to find a *single* in-fill point using a single-objective evolutionary optimization algorithm (a real-coded genetic algorithm (RGA) [54] is used here). In *τ* generations of the RGA (defining an epoch), the following achievement scalarization aggregation function $\text{ASF}\_{12}(\mathbf{x}, \mathbf{z})$ [58] is optimized for every **z** vector:

$$\text{Problem O1-1:} \quad \begin{array}{ll}\text{Minimize} & \text{ASF}\_{12}(\mathbf{x}, \mathbf{z}) = \max\_{j=1}^{M} \left( \underline{\widetilde{f}}\_j(\mathbf{x}) - z\_j \right), \\ \text{Subject to} & \underline{\widetilde{g}}\_j(\mathbf{x}) \le 0, \quad j = 1, 2, \dots, J, \\ & x\_i^{(L)} \le x\_i \le x\_i^{(U)}, \quad i = 1, 2, \dots, n, \end{array} \tag{4}$$

with optimal solution $\mathbf{x}^\*(\mathbf{z})$,

where the vector **z** is one of Das and Dennis's [59] points on the unit simplex in the *M*-dimensional hyperspace (so that $\sum\_{j=1}^{M} z\_j = 1$). Thus, for each of *H* different **z** vectors, one optimization problem (O1-1) is formed with an equi-angled weight vector and solved one at a time to find a total of *H* in-fill solutions using a real-parameter genetic algorithm (RGA). Figure 3a shows the in-fill solution for **z** = (0.5, 0.5) for the SRN problem. Notice that the $\text{ASF}\_{12}$ function attains its minimum at a point on the Pareto-optimal (PO) line (the black line on the contour plot) for the specific **z**-vector. If the exact $\text{ASF}\_{12}$ function could be constructed as a metamodeled function from a few high-fidelity evaluations, one epoch would be enough to find a representative PO set. However, since the metamodeled function is expected to differ from the original function, several epochs will be necessary to get close to the true PO set. For a different **z**-vector, the $\text{ASF}\_{12}$ function will have a different optimal solution, but it will also fall on the PO line. The $\text{ASF}\_{12}$ model, constructed from the metamodeled objective and constraint functions, will thus produce optimal solutions on the Pareto set for different **z**-vectors, and multiple applications of an RGA will discover a well-distributed set of multiple in-fill points one at a time.
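The two ingredients of this step can be sketched directly: generating Das–Dennis reference points on the unit simplex (via a standard stars-and-bars enumeration, shown here for a small hypothetical setting of *M* = 2 objectives and *p* = 4 divisions) and evaluating the scalarized value of Equation (4) for given normalized objective values, constraints aside:

```python
import numpy as np
from itertools import combinations

def das_dennis(M, p):
    """Generate Das-Dennis reference points on the M-dimensional unit
    simplex with p divisions; every vector sums to 1."""
    pts = []
    for c in combinations(range(p + M - 1), M - 1):
        gaps = np.diff([-1, *c, p + M - 1]) - 1   # stars between bars
        pts.append(gaps / p)
    return np.array(pts)

def asf(f_norm, z):
    """Achievement scalarization value max_j (f_j - z_j) of Equation (4)
    for normalized objective values f_norm and reference vector z."""
    return np.max(f_norm - z)

Z = das_dennis(M=2, p=4)       # 5 reference vectors for M = 2
value = asf(np.array([0.4, 0.6]), np.array([0.5, 0.5]))
```

Minimizing `asf` over **x** for each row of `Z` (here done by the RGA over the metamodeled functions) yields one in-fill solution per reference vector, which is how the *H* in-fill points of an epoch are obtained.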

The RGA procedure uses a *trust-region* concept, which we describe in detail in Section 4.3. The best solution for each **z** is sent for a high-fidelity evaluation. The solution is then included in the archive ($A\_1$) of all high-fidelity solutions. After all *H* solutions are included in the archive, one epoch of the M1-1 framework optimization problem is considered complete. In the next epoch, all high-fidelity solutions are used to normalize and metamodel all (*M* + *J*) objective and constraint functions, and the above process is repeated to obtain $A\_2$. The process is continued until the prespecified maximum number of solution evaluations ($\text{SE}\_{\max}$) is consumed. The nondominated solutions of the final archive $A\_t$ are declared as the outcome of the whole multiobjective surrogate-assisted approach.
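Extracting the nondominated solutions of the final archive is a standard dominance filter. A minimal (quadratic-time) sketch for minimization, applied to a small hypothetical archive of objective vectors:

```python
import numpy as np

def nondominated(F):
    """Boolean mask of the nondominated rows of F (objective values for
    minimization): a point is discarded if some other point is at least
    as good in every objective and strictly better in at least one."""
    n = len(F)
    keep = np.ones(n, dtype=bool)
    for i in range(n):
        for j in range(n):
            if i != j and np.all(F[j] <= F[i]) and np.any(F[j] < F[i]):
                keep[i] = False
                break
    return keep

# Hypothetical final archive: the third point is dominated by the second.
F = np.array([[1.0, 4.0], [2.0, 3.0], [3.0, 3.5], [4.0, 1.0]])
mask = nondominated(F)
```

The rows of `F[mask]` form the reported trade-off front; faster $O(n \log n)$ sorting procedures exist, but this direct filter matches the definition used here.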

In M1-2, the following *M*-objective optimization problem,

$$\begin{array}{ll}\text{Problem O1-2:} & \begin{cases} \text{Minimize} & \left(\widetilde{f}\_{1}(\mathbf{x}), \widetilde{f}\_{2}(\mathbf{x}), \ldots, \widetilde{f}\_{M}(\mathbf{x})\right), \\ \text{Subject to} & \widetilde{g}\_{j}(\mathbf{x}) \le 0, \quad j = 1, 2, \ldots, J, \\ & x\_{i}^{(L)} \le x\_{i} \le x\_{i}^{(U)}, \quad i = 1, 2, \ldots, n, \end{cases} \end{array} \tag{5}$$

constructing (*M* + *J*) metamodels in each epoch, is solved to find *H* in-fill solutions in a *single run* with an EMO/EMaO procedure. We use the NSGA-II procedure [39] for two-objective problems and NSGA-III [60] for problems with three or more objectives. All *H* solutions are then evaluated using high-fidelity models and included in the archive for another round of metamodel construction and optimization in the next epoch. The process is continued until SEmax evaluations are done. Figure 3b shows that when NSGA-II optimizes a metamodel that approximates the original problem well, the obtained solutions lie on the true PO front.

(**a**) The ASF solution with **z** = (0.5, 0.5) in M1-1, M2-1, M3-1 and M4-1. The solution lies on the true PO set (black line).

(**b**) Efficient solutions in M1-2 and M2-2. For illustration, a few PO solutions are shown in green circles.

(**c**) minASF34 in M3-2 and M4-2. For illustration, five **z**-vectors and their respective PO solutions, found simultaneously, are shown in red circles.

**Figure 3.** In-fill solutions for different frameworks for the SRN problem. True functions are plotted here; however, different metamodeling frameworks use different approximations to find in-fill solutions on the true PO set.

#### *3.2. Frameworks M2-1 and M2-2*

For M2-1 and M2-2, a single aggregated constraint violation function ACV(**x**) is first constructed using the normalized constraint functions ($\bar{g}\_j(\mathbf{x})$, *j* = 1, ..., *J*) at the high-fidelity solutions from the archive (**x** ∈ A*t*), as follows:

$$\text{ACV}(\mathbf{x}) = \begin{cases} \sum\_{j=1}^{J} \bar{g}\_{j}(\mathbf{x}), & \text{if } \mathbf{x} \text{ is feasible,}\\ \sum\_{j=1}^{J} \langle \bar{g}\_{j}(\mathbf{x}) \rangle, & \text{otherwise,} \end{cases} \tag{6}$$

where the bracket operator $\langle \alpha \rangle$ returns *α* if *α* > 0, and zero otherwise. It is clear from the above equation that, for high-fidelity solutions, ACV(**x**) takes a non-positive value for feasible solutions and a positive value for infeasible solutions. In M2-1 and M2-2, the constraint violation function ACV(**x**) is then metamodeled to obtain $\widetilde{\text{ACV}}(\mathbf{x})$, instead of metamodeling every constraint function ($\bar{g}\_j(\mathbf{x})$) as in M1-1 and M1-2. This requires a total of (*M* + 1) metamodel constructions (*M* objectives and one constraint violation function) at each epoch. In M2-1, the following problem
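Equation (6) translates directly into code. The sketch below (function names are ours) assumes the constraint values are already normalized:

```python
def bracket(alpha):
    """The bracket operator <alpha>: alpha if alpha > 0, else zero."""
    return alpha if alpha > 0 else 0.0

def acv(g_vals):
    """Aggregated constraint violation, Eq. (6): g_vals holds the
    normalized constraint values; x is feasible when all are <= 0."""
    if all(g <= 0 for g in g_vals):
        return sum(g_vals)                  # non-positive for feasible x
    return sum(bracket(g) for g in g_vals)  # positive for infeasible x
```

For a feasible point with g = (-0.2, -0.3), ACV is -0.5; for g = (0.4, -0.3), only the violated constraint contributes, giving 0.4. A single metamodel of this scalar then replaces the *J* constraint metamodels of M1-1 and M1-2.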

$$\begin{array}{ll}\text{Problem O2-1:} & \begin{cases} \text{Minimize} & \text{ASF}\_{12}(\mathbf{x}, \mathbf{z}) = \max\_{j=1}^{M} \left( \widetilde{f}\_{j}(\mathbf{x}) - z\_{j} \right), \\ \text{Subject to} & \widetilde{\text{ACV}}(\mathbf{x}) \le 0, \\ & x\_{i}^{(L)} \le x\_{i} \le x\_{i}^{(U)}, \quad i = 1, 2, \ldots, n. \end{cases} \end{array} \tag{7}$$

is solved to find one in-fill point for each reference line originating from one of the chosen Das-Dennis reference points **z**. Similarly, M2-2 solves the following problem:

$$\begin{array}{l}\text{Problem O2-2:} \\ \text{Solutions: } \mathbf{x}^{i\*},\ i = 1, \ldots, H \end{array} \begin{cases} \text{Minimize} & \left(\widetilde{f}\_{1}(\mathbf{x}), \widetilde{f}\_{2}(\mathbf{x}), \ldots, \widetilde{f}\_{M}(\mathbf{x})\right), \\ \text{Subject to} & \widetilde{\text{ACV}}(\mathbf{x}) \le 0, \\ & x\_{i}^{(L)} \le x\_{i} \le x\_{i}^{(U)}, \quad i = 1, 2, \ldots, n, \end{cases} \tag{8}$$

to find *H* in-fill solutions simultaneously. The rest of the M2-1 and M2-2 procedures are identical to those of M1-1 and M1-2, respectively. The RGA is used to solve each optimization problem in M2-1 to find one solution at a time, and NSGA-II or NSGA-III is used in M2-2, depending on the number of objectives in the problem. Thus, M2-1 requires an archive to store each solution, whereas M2-2 does not require an archive.

#### *3.3. M3-1 and M3-2 Frameworks*

In these two methods, instead of metamodeling each normalized objective function $\bar{f}\_i(\mathbf{x})$ for *i* = 1, ..., *M* independently, we first aggregate them to form the following ASF34 function for each high-fidelity solution **x**: 

$$\text{ASF}\_{34}(\mathbf{x}, \mathbf{z}) = \max\_{j=1}^{M} \left( \bar{f}\_j(\mathbf{x}) - z\_j \right), \tag{9}$$

where **z** is defined as before. Note that this formulation differs from ASF12 in that the ASF is formed with the original normalized objective functions $\bar{f}\_j$ here. Then, one ASF34 function (for a specific **z**-vector) is metamodeled to obtain $\widetilde{\text{ASF}}\_{34}(\mathbf{x}, \mathbf{z})$, along with *J* separate metamodels for the *J* constraints ($\widetilde{g}\_j$), to solve the following problem for M3-1:

$$\begin{array}{ll}\text{Problem O3-1:} & \begin{cases} \text{Minimize} & \widetilde{\text{ASF}}\_{34}(\mathbf{x}, \mathbf{z}), \\ \text{Subject to} & \widetilde{g}\_{j}(\mathbf{x}) \le 0, \quad j = 1, 2, \ldots, J, \\ & x\_{i}^{(L)} \le x\_{i} \le x\_{i}^{(U)}, \quad i = 1, 2, \ldots, n. \end{cases} \end{array} \tag{10}$$

For every **z**, a new in-fill point is found by solving the above problem using the same RGA discussed for M1-1. Every in-fill point is stored in an archive to allow a comparison with the M∗-2 methods, which create multiple solutions in one run and therefore do not require an explicit archive. In M3-2, the following problem is solved:

$$\begin{array}{l}\text{Problem O3-2:} \\ \text{Solutions: } \mathbf{x}^{i\*},\ i = 1, \ldots, H \end{array} \begin{cases} \text{Minimize} & \text{minASF}\_{34}(\mathbf{x}) = \min\_{\mathbf{z}} \widetilde{\text{ASF}}\_{34}(\mathbf{x}, \mathbf{z}), \\ \text{Subject to} & \widetilde{g}\_{j}(\mathbf{x}) \le 0, \quad j = 1, 2, \ldots, J, \end{cases} \tag{11}$$

in which the objective function at **x** is computed as the minimum $\widetilde{\text{ASF}}\_{34}$ over all **z**-vectors at **x**. Figure 3c shows the multimodal objective function minASF34(**x**) for the SRN problem, clearly indicating multiple local optima on the PO front. Notice how the minASF34 function has ridges and creates multiple optima on the PO set, one for each reference line. Due to the complexity of this function, it is clear that a large number of high-fidelity points will be necessary to build a suitable metamodel with high accuracy. Besides the need for more points, there is another issue that deserves discussion. Both M3-1 and M3-2 require *H* $\widetilde{\text{ASF}}\_{34}(\mathbf{x}, \mathbf{z})$ metamodels and *J* constraint metamodels, making a total of (*H* + *J*) metamodels in each epoch. Since each of the multiple optima of the minASF34 function will finally lead us to a set of PO solutions, we need an efficient multimodal optimization algorithm, instead of an RGA, to solve the metamodeled minASF34 function.
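In code, the minASF34 scalarization of Equations (9) and (11) is a min-over-**z** of a max-over-objectives. A minimal sketch on normalized objective values (names are ours, and the true values stand in for the metamodel for illustration):

```python
def asf34(f_vals, z):
    """ASF of Eq. (9) on normalized objectives, for one z-vector."""
    return max(f - zj for f, zj in zip(f_vals, z))

def min_asf34(f_vals, z_vecs):
    """Multimodal M3-2 objective: smallest ASF over all z-vectors."""
    return min(asf34(f_vals, z) for z in z_vecs)

z_vecs = [(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)]
min_asf34((0.4, 0.6), z_vecs)  # the minimum is attained for z = (0.5, 0.5)
```

Each **z**-vector "owns" one local optimum of this function, which is why a multimodal optimizer is needed to recover all *H* in-fill points in a single run.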

We use a *multimodal* single-objective evolutionary algorithm to find *H* multimodal in-fill points of minASF34 simultaneously. We propose a multimodal RGA (or MM-RGA) which starts with a random population of size *N* for this purpose. In each generation, the

population (*Pt*) is modified to a new population (*Pt*+1) by using selection, recombination, and mutation operators. The selection operator emphasizes multiple diverse solutions as follows. First, a *fitness* is assigned to each population member **x** by computing $\widetilde{\text{ASF}}\_{34}(\mathbf{x}, \mathbf{z})$ for all *H* **z**-vectors and then assigning the smallest value as the fitness. Then, we apply binary tournament selection to choose a parent using the following selection function:

$$\text{SF}(\mathbf{x}) = \begin{cases} \text{minASF}\_{34}(\mathbf{x}), & \text{if } \mathbf{x} \text{ is feasible,} \\ \text{minASF}\_{34}^{\max} + \sum\_{j=1}^{J} \langle \widetilde{g}\_{j}(\mathbf{x}) \rangle, & \text{otherwise,} \end{cases} \tag{12}$$

where $\text{minASF}\_{34}^{\max}$ is the maximum minASF34(**x**) value among all feasible population members of the MM-RGA. The above selection function has the following effects. If two solutions are feasible, minASF34(**x**) is used to select the winner. If one is feasible and the other is infeasible, the former is chosen, and for two infeasible members, the one with the smaller constraint violation $\sum\_{j=1}^{J} \langle \widetilde{g}\_{j}(\mathbf{x}) \rangle$ is chosen. After *N* offspring population members are thus created, we merge the parent and offspring populations to form a combined population of 2*N* members. The best solution for each **z**-vector is then copied to *Pt*+1. In the event of a duplicate, the second-best solution for that **z**-vector is chosen. If *H* is smaller than *N*, the process is repeated to select a second population member for as many **z**-vectors as possible. Thus, at the end of the MM-RGA procedure, exactly *H* in-fill solutions are obtained.
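The best-per-**z** survival step of the MM-RGA can be sketched as follows (a simplified, unconstrained version with hypothetical names; on a duplicate, the next-best member for that **z**-vector is taken, as described above):

```python
def select_per_z(pop_f, z_vecs):
    """From a combined population (list of normalized objective tuples),
    return the index of the best member for each z-vector, skipping
    already-chosen duplicates."""
    def asf(f_vals, z):
        return max(f - zj for f, zj in zip(f_vals, z))
    chosen = []
    for z in z_vecs:
        for i in sorted(range(len(pop_f)), key=lambda i: asf(pop_f[i], z)):
            if i not in chosen:   # duplicate -> fall back to second best
                chosen.append(i)
                break
    return chosen

pop_f = [(0.1, 0.9), (0.9, 0.1), (0.5, 0.5), (0.2, 0.8)]
select_per_z(pop_f, [(0.0, 1.0), (1.0, 0.0)])  # one survivor per z-vector
```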

#### *3.4. Frameworks M4-1 and M4-2*

In these two frameworks, the constraints are first combined into a single constraint violation function ACV(**x**) as in M2-1 (Equation (6)), and ACV is then metamodeled to obtain $\widetilde{\text{ACV}}(\mathbf{x})$. The following problem is then solved:

$$\begin{array}{ll}\text{Problem O4-1:} & \begin{cases} \text{Minimize} & \widetilde{\text{ASF}}\_{34}(\mathbf{x}, \mathbf{z}), \\ \text{Subject to} & \widetilde{\text{ACV}}(\mathbf{x}) \le 0, \\ & x\_{i}^{(L)} \le x\_{i} \le x\_{i}^{(U)}, \quad i = 1, 2, \ldots, n, \end{cases} \end{array} \tag{13}$$

to find a single in-fill solution for every **z**. An archive is built with the in-fill solutions. In M4-2, the following problem is solved to find *H* in-fill solutions simultaneously:

$$\begin{array}{l}\text{Problem O4-2:} \\ \text{Solutions: } \mathbf{x}^{i\*},\ i = 1, \ldots, H \end{array} \begin{cases} \text{Minimize} & \text{minASF}\_{34}(\mathbf{x}) = \min\_{\mathbf{z}} \widetilde{\text{ASF}}\_{34}(\mathbf{x}, \mathbf{z}), \\ \text{Subject to} & \widetilde{\text{ACV}}(\mathbf{x}) \le 0, \end{cases} \tag{14}$$

Both these frameworks require *H* $\widetilde{\text{ASF}}\_{34}(\mathbf{x}, \mathbf{z})$ metamodels and one ACV metamodel, thereby making a total of (*H* + 1) metamodels in each epoch. The same MM-RGA is used here, but the SF function is modified by replacing the $\sum\_{j=1}^{J} \langle \widetilde{g}\_{j}(\mathbf{x}) \rangle$ term with $\widetilde{\text{ACV}}(\mathbf{x})$ in Equation (12). A similar outcome as in Figure 3c occurs here, but the constraints are now handled using the single metamodeled $\widetilde{\text{ACV}}(\mathbf{x})$ function. M4-2 does not require an archive to be maintained, as the *H* solutions are found in one MM-RGA application.

#### *3.5. M5 Framework*

The focus of M5 is to use a generative multiobjective optimization approach in which a single PO solution is found at a time for a **z**-vector by using a combined *selection* function involving all objective and constraint functions together. The following *selection* function is first created:

$$S\_{5}(\mathbf{x}, \mathbf{z}) = \begin{cases} \text{ASF}\_{34}(\mathbf{x}, \mathbf{z}), & \text{if } \mathbf{x} \text{ is feasible,} \\ \text{ASF}\_{34}^{\max}(\mathbf{x}, \mathbf{z}) + \langle \text{ACV}(\mathbf{x}) \rangle, & \text{otherwise.} \end{cases} \tag{15}$$

Here, the parameter $\text{ASF}\_{34}^{\max}(\mathbf{x}, \mathbf{z})$ is the worst ASF34 function value (described in Equation (9)) among all feasible solutions from the archive. The selection function $S\_5(\mathbf{x}, \mathbf{z})$ is then metamodeled to obtain $\widetilde{S}\_{5}(\mathbf{x}, \mathbf{z})$, which is then optimized by the RGA (described for M1-1) to find one in-fill solution for each **z**-vector. The unconstrained optimization problem with only variable bounds is given below:

$$\begin{array}{l}\text{Problem O5:} \\ \text{Solution: } \mathbf{x}^{\*}(\mathbf{z}) \end{array} \begin{cases} \text{Minimize} & \widetilde{S}\_{5}(\mathbf{x}, \mathbf{z}), \\ \text{Subject to} & x\_{i}^{(L)} \le x\_{i} \le x\_{i}^{(U)}, \quad i = 1, 2, \ldots, n. \end{cases} \tag{16}$$

Thus, *H* metamodels of S5(**x**, **z**) need to be constructed for M5 in each epoch. Figure 4a shows the *S*5 function with **z** = (0.1, 0.9) for the SRN problem. Although details are not apparent in this figure, Figure 4b, plotted near the optimum, shows the optimum more clearly. The entire surface plot is not shown for clarity, but it is interesting to see how a single function differentiates the infeasible region from the feasible region and also makes the optimum of the function one of the PO solutions.

(**a**) Optimal solution with **z** = (0.1, 0.9) in M5.

(**b**) The function *S*<sup>5</sup> surface is blown up near the optimum.

**Figure 4.** In-fill solution for a specific **z**-vector to be obtained by framework M5 for SRN problem.

Clearly, the complexity of the resulting S5(**x**, **z**) function will demand a large number of archive points for an accurate identification of the PO solution, or a large number of epochs to arrive at it. However, the concept of metamodeling a selection function, which is not one of the original objective or constraint functions, to find an in-fill solution of the problem is intriguing and opens up a new avenue for surrogate-assisted multiobjective optimization studies.

#### *3.6. Framework M6*

Finally, the M6 framework takes the concept of M5 a bit further and constructs a single metamodel in each epoch by combining all *M* objectives and *J* constraints together. A multimodal function, each optimum of which corresponds to a distinct PO solution, is formed for this purpose: 

$$\text{ASF}\_6(\mathbf{x}) = \min\_{\mathbf{z} \in Z} \max\_{i=1}^{M} \left( \bar{f}\_i(\mathbf{x}) - z\_i \right). \tag{17}$$

Then, the following selection function is constructed:

$$S\_6(\mathbf{x}) = \begin{cases} \text{ASF}\_6(\mathbf{x}), & \text{if } \mathbf{x} \text{ is feasible,} \\ \text{ASF}\_{6,\max} + \text{CV}(\mathbf{x}), & \text{otherwise,} \end{cases} \tag{18}$$

where ASF6,max is the maximum ASF6 value among all feasible archive members. For each archive member **x**, S6(**x**) is first computed. CV(**x**) is the same as ACV(**x**), except that for a feasible **x**, CV is set to zero. Then, the following multimodal unconstrained problem (with variable bounds) is constructed to find *H* in-fill solutions simultaneously:

$$\begin{array}{l}\text{Problem O6:} \\ \text{Solutions: } \mathbf{x}^{i\*},\ i = 1, \ldots, H \end{array} \begin{cases} \text{Minimize} & \widetilde{S}\_{6}(\mathbf{x}), \\ \text{Subject to} & x\_{i}^{(L)} \le x\_{i} \le x\_{i}^{(U)}, \quad i = 1, 2, \ldots, n. \end{cases} \tag{19}$$

A single metamodel needs to be constructed in each epoch in the M6 framework. Due to the complexity of the S6 function, we employ a neural network $\widetilde{S}\_6(\mathbf{x})$ to metamodel this selection function. A niched RGA [7], similar to that described in Section 3.4, is used here to find *H* in-fill solutions corresponding to the local optima of the metamodeled $\widetilde{S}\_6(\mathbf{x})$ function. No explicit archive needs to be maintained to store the *H* solutions. Figure 5a shows the *S*6 function for the SRN problem on the entire search space. The detail inside the feasible region and near the optimal solutions, shown in Figure 5b, makes it clear that this function creates six optima on the PO front, corresponding to six **z**-vectors. Although the function is multimodal, the detailed structure from Figure 5a to Figure 5b can be modeled gradually over the iterations of a carefully designed optimization algorithm.
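A sketch of how the S6 values of Equations (17) and (18) would be computed over an archive before fitting the single metamodel (names are ours; each archive entry pairs normalized objectives with an ACV value, and CV(**x**) equals ACV(**x**) when the point is infeasible, i.e., when ACV is positive):

```python
def asf6(f_vals, z_vecs):
    """Eq. (17): min over z-vectors of max over normalized objectives."""
    return min(max(f - zj for f, zj in zip(f_vals, z)) for z in z_vecs)

def s6_values(archive, z_vecs):
    """Eq. (18) for every archive member, given as (f_vals, acv) pairs."""
    feas = [f for f, acv in archive if acv <= 0]
    asf6_max = max(asf6(f, z_vecs) for f in feas)  # worst feasible ASF6
    return [asf6(f, z_vecs) if acv <= 0 else asf6_max + acv
            for f, acv in archive]

# One feasible and one infeasible archive member, two z-vectors.
vals = s6_values([((0.2, 0.8), -0.1), ((0.6, 0.6), 0.5)],
                 [(0.0, 1.0), (1.0, 0.0)])
```

The scalar `vals` would be the training targets for the neural-network metamodel mentioned above.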

(**a**) Multimodal optimal solutions in M6. For illustration, six **z**-vectors are used in Equation (15). (**b**) The function *S*<sup>6</sup> surface is blown up near the optimal region showing six optima.

**Figure 5.** The function *S*<sup>6</sup> surface is blown up near the optimal region showing six optima.

#### *3.7. Summary of 10 Frameworks*

A summary of the metamodeled functions and the optimization algorithms used to optimize them for all 10 frameworks is provided in Table 1. The relative computational cost of each framework can be derived from this table. M3-1 and M3-2 require constructing the maximum number of metamodels among all the frameworks (assuming the number of desired PO solutions *H* > *M*), while M6 requires the least, involving only one metamodel in each epoch.

The evolutionary algorithm used to solve each optimization problem is also provided in the table.


**Table 1.** Summary of metamodeled functions and optimization algorithms needed in each epoch for all 10 frameworks.

#### **4. Adaptive Switching Based Metamodeling (ASM) Frameworks**

Each metamodeling framework in our proposed taxonomy requires building metamodels for either each objective and constraint or their aggregations. Thus, each framework is expected to be most suitable for certain function landscapes that produce a smaller approximation error, while performing poorly on other landscapes. During an optimization run, an algorithm usually faces different kinds of landscape complexities from start to finish, so no single framework can be expected to perform best at every step of the optimization process. When each framework was applied to different multiobjective optimization problems from start to finish in other studies [6,20], different problems were found to be solved best by different frameworks. To determine the best-performing framework for a problem, a naive approach would be to apply each of the 10 frameworks to solve the problem independently using SEmax high-fidelity evaluations, and then determine the framework that performs best using an EMO metric such as hypervolume [61] or inverse generational distance (IGD) [62]. This would be computationally expensive, requiring 10 times the prescribed SEmax. If each framework is instead allocated only one-tenth of SEmax, that budget may be insufficient to find comparatively good solutions. A better approach is to use an adaptive switching strategy that chooses the most suitable framework at each epoch.

As mentioned in the previous section, in each epoch, exactly *H* new in-fill solutions are created irrespective of the metamodeling framework used, thereby consuming *H* high-fidelity SEs. Clearly, the maximum number of epochs allowable is $E\_{\max} = (SE\_{\max} - N\_0)/H$, with a minor adjustment for the SEs used in the final epoch. At the beginning of each epoch (say, the *t*-th epoch), we have an archive (A*t*) of *Nt* high-fidelity solutions. For the first epoch, these are the *N*0 Latin hypercube sampled (LHS) solutions, and in each subsequent epoch, *H* new in-fill solutions are added to the archive. At the start of the *t*-th epoch, each of the 10 frameworks is used to construct its respective metamodels using all *Nt* archive members. Then, a 10-fold cross-validation method (described in Section 4.2) is used with a suitable performance metric (described in Section 4.1) to determine the most suitable framework for the next epoch. Thereafter, the best-performing framework is used to find a new set of *H* in-fill solutions. These are evaluated using high-fidelity evaluations, and all 10 frameworks are statistically compared to choose a new best-performing framework for the next epoch. This process is continued until SEmax evaluations are made. A pseudocode of the proposed ASM approach is provided in Algorithm 1.
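For concreteness, the epoch budget above works out as follows (illustrative numbers, not taken from the paper's experimental settings):

```python
SE_max, N0, H = 500, 100, 60   # total budget, initial LHS samples, in-fills per epoch
E_max = (SE_max - N0) // H     # full epochs after the initial sampling
last_epoch = (SE_max - N0) % H # leftover evaluations for a trimmed final epoch
```

Here six full epochs of 60 in-fill evaluations fit into the remaining budget of 400, with 40 evaluations left for the shortened final epoch.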

#### **Algorithm 1:** Adaptive Switching Framework

```
Input  : Objectives [f1, ..., fM]^T, constraints [g1, ..., gJ]^T, and frameworks
         Mi with parameters Γi for i ∈ {1, ..., S}, where S is the number of
         frameworks. The number of initial samples, allowed high-fidelity
         solution evaluations, solutions per epoch, and cross-validation
         partitions are N0, SEmax, u, and K, respectively.
Output : PT
 1  t, Pt, Ft, Gt, e ← 0, ∅, ∅, ∅, 0;
 2  Pnew ← LHS(N0)                            // initial sampling
 3  while True do
 4      Fnew = {fi(Pnew), ∀i ∈ {1, ..., M}}   // high-fidelity objective eval.
 5      Gnew = {gj(Pnew), ∀j ∈ {1, ..., J}}   // high-fidelity constraint eval.
 6      Pt+1, Ft+1, Gt+1 ← (Pt ∪ Pnew), (Ft ∪ Fnew), (Gt ∪ Gnew)  // merge pop
 7      e ← e + |Pnew|                        // high-fidelity evaluations so far
 8      break if e ≥ SEmax                    // termination
 9      Calculate {ASF(.), ACV(.), S5, S6}, etc., from Pt+1, Ft+1, and Gt+1 as
        per the requirements of Mi, ∀i;
10      Create a random K-partition (training and test sets) Q^k_{t+1} of Pt+1,
        ∀k ∈ {1, ..., K};
11      for k = 1 to K do
12          for i = 1 to S do
13              mi ← Build the corresponding metamodels for framework Mi using
                the training set of Q^k_{t+1};
14              SEP(k, i) ← Calculate the selection-error probability of mi on
                the test set of Q^k_{t+1};
15      MB ← Identify the best frameworks from SEP;
16      Mb ← Randomly choose a framework from MB;
17      Pnew ← Optimize framework Mb(mb, Γb);
18      if |Pt+1| + |Pnew| > SEmax then
19          Pnew ← Randomly pick SEmax − |Pt+1| solutions from Pnew;
20      t ← t + 1;                            // end of epoch
21  return PT ← filter the best solutions from Pt+1
```
#### *4.1. Performance Metric for Framework Selection*

To compare the performance of multiple surrogate models, the mean squared error (MSE) has been widely used in the literature [30]. For optimization algorithms, regression methods that use MSE are known to be susceptible to outliers. With multiple objectives, different objectives and constraints may also have different scaling. Our pilot study shows that even with normalization of the objectives and constraints, the MSE metric does not always evaluate the metamodels correctly. Here, we introduce a *selection error probability* (SEP) metric which is more appropriate for an optimization task than the MSE metric or other measures such as the Kendall rank correlation coefficient [63]. The usual metrics may be better for a regression task, but for an optimization task, the proposed SEP makes a more direct evaluation of pairwise comparisons of solutions.

SEP is defined as the probability of making an error in predicting the better of two solutions compared against each other using the constructed metamodels. Consider Figure 6, which illustrates a minimization task and pairwise comparisons of three different population members. The true function values are shown in solid blue, while the predicted function values are shown in dashed blue. When points *x*1 and *x*2 are compared based on the predicted function, the prediction is correct, since $f(x\_1) < f(x\_2)$ and also $\widetilde{f}(x\_1) < \widetilde{f}(x\_2)$. However, when points *x*1 and *x*3 are compared against each other, the prediction is wrong. Out of the three pairwise comparisons, two predictions are correct and one is wrong, making a selection error probability of 1/3 for this case. We argue that in an optimization procedure, it is SEP that provides a better measure of selection error than the actual function values, as the relative function values are more important than the exact function values.

**Figure 6.** Selection Error Probability (SEP) concept is illustrated.

Mathematically, the SEP metric can be defined for *n* points as follows. For each of the $N = \binom{n}{2}$ pairs of points (*p* and *q*), we evaluate the selection error function *E*(*p*, *q*), which is one if there is a mismatch between the predicted winner and the actual winner of *p* and *q*, and zero otherwise. Then, SEP is calculated as follows:

$$\text{SEP} = \frac{1}{N} \sum\_{p=1}^{n-1} \sum\_{q=p+1}^{n} E(p, q). \tag{20}$$
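A direct implementation of Equation (20) for a single-objective minimization task is sketched below (names are ours; for multiple objectives, the `<` comparisons would be replaced by the domination checks discussed next):

```python
from itertools import combinations

def sep(true_f, pred_f):
    """Selection-error probability: the fraction of the C(n, 2) pairs
    whose predicted winner (smaller value) disagrees with the true winner."""
    pairs = list(combinations(range(len(true_f)), 2))
    errors = sum((true_f[p] < true_f[q]) != (pred_f[p] < pred_f[q])
                 for p, q in pairs)
    return errors / len(pairs)

# Three points with one mispredicted pair out of three -> SEP = 1/3.
sep([1.0, 3.0, 2.0], [1.5, 2.5, 1.0])
```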

The definition of a "winner" can be easily extended to multiobjective and constrained multiobjective optimization by considering the domination [64] and constraint-domination [54] status of the two points *p* and *q*.

#### *4.2. Selecting a Framework for an Epoch*

Frameworks having the least SEP value are considered the best for performing the next epoch. We perform 10-fold cross-validation to identify the best frameworks. After each epoch, *H* new in-fill points are evaluated using high-fidelity evaluations and added to the archive. In each fold of the cross-validation, 90% of the solutions are used to construct the metamodels of the competing frameworks. The corresponding frameworks are then used to compare every pair (*p*, *q*) of the remaining 10% of archive points using the SEP metric. We apply a constrained domination check to identify the predicted relationship between the two solutions and compare it with the true relationship given by their high-fidelity values under the same constrained domination check, computing the selection error function *E*(*p*, *q*) for each pair of test archive solutions. The above process is repeated 10 times using different blocks of 90% of the points, yielding 10 different SEP values for each framework. This cross-validation procedure does not require any new solution evaluations, as all computations are performed on the already-evaluated archive points and their predicted values from each framework. Thereafter, the best framework is identified based on the median SEP value of the frameworks.
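The split itself needs no new solution evaluations; a stdlib-only sketch of the partitioning step (function names are ours):

```python
import random

def kfold(n, K, seed=0):
    """Shuffle n archive indices and return K disjoint test folds;
    the training set for fold k is everything outside folds[k]."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[k::K] for k in range(K)]

folds = kfold(100, 10)                        # ten 10% test folds
train_0 = [i for f in folds[1:] for i in f]   # the 90% used to fit metamodels
```

Scoring each framework's metamodels on each test fold with the SEP metric then produces the 10 SEP values per framework described above.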

Finally, the Wilcoxon rank-sum test is performed between the best framework and every other framework. All frameworks that are statistically indistinguishable from the best (having *p* > 0.05) are collected into the best-performing set M*B*. Then a framework (M*b*) is chosen at random from M*B* for the next epoch. Since each of these frameworks performs similarly in terms of median performance, the choice of a random framework makes the ASM approach diverse, with the possibility of using different metamodeling landscapes in successive epochs. In practice, this procedure prevents the overall approach from getting stuck in similar metamodeling frameworks for long, even if one of them is among the best-performing frameworks.

#### *4.3. Trust-Region Based Real-Coded Genetic Algorithms*

Before we present the results, we need to discuss one other important algorithmic aspect. Since the metamodels are not error-free, predictions close to high-fidelity solutions are usually more accurate than predictions far from them. Therefore, we use a trust-region method [65] in which predictions are restricted within a radius *Rtrust* of each high-fidelity solution in the variable space. Trust-region methods have been used in non-evolutionary metamodeling studies [35,36]. Another parameter, *Rprox*, is also introduced, which defines the minimum distance any new solution must keep from an archive member, so as to provide a diverse set of in-fill solutions. Together they define a feasible search region *Rsearch* around every high-fidelity solution: *Rprox* ≤ *Rsearch* ≤ *Rtrust*. Following the trust-region literature [66], we reduce the two radii at every epoch by constant factors: $R\_{trust}^{\text{new}} = 0.75 R\_{trust}^{\text{old}}$ and $R\_{prox}^{\text{new}} = 0.1 R\_{trust}^{\text{new}}$. Reducing the two radii places increasing trust in the neighborhood of high-fidelity solutions as iterations proceed. These factors were found to perform well in a number of trial-and-error studies conducted prior to obtaining the results presented in the next section.
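The per-epoch radius schedule described above is simple to state in code (initial value of *Rtrust* taken from the parameter settings in Section 5.1; the factors 0.75 and 0.1 are as given):

```python
import math

def shrink(r_trust):
    """One epoch of the trust-region schedule:
    R_trust <- 0.75 * R_trust, then R_prox <- 0.1 * (new R_trust)."""
    r_trust *= 0.75
    return r_trust, 0.1 * r_trust

r_trust = math.sqrt(10)    # initial R_trust for n = 10 normalized variables
for _ in range(3):         # radii after three epochs
    r_trust, r_prox = shrink(r_trust)
```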

The optimization methods for the metamodels are modified as follows. At generation *t*, a standard binary constrained tournament selection is applied to the parent population *Pt*, comparing two competing population members using the metamodeled objectives, constraints, or selection criteria described before to choose the winner. Standard recombination and mutation operators (without any care for the trust-region concept) are used to create an offspring population, which is then combined with the parent population *Pt*, and the better half is chosen as the next parent population *Pt*+1 using the trust-region concept. We first count the number of solutions in the combined population that lie within the two trust regions. If this number is smaller than or equal to *N*, they are copied to *Pt*+1 and the remaining slots are filled with the solutions closest to the high-fidelity solutions in the variable space. On the other hand, if the number is larger than *N*, the same binary constrained tournament selection method is applied to pick *N* solutions from them, which are copied to *Pt*+1.

#### **5. Results and Discussion**

We present the results of the ASM approach on 18 different test and engineering problems. The problems include two- to five-objective, constrained, and unconstrained problems. In order to obtain robust performance, we include all 10 frameworks as options for switching in our ASM approach. The performance of the ASM approach is compared with that of each framework alone. We then compare ASM's performance with three recently proposed multiobjective metamodeling methods: MOEA/D-EGO [14], K-RVEA [23], and CSEA [33].

#### *5.1. Parameter Settings*

For two-objective problems, we use NSGA-II [39] for the M1-2 and M2-2 frameworks. For problems with a higher number of objectives, we use the NSGA-III [60] procedure. Note that other multiobjective evolutionary algorithms (e.g., MOEA/D [14] or RVEA [23]) could also be used. A population of size *N* = 100 is used when the number of reference lines (*H*) is less than 100; otherwise, the population size is set equal to *H*. The initial archive size is set according to Table 2. Other parameter settings are as follows: number of generations *τ* = 300, SBX crossover probability *pc* = 0.95, polynomial mutation probability *pm* = 1/*n* (where *n* is the number of variables), and distribution indices for the SBX and mutation operators *ηc* = 20 and *ηm* = 20, respectively. The initial value of *Rtrust* is set to $\sqrt{n}$ for the normalized problems having variable domain $[0, 1]^n$. The number of reference points, SEmax, and the resulting number of epochs for each problem are presented in Table 2.


**Table 2.** Parameter values for 18 problems.

#### *5.2. Two-Objective Unconstrained Problems*

First, we apply our proposed methodologies to two-objective unconstrained problems: ZDT1, ZDT2, ZDT3, ZDT4, and ZDT6. Table 3 presents the median IGD values of 11 runs for each framework applied standalone from start to end. In the absence of any constraint, or with a single constraint, M1-1 and M2-1 are identical frameworks; so are M1-2 and M2-2, M3-1 and M4-1, and M3-2 and M4-2. This is why we leave the M2-1, M2-2, M4-1, and M4-2 entries blank for unconstrained and single-constraint problems in the table. The best-performing method is first identified based on the median IGD values and is marked in bold. A *p*-value from a Wilcoxon rank-sum test is then computed between the 11 runs of each other method and the 11 runs of the best-performing method. If an algorithm produces a *p*-value greater than 0.05, it has produced a statistically similar performance to the best-performing method, and its median IGD value is marked in italics. It is clear from the table that the ASM approach (right-most column), being mostly in bold, performs better than or equivalent to all frameworks on all five ZDT problems, whereas M1-1 performs the best on the first four problems. M1-2 and M3-1 perform well on three test problems, whereas M6 performs the best on the ZDT6 problem. The nondominated solutions obtained for the two-objective constrained and unconstrained problems in the median run are presented in Figure 7. We also show the performance of the other compared algorithms, MOEA/D-EGO [14], K-RVEA [23], and CSEA [33], in the figure.
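The statistical protocol above (best method in bold; statistically similar methods, *p* > 0.05, in italics) can be sketched in plain Python. The helper `ranksum_pvalue` uses the normal approximation to the Wilcoxon rank-sum statistic and assumes no tied values; all names are ours, chosen for illustration:

```python
from statistics import NormalDist

def ranksum_pvalue(a, b):
    """Two-sided Wilcoxon rank-sum test (normal approximation, no ties)."""
    n1, n2 = len(a), len(b)
    pooled = sorted(list(a) + list(b))
    ranks = {v: r + 1 for r, v in enumerate(pooled)}  # assumes distinct values
    W = sum(ranks[v] for v in a)                      # rank sum of sample a
    mu = n1 * (n1 + n2 + 1) / 2
    sigma = (n1 * n2 * (n1 + n2 + 1) / 12) ** 0.5
    z = (W - mu) / sigma
    return 2 * (1 - NormalDist().cdf(abs(z)))

def similar_to_best(igd_runs, best_key, alpha=0.05):
    """Flag methods statistically similar to the best one (p > alpha),
    mirroring the bold/italic convention used in Table 3."""
    best = igd_runs[best_key]
    return {k: (k == best_key or ranksum_pvalue(runs, best) > alpha)
            for k, runs in igd_runs.items()}
```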

It is apparent that the ASM approach is able to find a better distributed and converged set of points than the other methods for an identical number of SEs.

The epoch-wise proportion of usage of each framework over 11 runs of the ASM approach is shown in Figure 8 for all five ZDT problems. For ZDT1, standalone M1-1, M2-1, M3-1, and M4-1 perform in a statistically similar manner, as shown in Table 3, but the ASM approach mostly restricts its epoch-wise choice to M1-1, M1-2, M2-1, and M2-2 and produces a similar performance in most epochs. Since multiple frameworks can appear with a similar performance in an epoch, the proportions (shown in Figure 8) need not sum to one at each epoch. For ZDT2, only M1-1 and M1-2 perform well as standalone frameworks (Table 3), and the ASM approach is able to pick these two frameworks to produce the best-performing result. Notice that since ZDT1 and ZDT2 do not have any constraints, M1-1 and M2-1 are identical frameworks, as are M1-2 and M2-2. Except on ZDT6, M1-1, M1-2, M2-1, and M2-2, for which the objectives are independently metamodeled, turn out to be the dominant frameworks. For ZDT6, however, M3-2, M4-2, and M6 show their dominance. On ZDT4, almost all the frameworks are switched among early on, but the approach settles on the M1 and M2 frameworks in the latter part of the optimization runs. Switching among different frameworks performs well on all five problems.

The switching patterns of frameworks for the median-performing run on ZDT2, ZDT4, and ZDT6 are shown in Figure 9. Although multiple frameworks may be retained at the end of each epoch, the figure shows the specific framework chosen in that run. For ZDT2, the ASM approach juggles mostly between the M1 and M2 variants and produces the best-performing result, even better than M1 or M2 alone. On ZDT4, the ASM approach alternates among eight frameworks in the beginning, settles on four of them (M3 and M4 variants) in the middle, and then uses the M3 variants at the end, producing a result statistically equivalent to that of M1-1 alone. Interestingly, while M1-1 applied as a standalone framework from start to end performs the best, the ASM approach does not use M1-1 in any of the epochs. The switching of frameworks from epoch to epoch is clear from these plots.

**Figure 7.** Non-dominated solutions of the final archive for the median run of the ASM approach on two-objective ZDT and constrained problems. In all cases, a well-diversified set of near-PO solutions is obtained with a limited number of solution evaluations.

#### *5.3. Two-Objective Constrained Problems*

Next, we apply the ASM approach and all the frameworks separately to standard two-objective constrained problems: BNH, SRN, TNK, OSY, and the welded beam problem (WB) [54]. The ASM approach performs the best on three of the five problems, followed by M1-1, which performs the best on two problems; however, both methods perform statistically the best on all five problems. The other individual frameworks do not perform as well. Figure 8 shows the epoch-wise utilization of different frameworks for TNK and WB in 11 runs. The plots for TNK show that ASM almost always chooses M1-1 or M1-2 as the best-performing frameworks, as supported by the IGD values in Table 3.

On the WB problem, however, the ASM approach selects M1-1, M5, and M6 in most of the epochs, despite the poor performance of the latter two when applied in a standalone manner from start to end.

**Figure 8.** Epoch-wise proportion of appearance of the 10 frameworks within *MB* in 11 runs of the ASM approach for the ZDT problems, TNK, and the welded beam design problem, indicating the use of multiple frameworks during optimization. Some problems use specific frameworks more frequently.

**Figure 9.** Switching among frameworks for the median IGD run of the ASM approach for ZDT2, ZDT4, and ZDT6 indicates that many frameworks are used during the optimization process.

*Math. Comput. Appl.* **2021**, *26*, 5





#### *5.4. Three and More Objective Constrained and Unconstrained Problems*

Next, we apply all ten frameworks and the ASM approach to three-objective optimization problems (DTLZ2, DTLZ4, DTLZ5, and DTLZ7) and also to two three-objective constrained problems (C2-DTLZ2 and the car side impact problem CAR [60]). Table 3 shows that while M2-2 works uniquely the best on CAR, M1-2 and M3-2 on C2-DTLZ2, and M1-1, M1-2, and M6 on DTLZ4, the performance of the ASM approach is better than or equivalent to that of all 10 frameworks on these problems.

The epoch-wise proportion of utilization of the 10 frameworks over 11 runs is shown in Figure 10 for the three- and five-objective problems. It can be clearly seen that the M3-1 to M6 frameworks are not usually chosen by the ASM approach on most of these problems, except for complex problems such as DTLZ4. Switching is confined to M1-1 through M2-2 for most problems, except on DTLZ4, for which all generative frameworks are found to be useful at certain stages of the optimization process. DTLZ7 works better with the simultaneous frameworks M1-2 and M2-2.

**Figure 10.** Epoch-wise proportion of usage of the 10 frameworks in 11 runs of the ASM approach for three- and five-objective problems.

On the two five-objective problems, unconstrained DTLZ2 and constrained C2-DTLZ2, M1-2 alone and the ASM approach perform the best, with a statistically significant difference from the other frameworks. The constrained C2-DTLZ2 problem uses a similar switching pattern for its three- and five-objective versions.

Table 4 presents the average rank of each of the 10 frameworks and the ASM approach over the 18 problems. The table shows that the ASM approach performs the best overall, followed by M1-2, M2-2, and M3-1. This indicates that, overall, metamodeling the objectives independently is a better approach for these problems. M6, although the most efficient in the number of metamodels, performs the worst.


**Table 4.** Average rank of 10 frameworks and the ASM approach on 18 problems based on Wilcoxon rank-sum test.

#### *5.5. Comparison with Existing Methods*

Next, we examine the performance of our adaptive switching metamodeling (ASM) strategy by comparing it with a few recent algorithms, namely, MOEA/D-EGO [14], K-RVEA [23], and CSEA [33]. The algorithms are implemented in PlatEMO [67]. Since these three competing algorithms can only be applied to unconstrained problems, only the ZDT and DTLZ problems are considered here. We plan to compare our constrained approach with existing constraint handling methods [32] in the future. Parameter settings identical to those used with the ASM approach are used for the three competing algorithms. Table 5 presents the median IGD value of each algorithm. The Wilcoxon rank-sum test results are also shown. It is clearly evident that the ASM approach outperforms the three competing methods, of which K-RVEA performs well on only two of the nine problems.

**Table 5.** Median IGD on unconstrained problems using the ASM approach and the MOEA/D-EGO, K-RVEA, and CSEA algorithms. DNC denotes "did not converge" within the given time.


#### **6. Conclusions**

In this paper, we have provided a brief review of existing metamodeling methods for multiobjective optimization, since there has been a surge in such studies in the recent past. Since these methods call for modeling multiple objectives and constraints in a progressive manner, a recently proposed taxonomy of 10 frameworks, involving metamodeling of independent or aggregate functions of objectives and constraints, has been argued to cover a wide variety of such methods. Each framework has been presented in detail, comparing and contrasting them in terms of the number of metamodeling functions to be constructed, the number of internal optimization problems to be solved, and the type of optimization methods to be employed. We have argued that each metamodeling framework may be ideal at different stages of an optimization run on an arbitrary problem; hence, an ensemble use of all 10 frameworks becomes a natural choice. To obtain an efficient multiobjective metamodeling algorithm, we have proposed an adaptive switching based metamodeling (ASM) methodology which automatically chooses the most appropriate framework epoch-wise during the course of an optimization run. To choose the best framework in every epoch, we perform statistical tests based on a newly proposed acceptance criterion, the selection error probability (SEP), which counts the correct pairwise relationships of objectives between two test solutions in a *k*-fold cross-validation test, instead of calculating the usual mean-squared error of the metamodeled objective values from the true values. We have observed that SEP is less sensitive to outliers and is much better suited for multiobjective constrained optimization. In each epoch, the ASM approach switches to an appropriate framework, which then creates a prespecified number of in-fill points by using either an evolutionary single- or multiobjective algorithm or a multimodal or niche-based real-parameter genetic algorithm. On 18 test and engineering problems having two to five objectives and multiple constraints, the ASM approach has been found to perform much better than each framework alone and also better than three other existing metamodeling multiobjective algorithms.

It has been observed that on most problems, switching among the M1 and M2 frameworks, in which the objectives are independently metamodeled, has performed the best; whether the constraints are metamodeled in an aggregate manner or independently appears to matter less. However, for more complex problems, such as ZDT3, ZDT6, ZDT4, DTLZ4, and the engineering design problems, all 10 frameworks, including M5 and M6, have been involved at different stages of optimization. Interestingly, certain problems have preferred to pick generative frameworks (M*i*-1 and M5) only, while some others have preferred simultaneous frameworks (M*i*-2 and M6). Clearly, further investigation is needed to decipher a detailed problem-wise pattern of framework selection, but this first study on statistics-based adaptive switching has clearly shown its advantage over each framework applied alone.

While the Kriging metamodeling method has been used for all frameworks in this paper, this study can be extended to choose the best metamodeling method from an ensemble of RBF, SVR, or other response surface methods to make the overall approach more computationally efficient. In many practical problems, some functions may be relatively less time-consuming, thereby creating a *heterogeneous* metamodeling scenario [16,68,69]. A simple extension of this study would be to formulate a heterogeneous *MP* (for example, for a two-objective problem in which *f*<sub>1</sub> involves a larger evaluation time, M1-1's objective functions can be chosen as ( *f̂*<sub>1</sub>(**x**), *f*<sub>2</sub>(**x**) ), in which the objective *f*<sub>2</sub> is not metamodeled at all). However, more involved algorithms can be tried to handle such pragmatic scenarios. Another practical aspect comes from the fact that a cluster of objectives and constraints may come from a single expensive evaluation procedure (for example, a compliance objective and a stress constraint both follow from one expensive finite element analysis in a mechanical component design problem), whereas other functions come from an evaluation procedure on a different time scale. The resulting definition of an epoch and the overall metamodeling approach would need to be reconsidered to keep the overall approach efficient. Other devices, such as the use of a low-fidelity evaluation scheme for expensive objectives and constraints early in the optimization process via a multifidelity scheme, and the use of domain-informed heuristics to initialize the population and repair offspring solutions, should also be considered while developing efficient metamodeling approaches.

**Funding:** This research received no external funding.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Article* **Differential Evolution in Robust Optimization Over Time Using a Survival Time Approach**

**José-Yaír Guzmán-Gaspar 1,\*, Efrén Mezura-Montes <sup>1</sup> and Saúl Domínguez-Isidro <sup>2</sup>**


Received: 2 October 2020; Accepted: 23 October 2020; Published: 26 October 2020

**Abstract:** This study presents an empirical comparison of standard differential evolution (DE) against three random sampling methods for solving robust optimization over time (ROOT) problems with a survival time approach, to analyze its viability and capacity to solve problems in dynamic environments. A set of instances with four different dynamics, generated by two different configurations of two well-known benchmarks, is solved. This work also introduces a comparison criterion that allows the algorithm to discriminate among solutions with similar survival times to improve the selection process. The results show that standard DE performs well in finding ROOT solutions, improving on the results reported by state-of-the-art approaches in the studied environments. Finally, it was found that the chaotic dynamic, regardless of the type of peak movement in the search space, is a source of difficulty for the proposed DE algorithm.

**Keywords:** robust optimization; differential evolution; ROOT

#### **1. Introduction**

Optimization is an inherent process in various areas of study and everyday life. The search to improve processes, services, and performance has given rise to different solution techniques. However, there are problems in which uncertainty is present over time, given that the solution's environment can change at a specific time. These types of problems are named Dynamic Optimization Problems (DOPs) [1]. This study deals with dynamic problems where the environment of the problem changes over time. Various studies have been carried out to solve DOPs through tracking moving optima (TMO), which is characterized by the search for and implementation of the global optimal solution every time the environment changes [2–4].

Evolutionary algorithms, such as Differential Evolution (DE), have shown good performance in solving tracking problems [5–7]. However, searching for and implementing the optimum each time the environment changes may not be feasible due to different circumstances, such as time or cost.

The approach introduced in [8] tries to solve DOPs through a procedure known as robust optimization over time (ROOT). ROOT seeks to solve DOPs by looking for a good solution for multiple environments and preserving it for as long as possible, while its quality does not drop below a pre-established threshold. The solution found is called a Robust Solution Over Time (RSOT).

In this regard, Fu et al. [9] introduced different measures to characterize environmental changes. After that, the authors developed two definitions of robustness [10]. The first one was based on "Survival Time": the length of time during which a solution is considered acceptable (a fitness threshold must be defined beforehand).

The second definition was based on the "Average Fitness": the solution's average fitness maintained during a previously defined time window. These measures incorporate the concepts of robustness and consider the values of error estimators. An algorithm performance measure was also suggested for finding ROOT solutions. The study was carried out using a modified version of the moving peaks benchmark (mMPB) [10].

Jin et al. [11] proposed a ROOT framework that includes three variants of the Particle Swarm Optimization (PSO) algorithm: PSO with a simple restart strategy (sPSO), PSO with a memory scheme (memPSO), and a variant that implements the species technique (SPSO). The authors applied a radial basis function as an approximation model and an autoregressive model as a predictor.

On the other hand, Wang introduced the concept of robustness in a multi-objective environment, where a framework is created to find robust Pareto fronts [12]. The author adopted a dynamic multi-objective evolutionary optimization algorithm in the experiments. At the same time, Huang et al. considered the cost derived from implementing new solutions, thus addressing the ROOT problem with a multi-objective PSO (MOPSO); the metric of Fu et al. was applied in that study [13].

Yazdani et al. introduced a new semi-ROOT algorithm that looks for a new solution when the current one is not acceptable, or if the current one is acceptable but the algorithm finds a better solution, whose implementation is preferable even with the cost of change [14].

Novoa-Hernández, Pelta, and Corona analyzed ROOT behavior using several approximation models [15]. The authors suggested that the radial basis network model with a radial basis function works better for problems with a low number of peaks. However, considering all the scenarios, the SVM model with a Laplace kernel showed notably better performance than the others in the tests carried out.

Novoa-Hernández and Amilkar [16] reviewed different relevant contributions to ROOT, analyzing papers hosted in the SCOPUS database. Concerning new methods to solve ROOT problems, Yazdani, Nguyen, and Branke proposed a new framework using a multi-population approach where sub-populations track peaks and collect information from them [17]. Adam and Yao introduced three methods to address ROOT (Mesh, Optimal in time, and Robust), noting that they significantly improve on the state-of-the-art ROOT results [18]. Fox, Yang, and Caraffini studied different prediction methods in ROOT, including linear and quadratic regression, an autoregressive model, and support vector regression [19]. Finally, Liu and Liang applied a ROOT approach to minimize an electric bus transit center's total cost in the first stage [20].

In different studies, DE has been used to solve ROOT problems using the Average Fitness approach, achieving competitive results [21,22]. However, to the best of the authors' knowledge, there are no studies that determine DE's performance in solving ROOT problems with the Survival Time approach, and this is precisely where this work focuses. This research presents an empirical comparison of standard DE against three random sampling methods for solving robust optimization over time problems with a survival time approach, to analyze its viability and capacity to solve problems in dynamic environments.

The paper is organized as follows: Section 2 includes ROOT's definition under a survival time approach, while in Section 3 the implemented methods based on random sampling are detailed. Section 4 details the standard differential evolution and the objective function used by the algorithm in the present study. Section 5 specifies the benchmark problems to be solved. After that, Section 6 specifies the experimental settings and Section 7 shows the results obtained. Finally, Section 8 summarizes the conclusions and future work.

*Math. Comput. Appl.* **2020**, *25*, 72

#### **2. Survival Time Approach**

Under this approach, a threshold is predefined to specify the quality that a solution must have to be considered good, or suitable to survive. Once the threshold is defined, the search begins for a solution whose fitness can remain above the threshold in as many environments as possible. In this sense, the solution is maintained until its quality no longer meets the predefined expectation, and then a new robust solution over time must be sought.

Equation (1) details the function *F<sup>s</sup>*(*x⃗*, *t*, *V*) used to calculate the survival time fitness of a solution *x⃗* at time *t*. It measures the number of environments in which the solution remains above the threshold *V*.

$$F^s(\vec{x}, t, V) = \begin{cases} 0 & \text{if } f\_t(\vec{x}) < V \\ 1 + \max\{l \mid \forall i \in \{t, t+1, \ldots, t+l\} : f\_i(\vec{x}) \ge V\} & \text{otherwise} \end{cases} \tag{1}$$
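A direct reading of Equation (1), assuming the fitness of a fixed solution in each environment has been precomputed into a list (a hypothetical layout chosen for illustration), is:

```python
def survival_time(fitness_over_time, t, V):
    """Survival time fitness per Eq. (1): 0 if the solution fails the
    threshold at time t, else 1 plus the largest l such that it stays
    at or above V in every environment t, ..., t + l."""
    # fitness_over_time[i] holds f_i(x) for the fixed solution x.
    if fitness_over_time[t] < V:
        return 0
    l = 0
    while (t + l + 1 < len(fitness_over_time)
           and fitness_over_time[t + l + 1] >= V):
        l += 1
    return 1 + l
```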

#### **3. Random Sampling Methods**

In the study presented in [18], the authors proposed three random sampling methods to solve ROOT problems, showing better performance than state-of-the-art algorithms. The methods are described below.

In all three methods, the best solution is searched for in the current time's solution space (modifying the solution space when the "Robust" method is used), and that solution is then used at subsequent times according to the approach adopted (Survival Time or Average Fitness).

#### *3.1. Mesh*

This method performs random sampling in the current search space and then uses the sample with the best fitness in the current environment as the robust solution over time, keeping that solution at the following times (Figure 1).

**Figure 1.** Mesh method. The yellow point is the robust solution found by the method, which is used at subsequent times.

#### *3.2. Time-Optimal*

This method performs a search similar to the "Mesh" method, with the difference that the best solution found being improved using a local search (Figure 2).

**Figure 2.** Time-optimal method. The green point is the robust solution found by the method, which is used at subsequent times.

#### *3.3. Robust*

This method performs a search similar to the "Mesh" method, differing in that a smoothing preprocessing of the solution space is performed before the search. As seen in Figure 3, the solution obtained (green dot) may differ from the solution with the best fitness in the raw environment (yellow dot).

**Figure 3.** Robust method. The green point is the robust solution found by the method, which is used at subsequent times.

#### **4. Differential Evolution**

In 1995, Storn and Price proposed an evolutionary algorithm to solve optimization problems in continuous spaces. DE is based on a population of solutions that evolves through simple recombination and mutation, thus improving the individuals' fitness [23].

Considering that this work is, to the best of the authors' knowledge, the first attempt to study DE on this type of ROOT problem (i.e., survival time), and also taking into account that the most popular DE variant (DE/rand/1/bin) has provided competitive results on ROOT problems under an average fitness approach [21,22], the algorithm used in this study is precisely that variant, DE/rand/1/bin, where "rand" (random) refers to the base vector used in the mutation, "bin" (binomial) refers to the crossover type used, and 1 is the number of vector differences computed.

The algorithm starts by randomly generating a uniformly distributed population *x⃗<sub>i,G</sub>*, *i* = 1, ..., *NP*, where *NP* is the number of individuals in each generation *G*.

After that, the algorithm enters an evolution cycle until the stop condition is reached. We applied the maximum number of evaluations allowed ''*MAXEval*" as the stop condition.

Subsequently, to evolve the individuals, the algorithm performs mutation, recombination, and replacement for each one in the current generation. One of the most popular mutations is DE/rand/1, shown in Equation (2), where *r*0 ≠ *r*1 ≠ *r*2 ≠ *i* are the mutually distinct indices of individuals randomly chosen from the population, 1 is the number of vector differences used in the mutation, and *F* > 0 is the scale factor.

$$
\vec{v}\_{i,G} = \vec{x}\_{r0,G} + F(\vec{x}\_{r1,G} - \vec{x}\_{r2,G}) \tag{2}
$$

The vector obtained, *v⃗<sub>i,G</sub>*, is known as the mutant vector; it is recombined with the target (parent) vector by binomial crossover, as detailed in Equation (3).

$$u\_{i,j,G} = \begin{cases} v\_{i,j,G} & \text{if } (rand\_j \le CR) \text{ or } (j = j\_{rand}) \\ x\_{i,j,G} & \text{otherwise} \end{cases} \tag{3}$$

In this study, the elements of the child (also called trial) vector *u⃗<sub>i,G</sub>* are limited according to pre-established maximum and minimum limits, also known as boundary constraints. Based on the study in [24], we use the boundary method (see Algorithm 1, line 14). In the selection process, the algorithm determines the vector that will prevail in the next generation, between the parent (target) and the child (trial), as expressed in Equation (4).

$$\vec{x}\_{i,G+1} = \begin{cases} \vec{u}\_{i,G} & \text{if } f(\vec{u}\_{i,G}) \le f(\vec{x}\_{i,G}) \\ \vec{x}\_{i,G} & \text{otherwise} \end{cases} \tag{4}$$

**Algorithm 1:** "DE/rand/1/bin" Algorithm. *NP*, *MAXEval*, *CR* and *F* are parameters defined by the user. *D* is the dimension of the problem.

```
 1: G ← 0
 2: Generate a uniform initial random population x_{i,G}, i = 1, ..., NP
 3: Compute f(x_{i,G}) ∀i, i = 1, ..., NP
 4: Eval ← NP
 5: while Eval < MAXEval do
 6:   for i = 1 to NP do
 7:     Randomly select r0 ≠ r1 ≠ r2 ≠ i
 8:     j_rand ← randi([1, D])
 9:     for j = 1 to D do
10:       if rand_j([0, 1]) < CR or j = j_rand then
11:         u_{i,j,G} ← x_{r0,j,G} + F(x_{r1,j,G} − x_{r2,j,G})
12:       else
13:         u_{i,j,G} ← x_{i,j,G}
14:     u_{i,G} ← min(max(u_{i,G}, x_min), x_max)
15:     if f(u_{i,G}) ≥ f(x_{i,G}) then
16:       x_{i,G+1} ← u_{i,G}
17:     else
18:       x_{i,G+1} ← x_{i,G}
19:     Eval ← Eval + 1
20:     if Eval ≥ MAXEval then
21:       break
22:   G ← G + 1
```
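Algorithm 1 translates almost line-for-line into Python. The sketch below is our own code (with illustrative rather than tuned parameter values) and maximizes a given objective `f`, as in the survival-time setting:

```python
import random

def de_rand_1_bin(f, bounds, NP=50, F=0.5, CR=0.9, max_evals=5000):
    """Minimal DE/rand/1/bin sketch maximizing f, following Algorithm 1.
    bounds is a list of (xmin, xmax) pairs, one per dimension."""
    D = len(bounds)
    pop = [[random.uniform(lo, hi) for lo, hi in bounds] for _ in range(NP)]
    fit = [f(x) for x in pop]
    evals = NP
    while evals < max_evals:
        for i in range(NP):
            # r0, r1, r2 mutually distinct and different from i (line 7).
            r0, r1, r2 = random.sample([k for k in range(NP) if k != i], 3)
            j_rand = random.randrange(D)
            u = []
            for j in range(D):
                if random.random() < CR or j == j_rand:
                    v = pop[r0][j] + F * (pop[r1][j] - pop[r2][j])
                else:
                    v = pop[i][j]
                lo, hi = bounds[j]
                u.append(min(max(v, lo), hi))   # boundary repair (line 14)
            fu = f(u)
            evals += 1
            if fu >= fit[i]:                    # greedy replacement (lines 15-18)
                pop[i], fit[i] = u, fu
            if evals >= max_evals:
                break
    best = max(range(NP), key=lambda k: fit[k])
    return pop[best], fit[best]
```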


Equation (1) shows the function to obtain an individual's fitness through the survival time approach. However, the fitness obtained is not enough to differentiate similar individuals, i.e., individuals who have survived the same number of environments. That is why, in the implemented algorithm, we consider an additional calculation to help identify better solutions.

We propose to use the average of the solution's quality in the environments it has survived. This average value helps to differentiate solutions with similar survival times. Therefore, the objective function now considers both the number of survived environments and the performance achieved by the solution in those environments.

Considering that the maximum height of the peaks is set to 70 (see Table 3), the objective function for the implemented algorithm is given by the result of Equation (1) multiplied by 100, plus the average fitness of the solution throughout the environments it has survived.
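Putting the two parts together, the tie-breaking objective just described can be sketched as follows (a hypothetical helper, using the same precomputed-fitness layout as in our reading of Eq. (1)):

```python
def de_objective(fitness_over_time, t, V=50.0):
    """100 * survival time (Eq. 1) + mean fitness over the survived
    environments, to break ties between equal survival times."""
    if fitness_over_time[t] < V:
        return 0.0
    end = t
    while (end + 1 < len(fitness_over_time)
           and fitness_over_time[end + 1] >= V):
        end += 1
    survived = fitness_over_time[t:end + 1]
    # The factor 100 makes survival time dominate the average term,
    # since peak heights never exceed 70 (Table 3).
    return 100.0 * (end - t + 1) + sum(survived) / len(survived)
```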

#### **5. Benchmark Problems**

The problems tackled in this study are based on the Moving Peaks Benchmark (MPB) [25] and are configured similarly to various specialized publications on ROOT, specifically as in [18].

Two modified MPBs can be highlighted, which are described in the following subsections. The dynamics used are presented in Table 1, where Δ*φ* is the increment from time *t* to time *t* + 1 of the *φ* parameter.

**Table 1.** Dynamic Functions.


#### *5.1. Moving Peaks Benchmark 1 (MPB1)*

In this benchmark, environments with conical peaks of height *h*(*t*) ∈ [*hmin*, *hmax*], width *w*(*t*) ∈ [*wmin*, *wmax*], and center *c*(*t*) ∈ [*xmin*, *xmax*] are generated, where the design variable *x* is bounded in [*xmin*, *xmax*]. The function that generates the environment is expressed in Equation (5), where the dynamic functions for height and width are given in Table 1, while the center moves according to Equation (6). *r<sup>i</sup>* is drawn uniformly from a D-dimensional sphere of radius *s<sup>i</sup>*, and *λ* ∈ [0, 1] is a fixed parameter.

$$f\left(\vec{x},\vec{a}(t)\right) = \max\_{i=1}^{m} \{h^i(t) - w^i(t) \|\vec{x} - \vec{c}^i(t)\|\_{l\_2}\}\tag{5}$$

$$\begin{aligned} \vec{c}^i(t+1) &= \vec{c}^i(t) + \vec{v}^i(t+1) \\ \vec{v}^i(t+1) &= \mathbf{s}^i \frac{(1-\lambda)r^i(t+1) + \lambda \vec{v}^i(t)}{|| (1-\lambda)r^i(t+1) + \lambda \vec{v}^i(t) ||} \end{aligned} \tag{6}$$

In the present study, two problems generated by this benchmark are solved: *λ* = 0 implies that the movement of the peaks is random, while *λ* = 1 implies that the movement is constant in the direction *v<sup>i</sup>*(*t*).

#### *5.2. Moving Peaks Benchmark 2 (MPB2)*

The set of test functions in this benchmark is described in Equation (7), where *a*(*t*) is the environment at time step *t*; *h<sup>i</sup>*(*t*), *w<sup>i</sup>*(*t*), and *c<sup>i</sup>*(*t*) are the height, width, and center of the *i*-th peak function at time *t*, respectively; *x* is the decision variable; and *m* is the total number of peaks. *h<sup>i</sup>*(*t* + 1) and *w<sup>i</sup>*(*t* + 1) vary according to Table 1. An additional technique that uses a rotation matrix is applied to rotate the centers [25].

$$f\left(\vec{x},\vec{a}(t)\right) = \frac{1}{d} \sum\_{j=1}^{d} \max\_{i=1}^{m} \left\{ h^{i}(t) - w^{i}(t) \|\vec{x} - \vec{c}^{i}(t)\|\right\} \tag{7}$$

#### **6. Experimental Settings**

Based on the information in Section 5, different environments are generated as test problems and they are summarized in Table 2.


**Table 2.** Summary of problems.

The parameter settings of the problems are detailed in Table 3.

**Table 3.** Parameters settings.


The height and width of the peaks were randomly initialized in the predefined ranges. The centers were randomly initialized within the solution space.

A survival threshold *V* = 50 is selected, representing the most difficult case solved in the literature under the survival approach. The higher the survival threshold, the more difficult it is to find solutions that satisfy multiple scenarios.

The DE parameters were fine-tuned using the iRace tool [26] and they are summarized in Table 4, where NP is the population size, CR is the crossover parameter, and F is the scale factor.



For each problem, a solution is sought at each time *t* ∈ {2, ..., 100}.

In order to evaluate an RSOT at a specific time, approximate and predictive methods have been used in the literature, so the performance of an algorithm depends on their accuracy. In this study, however, we want to know the behavior of DE when solving the ROOT environments assuming ideal predictors to evaluate the solutions. In this regard, the process to study the algorithm's ability to find RSOTs using DE at each instant of time is as follows:


#### **7. Results and Discussion**

The results for the problems generated with dynamics 1–4 are detailed in Table 7 and shown graphically in Figures 6–9, one figure per dynamic. In each figure, the panels labeled (a) and (b) present the results obtained on the MPB1 problems, while those labeled (c) and (d) refer to the MPB2 problems. In all cases, the average survival values obtained by the Mesh, time-optimal, and robust approaches are compared against DE.

Non-parametric statistical tests [27] were applied to the numerical results presented in Table 7. The 95%-confidence Kruskal–Wallis and Friedman tests were used, and the obtained *p*-values are reported in Table 5.

**Table 5.** Results of the 95%-confidence Kruskal–Wallis (KW) and Friedman (F) tests. The symbol (\*) after the letter "D" in the *Problem* column refers to the type of dynamic used according to the *Dynamic* column. A *p*-value less than 0.05 means that there are significant differences among the compared algorithms on such problems.


To further determine differences among the compared algorithms, the 95%-confidence Wilcoxon test was applied to pair-wise comparisons for each problem instance. The obtained *p*-values are reported in Table 6, where significant improvements at a significance level *α* = 0.05 are shown in boldface. We can observe that the Wilcoxon test confirmed the significant differences obtained in the Kruskal–Wallis and Friedman tests, most of them in comparisons of DE/rand/1/bin versus the random sampling methods, with the exception of four cases corresponding to problems generated with dynamic 4 (B1D4-1 and B1D4-2, in both the comparison of DE/rand/1/bin versus Mesh and DE/rand/1/bin versus Time-optimal).
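The testing pipeline described above can be sketched with `scipy.stats`. The survival-time samples below are purely illustrative placeholders, not the paper's data; only the sequence of tests (omnibus Kruskal–Wallis and Friedman, then pairwise Wilcoxon at *α* = 0.05) follows the text.

```python
import numpy as np
from scipy import stats

# Hypothetical survival-time samples for the four compared methods on
# one problem instance (illustrative data only, not the paper's results).
rng = np.random.default_rng(0)
de     = rng.normal(6.0, 1.0, 30)
mesh   = rng.normal(4.0, 1.0, 30)
topt   = rng.normal(4.2, 1.0, 30)
robust = rng.normal(4.1, 1.0, 30)

# Omnibus tests: is there any difference among the four samples?
kw_stat, kw_p = stats.kruskal(de, mesh, topt, robust)
fr_stat, fr_p = stats.friedmanchisquare(de, mesh, topt, robust)

# Pairwise follow-up: Wilcoxon signed-rank test at alpha = 0.05
w_stat, w_p = stats.wilcoxon(de, mesh)
significant = w_p < 0.05
```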

**Table 6.** Results of the 95%-confidence Wilcoxon signed-rank test. A *p*-value less than 0.05 means that significant differences exist.


Table 7 summarizes the mean and standard deviation results obtained by the compared algorithms. It can be seen that DE/rand/1/bin obtains the highest average values in all the problems solved. Nevertheless, the higher standard deviation values obtained by DE/rand/1/bin in problems B1D4-1 and B1D4-2 confirm what the non-parametric tests indicated: the differences with respect to the random sampling methods are not significant. Figures 4 and 5 show the box-plots for B1D4-1 and B1D4-2, where all the compared algorithms reach survival times between 1 and 3, with the exception of the Robust approach in B1D4-1; such a difference, however, was not significant.


**Table 7.** Statistical results obtained by each algorithm on each of the problem instances. The best statistical results are marked in boldface.

**Figure 4.** Boxplot of the results obtained in B1D4-1.

**Figure 5.** Boxplot of the results obtained in B1D4-2.

With respect to the graphical results, when MPB1 (panels (a) and (b)) is compared against MPB2 (panels (c) and (d)) in Figures 6–9, it is clear that MPB1 is more difficult for all four approaches to solve. However, in all cases (MPB1 and MPB2 under the four dynamics), DE provides better results than the three other algorithms. This advantage is more evident in the MPB2 instances.

Regarding MPB1 (panels (a) and (b)), it is important to note that it is more difficult to find higher survival times when the peak movement is random (panels (a), where *λ* = 0).

Another interesting behavior found is that all four compared methods are mainly affected by the random and chaotic dynamics in the MPB1 instances, the chaotic dynamic being the most complex. However, even in that case, DE was able to match, and in some cases improve on, the survival values of the compared approaches. This newly found source of difficulty motivates part of our future research.

*Math. Comput. Appl.* **2020**, *25*, 72

**Figure 6.** Results obtained using dynamic 1.

**Figure 7.** Results obtained using dynamic 2.

**Figure 8.** Results obtained using dynamic 3.

**Figure 9.** Results obtained using dynamic 4.

#### **8. Conclusions**

A performance analysis of the differential evolution algorithm, in one of its original variants, DE/rand/1/bin, when solving robust optimization over time problems with a survival time approach was presented in this paper. Three state-of-the-art random sampling methods for solving ROOT problems were used for comparison purposes. Sixteen problems generated by two benchmarks with two configurations and four different dynamics were solved. The solutions generated by DE were obtained using the real environments, without prediction mechanisms, with the aim of analyzing its behavior under ideal conditions. The findings supported by the obtained results indicate that DE is a suitable algorithm to deal with this type of dynamic search space when a survival time approach is considered. Moreover, the additional criterion added to the DE objective function allowed the algorithm to better discriminate between solutions that are similar in terms of survival time. Furthermore, it was found that the combination of a chaotic dynamic with both random and constant peak movements is a source of difficulty that requires further analysis.

This last finding is the starting point of our future research, where more recent DE variants, such as DE/current-to-p-best, will be tested in those complex ROOT instances. Moreover, the effect of predictors in DE-based approaches will be studied.

**Author Contributions:** Conceptualization, J.-Y.G.-G.; methodology, E.M.-M. and J.-Y.G.-G.; software, J.-Y.G.-G.; data curation, J.-Y.G.-G.; investigation, J.-Y.G.-G.; formal analysis, J.-Y.G.-G. and E.M.-M.; validation, E.M.-M. and S.D.-I.; writing—original draft preparation, J.-Y.G.-G., E.M.-M. and S.D.-I.; writing—review and editing, S.D.-I., E.M.-M. and J.-Y.G.-G. All authors have read and agreed to the published version of the manuscript.


**Funding:** The first author acknowledges support from the Mexican National Council of Science and Technology (CONACyT) through a scholarship to pursue graduate studies at the University of Veracruz.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Abbreviations**

The following abbreviations are used in this manuscript:


#### **References**




**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
