*Article* **An Efficient Framework for Multi-Objective Risk-Informed Decision Support Systems for Drainage Rehabilitation**

## **Xiatong Cai <sup>1,\*</sup>, Abdolmajid Mohammadian <sup>1</sup> and Hamidreza Shirkhani <sup>1,2</sup>**


Received: 6 September 2020; Accepted: 18 October 2020; Published: 2 November 2020

**Abstract:** Combining multiple modules into one framework is a key step in modelling a complex system. In this study, rather than focusing on modifying a specific model, we studied the performance of different calculation structures in a multi-objective optimization framework. The Hydraulic and Risk Combined Model (HRCM) combines hydraulic performance and pipe breaking risk in a drainage system to provide optimal rehabilitation strategies. We evaluated different framework structures for the HRCM. The results showed that the conventional framework structure used in engineering optimization research, which includes (1) constraint functions; (2) objective functions; and (3) multi-objective optimization, is inefficient for the drainage rehabilitation problem. The conventional framework can be significantly improved in terms of calculation speed and cost-effectiveness by removing the constraint function and adding more objective functions. With the added objective functions, the model performance improved remarkably, while the calculation speed did not change substantially. In addition, we found that mixed-integer optimization can decrease the optimization performance compared to using continuous variables and adding a post-processing module at the last stage to remove the unsatisfactory results. This study (i) highlights the importance of the framework structure in efficiently solving engineering problems, and (ii) provides a simplified, efficient framework for engineering optimization problems.

**Keywords:** optimization framework; drainage rehabilitation; overflooding; pipe breaking

#### **1. Introduction**

Urban flooding happens when the capacity of a municipal sewerage system cannot support the amount of water that emerges in a short period of time [1]. Such a large amount of water can result either from storms intensified by climate change [2–4] or from freshets that amplify the stress on the sewerage system [5]. To relieve the stress of overflooding in cities, upgrading the sewerage system and increasing its resilience to extreme weather can be a priority.

Computational simulations have been used for urban planning, including underground infrastructure design and pipe rehabilitation in recent years [6,7]. The essential idea is to build an optimization framework and apply it to modify a set of drainage system related variables such as the diameter, slope, and depth of the pipe. The framework requires the users to select applicable objective functions, which can be the system hydraulic performance or system pipe breaking risk [8,9], to maximize the performance of the system. Previous studies have focused on various aspects such as the cost of flooding damage [10], and integrated 1D/2D hydraulic modelling, where the SWMM5 was used as the 1D hydraulic model for sewer system simulations and a 2D model was employed to analyze the overflooding consequences in the drainage basin to obtain more accurate results on the damage of urban flooding [11].

In addition to surcharge, drainage systems face other challenges, such as ageing due to natural and human impacts [12]. The threat of drainage pipes breaking cannot be ignored at locations across the world [13–15]. Canada's Infrastructure Card [12] reported that nearly one-third of underground potable water and sewerage pipes are exposed to breaking risk. As the pipe system ages, the breakage of water supply pipes and sewerage pipes can introduce secondary pollutants into potable water and threaten human health [16].

Accurate predictions of the current and future conditions of a sewerage system using available assessment data are crucial for developing appropriate strategies for ageing pipe maintenance and rehabilitation. Statistical models are used to predict the probability of pipe failure in a drainage system [17]. The advantage of statistical models is that they are easy to apply to a large system to calculate the system-wide performance when random impacts can be ignored. Some statistical models, such as the homogeneous Poisson process model, the non-homogeneous Poisson process model, and the zero-inflated non-homogeneous Poisson process model, which use the age (time) of a pipe to predict its failure, perform well in practice [18,19].

Altarabsheh et al. [20,21] conducted research based on whole lifecycle assessment, a genetic algorithm, and Monte Carlo simulation to maximize network condition and serviceability while minimizing network risk of failure and total lifecycle cost for the entire planning period. State transitions in a Markov chain can simulate the life of a pipe and predict its whole-life risk [22]. Other promising methods include evolutionary polynomial regression [23], the ordinal regression model [24], and the flexible fuzzy model [25]. Researchers have also concentrated on determining the consequences of failure, for example with the analytical hierarchy process [26,27]. However, this line of research has not been combined with drainage surcharge for drainage rehabilitation and design.

Cai et al. [28] combined hydraulic performance and breaking risk via a multi-objective genetic algorithm optimization framework. By building a relationship between rehabilitation and hydraulic performance as well as pipe breaking risk, they provided a novel decision support system for drainage system rehabilitation. Their methodology used the traditional three-element optimization method: (1) set constraint functions so that the system meets basic requirements; (2) set objective functions to improve the performance of the system; and (3) use a linkage module to link the different modules in the system. They used one constraint function to control overflooding in an urban system, and a hydraulic performance objective function to optimize the rehabilitation methods. They used a breadth-first search algorithm to separate the problematic system and then optimized the system with a hydraulic diagnostic model [29], from the high-impact drainage chain route to the low-impact drainage chain route. This method provides good results for various drainage systems. However, their framework has some limitations. First, overflooding was handled by a constraint function, which adds many logical judgments to the algorithm and decreases the calculation speed. Second, the hydraulic diagnostic model is designed to search for a narrow pipe in one chain route of a drainage system, so applying it chain by chain to find all the narrow pipes in the drainage network is slow. Third, they only considered the genetic algorithm (GA) and neglected other optimization methods, such as particle swarm optimization (PSO), which has been used in drainage rehabilitation problems [30].

In this research, we improved the three-element optimization framework, which includes constraint functions, objective functions, and multi-objective optimization, and created a faster and more accurate framework for urban drainage systems. We improved the first-generation rehabilitation methodology of Cai et al. [28] in four respects. (1) Inspired by a multiple-stage decision support system [31], we added a new objective function to optimize the budget distribution and obtain more accurate results. (2) We tested whether the constraint function can be removed and the final results selected by a filter to increase the speed. (3) We examined whether it is accurate enough to use the overflooding index at each node for optimization, so that the new algorithm does not need to search the network chain by chain. (4) We tested whether particle swarm optimization gives better results than the genetic algorithm in this problem, because the literature disagrees on which method performs better in drainage systems.

This paper is organized as follows: first, the structure of our new algorithms is introduced; subsequently, we specify the new algorithms in a computational model, the Hydraulics and Risk Combined Model (HRCM). Then, two scenarios are studied to verify the new methods. Finally, we provide a combined methodology to replace or rehabilitate pipes in the drainage system for urban flooding control and pipe breaking prevention.

#### **2. Materials and Methods**

#### *2.1. Introduction to the Hydraulics and Risk Combined Model*

In this research, we used the Hydraulics and Risk Combined Model (HRCM) [28] to calculate the hydraulic performance, risk, and maintenance cost of a drainage system. There are five modules in the HRCM model:

(1) Hydraulic simulation module: In this module, the SWMM5 model calculates hydraulic grade line in the drainage system. Then, the hydraulic diagnostic model is applied to this system to calculate the hydraulic performance index (flooding index) for the drainage.

The GA-HRCM method has a hydraulic diagnostic model [29], which calculates the overflooding impact of a pipe on the system; see Equations (1)–(3). According to this model, the hydraulic impact of a pipe is represented by the sum of its impacts on the system (the other pipes). The diagnostic model can perform better than using the ratio of the hydraulic grade line to the depth of the manhole [32]. The system overflooding objective function *N<sub>s</sub>* is the average of the overflooding ratios of the pipes, weighted by pipe length; see Equation (4).

$$N_i = 100\% \times \frac{H_i^{US}}{G_i} \tag{1}$$

$$N_i^i = N_{\min} + (N_{\max} - N_{\min}) \frac{H_i^{US} - H_i^{DS}}{G_i} \tag{2}$$

$$N_i^{DS} = N_i - N_i^i = (N_{\max} - N_{\min}) \frac{H_i^{DS}}{G_i} \tag{3}$$

$$N_s = \sum_{j,i} N_i^j l_i \Big/ \sum_i l_i \tag{4}$$

where *H<sub>i</sub><sup>US</sup>* = upstream hydraulic grade line of pipe *i*; *H<sub>i</sub><sup>DS</sup>* = downstream hydraulic grade line of pipe *i*; *G<sub>i</sub>* = height of node *i*; *N<sub>i</sub><sup>i</sup>* = net effect of the surcharge caused by pipe *i*; *N<sub>i</sub>* = overflooding ratio of node *i*; *N<sub>min</sub>* = minimum overflooding ratio of node *i*; *N<sub>max</sub>* = maximum overflooding ratio of node *i*; *N<sub>s</sub>* = system overflooding index.
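As an illustration only, Equations (1)–(3) can be evaluated for a single pipe as sketched below. The function name `surcharge_split` and the defaults `n_min = 0` and `n_max = 100` are assumptions made for this sketch; they are not taken from the HRCM code.

```python
def surcharge_split(h_us, h_ds, g, n_min=0.0, n_max=100.0):
    """Return (N_i, own effect N_i^i, downstream effect N_i^DS) in percent.

    h_us, h_ds: upstream and downstream hydraulic grade lines of the pipe
    g:          height of the node
    n_min, n_max: assumed bounds of the overflooding ratio (hypothetical defaults)
    """
    if g <= 0:
        raise ValueError("node height must be positive")
    n_i = 100.0 * h_us / g                                 # Equation (1)
    n_own = n_min + (n_max - n_min) * (h_us - h_ds) / g    # Equation (2)
    n_ds = n_i - n_own                                     # Equation (3), left-hand form
    return n_i, n_own, n_ds


# Example: a pipe whose own restriction explains two thirds of the surcharge.
print(surcharge_split(h_us=3.0, h_ds=1.0, g=2.0))
```

With these defaults the downstream part equals `(n_max - n_min) * h_ds / g`, which matches the right-hand side of Equation (3).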

(2) Risk assessment module: In the risk assessment module, the probability of failure for each pipe is calculated from the age of the pipe using a statistical exponential equation. The breaking probability of each pipe is then multiplied by the consequence of failure of that pipe to obtain its breaking risk. We assumed that the probability of failure for each pipe is given by Equation (5):

$$P(t) = a \times \mathbf{e}^{b \times (t-c)} \tag{5}$$

where *P*(*t*) = the probability of failure at time *t* (year); *a*, *b*, and *c* are fitting parameters.

In the risk-informed model, Cai et al. [28] assumed a statistical exponential model [33] to calculate the probability of failure; and they used the consequence of failure criteria by Baah et al. [34] to calculate the weighted system pipe breaking risk index. In this study, we kept the same setting in our risk-informed model. The objective function of the system pipe breaking risk *RS* is given in Equation (6):

$$R_S = \frac{\sum_i l_i C_i P_i}{\sum_i l_i} \tag{6}$$

where *R<sub>S</sub>* = the risk of the system; *C<sub>i</sub>* = the consequence of a failure of pipe *i*; *P<sub>i</sub>* = the probability of failure of pipe *i*; *l<sub>i</sub>* = the length of pipe *i*.
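A minimal sketch of Equations (5) and (6) follows. The function names and the numerical inputs in the example are hypothetical; the fitting parameters *a*, *b*, and *c* would have to come from the statistical model [33], and the consequence scores from the criteria in [34].

```python
import math


def failure_probability(age_years, a, b, c):
    """Equation (5): P(t) = a * exp(b * (t - c))."""
    return a * math.exp(b * (age_years - c))


def system_risk(lengths, consequences, probabilities):
    """Equation (6): length-weighted mean of C_i * P_i over all pipes."""
    total_length = sum(lengths)
    if total_length <= 0:
        raise ValueError("total pipe length must be positive")
    weighted = sum(l * c * p for l, c, p in zip(lengths, consequences, probabilities))
    return weighted / total_length


# Example with made-up values for two pipes.
print(system_risk(lengths=[100.0, 300.0],
                  consequences=[2.0, 4.0],
                  probabilities=[0.1, 0.2]))
```

Because the index is weighted by length, a long pipe with a high consequence of failure dominates the system risk even when its failure probability is moderate.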

(3) Rehabilitation module: In this module, different rehabilitation methods are connected to the age and diameter of a pipe. This can change the values of breaking risk index and overflooding index in a drainage system.

Six rehabilitation methods were linked to hydraulic performance (pipe diameter) and breaking risk (pipe age) (Table 1) [20]. In order to make the HRCM model recognize the cost difference among different pipe diameters, Cai et al. [28] added a pipe cost item for pipe replacement. The pipe cost *Cp* is a function that is related to pipe diameter and pipe length. The cost to rehabilitate one pipe is the sum of the rehabilitation cost, disruption cost, and pipe cost. The cost objective function is the total cost of all the pipes.
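The cost objective described above can be sketched as follows, using the pipe cost *Cp* = *d* × *l* assumed in this research. The function names, the dictionary keys, and the example cost values are hypothetical; the actual rehabilitation and disruption costs are those of the rehabilitation matrix (Table 1).

```python
def pipe_cost(diameter, length):
    """Pipe cost C_p = d * l, the simple form assumed in this research."""
    return diameter * length


def single_pipe_cost(rehab_cost, disruption_cost, diameter, length):
    """Cost to rehabilitate one pipe: rehabilitation + disruption + pipe cost."""
    return rehab_cost + disruption_cost + pipe_cost(diameter, length)


def total_cost(pipes):
    """Cost objective: sum of the single-pipe costs over all pipes."""
    return sum(
        single_pipe_cost(p["rehab"], p["disruption"], p["diameter"], p["length"])
        for p in pipes
    )


# Example with two hypothetical pipes (arbitrary cost units).
pipes = [
    {"rehab": 10.0, "disruption": 5.0, "diameter": 0.5, "length": 100.0},
    {"rehab": 20.0, "disruption": 5.0, "diameter": 1.0, "length": 50.0},
]
print(total_cost(pipes))
```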


**Table 1.** The rehabilitation matrix [20,28]

<sup>1</sup> *Cp* = *f*(*di*, *li*) pipe cost function. In this research, we assumed *Cp* = *d* × *l*.

(4) Multi-objective optimization module: This module optimizes two objective functions in two steps. First, a set of constraint functions on hydraulic performance, breaking risk, and budget sets the minimum requirements for the rehabilitation methods. Second, a non-dominated sorting genetic algorithm (NSGA-II) is used to improve hydraulic performance and decrease breaking risk in the system.

(5) Postprocessing filter (expert system): This module can select results from the Pareto Front according to the cost.
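The notion of a Pareto Front used in modules (4) and (5) can be illustrated with a small dominance filter. This is only the non-dominated selection step for minimization, written as a sketch; it is not the full NSGA-II procedure, and the function names and sample points are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Keep the points that no other point dominates (all objectives minimized)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Example: (overflooding index, breaking risk index) of four candidate strategies.
candidates = [(1, 5), (2, 3), (3, 4), (4, 1)]
print(pareto_front(candidates))
```

Here the strategy `(3, 4)` is discarded because `(2, 3)` is better in both objectives; the remaining three strategies are trade-offs that the postprocessing filter would then rank by cost.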

The structure of the HRCM model can be seen in Figure 1.

**Figure 1.** Structure of the HRCM model.

#### *2.2. Algorithm Frameworks*

#### HRCM Model Simulation Frameworks

We considered six alternative methodologies for HRCM calculation to compare with the method by Cai et al. [28], which we named the GA-HRCM method. We used their original framework (GA-HRCM) as our control group to compare with other methods. Explanations of other alternative algorithms, GA-Continuous, GA-Cost, GA-Unconstraint, PSO-Cost, PSO-HRCM, and GA-Network, are given in Figure 2.

**Figure 2.** Flowchart of different optimization algorithms.

The GA-HRCM method uses discrete pipe diameters. The GA itself must therefore transform continuous values into discrete values in each iteration, which increases the iteration time. The GA-Continuous method uses continuous pipe diameters during optimization. In the post-processing section, the continuous pipe diameters are transformed to the nearest discrete pipe diameters used in engineering, and then the overflooding index and the pipe breaking index are calculated (Table 2).
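The post-processing step of GA-Continuous, rounding each optimized diameter to the nearest available size, might look like the sketch below. The catalogue values are hypothetical placeholders, not the diameter set used in this study.

```python
# Hypothetical catalogue of commercially available diameters, in metres.
CATALOGUE_M = [0.305, 0.381, 0.457, 0.610]


def snap_to_catalogue(diameter, catalogue=CATALOGUE_M):
    """Return the catalogue diameter closest to a continuous diameter."""
    return min(catalogue, key=lambda d: abs(d - diameter))


def snap_solution(diameters, catalogue=CATALOGUE_M):
    """Snap every pipe diameter of one optimized solution."""
    return [snap_to_catalogue(d, catalogue) for d in diameters]


print(snap_solution([0.33, 0.50, 0.70]))
```

After snapping, the overflooding and breaking indexes have to be recomputed, because the rounded diameters are no longer the ones the optimizer evaluated.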

The GA-HRCM method did not use cost as an objective function, because in an engineering project, budget is seen as a constraint. GA-Cost uses rehabilitation cost as another objective function. We use this comparison to evaluate whether this can improve rehabilitation strategy results by increasing cost-effectiveness (Table 2).

The GA-HRCM method has constraint functions for both hydraulic performance and budget. It does not constrain the breaking risk because the risk comes from a stochastic model, so the breaking risk is an objective function rather than a constraint function. The GA-Unconstraint method removes the hydraulic constraint as well as the budget constraint. After the optimization process, the results were filtered and we kept only those that satisfied our requirements (Table 2).
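The idea of replacing in-loop constraints with a filter applied once to the final results can be sketched as below. The field names, thresholds, and sample solutions are hypothetical and only illustrate the mechanism.

```python
def filter_solutions(solutions, max_overflooding, budget):
    """Keep solutions that meet the hydraulic and budget requirements.

    Applied once to the optimizer output instead of being checked as
    constraints on every candidate during the optimization.
    """
    return [
        s for s in solutions
        if s["overflooding"] <= max_overflooding and s["cost"] <= budget
    ]


# Hypothetical optimizer output (cost in million dollars).
results = [
    {"id": "A", "overflooding": 0.0, "risk": 12.0, "cost": 0.30},
    {"id": "B", "overflooding": 4.5, "risk": 10.0, "cost": 0.25},
    {"id": "C", "overflooding": 0.0, "risk": 9.0, "cost": 0.90},
]
kept = filter_solutions(results, max_overflooding=0.0, budget=0.5)
print([s["id"] for s in kept])
```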

The particle swarm optimization (PSO) and genetic algorithm (GA) methods have been widely used in sewerage pipe design and rehabilitation [35,36] and have shown good results in predicting hydraulic performance. However, it is still unclear which method is more suitable for drainage optimization [7,36–38]. We revised the code given by Yarpiz [39] in order to solve the mixed-integer problem. We employed two PSO methods, PSO-HRCM and PSO-Cost, to compare their performance with the genetic algorithm. The PSO-HRCM method replaces NSGA-II with a non-dominated sorting PSO method. The PSO-Cost method then adds cost as another objective function to the PSO-HRCM method.
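For readers unfamiliar with PSO, the textbook single-objective update of one particle is sketched below. It is not the revised Yarpiz code and it omits the non-dominated sorting used to choose leaders in the multi-objective variant. The inertia weight `w`, the coefficients `c1` and `c2`, and the fixed `r1` and `r2` (normally random numbers drawn in [0, 1]) are illustrative assumptions.

```python
def pso_step(x, v, pbest, gbest, w=0.5, c1=1.0, c2=1.0, r1=0.5, r2=0.5):
    """One standard PSO update of a particle's velocity and position.

    x, v:   current position and velocity (lists of floats)
    pbest:  best position found so far by this particle
    gbest:  best position found so far by the swarm
    r1, r2: random factors, fixed here so the example is reproducible
    """
    new_v = [
        w * vi + c1 * r1 * (pb - xi) + c2 * r2 * (gb - xi)
        for xi, vi, pb, gb in zip(x, v, pbest, gbest)
    ]
    new_x = [xi + vi for xi, vi in zip(x, new_v)]
    return new_x, new_v


print(pso_step(x=[1.0], v=[2.0], pbest=[3.0], gbest=[5.0]))
```

Every particle is pulled toward the best values found so far, which is why the method depends on those best values being informative.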

A drainage system has a complex structure [40]. Bennis et al. [29] provided a hydraulic diagnostic model; to distinguish it from other indexes, we call its index the chain route index in this study. The model recognizes narrow pipes by calculating an index that evaluates backwater effects from downstream to upstream. It separates the surcharge effect into two categories: (1) surcharge caused by the pipe itself; (2) surcharge caused by narrow pipes downstream. A computational model can therefore easily detect which pipe affects the system. The GA-HRCM method used this hydraulic diagnostic model to optimize the overall overflooding index. The GA-Network method tests whether this strong searching model is necessary to find the narrow pipe. Dion and Bennis [32] introduced a global modeling approach to evaluate hydraulic performance in a drainage system. Instead of calculating the chain route index, they directly used the hydraulic grade line at each junction to evaluate the hydraulic performance of the drainage system; see Equation (7).

$$N_s = \sum_i N_i l_i \Big/ \sum_i l_i \tag{7}$$

To distinguish this index from the chain route one, it will be called the network index in the present study. The GA-HRCM model uses the chain route index. It has high efficiency when the drainage system is simple, but it is not efficient when the drainage system becomes complex, because this chain route index needs to calculate the index from one branch of the system to another [28]. In this research, we evaluated this speed-accuracy compromise by comparing the GA-HRCM method and the GA-Network method, using the global hydraulic index (Table 2).
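Equation (7) is a length-weighted average and needs no traversal of the network, which can be seen in the following sketch. The function name and the sample values are hypothetical.

```python
def network_index(ratios, lengths):
    """Equation (7): length-weighted mean of the overflooding ratios N_i."""
    if len(ratios) != len(lengths):
        raise ValueError("ratios and lengths must have the same size")
    total_length = sum(lengths)
    if total_length <= 0:
        raise ValueError("total pipe length must be positive")
    return sum(n * l for n, l in zip(ratios, lengths)) / total_length


# Example: a short surcharged pipe and a long pipe flowing half full.
print(network_index(ratios=[150.0, 50.0], lengths=[100.0, 300.0]))
```

Each pipe contributes independently, so the cost of evaluating the index grows linearly with the number of pipes, whereas the chain route index has to be computed branch by branch.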


**Table 2.** Parameter setting of different HRCM methods.

<sup>1</sup> Constraint functions of cost and hydraulic overflooding. <sup>2</sup> The system overflooding index in Equation (4). <sup>3</sup> The system overflooding index in Equation (7).

#### *2.3. Revised HRCM Method (RHRCM)*

In previous sections, seven calculation methods were applied to the HRCM model to verify how they affect the framework. On comparing the performance of the seven methods, we revised the HRCM model to improve its efficiency to solve overflooding and pipe breaking combined problems.

The revised framework is presented in Figure 3. In this new framework, we simplified the three-element framework to: (1) optimization; (2) linkage; (3) post-processing. This framework can be applied to other pipe systems and solve similar problems. GA-Continuous, the method with the fastest convergence, was selected to improve convergence speed. The GA-Network method was selected to enhance the performance of the HRCM model on networked drainage systems and improve efficiency. In addition, the GA-Continuous method and the GA-Network method can offer fewer strategies than the original HRCM method. To compensate for the inefficient budget distribution of the original HRCM when rehabilitating each pipe, we selected the GA-Cost method to increase the accuracy of the framework. The new framework removes constraint functions and adds cost as another objective function. It also removes the hydraulic diagnostic and discrete models. Unlike other studies that examine optimization methods directly without improving the structure of the optimization, we propose to study the optimization structure at each step for drainage optimization.

**Figure 3.** Diagram of the revised framework (RHRCM).

#### *2.4. Case Study*

Because the HRCM model was validated by Cai et al. [28] and the objective of this study is to compare the performance of different frameworks, we assumed two idealized scenarios in which it is easy to recognize narrow and aged pipes. Therefore, we can easily evaluate the performance of different frameworks. The configuration of the drainage system follows the system proposed by Bennis et al. [29] (Figure 4).

**Figure 4.** Drainage system configuration: (**a**) structure of the drainage system; (**b**) schematic view of the pipe diameter, length, and depth (Data from Bennis et al. and Cai et al. [28,29]).

In this paper, we considered two scenarios (Table 3) to evaluate the seven methods mentioned in Figure 2. The first scenario represented a narrow pipe scenario, and the second scenario an aged pipe scenario.



The first scenario is used to test whether these methods can choose the correct pipe and replace it with a larger one. In the first scenario, three narrow pipes were placed in the system, and all the pipes were of the same age. Among the three narrow pipes, one was extremely narrow, which means that the model must find and replace it for the overflooding constraint to be satisfied. The other narrow pipes affect the overflooding index, but replacing them is not necessary to satisfy the requirements. The second scenario includes an aged pipe and two narrow pipes. The aged pipe was severely deteriorated compared to the other pipes, and the narrow pipes were not severely narrow. The second scenario was used to test whether these methods can find the aged pipe and use a reasonable rehabilitation method to solve the ageing problem. The drainage system was set as in Figure 4a. This is the same as in the study by Cai et al. [28], for comparison purposes. Chicago design rainfall is commonly used in the simulation of sewerage systems [29,41,42].

#### *2.5. Model Performance Evaluation*

#### Sensitivity Analysis

Result accuracy can increase with the population size of the optimization algorithm, but the computational time also increases. Given the limits of our computer (Intel® Core™ i7-8750H CPU @ 2.20 GHz, 16.0 GB RAM), we set the population size to 100, 500, 1000, 1500, 2000, and 2500 for both GA and PSO methods. We evaluated population convergence (i.e., whether the results converge within our population settings) and time convergence (i.e., the computational time at the convergent population, if convergence exists). The evaluation criteria are: (1) the computational time at a population size of 2500; (2) the number of rehabilitation solutions given by the HRCM model at a population size of 2500; and (3) the average cost of all solutions at a population size of 2500 (Tables 4 and 5). Note that 'convergence' in this research means that the number of strategies, the overflooding index, and the pipe breaking index in the strategy set obtained for one population size do not change at a larger population size.
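The convergence definition above can be expressed as a small check over the tested population sizes. This is a sketch under the assumption that each run is summarized by a comparable tuple (for example, number of strategies, overflooding index, breaking index); the function name and sample data are hypothetical, and real index values would need a tolerance rather than exact equality.

```python
def converged_population(results_by_population):
    """Return the smallest population size whose result summary is unchanged
    at every larger tested size, or None if no such size exists.

    results_by_population: dict mapping population size -> summary tuple.
    The largest size alone cannot be confirmed, so it is never returned.
    """
    sizes = sorted(results_by_population)
    for position, size in enumerate(sizes[:-1]):
        reference = results_by_population[size]
        if all(results_by_population[s] == reference for s in sizes[position + 1:]):
            return size
    return None


# Hypothetical summaries: only the number of strategies is tracked here.
runs = {100: (3,), 500: (4,), 1000: (6,), 1500: (6,), 2000: (6,), 2500: (6,)}
print(converged_population(runs))
```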

After postprocessing, the selected output results solve the overflooding problem. We then compared their rehabilitative effectiveness. Cost-effectiveness analysis quantifies the performance of a rehabilitation strategy per unit cost [43,44]. It therefore indicates which method best improves the performance of a system for a given unit cost. The original overflooding index and risk index of scenario 1 were 29.72 and 16.24, respectively; those of scenario 2 were 5.07 and 22.68, respectively. The cost-effectiveness index used to evaluate the efficiency of each method is defined in Equation (8):

$$C_e = \frac{1}{k} \sum_j \left( I_j^p - I_j^a \right) \Big/ C_j^r \tag{8}$$

where *C<sub>e</sub>* = cost-effectiveness index; *I<sub>j</sub><sup>p</sup>* = original overflooding/risk index; *I<sub>j</sub><sup>a</sup>* = overflooding/risk index after the rehabilitation; *C<sub>j</sub><sup>r</sup>* = cost for rehabilitation; *k* = the total number of *j*.
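Equation (8) can be computed as in the sketch below. It assumes that *j* runs over the rehabilitation strategies returned by a method and that the original index is the same for every strategy; both points, like the function name and the sample numbers, are assumptions of this illustration.

```python
def cost_effectiveness(original_index, indices_after, costs):
    """Equation (8): mean index improvement per unit cost over k strategies."""
    if len(indices_after) != len(costs) or not costs:
        raise ValueError("need one positive cost per strategy")
    if any(c <= 0 for c in costs):
        raise ValueError("rehabilitation costs must be positive")
    gains = [(original_index - after) / cost
             for after, cost in zip(indices_after, costs)]
    return sum(gains) / len(gains)


# Example: two strategies that lower an index of 30 to 10 and 20
# at costs of 2 and 5 (million dollars).
print(cost_effectiveness(30.0, indices_after=[10.0, 20.0], costs=[2.0, 5.0]))
```

A strategy that achieves the same index at a lower cost raises the value, which is the property used to compare the methods in Tables 4 and 5.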

#### **3. Results**

#### *3.1. Computational Time Competition*

Figure 5 shows the computational time of the seven methods. The computational time generally increases with population size, although the trend is not monotonic.

**Figure 5.** The time competition for seven methods: (**a**) scenario 1; (**b**) scenario 2.

Different methods showed different calculation speeds under the two scenarios. The GA-Unconstraint method had the minimum calculation time in the first scenario. In the second scenario, the computational time of GA-Unconstraint jumps at a population size of 2500. We ran two additional simulations of the GA-Unconstraint method with population sizes of 2200 and 3000; the computational times were 77,286 s and 89,393 s, respectively. We therefore inferred that the high computational time of the GA-Unconstraint method at a population size of 2500 is due to fluctuations of the program. The GA-Continuous method converged quickly in both scenarios. The GA-HRCM method was the slowest (Figure 5). The computational time of the RHRCM method is compared with that of the seven methods in Figure 5. The RHRCM method was the fastest of all, and in scenario 1 it was stable with respect to the population increase. The same property can be seen in scenario 2, where the RHRCM method was relatively stable compared to the GA-Unconstraint method. Overall, it showed a significant advantage over the other methods in computational speed.

#### *3.2. Methods Evaluation*

#### 3.2.1. Scenario 1—Narrow Pipe

We assessed the results by evaluating the converged population, convergence time, number of solutions, and cost-effectiveness at a population of 2500 (Table 4). The cost-effectiveness value was calculated by dividing the difference between the original hydraulic/risk index and the new hydraulic/risk index by cost (million \$) (Table 4).

After adding cost as another objective function, the expense of rehabilitation decreased from 0.67 million dollars to 0.3 million dollars. Compared to the GA-HRCM method, we found that it is less likely that the GA-Cost method selects fiberglass reinforcement, which is the most expensive rehabilitation method in our case (Table 1). This can reduce costs on unnecessary rehabilitation.


**Table 4.** The summarized results of the seven methods with scenario 1.

<sup>1</sup> Hydro is the cost-effectiveness of the overflooding index. <sup>2</sup> Risk is the cost-effectiveness of the breaking index.

The GA-Network method converged at a population size of 500, earlier than the GA-HRCM method, which converged at a population size of 2000 (Table 4). The GA-Network method offered four strategies, fewer than the six strategies obtained by the GA-HRCM method with the chain route index (Table 4). The GA-Unconstraint method has the fastest calculation speed for the same population size among the methods (Figure 5). GA-Continuous converged at a population size of 1500, which is also earlier than GA-HRCM.

PSO-based methods did not offer a significant advantage in cost-effectiveness or computational speed compared to the GA-based methods (Table 4). Zarbaf et al. [45] compared the PSO method and the GA method for cable tension estimation. They found that both methods can evaluate the tensioned cable, but the PSO method was more accurate. Surendar et al. [37] compared the GA and PSO methods in predicting Brazilian tensile strength. They found that although both methods can predict the value, PSO fitted the results better. Vasudevan and Sinha [36] showed that the PSO method performed better in the distribution system. However, in the sewerage system, one study showed that GA methods can offer results similar to those of the PSO method [46]. In our research, we found that the PSO method did not perform as well as the GA. The PSO method uses the best values in one generation to guide the algorithm in producing the next generation. This is efficient when searching for an optimum value of a continuous function. However, in drainage rehabilitation there are many parallel solutions. For example, although the hydraulic performance improves when we enlarge the diameter of a pipe, once the diameter exceeds a threshold, the results are not improved any further. This means that in one generation there will be many optimum values, which impairs the performance of the PSO method in searching for the optimum value.

The RHRCM method combines the strengths of the previous frameworks. It shares the speed advantage of the GA-Unconstraint method, taking 89,182 s for a population of 2500, and it is also more stable (Figure 5). In addition, the RHRCM achieved the maximum cost-effectiveness among all methods: the cost-effectiveness of overflooding rehabilitation was 177.29, and that of pipe breaking rehabilitation was 87.82 (Table 4). The RHRCM offered 10 rehabilitation strategies, which is acceptable (Table 4). These strategies simplified those provided by the GA-Cost method.

#### 3.2.2. Scenario 2—Ageing Pipe

Previous studies, such as Kleiner et al. [47], considered the age of the pipe as a fuzzy variable. Scenario 2 includes an aged pipe; its results are presented in Table 5. Among the seven methods, GA-Cost showed high cost-effectiveness for system overflooding and pipe breaking risk: 13.92 and 129.5, respectively. The hydraulic cost-effectiveness value (11.7) is higher than that of the RHRCM method. The RHRCM method gave the highest risk cost-effectiveness of 137.31, which shows that it is well suited to the risk scenario.


**Table 5.** The summarized results of seven methods with scenario 2.

<sup>1</sup> Hydro is the cost-effectiveness of the overflooding index. <sup>2</sup> Risk is the cost-effectiveness of the breaking index.

#### **4. Discussion**

#### *4.1. Advantage and Limitation of RHRCM Method*

The revised HRCM model is faster because unnecessary parts of the original HRCM model are removed. We also obtained a higher cost-effectiveness by adding cost as another objective function, as it can remove the parallel solutions.

We believe that the optimum values of an optimization problem converge, because the Pareto Front is the set of optimum values. However, is there a unique set of optimum strategies in a rehabilitation problem? The answer is no, because two types of situations can produce the same value on the Pareto Front with different strategies. First, if two pipes in the network have the same breaking risk and hydraulic performance, replacing the first pipe gives the same result as replacing the second. One then has two solutions with the same result on the Pareto Front, and both are optimum solutions. Second, if enlarging a pipe from a diameter of 0.305 m to 1 m solves the surcharge problem, changing the diameter to 2 m gives the same surcharge index at that junction. This means that every pipe has a threshold; when the diameter of the pipe goes beyond that threshold, all of the rehabilitation strategies are the same to the optimization program. In the HRCM model, we used a post-processing strategy to select solutions from the set of rehabilitation plans. In the RHRCM model, the new dimension (cost) helps to partly solve the parallel results problem because different diameters have different costs. This can improve the performance of an optimization method. However, it does not increase the search speed for hydraulic performance and breaking risk. How to solve the parallel solution problem and increase the search speed can be a topic for future study.

#### *4.2. Discrete Versus Continuous Data*

In engineering, certain parameters are not continuous. For example, the diameter of a pipe in a real case is a discrete value, based on manufacturing standards. Therefore, although mixed-integer optimization is widely used in many engineering problems, it is worth discussing whether it is better than continuous optimization. In our research, we found that GA-Continuous converges faster than GA-HRCM in scenario 1 but slower in scenario 2. Both continuous and discrete methods can solve the problem well. Therefore, mixed-integer optimization is not always better than continuous optimization. A continuous optimization method can be more sensitive to variables that change continuously, and it does not need a step that transforms a continuous number into an integer. However, it may more easily become trapped in a local optimum. There is thus a trade-off between the two approaches, and the choice should be adjusted to the situation.

#### *4.3. Parallel Results Problem*

The slow convergence speed can be attributed to the parallel results problem, which in network optimization is defined as having multiple solutions with the same performance. For example, consider a pipe in a drainage system that leads to a surcharge, such as C8 in Figure 4b, and let δ be the critical diameter (which is enough to solve the surcharge). Whenever the model assigns a diameter larger than δ, it obtains the same overflooding index. This means that even though there are limited points on the Pareto Front, many strategies can have exactly the same values on it, which seriously affects the convergence speed of the optimization. The parallel results problem may explain the non-convergence observed in the framework. This motivates us to study, in future work, how to evaluate the performance of optimization for rehabilitation problems.
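The effect of the critical diameter can be reproduced with a toy example. The sketch below is not the HRCM model: `surcharge_index` and `pipe_cost` are hypothetical stand-ins (a clipped linear index and a linear cost), and the critical diameter of 1 m is an assumed value. It only shows that several diameters collapse onto one objective vector, and that adding cost as an objective separates them again.

```python
from collections import defaultdict


def surcharge_index(diameter_m, critical_diameter_m=1.0):
    """Toy surcharge index: zero once the pipe reaches the critical diameter."""
    return max(0.0, critical_diameter_m - diameter_m)


def pipe_cost(diameter_m, unit_cost=100.0):
    """Toy cost model: cost grows linearly with diameter (assumption)."""
    return unit_cost * diameter_m


candidates = [0.305, 0.5, 1.0, 1.5, 2.0]

# Group candidate diameters by their objective vector, without cost.
by_objective = defaultdict(list)
for d in candidates:
    by_objective[(surcharge_index(d),)].append(d)

# Same grouping once cost is added as an extra objective.
by_objective_with_cost = defaultdict(list)
for d in candidates:
    by_objective_with_cost[(surcharge_index(d), pipe_cost(d))].append(d)

parallel = [ds for ds in by_objective.values() if len(ds) > 1]
parallel_with_cost = [ds for ds in by_objective_with_cost.values() if len(ds) > 1]
print(parallel)            # diameters the optimizer cannot tell apart
print(parallel_with_cost)  # empty list: cost breaks the ties
```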

In this study, we use the GA and PSO methods because they are the most widely used in engineering. Many kinds of optimization algorithms, such as ant colony optimization algorithm [45], random forest [25], cellular automata [48], hanging gardens algorithm [49], and whale optimization [50] should be tested in the future to see whether they are more suitable for this framework.

We found that adding cost to our framework gave better results, which provided the initial idea for solving the parallel solution problem: more parameters can be added to the system. The comparison of GA-Unconstraint and RHRCM shows no significant difference in calculation speed between two and three objective functions, whereas the comparison of GA-Cost and GA-HRCM shows that three objective functions need more calculation time. This means that once the constraint function is removed, the calculation time does not increase significantly even when more objective functions are added. Therefore, more parameters can be added to this framework to make it more resistant to parallel solutions. We also believe that this framework can reach a higher calculation speed with parallel computing.

#### *4.4. Framework*

Unlike other studies, which improve the performance of an optimization model by using different optimization methods, we studied whether a simplified calculation framework can improve performance. We found that, with our new framework, the calculation speed and cost-effectiveness of the HRCM model were significantly improved: by changing the computational framework, the computational speed of RHRCM increased four times and its cost-effectiveness three times compared with the GA-HRCM method. This emphasizes the importance of studying how to improve the calculation methodology of an optimization problem. A multi-objective optimization model is a complex system because it has a multifaceted calculation structure and involves many modules to solve one problem. Therefore, current optimization methods should be simplified to achieve a higher performance, and research on framework structure deserves more attention.

#### **5. Conclusions**

Developing rehabilitation strategies in order to obtain maximum benefit for solving urban flooding and reducing pipe breaking risk at the same time is an important issue in urban drainage systems. In this paper, seven potential frameworks were compared. The results showed that calculation speed and accuracy were improved when continuous variables are used and constraint functions are removed. A post-processing filter was added at the end to transform pipe diameter to a discrete value and remove the unsatisfying strategies that result in a high overflooding index or breaking risk index. Multi-objective optimization was found to be adequate in finding a solution. Furthermore, calculation accuracy can increase when cost is selected as an objective function. We also found that the GA algorithm had a better performance than the PSO method in drainage optimization problems. Simulation results showed that these methods can significantly improve the decision support system for drainage rehabilitation. A new method was proposed (RHRCM), which exhibited a remarkably higher computational speed (four times faster than the original HRCM model) and was able to obtain results with a higher cost-effectiveness (three times higher than the original HRCM model). We found that a simplified framework can significantly improve the calculation performance of the original model; therefore, further research should focus on study of the framework structure.

**Author Contributions:** Conceptualization, A.M., H.S. and X.C.; methodology, X.C. and H.S.; software, X.C.; investigation, X.C. and H.S.; resources, A.M. and H.S.; writing—original draft preparation, X.C.; writing—review and editing, A.M., H.S. and X.C.; supervision, A.M. and H.S. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Acknowledgments:** We extend our acknowledgements to Colin Rennie, Hossein Bonakdari, and Chengxi Li.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Single-Objective Optimization of a CMOS VCO Considering PVT and Monte Carlo Simulations**

**Perla Rubi Castañeda-Aviña 1,†, Esteban Tlelo-Cuautle 1,\*,† and Luis Gerardo de la Fraga 2,†**


Received: 23 September 2020; Accepted: 30 November 2020; Published: 3 December 2020

**Abstract:** The optimization of analog integrated circuits requires taking into account a number of considerations and trade-offs that are specific to each circuit, meaning that each design case may be subject to different constraints to meet the target specifications. This paper shows the single-objective optimization of a complementary metal-oxide-semiconductor (CMOS) four-stage voltage-controlled oscillator (VCO) to maximize the oscillation frequency. The stages are designed by using CMOS current-mode logic or differential pairs and are connected in a ring structure. The optimization is performed by applying the differential evolution (DE) algorithm, in which the design variables are the control voltage and the transistors' widths and lengths. The objective is to maximize the oscillation frequency under constraints that make the CMOS VCO robust to Monte Carlo simulations and to process-voltage-temperature (PVT) variations. The optimization results show that DE provides feasible solutions oscillating at 5 GHz with a wide control voltage range and robust to both Monte Carlo and PVT analyses.

**Keywords:** VCO; differential evolution; CMOS differential pair; PVT variations; Monte Carlo analysis

#### **1. Introduction**

The voltage-controlled oscillator (VCO) is quite useful in applications such as analog-to-digital converters [1–3], phase-locked loops [4], and so on. The VCO can be implemented by using complementary metal-oxide-semiconductor (CMOS) integrated circuit technology, as already shown in [5], and also by using LC-tank structures. CMOS VCO designs can be classified as using single-ended stages [6,7], differential stages [8,9] or pseudo-differential stages [10]. Among the currently available VCO topologies, the one consisting of a ring structure [11] and using CMOS differential stages has the advantage of providing great immunity to supply disturbances [12]. Other desired features in designing a VCO are low power consumption, minimum layout area, high frequency and a wide control voltage range. These target specifications become difficult to achieve due to the continuous downscaling of silicon CMOS technologies. Besides, designing a VCO in a ring topology is frequently a more attractive alternative because it allows a wide tuning (control voltage) range, small layout area, high gain, low cost, robustness to variations, simplicity and scalability in nanoscale CMOS processes [13,14]. The three principal causes of alteration in the performance of a circuit are the variations in the fabrication process, power supply and operating temperature; these constitute the PVT variations, and their impact increases with device downscaling [15]. Process variations include wafer defects or may be produced by certain chemical procedures that cause some circuit parameters to change. Voltage fluctuations in the circuit take place for a variety of reasons, such as supply noise, and can be compensated with a voltage regulator to prevent the transistor's operating point from being affected. Last but not least, temperature variations can be caused by external sources or by the circuit's own power dissipation. The impact of these PVT variations can be minimized by a proper design and layout placement and routing. Among the currently available designs, the authors in [5] introduced a wide-band VCO implemented by CMOS differential stages connected in a ring topology. Other design guidelines to improve the VCO's performance can be found in [16–19].

The oscillation frequency *fosc* of a VCO can be evaluated by (1), where *N* indicates the number of stages and *τ* is a time constant that depends on the associated resistance of the active load and the value of the load capacitor. *fosc* varies in a range determined by a control voltage *Vctrl* [14] and depends on the number of CMOS differential stages *N*; however, decreasing *N* yields a reduction in gain, which may attenuate the oscillation. This trade-off can be improved by applying metaheuristics to maximize *fosc* under a wide range of *Vctrl* and a low silicon area or number of CMOS differential stages *N*. Different metaheuristics have been applied to the optimization of CMOS integrated circuits in previous works due to the complexity involved in the design processes [7,20–22]. In this manner, the differential evolution (DE) algorithm is applied herein to vary the sizes of the transistors in the CMOS differential stages to maximize the oscillation frequency *fosc* of a CMOS VCO. The electrical characteristics of the VCO are evaluated by linking the optimization loop to the simulation program with integrated circuit emphasis (SPICE).

$$f\_{\text{osc}} = \frac{1}{2N \cdot \tau} \tag{1}$$
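As a quick numerical check of (1), with *τ* the per-stage time constant defined in the text, the relation can be evaluated in both directions. The sketch below is only illustrative; the function names are hypothetical, and the 5 GHz target and *N* = 4 are the values quoted in this paper.

```python
def oscillation_frequency(n_stages, tau_s):
    """Eq. (1): f_osc = 1 / (2 * N * tau), with tau in seconds."""
    return 1.0 / (2.0 * n_stages * tau_s)


def required_delay(n_stages, f_osc_hz):
    """Eq. (1) solved for tau: the per-stage delay needed for a target f_osc."""
    return 1.0 / (2.0 * n_stages * f_osc_hz)


# Four stages and a 5 GHz target require a per-stage delay of about 25 ps.
tau = required_delay(4, 5e9)
print(tau)
print(oscillation_frequency(4, tau))
```

Under (1), halving the per-stage delay doubles the oscillation frequency for a fixed *N*, which is why the optimization targets the delay cell.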

The rest of the paper is organized as follows: Section 2 describes the considerations taken for the design of both the CMOS differential pair stage and the VCO in a ring topology. Section 3 reviews VCO optimization methods. The DE algorithm is detailed in Section 4. The single-objective optimization is described in Section 5. Section 6 presents a brief discussion of this work. Finally, Section 7 summarizes the conclusions.

#### **2. Ring VCO-Based on CMOS Differential Stages**

In this paper, the main objective in designing a CMOS differential stage such as the one shown in Figure 1, which will be used to implement a ring VCO, is to achieve the highest oscillation frequency *fosc* given in (1), which is inversely proportional to both the number of CMOS stages *N* and the propagation delay *τ*. Supposing *N* is constant, the delay generated by the differential pair must be minimized [14,23]. Some authors recommend reducing the delay by increasing the output conductance *gds* of the active MOS transistor and by reducing the equivalent capacitance, where the load capacitance *CL* could be the dominant one [13,23,24]. The trade-off here is that increasing *gds* requires larger MOS transistors, which in turn generate larger parasitic capacitances. Therefore, this problem is quite suitable for applying metaheuristics such as the DE algorithm.

**Figure 1.** CMOS differential stage with active load and control voltage *Vctrl*.

For the MOS transistors *MN*<sup>1</sup> and *MN*<sup>2</sup> to operate in their saturation region, they must satisfy |*VDS*| > (|*VGS*| − |*VTH*|) and |*VGS*| > |*VTH*|, where the voltages are associated with the drain (D), gate (G) and source (S) terminals of the MOS transistors, and *VTH* is the threshold voltage. The width (W) and length (L) sizes of the MOS transistors can be evaluated by (2), where *ID* is the drain current and the product *μnCox* is provided by the CMOS technology foundry. In this work the sizing is performed by using the 180 nanometer (nm) technology from United Microelectronics Corporation (UMC).

$$\frac{W}{L} = \frac{2I\_D}{\mu\_n C\_{ox} (|V\_{GS}| - |V\_{TH}|)^2} \tag{2}$$
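To make the use of (2) concrete, the aspect ratio can be computed for a given bias point. The numbers in the sketch below are illustrative assumptions only (they are not UMC process data), and `aspect_ratio` is a hypothetical helper name.

```python
def aspect_ratio(i_d, mu_cox, v_gs, v_th):
    """Eq. (2): W/L = 2*I_D / (mu_n*C_ox * (|V_GS| - |V_TH|)**2)."""
    v_ov = abs(v_gs) - abs(v_th)  # overdrive voltage
    if v_ov <= 0:
        raise ValueError("transistor is off: |V_GS| must exceed |V_TH|")
    return 2.0 * i_d / (mu_cox * v_ov ** 2)


# Illustrative values: I_D = 1 mA, mu_n*C_ox = 200 uA/V^2, 0.5 V overdrive.
ratio = aspect_ratio(i_d=1e-3, mu_cox=200e-6, v_gs=1.0, v_th=0.5)
print(ratio)  # about 40, i.e. W is roughly 40 times L
```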

As already shown in [5], the active loads are implemented by P-type MOS transistors (*MP*<sup>3</sup> and *MP*4) operating in the triode region, and their sizing satisfies |*VDS*| < (|*VGS*| − |*VTH*|) and (3) [25]. The equivalent resistance is tuned by the control voltage *Vctrl* at the gates of the PMOS transistors [14,26], and the output conductance of the PMOS transistor can be approximated as 1/*go* = 1/*gds* = 1/*μCox*(|*Vctrl* − *Vs*| − |*Vth*|).

$$I\_D = \mu C\_{ox} \frac{W}{L} \left[ \left| V\_{GS} - V\_{TH} \right| \left| V\_{DS} \right| - \frac{1}{2} \left| V\_{DS} \right|^2 \right] \tag{3}$$

The propagation delay *τ* is directly related to the dominant pole, and it has been approximated as in (4), which depends on *CL*, the transconductance *gm* of the CMOS differential pair, and *gds* of the active load [27], so that the reduction of the transistors' sizes leads to an increase of the dominant pole *ωp*.

$$
\omega\_{P} = \frac{3.29 \cdot 10^{54} (\text{C}\_{D} + \text{C}\_{AD2} + \text{C}\_{db4} + \text{C}\_{L}) + 2.43 \cdot 10^{44} (gd\alpha\_{2} + gd\epsilon\_{4}) + 1.46 \cdot 10^{56} \cdot gm\_{2} (\text{C}\_{D} + \text{C}\_{L}) + 3.86 \cdot 10^{45} (gd\alpha\_{4} \text{s} \cdot \text{m}) + 2.51 \cdot 10^{58} (\text{C}\_{L} \cdot gd\alpha\_{4} \text{s} \cdot \text{m})}{3.29 \cdot 10^{54} (gd\alpha\_{2} + gd\epsilon\_{4}) + 1.46 \cdot 10^{56} (gd\alpha\_{4} \text{s} \cdot \text{m} + gd\epsilon\_{2} \text{s} \cdot \text{m} \cdot \text{s})} \tag{4}
$$

The delay cell shown in Figure 1 can therefore be characterized by measuring the open-loop gain *AOL* and the dominant pole *ωp*. For instance, the gain-bandwidth product (GBW) of the delay cell is the frequency at which *AOL* becomes 0 dB [28]. Its design including process, voltage and temperature (PVT) variations is given in [5], and in this paper the delay cell is optimized to provide the smallest propagation delay *τ* to increase *fosc*. The CMOS differential stage with active load is used to design the four-stage (*N* = 4) VCO shown in Figure 2.

**Figure 2.** VCO consisting of four CMOS differential stages with active loads, in a ring topology.

#### **3. VCO Optimization Methods**

VCO optimization has been carried out through different approaches, such as metaheuristics [7,22]. In [7], the operation of a ring VCO is improved through particle swarm optimization (PSO) and the non-dominated sorting genetic algorithm (NSGA-II) to minimize both the phase noise and the power consumption. Symbolic modeling techniques are used to obtain expressions for the total output noise density and the VCO's phase noise, which reduces the run time and simplifies the noise expression. The tuning range also improved even though it was not an objective, and both Monte Carlo and process-corner analyses were applied to the final design. Similarly, in [22] the optimal sizing of a differential ring VCO is carried out through multi-objective particle swarm optimization (MOPSO) and an infeasibility-driven evolutionary algorithm (IDEA) to minimize both the phase noise and the power consumption while maintaining a given oscillation frequency. Noise modeling is also carried out: the determinant decision diagram (DDD) symbolic technique is used to obtain simplified noise expressions and to solve the system of equations. Furthermore, Monte Carlo and PVT variation analyses were performed to guarantee the robustness of the design.

In [29], an algorithm that performs RF circuit sizing by using evolutionary strategies and simulated annealing in the search and selection parts, respectively, is implemented in Matlab. The optimization takes into account the parasitics caused by the layout of the passive elements through physically based equivalent parasitic models. This reduces the number of iterations between circuit sizing and layout generation (and hence the synthesis time), since the difference between synthesis and post-layout results is decreased. The use of simplified models in RF circuit synthesis to approximate layout-induced parasitics leads to unrealistic outcomes. An LC cross-coupled oscillator was optimized using this approach, where the restrictions are oscillation frequency, phase noise, power consumption, and oscillation amplitude.

In [30], the circuit optimization tool AIDA-C is used to carry out a multi-objective optimization and to size an LC-tank VCO with the aim of minimizing two conflicting objectives, namely phase noise and power consumption. This optimization process achieves a good balance between the two objectives despite the trade-off between them, but the optimization takes several hours to run. In [31], two design tools, AIDA and SIDe-O, are introduced to design a robust LC-tank VCO. SIDe-O is employed to address the problems related to the passive elements, while AIDA assures a robust design through its corner-aware approach, and NSGA-II is employed to minimize the phase noise, power consumption and area. As in the previous case, the algorithm takes several hours to run. In neither of the two approaches is any objective focused on achieving a higher oscillation frequency.

#### **4. Problem Formulation for the Optimization of the VCO by Applying DE**

The single-objective function *g*(*x*) is formulated by (5), where *μ* is a constant set to one and *r*(*x*) stands for the constraints. One can see that when all the constraints are fulfilled, the second term of the function is equal to 0 and the objective function reduces to the oscillating period of the ring VCO, *g*(*x*) = *f*(*x*). Therefore, the sizing optimization problem can be defined by (6).

$$g(\mathbf{x}) = f(\mathbf{x}) + \mu \sum r^2(\mathbf{x}) \tag{5}$$

$$\begin{aligned} \text{Search}: & \mathbf{x} = [W\_1, W\_3, L\_1, L\_3, V\_{ctrl}] \\ \text{Minimize}: & g(\mathbf{x}) \\ \text{Subject to } & 5 > A\_{OL} > 1, \\ & V\_{DS} \le V\_{GS} - V\_{TH} \quad \text{for} \quad M\_{P3} \quad \text{and} \quad M\_{P4}, \\ & V\_{DS} \ge V\_{GS} - V\_{TH} \quad \text{otherwise,} \\ & W\_{min} < W < W\_{max}, \\ & L\_{min} < L < L\_{max}, \\ & V\_{SS} < V\_{ctrl} < V\_{DD} \end{aligned} \tag{6}$$
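The penalty form in (5) can be illustrated with a few lines of code. This is a minimal sketch, assuming that each constraint is encoded as a flag that is 0 when the constraint is satisfied and 1 when it is violated; the helper names and the grouping into three constraints are hypothetical, and in practice the gain and operating regions would come from a circuit simulation rather than from function arguments.

```python
def constraint_flags(a_ol, loads_in_triode, others_in_saturation):
    """Return 0 for each satisfied constraint and 1 for each violated one."""
    return [
        0 if 1.0 < a_ol < 5.0 else 1,        # gain window 5 > A_OL > 1
        0 if loads_in_triode else 1,         # M_P3 and M_P4 in triode
        0 if others_in_saturation else 1,    # remaining devices in saturation
    ]


def penalized_objective(period_s, flags, mu=1.0):
    """Eq. (5): g(x) = f(x) + mu * sum(r(x)**2), with f(x) the period."""
    return period_s + mu * sum(r ** 2 for r in flags)


feasible = penalized_objective(0.2e-9, constraint_flags(3.0, True, True))
infeasible = penalized_objective(0.2e-9, constraint_flags(6.0, True, False))
print(feasible, infeasible)
```

Because the period is of the order of nanoseconds while each violated flag adds a full unit, any infeasible candidate scores far worse than every feasible one, so the minimization is driven towards the feasible region first.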

By applying the DE algorithm, which is described below, the sizing optimization process requires a population of *In* individuals, a maximum number of generations *maxGen*, and the objective function *g*(*x*). Two of the main factors guaranteeing that global optimality is achievable by a metaheuristic like DE are the selection of the best solutions and randomization, where the former ensures that the solution converges to an optimum value while the latter keeps the solution from stalling at local optima [32]. To maximize *fosc*, this paper minimizes the oscillating period of the ring VCO, subject to the constraints of keeping the load MOS transistors *MP*<sup>3</sup> and *MP*<sup>4</sup> operating in the triode region and the remaining N-type MOS transistors operating in the saturation region. The SPICE simulator is linked within the optimization loop to evaluate the delay cell's gain *AOL*, which must be maintained between 1 and 5 dB.

The DE algorithm is a metaheuristic that performs an iterative optimization based on the evolution of a population of individuals under the concept of competition. The initial population is randomly generated, and each individual represents a tentative solution that is associated with a fitness value through an objective function, which indicates the individual's suitability for a particular problem. The individuals with better fitness are more likely to be selected as parents; the chosen ones are reproduced using genetic operators (crossover, mutation) to produce new offspring, which are also evaluated to determine their survival. This represents one generation, and the process is repeated until a stop criterion is met [33–36]. The DE algorithm is suitable for continuous optimization problems, such as sizing analog CMOS integrated circuits like the VCO. In the DE algorithm, a vector population is altered through a vector of differences, which translates into two operators: the first is a recombination operator of two or more solutions, and the second is a self-referential mutation operator that drives the algorithm towards acceptable solutions. Each individual is encoded as a vector of real numbers that lie within the limits defined for each design variable (such as the widths (W) and lengths (L) of the MOS transistors). The crossover operator defines each variable of the offspring to be either a linear combination of three randomly selected individuals or the value inherited from its parent, while guaranteeing that at least one of the offspring's variables differs from its parent. A scaling factor is employed to prevent stagnation of the search process [33,37].

In the DE algorithm, if the magnitude of a variable is out of range, the recombination and mutation operators can be employed to reset the value. For instance, the value can be set to the limit it exceeds; however, this diminishes the population's diversity. Other approaches reset it to a random value or to the midpoint between its previous value and the violated bound. In the latter case the limits are approached asymptotically, which reduces the amount of disruption [33]. In our current DE implementation, the individual is reset randomly within the search bounds. Other guidelines for designing a DE algorithm include setting the population size to ten times the number of decision variables and initializing the weighting factor *Pf* to 0.8 and the crossover constant *Pc* to 0.9. If no convergence is achieved, an increase in population may be necessary; however, it is frequently the weighting factor that has to be modified to be a little lower or higher than 0.8. There is a trade-off between convergence speed and robustness: if the population size increases and the weighting factor decreases, convergence is more likely to occur but takes longer. The performance of DE is more sensitive to the value of the weighting factor than to the value of the crossover constant, and the range of both is generally [0.5, 1]. Faster convergence may occur with higher values of the crossover constant [33].
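The three ways of handling an out-of-range variable mentioned above can be written side by side. The functions below are a sketch with hypothetical names; they operate on a single scalar variable and are not taken from the authors' implementation.

```python
import random


def clip_to_bound(value, low, high):
    """Set an out-of-range value to the limit it exceeds."""
    return min(max(value, low), high)


def random_reset(value, low, high, rng=random):
    """Replace an out-of-range value by a uniform random value in the bounds."""
    if low <= value <= high:
        return value
    return rng.uniform(low, high)


def midpoint_reset(value, previous, low, high):
    """Move halfway from the previous (in-range) value to the violated bound."""
    if value < low:
        return 0.5 * (previous + low)
    if value > high:
        return 0.5 * (previous + high)
    return value


rng = random.Random(1)
print(clip_to_bound(12.0, 2.0, 10.0))        # lands exactly on the bound
print(random_reset(12.0, 2.0, 10.0, rng))    # somewhere inside [2, 10]
print(midpoint_reset(12.0, 8.0, 2.0, 10.0))  # halfway between 8 and 10
```

Clipping piles individuals up on the boundary, which is the loss of diversity noted in the text, whereas the random reset keeps diversity at the price of discarding the direction of the mutation.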

The usefulness of the DE algorithm in sizing CMOS integrated circuits has been proved in [38–40]. Algorithm 1 describes its adaptation to maximize the oscillation frequency of the ring VCO shown in Figure 2. As mentioned above, the objective function here is the minimization of the propagation delay *τ*, which is obtained by measuring the oscillating period with SPICE.

**Algorithm 1** DE pseudocode.


In the optimization process, the individuals *In* of the population generated by the DE algorithm are substituted into the netlist file of the VCO's delay cell, and each individual is simulated in SPICE. The electrical characteristics are obtained from the SPICE output file (.lis) to verify that all the MOS transistors are working in the appropriate region of operation and that the gain is within the range 5 > *AOL* > 1. A flag assigns 0 to a fulfilled constraint and 1 to an unfulfilled one. The period of the sinusoidal wave is associated with the function *f*(*x*). If the VCO is not oscillating, a high value is assigned to *f*(*x*). In the DE algorithm each individual is mutated to generate an adaptive solution *vij* from three randomly selected parents, as given in (7). Afterwards, the crossover creates a trial solution through the recombination of the mutated solution *vij* with an individual *xij*, as given in (8). Finally, the replacement is carried out with an elitist selection, where the new individual replaces its parent if its objective function value is better than that of the parent, as given in (9) [33].

$$
\mathbf{v}\_{ij} = \mathbf{x}\_{r3j} + P\_f(\mathbf{x}\_{r1j} - \mathbf{x}\_{r2j}) \tag{7}
$$

$$\mathbf{u}\_{ij} = \begin{cases} \mathbf{v}\_{ij} & \text{if } \operatorname{rand}\_j[0, 1] < P\_c \quad \text{or} \quad j = j\_{\text{rand}} \\\\ \mathbf{x}\_{ij} & \text{otherwise} \end{cases} \tag{8}$$

$$\mathbf{x}\_{i}(t+1) = \begin{cases} \mathbf{u}\_{i}(t+1) & \text{if } \quad f(\mathbf{u}\_{i}(t+1)) < f(\mathbf{x}\_{i}(t)) \\ \mathbf{x}\_{i}(t) & \text{otherwise} \end{cases} \tag{9}$$
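The mutation, crossover and replacement steps in (7)–(9) can be put together in a compact loop. The sketch below is a generic DE/rand/1/bin written in Python for illustration; it is not the C implementation used in this work. The function name, the default parameter values and the `sphere` test objective are assumptions, and in the VCO setting `objective` would instead wrap a circuit simulation that returns the penalized period.

```python
import random


def differential_evolution(objective, bounds, pop_size=50, max_gen=50,
                           p_f=0.6, p_c=0.7, seed=0):
    """Minimize `objective` over box `bounds` with DE/rand/1/bin."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    fit = [objective(x) for x in pop]
    for _ in range(max_gen):
        new_pop, new_fit = [], []
        for i in range(pop_size):
            # (7) mutation from three distinct parents, none of them i
            others = [k for k in range(pop_size) if k != i]
            r1, r2, r3 = rng.sample(others, 3)
            v = [pop[r3][j] + p_f * (pop[r1][j] - pop[r2][j])
                 for j in range(dim)]
            # out-of-range values are reset randomly within the bounds
            v = [vj if lo <= vj <= hi else rng.uniform(lo, hi)
                 for vj, (lo, hi) in zip(v, bounds)]
            # (8) binomial crossover; j_rand forces at least one mutated gene
            j_rand = rng.randrange(dim)
            u = [v[j] if (rng.random() < p_c or j == j_rand) else pop[i][j]
                 for j in range(dim)]
            # (9) elitist replacement: keep the trial only if it is better
            fu = objective(u)
            if fu < fit[i]:
                new_pop.append(u)
                new_fit.append(fu)
            else:
                new_pop.append(pop[i])
                new_fit.append(fit[i])
        pop, fit = new_pop, new_fit
    best = min(range(pop_size), key=fit.__getitem__)
    return pop[best], fit[best]


def sphere(x):
    """Stand-in objective with its minimum at the origin."""
    return sum(xi * xi for xi in x)


best_x, best_f = differential_evolution(sphere, [(-5.0, 5.0)] * 3)
print(best_x, best_f)
```

Because the replacement in (9) never accepts a worse trial, the best objective value in the population cannot increase from one generation to the next.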

#### **5. Optimizing the CMOS VCO by Applying DE Algorithm**

The sizing optimization problem defined by (6) searches over the sizes of the design variables (widths *W* and lengths *L*) of the MOS transistors, so one must first determine the search space ranges. For instance, the limits of the sizes are set to 2*λ* ≤ *W* ≤ 1000*λ* and 2*λ* ≤ *L* ≤ 10*λ*, respectively, where *λ* = 90 nm for the UMC CMOS technology of 180 nm. Another design variable is the control voltage, whose bounds are set to *VSS* ≤ *Vctrl* ≤ *VDD*, where *VSS* = −0.9 V is the lower supply voltage and *VDD* = 0.9 V the higher supply voltage.

The DE algorithm was calibrated by adjusting *Pc*, *Pf* and *In* to 0.7, 0.6 and 50, respectively. The maximum number of generations is set to 50. In total, 30 runs of DE were performed. The best feasible solution provided an oscillation frequency of 5 GHz, as shown in Figure 3. In such a case the obtained parameter values are: *Ibias* = *IMN*<sup>3</sup> = 4 mA, *WMN*<sup>1</sup> = *WMN*<sup>2</sup> = 40 μm, *WMN*<sup>3</sup> = 500 μm, *WMP*<sup>3</sup> = *WMP*<sup>4</sup> = 17 μm, *LMN*<sup>1</sup> = *LMN*<sup>2</sup> = *LMN*<sup>3</sup> = *LMP*<sup>3</sup> = *LMP*<sup>4</sup> = 0.18 μm, *Vctrl* = −0.8 V and *CL* = 31.39 fF. The *VBIAS* is created from Figure 1, in which the CMOS differential stage with active load is biased with *Ibias* = 2 mA, and the sizes of *Mbn* are *W* = 200 μm and *L* = 180 μm.

The SPICE simulation result of the best solution of the DE algorithm is shown in Figure 3.

**Figure 3.** VCO's oscillation frequency provided by the best solution of the DE algorithm.

Monte Carlo analysis is a statistical analysis of integrated circuits in which the parameters and mismatch of the circuit's devices are varied randomly. Monte Carlo simulation allows the designer to consider the possible effects of a random variation of certain circuit parameters on its performance. The Monte Carlo analysis is carried out by varying W and L for each of the 30 feasible solutions over 1000 runs, considering a Gaussian distribution with 10% deviation. The outcome of the Monte Carlo simulations is employed to compute the mean and the standard deviation of the objective function value; those results are sketched in Figure 4.
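The statistics described above can be sketched as follows. This is a hedged illustration only: `surrogate_period` is a hypothetical closed-form stand-in for the SPICE measurement with arbitrary coefficients, and the 10% deviation is interpreted here as a relative standard deviation of the Gaussian, which is an assumption.

```python
import random
import statistics


def monte_carlo_period(period_fn, w_nominal, l_nominal,
                       runs=1000, rel_sigma=0.10, seed=0):
    """Sample W and L from Gaussians around their nominal values and
    return the mean and sample standard deviation of the period."""
    rng = random.Random(seed)
    periods = []
    for _ in range(runs):
        w = rng.gauss(w_nominal, rel_sigma * w_nominal)
        l = rng.gauss(l_nominal, rel_sigma * l_nominal)
        periods.append(period_fn(w, l))
    return statistics.mean(periods), statistics.stdev(periods)


def surrogate_period(w_um, l_um):
    """Hypothetical stand-in for the simulated period, in seconds."""
    return 0.2e-9 * (l_um / 0.18) * (1.0 + 0.1 * (w_um / 40.0 - 1.0))


mean_period, std_period = monte_carlo_period(surrogate_period, 40.0, 0.18)
print(mean_period, std_period)
```

Ranking candidate solutions by the mean and then by the standard deviation gives one possible way to prefer designs whose period is both short and insensitive to size variations.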

**Figure 4.** Mean and standard deviation of the Monte Carlo analysis for 30 feasible sized solutions of the DE algorithm. The best solution is the one with the lowest period (corresponding to a greater oscillation frequency).

The feasible sized solutions that achieved the lowest time delay *τ* of the CMOS differential stages are analyzed, and their statistics (mean and standard deviation of the period of the sinusoidal wave) are summarized in Table 1. From this table, the Monte Carlo simulation of the best solution of the DE algorithm is shown in Figure 5.

**Figure 5.** Monte Carlo simulation of the best feasible sized solution of the DE algorithm.



The parameters of each one of the five best feasible sized solutions and the simulated period, frequency and gain of the VCO and the CMOS delay cell, respectively, are summarized in Table 2.

**Table 2.** Best 5 feasible sized solution design parameters provided by the DE algorithm.


A PVT simulation of the ten best feasible sized solutions was also performed to ensure that the CMOS VCO is robust to variations. The PVT variations are simulated with *Vctrl* = −0.8 V, considering five process corners (typical-typical (TT), slow-slow (SS), slow N-type MOS transistor and fast P-type MOS transistor (SNFP), fast N-type MOS transistor and slow P-type MOS transistor (FNSP), and fast-fast (FF)), three voltage variations (±10% of ±*Vsupply* = 0.9 V), and three temperature variations (*T*− = −20 °C, *T* = 60 °C and *T*+ = 120 °C) [41]. Figure 6 shows the higher and lower gain and oscillation frequency values provided by the DE algorithm. Table 3 summarizes the PVT simulation results, where the five corners (TT, SS, SNFP, FNSP and FF) correspond to the MOS transistor models provided by the UMC foundry.
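The PVT grid described above is a Cartesian product of corners, supply levels and temperatures. The sketch below enumerates it under the assumption that the three voltage variations are −10%, nominal and +10% applied symmetrically to both supplies; the dictionary layout and names are hypothetical.

```python
from itertools import product

PROCESS_CORNERS = ["TT", "SS", "SNFP", "FNSP", "FF"]
SUPPLY_SCALES = [0.9, 1.0, 1.1]   # assumed: -10 %, nominal, +10 %
TEMPERATURES_C = [-20, 60, 120]
V_SUPPLY = 0.9                    # magnitude of VDD and VSS in volts


def pvt_conditions():
    """Yield one simulation condition per (corner, supply, temperature)."""
    for corner, scale, temp_c in product(PROCESS_CORNERS, SUPPLY_SCALES,
                                         TEMPERATURES_C):
        vdd = round(scale * V_SUPPLY, 3)
        yield {"corner": corner, "vdd": vdd, "vss": -vdd, "temp_c": temp_c}


conditions = list(pvt_conditions())
print(len(conditions))  # 5 corners x 3 supplies x 3 temperatures = 45
```

Each candidate solution would then be simulated once per condition, so the ten best solutions require 450 simulations in total under this reading.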

**Figure 6.** (**a**) Higher and (**b**) lower gains and dominant pole frequencies, for the solution 1 CMOS delay cell designed with United Microelectronics Corporation (UMC) technology of 180 nm by applying DE algorithm.


**Table 3.** Open-loop gain and dominant pole frequency over PVT variations with *Vctrl* = −0.8 V.



As one can see from Table 3, solution number 4 is the most robust to PVT. This solution has the greater frequency with all the gains being positive. Figure 6 depicts the higher and lower gains and dominant pole frequencies for solution number 1, since this is the one that provides the higher oscillation frequency. As one can see, the greater gains occur at the FNSP process-corner (in Figure 6a), while the lower gains for the most part occur at the SNFP process-corner (see Figure 6b). Furthermore, the greater *ω<sup>p</sup>* takes place mostly at the FF process-corner, while the lower *ω<sup>p</sup>* mostly takes place at the SS process-corner.

Figure 7 depicts the higher and lower gains and dominant pole frequencies for solution number 4, since it is the most robust one. As one can see in Figure 7a, the greater gains occur mostly at the FNSP process-corner, while the lower gains, in Figure 7b, for the most part occur at the SNFP process-corner. The greater *ω<sup>p</sup>* takes place mostly at the FF process-corner (in Figure 7b), while the lower *ω<sup>p</sup>* mostly takes place at the SS process-corner (in Figure 7a).

Table 4 shows the oscillation frequency and power dissipation corresponding to each control voltage *Vctrl* value for the best 5 feasible sized solutions.

**Figure 7.** (**a**) Higher and (**b**) lower gains and dominant pole frequencies, for the solution 4 CMOS delay cell designed with UMC technology of 180 nm by applying DE algorithm.


**Table 4.** Oscillation frequency and power dissipation to the corresponding *Vctrl*.

#### **6. Discussion**

The circuit design methodology proposed here is: (1) apply DE at least 30 times, which gives 30 solutions to the design problem, considering only the best solutions according to the objective function; (2) apply the Monte Carlo (MC) analysis to the best 10 solutions; (3) apply the PVT analysis to the best 10 solutions of the MC analysis; and finally, (4) select the best solution according to the variations shown in the PVT analysis.

We apply the MC analysis to vary the dimension for all the circuit's transistors up to 10% of their value. As shown in Figure 5, these variations are not too high to move the operating point of the MOS transistors, and still the order of the obtained solution according to the objective function is kept after the MC analysis.

Then we apply the PVT analysis. The five process corners employed for this simulation are the ones provided by the foundry: typical-typical (TT), slow NMOS transistor and fast PMOS transistor (SNFP), fast NMOS transistor and slow PMOS transistor (FNSP), slow-slow (SS), and fast-fast (FF). These account for the variation of fabrication parameters. A circuit can also be subject to temperature variations (considering three temperatures: −20°, 60°, and 120°) and voltage variations (considering a variation of ±10%) in its operating environment. Therefore, each corner is simulated with each temperature and voltage variation.
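The resulting simulation grid can be enumerated as in the following minimal sketch. The corner names and temperatures are taken from the text; sampling the ±10% supply variation at exactly three points (−10%, nominal, +10%) is an assumption made for illustration, as is the helper name `pvt_cases`.

```python
from itertools import product

# Process corners named in the text (provided by the foundry).
CORNERS = ["TT", "SNFP", "FNSP", "SS", "FF"]
# Temperatures listed in the text.
TEMPERATURES = [-20, 60, 120]
# Assumption: the +/-10% voltage variation is sampled at three scale factors.
VOLTAGE_SCALES = [0.9, 1.0, 1.1]


def pvt_cases(corners=CORNERS, temperatures=TEMPERATURES, scales=VOLTAGE_SCALES):
    """Return every (corner, temperature, voltage scale) combination."""
    return list(product(corners, temperatures, scales))


cases = pvt_cases()
print(len(cases))  # 5 corners * 3 temperatures * 3 voltage scales = 45
```

Under these assumptions, each candidate solution is simulated 45 times, which is one reason why moving the PVT analysis inside the optimization loop increases the run time considerably.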

The chosen solution is the one with lower time period (or higher operation frequency) while all the gains are positive, within the gain constraint of 1 < *AOL* < 5.

Table 3 shows only the first five solutions, although 10 analyses were performed.

We use a DE version programmed in C language. One single run (50 individuals, and 50 generations) took around 32 min.

The MC and PVT analyses could be incorporated within the optimization loop as another set of constraints. However, this would increase the simulation time to several hours. We plan to analyse this idea in future work.

#### **7. Conclusions**

The application of the DE algorithm has proven to be effective in the minimization of the time period of a CMOS VCO designed with CMOS differential delay cells in a ring topology. We use the Monte Carlo analysis over the sized transistor dimensions to rank the obtained DE solutions. Then we apply the PVT analyses to the 10 best solutions according to the Monte Carlo analysis. The most robust solution to PVT provides an oscillation frequency up to 4.25 GHz (corresponding to a time period of 0.235 ns), and it has a wider tuning range of 2.72–4.44 GHz, corresponding to *V<sub>ctrl</sub>* of −0.36 to −0.9 V.

**Author Contributions:** Investigation, P.R.C.-A., E.T.-C., and L.G.d.l.F.; and Writing—review and editing, P.R.C.-A., E.T.-C., and L.G.d.l.F. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Conflicts of Interest:** The authors declare no conflict of interest. The founding sponsors had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

#### **References**


**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **The Pareto Tracer for General Inequality Constrained Multi-Objective Optimization Problems**

**Fernanda Beltrán 1, Oliver Cuate 1,2 and Oliver Schütze 1,\***


Received: 22 October 2020; Accepted: 18 December 2020; Published: 20 December 2020

**Abstract:** Problems where several incommensurable objectives have to be optimized concurrently arise in many engineering and financial applications. Continuation methods for the treatment of such multi-objective optimization problems (MOPs) are very efficient if all objectives are continuous since in that case one can expect that the solution set forms at least locally a manifold. Recently, the Pareto Tracer (PT) has been proposed, which is such a multi-objective continuation method. While the method works reliably for MOPs with box and equality constraints, no strategy has been proposed yet to adequately treat general inequalities, which we address in this work. We formulate the extension of the PT and present numerical results on some selected benchmark problems. The results indicate that the new method can indeed handle general MOPs, which greatly enhances its applicability.

**Keywords:** multi-objective optimization; Pareto Tracer; continuation; constraint handling

#### **1. Introduction**

In many real-world applications, several conflicting and incommensurable objectives have to be optimized concurrently. As a general example, in the design of basically any product, both cost (to be minimized) and quality (to be maximized) are relevant objectives, among others. Problems of this kind are termed multi-objective optimization problems (MOPs). If all of the objectives are continuous and in conflict with each other, one cannot expect a single solution (as is the case for scalar optimization problems, i.e., problems where one objective is considered) but an entire set of solutions. More precisely, one can expect that the solution set (the Pareto set) and its image (the Pareto front) form at least locally an object of dimension *k* − 1, where *k* is the number of objectives involved in the problem. Due to this "curse of dimensionality", problems with more than, e.g., four objectives are also called many-objective optimization problems (MaOPs).

In the literature, many different methods for the numerical treatment of MOPs and MaOPs can be found (see also the discussion in the next section). One class of such methods is given by specialized continuation methods that take advantage of the fact that the solution set forms a manifold, at least locally and under certain mild assumptions on the model as discussed in [1]. Continuation methods start with one (approximate) solution of the problem and perform a movement along the Pareto set/front of the given M(a)OP by considering the underdetermined system of equations that is derived from the Karush–Kuhn–Tucker (KKT) equations of the problem. By construction, continuation methods are of local nature. That is, if the Pareto set consists of different connected components, such methods have to be fed with several starting points in order to obtain approximations of the entire solution set. On the other hand, continuation methods are probably most effective locally (i.e., within each connected component). Thus far, several multi-objective continuation methods have been proposed. Most of these continuation methods, however, are designed for or restricted to the treatment of bi-objective problems (i.e., MOPs with two objectives). The method of Hillermeier [1] and the Pareto Tracer (PT [2]) have been proposed for a general number *k* of objectives. The method of Hillermeier is applicable to unconstrained and equality constrained MOPs, and the PT in addition to box constrained problems. Thus far, no extensions of these two methods are known for the treatment of general inequalities, which represents a significant shortcoming since such constraints naturally arise in many applications (e.g., [3,4]). In this paper, we extend the PT to the treatment of general inequality constraints. To this end, we utilize and adapt elements from active set methods to decide which of the inequalities have to be treated as equalities at each candidate solution. We demonstrate the strength of the novel algorithm on several benchmark test functions and present comparisons to some other numerical multi-objective solvers. The results indicate that the new method can indeed reliably handle MOPs with general constraints.

The remainder of this paper is organized as follows. In Section 2, we briefly present the background required for the understanding of this work. In Section 3, we adapt the Pareto Tracer for the treatment of general (equality and inequality) constraints. In Section 4, we present some results of the PT as well as some other multi-objective numerical methods on selected benchmark problems. Finally, we draw our conclusions in Section 5 and give possible paths for future research.

#### **2. Background and Related Work**

In this section, we briefly state the main concepts and notations that are used for the understanding of this work (for details, we refer to, e.g., [5,6]).

We consider here continuous multi-objective optimization problems (MOPs) that can be defined mathematically as

$$\begin{array}{ll}\min\_{\mathbf{x}} & F(\mathbf{x}),\\\text{s.t.} & h\_{i}(\mathbf{x}) = 0, \quad i = 1, \ldots, p, \\ & g\_{i}(\mathbf{x}) \le 0, \quad i = 1, \ldots, m,\end{array} \tag{1}$$

where *F* : *Q* ⊂ ℝ<sup>*n*</sup> → ℝ<sup>*k*</sup>, *F*(*x*) = (*f*<sub>1</sub>(*x*), ..., *f<sub>k</sub>*(*x*))<sup>*T*</sup> is the map of the *k* individual objectives *f<sub>i</sub>* : *Q* ⊂ ℝ<sup>*n*</sup> → ℝ. We assume that all objectives and constraint functions are twice continuously differentiable. The domain *Q* of the functions is defined by the equality and inequality constraints of (1):

$$Q := \{ \mathbf{x} \in \mathbb{R}^n \; : \; h\_i(\mathbf{x}) = 0, \; i = 1, \dots, p \text{ and } g\_i(\mathbf{x}) \le 0, \; i = 1, \dots, m \}. \tag{2}$$

If a point *x* ∈ ℝ<sup>*n*</sup> satisfies all constraints of (1), i.e., if *x* ∈ *Q*, we call this point feasible. Points *x* ∉ *Q* are called infeasible. If *k* = 2 objectives are considered, the problem is also termed a bi-objective optimization problem (BOP).

We say that a point *x* ∈ *Q* dominates a point *y* ∈ *Q* (in short: *x* ≺ *y*) if *f<sub>i</sub>*(*x*) ≤ *f<sub>i</sub>*(*y*) for all *i* = 1, ..., *k*, and there exists an index *j* such that *f<sub>j</sub>*(*x*) < *f<sub>j</sub>*(*y*). A point *x*<sup>∗</sup> is called Pareto optimal or simply optimal if there does not exist a vector *y* ∈ *Q* that dominates *x*<sup>∗</sup>. A point *x*<sup>∗</sup> ∈ *Q* is called locally optimal if there does not exist a vector *y* ∈ *Q* ∩ *N*(*x*<sup>∗</sup>) that dominates *x*<sup>∗</sup>, where *N*(*x*<sup>∗</sup>) is a neighborhood of *x*<sup>∗</sup>. The set *P<sub>Q</sub>* of all Pareto optimal solutions is called the Pareto set, and its image *F*(*P<sub>Q</sub>*) the Pareto front. In [1], it has been shown that one can expect that both Pareto set and front typically form (*k* − 1)-dimensional objects under certain (mild) conditions on the problem.
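The dominance relation translates directly into code. The following minimal sketch operates on objective vectors *F*(*x*) and *F*(*y*) of feasible points (minimization is assumed throughout); the function names `dominates` and `nondominated` are hypothetical and chosen for illustration only.

```python
def dominates(fx, fy):
    """True if objective vector fx dominates fy (all objectives are minimized)."""
    no_worse = all(a <= b for a, b in zip(fx, fy))
    strictly_better = any(a < b for a, b in zip(fx, fy))
    return no_worse and strictly_better


def nondominated(points):
    """Keep only those objective vectors that no other vector in the list dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


candidates = [(1, 3), (2, 2), (3, 1), (3, 3)]
print(nondominated(candidates))  # (3, 3) is dominated by (2, 2) and is dropped
```

Applied to a finite sample of feasible points, such a filter yields only an approximation of the Pareto front; it does not certify optimality with respect to the whole domain *Q*.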

If all objectives and constraint functions are differentiable, locally optimal solutions can be characterized by the Karush–Kuhn–Tucker (KKT) equations [7,8]:

**Theorem 1.** *Suppose that x*<sup>∗</sup> *is locally optimal with respect to* (1)*. Then, there exist Lagrange multipliers α* ∈ ℝ<sup>*k*</sup>*, λ* ∈ ℝ<sup>*p*</sup>*, and γ* ∈ ℝ<sup>*m*</sup> *such that the following conditions are satisfied*

$$\sum\_{i=1}^{k} \alpha\_i \nabla f\_i(\mathbf{x}^\*) + \sum\_{i=1}^{p} \lambda\_i \nabla h\_i(\mathbf{x}^\*) + \sum\_{i=1}^{m} \gamma\_i \nabla g\_i(\mathbf{x}^\*) = \mathbf{0}, \tag{3a}$$

$$h\_i(\mathbf{x}^\*) = 0, \quad i = 1, \dots, p, \tag{3b}$$

$$g\_i(\mathbf{x}^\*) \le 0, \quad i = 1, \dots, m, \tag{3c}$$

$$\alpha\_i \ge 0, \quad i = 1, \dots, k, \tag{3d}$$

$$\sum\_{i=1}^{k} \alpha\_i = 1, \tag{3e}$$

$$\gamma\_i \ge 0, \quad i = 1, \dots, m, \tag{3f}$$

$$\gamma\_i g\_i(\mathbf{x}^\*) = 0, \quad i = 1, \dots, m. \tag{3g}$$
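For an unconstrained problem, conditions (3) reduce to the stationarity condition (3a) together with (3d) and (3e), which is easy to verify numerically. The sketch below assumes two illustrative quadratic objectives, *f*<sub>1</sub>(*x*) = (*x*<sub>1</sub> + 3)<sup>2</sup> + (*x*<sub>2</sub> − 2)<sup>2</sup> and *f*<sub>2</sub>(*x*) = *x*<sub>1</sub><sup>2</sup> + (*x*<sub>2</sub> + 3)<sup>2</sup>, with no constraints; the function names are hypothetical.

```python
def grad_f1(x):
    # Gradient of f1(x) = (x1 + 3)^2 + (x2 - 2)^2
    return (2 * (x[0] + 3), 2 * (x[1] - 2))


def grad_f2(x):
    # Gradient of f2(x) = x1^2 + (x2 + 3)^2
    return (2 * x[0], 2 * (x[1] + 3))


def kkt_residual(x, alpha):
    """Euclidean norm of alpha_1 * grad f1(x) + alpha_2 * grad f2(x), cf. (3a)."""
    g1, g2 = grad_f1(x), grad_f2(x)
    r = [alpha[0] * a + alpha[1] * b for a, b in zip(g1, g2)]
    return sum(c * c for c in r) ** 0.5


def is_kkt(x, alpha, tol=1e-9):
    """Check (3a), (3d), and (3e) for the unconstrained bi-objective example."""
    weights_ok = all(a >= 0 for a in alpha) and abs(sum(alpha) - 1) <= tol
    return weights_ok and kkt_residual(x, alpha) <= tol


print(is_kkt((-1.5, -0.5), (0.5, 0.5)))  # midpoint between the two minimizers
print(is_kkt((0.0, 0.0), (0.5, 0.5)))    # not stationary for these weights
```

The first point lies on the line segment between the individual minimizers (−3, 2) and (0, −3), where the two gradients point in opposite directions, so a convex combination of them vanishes.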

Multi-objective optimization is an active field of research, and thus far many numerical methods have been proposed for the treatment of such problems. There exist, for instance, many methods that are designed to compute single solutions, such as the weighted sum method [9], the *ε*-constraint method [5,10], the weighted metric and weighted Tchebycheff methods [5,11,12], as well as reference point problems [13–15]. All of these methods transform the given MOP into a scalar optimization problem (SOP) that can, to a certain extent, include the users' preferences. These methods can either be used as standalone algorithms (i.e., for the computation of single solutions) or be used to obtain a finite size approximation of the entire Pareto set/front of the given MOP by solving a clever sequence of these SOPs [5,16–19].
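As a small illustration of such a scalarization, consider the weighted sum method for two quadratic objectives of the form *f*<sub>1</sub>(*x*) = ‖*x* − *a*‖<sup>2</sup> and *f*<sub>2</sub>(*x*) = ‖*x* − *b*‖<sup>2</sup>. This setting is an assumption chosen because the SOP min *w f*<sub>1</sub> + (1 − *w*) *f*<sub>2</sub> then has the closed-form minimizer *x* = *wa* + (1 − *w*)*b*, obtained by setting the gradient 2*w*(*x* − *a*) + 2(1 − *w*)(*x* − *b*) to zero. The function name is hypothetical.

```python
def weighted_sum_minimizer(w, a, b):
    """Minimizer of w*||x - a||^2 + (1 - w)*||x - b||^2 for a weight 0 <= w <= 1."""
    return tuple(w * ai + (1 - w) * bi for ai, bi in zip(a, b))


a, b = (-3.0, 2.0), (0.0, -3.0)

# A sequence of SOPs with different weights yields a finite approximation
# of the Pareto set, here the line segment between a and b.
samples = [weighted_sum_minimizer(i / 4, a, b) for i in range(5)]
for x in samples:
    print(x)
```

For general problems each SOP has to be solved numerically, and equally spaced weights do not in general lead to equally spaced points on the Pareto front.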

Further, there exist set oriented methods such as cell mapping techniques [20–23], subdivision techniques [24–27], and multi-objective evolutionary algorithms (MOEAs, e.g., [3,28–34]). All of these methods manipulate an entire set of candidate solutions in each iteration and hence yield a finite size approximation of the solution set in one run of the algorithm. Hybridizations of such techniques with mathematical programming techniques can be found in [31,35–41].

Finally, a third class of numerical solvers for MOPs is given by specialized continuation methods that take advantage of the fact that the Pareto set/front of a given problem forms at least locally a manifold of a certain dimension. Methods of this kind start with a given (approximate) solution and perform a movement along the Pareto set/front of the problem. The first such method was proposed in [1]. It can be applied to unconstrained and equality constrained MOPs with any number *k* of objectives, while no strategies are reported on how to treat inequalities. ParCont [42,43] is a rigorous predictor–corrector method that is based on interval analysis and parallelotope domains. The method can deal with equality and inequality constraints, but it is restricted to bi-objective problems. This restriction also holds for the method presented in [44], which has been designed to provide an equispaced approximation of the Pareto front. The Zigzag method [45–47] obtains Pareto front approximations by alternately optimizing one of the objectives. This approach is also limited to the treatment of bi-objective problems.

In [48], a continuation method is presented that is applicable to box-constrained BOPs. In [49], a variant of the method of Hillermeier is presented that is designed for the treatment of high-dimensional problems.

Recently, the Pareto Tracer (PT) was proposed by Martin and Schütze [2]. Similar to the method of Hillermeier, PT addresses the underdetermined nonlinear system of equations that is induced by the KKT equations. However, unlike the method of Hillermeier, the PT aims to separate the decision variables from the associated weight (or Lagrange) vectors whenever possible, leading to significant changes. The latter is due to the fact that the nonlinearity of the equation system can be significantly higher in the compound space compared to the corresponding system that is only defined in decision variable space. As a by-product, the chosen approach allows one to compute the tangent space of both Pareto set and front at every given regular point *x*. In [50], elements of the PT are used to treat many objective optimization problems (i.e., MOPs with more than, e.g., five objectives). Thus far, PT is only applicable to box and equality constrained problems, which limits its application. In the following, we propose and discuss an extension of this method to adequately treat general MOPs, i.e., MOPs that in particular contain general inequalities.

#### **3. Adapting the Pareto Tracer for General Inequality Constrained MOPs**

In this section, we adapt the PT so that it can handle general inequality constraints. The core is the predictor–corrector step that generates from a given candidate solution *x<sub>i</sub>* the following candidate *x*<sub>*i*+1</sub> that satisfies the KKT conditions, and so that *F*(*x*<sub>*i*+1</sub>) − *F*(*x<sub>i</sub>*) defines a prescribed movement in objective space along the set of KKT points.

Assume we are given a MOP of form (1) and a feasible point *x*<sub>0</sub> that satisfies the KKT conditions (3), where *α<sub>i</sub>* > 0, *i* = 1, ..., *k*. Let *ε* > 0 and define by

$$I\_p(\varepsilon) := \{ j \in \{ 1, \ldots, m \} \; : \; g\_j(\mathbf{x}\_0) \ge -\varepsilon \} \tag{4}$$

the set of indices corresponding to the nearly active inequalities at *x*<sub>0</sub>. If *I<sub>p</sub>*(*ε*) = {*j*<sub>1</sub>, ..., *j<sub>s</sub>*}, *s* ≤ *m*, define

$$G\_{\varepsilon} := \begin{pmatrix} \nabla g\_{j\_1}(\mathbf{x}\_0)^T \\ \vdots \\ \nabla g\_{j\_s}(\mathbf{x}\_0)^T \end{pmatrix} \in \mathbb{R}^{s \times n}. \tag{5}$$

Further, let

$$\begin{aligned} J &:= \begin{pmatrix} \nabla f\_1(\mathbf{x})^T \\ \vdots \\ \nabla f\_k(\mathbf{x})^T \end{pmatrix} \in \mathbb{R}^{k \times n} \\\\ H &:= \begin{pmatrix} \nabla h\_1(\mathbf{x}\_0)^T \\ \vdots \\ \nabla h\_p(\mathbf{x}\_0)^T \end{pmatrix} \in \mathbb{R}^{p \times n}, \end{aligned} \tag{6}$$

and let *α* ∈ ℝ<sup>*k*</sup>, *λ* ∈ ℝ<sup>*p*</sup>, and *γ* ∈ ℝ<sup>*s*</sup> be the solution of

$$\min\_{\alpha, \lambda, \gamma} \left\{ \left\| J^T \alpha + H^T \lambda + G\_{\varepsilon}^T \gamma \right\|\_2^2 \; : \; \alpha\_i \ge 0, \; i = 1, \dots, k, \; \sum\_{i=1}^k \alpha\_i = 1 \right\}. \tag{7}$$

Note that (7) yields the Lagrange multipliers at *x*<sub>0</sub> for *ε* = 0 if *x*<sub>0</sub> is a KKT point and if all active inequalities are regarded as equalities. Using *α*, *λ*, and *γ*, define the matrix

$$W\_{\alpha,\lambda,\gamma} := \sum\_{i=1}^{k} \alpha\_i \nabla^2 f\_i(\mathbf{x}) + \sum\_{i=1}^{p} \lambda\_i \nabla^2 h\_i(\mathbf{x}) + \sum\_{i=1}^{s} \gamma\_i \nabla^2 g\_{j\_i}(\mathbf{x}) \in \mathbb{R}^{n \times n}. \tag{8}$$
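In the simplest setting, problem (7) can be solved in closed form. The sketch below assumes a bi-objective problem (*k* = 2) without equality constraints and without nearly active inequalities, so that (7) reduces to minimizing ‖*t*∇*f*<sub>1</sub> + (1 − *t*)∇*f*<sub>2</sub>‖<sup>2</sup> over *t* ∈ [0, 1]. Since this is a convex quadratic in one variable, the minimizer is the unconstrained stationary point clipped to [0, 1]. The function name and the test gradients are illustrative assumptions.

```python
def min_norm_weights(g1, g2):
    """Solve min_t ||t*g1 + (1 - t)*g2||^2 s.t. 0 <= t <= 1; return (alpha_1, alpha_2)."""
    diff = [a - b for a, b in zip(g1, g2)]       # g1 - g2
    denom = sum(d * d for d in diff)
    if denom == 0.0:
        # Identical gradients: every weight gives the same norm.
        return (0.5, 0.5)
    # Stationary point of the 1-D quadratic: t = -(g1 - g2).g2 / ||g1 - g2||^2
    t = -sum(d * b for d, b in zip(diff, g2)) / denom
    t = min(1.0, max(0.0, t))                    # enforce alpha_i >= 0
    return (t, 1.0 - t)


# Gradients of f1 = (x1+3)^2 + (x2-2)^2 and f2 = x1^2 + (x2+3)^2 at x0 = (-1.5, -0.5)
print(min_norm_weights((3.0, -5.0), (-3.0, 5.0)))   # equal weights at the midpoint
# Gradients at the minimizer of f1, x0 = (-3, 2)
print(min_norm_weights((0.0, 0.0), (-6.0, 10.0)))   # all weight on f1
```

With constraints or more than two objectives, (7) is a small quadratic program in *k* + *p* + *s* variables and requires a numerical solver.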

To compute a predictor direction *ν<sub>μ</sub>* ∈ ℝ<sup>*n*</sup>, we solve the system

$$\begin{pmatrix} W\_{\alpha,\lambda,\gamma} & H^T & G\_{\varepsilon}^T \\ H & 0 & 0 \\ G\_{\varepsilon} & 0 & 0 \end{pmatrix} \begin{pmatrix} \nu\_{\mu} \\ \zeta \\ \sigma \end{pmatrix} = \begin{pmatrix} -J^T \mu \\ 0 \\ 0 \end{pmatrix}. \tag{9}$$

Note that system (9) depends on *μ* ∈ ℝ<sup>*k*</sup>. Before we specify this vector, we first simplify (9). Denote by

$$A := \begin{pmatrix} H \\ G\_{\varepsilon} \end{pmatrix} \in \mathbb{R}^{(p+s)\times n}, \qquad \xi := \begin{pmatrix} \zeta \\ \sigma \end{pmatrix} \in \mathbb{R}^{p+s}, \tag{10}$$

then (9) is equivalent to

$$\begin{pmatrix} W\_{\alpha,\lambda,\gamma} & A^T \\ A & 0 \end{pmatrix} \begin{pmatrix} \nu\_{\mu} \\ \xi \end{pmatrix} = \begin{pmatrix} -J^T \mu \\ 0 \end{pmatrix}. \tag{11}$$


Let *d* ∈ ℝ<sup>*k*</sup>. It is straightforward to show that for a vector *ν*<sub>*μ*<sub>*d*</sub></sub> that solves (11), where *μ<sub>d</sub>* ∈ ℝ<sup>*k*</sup> is chosen such that

$$\begin{pmatrix} -J W\_{\alpha,\lambda,\gamma}^{-1} J^T \\ 1 \; \dots \; 1 \end{pmatrix} \mu\_d = \begin{pmatrix} d \\ 0 \end{pmatrix}, \tag{12}$$

it holds

$$J\nu\_{\mu\_d} = d.\tag{13}$$

That is, (infinitesimally) small steps from *x*<sub>0</sub> in direction *ν*<sub>*μ*<sub>*d*</sub></sub> (in decision variable space) lead to a movement from *F*(*x*<sub>0</sub>) in direction *d* (in objective space). It remains to select a suitable choice for *d*. Since *α* is orthogonal to the linearized Pareto front at *F*(*x*<sub>0</sub>) [1], a natural choice is hence, by (13), to take *d* orthogonal to *α*. For this, let

$$\alpha = QR = (q\_1, q\_2, \dots, q\_k) R, \tag{14}$$

where *Q* ∈ ℝ<sup>*k*×*k*</sup> is orthogonal and *R* ∈ ℝ<sup>*k*×1</sup>, be a *QR*-factorization of *α*. Then, any vector

$$d \in \text{span}\{q\_2, \dots, q\_k\} \tag{15}$$

can be chosen so that a movement in direction *ν*<sub>*μ*<sub>*d*</sub></sub> (in decision variable space) leads to a movement from *F*(*x*<sub>0</sub>) along the Pareto front. Note that the second equation in (12) reads as ∑<sub>*i*=1</sub><sup>*k*</sup> *μ<sub>i</sub>* = 0. Hence, for the special case of a bi-objective optimization problem (i.e., *k* = 2), there are, after normalization, only two choices for *μ*:

$$
\mu^{(1)} = \begin{pmatrix} -1 \\ 1 \end{pmatrix}, \quad \text{and} \quad \mu^{(2)} = \begin{pmatrix} 1 \\ -1 \end{pmatrix}. \tag{16}
$$

Analogously to Martin and Schütze [2], one can show that *μ*<sup>(1)</sup> corresponds to a "right down" movement along the Pareto front, while *μ*<sup>(2)</sup> corresponds to a "left up" movement along the Pareto front.
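The two choices in (16) can be checked numerically on a small example. The sketch below assumes the illustrative objectives *f*<sub>1</sub>(*x*) = (*x*<sub>1</sub> + 3)<sup>2</sup> + (*x*<sub>2</sub> − 2)<sup>2</sup> and *f*<sub>2</sub>(*x*) = *x*<sub>1</sub><sup>2</sup> + (*x*<sub>2</sub> + 3)<sup>2</sup>, the KKT point *x*<sub>0</sub> = (−1.5, −0.5) with *α* = (0.5, 0.5), and no equality or nearly active inequality constraints, so that (11) reduces to *W ν* = −*J*<sup>*T*</sup>*μ*. All helper names are hypothetical.

```python
def solve_2x2(M, b):
    """Solve M z = b for a 2x2 matrix M via Cramer's rule."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return ((b[0] * M[1][1] - M[0][1] * b[1]) / det,
            (M[0][0] * b[1] - b[0] * M[1][0]) / det)


def predictor_direction(J, W, mu):
    """Solve W nu = -J^T mu, i.e., system (11) without constraints."""
    rhs = tuple(-(J[0][c] * mu[0] + J[1][c] * mu[1]) for c in range(2))
    return solve_2x2(W, rhs)


def objective_change(J, nu):
    """First-order change J nu of the objective vector."""
    return tuple(J[r][0] * nu[0] + J[r][1] * nu[1] for r in range(2))


# Data at x0 = (-1.5, -0.5) with alpha = (0.5, 0.5)
J = ((3.0, -5.0), (-3.0, 5.0))   # rows: grad f1(x0)^T and grad f2(x0)^T
W = ((2.0, 0.0), (0.0, 2.0))     # 0.5 * Hessian(f1) + 0.5 * Hessian(f2)

nu_1 = predictor_direction(J, W, (-1.0, 1.0))   # mu^(1)
nu_2 = predictor_direction(J, W, (1.0, -1.0))   # mu^(2)
print(nu_1, objective_change(J, nu_1))
print(nu_2, objective_change(J, nu_2))
```

For *μ*<sup>(1)</sup> the first-order change *Jν* equals (34, −34): *f*<sub>1</sub> increases and *f*<sub>2</sub> decreases, which is the "right down" movement. For *μ*<sup>(2)</sup> the signs are reversed.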

After selecting the predictor direction *νμ*, the question is how far to step in this direction. Here, we follow the suggestion made by Hillermeier [1] and use the step size

$$t = \frac{\tau}{\|J\nu\_{\mu}\|\_2} \tag{17}$$

for a (small) value *τ* > 0 so that

$$\|F(\mathbf{x}\_0 + t\nu\_\mu) - F(\mathbf{x}\_0)\|\_2 \approx \tau. \tag{18}$$
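The effect of the step size (17) can be verified on a concrete case. The following sketch again assumes the illustrative objectives *f*<sub>1</sub>(*x*) = (*x*<sub>1</sub> + 3)<sup>2</sup> + (*x*<sub>2</sub> − 2)<sup>2</sup> and *f*<sub>2</sub>(*x*) = *x*<sub>1</sub><sup>2</sup> + (*x*<sub>2</sub> + 3)<sup>2</sup>, the point *x*<sub>0</sub> = (−1.5, −0.5), and a predictor direction *ν* = (3, −5) with *Jν* = (34, −34), which are taken as given here. The value *τ* = 0.1 is an arbitrary choice.

```python
import math


def F(x):
    """Illustrative bi-objective map (an assumption of this sketch)."""
    f1 = (x[0] + 3) ** 2 + (x[1] - 2) ** 2
    f2 = x[0] ** 2 + (x[1] + 3) ** 2
    return (f1, f2)


def step_size(J_nu, tau):
    """Step size t = tau / ||J nu||_2, cf. (17)."""
    return tau / math.hypot(*J_nu)


x0 = (-1.5, -0.5)
nu = (3.0, -5.0)        # predictor direction at x0 (assumed given)
J_nu = (34.0, -34.0)    # first-order change of F along nu
tau = 0.1

t = step_size(J_nu, tau)
x_pred = (x0[0] + t * nu[0], x0[1] + t * nu[1])
moved = math.dist(F(x_pred), F(x0))
print(t, moved)         # moved is close to tau, cf. (18)
```

The agreement in (18) is only approximate because (17) is based on the linearization of *F* at *x*<sub>0</sub>; the deviation grows with *τ* and with the curvature of the objectives.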

For the computations presented below, we make the following modification: instead of *W*<sub>*α*,*λ*,*γ*</sub>, we use the matrix

$$W\_{\alpha} := \sum\_{i=1}^{k} \alpha\_i \nabla^2 f\_i(\mathbf{x}) \in \mathbb{R}^{n \times n}. \tag{19}$$

More precisely, for the computation of *ν<sub>μ</sub>*, we use the system

$$\begin{pmatrix} W\_{\alpha} & A^T \\ A & 0 \end{pmatrix} \begin{pmatrix} \nu\_{\mu} \\ \xi \end{pmatrix} = \begin{pmatrix} -J^T \mu \\ 0 \end{pmatrix} \tag{20}$$

*Math. Comput. Appl.* **2020**, *25*, 80

and to obtain *μ<sub>d</sub>* we solve

$$\begin{pmatrix} -J W\_{\alpha}^{-1} J^T \\ 1 \; \dots \; 1 \end{pmatrix} \mu\_d = \begin{pmatrix} d \\ 0 \end{pmatrix}. \tag{21}$$

We have observed similar performance for both approaches, while using *W<sub>α</sub>* instead of *W*<sub>*α*,*λ*,*γ*</sub> has the advantage that no Hessians of the constraint functions have to be computed.

Given a predictor point

$$\tilde{\mathbf{x}}\_1 := \mathbf{x}\_0 + t \nu\_{\mu}, \tag{22}$$

the task of the upcoming corrector step is to find a KKT point *x*<sub>1</sub> that is ideally near to *x̃*<sub>1</sub>. For this, we suggest applying the multi-objective Newton method proposed in [51]. In particular, we first compute the solution (*ν̃*<sub>1</sub>, *δ̃*) of the following problem

$$\begin{aligned} \min\_{(\nu,\delta) \in \mathbb{R}^n \times \mathbb{R}} \quad & \delta \\ \text{s.t.} \quad & \nabla f\_i(\tilde{\mathbf{x}}\_1)^T \nu + \frac{1}{2} \nu^T \nabla^2 f\_i(\tilde{\mathbf{x}}\_1) \nu \le \delta, \quad i = 1, \dots, k, \\ & h\_i(\tilde{\mathbf{x}}\_1) + \nabla h\_i(\tilde{\mathbf{x}}\_1)^T \nu = 0, \quad i = 1, \dots, p. \end{aligned} \tag{23}$$

*ν̃*<sub>1</sub> is indeed the Newton direction for equality constrained MOPs as suggested in [2]. To adequately treat the involved inequalities, however, we propose to use the solution of the following problem:

$$\begin{aligned} \min\_{(\nu,\delta) \in \mathbb{R}^n \times \mathbb{R}} \quad & \delta \\ \text{s.t.} \quad & \nabla f\_i(\tilde{\mathbf{x}}\_1)^T \nu + \frac{1}{2} \nu^T \nabla^2 f\_i(\tilde{\mathbf{x}}\_1) \nu \le \delta, \quad i = 1, \dots, k, \\ & h\_i(\tilde{\mathbf{x}}\_1) + \nabla h\_i(\tilde{\mathbf{x}}\_1)^T \nu = 0, \quad i = 1, \dots, p, \\ & g\_i(\tilde{\mathbf{x}}\_1) + \nabla g\_i(\tilde{\mathbf{x}}\_1)^T \nu = 0, \quad i \in I\_c(\varepsilon). \end{aligned} \tag{24}$$

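Note that problem (24) is identical to problem (23) except that |*I<sub>c</sub>*(*ε*)| inequalities are treated as equalities at *x̃*<sub>1</sub>. In particular, we propose to add an index *i* to *I<sub>c</sub>*(*ε*) if (cf. Algorithm 1) either *g<sub>i</sub>*(*x̃*<sub>1</sub>) > *ε*, or *g<sub>i</sub>*(*x̃*<sub>1</sub>) ∈ (−*ε*, *ε*) and ∇*g<sub>i</sub>*(*x̃*<sub>1</sub>)<sup>*T*</sup>*ν̃*<sub>1</sub> > 0.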


Algorithm 1 shows the pseudo code to build the index set *I<sub>c</sub>*(*ε*) at a predictor point *x̃<sub>i</sub>*. Given the Newton direction, the Newton step can then be performed using the Armijo rule described in [51], as done in our computations. The set *I<sub>c</sub>*(*ε*) is computed only once and remains fixed during the Newton iteration in the corrector step.

Algorithm 2 shows the pseudo code of one predictor–corrector step of the PT for general (equality and inequality constrained) MOPs. For bi-objective problems, *μ* can be chosen as in (16), leading either to a "left up" or "right down" movement, as discussed above. The algorithm has to be stopped if *α* is close enough to either (1, 0)<sup>*T*</sup> or (0, 1)<sup>*T*</sup>, depending on the chosen search direction. For *k* > 2, one can use the box partition in objective space as described in [2] in order to mark the regions of the Pareto front that have already been "covered" during the run of the algorithm.

For the realization of the predictor–corrector step, several linear systems of equations have to be solved, the largest one being (20). The cost is hence *O*((*n* + *p* + *s*)<sup>3</sup>) in terms of flops and *O*((*n* + *p* + *s*)<sup>2</sup>) in terms of storage. Further, for the corrector step the SOP (7) has to be solved, which contains *k* + *p* + *s* decision variables. For the computation of the Newton direction, the SOPs (23) and (24), which both contain *n* + 1 decision variables, have to be solved in the first Newton iteration. For further Newton iterations, only SOP (24) has to be solved since the index set *I<sub>c</sub>*(*ε*) remains fixed within a corrector step. Finally, note that, if the method is realized as described above, the Hessians of all individual objectives have to be computed at each candidate solution (including at each Newton iteration). Using ideas from quasi-Newton methods, one can approximate the Hessians so that only gradient information is needed at each candidate solution, as described in [2].

**Algorithm 1** Build *I<sub>c</sub>*(*ε*)

**Require:** *x̃<sub>i</sub>*: predictor, *ν̃<sub>i</sub>*: corrector direction for (23), *ε* > 0: tolerance

**Ensure:** *I<sub>c</sub>*(*ε*): index set

1. *I* := ∅
2. **for** *j* = 1, ..., *m* **do**
3. &emsp;**if** *g<sub>j</sub>*(*x̃<sub>i</sub>*) > *ε* **then**
4. &emsp;&emsp;*I* := *I* ∪ {*j*}
5. &emsp;**else if** *g<sub>j</sub>*(*x̃<sub>i</sub>*) ∈ (−*ε*, *ε*) ∧ ∇*g<sub>j</sub>*(*x̃<sub>i</sub>*)<sup>*T*</sup>*ν̃<sub>i</sub>* > 0 **then**
6. &emsp;&emsp;*I* := *I* ∪ {*j*}
7. &emsp;**end if**
8. **end for**
9. **return** *I<sub>c</sub>*(*ε*) := *I*
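Algorithm 1 can be sketched in a few lines of Python. The function below takes precomputed constraint values and gradients at the predictor, which is a design choice of this sketch, and returns 0-based indices. The helper `constraints` uses the two circular constraints of the demonstration problem (25), rewritten in the form *g<sub>j</sub>*(*x*) ≤ 0; all names are hypothetical.

```python
def build_index_set(g_values, g_grads, nu, eps):
    """Indices (0-based) of the inequalities treated as equalities in the corrector.

    g_values[j] = g_j(x_pred) and g_grads[j] = grad g_j(x_pred), where
    g_j(x) <= 0 means feasible; nu is the Newton direction obtained from (23).
    """
    index_set = []
    for j, (g, grad) in enumerate(zip(g_values, g_grads)):
        slope = sum(a * b for a, b in zip(grad, nu))
        if g > eps:
            # Constraint is violated at the predictor.
            index_set.append(j)
        elif -eps < g < eps and slope > 0:
            # Nearly active, and nu would increase g_j to first order.
            index_set.append(j)
    return index_set


def constraints(x):
    """Values and gradients of g1, g2 from MOP (25), written as g_j(x) <= 0."""
    g = [(x[0] + 1) ** 2 + x[1] ** 2 - 4,
         (x[0] + 2) ** 2 + (x[1] + 2) ** 2 - 4]
    dg = [(2 * (x[0] + 1), 2 * x[1]),
          (2 * (x[0] + 2), 2 * (x[1] + 2))]
    return g, dg


g, dg = constraints((-2.0, 0.0))                  # on the boundary of g2
print(build_index_set(g, dg, (0.0, 1.0), 1e-4))   # direction leaves the feasible set
print(build_index_set(g, dg, (0.0, -1.0), 1e-4))  # direction points inward
```

Only in the first call is the second constraint added to the index set, since the direction (0, 1) would push the iterate out of the disc defined by *g*<sub>2</sub>.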

**Algorithm 2** Predictor–corrector step of the Pareto Tracer for general MOPs

**Require:** *x<sub>i</sub>*: current candidate solution, *τ* > 0: desired distance in objective space, *ε* > 0: tolerance

**Ensure:** *x*<sub>*i*+1</sub>: new candidate solution


As a demonstration example, we consider the problem

$$\begin{aligned} \min \quad & \begin{cases} f\_1(\mathbf{x}) = (x\_1 + 3)^2 + (x\_2 - 2)^2, \\ f\_2(\mathbf{x}) = x\_1^2 + (x\_2 + 3)^2, \end{cases} \\ \text{s.t.} \quad & g\_1(\mathbf{x}) = (x\_1 + 1)^2 + x\_2^2 \le 2^2, \\ & g\_2(\mathbf{x}) = (x\_1 + 2)^2 + (x\_2 + 2)^2 \le 2^2. \end{aligned} \tag{25}$$

Figure 1a shows the Pareto set of the above problem where the two inequalities have been left out (i.e., the line segment connecting (−3, 2)<sup>*T*</sup> and (0, −3)<sup>*T*</sup>), the sets *g<sub>i</sub>*(*x*) = 0, *i* = 1, 2, as well as the Pareto set of this problem, which is indeed the result of the PT. As starting point, we chose a point which significantly violates both constraints (and, hence, |*I<sub>c</sub>*(*ε*)| = 2 for *ε* = 1e−4). An application of the above-described Newton method leads to the point on the Pareto set with the smallest *x*<sub>1</sub>-value, which is in fact the initial point for the PT. During the run of PT, first only *g*<sub>2</sub> is "active" in the corrector step (i.e., *I<sub>c</sub>*(*ε*) = {2}), later none of the constraints (in the intersection of the Pareto fronts of the constrained and the unconstrained MOP), and finally only *g*<sub>1</sub>.

**Figure 1.** Numerical result of the PT for MOP (25).

#### **4. Numerical Results**

In this section, we further demonstrate the behavior of the PT on five benchmark problems that contain inequality constraints. For all problems, we used the quasi-Newton variant of PT, which only requires function and Jacobian information (and no Hessians). To compare the results, we also show the respective results obtained by the normal boundary intersection (NBI, [16]), the *ε*-constraint method [5], and the multi-objective evolutionary algorithm NSGA-II. For NBI and the *ε*-constraint method, we used the code that is available at [52], and for NSGA-II the implementation of PlatEMO [53]. Regrettably, no comparison to a multi-objective continuation method can be presented since none of the respective codes are publicly available. For a comparison of the PT and the method of Hillermeier on box and equality constrained MOPs, we refer to [2]. We also chose to include a comparison to the well-known NSGA-II since it is widely used and state-of-the-art for two- and three-objective problems such as those we consider here. We stress that the comparisons only show (on the first four test problems) that PT outperforms NSGA-II on these particular cases where the Pareto front consists of one connected component. For highly multi-modal functions where the Pareto set/front falls into several connected components, NSGA-II will certainly outperform the (standalone) PT. A fair comparison can only be obtained when integrating PT into a global heuristic (as, e.g., done in [41]). This is certainly an interesting task that is, however, beyond the scope of this work.

To compare the computational effort, we use the total number of function evaluations used by each algorithm on each problem. For this, each Jacobian call is counted as four function calls, assuming that the derivative is obtained via automatic differentiation [54]. To measure the quality of the approximations, we used the averaged Hausdorff distance Δ<sub>2</sub> [55–57]. Since NSGA-II has stochastic components, we applied this algorithm 10 times to each problem and present the median result (measured by Δ<sub>2</sub>).
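Both performance measures are straightforward to compute for finite point sets. The sketch below assumes the usual definition of the averaged Hausdorff distance, Δ<sub>*p*</sub>(*A*, *B*) = max(GD<sub>*p*</sub>(*A*, *B*), IGD<sub>*p*</sub>(*A*, *B*)) with averaged *p*-norm distances, and the cost model stated above (one Jacobian call counts as four function calls). The function names are hypothetical.

```python
import math


def gd_p(A, B, p=2):
    """Averaged p-norm of the distances from each point of A to the set B."""
    total = sum(min(math.dist(a, b) for b in B) ** p for a in A)
    return (total / len(A)) ** (1.0 / p)


def delta_p(A, B, p=2):
    """Averaged Hausdorff distance: max of GD_p(A, B) and IGD_p(A, B) = GD_p(B, A)."""
    return max(gd_p(A, B, p), gd_p(B, A, p))


def evaluation_cost(n_function_calls, n_jacobian_calls, jacobian_weight=4):
    """Total cost in function evaluations; each Jacobian call counts as four calls."""
    return n_function_calls + jacobian_weight * n_jacobian_calls


approximation = [(0.0, 0.0), (1.0, 0.0)]
reference = [(0.0, 0.0)]
print(delta_p(approximation, reference))   # sqrt(0.5), caused by the outlier (1, 0)
print(evaluation_cost(100, 10))            # 100 + 4 * 10 = 140
```

In practice, the reference set is a fine discretization of the true Pareto front, so a value of Δ<sub>2</sub> close to the spacing of the approximation indicates a nearly perfect result.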

#### *4.1. Binh and Korn*

Our first test example is a modification of the box-constrained BOP from Binh and Korn [58], where we add two inequality constraints as follows:

$$\begin{aligned} \min \quad & \begin{cases} f\_1(\mathbf{x}) = 4x\_1^2 + 4x\_2^2, \\ f\_2(\mathbf{x}) = (x\_1 - 5)^2 + (x\_2 - 5)^2, \end{cases} \\ \text{s.t.} \quad & (x\_1 - 2)^2 + (x\_2 - 1)^2 \le 2.3^2, \\ & (x\_1 - 3)^2 + (x\_2 - 3)^2 \ge 1.5^2, \\ & 0 \le x\_1 \le 5, \\ & 0 \le x\_2 \le 3. \end{aligned} \tag{26}$$

Table 1 shows the design parameters that have been used by NSGA-II for this problem, Table 2 shows the computational efforts and the obtained approximation quality for each algorithm, and Figures 2 and 3 show the obtained Pareto set and front approximations, respectively. For PT, we chose *τ* = 0.6, leading to 52 solutions along the Pareto set/front in 4.48 s (the computations have been done on an Ubuntu 20.04.1 LTS system with an Intel Core i7-8550U 1.80 GHz x 8 CPU and 12 GB of RAM). We then applied NBI and the *ε*-constraint method using this number of sub-problems. For NSGA-II, we took the population size 100, which is a standard value for this algorithm. The results show nearly perfect Pareto front approximations (at least from the practical point of view) for all algorithms, which is also reflected by the low Δ<sub>2</sub> values that are very close to the optimal value of 0.6 (at least for PT, defined by *τ*). In terms of function evaluations, PT clearly wins over NBI and the *ε*-constraint method. A comparison to NSGA-II is not possible due to the choice of the population size.

**Figure 2.** Results in decision space for MOP (26).

**Table 1.** Parameters used by NSGA-II for MOP (26).



**Table 2.** Computational efforts and approximation quality of the algorithms for MOP (26).

**Figure 3.** Results in objective space for MOP (26).

#### *4.2. Chankong and Haimes*

Next, we consider the bi-objective problem of Chankong and Haimes [59], which contains next to the box constraints one linear and one nonlinear inequality.

$$\begin{aligned} \min \quad & \begin{cases} f\_1(\mathbf{x}) = 2 + (x\_1 - 2)^2 + (x\_2 - 1)^2, \\ f\_2(\mathbf{x}) = 9x\_1 - (x\_2 - 1)^2, \end{cases} \\ \text{s.t.} \quad & x\_1^2 + x\_2^2 \le 225, \\ & x\_1 - 3x\_2 + 10 \le 0, \\ & -20 \le x\_1, x\_2 \le 20. \end{aligned} \tag{27}$$

Table 3 shows the parameter values used for the application of NSGA-II, Table 4 the computational efforts and the approximation qualities, and Figures 4 and 5 the obtained approximations. We used *τ* = 1 for PT, and proceeded as for the previous example for the other methods. The results are also similar to the previous example: all methods are capable of detecting a nearly perfect Pareto front approximation, and the overall cost is significantly less for PT, in 5.96 s.

**Figure 4.** Results in decision space for MOP (27).


**Table 3.** Parameters used by NSGA-II for problem (27).

**Table 4.** Computational efforts and approximation qualities for problem (27).


**Figure 5.** Results in objective space for MOP (27).

#### *4.3. Tamaki*

Next, we considered a MOP with three objectives (28):

$$\min \begin{cases} f\_1(\mathbf{x}) &= \mathbf{x}\_1 \\ f\_2(\mathbf{x}) &= \mathbf{x}\_2 \\ f\_3(\mathbf{x}) &= \mathbf{x}\_3 \\ \text{s.t.} & \mathbf{x}\_1^2 + \mathbf{x}\_2^2 + \mathbf{x}\_3^2 \ge 1 \\ & 0 \le \mathbf{x}\_1, \mathbf{x}\_2, \mathbf{x}\_3 \le 4. \end{cases} \tag{28}$$

Both the Pareto set and front for this problem are a part of the unit sphere. Table 5 shows the design parameters for NSGA-II, Table 6 shows the computational effort and the approximation quality for each algorithm, and Figure 6 shows the Pareto front approximations (the respective Pareto set approximations look identical, albeit in *x*-space). For this problem, *τ* = 0.05 was used. The implementation of the ε-constraint method did not yield a result. On the Tamaki problem, PT performs better than the other algorithms both in approximation quality and in the overall computational cost.

**Figure 6.** Results in objective space for MOP (28).

**Table 5.** Parameters used by NSGA-II for MOP (28).



**Table 6.** Computational efforts and approximation qualities for problem (28).

#### *4.4. BCS*

We next considered a second three-objective problem that contains, in addition to one inequality, a linear equality constraint:

$$\min \begin{cases} f\_1(\mathbf{x}) = (\mathbf{x}\_1 + 3)^2 + (\mathbf{x}\_2 + 3)^2 + (\mathbf{x}\_3 + 3)^2, \\ f\_2(\mathbf{x}) = (\mathbf{x}\_1 - 9)^2 + (\mathbf{x}\_2 + 5)^2 + (\mathbf{x}\_3 + 5)^2, \\ f\_3(\mathbf{x}) = (\mathbf{x}\_1 - 5)^2 + (\mathbf{x}\_2 - 8)^2 + \mathbf{x}\_3^2, \\ \text{s.t.} \quad \mathbf{x}\_1 - 2\mathbf{x}\_2 - 3\mathbf{x}\_3 = \mathbf{0}, \\ \sin(2\mathbf{x}\_1) - \mathbf{x}\_2 \le 0. \end{cases} \tag{29}$$

Table 7 presents the design parameters used by NSGA-II, Table 8 shows the computational effort and the approximation quality for each algorithm, and Figures 7 and 8 present the Pareto front approximation of PT (using *τ* = 2), which took 16.79 s. For this example, none of the other methods were able to yield feasible solutions, where we counted a solution *x* to be feasible if |*x*<sub>1</sub> − 2*x*<sub>2</sub> − 3*x*<sub>3</sub>| < 10<sup>−4</sup> and sin(2*x*<sub>1</sub>) − *x*<sub>2</sub> ≤ 10<sup>−4</sup>.
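The feasibility criterion stated above can be written out as a short check. The following Python sketch is only an illustration of the objectives and constraints of MOP (29) with the tolerance of 10<sup>−4</sup>; the function names `objectives` and `is_feasible` are hypothetical and are not taken from the authors' implementation.

```python
import math

TOL = 1e-4  # feasibility tolerance quoted in the text


def objectives(x):
    """Return (f1, f2, f3) of MOP (29) for a point x = (x1, x2, x3)."""
    x1, x2, x3 = x
    f1 = (x1 + 3) ** 2 + (x2 + 3) ** 2 + (x3 + 3) ** 2
    f2 = (x1 - 9) ** 2 + (x2 + 5) ** 2 + (x3 + 5) ** 2
    f3 = (x1 - 5) ** 2 + (x2 - 8) ** 2 + x3 ** 2
    return f1, f2, f3


def is_feasible(x, tol=TOL):
    """Check the equality and inequality constraints of MOP (29) up to tol."""
    x1, x2, x3 = x
    equality_ok = abs(x1 - 2 * x2 - 3 * x3) < tol
    inequality_ok = math.sin(2 * x1) - x2 <= tol
    return equality_ok and inequality_ok


# Illustrative points (chosen here, not taken from the paper)
for point in [(0.0, 0.0, 0.0), (2.0, 1.0, 0.0), (1.0, 0.0, 0.0)]:
    print(point, is_feasible(point), objectives(point))
```

A check of this kind could be applied to the final population of any of the compared solvers before computing quality indicators.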

**Figure 7.** Numerical result of PT in the decision space for MOP (29).

**Table 7.** Parameters used by NSGA-II for MOP (29).



**Table 8.** Computation efforts for the proposed test problem (29).

**Figure 8.** Numerical result of PT in the objective space for MOP (29).

#### *4.5. Osyczka and Kundu*

As the last example, we considered the bi-objective problem of Osyczka and Kundu [60], which has six decision variables and contains six inequality constraints in addition to the box constraints:

$$\begin{aligned} \min \begin{cases} f\_1(\mathbf{x}) &= -25(\mathbf{x}\_1 - 2)^2 - (\mathbf{x}\_2 - 2)^2 - (\mathbf{x}\_3 - 1)^2 - (\mathbf{x}\_4 - 4)^2 - (\mathbf{x}\_5 - 1)^2 \\ f\_2(\mathbf{x}) &= \sum\_{i=1}^6 \mathbf{x}\_i^2 \\ \text{s.t.} & \mathbf{x}\_1 + \mathbf{x}\_2 - 2 \ge 0 \\ & 6 - \mathbf{x}\_1 - \mathbf{x}\_2 \ge 0 \\ & 2 - \mathbf{x}\_2 + \mathbf{x}\_1 \ge 0 \\ & 2 - \mathbf{x}\_1 + 3\mathbf{x}\_2 \ge 0 \\ & 4 - (\mathbf{x}\_3 - 3)^2 - \mathbf{x}\_4 \ge 0 \\ & (\mathbf{x}\_5 - 3)^2 + \mathbf{x}\_6 - 4 \ge 0 \\ & 0 \le \mathbf{x}\_1, \mathbf{x}\_2, \mathbf{x}\_6 \le 10 \\ & 1 \le \mathbf{x}\_3, \mathbf{x}\_5 \le 5 \\ & 0 \le \mathbf{x}\_4 \le 6 \end{cases} \end{aligned} \tag{30}$$

While the Pareto front of this problem is connected, its Pareto set consists of three different connected components. Hence, PT is not able to compute an approximation of the entire Pareto front with only one starting point. Figure 9a shows the result of PT for *τ* = 2 using the three starting points

$$\begin{aligned} \mathbf{x}\_{0,1} &= \left( 0.60, 1.50, 1.0, 0.00, 1.00, 0.04 \right)^T, \\ \mathbf{x}\_{0,2} &= \left( 0.00, 2.00, 2.20, 0.00, 1.00, 0.00 \right)^T, \\ \mathbf{x}\_{0,3} &= \left( 5.00, 1.00, 5.00, 0.00, 1.00, 0.01 \right)^T. \end{aligned} \tag{31}$$

The computational time to obtain this result was 12.98 s. Figure 9b shows a numerical result of NSGA-II using the design parameters shown in Table 9. The obtained solutions "under" the Pareto front can be explained by the tolerance of 1 × 10<sup>−4</sup> that was used to measure feasibility (while 1 × 10<sup>−8</sup> was used for PT). Table 10 shows the computational effort for both methods. Needless to say, this represents by no means a comparison of the two methods. Instead, this should be rather seen as a motivation to hybridize PT with a global search strategy in order to obtain a fast and reliable multi-objective solver, which we leave for future studies.

**Table 9.** Parameters used by NSGA-II for MOP (30).


**Table 10.** Computational efforts and approximation qualities for problem (30).


**Figure 9.** Results in objective space for MOP (30).

#### **5. Conclusions and Future Work**

In this paper, we extend the multi-objective continuation method Pareto Tracer (PT) for the treatment of general inequality constraints. To this end, the predictor–corrector step is modified as follows: in the predictor, all nearly active inequalities are treated as equalities. In the subsequent corrector step, the main challenge is to identify those inequalities that are either nearly active at the predictor solution or slightly violated by it, since these have to be treated as equality constraints in the Newton method; this identification is done in a bootstrap manner. We formulate the resulting algorithm and show numerical results on several benchmark problems, together with comparisons to some other numerical methods. The results indicate that the extended PT can reliably handle general MOPs, and in particular general inequality (and equality) constraints. However, the method is, by construction, of local nature and restricted to the connected component of the solution set for which one initial solution is available. One interesting task is certainly to hybridize PT with a global solver such as a multi-objective evolutionary algorithm and to compare the resulting hybrid against other methods with respect to their ability to compute the entire global Pareto set/front of a given MOP. This is beyond the scope of this work and is left for future work.

**Author Contributions:** Conceptualization and formal analysis, O.S.; software, F.B. and O.C.; and writing and editing: F.B., O.C., and O.S. All authors have read and agreed to the published version of the manuscript.

**Funding:** The authors acknowledge support from Conacyt project No. 285599 and SEP Cinvestav project No. 231.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Prediction of Maximum Pressure at the Roofs of Rectangular Water Tanks Subjected to Harmonic Base Excitation Using the Multi-Gene Genetic Programming Method**

**Iman Bahreini Toussi 1,\*, Abdolmajid Mohammadian <sup>1</sup> and Reza Kianoush <sup>2</sup>**


**Abstract:** Liquid storage tanks subjected to base excitation can cause large impact forces on the tank roof, which can lead to structural damage as well as economic and environmental losses. The use of artificial intelligence in solving engineering problems is becoming popular in various research fields, and the Genetic Programming (GP) method is receiving more attention in recent years as a regression tool and also as an approach for finding empirical expressions between the data. In this study, an OpenFOAM numerical model that was validated by the authors in a previous study is used to simulate various tank sizes with different liquid heights. The tanks are excited in three different orientations with harmonic sinusoidal loadings. The excitation frequencies are chosen as equal to the tanks' natural frequencies so that they would be subject to a resonance condition. The maximum pressure in each case is recorded and made dimensionless; then, using Multi-Gene Genetic Programming (MGGP) methods, a relationship between the dimensionless maximum pressure and dimensionless liquid height is acquired. Finally, some error measurements are calculated, and the sensitivity and uncertainty of the proposed equation are analyzed.

**Keywords:** liquid storage tanks; base excitation; artificial intelligence; Multi-Gene Genetic Programming; computational fluid dynamics; finite volume method

#### **1. Introduction**

Earthquakes cause damage to various types of structures, and buildings, dams, reservoirs, and liquid storage tanks may be victims of an earthquake excitation. Sloshing in a liquid storage tank can cause irreversible structural failure and spillage of the liquid material into the environment, and this liquid, if toxic or flammable, may affect the area for a long time, even permanently. Thus, protecting liquid storage tanks from damage during an earthquake is crucial. One cause of such damage is the pressure exerted on the roof of the tank by the sloshing liquid. Therefore, it is necessary for a designer to know the maximum pressure caused by such effects on a tank's roof.

Analytical, numerical, and experimental solutions have been introduced by various scholars. Housner [1] provided an analytical solution that is adopted in some design codes and standards such as the ACI 350.3 from the American Concrete Institute [2]. Housner's method divides the liquid into two parts, i.e., impulsive and convective. The former is the lower part of the liquid that moves in unison with the tank walls, while the latter is the upper part of liquid that creates sloshing in a tank. The impulsive mass is assumed to be rigidly connected to the tank's walls, while the convective mass is modeled by a mass–spring system. Figure 1 illustrates Housner's model for ground-supported tanks. Despite attempts at developing analytical solutions other than Housner's method (e.g., Isaacson [3]), most previous studies have concentrated on numerical analyses. The goal of such studies is to provide a solution to the Navier–Stokes equations given in Equations (5)–(8), which are the governing equations in fluid flow. Cho and Cho [4] developed a combined finite element–boundary element (FE–BE) method to predict liquid behavior and its interaction with a structure, and Liu and Lin [5] studied a numerical model to solve 3D non-linear sloshing in a liquid storage tank. Their model adopted the volume of fluid (VOF) method for tracking a free surface in conjunction with the finite difference method (FDM). Chen et al. [6] formulated a numerical model that is based on Reynolds-averaged Navier–Stokes (RANS) fluid motion, which proved to be in good agreement with the experimental data from Daewoo Shipbuilding & Marine Engineering Co., Ltd. (DSME) [7]. The data were obtained from tests on a rectangular tank with plan dimensions of 800 mm × 400 mm and a height of 500 mm that was horizontally excited with different frequencies.

**Citation:** Bahreini Toussi, I.; Mohammadian, A.; Kianoush, R. Prediction of Maximum Pressure at the Roofs of Rectangular Water Tanks Subjected to Harmonic Base Excitation Using the Multi-Gene Genetic Programming Method. *Math. Comput. Appl.* **2021**, *26*, 6. https://doi.org/10.3390/mca26010006

Received: 16 November 2020; Accepted: 29 December 2020; Published: 2 January 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

**Figure 1.** Schematic view of Housner's simplified model.

In recent years, artificial intelligence (AI) has been evolving in all aspects of human life, including engineering problems (Afan et al. [8]). There are several methods for the estimation of a relationship between scattered data based on AI. Among them, the group method of data handling (GMDH; Ivakhnenko and Ivakhnenko [9]) and evolutionary polynomial regression (EPR) can be mentioned. AI techniques such as support vector machine (SVM), artificial neural networks (ANNs), adaptive neuro-fuzzy inference system (ANFIS), and Genetic Programming (GP) have recently been used for engineering problems such as water quality index and groundwater level modeling (e.g., Mohammadpour et al. [10]; Ghani and Azamathulla [11]; He et al. [12]; Lallahem et al. [13]; Daliakopoulos et al. [14]; Mirzavand et al. [15]; and Mohammadpour et al. [16]).

Model tree (MT)—a sub-class of the regression tree method—is another regression method in which an equation is generated at each node [17]. In a regression tree, a constant or a relatively simple regression model is used to demonstrate the data [18]. A genetic based method known as GP is also used for the regression of data. In this method, a set of sub-trees is randomly generated based on user-defined specifications using arithmetic operators (i.e., +, −, ×, /), non-linear functions (e.g., sin, cos, log), etc. [19]. The goal is to minimize the errors (e.g., root mean square error (RMSE)) in newer generations until an acceptable error is reached.

Another method for formulating scattered data based on AI is gene expression programming (GEP), which was introduced by Ferreira in 1999 (Sattar and Gharabaghi [20]). This method can be employed to develop relationships between data with minimal error [21]. Azamathulla [22] adopted this method to estimate the scour depth downstream of sills. To do so, he used the following procedure: (1) choose a fitness function; (2) choose a set of terminals (T) and functions (F) to shape chromosomes; (3) choose the chromosome architecture (i.e., head length and the number of genes); (4) choose the linking function (e.g., addition and multiplication operators); and (5) choose the set of genetic operators (e.g., mutation, transposition, etc.). He compared his results with the equation obtained by Chinnarasri and Kositgittiwong [23], which at the time had the lowest error value, and found that the proposed equation using the GEP model had a higher accuracy. Najafzadeh et al. [24] used three methods, i.e., GEP, MT, and EPR, to predict the maximum scour depth near piers with debris accumulation. Gholami et al. [25] used the GEP method to predict the characteristics of stable bank channels. They obtained their own experimental data as well as data from previous experimental studies to complete their GEP modelling. The results were compared with available theoretical and experimental methods. Despite a good agreement and accuracy, the model's complexity was found to be higher in comparison with older analytical methods, and therefore the GEP method was not suggested by the authors. Sheikh et al. [26] applied GEP to analyze shear stress distribution in circular channels with flat beds subject to sediment deposition. They proposed equations for predicting the base shear applied to the bed and the walls of such channels. It was found that the GEP model could lower errors and uncertainties, and hence the model was recommended for the base shear analysis of circular channels with flat beds.

A sub-class of the GP method known as Multi-Gene Genetic Programming (MGGP) can be used in problems with higher complexity. In this method, each gene is a GP tree, and multiple genes are combined into an MGGP model as a weighted linear combination of the tree outputs. The user has control over the maximum number of genes and the depth of the model tree [27]. AI techniques have been shown to be capable of accurate prediction, and with the development of computing systems, they have become easier to use. However, to the best of the authors' knowledge, they have not been employed in the prediction of pressures and forces in water tanks. Previous studies in engineering applications have shown promising results for MGGP in comparison with other AI techniques such as ANN, ANFIS, traditional GP, etc. (Kaydani et al. [28]; Safari and Mehr [29]; Mehr and Nourani [30]).

The use of GP methods in civil engineering is becoming increasingly popular. Gandomi et al. [31] proposed an empirical model for predicting the ultimate shear strength of reinforced concrete (RC) deep beams using GEP. The results were compared with design codes such as ACI and CSA, and the model was found to give better results than the design codes when compared to the available experimental and numerical data. Gandomi et al. [32] developed a model to find the shear capacity of RC beams without stirrups using the GEP method. To avoid overfitting, they divided the data into three groups of learning, validation, and testing on a random basis. The developed model was tested against the available data and several design codes (e.g., ACI, CSA, NZS, etc.) for various sizes and models of RC beams and was found to give compatible results. GEP can be used in various fields of civil engineering as an optimization method. Zahiri et al. [33] investigated the applications of GEP in hydraulic engineering and found it applicable in different areas, such as estimation of scour depth, discharge rate, and land transport in rivers.

In the present study, data generated by a validated OpenFOAM (Open-Source Field Operation and Manipulation) [34] model are used. The maximum pressure on the roof of a tank is the parameter of interest. Several tank sizes with various liquid heights are excited by a resonance frequency, and the maximum hydrodynamic pressure at the roof of the tank in each case is obtained. Using the GP method in both Single-Gene and Multi-Gene modes, an equation is proposed for predicting the maximum pressure at the roof of the tank. Finally, the proposed equation's reliability is investigated and discussed through error measurements as well as uncertainty and sensitivity analyses.

To the best of the authors' knowledge, a study such as this one that predicts the maximum pressure at the roof of a liquid storage tank subjected to base excitation has not been addressed previously. The design codes generally provide a minimum free-board, and if the provided free-board is not sufficient, it is left to the designers to decide how to design the roof. No further data are provided in that manner in the design codes. Furthermore, previous studies have not investigated the pressures at the roof of the tank with the intention of finding a relationship between the tank size and the maximum pressure on the roof. The available codes and standards do not provide details for designing the roof of tanks with insufficient freeboard, and they only recommend designing the roof to resist uplift pressures. Therefore, this study can provide a good estimate of those pressures and help with the design process.

The results from this study can help provide empirical formulations to appropriately estimate the hydrodynamic pressures at the roof of a liquid tank subjected to base excitations. This can be adopted in design codes and standards to better address the uplift forces and hydrodynamic pressures at the roof level. In addition, the artificial intelligence component of this research can significantly reduce computational cost and time.

Although earthquake and harmonic excitations have different characteristics, a previous study [35] found that harmonic resonance excitations can produce higher hydrodynamic pressures on the roof of a tank than earthquake excitations. For this reason, harmonic loading was applied in this study instead of earthquake excitations.

This paper is organized as follows. Section 2 deals with the details and equations of numerical modeling and MGGP. Section 3 presents the results, discussions, and error measurements, and some concluding remarks complete the study.

#### **2. Materials and Methods**

#### *2.1. Numerical Modelling*

An OpenFOAM model was previously developed and validated by the authors [35]. The same model was used to generate data for the current study. The maximum hydrodynamic pressure at the roof of rectangular tanks is the parameter of interest in this study. Hence, pressure sensors were distributed on one quarter of the roof for each simulation.

Four different tank sizes were used in the study, the dimensions of which are presented in Table 1. For each tank, a minimum of six different liquid heights were simulated, as discussed later. Since the direction of an earthquake cannot be predicted, four different tank orientations were tested, and among them, the highest roof pressure for each liquid height in each tank was found.


**Table 1.** Dimensions of tanks used in the study.

Many previous studies (e.g., [4,36,37]) have shown that Housner's simplified method [1] predicts resonance frequency accurately, and hence in this study the same method was applied.

Based on Housner's method, the resonance frequency in a rectangular tank can be calculated as follows:

$$M\_c = M \frac{\tanh(1.7\,L/h)}{1.7\,L/h} \tag{1}$$

$$k\_c = 3 \frac{M\_c^2}{M} \frac{g h}{L^2} \tag{2}$$

$$\omega\_c = \sqrt{\frac{k\_c}{M\_c}} \tag{3}$$

$$T\_c = \frac{2\pi}{\omega\_c} \tag{4}$$

where *M<sub>c</sub>* is the mass of the convective part of the liquid (*c* = convective), *M* is the total liquid mass, *L* is half of the tank length, *h* is the total liquid height, *k<sub>c</sub>* is the stiffness of the assumed spring that connects the convective mass to the tank's walls in the direction of movement, *g* is the gravitational acceleration, equal to 9.81 m/s<sup>2</sup>, and *ω<sub>c</sub>* and *T<sub>c</sub>* are the resonance frequency and resonance period of the first (fundamental) mode of the oscillating liquid, respectively. In lieu of Housner's method to determine the natural frequency of the tank, Lamb's formula can be used for simplicity [38]. In Table 2, the resonance frequencies that were applied to each tank based on the size and liquid height are presented. Each tank size–liquid height combination was simulated at four different orientations of 0°, 30°, 60°, and 90°. Since the direction of an earthquake is not predictable, the maximum pressure among all orientations was used as the input for the GP section. In other words, the maximum of maximums was found and applied to the GP. The excitation orientations of 0°, 30°, and 60° are presented in Figure 2.
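As an illustration of how Equations (1)–(4) chain together, the following Python sketch evaluates them in sequence. It is a minimal sketch under stated assumptions: the mass appearing in Equation (2) is taken to be the convective mass, as the symbol definitions suggest; the function name `housner_resonance` and the sample tank values are hypothetical and do not correspond to the entries of Table 2.

```python
import math


def housner_resonance(M, L, h, g=9.81):
    """Evaluate Eqs. (1)-(4) for a rectangular tank.

    M: total liquid mass (kg), L: half of the tank length (m),
    h: liquid height (m), g: gravitational acceleration (m/s^2).
    Returns (M_c, k_c, omega_c, T_c).
    """
    ratio = 1.7 * L / h
    M_c = M * math.tanh(ratio) / ratio            # Eq. (1)
    k_c = 3.0 * M_c ** 2 / M * g * h / L ** 2     # Eq. (2), convective mass assumed
    omega_c = math.sqrt(k_c / M_c)                # Eq. (3), rad/s
    T_c = 2.0 * math.pi / omega_c                 # Eq. (4), s
    return M_c, k_c, omega_c, T_c


# Hypothetical example: 1000 kg of liquid, half-length 0.5 m, liquid height 0.4 m
M_c, k_c, omega_c, T_c = housner_resonance(M=1000.0, L=0.5, h=0.4)
print(f"M_c = {M_c:.1f} kg, k_c = {k_c:.1f} N/m, "
      f"omega_c = {omega_c:.3f} rad/s, T_c = {T_c:.3f} s")
```

Substituting Equations (1) and (2) into Equation (3) shows that the total mass cancels, so in this formulation the resonance frequency depends only on *L*, *h*, and *g*.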


**Table 2.** Frequency applied to each tank based on the tank size and liquid height.

**Figure 2.** Tank orientations for simulations.

After finding the resonance frequency for each tank size and liquid height, numerical modelling was performed using OpenFOAM software. The OpenFOAM model can provide numerical solutions for various types of engineering problems, such as heat transfer, mass transport, liquid flow, etc. It can also solve fluid–structure interaction problems based on computational fluid dynamics (CFD) modelling [39]. Navier–Stokes equations in Equations (5)–(8) are solved for these types of problems.

$$\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} + \frac{\partial w}{\partial z} = 0 \tag{5}$$

$$\frac{\partial u}{\partial t} + u \frac{\partial u}{\partial x} + v \frac{\partial u}{\partial y} + w \frac{\partial u}{\partial z} = -\frac{1}{\rho} \frac{\partial p}{\partial x} + \nu \nabla^2 u \tag{6}$$

$$\frac{\partial v}{\partial t} + u \frac{\partial v}{\partial x} + v \frac{\partial v}{\partial y} + w \frac{\partial v}{\partial z} = -\frac{1}{\rho} \frac{\partial p}{\partial y} + \nu \nabla^2 v \tag{7}$$

$$\frac{\partial w}{\partial t} + u \frac{\partial w}{\partial x} + v \frac{\partial w}{\partial y} + w \frac{\partial w}{\partial z} = -\frac{1}{\rho} \frac{\partial p}{\partial z} + \nu \nabla^2 w - g \tag{8}$$

in which

$$
\nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2} \tag{9}
$$

and *ρ* and *p* are the liquid density (kg/m<sup>3</sup>) and total pressure (Pa), respectively; *u*, *v*, and *w* are the velocity components in the *x*, *y*, and *z* directions (m/s); *ν* is the kinematic viscosity (m<sup>2</sup>/s); *t* is time (s); and *g* = 9.81 m/s<sup>2</sup> is the gravitational acceleration, and

$$
\rho = \alpha\rho\_1 + (1 - \alpha)\rho\_2 \tag{10}
$$

where *ρ*<sub>1</sub> and *ρ*<sub>2</sub> are the densities of water and air, respectively, and *α* is the fraction of each cell's volume that is filled with water. The value of *α* varies between 0.0 and 1.0, with 1.0 meaning the cell is filled with water and 0.0 indicating air. A value of 0.5 is allocated to the free surface. Any value between 0.0 and 0.5 indicates air, and a value between 0.5 and 1.0 indicates water.
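The role of *α* in Equation (10) can be illustrated with a short Python sketch. It follows the convention that *α* = 1 denotes a water-filled cell, so the term weighted by *α* carries the water density. The density values of 1000 kg/m<sup>3</sup> and 1.2 kg/m<sup>3</sup> and the function names are assumptions made here for illustration, not values reported in the study.

```python
def mixture_density(alpha, rho_water=1000.0, rho_air=1.2):
    """Cell density from Eq. (10); alpha is the water volume fraction in [0, 1].

    The default densities (kg/m^3) are illustrative assumptions.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0.0 and 1.0")
    return alpha * rho_water + (1.0 - alpha) * rho_air


def phase(alpha):
    """Classify a cell by the 0.5 threshold described in the text."""
    if alpha > 0.5:
        return "water"
    if alpha < 0.5:
        return "air"
    return "free surface"


for a in (0.0, 0.2, 0.5, 0.8, 1.0):
    print(a, phase(a), mixture_density(a))
```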

Given the very high momentum of the flow, turbulent stresses have a negligible effect on the flow in comparison with the liquid sloshing forces, and hence, turbulence was not modeled in this study.

#### 2.1.1. Computational Setup

• Mesh

In this study, a structured cubic mesh was used. By running a mesh sensitivity analysis, the optimum mesh size was found. To do so, the pressure at the top corner of the tank was measured with various mesh sizes.

• Initial conditions

For the initial conditions, the velocity, acceleration, and displacement fields were set to zero.

• Wall boundary conditions

The "no flow, frictionless" wall boundary condition is applied to the base and the side walls of the tank. This implicit boundary condition is used when no flow crosses the wall, and the shear stress at the wall and the normal gradient of the tangential velocity are set to zero. In other words, no fluid enters or exits the boundary where this condition is applied. This boundary condition is applied as follows:

$$U\_n = 0 \tag{11}$$

$$\frac{\partial U\_{\tau}}{\partial n} = 0 \tag{12}$$

where *U<sub>n</sub>* and *U<sub>τ</sub>* are the normal and the tangential velocities of the flow, respectively, and *n* is the normal vector of the boundary.

• Free surface boundary conditions

The pressure at the free surface is set to zero, and the free surface is modelled using the volume of fluid (VoF) method according to the following equation:

$$
\frac{\partial \alpha}{\partial t} + \frac{\partial (\alpha u)}{\partial x} + \frac{\partial (\alpha v)}{\partial y} + \frac{\partial (\alpha w)}{\partial z} = 0 \tag{13}
$$

#### 2.1.2. CFD Details

In the mesh sensitivity analysis, a mesh size of 6 mm × 6 mm × 6 mm was found to be reasonably accurate. An adjustable scheme was chosen for the time-step, with a maximum step size of 0.05 s. This means each time-step is chosen based on the previous step. This helps with the accuracy of the simulation results; however, it has higher computational costs.

In the validated OpenFOAM model, an eddy viscosity of 2 × 10<sup>−4</sup> m<sup>2</sup>/s was found to provide the best results compared to the experimental data.

A total of eighteen pressure sensors (probes) are distributed on one quarter of the roof for each of the simulated tanks. Because of the long duration of the simulations, the pressure distribution on a quarter of the domain is expected to be representative of the entire roof. In addition, in this study, the quarter of the roof with the highest pressure was selected for the GP analysis. The placement of the sensors on the roof of the tank is presented in Figure 3. Using these sensors, the pressure distribution on the roofs of the tanks can be found. Figure 4 shows a sample of the CFD output; more details on the OpenFOAM model are given by Bahreini et al. [35].

**Figure 3.** Sensor arrangement at the roof of the tank.

**Figure 4.** Computational fluid dynamics (CFD) outputs for tank size 2, with 800 mm water depth at 0° orientation and time *t* = 9.50 s; (**a**) liquid surface and (**b**) pressure.

#### *2.2. Genetic Programming*

Genetic programming (GP) is a method based on artificial intelligence that can be used in optimization problems. This method can be applied in Single-Gene and Multi-Gene models. In this method, the structure of the solution is not specified at the beginning and is shaped throughout the evolution [40]. Initial chromosomes are created, and during generations of evolutions and mutations, newer chromosomes with optimized characteristics are created. These cycles continue until the maximum number of iterations is reached or until the optimization reaches a point that is close to the solution (i.e., the error is negligible). In the Single-Gene model, mutations occur to one gene, while in the Multi-Gene method, there are mutations and crossovers across several genes.

In this method, the goal is to find the best-fit expression using the fit function (Equation (14)). This function has a value between 0 and 1000, with 1000 being the fittest, i.e., with the minimum error.

$$f\_i = 1000 \frac{1}{1 + RRSE\_i} \tag{14}$$

where

$$RRSE\_i = \sqrt{\frac{\sum\_{j=1}^{n} \left(P\_{(ij)} - T\_j\right)^2}{\sum\_{j=1}^{n} \left(T\_j - \overline{T}\right)^2}}\tag{15}$$

and *i* is the index of the candidate function, *j* is the index of the data point, *n* is the number of data points, *P<sub>ij</sub>* is the value calculated for the *j*th data point by the *i*th function, *T<sub>j</sub>* is the actual value for the *j*th data point, and *T̄* is the average of the *T<sub>j</sub>* values.
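Equations (14) and (15) can be evaluated directly for a list of predictions and targets. The following Python sketch is a plain transcription of the two formulas; the function names and the sample data are hypothetical.

```python
import math


def rrse(predicted, target):
    """Root relative squared error, Eq. (15)."""
    mean_target = sum(target) / len(target)
    numerator = sum((p - t) ** 2 for p, t in zip(predicted, target))
    denominator = sum((t - mean_target) ** 2 for t in target)
    return math.sqrt(numerator / denominator)


def fitness(predicted, target):
    """Fit function of Eq. (14): 1000 for a perfect fit, lower otherwise."""
    return 1000.0 / (1.0 + rrse(predicted, target))


# Hypothetical data: a perfect model and a model that always predicts the mean
target = [1.0, 2.0, 3.0, 4.0]
print(fitness(target, target))        # perfect fit gives the maximum of 1000
print(fitness([2.5] * 4, target))     # predicting the mean gives RRSE = 1
```

Because the RRSE is normalized by the spread of the targets around their mean, a model that only predicts the mean scores exactly half of the maximum fitness.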

In GP-based methods, an initial gene or tree is randomly created, and the process starts. Several reproductions, including mutation (i.e., random changes in a gene and replacing a material with another material) and crossover (i.e., interchange of materials between the parent genes) operations, take place until the termination conditions are fulfilled.

Each gene is in a shape of a tree and consists of two types of nodes: (1) operator nodes, being mathematical operators (e.g., +, −, ×, /, power, sin, cos, log, etc.); and (2) operand nodes, which are the input variables, e.g., x1, x2, etc. (Pandey et al. [41]).

Here, an example is presented for further explanation and a better understanding. In a regression problem with two operands, *x*<sub>1</sub> and *x*<sub>2</sub> (i.e., *y* = *f*(*x*<sub>1</sub>, *x*<sub>2</sub>), where *y* depends on the two variables *x*<sub>1</sub> and *x*<sub>2</sub>), *A*<sub>1</sub> and *B*<sub>1</sub> are randomly created parent genes as follows:

$$A\_1 = (2.3 \times x\_1) - (\sin x\_2) \tag{16}$$

$$B\_1 = \left(1.1 \times x\_1^2\right) + \left(\log x\_2\right) \tag{17}$$

In a crossover process, a sub-tree of the parent gene *A*<sub>1</sub> is switched with a sub-tree of the parent gene *B*<sub>1</sub>, resulting in second-generation genes, *A*<sub>2</sub> and *B*<sub>2</sub>:

$$A\_2 = (2.3 \times x\_1) - \left(1.1 \times x\_1^2\right) \tag{18}$$

$$B\_2 = (\log x\_2) + (\sin x\_2) \tag{19}$$

And in a mutation process, a sub-tree of each of the genes *A*<sub>2</sub> and *B*<sub>2</sub> is replaced by a new randomly chosen sub-tree, creating the third-generation genes, *A*<sub>3</sub> and *B*<sub>3</sub>:

$$A\_3 = \left(1.3 \times x\_2^3\right) - \left(1.1 \times x\_1^2\right) \tag{20}$$

$$B\_3 = (\log \mathbf{x}\_2) + \left(\frac{\mathbf{x}\_1}{\mathbf{x}\_2}\right) \tag{21}$$

This sequence continues until the termination conditions are fulfilled. At the end, the two genes are combined to form the equation: 

$$Y\_i = \alpha(A\_i) + \beta(B\_i) + \mathcal{C} \tag{22}$$

which, in this three-generation example, is as follows:

$$y = \alpha\left[\left(1.3 \times \mathbf{x}\_2^3\right) - \left(1.1 \times \mathbf{x}\_1^2\right)\right] + \beta\left[\left(\log \mathbf{x}\_2\right) + \left(\frac{\mathbf{x}\_1}{\mathbf{x}\_2}\right)\right] + C \tag{23}$$

where *α* and *β* are called gene weights, and *C* is a constant bias term. The gene weights and bias term are calculated by an ordinary least-squares method. Figure 5 shows the procedure of this example in the form of MGGP trees.
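The least-squares step can be sketched as follows. This is a minimal illustration in Python, not the GPTIPS implementation: the synthetic data, the target weights (0.7, −2.0, 4.0), and the helper names `solve_linear` and `fit_gene_weights` are assumptions introduced for the example, and log is taken as the natural logarithm. The third-generation genes of Equations (20) and (21) are evaluated on random inputs, and *α*, *β*, and *C* are recovered by solving the normal equations.

```python
import math
import random

def gene_a3(x1, x2):
    """Gene A3, Equation (20)."""
    return 1.3 * x2**3 - 1.1 * x1**2

def gene_b3(x1, x2):
    """Gene B3, Equation (21); log is assumed to be the natural logarithm."""
    return math.log(x2) + x1 / x2

def solve_linear(a, b):
    """Solve a @ w = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_gene_weights(rows, y):
    """Ordinary least squares via the normal equations (G^T G) w = G^T y."""
    k = len(rows[0])
    gtg = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    gty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve_linear(gtg, gty)

# Synthetic inputs and a synthetic target with known weights (assumed values).
random.seed(0)
points = [(random.uniform(0.5, 2.0), random.uniform(0.5, 2.0)) for _ in range(50)]
rows = [[gene_a3(x1, x2), gene_b3(x1, x2), 1.0] for x1, x2 in points]
y = [0.7 * a - 2.0 * b + 4.0 for a, b, _ in rows]

alpha, beta, c = fit_gene_weights(rows, y)   # recovers the weights used above
```

The column of ones in `rows` is what produces the bias term *C* of Equation (22).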

**Figure 5.** An example of a Multi-Gene Genetic Programming (MGGP) procedure.

In the current study, using MATLAB, an open-source MGGP algorithm (Genetic Programming Toolbox for the Identification of Physical Systems; GPTIPS) [42] is run to provide the general shape of the prediction function. In this algorithm, there is a random initial assumption for the function; then, the function is developed through generations until the error is minimized. Finally, using non-linear least-squares optimization, an optimized equation is obtained that can be used for further analysis, as described in the following. This algorithm uses Pareto theory to find a balance between the fitness and complexity of the model in order to select the optimum model.

Figure 6 shows an example output tree of the MGGP algorithm. In this tree, the operators plus (+), minus (−), division (/), and multiplication (×) are used. The tree depth in this example is 12, and it has a total of 35 nodes.

In this method, chromosomes are introduced as computer programs of different shapes and sizes, with each consisting of sub-programs called genes, i.e., each chromosome is composed of genes. A typical GP method procedure is as follows:


This cycle continues until the function that best fits the data is found. For this study, from a total of 25 samples, 80% (20 samples) were used to train the model while 20% (5 samples) were used for testing (i.e., for validating the model). The training data are expected to show higher accuracy and smaller errors since the model is directly obtained from this set of data. The training and test samples were chosen at random.
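A random 80/20 split of this kind could be produced as in the following sketch. The function name, the seed, and the use of integer sample indices are illustrative assumptions; the actual split used in the study is not reproduced here.

```python
import random

def split_samples(samples, train_fraction=0.8, seed=None):
    """Shuffle the samples and split them into training and test subsets."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]

# 25 sample indices, split into 20 training and 5 test samples.
train, test = split_samples(range(25), train_fraction=0.8, seed=42)
```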

**Figure 6.** MGGP example output.

#### **3. Results and Discussion**

#### *3.1. Numerical Modelling*

Following the completion of the simulations, the results were analyzed for each case. At this stage, contours illustrating the maximum pressure distribution (not at a specific timestep but over the simulation time) on the roof are plotted for each simulation. Contours associated with the 755 mm × 300 mm tank are presented in Figures 7–9. In the figures, the bottom left represents the center of the roof with dimension of (0, 0), while the top right shows the corner (375.5, 150). The results from the numerical models show that in 46 out of 67 simulations (67%), the maximum pressure on the roof of the tank occurs at the corner.

**Figure 7.** Pressure distribution at the roof of the 755 mm × 300 mm tank, 0° orientation.

**Figure 8.** Pressure distribution at the roof of the 755 mm × 300 mm tank, 30° orientation.

**Figure 9.** Pressure distribution at the roof of the 755 mm × 300 mm tank, 60° orientation.

To find a relationship to predict the maximum pressure for any tank size with any liquid height, the pressure and liquid height need to be dimensionless. It should be noted that the dimensionless maximum pressure needs to consider all factors that might affect the value of the pressure, and hence, the dimensionless pressure and dimensionless liquid height can be calculated by Equations (24) and (25):

$$P\_d = \frac{P\_{max}}{\frac{\left(a\rho.h.L.H\right)}{\left(Fb\right)^2}}\tag{24}$$

$$h\_d = \frac{h}{L} \tag{25}$$

where *Pd* is the dimensionless pressure, *Pmax* is the maximum pressure on the roof, *a* is the maximum acceleration of the harmonic excitation, *ρ* is the density of water, *h* is the liquid height in the tank, *H* is the height of the tank, *L* is half of the length of the tank (i.e., the tank's length is 2*L*), and *Fb* is the available freeboard. The parameters *a* and *Fb* can be calculated by Equations (26) and (27):

$$a = A.\omega\_i^2\tag{26}$$

$$Fb = H - h \tag{27}$$

In Equation (26), *A* is the displacement amplitude of the harmonic motion. In Figure 10, the dimensionless maximum pressure is plotted against the dimensionless liquid height in a scatter graph. It should be noted that the results presented in this study are valid for cases when the sloshing height exceeds the wall height, which then generates pressure on the roof of a tank.
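The non-dimensionalization of Equations (24)–(27) can be sketched in Python as follows. The function and argument names, as well as the numerical values in the example, are assumptions chosen for illustration and do not correspond to a specific simulation of this study. SI units are assumed throughout.

```python
def dimensionless_pressure(p_max, amplitude, omega, rho, h, tank_height, half_length):
    """Dimensionless maximum roof pressure P_d, Equation (24)."""
    a = amplitude * omega**2          # maximum acceleration, Equation (26)
    fb = tank_height - h              # available freeboard, Equation (27)
    if fb <= 0:
        raise ValueError("freeboard must be positive (liquid below the roof)")
    scale = a * rho * h * half_length * tank_height / fb**2
    return p_max / scale

def dimensionless_height(h, half_length):
    """Dimensionless liquid height h_d, Equation (25)."""
    return h / half_length

# Illustrative values only (assumed): pressures in Pa, lengths in m, omega in rad/s.
p_d = dimensionless_pressure(p_max=2000.0, amplitude=0.005, omega=6.0,
                             rho=1000.0, h=0.2, tank_height=0.3,
                             half_length=0.3775)
h_d = dimensionless_height(h=0.2, half_length=0.3775)
```

The scaling term has units of pressure, so *Pd* is dimensionless as intended.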

#### *3.2. Genetic Programming*

The GPTIPS algorithm allows the user to choose between Single-Gene and Multi-Gene solutions. Single-Gene is the more traditional form of GP and results in simpler equations. Although the Multi-Gene process is more complex, it may lead to solutions with higher accuracy. In this study, the default crossover and mutation coefficients were used as follows: probability of Multi-Gene GP tree crossover = 0.85, probability of Multi-Gene GP tree mutation = 0.1, and probability of Multi-Gene GP tree direct copy = 0.05.

In this section, both Single-Gene and Multi-Gene solutions are examined and explained, and the results are presented.

#### 3.2.1. Single-Gene Solution

In the Single-Gene solution, the procedure is simple. There is only one gene and a bias term; hence, there is no crossover of sub-trees. Mutations, however, occur in this solution. The equation obtained from the GPTIPS algorithm in the Single-Gene mode is presented in Equation (28):

$$P\_{d,S} = 4.6489 - \frac{12.498 \times \ln(h\_d)}{h\_d^3 + 0.0534} \tag{28}$$

Here, *Pd,S* is the dimensionless maximum pressure obtained by the Single-Gene solution. To obtain this equation, the algorithm was set to have 200 generations, with a population size of 300. The maximum tree depth was set to 4, and the operators plus, minus, multiply, divide, and log (which in MATLAB is the natural logarithm, i.e., ln) were used. This equation was obtained in generation 184. It should be noted that for the simulated tanks, *hd* (i.e., the dimensionless liquid height) has a value between 0.3179 and 3.2211, and hence the results are valid for tanks with a dimensionless liquid height in that range. Since this relationship is obtained based on the maximum pressure in all tank orientations, it is not affected by the angle of tank orientation. Figure 11 presents the complexity of the model plotted against its accuracy level (1 − R<sup>2</sup>) for the population on the training set of data. In this figure, green dots represent Pareto models, and blue dots represent non-Pareto models. The green dot with a red circle shows the best model in terms of R<sup>2</sup> on the training data.

**Figure 11.** Expressional complexity of the proposed Single-Gene model.

#### 3.2.2. Multi-Gene Solution (MGGP)

In this step, the algorithm is modified to use multiple genes. This mode has both crossover and mutation processes. The following equation (Equation (29)) is obtained from the Multi-Gene procedure.

$$P\_{d,M} = 5.1961 + \frac{\left(2.383h\_d^3 - 16.846h\_d^2 + 17.484h\_d - 3.402\right)}{h\_d^5} \tag{29}$$

In this equation, *Pd,M* is the dimensionless maximum roof pressure obtained by the Multi-Gene program. The number of generations was set to 500 with a population of 300. Equation (29) was obtained in generation 473. This equation is composed of the following genes:

$$\text{Gene 1}: \ -\frac{0.920h\_d + 3.402}{h\_d^5} \tag{30}$$

$$\text{Gene 2}: \frac{2.383h\_d^2 - 16.85h\_d + 18.4}{h\_d^4} \tag{31}$$

A bias term equal to 5.196 was obtained. Figure 12 presents the complexity of the model plotted against its accuracy level (1 − R<sup>2</sup>) for the population on the training set of data.
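As a consistency check, the sum of the two genes and the bias term can be compared numerically with Equation (29). The sketch below is an illustration in Python; the function names and the sampled values of *hd* are assumptions, and small differences are expected because the gene coefficients are reported with fewer digits than those of Equation (29).

```python
def eq29(h_d):
    """Multi-Gene model, Equation (29)."""
    numerator = 2.383 * h_d**3 - 16.846 * h_d**2 + 17.484 * h_d - 3.402
    return 5.1961 + numerator / h_d**5

def gene1(h_d):
    """Gene 1, Equation (30)."""
    return -(0.920 * h_d + 3.402) / h_d**5

def gene2(h_d):
    """Gene 2, Equation (31)."""
    return (2.383 * h_d**2 - 16.85 * h_d + 18.4) / h_d**4

def from_genes(h_d, bias=5.196):
    """Bias term plus the two genes, as reported in the text."""
    return bias + gene1(h_d) + gene2(h_d)

# Relative difference at a few points inside the validity range of h_d.
samples = (0.3179, 0.5, 1.0, 2.0, 3.2211)
relative_differences = [abs(eq29(h) - from_genes(h)) / abs(eq29(h)) for h in samples]
```

Placing both genes over the common denominator *hd*<sup>5</sup> gives the numerator 2.383*hd*<sup>3</sup> − 16.85*hd*<sup>2</sup> + 17.48*hd* − 3.402, which agrees with Equation (29) up to rounding of the coefficients.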

**Figure 12.** Expressional complexity of the proposed Multi-Gene model.

The reason for having a different number of maximum generations for the GP and MGGP models is that for the GP model, the optimum equation was found in the 184th generation, and for the MGGP model it was in the 473rd generation. Therefore, while the 200 maximum generations sufficed for the GP model, the MGGP model required a higher number of maximum generations. These numbers were chosen on a trial and error basis, starting from 100 generations until the optimum equation was obtained at a generation smaller than the maximum number of generations. This could ensure that the obtained equation was the optimal one.

#### 3.2.3. Error Estimations

In this section, some error measures of the Single-Gene and Multi-Gene models are presented and compared. These measurements can help determine the accuracy of the presented models and the choice of each option. Errors were measured for both Single-Gene and Multi-Gene programs on the trained and tested data and were finally compared against each other.

#### a. R-Squared (R2)

In this section, the calculated dimensionless maximum pressures (based on Equations (28) and (29) for the Single-Gene and Multi-Gene solutions, respectively) are plotted against the observed dimensionless maximum pressures in Figure 13a,b. R<sup>2</sup> is calculated as

$$R^2 = 1 - \frac{\sum (P\_d - P\_{d,GP})^2}{\sum \left(P\_d - \overline{P\_d}\right)^2} \tag{32}$$

where *Pd* is the observed dimensionless maximum pressure, *Pd*,*GP* is the dimensionless maximum pressure obtained by the GP model, and $\overline{P\_d}$ is the average of the observed dimensionless maximum pressures. Table 3 presents the R<sup>2</sup> values for the Single-Gene and Multi-Gene solutions.

**Figure 13.** Observed dimensionless maximum pressure plotted against the dimensionless maximum pressure obtained by (**a**) Single-Gene procedure and (**b**) Multi-Gene procedure, for the overall data sets.



#### b. Root Mean Squared Error (RMSE)

The standard deviation of the residuals, known as the root mean squared error (RMSE), is another error measure. It shows how closely the data are concentrated around the regression curve. RMSE is calculated based on the following equation:

$$RMSE = \sqrt{\frac{\sum (P\_d - P\_{d,GP})^2}{N}} \tag{33}$$

where *N* is the number of observed data, which in this study is 20 for the trained data set, 5 for the test data set, and 25 for the overall data. RMSE has the same dimensions as the original data. In this case, since the input data set is dimensionless, the RMSE is also dimensionless. RMSE values for each of the data sets are presented in Table 4.


**Table 4.** Error estimates.

#### c. Mean Absolute Deviation (MAD)

Mean absolute deviation (MAD) is a tool for showing the scatteredness of data around the mean. It can be measured by the following equation:

$$MAD = \frac{\sum |P\_d - \overline{P\_d}|}{N} = \overline{|P\_d - \overline{P\_d}|} \tag{34}$$

The MAD measurements for each data set are presented in Table 3.

#### d. Mean Absolute Error (MAE)

Mean absolute error (MAE) is the average of the absolute values of the difference between the observed and calculated data. In other words,

$$MAE = \frac{\sum \left| P\_d - P\_{d,GP} \right|}{N} = \overline{\left| P\_d - P\_{d,GP} \right|}\tag{35}$$

The MAE values are presented in Table 4.

#### e. Mean Absolute Percentage Error (MAPE)

This error measures the accuracy of the model as a percentage and is calculated as follows:

$$MAPE = \frac{1}{N} \sum \left| \frac{P\_d - P\_{d,GP}}{P\_d} \right| \times 100\% \tag{36}$$

The MAPE values found in this study for different data sets of Single-Gene and Multi-Gene modes are presented in Table 4.
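For reference, the error measures of Equations (32), (33), (35), and (36) can be computed as in the following Python sketch. The function names and the four-point data set are illustrative assumptions and are unrelated to the values in Tables 3 and 4; MAPE is assumed to use the absolute value of each relative error, as its name indicates.

```python
import math

def r_squared(observed, predicted):
    """Coefficient of determination, Equation (32)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def rmse(observed, predicted):
    """Root mean squared error, Equation (33)."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def mae(observed, predicted):
    """Mean absolute error, Equation (35)."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

def mape(observed, predicted):
    """Mean absolute percentage error in percent, Equation (36)."""
    n = len(observed)
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(observed, predicted)) / n

# Illustrative observed and predicted values (assumed, not from the study).
observed = [4.0, 5.0, 10.0, 20.0]
predicted = [5.0, 5.0, 8.0, 20.0]

metrics = {
    "R2": r_squared(observed, predicted),
    "RMSE": rmse(observed, predicted),
    "MAE": mae(observed, predicted),
    "MAPE": mape(observed, predicted),
}
```

For this small example, the residuals are −1, 0, 2, and 0, which gives MAE = 0.75 and MAPE = 11.25%.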

#### f. Akaike Information Criterion (AIC):

The results were also compared by means of the Akaike information criterion (AIC), calculated with the following equation [43]:

$$\text{AIC} = \text{N} \times \log(\sqrt{\text{RMSE}}) + 2\text{k} \tag{37}$$

where k is the number of optimized coefficients. The results are presented in Table 4. The value of the AIC can help compare the complexity and the accuracy of the models at the same time [44]. The results show that when combined, the simplicity and accuracy of the two models (i.e., Single-Gene and Multi-Gene methods) are very close, and there is a difference of 3.1%, 13%, and 1.2% between the Single-Gene and Multi-Gene models for the trained, test, and overall data sets, respectively.

#### g. Performance Index (PI):

In addition to error estimates, evaluating the model performance is helpful in the comparison of different models. The performance index (PI) can be used for this purpose as follows [45]:

$$\text{PI} = \frac{\text{RRMSE}}{\text{R} + 1} \tag{38}$$

$$\text{RRMSE} = \frac{\text{RMSE}}{|\overline{P\_d}|} \tag{39}$$

$$\text{R} = \frac{\sum \left( P\_d - \overline{P\_d} \right) \left( P\_{d,GP} - \overline{P\_{d,GP}} \right)}{\sqrt{\sum \left( P\_d - \overline{P\_d} \right)^{2} \sum \left( P\_{d,GP} - \overline{P\_{d,GP}} \right)^{2}}} \tag{40}$$

where RRMSE is relative root mean square error and R is the correlation coefficient. The lower the PI, the more precise the model. The results of the PI are presented in Table 4. The results show that in all data sets—i.e., test, trained, and overall—the Multi-Gene model has a lower PI, and therefore it is a more precise model than the Single-Gene model.
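A compact way to evaluate Equations (38)–(40) is sketched below in Python. The function name and the sample data are assumptions made for illustration; the sketch only shows that a closer fit yields a lower PI, and it does not reproduce the values of Table 4.

```python
import math

def performance_index(observed, predicted):
    """Performance index PI = RRMSE / (R + 1), Equations (38)-(40)."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)
    rrmse = rmse / abs(mean_obs)                                  # Equation (39)
    covariance = sum((o - mean_obs) * (p - mean_pred)
                     for o, p in zip(observed, predicted))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_pred = sum((p - mean_pred) ** 2 for p in predicted)
    r = covariance / math.sqrt(var_obs * var_pred)                # Equation (40)
    return rrmse / (r + 1.0)                                      # Equation (38)

# Illustrative data (assumed): one close fit and one poor fit.
observed = [4.0, 5.0, 10.0, 20.0]
close_fit = [5.0, 5.0, 8.0, 20.0]
poor_fit = [8.0, 2.0, 14.0, 15.0]

pi_close = performance_index(observed, close_fit)
pi_poor = performance_index(observed, poor_fit)
```

Because R lies between −1 and 1, the denominator stays between 0 and 2, so a perfect fit (RMSE = 0, R = 1) gives PI = 0.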

The error measurements demonstrate that the Multi-Gene method provides relatively more accurate results than the Single-Gene method; however, a more complicated formula is required. It is suggested that in situations where a rough estimate is needed, the Single-Gene method can lead to a reasonable answer in a shorter time with less computational cost, but when a more accurate answer is required, the Multi-Gene formula is recommended.

The error estimates show that the test data sets in both Single-Gene and Multi-Gene models have a lower R2 and higher MAPE, which can be indicators of higher errors and overfitting of the model. However, the RMSE and MAE values provide comparable results for the test and trained data sets with fewer errors. In other words, two of the four error indicators show better results in test data sets, while the other two may indicate overfitting. Given the circumstances, the results for both Single-Gene and Multi-Gene models are reasonably acceptable.

#### *3.3. Uncertainty Analysis and Confidence Bands*

After finding the equation, its credibility needs to be investigated and verified by uncertainty and sensitivity analyses.

A Monte Carlo analysis was also performed for the uncertainty analysis of the resulting equation. The objective of this analysis is to calculate the uncertainty of the final function. To do so, 1,000,000 random inputs of *hd* were generated in the range of 0.3179 to 3.2211. Then, the equation was run for each random number. To generate random data with normal-shaped distribution in a specific range, a truncated Gaussian function was used. The histogram of the generated data using the truncated Gaussian function is shown in Figure 14.

**Figure 14.** Histogram of generated random inputs created with truncated Gaussian function.

These random numbers were then put into the GP model, and 1,000,000 values for *Pd*, namely *Pmc*, were calculated. The mean absolute deviation (MAD) was calculated around the average using Equation (41):

$$MAD = \frac{1}{n} \sum\_{i=1}^{n} |P\_{mc\_i} - P\_{avg}| \tag{41}$$

where *n* is the number of samples (i.e., *n* = 1,000,000 in this case) and *Pavg* is the average of the pressures calculated by the Monte Carlo simulation [20], thus leading to

 $MAD\_{SG} = 11.718$  and  $MAD\_{MG} = 11.4728$ 

This can be used to calculate the uncertainty percentage of the function by using the following equation [20]:

$$
U = 100 \times \frac{\text{MAD}}{P\_{\text{avg}}} \tag{42}
$$

The above leads to

$$U\_{SG} = 100 \times \frac{11.718}{10.7485} = 109.02 \text{ and } U\_{MG} = 100 \times \frac{11.4728}{11.430} = 100.3738$$

where *USG* and *UMG* are the uncertainty percentages for the Single-Gene and Multi-Gene equations, respectively. Due to the high slope of the graph of the equation in the beginning, these amounts of uncertainty are reasonable.
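The Monte Carlo procedure of Equations (41) and (42) can be outlined with the following Python sketch for the Single-Gene model. Several elements are assumptions, since they are not specified in the text: the mean and standard deviation of the truncated Gaussian, the rejection-sampling approach, the reduced sample size, and all function names. The resulting percentage therefore depends on these choices and is not expected to match the values reported above.

```python
import math
import random

H_MIN, H_MAX = 0.3179, 3.2211   # validity range of h_d

def p_single_gene(h_d):
    """Single-Gene model, Equation (28)."""
    return 4.6489 - 12.498 * math.log(h_d) / (h_d**3 + 0.0534)

def truncated_gauss(rng, mu, sigma, low, high):
    """Draw from a Gaussian restricted to [low, high] by rejection sampling."""
    while True:
        x = rng.gauss(mu, sigma)
        if low <= x <= high:
            return x

def uncertainty_percent(model, n, mu, sigma, seed=0):
    """Uncertainty U = 100 * MAD / P_avg, Equations (41) and (42)."""
    rng = random.Random(seed)
    pressures = [model(truncated_gauss(rng, mu, sigma, H_MIN, H_MAX))
                 for _ in range(n)]
    p_avg = sum(pressures) / n
    mad = sum(abs(p - p_avg) for p in pressures) / n
    return 100.0 * mad / p_avg

# Assumed distribution parameters: centred on the range, sigma = range / 6.
u_sg = uncertainty_percent(p_single_gene, n=20000,
                           mu=(H_MIN + H_MAX) / 2.0,
                           sigma=(H_MAX - H_MIN) / 6.0)
```

A distribution that places more samples near the lower end of the range, where Equation (28) is steep, would lead to a larger uncertainty percentage.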

Confidence bands of the graph are then obtained using a 2nd-order approach in the calculation of the Jacobian Matrix with the central difference scheme. The MATLAB internal function "nlpredci" (non-linear regression prediction confidence intervals) is used. This function can provide the user with 95% confidence band widths of the given equation. According to Dolan et al. [46], this function gives a symmetric confidence interval at each point; hence, the two confidence bands have the same distance from the main equation. The 95% confidence bands for Equations (28) and (29) are plotted in Figure 15a,b, respectively. The average confidence band width for Equation (28) (i.e., Single-Gene mode) is 20.54, and for Equation (29) (i.e., Multi-Gene mode) is 15.27.

**Figure 15.** Graph of the proposed equation for dimensionless pressure plotted against dimensionless liquid height with 95% confidence bounds for (**a**) Single-Gene and (**b**) Multi-Gene modes.

#### *3.4. Sensitivity Analysis*

For the sensitivity analysis, a 10% perturbation is applied to an input value of the equation (here, the mean), and the perturbation in the outcome is calculated. The calculations are presented in Equations (43)–(45):

$$h\_{dp} = 1.1 \times h\_{dm} \tag{43}$$

$$
\Delta P\_d = \frac{\left|P\_{dp} - P\_{dm}\right|}{P\_{dm}}\tag{44}
$$

$$S\_n = \frac{\Delta P\_d}{0.1} \tag{45}$$

where *hdp* is the 10% perturbed mean dimensionless liquid height, *hdm* is the actual mean dimensionless liquid height, Δ*Pd* is the relative perturbation that appears in the dimensionless pressure due to the 10% perturbation in the dimensionless liquid height, *Pdp* is the value of the dimensionless pressure at the perturbed dimensionless liquid height (*hdp*), *Pdm* is the value of the dimensionless pressure at the mean dimensionless liquid height (*hdm*), both calculated based on Equations (28) and (29) for the Single-Gene and Multi-Gene modes, and *Sn* is the normal sensitivity of those equations.

This leads to a sensitivity of *Sn*,*SG* = 0.258, or a 25.8% sensitivity, for the Single-Gene solution and *Sn*,*MG* = 0.116, or an 11.6% sensitivity, for the Multi-Gene solution.
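Equations (43)–(45) translate directly into code, as in the following Python sketch. The function names are assumptions, and the value used for the mean dimensionless liquid height is a hypothetical placeholder because the actual mean of the data set is not restated here; the computed sensitivities therefore illustrate the procedure rather than reproduce the values above.

```python
import math

def p_single_gene(h_d):
    """Single-Gene model, Equation (28)."""
    return 4.6489 - 12.498 * math.log(h_d) / (h_d**3 + 0.0534)

def p_multi_gene(h_d):
    """Multi-Gene model, Equation (29)."""
    numerator = 2.383 * h_d**3 - 16.846 * h_d**2 + 17.484 * h_d - 3.402
    return 5.1961 + numerator / h_d**5

def normal_sensitivity(model, h_dm, perturbation=0.1):
    """Normal sensitivity S_n, Equations (43)-(45)."""
    h_dp = (1.0 + perturbation) * h_dm          # Equation (43)
    p_dm = model(h_dm)
    p_dp = model(h_dp)
    delta_p = abs(p_dp - p_dm) / p_dm           # Equation (44)
    return delta_p / perturbation               # Equation (45)

h_dm = 1.5   # hypothetical mean dimensionless liquid height (assumed)
s_sg = normal_sensitivity(p_single_gene, h_dm)
s_mg = normal_sensitivity(p_multi_gene, h_dm)
```

As a sanity check of the definition, a model that is proportional to its input has a normal sensitivity of exactly 1, since a 10% change in the input then produces a 10% change in the output.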

#### **4. Conclusions**

The purpose of this study was to develop an empirical equation for the maximum pressure at the roofs of liquid storage tanks. To do so, a previously validated OpenFOAM model was used to generate the data. The data included the maximum pressure at the roof. Various tank sizes with different liquid heights were modeled, and harmonic sinusoidal base excitations with resonance frequencies were applied to the tanks. To consider the effect of bi-directional excitation, the tanks were shaken in three different orientations. Pressure sensors were distributed on one quarter of the roof, and the maximum pressure at each sensor was recorded.

Using the GP method, a relationship between the dimensionless liquid height and the dimensionless maximum pressure was obtained in both Single-Gene and Multi-Gene modes (Equations (28) and (29)). Using multiple error measures, the two equations were tested, and the results were compared. These results show that the outputs of the equations are in good agreement with the ones obtained by CFD modelling. Uncertainty analyses of the equations were conducted using the Monte Carlo method, leading to reasonable values given that both functions have an ascending shape with a high slope in the beginning of their domains. In addition, the 95% confidence bands for the equation were drawn.

It can be concluded that the use of AI techniques combined with CFD is helpful in predicting the maximum pressure at the roof of a base-excited tank. Further investigation on this aspect is currently in progress by the authors.

**Author Contributions:** Conceptualization, A.M. and R.K.; data curation, I.B.T.; formal analysis, I.B.T.; funding acquisition, A.M. and R.K.; investigation, I.B.T.; methodology, A.M. and R.K.; project administration, A.M. and R.K.; resources, I.B.T.; software, I.B.T. and A.M.; supervision, A.M. and R.K.; validation, I.B.T.; visualization, I.B.T.; writing—original draft, I.B.T.; writing—review and editing, A.M. and R.K. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by Natural Sciences and Engineering Research Council of Canada (NSERC).

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Article* **Chaotic Multi-Objective Simulated Annealing and Threshold Accepting for Job Shop Scheduling Problem**

**Juan Frausto-Solis <sup>1</sup>, Leonor Hernández-Ramírez <sup>1</sup>, Guadalupe Castilla-Valdez <sup>1</sup>, Juan J. González-Barbosa <sup>1</sup> and Juan P. Sánchez-Hernández <sup>2</sup>**


**Abstract:** The Job Shop Scheduling Problem (JSSP) has enormous industrial applicability. This problem refers to a set of jobs that should be processed in a specific order using a set of machines. For the single-objective optimization JSSP problem, Simulated Annealing is among the best algorithms. However, in Multi-Objective JSSP (MOJSSP), these algorithms have barely been analyzed, and the Threshold Accepting Algorithm has not been published for this problem. It is worth mentioning that the researchers in this area have not reported studies with more than three objectives, and the number of metrics they used to measure their performance is less than two or three. In this paper, we present two MOJSSP metaheuristics based on Simulated Annealing: Chaotic Multi-Objective Simulated Annealing (CMOSA) and Chaotic Multi-Objective Threshold Accepting (CMOTA). We developed these algorithms to minimize three objective functions and compared them using the HV metric with the recently published algorithms, MOMARLA, MOPSO, CMOEA, and SPEA. The best algorithm is CMOSA (HV of 0.76), followed by MOMARLA and CMOTA (with HV of 0.68), and MOPSO (with HV of 0.54). In addition, we show a complexity comparison of these algorithms, showing that CMOSA, CMOTA, and MOMARLA have a similar complexity class, followed by MOPSO.

**Keywords:** JSSP; CMOSA; CMOTA; chaotic perturbation

### **1. Introduction**

The Job Shop Scheduling Problem (JSSP) has enormous industrial applicability. This problem consists of a set of jobs, formed by operations, which must be processed in a set of machines subject to constraints of precedence and resource capacity. Finding the optimal solution for this problem is too complex, and so it is classified in the NP-hard class [1,2]. On the other hand, the JSSP foundations provide a theoretical background for developing efficient algorithms for other significant sequencing problems, which have many production systems applications [3]. Furthermore, designing and evaluating new algorithms for JSSP is relevant not only because it represents a big challenge but also for its high industrial applicability [4].

There are several JSSP taxonomies; one of which is single-objective and multi-objective optimization. The single-objective optimization version has been widely studied for many years, and the Simulated Annealing (SA) [5] is among the best algorithms. The Threshold Accepting (TA) algorithm from the same family is also very efficient in this area [6]. In contrast, in the case of Multi-Objective Optimization Problems (MOOPs), both algorithms for JSSP and their comparison are scarce.

Published JSSP algorithms for MOOP include only a few objectives, and only a few performance metrics are reported. However, it is common for industrial scheduling requirements to have several objectives, and then the Multi-Objective JSSP (MOJSSP) becomes an even more significant challenge. Thus, many industrial production areas require the multi-objective approach [7,8].

**Citation:** Frausto-Solis, J.; Hernández-Ramírez, L.; Castilla-Valdez, G.; González-Barbosa, J.J.; Sánchez-Hernández, J.P. Chaotic Multi-Objective Simulated Annealing and Threshold Accepting for Job Shop Scheduling Problem. *Math. Comput. Appl.* **2021**, *26*, 8. https://doi.org/10.3390/mca26010008

Received: 26 September 2020 Accepted: 8 January 2021 Published: 12 January 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

In single-objective optimization, the goal is to find the optimal feasible solution of an objective function, that is, the best value of the variables that fulfills all the constraints of the problem. On the other hand, for MOJSSP, the problem is to find the optimum of a set of objective functions *f*1(*x*), *f*2(*x*)... *fn*(*x*) depending on a set of variables *x* and subject to a set of constraints defined by these variables. Finding a single optimal solution is usually impossible because improving some objective functions may worsen the other objectives of the problem. In MOOP, a preference relation or Pareto dominance relation produces a set of solutions commonly called the Pareto optimal set [9]. The Decision Makers (DMs) should select from the Pareto set the solution that satisfies their preferences, which can be subjective, based on experience, or will most likely be influenced by the industrial environment's needs [10]. Therefore, the DM needs to have a Pareto front that contains multiple representative compromise solutions, which exhibit both good convergence and diversity [11].
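The Pareto dominance relation for minimization problems can be illustrated with the following Python sketch. The function names and the candidate objective vectors are hypothetical and serve only to show how dominated solutions are filtered out; this is not the procedure of any specific algorithm discussed in this paper.

```python
def dominates(a, b):
    """Return True if objective vector a Pareto-dominates b (minimization)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better

def pareto_front(points):
    """Keep the points that are not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical objective vectors with three objectives to be minimized.
candidates = [(10, 5, 30), (12, 4, 28), (11, 6, 31), (9, 7, 35)]
front = pareto_front(candidates)
```

In this example, (11, 6, 31) is dominated by (10, 5, 30) because it is worse in all three objectives, whereas the remaining three vectors are mutually non-dominated and form the Pareto front.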

In the study of single-objective JSSP, many algorithms have been applied. Some of the most common are SA, Genetic Algorithms (GAs), Tabu Search (TS), and Ant Systems (ASs) [12]. For MOJSSP, however, as we discuss below, few works in the literature solve instances with more than two objectives or apply more than two metrics to evaluate performance; the number of objectives and performance metrics remains too small [8,13–15]. The works of Zhao [14] and Mendez [8] are exceptions because the authors have presented implementations with two or three significant objective functions and two performance metrics. Moreover, SA and TA have been shown to be very efficient for solving NP-hard problems. Thus, the motivation of this paper is to develop new efficient SA algorithms for MOJSSP with two or more objective functions and a larger number of performance metrics.

The first adaptation of SA to MOOP was an algorithm proposed in 1992, also known as MOSA [16]. An essential part of this algorithm is that it applies the Boltzmann criterion for accepting bad solutions, commonly used in single-objective JSSP. MOSA combines several objective functions. The SA algorithm for single-objective JSSP optimization and the MOSA algorithm for multi-objective optimization differ in several aspects related to determining the energy functions, using and generating new solutions, and measuring their quality. As is well known, these energy functions are required in the acceptance criterion. Multiple versions of MOSA have been proposed in the last few years. One of them, published in 2008, is AMOSA, which surpassed other MOOP algorithms at that time [17]. In this work, we adapt this algorithm for MOJSSP. TA [6] is an algorithm for single-objective JSSP, which is very similar to Simulated Annealing. These two algorithms have the same structure, both use a temperature parameter, and both accept some bad solutions to escape from local optima. In addition, these algorithms are among the best JSSP algorithms, and their performance is very similar. Nevertheless, a TA algorithm for MOJSSP has not been published, and therefore it has not been compared with the multi-objective version of SA.

MOJSSP has been commonly solved using IMOEA/D [14], NSGA-II [18], SPEA [19], MOPSO [20], and CMOEA [21]; the latter was renamed CMEA in [8]. Nevertheless, the number of objectives and performance metrics of these algorithms remains too small. The Evolutionary Algorithm based on decomposition proposed in 2016 by Zhao in [14] was considered the best algorithm [22]. The Multi-Objective Q-Learning algorithm (MOQL) for JSSP was published in 2017 [23]; this approach uses several agents to solve JSSP. An extension of MOQL is MOMARLA, which was proposed in 2019 by Mendez [8]. This MOJSSP algorithm uses two objective functions: makespan and total tardiness. MOMARLA overcomes the classical multi-objective algorithms SPEA [19], CMOEA [21], and MOPSO [20].

The two new algorithms presented in this paper for JSSP are Chaotic Multi-Objective Simulated Annealing (CMOSA) and Chaotic Multi-Objective Threshold Accepting (CMOTA). The first algorithm is inspired by the classic MOSA algorithm [17]. However, CMOSA is different in three aspects: (1) for the first time it is designed specifically for MOJSSP, (2) it uses an analytical tuning of the cooling scheme parameters, and (3) it uses chaotic perturbations for finding new solutions and for escaping from local optima. This process allows the search to continue from a different point in the solution space and it contributes to a better diversity of the generated solutions. Furthermore, CMOTA is based on CMOSA and Threshold Accepting, and it does not require the Boltzmann distribution. Instead, it uses a threshold strategy for accepting bad solutions to escape from local optima. In addition, a chaotic perturbation function is applied.

In this paper, we present two new alternatives for MOJSSP, and we consider three objective functions: makespan, total tardiness, and total flow time. The first objective is very relevant for production management applications [7], while the other two are critical for enhancing client attention service [23]. In addition, we use six metrics for the evaluation of these algorithms: Mean Ideal Distance (MID), Spacing (S), Hypervolume (HV), Spread (Δ), Inverted Generational Distance (IGD), and Coverage (C). We also apply an analytical parameter tuning method to these algorithms. Finally, we compare the achieved results with those obtained with the JSSP algorithms reported in [8,14].

The rest of the paper is organized as follows. In Section 2, we make a qualitative comparison of related MOJSSP works. In Section 3, we present MOJSSP concepts and the performance metrics that were applied. Section 4 presents the formulation of MOJSSP with three objectives. The proposed algorithms, their tuning method, and the chaotic perturbation are also shown in Section 5. Section 6 shows the application of the proposed algorithms to a set of 70, 58, and 15 instances. Finally, the results are shown and compared with previous works. In Section 7, we present our conclusions.

#### **2. Related Works**

As mentioned above, in single-objective optimization, the JSSP community has broadly investigated the performance of the different solution methods. However, the situation is entirely different for MOJSSP, and there is a small number of published works. In 1994, an analysis of SA family algorithms for JSSP was presented [24]; two of them were SA and TA, which we briefly explain in the next paragraph. These algorithms suppose that the solutions define a set of macrostates of a set of particles, while the objective functions' values represent their energy, and both algorithms have a Metropolis cycle where the neighborhood of solutions is explored. In single-objective optimization, for the set of instances used to evaluate JSSP algorithms, SA obtained better results than TA. Furthermore, a better solution than the previous one is always accepted, while a worse solution may be accepted depending on the Boltzmann distribution criterion. This distribution is related to the current temperature value and the increment or decrement of energy (associated with the objective functions) in the current temperature value. In the TA case, a worse solution than the previous one may be accepted using a criterion that tries to emulate the Boltzmann distribution. This criterion establishes a possible acceptance of a worse solution when the decrement of energy is smaller than a threshold value depending on the temperature and a parameter *γ* that is very close to one. Then at the beginning of the process, the threshold values are enormous because they depend on the temperatures. Subsequently, the temperature parameter is gradually decreased until a value close to zero is achieved, and then this threshold is very small.

In 2001, a Multi-Objective Genetic Algorithm was proposed to minimize the makespan, total tardiness, and the total idle time [25]. The proposed methodology for JSSP was assessed with 28 benchmark problems. In this publication, the authors randomly weighted the different fitness functions to determine their results.

In 2006, SA was used for two objectives: the makespan and the mean flow time [26]. This algorithm was called Pareto Archived Simulated Annealing (PASA), which used the Simulated Annealing algorithm with an overheating strategy to escape from local optima and to improve the quality of the results. The performance of this algorithm was evaluated with 82 instances taken from the literature. Unfortunately, this method has not been updated for three or more objective functions.

In 2011, a two-stage genetic algorithm (2S-GA) was proposed for JSSP with three objectives to minimize the makespan, total weighted earliness, and total weighted tardiness [13]. In the first stage, a parallel GA found the best solution for each objective function. Then, in the second stage, the GA combined the populations, which evolved using the weighted aggregating objective function.

Researchers from the Contemporary Design and Integrated Manufacturing Technology (CDIMT) laboratory proposed an algorithm named Improved Multi-Objective Evolutionary Algorithm based on Decomposition (IMOEA/D) to minimize the makespan, tardiness, and total flow time [14]. The authors experimented with 58 benchmark instances, and they used the performance metrics Coverage [27] and Mean Ideal Distance (MID) [28] to evaluate their algorithm. Table 1 includes studies with two or three objectives that do not report any metric. In contrast, IMOEA/D stands out from the rest of the literature, not only because the authors reported good results but also because they considered a larger number of objectives and applied two metrics.

In 2008, the AMOSA algorithm based on SA for several objectives was proposed [17]. In this paper, the authors reported that the AMOSA algorithm performed better than some MOEA algorithms, one of them NSGA-II [29]. They presented the main Boltzmann rules for accepting bad solutions. Unfortunately, a MOJSSP with AMOSA and with more than two objectives has not been published.

In 2017, a hybrid algorithm between an NSGA-II and a linear programming approach was proposed [15]; it was used to solve the FT10 instance of Taillard [30]. This algorithm minimized the weighted tardiness and energy costs. To evaluate the performance, the authors only used the HV metric.

In 2019, MOMARLA was proposed, a new algorithm based on Q-Learning to solve MOJSSP [8]. This work provided flexibility to use decision-maker preferences; each agent represented a specific objective and used two action selection strategies to find a diverse and accurate Pareto front. In Table 1, we present the last related studies for MOJSSP and the proposed algorithms.

This paper analyzes our algorithms CMOSA and CMOTA, as follows: (a) comparing CMOSA and CMOTA versus IMOEA/D [14], (b) comparing our algorithms with the results published for MOMARLA, MOPSO, CMOEA, and SPEA, and (c) comparing CMOSA versus CMOTA.


**Table 1.** Related Works.

\* Not reported.

#### **3. Multi-Objective Optimization**

In a single-objective problem, the algorithm finishes its execution when it finds the solution that optimizes the objective function or a very close optimal solution. However, for Multi-Objective Optimization, the situation is more complicated since several objectives must be optimized simultaneously. Then, it is necessary to find a set of solutions optimizing each of the objectives individually. These solutions can be contrasting because we can obtain the best solution for an objective function that is not the best for other objective functions.

#### *3.1. Concepts*

Definitions of some concepts of Multi-Objective Optimization are shown below.

Pareto Dominance: In general, for any optimization problem, solution A dominates another solution B if the following conditions are met [31]: A is strictly better than B on at least one objective, and A is not worse than B for any objective function.

Non-dominated set: In a set of solutions P, the non-dominated set P1 consists of the solutions that satisfy the following conditions [31]: any pair of P1 solutions must be non-dominated (one with respect to the other), and any solution that does not belong to P1 is dominated by at least one member of P1.

Pareto optimal set: The set of non-dominated solutions of the total search space.

Pareto front: The graphic representation of the non-dominated solutions of the multiobjective optimization problem.
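The definitions above can be illustrated with a minimal Python sketch. It assumes that all objectives are minimized and that each solution is represented only by its tuple of objective values; the function names and the sample values are hypothetical and are not taken from the algorithms proposed in this paper.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a Pareto-dominates b (all objectives are minimized)."""
    not_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return not_worse and strictly_better


def non_dominated(points: List[Sequence[float]]) -> List[Sequence[float]]:
    """Keep the points that are not dominated by any other point in the list."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (makespan, total tardiness, total flow time) triples.
candidates = [(10, 5, 40), (12, 3, 42), (11, 6, 45), (10, 5, 41)]
front = non_dominated(candidates)
```

In this example, the third and fourth candidates are dominated by the first one, so only the first two remain in the non-dominated set.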

#### *3.2. Performance Metrics*

In an experimental comparison of different optimization techniques or algorithms, it is always necessary to have the notion of performance. In the case of Multi-Objective Optimization, the definition of quality is much more complicated than for single-objective optimization problems because the multi-objective optimization criteria itself consists of multiple objectives, of which, the most important are:


In general, it is difficult to find a single performance metric that encompasses all of the above criteria. In the literature, a large number of performance metrics can be found. The most popular performance metrics were used in this research and are described below:

Mean Ideal Distance: Evaluates the closeness of the calculated Pareto front (*PFcalc*) solutions with an ideal point, which is usually (0, 0) [28].

$$MID = \frac{\sum\_{i=1}^{Q} c\_i}{Q} \tag{1}$$

where $c\_i = \sqrt{f\_{1,i}^2 + f\_{2,i}^2 + f\_{3,i}^2}$ and $f\_{1,i}$, $f\_{2,i}$, $f\_{3,i}$ are the values of the *i*-th non-dominated solution for the first, second, and third objective functions, and *Q* is the number of solutions in the *PFcalc*.
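As a small numerical illustration of Equation (1), the following sketch computes MID for a hypothetical front. It assumes that each term is the Euclidean distance from the solution to an ideal point located at the origin; the function name and the sample values are illustrative only.

```python
import math


def mean_ideal_distance(front):
    """Average Euclidean distance from each solution to the origin."""
    distances = [math.sqrt(sum(f * f for f in solution)) for solution in front]
    return sum(distances) / len(front)


# Two hypothetical solutions with three objective values each.
front = [(3.0, 4.0, 0.0), (0.0, 6.0, 8.0)]
mid = mean_ideal_distance(front)  # distances are 5 and 10, so the mean is 7.5
```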

Spacing: Evaluates the distribution of non-dominated solutions in the *PFcalc*. When several algorithms are evaluated with this metric, the best is that with the smallest *S* value [32].

$$S = \sqrt{\frac{\sum\_{i=1}^{Q} (d\_i - \bar{d})^2}{Q}} \tag{2}$$

where $d\_i$ measures the distance in the space of the objective functions between the *i*-th solution and its nearest neighbor, that is, the *j*-th solution in the *PFcalc* of the algorithm; *Q* is the number of solutions in the *PFcalc*; $\bar{d}$ is the average of the $d\_i$, that is, $\bar{d} = \frac{\sum\_{i=1}^{Q} d\_i}{Q}$; and $d\_i = \min\_{j \neq i}\left(|f\_1^i(x) - f\_1^j(x)| + |f\_2^i(x) - f\_2^j(x)| + \cdots + |f\_M^i(x) - f\_M^j(x)|\right)$, where $f\_1^i$, $f\_2^i$ are the values of the *i*-th non-dominated solution for its first and second objective functions, $f\_1^j$, $f\_2^j$ are the values of the *j*-th non-dominated solution for its first and second objective functions, respectively, *M* is the number of objective functions, and *i*, *j* = 1, ..., *Q*.
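The Spacing metric of Equation (2) can be sketched in Python as follows. The sketch assumes a front with at least two solutions and takes the nearest neighbor over all other solutions of the same front; the function name and the sample fronts are hypothetical.

```python
import math


def spacing(front):
    """Spacing metric: deviation of nearest-neighbor Manhattan distances."""
    q = len(front)
    # d_i: Manhattan distance from solution i to its nearest other solution.
    d = [
        min(
            sum(abs(a - b) for a, b in zip(front[i], front[j]))
            for j in range(q)
            if j != i
        )
        for i in range(q)
    ]
    d_bar = sum(d) / q
    return math.sqrt(sum((d_i - d_bar) ** 2 for d_i in d) / q)


even_front = [(1, 4), (2, 3), (3, 2), (4, 1)]   # evenly distributed solutions
uneven_front = [(1, 4), (2, 3), (4, 1)]         # one larger gap
s_even = spacing(even_front)
s_uneven = spacing(uneven_front)
```

An evenly distributed front yields a Spacing value of zero, while the front with a larger gap yields a positive value, which is consistent with smaller values being preferred.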

Hypervolume: Calculates the volume in the objective space that is covered by all members of the non-dominated set [33]. The *HV* metric is measured based on a reference point (*W*), and this can be found simply by constructing a vector with the worst values of the objective function.

$$HV = volume \big( \cup\_{i=1}^{|Q|} v\_i \big) \tag{3}$$

where *vi* is a hypercube and is constructed with a reference point *W* and the solution *i* as the diagonal corners of the hypercube [31]. An algorithm that obtains the largest *HV* value is better. The data should be normalized by transforming the value in the range [0, 1] for each objective separately to perform the calculation.
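To make the union of hypercubes in Equation (3) concrete, the following sketch computes the hypervolume for the simplified case of two minimized objectives. This restriction is an assumption made only to keep the example short, since this paper works with three objectives; the function name, the normalized sample front, and the reference point (1, 1) are hypothetical.

```python
def hypervolume_2d(front, ref):
    """Area dominated by a two-objective front, bounded by the reference point."""
    # Keep only the points that lie inside the box defined by the reference point.
    points = sorted(p for p in front if p[0] <= ref[0] and p[1] <= ref[1])
    hv = 0.0
    best_f2 = ref[1]
    for f1, f2 in points:                  # sweep in increasing order of f1
        if f2 < best_f2:                   # the point adds a new horizontal slice
            hv += (ref[0] - f1) * (best_f2 - f2)
            best_f2 = f2
    return hv


# Objective values already normalized to [0, 1]; W = (1, 1) holds the worst values.
front = [(0.2, 0.8), (0.5, 0.5), (0.8, 0.2)]
hv = hypervolume_2d(front, ref=(1.0, 1.0))
```

For this front, the three slices contribute 0.16, 0.15, and 0.06, which gives a hypervolume of approximately 0.37.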

Spread: This metric was proposed to provide a more precise coverage value; it considers the distance to the extreme points of the true Pareto front (*PFtrue*) [29].

$$\Delta = \frac{\sum\_{k=1}^{M} d\_k^{e} + \sum\_{i=1}^{Q} |d\_i - \bar{d}|}{\sum\_{k=1}^{M} d\_k^{e} + Q \times \bar{d}} \tag{4}$$

where $d\_k^{e}$ measures the distance between the "extreme" point of the *PFtrue* for the *k*-th objective function and the nearest point of the *PFcalc*, $d\_i$ corresponds to the distance between the *i*-th solution of the *PFcalc* and its nearest neighbor, $\bar{d}$ corresponds to the average of the $d\_i$, and *M* is the number of objectives.

Inverted Generational Distance: It is an inverted indicator version of the Generational Distance (GD) metric, where all the distances are measured from the *PFtrue* to the *PFcalc* [1].

$$IGD(Q) = \frac{\left(\sum\_{j=1}^{|T|} \hat{d}\_j^{\,p}\right)^{1/p}}{|T|} \tag{5}$$

where $T = \{t\_1, t\_2, \ldots, t\_{|T|}\}$ is the set of solutions in the *PFtrue*, |*T*| is the cardinality of *T*, *p* is an integer parameter (in this paper, *p* = 2), and $\hat{d}\_j$ is the Euclidean distance from $t\_j$ to its nearest objective vector *q* in *Q*, according to (6).

$$\hat{d}\_j = \min\_{q=1}^{|Q|} \sqrt{\sum\_{m=1}^{M} \left(f\_m(t\_j) - f\_m(q)\right)^2} \tag{6}$$

where $f\_m(t\_j)$ is the *m*-th objective function value of the *j*-th member of *T* and *M* is the number of objectives.
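Equations (5) and (6) can be sketched in Python as shown below, using *p* = 2 as in this paper. The function name and the two sample fronts are hypothetical, and `math.dist` (available from Python 3.8) is used for the Euclidean distance.

```python
import math


def igd(true_front, calc_front, p=2):
    """Inverted Generational Distance from the true front to the calculated front."""
    # For each member of the true front, distance to its nearest calculated solution.
    nearest = [min(math.dist(t, q) for q in calc_front) for t in true_front]
    return sum(d ** p for d in nearest) ** (1.0 / p) / len(true_front)


true_front = [(0, 0), (1, 1)]     # hypothetical reference front
calc_front = [(0, 0), (1, 2)]     # hypothetical calculated front
value = igd(true_front, calc_front)
```

Here the nearest distances are 0 and 1, so the result is (0 + 1)^(1/2) / 2 = 0.5; a calculated front identical to the reference front would give zero.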

Coverage: Represents the dominance between set *A* and set *B* [27]. It is the ratio of the number of solutions in set *B* that were dominated by solutions in set *A* and the total number of solutions in set *B*. The *C* metric is defined by (7).

$$C(A, B) = \frac{|\{b \in B \mid \exists a \in A : a \preceq b\}|}{|B|} \tag{7}$$

When *C*(*A*, *B*) = 1, all solutions in *B* are dominated by or equal to solutions in *A*. Conversely, *C*(*A*, *B*) = 0 represents the situation in which none of the solutions in *B* is dominated by any solution in *A*. The higher the value of *C*(*A*, *B*), the more solutions in *B* are dominated by solutions in *A*. Both *C*(*A*, *B*) and *C*(*B*, *A*) should be considered, since *C*(*B*, *A*) is not necessarily equal to 1 − *C*(*A*, *B*).
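The Coverage metric of Equation (7) can be illustrated with a short Python sketch. It assumes minimization and interprets the relation in (7) as weak dominance (not worse in every objective); the function names and the two sample sets are hypothetical.

```python
def weakly_dominates(a, b):
    """True if a is not worse than b in every (minimized) objective."""
    return all(x <= y for x, y in zip(a, b))


def coverage(set_a, set_b):
    """Fraction of solutions in set_b covered by at least one solution in set_a."""
    covered = sum(
        1 for b in set_b if any(weakly_dominates(a, b) for a in set_a)
    )
    return covered / len(set_b)


A = [(1, 2), (2, 1)]
B = [(2, 2), (0, 3)]
c_ab = coverage(A, B)   # only (2, 2) is covered by A
c_ba = coverage(B, A)   # no solution of A is covered by B
```

In this example *C*(*A*, *B*) = 0.5 and *C*(*B*, *A*) = 0, which shows why both directions should be reported.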

#### **4. Multi-Objective Job Shop Scheduling Problem**

In JSSP, there is a set of *n* different jobs consisting of operations that must be processed on *m* different machines. There is a set of precedence constraints for these operations, and there are also resource capacity constraints ensuring that each machine processes only one operation at a time. The processing time of each operation is known in advance. The objective of JSSP is to determine the sequence of the operations on each machine (the start and finish time of each operation) to minimize certain objective functions subject to the constraints mentioned above. The most common objective is the makespan, which is the total time in which all the problem operations are processed. Nevertheless, real scheduling problems are multi-objective, and several objectives should be considered simultaneously.

The three objectives that are addressed in the present paper are:

Makespan: the maximum time of completion of all jobs.

Total tardiness: it is calculated as the sum of the positive differences between the completion time and the due date of each job.

Total flow time: it is the summation of the completion times of all jobs.

The formal MOJSSP model can be formulated as follows [34,35]:

$$\begin{aligned} \text{Optimize } & F(\mathbf{x}) = \left[ f\_1(\mathbf{x}), f\_2(\mathbf{x}), \dots, f\_q(\mathbf{x}) \right] \\ \text{subject to: } & \mathbf{x} \in S \end{aligned} \tag{8}$$

where *q* is the number of objectives, *x* is the vector of decision variables, and *S* represents the feasible region, defined by the following precedence and capacity constraints, respectively:

$$\begin{cases} t\_j \ge t\_i + p\_i & \text{for all } i, j \in O \text{ when } i \text{ precedes } j\\ t\_j \ge t\_i + p\_i \text{ or } t\_i \ge t\_j + p\_j & \text{for all } i, j \in O \text{ when } M\_i = M\_j \end{cases}$$

where:

*ti*, *tj* are the starting times of the operations *i*, *j* ∈ *O*.

*pi* and *pj* are the processing times of the operations *i*, *j* ∈ *O*.

*J* : {*J*1, *J*2, *J*3, ..., *Jn*} is the set of jobs.

*M* : {*M*1, *M*2, *M*3, ..., *Mm*} is the set of machines.

*O* is the set of operations *Oj*,*i* (operation *i* of the job *j*).

The objective functions of makespan, total tardiness, and total flow time are defined by Equations (9)–(11), respectively.

$$f\_1 = \min\left(\max\_{j=1}^n C\_j\right) \tag{9}$$

where *Cj* is the completion time of job *j*.

$$f\_2 = \min\left(\sum\_{j=1}^n T\_j\right) = \min\left(\sum\_{j=1}^n \max(0, C\_j - D\_j)\right) \tag{10}$$

where *Tj* = *max*(0, *Cj* − *Dj*) is the tardiness of job *j*, and *Dj* is the due date of job *j*, calculated as $D\_j = \tau \sum\_{i=1}^{m} p\_{j,i}$ [36], where $p\_{j,i}$ is the time required to process job *j* on machine *i*. In this case, the due date of job *j* is the sum of the processing times of all its operations on all machines, multiplied by a narrowing factor (*τ*), which is in the range 1.5 ≤ *τ* ≤ 2.0 [14,36].

$$f\_3 = \min \sum\_{j=1}^n C\_j \tag{11}$$
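As a worked illustration of Equations (9)–(11) and of the due date rule, the following sketch evaluates the three objective values for a given schedule. It assumes that the completion time of each job is already known; the processing times, the completion times, and the function names are hypothetical values chosen only for the example.

```python
def due_dates(processing_times, tau=1.5):
    """Due date of each job: tau times the sum of its processing times."""
    return [tau * sum(job_times) for job_times in processing_times]


def objectives(completion, due):
    """Return (makespan, total tardiness, total flow time) of a schedule."""
    makespan = max(completion)
    total_tardiness = sum(max(0, c - d) for c, d in zip(completion, due))
    total_flow_time = sum(completion)
    return makespan, total_tardiness, total_flow_time


# Hypothetical 3-job, 3-machine instance: processing_times[j][i] is p_{j,i}.
processing_times = [[3, 2, 2], [2, 1, 4], [4, 3, 1]]
completion = [12, 10, 14]          # hypothetical completion time C_j of each job
due = due_dates(processing_times)  # [10.5, 10.5, 12.0] with tau = 1.5
f1, f2, f3 = objectives(completion, due)
```

For these values, the makespan is 14, the total tardiness is 1.5 + 0 + 2 = 3.5, and the total flow time is 36.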

#### **5. Multi-Objective Proposed Algorithms**

The two multi-objective algorithms presented in this section for solving JSSP are Chaotic Multi-Objective Simulated Annealing and Chaotic Multi-Objective Threshold Accepting. We describe these algorithms in this section after analyzing the single-objective optimization algorithms for JSSP.

#### *5.1. Simulated Annealing*

The algorithm SA proposed by Kirkpatrick et al. comes from a close analogy with the metal annealing process [5]. This process consists of heating and progressively cooling metal. As the temperature decreases, the molecules' movement slows down and tends to adopt a lower energy configuration. Kirkpatrick et al. proposed this algorithm for combinatorial optimization problems and to escape from local minima. It starts with an initial solution and generates a new solution in its neighborhood. If the new solution is better than the old solution, then it is accepted. Otherwise, SA applies the Boltzmann distribution, which determines if a bad solution can be taken as a strategy for escaping from local optima. This process is repeated many times until an equilibrium condition is accomplished.

The SA algorithm is shown in Algorithm 1. Line 1 receives the parameters: the initial (*Tinitial*) and final (*Tfinal*) temperatures, the alpha value (*α*) for decreasing the temperature, and beta (*β*) for increasing the length of the Metropolis cycle. The current temperature (*Tk*) is set in line 2. An initial solution (*scurrent*) is generated randomly in line 3. The stop criterion is evaluated (line 4); this main cycle is repeated while the current temperature (*Tk*) is higher than or equal to the final temperature (*Tfinal*). The Metropolis cycle starts in line 5, where a neighboring solution (*snew*) is generated (line 6). In line 7, the increment Δ*E* of the objective function is determined for the current solution (*scurrent*) and the new one (*snew*). When this increment is negative (line 8), the new solution is better than the current one. In this case, the new solution replaces the current solution (line 9). Otherwise, the Boltzmann criterion is applied (lines 11 and 12). This criterion allows the algorithm to escape from local optima depending on the current temperature and delta values. Finally, line 16 increases the number of iterations of the Metropolis cycle, and in line 17, the cooling function is applied to reduce the current temperature.

**Algorithm 1** Classic Simulated Annealing algorithm

1: **procedure** SA(*Tinitial*, *Tfinal*, *α*, *β*, *Lk*)
2: *Tk* ← *Tinitial*
3: *scurrent* ← *RandomInitialSolution*()
4: **while** *Tk* ≥ *Tfinal* **do**
5: **for** 1 to *Lk* **do**
6: *snew* ← *perturbation*(*scurrent*)
7: Δ*E* ← *E*(*snew*) − *E*(*scurrent*)
8: **if** Δ*E* < 0 **then**
9: *scurrent* ← *snew*
10: **else**
11: **if** *e*<sup>−Δ*E*/*Tk*</sup> > *random*(0, 1) **then**
12: *scurrent* ← *snew*
13: **end if**
14: **end if**
15: **end for**
16: *Lk* ← *β* × *Lk*
17: *Tk* ← *α* × *Tk*
18: **end while**
19: **return** *scurrent*
20: **end procedure**
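For readers who prefer executable code, the following Python sketch mirrors the structure of Algorithm 1. The toy problem (minimizing (x − 3)² over the integers), the parameter values, and all function names are assumptions made for illustration; they are not the JSSP neighborhood or the tuned parameters used in this paper.

```python
import math
import random


def simulated_annealing(energy, perturb, s_initial, t_initial, t_final,
                        alpha, beta, l_k, rng):
    """Classic SA loop following the structure of Algorithm 1."""
    s_current = s_initial
    t_k = t_initial
    while t_k >= t_final:                  # temperature cycle
        for _ in range(int(l_k)):          # Metropolis cycle
            s_new = perturb(s_current, rng)
            delta_e = energy(s_new) - energy(s_current)
            # Accept improvements; otherwise apply the Boltzmann criterion.
            if delta_e < 0 or math.exp(-delta_e / t_k) > rng.random():
                s_current = s_new
        l_k *= beta                        # lengthen the Metropolis cycle
        t_k *= alpha                       # geometric cooling
    return s_current


def toy_energy(x):
    """Hypothetical cost function with its minimum at x = 3."""
    return (x - 3) ** 2


def toy_perturb(x, rng):
    """Hypothetical neighborhood: move one step to the left or to the right."""
    return x + rng.choice((-1, 1))


result = simulated_annealing(toy_energy, toy_perturb, s_initial=20,
                             t_initial=10.0, t_final=0.01,
                             alpha=0.9, beta=1.05, l_k=10,
                             rng=random.Random(42))
```

At high temperatures the search accepts many worsening moves, while near the final temperature a worsening move of size one is accepted with negligible probability, so the returned value is expected to lie at or next to the minimum.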

#### *5.2. Analytical Tuning for Simulated Annealing*

The parameter tuning process for the SA algorithm used in this paper is based on a method proposed in [37]. This method establishes that both the initial and final temperatures are functions of the maximum and minimum energy values *Emax* and *Emin*, respectively. These energies appear in the Boltzmann distribution criterion, which states that a bad solution is accepted at a temperature *T* when *random*(0, 1) ≤ *e*<sup>−Δ*E*/*T*</sup>. For JSSP, Δ*E* is obtained with the makespan. For this tuning method, these two functions are obtained from the neighborhood of different solutions randomly generated. A set of previous SA executions must be carried out for obtaining Δ*Emax* and Δ*Emin*. These values are used in the Boltzmann distribution for determining the initial and final temperatures. Then, the other parameters of the Metropolis cycle are determined. The process used is detailed in the following paragraphs.

Initial temperature (*Tinitial*): It is the temperature value from which the search process begins. At high temperatures, the probability of accepting a new solution is almost 1, so the allowed cost deterioration is at its maximum. The initial temperature is therefore associated with the maximum allowed deterioration and its defined acceptance probability. Let *si* be the current solution, *sj* a new proposed solution, *E*(*si*) and *E*(*sj*) their associated costs, and Δ*Emax* and Δ*Emin* the maximum and minimum deterioration. Then *P*(Δ*Emax*) is the probability of accepting a solution with the maximum deterioration, and it is calculated with (12). Thus, the value of the initial temperature (*Tinitial*) is calculated with (13).

$$P(\Delta E\_{\max}) = e^{-\Delta E\_{\max} / T\_{initial}} \tag{12}$$

$$T\_{initial} = \frac{-\Delta E\_{\text{max}}}{\ln(P(\Delta E\_{\text{max}}))} \tag{13}$$

Final temperature (*Tfinal*): It is the temperature value at which the search stops. In the same way, the final temperature is determined with (14) according to the probability *P*(Δ*Emin*), which is the probability of accepting a solution with minimum deterioration.

$$T\_{final} = \frac{-\Delta E\_{min}}{\ln(P(\Delta E\_{min}))} \tag{14}$$
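Equations (13) and (14) share the same form, so both temperatures can be obtained with one small function, as in the following sketch. The deterioration values and the acceptance probabilities are hypothetical numbers chosen for illustration; in the tuning method they would come from previous SA executions.

```python
import math


def temperature(delta_e, accept_prob):
    """Temperature at which a deterioration delta_e is accepted with accept_prob."""
    return -delta_e / math.log(accept_prob)


# Hypothetical values: a large deterioration accepted almost always at the start,
# and a small deterioration accepted very rarely at the end.
t_initial = temperature(delta_e=100.0, accept_prob=0.95)
t_final = temperature(delta_e=1.0, accept_prob=0.01)
```

Substituting the resulting temperatures back into the Boltzmann expression recovers the chosen acceptance probabilities, which is a simple way to check the calculation.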

Alpha value (*α*): It is the temperature decrease factor. This parameter determines the speed at which the temperature decreases; a value of 0.7 is usually used for fast decrements and 0.99 for slow decrements.

Cooling scheme: This function specifies how the temperature is decreased. In this case, the value of the current temperature (*Tk*) follows the geometric scheme (15).

$$T\_{k+1} = \alpha T\_k \tag{15}$$

Length of the Markov chain or iterations in Metropolis cycle (*Lk*): This refers to the number of iterations of the Metropolis cycle that is performed at each temperature *k*, this number of iterations can be constant or variable. It is well known that at high temperatures, only a few iterations are required since the stochastic equilibrium is rapidly reached [37]. However, at low temperatures, a much more exhaustive level of exploration is required. Thus, a larger *Lk* value must be used. If *Lmin* is the value of *Lk* at the initial temperature, and *Lmax* is the *Lk* at the final temperature, then the Formula (16) is used.

$$L\_{k+1} = \beta L\_k \tag{16}$$

where *β* is the increment coefficient of *Lk*. Since the Functions (15) and (16) are applied successively in SA from the initial to the final temperature, *Tfinal* and *Lmax* are calculated with (17) and (18).

$$T\_{final} = \alpha^n T\_{initial} \tag{17}$$

$$L\_{\max} = \beta^n L\_{\min} \tag{18}$$

In (17) and (18) *n* is the number of steps from *Tinitial* to *Tfinal*, then (19) and (20) are obtained.

$$n = \frac{\ln(T\_{final}) - \ln(T\_{initial})}{\ln(\alpha)}\tag{19}$$

$$\beta = e^{\left(\frac{\ln(L\_{\max}) - \ln(L\_{\min})}{n}\right)} \tag{20}$$

The probability of selecting the solution *sj* from *N* random samples in the neighborhood *Vsi* is given by (21); and from this equation, the *N* value is obtained in (22), where the exploration level *C* is defined in Equation (23).

$$P(s\_j) = 1 - e^{\frac{-N}{|V\_{s\_i}|}} \tag{21}$$

$$N = -|V\_{s\_i}| \ln(1 - P(s\_j)) = C\,|V\_{s\_i}| \tag{22}$$

$$C = -\ln(1 - P(s\_j)) \tag{23}$$

The length of the Markov chain or iterations of the Metropolis cycle are defined by (24).

$$L\_{\max} = N = C\,|V\_{s\_i}| \tag{24}$$

To guarantee a good exploration level, the *C* value determined by (23) must be established between 1 ≤ *C* ≤ 4.6 [38].
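The remaining parameters can be derived in sequence from the equations above, as the following sketch shows. The input values (temperatures, *α*, *Lmin*, neighborhood size, and exploration level) are hypothetical and were chosen so that the number of steps is an integer; in practice *n* is generally not an integer and may need rounding, which is a choice this sketch does not address.

```python
import math


def cooling_schedule(t_initial, t_final, alpha, l_min, neighborhood_size, c):
    """Derive the number of steps, L_max, and beta from the tuning equations."""
    n = (math.log(t_final) - math.log(t_initial)) / math.log(alpha)
    l_max = c * neighborhood_size                       # length at T_final
    beta = math.exp((math.log(l_max) - math.log(l_min)) / n)
    return n, l_max, beta


# Hypothetical inputs: the temperature is halved ten times from 1024 down to 1.
n, l_max, beta = cooling_schedule(t_initial=1024.0, t_final=1.0, alpha=0.5,
                                  l_min=1.0, neighborhood_size=100, c=2.0)
```

With these inputs, *n* = 10 and *Lmax* = 200, and applying *β* ten times to *Lmin* reproduces *Lmax*, so the Metropolis cycle reaches its maximum length exactly at the final temperature.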

#### *5.3. Chaotic Multi-Objective Simulated Annealing (CMOSA)*

As we previously mentioned, the AMOSA algorithm was proposed in [17]. However, this algorithm is designed for general purposes. In this work, we adapt the AMOSA for JSSP to include the following features: (1) the mathematical constraints of MOJSSP, and (2) the objective functions makespan, total tardiness, and total flow time.

CMOSA has the features previously described and adds the following three elements: (1) a new structure, (2) chaotic perturbation, and (3) the application of dominance to select solutions. These elements are described in the next subsections.

#### 5.3.1. CMOSA Structure

The CMOSA algorithm uses a chaotic phase to improve the quality of the solutions considering the three objectives. Algorithm 2 receives its parameters in line 1: initial temperature (*Tinitial*), final temperature (*Tfinal*), alpha (*α*), beta (*β*), Metropolis iterations in every cycle (*Lk*), and the initial solution (*scurrent*) to be improved. In lines 2 and 3, the variables of the algorithm are initialized. In line 4, the *scurrent* is processed to obtain the values for each of the three objectives as output. In line 5, the initial temperature is established as the current temperature (*Tk*). Then the main cycle begins in line 6. This cycle is repeated as long as the current temperature is greater than, or equal to, the final temperature. In line 7, the Metropolis cycle begins. Subsequently, the algorithm verifies if it is stagnant in line 8. If that is the case, lines 9 to 20 are executed. The number of iterations to perform a local search is established in line 10; this value is based on the number of tasks of the instance multiplied by an experimentally tuned parameter (in this case, this parameter is *timesLS* = 10).

In line 11, a local search begins. In the first iteration of this search, a chaotic perturbation (explained in Algorithm 4) is applied to the *scurrent* (line 12) to restart the search process from another point in the solution space. In further iterations, a regular perturbation is applied (line 14) that consists only of exchanging the position of two operations in the solution, always verifying that the solution generated is feasible. In line 16, the *snew* is processed to obtain the values for each of the three objectives. Subsequently, and only if the new solution dominates the current solution in the three objectives, the new solution is used to continue the search process (lines 17 and 18). When the algorithm is not stagnant, a regular perturbation is applied, and the flow continues (line 22). If the current and the new solution are different, we proceed with the dominance verification process to determine which solution is used to continue the search (line 26); this process is explained in Algorithm 5. Finally, from lines 29 to 36, a process is applied to set a limit to the number of times the algorithm is stagnant (see Algorithm 3). The algorithm is determined to be stagnant if, after some iterations, it fails to generate a new, non-dominated solution. In this algorithm, the stagnation is limited to 10 iterations. At the end of the algorithm, in line 37, the number of repetitions of the Metropolis cycle (*Lk*) is increased by multiplying its previous value by the *β* parameter value. Additionally, in line 38, the current temperature (*Tk*) is decreased by multiplying it by the *α* value. In line 40, the stored solution (*scurrent*) is returned as the output of the algorithm.


Algorithm 3 shows the process that is carried out to verify the stagnation mentioned in line 30 of Algorithm 2.

**Algorithm 3** Caught


Algorithm 3 receives as input the current solution (*scurrent*) and the counter of the number of times it has been trapped (*counterTrapped*). In line 2, the variables used are initialized. Then, the number of times the current solution is dominated by solutions from the non-dominated front is counted (line 3). If the current solution is non-dominated (line 4), it is stored in the front of non-dominated solutions (line 5). If the current solution is dominated by at least one solution (line 7), then the *counterTrapped* is incremented (line 8). When *counterTrapped* equals the maximum number of times allowed (line 10), the value of *isCaught* is set to *TRUE* (line 11) and the trap counter is reset to zero in line 12.

#### 5.3.2. Chaotic Perturbation

The logistic equation or logistic map is a well-known mathematical application of the biologist Robert May for a simple demographic model [39]. This application tells us the population in the *n*-th generation based on the size of the previous generation. This value may be found by a popular logistic model mathematically expressed as:

$$\mathbf{x}\_{n+1} = r\mathbf{x}\_n(1-\mathbf{x}\_n) \tag{25}$$

In Equation (25), the variable *xn* takes values between zero and one. This variable represents the fraction of individuals in a specific situation (for instance, in a territory or with a particular feature) at a given instant *n*. The parameter *r* is a positive number representing the combined ratio between reproduction and mortality. Although this paper is not concerned with demographic or similar problems, we note that this variable changes very rapidly, so it can be treated as a chaotic variable. Thus, we use this variable for performing a chaotic perturbation function, which may help our CMOSA and CMOTA algorithms escape from local optima.

The chaotic function used is very sensitive to changes in the initial conditions, and this characteristic is used to generate a perturbation to the solution for escaping from local optima. Then chaos or chaotic perturbation is a process carried out to restart the search from another point in the space of solutions.
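The sensitivity to initial conditions can be observed with a few lines of Python that iterate Equation (25). The value *r* = 4 is an assumption made for this sketch, chosen because the logistic map is known to be chaotic for that value; the text above does not state the value of *r* used in the proposed algorithms, and the starting points are arbitrary.

```python
def logistic_sequence(x0, r=4.0, steps=10):
    """Iterate the logistic map x_{n+1} = r * x_n * (1 - x_n)."""
    values = []
    x = x0
    for _ in range(steps):
        x = r * x * (1.0 - x)
        values.append(x)
    return values


# Two starting points that differ only in the seventh decimal place.
a = logistic_sequence(0.3, steps=50)
b = logistic_sequence(0.3000001, steps=50)
largest_gap = max(abs(u - v) for u, v in zip(a, b))
```

Although the two starting points are almost identical, the trajectories separate after a few dozen iterations, which is the property exploited to restart the search from a different region of the solution space.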

Algorithm 4 can be explained in three steps. Firstly, the feasible operations (operations that can be performed without violating any restrictions) are searched (line 4). Secondly, if there is only one feasible operation (line 5), it is the last operation and it is selected (line 6). When there is more than one feasible operation, a chaotic function is applied to select the operations. In this case, the logistic function is used (lines 8–19), which applies a threshold in the range [0.5, 1]. Finally, the selected operation is added to the new solution (line 21). This process is applied until all the operations are selected.

**Algorithm 4** Chaotic perturbation


#### 5.3.3. Applying Dominance to Select Solutions

In Algorithm 5, the current solution (*scurrent*) is compared with the new solution (*snew*) to determine which solution is used to continue the search. In this comparison, there are three cases:


**Algorithm 5** Verify dominance CMOSA

```
1: procedure VERIFYDOMINANCECMOSA(Tk, snew, scurrent, mksnew, tdsnew, fltnew, mkscurrent, tdscurrent, fltcurrent)
2:   newDominateCurrent ← FALSE, currentDominateNew ← FALSE
3:   if snew ≺ scurrent then
4:     scurrent ← snew
5:     newDominateCurrent ← TRUE
6:   end if
7:   if scurrent ≺ snew then
8:     ΔMKS ← mksnew − mkscurrent
9:     ΔTDS ← tdsnew − tdscurrent
10:    ΔFLT ← fltnew − fltcurrent
11:    Δ ← ΔMKS + ΔTDS + ΔFLT
12:    if random(0, 1) < e^(−Δ/Tk) then
13:      F ← scurrent
14:      scurrent ← snew
15:    end if
16:    currentDominateNew ← TRUE
17:  end if
18:  if (newDominateCurrent = FALSE) AND (currentDominateNew = FALSE) then
19:    F ← scurrent
20:    scurrent ← snew
21:  end if
22:  return scurrent
23: end procedure
```
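A possible Python rendering of the three cases of Algorithm 5 is sketched below. It assumes that each solution is accompanied by its tuple of objective values (makespan, total tardiness, total flow time), that all objectives are minimized, and that the set *F* is kept as a plain list to which the replaced solution is appended; these representation choices and all names are hypothetical.

```python
import math
import random


def dominates(a, b):
    """True if objective vector a Pareto-dominates b (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def verify_dominance_cmosa(t_k, s_new, f_new, s_current, f_current, archive, rng):
    """Return the (solution, objectives) pair used to continue the search."""
    if dominates(f_new, f_current):
        return s_new, f_new                     # case 1: the new solution wins
    if dominates(f_current, f_new):
        # Case 2: accept the worse solution with the Boltzmann probability,
        # using the summed deterioration of the three objectives.
        delta = sum(n - c for n, c in zip(f_new, f_current))
        if rng.random() < math.exp(-delta / t_k):
            archive.append(s_current)
            return s_new, f_new
        return s_current, f_current
    archive.append(s_current)                   # case 3: mutually non-dominated
    return s_new, f_new


rng = random.Random(0)
archive = []
case1 = verify_dominance_cmosa(1.0, "new", (1, 1, 1), "cur", (2, 2, 2), archive, rng)
case3 = verify_dominance_cmosa(1.0, "new", (1, 3, 2), "cur", (2, 2, 2), archive, rng)
# At a temperature close to zero, a dominated new solution is rejected.
case2 = verify_dominance_cmosa(1e-9, "new", (3, 3, 3), "cur", (2, 2, 2), archive, rng)
```

The early returns replace the two Boolean flags of the pseudocode; the behavior is intended to be the same, since at most one of the three cases applies to a given pair of solutions.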
#### *5.4. Chaotic Multi-Objective Threshold Accepting (CMOTA)*

In 1990, Dueck et al. proposed the TA algorithm as a general-purpose algorithm for the solution of combinatorial optimization problems [6]. This TA algorithm has a simpler structure than SA, and is very efficient for solving many problems but has never been applied for MOJSSP. The difference between SA and TA is basically in the criteria for accepting bad solutions. TA accepts every new configuration that is not much worse than the old one. In contrast, SA would accept worse solutions only with small probabilities. An apparent advantage of TA is its greater simplicity, because it is not necessary to compute probabilities or to make decisions based on a Boltzmann probability distribution.

Algorithm 6 shows CMOTA algorithm, where we observe that it has the same structure as CMOSA algorithm. These two algorithms have a temperature cycle and, within it, a Metropolis cycle. In these algorithms, a perturbation is applied to the current solution. Then, the dominance of the two solutions is verified to determine which of them is used to continue the searching process (Algorithm 7). Finally, the increment of the variable that controls the iterations of the Metropolis cycle, the reduction of the temperature, and the increment of the counter (line 39) for the number of temperatures are performed.

In Algorithm 7, the dominance of the two solutions is verified to determine which continues with the search. It has the same three cases used in CMOSA (Algorithm 5). The main differences are the following:


straightforward than CMOSA or any other AMOSA algorithm. Moreover, because the parameter *γ* is usually very close to one, it is unnecessary to calculate probabilities for the Boltzmann distribution or make a random decision process for bad solutions.

**Algorithm 6** Chaotic Multi-Objective Threshold Accepting (CMOTA)


**Algorithm 7** Verify dominance CMOTA

```
1: procedure VERIFYDOMINANCECMOTA(counter, Tk, snew, scurrent)
2: γ ← 1, γreduced ← 0.978, setT ← 1, bound ← NumberOfTemperatures × limit
3: newDominateCurrent ← FALSE, currentDominateNew ← FALSE
4: if counter < bound then
5: T ← Tk
6: end if
7: if (counter = bound) AND (setT = 1) then
8: setT ← 0
9: T ← Tk
10: end if
11: if setT = 0 then
12: γ ← γreduced
13: end if
14: if snew ≺ scurrent then
15: scurrent ← snew
16: newDominateCurrent ← TRUE
17: end if
18: if scurrent ≺ snew then
19: if random(0, 1) < T then
20: F ← scurrent
21: scurrent ← snew
22: end if
23: currentDominateNew ← TRUE
24: end if
25: if (newDominateCurrent = FALSE) AND (currentDominateNew = FALSE) then
26: F ← scurrent
27: scurrent ← snew
28: end if
29: T ← T × γ
30: end procedure
```
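The acceptance step of Algorithm 7 can be sketched in Python as shown below. The sketch keeps only the three dominance cases and the threshold update *T* ← *T* × *γ*; the bookkeeping with `counter`, `bound`, and `setT` is deliberately left out, and the tuple representation of solutions, the list `archive` standing in for *F*, and the function names are assumptions of this example rather than the authors' implementation.

```python
import random


def dominates(a, b):
    """Return True if objective vector a Pareto-dominates b (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def verify_dominance_cmota(threshold, gamma, s_new, s_current, archive,
                           draw=random.random):
    """Sketch of the three cases of Algorithm 7.

    Returns the solution that continues the search and the updated threshold.
    No exponential is evaluated: a dominated solution is accepted whenever a
    uniform draw falls below the current threshold.
    """
    if dominates(s_new, s_current):
        chosen = s_new
    elif dominates(s_current, s_new):
        if draw() < threshold:
            archive.append(s_current)
            chosen = s_new
        else:
            chosen = s_current
    else:
        archive.append(s_current)
        chosen = s_new
    # Threshold update T <- T * gamma (gamma = 1 leaves it unchanged).
    return chosen, threshold * gamma
```

Compared with the CMOSA sketch, the only change in the dominated case is that the comparison uses the threshold directly instead of a Boltzmann probability.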
#### **6. Main Methodology for CMOSA and CMOTA**

Figure 1 shows the main module for each of the two proposed algorithms CMOSA and CMOTA, which may be considered the main processes in any high-level language.

In this main module, the instance to be solved is read, and then the tuning process is performed. The due date, which is an essential element for calculating the tardiness, is calculated. The set of initial solutions (*S*) is generated randomly, as follows. First, a collection of feasible operations is determined; then one of them is randomly selected and added to the solution, and this is repeated until all the job operations are added.

Once the set of initial solutions has been generated, an algorithm (CMOSA or CMOTA) is applied to improve each initial solution, and the generated solution is stored in a set of final solutions (*F*). To obtain the set of non-dominated solutions, also called the zero front (*f*0), from the set of final solutions, we applied the fast non-dominated sorting algorithm [29]. To assess the quality of the non-dominated set obtained, the MID, Spacing, HV, Spread, IGD, and Coverage metrics are calculated. To perform the calculation of the Spread and IGD, the true Pareto front (*PFtrue*) is needed. However, the *PFtrue* has not been published for all the instances used in this paper. For this reason, the calculation was made using an approximate Pareto front (*PFapprox*), which we obtained from the union of the fronts calculated with previous executions of the two algorithms presented here (CMOSA and CMOTA).
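To illustrate what extracting the zero front means, the following Python sketch filters a set of final solutions with a plain quadratic-time dominance check. It is not the fast non-dominated sorting algorithm of [29]; it only returns the same first front for small sets. The function names and the sample objective vectors are hypothetical.

```python
def dominates(a, b):
    """Return True if objective vector a Pareto-dominates b (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def zero_front(solutions):
    """Return the solutions that no other solution dominates."""
    return [s for s in solutions
            if not any(dominates(other, s) for other in solutions)]


# Hypothetical (makespan, total tardiness, total flow time) vectors.
final_solutions = [(10, 5, 30), (9, 7, 31), (11, 6, 32), (9, 7, 31.5)]
front = zero_front(final_solutions)  # keeps (10, 5, 30) and (9, 7, 31)
```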

**Figure 1.** Main module for CMOSA and CMOTA.

#### *6.1. Computational Experimentation*

A set of 70 instances of different authors was used to evaluate the performance of the algorithms, including: (1) FT06, FT10, and FT20 proposed by [40]; (2) ORB01 to ORB10 proposed by [41]; (3) LA01 to LA40 proposed by [42]; (4) ABZ5, ABZ6, ABZ7, ABZ8, and ABZ9 proposed by [43]; (5) YN1, YN2, YN3, and YN4 proposed by [44], and (6) TA01, TA11, TA21, TA31, TA41, TA51, TA61, and TA71 proposed by [30].

As already explained, to perform the analytical tuning, some previous executions of the algorithm are necessary. The parameters used for those previous executions are shown in Table 2, and the parameters used in the final experimentation for each instance are shown in Table 3.



**Table 3.** General parameters for CMOSA/CMOTA.


The execution of the algorithm was carried out on one of the terminals of the Ehecatl cluster at the TecNM/IT Ciudad Madero, which has the following characteristics: Intel® Xeon® processor at 2.30 GHz, 64 GB of memory (4 × 16 GB, DDR4-2133), and the CentOS Linux operating system. The C language was used for the implementation. We developed CMOSA (https://github.com/DrJuanFraustoSolis/CMOSA-JSSP.git) and CMOTA (https://github.com/DrJuanFraustoSolis/CMOTA-JSSP.git), and we tested the software using three data sets reported in the paper and taken from the literature.

In the first experiment, the algorithms CMOSA and CMOTA were compared with the AMOSA algorithm using the 70 described instances and six performance metrics. In a second experiment, we compared CMOSA and CMOTA with the IMOEA/D algorithm, using the 58 instances used by Zhao [14] and the same MID metric as that publication. The third experiment was based on the 15 instances reported in [8], where the results of the following MOJSSP algorithms are published: SPEA, CMOEA, MOPSO, and MOMARLA. In that publication, the authors used two objective functions and two metrics (HV and Coverage); they determined that the best algorithm is MOMARLA, followed by MOPSO. We executed CMOSA and CMOTA for the instances of this dataset and compared our results with those published in [8] using the HV metric. However, a comparison using the Coverage metric was impossible because the Pareto fronts of these methods have not been reported [8]. In our case, we show in Appendix A the fronts of non-dominated solutions obtained with the 70 instances.

#### *6.2. Results*

The average values of 30 runs, for the six metrics obtained by CMOSA and CMOTA for the complete data set of 70 instances are shown in Tables 4 and 5. We observed that CMOSA obtained the best values for MID and IGD metrics. For Spacing and Spread, CMOTA obtained the best results. For the HV metric, both algorithms achieved the same result (0.42). We observed in Table 5 that CMOSA obtained the best coverage result.

A two-tailed Wilcoxon test was performed with a significance level of 5% (last column in Table 4) and this shows that there are no significant differences between the CMOSA and CMOTA except in MID and IGD metrics.


**Table 4.** Results obtained by the metrics for 70 instances.

\* Best result.

**Table 5.** Results obtained by the coverage metric.


\* Best result.

Table 6 shows the comparison of CMOSA and AMOSA. We observed that CMOSA obtains the best performance in all the metrics evaluated. In addition, the Wilcoxon test indicates that there are significant differences in most of them; thus, CMOSA overtakes AMOSA. We compared CMOTA and AMOSA in Table 7. In this case, CMOTA also obtains the best average results in all the metrics; however, according to the Wilcoxon test, there are significant differences in only two metrics.


**Table 6.** Comparison of CMOSA with AMOSA.

\* Best result.

**Table 7.** Comparison of CMOTA with AMOSA.


\* Best result.

In Table 8, we compare CMOSA and CMOTA with the IMOEA/D algorithm using the 58 common instances published in [14], where the MID metric was measured. This table shows the average MID value for the non-dominated set of solutions of CMOSA and CMOTA. Both algorithms achieved smaller MID values than IMOEA/D, which indicates that the Pareto fronts of our algorithms are closer to the reference point (0,0,0). The Wilcoxon test confirms that CMOSA and CMOTA surpassed IMOEA/D.
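As an illustration of why a smaller MID means a front closer to the reference point, the sketch below computes the mean Euclidean distance from the points of a front to (0,0,0). It assumes the common unnormalized definition of the mean ideal distance; the exact variant used in [14] may include a normalization that is not reproduced here, and the sample fronts are hypothetical.

```python
import math


def mean_ideal_distance(front, ideal=(0.0, 0.0, 0.0)):
    """Average Euclidean distance from each point of the front to `ideal`."""
    return sum(math.dist(point, ideal) for point in front) / len(front)


# Two hypothetical fronts: the second one lies closer to the origin.
far_front = [(3.0, 4.0, 0.0), (0.0, 0.0, 5.0)]
near_front = [(1.0, 2.0, 2.0)]
mid_far = mean_ideal_distance(far_front)    # 5.0
mid_near = mean_ideal_distance(near_front)  # 3.0
```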

**Table 8.** CMOSA, CMOTA, and IMOEA/D results obtained using MID metric.


The results of CMOSA and CMOTA were compared with the SPEA, CMOEA, MOPSO, and MOMARLA algorithms [8]. In the last reference, only two objective functions were reported, the makespan and total tardiness. The experimentation was carried out with 15 instances and the average HV values were calculated to perform the analysis of the results, which are shown in Table 9. We notice that MOMARLA surpassed SPEA, CMOEA, and MOPSO. We can observe that CMOSA obtained a better performance than MOMARLA and the other algorithms. Comparing CMOTA and MOMARLA, we notice that both algorithms obtained the same HV average results.


**Table 9.** Comparison among SPEA, CMOEA, MOPSO, CMOSA, CMOTA, and MOMARLA using HV.

\* Best result.

#### *6.3. CMOSA-CMOTA Complexity and Run Time Results*

In this section, we present the complexity of the algorithms analyzed in this paper. The algorithms' complexity is presented in Table 10; it was taken directly from the literature when it was explicitly published, or determined from the algorithms' pseudocodes otherwise. In this table, *M* is the number of objectives, Γ is the population size, *T* is the neighborhood size, *n* is the number of iterations (temperatures for AMOSA, CMOSA, and CMOTA), and *p* is the problem size. The latter is equal to *jm*, where *j* and *m* are the number of jobs and machines, respectively. Because the algorithms with the best quality metrics are CMOSA, CMOTA, MOMARLA, and MOPSO, their complexity is compared in this section.

It is well known that the complexity of classical SA is *O*(*p*<sup>2</sup> log *p*) [45]. However, we notice from Table 10 that CMOSA and CMOTA have a different complexity even though they are based on SA. This is because these new algorithms apply a different chaotic perturbation and another local search (see Algorithms 2 and 6 in lines 10–20).

The temporal function of MOMARLA, CMOSA, and CMOTA belongs to *O*(*Mnp*). For MOMARLA, *n* is the number of iterations, a variable used at the beginning of this algorithm. On the other hand, for CMOSA and CMOTA, *n* is the number of temperatures used in the algorithm, also at its beginning; in any case, the difference will be only a constant.

We note that AMOSA and MOPSO have similar complexity class expressions, that is, *O*(*n*Γ<sup>2</sup>) and *O*(*M*Γ<sup>2</sup>), respectively. However, MOPSO overtakes AMOSA because *M* is in general lower than *n*. We observe that CMOSA, CMOTA, and MOMARLA belong to the *O*(*Mnp*) complexity class, while MOPSO belongs to *O*(*M*Γ<sup>2</sup>) [46]. Thus, the ratio between them is *np*/Γ<sup>2</sup>, which in general is lower than one. Thus, CMOSA, CMOTA, and MOMARLA have a lower complexity than MOPSO. Moreover, CMOSA, CMOTA, and MOMARLA have better HV metric quality, as is shown in Table 9.

In the next paragraph, we present a comparative analysis of the execution time of the algorithms implemented in this paper.



In Table 11 we show the execution time, expressed in seconds, of the three algorithms (CMOSA, CMOTA, and AMOSA) implemented in this paper for three data sets (70, 58, and 15 instances). We emphasize that the AMOSA algorithm was the base used to design the other two algorithms. In fact, all of them have the same structure, except that CMOSA and CMOTA apply chaotic perturbations when they detect a possible stagnation. Thus, all of them have similar complexity measures for the worst case. Table 11 also shows the percentage of time saved by these two algorithms relative to AMOSA. For these data sets, CMOSA saved 2.1, 19.87, and 42.48 percent of the AMOSA run time, while the corresponding figures for CMOTA are 55, 69.89, and 46.73 percent. Thus, both of our proposed algorithms, CMOSA and CMOTA, are more efficient than AMOSA. Unfortunately, we do not have the tools to compare these algorithms versus the other algorithms' execution time in Table 1. Nevertheless, we made the quality comparisons by using the metrics previously published.

| Algorithm | CMOSA | CMOTA | AMOSA [17] |
|---|---|---|---|
| **Data set of 70 instances** | | | |
| Average execution time (s) | 495.22 | 229.42 \* | 505.84 |
| % time saved vs AMOSA | 2.1 | 55 \* | 0 |
| **Data set of 58 instances** | | | |
| Average execution time (s) | 111.68 | 41.97 \* | 139.39 |
| % time saved vs AMOSA | 19.87 | 69.89 \* | 0 |
| **Data set of 15 instances** | | | |
| Average execution time (s) | 81.24 | 75.24 \* | 141.25 |
| % time saved vs AMOSA | 42.48 | 46.73 \* | 0 |

**Table 11.** Runtimes for CMOSA, CMOTA and AMOSA.

\* Best result.
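The percentages in Table 11 are consistent with the relative saving (reference time minus algorithm time, divided by the reference time). The short Python sketch below recomputes them from the average execution times reported above; the function and variable names are hypothetical, and the recomputed values may differ from the table in the last digit because of rounding.

```python
def percent_time_saved(reference_time, algorithm_time):
    """Percentage of the reference run time saved by an algorithm."""
    return 100.0 * (reference_time - algorithm_time) / reference_time


# Average execution times in seconds, copied from Table 11.
average_times = {
    70: {"CMOSA": 495.22, "CMOTA": 229.42, "AMOSA": 505.84},
    58: {"CMOSA": 111.68, "CMOTA": 41.97, "AMOSA": 139.39},
    15: {"CMOSA": 81.24, "CMOTA": 75.24, "AMOSA": 141.25},
}

savings = {
    instances: {
        name: percent_time_saved(times["AMOSA"], times[name])
        for name in ("CMOSA", "CMOTA")
    }
    for instances, times in average_times.items()
}
```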

#### **7. Conclusions**

This paper presents two multi-objective algorithms for JSSP, named CMOSA and CMOTA, with three objectives and six metrics. The objective functions for these algorithms are makespan, total tardiness, and total flow time. Regarding the results from the comparison of CMOSA and CMOTA with AMOSA, we observe that both algorithms obtained a well-distributed Pareto front, closest to the origin, and closest to the approximate Pareto front as was indicated by Spacing, MID, and IGD metrics, respectively. Thus, using these five metrics, we found that CMOSA and CMOTA surpassed the AMOSA algorithm. Regarding the volume covered by the front calculated by the HV metric, it was observed that both algorithms, CMOSA and CMOTA, have the same performance; however, CMOSA has a higher convergence than CMOTA. In addition, the proposed algorithms surpass IMOEA/D when MID metric was used. Moreover, we use the HV to compare the proposed algorithms with SPEA, CMOEA, MOPSO, and MOMARLA. We found that CMOSA outperforms these algorithms, followed by CMOTA, MOMARLA, and MOPSO.

We observe that CMOSA and CMOTA have a complexity similar to that of the best algorithms in the literature. In addition, we show that CMOSA and CMOTA surpass AMOSA when we compare them using execution time for three data sets. We found that CMOTA is, on average, 50 percent faster than AMOSA and CMOSA. Finally, we conclude that CMOSA and CMOTA have a temporal complexity similar to that of the best algorithms in the literature, and the quality metrics show that the proposed algorithms outperform them.

**Author Contributions:** Conceptualization: J.F.-S., L.H.-R., G.C.-V.; Methodology: J.F.-S., L.H.-R., G.C.-V., J.J.G.-B.; Investigation: J.F.-S., L.H.-R., G.C.-V., J.J.G.-B.; Software: J.F.-S., L.H.-R., G.C.-V., J.J.G.-B.; Formal Analysis: J.F.-S., G.C.-V.; Writing original draft: J.F.-S., L.H.-R., G.C.-V.; Writing review and editing: J.F.-S., J.J.G.-B., J.P.S.-H. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Acknowledgments:** The authors would like to express their gratitude to CONACYT and TecNM/IT Ciudad Madero. In addition, the authors acknowledge the support from Laboratorio Nacional de Tecnologías de la Información (LaNTI) for the access to the cluster.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Appendix A. Non-Dominated Front Obtained**

The non-dominated solutions obtained by CMOSA algorithm for the 70 instances used are shown in Tables A1–A6, and the non-dominated solutions obtained by CMOTA algorithm for the same instances are shown in Tables A7–A12. In these tables, MKS is the makespan, TDS is the total tardiness and FLT is the total flow time. For each instance, the best value for each objective function is highlighted with an asterisk (\*) and in bold type.

**Table A1.** Non-dominated front obtained by CMOSA for the JSSP instances proposed by [40].


**Table A2.** Non-dominated front obtained by CMOSA for the JSSP instances proposed by [41].




**Table A3.** Non-dominated front obtained by CMOSA for the JSSP instances proposed by [42].





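| # | LA31 MKS | LA31 TDS | LA31 FLT | LA32 MKS | LA32 TDS | LA32 FLT | LA33 MKS | LA33 TDS | LA33 FLT | LA34 MKS | LA34 TDS | LA34 FLT | LA35 MKS | LA35 TDS | LA35 FLT |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | **1784 \*** | 20,830.5 | 43,617 | **1850 \*** | 20,861.5 | 45,715 | **1719 \*** | 20,933.5 | 43,387 | **1743 \*** | 22,605.5 | 45,617 | **1898 \*** | 24,225.5 | 47,233 |
| 2 | 1794 | 20,718.5 | 43,505 | 1867 | 20,860.5 | 45,714 | 1721 | 18,798.5 | 41,252 | 1747 | 21,475.5 | 44,487 | 1899 | 23,434.5 | 46,652 |
| 3 | 1796 | 20,390.5 | 43,177 | 1871 | 20,686.5 | 45,540 | 1723 | 18,528.5 | 40,982 | 1755 | 21,271.5 | 44,283 | 1900 | 22,784.5 | 46,012 |
| 4 | 1797 | 20,066.5 | 42,842 | 1881 | 20,563.5 | 45,417 | 1725 | 18,137.5 | 40,591 | 1756 | 21,211.5 | 44,223 | 1901 | 22,724.5 | 45,952 |
| 5 | 1798 | 20,009.5 | 42,785 | 1889 | 20,059.5 | 44,913 | 1738 | **18,109.5 \*** | **40,563 \*** | 1759 | 21,041.5 | 44,037 | 1903 | 22,684.5 | 45,912 |
| 6 | 1800 | **19,919.5 \*** | **42,695 \*** | 1900 | **20,049.5 \*** | **44,903 \*** | | | | 1771 | 20,916.0 | 43,916 | 1920 | 22,481.5 | 45,709 |
| 7 | | | | | | | | | | 1774 | 20,787.0 | 43,787 | 1947 | 22,677.0 | 45,695 |
| 8 | | | | | | | | | | 1781 | 20,736.0 | 43,736 | 1950 | 22,442.5 | 45,670 |
| 9 | | | | | | | | | | 1791 | 20,693.5 | 43,705 | 1953 | 22,454.0 | 45,665 |
| 10 | | | | | | | | | | 1801 | 20,505.5 | 43,517 | 1958 | 22,327.5 | 45,555 |
| 11 | | | | | | | | | | 1837 | 20,476.5 | 43,488 | 2018 | **22,311.5 \*** | **45,539 \*** |
| 12 | | | | | | | | | | 1839 | 20,356.5 | 43,368 | | | |
| 13 | | | | | | | | | | 1840 | 20,305.5 | 43,317 | | | |
| 14 | | | | | | | | | | 1843 | 20,298.5 | 43,310 | | | |
| 15 | | | | | | | | | | 1850 | 20,072.5 | 43,084 | | | |
| 16 | | | | | | | | | | 1906 | **19,880.5 \*** | **42,892 \*** | | | |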




**Table A4.** Non-dominated front obtained by CMOSA for the JSSP instances proposed by [43].

**Table A5.** Non-dominated front obtained by CMOSA for the JSSP instances proposed by [44].


**Table A6.** Non-dominated front obtained by CMOSA for the JSSP instances proposed by [30].




**Table A7.** Non-dominated front obtained by CMOTA for the JSSP instances proposed by [40].


**Table A8.** Non-dominated front obtained by CMOTA for the JSSP instances proposed by [41].


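| # | MKS | TDS | FLT |
|---|---|---|---|
| 19 | 1189 | 1279.0 | 9481 |
| 20 | 1202 | 1303.0 | 9252 |
| 21 | 1266 | 1249.5 | 9639 |
| 22 | 1284 | **1198.5 \*** | 9588 |

| # | ORB6 MKS | ORB6 TDS | ORB6 FLT | ORB7 MKS | ORB7 TDS | ORB7 FLT | ORB8 MKS | ORB8 TDS | ORB8 FLT | ORB9 MKS | ORB9 TDS | ORB9 FLT | ORB10 MKS | ORB10 TDS | ORB10 FLT |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | **1090 \*** | 1382.5 | 9489 | **433 \*** | 226.0 | 3813 | **1016 \*** | 1919.5 | 8465 | **1009 \*** | 1646.5 | 9402 | **1055 \*** | 1366.5 | 9211 |
| 2 | 1091 | 1284.5 | 9341 | 437 | 225.0 | 3770 | 1025 | 1635.5 | **8181 \*** | 1013 | 1595.0 | 9331 | 1065 | 790.5 | 8899 |
| 3 | 1134 | 1078.0 | 9177 | 439 | 271.5 | 3707 | 1047 | 1617.0 | 8457 | 1016 | 1534.0 | 9251 | 1108 | 843.0 | 8834 |
| 4 | 1153 | 1059.0 | 9182 | 453 | 220.0 | 3742 | 1148 | 1575.0 | 8319 | 1027 | 1644.0 | 9187 | 1114 | **686.5 \*** | 8810 |
| 5 | 1168 | 969.0 | **9030 \*** | 465 | 236.0 | 3697 | 1150 | 1564.0 | 8312 | 1036 | 1669.0 | 9130 | 1115 | 687.5 | 8795 |
| 6 | 1204 | 945.0 | 9072 | 471 | **173.5 \*** | **3620 \*** | 1176 | 1565.0 | 8294 | 1043 | 1479.0 | 9206 | 1246 | 1080.0 | **8747 \*** |
| 7 | 1221 | **907.0 \*** | 9034 | | | | 1184 | **1502.0 \*** | 8301 | 1063 | 1360.0 | 8975 | | | |
| 8 | | | | | | | | | | 1064 | **1355.0 \*** | 8966 | | | |
| 9 | | | | | | | | | | 1066 | 1378.0 | 8942 | | | |
| 10 | | | | | | | | | | 1073 | 1358.5 | 8956 | | | |
| 11 | | | | | | | | | | 1083 | 1426.0 | **8885 \*** | | | |
| 12 | | | | | | | | | | 1092 | 1417.0 | 8914 | | | |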


**Table A9.** Non-dominated front obtained by CMOTA for the JSSP instances proposed by [42].





**Table A10.** Non-dominated front obtained by CMOTA for the JSSP instances proposed by [43].

**Table A11.** Non-dominated front obtained by CMOTA for the JSSP instances proposed by [44].


**Table A12.** Non-dominated front obtained by CMOTA for the JSSP instances proposed by [30].




#### **References**


## *Article* **Differential Evolution under Fixed Point Arithmetic and FP16 Numbers**

**Luis Gerardo de la Fraga**

Computer Science Department, Center for Research and Advanced Studies of the National Polytechnic Institute (CINVESTAV), Ciudad de Mexico 07360, Mexico; fraga@cs.cinvestav.mx; Tel.: +52-55-57473755

**Abstract:** In this work, the behavior of the differential evolution algorithm is analyzed under fixed point arithmetic and also using half-precision floating point (FP) numbers of 16 bits, known as FP16. It is important to analyze differential evolution (DE) in these circumstances with the goal of reducing its power consumption and the storage size of its variables, and of improving its speed. All these aspects become important if one needs to design dedicated hardware, such as an embedded DE within a circuit chip, that performs optimization. Under these conditions, DE is tested using three common multimodal benchmark functions: Rosenbrock, Rastrigin, and Ackley, in 10 dimensions. Results are obtained in software by simulating all numbers using the C programming language.

**Keywords:** differential evolution; fixed point arithmetic; FP16; pseudo random number generator

## **1. Introduction**

The use of different number types in machine learning applications has been analyzed extensively in previous years, more specifically in deep learning neural networks [1,2]. These kinds of neural networks use the convolution as the basic function, have thousands of parameters, and must be trained first; that is, the network must be optimized by modifying all the parameters to obtain a local minimum of the goal function. The optimization step is called training, and it could take hours on modern hardware based on general purpose graphics processing units (GPGPUs). A special type of number, Brain Floating Point (bfloat16), which is a half-precision FP format of 16 bits with the same range as the usual single precision FP numbers (float in the C programming language, 32 bits long), has been proposed for training deep learning neural networks [2]. Other FP numbers of 16 bit length are the so-called FP16 numbers; these are an IEEE standard [1,2] for half-precision FP numbers and can be used on ARM processors.

The goal of using different, shorter numbers in machine learning applications is to improve the speed, and as a consequence reduce the power consumption as it would take less time to train a deep learning network, and also reduce the storage memory or disk size for the variables. In [1] it is mentioned that half precision is also attractive for accelerating general purpose scientific computing, such as weather forecasting, climate modeling, and solution of linear systems of equations. The supercomputer Summit (it was in the Top 500 list https://www.top500.org (accessed on 3 February 2021)), has a peak performance of 148.6 petaflops in the LINPACK benchmark, a benchmark that employs only double precision. For a genetics application that uses half precision, the same machine has a peak performance of 2.36 exaflops [1].

In this work it is proposed to analyze the well known heuristic for single objective optimization, the differential evolution (DE) algorithm, under FP16 numbers, and also under fixed point arithmetic that uses integer numbers of different lengths. This analysis is important if we think of embedded optimization algorithms within a chip [3], which performs a dedicated task. One constraint in these kinds of applications must be that the power consumption is as low as possible. Also it is important if one designs a dedicated algorithm in hardware, just as in FPGAs (Field Programmable Gate Arrays), to accelerate its behavior. Also, another possible application is to execute a fast and small DE inside each core in a GPGPU. These three application scenarios justify the analysis of the DE performed in this work.

**Citation:** de la Fraga, L.G. Differential Evolution under Fixed Point Arithmetic and FP16 Numbers. *Math. Comput. Appl.* **2021**, *26*, 13. https://doi.org/10.3390/mca26010013

Academic Editor: Leonardo Trujillo

Received: 19 December 2020 Accepted: 2 February 2021 Published: 4 February 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

The rest of this article is organized as follows: in Section 2 a very brief description of fixed point arithmetic and FP numbers is given. In Section 3 the DE algorithm is analyzed to determine which parts could be improved by using other number types. In Section 4 some experiments and their results are described. Finally, in Section 6 some conclusions are presented.

#### **2. Fixed Point Arithmetic and Floating Point Numbers**

The notation *a*.*b* will be used here to represent a set of integer numbers that uses *a* bits in the integer part and *b* bits in the fractional part. Each number is of size *a* + *b* + 1 bits, where the additional bit is the sign bit.

For a number *x* ∈ *a*.*b*, the range of numbers that can be represented is:

$$-2^a \le x \le 2^a - 2^{-b} \tag{1}$$

Summing up two numbers *a*.*b* results in a number (*a* + 1).*b* [4]. The multiplication of two numbers *a*.*b* results in a number (2*a* + 1).2*b* [4]. It is possible to verify these results by applying the respective operation to two extreme numbers in (1).

Microprocessors offer the sum and multiplication of two integer numbers, and the result is stored in a number of the same size as the operands. In a hardware design for a given application, one must use a big enough number to store the sum of two *a*.*b* numbers, and the result of multiplying two *a*.*b* numbers must be converted back to an *a*.*b* number. The easiest way to perform this is by truncating the result: the resulting 2*a*.2*b* number is shifted *b* bits to the right; again, the number must be big enough to store the resulting *a*.*b* number. In a PC, if one uses 32 bit integer numbers, the first bit is the sign bit, and then one could multiply two values of up to √(2<sup>31</sup>) = 2<sup>31/2</sup> and keep the result within the 31 bits used. In any application, normally one does not check whether the numbers used can hold the result of the operations applied to them, and one trusts that the numbers are big enough to store the results.
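The following Python sketch illustrates the *a*.*b* representation, the range in (1), and the multiplication followed by a right shift of *b* bits. It is only a sketch: Python integers have arbitrary size, so the overflow concerns discussed above do not show up here, and the choice *b* = 16 as well as all function names are assumptions made for the example.

```python
FRAC_BITS = 16  # b in the a.b notation (value assumed for illustration)


def to_fixed(x, b=FRAC_BITS):
    """Encode a real number as an integer holding b fractional bits."""
    return int(round(x * (1 << b)))


def to_float(q, b=FRAC_BITS):
    """Decode a fixed point integer back to a Python float."""
    return q / (1 << b)


def fixed_range(a, b):
    """Smallest and largest values representable in a.b, as in (1)."""
    return -(2.0 ** a), 2.0 ** a - 2.0 ** (-b)


def fixed_add(p, q):
    """Sum of two a.b numbers; the result may need a + 1 integer bits."""
    return p + q


def fixed_mul(p, q, b=FRAC_BITS):
    """Product of two a.b numbers, truncated back to b fractional bits.

    The raw product carries 2b fractional bits, so it is shifted b bits
    to the right.
    """
    return (p * q) >> b


product = to_float(fixed_mul(to_fixed(1.5), to_fixed(2.25)))  # 3.375
```

Values such as 0.1 are not exactly representable with 16 fractional bits, so products involving them are only approximate after truncation.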

The operations sum and multiplication of two integer numbers are the fastest because each operation is built in the hardware and both take a single clock step.

The sum and multiplication of two FP numbers is totally different. An FP number is composed as *s* · 2<sup>*e*</sup>, where *s* is the significand and *e* the exponent. If *p* bits are used for the significand, it is an integer that could take values from 0 to 2<sup>*p*</sup> − 1. The exponent *e* is an integer number too. The sum of two FP numbers is carried out by first expressing both numbers with the same exponent, then summing up both significands. The greater exponent of both numbers is used to express them with the same exponent. The result must be rounded to express the same number of bits used in the significands. Also, the result could be normalized, which means that the exponent will have a single binary precision number.

The multiplication takes more steps because two numbers *s*<sub>1</sub> · 2<sup>*e*<sub>1</sub></sup> and *s*<sub>2</sub> · 2<sup>*e*<sub>2</sub></sup> are multiplied as *s* = *s*<sub>1</sub> · *s*<sub>2</sub> and the exponents are summed (*e* = *e*<sub>1</sub> + *e*<sub>2</sub>), and also both results are rounded and the final result is normalized.

In the IEEE 754 standard [5], an FP number has a sign bit, *i*, and the represented number is equal to (−1)<sup>*i*</sup> · *s* · 2<sup>*e*</sup>, where *e*<sub>min</sub> ≤ *p* + *e* − 1 ≤ *e*<sub>max</sub>. The values used in common FP numbers are shown in Table 1.


**Table 1.** Characteristics of floating point (FP) numbers in the IEEE 754 standard.

Floating point operations take more than a clock cycle within a microprocessor.

The IEEE 754 standard [5] specifies many more aspects that are necessary to work with FP numbers, such as rounding methods, Not a Number (NaN), infinities, and how to handle exceptions. In [6] all these details about FPs are explained.
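To get a feel for the precision of half-precision numbers, the sketch below rounds a Python float to the nearest IEEE 754 binary16 value using the `e` format of the standard `struct` module. It is an illustration in Python rather than the C simulation used in this work, and the helper name `to_fp16` is hypothetical.

```python
import struct


def to_fp16(x):
    """Round a Python float (binary64) to the nearest binary16 value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


exact_half = to_fp16(0.5)        # 0.5 is exactly representable
rounded_tenth = to_fp16(0.1)     # 0.0999755859375, the nearest FP16 value
rounded_int = to_fp16(2049.0)    # 2048.0, integers above 2048 are not all exact
largest = to_fp16(65504.0)       # largest finite FP16 value
```

The rounding of 2049 to 2048 shows that only 11 significand bits are available, so the spacing between consecutive FP16 numbers grows quickly with their magnitude.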

#### **3. DE Analysis**

DE is a heuristic used for global optimization over continuous spaces. DE solves problems of the form:

$$\begin{aligned} \text{minimize: } f(\mathbf{x}), \\ \text{subject to: } \mathbf{g}(\mathbf{x}) \ge 0, \text{and} \\ \mathbf{h}(\mathbf{x}) = 0, \\ \mathbf{x} \in \mathcal{S} \subset \mathbb{R}^n. \end{aligned} \tag{2}$$

where *f* : ℝ<sup>*n*</sup> → ℝ is the function to optimize; **x** ∈ ℝ<sup>*n*</sup>, that is, the problem has *n* variables; **g** : ℝ<sup>*n*</sup> → ℝ<sup>*m*<sub>1</sub></sup> represents *m*<sub>1</sub> inequality constraints; and **h** : ℝ<sup>*n*</sup> → ℝ<sup>*m*<sub>2</sub></sup> represents *m*<sub>2</sub> equality constraints. The solution **x** to the problem is in a subset *S* of the whole search space ℝ<sup>*n*</sup> where the constraints are satisfied; this space *S* is called the *feasible space*.

Also, the *search space* contains the feasible space and is defined by the *box constraints*:

$$x_{i} \in [l_{i}, u_{i}], \text{ for } i = \{1, 2, \dots, n\}. \tag{3}$$

That is, each variable *x*<sub>*i*</sub> is searched in the interval defined by the lower bound value *l*<sub>*i*</sub> and the upper bound value *u*<sub>*i*</sub>, for *i* = {1, 2, . . . , *n*}.

Constraints can be incorporated into the problem (2) by modifying the objective function as:

$$f_1(\mathbf{x}) = f(\mathbf{x}) + \alpha \sum_{i=1}^{m_1} \min[0, g_i(\mathbf{x})]^2 + \beta \sum_{i=1}^{m_2} h_i^2(\mathbf{x}) \tag{4}$$

Now *f*<sub>1</sub> will be optimized instead of *f* in (2). *α* and *β* in (4) represent the penalty coefficients that weigh the relative importance of each kind of constraint.
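The penalized objective in (4) can be written as a small higher-order function. The Python sketch below is only illustrative: the function name `penalized` and the toy problem (objective, constraints, and penalty coefficients) are hypothetical and are not taken from the experiments of this paper.

```python
def penalized(f, inequalities, equalities, alpha, beta):
    """Build f1 from (4): f plus quadratic penalties for violated constraints.

    `inequalities` holds functions g_i with g_i(x) >= 0 when satisfied;
    `equalities` holds functions h_i with h_i(x) = 0 when satisfied.
    """
    def f1(x):
        # min(0, g_i(x)) is non-zero only when the inequality is violated.
        inequality_penalty = sum(min(0.0, g(x)) ** 2 for g in inequalities)
        equality_penalty = sum(h(x) ** 2 for h in equalities)
        return f(x) + alpha * inequality_penalty + beta * equality_penalty
    return f1


# Toy problem: minimize x0^2 + x1^2 subject to x0 - 1 >= 0 and x0 + x1 - 2 = 0.
f1 = penalized(
    f=lambda x: x[0] ** 2 + x[1] ** 2,
    inequalities=[lambda x: x[0] - 1.0],
    equalities=[lambda x: x[0] + x[1] - 2.0],
    alpha=10.0,
    beta=10.0,
)
feasible_value = f1([1.0, 1.0])    # 2.0, no penalty is added
infeasible_value = f1([0.0, 0.0])  # 0 + 10 * 1 + 10 * 4 = 50.0
```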

One important point about DE is that the heuristic needs to only evaluate the problem to solve. Classical mathematical optimization methods use the first and perhaps also the second derivative of the given problem. These derivatives are easy to obtain if one has in hand the mathematical expression to the given problem. It is possible to approximate the derivatives numerically but with a very high computational cost [7].

According to the test in the CEC 2005 conference [8], DE is the second best heuristic to solve real parameter optimization problems, when the number of parameters is around 10. The DE pseudocode is shown in Algorithm 1.

DE works with a population that is composed of a set of individuals, or vectors, of real numbers. All vectors are initialized with random numbers with a uniform distribution within the search bounds of each parameter (line 1 in Algorithm 1). For a certain number of iterations (line 4) the population is modified, and this modified population could replace the original individuals. The core of DE is in the loop on lines 8–13: a new individual is generated from three different individuals chosen randomly; each value of the new vector (it represents a new individual) is calculated from the first father, plus the difference of the other two fathers multiplied by *F*, the difference constant; the new vector value is calculated if a random real number (between zero and one) is less than *R*, the DE's recombination constant. To prevent the case when the new individual could be equal to the current father *i*, at least one vector component (a variable value) is forced to be calculated from the random fathers' values: this is done in line 9 of the pseudocode, when *j* = *j*rand, where *j*rand is an integer random number between 1 and *n*. In lines 10–12 it is checked whether each combined variable value is within the search space. Then the new individual is evaluated, and if it is better than the father (lines 14–16), the child replaces its father. The stop condition used here is met when the number of iterations is greater than a maximum number of iterations or when the difference in the objective function values of the worst and best individuals is less than *v*. This stop condition is called the *diff* criterion in [9], and is recommended for a global optimization task.

#### **Algorithm 1** Differential evolution algorithm (rand/1/bin version)

**Require:** The search space and the value *v* for the stop condition. The values for population size, *μ*; maximum number of generations, *g*; difference and recombination constants, *F* and *R*, respectively.

**Ensure:** A solution of the minimization problem

1: initialize (*P* = {**x**<sub>1</sub>, **x**<sub>2</sub>, ..., **x**<sub>*μ*</sub>})

4: **repeat**

5: **for** *i* = 1 to *μ* **do**

6: select three different random integers *r*<sub>1</sub>, *r*<sub>2</sub>, *r*<sub>3</sub> in [1, *μ*]

7: *j*<sub>rand</sub> = random integer in [1, *n*]

8: **for** *j* = 1 to *n* **do**

9: *x*′<sub>*j*</sub> = *x*<sub>*r*3,*j*</sub> + *F*(*x*<sub>*r*1,*j*</sub> − *x*<sub>*r*2,*j*</sub>) if *U*(0, 1) < *R* or *j* = *j*<sub>rand</sub>; *x*<sub>*i*,*j*</sub> otherwise

10: **if** *x*′<sub>*j*</sub> < *l*<sub>*j*</sub> or *x*′<sub>*j*</sub> > *u*<sub>*j*</sub> **then** (check bounds)

11: *x*′<sub>*j*</sub> = *U*(0, 1)(*u*<sub>*j*</sub> − *l*<sub>*j*</sub>) + *l*<sub>*j*</sub>

12: **end if**

13: **end for**

14: **if** *f*(**x**′) < *f*(**x**<sub>*i*</sub>) **then**

15: **x**<sub>*i*</sub> = **x**′

16: **end if**

17: **end for**

18: min = *f*(**x**<sub>1</sub>), max = *f*(**x**<sub>1</sub>)

19: **for** *i* = 2 to *μ* **do**

20: **if** *f*(**x**<sub>*i*</sub>) < min **then**

21: min = *f*(**x**<sub>*i*</sub>)

22: **end if**

23: **if** *f*(**x**<sub>*i*</sub>) > max **then**

24: max = *f*(**x**<sub>*i*</sub>)

25: **end if**

26: **end for**

27: *k* ← *k* + 1

28: **until** (max − min) < *v* or *k* > *g*

A general form to set the parameter values for DE is: if *d* is the number of variables, the population size is set to 10*d*, *F* ∈ [0.5, 1.0], and *R* ∈ [0.8, 1.0] [9].
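As an illustration of Algorithm 1, the following is a minimal Python sketch of the rand/1/bin variant with the *diff* stop criterion. It is not the author's implementation: it uses ordinary double precision floats, the function name `differential_evolution`, the `seed` argument, and the default values are assumptions made for the example, and the population size follows the 10*d* rule of thumb given above.

```python
import random

def differential_evolution(f, bounds, F=0.8, R=0.9, v=1e-4, g=1000, seed=0):
    """rand/1/bin DE; stops when (max - min) < v or after more than g generations."""
    rng = random.Random(seed)
    n = len(bounds)
    mu = 10 * n                                    # population size 10*d
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(mu)]
    fit = [f(x) for x in pop]
    k = 0
    while True:
        for i in range(mu):
            # three different random individuals and one forced index
            r1, r2, r3 = rng.sample(range(mu), 3)
            j_rand = rng.randrange(n)
            trial = list(pop[i])
            for j in range(n):
                if rng.random() < R or j == j_rand:
                    trial[j] = pop[r3][j] + F * (pop[r1][j] - pop[r2][j])
                lo, hi = bounds[j]
                if trial[j] < lo or trial[j] > hi:  # re-sample values out of bounds
                    trial[j] = rng.uniform(lo, hi)
            ft = f(trial)
            if ft < fit[i]:                         # the child replaces its parent
                pop[i], fit[i] = trial, ft
        k += 1
        if max(fit) - min(fit) < v or k > g:
            break
    best = min(range(mu), key=fit.__getitem__)
    return pop[best], fit[best], k

# Example use on a 3-variable sphere function (an assumed test problem).
x_best, f_best, generations = differential_evolution(
    lambda x: sum(xi * xi for xi in x), [(-10.0, 10.0)] * 3)
```

The sketch updates the population in place, as Algorithm 1 does, so an improved individual can already act as a parent later in the same generation.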

The DE in Algorithm 1 can be improved by using a random integer number generator such as the one described in [10], which uses neither divisions nor FP numbers. This idea could improve the algorithm in line 6 (to generate three numbers in the interval [1, *μ*]) and in line 7, where another random integer number is generated in the interval [1, *n*]. Also, the values for *F* and *R* are within the interval [0.5, 1.0], and usually no more than one or two decimal digits are used for these constants; thus, these values are not affected by using half precision numbers (see Table 1). Moreover, purely integer arithmetic could be used in the comparison *U*(0, 1) < *R* (line 9 in Algorithm 1) if it is replaced by rand(1, 2<sup>31</sup>) < *I*, with *I* = 2<sup>31</sup> · *R*.
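The integer comparison rand(1, 2<sup>31</sup>) < *I* can be sketched as follows. This is only an illustration: the helper names `crossover_threshold` and `crosses` are hypothetical, and Python's standard generator stands in for the division-free generator of [10].

```python
import random

BITS = 31

def crossover_threshold(R):
    """Integer threshold I = 2^31 * R, computed once outside the DE loop."""
    return int(R * (1 << BITS))

def crosses(rng, I):
    """Integer-only replacement for the test U(0, 1) < R."""
    return rng.randrange(1, (1 << BITS) + 1) < I   # rand(1, 2^31) < I

rng = random.Random(1)
I = crossover_threshold(0.9)
hits = sum(crosses(rng, I) for _ in range(20000))
frequency = hits / 20000          # close to R = 0.9
```

Only the threshold needs a multiplication by *R*, and it is computed once; inside the loop the test is a comparison of two integers.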

Two implementations of DE were used in this work: one with fixed point arithmetic, and another one using FP16 numbers. The implementation with fixed point arithmetic uses 32-bit integers for all the variables. The implementation using FP16 numbers uses half precision floats (FP16, 16 bits) for all the variables. A computer with a 64-bit architecture was used in this paper; therefore, the product of two integers was stored in a 64-bit long variable, then shifted and truncated to a 32-bit integer. The core part of DE (lines 8–13 in Algorithm 1) calculates the selected and mutated vector **x**′ as:

$$x'\_{j} = \begin{cases} x\_{r\_3,j} + F(x\_{r\_1,j} - x\_{r\_2,j}) & \text{if } U(0,1) < R \text{ or } j = j\_{\text{rand}}, \\ x\_{i,j} & \text{otherwise,} \end{cases} \tag{5}$$

for *j* = 1, 2, ..., *n*, that is, for each variable of the given problem. Thus, one subtraction (*x*<sub>*r*1,*j*</sub> − *x*<sub>*r*2,*j*</sub>) followed by one multiplication (by the constant *F*) and one addition (with *x*<sub>*r*3,*j*</sub>) are needed to calculate the new vector **x**′. The greatest value for *F* could be 1; if the search space is the same for all variables, the result in (5) could be double the current *x*<sub>*j*</sub> value.

Then, the maximum possible values in the search space could be double the bound values of the search space. Another problem is to find the maximum possible value in the function space. Also, it is not clear how many bits are necessary in the fractional part for the fixed point arithmetic. These issues are addressed in the following section.
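The fixed point computation of Equation (5) can be sketched as below. This is an assumed illustration, not the code used in the paper: the number of fractional bits (`FRAC_BITS = 11`) and all helper names are chosen for the example, and Python integers (which are unbounded) are wrapped explicitly to mimic the 64-bit product truncated to 32 bits.

```python
FRAC_BITS = 11                       # fractional bits, assumed for illustration
MASK32 = 0xFFFFFFFF

def to_fixed(x, frac_bits=FRAC_BITS):
    """Convert a real value to its fixed point integer representation."""
    return int(round(x * (1 << frac_bits)))

def to_float(q, frac_bits=FRAC_BITS):
    """Convert a fixed point integer back to a real value."""
    return q / (1 << frac_bits)

def wrap32(q):
    """Truncate to a signed 32-bit integer, as a C int would."""
    q &= MASK32
    return q - (1 << 32) if q & 0x80000000 else q

def fx_mul(a, b, frac_bits=FRAC_BITS):
    """Multiply in a wide product, shift back, and truncate to 32 bits."""
    return wrap32((a * b) >> frac_bits)

def fx_mutate(xr1, xr2, xr3, F_fx, frac_bits=FRAC_BITS):
    """Fixed point version of x_r3 + F * (x_r1 - x_r2)."""
    return wrap32(xr3 + fx_mul(F_fx, wrap32(xr1 - xr2), frac_bits))

# 2.0 + 0.5 * (1.0 - 0.5) = 2.25
child = fx_mutate(to_fixed(1.0), to_fixed(0.5), to_fixed(2.0), to_fixed(0.5))
```

The shift after the product is what keeps the binary point in place; the number of integer bits must be large enough for the intermediate sum, which is the question the next section addresses.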

#### **4. Experiments with Three Multimodal Functions in 10 Dimensions**

Three very well known benchmark functions were used: shifted versions of the Rosenbrock, Rastrigin, and Ackley functions in 10 dimensions. All these functions are multimodal, which justifies solving them with the DE heuristic. The Rosenbrock function used is defined as:

$$f\_1(\mathbf{x}) = 0.39 + \frac{1}{10} \sum\_{i=1}^{n-1} \left\{ \left[ (\mathbf{x}\_i + 1)^2 - (\mathbf{x}\_{i+1} + 1) \right]^2 + \frac{\mathbf{x}\_i^2}{100} \right\},\tag{6}$$

its minimum value is 0.39 with **x** = [0, 0, . . . , 0].

The Rastrigin function is defined as:

$$f\_2(\mathbf{x}) = -33 + \sum\_{i=1}^{n} \left[ \frac{x\_i^2}{10} - \cos(2\pi x\_i) + 1 \right],\tag{7}$$

its minimum value is −33 for **x** = [0, 0, . . . , 0].

The Ackley function is defined as:

$$f\_3(\mathbf{x}) = \frac{1}{20} \left\{ e - \exp\left[\frac{1}{n} \sum\_{i=1}^{n} \cos(2\pi x\_i) \right] \right\} - 6 - \exp\left[-\frac{1}{5} \sqrt{\frac{1}{n} \sum\_{i=1}^{n} x\_i^2} \right], \tag{8}$$

its minimum value is −7, also at **x** = [0, 0, ... , 0]. These three functions are scaled versions of the ones defined in [11], in order to keep their amplitudes within the range of half precision FP numbers (see Table 1). A summary of these three functions is given in Table 2.

All functions were programmed in single precision FP (float in C) arithmetic.
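For reference, the three test functions can be written in a few lines of Python. This sketch uses Python's double precision floats rather than the single precision C code mentioned above, and the names `f1`, `f2`, and `f3` simply follow Equations (6)–(8).

```python
import math

def f1(x):
    """Scaled, shifted Rosenbrock function, Equation (6)."""
    s = sum(((x[i] + 1) ** 2 - (x[i + 1] + 1)) ** 2 + x[i] ** 2 / 100
            for i in range(len(x) - 1))
    return 0.39 + s / 10

def f2(x):
    """Scaled, shifted Rastrigin function, Equation (7)."""
    return -33 + sum(xi ** 2 / 10 - math.cos(2 * math.pi * xi) + 1 for xi in x)

def f3(x):
    """Scaled, shifted Ackley function, Equation (8)."""
    n = len(x)
    mean_cos = sum(math.cos(2 * math.pi * xi) for xi in x) / n
    rms = math.sqrt(sum(xi ** 2 for xi in x) / n)
    return (math.e - math.exp(mean_cos)) / 20 - 6 - math.exp(-rms / 5)

origin = [0.0] * 10
minima = (f1(origin), f2(origin), f3(origin))   # about (0.39, -33, -7)
```

Evaluating the functions at the origin reproduces the minimum values stated in the text, which is a quick check on the scaling constants.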

**Table 2.** The three test functions used in this work. The search space was set to [−10, 10], thus the shown values are the extreme possible values that the functions could take, also the minimum value is shown at the optimum solution **x** = [0, ... , 0], and the evaluation at **x** = [1, ... , 1] is shown in the last column.


The number of bits used for the integer and fractional parts for the simulations in fixed point arithmetic is shown in Table 3. The number of bits in the integer part is set according to Table 2 because the maximum number in the third column in Table 3 must be greater than the maximum extreme value shown in Table 2.

**Table 3.** Calculation of the number of bits in the integer part for the simulations using fixed point arithmetic. Numbers shown here must be greater than the corresponding ones in Table 2 to permit the optimization operations for differential evolution (DE).


The resulting statistics for the simulations, using 100 runs per number of bits in the fractional part and 100 runs with FP16 arithmetic, are shown in Tables 4–6 for the Rosenbrock, Rastrigin, and Ackley functions, respectively. Those tables show the statistics for the number of generations and the obtained function values. The number of bits used in the integer part is shown in Table 3. These numbers of bits were calculated from the data in Table 2; for example, for the Rosenbrock function the maximum function value in Table 2 is 10,891.29, so the integer part must be able to represent a number greater than this value, and therefore 14 bits were selected because 2<sup>14</sup> = 16,384 > 10,891.29. The corresponding variable values at the minimum of each function for the FP16 simulations are shown in Table 7. The mean value obtained in the FP16 simulation for the Rosenbrock function is 0.391538 (see the end of the sixth column in Table 4). The equivalent mean for the fixed point arithmetic is 0.391079 with 11 bits in the fractional part; the associated variable values for this simulation with 11 bits are also shown in Table 7. The same procedure was repeated for the Rastrigin and Ackley functions, and the results are also shown in Table 7.
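The choice of 14 integer bits for the Rosenbrock function can be reproduced with a short helper. The function name `integer_bits` is hypothetical, and the sketch counts magnitude bits only (a sign bit, if needed, is extra).

```python
def integer_bits(max_abs_value):
    """Smallest b such that 2**b is greater than max_abs_value."""
    b = 0
    while (1 << b) <= max_abs_value:
        b += 1
    return b

rosenbrock_bits = integer_bits(10891.29)   # 2**14 = 16384 > 10891.29
```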


**Table 4.** Statistics of the 100 runs per bits used in the fractional part for the fixed point arithmetic and for the Rosenbrock function (14 bits were used for the integer part). Results for 100 runs for the FP16 are also shown. *g* represents the number of generations.

**Table 5.** Simulation results for Rastrigin function. Statistics of the 100 runs per bits used in the fractional part for the fixed point arithmetic (7 bits were used for the integer part). Results for 100 runs for the FP16 are also shown. *g* is the number of generations.



**Table 6.** Simulation results for Ackley function. Statistics of the 100 runs per bits used in the fractional part for the fixed point arithmetic (5 bits were used for the integer part). Results for 100 runs for the FP16 are also shown. *g* is the number of generations.

**Table 7.** Variables values for the minimum function value for FP16 simulation, and the integer arithmetic simulation. The shown numbers 11, 12, and 11 correspond to the used bits in the fractional part for integer arithmetic, which also correspond to the same mean of FP16 results for each function in Tables 4–6.


#### **5. Discussion**

With the simulation results shown in Tables 4–7 it is confirmed that the heuristic DE can be executed in fixed point arithmetic or half precision FP numbers.

As one can see in Tables 4–6, not all the bits of the fractional part are necessary for a given application. From Table 7, the same results as with FP16 numbers can be obtained with fixed point numbers in the formats 14.11, 7.12, and 5.11 for the scaled Rosenbrock, Rastrigin, and Ackley functions, respectively.

Regarding the precision obtained in the solution using FP16 or integer arithmetic, the machine epsilon is defined as the smallest value *ε* for which 1 + *ε* is still distinguishable from 1 in the given arithmetic. For most modern microprocessors (which use two's complement arithmetic), the machine epsilon value for each data type is shown in Table 8.

**Table 8.** Machine epsilon values for the different floating point numbers, for a general integer number of *n* bits in the fractional part, and also for the integer arithmetic of results shown in Table 7.


The *precision bits* is one bit more than the positive exponent of epsilon in floating point types and equal to the number of bits used in the fractional part in integer arithmetic.

Roughly, one cannot expect a result in an optimization problem beyond the precision of the machine epsilon. Thus, using FP16 numbers will give a precision in the result of at most 9.765625 × 10<sup>−4</sup>. Likewise, using an integer number in *a*.*b* format, the result will have a precision of at most 2<sup>−*b*</sup>. This also means that with FP16 numbers the heuristic, DE in this case, will finish earlier than with single or double precision floating point numbers. In the experiments in this work, DE's stop condition was set equal to 10<sup>−4</sup>. With a smaller stop condition the heuristic is expected to need more generations to finish, but then it is necessary to change to other number types.
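The two precision figures quoted above can be checked with the Python standard library alone. In this sketch, `to_half` rounds a value to IEEE 754 half precision through the `struct` module's `'e'` format; the helper names are assumptions made for the example.

```python
import struct

def to_half(x):
    """Round a Python float to the nearest IEEE 754 half precision value."""
    return struct.unpack('<e', struct.pack('<e', x))[0]

def half_epsilon():
    """Halve eps until adding the next smaller value no longer changes 1.0."""
    eps = 1.0
    while to_half(1.0 + eps / 2) != 1.0:
        eps /= 2
    return eps

def fixed_epsilon(frac_bits):
    """Resolution of a fixed point number with frac_bits fractional bits."""
    return 2.0 ** -frac_bits

fp16_eps = half_epsilon()        # 2**-10 = 9.765625e-4
q11_eps = fixed_epsilon(11)      # 2**-11, finer than FP16 near 1.0
```

Under these definitions, a fixed point format with 11 fractional bits resolves smaller steps than FP16 does around 1.0.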

One possible application of FP16 numbers or integer arithmetic could be to first obtain a low precision result, within the precision given by the number type used (see Table 8). If higher precision is required, then a traditional mathematical algorithm, such as the Newton method, could be used. The starting solution for the Newton method would be the previously obtained low resolution solution.

Of course, if FP16 numbers or integer arithmetic are used, the application should work at the precision given by those number types. Finally, this behavior must be analyzed in advance for a given application.

For all the simulations DE's stop condition was set equal to 0.0001. In 3.28 notation this number is equal to 0x000068db (a 32-bit hexadecimal number), which can be written for convenience with the binary point as 0x0.00068db. The first 13 bits after the binary point are all zeros; thus, the stop condition is equal to zero when 13 bits or fewer are used in the fractional part, as one can confirm in Tables 4 and 6, where the simulations reach the maximum number of iterations and the stop condition has no effect for 13 bits or fewer.
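The effect of the number of fractional bits on the stop condition can be verified directly. The helper `quantize` is a hypothetical name; it truncates the extra bits, as a shift would.

```python
def quantize(value, frac_bits):
    """Fixed point integer that represents value, truncating extra bits."""
    return int(value * (1 << frac_bits))

stop = 0.0001
q28 = quantize(stop, 28)                       # 3.28 format
# largest number of fractional bits for which the stop condition becomes zero
zero_up_to = max(b for b in range(1, 29) if quantize(stop, b) == 0)
```

With 14 fractional bits the quantized stop condition is 1 (one unit in the last place), so the *diff* criterion can only be met when the best and worst individuals map to the same integer.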

To use fixed point arithmetic in DE, it is critical to know in advance the range of values of the function to optimize. Here the extreme values of the search space were used to estimate those quantities. In a practical task, the function to optimize could be evaluated at the extremes, and perhaps at other points on a very coarse grid. The same procedure should be applied to use FP16 numbers.
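A rough range estimate of this kind can be sketched by evaluating the function at the corners of the search box. The helper `range_at_corners` and the example function are assumptions for illustration; the corners alone may miss extreme values in the interior, so the result is a first guess rather than a bound.

```python
from itertools import product

def range_at_corners(f, lo, hi, n):
    """Evaluate f at the 2**n corners of the box [lo, hi]^n; return (min, max)."""
    values = [f(list(corner)) for corner in product((lo, hi), repeat=n)]
    return min(values), max(values)

# Example with a simple linear function of three variables.
low, high = range_at_corners(lambda x: sum(x), -10, 10, 3)
```

For 10 variables this requires 1024 evaluations, which is small compared with a DE run.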

The DE core (in Algorithm 1) uses only one difference and one multiplication; thus, it poses no numerical problem for fixed point arithmetic or FP16 numbers.

In general, a naive implementation of fixed point arithmetic with a full word length of 32 bits is not required. As one can see in Table 4, the same results are obtained using 14–17 bits in the fractional part for the Rosenbrock function. The same applies to the results in Table 5 for the Rastrigin function with 11–24 bits, and in Table 6 for the Ackley function with 13 to 26 bits in the fractional part.

Future work will address the hardware design of DE, which should include a random number generator that can be optimized to use the generated bits directly without FP divisions, as suggested in [10]. The idea of this design could also be used in software within each core of a GPGPU. Another interesting idea is to incorporate a random number generator based on chaos [12], which is easy to implement.

#### **6. Conclusions**

The DE optimization heuristic was analyzed when implemented with fixed point arithmetic and half precision floating point arithmetic. Results were shown in software simulations with three multimodal functions: Rosenbrock, Rastrigin, and Ackley in 10 dimensions. To apply these arithmetic representations, it is first necessary to know how to scale the function values so that they fall inside the range of FP16 numbers. It is suggested to use the extreme values of the search space to get an idea of the range of the function values. Once this point is solved, DE can be used with these arithmetics without difficulty.

It is still possible to optimize the pseudo random number generator of the DE algorithm so that it does not use FP arithmetic. This analysis is required if DE is to be embedded in hardware inside a circuit chip or in massively parallel versions on GPGPUs.

**Funding:** This research received no external funding.

**Acknowledgments:** The author would like to thank the anonymous reviewers for their valuable comments which have helped to improve the quality of this article.

**Conflicts of Interest:** The author declares no conflict of interest.

#### **References**


## *Article* **A Method for Integration of Preferences to a Multi-Objective Evolutionary Algorithm Using Ordinal Multi-Criteria Classification**

**Alejandro Castellanos-Alvarez 1, Laura Cruz-Reyes 1, Eduardo Fernandez 2, Nelson Rangel-Valdez 3, Claudia Gómez-Santillán 1,\*, Hector Fraire <sup>1</sup> and José Alfredo Brambila-Hernández <sup>1</sup>**


**Keywords:** incorporation of preferences; multi-criteria classification; decision-making process; multiobjective evolutionary optimization; outranking relationships

## **1. Introduction**

Many industrial domains are concerned with multi-objective optimization problems (MOPs), which in general have conflicting objectives to handle [1]. To solve a MOP optimally is to find a set of solutions defined as Pareto optimal solutions, which represent the best compromise between the conflicting objectives. A promising alternative is solving MOPs with metaheuristics, like multi-objective evolutionary algorithms (MOEAs), which obtain an approximation of the Pareto optimal set. This approach solves the problem only partially: the decision-maker (DM) still has to choose, from the set of mutually non-dominated solutions obtained, the best compromise solution that satisfies their preferences. For practical reasons, the DM needs to choose one solution to implement.

MOEAs face various problems when dealing with many objectives: exponential growth in the number of non-dominated solutions and high computational cost to maintain population diversity [2–4], among others. In addition to the previous problems, decision-making becomes difficult when the number of objectives increases.

**Citation:** Castellanos-Alvarez, A.; Cruz-Reyes, L.; Fernandez, E.; Rangel-Valdez, N.; Gómez-Santillán, C.; Fraire, H.; Brambila-Hernández, J.A. A Method for Integration of Preferences to a Multi-Objective Evolutionary Algorithm Using Ordinal Multi-Criteria Classification. *Math. Comput. Appl.* **2021**, *26*, 27. https://doi.org/10.3390/mca26020027

Academic Editor: Oliver Schütze

Received: 3 March 2021 Accepted: 26 March 2021 Published: 30 March 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

One way to reduce the DM's cognitive effort is to consider the preferences to guide the MOEA to the region of interest (ROI). Incorporating DM's preferences requires considering non-trivial aspects—defining the DM's preferences, determining the ROI and determining the relevance of a solution [5]. The preferences incorporation methods have used the following representation structures [6,7]—weights, ranking of solutions, ranking of objective functions, reference point, trade-offs between objective functions, desirability thresholds, outranking relations. This paper incorporates preferences using outranking relations.

In many real-world situations, the MOP environment involves imprecise information derived from inaccurate measurements or from the variability in DMs' judgments and beliefs. Not considering these imprecisions can lead to unsatisfactory solutions and, in consequence, to a poor choice between the existing alternatives due to imperfect knowledge of the problem [8]. Imprecise information may be present in different MOP components; for example, in the objective functions, the restrictions, or the decision-maker's preferences. Obtaining the parameters of the preferential model is a task whose difficulty increases with the number of objectives, and it is only feasible when imprecision is allowed [9]. The simplest approach to handling imprecise information is to estimate its mean value and solve the problem as a deterministic one [10]. Interval numbers are a natural, simple, and effective approach to express imperfect knowledge. This paper uses interval analysis to express the parameters of a preferential model.

On the other hand, when MOEAs are applied to problems with many objectives, they face challenges such as [2–4]:


Even though incorporating preferences in MOEAs is a challenging problem, the outranking approach handles it appropriately and aids in reducing the DM's cognitive effort required to choose a final solution [13]. Considering the lack of research devoted to studying the convenience of using the outranking approach in the optimization process, this work proposes a further analysis to observe the performance of a novel strategy of incorporating outranking in a MOEA. Unlike Cruz et al. [6], which requires representative solutions of two classes from the DM, this work proposes to incorporate two classes for internal use to guide the search process and establish greater differentiation between solutions, exerting selective pressure to find the ROI, but with the same cognitive load for the DM.

According to the reviewed literature [2–4,11], and as mentioned before, MOEAs present difficulties when the number of objectives grows. For example, the classical Non-dominated Sorting Genetic Algorithm II (NSGA-II) [14] presents issues with the diversity-controlling operators [12]; its authors extended this algorithm into NSGA-III, replacing the crowding distance operator with the generation of well-spread reference points. In this paper, we propose a new method to integrate the DM's preferences into NSGA-III, which can deal with many objectives and is based on the ordering of non-dominated fronts.

To the best of our knowledge, few previous studies have incorporated imperfect knowledge, and none has used INTERCLASS-nC [15] as a classifier in the non-dominated sorting process or employed more than two inner classes to guide the search process towards the region of interest; this work focuses on these issues. This research seeks to evaluate the performance of the proposed method when incorporating preferences in the presence of imperfect knowledge, using various versions of the proposed algorithm.

The remainder of this paper is organized as follows: Section 2 reviews the literature and gives some definitions of INTERCLASS-nC. Section 3 details the proposed method. Section 4 specifies the benchmark to be solved, which includes seven problem instances. Section 5 shows and discusses the experimental results. Finally, Section 6 presents the conclusions of this paper and future work.

#### **2. Literature Review**

Two main approaches are distinguished in the area of Multi-Criteria Decision-Making (MCDM) [16]:


In the case of outranking relationships, indicators of dominance or preference are defined given some thresholds. The main criticism of this approach is the difficulty of obtaining the model parameters [6]; however, there are methods to solve it [17]. On the other hand, MAUT does not work when there is intransitivity in the preferential model [16]. The intransitivity phenomenon occurs in many real cases when there is a cycle among the alternatives to select. It is important to consider this property to avoid possible incoherent solutions [18].

The incorporation of *interactive* and *a priori* preferences can reduce the search space because the information is used to guide MOEAs towards the ROI, which is the region of the Pareto frontier preferred by the DM. Expressing preferences can, however, be a difficult cognitive process for the DM. According to Cruz et al. [6], the following characteristics are desirable for a preference incorporation method:


In Cruz et al. [6], the multicriteria ordinal classification requires the DM to separate solutions into two categories. In a preference incorporation method with this classifier, the human categorization is the stage with the lowest cognitive demand of the entire process. Assigning solutions to the class "good" or "not good" does not require the DM to worry about the transitivity between them in the same way; the DM only compares the solutions between "good" and "not good".

Using outranking relationships allows handling the characteristics of many DMs facing real-world problems [6]. It is desirable that the preference incorporation methods used meet the characteristics described above, which relate to the interaction with the DM, the compatibility between the preferential model and the DM, the properties of the preferences, and the inference of parameters.

Ordinal multi-criteria classification can help the DM determine the best solution in a discrete set of alternatives, because the classes are ordered from the most preferred alternatives to the least preferred ones [19]. There is a variety of multi-criteria ordinal classification methods, which can be grouped into the following classes [15]:


To our knowledge, the first article to use multi-criteria ordinal classification based on outranking was Oliveira et al. [20], which uses the popular ELECTRE-TRI method for ordinal classification in a three-objective problem, with preferences incorporated *a priori* by directly setting the parameters of the outranking model. This method belongs to the ELECTRE family (*Elimination Et Choix Traduisant la Realite*), which uses an outranking relation to identify whether a solution *x* is at least as good as a solution *y*.

The hybrid algorithm proposed by Cruz et al. [13] uses a multi-criteria ordinal classification based on outranking. During the first phase, a meta-heuristic algorithm obtains a first approximation to the Pareto frontier. In the second phase, the DM assigns the solutions to two ordered classes and obtains the parameters of the outranking model. In the third phase, the THESEUS classification method applies selective pressure towards "satisfactory" solutions. They test the proposal on project portfolio problems with 4, 9, and 16 objectives; its results surpass the popular NSGA-II and Non-Outranked Ant Colony Optimization (NOACO) proposed in [21].

Cruz et al. [6] proposed the Hybrid Evolutionary Algorithm guided by Preferences (HEAP), an extension of their previous work [13] in which, instead of NSGA-II and NOACO, they use MOEA/D and MOEA/D-DE as metaheuristics for the first phase of the hybrid algorithm. To evaluate the proposed algorithm, they used instances of the portfolio optimization problem and the scalable DTLZ test problems, with three and eight objectives. The DTLZ benchmark problems are box-constrained, continuous, n-dimensional multi-objective problems, scalable in fitness dimension (number of objectives). The experimentation analyzes different settings for the activation of classification and the restart of solutions. The use of the DTLZ test suite makes it possible to assess the closeness to the ROI of a DM and to compare the performance with three and eight objectives. The DM's preferences are simulated through an outranking model. In addition to the THESEUS classification method, the popular ELECTRE-TRI is incorporated, and the results of both methods are compared. In most cases, the best results were obtained with ELECTRE-TRI.

Additionally, few state-of-the-art studies consider imperfect knowledge in the DM's preferences and its effect on the objective functions to be optimized. Moreover, none has used the INTERCLASS-nC classifier in the non-dominated sorting process or employed more inner classes to guide the search process towards the ROI. The proposed NSGA-III-P incorporates these characteristics.

#### *2.1. Interval Arithmetic*

In [22], Moore et al. formally proposed interval analysis. An interval number can be viewed as an entity that reflects a quantitative property whose precise value is unknown but whose range is known [15]. In this work, imperfect knowledge is represented with interval numbers. Moore et al. [23] describe an interval number as a range $\mathbf{E} = [\underline{E}, \overline{E}]$, where $\underline{E}$ represents the lower limit and $\overline{E}$ the upper limit of the interval. Items in bold are interval numbers.

Considering two interval numbers $\mathbf{D} = [\underline{D}, \overline{D}]$ and $\mathbf{E} = [\underline{E}, \overline{E}]$, the basic arithmetic operations can be defined for interval numbers as follows:

• addition:

$$\mathbf{D} + \mathbf{E} = [\underline{D} + \underline{E}, \overline{D} + \overline{E}] \tag{1}$$

• subtraction:

$$\mathbf{D} - \mathbf{E} = [\underline{D} - \overline{E}, \overline{D} - \underline{E}] \tag{2}$$

• multiplication:

$$\mathbf{D} \ast \mathbf{E} = [\min\{\underline{D}\,\underline{E}, \underline{D}\,\overline{E}, \overline{D}\,\underline{E}, \overline{D}\,\overline{E}\}, \max\{\underline{D}\,\underline{E}, \underline{D}\,\overline{E}, \overline{D}\,\underline{E}, \overline{D}\,\overline{E}\}] \tag{3}$$

• division:

$$\mathbf{D} / \mathbf{E} = [\underline{D}, \overline{D}] \ast \left[\frac{1}{\overline{E}}, \frac{1}{\underline{E}}\right], \quad 0 \notin \mathbf{E}. \tag{4}$$
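The four interval operations can be sketched as a small Python class. The class name `Interval` and its fields `lo` and `hi` are assumptions made for this example (the paper's implementation is in Java and is not shown); division is defined only when the divisor does not contain zero.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Interval:
    lo: float   # lower limit
    hi: float   # upper limit

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __sub__(self, other):
        return Interval(self.lo - other.hi, self.hi - other.lo)

    def __mul__(self, other):
        products = (self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi)
        return Interval(min(products), max(products))

    def __truediv__(self, other):
        if other.lo <= 0 <= other.hi:
            raise ZeroDivisionError("divisor interval contains zero")
        return self * Interval(1 / other.hi, 1 / other.lo)

D = Interval(1, 2)
E = Interval(3, 5)
difference = D - E       # Interval(lo=-4, hi=-1)
```

Note that subtraction pairs the lower limit of one operand with the upper limit of the other, so **D** − **D** is not the degenerate interval [0, 0] unless **D** has zero width.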

According to Fliedner et al. [24], a *realization* of an interval number is any real number $e \in [\underline{E}, \overline{E}]$. An order relation is defined on interval numbers as follows: let *e* and *d* be two realizations of **E** and **D**, respectively; we say that **E** > **D** if the proposition "*e* is greater than *d*" has greater credibility than "*d* is greater than *e*".

Fernandez et al. [25] propose the possibility function:

$$P(\mathbf{E} \ge \mathbf{D}) = \begin{cases} 1 & \text{if } p\_{ED} > 1, \\ p\_{ED} & \text{if } 0 \le p\_{ED} \le 1, \\ 0 & \text{if } p\_{ED} < 0, \end{cases} \tag{5}$$

where $\mathbf{E} = [\underline{e}, \overline{e}]$ and $\mathbf{D} = [\underline{d}, \overline{d}]$ are interval numbers and $p\_{ED} = \frac{\overline{e} - \underline{d}}{(\overline{e} - \underline{e}) + (\overline{d} - \underline{d})}$. The order relationship between **D** and **E** is given by:

- (a) *P*(**E** ≥ **D**) > 0.5. Therefore, **E** is greater than **D**, (**E** > **D**).
- (b) *P*(**E** ≥ **D**) < 0.5. Therefore, **E** is less than **D**, (**E** < **D**).
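The possibility function can be sketched in a few lines, with intervals given as (lower, upper) pairs. The function name `possibility` is hypothetical, and the treatment of two zero-width intervals (compared as real numbers) is an assumption of this sketch, since the formula is undefined when both widths are zero.

```python
def possibility(e, d):
    """P(E >= D) for intervals e = (e_lo, e_hi) and d = (d_lo, d_hi)."""
    e_lo, e_hi = e
    d_lo, d_hi = d
    width = (e_hi - e_lo) + (d_hi - d_lo)
    if width == 0:                       # assumption: compare as real numbers
        return 1.0 if e_hi >= d_lo else 0.0
    p = (e_hi - d_lo) / width
    return max(0.0, min(1.0, p))         # clamp to [0, 1]

p_equal = possibility((0, 1), (0, 1))    # identical intervals give 0.5
p_shift = possibility((1, 3), (0, 2))    # E shifted upwards gives 0.75
```

With identical intervals the value is exactly 0.5, so neither relation (a) nor (b) holds, while shifting **E** upwards pushes the value above 0.5.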

#### *2.2. INTERCLASS-nC*

Fernandez et al. [15] proposed an ordinal classification method, useful when the DM has a vague idea about the boundaries between adjacent classes but can identify several (even one) representative solutions in each class.

The DM must provide a model of outranking in terms of:


A set of classes *C* = {*C*<sub>1</sub>, ..., *C*<sub>*k*</sub>, ..., *C*<sub>*m*</sub>} (*m* ≥ 2) is defined, ordered by increasing preference. Consider *δ* > 0.5 and *λ* > [0.5, 0.5], where *δ* corresponds to the maximum probability degree for which the strength of the coalition of agreement exceeds *λ*.

*R*<sub>*k*</sub> = {*r*<sub>*kj*</sub>, *j* = 1, ..., *card*(*R*<sub>*k*</sub>)} is a subset of reference solutions that characterize *C*<sub>*k*</sub>, *k* = 1, ..., *m*, and {*r*<sub>0</sub>, *R*<sub>1</sub>, ..., *R*<sub>*m*</sub>, *r*<sub>*m*+1</sub>} is the set of all reference solutions, in which *r*<sub>0</sub> and *r*<sub>*m*+1</sub> are the worst and the ideal reference solutions, respectively. The elements in *R*<sub>*k*</sub>, *k* = 1, ..., *m* − 1 must satisfy the conditions defined in Fernandez et al. [15].

Classification is performed using the top-down and bottom-up methods jointly. Each method proposes a class for the assignment of *x*; if the two proposals do not coincide, these rules provide a possible range of classes for the assignment of *x*.

#### **3. Proposed Method**

The Nondominated Sorting Genetic Algorithm III (NSGA-III) proposed in [12] is a genetic algorithm similar to the original NSGA-II. Both search for the Pareto optimal set by performing a non-dominated sorting. The difference lies in how diversity is maintained in the selection stage: NSGA-II uses crowding distances, whereas NSGA-III uses reference points. NSGA-III discriminates between the non-dominated solutions using a utility function, which calculates a solution's relevance according to its closeness to a reference point.

To incorporate a DM's preferences, we propose integrating the ordinal classification method INTERCLASS-nC into NSGA-III; we call this variant NSGA-III-P. The original work [6] only defines the classes "satisfactory" (Sat) and "unsatisfactory" (Dis); the DM gives a reference set to generate these classes (with one or more representative solutions for each class). This classification complements the non-dominated sorting to increase the capacity to discriminate between solutions; this strategy induces a greater selective pressure, focusing the search toward the ROI. In this work, two classes are added internally to give more precision in the comparison of the solutions:


The steps to generate *P*<sub>*t*+1</sub> in NSGA-III-P, which integrates the INTERCLASS-nC ordinal classification method, are shown in Algorithm 1. Let *Q*<sub>*t*</sub> be the offspring population of the current generation, with the same number of individuals *N* as *P*<sub>*t*</sub>. The first step is to combine offspring and parents to obtain *R*<sub>*t*</sub> = *P*<sub>*t*</sub> ∪ *Q*<sub>*t*</sub> (of size 2*N*), from which the *N* individuals that will become *P*<sub>*t*+1</sub> are selected. To do this, *R*<sub>*t*</sub> is divided into multiple non-dominated fronts (*F*<sub>1</sub>, *F*<sub>2</sub>, ..., *F*<sub>*n*</sub>) by *non-dominated sorting*.
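The division of the combined population into non-dominated fronts can be sketched as follows. This is a simple quadratic-time illustration that assumes all objectives are minimized; it is not the fast non-dominated sorting procedure of NSGA-II, and the function names are hypothetical.

```python
def dominates(a, b):
    """True if objective vector a dominates b (every objective minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def non_dominated_sort(points):
    """Return a list of fronts, each a sorted list of indices into points."""
    remaining = set(range(len(points)))
    fronts = []
    while remaining:
        front = sorted(i for i in remaining
                       if not any(dominates(points[j], points[i])
                                  for j in remaining if j != i))
        fronts.append(front)
        remaining -= set(front)
    return fronts

objectives = [(1, 4), (2, 2), (4, 1), (3, 3), (5, 5)]
fronts = non_dominated_sort(objectives)
```

In this example the first three points are mutually non-dominated and form the first front, while (3, 3) and (5, 5) fall into the second and third fronts.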

The proposed method for integrating preferences works with the set of previously created non-dominated fronts: it classifies all the solutions in *F*<sub>1</sub> and groups them by class, creating the fronts *F*′<sub>1</sub>, *F*′<sub>2</sub>, *F*′<sub>3</sub>, *F*′<sub>4</sub>, which correspond to the classes *HSat*, *Sat*, *Dis*, and *HDis*. The created fronts are then joined with the remaining ones in such a way that *F*′ = {*F*′<sub>1</sub>, *F*′<sub>2</sub>, *F*′<sub>3</sub>, *F*′<sub>4</sub>} ∪ ⋃<sup>*n*</sup><sub>*j*=2</sub> *F*<sub>*j*</sub>. This process is illustrated in Figure 1 and corresponds to steps 7–18 in Algorithm 1.

**Figure 1.** The proposed methodology for classifying the *F*1, grouping, and fronts reordering.
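The classification and reordering of the first front can be sketched as below. The function `reorder_fronts` is hypothetical, and `classify` is an assumed callable that stands in for INTERCLASS-nC and returns one of the labels "hsat", "sat", "dis", or "hdis" used in Algorithm 1.

```python
CLASS_ORDER = ("hsat", "sat", "dis", "hdis")

def reorder_fronts(fronts, classify):
    """Split the first front by class and keep the remaining fronts in order."""
    grouped = {label: [] for label in CLASS_ORDER}
    for solution in fronts[0]:
        grouped[classify(solution)].append(solution)
    # four class fronts (possibly empty) followed by F_2, ..., F_n
    return [grouped[label] for label in CLASS_ORDER] + list(fronts[1:])

# Toy example: solutions are named by letters and labels come from a lookup table.
labels = {"a": "sat", "b": "hsat", "c": "hdis"}
reordered = reorder_fronts([["a", "b", "c"], ["d"]], labels.get)
```

Because the class fronts come first in the list, a selection that fills the population front by front prefers highly satisfactory solutions over other non-dominated ones.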

Starting from *F*′<sub>1</sub>, the new population is built front by front until its size reaches *N*. The last front included is called the *l*-th front. Therefore, the fronts from *l* + 1 onwards are rejected; in most situations, the *l*-th front is only partially accepted. In such a case, only the solutions that maximize the diversity of the *l*-th front are selected (steps 21–26).

#### **Algorithm 1** Generation *Pt* of NSGA-III-P

**Input:** *H* structured reference points *Z<sup>s</sup>* or supplied aspiration points *Z<sup>a</sup>*, parent population *P<sub>t</sub>*, *Cx* iteration where the algorithm applies the classification, *Ry* solution replacement rate

**Output:** *P*<sub>*t*+1</sub>

1. *S<sub>t</sub>* ← ∅, *i* ← 1
2. *Q<sub>t</sub>* ← *Recombination* + *Mutation*(*P<sub>t</sub>*)
3. *R<sub>t</sub>* ← *P<sub>t</sub>* ∪ *Q<sub>t</sub>*
4. (*F*<sub>1</sub>, *F*<sub>2</sub>, ..., *F<sub>n</sub>*) ← *Non-dominated-sort*(*R<sub>t</sub>*)
5. // If the remainder of the current *iteration* divided by *Cx* equals 0, the classification applies
6. **if** (*iteration mod Cx*) == 0 **then**
7. (*F*<sup>1</sup>, *F*<sup>2</sup>, *F*<sup>3</sup>, *F*<sup>4</sup>) ← ∅
8. **for** *s* ∈ *F*<sub>1</sub> **do** // Classify each member of *F*<sub>1</sub> and group by class
9. *c* ← *classify*(*s*)
10. **if** *c* == "*hsat*" **then**
11. *F*<sup>1</sup> ← *F*<sup>1</sup> ∪ *s*
12. **if** *c* == "*sat*" **then**
13. *F*<sup>2</sup> ← *F*<sup>2</sup> ∪ *s*
14. **if** *c* == "*dis*" **then**
15. *F*<sup>3</sup> ← *F*<sup>3</sup> ∪ *s*
16. **if** *c* == "*hdis*" **then**
17. *F*<sup>4</sup> ← *F*<sup>4</sup> ∪ *s*
18. *F* ← {*F*<sup>1</sup>, *F*<sup>2</sup>, *F*<sup>3</sup>, *F*<sup>4</sup>} ∪ ⋃<sub>*j*=2</sub><sup>*n*</sup> *F<sub>j</sub>* // Fronts reordering
19. **else**
20. *F* = (*F*<sub>1</sub>, *F*<sub>2</sub>, ..., *F<sub>n</sub>*)
21. **while** |*S<sub>t</sub>*| ≤ *N* **do** // Last front to be included *F<sub>l</sub>* ← *F<sub>i</sub>*
22. *S<sub>t</sub>* ← *S<sub>t</sub>* ∪ *F<sub>i</sub>*
23. *i* ← *i* + 1
24. **if** |*S<sub>t</sub>*| == *N* **then**
25. **if** (*iteration mod Cx*) == 0 **then**
26. *replacement*(*S<sub>t</sub>*, *Ry*) // Replace the last *Ry* random individuals
27. **Return:** *S<sub>t</sub>*
28. **else**
29. *P*<sub>*t*+1</sub> ← ⋃<sub>*j*=1</sub><sup>*l*−1</sup> *F<sub>j</sub>*
30. Points to be chosen from *F<sub>l</sub>*: *K* ← *N* − |*P*<sub>*t*+1</sub>|
31. Normalize objectives and create reference set *Z<sup>r</sup>* ← *normalize*(*f<sup>n</sup>*, *S<sub>t</sub>*, *Z<sup>r</sup>*, *Z<sup>s</sup>*, *Z<sup>a</sup>*)
32. Associate each member *s* ∈ *S<sub>t</sub>* with a reference point:
33. [*π*(*s*), *d*(*s*)] = *associate*(*S<sub>t</sub>*, *Z<sup>r</sup>*) % *π*(*s*)
34. Compute niche count of reference point *j* ∈ *Z<sup>r</sup>*: *p<sub>j</sub>* = ∑<sub>*s* ∈ *S<sub>t</sub>*/*F<sub>l</sub>*</sub> ((*π*(*s*) = *j*) ? 1 : 0)
35. Choose *K* members one at a time from *F<sub>l</sub>* to construct *P*<sub>*t*+1</sub>:
36. *niching*(*K*, *p<sub>j</sub>*, *π*, *d*, *Z<sup>r</sup>*, *F<sub>k</sub>*, *P*<sub>*t*+1</sub>)
37. **if** (*iteration mod Cx*) == 0 **then**
38. *replacement*(*S<sub>t</sub>*, *Ry*) // Replace the last *Ry* random individuals
39. **Return:** *P*<sub>*t*+1</sub>
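The front-reordering step of the algorithm (steps 6 to 18) can be illustrated with a minimal Python sketch. It is not the authors' Java implementation: the function name `reorder_fronts` and the `classify` callable are hypothetical, empty class groups are skipped for brevity, and a non-positive `cx` is assumed to mean that classification is disabled.

```python
from typing import Callable, List, Sequence, TypeVar

S = TypeVar("S")

# Class labels as they appear in the pseudocode, most preferred first.
CLASS_ORDER = ["hsat", "sat", "dis", "hdis"]


def reorder_fronts(
    fronts: Sequence[Sequence[S]],
    classify: Callable[[S], str],
    iteration: int,
    cx: int,
) -> List[List[S]]:
    """Split the first front by preference class when classification is active."""
    result = [list(front) for front in fronts]
    # Assumption: cx <= 0 disables classification; otherwise it is applied
    # only when the iteration counter is a multiple of cx.
    if cx <= 0 or iteration % cx != 0 or not result:
        return result
    groups = {label: [] for label in CLASS_ORDER}
    for solution in result[0]:
        groups[classify(solution)].append(solution)
    # The class groups replace F1; the remaining fronts F2..Fn keep their order.
    leading = [groups[label] for label in CLASS_ORDER if groups[label]]
    return leading + result[1:]
```

With such a reordering, the subsequent selection loop fills the next population with highly satisfactory members of the first front before any other solution, which is where the additional selective pressure comes from.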

The proposed algorithm has two approaches for controlling the selective pressure generated by the incorporation of preference:


Preference incorporation is, in a certain way, an intensification approach. Adding new random solutions reduces this intensification and introduces diversification, thereby balancing the search. We analyzed different activation configurations in the experimental section to observe their impact on the algorithm's performance.

#### **4. Experimental Settings**

The proposed NSGA-III-P (non-dominated sorting genetic algorithm III with preferences) algorithm was tested on the DTLZ1–DTLZ7 problems. The algorithm's performance is observed to evaluate the effect of the intensification-diversification mechanism.

All the algorithms used in this experimentation were executed 50 times for each instance on an Intel Core i7-10510U CPU @ 1.80GHz × 8 with 16 GB of RAM. We developed the algorithms in Java using the OpenJDK 11.0.10 64-Bit.

The configuration of the DTLZ problem instances is summarized in Table 1. To solve them, the algorithm uses a population size of *n* = 92 individuals, the SBX crossover operator, and the polynomial mutation operator. Table 2 shows the configurations of these operators.


**Table 1.** Parameters used for the three-objective DTLZ problem instances.

**Table 2.** Crossover and mutation parameters used for NSGA-III-P.


We analyzed versions of the NSGA-III-P algorithm named *CxRy*, where *x* is the percentage of iterations in which the classification is activated and *y* is the percentage of solutions replaced. Classification increases intensification, so classifying less often reduces it, while restarting solutions increases diversification. Ordered from higher to lower intensification, the variants are C100R0, C1R0, C1R2, C10R0, and C0R0 (see Table 3).

**Table 3.** Experimental configurations carried out.


#### *4.1. Creation of the ROI*

Let *T*′ be a sample of non-dominated solutions taken from a large set *T* of solutions (≥100 thousand) generated analytically on the Pareto frontier of a standard problem. The solutions that make up the ROI are identified with the following sets and measures in *T*′.

• Outranking weakness of a solution *x*. A low value of this measure provides positive arguments for selecting *x*.

$$D\_0(\mathbf{x}) = \{ y \mid \sigma(y, \mathbf{x}) > \beta, \ \sigma(\mathbf{x}, y) < 0.5, \ y \in T' \setminus \{ \mathbf{x} \} \}\tag{6}$$

• Net score measure used to identify DM preferred solutions.

$$F\_n(\mathbf{x}) = \sum\_{y \in T'} \left[ \sigma(\mathbf{x}, y) - \sigma(y, \mathbf{x}) \right] \tag{7}$$

where *Fn*(*x*) > *Fn*(*y*) indicates a certain preference of *x* over *y*.

• Best compromise solutions, the set most preferred by the DM.

$$\mathbf{x}^\* = \{ \mathbf{x} \mid D(\mathbf{x}) = 0, \ F\_n(\mathbf{x}) = \max\_{y \in T'} F\_n(y), \ \mathbf{x} \in T' \}\tag{8}$$

• Region of interest made up of the best compromise solutions *x*∗

$$ROI(T') = \mathbf{x}^\* \cup \{ \max\_{\mathbf{x} \in T'} (F\_n(\mathbf{x}) \ge 0, K) \},\tag{9}$$

where *K* are the largest *Fn* values of *x*.
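The measures in Equations (6) to (9) can be sketched in a few lines of Python. This is an illustrative reading, not the authors' code: solutions are referred to by index, `sigma[a][b]` stands for *σ*(*a*, *b*), the function names are hypothetical, *D*(*x*) = 0 is taken to mean that the set in Equation (6) is empty, and Equation (9) is interpreted as adding the (at most) *K* solutions with the largest non-negative net scores.

```python
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]


def outranking_weakness(sigma: Matrix, x: int, beta: float) -> List[int]:
    """Indices y != x with sigma(y, x) > beta and sigma(x, y) < 0.5 (Equation (6))."""
    return [
        y for y in range(len(sigma))
        if y != x and sigma[y][x] > beta and sigma[x][y] < 0.5
    ]


def net_score(sigma: Matrix, x: int) -> float:
    """Sum over y of sigma(x, y) - sigma(y, x) (Equation (7))."""
    return sum(sigma[x][y] - sigma[y][x] for y in range(len(sigma)))


def best_compromise(sigma: Matrix, beta: float) -> List[int]:
    """Solutions with empty weakness set and maximal net score (Equation (8))."""
    scores = [net_score(sigma, x) for x in range(len(sigma))]
    best = max(scores)
    return [
        x for x in range(len(sigma))
        if not outranking_weakness(sigma, x, beta) and scores[x] == best
    ]


def region_of_interest(sigma: Matrix, beta: float, k: int) -> List[int]:
    """Assumed reading of Equation (9): x* plus the top-k non-negative net scores."""
    scores = [net_score(sigma, x) for x in range(len(sigma))]
    candidates = sorted(
        (x for x in range(len(sigma)) if scores[x] >= 0),
        key=lambda x: scores[x],
        reverse=True,
    )[:k]
    return sorted(set(best_compromise(sigma, beta)) | set(candidates))
```

For a sample of the size mentioned above, the full *σ* matrix would be large, so a practical implementation would likely compute these measures incrementally rather than from a dense matrix.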

#### *4.2. Indicators of Performance*

Each NSGA-III-P configuration is executed 50 times, and the following indicators are applied to the result of each complete run:


#### *4.3. Description of the Instance*

The DTLZ problem instance used contains the characterization of the DM's preferences (elements 3–6). It has the following elements:


#### **5. Results**

Table 4 shows the performance reached by each algorithm when solving each DTLZ problem. For space reasons, these results are only presented for two performance measures. The first two columns show the result for the original NSGA-III algorithm. The next columns present eight variants of NSGA-III with preferences. The first six columns correspond to variants without activating the solution restarting strategy. The last two columns correspond to variants that use restarting to reduce the effect of incorporating preferences.



%C-CHSat: conservation percentage of highly satisfactory solutions; MinEuc: min Euclidean distance.

Table 5 shows the first summary of a statistical comparison of five variants of NSGA-III using the configurations reported in Table 4. We applied the Friedman Test, followed by the Hollman Post-hoc Test. The best and the worst algorithm are identified with the algorithms' ranking considering two measures: the percentage of conservation of highly satisfactory solutions (CHSat) and the minimum Euclidean distance (MinEuc).

**Table 5.** Best and worst algorithms resulting from their statistical comparison evaluated with two measures.


In this paper, the main measure to evaluate algorithms is related to the counting of highly satisfactory solutions because preference elicitation is aligned with this measure. However, since other DMs could be interested in the closeness to the ROI, the Euclidean distance is an alternative because it is frequently used in decision-making. For a DM interested in highly satisfactory solutions, the best variant for all DTLZ problems is C100R0. In contrast, if the DM is interested in solutions closer to the ROI, we cannot find a unique best variant; the best variant depends on the problem. The C100R0 variant offers solutions close to the ROI in four of the seven problems evaluated (DTLZ2–DTLZ4, DTLZ7); for the DTLZ5 and DTLZ6 problems, C1R2 has a better performance. The original NSGA-III algorithm offers solutions closer to the ROI for the DTLZ1 problem. It is noteworthy that C100R0 is never the worst option; the other variants are the worst at least once.

Table 6 shows the algorithms' average performance for all DTLZ problems. After applying statistical tests to compare algorithms (Friedman aligned and Hollman post-hoc), we identified pairwise comparisons with significant differences. Using these pairs, a set of statistically no better algorithms was obtained for each algorithm. Finally, the algorithms are ranked hierarchically using the well-known Borda count, which accumulates their positions over all instances for a given measure. The superscript corresponds to the Borda ranking.
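As a reminder of how a Borda count aggregates positions, the following Python sketch ranks algorithms from a list of per-instance orderings. It shows the generic Borda rule only; how the authors derive the per-instance orderings from the sets of statistically no better algorithms is not reproduced here, and the function name `borda_ranking` and the tie-breaking by name are assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def borda_ranking(rankings: Sequence[Sequence[str]]) -> List[str]:
    """Aggregate per-instance rankings (best first) with a Borda count."""
    points: Dict[str, int] = defaultdict(int)
    for ranking in rankings:
        m = len(ranking)
        for position, name in enumerate(ranking):
            # The best of m candidates earns m - 1 points, the worst earns 0.
            points[name] += m - 1 - position
    # Higher totals rank first; ties are broken alphabetically (an assumption).
    return sorted(points, key=lambda name: (-points[name], name))
```

For instance, given three made-up orderings of three variants (not taken from the reported results), `borda_ranking([["C100R0", "C1R2", "C0R0"], ["C1R2", "C100R0", "C0R0"], ["C100R0", "C0R0", "C1R2"]])` accumulates 5, 3, and 1 points for C100R0, C1R2, and C0R0, respectively.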

There are statistically significant differences in 3 of the 5 metrics evaluated (CHSat, Mean Euclidean, Max Euclidean). For the percentage of conservation of solutions for which the DM is highly satisfied (CHSat), the best algorithm is C100R0, whereas the rest of the algorithms behave similarly according to Borda's ranking. The indicator of the percentage of solutions for which the DM is satisfied (CSat) does not differ significantly. That is expected because CHSat already captures the best solutions.

**Table 6.** The average and standard deviation of the algorithms over 50 independent runs in terms of percentage of conservation and Euclidean distance for the DTLZ family of problems.


Statistical test: Friedman of aligned ranks with a significance level of 0.05. The superscript indicates the position in which it was ranked by the Borda method. The subscript indicates the standard deviation of the results. The upper arrow indicates the top-ranked algorithm.

The C100R0 configuration is the one with the greatest contribution of solutions closer to the ROI according to the minimum Euclidean distance indicator. This indicator does not have significant differences. For the average, significant differences were found, and the algorithm C100R0 is the one that provides the closest solutions. The algorithms that provide the least distant solutions are C100R0 and C0R0 based on the maximum of the Euclidean distance.

This global analysis gives the best rank to C100R0, meaning that it is a good alternative for all analyzed problems. However, C1R2 produces solutions closer to the ROI in some problems. These are extreme variants concerning intensification and diversification, which suggests that the balance between them depends on the problem; more extensive experimentation is needed to confirm this.

To illustrate the superiority of the proposed NSGA-III-P over NSGA-III, Figures 2 and 3 show the non-dominated solutions obtained when solving the DTLZ3 problem. Figure 2 is for NSGA-III (C0R0) and Figure 3 is for NSGA-III-P with preferences applied at all times and without restarts (C100R0). The variant C100R0 performs a better exploration of the region of interest with highly satisfactory solutions. In contrast, C0R0 scans the entire solution space, but most of its solutions are highly unsatisfactory. The solutions belonging to the ROI are illustrated in black, highly satisfactory solutions (HSat) in green, satisfactory solutions (Sat) in blue, unsatisfactory solutions (Dis) in orange, and highly unsatisfactory solutions (HDis) in red.

**Figure 2.** Non-dominated NSGA-III(C0R0) solutions of the DTLZ3 problem.

**Figure 3.** Non-dominated NSGA-III-P(C100R0) solutions of the DTLZ3 problem.

#### **6. Conclusions**

This article presents a novel method for incorporating DM's preferences into the NSGA-III algorithm, named NSGA-III-P. INTERCLASS-NC is a multi-criteria and outranking ordinal classifier that allows incorporating preference, giving the algorithm the capacity to improve the discrimination of solutions and intensify the search toward the region of interest. Excessive intensification can diminish the algorithm's effectiveness. To regulate this selective pressure, we add two complementary strategies to the search in NSGA-III-P: control the activations of the classification and control the restarts of solutions.

Experiments with different configurations of NSGA-III-P were designed to study different levels of intensification and diversification. NSGA-III-P was used to solve the DTLZ test suite, including the preferences of a DM with imperfect knowledge.

Based on computational experimentation, the best alternative for the DTLZ problems is C100R0 (always classify without restarts) when the DM is looking for highly satisfactory solutions. When the DM prefers solutions closer to the ROI, the variants C1R2 (classify and sometimes restart) and C100R0 have the best performance with two and four problems, respectively. In general, the proposed method NSGA-III-P outperforms NSGA-III because it obtains better approximations to the ROI according to the principal performance measures; only in one case, the DTLZ1 problem using the Max Euclidean distance, is NSGA-III the best option.

These preliminary results open a research line to determine the extent to which the selective pressure induced by preferences improves the algorithm performance concerning the closeness to the ROI and the factors that affect it.

As future work, we will evaluate the proposal with a greater number of objectives for the DTLZ problems. Also, the proposal will be integrated into at least one other algorithm representative of the state of the art. We aim to develop a method that dynamically adjusts the diversification and intensification levels required for each problem.

**Author Contributions:** Conceptualization, A.C.-A., L.C.-R. and E.F.; methodology, L.C.-R., N.R.-V. and A.C.-A.; software, A.C.-A. and N.R.-V.; validation, L.C.-R., N.R.-V., H.F., J.A.B.-H., C.G.-S. and A.C.-A.; formal analysis, L.C.-R.; investigation, L.C.-R., E.F. and A.C.-A.; resources, E.F., L.C.-R. and A.C.-A.; data curation, N.R.-V. and A.C.-A.; writing—original draft preparation, A.C.-A. and L.C.-R., J.A.B.-H.; writing—review and editing, C.G.-S., L.C.-R., H.F., J.A.B.-H., N.R.-V., J.A.B.-H. and A.C.-A.; visualization, L.C.-R.; supervision, L.C.-R.; project administration, L.C.-R.; funding acquisition, L.C.-R., A.C.-A. and H.F. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Data Availability Statement:** The instances and other files used here are available at https://www.dropbox.com/sh/5wb8api8zdyjs8y/AAD11EQbI4P0lQgijvgfFC2qa?dl=0 (accessed on 29 March 2021).

**Acknowledgments:** The authors thank CONACYT for supporting the projects from (a) the Cátedras CONACYT Program with Number 3058; (b) CONACYT Project with Number A1-S-11012 from Convocatoria de Investigación Científica Básica 2017–2018 and CONACYT Project with Number 312397 from Programa de Apoyo para Actividades Científicas, Tecnológicas y de Innovación (PAACTI), for participation in the Convocatoria 2020-1 Apoyo para Proyectos de Investigación Científica, Desarrollo Tecnológico e Innovación en Salud ante la Contingencia por COVID-19. (c) Alejandro Castellanos-Alvarez would like to thank CONACYT for the support number 1006467.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Abbreviations**

The following abbreviations are used in this manuscript:


#### **References**


## *Article* **Effect of the Profile of the Decision Maker in the Search for Solutions in the Decision-Making Process**

**Mercedes Perez-Villafuerte 1,\*, Laura Cruz-Reyes 2, Nelson Rangel-Valdez 3, Claudia Gomez-Santillan <sup>2</sup> and Héctor Fraire-Huacuja <sup>2</sup>**

	- Instituto Tecnológico de Ciudad Madero, Cd. Madero 89440, Mexico; nelson.rangel@itcm.edu.mx

**Abstract:** Many real-world optimization problems involving several conflicting objective functions frequently appear in current scenarios and it is expected they will remain present in the future. However, approaches combining multi-objective optimization with the incorporation of the decision maker's (DM's) preferences through multi-criteria ordinal classification are still scarce. In addition, preferences are rarely associated with a DM's characteristics; the preference selection is arbitrary. This paper proposes a new hybrid multi-objective optimization algorithm called P-HMCSGA (preference hybrid multi-criteria sorting genetic algorithm) that allows the DM's preferences to be incorporated in the early phases of the optimization process and updated during the search process. P-HMCSGA incorporates preferences using a multi-criteria ordinal classification to distinguish solutions as good and bad; its parameters are determined with a preference disaggregation method. The main feature of P-HMCSGA is the new method proposed to associate preferences with the characterization profile of a DM and its integration with ordinal classification. This increases the selective pressure towards the desired region of interest, in closer agreement with the DM's preferences specified in realistic profiles. The method is illustrated by solving real-size multi-objective PPPs (project portfolio problems). The experimentation aims to answer three questions: (i) To what extent does allowing the DM to express their preferences through a characterization profile impact the quality of the solution obtained in the optimization? (ii) How sensitive is the proposal to different profiles? (iii) How much does the level of robustness of a profile impact the quality of final solutions (this question is related to the knowledge level that a DM has about his/her preferences)? In conclusion, the proposal fulfills several desirable characteristics of a preference incorporation method concerning these questions.

**Keywords:** decision maker profile; profile assessment; region of interest approximation; optimization using preferences; hybrid evolutionary approach

### **1. Introduction**

A variety of real-world problems, known as multi-objective optimization problems (MOPs), involve optimizing many objective functions simultaneously [1]. Multi-objective evolutionary algorithms (MOEAs) have been widely used for solving MOPs because of their effectiveness in solving problems in many fields. Nowadays, MOPs solved with metaheuristics like evolutionary algorithms are an important active research field [1,2].

Although the aim in Evolutionary Multi-objective Optimization (EMO) is to find a set of solutions that spread evenly around the Pareto front of a given MOP, it is equally important to identify the solution to be implemented, the one that best satisfies the preferences of the decision-maker (DM) [3]. Selecting the most preferred Pareto solution requires evaluating many solutions simultaneously, demanding a high cognitive effort, especially in problems with many objectives.

**Citation:** Perez-Villafuerte, M.; Cruz-Reyes, L.; Rangel-Valdez, N.; Gomez-Santillan, C.; Fraire-Huacuja, H. Effect of the Profile of the Decision Maker in the Search for Solutions in the Decision-Making Process. *Math. Comput. Appl.* **2021**, *26*, 28. https://doi.org/10.3390/mca26020028

Academic Editor: Oliver Schütze

Received: 1 March 2021 Accepted: 25 March 2021 Published: 31 March 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

One alternative to reduce the DM's cognitive effort is to incorporate preferences information of the DM into a multi-objective metaheuristic to identify progressively the region of interest (RoI), defined as the set of non-dominated solutions that the DM prefers over the other solutions [4,5]. There is a growing interest in the solution of MOPs with preferences.

The promising variants in the decision-making process incorporate preferences using the a priori and interactive approaches, which have the advantage of delimiting the search space for an optimal solution, avoiding unnecessary exploration of the entire search space. Preference integration narrows the search space in optimization problems so that the selective pressure directs evolutionary algorithms close to a region of interest [6]. However, the specialized literature starts from arbitrary reference sets, which are examples of random solutions introduced as preference information into the search process of a metaheuristic. In this work, it is proposed that these reference sets be generated from profiles that characterize the DM's preferences more simply and realistically.

So far, there is no general definition that associates the mechanisms of preference incorporation with the region of interest. Each author captures preferences in different ways, for example, using fuzzy numbers [7], reference points [4,8], weight-based [9–11], solution ranking-based [12] and outranking-based models [13]. In addition, each captures preferences at different times in the search process, for example, a priori [14,15], a posteriori [16] or interactively [8]. These differences make a fair comparison among them difficult. A detailed review of the types of approaches for preference incorporation can be found in [17–19].

In our opinion, a multi-objective optimization metaheuristic proposed for solving MOPs with preferences should satisfy these features: (1) to allow the DM to introduce a priori preference information with minimum cognitive effort; (2) interactivity, to allow the DM to specify new preference information to adjust his/her preferences. This paper proposes P-HMCSGA (preference hybrid multi-criteria sorting genetic algorithm), a new MOEA that satisfies these requirements; preferences are specified in a preferential profile. The proposed method holds for both multi-objective and many-objective problems.

The experimentation was designed to respond to some questions related to the impact of the proposed algorithm in the solution of a real-world problem with nine and sixteen objectives. The results were satisfactory, particularly in terms of solution quality, sensitivity to a profile, and robustness.

This paper is organized as follows: Section 2 formalizes the theoretical background of the algorithm proposed. Section 3 contains the description of P-HMCSGA and its phases. Section 4 presents the experimental results that demonstrate the performance of our approach. Finally, Section 5 presents the conclusions and the possible areas of opportunity in the future.

#### **2. Theoretical Background**

#### *2.1. Public Portfolio Problem (PPP)*

A project is a unique, unrepeatable and temporary process that seeks to achieve a specific set of objectives. A set of projects that can be done in the same period of time is called a portfolio [19]. However, organizations generally do not have sufficient resources to support all proposed projects. In such circumstances, the difficulty is choosing the set of projects that offer the greatest benefit.

The public project portfolio (PPP) problem is defined below [20]:

Consider a set of *N* projects, where the *i*-th project is represented by a *p*-dimensional vector *f*(*i*) = 〈*f*<sub>1</sub>(*i*), *f*<sub>2</sub>(*i*), ... , *f<sub>p</sub>*(*i*)〉, where each *f<sub>j</sub>*(*i*) indicates the contribution of project *i* to the *j*-th objective. Each project has an associated cost expressed by *c<sub>i</sub>*. Each objective indicates the number of people benefited who belong to a social category, who will receive a level of benefit from the *i*-th project.

A portfolio *x* is a subset of projects generally modeled as a binary vector *x* = 〈*x*1, *x*2, ... , *xN*〉, where *N* indicates the number of projects. In this vector, *xi* is a binary variable where *xi* = 1 if the *i*-th project is supported and *xi* = 0 otherwise. 

There is a total budget that the organization is willing to invest, which is denoted as *B*. Portfolios are subject to the following budget restriction:

$$\left(\sum\_{i=1}^{N} x\_{i} c\_{i}\right) \le B \tag{1}$$

The *i*-th project corresponds to an area (health, education, etc.) indicated by *a<sub>i</sub>*. Each area has a budget limit defined by the DM. For each area *k*, a lower and upper budget limit, *L<sub>k</sub>* and *U<sub>k</sub>*, respectively, is considered. Based on this, the constraint of each area *k* is

$$L\_k \le \sum\_{i=1}^N x\_i g\_i(k) c\_i \le U\_k \tag{2}$$

where *gi*(*k*) is defined as

$$g\_i(k) = \begin{cases} 1 & \text{if } \qquad a\_i = k, \\ 0 & \text{otherwise} \end{cases} \tag{3}$$

Each *i*-th project corresponds to a geographic region indicated by *r<sub>i</sub>*. For each region *m*, a lower and upper budget limit, *L<sub>m</sub>* and *U<sub>m</sub>*, respectively, is also considered. The restriction by region is defined as follows

$$L\_m \le \sum\_{i=1}^N x\_i h\_i(m) c\_i \le U\_m \tag{4}$$

where *hi*(*m*) is defined as

$$h\_i(m) = \begin{cases} 1 & \text{if } \qquad r\_i = m, \\ 0 & \text{otherwise} \end{cases} \tag{5}$$

The quality of the portfolio *x* is determined by the union of the benefits of each one of the projects that compose it. This can be expressed as

$$z(\mathbf{x}) = \langle z\_1(\mathbf{x}), z\_2(\mathbf{x}), \dots, z\_p(\mathbf{x}) \rangle \tag{6}$$

where *zj*(*x*) is defined as

$$z\_j(\mathbf{x}) = \sum\_{i=1}^{N} x\_i f\_j(i) \tag{7}$$

If we denote by *RF* the region of feasible portfolios, the project portfolio problem is to identify one or more portfolios that solve

$$\max\_{\mathbf{x}\in R\_F}\{z(\mathbf{x})\}.\tag{8}$$

To select a portfolio, many conflicting attributes are considered. Due to the nature of the problem, it has been approached with multi-criteria algorithms that generate a set of solutions presumably on the Pareto frontier, which would be the set of optimal non-dominated portfolios in PPP. The DM should choose only one portfolio from the set of good solutions; such a decision depends on the DM's preferences.
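The PPP model of Equations (1) to (7) can be made concrete with a small Python sketch that checks the budget constraints and evaluates the objective vector of a binary portfolio. The function names, the dictionary layout of the area and region limits, and the sample data in the usage note are illustrative assumptions, not part of the original formulation.

```python
from typing import Dict, Hashable, List, Sequence, Tuple

Limits = Dict[Hashable, Tuple[float, float]]


def group_spending(x: Sequence[int], costs: Sequence[float],
                   labels: Sequence[Hashable], group: Hashable) -> float:
    """Budget spent on supported projects whose label equals `group`."""
    return sum(xi * ci for xi, ci, li in zip(x, costs, labels) if li == group)


def is_feasible(x: Sequence[int], costs: Sequence[float], budget: float,
                areas: Sequence[Hashable], area_limits: Limits,
                regions: Sequence[Hashable], region_limits: Limits) -> bool:
    """Check the total, per-area, and per-region budget constraints."""
    if sum(xi * ci for xi, ci in zip(x, costs)) > budget:  # Equation (1)
        return False
    for k, (low, high) in area_limits.items():  # Equations (2) and (3)
        if not low <= group_spending(x, costs, areas, k) <= high:
            return False
    for m, (low, high) in region_limits.items():  # Equations (4) and (5)
        if not low <= group_spending(x, costs, regions, m) <= high:
            return False
    return True


def objectives(x: Sequence[int], benefits: Sequence[Sequence[float]]) -> List[float]:
    """z_j(x) = sum_i x_i * f_j(i), with benefits[i][j] = f_j(i) (Equation (7))."""
    p = len(benefits[0])
    return [sum(xi * benefits[i][j] for i, xi in enumerate(x)) for j in range(p)]
```

As a usage example with made-up data, a portfolio `x = [1, 0, 1]` with costs `[10, 20, 30]` and a budget of 50 spends 40 in total, so it satisfies Equation (1); whether it is feasible overall then depends on the area and region limits supplied.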

#### *2.2. Multi-Objective Optimization*

In Multi-objective Optimization Problems (MOP), when the objectives are in conflict with each other, the compromise solutions are usually sought rather than a single solution.

A MOP can be defined as finding a vector of decision variables *x* = [*x*<sub>1</sub>, *x*<sub>2</sub>, ..., *x<sub>n</sub>*]<sup>*T*</sup> that optimizes (maximizes or minimizes) a vector function *F*(*x*) whose elements represent the objective functions of the problem [2], where:

$$F(\mathbf{x}) = [f\_1(\mathbf{x}), f\_2(\mathbf{x}), \dots, f\_k(\mathbf{x})], \quad f\_i: \mathbb{R}^n \to \mathbb{R} \tag{9}$$

subject to:

$$g\_i(\mathbf{x}) \le 0; \ i = 1, 2, \dots, m$$

$$h\_j(\mathbf{x}) = 0; \ j = 1, 2, \dots, p$$

where:

*n* is the number of decision variables, *k* is the number of objective functions, *m* is the number of inequality constraints, *p* is the number of equality constraints.

Because several objectives are optimized at once, the notion of optimum is different in these cases. It was generalized by Pareto [21] and is commonly known as Pareto optimality.

In multi-objective algorithms, the concept of Pareto dominance is frequently used when comparing two solutions and determining whether one dominates the other.

One solution <sup>→</sup> *xa* is said to dominate another <sup>→</sup> *xb* if the following conditions are met (for the minimization case): 

1. The solution <sup>→</sup> *xa* is no worse than <sup>→</sup> *xb* in all objectives:

$$f\_i\left(\vec{x}\_a\right) \le f\_i\left(\vec{x}\_b\right), \qquad \forall i \in \{1, 2, \dots, k\} \tag{10}$$

2. The solution <sup>→</sup> *xa* is strictly better than <sup>→</sup> *xb* in at least one objective:

$$f\_i\left(\vec{x}\_a\right) < f\_i\left(\vec{x}\_b\right), \qquad \exists i \in \{1, 2, \dots, k\} \tag{11}$$

If either of the conditions (1) or (2) is violated, the solution <sup>→</sup> *xa* does not dominate the <sup>→</sup> *xb* solution. That is, for one solution to dominate another, it needs to be strictly better in at least one objective and not worse in any of the rest. Within a set, a non-dominated solution has no other solution that dominates it. When comparing two solutions <sup>→</sup> *xa* and <sup>→</sup> *xb*, there can only be three possible outcomes:


*Pareto optimal set*. For a given MOP, the Pareto optimal set is defined as *P*∗ = {*x* ∈ *S* | ∄ *x*′ ∈ *S* : *F*(*x*′) ≺ *F*(*x*)}.

*Pareto front*. For a given MOP and its Pareto optimal set *P*∗, the Pareto front is defined as *PF*∗ = {*F*(*x*), *x* ∈ *P*∗}.
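The dominance conditions in Equations (10) and (11) and the Pareto front of a finite set can be sketched in Python as follows, for the minimization case. The function names are hypothetical, and the three outcomes returned by `compare` follow the standard reading of a pairwise dominance check: the first solution dominates the second, the second dominates the first, or neither dominates the other.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(fa: Sequence[float], fb: Sequence[float]) -> bool:
    """True if fa is no worse in every objective and strictly better in one."""
    no_worse = all(a <= b for a, b in zip(fa, fb))        # Equation (10)
    strictly_better = any(a < b for a, b in zip(fa, fb))  # Equation (11)
    return no_worse and strictly_better


def compare(fa: Sequence[float], fb: Sequence[float]) -> str:
    """Classify a pair of objective vectors into one of three outcomes."""
    if dominates(fa, fb):
        return "a_dominates_b"
    if dominates(fb, fa):
        return "b_dominates_a"
    return "non_dominated"


def pareto_front(points: Sequence[Point]) -> List[Point]:
    """Keep the objective vectors that no other vector in the set dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

Note that two identical objective vectors fall into the third outcome, since neither is strictly better than the other in any objective.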

#### *2.3. Elitist Non-Dominated Sorting Genetic Algorithm-II (NSGA-II)*

Elitist Non-dominated Sorting Genetic Algorithm-II (NSGA-II) [22] is one of the most popular algorithms for solving multi-objective problems due to its simplicity and effectiveness. The algorithm first generates a competitive population of individuals that is then ordered according to the level of dominance of each individual in the population. This level of dominance generates different fronts; the first front contains the non-dominated solutions. Solutions from this first front, the elite solutions, are passed on to the next generation along with other solutions in such a way that diversity is preserved.

Like any genetic algorithm, it applies evolutionary operators (crossover and mutation, among others). The non-dominated solutions of the last generation are an approximation to the Pareto front.
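The ordering into fronts can be illustrated with a short Python sketch that repeatedly peels off the non-dominated members of the remaining population (minimization assumed). This is a simple quadratic-per-front version written for clarity, not the bookkeeping-based fast non-dominated sort of the NSGA-II paper, and the function names are hypothetical.

```python
from typing import List, Sequence


def dominates(fa: Sequence[float], fb: Sequence[float]) -> bool:
    """Pareto dominance for minimization."""
    return all(a <= b for a, b in zip(fa, fb)) and any(a < b for a, b in zip(fa, fb))


def non_dominated_sort(points: Sequence[Sequence[float]]) -> List[List[int]]:
    """Return fronts as lists of indices; front 0 holds the non-dominated points."""
    remaining = list(range(len(points)))
    fronts: List[List[int]] = []
    while remaining:
        # A point belongs to the current front if nothing left dominates it.
        front = [
            i for i in remaining
            if not any(dominates(points[j], points[i]) for j in remaining)
        ]
        fronts.append(front)
        remaining = [i for i in remaining if i not in front]
    return fronts
```

Each pass finds at least one non-dominated point among those remaining, so the loop always terminates.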

#### *2.4. Fernandez's Preference Model*

Fernandez et al. [3] assumed that there are methods for assigning a degree of truth *σ*(*x*,*y*) in [0, 1] to the predicate *xSy* "*x* is at least as good as *y*". Outranking methods such as ELECTRE-III [23,24] and PROMETHEE [25] can be used for this purpose. This work computes *σ*(*x*,*y*) based on ELECTRE-III and uses the thresholds *λ*, *β* and *ε* to transform the fuzzy preference relations into crisp preference relations.

The resulting relational system of preference defines five crisp relations. This system considers that: (1) *ε* < *β* < *λ* < 1; (2) the value *λ* > 0.5 is the outranking credibility threshold; (3) the value *β* is the asymmetry parameter; and (4) the value *ε* is the symmetry parameter. The formal definitions of the relations are the following:

Strict Preference: This corresponds to the existence of clear and positive reasons that justify significant preference in favor of one (identified) of the two actions. The statement *x* is strictly preferred to *y* is denoted by *xPy* and exists if at least one of the following conditions holds.


Indifference: This corresponds to the existence of clear and positive reasons that justify equivalence between the two actions. The statement *x* is indifferent to *y* is denoted by *xIy* and it occurs if all the following conditions are met:


Weak Preference: This arises when indifference and strict preference cannot be distinguished appropriately. The statement *x* is weakly preferred to *y* is denoted by *xQy* and it occurs if all the following conditions are satisfied.


Incomparability: This corresponds to a high heterogeneity among alternatives causing that none of the preceding situations predominates. The statement *x* is incomparable to *y* is denoted by *xRy* and it must satisfy the following condition:

(1) *σ*(*x*,*y*) < 0.5 ∧ *σ*(*y*,*x*) < 0.5.

K-preference: This arises when strict preference and incomparability cannot be distinguished appropriately. The statement *x* is *k*-preferred to *y* is denoted by *xKy* and it exists if the following conditions are satisfied:


Fernandez et al. [3] used the above relations over a feasible set of solutions *O* of an optimization problem to define the best compromise according to the DM's preferences. The elements to determine the model include the following ones:


• The net-flow-score outranked frontier *NF* = {*x* ∈ *O* | *card*((*FO*)<sub>*x*</sub>) = 0}, where *x* ∈ *O* is a feasible solution and (*FO*)<sub>*x*</sub> = {*y* ∈ *NS* | *Fn*(*y*) > *Fn*(*x*)} is the set of non-strictly outranked solutions with a larger net flow score than *x*. The net flow score *Fn*(*x*) = ∑<sub>*y* ∈ *NS*−{*x*}</sub> [*σ*(*x*,*y*) − *σ*(*y*,*x*)] is a popular measure in the literature and, in this work, offers a further ranking of the solutions inside the non-strictly outranked frontier.

Hence, based on the previous sets, the best compromise for the DM is any solution to the optimization problem defined in Equation (12), with a preemptive priority favoring *card*(*SO*).

$$\min\_{x \in O} \left\{ \left\langle \left| S(O, x) \right|, \left| W(O, x) \right|, \left| F(O, x) \right| \right\rangle \right\} \tag{12}$$

In summary, the preferences model is the relational system of preferences previously presented. Based on it, the best compromise is any solution in the Pareto frontier of the optimization problem shown in Equation (12).

#### *2.5. Preference-Disaggregation Analysis (PDA)*

The parameters of an outranking model (weights, thresholds, etc.), such as the preference model used in the present work, must be elicited. Direct procedures that ask a DM for proper values to be assigned are commonly used; however, in such approaches, DMs have difficulties when they are asked to assign values to parameters whose meanings they do not understand [26]. On the other hand, indirect procedures, which compose the so-called preference-disaggregation analysis (PDA), use regression-like methods for inferring a set of parameters from a battery of decision examples [27]. In [28], a new optimization model for PDA is solved with the NSGA-II algorithm. According to Greco et al. [29], MCDA approaches based on disaggregation paradigms are of interest because of their simplicity and the reduced cognitive effort required from the DM. The use of an ordinal classification on the examples is an easy way for a DM to provide his/her preferences.

#### *2.6. THESEUS*

Fernandez proposed in [30] the THESEUS approach that is based on transforming the sorting problem into a particular case of the selection problem. THESEUS assigns new objects to the categories already defined in the set of references, comparing the object with the inconsistencies of the possible assignment and the information of various preference relations that can be strict, weak or indifferent; these are derived from a fuzzy outranking relation, described in Section 2.4. The category assignment is the consequence of comparisons with other objects whose categories are known.

The THESEUS method is based on the following premises:


The Hybrid Multi-Criteria Sorting Genetic Algorithm (H-MCSGA) presented in [14] uses the THESEUS method to assign solutions to two ordered categories (satisfactory and unsatisfactory). THESEUS is combined with the non-dominated sorting of an evolutionary algorithm to increase the selective pressure towards the RoI.

#### **3. Description of P-HMCSGA**

P-HMCSGA is an algorithm designed to solve MOPs, which allows the DM to specify his/her preferences by a realistic profile. Fernandez and Navarro [30] propose an outranking preference model that supports incorporating these preferences, which are regularly obtained by assignment examples. In outranking models, approaches like preference-disaggregation analysis [31,32] reflect preferences into these models' parameters. In this paper, the terms direct and indirect concern the way to determine the outranking model parameters.

P-HMCSGA performs the multi-objective optimization process in three phases. In the first phase, the DM specifies preferences in a profile that characterizes her/him, which makes it possible to categorize solutions as good or bad. The second phase transforms the categorized solutions into preference model parameters. These two phases correspond, respectively, to the indirect and direct elicitation of preferences mentioned in [33,34]. Finally, the third phase incorporates the preferences into the solution process as the parameters of the preference model, supporting a multi-criteria classifier. Figure 1 illustrates these steps and the next sections explain each one.

**Figure 1.** Phases of incorporating preferences in the optimization process.

The profiles are proposed to characterize a DM's preferences expressed in understandable terms, avoiding the cognitive effort involved in selecting, from a solutions sample, the ones as close as possible to his/her preferences. This difficulty increases with the number of objectives.

An example of this characterization would be the profile of a DM who wishes to favor portfolios in which the number of supported projects is maximized. Another DM could be more interested in reducing the consumption of the available budget.

Depending on the selected profile, through profile generators, reference sets are formed to use in different parts of the optimization process. In this algorithm, the following generators are proposed.


#### *3.1. Phase 1: Indirect Preference Elicitation*

In this phase, we start from the idea of presenting solutions to a decision-maker; these solutions can be obtained through a solution generator, a repository, etc. Commonly, the DM is expected to select from these solutions those that are representative of the good category and others to be placed in the category of bad solutions. Instead, to reduce the DM's cognitive load, the generator-α method in Figure 2 lets the DM provide a simple preference profile, which then emulates him/her in the construction of the reference set. This is a set of classified solutions that serves as training data, reflecting the DM's preferences in a categorized way. Only two categories are considered at this time: good and bad solutions.

**Figure 2.** Profile-generator-α: the profiling method to imitate a decision maker in categorizing solutions.

Step 1. In this first part of the optimization process, the optimization instance is introduced to a solution generator without preferential support to generate feasible solutions.

Step 2. The DM selects a profile, which, together with the generated solutions, is the input of a categorizer that separates the good and bad solutions according to the preference profile.

Step 3. For the categorizer, the input is a set of solutions and the selected profile-α. Depending on the profile, the selection of solutions could require additional information. In this step, a coincidence count is made for each solution according to the α profile and, once the coincidence count has been completed, the solutions are sorted in descending order of this count.

Step 4. The *n* solutions with the greatest coincidence are selected to form the category of good solutions. In the same way, the *n* solutions with the lowest coincidences form the category of bad solutions.

Step 5. These two sets form the reference set or categorized examples to use in the next phase.
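Steps 3 to 5 can be sketched as follows. This is a minimal illustration, assuming that a portfolio is a binary vector and that the coincidence count for an established-projects profile is the number of wanted projects it contains; the names `categorize` and `established_projects_count` are hypothetical and do not come from the original implementation.

```python
from typing import Callable, List, Sequence, Tuple

Portfolio = Tuple[int, ...]  # assumed encoding: 1 if project i is supported

def categorize(
    solutions: Sequence[Portfolio],
    coincidence: Callable[[Portfolio], int],
    n: int,
) -> Tuple[List[Portfolio], List[Portfolio]]:
    """Return (good, bad): the n highest and n lowest coincidence counts."""
    if n < 1 or 2 * n > len(solutions):
        raise ValueError("n must satisfy 1 <= n <= len(solutions) / 2")
    # Step 3: descending order of the coincidence count.
    ranked = sorted(solutions, key=coincidence, reverse=True)
    # Step 4: top n form the good category, bottom n the bad category.
    return list(ranked[:n]), list(ranked[-n:])

def established_projects_count(wanted: Sequence[int]) -> Callable[[Portfolio], int]:
    """Hypothetical coincidence count: wanted projects present in a portfolio."""
    return lambda portfolio: sum(portfolio[i] for i in wanted)

# Toy sample of four portfolios over four projects (assumed data).
sample = [(1, 1, 1, 0), (1, 0, 0, 0), (0, 0, 1, 1), (1, 1, 0, 1)]
good, bad = categorize(sample, established_projects_count([0, 1]), n=1)
print(good)  # [(1, 1, 1, 0)]
print(bad)   # [(0, 0, 1, 1)]
```

The pair `(good, bad)` plays the role of the reference set of Step 5. Ties are resolved by input order here because Python's sort is stable; the original method may break ties differently.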

#### *3.2. Phase 2: Direct Preference Elicitation*

For the DM, it is easier to indicate their preferences in profiles (converted a posteriori into categorized solutions) than to do so directly by assigning weights to objectives, establishing acceptance ranges for each criterion, or giving preferential model parameters. For Phase 2, the use of a PDA method [31] is proposed to transform the categorized preference information into parameters of a preferential model [30] that is part of the search process; this phase is shown in Figure 3.

**Figure 3.** Instance generation process for PDA.

Step 1. For this Phase 2, an instance generator method was developed so that PDA transforms the DM's preferences into parameters of a preferential model. This generator receives as the first input the categorized examples obtained in Phase 1.

Step 2. The second input for the instance generator is the parameter ranges. The estimator of the feasibility region method obtains these approximate reference parameters from the initial optimization instance's objectives. These values are adjusted according to the set of references (categorized examples) also introduced to PDA.

Step 3. Once the approximate reference parameters have been calculated, they are joined with the set of references obtained in Phase 1 to generate an input instance for the PDA.

Step 4. In the PDA procedure, the preferences, expressed in categorized examples, are transformed into preferential model parameters. At the end of Phase 2, a set of preferential model parameters is obtained for incorporation into the search process of Phase 3.

#### *3.3. Phase 3: Incorporation of Preferences in the Solution Process*

Initially, in Phase 1, the preferences are introduced as a profile and converted to a reference set. Then, in Phase 2, they are reflected in preference model parameters. Figure 4 shows the process of Phase 3, in which the preferences are incorporated into the optimization process.

**Figure 4.** Phase 3: Incorporation of preferences in the solution process.

Step 1. Once the parameters for the preferential model have been generated using PDA, the optimization instance incorporating these parameters is generated so that the preferences are included in the search process.

Step 2. With the instance with preferences, an initial search is conducted with a strategy that approximates the region of interest. For this, an alternatives generator with preferences, such as the Non-Outranked Ant Colony Optimization (NO-ACO) proposed in [36] and modified in [14], finds a sample of satisfactory and unsatisfactory solutions, considering net flow, strict preference and Pareto dominance (see Section 2). In [37], the NO-ACO algorithm uses the three-objective Equation (11) as a surrogate model to solve PPP instances.

Step 3. The obtained solutions sample is introduced to the profile-generator-β to form a reference set, considering that the solutions satisfy the DM requested profile and his/her tolerance. The reference set includes good solutions from the satisfactory set and bad solutions from the unsatisfactory ones.

Step 4. After this, the parameterized optimization instance and the reference set obtained with the profile-generator-β are introduced to the H-MCSGA optimizer [14], which uses outranking classification to add more solution-discrimination capability to the sorting process of NSGA-II. Based on the preference model, this strategy makes a better approximation towards the DM's region of interest.

Step 5. The architecture of the proposed algorithm facilitates the interactive incorporation of preferences to allow the DM to refine them. Every certain number of iterations, the DM can give feedback with a sample of the best solutions obtained for the profile specified, with the possibility of choosing the ones most preferred. This set of solutions can enrich the initial reference set to direct the search more intensively toward the refined region of interest. The evaluation of interactive preference incorporation remains as future work.

Figure 5 shows the proposed P-HMCSGA and how the three phases described above take part in it.

**Figure 5.** The Preference Hybrid Multi-Criteria Sorting Genetic Algorithm (P-HMCSGA) for incorporating preferences in optimizers.

#### **4. Experimental Design and Results**

This section describes the experimental process that evaluates P-HMCSGA. The process analyzes the algorithm in two experiments. The first experiment analyzes the effect of the profile of a decision maker in the search process. The second experiment studies the error variability of initial reference solutions and their impact on final solutions.

The experimental design tested the approach, in each experiment, on three different profiles, one configuration for the involved algorithms and six instances of the project portfolio problem (PPP). The performance of P-HMCSGA was measured using five different indicators that reflect how well it approximates the Pareto front with and without preferences and how well it adjusts the portfolios to the specified profiles.

A summary of the steps followed during the experiment is depicted in Figure 6. The remainder of the section details the elements utilized and the results obtained.

**Figure 6.** General experimental design with P-HMCSGA.

#### *4.1. DM Profiles*

A profile refers to the method used by a DM to make a decision. The evaluation of P-HMCSGA uses the following three preferential profiles:

Established projects: The DM predefines a set of projects that the portfolio should contain; one possible reason for this profile is that these projects have been beneficial in the past.

Preference in the area and/or region: The DM prefers portfolios with a higher number of supported projects in a pre-specified area or region.

Cardinality: The DM favors portfolios that maximize the number of supported projects.

#### *4.2. Algorithm Configuration*

Table 1 shows the algorithms and their parameters' values used for each process involved in the P-HMCSGA. The process is shown in Column 1. Column 2 indicates which strategy was used in each process. Columns 3 to show the configuration of the parameters' values used in each strategy. The used genetic operators were the same as reported in [8,10,17] for each approach.


**Table 1.** Algorithms used in the proposed P-HMCSGA.

#### *4.3. Instances of the Project Portfolio Problem*

The instances of the PPP proposed in [39] served as a benchmark for the evaluation of P-HMCSGA. Table 2 summarizes the details of the instances. The medium-scale instances have nine objectives and the large-scale instances have 16 objectives.


**Table 2.** Description of the instances used in the experimentation.

#### *4.4. Quality Indicators to Evaluate Solutions*

This work uses five different indicators to evaluate the performance of the proposed P-HMCSGA algorithm. These metrics are detailed in the remainder of this section.

Indicator *ND<sub>A</sub>* measures the non-dominance proportion achieved over an approximated Pareto front (PF) *A*. Equation (13) computes this indicator as the quotient of the size of the set of non-dominated solutions *F<sub>0</sub>* produced by P-HMCSGA and the size of set *A*.

$$ND\_A = \frac{|F\_0|}{|A|} \times 100\tag{13}$$

Indicator *PSO<sub>A</sub>* measures the proportion between non-strictly outranked solutions (i.e., solutions that are hard to distinguish by preference according to a specific DM; see Section 2) and an approximated PF *A*. Equation (14) computes this indicator as the quotient of the size of the approximated non-strictly outranked frontier *F<sub>NSO</sub>* produced by P-HMCSGA and the size of the set *A*.

$$PSO\_A = \frac{|F\_{NSO}|}{|A|} \times 100\tag{14}$$

Indicator PC measures the percentage of maximum cardinality achieved by the solutions reported by P-HMCSGA. Equation (15) computes it as the quotient between the number of supported projects in a portfolio, sp, and the estimated maximum number of projects that could ever be supported, ems.

$$PC = \frac{sp}{ems} \times 100\tag{15}$$

Indicator PES (previously established projects) measures the proportion of the DM's previously established projects that are found in a portfolio generated by P-HMCSGA. Equation (16) computes it as the quotient between ep, i.e., the number of projects in a portfolio that are wanted by the DM, and EP (established projects), the maximum number of wanted projects.

$$PES = \frac{ep}{EP} \times 100\tag{16}$$

Finally, indicator PAR (project in area/region) is the proportion of supported projects that agree with the area/region desired in the portfolio and established by the DM. Equation (17) measures this proportion as the quotient between EAR, the number of projects in the portfolio constructed by P-HMCSGA that satisfy the area and region conditions of the DM, and q, the number of projects in the instance that satisfy the area and region conditions established by the DM.

$$PAR = \frac{EAR}{q} \times 100\tag{17}$$

The ND<sub>A</sub> and PSO<sub>A</sub> indicators are referred to as general quality indicators because they measure the quality of a strategy based on its closeness to the PF or the RoI, the general metric evaluations for multi-criteria algorithms. The indicators PC, PES and PAR are indicators of specific quality because they measure how well the constructed portfolios respect the preferences established by a particular preference profile.
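Since Equations (13) to (17) share the same form, a quotient scaled to a percentage, they can be sketched with one helper. The function names below are illustrative assumptions; the inputs are the set sizes and project counts named in the text.

```python
def _percentage(part: float, whole: float) -> float:
    """Shared form of Equations (13)-(17): (part / whole) x 100."""
    if whole <= 0:
        raise ValueError("the denominator must be positive")
    return 100.0 * part / whole

def nd_a(f0_size: int, a_size: int) -> float:
    """Equation (13): |F0| / |A| x 100."""
    return _percentage(f0_size, a_size)

def pso_a(fnso_size: int, a_size: int) -> float:
    """Equation (14): |F_NSO| / |A| x 100."""
    return _percentage(fnso_size, a_size)

def pc(sp: int, ems: int) -> float:
    """Equation (15): supported projects over estimated maximum."""
    return _percentage(sp, ems)

def pes(ep: int, established: int) -> float:
    """Equation (16): wanted projects in the portfolio over EP."""
    return _percentage(ep, established)

def par(ear: int, q: int) -> float:
    """Equation (17): projects meeting area/region conditions over q."""
    return _percentage(ear, q)

# Example: a portfolio holding 12 of 15 established projects.
print(pes(12, 15))  # 80.0
print(pes(15, 15))  # 100.0
```

All five indicators therefore range from 0 to 100 whenever the numerator counts a subset of what the denominator counts.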

#### *4.5. Experiment 1: Effect of the Profile of a Decision Maker in the Search Process*

This experiment is based on the idea that the same solutions, presented to two decision makers with different profiles, may be good for one of them and not for the other, or only some of them may be good. Figure 7 shows an example: for a DM that seeks to maximize the number of projects included in the portfolio, both solutions satisfy that requirement; however, if the same solutions are presented to a DM whose profile establishes that project number 3 must be supported, the first solution is not acceptable because it does not include that project.

**Figure 7.** Solutions evaluated in different profiles.

From the above, an experiment was designed; its process is shown in Figure 8. P-HMCSGA solves each instance using each of the *n* profiles (the approaches are configured as defined in Section 4.2). With the results, a matrix of size *n* × *n* is formed. The set of solutions produced using each profile *i* is compared against the other profiles *j* in order to estimate how well the solutions that P-HMCSGA obtains for profile *i* satisfy the other profiles *j* not considered during the search. Hence, a cell (*i*,*j*) contains the number of portfolios that satisfy profiles *i* and *j*. Appendices A and B show the complete set of results derived from experimenting with the considered set of instances; the remainder of the section presents a summary based on selected cases.

**Figure 8.** Evaluation of the profile according to the profile criteria.
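The construction of the *n* × *n* matrix can be sketched as below. The sketch assumes that each profile is reduced to a yes/no predicate over a portfolio (for example, a minimum accepted value) and that `results[i]` holds the portfolios found when the search used profile *i*; the two toy profiles and their thresholds are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Portfolio = Tuple[int, ...]          # assumed encoding: 1 if project i is supported
Predicate = Callable[[Portfolio], bool]

def cross_profile_matrix(
    results: Dict[str, List[Portfolio]],
    profiles: Dict[str, Predicate],
) -> List[List[int]]:
    """Cell (i, j): portfolios found with profile i that satisfy both i and j."""
    names = list(profiles)
    return [
        [
            sum(1 for p in results[i] if profiles[i](p) and profiles[j](p))
            for j in names
        ]
        for i in names
    ]

# Two toy profiles: at least three supported projects, and project 3 supported.
profiles = {
    "cardinality": lambda p: sum(p) >= 3,
    "established": lambda p: p[2] == 1,
}
results = {
    "cardinality": [(1, 1, 0, 1), (1, 1, 1, 0)],
    "established": [(0, 0, 1, 0), (1, 1, 1, 0)],
}
print(cross_profile_matrix(results, profiles))  # [[2, 1], [1, 2]]
```

In this toy case the largest counts fall on the main diagonal, which is the pattern the experiment looks for.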

Table 3 shows the different DM profiles considered in the experiment for all the instances. Each profile defines two values: the expected value, which is the number of elements required to satisfy a DM completely, and the minimum accepted, which is the minimum number of elements necessary to consider a solution satisfactory. The maximum found shows the best match obtained from a portfolio constructed by P-HMCSGA.



Table 4 shows the matrix that gathers the results of the evaluations in the profiles. Each row corresponds to the profile used by P-HMCSGA to approximate the RoI. Each column shows how the best value obtained by fixing that profile behaves in the other profiles. The value in parentheses indicates the best value in the compared profile; the value outside is the number of solutions that obtained that value.

**Table 4.** Matrix of results of satisfactory solutions evaluated in other preferential profiles using instance o9p100\_1.


The results from Table 4 show that the highest number of solutions coincide with the main diagonal; this demonstrates that the use of profiles in the search process of P-HMCSGA indeed pursues the construction of portfolios that satisfy such preference conditions.

Table 5 shows the results from the measurements established by the indicators defined in Section 4.4. Again, the highest scores are in the main diagonal and are achieved when the indicator matches the profile used during the search process. These results corroborate the fact that the use of profiles favors the construction of solutions that satisfy a DM's preferences.

**Table 5.** Specific quality indicators for each profile as established by the DM using instance o9p100\_1.


Figure 9 illustrates the process used to identify the approximate Pareto front and non-strictly outranked sets from P-HMCSGA on each profile. All the solutions considered satisfactory for each profile were gathered in bags of satisfactory solutions, i.e., sets of non-repeated solutions that satisfy the non-dominance or non-strictly outranked conditions.


**Figure 9.** Obtaining bags of satisfactory solutions for each profile evaluated.
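The non-dominance part of a bag can be sketched with a standard Pareto filter. This is a generic illustration, not the authors' code: it assumes that every objective is maximized and that each solution is represented by its vector of objective values.

```python
from typing import Iterable, List, Sequence, Tuple

def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b everywhere and better somewhere (maximization)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def non_dominated_bag(vectors: Iterable[Sequence[float]]) -> List[Tuple[float, ...]]:
    """Drop repeated vectors, then keep those that no other vector dominates."""
    unique = list(dict.fromkeys(tuple(v) for v in vectors))  # keeps first-seen order
    return [a for a in unique if not any(dominates(b, a) for b in unique)]

# Toy objective vectors with one repeat and one dominated point.
print(non_dominated_bag([(3, 1), (2, 2), (1, 1), (3, 1)]))  # [(3, 1), (2, 2)]
```

The size of the resulting bag is the kind of quantity that enters the numerator of Equation (13). The filter is quadratic in the number of solutions, which is acceptable for sets of about a hundred portfolios.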

Table 6 summarizes the measurements on the *ND* and *NSO* indicators obtained from the cardinality profile. Using P-HMCSGA to approximate the PF and the RoI on instance o9p100\_1, under the cardinality profile, the set *A* of reported portfolios was of size 99. The numbers of portfolios that satisfy the cardinality, established projects and area-region profiles are shown in row one and columns 3, 5 and 7, respectively.

**Table 6.** Dominance and strictly-outranked in cardinality profile using instance o9p100\_1.


The proportion of non-dominated solutions on each profile is shown in row 2 and columns 4, 6 and 8. The proportion of non-strictly outranked solutions on each profile is shown in row 3 and columns 4, 6 and 8. The results show that if we use cardinality in the profile, the best measures for indicators *ND* and *NSO* are obtained when comparing against the same cardinality.

Table 7 summarizes the measurements on the *ND* and *NSO* indicators obtained from the established projects profile. Using P-HMCSGA to approximate the PF and the RoI on instance o9p100\_1, under the established projects profile, the set *A* of reported portfolios was of size 38. The numbers of portfolios that satisfy the cardinality, established projects and area-region profiles are shown in row one and columns 3, 5 and 7, respectively. The proportion of non-dominated solutions on each profile is shown in row 2 and columns 4, 6 and 8. The proportion of non-strictly outranked solutions on each profile is shown in row 3 and columns 4, 6 and 8. The results show that if we use established projects in the profile, the best measures for indicators *ND* and *NSO* are obtained when comparing against the same established projects.



Table 8 summarizes the measurements on the *ND* and *NSO* indicators obtained from the area-region profile. Using P-HMCSGA to approximate the PF and the RoI on instance o9p100\_1, under the area-region profile, the set *A* of reported portfolios was of size 56. The numbers of portfolios that satisfy the cardinality, established projects and area-region profiles are shown in row one and columns 3, 5 and 7, respectively. The proportion of non-dominated solutions on each profile is shown in row 2 and columns 4, 6 and 8. The proportion of non-strictly outranked solutions on each profile is shown in row 3 and columns 4, 6 and 8. The results show that if we use area-region in the profile, the best measures for indicators *ND* and *NSO* are obtained when comparing against the same area-region.

**Table 8.** Dominance and strictly-outranked in area-region profile.


Tables 6–8 show that the percentage of solutions remaining satisfactory (non-dominated and non-strictly outranked) is very high among the solutions obtained from the parameter configuration corresponding to the specified profile. When the search was not configured according to the profile of interest, it obtained few solutions (which were not repeated). Both complementary results show that the search direction depends on the preference profile established by the DM. The solutions obtained from the configuration of specific parameters for the profile are good in dominance and outranking. Moreover, using these parameters, it is possible to find a greater number of satisfactory solutions for the profile.

#### *4.6. Experiment 2: Error Variability of Initial Reference Solutions and Its Impact on Final Solutions*

The objective of this experiment is to analyze how the quality of an initial reference set affects the performance of P-HMCSGA. For this purpose, the implementation of the algorithm considers the use of two types of reference sets. The low-quality reference set (denoted "Low") has solutions around the minimum value in a profile that is considered satisfactory for a DM. The high-quality reference set (denoted "Good") has solutions close to the maximum possible value of satisfaction for the chosen profile. The experiment compares the final set of solutions produced by P-HMCSGA using each of the reference sets in terms of the level of satisfaction of the profile and the number of solutions produced. The results show that using a robust reference set formed by solutions of high quality improves the performance of P-HMCSGA and allows it to find solutions that better satisfy the preferential profile.

For this experiment, P-HMCSGA was configured with the values in Table 1, except that the number of executions of the generation of alternatives without preferences was set to 1 for simplicity. The instance considered for the experiment was o9p100\_1. The profile used was established projects, and it fixes 15 projects as desired by the DM (expected column in Table 9); in addition, it considers as satisfactory any solution containing at least 11 of these projects (minimum accepted column in Table 9). The low-quality reference set has 20 portfolios or solutions; two of them contain 12 of the desired 15 projects and the remaining ones contain only 11 projects. The high-quality reference set (or robust reference set) also has 20 portfolios; however, three of them have all the desired projects and 17 of them have 14. Table 9 also shows the maximum number of desired projects that could be found in a solution constructed by P-HMCSGA in the experiment; those solutions could only be found using a reference set of high quality.
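To make the difference between the two reference sets concrete, the short sketch below rebuilds their compositions from the counts given above and evaluates the PES indicator of Equation (16) on them. The mean PES is an illustrative summary added here, not a value reported in the experiment.

```python
EP = 15            # desired (established) projects in the profile
MIN_ACCEPTED = 11  # minimum number of desired projects for a satisfactory solution

# Number of desired projects contained in each of the 20 portfolios.
low_rs = [12] * 2 + [11] * 18
good_rs = [15] * 3 + [14] * 17

def summarize(reference_set):
    """Size, satisfaction check and PES statistics of a reference set."""
    return {
        "size": len(reference_set),
        "all_satisfactory": all(c >= MIN_ACCEPTED for c in reference_set),
        "best_pes": 100.0 * max(reference_set) / EP,
        "mean_pes": 100.0 * sum(reference_set) / (len(reference_set) * EP),
    }

print(summarize(low_rs))   # best_pes 80.0, mean_pes 74.0
print(summarize(good_rs))  # best_pes 100.0, mean_pes about 94.33
```

Both sets contain only satisfactory portfolios; they differ in how close those portfolios are to the expected value of 15.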

**Table 9.** Values that satisfy the DM with the established projects profile on instance o9p100\_1.


Table 10 shows the results of this experiment. In row 1, the column "Maximum in RS" shows the composition of the reference sets Low and Good; the column "Finals" shows the composition of the satisfactory solutions found by P-HMCSGA. In both columns, the notation *X*(*Y*) indicates that there are *X* solutions having *Y* desired projects from the 15 considered in the profile. In row 2, the best value achieved according to the PES indicator is shown, considering the composition of the portfolios reported and 15 as the maximum number of desired projects, EP. It is important to note the contrast between the solutions obtained using reference sets of different qualities. The quality of the solutions obtained varies significantly in favor of the robust reference set: far more of its solutions contain a greater number of desired projects, and those solutions present the highest value in PES, indicating that they are closer to the DM's profile than the others.

**Table 10.** Error variability of initial reference solutions and its impact on final solutions.


Finally, Table 11 shows the summary of the dominance comparison of the solutions found using the different reference sets. In the case of the final solutions obtained using the low-quality reference set, of the six acceptable solutions according to the preference profile, five remained non-dominated. On the other hand, the use of the high-quality reference set yields a larger number of non-dominated solutions (97.73% of all those reported), all of which are also satisfactory to the DM.


**Table 11.** Dominance comparison of solutions with good quality RS and low quality RS.

#### **5. Conclusions**

We present P-HMCSGA, a hybrid evolutionary algorithm for solving multi-objective optimization problems. The algorithm incorporates the DM's preferences to guide the search towards the region of interest that the DM desires the algorithm to approach. The DM gives a preferential profile containing their preferences expressed in understandable terms, which demands less cognitive effort than selecting solutions from a generated sample, a common way to elicit preferences. P-HMCSGA reflects this preference profile in the outranking relational parameters of a multi-criteria ordinal classification method. These strategies add more solution-discrimination capability to the well-known non-dominated sorting process incorporated in P-HMCSGA, achieving a better approximation towards the DM's region of interest.

The proposed algorithm satisfies some desirable features of a preferences incorporation method: (a) the DM interacts easily with the solutions sample generator method, which decreases the DM's cognitive effort: the DM gives a simple preferential profile that is used to automatically separate solutions into "good" and "bad" and to form the reference set for a classifier method; (b) the multi-criteria preferences outranking model is compatible with relevant characteristics of real DMs expressed in preference profiles and preferences are rarely associated with realistic DM's characteristics; (c) it is possible to estimate the preference model parameters from the profile provided by the DM during the interactive process; (d) to a certain extent, the profile generator and the ordinal classification replace the DM during the optimization process, bringing valuable aid to the DM, especially in the presence of a high number of objectives.

Our algorithm obtains, as output, a set of non-dominated solutions that belong to the DM's best class. The experiments were carried out to solve a project portfolio optimization problem with instances of nine and sixteen objectives. From these experiments, we can observe that P-HMCSGA has performed as expected and draw the following conclusions:


The running time of P-HMCSGA is at most twice the time spent by the slowest inner strategy. This can be observed from the results when analyzing the instances with nine and 16 objectives separately. In the nine-objective instances, A2-NSGA-III was the most time-consuming strategy, requiring at least four times the amount of time of any other. In the sixteen-objective instances, NO-ACO was the slowest and needed at least five times longer than any other. Hence, the accumulated time would be, at most, twice that of the slowest strategy.
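The bound can be checked with a short worst-case calculation. As an assumption for illustration, let *T* be the running time of the slowest strategy, *S* the accumulated time, and *k* the number of remaining strategies, each taking at most *T*/*r*, with *r* = 4 for the nine-objective instances and *r* = 5 for the sixteen-objective ones:

$$S \le T + k\,\frac{T}{r} = T\left(1 + \frac{k}{r}\right) \le 2T \quad \text{whenever } k \le r$$

Under this reading, the accumulated time stays within twice that of the slowest strategy as long as no more than *r* other strategies take part in the run. The value of *k* is not restated here; it depends on the processes listed in Table 1.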

We have not included a comparison with other algorithms because our preference representation requires conversion methods to make a fair and comparable evaluation. Another possible analysis is to implement our preference incorporation method into several state-of-the-art multi-objective optimization metaheuristics to determine the advantages and limits of our proposal. However, this is beyond the scope of this paper.

Future work is proposed to verify the algorithm's interactive capacity, which can enrich the initial reference set with new solutions to intensify the search toward the region of interest. In addition to interactivity, a DM might be interested in specifying their profile with more than one preference. For example, in the project portfolio problem, a DM may prefer portfolios that have a small number of projects and whose cost is at most a certain quantity. Some voting strategies could be useful to deal with profiles that include several preferences. Pareto Explorer [40] is a recent tool that incorporates the user's preferences, articulated either in decision variables, objectives, weight space, or toward knee solutions, in the computation of the "ideal" solution of a given MaOP.

**Author Contributions:** Conceptualization: M.P.-V., L.C.-R., N.R.-V.; Methodology: H.F.-H., C.G.-S.; Investigation: M.P.-V., L.C.-R., N.R.-V.; Software: M.P.-V., C.G.-S., N.R.-V.; Formal Analysis: M.P.-V., L.C.-R., N.R.-V.; Writing original draft: M.P.-V.; Writing review and editing: M.P.-V., L.C.-R., N.R.-V., H.F.-H. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Acknowledgments:** The authors thank CONACYT for supporting the projects from (a) Cátedras CONACYT Program with Number 3058; (b) Project CONACYT A1-S-11012 from Convocatoria de Investigación Científica Básica 2017–2018 and CONACYT Project with Number 312397 from Programa de Apoyo para Actividades Científicas, Tecnológicas y de Innovación (PAACTI), for participation in the Convocatoria 2020-1 Apoyo para Proyectos de Investigación Científica, Desarrollo Tecnológico e Innovación en Salud ante la Contingencia por COVID-19. (c) M. Pérez-Villafuerte would like to thank CONACYT for the support number 293813.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Abbreviations**


#### **Appendix A**

Tables A1–A6 show the matrices of results of satisfactory solutions evaluated in other preferential profiles for instances o9p100\_1, o9p100\_2, o9p100\_3, o9p150\_1, o9p150\_2 and o16p500\_1.


**Table A1.** Matrix of results of satisfactory solutions evaluated in other preferential profiles. Instance o9p100\_1.

**Table A2.** Matrix of results of satisfactory solutions evaluated in other preferential profiles. Instance o9p100\_2.


**Table A3.** Matrix of results of satisfactory solutions evaluated in other preferential profiles. Instance o9p100\_3.


**Table A4.** Matrix of results of satisfactory solutions evaluated in other preferential profiles. Instance o9p150\_1.


**Table A5.** Matrix of results of satisfactory solutions evaluated in other preferential profiles. Instance o9p150\_2.


**Table A6.** Matrix of results of satisfactory solutions evaluated in other preferential profiles. Instance o16p500\_1.


#### **Appendix B**

Table A7 shows the average running time for each instance that was part of the experimentation, expressed in seconds; the time in each algorithm that participates in P-HMCSGA is detailed. The average times shown occurred during the execution of P-HMCSGA with the settings indicated in Table 1. The strongest part is in PDA, used to discover the preferential parameters that will guide the search going forward.

**Table A7.** Average running time in all the process.


#### **References**


## *Article* **Convolutional Neural Network–Component Transformation (CNN–CT) for Confirmed COVID-19 Cases**

**Juan Frausto-Solís 1,\*,†, Lucía J. Hernández-González 1,†, Juan J. González-Barbosa 1,\*,†, Juan Paulo Sánchez-Hernández <sup>2</sup> and Edgar Román-Rangel <sup>3</sup>**


**Abstract:** The COVID-19 disease constitutes a global health contingency. This disease has left millions of people infected, and its spread has dramatically increased. This study proposes a new method based on a Convolutional Neural Network (CNN) and temporal Component Transformation (CT) called CNN–CT. This method is applied to confirmed cases of COVID-19 in the United States, Mexico, Brazil, and Colombia. The CT changes daily predictions and observations to weekly components and vice versa. In addition, CNN–CT adjusts the predictions made by CNN using AutoRegressive Integrated Moving Average (ARIMA) and Exponential Smoothing (ES) methods. This combination of strategies provides better predictions than most of the individual methods by themselves. In this paper, we present the mathematical formulation for this strategy. Our experiments encompass the fine-tuning of the parameters of the algorithms. We compared the best hybrid methods obtained with CNN–CT versus the individual CNN, Long Short-Term Memory (LSTM), ARIMA, and ES methods. Our results show that our hybrid method surpasses the performance of LSTM, and that it consistently achieves competitive results in terms of the MAPE metric, as opposed to the individual CNN and ARIMA methods, whose performance varies largely for different scenarios.

**Keywords:** forecasting; Convolutional Neural Network; LSTM; COVID-19; deep learning

## **1. Introduction**

Coronaviruses are a large family of viruses characterized by having crown-shaped spikes on their surface. Nowadays, there are seven identified types of coronaviruses that can be transmitted among humans. The most dangerous coronaviruses known until recent years are SARS-CoV and MERS-CoV, which caused the severe diseases SARS and MERS in 2003 and 2012, respectively [1]. However, at the end of 2019, in Wuhan, China, the new epidemiological outbreak of COVID-19 emerged; it was caused by the new coronavirus called SARS-CoV-2.

The importance of mathematical models and algorithms to analyze this disease has grown because they allow one to find patterns, make predictions, and understand fluctuations. Epidemiological models can be classified into two groups [2]:

• Dynamic Models. These are classical models that usually divide the population into several subsets known as compartments, for instance, the Susceptible, Infectious, Recovered or SIR model. The SIR model was proposed in 1902 by Sir Ronald Ross and then expanded by Kermack and McKendrick in 1927 [3].

• Forecasting models using time series. Here, we find classical methods such as ARIMA and Exponential Smoothing (ES) [4]. Furthermore, Machine Learning methods like Support Vector Machines [5] and Deep Learning [6] are also in this group.

**Citation:** Frausto-Solís, J.; Hernández-González, L.J.; González-Barbosa, J.J.; Sánchez-Hernández, J.P.; Román-Rangel, E. Convolutional Neural Network–Component Transformation (CNN–CT) for Confirmed COVID-19 Cases. *Math. Comput. Appl.* **2021**, *26*, 29. https://doi.org/10.3390/mca26020029

Academic Editor: Oliver Schütze

Received: 28 February 2021 Accepted: 8 April 2021 Published: 12 April 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

This work presents a new method of the second group, based on Convolutional Neural Network (CNN) [7] and a proposed Component Transformation (CT), which we named CNN–CT, whose mathematical formulation is presented. The CNN–CT method is applied to forecast the number of COVID-19 confirmed cases for the United States (US), Mexico, Brazil, and Colombia [8]. The CT changes daily observations into weekly data and back. The forecast made by our hybrid CNN–CT method is further adjusted either with ARIMA or ES methods. We compared the proposed hybrid method versus the individual methods. Our results show that the combined method consistently achieves competitive results in terms of the MAPE metric, as opposed to any of its elements—CNN, ARIMA, or ES—whose performance as individual methods varies largely for different countries. Moreover, the proposed CNN–CT method also outperforms the Long Short-Term Memory (LSTM) [9], which is among the most used methods for dealing with time-series.

Both CNN and LSTM are Deep Learning methods, the first of which is equipped with convolutional filters and the second with recurrent operations, but in both cases with parameters that are learned through gradient-descent-like methods in scenarios where data are used for training as they become available. In contrast, ARIMA and ES are traditional regression methods that consider a full set of training data at once, thus having the potential of better approximating such a training set, but losing the ability to adjust to newly available data as CNN and LSTM can. The proposed CNN–CT method exploits both the potential of incorporating newly available data and the strength of looking at a complete set at once, which results in an enriched forecast method.

We chose to use CNNs, given that the signal processing literature states that convolutional filters are more stable than recurrent operations like LSTM [10]. Moreover, the superior performance of CNNs over traditional methods, like ARIMA, has been confirmed by previous work focused on text classification [11] and sequence modeling [12], where convolutions obtained higher performance with respect to other methods.

The rest of this paper is organized as follows. In Section 2, we discuss works related to the forecast of confirmed cases of COVID-19. In Section 3, we show the proposed forecasting method for daily confirmed cases of COVID-19, highlighting the application of Deep Learning, ARIMA, and ES methods. In Section 4, we present details about the data and tools used to validate our method. Finally, Sections 5 and 6 present results and conclusions of this work.

#### **2. Related Works**

COVID-19 is a disease with a high rate of spread, which has led to an interest in estimating and forecasting the number of infected people. Recently, several works have been presented with traditional epidemiological models or Dynamic Models. The Susceptible, Exposed, Infectious, Recovered (SEIR) model [13] was used to forecast confirmed cases in the United Kingdom, and the SIR and SEIR models were applied to forecast cumulative infected and recovered cases in Santiago de Cuba [14]. The Susceptible, Exposed, Infectious, Recovered, Dead (SEIRD) model [15] was used to forecast confirmed and death cases in Mexico. In Chen [16], a comparative study was conducted to predict 11 days of confirmed cases in some regions of Canada and the United States using SIR, Neural Network, and ARIMA models.

ARIMA and ES have been used as adjusting methods to improve the results obtained by other models, such as SIR models, Neural Networks, and Support Vector Regression algorithms [2,17]. However, in most cases, the number of days forecast is too short. For instance, the authors of [18] used ARMA to forecast confirmed cases for three days in Chinese provinces, Asian countries, and a few Western countries (Germany, US, Italy, and Spain). Parvez et al. compared an Adaptive Neuro-fuzzy Inference System versus ARIMA to predict ten days of COVID-19 confirmed cases in Bangladesh [19]. Furthermore, Petropoulos et al. [20] used the ES method known as Holt–Winters to forecast ten days of globally accumulated COVID-19 confirmed cases. Hussain et al. [21] used ES to estimate twelve days of confirmed cases and the *R*<sub>0</sub> parameter, known as the basic reproduction number.

ARIMA and Deep Learning methods have also been used alone to forecast COVID-19 cases. Chimmula [22] used LSTM to predict daily cases, obtaining a MAPE of eight percent. In Chandraa [23], LSTM, BiLSTM, and EDLSTM were used to forecast the spread of COVID-19 infections among selected states in India. The work presented by Zeroul et al. [24] used deep learning to predict the number of infected people over 10 days, obtaining a MAPE between 1.28% and 59%. Saba et al. [25] compared polynomial regression, Holt–Winters, ARIMA, and SARIMA models to predict confirmed and death cases. Parbat et al. [26] proposed using an SVR-Radial model to forecast total deaths and recovered, daily confirmed cumulative, and confirmed daily deaths in India; this method obtained a MAPE of around thirteen percent for the entire country.

Moreover, classical forecast methods have been combined with Machine Learning techniques [2,17,27]. Katris [27] used ARIMA, ES, Neural Network, and MARS models, where the combined methods performed better than the individual methods.

In general, ARIMA and ES methods are used to forecast cases with short-term periods, while Machine Learning and Deep Learning models are able to predict cases over more extended periods. However, the latter do not always obtain good results when used as individual methods.

#### **3. CNN–CT Method**

We show the proposed CNN–CT method in Figure 1, where a Convolutional Neural Network is used as primary forecasting method for daily confirmed cases of COVID-19, and it is complemented by ARIMA or ES, which are used as adjusting methods against daily errors.

**Figure 1.** Proposed Convolutional Neural Network (CNN) and temporal Component Transformation (CT) (CNN–CT) method. Training has two phases: the first phase corresponds to the forecast method using component values, and the second phase uses residual values with a residual forecast method.

The training stage of our method is composed of two phases, each of which is formed by three internal sub-processes, plus one global integration sub-process, as shown in Figure 1.

In the first sub-process of phase 1, we start by transforming daily values *y*<sub>*t*</sub> into weekly components *w*<sub>*τ*</sub>, where *t* is a day index and *τ* is a component index. These *w*<sub>*τ*</sub> components represent weekly average values. In the second sub-process, a CNN is used to forecast the component *ŵ*<sub>*τ*</sub>. Finally, in the third sub-process, we convert the component estimation *ŵ*<sub>*τ*</sub> back into daily estimations *ŷ*<sub>*τ*,*t*</sub>.

In phase 2, the adjusting methods are trained. First, we obtain the residual *e*<sub>*t*</sub> from the difference between the daily prediction and its corresponding ground truth value, i.e., *ŷ*<sub>*τ*,*t*</sub> − *y*<sub>*t*</sub>. We scale these residual values to the range [1, 10], as required by the Holt–Winters methods, which gives the normalized residuals *ε*<sub>*t*</sub>.

In the second sub-process of phase 2, we use the residuals *ε*<sub>*t*</sub> to train an autoregressive model using either ARIMA or ES, which is used to forecast residual values *ε̂*<sub>*t*</sub> (concretely, *ε̂*<sub>*t*,*es*</sub> and *ε̂*<sub>*t*,*arima*</sub> for ES and ARIMA, respectively).

Later, in the third sub-process of phase 2, the residual forecasts *e*<sub>*t*,*es*</sub> or *e*<sub>*t*,*arima*</sub> are obtained by mapping the previously computed residual forecast values back to the non-normalized domain. Finally, this residual forecast *e*<sub>*t*,*X*</sub> is added to the daily estimation *ŷ*<sub>*τ*,*t*</sub> obtained from the CNN, resulting in the final prediction value *F*′<sub>*t*</sub>.

#### *3.1. Data Transformation*

The error of prediction models increases as the number of forecasting periods increases. To forecast over a longer horizon, we transform daily records into weekly components with the CT module, which maps the daily cases *y*<sub>*t*</sub> into components *w*<sub>*τ*</sub> that represent the average of the daily cases obtained within a week. The values *w*<sub>*τ*</sub> are calculated with Equation (1).

$$w\_{\tau} = \frac{\sum\_{t=7\tau-6}^{7\tau} y\_t}{7},\tag{1}$$

where *w*<sub>*τ*</sub> is the weekly average of week *τ* and *w*<sub>1</sub>, *w*<sub>2</sub>, ..., *w*<sub>*τ*</sub> is the set of observations transformed into components. For instance, *w*<sub>2</sub> = (*y*<sub>8</sub> + *y*<sub>9</sub> + ... + *y*<sub>14</sub>)/7.
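As a minimal sketch of Equation (1), the following Python function averages consecutive seven-day blocks of a daily series. The function name `daily_to_weekly` and the choice to drop an incomplete trailing week are illustrative assumptions, not part of the original method description.

```python
from typing import List, Sequence

DAYS_PER_WEEK = 7


def daily_to_weekly(daily: Sequence[float]) -> List[float]:
    """Average consecutive 7-day blocks of daily cases, following Equation (1)."""
    # Assumption: an incomplete trailing week is dropped.
    n_weeks = len(daily) // DAYS_PER_WEEK
    return [
        sum(daily[DAYS_PER_WEEK * k: DAYS_PER_WEEK * (k + 1)]) / DAYS_PER_WEEK
        for k in range(n_weeks)
    ]


# Synthetic data: y_1, ..., y_14 = 1, ..., 14
y = list(range(1, 15))
print(daily_to_weekly(y))  # [4.0, 11.0]
```

Here the second value is (8 + 9 + ... + 14)/7 = 11, which matches the example given for *w*<sub>2</sub>.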

#### *3.2. CNN Forecast Component*

We used a CNN as a component forecasting method. The training and validation sets are composed of *w*<sub>*τ*</sub> values. The CNN architecture contains an input layer with 50 convolutional neurons, a max-pooling layer of size 2, a fully connected MLP layer of 50 neurons, and one output layer with a single neuron. The convolutional layers use the ReLU activation function. The training configuration parameters are as follows: Adam optimizer [28], mean absolute error as loss function, 100 epochs, and batch size equal to 10. The above configuration is used to forecast weekly components *ŵ*<sub>*τ*</sub>.

#### *3.3. Daily Estimations*

The reverse transformation, or daily estimation, converts the weekly components *w*<sub>*τ*</sub> back into daily values. For this, it is necessary to calculate the subcomponents of a component, which we define as shown in Table 1.

#### **Table 1.** Component segmentation into subcomponents.


The segmentation of the week into two subcomponents provides insights about the social behavior of countries separately into beginning and end of a week. The distribution of the daily cases with respect to their subcomponents can be obtained by Equations (2) and (3).

$$
\delta\_{\tau,1} = \frac{\sum\_{t=1}^{4} y\_{\tau,t}}{4},
\tag{2}
$$

$$
\delta\_{\tau,2} = \frac{\sum\_{t=5}^{7} y\_{\tau,t}}{3},
\tag{3}
$$

where *δ*<sub>*τ*,1</sub> and *δ*<sub>*τ*,2</sub> are the subcomponents ADS-1 (Monday to Thursday) and ADS-2 (Friday to Sunday) of the component *τ*. The daily ratio *μ*<sub>*τ*,*t*</sub> represents the proportion of each original daily value with respect to its subcomponent (1 or 2) of the component *τ* (Equation (4)). The daily ratio *μ*<sub>*τ*,*t*</sub> lets us determine the weekday normalized cases *x*<sub>*t*</sub> (Equation (5)) of the training phase. In other words, *x*<sub>1</sub> = *mondays*<sub>*avg*</sub>, ..., *x*<sub>7</sub> = *sundays*<sub>*avg*</sub> are the average confirmed cases of each day of the week throughout the time series.

$$\mu\_{\tau,t} = \begin{cases} \frac{y\_{\tau,t}}{\delta\_{\tau,1}}, & \text{if } 1 \le t \le 4, \\ \frac{y\_{\tau,t}}{\delta\_{\tau,2}}, & \text{if } 5 \le t \le 7, \end{cases} \tag{4}$$

$$x\_{t} = \frac{\sum\_{i=1}^{\tau} \mu\_{i,t}}{\tau}. \tag{5}$$

Weighting the daily cases with the ratio *μ*<sub>*τ*,*t*</sub> provides a statistical estimate of the share of infected persons in the first and second subcomponents (*τ*, *j*) of each component *τ* throughout the training period. The inverse transformation determines the daily cases predicted from the components using Equations (6) and (7).

$$
\hat{\delta}\_{\tau,i} = \hat{w}\_{\tau} \frac{w\_{\tau}}{\delta\_{\tau,i}},
\tag{6}
$$

$$\hat{y}\_{\tau,t} = \begin{cases} x\_t \hat{\delta}\_{\tau,1}, & \text{if } 1 \le t \le 4, \\ x\_t \hat{\delta}\_{\tau,2}, & \text{if } 5 \le t \le 7, \end{cases} \tag{7}$$

where *ŷ*<sub>*τ*,*t*</sub> represents the forecast number of cases of the component *τ* at time *t*, and *δ̂*<sub>*τ*,*i*</sub> is the forecast of the average number of infected persons in subcomponent *i* of the component *τ*. The data for training the adjustment methods are obtained from the daily prediction values *ŷ*<sub>*τ*,*t*</sub> of the validation phase.
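The steps in Equations (2)–(5) and (7) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: every week is given as seven values ordered Monday to Sunday, the function names are hypothetical, and the subcomponent forecasts *δ̂*<sub>*τ*,1</sub> and *δ̂*<sub>*τ*,2</sub> are passed in as inputs rather than computed with Equation (6).

```python
from typing import List, Sequence, Tuple


def subcomponents(week: Sequence[float]) -> Tuple[float, float]:
    """Equations (2) and (3): mean of days 1-4 and mean of days 5-7."""
    if len(week) != 7:
        raise ValueError("a component must contain exactly 7 daily values")
    return sum(week[:4]) / 4, sum(week[4:]) / 3


def daily_ratios(week: Sequence[float]) -> List[float]:
    """Equation (4): each day divided by the mean of its subcomponent.

    Assumes both subcomponent means are non-zero.
    """
    d1, d2 = subcomponents(week)
    return [y / d1 for y in week[:4]] + [y / d2 for y in week[4:]]


def weekday_profile(weeks: Sequence[Sequence[float]]) -> List[float]:
    """Equation (5): average ratio of each weekday over all training weeks."""
    ratios = [daily_ratios(week) for week in weeks]
    return [sum(r[t] for r in ratios) / len(ratios) for t in range(7)]


def to_daily(x: Sequence[float], delta1_hat: float, delta2_hat: float) -> List[float]:
    """Equation (7): weekday ratio times the forecast subcomponent."""
    return [x[t] * (delta1_hat if t < 4 else delta2_hat) for t in range(7)]


# Synthetic training weeks (Monday..Sunday); the second week doubles the first.
train_weeks = [[10, 20, 30, 40, 50, 60, 70], [20, 40, 60, 80, 100, 120, 140]]
x = weekday_profile(train_weeks)
y_hat = to_daily(x, delta1_hat=100.0, delta2_hat=240.0)
print([round(v, 6) for v in y_hat])
# [40.0, 80.0, 120.0, 160.0, 200.0, 240.0, 280.0]
```

Because both synthetic weeks share the same within-week shape, the weekday profile reproduces that shape exactly when it is rescaled by the forecast subcomponents.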

#### *3.4. Residual Transformation*

A residual value is given by the difference between the ground truth and the predicted value, as shown in Equation (8).

$$e\_t = y\_t - \hat{y}\_t = y\_t - y\_{t-1} \tag{8}$$

where *y*<sub>*t*</sub> is the ground truth at time *t* and *ŷ*<sub>*t*</sub> is the forecast value at time *t*. Using Equation (8), the residuals *e*<sub>*t*</sub> are obtained by subtraction of *y*<sub>*t*</sub> and *ŷ*<sub>*τ*,*t*</sub>, as shown in Equation (9).

$$e\_t = \hat{y}\_{\tau,t} - y\_t,\tag{9}$$

where *ŷ*<sub>*τ*,*t*</sub> is the forecast value at time *t* of component *τ*. The ARIMA and ES methods require positive numbers; therefore, the residuals were normalized as shown in Equation (10).

$$\varepsilon\_{t} = |\hat{y}\_{\tau,t} - y\_{t}|,\tag{10}$$

where |.| represents the normalization of *e*<sub>*t*</sub> to the range of values [1, 10].
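The text does not state which normalization maps the residuals into [1, 10]. One plausible choice, shown below purely as an assumption, is min–max scaling, which is invertible and therefore also allows forecast residuals to be mapped back to the non-normalized domain. The function names `scale_residuals` and `unscale_residuals` are hypothetical.

```python
from typing import List, Sequence, Tuple


def scale_residuals(
    residuals: Sequence[float], low: float = 1.0, high: float = 10.0
) -> Tuple[List[float], Tuple[float, float]]:
    """Min-max scale residuals into [low, high] (assumed normalization).

    Returns the scaled values and the (min, max) of the input, which are
    needed to undo the scaling later.
    """
    r_min, r_max = min(residuals), max(residuals)
    span = r_max - r_min
    if span == 0:
        # Degenerate case: all residuals equal; map everything to `low`.
        return [low for _ in residuals], (r_min, r_max)
    scaled = [low + (r - r_min) * (high - low) / span for r in residuals]
    return scaled, (r_min, r_max)


def unscale_residuals(
    scaled: Sequence[float],
    bounds: Tuple[float, float],
    low: float = 1.0,
    high: float = 10.0,
) -> List[float]:
    """Invert scale_residuals using the stored (min, max) bounds."""
    r_min, r_max = bounds
    return [r_min + (s - low) * (r_max - r_min) / (high - low) for s in scaled]


e = [-5.0, 0.0, 5.0]  # synthetic raw residuals
eps, bounds = scale_residuals(e)
print(eps)  # [1.0, 5.5, 10.0]
print([round(v, 6) for v in unscale_residuals(eps, bounds)])  # [-5.0, 0.0, 5.0]
```

All scaled values are strictly positive, which is what the multiplicative Holt–Winters variants need, and the stored bounds allow the residual forecasts to be returned to their original scale.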

#### *3.5. Residual Forecast*

We used the ARIMA and ES forecasting methods as adjusting methods for the forecasts. The training and validation sets are composed of *ε*<sub>*t*</sub> values.

The configuration of the ARIMA method is as follows: start\_p = 0, d = 0, start\_q = 0, max\_p = 5, max\_q = 5, max\_d = 5, start\_Q = 0, max\_P = 5, max\_D = 5, max\_Q = 5, m = 4, seasonal = True, error\_action = 'warn', trace = True, suppress\_warnings = True, stepwise = True, random\_state = 20, n\_fits = 50, information\_criterion = 'aic', and alpha = 0.05.

Furthermore, ES was configured with the Holt–Winters (HW) method. The variants of HW used are additive, multiplicative, additive damped, and multiplicative damped. These variants were trained with the normalized residuals *ε*<sub>*t*</sub>.

#### *3.6. Residual Estimations*

We use the residual transformations *ε*<sub>*t*</sub> to train ARIMA and ES, from which we obtained four hybrid methods: CNN–ARIMA, CNN–ES, LSTM–ARIMA, and LSTM–ES. The forecasts *ε̂*<sub>*t*,*es*</sub> and *ε̂*<sub>*t*,*arima*</sub> from these hybrid methods are transformed into residuals *e*<sub>*t*,*es*</sub> and *e*<sub>*t*,*arima*</sub>, which are in the non-normalized domain.

#### *3.7. Forecasting*

Finally, we evaluated the forecast values of the validation phase *F*′<sub>*t*</sub>, which are composed of the daily forecasts *ŷ*<sub>*τ*,*t*</sub> of the CNN and the adjustment forecasts *e*<sub>*t*,*best*</sub>, as shown in Equation (11).

$$F'\_t = \hat{y}\_{\tau,t} + e\_{t,best}.\tag{11}$$

#### **4. Experimental Setup**

The source of the data, the pre-processing applied, the data separation criterion in training, validation, and testing are described below. Finally, the evaluation metrics are described.

#### *4.1. Data*

The COVID-19 database used in this work is the Novel Coronavirus 2019 dataset [8], whose records report the number of infected, recovered, and deceased people in each country of the world. From this database, we used a time series starting from 22 January 2020, and that is called Time\_Series\_Covid\_19\_confirmed. We selected the records corresponding to the US, Mexico, Brazil, and Colombia.

We used data records from 2 March 2020 until 28 June 2020 for training (17 weeks); from 29 June 2020 to 19 July 2020 for validation (3 weeks); and from 20 July 2020 to 9 August 2020 for test (3 weeks). Figure 2 shows a scheme for this split of data.
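As a quick check of this split, the following sketch encodes the three date ranges and counts the whole weeks in each. The dictionary name `SPLITS` and the helper `n_weeks` are illustrative, not taken from the authors' code.

```python
from datetime import date

# Date ranges (inclusive) as stated in the text.
SPLITS = {
    "training": (date(2020, 3, 2), date(2020, 6, 28)),
    "validation": (date(2020, 6, 29), date(2020, 7, 19)),
    "test": (date(2020, 7, 20), date(2020, 8, 9)),
}


def n_weeks(start: date, end: date) -> int:
    """Number of whole weeks in the inclusive range [start, end]."""
    days = (end - start).days + 1
    if days % 7 != 0:
        raise ValueError("range does not contain a whole number of weeks")
    return days // 7


for name, (start, end) in SPLITS.items():
    print(name, n_weeks(start, end))
# training 17
# validation 3
# test 3
```

Each range starts on a Monday and ends on a Sunday, so every weekly component covers a full Monday-to-Sunday week.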

With this split, the training of the CNN–CT method for the US was carried out with 17 weekly components *wτ*, as explained in Section 3.1. In the case of Mexico, Brazil, and Colombia, we used only 15 weekly components since the data corresponding to the first week were discarded due to the lack of significant information; that is, the values of the first week were considerably low with respect to the rest of the series. We noticed that processing this first week results in underestimation of the forecast values.

**Figure 2.** Split of the observations in training and validation set by CNN method.

Although training is conducted using weekly components *w*<sub>*τ*</sub>, the forecast for the validation and test stages happens in daily values *ŷ*<sub>*τ*,*t*</sub>, as explained in Section 3.3.

Residual forecasts allow the daily forecasts to be adjusted with ARIMA and ES. The adjusting methods are trained with the residuals of the daily validation forecasts: the *ŵ*<sub>*τ*</sub> forecasts obtained in the validation phase were transformed into daily estimations *ŷ*<sub>*τ*,*t*</sub> to be used in the training and validation phases of the adjustment methods. Figure 3 shows a scheme for this split of data for the adjusting methods.

**Figure 3.** Split of the observations in training and validation set by adjusting methods.

Given that the problem we address corresponds to a scenario of auto-regression, the actual structure of the data is such that each output variable *y*<sub>*t*</sub> depends on a vector of past values **x** = [*y*<sub>*t*−1</sub>, *y*<sub>*t*−2</sub>, ..., *y*<sub>*t*−*T*</sub>]. For this work, we used lags of up to three past values, *t* − 3, *t* − 2, and *t* − 1.
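The lagged structure described above can be illustrated with a small helper that turns a series into input vectors and targets. The name `make_lagged` and the ordering of the lags (most recent first) are assumptions made for this sketch.

```python
from typing import List, Sequence, Tuple


def make_lagged(
    series: Sequence[float], n_lags: int = 3
) -> Tuple[List[List[float]], List[float]]:
    """Build (inputs, targets) pairs where each target y_t is paired with
    the vector [y_{t-1}, y_{t-2}, ..., y_{t-n_lags}]."""
    inputs: List[List[float]] = []
    targets: List[float] = []
    for t in range(n_lags, len(series)):
        inputs.append([series[t - k] for k in range(1, n_lags + 1)])
        targets.append(series[t])
    return inputs, targets


X, y = make_lagged([1, 2, 3, 4, 5], n_lags=3)
print(X)  # [[3, 2, 1], [4, 3, 2]]
print(y)  # [4, 5]
```

A series of length *n* yields *n* − 3 training samples with three lags, which is worth keeping in mind when only a few weekly components are available.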

#### *4.2. Metrics*

The proposed hybridized CNN–CT method and its individual composing methods are evaluated by the MAPE [29], as it has been widely used in the works discussed in Section 2. The MAPE measures the average absolute error of the predicted values as a percentage of the ground truth. The closer to zero, the more accurate the forecast is. Another common metric is RMSPE [4], which is also used in part of this paper.

$$MAPE = \frac{100}{n} \sum\_{t=1}^{n} \frac{|y\_t - \hat{y}\_t|}{y\_t},\tag{12}$$

$$RMSPE = \sqrt{\frac{\sum\_{t=1}^{n} (y\_t - \hat{y}\_t)^2}{n}} \ast 100,\tag{13}$$

where *y*<sub>*t*</sub> is the ground truth, *ŷ*<sub>*t*</sub> is the predicted value, and *n* indicates the total number of samples.
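For illustration, Equation (12) can be written directly in Python as follows. The function name `mape` is an assumption, and the sketch assumes that all ground-truth values are non-zero and positive, as is the case for daily confirmed case counts in the studied period.

```python
from typing import Sequence


def mape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute percentage error, following Equation (12)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    # Assumes every ground-truth value is non-zero.
    return 100.0 / n * sum(abs(y - p) / y for y, p in zip(y_true, y_pred))


# Synthetic example: both forecasts are off by 10% of the true value.
print(round(mape([100.0, 200.0], [110.0, 180.0]), 2))  # 10.0
```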

#### *4.3. Tools*

This work was developed on a computer with a macOS operating system, 8 GB of RAM, and a 2.3 GHz Dual-Core Intel Core i5 processor. We used Python 3.7.1, and the CNN model was built using the Tensorflow and Keras libraries [30].

#### **5. Results**

This section shows the results of the CNN–CT method proposed for daily forecasting cases of COVID-19 in the US, Mexico, Brazil, and Colombia. First, we compare the performance of using CNN and LSTM as the main forecasting methods with ARIMA and ES (Holt-Winter, HW) as adjusting methods. Then, we present the comparison of the CNN–CT model versus the individual CNN, LSTM, ARIMA, and Holt-Winters models for each country.

We can see in Figure 4 the comparison of the best-performing forecast models for the United States, Mexico, Brazil, and Colombia. In the US (Figure 4a), the forecasts of LSTM–ARIMA manage to maintain the trend and seasonality patterns with respect to the ground truth. However, the CNN–HW forecast is well below the actual data. We can see in Table 2 that LSTM–ARIMA achieves the lowest MAPE for the US.

**Figure 4.** Daily forecast with CNN–CT method using CNN and Long Short-Term Memory (LSTM) as main forecast methods.

Likewise, Figure 4b shows the behavior of the forecasts for daily cases of COVID-19 in Mexico. We can see that all four models are able to maintain trend and seasonality patterns with respect to ground truth. However, LSTM–ARIMA shows a high error rate because of the difference with respect to the actual data. On the other hand, the forecast of CNN-HW is very close to the real data, which allows us to obtain a better performance with respect to the other methods. The average MAPE and its standard deviation are shown in Table 2, where we can see that CNN-HW achieves the best average performance among the four models.

Similarly, Figure 4c shows the comparative forecast for Brazil for all the models. We can see that LSTM–ARIMA manages to maintain seasonality patterns with respect to the ground truth, whereas CNN–HW follows both the trend and seasonality patterns. According to the average MAPE and its standard deviation shown in Table 2, CNN–HW has the best performance.

We can see in Figure 4d that LSTM–ARIMA manages to maintain seasonality patterns concerning the ground truth for Colombia. In the case of CNN–HW, it follows the trend and seasonality patterns concerning the ground truth. According to Table 2 CNN-ARIMA shows the best MAPE performance, as its curve is the closest to the ground truth.

In general, our experiments show that smoothing with ARIMA or ES helps obtain lower MAPE in the case of CNN. This is not the case with LSTM. Table 2 shows a summary of the MAPE and RMSPE daily forecasting values of the CNN–CT and LSTM–CT for the US, Mexico, Brazil, and Colombia. In the case of the US, the method with the best performance is LSTM–ARIMA, having a *MAPE* ≈ 14%. In the case of Mexico and Brazil, CNN–HW is better, with MAPE values of 14.18% and 29.30%, respectively. It is possible to see that LSTM–ARIMA and CNN–HW obtain better results in different countries. In Colombia, CNN–ARIMA obtains the best MAPE and RMSPE.

We averaged the MAPE of all the countries for each method in Table 2. We observed that CNN–CT methods have better performance than that of LSTM–CT. Furthermore, for each country, we determined the standard deviation of the error metrics. We noticed that CNN–CT has the lower deviation, which indicates that its best performance is consistent across countries.


**Table 2.** CNN–CT methods performance. Best MAPE results are marked in bold.

Finally, in Figure 5, we show a comparison of the MAPE for the CNN-HW model versus the individual CNN, LSTM, ARIMA, and Holt–Winters models for each country.

Although ARIMA obtained good performance for the US (11.18) and Mexico (16.31), first and third place, respectively, it provides high MAPE for Brazil (50.99) and Colombia (29.75), with the last and second-last places, respectively. Similarly, pure CNN is a good method for Mexico (14.04) and Colombia (14.96) but not so good for US (42.75) and Brazil (38.19).

In contrast, CNN–CT (CNN-HW) is consistently competitive for all cases, obtaining second place for US (15.53), Mexico (14.18, as good as the best-performing CNN alone), and Colombia (21.75), and first for Brazil (29.30).

**Figure 5.** Daily forecast with CNN–CT (using Holt–Winters (HW)) versus the individual methods CNN, LSTM, ARIMA, and HW.

We show the comparison of CNN–HW versus the four individual methods in Table 3. We can see that CNN–HW surpasses all of these individual methods for Brazil and is the second-best method for Colombia. For the case of Mexico, CNN–HW is below the best-performing method (CNN) only by 0.14 MAPE points. Furthermore, CNN–HW achieves competitive results for the US.

**Table 3.** The performance of the CNN–CT vs. individual methods. Best MAPE results are marked in bold.


#### **6. Conclusions**

This paper investigates the problem of forecasting confirmed daily cases of COVID-19 in Mexico, Brazil, Colombia, and the US. Given the limited number of data available at the time of conducting our experiments, several limitations of the prediction methods became evident. These limitations were even more obvious due to the presence of noise in the daily data, which might very well be a consequence of the restrictions on the flow of data imposed by the sanitary crisis related to COVID-19 worldwide.

In particular, most prediction methods decrease their accuracy as the periods for forecast become larger. To mitigate this issue, we proposed a component transformation that converts daily values into weekly components for correct prediction in those cases.

We present a hybrid forecasting method termed Convolutional Neural Network–Component Transformation (CNN–CT), which uses CNN or LSTM as the main prediction method and ES or ARIMA as adjusting methods for daily error correction. As a result, there are two variants of the proposed method: CNN–CT with Holt–Winters, and LSTM–CT with ARIMA.

We compared the prediction performance of the individual methods that compose the proposed CNN–CT using the MAPE metric. We noticed that CNN and LSTM are very good at learning the trend and seasonality of the time series; however, LSTM forecasts tend to generate increasing and decreasing trends, which causes the error to increase. Our experiments show that smoothing with ARIMA or ES helps obtain lower MAPE in the case of CNN. This is not the case with LSTM.

As future work, we propose applying this methodology to other popular forecasting methods such as SVR and Recurrent Neural Networks; measuring the performance quality in more countries; and applying powerful data cleaning as a preprocessing stage. Furthermore, it could be interesting to use different adjusting methods. Finally, we propose testing whether the proposed methodology is completely general, or determining which strategy applies in different forecast scenarios.

**Author Contributions:** Conceptualization L.J.H.-G., J.F.-S., and J.J.G.-B.; methodology L.J.H.-G., J.F.-S., E.R.-R., and J.J.G.-B.; investigation L.J.H.-G., J.F.-S., and J.J.G.-B.; Software L.J.H.-G., J.F.-S., and J.J.G.-B.; validation, J.F.-S., J.P.S.-H., and E.R.-R.; formal analysis J.F.-S., J.P.S.-H., and E.R.-R.; writing—original draft L.J.H.-G. and J.F.-S.; writing—review and editing, J.F.-S., J.J.G.-B., E.R.-R., and J.P.S.-H. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Acknowledgments:** The authors would like to acknowledge with appreciation and gratitude CONACYT, TecNM/Instituto Tecnológico de Ciudad Madero, and Asociación Mexicana de Cultura A.C. In addition, the authors acknowledge the support from Laboratorio Nacional de Tecnologías de la Información (LaNTI) for the access to the cluster.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Article* **Derivative-Free Multiobjective Trust Region Descent Method Using Radial Basis Function Surrogate Models**

**Manuel Berkemeier 1,\* and Sebastian Peitz <sup>2</sup>**


**Abstract:** We present a local trust region descent algorithm for unconstrained and convexly constrained multiobjective optimization problems. It is targeted at heterogeneous and expensive problems, i.e., problems that have at least one objective function that is computationally expensive. Convergence to a Pareto critical point is proven. The method is derivative-free in the sense that derivative information need not be available for the expensive objectives. Instead, a multiobjective trust region approach is used that works similarly to its well-known scalar counterparts and complements multiobjective line-search algorithms. Local surrogate models constructed from evaluation data of the true objective functions are employed to compute possible descent directions. In contrast to existing multiobjective trust region algorithms, these surrogates are not polynomial but carefully constructed radial basis function networks. This has the important advantage that the number of data points needed per iteration scales linearly with the decision space dimension. The local models qualify as *fully linear* and the corresponding general scalar framework is adapted for problems with multiple objectives.

**Keywords:** multiobjective optimization; trust region methods; multiobjective descent; derivative-free optimization; radial basis functions; fully linear models

#### **1. Introduction**

Optimization problems arise in a multitude of applications in mathematics, computer science, engineering and the natural sciences. In many real-life scenarios, there are multiple, equally important objectives that need to be optimized. Such problems are then called *Multiobjective Optimization Problems* (MOP). In contrast to the single objective case, an MOP often does not have a single solution but an entire set of optimal trade-offs between the different objectives, which we call *Pareto optimal*. They constitute the *Pareto Set* and their image is the *Pareto Frontier*. The goal in the numerical treatment of an MOP is to either approximate these sets or to find single points within these sets. In applications, the problem can become more difficult when some of the objectives require computationally expensive or time consuming evaluations. For instance, the objectives could depend on a computer simulation or some other *black-box*. It is then of primary interest to reduce the overall number of function evaluations. Consequently, it can become infeasible to approximate derivative information of the true objectives using, e.g., finite differences. This holds true especially if higher order derivatives are required. In this work, optimization methods that do not use the true objective gradients (which nonetheless are assumed to exist) are referred to as *derivative-free*.

There is a variety of methods to deal with MOPs, some of which are also derivative-free or try to limit the number of expensive function evaluations. A broad overview of different problems and techniques concerning multiobjective optimization can be found, e.g., in [1–4]. One popular approach for calculating Pareto optimal solutions is scalarization, i.e., the transformation of an MOP into a single objective problem, cf. [5] for an overview. Alternatively, classical (single objective) descent algorithms can be adapted for the multiobjective case [6–11]. What is more, the structure of the Pareto Set can be exploited to find multiple solutions [12,13]. There are also methods for non-smooth problems [14,15] and multiobjective direct-search variants [16,17]. Both scalarization and descent techniques may be included in Evolutionary Algorithms (EA) [18–22]. To address computationally expensive objectives or missing derivative information, there are algorithms that use surrogate models (see the surveys [23–25]) or borrow ideas from scalar trust region methods, e.g., [26].

**Citation:** Berkemeier, M.; Peitz, S. Derivative-Free Multiobjective Trust Region Descent Method Using Radial Basis Function Surrogate Models. *Math. Comput. Appl.* **2021**, *26*, 31. https://doi.org/10.3390/mca26020031

Academic Editors: Marcela Quiroz, Juan Gabriel Ruiz, Luis Gerardo de la Fraga and Oliver Schütze

Received: 26 February 2021 Accepted: 7 April 2021 Published: 15 April 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

In single objective optimization, trust region methods are well suited for derivative-free optimization [27,28]. Our work is based on the recent development of multiobjective trust region methods:


Our contribution is the extension of the above-mentioned methods to general fully linear models (and in particular Radial Basis Function (RBF) surrogates as in [34]), which is related to the scalar framework in [35]. Most importantly, this reduces surrogate construction complexity, in terms of objective evaluations per iteration, to linear with respect to the number of decision variables, in contrast to the quadratically increasing number of function evaluations for methods using second degree polynomials. We further prove convergence to critical points when the problem is constrained to a convex and compact set, using an argumentation analogous to that in [36]. To this end, we extend the theory in [6] to provide new results concerning the continuity of the solutions of the projected steepest descent direction problem, which is based on the alternative formulation by Fliege and Svaiter [7]. We also show how to keep the convergence properties for constrained problems when the Pascoletti–Serafini scalarization is employed (as in [33]).

The remainder of the paper is structured as follows: Section 2 provides a brief introduction to multiobjective optimality and criticality concepts. In Section 3 the fundamentals of the algorithm are explained. In Section 4 we introduce fully linear surrogate models and describe the construction of suitable polynomial models and RBF models for unconstrained and box-constrained problems. We also formalize the main algorithm in this section. Section 5 deals with the descent step calculation so that a sufficient decrease is achieved in each iteration. Convergence is proven in Section 6 and a few numerical examples for unconstrained and finitely box-constrained problems are shown in Section 7. In Section 7 we also compare the RBF models against linear polynomial models that have the same linear construction complexity. We conclude with a brief discussion in Section 8.

#### **2. Optimality and Criticality in Multiobjective Optimization**

We consider the following (real-valued) multiobjective optimization problem:

$$\min\_{\mathbf{x}\in\mathcal{X}}\mathbf{f}(\mathbf{x}) := \min\_{\mathbf{x}\in\mathcal{X}} \begin{bmatrix} f\_1(\mathbf{x}) \\ \vdots \\ f\_k(\mathbf{x}) \end{bmatrix} \in \mathbb{R}^k,\tag{\text{MOP}}$$

with a feasible set $\mathcal{X} \subseteq \mathbb{R}^n$ and $k$ objective functions $f\_\ell\colon \mathbb{R}^n \to \mathbb{R}$, $\ell = 1, \ldots, k$. We further assume (MOP) to be *heterogeneous*. That is, there is a non-empty subset $I\_{\text{ex}} \subseteq \{1, \ldots, k\}$ of indices so that the gradients of $f\_\ell$, $\ell \in I\_{\text{ex}}$, are unknown and cannot be approximated, e.g., via finite differences. The (possibly empty) index set $I\_{\text{cheap}} = \{1, \ldots, k\} \setminus I\_{\text{ex}}$ indicates functions whose gradients are available.

Solutions for (MOP) consist of optimal trade-offs $\mathbf{x}^\* \in \mathcal{X}$ between the different objectives and are called non-dominated or Pareto optimal. That is, there is no $\mathbf{x} \in \mathcal{X}$ with $\mathbf{f}(\mathbf{x}) \prec \mathbf{f}(\mathbf{x}^\*)$ (i.e., $\mathbf{f}(\mathbf{x}) \le \mathbf{f}(\mathbf{x}^\*)$ and $f\_\ell(\mathbf{x}) < f\_\ell(\mathbf{x}^\*)$ for some index $\ell \in \{1, \ldots, k\}$). The subset $\mathcal{P}\_S \subseteq \mathcal{X}$ of non-dominated points is then called the *Pareto Set* and its image $\mathcal{P}\_F := \mathbf{f}(\mathcal{P}\_S) \subseteq \mathbb{R}^k$ is called the *Pareto Frontier*. All concepts can be defined in a local fashion in an analogous way.
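The dominance relation can be made concrete on a finite sample of objective vectors. The following minimal sketch filters the non-dominated entries of a list; the function names `dominates` and `non_dominated` and the sample values are illustrative assumptions and not part of the method described here.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if objective vector a dominates b, i.e., a <= b
    componentwise with strict inequality in at least one component."""
    no_worse = all(ai <= bi for ai, bi in zip(a, b))
    strictly_better = any(ai < bi for ai, bi in zip(a, b))
    return no_worse and strictly_better


def non_dominated(values: List[Sequence[float]]) -> List[int]:
    """Indices of the objective vectors that no other vector dominates."""
    return [
        i for i, v in enumerate(values)
        if not any(dominates(w, v) for j, w in enumerate(values) if j != i)
    ]


# Hypothetical sample: objective vectors (f1, f2) of five candidate points.
samples = [(1.0, 4.0), (2.0, 2.0), (3.0, 3.0), (4.0, 1.0), (2.0, 5.0)]
front = non_dominated(samples)
```

Such a filter only approximates the Pareto Frontier of the sample; it says nothing about points that were never evaluated.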

Similar to scalar optimization, there is a necessary condition for local optima using the gradients of the objective functions. We therefore implicitly assume all objective functions $f\_\ell$, $\ell = 1, \ldots, k$, to be continuously differentiable on $\mathcal{X}$. Moreover, the following assumption allows for an easier treatment of tangent cones in the constrained case:

**Assumption 1.** *Either the problem is unconstrained, i.e.,* $\mathcal{X} = \mathbb{R}^n$*, or the feasible set* $\mathcal{X} \subseteq \mathbb{R}^n$ *is compact and convex. All functions are defined on* $\mathcal{X}$*.*

The second case is a standard assumption in the MO literature for constrained problems [6,7]. Now let $\nabla f\_\ell(\mathbf{x})$ denote the gradient of $f\_\ell$ and $\mathbf{Df}(\mathbf{x}) \in \mathbb{R}^{k \times n}$ the Jacobian of $\mathbf{f}$ at $\mathbf{x} \in \mathcal{X}$.

**Definition 1.** *We call a vector* $\mathbf{d} \in \mathcal{X} - \mathbf{x}$ *a multi-descent direction for* $\mathbf{f}$ *in* $\mathbf{x}$ *if* $\langle \nabla f\_\ell(\mathbf{x}), \mathbf{d} \rangle < 0$ *for all* $\ell \in \{1, \ldots, k\}$, *or equivalently if*

$$\max\_{\ell=1,\ldots,k} \langle \nabla f\_{\ell}(\mathbf{x}), \mathbf{d} \rangle < 0 \tag{1}$$

*where* $\langle \bullet, \bullet \rangle$ *is the standard inner product on* $\mathbb{R}^n$ *and we consider* $\mathcal{X} - \mathbf{x} = \mathcal{X}$ *in the unconstrained case* $\mathcal{X} = \mathbb{R}^n$*.*

A point $\mathbf{x}^\* \in \mathcal{X}$ is called *critical* for (MOP) iff there is no descent direction $\mathbf{d} \in \mathcal{X} - \mathbf{x}^\*$ that satisfies (1) at $\mathbf{x}^\*$. As all Pareto optimal points are also critical (cf. [6,37] or [2] [Ch. 17]), it is viable to search for optimal points by calculating points from the superset $\mathcal{P}\_{\text{crit}} \supseteq \mathcal{P}\_S$ of critical points for (MOP). Similar to single objective optimization, using such a first order condition makes sense especially in combination with some global method or when exploring the structure of the critical set. We discuss promising approaches in Section 8. Note that, due to the above restrictions, our method is not a general replacement for other methods, e.g., scalarization approaches, but rather an additional tool for situations where those are not applicable.
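As a small illustration (a toy example added here, not taken from the numerical experiments), consider $n = 1$, $k = 2$ and $\mathcal{X} = \mathbb{R}$ with $f\_1(x) = x^2$ and $f\_2(x) = (x - 1)^2$. A direction $d$ is a multi-descent direction at $x$ iff

$$f\_1'(x)\, d = 2x\, d < 0 \quad \text{and} \quad f\_2'(x)\, d = 2(x - 1)\, d < 0.$$

For $x > 1$ the choice $d = -1$ works and for $x < 0$ the choice $d = 1$ works, so these points are not critical. For $x \in [0, 1]$ the two derivatives have opposite signs or one of them vanishes, hence no such $d$ exists and

$$\mathcal{P}\_{\text{crit}} = [0, 1],$$

which in this convex example coincides with the Pareto Set.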

One intuitive way to approach the critical set is by iteratively performing descent steps. Fliege and Svaiter [7] propose several ways to compute suitable descent directions. The minimizer $\mathbf{d}^\*$ of the following problem is known as the multiobjective steepest-descent direction:

$$\min\_{\mathbf{d}\in\mathcal{X}-\mathbf{x}} \max\_{\ell=1,\ldots,k} \langle \nabla f\_{\ell}(\mathbf{x}), \mathbf{d} \rangle \quad \text{s.t.} \quad ||\mathbf{d}|| \le 1. \tag{P1}$$

Problem (P1) has an equivalent reformulation as

$$\min\_{\mathbf{d}\in\mathcal{X}-\mathbf{x}}\beta\qquad\text{s.t.}\quad\|\mathbf{d}\|\leq1\quad\text{and}\quad\langle\nabla f\_{\ell}(\mathbf{x}),\mathbf{d}\rangle\leq\beta\,\forall\,\ell=1,\dots,k,\tag{\text{P2}}$$

which is a linear program, if $\mathcal{X}$ is defined by linear constraints and the maximum-norm $\|\bullet\| = \|\bullet\|\_\infty$ is used [7]. We thus stick with this choice because it facilitates implementation, but note that other choices are possible (see for example [33]).

Motivated by the next theorem we can use the optimal value of either problem as a measure of criticality, i.e., as a multiobjective counterpart of the gradient norm. As is standard in most multiobjective trust region works (cf. [29,30,33]), we flip the sign so that the values are non-negative.

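**Theorem 1.** *For* $\mathbf{x} \in \mathcal{X}$ *let* $\mathbf{d}^\*(\mathbf{x})$ *be the minimizer of* (P1) *and* $\omega(\mathbf{x})$ *be the negative optimal value, that is*

$$
\omega(\mathbf{x}) := -\max\_{\ell=1,\dots,k} \langle \nabla f\_\ell(\mathbf{x}), \mathbf{d}^\*(\mathbf{x}) \rangle.
$$

*Then the following statements hold:*

1. $\omega(\mathbf{x}) \ge 0$ *for all* $\mathbf{x} \in \mathcal{X}$*.*
2. *The function* $\omega$ *is continuous.*
3. *The following statements are equivalent:*
	- *(a) The point* $\mathbf{x} \in \mathcal{X}$ *is* not *critical.*
	- *(b)* $\omega(\mathbf{x}) > 0$*.*
	- *(c)* $\mathbf{d}^\*(\mathbf{x}) \ne \mathbf{0}$*.*

*Consequently, the point* $\mathbf{x}$ *is critical iff* $\omega(\mathbf{x}) = 0$*.*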

**Proof.** For the unconstrained case all statements are proven in [7] (Lemma 3).

The first and the third statement hold true for $\mathcal{X}$ convex and compact by definition. The continuity of $\omega$ can be shown similarly to [6], see Appendix A.1.
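For two objectives in the unconstrained case, the value $\omega(\mathbf{x})$ can be evaluated without a linear programming solver. The sketch below relies on a standard minimax duality argument that is our own addition and not part of the derivation above: with the maximum-norm, the optimal value of (P1) equals $-\min\_{\lambda \in [0,1]} \|\lambda \nabla f\_1(\mathbf{x}) + (1 - \lambda) \nabla f\_2(\mathbf{x})\|\_1$. The function name is hypothetical, and a general implementation would rather solve (P2) directly.

```python
from typing import Sequence


def omega_two_objectives(g1: Sequence[float], g2: Sequence[float]) -> float:
    """Criticality measure for k = 2, unconstrained case, max-norm on d.

    Assumption (dual form): omega = min over lam in [0, 1] of the 1-norm of
    lam*g1 + (1-lam)*g2. This function is convex and piecewise linear in lam,
    so a minimum is attained at an endpoint or at a breakpoint where one
    component of the combined gradient changes sign.
    """
    candidates = [0.0, 1.0]
    for a, b in zip(g1, g2):
        if a != b:
            lam = b / (b - a)  # root of lam*a + (1 - lam)*b
            if 0.0 < lam < 1.0:
                candidates.append(lam)

    def l1_norm(lam: float) -> float:
        return sum(abs(lam * a + (1.0 - lam) * b) for a, b in zip(g1, g2))

    return min(l1_norm(lam) for lam in candidates)


# Opposing gradients: no common descent direction, the point is critical.
omega_critical = omega_two_objectives((1.0, 0.0), (-1.0, 0.0))
# Gradients (1, 1) and (1, -1): d = (-1, 0) decreases both objectives.
omega_descent = omega_two_objectives((1.0, 1.0), (1.0, -1.0))
```

In line with Theorem 1, the first call yields zero and the second a positive value.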

With further conditions on $\mathbf{f}$ and $\mathcal{X}$ the criticality measure $\omega(\mathbf{x})$ is even Lipschitz continuous and consequently uniformly and Cauchy continuous:

**Theorem 2.** *If* $\nabla f\_\ell$, $\ell = 1, \ldots, k$, *are Lipschitz continuous and Assumption 1 holds, then the map* $\omega(\bullet)$ *as defined in Theorem 1 is uniformly continuous.*

**Proof.** The proof for $\mathcal{X} = \mathbb{R}^n$ is given by Thomann [38]. A proof for the constrained case can be found in Appendix A.1 so as not to clutter this introductory section.

Together with Theorem 1 this hints at *ω*(•) being a criticality measure as defined for scalar trust region methods in [36] ([Ch. 8]):

**Definition 2.** *We call* $\pi\colon \mathbb{N}\_0 \times \mathbb{R}^n \to \mathbb{R}$ *a criticality measure for* (MOP) *if* $\pi$ *is Cauchy continuous with respect to its second argument and if*

$$\lim\_{t \to \infty} \pi(t, \mathbf{x}^{(t)}) = 0$$

*implies that the sequence* $\left\{\mathbf{x}^{(t)}\right\}$ *asymptotically approaches a Pareto critical point.*

#### **3. Trust Region Ideas**

Multiobjective trust region algorithms closely follow the design of scalar approaches (see [36] for an extensive treatment) and provide an alternative to (approximate) line-search algorithms (e.g., [7]). Consequently, the requirements and convergence proofs in [29,30,33] for the unconstrained multiobjective case are fairly similar to those in [36]. We will reexamine the core concepts to provide a clear understanding and point out the similarities to the single objective case.

The main idea is to iteratively compute multi-descent steps $\mathbf{s}^{(t)}$ in every iteration $t \in \mathbb{N}\_0$. We could, for example, use the steepest descent direction given by (P1). This would require knowledge of the true objective gradients, which need not be available for objective functions with indices in $I\_{\text{ex}}$. Hence, well-behaved surrogate model functions

$$\mathbf{m}^{(t)}: \mathbb{R}^n \to \mathbb{R}^k, \ \mathbf{x} \mapsto \mathbf{m}^{(t)}(\mathbf{x}) = \left[m\_1^{(t)}(\mathbf{x}), \dots, m\_k^{(t)}(\mathbf{x})\right]^T$$

are employed (at least for the expensive objectives).

The surrogate models are constructed to be sufficiently accurate within a trust region

$$B^{(t)} := B\left(\mathbf{x}^{(t)}; \boldsymbol{\Delta}^{(t)}\right) = \left\{\mathbf{x} \in \mathcal{X} : \left\|\mathbf{x} - \mathbf{x}^{(t)}\right\| \le \boldsymbol{\Delta}^{(t)}\right\}, \quad \text{with } \left\|\boldsymbol{\bullet}\right\| = \left\|\boldsymbol{\bullet}\right\|\_{\infty} \tag{2}$$

around the current iterate $\mathbf{x}^{(t)}$. To be precise, the models are made fully linear as described in Section 4.1. This ensures that the model error and the model gradient error are uniformly bounded within the trust region.

The *model* steepest descent direction $\mathbf{d}\_{\mathbf{m}}^{(t)}$ can then be computed as the optimizer of the surrogate problem

$$\begin{aligned} \omega\_{\mathbf{m}}^{(t)} \left( \mathbf{x}^{(t)} \right) &:= - \min\_{\mathbf{d} \in \mathcal{X} - \mathbf{x}} \beta \\ \text{s.t. } ||\mathbf{d}|| &\le 1, \text{ and } \langle \nabla m\_{\ell}^{(t)}(\mathbf{x}), \mathbf{d} \rangle \le \beta \quad \forall \ell = 1, \dots, k. \end{aligned} \tag{Pm}$$

Now let $\sigma^{(t)} > 0$ be a step size. The direction $\mathbf{d}\_{\mathbf{m}}^{(t)}$ need not be a descent direction for the true objectives $\mathbf{f}$ and the trial point $\mathbf{x}\_+^{(t)} = \mathbf{x}^{(t)} + \sigma^{(t)} \mathbf{d}\_{\mathbf{m}}^{(t)}$ is only accepted if a measure $\rho^{(t)}$ of improvement and model quality surpasses a positive threshold $\nu\_{+}$. As in [30,33], we scalarize the multiobjective problems by defining

$$\Phi(\mathbf{x}) := \max\_{\ell=1,\ldots,k} f\_{\ell}(\mathbf{x}), \qquad \Phi\_{\mathbf{m}}^{(t)}(\mathbf{x}) := \max\_{\ell=1,\ldots,k} m\_{\ell}^{(t)}(\mathbf{x}).$$

Whenever $\Phi(\mathbf{x}^{(t)}) - \Phi(\mathbf{x}\_+^{(t)}) > 0$, there is a reduction in at least one objective function of $\mathbf{f}$ because of

$$0 < \Phi(\mathbf{x}^{(t)}) - \Phi(\mathbf{x}\_+^{(t)}) = f\_\ell(\mathbf{x}^{(t)}) - f\_q(\mathbf{x}\_+^{(t)}) \stackrel{\text{df.}}{\leq} f\_\ell(\mathbf{x}^{(t)}) - f\_\ell(\mathbf{x}\_+^{(t)}),$$

where we denote by $\ell$ the (not necessarily unique) maximizing index in $\Phi(\mathbf{x}^{(t)})$ and by $q$ the (likewise not necessarily unique) maximizing index in $\Phi(\mathbf{x}\_+^{(t)})$. (The abbreviation "df." above the inequality symbol stands for "(by) definition" and is used throughout this document when appropriate.) Of course, the same property holds for $\Phi\_{\mathbf{m}}^{(t)}(\bullet)$ and $\mathbf{m}^{(t)}$.

Thus, the step size $\sigma^{(t)} > 0$ is chosen so that the step $\mathbf{s}^{(t)} = \sigma^{(t)} \mathbf{d}\_{\mathbf{m}}^{(t)}$ satisfies both $\mathbf{x}^{(t)} + \mathbf{s}^{(t)} \in B^{(t)}$ and a "sufficient decrease condition" of the form

$$
\Phi\_{\mathbf{m}}^{(t)}(\mathbf{x}^{(t)}) - \Phi\_{\mathbf{m}}^{(t)}(\mathbf{x}^{(t)} + \mathbf{s}^{(t)}) \ge \kappa^{\text{sd}} \omega \left(\mathbf{x}^{(t)}\right) \min \left\{ \mathbb{C} \cdot \omega \left(\mathbf{x}^{(t)}\right), 1, \Delta^{(t)} \right\} \ge 0,
$$

with constants $\kappa^{\text{sd}} \in (0, 1)$ and $\mathbb{C} > 0$, see Section 5. Such a condition is also required in the scalar case [35,36] and essential for the convergence proof in Section 6, where we show $\lim\_{t \to \infty} \omega\left(\mathbf{x}^{(t)}\right) = 0$.
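The sufficient decrease condition translates into a one-line check. The following sketch assumes that the scalarized model values and the criticality value are already available; the function name and the default values for $\kappa^{\text{sd}}$ and the constant $\mathbb{C}$ (called `c` below) are placeholders chosen for illustration only.

```python
def sufficient_decrease(phi_m_x: float, phi_m_trial: float, omega: float,
                        delta: float, kappa_sd: float = 0.1,
                        c: float = 1.0) -> bool:
    """Check the model decrease against kappa_sd * omega * min{c*omega, 1, delta}."""
    required = kappa_sd * omega * min(c * omega, 1.0, delta)
    return phi_m_x - phi_m_trial >= required


# Required decrease: 0.1 * 0.5 * min{0.5, 1, 0.2} = 0.01.
passes = sufficient_decrease(1.0, 0.9, omega=0.5, delta=0.2)
fails = sufficient_decrease(1.0, 0.999, omega=0.5, delta=0.2)
```

Since the right-hand side is non-negative, a step that passes this check never increases the scalarized model value.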

Due to the decrease condition, the denominator in the ratio of actual versus predicted reduction

$$\boldsymbol{\rho}^{(t)} = \begin{cases} \frac{\boldsymbol{\Phi}(\mathbf{x}^{(t)}) - \boldsymbol{\Phi}(\mathbf{x}^{(t)}\_{+})}{\boldsymbol{\Phi}^{(t)}\_{\mathbf{m}}(\mathbf{x}^{(t)}) - \boldsymbol{\Phi}^{(t)}\_{\mathbf{m}}(\mathbf{x}^{(t)}\_{+})} & \text{if } \mathbf{x}^{(t)} \neq \mathbf{x}^{(t)}\_{+}, \\\\ 0 & \text{if } \mathbf{x}^{(t)} = \mathbf{x}^{(t)}\_{+} \Leftrightarrow \mathbf{s}^{(t)} = \mathbf{0}, \end{cases} \tag{3}$$

is non-negative. A positive $\rho^{(t)}$ implies a decrease in at least one objective $f\_\ell$, so we accept $\mathbf{x}\_+^{(t)}$ as the next iterate if $\rho^{(t)} > \nu\_{+} > 0$. If $\rho^{(t)}$ is sufficiently large, say $\rho^{(t)} \ge \nu\_{++} > \nu\_{+} > 0$, the next trust region might have a larger radius $\Delta^{(t+1)} \ge \Delta^{(t)}$. If in contrast $\rho^{(t)} < \nu\_{++}$, the next trust region radius should be smaller and the surrogates improved.

This encompasses the case $\mathbf{s}^{(t)} = \mathbf{0}$, when the iterate $\mathbf{x}^{(t)}$ is critical for

$$\min\_{\mathbf{x}\in\mathcal{B}^{(t)}}\mathbf{m}^{(t)}(\mathbf{x}) \in \mathbb{R}^k. \tag{\text{MOPm}}$$

Roughly speaking, we suppose that $\mathbf{x}^{(t)}$ is near a critical point for the original problem (MOP) if $\mathbf{m}^{(t)}$ is sufficiently accurate. If we truly are near a critical point, then the trust region radius will approach 0. For further details concerning the acceptance ratio $\rho^{(t)}$, see [33] (Section 2.2).

**Remark 1.** *We can modify* $\rho^{(t)}$ *in* (3) *to obtain a descent in all objectives, i.e., if* $\mathbf{x}^{(t)} \ne \mathbf{x}\_+^{(t)}$ *we test*

$$\rho\_\ell^{(t)} = \frac{f\_\ell(\mathbf{x}^{(t)}) - f\_\ell(\mathbf{x}\_+^{(t)})}{m\_\ell^{(t)}(\mathbf{x}^{(t)}) - m\_\ell^{(t)}(\mathbf{x}\_+^{(t)})} > \nu\_{+}$$

*for all* $\ell = 1, \ldots, k$*. This is the* strict *acceptance test.*
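To contrast the ratio (3) with the strict test of Remark 1, the following sketch computes both from given objective and model values. The function names are hypothetical. Two simplifications are our own assumptions: a vanishing predicted reduction stands in for the case $\mathbf{s}^{(t)} = \mathbf{0}$, and the strict test rejects any objective whose model predicts no decrease.

```python
from typing import Sequence


def acceptance_ratio(f_x: Sequence[float], f_trial: Sequence[float],
                     m_x: Sequence[float], m_trial: Sequence[float]) -> float:
    """Ratio (3) of actual to predicted reduction of the max-scalarization."""
    predicted = max(m_x) - max(m_trial)
    if predicted == 0.0:  # assumption: treated like the case s = 0
        return 0.0
    return (max(f_x) - max(f_trial)) / predicted


def strict_acceptance(f_x: Sequence[float], f_trial: Sequence[float],
                      m_x: Sequence[float], m_trial: Sequence[float],
                      nu_plus: float) -> bool:
    """Strict test: the ratio of every single objective must exceed nu_plus."""
    for fx, ft, mx, mt in zip(f_x, f_trial, m_x, m_trial):
        predicted = mx - mt
        if predicted <= 0.0 or (fx - ft) / predicted <= nu_plus:
            return False
    return True


# The second objective increases at the trial point although Phi decreases.
rho = acceptance_ratio((2.0, 1.0), (1.5, 1.05), (2.0, 1.0), (1.0, 0.8))
strict = strict_acceptance((2.0, 1.0), (1.5, 1.05), (2.0, 1.0), (1.0, 0.8), 0.1)
```

In this example the standard ratio is positive, so the trial point could be accepted, whereas the strict test rejects it.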

#### **4. Surrogate Models and the Final Algorithm**

Until now, we have not discussed the actual choice of surrogate models used for $\mathbf{m}^{(t)}$. As is shown in Section 5, the models should be twice continuously differentiable with uniformly bounded Hessians. To prove convergence of our algorithm, we have to impose further requirements on the (uniform) approximation qualities of the surrogates $\mathbf{m}^{(t)}$. We can meet these requirements using so-called fully linear models. Moreover, fully linear models intrinsically allow for modifications of the basic trust region method that are aimed at reducing the total number of expensive objective evaluations. Finally, we briefly recapitulate how radial basis functions and multivariate Lagrange polynomials can be made fully linear.

**Remark 2.** *Although the trust region framework is suitable for general convexly constrained compact sets, we will discuss the construction of fully linear polynomial and RBF models for unconstrained and box-constrained problems only.*

*In the constrained case, we treat the constraints as* unrelaxable*, that is, we do not allow for evaluations of the true objectives outside* $\mathcal{X}$*, see the definition of* $B^{(t)} \subseteq \mathcal{X}$ *in* (2)*. We also make sure to select training data only in* $\mathcal{X}$ *during the construction of surrogate models.*

*To the best of our knowledge there are no construction procedures for the above model types for general (unrelaxable) constraints. A discussion of how some model based algorithms deal with constraints can be found in* [28] *(Section 7). The issue is also addressed in* [27] *(Ch. 13). If the constraints are treated as relaxable, then techniques from* [39] *(Ch. 15) might be applicable such as merit functions or filter methods, but this is left for future research.*

#### *4.1. Fully Linear Models*

We start by reciting the abstract definition of full linearity as given in [27,35]:

**Definition 3.** *Let* $\Delta^{\text{ub}} > 0$ *be given and let* $f\colon \mathbb{R}^n \to \mathbb{R}$ *be a function that is continuously differentiable in an open domain containing* $\mathcal{X}$ *and has a Lipschitz continuous gradient on* $\mathcal{X}$*. A set of model functions* $\mathcal{M} = \{m\colon \mathbb{R}^n \to \mathbb{R}\} \subseteq C^1(\mathbb{R}^n, \mathbb{R})$ *is called a* fully linear *class of models w.r.t.* $f$ *if the following hold:*

1. *There are positive constants* $\epsilon$, $\dot{\epsilon}$ *and* $L\_m$ *such that for any given* $\Delta \in (0, \Delta^{\text{ub}})$ *and for any* $\mathbf{x} \in \mathcal{X}$ *there is a model function* $m \in \mathcal{M}$ *with Lipschitz continuous gradient and corresponding Lipschitz constant bounded by* $L\_m$ *such that*
	- *the error between the gradient of the model and the gradient of the function satisfies*

	$$||\nabla f(\xi) - \nabla m(\xi)|| \le \dot{\epsilon} \Delta, \quad \forall \xi \in B(\mathbf{x}; \Delta),$$

	- *the error between the model and the function satisfies*

	$$|f(\xi) - m(\xi)| \le \epsilon \Delta^2, \quad \forall \xi \in B(\mathbf{x}; \Delta).$$

2. *For this class* $\mathcal{M}$ *there exists a "model-improvement" algorithm that, in a finite, uniformly bounded (w.r.t.* $\mathbf{x}$ *and* $\Delta$*) number of steps, can*
	- *either establish that a given model* $m \in \mathcal{M}$ *is fully linear on* $B(\mathbf{x}; \Delta)$*, i.e., it satisfies the error bounds in 1,*
	- *or find a model* $\tilde{m}$ *that is fully linear on* $B(\mathbf{x}; \Delta)$*.*

**Remark 3.** *In the unconstrained case, the requirements in Definition 3 can be relaxed a bit, at least when using the strict acceptance test with* $\mathbf{f}(\mathbf{x}^{(T)}) \le \mathbf{f}(\mathbf{x}^{(t)})$ *for all* $T \ge t \ge 0$*. We can then restrict ourselves to the set*

$$\mathcal{X}' := \bigcup\_{\mathbf{x} \in L(\mathbf{x}^{(0)})} B\left(\mathbf{x}; \boldsymbol{\Delta}^{\mathrm{ub}}\right), \quad \text{where } L(\mathbf{x}^{(0)}) := \left\{ \mathbf{x} \in \mathbb{R}^n : \mathbf{f}(\mathbf{x}) \le \mathbf{f}(\mathbf{x}^{(0)}) \right\}.$$

For the convergence analysis in Section 6, we further cite [27] ([Lemma 10.25]). The lemma states that a fully linear model is also fully linear in enlarged regions if the error constants are chosen appropriately:

**Lemma 1.** *For* $\mathbf{x} \in \mathcal{X}$ *and* $\Delta \le \Delta^{\text{ub}}$ *consider a function* $f$ *and a fully linear model* $m$ *as in Definition 3 with constants* $\epsilon, \dot{\epsilon}, L\_m > 0$*. Let* $L\_f > 0$ *be a Lipschitz constant of* $\nabla f$*. Assume w.l.o.g. that*

$$L\_m + L\_f \le \dot{\epsilon} \quad \text{and} \quad \frac{\dot{\epsilon}}{2} \le \epsilon.$$

*Then* $m$ *is fully linear on* $B(\mathbf{x}; \tilde{\Delta})$ *for any* $\tilde{\Delta} \in [\Delta, \Delta^{\text{ub}}]$ *with respect to the same constants* $\epsilon, \dot{\epsilon}, L\_m$*.*

Finally, we generalize the definition to a *vector* of real valued functions.

**Definition 4.** *Let* $\Delta^{\text{ub}} > 0$ *be given and let* $\mathbf{f} = [f\_1, \ldots, f\_k]^T$ *be a vector of functions satisfying the requirements of Definition 3. Then* $\mathbf{m} = [m\_1, \ldots, m\_k]^T$*, with* $m\_\ell\colon \mathbb{R}^n \to \mathbb{R}$, $\ell \in \{1, \ldots, k\}$*, belongs to a collection of fully linear classes w.r.t.* $\mathbf{f}$ *if for each* $\ell$ *the function* $m\_\ell$ *belongs to a fully linear class w.r.t.* $f\_\ell$*, with error constants* $\epsilon$ *and* $\dot{\epsilon}$*.*

*The model-improvement algorithm of* $\mathbf{m}$ *consists in applying the individual improvement algorithms for all indices* $\ell \in \{1, \ldots, k\}$ *and* $\mathbf{m}$ *is deemed fully linear iff all* $m\_\ell$ *are fully linear with constants* $\epsilon$ *and* $\dot{\epsilon}$*.*

Definition 4 is stated in a way that allows for different model types for the different objectives. Most importantly, we can use $m\_\ell = f\_\ell$ and $\nabla m\_\ell = \nabla f\_\ell$ if the objective is cheap, i.e., $\ell \in I\_{\text{cheap}}$, and if $f\_\ell$ not only has Lipschitz gradients but also has a Hessian that is uniformly bounded in terms of its norm. The latter requirement is formalized in Assumption 3 and needed for the convergence analysis.

#### Algorithm Modifications

With Definitions 3 and 4 we have formalized our assumption that the surrogates become more accurate when we decrease the trust region radius. This motivates the following modifications to the basic procedure: 


• A trust region update that also takes into consideration $\omega\_{\mathbf{m}}^{(t)}\left(\mathbf{x}^{(t)}\right)$. The radius should be enlarged if we have a large acceptance ratio $\rho^{(t)}$ and $\Delta^{(t)}$ is small as measured against $\beta \omega\_{\mathbf{m}}^{(t)}\left(\mathbf{x}^{(t)}\right)$ for a constant $\beta > 0$.

These changes are implemented in Algorithm 2. For more detailed explanations we refer to [27] (Ch. 10).

**Algorithm 1:** Criticality Routine.

**Configuration:** A backtracking constant $\alpha \in (0, 1)$, $\mu > 0$ from Algorithm 2;

**Input:** Current trust region radius $\Delta^{(t)}$, current models $\mathbf{m}^{(t)}$;

**Output:** Fully linear models $\mathbf{m}^{(t)}$ and the (possibly shrunken) radius $\Delta^{(t)}$;

- Set $\Delta\_0 \leftarrow \Delta^{(t)}$;
- **for** $j = 1, 2, \ldots$ **do**
	- Set radius: $\Delta^{(t)} \leftarrow \alpha^{j-1} \Delta\_0$;
	- Make models $\mathbf{m}^{(t)}$ fully linear on $B^{(t)}$; /\* can change $\omega\_{\mathbf{m}}^{(t)}\left(\mathbf{x}^{(t)}\right)$ \*/
	- **if** $\Delta^{(t)} \le \mu\, \omega\_{\mathbf{m}}^{(t)}\left(\mathbf{x}^{(t)}\right)$ **then** break;
- **end**
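The backtracking loop of Algorithm 1 can be sketched as follows. The callback `make_fully_linear` is a hypothetical stand-in for the model-improvement procedure: it is assumed to rebuild the models on the trust region of the given radius and to return the resulting model criticality value. The iteration cap `max_iter` is a safeguard added here and is not part of Algorithm 1.

```python
from typing import Callable, Tuple


def criticality_routine(delta: float,
                        make_fully_linear: Callable[[float], float],
                        mu: float, alpha: float = 0.5,
                        max_iter: int = 50) -> Tuple[float, float]:
    """Shrink the radius until delta <= mu * omega_m holds.

    make_fully_linear(radius) is a hypothetical callback that makes the
    models fully linear on the trust region of that radius and returns the
    model criticality omega_m at the current iterate.
    """
    delta_0 = delta
    omega_m = 0.0
    for j in range(1, max_iter + 1):
        delta = alpha ** (j - 1) * delta_0      # radius alpha^(j-1) * Delta_0
        omega_m = make_fully_linear(delta)      # may change omega_m
        if delta <= mu * omega_m:
            break
    return delta, omega_m


# Stub callback: the model criticality stays at 0.1 for every radius.
radius, omega = criticality_routine(1.0, lambda r: 0.1, mu=1.0)
```

With the stub above, the radius is halved four times before the stopping test is met.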

From Algorithm 2 we see that we can classify the iterations based on $\rho^{(t)}$ as in Definition 5.

**Definition 5.** *For given constants* $0 \le \nu\_{+} \le \nu\_{++} < 1$, $\nu\_{++} \ne 0$, *we call the iteration with index* $t \in \mathbb{N}\_0$ *of Algorithm 2.*


**Algorithm 2:** General Trust Region Method (TRM) for (MOP).

**Configuration:** Criticality parameters $\varepsilon\_{\text{crit}} > 0$ and $\mu > \beta > 0$, acceptance parameters $1 > \nu\_{++} \ge \nu\_{+} \ge 0$, $\nu\_{++} \ne 0$, update factors $\gamma\_{\uparrow} \ge 1 > \gamma\_{\downarrow} \ge \gamma\_{\downdownarrows} > 0$ and $\Delta^{\text{ub}} > 0$;

**Input:** The initial site $\mathbf{x}^{(0)} \in \mathbb{R}^n$;

- **for** $t = 0, 1, \ldots$ **do**
	- **if** $t > 0$ *and iteration* $(t - 1)$ *was model-improving (cf. Definition 5)* **then**
		- Perform at least one improvement step on $\mathbf{m}^{(t-1)}$ and then let $\mathbf{m}^{(t)} \leftarrow \mathbf{m}^{(t-1)}$;
	- **else**
		- Construct surrogate models $\mathbf{m}^{(t)}$ on $B^{(t)}$;
	- **end**
	- /\* Criticality Step: \*/
	- **if** $\omega\_{\mathbf{m}}^{(t)}\left(\mathbf{x}^{(t)}\right) < \varepsilon\_{\text{crit}}$ **and** ($\mathbf{m}^{(t)}$ not fully linear **or** $\Delta^{(t)} > \mu\, \omega\_{\mathbf{m}}^{(t)}\left(\mathbf{x}^{(t)}\right)$) **then**
		- Set $\Delta\_\*^{(t)} \leftarrow \Delta^{(t)}$;
		- Call Algorithm 1 so that $\mathbf{m}^{(t)}$ is fully linear on $B^{(t)}$ with $\Delta^{(t)} \in \left(0, \mu\, \omega\_{\mathbf{m}}^{(t)}\left(\mathbf{x}^{(t)}\right)\right]$;
		- Then set $\Delta^{(t)} \leftarrow \min\left\{\max\left\{\Delta^{(t)}, \beta \omega\_{\mathbf{m}}^{(t)}\left(\mathbf{x}^{(t)}\right)\right\}, \Delta\_\*^{(t)}\right\}$;
	- **end**
	- Compute a suitable descent step $\mathbf{s}^{(t)}$;
	- Set $\mathbf{x}\_+^{(t)} \leftarrow \mathbf{x}^{(t)} + \mathbf{s}^{(t)}$, evaluate $\mathbf{f}(\mathbf{x}\_+^{(t)})$ and compute $\rho^{(t)}$ with (3);
	- Perform the following updates:
		- $\mathbf{x}^{(t+1)} \leftarrow \mathbf{x}^{(t)}$ if $\rho^{(t)} < \nu\_{+}$ or ($\nu\_{+} \le \rho^{(t)} < \nu\_{++}$ and $\mathbf{m}^{(t)}$ is **not** fully linear);
		- $\mathbf{x}^{(t+1)} \leftarrow \mathbf{x}\_+^{(t)}$ if $\rho^{(t)} \ge \nu\_{++}$ or ($\nu\_{+} \le \rho^{(t)} < \nu\_{++}$ and $\mathbf{m}^{(t)}$ is fully linear);
		- $\Delta^{(t+1)} \leftarrow \Delta\_{+}$, where
			- $\Delta\_{+} = \Delta^{(t)}$ if $\rho^{(t)} < \nu\_{++}$ and $\mathbf{m}^{(t)}$ is **not** fully linear,
			- $\Delta\_{+} \in \left[\gamma\_{\downdownarrows} \Delta^{(t)}, \gamma\_{\downarrow} \Delta^{(t)}\right]$ if $\rho^{(t)} < \nu\_{++}$ and $\mathbf{m}^{(t)}$ is fully linear,
			- $\Delta\_{+} \in \left[\Delta^{(t)}, \min\{\gamma\_{\uparrow} \Delta^{(t)}, \Delta^{\text{ub}}\}\right]$ if $\nu\_{++} \le \rho^{(t)}$ and $\Delta^{(t)} \ge \beta \omega\_{\mathbf{m}}^{(t)}\left(\mathbf{x}^{(t)}\right)$,
			- $\Delta\_{+} = \min\{\gamma\_{\uparrow} \Delta^{(t)}, \Delta^{\text{ub}}\}$ if $\nu\_{++} \le \rho^{(t)}$ and $\Delta^{(t)} < \beta \omega\_{\mathbf{m}}^{(t)}\left(\mathbf{x}^{(t)}\right)$.
- **end**
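The case distinction at the end of Algorithm 2 can be condensed into a small helper. The sketch below is one admissible instance under stated assumptions: wherever the algorithm allows an interval, it shrinks by $\gamma\_{\downarrow}$ and otherwise keeps the radius unless enlargement is mandatory. The function name, the parameter names and all default values are hypothetical, and the current radius is assumed not to exceed $\Delta^{\text{ub}}$.

```python
from typing import Tuple


def trust_region_update(rho: float, fully_linear: bool, delta: float,
                        omega_m: float, *, nu_plus: float = 0.1,
                        nu_pp: float = 0.4, gamma_up: float = 2.0,
                        gamma_down: float = 0.5, beta: float = 1.0,
                        delta_ub: float = 10.0) -> Tuple[bool, float]:
    """Return (accept_trial_point, next_radius) for one iteration."""
    # Accept on a large ratio, or on a moderate ratio with fully linear models.
    accept = rho >= nu_pp or (nu_plus <= rho and fully_linear)

    if rho < nu_pp:
        # Keep the radius while the models can still be improved; otherwise
        # shrink (here to the upper end gamma_down * delta of the interval).
        next_delta = gamma_down * delta if fully_linear else delta
    elif delta < beta * omega_m:
        # Radius is small compared to the criticality: enlargement is mandatory.
        next_delta = min(gamma_up * delta, delta_ub)
    else:
        # Any value in [delta, min(gamma_up*delta, delta_ub)] is allowed;
        # this sketch keeps the radius.
        next_delta = delta
    return accept, next_delta
```

For instance, `trust_region_update(0.2, False, 1.0, 0.5)` rejects the trial point and keeps the radius, which corresponds to a model-improving iteration.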

#### *4.2. Fully Linear Lagrange Polynomials*

Quadratic Taylor polynomial models are used very frequently. As explained in [27] we can alternatively use multivariate interpolating Lagrange polynomial models when derivative information is not available. We will consider first and second degree Lagrange models. Even though the latter require $\mathcal{O}(n^2)$ function evaluations they are still cheaper than second degree finite difference models. For this reason, these models are also used in [33,38].

To construct an interpolating polynomial model we have to provide $p$ data sites, where $p$ is the dimension of the space $\Pi\_n^d$ of real-valued $n$-variate polynomials of degree at most $d$. For $d = 1$ we have $p = n + 1$ and for $d = 2$ it is $p = \frac{(n+1)(n+2)}{2}$. If $n \ge 2$, the *Mairhuber–Curtis* theorem [40] applies and the data sites must form a so-called *poised* set in $\mathcal{X}$. The set $\Xi = \{\xi\_1, \ldots, \xi\_p\} \subset \mathbb{R}^n$ is poised if for any basis $\{\psi\_i\}\_i$ of $\Pi\_n^d$ the matrix $\mathbf{M}\_\psi := \left[\psi\_i(\xi\_j)\right]\_{1 \le i, j \le p}$ is non-singular. Then for any function $f\colon \mathbb{R}^n \to \mathbb{R}$ there is a unique interpolating polynomial $m(\mathbf{x}) = \sum\_{i=1}^{p} \lambda\_i \psi\_i(\mathbf{x})$ with $m(\xi\_j) = f(\xi\_j)$ for all $j = 1, \ldots, p$. Given a poised set $\Xi$, the associated Lagrange basis $\{l\_i\}\_i$ of $\Pi\_n^d$ is defined by $l\_i(\xi\_j) = \delta\_{i,j}$. The model coefficients then simply are the data values, i.e., $\lambda\_i = f(\xi\_i)$.
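The role of poisedness can be illustrated for the linear case $d = 1$. The following sketch counts the required data sites and fits a linear interpolant in the monomial basis $1, x\_1, \ldots, x\_n$ (not the Lagrange basis) by Gaussian elimination; a singular system signals that the sites are not poised. All function names and the pivot tolerance are illustrative assumptions.

```python
from math import comb
from typing import List, Sequence


def num_sites(n: int, d: int) -> int:
    """Dimension p of the space of n-variate polynomials of degree at most d."""
    return comb(n + d, d)


def solve(a: List[List[float]], b: List[float], tol: float = 1e-12) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < tol:
            raise ValueError("singular system: the data sites are not poised")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def linear_model(sites: Sequence[Sequence[float]],
                 values: Sequence[float]) -> List[float]:
    """Coefficients [c0, c1, ..., cn] of m(x) = c0 + c1*x1 + ... + cn*xn."""
    matrix = [[1.0] + list(site) for site in sites]  # row j: basis at site j
    return solve(matrix, list(values))


# Three poised sites in the plane and values of f(x) = 1 + 2*x1 - 3*x2.
coeffs = linear_model([(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)], [1.0, 3.0, -2.0])
```

Three collinear sites such as `(0, 0)`, `(1, 1)`, `(2, 2)` make the matrix singular, and `linear_model` raises a `ValueError` in that case.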

As in [38], we implement Algorithm 6.2 from [27] to ensure poisedness. It selects training sites $\Xi$ from the current (slightly enlarged) trust region of radius $\theta\_1 \Delta^{(t)}$, $\theta\_1 \ge 1$, and calculates the associated Lagrange basis. We can then separately evaluate the true objectives $f\_\ell$ on $\Xi$ to easily build the surrogates $m\_\ell^{(t)}$, $\ell \in \{1, \ldots, k\}$. Our implementation always includes $\xi\_1 = \mathbf{x}^{(t)}$ and tries to select points from a database of prior evaluations first.

We employ an additional algorithm (Algorithm 6.3 in [27]) to ensure that the set $\Xi$ is even $\Lambda$*-poised*, see [27] ([Definition 3.6]). The procedure is still finite and ensures the models are actually *fully linear*. The quality of the surrogate models can be improved by choosing a small algorithm parameter $\Lambda > 1$. Our implementation tries again to recycle points from a database. In contrast to before, interpolation at $\mathbf{x}^{(t)}$ can no longer be guaranteed. This second step can also be omitted at first and then used as a model-improvement step in a subsequent iteration.

#### *4.3. Fully Linear Radial Basis Function Models*

The main drawback of quadratic Lagrange models is that we still need $\mathcal{O}(n^2)$ function evaluations in each iteration of Algorithm 2. A possible fix is to use under-determined regression polynomials instead [27,31,41]. Motivated by the findings in [34] we chose so-called Radial Basis Function (RBF) models as an alternative. RBF are well-known for their approximation capabilities on irregular data [40]. In our implementation they have the form

$$m(\mathbf{x}) = \sum\_{i=1}^{N} c\_i \varphi(\|\mathbf{x} - \xi\_i\|\_2) + \pi(\mathbf{x}), \text{ with } \pi = \sum\_{j=1}^{n+1} \lambda\_j \psi\_j \in \Pi\_n^1 \text{ and } N \ge n+1,\tag{4}$$

which conforms to the construction by Wild et al. [34]. Here, $\varphi$ is a function from a domain containing $\mathbb{R}\_{\ge 0}$ to $\mathbb{R}$. For a fixed $\varphi$ the mapping $\varphi(\|\bullet\|\_2)$ from $\mathbb{R}^n \to \mathbb{R}$ is radially symmetric with respect to its argument and the mapping $(\mathbf{x}, \xi) \mapsto \varphi(\|\mathbf{x} - \xi\|\_2)$ is called a *kernel*.

We will describe the procedure only briefly and refer to [34,42] and the dissertation [41] for more details. To conform to the algorithmic framework the models must have Hessians of uniformly bounded norm. Additionally, we want them to be twice differentiable due to the following, very general result:

**Theorem 3** (Th 4.1 in [41])**.** *Suppose that* $f$ *and* $m$ *are continuously differentiable in an open domain containing* $B^{(t)}$ *and that* $\nabla f$ *and* $\nabla m$ *are Lipschitz in* $B^{(t)}$*. Further suppose that* $m$ *interpolates* $f$ *on a* $\Lambda$*-poised set* $\Xi = \{\xi\_1, \ldots, \xi\_{n+1}\}$ *(for a fixed* $\Lambda < \infty$*). Then* $m$ *is fully linear for* $f$ *as in Definition 3.*

The $\Lambda$-poised set is determined using pivotal algorithms from [34,41] in an enlarged trust region of radius $\theta\_1 \Delta^{(t)}$, $\theta\_1 \ge 1$. If we restrict ourselves to functions $\varphi$ that are conditionally positive definite (c.p.d., see [34] for the definition) of order $D \le 2$, then for any $f\colon \mathbb{R}^n \to \mathbb{R}$ an interpolating model $m$ of form (4) is uniquely determined by solving a linear equation system. If further $\varphi$ is twice continuously differentiable on an open domain containing $[0, \infty)$ with $\varphi'(0) = 0$, then $m$ from (4) is twice continuously differentiable and has Lipschitz gradients exactly if its Hessian stays bounded. This is the case for all $\varphi$ we consider (see Table 1). The Hessian norm is determined by the magnitudes of the coefficients $c\_i$ and by $|\varphi'(r)/r|$ and $|\varphi''(r)|$.


**Table 1.** Some radial functions $\phi\colon \mathbb{R}\_{\ge 0} \to \mathbb{R}$ that are c.p.d. of order $D \le 2$, cf. [34].

If there are exactly $N = n + 1$ points from a poised set Ξ, then the coefficients $c\_i$ vanish and the model (4) is a linear polynomial. The values $|\phi'(r)/r|$ and $|\phi''(r)|$ are bounded because of $r \in [0, \Delta^{\mathrm{ub}}]$ and $\phi'(0) = 0$. To exploit the nonlinear modeling capabilities of RBF and perform exploration, there is a procedure in [34] to select additional (database) points from within a region of maximum radius $\theta\_2 \Delta^{\mathrm{ub}}$, $\theta\_2 \ge \theta\_1 \ge 1$, so that the values $|c\_i|$ stay bounded. Modifications for box constraints can be found in [41] (Sec. 6.3.1) and [43].

Table 1 lists the RBFs we use together with their c.p.d. order. Both the Gaussian and the Multiquadric allow for fine-tuning with a shape parameter $\alpha > 0$, which can potentially improve the conditioning of the interpolation system.

Figure 1b illustrates the effect of the shape parameter. As can be seen, the radial functions become narrower for larger shape parameters. Hence, we do not only use a constant shape parameter $\alpha = 1$ as in [34], but we also use an $\alpha$ that is (within lower and upper bounds) inversely proportional to $\Delta^{(t)}$.
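One way to realize such a radius-dependent shape parameter is sketched below. The function name `shape_parameter`, the proportionality constant `scale`, and the bounds `alpha_min` and `alpha_max` are illustrative assumptions and not values taken from the text.

```python
def shape_parameter(delta, scale=1.0, alpha_min=1e-3, alpha_max=1e3):
    """Shape parameter inversely proportional to the trust region radius.

    All default values are hypothetical; the result is clipped to
    [alpha_min, alpha_max] so that it stays within lower and upper bounds.
    """
    if delta <= 0:
        raise ValueError("trust region radius must be positive")
    return min(max(scale / delta, alpha_min), alpha_max)
```

With these defaults, shrinking the trust region increases the shape parameter (narrower kernels) until the upper bound is reached.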

Figure 1a shows interpolation of a nonlinear function by a surrogate based on the Multiquadric with a linear tail.

**Figure 1.** (**a**) Interpolation of a nonlinear function (black) by a Multiquadric surrogate (blue) based on 5 discrete training points (orange). Dashed lines show the kernels and the polynomial tail. (**b**) Different kernels in 1D with varying shape parameter (1 or 10), see also Table 1.
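The following minimal sketch illustrates the kind of interpolation shown in Figure 1a: it fits an RBF model with a linear polynomial tail by solving the usual saddle-point system. It assumes the Multiquadric in the form $\phi(r) = -\sqrt{1 + (\alpha r)^2}$ (c.p.d. of order 1). The helper names `multiquadric` and `fit_rbf` are hypothetical, and the code does not implement the point selection procedure of [34].

```python
import numpy as np

def multiquadric(r, alpha=1.0):
    # Assumed form of the Multiquadric: phi(r) = -sqrt(1 + (alpha * r)^2).
    return -np.sqrt(1.0 + (alpha * r) ** 2)

def fit_rbf(sites, values, alpha=1.0):
    """Fit m(x) = sum_i c_i * phi(||x - xi_i||_2) + <g, x> + b.

    sites  : array of shape (N, n) with pairwise distinct rows
    values : array of shape (N,) with the function values at the sites
    Returns the model as a callable and the RBF coefficients c.
    """
    sites = np.asarray(sites, dtype=float)
    values = np.asarray(values, dtype=float)
    if sites.ndim != 2:
        raise ValueError("sites must have shape (N, n)")
    N, n = sites.shape
    dists = np.linalg.norm(sites[:, None, :] - sites[None, :, :], axis=2)
    Phi = multiquadric(dists, alpha)
    P = np.hstack([sites, np.ones((N, 1))])  # basis of the linear tail
    # Interpolation conditions plus the side condition P^T c = 0.
    A = np.block([[Phi, P], [P.T, np.zeros((n + 1, n + 1))]])
    rhs = np.concatenate([values, np.zeros(n + 1)])
    sol = np.linalg.solve(A, rhs)
    c, lam = sol[:N], sol[N:]

    def model(x):
        x = np.atleast_2d(np.asarray(x, dtype=float))
        r = np.linalg.norm(x[:, None, :] - sites[None, :, :], axis=2)
        return multiquadric(r, alpha) @ c + x @ lam[:n] + lam[n]

    return model, c

# Usage: five training points in 1D, as in Figure 1a (test function is arbitrary).
xi = np.linspace(-2.0, 2.0, 5).reshape(-1, 1)
f_vals = np.sin(2 * xi[:, 0]) + 0.5 * xi[:, 0] ** 2
m, c = fit_rbf(xi, f_vals)
```

The side condition on the coefficients is what makes the system uniquely solvable for c.p.d. functions, provided the sites are poised for linear interpolation.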

#### **5. Descent Steps**

In this section we introduce some possible steps **s**(*t*) to use in Algorithm 2. We begin by defining the best step along the steepest descent direction as given by (Pm). Subsequently, backtracking variants are defined that use a multiobjective variant of Armijo's rule.

#### *5.1. Pareto–Cauchy Step*

Both the *Pareto–Cauchy point* as well as a backtracking variant, the *modified Pareto–Cauchy point*, are points along the descent direction $\mathbf{d}\_{\mathrm{m}}^{(t)}$ within $B^{(t)}$ so that a sufficient decrease measured by $\Phi\_{\mathrm{m}}^{(t)}(\bullet)$ and $\omega\_{\mathrm{m}}^{(t)}(\bullet)$ is achieved. Under mild assumptions we can then derive a decrease in terms of $\omega(\bullet)$.

**Definition 6.** *For $t \in \mathbb{N}\_0$ let $\mathbf{d}\_{\mathrm{m}}^{(t)}$ be a minimizer for* (Pm)*. The best attainable trial point $\mathbf{x}\_{\mathrm{PC}}^{(t)}$ along $\mathbf{d}\_{\mathrm{m}}^{(t)}$ is called the* Pareto–Cauchy point *and given by*

$$\begin{split} \mathbf{x}\_{\text{PC}}^{(t)} &:= \mathbf{x}^{(t)} + \boldsymbol{\sigma}^{(t)} \cdot \mathbf{d}\_{\text{m}}^{(t)},\\ \boldsymbol{\sigma}^{(t)} &= \operatorname\*{arg\,min}\_{0 \le \boldsymbol{\sigma}} \boldsymbol{\Phi}\_{\text{m}}^{(t)} \left( \mathbf{x}^{(t)} + \boldsymbol{\sigma} \cdot \mathbf{d}\_{\text{m}}^{(t)} \right) \quad \text{s.t.} \ \mathbf{x}\_{\text{PC}}^{(t)} \in B^{(t)}. \end{split} \tag{5}$$

*Let $\sigma^{(t)}$ be the minimizer in* (5)*. We call $\mathbf{s}\_{\mathrm{PC}}^{(t)} := \sigma^{(t)} \mathbf{d}\_{\mathrm{m}}^{(t)}$ the* Pareto–Cauchy step*.*

If we make the following standard assumption, then the Pareto–Cauchy point allows for a lower bound on the improvement in terms of $\Phi\_{\mathrm{m}}^{(t)}$.

**Assumption 2.** *For all $t \in \mathbb{N}\_0$ the surrogates $\mathbf{m}^{(t)}(\mathbf{x}) = [m\_1^{(t)}(\mathbf{x}), \ldots, m\_k^{(t)}(\mathbf{x})]^T$ are twice continuously differentiable on an open set containing $\mathcal{X}$. Denote by $Hm\_\ell^{(t)}(\mathbf{x})$ the Hessian of $m\_\ell^{(t)}$ for $\ell = 1, \ldots, k$.*

**Theorem 4.** *If Assumptions 1 and 2 are satisfied, then for any iterate $\mathbf{x}^{(t)}$ the Pareto–Cauchy point $\mathbf{x}\_{\mathrm{PC}}^{(t)}$ satisfies*

$$
\Phi\_{\mathbf{m}}^{(t)}(\mathbf{x}^{(t)}) - \Phi\_{\mathbf{m}}^{(t)}(\mathbf{x}\_{\text{PC}}^{(t)}) \geq \frac{1}{2} \omega\_{\mathbf{m}}^{(t)} \left(\mathbf{x}^{(t)}\right) \cdot \min \left\{ \frac{\omega\_{\mathbf{m}}^{(t)} \left(\mathbf{x}^{(t)}\right)}{\mathbf{c}H\_{\mathbf{m}}^{(t)}}, \Delta^{(t)}, 1 \right\}, \tag{6}
$$

*where*

$$H\_{\mathbf{m}}^{(t)} = \max\_{\ell=1,\ldots,k} \max\_{\mathbf{x} \in B^{(t)}} \left\| Hm\_{\ell}^{(t)}(\mathbf{x}) \right\|\_{F} \tag{7}$$

*and the constant $\mathrm{c} > 0$ relates the trust region norm $\|\bullet\|$ to the Euclidean norm $\|\bullet\|\_2$ via*

$$\|\mathbf{x}\|\_2 \le \sqrt{\mathbf{c}} \|\mathbf{x}\| \qquad \forall \mathbf{x} \in \mathbb{R}^n. \tag{8}$$

If $\|\bullet\| = \|\bullet\|\_\infty$ is used, then c can be chosen as $\mathrm{c} = n$. The proof of Theorem 4 is provided after the next auxiliary lemma.

**Lemma 2.** *Under Assumptions 1 and 2, let $\mathbf{d}$ be a non-increasing direction at $\mathbf{x}^{(t)} \in \mathbb{R}^n$ for $\mathbf{m}^{(t)}$, i.e.,*

$$\left\langle \nabla m\_{\ell}^{(t)}(\mathbf{x}^{(t)}), \mathbf{d} \right\rangle \le 0 \qquad \forall \ell = 1, \dots, k.$$

*Let $q \in \{1, \ldots, k\}$ be any objective index and $\bar{\sigma} \ge \min\left\{\Delta^{(t)}, \|\mathbf{d}\|\right\}$. Then it holds that*

$$m\_{q}^{(t)}(\mathbf{x}^{(t)}) - \min\_{0 \le \sigma \le \bar{\sigma}} m\_{q}^{(t)} \left( \mathbf{x}^{(t)} + \sigma \frac{\mathbf{d}}{\|\mathbf{d}\|} \right) \ge \frac{w}{2} \min \left\{ \frac{w}{\|\mathbf{d}\|^{2} \mathrm{c} H\_{\mathrm{m}}^{(t)}}, \frac{\Delta^{(t)}}{\|\mathbf{d}\|}, 1 \right\},$$

*where we have used the shorthand notation*

$$w = -\max\_{\ell=1,\dots,k} \left\langle \nabla m\_{\ell}^{(t)}(\mathbf{x}^{(t)}), \mathbf{d} \right\rangle \ge 0.$$

Lemma 2 states that a minimizer along any non-increasing direction **d** achieves a minimum reduction w.r.t. $\Phi\_{\mathrm{m}}^{(t)}$. Similar results can be found in [30] or [33]. But since we do not use polynomial surrogates $\mathbf{m}^{(t)}$, we have to employ the multivariate version of Taylor's theorem to make the proof work. We can do this because, according to Assumption 2, the functions $m\_q^{(t)}$, $q \in \{1, \ldots, k\}$, are twice continuously differentiable in an open domain containing $\mathcal{X}$. Moreover, Assumption 1 ensures that the function is defined on the line from $\boldsymbol{\chi}$ to $\mathbf{x}$. As shown in [44] (Ch. 3), a first degree expansion at $\mathbf{x} \in B(\boldsymbol{\chi}, \Delta)$ around $\boldsymbol{\chi} \in \mathcal{X}$ then leads to

$$\begin{aligned} m\_q^{(t)}(\mathbf{x}) &= m\_q^{(t)}(\boldsymbol{\chi}) + \nabla m\_q^{(t)}(\boldsymbol{\chi})^T \mathbf{h} + \frac{1}{2} \mathbf{h}^T H m\_q^{(t)}(\boldsymbol{\xi}\_q) \mathbf{h}, \quad \text{with } \mathbf{h} = \mathbf{x} - \boldsymbol{\chi}, \\ &\text{for some } \boldsymbol{\xi}\_q \in \{\boldsymbol{\chi} + \theta(\mathbf{x} - \boldsymbol{\chi}) : \theta \in [0, 1] \}, \text{ for all } q = 1, \dots, k. \end{aligned} \tag{9}$$

**Proof of Lemma 2.** Let the requirements of Lemma 2 hold and let **d** be a non-increasing direction for **m**(*t*). Then: 

$$\begin{split} &m\_{q}^{(t)}(\mathbf{x}^{(t)}) - \min\_{0 \le \sigma \le \bar{\sigma}} m\_{q}^{(t)} \Big( \mathbf{x}^{(t)} + \sigma \frac{\mathbf{d}}{\|\mathbf{d}\|} \Big) = \max\_{0 \le \sigma \le \bar{\sigma}} \Big\{ m\_{q}^{(t)}(\mathbf{x}^{(t)}) - m\_{q}^{(t)} \Big( \mathbf{x}^{(t)} + \sigma \frac{\mathbf{d}}{\|\mathbf{d}\|} \Big) \Big\} \\ &\overset{(9)}{=} \max\_{0 \le \sigma \le \bar{\sigma}} \Big\{ m\_{q}^{(t)}(\mathbf{x}^{(t)}) - \Big( m\_{q}^{(t)}(\mathbf{x}^{(t)}) + \frac{\sigma}{\|\mathbf{d}\|} \langle \nabla m\_{q}^{(t)}(\mathbf{x}^{(t)}), \mathbf{d} \rangle + \frac{\sigma^{2}}{2 \|\mathbf{d}\|^{2}} \langle \mathbf{d}, H m\_{q}^{(t)}(\boldsymbol{\xi}\_{q}) \mathbf{d} \rangle \Big) \Big\} \\ &\geq \max\_{0 \le \sigma \le \bar{\sigma}} \Big\{ -\frac{\sigma}{\|\mathbf{d}\|} \max\_{j=1,\dots,k} \langle \nabla m\_{j}^{(t)}(\mathbf{x}^{(t)}), \mathbf{d} \rangle - \frac{\sigma^{2}}{2 \|\mathbf{d}\|^{2}} \langle \mathbf{d}, H m\_{q}^{(t)}(\boldsymbol{\xi}\_{q}) \mathbf{d} \rangle \Big\}. \end{split}$$

We use the shorthand $w = -\max\_j \langle \nabla m\_j^{(t)}(\mathbf{x}^{(t)}), \mathbf{d} \rangle$ and the Cauchy–Schwarz inequality to get

$$\dots \ge \max\_{0 \le \sigma \le \bar{\sigma}} \left\{ \frac{\sigma}{\|\mathbf{d}\|} w - \frac{\sigma^2}{2 \|\mathbf{d}\|^2} \|\mathbf{d}\|\_{2}^{2} \left\| H m\_q^{(t)}(\boldsymbol{\xi}\_q) \right\|\_{F} \right\} \overset{(8),(7)}{\ge} \max\_{0 \le \sigma \le \bar{\sigma}} \left\{ \frac{\sigma}{\|\mathbf{d}\|} w - \frac{\sigma^2}{2} \mathrm{c} H\_{\mathrm{m}}^{(t)} \right\}.$$

The RHS is concave in $\sigma$ and we can thus easily determine the global maximizer $\sigma^{\ast}$. Similar to [30] (Lemma 4.1) we find

$$m\_{q}^{(t)}(\mathbf{x}^{(t)}) - \min\_{0 \le \sigma \le \bar{\sigma}} m\_{q}^{(t)} \left( \mathbf{x}^{(t)} + \sigma \frac{\mathbf{d}}{\|\mathbf{d}\|} \right) \ge \frac{w}{2} \min \left\{ \frac{w}{\|\mathbf{d}\|^{2} \mathrm{c} H\_{\mathrm{m}}^{(t)}}, \frac{\Delta^{(t)}}{\|\mathbf{d}\|}, 1 \right\},$$

where we have additionally used $\bar{\sigma} \ge \min\left\{\Delta^{(t)}, \|\mathbf{d}\|\right\}$.
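The maximization step in this proof can be checked numerically. The sketch below evaluates the concave quadratic $\sigma \mapsto \frac{\sigma}{\|\mathbf{d}\|} w - \frac{\sigma^2}{2} \mathrm{c} H\_{\mathrm{m}}^{(t)}$ at its maximizer on $[0, \bar{\sigma}]$ and compares the result with the intermediate lower bound $\frac{w}{2} \min\{ w / (\|\mathbf{d}\|^2 \mathrm{c} H\_{\mathrm{m}}^{(t)}), \bar{\sigma} / \|\mathbf{d}\| \}$, from which the stated bound follows via the condition on $\bar{\sigma}$. The function and argument names are illustrative assumptions.

```python
import numpy as np

def line_bound_check(w, d_norm, cH, sigma_bar):
    """Return (max of the concave quadratic on [0, sigma_bar], lower bound).

    w         : shorthand w >= 0 from Lemma 2
    d_norm    : trust region norm of the direction d (positive)
    cH        : product c * H_m^(t) (positive)
    sigma_bar : upper limit of the step length (positive)
    """
    q = lambda s: s * w / d_norm - 0.5 * s**2 * cH
    # The unconstrained maximizer is w / (d_norm * cH); clip it to the interval.
    sigma_star = min(w / (d_norm * cH), sigma_bar)
    lower = 0.5 * w * min(w / (d_norm**2 * cH), sigma_bar / d_norm)
    return q(sigma_star), lower

# Randomized sanity check (the seed and sampling ranges are arbitrary choices).
rng = np.random.default_rng(0)
ok = all(
    val >= low * (1 - 1e-9)
    for val, low in (line_bound_check(*rng.uniform(0.1, 5.0, size=4)) for _ in range(1000))
)
```

If the clipped maximizer lies in the interior, the quadratic attains $w^2 / (2 \|\mathbf{d}\|^2 \mathrm{c} H\_{\mathrm{m}}^{(t)})$; otherwise its value at $\bar{\sigma}$ is at least $\bar{\sigma} w / (2 \|\mathbf{d}\|)$, which is why the minimum of the two terms appears.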

**Proof of Theorem 4.** If $\mathbf{x}^{(t)}$ is Pareto critical for (MOPm), then $\mathbf{d}\_{\mathrm{m}}^{(t)} = \mathbf{0}$ and $\omega\_{\mathrm{m}}^{(t)}(\mathbf{x}^{(t)}) = 0$ and the inequality holds trivially.

Else, let the indices $\ell, q \in \{1, \ldots, k\}$ be such that

$$\Phi\_{\mathrm{m}}^{(t)}(\mathbf{x}^{(t)}) - \Phi\_{\mathrm{m}}^{(t)}(\mathbf{x}\_{\text{PC}}^{(t)}) = m\_{\ell}^{(t)}(\mathbf{x}^{(t)}) - m\_{q}^{(t)}(\mathbf{x}\_{\text{PC}}^{(t)}) \ge m\_{q}^{(t)}(\mathbf{x}^{(t)}) - m\_{q}^{(t)}(\mathbf{x}\_{\text{PC}}^{(t)}),$$

and define

$$\bar{\sigma} := \begin{cases} \min \left\{ \Delta^{(t)}, \left\| \mathbf{d}\_{\mathbf{m}}^{(t)} \right\| \right\} & \text{if } \left\| \mathbf{d}\_{\mathbf{m}}^{(t)} \right\| < 1 \text{ or } \Delta^{(t)} \le 1, \\ \Delta^{(t)} & \text{else.} \end{cases} \tag{10}$$

Then clearly $\bar{\sigma} \ge \min\left\{\Delta^{(t)}, \|\mathbf{d}\_{\mathrm{m}}^{(t)}\|\right\}$ and for the Pareto–Cauchy point we have

$$m\_{q}^{(t)} \left( \mathbf{x}\_{\mathrm{PC}}^{(t)} \right) = \min\_{0 \le \sigma \le \bar{\sigma}} m\_{q}^{(t)} \left( \mathbf{x}^{(t)} + \sigma \frac{\mathbf{d}\_{\mathrm{m}}^{(t)}}{\left\| \mathbf{d}\_{\mathrm{m}}^{(t)} \right\|} \right).$$

From Lemma 2 and $\|\mathbf{d}\_{\mathrm{m}}^{(t)}\| \le 1$ the bound (6) immediately follows.

**Remark 4.** *Some authors define the Pareto–Cauchy point as the actual minimizer $\mathbf{x}\_{\min}^{(t)}$ of $\Phi\_{\mathrm{m}}^{(t)}$ within the current trust region (instead of the minimizer along the steepest descent direction). For this true minimizer the same bound* (6) *holds. This is due to*

$$
\Phi\_{\mathbf{m}}^{(t)}(\mathbf{x}^{(t)}) - \Phi\_{\mathbf{m}}^{(t)}(\mathbf{x}\_{\min}^{(t)}) = m\_{\ell}^{(t)}(\mathbf{x}^{(t)}) - \min\_{\mathbf{x} \in B^{(t)}} m\_{q}^{(t)}(\mathbf{x}) \ge m\_{q}^{(t)}(\mathbf{x}^{(t)}) - m\_{q}^{(t)}(\mathbf{x}\_{\text{PC}}^{(t)}).
$$

#### *5.2. Modified Pareto–Cauchy Point via Backtracking*

A common approach in trust region methods is to find an approximate solution to (5) within the current trust region. Usually a backtracking procedure similar to Armijo's inexact line-search is used for the Pareto–Cauchy subproblem, see [36] (Section 6.3) and [30]. Doing so, we can still guarantee a sufficient decrease.

Before we actually define the backtracking step along $\mathbf{d}\_{\mathrm{m}}^{(t)}$, we derive a more general lemma. It illustrates that backtracking along any suitable direction is well-defined.

**Lemma 3.** *Suppose Assumptions 1 and 2 hold. For $\mathbf{x}^{(t)} \in \mathbb{R}^n$, let $\mathbf{d}$ be a descent direction for $\mathbf{m}^{(t)}$ and let $q \in \{1, \ldots, k\}$ be any objective index and $\bar{\sigma} > 0$. Then, for any fixed constants $\mathrm{a}, \mathrm{b} \in (0, 1)$ there is an integer $j \in \mathbb{N}\_0$ such that*

$$\Psi\left(\mathbf{x}^{(t)} + \frac{\mathrm{b}^j \bar{\sigma}}{\|\mathbf{d}\|} \mathbf{d}\right) \le \Psi(\mathbf{x}^{(t)}) - \frac{\mathrm{a} \mathrm{b}^j \bar{\sigma}}{\|\mathbf{d}\|} w, \tag{11}$$

*where, again, we have used the shorthand notation $w = -\max\_{\ell=1,\ldots,k} \langle \nabla m\_\ell^{(t)}(\mathbf{x}^{(t)}), \mathbf{d} \rangle > 0$ and $\Psi$ is either some specific model, $\Psi = m\_\ell^{(t)}$, or the maximum value, $\Psi = \Phi\_{\mathrm{m}}^{(t)}$.*

*Moreover, if we define the step $\mathbf{s}^{(t)} = \mathrm{b}^j \bar{\sigma} \frac{\mathbf{d}}{\|\mathbf{d}\|}$ for the smallest $j \in \mathbb{N}\_0$ satisfying* (11)*, then there is a constant $\kappa\_{\mathrm{m}}^{\mathrm{sd}} \in (0, 1)$ such that*

$$\Psi(\mathbf{x}^{(t)}) - \Psi\left(\mathbf{x}^{(t)} + \mathbf{s}^{(t)}\right) \ge \kappa\_{\mathrm{m}}^{\text{sd}} w \min\left\{\frac{w}{\|\mathbf{d}\|^2 \mathrm{c} H\_{\mathrm{m}}^{(t)}}, \frac{\bar{\sigma}}{\|\mathbf{d}\|}\right\}.\tag{12}$$

**Proof.** The first part can be derived from the fact that **d** is a descent direction, see e.g., [6]. However, we will use the approach from [30] to also derive the bound (12). With Taylor's Theorem we obtain

$$\begin{split} \Psi\left(\mathbf{x}^{(t)} + \frac{\mathrm{b}^{j}\bar{\sigma}}{\|\mathbf{d}\|}\mathbf{d}\right) &= m\_{\ell}^{(t)}\left(\mathbf{x}^{(t)} + \frac{\mathrm{b}^{j}\bar{\sigma}}{\|\mathbf{d}\|}\mathbf{d}\right) \qquad \text{(for some } \ell \in \{1, \ldots, k\}) \\ &= m\_{\ell}^{(t)}(\mathbf{x}^{(t)}) + \frac{\mathrm{b}^{j}\bar{\sigma}}{\|\mathbf{d}\|} \langle \nabla m\_{\ell}^{(t)}(\mathbf{x}^{(t)}), \mathbf{d} \rangle + \frac{(\mathrm{b}^{j}\bar{\sigma})^{2}}{2\|\mathbf{d}\|^{2}} \langle \mathbf{d}, H m\_{\ell}^{(t)}(\boldsymbol{\xi}\_{\ell}) \mathbf{d} \rangle \\ &\leq \Psi(\mathbf{x}^{(t)}) + \max\_{q=1,\ldots,k} \frac{\mathrm{b}^{j}\bar{\sigma}}{\|\mathbf{d}\|} \langle \nabla m\_{q}^{(t)}(\mathbf{x}^{(t)}), \mathbf{d} \rangle + \max\_{q=1,\ldots,k} \frac{(\mathrm{b}^{j}\bar{\sigma})^{2}}{2\|\mathbf{d}\|^{2}} \langle \mathbf{d}, H m\_{q}^{(t)}(\boldsymbol{\xi}\_{q}) \mathbf{d} \rangle \\ &\overset{(7),(8)}{\leq} \Psi(\mathbf{x}^{(t)}) - \frac{\mathrm{b}^{j}\bar{\sigma}}{\|\mathbf{d}\|} w + \frac{(\mathrm{b}^{j}\bar{\sigma})^{2}}{2} \mathrm{c} H\_{\mathrm{m}}^{(t)}. \end{split} \tag{13}$$

In the last line, we have additionally used the Cauchy–Schwarz inequality. For a constructive proof, suppose now that (11) is violated for some $j \in \mathbb{N}\_0$, i.e.,

$$
\Psi\left(\mathbf{x}^{(t)} + \frac{\mathbf{b}^j \bar{\boldsymbol{\sigma}}}{||\mathbf{d}||} \mathbf{d}\right) > \Psi(\mathbf{x}^{(t)}) - \frac{\mathbf{a} \mathbf{b}^j \bar{\boldsymbol{\sigma}}}{||\mathbf{d}||} w.
$$

Plugging in (13) for the LHS and subtracting $\Psi(\mathbf{x}^{(t)})$ then leads to

$$\mathrm{b}^{j} > \frac{2(1 - \mathrm{a})w}{\|\mathbf{d}\| \bar{\sigma} \mathrm{c} H\_{\mathrm{m}}^{(t)}},$$

where the right hand side is positive and completely independent of $j$. Since $\mathrm{b} \in (0, 1)$, there must be a $j^{\ast} \in \mathbb{N}\_0$, $j^{\ast} > j$, for which $\mathrm{b}^{j^{\ast}} \le \frac{2(1 - \mathrm{a})w}{\|\mathbf{d}\| \bar{\sigma} \mathrm{c} H\_{\mathrm{m}}^{(t)}}$, so that (11) must also be fulfilled for this $\mathrm{b}^{j^{\ast}}$.

Analogous to the proof of [30] (Lemma 4.2) we can now derive the constant $\kappa\_{\mathrm{m}}^{\mathrm{sd}}$ from (12) as $\kappa\_{\mathrm{m}}^{\mathrm{sd}} = \min\{2\mathrm{b}(1 - \mathrm{a}), \mathrm{a}\}$.

Lemma 3 applies naturally to the step along $\mathbf{d}\_{\mathrm{m}}^{(t)}$:

**Definition 7.** *For $\mathbf{x}^{(t)} \in B^{(t)}$ let $\mathbf{d}\_{\mathrm{m}}^{(t)}$ be a solution to* (Pm) *and define the* modified Pareto–Cauchy step *as*

$$\tilde{\mathbf{s}}\_{\text{PC}}^{(t)} := \mathrm{b}^{j} \bar{\sigma} \frac{\mathbf{d}\_{\mathrm{m}}^{(t)}}{\left\| \mathbf{d}\_{\mathrm{m}}^{(t)} \right\|},$$

*where again $\bar{\sigma}$ is as in* (10) *and $j \in \mathbb{N}\_0$ is the smallest integer that satisfies*

$$\Phi\_{\mathrm{m}}^{(t)}\left(\mathbf{x}^{(t)} + \tilde{\mathbf{s}}\_{\text{PC}}^{(t)}\right) \le \Phi\_{\mathrm{m}}^{(t)}\left(\mathbf{x}^{(t)}\right) - \frac{\mathrm{a}\mathrm{b}^{j}\bar{\sigma}}{\left\|\mathbf{d}\_{\mathrm{m}}^{(t)}\right\|}\omega\_{\mathrm{m}}^{(t)}\left(\mathbf{x}^{(t)}\right) \tag{14}$$

*for predefined constants* a, b ∈ (0, 1)*.*
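A minimal sketch of this backtracking procedure is given below, under the assumption that the trust region norm is the maximum norm and that a descent direction `d` (for example a solution of (Pm)) is supplied by the caller. The function names, the default constants `a` and `b`, and the iteration cap `max_iter` are illustrative choices and not part of the original algorithm.

```python
import numpy as np

def sigma_bar(delta, d_norm):
    # Initial step length following the case distinction in (10).
    return min(delta, d_norm) if (d_norm < 1 or delta <= 1) else delta

def modified_pareto_cauchy_step(models, grads, x, d, delta, a=1e-4, b=0.5, max_iter=60):
    """Backtrack along d until the sufficient decrease condition (14) holds.

    models, grads : lists of callables m_l(x) and grad m_l(x)
    Returns the step s and the backtracking exponent j.
    """
    x = np.asarray(x, dtype=float)
    d = np.asarray(d, dtype=float)
    d_norm = np.linalg.norm(d, np.inf)  # assumed trust region norm
    if d_norm == 0.0:
        return np.zeros_like(d), 0      # critical point: no step
    # -max_l <grad m_l(x), d>, i.e. the shorthand w of Lemma 3 for this d.
    omega = -max(float(g(x) @ d) for g in grads)
    phi = lambda y: max(m(y) for m in models)
    sb = sigma_bar(delta, d_norm)
    phi_x = phi(x)
    for j in range(max_iter):
        s = (b**j) * sb * d / d_norm
        if phi(x + s) <= phi_x - a * (b**j) * sb / d_norm * omega:
            return s, j
    raise RuntimeError("no admissible step found; is d a descent direction?")
```

Since the step has norm $\mathrm{b}^j \bar{\sigma} \le \Delta^{(t)}$, the trial point never leaves the trust region. Lemma 3 is what guarantees that the loop terminates for a descent direction when the models are twice continuously differentiable.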

The definition of $\bar{\sigma}$ ensures that $\mathbf{x}^{(t)} + \tilde{\mathbf{s}}\_{\mathrm{PC}}^{(t)}$ is contained in the current trust region $B^{(t)}$. Furthermore, these steps provide a sufficient decrease very similar to (6):

**Corollary 1.** *Suppose Assumptions 1 and 2 hold. For the step $\tilde{\mathbf{s}}\_{\mathrm{PC}}^{(t)}$ the following statement is true:*


$$
\Phi\_{\mathbf{m}}^{(t)}(\mathbf{x}^{(t)}) - \Phi\_{\mathbf{m}}^{(t)}(\mathbf{x}^{(t)} + \tilde{\mathbf{s}}\_{\mathbf{PC}}^{(t)}) \ge \kappa\_{\mathbf{m}}^{\text{sd}} \omega\_{\mathbf{m}}^{(t)} \left(\mathbf{x}^{(t)}\right) \min \left\{ \frac{\omega\_{\mathbf{m}}^{(t)}\left(\mathbf{x}^{(t)}\right)}{\mathbf{c}H\_{\mathbf{m}}^{(t)}}, \Delta^{(t)}, 1\right\}.
$$

**Proof.** If $\mathbf{x}^{(t)}$ is critical, then the bound is trivial. Otherwise, the existence of a $j$ satisfying (14) follows from Lemma 3 for $\Psi = \Phi\_{\mathrm{m}}^{(t)}$. The lower bound on the decrease follows immediately from $\bar{\sigma} \ge \min\left\{\|\mathbf{d}\_{\mathrm{m}}^{(t)}\|, \Delta^{(t)}\right\}$.

From Lemma 3 it follows that the backtracking condition (14) can be modified to explicitly require a decrease in *every* objective:

**Definition 8.** *Let $j \in \mathbb{N}\_0$ be the smallest integer satisfying*

$$\min\_{\ell=1,\ldots,k} \left\{ m\_{\ell}^{(t)}(\mathbf{x}^{(t)}) - m\_{\ell}^{(t)} \left( \mathbf{x}^{(t)} + \mathrm{b}^j \bar{\sigma} \frac{\mathbf{d}\_{\mathrm{m}}^{(t)}}{\left\| \mathbf{d}\_{\mathrm{m}}^{(t)} \right\|} \right) \right\} \geq \frac{\mathrm{a} \mathrm{b}^j \bar{\sigma}}{\left\| \mathbf{d}\_{\mathrm{m}}^{(t)} \right\|} \omega\_{\mathrm{m}}^{(t)} \left( \mathbf{x}^{(t)} \right).$$

*We define the* strict *modified Pareto–Cauchy point as $\hat{\mathbf{x}}\_{\mathrm{PC}}^{(t)} = \mathbf{x}^{(t)} + \hat{\mathbf{s}}\_{\mathrm{PC}}^{(t)}$ and the corresponding step as $\hat{\mathbf{s}}\_{\mathrm{PC}}^{(t)} = \mathrm{b}^{j} \bar{\sigma} \frac{\mathbf{d}\_{\mathrm{m}}^{(t)}}{\| \mathbf{d}\_{\mathrm{m}}^{(t)} \|}$.*

**Corollary 2.** *Suppose Assumptions 1 and 2 hold.*

*1. The strict modified Pareto–Cauchy point exists, the backtracking is finite.*

*2. There is a constant $\kappa\_{\mathrm{m}}^{\mathrm{sd}} \in (0, 1)$ such that*

$$\min\_{\ell = 1, \ldots, k} \left\{ m\_{\ell}^{(t)} \left( \mathbf{x}^{(t)} \right) - m\_{\ell}^{(t)} \left( \hat{\mathbf{x}}\_{\text{PC}}^{(t)} \right) \right\} \ge \kappa\_{\text{m}}^{\text{sd}} \omega\_{\text{m}}^{(t)} \left( \mathbf{x}^{(t)} \right) \min \left\{ \frac{\omega\_{\text{m}}^{(t)} \left( \mathbf{x}^{(t)} \right)}{c H\_{\text{m}}^{(t)}}, \boldsymbol{\Delta}^{(t)}, 1 \right\}. \tag{15}$$


**Remark 5.** *In the preceding subsections, we have shown descent steps along the model steepest descent direction. Similar to the single objective case we do not necessarily have to use the steepest descent direction and different step calculation methods are viable. For instance, Thomann and Eichfelder [33] use the well-known Pascoletti–Serafini scalarization to treat the subproblem* (MOPm)*. We refer to their work and Appendix B to see how this method can be related to the steepest descent direction.*

#### *5.3. Sufficient Decrease for the Original Problem*

In the previous subsections, we have shown how to compute steps $\mathbf{s}^{(t)}$ to achieve a sufficient decrease in terms of $\Phi\_{\mathrm{m}}^{(t)}$ and $\omega\_{\mathrm{m}}^{(t)}(\bullet)$. For a descent step $\mathbf{s}^{(t)}$ the bound is of the form

$$
\Phi\_{\mathbf{m}}^{(t)}(\mathbf{x}^{(t)}) - \Phi\_{\mathbf{m}}^{(t)}(\mathbf{x}^{(t)} + \mathbf{s}^{(t)}) \ge \kappa\_{\mathbf{m}}^{\text{sd}} \omega\_{\mathbf{m}}^{(t)} \left(\mathbf{x}^{(t)}\right) \min \left\{ \frac{\omega\_{\mathbf{m}}^{(t)}\left(\mathbf{x}^{(t)}\right)}{\mathbf{c}H\_{\mathbf{m}}^{(t)}}, \boldsymbol{\Delta}^{(t)}, 1 \right\}, \quad \kappa\_{\mathbf{m}}^{\text{sd}} \in (0, 1), \tag{16}
$$

and thereby very similar to the bounds for the scalar projected gradient trust region method [36]. By introducing a slightly modified version of $\omega\_{\mathrm{m}}^{(t)}(\bullet)$, we can transform (16) into the bound used in [30,33].

**Lemma 4.** *If $\pi(t, \mathbf{x}^{(t)})$ is a criticality measure for some multiobjective problem, then $\tilde{\pi}(t, \mathbf{x}^{(t)}) = \min\{1, \pi(t, \mathbf{x}^{(t)})\}$ is also a criticality measure for the same problem.*

**Proof.** We have $0 \le \tilde{\pi}(t, \mathbf{x}^{(t)}) \le \pi(t, \mathbf{x}^{(t)})$. Thus, $\tilde{\pi} \to 0$ whenever $\pi \to 0$. The minimum of uniformly continuous functions is again uniformly continuous.

We next make another standard assumption on the class of surrogate models. 

**Assumption 3.** *The norm of all model Hessians is uniformly bounded above on $\mathcal{X}$, i.e., there is a positive constant $\mathrm{H}\_{\mathrm{m}}$ such that*

$$\left\| \operatorname{H} m\_{\ell}^{(t)}(\mathbf{x}) \right\|\_{F} \leq \operatorname{H}\_{\mathbf{m}} \qquad \forall \ell = 1, \dots, k, \forall \mathbf{x} \in \mathcal{B}^{(t)}, \ \forall t \in \mathbb{N}\_{0}.$$

*W.l.o.g., we assume*

$$\mathrm{H}\_{\mathrm{m}} \cdot \mathrm{c} > 1, \quad \text{with } \mathrm{c} \text{ as in (8).} \tag{17}$$

**Remark 6.** *From this assumption it follows that the model gradients are then Lipschitz as well. Together with Theorem 2, we then know that $\omega\_{\mathrm{m}}^{(t)}(\bullet)$ is a criticality measure for* (MOPm)*.*

Motivated by the previous remark, we will from now on refer to the following functions:

$$\varpi(\mathbf{x}) := \min \{ \omega(\mathbf{x}), 1 \} \quad \text{and} \quad \varpi\_{\mathrm{m}}^{(t)}(\mathbf{x}) := \min \{ \omega\_{\mathrm{m}}^{(t)}(\mathbf{x}), 1 \} \quad \forall t = 0, 1, \dots \tag{18}$$

We can thereby derive the sufficient decrease condition in "standard form":

**Corollary 3.** *Under Assumption 3, suppose that for $\mathbf{x}^{(t)}$ and some descent step $\mathbf{s}^{(t)}$ the bound* (16) *holds. For the criticality measure $\varpi\_{\mathrm{m}}^{(t)}(\bullet)$ it follows that*

$$\Phi\_{\mathrm{m}}^{(t)}(\mathbf{x}^{(t)}) - \Phi\_{\mathrm{m}}^{(t)}(\mathbf{x}^{(t)} + \mathbf{s}^{(t)}) \ge \kappa\_{\mathrm{m}}^{\text{sd}} \varpi\_{\mathrm{m}}^{(t)} \left(\mathbf{x}^{(t)}\right) \min \left\{ \frac{\varpi\_{\mathrm{m}}^{(t)} \left(\mathbf{x}^{(t)}\right)}{\mathrm{c}\mathrm{H}\_{\mathrm{m}}}, \Delta^{(t)} \right\}. \tag{19}$$

**Proof.** $\varpi\_{\mathrm{m}}^{(t)}(\bullet)$ is a criticality measure due to Assumption 3 and Lemma 4. Further, from (18) and (17) it follows that

$$\frac{\varpi\_{\mathrm{m}}^{(t)}\left(\mathbf{x}^{(t)}\right)}{\mathrm{c}\mathrm{H}\_{\mathrm{m}}} \le \frac{1}{\mathrm{c}\mathrm{H}\_{\mathrm{m}}} \le 1$$

and if we plug this into (16) we obtain (19).

To relate the RHS of (19) to the criticality $\omega(\bullet)$ of the original problem, we require another assumption.

**Assumption 4.** *There is a constant $\kappa\_{\omega} > 0$ such that*

$$\left|\omega\_{\mathbf{m}}^{(t)}\left(\mathbf{x}^{(t)}\right) - \omega\left(\mathbf{x}^{(t)}\right)\right| \leq \kappa\_{\omega}\omega\_{\mathbf{m}}^{(t)}\left(\mathbf{x}^{(t)}\right).$$

This assumption is also made by Thomann and Eichfelder [33] and can easily be justified by using fully linear surrogate models and a bounded trust region radius in combination with a criticality test, see Lemma 7.

Assumption 4 can be used to formulate the next two lemmata relating the model criticality and the true criticality. They are proven in Appendix A.2. From these lemmata and Corollary 3 the final result, Corollary 4, easily follows.

**Lemma 5.** *If Assumption 4 holds, then it holds for $\varpi\_{\mathrm{m}}^{(t)}(\bullet)$ and $\varpi(\bullet)$ from* (18) *that*

$$\left|\varpi\_{\mathrm{m}}^{(t)}\left(\mathbf{x}^{(t)}\right) - \varpi\left(\mathbf{x}^{(t)}\right)\right| \leq \kappa\_{\omega} \varpi\_{\mathrm{m}}^{(t)}\left(\mathbf{x}^{(t)}\right).$$

**Lemma 6.** *From Assumption 4 it follows that*

$$
\varpi\_{\mathrm{m}}^{(t)}\left(\mathbf{x}^{(t)}\right) \ge \frac{1}{\kappa\_{\omega} + 1} \varpi\left(\mathbf{x}^{(t)}\right) \quad \text{with } (\kappa\_{\omega} + 1)^{-1} \in (0, 1).
$$

**Corollary 4.** *Suppose that Assumptions 3 and 4 hold and that $\mathbf{x}^{(t)}$ and $\mathbf{s}^{(t)}$ satisfy* (19)*. Then*

$$
\Phi\_{\mathrm{m}}^{(t)}(\mathbf{x}^{(t)}) - \Phi\_{\mathrm{m}}^{(t)}(\mathbf{x}^{(t)} + \mathbf{s}^{(t)}) \ge \kappa^{\text{sd}} \varpi\left(\mathbf{x}^{(t)}\right) \min\left\{\frac{\varpi\left(\mathbf{x}^{(t)}\right)}{\mathrm{c}\mathrm{H}\_{\mathrm{m}}}, \Delta^{(t)}\right\},\tag{20}
$$

*where $\kappa^{\mathrm{sd}} = \frac{\kappa\_{\mathrm{m}}^{\mathrm{sd}}}{1 + \kappa\_{\omega}} \in (0, 1)$.*

#### **6. Convergence**

*6.1. Preliminary Assumptions and Definitions*

To prove convergence of Algorithm 2 we first have to make sure that at least one of the objectives is bounded from below. This is a weaker requirement than the standard assumption that all objectives are bounded from below:

**Assumption 5.** *The maximum $\max\_{\ell=1,\ldots,k} f\_\ell(\mathbf{x})$ of all objective functions is bounded from below on $\mathcal{X}$.*

To be able to use $\varpi(\bullet)$ as a criticality measure and to refer to fully linear models, we further require:

**Assumption 6.** *The objective $\mathbf{f}\colon \mathbb{R}^n \to \mathbb{R}^k$ is continuously differentiable in an open domain containing $\mathcal{X}$ and has a Lipschitz continuous gradient on $\mathcal{X}$.*

We summarize the assumptions on the surrogates as follows:

**Assumption 7.** *The vector of surrogate model functions $m\_1^{(t)}, \ldots, m\_k^{(t)}$ belongs to a collection of fully linear classes as in Definition 4: For each objective index $\ell = 1, \ldots, k$ there are error constants $\epsilon\_\ell$ and $\dot{\epsilon}\_\ell$ so that $m\_\ell^{(t)}$ can be made to satisfy the bounds in Definition 3.*

For the subsequent analysis we define component-wise maximum constants as

$$\epsilon := \max\_{\ell=1,\ldots,k} \epsilon\_{\ell}, \quad \dot{\epsilon} := \max\_{\ell=1,\ldots,k} \dot{\epsilon}\_{\ell}. \tag{21}$$

We also wish for the descent steps to fulfill a sufficient decrease condition for the surrogate criticality measure as discussed in Section 5.

**Assumption 8.** *For all $t \in \mathbb{N}\_0$ the descent steps $\mathbf{s}^{(t)}$ are assumed to fulfill both $\mathbf{x}^{(t)} + \mathbf{s}^{(t)} \in B^{(t)}$ and* (19)*.*

Finally, to avoid a cluttered notation when dealing with subsequences we define the following shorthand notations:

$$
\varpi\_{\mathrm{m}}^{(t)} := \varpi\_{\mathrm{m}}^{(t)} \left( \mathbf{x}^{(t)} \right), \qquad
\varpi^{(t)} := \varpi \left( \mathbf{x}^{(t)} \right) \quad \forall t \in \mathbb{N}\_{0}.
$$

#### *6.2. Convergence Proof*

In the following we prove convergence of Algorithm 2 to Pareto critical points. We account for the case that no criticality test is used, i.e., $\varepsilon\_{\mathrm{crit}} = 0$. We then require all surrogates to be fully linear in each iteration and need Assumption 4. The proof is an adapted version of the scalar case in [35].

It is also similar to the proofs for the multiobjective algorithms in [30,33]. However, in both cases, no criticality test is employed, there is no distinction between successful and acceptable iterations ($\nu\_{+} = \nu\_{++}$) and interpolation at $\mathbf{x}^{(t)}$ by the surrogates is required. We indicate notable differences when appropriate.

We start with two results concerning the criticality test in Algorithm 2.

**Lemma 7.** *For each iteration $t \in \mathbb{N}\_0$ Assumption 4 is fulfilled if the model $\mathbf{m}^{(t)}$ is fully linear and the criticality test was performed and (if applicable) Algorithm 1 has finished.*

**Proof.** Let $\ell, q \in \{1, \ldots, k\}$ and $\mathbf{d}\_\ell, \mathbf{d}\_q \in \mathcal{X} - \mathbf{x}^{(t)}$ be solutions of (Pm) and (P1), respectively, such that

$$
\omega\_{\mathbf{m}}^{(t)}\left(\mathbf{x}^{(t)}\right) = -\langle \nabla m\_{\ell}^{(t)}(\mathbf{x}^{(t)}), \mathbf{d}\_{\ell}\rangle, \qquad
\omega\left(\mathbf{x}^{(t)}\right) = -\langle \nabla f\_{q}(\mathbf{x}^{(t)}), \mathbf{d}\_{q}\rangle.
$$

If $\omega\_{\mathrm{m}}^{(t)}(\mathbf{x}^{(t)}) \ge \omega(\mathbf{x}^{(t)})$, then, using Cauchy–Schwarz and $\|\mathbf{d}\_\ell\| \le 1$,

$$\begin{aligned} \left| \omega\_{\mathbf{m}}^{(t)} \left( \mathbf{x}^{(t)} \right) - \omega \left( \mathbf{x}^{(t)} \right) \right| &= \langle \nabla f\_{q} (\mathbf{x}^{(t)}), \mathbf{d}\_{q} \rangle - \langle \nabla m\_{\ell}^{(t)} (\mathbf{x}^{(t)}), \mathbf{d}\_{\ell} \rangle \\ &\stackrel{\text{df.}}{\leq} \langle \nabla f\_{q} (\mathbf{x}^{(t)}), \mathbf{d}\_{\ell} \rangle - \langle \nabla m\_{q}^{(t)} (\mathbf{x}^{(t)}), \mathbf{d}\_{\ell} \rangle \\ &\leq \left\| \nabla f\_{q} (\mathbf{x}^{(t)}) - \nabla m\_{q}^{(t)} (\mathbf{x}^{(t)}) \right\|\_{2}, \end{aligned}$$

and if $\omega\_{\mathrm{m}}^{(t)}(\mathbf{x}^{(t)}) < \omega(\mathbf{x}^{(t)})$, we obtain

$$\left|\omega\_{\mathbf{m}}^{(t)}\left(\mathbf{x}^{(t)}\right) - \omega\left(\mathbf{x}^{(t)}\right)\right| \leq \left\|\nabla m\_{\ell}^{(t)}\left(\mathbf{x}^{(t)}\right) - \nabla f\_{\ell}\left(\mathbf{x}^{(t)}\right)\right\|\_{2}.$$

Because **m**(*t*) is fully linear, it follows that 

$$\left|\omega\_{\mathbf{m}}^{(t)}\left(\mathbf{x}^{(t)}\right) - \omega\left(\mathbf{x}^{(t)}\right)\right| \leq \sqrt{\mathrm{c}}\, \dot{\epsilon}\, \Delta^{(t)}, \qquad \text{with } \dot{\epsilon} \text{ from (21)}.$$

If we just left Algorithm 1, then the model is fully linear for $\Delta^{(t)}$ due to Lemma 1 and we have $\Delta^{(t)} \le \mu \varpi\_{\mathrm{m}}^{(t)}(\mathbf{x}^{(t)}) \le \mu \omega\_{\mathrm{m}}^{(t)}(\mathbf{x}^{(t)})$. If we otherwise did not enter Algorithm 1 in the first place, it must hold that $\omega\_{\mathrm{m}}^{(t)}(\mathbf{x}^{(t)}) \ge \varepsilon\_{\mathrm{crit}}$ and

$$
\Delta^{(t)} \le \Delta^{\text{ub}} = \frac{\Delta^{\text{ub}}}{\varepsilon\_{\text{crit}}} \varepsilon\_{\text{crit}} \le \frac{\Delta^{\text{ub}}}{\varepsilon\_{\text{crit}}} \omega\_{\text{m}}^{(t)} \left( \mathbf{x}^{(t)} \right)
$$

and thus

$$
\left| \omega\_{\text{m}}^{(t)} \left( \mathbf{x}^{(t)} \right) - \omega \left( \mathbf{x}^{(t)} \right) \right| \le \kappa\_{\omega} \omega\_{\text{m}}^{(t)} \left( \mathbf{x}^{(t)} \right), \quad \kappa\_{\omega} = \sqrt{\mathrm{c}}\, \dot{\epsilon} \max \left\{ \mu, \varepsilon\_{\text{crit}}^{-1} \Delta^{\text{ub}} \right\} > 0.
$$

In the subsequent analysis, we require mainly steps with fully linear models to achieve sufficient decrease for the true problem. Due to Lemma 7, we can dispose of Assumption 4 by using the criticality routine:

**Assumption 9.** *Either ε*crit > 0 *or Assumption 4 holds.* 

We have also implicitly shown the following property of the criticality measures.

**Corollary 5.** *If* $\mathbf{m}^{(t)}$ *is fully linear for* $\mathbf{f}$ *with* $\dot{\epsilon} > 0$ *as in* (21) *then*

$$\left|\varpi\_{\mathbf{m}}^{(t)}\left(\mathbf{x}^{(t)}\right) - \varpi\left(\mathbf{x}^{(t)}\right)\right| \leq \left|\omega\_{\mathbf{m}}^{(t)}\left(\mathbf{x}^{(t)}\right) - \omega\left(\mathbf{x}^{(t)}\right)\right| \leq \sqrt{\mathsf{c}}\, \dot{\epsilon}\, \Delta^{(t)}.$$

**Lemma 8.** *If* $\mathbf{x}^{(t)}$ *is not critical for the true problem* (MOP)*, i.e.,* $\varpi(\mathbf{x}^{(t)}) \neq 0$*, then Algorithm 1 will terminate after a finite number of iterations.*

**Proof.** At the start of Algorithm 1, we know that $\mathbf{m}^{(t)}$ is not fully linear or $\Delta^{(t)} > \mu \varpi\_{\mathbf{m}}^{(t)}(\mathbf{x}^{(t)})$. For clarity, we denote the first model by $\mathbf{m}\_0^{(t)}$ and define $\Delta\_0 = \Delta^{(t)}$. We then ensure that the model is made fully linear on $\Delta\_1^{(t)} = \Delta\_0$ and denote this fully linear model by $\mathbf{m}\_1^{(t)}$. If afterwards $\Delta\_1^{(t)} \le \mu \varpi\_{\mathbf{m}\_1}^{(t)}(\mathbf{x}^{(t)})$, then Algorithm 1 terminates.

Otherwise, the process is repeated: the radius is multiplied by $\alpha \in (0, 1)$ so that in the $j$-th iteration we have $\Delta\_j^{(t)} = \alpha^{j-1}\Delta\_0$ and $\mathbf{m}\_j^{(t)}$ is made fully linear on $\Delta\_j^{(t)}$ until

$$
\Delta\_j^{(t)} = \alpha^{j-1} \Delta\_0 \le \mu \varpi\_{\mathbf{m}\_j}^{(t)} \left(\mathbf{x}^{(t)}\right).
$$

The only way for Algorithm 1 to loop infinitely is

$$
\varpi\_{\mathbf{m}\_j}^{(t)} \left( \mathbf{x}^{(t)} \right) < \frac{\alpha^{j-1} \Delta\_0}{\mu} \qquad \forall j \in \mathbb{N}.\tag{22}
$$

Because $\mathbf{m}\_j^{(t)}$ is fully linear on $\alpha^{j-1}\Delta\_0$, we know from Corollary 5 that

$$\left|\varpi\_{\mathbf{m}\_{j}}^{(t)}\left(\mathbf{x}^{(t)}\right) - \varpi\left(\mathbf{x}^{(t)}\right)\right| \leq \sqrt{\mathsf{c}}\, \dot{\epsilon}\, \alpha^{j-1} \Delta\_{0} \qquad \forall j \in \mathbb{N}.$$

Using the triangle inequality together with (22) gives us

$$\left|\varpi\left(\mathbf{x}^{(t)}\right)\right| \leq \left|\varpi\_{\mathbf{m}\_{j}}^{(t)}\left(\mathbf{x}^{(t)}\right) - \varpi\left(\mathbf{x}^{(t)}\right)\right| + \left|\varpi\_{\mathbf{m}\_{j}}^{(t)}\left(\mathbf{x}^{(t)}\right)\right| \leq \left(\mu^{-1} + \sqrt{\mathsf{c}}\, \dot{\epsilon}\right)\alpha^{j-1}\Delta\_{0} \quad \forall j \in \mathbb{N}.$$

As $\alpha \in (0, 1)$, this implies $\varpi(\mathbf{x}^{(t)}) = 0$ and $\mathbf{x}^{(t)}$ is hence critical.
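The shrinking loop analysed in this proof can be sketched as follows. This is only an illustration of the argument, not the authors' implementation: `crit_of_model` is a hypothetical stand-in for "rebuild a fully linear model on the given radius and evaluate its criticality measure", and the toy function and parameter values are assumptions chosen for the example.

```python
from typing import Callable, Tuple


def criticality_loop(
    crit_of_model: Callable[[float], float],
    delta0: float,
    mu: float,
    alpha: float,
    max_loops: int = 100,
) -> Tuple[float, int]:
    """Shrink the radius until delta <= mu * crit_of_model(delta).

    crit_of_model(delta) is a hypothetical callback that stands in for
    making the model fully linear on radius delta and returning the
    surrogate criticality value at the current iterate.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    delta = delta0
    for j in range(1, max_loops + 1):
        # In loop j the radius equals alpha**(j - 1) * delta0.
        if delta <= mu * crit_of_model(delta):
            return delta, j
        delta *= alpha
    # At a critical point the test can never pass, so we give up here.
    raise RuntimeError("no termination within max_loops")


def toy_crit(delta: float) -> float:
    """Toy model: true criticality 0.2, model error proportional to delta."""
    return max(0.2 - 0.5 * delta, 0.0)


final_delta, loops = criticality_loop(toy_crit, delta0=1.0, mu=2.0, alpha=0.5)
print(final_delta, loops)
```

Because the toy criticality stays away from zero once the radius is small, the loop stops after finitely many shrinking steps; with a callback that always returns zero it would exhaust `max_loops`, mirroring the only way the routine can loop forever.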

We next state another auxiliary lemma that we need for the convergence proof.

**Lemma 9.** *Suppose Assumptions 6 and 7 hold. For the iterate* $\mathbf{x}^{(t)}$ *let* $\mathbf{s}^{(t)} \in \mathbb{R}^n$ *be any step with* $\mathbf{x}\_{+}^{(t)} = \mathbf{x}^{(t)} + \mathbf{s}^{(t)} \in B^{(t)}$*. If* $\mathbf{m}^{(t)}$ *is fully linear on* $B^{(t)}$ *then it holds that*

$$\left|\Phi(\mathbf{x}\_{+}^{(t)}) - \Phi\_{\mathbf{m}}^{(t)}(\mathbf{x}\_{+}^{(t)})\right| \leq \epsilon \left(\Delta^{(t)}\right)^{2}.$$

**Proof.** The proof follows from the definition of $\Phi$ and $\Phi\_{\mathbf{m}}^{(t)}$ and the full linearity of $\mathbf{m}^{(t)}$. It can be found in [33] (Lemma 4.16).

Convergence of Algorithm 2 is proven by showing that in certain situations, the iteration must be acceptable or successful as defined in Definition 5. This is done indirectly and relies on the next two lemmata. They use the preceding result to show that in a (hypothetical) situation where no Pareto critical point is approached, the trust region radius must be bounded from below. 

**Lemma 10.** *Suppose Assumptions 1, 3 and 6 to 8 hold. If* **x**(*t*) *is not Pareto critical for* (MOPm) *and* **m**(*t*) *is fully linear on B*(*t*) *and*

$$\Delta^{(t)} \le \frac{\kappa\_{\rm m}^{\rm sd} (1 - \nu\_{++}) \varpi\_{\rm m}^{(t)} \left( \mathbf{x}^{(t)} \right)}{2\lambda}, \quad \text{where } \lambda = \max \{ \epsilon, \mathsf{c}\mathrm{H}\_{\rm m} \} \text{ and } \kappa\_{\rm m}^{\rm sd} \text{ as in (19)},$$

*then the iteration is successful, that is,* $t \in \mathcal{S}$ *and* $\Delta^{(t+1)} \ge \Delta^{(t)}$*.*

**Proof.** The proof is very similar to [35] (Lemma 5.3) and [33] (Lemma 4.17). In contrast to the latter, we use the surrogate problem and do not require interpolation at **x**(*t*): 

By definition we have $\kappa\_{\rm m}^{\rm sd}(1 - \nu\_{++}) < 1$ and hence it follows from Assumptions 4 and 8 and Corollary 3 that

$$\Delta^{(t)} \le \frac{\kappa\_{\rm m}^{\rm sd} (1 - \nu\_{++}) \varpi\_{\rm m}^{(t)} \left( \mathbf{x}^{(t)} \right)}{2\lambda} \le \frac{\varpi\_{\rm m}^{(t)}}{2\lambda} \le \frac{\varpi\_{\rm m}^{(t)}}{2\mathsf{c}\mathrm{H}\_{\rm m}} \le \frac{\varpi\_{\rm m}^{(t)}}{\mathsf{c}\mathrm{H}\_{\rm m}}. \tag{23}$$

With Assumption 8 we can plug this into (19) and obtain

$$\Phi\_{\mathbf{m}}^{(t)}(\mathbf{x}^{(t)}) - \Phi\_{\mathbf{m}}^{(t)}(\mathbf{x}\_{+}^{(t)}) \geq \kappa\_{\mathbf{m}}^{\text{sd}} \varpi\_{\mathbf{m}}^{(t)} \min \left\{ \frac{\varpi\_{\mathbf{m}}^{(t)}}{\mathsf{c}\mathrm{H}\_{\mathbf{m}}}, \Delta^{(t)} \right\} \geq \kappa\_{\mathbf{m}}^{\text{sd}} \varpi\_{\mathbf{m}}^{(t)} \Delta^{(t)}.\tag{24}$$

Due to Assumption 7 we can take Definition (3) and estimate

$$\begin{aligned} \left| \rho^{(t)} - 1 \right| &= \left| \frac{\Phi(\mathbf{x}^{(t)}) - \Phi(\mathbf{x}\_{+}^{(t)}) - \left(\Phi\_{\mathbf{m}}^{(t)}(\mathbf{x}^{(t)}) - \Phi\_{\mathbf{m}}^{(t)}(\mathbf{x}\_{+}^{(t)})\right)}{\Phi\_{\mathbf{m}}^{(t)}(\mathbf{x}^{(t)}) - \Phi\_{\mathbf{m}}^{(t)}(\mathbf{x}\_{+}^{(t)})} \right| \\ &\leq \frac{\left| \Phi(\mathbf{x}^{(t)}) - \Phi\_{\mathbf{m}}^{(t)}(\mathbf{x}^{(t)}) \right| + \left| \Phi\_{\mathbf{m}}^{(t)}(\mathbf{x}\_{+}^{(t)}) - \Phi(\mathbf{x}\_{+}^{(t)}) \right|}{\left| \Phi\_{\mathbf{m}}^{(t)}(\mathbf{x}^{(t)}) - \Phi\_{\mathbf{m}}^{(t)}(\mathbf{x}\_{+}^{(t)}) \right|} \\ &\leq \frac{2\epsilon \left(\Delta^{(t)}\right)^{2}}{\kappa\_{\mathbf{m}}^{\text{sd}} \varpi\_{\mathbf{m}}^{(t)} \Delta^{(t)}} \leq \frac{2\lambda\Delta^{(t)}}{\kappa\_{\mathbf{m}}^{\text{sd}} \varpi\_{\mathbf{m}}^{(t)}} \leq 1 - \nu\_{++}. \end{aligned}$$

Therefore $\rho^{(t)} \ge \nu\_{++}$ and the iteration $t$ using step $\mathbf{s}^{(t)}$ is successful.
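The final chain of inequalities can be checked numerically. The sketch below only evaluates the two quantities that appear in the proof: the radius threshold from the lemma, and the upper bound on the gap between the ratio and 1. All parameter values and function names are illustrative assumptions, not values taken from the paper.

```python
def success_radius(kappa_sd: float, nu_pp: float, crit_m: float,
                   eps: float, c_h: float) -> float:
    """Radius threshold kappa*(1 - nu_pp)*crit_m / (2*lam), lam = max(eps, c_h)."""
    lam = max(eps, c_h)
    return kappa_sd * (1.0 - nu_pp) * crit_m / (2.0 * lam)


def rho_gap_bound(delta: float, kappa_sd: float, crit_m: float, eps: float) -> float:
    """Upper bound 2*eps*delta**2 / (kappa*crit_m*delta) on |rho - 1|."""
    return 2.0 * eps * delta ** 2 / (kappa_sd * crit_m * delta)


# Illustrative (assumed) constants: kappa_sd, nu_pp, surrogate criticality,
# full-linearity constant eps and the product c*H of the model Hessian bound.
kappa_sd, nu_pp, crit_m, eps, c_h = 0.5, 0.4, 0.3, 2.0, 1.0
threshold = success_radius(kappa_sd, nu_pp, crit_m, eps, c_h)

for delta in (threshold, 0.5 * threshold, 0.1 * threshold):
    gap = rho_gap_bound(delta, kappa_sd, crit_m, eps)
    # Below the threshold the gap never exceeds 1 - nu_pp, hence rho >= nu_pp.
    print(f"delta={delta:.5f}  bound on |rho-1|={gap:.3f}  limit={1.0 - nu_pp:.3f}")
```

At the threshold itself the bound equals the limit, and it decreases linearly as the radius shrinks, which is why sufficiently small radii with a fully linear model force a successful iteration.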

The same statement can be made for the true problem and $\varpi(\bullet)$:

**Corollary 6.** *Suppose Assumptions 1, 3 and 6 to 9 hold. If* **x**(*t*) *is not Pareto critical for* (MOP) *and* **m**(*t*) *is fully linear on B*(*t*) *and*

$$
\Delta^{(t)} \le \frac{\kappa^{\rm sd} (1 - \nu\_{++}) \varpi\left(\mathbf{x}^{(t)}\right)}{2\lambda}, \quad \text{where } \lambda = \max\{\epsilon, \mathsf{c}\mathrm{H}\_{\rm m}\} \text{ and } \kappa^{\rm sd} \text{ as in (20)},
$$

*then the iteration is successful, that is,* $t \in \mathcal{S}$ *and* $\Delta^{(t+1)} \ge \Delta^{(t)}$*.*

**Proof.** The proof works exactly the same as for Lemma 10. But due to Assumption 9 we can use Lemma 7 and employ the sufficient decrease condition (20) for $\varpi(\bullet)$ instead.

As in [35] (Lemma 5.4) and [33] (Lemma 4.18), it is now easy to show that when no Pareto critical point of (MOPm) is approached the trust region radius must be bounded: 

**Lemma 11.** *Suppose Assumptions 1, 3 and 6 to 8 hold and that there exists a constant* $\varpi\_{\rm m}^{\rm lb} > 0$ *such that* $\varpi\_{\rm m}^{(t)}(\mathbf{x}^{(t)}) \ge \varpi\_{\rm m}^{\rm lb}$ *for all* $t$*. Then there is a constant* $\Delta^{\text{lb}} > 0$ *with*

$$
\Delta^{(t)} \ge \Delta^{\text{lb}} \quad \text{for all } t \in \mathbb{N}\_0.
$$

**Proof.** We first investigate the criticality step and assume $\varepsilon\_{\text{crit}} > \varpi\_{\rm m}^{(t)} \ge \varpi\_{\rm m}^{\rm lb}$. After we finish the criticality loop, we get a radius $\Delta^{(t)}$ so that $\Delta^{(t)} \ge \min\{\Delta\_\*^{(t)}, \beta \varpi\_{\rm m}^{(t)}\}$ and therefore $\Delta^{(t)} \ge \min\{\beta \varpi\_{\rm m}^{\rm lb}, \Delta\_\*^{(t)}\}$ for all $t$.

Outside the criticality step, we know from Lemma 10 that whenever $\Delta^{(t)}$ falls below

$$
\tilde{\Delta} := \frac{\kappa\_{\rm m}^{\rm sd} (1 - \nu\_{++}) \varpi\_{\rm m}^{\rm lb}}{2\lambda},
$$

iteration $t$ must be either model-improving or successful and hence $\Delta^{(t+1)} \ge \Delta^{(t)}$ and the radius cannot decrease until $\Delta^{(k)} > \tilde{\Delta}$ for some $k > t$. Because $\gamma\_{\downdownarrows} \in (0, 1)$ is the severest possible shrinking factor in Algorithm 2, we therefore know that $\Delta^{(t)}$ can never be actively shrunk to a value below $\gamma\_{\downdownarrows}\tilde{\Delta}$.

Combining both bounds on $\Delta^{(t)}$ results in

$$
\Delta^{(t)} \ge \Delta^{\text{lb}} := \min \{ \beta \varpi\_{\text{m}}^{\text{lb}}, \gamma\_{\downdownarrows} \tilde{\Delta}, \Delta\_\*^{(0)} \} \qquad \forall t \in \mathbb{N}\_{0},
$$

where we have again used the fact that $\Delta\_\*^{(t)}$ cannot be reduced further if it is less than or equal to $\tilde{\Delta}$ due to the update mechanism in Algorithm 2.

We can now state the first convergence result:

**Theorem 5.** *Suppose that Assumptions 1, 3 and 6 to 8 hold. If Algorithm 2 has only a finite number* $0 \le |\mathcal{S}| < \infty$ *of successful iterations* $\mathcal{S} = \{t \in \mathbb{N}\_0 : \rho^{(t)} \ge \nu\_{++}\}$ *then*

$$\lim\_{t \to \infty} \varpi \left( \mathbf{x}^{(t)} \right) = 0.$$

**Proof.** If the criticality loop runs infinitely, then the result follows from Lemma 8.

Otherwise, let $t\_0$ be any index larger than the last successful index (or $t\_0 \ge 0$ if $\mathcal{S} = \emptyset$). All $t \ge t\_0$ then must be model-improving, acceptable or inacceptable. In all cases, the trust region radius $\Delta^{(t)}$ is never increased. Due to Assumption 7, the number of successive model-improvement steps is bounded above by $\mathrm{M} \in \mathbb{N}$. Hence, $\Delta^{(t)}$ is decreased by a factor of $\gamma \in [\gamma\_{\downdownarrows}, \gamma\_{\downarrow}] \subseteq (0, 1)$ at least once every $\mathrm{M}$ iterations. Thus,

$$\sum\_{t>t\_0}^{\infty} \Delta^{(t)} \le \mathrm{M} \sum\_{i=1}^{\infty} \gamma\_\downarrow^i \Delta^{(t\_0)} = \frac{\mathrm{M}\gamma\_\downarrow}{1-\gamma\_\downarrow} \Delta^{(t\_0)},$$

and $\Delta^{(t)}$ must go to zero for $t \to \infty$.

Clearly, for any $\tau \ge t\_0$, the iterates (and trust region centers) $\mathbf{x}^{(\tau)}$ and $\mathbf{x}^{(t\_0)}$ cannot be further apart than the sum of all subsequent trust region radii, i.e.,

$$\left\|\mathbf{x}^{(\tau)} - \mathbf{x}^{(t\_0)}\right\| \le \sum\_{t \ge t\_0}^{\infty} \Delta^{(t)} \le \frac{\mathrm{M}\gamma\_\downarrow}{1 - \gamma\_\downarrow} \Delta^{(t\_0)}.$$

The RHS goes to zero as we let $t\_0$ go to infinity and so must the norm on the LHS, i.e.,

$$\lim\_{t\_0 \to \infty} \left\| \mathbf{x}^{(\tau)} - \mathbf{x}^{(t\_0)} \right\| = 0. \tag{25}$$

Now let $\tau = \tau(t\_0) \ge t\_0$ be the first iteration index so that $\mathbf{m}^{(\tau)}$ is fully linear. Then

$$\left| \varpi^{(t\_0)} \right| \leq \left| \varpi^{(t\_0)} - \varpi^{(\tau)} \right| + \left| \varpi^{(\tau)} - \varpi\_{\mathbf{m}}^{(\tau)} \right| + \left| \varpi\_{\mathbf{m}}^{(\tau)} \right|,$$

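and for the terms on the right and for $t\_0 \to \infty$, we find:

• Because $\varpi(\bullet)$ is uniformly continuous (Assumptions 1 and 6), the first term goes to zero by (25).

• Because $\mathbf{m}^{(\tau)}$ is fully linear, Corollary 5 bounds the second term by $\sqrt{\mathsf{c}}\, \dot{\epsilon}\, \Delta^{(\tau)}$, which goes to zero.

• Iteration $\tau \ge t\_0$ is not successful although $\mathbf{m}^{(\tau)}$ is fully linear, so Lemma 10 implies $\varpi\_{\rm m}^{(\tau)} < 2\lambda\Delta^{(\tau)} / \left(\kappa\_{\rm m}^{\rm sd}(1 - \nu\_{++})\right)$ and the third term goes to zero as well.

We conclude that the left side, $\varpi(\mathbf{x}^{(t\_0)})$, goes to zero as well for $t\_0 \to \infty$.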

We now address the case of infinitely many successful iterations, first for the surrogate measure $\varpi\_{\mathbf{m}}^{(t)}(\bullet)$ and then for $\varpi(\bullet)$. We show that the criticality measures are not bounded away from zero.

We start with the observation that in any case the trust region radius converges to zero:

**Lemma 12.** *If Assumptions 1, 3 and 6 to 8 hold, then the subsequence of trust region radii generated by Algorithm 2 goes to zero, i.e.,* $\lim\_{t \to \infty} \Delta^{(t)} = 0$.

**Proof.** We have shown in the proof of Theorem 5 that this is the case for finitely many successful iterations.

Suppose there are infinitely many successful iterations. Take any successful index $t \in \mathcal{S}$. Then $\rho^{(t)} \ge \nu\_{++}$ and from Assumption 8 it follows for $\mathbf{x}^{(t+1)} = \mathbf{x}\_{+}^{(t)} = \mathbf{x}^{(t)} + \mathbf{s}^{(t)}$ that

$$\Phi(\mathbf{x}^{(t)}) - \Phi(\mathbf{x}\_{+}^{(t)}) \ge \nu\_{++} \left(\Phi\_{\mathbf{m}}^{(t)}(\mathbf{x}^{(t)}) - \Phi\_{\mathbf{m}}^{(t)}(\mathbf{x}\_{+}^{(t)})\right) \overset{(19)}{\geq} \nu\_{++} \kappa\_{\mathbf{m}}^{\text{sd}} \varpi\_{\mathbf{m}}^{(t)} \min\left\{\frac{\varpi\_{\mathbf{m}}^{(t)}}{\mathsf{c}\mathrm{H}\_{\mathbf{m}}}, \Delta^{(t)}\right\}.$$

The criticality step ensures that $\varpi\_{\mathbf{m}}^{(t)} \ge \min\left\{\varepsilon\_{\text{crit}}, \Delta^{(t)}/\mu\right\}$ so that

$$\Phi(\mathbf{x}^{(t)}) - \Phi(\mathbf{x}\_{+}^{(t)}) \ge \nu\_{++} \kappa\_{\mathbf{m}}^{\text{sd}} \min \left\{ \varepsilon\_{\text{crit}}, \frac{\Delta^{(t)}}{\mu} \right\} \min \left\{ \frac{\min \left\{ \varepsilon\_{\text{crit}}, \Delta^{(t)}/\mu \right\}}{\mathsf{c}\mathrm{H}\_{\mathbf{m}}}, \Delta^{(t)} \right\} \ge 0. \tag{26}$$

Now the right hand side has to go to zero: Suppose it was bounded below by a positive constant *ε* > 0. We could then compute a lower bound on the improvement from the first iteration with index 0 up to *t* + 1 by summation

$$\Phi(\mathbf{x}^{(0)}) - \Phi(\mathbf{x}^{(t+1)}) \ge \sum\_{\tau \in \mathcal{S}\_t} \left( \Phi(\mathbf{x}^{(\tau)}) - \Phi(\mathbf{x}^{(\tau+1)}) \right) \ge |\mathcal{S}\_t|\varepsilon$$

where $\mathcal{S}\_t = \mathcal{S} \cap \{0, \ldots, t\}$ are all successful indices with a maximum index of $t$. Because $\mathcal{S}$ is unbounded, the right side diverges for $t \to \infty$ and so must the left side in contradiction to $\Phi$ being bounded below by Assumption 5. From (26) we see that this implies $\Delta^{(t)} \to 0$ for $t \in \mathcal{S}$, $t \to \infty$.

Now consider any sequence $\mathcal{T} \subseteq \mathbb{N}$ of indices that are not necessarily successful, i.e., $|\mathcal{T} \setminus \mathcal{S}| \ge 0$. The radius is only ever increased in successful iterations and at most by a factor of $\gamma\_{\uparrow}$. Since $\mathcal{S}$ is unbounded, there is for any $\tau \in \mathcal{T}$ a largest $t\_\tau \in \mathcal{S}$ with $t\_\tau \le \tau$. Then $\Delta^{(\tau)} \le \gamma\_{\uparrow}\Delta^{(t\_\tau)}$ and because of $\Delta^{(t\_\tau)} \to 0$ it follows that

$$\lim\_{\substack{\tau \in \mathcal{T}, \\ \tau \to \infty}} \Delta^{(\tau)} = 0,$$

which concludes the proof.

**Lemma 13.** *Suppose Assumptions 1, 3 and 5 to 8 hold. For the iterates produced by Algorithm 2 it holds that*

$$\liminf\_{t \to \infty} \varpi\_{\mathbf{m}}^{(t)} \left( \mathbf{x}^{(t)} \right) = 0.$$

**Proof.** For a contradiction, suppose that $\liminf\_{t \to \infty} \varpi\_{\mathbf{m}}^{(t)}(\mathbf{x}^{(t)}) \neq 0$. Then there is a constant $\varpi\_{\rm m}^{\rm lb} > 0$ with $\varpi\_{\rm m}^{(t)} \ge \varpi\_{\rm m}^{\rm lb}$ for all $t \in \mathbb{N}\_0$. According to Lemma 11, there exists a constant $\Delta^{\text{lb}} > 0$ with $\Delta^{(t)} \ge \Delta^{\text{lb}}$ for all $t$. This contradicts Lemma 12.

The next result allows us to transfer the result to $\varpi(\bullet)$.

**Lemma 14.** *Suppose Assumptions 1, 6 and 7 hold. For any subsequence* $\{t\_i\}\_{i \in \mathbb{N}} \subseteq \mathbb{N}\_0$ *of iteration indices of Algorithm 2 with*

$$\lim\_{i \to \infty} \varpi\_{\mathbf{m}}^{(t\_i)} \left( \mathbf{x}^{(t\_i)} \right) = 0,\tag{27}$$

*it also holds that*

$$\lim\_{i \to \infty} \varpi\left(\mathbf{x}^{(t\_i)}\right) = 0.\tag{28}$$

**Proof.** By (27), $\varpi\_{\mathbf{m}}^{(t\_i)} < \varepsilon\_{\text{crit}}$ for sufficiently large $i$. If $\mathbf{x}^{(t\_i)}$ is critical for (MOP), then the result follows from Lemma 8. Otherwise, $\mathbf{m}^{(t\_i)}$ is fully linear on $B\left(\mathbf{x}^{(t\_i)}; \Delta^{(t\_i)}\right)$ for some $\Delta^{(t\_i)} \le \mu \varpi\_{\mathbf{m}}^{(t\_i)}$. From Corollary 5 it follows that

$$\left|\varpi\_{\mathbf{m}}^{(t\_{i})} - \varpi^{(t\_{i})}\right| \leq \sqrt{\mathsf{c}}\, \dot{\epsilon}\, \Delta^{(t\_{i})} \leq \sqrt{\mathsf{c}}\, \dot{\epsilon}\, \mu\, \varpi\_{\mathbf{m}}^{(t\_{i})}.$$

The triangle inequality yields

$$\varpi^{(t\_i)} \le \left| \varpi^{(t\_i)} - \varpi\_{\mathbf{m}}^{(t\_i)} \right| + \varpi\_{\mathbf{m}}^{(t\_i)} \le \left(\sqrt{\mathsf{c}}\, \dot{\epsilon}\, \mu + 1\right)\varpi\_{\mathbf{m}}^{(t\_i)}$$

for sufficiently large $i$ and (27) then implies (28).

The next global convergence result immediately follows from Theorem 5 and Lemmas 13 and 14:

**Theorem 6.** *Suppose Assumptions 1, 3 and 5 to 8 hold. Then* $\liminf\_{t \to \infty} \varpi(\mathbf{x}^{(t)}) = 0$.

This shows that if the iterates are bounded, then there is a subsequence of iterates in R*<sup>n</sup>* approximating a Pareto critical point. We next show that *all* limit points of a sequence generated by Algorithm 2 are Pareto critical. 

**Theorem 7.** *Suppose Assumptions 1 and 3 to 8 hold. Then* $\lim\_{t \to \infty} \varpi(\mathbf{x}^{(t)}) = 0$.

**Proof.** We have already proven the result for finitely many successful iterations, see Theorem 5. We thus suppose that S is unbounded.

For the purpose of establishing a contradiction, suppose that there exists a sequence $\{t\_j\}\_{j \in \mathbb{N}}$ of indices that are successful or acceptable with

$$
\varpi\left(\mathbf{x}^{(t\_j)}\right) \ge 2\varepsilon > 0 \quad \text{for some } \varepsilon > 0 \text{ and all } j. \tag{29}
$$

We can ignore model-improving and inacceptable iterations: During those the iterate does not change, and we find a larger acceptable or successful index with the same criticality value. 

From Theorem 6 we obtain that for every such $t\_j$, there exists a first index $\tau\_j > t\_j$ such that $\varpi(\mathbf{x}^{(\tau\_j)}) < \varepsilon$. We thus find another subsequence indexed by $\{\tau\_j\}$ such that

$$
\varpi^{(t)} \ge \varepsilon \text{ for } t\_j \le t < \tau\_j \text{ and } \varpi^{(\tau\_j)} < \varepsilon. \tag{30}
$$

Using (29) and (30), it also follows from a triangle inequality that

$$\left|\varpi^{(t\_j)} - \varpi^{(\tau\_j)}\right| \ge \varpi^{(t\_j)} - \varpi^{(\tau\_j)} > 2\varepsilon - \varepsilon = \varepsilon \qquad \forall j \in \mathbb{N}.\tag{31}$$

With $\{t\_j\}$ and $\{\tau\_j\}$ as in (30), define the following subset of indices

$$\mathcal{T} = \left\{ t \in \mathbb{N}\_0 : \exists j \in \mathbb{N} \text{ such that } t\_j \le t < \tau\_j \right\}.$$

By (30) we have $\varpi^{(t)} \ge \varepsilon$ for $t \in \mathcal{T}$, and due to Lemma 14, we also know that then $\varpi\_{\mathbf{m}}^{(t)}$ cannot go to zero either, i.e., there is some $\varepsilon\_{\mathbf{m}} > 0$ such that

$$
\varpi\_{\mathbf{m}}^{(t)} \ge \varepsilon\_{\mathbf{m}} > 0 \qquad \forall t \in \mathcal{T}.
$$

From Lemma 12 we know that $\Delta^{(t)} \to 0$ for $t \to \infty$ so that by Corollary 6, any sufficiently large $t \in \mathcal{T}$ must be either successful or model-improving (if $\mathbf{m}^{(t)}$ is not fully linear). For $t \in \mathcal{T} \cap \mathcal{S}$, it follows from Assumption 8 that

$$\Phi(\mathbf{x}^{(t)}) - \Phi(\mathbf{x}^{(t+1)}) \ge \nu\_{++} \left( \Phi\_{\mathbf{m}}(\mathbf{x}^{(t)}) - \Phi\_{\mathbf{m}}(\mathbf{x}^{(t+1)}) \right) \ge \nu\_{++} \kappa\_{\mathbf{m}}^{\mathrm{sd}} \varepsilon\_{\mathbf{m}} \min \left\{ \frac{\varepsilon\_{\mathbf{m}}}{\mathsf{c}\mathrm{H}\_{\mathbf{m}}}, \Delta^{(t)} \right\} \ge 0.$$

If $t \in \mathcal{T} \cap \mathcal{S}$ is sufficiently large, we have $\Delta^{(t)} \le \frac{\varepsilon\_{\mathbf{m}}}{\mathsf{c}\mathrm{H}\_{\mathbf{m}}}$ and

$$
\Delta^{(t)} \le \frac{1}{\nu\_{++} \kappa\_{\text{m}}^{\text{sd}} \varepsilon\_{\text{m}}} \left( \Phi(\mathbf{x}^{(t)}) - \Phi(\mathbf{x}^{(t+1)}) \right).
$$

Since the iteration is either successful or model-improving for sufficiently large $t \in \mathcal{T}$, and since $\mathbf{x}^{(t)} = \mathbf{x}^{(t+1)}$ for a model-improving iteration, we deduce from the previous inequality that

$$\left\|\mathbf{x}^{(t\_j)} - \mathbf{x}^{(\tau\_j)}\right\| \le \sum\_{\substack{t=t\_j,\\t \in \mathcal{T} \cap \mathcal{S}}}^{\tau\_j - 1} \left\|\mathbf{x}^{(t)} - \mathbf{x}^{(t+1)}\right\| \le \sum\_{\substack{t=t\_j,\\t \in \mathcal{T} \cap \mathcal{S}}}^{\tau\_j - 1} \Delta^{(t)} \le \frac{1}{\nu\_{++} \kappa\_{\mathbf{m}}^{\text{sd}} \varepsilon\_{\mathbf{m}}} \left(\Phi(\mathbf{x}^{(t\_j)}) - \Phi(\mathbf{x}^{(\tau\_j)})\right)$$

for $j \in \mathbb{N}$ sufficiently large. The sequence $\left\{\Phi(\mathbf{x}^{(t)})\right\}\_{t \in \mathbb{N}\_0}$ is bounded below (Assumption 5) and monotonically decreasing by construction. Hence, the RHS above must converge to zero for $j \to \infty$. This implies $\lim\_{j \to \infty} \left\|\mathbf{x}^{(t\_j)} - \mathbf{x}^{(\tau\_j)}\right\| = 0$.

Because of Assumptions 1 and 6, $\varpi(\bullet)$ is uniformly continuous so that then

$$\lim\_{j \to \infty} \left( \varpi\left(\mathbf{x}^{(t\_j)}\right) - \varpi\left(\mathbf{x}^{(\tau\_j)}\right) \right) = 0,$$

which is a contradiction to (31). Thus, no subsequence of acceptable or successful indices as in (29) can exist.

#### **7. Numerical Examples**

In this section we provide some more details on the actual implementation of Algorithm 2 and present the results of various experiments. We compare different surrogate model types with regard to their efficacy (in terms of expensive objective evaluations) and their ability to find Pareto critical points.

#### *7.1. Implementation Details*

We implemented the framework in the Julia language (the code is available under https://github.com/manuelbb-upb/Morbit.jl, accessed on 15 April 2021) and used the surrogate construction algorithms from Sections 4.2 and 4.3. Concerning the RBF models, the algorithms are thus the same as in [41]. The OSQP solver [45] is used to solve (Pm). For non-linear problems we use the NLopt.jl [46] package. More specifically, we use the MMA algorithm [47] in conjunction with DynamicPolynomials.jl [48] to construct the Lagrange polynomials. The Pascoletti–Serafini subproblems are solved using the population-based ISRES method [49] with MMA for polishing. The derivatives of cheap objective functions are obtained by means of automatic differentiation [50] and Taylor models use FiniteDiff.jl.

In accordance with Algorithm 2, we perform the shrinking trust region update via

$$\Delta^{(t+1)} \gets \begin{cases} \gamma\_{\downdownarrows} \Delta^{(t)} & \text{if } \rho^{(t)} < \nu\_{+}, \\ \gamma\_{\downarrow} \Delta^{(t)} & \text{if } \rho^{(t)} < \nu\_{++}. \end{cases}$$

Note that for box-constrained problems we internally scale the feasible set to the unit hypercube $[0, 1]^n$ and all radii are measured with regard to this scaled domain.
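As a rough illustration of these two implementation details, the following sketch shows a shrinking radius update and the scaling of a box to the unit hypercube. The function names, the thresholds and the shrinking factors are assumptions made for this example; the growth of the radius after very successful steps is deliberately left out because only the shrinking branch is described here.

```python
from typing import List, Sequence


def update_radius(delta: float, rho: float, nu_plus: float, nu_pp: float,
                  gamma_shrink_much: float, gamma_shrink: float) -> float:
    """Shrinking branch of a trust region update (cases are read top-down)."""
    if rho < nu_plus:
        return gamma_shrink_much * delta   # poor agreement: shrink strongly
    if rho < nu_pp:
        return gamma_shrink * delta        # moderate agreement: shrink mildly
    return delta                           # otherwise the radius is not shrunk


def to_unit_cube(x: Sequence[float], lb: Sequence[float],
                 ub: Sequence[float]) -> List[float]:
    """Map a point of the box [lb, ub] to [0, 1]^n."""
    return [(xi - lo) / (hi - lo) for xi, lo, hi in zip(x, lb, ub)]


def from_unit_cube(z: Sequence[float], lb: Sequence[float],
                   ub: Sequence[float]) -> List[float]:
    """Map a point of [0, 1]^n back to the box [lb, ub]."""
    return [lo + zi * (hi - lo) for zi, lo, hi in zip(z, lb, ub)]


# Assumed example values: thresholds 0.1 and 0.4, factors 0.51 and 0.75.
print(update_radius(1.0, 0.05, 0.1, 0.4, 0.51, 0.75))
print(to_unit_cube([1.5, 15.0], [1.0, 0.0], [2.0, 30.0]))
```

Measuring radii in the scaled domain means that a radius of 0.1 covers ten percent of each variable's range, regardless of the original bounds.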

For **stopping**, we use a disjunction of different criteria:

• We have an upper bound $\mathrm{N}\_{\text{it.}} \in \mathbb{N}$ on the maximum number of iterations and an upper bound $\mathrm{N}\_{\text{exp.}} \in \mathbb{N}$ on the number of expensive objective evaluations.

• The surrogate criticality naturally allows for a stopping test and due to Lemma 11 the trust region radius can also be used (see also [33] [Sec. 5]). We combine this with a relative tolerance test and stop if

$$
\Delta^{(t)} \le \Delta\_{\text{min}} \text{ OR } \left( \Delta^{(t)} \le \Delta\_{\text{crit}} \text{ AND } \omega \left( \mathbf{x}^{(t)} \right) \le \omega\_{\text{min}} \right).
$$

• We also apply the relative tolerance tests

$$\begin{aligned} \left\| \mathbf{x}^{(t)} - \mathbf{x}^{(t+1)} \right\|\_{\infty} &\leq \delta\_{\mathbf{x}} \left\| \mathbf{x}^{(t)} \right\|\_{\infty} \quad \text{and} \\ \left\| \mathbf{f}(\mathbf{x}^{(t)}) - \mathbf{f}(\mathbf{x}^{(t+1)}) \right\|\_{\infty} &\leq \delta\_{f} \left\| \mathbf{f}(\mathbf{x}^{(t)}) \right\|\_{\infty} \end{aligned}$$

to provoke early stopping.
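The disjunction of stopping criteria listed above might be combined as in the following sketch. It is not taken from the Julia implementation; the function name, the argument names and the example tolerances are assumptions, and a real implementation would likely apply the relative tests only after an actual step.

```python
from typing import Sequence


def inf_norm(v: Sequence[float]) -> float:
    """Maximum norm of a vector."""
    return max(abs(vi) for vi in v)


def should_stop(it: int, n_evals: int, delta: float, crit: float,
                x: Sequence[float], x_next: Sequence[float],
                fx: Sequence[float], fx_next: Sequence[float], *,
                max_iter: int, max_evals: int, delta_min: float,
                delta_crit: float, crit_min: float,
                tol_x: float, tol_f: float) -> bool:
    """Return True if any of the stopping criteria is met."""
    # Budget on iterations and on expensive evaluations.
    if it >= max_iter or n_evals >= max_evals:
        return True
    # Radius test, optionally combined with the criticality value.
    if delta <= delta_min or (delta <= delta_crit and crit <= crit_min):
        return True
    # Relative change of the iterate and of the objective values.
    dx = inf_norm([a - b for a, b in zip(x, x_next)])
    df = inf_norm([a - b for a, b in zip(fx, fx_next)])
    return dx <= tol_x * inf_norm(x) and df <= tol_f * inf_norm(fx)


settings = dict(max_iter=100, max_evals=50, delta_min=1e-6, delta_crit=1e-3,
                crit_min=1e-3, tol_x=1e-2, tol_f=1e-2)
print(should_stop(1, 10, 0.1, 1.0, [1.0], [1.001], [2.0], [2.001], **settings))
```

Keeping every criterion in one predicate makes it easy to see that a single satisfied condition ends the run, which is the behaviour a disjunction implies.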

#### *7.2. A First Example*

We ran our method on a multitude of academic test problems with a varying number of decision variables *n* and objective functions *k*. We were able to approximate Pareto critical points in both cases, if we treat the problems as heterogeneous and if we declare them as expensive. We benchmarked RBF against polynomial models, because in [33] it was shown that a trust region method using second degree Lagrange polynomials outperforms commercial solvers on scalarized problems. Most often, RBF surrogates outperform other model types with regard to the number of expensive function evaluations. 

This is illustrated in Figure 2. It shows two runs of Algorithm 2 on the non-convex problem (T6), taken from [38]:

$$\min\_{\mathbf{x}\in\mathcal{X}} \begin{bmatrix} x\_{1} + \ln(x\_{1}) + x\_{2}^{2} \\ x\_{1}^{2} + x\_{2}^{4} \end{bmatrix}, \quad \mathcal{X} = [\varepsilon, 30] \times [0, 30] \subseteq \mathbb{R}^{2}, \quad \varepsilon = 10^{-12}.\tag{T6}$$

**Figure 2.** Two runs with maximum number of expensive evaluations set to 20 (soft limit). Test points are light-gray, the iterates are black, final iterate is red, white markers show other points where the objectives are evaluated. The successive trust regions are also shown. (**a**) Using Radial Basis Function (RBF) surrogate models we converge to the optimum using only 12 expensive evaluations. (**b**) Quadratic Lagrange models do not reach the optimum using 19 evaluations. (**c**) Iterations and test points in the objective space.

The first objective function is treated as expensive while the second is cheap. In contrast to most other MOPs, there is only one solution, the Pareto optimal point $[\varepsilon, 0]^T$. When we set a very restrictive limit of $\mathrm{N}\_{\text{exp.}} = 20$, we run out of budget with second degree Lagrange surrogates before we reach the optimum, see Figure 2b. As evident in Figure 2a, surrogates based on (cubic) RBF require significantly less training data. For the RBF models the algorithm stopped after two criticality loops, and the model refinement during these loops is made clear by the samples on the problem boundary converging to zero. The complete set of relevant parameters for the test runs is given in Table 2. We used a strict acceptance test and the strict Pareto–Cauchy step.
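For readers who want to experiment with (T6), the two objectives can be written down in a few lines. The sketch below is a plain Python transcription of the problem, not part of the authors' Julia code, and the names `f1`, `f2` and `X_STAR` are chosen only for this example. Both objectives increase in each variable on the feasible box, so the lower-left corner minimizes both simultaneously.

```python
import math
from typing import Sequence, Tuple

EPS = 1e-12
LOWER = (EPS, 0.0)    # lower bounds of the feasible box
UPPER = (30.0, 30.0)  # upper bounds of the feasible box


def f1(x: Sequence[float]) -> float:
    """First objective (treated as expensive in the experiment)."""
    return x[0] + math.log(x[0]) + x[1] ** 2


def f2(x: Sequence[float]) -> float:
    """Second objective (treated as cheap in the experiment)."""
    return x[0] ** 2 + x[1] ** 4


def objectives(x: Sequence[float]) -> Tuple[float, float]:
    return f1(x), f2(x)


X_STAR = (EPS, 0.0)  # the single Pareto optimal point

# Compare the optimum against a few other feasible points.
for point in [(0.5, 0.5), (1.0, 0.0), (30.0, 30.0)]:
    print(point, objectives(point), ">=", objectives(X_STAR))
```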


**Table 2.** Parameters for Figure 2, radii relative to $[0, 1]^n$.

#### *7.3. Benchmarks on Scalable Test-Problems*

To assess the performance with a growing number of decision variables *n*, we performed tests on scalable problems of the ZDT and DTLZ family [51,52]. Figure 3 shows results for the bi-objective problems ZDT1-ZDT3 and for the *k*-objective problems DTLZ1 and DTLZ6 (we used *k* = max{2, *n* − 4} objectives). All problems are box constrained. Twelve feasible starting points (from the Halton sequence) were generated for each problem setting, i.e., for each combination of *n*, a test problem and a descent method. The acceptance test and the backtracking were strict.
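Starting points from the Halton sequence can be generated without external libraries, as the following sketch shows. It assumes the plain (unscrambled) sequence with the first primes as bases and starts at index 1; whether the experiments used skipping or scrambling is not stated, so this is only one possible variant, and all function names are chosen for this example.

```python
from typing import List, Sequence


def radical_inverse(index: int, base: int) -> float:
    """Van der Corput radical inverse of a positive integer in a given base."""
    result, scale = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * scale
        scale /= base
    return result


def first_primes(count: int) -> List[int]:
    """Return the first `count` prime numbers by trial division."""
    primes: List[int] = []
    candidate = 2
    while len(primes) < count:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def halton_in_box(num_points: int, lb: Sequence[float],
                  ub: Sequence[float]) -> List[List[float]]:
    """First `num_points` Halton points, scaled to the box [lb, ub]."""
    bases = first_primes(len(lb))
    return [
        [lo + radical_inverse(i, b) * (hi - lo) for lo, hi, b in zip(lb, ub, bases)]
        for i in range(1, num_points + 1)
    ]


# Twelve feasible starting points for a problem with n = 5 variables.
starts = halton_in_box(12, [0.0] * 5, [1.0] * 5)
print(starts[0])
```

A low-discrepancy sequence spreads the starting points more evenly over the box than independent random draws, which helps when only a dozen runs per setting are averaged.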

**Value** $10^{-3}$ 20 2 $2 \times 10^{3}$ $10^{3}$ 0.5 $10^{-3}$ 0.1 0.1 0.4 0.51 0.75 2

**Figure 3.** Average number of expensive objective evaluations by number of decision variables *n*, surrogate type and descent method. "SD" refers to steepest descent and "PS" to Pascoletti–Serafini. "LP1" (orange) are linear Lagrange models, "LP2" (yellow) quadratic Lagrange models, "TP1" (blue) are linear Taylor polynomials based on finite differences and "cubic" (black) refers to cubic RBF models. Additionally the results for weighted sum runs are shown in green, using the COBYLA solver and a single objective variant of the trust region framework, ORBIT.

In all cases the first objective was considered cheap and all other objectives expensive. First and second degree Lagrange models were compared against linear Taylor models and (cubic) RBF surrogates. The Lagrange models were built using a $\Lambda$-poised set, with $\Lambda = 1.5$. In the case of quadratic models we used a precomputed set of points for $n \ge 6$. The Taylor models used finite differences and points outside of box constraints were simply projected back onto the boundary. The RBF models were allowed to include up to $(n + 1)(n + 2)/2$ training points from the database if $n \le 10$ and else the maximum number of points was $2n + 1$. Points were first selected from a box of radius $\theta\_1 \Delta^{(t)}$ with $\theta\_1 = 2$ and then from a box of radius $\theta\_2 \Delta^{\text{ub}}$ with $\theta\_2 = 2$. All other parameters differing from the parameters in Table 2 are listed in Table 3. The stopping parameters were chosen so as to exit early and save evaluations.


**Table 3.** Parameters for Figure 3, radii relative to $[0, 1]^n$.

As expected, the second degree Lagrange polynomials require the most objective evaluations and the quadratic dependence on $n$ is clearly visible in Figure 3, and the quadratic growth of the dark-blue line continues for $n \ge 8$. On average, the linear Lagrange models perform better than the linear Taylor polynomials when using the steepest descent steps, which is also in accordance with our expectations, because only $n + 1$ points are needed for each model (versus $2n$ points). Most models, even the linear ones, profit from using the Pascoletti–Serafini subproblems (see Appendix B) over the steepest descent steps. By far the fewest evaluations (on average) are needed for the RBF models: the black line consistently stays below all other data points. Note that the RBF models likely appear to perform slightly better with the steepest descent steps because of the early stopping. In other experiments we noticed that RBF models with Pascoletti–Serafini steps can save evaluations when more precise solutions are required.
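The growth rates discussed here follow directly from the number of sample points each model type needs. The helper below tabulates these counts; the counts for linear Lagrange models, linear Taylor models and the RBF cap are the ones quoted in the text, while the count for a full quadratic polynomial, $(n+1)(n+2)/2$, is the standard dimension of that polynomial space. The function names are chosen for this example only.

```python
def points_linear_lagrange(n: int) -> int:
    """Points needed to determine a linear polynomial in n variables."""
    return n + 1


def points_quadratic_lagrange(n: int) -> int:
    """Points needed to determine a full quadratic polynomial in n variables."""
    return (n + 1) * (n + 2) // 2


def points_linear_taylor(n: int) -> int:
    """Point count quoted in the text for the finite-difference Taylor models."""
    return 2 * n


def max_points_rbf(n: int) -> int:
    """Cap on RBF training points: quadratic for n <= 10, linear beyond."""
    return (n + 1) * (n + 2) // 2 if n <= 10 else 2 * n + 1


print(" n  LP1  LP2  TP1  RBF(max)")
for n in (2, 5, 10, 15):
    print(f"{n:2d} {points_linear_lagrange(n):4d} {points_quadratic_lagrange(n):4d} "
          f"{points_linear_taylor(n):4d} {max_points_rbf(n):9d}")
```

Note that the RBF figure is only an upper limit on points reused from the database, not a requirement, which is consistent with the RBF runs needing the fewest new evaluations.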

For comparison, we also used the weighted sum approach with the single objective ∑<sub>ℓ</sub> *f*<sub>ℓ</sub> on each problem instance. We tested both the derivative-free COBYLA solver (described in [53] and implemented by NLopt.jl) and the trust region method using steepest descent and cubic RBF models, i.e., our own implementation of ORBIT [34]. Both solvers were restricted to the same number of maximum function evaluations. In fact, ORBIT was configured with the exact same parameters as in Table 3 and the relative stopping tolerances for COBYLA were *δ*<sub>*x*</sub> = *δ*<sub>*f*</sub> = 10<sup>−2</sup>. Although COBYLA also uses linear models, it requires significantly more evaluations than most other algorithms. The results of the ORBIT scalarization are more comparable to those of the multiobjective runs.

#### 7.3.1. Solution Quality

Figure 4 illustrates that RBF models perform better not only on average, but also overall. With regard to the final solution criticality, there are a few outliers, mostly due to DTLZ1 (see also Figure 5). However, in most cases the solution criticality is acceptable, except for the linear Lagrange models. Moreover, Figure 5 shows that a good percentage of problem instances is solved with RBF, especially when compared to the other linear models. Note that in cases where the true objectives are not differentiable at the final iterate, *ω* was set to 0 because the selected problems are non-differentiable only in Pareto optimal points. In Figure 5 it also becomes apparent that the bi-objective DTLZ1 instances were the most challenging for all algorithms. DTLZ1 has many local minima and it is likely to exit early near such a local minimum due to repeated unsuccessful iterations. Likewise, ZDT3 is "flat" towards the true Pareto Front so that it becomes hard to make progress there.

**Figure 4.** Box-plots of the number of evaluations and the solution criticality for *n* = 5 and *n* = 15 for the runs from Figure 3. Outliers are not shown. "WS\_C" and "WS\_O" refer to the weighted sum approach using COBYLA and ORBIT, respectively.

**Figure 5.** Each group of bars shows the percentage of solved problem instances, i.e., test runs where the final solution criticality has a value below 0.1. From left to right, the bars correspond to the Trust Region Method (TRM) using linear Lagrange polynomials, the TRM with quadratic Lagrange polynomials, TRM with linear Taylor polynomials, weighted sum with COBYLA, weighted sum with ORBIT and TRM with cubic RBF. Per model and *n*-value there were 60 runs.
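The "solved" criterion used in Figure 5 is a plain threshold count. A minimal sketch, assuming the final criticality values of a batch of runs are available as a list (the function name `percent_solved` is hypothetical):

```python
def percent_solved(final_criticalities, threshold=0.1):
    """Percentage of runs whose final criticality lies below the threshold."""
    solved = sum(1 for value in final_criticalities if value < threshold)
    return 100.0 * solved / len(final_criticalities)
```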

Besides criticality, another metric of interest is the spread of solutions for different starting points. Figure 6 shows the final iterates when the algorithm is applied to the bi-objective problems ZDT1 and ZDT2 for 10 different starting points. Additionally, the problems are solved using the weighted sum approach with the derivative-free COBYLA solver. For each starting point the optimizers were allowed 30 objective evaluations and no data were re-used between runs.

**Figure 6.** Final iterates in objective space for the bi-objective problems ZDT1 and ZDT2 in 10 variables. The weighted sum method (WS) is compared against the trust region method using steepest descent (DS) and the Pascoletti–Serafini (PS) method.

As can be seen, for these problems, the trust region method readily reaches the critical set using only 30 evaluations. Here, the steepest descent direction tends to generate solutions on the problem boundary when applied in such a global manner, i.e., with relatively large trust region radii (Δ(0) = 0.1 and Δub = 0.5). Nonetheless, the method remains applicable for local refinement of approximate solutions, e.g., after a coarse search for good starting points using global methods or as a corrector in continuation frameworks. The Pascoletti–Serafini step can be employed with different reference points/directions to provide a better covering than both the steepest descent steps and the weighted sum approach. For Figure 6, the points {[0, −10*i*], *i* = 1, . . . , 10} were used. The weighted sum approach (with fixed weights) tends to produce clustered solutions. Especially for the non-convex problem ZDT2, only the boundary points of the true Pareto Front are reached, as expected [1].
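Why the weighted sum only reaches boundary points on ZDT2 can be checked directly on the known Pareto fronts, *f*<sub>2</sub> = 1 − √*f*<sub>1</sub> for ZDT1 (convex) and *f*<sub>2</sub> = 1 − *f*<sub>1</sub><sup>2</sup> for ZDT2 (non-convex). The sketch below is an illustration only: it minimizes the weighted sum over a grid on the front instead of running a solver, and the function names and grid size are assumptions.

```python
import math


def zdt1_front(f1):
    # Pareto front of ZDT1 (convex).
    return 1.0 - math.sqrt(f1)


def zdt2_front(f1):
    # Pareto front of ZDT2 (non-convex).
    return 1.0 - f1 ** 2


def weighted_sum_minimizer(front, w, samples=101):
    """Return the f1 value on the front that minimizes w*f1 + (1-w)*f2."""
    grid = [i / (samples - 1) for i in range(samples)]
    return min(grid, key=lambda f1: w * f1 + (1.0 - w) * front(f1))
```

On the ZDT2 front the weighted sum is concave in *f*<sub>1</sub>, so every weight selects one of the two end points, whereas on the ZDT1 front suitable weights select interior points.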

#### 7.3.2. RBF Comparison

Furthermore, we compared the RBF kernels from Table 1. In [34], the cubic kernel performs best on single-objective problems while the Gaussian does worst. As can be seen in Figure 7 this holds for multiple objective functions, too: The Gaussian and the Multiquadric require more function evaluations than the Cubic, especially in higher dimensions. If, however, we use a very simple adaptive strategy to fine-tune the shape parameter, then both kernels can finish significantly faster. In both cases, the shape parameter was set to *α* = 20/Δ(*t*) in each iteration. Nevertheless, the cubic function appears to be a good choice in general.
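The adaptive shape parameter can be sketched as below. Table 1 is not reproduced in this part of the text, so the kernel formulas are the common textbook forms and should be read as assumptions, as are the function names; only the rule *α* = 20/Δ(*t*) is taken from the text.

```python
import math


def cubic(r):
    return r ** 3


def gaussian(r, alpha=1.0):
    # Assumed form exp(-(alpha * r)^2).
    return math.exp(-(alpha * r) ** 2)


def multiquadric(r, alpha=1.0):
    # Assumed form -sqrt(1 + (alpha * r)^2).
    return -math.sqrt(1.0 + (alpha * r) ** 2)


def adaptive_shape(delta, factor=20.0):
    """Shape parameter alpha = factor / delta for trust region radius delta."""
    return factor / delta
```

With this rule the product *α*·*r* depends only on the distance relative to the trust region radius, so the kernel keeps the same profile inside the trust region while the radius shrinks.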

**Figure 7.** Each group of bars shows the influence of an adaptive shape radius on the performance of different RBF models (tested on ZDT3) for different decision space dimensions. From left to right the bars correspond to the cubic RBF, the Gaussian (with constant shape factor 1 and with adaptive shape factor 20/Δ(*t*)) and the Multiquadric (with shape factors 1 and 20/Δ(*t*)).

#### **8. Conclusions**

We have developed a trust region framework for heterogeneous and expensive multiobjective optimization problems. It is based on similar work [29–31,33] and our main contributions are the integration of constraints and of radial basis function surrogates. Subsequently, our method is provably convergent to first order critical points for unconstrained problems and when the feasible set is convex and compact, while requiring significantly fewer expensive function evaluations due to a linear scaling of model construction complexity with respect to the number of decision variables.

For future work, several modifications and extensions can likely be transferred from the single-objective to the multiobjective case. For example, the trust region update can be made step-size-dependent (rather than depending on *ρ*(*t*) alone) to allow for a more precise model refinement, see [36] ([Ch. 10]). We have also experimented with the nonlinear CG method [9] for a multiobjective Steihaug–Toint step [36] ([Ch. 7]) and early results look promising.

Going forward, we would like to apply our algorithm to a real world application, similar to what has been done in [54]. Moreover, it would be desirable to obtain not just one but multiple Pareto critical solutions. Because the Pascoletti–Serafini scalarization is still compatible with constraints, the iterations can be guided in image space by providing different global reference points. Furthermore, it is straightforward to use RBF with the heuristic methods from [55] for heterogeneous problems. We believe that it should also be possible to propagate multiple solutions and combine the TRM method with non-dominance testing as has been done in [31] and in [56]. One can think of other globalization strategies as well: RBF models have been used in multiobjective Stochastic Search algorithms [57] and trust region ideas have been included into population based strategies [26]. It will thus be interesting to see whether the theoretical convergence properties can be maintained within these contexts by employing a careful trust-region management. Finally, re-using the data sampled near the final iterate within a continuation framework like in [58] is a promising next step.

**Supplementary Materials:** Our Julia implementation of the solver is available online at https://github.com/manuelbb-upb/Morbit.jl (accessed on 15 April 2021).

**Author Contributions:** Conceptualization, M.B. and S.P.; methodology, M.B.; software, M.B.; validation, M.B. and S.P.; formal analysis, M.B. and S.P.; investigation, M.B.; writing—original draft preparation, M.B.; writing—review and editing, S.P.; visualization, M.B.; supervision, S.P.; All authors have read and agreed to the published version of the manuscript.

**Funding:** This research has been funded by the European Union and the German Federal State of North Rhine-Westphalia within the EFRE.NRW project "SET CPS".

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Appendix A. Miscellaneous Proofs**

*Appendix A.1. Continuity of the Constrained Optimal Value*

In this subsection we show the continuity of *ω*(**x**) in the constrained case, where *ω*(**x**) is the negative optimal value of (P1), i.e.,

$$
\omega(\mathbf{x}) := -\min_{\mathbf{d} \in \mathcal{X} - \mathbf{x}} \max_{\ell = 1, \dots, k} \langle \nabla f_{\ell}(\mathbf{x}), \mathbf{d} \rangle \quad \text{s.t. } \|\mathbf{d}\| \le 1.
$$

The proof of the continuity of *ω*(**x**), as stated in Theorem 1, follows the reasoning from [6], where continuity is shown for a related constrained descent direction program.
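For intuition, *ω*(**x**) can be evaluated in closed form in a special case. The sketch below assumes the unconstrained setting (X = ℝ<sup>*n*</sup>), the Euclidean norm and exactly two objectives; under these assumptions minimax duality gives *ω*(**x**) as the norm of the minimum-norm convex combination of the two gradients. The function name `omega_two_objectives` is hypothetical and the code is not taken from the authors' implementation.

```python
import math


def omega_two_objectives(g1, g2):
    """omega(x) for k = 2, X = R^n and the Euclidean norm (assumed setting).

    g1 and g2 are the gradients of the two objectives at x.
    """
    diff = [a - b for a, b in zip(g1, g2)]
    denom = sum(v * v for v in diff)
    if denom == 0.0:
        lam = 0.0  # both gradients coincide, any weight gives the same vector
    else:
        # Minimize ||g2 + lam * (g1 - g2)|| over lam in [0, 1].
        lam = -sum(b * v for b, v in zip(g2, diff)) / denom
        lam = min(1.0, max(0.0, lam))
    combo = [lam * a + (1.0 - lam) * b for a, b in zip(g1, g2)]
    return math.sqrt(sum(c * c for c in combo))
```

Opposing gradients yield the value 0, i.e., a Pareto critical point, while identical gradients yield the gradient norm, as for a single objective.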

**Proof of Item 2 in Theorem 1.** Let the requirements of Item 1 be fulfilled, i.e., let **f** be continuously differentiable and let X ⊂ ℝ<sup>*n*</sup> be convex and compact. Further, let **x** be a point in X and denote the minimizing direction in (P1) by **d**(**x**) and the optimal value by *θ*(**x**). We show that *θ*(**x**) is continuous, by which *ω*(**x**) = −*θ*(**x**) is continuous as well.

First, note the following properties of the maximum function:

1. **u** ↦ max<sub>ℓ</sub> *u*<sub>ℓ</sub> is positively homogeneous and subadditive, and hence

$$\max\_{\ell} (\langle \nabla f\_{\ell}(\mathbf{x}), \mathbf{d}\_{1} \rangle + \langle \nabla f\_{\ell}(\mathbf{x}), \mathbf{d}\_{2} \rangle) \le \max\_{\ell} \langle \nabla f\_{\ell}(\mathbf{x}), \mathbf{d}\_{1} \rangle + \max\_{\ell} \langle \nabla f\_{\ell}(\mathbf{x}), \mathbf{d}\_{2} \rangle.$$

2. **u** ↦ max<sub>ℓ</sub> *u*<sub>ℓ</sub> is Lipschitz with constant 1 so that

$$\left| \max_{\ell} \langle \nabla f_{\ell}(\mathbf{x}_{1}), \mathbf{d}_{1} \rangle - \max_{\ell} \langle \nabla f_{\ell}(\mathbf{x}_{2}), \mathbf{d}_{2} \rangle \right| \leq \left\| \mathbf{Df}(\mathbf{x}_{1}) \mathbf{d}_{1} - \mathbf{Df}(\mathbf{x}_{2}) \mathbf{d}_{2} \right\|$$

for both the maximum and the Euclidean norm.

Now let {**x**<sup>(*t*)</sup>} ⊆ X be a sequence with **x**<sup>(*t*)</sup> → **x**. Due to the constraints, we have that **d**(**x**) ∈ X − **x** and thereby **d**(**x**) + **x** − **x**<sup>(*t*)</sup> ∈ X − **x**<sup>(*t*)</sup>. Let

$$(0,1] \ni \sigma^{(t)} := \begin{cases} \min\left\{1, \frac{1}{\left\|\mathbf{d}(\mathbf{x}) + \mathbf{x} - \mathbf{x}^{(t)}\right\|}\right\} & \text{if } \mathbf{d}(\mathbf{x}) \ne \mathbf{x}^{(t)} - \mathbf{x}, \\ 1 & \text{else}. \end{cases}$$

Then *σ*<sup>(*t*)</sup>(**d**(**x**) + **x** − **x**<sup>(*t*)</sup>) is feasible for (P1) at **x**<sup>(*t*)</sup>:

• *σ*<sup>(*t*)</sup>(**d**(**x**) + **x** − **x**<sup>(*t*)</sup>) ∈ X − **x**<sup>(*t*)</sup> because X − **x**<sup>(*t*)</sup> is convex and **0**, **d**(**x**) + **x** − **x**<sup>(*t*)</sup> ∈ X − **x**<sup>(*t*)</sup> as well as *σ*<sup>(*t*)</sup> ∈ (0, 1].

• ‖*σ*<sup>(*t*)</sup>(**d**(**x**) + **x** − **x**<sup>(*t*)</sup>)‖ ≤ 1 by the definition of *σ*<sup>(*t*)</sup>.

By the definition of (P1) it follows that

$$\max_{\ell} \langle \nabla f_{\ell}(\mathbf{x}^{(t)}), \mathbf{d}(\mathbf{x}^{(t)}) \rangle \leq \sigma^{(t)} \max_{\ell} \langle \nabla f_{\ell}(\mathbf{x}^{(t)}), \mathbf{d}(\mathbf{x}) + \mathbf{x} - \mathbf{x}^{(t)} \rangle$$

and by the maximum property 1

$$\max_{\ell} \langle \nabla f_{\ell}(\mathbf{x}^{(t)}), \mathbf{d}(\mathbf{x}^{(t)}) \rangle \leq \sigma^{(t)} \max_{\ell} \langle \nabla f_{\ell}(\mathbf{x}^{(t)}), \mathbf{d}(\mathbf{x}) \rangle + \sigma^{(t)} \max_{\ell} \langle \nabla f_{\ell}(\mathbf{x}^{(t)}), \mathbf{x} - \mathbf{x}^{(t)} \rangle. \tag{A1}$$

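We make the following observations:

• Because **d**(**x**) + **x** − **x**<sup>(*t*)</sup> → **d**(**x**) and ‖**d**(**x**)‖ ≤ 1, it holds that *σ*<sup>(*t*)</sup> → 1 for *t* → ∞.

• Because the objective gradients and the maximum function are continuous, it holds that

$$\max_{\ell} \langle \nabla f_{\ell}(\mathbf{x}^{(t)}), \mathbf{d}(\mathbf{x}) \rangle \to \max_{\ell} \langle \nabla f_{\ell}(\mathbf{x}), \mathbf{d}(\mathbf{x}) \rangle \quad \text{for } t \to \infty.$$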

• The last term on the RHS of (A1) vanishes for *t* → ∞.

By taking the limit superior on (A1), we then find that

$$\limsup\_{t \to \infty} \theta(\mathbf{x}^{(t)}) = \limsup\_{t \to \infty} \max\_{\ell} \langle \nabla f\_{\ell}(\mathbf{x}^{(t)}), \mathbf{d}(\mathbf{x}^{(t)}) \rangle \le \max\_{\ell} \langle \nabla f\_{\ell}(\mathbf{x}), \mathbf{d}(\mathbf{x}) \rangle = \theta(\mathbf{x}) \tag{A2}$$

Vice versa, we know that because of **d**(**x**<sup>(*t*)</sup>) ∈ X − **x**<sup>(*t*)</sup>, it holds that **d**(**x**<sup>(*t*)</sup>) + **x**<sup>(*t*)</sup> − **x** ∈ X − **x** and as above we find that

$$\max_{\ell} \langle \nabla f_{\ell}(\mathbf{x}), \mathbf{d}(\mathbf{x}) \rangle \leq \lambda^{(t)} \max_{\ell} \langle \nabla f_{\ell}(\mathbf{x}), \mathbf{d}(\mathbf{x}^{(t)}) \rangle + \lambda^{(t)} \max_{\ell} \langle \nabla f_{\ell}(\mathbf{x}), \mathbf{x}^{(t)} - \mathbf{x} \rangle \tag{A3}$$

with

$$\lambda^{(t)} := \begin{cases} \min\left\{1, \frac{1}{\left\|\mathbf{d}(\mathbf{x}^{(t)}) + \mathbf{x}^{(t)} - \mathbf{x}\right\|}\right\} & \text{if } \mathbf{d}(\mathbf{x}^{(t)}) \neq \mathbf{x} - \mathbf{x}^{(t)}, \\ 1 & \text{else}. \end{cases}$$

Again, the last term of (A3) vanishes in the limit so that by using the properties of the maximum function and the continuity of ∇*f*<sub>ℓ</sub>, as well as *λ*<sup>(*t*)</sup> → 1 for *t* → ∞, in taking the limit inferior on (A3) we find that

$$\begin{aligned} \theta(\mathbf{x}) &= \max_{\ell} \langle \nabla f_{\ell}(\mathbf{x}), \mathbf{d}(\mathbf{x}) \rangle \leq \liminf_{t \to \infty} \max_{\ell} \langle \nabla f_{\ell}(\mathbf{x}), \mathbf{d}(\mathbf{x}^{(t)}) \rangle \\ &\leq \liminf_{t \to \infty} \Big[ \Big( \max_{\ell} \langle \nabla f_{\ell}(\mathbf{x}), \mathbf{d}(\mathbf{x}^{(t)}) \rangle - \max_{\ell} \langle \nabla f_{\ell}(\mathbf{x}^{(t)}), \mathbf{d}(\mathbf{x}^{(t)}) \rangle \Big) + \max_{\ell} \langle \nabla f_{\ell}(\mathbf{x}^{(t)}), \mathbf{d}(\mathbf{x}^{(t)}) \rangle \Big] \\ &\leq \liminf_{t \to \infty} \Big[ \big\| \mathbf{Df}(\mathbf{x}) - \mathbf{Df}(\mathbf{x}^{(t)}) \big\| \, \big\| \mathbf{d}(\mathbf{x}^{(t)}) \big\| + \max_{\ell} \langle \nabla f_{\ell}(\mathbf{x}^{(t)}), \mathbf{d}(\mathbf{x}^{(t)}) \rangle \Big] \\ &\leq \liminf_{t \to \infty} \max_{\ell} \langle \nabla f_{\ell}(\mathbf{x}^{(t)}), \mathbf{d}(\mathbf{x}^{(t)}) \rangle = \liminf_{t \to \infty} \theta(\mathbf{x}^{(t)}). \end{aligned} \tag{A4}$$

Combining (A2) and (A4) shows that *θ*(**x**<sup>(*t*)</sup>) → *θ*(**x**) for *t* → ∞.
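The scaling step used in this proof can be illustrated numerically. The following sketch shifts a direction **d**(**x**) to a nearby base point **x**<sup>(*t*)</sup> and rescales it into the unit ball; the helper names `norm` and `scaling_factor` are hypothetical, the Euclidean norm is assumed, and membership in X − **x**<sup>(*t*)</sup> (which follows from convexity in the proof) is not checked by the code.

```python
import math


def norm(v):
    return math.sqrt(sum(c * c for c in v))


def scaling_factor(d, x, x_t):
    """Return sigma and the scaled vector sigma * (d + x - x_t).

    sigma = min(1, 1 / ||d + x - x_t||) if the shifted vector is nonzero,
    and 1 otherwise, so that the scaled vector has norm at most 1.
    """
    shifted = [di + xi - xti for di, xi, xti in zip(d, x, x_t)]
    length = norm(shifted)
    sigma = 1.0 if length == 0.0 else min(1.0, 1.0 / length)
    return sigma, [sigma * c for c in shifted]
```

As **x**<sup>(*t*)</sup> approaches **x**, the shifted vector approaches **d**(**x**), whose norm is at most 1, so the factor tends to 1.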

Theorem 2 claims that *ω*(**x**) is uniformly continuous, provided the objective gradients are Lipschitz. The implied Cauchy continuity is an important property in the convergence proof of the algorithm.

**Proof of Theorem 2.** We will consider the constrained case only, when X is convex and compact, and show uniform continuity a fortiori by proving that *ω*(•) is Lipschitz. Let the objective gradients be Lipschitz continuous. Then **Df** is Lipschitz as well with constant *L* > 0. Let **x**, **y** ∈ X with **x** ≠ **y** (the other case is trivial) and let again **d**(**x**), **d**(**y**) be the respective optimizers.

Suppose w.l.o.g. that

$$\left| \max_{\ell} \langle \nabla f_{\ell}(\mathbf{x}), \mathbf{d}(\mathbf{x}) \rangle - \max_{\ell} \langle \nabla f_{\ell}(\mathbf{y}), \mathbf{d}(\mathbf{y}) \rangle \right| = \max_{\ell} \langle \nabla f_{\ell}(\mathbf{x}), \mathbf{d}(\mathbf{x}) \rangle - \max_{\ell} \langle \nabla f_{\ell}(\mathbf{y}), \mathbf{d}(\mathbf{y}) \rangle.$$

If we define

$$(0,1] \ni \sigma := \begin{cases} \min\left\{ 1, \frac{1}{\|\mathbf{d}(\mathbf{y}) + \mathbf{y} - \mathbf{x}\|} \right\} & \text{if } \mathbf{d}(\mathbf{y}) \ne \mathbf{x} - \mathbf{y}, \\ 1 & \text{else}, \end{cases}$$

then again *σ*(**d**(**y**) + **y** − **x**) is feasible for (P1) at **x**. Thus,

$$\begin{aligned} \max_{\ell} \langle \nabla f_{\ell}(\mathbf{x}), \mathbf{d}(\mathbf{x}) \rangle - \max_{\ell} \langle \nabla f_{\ell}(\mathbf{y}), \mathbf{d}(\mathbf{y}) \rangle &\stackrel{\text{df.}}{\leq} \max_{\ell} \langle \nabla f_{\ell}(\mathbf{x}), \sigma(\mathbf{d}(\mathbf{y}) + \mathbf{y} - \mathbf{x}) \rangle - \max_{\ell} \langle \nabla f_{\ell}(\mathbf{y}), \mathbf{d}(\mathbf{y}) \rangle \\ &\leq \| \sigma \mathbf{Df}(\mathbf{x})(\mathbf{d}(\mathbf{y}) + \mathbf{y} - \mathbf{x}) - \mathbf{Df}(\mathbf{y})\mathbf{d}(\mathbf{y}) \| \\ &\leq \| \sigma \mathbf{Df}(\mathbf{x}) - \mathbf{Df}(\mathbf{y}) \| \| \mathbf{d}(\mathbf{y}) \| + \| \mathbf{Df}(\mathbf{x}) \| \| \mathbf{x} - \mathbf{y} \|, \end{aligned} \tag{A5}$$

where we have again used the maximum property 2 for the second inequality. We now investigate the first term on the RHS. Using ‖**d**(**y**)‖ ≤ 1 and adding a zero, we find

$$\begin{array}{lcl} \|\boldsymbol{\sigma}\mathbf{Df}(\mathbf{x}) - \mathbf{Df}(\mathbf{y})\| \|\mathbf{d}(\mathbf{y})\| \leq & \|\mathbf{Df}(\mathbf{x}) - \mathbf{Df}(\mathbf{y}) - (1 - \boldsymbol{\sigma})\mathbf{Df}(\mathbf{x})\| \\ \leq & L\|\mathbf{x} - \mathbf{y}\| + (1 - \boldsymbol{\sigma})\|\mathbf{Df}(\mathbf{x})\|. \end{array} \tag{A6}$$

Furthermore, ‖**d**(**y**) + **y** − **x**‖ ≤ 1 + ‖**y** − **x**‖ implies 1/(1 + ‖**y** − **x**‖) ≤ *σ* and

$$1 - \sigma \le 1 - \frac{1}{1 + ||\mathbf{y} - \mathbf{x}||} = \frac{||\mathbf{y} - \mathbf{x}||}{1 + ||\mathbf{y} - \mathbf{x}||} \le ||\mathbf{y} - \mathbf{x}||.$$

We use this inequality and plug (A6) into (A5) to obtain

$$\begin{aligned} \max_{\ell} \langle \nabla f_{\ell}(\mathbf{x}), \mathbf{d}(\mathbf{x}) \rangle - \max_{\ell} \langle \nabla f_{\ell}(\mathbf{y}), \mathbf{d}(\mathbf{y}) \rangle &\leq L \|\mathbf{x} - \mathbf{y}\| + 2 \|\mathbf{Df}(\mathbf{x})\| \|\mathbf{x} - \mathbf{y}\| \\ &\leq (L + 2D) \|\mathbf{x} - \mathbf{y}\|, \end{aligned}$$

with *D* = max<sub>**x**∈X</sub> ‖**Df**(**x**)‖, which is well-defined because X is compact and **Df**(•) is continuous.
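The Lipschitz bound can be checked on a small example. The sketch below uses a hypothetical one-dimensional toy problem, *f*<sub>1</sub>(*x*) = (*x* − 0.2)<sup>2</sup> and *f*<sub>2</sub>(*x*) = (*x* − 0.8)<sup>2</sup> on X = [0, 1], which is not one of the test problems of this paper. In one dimension the inner minimization of (P1) can be solved by inspecting three candidate directions, and the constants *L* and *D* are computed by hand for the Euclidean norm.

```python
import math


def gradients(x):
    # Derivatives of the toy objectives (x - 0.2)^2 and (x - 0.8)^2.
    return [2.0 * (x - 0.2), 2.0 * (x - 0.8)]


def omega(x):
    """Negative optimal value of (P1) for the 1D toy problem on X = [0, 1]."""
    grads = gradients(x)
    lo = max(-1.0, 0.0 - x)  # feasible directions: d in (X - x) with |d| <= 1
    hi = min(1.0, 1.0 - x)
    # d -> max_l g_l * d is convex and piecewise linear with its only kink
    # at d = 0, so the minimum over [lo, hi] is attained at lo, 0 or hi.
    best = min(max(g * d for g in grads) for d in (lo, 0.0, hi))
    return -best


L = 2.0 * math.sqrt(2.0)            # Lipschitz constant of Df on [0, 1]
D = math.sqrt(1.6 ** 2 + 0.4 ** 2)  # maximum of ||Df(x)|| over [0, 1]
```

Here *ω* vanishes on the Pareto set [0.2, 0.8] and grows towards the end points of the interval, with a slope well below the bound *L* + 2*D*.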


*Appendix A.2. Modified Criticality Measures*

**Proof of Lemma 5.** There are two cases to consider:

• If *ω*<sub>m</sub><sup>(*t*)</sup>(**x**<sup>(*t*)</sup>) ≥ *ω*(**x**<sup>(*t*)</sup>), then |*ω*<sub>m</sub><sup>(*t*)</sup>(**x**<sup>(*t*)</sup>) − *ω*(**x**<sup>(*t*)</sup>)| = *ω*<sub>m</sub><sup>(*t*)</sup>(**x**<sup>(*t*)</sup>) − *ω*(**x**<sup>(*t*)</sup>) ≤ *κ*<sub>*ω*</sub>*ω*<sub>m</sub><sup>(*t*)</sup>(**x**<sup>(*t*)</sup>). Now

$$\left|\varpi_{\mathbf{m}}^{(t)}\left(\mathbf{x}^{(t)}\right) - \varpi\left(\mathbf{x}^{(t)}\right)\right| \leq \left. \begin{cases} \omega_{\mathbf{m}}^{(t)}\left(\mathbf{x}^{(t)}\right) - \omega\left(\mathbf{x}^{(t)}\right) \\ 1 - \omega\left(\mathbf{x}^{(t)}\right) \leq \omega_{\mathbf{m}}^{(t)}\left(\mathbf{x}^{(t)}\right) - \omega\left(\mathbf{x}^{(t)}\right) \\ 1 - 1 = 0 \end{cases} \right\} \leq \kappa_{\omega}\,\omega_{\mathbf{m}}^{(t)}\left(\mathbf{x}^{(t)}\right).$$

• The case *ω*<sub>m</sub><sup>(*t*)</sup>(**x**<sup>(*t*)</sup>) < *ω*(**x**<sup>(*t*)</sup>) can be shown similarly.

**Proof of Lemma 6.** Use Lemma 5 and then investigate the two possible cases:

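• If *ϖ*<sub>m</sub><sup>(*t*)</sup>(**x**<sup>(*t*)</sup>) ≥ *ϖ*(**x**<sup>(*t*)</sup>), then the first inequality follows because of 1 ≥ 1/(1 + *κ*<sub>*ω*</sub>).

• If *ϖ*<sub>m</sub><sup>(*t*)</sup>(**x**<sup>(*t*)</sup>) < *ϖ*(**x**<sup>(*t*)</sup>), then *ϖ*(**x**<sup>(*t*)</sup>) − *ϖ*<sub>m</sub><sup>(*t*)</sup>(**x**<sup>(*t*)</sup>) ≤ *κ*<sub>*ω*</sub>*ϖ*<sub>m</sub><sup>(*t*)</sup>(**x**<sup>(*t*)</sup>), and again the first inequality follows.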

#### **Appendix B. Pascoletti–Serafini Step**

One example of an alternative descent step **s**<sup>(*t*)</sup> ∈ ℝ<sup>*n*</sup> is given in [33]. Thomann and Eichfelder [33] leverage the Pascoletti–Serafini scalarization to define local subproblems that guide the iterates towards the (local) model ideal point. To be precise, it is shown that the trial point **x**<sub>+</sub><sup>(*t*)</sup> can be computed as the solution to

$$\min\_{\tau \in \mathbb{R}, \mathbf{x} \in B^{(t)}} \tau \quad \text{s.t. } \mathbf{m}^{(t)}(\mathbf{x}^{(t)}) + \tau \mathbf{r}^{(t)} - \mathbf{m}^{(t)}(\mathbf{x}) \ge \mathbf{0},\tag{A7}$$

where **r**<sup>(*t*)</sup> = **m**<sup>(*t*)</sup>(**x**<sup>(*t*)</sup>) − **i**<sub>m</sub><sup>(*t*)</sup> ∈ ℝ<sup>*k*</sup><sub>≥0</sub> is the direction vector pointing from the local model ideal point

$$\mathbf{i}\_{\mathbf{m}}^{(t)} = \begin{bmatrix} i\_1^{(t)}, \dots, i\_k^{(t)} \end{bmatrix}^T, \text{ with } i\_\ell^{(t)} = \min\_{\mathbf{x} \in \mathcal{X}} m\_\ell^{(t)}(\mathbf{x}) \text{ for } \ell = 1, \dots, k,\tag{A8}$$

to the current iterate value. If the surrogates are linear or quadratic polynomials and the trust region uses a *p*-norm with *p* ∈ {1, 2, ∞}, these sub-problems are linear or quadratic programs.
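For linear surrogates the ideal point in (A8) and the direction **r**<sup>(*t*)</sup> have a closed form when the minimization runs over a box. The sketch below assumes models of the form *m*<sub>ℓ</sub>(**x**) = *c*<sub>ℓ</sub> + ⟨**g**<sub>ℓ</sub>, **x**⟩ and a box [lb, ub] standing in for the set over which (A8) minimizes; the function names and the data layout are hypothetical.

```python
def linear_model_min(c, g, lb, ub):
    """Minimum of m(x) = c + <g, x> over the box [lb, ub]."""
    # Each coordinate is pushed to the bound that lowers the model value.
    return c + sum(gi * (lo if gi >= 0 else hi) for gi, lo, hi in zip(g, lb, ub))


def ideal_point_and_direction(models, x_t, lb, ub):
    """Return the model ideal point and r = m(x_t) - ideal point.

    `models` is a list of (c, g) pairs, one per objective.
    """
    values = [c + sum(gi * xi for gi, xi in zip(g, x_t)) for c, g in models]
    ideal = [linear_model_min(c, g, lb, ub) for c, g in models]
    direction = [v - i for v, i in zip(values, ideal)]
    return ideal, direction
```

Because each ideal component is a minimum over a set containing the current iterate, every component of the direction is non-negative, in line with **r**<sup>(*t*)</sup> ∈ ℝ<sup>*k*</sup><sub>≥0</sub>.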

A convergence proof for the unconstrained case is given in [33]. It relies on a sufficient decrease bound similar to (20). However, it is not shown that *κ*<sup>sd</sup> ∈ (0, 1) exists independent of the iteration index *t* but stated as an assumption.

Furthermore, constraints (in particular box constraints) are integrated into the definition of *ω*(•) and *ω*<sub>m</sub><sup>(*t*)</sup>(•) using an active set strategy (see [38]). Consequently, both values are no longer Cauchy continuous. We can remedy both drawbacks by relating the (possibly constrained) Pascoletti–Serafini trial point to the strict modified Pareto–Cauchy point in our projection framework. To this end, we allow in (A7) and (A8) any feasible set fulfilling Assumption 1. Moreover, we recite the following assumption:

**Assumption A1** (Assumption 4.10 in [33])**.** *There is a constant* r ∈ (0, 1] *so that if* **x**<sup>(*t*)</sup> *is not Pareto critical, the components r*<sub>1</sub><sup>(*t*)</sup>, ..., *r*<sub>*k*</sub><sup>(*t*)</sup> *of* **r**<sup>(*t*)</sup> *satisfy* min<sub>ℓ</sub> *r*<sub>ℓ</sub><sup>(*t*)</sup> / max<sub>ℓ</sub> *r*<sub>ℓ</sub><sup>(*t*)</sup> ≥ r.

The assumption can be justified because *r*<sub>ℓ</sub><sup>(*t*)</sup> > 0 if **x**<sup>(*t*)</sup> is not critical and *r*<sub>ℓ</sub><sup>(*t*)</sup> can be bounded above and below by expressions involving *ω*<sub>m</sub><sup>(*t*)</sup>(•), see Remark 4 and [33] (Lemma 4.9). We can then derive the following lemma:

**Lemma A1.** *Suppose Assumptions 1, 2 and A1 hold. Let* (*τ*<sub>+</sub>, **x**<sub>+</sub><sup>(*t*)</sup>) *be the solution to* (A7)*. Then there exists a constant κ̃*<sup>sd</sup><sub>m</sub> ∈ (0, 1) *such that it holds*

$$
\Phi_{\mathbf{m}}^{(t)}(\mathbf{x}^{(t)}) - \Phi_{\mathbf{m}}^{(t)}(\mathbf{x}_{+}^{(t)}) \geq \tilde{\kappa}_{\mathbf{m}}^{\text{sd}} \, \omega_{\mathbf{m}}^{(t)}\left(\mathbf{x}^{(t)}\right) \min \left\{ \frac{\omega_{\mathbf{m}}^{(t)}\left(\mathbf{x}^{(t)}\right)}{\mathrm{c} H_{\mathbf{m}}^{(t)}}, \Delta^{(t)}, 1 \right\}.
$$

**Proof.** If **x**<sup>(*t*)</sup> is critical for (MOPm), then *τ*<sub>+</sub> = 0 and **x**<sub>+</sub><sup>(*t*)</sup> = **x**<sup>(*t*)</sup> and the bound is trivial [5]. Otherwise, we can use the same argumentation as in [33] ([Lemma 4.13]) to show that for the strict modified Pareto–Cauchy point **x̂**<sub>PC</sub><sup>(*t*)</sup> it holds that

$$
\Phi_{\mathbf{m}}^{(t)}(\mathbf{x}^{(t)}) - \Phi_{\mathbf{m}}^{(t)}(\mathbf{x}_{+}^{(t)}) \ge \mathrm{r} \min_{\ell} \left\{ m_{\ell}^{(t)}(\mathbf{x}^{(t)}) - m_{\ell}^{(t)}(\hat{\mathbf{x}}_{\mathrm{PC}}^{(t)}) \right\},
$$

and the final bound follows from Corollary 2 with the new constant *κ̃*<sup>sd</sup><sub>m</sub> = r*κ*<sup>sd</sup><sub>m</sub>.
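The right-hand side of the bound in Lemma A1 is easy to evaluate once its ingredients are known. The sketch below is only a numerical reading of that formula: the argument names are hypothetical, and the constants c and *H*<sub>m</sub><sup>(*t*)</sup> are treated as given numbers because their definitions lie outside this appendix.

```python
def sufficient_decrease_bound(kappa, omega_m, c, h_m, delta):
    """Evaluate kappa * omega_m * min(omega_m / (c * h_m), delta, 1)."""
    return kappa * omega_m * min(omega_m / (c * h_m), delta, 1.0)


def scaled_constant(r, kappa_sd):
    """New constant for the Pascoletti-Serafini step: r times kappa_sd."""
    return r * kappa_sd
```

Since r ∈ (0, 1], the guaranteed decrease of the Pascoletti–Serafini step is at most the one guaranteed for the strict modified Pareto–Cauchy point.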

#### **References**


## *Article* **An Interactive Recommendation System for Decision Making Based on the Characterization of Cognitive Tasks**

**Teodoro Macias-Escobar 1,2,\*,†, Laura Cruz-Reyes 3,†, César Medina-Trejo 3,†, Claudia Gómez-Santillán 3,†, Nelson Rangel-Valdez 4,† and Héctor Fraire-Huacuja 3,†**


**Citation:** Macias-Escobar, T.; Cruz-Reyes, L.; Medina-Trejo, C.; Gómez-Santillán, C.; Rangel-Valdez, N.; Fraire-Huacuja, H. An Interactive Recommendation System for Decision Making Based on the Characterization of Cognitive Tasks. *Math. Comput. Appl.* **2021**, *26*, 35. https://doi.org/10.3390/mca26020035

Academic Editors: Marcela Quiroz, Juan Gabriel Ruiz, Luis Gerardo de la Fraga and Oliver Schütze

Received: 28 February 2021 Accepted: 20 April 2021 Published: 21 April 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

**Abstract:** The decision-making process can be complex and underestimated, where mismanagement could lead to poor results and excessive spending. This situation appears in highly complex multicriteria problems such as the project portfolio selection (PPS) problem. Therefore, a recommender system becomes crucial to guide the solution search process. To our knowledge, most recommender systems that use argumentation theory are not proposed for multi-criteria optimization problems. Besides, most of the current recommender systems focused on PPS problems do not attempt to justify their recommendations. This work studies the characterization of cognitive tasks involved in the decision-aiding process to propose a framework for the Decision Aid Interactive Recommender System (DAIRS). The proposed system focuses on a user-system interaction that guides the search towards the best solution considering a decision-maker's preferences. The developed framework uses argumentation theory supported by argumentation schemes, dialogue games, proof standards, and two state transition diagrams (STD) to generate and explain its recommendations to the user. This work presents a prototype of DAIRS to evaluate the user experience on multiple real-life case simulations through a usability measurement. The prototype and both STDs received a satisfying score and mostly overall acceptance by the test users.

**Keywords:** decision making process; cognitive tasks; recommender system; project portfolio selection problem; usability evaluation

## **1. Introduction**

The decision-making process consists of selecting the best solution among a set of possible alternatives, considering difficult and complicated decisions [1]. Finding efficient strategies or techniques to aid this process is challenging due to the complexity of the problems.

In decision-making processes, such as the solution of optimization problems, the decision-maker (DM) is the person or group whose preferences are decisive for choosing an adequate solution to problems with multiple objectives (which are sometimes in conflict) and multiple efficient solutions [2]. The DM is the one who makes the final decision and chooses the solution that seems more appropriate from the preferences previously established.

There is a recent growing interest in using various techniques to incorporate the DM's preferences within a methodology, heuristic, or meta-heuristic to solve an optimization problem [3]. Among the different preference incorporation techniques available, using a weight vector that defines the importance of each objective is one of the most commonly used and accepted approaches.

The project portfolio selection (PPS) problem is a challenging optimization problem that presents several conditions to consider. First, these problems are usually multiobjective, searching for the best possible outcome for each objective. However, these objectives usually conflict with each other because of the constraints that the problem sets. Second, the number of constraints that a PPS problem presents can make the decision-making process difficult since many possible solutions within the solution search space may not be feasible.

Usually, PPS problems define a limited number of resources to be distributed to improve each of the objectives while considering a maximum and minimum threshold of said resources for each of the elements defined in the constraints, limiting each objective's gain. Under these circumstances, it is most likely that it will not be possible to determine an optimal single solution, but instead, a set of optimal solutions that define a balance between the objectives of the problem and the DM's preferences, identified by using different strategies [4]. Therefore, it is crucial to select the most suitable solution that reflects the preferences of the DM. Multi-criteria decision analysis (MCDA) methods are among the most widely used tools for solving PPS problems because of their capacity to handle complex problems with multiple objectives (usually in conflict) to satisfy [5].
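A toy instance may help to make this structure concrete. The sketch below is purely illustrative and is not part of DAIRS: the project data are hypothetical, only a single total budget is modeled (without the per-element minimum and maximum thresholds mentioned above), and the DM's preferences are represented by a simple weight vector.

```python
from itertools import combinations

# Hypothetical projects: (name, cost, (benefit on objective 1, objective 2)).
PROJECTS = [("A", 4, (6, 1)), ("B", 3, (2, 5)), ("C", 5, (4, 4)), ("D", 2, (1, 2))]


def feasible_portfolios(projects, budget):
    """Yield (names, objective totals) for every portfolio within budget."""
    for size in range(1, len(projects) + 1):
        for combo in combinations(projects, size):
            if sum(p[1] for p in combo) <= budget:
                names = tuple(p[0] for p in combo)
                totals = tuple(sum(p[2][k] for p in combo) for k in range(2))
                yield names, totals


def dominates(a, b):
    # a dominates b if it is at least as good everywhere and differs from b.
    return all(x >= y for x, y in zip(a, b)) and a != b


def efficient_portfolios(projects, budget):
    candidates = list(feasible_portfolios(projects, budget))
    return [(n, t) for n, t in candidates
            if not any(dominates(other, t) for _, other in candidates)]


def recommend(projects, budget, weights):
    """Pick the efficient portfolio with the best weighted benefit."""
    return max(efficient_portfolios(projects, budget),
               key=lambda item: sum(w * v for w, v in zip(weights, item[1])))
```

With a budget of 9, three efficient portfolios remain, and different weight vectors select different ones, which is the situation in which the DM's preferences decide the final choice.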

A practical methodology to solve PPS problems is the decision support system (DSS), which allows the DM to analyze a PPS problem under the current set of preferences and facilitates the decision-making process. However, choosing the best solution is a complex task because of the problem's subjective nature and the DM's preferences, which could be specific to a person or group and might change during the solution process. An interactive DSS shows the DM the best solutions based on the current preferences, receives new information from the DM, and updates its search to adapt to changes. As the name implies, this system can establish a user-system interaction during the solution process.

This paper proposes the Decision Aid Interactive Recommender System (DAIRS), a multi-criteria DSS (MCDSS) framework that integrates cognitive tasks into the user-system interaction. DAIRS is able to perform several tasks aiming to aid the DM during the decision-making process, such as evaluating alternatives, interacting with the DM, and recommending a solution while presenting arguments to justify this selection. The most relevant and novel feature of DAIRS is that it not only obtains information from the DM and adapts to it to present an appropriate recommendation, but can also present new information to the DM or defend its current recommendation. In other words, DAIRS establishes a dialogue game with the user instead of only being a system that receives information.

This paper addresses the characterization of cognitive tasks involved in the decision support process and its integration in recommender systems to develop more robust DSS. These systems should allow the precise analysis of possible solutions, provide solutions that optimize the results, and at the same time, satisfy the preferences established by a DM. DAIRS includes in its MCDSS framework different MCDA methods supported by argumentation theory in the form of argumentation schemes and proof standards.

This proposal intends to present a recommender system that is able to provide a bidirectional interaction. Both the user and system provide and obtain new information based on the knowledge obtained during a dialogue. For this purpose, this work uses concepts related to argumentation theory, which allow both participants (user and system) to establish a well-structured dialogue.

DAIRS uses a bidirectional interaction under the assumption that the user will satisfactorily carry out a decision-making process even without extensive knowledge of the problem. DAIRS provides information to the DM through the dialogue game, seeking to enhance and accelerate learning about the problem to aid in selecting a suitable solution.

This work seeks to meet three main objectives. First, it develops a recommender system, called DAIRS, which suggests a solution to a multi-objective optimization problem (MOP), specifically a PPS problem, with a deep interaction between the decision-maker (DM) and the system. Second, this work seeks to simplify the DM's interaction with the proposed recommender system. Lastly, DAIRS endeavors to achieve a high level of DM satisfaction. For the last objective, this proposal seeks to validate the developed recommender system, evaluating the effects of using argumentation theory and a bidirectional dialogue concerning several properties related to the usability of an MCDSS.

The main contributions of this work, proposed to meet the above objectives, can be summarized in three elements, whose originality is shown in Section 2:


The remaining part of this paper is structured as follows: Section 2 shows a brief review of works related to the proposal in this paper. Section 3 presents the necessary concepts on recommender systems employed in this work. Section 4 describes the proposed methodology and the developed prototype. Section 5 presents the experimental design and the results and analysis regarding the proposed prototype's performance when used to solve a test case study, which simulates a real-life scenario of a PPS problem. Finally, Section 6 addresses conclusions regarding the usability and effectiveness of the proposal and possible future work.

#### **2. Related Work**

This section reviews some of the most relevant works related to three topics in particular: (i) DSS frameworks used to solve PPS problems, (ii) interactive systems used for optimization problems, and (iii) proposals that use the characterization of cognitive tools to improve the interaction between the user and the recommender system.

#### *2.1. DSS Frameworks Used to Solve PPS Problems*

Multiple DSSs have been proposed for solving PPS problems. These works consider different strategies to incorporate the preferences established by the DM and select the most appropriate solution based on the current preference set or a preference weight vector.

The proposal presented in this paper focuses on developing a system that performs project portfolio selection reflecting the DM's preferences as faithfully as possible and that is able to interact with the user through a dialogue. To the best of the authors' knowledge, this bidirectional interaction feature has not been considered for solving PPS problems. It is nevertheless important to understand some of the most relevant approaches to this problem.

Chu et al. [6] present one of the first DSSs proposed to solve PPS problems. Their DSS presents an approach based on a cost/benefit model for research and development (R&D) project management. Their work considers monetary and time cost, as well as the probability of success of each project, to determine the optimal sequence of R&D projects to execute. The impact of each element on the selection of a solution is defined by a pair of weight variables that express how relevant it is for the user to save money or time. More recently, Hummel et al. proposed a DSS framework based on the *Measuring* *Attractiveness by a Categorical Based Evaluation Technique* (MACBETH) approach [7] to solve R&D project portfolio management problems interactively [8].

Archer and Ghasemzadeh [9] propose a framework to design decision support systems to solve PPS problems. In their work, they attempt to simplify the PPS process into three main phases: strategic consideration, individual project evaluation, and portfolio selection. At each phase the users are free to select the techniques that they find the most suitable. This framework is used to develop a DSS named Project Analysis and Selection System (PASS) [10], which is able to perform tasks such as data entry, pre-screening, project evaluation, screening and optimization models without the involvement of the DM. PASS is used to successfully solve a single-objective PPS problem.

DSS frameworks have proven to be suitable alternatives for solving multi-objective PPS problems. Hu et al. [11] propose a multi-criteria DSS (MCDSS) framework to solve PPS problems implementing the Lean and Six Sigma concepts [12] and considering the cost and benefit of each project. Their framework considers flexible weight vectors that can be modified during the solution process, and the output is a Pareto optimal portfolio set from which the DM can select the portfolio that best matches their preferences.

Khalili-Damghani's work [13] shows how flexible the frameworks developed to solve PPS problems can be. In this case, an evolutionary algorithm (EA) is combined with a data envelopment analysis (DEA) model to create the structure of a fuzzy rule-based (FRB) system that measures the suitability of all available candidate project portfolios.

Mira et al. [14] evaluate the performance of a DSS framework by solving a real-life simulation of a PPS problem and comparing the cumulative controlled risk value obtained by the DSS with the controlled risk value obtained by a manual-based portfolio selection method. The results show a 10% improvement of the DSS framework over manual-based selection.

Mohammed [15] proposes the use of various strategies to find appropriate solutions to PPS problems within fuzzy environments. For this purpose, his work relies on the Analytic Hierarchy Process (AHP) [16] and TOPSIS [17] adapted to work using fuzzy strategies, which use a set of vectors of relative criteria weights. In this case, the used strategies incorporate preferences in the decision-making process before the system proceeds to generate a recommendation.

Recently, DSS frameworks have proven to be an adequate alternative to solve PPS problems focused on sustainability. Dobrovolskiene and Tamosiuniene [18] propose integrating the analysis of a sustainability index of each project within a Markowitz risk-return scheme [19]. This incorporation aims to find better portfolios based on a risk-return assessment that at the same time considers the DM's responsibility towards the surrounding environment with a long-term focus on the well-being of society.

Debnath et al. [20] propose a DSS framework supported by a hybrid multi-criteria decision support method. This hybrid system combines strategies such as sensitivity analysis with grey-based Decision-Making Trial and Evaluation Laboratory [21] and Multi-Attributive Border Approximation area Comparison [22] to solve PPS problems focused on the development, quality, and distribution of genetically modified agricultural products considering sustainability under social, beneficial, and differential criteria.

Verdecho et al. [23] present another proposal focused on sustainability. In this case, an AHP is used to solve a PPS problem related to supply chains, whose objectives are focused on financial, environmental, and social sustainability. Their framework also seeks to optimize supply-related processes and customer satisfaction.

#### *2.2. Interactive Systems for Optimization Problems*

A common scenario when using DSS frameworks to solve any optimization problem, such as the PPS problem, is that while they present a solution based on preferences defined by a DM, the next step in the decision-making process is not considered. This step consists of determining the acceptance (or rejection) of the recommended solution by the DM, as well as updating the DM's preferences. The preferences of the DM may change during the solution process. It must be considered that the DM may be a single agent or a group whose preferences are susceptible to social, political, or economic changes surrounding the problem to be solved. Therefore, it is desirable that the framework is able to adapt to new preferences.

For this situation, it is advisable to use an interactive process between the DM and the recommender system for decision-making support. The proposal of this work focuses on a bidirectional interaction. This feature is not present in the approaches to PPS problems or to other optimization problems reviewed in this subsection.

Miettinen et al. [24] present a study that focuses on solving multi-objective optimization problems (MOPs) while using an interactive system that allows the system and the user to exchange information. Their study mentions that there are three main stopping criteria for these systems: the DM accepts the solution, the DM stops the process manually, or an algorithmic stopping criterion is reached. Also, according to their work and [25], the interactive process can be divided into two phases. The first is a learning phase, where the DM obtains knowledge regarding the problem. The second is the decision phase, where the system identifies the most suitable solution according to the current information and the DM must accept or reject it.

The *Flexible and Interactive Tradeoff* method [26] is a proposed approach implemented into a DSS to solve MOPs. This proposal considers that it is easier for the DM to compare results from multiple alternatives based on a definition of strict preferences between criteria rather than on indifference. This approach considers that the DM needs to establish a preferential lexicographic order for all criteria.

The InDM2 algorithm [27] is a recent work that allows interaction between the DM and the recommender system. The DM initially establishes a reference point that reflects his/her preferences. During the solution process, InDM2 shows the user the best candidate solutions it has found that match the current preferences. The DM is able to accept the solutions obtained, wait for the system to provide new solutions, or stop the process at any time. InDM2 also allows the DM to update the current reference point or propose a new one, allowing the system to obtain new information based on the DM's new preferences.

Azabi et al. [28] propose an interactive optimization framework supported by a low-fidelity flow resolver and an interactive Multi-objective Particle Swarm Optimization (MOPSO) for the optimization of the aerodynamic shape design of aerial vehicle platforms. As in InDM2, the DM can incorporate preferences before and during the solution process. This interaction allows their framework to define and update a region of interest, accelerating the process. The results of their experiments show that the interactive MOPSO outperforms a non-interactive MOPSO, indicating that a constant user-system interaction can provide better results.

There are also interactive framework proposals focused on solving PPS problems. Strummer et al. [29] propose an interactive framework using a strategy based on identifying Pareto optimal solutions to determine a set of optimal portfolios to present to the DM. The system first solves a PPS problem and presents the best solutions under the current preferences and constraints defined by the DM. The DM can then interact with the system to determine its preference for a particular criterion or set new constraints.

A study in Nowak et al. [30] states that several frameworks presented in the literature assume that the DM has a high-level knowledge of the problem and of the methods used to solve it, and has a well-defined set of preferences. These assumptions are not always true. Therefore, the recommender system has to be as user-friendly as possible and take into account that the user might have little knowledge of the problem prior to using the system. Their study also notes a lack of consideration of dynamic elements, such as changes in preferences or the problem environment. The authors propose a general structure for the development of frameworks to solve PPS problems. This structure considers the criteria to be evaluated, the data needed to evaluate the projects, an analysis and evaluation of the projects, as well as the construction of project portfolios. The proposed structure is also capable of obtaining new information from the DM during each iteration, allowing the DSS to adapt to every change and focus its search on the new preferences.

Interactive systems can use graphical visualization to support the decision-making process of optimization problems. Haara et al. [31] evaluate several interactive data visualization techniques to support the solution of multiobjective forest planning problems. The DM uses visual elements to ease the process of correctly identifying and defining his/her preferences. The authors mention that these interactive systems can be used for management problems. Since PPS problems are management problems, it is reasonable to expect that these proposals could also work successfully for PPS problems.

The *Your Own Decision Aid* (YODA) framework is an interactive recommender system proposed in Kurttila et al. [32] to solve PPS problems. YODA focuses on working with DMs composed of multiple people, where each user defines his/her preferences and acceptable candidate projects. All available projects are separated into subsets based on the level of group acceptance that each project has. The projects with the highest level of acceptance have priority when the system defines a candidate project portfolio. Each user is able to update their preferences or define a project acceptance threshold so that rejected projects that are close enough to their acceptance standards are considered acceptable alternatives.

#### *2.3. Characterization of Cognitive Tools to Improve User-System Interaction*

The interactive process between the user and the DSS should not be limited to a series of commands simulating a master-slave structure. The interaction requires the characterization of cognitive tasks so that the DSS becomes an entity that not only receives information but also provides new knowledge of the problem to the user.

The problem of characterizing cognitive tasks in the decision support process has been addressed using different approaches. Some representative works on providing explanations to accompany a recommended solution in each interaction are shown in this section. Some works, described below, are based on argumentation to model human argumentation and dialogue processes. Their description includes limitations.

The proposals using artificial intelligence provide a recommendation by learning the user preferences for particular products. They do not seek to recommend a solution for an optimization problem. Instead, they use the AI methods to optimize the recommender systems (e.g., identify similar users). Other related works use queries to obtain information from the DM related to the currently presented solution, but they do not explain the result [33].

Labreuche [34] describes how to use the argumentation theory to perform a pairwise comparison between alternatives, using an MCDA method based on a weight vector. It establishes four different situations, which involve two candidate solutions *x* and *y* and a weight vector *w* for six different criteria representing the DM preferences. These situations use pairwise evaluations based on criteria weights to present an argument in favor or against the statement "*x* is preferred over *y*".

Ouerdane [35] extends Labreuche's work, presenting an approach that provides the underlying reasons supporting an alternative selected by a recommendation system. For the justification process, argumentation theory and decision support are combined with an established language to enable communication. The work also proposes a hierarchical structure of argument schemes that decomposes the decision process into steps whose underlying premises are made explicit, which makes it possible to identify when new information should be incorporated into the dialogue with the DM. This structure has only been tested for a low-dimensional choice problem (CHP), where the decision options are known from the beginning. The proposal analyzes the elements required to perform a dialogue game with the user, not only to defend a recommendation established by the system but also to obtain new preferences and statements provided by the DM. The new information obtained could change dialogue-related elements, leading the system to provide a new recommendation if necessary.

The work presented in Cruz-Reyes et al. [36] proposes a framework design for generating DSSs focused on a PPS problem by characterizing arguments and dialogue using argumentation theory and rough sets theory. The framework has a justification module for the recommended solution shown to the user; the justification is supported by argument schemes and decision rules generated with rough sets. The process starts by obtaining the available preferential information provided by the DM and selecting the appropriate multicriteria method to evaluate the available portfolios. Afterward, it generates a recommendation and its justification interactively. This work focuses more on the decision rules generated through rough sets. Argumentation theory is a complementary element of the architectural design, and it is presented only in a conceptual form.

Sassoon et al. [37] use argumentation theory by using argumentation schemes within a chatbot. Their chatbot establishes a dialogue between the user and a DSS focused on medical consultation. It considers the user's symptoms, medical history, and a list of available treatments to recommend the most appropriate medication. This DSS can attempt to justify its response through arguments. This system only considers current feasible information and does not use the user's preferences.

The studies conducted by Morveli-Espinoza et al. [38–40] focus on the solution of goal selection problems. Their research uses artificial intelligence and argumentation semantics to select goals that are not in conflict and produce the best results considering a set of premises added before carrying out the solution process. The interface developed in their study is able to answer the "Why?" and "Why not?" questions for each goal, generating arguments based on the semantics used.

Recommender systems for e-commerce often rely on artificial intelligence (AI). The use of advanced AI-related methods allows the system to ease the user-system interaction and recommend higher quality e-services and online products more closely related to the DM's preferences [33].

Recently, interactive DSS frameworks that accept arguments from the DM have been proposed to solve PPS problems. Vayanos et al. [41] present a framework that focuses on obtaining the preferences of the DM before and during the solution of the problem. The system generates a set of preferences based on a moderate number of queries presented to the DM. Each query provides a pairwise comparison between two solutions. The system is based on a weak-preference concept. This means that even if the user shows a preference on a certain criterion, this is not considered to be an absolute factor to determine dominance between solutions and is instead taken as support by the framework when performing a project portfolio selection.

The previous query-based interactive system research is extended in another study [42]. This investigation considers two-stage and multi-stage robust optimization problems, including R&D PPS problems. The DSS framework interacts with the user before making a recommendation by performing a series of queries where the DM must define a value that reflects the level of attractiveness of a particular item. The system uses these values to elicit preferences considering one of two possible models: maximizing the worst-case utility or minimizing the worst-case regret of the item recommended.

Another interactive DSS framework has recently been proposed in Nowak & Trzaskalik [43]. Their work presents an MCDSS that interacts with the DM during each iteration and allows the user to redefine his/her preferences and constraints to solve dynamic PPS problems. Their DSS considers two possible sources that lead to a change in the problem environment: a time-dependent variable and a change in the DM's preferences and constraints.

These last three proposals allow the user to provide new preferences by using queries. The proposal presented in this paper aims to provide an interactive recommender system that receives new preferential information from the DM and adapts its recommendation. The proposed system is also able to provide the DM with new information based on the knowledge obtained. In addition, the system can argue and defend its proposed project portfolio selection through arguments, with the objective that the DM understands, through dialogue, that the recommendation presented by the system is the most appropriate based on the current information.

The intention of allowing the proposed DSS framework to defend its recommendation through arguments is to allow the user to learn in detail the characteristics and properties of the problem to solve. This also allows the DM to see thoroughly the reasons for the portfolio selection made by the DSS. This paper focuses on the use of argumentation theory not only to support the solution process for a PPS problem, but also to allow the system to defend the recommended solution. Additionally, this paper incorporates two newly proposed state transition diagrams (STDs), argumentation schemes, and a proof standard (TOPSIS [17]) different from those proposed by Ouerdane [35].

#### **3. Background**

This section reviews the concepts needed to understand the proposed work and how it operates. The reviewed concepts focus on the decision-making problem and several of the most relevant approaches to it, recommender systems, and argumentation theory.

#### *3.1. Multi-Objective Optimization Problem*

As mentioned in Section 1, many decision problems involve multiple objectives that must be satisfied and that are usually in conflict with each other. Equation (1) presents the definition of a multi-objective optimization problem (MOP). This particular example presents a maximization MOP, looking to obtain the decision variable vector *x* that obtains the highest possible value for the *M* objectives within the function set *F*. However, it is also possible to define minimization MOPs or to combine maximization and minimization for different subsets of objectives.

$$\max F(\vec{x}) = \{f_1(\vec{x}), f_2(\vec{x}), \dots, f_M(\vec{x})\} \quad \text{s.t.} \ g(\vec{x}) > 0, \ h(\vec{x}) = 0. \tag{1}$$

Each MOP has a set of inequality (*g*) and equality (*h*) constraints that define the solutions' feasibility. Based on the above scenario, it is understandable to believe that there are cases in which defining a single solution as optimal over all the other candidates is impossible. At this point, it falls to the decision-maker to carry out the selection of the most appropriate solution (or set of solutions) based on his preferences.
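The reason why no single candidate can always be declared optimal can be illustrated with Pareto dominance for a maximization MOP. The following is a minimal sketch, not part of the original work: the function names `dominates` and `non_dominated` and the objective vectors are illustrative assumptions.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a Pareto-dominates b under maximization:
    a is no worse on every objective and strictly better on at least one."""
    no_worse = all(ai >= bi for ai, bi in zip(a, b))
    strictly_better = any(ai > bi for ai, bi in zip(a, b))
    return no_worse and strictly_better


def non_dominated(points: List[Sequence[float]]) -> List[Sequence[float]]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Made-up objective vectors with M = 2 objectives (illustration only).
candidates = [(3.0, 1.0), (2.0, 2.0), (1.0, 3.0), (1.0, 1.0)]
front = non_dominated(candidates)
```

In this toy data, three of the four candidates are mutually non-dominated, so choosing among them requires the DM's preferences.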

#### 3.1.1. The Decision Making Problem

In real-life situations, the DM may be represented by a person or group that seeks to improve their profits. However, the DM might not have enough resources to support all available alternatives simultaneously. This leads to what can be defined as a decision-making problem. It is necessary to search for actions that meet the current goals in the best way possible, using the available resources and maximizing profit.

Decision-making problems present four basic elements [44]: A set of one or several objectives to solve; a set of candidate solutions to achieve all objectives within the set; a set of factors that define the environment that surrounds the problem; and a set of utility values associated with each solution when they interact with the current environment.

In these cases, DMs might use multi-criteria decision support systems (MCDSS) to support their decisions. An MCDSS uses computational techniques to analyze highly complex decision problems in a reasonable computational time [45]. Multi-criteria decision analysis (MCDA) is a collection of concepts, methods, and techniques that seek to help individuals or groups make decisions involving conflicting points of view and multiple stakeholders [46]. MCDA methods are relevant components of MCDSS. Five elements are involved in these methods: goal, decision-maker, alternatives or actions, preferences, and a solution set based on preferences.

#### 3.1.2. Project Portfolio Selection Problem

An example of a decision-making problem can be seen in the project portfolio selection (PPS) problem. A project is defined as a temporary, unique, and unrepeatable process that pursues a specific set of objectives [47]. A project portfolio is a set of projects selected for future implementation.

In this case, a person or organization has a set of projects to carry out. These projects share the resources currently available, and there is the possibility that several of those projects complement each other, as they are effective in the same area. Therefore, it is necessary to know which project portfolio meets an organization's demands, maximizing its profit.

Equations (2)–(4) present a formal definition of the PPS problem. Let *N* be the number of available projects. A project portfolio *x* is an *N*-sized binary vector. The projects that have been selected are given a value of 1, while the non-selected projects are given a value of 0. The value of a project portfolio for an objective *i* is defined by the sum of each selected project's profit towards the said objective. The profit matrix *p* contains the respective profit obtained by the *j*th project for the *i*th objective.

Two main constraints restrict the PPS problem. First, the budget threshold, which is presented in Equation (3). The cost vector *c* defines how much each project costs, while *B* defines the maximum available current budget. The sum of all the selected projects' costs must be equal to or lower than *B*.

The second constraint refers to all the areas involved in the problem. Thus, it is necessary to consider a number *A* of areas and a binary project-area matrix *a*, which defines which projects are assigned to each area. Each area has lower and upper investment thresholds *Lk* and *Uk*, respectively. The sum of all selected projects' costs involved in each area must be between those two thresholds for the portfolio to be considered feasible.

$$\max f_i(\vec{x}) = \sum_{j=1}^{N} x_j p_{i,j}, \tag{2}$$

subject to

$$\sum_{i=1}^{N} x_i c_i \le B, \tag{3}$$

$$L_k \le \sum_{i=1}^{N} x_i c_i a_{k,i} \le U_k, \quad k = 1, 2, \dots, A. \tag{4}$$
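To make Equations (2)–(4) concrete, the following minimal sketch evaluates a binary portfolio against a toy instance. The function names `objective_values` and `is_feasible` and all numeric data are illustrative assumptions, not values from this work.

```python
from typing import List, Sequence


def objective_values(x: Sequence[int], p: Sequence[Sequence[float]]) -> List[float]:
    """Equation (2): f_i(x) = sum_j x_j * p[i][j], one value per objective i."""
    return [sum(xj * pij for xj, pij in zip(x, row)) for row in p]


def is_feasible(x, c, B, a, L, U) -> bool:
    """Check the global budget (Equation (3)) and the area bounds (Equation (4))."""
    total_cost = sum(xi * ci for xi, ci in zip(x, c))
    if total_cost > B:
        return False
    for k, area_row in enumerate(a):
        # Cost of the selected projects that belong to area k.
        area_cost = sum(xi * ci * aki for xi, ci, aki in zip(x, c, area_row))
        if not (L[k] <= area_cost <= U[k]):
            return False
    return True


# Toy instance (assumed data): N = 4 projects, M = 2 objectives, A = 2 areas.
p = [[10, 20, 30, 40],   # profit of each project for objective 1
     [4, 3, 2, 1]]       # profit of each project for objective 2
c = [5, 6, 7, 8]         # project costs
B = 15                   # global budget
a = [[1, 1, 0, 0],       # projects 1 and 2 belong to area 1
     [0, 0, 1, 1]]       # projects 3 and 4 belong to area 2
L = [3, 3]               # lower investment thresholds per area
U = [10, 10]             # upper investment thresholds per area

x = [1, 0, 1, 0]         # select projects 1 and 3
values = objective_values(x, p)
feasible = is_feasible(x, c, B, a, L, U)
```

Here the portfolio costs 12 in total (5 in area 1 and 7 in area 2), so it satisfies both constraints; selecting projects 1 and 2 instead would exceed the upper threshold of area 1 and leave area 2 below its lower threshold.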

#### *3.2. Recommender System*

By solving a PPS problem using a method such as a genetic or an exact algorithm, it is possible to generate a set of good-quality candidate solutions. However, a prevalent issue at this step is that the DM is presented with more potential solutions than can be analyzed using human capability alone. It is also necessary to consider that the DM's preferences might have changed during the problem's solution, making the decision-making process even more difficult.

A recommender system is a potential alternative for this situation. This system relies on the DM's preferences and a set of various heuristics to direct its search and define which solutions from the set may be more attractive to the DM [48]. Specifically, in the PPS problem, a set of solutions, global and area budget constraints, and DM preferences can be used to determine the most appropriate project portfolios.

However, there is a possibility that the DM is not entirely convinced and needs to know the reasons behind the decision made by the recommender system. Other possible situations that the system might face when presenting a solution to the DM are related to the human factor. For example, the DM may not know how to express his preferences correctly, may not fully know the details of the problem, and may even directly reject the system's recommendation without waiting for a justification. For these reasons, it is desirable to establish a quick relationship with the DM. The theory of argumentation offers an alternative to carry out this relationship.

#### *3.3. Argumentation Theory in Decision Making*

The argumentation theory is within the field of artificial intelligence. It can be defined as the process of constructing and evaluating arguments to justify conclusions. This allows decision-making to be carried out in a justified manner. This theory is based on nonmonotonic reasoning. This means that the conclusions obtained may be modified and even rejected when new information is presented [35].

The most relevant elements to consider within the argumentation theory are cognitive artifacts, proof standards, and argumentation schemes.

#### 3.3.1. Cognitive Artifact

Cognitive artifacts are human-made objects that seek to help or enhance cognition. Their use is focused not only on supporting memory but also on guiding reasoning towards classifications and comparisons among several alternatives [49]. The support to the decision-making process presented by the argumentation theory can be seen as a set of cognitive artifacts used sequentially. This sequence occurs through an interaction between an expert and a client. According to [50], this process uses four cognitive artifacts: a representation of the problem, a formulation of the problem, a model of evaluation, and a final recommendation. This work addresses these last two artifacts.

#### 3.3.2. Proof Standard

In argumentation theory, all statements must be analyzed to determine their truthfulness and their effect on a possible conclusion the DM desires to reach [35]. Proof standards are methods and techniques that allow the unification of a set of arguments for and against a certain conclusion. These proof standards analyze and determine each argument's strength and value to solve the conflict between them by accepting or rejecting the established conclusion.

A basic example of a proof standard is the simple majority. This standard takes a statement such as "project *x* is better than project *y*". For this case, the *M* objectives are considered, and the values obtained by both projects for each objective are analyzed. If *x* has more objectives with a better value than *y*, then the conclusion is true. This expression can be formally defined as presented in Equation (5), where *Si* represents the dominance factor for objective *i*.

$$x \succeq y \leftrightarrow |\{i \in M : x S_i y\}| \ge |\{i \in M : y S_i x\}|. \tag{5}$$
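Equation (5) can be sketched in a few lines. This sketch assumes that *x* *Si* *y* means that *x* has a strictly higher value than *y* on objective *i* (all objectives maximized); that interpretation, the function name `simple_majority`, and the data are assumptions made for illustration.

```python
from typing import Sequence


def simple_majority(fx: Sequence[float], fy: Sequence[float]) -> bool:
    """Equation (5): x is at least as good as y if x wins on at least
    as many objectives as y does (ties on an objective count for neither)."""
    wins_x = sum(1 for vx, vy in zip(fx, fy) if vx > vy)
    wins_y = sum(1 for vx, vy in zip(fx, fy) if vy > vx)
    return wins_x >= wins_y


# Objective values of two alternatives on M = 3 objectives (made-up data).
fx = [40, 6, 12]
fy = [35, 9, 10]
x_preferred = simple_majority(fx, fy)   # x wins on objectives 1 and 3
```

Note that, because Equation (5) uses a non-strict inequality, two alternatives that win on the same number of objectives are each considered at least as good as the other.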

#### 3.3.3. Argumentation Scheme

Argumentation schemes can be defined as argumentative structures capable of detecting common and stereotypical patterns of human reasoning [51]. They are based on a set of inference rules in which the existence of certain premises can lead to a conclusion. The structure of the schemes is based on non-monotonic reasoning, allowing the entry of new information, altering the state of the conclusion.

An argumentation scheme is composed of three main elements:


Argumentation schemes are not necessarily complex. For example, the cause to effect scheme [52] is based on two premises: if event *A* occurs, event *B* occurs as a consequence, and *A* has occurred. Therefore, the conclusion is that *B* will occur. Critical questions focus on the strength of the relationship between *A* and *B*, on whether the evidence is strong enough to warrant this event, and on whether other relevant factors also provoke *B* to occur.
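The cause to effect example can be represented as a small data structure. The sketch below is only one possible encoding and is not taken from this work: the class `ArgumentationScheme`, its fields, and the rule that a rebutted critical question defeats the conclusion are assumptions chosen to illustrate non-monotonic reasoning.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class ArgumentationScheme:
    name: str
    premises: List[str]
    conclusion: str
    critical_questions: List[str] = field(default_factory=list)

    def conclusion_holds(self, accepted: Set[str], rebutted: Set[str]) -> bool:
        """The conclusion is presumptively accepted when every premise is
        accepted and no critical question has been answered against it."""
        premises_ok = all(p in accepted for p in self.premises)
        unrebutted = not any(q in rebutted for q in self.critical_questions)
        return premises_ok and unrebutted


cause_to_effect = ArgumentationScheme(
    name="cause to effect",
    premises=["If A occurs, then B occurs", "A has occurred"],
    conclusion="B will occur",
    critical_questions=[
        "Is the relationship between A and B strong enough?",
        "Do other relevant factors also provoke B?",
    ],
)

accepted = set(cause_to_effect.premises)
before = cause_to_effect.conclusion_holds(accepted, rebutted=set())
# New information arrives: the first critical question is answered negatively.
after = cause_to_effect.conclusion_holds(
    accepted, rebutted={"Is the relationship between A and B strong enough?"}
)
```

The conclusion is accepted at first and withdrawn once new information rebuts a critical question, which is the non-monotonic behavior described above.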

#### *3.4. Dialogue Game*

One possible way to represent argumentation theory within decision-making problems is through the use of dialogue games. These games model, verbally or in writing, the interaction between two or more individuals, called players. The dialogue game intends to exchange arguments both for and against a statement between the players to reach a satisfactory conclusion [53].

Multiple elements must be considered for the dialogue game, such as the players and their respective roles, objectives, limitations, etc. Like any game, a set of rules must be established that defines which actions are acceptable or not during the dialogue. Also, it is necessary to define a system to determine the movements that each participant is allowed to perform at the different stages of the dialogue game.

#### 3.4.1. Dialogue Game Rules

The dialogue game rules establish how the game is performed, defining criteria such as the starting and ending points of the game and the movements allowed for each player. These rules also define the criteria necessary to allow a coherent dialogue between the players. Each one can provide statements, arguments, and premises considered acceptable by the other participants, avoiding fallacies and dialogue loops that would stall the dialogue at a certain point [53].

There are four different types of dialogue game rules.


#### 3.4.2. State Transition Diagram

Based on the defined dialogue game rules, it is possible to identify which movements are allowed for each player and when he/she can use them. A state transition diagram (STD) can represent the evolution of the dialogue game graphically. An STD lets the players visualize each state the dialogue can be in, which player's turn it is, and which movements are available to that player. Similarly, an STD represents the starting and ending points of the game. With this, the four different types of rules of the dialogue game are effectively represented.

#### **4. Proposed Work**

This section describes the methodology and the different cognitive components defined for DAIRS. Afterward, a prototype proposed in this paper implements this methodology, which allows a user-system interaction through a dialogue game. This work focuses on two cognitive tasks: the evaluation model of the alternatives based on proof standards and the construction of arguments for the proposed recommender system's recommendation using argumentation schemes and a dialogue game.

#### *4.1. DAIRS Methodology*

The evaluation of alternatives is the process of assessing a set of alternatives based on their attributes, indicators, or dimensions [50]. In this case, the alternatives are the feasible project portfolios for the PPS problem. Each portfolio is evaluated considering its performance on each objective and set of constraints. A criteria weight vector or a criteria hierarchy order is commonly used to evaluate alternatives. Therefore, DAIRS also considers these two elements when evaluating portfolios to create a recommendation.

Using the previous information regarding the properties of the problem provided by the DM, a proof standard is selected considering said properties and used to evaluate all the feasible portfolios. Then, the recommender system defines an initial recommendation supported by the information provided by the DM and an abductive inference argumentation scheme based on the information obtained by the proof standard used. Therefore, before the dialogue game has begun, the system already has an initial portfolio recommendation to present to the user according to his/her preferences and arguments to defend said recommendation.

The recommender system presented in this work requires a set of crucial elements for its operation: a set of proof standards, a set of argumentation schemes, and a dialogue structure that defines how the user and the system will interact bidirectionally through a dialogue game.

#### 4.1.1. Proof Standards

To carry out a proper dialogue game between the user and the system, it is necessary to define methods that allow correctly collecting and analyzing the arguments for and against the current statement to reach a reasonable conclusion. Proof standards allow performing such collection and analysis.

The recommender system is capable of using a large number of proof standards. For this work, the orientation of the set of proof standards selected aims towards defining a solution for the PPS problem and is based on Ouerdane's work [35].

DAIRS considers proof standards that use a criteria preference hierarchy. These standards allow the user to define strict preferences between objectives. The recommender system focuses its search on the criteria defined as most relevant by the DM.

*Simple majority*: As explained in Section 3.3.2 and presented in Equation (5), this standard evaluates the truth of the statement "*x* is better than *y*" based on the number of objectives on which this statement holds.

*Lexicographic order*: This proof standard uses a hierarchical order established in the criteria. A project *x* is better than a project *y* if, and only if, *x* has a better value on a criterion of higher priority than *y*. The criteria hierarchy establishes that a higher-order criterion is infinitely more important than those in a lower position. Therefore, this method disregards the value of any other criterion of lower priority.
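The lexicographic rule can be illustrated with a minimal Python sketch. It assumes that every criterion is to be maximized and that the hierarchy is given as a list of criterion indices ordered from most to least important; the function name and this representation are illustrative assumptions and are not taken from DAIRS.

```python
from typing import Sequence


def lexicographic_better(
    x: Sequence[float], y: Sequence[float], priority: Sequence[int]
) -> bool:
    """Return True if x beats y on the highest-priority criterion where they differ."""
    for i in priority:  # criterion indices, most important first (assumed encoding)
        if x[i] != y[i]:
            # The first differing criterion decides; lower-priority
            # criteria are never inspected. Larger is assumed better.
            return x[i] > y[i]
    return False  # identical on every criterion: neither is strictly better
```

Note that swapping the order of two criteria in `priority` can reverse the outcome, which is why this standard only suits a DM with strict preferences.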

In some cases, the DM prefers specific criteria, but this preference is not strict: a lower-priority criterion is still accepted as relevant when its improvement is significant enough to pass a certain threshold. Therefore, DAIRS also considers proof standards that analyze each project portfolio using a criteria weight vector, which determines each objective's relevance. These standards allow the system to identify possible significant improvements in criteria with different levels of importance for the DM.

*Weighted majority*: This method follows a strategy similar to that of simple majority. However, it relies on the weight of each criterion. In this case, a criteria weight vector *w* assigns a weight to each criterion *i* (*wi*). Portfolio *x* is preferred over portfolio *y* if the sum of the weights of the criteria where *x* is better than *y* is greater than or equal to the sum of the weights of the criteria where *y* is better than *x*, as stated in Equation (6).

$$\mathbf{x} \succeq \mathbf{y} \leftrightarrow \mathcal{W}\_{\mathbf{x}\mathbf{y}} = \sum\_{\mathbf{x} \mathbf{S}\_{i}\mathbf{y}} w\_{i} \ge \mathcal{W}\_{\mathbf{y}\mathbf{x}} = \sum\_{\mathbf{y} \mathbf{S}\_{i}\mathbf{x}} w\_{i}. \tag{6}$$

*Weighted sum*: This method defines a single fitness value *Sx* for each portfolio based on *w* and the fitness value *f* obtained on each objective *i* (*fi*(*x*)). Let *N* be the number of criteria for the current problem. Equation (7) presents a formal definition of the previous statement. Portfolio *x* is preferred over portfolio *y* if, and only if, the sum of *x* is greater than the sum of *y* (*Sx* > *Sy*).

$$S\_x = \sum\_{i=1}^{N} w\_i f\_i(x). \tag{7}$$
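Equations (6) and (7) can be sketched in a few lines of Python. The sketch assumes maximized criteria and treats ties on a criterion as contributing to neither side in Equation (6); both the function names and the tie handling are assumptions made for illustration.

```python
from typing import Sequence


def weighted_majority_prefers(
    x: Sequence[float], y: Sequence[float], w: Sequence[float]
) -> bool:
    """Equation (6): x is preferred when W_xy >= W_yx."""
    # Weight of the criteria won by x, and of those won by y.
    w_xy = sum(wi for xi, yi, wi in zip(x, y, w) if xi > yi)
    w_yx = sum(wi for xi, yi, wi in zip(x, y, w) if yi > xi)
    return w_xy >= w_yx


def weighted_sum(x: Sequence[float], w: Sequence[float]) -> float:
    """Equation (7): S_x = sum_i w_i * f_i(x)."""
    return sum(wi * fi for wi, fi in zip(w, x))
```

The two standards can disagree: weighted majority only looks at which portfolio wins each criterion, whereas weighted sum also rewards the size of the difference.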

*TOPSIS*: This proof standard is based on a method proposed in [17], which considers both the distance to the ideal solution, also known as utopia point, and the distance towards the negative ideal solution or nadir point. The solution that is closer to the former and furthest from the latter is the one that takes precedence.
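As a rough illustration, the following Python sketch computes the usual TOPSIS closeness coefficient. It assumes that all criteria are maximized, that columns are vector-normalized, and that distances are Euclidean; these are the textbook choices and may differ from the exact variant used in DAIRS.

```python
import math
from typing import List, Sequence


def topsis_scores(
    matrix: Sequence[Sequence[float]], weights: Sequence[float]
) -> List[float]:
    """Closeness of each row to the ideal point: 1.0 is best, 0.0 is worst."""
    n = len(weights)
    # Vector-normalize each column; fall back to 1.0 for an all-zero column.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) or 1.0 for j in range(n)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n)] for row in matrix]
    ideal = [max(row[j] for row in v) for j in range(n)]  # utopia point
    nadir = [min(row[j] for row in v) for j in range(n)]  # negative ideal point
    scores = []
    for row in v:
        d_plus = math.dist(row, ideal)   # distance to the ideal solution
        d_minus = math.dist(row, nadir)  # distance to the nadir solution
        total = d_plus + d_minus
        scores.append(d_minus / total if total > 0 else 0.0)
    return scores
```

The portfolio with the highest score is the one closest to the utopia point and furthest from the nadir point.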

The selection of the proper proof standard is essential to obtain a successful recommendation that respects both the DM's preferences and the quality of the solution itself. This selection has a strong impact on the dialogue game. Each proof standard has a set of properties that distinguishes it from the other standards. During the dialogue game, both the user and the system can define which properties should be considered in the discussion to obtain better recommendations or enhance the dialogue game's quality, based on the information provided by both players. The properties considered for the proof standard selection are:


This set of properties is based on the recommendations provided by multiple works in the literature [35,52]. Table 1 shows the properties belonging to each proof standard. It should be noted that both simple majority and weighted majority methods can be used with or without a veto threshold.


**Table 1.** Proof Standards used for DAIRS and their properties.

\* These proof standards can be used with or without using a veto threshold.

#### 4.1.2. Argumentation Schemes

In addition to determining the proof standards to be used, it is necessary to define which human behavior patterns to consider for a dialogue within the system. The intention of defining the patterns to be identified is to regulate the system responses based on these patterns and establish boundaries in the dialogue to avoid situations such as infinite dialogue loops or loss of focus. For this reason, it is necessary to establish a set of argumentation schemes, which allow the process of identifying behavioral patterns to be carried out.

This work seeks to incorporate the proof standards selected in the previous subsection to strengthen and facilitate premise analysis and to define a conclusion for the current statement in the dialogue through argumentation schemes. These schemes are chosen considering proposals provided in previous related works [35,52]:

*Abductive reasoning argument*: This argumentation scheme allows the system to select the most suitable proof standard according to the properties currently identified from the information provided by the user and the system.

*Argument from position to know*: The system performs an initial recommendation using this argumentation scheme after the system chooses a proof standard. This scheme also provides recommendations for the dialogue game's first cycles. With this, the system does not consider itself an expert yet as it has only obtained the initial information given by the list of available projects, DM's preferences, and budget threshold.

*Argument from an expert opinion*: After several cycles have passed in the dialogue game, surpassing a certain number of cycles, defined as *cycle threshold*, the system considers that it has obtained enough information from the user to position itself as an expert for the problem analyzed. Under this scheme, the system is more assertive in its arguments, as it has more information to defend them instead of just expecting to obtain new data from the user.

*Multi-criteria pairwise comparison*: The system compares the current recommendation against other alternatives, as well as solutions picked by the user that might attract his/her interest. The proof standard currently being used supports this scheme to form arguments to either defend the recommendation or select the user-picked solution if the new information provided proves that the DM's selection outperforms the system's recommendation under the current proof standard.

*Practical argument from analogy*: Sometimes, two solutions might be similar to a high degree. Therefore, it is necessary to consider if the previous solutions considered by either the system or the user can be considered recommendations for the dialogue game's current stage.

*Ad ignorantiam*: The current state of the system is unable to make inferences. All the information known by the system is considered valid by it. Meanwhile, all unknown information is considered false. The user can provide the system with new information regarding the problem in discussion at any point during the dialogue game.

*Cause to effect*: A change in the state of a proof standard property or the value of a criterion affects the current state of the system's recommendation. Whenever a change is detected, the system performs a reevaluation of the current solutions based on the new information. Then, it provides the user with a new recommendation, and the dialogue game continues.

*From bias*: The system considers this fallacy as the user might be biased towards a particular solution. While one of the recommender system's objectives is to provide the most suitable solution, user satisfaction is also a very relevant factor that a system must consider. Therefore, the system allows the user to set the recommended solution as the alternative the user picks. However, the system constantly reminds the DM that his/her choice might be biased and not the best available.

During the dialogue game, the system uses an argumentation scheme selected depending on the activities carried out in its current state by either the system or the user. Therefore, it is necessary to properly establish the dialogue game structure to use the correct argumentation scheme to characterize the arguments and premises used in the dialogue's current state.

The system relies on argumentation schemes to accept or reject a statement and obtain information, leading to changes in the problem's criteria values or the state of the proof standard properties. As previously mentioned, argumentation schemes can define the most suitable proof standard according to the current information provided.

#### 4.1.3. Dialogue Game Rules

DAIRS aims to use a dialogue game to establish a bidirectional interaction between the user and the system. This interaction allows both participants to provide statements that enrich the available information and ease the decision-making process.

Before carrying out a dialogue game between the user and the system, it is necessary to define the set of rules that the players will follow in the game. As previously mentioned in Section 3.4, there are four types of dialogue game rules: locution, compromise, dialogue, and termination.

The compromise, dialogue, and termination rules followed in this work are established in [35]. However, the locution rules provide two main additions. First, the system can reject an argument presented by the user if it does not satisfy the current evaluation criteria. Second, the user is allowed to reject the system's recommendation at multiple points during the dialogue. These additions focus on the system's capability to defend its recommendation and user's satisfaction. Table 2 presents the locution rules used in this work. Let *φ* be the current statement, *C* a critical question, and "type" refers to *C* being an assumption or exception.

**Table 2.** Locution rules for the dialogue game used in this work.


#### 4.1.4. State Transition Diagrams

Once the dialogue game rules are defined, it is possible to design state transition diagrams (STDs). An STD graphically represents how the dialogue will flow. A noticeable advantage of using STDs is that they offer an easy method to identify and regulate how the dialogue transpires. Also, STDs show the movements available to both players at each stage of the interaction.

The proposed recommender system uses two STDs. Before a dialogue game begins, the system selects one of these diagrams to establish a user-system interaction for the current instance. The deciding factor is whether the DM establishes a criteria preference hierarchy before the dialogue game begins. Depending on which scenario occurs, the system will use a particular STD and a different proof standard according to which properties are considered active.

The reasoning for using different STDs based on the DM's preferences is to take advantage of the amount of information and knowledge the user has regarding the problem. DAIRS provides a learning-focused dialogue if the DM has little knowledge of the problem. Meanwhile, the system provides a more assertive and portfolio selection-driven dialogue if the DM has an acceptable level of knowledge and it is possible to skip or shorten the learning phase.

*State Transition Diagram 1 (STD1)*: This diagram is chosen whenever the initial information available about the problem does not provide an explicit preference hierarchy regarding the problem criteria. This STD follows the structure defined in [35] while adding the additional locution rules mentioned previously. In particular, it adds a move that allows the system to reject the user's suggestion if there are no additional reasons for supporting his or her statement after a certain number of dialogue cycles have passed. A *dialogue cycle* can be defined as the point in the dialogue game when it reaches the initial state (1) once again. The system explains to the user that the reason for this rejection is to avoid a dialogue loop and continue the recommendation process. Figure 1 shows the structure of STD1.

*State Transition Diagram 2 (STD2)*: The second STD is used when there is an explicit user-defined preference hierarchy for the criteria before the dialogue game begins. The system seeks to exploit this situation to use and obtain as much information as possible from the early states of the dialogue game. Also, it allows the user to present critical questions from the beginning, which is not allowed when using STD1. STD2 provides more flexibility for the user by allowing him/her to reject the recommendation from the initial states of the dialogue onward. Figure 2 shows the structure of STD2.

**Figure 1.** State Transition Diagram 1, used when there is not an explicit criteria preference hierarchy defined on the initial information.

**Figure 2.** State Transition Diagram 2, used when there is an explicit criteria preference hierarchy defined on the initial information.

#### 4.1.5. System Modules

The next step is to incorporate the dialogue game and all its necessary procedures into DAIRS. Previously, four main processes were identified as necessary to execute a recommendation process properly [36]. Figure 3 presents the structure of these modules.

*Load instance module*: Reads the information concerning an instance to be solved by the recommender system. The DM uploads a file containing initial instance data to the system. This file contains information such as the number of candidate solutions and criteria, a criteria weight vector, a solution/objective matrix, budget threshold, veto threshold (if required), and criteria hierarchy. For the PPS problem, it is also necessary to insert additional data, such as the project portfolio matrix, representing the projects selected by each portfolio.

*Configuration module*: The system analyzes the information from the instance obtained in the load instance module to determine the initial configuration of all the elements required to start a dialogue game, such as the dialogue game rules, the state transition diagram, and the initial proof standard. This setup will allow the system to provide an initial recommendation to start the dialogue with the user.

*Dialogue module*: The user and the system start the dialogue game. The system's main objective is to convince the user to accept the recommendation provided by it. However, the user can reject the current recommendation or add new information and modify the initial configuration. This process will provide new information to the system, which the system will use to generate a new recommendation.

*Recommendation acceptance/rejection module*: The user can accept or reject the system's recommendation. This module determines a final step in the dialogue. The proposed recommender system attempts to consider the human factor by allowing the user to reject the solution at several stages of the dialogue, even if the solution recommended is the most suitable according to the current information provided by both the instance and the user. This option aims towards the user's satisfaction. As previously mentioned, while the recommender system's objective is to provide a high-quality recommendation, it is also desirable that the user feels satisfied with his/her final decision. User satisfaction is also an objective that any recommender system must pursue.

These modules adequately represent a recommender system's structure supported by concepts related to argumentation theory, such as argumentation schemes. For this reason, the development of the recommender system presented in this paper uses the previously mentioned structure.

**Figure 3.** Module diagram of the proposed recommender system. Argumentation schemes are used within the dialogue module.

#### *4.2. Interactive Prototype*

The next step in developing the proposed recommender system is the implementation of a prototype, which incorporates all the previously mentioned elements (argumentation schemes, proof standards, dialogue games, and STDs). The proposed methodology intends to properly carry out a dialogue game, following the dialogue structures defined and represented in the STDs. The development of this prototype allows a user to interact directly with the recommender system and makes it possible to evaluate the usability of the framework designed in the previous subsection.

Figure 4 shows a dialogue game carried out between two users following the proposed structure. This dialogue shows an interaction between a recommender system (which plays the role of an expert) and the DM (who plays the role of the user). While the system presents recommendations, the user can question them, challenge them, or argue. The end of the dialogue relies on the user's final decision to accept or reject the system's recommendations. Note that the system can evolve its recommendation into a new one when the information provided presents valid arguments to justify the change.


**Figure 4.** Example of a dialogue game between user and system following the defined STDs.

#### 4.2.1. Bidirectional Interaction Algorithm

Algorithm 1 corresponds to the proposed method for bidirectional interaction between the user and DAIRS. The objective is to present the user with recommended solutions and an explanation of the recommendation while receiving the DM's preferences. The system must define several argumentation elements before the user-system interaction within the prototype may begin: a set of proof standards *PS* and its properties *PSProperties*, the initial set of premises *Premises*, the argumentation scheme set used for the dialogue game *Schemes*, the dialogue game rules *D*, and the set of available state transition diagrams *STD*. The output of this algorithm is a portfolio recommendation *rp*.

This algorithm also requires a file of the instance (*file*). This file must contain a set of elements as part of the initial input: an alternative/criteria value matrix (*C*), a criteria preference hierarchy (*PrefC*), a criteria weight vector (*W*), a veto threshold vector for all criteria (*V*), a set of available project portfolios (*P*), their respective costs (*Pcost*), and the maximum allowed budget (*B*). Appendix A shows in more detail the information that this file should contain.
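For orientation, the instance data listed above could be held in a structure such as the following Python sketch. The class, its field types, and the `feasible` helper are hypothetical; in particular, the encoding of portfolios as 0/1 project-selection rows and the rule that a portfolio is feasible when its cost does not exceed *B* are assumptions, and the actual file layout is the one described in Appendix A.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Instance:
    """Hypothetical in-memory form of the instance file (names follow Algorithm 1)."""

    C: List[List[float]]  # alternative/criteria value matrix
    W: List[float]        # criteria weight vector
    V: List[float]        # veto threshold per criterion
    P: List[List[int]]    # portfolios as 0/1 project-selection rows (assumed encoding)
    Pcost: List[float]    # cost of each portfolio
    B: float              # maximum allowed budget
    PrefC: Optional[List[int]] = None  # criteria hierarchy; None if not provided

    def feasible(self) -> List[int]:
        """Indices of the portfolios whose cost does not exceed the budget."""
        return [k for k, cost in enumerate(self.Pcost) if cost <= self.B]
```

Leaving `PrefC` as `None` models the case in which the DM provides no hierarchy.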

The algorithm begins using the *Load instance* module to load an instance in step 1, obtaining all the data necessary to proceed to the *Configuration* module. From steps 2 to 7, this module defines each element's values for the cognitive decision tasks and dialogue game, according to the information provided by the instance.

Then, the *Dialogue* module is used from step 8 to step 18, establishing an interaction with the user in step 9. This interaction could modify the values of the alternative/criterion value matrix, the active set of proof standard properties, or the selected proof standard, and it updates the set of premises according to the new information given by the user during that step.

```
Algorithm 1 Bidirectional interaction of DAIRS
 1: {C, PrefC, W, V, P, Pcost, B} ← load_instance(file)
 2: Schemes ← select_schemes(PrefC, Premises, Schemes)
 3: D ← select_locution_rule_subset(D, PrefC)
 4: std ← select_std(STD, D)
 5: PSProperties ← set_properties(PSProperties, PrefC, W, V, Premises)
 6: ps ← proof_standard_selection(PSProperties, PS)
 7: rp ← recommend_portfolio(ps, P, C, PrefC, W, V)
 8: do
 9:     {Premises, C, PSProperties, ps} ← interaction(std, D, Schemes, Premises)
10:     if modified(C) then
11:         update_criteria(C)
12:         rp ← recommend_portfolio(ps, P, C, PrefC, W, V)
13:     else if modified(PSProperties) then
14:         PSProperties ← set_properties(PSProperties, PrefC, W, V, Premises)
15:         ps ← proof_standard_selection(PSProperties, PS)
16:         rp ← recommend_portfolio(ps, P, C, PrefC, W, V)
17:     else if modified(ps) then
18:         rp ← recommend_portfolio(ps, P, C, PrefC, W, V)
19:     end if
20: while !accept_reject(rp)
```

Then, the system checks whether there was a change that could affect the current recommendation. Steps 11 and 12 are executed if there is a change in the alternative/criterion matrix values. These steps update the matrix and use the current proof standard to evaluate all the available portfolios again. Steps 14 to 16 are performed if either the user or the system has modified the proof standard's properties. These steps update the set of active proof standard properties, select the most appropriate proof standard and reevaluate the set of portfolios. The system can directly change the proof standard to offer the user a more flexible system if the user desires. If so, then step 18 is executed, using the chosen proof standard to generate a new recommendation.

The algorithm repeats this process until the user reaches a final state of acceptance or rejection of the system's recommendation. When that happens, DAIRS reaches the *Recommendation acceptance/rejection* module, considering the dialogue game finished and ending the interaction.
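The control flow of steps 8 to 20 can be sketched in Python as follows. This is only an illustration of the branching logic: the four callables stand in for the interaction, proof standard selection, recommendation, and acceptance/rejection procedures of Algorithm 1, the dictionary-based state is an assumption, and the `max_cycles` bound is a safety guard that is not part of the algorithm.

```python
from typing import Any, Callable, Dict

State = Dict[str, Any]


def dialogue_loop(
    state: State,
    interact: Callable[[State], Dict[str, Any]],
    select_standard: Callable[[Any], Any],
    recommend: Callable[[State], Any],
    finished: Callable[[State], bool],
    max_cycles: int = 100,
) -> Any:
    """Repeat steps 8-20 of Algorithm 1 until the user accepts or rejects."""
    for _ in range(max_cycles):  # safety bound; not part of Algorithm 1
        update = interact(state)  # step 9: returns only what changed
        if "C" in update:  # steps 10-12: criteria values were edited
            state["C"] = update["C"]
            state["rp"] = recommend(state)
        elif "properties" in update:  # steps 13-16: properties changed
            state["properties"] = update["properties"]
            state["ps"] = select_standard(state["properties"])
            state["rp"] = recommend(state)
        elif "ps" in update:  # steps 17-18: proof standard set directly
            state["ps"] = update["ps"]
            state["rp"] = recommend(state)
        if finished(state):  # step 20: user accepted or rejected
            break
    return state["rp"]
```

Only one branch runs per cycle, mirroring the `if`/`else if` chain of the algorithm, so a change in the criteria matrix takes precedence over a change in properties or proof standard.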

#### 4.2.2. Graphical User Interface

The graphical user interface of the proposed prototype seeks to allow the user to interact with the system in multiple ways: defining the instance to work with, establishing a dialogue with the system, editing the profit values obtained by each available project, and manipulating the status of the proof standard properties considered by the system to better match the user's preferences. This interface is composed of a set of windows that allow the user to perform these activities.

Figure 5 presents the graphical user interface (GUI) of DAIRS; the primary areas in this interface are:

1. The menu bar. A set of menus that allow the user to perform actions related to the instance and its properties. It contains two sub-menus. The first sub-menu, named *Instance*, allows the user to read, start and restart instances. The second submenu, named *Recommendation Options*, lets the user update criteria values, visualize information regarding all available portfolios, the current state of the dialogue game, and even provides the user a Help window with any necessary additional information regarding the GUI.



**Figure 5.** Main window of the DAIRS GUI. The GUI allows the user to read the dialogue, perform actions and see the available portfolios.

As previously mentioned, the recommender system prototype proposed in this work focuses on the PPS problem, and the GUI intends to allow the user to interact with a recommender system using the proposed structure. The *Load instance* module reads the information about a PPS problem instance from a file. This file includes all the information required to initiate a dialogue game between the user and the system in DAIRS, as explained in Algorithm 1.

The *Configuration* module allows the user to select proper parameters to start the dialogue game, seeking to aid the DM in his/her decision for the uploaded PPS problem. The initial premises and arguments available to both the user and the system are determined based on the information in the instance. The dialogue game rules are defined. Then, DAIRS generates an STD following the structure of said rules. In this case, if there is no criteria hierarchy defined in the instance file, then STD1 is used. However, if there is a preference order defined in the file, then STD2 is used. Finally, a proof standard is selected based on the information provided by said instance.

After this, the *Dialogue* module is reached. In the first step, the system provides an initial recommendation to the user based on the selected proof standard and all available information regarding the candidate portfolios. Figure 6 shows this process. From this point, the dialogue game begins: the user can accept or reject the recommendation, provide his/her arguments to counter the system's proposal, or even introduce additional information that affects the weights or veto thresholds of the criteria, the impact of an alternative on a criterion, or the proof standard. Figure 7 presents a screenshot reflecting these movements.

During the dialogue game, it is expected that the prototype's interface allows changes within the instance. The user can change the value of any criterion for each available project. The user can do so until the process reaches the *Recommendation acceptance/rejection* module when the DM accepts or rejects the current recommendation and considers it a final decision. By doing so, the dialogue game reaches its end.


**Figure 6.** Initial recommendation from the system. DAIRS analyzes all the information provided by the instance file and provides a recommendation.


**Figure 7.** Advanced stage of the dialogue game. The user is questioning the reasoning behind the system's recommendation and the system is able to respond.

#### 4.2.3. Definition of the Dialogue Game Rules

Within the prototype, once the instance to read containing the PPS problem's information has been defined, it is necessary to determine how DAIRS will carry out the dialogue game between the user and the system. For this purpose, the system must define the dialogue game rules.

For this prototype, the compromise, dialogue, and termination rules are identical in all possible scenarios where an interaction between the players occurs to aid the decision-making process of a PPS problem, as shown in Section 4.1.3. However, it is necessary to define the locution rules that each dialogue game will use, based on whether or not there is a hierarchy order established for the criteria.

As mentioned in Section 4.1.4, DAIRS uses two STDs. The first one, STD1, does not require an initial preference hierarchy and focuses on obtaining information regarding the PPS problem on the dialogue game's initial cycles. The second diagram, STD2, is used when the DM defines preferences before the dialogue game begins. In this case, the system is more flexible to the user since DAIRS considers that both players have a better understanding of the problem as there is enough information to determine a hierarchy.

Once the dialogue game rules and STD are defined, the user can communicate with the system through the main window's interaction area after the system has presented an initial recommendation. Considering the structure presented in Figure 5, the user has at his/her disposal a set of actions for interacting with the system before and during the dialogue game. These actions are the ones that allow the creation of bidirectional interactions between the user and the system:


The options available for *DM's decision* depend on the current state of the dialogue within the STD and the locution rules defined. The user can accept the current recommendation (Accept) or reject it (Retract), ending the dialogue game. He/She can also present an argument to challenge the system's actions (Challenge), create an argument for or against the recommendation (Argue), suggest a recommendation for the system to analyze (Assert), or present a critical question that can modify the current proof standard used and its properties (Pose Critical Question).
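The six user moves named above can be captured in a small Python enumeration, shown here as an illustrative sketch. The identifiers are assumptions; only the move labels and the fact that Accept and Retract end the dialogue game come from the description above.

```python
from enum import Enum


class Move(Enum):
    """User moves available in the interaction area (labels from the text)."""

    ACCEPT = "Accept"
    RETRACT = "Retract"
    CHALLENGE = "Challenge"
    ARGUE = "Argue"
    ASSERT = "Assert"
    POSE_CRITICAL_QUESTION = "Pose Critical Question"


# Accepting or rejecting the recommendation ends the dialogue game.
TERMINAL_MOVES = {Move.ACCEPT, Move.RETRACT}


def ends_dialogue(move: Move) -> bool:
    """Return True when the move takes the dialogue to a final state."""
    return move in TERMINAL_MOVES
```

Which of these moves is enabled at a given moment still depends on the current state of the STD and the locution rules in force.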

#### 4.2.4. Use of Argumentation Schemes

DAIRS uses an argumentation scheme based on the current state of the STD and the DM's action in the interaction area. The argumentation schemes used in this prototype are those reviewed in Section 4.1.2. This subsection briefly explains the conditions and events that trigger the use of each scheme.

The *abductive reasoning argument* scheme is used in the prototype when the system reads an instance before generating an initial recommendation, when the user uses the GUI to start a dialogue game, after the user poses a critical question, and when the user argues to have a preference towards a particular criterion.

DAIRS uses the *argument from position to know* scheme after setting the initial proof standard, when the dialogue game starts using the GUI, and when the system provides a new recommendation and the dialogue cycles have not surpassed the *cycle threshold*. The system also uses this scheme if the user challenges a system's argument and when the user poses a critical question.

Meanwhile, the recommender system uses the *argument from an expert opinion* scheme under the same scenarios as *argument from position to know*. However, it is only used when the number of dialogue cycles has surpassed the *cycle threshold*, which implies that the system has a more profound knowledge about the instance.
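The switch between these two schemes reduces to a comparison against the *cycle threshold*, as in the following Python sketch. The function name and the strict inequality (reading "surpassed" as "strictly greater than") are assumptions.

```python
def recommendation_scheme(dialogue_cycles: int, cycle_threshold: int) -> str:
    """Pick the scheme used to present a recommendation at the current cycle."""
    if dialogue_cycles > cycle_threshold:
        # Enough cycles have passed: the system positions itself as an expert.
        return "argument from an expert opinion"
    return "argument from position to know"
```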

Whenever the user wishes to compare the profit or budget of two portfolios, DAIRS uses the *multi-criteria pairwise comparison* scheme. When there is no significant difference between the two compared portfolios, DAIRS uses the *practical argument from analogy* argumentation scheme to support its decision.

For the fallacy-based argumentation schemes, the system uses the *ad ignorantiam* scheme at all times, as all information not introduced into the system is considered false by it. Meanwhile, DAIRS uses the *from bias* scheme when there is a criteria hierarchy defined or if the user decides to define a hierarchy.

Lastly, the prototype uses the *cause to effect* argumentation scheme when the user poses a critical question, asserts a preference towards a specific portfolio, or defines a preference towards a particular criterion as an argument to justify his/her preference for a specific portfolio.

#### 4.2.5. Proof Standard Selection

The last step in the *configuration* module before presenting an initial recommendation is to select the initial proof standard. To do so, the system performs a two-stage method. The first stage corresponds to the definition of the proof standard properties before starting the dialogue game.

In DAIRS, different considerations determine if a property is set as active or inactive. Ordinality is always active unless one of the normalized weight vector values is equal to or exceeds 0.6. Anonymity is active when there is no explicit criteria preference hierarchy defined. Additivity with respect to coalitions and additivity with respect to values are only active if ordinality is active as well. Veto and distance to the worst solution are inactive by default.
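The activation rules above can be read as a small decision function. The following is a minimal sketch, not the DAIRS implementation: the function name, the dictionary keys, and the choice to switch both additivity properties on whenever ordinality is on (the text only states that ordinality is a necessary condition) are assumptions made for illustration.

```python
def initial_properties(weights, has_hierarchy, dominance_threshold=0.6):
    """Return a hypothetical initial on/off state for each proof standard property."""
    total = sum(weights)
    normalized = [w / total for w in weights]
    # Ordinality is inactive when one normalized weight reaches the threshold.
    ordinality = all(w < dominance_threshold for w in normalized)
    return {
        "ordinality": ordinality,
        # Anonymity requires that no criteria preference hierarchy is defined.
        "anonymity": not has_hierarchy,
        # Assumption: additivity is activated whenever ordinality allows it.
        "additivity_coalitions": ordinality,
        "additivity_values": ordinality,
        # Both are inactive by default; the user may activate them in the dialogue.
        "veto": False,
        "distance_to_worst": False,
    }


balanced = initial_properties([1, 1, 1, 1], has_hierarchy=False)
dominated = initial_properties([7, 1, 1, 1], has_hierarchy=True)
```

In this sketch, `balanced` keeps ordinality and anonymity active, while `dominated` (one criterion holding 70% of the weight, with a hierarchy defined) deactivates ordinality, both additivity properties, and anonymity.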

The veto property is defined as inactive by default for the definition of the initial proof standard since the simple majority and weighted majority proof standards can have both veto and non-veto versions. Therefore, the user has the choice to activate this property during the dialogue game. Distance to the worst solution is set as inactive as the system seeks to use basic comparisons between all portfolios during the initial recommendation. The user is allowed to activate this property and access more complex proof standards during the dialogue.

After the proof standard properties setup, the system selects the most suitable standard based on the active properties. The DAIRS prototype analyses each proof standard, choosing the one with the most significant number of related properties active at that time in the system.
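As an illustration of this selection rule, the sketch below counts, for each proof standard, how many of its related properties are currently active and returns the standard with the highest count. The mapping from standards to properties is hypothetical (the source names the simple majority and weighted majority standards, but the property sets assigned to them here are invented for the example), and ties are resolved by dictionary order, which is also an assumption.

```python
def select_proof_standard(active, related):
    """Pick the standard with the most related properties that are active."""
    def score(name):
        return sum(1 for prop in related[name] if active.get(prop, False))
    # max() returns the first standard with the highest score on ties.
    return max(related, key=score)


# Hypothetical relation between proof standards and properties.
related = {
    "simple_majority": ["ordinality", "anonymity"],
    "weighted_majority": ["ordinality", "additivity_coalitions"],
    "hypothetical_distance_based": ["distance_to_worst", "additivity_values"],
}
active = {
    "ordinality": True,
    "anonymity": True,
    "additivity_coalitions": False,
    "additivity_values": False,
    "veto": False,
    "distance_to_worst": False,
}
chosen = select_proof_standard(active, related)
```

With these illustrative inputs, `simple_majority` has two active related properties, more than either alternative, so it is selected.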

Following the proof standard selection, the process moves towards the *dialogue* module and performs an interaction between the user and the system using the DAIRS prototype. In this module, the user can modify the current state of all properties during the argument exchange between him/her and the system. Providing new information can also cause said properties to become active or inactive. There are three conditions in which the system can modify the status of each proof standard property:


If any previous scenarios occur, the recommender system selects the new proof standard to use by considering the active properties or using the standard that the user directly chose. After doing so, the system analyses the available portfolios under the selected proof standard and presents a new recommendation. This process is what the system considers a *dialogue loop*.

#### **5. Experimentation and Analysis**

This section defines an experimental design to evaluate the effect of the developed prototype on various users. Generally, recommender systems are assessed through quantitative measures. However, human factors that affect the acceptance of a recommendation, such as user satisfaction and confidence in the results, must also be evaluated in interactive systems [54].

This experiment seeks to analyze the usability of the recommender system under a real-life simulation of a PPS problem in a controlled environment, where users interact with the system. Under these considerations, this work performs a usability test to evaluate the proposed prototype.

The analysis presented in this section will allow a study of the effects on user overall satisfaction by using argumentation theory concepts, such as argumentation schemes, proof standards, and dialogue games on an MCDSS. This study will also compare the effects on user satisfaction under the two STDs presented in this proposal.

#### *5.1. Experimental Design*

A study is conducted on two groups of seven individuals to evaluate the performance of the DAIRS prototype built; each group includes people with different degrees of computational and mathematical knowledge, from people who have a basic level of computer knowledge to master's degree and Ph.D. students.

Each member plays the role of a user and interacts with the system in a dialogue game. The interaction period given to the users to work with the prototype has a maximum limit of 50 min. The experimentation process performed by each group consists of the following steps:

(1) Introduction to the system: Users are shown the recommender system prototype and given an explanation of how it works. The users receive a detailed explanation of the different components of DAIRS, the possible actions they can perform on the system at any given time, and how the system reacts to each of the user's movements (maximum time length: 5 min).

(2) Initial use of the system to solve a sample PPS problem: In this step, the user directly interacts with the DAIRS prototype for the first time. Users face a real-life simulation of a small-sized PPS problem in terms of the number of available projects. Then, the evaluator asks each user to carry out a set of steps: create a project/profit matrix of the PPS problem presented, analyze a set of previously made project portfolios to solve this problem, and manually select the portfolio he/she believes is the best choice. After that, the users create a file of this PPS problem using the structure mentioned in Section 4.2 and upload it to the recommender system through its GUI. Finally, each user analyzes and compares their decision against the system's recommendation and engages in a brief dialogue game (maximum time length: 10 min).

The introductory PPS problem for this step is a simple example that presents the following scenario: *"You have got \$20,000 in savings, and there are some necessities of life and work that you want to cover which have the following costs:"*


*"However, your savings do not allow you to buy everything, so you must select a subset. You must select which of them to choose taking into account four equally important criteria:"*


(3) Simulation of a complex real-life PPS problem: To fully evaluate the prototype's capabilities, both groups perform a simulation of a real-life PPS under a different environment. The first group works with an instance without a criteria preference hierarchy order defined. Meanwhile, the second group uses an instance with a criteria hierarchy. The PPS problem to solve presents the following scenario:

*Four neighboring cities are planning to apply 25 social projects to improve the citizens' quality of life. However, before these projects were budgeted, a natural disaster severely depleted these towns' funds. Because of this, the cities can implement only a subset of the projects. Each city provided a list defining a level of satisfaction provided to the city by each project. Meanwhile, an analyst was hired, who generated a set of possible combinations of the projects that the cities could execute.*

The first group's users, who manage an instance without a criteria hierarchy, are given the objective: *Determine which project portfolio is the most adequate to best satisfy the four cities.*

The second group's users, who manage an instance *with* a criteria preference hierarchy, are given the objective: *Determine which project portfolio is the most adequate. The user must consider that a council composed of members from all four cities has decided to satisfy mainly one of the four cities as it is the one that generates the most income for all of them.*

At the beginning of this problem's study, each user expresses preferences through a criteria weight vector. The group supervisor makes a consensus using Borda counting [55,56], obtaining a weight vector that characterizes the group. Appendix A presents the information of the PPS problem file.
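The source does not detail how the Borda count is turned into a group weight vector, so the following is only one plausible sketch. It assumes that each user's weight vector induces a ranking of the criteria, that a criterion in position *r* (counting from 0) out of *m* receives *m* − 1 − *r* points, and that the summed points are normalized to obtain the group weights. The function name and the sample vectors are hypothetical.

```python
def borda_group_weights(user_weights):
    """Aggregate individual criteria weight vectors with a Borda count."""
    m = len(user_weights[0])
    points = [0] * m
    for weights in user_weights:
        # Rank criteria from most to least important for this user;
        # ties keep the original criterion order because sorted() is stable.
        order = sorted(range(m), key=lambda c: -weights[c])
        for position, criterion in enumerate(order):
            points[criterion] += m - 1 - position
    total = sum(points)
    # Normalize the Borda points so that the group weights sum to 1.
    return [p / total for p in points]


# Three hypothetical users weighting four criteria.
users = [
    [0.4, 0.3, 0.2, 0.1],
    [0.1, 0.4, 0.3, 0.2],
    [0.3, 0.4, 0.2, 0.1],
]
group = borda_group_weights(users)
```

For these sample vectors the Borda points are 5, 8, 4, and 1, so the second criterion receives the largest group weight (8/18). Note that under this scheme a criterion ranked last by every user would receive a weight of zero.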

Each user receives a file containing the initial information of the PPS problem. The file follows their respective group's structure. The user must upload the file into the recommender system's GUI and engage in a dialogue game to obtain the most suitable solution to satisfy their respective objectives. Since the first group's problem setup does not include a criteria hierarchy, their dialogue game will follow the structure defined in STD1. Meanwhile, the second group will follow the structure of STD2 as there is a predefined criteria ranking (maximum time length: 30 min).

(4) Application of an evaluation to measure the usability of the prototype and to obtain user feedback for potential future work (maximum time length: 5 min).

#### *5.2. Usability Evaluation*

This work uses a usability test to evaluate the performance of DAIRS based on the user's opinion and satisfaction. The usability test analyses six critical elements related to a recommender system's quality: Design, functionality, ease of use, learning capacity, user satisfaction, and result and potential future use. The designed usability evaluation questions and structure are based on the models presented by Lewis and Zins et al.'s works [57,58]. This test uses a score between 0 and 10 as a measure, where 0 means complete disagreement and 10 means complete agreement. Each of the analyzed elements features a subset of questions to evaluate it:

*Design*:


*Functionality*:


*Ease of use*:


*Learning*:


*Result and future use*:


#### *5.3. Results and Analysis*

The results obtained in the usability evaluation for each user were added to a total value per group. Then, the average values were obtained per question and for each of the question subsets representing the elements considered relevant for a recommender system as mentioned in the previous subsection.

Table 3 presents the results obtained on average for each element of the recommendation system considered. A Wilcoxon statistical test [59] was performed to determine whether there is a significant difference between the values obtained by the groups. The greatest difference is found within the satisfaction criterion, while the learning section shows the smallest difference between the two groups.
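The text does not state which variant of the Wilcoxon test was applied. Because the two groups consist of different users, the sketch below assumes the rank-sum form for independent samples and computes an exact two-sided p-value by enumerating every way of assigning the pooled ranks to the first group. The scores are invented for illustration and are not the study's data.

```python
from itertools import combinations


def midranks(values):
    """Rank values starting at 1; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def rank_sum_test(a, b):
    """Return the rank sum of group a and an exact two-sided p-value."""
    ranks = midranks(list(a) + list(b))
    n = len(a)
    observed = sum(ranks[:n])
    expected = n * (len(ranks) + 1) / 2
    deviation = abs(observed - expected)
    # Enumerate all possible rank sums for a group of size n.
    sums = [sum(c) for c in combinations(ranks, n)]
    extreme = sum(1 for s in sums if abs(s - expected) >= deviation - 1e-12)
    return observed, extreme / len(sums)


# Hypothetical per-user average scores for two groups of seven users.
group1 = [9.5, 9.0, 9.2, 8.8, 9.7, 9.4, 9.1]
group2 = [8.0, 8.3, 7.9, 8.5, 8.1, 8.6, 8.2]
w, p = rank_sum_test(group1, group2)
```

With seven users per group there are only 3432 possible assignments, so exact enumeration is cheap and avoids relying on a normal approximation for such small samples.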

Figure 8 shows graphically the average values obtained in each section by each of the groups. As mentioned above, the satisfaction section shows the most remarkable difference between the two groups, while the least remarkable difference is in the learning section. Another observation that this figure presents is that the average value for all sections is higher than 8.

Table 4 presents the average value obtained by question for each group and the difference between them. The results presented in this table allow a specific visualization of the main strengths of each STD based on the users' evaluation. Figures 9 and 10 graphically show the results for each question in STD1 and STD2, respectively. These resources show which elements had a more relevant impact on user satisfaction in each of the analyzed sections of the prototype for each STD and allow them to be compared.

**Table 3.** Difference between the average value obtained per section for both groups. A statistical test is used to find if there is a significant difference between both groups.


\* indicates there is a significant difference between values using Wilcoxon Test.

Based on this, it is possible to assume that STD1, which is selected if the instance does not have a defined criteria hierarchy before establishing a user-system interaction, provides better functionality and satisfaction for the users when used in a dialogue game. Also, according to the values obtained for the questions related to the results and future use, it could be inferred that test users would prefer to conduct a dialogue game using the structure in STD1 over the structure in STD2.

**Figure 8.** Bar graph comparison of the average value obtained per section by both groups.

**Table 4.** Difference between the average per question for both groups. This comparison allows analyzing and understanding the specific elements in which users were most satisfied.


However, the overall user satisfaction while using STD2, which is used if a criteria preference hierarchy is defined from the initial step of the dialogue game, is also acceptable. STD2 presents advantages over STD1 in certain aspects based on the results obtained, specifically regarding the ease of use, which is one of the main objectives of the STD2 design. Therefore, neither can be discarded, as both have potential utility within the prototype and can provide new relevant information to the user during the dialogue.

**Figure 9.** Bar graph showing the average obtained per question by the first group (STD1).

**Figure 10.** Bar graph showing the average obtained per question by the second group (STD2).

Analyzing the values obtained concerning design, users generally felt comfortable using the GUI presented. According to the answers to the questions in this section, this comfort arises because each part of the main window, together with the options and windows available through the menus, provides the necessary content without saturating the user with unnecessary information. The most common observation regarding the prototype's design is the need to add more colored details to make the relevant elements of the dialogue more noticeable. With this, it is possible to consider that the DAIRS GUI design is easy and straightforward to use. These graphical advantages are intended to favor the flow of the dialogue and thus obtain a better recommendation.

Both groups were also satisfied with the prototype's functionality, as shown by the values in Table 3. Most users think that the system can adequately support the decision-making process for PPS problems and that it has the necessary tools to execute this task. For this analyzed section, the most significant difference between the groups focuses on the users' ability to select their solution. This result suggests that users prefer an assertive but learning-focused interface. Another observation from the users is again focused on graphical aspects, since both groups proposed the use of graphs to represent the criteria profit and budgets.

The results regarding ease of use imply that some users in the first group had difficulty adapting to the prototype's operation at the beginning of the test. In fact, concerning the questions related to this section, the most notable difference appears in the users' opinion regarding the system's simplicity. Although the interface is considered to be simple and accessible for most users, some users from the first group believe that starting a dialogue game can be complicated. From the obtained results, it can be concluded that the prototype, though easy to use, requires a certain degree of prior knowledge for proper usage. Users with more previous knowledge of the problem (STD2) quickly adapted to the use of DAIRS. However, the overall results are satisfactory for both groups. As a relevant observation, the users would like to have access to a user manual that explains each of the prototype components' operations in detail.

An interesting fact that should be mentioned is that, although some users reported issues adapting at the beginning of the test, the results concerning learning, in general, show that the conveniences and information offered by the system allow them to quickly learn how to use it properly.

Most users in both groups conclude that the prototype offers easy and simple ways to learn how to use the system, although the learning curve can be steep at the beginning. Similarly, users generally feel that DAIRS adequately supports them in obtaining information and learning quickly and effectively about the PPS problem. Based on the learning section results, it is possible to believe that DAIRS offers the user an advantage in solving a problem, as it offers an effective problem-learning methodology supported by argumentation theory.

The two sections of questions in which there was a statistically significant difference between the two groups were "satisfaction" and "results and future use". Although they mentioned the difficulty of adapting at the beginning of using the prototype, the users in the first group felt largely satisfied and comfortable using the system by the end of the test. On the other hand, although the users from the second group were satisfied with the prototype, they consider it advisable to reduce the system's dialogue game duration. Although users are satisfied with the system's functionality and interface, they feel that the time needed for the user and the system to reach an agreement on a portfolio recommendation could be improved.

Based on the information previously presented, it is possible to say that the definition of bidirectional interaction between the user and DAIRS is effective since users feel generally satisfied with the recommendation obtained to solve a problem by using an interactive recommender system supported by MCDA methods and argumentation theory, using argumentation schemes, proof standards, and dialogue games.

Also, the results obtained in the "satisfaction" and "results and future use" sections suggest that the learning-oriented approach, given by STD1, offers higher user satisfaction in the results obtained compared to a recommendation-oriented approach, as presented by STD2. This assumption agrees with the conclusions drawn from the results in the other analyzed sections, where users using STD2 felt more comfortable using DAIRS when carrying out a dialogue following this diagram but would have preferred that the system focus more on continuing the learning process of the problem.

In general, the first group gave a better overall evaluation, with the ease of use question subset being the only exception. However, the difference is significant only in the satisfaction and the results and future use sections.

All analyzed sections obtained an average value higher than 8, and, except for the analysis for the ease of use on the first group, these values were never lower than 8.5. Based on this, it is possible to consider that the prototype had a satisfactory degree of acceptance by both groups and that the future implementation of all the presented observations could further improve its quality.

The system received an average score of 89.91%. Therefore, it is possible to conclude that this evaluation is satisfactory enough to consider DAIRS as a promising alternative. However, the results and observations from the users evidence the necessity to introduce visual resources; although the plain text could be enough for some people, others prefer a representation using images and graphs.

Most state-of-the-art works presented in Section 2 propose MCDSS frameworks to solve optimization problems and establish an interaction between the user and the system. However, this interaction only allows users to incorporate new information, and the system does not establish a deep interaction with the DM that goes beyond receiving such information and generating new recommendations. Experimentation with DAIRS shows that it is possible to generate an MCDSS to solve PPS problems capable of establishing a bidirectional interaction. In this interaction, both participants generate and obtain new information. The system's defense of the recommendation and the user's statements use argumentation theory in a dialogue game supported by argumentation schemes and proof standards.

#### **6. Conclusions and Future Work**

This work studied the characterization of cognitive tasks involved in the decision-aiding process. The cognitive tasks involved in the process were defined, identifying those that could generate an interaction between the user and the system. This paper addressed two cognitive tasks to create a final recommendation: the evaluation model for the alternatives and the argument construction. For the first cognitive task, the proposed recommender system used proof standards to define a method to evaluate and select the best fitting alternative. For the second cognitive task, the system used argumentation schemes and a dialogue game to support the preferred alternative and establish a possible user-system interaction.

One of this work's main contributions is the development of the Decision Aid Interactive Recommender System (DAIRS), an MCDSS framework focused on solving PPS multi-objective problems. The framework is based on the characterization of cognitive tasks through argumentation schemes, dialogue game rules, state transition diagrams, and proof standards. These elements are incorporated into DAIRS to allow the recommender system to perform a bidirectional interaction between it and a user. This work proposed and developed a DAIRS experimental prototype that provides an environment to aid the decision-making to validate the proposed system.

Another contribution is the proposal and design of two state transition diagrams (STDs) to determine the flow of a dialogue game between the DM and DAIRS. These STDs allow two-way interaction between both participants, meaning that both can obtain and provide information. Also, the proposed STDs have two relevant components. First, the user can reject the proposal; this defines a new dialogue stopping criterion aiming towards the user's satisfaction. Second, the system is able to defend its arguments and reject the user's statements if there is not enough information to support them. The first STD, STD1, focuses on a more learning-oriented dialogue. Meanwhile, the second STD, named STD2, assumes that the user has an acceptable degree of knowledge about the problem to solve and focuses on providing recommendations to the user and engaging in a dialogue game focused on said recommendations.

Some of the most relevant features of the DAIRS prototype are the design and implementation of several concepts related to argumentation theory within an MCDSS. The first of these concepts is a set of proof standards based on several known MCDA methods. Also, DAIRS incorporates multiple argumentation schemes from the literature into its process, supported by proof standards. Another relevant feature in DAIRS is the use of these elements by employing a dialogue game that uses one of the STDs proposed in this paper to direct the flow of the user-system interaction. DAIRS considers the three standard stopping criteria for a DSS interaction: user acceptance, manual stop, and algorithmic stop. The user can accept or reject the final decision; this covers user acceptance and manual stop. For the last stopping criterion, DAIRS implements a method to avoid loops in a dialogue game using multiple argumentation schemes.

Considering the strategies used by several state-of-the-art works, DAIRS uses proof standards based on a criteria hierarchy and a criteria weight vector. These considerations positively impacted users' overall satisfaction when using the proposed prototype since it considers their preferences using methods focused on qualitative (hierarchy) and quantitative (vector of weights) strategies, which allowed for a more flexible dialogue.

A usability evaluation was used to measure the quality of the developed DAIRS based on the experience of multiple test users after using it to solve a PPS problem that simulated a real-life situation. This evaluation studied the user experience regarding DAIRS by considering human factors that affect the acceptance or rejection of a recommendation. The results obtained were satisfactory, as the system received an average approval of 89.91% and an overall acceptance in several critical elements such as design, functionality, ease of use, learning capability, satisfaction, and future use. Users were satisfied using the proposed GUI due to its simple design, ease of learning and use, interaction, and capability to obtain problem information.

On the other hand, the results for users using STD1 were often better than those for users using STD2. However, in both cases, the conclusions were primarily positive. These observations suggest that users are looking for an interactive system that assertively establishes recommendations, but with a focus on learning about the problem, so that both the user and the system gain new knowledge to find a better solution.

The results show that the design of a bidirectional interactive recommender system allows users to successfully and effectively select a suitable recommendation for PPS problems. DAIRS presents a novel approach to the generation of recommendations for this type of problem not previously explored in the literature, to the authors' knowledge.

Considering the research area related to this work and all the observations and comments provided by the test users, multiple areas offer potential future work. The first is the use of the proposed system on real-life problems other than the PPS problem. The second is adding new elements that make the recommender system capable of receiving new user-made portfolios during the dialogue game. Currently, the system uses only one STD per dialogue. Therefore, future work could focus on using more than one STD per dialogue game, looking to improve the dialogue game's quality. Also, there exists a wide variety of MCDA methods in the state of the art, opening the possibility of using different methods as proof standards. Finally, future versions of the prototype could provide the user with a friendlier-looking GUI, featuring graphs and a more colorful environment.

**Author Contributions:** Conceptualization: T.M.-E., L.C.-R. and C.M.-T.; methodology: T.M.-E., C.M.- T. and N.R.-V.; software: T.M.-E. and N.R.-V.; validation: T.M.-E., L.C.-R. and C.G.-S.; formal analysis: T.M.-E., C.M.-T. and H.F.-H.; investigation: T.M.-E., L.C.-R., C.M.-T. and C.G.-S.; resources: T.M.-E., L.C.-R. and H.F.-H.; data curation: T.M.-E.; writing—original draft preparation: T.M.-E. and C.M.-T.; writing—review and editing: T.M.-E., L.C.-R., C.M.-T. and C.G.-S.; visualization: L.C.-R.; supervision: L.C.-R.; project administration: L.C.-R.; funding acquisition: T.M.-E., L.C.-R., C.M.-T. and H.F.-H. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** Ethical review and approval were waived for this study due to a completely voluntary and conscientious acceptance by each of the users to participate in the tests carried out to perform the usability evaluation. All users only interacted with the system during the test periods established in advance.

**Informed Consent Statement:** Informed consent was obtained from all subjects involved in the study.

**Data Availability Statement:** A DAIRS prototype is available at: https://www.dropbox.com/sh/j1yfblg011a7m0w/AABYrgbKBDWEFBg\_vRcGE6dja?dl=0 (accessed on 20 April 2021).

**Acknowledgments:** The authors thank CONACYT for supporting the projects from (a) Cátedras CONACYT Program with Number 3058; (b) Project CONACyT A1-S-11012 from Convocatoria de Investigación Científica Básica 2017–2018 and CONACYT Project with Number 312397 from Programa de Apoyo para Actividades Científicas, Tecnológicas y de Innovación (PAACTI), a efecto de participar en la Convocatoria 2020-1 Apoyo para Proyectos de Investigación Científica, Desarrollo Tecnológico e Innovación en Salud ante la Contingencia por COVID-19. (c) T. Macias-Escobar would like to acknowledge CONACYT, Mexico National Grant System, Grant 465554.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Abbreviations**

The following abbreviations are used in this manuscript:


#### **Appendix A. PPS Problem Test Case**

This appendix presents the information in the PPS problem file required for the *load instance module*. This information is also used to perform step 3 of the experimental design.


**Table A1.** Criteria-profit matrix.


List of available project portfolios and required budget (0 indicates that a project has not been included by the portfolio, 1 indicates that a project has been included):


#### **References**


## *Article* **Modeling and Optimizing the Multi-Objective Portfolio Optimization Problem with Trapezoidal Fuzzy Parameters**

**Alejandro Estrada-Padilla, Daniela Lopez-Garcia, Claudia Gómez-Santillán, Héctor Joaquín Fraire-Huacuja, Laura Cruz-Reyes, Nelson Rangel-Valdez \* and María Lucila Morales-Rodríguez**

> Graduate Program Division, Tecnológico Nacional de México/Instituto Tecnológico de Ciudad Madero, Ciudad Madero 89440, Mexico; aestrada1993@hotmail.com (A.E.-P.); dann.loga@gmail.com (D.L.-G.); claudia.gs@cdmadero.tecnm.mx (C.G.-S.); hector.fh@cdmadero.tecnm.mx (H.J.F.-H.); laura.cr@cdmadero.tecnm.mx (L.C.-R.); lucila.mr@cdmadero.tecnm.mx (M.L.M.-R.) **\*** Correspondence: nelson.rv@cdmadero.tecnm.mx

**Abstract:** A common issue in the Multi-Objective Portfolio Optimization Problem (MOPOP) is the presence of uncertainty that affects individual decisions, e.g., variations on resources or benefits of projects. Fuzzy numbers are successful in dealing with imprecise numerical quantities, and they have found numerous applications in optimization. However, so far, they have not been used to tackle uncertainty in MOPOP. Hence, this work proposes to tackle MOPOP's uncertainty with a new optimization model based on fuzzy trapezoidal parameters. Additionally, it proposes three novel steady-state algorithms as the model's solution process. One approach integrates the Fuzzy Adaptive Multi-objective Evolutionary (FAME) methodology; the other two apply the Non-Dominated Sorting Genetic Algorithm (NSGA-II) methodology. One steady-state algorithm uses the Spatial Spread Deviation as a density estimator to improve the Pareto fronts' distribution. This research work's final contribution is developing a new defuzzification mapping that allows measuring algorithms' performance using widely known metrics. The results show a significant difference in performance favoring the proposed steady-state algorithm based on the FAME methodology.

**Keywords:** multi-objective optimization; multi-objective portfolio optimization problem; trapezoidal fuzzy numbers; density estimators; steady state algorithms

#### **1. Introduction**

The Portfolio Optimization Problem (POP) is always present in organizations. One key issue in POP's decision process is the uncertainty caused by the variability in the project benefits and resources. This situation creates the need for a tool for describing and representing uncertainty associated with real-life decision-making situations. The POP searches for a subset of projects that maximizes the produced benefits under a predefined set of resources; its formal definition is as follows.

Let *A* be a finite set of *N* projects, each characterized by estimates of its impacts and resource consumption. A portfolio is a subset of *A* that can be represented by a binary vector $x = (x_1, x_2, \ldots, x_N)$ that assigns $x_i = 1$ for every financed project *i*, and $x_i = 0$ otherwise. Let $\vec{z}(x) = (z_1(x), z_2(x), \ldots, z_p(x))$ be the vector of impacts resulting from the linear sum of the attribute values of each financed project in *x*, i.e., the vector of size *p* representing multiple attributes related to organizational goals that describe the consequences of a portfolio *x*. Assume w.l.o.g. that the higher an attribute's value is, the better. Then, Problem (1) formally defines POP.

$$\text{Maximize } \{z\_1(x), z\_2(x), \dots, z\_p(x)\}, \quad x \in R\_F \tag{1}$$

In Problem (1), *RF* is the space of feasible portfolios, usually determined by the available budget and other constraints that the *Decision Maker* (DM) wants to impose (e.g., budget limits on types, geographic areas, social roles of projects, etc.).

**Citation:** Estrada-Padilla, A.; Lopez-Garcia, D.; Gómez-Santillán, C.; Fraire-Huacuja, H.J.; Cruz-Reyes, L.; Rangel-Valdez, N.; Morales-Rodríguez, M.L. Modeling and Optimizing the Multi-Objective Portfolio Optimization Problem with Trapezoidal Fuzzy Parameters. *Math. Comput. Appl.* **2021**, *26*, 36. https://doi.org/10.3390/mca26020036

Academic Editor: Leonardo Trujillo

Received: 28 February 2021 Accepted: 22 April 2021 Published: 24 April 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Different scientific research works address POP's variant in Problem (1), considering precise values on the available resources and the projects' impacts [1–6]. Moreover, there is an area called Portfolio Decision Analysis (PDA) dedicated to studying mathematical models to solve POP. There are theories, methods, and practices developed within this area to help decision-makers select projects from a very large set of them, taking into account relevant constraints, preferences, uncertainty, or imprecision [7]. PDA-related problems' difficulty comes from a combination of factors such as (1) large entry space; (2) consequences of multidimensionality in portfolio construction and selection; or (3) qualitative, imprecise or uncertain information.

A large entry space requires a solution process with exponential complexity for decision-making problems, even with simple decisions on allocating resources for candidate projects.

The consequences of multidimensionality in portfolio construction and selection relate conflicting attributes with difficulty in the decision process. Usually, the larger the number of dimensions, the more complex the solution space is. The latter causes a situation with so many solutions that it easily exceeds the human cognitive capabilities for evaluating and selecting the best candidate solutions [8].

The qualitative, imprecise, or uncertain information exists because of the varying nature of the distinct attributes and resources considered in the construction of portfolios. Such information can arise in different circumstances, for example when a DM needs to use non-numerical data instead of a quantitative measure to describe the effects of a project. In other cases, there is a lack of knowledge about future states of specific criteria, the provided information is vague, the values used to describe attributes or resources are not accurately known beforehand, or there are vague approximations and areas of ignorance. All the previous situations, denoted hereafter as *uncertainty*, limit the scientific approach in Operational Research-Decision Aiding [9], and modeling them using probability distributions can be a challenge [9].

Several optimization problems use fuzzy numbers to model the uncertainty in parameters' values from arbitrariness, imprecision, and poor determination [10]. Among the most recent works related to the Multi-objective Portfolio Optimization Problem are the following: García [11] solves the Multi-objective and Static Portfolio Optimization Problem (MOSPOP) with real parameters using the generational algorithms HHGA-SPPv1 and HHGA-SPPv2 and considering the preferences of a DM. Rivera-Zarate [12] uses the Non-Outranked Ant Colony Optimization (NO-ACO) to address a variant of MOSPOP that includes interdependency among objectives and that has partial support with real parameters. Bastiani [13] solves the MOSPOP variant that includes synergy using ACO-SPRI, ACO-SOP, and ACO-SOP, three strategies based on the ACO that incorporate in their search process priority ranking, preferences, and synergy, respectively. Sánchez [14] proposes using classification methods on the generational algorithms H-MCSGA and I-MCSGA to approximate the Region of Interest (ROI) in MOSPOP. The first algorithm adds the preferences at the beginning of the process, while the second algorithm adds them during the process (while interacting with the DM). Balderas addresses the MOSPOP with uncertainty using intervals and proposes the generational algorithm I-NOSGA, which is based on NSGA-II but incorporates interval numbers. I-NOSGA includes preferences "a priori" and uses Crowding Distance as its density estimator. Martínez [15] addresses the Dynamic Multi-objective Portfolio Optimization Problem (DMOPOP) with real parameters; the proposed approach introduces dynamism by changing the problem definition at the end of each period. Martínez presents three new multi-objective algorithms that also incorporate "a priori" preferences: the generational D-NSGA-II-FF, a new version of a classic non-dominance genetic algorithm; the D-AbYSS-FF, a modified version of scatter search; and the D-MOEA/D-FF, a new variant of a state-of-the-art algorithm based on decomposition.

Table 1 summarizes the main features of the previously described works. Column 1 cites the research work and the studied POP variant. Columns 2 to 7 show the considered features in the research works: the solution algorithms it proposed, the type of instances it solved, the performance metrics it used, if it integrated preferences in the search process, if it considered a static or dynamic POP's version, the type of parameters it used, and if it used a steady-state selection scheme or not.

**Table 1.** Related works.

It is worth noting that, from the information in Table 1, only approaches based on intervals address POP's variant with uncertainty, and none of them utilized a steady-state selection scheme. The Fuzzy Adaptive Multi-objective Evolutionary solution methodology (FAME) has had great success in many optimization problems; however, there is a lack of studies about its performance on the POP. The previous situations open an area of opportunity, addressed in this work, consisting of studying the performance of optimization approaches that use fuzzy numbers and steady-state selection schemes in their search process to solve the Multi-objective POP with uncertainty (MOPOP).

Evolutionary algorithms commonly use a generational selection scheme to update each generation's population; the process creates several offspring through genetic operators and combines them with the parents to form the next generation of individuals [10,14,15]. On the other hand, an algorithm using a steady-state selection scheme produces a single offspring during the reproduction process to combine with the parents. The latter method updates the population more efficiently, which is advantageous [16]. Hence, this work proposes a new method based on FAME and fuzzy numbers to handle uncertainty and obtain more robust solutions in MOPOP; the approach mainly uses fuzzy trapezoidal sets to reflect a magnitude's imprecision.

This work's main contributions are: (1) a new mathematical model for MOPOP that considers fuzzy trapezoidal parameters; (2) a new algorithm based on FAME to solve the proposed model; (3) two novel steady-state NSGA-II algorithms to solve this MOPOP's variant; and (4) a novel strategy to measure the performance of the fuzzy multi-objective algorithms with the commonly used real metrics.

The remaining structure of this paper is as follows. Section 2 includes some elements of the fuzzy theory used in this work. Section 3 describes a new mathematical model of the Portfolio Optimization Problem with Trapezoidal Fuzzy Parameters. Sections 4 and 5 contain the proposed steady-state algorithms: T-NSGA-II and T-FAME, respectively. Section 6 describes the computational experiments done to assess the performance of the algorithms. Finally, Section 7 presents the conclusions.

#### **2. Elements of Fuzzy Theory**

This section contains the main concepts of fuzzy theory used in this work.

#### *2.1. Fuzzy Sets*

Let *X* be a collection of objects *x*; then a fuzzy set *A* defined over *X* is a set of ordered pairs *A* = {(*x*, μ<sub>*A*</sub>(*x*)) | *x* ∈ *X*}, where μ<sub>*A*</sub>(*x*) is called the membership function or grade of membership of *x* in *A*, which maps *X* to the real membership subspace *M* [17]. The range of the membership function is a subset of the nonnegative real numbers whose supremum is finite. Elements with a zero degree of membership usually are not listed.

#### *2.2. Generalized Fuzzy Numbers*

A generalized fuzzy number A is any fuzzy subset of the real line *R*, whose membership function μA(x) satisfies the following conditions [18]:

1. μA(x) is a continuous function from *R* to the closed interval [0, 1]


where 0 < *w* ≤ 1, and *a*, *b*, *α*, *β* are real numbers.

We denote this type of generalized fuzzy number as *A* = (*a*, *b*, *α*, *β*, *w*)<sub>LR</sub>. When *w* = 1, the generalized fuzzy number is denoted as *A* = (*a*, *b*, *α*, *β*)<sub>LR</sub>. When *L*(*x*) and *R*(*x*) are straight lines, *A* is a trapezoidal fuzzy number, denoted as *A* = (*a*, *b*, *α*, *β*). When *b* = *α*, *A* is a triangular fuzzy number, denoted as *A* = (*a*, *b*, *β*).

The triangular membership function is defined as:

$$\mu\_{A}(x) = \begin{cases} 0, & x < a \\ \frac{x - a}{b - a}, & x \in (a, b) \\ \frac{\beta - x}{\beta - b}, & x \in (b, \beta) \\ 0, & x > \beta \end{cases} \tag{2}$$

The trapezoidal membership function is defined as:

$$\mu\_{A}(x) = \begin{cases} 0, & x < a \\ \frac{x - a}{b - a}, & x \in (a, b) \\ 1, & x \in (b, \alpha) \\ \frac{\beta - x}{\beta - \alpha}, & x \in (\alpha, \beta) \\ 0, & x > \beta \end{cases} \tag{3}$$

#### *2.3. Trapezoidal Addition Operator*

Given two trapezoidal numbers A1 = (a1, b1,α1,β1) and A2 = (a2, b2,α2,β2), then [19]:

$$A\_1 + A\_2 = (a\_1 + a\_2, b\_1 + b\_2, \alpha\_1 + \alpha\_2, \beta\_1 + \beta\_2) \tag{4}$$
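As an illustration of Equation (4), the following minimal Python sketch adds two trapezoidal fuzzy numbers component by component. The `Trapezoid` type and the `add` function are hypothetical names introduced here for illustration only; they are not part of the original work.

```python
from typing import NamedTuple


class Trapezoid(NamedTuple):
    """Trapezoidal fuzzy number (a, b, alpha, beta), in the notation of Section 2.2.

    Hypothetical helper type used only for this illustration.
    """
    a: float
    b: float
    alpha: float
    beta: float


def add(x: Trapezoid, y: Trapezoid) -> Trapezoid:
    """Component-wise sum of two trapezoidal fuzzy numbers, following Equation (4)."""
    return Trapezoid(x.a + y.a, x.b + y.b, x.alpha + y.alpha, x.beta + y.beta)


# Example: adding two fuzzy costs.
total = add(Trapezoid(2, 8, 0.5, 0.8), Trapezoid(10, 13, 0.2, 0.5))
print(total)
```

Because the sum is again a trapezoidal fuzzy number, the same function can be applied repeatedly to accumulate the benefits or costs of a whole portfolio.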

#### *2.4. Graded Mean Integration (GMI)*

Graded mean integration [19] is a defuzzification method to compare two generalized fuzzy numbers. We compare the numbers based on their defuzzified values. The number with a higher defuzzified value is larger. The formula to calculate the graded mean integration of a trapezoidal number A is given by:

$$P(A) = (\int\_0^w h(\frac{L^{-1}(h) + R^{-1}(h)}{2}) dh) / \int\_0^w h dh\tag{5}$$

For a trapezoidal fuzzy number *A* = (*a*, *b*, *α*, *β*), there is a more straightforward expression which is *P*(*A*) = (3*a* + 3*b* + *β* − *α*)/6.
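The closed-form expression can be checked numerically against Equation (5). The sketch below is illustrative only: it assumes *w* = 1 and linear sides with inverse functions L⁻¹(h) = a − α + αh and R⁻¹(h) = b + β − βh (an assumption made here, not stated in the text), and the function names are hypothetical.

```python
def gmi_closed_form(a, b, alpha, beta):
    """Graded mean integration of a trapezoidal fuzzy number: (3a + 3b + beta - alpha) / 6."""
    return (3 * a + 3 * b + beta - alpha) / 6


def gmi_numeric(a, b, alpha, beta, steps=10_000):
    """Approximate Equation (5) with w = 1 using the midpoint rule.

    ASSUMPTION (not from the paper): the sides are linear, with
    L^-1(h) = a - alpha + alpha*h and R^-1(h) = b + beta - beta*h.
    """
    dh = 1.0 / steps
    numerator = 0.0
    denominator = 0.0
    for k in range(steps):
        h = (k + 0.5) * dh                      # midpoint of the k-th subinterval
        left = a - alpha + alpha * h            # assumed L^-1(h)
        right = b + beta - beta * h             # assumed R^-1(h)
        numerator += h * (left + right) / 2 * dh
        denominator += h * dh
    return numerator / denominator


print(gmi_closed_form(10, 13, 0.2, 0.5))
print(gmi_numeric(10, 13, 0.2, 0.5))
```

Under these assumptions, both functions should agree up to the discretization error of the midpoint rule.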

#### *2.5. Order Relation in the Set of the Trapezoidal Fuzzy Numbers*

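Given the trapezoidal fuzzy numbers *A*<sub>1</sub> and *A*<sub>2</sub>, then, comparing them by their defuzzified values as described in Section 2.4:

1. *A*<sub>1</sub> < *A*<sub>2</sub> if and only if *P*(*A*<sub>1</sub>) < *P*(*A*<sub>2</sub>)
2. *A*<sub>1</sub> > *A*<sub>2</sub> if and only if *P*(*A*<sub>1</sub>) > *P*(*A*<sub>2</sub>)
3. *A*<sub>1</sub> = *A*<sub>2</sub> if and only if *P*(*A*<sub>1</sub>) = *P*(*A*<sub>2</sub>)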


#### *2.6. Pareto Dominance*

Given the fuzzy vectors *x̂* = (*x*<sub>1</sub>, *x*<sub>2</sub>, ..., *x*<sub>*n*</sub>) and *ŷ* = (*y*<sub>1</sub>, *y*<sub>2</sub>, ..., *y*<sub>*n*</sub>), where *x*<sub>*i*</sub> and *y*<sub>*i*</sub> are trapezoidal fuzzy numbers, we say that *x̂* dominates *ŷ* if and only if *x*<sub>*i*</sub> ≥ *y*<sub>*i*</sub> for all *i* = 1, 2, ..., *n* and *x*<sub>*i*</sub> > *y*<sub>*i*</sub> for some *i* = 1, 2, ..., *n* [20].
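The dominance test can be sketched in Python by comparing the components through their graded mean integration, as Section 2.4 describes. The function names and the tuple representation (*a*, *b*, *α*, *β*) are assumptions made for this illustration.

```python
def gmi(t):
    """Graded mean integration of a trapezoidal fuzzy number given as (a, b, alpha, beta)."""
    a, b, alpha, beta = t
    return (3 * a + 3 * b + beta - alpha) / 6


def dominates(x, y):
    """Return True if fuzzy vector x dominates fuzzy vector y (maximization).

    Components are compared by their defuzzified (GMI) values:
    x must be at least as good everywhere and strictly better somewhere.
    """
    px = [gmi(t) for t in x]
    py = [gmi(t) for t in y]
    no_worse = all(u >= v for u, v in zip(px, py))
    strictly_better = any(u > v for u, v in zip(px, py))
    return no_worse and strictly_better


x = [(3, 6, 1, 1), (5, 13, 0.7, 0.5)]
y = [(1, 5, 0.8, 0.8), (5, 13, 0.7, 0.5)]
print(dominates(x, y), dominates(y, x))
```

Note that a vector never dominates itself, because the strict inequality fails for every component.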

#### **3. Multi-Objective Portfolio Optimization Problem with Trapezoidal Fuzzy Parameters**

This section presents the proposed mathematical model for MOPOP with Fuzzy Trapezoidal Parameters. It offers a detailed description of the construction of the fuzzy trapezoidal instances used in this work to assess the proposed solution algorithms' performance. It also describes how the fuzzy trapezoidal parameters' values participate in evaluating the objective functions and the candidate solutions' feasibility when the solution algorithms search across the solution space.

#### *3.1. Mathematical Model*

Let *n* be the number of projects to consider, *C* the total available budget, *O* the number of objectives, *c*<sub>*i*</sub> the cost of project *i*, *b*<sub>*ij*</sub> the benefit produced in objective *j* by the execution of project *i*, *K* the number of areas to consider, *M* the number of regions, *A*<sub>*k*</sub><sup>min</sup> and *A*<sub>*k*</sub><sup>max</sup> the lower and upper limits of the available budget for area *k*, and *R*<sub>*m*</sub><sup>min</sup> and *R*<sub>*m*</sub><sup>max</sup> the lower and upper limits of the available budget for region *m*. The arrays *a*<sub>*i*</sub> and *b*<sub>*i*</sub> contain the area and region assigned to project *i*. *x̂* = (*x*<sub>1</sub>, *x*<sub>2</sub>, ..., *x*<sub>*n*</sub>) is a binary vector that specifies the projects included in the portfolio: if *x*<sub>*i*</sub> = 1, project *i* is selected; otherwise it is not. We now define the MOPOP with Fuzzy Trapezoidal parameters as follows:

$$\text{Maximize } \hat{z} = (\mathbf{z}\_1, \mathbf{z}\_2, \dots, \mathbf{z}\_O) \tag{6}$$

where

$$\mathbf{z}\_j = \sum\_{i=1}^{n} \mathbf{b}\_{ij} x\_i, \quad j = 1, 2, \dots, O \tag{7}$$

Subject to the following constraints:

$$\sum\_{i=1}^{n} \mathbf{c}\_{i} \mathbf{x}\_{i} \le \mathbf{C} \tag{8}$$

$$A\_k^{\min} \le \sum\_{i=1, a\_i = k}^n \mathbf{c}\_i \mathbf{x}\_i \le A\_k^{\max} \quad k = 1, 2, \dots, K \tag{9}$$

$$\mathbf{R}\_k^{\min} \le \sum\_{i=1, b\_i=k}^{n} \mathbf{c}\_i \mathbf{x}\_i \le \mathbf{R}\_k^{\max}, \quad k = 1, 2, \dots, M \tag{10}$$

$$x\_i \in \{0, 1\} \text{ for all } i = 1, 2, \dots, n \tag{11}$$

In this model, all the parameters and variables in *bold* and *italic* are trapezoidal fuzzy numbers.

The objective function tries to maximize the contributions of each objective (6). We calculate each objective by adding all the selected projects' contributions in the binary vector (7). The constraint (8) makes sure that the sum of the costs required for all the selected projects does not exceed the available budget. The set of constraints (9) makes sure that the sum of the projects' costs is in the range of the involved areas' available budget. The set of constraints (10) makes sure that the sum of the projects' costs is in the range of the available budgets for the corresponding regions. The final set of constraints (11) makes sure that the binary variables *xi* can only have values of 0 or 1.

We should note that the problem is defined over the space of binary vectors, whose size is 2<sup>*n*</sup>. The solution algorithms must therefore search across this space to find the Pareto optimal solutions. On the other hand, given that the well-known NP-hard Knapsack problem can be easily reduced to MOPOP, the latter is also NP-hard [21].

#### *3.2. Strategy to Generate the Fuzzy Trapezoidal Instances*

This work uses instances initially designed for the POP with interval parameters, where the parameters of the problem are represented by fuzzy interval type numbers (for example, the interval [76800, 83200]) [10]. Fixing the values of *α* and *β* to 0.5 and appending them to every interval in the original POP's instances produced the MOPOP's instances with Trapezoidal Fuzzy Parameters. In this way, an interval value such as [76800, 83200] becomes [76800, 83200, 0.5, 0.5] in the new set of instances.

To create a random fuzzy interval type instance, the following real parameters are considered: budget (*B*), number of objectives (*m*), projects (*p*), areas (*a*), and regions (*r*), and the ranges of costs (*c*<sub>1</sub>, *c*<sub>2</sub>) and objectives (*m*<sub>1</sub>, *m*<sub>2</sub>). Then, to generate a fuzzy interval instance, the following interval type values must be determined:

[*B*, *B′*] ← Budget as interval

[*a*<sub>*i*</sub>, *a′*<sub>*i*</sub>] ← Limits of each area *i* = 1, 2, ..., *a*

[*r*<sub>*i*</sub>, *r′*<sub>*i*</sub>] ← Limits of each region *i* = 1, 2, ..., *r*

[*b*<sub>*ij*</sub>, *b′*<sub>*ij*</sub>] ← Benefit in objective *j* = 1, 2, ..., *m* for each project *i* = 1, 2, ..., *p*

{*C*<sub>*i*</sub>, *A*<sub>*i*</sub>, *R*<sub>*i*</sub>} ← Real values of the cost, area and region for each project *i* = 1, 2, ..., *p*

The MOPOP instance generator combines the previous parameters with Equations (12)–(24) to create random instances [10].

$$B = 0.58B, \quad B' = 1.3B \tag{12}$$

$$a\_l = (0.7 \cdot B)/(1.7a + 0.1a^2), \quad a'\_l = (1.27 \cdot B)/(1.7a + 0.1a^2) \tag{13}$$

$$a\_u = ((1.02 + 0.06r) \cdot B)/r, \quad a'\_u = ((2.635 + 0.155a) \cdot B)/a \tag{14}$$

$$a\_i = a\_l + \text{Random}(a'\_l - a\_l) \text{ for } i = 1, 2, \dots, a \tag{15}$$

$$a'\_i = a\_u + \text{Random}(a'\_u - a\_l) \text{ for } i = 1, 2, \dots, a \tag{16}$$

$$r\_l = (0.7 \cdot B)/(1.7a + 0.1a^2), \quad r'\_l = (1.27 \cdot B)/(1.7a + 0.1a^2) \tag{17}$$

$$r\_u = ((1.02 + 0.06r) \cdot B)/r, \quad r'\_u = ((2.635 + 0.155a) \cdot B)/a \tag{18}$$

$$r\_i = r\_l + \text{Random}(r'\_l - r\_l) \text{ for } i = 1, 2, \dots, r \tag{19}$$

$$r'\_i = r\_u + \text{Random}(r'\_u - r\_l) \text{ for } i = 1, 2, \dots, r \tag{20}$$

$$A\_i = \text{Random}(a), \quad i = 1, 2, \dots, p \tag{21}$$

$$R\_i = \text{Random}(r), \quad i = 1, 2, \dots, p \tag{22}$$

$$o = m\_1 + \text{Random}(m\_2 - m\_1), \quad b\_{ij} = 0.8 \cdot o \tag{23}$$

$$b'\_{ij} = 1.1 \cdot o \text{ for } i = 1, 2, \dots, p \text{ and } j = 1, 2, \dots, m \tag{24}$$

The interval instances built with the instance generator are named using the format ompn\_idI, where *m* is the number of objectives the instance has, *n* is the number of projects, *id* is a consecutive number, and I indicates that the instance is of interval type. For example, o2p100\_1I is instance number 1 with 2 objectives and 100 projects.

Algorithm 1 details the structure of a fuzzy interval type instance.


To transform a given fuzzy interval type instance into a fuzzy trapezoidal instance, every interval value [*a*, *b*] is changed to the fuzzy trapezoidal value [*a*, *b*, *α*, *β*] with *α* = 0.5 and *β* = 0.5. Algorithm 2 shows the result of converting the fuzzy interval type instance o2p25\_0I to the fuzzy trapezoidal instance o2p25\_0T.

**Algorithm 2.** o2p25\_0T fuzzy trapezoidal instance

// Fuzzy trapezoidal value of the total available budget

[76800, 83200, 0.5, 0.5]

// Number of objectives

2

// Number of areas

3

// Fuzzy trapezoidal values of the upper and lower bounds for the available budget in each area, a row for each area.

[13060, 16560, 0.5, 0.5] [46245, 49745, 0.5, 0.5]

[13810, 15810, 0.5, 0.5] [47895, 48095, 0.5, 0.5]

[13210, 16410, 0.5, 0.5] [46545, 49445, 0.5, 0.5]

// Number of regions.

2

// Fuzzy trapezoidal values of the upper and lower bounds for the available budget in each region, a row for each region.

[22775, 24275, 0.5, 0.5] [67950, 68050, 0.5, 0.5]

[23325, 23725, 0.5, 0.5] [67900, 68100, 0.5, 0.5]

// Number of projects

25

// For each project, there is a row that includes the following: fuzzy trapezoidal value of the project cost, project area, project region, and the fuzzy trapezoidal values of the benefits obtained with each objective. (only 5 of the 25 projects are shown)

[9308, 10082, 0.5, 0.5] [1] [1] [7642, 8278, 0.5, 0.5] [231, 249, 0.5, 0.5]

[8290, 8980, 0.5, 0.5] [2] [1] [8506, 9214, 0.5, 0.5] [404, 436, 0.5, 0.5]

[5895, 6385, 0.5, 0.5] [3] [1] [3831, 4149, 0.5, 0.5] [111, 119, 0.5, 0.5]

[9053, 9807, 0.5, 0.5] [1] [2] [3908, 4232, 0.5, 0.5] [399, 431, 0.5, 0.5]

[6058, 6562, 0.5, 0.5] [1] [2] [5760, 6240, 0.5, 0.5] [418, 452, 0.5, 0.5]
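The interval-to-trapezoidal conversion described above is simple enough to sketch in a few lines of Python. The function name and the tuple representation are hypothetical and serve only as an illustration of the transformation.

```python
def interval_to_trapezoid(interval, alpha=0.5, beta=0.5):
    """Turn an interval (low, high) into a trapezoidal value (low, high, alpha, beta).

    The defaults alpha = beta = 0.5 are the fixed values used in this section.
    """
    low, high = interval
    return (low, high, alpha, beta)


# Budget and first area bounds of instance o2p25_0I, converted to o2p25_0T form.
budget = interval_to_trapezoid((76800, 83200))
area_bounds = [interval_to_trapezoid(iv) for iv in [(13060, 16560), (46245, 49745)]]
print(budget)
print(area_bounds)
```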

#### *3.3. Evaluating the Solutions and Verifying the Feasibility*

This section describes how to calculate the objective values of a solution and how to determine its feasibility. To explain this process, let *F* be the set of trapezoidal fuzzy numbers and *R* the set of real numbers. We apply the map *δ* : *F* → *R* such that *δ*(*A*) = *P*(*A*); that is, the map associates each trapezoidal fuzzy number with its GMI value. A remarkable property of this map is that if *X* ⊂ *F*<sup>*n*</sup>, then *δ*(*X*) ⊂ *R*<sup>*n*</sup>. Hence, for a MOPOP instance with two objectives, a solution is first evaluated as a vector of *two* trapezoidal fuzzy numbers, which in turn is transformed into a vector of *two* real numbers. As this process is applied consistently to all the solutions, the algorithms operate as if the objective space of the binary vectors were the real vector space. The transformation must also be applied to all the trapezoidal fuzzy numbers in the constraints to validate the solutions' feasibility during the search. Equations (25)–(30) show how to evaluate a solution and verify its feasibility.

$$\text{Maximize } \hat{z} = (z\_1, z\_2, \dots, z\_O) \tag{25}$$

where

$$z\_{j} = P\left(\sum\_{i=1}^{n} \mathbf{b}\_{ij} \mathbf{x}\_{i}\right), \quad j = 1, 2, \dots, O \tag{26}$$

Subject to the following constraints:

$$P(\sum\_{i=1}^{n} \mathbf{c}\_{i} \mathbf{x}\_{i}) \le P(\mathbf{C}) \tag{27}$$

$$P(A\_k^{\min}) \le P\left(\sum\_{i=1, a\_i=k}^n \mathbf{c}\_i \mathbf{x}\_i\right) \le P(A\_k^{\max}) \, k = 1, 2, \dots, K \tag{28}$$

$$P(\mathbf{R}\_k^{\min}) \le P\left(\sum\_{i=1, b\_i=k}^n \mathbf{c}\_i \mathbf{x}\_i\right) \le P(\mathbf{R}\_k^{\max}), \quad k = 1, 2, \dots, M \tag{29}$$

$$x\_i \in \{0, 1\} \text{ for all } i = 1, 2, \dots, n \tag{30}$$

An additional benefit is that this mapping transforms the approximated Pareto front into a set of real vectors. In such a case, standard commonly used metrics can be applied to evaluate the performance of the algorithms.

Example: Consider the following simplified instance:


Then using the model, the problem to solve is: Maximize:

$$z\_1 = [3, 6, 1, 1]x\_1 + [1, 5, 0.8, 0.8]x\_2 + [10, 15, 1, 0.5]x\_3 \tag{31}$$

$$z\_2 = [2, 10, 0.2, 0.4]x\_1 + [5, 13, 0.7, 0.5]x\_2 + [4, 9, 0.5, 0.8]x\_3 \tag{32}$$

Subject to:

$$[2, 8, 0.5, 0.8] \mathbf{x}\_1 + [10, 13, 0.2, 0.5] \mathbf{x}\_2 + [4, 12, 0.5, 0.5] \mathbf{x}\_3 \le [3, 20, 1, 5] \tag{33}$$

The objectives *z*<sub>1</sub> and *z*<sub>2</sub> are the benefits generated by the projects selected in the binary vector *x*. The constraint verifies that the total cost of the selected projects does not exceed the available budget (*C*).

Given the solution *x* = [0, 1, 0], the fuzzy trapezoidal values of the two objectives are the following:

$$z\_1 = [1, 5, 0.8, 0.8] \tag{34}$$

$$z\_2 = [5, 13, 0.7, 0.5] \tag{35}$$

Evaluating the constraint to verify the feasibility of the solution *x*, we have:

$$[10, 13, 0.2, 0.5] \le [3, 20, 1, 5] \tag{36}$$

Now the GMI is used to compare the fuzzy trapezoidal numbers. For a trapezoidal fuzzy number *A* = (*a*, *b*, *α*, *β*), the GMI is:

$$P(A) = (3a + 3b + \beta - \alpha)/6 \tag{37}$$

As *P*([10, 13, 0.2, 0.5]) = 11.55 ≤ *P*([3,20,1,5]) = 12.166, solution *x* is feasible.

Notice that this process was done in the fuzzy trapezoidal numbers space; only at the end the GMI is used to verify the constraint. To perform the process in the real space, the two fuzzy objectives and the fuzzy costs in the constraint are transformed into real numbers using the GMI. The evaluation of the solution is as follows:

$$z\_1 = P([3, 6, 1, 1] \mathbf{x}\_1 + [1, 5, 0.8, 0.8] \mathbf{x}\_2 + [10, 15, 1, 0.5] \mathbf{x}\_3) = P([1, 5, 0.8, 0.8]) \tag{38}$$

$$z\_2 = P([5, 13, 0.7, 0.5])\tag{39}$$

Then *z*<sub>1</sub> = 3 and *z*<sub>2</sub> = 8.966.

Transforming the constraint we have:

$$P([2, 8, 0.5, 0.8] \mathbf{x}\_1 + [10, 13, 0.2, 0.5] \mathbf{x}\_2 + [4, 12, 0.5, 0.5] \mathbf{x}\_3) \le P([3, 20, 1, 5])\tag{40}$$

$$P(\left[10, 13, 0.2, 0.5\right]) \le P(\left[3, 20, 1, 5\right])\tag{41}$$

Hence, the solution *x* is feasible given that 11.55 ≤ 12.166.

The algorithms proposed in this work use the evaluation and feasibility verification procedures described in this section. The algorithms must call such methods on every new solution generated by them.
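The evaluation and feasibility check of the worked example can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, and only the budget constraint (27) is checked (the area and region constraints (28) and (29) are omitted for brevity). The data are taken from Equations (31)–(33).

```python
def gmi(t):
    """Graded mean integration of a trapezoidal fuzzy number (a, b, alpha, beta)."""
    a, b, alpha, beta = t
    return (3 * a + 3 * b + beta - alpha) / 6


def fuzzy_sum(numbers):
    """Component-wise sum of trapezoidal numbers (Equation (4)); the empty sum is (0, 0, 0, 0)."""
    total = (0.0, 0.0, 0.0, 0.0)
    for t in numbers:
        total = tuple(u + v for u, v in zip(total, t))
    return total


def evaluate(x, benefits, costs, budget):
    """Return (real objective values, feasibility) of the binary vector x.

    benefits[j][i] is the fuzzy benefit of project i in objective j.
    Only the total-budget constraint is verified in this sketch.
    """
    selected = [i for i, bit in enumerate(x) if bit == 1]
    objectives = [gmi(fuzzy_sum(row[i] for i in selected)) for row in benefits]
    feasible = gmi(fuzzy_sum(costs[i] for i in selected)) <= gmi(budget)
    return objectives, feasible


benefits = [
    [(3, 6, 1, 1), (1, 5, 0.8, 0.8), (10, 15, 1, 0.5)],        # z1, Equation (31)
    [(2, 10, 0.2, 0.4), (5, 13, 0.7, 0.5), (4, 9, 0.5, 0.8)],  # z2, Equation (32)
]
costs = [(2, 8, 0.5, 0.8), (10, 13, 0.2, 0.5), (4, 12, 0.5, 0.5)]  # Equation (33)
budget = (3, 20, 1, 5)

objectives, feasible = evaluate([0, 1, 0], benefits, costs, budget)
print(objectives, feasible)
```

For *x* = [0, 1, 0], the sketch reproduces the values of the example: *z*<sub>1</sub> = 3, *z*<sub>2</sub> ≈ 8.966, and a feasible solution because 11.55 ≤ 12.166.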

#### **4. Steady-State T-NSGA-II Algorithm**

This section presents the design of all the components included in the definition of the proposed algorithm, an adaptation of Deb's classic NSGA-II algorithm [22] modified to work with trapezoidal fuzzy numbers. Like all the algorithms proposed in this work, T-NSGA-II updates the population with the steady-state approach, which in each generation includes in the population only one of the generated individuals. In generational algorithms, by contrast, the whole new set of offspring is combined with the parents to create the next generation of individuals. The input to the algorithm is a MOPOP instance, and the output is an approximate Pareto front for that instance.

#### *4.1. Representation of the Solutions*

A MOPOP's solution is represented by a binary vector *S* ∈ {0, 1}<sup>*n*</sup>, where *n* is the number of projects. This vector is a portfolio, and each value *s*<sub>*i*</sub> = 1 represents the inclusion of project *i* in the portfolio. The first element in the vector is *s*<sub>0</sub>, and the last is *s*<sub>*n*–1</sub>. Figure 1 shows an example of this representation.

**Figure 1.** Representation of a solution.

#### *4.2. One-Point Crossover Operator*

The one-point crossover operator generates two offspring from two parents [23]. The process first defines a random cutting point *cp* in the range [0, *n* – 1]. It then splits each parent vector into *left* and *right* sections, where, for parent *i*, *left*<sub>*i*</sub> contains the values {*s*<sub>0</sub>, ..., *s*<sub>*cp*</sub>} and *right*<sub>*i*</sub> contains the values {*s*<sub>*cp*+1</sub>, ..., *s*<sub>*n*–1</sub>}. Finally, it mixes the split sections to generate two new offspring *h*<sub>1</sub> and *h*<sub>2</sub>, where *h*<sub>1</sub> uses *left*<sub>1</sub> and *right*<sub>2</sub>, and *h*<sub>2</sub> uses *left*<sub>2</sub> and *right*<sub>1</sub>. The parents are chosen at random. The steady-state approach only utilizes the first offspring *h*<sub>1</sub>. The number of crossovers performed is a predefined parameter. Figure 2 shows an example of this operator.

**Figure 2.** Example of one-point crossover operator at index *cp* = 3.
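The operator can be sketched in Python as below. The function name and the optional `rng` argument are choices made for this illustration; the cut point is drawn uniformly from [0, *n* – 1], so the left section always holds indices 0 to *cp* inclusive.

```python
import random


def one_point_crossover(parent1, parent2, rng=random):
    """Return the two offspring (h1, h2) and the cut point cp.

    h1 takes positions 0..cp from parent1 and the rest from parent2;
    h2 is the mirror image. In the steady-state scheme only h1 is kept.
    """
    n = len(parent1)
    cp = rng.randint(0, n - 1)                    # both bounds are inclusive
    h1 = parent1[:cp + 1] + parent2[cp + 1:]
    h2 = parent2[:cp + 1] + parent1[cp + 1:]
    return h1, h2, cp


parent1 = [1, 0, 1, 1, 0, 0, 1, 0]
parent2 = [0, 1, 1, 0, 1, 0, 0, 1]
child, _, cp = one_point_crossover(parent1, parent2)
print(cp, child)
```

When *cp* = *n* – 1 the right section is empty and the first offspring is a copy of the first parent, which is a direct consequence of the range given for *cp*.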

#### *4.3. Uniform Mutation Operator*

The uniform mutation operator generates a new solution for the mutation population from a given solution vector *S* = {*s*<sub>0</sub>, *s*<sub>1</sub>, ..., *s*<sub>*n*–1</sub>} [24]. For each index *i*, with 0 ≤ *i* ≤ *n* − 1, the process generates a random number *u* in the range [0, 1]; if *u* < *mut*, the value of *s*<sub>*i*</sub> changes from 1 to 0 or vice versa; otherwise, the value of *s*<sub>*i*</sub> remains intact. The parameter *mut* is the mutation probability used by the operator. Figure 3 shows an example of the use of this mutation.

**Figure 3.** Example of when an element changes its value.

Another parameter of the operator is the number of new mutated solutions that must be generated. Usually, the solutions that undergo this process come from the crossover operator's results; otherwise, they are randomly chosen.
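A minimal Python sketch of the uniform mutation follows. The function name and the `rng` argument are hypothetical; the logic flips each bit independently with probability *mut*.

```python
import random


def uniform_mutation(solution, mut, rng=random):
    """Return a mutated copy of a binary solution vector.

    For every position a random number u in [0, 1) is drawn;
    the bit is flipped when u < mut and kept otherwise.
    """
    mutated = []
    for s in solution:
        u = rng.random()
        mutated.append(1 - s if u < mut else s)
    return mutated


portfolio = [1, 0, 1, 1, 0, 0, 1, 0]
print(uniform_mutation(portfolio, mut=0.1))
```

The input vector is left untouched, so the same parent can be reused to generate several mutated solutions.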

#### *4.4. Initial Population*

A predefined number of randomly generated solutions are created to have an initial population. When a new random solution is generated, the objectives vector for the solution is determined and its feasibility is verified.

#### *4.5. Population Sorting*

This process consists of sorting the solutions of the population, and it is composed of two phases: (1) the elitist phase, which keeps the best solutions; and (2) the diversification phase, which ensures that there are solutions different enough to avoid local optima in the search process of the algorithm. The elitist phase is also known as non-dominated sorting. It consists of separating the population in fronts or sets of non-dominated solutions, making sure that the best solutions are always on the first front. The diversification phase sorts the solutions of a front according to the Crowding Distance indicator. The solutions in the best fronts are included in the population, and when a front cannot be completely inserted, the solutions with the worst crowding distances are discarded. Figure 4 shows both phases.

#### *4.6. Non-Dominated Sorting*

This process has two parts, and works on a given population. The first part constructs the first front with the set of non-dominated solutions identified by comparing the vectors of objective values of all the population's solutions. A solution is non-dominated if its vector of objective values is not dominated by any other. Note that the Pareto dominance uses real value vectors in its definition.

The second part builds the remaining fronts one by one. Each new front integrates those solutions that are only dominated by solutions in previously built fronts. The process repeats until no more fronts can be made.
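The two parts of the process can be sketched in Python as a simple front-peeling procedure over real objective vectors (that is, after the GMI transformation). This is an illustrative, quadratic-per-front version written for clarity; it is not the fast non-dominated sorting bookkeeping of NSGA-II, and the function names are hypothetical.

```python
def dominates(u, v):
    """True if real vector u dominates v (maximization)."""
    return all(a >= b for a, b in zip(u, v)) and any(a > b for a, b in zip(u, v))


def non_dominated_sort(objectives):
    """Split solution indices into fronts; front 0 holds the non-dominated solutions.

    Each later front contains the solutions dominated only by members
    of the fronts built before it.
    """
    remaining = set(range(len(objectives)))
    fronts = []
    while remaining:
        front = sorted(
            i for i in remaining
            if not any(dominates(objectives[j], objectives[i]) for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


points = [(3, 1), (1, 3), (2, 2), (1, 1), (0, 0)]
print(non_dominated_sort(points))
```

The loop always terminates because every finite set of vectors contains at least one non-dominated element, so each iteration removes at least one solution.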

**Figure 4.** Elitism sorting and diversification phases.

#### *4.7. Calculating the Crowding Distance*

According to [22], this process orders the solutions in a front by their Crowding Distance (CD). The distance is a measure of the separation of the solutions, and it is relative to the normalized value of the objectives. The CD identifies the solutions with extreme values on the objectives and puts them first on the front. After that, the solutions are ordered according to their accumulated degree of separation per objective: the greater the separation, the better. For each objective, the CD computes the degree of separation using the array of the front's objective values sorted by that objective; the solutions with the highest and lowest objective values receive a Crowding Distance value *d* equal to infinity (∞), while the value for the remaining solutions is calculated by the following formula:

$$d\_{I\_j^m} = d\_{I\_j^m} + \frac{f\_m^{I\_{j+1}^m} - f\_m^{I\_{j-1}^m}}{f\_m^{\max} - f\_m^{\min}} \tag{42}$$

where *d* is the Crowding Distance, *I* is the solution position in the whole population in general, *j* is the solution position after the ordering by objective *m* within the front, *f* is the objective value and *m* is the current objective. The accumulation of Crowding Distance value *d* of all the objectives results in the final value of CD for each solution *I*.
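The accumulation described by Equation (42) can be sketched in Python as follows. The function name is hypothetical, and the front is assumed to be a list of real objective vectors obtained after the GMI transformation.

```python
import math


def crowding_distance(front):
    """Return the crowding distance of every solution in a front.

    front is a list of real objective vectors; the result keeps the same order.
    Boundary solutions of each objective receive an infinite distance.
    """
    n = len(front)
    distance = [0.0] * n
    if n == 0:
        return distance
    for m in range(len(front[0])):
        order = sorted(range(n), key=lambda i: front[i][m])   # sort by objective m
        f_min = front[order[0]][m]
        f_max = front[order[-1]][m]
        distance[order[0]] = distance[order[-1]] = math.inf
        if f_max == f_min:
            continue                                          # no spread in this objective
        for pos in range(1, n - 1):
            i = order[pos]
            gap = front[order[pos + 1]][m] - front[order[pos - 1]][m]
            distance[i] += gap / (f_max - f_min)
    return distance


print(crowding_distance([(0, 4), (1, 3), (2, 2), (4, 0)]))
```

Sorting a front by decreasing distance then places the boundary solutions first and the most isolated interior solutions next.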

#### *4.8. Calculating the Spatial Spread Deviation (SSD)*

The Spatial Spread Deviation (SSD) is a density estimator used to rearrange the solutions in a front so that their spread does not vary by a wide margin [25]. The method calculates the SSD value of each solution using a matrix of normalized distances between the solutions in the approximated front. The solutions are sorted from the lowest to the highest SSD value in order to penalize solutions according to their standard deviation and their proximity to their closest *k*-neighbors. The next three equations show how to calculate the SSD values; in the process, *i* is the solution in the front for which the SSD is calculated, and *j* takes values over all the solutions in the front except *i*.

$$temp1(i) = \frac{1}{n-1} \sqrt{\sum\_{j=1}^{n} (D(i,j) - (D\_{\max} - D\_{\min}))^2} \forall \ i \neq j \tag{43}$$

$$temp2(i) = \sum\_{j \in K} \frac{(D\_{max} - D\_{min})}{D(i, j)} \tag{44}$$

$$SSD(i) = SSD\_0(i) + temp1(i) + temp2(i)\tag{45}$$

where *D*(*i*, *j*) is the distance from solution *i* to solution *j*, *D*max is the largest distance between any two solutions, and *D*min is the smallest distance between any two solutions. *K* is the set of the *k* neighbors closest to solution *i*. *SSD*<sub>0</sub> is the initial value of *SSD*, which is -*INF* if the solution is at one of the ends of the front when the normalized values of the graded mean integration of the objective values are calculated.
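Equations (43)–(45) can be sketched in Python as shown below. Several details are assumptions made for this illustration and are not stated in the text: distances are Euclidean over objectives normalized to [0, 1], *SSD*<sub>0</sub> is 0 for solutions that are not at an end of the front, a solution is considered an end when it attains the minimum or maximum of some objective, and zero distances are skipped in Equation (44). The function name and the default *k* are also hypothetical.

```python
import math


def ssd_values(front, k=2):
    """Return the SSD value of each solution in a front of real objective vectors.

    Requires at least two solutions. See the assumptions listed in the text above.
    """
    n, m = len(front), len(front[0])
    lo = [min(p[c] for p in front) for c in range(m)]
    hi = [max(p[c] for p in front) for c in range(m)]
    # ASSUMPTION: min-max normalization per objective.
    norm = [[(p[c] - lo[c]) / (hi[c] - lo[c]) if hi[c] > lo[c] else 0.0
             for c in range(m)] for p in front]
    # ASSUMPTION: Euclidean distance between normalized vectors.
    D = [[math.dist(u, v) for v in norm] for u in norm]
    pairwise = [D[i][j] for i in range(n) for j in range(i + 1, n)]
    spread = max(pairwise) - min(pairwise)                    # D_max - D_min
    values = []
    for i in range(n):
        others = [D[i][j] for j in range(n) if j != i]
        temp1 = math.sqrt(sum((d - spread) ** 2 for d in others)) / (n - 1)  # Eq. (43)
        nearest = sorted(others)[:k]                          # k closest neighbors
        temp2 = sum(spread / d for d in nearest if d > 0)     # Eq. (44), zero distances skipped
        at_end = any(norm[i][c] in (0.0, 1.0) for c in range(m))
        ssd0 = -math.inf if at_end else 0.0                   # ASSUMPTION: 0 for interior points
        values.append(ssd0 + temp1 + temp2)                   # Eq. (45)
    return values


print(ssd_values([(0, 4), (1, 3), (2, 2), (4, 0)]))
```

Sorting the front by increasing SSD value then keeps the end solutions first, followed by the solutions in less crowded regions.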

#### *4.9. Pseudocode of the T-NSGA-II Algorithm*

T-NSGA-II is based on the structure of the classic multi-objective algorithm NSGA-II proposed by Deb [22]. As previously described, the algorithm was modified in several ways to work with trapezoidal fuzzy numbers and the proposed MOPOP model. Algorithm 3 shows the detailed pseudocode of T-NSGA-II.

#### **Algorithm 3.** T-NSGA-II pseudocode

INPUT: Instance with the trapezoidal parameters of the portfolio problem.

OUTPUT: Approximated Pareto front

NOTE: The algorithm is called T-NSGA-II-CD when the Crowding Distance is used, and T-NSGA-II-SSD when the Spatial Spread Deviation is used.

\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*

1. Create the initial population *pop*

2. Evaluate all the solutions in *pop*

3. Order *pop* using non-dominated sorting

4. For all solutions in *pop* calculate Spatial Spread Deviation/Crowding Distance

5. Sort *pop* by front and by Spatial Spread Deviation/Crowding Distance

6. **Main loop, until stopping condition is met**

\*\*\* Steady state approach: only one generated individual is considered to include in *popc*

7. Create *popc* using crossover operator

8. Create *popm* using mutation operator

9. Join *popc* and *popm* to create *popj*

10. Evaluate solutions in *popj* and put the feasible ones in *popf*

11. Add *popf* to *pop*, and calculate objective functions

12. Order *pop* using non-dominated sorting

13. Calculate Spatial Spread Deviation/Crowding Distance

14. Sort *pop* by front ranking and by Spatial Spread Deviation/Crowding Distance

15. Truncate *pop* to keep a population of the original size

16. Non-dominated sorting

17. Calculate Spatial Spread Deviation/Crowding Distance of the individuals in *pop*

18. Sort *pop* by front ranking and by Spatial Spread Deviation/Crowding Distance

19. End **Main loop**

20. Return (Front 0). \*\*\* Approximated Pareto front

#### **5. T-FAME Algorithm**

This section presents the design of all the components of the T-FAME algorithm. The algorithm adapts the FAME algorithm to work with trapezoidal fuzzy numbers [25]. The input to the algorithm is an instance of MOPOP. The output is the approximate Pareto front for that instance.
T-FAME updates the population by applying the steady-state approach, so that only one of the generated individuals is included in the population. The following algorithm components are the same as those described in Section 4: the structure used to represent the solutions, the evaluation of a solution, the construction of the initial population, the sorting of the population, the non-dominated sorting process, and the SSD density estimator. The components described in this section are those not included in the previous description or with significant differences, such as the fuzzy controller, the additional genetic operators, and the structure used to store the approximated Pareto front.

#### *5.1. Fuzzy Controller*

This section introduces an intelligent mechanism that allows an MOEA to apply different recombination operators at different stages of the search process. The use of the different operators is dynamically adjusted according to their past contribution to the search.

Intuitively, the idea is to favor operators generating higher quality solutions over others. For this purpose, the fuzzy controller dynamically tunes the probability selection of the available recombination operators [25].

The fuzzy controller uses a Mamdani-type Fuzzy Inference System (FIS) [26] to compute the probability of applying the different operators. Fuzzy sets defined by membership functions represent the linguistic values of the model's input and output variables. For the inference, we use the approach originally proposed by Mamdani, based on the "max-min" composition: the minimum operator is used for implication and the maximum operator for aggregation. The consequents of the rules are aggregated into a single fuzzy set (the output), which is then defuzzified (mapped to a real value). A widely used defuzzification method is the centroid calculation, which returns the center of the area under the curve. We use triangular-shaped membership functions in all inputs and outputs,

$$\mu\_{A}(x) = \begin{cases} 0 & x < a \\ \frac{x - a}{b - a} & a \le x \le b \\ \frac{c - x}{c - b} & b < x \le c \\ 0 & x > c \end{cases} \tag{46}$$

where the parameters *a* and *c* determine the "corners" of the triangle, and *b* determines the peak. A membership function μA(x) maps each real value of *x* to a degree of membership 0 ≤ μA(x) ≤ 1. The linguistic values (granularity levels) used were: Low (*a* = −0.4, *b* = 0.0, *c* = 0.4), Mid (*a* = 0.1, *b* = 0.5, *c* = 0.9) and High (*a* = 0.6, *b* = 1.0, *c* = 1.4).
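As a minimal illustration of Equation (46), the following Python sketch evaluates the triangular membership function for the three linguistic values given above. The names `triangular`, `LEVELS`, and `fuzzify` are chosen for this example only and do not come from the original implementation.

```python
def triangular(x, a, b, c):
    """Degree of membership of x in a triangle with corners a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Parameters (a, b, c) of the three linguistic values reported in the text.
LEVELS = {
    "Low": (-0.4, 0.0, 0.4),
    "Mid": (0.1, 0.5, 0.9),
    "High": (0.6, 1.0, 1.4),
}


def fuzzify(x):
    """Return the membership degree of x in each linguistic value."""
    return {name: triangular(x, *abc) for name, abc in LEVELS.items()}


degrees = fuzzify(0.7)
```

For an input of 0.7, the degrees are 0 for Low, about 0.5 for Mid, and about 0.25 for High, so a single real value can belong partially to two adjacent linguistic values.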

The interaction of the fuzzy controller with the algorithm works as follows. Let *Operators* be the set of available genetic operators. The evolutionary algorithm monitors the search process in a series of time windows, each of size *Window*. At the end of each time window, the algorithm sends the real values of the input variables *Stagnation* and *UseOp* to the fuzzy controller, and receives from the controller the real value of the output variable *ProbOp*.

Each of the fuzzy variables is associated with the fuzzy linguistic values High, Mid and Low. The membership functions of the fuzzy variable *Stagnation* are therefore *μStagnation*=*High*(*x*), *μStagnation*=*Mid*(*x*) and *μStagnation*=*Low*(*x*). The membership functions of the variables *UseOp* and *ProbOp* are defined in a similar way.

To show how the fuzzification process works, consider that the received real values of the input variables are *Stagnation* = 0.7 and *UseOp* = 0.8.

The fuzzified values of the *Stagnation* variable are the membership degrees *μStagnation*=*High*(0.7), *μStagnation*=*Mid*(0.7) and *μStagnation*=*Low*(0.7).

For the *UseOp* variable, the fuzzified values are the membership degrees *μUseOp*=*High*(0.8), *μUseOp*=*Mid*(0.8) and *μUseOp*=*Low*(0.8). All the membership degrees are values in the interval [0, 1].

Now the FIS includes a set of fuzzy rules which are specified in terms of the fuzzy variables, the linguistic values, and a set of logic operators. To continue with the previous example, consider that the fuzzy rules in the FIS are:

$$R\_1: \text{ If } \text{Stagnation} = \text{High} \text{ and } \text{UseOp} = \text{High} \text{ then } \text{ProbOp} = \text{High} \tag{47}$$

$$R\_2: \text{ If } \text{Stagnation} = \text{High} \text{ and } \text{UseOp} = \text{Low} \text{ then } \text{ProbOp} = \text{Mid} \tag{48}$$

Once the fuzzification of the inputs is done, the next step is to evaluate the antecedents of the rules *R*<sup>1</sup> and *R*<sup>2</sup>, which yields the following values:

$$k\_1 = \min\left(\mu\_{\text{Stagnation} = \text{High}}(0.7), \mu\_{\text{UseOp} = \text{High}}(0.8)\right) \tag{49}$$

$$k\_2 = \min\left(\mu\_{\text{Stagnation} = \text{High}}(0.7), \mu\_{\text{UseOp} = \text{Low}}(0.8)\right) \tag{50}$$

In the rule evaluation, the min operator is associated with the logic operator *and*, and the max operator is associated with the logic operator *or*.

Now the membership functions of the consequents of the rules must be determined. For each rule, an implication operator is applied to the antecedent value obtained in the previous step and to the consequent of the rule, to determine the membership function of the conclusion of the rule. The *min* operator is used to implement the implication, which truncates the membership function of the rule's consequent. For the example, the truncated membership functions of the consequents are the following:

$$\mu^{\*}\_{\text{ProbOp} = \text{High}}(z) = \min\left(\mu\_{\text{ProbOp} = \text{High}}(z), k\_1\right), \quad z \in (0, 1) \tag{51}$$

$$\mu^{\*}\_{\text{ProbOp} = \text{Mid}}(z) = \min\left(\mu\_{\text{ProbOp} = \text{Mid}}(z), k\_2\right), \quad z \in (0, 1) \tag{52}$$

Now the truncated membership functions are integrated using an aggregation operator to create a new membership function, which is the controller's fuzzy output. The aggregation operators that are frequently used are *max* and *sum*.

For the example, the *max* operator is used to determine the aggregated membership function, which is the following:

$$
\mu^{\*\*}(z) = \max\left(\mu^{\*}\_{\text{ProbOp} = \text{High}}(z), \mu^{\*}\_{\text{ProbOp} = \text{Mid}}(z)\right), \quad z \in (0, 1) \tag{53}
$$

Finally, the fuzzy output is defuzzified. In this step, a real number is associated with the aggregated membership function; this number is the output of the inference process. In the example, the center of the area under the curve of the aggregated membership function is used to defuzzify the output of the controller as follows:

$$z = \frac{\int \mu^{\*\*}(z)zdz}{\int \mu^{\*\*}(z)dz} \tag{54}$$

Figure 5 graphically shows the fuzzy inference process for the example described.

**Figure 5.** Mamdani Fuzzy Inference System used in the fuzzy controller.
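The worked example of Equations (47)–(54) can be reproduced with the following Python sketch. It is a simplified stand-in for the controller and not the implementation used in the paper: it contains only the two example rules *R*<sup>1</sup> and *R*<sup>2</sup>, it approximates the centroid integral with a sum over an evenly spaced grid on [0, 1], and the names `mu`, `RULES`, and `infer` are hypothetical.

```python
def triangular(x, a, b, c):
    """Triangular membership function of Equation (46)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


LEVELS = {"Low": (-0.4, 0.0, 0.4), "Mid": (0.1, 0.5, 0.9), "High": (0.6, 1.0, 1.4)}


def mu(level, x):
    return triangular(x, *LEVELS[level])


# (Stagnation level, UseOp level, ProbOp level): only the example rules R1, R2.
RULES = [("High", "High", "High"), ("High", "Low", "Mid")]


def infer(stagnation, use_op, rules=RULES, steps=1000):
    """Mamdani max-min inference with centroid defuzzification on [0, 1]."""
    # Antecedent evaluation: 'and' is implemented with min (Eqs. 49-50).
    fired = [(min(mu(s, stagnation), mu(u, use_op)), out) for s, u, out in rules]
    numerator = denominator = 0.0
    for n in range(steps + 1):
        z = n / steps
        # Implication with min (Eqs. 51-52), aggregation with max (Eq. 53).
        aggregated = max(min(mu(out, z), k) for k, out in fired)
        numerator += aggregated * z
        denominator += aggregated
    if denominator == 0.0:
        return None  # no rule fired; the fallback is left to the caller
    return numerator / denominator  # discrete version of Eq. (54)


prob_op = infer(0.7, 0.8)
```

With *Stagnation* = 0.7 and *UseOp* = 0.8, rule *R*<sup>1</sup> fires with strength of about 0.25 and rule *R*<sup>2</sup> does not fire, so the defuzzified output under these two rules alone is roughly 0.82. The full controller uses the complete rule base of Table 2, so its output for the same inputs may differ.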

All of the controller rules are of the type: If Antecedent AND Antecedent then Consequent. The fuzzy rules were designed so that smooth changes in the input variables (*Stagnation* and *UseOp*) do not produce abrupt changes in the output variable (*ProbOp*). The configuration was done manually by observing the surface generated by these three variables [25]. Table 2 shows the rules of the fuzzy controller.


**Table 2.** Fuzzy controller rules.

Algorithm 4 shows the structure of the fuzzy controller as implemented with the Java library Fuzzy Lite 6.0.

#### **Algorithm 4.** Fuzzy controller structure.



In the [Rules] section, the first and second columns contain the linguistic values of the two input variables (1-Low, 2-Mid, 3-High), the third column is the weight of the rules, and the last one indicates the logic operator used in the rule (1-*and*, 2-*or*).

The interaction of the fuzzy controller with the algorithm works as follows. Let *Operators* be the set of available genetic operators. The T-FAME algorithm searches the solution space in time windows of size *Window*; in each time window the algorithm performs *Window* iterations. At the end of each time window, the algorithm sends to the fuzzy controller the values of the input variables *Stagnation* and *UseOp[i]* for all *i* ∈ *Operators*. For each pair of input values, a fuzzy inference generates *ProbOp[i]* for all *i* ∈ *Operators*. In the T-FAME algorithm this process corresponds to the following pseudocode, where *v* is the window counter:

36. If (*v* == *Window*) then

37. ∀ *i* ∈ {1, 2, . . . , *SizeOP*}

38. *ProbOp*(*i*) = FuzzyController(*Stagnation*, *UseOp*(*i*));

39. *v* =0; *Stagnation* = 0;

40. Endif

The line numbers are those that appear in the T-FAME algorithm pseudocode included in Section 5.4. Notice that in lines 37 and 38, the algorithm uses the fuzzy controller to update the selection probability of all the available recombination genetic operators.

The Stagnation value is shared for all the operators, and it is an indicator of the evolution of the search in the current time window. This is a normalized value that is increased by 1.0/Window each time the generated solution cannot enter the set where the non-dominated solutions are kept and reset when the time window is over. UseOp[i] is a normalized value that is increased by 1.0/Window every time the operator i is used.
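The bookkeeping described above can be sketched in Python as follows. The class `OperatorMonitor` and its method `record` are hypothetical names, and the controller is passed in as a plain function so that the sketch stays self-contained. The text states that *Stagnation* is reset at the end of the window; resetting *UseOp* at the same time is an assumption of this sketch, made so that the value stays within [0, 1].

```python
class OperatorMonitor:
    """Tracks Stagnation and UseOp over a time window (illustrative sketch)."""

    def __init__(self, n_operators, window, controller):
        self.window = window
        self.controller = controller  # callable(stagnation, use_op) -> prob
        self.prob_op = [1.0] * n_operators
        self.use_op = [0.0] * n_operators
        self.stagnation = 0.0
        self.v = 0  # iterations elapsed in the current window

    def record(self, operator, entered_front):
        """Register one iteration in which `operator` produced a solution."""
        self.use_op[operator] += 1.0 / self.window
        if not entered_front:
            self.stagnation += 1.0 / self.window
        self.v += 1
        if self.v == self.window:
            # Lines 36-40 of the pseudocode: update every operator probability.
            self.prob_op = [
                self.controller(self.stagnation, u) for u in self.use_op
            ]
            self.v = 0
            self.stagnation = 0.0
            # Assumption: UseOp is also reset when the window ends.
            self.use_op = [0.0] * len(self.use_op)


# Placeholder controller used only to exercise the bookkeeping.
monitor = OperatorMonitor(n_operators=2, window=4,
                          controller=lambda s, u: (s + u) / 2)
monitor.record(0, entered_front=False)
monitor.record(0, entered_front=True)
monitor.record(1, entered_front=False)
monitor.record(1, entered_front=False)
```

In a real run the placeholder lambda would be replaced by the fuzzy inference, so that operators that are used often during stagnation-free windows keep a high selection probability.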

#### *5.2. Additional Genetic Operators*

Four operators are used in T-FAME to create new solutions: One-point crossover, Uniform Mutation, Fixed Mutation, and Differential Evolution. Two of these operators (One-point crossover and Uniform Mutation) are the same ones used in T-NSGA-II, and they are described in the previous section.

*Differential Evolution*: This method was proposed by Storn and Price [27], and its implementation was based on [28]. It uses the four parents obtained with the tournament method. The first part of the process consists of creating a new solution called Candidate from Parent 1, Parent 2, and Parent 3; this solution is obtained by a binary addition of the parents. Figure 6 shows an example of how this operator works.

Once the Candidate is calculated, a binary crossover operator is applied to the Candidate and Parent 4 to create a new solution called Son. This binary crossover operator is different from the one-point crossover operator described previously, and it uses a parameter called crossover percentage (CP). The binary crossover operator works as follows: for each array index, a random number between 0 and 1 is generated. If that number is smaller than CP, the index receives the value of the Candidate; otherwise, it receives the value of Parent 4.

**Figure 6.** Differential evolution operator example.

Once the new solution Son is completed, a dominance test is done between Son and Parent 4. If the objective values of Parent 4 dominate the objective values of Son, then Parent 4 becomes the new solution; otherwise, Son becomes the new solution.
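The binary crossover and the final dominance test can be sketched in Python as shown next. The construction of the Candidate is omitted because the binary addition is only described through Figure 6. The function names, the `evaluate` callback, and the assumption that all objectives are maximized are illustrative choices and are not taken from the original implementation.

```python
import random


def binary_crossover(candidate, parent4, cp, rng=random):
    """For each index take the Candidate value with probability CP,
    otherwise take the value of Parent 4."""
    return [c if rng.random() < cp else p for c, p in zip(candidate, parent4)]


def dominates(f, g):
    """True if objective vector f dominates g (maximization assumed)."""
    return all(a >= b for a, b in zip(f, g)) and any(a > b for a, b in zip(f, g))


def select_survivor(son, parent4, evaluate):
    """Keep Parent 4 only if it dominates Son; otherwise keep Son."""
    if dominates(evaluate(parent4), evaluate(son)):
        return parent4
    return son


rng = random.Random(1)
candidate = [1, 0, 1, 1, 0]
parent4 = [0, 0, 1, 0, 1]
son = binary_crossover(candidate, parent4, cp=0.5, rng=rng)
```

Note that Son survives not only when it dominates Parent 4 but also when the two solutions are mutually non-dominated, which follows from the rule stated above.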

*Fixed Mutation*: This method is very similar to the uniform mutation operator described previously. The main difference is that the whole process is repeated in a loop until *n* mutations have been made, where *n* is a previously defined parameter. This operator also ensures that no element of the solution is changed more than once, by using a fixed array to keep track of the changed elements in the solution. Figure 7 shows an example of the Fixed Mutation operator.

**Figure 7.** Fixed Mutation operator example.
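A possible reading of the Fixed Mutation operator is sketched below in Python. The sketch assumes that mutating an element of the binary vector means flipping its bit, which is not stated explicitly in this section; the function name `fixed_mutation` is also hypothetical.

```python
import random


def fixed_mutation(solution, n, rng=random):
    """Flip exactly n distinct positions of a binary vector."""
    if n > len(solution):
        raise ValueError("n cannot exceed the length of the solution")
    mutated = list(solution)
    changed = [False] * len(solution)  # fixed array tracking changed elements
    done = 0
    while done < n:
        idx = rng.randrange(len(solution))
        if changed[idx]:
            continue  # this element was already mutated, draw again
        mutated[idx] = 1 - mutated[idx]  # assumption: mutation is a bit flip
        changed[idx] = True
        done += 1
    return mutated


original = [0] * 10
result = fixed_mutation(original, n=3, rng=random.Random(0))
```

Because each position can be changed at most once, the result always differs from the original solution in exactly *n* positions.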

#### *5.3. Structures Used to Store the Population and the Approximated Pareto Front*

The algorithm uses the structure *pop* to maintain a population of solutions, which contains the following information for each solution *i*:

• V(*i*): binary vector associated with the solution *i*.


The structure *Front* is used to store the approximated Pareto front, which contains the following information for each stored solution *i*:


#### *5.4. T-FAME Algorithm Pseudocode*

This section presents the pseudocode for the algorithm T-FAME in Algorithm 5.

#### **Algorithm 5.** T-FAME pseudocode

INPUT: Instance with the trapezoidal parameters of the portfolio problem.

OUTPUT: Approximated Pareto front

Variables

*pop*: Population of solutions (binary vectors)

*Front*: Limited-size set where the non-dominated solutions are kept

*Operator*: Vector of size *SizeOP* that contains the index of the available operators

*Parents*: Vector of size *NParents* that contains the chosen parents

*ProbOp*(*i*): Probability that operator *i* has of being chosen, it has values between 0 and 1

*UseOp*(*i*): Normalized indicator of how much operator *i* has been used, it has values between 0 and 1

*Stagnation*: Normalized indicator of the number of generated solutions that couldn't be inserted into *Front*, because they were either dominated solutions or there was no space available for them, it can have values between 0 and 1.

*MAXEVAL*: Maximum number of evaluations of the objective function (stopping criterion)

*Window*: Size of the time window.

*eval*: Accumulator of the evaluations of the objective function

*v*: Counter of the time windows that have elapsed

Functions

*CreateaSon*(*Operator*(*i*), *Parents*): Generates one solution using the previously chosen operator *i* with the chosen parents (Steady state)

*Evaluate*(*Son*): Calculates the objective values of *Son* and verifies feasibility

*FuzzyController*(*Stagnation*, *UseOp*(*i*)): Function that invokes the fuzzy controller with *Stagnation* and *UseOp*(*i*) as input values and returns the probability of selection of all the operators

*no-dominated\_sortingSSD*(*NewPop*): Sorts the fronts of *NewPop* by dominance and uses as ranking the *SSD* values of the solutions.

*EliminateWorstSolutionSSD*(*NewPop*): Eliminates from the last front of *NewPop* the solution with the worst *SSD*, and assigns *NewPop* to *pop*.

\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*

1. Create(*pop*) \*\*Create random population

2. *Front* = NoDominated(*pop*) \*\*Insert in *Front* the non-dominated solutions of *pop*

3. ∀ *i* ∈ {1, 2, . . . , *SizeOP*} *ProbOp*(*i*) = 1, *UseOp*(*i*) = 0

4. *v* = 0; *Stagnation* = 0; *eval* = 0;

5. while (*eval* < *MAXEVAL*) do \*\*\*\* Stop condition

\*\* Choose |*NParents*| parents

\*\* With probability *β* each parent is taken from *Front* (to intensify) and with probability 1 − *β* from *pop* (to diversify).

6. ∀ *i* ∈ {1, 2, . . . , |*NParents*|} do

7. if (RandomDouble(0,1) ≤ *β*) then

\*\*The parent is chosen from *Front*


#### **6. Experimental Results**

Three experiments were done in order to evaluate the performance of the proposed algorithms. The tested steady-state algorithms were T-NSGA-II-CD, T-NSGA-II-SSD, and T-FAME. The first experiment was done to make sure the algorithms were implemented correctly, while the second and third experiments compared their performance, using performance metrics, on instances with 25 and 100 projects, respectively.

The software and hardware platform used for these experiments consisted of an Intel Core i5 1.6 GHz processor, 4 GB of RAM, and the IntelliJ IDEA CE IDE.

#### *6.1. Performance Metrics Used*

In order to measure the performance of each algorithm, two metrics were used: hypervolume [28] and generalized spread [29].

Hypervolume is the volume of the *n*-dimensional solution space that is dominated by the solutions in the reference set. A large value means that the set is close to the Pareto front, so large values of this indicator are desirable. Generalized spread calculates the average of the distances of the points in the reference set to their closest neighbor. Small values of this indicator mean that the solutions in the reference set are well distributed.
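For the two-objective case used in the experiments, the hypervolume can be computed with a simple sweep, as in the Python sketch below. It assumes that both objectives are maximized and that the reference point is dominated by every solution of interest; the function name `hypervolume_2d` and the reference point (0, 0) used in the example are illustrative choices rather than the settings of the experiments.

```python
def hypervolume_2d(points, reference):
    """Area dominated by `points` and bounded below by `reference`
    (two objectives, maximization assumed)."""
    r1, r2 = reference
    area = 0.0
    best_f2 = r2
    # Sweep from the largest to the smallest value of the first objective.
    for f1, f2 in sorted(points, key=lambda p: (-p[0], -p[1])):
        if f1 <= r1 or f2 <= best_f2:
            continue  # outside the reference box or adds no new area
        area += (f1 - r1) * (f2 - best_f2)
        best_f2 = f2
    return area


front = [(3.0, 1.0), (2.0, 2.0), (1.0, 3.0), (1.0, 1.0)]
hv = hypervolume_2d(front, reference=(0.0, 0.0))
```

In this example the dominated point (1, 1) contributes nothing, and the three non-dominated points cover an area of 6, which illustrates why a front closer to the optimum yields a larger value.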

#### *6.2. Experimental Setup*

To configure the algorithms used in this work, the parameter values reported in the state-of-the-art literature were considered. The value of the maximum number of evaluations was determined after a preliminary experimental phase. To compare all the algorithms under the same operating conditions, all of them use a steady-state approach with the dominant son. Tables 3 and 4 show the values of the parameters used in the algorithms. The configuration of algorithm T-NSGA-II-SSD is the same as that of T-NSGA-II-CD, except that it uses Spatial Spread Deviation instead of Crowding Distance as its density estimator.

#### **Table 3.** T-NSGA-II-SSD parameters.


#### **Table 4.** T-FAME parameters.


#### *6.3. Experiment 1. Validating the Implemented Algorithms*

For this experiment, an instance named o2p25\_rand was used. This instance was originally created for POP with intervals, and it was converted into a trapezoidal fuzzy instance by adding two parameters to the intervals. The optimum Pareto front was obtained using an exhaustive algorithm, and approximate fronts were obtained with the T-NSGA-II-CD, T-NSGA-II-SSD, and T-FAME algorithms. All algorithms solve the MOPOP with fuzzy parameters and use a steady-state selection mechanism, creating one solution from the application of the genetic operators. This adaptation from FAME has an advantage over algorithms using the classic generational approach in genetic algorithms.

The purpose of this experiment is to validate the correct operation of the implemented algorithms. In the experiment, the fronts are generated and compared to the optimum front, in order to determine whether the algorithms are generating similar fronts. All the generated fronts are shown in Table 5. Each front is shown in two columns that contain the values of the two objectives, which were originally trapezoidal fuzzy numbers but were converted into real numbers with the transformation based on GMI. The graph of the fronts uses the GMI values obtained from the objectives.


**Table 5.** Generated fronts of the algorithms with instance o2p25\_rand.

It is worth noting that, in Figure 8, the approximated fronts are relatively close to and below the optimum front. Also, observe that the T-NSGA-II-SSD and T-FAME algorithms managed to reach some optimum solutions. Finally, note that the solutions of the T-FAME algorithm are well distributed.

**Figure 8.** Generated fronts of the algorithms with instance o2p25\_rand.

#### *6.4. Experiment 2. Evaluating the Performance of the Algorithms with Instances of 25 Projects*

This experiment evaluates the performance of the algorithms T-NSGA-II-CD, T-NSGA-II-SSD, and T-FAME on 13 instances with 2 objectives and 25 projects. To compare the three algorithms, each algorithm was executed 30 times per instance. The performance metrics used were hypervolume and generalized spread. For each instance, the reference set contains the non-dominated solutions obtained from the combination of the 30 generated fronts. The computation of the metrics uses the reference set as an approximation to the optimum Pareto front. The median and the interquartile range are computed from the metric values of the 30 runs sorted in ascending order. With the sorted array, the median is the average of the metric values in positions 15 and 16, while the interquartile range is taken from the values in positions 23 and 8, which correspond to 75% and 25% of the metric values, respectively. The median and the interquartile range are used instead of the average and the standard deviation because they are less sensitive to extreme values. The experiment performs hypothesis tests to validate the obtained results: the parametric Student's t-test is used on those data sets that passed the normality and homoscedasticity tests, and the non-parametric Wilcoxon signed-rank test on those that did not. Both tests apply a confidence level of 95%, pairing T-FAME with each of the other two algorithms. Tables 6–9 show the results of the normality and homoscedasticity tests done for all the instances used in this work (25 and 100 projects) and the metrics of hypervolume and generalized spread. Tables 6 and 8 show in the last column pairs (*i*, *j*), which indicate that the comparison of T-NSGA-II-CD and T-FAME uses test *i*, and the comparison of T-NSGA-II-SSD and T-FAME uses test *j*. The values *t* and *W* in (*i*, *j*) stand for the Student's t-test and the Wilcoxon test, respectively.
This work tests each instance separately.
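The positional computation of the median and the interquartile range can be illustrated with a few lines of Python. The sketch assumes exactly 30 metric values per algorithm and instance, and it interprets the interquartile range as the difference between the values in positions 23 and 8; this interpretation and the function name `median_and_iqr` are assumptions of the example.

```python
def median_and_iqr(values):
    """Median and interquartile range of 30 metric values, by position."""
    if len(values) != 30:
        raise ValueError("this positional rule is defined for 30 values")
    ordered = sorted(values)
    # Positions in the text are 1-based; Python indices are 0-based.
    median = (ordered[14] + ordered[15]) / 2  # positions 15 and 16
    q1 = ordered[7]    # position 8, about 25% of the values
    q3 = ordered[22]   # position 23, about 75% of the values
    return median, q3 - q1


metric_values = [float(x) for x in range(30, 0, -1)]  # 30.0, 29.0, ..., 1.0
median, iqr = median_and_iqr(metric_values)
```

For the values 1 to 30, the rule gives a median of 15.5 and an interquartile range of 15, regardless of the order in which the runs were recorded.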


**Table 6.** Hypervolume normality test. The null hypothesis is that the samples follow a normal distribution; it is accepted (a) when *p*-value ≥ 0.05 and rejected (r) otherwise.



**Table 8.** Generalized spread normality test. The null hypothesis is that the samples follow a normal distribution; it is accepted (a) when *p*-value ≥ 0.05 and rejected (r) otherwise.



**Table 9.** Generalized spread homoscedasticity test. The null hypothesis is that all the input samples come from populations with equal variances; it is accepted (a) when *p*-value ≥ 0.05 and rejected (r) otherwise. Observe that the null hypothesis is accepted (a) for all the instances. The parametric Student's t-test can be applied to all the instances that accept the null hypothesis in the normality tests.

Table 10 shows the performance results with the hypervolume metric, and Table 11 shows the results with the generalized spread metric. For the hypervolume metric, the algorithm with the largest value is considered to be the one with the best performance. For the generalized spread metric, the best algorithm is considered to be the one with the smallest value. The table cells show the median value of the metric (*M*) and the interquartile range (*IQR*) in the following format: *M*<sub>*IQR*</sub>. In the result tables, for each instance the best and second-best values are marked with solid or light black, respectively. In order to indicate whether the observed differences in the performance of the algorithms are significant, for each algorithm the symbol <sup>=</sup> indicates that the performance of T-FAME is significantly better than that of the algorithm with which it is being compared. The symbol > indicates the opposite, and the symbol = indicates that the difference is not significant. These symbols are marked with an asterisk when the Student's t-test was applied. To confirm the results obtained with the paired tests, a global evaluation is done with the three algorithms, by applying a Friedman test with 95% confidence.


**Table 10.** Results with the hypervolume metric.

**Table 11.** Results with the generalized spread metric.


The information presented in Table 10 shows that T-NSGA-II-CD stands out as the algorithm with the best performance in 12 of 13 cases. The results in Table 11 show that T-NSGA-II-SSD positions itself as the best algorithm in 10 of 13 cases and T-FAME in 8 of 13 cases. These differences are significant in all cases, because when the difference between the best and second-best algorithms is not significant, the algorithms are considered tied. Table 12 confirms the results observed with the Student's t-test and Wilcoxon tests. As a result of applying the Friedman test with the three algorithms, the ones with the lowest rank for the hypervolume and generalized spread metrics are T-NSGA-II-CD and T-NSGA-II-SSD, respectively.


**Table 12.** Friedman ranks of all algorithms with hypervolume and generalized spread.

#### *6.5. Experiment 3. Evaluation of the Algorithms' Performances Using Instances with 100 Projects*

As indicated previously, the previous experiment was done using instances with 25 projects, for which the algorithms had to navigate a space of binary vectors of length 25. In that case the size of the solution space was 2<sup>25</sup>. For this experiment, 9 instances of 2 objectives and 100 projects were used. These instances represent a greater complexity for the algorithms because the solution space increases to 2<sup>100</sup>. The experimental conditions were the same as in the previous experiment, using the same metrics but in a scenario of greater complexity. For each instance, the reference set contains the non-dominated solutions obtained from the combination of the 30 generated fronts. The computation of the metrics uses the reference set as an approximation to the optimum Pareto front. The median and the interquartile range are computed from the metric values of the 30 runs sorted in ascending order. With the sorted array, the median is the average of the metric values in positions 15 and 16, while the interquartile range is taken from the values in positions 23 and 8, which correspond to 75% and 25% of the metric values, respectively. The experiment performs hypothesis tests to validate the obtained results: the parametric Student's t-test is used on those data sets that passed the normality and homoscedasticity tests, and the non-parametric Wilcoxon signed-rank test on those that did not. Both tests apply a confidence level of 95%, pairing T-FAME with each of the other two algorithms. Tables 6–9 show the results of the normality and homoscedasticity tests done for all the instances used in this work (25 and 100 projects) and the metrics of hypervolume and generalized spread.

Table 13 shows the results with the hypervolume metric and Table 14 shows the results with the generalized spread metric. For the hypervolume metric, the algorithm with the largest value is considered to be the one with the best performance. For the generalized spread metric, the best algorithm is considered to be the one with the smallest value. The table cells show the median value of the metric (*M*) and the interquartile range (*IQR*) in the following format: *M*<sub>*IQR*</sub>. In the result tables, for each instance the best and second-best values are marked with solid or light black, respectively. In order to indicate whether the observed differences in the performance of the algorithms are significant, for each algorithm the symbol <sup>=</sup> indicates that the performance of T-FAME is significantly better than that of the algorithm with which it is being compared. The symbol <sup>></sup> indicates the opposite, and the symbol = indicates that the difference is not significant. These symbols are marked with an asterisk where the Student's t-test was applied. To confirm the results obtained with the paired tests, a global evaluation is done with the three algorithms, by applying a Friedman test with 95% confidence.


**Table 13.** Results with the hypervolume metric.

**Table 14.** Results with the generalized spread metric.


The information presented in Table 13 shows that T-FAME stands out as the algorithm with the best performance in 7 of 9 cases and T-NSGA-II-SSD in 5 of 9 cases. The results in Table 14 show that T-FAME stands out as the best algorithm in 6 of 9 cases and T-NSGA-II-SSD in 4 of 9 cases. These differences are significant in all cases, because when the difference between the best and second-best algorithms is not significant, the algorithms are considered tied. Table 15 confirms the results observed with the Student's t-test and Wilcoxon tests. As a result of applying the Friedman test with the three algorithms, the one that has the lowest rank for both metrics is T-FAME.

**Table 15.** Friedman ranks of all algorithms with hypervolume and generalized spread.


#### **7. Conclusions and Future Work**

This work approaches the Multi-Objective Portfolio Optimization Problem with Trapezoidal Fuzzy Parameters. To the best of our knowledge, there are no reports of this variant of the problem. This work presents, for the first time, a mathematical model of the problem and, additionally, contributes a solution algorithm that uses the Fuzzy Adaptive Multi-objective Evolutionary (FAME) methodology and two novel steady-state algorithms that apply the Non-dominated Sorting Genetic Algorithm II (NSGA-II) methodology to solve this variant of the problem. Traditionally, these kinds of algorithms use the Crowding Distance density estimator, so this work proposes replacing this estimator with the Spatial Spread Deviation to improve the distribution of the solutions in the approximated Pareto fronts. This work also contributes a defuzzification process that makes it possible to measure the algorithms' performance using commonly used real-valued metrics. The computational experiments use a set of problem instances with 25 and 100 projects and the hypervolume and generalized spread metrics. The results with the challenging instances of 100 projects show that T-FAME has the best performance among the evaluated algorithms. Three hypothesis tests supported these results, which is encouraging because they confirm the feasibility of the proposed solution approach.

The main lines of future work identified in this research are the development of algorithms for solving the problem with many objectives, preferences, and dynamic variants. Currently, we are working on replacing the fuzzy controller selector with a selector based on a reinforcement learning agent.

**Author Contributions:** Conceptualization: A.E.-P., D.L.-G., H.J.F.-H., L.C.-R.; Methodology: M.L.M.-R., N.R.-V.; Investigation: H.J.F.-H., L.C.-R.; Software: C.G.-S., N.R.-V.; Formal Analysis: H.J.F.-H.; Writing review and editing: A.E.-P., D.L.-G., H.J.F.-H., C.G.-S. All authors have read and agreed to the published version of the manuscript.

**Funding:** The authors thank CONACYT for supporting the projects from (a) Cátedras CONACYT Program with Number 3058; (b) CONACYT Project with Number A1-S-11012 from Convocatoria de Investigación Científica Básica 2017–2018 and CONACYT Project with Number 312397 from Programa de Apoyo para Actividades Científicas, Tecnológicas y de Innovación (PAACTI), for participation in the Convocatoria 2020-1 Apoyo para Proyectos de Investigación Científica, Desarrollo Tecnológico e Innovación en Salud ante la Contingencia por COVID-19. (c) A. Estrada and D. López would like to thank CONACYT for the support numbers 740442 and 931846.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Article* **A Peptides Prediction Methodology for Tertiary Structure Based on Simulated Annealing**

**Juan P. Sánchez-Hernández 1,†, Juan Frausto-Solís 2,\*,†, Juan J. González-Barbosa 2, Diego A. Soto-Monterrubio 2, Fanny G. Maldonado-Nava <sup>2</sup> and Guadalupe Castilla-Valdez <sup>2</sup>**


**Abstract:** The Protein Folding Problem (PFP) is a big challenge that has remained unsolved for more than fifty years. This problem consists of obtaining the tertiary structure or Native Structure (NS) of a protein knowing its amino acid sequence. The computational methodologies applied to this problem are classified into two groups, known as Template-Based Modeling (TBM) and ab initio models. In the latter methodology, only information from the primary structure of the target protein is used. In the literature, Hybrid Simulated Annealing (HSA) algorithms are among the best ab initio algorithms for PFP; Golden Ratio Simulated Annealing (GRSA) is a family of these algorithms designed for the PFP in peptides. In contrast, the algorithms designed with TBM use information from a target protein's primary structure and information from similar or analog proteins. This paper presents the GRSA-SSP methodology, which implements a secondary structure prediction to build an initial model and refines it with HSA algorithms. Additionally, we compare the performance of each GRSAX-SSP algorithm with its corresponding GRSAX. Finally, our best algorithm, GRSA2-SSP, is compared with PEP-FOLD3, I-TASSER, QUARK, and Rosetta, showing that it is competitive for small peptides but not when predicting the largest peptides.

**Keywords:** protein structure prediction; Hybrid Simulated Annealing; Template-Based Modeling; structural biology; Metropolis

#### **1. Introduction**

Proteins or polypeptides are macromolecules built from amino acids (aa) and are mainly responsible for living beings' functionality. Proteins are essential elements because every protein has a specific function related to its unique three-dimensional structure, named the Native Structure (NS). All proteins consist of a polymer chain of aa; chains with a small number of them are named peptides. Peptides are of significant importance to the scientific community because of their multiple applications, for instance, in pharmaceutical research [1–4], drug design [5–7], diagnosis [8–10], and therapy [11,12]. Obtaining the NS of proteins from an amino acid sequence could bring benefits to human beings.

The PFP has been identified as an important problem since Kendrew and Perutz's research teams obtained the tertiary structure of the myoglobin and hemoglobin molecules, respectively [13,14]. These studies established the relation between function and structure. The PFP consists of obtaining the three-dimensional structure of a protein with the lowest Gibbs free energy, that is, the thermodynamically stable three-dimensional conformation [15].

The PFP is considered an NP-hard problem [16]. Thus, presumably, none of the known exact algorithms can solve it in polynomial time. In other words, the execution time grows exponentially when using them. In contrast, any protein passes from the aa sequence to its NS three-dimensional structure very rapidly in nature. The latter issue is known as the Levinthal Paradox [17].

**Citation:** Sánchez-Hernández, J.P.; Frausto-Solís, J.; González-Barbosa, J.J.; Soto-Monterrubio, D.A.; Maldonado-Nava, F.G.; Castilla-Valdez, G. A Peptides Prediction Methodology for Tertiary Structure Based on Simulated Annealing. *Math. Comput. Appl.* **2021**, *26*, 39. https://doi.org/10.3390/mca26020039

Academic Editor: Leonardo Trujillo

Received: 23 February 2021 Accepted: 27 April 2021 Published: 29 April 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Several algorithms have been applied to solve the PFP successfully, and one of the most effective has been the Simulated Annealing (SA) algorithm. SA is commonly hybridized with other methods; the combined algorithms are called Hybrid Simulated Annealing (HSA) algorithms. The HSA algorithms that have been successfully applied to peptides are the following:


The HSA algorithms previously mentioned obtained excellent results for small proteins or peptides. However, when the number of aa increases, the number of variables (torsional angles of the aa) also increases, and the computational time for exploring the solution space becomes considerable. As a result, the PFP area needs new approaches to obtain better solutions for large peptides or proteins.

This paper proposes the methodology GRSA-SSP that combines GRSA algorithms with the Secondary Structure Prediction (SSP). For a given chain of aa representing a peptide or a protein, the GRSA-SSP performs two processes:


These two processes are performed in several steps described in this paper. The algorithms used in the second phase of GRSA-SSP can be one of the GRSA family algorithms. This paper named these hybrid algorithms GRSAX-SSP, where X is used to distinguish the GRSA algorithm. We evaluate our methodology using RMSD and TM-score metrics [29]. Additionally, experimentation is performed with a set of forty-five instances of peptides and a set of six mini proteins, which are compared with the most popular algorithms in the literature, such as PEP-FOLD3 [28], I-TASSER [30,31], Rosetta [24,32], and QUARK [25,33].

The paper's organization is as follows: first, we present the introduction to PFP and HSA algorithms. Then, in the Background section, we review the Protein Folding Problem definition and some relevant research in the literature, and we explain the GRSA family of algorithms. In the next section, we describe the GRSA-SSP methodology. In the Results section, we present experimentation comparing the GRSA algorithms with those of the literature; also, we analyze the presented methodology's performance. Finally, the conclusions of this research are presented.

#### **2. Background**

The PFP is a significant multidisciplinary problem that has been investigated for over half a century [34]. Different scientific areas, for example, computer science, bioinformatics, and molecular biology, have studied this problem, and three questions in particular need to be answered [34].


• Is there an algorithm that predicts the protein structure from the amino-acids sequence?

This paper is related to the last question. We propose different strategies to obtain the NS tertiary structure using GRSA family algorithms and secondary structure prediction. As we mentioned before, finding new algorithms for PFP is significant not only because of its potential applications but also because it is an NP-hard problem [16], and the number of possible combinations means that algorithms must explore a very large solution space.

#### *2.1. Definition of Ab-Initio and Force Fields*

The ab initio modeling can be defined as an optimization problem where the Gibbs free energy is the objective function f(n), and this has to be minimized. Thus, this problem is defined as follows: let there be a sequence of amino acids: n = a1, a2, . . . , an; every amino acid has associated with it a set of angles σ1, σ2, . . . , σm where m represents a particular dihedral angle; then, minimizing the energy function f(σ1, σ2, . . . , σm) provides the best tertiary structure or NS. The energy functions (force fields) are used for determining the energy of a protein structure [35], and some examples of these are AMBER [36], CHARMM [37], ECEPP/2, and ECEPP/3 [38]. The potential energy of ECEPP/2 is given by Equation (1), which is calculated in vacuo for only intramolecular energies, and this is the energy function to be minimized [39].

$$E\_{\text{total}} = \sum\_{j>i} \left( \frac{A\_{ij}}{r\_{ij}^{12}} - \frac{B\_{ij}}{r\_{ij}^{6}} \right) + 332 \sum\_{j>i} \frac{q\_i q\_j}{\varepsilon r\_{ij}} + \sum\_{j>i} \left( \frac{C\_{ij}}{r\_{ij}^{12}} - \frac{D\_{ij}}{r\_{ij}^{10}} \right) + \sum\_{n} U\_n \left( 1 \pm \cos(k\_n \phi\_n) \right) \tag{1}$$

where: *rij* is the distance in Å (angstroms) between the atoms *i* and *j*; *Aij*, *Bij*, *Cij*, and *Dij* are the parameters of the empirical potentials; *qi* and *qj* are the partial charges in the atoms *i* and *j*, respectively; ε is the dielectric constant; *Un* is the energetic torsion barrier of rotation about the bond n; *kn* is the multiplicity of the torsion angle *ϕn*.

In this paper, we use the potential energy of ECEPP/2 as an objective function because we explore the conformational space, and when the energy of the protein structure is minimized, then the protein structure is accepted.
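As a minimal sketch of how the four sums of Equation (1) combine, the following Python function evaluates a 12-6 nonbonded term, the electrostatic term, a 12-10 hydrogen-bond term, and the torsional term for toy inputs. The function name `ecepp2_like_energy`, the data layout, the default dielectric constant, and the split of atom pairs into two dictionaries are illustrative assumptions; this is not the SMMP implementation used in the experiments.

```python
import math

def ecepp2_like_energy(coords, charges, lj_pairs, hb_pairs, torsions, eps=2.0):
    """Toy evaluation of the four sums in Equation (1) (hypothetical helper).

    coords   : list of (x, y, z) atom positions in angstroms
    charges  : list of partial charges q_i
    lj_pairs : dict {(i, j): (A_ij, B_ij)} for the 12-6 term
    hb_pairs : dict {(i, j): (C_ij, D_ij)} for the 12-10 term
    torsions : list of (U_n, sign, k_n, phi_n), sign = +1 or -1 for the "±"
    eps      : dielectric constant (illustrative default, an assumption)
    """
    def dist(i, j):
        return math.dist(coords[i], coords[j])

    # Nonbonded 12-6 term
    e_lj = sum(a / dist(i, j) ** 12 - b / dist(i, j) ** 6
               for (i, j), (a, b) in lj_pairs.items())
    # Electrostatic term over all pairs j > i
    n = len(coords)
    e_el = 332.0 * sum(charges[i] * charges[j] / (eps * dist(i, j))
                       for i in range(n) for j in range(i + 1, n))
    # Hydrogen-bond 12-10 term
    e_hb = sum(c / dist(i, j) ** 12 - d / dist(i, j) ** 10
               for (i, j), (c, d) in hb_pairs.items())
    # Torsional term
    e_tor = sum(u * (1.0 + s * math.cos(k * phi)) for u, s, k, phi in torsions)
    return e_lj + e_el + e_hb + e_tor
```

In an optimizer, a function of this shape would play the role of the objective evaluated after each change of the torsion angles.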

#### *2.2. Computational Approaches for PFP*

The CASP organization has classified PFP models into two main groups:

Group 1: Template-based modeling (TBM). In this group, we find algorithms that use biological information obtained from the secondary structure of the target protein, homology, and fragments of other proteins. These algorithms have achieved good results for predicting protein structures in the CASP [32,40,41]. TBM involves several strategies; some of the most common are homology [42,43], threading [44], and fragment assembly [30,45].

Group 2: Ab initio. This prediction approach classically refers to the determination of the NS using only the aa sequence information. Ab initio algorithms have achieved good PFP results, but only for small proteins with fewer than 120 residues [46]. Ab initio modeling is the most challenging approach because it uses the amino acid sequence as its only information. Finding an optimal solution with ab initio is very difficult for big proteins because the solution space is enormous.

These two groups can be applied to small proteins or peptides (between 5 and 50 aa) [28,47]. There are successful studies applied to protein prediction using SA [48–50] or Monte Carlo algorithms with Metropolis-Hastings [26,27]. The Monte Carlo algorithms are also applied to the inverse protein folding problem, whose objective is to find a sequence given a structure [51,52]. This paper focuses on the classical PFP, which consists of finding the functional structure given an aa sequence.

Rosetta is a de novo protein structure prediction approach that builds models of the tertiary structure using the primary and secondary structure predictions. The algorithm generates a local sequence to produce local structures (fragments) that form a target protein template. The fragments are then assembled randomly using a Monte Carlo simulated annealing algorithm. Finally, the fitness of individual conformation interactions is evaluated based on a scoring function derived from known protein structures. However, only peptides longer than 27 aa can be provided as input [32].

Another PFP approach is I-TASSER (Iterative Threading ASSEmbly Refinement). It has four principal parts: generating a template using a multi-threading method, fragment assembly method, refinement process, final model selection, and annotation tools. I-TASSER aligns the target sequence and divides it into aligned regions, using LOMETS [53,54], and nonaligned regions, using the Monte Carlo algorithm. In the last step, annotation of functions is performed based on the structural models obtained using the BioLIP [55] database of ligand-protein interactions. Finally, I-TASSER predicts protein structures from 10 to 1500 amino acids [31].

PEP-FOLD3 has a framework to predict the tertiary structure of peptides using de novo structure modeling. The process of predicting structure consists of three stages. Firstly, for a peptide amino acid sequence, a support vector machine is applied to predict the structural alphabet of fragments. Secondly, several models are generated using series of states and refined by a Monte Carlo algorithm. Finally, the five best conformations are selected [28].

Another approach is QUARK [33], in which an ab initio strategy is used to predict protein structures in ranges of 20 to 200 aa. Additionally, an assembly process of fragments with small structures is carefully selected and applied in the target sequence using a Monte Carlo algorithm.

SAINT2 is a fragment-based de novo structure prediction approach that has been successfully compared with the CASP12 approaches [56]. It consists of a sequence-to-structure pipeline divided into four principal sections: (a) the secondary structure prediction where PSI-PRED [57] is applied, (b) the torsion angles prediction using SPINE-X [58], (c) a fragment library with the Flib package, and (d) the residue-residue contact prediction applying metaPSICOV [59]. Finally, the highest-scoring model is selected. In our methodology, sections (a) and (b) are applied, and they are shown in Figure 1.

#### The GRSA Family Algorithms

The SA algorithm is inspired by the physical annealing process of metals [60,61]. The algorithm has been applied with success in many NP-hard problems [20], including the PFP. SA employs the Metropolis algorithm to efficiently explore the solution space and obtain a good solution to optimization problems. We show the pseudocode of SA in Algorithm 1. *Ti* and *Tf* parameters define the initial and final temperatures, respectively; the α parameter represents the cooling factor. In the Metropolis cycle, new solutions are generated by a perturbation function. Finally, to accept or reject a new solution, an acceptance criterion based on Boltzmann distribution is applied (lines 11–14). The SA algorithm is executed until the final temperature, *Tf*, is reached. The SA algorithm source code is available at https://github.com/DrJuanFraustoSolis/SimulatedAnnealing.git (accessed on 28 April 2021).
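The description above can be sketched in Python as follows. This is a generic simulated annealing loop under stated assumptions, not the authors' Fortran code from the repository: the function name `simulated_annealing`, the callables `energy` and `perturb`, the fixed Metropolis length, and the tracking of the best solution seen are illustrative choices. The final temperature must be positive for the loop to terminate.

```python
import math
import random

def simulated_annealing(energy, perturb, s0, t_init, t_final, alpha=0.95,
                        metropolis_len=100, rng=None):
    """Generic SA sketch: geometric cooling with a Boltzmann acceptance test."""
    rng = rng or random.Random()
    s_i, e_i = s0, energy(s0)
    best, e_best = s_i, e_i                # best solution seen (an added convenience)
    t_k = t_init
    while t_k >= t_final:                  # temperature cycle
        for _ in range(metropolis_len):    # Metropolis cycle
            s_j = perturb(s_i, rng)
            e_j = energy(s_j)
            delta = e_j - e_i
            # Always accept improvements; accept worse moves with
            # probability exp(-delta / t_k)
            if delta <= 0 or math.exp(-delta / t_k) > rng.random():
                s_i, e_i = s_j, e_j
                if e_i < e_best:
                    best, e_best = s_i, e_i
        t_k *= alpha                       # cooling step with factor alpha
    return best, e_best
```

For example, `simulated_annealing(lambda x: x * x, lambda x, r: x + r.uniform(-1, 1), 10.0, 10.0, 0.01)` minimizes a one-dimensional quadratic; for the PFP, the state would be the vector of torsion angles and `energy` the force field.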

**Algorithm 1.** SA algorithm Procedure.

**Figure 1.** Methodology GRSA-SSP for peptide prediction.

However, when the solution space is very large, the algorithm's exploration takes a long time to obtain optimal solutions. Thus, new algorithms are necessary. The GRSA algorithm was proposed for this purpose and has been successfully applied to different NP problems [62,63], including the PFP [18]. The main characteristics of GRSA are a cooling scheme that decreases according to Tfp temperature cuts calculated by the golden number (Φ) and a stop criterion that reduces the cost of exploration (Algorithm 2). GRSA has a similar structure to the SA algorithm (lines 4 to 16). The difference from SA is that GRSA calculates Tfp temperature cuts (five cuts are recommended), and each cut is associated with an α parameter in the range [0.7, 1] (the common highest value is 0.95); the intermediate α values in this range are determined with an increment δ, which represents the α increment from the lowest to the highest α value (in this case, δ = 0.05). These α values are associated with each temperature cut (line 17). The algorithm reduces the temperature cooling speed; thus, the execution time, corresponding to lines 18 to 23, decreases. Finally, to avoid wasting time at low temperatures, where the quality of the result is not improved, a stop criterion was implemented using the least-squares method (lines 24 to 29). This stop criterion detects the stochastic equilibrium over some *i* Metropolis cycles. We measure the slope m (a global variable) of the linear regression of the energy of these cycles. In this regression, we used the coordinates (*i*, *Ei*), where *i* is in the range [2, *κmax*]. In our case, we used *κmax* = 5. The equilibrium is found when m, calculated by Equation (2), is close to zero.

$$m = \frac{\kappa \sum\_{i=2}^{\kappa} i E\_i - \left(\sum\_{i=2}^{\kappa} i\right) \left(\sum\_{i=2}^{\kappa} E\_i\right)}{\kappa \sum\_{i=2}^{\kappa} i^2 - \left(\sum\_{i=2}^{\kappa} i\right)^2} \tag{2}$$

Equation (2) can be written as Equation (3):

$$m = \frac{12\sum\_{i=2}^{\kappa} iE\_i - 6(\kappa - 1)(\sum\_{i=1}^{\kappa} E\_i)}{\kappa^3 - \kappa} \tag{3}$$

where: *κ* is the number of Metropolis cycles for measuring the slope, *i* is the iteration of every Metropolis cycle, and *Ei* is the energy in each iteration.

The evaluation of m in Equation (2) does not add significant execution time; the summations in Equation (3) are only cumulative operations in Algorithm 3. This algorithm determines the equilibrium with Equation (3). The GRSA algorithm source code is available at https://github.com/DrJuanFraustoSolis/GRSA.git (accessed on 28 April 2021).
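To illustrate why the slope can be obtained from two running sums, the sketch below computes the least-squares slope of the points (i, E_i) both directly and in closed form. It assumes the textbook indexing i = 1, ..., κ, for which Σi = κ(κ+1)/2 and Σi² = κ(κ+1)(2κ+1)/6 give m = (12 Σ iE_i − 6(κ+1) Σ E_i)/(κ³ − κ); this indexing, the function names, and the tolerance `eps` are assumptions and may differ from the convention used in the equations above.

```python
def slope_direct(energies):
    """Least-squares slope of the points (i, E_i), i = 1..k, with k >= 2."""
    k = len(energies)
    idx = range(1, k + 1)
    s_i = sum(idx)
    s_e = sum(energies)
    s_ie = sum(i * e for i, e in zip(idx, energies))
    s_ii = sum(i * i for i in idx)
    return (k * s_ie - s_i * s_e) / (k * s_ii - s_i ** 2)

def slope_closed_form(energies):
    """Same slope using only the cumulative sums of i*E_i and E_i."""
    k = len(energies)
    s_e = sum(energies)
    s_ie = sum(i * e for i, e in enumerate(energies, start=1))
    return (12 * s_ie - 6 * (k + 1) * s_e) / (k ** 3 - k)

def in_equilibrium(energies, eps=1e-3):
    """Hypothetical stop test: the energy trend is flat within eps."""
    return abs(slope_closed_form(energies)) <= eps
```

Because only Σ iE_i and Σ E_i change between cycles, the stop test costs a few additions per Metropolis cycle.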

**Algorithm 2.** GRSA algorithm Procedure.

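```
 1  Data: Ti, Tfp, Tf, E, S, α, Φ, δ
 2  Initialize α, Φ, δ
 3  Tfp = Ti; Tk = Ti; initialize E
 4  Si = generateSolution()
 5  while Tk ≥ Tf do                            // Temperature cycle
 6    while Metropolis length do                // Metropolis cycle
 7      Sj = perturbation(Si)
 8      ΔE = Energy(Sj) − Energy(Si)
 9      if ΔE ≤ 0 then
10        Si = Sj
11        E = Energy(Si)
12      else if e^(−ΔE/Tk) > random[0, 1] then
13        Si = Sj
14        E = Energy(Si)
15      end if
16    end while                                 // End Metropolis cycle
17    Tfp = Tfp · Φ                             // Golden ratio section (five cuts recommended)
18    if Tk ≤ Tfp then                          // Update cooling speed
19      αnew = α + δ
20      Tk = αnew · Tk
21    else
22      Tk = Tk · α
23    end if
24    if Tk ≤ Tfp_n then                        // Stop criterion
25      m = Equilibrium(E)
26      if m ≈ ε then
27        Tk = Tf
28      end if
29    end if
30  end while                                   // End Temperature cycle
31  end Procedure
```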
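The cooling scheme of GRSA can be illustrated with a short Python sketch. It assumes that the temperature cuts are obtained by repeatedly multiplying the initial temperature by the golden section 1/Φ ≈ 0.618 and that α grows by δ each time the temperature crosses a cut, capped at `alpha_max`. The names `golden_cuts` and `grsa_schedule` and the default values (taken from the ranges quoted in the text) are illustrative assumptions, not the authors' implementation.

```python
GOLDEN_SECTION = (5 ** 0.5 - 1) / 2  # 1/Φ, about 0.618

def golden_cuts(t_init, n_cuts=5):
    """Temperature cuts t_init * 0.618**n for n = 1..n_cuts (assumed form)."""
    return [t_init * GOLDEN_SECTION ** n for n in range(1, n_cuts + 1)]

def grsa_schedule(t_init, t_final, alpha=0.70, delta=0.05, alpha_max=0.95,
                  n_cuts=5):
    """Return the list of (temperature, alpha) pairs visited while cooling."""
    cuts = golden_cuts(t_init, n_cuts)
    next_cut = 0
    t_k = t_init
    schedule = []
    while t_k >= t_final:
        schedule.append((t_k, alpha))
        # Each crossed cut raises alpha, which slows the cooling
        while next_cut < len(cuts) and t_k <= cuts[next_cut]:
            alpha = min(alpha + delta, alpha_max)
            next_cut += 1
        t_k *= alpha
    return schedule
```

With these defaults, the schedule starts with fast cooling (α = 0.70) at high temperatures and ends with slow cooling (α = 0.95) once all five cuts have been crossed.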

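**Algorithm 3.** Equilibrium Function.

```
Equilibrium(E)
  Data: i, CE, Kmax, SumE, m
  if i ≤ Kmax then
    CE = CE + i · E                  // running sum of i · Ei
    SumE = SumE + E                  // running sum of Ei
    i = i + 1
  end if
  if i = Kmax then
    m = (12 · CE − 6 · (i − 1) · SumE) / (i³ − i)    // Equation (3)
  end if
  return m
end Function
```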
The EGRSA (Algorithm 4) is an algorithm obtained by hybridizing GRSA with evolutionary techniques. This algorithm has an evolutionary perturbation (EGRSApert) in the GRSA phase (line 7), where a genetic algorithm is used. The EGRSA algorithm starts with a set of individuals generated for determining the initial solution, designated Si. Then, in the Metropolis cycle, Si is perturbed by EGRSApert to generate new solutions. Next, the best individual of the population is selected as the solution Sj (lines 9 and 10). EGRSA is similar to GRSA, and both apply a stop criterion (see Algorithm 2.1) by the least-squares method [64,65] (lines 24–29). Algorithm 5 presents the EGRSApert function, where one individual is a set of dihedral angles [Φ1, Ψ1, χ1, ω1, Φ2, Ψ2, χ2, ω2, ..., Φn, Ψn, χn, ωn] and a population is a set of individuals. Then, crossover and mutation operators are applied to generate new solutions by the perturbation function. Finally, when the number of generations is reached, the best individual of the population is selected. The EGRSA algorithm source code is available at https://github.com/DrJuanFraustoSolis/EGRSA.git (accessed on 28 April 2021).
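A minimal Python sketch of an evolutionary perturbation of this kind is given below. It treats an individual as a list of dihedral angles in degrees and applies one-point crossover and a wrapped random mutation; the operators, rates, population size, elitist selection, and the names `crossover`, `mutate`, and `evolutionary_perturbation` are assumptions for illustration and are not taken from the EGRSA source code.

```python
import random

def crossover(parent_a, parent_b, rng):
    """One-point crossover of two equal-length angle vectors (length >= 2)."""
    point = rng.randrange(1, len(parent_a))
    return parent_a[:point] + parent_b[point:]

def mutate(individual, rng, rate=0.1, step=30.0):
    """Shift each angle with probability `rate`, wrapping to [-180, 180)."""
    out = []
    for angle in individual:
        if rng.random() < rate:
            angle = (angle + rng.uniform(-step, step) + 180.0) % 360.0 - 180.0
        out.append(angle)
    return out

def evolutionary_perturbation(s_i, energy, rng, pop_size=10, generations=5):
    """Evolve a small population seeded from s_i and return its best member."""
    population = [mutate(s_i, rng, rate=0.5) for _ in range(pop_size - 1)]
    population.append(list(s_i))
    for _ in range(generations):
        population.sort(key=energy)
        parents = population[: pop_size // 2]   # keep the best half
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return min(population, key=energy)
```

Because the current solution is kept in the population and the best half always survives, the returned individual is never worse than `s_i` in this sketch; the Metropolis test of the surrounding annealing loop then decides whether to accept it.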

**Algorithm 4.** EGRSA algorithm Procedure.

```
 1  Data: Ti, Tfp, Tf, E, S, α, Φ, δ
 2  Initialize α, Φ, δ
 3  Tfp = Ti; Tk = Ti; initialize E
 4  Si = generateSolution()
 5  while Tk ≥ Tf do                            // Temperature cycle
 6    while Metropolis length do                // Metropolis cycle
 7      Sj = EGRSApert(Si)
 8      ΔE = Energy(Sj) − Energy(Si)
 9      if ΔE ≤ 0 then
10        Si = Sj
11        E = Energy(Si)
12      else if e^(−ΔE/Tk) > random[0, 1] then
13        Si = Sj
14        E = Energy(Si)
15      end if
16    end while                                 // End Metropolis cycle
17    Tfp = Tfp · Φ                             // Golden ratio section (five cuts recommended)
18    if Tk ≤ Tfp then                          // Update cooling speed
19      αnew = α + δ
20      Tk = αnew · Tk
21    else
22      Tk = Tk · α
23    end if
24    if Tk ≤ Tfp_n then                        // Stop criterion
25      m = Equilibrium(E)
26      if m ≈ ε then
27        Tk = Tf
28      end if
29    end if
30  end while                                   // End Temperature cycle
31  end Procedure
```


The GRSA2 algorithm [23] is a hybridization of GRSA with the CRO algorithm [66]. GRSA2 (Algorithm 6) is an enhancement of GRSA. It has the same structure as the previous algorithms reviewed in this paper. Specifically, GRSA2 has two principal differences: the perturbation phase, which applies decomposition and soft collision (line 8), and the acceptance criterion (lines 10 to 14). In Algorithm 7, we show the perturbation process implemented in the GRSA2pert function. In GRSA2, two soft collisions are used (unimolecular and intermolecular). This algorithm has been applied only to the PFP with a set of 19 peptides and compared with the I-TASSER and PEP-FOLD3 approaches, obtaining outstanding results in the case of peptides [23]. The GRSA2 algorithm source code is available at https://github.com/DrJuanFraustoSolis/GRSA2.git (accessed on 28 April 2021).

**Algorithm 6.** GRSA2 algorithm Procedure.

**Algorithm 7.** GRSA2pert Function.

```
GRSA2pert(Si)
  Data: moleColl, b
  if (b > moleColl) then
    Randomly select one particle Mω
    if Decomposition criterion met then
      Sj = Decomposition(Si)
    else
      Sj = SoftCollision(Si)
    end if
  end if
  return Sj
end Function
```
#### **3. GRSA-SSP Methodology**

In this section, we present the GRSA-SSP methodology (Figure 1). This methodology has two main processes:


The GRSA-SSP methodology has an input (amino acid sequence), an output (tertiary structure prediction), and four stages: (1) secondary structure prediction, (2) torsion angles prediction, (3) template construction, and (4) refinement by GRSAX algorithms. Next, we explain each of these stages:

Input (Amino acid sequence). The amino acid sequences are taken as input.


Output. The GRSAX-SSP algorithm obtains the tertiary structure prediction.

#### **4. Results**

We implemented the following GRSAX-SSP algorithms with the proposed methodology: (a) GRSA0-SSP using classical SA [19], (b) GRSA1-SSP using the original GRSA [21], (c) GRSAE-SSP using EGRSA [22], and (d) GRSA2-SSP using GRSA2 [23]. For all of them, we used the methodology presented in Figure 1. The peptides in this experimentation have 9 to 49 amino acids. The number of variables (torsion angles) for each peptide in this data set is in the range [49, 304]. We chose this set because these instances (peptides) were used before in the literature. This set was also useful for comparing the GRSA2-SSP algorithm with the top-performing approaches of the CASP, which can be used for small peptides. We compared the last algorithm with I-TASSER, PEP-FOLD3, QUARK, and Rosetta, which are among the best algorithms in the CASP competition. We distinguish the GRSAX-SSP algorithms from those that only apply ab initio by naming the latter GRSAX. Table 1 presents the set of 45 instances, taken from [23,28,68,69] and sorted by the number of variables; a PDB code represents each peptide.



In the experimentation, the GRSAX-SSP algorithms were executed 30 times to validate the results. The energy function ECEPP/2 is computed with the SMMP framework [38]; it is the objective function of our optimization algorithms. An analytical tuning [20] was performed to obtain the initial and final temperature for each instance. In GRSA0-SSP, the α value is 0.95, and the temperature range has zero golden sections. For the GRSA1-SSP, GRSAE-SSP, and GRSA2-SSP algorithms, the same cooling scheme was used, with α values from 0.75 to 0.95 and five golden ratio sections, which was determined by experimentation [21–23]. The GRSAX-SSP algorithms were implemented in Fortran and executed in one of the terminals of the Ehecatl cluster at TecNM/IT Ciudad Madero, which has the following characteristics: Intel® Xeon® processor at 2.30 GHz, 64 GB of memory (4 × 16 GB, DDR4-2133), and the Linux CentOS operating system.

We used the minimum energy, the RMSD, and the TM-score to evaluate the results; the latter two are metrics of structural quality used for PFP algorithms. The RMSD is a structural measure between the native structure and the one predicted by the GRSAX-SSP algorithms and by classical SA, named here GRSA0:


The TM-score metrics can be calculated using the TM-align [70] (an algorithm to obtain the best structural alignment between two proteins) or in a classical formulation [29]. In this paper, we use the classical formulation of TM-score.
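As a hedged illustration of these two metrics, the Python sketch below computes the RMSD and a TM-score-style sum for two structures that are already superposed and matched atom by atom. The optimal superposition step is omitted, and the distance scale `d0` is left as a caller-supplied parameter because its length-dependent definition is not reproduced here; the function names are hypothetical.

```python
import math

def rmsd(pred, native):
    """RMSD in angstroms between two already-superposed coordinate lists."""
    if len(pred) != len(native):
        raise ValueError("structures must have the same number of atoms")
    sq = sum(math.dist(p, q) ** 2 for p, q in zip(pred, native))
    return math.sqrt(sq / len(pred))

def tm_score_fixed(pred, native, d0):
    """TM-score-style average of 1 / (1 + (d_i / d0)^2) for a fixed superposition."""
    total = sum(1.0 / (1.0 + (math.dist(p, q) / d0) ** 2)
                for p, q in zip(pred, native))
    return total / len(native)
```

A lower RMSD and a TM-score closer to 1 both indicate a prediction closer to the native structure.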

GRSAX-SSP algorithms use a model determined by the secondary structure, which is then refined to obtain a better prediction. The results are compared with the GRSAX algorithms based on ab initio, which only use the amino acid sequence as information. Figures 2–5 show average results related to energy (kcal/mol), RMSD, and TM-score for each peptide. The numbers on the x-axis represent the instances or peptides of Table 1, and each instance is a set of torsional angles X = [Φ1, Ψ1, χ1, ω1, Φ2, Ψ2, χ2, ω2, ..., Φn, Ψn, χn, ωn] associated with each amino acid. We averaged the results of 30 executions for comparison.

**Figure 2.** Comparison of GRSA0 versus GRSA0-SSP.

Figure 2 shows that GRSA0-SSP behaves better than GRSA0 (classical SA). Note that GRSA0-SSP obtained the lowest energy in all the peptides. The RMSD is more stable in the small instances (1–16), and in the remaining instances, the behavior is equal. Additionally, when we compared with TM-score, the behavior, in general, is similar. In conclusion, by implementing this methodology in GRSA0-SSP with these instances, we obtained slightly improved results.

Figure 3 presents the comparison of the GRSA1-SSP versus GRSA1 with the same metrics; we observed the behavior with the 45 instances evaluated. In terms of energy, RMSD, and TM-score, the performance of GRSA1-SSP is equivalent to GRSA1.

**Figure 3.** Comparison of GRSA1 versus GRSA1-SSP.

Figure 4 shows the behavior of GRSAE-SSP, and we compared it with the original GRSAE algorithm. In this figure, we can appreciate that the results are equivalent in all cases when energy, RMSD, and TM-score are used for comparison.

**Figure 4.** Comparison of GRSAE versus GRSAE-SSP.

In Figure 5, we present the comparison of GRSA2 versus GRSA2-SSP. Note that GRSA2-SSP is clearly superior in every instance according to the metrics of energy, RMSD, and TM-score. In this case, applying the GRSA-SSP methodology improved the behavior of the classical GRSA2 algorithm.

Finally, in Figure 6, we present the comparison of the GRSAX-SSP family algorithms. We observe that GRSA2-SSP obtains better values than the other algorithms in several instances. Therefore, among the algorithms with secondary structure prediction, GRSA2-SSP has the best behavior.

Furthermore, Figure 7 presents the computational time of the GRSAX-SSP family algorithms. The GRSA2-SSP has the best behavior in time with low values in most of the instances compared to the other algorithms.

**Figure 5.** Comparison of GRSA2 versus GRSA2-SSP.

**Figure 6.** Comparison of GRSAX-SSP algorithms.

**Figure 7.** Comparison of the average time of the 30 executions of GRSAX-SSP algorithms.

Table 2 presents the results obtained by GRSA2-SSP. For each instance, we show the best TM-score and their RMSD. Additionally, we calculated the average of the RMSD and TM-score for the five best predictions. Complementing the results, we determined the standard deviation (std) of the RMSD and TM-score for the five best predictions and included the best type of secondary structure: A (mainly alpha), B (mainly beta), and N (mainly none). This classification as A, B, and N is based on the secondary structure predominating in each peptide [27,68,69,71,72]. We sort Table 2 by the number of amino acids for comparing the best results obtained by GRSA2-SSP with the best algorithms of the literature. This comparison is presented in Figures 9–11.


**Table 2.** Results obtained by GRSA2-SSP.

**Note:** PDB code (Instance), number of amino acids (aa), SS is the predominant secondary structure type: beta strand (B), alpha-helix (A) and none (N), TM<sup>1</sup> = TM-score.

Figure 8 shows the GRSA2-SSP algorithm performance with instances classified by secondary structure. We show that the GRSA2-SSP algorithm has the best behavior in alpha structure instances evaluated with TM-score in Figure 8a and RMSD metrics in Figure 8b. The values in Figure 8 are the best obtained using TM-score and their RMSD. In Figure 8c,d, we present the TM-score average for the five best predictions and their RMSD average.

**Figure 8.** GRSA2-SSP according to the type of secondary structure.

In Figures 9–11, we present the behavior of the GRSA2-SSP algorithm, and we compare it with the results obtained from the approaches PEP-FOLD3, I-TASSER, QUARK, and Rosetta. We divided the dataset of Table 1 into three groups of 15 instances; groups 1, 2, and 3 have instances 1–15, 16–30, and 31–45. We compared these groups using the metrics RMSD, TM-score, GDT-TS [73], and TM-score (classical), and we present the best TM-score, the average of the five best predictions of the TM-score, and their RMSD. Additionally, we present the GDT-TS average and TM-score average.
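For readers unfamiliar with GDT-TS, the sketch below shows its usual form: the average, over the distance cutoffs 1, 2, 4, and 8 Å, of the fraction of residues whose predicted position lies within the cutoff of the native position. The sketch assumes a single fixed superposition (the full metric searches over superpositions) and reports the score on a 0 to 1 scale; the function name `gdt_ts` is illustrative.

```python
import math

def gdt_ts(pred, native, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """GDT-TS-style score for two already-superposed coordinate lists."""
    dists = [math.dist(p, q) for p, q in zip(pred, native)]
    # Fraction of residues within each cutoff, then the mean over cutoffs
    fractions = [sum(d <= c for d in dists) / len(dists) for c in cutoffs]
    return sum(fractions) / len(cutoffs)
```

Averaging this value over the five best predictions of each instance would give a per-group GDT-TS average of the kind reported in the figures.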

In Figure 9, we present the comparison of the first group, and we observed that GRSA2-SSP behaves similarly to I-TASSER and PEP-FOLD3, but in this group of small peptides, PEP-FOLD3 is slightly better than our algorithm and I-TASSER when GDT-TS is compared (Figure 9e). Furthermore, we observed that our algorithm is competitive in this group. In this comparison, Rosetta and QUARK were not included because the minimum numbers of amino acids they can predict are 27 and 20, respectively.

**Figure 9.** Comparison of GRSA2-SSP, PEP-FOLD3, and I-TASSER by RMSD (up to 15 amino acids). Figure 9 (**a**) best TM-score and (**b**) their RMSD, (**c**) TM-score average of the five best predictions, (**d**) RMSD average of the five best predictions, (**e**) GDT-TS average.

Figure 10 compares the second group of 16 to 30 amino acids with the best and the five best obtained using the TM-score metric and their RMSD, and the GDT-TS average. In this comparison, we added the second group of instances' results of QUARK; Rosetta was omitted because it is unable to predict most of the instances of this group.

In Figure 10a, we observe very similar behavior among GRSA2-SSP, PEP-FOLD3, I-TASSER, and QUARK. Note that in this figure, GRSA2-SSP and PEP-FOLD3 obtain the best prediction. In Figure 10c, when the best five predictions are compared, I-TASSER obtains the best results, followed by PEP-FOLD3 and GRSA2-SSP. Additionally, when the RMSD average is compared (Figure 10d), I-TASSER is the best, followed by PEP-FOLD3 and GRSA2-SSP. Finally, in Figure 10e, when GDT-TS is compared, GRSA2-SSP has a similar performance to PEP-FOLD3, I-TASSER, and QUARK. According to this figure, GRSA2-SSP and I-TASSER obtained a similar average.

Figure 11 compares the third group of 31 to 49 amino acids with the five best results obtained using the TM-score metric and their RMSD and GDT-TS. This comparison added the Rosetta approach because it can process the number of aa in this group. As we observe, the best algorithm is I-TASSER, followed by Rosetta, QUARK, PEP-FOLD3, and finally GRSA2-SSP.

**Figure 10.** Comparison of GRSA2-SSP, PEP-FOLD3, and I-TASSER by TM-score (16 to 30 amino acids). Figure 10 (**a**) best TM-score, and (**b**) their RMSD, (**c**) TM-score average of the five best predictions, (**d**) RMSD of the five best predictions, and (**e**) GDT-TS average of the five best predictions.

**Figure 11.** Comparison of GRSA2-SSP, PEP-FOLD3, I-TASSER, QUARK, and Rosetta by TM-Score (31 to 49 amino acids). Figure 11 (**a**) best TM-score, and (**b**) their RMSD, (**c**) TM-score average of the five best predictions, (**d**) RMSD average of the five best predictions, and (**e**) GDT-TS average of the five best predictions.

The 45 instances evaluated in the experimentation above show that applying the secondary structure results and refining them with the GRSAX algorithms enhances the performance in energy, RMSD, and TM-score. Specifically, when GRSA2-SSP is compared with PEP-FOLD3, I-TASSER, QUARK, and Rosetta, we observe that our algorithm performs well in small instances (Groups 1 and 2). In the largest instances, our algorithm is not the best, but it is competitive.

We carried out a second experimentation with six mini-proteins (5wll, 5lo2, 5up5, 5uoi, 2ki0, and 2kik) presented in Table 3. The mini-proteins come from the de novo protein design field [74–78]. This data set was proposed to observe the behavior of our best algorithm in these kinds of instances.



**Note:** alpha-helix (A) and none (N) for secondary structure.

We applied the same evaluation to all the algorithms as in the first experimentation, using the RMSD, TM-score, and GDT-TS metrics. Table 4 shows the results of all the algorithms on this data set. Evaluating them with TM-score and GDT-TS, we observe that the best algorithms were Rosetta, I-TASSER, and GRSA2-SSP, which achieved the best result 3, 2, and 1 times, respectively. Additionally, evaluating with the RMSD, the best algorithms were again Rosetta, I-TASSER, and GRSA2-SSP, but this time each obtained the best result in two instances, namely (5uoi, 2kik), (2ki0, 5up5), and (5wll, 5lo2), respectively. As a result, we can say that Rosetta is the best algorithm, followed by I-TASSER and GRSA2-SSP.



**Note:** The asterisk (\*) represents the best result in each column.

#### **5. Conclusions**

In this paper, we present the GRSA-SSP methodology for the Protein Folding Problem applied to peptides. The objective of this problem is to predict the functional three-dimensional protein structure. The algorithms developed with this methodology are GRSA0-SSP, GRSA1-SSP, GRSAE-SSP, and GRSA2-SSP. The main relevance of the GRSA2-SSP algorithm, developed with this methodology, is that it produces very good results in the case of peptides; specifically, it is similar to or better than the algorithms Rosetta, PEP-FOLD3, QUARK, and I-TASSER for small and medium peptides, according to the experimentation presented. These latter algorithms have traditionally been among the best of the CASP competition; moreover, they use modern machine learning techniques such as artificial neural networks.

We compared the algorithms developed with the original algorithms GRSA0, GRSA1, GRSAE, and GRSA2; we used a data set of 45 instances for this comparison. We showed that the hybrid algorithms produced with the GRSA-SSP methodology outperform the original ones. For this comparison, we used the metrics Energy, RMSD, TM-score, and execution time. We observed that the best of all these algorithms is GRSA2-SSP formulated with the proposed methodology.

We made a second evaluation comparing the GRSA2-SSP algorithm with the best state-of-the-art algorithms: PEP-FOLD3, I-TASSER, QUARK, and Rosetta. We used the same data set of 45 instances, divided into three groups from small to large peptides. The experimentation shows that for groups 1 and 2, GRSA2-SSP performs as well as these algorithms. We observe that for the first group PEP-FOLD3 was the best, followed by GRSA2-SSP, while in the second group the best algorithm was I-TASSER, followed by GRSA2-SSP and PEP-FOLD3. Finally, in the third group, the best algorithm was Rosetta, followed by I-TASSER. Additionally, we present an analysis of GRSA2-SSP results for each type of secondary structure, obtaining better behavior with alpha structures.

Furthermore, we assessed GRSA2-SSP with a second data set of six instances named mini-proteins. The GRSA2-SSP results were compared with PEP-FOLD3, I-TASSER, QUARK, and Rosetta. The best algorithms on this data set were Rosetta, I-TASSER, and GRSA2-SSP, which obtained the best TM-score and GDT-TS 3, 2, and 1 times, respectively. However, each of the three achieved first place twice when RMSD was evaluated. As a result, the best of these algorithms for this data set is Rosetta, followed by I-TASSER and GRSA2-SSP.

We conclude that the GRSAX-SSP algorithms enhance the original GRSA algorithms. The best of them is GRSA2-SSP, which achieves very good results, surpassing the best state-of-the-art algorithms for peptides of up to thirty amino acids. Finally, we note that the main advantage of our methodology is that it is simpler than the most powerful approaches in the literature.

**Author Contributions:** J.F.-S. and J.P.S.-H. contributed equally to the development of this paper. Conceptualization, J.P.S.-H., D.A.S.-M. and J.F.-S.; methodology J.F.-S., D.A.S.-M., J.P.S.-H., and J.J.G.-B.; Software J.P.S.-H., D.A.S.-M. and F.G.M.-N.; validation, J.P.S.-H. and J.F.-S.; formal analysis, D.A.S.-M., F.G.M.-N., J.J.G.-B., and G.C.-V.; writing—original draft J.F.-S., J.P.S.-H., and D.A.S.-M.; writing—review and editing, J.F.-S., D.A.S.-M. and J.P.S.-H. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Acknowledgments:** The authors would like to acknowledge with appreciation and gratitude CONACYT and TecNM/Instituto Tecnológico de Ciudad Madero. Also, we acknowledge Laboratorio Nacional de Tecnologías de la Información (LaNTI) for the access to the cluster.

**Conflicts of Interest:** The authors declare that they have no competing interests.

#### **References**


## *Article* **Optimization of Power Generation Grids: A Case of Study in Eastern Mexico**

**Esmeralda López 1, René F. Domínguez-Cruz 1,\* and Iván Salgado-Tránsito 2,\***


**Abstract:** Optimization of energy resources is a priority issue for our society. An imbalance between demand and power generation can lead to inefficient use of installed capacity, waste of fuels, worse effects on the environment, and higher costs. This paper presents the preliminary results of a study of seventeen interconnected power generation plants situated in eastern Mexico. The aim of the research is to apply a linear programming model to find the system-optimal solution by minimizing operating costs for this grid of power plants. The calculations were made taking into account the actual parameters of each plant; the demand and production of energy were analyzed in four time periods of 6 h during a day. The results show the cost-optimal configuration of the current power infrastructure obtained from a simple implementation of the model in MATLAB® software. The contribution of this paper is to adapt a linear programming model to an electrical distribution network formed with different types of power generation technology. The study shows that fossil fuel plants, besides emitting greenhouse gases that affect human health and the environment, incur maintenance expenses even when not in operation. The results are a helpful instrument for decision-making regarding the rational use of available installed capacity.

**Keywords:** optimization; linear programming; energy central

### **1. Introduction**

Due to the increase in energy demand, the requirement to reduce its costs, and the need for a transition from a centralized to a distributed power generation system, global integration of energy supply must be planned and managed. Proper management guarantees a more efficient and sustainable delivery. Thus, within the electricity generation sector, different variables and parameters must be considered to enhance its performance. Some of these considerations are the energy demand, the installed capacity, a plant's ability to ramp up or shut down quickly, and generation costs, among other things [1,2]. Studies based on stochastic techniques have been implemented to forecast generation or demand for short-, medium-, and long-term analysis [3]. These techniques consider time series that allow historical data to be examined to establish the statistical behavior of these variables and predict the values that may occur in the future [4–6]. These variables delineate the cost-optimal configuration of the power generation grid.

The optimization technique is a mathematical tool that finds the best solution for a modeled system. The solutions are formulated considering system restrictions [7,8], which supports efficient decision-making. Using these optimization models in the energy industry brings benefits such as minimizing costs, increasing profits, preventing harmful environmental effects, and defining optimal power flow. Thus, this type of tool allows energy generation processes to be more reliable, productive, and cost effective.

**Citation:** López, E.; Domínguez-Cruz, R.F.; Salgado-Tránsito, I. Optimization of Power Generation Grids: A Case of Study in Eastern Mexico. *Math. Comput. Appl.* **2021**, *26*, 46. https://doi.org/10.3390/mca26020046

Academic Editors: Marcela Quiroz, Juan Gabriel Ruiz, Luis Gerardo de la Fraga and Oliver Schütze

Received: 5 May 2021 Accepted: 5 June 2021 Published: 8 June 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Otherwise, neglecting prediction models could impact energy production costs, profit reduction, electrical power losses, and the overuse of non-renewable resources [9].

By means of a mathematical model considering all the system variables and parameters, it is possible to obtain the conditions that yield an efficient energy system. Each plant's conditions and the optimal distribution of its resources allow the reduction of expenses and losses generated in the power generation process [10,11]. Optimization methods and algorithms used for this purpose include linear programming (LP) [12], quadratic programming (QP) [13], multi-criteria optimization [14], genetic algorithms (GA) [15], particle swarm optimization (PSO) [16], simulated annealing (SA) [17], ant colony optimization (ACO) [18], tabu search (TS) [19], artificial bee colony (ABC) [20], and optimal control techniques [21]. These mathematical methods apply to any production system, regardless of its nature or application.

Recently, genetic algorithms have been proposed to optimize power plants [22], where the objective is to minimize power losses in the transmission process. Additionally, the particle swarm algorithm [23] has been used to minimize generation costs; its advantage is the reduced use of computational resources. However, the drawback of these metaheuristic algorithms is that they are approximation algorithms that search for feasible solutions. Such solutions are close to the optimum but are not guaranteed to be the most efficient, as the algorithms may yield only local rather than global optima [24].

The economic dispatch technique for optimizing electric power plants has been suggested as an attractive method [25,26]. This linear programming model finds the optimal solution for the generation system according to the parameters concerning minimization or maximization: For example, the minimization of operating costs in the generation of electrical energy [27,28]; the minimization of greenhouse gas emissions [29] from the different fossil fuel plants; and the economic dispatch (ED) problem in fossil fuel power systems including discontinuous prohibited zones, ramp rate limits, and cost functions [30]. Some other studies have addressed solving the economic dispatch problem concerning minimization of losses and costs in a microgrid incorporating renewable energy sources, but not on a large scale [30–34].

This paper presents an optimization study of an electric power generation plant network through the economic dispatch model, which is a linear programming scheme. The proposed model is applied to one of the most significant energy production regions in Mexico, called the eastern zone. This region has different types of power generation technologies. Within the analysis presented, actual parameters such as the maximum and minimum powers of each plant, the ramp up and down according to the type of technology, variable costs, fixed costs, and shut-down costs are considered. Fluctuations in energy production by renewable energy plants are estimated based on a probability function according to the historical measured data of each renewable resource in the zone. The study allows a reduction of generation costs during four time periods, without risking the secure supply of energy. The applied model shows a day with 100 percent renewable energy output: 94.90% from hydroelectric plants, 4.32% from wind plants, and 0.78% from geothermal plants. These three renewable resources prove to be profitable options due to their low generation costs and large environmental benefit. Furthermore, the study indicates that plants based on fossil fuels do not significantly contribute to satisfying the demand during the monitored period. This behavior arises because the variable costs are directly related to the cost of fuels, which increases the operating cost of fossil fuel plants.

#### *Power Energy Generation in Mexico*

The supply of electrical energy in Mexico is provided through various interconnected transmission networks. Public and private electric utilities compose the national electrification system, and the Federal Electricity Commission, a state-owned electric company, is the institution that supplies electricity to consumers [35]. According to the National Ministry of Energy [36], Mexico has an installed capacity of 75,685.00 MW, of which fossil fuels account for 79.88%, renewable energy for 17.08%, and other sources such as nuclear energy for 3.04%. The daily peak demand is 48,750 MW, with an increase of around 15% annually due to population growth, economic development, and industrialization.

The energy distribution system in Mexico consists of nine zones, as shown in Figure 1. Each zone has its characteristics of supplying energy according to the requested demand [37].

**Figure 1.** Mexican Electric System denoted by zones. The zone of interest is shaded (zone 8) [34].

The eastern part of Mexico has 110 generation plants, whose primary sources are hydroelectric and wind energy, as shown in Figure 2. This feature is due to the zone's geographical location and high wind potential.

**Figure 2.** Classification of generation plants in the Eastern Zone of Mexico according to technology used [37].

As shown in Figure 2, the range of generation technologies permits a higher installed capacity in the zone according to the regional demand. This feature allows the zone to contribute 22% of the energy generation required nationally in Mexico [37] and to supply other areas such as the Central and Peninsular zones. In the Central Zone, the population density is around 899 inhabitants per km². Big corporations established in this zone contribute 27% of the country's gross domestic product (GDP) [38]. Therefore, the energy demand in the Central Zone is much higher than its supply capacity. On the other hand, there is a high energy demand in the Peninsular Zone because it has a substantial tourism infrastructure [38]. Therefore, it is necessary to promote energy end-use efficiency and optimize energy resources in these zones. The following section describes the model implemented in our case study.

#### **2. Materials and Methods**

As we mentioned, economic dispatch is a mathematical model that aims to manage system resources. For our purposes, this model permits efficiently handling all power plant supplies in an interconnected network. The objective is to obtain the optimal combination in each generator's contribution to satisfy the energy demand and minimize its generation costs. The modeling considerations incorporate real characteristic parameters of each of the plants to obtain useful results for decision-making. In the following, the proposed mathematical model is described.

#### *2.1. Modeling for a Certain Time*

An electrical generation system comprises several plants, each with particular characteristics. The plants are indexed by *j* = 1, 2, . . . , *J*, where *J* is the total number of generation plants in the system, and each plant *j* works within certain limits. No plant can operate below its minimum operating power, which is described as:

$$\mathit{Pmin}\_{j} \cdot v\_{j} \le P\_{j} \tag{1}$$

where *Pminj* is the minimum power of plant *j*, *vj* is a binary operating variable, and *Pj* is the optimal power to be generated by the plant. If *vj* = 1, the plant is working, and multiplying *vj* by the minimum power ensures that the plant does not operate below that value. To exemplify these conditions, suppose we have a system of three plants in which plant 1 has a minimum power of 45 MW, plant 2 of 35 MW, and plant 3 of 40 MW. Substituting these parameters into Equation (1) gives:

$$\begin{array}{rcl} 45v\_1 & \le & P\_1 \\ 35v\_2 & \le & P\_2 \\ 40v\_3 & \le & P\_3 \end{array}$$

On the other hand, no plant can operate above its maximum operational power *Pmaxj*:

$$P\_{j} \le \mathit{Pmax}\_{j} \cdot v\_{j} \tag{2}$$

Similarly, if plant *j* is working, the power it generates must not exceed this maximum; if it is off (*vj* = 0), Equations (1) and (2) together force *Pj* = 0. The power generated by all the plants must satisfy the demand *D* requested by the electrical distribution grid; therefore:

$$D = \sum\_{j=1}^{J} P\_j \tag{3}$$

On the other hand, demand fulfillment generates individual costs which determine the total cost of generation, called *R*. For this reason, resources must be correctly assigned to minimize them. Thus, the whole cost function *R* is given as:

$$R = \sum\_{j=1}^{J} \left( A\_j \cdot v\_j + B\_j \cdot p\_j + M\_j \cdot z\_j \right) \tag{4}$$

The first term *Aj* indicates the fixed cost of plant *j* and *vj* is the binary variable described above (*vj* = 1 is working and *vj* = 0 is off). The term *Bj pj* corresponds to the contribution of the cost assumed to be proportional to the production of the plant, where *Bj* is the variable cost and *pj* the production for the plant *j*. Besides, a plant also generates costs just for being stopped. This contribution is represented by the third term *Mj zj*, where *Mj* is the cost of having each plant stopped and *zj* is also a binary stop variable that takes the value 1 if plant *j* stops and 0 indicates the opposite case.
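The single-period model of Equations (1)–(4) can be illustrated with a minimal Python sketch (used here only for illustration; it is not the MATLAB implementation of the study). Only the minimum powers of 45, 35, and 40 MW come from the example above; the maximum powers, all cost values, and the demand are hypothetical, and the sketch assumes that a plant that is not operating is stopped (*zj* = 1 − *vj*). It enumerates every on/off pattern and, for each one, fills the demand in order of increasing variable cost, which is optimal for a linear cost with simple bounds.

```python
from itertools import product

# Hypothetical plant data: only the minimum powers (45, 35, 40 MW) come from
# the example in the text; every other number is an illustrative assumption.
PLANTS = [
    # (Pmin, Pmax, A fixed cost, B variable cost per MW, M stopped cost)
    (45.0, 120.0, 100.0, 20.0, 30.0),
    (35.0, 100.0, 80.0, 25.0, 20.0),
    (40.0, 90.0, 60.0, 30.0, 10.0),
]


def dispatch(v, demand):
    """Cheapest powers P_j for a fixed on/off pattern v, or None if infeasible."""
    p = [pmin * on for (pmin, _, _, _, _), on in zip(PLANTS, v)]    # Eq. (1)
    cap = [pmax * on for (_, pmax, _, _, _), on in zip(PLANTS, v)]  # Eq. (2)
    remaining = demand - sum(p)
    if remaining < 0 or demand > sum(cap):
        return None  # Eq. (3) cannot be met with this pattern
    # Fill the remaining demand in order of increasing variable cost B_j.
    for j in sorted(range(len(PLANTS)), key=lambda j: PLANTS[j][3]):
        extra = min(cap[j] - p[j], remaining)
        p[j] += extra
        remaining -= extra
    return p


def total_cost(v, p):
    """Cost R of Equation (4), assuming z_j = 1 - v_j."""
    return sum(a * on + b * pj + m * (1 - on)
               for (_, _, a, b, m), on, pj in zip(PLANTS, v, p))


def solve(demand):
    """Enumerate all on/off patterns and keep the cheapest feasible one."""
    best = None
    for v in product((0, 1), repeat=len(PLANTS)):
        p = dispatch(v, demand)
        if p is None:
            continue
        cost = total_cost(v, p)
        if best is None or cost < best[0]:
            best = (cost, v, p)
    return best
```

With these assumed numbers, `solve(150.0)` keeps plants 1 and 2 on and plant 3 off. Enumeration is only practical for a handful of plants; a mixed-integer solver is the appropriate tool at the scale of the case study.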

 

This model describes the conditions to satisfy energy demand in a given time, limiting the power plants' administration because it does not allow long-term planning. The following section describes the mathematical considerations in the modeling for time intervals to have a more significant representation in the resources assigned.

#### *2.2. Model for Various Periods*

The problem of scheduling power plants by periods consists of determining, over the planning horizon, the start-up and shut-down of each power plant and the allocation of energy to be generated. These three decisions must satisfy the demand in each time cycle, reduce costs, and comply with specific technical and operational safety restrictions of each plant *j*. The planning horizon of one day is divided into time cycles denoted by *k* = 1, 2, . . . , *K*, where *K* is the total number of cycles defined for the study. No plant *j* can operate below its minimum energy generation, which for multiple periods is established as:

$$\mathit{Emin}\_{j} \cdot v\_{jk} \le E\_{jk} \tag{5}$$

where *Eminj* is the minimum energy that plant *j* must generate in period *k*; *Ejk* is the energy that plant *j* will generate in period *k*; and *vjk* is the binary operating variable described above, now indexed by period. Suppose, for example, that we have a system of three plants and three established periods; the minimum energy to be generated by plant 2 in period 3 is then established as:

$$Emin\_2 \cdot v\_{2,3} \le E\_{2,3}$$

Similarly, the power plants cannot produce more than the established maximum energy *Emaxj*; then:

$$E\_{jk} \le \mathit{Emax}\_{j} \cdot v\_{jk} \tag{6}$$

The energy to be produced in each plant in one period cannot increase abruptly in the immediately following period above a maximum quantity. This energy is known as the maximum load rise ramp *Uj*, expressed as:

$$E\_{jk+1} - E\_{jk} \le U\_{j} \tag{7}$$

The difference between the energy produced in the immediately following period and the current period's energy must be less than or equal to the maximum rising ramp *Uj* of plant *j*. Similarly, no power plant can reduce its energy production between consecutive periods by more than a limit called the maximum load descent ramp *Fj*. So:

$$E\_{jk} - E\_{jk+1} \le \lvert F\_j \rvert \tag{8}$$
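As a minimal sketch, the per-plant limits of Equations (5)–(8) can be checked for a candidate schedule as follows. The function name and all numerical limits are hypothetical and serve only to show how each inequality is evaluated period by period.

```python
def check_plant_schedule(energy, on, e_min, e_max, ramp_up, ramp_down):
    """Return the list of violated constraints for one plant over K periods.

    energy[k] is E_jk and on[k] is the binary v_jk (periods are reported
    1-based); e_min, e_max, ramp_up and ramp_down are hypothetical limits.
    """
    violations = []
    for k, (e, v) in enumerate(zip(energy, on), start=1):
        if e < e_min * v:                  # Equation (5)
            violations.append(("min", k))
        if e > e_max * v:                  # Equation (6)
            violations.append(("max", k))
    for k in range(1, len(energy)):
        step = energy[k] - energy[k - 1]   # E_{j,k+1} - E_{jk}
        if step > ramp_up:                 # Equation (7)
            violations.append(("ramp_up", k))
        if -step > abs(ramp_down):         # Equation (8)
            violations.append(("ramp_down", k))
    return violations
```

For example, `check_plant_schedule([100, 160, 150, 60], [1, 1, 1, 1], 50, 200, 50, 80)` reports a rising-ramp violation between periods 1 and 2 and a descent-ramp violation between periods 3 and 4.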

Additionally, it is convenient to define two conditions that set the start-up and shut-down of each plant, in order to have greater control of the costs that may be generated. For the first case, consider a plant that is in operation in period *k* and was also in operation in the previous period *k* − 1. In this case, it cannot start up in period *k*; conversely, a plant that was off in period *k* − 1 and operates in period *k* must register a start-up. This is expressed as:

$$v\_{jk} - v\_{jk-1} \le y\_{jk} \tag{9}$$

where *yjk* is a binary start-up variable: *yjk* = 1 indicates that plant *j* starts up in period *k*, and *yjk* = 0 indicates the opposite case. In the same way, a plant that is in operation cannot be stopped at the same time, and vice versa; therefore:

$$
v\_{jk} + z\_{jk} = 1\tag{10}$$

where *zjk* is the binary stop variable, with *zjk* = 1 if plant *j* is stopped in period *k* and *zjk* = 0 otherwise. Thus, it is possible to establish an equation that determines the state and allows these conditions to be fulfilled, given by:

$$
v\_{jk-1} - v\_{jk} + y\_{jk} - z\_{jk} \le 0 \tag{11}
$$

To verify that the general conditions hold for any change of state, consider the following particular example. Suppose that plant 1 is stopped in period 1 but is in operation in the following period, which means that it starts up in period 2. Therefore, it cannot be stopped in the same period 2. The equation for this situation is expressed as:

$$v\_{1,1} - v\_{1,2} + y\_{1,2} - z\_{1,2} \le \ 0 \tag{12}$$

To verify that the model is also consistent when no change of state occurs, consider the case in which the power plant was off in period 1 and remained off in period 2. Equation (12) then gives:

$$\begin{array}{c} 0 - 0 + 0 - 1 \leq \ 0 \\ -1 \leq \ 0 \end{array}$$
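The same check can be repeated for every combination of binary values. The sketch below, in Python for illustration only, enumerates the 16 possible values of the previous state, the current state, the start-up flag, and the stop flag, and keeps those that satisfy Equations (9) and (10) together with the state inequality written with the index order of Equation (12). The function name is hypothetical.

```python
from itertools import product


def feasible_states():
    """List every binary (v_prev, v_now, y_now, z_now) allowed by the state logic."""
    states = []
    for v_prev, v_now, y_now, z_now in product((0, 1), repeat=4):
        eq9 = v_now - v_prev <= y_now                 # Equation (9)
        eq10 = v_now + z_now == 1                     # Equation (10)
        eq12 = v_prev - v_now + y_now - z_now <= 0    # index order of Equation (12)
        if eq9 and eq10 and eq12:
            states.append((v_prev, v_now, y_now, z_now))
    return states
```

Five combinations remain: staying off (with the start-up flag at either value), starting up, shutting down, and staying on. The only redundant combination is a start-up flag raised while the plant stays off; it is feasible but would add a start-up cost, so a cost-minimizing solution would not select it whenever that cost is positive.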

Thus, employing the example proposed in Equation (12), it is verified that all the variables describe the logic of possible states in the system. On the other hand, the proposed model must supply the demand in each period. In consequence:

$$D\_k = \sum\_{j=1}^{J} E\_{jk} \tag{13}$$

where *Dk* is the total demand to cover in period *k*. Equations (5)–(12) are the restrictions inherent to each power plant in the system, and the aim is to reduce generation costs while satisfying the demand established in Equation (13).

The cost *R* to be minimized, now considered over all time intervals, must include the complete scheduling of the electric power production plants. Therefore, it must be expressed in terms of all possible contributions, *Aj* · *vjk* + *Bj* · *Ejk* + *Cj* · *yjk* + *Mj* · *zjk*:

$$R = \sum\_{k=1}^{K} \sum\_{j=1}^{J} \left( A\_j \cdot v\_{jk} + B\_j \cdot E\_{jk} + C\_j \cdot y\_{jk} + M\_j \cdot z\_{jk} \right) \tag{14}$$

which is the sum of all the costs of the plants in each of the periods. The first term of Equation (14) incorporates the fixed cost *Aj* of each generation plant. The second term is the variable cost *Bj*, which is proportional to the plant's production and directly related to the cost of fossil fuels. The third term is the start-up cost *Cj* of a plant, which is assumed to be constant throughout the periods. Finally, the fourth term of Equation (14) incorporates the cost *Mj*, generated when a plant is off. As can be seen, each of the costs described is weighted by its corresponding variable, *vjk*, *Ejk*, *yjk*, and *zjk*, respectively, of which *vjk*, *yjk*, and *zjk* are the binary operation, start-up, and shut-down variables.
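A minimal sketch of how Equation (14) is evaluated for a given schedule is shown below. The function name, the two plants, and all cost values are hypothetical; the study's actual parameters are those of Table 2.

```python
def schedule_cost(plants, schedule):
    """Evaluate Equation (14) for a given schedule.

    plants[j]   = (A_j, B_j, C_j, M_j), hypothetical cost parameters
    schedule[j] = list over periods of (v_jk, E_jk, y_jk, z_jk)
    """
    cost = 0.0
    for (a, b, c, m), periods in zip(plants, schedule):
        for v, e, y, z in periods:
            cost += a * v + b * e + c * y + m * z
    return cost


# Two hypothetical plants over two periods: plant 1 runs in both periods,
# plant 2 is stopped in period 1 and starts up in period 2.
example_plants = [(100, 20, 50, 30), (80, 25, 40, 20)]
example_schedule = [
    [(1, 60, 0, 0), (1, 70, 0, 0)],
    [(0, 0, 0, 1), (1, 40, 1, 0)],
]
example_cost = schedule_cost(example_plants, example_schedule)
```

In this example the stopped plant still contributes its cost *Mj* in period 1, and the start-up cost *Cj* is charged once, in the period in which the plant starts.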

The conditions established to satisfy the different energy demands in the time intervals allow long-term planning, maintaining the optimal distribution of resources and minimizing the total cost of generation from the model as shown in Table 1.


**Table 1.** Model equations.

#### **3. Implementation and Discussion of Results**

The control area selected to carry out the study consists of 110 power plants that provide 16,992 MW of installed capacity with different technologies. The demand *D* in the area has a value from 6750 MW to 8500 MW on average per hour, according to the National Center for Energy Control (known by its Spanish acronym, CENACE) in Mexico.

From the 110 power plants, we selected 17 representative power plants, which correspond to 57% of the area's installed capacity. This selection maintains the proportionality of the installed capacity of the area by type of technology. These plants have characteristic parameters such as maximum energy (*Emaxj*), minimum energy (*Eminj*), variable costs (*Bj*), fixed costs (*Aj*), start-up costs (*Cj*), and shut-down costs (*Mj*), as shown in Table 2.


**Table 2.** Parameters of the Electric Power Plants [39].

For wind power plants, the maximum and minimum energy to be generated are obtained from the statistics of the historical wind speed data at the place where they are located, as reported by the Mexican Ministry of Energy [36]. The wind statistics are obtained through the Weibull probability density model. Hydroelectric plants are treated in the same way, but with a probability function of the flow and level they present.
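For reference, the two-parameter Weibull model can be sketched as follows. The text does not specify the fitted parameters or how the energy bounds are derived from the distribution, so the shape and scale values and the use of the 10% and 90% quantiles as a plausible wind-speed range are assumptions made only for illustration.

```python
import math


def weibull_pdf(v, shape, scale):
    """Weibull probability density of wind speed v (assumes shape >= 1)."""
    if v < 0:
        return 0.0
    return (shape / scale) * (v / scale) ** (shape - 1) * math.exp(-((v / scale) ** shape))


def weibull_cdf(v, shape, scale):
    """Probability that the wind speed does not exceed v."""
    return 1.0 - math.exp(-((v / scale) ** shape)) if v > 0 else 0.0


def weibull_quantile(q, shape, scale):
    """Wind speed below which a fraction q of the observations falls."""
    return scale * (-math.log(1.0 - q)) ** (1.0 / shape)


# Hypothetical parameters: shape 2.0 and scale 8.0 m/s.
low_speed = weibull_quantile(0.10, 2.0, 8.0)
high_speed = weibull_quantile(0.90, 2.0, 8.0)
```

Converting such a wind-speed range into minimum and maximum energy would additionally require the turbine power curve, which is outside the scope of this sketch.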

For the model's implementation, it is necessary to indicate the requested demand in each period for the area. We take 52% of the total demand to represent the study plants, as shown in Figure 3a, and assume an additional 5% to compensate for losses that could occur during transmission, which gives a total of 57%. Four periods of 6 h each were established, as reflected in Figure 3b. It is worth mentioning that these data are real and were provided by CENACE based on monitoring carried out every hour over a three-week interval.

The model established by Equations (5)–(14) and applied to the geographical area described above was implemented using the MATLAB® programming tool, by means of the intlinprog function, which allows solving mixed-integer linear programming problems, and which has the structure as shown in Figure 4.

**Figure 3.** The behavior of demand in the Eastern Zone. (**a**) The corresponding demand in the Eastern Zone per hour. (**b**) Consumption across periods of 6 h.

$$\min\_{x} \; f^T x \text{ subject to} \begin{cases} x(\text{intcon}) \text{ are integers} \\ A \cdot x \le b \\ Aeq \cdot x = beq \\ lb \le x \le ub \end{cases}$$

**Figure 4.** Syntax of the intlinprog function from MATLAB® [40].

where *f* is the vector of objective function coefficients, *x* is the vector of decision variables of the problem, *A* is a matrix with the coefficients of the left side of the inequalities, and *b* is the vector of the right side of the inequalities. *Aeq* is a matrix with the coefficients of the left side of the model equalities, *beq* is the right side of the equalities, and *lb* and *ub* are vectors with the minimum and maximum values of the variables, respectively. To generate these matrices and vectors, the proper values of each plant are applied to the model equations established in Table 1, obtaining the following dimensions: *f* (272 × 1), *intcon* (240 × 1), *A* (190 × 272), *b* (190 × 1), *Aeq* (72 × 272), *beq* (72 × 1), *lb* (272 × 1), and *ub* (272 × 1).
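One reading that reproduces two of the stated dimensions is sketched below in Python (for illustration only; the study itself uses MATLAB). It assumes that the vector *x* stacks the four variable blocks *v*, *E*, *y*, and *z* for the 17 plants and 4 periods, and that the equality rows consist of one demand balance per period, Equation (13), plus one state equation per plant and period, Equation (10). The stacking order and the helper names are assumptions; the sketch does not attempt to reproduce the 190 inequality rows or the 240 integer indices, whose exact composition is not detailed in the text.

```python
J, K = 17, 4                     # plants and periods used in the study
BLOCKS = ("v", "E", "y", "z")    # assumed stacking order of the variable blocks


def index(block, j, k):
    """Zero-based position of variable block[j, k] in the stacked vector x."""
    return BLOCKS.index(block) * J * K + j * K + k


n_vars = len(BLOCKS) * J * K     # length of f, lb and ub
n_demand = K                     # one Equation (13) row per period
n_state = J * K                  # one Equation (10) row per plant and period
n_eq = n_demand + n_state        # rows of Aeq and beq


def demand_row(k):
    """Row of Aeq for Equation (13) in period k: coefficient 1 on every E[j, k]."""
    row = [0] * n_vars
    for j in range(J):
        row[index("E", j, k)] = 1
    return row
```

Under these assumptions, `n_vars` is 272 and `n_eq` is 72, which matches the sizes reported for *f* and *Aeq*.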

Once the matrices of the system were defined, the results presented in Tables 3–6 and Figures 5–8 were obtained. They indicate the value of each variable in each of the defined periods.

In the first period, from 00:00 to 06:00 h, an energy demand of 21,832.52 MWh was managed. This demand is the lowest of the four periods considered because these are the first hours of the day and, consequently, involve less human activity. The power plants contributing to this demand use technologies such as internal combustion, wind, geothermal, and hydroelectric, and their optimal contributions are shown in Table 3 and Figure 5a.


**Table 3.** Results of period 1.

**Table 4.** Results of period 2.

**Table 5.** Results of period 3.

**Table 6.** Results of period 4.


**Figure 5.** Generation and Energy in periods 1 and 2. (**a**) Period 1. (**b**) Period 2.

**Figure 6.** Generation and Energy in periods 3 and 4. (**a**) Period 3. (**b**) Period 4.

**Figure 7.** Contribution of energy generation by each plant of the study in the periods.

**Figure 8.** Electricity generation and Energy Costs for period 1.

Table 4 shows the model variables' results in period 2, where the demand to be satisfied is 24,467.92 MWh, as shown in Figure 5b.

Additionally, in Period 3 (see Table 5 and Figure 6a), the demand to be satisfied is the highest of the four periods, corresponding to 26,661.1 MWh. Here, the power plants that contribute to cover most of the demand are wind and hydroelectric. This aspect can be an opportunity to incorporate clean technologies for the generation of energy that satisfies the requested demand.

Finally, in period 4, the demand to satisfy is 25,434.44 MWh; the results produced by the model are described in Table 6 and Figure 6b. It is in this period that the most significant contribution from renewable energies is observed. In this way, we can observe that the model complies with what is proposed because it satisfies the demand for the established periods.
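The four period demands reported above can be summarized with a few lines of arithmetic. The sketch below uses only the demand values and the 6 h period length stated in the text; the variable names are illustrative.

```python
PERIOD_HOURS = 6
# Demand per period in MWh, as reported in the text.
demand_mwh = {1: 21832.52, 2: 24467.92, 3: 26661.1, 4: 25434.44}

daily_total = sum(demand_mwh.values())                                # MWh per day
mean_load_mw = {k: d / PERIOD_HOURS for k, d in demand_mwh.items()}   # average MW
share = {k: d / daily_total for k, d in demand_mwh.items()}           # fraction of the day
peak = max(demand_mwh, key=demand_mwh.get)
lowest = min(demand_mwh, key=demand_mwh.get)
```

The daily total is about 98,396 MWh, with period 3 as the peak and period 1 as the lowest, in agreement with the description of the periods.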

Figure 7 shows the plants that comprise the study and the contribution of each one to satisfying the demand in each of the established periods.

The costs obtained in period 1 are illustrated in Figure 8; the same trend continues in the following periods. The highest costs come from fossil fuel technology plants. This is mainly due to the high variable costs of the various fossil fuels and the other costs attributed to these technologies. The lowest costs are from clean generation sources because their maintenance costs are lower, and they also provide benefits for the ecosystem.

#### **4. Conclusions**

In this work, the optimization of an economic dispatch model for a power supply network located in Mexico's eastern zone is presented. The established model incorporates real parameters and the restrictions intrinsic to each plant. The energy production of the renewable energy plants was estimated by means of probability functions according to the historical data of the location. The considerations incorporate the various types of generation costs and seek their minimization. The state logic is fulfilled at all times, as can be seen in Tables 3–6; this is due to Equations (10)–(12), which do not allow a power plant to be off and on at the same time, nor to start in a period when it was already on in the previous period. In addition, all costs can be better accounted for by relating them to binary variables, such as the shutdown, operation, and start-up of a plant.

The results show a majority participation of clean energy plants during the study period. The model reports the cost of each power plant in Period 1 and reflects the lower costs of the generation mix that satisfies the demand, in this case a combination of clean energy plants. In contrast, some non-operating fossil fuel plants generate even higher costs than renewable plants in operation.

The mathematical model could be an important tool for decision-making in plant planning, and a diagnostic aid for identifying plants with very high costs when new electricity generation sources are incorporated. Future work should address longer time horizons (one year) to obtain more significant results on the most suitable energy generation mix for the zone. Energy distribution will also be incorporated, given the importance of plant location, loads, and transmission line losses and capacities. This would give a broader view of the system as a decision-making tool.

**Author Contributions:** Conceptualization, I.S.-T. and R.F.D.-C.; methodology, E.L.; software, E.L. and R.F.D.-C.; validation, E.L., R.F.D.-C. and I.S.-T.; formal analysis, E.L., R.F.D.-C. and I.S.-T.; investigation, E.L.; resources, R.F.D.-C.; data curation, E.L.; writing—original draft preparation, E.L.; writing—review and editing, R.F.D.-C. and I.S.-T.; visualization, E.L. and I.S.-T.; supervision, R.F.D.-C. and I.S.-T. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Conflicts of Interest:** The authors declare no conflict of interest.


MDPI St. Alban-Anlage 66 4052 Basel Switzerland Tel. +41 61 683 77 34 Fax +41 61 302 89 18 www.mdpi.com

*Mathematical and Computational Applications* Editorial Office E-mail: mca@mdpi.com www.mdpi.com/journal/mca

www.mdpi.com ISBN 978-3-0365-1670-7