*Article* **Combining Heuristics with Simulation and Fuzzy Logic to Solve a Flexible-Size Location Routing Problem under Uncertainty**

**Rafael D. Tordecilla 1,2,\*, Pedro J. Copado-Méndez 1,3, Javier Panadero 1,3, Carlos L. Quintero-Araujo 4, Jairo R. Montoya-Torres 2 and Angel A. Juan 1,3**


**Abstract:** The location routing problem integrates both a facility location and a vehicle routing problem. Each of these problems is *NP-hard*, which justifies the use of heuristic-based algorithms when dealing with large-scale instances that need to be solved in reasonable computing times. This paper discusses a realistic variant of the problem that considers facilities of different sizes and two types of uncertainty conditions. In particular, we assume that some customers' demands are stochastic, while others follow a fuzzy pattern. An iterated local search metaheuristic is integrated with simulation and fuzzy logic to solve the aforementioned problem, and a series of computational experiments is run to illustrate the potential of the proposed algorithm.

**Keywords:** location routing problem; uncertainty; heuristics; simulation; fuzzy logic

#### **1. Introduction**

When designing and managing supply chains, one of the most relevant problems is the simultaneous location of distribution facilities and the routing of vehicles to deliver products to a set of geographically dispersed customers. The former is considered a strategic decision, while the latter is operational. This problem is known in the scientific literature as the location routing problem (LRP), which addresses these two types of decisions in an integrated manner. From the formal view of the operational research community, the LRP is known to be *NP-hard*, since it generalizes the facility location problem (FLP), the vehicle routing problem (VRP) and the multidepot VRP, which are all known to be *NP-hard*. This computational complexity means that optimal solutions are difficult to obtain in reasonable computing times. Thus, heuristic approaches are required to solve medium- and large-sized instances. Due to this complexity, some of the first studies tackled the problem by splitting it into the corresponding subproblems [1,2]. Nevertheless, this approach might lead to suboptimal solutions.

Due to the increase in computational power and the development of fast heuristic approaches, the LRP has been studied in an integrated way, which has clearly improved the obtained results [3]. One of the most studied versions of the LRP is the capacitated LRP, in which both depot and vehicle capacity constraints must be satisfied (the acronym LRP will henceforth refer to this version). However, all previous works consider the depot capacity as a fixed value for each location. This may not be a suitable approach when dealing with realistic problems, since decision-makers can usually select the size of a facility from a discrete set of known available sizes, or even freely. For real-world problems, this set is usually associated with investment activities, such as building facilities [4], purchasing

**Citation:** Tordecilla, R.D.; Copado-Méndez, P.J.; Panadero, J.; Quintero-Araujo, C.L.; Montoya-Torres, J.R.; Juan, A.A. Combining Heuristics with Simulation and Fuzzy Logic to Solve a Flexible-Size Location Routing Problem under Uncertainty. *Algorithms* **2021**, *14*, 45. https://doi.org/10.3390/a14020045

Academic Editor: Javier Del Ser Lorente Received: 15 December 2020 Accepted: 26 January 2021 Published: 30 January 2021


**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

equipment [5] or qualifying the workforce [6]. From an academic point of view, despite the increasing number of published works on the LRP, the consideration of flexible facility sizes has rarely been addressed in the literature. Nevertheless, real-life examples from both LRP [4,7,8] and non-LRP [5,6] contexts show the relevance of considering a variety of facility sizes to select from.

Traditional LRP approaches consider that parameters are deterministic or crisp, i.e., they assume that inputs are known in advance. This assumption is far from reality in many applications, such as waste collection, humanitarian logistics and urban freight distribution, where uncertainty is a key factor to consider. Despite this, the literature on the LRP addressing uncertain parameters is still scarce. Articles employing stochastic approaches can be found in the literature; customers' demand is one of the most frequently addressed stochastic parameters [9–13]. Other parameters might also be considered as stochastic, such as transportation costs and travel speeds [14] or logistic costs and travel distance [15]. In general, many articles addressing stochasticity in routing problems hybridize simulation models with heuristic or metaheuristic algorithms to efficiently tackle both uncertainty and *NP-hardness*. In many real-life situations, however, it might not be possible to accurately model all uncertainty sources as stochastic variables following a probability distribution. This might be the case, for instance, when the volume of observations is low or the available data is not of sufficient quality [16]. Hence, uncertainty in the LRP has also been tackled through the use of fuzzy sets. Parameters such as customers' demands [17–20], travel times [21,22] or time windows [23] have been modeled as fuzzy in several studies. Notice that, whenever possible, modeling uncertainty as stochastic variables might allow a deeper statistical analysis of the results.

To the best of our knowledge, there are no works in the literature simultaneously addressing stochastic and fuzzy approaches to model demand uncertainty in a flexible-size LRP. This is a realistic scenario, since many companies might have historical data on trustworthy customers and not enough data on new or unreliable ones. Hence, the main contributions of this paper are two-fold: on the one hand, a new variant of the location routing problem is studied, where facility sizing decisions and hybrid fuzzy-stochastic demands are simultaneously considered. On the other hand, this paper proposes a competitive solution approach based on the hybridization of a metaheuristic algorithm with both simulation and fuzzy logic, i.e., a so-called fuzzy simheuristic, to solve the aforementioned problem. Indeed, simheuristics have traditionally been proposed to deal with stochastic issues in hard combinatorial optimization problems [24]. However, their hybridization with fuzzy logic has rarely been studied.

The remainder of this paper is organized as follows: Section 2 describes previous works on the topic. Section 3 presents a description of our addressed problem. Section 4 explains the fuzzy simheuristic approach used to solve the problem. Section 5 describes a series of computational experiments. Section 6 analyzes our obtained results. Finally, Section 7 draws some conclusions and future research perspectives.

#### **2. Literature Review**

This section presents a summary of the published manuscripts on the main topics addressed by this work. Thus, Section 2.1 outlines works related to the location routing problem, in both its deterministic version and the variant including uncertain parameters. Additionally, Section 2.2 summarizes the main contributions in the field of simheuristics and fuzzy logic as methodologies to handle uncertain parameters in routing problems.

#### *2.1. The Location Routing Problem*

Perhaps the first work related to the location routing problem is the one by Maranzana [25], who analyzes the influence of transportation costs on location decisions. Later, Salhi and Rand [1] quantify for the first time the benefits of considering routing decisions when locating facilities. They also state that solving each subproblem (location and routing) independently does not provide optimal solutions. Multiple variants

of the LRP have been proposed over time. These variants depend on the characteristics of depots (capacitated or not), vehicles (capacitated or not, homogeneous or heterogeneous fleet), costs (symmetric or asymmetric) or the consideration of uncertain parameters. All capacitated variants addressed by these authors assume that depot sizes are fixed and cannot be changed.

Considering the limited computational power available at that time, the initial works on the LRP first solved the underlying location problem and used the obtained solution as a starting point to handle the corresponding routing problem. However, as computational power has notably increased in recent years, the newest approaches deal with the LRP in an integrated manner [26,27]. Among the recently published works on the deterministic LRP, Escobar et al. [28] propose a granular tabu search within a VNS framework to speed up computational times without decreasing solution quality. A two-phase biased-randomization-based metaheuristic is developed by Quintero-Araujo et al. [27] to solve the capacitated version of the problem. Ferdi and Layeb [29] propose a GRASP with a novel technique used to create clusters around the open depots. Traditional applications of the LRP include horizontal cooperation [30], electric vehicle routing problems [31–33], city logistics [34], humanitarian logistics [35] or supply chain network design [36]. Moreover, the most recent applications are related to environmental issues [37], cold supply chains [38] or waste management [39].

When dealing with uncertainty, most works have focused on the use of stochastic modeling. One of the utilized approaches has been the hybridization of simulation techniques with metaheuristics. For instance, Quintero-Araujo et al. [9] propose a simheuristic to solve an LRP with stochastic demands, hybridizing Monte Carlo simulation with an iterated local search metaheuristic. A similar approach is employed by Tordecilla et al. [13], who address an LRP where the sizes of the facilities to locate are also a variable to consider. Rabbani et al. [10] also propose a simheuristic approach, which combines a nondominated sorting genetic algorithm-II (NSGA-II) with Monte Carlo simulation. They tackle a multiobjective multiperiod LRP in the context of the hazardous waste management industry, where both the generated waste and the number of people at risk are stochastic; inventory decisions are also taken into account. Sun et al. [11] address a real-world case from an express delivery company in Shanghai. These authors tackle an LRP in which demand is stochastic and can be split for self-pickup. A simulation-based optimization model is then proposed, and the results of two heuristics are compared.

Other parameters have also been considered to be uncertain. For instance, Herazo-Padilla et al. [14] hybridize an ant colony optimization metaheuristic with discrete-event simulation to solve an LRP in which both transportation costs and vehicle travel speeds are considered stochastic. The authors demonstrate that their proposed approach is not only efficient but also able to find statistical interactions among the different parameters. Zhang et al. [15] present an approach that hybridizes a genetic algorithm with simulation to solve a sustainable multiobjective LRP in the context of emergency logistics. The authors consider the travel distance, the demand and the cost of opening a depot as uncertain variables. Additionally, the emergence of new technologies introduces new challenges. This is the case of Zhang et al. [12], who address the problem of locating battery swap stations and routing electric vehicles with stochastic demands. This problem is solved using a hybrid approach that combines a variable neighborhood search with a binary particle swarm optimization algorithm. The problem's complexity increases when considering the low autonomy of this type of vehicle, since route failures can occur frequently when demands are not accurately known.

Uncertainty in the LRP has been studied using either stochastic or fuzzy parameters. Table 1 shows an overview of works addressing this topic, which includes: (i) whether the uncertainty is addressed stochastically or in a fuzzy fashion; (ii) the considered uncertain parameter; (iii) the mathematical modeling approach; (iv) the approach used to solve the problem; and (v) the objective function. The analyzed works show a clear preference for considering an uncertain demand, as well as for using fuzzy chance constrained models. Given both the considered uncertainty and the combinatorial nature of the LRP, most works employ a hybrid approach combining simulation with a metaheuristic algorithm. Finally, cost minimization is the prevalent objective, although a few works also consider the minimization of risk or of the additional travel distance due to route failures. Regarding works on fuzzy parameters, Zhang et al. [17] propose a hybrid particle swarm optimization (PSO) algorithm to solve a capacitated LRP with fuzzy triangular demands (CLRP-FD). The hybrid PSO algorithm is composed of three phases, including a local search method and stochastic simulation. In addition, the authors propose a chance-constrained programming model for the CLRP-FD. Zarandi et al. [21] consider a multidepot LRP with fixed depot capacities and fuzzy travel times. Mehrjerdi and Nadizadeh [18] present a fuzzy chance constrained programming model where demands are modeled as fuzzy numbers. A four-phase method called "greedy clustering" is proposed, in which both an ant colony system metaheuristic and stochastic simulation are included. Fazayeli et al. [19] propose an LRP with time windows and fuzzy demands as the delivery part of a multimodal transport network. The mixed integer fuzzy mathematical model is coded and solved using GAMS, and compared with the results provided by a genetic algorithm.
Nadizadeh and Kafash [20] analyze an LRP with simultaneous pick-up and delivery in the context of reverse logistics. Both types of demands (pick-ups and deliveries) are fuzzy variables. A fuzzy chance constrained programming model is proposed to represent the problem, and a greedy clustering method is used to solve it.


**Table 1.** Recent works related to the location routing problem with uncertain parameters.



The analyzed works show that uncertainty in the LRP has been addressed by using either stochastic or fuzzy demands, but never considering both types of uncertainty at the same time, e.g., some customers' demands being modeled as stochastic variables while others are modeled as fuzzy values. In addition, to the best of our knowledge, there are no previous studies on the LRP with facility sizing decisions and hybrid fuzzy-stochastic demands. Only Tordecilla et al. [13] have studied a similar LRP variant, although considering all customers' demands as stochastic. Thus, our work aims to fill the existing gap by considering a flexible-size LRP and two different types of uncertain parameters: stochastic and fuzzy demands.

#### *2.2. Simheuristics and Fuzzy Logic for Vehicle Routing Problems under Uncertainty*

When dealing with combinatorial optimization problems subject to uncertain parameters, one of the most recommended approaches is the combination of simulation (to handle stochasticity) with heuristic-based methods (to deal with the optimization part of the problem) [43,44]. In that sense, a simheuristic approach is a relatively new and efficient technique to tackle combinatorial optimization problems under uncertainty [24,45]. In general, a simheuristic algorithm works as follows: (i) given a stochastic problem, the random variables are transformed into their deterministic counterparts by using expected values; (ii) an approximate framework (heuristic or metaheuristic) is used to generate high-quality solutions for the transformed deterministic instance that can also be "promising" solutions for the stochastic version of the problem; (iii) these promising solutions are sent to a simulation engine in order to estimate their quality in a stochastic environment, and the simulation engine, in addition, provides feedback to better guide the search performed by the approximate procedure; and (iv) an improved estimation of the quality of the solutions is obtained for a subset of "elite" solutions using a longer simulation process. Different simheuristic algorithms have been presented in the literature to solve routing problems.
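The four steps above can be expressed as a generic loop. The following is a minimal sketch, not the authors' implementation; the names `simheuristic`, `det_solver` and `simulate` are illustrative:

```python
import random
import statistics

def simheuristic(det_solver, simulate, n_short=30, n_long=1000,
                 n_elite=3, iters=50):
    """Generic simheuristic loop: (i)-(ii) a deterministic search engine
    proposes promising solutions, (iii) each is assessed with a short
    simulation, and (iv) an elite subset is re-assessed with a longer
    simulation to refine the cost estimate."""
    elite = []
    for _ in range(iters):
        sol = det_solver()  # solution of the deterministic counterpart
        est = statistics.mean(simulate(sol) for _ in range(n_short))
        elite.append((est, sol))
        elite = sorted(elite, key=lambda t: t[0])[:n_elite]  # keep the best
    # refine the estimates of the elite set with a longer simulation
    refined = [(statistics.mean(simulate(s) for _ in range(n_long)), s)
               for _, s in elite]
    return min(refined, key=lambda t: t[0])
```

For instance, with a toy solver that returns a random candidate and a simulator that adds noise to its cost, the loop returns the elite solution with the lowest refined cost estimate.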

Stochastic demands in vehicle routing problems are addressed by Quintero-Araujo et al. [46] and Gruler et al. [47]. Moreover, stochastic demands are also studied in arc routing problems [48]. Stochastic versions of the inventory routing problem can be found in Gruler et al. [49]. Real applications, like the waste collection problem with stochastic demands, are analyzed in Gruler et al. [50]. Intermodal routing problems have also considered other stochastic parameters, such as capacity [51] or travel times [52,53]. Additionally, the need for fuzzy logic in vehicle routing problems arises when some parameters are vague or uncertain. The literature presents various works in which fuzzy logic is added, for instance, to model uncertain demands [54–57], travel times [58,59], capacity [57,59] and service times [60]. Additional aspects are also considered by these works, such as time windows [57,61,62], environmental aspects [59], multiple objectives [59], intermodal transportation [57,59] and an open VRP [63]. Additional applications of metaheuristics combined either with Monte Carlo simulation or fuzzy logic can be found in several fields, such as scheduling [64,65], controller optimization [66,67], parameter estimation [68], finance [69], facility location [70], etc.

#### **3. Problem Description**

The location routing problem is a well-known problem in which three main decisions must be made: (i) locating one or more facilities; (ii) allocating customers to open facilities without exceeding their capacity; and (iii) designing a number of routes whose aggregated customers' demand does not exceed the vehicle capacity. Each route must start and finish at the same facility. Furthermore, we consider a location routing problem with facility sizing decisions, where the size of each open facility is also a variable to decide on. Moreover, we consider both stochastic and fuzzy demands. Hence, a percentage of the vehicles' capacity is reserved as a safety stock (*SS*), in case the demand is higher than expected. Therefore, the main decision variables in this problem are related to the number of facilities to open, the facilities' size and location, which customers must be allocated to each open facility, how many vehicles must be used and how to design the associated routes. This problem is *NP-hard* since it contains, as special cases, the capacitated vehicle routing problem, the multidepot VRP and the facility location problem, all of them known to be computationally hard. Figure 1 provides an example of a complete LRP solution. Facilities are represented by diamonds and customers by circles. Black (solid) diamonds are the open facilities, while noncolored diamonds correspond to nonopen facilities. For each open depot, a set of routes starting and finishing at the corresponding depot location is designed to serve all customers' demands. Each route is assigned a single vehicle.
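To make the problem data concrete, a minimal instance model could look as follows. This is a sketch with assumed names (`LRPInstance`, `Node`), not the authors' code:

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Node:
    x: float
    y: float

@dataclass
class LRPInstance:
    """Flexible-size LRP data: depots with a set of selectable sizes,
    and customers split into stochastic and fuzzy demand subsets."""
    depots: dict         # depot id -> Node
    customers: dict      # customer id -> Node
    sizes: dict          # size label -> depot capacity
    open_cost: dict      # (depot id, size label) -> opening cost
    veh_capacity: float  # shared capacity q of the homogeneous fleet
    stochastic: set      # customers whose demand is a random variable
    fuzzy: set           # customers whose demand is a fuzzy number

    def dist(self, a: Node, b: Node) -> float:
        # Euclidean costs are symmetric and satisfy the triangle inequality
        return math.hypot(a.x - b.x, a.y - b.y)
```

A solution then fixes, for each open depot, one size label, an allocation of customers and a set of routes starting and ending at that depot.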

**Figure 1.** Graphical representation of a location routing problem (LRP) solution.

Formally speaking, the LRP can be defined on a complete, weighted and undirected graph *G*(*V*, *E*, *C*), in which *V* = *J* ∪ *I* is the set of nodes (comprising the subset *J* of potential facility locations and the subset *I* of customers), *E* is the set of edges, and *C* is the cost matrix of traversing each edge. Delivery routes are performed by an unlimited set *K* of homogeneous vehicles with limited capacity. This problem also assumes that all vehicles are shared by all facilities (i.e., no depot has a specific fleet) and that each edge *e* ∈ *E* satisfies the triangle inequality. The customers' demands are uncertain and are modeled using stochastic values for a subset of customers *I*1, and fuzzy values for a subset of customers *I*2, such that *I*1 ∪ *I*2 = *I*. The variant of the LRP considered in this paper is the one in which a decision must be made about the size of the facilities to open. Hence, a set *L* of alternative sizes for each facility, with associated fixed and variable opening costs, is provided as an input. Depots might have equal or different capacities. Each customer node must be served by exactly one vehicle that starts and finishes its route in the facility to which the customer has been allocated (i.e., split deliveries are not allowed). The following notation is used to describe our problem:

**Parameters**


*fj* = Fixed cost of opening a depot at location *j* ∈ *J*

*ojl* = Variable cost of opening a depot of size *l* ∈ *L* at location *j* ∈ *J*

*sl* = Capacity of a depot of size *l* ∈ *L*

*Di* = Demand of customer *i* ∈ *I*

*ce* = Cost of traversing edge *e* ∈ *E*

*q* = Capacity of each vehicle

%*SS* = Safety stock percentage

**Decision variables**

*yjl* = Binary variable that indicates whether the depot *j* ∈ *J* is open with size *l* ∈ *L* or not.

*xij* = Binary variable that indicates whether customer *i* ∈ *I* is assigned to the depot *j* ∈ *J* or not.

*wek* = Binary variable that indicates whether edge *e* ∈ *E* is used in the route performed by vehicle *k* ∈ *K* or not.

The objective is to minimize the total cost (*TC*), which includes facility opening costs (*OC*), routing costs (*RC*) and failure costs (*FC*), i.e., *TC* = *OC* + *RC* + *FC*. These parts are defined in Equations (1)–(3).

$$OC = \sum\_{j \in J} \sum\_{l \in L} (f\_j + o\_{jl}) y\_{jl} \tag{1}$$

$$RC = \sum\_{e \in E} \sum\_{k \in K} c\_e w\_{ek} \tag{2}$$

$$FC = \min\{c\_{reac}, c\_{prev}\} \tag{3}$$

*FC* represents the cost incurred whenever the actual demand of a route is greater than the vehicle capacity, where *creac* and *cprev* depend on the corrective action considered (reactive or preventive, respectively).


Let ∅ ≠ *S* ⊂ *V* be a subset of nodes, *δ*+(*S*) the set of edges leaving *S*, *δ*−(*S*) the set of edges entering *S*, and *A*(*S*) the set of edges with both ends in *S*. Hence, the location routing problem with facility sizing decisions and uncertain demands can be modeled as the following integer program:

$$\text{Minimize } \text{ TC} \tag{4}$$

*subject to:*

$$\sum\_{k \in K} \sum\_{e \in \delta^-(i)} w\_{ek} = 1 \quad \forall i \in I \tag{5}$$

$$\sum\_{i \in I} \sum\_{e \in \delta^-(i)} D\_i w\_{ek} \le (1 - \%SS)q \quad \forall k \in K \tag{6}$$

$$\sum\_{e \in \delta^{+}(n)} w\_{ek} = \sum\_{e \in \delta^{-}(n)} w\_{ek} \quad \forall k \in K, \forall n \in V \tag{7}$$

$$\sum\_{j \in J} \sum\_{e \in \delta^+(j)} w\_{ek} \le 1 \quad \forall k \in K \tag{8}$$

$$\sum\_{e \in A(S)} w\_{ek} \le |S| - 1 \quad \forall S \subseteq I, \forall k \in K \tag{9}$$

$$\sum\_{e \in \delta^+(j)} w\_{ek} + \sum\_{e \in \delta^-(i)} w\_{ek} \le 1 + x\_{ij} \quad \forall i \in I, \forall j \in J, \forall k \in K \tag{10}$$

$$\sum\_{j \in J} x\_{ij} = 1 \quad \forall i \in I \tag{11}$$

$$\sum\_{i \in I} D\_i x\_{ij} \le \sum\_{l \in L} s\_l y\_{jl} \quad \forall j \in J \tag{12}$$

$$\sum\_{l \in L} y\_{jl} \le 1 \quad \forall j \in J \tag{13}$$

$$y\_{jl}, x\_{ij}, w\_{ek} \in \{0, 1\} \quad \forall j \in J, \forall l \in L, \forall i \in I, \forall e \in E, \forall k \in K \tag{14}$$

The objective function (4) minimizes the total cost. Constraint (5) ensures that each customer is served by a single route and a single vehicle. Constraint (6) guarantees that the total demand served by a vehicle in a route does not exceed its capacity; this limit is reduced by a safety stock, i.e., a percentage of the vehicle capacity reserved to respond more effectively to the uncertain demand. Constraint (7) guarantees the continuity of each route. Constraint (8) ensures the return of each vehicle to its starting depot. Constraint (9) eliminates subtours. Constraint (10) ensures that a customer is served by a route departing from an open depot only if this customer is allocated to that depot. Constraint (11) guarantees that each customer is assigned to exactly one depot. Constraint (12) ensures that the total demand served from a depot does not exceed its assigned size. Constraint (13) guarantees that at most one size is assigned to an open depot. Finally, Constraint (14) states that all decision variables are binary.
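A lightweight feasibility check of a candidate solution against Constraints (6) and (11)–(13) can be sketched as follows. The function name and data layout are assumptions for illustration, not the paper's code:

```python
def check_solution(demand, assign, size_of, size_cap, veh_cap, routes, pct_ss):
    """Check a candidate solution:
    demand:   customer -> (expected) demand D_i
    assign:   customer -> depot, one depot per customer (Constraint (11))
    size_of:  open depot -> chosen size label (one size, Constraint (13))
    size_cap: size label -> depot capacity s_l
    routes:   list of (depot, [customers served]) pairs."""
    # Constraint (12): demand allocated to a depot fits its chosen size
    load = {}
    for i, j in assign.items():
        load[j] = load.get(j, 0.0) + demand[i]
    if any(load.get(j, 0.0) > size_cap[l] for j, l in size_of.items()):
        return False
    for depot, custs in routes:
        # Constraint (6): route load within capacity minus the safety stock
        if sum(demand[i] for i in custs) > (1 - pct_ss) * veh_cap:
            return False
        # consistent with (10): a route only serves customers of its depot
        if any(assign[i] != depot for i in custs):
            return False
    return True
```

For example, two customers with demands 3 and 4 fit a single route of capacity 10 with a 10% safety stock (load 7 ≤ 9), but not with a 40% safety stock (7 > 6).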

#### **4. Solution Approach**

Since the problem described in Section 3 is known to be *NP-hard*, the formulated mathematical model is not employed to find an optimal solution, but rather to provide a better understanding of the problem details. Hence, we propose a fuzzy simheuristic approach [24] for minimizing the expected total cost. Traditionally, simheuristics have been used to solve optimization problems with stochastic components, such as arc routing problems with stochastic demands [48], stochastic waste collection problems [50] or team orienteering problems with stochastic travel times [71]. We have extended the simheuristic framework by including fuzzy components in order to deal with combinatorial optimization problems with uncertainty components of both stochastic and nonstochastic nature. In particular, our methodology combines an iterated local search (ILS) metaheuristic with Monte Carlo simulation and fuzzy inference systems (FIS) to deal with stochastic and fuzzy variables, respectively. As discussed in Ferone et al. [72], several metaheuristic frameworks offer a well-balanced combination of efficiency and relative simplicity and can be easily extended to a fuzzy simheuristic. In general, our approach is composed of three stages. During the first stage, a set of promising LRP solutions is generated using a constructive heuristic, which employs biased-randomization techniques [73]. In the second stage, the ILS metaheuristic tries to improve each of these promising solutions by iteratively exploring the search space and conducting a short number of simulations. Finally, in the third stage, a refinement procedure using a larger number of simulation runs is applied to the elite solutions, which allows one to obtain a more accurate estimation of the expected total cost.

Algorithm 1 outlines the main components of Stage 1, which quickly generates a ranked list of "promising" LRP solutions. The main input parameters of this heuristic are: the list of customers, with both their demands and their locations in Cartesian coordinates; the list of facilities, including their opening costs; and the vehicle capacity. The algorithm proceeds as follows: initially, the minimum and maximum numbers of facilities (*nbDepots*<sup>0</sup> and *maxNbDepots*, respectively) required to serve the total demand are computed. Both bounds are calculated by dividing the total demand by the maximum available facility size, and the

minimum available facility size, respectively, and rounding up to the next integer. Then we run our algorithm for each number of facilities between *nbDepots*<sup>0</sup> and *maxNbDepots* (line 3). Later, for each iteration of the line 4 loop, a new set of random locations is generated (line 5) and stored in *usedOpenDepots* to avoid repetitions. Next, if the available capacity of the facilities in *openDepots* is enough to satisfy the customers' demand, the customers' allocation and routing procedures are carried out; otherwise, *openDepots* is rejected. The customers' allocation procedure is performed by producing a new *map* (line 9), where each facility has a list of all customers sorted by savings. These savings represent the benefit of allocating each customer to the current depot instead of to the best alternative facility. Then a facility in *openDepots* is selected randomly, and a biased-randomized procedure is used to allocate a customer of the list to the current depot. This procedure ends when all customers have been allocated. In line 10, a VRP is solved for each facility-customers subset in the map. Finally, a feasible LRP solution is yielded and stored in the pool of solutions *poolSol*. The algorithm ends by returning a top list of complete LRP solutions, assessed in terms of opening and routing costs.
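Two pieces of this stage can be sketched compactly: the bounds on the number of depots, and a geometric biased-randomized selection over a savings-sorted list, in the spirit of the biased-randomization techniques of [73]. Function names and the geometric skew are assumptions:

```python
import math
import random

def depot_bounds(total_demand, facility_sizes):
    """Minimum and maximum numbers of depots needed to cover the total
    demand, i.e., total demand over the largest / smallest available
    size, rounded up (nbDepots0 and maxNbDepots in Algorithm 1)."""
    return (math.ceil(total_demand / max(facility_sizes)),
            math.ceil(total_demand / min(facility_sizes)))

def biased_pick(sorted_items, beta):
    """Pick an index from a geometric distribution with parameter beta,
    so items near the top of the savings-sorted list are favored while
    every item keeps a nonzero probability of being chosen."""
    u = 1.0 - random.random()  # u in (0, 1], avoids log(0)
    idx = int(math.log(u) / math.log(1.0 - beta)) % len(sorted_items)
    return sorted_items[idx]
```

With *β* close to 1 the choice is almost greedy; with *β* close to 0 it approaches a uniform pick.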

#### **Algorithm 1** Constructive heuristic (*cust*, *depots*, *vehCap*, *β*, *itermax*)


Algorithm 2 outlines Stages 2 and 3. During the second stage, each "promising" map generated by the constructive heuristic is processed by the simulation and the fuzzy components to estimate its safety stock (line 4). This procedure is carried out by performing a low number of runs, where a new value is assigned to each random or fuzzy element based on its probability distribution or fuzzy function, respectively. We use Monte Carlo simulation in order to estimate the stochastic variables, whilst a fuzzy inference system is used to estimate the fuzzy variables. Then, the objective function and the constraints are evaluated under the random/fuzzy generated values to compute the expected cost of each promising map. Next, the ILS metaheuristic tries to improve the set of "promising" maps by iteratively exploring the search space and conducting a second process of fuzzy/simulation runs. We start the process by perturbing the current base solution *baseSol* (line 8). In this phase we use two different strategies. In the first one, the algorithm randomly selects a set of customers and tries to reassign them in a random way to another facility without violating its capacity. Regarding the second strategy, the algorithm randomly exchanges the allocation of a percentage of customers among facilities. This process is dependent on the value of *k*, which represents the degree of exchange to be applied. This value is

updated in each iteration between *Kmin* and *Kmax*, i.e., it is reset to *Kmin* whenever a new solution *newSol* outperforms the *baseSol*, and it is increased, up to a maximum value *Kmax*, whenever the algorithm fails to improve the current solution. The strategy to be used in each iteration of the algorithm is randomly selected.
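The update rule for the exchange degree *k* can be sketched in one line. The unit increment is an assumption, since the exact step used by the authors is not specified:

```python
def update_k(k, improved, k_min, k_max):
    """Reset the perturbation degree to k_min on improvement; otherwise
    increase it (assumed unit steps), capped at k_max."""
    return k_min if improved else min(k + 1, k_max)
```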


Afterwards, the algorithm starts a local search around the perturbed solution in order to improve it (line 9). This stage consists of a *two-opt inter-route* operator, which interchanges two chains of randomly selected customers between different facilities. A *newSol* is returned whenever no more improvements are achieved. Later, whenever the deterministic cost of the *baseSol* is improved (line 10), the *newSol* is processed by the simulation and the fuzzy components to deal with the uncertainty of the proposed problem, using a low number of runs to compute the expected cost of the solution (line 11). Notice that this procedure not only provides estimates of the expected cost associated with the solutions generated by our approach, but also reports feedback to the metaheuristic search process. If the *newSol* is also able to improve the expected cost of the *baseSol* (line 12), the latter is updated. In the same way, if the expected cost of the *newSol* improves the cost of the best solution (*bestSol*) found so far (line 14), the latter is updated and added to the pool of elite solutions (line 16). This pool contains the best stochastic/fuzzy solutions found so far. The number of solutions in this pool is a known parameter that depends on the available computational time. Moreover, by limiting the size of this pool, we ensure that we only keep track of the top solutions as the algorithm evolves. In order to further diversify the search, the algorithm might occasionally accept nonimproving solutions following an acceptance criterion (lines 20–28). Specifically, we have used a simulated-annealing acceptance criterion, which contains a decaying probability that is regulated by a dynamic temperature parameter (*T*).
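The simulated-annealing acceptance criterion can be sketched as follows. The cooling schedule for *T* is not reproduced here and would be an additional assumption:

```python
import math
import random

def accept(new_cost, base_cost, temperature):
    """Always accept an improving solution; accept a worse one with a
    probability that decays exponentially with the cost gap and the
    current temperature T."""
    if new_cost <= base_cost:
        return True
    return random.random() < math.exp((base_cost - new_cost) / temperature)
```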

Finally, a refinement procedure using a larger number of simulation runs is executed in the third stage for each elite solution (lines 31–36). Hence, a more accurate summary of the output variables can be obtained. As before, both probability distributions and fuzzy functions are employed in this simulation, depending on whether the element has a stochastic or a fuzzy nature. Finally, the "best" solution (or pool of best alternative solutions) is returned, considering that the decision maker might not only be interested in the average value associated with a solution but also in its variability level. In particular, the main output variables in our experiments are the opening and routing costs, the cost incurred whenever a route fails and the safety stock.
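The third-stage refinement of the elite pool can be sketched as below. The names and the stand-in `simulate` callable are hypothetical; in the paper, the simulation combines probability distributions for stochastic demands and fuzzy functions for fuzzy demands.

```python
import statistics

def refine(elite_pool, simulate, long_runs=5000):
    """Re-evaluate each elite solution with a large number of simulation
    runs, reporting both the expected cost and its variability so the
    decision maker can trade off mean cost against risk."""
    report = []
    for sol in elite_pool:
        costs = [simulate(sol) for _ in range(long_runs)]
        report.append({
            "solution": sol,
            "expected_cost": statistics.mean(costs),
            "std_dev": statistics.pstdev(costs),
        })
    # Lowest expected cost first; ties could be broken by variability.
    return sorted(report, key=lambda r: r["expected_cost"])
```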

#### **5. Computational Experiments**

Multiple sets of instances are found in the literature to test algorithms designed to solve the LRP [74–76]. Nevertheless, these sets do not consider characteristics such as parameter uncertainty and flexible facility sizes, i.e., the instances must be adapted to our problem's features. Therefore, we use Akca's instances [74] and introduce the following modifications:


*Var*[*Di*] = *λφi*. These variability values are kept identical to those used by Tordecilla et al. [13] in order to perform a suitable comparison of results. The demand of the other half of the customers is considered to be fuzzy. In this case, *Di* can be estimated as low (DL), medium (DM) or high (DH). The demand in each of these fuzzy sets is represented by a triangular fuzzy number *Di* = (*d1i*, *d2i*, *d3i*). If *q* is the vehicle's total load capacity, all fuzzy demand values are expressed as a proportion of *q* in order to perform an appropriate comparison between the demand and the vehicle's available capacity, i.e., 0 ≤ *Di* ≤ 1. The membership functions of these fuzzy sets are displayed in Figure 2.
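As a small illustration of the stochastic part, the log-normal parameters of a demand with mean *φi* and variance *λφi* follow from the standard moment relations. The helper name is ours; the paper only states that stochastic demands are log-normal with known mean and standard deviation.

```python
import math

def lognormal_params(mean: float, variance: float):
    """Return (mu, sigma) of a log-normal variable with the given mean and
    variance, e.g. mean = phi_i and variance = lambda * phi_i."""
    sigma2 = math.log(1.0 + variance / mean ** 2)   # sigma^2 = ln(1 + v/m^2)
    mu = math.log(mean) - 0.5 * sigma2              # mu = ln(m) - sigma^2/2
    return mu, math.sqrt(sigma2)
```

Sampling `random.lognormvariate(mu, sigma)` with these parameters then reproduces the prescribed mean and variance.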

#### *A Fuzzy Approach for the Demand and the Vehicle Available Capacity*

When considering customers with stochastic demands, the decision about visiting the next customer in a route is made simply by comparing its expected demand with the vehicle's current capacity. If this demand is greater, the vehicle performs a detour to the depot for a replenishment. Nevertheless, when the next customer's demand is fuzzy, the decision about serving it is made by employing a preference index *pi* [77], which indicates the strength of our inclination to visit the next node in a route. This index depends on both the estimated demand of the next node, *Di+1*, and the vehicle capacity *Ci* that remains available after serving customer *i* ∈ *I*. *Ci* is expressed as a proportion of *q*, i.e., 0 ≤ *Ci* ≤ 1. It can also be classified as low (CL), medium (CM) or high (CH), and it is represented by a triangular fuzzy number *Ci* = (*c1i*, *c2i*, *c3i*). The membership functions of the capacity fuzzy sets are displayed in Figure 3.

**Figure 3.** Fuzzy sets for the vehicle available capacity after visiting the customer *i*.

The preference index is defined between 0 and 1, i.e., 0 ≤ *pi* ≤ 1. When *pi* = 1, we will definitely visit the next node in a route, since the vehicle's available capacity is certain to meet its demand. When *pi* = 0, we are sure that *Di+1* exceeds *Ci* and the vehicle must return to the depot for a replenishment. We consider that the preference can be very low (PVL), low (PL), medium (PM), high (PH) or very high (PVH). Each of these categories is represented by a fuzzy set, whose membership function is depicted in Figure 4. Additionally, we define a set of reasoning rules (Table 2) to determine the preference to visit the next node depending on the levels of both the demand and the vehicle's available capacity.
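A minimal sketch of the triangular membership functions and of the firing strength of one reasoning rule is given below. The breakpoints of the fuzzy sets are illustrative assumptions, since the paper defines them only graphically (Figures 2 and 3), and fuzzy AND is taken here as the min operator, a common but assumed choice.

```python
def tri_membership(x: float, a: float, b: float, c: float) -> float:
    """Membership degree of x in the triangular fuzzy number (a, b, c)."""
    if x < a or x > c:
        return 0.0
    if x == b:
        return 1.0
    if x < b:
        return (x - a) / (b - a)   # rising edge
    return (c - x) / (c - b)       # falling edge

# Illustrative fuzzy sets, expressed as proportions of vehicle capacity q.
DEMAND_SETS = {"DL": (0.0, 0.0, 0.4), "DM": (0.2, 0.5, 0.8), "DH": (0.6, 1.0, 1.0)}
CAPACITY_SETS = {"CL": (0.0, 0.0, 0.4), "CM": (0.2, 0.5, 0.8), "CH": (0.6, 1.0, 1.0)}

def rule_strength(demand: float, capacity: float, d_set: str, c_set: str) -> float:
    """Firing strength of a rule 'IF demand is d_set AND capacity is c_set',
    using min as the fuzzy AND operator."""
    return min(tri_membership(demand, *DEMAND_SETS[d_set]),
               tri_membership(capacity, *CAPACITY_SETS[c_set]))
```

Aggregating the fired rules over the preference sets (PVL, ..., PVH) and defuzzifying would then yield the crisp index *pi*.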

**Figure 4.** Fuzzy sets for the preference strength to visit the customer *i*.


**Table 2.** Reasoning rules determining the visit preference strength.

Figure 5 displays the procedure used to compute the preference index *pi* after serving the customer *i* ∈ *I*. This procedure is described as follows:

**Figure 5.** Procedure used to compute the preference index *pi*.

	- (a) Generate a random demand *di* between a lower bound and an upper bound. Since the objective is to preserve variability conditions similar to those of the stochastic demands, the lower and upper bounds are given by the expressions (*φi* − √(3*λφi*))/*q* and (*φi* + √(3*λφi*))/*q*, respectively.
	- (b) Calculate the membership degree *μ*(*di*) of this demand. Notice that *μ*(*di*) ∈ [0, 1].
	- (c) Generate a random number *ρ* ∈ [0, 1].
	- (d) Compare *ρ* and *μ*(*di*). If *ρ* ≤ *μ*(*di*), then assume the actual demand of the customer *i* as *di*; otherwise, repeat steps (a)–(d) until this condition is fulfilled.
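Steps (a)–(d) amount to acceptance-rejection sampling from the fuzzy membership function. The sketch below assumes the bounds of step (a) are (*φi* ± √(3*λφi*))/*q*, consistent with *Var*[*Di*] = *λφi* for a uniform proposal; the function names are ours.

```python
import math
import random

def sample_fuzzy_demand(phi, lam, q, membership, rng=random.random):
    """Acceptance-rejection sampling of a demand d_i whose likelihood is
    proportional to its fuzzy membership degree mu(d_i)."""
    lo = (phi - math.sqrt(3 * lam * phi)) / q
    hi = (phi + math.sqrt(3 * lam * phi)) / q
    while True:
        d = lo + (hi - lo) * rng()      # step (a): uniform proposal
        # steps (b)-(d): accept d with probability mu(d), else retry
        if rng() <= membership(d):
            return d
```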

We define a known threshold *p*∗, such that 0 ≤ *p*∗ ≤ 1. The computed preference index *pi* is compared with *p*∗ in order to make a decision about the vehicle's next destination. If *pi* ≥ *p*∗, the vehicle should visit the next customer directly; otherwise, we estimate that the vehicle's available capacity cannot meet the next customer's demand. In this case, both preventive (*cprev*) and reactive (*creac*) costs are calculated (see Section 3). If *cprev* < *creac*, the vehicle should perform a detour to the depot for a preventive replenishment; otherwise, it should visit the next customer directly and react to its real demand. The lower the threshold level, the greater the inclination to unload the vehicle as much as possible before making a replenishment trip to the depot. In this case, fewer preventive detours are performed; hence, the number of times that a reactive round-trip must be carried out increases. Previous tests using the modified Akca's instances yielded lower costs when *p*∗ = 0.45.
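The resulting decision rule can be summarized in a few lines. The names are hypothetical; the computation of *cprev* and *creac* is described in Section 3 and is passed in here as given.

```python
def next_move(p_i: float, p_star: float, c_prev: float, c_reac: float) -> str:
    """Routing decision after serving customer i: visit the next customer
    directly, or detour to the depot first for a preventive replenishment."""
    if p_i >= p_star:
        return "visit_next"          # capacity is confidently sufficient
    # Otherwise, compare the cost of a preventive detour against the cost
    # of reacting to a route failure at the next customer.
    return "preventive_detour" if c_prev < c_reac else "visit_next"
```

With *p*∗ = 0.45, as calibrated in the text, higher preference indices lead directly to the next customer, and a low index triggers the cost comparison.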

The following parameters are used by our algorithm to run the experiments: (i) 350 iterations for map perturbations; (ii) 150 iterations for the biased-randomized savings heuristic; (iii) 150 iterations for splitting; (iv) a random value between 0.05 and 0.80 for *β*1, the parameter of the geometric distribution associated with the biased-randomized selection during the allocation map process; (v) a random value between 0.07 and 0.23 for *β*2, the parameter of the geometric distribution associated with the biased-randomized heuristic for routing; (vi) *n* = 100 runs for the initial simulation stage; (vii) *N* = 5000 runs for the intensive simulation stage; and (viii) 100 iterations to estimate the safety stock (*SS*), testing only discrete values between 0% and 10%. Our proposed algorithm was coded as a Java application. All experiments were executed on a standard Windows PC with a Core i5 processor and 6 GB of RAM. A total of ten different random seeds were used for each instance.

#### **6. Results and Discussion**

Table 3 shows the results obtained for 12 Akca's instances. Five main indicators are computed: depot opening costs (OC), which comprise both fixed and variable costs; routing costs (RC); failure costs (FC), which are incurred whenever the vehicle must perform either a detour or a round-trip to the depot; total costs (TC); and the estimated safety stock (SS) level. Four types of solutions are compared. All of them are flexible, i.e., they consider facility sizing decisions. Firstly, our best deterministic solutions are shown, i.e., there is no uncertainty in the customers' demand and its realization is exactly as expected. In this case, a safety stock is not necessary and there are no failure costs. Secondly, we show the best stochastic solutions reported by Tordecilla et al. [13], in which the exact customers' demand is not known. Instead, all demands follow a log-normal distribution with known mean and standard deviation. Thirdly, our best hybrid fuzzy-stochastic solutions are displayed, in which half of the customers' demands follow a log-normal distribution and the other half are considered to be fuzzy. Finally, our best fuzzy solutions are shown, in which all customers' demands are considered to be fuzzy, due to a high level of uncertainty. Additionally, results for three levels of variability (*λ*) are shown. Clearly, our best deterministic solutions are the same regardless of the variability level, given the total absence of uncertainty.



**Table 3.** Average results for the 12 Akca's instances (OC: opening costs; RC: routing costs; FC: failure costs; TC: total costs; SS: safety stock).

| Solution type | OC | RC | FC | TC | SS |
|---|---|---|---|---|---|
| Best deterministic | 191.67 | 671.03 | – | 862.70 | – |
| Best stochastic [13] | 191.67 | 673.54 | 5.58 | 870.79 | 1.67% |
| Best hybrid fuzzy-stochastic | 191.67 | 673.41 | 7.50 | 872.57 | 2.08% |
| Best fuzzy | 197.92 | 668.13 | 8.53 | 874.57 | 2.33% |

Results in Table 3 show a slight average increase in total costs when increasing the variability level for all types of solutions, except the best deterministic solution. This growth is caused mainly by the rise in failure costs, since a greater number of detours and round-trips is expected when the demand variability level is higher. Additionally, total costs also increase when the uncertainty level is higher, regardless of the variability level, i.e., the deterministic solution is the cheapest one, and the fuzzy solution is the most costly. If we compare only the average deterministic cost of each set of solutions, formed by the sum of OC and RC, we obtain values with negligible differences. Hence, the differences in total costs are driven mainly by failure costs. For example, for the instance *Cr30x5a-3* in the low variability scenario, 1.6% of total costs are failure costs in the best stochastic solution. However, in the best fuzzy solution this percentage rises to 3.5%. Most instances show this steady growth when increasing the uncertainty level, which confirms that fuzzy scenarios have a higher uncertainty level when compared with deterministic and stochastic scenarios. Finally, the average safety stock increases when both variability and uncertainty levels rise, since more protection against uncertainty is necessary in both cases.

Results corresponding to our best deterministic solution in Table 3 were yielded assuming that the realized demand is deterministic. Hence, an additional experiment has been performed, in which this solution (called henceforth OBD) is tested in a hybrid fuzzy-stochastic environment, using 0% of safety stock protection against uncertainty. Figure 6 compares this solution's results with our best-found hybrid fuzzy-stochastic solution (OBF) in terms of failure costs. Results for 12 Akca's instances are depicted for each demand variability scenario. Extreme points in dashed lines indicate the average cost for each set of data. As expected, average failure costs show an increasing trend when the variability grows, regardless of the type of solution. More importantly, Figure 6 shows that OBF outperforms OBD when tested under uncertainty conditions. This fact demonstrates the quality of our fuzzy simheuristic approach, especially in scenarios where the demand variability is high.

**Figure 6.** Failure costs of our best deterministic and our best hybrid solutions.

Table 4 compares two types of hybrid fuzzy-stochastic solutions. Firstly, we show our best solution with a single facility size alternative, as given by the original Akca's instances, i.e., the solution is not flexible since only one size is available to select. Secondly, we show our best flexible solution, which corresponds to our best hybrid solution in Table 3. When comparing the total costs of both types of solutions, the negative gap obtained for all instances and under all variability levels shows the advantages of considering facility sizing decisions. For example, we reach a maximum absolute gap of 7.71% in total cost savings for a single instance. On average, both opening and routing costs decrease whenever alternative depot sizes are available. Nevertheless, each instance shows different results regarding OC and RC. The most evident case is that in which opening costs decrease. Clearly, this is a direct result of having smaller facility size alternatives. Without loss of generality, all examples below take the high variability scenario as reference. For example, the instance *Cr30x5b-3* has a total demand of 1620. Both the flexible and nonflexible approaches design the same routes and yield equal routing costs. Nevertheless, the nonflexible approach locates two depots of size 1000 each, whereas our flexible approach locates one depot of size 1000 and one of size 750. Hence, the nonflexible solution assigns extra capacity that is not necessary under the problem's current conditions.

**Table 4.** Comparative results between our hybrid solutions when considering facility sizing decisions.


Some instances show the opposite behavior, i.e., opening costs either increase or remain the same while routing costs decrease. For example, the nonflexible solution of the instance *Cr30x5a-1* opens two depots of size 1000 each. Alternatively, the flexible solution opens one depot of size 1500 and one depot of size 500, i.e., the total capacity is equal and, given our defined cost structure, so are the opening costs. However, this slight change drives a redesign of routes that decreases RC. An additional example is given by the instance *Cr40x5a-2*. Figure 7 depicts the best solution found by both the nonflexible approach (a) and our flexible approach (b). The solution in Figure 7a locates two depots of size 1750 each, and the solution in Figure 7b locates three depots of size 875 each. The latter case has a smaller total capacity than the former; however, its opening costs are higher, since the fixed cost is clearly greater when 3 facilities are open instead of 2. This new configuration considerably decreases routing costs (Table 4), which shows that facility sizing decisions reduce total costs not only by decreasing depot capacity but also by increasing it, since shorter routes can be designed.

**Figure 7.** Best-found solution by a nonflexible (**a**) and a flexible (**b**) fuzzy LRP for the instance *Cr40x5a-2*.

#### *Managerial Insights*

From a managerial perspective, we have presented a general algorithm for solving a flexible-size LRP in which a subset of customers provides enough information to model their demand stochastically, while the complementary subset provides scarce data. In the latter case, decision makers may estimate a fuzzy demand. Our algorithm is general because scenarios where the demand of all customers is deterministic, stochastic or fuzzy represent particular cases of our described problem. Hence, decision makers can employ our approach more widely than other algorithms. We analyze these scenarios through numerical results and assess how the level of uncertainty influences opening, routing and route-failure costs. Clearly, more precise data decrease total costs. Furthermore, we have calculated the cost of assuming a deterministic demand when the real scenario is fuzzy or stochastic. It has been shown that our hybrid approach yields lower average costs, which leads to a more competitive supply chain. Additionally, we have also shown that important cost savings are generated whenever a set of facility size alternatives is analyzed by decision makers, instead of considering a single alternative, as in most LRP studies. Finally, our algorithm is able to generate detailed information about the location-allocation-routing decisions that should be made.

#### **7. Conclusions**

This work presented a location routing problem in which the facility size is an additional decision variable, instead of a known parameter as the traditional LRP assumes. Moreover, we consider a hybrid fuzzy-stochastic setting in which some customers' demands are fuzzy and others are stochastic. Hence, a fuzzy simheuristic approach is proposed to solve this problem cost- and time-efficiently. Initially, our algorithm selects the best size for each open facility from a set of provided alternatives. We perform an iterative procedure in which a set of location-allocation-routing configurations is assessed in terms of opening and routing costs. Then, a top list of complete LRP solutions is iteratively perturbed and simulated. The perturbation stage is performed by employing an iterated local search metaheuristic. The simulation stage is carried out by running a classic Monte Carlo simulation for the stochastic demands and a fuzzy simulation for the fuzzy demands. Failure costs are introduced as an additional performance indicator. Finally, a set of elite solutions is assessed through a refinement procedure in which a larger number of simulation runs is executed.

Our fuzzy simheuristic approach has proven flexible enough not only to combine stochastic and fuzzy demands efficiently in a single execution but also to address less general scenarios in which the demands of all customers are either deterministic or fuzzy. Our approach has also proven to be a cost-efficient algorithm when considering uncertainty scenarios. It decreases route failure costs when compared with the best deterministic solution tested in a hybrid fuzzy-stochastic environment. The use of a safety stock policy as a protection against uncertainty has also contributed to this decrease. In order to design a time-efficient algorithm, our current approach employs stochastic and fuzzy simulation only to assess the designed routes. Hence, our algorithm's results could be enhanced by introducing fuzzy-stochastic aspects from the construction stage onwards. However, this approach might also increase computational times.

To the best of our knowledge, this is the first time that a hybrid fuzzy-stochastic LRP with facility sizing decisions has been addressed. Medium-sized benchmark instances considering three demand variability levels were used. The obtained results show that introducing such flexibility decreases total costs in two mutually nonexclusive ways: firstly, by yielding savings in opening costs through locating facilities of smaller size; and secondly, by yielding savings in routing costs through locating facilities of larger size, which drives a redesign of routes that reduces the total traveled distance. We have also shown that these savings are obtained regardless of the demand variability level.

Multiple challenges remain open for future research. Since we currently assume that only routes can fail when demands are higher than expected, future work may include the simulation of facility failures, which would prompt a revision of location-allocation decisions. In addition, failure costs are currently measured only by considering the distances traveled to perform round-trips and detours. Still, real-life customers might not allow a delivery delay, e.g., because a time-window constraint must be met. Such a delay may drive lost sales or a goodwill reduction. Hence, this type of cost could be included in the computation of failure costs. Finally, large-sized instances can be used to assess the influence of the number of nodes on our approach's performance.

**Author Contributions:** Conceptualization, R.D.T. and A.A.J.; methodology, R.D.T., P.J.C.-M. and J.P.; software, P.J.C.-M. and J.P.; formal analysis, R.D.T.; investigation, R.D.T., P.J.C.-M., C.L.Q.-A. and J.R.M.-T.; writing—original draft preparation, all authors; writing—review and editing, R.D.T. and A.A.J. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work has been partially supported by the Spanish Ministry of Science (PID2019- 111100RB-C21/AEI/10.13039/501100011033). In addition, it has received the support of the Doctoral School at the Universitat Oberta de Catalunya (Spain) and the Universidad de La Sabana (INGPhD-12-2020).

**Data Availability Statement:** Data are available upon reasonable request to the corresponding author.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Article* **Integrated Simulation-Based Optimization of Operational Decisions at Container Terminals**

**Marvin Kastner \*,†, Nicole Nellen †, Anne Schwientek † and Carlos Jahn**

Institute of Maritime Logistics, Hamburg University of Technology, 21073 Hamburg, Germany; nicole.nellen@tuhh.de (N.N.); a.schwientek@tuhh.de (A.S.); carlos.jahn@tuhh.de (C.J.) **\*** Correspondence: marvin.kastner@tuhh.de; Tel.: +49-40-42878-4793

† These authors contributed equally to this work.

**Abstract:** At container terminals, many cargo handling processes are interconnected and occur in parallel. Within short time windows, many operational decisions need to be made and should consider both time efficiency and equipment utilization. During operation, many sources of disturbance and, thus, uncertainty exist. For these reasons, perfectly coordinated processes can potentially unravel. This study analyzes simulation-based optimization, an approach that considers uncertainty by means of simulation while optimizing a given objective. The developed procedure simultaneously scales the amount of utilized equipment and adjusts the selection and tuning of operational policies. Thus, the benefits of a simulation study and an integrated optimization framework are combined in a new way. Four meta-heuristics—Tree-structured Parzen Estimator, Bayesian Optimization, Simulated Annealing, and Random Search—guide the simulation-based optimization process. Thus, this study aims to determine a favorable configuration of equipment quantity and operational policies for container terminals using a small number of experiments and, simultaneously, to empirically compare the chosen meta-heuristics including the reproducibility of the optimization runs. The results show that simulation-based optimization is suitable for identifying the amount of required equipment and well-performing policies. Among the presented scenarios, no clear ranking between meta-heuristics regarding the solution quality exists. The approximated optima suggest that pooling yard trucks and a yard block assignment that is close to the quay crane are preferable.

**Keywords:** container terminal; simulation; simulation-based optimization; meta-heuristic; horizontal transportation; hyper-parameter optimization

#### **1. Introduction**

Seaports are the interface between various transport modes in the maritime supply chain. In 2019, the volume of global maritime containerized trade had tripled to 152 million TEU (Twenty-foot Equivalent Unit, the size of a standard container) from its value in 1997 [1]. Moreover, ship sizes have also tripled in the past 20 years, from an 8000 TEU capacity to around 24,000 TEU. This implies that, in addition to adjustments to the port's infrastructure and superstructure, container terminals have to substantially increase their efficiency in ship handling in order to keep unproductive berthing times as short as possible while container volumes that are handled during one ship call increase. Thus, the challenge for terminals is to handle a large number of containers within a very short period of time. Terminals can address this challenge by creating technical prerequisites (i.e., using more and higher-performance equipment) and by optimizing operational processes. While the use of more equipment entails correspondingly more investment and higher running costs, intelligent control of operational processes leads to more efficient cargo handling without additional costs. Therefore, it is reasonable to minimize the amount of necessary equipment and to coordinate operational processes.

Container handling requires a large number of process steps in the terminal. When a ship is berthed, Quay Cranes (QCs) unload the containers and set them onto waiting

**Citation:** Kastner, M.; Nellen, N.; Schwientek, A.; Jahn, C. Integrated Simulation-Based Optimization of Operational Decisions at Container Terminals. *Algorithms* **2021**, *14*, 42. https://doi.org/10.3390/a14020042

Academic Editor: Angel A. Juan Received: 6 December 2020 Accepted: 25 January 2021 Published: 28 January 2021


Yard Trucks (YTs), which then transport them to the storage area. There, Rubber-Tired Gantry cranes (RTGs) lift the containers into the respective Yard Block (YB) for short-term storage until the container is picked up. The steps in the process of loading a ship run in the opposite direction. The processes are coupled because YTs are passive equipment and are not able to lift the containers themselves. As a result, waiting times and utilization of the respective equipment must be weighed against each other. In order to perform ship handling as quickly as possible and to reduce the waiting times of the QCs, it is preferable to use more YTs; however, this leads to longer waiting times and lower utilization of the YTs. Furthermore, using too many YTs leads to congestion and, therefore, to delays in ship handling [2]. The better the balance between these conflicting objectives, the more efficiently the terminal works.

There are several decision problems in the design and operation of container terminals, which strongly influence the efficiency of container handling. Figure 1 shows an overview of typical decision problems at container terminals.

**Figure 1.** Decision problems at container terminals.

Decisions regarding the design of the terminal layout (e.g., location and size of the YBs) or the equipment that must be used and how much of it to procure have a rather long-term influence (refer to [3] for a recent overview). From a short-term perspective, on the quayside, decisions include berth allocation, stowage planning, and QC assignment and scheduling (refer to [4] for an overview). In horizontal transport, the decision problems are dispatching or scheduling (assigning vehicles and transport orders) and routing. Dispatching (assigning RTGs and storage orders), YB assignment, and YB position assignment are the primary decision problems in the storage area. On the land side, there are also questions of order assignment and gate control. Kizilay and Eliiyi [5] provide a recent overview of container terminal decision problems.

These decision problems influence each other [6]. For example, berth allocation directly influences the YB assignment and vice versa. The distances between the ship at the berth and the assigned YBs should be as short as possible, and at the same time, sufficient YBs should be assigned to a berth or QC. RTGs in the yard can typically move 15 containers within one hour, while QCs have a productivity of around 30 moves/h. Thus, at least two YBs (each being served by at least one RTG) have to be connected to one QC. Another example is the relationship between gate organization and dispatching in the yard: If the number of truck arrivals is regulated by a truck appointment system, then this also influences the number of handling orders for RTGs and thus affects dispatching [7]. These are just two examples of the numerous interactions between decision problems. Therefore, there is a risk that the overall solution will deteriorate if only one decision problem is optimized. Hence, it is necessary to combine the different related decision problems into a single integrated decision problem. Usually, attention is focused on waterside operations due to the high costs related to the berth time of a ship. This means that the number of QCs, YTs, and RTGs as well as the integration of the handling processes need to be jointly considered to ensure efficient operations. Thus, this study analyzes a combination of QC assignment and dispatching, as well as YB assignment, as shown with a dark frame around the respective boxes in Figure 1. Additionally, the dashed frame around the quantity of equipment per type indicates that the number of YTs used is modified.
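The productivity balance mentioned above (RTGs at roughly 15 moves/h versus QCs at roughly 30 moves/h) reduces to a simple ceiling division. The helper below is an illustration of that back-of-the-envelope check, not part of the study's simulation model.

```python
import math

def min_yard_blocks_per_qc(qc_moves_per_h: float, rtg_moves_per_h: float,
                           rtgs_per_block: int = 1) -> int:
    """Minimum number of yard blocks needed to keep one quay crane busy,
    assuming each block is served by rtgs_per_block RTGs."""
    return math.ceil(qc_moves_per_h / (rtg_moves_per_h * rtgs_per_block))

# With the rates from the text, 30 / 15 gives at least 2 yard blocks per QC.
```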

Integrated decision problems covering QCs, YTs, and RTGs have traditionally been solved by formulating and solving a mathematical model. At first glance, this seems like a reasonable approach. However, it results in a very complex problem for which a solution is difficult to find and which, especially under the real-time requirements of a container terminal, is almost impossible to solve with the available computing power. Therefore, it is worthwhile stepping back and considering different methods, including the use of policies.

#### *1.1. Literature Review on Integrated Decision Problems*

There are two main approaches to addressing the operational decision problems of container terminals [8]. The first approach takes into account the complex dynamic environment of a container terminal. It typically uses priority rules, which are analyzed with the help of simulation models that can represent stochastic processes. Simulation models involving container terminals are reviewed in [9,10]. In the second approach, a simplified deterministic mathematical model is formulated. Either the model can be solved optimally for small instances, or an approximation can be found with the help of heuristics.

While the first approach is better suited to volatile processes at container terminals, simulation studies that aim to investigate several decision problems quickly become very complex [11]. The second approach also quickly reaches its limits. In this context, Zhen et al. [12] showed that the integrated QC and YT scheduling problem is non-deterministic polynomial-time hard, which means that the computing time required to optimally solve the problem is too long to be used in practice. To take advantage of both approaches, especially in order to investigate integrated decision problems that influence each other, the approach of combining simulation and optimization has been developed in recent years. He et al. [13] addressed integrated QC, YT, and RTG scheduling. They developed a mixed-integer programming model and proposed a simulation-based optimization method. Their optimization algorithm integrates genetic and particle swarm optimization algorithms. Cao et al. [14] aimed to schedule RTGs and YTs simultaneously in order to decrease the ship turnaround time. They introduced a multi-layer genetic algorithm to solve the scheduling problem and designed an algorithm-accelerating strategy. Castilla-Rodríguez et al. [15] focused on the QC scheduling problem: They integrated artificial intelligence techniques and simulation, combining an evolutionary algorithm with a simulation model to embed uncertainty. Kizilay et al. [16] studied the integrated problem of QC assignment and scheduling, YB assignment, and YT dispatching. They proposed a mixed-integer programming and constraint programming model and showed that the constraint programming model performed much better in terms of calculating time. Integrated quayside problems can also be investigated with other aims, such as saving costs and energy simultaneously [17] or exploring different modes of integration [18]. Sislioglu et al. 
[19] combined discrete event simulation, data envelopment analysis, and cost-efficiency analysis to investigate different investment alternatives based on the number of QCs, total length of a quay, YTs, and RTGs. They applied their model to 16 different scenarios but did not modify the operating policies. Furthermore, other research has combined different simulation paradigms to solve optimization problems at different decision levels. For example, in [20], a system-dynamic model was used to optimize the main parameters of a dry port, and in [21], a combination of system-dynamic and discrete-event simulation models was used. Kastner et al. [22] provide a literature overview of simulation-based optimization at container terminals. They focused on the covered problems, chosen meta-heuristics, and the shapes of the parameter configuration space in the respective publications. Similarly, Zhou et al. [23] present a summary of publications on the integration of simulation and optimization for maritime logistics. They classified five modes of integration according to the interaction of the two techniques.

Kastner et al. [24] proposed applying the Tree-structured Parzen Estimator (TPE) approach to scale the amount of utilized equipment in a simulation model. With the help of simulation-based optimization, only a subset of the experiments had to be executed. At the same time, a fine search grid (all equipment was scaled in step sizes of 1) enabled a very good approximation of the unknown optimum.

The presented study is an extension of [24]. The reviewed literature is expanded and updated. Previously, the three meta-heuristics TPE, Simulated Annealing (SA), and Random Search (RS) were used to scale the number of QCs and YTs. In this study, the number of YBs is additionally varied, and the coordination of equipment is altered by using different policies. This enrichment required several extensions of the simulation model and the parameter configuration space. In the new study, the number of QCs is considered to be fixed during an optimization run, and a caching mechanism is implemented to speed up optimization. As a new meta-heuristic, Bayesian Optimization (BO) is introduced. The application of dispatching and other policies differentiates this publication from most of the previously mentioned publications. Previous works often directly search for near-optimal sequences of container handling tasks. For larger container terminals, terminal operating systems integrate the computed schedules of different equipment [25]. Smaller container terminals tend to use less complex IT solutions that lack automated scheduling methods, relying more on operational rules of thumb, such as dispatching policies [26]. This study presents a solution method that is applicable to these smaller container terminals. The simulation results describe the near-optimal combination of multiple decisions, such as the quantity of equipment, the dispatching policy, and other policies for a given situation. In this study, optimization plays a role that is very different from that in the above-mentioned scheduling methods. Several parameters, some of them categorical, some discrete, and some continuous, are adjusted in parallel. It is known from similar prior studies (e.g., [11]) that different parameters also affect each other. In this study, therefore, a multivariate optimization problem is solved.

#### *1.2. Optimizing Objective Functions without Mathematical Optimization*

The integration of several operational problems leads to many decisions that are made in parallel. In the presented study, the number of utilized resources is scaled while, concurrently, different storage and equipment control policies are taken into account, potentially allowing for policy tuning. All of these parameters are optimized without a mathematical model. This section elaborates on the options available to a simulation practitioner if no mathematical model is present but an optimum, or at least its approximation, is sought.

For simulation studies, a full factorial design is often used, where each parameter combination is tested by running a corresponding simulation experiment [27]. The parameter combination that performs best according to an objective function is then reported as the best known solution. In other words, it is the best available approximation of the unknown optimum. Such simulation studies do not include a mathematical model, and hence, the distance between the best approximation of the optimum and the optimum of the mathematical model cannot be reported.
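Such a full factorial design can be sketched as follows; the parameter names, value sets, and the stand-in objective are invented for illustration and do not come from the study:

```python
from itertools import product

# Illustrative finite parameter sets; names and values are hypothetical.
param_grid = {
    "num_qcs": [3, 4, 5, 6],
    "num_yts": [9, 12, 15, 18, 21],
    "policy": ["fixed", "free"],
}

def run_experiment(config):
    # Stand-in for an expensive simulation run returning an objective
    # value to be minimized (purely synthetic).
    return config["num_yts"] / config["num_qcs"] + (0.5 if config["policy"] == "free" else 1.0)

def full_factorial(grid, objective):
    """Evaluate every parameter combination and report the best known solution."""
    keys = list(grid)
    best_config, best_value = None, float("inf")
    for values in product(*(grid[k] for k in keys)):
        config = dict(zip(keys, values))
        value = objective(config)
        if value < best_value:
            best_config, best_value = config, value
    return best_config, best_value

best_config, best_value = full_factorial(param_grid, run_experiment)
```

The grid above already requires 4 · 5 · 2 = 40 experiments; each additional parameter multiplies this count, which is the scaling problem discussed next.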

A study that covers a full factorial design is only feasible for finite sets, such as categorical values or selected numerical values. Continuous parameters (e.g., real numbers) need to be restricted to a finite set of selected values. The search grid for such a study needs to be sufficiently fine (i.e., for natural numbers, few omissions within a given range; for real numbers, small step sizes) so that the optimum can be approximated well. For high-dimensional parameter configuration spaces, the combination of all concurrently varied parameters at some point becomes too large for exhaustive examination. This holds true for large simulation studies as well as for hyper-parameter optimization in machine learning. Furthermore, even if all grid values of a real-valued parameter are exhaustively examined, only an approximation of the optimal input parameter can be identified.

As soon as the combination of all varied parameters results in a number of expensive simulation runs that cannot be executed within a reasonable time frame, some of the simulation runs need to be skipped or simplified. If the goal of the study is to approximate an optimum, expensive simulation runs that are expected to yield poor objective function values should be avoided. Several such approaches are presented in the following paragraphs. However, if the simulation model is sufficiently complex, any approach might fail to identify the best approximated optimum in the set of feasible solutions.

One option to reduce computational time is to use a multi-fidelity approach [28]. With a low-fidelity simulation model, each parameter combination is evaluated. These results are used to identify promising parameter configurations, which are further examined with a high-fidelity simulation model. From that subset, the best solution can be determined. This approach requires the simulator to create two simulation models of different fidelity. Creating the low-fidelity simulation model requires special skill, as all important aspects need to be covered in the model; otherwise, a promising parameter configuration for the high-fidelity simulation experiment may be omitted. At the same time, the processes must be sufficiently simplified to reduce the time required for running the experiments.
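A minimal sketch of this screening scheme follows; the two stand-in models and the shortlist size are invented for illustration:

```python
def multi_fidelity_search(configs, low_fidelity, high_fidelity, keep=5):
    """Screen every configuration with the cheap low-fidelity model, then
    re-evaluate only the `keep` most promising ones with the expensive
    high-fidelity model (minimization)."""
    shortlist = sorted(configs, key=low_fidelity)[:keep]
    return min(shortlist, key=high_fidelity)

# Hypothetical stand-ins: the low-fidelity model is a slightly biased
# approximation of the high-fidelity one.
high = lambda x: (x - 7) ** 2
low = lambda x: (x - 6) ** 2  # systematic bias of one unit

best = multi_fidelity_search(range(21), low, high, keep=5)
```

If the bias of the low-fidelity model were larger than the shortlist is wide, the true optimum (here x = 7) would be screened out before the high-fidelity stage, which is exactly the modeling risk described above.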

An alternative option to reduce computational time is to maintain a computing budget, i.e., the total amount of computing resources that are available to approximate the optimal solution [29]. The initial experiments are randomly chosen from the parameter configuration space [30]. If, during the evaluation, a given parameter configuration is identified as having superior performance, more computing resources are invested to obtain a better picture of the corresponding objective function value. The longer a given simulation model is observed, the more closely the typically noisy sample statistics approximate the population parameters. AlSalem et al. [29] stated that this approach does not necessarily create optimal solutions, but solutions that are close to the optimum with a very high probability. Furthermore, in industry, such approximations often satisfy requirements [29]. Even if optimal input parameters are calculated for a given simulation model, the difference in performance might be of little practical relevance.
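The idea of spending extra replications only on an apparently promising configuration can be sketched as follows; the stochastic objective and the replication counts are invented for illustration:

```python
import random

random.seed(42)

def noisy_eval(x):
    # Stand-in for one replication of a stochastic simulation (synthetic objective).
    return (x - 3) ** 2 + random.gauss(0, 0.1)

def budgeted_search(configs, screen_reps=2, refine_reps=10):
    """Spend few replications on every configuration, then invest the remaining
    budget in extra replications of the apparently best one, so that its sample
    mean more closely approximates the true expected performance."""
    means = {x: sum(noisy_eval(x) for _ in range(screen_reps)) / screen_reps
             for x in configs}
    best = min(means, key=means.get)
    means[best] = sum(noisy_eval(best) for _ in range(refine_reps)) / refine_reps
    return best, means[best]

best, estimate = budgeted_search(range(7))
```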

Another option to reduce computational time is to embed the simulation into an outer loop of optimization. The simulation model itself is regarded as a black-box function. If the optimization problem (i.e., the objective values derived from the simulation model) has some known structural properties, one might prefer to derive a simpler representation that enables optimization with other tools than simulation [31]. The optimization algorithm in the outer loop tests a subset of the feasible parameter configurations to approximate the optimum of the black-box function. This concept goes by many different names, such as "simulation optimization" [32], "simulation evaluation" [33], "simulation integrated into optimization" [34], and "simulation-based optimization" [23]. For cases in which a combinatorial optimization problem is solved, the term "simheuristic" has been coined [35]. This concept is the only combination of simulation and optimization that does not rely on maintaining an additional mathematical model [34]. If the simulation model is sufficiently complex, then the optimization algorithm can only consist of general guidelines for searching for good (but not necessarily optimal) solutions. These general guidelines, which are applicable across research domains, are also referred to as meta-heuristics [36]. Meta-heuristics often start with several randomly drawn parameter configurations, and the first objective values that are obtained direct further search. A good meta-heuristic balances exploration and exploitation. During exploration, parameter configurations that are quite different from the previous samples are tested. During exploitation, well-performing parameter configurations are slightly altered to obtain an improved parameter configuration. After several iterations, a stopping criterion is reached, and the best solution found so far is returned as an approximation of the global optimum. For a guided search, a meta-heuristic needs to keep track of past evaluations. The large number of meta-heuristics reported in the literature stems from the fact that it is a non-trivial decision as to how to continue a search given a set of observations.
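The generic scheme described above — random start, guided continuation, a balance of exploration and exploitation, and a stopping criterion — can be sketched as follows; everything here is an illustrative toy, not one of the meta-heuristics used later in this study:

```python
import random

random.seed(7)

def black_box(x):
    # Stand-in for the combination of simulation model and objective function.
    return (x - 12) ** 2

def simple_metaheuristic(evaluate, low, high, iterations=50, explore_prob=0.3):
    """Minimal guided search: with probability `explore_prob`, a distant random
    configuration is sampled (exploration); otherwise, the incumbent is
    slightly altered (exploitation)."""
    best_x = random.randint(low, high)   # first configuration drawn randomly
    best_y = evaluate(best_x)
    history = [(best_x, best_y)]         # past evaluations guide further search
    for _ in range(iterations - 1):
        if random.random() < explore_prob:
            x = random.randint(low, high)                               # explore
        else:
            x = min(high, max(low, best_x + random.choice([-1, 1])))    # exploit
        y = evaluate(x)
        history.append((x, y))
        if y < best_y:
            best_x, best_y = x, y
    return best_x, best_y, history

best_x, best_y, history = simple_metaheuristic(black_box, 0, 30)
```

After the stopping criterion (here simply 50 evaluations), the best solution found so far is returned as the approximation of the unknown optimum.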

To the best of the authors' knowledge, [37] and the authors' previous study [24] are the only ones that have used the simulation-based optimization approach at a container terminal for scaling the amount of several utilized resources. Kotachi et al. [37] simultaneously optimized the berth length, the number of QCs, the number of gates, the fleet size of YTs, the number of export and import rows, and the number of RTGs per row. The objective function balanced the throughput and the utilization, weighted by investment costs. While a high throughput was achieved with more resources, the weighted utilization ensured that no superfluous resources were added to the container terminal. Even though not all permissible realistic values were taken into account, the authors calculated 72,576 possible parameter combinations. They decided that these were too many to fully cover in a simulation study. They built an optimization framework that consisted of two stages: First, the interactions between resources were examined to determine the most promising sequence of resource optimization tasks. As seven different types of resources were checked, 7! = 5040 possible permutations existed. Second, the resulting sequence was utilized to optimize each resource one by one. The resources that were not yet optimized were selected according to stochastic sampling. Following the proposed optimization framework, the number of executed experiments could be reduced by 34%, and the authors state that further enhancements are possible. In the following, some applicable ideas for such an optimization framework are explored.

#### *1.3. Relationship between Hyper-Parameter Optimization and Simulation-Based Optimization*

Identifying high-performing solutions for a given model is a typical research question in many fields of science. The speed of the necessary search process has increased since the advent of computers and the new possibility of automating tedious and complex computations. Complex computations often become quite resource-intensive and tend to contain a large number of parameters that can be varied. Each parameter configuration describes an alternative shape of the executed computation. For categorical parameters (i.e., an element of a finite set), the size of the parameter configuration space grows exponentially with every additional parameter. For continuous parameters (e.g., a range of real numbers), even for a single parameter, the search space is infinite. Therefore, for many applications, the parameter configuration space reaches a size that is impossible or impractical to cover. Hence, scientists are forced to evaluate only a subset of all feasible parameter configurations. This search process is further complicated if the model contains stochastic components. Then, a single parameter configuration often needs to be tested several times before the resulting statistics reliably inform the scientist of the quality of a parameter configuration.

One field of science that faces similar difficulties is machine learning. Here, learning algorithms that consist of many exchangeable components are often optimized according to some metric. In neural networks, for example, different mathematical functions can serve as the activation function, the weights can be adjusted by different training algorithms, the number of neurons in each layer can vary, etc. [38]. These decisions are referred to as hyper-parameters. They are usually considered to be constant during one experiment. It is a non-trivial problem to identify the best hyper-parameters for a given machine learning problem. Since machine learning pipelines often contain stochastic components, a repeated evaluation is often necessary.

The task of constructing and adjusting machine learning models is so complex that, in some cases, randomly picking parameter configurations outperforms manual model calibration by scientists [39]. The authors explain these results by the higher resolution of the search grid and less waste of the computational budget on the variation of parameters that have little or no impact on the final result. To support the expert in automating the search through a parameter configuration space, Hutter et al. [40] were the first to present an optimization procedure that can deal with numerical and categorical parameters in a problem-independent manner. Bergstra et al. [41] quickly followed with an alternative approach that they called TPE. A comparison across several data sets showed that the performance of such a hyper-parameter optimization technique varies with each setup [42–44]. This research topic is often referred to as hyper-parameter optimization and is the subject of active research and development [44–46].

In the past several years, many newly developed hyper-parameter optimization approaches have advanced the field. Studies such as [42–44] have empirically compared different meta-heuristics. This approach is the key to identifying characteristics of each meta-heuristic, which, in turn, can result in the recommendation of one meta-heuristic. Such an empirical approach is necessary because of the No Free Lunch Theorem (NFLT) in the optimization of black-box functions: One cannot determine the most successful optimization algorithm for an unseen problem [47]. This means that, for simulation-based optimization, a meta-heuristic that has worked well in one study might fail to lead to good approximations of the optimum for a different "problem", i.e., the black-box function that consists of a simulation model and an objective function. In machine learning, the black-box function can be understood as a combination of the learning algorithm and the data set on which it operates. Once one of the components is changed, it constitutes "an unseen problem" according to the NFLT.

As McDermott [48] elaborated, this claim might not be in harmony with observations from the scientific literature, as certain meta-heuristics often tend to provide better results than others. Therefore, no published ranking of different optimization algorithms is guaranteed to be reproducible for other problem instances. However, when several optimization studies are considered together, the observed characteristics of each optimization algorithm (e.g., a meta-heuristic) should fit into the broader picture. Hence, comparison studies such as [42–44] or this publication contribute to these deeper insights.

This study uses simulation-based optimization on a multivariate optimization problem at a container terminal. As parameters, only categorical, discrete, and continuous value ranges are permissible. This makes it suitable for choosing policies, determining the amount of employed equipment, and defining policies that accept tuning parameters. The simulation model and the objective function together are treated as a black-box function that is repeatedly evaluated because of its stochasticity. The novelty of this optimization study is that meta-heuristics that have also been used in the context of hyper-parameter optimization are applied to a discrete event simulation model that models several integrated problems of a container terminal. As only meta-heuristics are deployed, the parameter configuration space and the simulation model can both be extended to represent additional complex integrated decisions with little effort. Alternative approaches that couple simulation and optimization require that both the simulation model and mathematical model remain aligned [34].

#### **2. Materials and Methods**

First, the simulation model is presented in Section 2.1. Then, the subsequently used meta-heuristics are presented in Section 2.2. Section 2.3 describes the method of identifying good parameter configurations for the simulation model. In this context, the objective function is presented.

#### *2.1. Simulation Model*

In the following, the created simulation model of the container terminal and its restrictions are presented. The discrete-event simulation model is implemented in Tecnomatix Plant Simulation and is based on the data of a real terminal. The layout of the terminal is shown in Figure 2.

**Figure 2.** Layout and process illustration of the simulation model.

The terminal has a quay length of 800 m and a total of 20 YBs. The simulation model represents container handling between the quayside and container yard. As displayed in Figure 2, the specifics of the design of the land-side transport interface as well as the berths are not considered in detail. In this study, parallel handling of several ships is not modeled. Consequently, the simulation model of the container terminal is stressed by the arrival of a single ship. For this, 12 bays of the ship with a total of 4000 containers have to be handled. About half of them are import containers to be unloaded and the other half are export containers to be loaded. The container transport orders are pre-defined and contain details such as the origin and destination of the containers at the terminal and the availability time for handling in the yard or quayside by gantry cranes. To reduce the execution time of a simulation run, three variations for each parameter combination are generated before all experiments. Depending on the experiment, at least 3 and at most 6 QCs are available for loading and unloading the ship. The ship is unloaded and loaded bay by bay, whereby the QC always handles the entire bay that is assigned to it. Thus, the QC moves after the unloading of a bay and before the loading of the next bay. These bay changes are represented by a delay between the handling of two containers of the same QC. When fewer QCs are used for one ship, more bays are served by each QC.

In the simulation model, an average quayside handling rate of 30 containers per hour is assumed, which corresponds to an expected handling time of 120 s per container for each QC. To model stochastic influences, a triangular distribution of the QC handling times is assumed, with a minimum handling time of 80 s and a maximum value of 180 s. YTs carry out horizontal transport between the quay and yard. Since YTs are unable to lift containers themselves, the transfer between the QC and YT must be synchronized. The travel times of the YTs are determined by the distances between QCs and YBs, as specified by the terminal layout. For calculating the travel times, an average speed of 8.4 m/s is assumed for the YTs. By applying a triangular distribution, stochastic influences are taken into account when calculating the travel times. The total number of YTs used per experiment is generated in the yard at the beginning of the simulation.
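The travel-time computation described above can be sketched as follows; the trip distance and the spread of the triangular distribution are assumptions, since the text only reports the average speed:

```python
import random

random.seed(1)

YT_SPEED = 8.4  # average yard-truck speed in m/s, as assumed in the model

def yt_travel_time(distance_m, spread=0.2):
    """Travel time derived from distance and average speed, perturbed by a
    triangular distribution to model stochastic influences. The relative
    `spread` of the distribution is an assumed parameter, since the text
    does not report the exact bounds."""
    expected = distance_m / YT_SPEED
    return random.triangular(expected * (1 - spread), expected * (1 + spread), expected)

t = yt_travel_time(420.0)  # a hypothetical 420 m trip between a QC and a YB
```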

The inbound and outbound yard operations are performed by RTGs. For the simulation model, it is assumed that RTGs can handle an average of 15 containers per hour, which corresponds to an expected value of 240 s per handling. Deviations and irregularities in the process are modeled by a triangular distribution with a minimum handling time of 180 s and a maximum value of 420 s. For all experiments, at least two YBs are required for each active QC. Thus, it can be ensured that sufficient stowage space is available for the containers to be handled. Depending on the experiment, one of the two storage policies *random YB assignment* and *close YB assignment* is investigated with the simulation model. For the storage policy *random YB assignment*, containers from all active YBs can be transported to and from each of the QCs. In addition, import containers are randomly assigned to YBs, regardless of which QC is used for the unloading. All active YBs are loaded as equally as possible. The second storage policy is *close YB assignment*, which seeks to minimize the distance between QCs and YBs. YBs with the shortest possible distance for horizontal transport are assigned to every QC. The handling of containers only takes place between the defined QCs and YBs.

Furthermore, the two different dispatching policies *fixed QC-YT assignment* and *free QC-YT assignment* are implemented in the simulation model. In the first policy, *fixed QC-YT assignment*, a fixed number of YTs are assigned to each QC. This policy is typically used in practical terminal operations, as it is the simplest to apply. Every YT receives an attribute that assigns it to a specific QC. In the second dispatching policy, *free QC-YT assignment*, each YT can approach every QC. This policy is a modified version of the hybrid method from Schwientek et al. [11]. The next suitable job is chosen on the basis of the necessary driving time at the terminal and the waiting time of orders. The driving time and waiting time can be weighted differently for each experiment, controlled by the policy tuning parameter *Dispatching Weight*. For the selection of the next suitable order, each free YT inspects the next 20 orders. The order with the earliest availability time is determined. For the other 19 orders, the difference between the order's availability time and the earliest availability time is determined. Additionally, the required travel time to the start position of the respective order is calculated. Both values are then weighted: the travel time by the experiment's *Dispatching Weight* and the waiting-time difference by its complement. Finally, the results are added, and the order with the smallest sum is chosen by the YT. If no YT is available at the quay, the QCs have to wait. Otherwise, the containers can be loaded directly onto the YTs. Loaded YTs drive the containers to the defined YB. There, the YTs and containers are separated from each other. The same process steps for handling export containers occur in the reverse sequence of that described above.
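The order-selection rule of the *free QC-YT assignment* policy can be sketched as follows; the order data and all names are hypothetical, and the weighting follows the description in the text (the weight applies to the driving time, its complement to the waiting time):

```python
def choose_order(orders, travel_time_to, dispatching_weight, lookahead=20):
    """Sketch of the free QC-YT dispatching rule. `orders` is a list of
    (order_id, availability_time) pairs in sequence; `travel_time_to` maps an
    order to the YT's driving time to its start position."""
    candidates = orders[:lookahead]
    earliest = min(t for _, t in candidates)   # earliest availability time
    def score(order):
        order_id, availability = order
        waiting = availability - earliest      # delay relative to earliest order
        driving = travel_time_to[order_id]     # travel time to start position
        return dispatching_weight * driving + (1 - dispatching_weight) * waiting
    return min(candidates, key=score)[0]       # order with the smallest sum

orders = [("a", 100.0), ("b", 90.0), ("c", 95.0)]
travel = {"a": 30.0, "b": 80.0, "c": 40.0}
chosen = choose_order(orders, travel, dispatching_weight=0.5)
```

With a weight of 0, only the waiting time matters; with a weight of 1, only the driving time matters — the tuning parameter interpolates between the two criteria.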

#### *2.2. Employed Meta-Heuristics*

Meta-heuristics are used to approximate the best parameter configuration for the given simulation model described in the previous subsection. The corresponding process is depicted in Figure 3. First, the history *H* and the counter *i* are initialized as an empty set and 0, respectively. Since no prior observations are recorded in *H*, the meta-heuristic must select the experiment randomly. In the next step, the meta-heuristic suggests a parameter configuration *x*(*i*). If this is a previously unseen parameter configuration, a simulation experiment is executed and the fitness is calculated. Otherwise, from the previous experiment runs stored in *H*, the fitness value corresponding to *x*(*i*) is retrieved. In both cases, *H* is extended with the new value, and the meta-heuristic is set up with this *H*. After 50 evaluations, the results are reported, and the optimization study is completed. The history *H* is implemented with a global database that shares the history over several optimization runs independently from the employed meta-heuristic during the retrieval phase, which helps to reduce the wall time. The meta-heuristic is only set up with experiments that are previously suggested within the same optimization run to ensure the independence of each optimization run for the subsequent evaluation.
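The loop of Figure 3, including the caching mechanism, can be sketched as follows; the stand-in meta-heuristic and objective are illustrative only:

```python
import random

def optimize(suggest, run_experiment, budget=50):
    """Optimization loop with the caching mechanism described above: previously
    evaluated parameter configurations are looked up in the history H instead
    of being re-simulated. `suggest` plays the role of the meta-heuristic."""
    history = {}                      # H: parameter configuration -> fitness
    for _ in range(budget):
        x = suggest(history)          # meta-heuristic proposes x(i)
        if x in history:
            y = history[x]            # cache hit: reuse the recorded fitness
        else:
            y = run_experiment(x)     # cache miss: run the simulation
        history[x] = y
    return min(history.items(), key=lambda item: item[1])

# Hypothetical stand-ins (not the study's model or meta-heuristic):
random.seed(3)
simulation_runs = []
def run_experiment(x):
    simulation_runs.append(x)         # count only genuine simulation runs
    return (x - 4) ** 2

best_x, best_y = optimize(lambda history: random.randint(0, 9), run_experiment)
```

Since the toy configuration space contains only ten values, far fewer than 50 simulation runs are executed — the same effect that reduces the wall time in the study.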

Without having executed any simulation experiments, the simulation model must be considered a black box. It is, therefore, impossible to know which of a set of given meta-heuristics would lead to the best approximation. Since many meta-heuristics themselves have a stochastic component, even two different optimization runs of the same meta-heuristic can result in different approximations of the optimum. Hence, for a yet unknown simulation model, it is unpredictable whether the optimum will be approximated sufficiently well and, if so, which meta-heuristic will achieve this. Each meta-heuristic needs to empirically prove its applicability to a problem [48]. In this study, TPE was applied to a discrete event simulation model and compared with SA, BO, and RS. These meta-heuristics are introduced next.

**Figure 3.** The optimization process.

#### 2.2.1. Tree-Structured Parzen Estimator

TPE was developed to automate the search for a sufficiently well-performing configuration of a Deep Belief Network [41]. For a Deep Belief Network, parameters are either categorical variables (e.g., the decision whether to use pre-processed or raw data) or continuous variables (e.g., the learning rate). Integer values can be modeled as either categorical variables (the numbers are mere labels) or continuous variables that are rounded before further use. In addition, dependencies between variables exist: One variable determines the number of network layers and then each layer is configured on its own. Having the configuration of the third layer as part of a parameter configuration is only reasonable if the variable encoding the number of layers is set to at least three. Such a parameter is called a conditional parameter. To reflect the dependencies, this type of parameter configuration space is adequately represented as a tree. This requires specific meta-heuristics that support such a tree structure. The TPE approach has produced good benchmark results [42,49], and the initial paper is among the most cited publications on hyper-parameter optimization; at the time of writing, Scopus indicates that there are 939 citations.

The TPE models *p*(*y* < *y*∗), *p*(*x*|*y* < *y*∗), and *p*(*x*|*y* ≥ *y*∗), where *p* denotes a probability density function (short: density), *y* is a point evaluation of the model, and *x* is a parameter configuration. The first density *p*(*y* < *y*∗) = *γ* is a fixed value (e.g., 0.15 was used in the first publication) set by the experimenter, and *y*∗ is altered to fit the set value of *γ* for each iteration. The other two densities can be summarized as *p*(*x*|*y*): The probability that a certain parameter configuration has been used, given a desired point evaluation value. Typically, TPE is formulated to find a minimum and, therefore, *p*(*x*|*y* < *y*∗) describes the density of parameters that have shown better results, whereas *p*(*x*|*y* ≥ *y*∗) describes the density of the parameters that have led to poorer performance. As the true densities are unknown, they need to be estimated based on the obtained evaluations in each iteration. For each categorical parameter, two probability vectors are maintained and updated: Given the prior vector of probabilities *p* = (*p*1, ..., *pN*), with each probability *pi* for *i* ∈ 1, ..., *N* representing one category, the posterior vector elements are proportional to *N* · *pi* + *Ci*, where *Ci* counts the occurrences of choice *i* in the recorded evaluations so far. An example is depicted in Figure 4a. The estimator for the better-performing parameters (the top 15%) and that for the worse-performing parameters (the bottom 85%) are calculated based on the observations already recorded. For each continuous parameter, two adaptive Parzen estimators are used. Given a prior probability density distribution determined by the experimenter, with each point evaluation of the parameter configuration space with the help of the simulation model and the objective function, the densities are further approximated. An example is depicted in Figure 4b. 
The parameter choices of the better- and worse-performing models are used to create the respective densities. In each iteration, the model consisting of the two densities is used to select the next parameter configuration *x* to evaluate. To achieve this, several candidate configurations *x* are sampled from the promising distribution *p*(*x*|*y* < *y*∗). The parameter configuration *x* with the greatest expected improvement is chosen. This criterion is positively correlated with the ratio *p*(*x*|*y* < *y*∗)/*p*(*x*|*y* ≥ *y*∗) [41]. In Figure 4, this is referred to as the ratio. The criterion favors the parameter configuration that has a high probability of leading to small evaluation values and a low probability of obtaining large evaluation values for the minimization problem at hand. After the parameter configuration has been evaluated by running the experiment and calculating the objective function value, the probability estimators are updated. For this publication, the reference implementation [49] in version 0.2.5 (the newest at the time of conducting the study), provided by the original authors, was chosen. As the prior distribution, a uniform distribution was used.
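The posterior update for a categorical parameter and the ratio-based selection criterion can be illustrated with a minimal sketch; the counts are invented, and this is not the reference implementation:

```python
def categorical_posterior(prior, counts):
    """Posterior for a categorical parameter: each element is proportional to
    N * p_i + C_i, where N is the number of categories, p_i the prior
    probability, and C_i the observed count of category i."""
    n = len(prior)
    weights = [n * p + c for p, c in zip(prior, counts)]
    total = sum(weights)
    return [w / total for w in weights]

# Three categories "a", "b", "c" with a uniform prior; the counts are made up.
prior = [1 / 3, 1 / 3, 1 / 3]
good = categorical_posterior(prior, counts=[4, 1, 0])  # choices among the top-γ runs
bad = categorical_posterior(prior, counts=[2, 5, 6])   # choices among the remaining runs

# TPE favors candidates with a high ratio p(x | y < y*) / p(x | y >= y*):
ratios = [g / b for g, b in zip(good, bad)]
best_category = "abc"[max(range(3), key=ratios.__getitem__)]
```

Category "a" appears often among the good runs and rarely among the bad ones, so its ratio is the largest — mirroring the bar heights and the ratio shown in Figure 4a.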

(**a**) Categorical variable.

(**b**) Continuous variable.

**Figure 4.** The Tree-structured Parzen Estimator (TPE) uses the ratio of better- and worse-performing parameters to guide the search. In the example on the left, the categorical variable takes one of the three values "a", "b", and "c". For the continuous variable in the example on the right, the values range from 0 to 20.

#### 2.2.2. Simulated Annealing

SA is a meta-heuristic that is applicable to both combinatorial and multivariate problems [50], the latter of which is relevant for this optimization study. For this publication, the implementation of [49] in version 0.2.5 (the newest at the time of conducting the study) was used. This version of SA works on the tree structure presented in Figure 5. The tree structure requires some specific adjustments of the algorithm documented in [51]. The default initialization values of the implementation were used.

**Figure 5.** The parameter configuration space in tree form.

#### 2.2.3. Bayesian Optimization

BO, also referred to as Gaussian process optimization, aims to minimize the expected deviation [52]. The response surface (i.e., the objective function values for given parameter configurations), including the uncertainty about the result, is estimated. For this publication, the implementation from [53] in version 1.2.6 (the newest at the time of conducting the study) was used. Since this procedure does not support tree-structured parameter configuration spaces, two simplifications are made. First, the numbers of YTs for the fixed and free *QC-YT Assignment* are grouped together; the parameter's value range corresponds to that of the free-assignment branch. If the fixed assignment is chosen, the number of YTs is divided by the number of QCs, and the result is rounded to the closest integer. Second, the dispatching weight is always set, even if it is not interpreted by the simulation model. During the initialization of an optimization run, five random experiments are conducted before BO takes over and guides the search.
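The first simplification can be sketched as follows; the function and parameter names are hypothetical:

```python
def interpret_yts(num_yts, assignment, num_qcs):
    """Sketch of the BO simplification: the vector always carries a total YT
    count in the range of the free branch; for the fixed assignment, it is
    divided by the number of QCs and rounded before parameterizing the model."""
    if assignment == "free":
        return num_yts                  # fleet size shared by all QCs
    return round(num_yts / num_qcs)     # fixed: YTs per QC

fleet_size = interpret_yts(17, "free", num_qcs=4)    # used as-is
yts_per_qc = interpret_yts(17, "fixed", num_qcs=4)   # 17 / 4 = 4.25 -> 4
```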

#### 2.2.4. Random Search

RS serves as a baseline. According to the NFLT, for some optimization problems, meta-heuristics perform worse than RS. It is crucial to identify meta-heuristics that misguide the search (e.g., by becoming stuck in local optima). For this publication, the implementation of [49] in version 0.2.5 (the newest at the time of conducting the study) was used, which is capable of sampling from a tree-structured parameter configuration space.
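Sampling from a tree-structured parameter configuration space, as RS does here, can be sketched as follows; the exact value ranges are assumptions inferred from the text (3–7 YTs per QC for the fixed branch, 3–7 YTs times *#QCs* in total for the free branch, at least two YBs per active QC, and dispatching weights in steps of 0.1):

```python
import random

random.seed(11)

def sample_configuration(num_qcs):
    """Draw one random configuration from a tree-structured space resembling
    Figure 5: the dispatching weight only exists in the free-assignment branch,
    and the meaning of the YT parameter depends on the chosen branch."""
    config = {
        "yb_assignment": random.choice(["random", "close"]),
        "num_ybs": random.randint(2 * num_qcs, 20),
        "qc_yt_assignment": random.choice(["fixed", "free"]),
    }
    if config["qc_yt_assignment"] == "fixed":
        config["yts_per_qc"] = random.randint(3, 7)
    else:  # conditional parameters of the free branch
        config["total_yts"] = random.randint(3 * num_qcs, 7 * num_qcs)
        config["dispatching_weight"] = random.choice([round(0.1 * k, 1) for k in range(11)])
    return config

samples = [sample_configuration(num_qcs=4) for _ in range(20)]
```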

#### *2.3. Optimization Procedure*

The optimization procedure is written as an external program that encapsulates the simulation. First, the simulation model is initialized with the parameter configuration to examine. After the simulation runs for one experiment are finished, the output of the simulation model is read. The communication between the two programs is realized through the COM-Interface. The output values of the simulation model are inserted into the objective function, which determines the fitness of a given solution. This optimization procedure is described in more detail in the following.

#### 2.3.1. Parameter Configuration Space

During initialization, each parameter of the parameter configuration is set as a global variable in the simulation model. The interpretation of one global variable in the simulation model can depend on another global variable. For example, the parameter *#YTs* is interpreted as the number of YTs for each QC for a fixed QC-YT assignment, but it is interpreted as the number of YTs for all QCs for a free QC-YT assignment. The parameter *Dispatching Weight* is the weight for the travel time and ranges from 0 to 1 in steps of 0.1. The weight for the availability time for handling is deduced by calculating 1−*dispatching weight for travel time*. This parameter is only used for a free QC-YT assignment. Although setting this uninterpreted parameter does not harm the simulation experiment, the recorded observations for the meta-heuristic are flawed: the meta-heuristic might use a record that includes the uninterpreted parameter to guide further search, even though the parameter had no effect, thereby steering the search process in the wrong direction.
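The branch-dependent interpretation described above can be illustrated by a random draw from a tree-structured space. This is a hedged Python sketch with hypothetical names; the *#YBs* parameter is omitted because its value range is not stated in the text.

```python
import random

def sample_configuration(n_qcs, rng=random):
    """Draw one configuration from a tree-structured space (sketch): the
    meaning of #YTs and the presence of the dispatching weight depend on
    the chosen QC-YT assignment branch."""
    config = {
        "yb_assignment": rng.choice(["closest", "random"]),
        "qc_yt_assignment": rng.choice(["fixed", "free"]),
    }
    if config["qc_yt_assignment"] == "fixed":
        # Fixed branch: #YTs is interpreted per QC.
        config["yts_per_qc"] = rng.randint(3, 7)
    else:
        # Free branch: #YTs is interpreted for all QCs, and the dispatching
        # weight exists only here (steps of 0.1, as in the text).
        config["total_yts"] = rng.randint(3 * n_qcs, 7 * n_qcs)
        config["dispatching_weight"] = rng.randrange(0, 11) / 10
    return config
```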

In Figure 5, the parameter configuration space is depicted in its tree form. It is dependent on the number of QCs, which can be either 3, 4, 5, or 6. Each case is considered to be independent and requires optimization.

The presented tree is used for TPE, SA, and RS. For BO, a vector representation is derived whereby each of the parameters *QC-YT Assignment*, *YB Assignment*, *#YTs*, *#YBs*, and *Dispatching Weight* is represented by one dimension of this vector. The different interpretation of *#YTs* is alleviated by varying the value over the range from *#QCs* · 3 to *#QCs* · 7 and, in the case of a fixed assignment, dividing by *#QCs* and rounding to the closest integer before using it to parameterize the simulation model. The parameter *Dispatching Weight* can theoretically take any value between 0 and 1. To make use of the caching mechanism, only steps of 0.1 were permitted in this study.

#### 2.3.2. Objective Function

After a simulation run is executed, the objective function is invoked to calculate the fitness of the given parameter configuration. The objective function must reflect the fact that the unloading and loading process needs to be fast while, at the same time, resources are only added if they are needed. Therefore, the following objective function was developed (based on [37]):

$$fitness = \frac{\overline{t}\_{ship}}{t\_{ship}} \cdot \frac{50 \cdot \#\text{QCs} \cdot \text{util}\_{\text{QCs}} + 5 \cdot \#\text{RTGs} \cdot \text{util}\_{\text{RTGs}} + \#\text{YTs} \cdot \text{util}\_{\text{YTs}}}{50 \cdot \#\text{QCs} + 5 \cdot \#\text{RTGs} + \#\text{YTs}} \tag{1}$$

The left factor of the multiplication reflects the inverted relative makespan of the ship. $t\_{ship}$ is the time used to unload and load the ship, and $\overline{t}\_{ship}$ is a reference makespan: prior to the optimization runs, 100 random samples are drawn from the parameter configuration space, the makespan of the ship is measured for each, and the median of these values serves as $\overline{t}\_{ship}$. This normalization centers the left factor around 1 and ensures that it remains in proportion to the right factor. The shorter the makespan, the larger the left factor becomes.

The right factor of the multiplication is the weighted utilization. #*QCs* refers to the number of QCs, #*RTGs* is the number of RTGs and, therefore, also the number of YBs, and #*YTs* is the number of trucks. $util\_{equipment}$ refers to the ratio of the time for which the equipment has been working to the overall makespan. When summarizing these utilization values into one factor, weights are assigned according to the investment costs. It is assumed that a QC is 50 times more expensive than a truck, and the cost of an RTG is 10% of that of a QC [37]; this yields the weights 50, 5, and 1 in Equation (1). The higher the equipment utilization (with a special focus on expensive equipment), the larger the right factor becomes.
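Equation (1) translates directly into code. The following sketch (with a hypothetical function name) reproduces the two factors, using the median makespan from the preparatory study as the reference value:

```python
def fitness(t_ship, t_ship_ref, n_qcs, util_qcs, n_rtgs, util_rtgs, n_yts, util_yts):
    """Equation (1): inverted relative makespan times cost-weighted utilization.

    t_ship_ref is the median makespan obtained from the 100 random samples
    of the preparatory study; utilizations are ratios in [0, 1].
    """
    relative_makespan = t_ship_ref / t_ship  # > 1 for faster-than-median runs
    weighted_util = (
        (50 * n_qcs * util_qcs + 5 * n_rtgs * util_rtgs + n_yts * util_yts)
        / (50 * n_qcs + 5 * n_rtgs + n_yts)
    )
    return relative_makespan * weighted_util
```

Note that a shorter makespan increases the left factor, while high utilization of the expensive QCs dominates the right factor, as described in the text.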

#### 2.3.3. Structure of Optimization Study

For each scenario (3, 4, 5, and 6 QCs) and meta-heuristic (TPE, BO, SA, and RS), 50 optimization runs are executed. This allows for gaining insights on the reproducibility of the optimization results using meta-heuristics. Each optimization run consists of 50 experiments. For each experiment, 30 simulation runs are executed. The results of each simulation run vary slightly due to stochastic factors. These are implemented by drawing handling times from random distributions, as described in Section 2.1.
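The size of the resulting study follows directly from these counts (a quick arithmetic check):

```python
scenarios = [3, 4, 5, 6]               # number of QCs per scenario
metaheuristics = ["TPE", "BO", "SA", "RS"]
runs_per_combination = 50              # optimization runs
experiments_per_run = 50
simulation_runs_per_experiment = 30    # replications for stochastic factors

experiments = (len(scenarios) * len(metaheuristics)
               * runs_per_combination * experiments_per_run)
simulation_runs = experiments * simulation_runs_per_experiment
print(experiments)       # 40000 experiments in total
print(simulation_runs)   # 1200000 individual simulation runs
```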

#### **3. Results and Discussion**

In the following, the experimental results of the optimization study are analyzed. Then, the solutions found by the different meta-heuristics are compared and discussed.

#### *3.1. Preparatory Study*

The objective function (see Equation (1)) requires $\overline{t}\_{ship}$, the median of $t\_{ship}$. The population parameter is only known after exhaustive coverage of the parameter configuration space, which must be avoided for an optimization study. As a replacement, a sample estimate must suffice. For this purpose, for each scenario, 100 parameter configurations were sampled randomly, and the corresponding simulation experiments were run. For each of these simulation experiments, the makespan of the ship was recorded. The median, minimum, and maximum of the makespan are noted in Table 1. The values in the median column are used as $\overline{t}\_{ship}$ for the respective scenario.


**Table 1.** Makespan of the ship for 100 randomly drawn experiments.

Table 1 shows that the median and minimum both decrease as the number of QCs increases. The median shows that using 4 QCs instead of 3 results in a reduction of ca. 20% in the makespan. By doubling the number of QCs from 3 to 6, the makespan can be reduced by ca. 57%. This is not true for the maximum, which remains nearly constant at around 160 h. Furthermore, the difference between 4 and 5 QCs is rather small. This can be explained by the fact that in the simulation model, 12 bays of the ship are handled. With 4 QCs, each QC handles exactly 3 bays. With 5 QCs, 3 QCs are responsible for 2 bays each, and 2 QCs are responsible for 3 bays each. Thus, 3 QCs complete the container handling task earlier, but the makespan is determined by the QC that finishes last.
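The bay-allocation argument can be checked with a few lines of code (an illustrative sketch; `bays_per_qc` is a hypothetical helper): when 12 bays are distributed as evenly as possible, the makespan is driven by the QC with the most bays.

```python
def bays_per_qc(n_bays, n_qcs):
    """Distribute bays as evenly as possible over the QCs and return the
    per-QC loads plus the maximum load, which bounds the makespan."""
    base, extra = divmod(n_bays, n_qcs)
    loads = [base + 1] * extra + [base] * (n_qcs - extra)
    return loads, max(loads)

for n_qcs in (3, 4, 5, 6):
    print(n_qcs, bays_per_qc(12, n_qcs))
# With 5 QCs, two QCs still handle 3 bays each, so the bound stays at 3,
# the same as with 4 QCs -- explaining the small difference in Table 1.
```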

#### *3.2. Observations from All Experiments*

In the scope of the optimization study, 40,000 experiments in total were evaluated. For each scenario (3, 4, 5, and 6 QCs) and each meta-heuristic (TPE, BO, SA, and RS), 50 optimization runs were executed, each consisting of 50 experiments. This set of experiments covers many randomly chosen experiments (e.g., RS or initialization phase of any of the meta-heuristics) in addition to experiments that are biased by the manner in which each meta-heuristic works during its search process. This overview, which omits the search process, provides some insights into the characteristics of the simulation model.

For each experiment, among other metrics, the makespan and the utilization of the equipment are measured. The utilization is the arithmetic mean of the working time of all equipment of the respective type. In Figure 6, the utilization of YTs, YBs, and QCs is shown. Due to the larger investment costs, the weighted utilization is closest to that of the QCs. The median utilization of the YTs is the lowest. Since a YT cannot lift a container itself, it must wait for a gantry crane (either QC or RTG) to load or unload it. High utilization of QCs and YBs is only possible if enough YTs are available, which inevitably results in lower utilization on the YT side. As the utilization of YTs is assigned a rather small weight in the fitness function, their lower utilization has little influence on the fitness.

In Figure 7, the difference between the maximum and minimum working times of the YTs for each experiment that used the global assignment policy is depicted. This is an indicator of how effectively the work is shared among YTs. As a general tendency, it can be seen that more QCs lead to less statistical scattering.

**Figure 6.** The utilization of the equipment over all experiments.

**Figure 7.** Time difference between maximum and minimum working times of YTs.

#### *3.3. Approximated Optima*

Both the preparatory study and the first screening of all executed experiments created first impressions of the underlying processes. Within the parameter configuration tree, there are exceptionally low-performing solutions. This raises the following question: Which of the meta-heuristics identified the best parameter configuration? To determine this, for each optimization run, the experiment with the highest fitness is extracted. During optimization, both TPE and BO select the next experiment with the greatest expected improvement. Hence, in contrast to SA, after a phase of exploitation (i.e., minor adjustments), a phase of exploration (i.e., larger changes) can follow. RS is the most extreme example since exploitation is never sought. This, in turn, means that the best result of an optimization run can appear at any position of the sequence of recorded experiments.

In Figure 8, the four meta-heuristics TPE, BO, SA, and RS are compared for each scenario with 3, 4, 5, and 6 QCs. Consistent with the preparatory study, the results for 5 QCs are far worse than those of all other scenarios. Each of the meta-heuristics shows outliers in at least three of the boxplots, which indicates the importance of stochastic influences during the search process. There is no clear ranking among the meta-heuristics. For 3 QCs, BO performs considerably worse than the other three meta-heuristics. These results are similar for 4 and 5 QCs, although with less severity. This is especially interesting since, for 6 QCs, BO produces the highest median. Over the first three scenarios, RS and SA have very similar performances, and for 6 QCs, SA has the worst median, with outliers in both directions. TPE performs very well in the first three scenarios, providing many of the best solutions with few outliers that are substantially worse. For 6 QCs, the median of TPE is much lower than that of BO. However, in two instances, BO arrived at substantially lower-performing solutions.

The differences between meta-heuristics were examined statistically. A significance level of *α* = 0.01 was chosen for the whole study and corrected for each test using Bonferroni correction. To make general statements regarding the meta-heuristics, all scenarios (different numbers of QCs) are agglomerated. The large number of outliers for some of the meta-heuristics precludes the assumption of a normal distribution and requires a nonparametric approach. Hence, a Kruskal–Wallis H test was employed. The test statistic of *H* = 31.303 leads to *p* < 0.005. Hence, the null hypothesis that the four groups stem from a single population is rejected. In a post hoc Nemenyi test for pairwise comparison, only TPE was significantly different from the other meta-heuristics. In other words, BO and SA are not significantly different from RS. The comparison of descriptive statistics (as the reader can approximate from Figure 8) shows the superiority of TPE in this study.
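For illustration, the Kruskal–Wallis H statistic used above can be computed from jointly ranked fitness values as follows. This is a minimal sketch without the tie-correction term in the denominator; the study itself presumably used a statistics library.

```python
from itertools import chain

def kruskal_h(*groups):
    """Minimal Kruskal-Wallis H statistic: rank all observations jointly
    (average ranks for ties) and compare rank sums between groups.
    H = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1)."""
    pooled = sorted(chain.from_iterable(groups))
    ranks = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        ranks[pooled[i]] = (i + 1 + j) / 2  # average rank of the tie block
        i = j
    n = len(pooled)
    rank_sum_term = sum(sum(ranks[x] for x in g) ** 2 / len(g) for g in groups)
    return 12 / (n * (n + 1)) * rank_sum_term - 3 * (n + 1)
```

With four groups, H is compared against a chi-squared distribution with three degrees of freedom; an H of 31.303 then corresponds to a p-value well below 0.005.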

The wide range of approximated optima and the large difference between meta-heuristics are indicators of the complexity of the simulation model. The parameter configurations of all optima are more closely examined in the following. The meta-heuristics always determine that the fixed assignment of YTs to QCs leads to an inferior performance compared with pooling. Furthermore, in all instances, the pairing of each QC with its closest YBs performs better than delivering each container to a randomly chosen YB. In contrast to these parameters, the parameter *Dispatching Weight* shows no clearly interpretable results.

**Figure 8.** The approximated optima for each scenario and each meta-heuristic.

In Figure 9, the frequencies of specific numbers of YBs for 3, 4, 5, and 6 QCs are depicted. A small number of YBs creates a bottleneck in the yard, while having too many YBs leads to a low utilization of each YB as well as longer travel paths for the YTs. The fewer YBs are used, the shorter the traveled paths and the higher the probability that an unloading job and a loading job can be combined. The number of used YBs is often a multiple of the number of QCs. This can be explained by the rather conservative transportation job assignment policy in place, which avoids traffic jams by postponing jobs instead.

In Figure 10, the number of YTs per QC for each scenario is presented. The large number of outliers to the right can be explained by the rather small impact of the number of trucks on the weighted utilization and, therefore, on the objective function.

**Figure 9.** The number of YBs in the set of approximated optima. The parameter value that leads to the best objective function value over all optimization runs is marked with an asterisk.

**Figure 10.** The number of YTs in the set of approximated optima. The parameter value that leads to the best objective function value over all optimization runs is marked with an asterisk.

In this study, as in [24], TPE exhibits the most robust behavior. Of the 200 optimization runs, TPE never returns the worst approximation of the optimum. For some scenarios, other meta-heuristics provide better medians of approximations. At the same time, only TPE is significantly different from RS when the data are agglomerated over all scenarios. These observations provide evidence that TPE is an appropriate approach. For an explicit recommendation, meta-analyses of several publications using the same meta-heuristics are required in the future to accumulate such evidence.

In comparison with [37], this study shows an alternative approach to calibrating the quantity of equipment used in different functional areas of a container terminal. In addition, policies are selected and tuned (if the policy accepts a parameter). The approach of Kotachi et al. [37] requires all parameters to be at least on an ordinal scale, since mutation is defined as changing a parameter one level up or down. For categorical parameters that can take more than two values, there is no such order. For continuous parameters, this mutation makes it necessary to define a step size. The approach presented in this study also discretizes the parameter *Dispatching Weight* to the set {0, 0.1, ..., 1}. However, this is performed only to enable caching to speed up the optimization study. All of the meta-heuristics used support continuous parameters, which might be of interest for future optimization studies.

#### **4. Conclusions**

This study provides an approach to solving integrated decision problems at container terminals. Earlier studies have often approached such problems by using a mathematical model that aims to optimize the schedule of jobs. Depending on the concept, sometimes the schedule is determined hours before the actual execution of a job, which is an appropriate approach in rather deterministic environments. Another common approach in the literature is to define a policy that is evaluated using a simulation study. The design of experiments (e.g., a full-factorial design with a coarse grid) leads to either very large simulation studies or a selection of experiments biased by the researcher's beliefs. These shortcomings of optimization alone and of manually designed large simulation studies are partly overcome by the presented simulation-based optimization approach. This approach uses simulation to evaluate the quality of a given solution, and only in this way can the dynamics of real systems be properly represented. Simulation-based optimization allows for the possibility of illustrating these dynamics, providing an approximated solution to the problem without maintaining a separate mathematical model. Therefore, further decision problems can be integrated into the simulation model and the parameter configuration space with little effort.

The authors showed the transferability of meta-heuristics that originate from the domain of machine learning or have been successfully applied in that area. These methods could be used to optimize discrete event simulation models. A special focus was placed on discrete and continuous parameters that were potentially interdependent. Several optimization runs guided by different meta-heuristics were executed, and with a restricted computational budget, promising parameter configuration ranges were identified. This publication focused on examining the results of different optimization runs. For this purpose, the numbers of QCs, YTs, and YBs were modified during different experiments. At the same time, different dispatching policies, as well as QC-YB assignment policies, were investigated. Furthermore, different allocation policies of YBs were applied. In this study, the approximated optima suggest that the pooling of YTs was preferable to a fixed assignment. Furthermore, a YB assignment close to the QCs was considered better than a random one. By choosing more QCs, the number of bays to be served per QC decreased. Thus, a reduction of the makespan could be achieved. Doubling the number of QCs from 3 to 6 led to a reduction of the makespan by 57%.

Due to the NFLT, it is not clear whether these empirical results can be generalized to future studies that use simulation-based optimization. The applicability of meta-heuristics such as TPE or BO needs to be demonstrated by further optimization studies, potentially with various simulation models, different objective functions, and additional meta-heuristics, or with different fine-tuning of the same meta-heuristic for comparison.

**Author Contributions:** Conceptualization, M.K., A.S., C.J., and N.N.; methodology, M.K., A.S., and N.N.; software, M.K., A.S., and N.N.; validation, M.K.; formal analysis, M.K.; investigation, N.N., M.K.; resources, N.N.; data curation, M.K.; writing—original draft preparation, M.K., A.S., and N.N.; writing—review and editing, A.S., C.J.; visualization, N.N., M.K.; supervision, C.J.; project administration, M.K. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Data Availability Statement:** The data presented in this study are openly available in zenodo at 10.5281/zenodo.4473251.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Abbreviations**

The following abbreviations are used in this manuscript:


#### **References**

