**1. Introduction**

Disaster managemen<sup>t</sup> remains a crucial challenge for the international community. National and supranational institutions and non-governmental organizations have to deal with high impact disasters such as the 2010 Haiti earthquake, the 2013 Typhoon Haiyan or the 2018 Sulawesi earthquake and tsunami. This global concern is reflected in the scientific literature and specifically in the Operations Research community, where the increasing interest on humanitarian logistics and disaster managemen<sup>t</sup> entails the publication of surveys [**? ?** ], books [**? ?** ], special issues [**? ?** ] and journals (such as the Journal of Humanitarian Logistics and Supply Chain Management).

Within the disaster managemen<sup>t</sup> cycle, the response phase includes the actions to be performed once the disaster has occurred. Research on the response phase may focus on specific types of disasters, such as earthquakes [**? ?** ] or floods [**?** ], or can address particular problems for generic disasters, such as location-allocation [**?** ], evacuation [**?** ], or debris managemen<sup>t</sup> [**?** ]. Among the operations to be carried out after a disaster strikes, the distribution of aid to the affected people plays an important role. Along with the usual objectives in commercial logistics, mainly the operation cost, in humanitarian logistics other attributes have to be considered [**? ?** ], such as the response time, the equity of the distribution, etc. Therefore, multi-criteria models are necessary to face many realistic problems in this context. **?** ] provide a review on multi-criteria decision making in humanitarian aid classifying the existing models as deterministic or stochastic and pre-disaster or post-disaster.

Many times the distribution operations must be performed under unsecure conditions [**?** ]. Assaults to relief vehicles are sadly frequent and this complicates the work of organizations that execute the aid missions. Avoiding as much as possible the most dangerous roads and forcing the vehicles to travel together forming convoys are two ways of improving the security of the operations.

In this paper, we develop an algorithm based on Ant Colony Optimization (ACO) metaheuristic to solve the multi-criteria last mile distribution problem in an unsecure environment stated in [**?** ]. The complexity of this problem makes it necessary to use a heuristic approach to solve realistic cases. Besides, due to the strong links between the elements of a solution, it is difficult to develop heuristics that perform progressive alterations of a solution while ensuring feasibility, as it is the case of evolutionary algorithms or local search heuristics. Instead, algorithms based on construction of solutions, such as Greedy Randomized Adaptive Search Algorithm (GRASP) or Ant Colony Optimization metaheuristics, fit much better.

The ACO methodology that we propose in this work presents some novelties that make it suitable for routing and distribution problems in which the vehicles must travel in convoys. The use of convoys is appropriate not only in humanitarian operations, but in any logistics operation performed in an environment of insecurity, in which attacks on delivery vehicles to steal the commodities are possible. The algorithm considers two types of ants—vehicles and aid kits—four types of standard pheromones, and especially the use of effective pheromones, updatable in the course of each solution building process. This new concept allows a better balance between the elements of a solution and a higher diversification in the set of solutions that can be built. Effective pheromones may be applied to a wide variety of problems, especially vehicle routing problems involving a large number of vehicles. The results obtained show that the new ACO Algorithm proposed in this paper is a useful decision aid tool. The ACO Algorithm provides good quality solutions in a short time in realistic test cases, frequently improving the solutions obtained with the GRASP Algorithm developed in [**?** ].

The rest of this paper is organized as follows. Section **??** reviews the relevant literature on multi-criteria optimization for the last mile distribution problem, as well as some applications of Ant Colony Optimization in disaster management. Section **??** presents the problem of transportation for last mile distribution, including the data notation and the description of the six attributes considered. Section **??** introduces the heuristic method proposed to solve the problem. An Ant Colony Optimization algorithm is developed together with the subordinate procedure that guides the construction of feasible solutions. Section **??** is devoted to showing and analyzing the results obtained when applying the proposed metaheuristic to solve two test cases based on real disasters. Finally, Section **??** draws some conclusions derived from this work.

#### **2. Literature Review**

Some of the most significant papers that apply multi-criteria techniques in humanitarian last mile distribution are mentioned below and summarized in Table **??**. **?** ] develop a multicommodity mixed integer programming model in order to maximize the total population covered and minimize the total travel time for earthquake response. **?** ] present a multicommodity, multimodal and multiperiod last mile distribution problem and establish a scalar objective function that aggregates minimization of transportation costs and maximization of demand covered. **?** ] consider three attributes (reliability of the transporting tours, coverage and total travel time) in a distribution problem where infrastructure may be damaged in a post-disaster situation. **?** ] use the scalarization approach to deal with a three-objective (total unsatisfied demand, total travel time and equity of the distribution) integer programming model that allows split deliveries. Two heuristic methods are proposed to solve the model. **?** ] propose a robust stochastic compromise programming model for disaster relief logistics, minimizing the sum of the expected value and the variance of the total cost and minimizing the sum of the maximum unsatisfied demands. **?** ] consider four attributes (amount of aid distributed, time of the operation, equity and cost) in a dynamic flow model that is solved via lexicographical goal programming. **?** ] develop a quadratic multi-objective programming model to face an allocation and

distribution problem. Three attributes are considered: lifesaving utility, delay cost and equity of the distribution. This model allows vehicles to visit a node more than once and provides the individual itineraries of the vehicles. **?** ] aggregate the distribution time, the penalty cost of unsatisfied demand and the costs of opening distribution centers into a single objective function for a location-routing problem considering network failure, random travel times and multiple uses of vehicles. **?** ] apply a multi-objective evolutionary algorithm to solve a multiperiod and multicommodity distribution model which aims to minimize the unsatisfied demand and to minimize the risk of choosing damaged roads. **?** ] present a multimodal model considering dynamic and stochastic demands in order to minimize the travel time and the unmet demand. **?** ] develop a biobjective multiperiod stochastic location-transportation model that allows multiple trips to the vehicles not only in different periods, but also in the same period. The authors propose different heuristic techniques to solve the model. Outside the scope of aid distribution, **?** ] address a post-disaster waste managemen<sup>t</sup> problem through a possibilistic programming model that minimizes both the cost of waste processing and the carbon emissions and maximizes the job opportunities.

Despite the frequency with which aid distribution is to be carried out in unsafe conditions, only a few papers consider the security as an attribute. **?** ] integrate seven attributes (quantity distributed, time of response, cost, equity, priority, reliability and security) into a linear lexicographical goal programming flow model. This model is multimodal, multidepot, allows split deliveries and vehicles travel in convoys so that they can be escorted. An alternative version of this model is established in **?** ], where the authors set a fixed amount to distribute and combine the six remaining attributes into a non-lexicographical model. Nevertheless, these models do not provide the individual route of each vehicle, so it is not possible to easily implement plans for the relief distribution. **?** ] propose a GRASP metaheuristic to solve a nonlinear multi-objective model that allows vehicles to visit a node more than once and provides the individual itineraries of the vehicles. The mathematical formulation of this model is established in **?** ], where an extensive comparison of the characteristics of the models is made with other studies that deal with the last mile distribution problem.


**Table 1.** The literature related to multi-criteria techniques in humanitarian last mile distribution.

In this study, we propose an Ant Colony Optimization methodology to solve the model developed in [**?** ] and [**?** ]. Even though ACO has been applied frequently to different routing and distribution problems [**????** ], it has barely been used in the context of disaster management. The most remarkable work is **?** ], where the authors apply this metaheuristic to solve a distribution and evacuation problem. More recently, **?** ] combine different techniques, including ACO, to face a location-allocation problem

of relief centers after an earthquake. Therefore, this paper presents the first application of Ant Colony Optimization to a multi-criteria last mile distribution problem. Furthermore, the proposed ACO methodology introduces novelties such as specialized ants, representing both vehicles and aid kits, and effective pheromones, that conveniently balance the elements of the built solutions.

#### **3. Problem Description**

The last mile distribution problem addressed in this paper consists in deciding how to deal out a predetermined amount of humanitarian aid from distribution centers to populations in need by moving a set of vehicles through a transportation network. We consider that the distribution is performed in an insecure and uncertain environment and, as a consequence, the convoys transporting the aid may be attacked while traveling and some roads may be unavailable due to damages produced by the disaster.

Table **??** lists the notation used to represent the main elements of the problem. The transportation network is formed by nodes and arcs. Each node is of one of three types: supply, demand or connection. A supply node represents a distribution center where a certain known amount of aid is available, while the demand nodes represent populations in need or help centers. Each demand node is associated with a certain amount of aid that is desirable to receive and a priority level. A connection node represents an intermediate point, where there is neither available nor demanded aid, but may be used by the vehicles for aid trans-shipment.


**Table 2.** Problem notation.

The arcs represent links (roads, pathways, streets, etc.) connecting the nodes. The links that can be traversed in both directions have two associated arcs, one for each direction. Each arc has associated a length, a speed, an assault probability and a probability of being available (the arc can be used).

The vehicles are initially located at some nodes, which are not necessarily supply nodes. We will call depots to the nodes where there are vehicles at the beginning of the operation. There are different types of vehicles to carry out the distribution. Vehicles of the same type have the same capacity, velocity and costs. For technical reasons, we classify the vehicles in classes: vehicles belonging to a same class are of the same type and are located also in the same depot.

The humanitarian aid is packed in kits composed of water, food, medicines, blankets, etc. The operation to be performed consists of delivering *qglobal* of such kits.

In addition, the design of the routes must be carried out under the following assumptions:

− Vehicles can visit a node more than once. Likewise, they can pick up or deliver humanitarian aid more than once, even in the same node.


As usual in humanitarian logistics, this problem is intrinsically multi-criteria. In this case, we consider six different attributes that will be incorporated into the model as objectives to be minimized. In what follows, a description of the attribute measures used in the model is provided; however, since they are not necessary for the understanding of the paper, we decided to omit the mathematical formulation. The interested reader can see [**?** ] for further details.


$$idp = \frac{q \, glabal}{\sum\_{i \in I\_D} dem\_i} \tag{1}$$

The equity measure is defined as the square root of the quadratic deviation of the proportions of satisfied demand from the ideal demand proportion over all demand nodes.


The objectives can be minimized separately or in an aggregated function as in (**??**), where weights *wg*, *g* ∈ *G* represent the preferences of the decision maker and *fg* is the objective function associated with the attribute *g*.

$$f = \sum\_{\mathcal{S} \in G} w\_{\mathcal{S}} f\_{\mathcal{S}} \tag{2}$$

For obtaining solutions that take all attributes into account in a balanced way, we apply compromise programming (CP), introduced in **?** ]. CP is a multi-criteria decision making technique which consists in minimizing the distance to the ideal point. The components of the ideal point are the optimal values of the attributes when optimized independently. To normalize the attributes, the difference between the ideal and the anti-ideal is used, where the components of the anti-ideal point are the worst values for the attributes on the set of non-dominated solutions.

The CP objective function when using the *p*-norm is denoted as ˜ *fp* and defined as follows:

$$\tilde{f}\_p = \left[ \sum\_{\mathcal{S}' \in G} w\_{\mathcal{S}} \left( \frac{f\_{\mathcal{S}} - i\_{\mathcal{S}}^+}{i\_{\mathcal{S}}^- - i\_{\mathcal{S}}^+} \right)^p \right]^{\frac{1}{p}} \tag{3}$$

Again, *wg*, *g* ∈ *G* represent the preferences of the decision maker, while *i*+*g* and *i*−*g* are the ideal and the anti-ideal components associated with attribute *g*, respectively.

#### **4. Heuristic Approach**

In this section we present an algorithm based on Ant Colony Optimization, that was introduced in **?** ]. The ACO metaheuristic emulates the way in which the ants of a colony manage to find the shortest path from the anthill to a source of food through the pheromone trails deposited by the ants when they move. Extensive information about ACO metaheuristic and its multiple variants can be found in [**???** ]. The metaheuristic algorithm that we propose in this paper takes some elements of the Ant Colony System (ACS) developed in **?** ] to establish a multiple-ant colony system variant. Section **??** is devoted to the main program of the metaheuristic (ACO Algorithm), and Section **??** to the subordinate procedure that performs the construction of solutions (Pheromone Trail Constructive Algorithm or PTC Algorithm).

Table **??** shows the main parameters and variables that are used in the metaheuristic. According to the ACS, two types of pheromone updates are made. The global update is performed every time *m* consecutive solutions are obtained and is intended to improve the construction of future solutions. The local update is performed after each single solution is obtained, and its purpose is to diversify the construction. Parameters *ρg* and *ρl* are the corresponding evaporation rates. Our algorithm uses two types of ants, representing vehicles and humanitarian aid kits, and four types of pheromones, that are deposited by the vehicles or by the aid kits in different situations. First type (respectively second type) pheromones are deposited by the vehicles (aid kits) when moving through the arcs, third type pheromones are associated with the amounts of aid leaving supply nodes, and fourth type pheromones correspond to the amounts of aid remaining at the nodes at the end of the operation. *<sup>τ</sup>*1*c*,*k*, *τ*2*k* , *τ*3*i* and *τ*4*i* are the variables that indicate the quantities of pheromone of the four types that have been deposited at an arc *k* or at a node *i*.

The construction of solutions takes into account both pheromones and visibility. Pheromones indicate how often each element candidate to be added to the current partial solution was selected by previous ants, while visibility, established as greedy functions, indicates how good an element could be according to each of the objectives considered.

The PTC Algorithm builds new solutions by adding elements iteratively. As a result, the values of the pheromones associated with the added elements should be decreased to prevent an excessive concentration of these elements in the solution. This is achieved through what we called *effective* pheromones, a new concept that refers to variables that at the beginning of the execution of the PTC Algorithm take the same values as the corresponding pheromones variables, but which will be updated during the construction of the solution.


#### *4.1. Ant Colony Optimization Algorithm*

In this subsection we introduce the ACO Algorithm, whose pseudocode is given in Algorithm **??**. Its main steps are explained below.

#### **Algorithm 1** ACO algorithm

1: Perform preprocess.

```
2: Create a solution S∗ with the PTC Algorithm.
```

```
3: for it = 2, ··· , n do
```

```
14: Output S∗.
```
**Preprocess.** It comprises several tasks needed before starting the construction of solutions:


**Pheromone update.** The local update increases the pheromones of the elements that appear less frequently in the previous iteration in order to diversify the construction, as stated in Equations (**??**)–(**??**).

$$
\tau\_{c,k}^1 = (1 - \rho\_l)\tau\_{c,k}^1 + \frac{\rho\_l}{1 + vt\_{c,k}} \quad \forall c, \; \forall k \tag{4}
$$

$$
\pi\_k^2 = (1 - \rho\_l)\tau\_k^2 + \frac{\rho\_l}{1 + lt\_k} \quad \forall k \tag{5}
$$

$$
\pi\_i^3 = (1 - \rho\_l)\pi\_i^3 + \frac{\rho\_l}{1 + qav\_i - fs\_i} \quad \forall i \in I\_S \tag{6}
$$

$$
\tau\_i^4 = (1 - \rho\_l)\tau\_i^4 + \frac{\rho\_l}{1 + fs\_i} \quad \forall i \in I\_D \cup I\_S \tag{7}
$$

In the same way, the global update increases the pheromones of the elements that appear more in the current best known solution in order to guide the construction towards potentially better solutions, as stated in Equations (**??**)–(**??**). The auxiliary parameter *clc* in (**??**) denotes the number of vehicles of class *c*, while the asterisk in *vt*<sup>∗</sup> *<sup>c</sup>*,*k*, *lt*∗ *k* and *f s*<sup>∗</sup> *i* indicates that these auxiliary variables correspond to the current best known solution.

$$
\tau\_{\mathfrak{c},k}^1 = (1 - \rho\_{\mathfrak{F}}) \tau\_{\mathfrak{c},k}^1 + \frac{v t\_{\mathfrak{c},k}^\*}{c l\_{\mathfrak{c}}} \quad \forall c \; \forall k \tag{8}
$$

$$
\pi\_k^2 = (1 - \rho\_\mathcal{g})\pi\_k^2 + \frac{l t\_k^\*}{q g l 
abla l} \quad \forall k \tag{9}
$$

$$
\tau\_i^3 = (1 - \rho\_\mathcal{S})\tau\_i^3 + \frac{qav\_i - fs\_i^\*}{qav\_i} \quad \forall i \in I\_\mathcal{S} \tag{10}
$$

$$\tau\_i^4 = \begin{cases} (1 - \rho\_\mathcal{S}) \tau\_i^4 + \frac{fs\_i^\*}{qav\_i} & \forall i \in I\_\mathcal{S} \\\ (1 - \rho\_\mathcal{S}) \tau\_i^4 + \frac{fs\_i^\*}{dem\_i} & \forall i \in I\_D \end{cases} \tag{11}$$

The elements that are taken into account in both updates are the following: for the first type pheromones, the class-arc pairs (**??**) and (**??**); for the second type pheromones, the amounts of transported aid in the arcs (**??**) and (**??**); for the third type pheromones, the amounts of aid that have come out from the supply nodes (**??**) and (**??**); and for the fourth type pheromones, the amounts of aid that remain in the supply and demand nodes at the end of the distribution (**??**) and (**??**).

**Increase** *λ***.** Variable *λ*, which controls the weight of pheromones and visibility, grows over the course of the ACO Algorithm so that the importance of pheromones in the construction is increasing. The pheromones weight is updated according to (**??**), where *it* is the current iteration. The exponential expression makes the increase of *λ* to be progressively slower.

$$
\lambda = 1 - \sigma\_1 e^{-\sigma\_2 \frac{\mu}{n}} \tag{12}
$$

#### *4.2. Pheromone Trail Constructive Algorithm*

The pseudocode of PTC Algorithm is shown in Algorithm **??**. Lines 1 and 2 are the constructive steps, where pheromone and greedy functions are introduced in order to guide the solution building process. Lines 3 to 12 coordinate the elements of the current solution and determine if it is feasible. It must be taken into account that, as all vehicles travel in convoys through the arcs, the solution must satisfy the precedence relation induced on the set of arcs. Lines 13 to 17 intend to improve the solution with the help of pheromone functions.

The solution is built iteratively by adding or removing elements, being the nature of the elements dependent on the algorithm step. An element can be, for example, a trip of a vehicle or a demand node where to drop an aid kit. If *ζ* is a candidate element to be added, *<sup>p</sup>hζ* and *grζ* are, respectively, the effective pheromone and the greedy function associated to it. As a result, the score of element *ζ* is calculated as stated in (**??**).

$$sc\_{\mathbb{S}} = \lambda ph\_{\mathbb{S}} + (1 - \lambda)gr\_{\mathbb{S}} \tag{13}$$

The reason why this expression is additive rather than multiplicative—as usual in ACO—is because, in this way, the gradual increase of the pheromone weight can be more accurately controlled.

#### **Algorithm 2** PTC Algorithm


The greedy function of a candidate element *ζ* is stated in (**??**), where *gζ* is the individual greedy function of objective *g*, *wg* is the weight of objective *g* in the objective function and *bg* is an upper bound for *gζ* .

$$\log r\_{\mathbb{S}} = \sum\_{\mathbb{S}^{\xi} \in G} w\_{\mathbb{S}} \left( \frac{b^{\mathbb{S}} - \mathfrak{e}\_{\xi}^{\mathbb{S}}}{b^{\mathbb{S}}} \right) \tag{14}$$

The greedy functions are used only where it is possible to define them in a reasonable way. When there are no greedy functions, the score is simply *scζ* = *<sup>p</sup>hζ* if the decision regards which element should be added to the partial solution (it is convenient to select elements with more pheromone) or as stated in (**??**) if the decision regards which element should be removed from the partial solution (it is convenient to select elements with less pheromone).

$$\text{sc}\_{\%} = \frac{1}{p h\_{\%}} \tag{15}$$

In all steps of the PTC Algorithm, the elements are chosen according to the criterion established in ACS: if *q* ≤ *q*0, where *q* is a random variable uniformly distributed over [0, 1], the candidate element with the highest score is chosen; otherwise, the election is made according to the probabilities stated in (**??**), which are proportional to the scores.

$$pr\_{\tilde{\zeta}} = \frac{\text{sc}\_{\tilde{\zeta}}}{\sum\_{\tilde{\zeta}} \text{sc}\_{\tilde{\zeta}}} \tag{16}$$

The main steps of the PTC Algorithm are described in detail in what follows. **Design itineraries.** The routes to be followed by the vehicles are built iteratively, starting from an empty solution where all the vehicles are parked at their corresponding depots. At each iteration, the candidate elements to be added to the partial solution are the feasible trips of every single vehicle *j* through an available arc *k*. Therefore, each element is a pair *ζ* = (*j*, *k*). The effective pheromone associated to each element is defined in (**??**), where *c* is the vehicle class corresponding to vehicle *j*.

$$
\mu h\_{\zeta} = \left(1 - \frac{\upsilon t\_{c,k}}{c a r\_c}\right) \tau\_{c,k}^1 \tag{17}
$$

On the one hand, the effective pheromone is proportional to the first type pheromone, *τ*1 *<sup>c</sup>*,*k*, which depends on the vehicles of class *c* traversing arc *k* in the previously obtained solutions. In particular, the more vehicles of class *c* transit through this arc in the current best known solution, the more effective pheromone candidate element (*j*, *k*) has. On the other hand, the effective pheromone depends on the number of vehicles of class *c* traversing arc *k* in the current partial solution, so the greater *vtc*,*<sup>k</sup>* is, the less effective pheromone candidate element (*j*, *k*) has. If this gradual adjustment of the effective pheromone is not performed, it could lead to an excess of vehicles of a class through an arc for those pairs (*c*, *k*) with higher first type pheromone. This eventual excess of elements in a solution justifies the use of the effective pheromone and it also applies to other steps of the PTC Algorithm.

All the greedy functions used in the PTC Algorithm are defined in the same way as in the Elite Set Constructive Algorithm (ESCA) described in [**?** ]. For example, the individual greedy function of the time objective is the time that takes vehicle *j* to traverse arc *k* (**??**), while the individual greedy function of the reliability objective is the probability that arc *k* is not available (**??**).

$$\varrho\_{\zeta}^{t} = \frac{dist\_{k}}{\min\{vela\_{k'}, velv\_{j}\}} \tag{18}$$

$$
\varrho\_{\mathbb{Q}}^f = 1 - r\_k \tag{19}
$$

The score of each element is calculated as stated in (**??**). Once an element (*j*, *k*) is chosen, arc *k* is added to the route of vehicle *j* and the associated precedence relations are updated, in order to ensure that all vehicles travel together in convoys. The construction process is iterated until the routes cannot be continued any longer.


The augmenting paths from the source to the sink are determined iteratively, node by node, as follows:

• If the current node is the source, the candidate elements are the supply nodes with available aid. The effective pheromone of each supply node, *ζ* = *i*, is stated in (**??**), where *f lsc*,*<sup>i</sup>* is the current flow from the source to the supply node *i*, i.e., the amount of aid leaving the supply node *i* in the current partial flow solution. The effective pheromone depends positively on the third type pheromone (which is related to the amount of aid leaving supply node *i* in the previously obtained solutions) and negatively on the amount of aid leaving *i* in the current solution.

$$
\mu l \eta\_{\tilde{\varsigma}} = \left( 1 - \frac{f l\_{\kappa \jmath j}}{q a v\_i} \right) \tau\_i^3 \tag{20}
$$

No greedy function is defined in this case, so the score of each supply node *i* is equal to the corresponding effective pheromone.

•If the current node is a node *i*0 different from the source, a candidate element is a node *i* linked to *i*0 through an arc *k* = (*<sup>i</sup>*0, *i*) with positive residual capacity. The effective pheromone of each candidate node, *ζ* = *i*, is calculated as stated in (**??**). It depends positively on the amount of aid transported through arc *k* in the previously obtained solutions and negatively on the amount of aid transported through arc *k* in the current solution.

$$
\mu \text{pl}\_{\mathbb{V}} = \left(1 - \frac{f l\_{i0,i}}{q \text{g} l \text{obal}}\right) \text{r}\_{k}^{2} \tag{21}
$$

The individual greedy function of the time objective is the arriving time of the convoy associated to arc *k*, the individual greedy function of the cost objective is the average cost of transporting an aid kit by arc *k* among the vehicles of the associated convoy, etc.

If the current node *i*0 is a demand node, in addition to the other candidates mentioned earlier, the sink *sn* must be considered as a candidate element as well. The selection of the sink would finish the augmenting path. Its effective pheromone (**??**) depends positively on the amount of aid remaining at demand node *i*0 at the end of the operation in the previously obtained solutions and negatively on the remaining aid in the current solution.

$$
\mu h\_{sn} = \left(1 - \frac{f l\_{i\_0 sn}}{dem\_{i\_0}}\right) \tau\_{i\_0}^4 \tag{22}
$$

The individual greedy function of the equity objective associated to the sink measures how the deviation of the proportion of satisfied demand at node *i*0 from the ideal demand proportion would vary if the sink is selected; the individual greedy function of the priority objective is obtained in a similar way; and the individual greedy functions of the other objectives take the value 0, because finishing the augmenting path does not worsen those objectives.

Finally, the scores are established according to (**??**).

Each time a node is selected and labeled, the current node is updated. Once an augmenting path is determined, the flow is updated according to the Ford-Fulkerson algorithm. The process is iterated until all the planned amount of aid is distributed from the source to the sink or it is not possible to find an augmenting path, in which case the current solution would be discarded and the algorithm would be restarted at Step 1.


The augmenting paths from the source to the sink are determined in a similar way as in the Aid distribution step:

• If the current node is the source, the candidate elements are the improvable nodes. The effective pheromone of each improvable node, *ζ* = *i*, is stated in (**??**). The effective pheromone depends positively on the final stock at node *i* in the previously obtained solutions and negatively on the final stock at node *i* in the current solution.

$$
\mu h\_{\tilde{\varsigma}} = \left(1 - \frac{fs\_i}{dem\_i}\right) \tau\_i^4 \tag{23}
$$

The score of each improvable node *i* is equal to the corresponding effective pheromone.

• If the current node is a node *i*0 different from the source, a candidate element is a node *i* linked to *i*0 through an arc *k* = (*<sup>i</sup>*0, *i*) with positive residual capacity. The effective pheromone of each candidate node, *ζ* = *i*, is calculated as stated in (**??**). It depends positively on the amount of aid transported through arc *k* in the previously obtained solutions and negatively on the amount of aid transported through arc *k* in the current solution.

$$
\eta h\_{\tilde{\zeta}} = \left(1 - \frac{l t\_k - f l\_{i\_0, i}}{1 + q g 
delta l}\right) \tau\_k^2 \tag{24}
$$

Since we are interested in removing needless flow, the score of each improvable node *i* is inversely proportional to the effective pheromone, as established in (**??**). Note that the unit added in the denominator of the fraction in (**??**) prevents the effective pheromone from being 0.

If the current node *i*0 is a demand node linked to the sink, the sink must be considered as a candidate element as well. Its effective pheromone (**??**) depends positively on the final stock at demand node *i*0 at the end of the operation in the previously obtained solutions and negatively on the final stock at *i*0 in the current solution.

$$
\eta h\_{sn} = \left(1 - \frac{sf\_{i\_0}}{1 + dem\_{i\_0}}\right) \pi\_{i\_0}^4 \tag{25}
$$

The score of the sink is established according to (**??**) as well.

Each time an augmenting path is obtained, the flow is updated according to the Ford-Fulkerson algorithm. The final stock at each demand node *i*, *s fi*, and the amount of aid transported through each arc *k*, *ltk*, are updated as well. The algorithm stops when the current flow cannot be increased further.

**Adjust supply nodes.** The purpose of this step is to eliminate as much incoming flow of aid as possible at supply nodes in which there is aid available at the end of the operation. Once again, a variation of the maximum flow problem is solved. The auxiliary transportation network includes all the nodes and the used arcs, which are now reversed because we try to return flow back from the supply nodes. All supply nodes are linked to the source and to the sink by dummy arcs. The capacity of each dummy arc leaving the source is the minimum between the amount of load arriving to the corresponding supply node and the final stock at it. The capacity of each dummy arc reaching the sink is the amount of aid available initially at the corresponding supply node. The capacity of each remaining arc is equal to the amount of aid transported through it (in opposite direction).

The calculation of the effective pheromones is as follows:

• If the current node is the source, the candidate elements are the supply nodes. The effective pheromone of each supply node, *ζ* = *i*, is stated in (**??**). The effective pheromone depends positively on the final stock at node *i* in the previously obtained solutions and negatively on the final stock at node *i* in the current solution.

$$\text{pl}\_{\mathbb{V}} = \left(1 - \frac{fs\_i}{1 + qav\_i}\right) \tau\_i^4 \tag{26}$$

Since we want part of the available (but unused) aid to leave the supply node and, therefore, the final stock decreases, the score of each supply node *i* is inversely proportional to the corresponding effective pheromone.


$$
\mu \text{pl}\_{\zeta} = \left(1 - \frac{v t\_{c,k}}{1 + c a r\_c} \right) \tau\_{c,k}^{1} \tag{27}
$$

Since we want to trim the routes, the scores are inversely proportional to the corresponding effective pheromones. Once an element (*j*, *k*) is chosen, arc *k* is removed from the route of vehicle *j* and the variable *vtc*,*<sup>k</sup>* is updated.


#### **5. Computational Study**

#### *5.1. Implementation Details and Calibration*

The algorithms introduced in the previous sections have been implemented in Fortran 95. The results presented in this section have been obtained by executing the algorithms on a personal computer with an Intel Core i5-2450M processor with 2.5 Ghz and 8Gb RAM running Windows 7.

Before applying the ACO Algorithm to our test cases, it is necessary to tune several parameters, namely the number of solutions obtained between each couple of global updates ( *m*), the probability of choosing the element with the highest score in any of the decision situations (*q*0), the local and global evaporation rates (*ρl* and *<sup>ρ</sup>g*) and the two parameters which determine how to update the weight of the pheromone and the greedy functions in the constructive process ( *σ*1 and *σ*2). We randomly generated a set of five small test instances, with 10 to 17 nodes, 29 to 41 arcs and 20 to 30 vehicles, which allow us to do a large number of executions in order to obtain a set of parameters that makes the ACO Algorithm work well on realistic cases.

The calibration consists of two stages. In the first stage, the parameters were taken in three pairs, (*<sup>m</sup>*, *q*0), (*ρl*, *<sup>ρ</sup>g*) and (*<sup>σ</sup>*1, *<sup>σ</sup>*2), and for each one, the ACO Algorithm was executed on all test instances, varying the values of the two chosen parameters in a range of between ten and twenty values, while the other four parameters were fixed to arbitrary values. With the results obtained in this first stage we established a new assignment of values.

In the second stage of the calibration, we took the parameters one by one and executed the ACO Algorithm 100 times on each instance, to obtain 2000 solutions in each execution, varying the value of the chosen parameter and fixing the values of the other five parameters according to the assignment established after the first stage. We collected the average of the best objective function value over the one hundred executions and over the five test instances, that is represented in the vertical axis of the figures included in this subsection, while the values tested for the parameter are represented in the horizontal axis. Each execution required an average time of 1.5 s of computation.

To calibrate the number of solutions obtained between each couple of global updates, we varied *m* = 0, 1, ... , 10. In the literature, *m* = 5 or *m* = 10 have been commonly used and our results did not contradict it, so we decided to set *m* = 5, which requires less computation time.

In the same way, the probability of choosing the element with the highest score is usually set to a high value in the literature; however, we found that *q*0 = 0.3 was the best value in our experiments, as shown in Figure **??** (note that the values higher than 0.5 have been rejected in the first stage of the calibration). This means that in 30% of the decision situations the candidate with the highest score will be selected directly, whereas in the remaining 70% the selection will be made at random, where the probability of each candidate is proportional to its score.

**Figure 1.** Calibration of the probability of choosing the element with the highest score.

Figure **??**a,b shows the results obtained for the evaporation rates. According to those, we set *ρl* = 0.02 and *ρg* = 0.04. Since both are small values, the pheromone trails will be very persistent, especially in the case of local updates, which are performed more frequently than the global ones.

Finally, we varied *σ*1 = 0, 0.05, ... , 0.5 and *σ*2 = 0, 1, ... , 10. The results led us to set *σ*1 = 0.2, meaning that at the start of the metaheuristic the weight of the greedy functions is 0.2, and *σ*2 = 5.

**Figure 2.** Calibration of the evaporation rates.

#### *5.2. Description of the Test Cases*

The proposed ACO Algorithm has been tested on two realistic test cases with different characteristics: Haiti earthquake 2010 and Niger famine 2005.

The first test case is based on the earthquake that devastated Port-au-Prince, Haiti's capital, and its metropolitan area in 2010. It was introduced in [**?** ], and it is very appropriate to illustrate the problem, since many of the characteristics that define it are present here. It develops mainly in an urban environment, but the effects of the disaster produced a grea<sup>t</sup> uncertainty about the transitability of many streets. Besides, aid operations were carried out in an environment of insecurity, with frequent attacks to convoys.

The logistic map, illustrated in Figure **??**, corresponds to the situation of the Port-au-Prince area fifteen days after the main earthquake. The network consists of 24 nodes and 84 arcs, corresponding to 42 two-way links. Nodes 1, 2 and 3 are supply nodes, with 60, 80 and 140 tons of humanitarian aid available, respectively. Nodes 10, 12, 13, 16, 17, 18, 20, 21 y 22 are demand nodes, with demands of 30, 40, 30, 30, 10, 30, 40, 20 y 20 tons of aid, respectively. Of these demand nodes, node 13 has special priority. The remaining 12 nodes are connection nodes and can be used to transship load.

The links have been classified by their quality (according to the thickness of the line: the thicker the line, the higher the speed) and by their reliability (according to the color scale shown in the figure). The arcs also have different ransack probabilities.

To perform the distribution operations there are 300 vehicles, classified in three different types according to their velocities, costs and capacities. There are 70 large vehicles, 95 medium size vehicles, and 135 small vehicles, which can carry 3, 2 and 1 tons of aid, respectively. The vehicles are located at the supply nodes and at the connection nodes 4, 6 and 18. The purpose of the operation is to deliver 150 tons of aid among the demand nodes.

The data for the Haiti test case come from various documents available in January 2010: a logistic map provided by the United Nations Office for the Coordination of Humanitarian Affairs (OCHA) [**?** ], a map reporting the satellite-identified IDP concentrations, road and bridge obstacles in central Port-au-Prince, Haiti, created by the United Nations Institute for Training and Research (UNITAR), and a map focused only on the Port-au-Prince road conditions created by the World Food Programme (WFP) Emergency Preparedness and Response [**?** ].

**Figure 3.** Transport network at Port-au-Prince.

The second test case, introduced in **?** ], is based on the 2005 Niger food crisis. The logistic network (see Figure **??**) consists of two supply nodes (Niamey and Gaya cities), two demand nodes (Agadez and Zinder), three connection nodes (Dosso, Tahoua and Maradi) and 11 two-way roads. There are a total of 376 vehicles available of three different types located at the supply and connection nodes to distribute 500 tons of humanitarian aid. The data for this test case come from the report of the real operation developed by the Spanish organization Fondo de Ayuda Humanitaria y de Emergencias de Farmamundi (FAHE) [**?** ] and public websites such as Google Maps.

**Figure 4.** Transport network at Niger.

Both test cases are explained in detail in [**?** ] and their data are available on the website: www.mat.ucm.es/humlog.

## *5.3. Results*

In this section we will compare and analyze the results obtained when applying the ACO Algorithm, proposed in this paper, and the GRASP Algorithm, introduced in [**?** ], to the Haiti and Niger test cases. Furthermore, we compare as well the best solutions obtained by both algorithms with the optimal values for these test cases when they are available. These optimal values were obtained by solving to optimality relaxed flow models [**? ?** ].

Firstly, we present the results obtained in the Haiti test case. Both algorithms—ACO and GRASP—have been executed ten times minimizing each objective independently and ten more times minimizing the aggregated function with equal weights for all attributes. The parameters of the algorithm were set according to the results of Section **??**. A fixed computation time of 1500 s has been set for each of the 140 executions—two algorithms, seven objectives per algorithm and ten runs per objective.

Table **??** shows the computational results obtained with the ACO Algorithm and with the GRASP Algorithm in the Haiti test case. For each algorithm, the objective values of the best and the worst solutions among the ten executions are presented, as well as the average and the standard deviation of the objective values of the ten solutions. In this test case the operation time is given in minutes, the cost in dollars, and the security and the reliability in expected lost tons. We can see that the ACO Algorithm provides a better best solution when minimizing time, cost and security attributes, although for the cost the average is worse than with GRASP. For the reliability attribute, both algorithms provide the same best result, but this best is obtained in all GRASP executions—standard deviation is equal to zero—and only in two ACO executions. Both algorithms provide optimal solutions for the priority attribute in all executions and ACO performs worse than GRASP for the equity and for the aggregated function. The ACO algorithm provides around 10,000 feasible solutions on average while the GRASP algorithm provides 25,000. Nevertheless, this substancial difference does not translate into a proportional improvement in performance.


**Table 4.** Results obtained with ten executions per objective of GRASP and ACO algorithms in Haiti earthquake 2010.

Table **??** shows the best values obtained among all tests performed during the experimentation—not only the ten executions of the previous table—in the Haiti test case, together with the optimal values for the attributes in which they are known. The optimal values for time and cost were obtained by using the static flow model described in [**?** ], which although has some different assumptions, can serve as a reference for these two attributes that are measured in a very similar way. The ACO Algorithm provides the assumed optimal value for the time attribute and stays very close to the optimum for the cost, substantially improving the results obtained with the GRASP Algorithm. For the priority and the reliability attributes, both algorithms provide the same values, and almost the same for the security, whereas for the equity attribute, the ACO Algorithm performs worse. For the aggregated function, the GRASP Algorithm also works better, probably because of the higher influence

of the equity attribute in the aggregated function, due to its tighter bound in comparison to the other attributes.

Figure **??** depicts the itineraries of the best solutions found for the time and the cost attributes. The numbers on the arcs represent the amounts of aid transported by the vehicles traversing them (a "0" on an arc indicates that an empty convoy is traversing the arc). In the best solution for the time (Figure **??**a), the three demand nodes farther from the supply nodes (18, 21 and 22) do not receive any aid and the distribution is mainly carried out by fast arcs, regardless of their dangerousness or reliability. The transportation is done by small and medium-sized vehicles, because they are faster than the large ones. When the cost attribute is optimized (Figure **??**b), only a few arcs are used and the convoys consist mainly of large vehicles, giving a better cost-capacity ratio. In this solution three demand nodes, including the priority node (13), are not visited.

**Table 5.** Best values obtained with several executions of GRASP and ACO algorithms in Haiti earthquake 2010.


(**a**) Optimize time

(**b**) Optimize cost

**Figure 5.** Best itineraries optimizing time and cost in Haiti earthquake 2010.

Figure **??** shows the solution obtained when minimizing ˜ *f*2 with equal weights. In this case, the ACO algorithm has been executed until 25,000 feasible solutions have been generated. This compromise solution provides reasonably good values for all attributes, as it can be observed in Table **??**, all demand nodes receive some aid and, in particular, the priority node receives 26 out of 30 tons demanded.

**Figure 6.** Best itineraries optimizing ˜ *f*2 in Haiti earthquake 2010.

**Table 6.** Best objective values optimizing *f*2 in Haiti earthquake 2010.

˜


Finally, we summarize the results obtained in the Niger test case. As in the Haiti test case, GRASP and ACO have been executed ten times minimizing each objective independently and ten times minimizing the aggregated function, running 1500 s per execution. Again, Table **??** shows the objective values of the best and the worst solutions among the ten executions for each attribute, together with the average and the standard deviation of the objective values of the ten solutions. The operation time is given in hours, the cost in euros, and the security and the reliability in expected lost tons. Both algorithms provide optimal solutions for the equity and priority objectives in the ten executions. The ACO Algorithm performs slightly better than GRASP for the security and significantly better for the remaining attributes—time, cost and reliability—and also for the aggregated function. The best value obtained with ACO for the cost is only 1.6% higher than the assumed optimal value of the cost attribute, which is EUR 63,240.0, obtained using the static flow model [**?** ]. ACO is significantly better for the cost, the time, the reliability and the aggregated function. Despite the higher speed of GRASP generating feasible solutions in this test case—9000 on average per execution with GRASP versus 6400 with ACO—ACO performance is much better than that of GRASP.

**Table 7.** Results obtained with ten executions per objective of GRASP and ACO algorithms in Niger Famine 2005.


It is remarkable that the number of feasible solutions generated using the same computation time is lower in the Niger test case than in the Haiti test case, even though the logistic network is much smaller in the former. This is because there are other factors that influence the computation time as, for example, the structure of the network, the number and the location of the vehicles, the relation between the amount to be distributed and the total capacity of the vehicles, etc.

The proposed compromise solution for the Niger test case is shown in Table **??**. All attribute values remain reasonably close to the best values obtained by optimizing each objective individually, except the value for the priority objective. This is partially due to the high confrontation between the equity and priority attributes. For optimizing the priority attribute, the node with the highest priority level (Zinder) should receive all the demanded aid, which would lead to an inequitable distribution.

**Table 8.** Best objective values optimizing ˜ *f*2 in Niger famine 2005.


In summary, the results obtained in both test cases show that the ACO Algorithm is able to obtain good solutions for all attributes when optimizing each of them individually, together with good compromise solutions as well. In particular, it provides the best known solutions for several attributes, such as the operation time in the Haiti test case or the reliability of the itineraries in the Niger test case. In addition, the computational effort required by the ACO algorithm is quite reasonable, making it suitable to solve realistic cases.

#### *5.4. Managerial Insights*


#### **6. Concluding Remarks**

This paper addresses a humanitarian last mile distribution problem in response to a disaster. The distribution must be done under unsecure conditions, so the vehicles must travel in convoys to try to prevent assaults. Furthermore, there is uncertainty about the state of the roads, that may have been damaged by the disaster. The proposed model is multimodal, multidepot, allows split deliveries and multiple trips and provides the individual itineraries of the vehicles. Besides, the model considers six performance criteria—time, cost, equity, priority, security and reliability—and is solved by a compromise programming approach.

The main contribution of this paper is the proposed methodology to solve the model. We develop an algorithm based on Ant Colony Optimization that considers several types of ants and pheromones. The methodology introduces the concept of effective pheromones to balance the elements of the provided solutions and to diversify the set of obtainable solutions. This methodology is especially appropriate to approach routing problems in which the vehicles are required to travel in convoys, but some of its most novel elements, as it is the case of the effective pheromones, could also be successfully applied to many other complex distribution and routing problems.

The algorithm is tested on two realistic case studies of public domain, and its performance is compared to that of the GRASP Algorithm, the only existing solution method in the literature. The ACO Algorithm performs well for all the objective functions tested in our experiments and provides, for both test cases considered, the best solutions obtained to date for several attributes. In comparison with the available solution method in the literature—the GRASP Algorithm—in the Haiti test case, both algorithms have their own strengths, but in the Niger test case, ACO clearly outperforms GRASP. The computation time required to obtain good solutions is reasonable in all cases, proving that the ACO Algorithm can be perfectly applied in an emergency context. Furthermore, GRASP and ACO algorithms actually complement each other and could even be used in combination in order to improve the quality of the final solutions provided to the decision makers.

One interesting future research line to continue this work comprises the development of a stochastic model in which the possibility of suffering assaults or finding blocked roads could be represented by random variables. Additionally, the design of simulation models to replicate real situations and test different solution strategies could also be worth considering.

**Author Contributions:** Conceptualization, J.M.F., M.T.O. and G.T.; Methodology, J.M.F., M.T.O. and G.T.; Software, J.M.F.; Validation, J.M.F., M.T.O. and G.T.; Formal analysis, J.M.F., M.T.O. and G.T.; Investigation, J.M.F., M.T.O. and G.T.; Data curation, J.M.F., M.T.O. and G.T.; Writing—original draft preparation, J.M.F.; Writing—review and editing, M.T.O. and G.T.; Visualization, J.M.F., M.T.O. and G.T.; Project administration, M.T.O. and G.T. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the European Commission gran<sup>t</sup> H2020 MSCA-RISE 691161 (GEO-SAFE), the Government of Spain gran<sup>t</sup> MTM2015-65803-R and the Government of Madrid, gran<sup>t</sup> S2013/ICE-2845. The APC was funded by the European Commission gran<sup>t</sup> H2020 MSCA-RISE 691161 (GEO-SAFE).

**Conflicts of Interest:** The authors declare no conflict of interest.
